Data Collector is the utility that retains essential information on performance and resource utilization counters. The main use of the Data Collector utility is to assist into finding performance issues and help improve the overall performance of your database. Now that we know what is the Data Collector utility let's go and see how data is collected where is it stored and some ways we can manage this data to make it useful to us. Data Collector comes enabled by default and it can be disabled by the database superuser. The data is retained by the Data Collector utility in the DataCollector directory under the own Vertica/catalog path and also is available to be queried under the v_internal schema and the table are identified by the dc_ prefix.
SELECT SET_CONFIG_PARAMETER('EnableDataCollector', '0');
SELECT SET_CONFIG_PARAMETER('EnableDataCollector', '1');
dbadmin= SELECT DISTINCT component, description FROM data_collector ORDER BY 1 ASC;
component | description
--------------------------------------+--------------------------------------------------------------------------------------------------------------------
AllocationPoolStatistics | Information about global memory pools, which generally cannot be recovered without restart
AllocationPoolStatisticsByDay | Information about global memory pools, which generally cannot be recovered without restart (historical, by day)
AllocationPoolStatisticsByHour | Information about global memory pools, which generally cannot be recovered without restart (historical, by hour)
AllocationPoolStatisticsByMinute | Information about global memory pools, which generally cannot be recovered without restart (historical, by minute)
AllocationPoolStatisticsBySecond | Information about global memory pools, which generally cannot be recovered without restart (historical, by second)
AnalyzeStatistics | History of statistics collection
Backups | Monitoring successful backups
BlockMemoryManagerEvents | Events for Block Memory Manager
dbadmin= select get_data_collector_policy('Backups');
get_data_collector_policy
------------------------------------------------------------------------
10KB kept in memory, 10KB kept on disk. Time based retention disabled.
(1 row)
dbadmin= SELECT * FROM data_collector where component='Backups';
-[ RECORD 1 ]----------+------------------------------
node_name | v_d_node0001
component | Backups
table_name | dc_backups
description | Monitoring successful backups
access_restricted | t
in_db_log | f
in_vertica_log | f
memory_buffer_size_kb | 10
disk_size_kb | 10
set_interval | f
interval_time_s | 0
record_too_big_errors | 0
lost_buffers | 0
lost_records | 0
retired_files | 0
retired_records | 0
current_memory_records | 0
current_disk_records | 0
current_memory_bytes | 0
current_disk_bytes | 0
first_time |
last_time |
kb_per_day |
dbadmin= select set_data_collector_policy('Backups', '20', '20');
-[ RECORD 1 ]-------------+----
set_data_collector_policy | SET
-- see the alterd policy
dbadmin= select get_data_collector_policy('Backups');
-[ RECORD 1 ]-------------+-----------------------------------------------------------------------
get_data_collector_policy | 20KB kept in memory, 20KB kept on disk. Time based retention disabled.
- we managed to alter the component and allocated 20KB of memory and 20KB of disk space
dbadmin= SELECT set_data_collector_time_policy('Backups', '7 day'::interval);
set_data_collector_time_policy
--------------------------------
SET
(1 row)
dbadmin= select get_data_collector_policy('Backups');
get_data_collector_policy
--------------------------------------------------------------
20KB kept in memory, 20KB kept on disk. 7 days kept on disk.
(1 row)
dbadmin= SELECT set_data_collector_time_policy('Backups', '-1');
set_data_collector_time_policy
--------------------------------
SET
(1 row)
(dbadmin@:5433) [dbadmin] * select node_name,cast(sum(kb_per_day)/(1024^2) as numeric(10,3))||' Mb' as Used_Mb_per_Day from v_monitor.data_collector group by 1;
node_name | Used_Mb_per_Day
-------------------+-----------------
v_dcg023_node0001 | 9.976 Mb
--check that we have data in the WosesDestroyed component disk stored file.
[dbadmin@DCXLG001 DataCollector]$ cat WosesDestroyed_482341541015797.log | wc
170 170 3682
-- check that we have data in the WosesDestroyed component v_internal.dc_woses_destroyed
dbadmin= select count(*) from dc_woses_destroyed;
count
-------
10
-- run the CLEAR_DATA_COLLECTOR() funtion
dbadmin= select CLEAR_DATA_COLLECTOR('WosesDestroyed');
CLEAR_DATA_COLLECTOR
----------------------
CLEAR
(1 row)
-- see that data was cleared
dbadmin= select count(*) from dc_woses_destroyed;
count
-------
0
SELECT CLEAR_DATA_COLLECTOR();
CLEAR_DATA_COLLECTOR
----------------------
CLEAR
-- synchronize all
dbadmin= SELECT flush_data_collector();
flush_data_collector
----------------------
FLUSH
(1 row)
-- synchronize specific componenet
dbadmin= SELECT flush_data_collector('WosesDestroyed');
flush_data_collector
----------------------
FLUSH
(1 row)
dbadmin= select * from storage_locations where location_label='DataCollector';
location_id | node_name | location_path | location_usage | is_retired | location_label | rank | throughput | latency
-------------------+--------------+------------------+----------------+------------+----------------+------+------------+---------
45035996273757994 | v_d_node0001 | /u03/data_store1 | DATA | f | DataCollector | 0 | 71 | 818
(1 row)
dbadmin= select set_data_collector_storage_location('DataCollector');
set_data_collector_storage_location
-------------------------------------
SET
(1 row)
dbadmin= ! ls -la /u03/data_store1/| head -2
total 10004
drwxr-xr-x 2 dbadmin verticadba 36864 Apr 17 10:28 .
dbadmin= ! ls -la /u03/data_store1/| head -20
total 10388
drwxr-xr-x 2 dbadmin verticadba 36864 Apr 17 10:28 .
drwxr-xr-x 4 oracle oinstall 4096 Apr 17 09:48 ..
-rw------- 1 dbadmin verticadba 20440 Apr 17 10:28 AllocationPoolStatistics_482596048217937.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 AllocationPoolStatisticsByDay_482596048227182.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 AllocationPoolStatisticsByHour_482596048217996.log
-rw------- 1 dbadmin verticadba 1294 Apr 17 10:28 AllocationPoolStatisticsByMinute_482596048217984.log
-rw------- 1 dbadmin verticadba 65076 Apr 17 10:28 AllocationPoolStatisticsBySecond_482596048217961.log
-rw------- 1 dbadmin verticadba 28072 Apr 17 10:28 AllocationPoolStatisticsBySecond_482596100001870.log
-rw------- 1 dbadmin verticadba 18031 Apr 17 10:28 BlockMemoryManagerStatistics_482596048228656.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 BlockMemoryManagerStatisticsByDay_482596048228702.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 BlockMemoryManagerStatisticsByHour_482596048228690.log
-rw------- 1 dbadmin verticadba 1551 Apr 17 10:28 BlockMemoryManagerStatisticsByMinute_482596048228678.log
-rw------- 1 dbadmin verticadba 64782 Apr 17 10:28 BlockMemoryManagerStatisticsBySecond_482596048228668.log
-rw------- 1 dbadmin verticadba 15372 Apr 17 10:28 BlockMemoryManagerStatisticsBySecond_482596108001823.log
-rw------- 1 dbadmin verticadba 15768 Apr 17 10:28 CatalogInfo_482596048227205.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 CatalogInfoByDay_482596048227264.log
-rw------- 1 dbadmin verticadba 0 Apr 17 10:27 CatalogInfoByHour_482596048227243.log
-rw------- 1 dbadmin verticadba 1240 Apr 17 10:28 CatalogInfoByMinute_482596048227232.log
-rw------- 1 dbadmin verticadba 51465 Apr 17 10:28 CatalogInfoBySecond_482596048227219.log