How to Manage and Work with Data Collector in Vertica
Data Collector is the utility that retains essential information on performance and resource utilization counters.
The main use of the Data Collector utility is to help find performance issues and improve the overall performance of your database.
Now that we know what the Data Collector utility is, let's see how data is collected, where it is stored, and some ways to manage this data to make it useful.
Data Collector comes enabled by default and it can be disabled by the database superuser.
The Data Collector retains its data in the DataCollector directory under the Vertica catalog path. The data can also be queried through tables in the v_internal schema, which are identified by the dc_ prefix.
Topics covered in this article:
Enabling and disabling the Data Collector.
Configure Data Collector retention policy.
How to see the retention policy for a specific component.
How to alter/modify the retention policy for a specific component.
Set data retention policy by time interval.
Manage the data collected in the Data Collector tables and log files.
How to manipulate the DC data.
How to synchronize the disk storage with the data collected in the memory.
How to change the location of the Data Collector logs.
Note:
The retention of data can be adjusted/configured by the database superuser.
Enabling and disabling the Data Collector
To disable the Data Collector:
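A minimal sketch, assuming the EnableDataCollector configuration parameter is set through SET_CONFIG_PARAMETER (as in Vertica 7.x) and that you are connected as a superuser:

```sql
-- Assumption: run as a database superuser.
-- Setting EnableDataCollector to 0 turns the Data Collector off.
SELECT SET_CONFIG_PARAMETER('EnableDataCollector', '0');
```

Newer Vertica releases may prefer ALTER DATABASE for configuration parameters, so check the documentation for your version.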
To re-enable the Data Collector:
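The reverse operation can be sketched under the same assumptions (EnableDataCollector set through SET_CONFIG_PARAMETER, superuser session):

```sql
-- Assumption: run as a database superuser.
-- Setting EnableDataCollector to 1 turns the Data Collector back on.
SELECT SET_CONFIG_PARAMETER('EnableDataCollector', '1');
```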
Configure Data Collector retention policy
In Vertica v7.1.1-0 we have about 158 components that can have their retention policy configured as per our needs.
To see the components and their descriptions, use the query below:
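A query along these lines should work, assuming the data_collector system table exposes the component and description columns as it does in the Vertica documentation:

```sql
-- One row per component; DISTINCT collapses the per-node duplicates.
SELECT DISTINCT component, description
FROM data_collector
ORDER BY component;
```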
How to see the retention policy for a specific component
For this task, use the get_data_collector_policy() function.
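As a sketch, assuming the component behind the dc_backups table is named Backups:

```sql
-- Returns a short text summary of the memory and disk limits
-- currently kept for the Backups component.
SELECT get_data_collector_policy('Backups');
```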
Alternatively, query the data_collector table.
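A possible query, assuming the memory_buffer_size_kb and disk_size_kb columns of data_collector and the Backups component name:

```sql
-- DISTINCT collapses the identical per-node rows into one.
SELECT DISTINCT component,
       table_name,
       description,
       memory_buffer_size_kb,
       disk_size_kb
FROM data_collector
WHERE component = 'Backups';
```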
The output shows the name of the table that holds the collected data (dc_backups), its description, and the memory and disk space allocated to it.
How to alter/modify the retention policy for a specific component
For this task, use the set_data_collector_policy('component', 'memoryKB', 'diskKB') function.
I will use the same backup component for our example:
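A sketch of the call, where the component name Backups and the sizes 64 KB and 2048 KB are illustrative values, not recommendations:

```sql
-- Arguments: component, memory limit in KB, disk limit in KB.
-- The sizes below are hypothetical; choose values that fit your system.
SELECT set_data_collector_policy('Backups', '64', '2048');

-- Confirm the new limits.
SELECT get_data_collector_policy('Backups');
```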
Set data retention policy by time interval
For this task, use the SET_DATA_COLLECTOR_TIME_POLICY() function, which lets you retain the data on disk based on a time interval.
How to enable time interval retention for a single component
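A minimal sketch, assuming the Backups component and an illustrative three-day interval:

```sql
-- Keep Backups data on disk for three days (hypothetical interval).
SELECT set_data_collector_time_policy('Backups', '3 days'::interval);

-- Check that the interval is now set for the component.
SELECT DISTINCT component, interval_set, interval_time
FROM data_collector
WHERE component = 'Backups';
```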
How to disable time interval retention for a single component
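A sketch, assuming that a value of -1 removes the time-based policy, as described in the Vertica documentation, and again using the Backups component:

```sql
-- Passing -1 disables time-based retention for the component;
-- the size-based limits (memory KB and disk KB) still apply.
SELECT set_data_collector_time_policy('Backups', '-1');
```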
Important:
Estimate how much space your DC tables use per day, multiplied by the retention interval, so you won't run out of disk space.
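One way to get a rough estimate, assuming the kb_per_day column of the data_collector table and a hypothetical three-day interval:

```sql
-- Rough per-node estimate: daily volume multiplied by the number
-- of days in the retention interval (3 is an illustrative value).
SELECT node_name,
       component,
       kb_per_day,
       kb_per_day * 3 AS estimated_kb_for_3_days
FROM data_collector
WHERE component = 'Backups'
ORDER BY node_name;
```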
Manage the data collected in the Data Collector tables and log files
In the DataCollector directory, located in the Vertica node catalog, there are three types of files for each component.
These files are as follows:
CREATE_
COPY_
The Create and Copy files can be useful if you want to load your DC data onto another server.
How to manipulate the DC data.
To clear the disk and memory data and reset your DC, use the CLEAR_DATA_COLLECTOR('component') function.
This function can clear all data or just the data for a specific component.
The following command clears data collection for all components on all nodes:
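A minimal sketch; the single-component variant uses the Backups component as an illustrative name:

```sql
-- Clear collected data for every component on all nodes.
SELECT clear_data_collector();

-- Clear collected data for one component only.
SELECT clear_data_collector('Backups');
```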
Note: you must be a superuser to perform this action.
How to synchronize the disk storage with the data collected in the memory
For this task, use the flush_data_collector() function, which optionally takes a component name.
-- synchronize all
dbadmin=> SELECT flush_data_collector();
 flush_data_collector
----------------------
 FLUSH
(1 row)

-- synchronize a specific component
dbadmin=> SELECT flush_data_collector('WosesDestroyed');
 flush_data_collector
----------------------
 FLUSH
(1 row)
How to change the location of the Data Collector logs.
To move the Data Collector logs and instructions to another storage location, use the set_data_collector_storage_location function.
I already have the location created, as seen below:
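One way to confirm that a location exists, sketched here assuming the storage_locations system table, is to list the storage locations defined on each node:

```sql
-- Lists every storage location per node with its intended usage.
SELECT node_name, location_path, location_usage
FROM storage_locations
ORDER BY node_name, location_path;
```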
Change storage location for the DataCollector
Check the storage location content
Great, we managed to move the DataCollector files to the new location.