How to Disable Vertica Data Collector

In this article I will go through the steps of disabling the Data Collector in Vertica. The Data Collector is on by default. I am running this on my AWS Vertica cluster, and since it runs on on-demand EC2 instances, I am not interested in tuning the database using historical system activity and counters. The Data Collector can create extra overhead, and I don't want that.

What is the Data Collector, in case you don't know? It is the utility that collects and retains database monitoring information. It keeps a history of important system activities and records essential performance and resource utilization counters. You can use the information the Data Collector retains in the following ways:

  • As a reference for what actions users have taken
  • To locate performance bottlenecks
  • To identify potential improvements to Vertica configuration
Let's start and look at the steps to disable the Data Collector.

First, check whether the Data Collector is enabled.

dbadmin=> \x
Expanded display is on.
dbadmin=> select * from configuration_parameters where parameter_name='EnableDataCollector';
-[ RECORD 1 ]-----------------+--------------------------------
node_name                     | ALL
parameter_name                | EnableDataCollector
current_value                 | 1
restart_value                 | 1
database_value                | 1
default_value                 | 1
current_level                 | DEFAULT
restart_level                 | DEFAULT
is_mismatch                   | f
groups                        |
allowed_levels                | NODE, DATABASE
superuser_only                | f
change_under_support_guidance | f
change_requires_restart       | f
description                   | Enable the usage data collector
  • current_value is 1, so the Data Collector is enabled.

To disable the Data Collector, set the EnableDataCollector parameter to 0 (zero). In the statement below, isr is the name of my database.

dbadmin=> alter database isr SET EnableDataCollector = 0;
ALTER DATABASE
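
If you want to script this step, here is a minimal sketch that wraps the same ALTER DATABASE statement and then reads the value back from configuration_parameters. The function name set_data_collector is hypothetical, and the sketch assumes that vsql is on the PATH and can connect as a superuser without extra options.

```shell
# Hypothetical helper: set EnableDataCollector and print the value now in effect.
# Assumes vsql is on PATH and connects as a superuser without extra options.
set_data_collector() {
  db_name="$1"   # database name, e.g. isr (embedded in the SQL as-is)
  value="$2"     # 0 = disabled, 1 = enabled
  case "$value" in
    0|1) ;;
    *) echo "value must be 0 or 1" >&2; return 1 ;;
  esac
  # Apply the change at the database level, as in the statement above.
  vsql -c "ALTER DATABASE $db_name SET EnableDataCollector = $value;" || return 1
  # -A: unaligned output, -t: rows only, so only the value itself is printed.
  vsql -At -c "SELECT current_value FROM configuration_parameters WHERE parameter_name = 'EnableDataCollector';"
}
```

Calling set_data_collector isr 0 should end by printing 0 if the change took effect. Passing 1 instead turns the Data Collector back on.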

The Data Collector files are usually stored under your database catalog path:

/vertica_catalog/DBName/node_name/DataCollector/

Here is the content of my DataCollector directory.

  • it contains a bunch of .log files and .sql files.
[dbadmin@ip-10-131-14-217 /]$ cd /vertica_catalog/DB/v_node0001_catalog/DataCollector/
[dbadmin@ip-10-131-14-217 DataCollector]$ ls -la
total 815660
drwxr-x---. 2 dbadmin verticadba    61440 Jul 12 11:34 .
drwx------. 9 dbadmin verticadba     4096 Jul 12 09:20 ..
-rw-------  1 dbadmin verticadba    65522 Jul 12 11:23 AllocationPoolStatistics_521601572001839.log
-rw-------  1 dbadmin verticadba    65518 Jul 12 11:27 AllocationPoolStatistics_521601799002261.log
-rw-------  1 dbadmin verticadba    65522 Jul 12 11:30 AllocationPoolStatistics_521602027001192.log
-rw-------  1 dbadmin verticadba    56878 Jul 12 11:34 AllocationPoolStatistics_521602254003082.log
-rw-------  1 dbadmin verticadba     6632 Jul 12 10:00 AllocationPoolStatisticsByDay_519004800027966.log
-rw-------  1 dbadmin verticadba    65306 Jun 20 14:00 AllocationPoolStatisticsByHour_518839200003356.log
-rw-------  1 dbadmin verticadba    65023 Jun 28 23:00 AllocationPoolStatisticsByHour_519714000011764.log

Let's see how much space the Data Collector is using.

  • this will depend a lot on the Data Collector policy you have in place.
[dbadmin@ip-10-131-14-217 v_node0001_catalog]$ du -sh DataCollector/
797M    DataCollector/
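
To see which components take up that space, here is a rough sketch that sums the .log file sizes per component. It assumes the files are named <Component>_<number>.log, as in the listing above. The function name dc_usage_by_component is hypothetical. It counts file bytes rather than disk blocks and ignores the .sql files, so the totals will not match du exactly.

```shell
# Hypothetical helper: total bytes of Data Collector .log files per component.
# Assumes file names follow <Component>_<number>.log.
dc_usage_by_component() {
  dc_dir="$1"   # path to a node's DataCollector directory
  for f in "$dc_dir"/*.log; do
    [ -e "$f" ] || continue      # skip when no .log files match
    name="${f##*/}"              # strip the directory part
    component="${name%_*}"       # drop the trailing _<number>.log
    size=$(wc -c < "$f")         # file size in bytes
    echo "$component $size"
  done |
    awk '{ total[$1] += $2 } END { for (c in total) print total[c], c }' |
    sort -rn                     # largest component first
}
```

For example, dc_usage_by_component /vertica_catalog/DB/v_node0001_catalog/DataCollector prints one line per component, with the total bytes first and the component name second.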

To clear the Data Collector data, use the clear_data_collector() function.

  • this function will clear all the data from the disk and memory.
dbadmin=> SELECT clear_data_collector();
 clear_data_collector
----------------------
 CLEAR
(1 row)
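
If you only want to drop the history of one component and keep the rest, clear_data_collector() also accepts a component name as an argument. Here is a small sketch of that call wrapped for scripting. The function name clear_dc_component is hypothetical, and it assumes vsql is on the PATH and can connect without extra options.

```shell
# Hypothetical helper: clear Data Collector data for one component only.
# Assumes vsql is on PATH and can connect without extra options.
clear_dc_component() {
  component="$1"   # e.g. AllocationPoolStatistics
  # Accept only letters and digits so the name is safe to embed in the SQL.
  case "$component" in
    ''|*[!A-Za-z0-9]*) echo "invalid component name" >&2; return 1 ;;
  esac
  # -A: unaligned output, -t: rows only, so only the function result is printed.
  vsql -At -c "SELECT clear_data_collector('$component');"
}
```

For example, clear_dc_component AllocationPoolStatistics would clear only the files for that component.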

Let's see how much space the Data Collector is using after running the clear_data_collector() function.

[dbadmin@ip-10-131-14-217 v_node0001_catalog]$ du -sh DataCollector/
2.1M    DataCollector/

Now that we have disabled the Data Collector, no new data will be collected and it will add no overhead to the database. Note: the Data Collector is very useful for analyzing overall database performance, so think carefully before disabling it and cleaning up its data.