I find it fundamental for a DBA to understand what is going on under the hood of your DB engine.
One of the strong point of HP Vertica database are the ROS and WOS storage types and is ideal that every DBA that works with Vertica should know what are the steps taken by the database during the process of storing data into the database.
Unlike other database that store data into tablespaces(logical group of data files in Oracle) or filegroup(SQL Server) Vertica stores data into containers, and this containers are of two type : WOS or ROS containers.
WOS - Write Optimized Storage (something like cache - data is stored unsorted, uncompressed and is available to be queried).
ROS - Read Optimized Storage (data is sorted and compressed)
In this demo i will demonstrate how the Data is stored in Vertica
We start with an empty database and we check to see that there is nothing stored in the data location.
Create new table and insert a row into the table.
Check the data storage on your disk and see that even tho we have inserted data into the table no data is stored onto our operational system.
Why there is no data written to disk ?
This is because by default Vertica will store the data into WOS. As seen bellow we can see the Storage containers associated with with our projection and we see that the storage_type is WOS type.
How to move it to disk(make it consistent) ?
Vertica has a process called Tuple Mover that runs in the background every 600 seconds and his value is set by the MoveOutInterval parameter.
But also we can anticipate the execution of this process by using the run_tuple_mover_once() function.
After running the run_tuple_mover_once() function the storage type changed to ROS and also the data was written to disk.
Check to see that data was written to disk location.
-where the storage_oid columns holds the data store identifier 45035996273732785_* as seen here.
Now let's insert some more data in our table but this time we will use the /*+ direct */ option.
this option tells Vertica to skip the WOS and store the data into ROS.
Now check the containers that were created by the insert with the direct option.
see that the storage_type is ROS (because of the direct option use).
Again see the data file on the operation system.
-see that new files where added. eg:45035996273732833_0.*
Insert some more data into your table
See that we have added one more container to our projection_name:one_super
Also a new set of data files were added to the operational system data directory.
Now let's see how we can merge all of the containers into one container.
for this we will use the DO_TM_TASK() with the mergeout option
Also this process is running in the background and his values is defined by the MergeOutInterval parameter values(it runs every 600 seconds).
The DO_TM_TASK process consolidates the ROS containers.
- see now we have only one ROS container for our one_super projection.
Also see that a consolidation of the files took place