How to Import data from MySQL into Vertica with one single line of code
We all know that we don't have linked server or db links in Vertica nor MySQL, so how can we create bridge between these two to import data from one to another ?
What do we need to do to accomplish this :
1 - Install the MySQL Client on you Vertica host and also Install the Vertica vsql client on the MySQL host(only if you plan to import data from Vertica into MySQL).
2 - Create the table definition inside Vertica(tje table that will receive the imported data)
-use this command to extract the DDL of your MySQL table
3 - Adjust the data type definition of the table so they can fit/apply into Vertica
the end result would be something like this
no encoding have been applied as this is just for demonstration, if you want to setup encoding , sort order, segmentation, partitioning fell free to do so at create table time.
4 - We will create the command that will build the bridge between our two database engines and do an extract and load job.
To extract the data out of MySQL use the following base syntax:
The options are well explained by Daniel in the comment bellow.
We will get all of the data from the table all mixed up and we need to format it in CSV data type manner.
To do this we will use the tr(translate) command from Linux:
we take the STDOUT and apply the following rules to it to transform it into comma separated values output
Create the COPY that will insert the data into Vertica database
follow the base syntax bellow
specify the connection details and then the COPY command options like the delimiter or the exception & rejected location and also depending on the size of the load specify the DIRECT option.
Finally you need to take all parts and pipe them to each other to end-up with something like this.
The command is run from the Vertica database host.
This type of extraction and load can be useful when you do not have an ETL tool to do it for you or when you do not have the ETL tools expertise.
It comes with it's challenges such as network pounding and some extra I/O but is free and that is a big difference.
I hope this was helpful