Working with HP Vertica Flex Tables

In this article we will see how to work with Vertica Flex tables. We are going to use in our first example a json file.

  • The file has 2Gb of stored data.
  • [dbadmin@bih001 tmp]$ du -sh jon1.txt
    2.0G    jon1.txt
  • Let's create the Flex table now:
  • dbadmin=> create flex table test();
    CREATE TABLE
  • Now load the data into the table:
  • dbadmin=> copy test from '/tmp/jon1.txt' parser fjsonparser();
     Rows Loaded
    -------------
         1469400
    (1 row)
  • Next step is to run the compute_flextable_keys_and_build_view() function.
  • This will compute virtual columns (keys) from the map data of a flex table , and construct a view.
    dbadmin=> select compute_flextable_keys_and_build_view('test');
                                 compute_flextable_keys_and_build_view
    ------------------------------------------------------------------
     Please see public.test_keys for updated keys
    The view public.test_view is ready for querying
    (1 row)
    Now we have our Flex table view ready for use.
  • See Flex table definition
  • dbadmin=> d test;
                                                 List of Fields by Tables
     Schema | Table |    Column    |          Type          |  Size  | Default | Not Null | Primary Key | Foreign Key
    --------+-------+--------------+------------------------+--------+---------+----------+-------------+-------------
     public | test2 | __identity__ | int                    |      8 |         | t        | f           |
     public | test2 | __raw__      | long varbinary(130000) | 130000 |         | t        | f           |
    (2 rows)
    The columns contained in a defaul flex table are : __identity__ : -is auto-incrementing and used for segmentation and sort order when no other columns is defined. __raw__ : -stores loaded data and is of type LONG VARBINARY, and its default maximum width is 130000 bytes (with an absolute maximum of 32000000 bytes).
  • See Flex view definition.
  • dbadmin=> dv public.test_view;
                           List of View Fields
     Schema |    View    |    Column     |        Type         | Size
    --------+------------+---------------+---------------------+------
     public | test2_view | about         | varchar(1270)       | 1270
     public | test2_view | address       | varchar(120)        |  120
     public | test2_view | age           | varchar(20)         |   20
     public | test2_view | balance       | varchar(20)         |   20
     public | test2_view | company       | varchar(24)         |   24
     public | test2_view | email         | varchar(66)         |   66
     public | test2_view | favoritefruit | varchar(20)         |   20
     public | test2_view | friends       | long varbinary(304) |  304
     public | test2_view | gender        | varchar(20)         |   20
     public | test2_view | greeting      | varchar(114)        |  114
     public | test2_view | guid          | varchar(72)         |   72
     public | test2_view | id            | varchar(20)         |   20
     public | test2_view | isactive      | varchar(20)         |   20
     public | test2_view | latitude      | varchar(20)         |   20
     public | test2_view | longitude     | varchar(20)         |   20
     public | test2_view | name          | varchar(40)         |   40
     public | test2_view | phone         | varchar(34)         |   34
     public | test2_view | picture       | varchar(50)         |   50
     public | test2_view | registered    | varchar(52)         |   52
     public | test2_view | tags          | long varbinary(288) |  288
    (20 rows)
    The flex table view is created by the compute_flextable_keys_and_build_view() function, which is a mix of build_flextable_view() and compute_flextable_keys(). This mixed function will create a view based on you json file and will determine the data type for each virtual column. In the next article we will see how we can materialize the virtual columns. Doing so they will act like real stored data on the Vertica Cluster.