Fix for Vertica ERROR Join did not fit in memory

Vertica ERROR: Join did not fit in memory - is happening because memory resource are in fault. This might happen at the moment you are running the query where your system is already using a part of the allocated memory so it will not fit into memory and when the system is not busy the join fits in the memory with no problems. The fastest way to solve this error is to enable join spill.

To enable join spill at session level use the syntax bellow

dbadmin= select add_vertica_options('EE', 'ENABLE_JOIN_SPILL');
                    set_vertica_options
--------------------------------------------------------------

 EE Vertica Options
 --------------------
 ENABLE_JOIN_SPILL
To disable join spill at session level use the syntax bellow:
dbadmin= select clr_vertica_options('EE', 'ENABLE_JOIN_SPILL');
                    set_vertica_options
--------------------------------------------------------------

 EE Vertica Options
 --------------------
 ENABLE_JOIN_SPILL
To enable join spill for a single query use the syntax bellow
  • this will make use of this you will need to use the hint option.
dbadmin=   SELECT /*+set_vertica_options(EE, ENABLE_JOIN_SPILL)*/ * from dual;
 dummy
-------
 X
(1 row)
To enable join spill at a global level use the syntax bellow
  • this will apply to all new sessions in Vertica.

  • by default this option is disabled as we can see in the CONFIGURATION_PARAMETERS table.

dbadmin= SELECT * FROM CONFIGURATION_PARAMETERS where parameter_name='EEVerticaOptions';
-[ RECORD 1 ]-----------------+----------------------------------------
node_name                     | ALL
parameter_name                | EEVerticaOptions
current_value                 |
catalog_value                 |
database_value                |
default_value                 |
source                        | DEFAULT
is_mismatch                   | f
groups                        |
change_under_support_guidance | f
change_requires_restart       | f
description                   | Use as recommended by Technical Support
  •  Enable join spill globally 
dbadmin= SELECT set_config_parameter('EEVerticaOptions','ENABLE_JOIN_SPILL');
-[ RECORD 1 ]--------+---------------------------
set_config_parameter | Parameter set successfully

Query the CONFIGURATION_PARAMETERS table again.

dbadmin= SELECT * FROM CONFIGURATION_PARAMETERS where parameter_name='EEVerticaOptions';
-[ RECORD 1 ]-----------------+----------------------------------------
node_name                     | ALL
parameter_name                | EEVerticaOptions
current_value                 | ENABLE_JOIN_SPILL
catalog_value                 | ENABLE_JOIN_SPILL
database_value                | ENABLE_JOIN_SPILL
default_value                 |
source                        | DATABASE
is_mismatch                   | f
groups                        |
change_under_support_guidance | f
change_requires_restart       | f
description                   | Use as recommended by Technical Support
  • To disable join spill at a global level use the syntax bellow:
  • where my database is called d.
dbadmin= ALTER DATABASE d CLEAR EEVerticaOptions;
ALTER DATABASE
dbadmin= SELECT * FROM CONFIGURATION_PARAMETERS where parameter_name='EEVerticaOptions';
-[ RECORD 1 ]-----------------+----------------------------------------
node_name                     | ALL
parameter_name                | EEVerticaOptions
current_value                 |
catalog_value                 |
database_value                |
default_value                 |
source                        | DEFAULT
is_mismatch                   | f
groups                        |
change_under_support_guidance | f
change_requires_restart       | f
description                   | Use as recommended by Technical Support
  Very important: 

- as per this parameter description should only be altered under Vertica Support guidance, this if you are using a EE license if you are running a CE version of Vertica database that you should now that enabling join spill will allow your query to finish but with performance costs as the spill will be written to disk

 What else can you do to avoid join spill errors as an alternative to enabling Join Spill

I will go over some point that we should look when this type of error comes up.
  • Cluster resource

- are your Vertica cluster resources well planned  ? you could follow this guide to see what is recommended by Vertica.

  • Optimized Database Design

- have you run DB designer after you have loaded the data ? Is there a comprehensive design present on your database ? if so are any incremental design required ?

  • Projections 

- using query specific projection will improve your performance as Vertica will require less resource.

  • Is you Query well written  

- try to use optimized statement, look for a better join order, try to implement sub-queries, are you predicates pushed down ?

  • Limit the data you throw in your query

- do you really need all those rows ?

- make sure you use proper filters, these way Vertica will do less work.

  • Temporary tables are an option

- it is often helpful when you are exploring a problem to break a query down into several steps and create temporary tables to hold the intermediate results. After you have worked through the problem, combining the queries into one query by using in-line views can be more efficient. However, under certain circumstances it is more efficient to use temporary tables. You should try both methods to determine which is more efficient for your case.

  • Dissect your Explain Plan

- the explain plan will allow you to see what projections are used by your query, view the plan and make sure the expected projections are used if they are not make sure the optimizer gets to use them.