Vertica ERROR: Join did not fit in memory - is happening because memory resource are in fault.
This might happen at the moment you are running the query where your system is already using a part of the allocated memory so it will not fit into memory and when the system is not busy the join fits in the memory with no problems.
The fastest way to solve this error is to enable join spill.
To enable join spill at session level use the syntax bellow
To disable join spill at session level use the syntax bellow:
To enable join spill for a single query use the syntax bellow
this will make use of this you will need to use the hint option.
To enable join spill at a global level use the syntax bellow
this will apply to all new sessions in Vertica.
by default this option is disabled as we can see in the CONFIGURATION_PARAMETERS table.
Enable join spill globally
Query the CONFIGURATION_PARAMETERS table again.
To disable join spill at a global level use the syntax bellow:
where my database is called d.
Very important:
- as per this parameter description should only be altered under Vertica Support guidance, this if you are using a EE license if you are running a CE version of Vertica database that you should now that enabling join spill will allow your query to finish but with performance costs as the spill will be written to disk
What else can you do to avoid join spill errors as an alternative to enabling Join Spill
I will go over some point that we should look when this type of error comes up.
- have you run DB designer after you have loaded the data ? Is there a comprehensive design present on your database ? if so are any incremental design required ?
Projections
- using query specific projection will improve your performance as Vertica will require less resource.
Is you Query well written
- try to use optimized statement, look for a better join order, try to implement sub-queries, are you predicates pushed down ?
Limit the data you throw in your query
- do you really need all those rows ?
- make sure you use proper filters, these way Vertica will do less work.
Temporary tables are an option
- it is often helpful when you are exploring a problem to break a query down into several steps and create temporary tables to hold the intermediate results. After you have worked through the problem, combining the queries into one query by using in-line views can be more efficient. However, under certain circumstances it is more efficient to use temporary tables. You should try both methods to determine which is more efficient for your case.
Dissect your Explain Plan
- the explain plan will allow you to see what projections are used by your query, view the plan and make sure the expected projections are used if they are not make sure the optimizer gets to use them.