IBM Support

Extremely Slow Netezza (Database) Batch Inserts using JDBC

Question & Answer


Question

Why are my Batch Inserts on Netezza extremely slow?

Cause

Insert statement batches will not be performed on Netezza the overhead per query will limit the efficiency of these operations. Recommendation dumping your data to flat files and executing a nzload/external table insert.

Answer

The Netezza JDBC driver may detect a batch insert, and under the covers convert this to an external table load. Looking at the example below you can see from the sample runtimes here, the singleton inserts using a non-batch prepared statement are very slow, while the inserts run using executebatch scale quickly upward with the batch size.

start of call to testSingletonInserts() with batchSize = 100
exec time for batch-size 100: 8193ms
end of call to testSingletonInserts()

start of call to testBatchInserts() with batchSize = 1000
exec time for batch-size 1000: 190ms
end of call to testBatchInserts()

start of call to testBatchInserts() with batchSize = 10000
exec time for batch-size 10000: 734ms
end of call to testBatchInserts()

start of call to testBatchInserts() with batchSize = 100000
exec time for batch-size 100000: 1763ms
end of call to testBatchInserts()

Generally speaking I would not recommend a batch size of less than
10,000.
One important diagnostic point is to check a JDBC trace or the pg.log
on the Netezza system to verify that external tables are being
implicitly called for you.

If you see an entry like this, then you are not getting the benefit of
implicit external tables through the JDBC driver.

2015-08-08 12:52:50.640127 EDT [13898] DEBUG: QUERY: insert into
testload (c1,c2) values (next value for testload_seq, 96)

If you see an entry like this, then you are getting an external table.
2015-08-08 12:52:51.078404 EDT [13898] DEBUG: QUERY: CREATE EXTERNAL
TABLE bulkETL_13898_0 ( c0 nvarchar(5) ) USING ( DATAOBJECT('/tmp/junk') REMOTESOURCE 'jdbc' DELIMITER ' ' ESCAPECHAR ''\' CTRLCHARS 'YES' CRINSTRING 'YES' ENCODING 'INTERNAL'
MAXERRORS 1 QUOTEDVALUE 'YES' );

2015-08-08 12:52:51.086081 EDT [13898] DEBUG: QUERY: insert into
testload (c1,c2) values (next value for testload_seq,bulkETL_13898_0.c0)
2015-08-08 12:52:51.101234 EDT [13898] DEBUG: transaction 1411711
started dbos txid 0x3dc5c

If you find you are not getting implicit external behavior with your batch execution, try making sure autocommit is turned off on your connection.

conn.setAutoCommit(false);

[{"Product":{"code":"SSULQD","label":"IBM PureData System"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"IBM Netezza Analytics","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"1.0.0","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
17 October 2019

UID

swg21981328