If you rely heavily on the COPY command to load data into PostgreSQL quickly, PostgreSQL 12 offers a feature which is most likely very beneficial to you. Bulk loading is an important operation, and every improvement in this area is certainly going to help the many people out there who want to import data into PostgreSQL as fast as possible.

COPY: Loading and unloading data as fast as possible

When taking a closer look at the syntax of the COPY command in PostgreSQL 12 you will quickly see two things:

• \h will now point to the correct page in the documentation
• COPY now supports a WHERE condition

Here is the complete syntax overview:

db12=# \h COPY
Command:     COPY
Description: copy data between a file and a table
Syntax:
COPY table_name [ ( column_name [, ...] ) ]
    FROM { 'filename' | PROGRAM 'command' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
    [ WHERE condition ]

COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
    TO { 'filename' | PROGRAM 'command' | STDOUT }
    [ [ WITH ] ( option [, ...] ) ]

where option can be one of:

    FORMAT format_name
    FREEZE [ boolean ]
    DELIMITER 'delimiter_character'
    NULL 'null_string'
    HEADER [ boolean ]
    QUOTE 'quote_character'
    ESCAPE 'escape_character'
    FORCE_QUOTE { ( column_name [, ...] ) | * }
    FORCE_NOT_NULL ( column_name [, ...] )
    FORCE_NULL ( column_name [, ...] )
    ENCODING 'encoding_name'

URL: https://www.postgresql.org/docs/12/sql-copy.html

While having a link to the documentation around is certainly convenient, the WHERE condition added in PostgreSQL 12 is even more important. What is the purpose of this new feature? So far it was only possible to import a file in its entirety. In many cases this has been a problem: more often than not, people only wanted to load a subset of the data and had to write a ton of code to filter it before the import, or after the data had already been written to the database.
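Before PostgreSQL 12, a common workaround was to load everything into a staging table first and filter from there. A rough sketch of that extra code (all table, file, and column names here are made up for illustration):

CREATE TABLE t_staging (LIKE t_target);
COPY t_staging FROM '/path/to/data.txt';
INSERT INTO t_target SELECT * FROM t_staging WHERE some_column < 5;
DROP TABLE t_staging;

With the new WHERE clause, all of this collapses into a single COPY statement.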

COPY … WHERE: Applying filters while importing data

In PostgreSQL 12 data can easily be filtered while importing. The COPY command is pretty flexible and allows a lot of trickery. To show you how the new WHERE clause works, I have compiled a simple example:

db12=# CREATE TABLE t_demo AS 
		SELECT * FROM generate_series(1, 1000) AS id;
SELECT 1000

First of all, 1000 rows are generated to make sure that we have some data to play with. Then we export the content of this table to a file:

db12=# COPY t_demo TO '/tmp/file.txt';
COPY 1000

Finally we can try to import this data again:

db12=# CREATE TABLE t_import (x int);
CREATE TABLE
db12=# COPY t_import FROM '/tmp/file.txt' WHERE x < 5;
COPY 4
db12=# SELECT * FROM t_import;
 x 
---
 1
 2
 3
 4
(4 rows)

As you can see, filtering data is pretty simple and very straightforward. One important thing to note here: I exported an “id” column and imported it as “x”. Keep in mind that the text file does not know the data structure of our target table – you have to make sure that the WHERE clause filters on the column names of the table you want to import into.
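Note that the WHERE clause is not limited to simple comparisons – any expression over the target table’s columns can be used (subqueries, however, are not allowed). Continuing the session from above, we could for example import only every hundredth row:

db12=# COPY t_import FROM '/tmp/file.txt' WHERE x % 100 = 0;
COPY 10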

Old gems revisited …

If you are new to PostgreSQL, I also want to present one of the older features, which I personally like a lot: COPY can send data to a UNIX pipe or read data from one. Here is how it works:

db12=# COPY t_demo TO PROGRAM 'gzip -c > /tmp/file.txt.gz';
COPY 1000
db12=# COPY t_import FROM PROGRAM 'gunzip -c /tmp/file.txt.gz' 
	WHERE x BETWEEN 100 AND 103;
COPY 4
db12=# SELECT * FROM t_import WHERE x >= 100;
  x  
-----
 100
 101
 102
 103
(4 rows)

In some cases you might want to do more than just export data. In this example I decided to compress the data while exporting. Before the data is imported again, it is uncompressed and filtered once more. As you can see, it is pretty simple to combine those features in a flexible way.
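The WHERE clause also combines freely with the option list shown in the syntax overview. As a quick sketch, here is the same data exported as CSV with a header line and re-imported with a filter (the file name is again just an example):

db12=# COPY t_demo TO '/tmp/file.csv' WITH (FORMAT csv, HEADER true);
COPY 1000
db12=# COPY t_import FROM '/tmp/file.csv' WITH (FORMAT csv, HEADER true)
	WHERE x > 998;
COPY 2

Here HEADER true makes COPY write a header line on export and skip it again on import.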

If you want to learn more about PostgreSQL and loading data in general, check out our post about rules and triggers. If you want to learn more about COPY, check out the PostgreSQL documentation.