PostgreSQL and tablespaces - it's not so scary

So what is exactly a tablespace in Postgres?

As a reminder – by default a Postgres cluster is "self-contained", meaning all the data files will be residing inside the initialized "data directory" (called also PGDATA as one can set the similarly named environment variable to simplify working with clusters via pg_ctl), making things very easy and portable.

But about tablespaces - actually it's very simple, a tablespace is just another name for a folder where the Postgres server process owner (typically "postgres" system user) is the owner and can write some files. The folder can be located anywhere, Postgres must just be able to create symlinks to that certain path. The symlinks are created in the "pg_tblspc" subfolder of the cluster (for example /var/lib/postgresql/9.6/main/pg_tblspc on Ubuntu) and can be inspected using the usual file system commands like "ls -l". After that, users can explicitly say that they want to store some specific tables/indexes in that new "linked folder" or one can also decide to make it the default for new objects or even move all old objects over to the new tablespace.

How to create one?

In the simplest form something like that will work:

CREATE TABLESPACE new_disk LOCATION 'some_disk_mount';

1	CREATE TABLESPACE new_disk LOCATION 'some_disk_mount';

When using tablespaces to tune IO though, something like that would make more sense:

CREATE TABLESPACE fast_ssd LOCATION '/some_disk_mount' WITH (seq_page_cost=0.5, 
random_page_cost=0.75, effective_io_concurrency=10);



/* let's start using the new tablespace... */

CREATE INDEX on lot_of_activity(id) TABLESPACE&nbsp; fast_ssd;

CREATE TABLESPACE fast_ssd LOCATION '/some_disk_mount' WITH (seq_page_cost=0.5,

random_page_cost=0.75, effective_io_concurrency=10);

/* let's start using the new tablespace... */

CREATE INDEX on lot_of_activity(id) TABLESPACE  fast_ssd;

NB! Just declaring a tablespace won't help normal unprivileged users – they all need a CREATE grant to create tables/indexes. Something like that will alleviate the situation. Note that for databases with high security requirements one should of course give out such grants on a per user basis.

GRANT CREATE ON TABLESPACE fast_ssd to public;

1	GRANT CREATE ON TABLESPACE fast_ssd to public;

Moving existing tables to a new tablespace

When plagued by disks running out of space, then after creating a new tablespace the first thing should be setting it also to serve as a default for new objects. This can be done by modifying the default_tablespace parameter and reloading the server config via pg_reload_conf() or "pg_ctl -D mydatadir reload". After that one could/should move also fast growing tables/indexes to the new tablespace. This can be done per table or starting with Postgres 9.4 also en masse.

/* Move table to a new tablespace. Use ALTER INDEX ... for indexes */

ALTER TABLE t_growing_fast SET TABLESPACE new_disk;



/* Move all tables to a new tablespace. Use ALTER INDEX ... for indexes */

ALTER TABLE ALL IN TABLESPACE pg_default SET TABLESPACE new_disk;

/* Move table to a new tablespace. Use ALTER INDEX ... for indexes */

ALTER TABLE t_growing_fast SET TABLESPACE new_disk;

/* Move all tables to a new tablespace. Use ALTER INDEX ... for indexes */

ALTER TABLE ALL IN TABLESPACE pg_default SET TABLESPACE new_disk;

NB! Using the later syntax locks all tables/indexes (use NOWAIT for fast failing when unable to get all the locks) and will then start to copy the data files one by one, so basically a downtime is required. For "online" moving one can though use tools like pg_repack or pg_squeeze.

Backup & replication considerations

For logical (pg_dump, pg_dumpall) backups nothing changes, but for physical backups (or when building up a replica), pg_basebackup will by default require the exact same directory layout on the replica for things to work. When this is not possible or feasible then one can "re-map" the tablespaces though using the --tablespace-mapping parameter or just use the "tar" format that creates again a "self-contained" blob. See pg_basebackup for more details. Also for doing manual basebackups with cp or rsync or such, one should make sure the "dereference" option is used so that the datafiles behind the symlinks get copied.

Hope it helped. Questions/corrections are welcome as usual! Feel free to contact us.

10 responses to “PostgreSQL & tablespaces - it's not so scary ...”

Douglas J Hunley says:

May 17, 2017 at 2:34 pm

> for example having indexes on fast SSD disks

Wouldn't you be better off w/ the data files on SSD and the indexes on spinning rust? B-TREE access is pretty fast and optimal as is but the random I/O of gathering data pages can be horrific.

Reply
- Kaarel says:
  
  May 18, 2017 at 1:28 pm
  
  Sure this would be the best option if one has enough SSD space 🙂 ...but if you're constrained and have a big and wide table with dozens of columns say, then placing the heavily used ID column on SSD might still be a better balance, winning say 20x in SSD space, giving away 10x in speed but having hope for very fast index-only scans for counts and such. Very use case dependent stuff anyways...
  
  Reply
Glyn Astill says:

May 19, 2017 at 8:53 am

Not quite true on the pg_dump / restore level; you might need to use --no-tablespaces

Reply
Marco Pegoraro says:

February 11, 2018 at 6:52 am

Hello, thank you for the clear article.

I set up 2 table spaces where I believed PG would have dropped most of the data, but the default tablespace now keeps on growing more than the others.

All my tables and indexes are now going into one of the 2 custom tablespaces so I'm puzzled by this growing of the default disk location.

I'm doing this with the goal of splitting different data-areas of my application into different EBS disks on AWS.

Reply
Peter Neave says:

October 15, 2018 at 2:37 am

There is an extra " " in the line
CREATE INDEX on lot_of_activity(id) TABLESPACE  fast_ssd;

Reply
Raul Pereira says:

December 27, 2018 at 7:21 pm

Sorry, I'm starting with postgresql now. Why can not I create two tablespace in postgresql in the same directory? ERROR: directory "/var/lib/postgresql/10/main/base/PG_10_201707211" already in use as a tablespace

Reply
- Pavlo Golub says:
  
  December 27, 2018 at 7:47 pm
  
  Because postgres wants exclusive right on this folder. And postgres wants eliminate files mess etc
  
  Reply
  - Raul Pereira says:
    
    December 27, 2018 at 8:21 pm
    
    Thank you Pavlo. As I understand it, tablespace is a physical directory. I first created the directories on the disk and then created the tablespaces by stating the respective location for each tablespace.
    
    Reply
Santhosh says:

February 27, 2020 at 8:35 am

my windows is corrupted.so i need to re-install it. my database in table space in my usb folder. postgres installation folder and data folder not recovered can i got the data restoration from my table space folder

Reply
Joonas Salo says:

November 7, 2021 at 11:56 am

So I want to store indexes in local SSD for performance and main data in a redundant network drive for safe keeping.

What if the local SSD breaks, would the whole database end up in corrupted state or can postgres recreate the indexes from the main data?

How about materialized views?

Reply

PostgreSQL & tablespaces - it's not so scary ...

So what is exactly a tablespace in Postgres?

How to create one?

Moving existing tables to a new tablespace

Backup & replication considerations

10 responses to “PostgreSQL & tablespaces - it's not so scary ...”

Leave a Reply Cancel reply

CYBERTEC Guest

Blog Tags

NEWSLETTER

Articles by our PostgreSQL Experts