In our CNPG series, we have explained how to create a PostgreSQL cluster and how to customize it. However, most of the time we don't have the luxury of starting a database from scratch. We may have already started a project without high availability and now be looking for an easy-to-manage, cloud-native environment. In that case CNPG is the answer, but how do we migrate our existing cluster to it? In today's article, we will explain exactly that.
There are a few ways to migrate an existing cluster to CNPG: pg_basebackup, import, and logical replication. Today we will examine the pg_basebackup and import methods. Depending on the use case and requirements, a different method may be the better fit.
As usual, pg_basebackup lets us create a physical copy of a source cluster. Because we are creating a physical copy of the existing cluster, the requirements are stricter than for the import method: the source and target must run the same PostgreSQL major version on compatible hardware, and the source must accept a streaming replication connection from a user with the REPLICATION privilege.
As of the current release (v1.27), CNPG documents only one limitation for pg_basebackup: the copy is a snapshot. To create the physical copy, pg_basebackup takes a snapshot of the source cluster and concurrently streams WAL segments to the target from the start of the process until the end of the backup. Once this process completes, the target diverges from the source and starts on a new timeline. That is why all write operations on the source must be stopped before starting the migration.
Note: this limitation only applies when the target cluster is not configured as a replica of the source.
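One way to enforce the write freeze on the source mentioned above is to flip the instance into read-only mode until the new cluster has been verified. This is only a minimal sketch, run as a superuser on the source host; sessions can still override the setting per transaction, so treat it as a guard rail rather than a hard fence.

# Put the source into read-only mode for the duration of the migration.
psql -U postgres -c "ALTER SYSTEM SET default_transaction_read_only = on;"
psql -U postgres -c "SELECT pg_reload_conf();"

# Verify the setting is active; new transactions will now refuse writes by default.
psql -U postgres -c "SHOW default_transaction_read_only;"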
Let's check the existing single-instance cluster:
postgres@ubuntu1:~$ psql
psql (15.13 (Ubuntu 15.13-1.pgdg22.04+1))
Type "help" for help.

postgres=# \l
                                                  List of databases
     Name     |  Owner   | Encoding | Collate |  Ctype  | ICU Locale | Locale Provider |   Access privileges
--------------+----------+----------+---------+---------+------------+-----------------+-----------------------
 postgres     | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 template0    | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
              |          |          |         |         |            |                 | postgres=CTc/postgres
 template1    | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
              |          |          |         |         |            |                 | postgres=CTc/postgres
 to_be_cloned | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
(4 rows)

postgres=# \du
                                         List of roles
 Role name  |                         Attributes                          | Member of
------------+-------------------------------------------------------------+-----------
 postgres   | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 replicator | Replication                                                 | {}

postgres=# \c to_be_cloned
You are now connected to database "to_be_cloned" as user "postgres".
to_be_cloned=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | test | table | postgres
(1 row)

to_be_cloned=# select count(*) from test;
  count
---------
 1000000
(1 row)

to_be_cloned=#
It is a pretty simple source setup, but it is enough for our purposes.
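Before moving on, note that the pg_basebackup bootstrap needs the source to accept a streaming replication connection from the Kubernetes pod network using the replicator role shown above. A minimal sketch of that preparation on the source host; the pod CIDR and the Debian/Ubuntu-style pg_hba.conf path are assumptions, so adjust them to your environment:

# Create the replication role if it does not exist yet
# ('123123' matches the base64-encoded password used in the Secret below).
psql -U postgres -c "CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD '123123';"

# Allow replication connections from the assumed pod network 10.244.0.0/16.
echo "host replication replicator 10.244.0.0/16 scram-sha-256" | sudo tee -a /etc/postgresql/15/main/pg_hba.conf
psql -U postgres -c "SELECT pg_reload_conf();"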
For migration we need a manifest like the following:
apiVersion: v1
kind: Secret
metadata:
  name: source-db-replica-user
type: Opaque
data:
  username: cmVwbGljYXRvcg==
  password: MTIzMTIz
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: target-cluster
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:15.13
  bootstrap:
    pg_basebackup:
      source: source-cluster
  storage:
    size: 3Gi
  externalClusters:
    - name: source-cluster
      connectionParameters:
        host: 192.168.122.84
        user: replicator
      password:
        name: source-db-replica-user
        key: password
In this manifest, we tell the CNPG operator to bootstrap the instance using pg_basebackup, with source-cluster as the source. In the externalClusters entry named source-cluster we specify the connection parameters of the source instance.
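The username and password values in the Secret are simply base64-encoded strings; for example, they can be produced like this:

echo -n 'replicator' | base64   # cmVwbGljYXRvcg==
echo -n '123123' | base64       # MTIzMTIz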
Apply the manifest:
kubectl apply -f migrate.yml
secret/source-db-replica-user created
cluster.postgresql.cnpg.io/target-cluster created

kubectl get all
NAME                                           READY   STATUS            RESTARTS   AGE
pod/cnpg-controller-manager-5c94bc644d-89r4v   1/1     Running           0          2d19h
pod/target-cluster-1-pgbasebackup-9qv22        0/1     PodInitializing   0          2s

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/cnpg-webhook-service   ClusterIP   10.107.79.59     <none>        443/TCP    2d19h
service/target-cluster-r       ClusterIP   10.107.29.232    <none>        5432/TCP   3s
service/target-cluster-ro      ClusterIP   10.101.50.138    <none>        5432/TCP   2s
service/target-cluster-rw      ClusterIP   10.108.171.241   <none>        5432/TCP   2s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnpg-controller-manager   1/1     1            1           2d19h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cnpg-controller-manager-5c94bc644d   1         1         1       2d19h

NAME                                      STATUS    COMPLETIONS   DURATION   AGE
job.batch/target-cluster-1-pgbasebackup   Running   0/1           2s         2s

kubectl get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cnpg-controller-manager-5c94bc644d-89r4v   1/1     Running   0          2d19h
pod/target-cluster-1                           1/1     Running   0          91s
pod/target-cluster-2                           1/1     Running   0          73s
pod/target-cluster-3                           1/1     Running   0          56s

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/cnpg-webhook-service   ClusterIP   10.107.79.59     <none>        443/TCP    2d19h
service/target-cluster-r       ClusterIP   10.107.29.232    <none>        5432/TCP   3m13s
service/target-cluster-ro      ClusterIP   10.101.50.138    <none>        5432/TCP   3m12s
service/target-cluster-rw      ClusterIP   10.108.171.241   <none>        5432/TCP   3m12s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnpg-controller-manager   1/1     1            1           2d19h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cnpg-controller-manager-5c94bc644d   1         1         1       2d19h
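At this point it is worth checking that the cluster is healthy and that the replicas are streaming. If you have the cnpg kubectl plugin installed, a quick way to do that is:

# Detailed status of the cluster, including the current primary and replication state
kubectl cnpg status target-cluster

# Without the plugin, the Cluster resource itself reports the phase
kubectl get cluster target-cluster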
Let's connect to the target cluster and verify our migration:
kubectl exec -it pod/target-cluster-1 -- /bin/bash
Defaulted container "postgres" out of: postgres, bootstrap-controller (init)
postgres@target-cluster-1:/$ psql
psql (15.13 (Debian 15.13-1.pgdg110+1))
Type "help" for help.

postgres=# \l
                                                  List of databases
     Name     |  Owner   | Encoding | Collate |  Ctype  | ICU Locale | Locale Provider |   Access privileges
--------------+----------+----------+---------+---------+------------+-----------------+-----------------------
 app          | app      | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 postgres     | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 template0    | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
              |          |          |         |         |            |                 | postgres=CTc/postgres
 template1    | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
              |          |          |         |         |            |                 | postgres=CTc/postgres
 to_be_cloned | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
(5 rows)

postgres=# \c to_be_cloned
You are now connected to database "to_be_cloned" as user "postgres".
to_be_cloned=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | test | table | postgres
(1 row)

to_be_cloned=# select count(*) from test;
  count
---------
 1000000
(1 row)
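As described earlier, the bootstrapped cluster is promoted onto a new timeline once the copy finishes. A quick sanity check (assuming the source is still on timeline 1) is to look at the control data on the target primary:

# The timeline on the target should be higher than the one on the source
kubectl exec pod/target-cluster-1 -- psql -c "SELECT timeline_id FROM pg_control_checkpoint();"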
The import method uses pg_dump and pg_restore under the hood, which is why, just like pg_basebackup, it needs a downtime window for the migration. However, because it takes a logical copy, it supports migration between different major versions, so while migrating to CNPG we can also upgrade our databases. As mentioned in the pg_basebackup section, it has fewer requirements, but the same limitation still applies: write operations on the source must be stopped before starting the migration.
For this method we use the same source cluster, and the source database is again to_be_cloned.
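Unlike pg_basebackup, the import bootstrap connects with an ordinary database user (migrator in the manifest below) that needs enough privileges to pg_dump the source database. Here is a hedged sketch of preparing such a user on the source; the exact grants depend on your schema, and the password has to match the one stored in the Secret:

# Create the user the import will connect with
psql -U postgres -c "CREATE ROLE migrator WITH LOGIN PASSWORD '123123';"

# Give it read access to the source database and its objects
# (pg_hba.conf must also allow a normal, non-replication connection for this user).
psql -U postgres -d to_be_cloned -c "GRANT CONNECT ON DATABASE to_be_cloned TO migrator;"
psql -U postgres -d to_be_cloned -c "GRANT USAGE ON SCHEMA public TO migrator;"
psql -U postgres -d to_be_cloned -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO migrator;"
psql -U postgres -d to_be_cloned -c "GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO migrator;"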
For migration with import we use the following manifest:
apiVersion: v1
kind: Secret
metadata:
  name: source-db-replica-user
type: Opaque
data:
  username: cmVwbGljYXRvcg==
  password: MTIzMTIz
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: target-cluster
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:15.13
  bootstrap:
    initdb:
      dataChecksums: true
      database: to_be_cloned_target
      owner: to_be_cloned_target
      import:
        type: microservice
        databases:
          - to_be_cloned
        source:
          externalCluster: source-cluster
  storage:
    size: 3Gi
  externalClusters:
    - name: source-cluster
      connectionParameters:
        host: 192.168.122.84
        user: migrator
      password:
        name: source-db-replica-user
        key: password
This manifest is slightly different from the one we used for the pg_basebackup method. In the bootstrap section we use initdb instead of pg_basebackup and then tell the cluster to import the data from the external source we define. The other difference is the target database name: in this example we import the data into the to_be_cloned_target database. In other words, the target configuration can be adjusted within the boundaries of initdb; a sketch of what that could look like follows.
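For instance, since the target is created by initdb rather than physically cloned, settings such as checksums, encoding, and locale can be chosen freely on the new cluster. A minimal sketch of how the bootstrap section could be extended; the values shown are the initdb defaults visible in the import log below, and the field names follow the CNPG initdb bootstrap API:

bootstrap:
  initdb:
    dataChecksums: true
    encoding: UTF8       # matches the initdb options shown in the import log below
    localeCollate: C
    localeCType: C
    database: to_be_cloned_target
    owner: to_be_cloned_target
    # import: ... unchanged from the manifest above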
Apply the manifest:
kubectl apply -f import.yml
secret/source-db-replica-user created
cluster.postgresql.cnpg.io/target-cluster created

kubectl get all
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cnpg-controller-manager-5c94bc644d-89r4v   1/1     Running   0          3d1h
pod/target-cluster-1                           1/1     Running   0          63s
pod/target-cluster-2                           1/1     Running   0          46s
pod/target-cluster-3                           1/1     Running   0          26s

NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cnpg-webhook-service   ClusterIP   10.107.79.59    <none>        443/TCP    3d1h
service/target-cluster-r       ClusterIP   10.109.123.66   <none>        5432/TCP   69s
service/target-cluster-ro      ClusterIP   10.110.89.179   <none>        5432/TCP   69s
service/target-cluster-rw      ClusterIP   10.96.14.187    <none>        5432/TCP   69s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cnpg-controller-manager   1/1     1            1           3d1h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cnpg-controller-manager-5c94bc644d   1         1         1       3d1h
For more detail, you can run kubectl logs against the import pod before the job completes:
kubectl logs -n cnpg-system pod/target-cluster-1-import-nrv6x
Defaulted container "import" out of: import, bootstrap-controller (init)
{"level":"info","ts":"2025-08-18T14:36:02.2764563Z","msg":"Creating new data directory","logging_pod":"target-cluster-1-import","pgdata":"/var/lib/postgresql/data/pgdata","initDbOptions":["--username","postgres","-D","/var/lib/postgresql/data/pgdata","-k","--encoding=UTF8","--lc-collate=C","--lc-ctype=C"]}
{"level":"info","ts":"2025-08-18T14:36:02.695876955Z","logger":"initdb","msg":"The files belonging to this database system will be owned by user \"postgres\".\nThis user must also own the server process.\n\nThe database cluster will be initialized with this locale configuration:\n provider: libc\n LC_COLLATE: C\n LC_CTYPE: C\n LC_MESSAGES: en_US.utf8\n LC_MONETARY: en_US.utf8\n LC_NUMERIC: en_US.utf8\n LC_TIME: en_US.utf8\nThe default text search configuration will be set to \"english\".\n\nData page checksums are enabled.\n\ncreating directory /var/lib/postgresql/data/pgdata ... ok\ncreating subdirectories ... ok\nselecting dynamic shared memory implementation ... posix\nselecting default max_connections ... 100\nselecting default shared_buffers ... 128MB\nselecting default time zone ... Etc/UTC\ncreating configuration files ... ok\nrunning bootstrap script ... ok\nperforming post-bootstrap initialization ... ok\nsyncing data to disk ... ok\n\n\nSuccess. You can now start the database server using:\n\n pg_ctl -D /var/lib/postgresql/data/pgdata -l logfile start\n\n","pipe":"stdout","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:02.695904061Z","logger":"initdb","msg":"initdb: warning: enabling \"trust\" authentication for local connections\ninitdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.\n","pipe":"stderr","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:02.70080715Z","msg":"Installed configuration file","logging_pod":"target-cluster-1-import","pgdata":"/var/lib/postgresql/data/pgdata","filename":"custom.conf"}
{"level":"info","ts":"2025-08-18T14:36:02.702621652Z","msg":"Configuration optimized for import","logging_pod":"target-cluster-1-import","filename":"override.conf"}
{"level":"info","ts":"2025-08-18T14:36:02.702752122Z","msg":"Starting up instance","logging_pod":"target-cluster-1-import","pgdata":"/var/lib/postgresql/data/pgdata","options":["start","-w","-D","/var/lib/postgresql/data/pgdata","-o","-c port=5432 -c unix_socket_directories=/controller/run","-t 40000000","-o","-c listen_addresses='127.0.0.1'"]}
{"level":"info","ts":"2025-08-18T14:36:02.722227778Z","logger":"pg_ctl","msg":"waiting for server to start....2025-08-18 14:36:02.722 UTC [27] LOG: redirecting log output to logging collector process","pipe":"stdout","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:02.722242494Z","logger":"pg_ctl","msg":"2025-08-18 14:36:02.722 UTC [27] HINT: Future log output will appear in directory \"/controller/log\".","pipe":"stdout","logging_pod":"target-cluster-1-import"}
...
{"level":"info","ts":"2025-08-18T14:36:03.863960314Z","logger":"pg_ctl","msg":"waiting for server to shut down.... done","pipe":"stdout","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:03.864575334Z","logger":"pg_ctl","msg":"server stopped","pipe":"stdout","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:03.865685453Z","msg":"Exited log pipe","fileName":"/controller/log/postgres.csv","logging_pod":"target-cluster-1-import"}
{"level":"info","ts":"2025-08-18T14:36:03.874382379Z","msg":"Updated replication settings","logging_pod":"target-cluster-1-import","filename":"override.conf"}
{"level":"info","ts":"2025-08-18T14:36:03.874657568Z","msg":"Running initdb --sync-only","logging_pod":"target-cluster-1-import","pgdata":"/var/lib/postgresql/data/pgdata"}
{"level":"info","ts":"2025-08-18T14:36:03.988362667Z","logger":"initdb","msg":"syncing data to disk ... ok\n","pipe":"stdout","logging_pod":"target-cluster-1-import"}
Let's connect to the target cluster and verify our migration:
kubectl exec -it pod/target-cluster-1 -- /bin/bash
Defaulted container "postgres" out of: postgres, bootstrap-controller (init)
postgres@target-cluster-1:/$ psql
psql (15.13 (Debian 15.13-1.pgdg110+1))
Type "help" for help.

postgres=# \l
                                                            List of databases
        Name         |        Owner        | Encoding | Collate | Ctype | ICU Locale | Locale Provider |   Access privileges
---------------------+---------------------+----------+---------+-------+------------+-----------------+-----------------------
 postgres            | postgres            | UTF8     | C       | C     |            | libc            |
 template0           | postgres            | UTF8     | C       | C     |            | libc            | =c/postgres          +
                     |                     |          |         |       |            |                 | postgres=CTc/postgres
 template1           | postgres            | UTF8     | C       | C     |            | libc            | =c/postgres          +
                     |                     |          |         |       |            |                 | postgres=CTc/postgres
 to_be_cloned_target | to_be_cloned_target | UTF8     | C       | C     |            | libc            |
(4 rows)

postgres=# \c to_be_cloned_target
You are now connected to database "to_be_cloned_target" as user "postgres".
to_be_cloned_target=# \d
           List of relations
 Schema | Name | Type  |        Owner
--------+------+-------+---------------------
 public | test | table | to_be_cloned_target
(1 row)

to_be_cloned_target=# select count(*) from test;
  count
---------
 1000000
(1 row)
As expected, the import completed successfully and the target database is named to_be_cloned_target.
CNPG allows us to migrate an existing database cluster into a Kubernetes environment safely with the pg_basebackup and import methods. Which method to choose depends on the use case and requirements: pg_basebackup creates a physical copy of the source cluster but needs hardware and version compatibility, while import offers a bit more flexibility and allows major version upgrades. With either approach, CNPG handles the orchestration, making it easier to move existing workloads into a resilient, cloud-native PostgreSQL environment.