NVIDIA’s CUDA is definitely a great thing and I have to admit that I already got excited years ago when I first learned about it. For many operations a nicely optimized GPU implementation definitely seems the way to go. GPUs are traditionally used for scientific operations and massively parallel tasks. However, some important work is also going into the PGStrom project, which is all about bringing the power of modern GPUs to PostgreSQL: pgstrom documentation

Installing CUDA

At this point installing CUDA on Linux might be the hardest part of the entire undertaking. The CUDA installer only works nicely, when no X-server is running. A simple „init 1“ should solve this problem, however.

Before you get started with pgstrom, it is usually a good idea to check, if the GPU has been detected properly:

[hs@laura ~]$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
[hs@laura deviceQuery]$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 970"
  CUDA Driver Version / Runtime Version          7.5 / 7.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4095 MBytes (4294246400 bytes)
  (13) Multiprocessors, (128) CUDA Cores/MP:     1664 CUDA Cores
  GPU Max Clock rate:                            1253 MHz (1.25 GHz)
  Memory Clock rate:                             3505 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 1835008 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GeForce GTX 970
Result = PASS

If the test is passed, CUDA is ready for PostgreSQL.

Installing PostgreSQL with CUDA

Installing pg_strom for PostgreSQL is basically not hard. Here is how it works:

$ git clone https://github.com/postgres/postgres.git pgsql
$ cd pgsql
$ ./configure --enable-debug --enable-cassert
$ make
$ sudo make install

$ git clone https://github.com/pg-strom/devel pg_strom
$ cd pg_strom
$ which pg_config
$ make
$ sudo make install

What happened in my case was that I had to uncomment 3 lines in a pg_strom header file because my version of PostgreSQL was a bit more up to date than expected. However, this is nothing major. It is more of a small fix.

Once pg_strom has been added to shared_preload_libraries, the system is already ready for action. In my case starting the database shows the following listing:

[hs@laura ~]$ pg_ctl -D /data/dbstrom/ start
server starting
LOG:  CUDA Runtime version: 7.5.0
LOG:  NVIDIA driver version: 352.30
LOG:  GPU0 GeForce GTX 970 (1664 CUDA cores, 1253MHz), L2 1792KB, RAM 4095MB (256bits, 3505MHz), capability 5.2
LOG:  NVRTC - CUDA Runtime Compilation vertion 7.0
LOG:  database system shutdown was interrupted; last known up at 2015-08-27 21:05:41 CEST
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  invalid record length at 0/A54973F8
LOG:  redo is not required
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

The important point here is that during PostgreSQL startup the CUDA device has to be in the LOG message – otherwise there is a problem with the driver.

The beauty is that pgstrom automatically uses GPU code when it seems useful. The user does not have to worry about where the code is actually executed. The optimizer will make those decisions for your automatically.

So far pgstrom seems pretty promising. Of course, it is not ready for production yet, but it is definitely worth investigating the issue further and run tests next week.