Testing GPU-accelerated PostgreSQL

Installing CUDA

At this point, installing CUDA on Linux might be the hardest part of the entire undertaking. The CUDA installer only works reliably when no X server is running. A simple "init 1" should solve this problem, however.
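
Before starting the installer, it can help to confirm that no X server is still running. The following is a minimal sketch, not part of the CUDA installer: the function name x_server_in_list is made up for illustration, and it assumes the X server process is called X or Xorg, which may differ on your distribution.

```shell
# Hypothetical helper: reads process names from stdin (one per line) and
# succeeds if one of them is exactly "X" or "Xorg".
# Assumption: the X server runs under one of these two process names.
x_server_in_list() {
    grep -Eqx 'X|Xorg'
}

# Usage: feed it the names of all running processes.
#   if ps -e -o comm= | x_server_in_list; then
#       echo "X server is still running; stop it before installing CUDA"
#   fi
```

The check only looks at process names, so it is a quick sanity test rather than a guarantee that the installer will run cleanly.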

Before you get started with pg_strom, it is usually a good idea to check whether the GPU has been detected properly:

[hs@laura ~]$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
[hs@laura deviceQuery]$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: 'GeForce GTX 970'
  CUDA Driver Version / Runtime Version          7.5 / 7.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4095 MBytes (4294246400 bytes)
  (13) Multiprocessors, (128) CUDA Cores/MP:     1664 CUDA Cores
  GPU Max Clock rate:                            1253 MHz (1.25 GHz)
  Memory Clock rate:                             3505 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 1835008 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GeForce GTX 970
Result = PASS

If the test passes, CUDA is ready for PostgreSQL.
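
The check can also be scripted instead of reading the output by eye. This is a minimal sketch: the function name device_query_passed is made up, and it relies only on the final "Result = PASS" line that deviceQuery printed in the listing above.

```shell
# Hypothetical helper: reads deviceQuery output from stdin and succeeds
# only if it contains a line starting with "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS'
}

# Usage (path as in the listing above; adjust it to your CUDA installation):
#   /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery | device_query_passed \
#       && echo "CUDA device detected" \
#       || echo "CUDA device check failed"
```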

Installing PostgreSQL with CUDA

Installing pg_strom for PostgreSQL is not hard. Build PostgreSQL from source first, then build pg_strom against it:

$ git clone https://github.com/postgres/postgres.git pgsql
$ cd pgsql
$ ./configure --enable-debug --enable-cassert
$ make
$ sudo make install

$ git clone https://github.com/pg-strom/devel pg_strom
$ cd pg_strom
$ which pg_config
/usr/local/pgsql/bin/pg_config
$ make
$ sudo make install
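
The "which pg_config" step matters because the pg_strom build uses whichever pg_config comes first in the path. A minimal sketch of that check as a script follows; the function name pg_config_on_path_is_in is made up, and the expected directory is whatever your PostgreSQL installation uses (/usr/local/pgsql/bin in the listing above).

```shell
# Hypothetical helper: succeeds if the first pg_config on PATH lives in the
# directory given as the first argument, fails otherwise (or if none is found).
pg_config_on_path_is_in() {
    expected_bindir=$1
    found=$(command -v pg_config) || return 1
    [ "$(dirname "$found")" = "$expected_bindir" ]
}

# Usage, before running "make" in the pg_strom directory:
#   pg_config_on_path_is_in /usr/local/pgsql/bin \
#       || echo "pg_config points to a different PostgreSQL installation"
```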

In my case, I had to uncomment three lines in a pg_strom header file because my version of PostgreSQL was somewhat newer than pg_strom expected. This is a small fix, nothing major.
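
pg_strom has to be listed in shared_preload_libraries before the server is started. The following is a minimal sketch of how that could be scripted. The function name enable_pg_strom is made up, and the value '$libdir/pg_strom' is an assumption about how the library is referenced; check the pg_strom documentation for your version.

```shell
# Hypothetical helper: appends a shared_preload_libraries entry for pg_strom
# to the postgresql.conf given as the first argument. It refuses to touch a
# file that already has an active (uncommented) shared_preload_libraries line.
enable_pg_strom() {
    conf=$1
    if grep -Eq '^[[:space:]]*shared_preload_libraries[[:space:]]*=' "$conf"; then
        echo "shared_preload_libraries is already set in $conf; edit it by hand" >&2
        return 1
    fi
    # Assumption: the library is loaded as $libdir/pg_strom.
    printf "shared_preload_libraries = '%s'\n" '$libdir/pg_strom' >> "$conf"
}

# Usage (data directory as used later in this post):
#   enable_pg_strom /data/dbstrom/postgresql.conf
```

Appending rather than editing in place keeps the sketch simple; if the setting already lists other libraries, pg_strom has to be added to that list by hand.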

Once pg_strom has been added to shared_preload_libraries in postgresql.conf, the system is ready for action. In my case, starting the database produces the following output:

[hs@laura ~]$ pg_ctl -D /data/dbstrom/ start
server starting
LOG:  CUDA Runtime version: 7.5.0
LOG:  NVIDIA driver version: 352.30
LOG:  GPU0 GeForce GTX 970 (1664 CUDA cores, 1253MHz), L2 1792KB, RAM 4095MB (256bits, 3505MHz), capability 5.2
LOG:  NVRTC - CUDA Runtime Compilation vertion 7.0
LOG:  database system shutdown was interrupted; last known up at 2015-08-27 21:05:41 CEST
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  invalid record length at 0/A54973F8
LOG:  redo is not required
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

The important point is that the CUDA device has to show up in the LOG messages during PostgreSQL startup (the GPU0 line above). Otherwise there is a problem with the driver.
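
If the server writes its startup messages to a log file, the same check can be automated. This is a minimal sketch: the function name gpu_in_startup_log is made up, and it assumes the device line looks like the "LOG:  GPU0 ..." line shown above, which may change between pg_strom versions.

```shell
# Hypothetical helper: reads a PostgreSQL startup log from stdin and succeeds
# if it contains at least one GPU device line such as
# "LOG:  GPU0 GeForce GTX 970 (...)".
gpu_in_startup_log() {
    grep -Eq 'LOG:[[:space:]]+GPU[0-9]+ '
}

# Usage (the log file path is a placeholder):
#   gpu_in_startup_log < /path/to/logfile \
#       || echo "no GPU reported at startup; check the NVIDIA driver"
```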

The beauty is that pg_strom automatically uses GPU code when it seems useful. The user does not have to worry about where the code is actually executed. The optimizer makes those decisions for you automatically.
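
To see what the optimizer decided for a given query, you can look at the execution plan. The following is a minimal sketch under assumptions: the function name plan_uses_gpu is made up, the database mydb and the table t_test are placeholders, and it assumes that GPU plan nodes appear in EXPLAIN output as custom scans whose names start with "Gpu" (for example GpuScan), which may differ in your pg_strom version.

```shell
# Hypothetical helper: reads EXPLAIN output from stdin and succeeds if the
# plan contains a custom scan node whose name starts with "Gpu".
# Assumption: pg_strom labels its plan nodes as "Custom Scan (Gpu...)".
plan_uses_gpu() {
    grep -q 'Custom Scan (Gpu'
}

# Usage (mydb and t_test are placeholders):
#   psql -d mydb -c 'EXPLAIN SELECT count(*) FROM t_test' | plan_uses_gpu \
#       && echo "the optimizer chose a GPU plan" \
#       || echo "the optimizer chose a CPU plan"
```

A CPU plan here is not an error; it only means the optimizer did not expect the GPU to pay off for that query.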

So far pg_strom seems pretty promising. Of course, it is not ready for production yet, but it is definitely worth investigating further and running tests next week.

9 responses to “Testing GPU-accelerated PostgreSQL”

Shaun Thomas says:

September 1, 2015 at 3:05 pm

I saw this during PGCon this year. It's extremely exciting. Though AMD cards generally have more cores, so it makes me wonder if there will ever be an AMD equivalent to this.

- Krzysztof Nienartowicz says:
  
  September 2, 2015 at 1:33 pm
  
  it's openCL based, so should work with AMD too. CUDA provides OpenCL stack too and from what I understood PgStrom is OpenCL based.
  
  - ⚡ ⚕ Ayy LOLz LMAO ⚕ ⚡ says:
    
    September 11, 2016 at 2:38 pm
    
    Not anymore, it seems...
    
Vincenzo Romano says:

September 1, 2015 at 3:19 pm

Maybe my expectations for this article were too high. But seeing real tests (CUDA vs non-CUDA) was my interest in it.
Will wait a little more.

- Shaun Thomas says:
  
  September 1, 2015 at 3:39 pm
  
  Performance metrics start on slide 17. He showed this at PGCon in Ottawa this year:
  
  http://www.slideshare.net/kaigai/gpgpu-accelerates-postgresql
  
ri_s says:

September 1, 2015 at 10:12 pm

Ugh, CUDA is an NVidia product whose main purpose is proprietary lock-in. OpenCL solutions are generally preferable.

- ⚡ ⚕ Ayy LOLz LMAO ⚕ ⚡ says:
  
  September 11, 2016 at 2:38 pm
  
  Sad thing is, OpenCL support was dropped some time ago...
  
MKan says:

December 1, 2016 at 10:25 am

After installing postgres and pgstrom, When I run '/usr/local/pgsql/bin/pgctl -D /usr/local/pgsql/data -l logfile start', the command works. But the output is just 'server starting'. How do I know that pg-strom is running or not?

Chang Eric says:

September 16, 2019 at 1:31 am

hi , you can start postgresql service with that gtx970 ? I try to install CUDA&pgstrom , and when I restart the postgresql service , there's some error messages in postgresql indicates that gtx970 is not supported.


Hans-Jürgen Schönig