NVIDIA’s CUDA is definitely a great thing and I have to admit that I got excited years ago when I first learned about it. For many operations a nicely optimized GPU implementation definitely seems the way to go.
GPUs are traditionally used for scientific operations and massively parallel tasks. However, some important work is also going into the PGStrom project, which is all about bringing the power of modern GPUs to PostgreSQL: https://wiki.postgresql.org/wiki/PGStrom
At this point installing CUDA on Linux might be the hardest part of the entire undertaking. The CUDA installer only works nicely when no X server is running. A simple "init 1" should solve this problem, however.
Before you get started with pgstrom, it is usually a good idea to check whether the GPU has been detected properly:
[[email protected] ~]$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
[[email protected] deviceQuery]$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 970"
  CUDA Driver Version / Runtime Version          7.5 / 7.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4095 MBytes (4294246400 bytes)
  (13) Multiprocessors, (128) CUDA Cores/MP:     1664 CUDA Cores
  GPU Max Clock rate:                            1253 MHz (1.25 GHz)
  Memory Clock rate:                             3505 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 1835008 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):     (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.0, NumDevs = 1, Device0 = GeForce GTX 970
Result = PASS
If the test passes, CUDA is ready for PostgreSQL.
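For scripted setups the same check can be automated. The following is only a minimal sketch, assuming that deviceQuery reports success with a "Result = PASS" line as in the output shown earlier. The function name cuda_ready is made up for illustration.

```shell
# Hypothetical helper: reads deviceQuery output on stdin and succeeds
# only if it contains the "Result = PASS" line.
cuda_ready() {
    grep -q 'Result = PASS'
}

# Example usage (path taken from the session in this post):
# /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery | cuda_ready \
#     && echo "CUDA is ready" || echo "CUDA check failed"
```

The helper only looks at the final verdict. It does not verify which device was found or which driver version is installed.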
Installing PostgreSQL with CUDA
Installing pg_strom for PostgreSQL is not hard. Here is how it works:
$ git clone https://github.com/postgres/postgres.git pgsql
$ cd pgsql
$ ./configure --enable-debug --enable-cassert
$ make
$ sudo make install
$ git clone https://github.com/pg-strom/devel pg_strom
$ cd pg_strom
$ which pg_config
/usr/local/pgsql/bin/pg_config
$ make
$ sudo make install
In my case I had to uncomment 3 lines in a pg_strom header file because my version of PostgreSQL was a bit more up to date than pg_strom expected. However, this is nothing major, just a small fix.
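With the build installed, the remaining setup step is to preload the library through shared_preload_libraries in postgresql.conf. The following is a minimal sketch, not the project's official procedure. It assumes the library is registered under the name pg_strom (check the pg_strom documentation for the exact value your build expects, for example whether a $libdir/ prefix is needed). The helper name enable_pg_strom is hypothetical.

```shell
# Hypothetical helper: append a shared_preload_libraries entry to a
# postgresql.conf file unless an active (uncommented) entry already
# mentions pg_strom.
# Assumption: the library name is 'pg_strom'.
# Caveat: if other libraries are already preloaded, edit the existing
# line by hand instead, because a later entry overrides an earlier one.
enable_pg_strom() {
    conf="$1"
    if grep -Eq "^[[:space:]]*shared_preload_libraries[[:space:]]*=.*pg_strom" "$conf"; then
        echo "pg_strom already listed in $conf"
    else
        echo "shared_preload_libraries = 'pg_strom'" >> "$conf"
        echo "added pg_strom to $conf"
    fi
}

# Example usage; the setting only takes effect after a server restart:
# enable_pg_strom /data/dbstrom/postgresql.conf
# pg_ctl -D /data/dbstrom/ restart
```

Running the helper twice is harmless, since the second call finds the existing entry and leaves the file alone.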
Once pg_strom has been added to shared_preload_libraries, the system is ready for action. In my case, starting the database produces the following output:
[[email protected] ~]$ pg_ctl -D /data/dbstrom/ start
server starting
LOG:  CUDA Runtime version: 7.5.0
LOG:  NVIDIA driver version: 352.30
LOG:  GPU0 GeForce GTX 970 (1664 CUDA cores, 1253MHz), L2 1792KB, RAM 4095MB (256bits, 3505MHz), capability 5.2
LOG:  NVRTC - CUDA Runtime Compilation vertion 7.0
LOG:  database system shutdown was interrupted; last known up at 2015-08-27 21:05:41 CEST
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  invalid record length at 0/A54973F8
LOG:  redo is not required
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
The important point here is that the CUDA device has to show up in the LOG messages during PostgreSQL startup. If it does not, there is a problem with the driver.
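This check can be scripted as well. Here is a rough sketch, assuming the startup messages were captured in a file and that pg_strom reports each device in a line of the form "LOG: GPU0 ..." as in the startup log shown in this post. The helper name gpu_detected and the logfile path are assumptions.

```shell
# Hypothetical helper: succeeds if the given log file contains at least
# one GPU device line such as "LOG:  GPU0 GeForce GTX 970 (...)".
gpu_detected() {
    grep -Eq 'LOG:[[:space:]]+GPU[0-9]+ ' "$1"
}

# Example usage (logfile path is an assumption):
# pg_ctl -D /data/dbstrom/ -l /tmp/dbstrom.log -w start
# gpu_detected /tmp/dbstrom.log || echo "no CUDA device in startup log"
```

The pattern is deliberately loose, so it should still match if a log_line_prefix puts a timestamp in front of "LOG:".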
The beauty is that pgstrom automatically uses GPU code when it seems useful. The user does not have to worry about where the code is actually executed. The optimizer will make those decisions for you automatically.
So far pgstrom seems pretty promising. Of course, it is not ready for production yet, but it is definitely worth investigating further and running some tests next week.