Everybody needs a little toy to play with, so I thought: why not buy a toy that helps me get rid of my energy bill? So I ordered a 10.5 kWp photovoltaic system for my house. The system was shipped with a Kostal Pico inverter, which makes sure the electricity can be fed directly into the grid.
Kostal offers an additional device which lets you chart your electricity production over time. But why pay if you can do it yourself with a little shell script and, most importantly, PostgreSQL?

As there is hardly any code available on the net showing how to access the Kostal Pico, I decided to publish mine:

#!/bin/sh

KUSER=pvserver
KPASS=pvwr
KIP=192.168.0.201
PGUSER=hs
PGDATABASE=test

SQL="WITH x AS (SELECT date_trunc('hour', tstamp) AS hour, \
round(avg(v1), 2) AS source_1, \
round(avg(v2), 2) AS source_2, \
round(avg(v3), 2) AS source_3, \
round(avg(v1 + v2 + v3), 2) AS total \
FROM (SELECT *, '2013-04-10 05:00:00+02'::timestamptz + (t || ' seconds')::interval AS tstamp \
FROM kostal) AS a \
GROUP BY 1) \
SELECT y AS time, source_1, source_2, source_3, total \
FROM generate_series((SELECT min(hour) FROM x), (SELECT max(hour) FROM x), '1 hour') AS y \
LEFT JOIN (SELECT * FROM x) AS z \
ON (z.hour = y) \
ORDER BY 1 ;"

wget "http://$KUSER:$KPASS@$KIP/LogDaten.dat" -O - 2> /dev/null | \
sed '1,7d' | \
sed -e 's/[ \t]\+/;/g' -e 's/^;//g' | \
grep -v 'h;' | \
grep -v 'POR' | \
cut -f1,4,9,14 -d ';' - | \
awk 'BEGIN { print "CREATE TEMPORARY TABLE tmp_kostal (t int8, v1 int4, v2 int4, v3 int4); \
COPY tmp_kostal FROM stdin DELIMITER \x27;\x27 ;" }
{ print }
END { print "\\.\n ; \
INSERT INTO kostal SELECT * FROM tmp_kostal EXCEPT SELECT * FROM kostal; " }' |
psql $PGDATABASE -U $PGUSER

echo "running analysis ..."
psql $PGDATABASE -U $PGUSER -c "$SQL"

The Kostal Pico offers a somewhat disgusting web interface providing you with textual data, and it took a while to figure out what those columns actually mean. The nasty thing about it is that the data stream does not contain a real timestamp but the number of seconds since the system went into production (if the system is shut down for maintenance, this counter will NOT advance, but I have left this out; it would be too complicated for a prototype to take into consideration).

We simply use wget to fetch the data and pipe it through a series of processing steps. The important thing here is that we have to cut out some lines and columns to turn this into a format PostgreSQL can read (in our case semicolon-separated).
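To illustrate what the sed/cut stage does, here is the same cleanup applied to a single fabricated input line (the real LogDaten.dat has more columns; all values here are invented):

```shell
#!/bin/sh
# One fabricated data line: whitespace-separated, with the interesting
# values sitting in columns 1, 4, 9 and 14.
printf '  2435059 0 0 793 0 0 0 0 548 0 0 0 0 651 0\n' | \
sed -e 's/[ \t]\+/;/g' -e 's/^;//g' | \
cut -f1,4,9,14 -d ';'
# prints 2435059;793;548;651
```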
In my case the data will have 4 columns:

test=# SELECT * FROM kostal ORDER BY t DESC LIMIT 10;
    t    | v1  | v2  | v3
---------+-----+-----+-----
 2435059 | 793 | 548 | 651
 2434159 | 412 | 285 | 317
 2433259 | 309 | 213 | 255
 2432359 | 561 | 388 | 454
 2431459 | 476 | 330 | 341
 2430559 | 423 | 293 | 303
 2429659 | 449 | 310 | 348
 2428759 | 236 | 163 | 188
 2427859 | 136 |  94 | 106
 2426959 | 105 |  73 |  83
(10 rows)

The first column in our PostgreSQL table is the timestamp we have already talked about. The next three columns represent my three solar fields; each of them reports its own production data. Once the data is downloaded, we merge it into the existing table so that we can run the script over and over again without destroying anything. I have yet to figure out whether Kostal always sends all the data it has, but we don't have to care: PostgreSQL persists the data anyway, and we just add whatever is missing. The merging process can be a little crude here because we only get one row every 15 minutes, making the amount of data in our PostgreSQL table close to irrelevant.
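The INSERT ... EXCEPT statement in the script simply appends the rows that are not already present. The same keep-only-new-rows idea can be sketched with plain shell tools; this is only an illustration of the pattern, not what the script runs, and the file names are made up:

```shell
#!/bin/sh
# existing.dat: rows already stored; download.dat: a fresh fetch that
# partially overlaps. Keep only the rows we have not seen yet.
printf '2426959;105;73;83\n2427859;136;94;106\n' > existing.dat
printf '2427859;136;94;106\n2428759;236;163;188\n' > download.dat
grep -vxF -f existing.dat download.dat > new_rows.dat
cat existing.dat new_rows.dat
# prints the two existing rows followed by 2428759;236;163;188
```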

Analyzing a timeseries

Now we can run some SQL to analyze the time series in our database. To make the code a little more robust I have added the system's startup time to the SQL directly. We could certainly calculate it with a simple window function, but that would give us a wrong chart in case the HTTP request itself failed.
One issue is that I wanted the time series to be complete. Meaning: if there is no production during the night, the Kostal Pico will not provide us with data for that time, so we somehow have to fill the gaps. We do so by outer-joining our aggregated data with generate_series.

The result looks like this:

2013-04-26 04:00:00+02 | | | |
2013-04-26 05:00:00+02 | 0.00 | 0.00 | 0.00 | 0.00
2013-04-26 06:00:00+02 | 113.00 | 84.50 | 60.00 | 257.50
2013-04-26 07:00:00+02 | 1687.75 | 1173.75 | 349.75 | 3211.25
2013-04-26 08:00:00+02 | 2873.00 | 1980.00 | 1098.50 | 5951.50
2013-04-26 09:00:00+02 | 3353.50 | 2306.00 | 1672.75 | 7332.25
2013-04-26 10:00:00+02 | 3539.75 | 2429.00 | 2097.75 | 8066.50
2013-04-26 11:00:00+02 | 3469.50 | 2385.75 | 2377.00 | 8232.25
2013-04-26 12:00:00+02 | 3250.50 | 2233.50 | 2526.50 | 8010.50
2013-04-26 13:00:00+02 | 2823.00 | 1938.50 | 2517.50 | 7279.00
2013-04-26 14:00:00+02 | 2179.00 | 1491.75 | 2346.75 | 6017.50
2013-04-26 15:00:00+02 | 1322.75 | 868.00 | 2041.00 | 4231.75
2013-04-26 16:00:00+02 | 481.25 | 311.25 | 967.50 | 1760.00
2013-04-26 17:00:00+02 | 357.50 | 242.00 | 407.50 | 1007.00
2013-04-26 18:00:00+02 | 438.00 | 301.75 | 408.50 | 1148.25
2013-04-26 19:00:00+02 | 121.50 | 83.25 | 110.50 | 315.25
2013-04-26 20:00:00+02 | 9.50 | 5.00 | 7.00 | 21.50
2013-04-26 21:00:00+02 | | | |
2013-04-26 22:00:00+02 | | | |

We can see a nice production peak during the morning hours. This is expected because two thirds of the panels are facing east and one third is facing roughly south. The data looks realistic: a peak of around 8.2 kW shortly before noon is perfectly plausible.

The nice thing here is that it is not too hard to pipe this data into gnuplot or any other software to come up with a nice chart.
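As a starting point, a gnuplot script for the hourly output could look roughly like this. The here-document just prints the script; production.dat is an assumed file holding the psql output from above, with | as the column separator:

```shell
#!/bin/sh
# Print a gnuplot script for the hourly production data. The file name
# production.dat and the column layout (time | s1 | s2 | s3 | total)
# are assumptions based on the query output above.
cat <<'EOF'
set datafile separator "|"
set xdata time
set timefmt "%Y-%m-%d %H:%M:%S"
set ylabel "production (W)"
plot "production.dat" using 1:5 with lines title "total"
EOF
```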