Structure
---------

nilmdb.nilmdb is the NILM database interface.  A nilmdb.BulkData
interface stores data in flat files, and a SQL database tracks
metadata and ranges.

Access to the nilmdb must be single-threaded.  This is handled with
the nilmdb.serializer class.  In the future this could probably
be turned into a per-path serialization.

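The serializer is essentially a proxy that funnels every call through one
worker thread.  A minimal sketch of the idea, assuming a queue-based design
(the class and method names here are illustrative, not the actual
nilmdb.serializer interface):

    # Minimal sketch of the serialization idea, not the actual
    # nilmdb.serializer code: all calls are funneled through a single
    # worker thread via a queue.
    import queue
    import threading

    class SerializerProxy:
        def __init__(self, target):
            self._target = target
            self._calls = queue.Queue()
            threading.Thread(target=self._worker, daemon=True).start()

        def _worker(self):
            while True:
                func, args, kwargs, result = self._calls.get()
                try:
                    result.put((True, func(*args, **kwargs)))
                except Exception as e:
                    result.put((False, e))     # re-raised in the caller

        def __getattr__(self, name):
            func = getattr(self._target, name)
            def call(*args, **kwargs):
                result = queue.Queue()
                self._calls.put((func, args, kwargs, result))
                ok, value = result.get()
                if ok:
                    return value
                raise value
            return call

Wrapping the single NilmDB instance in a proxy like this lets multiple
server threads call into it safely, since only the worker thread ever
touches the database.
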
nilmdb.server is an HTTP server that provides an interface to talk,
through the serialization layer, to the nilmdb object.

nilmdb.client is an HTTP client that connects to this.

Sqlite performance
------------------

Committing a transaction in the default sync mode (PRAGMA synchronous=FULL)
takes about 125 msec.  sqlite3 will commit transactions at 3 times:

1. explicit con.commit()

2. between a series of DML commands and non-DML commands, e.g.
   after a series of INSERT, SELECT, but before a CREATE TABLE or
   PRAGMA.

3. at the end of an explicit transaction, e.g. "with self.con as con:"

To speed up testing, or if this transaction speed becomes an issue,
the sync=False option to NilmDB will set PRAGMA synchronous=OFF.

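For reference, the pragma behind that option is easy to exercise directly
with the standard sqlite3 module; a sketch (the connection handling is
illustrative, not NilmDB's actual code):

    # Illustrative: how a sync=False option can map onto sqlite's
    # synchronous pragma.  Not NilmDB's actual connection code.
    import sqlite3

    def open_db(path, sync=True):
        con = sqlite3.connect(path)
        if not sync:
            # Much faster commits, at the cost of durability on power loss
            con.execute("PRAGMA synchronous=OFF")
        return con

    con = open_db("test.db", sync=False)
    with con:   # implicit transaction, committed when the block exits
        con.execute("CREATE TABLE IF NOT EXISTS ranges (start INTEGER, end INTEGER)")
        con.execute("INSERT INTO ranges VALUES (?, ?)", (0, 1000000))
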
Inserting streams
-----------------

We need to send the contents of "data" as POST.  Do we need chunked
transfer?

- Don't know the size in advance, so we would need to use chunked if
  we send the entire thing in one request.
- But we shouldn't send one chunk per line, so we need to buffer some
  anyway; why not just make new requests?
- Consider the infinite-streaming case: we might want to send it
  immediately?  Not really -- the server should still do explicit
  inserts of fixed-size chunks.
- Even chunked encoding needs the size of each chunk beforehand, so
  everything still gets buffered.  Just a tradeoff of buffer size.

Before timestamps are added:

- Raw data is about 440 kB/s (9 channels)
- Prep data is about 12.5 kB/s (1 phase)
- How do we know how much data to send?
  - Remember that we can only do maybe 8-50 transactions per second on
    the sqlite database.  So if one block of inserted data is one
    transaction, we'd need the raw case to be around 64 kB per request,
    ideally more.
  - Maybe use a range, based on how long it's taking to read the data
    (see the sketch after this list):
    - If no more data, send it
    - If data > 1 MB, send it
    - If more than 10 seconds have elapsed, send it
  - Should those numbers come from the server?

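A minimal sketch of that buffering policy on the client side, assuming a
send_block() helper that performs one POST; the helper name and the exact
thresholds are assumptions for the example, not the actual client code:

    # Illustrative sketch of the block-buffering policy described above.
    # The thresholds and the send_block() helper are assumptions, not
    # the actual nilmdb.client API.
    import time

    MAX_BLOCK_BYTES = 1024 * 1024   # "if data > 1 MB, send it"
    MAX_BLOCK_SECONDS = 10          # "if more than 10 seconds have elapsed"

    def insert_lines(lines, send_block):
        buf = []
        size = 0
        started = time.monotonic()
        for line in lines:
            buf.append(line)
            size += len(line)
            if (size >= MAX_BLOCK_BYTES or
                    time.monotonic() - started >= MAX_BLOCK_SECONDS):
                send_block(b"".join(buf))
                buf, size, started = [], 0, time.monotonic()
        if buf:                     # no more data: send whatever is left
            send_block(b"".join(buf))
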
Converting from ASCII to PyTables:

- For each row getting added, we need to set attributes on a PyTables
  Row object and call table.append().  This means that there isn't a
  particularly efficient way of converting from ASCII.
- Could create a function like nilmdb.layout.Layout("foo").fillRow(asciiline)
  - But this means we're doing parsing on the serialized side
- Let's keep parsing on the threaded server side so we can detect
  errors better, and not block the serialized nilmdb for a slow
  parsing process.
- Client sends ASCII data
- Server converts this ASCII data to a list of values
- Maybe:

        # threaded side creates this object
        parser = nilmdb.layout.Parser("layout_name")
        # threaded side parses and fills it with data
        parser.parse(textdata)
        # serialized side pulls out rows
        for n in xrange(parser.nrows):
            parser.fill_row(rowinstance, n)
            table.append()

Inserting streams, inside nilmdb
--------------------------------

- First check that the new stream doesn't overlap (see the sketch after
  this list).
  - Get minimum timestamp, maximum timestamp from data parser.
    - (extend parser to verify monotonicity and track extents)
  - Get all intervals for this stream in the database
  - See if the new interval overlaps any existing ones
    - If so, bail
  - Question: should we cache intervals inside NilmDB?
    - Assume the database is fast for now, and always rebuild from the DB.
    - Can add a caching layer later if we need to.
    - `stream_get_ranges(path)` -> return IntervalSet?

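The overlap test itself is simple; a sketch using plain (start, end) pairs
rather than the real IntervalSet class:

    # Sketch of the overlap check with plain (start, end) tuples;
    # not the actual nilmdb/IntervalSet code.
    def check_no_overlap(existing, new_start, new_end):
        """existing: iterable of (start, end) intervals already in the DB."""
        for (start, end) in existing:
            # Half-open intervals [start, end) overlap when each one
            # starts before the other ends.
            if new_start < end and start < new_end:
                raise ValueError("overlap with existing interval "
                                 "[%s, %s)" % (start, end))
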
Speed
-----

- First approach was quadratic.  Adding four hours of data:

        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 /bpnilm/1/raw
        real    24m31.093s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 /bpnilm/1/raw
        real    43m44.528s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-130002 /bpnilm/1/raw
        real    93m29.713s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-140003 /bpnilm/1/raw
        real    166m53.007s

- Disabling pytables indexing didn't help:

        real    31m21.492s
        real    52m51.963s
        real    102m8.151s
        real    176m12.469s

- Server RAM usage is constant.

- Speed problems were due to IntervalSet speed, of parsing intervals
  from the database and adding the new one each time.

- First optimization is to cache result of `nilmdb:_get_intervals`,
  which gives the best speedup.

- Also switched to internally using bxInterval from bx-python package.
  Speed of `tests/test_interval:TestIntervalSpeed` is pretty decent
  and seems to be growing logarithmically now.  About 85 μs per insertion
  for inserting 131k entries.

- Storing the interval data in SQL might be better, with a scheme like:
  http://www.logarithmic.net/pfh/blog/01235197474

- Next slowdown target is nilmdb.layout.Parser.parse().

- Rewrote parsers using cython and sscanf

- Stats (rev 10831), with `_add_interval` disabled:

        layout.pyx.Parser.parse:128           6303 sec, 262k calls
        layout.pyx.parse:63                  13913 sec, 5.1g calls
        numpy:records.py.fromrecords:569      7410 sec, 262k calls

- Probably OK for now.

- After all updates, now takes about 8.5 minutes to insert an hour of
  data, constant after adding 171 hours (4.9 billion data points).

- Data set size: 98 gigs = 20 bytes per data point.
  6 uint16 data + 1 uint32 timestamp = 16 bytes per point.
  So compression must be off -- will retry with compression forced on.

IntervalSet speed
-----------------

- Initial implementation was pretty slow, even with binary search in
  a sorted list.

- Replaced with bxInterval; now takes about log n time for an insertion.

- TestIntervalSpeed with range(17,18) and profiling:
  - 85 μs each
  - 131072 calls to `__iadd__`
  - 131072 to bx.insert_interval
  - 131072 to bx.insert:395
  - 2355835 to bx.insert:106 (18x as many?)

- Tried blist too, worse than bxinterval.

- Might be algorithmic improvements to be made in Interval.py,
  like in `__and__`.

- Replaced again with rbtree.  Seems decent.  Numbers are time per
  insert for 2**17 insertions, followed by total wall time and RAM
  usage for running "make test" with `test_rbtree` and `test_interval`
  with range(5,20):

  - old values with bxinterval:
    20.2 μs, total 20 s, 177 MB RAM
  - rbtree, plain python:
    97 μs, total 105 s, 846 MB RAM
  - rbtree converted to cython:
    26 μs, total 29 s, 320 MB RAM
  - rbtree and interval converted to cython:
    8.4 μs, total 12 s, 134 MB RAM

- Would like to move Interval itself back to Python so other
  non-cythonized code like client code can use it more easily.
  Testing speed with just `test_interval` being tested, with
  `range(5,22)`, using `/usr/bin/time -v python tests/runtests.py`,
  times recorded for 2097152 insertions:

  - 52ae397 (Interval in cython):
    12.6133 μs each, ratio 0.866533, total 47 sec, 399 MB RAM
  - 9759dcf (Interval in python):
    21.2937 μs each, ratio 1.462870, total 83 sec, 1107 MB RAM

  That's a huge difference!  Instead, we'll keep Interval and DBInterval
  cythonized inside nilmdb, and just have an additional copy in
  nilmdb.utils for clients to use.

Layouts
-------

Current/old design has specific layouts: RawData, PrepData, RawNotchedData.
Let's get rid of this entirely and switch to simpler data types that are
just collections and counts of a single type.  We'll still use strings
to describe them, with format:

    type_count

where type is "uint16", "float32", or "float64", and count is an integer.

nilmdb.layout.named() will parse these strings into the appropriate
handlers.  For compatibility:

    "RawData" == "uint16_6"
    "RawNotchedData" == "uint16_9"
    "PrepData" == "float32_8"

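That string format is trivial to parse; a sketch of roughly what
nilmdb.layout.named() has to do (the return value and internal tables here
are illustrative, since the real function returns layout handler objects):

    # Illustrative sketch of parsing a "type_count" layout string and
    # the compatibility aliases.  The real nilmdb.layout.named() returns
    # a handler object, not a tuple.
    _compat = {
        "RawData": "uint16_6",
        "RawNotchedData": "uint16_9",
        "PrepData": "float32_8",
    }
    _known_types = ("uint16", "float32", "float64")

    def parse_layout(name):
        name = _compat.get(name, name)
        datatype, count = name.rsplit("_", 1)
        if datatype not in _known_types:
            raise ValueError("unknown type in layout: " + name)
        return datatype, int(count)

    # parse_layout("RawData") == ("uint16", 6)
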
BulkData design
---------------

BulkData is a custom bulk data storage system that was written to
replace PyTables.  The general structure is a `data` subdirectory in
the main NilmDB directory.  Within `data`, paths are created for each
created stream.  These locations are called tables.  For example,
tables might be located at

    nilmdb/data/newton/raw/
    nilmdb/data/newton/prep/
    nilmdb/data/cottage/raw/

Each table contains:

- An unchanging `_format` file (Python pickle format) that describes
  parameters of how the data is broken up, like files per directory,
  rows per file, and the binary data format.

- Hex named subdirectories ("%04x", although more than 65536 can exist).

- Hex named files within those subdirectories, like:

        /nilmdb/data/newton/raw/000b/010a

  The data format of these files is raw binary, interpreted by the
  Python `struct` module according to the format string in the
  `_format` file.  (See the sketch after this list.)

- Same as above, with a `.removed` suffix, is an optional file (Python
  pickle format) containing a list of row numbers that have been
  logically removed from the file.  If this range covers the entire
  file, the entire file will be removed.

- Note that the `bulkdata.nrows` variable is calculated once in
  `BulkData.__init__()`, and only ever incremented during use.  Thus,
  even if all data is removed, `nrows` can remain high.  However, if
  the server is restarted, the newly calculated `nrows` may be lower
  than in a previous run due to deleted data.  To be specific, this
  sequence of events:

  - insert data
  - remove all data
  - insert data

  will result in having different row numbers in the database, and
  differently numbered files on the filesystem, than the sequence:

  - insert data
  - remove all data
  - restart server
  - insert data

  This is okay!  Everything should remain consistent both in
  `BulkData` and `NilmDB`.  Not attempting to readjust `nrows` during
  deletion makes the code quite a bit simpler.

- Similarly, data files are never truncated shorter.  Removing data
  from the end of the file will not shorten it; it will only be
  deleted when it has been fully filled and all of the data has been
  subsequently removed.

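For illustration, here is a sketch of how a row number might map onto a
subdirectory, file, and byte offset.  The parameter values, the struct
format string, and the helper name are assumptions for this example, not
the contents of any real `_format` file:

    # Illustrative sketch of the on-disk layout.  The parameters and the
    # struct format string are assumptions, not real _format contents.
    import os
    import struct

    rows_per_file = 65536             # hypothetical _format parameters
    files_per_dir = 4096
    packer = struct.Struct("<d6H")    # e.g. double timestamp + 6 x uint16

    def row_location(table_dir, row):
        """Map a row number to (file path, byte offset within that file)."""
        filenum = row // rows_per_file
        subdir = "%04x" % (filenum // files_per_dir)
        filename = "%04x" % (filenum % files_per_dir)
        offset = (row % rows_per_file) * packer.size
        return os.path.join(table_dir, subdir, filename), offset

    # e.g. row_location("nilmdb/data/newton/raw", 123456789)
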
Rocket
------

The original design had the nilmdb.nilmdb thread (through bulkdata)
convert from the on-disk layout to a Python list, and then the
nilmdb.server thread (from cherrypy) convert to ASCII.  For at least
the extraction side of things, it's easy to pass bulkdata a layout
name instead, and have it convert directly from the on-disk format to
ASCII, because this conversion can then be shoved into a C module.
This module, which provides a means for converting directly from
on-disk format to ASCII or Python lists, is the "rocket" interface.
Python is still used to manage the files and figure out where the
data should go; rocket just puts binary data directly in or out of
those files at specified locations.

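Conceptually, the extract half of rocket is a tight loop like the following
pure-Python sketch (in the spirit of pyrocket; the struct format and the
function signature are illustrative, not the real module's interface):

    # Pure-Python sketch of what the extract side of rocket does: read
    # packed binary rows from a file and emit ASCII lines.  The struct
    # format and the function signature are illustrative.
    import struct

    def extract_ascii(fileobj, count, packer=struct.Struct("<d6H")):
        """Read `count` packed rows from fileobj, yielding ASCII lines."""
        for _ in range(count):
            row = packer.unpack(fileobj.read(packer.size))
            timestamp, values = row[0], row[1:]
            line = "%.6f %s\n" % (timestamp,
                                  " ".join(str(v) for v in values))
            yield line.encode("ascii")

Doing this row by row in Python is exactly the per-line overhead that the C
version avoids, which is where the speedups below come from.
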
Before rocket, testing speed with uint16_6 data, with an end-to-end
test (extracting data with nilmtool):

- insert: 65 klines/sec
- extract: 120 klines/sec

After switching to the rocket design, but using the Python version
(pyrocket):

- insert: 57 klines/sec
- extract: 120 klines/sec

After switching to a C extension module (rocket.c):

- insert: 74 klines/sec through insert.py; 99.6 klines/sec through nilmtool
- extract: 335 klines/sec

After client block updates (described below):

- insert: 180 klines/sec through nilmtool (pre-timestamped)
- extract: 390 klines/sec through nilmtool

Using "insert --timestamp" or "extract --bare" cuts the speed in half.

Blocks versus lines
-------------------

Generally we want to avoid parsing the bulk of the data as lines if
possible, and transfer things in bigger blocks at once.

Current places where we use lines:

- All data returned by `client.stream_extract`, since it comes from
  `httpclient.get_gen`, which iterates over lines.  Not sure if this
  should be changed, because a `nilmtool extract` is just about the
  same speed as `curl -q .../stream/extract`!

- `client.StreamInserter.insert_iter` and
  `client.StreamInserter.insert_line`, which should probably get
  replaced with block versions.  There's no real need to keep
  updating the timestamp every time we get a new line of data.

  - Finished.  Just a single insert() that takes any length string
    and does very little processing until it's time to send it to the
    server.

Timestamps
----------

Timestamps are currently double-precision floats (64-bit).  Since the
mantissa is 53-bit, this can only represent about 15-17 significant
figures, and microsecond Unix timestamps like 1222333444.000111 are
already 16 significant figures.  Rounding is therefore an issue; it's
hard to be sure that converting from ASCII, then back to ASCII, will
always give the same result.

Also, if the client provides a floating point value like 1.9999999999,
we need to be careful that we don't store it as 1.9999999999 but later
print it as 2.000000, because then round-trips change the data.

Possible solutions:

- When the client provides a floating point value to the server,
  always round to the 6th decimal digit before verifying & storing.
  Good for compatibility and simplicity.  But we still might have
  rounding issues, and clients will also need to round when doing
  their own verification.  Having every piece of code need to know
  which digit to round at is not ideal.

- Always store int64 timestamps on the server, representing
  microseconds since the epoch.  int64 timestamps are used in all HTTP
  parameters, in insert/extract ASCII strings, the client API,
  command-line raw timestamps, etc.  Pretty big change.

  This is what we'll go with...

  - Client programs that interpret the timestamps as doubles instead
    of ints will remain accurate until 2^53 microseconds, or year
    2255.

  - On insert, maybe it's OK to send floating point microsecond values
    (1234567890123456.0), just to cope with clients that want to print
    everything as a double.  The server could try parsing as int64, and
    if that fails, parse as a double and truncate to int64.  However,
    this wouldn't catch imprecise inputs like "1.23456789012e+15".  But
    maybe that can just be ignored; it's likely to cause a
    non-monotonic error at the client.

  - Timestamps like 1234567890.123456 never show up anywhere, except
    for interfacing to datetime_tz etc.  Command line "raw timestamps"
    are always printed as int64 values, and a new format
    "@1234567890123456" is added to the parser for specifying them
    exactly.

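A quick plain-Python illustration of the round-trip problem and of the
int64-microsecond convention (not nilmdb code):

    # Plain-Python illustration of the round-trip concern and of the
    # int64-microseconds convention; not actual nilmdb code.
    t = 1.9999999999
    print("%.6f" % t)           # prints "2.000000": a round trip through
                                # our ASCII output just changed the data

    # Storing int64 microseconds since the epoch avoids the ambiguity:
    t_us = 1222333444000111     # exact as an integer
    print("@%d" % t_us)         # "@1222333444000111", the raw format
    seconds = t_us / 1e6        # float form only for display; lossy
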
Binary interface
----------------

The ASCII interface is too slow for high-bandwidth processing, like
sinefits, prep, etc.  A binary interface was added so that you can
extract the raw binary out of the bulkdata storage.  This binary is
in little-endian format; e.g. in C, a uint16_6 stream would be:

    #include <endian.h>
    #include <stdint.h>

    struct {
        int64_t timestamp_le;
        uint16_t data_le[6];
    } __attribute__((packed));

Remember to byteswap (with e.g. `le64toh`/`le16toh` in C)!

This interface is used by the new `nilmdb.client.numpyclient.NumpyClient`
class, which is a subclass of the normal `nilmdb.client.client.Client`
and has all of the same functions.  It adds three new functions:

- `stream_extract_numpy` to extract data as a Numpy array

- `stream_insert_numpy` to insert data as a Numpy array

- `stream_insert_numpy_context` is the context manager for
  incrementally inserting data

It is significantly faster!  It is about 20 times faster to decimate a
stream with `nilm-decimate` when the filter code is using the new
binary/numpy interface.

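On the Python side, the same record layout can be expressed as a
little-endian Numpy dtype; a sketch of decoding one binary-extracted blob
(the dtype follows the struct above, but the surrounding code is
illustrative, not the NumpyClient implementation):

    # Illustrative: interpreting binary-extracted uint16_6 data with Numpy.
    # The dtype mirrors the packed struct above; the decode() helper is
    # just example plumbing, not the actual NumpyClient code.
    import numpy as np

    uint16_6 = np.dtype([("timestamp", "<i8"),    # int64 microseconds
                         ("data", "<u2", (6,))])  # 6 x uint16

    def decode(raw_bytes):
        """Convert one binary extract response body into a structured array."""
        return np.frombuffer(raw_bytes, dtype=uint16_6)

    # decode(body)["timestamp"] -> int64 array; ["data"] -> shape (N, 6)
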
WSGI interface & chunked requests
---------------------------------

mod_wsgi requires "WSGIChunkedRequest On" to handle
"Transfer-encoding: Chunked" requests.  However, `/stream/insert`
doesn't handle this correctly right now, because:

- The `cherrypy.request.body.read()` call needs to be fixed for
  chunked requests.

- We don't want to just buffer endlessly in the server, and it will
  require some thought on how to handle data in chunks (what to do
  about interval endpoints).

It is probably better to just keep the endpoint management on the
client side, so leave "WSGIChunkedRequest off" for now.

Unicode & character encoding
----------------------------

Stream data is passed back and forth as raw `bytes` objects in most
places, including the `nilmdb.client` and command-line interfaces.
This is done partially for performance reasons, and partially to
support the binary insert/extract options, where character-set encoding
would not apply.

For the HTTP server, the raw bytes transferred over HTTP are interpreted
as follows:

- For `/stream/insert`, the client-provided `Content-Type` is ignored,
  and the data is read as if it were `application/octet-stream`.

- For `/stream/extract`, the returned data is `application/octet-stream`.

- All other endpoints communicate via JSON, which is specified to always
  be encoded as UTF-8.  This includes:

  - `/version`
  - `/dbinfo`
  - `/stream/list`
  - `/stream/create`
  - `/stream/destroy`
  - `/stream/rename`
  - `/stream/get_metadata`
  - `/stream/set_metadata`
  - `/stream/update_metadata`
  - `/stream/remove`
  - `/stream/intervals`