nilmdb/docs/design.md

Structure
---------
nilmdb.nilmdb is the NILM database interface.  A nilmdb.BulkData
interface stores data in flat files, and a SQL database tracks
metadata and ranges.

Access to the nilmdb must be single-threaded.  This is handled with
the nilmdb.serializer class.  In the future this could probably
be turned into a per-path serialization.
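
As a rough illustration of the serialization idea, the sketch below funnels every call to a wrapped object through one worker thread.  The class name `Serializer`, its `call()` method, and the queue-based design are assumptions made for this example; this is not the actual nilmdb.serializer code.

```python
import queue
import threading


class Serializer:
    """Run every method call on `target` from one dedicated worker thread.

    Hypothetical sketch; names and structure are illustrative only.
    """

    def __init__(self, target):
        self._target = target
        self._calls = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._calls.get()
            if item is None:  # shutdown sentinel
                break
            name, args, kwargs, reply = item
            try:
                result = getattr(self._target, name)(*args, **kwargs)
                reply.put((True, result))
            except Exception as exc:  # hand the error back to the caller
                reply.put((False, exc))

    def call(self, name, *args, **kwargs):
        """Queue a method call and block until the worker has run it."""
        reply = queue.Queue(maxsize=1)
        self._calls.put((name, args, kwargs, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        """Stop the worker thread after pending calls finish."""
        self._calls.put(None)
        self._thread.join()
```

Any number of server threads can then use `serializer.call("method_name", ...)` while the wrapped object only ever sees one thread.  A per-path variant could keep one such worker per stream path.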

nilmdb.server is an HTTP server that provides an interface to talk,
through the serialization layer, to the nilmdb object.

nilmdb.client is an HTTP client that connects to this.

Sqlite performance
------------------

Committing a transaction in the default sync mode (PRAGMA synchronous=FULL)
takes about 125 ms.  sqlite3 will commit transactions at three points:

1. explicit con.commit()

2. between a series of DML commands and non-DML commands, e.g.
   after a series of INSERT, SELECT, but before a CREATE TABLE or
   PRAGMA.

3. at the end of an explicit transaction, e.g. "with self.con as con:"

To speed up testing, or if this transaction speed becomes an issue,
the sync=False option to NilmDB will set PRAGMA synchronous=OFF.
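
A minimal sketch of what such an option could do with the standard `sqlite3` module is shown below.  The helper `open_db` and the `ranges` table are made up for this example and are not taken from NilmDB.

```python
import os
import sqlite3
import tempfile


def open_db(path, sync=True):
    """Open a SQLite database; sync=False trades durability for commit speed.

    Hypothetical helper, not the NilmDB constructor.
    """
    con = sqlite3.connect(path)
    if not sync:
        # This PRAGMA has to be issued outside of a transaction.
        con.execute("PRAGMA synchronous=OFF")
    return con


path = os.path.join(tempfile.mkdtemp(), "example.db")
con = open_db(path, sync=False)
con.execute("CREATE TABLE ranges (start_time INTEGER, end_time INTEGER)")

# "with con:" commits once on success and rolls back on an exception,
# so both inserts below cost a single commit.
with con:
    con.execute("INSERT INTO ranges VALUES (?, ?)", (0, 100))
    con.execute("INSERT INTO ranges VALUES (?, ?)", (100, 200))

sync_mode = con.execute("PRAGMA synchronous").fetchone()[0]  # 0 means OFF
row_count = con.execute("SELECT COUNT(*) FROM ranges").fetchone()[0]
con.close()
```

With synchronous=OFF, a power loss or OS crash can corrupt the database, so this only suits tests or data that can be rebuilt.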


Inserting streams
-----------------

We need to send the contents of "data" as POST.  Do we need chunked
transfer?

- Don't know the size in advance, so we would need to use chunked if
  we send the entire thing in one request.
- But we shouldn't send one chunk per line, so we need to buffer some
  anyway; why not just make new requests?
- Consider the infinite-streaming case, we might want to send it
  immediately?  Not really -- server still should do explicit inserts
  of fixed-size chunks.
- Even chunked encoding needs the size of each chunk beforehand, so
  everything still gets buffered.  Just a tradeoff of buffer size.

Before timestamps are added:

- Raw data is about 440 kB/s    (9 channels)
- Prep data is about 12.5 kB/s  (1 phase)
- How do we know how much data to send?

    - Remember that we can only do maybe 8-50 transactions per second on
      the sqlite database.  So if one block of inserted data is one
      transaction, we'd need the raw case to be around 64kB per request,
      ideally more.
    - Maybe use a range, based on how long it's taking to read the data
        - If no more data, send it
        - If data > 1 MB, send it
        - If more than 10 seconds have elapsed, send it
    - Should those numbers come from the server?
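
The size and time thresholds above could be sketched on the client side as a generator that groups input lines into blocks, one block per request.  The function name `batch_lines` and its parameters are assumptions for illustration, and the 10-second limit is only checked when a new line arrives, so a stalled input will not trigger it.

```python
import time


def batch_lines(lines, max_bytes=1_000_000, max_seconds=10.0,
                clock=time.monotonic):
    """Yield blocks of joined lines, one block per intended request.

    A block is emitted when it reaches max_bytes, when max_seconds have
    passed since the block was started, or when the input ends.
    """
    buf = []
    size = 0
    started = clock()
    for line in lines:
        buf.append(line)
        size += len(line)
        if size >= max_bytes or clock() - started >= max_seconds:
            yield b"".join(buf)
            buf, size, started = [], 0, clock()
    if buf:
        # No more data: send whatever is left.
        yield b"".join(buf)
```

At roughly 440 kB/s of raw data, a 1 MB limit would give about one request every two to three seconds, which stays well under the 8-50 transactions per second estimated above.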

Converting from ASCII to PyTables:

- For each row getting added, we need to set attributes on a PyTables
  Row object and call table.append().  This means that there isn't a
  particularly efficient way of converting from ascii.
- Could create a function like nilmdb.layout.Layout("foo").fillRow(asciiline)
    - But this means we're doing parsing on the serialized side
    - Let's keep parsing on the threaded server side so we can detect
      errors better, and not block the serialized nilmdb for a slow
      parsing process.
- Client sends ASCII data
- Server converts this ASCII data to a list of values
    - Maybe:

            # threaded side creates this object
            parser = nilmdb.layout.Parser("layout_name")
            # threaded side parses and fills it with data
            parser.parse(textdata)
            # serialized side pulls out rows
            for n in xrange(parser.nrows):
                parser.fill_row(rowinstance, n)
                table.append()
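
To make the split between the threaded and serialized sides more concrete, here is a small stand-alone parser in the same spirit.  It is a sketch under assumptions: the constructor takes a column count instead of a layout name, rows are whitespace-separated with the timestamp first, and parsed rows are kept in a plain list rather than filled into PyTables rows.  It also tracks the first and last timestamp, which the overlap check needs.

```python
class Parser:
    """Parse ASCII rows of 'timestamp value value ...' into lists of floats.

    Illustrative only; not the nilmdb.layout.Parser implementation.
    """

    def __init__(self, count):
        self.count = count  # number of data columns expected per row
        self.rows = []
        self.min_timestamp = None
        self.max_timestamp = None

    @property
    def nrows(self):
        return len(self.rows)

    def parse(self, textdata):
        """Parse a block of text; raise ValueError on malformed input."""
        for lineno, line in enumerate(textdata.splitlines(), start=1):
            if not line.strip():
                continue
            fields = line.split()
            if len(fields) != self.count + 1:
                raise ValueError(
                    f"line {lineno}: expected {self.count + 1} fields")
            row = [float(field) for field in fields]
            if self.max_timestamp is not None and row[0] <= self.max_timestamp:
                raise ValueError(
                    f"line {lineno}: timestamp is not monotonically increasing")
            if self.min_timestamp is None:
                self.min_timestamp = row[0]
            self.max_timestamp = row[0]
            self.rows.append(row)
```

All error detection happens in `parse()`, so the serialized side only has to iterate over `parser.rows`.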


Inserting streams, inside nilmdb
--------------------------------

- First check that the new stream doesn't overlap.
    - Get minimum timestamp, maximum timestamp from data parser.
        - (extend parser to verify monotonicity and track extents)
    - Get all intervals for this stream in the database
    - See if new interval overlaps any existing ones
        - If so, bail
    - Question: should we cache intervals inside NilmDB?
        - Assume database is fast for now, and always rebuild from DB.
        - Can add a caching layer later if we need to.
    - `stream_get_ranges(path)` -> return IntervalSet?
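
The overlap check itself can be sketched as a linear scan over the existing intervals.  The function names are hypothetical, and treating intervals as half-open `[start, end)` is an assumption; the notes above do not say whether touching endpoints count as overlap.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end)
    share at least one point."""
    return a_start < b_end and b_start < a_end


def check_no_overlap(existing, new_start, new_end):
    """Raise ValueError if the new interval overlaps any existing one.

    `existing` is an iterable of (start, end) pairs, for example the
    intervals read back from the database for one stream.
    """
    for start, end in existing:
        if overlaps(start, end, new_start, new_end):
            raise ValueError(
                f"new interval [{new_start}, {new_end}) overlaps "
                f"existing [{start}, {end})")
```

This costs O(n) per insert, which is fine for a first version but is one reason a tree-based or cached IntervalSet becomes attractive as the number of intervals grows.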

Speed
-----

- First approach was quadratic.  Adding four hours of data:

        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 /bpnilm/1/raw
        real    24m31.093s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 /bpnilm/1/raw
        real    43m44.528s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-130002 /bpnilm/1/raw
        real    93m29.713s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-140003 /bpnilm/1/raw
        real    166m53.007s

- Disabling pytables indexing didn't help:

        real    31m21.492s
        real    52m51.963s
        real    102m8.151s
        real    176m12.469s

- Server RAM usage is constant.

- Speed problems were due to IntervalSet: on every insert, all intervals
  were parsed from the database and the new one was added.

    - First optimization is to cache result of `nilmdb:_get_intervals`,
      which gives the best speedup.

    - Also switched to internally using bxInterval from bx-python package.
      Speed of `tests/test_interval:TestIntervalSpeed` is pretty decent
      and seems to be growing logarithmically now.  About 85μs per insertion
      for inserting 131k entries.

    - Storing the interval data in SQL might be better, with a scheme like:
      http://www.logarithmic.net/pfh/blog/01235197474

- Next slowdown target is nilmdb.layout.Parser.parse().
    - Rewrote parsers using cython and sscanf
    - Stats (rev 10831), with _add_interval disabled

        layout.pyx.Parser.parse:128        6303 sec, 262k calls
         layout.pyx.parse:63               13913 sec, 5.1g calls
        numpy:records.py.fromrecords:569   7410 sec, 262k calls

- Probably OK for now.

- After all updates, now takes about 8.5 minutes to insert an hour of
  data, constant after adding 171 hours (4.9 billion data points)

- Data set size: 98 GB, or about 20 bytes per data point.
  6 uint16 data + 1 uint32 timestamp = 16 bytes per point,
  so compression must be off; will retry with compression forced on.
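
The arithmetic can be checked with the `struct` module.  This assumes "98 gigs" means decimal gigabytes and that rows are packed without padding; both are assumptions, not measurements.

```python
import struct

# One row: 1 uint32 timestamp followed by 6 uint16 values, no padding.
row_size = struct.calcsize("<I6H")

total_bytes = 98e9   # data set size, assuming decimal gigabytes
points = 4.9e9       # data points inserted
bytes_per_point = total_bytes / points

# Extra bytes stored per point beyond the raw row size.
overhead_per_point = bytes_per_point - row_size
```

Here `row_size` is 16 and `bytes_per_point` is 20, so each stored point carries about 4 bytes beyond the raw row, which is consistent with the data not being compressed.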

IntervalSet speed
-----------------
- Initial implementation was pretty slow, even with binary search in
  sorted list

- Replaced with bxInterval; now takes about log n time for an insertion
    - TestIntervalSpeed with range(17,18) and profiling
        - 85 μs each
        - 131072 calls to `__iadd__`
        - 131072 to bx.insert_interval
        - 131072 to bx.insert:395
        - 2355835 to bx.insert:106  (18x as many?)

- Tried blist too, worse than bxinterval.

- Might be algorithmic improvements to be made in Interval.py,
  like in `__and__`

- Replaced again with rbtree.  Seems decent.  Numbers are time per
  insert for 2**17 insertions, followed by total wall time and RAM
  usage for running "make test" with `test_rbtree` and `test_interval`
  with range(5,20):
    - old values with bxinterval:
      20.2 μs, total 20 s, 177 MB RAM
    - rbtree, plain python:
      97 μs, total 105 s, 846 MB RAM
    - rbtree converted to cython:
      26 μs, total 29 s, 320 MB RAM
    - rbtree and interval converted to cython:
      8.4 μs, total 12 s, 134 MB RAM

Layouts
-------
Current/old design has specific layouts: RawData, PrepData, RawNotchedData.
Let's get rid of this entirely and switch to simpler data types that are
just collections and counts of a single type.  We'll still use strings
to describe them, with format:

    type_count

where type is "uint16", "float32", or "float64", and count is an integer.

nilmdb.layout.named() will parse these strings into the appropriate
handlers.  For compatibility:

    "RawData" == "uint16_6"
    "RawNotchedData" == "uint16_9"
    "PrepData" == "float32_8"

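As an illustration of the `type_count` format, the sketch below parses a layout string and derives a `struct` format for the data columns.  The names `parse_layout` and `struct_format` are invented for this example and say nothing about how nilmdb.layout.named() is actually implemented; the timestamp column is deliberately left out.

```python
import re
import struct

# struct codes for the three supported types
_STRUCT_CODES = {"uint16": "H", "float32": "f", "float64": "d"}

# compatibility names from the old design
_ALIASES = {
    "RawData": "uint16_6",
    "RawNotchedData": "uint16_9",
    "PrepData": "float32_8",
}

_LAYOUT_RE = re.compile(r"(uint16|float32|float64)_([1-9][0-9]*)")


def parse_layout(name):
    """Return (type_name, count) for a layout string such as 'uint16_6'."""
    name = _ALIASES.get(name, name)
    match = _LAYOUT_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unknown layout: {name!r}")
    return match.group(1), int(match.group(2))


def struct_format(name):
    """Little-endian struct format for the data columns of one row."""
    type_name, count = parse_layout(name)
    return "<" + _STRUCT_CODES[type_name] * count
```

For example, `parse_layout("RawData")` gives `("uint16", 6)`, and `struct.calcsize(struct_format("float32_8"))` gives 32 bytes of data per row.
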

BulkData design
---------------

BulkData is a custom bulk data storage system that was written to
replace PyTables.  The general structure is a `data` subdirectory in
the main NilmDB directory.  Within `data`, paths are created for each
created stream.  These locations are called tables.  For example,
tables might be located at

    nilmdb/data/newton/raw/
    nilmdb/data/newton/prep/
    nilmdb/data/cottage/raw/

Each table contains:

- An unchanging `_format` file (Python pickle format) that describes
  parameters of how the data is broken up, like files per directory,
  rows per file, and the binary data format

- Hex-named subdirectories (`"%04x"`, although more than 65536 can exist)

- Hex named files within those subdirectories, like:

        /nilmdb/data/newton/raw/000b/010a

    The data format of these files is raw binary, interpreted by the
    Python `struct` module according to the format string in the
    `_format` file.

- For each data file, an optional file with the same name plus a
  `.removed` suffix (Python pickle format), containing a list of row
  numbers that have been logically removed from that data file.  If the
  removed rows cover the entire file, the entire file will be removed.

- Note that the `bulkdata.nrows` variable is calculated once in
  `BulkData.__init__()`, and only ever incremented during use.  Thus,
  even if all data is removed, `nrows` can remain high.  However, if
  the server is restarted, the newly calculated `nrows` may be lower
  than in a previous run due to deleted data.  To be specific, this
  sequence of events:

    - insert data
    - remove all data
    - insert data

    will result in having different row numbers in the database, and
    differently numbered files on the filesystem, than the sequence:

    - insert data
    - remove all data
    - restart server
    - insert data

    This is okay!  Everything should remain consistent in both
    `BulkData` and `NilmDB`.  Not attempting to readjust `nrows` during
    deletion makes the code quite a bit simpler.

- Similarly, data files are never truncated shorter.  Removing data
  from the end of the file will not shorten it; it will only be
  deleted when it has been fully filled and all of the data has been
  subsequently removed.
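
To show how the hex-named directory and file layout described above might relate to row numbers, here is a sketch that maps a row number to a relative path and an offset within that file.  The function `row_location` and the parameters `rows_per_file` and `files_per_dir` are assumptions based on the description of the `_format` file; the real BulkData mapping may differ.

```python
def row_location(row, rows_per_file, files_per_dir):
    """Map a row number to (relative_path, row_offset_within_file).

    Hypothetical mapping: rows fill files in order, and files fill
    directories in order.  Both names use "%04x", which simply grows
    wider once an index exceeds 0xffff.
    """
    file_index, offset = divmod(row, rows_per_file)
    dir_index, file_in_dir = divmod(file_index, files_per_dir)
    return "%04x/%04x" % (dir_index, file_in_dir), offset
```

Because the location depends only on the row number, never decrementing `nrows` and never truncating files keeps this mapping stable for rows that have already been written.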