Structure
---------

nilmdb.nilmdb is the NILM database interface.  A nilmdb.BulkData
interface stores data in flat files, and a SQL database tracks
metadata and ranges.

Access to the nilmdb must be single-threaded.  This is handled with
the nilmdb.serializer class.  In the future, this could probably be
turned into a per-path serialization.

nilmdb.server is an HTTP server that provides an interface to talk,
through the serialization layer, to the nilmdb object.

nilmdb.client is an HTTP client that connects to this.
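
A minimal sketch of the serialization idea, assuming a proxy class that
funnels every method call through one worker thread (names here are
illustrative; the real nilmdb.serializer differs in detail):

    import queue
    import threading

    class SerializedProxy(object):
        """Forward all method calls to a single worker thread, so the
        wrapped object never sees concurrent access."""
        def __init__(self, obj):
            self._obj = obj
            self._requests = queue.Queue()
            t = threading.Thread(target=self._worker)
            t.daemon = True
            t.start()

        def _worker(self):
            # Only this thread ever touches self._obj.
            while True:
                (method, args, reply) = self._requests.get()
                try:
                    reply.put((getattr(self._obj, method)(*args), None))
                except Exception as e:
                    reply.put((None, e))

        def __getattr__(self, name):
            def call(*args):
                reply = queue.Queue()
                self._requests.put((name, args, reply))
                (result, exc) = reply.get()
                if exc is not None:
                    raise exc
                return result
            return call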

Sqlite performance
------------------

Committing a transaction in the default sync mode (PRAGMA synchronous=FULL)
takes about 125 ms.  The sqlite3 module will commit transactions at three
points:

1. an explicit con.commit()

2. between a series of DML commands and non-DML commands, e.g.
   after a series of INSERT, SELECT, but before a CREATE TABLE or PRAGMA.

3. at the end of an explicit transaction, e.g. "with self.con as con:"

To speed up testing, or if this transaction speed becomes an issue,
the sync=False option to NilmDB will set PRAGMA synchronous=OFF.
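
For reference, a minimal sketch of what sync=False amounts to (the table
here is made up for illustration):

    import sqlite3

    con = sqlite3.connect("data.sql")
    con.execute("PRAGMA synchronous=OFF")    # default FULL costs ~125 ms/commit
    con.execute("CREATE TABLE IF NOT EXISTS metadata (key TEXT, value TEXT)")
    with con:                                # commits when the block exits
        con.execute("INSERT INTO metadata VALUES (?, ?)", ("version", "1"))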

Inserting streams
-----------------

We need to send the contents of "data" as POST.  Do we need chunked
transfer?

- We don't know the size in advance, so we would need to use chunked
  encoding if we send the entire thing in one request.
- But we shouldn't send one chunk per line, so we need to buffer some
  anyway; why not just make new requests?
- Consider the infinite-streaming case: we might want to send it
  immediately?  Not really -- the server should still do explicit
  inserts of fixed-size chunks.
- Even chunked encoding needs the size of each chunk beforehand, so
  everything still gets buffered.  It's just a tradeoff of buffer size.

Before timestamps are added:

- Raw data is about 440 kB/s (9 channels)
- Prep data is about 12.5 kB/s (1 phase)

- How do we know how much data to send?
  - Remember that we can only do maybe 8-50 transactions per second on
    the sqlite database.  So if one block of inserted data is one
    transaction, we'd need the raw case to be around 64 kB per request,
    ideally more.
  - Maybe use a range, based on how long it's taking to read the data
    (see the sketch after this list):
    - If there is no more data, send it
    - If data > 1 MB, send it
    - If more than 10 seconds have elapsed, send it
  - Should those numbers come from the server?
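
A sketch of that send-threshold logic, assuming data arrives as an
iterable of ASCII lines and a send() callable makes one HTTP request per
block (names are illustrative, not the actual client code):

    import time

    MAX_BYTES = 1024 * 1024      # send once 1 MB is buffered
    MAX_SECONDS = 10             # ... or once 10 seconds have elapsed

    def send_in_blocks(lines, send):
        buf = []
        size = 0
        start = time.time()
        for line in lines:
            buf.append(line)
            size += len(line)
            if size >= MAX_BYTES or (time.time() - start) >= MAX_SECONDS:
                send("".join(buf))
                buf, size, start = [], 0, time.time()
        if buf:                  # no more data: send whatever is left
            send("".join(buf))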

Converting from ASCII to PyTables:

- For each row getting added, we need to set attributes on a PyTables
  Row object and call table.append().  This means that there isn't a
  particularly efficient way of converting from ASCII.
- Could create a function like nilmdb.layout.Layout("foo").fillRow(asciiline)
  - But this means we're doing parsing on the serialized side
  - Let's keep parsing on the threaded server side so we can detect
    errors better, and not block the serialized nilmdb for a slow
    parsing process.
- Client sends ASCII data
- Server converts this ASCII data to a list of values
- Maybe:

        # threaded side creates this object
        parser = nilmdb.layout.Parser("layout_name")
        # threaded side parses and fills it with data
        parser.parse(textdata)
        # serialized side pulls out rows
        for n in xrange(parser.nrows):
            parser.fill_row(rowinstance, n)
            table.append()

Inserting streams, inside nilmdb
--------------------------------

- First check that the new stream doesn't overlap (see the sketch below).
  - Get minimum timestamp and maximum timestamp from the data parser.
    - (extend parser to verify monotonicity and track extents)
  - Get all intervals for this stream in the database
  - See if the new interval overlaps any existing ones
    - If so, bail
  - Question: should we cache intervals inside NilmDB?
    - Assume the database is fast for now, and always rebuild from the DB.
    - Can add a caching layer later if we need to.
  - `stream_get_ranges(path)` -> return IntervalSet?
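
A sketch of that overlap check (names are made up; the real code works
in terms of nilmdb's Interval/IntervalSet classes):

    def check_insert_allowed(existing_intervals, new_start, new_end):
        """existing_intervals: list of (start, end) pairs already stored."""
        for (start, end) in existing_intervals:
            # Half-open intervals [start, end) overlap if each one
            # starts before the other one ends.
            if new_start < end and start < new_end:
                raise ValueError("new data overlaps existing interval "
                                 "[%r, %r)" % (start, end))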

Speed
-----

- First approach was quadratic.  Adding four hours of data:

        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 /bpnilm/1/raw
        real    24m31.093s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 /bpnilm/1/raw
        real    43m44.528s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-130002 /bpnilm/1/raw
        real    93m29.713s
        $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-140003 /bpnilm/1/raw
        real    166m53.007s

- Disabling pytables indexing didn't help:

        real    31m21.492s
        real    52m51.963s
        real    102m8.151s
        real    176m12.469s

- Server RAM usage is constant.

- Speed problems were due to IntervalSet speed: parsing intervals from
  the database and adding the new one each time.

  - First optimization is to cache the result of `nilmdb:_get_intervals`,
    which gives the best speedup.

  - Also switched to internally using bxInterval from the bx-python
    package.  Speed of `tests/test_interval:TestIntervalSpeed` is pretty
    decent and seems to be growing logarithmically now.  About 85 μs per
    insertion for inserting 131k entries.

  - Storing the interval data in SQL might be better, with a scheme like:
    http://www.logarithmic.net/pfh/blog/01235197474

- Next slowdown target is nilmdb.layout.Parser.parse().

  - Rewrote parsers using cython and sscanf.
  - Stats (rev 10831), with `_add_interval` disabled:

        layout.pyx.Parser.parse:128       6303 sec, 262k calls
        layout.pyx.parse:63               13913 sec, 5.1g calls
        numpy:records.py.fromrecords:569  7410 sec, 262k calls

  - Probably OK for now.

- After all updates, it now takes about 8.5 minutes to insert an hour of
  data, constant after adding 171 hours (4.9 billion data points).

- Data set size: 98 gigs = 20 bytes per data point.
  6 uint16 data + 1 uint32 timestamp = 16 bytes per point,
  so compression must be off -- will retry with compression forced on.

IntervalSet speed
-----------------

- Initial implementation was pretty slow, even with binary search in a
  sorted list.

- Replaced with bxInterval; now takes about log n time for an insertion.

  - TestIntervalSpeed with range(17,18) and profiling:
    - 85 μs each
    - 131072 calls to `__iadd__`
    - 131072 to bx.insert_interval
    - 131072 to bx.insert:395
    - 2355835 to bx.insert:106 (18x as many?)

- Tried blist too; worse than bxinterval.

- Might be algorithmic improvements to be made in Interval.py,
  like in `__and__`.

- Replaced again with rbtree.  Seems decent.  Numbers are time per
  insert for 2**17 insertions, followed by total wall time and RAM
  usage for running "make test" with `test_rbtree` and `test_interval`
  with range(5,20):

  - old values with bxinterval:
    20.2 μs, total 20 s, 177 MB RAM
  - rbtree, plain python:
    97 μs, total 105 s, 846 MB RAM
  - rbtree converted to cython:
    26 μs, total 29 s, 320 MB RAM
  - rbtree and interval converted to cython:
    8.4 μs, total 12 s, 134 MB RAM

- Would like to move Interval itself back to Python so other
  non-cythonized code like client code can use it more easily.
  Testing speed with just `test_interval` being tested, with
  `range(5,22)`, using `/usr/bin/time -v python tests/runtests.py`,
  times recorded for 2097152 insertions:

  - 52ae397 (Interval in cython):
    12.6133 μs each, ratio 0.866533, total 47 sec, 399 MB RAM
  - 9759dcf (Interval in python):
    21.2937 μs each, ratio 1.462870, total 83 sec, 1107 MB RAM

  That's a huge difference!  Instead, will keep Interval and DBInterval
  cythonized inside nilmdb, and just have an additional copy in
  nilmdb.utils for clients to use.

Layouts
-------

The current/old design has specific layouts: RawData, PrepData,
RawNotchedData.  Let's get rid of this entirely and switch to simpler
data types that are just collections and counts of a single type.
We'll still use strings to describe them, with the format:

    type_count

where type is "uint16", "float32", or "float64", and count is an integer.

nilmdb.layout.named() will parse these strings into the appropriate
handlers.  For compatibility:

    "RawData" == "uint16_6"
    "RawNotchedData" == "uint16_9"
    "PrepData" == "float32_8"
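
A sketch of how such a layout string can be parsed (illustrative only;
nilmdb.layout.named() is the real implementation):

    import re

    def parse_layout(name):
        m = re.match(r"^(uint16|float32|float64)_(\d+)$", name)
        if m is None:
            raise ValueError("invalid layout: " + name)
        return m.group(1), int(m.group(2))

    print(parse_layout("uint16_6"))     # ('uint16', 6)
    print(parse_layout("float32_8"))    # ('float32', 8)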

BulkData design
---------------

BulkData is a custom bulk data storage system that was written to
replace PyTables.  The general structure is a `data` subdirectory in
the main NilmDB directory.  Within `data`, paths are created for each
created stream.  These locations are called tables.  For example,
tables might be located at

    nilmdb/data/newton/raw/
    nilmdb/data/newton/prep/
    nilmdb/data/cottage/raw/

Each table contains:

- An unchanging `_format` file (Python pickle format) that describes
  parameters of how the data is broken up, like files per directory,
  rows per file, and the binary data format.

- Hex-named subdirectories (`"%04x"`, although more than 65536 can exist).

- Hex-named files within those subdirectories, like:

        /nilmdb/data/newton/raw/000b/010a

  The data format of these files is raw binary, interpreted by the
  Python `struct` module according to the format string in the
  `_format` file.  (A sketch of this row-to-file mapping follows this
  list.)

- Same as above, with a `.removed` suffix: an optional file (Python
  pickle format) containing a list of row numbers that have been
  logically removed from the file.  If this range covers the entire
  file, the entire file will be removed.

- Note that the `bulkdata.nrows` variable is calculated once in
  `BulkData.__init__()`, and only ever incremented during use.  Thus,
  even if all data is removed, `nrows` can remain high.  However, if
  the server is restarted, the newly calculated `nrows` may be lower
  than in a previous run due to deleted data.  To be specific, this
  sequence of events:

  - insert data
  - remove all data
  - insert data

  will result in different row numbers in the database, and differently
  numbered files on the filesystem, than the sequence:

  - insert data
  - remove all data
  - restart server
  - insert data

  This is okay!  Everything should remain consistent in both
  `BulkData` and `NilmDB`.  Not attempting to readjust `nrows` during
  deletion makes the code quite a bit simpler.

- Similarly, data files are never truncated shorter.  Removing data
  from the end of a file will not shorten it; the file will only be
  deleted when it has been fully filled and all of its data has been
  subsequently removed.
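
A sketch of the row-to-file mapping implied above.  The parameter values
and the struct format here are assumptions for illustration, not the
contents of a real `_format` file:

    import struct

    files_per_dir = 32768             # hypothetical value from _format
    rows_per_file = 65536             # hypothetical value from _format
    packer = struct.Struct("<q6H")    # e.g. int64 timestamp + six uint16s

    def row_location(row):
        """Map an absolute row number to (subdir, filename, byte offset)."""
        filenum, row_in_file = divmod(row, rows_per_file)
        dirnum, file_in_dir = divmod(filenum, files_per_dir)
        return ("%04x" % dirnum,
                "%04x" % file_in_dir,
                row_in_file * packer.size)

    print(row_location(123456789))    # ('0000', '075b', 1050020)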

Rocket
------

The original design had the nilmdb.nilmdb thread (through bulkdata)
convert from the on-disk layout to a Python list, and then the
nilmdb.server thread (from cherrypy) convert that to ASCII.  For at
least the extraction side of things, it's easy to pass bulkdata a
layout name instead, and have it convert directly from on-disk to
ASCII format, because this conversion can then be shoved into a C
module.  This module, which provides a means for converting directly
from on-disk format to ASCII or Python lists, is the "rocket"
interface.  Python is still used to manage the files and figure out
where the data should go; rocket just puts binary data directly in or
out of those files at specified locations.

Before rocket, testing speed with uint16_6 data, with an end-to-end
test (extracting data with nilmtool):

- insert: 65 klines/sec
- extract: 120 klines/sec

After switching to the rocket design, but using the Python version
(pyrocket):

- insert: 57 klines/sec
- extract: 120 klines/sec

After switching to a C extension module (rocket.c):

- insert: 74 klines/sec through insert.py; 99.6 klines/sec through nilmtool
- extract: 335 klines/sec

After client block updates (described below):

- insert: 180 klines/sec through nilmtool (pre-timestamped)
- extract: 390 klines/sec through nilmtool

Using "insert --timestamp" or "extract --bare" cuts the speed in half.

Blocks versus lines
-------------------

Generally want to avoid parsing the bulk of the data as lines if
possible, and transfer things in bigger blocks at once.

Current places where we use lines:

- All data returned by `client.stream_extract`, since it comes from
  `httpclient.get_gen`, which iterates over lines.  Not sure if this
  should be changed, because a `nilmtool extract` is just about the
  same speed as `curl -q .../stream/extract`!

- `client.StreamInserter.insert_iter` and
  `client.StreamInserter.insert_line`, which should probably get
  replaced with block versions.  There's no real need to keep
  updating the timestamp every time we get a new line of data.

  - Finished.  Just a single insert() that takes any length string and
    does very little processing until it's time to send it to the
    server.

Timestamps
----------

Timestamps are currently double-precision floats (64 bit).  Since the
mantissa is 53 bits, this can only represent about 15-17 significant
figures, and microsecond Unix timestamps like 1222333444.000111 are
already 16 significant figures.  Rounding is therefore an issue;
it's hard to be sure that converting from ASCII, then back to ASCII,
will always give the same result.

Also, if the client provides a floating point value like 1.9999999999,
we need to be careful that we don't store it as 1.9999999999 but later
print it as 2.000000, because then round-trips change the data.
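
A quick illustration of that hazard in Python, assuming the output side
formats with six decimal places as in the example above:

    >>> t = 1.9999999999
    >>> repr(t)
    '1.9999999999'
    >>> "%f" % t
    '2.000000'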

Possible solutions:

- When the client provides a floating point value to the server,
  always round to the 6th decimal digit before verifying & storing.
  Good for compatibility and simplicity.  But this still might have
  rounding issues, and clients will also need to round when doing their
  own verification.  Having every piece of code need to know which
  digit to round at is not ideal.

- Always store int64 timestamps on the server, representing
  microseconds since the epoch.  int64 timestamps are used in all HTTP
  parameters, in insert/extract ASCII strings, the client API,
  command-line raw timestamps, etc.  Pretty big change.
  This is what we'll go with...

  - Client programs that interpret the timestamps as doubles instead
    of ints will remain accurate until 2^53 microseconds, or year 2255.

  - On insert, maybe it's OK to send floating point microsecond values
    (1234567890123456.0), just to cope with clients that want to print
    everything as a double.  The server could try parsing as int64, and
    if that fails, parse as a double and truncate to int64 (see the
    sketch after this list).  However, this wouldn't catch imprecise
    inputs like "1.23456789012e+15".  But maybe that can just be
    ignored; it's likely to cause a non-monotonic error at the client.

  - Timestamps like 1234567890.123456 never show up anywhere, except
    for interfacing to datetime_tz etc.  Command-line "raw timestamps"
    are always printed as int64 values, and a new format
    "@1234567890123456" is added to the parser for specifying them
    exactly.
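
A sketch of that parse-with-fallback behavior (illustrative only, not
the actual server code):

    def parse_timestamp(s):
        """Parse an ASCII microseconds-since-epoch timestamp."""
        try:
            return int(s)
        except ValueError:
            # Client sent something like "1234567890123456.0"; parse as
            # a double and truncate to integer microseconds.
            return int(float(s))

    print(parse_timestamp("1234567890123456"))      # 1234567890123456
    print(parse_timestamp("1234567890123456.0"))    # 1234567890123456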

Binary interface
----------------

The ASCII interface is too slow for high-bandwidth processing, like
sinefits, prep, etc.  A binary interface was added so that you can
extract the raw binary out of the bulkdata storage.  This binary is
in little-endian format; e.g. in C, a uint16_6 stream would be:

    #include <endian.h>
    #include <stdint.h>

    struct {
        int64_t timestamp_le;
        uint16_t data_le[6];
    } __attribute__((packed));

Remember to byteswap (with e.g. `le64toh`/`le16toh` in C)!

This interface is used by the new `nilmdb.client.numpyclient.NumpyClient`
class, which is a subclass of the normal `nilmdb.client.client.Client`
and has all of the same functions.  It adds three new functions:

- `stream_extract_numpy` to extract data as a Numpy array
- `stream_insert_numpy` to insert data as a Numpy array
- `stream_insert_numpy_context`, a context manager for
  incrementally inserting data

It is significantly faster!  It is about 20 times faster to decimate a
stream with `nilm-decimate` when the filter code is using the new
binary/numpy interface.
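
For reference, a sketch of how that little-endian uint16_6 layout maps
onto a NumPy structured dtype (illustrative; NumpyClient handles this
internally):

    import numpy as np

    uint16_6 = np.dtype([("timestamp", "<i8"),     # int64, little-endian
                         ("data", "<u2", (6,))])   # six uint16 values

    # Pack one row and parse it back from raw bytes:
    raw = np.array([(1, [0, 1, 2, 3, 4, 5])], dtype=uint16_6).tobytes()
    rows = np.frombuffer(raw, dtype=uint16_6)
    print(rows["timestamp"], rows["data"])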

WSGI interface & chunked requests
---------------------------------

mod_wsgi requires "WSGIChunkedRequest On" to handle
"Transfer-encoding: Chunked" requests.  However, `/stream/insert`
doesn't handle this correctly right now, because:

- The `cherrypy.request.body.read()` call needs to be fixed for chunked
  requests.

- We don't want to just buffer endlessly in the server, and it will
  require some thought on how to handle data in chunks (what to do about
  interval endpoints).

It is probably better to just keep the endpoint management on the client
side, so leave "WSGIChunkedRequest Off" for now.