|
|
Speed
-----

- First approach was quadratic: adding four hours of data, each hour
  took roughly twice as long to insert as the previous one:

    $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 /bpnilm/1/raw
    real    24m31.093s
    $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 /bpnilm/1/raw
    real    43m44.528s
    $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-130002 /bpnilm/1/raw
    real    93m29.713s
    $ time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-140003 /bpnilm/1/raw
    real    166m53.007s
- Disabling pytables indexing didn't help.

- Server RAM usage is constant.

- Speed problems were due to IntervalSet speed: every insert parsed
  all existing intervals from the database and then added the new one.

  - First optimization is to cache the result of
    `nilmdb:_get_intervals`, which gives the best speedup (see the
    sketch after this list).

  - Also switched to internally using bxInterval from the bx-python
    package.  Speed of `tests/test_interval:TestIntervalSpeed` is
    pretty decent and seems to be growing logarithmically now; about
    85 μs per insertion when inserting 131k entries.

  - Storing the interval data in SQL might be better, with a scheme
    like the one described at http://www.logarithmic.net/pfh/blog/01235197474
    (a schema sketch also follows below).
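As a rough illustration of the caching idea (hypothetical code, not
nilmdb's actual implementation): keep the parsed interval set for each
stream in memory and update it in place when a new interval is added,
so later inserts never re-read the whole list from the database.

    # Hypothetical sketch: cache parsed intervals per stream path so
    # each insert doesn't re-read and re-parse them from the database.
    class IntervalCache(object):
        def __init__(self, load_func):
            self._load = load_func      # expensive DB read + parse
            self._cache = {}            # stream path -> list of (start, end)

        def get(self, path):
            if path not in self._cache:
                self._cache[path] = self._load(path)
            return self._cache[path]

        def add(self, path, interval):
            # Update the cached copy instead of invalidating it, so the
            # next insert still hits the cache.
            self.get(path).append(interval)

    # Example with a stand-in loader that starts each stream empty:
    cache = IntervalCache(lambda path: [])
    cache.add("/bpnilm/1/raw", (0, 3600))
    print(cache.get("/bpnilm/1/raw"))   # [(0, 3600)]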
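One simple version of the SQL idea might look like the following
sketch (sqlite syntax; the table layout and names here are made up and
not necessarily the scheme described at that link):

    import sqlite3

    # Hypothetical interval table: one row per interval, indexed so that
    # overlap checks become range queries instead of a full parse.
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE intervals (
                        stream_id  INTEGER NOT NULL,
                        start_time INTEGER NOT NULL,
                        end_time   INTEGER NOT NULL)""")
    conn.execute("CREATE INDEX intervals_idx ON intervals (stream_id, start_time)")
    conn.execute("INSERT INTO intervals VALUES (1, 120, 180)")

    # Intervals overlapping [t0, t1) for stream 1:
    t0, t1 = 100, 200
    print(conn.execute("SELECT start_time, end_time FROM intervals "
                       "WHERE stream_id = ? AND start_time < ? AND end_time > ?",
                       (1, t1, t0)).fetchall())   # [(120, 180)]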
- Next slowdown target is nilmdb.layout.Parser.parse().

  - Rewrote parsers using cython and sscanf (a rough pure-Python
    sketch of the per-line work appears after this list).

  - Stats (rev 10831), with _add_interval disabled:

        layout.pyx.parse:63                 13913 sec, 5.1g calls
        numpy:records.py.fromrecords:569     7410 sec, 262k calls

  - Probably OK for now.
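For reference, a pure-Python sketch of the per-line work the parser
does (the real version is cython calling sscanf; the exact line format
and names here are assumptions):

    # Hypothetical equivalent of parsing one ASCII data line consisting
    # of a timestamp followed by a fixed number of values (e.g. uint16_6).
    def parse_line(line, count=6):
        fields = line.split()
        if len(fields) != count + 1:
            raise ValueError("expected timestamp plus %d values" % count)
        timestamp = float(fields[0])
        values = [int(f) for f in fields[1:]]
        return timestamp, values

    print(parse_line("1305298800.0 1 2 3 4 5 6"))   # (1305298800.0, [1, 2, 3, 4, 5, 6])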
IntervalSet speed
-----------------

- Initial implementation was pretty slow, even with binary search in
  a sorted list.

- Replaced again with rbtree.  Seems decent.  Numbers are time per
  insert for 2**17 insertions, followed by total wall time and RAM
  usage for running "make test" with `test_rbtree` and `test_interval`
  with range(5,20) (for the flavor of this measurement, see the sketch
  after this list):

  - Plain python:
    - old values with bxinterval:
      20.2 μs, total 20 s, 177 MB RAM
    - rbtree, plain python:
      97 μs, total 105 s, 846 MB RAM
  - rbtree converted to cython:
      26 μs, total 29 s, 320 MB RAM
  - rbtree and interval converted to cython:
      8.4 μs, total 12 s, 134 MB RAM
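A stand-alone harness for that kind of measurement might look like
this (hypothetical; it times a plain sorted list via bisect, not the
actual rbtree or the test suite):

    import bisect, random, time

    # Measure average time per insertion for 2**17 interval insertions
    # into a stand-in sorted-list structure.
    n = 2 ** 17
    intervals = []
    start = time.time()
    for _ in range(n):
        a = random.random()
        bisect.insort(intervals, (a, a + 0.001))
    elapsed = time.time() - start
    print("%.1f us per insert" % (elapsed / n * 1e6))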
Layouts
-------

Tables are just collections and counts of a single type.  We'll still
use strings to describe them, with format:

    type_count

where `type` is "uint16", "float32", or "float64", and `count` is an
integer.

nilmdb.layout.named() will parse these strings into the appropriate
handlers.  For compatibility:

    "RawData" == "uint16_6"
    "RawNotchedData" == "uint16_9"
    "PrepData" == "float32_8"