|
|
@@ -119,36 +119,44 @@ Speed |
|
|
|
real 102m8.151s

real 176m12.469s

(Maybe even worse? Probably just more load on the system)
|
|
|
|
|
|
|
- Server RAM usage isn't growing, but maybe there's some other bug there. |
|
|
|
|
|
|
|
- Turns out it's because setting up the intervals takes so long --
  7 seconds for 5000 intervals. I replaced the Interval internals
  with one based on the quicksect stuff from bx-python, which seems
  much nicer. But it still has the problem of reading in the entire
  SQL database and building the in-memory IntervalSet for each
  request. First improvement would be to do this once and cache the
  results (see the caching sketch under `_get_intervals` below);
  second improvement might be to always store the interval data in
  SQL, as suggested at
  http://www.logarithmic.net/pfh/blog/01235197474
|
|
|
|
|
|
|
- Next slowdown target is nilmdb.layout.Parser.parse().
  - Consider Cython.
  - Or could just split strings and let PyTables's table.append()
    convert from ASCII itself? Would require changing the timestamp
    format on the client side, though. Cython is probably better -- a
    customized routine to parse each layout type directly from a string
    into a typed array? (rough sketch after this list)
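
  Not the real nilmdb code -- a rough pure-Python sketch of what a
  per-layout parsing routine could look like, using a made-up layout of
  one float64 timestamp plus eight float32 values; the Cython version
  would do the same conversions at C level:

      import numpy as np

      # Hypothetical layout: float64 timestamp + eight float32 columns.
      # A per-layout routine hard-codes the field count and types
      # instead of interpreting a format description for every row.
      LAYOUT_DTYPE = np.dtype([("timestamp", np.float64),
                               ("data", np.float32, 8)])

      def parse_float32_8(lines):
          """Parse ASCII rows '<timestamp> <v1> ... <v8>' into a typed array."""
          out = np.empty(len(lines), dtype=LAYOUT_DTYPE)
          for i, line in enumerate(lines):
              fields = line.split()
              if len(fields) != 9:
                  raise ValueError("expected 9 fields, got %d" % len(fields))
              out["timestamp"][i] = float(fields[0])
              out["data"][i] = [float(x) for x in fields[1:]]
          return out

      rows = parse_float32_8(["1234567890.0 " + " ".join(["1.0"] * 8)])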
|
|
|
- Server RAM usage is constant.

- Speed problems were due to IntervalSet speed: parsing all the
  intervals from the database and re-adding the new one each time.
|
|
|
|
|
|
|
- First optimization is to cache the result of `nilmdb:_get_intervals`,
  which gives the best speedup (sketch below).
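
  A minimal sketch of that caching, with the SQL load stubbed out as a
  callable -- the structure is an assumption, not the actual nilmdb
  code:

      class IntervalCache:
          """Build each stream's interval list once, then keep it
          updated in memory; load_from_sql stands in for the real
          SQL query."""

          def __init__(self, load_from_sql):
              self._load = load_from_sql   # stream_id -> [(start, end), ...]
              self._cache = {}             # stream_id -> cached list

          def get_intervals(self, stream_id):
              # Only the first request per stream pays the SQL + build cost.
              if stream_id not in self._cache:
                  self._cache[stream_id] = list(self._load(stream_id))
              return self._cache[stream_id]

          def add_interval(self, stream_id, interval):
              # Update the cached copy in place (the real code would
              # also write the new interval through to SQL).
              self.get_intervals(stream_id).append(interval)

      cache = IntervalCache(lambda stream_id: [(0.0, 10.0)])
      cache.add_interval("a/b", (10.0, 20.0))
      assert cache.get_intervals("a/b") == [(0.0, 10.0), (10.0, 20.0)]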
|
|
|
|
|
|
|
- Also switched to internally using bxInterval from the bx-python
  package. Speed of `tests/test_interval:TestIntervalSpeed` is pretty
  decent and seems to be growing logarithmically now: about 85 μs per
  insertion when inserting 131k entries (micro-benchmark sketch below).
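
  A micro-benchmark in the spirit of TestIntervalSpeed, assuming the
  quicksect-based IntervalTree from bx.intervals.intersection (not the
  actual test code):

      import random
      import time

      from bx.intervals.intersection import Interval, IntervalTree

      tree = IntervalTree()
      n = 2 ** 17                  # 131072 insertions, as in the test
      t0 = time.time()
      for _ in range(n):
          a = random.uniform(0, 1e6)
          tree.insert_interval(Interval(a, a + 1.0))
      print("%.1f us per insertion" % ((time.time() - t0) / n * 1e6))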
|
|
|
|
|
|
|
Interval speed
--------------

- Replaced with bxInterval
|
|
|
- Storing the interval data in SQL might be better, with a scheme like:
  http://www.logarithmic.net/pfh/blog/01235197474
  (sqlite sketch below)
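
  A minimal sqlite3 sketch of the idea -- just a generic (start, end)
  table with an overlap query, not necessarily the exact scheme from
  that post:

      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("CREATE TABLE intervals"
                   " (stream_id INTEGER, start_time REAL, end_time REAL)")
      conn.execute("CREATE INDEX intervals_idx ON intervals"
                   " (stream_id, start_time, end_time)")
      conn.executemany("INSERT INTO intervals VALUES (?, ?, ?)",
                       [(1, 0.0, 100.0), (1, 100.0, 200.0),
                        (1, 500.0, 600.0)])

      # [s, e) overlaps [a, b) iff s < b and e > a, so only matching
      # rows are read instead of rebuilding the whole set per request.
      rows = conn.execute(
          "SELECT start_time, end_time FROM intervals"
          " WHERE stream_id = ? AND start_time < ? AND end_time > ?",
          (1, 250.0, 50.0)).fetchall()
      print(rows)    # [(0.0, 100.0), (100.0, 200.0)]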
|
|
|
|
|
|
|
- Next slowdown target is nilmdb.layout.Parser.parse().
  - Rewrote parsers using Cython and sscanf
  - Stats (rev 10831), with _add_interval disabled:

        layout.pyx.Parser.parse:128         6303 sec,  262k calls
        layout.pyx.parse:63                13913 sec,  5.1g calls
        numpy:records.py.fromrecords:569    7410 sec,  262k calls

  - Probably OK for now.
|
|
|
|
|
|
|
IntervalSet speed
-----------------

- Initial implementation was pretty slow, even with binary search in a
  sorted list (sketch below)
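
  Roughly the abandoned approach -- bisect finds the slot in O(log n),
  but inserting into a Python list still shifts every later element, so
  each insertion is O(n) overall:

      import bisect

      intervals = []    # kept sorted as (start, end) tuples
      for iv in [(10.0, 20.0), (0.0, 5.0), (30.0, 40.0)]:
          bisect.insort(intervals, iv)   # O(log n) search, O(n) insert
      print(intervals)  # [(0.0, 5.0), (10.0, 20.0), (30.0, 40.0)]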
|
|
|
|
|
|
|
- Replaced with bxInterval; now takes about log n time per insertion
  - TestIntervalSpeed with range(17,18) and profiling:
    - 85 μs each
    - 131072 calls to `__iadd__`
    - 131072 to bx.insert_interval
    - 131072 to bx.insert:395
    - 2355835 to bx.insert:106 (18x as many?)
|
|
|
|
|
|
|
- Tried blist too; it was worse than bxInterval.
|
|
|
|
|
|
|
- Might be algorithmic improvements to be made in Interval.py, like in
  `__and__` (see sketch below)
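
  If both sets keep their intervals sorted and non-overlapping,
  `__and__` could be a single linear merge instead of testing every
  pair -- sketched here on plain (start, end) tuples rather than the
  actual Interval/IntervalSet classes:

      def intersect_sorted(a, b):
          """Intersect two sorted, non-overlapping lists of half-open
          (start, end) intervals in O(len(a) + len(b))."""
          out = []
          i = j = 0
          while i < len(a) and j < len(b):
              lo = max(a[i][0], b[j][0])
              hi = min(a[i][1], b[j][1])
              if lo < hi:
                  out.append((lo, hi))
              if a[i][1] < b[j][1]:   # advance whichever ends first
                  i += 1
              else:
                  j += 1
          return out

      assert intersect_sorted([(0, 5), (10, 20)],
                              [(3, 12)]) == [(3, 5), (10, 12)]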