Extending and then writing to the mmap file has a problem: if the disk
fills up, the mapping becomes invalid, and the Python interpreter will
get a SIGBUS, killing it. It's difficult to catch this gracefully;
there's no way to do that with existing modules. Instead, switch to
only using mmap when reading, and normal file writes when writing.
Since we only ever append, it should have similar performance.
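A minimal sketch of the split described above, assuming a flat binary file that is only ever appended to. The helper names append_rows and read_rows are hypothetical and are not the actual nilmdb functions:

```python
import mmap


def append_rows(path, data):
    """Append bytes with an ordinary buffered write.

    If the disk fills up, this raises OSError, which the caller can
    catch, instead of the process receiving SIGBUS through a mapping.
    """
    with open(path, "ab") as f:
        f.write(data)
        f.flush()


def read_rows(path, offset, length):
    """Read a byte range through a read-only mapping of the whole file.

    Note: mapping an empty file raises ValueError, so callers should
    only read after at least one append.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]
```

The mapping is opened per read here for simplicity; a long-running server would more likely keep it open and remap after the file grows.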
Now nilmdb.client, nilmdb.server, nilmdb.cmdline, and nilmdb.utils
are each their own modules, and there is a little bit more of a
logical separation between them. Various changes scattered throughout
to fix naming (for example, nilmdb.nilmdb.NilmDBError is now
nilmdb.server.errors.NilmDBError).
Reduced usage of "from __future__ import absolute_import" as much
as possible. It's still needed for the functions in the nilmdb/server
directory to be able to import the nilmdb module rather than the
nilmdb.py script.
This should hopefully ease future packaging a bit.
There's some bug with the testing harness where placing e.g.
    from du import du
in nilmdb/utils/__init__.py doesn't quite work -- sometimes the
module "du" replaces the function "du". Not exactly sure why;
we work around that by just renaming files so they don't match
the imported names directly.
Messes up extraction, since we use random access for the timestamp
binary search. In the future, switching to multiple tables (one for
timestamps, one for compressed data) might be smart.
Doesn't actually merge them yet; need to change Interval
implementation to allow deletes.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@11354 ddd99763-3ecb-0310-9145-efcb8ce7c51f
Current/old design has specific layouts: RawData, PrepData,
RawNotchedData.
Let's get rid of this entirely and switch to simpler data types that
are just collections and counts of a single type. We'll still use
strings to describe them, with format:
    type_count
where type is "uint16", "float32", or "float64", and count is an
integer.
nilmdb.layout.named() will parse these strings into the appropriate
handlers. For compatibility:
    "RawData" == "uint16_6"
    "RawNotchedData" == "uint16_9"
    "PrepData" == "float32_8"
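A rough sketch of how such strings could be parsed. This is not the real nilmdb.layout.named(); the function parse_layout and its return value (a type/count tuple) are assumptions made for illustration:

```python
import re

# Legacy layout names from the text above, mapped to the new strings.
LEGACY = {
    "RawData": "uint16_6",
    "RawNotchedData": "uint16_9",
    "PrepData": "float32_8",
}
TYPES = ("uint16", "float32", "float64")


def parse_layout(name):
    """Return (type, count) for a "type_count" string or a legacy name."""
    name = LEGACY.get(name, name)
    m = re.fullmatch(r"([a-z]+[0-9]+)_([0-9]+)", name)
    if m is None or m.group(1) not in TYPES:
        raise ValueError("unknown layout: %r" % name)
    count = int(m.group(2))
    if count < 1:
        raise ValueError("layout count must be at least 1: %r" % name)
    return m.group(1), count
```

For example, parse_layout("RawData") and parse_layout("uint16_6") both give ("uint16", 6).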
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10981 ddd99763-3ecb-0310-9145-efcb8ce7c51f
This lets us quickly count the number of matching rows, rather than
returning them.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10909 ddd99763-3ecb-0310-9145-efcb8ce7c51f
row individually, when extracting data.
Switch to using bisect module when doing the bisection, to lessen the
chance of errors.
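As an illustration of leaning on the bisect module instead of a hand-written search, here is a sketch that assumes rows are sorted by timestamp and that the timestamp is the first field of each row. TimestampColumn and find_range are hypothetical names, not the nilmdb code:

```python
import bisect


class TimestampColumn:
    """Sequence view that exposes only the timestamp of each row.

    bisect only needs len() and indexing, so rows are looked up
    lazily, one probe at a time.
    """

    def __init__(self, rows):
        self.rows = rows

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, i):
        return self.rows[i][0]


def find_range(rows, start, end):
    """Return (lo, hi) such that rows[lo:hi] have start <= timestamp < end."""
    ts = TimestampColumn(rows)
    lo = bisect.bisect_left(ts, start)
    hi = bisect.bisect_left(ts, end, lo)
    return lo, hi
```

The second search starts at lo, so it never probes rows already known to be before the start time.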
Added syslog ability for timer module, for timing stuff deep inside
the server.
Make the chunked/non-chunked test just give a warning, rather than
failing the tests, for debugging purposes. Alternate approach would
be to disable "die on error" for the tests.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10896 ddd99763-3ecb-0310-9145-efcb8ce7c51f
though; need to figure out where the slowdown lies.
Add stream existence check to server's /intervals and /extract paths,
add tests for it.
Make start and end arguments optional for /extract, like /intervals.
Move --quiet command line option to just the insert subcommand.
It's the only one that uses it right now, and otherwise it doesn't
show up after a "nilmtool.py intervals --help". Might revisit this
later if more commands start supporting --quiet.
Change cmdline/extract's write into a print, to keep the trailing
newline.
Fix lingering uses of Interval in nilmdb and change to DBInterval
instead.
Fix nilmdb interval bisection:
- handle common case optimization correctly
- db_endpos is always one after the last row, so use hi=db_endpos-1
Finish nilmdb stream_extract
Add a bunch of cmdline tests for extract, particularly testing border
cases around start/end. Compares output to a set of files stored in
the tests/data dir.
Some more tests in test_client to get better coverage.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10893 ddd99763-3ecb-0310-9145-efcb8ce7c51f
- Flesh out tests for the new nilmdb.layout.Formatter
  Coverage doesn't handle the cython module, so this is just
  functional stuff, not necessarily complete.
  Still need to finish each Layout.format()
- Split out test_client_5_chunked from test_client_4_misc
  so it's easier to skip while debugging. Turning off streaming
  lets us see tracebacks from within the server's content()
  functions.
- More work on stream/extract in cmdline, client, server, nilmdb.
  Still needs work on server side, but should be complete in nilmdb.
- Start nilmdb.layout.Formatter class
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10888 ddd99763-3ecb-0310-9145-efcb8ce7c51f
that. This requires a bit of restructuring of server.py:intervals()
to allow us to properly report errors before beginning the stream.
Make nilmdb.httpclient save a copy of the HTTP response headers, and
add a test that uses the saved headers to verify that the
transfer-encoding is chunked for the /stream/interval request. This
should check that the above bug is fixed and doesn't show up again
if we switch to a different WSGI server, etc.
Tweak size estimates in nilmdb for /stream/interval
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10884 ddd99763-3ecb-0310-9145-efcb8ce7c51f
to the database, not the client. The server now maintains the open
HTTP connection and sends a continuous streaming reply to the GET
request.
HTTP client side uses an Iteratorizer to turn the curl.perform()
callback into an iterator, and returns an iterator that yields
individual lines to the caller rather than buffering up all the data
at once. Should still be able to handle errors etc.
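One way to turn a callback-driven producer into an iterator is a worker thread plus a bounded queue. The sketch below is an assumption about the general technique, not the actual Iteratorizer; the name iteratorize is hypothetical, and it yields whatever the callback receives rather than splitting into lines:

```python
import queue
import threading


def iteratorize(func):
    """Run func(callback) in a worker thread; yield each callback value.

    Exceptions raised inside func are re-raised in the consumer once
    the values produced before the failure have been yielded.
    """
    q = queue.Queue(maxsize=1)  # bounded, so the producer can't run ahead

    def worker():
        try:
            func(lambda value: q.put(("item", value)))
            q.put(("done", None))
        except Exception as exc:
            q.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    while True:
        kind, payload = q.get()
        if kind == "item":
            yield payload
        elif kind == "error":
            raise payload
        else:
            return
```

One caveat of this simple version: if the consumer stops iterating early, the worker stays blocked on the queue until the process exits.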
Server changed to return a "streaming JSON" instance for the
/stream/interval requests. This is just a series of independent
JSON documents (one per interval), separated by newlines.
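A small sketch of the newline-separated format, assuming each interval is encoded as a [start, end] list (the exact document shape is an assumption; stream_intervals and parse_stream are hypothetical names):

```python
import json


def stream_intervals(intervals):
    """Yield one independent JSON document per interval, newline-terminated."""
    for start, end in intervals:
        yield json.dumps([start, end]) + "\n"


def parse_stream(lines):
    """Parse a stream of lines, one JSON document per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Because every line is a complete document, the receiver can start processing as soon as the first line arrives, without waiting for the end of the response.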
Adjust nilmdb's max_results a bit. Now, multiple requests only exist
between the server <-> nilmdb threads, and they exist just to keep
any one server thread from blocking the nilmdb thread for too long.
So adjust the size accordingly to match the fact that this is non-json
encoded data.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10881 ddd99763-3ecb-0310-9145-efcb8ce7c51f
us to look at just some of the intervals without having to reconstruct
an entire IntervalSet class -- which greatly reduces server load when
handling requests that cover large interval ranges.
Add Client.get and Client.put, analogous to getjson and putjson but
without parsing the result as json.
Add Client.stream_extract. Still needs server side love.
Allow Cmdline subcommands to provide a return value that turns into
the exit code.
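A sketch of the return-value-to-exit-code idea using argparse. The subcommand names and the cmd_ok/cmd_fail functions are made up for illustration and do not reflect the real nilmdb.cmdline structure:

```python
import argparse


def cmd_ok(args):
    """Subcommand that returns nothing, which is treated as success."""
    return None


def cmd_fail(args):
    """Subcommand that reports failure through its return value."""
    return 2


def main(argv):
    parser = argparse.ArgumentParser(prog="tool")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("ok").set_defaults(func=cmd_ok)
    sub.add_parser("fail").set_defaults(func=cmd_fail)
    args = parser.parse_args(argv)
    retval = args.func(args)
    # None (no explicit return) maps to exit code 0.
    return retval or 0
```

A script entry point would then call sys.exit(main(sys.argv[1:])).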
More work on cmdline.extract.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10851 ddd99763-3ecb-0310-9145-efcb8ce7c51f
rework how this all works together, but will probably move on to
extraction now.
Update runserver.py with some options for profiling, port, etc.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10839 ddd99763-3ecb-0310-9145-efcb8ce7c51f
On the big database, the server takes a few seconds to figure out the
interval intersections. Need to think about how to improve that --
the real key might be to start reducing the number of intervals we're
storing by combining them, potentially as they're inserted.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10838 ddd99763-3ecb-0310-9145-efcb8ce7c51f
variable instead now. Adjust tests accordingly.
Start list --detail option, using stream/intervals request.
Frontend should be ready, backend needs implementation.
Put interval adding back into nilmdb:_add_interval so things work.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10833 ddd99763-3ecb-0310-9145-efcb8ce7c51f
intervals are accessed, so it doesn't need to keep rebuilding them as
long as it's running.
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10800 ddd99763-3ecb-0310-9145-efcb8ce7c51f
Make nilmdb.cmdline a proper class
Fix various stuff
Add /dbpath command to get DB path
git-svn-id: https://bucket.mit.edu/svn/nilm/nilmdb@10677 ddd99763-3ecb-0310-9145-efcb8ce7c51f