Compare commits
1 commit: nilmdb-0.1 ... bxinterval

Commit 5f251e59e5
5 .gitignore (vendored)
@@ -1,7 +1,2 @@
-db/
-tests/*testdb/
 .coverage
 *.pyc
-design.html
-timeit*out
-
7 Makefile
@@ -8,14 +8,11 @@ tool:
 lint:
 	pylint -f parseable nilmdb
-
-%.html: %.md
-	pandoc -s $< > $@
 
 test:
-	python runtests.py
+	nosetests
 
 profile:
-	python runtests.py --with-profile
+	nosetests --with-profile
 
 clean::
 	find . -name '*pyc' | xargs rm -f

(unknown file)
@@ -1,3 +1,2 @@
-sudo apt-get install python2.7 python-cherrypy3 python-decorator python-nose python-coverage
-sudo apt-get install cython  # 0.17.1-1 or newer
-
+sudo apt-get install python-nose python-coverage
+sudo apt-get install python-tables cython python-cherrypy3
10 TODO
@@ -1,5 +1,5 @@
-- Clean up error responses.  Specifically I'd like to be able to add
-  json-formatted data to OverflowError and DB parsing errors.  It
-  seems like subclassing cherrypy.HTTPError and overriding
-  set_response is the best thing to do -- it would let me get rid
-  of the _be_ie_unfriendly and other hacks in the server.
+- Merge adjacent intervals on insert (maybe with client help?)
+
+- Better testing:
+  - see about getting coverage on layout.pyx
+  - layout.pyx performance tests, before and after generalization
101 design.md
@@ -1,12 +1,11 @@
 Structure
 ---------
-nilmdb.nilmdb is the NILM database interface.  A nilmdb.BulkData
-interface stores data in flat files, and a SQL database tracks
-metadata and ranges.
+nilmdb.nilmdb is the NILM database interface.  It tracks a PyTables
+database that holds actual rows of data, and a SQL database tracks
+metadata and ranges.
 
 Access to the nilmdb must be single-threaded.  This is handled with
-the nilmdb.serializer class.  In the future this could probably
-be turned into a per-path serialization.
+the nilmdb.serializer class.
 
 nilmdb.server is a HTTP server that provides an interface to talk,
 through the serialization layer, to the nilmdb object.
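The single-threaded access described above can be approximated with a queue-backed worker thread. This is an illustrative Python 3 sketch, not the actual nilmdb.serializer API:

```python
import queue
import threading

class Serializer:
    """Route all calls through one worker thread, so callers from
    many threads are serialized into single-threaded access."""
    def __init__(self):
        self.calls = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # The only thread that ever runs the wrapped functions
        while True:
            func, args, result = self.calls.get()
            result.put(func(*args))

    def call(self, func, *args):
        # Enqueue the call and block until the worker returns a result
        result = queue.Queue()
        self.calls.put((func, args, result))
        return result.get()

s = Serializer()
print(s.call(lambda a, b: a + b, 2, 3))  # 5
```

A per-path serialization, as the removed sentence suggests, would amount to one such worker per database path instead of one global worker.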
@@ -19,13 +18,13 @@ Sqlite performance
 Committing a transaction in the default sync mode (PRAGMA synchronous=FULL)
 takes about 125 msec.  sqlite3 will commit transactions at 3 times:
 
-1. explicit con.commit()
+1: explicit con.commit()
 
-2. between a series of DML commands and non-DML commands, e.g.
+2: between a series of DML commands and non-DML commands, e.g.
    after a series of INSERT, SELECT, but before a CREATE TABLE or
    PRAGMA.
 
-3. at the end of an explicit transaction, e.g. "with self.con as con:"
+3: at the end of an explicit transaction, e.g. "with self.con as con:"
 
 To speed up testing, or if this transaction speed becomes an issue,
 the sync=False option to NilmDB will set PRAGMA synchronous=OFF.
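The pragma described above can be toggled from Python's sqlite3 module; a minimal sketch (the pragma reports OFF=0, NORMAL=1, FULL=2):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# NilmDB's sync=False option corresponds to PRAGMA synchronous=OFF,
# which skips the fsync that makes each commit take ~125 msec.
con.execute("PRAGMA synchronous=OFF")

# Read the setting back; 0 means OFF
level = con.execute("PRAGMA synchronous").fetchone()[0]
print(level)  # 0
```

Note that with synchronous=OFF, a power loss mid-commit can corrupt the database; it trades durability for commit speed, which is why it is gated behind an option.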
@@ -48,7 +47,6 @@ transfer?
   everything still gets buffered.  Just a tradeoff of buffer size.
 
 Before timestamps are added:
 
 - Raw data is about 440 kB/s (9 channels)
 - Prep data is about 12.5 kB/s (1 phase)
 - How do we know how much data to send?
@@ -64,7 +62,6 @@ Before timestamps are added:
 - Should those numbers come from the server?
 
 Converting from ASCII to PyTables:
 
 - For each row getting added, we need to set attributes on a PyTables
   Row object and call table.append().  This means that there isn't a
   particularly efficient way of converting from ascii.
@@ -141,20 +138,11 @@ Speed
 - Next slowdown target is nilmdb.layout.Parser.parse().
   - Rewrote parsers using cython and sscanf
   - Stats (rev 10831), with _add_interval disabled
 
       layout.pyx.Parser.parse:128         6303 sec, 262k calls
       layout.pyx.parse:63                13913 sec, 5.1g calls
       numpy:records.py.fromrecords:569    7410 sec, 262k calls
 
   - Probably OK for now.
 
-- After all updates, now takes about 8.5 minutes to insert an hour of
-  data, constant after adding 171 hours (4.9 billion data points)
-
-- Data set size: 98 gigs = 20 bytes per data point.
-  6 uint16 data + 1 uint32 timestamp = 16 bytes per point
-  So compression must be off -- will retry with compression forced on.
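The 16-bytes-per-point arithmetic in the removed note above can be checked with Python's struct module (assuming a little-endian uint32 timestamp followed by 6 uint16 values, which is what the note describes):

```python
import struct

# One raw data point: uint32 timestamp + 6 uint16 channel values
point = struct.Struct("<I6H")
print(point.size)  # 16 bytes: 4 + 6 * 2
```

The 4-byte gap between this and the observed 20 bytes per point on disk is what led to the "compression must be off" conclusion.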
 IntervalSet speed
 -----------------
 - Initial implementation was pretty slow, even with binary search in
@@ -173,18 +161,6 @@ IntervalSet speed
 - Might be algorithmic improvements to be made in Interval.py,
   like in `__and__`
 
-- Replaced again with rbtree.  Seems decent.  Numbers are time per
-  insert for 2**17 insertions, followed by total wall time and RAM
-  usage for running "make test" with `test_rbtree` and `test_interval`
-  with range(5,20):
-  - old values with bxinterval:
-    20.2 μS, total 20 s, 177 MB RAM
-  - rbtree, plain python:
-    97 μS, total 105 s, 846 MB RAM
-  - rbtree converted to cython:
-    26 μS, total 29 s, 320 MB RAM
-  - rbtree and interval converted to cython:
-    8.4 μS, total 12 s, 134 MB RAM
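The interval operations being benchmarked above (like `__and__`) ultimately rest on a half-open overlap test; a minimal sketch of that predicate, not the actual Interval.py or rbtree API:

```python
def intersects(a, b):
    """Overlap test for half-open intervals [start, end),
    each given as a (start, end) tuple."""
    return a[0] < b[1] and b[0] < a[1]

print(intersects((0.0, 10.0), (5.0, 20.0)))   # True: they share [5, 10)
print(intersects((0.0, 10.0), (10.0, 20.0)))  # False: half-open, touching endpoints don't overlap
```

The half-open convention matters here: it is what lets adjacent intervals tile a time range without double-counting boundary timestamps.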
 Layouts
 -------
@@ -203,66 +179,3 @@ handlers.  For compatibility:
     "RawData" == "uint16_6"
     "RawNotchedData" == "uint16_9"
     "PrepData" == "float32_8"
 
-
-BulkData design
----------------
-
-BulkData is a custom bulk data storage system that was written to
-replace PyTables.  The general structure is a `data` subdirectory in
-the main NilmDB directory.  Within `data`, paths are created for each
-created stream.  These locations are called tables.  For example,
-tables might be located at
-
-    nilmdb/data/newton/raw/
-    nilmdb/data/newton/prep/
-    nilmdb/data/cottage/raw/
-
-Each table contains:
-
-- An unchanging `_format` file (Python pickle format) that describes
-  parameters of how the data is broken up, like files per directory,
-  rows per file, and the binary data format
-
-- Hex named subdirectories ("%04x", although more than 65536 can exist)
-
-- Hex named files within those subdirectories, like:
-
-      /nilmdb/data/newton/raw/000b/010a
-
-  The data format of these files is raw binary, interpreted by the
-  Python `struct` module according to the format string in the
-  `_format` file.
-
-- Same as above, with `.removed` suffix, is an optional file (Python
-  pickle format) containing a list of row numbers that have been
-  logically removed from the file.  If this range covers the entire
-  file, the entire file will be removed.
-
-- Note that the `bulkdata.nrows` variable is calculated once in
-  `BulkData.__init__()`, and only ever incremented during use.  Thus,
-  even if all data is removed, `nrows` can remain high.  However, if
-  the server is restarted, the newly calculated `nrows` may be lower
-  than in a previous run due to deleted data.  To be specific, this
-  sequence of events:
-
-  - insert data
-  - remove all data
-  - insert data
-
-  will result in having different row numbers in the database, and
-  differently numbered files on the filesystem, than the sequence:
-
-  - insert data
-  - remove all data
-  - restart server
-  - insert data
-
-  This is okay!  Everything should remain consistent both in the
-  `BulkData` and `NilmDB`.  Not attempting to readjust `nrows` during
-  deletion makes the code quite a bit simpler.
-
-- Similarly, data files are never truncated shorter.  Removing data
-  from the end of the file will not shorten it; it will only be
-  deleted when it has been fully filled and all of the data has been
-  subsequently removed.
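The removed BulkData design maps a global row number to a hex-named directory, a hex-named file, and a byte offset within that file. A sketch of that mapping with illustrative parameters (10 rows per file and 4 files per directory are chosen for readability; the real values come from the `_format` file):

```python
# Hypothetical tuning values for illustration only
rows_per_file = 10
files_per_dir = 4
row_size = 16  # bytes per packed row

def offset_from_row(row):
    """Map a row number to (subdir, filename, byte offset)."""
    filenum = row // rows_per_file
    # "%04x" names sort correctly even past 65536, since longer
    # hex strings still compare numerically when parsed
    dirname = "%04x" % (filenum // files_per_dir)
    filename = "%04x" % (filenum % files_per_dir)
    offset = (row % rows_per_file) * row_size
    return (dirname, filename, offset)

# Row 47 lives in file 4, i.e. directory 0001, file 0000, at byte 112
print(offset_from_row(47))  # ('0001', '0000', 112)
```

The inverse mapping (file and offset back to a row number) is what lets `nrows` be recomputed at startup from just the lexicographically last file's size.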
@@ -3,10 +3,14 @@
 from .nilmdb import NilmDB
 from .server import Server
 from .client import Client
+from .timer import Timer
-import pyximport; pyximport.install()
-import layout
-import interval
 
 import cmdline
 
+import pyximport; pyximport.install()
+import layout
+
+import serializer
+import timestamper
+import interval
+import du
@@ -1,460 +0,0 @@
-# Fixed record size bulk data storage
-
-from __future__ import absolute_import
-from __future__ import division
-import nilmdb
-from nilmdb.utils.printf import *
-
-import os
-import sys
-import cPickle as pickle
-import struct
-import fnmatch
-import mmap
-import re
-
-# Up to 256 open file descriptors at any given time.
-# These variables are global so they can be used in the decorator arguments.
-table_cache_size = 16
-fd_cache_size = 16
-
-@nilmdb.utils.must_close(wrap_verify = True)
-class BulkData(object):
-    def __init__(self, basepath, **kwargs):
-        self.basepath = basepath
-        self.root = os.path.join(self.basepath, "data")
-
-        # Tuneables
-        if "file_size" in kwargs:
-            self.file_size = kwargs["file_size"]
-        else:
-            # Default to approximately 128 MiB per file
-            self.file_size = 128 * 1024 * 1024
-
-        if "files_per_dir" in kwargs:
-            self.files_per_dir = kwargs["files_per_dir"]
-        else:
-            # 32768 files per dir should work even on FAT32
-            self.files_per_dir = 32768
-
-        # Make root path
-        if not os.path.isdir(self.root):
-            os.mkdir(self.root)
-
-    def close(self):
-        self.getnode.cache_remove_all()
-
-    def _encode_filename(self, path):
-        # Encode all paths to UTF-8, regardless of sys.getfilesystemencoding(),
-        # because we want to be able to represent all code points and the user
-        # will never be directly exposed to filenames.  We can then do path
-        # manipulations on the UTF-8 directly.
-        if isinstance(path, unicode):
-            return path.encode('utf-8')
-        return path
-
-    def create(self, unicodepath, layout_name):
-        """
-        unicodepath: path to the data (e.g. u'/newton/prep').
-        Paths must contain at least two elements, e.g.:
-            /newton/prep
-            /newton/raw
-            /newton/upstairs/prep
-            /newton/upstairs/raw
-
-        layout_name: string for nilmdb.layout.get_named(), e.g. 'float32_8'
-        """
-        path = self._encode_filename(unicodepath)
-
-        if path[0] != '/':
-            raise ValueError("paths must start with /")
-        [ group, node ] = path.rsplit("/", 1)
-        if group == '':
-            raise ValueError("invalid path; path must contain at least one "
-                             "folder")
-
-        # Get layout, and build format string for struct module
-        try:
-            layout = nilmdb.layout.get_named(layout_name)
-            struct_fmt = '<d'  # Little endian, double timestamp
-            struct_mapping = {
-                "int8":    'b',
-                "uint8":   'B',
-                "int16":   'h',
-                "uint16":  'H',
-                "int32":   'i',
-                "uint32":  'I',
-                "int64":   'q',
-                "uint64":  'Q',
-                "float32": 'f',
-                "float64": 'd',
-                }
-            for n in range(layout.count):
-                struct_fmt += struct_mapping[layout.datatype]
-        except KeyError:
-            raise ValueError("no such layout, or bad data types")
-
-        # Create the table.  Note that we make a distinction here
-        # between NilmDB paths (always Unix style, split apart
-        # manually) and OS paths (built up with os.path.join)
-
-        # Make directories leading up to this one
-        elements = path.lstrip('/').split('/')
-        for i in range(len(elements)):
-            ospath = os.path.join(self.root, *elements[0:i])
-            if Table.exists(ospath):
-                raise ValueError("path is subdir of existing node")
-            if not os.path.isdir(ospath):
-                os.mkdir(ospath)
-
-        # Make the final dir
-        ospath = os.path.join(self.root, *elements)
-        if os.path.isdir(ospath):
-            raise ValueError("subdirs of this path already exist")
-        os.mkdir(ospath)
-
-        # Write format string to file
-        Table.create(ospath, struct_fmt, self.file_size, self.files_per_dir)
-
-        # Open and cache it
-        self.getnode(unicodepath)
-
-        # Success
-        return
-
-    def destroy(self, unicodepath):
-        """Fully remove all data at a particular path.  No way to undo
-        it!  The group/path structure is removed, too."""
-        path = self._encode_filename(unicodepath)
-
-        # Get OS path
-        elements = path.lstrip('/').split('/')
-        ospath = os.path.join(self.root, *elements)
-
-        # Remove Table object from cache
-        self.getnode.cache_remove(self, unicodepath)
-
-        # Remove the contents of the target directory
-        if not Table.exists(ospath):
-            raise ValueError("nothing at that path")
-        for (root, dirs, files) in os.walk(ospath, topdown = False):
-            for name in files:
-                os.remove(os.path.join(root, name))
-            for name in dirs:
-                os.rmdir(os.path.join(root, name))
-
-        # Remove empty parent directories
-        for i in reversed(range(len(elements))):
-            ospath = os.path.join(self.root, *elements[0:i+1])
-            try:
-                os.rmdir(ospath)
-            except OSError:
-                break
-
-    # Cache open tables
-    @nilmdb.utils.lru_cache(size = table_cache_size,
-                            onremove = lambda x: x.close())
-    def getnode(self, unicodepath):
-        """Return a Table object corresponding to the given database
-        path, which must exist."""
-        path = self._encode_filename(unicodepath)
-        elements = path.lstrip('/').split('/')
-        ospath = os.path.join(self.root, *elements)
-        return Table(ospath)
-
-@nilmdb.utils.must_close(wrap_verify = True)
-class Table(object):
-    """Tools to help access a single table (data at a specific OS path)."""
-    # See design.md for design details
-
-    # Class methods, to help keep format details in this class.
-    @classmethod
-    def exists(cls, root):
-        """Return True if a table appears to exist at this OS path"""
-        return os.path.isfile(os.path.join(root, "_format"))
-
-    @classmethod
-    def create(cls, root, struct_fmt, file_size, files_per_dir):
-        """Initialize a table at the given OS path.
-        'struct_fmt' is a Struct module format description"""
-
-        # Calculate rows per file so that each file is approximately
-        # file_size bytes.
-        packer = struct.Struct(struct_fmt)
-        rows_per_file = max(file_size // packer.size, 1)
-
-        format = { "rows_per_file": rows_per_file,
-                   "files_per_dir": files_per_dir,
-                   "struct_fmt": struct_fmt,
-                   "version": 1 }
-        with open(os.path.join(root, "_format"), "wb") as f:
-            pickle.dump(format, f, 2)
-
-    # Normal methods
-    def __init__(self, root):
-        """'root' is the full OS path to the directory of this table"""
-        self.root = root
-
-        # Load the format and build packer
-        with open(os.path.join(self.root, "_format"), "rb") as f:
-            format = pickle.load(f)
-
-        if format["version"] != 1:  # pragma: no cover (just future proofing)
-            raise NotImplementedError("version " + format["version"] +
-                                      " bulk data store not supported")
-
-        self.rows_per_file = format["rows_per_file"]
-        self.files_per_dir = format["files_per_dir"]
-        self.packer = struct.Struct(format["struct_fmt"])
-        self.file_size = self.packer.size * self.rows_per_file
-
-        # Find nrows
-        self.nrows = self._get_nrows()
-
-    def close(self):
-        self.mmap_open.cache_remove_all()
-
-    # Internal helpers
-    def _get_nrows(self):
-        """Find nrows by locating the lexicographically last filename
-        and using its size"""
-        # Note that this just finds a 'nrows' that is guaranteed to be
-        # greater than the row number of any piece of data that
-        # currently exists, not necessarily all data that _ever_
-        # existed.
-        regex = re.compile("^[0-9a-f]{4,}$")
-
-        # Find the last directory.  We sort and loop through all of them,
-        # starting with the numerically greatest, because the dirs could be
-        # empty if something was deleted.
-        subdirs = sorted(filter(regex.search, os.listdir(self.root)),
-                         key = lambda x: int(x, 16), reverse = True)
-
-        for subdir in subdirs:
-            # Now find the last file in that dir
-            path = os.path.join(self.root, subdir)
-            files = filter(regex.search, os.listdir(path))
-            if not files:  # pragma: no cover (shouldn't occur)
-                # Empty dir: try the next one
-                continue
-
-            # Find the numerical max
-            filename = max(files, key = lambda x: int(x, 16))
-            offset = os.path.getsize(os.path.join(self.root, subdir, filename))
-
-            # Convert to row number
-            return self._row_from_offset(subdir, filename, offset)
-
-        # No files, so no data
-        return 0
-
-    def _offset_from_row(self, row):
-        """Return a (subdir, filename, offset, count) tuple:
-
-        subdir: subdirectory for the file
-        filename: the filename that contains the specified row
-        offset: byte offset of the specified row within the file
-        count: number of rows (starting at offset) that fit in the file
-        """
-        filenum = row // self.rows_per_file
-        # It's OK if these format specifiers are too short; the filenames
-        # will just get longer but will still sort correctly.
-        dirname = sprintf("%04x", filenum // self.files_per_dir)
-        filename = sprintf("%04x", filenum % self.files_per_dir)
-        offset = (row % self.rows_per_file) * self.packer.size
-        count = self.rows_per_file - (row % self.rows_per_file)
-        return (dirname, filename, offset, count)
-
-    def _row_from_offset(self, subdir, filename, offset):
-        """Return the row number that corresponds to the given
-        'subdir/filename' and byte-offset within that file."""
-        if (offset % self.packer.size) != 0:  # pragma: no cover; shouldn't occur
-            raise ValueError("file offset is not a multiple of data size")
-        filenum = int(subdir, 16) * self.files_per_dir + int(filename, 16)
-        row = (filenum * self.rows_per_file) + (offset // self.packer.size)
-        return row
-
-    # Cache open files
-    @nilmdb.utils.lru_cache(size = fd_cache_size,
-                            keys = slice(0, 3),  # exclude newsize
-                            onremove = lambda x: x.close())
-    def mmap_open(self, subdir, filename, newsize = None):
-        """Open and map a given 'subdir/filename' (relative to self.root).
-        Will be automatically closed when evicted from the cache.
-
-        If 'newsize' is provided, the file is truncated to the given
-        size before the mapping is returned.  (Note that the LRU cache
-        on this function means the truncate will only happen if the
-        object isn't already cached; mmap.resize should be used too.)"""
-        try:
-            os.mkdir(os.path.join(self.root, subdir))
-        except OSError:
-            pass
-        f = open(os.path.join(self.root, subdir, filename), "a+", 0)
-        if newsize is not None:
-            # mmap can't map a zero-length file, so this allows the
-            # caller to set the filesize between file creation and
-            # mmap.
-            f.truncate(newsize)
-        mm = mmap.mmap(f.fileno(), 0)
-        return mm
-
-    def mmap_open_resize(self, subdir, filename, newsize):
-        """Open and map a given 'subdir/filename' (relative to self.root).
-        The file is resized to the given size."""
-        # Pass new size to mmap_open
-        mm = self.mmap_open(subdir, filename, newsize)
-        # In case we got a cached copy, need to call mm.resize too.
-        mm.resize(newsize)
-        return mm
-
-    def append(self, data):
-        """Append the data and flush it to disk.
-        data is a nested Python list [[row],[row],[...]]"""
-        remaining = len(data)
-        dataiter = iter(data)
-        while remaining:
-            # See how many rows we can fit into the current file, and open it
-            (subdir, fname, offset, count) = self._offset_from_row(self.nrows)
-            if count > remaining:
-                count = remaining
-            newsize = offset + count * self.packer.size
-            mm = self.mmap_open_resize(subdir, fname, newsize)
-            mm.seek(offset)
-
-            # Write the data
-            for i in xrange(count):
-                row = dataiter.next()
-                mm.write(self.packer.pack(*row))
-            remaining -= count
-            self.nrows += count
-
-    def __getitem__(self, key):
-        """Extract data and return it.  Supports simple indexing
-        (table[n]) and range slices (table[n:m]).  Returns a nested
-        Python list [[row],[row],[...]]"""
-
-        # Handle simple slices
-        if isinstance(key, slice):
-            # Fall back to brute force if the slice isn't simple
-            if ((key.step is not None and key.step != 1) or
-                key.start is None or
-                key.stop is None or
-                key.start >= key.stop or
-                key.start < 0 or
-                key.stop > self.nrows):
-                return [ self[x] for x in xrange(*key.indices(self.nrows)) ]
-
-            ret = []
-            row = key.start
-            remaining = key.stop - key.start
-            while remaining:
-                (subdir, filename, offset, count) = self._offset_from_row(row)
-                if count > remaining:
-                    count = remaining
-                mm = self.mmap_open(subdir, filename)
-                for i in xrange(count):
-                    ret.append(list(self.packer.unpack_from(mm, offset)))
-                    offset += self.packer.size
-                remaining -= count
-                row += count
-            return ret
-
-        # Handle single points
-        if key < 0 or key >= self.nrows:
-            raise IndexError("Index out of range")
-        (subdir, filename, offset, count) = self._offset_from_row(key)
-        mm = self.mmap_open(subdir, filename)
-        # unpack_from ignores the mmap object's current seek position
-        return list(self.packer.unpack_from(mm, offset))
-
-    def _remove_rows(self, subdir, filename, start, stop):
-        """Helper to mark specific rows as being removed from a
-        file, and potentially removing or truncating the file itself."""
-        # Import an existing list of deleted rows for this file
-        datafile = os.path.join(self.root, subdir, filename)
-        cachefile = datafile + ".removed"
-        try:
-            with open(cachefile, "rb") as f:
-                ranges = pickle.load(f)
-            cachefile_present = True
-        except:
-            ranges = []
-            cachefile_present = False
-
-        # Append our new range and sort
-        ranges.append((start, stop))
-        ranges.sort()
-
-        # Merge adjacent ranges into "out"
-        merged = []
-        prev = None
-        for new in ranges:
-            if prev is None:
-                # No previous range, so remember this one
-                prev = new
-            elif prev[1] == new[0]:
-                # Previous range connected to this new one; extend prev
-                prev = (prev[0], new[1])
-            else:
-                # Not connected; append previous and start again
-                merged.append(prev)
-                prev = new
-        if prev is not None:
-            merged.append(prev)
-
-        # If the range covered the whole file, we can delete it now.
-        # Note that the last file in a table may be only partially
-        # full (smaller than self.rows_per_file).  We purposely leave
-        # those files around rather than deleting them, because the
-        # remainder will be filled on a subsequent append(), and things
-        # are generally easier if we don't have to special-case that.
-        if (len(merged) == 1 and
-            merged[0][0] == 0 and merged[0][1] == self.rows_per_file):
-            # Close potentially open file in mmap_open LRU cache
-            self.mmap_open.cache_remove(self, subdir, filename)
-
-            # Delete files
-            os.remove(datafile)
-            if cachefile_present:
-                os.remove(cachefile)
-
-            # Try deleting subdir, too
-            try:
-                os.rmdir(os.path.join(self.root, subdir))
-            except:
-                pass
-        else:
-            # Update cache.  Try to do it atomically.
-            nilmdb.utils.atomic.replace_file(cachefile,
-                                             pickle.dumps(merged, 2))
-
-    def remove(self, start, stop):
-        """Remove specified rows [start, stop) from this table.
-
-        If a file is left empty, it is fully removed.  Otherwise, a
-        parallel data file is used to remember which rows have been
-        removed, and the file is otherwise untouched."""
-        if start < 0 or start > stop or stop > self.nrows:
-            raise IndexError("Index out of range")
-
-        row = start
-        remaining = stop - start
-        while remaining:
-            # Loop through each file that we need to touch
-            (subdir, filename, offset, count) = self._offset_from_row(row)
-            if count > remaining:
-                count = remaining
-            row_offset = offset // self.packer.size
-            # Mark the rows as being removed
-            self._remove_rows(subdir, filename, row_offset, row_offset + count)
-            remaining -= count
-            row += count
-
-class TimestampOnlyTable(object):
-    """Helper that lets us pass a Tables object into bisect, by
-    returning only the timestamp when a particular row is requested."""
-    def __init__(self, table):
-        self.table = table
-    def __getitem__(self, index):
-        return self.table[index][0]
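The range-merging loop in `_remove_rows` above can be exercised on its own; a minimal standalone restatement of that logic:

```python
def merge_ranges(ranges):
    """Sort (start, stop) ranges and merge ones that are exactly
    adjacent (prev stop == next start), as _remove_rows does for
    the per-file .removed lists."""
    merged = []
    prev = None
    for new in sorted(ranges):
        if prev is None:
            # No previous range, so remember this one
            prev = new
        elif prev[1] == new[0]:
            # Adjacent to the previous range; extend it
            prev = (prev[0], new[1])
        else:
            # Not adjacent; emit previous and start again
            merged.append(prev)
            prev = new
    if prev is not None:
        merged.append(prev)
    return merged

print(merge_ranges([(5, 10), (0, 5), (12, 20)]))  # [(0, 10), (12, 20)]
```

When the merged list collapses to a single (0, rows_per_file) range, the whole-file-removed condition in `_remove_rows` fires and the data file is deleted outright.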
502
nilmdb/bxintersect.pyx
Normal file
502
nilmdb/bxintersect.pyx
Normal file
@@ -0,0 +1,502 @@
# cython: profile=False

# This is from bx-python 554:07aca5a9f6fc (BSD licensed), modified to
# store interval ranges as doubles rather than 32-bit integers.

"""
Data structure for performing intersect queries on a set of intervals which
preserves all information about the intervals (unlike bitset projection methods).

:Authors: James Taylor (james@jamestaylor.org),
          Ian Schenk (ian.schenck@gmail.com),
          Brent Pedersen (bpederse@gmail.com)
"""

# Historical note:
#    This module originally contained an implementation based on sorted endpoints
#    and a binary search, using an idea from Scott Schwartz and Piotr Berman.
#    Later an interval tree implementation was implemented by Ian for Galaxy's
#    join tool (see `bx.intervals.operations.quicksect.py`). This was then
#    converted to Cython by Brent, who also added support for
#    upstream/downstream/neighbor queries. This was modified by James to
#    handle half-open intervals strictly, to maintain sort order, and to
#    implement the same interface as the original Intersecter.

#cython: cdivision=True

import operator

cdef extern from "stdlib.h":
    int ceil(float f)
    float log(float f)
    int RAND_MAX
    int rand()
    int strlen(char *)
    int iabs(int)

cdef inline double dmax2(double a, double b):
    if b > a: return b
    return a

cdef inline double dmax3(double a, double b, double c):
    if b > a:
        if c > b:
            return c
        return b
    if a > c:
        return a
    return c

cdef inline double dmin3(double a, double b, double c):
    if b < a:
        if c < b:
            return c
        return b
    if a < c:
        return a
    return c

cdef inline double dmin2(double a, double b):
    if b < a: return b
    return a

cdef float nlog = -1.0 / log(0.5)

cdef class IntervalNode:
    """
    A single node of an `IntervalTree`.

    NOTE: Unless you really know what you are doing, you probably should use
          `IntervalTree` rather than using this directly.
    """
    cdef float priority
    cdef public object interval
    cdef public double start, end
    cdef double minend, maxend, minstart
    cdef public IntervalNode cleft, cright, croot

    property left_node:
        def __get__(self):
            return self.cleft if self.cleft is not EmptyNode else None
    property right_node:
        def __get__(self):
            return self.cright if self.cright is not EmptyNode else None
    property root_node:
        def __get__(self):
            return self.croot if self.croot is not EmptyNode else None

    def __repr__(self):
        return "IntervalNode(%g, %g)" % (self.start, self.end)

    def __cinit__(IntervalNode self, double start, double end, object interval):
        # Python lacks the binomial distribution, so we convert a
        # uniform into a binomial because it naturally scales with
        # tree size.  Also, python's uniform is perfect since the
        # upper limit is not inclusive, which gives us undefined here.
        self.priority = ceil(nlog * log(-1.0/(1.0 * rand()/RAND_MAX - 1)))
        self.start    = start
        self.end      = end
        self.interval = interval
        self.maxend   = end
        self.minstart = start
        self.minend   = end
        self.cleft    = EmptyNode
        self.cright   = EmptyNode
        self.croot    = EmptyNode

    cpdef IntervalNode insert(IntervalNode self, double start, double end, object interval):
        """
        Insert a new IntervalNode into the tree of which this node is
        currently the root. The return value is the new root of the tree (which
        may or may not be this node!)
        """
        cdef IntervalNode croot = self
        # If starts are the same, decide which to add interval to based on
        # end, thus maintaining sortedness relative to start/end
        cdef double decision_endpoint = start
        if start == self.start:
            decision_endpoint = end

        if decision_endpoint > self.start:
            # insert to cright tree
            if self.cright is not EmptyNode:
                self.cright = self.cright.insert( start, end, interval )
            else:
                self.cright = IntervalNode( start, end, interval )
            # rebalance tree
            if self.priority < self.cright.priority:
                croot = self.rotate_left()
        else:
            # insert to cleft tree
            if self.cleft is not EmptyNode:
                self.cleft = self.cleft.insert( start, end, interval)
            else:
                self.cleft = IntervalNode( start, end, interval)
            # rebalance tree
            if self.priority < self.cleft.priority:
                croot = self.rotate_right()

        croot.set_ends()
        self.cleft.croot  = croot
        self.cright.croot = croot
        return croot

    cdef IntervalNode rotate_right(IntervalNode self):
        cdef IntervalNode croot = self.cleft
        self.cleft   = self.cleft.cright
        croot.cright = self
        self.set_ends()
        return croot

    cdef IntervalNode rotate_left(IntervalNode self):
        cdef IntervalNode croot = self.cright
        self.cright = self.cright.cleft
        croot.cleft = self
        self.set_ends()
        return croot

    cdef inline void set_ends(IntervalNode self):
        if self.cright is not EmptyNode and self.cleft is not EmptyNode:
            self.maxend = dmax3(self.end, self.cright.maxend, self.cleft.maxend)
            self.minend = dmin3(self.end, self.cright.minend, self.cleft.minend)
            self.minstart = dmin3(self.start, self.cright.minstart, self.cleft.minstart)
        elif self.cright is not EmptyNode:
            self.maxend = dmax2(self.end, self.cright.maxend)
            self.minend = dmin2(self.end, self.cright.minend)
            self.minstart = dmin2(self.start, self.cright.minstart)
        elif self.cleft is not EmptyNode:
            self.maxend = dmax2(self.end, self.cleft.maxend)
            self.minend = dmin2(self.end, self.cleft.minend)
            self.minstart = dmin2(self.start, self.cleft.minstart)

    def intersect( self, double start, double end, sort=True ):
        """
        given a start and an end, return a list of features
        falling within that range
        """
        cdef list results = []
        self._intersect( start, end, results )
        if sort:
            results = sorted(results)
        return results

    find = intersect

    cdef void _intersect( IntervalNode self, double start, double end, list results):
        # Left subtree
        if self.cleft is not EmptyNode and self.cleft.maxend > start:
            self.cleft._intersect( start, end, results )
        # This interval
        if ( self.end > start ) and ( self.start < end ):
            results.append( self.interval )
        # Right subtree
        if self.cright is not EmptyNode and self.start < end:
            self.cright._intersect( start, end, results )

    cdef void _seek_left(IntervalNode self, double position, list results, int n, double max_dist):
        # we know we can bail in these 2 cases.
        if self.maxend + max_dist < position:
            return
        if self.minstart > position:
            return

        # the ordering of these 3 blocks makes it so the results are
        # ordered nearest to farthest from the query position
        if self.cright is not EmptyNode:
            self.cright._seek_left(position, results, n, max_dist)

        if -1 < position - self.end < max_dist:
            results.append(self.interval)

        # TODO: can these conditionals be more stringent?
        if self.cleft is not EmptyNode:
            self.cleft._seek_left(position, results, n, max_dist)

    cdef void _seek_right(IntervalNode self, double position, list results, int n, double max_dist):
        # we know we can bail in these 2 cases.
        if self.maxend < position: return
        if self.minstart - max_dist > position: return

        #print "SEEK_RIGHT:",self, self.cleft, self.maxend, self.minstart, position

        # the ordering of these 3 blocks makes it so the results are
        # ordered nearest to farthest from the query position
        if self.cleft is not EmptyNode:
            self.cleft._seek_right(position, results, n, max_dist)

        if -1 < self.start - position < max_dist:
            results.append(self.interval)

        if self.cright is not EmptyNode:
            self.cright._seek_right(position, results, n, max_dist)

    cpdef left(self, position, int n=1, double max_dist=2500):
        """
        find n features with a start > than `position`
        f: an Interval object (or anything with an `end` attribute)
        n: the number of features to return
        max_dist: the maximum distance to look before giving up.
        """
        cdef list results = []
        # use start - 1 because .left() assumes strictly left-of
        self._seek_left( position - 1, results, n, max_dist )
        if len(results) == n: return results
        r = results
        r.sort(key=operator.attrgetter('end'), reverse=True)
        return r[:n]

    cpdef right(self, position, int n=1, double max_dist=2500):
        """
        find n features with an end < than position
        f: an Interval object (or anything with a `start` attribute)
        n: the number of features to return
        max_dist: the maximum distance to look before giving up.
        """
        cdef list results = []
        # use end + 1 because .right() assumes strictly right-of
        self._seek_right(position + 1, results, n, max_dist)
        if len(results) == n: return results
        r = results
        r.sort(key=operator.attrgetter('start'))
        return r[:n]

    def traverse(self):
        if self.cleft is not EmptyNode:
            for node in self.cleft.traverse():
                yield node
        yield self.interval
        if self.cright is not EmptyNode:
            for node in self.cright.traverse():
                yield node

cdef IntervalNode EmptyNode = IntervalNode( 0, 0, Interval(0, 0))

## ---- Wrappers that retain the old interface -------------------------------

cdef class Interval:
    """
    Basic feature, with required integer start and end properties.
    Also accepts optional strand as +1 or -1 (used for up/downstream queries),
    a name, and any arbitrary data is sent in on the info keyword argument

    >>> from bx.intervals.intersection import Interval

    >>> f1 = Interval(23, 36)
    >>> f2 = Interval(34, 48, value={'chr':12, 'anno':'transposon'})
    >>> f2
    Interval(34, 48, value={'anno': 'transposon', 'chr': 12})

    """
    cdef public double start, end
    cdef public object value, chrom, strand

    def __init__(self, double start, double end, object value=None, object chrom=None, object strand=None ):
        assert start <= end, "start must be less than end"
        self.start  = start
        self.end    = end
        self.value  = value
        self.chrom  = chrom
        self.strand = strand

    def __repr__(self):
        fstr = "Interval(%g, %g" % (self.start, self.end)
        if not self.value is None:
            fstr += ", value=" + str(self.value)
        fstr += ")"
        return fstr

    def __richcmp__(self, other, op):
        if op == 0:
            # <
            return self.start < other.start or self.end < other.end
        elif op == 1:
            # <=
            return self == other or self < other
        elif op == 2:
            # ==
            return self.start == other.start and self.end == other.end
        elif op == 3:
            # !=
            return self.start != other.start or self.end != other.end
        elif op == 4:
            # >
            return self.start > other.start or self.end > other.end
        elif op == 5:
            # >=
            return self == other or self > other

cdef class IntervalTree:
    """
    Data structure for performing window intersect queries on a set of
    possibly overlapping 1d intervals.

    Usage
    =====

    Create an empty IntervalTree

    >>> from bx.intervals.intersection import Interval, IntervalTree
    >>> intersecter = IntervalTree()

    An interval is a start and end position and a value (possibly None).
    You can add any object as an interval:

    >>> intersecter.insert( 0, 10, "food" )
    >>> intersecter.insert( 3, 7, dict(foo='bar') )

    >>> intersecter.find( 2, 5 )
    ['food', {'foo': 'bar'}]

    If the object has start and end attributes (like the Interval class) there
    are some shortcuts:

    >>> intersecter = IntervalTree()
    >>> intersecter.insert_interval( Interval( 0, 10 ) )
    >>> intersecter.insert_interval( Interval( 3, 7 ) )
    >>> intersecter.insert_interval( Interval( 3, 40 ) )
    >>> intersecter.insert_interval( Interval( 13, 50 ) )

    >>> intersecter.find( 30, 50 )
    [Interval(3, 40), Interval(13, 50)]
    >>> intersecter.find( 100, 200 )
    []

    Before/after for intervals

    >>> intersecter.before_interval( Interval( 10, 20 ) )
    [Interval(3, 7)]
    >>> intersecter.before_interval( Interval( 5, 20 ) )
    []

    Upstream/downstream

    >>> intersecter.upstream_of_interval(Interval(11, 12))
    [Interval(0, 10)]
    >>> intersecter.upstream_of_interval(Interval(11, 12, strand="-"))
    [Interval(13, 50)]

    >>> intersecter.upstream_of_interval(Interval(1, 2, strand="-"), num_intervals=3)
    [Interval(3, 7), Interval(3, 40), Interval(13, 50)]

    """

    cdef IntervalNode root

    def __cinit__( self ):
        self.root = None

    # Helper for plots
    def emptynode( self ):
        return EmptyNode

    def rootnode( self ):
        return self.root

    # ---- Position based interfaces -----------------------------------------

    def insert( self, double start, double end, object value=None ):
        """
        Insert the interval [start,end) associated with value `value`.
        """
        if self.root is None:
            self.root = IntervalNode( start, end, value )
        else:
            self.root = self.root.insert( start, end, value )

    add = insert

    def find( self, start, end ):
        """
        Return a sorted list of all intervals overlapping [start,end).
        """
        if self.root is None:
            return []
        return self.root.find( start, end )

    def before( self, position, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie before `position` and are no
        further than `max_dist` positions away
        """
        if self.root is None:
            return []
        return self.root.left( position, num_intervals, max_dist )

    def after( self, position, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie after `position` and are no
        further than `max_dist` positions away
        """
        if self.root is None:
            return []
        return self.root.right( position, num_intervals, max_dist )

    # ---- Interval-like object based interfaces -----------------------------

    def insert_interval( self, interval ):
        """
        Insert an "interval" like object (one with at least start and end
        attributes)
        """
        self.insert( interval.start, interval.end, interval )

    add_interval = insert_interval

    def before_interval( self, interval, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie completely before `interval`
        and are no further than `max_dist` positions away
        """
        if self.root is None:
            return []
        return self.root.left( interval.start, num_intervals, max_dist )

    def after_interval( self, interval, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie completely after `interval` and
        are no further than `max_dist` positions away
        """
        if self.root is None:
            return []
        return self.root.right( interval.end, num_intervals, max_dist )

    def upstream_of_interval( self, interval, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie completely upstream of
        `interval` and are no further than `max_dist` positions away
        """
        if self.root is None:
            return []
        if interval.strand == -1 or interval.strand == "-":
            return self.root.right( interval.end, num_intervals, max_dist )
        else:
            return self.root.left( interval.start, num_intervals, max_dist )

    def downstream_of_interval( self, interval, num_intervals=1, max_dist=2500 ):
        """
        Find `num_intervals` intervals that lie completely downstream of
        `interval` and are no further than `max_dist` positions away
        """
        if self.root is None:
            return []
        if interval.strand == -1 or interval.strand == "-":
            return self.root.left( interval.start, num_intervals, max_dist )
        else:
            return self.root.right( interval.end, num_intervals, max_dist )

    def traverse(self):
        """
        iterator that traverses the tree
        """
        if self.root is None:
            return iter([])
        return self.root.traverse()

# For backward compatibility
Intersecter = IntervalTree
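The `_intersect` walk above uses strict half-open semantics: an interval `[a, b)` matches a query `[start, end)` iff `b > start and a < end`, so intervals that merely touch at an endpoint do not match. A linear-scan sketch of the same predicate (helper names are illustrative; there is no treap balancing here, so this is O(n) per query rather than O(log n + k)):

```python
def overlaps(a_start, a_end, b_start, b_end):
    # Half-open [start, end) overlap test, as in IntervalNode._intersect
    return a_end > b_start and a_start < b_end

def intersect(intervals, start, end):
    """Linear-scan stand-in for IntervalTree.find(); returns sorted matches."""
    return sorted(i for i in intervals if overlaps(i[0], i[1], start, end))

ivals = [(0, 10), (3, 7), (3, 40), (13, 50)]
print(intersect(ivals, 30, 50))   # -> [(3, 40), (13, 50)]
# Endpoints don't count: [0, 10) and [13, 50) both only *touch* [10, 13)
print(intersect(ivals, 10, 13))   # -> [(3, 40)]
```

This matches the doctest results in the `IntervalTree` docstring above while making the endpoint behavior explicit.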
101  nilmdb/client.py
@@ -1,18 +1,14 @@
-# -*- coding: utf-8 -*-
 """Class for performing HTTP client requests via libcurl"""
 
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 import time
 import sys
 import re
 import os
 import simplejson as json
-import itertools
-
-import nilmdb.utils
 
 import nilmdb.httpclient
 
 # Other functions expect to see these in the nilmdb.client namespace
@@ -20,10 +16,6 @@ from nilmdb.httpclient import ClientError, ServerError, Error
 
 version = "1.0"
 
-def float_to_string(f):
-    # Use repr to maintain full precision in the string output.
-    return repr(float(f))
-
 class Client(object):
     """Main client interface to the Nilm database."""
 
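The removed `float_to_string` helper leans on `repr()` because `repr()` of a Python float round-trips the underlying double exactly, whereas short printf-style formats can silently truncate a timestamp. A standalone demonstration (not part of the diff itself):

```python
def float_to_string(f):
    # Use repr to maintain full precision in the string output.
    return repr(float(f))

t = 1234567890.123456          # a typical UNIX timestamp with microseconds
assert float(float_to_string(t)) == t   # repr round-trips exactly
assert float("%.6g" % t) != t           # a short format loses the sub-second part
```

This is why the left-hand version funnels all `start`/`end` parameters through one helper instead of formatting them inline.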
@@ -92,88 +84,33 @@ class Client(object):
                    "layout" : layout }
         return self.http.get("stream/create", params)
 
-    def stream_destroy(self, path):
-        """Delete stream and its contents"""
-        params = { "path": path }
-        return self.http.get("stream/destroy", params)
-
-    def stream_remove(self, path, start = None, end = None):
-        """Remove data from the specified time range"""
-        params = {
-            "path": path
-        }
-        if start is not None:
-            params["start"] = float_to_string(start)
-        if end is not None:
-            params["end"] = float_to_string(end)
-        return self.http.get("stream/remove", params)
-
-    def stream_insert(self, path, data, start = None, end = None):
+    def stream_insert(self, path, data):
         """Insert data into a stream.  data should be a file-like object
-        that provides ASCII data that matches the database layout for path.
-
-        start and end are the starting and ending timestamp of this
-        stream; all timestamps t in the data must satisfy 'start <= t
-        < end'.  If left unspecified, 'start' is the timestamp of the
-        first line of data, and 'end' is the timestamp on the last line
-        of data, plus a small delta of 1μs.
-        """
+        that provides ASCII data that matches the database layout for path."""
         params = { "path": path }
 
         # See design.md for a discussion of how much data to send.
         # These are soft limits -- actual data might be rounded up.
         max_data = 1048576
         max_time = 30
-        end_epsilon = 1e-6
-
-        def extract_timestamp(line):
-            return float(line.split()[0])
 
         def sendit():
-            # If we have more data after this, use the timestamp of
-            # the next line as the end.  Otherwise, use the given
-            # overall end time, or add end_epsilon to the last data
-            # point.
-            if nextline:
-                block_end = extract_timestamp(nextline)
-                if end and block_end > end:
-                    # This is unexpected, but we'll defer to the server
-                    # to return an error in this case.
-                    block_end = end
-            elif end:
-                block_end = end
-            else:
-                block_end = extract_timestamp(line) + end_epsilon
-
-            # Send it
-            params["start"] = float_to_string(block_start)
-            params["end"] = float_to_string(block_end)
-            return self.http.put("stream/insert", block_data, params)
+            result = self.http.put("stream/insert", send_data, params)
+            params["old_timestamp"] = result[1]
+            return result
 
-        clock_start = time.time()
-        block_data = ""
-        block_start = start
         result = None
-        for (line, nextline) in nilmdb.utils.misc.pairwise(data):
-            # If we don't have a starting time, extract it from the first line
-            if block_start is None:
-                block_start = extract_timestamp(line)
-
-            clock_elapsed = time.time() - clock_start
-            block_data += line
-
-            # If we have enough data, or enough time has elapsed,
-            # send this block to the server, and empty things out
-            # for the next block.
-            if (len(block_data) > max_data) or (clock_elapsed > max_time):
+        start = time.time()
+        send_data = ""
+        for line in data:
+            elapsed = time.time() - start
+            send_data += line
+
+            if (len(send_data) > max_data) or (elapsed > max_time):
                 result = sendit()
-                block_start = None
-                block_data = ""
-                clock_start = time.time()
-
-        # One last block?
-        if len(block_data):
+                send_data = ""
+                start = time.time()
+        if len(send_data):
             result = sendit()
 
         # Return the most recent JSON result we got back, or None if
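Both sides of this hunk implement the same batching loop: accumulate ASCII lines and PUT a block to `stream/insert` whenever a soft size or time limit trips, plus one final flush for the last partial block. A standalone sketch of that loop (the `send_in_blocks`/`send` names are illustrative; the real code posts via `self.http.put`):

```python
import time

MAX_DATA = 1048576   # soft limit: ~1 MiB of data per request
MAX_TIME = 30        # soft limit: seconds between requests

def send_in_blocks(lines, send):
    """Accumulate lines, flushing via send() whenever either limit trips.

    `send` stands in for the HTTP PUT to stream/insert."""
    result = None
    block = ""
    clock_start = time.time()
    for line in lines:
        block += line
        if len(block) > MAX_DATA or (time.time() - clock_start) > MAX_TIME:
            result = send(block)
            block = ""
            clock_start = time.time()
    if block:                     # one last partial block
        result = send(block)
    return result

sent = []
send_in_blocks(["1 2\n", "2 3\n"], sent.append)
print(sent)   # -> ['1 2\n2 3\n']
```

Because the limits are checked only after appending a whole line, a block can exceed `MAX_DATA` slightly, which is what the "soft limits -- actual data might be rounded up" comment refers to.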
@@ -188,9 +125,9 @@ class Client(object):
             "path": path
         }
         if start is not None:
-            params["start"] = float_to_string(start)
+            params["start"] = repr(start) # use repr to keep precision
         if end is not None:
-            params["end"] = float_to_string(end)
+            params["end"] = repr(end)
         return self.http.get_gen("stream/intervals", params, retjson = True)
 
     def stream_extract(self, path, start = None, end = None, count = False):
@@ -206,9 +143,9 @@ class Client(object):
             "path": path,
         }
         if start is not None:
-            params["start"] = float_to_string(start)
+            params["start"] = repr(start) # use repr to keep precision
         if end is not None:
-            params["end"] = float_to_string(end)
+            params["end"] = repr(end)
         if count:
             params["count"] = 1
 
nilmdb/cmdline/cmdline.py
@@ -1,7 +1,7 @@
 """Command line client functionality"""
 
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.client
 
 import datetime_tz
@@ -11,12 +11,11 @@ import re
 import argparse
 from argparse import ArgumentDefaultsHelpFormatter as def_form
 
-version = "1.0"
+version = "0.1"
 
 # Valid subcommands.  Defined in separate files just to break
 # things up -- they're still called with Cmdline as self.
-subcommands = [ "info", "create", "list", "metadata", "insert", "extract",
-                "remove", "destroy" ]
+subcommands = [ "info", "create", "list", "metadata", "insert", "extract" ]
 
 # Import the subcommand modules.  Equivalent way of doing this would be
 # from . import info as cmd_info
@@ -24,16 +23,10 @@ subcmd_mods = {}
 for cmd in subcommands:
     subcmd_mods[cmd] = __import__("nilmdb.cmdline." + cmd, fromlist = [ cmd ])
 
-class JimArgumentParser(argparse.ArgumentParser):
-    def error(self, message):
-        self.print_usage(sys.stderr)
-        self.exit(2, sprintf("error: %s\n", message))
-
 class Cmdline(object):
 
     def __init__(self, argv):
         self.argv = argv
-        self.client = None
 
     def arg_time(self, toparse):
         """Parse a time string argument"""
@@ -99,7 +92,7 @@ class Cmdline(object):
         version_string = sprintf("nilmtool %s, client library %s",
                                  version, nilmdb.Client.client_version)
 
-        self.parser = JimArgumentParser(add_help = False,
-                                        formatter_class = def_form)
+        self.parser = argparse.ArgumentParser(add_help = False,
+                                              formatter_class = def_form)
 
         group = self.parser.add_argument_group("General options")
@@ -125,7 +118,6 @@ class Cmdline(object):
 
     def die(self, formatstr, *args):
         fprintf(sys.stderr, formatstr + "\n", *args)
-        if self.client:
-            self.client.close()
+        self.client.close()
         sys.exit(-1)
 
@@ -138,17 +130,13 @@ class Cmdline(object):
         self.parser_setup()
         self.args = self.parser.parse_args(self.argv)
 
-        # Run arg verify handler if there is one
-        if "verify" in self.args:
-            self.args.verify(self)
-
         self.client = nilmdb.Client(self.args.url)
 
         # Make a test connection to make sure things work
         try:
             server_version = self.client.version()
         except nilmdb.client.Error as e:
-            self.die("error connecting to server: %s", str(e))
+            self.die("Error connecting to server: %s", str(e))
 
         # Now dispatch client request to appropriate function.  Parser
         # should have ensured that we don't have any unknown commands
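The removed `JimArgumentParser` exists only to override `argparse.ArgumentParser.error` so that bad arguments print the usage line plus a plain `error: ...` message and exit with status 2, instead of argparse's default prog-prefixed wording. A self-contained sketch of that override (the class name and the `--url` argument here are illustrative):

```python
import argparse
import sys

class UsageErrorParser(argparse.ArgumentParser):
    """Print usage to stderr, then exit(2) with a bare "error: ..." line."""
    def error(self, message):
        self.print_usage(sys.stderr)
        self.exit(2, "error: %s\n" % message)

parser = UsageErrorParser(prog="nilmtool")
parser.add_argument("--url", required=True)

try:
    parser.parse_args([])          # missing --url triggers error()
except SystemExit as e:
    print(e.code)                  # -> 2
```

`ArgumentParser.exit` raises `SystemExit`, so the custom message reaches the user and the shell still sees the conventional exit status 2.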
nilmdb/cmdline/create.py
@@ -1,5 +1,5 @@
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.client
 
 from argparse import ArgumentDefaultsHelpFormatter as def_form
@@ -24,4 +24,4 @@ def cmd_create(self):
     try:
         self.client.stream_create(self.args.path, self.args.layout)
     except nilmdb.client.ClientError as e:
-        self.die("error creating stream: %s", str(e))
+        self.die("Error creating stream: %s", str(e))
nilmdb/cmdline/destroy.py
@@ -1,25 +0,0 @@
-from __future__ import absolute_import
-from nilmdb.utils.printf import *
-import nilmdb.client
-
-from argparse import ArgumentDefaultsHelpFormatter as def_form
-
-def setup(self, sub):
-    cmd = sub.add_parser("destroy", help="Delete a stream and all data",
-                         formatter_class = def_form,
-                         description="""
-                         Destroy the stream at the specified path.  All
-                         data and metadata related to the stream is
-                         permanently deleted.
-                         """)
-    cmd.set_defaults(handler = cmd_destroy)
-    group = cmd.add_argument_group("Required arguments")
-    group.add_argument("path",
-                       help="Path of the stream to delete, e.g. /foo/bar")
-
-def cmd_destroy(self):
-    """Destroy stream"""
-    try:
-        self.client.stream_destroy(self.args.path)
-    except nilmdb.client.ClientError as e:
-        self.die("error destroying stream: %s", str(e))
nilmdb/cmdline/extract.py
@@ -1,7 +1,7 @@
 from __future__ import absolute_import
-from __future__ import print_function
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.client
+import nilmdb.layout
 import sys
 
 def setup(self, sub):
@@ -9,18 +9,17 @@ def setup(self, sub):
                      description="""
                      Extract data from a stream.
                      """)
-    cmd.set_defaults(verify = cmd_extract_verify,
-                     handler = cmd_extract)
+    cmd.set_defaults(handler = cmd_extract)
 
     group = cmd.add_argument_group("Data selection")
     group.add_argument("path",
                        help="Path of stream, e.g. /foo/bar")
     group.add_argument("-s", "--start", required=True,
                        metavar="TIME", type=self.arg_time,
-                       help="Starting timestamp (free-form, inclusive)")
+                       help="Starting timestamp (free-form)")
     group.add_argument("-e", "--end", required=True,
                        metavar="TIME", type=self.arg_time,
-                       help="Ending timestamp (free-form, noninclusive)")
+                       help="Ending timestamp (free-form)")
 
     group = cmd.add_argument_group("Output format")
     group.add_argument("-b", "--bare", action="store_true",
@@ -31,15 +30,10 @@ def setup(self, sub):
     group.add_argument("-c", "--count", action="store_true",
                        help="Just output a count of matched data points")
 
-def cmd_extract_verify(self):
-    if self.args.start is not None and self.args.end is not None:
-        if self.args.start > self.args.end:
-            self.parser.error("start is after end")
-
 def cmd_extract(self):
     streams = self.client.stream_list(self.args.path)
     if len(streams) != 1:
|
if len(streams) != 1:
|
||||||
self.die("error getting stream info for path %s", self.args.path)
|
self.die("Error getting stream info for path %s", self.args.path)
|
||||||
layout = streams[0][1]
|
layout = streams[0][1]
|
||||||
|
|
||||||
if self.args.annotate:
|
if self.args.annotate:
|
||||||
@@ -57,7 +51,7 @@ def cmd_extract(self):
|
|||||||
# Strip timestamp (first element). Doesn't make sense
|
# Strip timestamp (first element). Doesn't make sense
|
||||||
# if we are only returning a count.
|
# if we are only returning a count.
|
||||||
dataline = ' '.join(dataline.split(' ')[1:])
|
dataline = ' '.join(dataline.split(' ')[1:])
|
||||||
print(dataline)
|
print dataline
|
||||||
printed = True
|
printed = True
|
||||||
if not printed:
|
if not printed:
|
||||||
if self.args.annotate:
|
if self.args.annotate:
|
||||||
|
@@ -1,5 +1,5 @@
|
|||||||
from __future__ import absolute_import
|
from __future__ import absolute_import
|
||||||
from nilmdb.utils.printf import *
|
from nilmdb.printf import *
|
||||||
|
|
||||||
from argparse import ArgumentDefaultsHelpFormatter as def_form
|
from argparse import ArgumentDefaultsHelpFormatter as def_form
|
||||||
|
|
||||||
|
@@ -1,6 +1,7 @@
|
|||||||
from __future__ import absolute_import
|
from __future__ import absolute_import
|
||||||
from nilmdb.utils.printf import *
|
from nilmdb.printf import *
|
||||||
import nilmdb.client
|
import nilmdb.client
|
||||||
|
import nilmdb.layout
|
||||||
import nilmdb.timestamper
|
import nilmdb.timestamper
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
@@ -51,12 +52,12 @@ def cmd_insert(self):
|
|||||||
# Find requested stream
|
# Find requested stream
|
||||||
streams = self.client.stream_list(self.args.path)
|
streams = self.client.stream_list(self.args.path)
|
||||||
if len(streams) != 1:
|
if len(streams) != 1:
|
||||||
self.die("error getting stream info for path %s", self.args.path)
|
self.die("Error getting stream info for path %s", self.args.path)
|
||||||
|
|
||||||
layout = streams[0][1]
|
layout = streams[0][1]
|
||||||
|
|
||||||
if self.args.start and len(self.args.file) != 1:
|
if self.args.start and len(self.args.file) != 1:
|
||||||
self.die("error: --start can only be used with one input file")
|
self.die("--start can only be used with one input file, for now")
|
||||||
|
|
||||||
for filename in self.args.file:
|
for filename in self.args.file:
|
||||||
if filename == '-':
|
if filename == '-':
|
||||||
@@ -65,7 +66,7 @@ def cmd_insert(self):
|
|||||||
try:
|
try:
|
||||||
infile = open(filename, "r")
|
infile = open(filename, "r")
|
||||||
except IOError:
|
except IOError:
|
||||||
self.die("error opening input file %s", filename)
|
self.die("Error opening input file %s", filename)
|
||||||
|
|
||||||
# Build a timestamper for this file
|
# Build a timestamper for this file
|
||||||
if self.args.none:
|
if self.args.none:
|
||||||
@@ -77,11 +78,11 @@ def cmd_insert(self):
|
|||||||
try:
|
try:
|
||||||
start = self.parse_time(filename)
|
start = self.parse_time(filename)
|
||||||
except ValueError:
|
except ValueError:
|
||||||
self.die("error extracting time from filename '%s'",
|
self.die("Error extracting time from filename '%s'",
|
||||||
filename)
|
filename)
|
||||||
|
|
||||||
if not self.args.rate:
|
if not self.args.rate:
|
||||||
self.die("error: --rate is needed, but was not specified")
|
self.die("Need to specify --rate")
|
||||||
rate = self.args.rate
|
rate = self.args.rate
|
||||||
|
|
||||||
ts = nilmdb.timestamper.TimestamperRate(infile, start, rate)
|
ts = nilmdb.timestamper.TimestamperRate(infile, start, rate)
|
||||||
@@ -100,6 +101,6 @@ def cmd_insert(self):
|
|||||||
# ugly bracketed ranges of 16-digit numbers and a mangled URL.
|
# ugly bracketed ranges of 16-digit numbers and a mangled URL.
|
||||||
# Need to consider adding something like e.prettyprint()
|
# Need to consider adding something like e.prettyprint()
|
||||||
# that is smarter about the contents of the error.
|
# that is smarter about the contents of the error.
|
||||||
self.die("error inserting data: %s", str(e))
|
self.die("Error inserting data: %s", str(e))
|
||||||
|
|
||||||
return
|
return
|
||||||
|
@@ -1,9 +1,8 @@
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.client
 
 import fnmatch
-import argparse
 from argparse import ArgumentDefaultsHelpFormatter as def_form
 
 def setup(self, sub):
@@ -14,41 +13,23 @@ def setup(self, sub):
                        optionally filtering by layout or path.  Wildcards
                        are accepted.
                        """)
-    cmd.set_defaults(verify = cmd_list_verify,
-                     handler = cmd_list)
+    cmd.set_defaults(handler = cmd_list)
 
     group = cmd.add_argument_group("Stream filtering")
-    group.add_argument("-p", "--path", metavar="PATH", default="*",
-                       help="Match only this path (-p can be omitted)")
-    group.add_argument("path_positional", default="*",
-                       nargs="?", help=argparse.SUPPRESS)
     group.add_argument("-l", "--layout", default="*",
                        help="Match only this stream layout")
+    group.add_argument("-p", "--path", default="*",
+                       help="Match only this path")
 
     group = cmd.add_argument_group("Interval details")
     group.add_argument("-d", "--detail", action="store_true",
                        help="Show available data time intervals")
     group.add_argument("-s", "--start",
                        metavar="TIME", type=self.arg_time,
-                       help="Starting timestamp (free-form, inclusive)")
+                       help="Starting timestamp (free-form)")
     group.add_argument("-e", "--end",
                        metavar="TIME", type=self.arg_time,
-                       help="Ending timestamp (free-form, noninclusive)")
+                       help="Ending timestamp (free-form)")
 
-def cmd_list_verify(self):
-    # A hidden "path_positional" argument lets the user leave off the
-    # "-p" when specifying the path.  Handle it here.
-    got_opt = self.args.path != "*"
-    got_pos = self.args.path_positional != "*"
-    if got_pos:
-        if got_opt:
-            self.parser.error("too many paths specified")
-        else:
-            self.args.path = self.args.path_positional
-
-    if self.args.start is not None and self.args.end is not None:
-        if self.args.start > self.args.end:
-            self.parser.error("start is after end")
-
 def cmd_list(self):
     """List available streams"""
@@ -1,5 +1,5 @@
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.client
 
 def setup(self, sub):
@@ -43,21 +43,21 @@ def cmd_metadata(self):
         for keyval in keyvals:
             kv = keyval.split('=')
             if len(kv) != 2 or kv[0] == "":
-                self.die("error parsing key=value argument '%s'", keyval)
+                self.die("Error parsing key=value argument '%s'", keyval)
             data[kv[0]] = kv[1]
 
         # Make the call
         try:
             handler(self.args.path, data)
         except nilmdb.client.ClientError as e:
-            self.die("error setting/updating metadata: %s", str(e))
+            self.die("Error setting/updating metadata: %s", str(e))
     else:
         # Get (or unspecified)
         keys = self.args.get or None
         try:
             data = self.client.stream_get_metadata(self.args.path, keys)
         except nilmdb.client.ClientError as e:
-            self.die("error getting metadata: %s", str(e))
+            self.die("Error getting metadata: %s", str(e))
         for key, value in sorted(data.items()):
             # Omit nonexistant keys
             if value is None:
@@ -1,45 +0,0 @@
-from __future__ import absolute_import
-from __future__ import print_function
-from nilmdb.utils.printf import *
-import nilmdb.client
-import sys
-
-def setup(self, sub):
-    cmd = sub.add_parser("remove", help="Remove data",
-                         description="""
-                         Remove all data from a specified time range within a
-                         stream.
-                         """)
-    cmd.set_defaults(verify = cmd_remove_verify,
-                     handler = cmd_remove)
-
-    group = cmd.add_argument_group("Data selection")
-    group.add_argument("path",
-                       help="Path of stream, e.g. /foo/bar")
-    group.add_argument("-s", "--start", required=True,
-                       metavar="TIME", type=self.arg_time,
-                       help="Starting timestamp (free-form, inclusive)")
-    group.add_argument("-e", "--end", required=True,
-                       metavar="TIME", type=self.arg_time,
-                       help="Ending timestamp (free-form, noninclusive)")
-
-    group = cmd.add_argument_group("Output format")
-    group.add_argument("-c", "--count", action="store_true",
-                       help="Output number of data points removed")
-
-def cmd_remove_verify(self):
-    if self.args.start is not None and self.args.end is not None:
-        if self.args.start > self.args.end:
-            self.parser.error("start is after end")
-
-def cmd_remove(self):
-    try:
-        count = self.client.stream_remove(self.args.path,
-                                          self.args.start, self.args.end)
-    except nilmdb.client.ClientError as e:
-        self.die("error removing data: %s", str(e))
-
-    if self.args.count:
-        printf("%d\n", count)
-
-    return 0
@@ -1,8 +1,7 @@
 """HTTP client library"""
 
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
-import nilmdb.utils
+from nilmdb.printf import *
 
 import time
 import sys
@@ -10,9 +9,12 @@ import re
 import os
 import simplejson as json
 import urlparse
+import urllib
 import pycurl
 import cStringIO
 
+import nilmdb.iteratorizer
+
 class Error(Exception):
     """Base exception for both ClientError and ServerError responses"""
     def __init__(self,
@@ -26,19 +28,12 @@ class Error(Exception):
         self.url = url             # URL we were requesting
         self.traceback = traceback # server traceback, if available
     def __str__(self):
-        s = sprintf("[%s]", self.status)
-        if self.message:
-            s += sprintf(" %s", self.message)
-        if self.traceback: # pragma: no cover
-            s += sprintf("\nServer traceback:\n%s", self.traceback)
-        return s
-    def __repr__(self): # pragma: no cover
         s = sprintf("[%s]", self.status)
         if self.message:
             s += sprintf(" %s", self.message)
         if self.url:
             s += sprintf(" (%s)", self.url)
-        if self.traceback:
+        if self.traceback: # pragma: no cover
             s += sprintf("\nServer traceback:\n%s", self.traceback)
         return s
 class ClientError(Error):
@@ -65,8 +60,7 @@ class HTTPClient(object):
     def _setup_url(self, url = "", params = ""):
         url = urlparse.urljoin(self.baseurl, url)
         if params:
-            url = urlparse.urljoin(
-                url, "?" + nilmdb.utils.urllib.urlencode(params))
+            url = urlparse.urljoin(url, "?" + urllib.urlencode(params, True))
         self.curl.setopt(pycurl.URL, url)
         self.url = url
 
@@ -91,10 +85,6 @@ class HTTPClient(object):
             raise ClientError(**args)
         else: # pragma: no cover
             if code >= 500 and code <= 599:
-                if args["message"] is None:
-                    args["message"] = ("(no message; try disabling " +
-                                       "response.stream option in " +
-                                       "nilmdb.server for better debugging)")
                 raise ServerError(**args)
             else:
                 raise Error(**args)
@@ -119,7 +109,7 @@ class HTTPClient(object):
             self.curl.setopt(pycurl.WRITEFUNCTION, callback)
             self.curl.perform()
         try:
-            for i in nilmdb.utils.Iteratorizer(func):
+            for i in nilmdb.iteratorizer.Iteratorizer(func):
                 if self._status == 200:
                     # If we had a 200 response, yield the data to the caller.
                     yield i
@@ -1,82 +1,58 @@
-"""Interval, IntervalSet
+"""Interval and IntervalSet
 
 Represents an interval of time, and a set of such intervals.
 
-Intervals are half-open, ie. they include data points with timestamps
-[start, end)
+Intervals are closed, ie. they include timestamps [start, end]
 """
 
 # First implementation kept a sorted list of intervals and used
 # biesct() to optimize some operations, but this was too slow.
 
-# Second version was based on the quicksect implementation from
-# python-bx, modified slightly to handle floating point intervals.
-# This didn't support deletion.
+# This version is based on the quicksect implementation from python-bx,
+# modified slightly to handle floating point intervals.
 
-# Third version is more similar to the first version, using a rb-tree
-# instead of a simple sorted list to maintain O(log n) operations.
+import pyximport
+pyximport.install()
+import bxintersect
 
-# Fourth version is an optimized rb-tree that stores interval starts
-# and ends directly in the tree, like bxinterval did.
+import bisect
 
-cimport rbtree
-cdef extern from "stdint.h":
-    ctypedef unsigned long long uint64_t
-
 class IntervalError(Exception):
     """Error due to interval overlap, etc"""
     pass
 
-cdef class Interval:
+class Interval(bxintersect.Interval):
     """Represents an interval of time."""
 
-    cdef public double start, end
-
-    def __init__(self, double start, double end):
+    def __init__(self, start, end):
         """
         'start' and 'end' are arbitrary floats that represent time
         """
         if start > end:
-            # Explicitly disallow zero-width intervals (since they're half-open)
             raise IntervalError("start %s must precede end %s" % (start, end))
-        self.start = float(start)
-        self.end = float(end)
+        bxintersect.Interval.__init__(self, start, end)
 
     def __repr__(self):
         s = repr(self.start) + ", " + repr(self.end)
         return self.__class__.__name__ + "(" + s + ")"
 
     def __str__(self):
-        return "[" + repr(self.start) + " -> " + repr(self.end) + ")"
+        return "[" + str(self.start) + " -> " + str(self.end) + "]"
 
-    def __cmp__(self, Interval other):
-        """Compare two intervals.  If non-equal, order by start then end"""
-        if not isinstance(other, Interval):
-            raise TypeError("bad type")
-        if self.start == other.start:
-            if self.end < other.end:
-                return -1
-            if self.end > other.end:
-                return 1
-            return 0
-        if self.start < other.start:
-            return -1
-        return 1
-
-    cpdef intersects(self, Interval other):
+    def intersects(self, other):
         """Return True if two Interval objects intersect"""
         if (self.end <= other.start or self.start >= other.end):
            return False
        return True
 
-    cpdef subset(self, double start, double end):
+    def subset(self, start, end):
        """Return a new Interval that is a subset of this one"""
        # A subclass that tracks additional data might override this.
        if start < self.start or end > self.end:
            raise IntervalError("not a subset")
        return Interval(start, end)
 
-cdef class DBInterval(Interval):
+class DBInterval(Interval):
    """
    Like Interval, but also tracks corresponding start/end times and
    positions within the database.  These are not currently modified
@@ -90,10 +66,6 @@ cdef class DBInterval(Interval):
       end = 150
       db_end = 200, db_endpos = 20000
    """
 
-    cpdef public double db_start, db_end
-    cpdef public uint64_t db_startpos, db_endpos
-
    def __init__(self, start, end,
                 db_start, db_end,
                 db_startpos, db_endpos):
@@ -118,7 +90,7 @@ cdef class DBInterval(Interval):
        s += ", " + repr(self.db_startpos) + ", " + repr(self.db_endpos)
        return self.__class__.__name__ + "(" + s + ")"
 
-    cpdef subset(self, double start, double end):
+    def subset(self, start, end):
        """
        Return a new DBInterval that is a subset of this one
        """
@@ -128,25 +100,21 @@ cdef class DBInterval(Interval):
                          self.db_start, self.db_end,
                          self.db_startpos, self.db_endpos)
 
-cdef class IntervalSet:
+class IntervalSet(object):
    """
    A non-intersecting set of intervals.
    """
 
-    cdef public rbtree.RBTree tree
-
    def __init__(self, source=None):
        """
        'source' is an Interval or IntervalSet to add.
        """
-        self.tree = rbtree.RBTree()
+        self.tree = bxintersect.IntervalTree()
        if source is not None:
            self += source
 
    def __iter__(self):
-        for node in self.tree:
-            if node.obj:
-                yield node.obj
+        return self.tree.traverse()
 
    def __len__(self):
        return sum(1 for x in self)
@@ -159,7 +127,7 @@ cdef class IntervalSet:
        descs = [ str(x) for x in self ]
        return "[" + ", ".join(descs) + "]"
 
-    def __match__(self, other):
+    def __eq__(self, other):
        # This isn't particularly efficient, but it shouldn't get used in the
        # general case.
        """Test equality of two IntervalSets.
@@ -178,8 +146,8 @@ cdef class IntervalSet:
            else:
                return False
 
-        this = list(self)
-        that = list(other)
+        this = [ x for x in self ]
+        that = [ x for x in other ]
 
        try:
            while True:
@@ -210,20 +178,10 @@ cdef class IntervalSet:
        except IndexError:
            return False
 
-    # Use __richcmp__ instead of __eq__, __ne__ for Cython.
-    def __richcmp__(self, other, int op):
-        if op == 2: # ==
-            return self.__match__(other)
-        elif op == 3: # !=
-            return not self.__match__(other)
-        return False
-    #def __eq__(self, other):
-    #    return self.__match__(other)
-    #
-    #def __ne__(self, other):
-    #    return not self.__match__(other)
+    def __ne__(self, other):
+        return not self.__eq__(other)
 
-    def __iadd__(self, object other not None):
+    def __iadd__(self, other):
        """Inplace add -- modifies self
 
        This throws an exception if the regions being added intersect."""
@@ -231,36 +189,19 @@ cdef class IntervalSet:
            if self.intersects(other):
                raise IntervalError("Tried to add overlapping interval "
                                    "to this set")
-            self.tree.insert(rbtree.RBNode(other.start, other.end, other))
+            self.tree.insert_interval(other)
        else:
            for x in other:
                self.__iadd__(x)
        return self
 
-    def iadd_nocheck(self, Interval other not None):
-        """Inplace add -- modifies self.
-        'Optimized' version that doesn't check for intersection and
-        only inserts the new interval into the tree."""
-        self.tree.insert(rbtree.RBNode(other.start, other.end, other))
-
-    def __isub__(self, Interval other not None):
-        """Inplace subtract -- modifies self
-
-        Removes an interval from the set.  Must exist exactly
-        as provided -- cannot remove a subset of an existing interval."""
-        i = self.tree.find(other.start, other.end)
-        if i is None:
-            raise IntervalError("interval " + str(other) + " not in tree")
-        self.tree.delete(i)
-        return self
-
-    def __add__(self, other not None):
+    def __add__(self, other):
        """Add -- returns a new object"""
        new = IntervalSet(self)
        new += IntervalSet(other)
        return new
 
-    def __and__(self, other not None):
+    def __and__(self, other):
        """
        Compute a new IntervalSet from the intersection of two others
 
@@ -270,16 +211,15 @@ cdef class IntervalSet:
        out = IntervalSet()
 
        if not isinstance(other, IntervalSet):
-            for i in self.intersection(other):
-                out.tree.insert(rbtree.RBNode(i.start, i.end, i))
-        else:
+            other = [ other ]
            for x in other:
                for i in self.intersection(x):
-                    out.tree.insert(rbtree.RBNode(i.start, i.end, i))
+                    out.tree.insert_interval(i)
 
        return out
 
-    def intersection(self, Interval interval not None, orig = False):
+    def intersection(self, interval):
        """
        Compute a sequence of intervals that correspond to the
        intersection between `self` and the provided interval.
@@ -288,42 +228,14 @@ cdef class IntervalSet:
 
        Output intervals are built as subsets of the intervals in the
        first argument (self).
-
-        If orig = True, also return the original interval that was
-        (potentially) subsetted to make the one that is being
-        returned.
        """
-        if not isinstance(interval, Interval):
-            raise TypeError("bad type")
-        for n in self.tree.intersect(interval.start, interval.end):
-            i = n.obj
-            if i:
-                if i.start >= interval.start and i.end <= interval.end:
-                    if orig:
-                        yield (i, i)
-                    else:
-                        yield i
-                else:
-                    subset = i.subset(max(i.start, interval.start),
-                                      min(i.end, interval.end))
-                    if orig:
-                        yield (subset, i)
-                    else:
-                        yield subset
+        for i in self.tree.find(interval.start, interval.end):
+            if i.start > interval.start and i.end < interval.end:
+                yield i
+            else:
+                yield i.subset(max(i.start, interval.start),
+                               min(i.end, interval.end))
 
-    cpdef intersects(self, Interval other):
+    def intersects(self, other):
        """Return True if this IntervalSet intersects another interval"""
-        for n in self.tree.intersect(other.start, other.end):
-            if n.obj.intersects(other):
-                return True
-        return False
-
-    def find_end(self, double t):
-        """
-        Return an Interval from this tree that ends at time t, or
-        None if it doesn't exist.
-        """
-        n = self.tree.find_left_end(t)
-        if n and n.obj.end == t:
-            return n.obj
-        return None
+        return len(self.tree.find(other.start, other.end)) > 0
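The interval-overlap test in the hunks above is a context line, identical on both sides of the diff: two intervals fail to intersect exactly when one ends at or before the point where the other starts. A minimal plain-Python sketch of that logic (the real code is Cython in interval.pyx; class and method names mirror the diff, and the constructor check is simplified):

```python
# Plain-Python sketch of the Interval logic shown in the diff above.
class Interval(object):
    def __init__(self, start, end):
        if start > end:
            raise ValueError("start %s must precede end %s" % (start, end))
        self.start = float(start)
        self.end = float(end)

    def intersects(self, other):
        # Same test as the diff: no overlap when one interval ends
        # at or before the other begins, so intervals that merely
        # touch at an endpoint do not intersect.
        return not (self.end <= other.start or self.start >= other.end)

    def subset(self, start, end):
        # Return a new Interval that is a subset of this one.
        if start < self.start or end > self.end:
            raise ValueError("not a subset")
        return Interval(start, end)

a = Interval(0, 10)
b = Interval(10, 20)   # starts exactly where 'a' ends
c = Interval(5, 15)    # genuinely overlaps 'a'
print(a.intersects(b))   # False: endpoints only touch
print(a.intersects(c))   # True
```

Under this test the nilmdb-0.1 side's "half-open [start, end)" docstring is the accurate description; the bxinterval side documents the intervals as closed even though the shared code treats a touching endpoint as non-overlapping.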
@@ -1 +0,0 @@
-rbtree.pxd
@@ -1,5 +1,6 @@
 # cython: profile=False
 
+import tables
 import time
 import sys
 import inspect
@@ -121,6 +122,15 @@ class Layout:
             s += " %d" % d[i+1]
         return s + "\n"
 
+    # PyTables description
+    def description(self):
+        """Return the PyTables description of this layout"""
+        desc = {}
+        desc['timestamp'] = tables.Col.from_type('float64', pos=0)
+        for n in range(self.count):
+            desc['c' + str(n+1)] = tables.Col.from_type(self.datatype, pos=n+1)
+        return tables.Description(desc)
+
 # Get a layout by name
 def get_named(typestring):
     try:
328
nilmdb/nilmdb.py
328
nilmdb/nilmdb.py
@@ -4,16 +4,17 @@

 Object that represents a NILM database file.

-Manages both the SQL database and the table storage backend.
+Manages both the SQL database and the PyTables storage backend.
 """

 # Need absolute_import so that "import nilmdb" won't pull in nilmdb.py,
 # but will pull the nilmdb module instead.
 from __future__ import absolute_import
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *

 import sqlite3
+import tables
 import time
 import sys
 import os
@@ -24,8 +25,6 @@ import pyximport
 pyximport.install()
 from nilmdb.interval import Interval, DBInterval, IntervalSet, IntervalError

-from . import bulkdata
-
 # Note about performance and transactions:
 #
 # Committing a transaction in the default sync mode (PRAGMA synchronous=FULL)
@@ -80,7 +79,7 @@ _sql_schema_updates = {
 class NilmDBError(Exception):
     """Base exception for NilmDB errors"""
     def __init__(self, message = "Unspecified error"):
-        Exception.__init__(self, message)
+        Exception.__init__(self, self.__class__.__name__ + ": " + message)

 class StreamError(NilmDBError):
     pass
@@ -88,14 +87,19 @@ class StreamError(NilmDBError):
 class OverlapError(NilmDBError):
     pass

-@nilmdb.utils.must_close()
+# Helper that lets us pass a Pytables table into bisect
+class BisectableTable(object):
+    def __init__(self, table):
+        self.table = table
+    def __getitem__(self, index):
+        return self.table[index][0]
+
 class NilmDB(object):
     verbose = 0

-    def __init__(self, basepath, sync=True, max_results=None,
-                 bulkdata_args={}):
+    def __init__(self, basepath, sync=True, max_results=None):
         # set up path
-        self.basepath = os.path.abspath(basepath)
+        self.basepath = os.path.abspath(basepath.rstrip('/'))

         # Create the database path if it doesn't exist
         try:
@@ -104,16 +108,16 @@ class NilmDB(object):
             if e.errno != errno.EEXIST:
                 raise IOError("can't create tree " + self.basepath)

-        # Our data goes inside it
-        self.data = bulkdata.BulkData(self.basepath, **bulkdata_args)
+        # Our HD5 file goes inside it
+        h5filename = os.path.abspath(self.basepath + "/data.h5")
+        self.h5file = tables.openFile(h5filename, "a", "NILM Database")

         # SQLite database too
-        sqlfilename = os.path.join(self.basepath, "data.sql")
+        sqlfilename = os.path.abspath(self.basepath + "/data.sql")
         # We use check_same_thread = False, assuming that the rest
         # of the code (e.g. Server) will be smart and not access this
-        # database from multiple threads simultaneously.  Otherwise
-        # false positives will occur when the database is only opened
-        # in one thread, and only accessed in another.
+        # database from multiple threads simultaneously.  That requirement
+        # may be relaxed later.
         self.con = sqlite3.connect(sqlfilename, check_same_thread = False)
         self._sql_schema_update()

@@ -130,6 +134,17 @@ class NilmDB(object):
         else:
             self.max_results = 16384

+        self.opened = True
+
+        # Cached intervals
+        self._cached_iset = {}
+
+    def __del__(self):
+        if "opened" in self.__dict__: # pragma: no cover
+            fprintf(sys.stderr,
+                    "error: NilmDB.close() wasn't called, path %s",
+                    self.basepath)
+
     def get_basepath(self):
         return self.basepath

@@ -137,7 +152,8 @@ class NilmDB(object):
         if self.con:
             self.con.commit()
             self.con.close()
-        self.data.close()
+        self.h5file.close()
+        del self.opened

     def _sql_schema_update(self):
         cur = self.con.cursor()
@@ -154,11 +170,12 @@ class NilmDB(object):
         with self.con:
             cur.execute("PRAGMA user_version = {v:d}".format(v=version))

-    @nilmdb.utils.lru_cache(size = 16)
     def _get_intervals(self, stream_id):
         """
         Return a mutable IntervalSet corresponding to the given stream ID.
         """
+        # Load from database if not cached
+        if stream_id not in self._cached_iset:
             iset = IntervalSet()
             result = self.con.execute("SELECT start_time, end_time, "
                                       "start_pos, end_pos "
@@ -171,112 +188,42 @@ class NilmDB(object):
                                       start_pos, end_pos)
             except IntervalError as e: # pragma: no cover
                 raise NilmDBError("unexpected overlap in ranges table!")
-        return iset
+            self._cached_iset[stream_id] = iset
+
+        # Return cached value
+        return self._cached_iset[stream_id]

-    def _sql_interval_insert(self, id, start, end, start_pos, end_pos):
-        """Helper that adds interval to the SQL database only"""
-        self.con.execute("INSERT INTO ranges "
-                         "(stream_id,start_time,end_time,start_pos,end_pos) "
-                         "VALUES (?,?,?,?,?)",
-                         (id, start, end, start_pos, end_pos))
-
-    def _sql_interval_delete(self, id, start, end, start_pos, end_pos):
-        """Helper that removes interval from the SQL database only"""
-        self.con.execute("DELETE FROM ranges WHERE "
-                         "stream_id=? AND start_time=? AND "
-                         "end_time=? AND start_pos=? AND end_pos=?",
-                         (id, start, end, start_pos, end_pos))
+    # TODO: Split add_interval into two pieces, one to add
+    # and one to flush to disk?
+    # Need to think about this.  Basic problem is that we can't
+    # mess with intervals once they're in the IntervalSet,
+    # without mucking with bxinterval internals.
+
+    # Maybe add a separate optimization step?
+    # Join intervals that have a fairly small gap between them

     def _add_interval(self, stream_id, interval, start_pos, end_pos):
         """
         Add interval to the internal interval cache, and to the database.
         Note: arguments must be ints (not numpy.int64, etc)
         """
-        # Load this stream's intervals
+        # Ensure this stream's intervals are cached, and add the new
+        # interval to that cache.
         iset = self._get_intervals(stream_id)
-
-        # Check for overlap
-        if iset.intersects(interval): # pragma: no cover (gets caught earlier)
+        try:
+            iset += DBInterval(interval.start, interval.end,
+                               interval.start, interval.end,
+                               start_pos, end_pos)
+        except IntervalError as e: # pragma: no cover
             raise NilmDBError("new interval overlaps existing data")

-        # Check for adjacency.  If there's a stream in the database
-        # that ends exactly when this one starts, and the database
-        # rows match up, we can make one interval that covers the
-        # time range [adjacent.start -> interval.end)
-        # and database rows [ adjacent.start_pos -> end_pos ].
-        # Only do this if the resulting interval isn't too large.
-        max_merged_rows = 8000 * 60 * 60 * 1.05 # 1.05 hours at 8 KHz
-        adjacent = iset.find_end(interval.start)
-        if (adjacent is not None and
-            start_pos == adjacent.db_endpos and
-            (end_pos - adjacent.db_startpos) < max_merged_rows):
-            # First delete the old one, both from our iset and the
-            # database
-            iset -= adjacent
-            self._sql_interval_delete(stream_id,
-                                      adjacent.db_start, adjacent.db_end,
-                                      adjacent.db_startpos, adjacent.db_endpos)
-
-            # Now update our interval so the fallthrough add is
-            # correct.
-            interval.start = adjacent.start
-            start_pos = adjacent.db_startpos
-
-        # Add the new interval to the iset
-        iset.iadd_nocheck(DBInterval(interval.start, interval.end,
-                                     interval.start, interval.end,
-                                     start_pos, end_pos))
-
         # Insert into the database
-        self._sql_interval_insert(stream_id, interval.start, interval.end,
-                                  int(start_pos), int(end_pos))
+        self.con.execute("INSERT INTO ranges "
+                         "(stream_id,start_time,end_time,start_pos,end_pos) "
+                         "VALUES (?,?,?,?,?)",
+                         (stream_id, interval.start, interval.end,
+                          int(start_pos), int(end_pos)))
         self.con.commit()

-    def _remove_interval(self, stream_id, original, remove):
-        """
-        Remove an interval from the internal cache and the database.
-
-        stream_id: id of stream
-        original: original DBInterval; must be already present in DB
-        to_remove: DBInterval to remove; must be subset of 'original'
-        """
-        # Just return if we have nothing to remove
-        if remove.start == remove.end: # pragma: no cover
-            return
-
-        # Load this stream's intervals
-        iset = self._get_intervals(stream_id)
-
-        # Remove existing interval from the cached set and the database
-        iset -= original
-        self._sql_interval_delete(stream_id,
-                                  original.db_start, original.db_end,
-                                  original.db_startpos, original.db_endpos)
-
-        # Add back the intervals that would be left over if the
-        # requested interval is removed.  There may be two of them, if
-        # the removed piece was in the middle.
-        def add(iset, start, end, start_pos, end_pos):
-            iset += DBInterval(start, end, start, end, start_pos, end_pos)
-            self._sql_interval_insert(stream_id, start, end, start_pos, end_pos)
-
-        if original.start != remove.start:
-            # Interval before the removed region
-            add(iset, original.start, remove.start,
-                original.db_startpos, remove.db_startpos)
-
-        if original.end != remove.end:
-            # Interval after the removed region
-            add(iset, remove.end, original.end,
-                remove.db_endpos, original.db_endpos)
-
-        # Commit SQL changes
-        self.con.commit()
-
-        return
-
     def stream_list(self, path = None, layout = None):
         """Return list of [path, layout] lists of all streams
         in the database.
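The removed `_remove_interval` helper above re-adds the pieces left over after deleting a time range from the middle of an interval: up to two remainders, one before and one after the removed span. That splitting rule can be sketched on its own (hypothetical `split_interval`, illustrative tuples rather than real DBInterval objects):

```python
# Splitting rule used when removing [remove_start, remove_end) from
# the half-open interval [start, end); the removed span is assumed to
# lie entirely within the original interval.

def split_interval(start, end, remove_start, remove_end):
    """Return the (start, end) pieces remaining after the removal."""
    pieces = []
    if start != remove_start:
        pieces.append((start, remove_start))  # piece before the removed region
    if end != remove_end:
        pieces.append((remove_end, end))      # piece after the removed region
    return pieces

print(split_interval(0, 10, 3, 7))  # removal from the middle leaves two pieces
```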
@@ -338,11 +285,38 @@ class NilmDB(object):

         layout_name: string for nilmdb.layout.get_named(), e.g. 'float32_8'
         """
-        # Create the bulk storage.  Raises ValueError on error, which we
-        # pass along.
-        self.data.create(path, layout_name)
+        if path[0] != '/':
+            raise ValueError("paths must start with /")
+        [ group, node ] = path.rsplit("/", 1)
+        if group == '':
+            raise ValueError("invalid path")

-        # Insert into SQL database once the bulk storage is happy
+        # Make the group structure, one element at a time
+        group_path = group.lstrip('/').split("/")
+        for i in range(len(group_path)):
+            parent = "/" + "/".join(group_path[0:i])
+            child = group_path[i]
+            try:
+                self.h5file.createGroup(parent, child)
+            except tables.NodeError:
+                pass
+
+        # Get description
+        try:
+            desc = nilmdb.layout.get_named(layout_name).description()
+        except KeyError:
+            raise ValueError("no such layout")
+
+        # Estimated table size (for PyTables optimization purposes): assume
+        # 3 months worth of data at 8 KHz.  It's OK if this is wrong.
+        exp_rows = 8000 * 60*60*24*30*3
+
+        # Create the table
+        table = self.h5file.createTable(group, node,
+                                        description = desc,
+                                        expectedrows = exp_rows)
+
+        # Insert into SQL database once the PyTables is happy
         with self.con as con:
             con.execute("INSERT INTO streams (path, layout) VALUES (?,?)",
                         (path, layout_name))
@@ -363,7 +337,8 @@ class NilmDB(object):
         """
         stream_id = self._stream_id(path)
         with self.con as con:
-            con.execute("DELETE FROM metadata WHERE stream_id=?", (stream_id,))
+            con.execute("DELETE FROM metadata "
+                        "WHERE stream_id=?", (stream_id,))
             for key in data:
                 if data[key] != '':
                     con.execute("INSERT INTO metadata VALUES (?, ?, ?)",
@@ -386,52 +361,49 @@ class NilmDB(object):
             data.update(newdata)
         self.stream_set_metadata(path, data)

-    def stream_destroy(self, path):
-        """Fully remove a table and all of its data from the database.
-        No way to undo it!  Metadata is removed."""
-        stream_id = self._stream_id(path)
-
-        # Delete the cached interval data (if it was cached)
-        self._get_intervals.cache_remove(self, stream_id)
-
-        # Delete the data
-        self.data.destroy(path)
-
-        # Delete metadata, stream, intervals
-        with self.con as con:
-            con.execute("DELETE FROM metadata WHERE stream_id=?", (stream_id,))
-            con.execute("DELETE FROM ranges WHERE stream_id=?", (stream_id,))
-            con.execute("DELETE FROM streams WHERE id=?", (stream_id,))
-
-    def stream_insert(self, path, start, end, data):
+    def stream_insert(self, path, parser, old_timestamp = None):
         """Insert new data into the database.
         path: Path at which to add the data
-        start: Starting timestamp
-        end: Ending timestamp
-        data: Rows of data, to be passed to PyTable's table.append
-        method.  E.g. nilmdb.layout.Parser.data
+        parser: nilmdb.layout.Parser instance full of data to insert
         """
+        if (not parser.min_timestamp or not parser.max_timestamp or
+            not len(parser.data)):
+            raise StreamError("no data provided")
+
+        # If we were provided with an old timestamp, the expectation
+        # is that the client has a contiguous block of time it is sending,
+        # but it's doing it over multiple calls to stream_insert.
+        # old_timestamp is the max_timestamp of the previous insert.
+        # To make things continuous, use that as our starting timestamp
+        # instead of what the parser found.
+        if old_timestamp:
+            min_timestamp = old_timestamp
+        else:
+            min_timestamp = parser.min_timestamp
+
         # First check for basic overlap using timestamp info given.
         stream_id = self._stream_id(path)
         iset = self._get_intervals(stream_id)
-        interval = Interval(start, end)
+        interval = Interval(min_timestamp, parser.max_timestamp)
         if iset.intersects(interval):
-            raise OverlapError("new data overlaps existing data at range: "
+            raise OverlapError("new data overlaps existing data: "
                                + str(iset & interval))

-        # Insert the data
-        table = self.data.getnode(path)
+        # Insert the data into pytables
+        table = self.h5file.getNode(path)
         row_start = table.nrows
-        table.append(data)
+        table.append(parser.data)
         row_end = table.nrows
+        table.flush()

         # Insert the record into the sql database.
-        self._add_interval(stream_id, interval, row_start, row_end)
+        # Casts are to convert from numpy.int64.
+        self._add_interval(stream_id, interval, int(row_start), int(row_end))

         # And that's all
         return "ok"

-    def _find_start(self, table, dbinterval):
+    def _find_start(self, table, interval):
         """
         Given a DBInterval, find the row in the database that
         corresponds to the start time.  Return the first database
@@ -439,14 +411,14 @@ class NilmDB(object):
         equal to 'start'.
         """
         # Optimization for the common case where an interval wasn't truncated
-        if dbinterval.start == dbinterval.db_start:
-            return dbinterval.db_startpos
-        return bisect.bisect_left(bulkdata.TimestampOnlyTable(table),
-                                  dbinterval.start,
-                                  dbinterval.db_startpos,
-                                  dbinterval.db_endpos)
+        if interval.start == interval.db_start:
+            return interval.db_startpos
+        return bisect.bisect_left(BisectableTable(table),
+                                  interval.start,
+                                  interval.db_startpos,
+                                  interval.db_endpos)

-    def _find_end(self, table, dbinterval):
+    def _find_end(self, table, interval):
         """
         Given a DBInterval, find the row in the database that follows
         the end time.  Return the first database position after the
@@ -454,16 +426,16 @@ class NilmDB(object):
         to 'end'.
         """
         # Optimization for the common case where an interval wasn't truncated
-        if dbinterval.end == dbinterval.db_end:
-            return dbinterval.db_endpos
+        if interval.end == interval.db_end:
+            return interval.db_endpos
         # Note that we still use bisect_left here, because we don't
         # want to include the given timestamp in the results.  This is
         # so a queries like 1:00 -> 2:00 and 2:00 -> 3:00 return
         # non-overlapping data.
-        return bisect.bisect_left(bulkdata.TimestampOnlyTable(table),
-                                  dbinterval.end,
-                                  dbinterval.db_startpos,
-                                  dbinterval.db_endpos)
+        return bisect.bisect_left(BisectableTable(table),
+                                  interval.end,
+                                  interval.db_startpos,
+                                  interval.db_endpos)

     def stream_extract(self, path, start = None, end = None, count = False):
         """
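The `BisectableTable` wrapper used by `_find_start` and `_find_end` above works because `bisect.bisect_left` only needs `__getitem__` when explicit `lo`/`hi` bounds are supplied, so a thin view exposing just each row's timestamp lets the table be binary-searched in place. A self-contained sketch of the same trick, with plain tuples standing in for table rows and a hypothetical `TimestampView` name:

```python
import bisect

class TimestampView(object):
    """Expose only column 0 (the timestamp) of each row, so that
    bisect can binary-search the rows by time without copying them."""
    def __init__(self, table):
        self.table = table
    def __getitem__(self, index):
        return self.table[index][0]

rows = [(0.0, 'a'), (1.0, 'b'), (2.0, 'c'), (3.0, 'd')]

# Index of the first row whose timestamp is >= 1.5, searching rows 0..4:
pos = bisect.bisect_left(TimestampView(rows), 1.5, 0, len(rows))
print(pos)  # 2
```

Passing `lo` and `hi` explicitly (as the diff does with `db_startpos`/`db_endpos`) also restricts the search to the rows belonging to one interval.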
@@ -484,8 +456,8 @@ class NilmDB(object):
         than actually fetching the data.  It is not limited by
         max_results.
         """
+        table = self.h5file.getNode(path)
         stream_id = self._stream_id(path)
-        table = self.data.getnode(path)
         intervals = self._get_intervals(stream_id)
         requested = Interval(start or 0, end or 1e12)
         result = []
@@ -522,45 +494,3 @@ class NilmDB(object):
         if count:
             return matched
         return (result, restart)
-
-    def stream_remove(self, path, start = None, end = None):
-        """
-        Remove data from the specified time interval within a stream.
-        Removes all data in the interval [start, end), and intervals
-        are truncated or split appropriately.  Returns the number of
-        data points removed.
-        """
-        stream_id = self._stream_id(path)
-        table = self.data.getnode(path)
-        intervals = self._get_intervals(stream_id)
-        to_remove = Interval(start or 0, end or 1e12)
-        removed = 0
-
-        if start == end:
-            return 0
-
-        # Can't remove intervals from within the iterator, so we need to
-        # remember what's currently in the intersection now.
-        all_candidates = list(intervals.intersection(to_remove, orig = True))
-
-        for (dbint, orig) in all_candidates:
-            # Find row start and end
-            row_start = self._find_start(table, dbint)
-            row_end = self._find_end(table, dbint)
-
-            # Adjust the DBInterval to match the newly found ends
-            dbint.db_start = dbint.start
-            dbint.db_end = dbint.end
-            dbint.db_startpos = row_start
-            dbint.db_endpos = row_end
-
-            # Remove interval from the database
-            self._remove_interval(stream_id, orig, dbint)
-
-            # Remove data from the underlying table storage
-            table.remove(row_start, row_end)
-
-            # Count how many were removed
-            removed += row_end - row_start
-
-        return removed
@@ -1,23 +0,0 @@
-cdef class RBNode:
-    cdef public object obj
-    cdef public double start, end
-    cdef public int red
-    cdef public RBNode left, right, parent
-
-cdef class RBTree:
-    cdef public RBNode nil, root
-
-    cpdef getroot(RBTree self)
-    cdef void __rotate_left(RBTree self, RBNode x)
-    cdef void __rotate_right(RBTree self, RBNode y)
-    cdef RBNode __successor(RBTree self, RBNode x)
-    cpdef RBNode successor(RBTree self, RBNode x)
-    cdef RBNode __predecessor(RBTree self, RBNode x)
-    cpdef RBNode predecessor(RBTree self, RBNode x)
-    cpdef insert(RBTree self, RBNode z)
-    cdef void __insert_fixup(RBTree self, RBNode x)
-    cpdef delete(RBTree self, RBNode z)
-    cdef inline void __delete_fixup(RBTree self, RBNode x)
-    cpdef RBNode find(RBTree self, double start, double end)
-    cpdef RBNode find_left_end(RBTree self, double t)
-    cpdef RBNode find_right_start(RBTree self, double t)
@@ -1,377 +0,0 @@
-# cython: profile=False
-# cython: cdivision=True
-
-"""
-Jim Paris <jim@jtan.com>
-
-Red-black tree, where keys are stored as start/end timestamps.
-This is a basic interval tree that holds half-open intervals:
-  [start, end)
-Intervals must not overlap.  Fixing that would involve making this
-into an augmented interval tree as described in CLRS 14.3.
-
-Code that assumes non-overlapping intervals is marked with the
-string 'non-overlapping'.
-"""
-
-import sys
-cimport rbtree
-
-cdef class RBNode:
-    """One node of the Red/Black tree, containing a key (start, end)
-    and value (obj)"""
-    def __init__(self, double start, double end, object obj = None):
-        self.obj = obj
-        self.start = start
-        self.end = end
-        self.red = False
-        self.left = None
-        self.right = None
-
-    def __str__(self):
-        if self.red:
-            color = "R"
-        else:
-            color = "B"
-        if self.start == sys.float_info.min:
-            return "[node nil]"
-        return ("[node ("
-                + str(self.obj) + ") "
-                + str(self.start) + " -> " + str(self.end) + " "
-                + color + "]")
-
-cdef class RBTree:
-    """Red/Black tree"""
-
-    # Init
-    def __init__(self):
-        self.nil = RBNode(start = sys.float_info.min,
-                          end = sys.float_info.min)
-        self.nil.left = self.nil
-        self.nil.right = self.nil
-        self.nil.parent = self.nil
-
-        self.root = RBNode(start = sys.float_info.max,
-                           end = sys.float_info.max)
-        self.root.left = self.nil
-        self.root.right = self.nil
-        self.root.parent = self.nil
-
-    # We have a dummy root node to simplify operations, so from an
-    # external point of view, its left child is the real root.
-    cpdef getroot(self):
-        return self.root.left
-
-    # Rotations and basic operations
-    cdef void __rotate_left(self, RBNode x):
-        """Rotate left:
-        #   x            y
-        #  / \   -->    / \
-        # z   y        x   w
-        #    / \      / \
-        #   v   w    z   v
-        """
-        cdef RBNode y = x.right
-        x.right = y.left
-        if y.left is not self.nil:
-            y.left.parent = x
-        y.parent = x.parent
-        if x is x.parent.left:
-            x.parent.left = y
-        else:
-            x.parent.right = y
-        y.left = x
-        x.parent = y
-
-    cdef void __rotate_right(self, RBNode y):
-        """Rotate right:
-        #      y        x
-        #     / \  -->comm / \
-        #    x   w     z   y
-        #   / \           / \
-        #  z   v         v   w
-        """
-        cdef RBNode x = y.left
-        y.left = x.right
-        if x.right is not self.nil:
-            x.right.parent = y
-        x.parent = y.parent
-        if y is y.parent.left:
-            y.parent.left = x
-        else:
-            y.parent.right = x
-        x.right = y
-        y.parent = x
-
-    cdef RBNode __successor(self, RBNode x):
-        """Returns the successor of RBNode x"""
-        cdef RBNode y = x.right
-        if y is not self.nil:
-            while y.left is not self.nil:
-                y = y.left
-        else:
-            y = x.parent
-            while x is y.right:
-                x = y
-                y = y.parent
-            if y is self.root:
-                return self.nil
-        return y
-
-    cpdef RBNode successor(self, RBNode x):
-        """Returns the successor of RBNode x, or None"""
-        cdef RBNode y = self.__successor(x)
-        return y if y is not self.nil else None
-
-    cdef RBNode __predecessor(self, RBNode x):
-        """Returns the predecessor of RBNode x"""
-        cdef RBNode y = x.left
-        if y is not self.nil:
-            while y.right is not self.nil:
-                y = y.right
-        else:
-            y = x.parent
-            while x is y.left:
-                if y is self.root:
-                    y = self.nil
-                    break
-                x = y
-                y = y.parent
-        return y
-
-    cpdef RBNode predecessor(self, RBNode x):
-        """Returns the predecessor of RBNode x, or None"""
-        cdef RBNode y = self.__predecessor(x)
-        return y if y is not self.nil else None
-
-    # Insertion
-    cpdef insert(self, RBNode z):
-        """Insert RBNode z into RBTree and rebalance as necessary"""
-        z.left = self.nil
-        z.right = self.nil
-        cdef RBNode y = self.root
-        cdef RBNode x = self.root.left
-        while x is not self.nil:
-            y = x
-            if (x.start > z.start or (x.start == z.start and x.end > z.end)):
-                x = x.left
-            else:
-                x = x.right
-        z.parent = y
-        if (y is self.root or
-            (y.start > z.start or (y.start == z.start and y.end > z.end))):
-            y.left = z
-        else:
-            y.right = z
-        # relabel/rebalance
-        self.__insert_fixup(z)
-
-    cdef void __insert_fixup(self, RBNode x):
-        """Rebalance/fix RBTree after a simple insertion of RBNode x"""
-        x.red = True
-        while x.parent.red:
-            if x.parent is x.parent.parent.left:
-                y = x.parent.parent.right
-                if y.red:
-                    x.parent.red = False
-                    y.red = False
-                    x.parent.parent.red = True
-                    x = x.parent.parent
-                else:
-                    if x is x.parent.right:
-                        x = x.parent
-                        self.__rotate_left(x)
-                    x.parent.red = False
-                    x.parent.parent.red = True
-                    self.__rotate_right(x.parent.parent)
-            else: # same as above, left/right switched
-                y = x.parent.parent.left
-                if y.red:
-                    x.parent.red = False
-                    y.red = False
-                    x.parent.parent.red = True
-                    x = x.parent.parent
-                else:
-                    if x is x.parent.left:
-                        x = x.parent
-                        self.__rotate_right(x)
-                    x.parent.red = False
-                    x.parent.parent.red = True
-                    self.__rotate_left(x.parent.parent)
-        self.root.left.red = False
-
-    # Deletion
-    cpdef delete(self, RBNode z):
-        if z.left is None or z.right is None:
-            raise AttributeError("you can only delete a node object "
-                                 + "from the tree; use find() to get one")
-        cdef RBNode x, y
-        if z.left is self.nil or z.right is self.nil:
-            y = z
-        else:
-            y = self.__successor(z)
-        if y.left is self.nil:
-            x = y.right
-        else:
-            x = y.left
-        x.parent = y.parent
-        if x.parent is self.root:
-            self.root.left = x
-        else:
-            if y is y.parent.left:
-                y.parent.left = x
-            else:
-                y.parent.right = x
-        if y is not z:
-            # y is the node to splice out, x is its child
-            y.left = z.left
-            y.right = z.right
-            y.parent = z.parent
-            z.left.parent = y
-            z.right.parent = y
-            if z is z.parent.left:
-                z.parent.left = y
-            else:
-                z.parent.right = y
-            if not y.red:
-                y.red = z.red
-                self.__delete_fixup(x)
-            else:
-                y.red = z.red
-        else:
-            if not y.red:
-                self.__delete_fixup(x)
-
-    cdef void __delete_fixup(self, RBNode x):
-        """Rebalance/fix RBTree after a deletion.  RBNode x is the
-        child of the spliced out node."""
-        cdef RBNode rootLeft = self.root.left
-        while not x.red and x is not rootLeft:
-            if x is x.parent.left:
-                w = x.parent.right
-                if w.red:
-                    w.red = False
-                    x.parent.red = True
|
|
||||||
self.__rotate_right(x.parent)
|
|
||||||
w = x.parent.left
|
|
||||||
if not w.left.red and not w.right.red:
|
|
||||||
w.red = True
|
|
||||||
x = x.parent
|
|
||||||
else:
|
|
||||||
if not w.left.red:
|
|
||||||
w.right.red = False
|
|
||||||
w.red = True
|
|
||||||
self.__rotate_left(w)
|
|
||||||
w = x.parent.left
|
|
||||||
w.red = x.parent.red
|
|
||||||
x.parent.red = False
|
|
||||||
w.left.red = False
|
|
||||||
self.__rotate_right(x.parent)
|
|
||||||
x = rootLeft # exit loop
|
|
||||||
x.red = False
|
|
||||||
|
|
||||||
# Walking, searching
|
|
||||||
def __iter__(self):
|
|
||||||
return self.inorder()
|
|
||||||
|
|
||||||
def inorder(self, RBNode x = None):
|
|
||||||
"""Generator that performs an inorder walk for the tree
|
|
||||||
rooted at RBNode x"""
|
|
||||||
if x is None:
|
|
||||||
x = self.getroot()
|
|
||||||
while x.left is not self.nil:
|
|
||||||
x = x.left
|
|
||||||
while x is not self.nil:
|
|
||||||
yield x
|
|
||||||
x = self.__successor(x)
|
|
||||||
|
|
||||||
cpdef RBNode find(self, double start, double end):
|
|
||||||
"""Return the node with exactly the given start and end."""
|
|
||||||
cdef RBNode x = self.getroot()
|
|
||||||
while x is not self.nil:
|
|
||||||
if start < x.start:
|
|
||||||
x = x.left
|
|
||||||
elif start == x.start:
|
|
||||||
if end == x.end:
|
|
||||||
break # found it
|
|
||||||
elif end < x.end:
|
|
||||||
x = x.left
|
|
||||||
else:
|
|
||||||
x = x.right
|
|
||||||
else:
|
|
||||||
x = x.right
|
|
||||||
return x if x is not self.nil else None
|
|
||||||
|
|
||||||
cpdef RBNode find_left_end(self, double t):
|
|
||||||
"""Find the leftmode node with end >= t. With non-overlapping
|
|
||||||
intervals, this is the first node that might overlap time t.
|
|
||||||
|
|
||||||
Note that this relies on non-overlapping intervals, since
|
|
||||||
it assumes that we can use the endpoints to traverse the
|
|
||||||
tree even though it was created using the start points."""
|
|
||||||
cdef RBNode x = self.getroot()
|
|
||||||
while x is not self.nil:
|
|
||||||
if t < x.end:
|
|
||||||
if x.left is self.nil:
|
|
||||||
break
|
|
||||||
x = x.left
|
|
||||||
elif t == x.end:
|
|
||||||
break
|
|
||||||
else:
|
|
||||||
if x.right is self.nil:
|
|
||||||
x = self.__successor(x)
|
|
||||||
break
|
|
||||||
x = x.right
|
|
||||||
return x if x is not self.nil else None
|
|
||||||
|
|
||||||
cpdef RBNode find_right_start(self, double t):
|
|
||||||
"""Find the rightmode node with start <= t. With non-overlapping
|
|
||||||
intervals, this is the last node that might overlap time t."""
|
|
||||||
cdef RBNode x = self.getroot()
|
|
||||||
while x is not self.nil:
|
|
||||||
if t < x.start:
|
|
||||||
if x.left is self.nil:
|
|
||||||
x = self.__predecessor(x)
|
|
||||||
break
|
|
||||||
x = x.left
|
|
||||||
elif t == x.start:
|
|
||||||
break
|
|
||||||
else:
|
|
||||||
if x.right is self.nil:
|
|
||||||
break
|
|
||||||
x = x.right
|
|
||||||
return x if x is not self.nil else None
|
|
||||||
|
|
||||||
# Intersections
|
|
||||||
def intersect(self, double start, double end):
|
|
||||||
"""Generator that returns nodes that overlap the given
|
|
||||||
(start,end) range. Assumes non-overlapping intervals."""
|
|
||||||
# Start with the leftmode node that ends after start
|
|
||||||
cdef RBNode n = self.find_left_end(start)
|
|
||||||
while n is not None:
|
|
||||||
if n.start >= end:
|
|
||||||
# this node starts after the requested end; we're done
|
|
||||||
break
|
|
||||||
if start < n.end:
|
|
||||||
# this node overlaps our requested area
|
|
||||||
yield n
|
|
||||||
n = self.successor(n)
|
|
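The `intersect` walk above can be mirrored in pure Python over a sorted list of non-overlapping `(start, end)` tuples; a minimal sketch (the function name and list representation are illustrative, not part of nilmdb):

```python
def intersect(intervals, start, end):
    """Yield intervals overlapping [start, end).

    Mirrors RBTree.intersect: 'intervals' must be sorted by start and
    non-overlapping, like the tree's inorder walk."""
    for (s, e) in intervals:
        if s >= end:
            break          # this interval starts after the requested end
        if start < e:
            yield (s, e)   # overlaps the requested range
```

For example, `list(intersect([(0, 10), (10, 20), (30, 40)], 5, 15))` yields `[(0, 10), (10, 20)]`; note that an interval ending exactly at `start` is excluded, matching the half-open `[start, end)` convention.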
@@ -1 +0,0 @@
-rbtree.pxd
@@ -67,6 +67,3 @@ class WrapObject(object):
     def __del__(self):
         self.__wrap_call_queue.put((None, None, None, None))
         self.__wrap_serializer.join()
-
-# Just an alias
-Serializer = WrapObject
nilmdb/server.py
@@ -3,18 +3,15 @@
 # Need absolute_import so that "import nilmdb" won't pull in nilmdb.py,
 # but will pull the nilmdb module instead.
 from __future__ import absolute_import
-from nilmdb.utils.printf import *
 import nilmdb
+from nilmdb.printf import *

 import cherrypy
 import sys
 import time
 import os
 import simplejson as json
-import decorator
-import traceback
-
-from nilmdb.nilmdb import NilmDBError

 try:
     import cherrypy
@@ -27,53 +24,8 @@ class NilmApp(object):
     def __init__(self, db):
         self.db = db

-version = "1.2"
+version = "1.1"

-# Decorators
-def chunked_response(func):
-    """Decorator to enable chunked responses."""
-    # Set this to False to get better tracebacks from some requests
-    # (/stream/extract, /stream/intervals).
-    func._cp_config = { 'response.stream': True }
-    return func
-
-@decorator.decorator
-def workaround_cp_bug_1200(func, *args, **kwargs): # pragma: no cover
-    """Decorator to work around CherryPy bug #1200 in a response
-    generator.
-
-    Even if chunked responses are disabled, LookupError or
-    UnicodeError exceptions may still be swallowed by CherryPy due to
-    bug #1200.  This throws them as generic Exceptions instead so that
-    they make it through.
-    """
-    try:
-        for val in func(*args, **kwargs):
-            yield val
-    except (LookupError, UnicodeError) as e:
-        raise Exception("bug workaround; real exception is:\n" +
-                        traceback.format_exc())
-
-def exception_to_httperror(*expected):
-    """Return a decorator-generating function that catches expected
-    errors and throws a HTTPError describing it instead.
-
-        @exception_to_httperror(NilmDBError, ValueError)
-        def foo():
-            pass
-    """
-    def wrapper(func, *args, **kwargs):
-        try:
-            return func(*args, **kwargs)
-        except expected as e:
-            message = sprintf("%s", str(e))
-            raise cherrypy.HTTPError("400 Bad Request", message)
-    # We need to preserve the function's argspecs for CherryPy to
-    # handle argument errors correctly.  Decorator.decorator takes
-    # care of that.
-    return decorator.decorator(wrapper)
-
-# CherryPy apps
 class Root(NilmApp):
     """Root application for NILM database"""
@@ -107,7 +59,7 @@ class Root(NilmApp):
     @cherrypy.expose
     @cherrypy.tools.json_out()
     def dbsize(self):
-        return nilmdb.utils.du(self.db.get_basepath())
+        return nilmdb.du.du(self.db.get_basepath())

 class Stream(NilmApp):
     """Stream-specific operations"""
@@ -126,20 +78,15 @@ class Stream(NilmApp):
     # /stream/create?path=/newton/prep&layout=PrepData
     @cherrypy.expose
     @cherrypy.tools.json_out()
-    @exception_to_httperror(NilmDBError, ValueError)
     def create(self, path, layout):
         """Create a new stream in the database.  Provide path
         and one of the nilmdb.layout.layouts keys.
         """
-        return self.db.stream_create(path, layout)
-
-    # /stream/destroy?path=/newton/prep
-    @cherrypy.expose
-    @cherrypy.tools.json_out()
-    @exception_to_httperror(NilmDBError)
-    def destroy(self, path):
-        """Delete a stream and its associated data."""
-        return self.db.stream_destroy(path)
+        try:
+            return self.db.stream_create(path, layout)
+        except Exception as e:
+            message = sprintf("%s: %s", type(e).__name__, e.message)
+            raise cherrypy.HTTPError("400 Bad Request", message)

     # /stream/get_metadata?path=/newton/prep
     # /stream/get_metadata?path=/newton/prep&key=foo&key=bar
@@ -168,35 +115,49 @@ class Stream(NilmApp):
     # /stream/set_metadata?path=/newton/prep&data=<json>
     @cherrypy.expose
     @cherrypy.tools.json_out()
-    @exception_to_httperror(NilmDBError, LookupError, TypeError)
     def set_metadata(self, path, data):
         """Set metadata for the named stream, replacing any
         existing metadata.  Data should be a json-encoded
         dictionary"""
-        data_dict = json.loads(data)
-        self.db.stream_set_metadata(path, data_dict)
+        try:
+            data_dict = json.loads(data)
+            self.db.stream_set_metadata(path, data_dict)
+        except Exception as e:
+            message = sprintf("%s: %s", type(e).__name__, e.message)
+            raise cherrypy.HTTPError("400 Bad Request", message)
         return "ok"

     # /stream/update_metadata?path=/newton/prep&data=<json>
     @cherrypy.expose
     @cherrypy.tools.json_out()
-    @exception_to_httperror(NilmDBError, LookupError, TypeError)
     def update_metadata(self, path, data):
         """Update metadata for the named stream.  Data
         should be a json-encoded dictionary"""
-        data_dict = json.loads(data)
-        self.db.stream_update_metadata(path, data_dict)
+        try:
+            data_dict = json.loads(data)
+            self.db.stream_update_metadata(path, data_dict)
+        except Exception as e:
+            message = sprintf("%s: %s", type(e).__name__, e.message)
+            raise cherrypy.HTTPError("400 Bad Request", message)
         return "ok"

     # /stream/insert?path=/newton/prep
     @cherrypy.expose
     @cherrypy.tools.json_out()
     #@cherrypy.tools.disable_prb()
-    def insert(self, path, start, end):
+    def insert(self, path, old_timestamp = None):
         """
         Insert new data into the database.  Provide textual data
         (matching the path's layout) as a HTTP PUT.
+
+        old_timestamp is used when making multiple, split-up insertions
+        for a larger contiguous block of data.  The first insert
+        will return the maximum timestamp that it saw, and the second
+        insert should provide this timestamp as an argument.  This is
+        used to extend the previous database interval rather than
+        start a new one.
         """

         # Important that we always read the input before throwing any
         # errors, to keep lengths happy for persistent connections.
         # However, CherryPy 3.2.2 has a bug where this fails for GET
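The old_timestamp protocol described in the insert docstring can be sketched server-side in a few lines; this is an illustrative model only (function and variable names are hypothetical, not nilmdb's actual implementation):

```python
def merge_insert(intervals, start, end, old_timestamp=None):
    """Model of the old_timestamp handshake: if this block begins
    exactly where the previous insert left off, extend the last
    interval instead of starting a new one."""
    if (old_timestamp is not None and intervals
            and intervals[-1][1] == old_timestamp == start):
        # contiguous with the previous insert: extend its interval
        intervals[-1] = (intervals[-1][0], end)
    else:
        # gap (or first insert): start a fresh interval
        intervals.append((start, end))
    return end   # client passes this back as old_timestamp next time

iv = []
ts = merge_insert(iv, 0, 10)        # first chunk
ts = merge_insert(iv, 10, 20, ts)   # contiguous: iv is now [(0, 20)]
```

The client simply threads the returned timestamp into its next request, so a long contiguous upload produces one database interval rather than many.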
@@ -218,60 +179,25 @@ class Stream(NilmApp):
             parser.parse(body)
         except nilmdb.layout.ParserError as e:
             raise cherrypy.HTTPError("400 Bad Request",
-                                     "error parsing input data: " +
+                                     "Error parsing input data: " +
                                      e.message)

-        if (not parser.min_timestamp or not parser.max_timestamp or
-            not len(parser.data)):
-            raise cherrypy.HTTPError("400 Bad Request",
-                                     "no data provided")
-
-        # Check limits
-        start = float(start)
-        end = float(end)
-        if parser.min_timestamp < start:
-            raise cherrypy.HTTPError("400 Bad Request", "Data timestamp " +
-                                     repr(parser.min_timestamp) +
-                                     " < start time " + repr(start))
-        if parser.max_timestamp >= end:
-            raise cherrypy.HTTPError("400 Bad Request", "Data timestamp " +
-                                     repr(parser.max_timestamp) +
-                                     " >= end time " + repr(end))
-
         # Now do the nilmdb insert, passing it the parser full of data.
         try:
-            result = self.db.stream_insert(path, start, end, parser.data)
+            if old_timestamp:
+                old_timestamp = float(old_timestamp)
+            result = self.db.stream_insert(path, parser, old_timestamp)
         except nilmdb.nilmdb.NilmDBError as e:
             raise cherrypy.HTTPError("400 Bad Request", e.message)

-        # Done
-        return "ok"
+        # Return the maximum timestamp that we saw.  The client will
+        # return this back to us as the old_timestamp parameter, if
+        # it has more data to send.
+        return ("ok", parser.max_timestamp)

-    # /stream/remove?path=/newton/prep
-    # /stream/remove?path=/newton/prep&start=1234567890.0&end=1234567899.0
-    @cherrypy.expose
-    @cherrypy.tools.json_out()
-    @exception_to_httperror(NilmDBError)
-    def remove(self, path, start = None, end = None):
-        """
-        Remove data from the backend database.  Removes all data in
-        the interval [start, end).  Returns the number of data points
-        removed.
-        """
-        if start is not None:
-            start = float(start)
-        if end is not None:
-            end = float(end)
-        if start is not None and end is not None:
-            if end < start:
-                raise cherrypy.HTTPError("400 Bad Request",
-                                         "end before start")
-        return self.db.stream_remove(path, start, end)
-
     # /stream/intervals?path=/newton/prep
     # /stream/intervals?path=/newton/prep&start=1234567890.0&end=1234567899.0
     @cherrypy.expose
-    @chunked_response
     def intervals(self, path, start = None, end = None):
         """
         Get intervals from backend database.  Streams the resulting
@@ -293,9 +219,9 @@ class Stream(NilmApp):
         if len(streams) != 1:
             raise cherrypy.HTTPError("404 Not Found", "No such stream")

-        @workaround_cp_bug_1200
         def content(start, end):
-            # Note: disable chunked responses to see tracebacks from here.
+            # Note: disable response.stream below to get better debug info
+            # from tracebacks in this subfunction.
             while True:
                 (intervals, restart) = self.db.stream_intervals(path,start,end)
                 response = ''.join([ json.dumps(i) + "\n" for i in intervals ])
@@ -304,10 +230,10 @@ class Stream(NilmApp):
                 break
             start = restart
         return content(start, end)
+    intervals._cp_config = { 'response.stream': True } # chunked HTTP response

     # /stream/extract?path=/newton/prep&start=1234567890.0&end=1234567899.0
     @cherrypy.expose
-    @chunked_response
     def extract(self, path, start = None, end = None, count = False):
         """
         Extract data from backend database.  Streams the resulting
@@ -337,9 +263,9 @@ class Stream(NilmApp):
         # Get formatter
         formatter = nilmdb.layout.Formatter(layout)

-        @workaround_cp_bug_1200
         def content(start, end, count):
-            # Note: disable chunked responses to see tracebacks from here.
+            # Note: disable response.stream below to get better debug info
+            # from tracebacks in this subfunction.
             if count:
                 matched = self.db.stream_extract(path, start, end, count)
                 yield sprintf("%d\n", matched)
@@ -355,6 +281,8 @@ class Stream(NilmApp):
                 return
             start = restart
         return content(start, end, count)
+    extract._cp_config = { 'response.stream': True } # chunked HTTP response


 class Exiter(object):
     """App that exits the server, for testing"""
@@ -379,7 +307,7 @@ class Server(object):
         # Need to wrap DB object in a serializer because we'll call
         # into it from separate threads.
         self.embedded = embedded
-        self.db = nilmdb.utils.Serializer(db)
+        self.db = nilmdb.serializer.WrapObject(db)
         cherrypy.config.update({
             'server.socket_host': host,
             'server.socket_port': port,
@@ -395,11 +323,6 @@ class Server(object):
         cherrypy.config.update({ 'request.show_tracebacks' : True })
         self.force_traceback = force_traceback

-        # Patch CherryPy error handler to never pad out error messages.
-        # This isn't necessary, but then again, neither is padding the
-        # error messages.
-        cherrypy._cperror._ie_friendly_error_sizes = {}
-
         cherrypy.tree.apps = {}
         cherrypy.tree.mount(Root(self.db, self.version), "/")
         cherrypy.tree.mount(Stream(self.db), "/stream")
@@ -5,7 +5,6 @@
 #     with nilmdb.Timer("flush"):
 #         foo.flush()

-from __future__ import print_function
 import contextlib
 import time

@@ -19,4 +18,4 @@ def Timer(name = None, tosyslog = False):
         import syslog
         syslog.syslog(msg)
     else:
-        print(msg)
+        print msg
@@ -1,7 +1,7 @@
 """File-like objects that add timestamps to the input lines"""

 from __future__ import absolute_import
-from nilmdb.utils.printf import *
+from nilmdb.printf import *

 import time
 import os
@@ -1,11 +0,0 @@
-"""NilmDB utilities"""
-
-from .timer import Timer
-from .iteratorizer import Iteratorizer
-from .serializer import Serializer
-from .lrucache import lru_cache
-from .diskusage import du
-from .mustclose import must_close
-from .urllib import urlencode
-from . import misc
-from . import atomic
@@ -1,26 +0,0 @@
-# Atomic file writing helper.
-
-import os
-
-def replace_file(filename, content):
-    """Attempt to atomically and durably replace the filename with the
-    given contents.  This is intended to be 'pretty good on most
-    OSes', but not necessarily bulletproof."""
-
-    newfilename = filename + ".new"
-
-    # Write to new file, flush it
-    with open(newfilename, "wb") as f:
-        f.write(content)
-        f.flush()
-        os.fsync(f.fileno())
-
-    # Move new file over old one
-    try:
-        os.rename(newfilename, filename)
-    except OSError: # pragma: no cover
-        # Some OSes might not support renaming over an existing file.
-        # This is definitely NOT atomic!
-        os.remove(filename)
-        os.rename(newfilename, filename)
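The write-new/fsync/rename pattern in the deleted helper above is easy to exercise; a self-contained sketch of the same pattern (POSIX rename-over semantics assumed, error handling omitted for brevity):

```python
import os
import tempfile

def replace_file(filename, content):
    # Same pattern as the removed helper: write a sibling file, force it
    # to disk, then rename it over the target in one step.
    newfilename = filename + ".new"
    with open(newfilename, "wb") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())
    os.rename(newfilename, filename)  # atomic on POSIX filesystems

d = tempfile.mkdtemp()
path = os.path.join(d, "state.txt")
replace_file(path, b"one")
replace_file(path, b"two")     # old content replaced in a single step
with open(path, "rb") as f:
    data = f.read()            # b"two"; no ".new" file left behind
```

A reader of `path` never observes a partially written file: it sees either the complete old content or the complete new content.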
@@ -1,77 +0,0 @@
-# Memoize a function's return value with a least-recently-used cache
-# Based on:
-#   http://code.activestate.com/recipes/498245-lru-and-lfu-cache-decorators/
-# with added 'destructor' functionality.
-
-import collections
-import decorator
-import warnings
-
-def lru_cache(size = 10, onremove = None, keys = slice(None)):
-    """Least-recently-used cache decorator.
-
-    @lru_cache(size = 10, onevict = None)
-    def f(...):
-        pass
-
-    Given a function and arguments, memoize its return value.  Up to
-    'size' elements are cached.  'keys' is a slice object that
-    represents which arguments are used as the cache key.
-
-    When evicting a value from the cache, call the function
-    'onremove' with the value that's being evicted.
-
-    Call f.cache_remove(...) to evict the cache entry with the given
-    arguments.  Call f.cache_remove_all() to evict all entries.
-    f.cache_hits and f.cache_misses give statistics.
-    """
-
-    def decorate(func):
-        cache = collections.OrderedDict()  # order: least- to most-recent
-
-        def evict(value):
-            if onremove:
-                onremove(value)
-
-        def wrapper(orig, *args, **kwargs):
-            if kwargs:
-                raise NotImplementedError("kwargs not supported")
-            key = args[keys]
-            try:
-                value = cache.pop(key)
-                orig.cache_hits += 1
-            except KeyError:
-                value = orig(*args)
-                orig.cache_misses += 1
-                if len(cache) >= size:
-                    evict(cache.popitem(0)[1])  # evict LRU cache entry
-            cache[key] = value                  # (re-)insert this key at end
-            return value
-
-        def cache_remove(*args):
-            """Remove the described key from this cache, if present."""
-            key = args
-            if key in cache:
-                evict(cache.pop(key))
-            else:
-                if len(cache) > 0 and len(args) != len(cache.iterkeys().next()):
-                    raise KeyError("trying to remove from LRU cache, but "
-                                   "number of arguments doesn't match the "
-                                   "cache key length")
-
-        def cache_remove_all():
-            for key in cache:
-                evict(cache.pop(key))
-
-        def cache_info():
-            return (func.cache_hits, func.cache_misses)
-
-        new = decorator.decorator(wrapper, func)
-        func.cache_hits = 0
-        func.cache_misses = 0
-        new.cache_info = cache_info
-        new.cache_remove = cache_remove
-        new.cache_remove_all = cache_remove_all
-        return new
-
-    return decorate
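For comparison with the hand-rolled decorator deleted above, the Python standard library's `functools.lru_cache` provides the same memoize-with-bounded-LRU behavior and hit/miss statistics (though not the `onremove` destructor hook or `keys` slicing):

```python
import functools

@functools.lru_cache(maxsize=10)
def square(x):
    return x * x

square(3)                    # miss: computed and cached
square(3)                    # hit: served from the cache
info = square.cache_info()   # CacheInfo(hits=1, misses=1, ...)
```

`square.cache_clear()` corresponds roughly to the deleted `cache_remove_all()`.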
@@ -1,8 +0,0 @@
-import itertools
-
-def pairwise(iterable):
-    "s -> (s0,s1), (s1,s2), ..., (sn,None)"
-    a, b = itertools.tee(iterable)
-    next(b, None)
-    return itertools.izip_longest(a, b)
-
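The deleted `pairwise` helper is Python 2 (`izip_longest`); the same idea in Python 3 spelling, for reference:

```python
import itertools

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), ..., (sn,None)"
    a, b = itertools.tee(iterable)   # two independent iterators
    next(b, None)                    # advance the second by one
    return itertools.zip_longest(a, b)

list(pairwise([1, 2, 3]))  # [(1, 2), (2, 3), (3, None)]
```

The trailing `(sn, None)` pair (from `zip_longest` rather than plain `zip`) is what distinguishes this from the usual sliding-window recipe.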
@@ -1,63 +0,0 @@
-from nilmdb.utils.printf import *
-
-import sys
-import inspect
-import decorator
-
-def must_close(errorfile = sys.stderr, wrap_verify = False):
-    """Class decorator that warns on 'errorfile' at deletion time if
-    the class's close() member wasn't called.
-
-    If 'wrap_verify' is True, every class method is wrapped with a
-    verifier that will raise AssertionError if the .close() method has
-    already been called."""
-    def class_decorator(cls):
-
-        # Helper to replace a class method with a wrapper function,
-        # while maintaining argument specs etc.
-        def wrap_class_method(wrapper_func):
-            method = wrapper_func.__name__
-            if method in cls.__dict__:
-                orig = getattr(cls, method).im_func
-            else:
-                orig = lambda self: None
-            setattr(cls, method, decorator.decorator(wrapper_func, orig))
-
-        @wrap_class_method
-        def __init__(orig, self, *args, **kwargs):
-            ret = orig(self, *args, **kwargs)
-            self.__dict__["_must_close"] = True
-            self.__dict__["_must_close_initialized"] = True
-            return ret
-
-        @wrap_class_method
-        def __del__(orig, self, *args, **kwargs):
-            if "_must_close" in self.__dict__:
-                fprintf(errorfile, "error: %s.close() wasn't called!\n",
-                        self.__class__.__name__)
-            return orig(self, *args, **kwargs)
-
-        @wrap_class_method
-        def close(orig, self, *args, **kwargs):
-            del self._must_close
-            return orig(self, *args, **kwargs)
-
-        # Optionally wrap all other functions
-        def verifier(orig, self, *args, **kwargs):
-            if ("_must_close" not in self.__dict__ and
-                "_must_close_initialized" in self.__dict__):
-                raise AssertionError("called " + str(orig) + " after close")
-            return orig(self, *args, **kwargs)
-        if wrap_verify:
-            for (name, method) in inspect.getmembers(cls, inspect.ismethod):
-                # Skip class methods
-                if method.__self__ is not None:
-                    continue
-                # Skip some methods
-                if name in [ "__del__", "__init__" ]:
-                    continue
-                # Set up wrapper
-                setattr(cls, name, decorator.decorator(verifier,
-                                                       method.im_func))
-        return cls
-    return class_decorator
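The core idea of the deleted `must_close` decorator, stripped of the method-wrapping machinery, is just a flag checked in `__del__`; a minimal illustrative sketch (class name is hypothetical):

```python
import sys

class MustClose:
    """Warn at finalization time if close() was never called,
    like the removed must_close class decorator."""
    def __init__(self):
        self._closed = False

    def close(self):
        self._closed = True

    def __del__(self):
        if not self._closed:
            print("error: %s.close() wasn't called!"
                  % self.__class__.__name__, file=sys.stderr)

c = MustClose()
c.close()   # no warning at finalization
```

The decorator version is stronger: it bolts this behavior onto any class without touching its source, and can optionally reject method calls made after `close()`.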
@@ -1,40 +0,0 @@
-from __future__ import absolute_import
-from urllib import quote_plus, _is_unicode
-
-# urllib.urlencode insists on encoding Unicode as ASCII.  This is based
-# on that function, except we always encode it as UTF-8 instead.
-
-def urlencode(query):
-    """Encode a dictionary into a URL query string.
-
-    If any values in the query arg are sequences, each sequence
-    element is converted to a separate parameter.
-    """
-    query = query.items()
-
-    l = []
-    for k, v in query:
-        k = quote_plus(str(k))
-        if isinstance(v, str):
-            v = quote_plus(v)
-            l.append(k + '=' + v)
-        elif _is_unicode(v):
-            # is there a reasonable way to convert to ASCII?
-            # encode generates a string, but "replace" or "ignore"
-            # lose information and "strict" can raise UnicodeError
-            v = quote_plus(v.encode("utf-8","strict"))
-            l.append(k + '=' + v)
-        else:
-            try:
-                # is this a sufficient test for sequence-ness?
-                len(v)
-            except TypeError:
-                # not a sequence
-                v = quote_plus(str(v))
-                l.append(k + '=' + v)
-            else:
-                # loop over the sequence
-                for elt in v:
-                    l.append(k + '=' + quote_plus(str(elt)))
-    return '&'.join(l)
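The deleted wrapper worked around Python 2's `urllib.urlencode`; in Python 3, the standard library covers both concerns directly (text is UTF-8 encoded, and `doseq=True` expands sequence values into repeated parameters):

```python
from urllib.parse import urlencode

# Sequence values become one parameter per element, like the deleted
# helper's sequence handling:
qs = urlencode({"key": ["foo", "bar"], "n": 1}, doseq=True)
# "key=foo&key=bar&n=1"
```

Without `doseq=True`, the list would instead be stringified as a single quoted value.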
@@ -3,17 +3,14 @@
 import nilmdb
 import argparse

-formatter = argparse.ArgumentDefaultsHelpFormatter
-parser = argparse.ArgumentParser(description='Run the NILM server',
-                                 formatter_class = formatter)
+parser = argparse.ArgumentParser(description='Run the NILM server')
 parser.add_argument('-p', '--port', help='Port number', type=int, default=12380)
-parser.add_argument('-d', '--database', help='Database directory', default="db")
 parser.add_argument('-y', '--yappi', help='Run with yappi profiler',
                     action='store_true')
 args = parser.parse_args()

 # Start web app on a custom port
-db = nilmdb.NilmDB(args.database)
+db = nilmdb.NilmDB("db")
 server = nilmdb.Server(db, host = "127.0.0.1",
                        port = args.port,
                        embedded = False)
46	runtests.py
@@ -1,46 +0,0 @@
-#!/usr/bin/python
-
-import nose
-import os
-import sys
-import glob
-from collections import OrderedDict
-
-class JimOrderPlugin(nose.plugins.Plugin):
-    """When searching for tests and encountering a directory that
-    contains a 'test.order' file, run tests listed in that file, in the
-    order that they're listed.  Globs are OK in that file and duplicates
-    are removed."""
-    name = 'jimorder'
-    score = 10000
-
-    def prepareTestLoader(self, loader):
-        def wrap(func):
-            def wrapper(name, *args, **kwargs):
-                addr = nose.selector.TestAddress(
-                    name, workingDir=loader.workingDir)
-                try:
-                    order = os.path.join(addr.filename, "test.order")
-                except:
-                    order = None
-                if order and os.path.exists(order):
-                    files = []
-                    for line in open(order):
-                        line = line.split('#')[0].strip()
-                        if not line:
-                            continue
-                        fn = os.path.join(addr.filename, line.strip())
-                        files.extend(sorted(glob.glob(fn)) or [fn])
-                    files = list(OrderedDict.fromkeys(files))
-                    tests = [ wrapper(fn, *args, **kwargs) for fn in files ]
-                    return loader.suiteClass(tests)
-                return func(name, *args, **kwargs)
-            return wrapper
-        loader.loadTestsFromName = wrap(loader.loadTestsFromName)
-        return loader
-
-# Use setup.cfg for most of the test configuration.  Adding
-# --with-jimorder here means that a normal "nosetests" run will
-# still work, it just won't support test.order.
-nose.main(addplugins = [ JimOrderPlugin() ],
-          argv = sys.argv + ["--with-jimorder"])
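The core of the plugin deleted above is ordering: each `test.order` line is stripped of comments, expanded as a glob (sorted), falls back to the literal name if nothing matches, and duplicates are dropped while keeping first-seen order via `OrderedDict.fromkeys`. That step can be sketched standalone; `ordered_test_files` is a hypothetical name, and a plain `dict.fromkeys` replaces `OrderedDict.fromkeys` on modern Python:

```python
import fnmatch

def ordered_test_files(order_lines, available):
    """Expand test.order lines against available filenames: strip
    comments and blanks, glob each pattern (sorted), fall back to the
    literal name, and de-duplicate preserving first-seen order."""
    files = []
    for line in order_lines:
        line = line.split('#')[0].strip()
        if not line:
            continue
        matches = sorted(fnmatch.filter(available, line))
        files.extend(matches or [line])
    return list(dict.fromkeys(files))   # order-preserving dedup
```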
12	setup.cfg
@@ -8,26 +8,16 @@ cover-package=nilmdb
 cover-erase=
 ##cover-html= # this works, puts html output in cover/ dir
 ##cover-branches= # need nose 1.1.3 for this
-#debug=nose
-#debug-log=nose.log
 stop=
 verbosity=2
-tests=tests
-#tests=tests/test_bulkdata.py
-#tests=tests/test_mustclose.py
-#tests=tests/test_lrucache.py
 #tests=tests/test_cmdline.py
 #tests=tests/test_layout.py
-#tests=tests/test_rbtree.py
-#tests=tests/test_interval.py
-#tests=tests/test_rbtree.py,tests/test_interval.py
-#tests=tests/test_interval.py
+tests=tests/test_interval.py
 #tests=tests/test_client.py
 #tests=tests/test_timestamper.py
 #tests=tests/test_serializer.py
 #tests=tests/test_iteratorizer.py
 #tests=tests/test_client.py:TestClient.test_client_nilmdb
-#tests=tests/test_nilmdb.py
 #with-profile=
 #profile-sort=time
 ##profile-restrict=10 # doesn't work right, treated as string or something
@@ -1,19 +0,0 @@
-2.56437e+05 2.24430e+05 4.01161e+03 3.47534e+03 7.49589e+03 3.38894e+03 2.61397e+02 3.73126e+03
-2.53963e+05 2.24167e+05 5.62107e+03 1.54801e+03 9.16517e+03 3.52293e+03 1.05893e+03 2.99696e+03
-2.58508e+05 2.24930e+05 6.01140e+03 8.18866e+02 9.03995e+03 4.48244e+03 2.49039e+03 2.67934e+03
-2.59627e+05 2.26022e+05 4.47450e+03 2.42302e+03 7.41419e+03 5.07197e+03 2.43938e+03 2.96296e+03
-2.55187e+05 2.24632e+05 4.73857e+03 3.39804e+03 7.39512e+03 4.72645e+03 1.83903e+03 3.39353e+03
-2.57102e+05 2.21623e+05 6.14413e+03 1.44109e+03 8.75648e+03 3.49532e+03 1.86994e+03 3.75253e+03
-2.63653e+05 2.21770e+05 6.22177e+03 7.38962e+02 9.54760e+03 2.66682e+03 1.46266e+03 3.33257e+03
-2.63613e+05 2.25256e+05 4.47712e+03 2.43745e+03 8.51021e+03 3.85563e+03 9.59442e+02 2.38718e+03
-2.55350e+05 2.26264e+05 4.28372e+03 3.92394e+03 7.91247e+03 5.46652e+03 1.28499e+03 2.09372e+03
-2.52727e+05 2.24609e+05 5.85193e+03 2.49198e+03 8.54063e+03 5.62305e+03 2.33978e+03 3.00714e+03
-2.58475e+05 2.23578e+05 5.92487e+03 1.39448e+03 8.77962e+03 4.54418e+03 2.13203e+03 3.84976e+03
-2.61563e+05 2.24609e+05 4.33614e+03 2.45575e+03 8.05538e+03 3.46911e+03 6.27873e+02 3.66420e+03
-2.56401e+05 2.24441e+05 4.18715e+03 3.45717e+03 7.90669e+03 3.53355e+03 -5.84482e+00 2.96687e+03
-2.54745e+05 2.22644e+05 6.02005e+03 1.94721e+03 9.28939e+03 3.80020e+03 1.34820e+03 2.37785e+03
-2.60723e+05 2.22660e+05 6.69719e+03 1.03048e+03 9.26124e+03 4.34917e+03 2.84530e+03 2.73619e+03
-2.63089e+05 2.25711e+05 4.77887e+03 2.60417e+03 7.39660e+03 4.59811e+03 2.17472e+03 3.40729e+03
-2.55843e+05 2.27128e+05 4.02413e+03 4.39323e+03 6.79336e+03 4.62535e+03 7.52009e+02 3.44647e+03
-2.51904e+05 2.24868e+05 5.82289e+03 3.02127e+03 8.46160e+03 3.80298e+03 8.07212e+02 3.53468e+03
-2.57670e+05 2.22974e+05 6.73436e+03 1.60956e+03 9.92960e+03 2.98028e+03 1.44168e+03 3.05351e+03
@@ -13,7 +13,7 @@ class Renderer(object):

     # Rendering
     def __render_dot_node(self, node, max_depth = 20):
-        from nilmdb.utils.printf import sprintf
+        from nilmdb.printf import sprintf
         """Render a single node and its children into a dot graph fragment"""
         if max_depth == 0:
             return ""
@@ -71,20 +71,3 @@ class Renderer(object):
             gtk.main_quit()
         window.widget.connect('key-press-event', quit)
         gtk.main()
-
-class RBTreeRenderer(Renderer):
-    def __init__(self, tree):
-        Renderer.__init__(self,
-                          lambda node: node.left,
-                          lambda node: node.right,
-                          lambda node: node.red,
-                          lambda node: node.start,
-                          lambda node: node.end,
-                          tree.nil)
-        self.tree = tree
-
-    def render(self, title = "RBTree", live = True):
-        if live:
-            return Renderer.render_dot_live(self, self.tree.getroot(), title)
-        else:
-            return Renderer.render_dot(self, self.tree.getroot(), title)
@@ -1,18 +0,0 @@
-test_printf.py
-test_lrucache.py
-test_mustclose.py
-
-test_serializer.py
-test_iteratorizer.py
-
-test_timestamper.py
-test_layout.py
-test_rbtree.py
-test_interval.py
-
-test_bulkdata.py
-test_nilmdb.py
-test_client.py
-test_cmdline.py
-
-test_*.py
@@ -1,103 +0,0 @@
-# -*- coding: utf-8 -*-
-
-import nilmdb
-from nilmdb.utils.printf import *
-import nilmdb.bulkdata
-
-from nose.tools import *
-from nose.tools import assert_raises
-import itertools
-
-from testutil.helpers import *
-
-testdb = "tests/bulkdata-testdb"
-
-from nilmdb.bulkdata import BulkData
-
-class TestBulkData(object):
-
-    def test_bulkdata(self):
-        for (size, files, db) in [ ( 0, 0, testdb ),
-                                   ( 25, 1000, testdb ),
-                                   ( 1000, 3, testdb.decode("utf-8") ) ]:
-            recursive_unlink(db)
-            os.mkdir(db)
-            self.do_basic(db, size, files)
-
-    def do_basic(self, db, size, files):
-        """Do the basic test with variable file_size and files_per_dir"""
-        if not size or not files:
-            data = BulkData(db)
-        else:
-            data = BulkData(db, file_size = size, files_per_dir = files)
-
-        # create empty
-        with assert_raises(ValueError):
-            data.create("/foo", "uint16_8")
-        with assert_raises(ValueError):
-            data.create("foo/bar", "uint16_8")
-        with assert_raises(ValueError):
-            data.create("/foo/bar", "uint8_8")
-        data.create("/foo/bar", "uint16_8")
-        data.create(u"/foo/baz/quux", "float64_16")
-        with assert_raises(ValueError):
-            data.create("/foo/bar/baz", "uint16_8")
-        with assert_raises(ValueError):
-            data.create("/foo/baz", "float64_16")
-
-        # get node -- see if caching works
-        nodes = []
-        for i in range(5000):
-            nodes.append(data.getnode("/foo/bar"))
-            nodes.append(data.getnode("/foo/baz/quux"))
-        del nodes
-
-        # Test node
-        node = data.getnode("/foo/bar")
-        with assert_raises(IndexError):
-            x = node[0]
-        raw = []
-        for i in range(1000):
-            raw.append([10000+i, 1, 2, 3, 4, 5, 6, 7, 8 ])
-        node.append(raw[0:1])
-        node.append(raw[1:100])
-        node.append(raw[100:])
-
-        misc_slices = [ 0, 100, slice(None), slice(0), slice(10),
-                        slice(5,10), slice(3,None), slice(3,-3),
-                        slice(20,10), slice(200,100,-1), slice(None,0,-1),
-                        slice(100,500,5) ]
-        # Extract slices
-        for s in misc_slices:
-            eq_(node[s], raw[s])
-
-        # Get some coverage of remove; remove is more fully tested
-        # in cmdline
-        with assert_raises(IndexError):
-            node.remove(9999,9998)
-
-        # close, reopen
-        # reopen
-        data.close()
-        if not size or not files:
-            data = BulkData(db)
-        else:
-            data = BulkData(db, file_size = size, files_per_dir = files)
-        node = data.getnode("/foo/bar")
-
-        # Extract slices
-        for s in misc_slices:
-            eq_(node[s], raw[s])
-
-        # destroy
-        with assert_raises(ValueError):
-            data.destroy("/foo")
-        with assert_raises(ValueError):
-            data.destroy("/foo/baz")
-        with assert_raises(ValueError):
-            data.destroy("/foo/qwerty")
-        data.destroy("/foo/baz/quux")
-        data.destroy("/foo/bar")
-
-        # close
-        data.close()
@@ -1,7 +1,5 @@
-# -*- coding: utf-8 -*-
-
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 from nilmdb.client import ClientError, ServerError

 import datetime_tz
@@ -17,9 +15,8 @@ import cStringIO
 import simplejson as json
 import unittest
 import warnings
-import resource

-from testutil.helpers import *
+from test_helpers import *

 testdb = "tests/client-testdb"

@@ -70,11 +67,7 @@ class TestClient(object):
         eq_(distutils.version.StrictVersion(version),
             distutils.version.StrictVersion(test_server.version))

-        # Bad URLs should give 404, not 500
-        with assert_raises(ClientError):
-            client.http.get("/stream/create")
-
-    def test_client_2_createlist(self):
+    def test_client_2_nilmdb(self):
         # Basic stream tests, like those in test_nilmdb:test_stream
         client = nilmdb.Client(url = "http://localhost:12380/")

@@ -89,8 +82,6 @@ class TestClient(object):
         # Bad layout type
         with assert_raises(ClientError):
             client.stream_create("/newton/prep", "NoSuchLayout")
-
-        # Create three streams
         client.stream_create("/newton/prep", "PrepData")
         client.stream_create("/newton/raw", "RawData")
         client.stream_create("/newton/zzz/rawnotch", "RawNotchedData")
@@ -104,20 +95,6 @@ class TestClient(object):
         eq_(client.stream_list(layout="RawData"), [ ["/newton/raw", "RawData"] ])
         eq_(client.stream_list(path="/newton/raw"), [ ["/newton/raw", "RawData"] ])

-        # Try messing with resource limits to trigger errors and get
-        # more coverage.  Here, make it so we can only create files 1
-        # byte in size, which will trigger an IOError in the server when
-        # we create a table.
-        limit = resource.getrlimit(resource.RLIMIT_FSIZE)
-        resource.setrlimit(resource.RLIMIT_FSIZE, (1, limit[1]))
-        with assert_raises(ServerError) as e:
-            client.stream_create("/newton/hello", "RawData")
-        resource.setrlimit(resource.RLIMIT_FSIZE, limit)
-
-
-    def test_client_3_metadata(self):
-        client = nilmdb.Client(url = "http://localhost:12380/")
-
         # Set / get metadata
         eq_(client.stream_get_metadata("/newton/prep"), {})
         eq_(client.stream_get_metadata("/newton/raw"), {})
@@ -147,14 +124,13 @@ class TestClient(object):
         with assert_raises(ClientError):
             client.stream_update_metadata("/newton/prep", [1,2,3])

-    def test_client_4_insert(self):
+    def test_client_3_insert(self):
         client = nilmdb.Client(url = "http://localhost:12380/")

         datetime_tz.localtz_set("America/New_York")

         testfile = "tests/data/prep-20120323T1000"
         start = datetime_tz.datetime_tz.smartparse("20120323T1000")
-        start = start.totimestamp()
         rate = 120

         # First try a nonexistent path
@@ -179,60 +155,30 @@ class TestClient(object):

         # Try forcing a server request with empty data
         with assert_raises(ClientError) as e:
-            client.http.put("stream/insert", "", { "path": "/newton/prep",
-                                                   "start": 0, "end": 0 })
+            client.http.put("stream/insert", "", { "path": "/newton/prep" })
         in_("400 Bad Request", str(e.exception))
         in_("no data provided", str(e.exception))

-        # Specify start/end (starts too late)
-        data = nilmdb.timestamper.TimestamperRate(testfile, start, 120)
-        with assert_raises(ClientError) as e:
-            result = client.stream_insert("/newton/prep", data,
-                                          start + 5, start + 120)
-        in_("400 Bad Request", str(e.exception))
-        in_("Data timestamp 1332511200.0 < start time 1332511205.0",
-            str(e.exception))
-
-        # Specify start/end (ends too early)
-        data = nilmdb.timestamper.TimestamperRate(testfile, start, 120)
-        with assert_raises(ClientError) as e:
-            result = client.stream_insert("/newton/prep", data,
-                                          start, start + 1)
-        in_("400 Bad Request", str(e.exception))
-        # Client chunks the input, so the exact timestamp here might change
-        # if the chunk positions change.
-        in_("Data timestamp 1332511271.016667 >= end time 1332511201.0",
-            str(e.exception))
-
         # Now do the real load
         data = nilmdb.timestamper.TimestamperRate(testfile, start, 120)
-        result = client.stream_insert("/newton/prep", data,
-                                      start, start + 119.999777)
-        eq_(result, "ok")
-
-        # Verify the intervals.  Should be just one, even if the data
-        # was inserted in chunks, due to nilmdb interval concatenation.
-        intervals = list(client.stream_intervals("/newton/prep"))
-        eq_(intervals, [[start, start + 119.999777]])
+        result = client.stream_insert("/newton/prep", data)
+        eq_(result[0], "ok")

         # Try some overlapping data -- just insert it again
         data = nilmdb.timestamper.TimestamperRate(testfile, start, 120)
         with assert_raises(ClientError) as e:
             result = client.stream_insert("/newton/prep", data)
         in_("400 Bad Request", str(e.exception))
-        in_("verlap", str(e.exception))
+        in_("OverlapError", str(e.exception))

-    def test_client_5_extractremove(self):
-        # Misc tests for extract and remove.  Most of them are in test_cmdline.
+    def test_client_4_extract(self):
+        # Misc tests for extract.  Most of them are in test_cmdline.
         client = nilmdb.Client(url = "http://localhost:12380/")

         for x in client.stream_extract("/newton/prep", 123, 123):
             raise Exception("shouldn't be any data for this request")

-        with assert_raises(ClientError) as e:
-            client.stream_remove("/newton/prep", 123, 120)
-
-    def test_client_6_generators(self):
+    def test_client_5_generators(self):
         # A lot of the client functionality is already tested by test_cmdline,
         # but this gets a bit more coverage that cmdline misses.
         client = nilmdb.Client(url = "http://localhost:12380/")
@@ -269,8 +215,7 @@ class TestClient(object):
         # Check PUT with generator out
         with assert_raises(ClientError) as e:
             client.http.put_gen("stream/insert", "",
-                                { "path": "/newton/prep",
-                                  "start": 0, "end": 0 }).next()
+                                { "path": "/newton/prep" }).next()
         in_("400 Bad Request", str(e.exception))
         in_("no data provided", str(e.exception))

@@ -281,7 +226,7 @@ class TestClient(object):
         in_("404 Not Found", str(e.exception))
         in_("No such stream", str(e.exception))

-    def test_client_7_chunked(self):
+    def test_client_6_chunked(self):
         # Make sure that /stream/intervals and /stream/extract
         # properly return streaming, chunked response.  Pokes around
         # in client.http internals a bit to look at the response
@@ -293,7 +238,7 @@ class TestClient(object):
         # still disable chunked responses for debugging.
         x = client.http.get("stream/intervals", { "path": "/newton/prep" },
                             retjson=False)
-        lines_(x, 1)
+        eq_(x.count('\n'), 2)
         if "transfer-encoding: chunked" not in client.http._headers.lower():
             warnings.warn("Non-chunked HTTP response for /stream/intervals")

@@ -303,40 +248,3 @@ class TestClient(object):
                               "end": "123" }, retjson=False)
         if "transfer-encoding: chunked" not in client.http._headers.lower():
             warnings.warn("Non-chunked HTTP response for /stream/extract")
-
-    def test_client_8_unicode(self):
-        # Basic Unicode tests
-        client = nilmdb.Client(url = "http://localhost:12380/")
-
-        # Delete streams that exist
-        for stream in client.stream_list():
-            client.stream_destroy(stream[0])
-
-        # Database is empty
-        eq_(client.stream_list(), [])
-
-        # Create Unicode stream, match it
-        raw = [ u"/düsseldorf/raw", u"uint16_6" ]
-        prep = [ u"/düsseldorf/prep", u"uint16_6" ]
-        client.stream_create(*raw)
-        eq_(client.stream_list(), [raw])
-        eq_(client.stream_list(layout=raw[1]), [raw])
-        eq_(client.stream_list(path=raw[0]), [raw])
-        client.stream_create(*prep)
-        eq_(client.stream_list(), [prep, raw])
-
-        # Set / get metadata with Unicode keys and values
-        eq_(client.stream_get_metadata(raw[0]), {})
-        eq_(client.stream_get_metadata(prep[0]), {})
-        meta1 = { u"alpha": u"α",
-                  u"β": u"beta" }
-        meta2 = { u"alpha": u"α" }
-        meta3 = { u"β": u"beta" }
-        client.stream_set_metadata(prep[0], meta1)
-        client.stream_update_metadata(prep[0], {})
-        client.stream_update_metadata(raw[0], meta2)
-        client.stream_update_metadata(raw[0], meta3)
-        eq_(client.stream_get_metadata(prep[0]), meta1)
-        eq_(client.stream_get_metadata(raw[0]), meta1)
-        eq_(client.stream_get_metadata(raw[0], [ "alpha" ]), meta2)
-        eq_(client.stream_get_metadata(raw[0], [ "alpha", "β" ]), meta1)
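The Unicode test removed above relies on `stream_set_metadata` replacing a stream's metadata wholesale while `stream_update_metadata` merges new keys into whatever is already there (two updates accumulate into `meta1`). That merge semantics amounts to a dict update; here is a toy in-memory model of it, not the real server code:

```python
def set_metadata(store, path, data):
    """Replace the stream's metadata outright."""
    store[path] = dict(data)

def update_metadata(store, path, data):
    """Merge new keys into the stream's existing metadata."""
    store.setdefault(path, {}).update(data)

store = {}
set_metadata(store, "/düsseldorf/raw", {})
update_metadata(store, "/düsseldorf/raw", {"alpha": "α"})
update_metadata(store, "/düsseldorf/raw", {"β": "beta"})
# the two updates accumulate, mirroring meta1 in the test above
```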
@@ -1,35 +1,29 @@
-# -*- coding: utf-8 -*-
-
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import nilmdb.cmdline

-import unittest
 from nose.tools import *
 from nose.tools import assert_raises
 import itertools
 import datetime_tz
 import os
-import re
 import shutil
 import sys
 import threading
 import urllib2
 from urllib2 import urlopen, HTTPError
 import Queue
-import StringIO
+import cStringIO
 import shlex

-from testutil.helpers import *
+from test_helpers import *

 testdb = "tests/cmdline-testdb"

-def server_start(max_results = None, bulkdata_args = {}):
+def server_start(max_results = None):
     global test_server, test_db
     # Start web app on a custom port
-    test_db = nilmdb.NilmDB(testdb, sync = False,
-                            max_results = max_results,
-                            bulkdata_args = bulkdata_args)
+    test_db = nilmdb.NilmDB(testdb, sync = False, max_results = max_results)
     test_server = nilmdb.Server(test_db, host = "127.0.0.1",
                                 port = 12380, stoppable = False,
                                 fast_shutdown = True,
@@ -51,18 +45,12 @@ def setup_module():
 def teardown_module():
     server_stop()

-# Add an encoding property to StringIO so Python will convert Unicode
-# properly when writing or reading.
-class UTF8StringIO(StringIO.StringIO):
-    encoding = 'utf-8'
-
 class TestCmdline(object):

     def run(self, arg_string, infile=None, outfile=None):
         """Run a cmdline client with the specified argument string,
         passing the given input.  Returns a tuple with the output and
         exit code"""
-        # printf("TZ=UTC ./nilmtool.py %s\n", arg_string)
         class stdio_wrapper:
             def __init__(self, stdin, stdout, stderr):
                 self.io = (stdin, stdout, stderr)
@@ -73,18 +61,15 @@ class TestCmdline(object):
             ( sys.stdin, sys.stdout, sys.stderr ) = self.saved
         # Empty input if none provided
         if infile is None:
-            infile = UTF8StringIO("")
+            infile = cStringIO.StringIO("")
         # Capture stderr
-        errfile = UTF8StringIO()
+        errfile = cStringIO.StringIO()
         if outfile is None:
             # If no output file, capture stdout with stderr
             outfile = errfile
         with stdio_wrapper(infile, outfile, errfile) as s:
             try:
-                # shlex doesn't support Unicode very well.  Encode the
-                # string as UTF-8 explicitly before splitting.
-                args = shlex.split(arg_string.encode('utf-8'))
-                nilmdb.cmdline.Cmdline(args).run()
+                nilmdb.cmdline.Cmdline(shlex.split(arg_string)).run()
                 sys.exit(0)
             except SystemExit as e:
                 exitcode = e.code
@@ -98,24 +83,14 @@ class TestCmdline(object):
         self.dump()
         eq_(self.exitcode, 0)

-    def fail(self, arg_string, infile = None,
-             exitcode = None, require_error = True):
+    def fail(self, arg_string, infile = None, exitcode = None):
         self.run(arg_string, infile)
         if exitcode is not None and self.exitcode != exitcode:
-            # Wrong exit code
             self.dump()
             eq_(self.exitcode, exitcode)
         if self.exitcode == 0:
-            # Success, when we wanted failure
             self.dump()
             ne_(self.exitcode, 0)
-        # Make sure the output contains the word "error" at the
-        # beginning of a line, but only if an exitcode wasn't
-        # specified.
-        if require_error and not re.search("^error",
-                                           self.captured, re.MULTILINE):
-            raise AssertionError("command failed, but output doesn't "
-                                 "contain the string 'error'")

     def contain(self, checkstring):
         in_(checkstring, self.captured)
@@ -145,7 +120,7 @@ class TestCmdline(object):
     def dump(self):
        printf("-----dump start-----\n%s-----dump end-----\n", self.captured)

-    def test_01_basic(self):
+    def test_cmdline_01_basic(self):

         # help
         self.ok("--help")
@@ -191,14 +166,14 @@ class TestCmdline(object):
         self.fail("extract --start 2000-01-01 --start 2001-01-02")
         self.contain("duplicated argument")

-    def test_02_info(self):
+    def test_cmdline_02_info(self):
         self.ok("info")
         self.contain("Server URL: http://localhost:12380/")
         self.contain("Server version: " + test_server.version)
         self.contain("Server database path")
         self.contain("Server database size")

-    def test_03_createlist(self):
+    def test_cmdline_03_createlist(self):
         # Basic stream tests, like those in test_client.

         # No streams
@@ -215,44 +190,22 @@ class TestCmdline(object):
         # Bad layout type
         self.fail("create /newton/prep NoSuchLayout")
         self.contain("no such layout")
-        self.fail("create /newton/prep float32_0")
-        self.contain("no such layout")
-        self.fail("create /newton/prep float33_1")
-        self.contain("no such layout")

         # Create a few streams
-        self.ok("create /newton/zzz/rawnotch RawNotchedData")
         self.ok("create /newton/prep PrepData")
         self.ok("create /newton/raw RawData")
-
-        # Should not be able to create a stream with another stream as
-        # its parent
-        self.fail("create /newton/prep/blah PrepData")
-        self.contain("path is subdir of existing node")
-
-        # Should not be able to create a stream at a location that
-        # has other nodes as children
-        self.fail("create /newton/zzz PrepData")
-        self.contain("subdirs of this path already exist")
-
-        # Verify we got those 3 streams and they're returned in
-        # alphabetical order.
+        self.ok("create /newton/zzz/rawnotch RawNotchedData")
+
+        # Verify we got those 3 streams
         self.ok("list")
         self.match("/newton/prep PrepData\n"
                    "/newton/raw RawData\n"
                    "/newton/zzz/rawnotch RawNotchedData\n")

-        # Match just one type or one path.  Also check
-        # that --path is optional
+        # Match just one type or one path
         self.ok("list --path /newton/raw")
         self.match("/newton/raw RawData\n")

-        self.ok("list /newton/raw")
-        self.match("/newton/raw RawData\n")
-
         self.fail("list -p /newton/raw /newton/raw")
|
|
||||||
self.contain("too many paths")
|
|
||||||
|
|
||||||
self.ok("list --layout RawData")
|
self.ok("list --layout RawData")
|
||||||
self.match("/newton/raw RawData\n")
|
self.match("/newton/raw RawData\n")
|
||||||
|
|
||||||
@@ -264,17 +217,10 @@ class TestCmdline(object):
|
|||||||
self.ok("list --path *zzz* --layout Raw*")
|
self.ok("list --path *zzz* --layout Raw*")
|
||||||
self.match("/newton/zzz/rawnotch RawNotchedData\n")
|
self.match("/newton/zzz/rawnotch RawNotchedData\n")
|
||||||
|
|
||||||
self.ok("list *zzz* --layout Raw*")
|
|
||||||
self.match("/newton/zzz/rawnotch RawNotchedData\n")
|
|
||||||
|
|
||||||
self.ok("list --path *zzz* --layout Prep*")
|
self.ok("list --path *zzz* --layout Prep*")
|
||||||
self.match("")
|
self.match("")
|
||||||
|
|
||||||
# reversed range
|
def test_cmdline_04_metadata(self):
|
||||||
self.fail("list /newton/prep --start 2020-01-01 --end 2000-01-01")
|
|
||||||
self.contain("start is after end")
|
|
||||||
|
|
||||||
def test_04_metadata(self):
|
|
||||||
# Set / get metadata
|
# Set / get metadata
|
||||||
self.fail("metadata")
|
self.fail("metadata")
|
||||||
self.fail("metadata --get")
|
self.fail("metadata --get")
|
||||||
@@ -331,7 +277,7 @@ class TestCmdline(object):
|
|||||||
self.fail("metadata /newton/nosuchpath")
|
self.fail("metadata /newton/nosuchpath")
|
||||||
self.contain("No stream at path /newton/nosuchpath")
|
self.contain("No stream at path /newton/nosuchpath")
|
||||||
|
|
||||||
def test_05_parsetime(self):
|
def test_cmdline_05_parsetime(self):
|
||||||
os.environ['TZ'] = "America/New_York"
|
os.environ['TZ'] = "America/New_York"
|
||||||
cmd = nilmdb.cmdline.Cmdline(None)
|
cmd = nilmdb.cmdline.Cmdline(None)
|
||||||
test = datetime_tz.datetime_tz.now()
|
test = datetime_tz.datetime_tz.now()
|
||||||
@@ -340,23 +286,30 @@ class TestCmdline(object):
|
|||||||
eq_(cmd.parse_time("hi there 20120405 1400-0400 testing! 123"), test)
|
eq_(cmd.parse_time("hi there 20120405 1400-0400 testing! 123"), test)
|
||||||
eq_(cmd.parse_time("20120405 1800 UTC"), test)
|
eq_(cmd.parse_time("20120405 1800 UTC"), test)
|
||||||
eq_(cmd.parse_time("20120405 1400-0400 UTC"), test)
|
eq_(cmd.parse_time("20120405 1400-0400 UTC"), test)
|
||||||
for badtime in [ "20120405 1400-9999", "hello", "-", "", "14:00" ]:
|
|
||||||
with assert_raises(ValueError):
|
with assert_raises(ValueError):
|
||||||
x = cmd.parse_time(badtime)
|
print cmd.parse_time("20120405 1400-9999")
|
||||||
|
with assert_raises(ValueError):
|
||||||
|
print cmd.parse_time("hello")
|
||||||
|
with assert_raises(ValueError):
|
||||||
|
print cmd.parse_time("-")
|
||||||
|
with assert_raises(ValueError):
|
||||||
|
print cmd.parse_time("")
|
||||||
|
with assert_raises(ValueError):
|
||||||
|
print cmd.parse_time("14:00")
|
||||||
eq_(cmd.parse_time("snapshot-20120405-140000.raw.gz"), test)
|
eq_(cmd.parse_time("snapshot-20120405-140000.raw.gz"), test)
|
||||||
eq_(cmd.parse_time("prep-20120405T1400"), test)
|
eq_(cmd.parse_time("prep-20120405T1400"), test)
|
||||||
|
|
||||||
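The parse_time hunk above trades a single parametrized loop over bad timestamps for five copy-pasted `assert_raises` blocks. The loop form can be sketched standalone; `parse_time` here is a stand-in stub (a plain `strptime` wrapper), not nilmdb's actual parser:

```python
# Sketch of the loop-style negative tests from the diff above.
# parse_time is a hypothetical stand-in that, like the real parser,
# raises ValueError on malformed input.
import datetime

def parse_time(s):
    # Accept only "YYYYMMDD HHMM"; anything else raises ValueError.
    return datetime.datetime.strptime(s, "%Y%m%d %H%M")

def check_bad_times(candidates):
    # Return the inputs that *failed* to raise; an empty list means
    # every malformed string was rejected, as the test expects.
    failures = []
    for badtime in candidates:
        try:
            parse_time(badtime)
            failures.append(badtime)  # should have raised
        except ValueError:
            pass
    return failures

bad = ["20120405 1400-9999", "hello", "-", "", "14:00"]
```

The loop form keeps the list of bad inputs in one place, which is why the newer side of the diff consolidates the repeated blocks.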
-    def test_06_insert(self):
+    def test_cmdline_06_insert(self):
         self.ok("insert --help")
 
         self.fail("insert /foo/bar baz qwer")
-        self.contain("error getting stream info")
+        self.contain("Error getting stream info")
 
         self.fail("insert /newton/prep baz qwer")
-        self.match("error opening input file baz\n")
+        self.match("Error opening input file baz\n")
 
         self.fail("insert /newton/prep")
-        self.contain("error extracting time")
+        self.contain("Error extracting time")
 
         self.fail("insert --start 19801205 /newton/prep 1 2 3 4")
         self.contain("--start can only be used with one input file")
@@ -397,7 +350,7 @@ class TestCmdline(object):
         os.environ['TZ'] = "UTC"
         self.fail("insert --rate 120 /newton/raw "
                   "tests/data/prep-20120323T1004")
-        self.contain("error parsing input data")
+        self.contain("Error parsing input data")
 
         # empty data does nothing
         self.ok("insert --rate 120 --start '03/23/2012 06:05:00' /newton/prep "
@@ -406,64 +359,57 @@ class TestCmdline(object):
         # bad start time
         self.fail("insert --rate 120 --start 'whatever' /newton/prep /dev/null")
 
-    def test_07_detail(self):
+    def test_cmdline_07_detail(self):
         # Just count the number of lines, it's probably fine
         self.ok("list --detail")
-        lines_(self.captured, 8)
+        eq_(self.captured.count('\n'), 11)
 
         self.ok("list --detail --path *prep")
-        lines_(self.captured, 4)
+        eq_(self.captured.count('\n'), 7)
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 10:02'")
-        lines_(self.captured, 3)
+        eq_(self.captured.count('\n'), 5)
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 10:05'")
-        lines_(self.captured, 2)
+        eq_(self.captured.count('\n'), 3)
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 10:05:15'")
-        lines_(self.captured, 2)
+        eq_(self.captured.count('\n'), 2)
         self.contain("10:05:15.000")
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 10:05:15.50'")
-        lines_(self.captured, 2)
+        eq_(self.captured.count('\n'), 2)
         self.contain("10:05:15.500")
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 19:05:15.50'")
-        lines_(self.captured, 2)
+        eq_(self.captured.count('\n'), 2)
         self.contain("no intervals")
 
         self.ok("list --detail --path *prep --start='23 Mar 2012 10:05:15.50'"
                 + " --end='23 Mar 2012 10:05:15.50'")
-        lines_(self.captured, 2)
+        eq_(self.captured.count('\n'), 2)
         self.contain("10:05:15.500")
 
         self.ok("list --detail")
-        lines_(self.captured, 8)
+        eq_(self.captured.count('\n'), 11)
 
-    def test_08_extract(self):
+    def test_cmdline_08_extract(self):
         # nonexistent stream
         self.fail("extract /no/such/foo --start 2000-01-01 --end 2020-01-01")
-        self.contain("error getting stream info")
+        self.contain("Error getting stream info")
 
-        # reversed range
-        self.fail("extract -a /newton/prep --start 2020-01-01 --end 2000-01-01")
-        self.contain("start is after end")
-
-        # empty ranges return error 2
+        # empty ranges return an error
         self.fail("extract -a /newton/prep " +
                   "--start '23 Mar 2012 10:00:30' " +
-                  "--end '23 Mar 2012 10:00:30'",
-                  exitcode = 2, require_error = False)
+                  "--end '23 Mar 2012 10:00:30'", exitcode = 2)
         self.contain("no data")
         self.fail("extract -a /newton/prep " +
                   "--start '23 Mar 2012 10:00:30.000001' " +
-                  "--end '23 Mar 2012 10:00:30.000001'",
-                  exitcode = 2, require_error = False)
+                  "--end '23 Mar 2012 10:00:30.000001'", exitcode = 2)
         self.contain("no data")
         self.fail("extract -a /newton/prep " +
                   "--start '23 Mar 2022 10:00:30' " +
-                  "--end '23 Mar 2022 10:00:30'",
-                  exitcode = 2, require_error = False)
+                  "--end '23 Mar 2022 10:00:30'", exitcode = 2)
         self.contain("no data")
 
         # but are ok if we're just counting results
@@ -498,325 +444,15 @@ class TestCmdline(object):
 
         # all data put in by tests
         self.ok("extract -a /newton/prep --start 2000-01-01 --end 2020-01-01")
-        lines_(self.captured, 43204)
+        eq_(self.captured.count('\n'), 43204)
         self.ok("extract -c /newton/prep --start 2000-01-01 --end 2020-01-01")
         self.match("43200\n")
 
-    def test_09_truncated(self):
+    def test_cmdline_09_truncated(self):
         # Test truncated responses by overriding the nilmdb max_results
         server_stop()
         server_start(max_results = 2)
         self.ok("list --detail")
-        lines_(self.captured, 8)
+        eq_(self.captured.count('\n'), 11)
         server_stop()
         server_start()
-
-    def test_10_remove(self):
-        # Removing data
-
-        # Try nonexistent stream
-        self.fail("remove /no/such/foo --start 2000-01-01 --end 2020-01-01")
-        self.contain("No stream at path")
-
-        self.fail("remove /newton/prep --start 2020-01-01 --end 2000-01-01")
-        self.contain("start is after end")
-
-        # empty ranges return success, backwards ranges return error
-        self.ok("remove /newton/prep " +
-                "--start '23 Mar 2012 10:00:30' " +
-                "--end '23 Mar 2012 10:00:30'")
-        self.match("")
-        self.ok("remove /newton/prep " +
-                "--start '23 Mar 2012 10:00:30.000001' " +
-                "--end '23 Mar 2012 10:00:30.000001'")
-        self.match("")
-        self.ok("remove /newton/prep " +
-                "--start '23 Mar 2022 10:00:30' " +
-                "--end '23 Mar 2022 10:00:30'")
-        self.match("")
-
-        # Verbose
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:30' " +
-                "--end '23 Mar 2012 10:00:30'")
-        self.match("0\n")
-        self.ok("remove --count /newton/prep " +
-                "--start '23 Mar 2012 10:00:30' " +
-                "--end '23 Mar 2012 10:00:30'")
-        self.match("0\n")
-
-        # Make sure we have the data we expect
-        self.ok("list --detail /newton/prep")
-        self.match("/newton/prep PrepData\n" +
-                   " [ Fri, 23 Mar 2012 10:00:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:01:59.991668 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:02:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:03:59.991668 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:04:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:05:59.991668 +0000 ]\n")
-
-        # Remove various chunks of prep data and make sure
-        # they're gone.
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:30' " +
-                "--end '23 Mar 2012 10:00:40'")
-        self.match("1200\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:10' " +
-                "--end '23 Mar 2012 10:00:20'")
-        self.match("1200\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:05' " +
-                "--end '23 Mar 2012 10:00:25'")
-        self.match("1200\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:03:50' " +
-                "--end '23 Mar 2012 10:06:50'")
-        self.match("15600\n")
-
-        self.ok("extract -c /newton/prep --start 2000-01-01 --end 2020-01-01")
-        self.match("24000\n")
-
-        # See the missing chunks in list output
-        self.ok("list --detail /newton/prep")
-        self.match("/newton/prep PrepData\n" +
-                   " [ Fri, 23 Mar 2012 10:00:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:00:05.000000 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:00:25.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:00:30.000000 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:00:40.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:01:59.991668 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:02:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:03:50.000000 +0000 ]\n")
-
-        # Remove all data, verify it's missing
-        self.ok("remove /newton/prep --start 2000-01-01 --end 2020-01-01")
-        self.match("") # no count requested this time
-        self.ok("list --detail /newton/prep")
-        self.match("/newton/prep PrepData\n" +
-                   " (no intervals)\n")
-
-        # Reinsert some data, to verify that no overlaps with deleted
-        # data are reported
-        os.environ['TZ'] = "UTC"
-        self.ok("insert --rate 120 /newton/prep "
-                "tests/data/prep-20120323T1000 "
-                "tests/data/prep-20120323T1002")
-
-    def test_11_destroy(self):
-        # Delete records
-        self.ok("destroy --help")
-
-        self.fail("destroy")
-        self.contain("too few arguments")
-
-        self.fail("destroy /no/such/stream")
-        self.contain("No stream at path")
-
-        self.fail("destroy asdfasdf")
-        self.contain("No stream at path")
-
-        # From previous tests, we have:
-        self.ok("list")
-        self.match("/newton/prep PrepData\n"
-                   "/newton/raw RawData\n"
-                   "/newton/zzz/rawnotch RawNotchedData\n")
-
-        # Notice how they're not empty
-        self.ok("list --detail")
-        lines_(self.captured, 7)
-
-        # Delete some
-        self.ok("destroy /newton/prep")
-        self.ok("list")
-        self.match("/newton/raw RawData\n"
-                   "/newton/zzz/rawnotch RawNotchedData\n")
-
-        self.ok("destroy /newton/zzz/rawnotch")
-        self.ok("list")
-        self.match("/newton/raw RawData\n")
-
-        self.ok("destroy /newton/raw")
-        self.ok("create /newton/raw RawData")
-        self.ok("destroy /newton/raw")
-        self.ok("list")
-        self.match("")
-
-        # Re-create a previously deleted location, and some new ones
-        rebuild = [ "/newton/prep", "/newton/zzz",
-                    "/newton/raw", "/newton/asdf/qwer" ]
-        for path in rebuild:
-            # Create the path
-            self.ok("create " + path + " PrepData")
-            self.ok("list")
-            self.contain(path)
-            # Make sure it was created empty
-            self.ok("list --detail --path " + path)
-            self.contain("(no intervals)")
-
-    def test_12_unicode(self):
-        # Unicode paths.
-        self.ok("destroy /newton/asdf/qwer")
-        self.ok("destroy /newton/prep")
-        self.ok("destroy /newton/raw")
-        self.ok("destroy /newton/zzz")
-
-        self.ok(u"create /düsseldorf/raw uint16_6")
-        self.ok("list --detail")
-        self.contain(u"/düsseldorf/raw uint16_6")
-        self.contain("(no intervals)")
-
-        # Unicode metadata
-        self.ok(u"metadata /düsseldorf/raw --set α=beta 'γ=δ'")
-        self.ok(u"metadata /düsseldorf/raw --update 'α=β ε τ α'")
-        self.ok(u"metadata /düsseldorf/raw")
-        self.match(u"α=β ε τ α\nγ=δ\n")
-
-        self.ok(u"destroy /düsseldorf/raw")
-
-    def test_13_files(self):
-        # Test BulkData's ability to split into multiple files,
-        # by forcing the file size to be really small.
-        server_stop()
-        server_start(bulkdata_args = { "file_size" : 920, # 23 rows per file
-                                       "files_per_dir" : 3 })
-
-        # Fill data
-        self.ok("create /newton/prep float32_8")
-        os.environ['TZ'] = "UTC"
-        with open("tests/data/prep-20120323T1004-timestamped") as input:
-            self.ok("insert --none /newton/prep", input)
-
-        # Extract it
-        self.ok("extract /newton/prep --start '2000-01-01' " +
-                "--end '2012-03-23 10:04:01'")
-        lines_(self.captured, 120)
-        self.ok("extract /newton/prep --start '2000-01-01' " +
-                "--end '2022-03-23 10:04:01'")
-        lines_(self.captured, 14400)
-
-        # Make sure there were lots of files generated in the database
-        # dir
-        nfiles = 0
-        for (dirpath, dirnames, filenames) in os.walk(testdb):
-            nfiles += len(filenames)
-        assert(nfiles > 500)
-
-        # Make sure we can restart the server with a different file
-        # size and have it still work
-        server_stop()
-        server_start()
-        self.ok("extract /newton/prep --start '2000-01-01' " +
-                "--end '2022-03-23 10:04:01'")
-        lines_(self.captured, 14400)
-
-        # Now recreate the data one more time and make sure there are
-        # fewer files.
-        self.ok("destroy /newton/prep")
-        self.fail("destroy /newton/prep") # already destroyed
-        self.ok("create /newton/prep float32_8")
-        os.environ['TZ'] = "UTC"
-        with open("tests/data/prep-20120323T1004-timestamped") as input:
-            self.ok("insert --none /newton/prep", input)
-        nfiles = 0
-        for (dirpath, dirnames, filenames) in os.walk(testdb):
-            nfiles += len(filenames)
-        lt_(nfiles, 50)
-        self.ok("destroy /newton/prep") # destroy again
-
-    def test_14_remove_files(self):
-        # Test BulkData's ability to remove when data is split into
-        # multiple files.  Should be a fairly comprehensive test of
-        # remove functionality.
-        server_stop()
-        server_start(bulkdata_args = { "file_size" : 920, # 23 rows per file
-                                       "files_per_dir" : 3 })
-
-        # Insert data.  Just for fun, insert out of order
-        self.ok("create /newton/prep PrepData")
-        os.environ['TZ'] = "UTC"
-        self.ok("insert --rate 120 /newton/prep "
-                "tests/data/prep-20120323T1002 "
-                "tests/data/prep-20120323T1000")
-
-        # Should take up about 2.8 MB here (including directory entries)
-        du_before = nilmdb.utils.diskusage.du_bytes(testdb)
-
-        # Make sure we have the data we expect
-        self.ok("list --detail")
-        self.match("/newton/prep PrepData\n" +
-                   " [ Fri, 23 Mar 2012 10:00:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:01:59.991668 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:02:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:03:59.991668 +0000 ]\n")
-
-        # Remove various chunks of prep data and make sure
-        # they're gone.
-        self.ok("extract -c /newton/prep --start 2000-01-01 --end 2020-01-01")
-        self.match("28800\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:30' " +
-                "--end '23 Mar 2012 10:03:30'")
-        self.match("21600\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:10' " +
-                "--end '23 Mar 2012 10:00:20'")
-        self.match("1200\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:00:05' " +
-                "--end '23 Mar 2012 10:00:25'")
-        self.match("1200\n")
-
-        self.ok("remove -c /newton/prep " +
-                "--start '23 Mar 2012 10:03:50' " +
-                "--end '23 Mar 2012 10:06:50'")
-        self.match("1200\n")
-
-        self.ok("extract -c /newton/prep --start 2000-01-01 --end 2020-01-01")
-        self.match("3600\n")
-
-        # See the missing chunks in list output
-        self.ok("list --detail")
-        self.match("/newton/prep PrepData\n" +
-                   " [ Fri, 23 Mar 2012 10:00:00.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:00:05.000000 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:00:25.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:00:30.000000 +0000 ]\n"
-                   " [ Fri, 23 Mar 2012 10:03:30.000000 +0000"
-                   " -> Fri, 23 Mar 2012 10:03:50.000000 +0000 ]\n")
-
-        # We have 1/8 of the data that we had before, so the file size
-        # should have dropped below 1/4 of what it used to be
-        du_after = nilmdb.utils.diskusage.du_bytes(testdb)
-        lt_(du_after, (du_before / 4))
-
-        # Remove anything that came from the 10:02 data file
-        self.ok("remove /newton/prep " +
-                "--start '23 Mar 2012 10:02:00' --end '2020-01-01'")
-
-        # Re-insert 19 lines from that file, then remove them again.
-        # With the specific file_size above, this will cause the last
-        # file in the bulk data storage to be exactly file_size large,
-        # so removing the data should also remove that last file.
-        self.ok("insert --rate 120 /newton/prep " +
-                "tests/data/prep-20120323T1002-first19lines")
-        self.ok("remove /newton/prep " +
-                "--start '23 Mar 2012 10:02:00' --end '2020-01-01'")
-
-        # Shut down and restart server, to force nrows to get refreshed.
-        server_stop()
-        server_start()
-
-        # Re-add the full 10:02 data file.  This tests adding new data once
-        # we removed data near the end.
-        self.ok("insert --rate 120 /newton/prep tests/data/prep-20120323T1002")
-
-        # See if we can extract it all
-        self.ok("extract /newton/prep --start 2000-01-01 --end 2020-01-01")
-        lines_(self.captured, 15600)
@@ -12,10 +12,6 @@ def eq_(a, b):
     if not a == b:
         raise AssertionError("%s != %s" % (myrepr(a), myrepr(b)))
 
-def lt_(a, b):
-    if not a < b:
-        raise AssertionError("%s is not less than %s" % (myrepr(a), myrepr(b)))
-
 def in_(a, b):
     if a not in b:
         raise AssertionError("%s not in %s" % (myrepr(a), myrepr(b)))
@@ -24,14 +20,6 @@ def ne_(a, b):
     if not a != b:
         raise AssertionError("unexpected %s == %s" % (myrepr(a), myrepr(b)))
 
-def lines_(a, n):
-    l = a.count('\n')
-    if not l == n:
-        if len(a) > 5000:
-            a = a[0:5000] + " ... truncated"
-        raise AssertionError("wanted %d lines, got %d in output: '%s'"
-                             % (n, l, a))
-
 def recursive_unlink(path):
     try:
         shutil.rmtree(path)
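The `lines_` helper removed in this hunk is small enough to reproduce standalone. This is the same logic shown in the removed lines, made runnable on its own; it asserts that a captured output string contains exactly `n` newlines and truncates very long output in the failure message:

```python
# Standalone copy of the lines_ helper from the hunk above.
def lines_(a, n):
    l = a.count('\n')
    if not l == n:
        if len(a) > 5000:
            a = a[0:5000] + " ... truncated"
        raise AssertionError("wanted %d lines, got %d in output: '%s'"
                             % (n, l, a))
```

A matching count passes silently; a mismatch raises `AssertionError`, e.g. `lines_("a\nb\n", 2)` succeeds while `lines_("a\n", 2)` raises.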
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 import datetime_tz
 
 from nose.tools import *
@@ -10,24 +10,16 @@ import itertools
 
 from nilmdb.interval import Interval, DBInterval, IntervalSet, IntervalError
 
-from testutil.helpers import *
+from test_helpers import *
 import unittest
 
-# set to False to skip live renders
-do_live_renders = False
-def render(iset, description = "", live = True):
-    import testutil.renderdot as renderdot
-    r = renderdot.RBTreeRenderer(iset.tree)
-    return r.render(description, live and do_live_renders)
-
 def makeset(string):
     """Build an IntervalSet from a string, for testing purposes
 
     Each character is 1 second
     [ = interval start
-    | = interval end + next start
+    | = interval end + adjacent start
     ] = interval end
-    . = zero-width interval (identical start and end)
     anything else is ignored
     """
     iset = IntervalSet()
@@ -38,11 +30,9 @@ def makeset(string):
         elif (c == "|"):
            iset += Interval(start, day)
             start = day
-        elif (c == ")"):
+        elif (c == "]"):
             iset += Interval(start, day)
             del start
-        elif (c == "."):
-            iset += Interval(day, day)
     return iset
 
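The `makeset` string DSL documented above (one character per second, `[` starts an interval, `|` ends one and starts the next, `)`/`]` ends one) can be illustrated with a minimal standalone version. This sketch uses plain `(start, end)` tuples instead of nilmdb's `IntervalSet`, and follows the `)` end-marker variant from one side of the diff:

```python
# Minimal re-implementation of the makeset() DSL described above,
# returning a list of (start, end) tuples. Each character position
# stands for one second of time.
def makeset(string):
    intervals = []
    start = None
    for day, c in enumerate(string):
        if c == "[":
            start = day                   # open a new interval
        elif c == "|":
            intervals.append((start, day))  # close current, open next
            start = day
        elif c == ")":
            intervals.append((start, day))  # close current interval
            start = None
        # anything else is ignored, as in the original
    return intervals
```

For example, `makeset(" [--) ")` yields one interval from second 1 to second 4, and `makeset("[-|-)")` yields two adjacent intervals sharing the boundary at second 2.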
 class TestInterval:
@@ -78,24 +68,24 @@ class TestInterval:
         assert(Interval(d1, d3) < Interval(d2, d3))
         assert(Interval(d2, d2) > Interval(d1, d3))
         assert(Interval(d3, d3) == Interval(d3, d3))
-        #with assert_raises(TypeError): # was AttributeError, that's wrong
-        # x = (i == 123)
+        with assert_raises(AttributeError):
+            x = (i == 123)
 
         # subset
-        eq_(Interval(d1, d3).subset(d1, d2), Interval(d1, d2))
+        assert(Interval(d1, d3).subset(d1, d2) == Interval(d1, d2))
         with assert_raises(IntervalError):
             x = Interval(d2, d3).subset(d1, d2)
 
         # big integers and floats
         x = Interval(5000111222, 6000111222)
-        eq_(str(x), "[5000111222.0 -> 6000111222.0)")
+        eq_(str(x), "[5000111222.0 -> 6000111222.0]")
         x = Interval(123.45, 234.56)
-        eq_(str(x), "[123.45 -> 234.56)")
+        eq_(str(x), "[123.45 -> 234.56]")
 
         # misc
         i = Interval(d1, d2)
         eq_(repr(i), repr(eval(repr(i))))
-        eq_(str(i), "[1332561600.0 -> 1332648000.0)")
+        eq_(str(i), "[1332561600.0 -> 1332648000.0]")
 
     def test_interval_intersect(self):
         # Test Interval intersections
@@ -116,7 +106,7 @@ class TestInterval:
         except IntervalError:
             assert(i not in should_intersect[True] and
                    i not in should_intersect[False])
-        with assert_raises(TypeError):
+        with assert_raises(AttributeError):
             x = i1.intersects(1234)
 
     def test_intervalset_construct(self):
@@ -137,15 +127,6 @@ class TestInterval:
             x = iseta != 3
         ne_(IntervalSet(a), IntervalSet(b))
 
-        # Note that assignment makes a new reference (not a copy)
-        isetd = IntervalSet(isetb)
-        isete = isetd
-        eq_(isetd, isetb)
-        eq_(isetd, isete)
-        isetd -= a
-        ne_(isetd, isetb)
-        eq_(isetd, isete)
-
         # test iterator
         for interval in iseta:
             pass
@@ -167,18 +148,11 @@ class TestInterval:
         iset = IntervalSet(a)
         iset += IntervalSet(b)
         eq_(iset, IntervalSet([a, b]))
 
         iset = IntervalSet(a)
         iset += b
         eq_(iset, IntervalSet([a, b]))
 
-        iset = IntervalSet(a)
-        iset.iadd_nocheck(b)
-        eq_(iset, IntervalSet([a, b]))
-
         iset = IntervalSet(a) + IntervalSet(b)
         eq_(iset, IntervalSet([a, b]))
 
         iset = IntervalSet(b) + a
         eq_(iset, IntervalSet([a, b]))
 
@@ -191,81 +165,54 @@ class TestInterval:
 
         # misc
         eq_(repr(iset), repr(eval(repr(iset))))
-        eq_(str(iset), "[[100.0 -> 200.0), [200.0 -> 300.0)]")
+        eq_(str(iset), "[[100.0 -> 200.0], [200.0 -> 300.0]]")
 
     def test_intervalset_geniset(self):
         # Test basic iset construction
-        eq_(makeset(" [----) "),
-            makeset(" [-|--) "))
+        assert(makeset(" [----] ") ==
+               makeset(" [-|--] "))
 
-        eq_(makeset("[) [--) ") +
-            makeset(" [) [--)"),
-            makeset("[|) [-----)"))
+        assert(makeset("[] [--] ") +
+               makeset(" [] [--]") ==
+               makeset("[|] [-----]"))
 
-        eq_(makeset(" [-------)"),
+        assert(makeset(" [-------]") ==
             makeset(" [-|-----|"))
 
 
     def test_intervalset_intersect(self):
         # Test intersection (&)
-        with assert_raises(TypeError): # was AttributeError
-            x = makeset("[--)") & 1234
+        with assert_raises(AttributeError):
+            x = makeset("[--]") & 1234
 
-        # Intersection with interval
-        eq_(makeset("[---|---)[)") &
-            list(makeset(" [------) "))[0],
-            makeset(" [-----) "))
+        assert(makeset("[---------]") &
+               makeset(" [---] ") ==
+               makeset(" [---] "))
 
-        # Intersection with sets
-        eq_(makeset("[---------)") &
-            makeset(" [---) "),
-            makeset(" [---) "))
+        assert(makeset(" [---] ") &
+               makeset("[---------]") ==
+               makeset(" [---] "))
 
-        eq_(makeset(" [---) ") &
+        assert(makeset(" [-----]") &
|
||||||
makeset("[---------)"),
|
makeset(" [-----] ") ==
|
||||||
makeset(" [---) "))
|
makeset(" [--] "))
|
||||||
|
|
||||||
eq_(makeset(" [-----)") &
|
assert(makeset(" [---]") &
|
||||||
makeset(" [-----) "),
|
makeset(" [--] ") ==
|
||||||
makeset(" [--) "))
|
|
||||||
|
|
||||||
eq_(makeset(" [--) [--)") &
|
|
||||||
makeset(" [------) "),
|
|
||||||
makeset(" [-) [-) "))
|
|
||||||
|
|
||||||
eq_(makeset(" [---)") &
|
|
||||||
makeset(" [--) "),
|
|
||||||
makeset(" "))
|
makeset(" "))
|
||||||
|
|
||||||
eq_(makeset(" [-|---)") &
|
assert(makeset(" [-|---]") &
|
||||||
makeset(" [-----|-) "),
|
makeset(" [-----|-] ") ==
|
||||||
makeset(" [----) "))
|
makeset(" [----] "))
|
||||||
|
|
||||||
eq_(makeset(" [-|-) ") &
|
assert(makeset(" [-|-] ") &
|
||||||
makeset(" [-|--|--) "),
|
makeset(" [-|--|--] ") ==
|
||||||
makeset(" [---) "))
|
makeset(" [---] "))
|
||||||
|
|
||||||
# Border cases -- will give different results if intervals are
|
assert(makeset(" [----][--]") &
|
||||||
# half open or fully closed. Right now, they are half open,
|
makeset("[-] [--] []") ==
|
||||||
# although that's a little messy since the database intervals
|
makeset(" [] [-] []"))
|
||||||
# often contain a data point at the endpoint.
|
|
||||||
half_open = True
|
|
||||||
if half_open:
|
|
||||||
eq_(makeset(" [---)") &
|
|
||||||
makeset(" [----) "),
|
|
||||||
makeset(" "))
|
|
||||||
eq_(makeset(" [----)[--)") &
|
|
||||||
makeset("[-) [--) [)"),
|
|
||||||
makeset(" [) [-) [)"))
|
|
||||||
else:
|
|
||||||
eq_(makeset(" [---)") &
|
|
||||||
makeset(" [----) "),
|
|
||||||
makeset(" . "))
|
|
||||||
eq_(makeset(" [----)[--)") &
|
|
||||||
makeset("[-) [--) [)"),
|
|
||||||
makeset(" [) [-). [)"))
|
|
||||||
|
|
||||||
class TestIntervalDB:
|
|
||||||
def test_dbinterval(self):
|
def test_dbinterval(self):
|
||||||
# Test DBInterval class
|
# Test DBInterval class
|
||||||
i = DBInterval(100, 200, 100, 200, 10000, 20000)
|
i = DBInterval(100, 200, 100, 200, 10000, 20000)
|
||||||
@@ -308,65 +255,66 @@ class TestIntervalDB:
|
|||||||
for i in IntervalSet(iseta.intersection(Interval(125,250))):
|
for i in IntervalSet(iseta.intersection(Interval(125,250))):
|
||||||
assert(isinstance(i, DBInterval))
|
assert(isinstance(i, DBInterval))
|
||||||
|
|
||||||
class TestIntervalTree:
|
class TestIntervalShape:
|
||||||
|
def test_interval_shape(self):
|
||||||
def test_interval_tree(self):
|
|
||||||
import random
|
import random
|
||||||
random.seed(1234)
|
random.seed(1234)
|
||||||
|
|
||||||
# make a set of 100 intervals
|
# make a set of 500 intervals
|
||||||
iset = IntervalSet()
|
iset = IntervalSet()
|
||||||
j = 100
|
j = 500
|
||||||
for i in random.sample(xrange(j),j):
|
for i in random.sample(xrange(j),j):
|
||||||
interval = Interval(i, i+1)
|
interval = Interval(i, i+1)
|
||||||
iset += interval
|
iset += interval
|
||||||
render(iset, "Random Insertion")
|
|
||||||
|
|
||||||
# remove about half of them
|
# Plot it
|
||||||
for i in random.sample(xrange(j),j):
|
import renderdot
|
||||||
if random.randint(0,1):
|
r = renderdot.Renderer(lambda node: node.cleft,
|
||||||
iset -= Interval(i, i+1)
|
lambda node: node.cright,
|
||||||
|
lambda node: False,
|
||||||
|
lambda node: node.start,
|
||||||
|
lambda node: node.end,
|
||||||
|
iset.tree.emptynode())
|
||||||
|
r.render_dot_live(iset.tree.rootnode(), "Random")
|
||||||
|
|
||||||
# try removing an interval that doesn't exist
|
# make a set of 500 intervals, inserted in order
|
||||||
with assert_raises(IntervalError):
|
|
||||||
iset -= Interval(1234,5678)
|
|
||||||
render(iset, "Random Insertion, deletion")
|
|
||||||
|
|
||||||
# make a set of 100 intervals, inserted in order
|
|
||||||
iset = IntervalSet()
|
iset = IntervalSet()
|
||||||
j = 100
|
j = 500
|
||||||
for i in xrange(j):
|
for i in xrange(j):
|
||||||
interval = Interval(i, i+1)
|
interval = Interval(i, i+1)
|
||||||
iset += interval
|
iset += interval
|
||||||
render(iset, "In-order insertion")
|
|
||||||
|
# Plot it
|
||||||
|
import renderdot
|
||||||
|
r = renderdot.Renderer(lambda node: node.cleft,
|
||||||
|
lambda node: node.cright,
|
||||||
|
lambda node: False,
|
||||||
|
lambda node: node.start,
|
||||||
|
lambda node: node.end,
|
||||||
|
iset.tree.emptynode())
|
||||||
|
r.render_dot_live(iset.tree.rootnode(), "In-order")
|
||||||
|
|
||||||
|
assert(False)
|
||||||
|
|
||||||
class TestIntervalSpeed:
|
class TestIntervalSpeed:
|
||||||
@unittest.skip("this is slow")
|
#@unittest.skip("this is slow")
|
||||||
def test_interval_speed(self):
|
def test_interval_speed(self):
|
||||||
import yappi
|
import yappi
|
||||||
import time
|
import time
|
||||||
import testutil.aplotter as aplotter
|
import aplotter
|
||||||
import random
|
|
||||||
import math
|
|
||||||
|
|
||||||
print
|
print
|
||||||
yappi.start()
|
yappi.start()
|
||||||
speeds = {}
|
speeds = {}
|
||||||
limit = 10 # was 20
|
for j in [ 2**x for x in range(5,22) ]:
|
||||||
for j in [ 2**x for x in range(5,limit) ]:
|
|
||||||
start = time.time()
|
start = time.time()
|
||||||
iset = IntervalSet()
|
iset = IntervalSet()
|
||||||
for i in random.sample(xrange(j),j):
|
for i in xrange(j):
|
||||||
interval = Interval(i, i+1)
|
interval = Interval(i, i+1)
|
||||||
iset += interval
|
iset += interval
|
||||||
speed = (time.time() - start) * 1000000.0
|
speed = (time.time() - start) * 1000000.0
|
||||||
printf("%d: %g μs (%g μs each, O(n log n) ratio %g)\n",
|
printf("%d: %g μs (%g μs each)\n", j, speed, speed/j)
|
||||||
j,
|
|
||||||
speed,
|
|
||||||
speed/j,
|
|
||||||
speed / (j*math.log(j))) # should be constant
|
|
||||||
speeds[j] = speed
|
speeds[j] = speed
|
||||||
aplotter.plot(speeds.keys(), speeds.values(), plot_slope=True)
|
aplotter.plot(speeds.keys(), speeds.values(), plot_slope=True)
|
||||||
yappi.stop()
|
yappi.stop()
|
||||||
yappi.print_stats(sort_type=yappi.SORTTYPE_TTOT, limit=10)
|
yappi.print_stats(sort_type=yappi.SORTTYPE_TTOT, limit=10)
|
||||||
|
|
||||||
|
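The border-case comments removed above turn on whether intervals are half-open or fully closed. A minimal standalone sketch of the half-open `[start, end)` semantics these tests exercise (hypothetical `Interval` class, not nilmdb's actual implementation):

```python
class Interval:
    """A half-open interval [start, end)."""
    def __init__(self, start, end):
        if start >= end:
            raise ValueError("start must be less than end")
        self.start = start
        self.end = end

    def intersects(self, other):
        # Strictly half-open: sharing only an endpoint is not an overlap.
        return self.start < other.end and other.start < self.end

a = Interval(100, 200)
b = Interval(200, 300)   # touches 'a' only at t=200
c = Interval(150, 250)
print(a.intersects(b))   # False: the endpoints touch but do not overlap
print(a.intersects(c))   # True
```

This is the property the "border cases" above check: with half-open intervals, `[x, y)` and `[y, z)` are adjacent but disjoint.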
@@ -1,5 +1,5 @@
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 import nose
 from nose.tools import *
@@ -7,7 +7,9 @@ from nose.tools import assert_raises
 import threading
 import time
 
-from testutil.helpers import *
+from test_helpers import *
+
+import nilmdb.iteratorizer
 
 def func_with_callback(a, b, callback):
     callback(a)
@@ -25,8 +27,7 @@ class TestIteratorizer(object):
         eq_(self.result, "123")
 
         # Now make it an iterator
-        it = nilmdb.utils.Iteratorizer(
-            lambda x:
+        it = nilmdb.iteratorizer.Iteratorizer(lambda x:
             func_with_callback(1, 2, x))
         result = ""
         for i in it:
@@ -34,8 +35,7 @@ class TestIteratorizer(object):
         eq_(result, "123")
 
         # Make sure things work when an exception occurs
-        it = nilmdb.utils.Iteratorizer(
-            lambda x:
+        it = nilmdb.iteratorizer.Iteratorizer(lambda x:
             func_with_callback(1, "a", x))
         result = ""
         with assert_raises(TypeError) as e:
@@ -48,8 +48,7 @@ class TestIteratorizer(object):
         # itself.  This doesn't have a particular result in the test,
         # but gains coverage.
         def foo():
-            it = nilmdb.utils.Iteratorizer(
-                lambda x:
+            it = nilmdb.iteratorizer.Iteratorizer(lambda x:
                 func_with_callback(1, 2, x))
             it.next()
         foo()
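The test above drives `Iteratorizer`, which adapts a callback-style function into an iterator. A minimal thread-and-queue sketch of the idea (an illustration only, not nilmdb's implementation; the real class also propagates exceptions such as the `TypeError` checked above, which this sketch omits):

```python
import threading
import queue

_DONE = object()   # sentinel marking end of the callback stream

class Iteratorizer:
    """Run a callback-style function in a worker thread and expose
    each callback argument through an iterator."""
    def __init__(self, func):
        # func takes a single 'callback' argument and calls it repeatedly
        self._queue = queue.Queue()
        def run():
            try:
                func(self._queue.put)
            finally:
                self._queue.put(_DONE)
        self._thread = threading.Thread(target=run, daemon=True)
        self._thread.start()

    def __iter__(self):
        while True:
            item = self._queue.get()
            if item is _DONE:
                break
            yield item

def func_with_callback(a, b, callback):
    callback(a)
    callback(b)
    callback(a + b)

result = "".join(str(i) for i in Iteratorizer(
    lambda cb: func_with_callback(1, 2, cb)))
print(result)   # 123
```

The queue decouples the producer (callback calls) from the consumer (the `for` loop), which is why the adapter needs a dedicated thread.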
@@ -2,7 +2,7 @@
 
 import nilmdb
 
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 from nose.tools import *
 from nose.tools import assert_raises
@@ -20,7 +20,7 @@ import cStringIO
 import random
 import unittest
 
-from testutil.helpers import *
+from test_helpers import *
 
 from nilmdb.layout import *
 
@@ -28,13 +28,9 @@ class TestLayouts(object):
     # Some nilmdb.layout tests.  Not complete, just fills in missing
     # coverage.
     def test_layouts(self):
-        x = nilmdb.layout.get_named("PrepData")
-        y = nilmdb.layout.get_named("float32_8")
-        eq_(x.count, y.count)
-        eq_(x.datatype, y.datatype)
-        y = nilmdb.layout.get_named("float32_7")
-        ne_(x.count, y.count)
-        eq_(x.datatype, y.datatype)
+        x = nilmdb.layout.get_named("PrepData").description()
+        y = nilmdb.layout.get_named("float32_8").description()
+        eq_(repr(x), repr(y))
 
     def test_parsing(self):
         self.real_t_parsing("PrepData", "RawData", "RawNotchedData")
@@ -1,83 +0,0 @@
-import nilmdb
-from nilmdb.utils.printf import *
-
-import nose
-from nose.tools import *
-from nose.tools import assert_raises
-import threading
-import time
-import inspect
-
-from testutil.helpers import *
-
-@nilmdb.utils.lru_cache(size = 3)
-def foo1(n):
-    return n
-
-@nilmdb.utils.lru_cache(size = 5)
-def foo2(n):
-    return n
-
-def foo3d(n):
-    foo3d.destructed.append(n)
-foo3d.destructed = []
-@nilmdb.utils.lru_cache(size = 3, onremove = foo3d)
-def foo3(n):
-    return n
-
-class Foo:
-    def __init__(self):
-        self.calls = 0
-    @nilmdb.utils.lru_cache(size = 3, keys = slice(1, 2))
-    def foo(self, n, **kwargs):
-        self.calls += 1
-
-class TestLRUCache(object):
-    def test(self):
-
-        [ foo1(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo1.cache_info(), (6, 3))
-        [ foo1(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo1.cache_info(), (15, 3))
-        [ foo1(n) for n in [ 4, 2, 1, 1, 4 ] ]
-        eq_(foo1.cache_info(), (18, 5))
-
-        [ foo2(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo2.cache_info(), (6, 3))
-        [ foo2(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo2.cache_info(), (15, 3))
-        [ foo2(n) for n in [ 4, 2, 1, 1, 4 ] ]
-        eq_(foo2.cache_info(), (19, 4))
-
-        [ foo3(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo3.cache_info(), (6, 3))
-        [ foo3(n) for n in [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ] ]
-        eq_(foo3.cache_info(), (15, 3))
-        [ foo3(n) for n in [ 4, 2, 1, 1, 4 ] ]
-        eq_(foo3.cache_info(), (18, 5))
-        eq_(foo3d.destructed, [1, 3])
-        with assert_raises(KeyError):
-            foo3.cache_remove(1,2,3)
-        foo3.cache_remove(1)
-        eq_(foo3d.destructed, [1, 3, 1])
-        foo3.cache_remove_all()
-        eq_(foo3d.destructed, [1, 3, 1, 2, 4 ])
-
-        foo = Foo()
-        foo.foo(5)
-        foo.foo(6)
-        foo.foo(7)
-        foo.foo(5)
-        eq_(foo.calls, 3)
-
-        # Can't handle keyword arguments right now
-        with assert_raises(NotImplementedError):
-            foo.foo(3, asdf = 7)
-
-        # Verify that argspecs were maintained
-        eq_(inspect.getargspec(foo1),
-            inspect.ArgSpec(args=['n'],
-                            varargs=None, keywords=None, defaults=None))
-        eq_(inspect.getargspec(foo.foo),
-            inspect.ArgSpec(args=['self', 'n'],
-                            varargs=None, keywords="kwargs", defaults=None))
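The deleted test above exercised an LRU cache decorator whose `cache_info()` returns a `(hits, misses)` tuple and which can call an `onremove` hook on eviction. A rough stand-in showing the bookkeeping (hypothetical, not `nilmdb.utils.lru_cache`; it skips the `keys` slicing and argspec preservation the real one tests):

```python
from collections import OrderedDict
import functools

def lru_cache(size=3, onremove=None):
    def decorator(func):
        cache = OrderedDict()          # insertion order == recency order
        stats = {"hits": 0, "misses": 0}
        @functools.wraps(func)
        def wrapper(*args):
            if args in cache:
                stats["hits"] += 1
                cache.move_to_end(args)        # mark as most recently used
                return cache[args]
            stats["misses"] += 1
            if len(cache) >= size:
                _, evicted = cache.popitem(last=False)   # evict LRU entry
                if onremove:
                    onremove(evicted)
            result = cache[args] = func(*args)
            return result
        wrapper.cache_info = lambda: (stats["hits"], stats["misses"])
        return wrapper
    return decorator

@lru_cache(size=3)
def foo(n):
    return n

[foo(n) for n in [1, 2, 3, 1, 2, 3, 1, 2, 3]]
print(foo.cache_info())   # (6, 3)
```

Nine calls over three distinct arguments with a size-3 cache give three misses and six hits, matching the first `eq_(foo1.cache_info(), (6, 3))` check in the removed file.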
@@ -1,110 +0,0 @@
-import nilmdb
-from nilmdb.utils.printf import *
-
-import nose
-from nose.tools import *
-from nose.tools import assert_raises
-
-from testutil.helpers import *
-
-import sys
-import cStringIO
-import gc
-
-import inspect
-
-err = cStringIO.StringIO()
-
-@nilmdb.utils.must_close(errorfile = err)
-class Foo:
-    def __init__(self, arg):
-        fprintf(err, "Init %s\n", arg)
-
-    def __del__(self):
-        fprintf(err, "Deleting\n")
-
-    def close(self):
-        fprintf(err, "Closing\n")
-
-@nilmdb.utils.must_close(errorfile = err, wrap_verify = True)
-class Bar:
-    def __init__(self):
-        fprintf(err, "Init\n")
-
-    def __del__(self):
-        fprintf(err, "Deleting\n")
-
-    def close(self):
-        fprintf(err, "Closing\n")
-
-    def blah(self, arg):
-        fprintf(err, "Blah %s\n", arg)
-
-@nilmdb.utils.must_close(errorfile = err)
-class Baz:
-    pass
-
-class TestMustClose(object):
-    def test(self):
-
-        # Note: this test might fail if the Python interpreter doesn't
-        # garbage collect the object (and call its __del__ function)
-        # right after a "del x".
-
-        # Trigger error
-        err.truncate()
-        x = Foo("hi")
-        # Verify that the arg spec was maintained
-        eq_(inspect.getargspec(x.__init__),
-            inspect.ArgSpec(args = ['self', 'arg'],
-                            varargs = None, keywords = None, defaults = None))
-        del x
-        gc.collect()
-        eq_(err.getvalue(),
-            "Init hi\n"
-            "error: Foo.close() wasn't called!\n"
-            "Deleting\n")
-
-        # No error
-        err.truncate(0)
-        y = Foo("bye")
-        y.close()
-        del y
-        gc.collect()
-        eq_(err.getvalue(),
-            "Init bye\n"
-            "Closing\n"
-            "Deleting\n")
-
-        # Verify function calls when wrap_verify is True
-        err.truncate(0)
-        z = Bar()
-        eq_(inspect.getargspec(z.blah),
-            inspect.ArgSpec(args = ['self', 'arg'],
-                            varargs = None, keywords = None, defaults = None))
-        z.blah("boo")
-        z.close()
-        with assert_raises(AssertionError) as e:
-            z.blah("hello")
-        in_("called <function blah at 0x", str(e.exception))
-        in_("> after close", str(e.exception))
-        # Since the most recent assertion references 'z',
-        # we need to raise another assertion here so that
-        # 'z' will get properly deleted.
-        with assert_raises(AssertionError):
-            raise AssertionError()
-        del z
-        gc.collect()
-        eq_(err.getvalue(),
-            "Init\n"
-            "Blah boo\n"
-            "Closing\n"
-            "Deleting\n")
-
-        # Class with missing methods
-        err.truncate(0)
-        w = Baz()
-        w.close()
-        del w
-        eq_(err.getvalue(), "")
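The deleted test above verified a `must_close` class decorator that complains at destruction time if `close()` was never called. A condensed sketch of that behavior (a hypothetical stand-in for `nilmdb.utils.must_close`; it omits the `wrap_verify` and argspec-preservation features the real test covers, and like the original it relies on CPython's prompt refcount-based destruction):

```python
import gc
import io
import sys

def must_close(errorfile=sys.stderr):
    def decorator(cls):
        orig_init = cls.__init__
        orig_close = cls.close
        orig_del = getattr(cls, "__del__", lambda self: None)
        def __init__(self, *args, **kwargs):
            self._closed = False
            orig_init(self, *args, **kwargs)
        def close(self, *args, **kwargs):
            self._closed = True
            return orig_close(self, *args, **kwargs)
        def __del__(self):
            # Warn if the object is destroyed without being closed first.
            if not self._closed:
                errorfile.write("error: %s.close() wasn't called!\n"
                                % cls.__name__)
            orig_del(self)
        cls.__init__, cls.close, cls.__del__ = __init__, close, __del__
        return cls
    return decorator

err = io.StringIO()

@must_close(errorfile=err)
class Foo:
    def close(self):
        pass

x = Foo()
del x          # destroyed without close(): triggers the warning
gc.collect()
print("close() wasn't called" in err.getvalue())   # True
```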
@@ -14,7 +14,6 @@ import urllib2
 from urllib2 import urlopen, HTTPError
 import Queue
 import cStringIO
-import time
 
 testdb = "tests/testdb"
 
@@ -22,7 +21,7 @@ testdb = "tests/testdb"
 #def cleanup():
 #    os.unlink(testdb)
 
-from testutil.helpers import *
+from test_helpers import *
 
 class Test00Nilmdb(object):  # named 00 so it runs first
     def test_NilmDB(self):
@@ -40,8 +39,8 @@ class Test00Nilmdb(object):  # named 00 so it runs first
         capture = cStringIO.StringIO()
         old = sys.stdout
         sys.stdout = capture
-        with nilmdb.utils.Timer("test"):
-            time.sleep(0.01)
+        with nilmdb.Timer("test"):
+            nilmdb.timer.time.sleep(0.01)
         sys.stdout = old
         in_("test: ", capture.getvalue())
 
@@ -70,14 +69,12 @@ class Test00Nilmdb(object):  # named 00 so it runs first
         eq_(db.stream_list(layout="RawData"), [ ["/newton/raw", "RawData"] ])
         eq_(db.stream_list(path="/newton/raw"), [ ["/newton/raw", "RawData"] ])
 
-        # Verify that columns were made right (pytables specific)
-        if "h5file" in db.data.__dict__:
-            h5file = db.data.h5file
-            eq_(len(h5file.getNode("/newton/prep").cols), 9)
-            eq_(len(h5file.getNode("/newton/raw").cols), 7)
-            eq_(len(h5file.getNode("/newton/zzz/rawnotch").cols), 10)
-            assert(not h5file.getNode("/newton/prep").colindexed["timestamp"])
-            assert(not h5file.getNode("/newton/prep").colindexed["c1"])
+        # Verify that columns were made right
+        eq_(len(db.h5file.getNode("/newton/prep").cols), 9)
+        eq_(len(db.h5file.getNode("/newton/raw").cols), 7)
+        eq_(len(db.h5file.getNode("/newton/zzz/rawnotch").cols), 10)
+        assert(not db.h5file.getNode("/newton/prep").colindexed["timestamp"])
+        assert(not db.h5file.getNode("/newton/prep").colindexed["c1"])
 
         # Set / get metadata
         eq_(db.stream_get_metadata("/newton/prep"), {})
@@ -199,6 +196,6 @@ class TestServer(object):
         # GET instead of POST (no body)
         # (actual POST test is done by client code)
         with assert_raises(HTTPError) as e:
-            getjson("/stream/insert?path=/newton/prep&start=0&end=0")
+            getjson("/stream/insert?path=/newton/prep")
         eq_(e.exception.code, 400)
@@ -1,12 +1,12 @@
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 from nose.tools import *
 from nose.tools import assert_raises
 from cStringIO import StringIO
 import sys
 
-from testutil.helpers import *
+from test_helpers import *
 
 class TestPrintf(object):
     def test_printf(self):
@@ -1,159 +0,0 @@
-# -*- coding: utf-8 -*-
-
-import nilmdb
-from nilmdb.utils.printf import *
-
-from nose.tools import *
-from nose.tools import assert_raises
-
-from nilmdb.rbtree import RBTree, RBNode
-
-from testutil.helpers import *
-import unittest
-
-# set to False to skip live renders
-do_live_renders = False
-def render(tree, description = "", live = True):
-    import testutil.renderdot as renderdot
-    r = renderdot.RBTreeRenderer(tree)
-    return r.render(description, live and do_live_renders)
-
-class TestRBTree:
-    def test_rbtree(self):
-        rb = RBTree()
-        rb.insert(RBNode(10000, 10001))
-        rb.insert(RBNode(10004, 10007))
-        rb.insert(RBNode(10001, 10002))
-        # There was a typo that gave the RBTree a loop in this case.
-        # Verify that the dot isn't too big.
-        s = render(rb, live = False)
-        assert(len(s.splitlines()) < 30)
-
-    def test_rbtree_big(self):
-        import random
-        random.seed(1234)
-
-        # make a set of 100 intervals, inserted in order
-        rb = RBTree()
-        j = 100
-        for i in xrange(j):
-            rb.insert(RBNode(i, i+1))
-        render(rb, "in-order insert")
-
-        # remove about half of them
-        for i in random.sample(xrange(j),j):
-            if random.randint(0,1):
-                rb.delete(rb.find(i, i+1))
-        render(rb, "in-order insert, random delete")
-
-        # make a set of 100 intervals, inserted at random
-        rb = RBTree()
-        j = 100
-        for i in random.sample(xrange(j),j):
-            rb.insert(RBNode(i, i+1))
-        render(rb, "random insert")
-
-        # remove about half of them
-        for i in random.sample(xrange(j),j):
-            if random.randint(0,1):
-                rb.delete(rb.find(i, i+1))
-        render(rb, "random insert, random delete")
-
-        # in-order insert of 50 more
-        for i in xrange(50):
-            rb.insert(RBNode(i+500, i+501))
-        render(rb, "random insert, random delete, in-order insert")
-
-    def test_rbtree_basics(self):
-        rb = RBTree()
-        vals = [ 7, 14, 1, 2, 8, 11, 5, 15, 4]
-        for n in vals:
-            rb.insert(RBNode(n, n))
-
-        # stringify
-        s = ""
-        for node in rb:
-            s += str(node)
-        in_("[node (None) 1", s)
-        eq_(str(rb.nil), "[node nil]")
-
-        # inorder traversal, successor and predecessor
-        last = 0
-        for node in rb:
-            assert(node.start > last)
-            last = node.start
-            successor = rb.successor(node)
-            if successor:
-                assert(rb.predecessor(successor) is node)
-            predecessor = rb.predecessor(node)
-            if predecessor:
-                assert(rb.successor(predecessor) is node)
-
-        # Delete node not in the tree
-        with assert_raises(AttributeError):
-            rb.delete(RBNode(1,2))
-
-        # Delete all nodes!
-        for node in rb:
-            rb.delete(node)
-
-        # Build it up again, make sure it matches
-        for n in vals:
-            rb.insert(RBNode(n, n))
-        s2 = ""
-        for node in rb:
-            s2 += str(node)
-        assert(s == s2)
-
-    def test_rbtree_find(self):
-        # Get a little bit of coverage for some overlapping cases,
-        # even though the class doesn't fully support it.
-        rb = RBTree()
-        nodes = [ RBNode(1, 5), RBNode(1, 10), RBNode(1, 15) ]
-        for n in nodes:
-            rb.insert(n)
-        assert(rb.find(1, 5) is nodes[0])
-        assert(rb.find(1, 10) is nodes[1])
-        assert(rb.find(1, 15) is nodes[2])
-
-    def test_rbtree_find_leftright(self):
-        # Now let's get some ranges in there
-        rb = RBTree()
-        vals = [ 7, 14, 1, 2, 8, 11, 5, 15, 4]
-        for n in vals:
-            rb.insert(RBNode(n*10, n*10+5))
-
-        # Check find_end_left, find_right_start
-        for i in range(160):
-            left = rb.find_left_end(i)
-            right = rb.find_right_start(i)
-            if left:
-                # endpoint should be more than i
-                assert(left.end >= i)
-                # all earlier nodes should have a lower endpoint
-                for node in rb:
-                    if node is left:
-                        break
-                    assert(node.end < i)
-            if right:
-                # startpoint should be less than i
-                assert(right.start <= i)
-                # all later nodes should have a higher startpoint
-                for node in reversed(list(rb)):
-                    if node is right:
-                        break
-                    assert(node.start > i)
-
-    def test_rbtree_intersect(self):
-        # Fill with some ranges
-        rb = RBTree()
-        rb.insert(RBNode(10,20))
-        rb.insert(RBNode(20,25))
-        rb.insert(RBNode(30,40))
-        # Just a quick test; test_interval will do better.
-        eq_(len(list(rb.intersect(1,100))), 3)
-        eq_(len(list(rb.intersect(10,20))), 1)
-        eq_(len(list(rb.intersect(5,15))), 1)
-        eq_(len(list(rb.intersect(15,15))), 1)
-        eq_(len(list(rb.intersect(20,21))), 1)
-        eq_(len(list(rb.intersect(19,21))), 2)
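The `intersect()` checks at the end of the deleted file follow half-open overlap semantics: `[10,20)` and `[20,25)` do not intersect each other, but a zero-width query like `(15,15)` still hits the range containing that point. A linear-scan sketch reproducing those counts (a hypothetical helper, not the removed `RBTree.intersect`, which does the same query in O(log n) per result):

```python
def intersect(ranges, start, end):
    """Yield the half-open [s, e) ranges overlapping the query [start, end)."""
    for (s, e) in ranges:
        # Overlap iff each range starts before the other ends.
        if s < end and e > start:
            yield (s, e)

ranges = [(10, 20), (20, 25), (30, 40)]
print(len(list(intersect(ranges, 1, 100))))   # 3
print(len(list(intersect(ranges, 10, 20))))   # 1: (20,25) only touches at 20
print(len(list(intersect(ranges, 19, 21))))   # 2
```

The single comparison `s < end and e > start` covers every case in the removed test, including the point query `(15, 15)`.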
@@ -1,5 +1,5 @@
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 import nose
 from nose.tools import *
@@ -7,7 +7,7 @@ from nose.tools import assert_raises
 import threading
 import time
 
-from testutil.helpers import *
+from test_helpers import *
 
 #raise nose.exc.SkipTest("Skip these")
 
@@ -57,7 +57,7 @@ class TestUnserialized(Base):
 class TestSerialized(Base):
     def setUp(self):
         self.realfoo = Foo()
-        self.foo = nilmdb.utils.Serializer(self.realfoo)
+        self.foo = nilmdb.serializer.WrapObject(self.realfoo)
 
     def tearDown(self):
         del self.foo
@@ -1,5 +1,5 @@
 import nilmdb
-from nilmdb.utils.printf import *
+from nilmdb.printf import *
 
 import datetime_tz
 
@@ -9,7 +9,7 @@ import os
 import sys
 import cStringIO
 
-from testutil.helpers import *
+from test_helpers import *
 
 class TestTimestamper(object):
 
@@ -1 +0,0 @@
-# empty
@@ -1,54 +0,0 @@
-nosetests
-
-32: 386 μs (12.0625 μs each)
-64: 672.102 μs (10.5016 μs each)
-128: 1510.86 μs (11.8036 μs each)
-256: 2782.11 μs (10.8676 μs each)
-512: 5591.87 μs (10.9216 μs each)
-1024: 12812.1 μs (12.5119 μs each)
-2048: 21835.1 μs (10.6617 μs each)
-4096: 46059.1 μs (11.2449 μs each)
-8192: 114127 μs (13.9315 μs each)
-16384: 181217 μs (11.0606 μs each)
-32768: 419649 μs (12.8067 μs each)
-65536: 804320 μs (12.2729 μs each)
-131072: 1.73534e+06 μs (13.2396 μs each)
-262144: 3.74451e+06 μs (14.2842 μs each)
-524288: 8.8694e+06 μs (16.917 μs each)
-1048576: 1.69993e+07 μs (16.2118 μs each)
-2097152: 3.29387e+07 μs (15.7064 μs each)
-
-[ASCII plot of total insertion time vs. set size, from 386 μs at n=32 up to 3.29387e+07 μs at n=2097152]
-
-name                                                          #n       tsub      ttot      tavg
-..vl/lees/bucket/nilm/nilmdb/nilmdb/interval.py.__iadd__:184  4194272  10.025323  30.262723  0.000007
-..evl/lees/bucket/nilm/nilmdb/nilmdb/interval.py.__init__:27  4194272  24.715377  24.715377  0.000006
-../lees/bucket/nilm/nilmdb/nilmdb/interval.py.intersects:239  4194272   6.705053  12.577620  0.000003
-..im/devl/lees/bucket/nilm/nilmdb/tests/aplotter.py.plot:404        1   0.000048   0.001412  0.001412
-../lees/bucket/nilm/nilmdb/tests/aplotter.py.plot_double:311        1   0.000106   0.001346  0.001346
-..vl/lees/bucket/nilm/nilmdb/tests/aplotter.py.plot_data:201        1   0.000098   0.000672  0.000672
-..vl/lees/bucket/nilm/nilmdb/tests/aplotter.py.plot_line:241       16   0.000298   0.000496  0.000031
-..jim/devl/lees/bucket/nilm/nilmdb/nilmdb/printf.py.printf:4       17   0.000252   0.000334  0.000020
-..vl/lees/bucket/nilm/nilmdb/tests/aplotter.py.transposed:39        1   0.000229   0.000235  0.000235
-..vl/lees/bucket/nilm/nilmdb/tests/aplotter.py.y_reversed:45        1   0.000151   0.000174  0.000174
-
-name           tid              fname                                      ttot      scnt
-_MainThread    47269783682784   ..b/python2.7/threading.py.setprofile:88   64.746000    1
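The newer version of the speed test printed an O(n log n) ratio (`speed / (j * math.log(j))`) alongside the per-item cost. Recomputing both for a few of the recorded data points above (timing values copied from that deleted file) shows the per-item cost growing only slowly with n:

```python
import math

# A few (n, total μs) data points copied from the deleted timing file.
timings = {32: 386.0, 1024: 12812.1, 32768: 419649.0, 1048576: 1.69993e7}

for j in sorted(timings):
    speed = timings[j]
    per_item = speed / j                       # μs per insertion
    ratio = speed / (j * math.log(j))          # should flatten if O(n log n)
    print("%7d: %8.4f us each, n log n ratio %.4f" % (j, per_item, ratio))
```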
30
timeit.sh
30
timeit.sh
@@ -1,22 +1,20 @@
-./nilmtool.py destroy /bpnilm/2/raw
 ./nilmtool.py create /bpnilm/2/raw RawData
 
-if false; then
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 -r 8000 /bpnilm/2/raw
+if true; then
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-110000 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s 20110513-120001 /bpnilm/2/raw
 else
-# 170 hours, about 98 gigs uncompressed:
-for i in $(seq 2000 2016); do
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-010001 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-020002 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-030003 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-040004 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-050005 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-060006 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-070007 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-080008 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-090009 -r 8000 /bpnilm/2/raw
-time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-100010 -r 8000 /bpnilm/2/raw
+for i in $(seq 2000 2050); do
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-010001 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-020002 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-030003 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-040004 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-050005 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-060006 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-070007 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-080008 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-090009 /bpnilm/2/raw
+time zcat /home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz | ./nilmtool.py insert -s ${i}0101-100010 /bpnilm/2/raw
 done
 fi
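The ten per-hour insert lines in timeit.sh differ only in their `YYYY0101-hh00hh` timestamp. As an illustration (a hypothetical helper, not part of nilmdb or the script), the same command list can be generated programmatically, which makes the pattern easier to see and to extend:

```python
# Sketch: rebuild the per-hour insert commands that timeit.sh spells out by hand.
# The data path and stream path are taken from the script above.
DATA = "/home/jim/bpnilm-data/snapshot-1-20110513-110002.raw.gz"

def insert_commands(years=range(2000, 2051), hours=range(1, 11)):
    """Yield one shell command per (year, hour), matching timeit.sh's loop body."""
    for year in years:
        for h in hours:
            # e.g. year=2000, h=1 -> "20000101-010001"
            ts = "%d0101-%02d00%02d" % (year, h, h)
            yield ("zcat %s | ./nilmtool.py insert -s %s /bpnilm/2/raw"
                   % (DATA, ts))

if __name__ == "__main__":
    for cmd in list(insert_commands(years=[2000]))[:2]:
        print(cmd)
```

This only prints the commands; piping them to `sh` would reproduce the benchmark's else-branch.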
|