This gives an easy way to get large values in the database start_pos
and end_pos fields, which is necessary for testing failure modes when
those get too large (e.g. on 32-bit systems). Adjust tests to make
use of this knob.
We do this by creating a new requests.Session object for each request,
sending a "Connection: close" request header, and then explicitly
marking the connection for close after the response is read.
This is to avoid a longstanding race condition with HTTP keepalive
and server timeouts. Due to data processing, capture, etc., requests
may be separated by an arbitrary delay. If this delay is shorter
than the server's KeepAliveTimeout, the same connection is reused.
If the delay is longer, a new connection is used. If the delay is
right at the timeout, however, the request may be sent on the old
connection at the exact moment that the server closes it. Typically, the
client sees the connection as closing between the request and the
response, which leads to "httplib.BadStatusLine" errors.
This patch avoids the race condition entirely by not using persistent
connections.
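As a minimal sketch, the approach amounts to something like the
following (the helper name here is illustrative, not the actual client
code):

    import requests

    def get_once(url, **kwargs):
        # One Session per request: no connection pooling, so the server
        # never times out a connection that we are about to reuse.
        session = requests.Session()
        try:
            response = session.get(url, headers={"Connection": "close"},
                                    **kwargs)
            body = response.content    # read the entire response
            response.close()           # explicitly mark the connection for close
            return body
        finally:
            session.close()            # discard the session's connection pool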
Another solution would be to detect those errors and retry the
connection, resending the request. However, the race condition could
potentially show up in other places, like a connection closed during
the request body rather than after it. Such an error could also
indicate a legitimate network problem. Avoiding persistent connections
should be more reliable, and the overhead of a new connection per
request will hopefully be minimal for typical workloads.
We were still calculating the maximum number of rows correctly,
so the extra data was really extra and would get re-written to the
beginning of the subsequent file.
The only case in which this would lead to database issues is if the
very last file was lengthened incorrectly, and the "nrows" calculation
would therefore be wrong when the database was reopened. Still, even
in that case, it should just leave a small gap in the data, not cause
any errors.
This includes both client.stream_insert_numpy() and
client.stream_insert_numpy_context(). The test code is based on the
existing tests for client.stream_insert_context(), so coverage should
be fairly complete.
When enabled, lines like "# interval-start 1234567890123456" and "#
interval-end 1234567890123456" will be added to the data output. Note
that there may be an "interval-end" timestamp followed by an identical
"interval-start" timestamp, if the response at the nilmdb level was
split up into multiple chunks.
In general, assume contiguous data if previous_interval_end ==
new_interval_start.
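A rough sketch of how a consumer could use these markers, assuming
the integer timestamp format shown above (the function is
hypothetical, not part of nilmdb):

    def group_intervals(lines):
        # Group extracted lines into contiguous spans using the
        # "# interval-start" / "# interval-end" marker comments.
        spans = []              # list of (start, end, data_lines)
        start = end = None
        data = []
        for line in lines:
            if line.startswith("# interval-start "):
                ts = int(line.split()[-1])
                # An interval-end immediately followed by an identical
                # interval-start just means the response was split into
                # chunks; treat the data as contiguous.
                if end is not None and ts == end:
                    end = None
                    continue
                if data:
                    spans.append((start, end, data))
                    data = []
                start = ts
                end = None
            elif line.startswith("# interval-end "):
                end = int(line.split()[-1])
            else:
                data.append(line)
        if data:
            spans.append((start, end, data))
        return spans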
Added a new flag "-R" to the command line to perform an automatic removal.
This should be the last of the ways in which a single command could
block the nilmdb thread for a long time.
Enable the following pragmas: synchronous=NORMAL, journal_mode=WAL.
This offers a significant speedup to INSERT times compared to
synchronous=FULL, and is roughly as fast as synchronous=OFF
but should be a bit safer.
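For reference, the equivalent settings on a plain sqlite3 connection
look like this (a sketch, not the actual nilmdb code):

    import sqlite3

    con = sqlite3.connect("nilmdb.db")
    con.execute("PRAGMA journal_mode=WAL")    # write-ahead log journal
    con.execute("PRAGMA synchronous=NORMAL")  # fsync less often than FULL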
Insertion is now block-focused rather than line-focused. This should
give a significant speedup when inserting client data, especially
preformatted data.
The server buffers the string and passes it to nilmdb. Nilmdb passes
the string to bulkdata. Bulkdata uses the rocket interface to parse
it in chunks, as necessary. Everything gets passed back up and
everyone is happy.
Currently, only pyrocket implements append_string.
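A purely illustrative sketch of the block-oriented path (the class and
method names below are hypothetical stand-ins, not the real
bulkdata/rocket code):

    class Rocket(object):
        def append_string(self, data, chunk_rows=4096):
            # The raw text block arrives here intact and is only parsed
            # into rows at this layer, a chunk of rows at a time.
            rows = []
            count = 0
            for line in data.splitlines():
                if not line or line.startswith("#"):
                    continue                  # skip blanks and comments
                rows.append([float(x) for x in line.split()])
                if len(rows) >= chunk_rows:
                    count += len(rows)
                    self._write(rows)         # hypothetical chunk write
                    rows = []
            if rows:
                count += len(rows)
                self._write(rows)
            return count

        def _write(self, rows):
            pass                              # write parsed rows to disk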
Use a new helper, nilmdb.utils.time.float_to_time_string().
This will help if we ever want to change representation (like using
uint64 microseconds since epoch, which saves us from having to
waste bits on the floating-point exponent).
This is mostly a matter of taste, but it more closely matches the old
way that prep did it, and it's more consistent. It should roughly
match the available precision of floats and doubles.
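As an illustration only (the real helper may format differently), the
idea is roughly:

    def float_to_time_string(t):
        # Assumed sketch: render a float seconds-since-epoch timestamp
        # with six decimal places, roughly the precision a double can
        # offer for current Unix timestamps.
        return "%.6f" % t

    # e.g. float_to_time_string(1234567890.5) -> "1234567890.500000"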