A bulkdata directory may be created for a new stream with an empty or
corrupted _format before any data is actually written. In that case,
we can just delete the new stream; in the worst case, we lose some
metadata.
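A minimal sketch of that cleanup, not the actual nilmdb code. The
helper name prune_if_unformatted is hypothetical, and JSON is used
only as a stand-in for whatever encoding _format really uses. The
directory is removed only when _format is empty or unparseable and
nothing else has been written next to it:

```python
import json
import os
import shutil


def prune_if_unformatted(stream_dir):
    """Delete stream_dir if its _format is empty or corrupted.

    Returns True if the directory was removed, False otherwise.
    ASSUMPTION: _format is parsed as JSON here purely for illustration.
    """
    path = os.path.join(stream_dir, "_format")
    try:
        with open(path, "rb") as f:
            json.load(f)  # empty or garbled content raises ValueError
        return False  # _format is readable; leave the stream alone
    except ValueError:
        pass

    # Only treat the stream as disposable if no data files exist yet.
    others = [name for name in os.listdir(stream_dir) if name != "_format"]
    if others:
        return False

    shutil.rmtree(stream_dir)
    return True
```

Refusing to delete when other files are present keeps the worst case
at "some metadata lost" rather than "data lost".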
Note: The info in _format should really be moved into the database.
_format was introduced when bulkdata switched from PyTables to a
custom storage system, and the info was probably stored this way to
avoid tying the main DB to specific implementation details while they
were still in flux.

Previous commits went back and forth a bit on whether the various APIs
should use bytes or strings, but bytes appears to be the better
answer, because actual data in streams will always be 7-bit ASCII or
raw binary. There's no reason to pay the performance penalty of
constantly converting between bytes and strings.
One drawback is that lots of code now needs "b" prefixes on string
literals, especially in tests, which inflates this commit quite a bit.

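As an illustration of why no decoding step is needed: Python's int()
and float() accept ASCII bytes directly. The line layout below (an
integer timestamp followed by float values) and the name parse_line
are assumptions made for this sketch, not the documented stream
format:

```python
def parse_line(line):
    """Parse one ASCII data line given as bytes, without decoding to str.

    ASSUMPTION: a line is a whitespace-separated integer timestamp
    followed by one or more float values.
    """
    fields = line.split()  # bytes.split() splits on ASCII whitespace
    timestamp = int(fields[0])  # int() accepts ASCII digits in bytes
    values = [float(f) for f in fields[1:]]  # so does float()
    return timestamp, values
```

Callers then pass literals such as b"100 2.5 -3.0\n", which is where
the extra "b" prefixes come from.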
Normally, indices for an array are expected to fit in a platform's
native long (32- or 64-bit). In nilmdb, tables aren't real arrays, and
we need to handle unbounded indices.
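
A rough sketch of one way to do this in Python, not the nilmdb
implementation. The class VirtualTable and its nrows attribute are
hypothetical. The idea is to keep the row count in a plain Python int
and resolve indices by hand rather than going through len(), which in
CPython raises OverflowError for values beyond sys.maxsize:

```python
class VirtualTable:
    """Hypothetical table-like object with an unbounded row count."""

    def __init__(self, nrows):
        # Plain Python int, so it may exceed the native long range.
        self.nrows = nrows

    def _normalize(self, index):
        # Resolve negative indices manually; defining __len__ is avoided
        # because its result must fit in a native index-sized integer.
        if index < 0:
            index += self.nrows
        if not 0 <= index < self.nrows:
            raise IndexError("row index out of range")
        return index

    def __getitem__(self, index):
        row = self._normalize(index)
        # Stand-in for reading the row from disk: return the resolved
        # row number so the index handling can be checked in isolation.
        return row
```

With this, VirtualTable(2**70)[-1] resolves to row 2**70 - 1 without
ever converting the index to a native integer.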