From what I've read of the spec so far: msgpack has a confusion between text and binary data baked into it that will probably never be resolved, and CBOR deliberately fixes that.
It also seems to have a better implementation of streaming data than msgpack. In msgpack, streaming is something you have to implement outside of the format itself, possibly by concatenating together many msgpack representations. CBOR has a way to say "here comes a streaming list, I'll tell you when it's done".
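To make the "I'll tell you when it's done" part concrete, here is a rough sketch of how an indefinite-length array looks on the wire, as I understand RFC 7049: an opening byte `0x9f`, then the items, then a `0xff` "break". The helper names (`encode_small_uint`, `stream_array`, `sensor_readings`) are made up for illustration and the encoder only handles integers 0..23, so treat it as a sketch and not as a real library.

```python
def encode_small_uint(n):
    """Encode an integer in 0..23, which CBOR packs into a single byte."""
    if not 0 <= n < 24:
        raise ValueError("this sketch only handles integers 0..23")
    return bytes([n])  # major type 0, value stored in the low 5 bits


def stream_array(items):
    """Yield the chunks of an indefinite-length CBOR array.

    The item count is never written, so `items` can be any iterable,
    including a generator whose length is not known in advance.
    """
    yield b"\x9f"  # major type 4 (array), additional info 31: indefinite length
    for item in items:
        yield encode_small_uint(item)
    yield b"\xff"  # "break": terminates the array


def sensor_readings():
    """Hypothetical data source; stands in for anything that produces items lazily."""
    yield 1
    yield 2
    yield 3


encoded = b"".join(stream_array(sensor_readings()))
print(encoded.hex())
```

In practice you would write each chunk to a socket or file as it is produced instead of joining them in memory.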
BSON is a representation of MongoDB's data model and doesn't make that much sense to use without MongoDB.
I currently use msgpack for a lot of things, but if CBOR's Python library is good enough, I might switch.
The base types are "unsigned int", "negative int", "binary data", "UTF-8 string", an array of said items, a map (key, value) of said items, tags (numbered up to 2^64 - 1) and extended types (the spec calls them "simple values", numbered up to 255, and the floats share the same major type). There are only 8 "extended types" currently defined: false, true, null, undefined, half float (IEEE 16b), single float (IEEE 32b), double float (IEEE 64b) and break (used to terminate streaming data). That leaves plenty of unused values for future expansion.
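For anyone curious how those base types are told apart: as far as I can tell from RFC 7049, the first byte of every item carries the major type in its top 3 bits and "additional info" in the low 5 bits. Below is a small sketch that only reads that header. `read_head` and `MAJOR_TYPES` are names I made up; the labels in the list are the spec's names for the eight major types.

```python
MAJOR_TYPES = [
    "unsigned int",   # 0
    "negative int",   # 1
    "byte string",    # 2
    "text string",    # 3
    "array",          # 4
    "map",            # 5
    "tag",            # 6
    "simple/float",   # 7
]


def read_head(data):
    """Return (major_type, argument, header_length) for the item at data[0].

    For major type 7 with additional info 25..27 the "argument" is the raw
    bit pattern of a float, not an integer; this sketch does not decode it.
    """
    initial = data[0]
    major = initial >> 5       # top 3 bits
    info = initial & 0x1F      # low 5 bits
    if info < 24:
        return major, info, 1  # small values live in the initial byte itself
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 following bytes
        argument = int.from_bytes(data[1:1 + size], "big")
        return major, argument, 1 + size
    # 28..30 are reserved; 31 marks indefinite length (or "break" for type 7)
    return major, None, 1


for sample in (b"\x18\x64", b"\x63foo", b"\x20", b"\xf5", b"\xff"):
    major, argument, length = read_head(sample)
    print(sample.hex(), MAJOR_TYPES[major], argument, length)
```

The argument means different things per type: the value itself for ints, a length for strings, an item count for arrays and maps, and the tag number for tags.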
Tags are used to apply meta information to a piece of data. For example, you can tag a UTF-8 string as a URL, or tag an array with a reference so it can be referred to elsewhere in the CBOR-encoded data (an extension defined by the IETF but outside RFC 7049).
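As an illustration of the URL case: RFC 7049 assigns tag 32 to URIs, and a tag is just a header (major type 6) placed in front of the item it describes. The sketch below hand-encodes that; `encode_head`, `encode_text` and `encode_tagged` are hypothetical helper names, not functions from any CBOR library.

```python
def encode_head(major, argument):
    """Encode a CBOR initial byte plus any following argument bytes."""
    if argument < 24:
        return bytes([(major << 5) | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_text(text):
    """Major type 3: UTF-8 text string, prefixed with its byte length."""
    raw = text.encode("utf-8")
    return encode_head(3, len(raw)) + raw


def encode_tagged(tag, encoded_item):
    """Major type 6: a tag number followed by the already-encoded item."""
    return encode_head(6, tag) + encoded_item


TAG_URI = 32  # tag number registered for URIs in RFC 7049

tagged_url = encode_tagged(TAG_URI, encode_text("http://www.example.com"))
print(tagged_url.hex())
```

A decoder that doesn't know a given tag can still skip over it and read the string underneath, which is what makes tags cheap to add.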
> AFAIK newer versions of MessagePack added an extra type so that string and binary are now separated.
The problem is that the 'str' type contains arbitrary binary data in an unspecified encoding, and always will, because of backward compatibility. This isn't changed by adding a 'bin' type.
Msgpack decoders in Python, for example, have to give you bytestrings unless you pass an option that promises that 'str's are all encoded in UTF-8.
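Here's a hand-rolled sketch of why that is, without using any msgpack library. The short-string header bytes `0xa0`..`0xbf` used to mean "raw" and now mean "str", while "bin 8" got the new header `0xc4`. The helper names (`pack_fixstr`, `pack_bin8`, `fixstr_payload`, `try_text`) are invented for illustration, and the "legacy" payload is my assumption about what an old encoder might plausibly have written.

```python
def pack_fixstr(raw):
    """Short 'str' (formerly 'raw'): header 0xa0 | length, for up to 31 bytes."""
    if len(raw) > 31:
        raise ValueError("fixstr holds at most 31 bytes")
    return bytes([0xA0 | len(raw)]) + raw


def pack_bin8(raw):
    """'bin 8': header 0xc4, a one-byte length, then the bytes."""
    if len(raw) > 255:
        raise ValueError("bin 8 holds at most 255 bytes")
    return bytes([0xC4, len(raw)]) + raw


def fixstr_payload(packed):
    """Return the payload bytes of a fixstr item."""
    length = packed[0] & 0x1F
    return packed[1:1 + length]


def try_text(packed):
    """Decode as UTF-8 if possible; otherwise hand back the raw bytes."""
    raw = fixstr_payload(packed)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


# Assumed legacy data: Latin-1 text stored in what is now called 'str'.
legacy = pack_fixstr("café".encode("latin-1"))
modern = pack_fixstr("café".encode("utf-8"))

print(repr(try_text(legacy)))
print(repr(try_text(modern)))
print(pack_bin8(b"\x00\xff").hex())
```

Both items carry a perfectly valid "str" header, so a decoder can't tell from the format alone which one is safe to treat as text.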
I don't even know what to believe anymore. That documentation is referring to two types, with "raw" renamed to "str" plus a new "bin", which is what I thought it was.
But the link you posted referred to three types, where "str" and "bin" subclass "raw", which sounded like it provided a non-backward-compatible "str" that's guaranteed to be text.
I wrote the Python CBOR implementation you get if you `pip install cbor`. Source project here: https://bitbucket.org/bodhisnarkva/cbor
It's getting pretty extensive use at a few places I know, so I hope it's good enough!
Good to know! I'll try it as soon as it supports Python 3.5, which I use on a couple of machines. The issues page tells me that you're working on that already.