mirror of
https://github.com/bigchaindb/bigchaindb.git
synced 2024-06-28 08:37:45 +02:00
69 lines
2.4 KiB
Plaintext
We need to clearly define how to serialize a JSON object in order to calculate its hash.

The serialization should produce the same byte output regardless of the architecture running the software. If
there are differences in the serialization, hash validation will fail even though the transaction is correct.

##### Example

```python
a = r.expr({'a': 1}).to_json().run(b.connection)
# a is u'{"a":1}'

b = json.dumps({'a': 1})
# b is '{"a": 1}'

a == b
# False
```

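The effect on hashing can be sketched with the Python 3 standard library alone. This is only an illustration under stated assumptions: the RethinkDB call is replaced by `json.dumps` with compact separators (which yields the same `{"a":1}` string here), and the variable names are hypothetical. Both strings decode to the same object, but their hashes differ.

```python
import hashlib
import json

data = {'a': 1}

# Two valid JSON serializations of the same object.
compact = json.dumps(data, separators=(',', ':'))  # '{"a":1}'
default = json.dumps(data)                         # '{"a": 1}'

# Both decode back to the same Python object...
same_object = json.loads(compact) == json.loads(default)

# ...but the bytes differ, so the hashes differ as well.
compact_hash = hashlib.sha3_256(compact.encode('utf-8')).hexdigest()
default_hash = hashlib.sha3_256(default.encode('utf-8')).hexdigest()
same_hash = compact_hash == default_hash

print(same_object, same_hash)  # True False
```
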
We should provide serialization and deserialization functions such that the following is always true.

##### Example

```python
deserialize(serialize(data)) == data
# True
```

### Standard serialization for the bigchain

After looking at this further, I think that the Python `json` module is still the best bet because it
complies with the RFC. We can specify the encoding and the separators used, and force it to sort by key, to
make sure that we obtain maximum interoperability.

```python
import json

# Note: the `encoding` keyword is accepted by the Python 2 json module only.
json.dumps(data, skipkeys=False, ensure_ascii=False, encoding="utf-8",
           separators=(',', ':'), sort_keys=True)
```

- `skipkeys`: With `skipkeys` set to `False`, serialization fails with a `TypeError` if a key is not of a basic
type (string, number, boolean or `None`), instead of the key being silently dropped. Basic non-string keys are
converted to strings, so all keys in the serialized object are strings.
- `ensure_ascii`: The RFC recommends `utf-8` for maximum interoperability. By setting `ensure_ascii` to `False` we
allow unicode characters to be written as-is (instead of as `\uXXXX` escapes) and encode the output as `utf-8`.
- `separators`: We need to define a standard separator to use in the serialization. If we did not do this, different
implementations could use different separators for serialization, resulting in a still valid transaction but with
a different hash, e.g. an extra whitespace introduced in the serialization would still create a valid JSON object
but the hash would be different.
- `sort_keys`: Sorting by key makes the output independent of the order in which the keys were inserted into the
dictionary.

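A minimal sketch of how these criteria could be wrapped in a pair of helpers, assuming Python 3. The names `serialize` and `deserialize` are taken from the round-trip example above, but their bodies are an assumption, not the project's implementation. The `encoding` argument is left out because Python 3's `json.dumps` does not accept it; the string is encoded explicitly when bytes are needed.

```python
import json


def serialize(data):
    """Serialize a dict to a canonical JSON string (hypothetical helper)."""
    return json.dumps(data, skipkeys=False, ensure_ascii=False,
                      separators=(',', ':'), sort_keys=True)


def deserialize(text):
    """Inverse of serialize (hypothetical helper)."""
    return json.loads(text)


# Key order in the input does not affect the output.
tx_a = {'b': 2, 'a': 'é'}
tx_b = {'a': 'é', 'b': 2}
assert serialize(tx_a) == serialize(tx_b) == '{"a":"é","b":2}'

# The round trip returns the original data.
assert deserialize(serialize(tx_a)) == tx_a

# A key that is not a basic type makes serialization fail (skipkeys=False).
try:
    serialize({('x', 'y'): 1})
    failed = False
except TypeError:
    failed = True
assert failed
```

The round trip only holds for data built from JSON-native types with string keys: tuples come back as lists, and integer keys come back as strings.
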
##### Example

Every time we need to perform some operation on the data, like calculating the hash or signing/verifying the transaction,
we need to use the previous criteria to serialize the data and then use the `byte` representation of the serialized
data (if we treat the data as bytes we eliminate possible encoding errors, e.g. with unicode characters).

```python
import hashlib

# calculate the hash of a transaction
# the transaction is a dictionary
tx_serialized = bytes(serialize(tx))
tx_hash = hashlib.sha3_256(tx_serialized).hexdigest()

# signing a transaction (sk: signing key)
tx_serialized = bytes(serialize(tx))
signature = sk.sign(tx_serialized)

# verify signature (vk: verifying key)
tx_serialized = bytes(serialize(tx))
vk.verify(signature, tx_serialized)
```
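
The hashing step can be tried end to end with the standard library, assuming Python 3.6+ (where `hashlib.sha3_256` is available). This is a sketch only: `hash_data` and the transaction fields are hypothetical, `serialize` is an assumed implementation of the criteria above, and signing is left out because it needs a third-party library.

```python
import hashlib
import json


def serialize(data):
    # Assumed implementation of the criteria described in this document.
    return json.dumps(data, skipkeys=False, ensure_ascii=False,
                      separators=(',', ':'), sort_keys=True)


def hash_data(data):
    # Hash the utf-8 bytes of the canonical serialization.
    return hashlib.sha3_256(serialize(data).encode('utf-8')).hexdigest()


# Hypothetical transactions, for illustration only.
tx = {'operation': 'CREATE', 'note': 'café'}
reordered = {'note': 'café', 'operation': 'CREATE'}
tampered = {'operation': 'CREATE', 'note': 'cafe'}

# Same content in a different key order gives the same hash.
assert hash_data(tx) == hash_data(reordered)

# Any change in the content gives a different hash.
assert hash_data(tx) != hash_data(tampered)
```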