bigchaindb/bigchaindb/common/schema/__init__.py

83 lines
3.0 KiB
Python
Raw Normal View History

# Copyright © 2020 Interplanetary Database Association e.V.,
# BigchainDB and IPDB software contributors.
# SPDX-License-Identifier: (Apache-2.0 AND CC-BY-4.0)
# Code is Apache-2.0 and docs are CC-BY-4.0
"""Schema validation related functions and data"""
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
import os.path
import logging
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
import jsonschema
import yaml
2017-05-23 13:19:10 +02:00
import rapidjson
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
from bigchaindb.common.exceptions import SchemaValidationError
logger = logging.getLogger(__name__)
def _load_schema(name, path=__file__):
"""Load a schema from disk"""
path = os.path.join(os.path.dirname(path), name + '.yaml')
2016-11-24 13:54:09 +01:00
with open(path) as handle:
schema = yaml.safe_load(handle)
fast_schema = rapidjson.Validator(rapidjson.dumps(schema))
2017-05-23 13:19:10 +02:00
return path, (schema, fast_schema)
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
2017-11-21 14:52:15 +01:00
TX_SCHEMA_VERSION = 'v2.0'
TX_SCHEMA_PATH, TX_SCHEMA_COMMON = _load_schema('transaction_' +
TX_SCHEMA_VERSION)
_, TX_SCHEMA_CREATE = _load_schema('transaction_create_' +
TX_SCHEMA_VERSION)
_, TX_SCHEMA_TRANSFER = _load_schema('transaction_transfer_' +
TX_SCHEMA_VERSION)
2016-12-09 11:34:42 +01:00
_, TX_SCHEMA_VALIDATOR_ELECTION = _load_schema('transaction_validator_election_' +
TX_SCHEMA_VERSION)
_, TX_SCHEMA_CHAIN_MIGRATION_ELECTION = _load_schema('transaction_chain_migration_election_' +
TX_SCHEMA_VERSION)
_, TX_SCHEMA_VOTE = _load_schema('transaction_vote_' + TX_SCHEMA_VERSION)
2016-12-09 11:34:42 +01:00
2016-11-24 13:54:09 +01:00
def _validate_schema(schema, body):
"""Validate data against a schema"""
2017-05-23 13:19:10 +02:00
# Note
#
# Schema validation is currently the major CPU bottleneck of
# BigchainDB. the `jsonschema` library validates python data structures
# directly and produces nice error messages, but validation takes 4+ ms
# per transaction which is pretty slow. The rapidjson library validates
# much faster at 1.5ms, however it produces _very_ poor error messages.
# For this reason we use both, rapidjson as an optimistic pathway and
# jsonschema as a fallback in case there is a failure, so we can produce
# a helpful error message.
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
try:
schema[1](rapidjson.dumps(body))
2017-05-23 13:19:10 +02:00
except ValueError as exc:
try:
jsonschema.validate(body, schema[0])
except jsonschema.ValidationError as exc2:
raise SchemaValidationError(str(exc2)) from exc2
logger.warning('code problem: jsonschema did not raise an exception, wheras rapidjson raised %s', exc)
raise SchemaValidationError(str(exc)) from exc
Schema definition (#798) Commit messages for posterity: * wip transaction schema definition * test for SchemaObject * test SchemaObject definions meta property * schema documentation updates * test for basic validation * commit before change to .json file definiton + rst generation * move to straight .json schema, test for additionalProperties on each object * add asset to transaction definiton * remove outdated tx validation * make all tests pass * create own exception for validation error and start validating transactions * more tx validation fixes * move to yaml file for schema * automatic schema documentation generator * remove redundant section * use YAML safe loading * change current_owners to owners_before in tx schema * re-run tests and make correct yaml schema * fix some broken tests * update Release_Process.md * move tx validation into it's own method * add jsonschema dependency * perform schema validation after ID validation on Transaction * Release_Process.md, markdown auto numbering * remove old transaction.json * resolve remaining TODOs in schema docuementation * add `id` and `$schema` to transaction.yaml * add transaction.yaml to setup.py so it gets copied * address some concernes in PR for transaction.yaml * address more PR concerns in transaction.yaml * refactor validtion exceptions and move transaction schema validation into it's own function in bigchaindb.common.schema.__init__ * add note to generated schema.rst indicating when and how it's generated * move tx schema validation back above ID validation in Transaction.validate_structure, test that structurally invalid transaction gets caught and 400 returned in TX POST handler * remove timestamp from transaction schema index * Add README.md to bigchaindb.common.schema for introduction to JSON Schema and reasons for YAML * Use constant for schema definitions' base prefix * Move import of ValidationError exception into only the tests that require it * Move validate transaction test helper to tests/common/util.py * move ordered transaction schema load to generate_schema_documentation.py where it's needed * use double backticks to render terms in schema docs * change more backticks and change transaction version description in transaction schema * make details a mandatory property of condition * Many more documentation fixes * rename schema.rst to schema/transaction.rst * Fix documentation for Metadata * Add more links to documentation * Various other documentation fixes * Rename section titles in rendered documentation * use to manage file handle * fix extrenuous comma in test_tx_serialization_with_incorrect_hash args * 'a' * 64 * remove schema validation until we can analyze properly impact on downstream consumers * fix flake8 error * use `with` always
2016-11-22 11:17:06 +01:00
2016-11-24 16:37:26 +01:00
def validate_transaction_schema(tx):
"""Validate a transaction dict.
TX_SCHEMA_COMMON contains properties that are common to all types of
transaction. TX_SCHEMA_[TRANSFER|CREATE] add additional constraints on top.
"""
_validate_schema(TX_SCHEMA_COMMON, tx)
if tx['operation'] == 'TRANSFER':
_validate_schema(TX_SCHEMA_TRANSFER, tx)
else:
_validate_schema(TX_SCHEMA_CREATE, tx)