Changed 'federation' to 'cluster' or 'consortium' in docs and some code

This commit is contained in:
Troy McConaghy 2017-03-07 17:41:25 +01:00
parent 9ec9f638fc
commit 421b5b03b3
15 changed files with 43 additions and 43 deletions

View File

@ -1,3 +1,3 @@
# Benchmarking tests
This folder contains util files and test case folders to benchmark the performance of a BigchainDB federation.
This folder contains util files and test case folders to benchmark the performance of a BigchainDB cluster.

View File

@ -298,7 +298,7 @@ print('Writing hostlist.py')
with open('hostlist.py', 'w') as f:
f.write('# -*- coding: utf-8 -*-\n')
f.write('"""A list of the public DNS names of all the nodes in this\n')
f.write('BigchainDB cluster/federation.\n')
f.write('BigchainDB cluster.\n')
f.write('"""\n')
f.write('\n')
f.write('from __future__ import unicode_literals\n')

View File

@ -3,7 +3,7 @@ How BigchainDB is Good for Asset Registrations & Transfers
BigchainDB can store data of any kind (within reason), but it's designed to be particularly good for storing asset registrations and transfers:
* The fundamental thing that one submits to a BigchainDB federation to be checked and stored (if valid) is a *transaction*, and there are two kinds: CREATE transactions and TRANSFER transactions.
* The fundamental thing that one sends to a BigchainDB cluster, to be checked and stored (if valid), is a *transaction*, and there are two kinds: CREATE transactions and TRANSFER transactions.
* A CREATE transaction can be use to register any kind of asset (divisible or indivisible), along with arbitrary metadata.
* An asset can have zero, one, or several owners.
* The owners of an asset can specify (crypto-)conditions which must be satisified by anyone wishing transfer the asset to new owners. For example, a condition might be that at least 3 of the 5 current owners must cryptographically sign a transfer transaction.

View File

@ -4,18 +4,18 @@ Decentralization means that no one owns or controls everything, and there is no
Ideally, each node in a BigchainDB cluster is owned and controlled by a different person or organization. Even if the cluster lives within one organization, it's still preferable to have each node controlled by a different person or subdivision.
We use the phrase "BigchainDB federation" (or just "federation") to refer to the set of people and/or organizations who run the nodes of a BigchainDB cluster. A federation requires some form of governance to make decisions such as membership and policies. The exact details of the governance process are determined by each federation, but it can be very decentralized (e.g. purely vote-based, where each node gets a vote, and there are no special leadership roles).
We use the phrase "BigchainDB consortium" (or just "consortium") to refer to the set of people and/or organizations who run the nodes of a BigchainDB cluster. A consortium requires some form of governance to make decisions such as membership and policies. The exact details of the governance process are determined by each consortium, but it can be very decentralized (e.g. purely vote-based, where each node gets a vote, and there are no special leadership roles).
The actual data is decentralized in that it doesnt all get stored in one place. Each federation node stores the primary of one shard and replicas of some other shards. (A shard is a subset of the total set of documents.) Sharding and replication are handled by RethinkDB.
If sharding is turned on (i.e. if the number of shards is larger than one), then the actual data is decentralized in that no one node stores all the data.
Every node has its own locally-stored list of the public keys of other federation members: the so-called keyring. There's no centrally-stored or centrally-shared keyring.
Every node has its own locally-stored list of the public keys of other consortium members: the so-called keyring. There's no centrally-stored or centrally-shared keyring.
A federation can increase its decentralization (and its resilience) by increasing its jurisdictional diversity, geographic diversity, and other kinds of diversity. This idea is expanded upon in [the section on node diversity](diversity.html).
A consortium can increase its decentralization (and its resilience) by increasing its jurisdictional diversity, geographic diversity, and other kinds of diversity. This idea is expanded upon in [the section on node diversity](diversity.html).
Theres no node that has a long-term special position in the federation. All nodes run the same software and perform the same duties.
Theres no node that has a long-term special position in the cluster. All nodes run the same software and perform the same duties.
RethinkDB has an “admin” user which cant be deleted and which can make big changes to the database, such as dropping a table. Right now, thats a big security vulnerability, but we have plans to mitigate it by:
RethinkDB and MongoDB have an “admin” user which cant be deleted and which can make big changes to the database, such as dropping a table. Right now, thats a big security vulnerability, but we have plans to mitigate it by:
1. Locking down the admin user as much as possible.
2. Having all nodes inspect RethinkDB admin-type requests before acting on them. Requests can be checked against an evolving whitelist of allowed actions (voted on by federation nodes).
2. Having all nodes inspect admin-type requests before acting on them. Requests can be checked against an evolving whitelist of allowed actions. Nodes requesing non-allowed requests can be removed from the list of cluster nodes.
Its worth noting that the RethinkDB admin user cant transfer assets, even today. The only way to create a valid transfer transaction is to fulfill the current (crypto) conditions on the asset, and the admin user cant do that because the admin user doesnt have the necessary private keys (or preimages, in the case of hashlock conditions). Theyre not stored in the database.

View File

@ -6,6 +6,6 @@ Steps should be taken to make it difficult for any one actor or event to control
2. **Geographic diversity.** The servers should be physically located at multiple geographic locations, so that it becomes difficult for a natural disaster (such as a flood or earthquake) to damage enough of them to cause problems.
3. **Hosting diversity.** The servers should be hosted by multiple hosting providers (e.g. Amazon Web Services, Microsoft Azure, Digital Ocean, Rackspace), so that it becomes difficult for one hosting provider to influence enough of the nodes.
4. **Operating system diversity.** The servers should use a variety of operating systems, so that a security bug in one OS cant be used to exploit enough of the nodes.
5. **Diversity in general.** In general, membership diversity (of all kinds) confers many advantages on a federation. For example, it provides the federation with a source of various ideas for addressing challenges.
5. **Diversity in general.** In general, membership diversity (of all kinds) confers many advantages on a consortium. For example, it provides the consortium with a source of various ideas for addressing challenges.
Note: If all the nodes are running the same code, i.e. the same implementation of BigchainDB, then a bug in that code could be used to compromise all of the nodes. Ideally, there would be several different, well-maintained implementations of BigchainDB Server (e.g. one in Python, one in Go, etc.), so that a federation could also have a diversity of server implementations.
Note: If all the nodes are running the same code, i.e. the same implementation of BigchainDB, then a bug in that code could be used to compromise all of the nodes. Ideally, there would be several different, well-maintained implementations of BigchainDB Server (e.g. one in Python, one in Go, etc.), so that a consortium could also have a diversity of server implementations.

View File

@ -8,12 +8,12 @@ Its true that blockchain data is more difficult to change than usual: its
BigchainDB achieves strong tamper-resistance in the following ways:
1. **Replication.** All data is sharded and shards are replicated in several (different) places. The replication factor can be set by the federation. The higher the replication factor, the more difficult it becomes to change or delete all replicas.
1. **Replication.** All data is sharded and shards are replicated in several (different) places. The replication factor can be set by the consortium. The higher the replication factor, the more difficult it becomes to change or delete all replicas.
2. **Internal watchdogs.** All nodes monitor all changes and if some unallowed change happens, then appropriate action is taken. For example, if a valid block is deleted, then it is put back.
3. **External watchdogs.** Federations may opt to have trusted third-parties to monitor and audit their data, looking for irregularities. For federations with publicly-readable data, the public can act as an auditor.
3. **External watchdogs.** A consortium may opt to have trusted third-parties to monitor and audit their data, looking for irregularities. For a consortium with publicly-readable data, the public can act as an auditor.
4. **Cryptographic signatures** are used throughout BigchainDB as a way to check if messages (transactions, blocks and votes) have been tampered with enroute, and as a way to verify who signed the messages. Each block is signed by the node that created it. Each vote is signed by the node that cast it. A creation transaction is signed by the node that created it, although there are plans to improve that by adding signatures from the sending client and multiple nodes; see [Issue #347](https://github.com/bigchaindb/bigchaindb/issues/347). Transfer transactions can contain multiple inputs (fulfillments, one per asset transferred). Each fulfillment will typically contain one or more signatures from the owners (i.e. the owners before the transfer). Hashlock fulfillments are an exception; theres an open issue ([#339](https://github.com/bigchaindb/bigchaindb/issues/339)) to address that.
5. **Full or partial backups** of the database may be recorded from time to time, possibly on magnetic tape storage, other blockchains, printouts, etc.
6. **Strong security.** Node owners can adopt and enforce strong security policies.
7. **Node diversity.** Diversity makes it so that no one thing (e.g. natural disaster or operating system bug) can compromise enough of the nodes. See [the section on the kinds of node diversity](diversity.html).
Some of these things come "for free" as part of the BigchainDB software, and others require some extra effort from the federation and node owners.
Some of these things come "for free" as part of the BigchainDB software, and others require some extra effort from the consortium and node owners.

View File

@ -1,6 +1,6 @@
# Terminology
There is some specialized terminology associated with BigchainDB. To get started, you should at least know what what we mean by a BigchainDB *node*, *cluster* and *federation*.
There is some specialized terminology associated with BigchainDB. To get started, you should at least know what what we mean by a BigchainDB *node*, *cluster* and *consortium*.
## Node
@ -13,10 +13,10 @@ A **BigchainDB node** is a machine or set of closely-linked machines running Ret
A set of BigchainDB nodes can connect to each other to form a **cluster**. Each node in the cluster runs the same software. A cluster contains one logical RethinkDB datastore. A cluster may have additional machines to do things such as cluster monitoring.
## Federation
## Consortium
The people and organizations that run the nodes in a cluster belong to a **federation** (i.e. another organization). A federation must have some sort of governance structure to make decisions. If a cluster is run by a single company, then the federation is just that company.
The people and organizations that run the nodes in a cluster belong to a **consortium** (i.e. another organization). A consortium must have some sort of governance structure to make decisions. If a cluster is run by a single company, then the "consortium" is just that company.
**What's the Difference Between a Cluster and a Federation?**
**What's the Difference Between a Cluster and a Consortium?**
A cluster is just a bunch of connected nodes. A federation is an organization which has a cluster, and where each node in the cluster has a different operator. Confusingly, we sometimes call a federation's cluster its "federation." You can probably tell what we mean from context.
A cluster is just a bunch of connected nodes. A consortium is an organization which has a cluster, and where each node in the cluster has a different operator.

View File

@ -5,7 +5,7 @@ We have some "templates" to deploy a basic, working, but bare-bones BigchainDB n
You don't have to use the tools we use in the templates. You can use whatever tools you prefer.
If you find the cloud deployment templates for nodes helpful, then you may also be interested in our scripts for :doc:`deploying a testing cluster on AWS <../clusters-feds/aws-testing-cluster>` (documented in the Clusters & Federations section).
If you find the cloud deployment templates for nodes helpful, then you may also be interested in our scripts for :doc:`deploying a testing cluster on AWS <../clusters-feds/aws-testing-cluster>` (documented in the Clusters section).
.. toctree::
:maxdepth: 1

View File

@ -64,7 +64,7 @@ In the future, it will be possible for clients to query for the blocks containin
**How could we be sure blocks and votes from a client are valid?**
All blocks and votes are signed by federation nodes. Only federation nodes can produce valid signatures because only federation nodes have the necessary private keys. A client can't produce a valid signature for a block or vote.
All blocks and votes are signed by cluster nodes (owned and operated by consortium members). Only cluster nodes can produce valid signatures because only cluster nodes have the necessary private keys. A client can't produce a valid signature for a block or vote.
**Could we restore an entire BigchainDB database using client-saved blocks and votes?**
@ -109,7 +109,7 @@ Considerations for BigchainDB:
Although it's not advertised as such, RethinkDB's built-in replication feature is similar to continous backup, except the "backup" (i.e. the set of replica shards) is spread across all the nodes. One could take that idea a bit farther by creating a set of backup-only servers with one full backup:
* Give all the original BigchainDB nodes (RethinkDB nodes) the server tag `original`. This is the default if you used the RethinkDB config file suggested in the section titled [Configure RethinkDB Server](../dev-and-test/setup-run-node.html#configure-rethinkdb-server).
* Set up a group of servers running RethinkDB only, and give them the server tag `backup`. The `backup` servers could be geographically separated from all the `original` nodes (or not; it's up to the federation).
* Set up a group of servers running RethinkDB only, and give them the server tag `backup`. The `backup` servers could be geographically separated from all the `original` nodes (or not; it's up to the consortium to decide).
* Clients shouldn't be able to read from or write to servers in the `backup` set.
* Send a RethinkDB reconfigure command to the RethinkDB cluster to make it so that the `original` set has the same number of replicas as before (or maybe one less), and the `backup` set has one replica. Also, make sure the `primary_replica_tag='original'` so that all primary shards live on the `original` nodes.

View File

@ -1,10 +1,10 @@
Clusters & Federations
======================
Clusters
========
.. toctree::
:maxdepth: 1
set-up-a-federation
set-up-a-cluster
backup
aws-testing-cluster

View File

@ -1,11 +1,11 @@
# Set Up a Federation
# Set Up a Cluster
This section is about how to set up a BigchainDB _federation_, where each node is operated by a different operator. If you want to set up and run a testing cluster on AWS (where all nodes are operated by you), then see [the section about that](aws-testing-cluster.html).
This section is about how to set up a BigchainDB cluster where each node is operated by a different operator. If you want to set up and run a testing cluster on AWS (where all nodes are operated by you), then see [the section about that](aws-testing-cluster.html).
## Initial Checklist
* Do you have a governance process for making federation-level decisions, such as how to admit new members?
* Do you have a governance process for making consortium-level decisions, such as how to admit new members?
* What will you store in creation transactions (data payload)? Is there a data schema?
* Will you use transfer transactions? Will they include a non-empty data payload?
* Who will be allowed to submit transactions? Who will be allowed to read or query transactions? How will you enforce the access rules?
@ -13,7 +13,7 @@ This section is about how to set up a BigchainDB _federation_, where each node i
## Set Up the Initial Cluster
The federation must decide some things before setting up the initial cluster (initial set of BigchainDB nodes):
The consortium must decide some things before setting up the initial cluster (initial set of BigchainDB nodes):
1. Who will operate a node in the initial cluster?
2. What will the replication factor be? (It must be 3 or more for [RethinkDB failover](https://rethinkdb.com/docs/failover/) to work.)
@ -21,7 +21,7 @@ The federation must decide some things before setting up the initial cluster (in
Once those things have been decided, each node operator can begin setting up their BigchainDB (production) node.
Each node operator will eventually need two pieces of information from all other nodes in the federation:
Each node operator will eventually need two pieces of information from all other nodes:
1. Their RethinkDB hostname, e.g. `rdb.farm2.organization.org`
2. Their BigchainDB public key, e.g. `Eky3nkbxDTMgkmiJC8i5hKyVFiAQNmPP4a2G4JdDxJCK`

View File

@ -11,7 +11,7 @@ A block has the following structure:
"timestamp": "<block-creation timestamp>",
"transactions": ["<list of transactions>"],
"node_pubkey": "<public key of the node creating the block>",
"voters": ["<list of federation nodes public keys>"]
"voters": ["<list of public keys of all nodes in the cluster>"]
},
"signature": "<signature of block>"
}
@ -23,9 +23,9 @@ A block has the following structure:
- ``timestamp``: The Unix time when the block was created. It's provided by the node that created the block.
- ``transactions``: A list of the transactions included in the block.
- ``node_pubkey``: The public key of the node that created the block.
- ``voters``: A list of the public keys of federation nodes at the time the block was created.
It's the list of federation nodes which can cast a vote on this block.
This list can change from block to block, as nodes join and leave the federation.
- ``voters``: A list of the public keys of all cluster nodes at the time the block was created.
It's the list of nodes which can cast a vote on this block.
This list can change from block to block, as nodes join and leave the cluster.
- ``signature``: :ref:`Cryptographic signature <Signature Algorithm and Keys>` of the block by the node that created the block (i.e. the node with public key ``node_pubkey``). To generate the signature, the node signs the serialized inner ``block`` (the same thing that was hashed to determine the ``id``) using the private key corresponding to ``node_pubkey``.

View File

@ -10,7 +10,7 @@ Note that there are a few kinds of nodes:
- A **bare-bones node** is a node deployed in the cloud, either as part of a testing cluster or as a starting point before upgrading the node to be production-ready. Our cloud deployment templates deploy a bare-bones node, as do our scripts for deploying a testing cluster on AWS.
- A **production node** is a node that is part of a federation's BigchainDB cluster. A production node has the most components and requirements.
- A **production node** is a node that is part of a consortium's BigchainDB cluster. A production node has the most components and requirements.
## Setup Instructions for Various Cases
@ -19,7 +19,7 @@ Note that there are a few kinds of nodes:
* [Set up and run a bare-bones node in the cloud](cloud-deployment-templates/index.html)
* [Set up and run a local dev/test node for developing and testing BigchainDB Server](dev-and-test/setup-run-node.html)
* [Deploy a testing cluster on AWS](clusters-feds/aws-testing-cluster.html)
* [Set up and run a federation (including production nodes)](clusters-feds/set-up-a-federation.html)
* [Set up and run a cluster (including production nodes)](clusters-feds/set-up-a-cluster.html)
Instructions for setting up a client will be provided once there's a public test net.

View File

@ -1,12 +1,12 @@
# Production Node Assumptions
If you're not sure what we mean by a BigchainDB *node*, *cluster*, *federation*, or *production node*, then see [the section in the Introduction where we defined those terms](../introduction.html#some-basic-vocabulary).
If you're not sure what we mean by a BigchainDB *node*, *cluster*, *consortium*, or *production node*, then see [the section in the Introduction where we defined those terms](../introduction.html#some-basic-vocabulary).
We make some assumptions about production nodes:
1. **Each production node is set up and managed by an experienced professional system administrator (or a team of them).**
2. Each production node in a federation's cluster is managed by a different person or team.
2. Each production node in a cluster is managed by a different person or team.
Because of the first assumption, we don't provide a detailed cookbook explaining how to secure a server, or other things that a sysadmin should know. (We do provide some [templates](../cloud-deployment-templates/index.html), but those are just a starting point.)

View File

@ -19,7 +19,7 @@ There are some [notes on BigchainDB-specific firewall setup](../appendices/firew
A BigchainDB node uses its system clock to generate timestamps for blocks and votes, so that clock should be kept in sync with some standard clock(s). The standard way to do that is to run an NTP daemon (Network Time Protocol daemon) on the node. (You could also use tlsdate, which uses TLS timestamps rather than NTP, but don't: it's not very accurate and it will break with TLS 1.3, which removes the timestamp.)
NTP is a standard protocol. There are many NTP daemons implementing it. We don't recommend a particular one. On the contrary, we recommend that different nodes in a federation run different NTP daemons, so that a problem with one daemon won't affect all nodes.
NTP is a standard protocol. There are many NTP daemons implementing it. We don't recommend a particular one. On the contrary, we recommend that different nodes in a cluster run different NTP daemons, so that a problem with one daemon won't affect all nodes.
Please see the [notes on NTP daemon setup](../appendices/ntp-notes.html) in the Appendices.
@ -72,7 +72,7 @@ direct-io
join=node0_hostname:29015
join=node1_hostname:29015
join=node2_hostname:29015
# continue until there's a join= line for each node in the federation
# continue until there's a join= line for each node in the cluster
```
* `directory=/data` tells the RethinkDB node to store its share of the database data in `/data`.
@ -153,7 +153,7 @@ Edit the created config file:
* Open `$HOME/.bigchaindb` (the created config file) in your text editor.
* Change `"server": {"bind": "localhost:9984", ... }` to `"server": {"bind": "0.0.0.0:9984", ... }`. This makes it so traffic can come from any IP address to port 9984 (the HTTP Client-Server API port).
* Change `"keyring": []` to `"keyring": ["public_key_of_other_node_A", "public_key_of_other_node_B", "..."]` i.e. a list of the public keys of all the other nodes in the federation. The keyring should _not_ include your node's public key.
* Change `"keyring": []` to `"keyring": ["public_key_of_other_node_A", "public_key_of_other_node_B", "..."]` i.e. a list of the public keys of all the other nodes in the cluster. The keyring should _not_ include your node's public key.
For more information about the BigchainDB config file, see [Configuring a BigchainDB Node](configuration.html).
@ -185,7 +185,7 @@ where:
* `bigchaindb init` creates the database within RethinkDB, the tables, the indexes, and the genesis block.
* `numshards` should be set to the number of nodes in the initial cluster.
* `numreplicas` should be set to the database replication factor decided by the federation. It must be 3 or more for [RethinkDB failover](https://rethinkdb.com/docs/failover/) to work.
* `numreplicas` should be set to the database replication factor decided by the consortium. It must be 3 or more for [RethinkDB failover](https://rethinkdb.com/docs/failover/) to work.
Once the RethinkDB database is configured, every node operator can start BigchainDB using:
```text