Blockchain is an append-only log of transactions grouped in blocks. Once a block is appended, it cannot be changed, which is secured by the cryptographic hash of the previous block. It is not possible to remove data from blockchain.
Blockchain consists of a distributed or decentralized network of nodes. The state is replicated among the nodes.
Blockchain nodes must agree on the common state. A consensus mechanism is required to achieve this agreement in the blockchain network.
A smart contract is a small program that contains some business logic. It is saved and executed on the blockchain to provide transparency and trust.
The most suitable business case of blockchain is to share the state and business logic among multiple parties that do not trust each other. Blockchain is designed to ensure transparency and integrity in an environment of limited trust.
Private vs public
Public blockchains let anyone join the network. In private blockchains, users or other agents (like nodes) have to be granted permission.
Permissioned vs permissionless
In permissionless blockchains, all members have the same permissions and can use the same network features. In permissioned blockchains, users and nodes can have different sets of permissions or roles.
Read more about private permissioned blockchain in our blog posts: What is a private blockchain and why do you need it? and Do you need Private Blockchain?.
Hyperledger Fabric ®
Hyperledger Fabric is a highly customizable private permissioned blockchain with fine-grained permission control and strong consistency of the data.
Hyperledger Fabric is an enterprise-grade, distributed ledger platform that offers modularity and versatility for a broad set of industry use cases. The modular architecture for Hyperledger Fabric accommodates the diversity of enterprise use cases through plug and play components, such as consensus, privacy and membership services. (source)
Hyperledger Fabric is an Open Source tool, initially developed by IBM, now under the Linux Foundation (see the source code and the documentation). It is widely adopted by the industry in various domains and offered as a managed blockchain network by major cloud providers. It is our tool of choice for enterprise blockchain solutions.
The network topology
Hyperledger Fabric forms a decentralized network. Participating organizations have their own roles in the network and manage different nodes. The diagram presents one of the simplest and probably the most common network topologies, in which the root organization is responsible for both the network membership and the consistency of the state.
There are three main node types in the Hyperledger Fabric network. Certificate Authorities (CAs) connect the whole network via a certificate chain of trust. Orderers maintain the consistency of the state and create blocks of transactions. Peers store the ledger, which consists of the transaction log (i.e. the blockchain) and the world state.
A channel is a key abstraction in Hyperledger Fabric. It forms a kind of subnet in the network isolating the state and smart contracts. All peers belonging to a channel have access to the same data and smart contracts. There is no access to it outside the channel.
Certificate Authority (CA)
Typically CA nodes form a tree-like topology with Root CA at the top. Root CA is a self-signed instance. CAs that belong to other organizations are signed by Root CA. All other nodes need to be signed by respective CAs to become part of the blockchain network and each user needs a valid certificate to interact with the network.
Hyperledger provides a reference implementation of the CA; however, it may be replaced by any CA that can generate ECDSA certificates. You can also connect it with an LDAP repository or attach an HSM (Hardware Security Module) to it. Read more in the documentation.
Note that the user information from the CA is included in the certificates, and the certificates are attached to every invoked transaction and therefore stored in the blockchain. Since blockchain is immutable, you cannot remove the certificates. Do not provide sensitive information to the CA if you want your blockchain to be GDPR-compliant.
Orderers, or ordering nodes, form an ordering service. It is responsible for keeping the blockchain state consistent and final. Orderers guarantee a single global order of transactions in the chain. Separating the ordering service from the peers brings many advantages for Hyperledger Fabric network performance and scalability.
Besides, orderers play a significant role in managing access to the channels. They keep a system channel that contains access control lists (ACLs) of organizations that can create channels. They restrict who can configure, read, and write data to particular channels.
Read more about the ordering service in the official documentation.
Peers are fundamental nodes in the network since they host copies of ledgers and smart contracts. The blockchain itself, i.e. an immutable log of transactions, as a part of the ledger is stored on peers. Typically, an organization will have multiple peers to have a redundancy of data and handle a high load.
A peer belongs to one organization, can belong to multiple channels, and can host multiple ledgers and smart contracts.
Peers forward smart contract calls to dedicated chaincode containers in the network and update the network state on the basis of smart contract results.
Peers can be connected to orderers to receive new blocks of transactions and update the local blockchain copy, but this is not necessary. Instead, they can synchronize the state with other peers via the p2p gossip protocol.
You can read more in the documentation about peers.
Other node types
Hyperledger Fabric network may contain other node types as well. Each CA might use an embedded LevelDB database or a dedicated PostgreSQL node. Each peer can use an embedded LevelDB database or an external CouchDB. Finally, each chaincode for a peer forms an additional container in the network.
Then, you may want to add some monitoring to the network - Prometheus for metrics exposed by the Hyperledger Fabric, Graylog or Kibana for collecting the logs. Then, there are additional tools that might be useful like block browser, benchmark tools, and others.
The root organization manages the root CA, i.e. the self-signed Certificate Authority. All other CAs, orderers, peers, and other nodes can participate in the network because their certificates derive from the root CA via a chain of trust. The root organization might be considered the founder and administrator of the whole network.
The orderer organization manages the orderer nodes and might be separate from the root organization. It is responsible for keeping the data consistent and generating blocks of ordered transactions that form a blockchain.
In some cases the orderer nodes might be distributed among other organizations. It might be required in your deployment, however it is also risky in terms of the performance and maintaining trust in the network. The orderer organization, as well as the root organization, should be governed by an authority with the highest trust in the network.
Typically, an organization in the network will have a set of peers and a CA that is signed by the root or parent CA. It will participate in one or more channels to have access to the state and be able to call smart contracts. Each organization may have different permissions and access to different channels; it may even have read-only access to a part of the network.
The network topology might be more complex than the root organization and other organizations. CA needs only parent CA to become a part of the network and there might be many intermediate CAs. There might even be more orderer organizations at various levels of the network (which is fine as long as the ordering service is centralized at the channel level).
Inside the peer
A peer belongs to one organization and may belong to multiple channels. It is responsible for keeping the ledgers and chaincodes.
A ledger consists of two parts: a log of transactions (blockchain) and a state database (world state). All peers in the channel have their own copy of the ledger. If a peer belongs to multiple channels, it has multiple ledgers, one per channel.
Aside from keeping the state, a peer keeps and manages chaincodes. This is a broader concept than the smart contract itself. Smart contracts are packaged as a chaincode. Thus, the chaincode is a bundle that, among others, contains smart contracts.
A peer receives smart contract calls and passes them to chaincodes. Chaincodes form separate containers in the network. Each single chaincode container is created for a given chaincode in a channel, for a given peer and a given version of the chaincode. Since Hyperledger Fabric 2.0, it is also possible to use a custom external service as a chaincode.
Blockchain is a log of transactions grouped in blocks, ordered by ordering service. It is physically stored in the peer nodes and immutable. It cannot be changed.
Each block contains:
A header with the block number, a hash and the previous block hash. Thanks to the information from blocks' headers, the whole blockchain is interlinked and resistant to manipulations.
Metadata with signatures of the block creator and some low-level data for ensuring the consistency of the state.
A list of transactions with client application signatures, chaincode names, input parameters, reads and writes to the world state and lists of transaction execution outputs from all required organizations.
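The way block headers interlink the chain can be illustrated with a minimal sketch. It is a simplification, not the Fabric block format: the field names and the helper functions (block_hash, append_block, is_chain_valid) are hypothetical, and the hash here covers the whole block content rather than a real Fabric header.

```python
import hashlib
import json


def block_hash(number, previous_hash, transactions):
    """Hash the block number, the previous block hash and the transactions."""
    payload = json.dumps(
        {"number": number, "previous_hash": previous_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a new block that points at the hash of the current last block."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    number = len(chain)
    chain.append({
        "number": number,
        "previous_hash": previous_hash,
        "transactions": transactions,
        "hash": block_hash(number, previous_hash, transactions),
    })


def is_chain_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["number"], block["previous_hash"], block["transactions"])
        if block["hash"] != recomputed:
            return False
    return True


chain = []
append_block(chain, ["tx1", "tx2"])
append_block(chain, ["tx3"])
print(is_chain_valid(chain))  # True for the untouched chain

chain[0]["transactions"][0] = "forged"
print(is_chain_valid(chain))  # False: block 0 no longer matches its hash
```

Changing any transaction in an earlier block changes that block's hash, so the manipulation is detected when the chain is verified.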
Blockchain is a single source of truth for the ledger. The world state is derived from the blockchain.
The world state is a key-value store that contains data created by transactions from the blockchain. Unlike the blockchain, the world state is mutable. It represents the latest values for all keys updated in the transactions.
Even if a smart contract "reads" and "saves" the world state data, during the smart contract execution, there is no actual change of the world state. A smart contract always has access to the world state from the previous block and the result of execution contains a read/write set, i.e. a list of keys read and a list of keys and values to be updated.
Once the ordering service validates proposed transactions, it packs them into blocks. These blocks are sent to peers and appended to the chain. World state gets updated after that on the basis of the read/write set from transactions.
The world state might be stored in an embedded LevelDB database on the peer or in a separated CouchDB node dedicated to the peer. The main benefit of the latter one is the ability to execute complex queries against the world state.
More details about the ledger can be found in the official documentation.
An endorsement policy defines which organizations in the network must execute a smart contract and agree on the execution result for the transaction to be considered valid. For instance, it might be enough for one organization to execute the transaction, or it might be required from all organizations, the majority of organizations, or another sub-group of organizations defined by the endorsement policy.
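The three simplest cases (one organization, all organizations, the majority) can be sketched as a small check. This is an illustration only: the function name is_endorsed and the policy labels "ANY", "ALL" and "MAJORITY" are assumptions made for the example, not the Fabric policy syntax.

```python
def is_endorsed(policy, endorsing_orgs, channel_orgs):
    """Check whether the set of endorsing organizations satisfies a simple policy."""
    all_orgs = set(channel_orgs)
    # Ignore endorsements from organizations that are not part of the channel.
    endorsing = set(endorsing_orgs) & all_orgs
    if policy == "ANY":
        return len(endorsing) >= 1
    if policy == "ALL":
        return endorsing == all_orgs
    if policy == "MAJORITY":
        return len(endorsing) > len(all_orgs) / 2
    raise ValueError(f"unknown policy: {policy}")


orgs = ["Org1", "Org2", "Org3"]
print(is_endorsed("ANY", ["Org2"], orgs))               # True
print(is_endorsed("MAJORITY", ["Org1"], orgs))          # False
print(is_endorsed("MAJORITY", ["Org1", "Org3"], orgs))  # True
print(is_endorsed("ALL", ["Org1", "Org3"], orgs))       # False
```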
Private data collection
Hyperledger Fabric has a mechanism to store the data that is kept off-chain and might be available only to a subset of organizations in the channel. Only the hash of the data is saved in the blockchain.
This mechanism is called private data. Each chaincode might contain many private data collections, available only to organizations specified during the chaincode installation process.
Keep the data secret
Since blockchain is immutable, you need to think twice about what kind of data should be stored in the ledger. This is crucial both in terms of cooperation with other organizations with limited trust and the GDPR's right to be forgotten.
You need to be careful with the content of the transaction stored in the chain: smart contract input parameters, keys read from the world state, keys and values to be updated in the world state, and the response from the smart contract.
Fortunately, Hyperledger Fabric has some solutions that allow you to keep the data secret.
|I want to save sensitive information and use it in smart contracts|private data*|
|I want to pass sensitive information to a smart contract|transient parameters|
|I want to use sensitive information as a key in the world state|just don't|
|I want to get sensitive information from a smart contract|use query instead of invoke**|
* as long as the data is not "simple" and therefore vulnerable to a brute-force attack
** but only if you really need to do it; returning sensitive information from a smart contract is a bad practice
The last one, using query instead of invoke, needs special attention because it is not related to smart contract design but to the way the client application calls the smart contract. It is described later, in the section "Invoke vs query".
In most cases, you will probably combine both private data collections and transient parameters. It is also important to avoid "invoke" transactions on smart contracts that return sensitive information.
The private data mechanism keeps your data private while the blockchain still guarantees its integrity. This is possible because when you save private data in a smart contract, a SHA-256 hash of the data is saved on the blockchain.
You can configure the private data collection in a chaincode in a way that only selected organizations will have read or write access to the data. Regardless of permissions, all organizations may verify the data by checking if the hash saved on blockchain is the same as the hash of the data they want to check.
Additionally, each chaincode has an implicit private data collection. It can be used to store an individual organization’s private data, and does not need to be defined explicitly (see the documentation).
Note that the private data mechanism in Hyperledger Fabric is not GDPR-compliant, since there is no way to remove the data from temporary databases on peers (see FAB-12038 and related issues). Besides, private data should not consist of simple values. The SHA-256 hash is deterministic, so simple values might be vulnerable to a brute-force attack.
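The brute-force risk for simple values can be shown with a short sketch. It assumes, for illustration only, that the on-chain hash is a plain SHA-256 of the value as a string; the helper names are hypothetical. Appending a random salt to the value is shown as one possible mitigation, which is an addition for the example and not something prescribed above.

```python
import hashlib
import secrets


def sha256_hex(value):
    """Return the SHA-256 hash of a string as a hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def brute_force(target_hash, candidates):
    """Try every candidate and return the one whose hash matches, if any."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# A "simple" private value, e.g. an amount from a small, guessable range.
on_chain_hash = sha256_hex("42000")
print(brute_force(on_chain_hash, (str(n) for n in range(100_000))))  # 42000

# The same value with a random salt appended is no longer guessable this way.
salted_hash = sha256_hex("42000" + secrets.token_hex(16))
print(brute_force(salted_hash, (str(n) for n in range(100_000))))  # None
```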
Transient parameters of smart contract invocation will not appear in the transaction log. It is safe to pass sensitive information as transient parameters.
There is a somewhat dated but great article about the details of private data and transient parameters: Private Data and Transient Data in Hyperledger Fabric.
You can think of chaincodes and smart contracts as a set of automated business rules that concern multiple organizations in the network. We need a formal process of deploying them. The network participants need to agree on a way they cooperate.
It is not required to install the chaincode on all peers in the channel. You can choose only the subset of peers.
At first, the chaincode needs to be built by a tool specific for the language used and packaged by the Hyperledger Fabric tools to prepare a package to be disseminated among the peers.
Each organization in the channel that wants to use the chaincode installs the chaincode package on its peers.
Organizations need to approve the chaincode definition. By default, chaincode needs to be approved by single peers from the majority of organizations.
A peer commits the chaincode definition. Now the chaincode (or new version of the chaincode) can execute transactions.
Some chaincodes may require manual initialization, i.e. calling the Init method before any other smart contracts. In this case, if you want your chaincode to be operational, you should first invoke the Init transaction on the chaincode.
The same process is used for both chaincode installation and upgrade. In case of an upgrade, the peer takes care of terminating the container with the old version of the chaincode.
Note: the chaincode installation process was different in old Hyperledger Fabric versions (before
The transaction flow
Before the transaction, a client application needs to have a certificate to interact with the network and needs to know the relevant part of the blockchain network. Those first two steps do not need to happen before every transaction.
At first, the client application needs to enroll a user to get a valid certificate that is required to interact with the Hyperledger Fabric network.
Then, the client application calls a peer (or several peers) to discover the network via service discovery. The discovery results are specific for a given channel and for a given user. Since Hyperledger Fabric is a private permissioned blockchain, users may have access only to a part of the network.
This step is optional but useful. Alternatively, you can manually configure the required addresses of nodes the client application needs to interact with (for example with a connection profile file).
When the client application has the required certificate and knows the topology of the required part of the network, it can actually start performing the transaction.
The client application sends transaction proposals to peers in a way that it will satisfy the endorsement policy. Typically, it will call peers from multiple organizations.
The peers that received transaction proposals simulate the transaction by calling the given smart contract. They determine, based on their copy of the ledger, what is going to be read and what is going to be written to the world state if the transaction succeeds (i.e. read/write sets). This information, along with peer signatures, is returned to the client application.
Now, the actual transaction starts.
The client application sends the transaction to the ordering service. Each transaction contains, among others, peer signatures and simulation results.
Once the ordering service gathers, validates, and orders an appropriate number of transactions, it forms a block. Then it sends the block to the leading peers, which forward it to other peers in the channel.
All peers that receive the block validate and apply the transactions. New transactions are appended to the blockchain copies on the peers and the world state databases are updated with the transaction read/write sets.
The client application is supposed to wait until the desired peers notify it that the transaction was completed. A notification about a successful transaction means that it was actually appended to the blockchain on a given peer.
You can read more about the transaction flow in the documentation.
Invoke vs query
There are two possible ways to execute the smart contract: invoke and query. The first one covers the whole transaction flow. The second one, query, calls only one peer to get the result of smart contract invocation.
| |Invoke|Query|
|May result in the update of the world state|✓|✕|
|The transaction is saved in the chain|✓|✕|
|May require responses from multiple peers|✓|✕|
Query is far more lightweight than invoke since it does not need to engage multiple peers and the ordering service. It is best suited for low-latency reads when eventual consistency is enough and you don't need to record the transaction in the blockchain.
Invoke and sensitive information
The result of the invoke transaction is saved in the blockchain. If you invoke the smart contract and it returns sensitive information, all organizations in the channel get access to it.
The obvious solution to this problem is not to invoke this smart contract and use query to get the results. But it is easy to make a mistake that cannot be fixed. It also depends on the client application rather than the blockchain and chaincodes themselves. So if you don't really need to, just don't. Returning sensitive information by a smart contract should be considered as a bad practice.
We can talk of multiple peer types in terms of their role in the blockchain network and the transaction flow.
First, there are anchor peers - peers that are available outside the organization. Any cross-organization communication requires anchor peers. They play a significant role in the service discovery - they discover the network and they might be discovered from other organizations. In the transaction flow diagram above, anchor peers are peer 1 and peer 3.
Then, at the simulation phase of the transaction, the smart contract is executed in endorsing peers. In the transaction flow diagram above, endorsing peers are peer 2 and peer 3.
When the transaction is ordered, the ordering service delivers new blocks to leading peers and they distribute them further in the network. In the transaction flow diagram above, leading peers are peer 2 and peer 3.
Finally, committing peers append all new transactions to their own copy of the ledger. All peers in the channel are committing peers.
The transaction flow FAQ
What peers should endorse the transaction?
It is determined by the endorsement policy that is assigned to the chaincode. If the endorsement policy is not fulfilled, i.e. not enough organizations receive the transaction proposal, the transaction fails.
Hyperledger Fabric client SDKs have various options to determine the peers to send the transaction proposals.
Peer addresses can be provided manually or in a configuration file.
If there is a discovery service configured, it will be responsible for determining the endorsing peers.
Target organizations can be provided manually. Then the proposals are sent to endorsing peers from the organizations.
If none of the previous conditions is met, the proposal is sent to endorsing peers from the channel.
Still, regardless of the peer selection process, the only requirement is to satisfy the endorsement policy.
What peers should confirm the transaction?
It depends on your case. The Hyperledger Fabric client SDKs provide the NONE option, which means that the client app does not wait for any confirmation, but in most cases you need some confirmation to be sure the transaction was successful.
The other default strategies combine two parts: the scope of the network and the required number of peers. The available scopes are as follows:
wait for peers from your organization only,
prefer your organization, but wait for other peers if your organization does not have any,
wait for available peers from the whole network.
In each case, you may choose to wait for all peers (ALLFORTX) or for any peer (ANYFORTX). If you need something custom, you can implement your own strategy.
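The difference between NONE, ALLFORTX and ANYFORTX can be sketched as a simple check over the peers in the chosen scope. This is not SDK code: the function is_confirmed and its parameters are hypothetical, and the strategy labels are used here without their scope part.

```python
def is_confirmed(strategy, peers_in_scope, committed_peers):
    """Decide whether the client may treat the transaction as confirmed."""
    in_scope = set(peers_in_scope)
    # Only commit notifications from peers in the chosen scope count.
    committed = set(committed_peers) & in_scope
    if strategy == "NONE":
        return True  # do not wait for any confirmation
    if strategy == "ALLFORTX":
        return committed == in_scope  # every peer in scope has committed
    if strategy == "ANYFORTX":
        return len(committed) > 0  # at least one peer in scope has committed
    raise ValueError(f"unknown strategy: {strategy}")


scope = ["peer0.org1", "peer1.org1"]
print(is_confirmed("ANYFORTX", scope, ["peer0.org1"]))  # True
print(is_confirmed("ALLFORTX", scope, ["peer0.org1"]))  # False
print(is_confirmed("NONE", scope, []))                  # True
```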
What peer should be queried?
Hyperledger Fabric client SDKs have several built-in strategies for querying peers. By default, you can query a single peer in your organization and change the peer if the query fails (MSPID_SCOPE_SINGLE) or you can make each call to a different peer.
It may happen that your organization has no peers and you want to call other organizations' peers. In this case, there are similar predefined strategies that start with PREFER_MSPID_ instead of MSPID_. Alternatively, if you want something custom, you can implement your own strategy.
Which user should call a smart contract?
There are basically two main approaches. First, you can use the Hyperledger Fabric network in a database-like manner. In this case, there is a dedicated user per organization to call smart contracts. Each organization should create this user in CA and use its credentials in enrollment (step 1 in the transaction flow).
On the other hand, you may want application users to call smart contracts. In this case, each application user should be registered in CA and use its own credentials in the enrollment. This approach is more complex, since you have to create and maintain user accounts in the CA. It allows you, however, to track who actually called a smart contract, which might be required in many business cases.
When can a transaction fail?
A transaction can fail on many levels. At the endorsement phase (i.e. steps 3 and 4), validation is performed by the peers and the client SDK. There might be an error in smart contract execution, the endorsement policy might not be fulfilled, or the responses from peers might differ. In this case, the transaction is cancelled and not appended to the chain.
Once the SDK assembles and sends the transaction to the ordering service, it will become a part of a block. It might, however, be marked as valid or invalid. The transaction may still fail at this stage, for example when the endorsement policy is not fulfilled or in case of key collisions. A new block containing the transaction might also be rejected by the peer (see step 7 in the transaction flow).
How many transactions are in the block?
It depends. There are several configuration options regarding the block size.
There is a BatchSize group of parameters (including MaxMessageCount and PreferredMaxBytes). MaxMessageCount specifies the maximum number of transactions in the block; the remaining ones set the maximum and preferred block size.
Then, there is a BatchTimeout config parameter that specifies how long the orderer should wait to form a block. Even if there are fewer transactions than specified by the other parameters, the block will be created after the specified amount of time. This value should typically be set to 1-5 seconds to keep the blockchain responsive under low load.
Note that in some cases it is useful to set MaxMessageCount to 1. This is a quick and easy way to handle the key collisions problem in networks that do not need high performance.
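The interplay of MaxMessageCount and BatchTimeout can be illustrated with a simplified simulation. It is a sketch under assumptions: the function cut_blocks is hypothetical, the byte-size limits are ignored, and the timeout is only evaluated when the next transaction arrives, which is not how a real orderer works.

```python
def cut_blocks(arrivals, max_message_count, batch_timeout):
    """Group (timestamp, tx_id) pairs, sorted by time, into blocks."""
    blocks = []
    pending = []
    batch_started_at = None
    for timestamp, tx_id in arrivals:
        # The timeout of the pending batch expired before this arrival.
        if pending and timestamp - batch_started_at >= batch_timeout:
            blocks.append(pending)
            pending = []
        if not pending:
            batch_started_at = timestamp
        pending.append(tx_id)
        # The batch reached the maximum number of transactions.
        if len(pending) >= max_message_count:
            blocks.append(pending)
            pending = []
    if pending:
        blocks.append(pending)  # cut by the timeout after the last arrival
    return blocks


arrivals = [(0.0, "tx1"), (0.1, "tx2"), (0.2, "tx3"), (5.0, "tx4")]
print(cut_blocks(arrivals, max_message_count=2, batch_timeout=2.0))
# [['tx1', 'tx2'], ['tx3'], ['tx4']]
print(cut_blocks(arrivals, max_message_count=1, batch_timeout=2.0))
# [['tx1'], ['tx2'], ['tx3'], ['tx4']]
```

With MaxMessageCount set to 1, every transaction lands in its own block, which is why that setting removes conflicts between transactions within a block.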
Notes on consensus
The consensus in the Hyperledger Fabric network is a result of the cooperation of different nodes across the whole transaction flow, and it differs radically from the probabilistic consensus mechanisms of public blockchains. There are three mechanisms that take part in reaching the consensus.
1. Endorsement policy
Each chaincode has an endorsement policy that dictates what organizations need to execute the smart contracts. The results are signed by the peers. If the endorsement policy is satisfied and all peers return the same value, the content of the transaction is considered valid.
2. MVCC (Multi-version concurrency control)
This is a mechanism that detects concurrent writes or reads and writes in the same blocks. It returns errors in case of conflicts (see the section about key collisions).
3. Ordering service
It is not responsible for the content of transactions but for the order of transactions. It also forms the blockchain - a single, final source of truth in the network.
The ordering service typically contains multiple nodes that agree on the state using their own consensus mechanism. Currently, the recommended algorithm is Raft. There is also a deprecated mechanism based on Apache Kafka and a deprecated Solo consensus (which is not production-ready anyway).
The problem is that both mechanisms (Kafka and Raft) are only crash fault tolerant. It means that the ordering service should be centralized (at least at the channel level) and governed by a trusted organization. Taking over the orderer nodes and serving malicious content to different peers can lead to corrupted blockchain.
CFT and BFT
CFT (or crash fault tolerance) means that a distributed system can operate normally even when a number of nodes fail. BFT (or Byzantine fault tolerance) means that a distributed system can operate normally even when a number of nodes lie.
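The practical difference between the two can be expressed as simple arithmetic. The sketch below uses the general textbook bounds, not anything specific to Fabric: a majority-quorum protocol such as Raft keeps working while more than half of the nodes are up, and classical BFT protocols require at least 3f + 1 nodes to tolerate f lying nodes. The function names are hypothetical.

```python
def max_crash_faults(nodes):
    """Crashed nodes tolerated by a majority-quorum (Raft-style) cluster."""
    return (nodes - 1) // 2


def max_byzantine_faults(nodes):
    """Lying nodes tolerated under the classical n >= 3f + 1 bound."""
    return (nodes - 1) // 3


for n in (3, 4, 5, 7):
    print(n, max_crash_faults(n), max_byzantine_faults(n))
# 3 1 0
# 4 1 1
# 5 2 1
# 7 3 2
```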
BFT is crucial when many organizations cooperate. It is in the interest of all organizations to be resistant to malicious actions of the other ones. With proper endorsement policies and the ordering service managed by a trusted organization, the Hyperledger Fabric network might be considered BFT at the organization level. Even when an organization lies, the state won't be corrupted.
Typically, in public permissionless blockchains there are various algorithms (for example Proof of Work or Proof of Stake) that ensure the blockchain nodes agree on transactions. Those algorithms are probabilistic and lead to probabilistic finality of the state. It means that once the block is appended to the chain, it does not become final - it just becomes final with a high degree of probability and can be reverted in case of network forks.
In Hyperledger Fabric, on the other hand, once a transaction is committed to the chain, it becomes final (absolute finality). There are no forks because the service responsible for the consensus itself - the ordering service - is centralized and deterministic. Blockchain cannot be changed.
Consistency over latency
Since the ordering service guarantees single ordered transaction history and concurrency control mechanisms fail conflicting transactions, Hyperledger Fabric provides, at the level of smart contract execution, strong consistency of the data (i.e. linearizability).
On the other hand, it shows the trade-off that was made. Hyperledger Fabric prefers consistency over latency. When there are many conflicting updates, the overall performance will dramatically decrease.
We might try to improve the performance by creating more channels with the same chaincode to distribute the application load and achieve partitioning - but we will lose the strong consistency of the data. We may try some other tweaks, as stated in the key collisions part of the cheat sheet. But still in all cases, we must sacrifice a part of the consistency to improve the performance.
Read our post for more details: Strong data consistency and finality in Hyperledger Fabric blockchain.
Handling duplicate messages is an issue in all distributed systems. When communication occurs over the network, there is always the possibility of message loss. If you don't want to lose messages (as may happen with at-most-once delivery), you have to repeat them until they are confirmed (at-least-once delivery). Then you need to cope with the duplicates.
If a request is idempotent, it means that even if it is called twice, it will return the same value and does not lead to additional changes of the state.
Hyperledger Fabric does not provide a mechanism to handle idempotency. You need to take care of it at the level of the smart contract, either by designing the world state data model to be resistant to duplicate calls or by saving the transactions themselves along with their results, so that the saved result can be returned in case of duplicates.
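The second approach, saving the result of each call and returning it for duplicates, might look like the sketch below. The world state is modeled as a plain dictionary, and the deposit function, the request_id parameter and the key prefixes are all hypothetical names chosen for the example.

```python
def deposit(world_state, request_id, account, amount):
    """Add an amount to an account balance at most once per request_id."""
    result_key = "result:" + request_id
    if result_key in world_state:
        # Duplicate call: return the saved result without changing the state.
        return world_state[result_key]
    balance_key = "balance:" + account
    new_balance = world_state.get(balance_key, 0) + amount
    world_state[balance_key] = new_balance
    world_state[result_key] = new_balance
    return new_balance


state = {}
print(deposit(state, "req-1", "alice", 10))  # 10
print(deposit(state, "req-1", "alice", 10))  # 10 (duplicate, no second update)
print(deposit(state, "req-2", "alice", 10))  # 20
```

The client has to generate the request identifier once and reuse it for every retry of the same call.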
Hyperledger Fabric uses optimistic locking to keep the state consistent in case of concurrent modifications. When the same key is read and saved (or saved several times) within the same block, only the first transaction is going to pass. The latter one(s) will fail with an error indicating concurrency control failure (key collision or phantom read conflict). Avoiding this kind of error adds significant complexity to the development of both the client application and the smart contract.
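A read/write conflict within a block can be sketched with a simplified version check. The model below is an assumption made for illustration: versions are plain integers, a transaction is a dictionary with "reads" (key to version seen during simulation) and "writes" (key to new value), and validate_block is a hypothetical helper, not a Fabric API.

```python
def validate_block(world_state, block):
    """Validate transactions in order; world_state maps key -> (value, version)."""
    results = []
    for tx in block:
        # A transaction is valid only if every key it read is still
        # at the version it saw during simulation.
        valid = all(
            world_state.get(key, (None, 0))[1] == version
            for key, version in tx["reads"].items()
        )
        if valid:
            for key, value in tx["writes"].items():
                _, version = world_state.get(key, (None, 0))
                world_state[key] = (value, version + 1)
        results.append(valid)
    return results


state = {"counter": (10, 1)}
# Both transactions were simulated against the same state (version 1).
tx_a = {"reads": {"counter": 1}, "writes": {"counter": 11}}
tx_b = {"reads": {"counter": 1}, "writes": {"counter": 11}}
print(validate_block(state, [tx_a, tx_b]))  # [True, False]
print(state["counter"])                     # (11, 2)
```

The second transaction fails because the first one already changed the version of the key it had read.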
For the simplest use cases that do not require high performance, probably the best way is to force the ordering service to allow only one transaction in the block (by setting the block size parameter described in "How many transactions are in the block?"). It will effectively disable the concurrency and solve the problem.
In case you need high performance, you need a custom solution, probably a complex one. You need to handle key collisions at the level of client application, the smart contracts themselves or both.
There are various strategies, for example:
Using a queue on the client side and taking care on the client side that no conflicting keys are going to be updated on the blockchain.
Designing the data model in the world state and smart contracts in a way that key collisions are not possible. Then, probably periodically running a smart contract to update the parts of the state that might conflict in case of concurrency (a running total approach).
Querying blockchain without concurrency but using batching to improve the actual throughput.
Each of these strategies has its own downsides. You can read more in our cycle of articles about handling key collisions in Hyperledger Fabric: Concurrent smart contracts in Hyperledger Fabric blockchain (part 1, part 2, part 3).
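The running total approach mentioned above can be sketched as follows. It is a minimal illustration with the world state modeled as a dictionary; the function names and the key layout ("delta:..." and "total:...") are hypothetical. Each transaction writes its own unique key, so concurrent transactions never touch the same key, and a periodic aggregation folds the deltas into a single total.

```python
def add_delta(world_state, account, tx_id, amount):
    """Record a change under a key unique to the transaction."""
    world_state[f"delta:{account}:{tx_id}"] = amount


def aggregate(world_state, account):
    """Fold all recorded deltas into the running total and remove them."""
    prefix = f"delta:{account}:"
    total_key = f"total:{account}"
    delta_keys = [key for key in world_state if key.startswith(prefix)]
    total = world_state.get(total_key, 0) + sum(world_state[key] for key in delta_keys)
    for key in delta_keys:
        del world_state[key]
    world_state[total_key] = total
    return total


state = {}
add_delta(state, "alice", "tx1", 10)
add_delta(state, "alice", "tx2", -3)
print(aggregate(state, "alice"))  # 7
add_delta(state, "alice", "tx3", 5)
print(aggregate(state, "alice"))  # 12
```

The downside is that the total is only up to date after the aggregation runs, so reads between aggregations have to sum the pending deltas or accept a stale value.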
SDKs and APIs
Hyperledger Fabric provides official tools for the development of smart contracts and client applications for three platforms: Node.js, Java, and Go.
Client SDKs are used to interact with the network, for example obtain certificates from the CA or call smart contracts. Contract APIs are frameworks for writing smart contracts.