At a business process level, it’s important to understand
that, contrary to the impression that might be optimistically formed after
reading numerous reports from major consulting firms, simply deploying a
blockchain is not going to immediately solve entrenched business problems.
For example, in the financial markets, blockchain approaches
are often cited as a solution to the slow and costly burden of post-trade
reconciliation, which essentially compares the details of a transaction as
recorded by the buyer and by the seller to ensure they agree.
Current processes in many markets often see each side of a
transaction record it separately and then compare, or attempt to reconcile, the
two records at the end of each trading day. Not surprisingly, when a mismatch
occurs, it takes substantial time and effort to identify which data elements
have been incorrectly recorded. Sometimes a negotiation is required to reach
agreement, which can cut into profits.
For reconciliations, blockchain’s role is to make a single
record of the transaction in a shared ledger. But in order to do this, the two
sides of the transaction need to be matched continuously, as soon after
execution as possible. Governance rules also need to be agreed upon to
determine who takes responsibility for committing the matched transaction
to the blockchain. Smart contracts running within the blockchain can be used
to control or assist with this matching. Such an
approach is highly desirable from a business perspective, but it will likely
require big changes to business processes and to the work of the people who oversee them.
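The continuous matching described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `TradeRecord` fields, the `Matcher` class and the in-memory `ledger` list are hypothetical stand-ins, not part of any real trading system or blockchain API.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TradeRecord:
    # Hypothetical fields; a real trade record carries many more.
    trade_id: str
    instrument: str
    quantity: int
    price: str        # kept as a string to avoid float rounding differences
    settle_date: str


def mismatched_fields(first, second):
    """Return the names of the fields on which the two records disagree."""
    a, b = asdict(first), asdict(second)
    return [name for name in a if a[name] != b[name]]


class Matcher:
    """Holds a record until the counterparty's version of it arrives."""

    def __init__(self):
        self.pending = {}   # trade_id -> first record seen
        self.ledger = []    # stand-in for the shared ledger
        self.breaks = []    # (trade_id, mismatched field names)

    def submit(self, record):
        other = self.pending.pop(record.trade_id, None)
        if other is None:
            # First side to report: wait for the counterparty.
            self.pending[record.trade_id] = record
            return
        diffs = mismatched_fields(other, record)
        if diffs:
            self.breaks.append((record.trade_id, diffs))
        else:
            # Both sides agree: commit a single shared record.
            self.ledger.append(record)
```

In this sketch a mismatch is reported as soon as the second side submits its record, with the disagreeing fields named, rather than being discovered at the end of the trading day. Who is allowed to call `submit` and who commits to the ledger are exactly the governance questions raised above.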
Fortunately, the ongoing move to electronic trading of
financial instruments provides an ideal opportunity to conduct the
reconciliation activity as the transaction occurs, since the output of the
trading system is a stream of matched trades, suitable for recording in a
blockchain.
Maintaining static reference data on financial instruments (such as security
codes, custodian and bank settlement information, and interest payment
details) in a single, shared ledger is a good approach to ensuring the data
remains consistent and available to both sides of the transaction. But
governance issues need to be addressed with regard to ownership of the ledger
and to ensure that such information is correctly recorded at the outset.
Perhaps the service being developed by CONCUR Reference Data points to
what is possible.
From a business process perspective, moving to blockchain-based
reconciliation is likely to require organizational effort and upheaval. But
once that transformation has been completed, the ongoing operational
efficiencies promise to be very significant: savings of billions of dollars
across the financial markets industry have been suggested in various reports.
At a technical level, the data management and integration
aspect of moving to blockchain also presents challenges, a number of which have
only begun to be recognized as a result of running proofs of concept (POCs).
Existing applications are likely to already rely on some
kind of local database technology, whether relational, NoSQL or
something more exotic. While the benefits of moving from local
databases to a shared blockchain might be substantial, the cost and risk
involved in application redesign, coding, testing and deployment can also be
considerable.
More straightforward approaches to integrating
blockchains do exist, but they will be highly dependent on individual
applications and their design. For example, it might be possible to tap into
existing messaging middleware in order to access the same data stream that is
being committed to a local database. Also, some databases have “event trigger”
interfaces so that when data is written to the local database, it is also made
available for other applications via an event-driven API. In these scenarios, a
new “agent application” might be implemented to run alongside existing code and
the local database, and used to feed data to a blockchain.
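As a rough illustration of the “agent application” idea, the sketch below drains change events from a queue and forwards each one to a ledger client. The queue stands in for messaging middleware or a database event trigger, and `LedgerClient` is a hypothetical placeholder that only collects payloads; neither reflects the API of any particular product.

```python
import json
import queue


class LedgerClient:
    """Hypothetical stand-in for a blockchain client; it only collects payloads."""

    def __init__(self):
        self.submitted = []

    def submit(self, payload):
        self.submitted.append(payload)


def run_agent(events, ledger):
    """Forward every queued change event to the ledger.

    Returns the number of events forwarded once the queue is empty.
    """
    forwarded = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return forwarded
        # Serialize deterministically so the same event always yields the same bytes.
        payload = json.dumps(event, sort_keys=True).encode("utf-8")
        ledger.submit(payload)
        forwarded += 1
```

The existing application and its local database are untouched in this arrangement; the agent only observes the data stream. A production agent would also need to handle retries, ordering and failures, which this sketch leaves out.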
Also, before one looks at redesigning an application for a blockchain, one has
to consider how blockchains typically store data and any limitations they might
exhibit that will affect the overall data management architecture.
In general, blockchains are implemented using a simple “key
value store” database technology running on each node. The open source LevelDB
is popular and is used by both Ethereum and Hyperledger Fabric. This
technology is generally fast and lightweight, but it offers limited
functionality, storing data records as unstructured binary large objects
(BLOBs). BLOBs are flexible in what can be stored in them, but processing must
be performed at the application level to make sense of the content and to
subsequently search on it (by contrast, SQL databases can be searched by
specific fields within a record).
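The difference between searching BLOBs and searching fields can be shown with a small sketch. A plain Python dictionary stands in for a key-value store such as LevelDB, and an in-memory SQLite database stands in for a relational store; the record layout and function names are assumptions made for illustration.

```python
import json
import sqlite3

# Key-value store: opaque byte values keyed by opaque byte keys.
kv = {
    b"trade:1": json.dumps({"instrument": "XYZ", "quantity": 100}).encode(),
    b"trade:2": json.dumps({"instrument": "ABC", "quantity": 250}).encode(),
    b"trade:3": json.dumps({"instrument": "XYZ", "quantity": 75}).encode(),
}


def kv_find_by_instrument(store, instrument):
    """Scan every value, decode it, and filter in application code."""
    hits = []
    for key, blob in store.items():
        record = json.loads(blob)   # the store itself knows nothing about fields
        if record["instrument"] == instrument:
            hits.append(key)
    return sorted(hits)


# Relational store: the database knows the fields and can filter on them.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, instrument TEXT, quantity INTEGER)"
)
db.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [(1, "XYZ", 100), (2, "ABC", 250), (3, "XYZ", 75)],
)


def sql_find_by_instrument(conn, instrument):
    """Let the database filter on a named field."""
    rows = conn.execute(
        "SELECT id FROM trades WHERE instrument = ? ORDER BY id", (instrument,)
    )
    return [row[0] for row in rows]
```

Both functions find the same two trades, but the key-value version has to read and decode every record to do so, while the SQL version can be served by the database, with an index if one is defined.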
Some blockchains, such as R3’s Corda, are built upon a relational database
model, which can be queried directly using SQL. But such approaches are
currently not common, since the limitations of the likes of LevelDB are only
now beginning to surface. In the future, established database vendors and their
tried-and-tested technologies may well play a key role in implementing
blockchains.
With the majority of blockchains, and even with Corda, which uses the open
source H2 database as standard, there are other potential limitations. One is
scalability: blockchains tend to limit how much data is stored within
them in order to maintain performance as they scale across nodes. With Corda,
for example, it is possible to store documents (such as the legalese related to
smart contracts) along with transactions, but only up to a 10 MB limit.
Because of storage limits, which vary from one blockchain offering to
another, an evolving architectural approach is to store large data sets
outside of the blockchain (typically referred to as “off chain”), create a
hash of each data set, and store that hash on the blockchain along with a link
to the source data and perhaps other key data.
This hybrid on-/off-chain architecture is elegant in that
it leverages the immutability of blockchains in order to provide data
integrity, while also making use of storage approaches that are “fit for
purpose” for recording large data sets.
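A minimal sketch of the hybrid pattern follows, assuming SHA-256 as the hash function. The `off_chain` dictionary and `on_chain` list are hypothetical in-memory stand-ins for a real storage service and a real blockchain, and the function names are illustrative only.

```python
import hashlib

off_chain = {}   # stand-in for cloud or file storage: location -> bytes
on_chain = []    # stand-in for a blockchain: append-only list of small records


def store_document(location, data):
    """Put the bulk data off chain and record only its hash and location on chain."""
    off_chain[location] = data
    entry = {"location": location, "sha256": hashlib.sha256(data).hexdigest()}
    on_chain.append(entry)
    return entry


def verify_document(entry):
    """Re-hash the off-chain data and compare it with the on-chain hash."""
    data = off_chain.get(entry["location"])
    if data is None:
        return False
    return hashlib.sha256(data).hexdigest() == entry["sha256"]
```

If the off-chain copy is altered or removed after the hash has been recorded, `verify_document` returns `False`. The blockchain therefore vouches for the integrity of the data without having to hold it, although it cannot restore data that has been lost.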
However, the hybrid data storage model raises questions as to where the bulk of
the source data set is actually stored, how secure it is and sometimes how
centralized it is. Some blockchain-aligned cloud storage mechanisms, such as
InterPlanetary File System (IPFS), Storj and BigchainDB, are emerging as
potential solutions, but in their current early state of development
they are unlikely to be deemed enterprise-ready by major corporations.
More likely, the corporate world will turn to traditional commercial cloud
vendors, including Amazon Web Services (AWS), IBM Cloud and Microsoft Azure, as off-chain
storage options. Such clouds are also leading contenders for hosting
blockchains and associated smart contracts, so leveraging them also as a data
store makes sense.
Another reason to architect applications using on- and off-blockchain data sets
is to support business analytics, which often work best when driven by column-oriented
databases, such as Kx Systems’ kdb+ or SAP’s IQ.
There’s no doubt that the unique properties of
blockchain technology will lead to its popularity for many applications, but
just as big data architectures like Hadoop are not a replacement for databases
from the likes of Oracle and MongoDB, so blockchain technologies will be
implemented as one element of a holistic data management architecture. And it’s
not going to be easy.