(Leveraging blockchain for advanced AI use-cases)
Artificial intelligence (AI) aims to simulate human intelligence in order to perform tasks. Machine learning (ML), a subset of AI, is a discipline that focuses on teaching systems to improve over time by learning to be more accurate and efficient. AI is becoming popular across many industries partly because the science behind it has evolved recently, and partly because compute power and cloud computing have made it accessible to a much larger audience. Can we envision scenarios where AI could also benefit from blockchain technologies?
Decentralized platform for data and models
AI needs data, huge amounts of data, to come up with reliable recommendations and predictions. If companies like Google, Facebook, or Amazon are leaders in the AI space, it is mostly because they have easy access to petabytes of data. But for most companies, and especially small ones, it can be difficult to get meaningful data to train models. For these companies (thousands of them) that want to benefit from AI, there is a need for more democratic access to data sources.
Blockchain is a platform to manage digital transactions. As data and AI models are digital assets, a new business model that leverages blockchain may emerge in the AI space. Data is the new oil, but just like in the petroleum industry, data may need to be refined and transferred in order to be accessible to consumers. In the AI ecosystem, some stakeholders want access to raw data (crude oil) while others may prefer trained models (refined petroleum) they can leverage immediately. A blockchain network could meet both needs. It could serve as a decentralized platform for democratized access to data. It could also be leveraged to manage trained AI models as digital assets.
Blockchain could be used to support an AI marketplace, where some stakeholders could provide access to data and models for consumption by other organizations.
Blockchain transactional data sources
AI must leverage large quantities of data to train, validate and test models. A blockchain ledger, which logs all peer-to-peer transactions, provides such datasets so that AI/ML can be applied to learn how the blockchain operates.
If a machine learning model has access to historical blockchain transactions, it is then possible to apply supervised and unsupervised learning techniques to predict some behaviors or to classify ledger information into data clusters.
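As a minimal sketch of the unsupervised case, the snippet below groups ledger transactions into clusters with a plain k-means loop. The transaction features (amount, fee), the sample values, and the choice of starting centroids are all hypothetical; a real ledger would need feature extraction and scaling first.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Average each cluster; keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical transactions read from a ledger, as (amount, fee) pairs.
transactions = [(1.0, 0.01), (1.2, 0.02), (0.9, 0.01), (50.0, 0.5), (48.0, 0.45), (52.0, 0.55)]

# Assumption: one small and one large transaction are used as starting centroids.
centroids, labels = kmeans(transactions, centroids=[transactions[0], transactions[3]])
print(labels)
```

With this sample data, the small transfers and the large transfers end up in separate clusters, which is the kind of grouping a model could use to flag unusual activity.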
We can also imagine reinforcement learning applied to blockchain, where an agent can learn how to act as a blockchain participant. A trained blockchain agent could then submit transactions or react to network events.
Access to a blockchain ledger provides another source of meaningful data that could be leveraged to train, test, and enrich AI models.
Data confidentiality for AI models
As more and more regulations around the world enforce data privacy (HIPAA, GDPR, etc.), sensitive information management is a major concern for most AI initiatives. How can we deal safely with large amounts of private data in order to train our models? Data anonymization is frequently applied for privacy protection, but the approach may sometimes remove useful training information from datasets.
Another approach is to use encrypted data to train models. This relies on a relatively new technique called homomorphic encryption (HE), where models can be trained without exposing underlying data. IBM and Microsoft have released homomorphic encryption libraries (respectively HElib and SEAL), and last month, at NeurIPS 2018, Intel announced a tool to support AI training on encrypted data (HE-Transformer, based on SEAL).
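To make the idea of computing on encrypted data concrete, here is a toy sketch of the Paillier cryptosystem, which is only additively homomorphic. It is not the scheme used by HElib, SEAL, or HE-Transformer, and the tiny hard-coded primes make it insecure; it is an illustration only. The function names and sample values are assumptions. It computes a linear model score on encrypted features without ever decrypting the inputs.

```python
import math
import random


def generate_keys(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # pow(x, -1, n) needs Python 3.8+
    return (n, g), (lam, mu)


def encrypt(public_key, message):
    n, g = public_key
    while True:  # pick a random r that is coprime with n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(public_key, private_key, ciphertext):
    n, _ = public_key
    lam, mu = private_key
    return ((pow(ciphertext, lam, n * n) - 1) // n * mu) % n


def add_encrypted(public_key, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = public_key
    return (c1 * c2) % (n * n)


def scale_encrypted(public_key, ciphertext, factor):
    """Raising a ciphertext to a plaintext power multiplies the underlying plaintext."""
    n, _ = public_key
    return pow(ciphertext, factor, n * n)


public_key, private_key = generate_keys()

# Hypothetical private features (encrypted by the data owner) and plaintext model weights.
features = [3, 5, 2]
weights = [4, 1, 6]
encrypted_features = [encrypt(public_key, x) for x in features]

# The model owner computes the weighted sum on ciphertexts only.
encrypted_score = encrypt(public_key, 0)
for ciphertext, weight in zip(encrypted_features, weights):
    term = scale_encrypted(public_key, ciphertext, weight)
    encrypted_score = add_encrypted(public_key, encrypted_score, term)

# Only the holder of the private key can read the result.
score = decrypt(public_key, private_key, encrypted_score)
print(score)
```

The decrypted score matches the plaintext weighted sum of the features. Fully homomorphic schemes extend this idea to support both addition and multiplication on ciphertexts, which is what makes training on encrypted data conceivable.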
From a blockchain perspective, security is also based on cryptographic mechanisms. Some blockchain platforms are already exploring advanced techniques to leverage homomorphic encryption.
We may see blockchain evolve to offer HE capabilities and supply privacy-preserving data for machine learning.
Preventing data corruption
Another problem in AI is to ensure that models are trained on relevant data. The quality of a model depends on the quality of the input data. This leads to a major security concern in the AI world, where we must ensure that training datasets are not corrupted. If training data is modified over time by a malicious actor, an AI model can be flawed and become invalid. This is why consistency and traceability of training datasets are crucial.
A blockchain is a digital proof system that provides a traceable, immutable ledger. If we consider a dataset as a digital asset, a blockchain can be used to manage transactions related to training data. The immutability feature of a blockchain ensures that any transaction is logged and cannot be removed. The traceability capability of a blockchain provides information on any kind of update on the digital assets. In other words, if training data is modified, changes will be captured along with reliable information (who, what, when).
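The who/what/when logging described above can be sketched as a hash-chained, append-only log of dataset fingerprints. This is a single-process illustration, not a distributed ledger: there is no consensus, no network, and no signatures. The field names, actors, and sample rows are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def digest(obj):
    """SHA-256 of an object serialized deterministically as JSON."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(ledger, who, what, when, rows):
    """Record a dataset change; each entry commits to the hash of the previous one."""
    entry = {
        "who": who,
        "what": what,
        "when": when,
        "dataset_hash": digest(rows),
        "previous_hash": ledger[-1]["entry_hash"] if ledger else GENESIS_HASH,
    }
    entry["entry_hash"] = digest(entry)  # computed before the key is added
    ledger.append(entry)
    return entry


def verify(ledger):
    """Return True only if no entry was altered, removed, or reordered."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        body = {key: value for key, value in entry.items() if key != "entry_hash"}
        if entry["previous_hash"] != previous_hash or entry["entry_hash"] != digest(body):
            return False
        previous_hash = entry["entry_hash"]
    return True


# Hypothetical training data and change history.
ledger = []
rows = [{"text": "great product", "label": 1}]
append_entry(ledger, "alice", "initial upload", "2019-01-05T09:00:00Z", rows)
rows.append({"text": "poor support", "label": 0})
append_entry(ledger, "bob", "added one labeled row", "2019-01-06T14:30:00Z", rows)

chain_is_valid = verify(ledger)                       # history is intact
current_data_matches = ledger[-1]["dataset_hash"] == digest(rows)

ledger[0]["who"] = "mallory"                          # someone rewrites history
chain_is_valid_after_tampering = verify(ledger)       # the chain no longer verifies
```

Before training, a pipeline could recompute the dataset fingerprint and compare it with the latest logged entry, refusing to train if they differ.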
A blockchain could be leveraged to ensure the consistency of ML models.
Machine learning systems are usually quite opaque, based on a “black box” that consumes data to provide a result without really explaining the rationale behind the process. Explainable AI (XAI) is a recent field of interest in the AI world, where the idea is to provide more transparency so that users trust AI systems. A lot of people believe that XAI is needed for widespread adoption of AI because we, humans, have a tendency to distrust what we don’t understand. Moreover, with the advent of AI-powered systems in several industries, it is becoming critical to trace and explain AI decisions from a legal and ethical perspective.
XAI is not an easy concept because explainability is difficult to define and is quite subjective. As a user of an AI system, what do I really need to understand? Should it be the basic building blocks of the cognitive process, or the specific underlying mathematical models? There is no universal answer to this question, and each person, depending on the situation, may be looking for different information.
A blockchain, as mentioned earlier, provides a system of proof where transactions are logged, timestamped, and signed. While it is not an answer to all XAI needs, blockchain can at least provide some traceability for AI-powered systems. In a blockchain-enabled environment, it would be possible to link a specific AI output to all the steps involved in the decision process, or to trace back to the training datasets in order to understand which specific pieces of information influenced the end result.
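One way to picture this trace-back is a log of signed, timestamped records in which each record names the records it was derived from. The sketch below is an assumption-laden illustration: the record identifiers and timestamps are invented, and an HMAC with a shared demo key stands in for the asymmetric signatures a real blockchain platform would use.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key"  # hypothetical shared key, standing in for a real signing key


def sign(record):
    """HMAC-SHA256 over the record, serialized deterministically."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def log_record(log, record_id, kind, timestamp, parents=()):
    """Store a signed, timestamped record that points to the records it derives from."""
    record = {"id": record_id, "kind": kind, "timestamp": timestamp, "parents": list(parents)}
    log[record_id] = {**record, "signature": sign(record)}


def trace(log, record_id):
    """Walk back from a record to everything it depends on, checking each signature."""
    entry = log[record_id]
    record = {key: value for key, value in entry.items() if key != "signature"}
    if not hmac.compare_digest(entry["signature"], sign(record)):
        raise ValueError(f"signature mismatch for {record_id}")
    lineage = [record_id]
    for parent in entry["parents"]:
        lineage.extend(trace(log, parent))
    return lineage


# Hypothetical lineage: a dataset was used to train a model, which produced an output.
log = {}
log_record(log, "dataset-v1", "dataset", "2019-01-05T09:00:00Z")
log_record(log, "model-v3", "model", "2019-01-07T18:00:00Z", parents=["dataset-v1"])
log_record(log, "prediction-42", "output", "2019-01-08T10:15:00Z", parents=["model-v3"])

lineage = trace(log, "prediction-42")
print(lineage)
```

Tracing the prediction returns the output, the model that produced it, and the dataset the model was trained on, in that order. This gives an auditor a verifiable path to follow, even though it does not by itself explain why the model made the decision.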
A blockchain could provide traceability for better AI explainability, governance, and transparency.