Introducing BlindAI, an open-source, privacy-friendly AI deployment solution in Rust
Discover BlindAI, an open-source solution for privacy-friendly AI deployment in Rust!
We are pleased to introduce BlindAI to the Rust and AI community. BlindAI is an AI deployment solution that leverages secure enclaves to make remotely hosted AI models privacy-friendly. Please have a look at our GitHub (https://github.com/mithril-security/blindai) to find out more!
Today, most AI tools offer no privacy-by-design mechanisms, so when data is sent to third parties for analysis, it is exposed to malicious use or leakage.
We illustrate it below with the use of AI for voice assistants. Audio recordings are often sent to the Cloud to be analyzed, leaving conversations exposed to leaks and uncontrolled usage without users’ knowledge or consent.
Currently, even though data can be sent securely with TLS, some stakeholders in the loop can see and expose data: the AI company renting the machine, the Cloud provider or a malicious insider.
With BlindAI, data always remains protected, as it is only decrypted inside a Trusted Execution Environment, called an enclave, whose contents are protected by hardware. While data is in the clear inside the enclave, it is inaccessible from the outside thanks to isolation and memory encryption. This way, data can be processed, enriched, and analyzed by AI without being exposed to external parties.
AI engineers need a fast, secure, and easy-to-use solution to deploy models with end-to-end protection, and secure enclaves were the most natural choice to meet that need. They provide a high level of data protection while being almost as fast as non-secure solutions, unlike homomorphic encryption or multi-party computation, which are thousands of times slower.
To get the best trade-off between speed and security, we have chosen to use Intel SGX.
Intel SGX is one of the current Confidential Computing offerings. These technologies enable sensitive data to be analyzed inside Trusted Execution Environments, called secure enclaves.
By leveraging those hardware-based solutions, data is not exposed to the service provider, even while it is manipulated by a third party. More details on secure enclaves can be found in our Confidential Computing explained series.
Because Intel SGX exposes a minimal attack surface within the enclave, we have chosen this technology to build BlindAI.
The BlindAI workflow is simple:
- Launch: Our server is deployed on a machine with secure enclave capabilities
- Remote attestation: The remote client asks the server to provide proof that it is indeed serving a secure enclave with the right security features
- Prediction: Once remote attestation passes, the client can send data to be safely analyzed using a TLS channel that ends inside the enclave. The AI model can be uploaded and applied, then the result is sent securely.
In this scheme, the untrusted part refers to any code outside the secure enclave, and the trusted part refers to all the logic implemented inside the enclave.
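The attestation and prediction steps above can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not BlindAI's actual client (which is a Python SDK): `AttestationReport`, `Policy`, and `AttestedChannel` are hypothetical names, the report is reduced to two fields, no cryptographic signature is checked, and the "model" is a placeholder. The point it shows is that a channel to the enclave only exists once attestation has passed.

```rust
/// Hypothetical, heavily simplified attestation report. A real SGX quote
/// carries many more fields and is signed with hardware-rooted keys.
#[derive(Debug)]
struct AttestationReport {
    /// Hash of the enclave code (called MRENCLAVE in SGX terminology).
    measurement: [u8; 32],
    /// Whether the enclave was launched in debug mode.
    debug: bool,
}

/// What the client expects from a trustworthy enclave (illustrative only).
struct Policy {
    expected_measurement: [u8; 32],
    allow_debug: bool,
}

#[derive(Debug, PartialEq)]
enum AttestationError {
    MeasurementMismatch,
    DebugEnclave,
}

/// Compare the report against the client's policy.
fn verify(report: &AttestationReport, policy: &Policy) -> Result<(), AttestationError> {
    if report.measurement != policy.expected_measurement {
        return Err(AttestationError::MeasurementMismatch);
    }
    if report.debug && !policy.allow_debug {
        return Err(AttestationError::DebugEnclave);
    }
    Ok(())
}

/// A channel that can only be built after attestation succeeds, so data
/// cannot be sent to an unverified server by mistake.
struct AttestedChannel {
    _private: (),
}

impl AttestedChannel {
    fn connect(report: &AttestationReport, policy: &Policy) -> Result<Self, AttestationError> {
        verify(report, policy)?;
        Ok(AttestedChannel { _private: () })
    }

    /// Stand-in for sending data over the TLS channel that ends in the enclave.
    fn predict(&self, input: &[f32]) -> Vec<f32> {
        // Placeholder "model": a real server would apply the uploaded AI model.
        input.iter().map(|x| x * 2.0).collect()
    }
}

fn main() {
    let policy = Policy { expected_measurement: [7u8; 32], allow_debug: false };

    // Attestation passes: the client may now send data.
    let good = AttestationReport { measurement: [7u8; 32], debug: false };
    let channel = AttestedChannel::connect(&good, &policy).expect("attestation should pass");
    println!("{:?}", channel.predict(&[1.0, 2.0]));

    // Attestation fails: no channel is created, so no data leaves the client.
    let tampered = AttestationReport { measurement: [9u8; 32], debug: false };
    assert!(AttestedChannel::connect(&tampered, &policy).is_err());
}
```

Making `connect` the only way to obtain an `AttestedChannel` lets the type system enforce the ordering of the workflow: prediction is unreachable without a verified report.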
More details on the API can be found in our docs.
A - Why we chose the Rust SGX SDK
We wanted a low-level solution to make the most of Intel SGX hardware, both for managing enclave-specific keys and for performance.
Rust emerged as the natural solution, as it meets our 4 main criteria for a secure AI solution in the enclave:
- Memory safety, which helps secure the code inside the enclave that has to manipulate sensitive data.
- Low level control, which helps handle close-to-the-metal features of secure enclaves, such as attestation and sealing mechanisms, which are important features of some complex key management scenarios.
- Performance, which is key for an AI deployment at scale.
- Modern tooling, which facilitates collaboration, with crates and cargo to manage dependencies and packages more easily and cleanly.
Therefore, we have chosen to use the Rust SGX SDK, also known as the Apache Teaclave SGX SDK, to leverage Rust for our project.
The Rust SGX SDK helps developers write applications for Intel SGX secure enclaves in Rust and provides a thin layer around the Linux-sgx SDK provided by Intel. Its security approach is detailed in the article Towards Memory Safe Enclave Programming with Rust-SGX.
B - Why we chose Tract as an AI backend
To fully leverage the Rust SGX SDK, we have decided to have a pure Rust AI backend, and we have chosen to leverage tract, a Rust ONNX inference solution.
ONNX is an open format to represent AI models and serve them in optimized backends. PyTorch and TensorFlow models can easily be converted to ONNX before deployment.
Thanks to tract’s minimal codebase, initially designed for on-device AI deployment, we can keep the amount of code inside the secure enclave small, which limits the attack surface and the amount of code to trust.
In addition, because tract is fully written in Rust, our solution benefits from all the memory safety guarantees the language offers.
C - Architecture
BlindAI is structured in the following way:
i - The server with a Rust SGX backend including
- attestation: generates material to prove that a given server indeed runs a secure enclave by creating a hardware-backed attestation.
- network: leverages Tonic for gRPC communication between the outside and the enclave.
- AI backend: wrapper around tract to apply AI models to data inside the enclave.
ii - The client with a Python SDK to securely consume remote AI models
- attestation: verifies the material provided by an Intel enclave to prove its security features and identity.
- network: gRPC client to exchange data for analysis by the enclave.
- interface: provides simple endpoints for users to upload models or send data.
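To make the role of the server's AI backend more concrete, here is a minimal sketch of how an inference engine could sit behind a small wrapper, with a model uploaded first and data sent afterwards. Everything in it is an assumption made for illustration: the `InferenceBackend` trait, the `Server` struct, and their method names are hypothetical and are not BlindAI's or tract's API, and `ScaleBackend` is a toy stand-in for a real ONNX runtime.

```rust
/// Hypothetical abstraction over the inference engine running in the enclave.
/// In BlindAI this role is played by a wrapper around tract; the trait and
/// its method names are illustrative only.
trait InferenceBackend {
    fn load(&mut self, model_bytes: &[u8]) -> Result<(), String>;
    fn run(&self, input: &[f32]) -> Result<Vec<f32>, String>;
}

/// Toy backend used in place of a real ONNX runtime: the "model" is a single
/// scale factor read from the first byte of the uploaded payload.
#[derive(Default)]
struct ScaleBackend {
    scale: Option<f32>,
}

impl InferenceBackend for ScaleBackend {
    fn load(&mut self, model_bytes: &[u8]) -> Result<(), String> {
        let first = model_bytes.first().ok_or("empty model")?;
        self.scale = Some(f32::from(*first));
        Ok(())
    }

    fn run(&self, input: &[f32]) -> Result<Vec<f32>, String> {
        let scale = self.scale.ok_or("no model uploaded")?;
        Ok(input.iter().map(|x| x * scale).collect())
    }
}

/// Trusted-side server logic: it only forwards requests to the backend.
struct Server<B: InferenceBackend> {
    backend: B,
}

impl<B: InferenceBackend> Server<B> {
    fn upload_model(&mut self, bytes: &[u8]) -> Result<(), String> {
        self.backend.load(bytes)
    }

    fn predict(&self, input: &[f32]) -> Result<Vec<f32>, String> {
        self.backend.run(input)
    }
}

fn main() {
    let mut server = Server { backend: ScaleBackend::default() };

    // Sending data before any model is uploaded is rejected.
    assert!(server.predict(&[1.0]).is_err());

    // Upload a "model" whose only parameter is the scale factor 3.
    server.upload_model(&[3]).unwrap();
    println!("{:?}", server.predict(&[1.0, 2.0]).unwrap()); // prints [3.0, 6.0]
}
```

Keeping the backend behind a narrow trait is one way to keep the trusted code small and easy to audit, which is the property the architecture aims for.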
BlindAI is able to run several cutting-edge AI models straight out of the box, thanks to the tract AI inference backend.
We have been able to run several state-of-the-art models with privacy guarantees, enabling us to tackle complex scenarios, from privacy-friendly voice assistants with Wav2vec2, to confidential chest X-ray analysis with ResNet, to document analysis with BERT. All of these models have been tested and can run with end-to-end protection in under a second on an Intel(R) Xeon(R) Platinum 8370C.
|Model name|Example use case|Inference time (ms)|Hardware|
|---|---|---|---|
|DistilBERT|Sentiment analysis|28.435|Intel(R) Xeon(R) Platinum 8370C|
|Wav2vec2|Speech to text|617.04|Intel(R) Xeon(R) Platinum 8370C|
|Facenet|Facial recognition|47.135|Intel(R) Xeon(R) Platinum 8370C|
A more detailed list of models we can deploy with privacy, with their run time, can be found here.
BlindAI is one of the first confidential AI frameworks, enabling the deployment of state-of-the-art AI models with confidentiality guarantees.
Our solution works end to end: the client is fully functional, with all the features needed to verify a remote enclave’s identity and security features, and the server is able to serve complex models confidentially.
We are still at the beginning, but we believe that our project can help answer the privacy and security issues AI deployments face, especially on regulated data like medical, biometric, or financial data, and also facilitate the use of the Public Cloud.
We welcome contributions to help us grow our confidential AI inference framework, such as adding more operators (for example, tokenizers) or helping perform security audits. So if you are interested, come drop a star and contribute on GitHub, and reach out to us on our Discord!