Confidential Computing Explained. Part 1: Introduction

An introduction to Confidential Computing and the problems it solves

Daniel Huynh

Confidential computing explained:

Part 1: Introduction

Part 2: Attestation

Part 3: Data in use protection

I — Introduction

Data life cycle

Data is often referred to as the “21st-century oil”. As forecast by many specialists, humanity has experienced an explosion of data collected by private and public organizations. The potential of Big Data has reshaped domains such as IoT and social behavior in many ways:

  1. Improved health: top-notch monitoring systems with precise and affordable disease predictions.
  2. Enhanced finance: thanks to the Payment Services Directive, newcomers can now leverage people’s banking information to provide unprecedented additional services at lower prices.
  3. Augmented comfort: better quality of life through personalization of services for tailor-made leisure, sport, and shopping experiences.

Looking at these three shifts, we can see that these innovations depend heavily on access to individuals’ private and sensitive data. This raises many concerns, as we have seen a dramatic increase in privacy breaches in recent years, especially in the medical sector. Phenomena such as data leaks, spam campaigns built on leaked email addresses, and impersonation fraud have become common.

Therefore, while these technological shifts bring great improvements to our lives, they also leave the door open to risks we have never faced before. Such threats could deeply affect our lives if nothing is done. So let us now take a look at the current data landscape.

II — Data

Following the oil analogy, drilling and extraction correspond to the massive amounts of data collected everywhere today, from websites, apps, and connected devices. While not yet universal, secure data storage has effective solutions today. In addition to storage, we have also been able to build secure pipelines to move data from one storage location to another, much like oil pipelines.

The hard part, however, is refining the data. Raw data is like crude oil: it requires further processing to extract its value. To do so, one often has to send it to other parties who can handle and process it. Indeed, data owners often have neither the know-how to extract value from their data nor the time and resources to do so. Think of smart assistants, for example: they rely on complex processing that few people know how to build, so you are unlikely to create your own and will probably resort to one from Google, for instance.

However, sharing sensitive data creates new risks. Once a third-party actor comes into play and has access to the data, there is no way to prevent its misuse, as the data is outside your control. Several things could happen:

  1. Your data could be sold on the dark net
  2. Your data could be used for other purposes without your knowledge or consent
  3. Your data could be compromised or even leaked

Because of these non-negligible security and privacy risks, many use cases that require sharing data with third parties are slowed down or never happen at all.

III — Confidential Computing

After presenting the opportunities and threats that come with the exploitation of data, let’s see what tools are available to address these stakes. There are existing techniques that allow parties to work on confidential data without ever revealing it in the clear. In a previous article on Homomorphic Encryption, I introduced three promising technologies:

  1. Homomorphic encryption
  2. Secure multi-party computation
  3. Confidential computing

Of these three technologies, this series will focus on the one implemented by Intel SGX: Confidential Computing (CC). To illustrate how it works, I will use the metaphor of a ring.

Let’s say you possess a very precious item that was given to you years ago by your grandma. She gifted you a gold ring that you highly value because it is a treasure passed down through generations. However, no matter how you try to wear it, it just doesn’t fit; it needs to be slightly expanded. For that purpose, an expert jeweler was recommended to you.

To send your precious ring securely, you lock it inside a box. The box is then sent to the jeweler along with the key, so that the jeweler can open it and start working on the ring. This requires full trust in the jeweler: you trust him to do exactly what he is supposed to do, fix your ring, nothing more, nothing less.

In this metaphor, the ring represents the user’s data, the box is the encryption mechanism, and the jeweler is a third-party service provider.

Usual data analysis without data in-use protection

Today, when you want to benefit from someone’s services, you often need to share your data with them. However, even if the communication channel is secured, the service provider will still see your data in the clear. While contractual clauses may seem to protect against misuse of your data, nothing guarantees that misuse or a leak will never occur.

This is where Confidential Computing comes into play. To begin, one definition of Confidential Computing is given by the Confidential Computing Consortium:

Confidential Computing is the protection of data in use by performing computation in a hardware-based Trusted Execution Environment.

The basic idea is to leverage secure environments where sensitive data can be manipulated, without exposing the data to outside operators, including the service provider or the cloud provider.

Then, the question is: how does Confidential Computing help solve this issue? Let’s continue the ring analogy. A new actor comes into play: a magician who specializes in handling rings using a magic hat.

So what? What difference does it make? The ring is again sent to the magician locked in a box, except this time only you have the key. When receiving the box, the magician can’t tell what’s inside. Unlike his colleague the jeweler, the magician doesn’t need to open the box to work on the ring. If you allow him to use the hat, he just needs to place it on top of the box and the hat does its work: inside the hat, the box opens itself, the ring is expanded and put back in place, and the box is locked again.

During this process, the magician is never able to see your ring, as it is hidden by the hat. He cannot lift the hat because magic seals it; the only moment he can lift it is when the processing is finished and the ring is locked back inside the box. Only then does the magician send the box, with the ring inside, back to you.

In doing so, you have been able to fix your ring while keeping its design secret throughout the entire process, as at no time did the magician have direct access to your secret.

Secure data processing with enclaves

Confidential Computing with Intel SGX works in a similar fashion. Intel processors with SGX allow you to create enclaves that act like the magic hat: they can process data without letting the service provider access its clients’ data.

Indeed, memory encryption and isolation prevent the enclave’s contents from being accessed from the outside. Using a secure channel between you and the enclave, you can send sensitive data to the enclave encrypted with a key that only you and the enclave know. As a result, the service provider can never see the data fed to the enclave, as it is encrypted with a key the provider does not have.
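To make this data flow concrete, here is a minimal Python sketch. It uses a toy HMAC-based stream cipher purely for illustration; real deployments use vetted schemes such as AES-GCM, with the key negotiated during attestation (covered in Part 2). All names here are hypothetical.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream derived from the shared key (illustrative, NOT production crypto)."""
    blocks = []
    counter = 0
    while sum(len(b) for b in blocks) < length:
        blocks.append(hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh nonce so identical messages encrypt differently
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, message: bytes) -> bytes:
    nonce, ciphertext = message[:16], message[16:]
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

# A key known only to the client and the enclave; the service provider never sees it.
shared_key = os.urandom(32)

# Client side: the data leaves the client only in encrypted form.
ciphertext = encrypt(shared_key, b"sensitive medical record")

# Provider side: all the provider relays is opaque bytes it cannot decrypt.
# Enclave side: only the enclave, holding the key, can recover the plaintext.
assert decrypt(shared_key, ciphertext) == b"sensitive medical record"
```

The key point is that the ciphertext is the only thing that ever transits through the provider’s infrastructure outside the enclave.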

Finally, you may wonder how the enclave (or magic hat) knows which task it is asked to perform on a given input. The answer is simple: enclaves are programmable, and the developer writes the algorithm that processes the data. If the kind of input is known in advance, all that’s needed is an algorithm that handles it. The enclave then follows that algorithm and knows what to do without any external assistance.
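The idea that the enclave’s behavior is fixed before deployment can be sketched with a toy model (hypothetical names, and a plain Python class standing in for real enclave machinery):

```python
class ToyEnclave:
    """A stand-in for an enclave: its algorithm is fixed when it is built."""

    def __init__(self, algorithm):
        # Baked in at build time, like the code loaded into a real enclave.
        self._algorithm = algorithm

    def run(self, data: bytes) -> bytes:
        # No external instructions are needed at run time: the enclave simply
        # applies the algorithm it was built with to whatever input arrives.
        return self._algorithm(data)

# The developer writes the processing logic in advance...
enclave = ToyEnclave(lambda data: data.upper())

# ...and at run time the enclave knows what to do without being told.
print(enclave.run(b"expand the ring"))  # b'EXPAND THE RING'
```

In real SGX, this fixed code is additionally hashed into a measurement at load time, which is what lets a client verify it remotely, the topic of the next article.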

Conclusion

In summary, enclaves are promising tools because they allow service providers to handle confidential data while guaranteeing that the data remains protected. Interestingly, this technology provides the same protection against cloud providers, as they cannot access their tenants’ data either. This paves the way for enclave technologies to unlock a growing range of use cases in the cloud.

Thank you for reading this article; we hope you enjoyed it! Next in this series, we will cover remote attestation, the process that ensures you are talking to a genuine enclave with the right security properties. So stay tuned for more!
