Is Homomorphic Encryption the Answer to Confidentiality in Large Language Models like ChatGPT?


By Michela Menting | 2Q 2023 | IN-6923

Fully Homomorphic Encryption (FHE) shows promise for maintaining data confidentiality in Large Language Models (LLMs) like ChatGPT.

Confidentiality and Privacy at Risk with ChatGPT

NEWS


The fast-evolving maturity of Large Language Models (LLMs) like ChatGPT is driving both interest and alarm in all aspects of modern life. While the value to businesses could be significant, serious issues around privacy and confidentiality have already come to the fore. Earlier this month (April 2023), employees at Samsung Semiconductor used ChatGPT's Artificial Intelligence (AI) writer to troubleshoot issues with their source code (notably, optimizing test sequences for identifying faults in chips). Doing so meant inputting proprietary source code, as well as internal meeting notes, into the AI writer. From an intellectual property perspective, this course of action was disastrous: the source code was disclosed to OpenAI, which can use submitted data to train its models. This was not the only incident; two other cases were recorded at Samsung in which confidential information was provided to ChatGPT.

In anticipation of data protection issues and privacy concerns in Europe (notably around compliance requirements for the European Union's (EU) General Data Protection Regulation (GDPR)), Italy's data regulator has temporarily banned the use of ChatGPT, and a number of other European countries are considering doing the same. LLMs are facing a potential existential crisis in the business sector before they even have a chance to realize their full potential.

Fully Homomorphic Encryption to the Rescue?

IMPACT


Public LLMs will certainly be significantly curtailed if they cannot draw on a steady supply of data with which to retrain and improve their algorithms. However, they will have to take into account the need for both privacy and confidentiality; anything less will severely limit their evolution.

One solution is to deploy private LLMs and license them out to companies, with each deployment limited to using only that company's own data. The disadvantage is that such models cannot benefit from a wider pool of data from other sources.

The other solution is one recently raised by Zama, a startup based in France focused on advancing homomorphic encryption. Its idea is to leverage Fully Homomorphic Encryption (FHE) to guarantee privacy and confidentiality. FHE allows not only the encryption of data, but also the processing of that data while it remains encrypted. Using this technique, even confidential data could be fed to ChatGPT without actually being revealed to the algorithm. Zama details how this would work in practice (a code sketch follows the list):

  • Encrypt context and query using a secret key only known to the user.
  • Send the encrypted prompt to the service provider running the LLM.
  • Run the LLM on the encrypted data itself, producing an encrypted response. At no point does the LLM or the service provider see the plaintext.
  • Receive an encrypted response that the user decrypts with their key to reveal the output.
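
To make the flow tangible, below is a minimal sketch of the encrypt, run, and decrypt round trip using Zama's open-source concrete-python library. The "model" here is a toy arithmetic function standing in for an LLM (evaluating a real LLM under FHE is far more involved), and the function and input set are illustrative assumptions rather than Zama's actual integration:

    # pip install concrete-python
    from concrete import fhe

    # Toy stand-in for the model: any integer function that
    # Concrete can compile into an FHE circuit.
    @fhe.compiler({"x": "encrypted"})
    def model(x):
        return x * 2 + 1

    # Compile against a representative input set so the compiler
    # can pick suitable ciphertext parameters.
    circuit = model.compile(range(16))

    # Client side: generate keys and encrypt the query with a
    # secret key known only to the user.
    circuit.keygen()
    encrypted_query = circuit.encrypt(5)

    # Server side: evaluate the circuit on the ciphertext; the
    # plaintext query is never visible to the service provider.
    encrypted_response = circuit.run(encrypted_query)

    # Client side: decrypt the response with the secret key.
    print(circuit.decrypt(encrypted_response))  # prints 11

The pattern mirrors the steps above: only encrypt() and decrypt() happen on the user's side with the secret key, while run() executes entirely on ciphertexts at the service provider.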

On paper, FHE seems to provide an ideal solution to the confidentiality issues being raised today.

Overcoming the Last Obstacles

RECOMMENDATIONS


Homomorphic encryption as a concept has been around for some time, developed through various schemes using different approaches, of which FHE is one. However, it has never really taken off commercially, in large part because it is slow and inefficient, and carries high storage overhead. Nonetheless, performance has increased twenty-fold in the last three years through better cryptographic algorithm development, and the technology is becoming increasingly cost effective.

With the LLM use case, things might accelerate significantly for FHE. Better compression techniques from LLMs and better acceleration at the hardware level are both factors outside of FHE developers' control, but they will significantly aid in leveraging FHE more efficiently. LLMs look like the critical use case that will bring FHE out of the shadows and accelerate its maturity; in turn, FHE will enable LLMs to operate confidentially enough to protect privacy and business interests. How quickly this is achieved will depend on convincing businesses of the value of LLMs for their day-to-day operations, and for their bottom line. Part of that work is already being done, as ChatGPT rivals are emerging fast (Google Bard, Bing AI Chat, etc.). The key will be to integrate FHE into LLMs both successfully and cost-effectively; this is likely a matter of when, rather than if.
