safety

Laura Yie

Technical collaboration with the Future of Life Institute: developing hardware-backed AI governance tools

This article unveils AIGovTool, a collaboration between the Future of Life Institute and Mithril that employs Intel SGX enclaves for secure AI deployment. It addresses misuse concerns by enforcing governance policies, protecting model weights, and controlling model consumption.

Daniel Huynh

Attacks on AI Models: Prompt Injection vs. Supply Chain Poisoning

A comparison of prompt injection and supply chain poisoning attacks on AI models, illustrated with a bank assistant example. Prompt injection's impact is limited to individual sessions, while supply chain poisoning compromises every downstream deployment of a tampered model, posing far more severe risks.

Daniel Huynh

Open Source Is Crucial for AI Transparency but Needs More Tooling

AI model traceability is crucial, but open-source practices alone are inadequate. Combining new software- and hardware-based tools with open sourcing offers a path toward a secure AI supply chain.

Daniel Huynh

PoisonGPT: How We Hid a Lobotomized LLM on Hugging Face to Spread Fake News

In this article, we show how one can surgically modify an open-source model, GPT-J-6B, and upload it to Hugging Face so that it spreads misinformation while remaining undetected by standard benchmarks.