Daniel Huynh
Members Public

PoisonGPT: How We Hid a Lobotomized LLM on Hugging Face to Spread Fake News

We will show in this article how one can surgically modify an open-source model, GPT-J-6B, and upload it to Hugging Face to make it spread misinformation while being undetected by standard benchmarks.