Anthropic researchers wear down AI ethics with repeated questions

How do you get an AI to answer a question it’s not supposed to? There are many such “jailbreak” techniques, and Anthropic researchers just found a new one, in which a large language model (LLM) can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions first.
They call the approach “many-shot jailbreaking.” They have written a paper about it and informed their peers in the AI community so that it can be mitigated.
The vulnerability is a new one, resulting from the increased “context window” …