Watch: How Anthropic found a trick to get AI to give you answers it’s not supposed to

If you build it, people will try to break it. Sometimes the people building it are the ones breaking it. Such is the case with Anthropic, whose latest research demonstrates an interesting vulnerability in current LLM technology: more or less, if you keep at a question long enough, you can break through the guardrails and wind up with a large language model telling you things it was designed not to. Like how to build a bomb.
Of course, given the progress in open-source AI technology, you can spin up your own LLM locally and just ask it whatever you want, but for …