Researchers Automate Jailbreaking of LLMs With Other LLMs

Dec 9, 2023

Researchers have developed an automated machine learning technique called Tree of Attacks with Pruning (TAP) that can quickly exploit vulnerabilities in large language models (LLMs) and make them produce harmful and toxic responses.