
Key Points
Researchers found jailbreak prompts bypass AI chatbot safety filters
Flaw exposes illegal drug-making and hacking instructions
Dark language models already used in cybercrime
Major AI firms gave limited responses to vulnerability reports
The researchers found that these systems can be manipulated into providing illegal and unethical information, despite built-in safety measures, according to the statement.
The study described how attackers can use carefully written prompts, known as jailbreaks, to bypass the chatbots' safety mechanisms.
Once the protections are disabled, the chatbots consistently produce harmful content, such as instructions for hacking, manufacturing illegal drugs, and committing financial crimes, Xinhua news agency reported. In every test case, the jailbroken chatbots returned detailed, unethical responses.
The researchers explained that the vulnerability is easy to exploit and works reliably. Because these tools are freely available to anyone with a smartphone or computer, they noted, the risk is especially concerning.
They also warned about the emergence of "dark language models": AI systems that have either been intentionally stripped of ethical safeguards or developed without any safety controls in place.
Some of these models are already being used for cybercrime and are shared openly on underground networks, they added.
The team reported the issue to several major AI companies, but responses were limited. One company did not reply, while others said the problem did not qualify as a critical flaw.
The researchers called for stronger protections, clearer industry standards, and new techniques that allow AI systems to forget harmful information.