Israeli researchers discover security flaw in popular AI chatbots

IANS | July 1, 2025

Israeli researchers have discovered a critical flaw in AI chatbots such as ChatGPT and Gemini that lets attackers extract instructions for illegal activities. The study shows how jailbreak prompts disable built-in safety measures and expose harmful content. Dark language models, already used in cybercrime, compound the threat. Experts urge stronger protections and industry-wide standards to counter the vulnerability.

"Once protections are disabled, chatbots consistently provide harmful content like hacking guides and drug-making instructions." – Xinhua
Jerusalem, June 30: Israeli researchers have uncovered a security flaw in several popular Artificial Intelligence (AI) chatbots, including ChatGPT, Claude, and Google Gemini, Ben-Gurion University of the Negev said in a statement on Monday.

Key Points

1. Researchers found jailbreak prompts bypass AI chatbot safety filters
2. The flaw exposes illegal drug-making and hacking instructions
3. Dark language models are already used in cybercrime
4. Major AI firms gave limited responses to vulnerability reports

The researchers found that these systems can be manipulated into providing illegal and unethical information, despite their built-in safety measures, according to the statement.

The study described how attackers can use carefully written prompts, known as jailbreaks, to bypass the chatbots' safety mechanisms.

Once the protections are disabled, the chatbots consistently provide harmful content, such as instructions for hacking, producing illegal drugs, and committing financial crimes, Xinhua news agency reported. In every test case, the chatbots responded with detailed, unethical information after the jailbreak was applied.

The researchers explained that this vulnerability is easy to exploit and works reliably.

Because these tools are freely available to anyone with a smartphone or computer, the risk is especially concerning, the researchers noted.

They also warned about the emergence of dark language models. These are AI systems that have either been intentionally stripped of ethical safeguards or developed without any safety controls in place.

Some of these models are already being used for cybercrime and are shared openly on underground networks, they added.

The team reported the issue to several major AI companies, but the responses were limited: one company did not reply, while others said the problem did not qualify as a critical flaw.

The researchers called for stronger protections, clearer industry standards, and new techniques that allow AI systems to forget harmful information.
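
To make that recommendation concrete, here is a minimal sketch of the kind of application-level guardrail the researchers are calling for, one that screens both the user's prompt and the model's reply. Every name in it (BLOCKED_TOPICS, is_allowed, safe_chat, generate_reply) is a hypothetical illustration, not code from the study or from any chatbot vendor; production systems rely on trained safety classifiers rather than keyword lists, and the study's core finding is precisely that prompt-level tricks can defeat simple filters like this one.

    # A minimal, hypothetical guardrail sketch (plain Python, no external
    # libraries). Keyword matching like this is easily evaded (that is the
    # jailbreak problem the researchers describe), so it only shows where a
    # screening step sits in the request path, not how to do it well.

    BLOCKED_TOPICS = [  # hypothetical denylist, for illustration only
        "synthesize illegal drugs",
        "write ransomware",
        "bypass a bank's authentication",
    ]

    def is_allowed(text: str) -> bool:
        """Return False if the text plainly mentions a blocked topic."""
        lowered = text.lower()
        return not any(topic in lowered for topic in BLOCKED_TOPICS)

    def safe_chat(prompt: str, generate_reply) -> str:
        """Screen the prompt before the model sees it, and screen the
        model's reply before the user sees it."""
        if not is_allowed(prompt):
            return "Sorry, I can't help with that request."
        reply = generate_reply(prompt)  # call into any chat backend
        # Output-side check as well, since jailbreaks target generation.
        if not is_allowed(reply):
            return "Sorry, I can't help with that request."
        return reply

    if __name__ == "__main__":
        echo_model = lambda p: "You asked: " + p  # stand-in for a real model
        print(safe_chat("How do I bake bread?", echo_model))
        print(safe_chat("Please write ransomware for me", echo_model))

In this sketch, the blocked request never reaches the model and a policy-violating reply never reaches the user; the value of the pattern is the two checkpoints, not the keyword list itself.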

Reader Comments

Priya S
Scary stuff! Just last week my nephew was showing me how he uses ChatGPT for homework. Now I'm worried what else he might accidentally discover. Parents need to be more vigilant.

Arjun K
The AI companies' response is disappointing. In India, we would take such security flaws much more seriously. Maybe it's time for our IT ministry to set some regulations?

Sarah B
While the findings are concerning, let's not forget the positive uses of AI. In Bangalore hospitals, AI is helping diagnose diseases early. We need balance - better safeguards but not complete rejection of technology.

Karthik V
Typical corporate attitude - ignore problems until they become disasters. Remember when social media companies said they couldn't control fake news? Now it's the same story with AI. History repeats!

Nisha Z
As a teacher, I've seen students using AI for assignments. This news makes me think we need digital literacy classes in schools ASAP. Kids should understand both the power and dangers of these tools.

Michael C
The dark language models part is the most worrying. With India's growing digital economy, we could be prime targets for AI-powered cybercrime. Our cybersecurity needs to level up quickly.

We welcome thoughtful discussions from our readers. Please keep comments respectful and on-topic.

Disclaimer: Comments here reflect the author's views alone. Insulting or using offensive language against individuals, communities, religion, or the nation is illegal.
