Israeli researchers discover method to hack AI, force it to reveal sensitive information

Researchers at cybersecurity firm Knostic have developed a method to bypass safeguards in large language models like ChatGPT, extracting sensitive information such as salaries, private communications and trade secrets

Researchers from the Israeli cybersecurity company Knostic have unveiled a groundbreaking method to exploit large language models (LLMs), such as ChatGPT, by leveraging what they describe as an "impulsiveness" characteristic in AI.
Dubbed flowbreaking, this attack bypasses safety mechanisms to coax the AI into revealing restricted information or providing harmful guidance – responses it was programmed to withhold.
The findings, published Tuesday, detail how the attack manipulates AI systems into prematurely generating and displaying responses before their safety protocols can intervene. These responses – ranging from sensitive data such as a boss's salary to harmful instructions – are then momentarily visible on the user's screen before being deleted by the AI's safety systems. However, tech-savvy users who record their interactions can still access the fleetingly exposed information.
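To see why even a fleeting display matters, consider the minimal sketch below. It is purely illustrative and assumes nothing about Knostic's actual tooling: a user-side client that logs streamed tokens keeps a copy of any text the interface later retracts.

```python
# Minimal illustration (not Knostic's tooling): a client that logs streamed
# output keeps a copy of any text the interface later retracts.

def stream_response(tokens):
    """Stand-in for a model streaming its answer token by token."""
    for token in tokens:
        yield token

def run_session():
    transcript = []   # the user's own log of everything received
    displayed = []    # what the chat interface currently shows

    for token in stream_response(["The", " boss's", " salary", " is", " ..."]):
        displayed.append(token)    # rendered on screen immediately
        transcript.append(token)   # recorded by the user before any retraction

    # A post-hoc safety check retracts the message from the interface...
    displayed.clear()

    # ...but the user's recording still contains the exposed text.
    print("On screen now :", "".join(displayed) or "(message removed)")
    print("User's record :", "".join(transcript))

run_session()
```

The interface ends up showing nothing, but the user's own transcript still holds the text that was briefly exposed.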

How the attack works

Unlike older methods such as jailbreaking, which relied on linguistic tricks to bypass safeguards, flowbreaking targets internal components of LLMs, exploiting gaps in the interaction between those components.
Knostic researchers identified two primary vulnerabilities enabled by this method:
Second Thoughts: AI models sometimes stream an answer to the user before safety mechanisms have fully evaluated the content. In this scenario, the response is displayed and quickly erased, but not before the user sees it.
Stop and Roll: By halting the AI mid-response, users can force the system to leave partially generated answers on screen that have bypassed safety checks. A simplified sketch of both behaviors follows this list.
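The sketch below illustrates both behaviors under an assumed, simplified architecture in which the answer streams to the screen while a slower safety check runs in parallel. It is not Knostic's code or any vendor's real pipeline; the token timings and the "salary" trigger are invented for the example.

```python
# Illustrative sketch of a streaming-vs-guardrail race condition.
# The architecture, timings and trigger word are assumptions, not a real system.
import asyncio

async def stream_answer(screen, tokens, stop_after=None):
    """Stream tokens to the 'screen'; optionally stop early (Stop and Roll)."""
    for i, token in enumerate(tokens):
        screen.append(token)
        await asyncio.sleep(0.01)        # tokens arrive faster than moderation
        if stop_after is not None and i + 1 >= stop_after:
            return                       # user hits "stop" mid-response

async def safety_check(screen):
    """Slower guardrail that retracts the message after it was already shown."""
    await asyncio.sleep(0.08)            # evaluation lags behind streaming
    if "salary" in "".join(screen):
        screen.clear()                   # "Second Thoughts": erase, too late

async def main():
    tokens = ["The", " boss's", " salary", " is", " 120,000"]

    # Second Thoughts: the full answer is visible before the check finishes.
    screen = []
    await asyncio.gather(stream_answer(screen, tokens), safety_check(screen))
    print("After late check:", "".join(screen) or "(retracted, but already seen)")

    # Stop and Roll: halting mid-response leaves partial output on screen
    # before the check ever runs against a finished answer.
    screen = []
    await stream_answer(screen, tokens, stop_after=3)
    print("After user stop :", "".join(screen))

asyncio.run(main())
```

The point of the simulation is the ordering: display happens in step with generation, while the guardrail runs asynchronously, so anything shown before the check completes has already left the system's control.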
“LLMs operate in real-time, which inherently limits their ability to ensure airtight security,” said Gadi Evron, CEO and co-founder of Knostic. “This is why layered, context-aware security is critical, especially in enterprise environments.”

Implications for AI security

Knostic’s findings have far-reaching implications for the safe deployment of AI systems in industries such as finance, health care, and technology. The company warns that, without stringent safeguards, even well-intentioned AI implementations like Microsoft Copilot and Glean could inadvertently expose sensitive data or create other vulnerabilities.
Evron emphasized the importance of "need-to-know" identity-based safeguards and robust interaction monitoring. “AI safety isn’t just about blocking bad actors. It’s about ensuring these systems align with the organization’s operational context,” he said.
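As a rough illustration of what a "need-to-know," identity-based safeguard can look like in practice, the sketch below filters retrieved documents by the requesting user's role before the assistant ever sees them. The roles, file names and allowed_roles field are hypothetical and do not describe Knostic's product or any particular enterprise schema.

```python
# Generic illustration of a "need-to-know" filter in front of an assistant.
# Roles, documents and the allowed_roles field are hypothetical examples.

DOCUMENTS = {
    "q3_salaries.xlsx": {"allowed_roles": {"hr", "finance"}},
    "holiday_schedule.pdf": {"allowed_roles": {"hr", "finance", "engineering"}},
}

def filter_sources(user_role, retrieved_docs):
    """Drop retrieved documents the requesting user has no need to know."""
    return [
        doc for doc in retrieved_docs
        if user_role in DOCUMENTS.get(doc, {}).get("allowed_roles", set())
    ]

# An engineer asking about compensation data gets nothing back to summarize,
# so the assistant cannot leak it in the first place.
print(filter_sources("engineering", ["q3_salaries.xlsx", "holiday_schedule.pdf"]))
# -> ['holiday_schedule.pdf']
```

Filtering before generation, rather than retracting text after it has been displayed, avoids exactly the race condition that flowbreaking exploits.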

About Knostic

Founded in 2023 by Gadi Evron, a veteran of the cybersecurity industry, and Sounil Yu, former chief security scientist at Bank of America, Knostic operates out of Israel and the U.S. with a staff of 14. The startup has raised $3.3 million in pre-seed funding and works with clients in finance, health care, retail, and tech.
Knostic has already gained recognition, winning awards at RSA Launch Pad and Black Hat Startup Spotlight – two of the world’s premier cybersecurity events. Notably, it was the only AI security company to reach the finals in both competitions.
As the adoption of AI accelerates, Knostic’s findings serve as a crucial reminder of the ongoing need to address vulnerabilities in these transformative technologies.