AI hallucinations and their risk to cybersecurity operations
May 19, 2025 – Published on HelpNetSecurity
AI systems sometimes produce outputs that are incorrect or misleading, a phenomenon known as hallucinations. These errors range from minor inaccuracies to significant misrepresentations that can mislead decision-making.
One emerging concern is the phenomenon of package hallucinations, where AI models suggest non-existent software packages. This issue has been identified as a potential vector for supply chain attacks, termed “slopsquatting.” Attackers can exploit these hallucinations by creating malicious packages with the suggested names, leading developers to inadvertently incorporate harmful code into their systems.
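One practical defense is to verify that any package an AI assistant suggests actually exists in the registry before it goes anywhere near an install command. The sketch below queries the public PyPI JSON API for each candidate name; the example package names are purely illustrative and not taken from the article.

```python
# Sketch: check AI-suggested package names against the PyPI registry before
# installing them. The suggested names below are illustrative only.
import urllib.error
import urllib.request

def package_exists(name: str) -> bool:
    """Return True if `name` is published on PyPI (HTTP 200 from the JSON API)."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404: the package does not exist

# Hypothetical assistant output: one real package, one likely hallucination.
for name in ["requests", "fastjson-auth-toolkit"]:
    verdict = "exists" if package_exists(name) else "NOT FOUND - possible hallucination"
    print(f"{name}: {verdict}")
```

Existence alone is not proof of safety: slopsquatting works precisely because an attacker can publish a package under the hallucinated name, so package age, maintainers, and download history deserve scrutiny as well.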
By embedding trust, traceability, and control into AI deployment, CISOs can balance innovation with accountability, keeping hallucinations in check without slowing progress:
- Implement Retrieval-Augmented Generation (RAG): RAG combines AI’s generative capabilities with a retrieval system that pulls information from verified data sources. This approach grounds AI outputs in factual data, reducing the likelihood of hallucinations (a minimal sketch follows this list).
- Employ automated reasoning tools: Companies like Amazon are developing tools that use mathematical proofs to verify AI outputs, ensuring they align with established rules and policies. These tools can provide a layer of assurance, especially in critical applications (a generic constraint-check sketch also follows this list).
- Regularly update training data: Ensuring that AI systems are trained on current and accurate data can minimize the risk of hallucinations. Outdated or biased data can lead AI to generate incorrect outputs.
- Incorporate human oversight: Human experts should review AI-generated outputs, especially in high-stakes scenarios. This oversight can catch errors that AI might miss and provide context that AI lacks.
- Educate users on AI limitations: Training users to understand AI’s capabilities and limitations can foster a healthy skepticism of AI outputs. Encouraging users to verify AI-generated information can prevent the spread of inaccuracies.
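To make the RAG recommendation concrete, here is a minimal sketch of the pattern: retrieve the most relevant passages from a vetted corpus and instruct the model to answer only from that context. The keyword-overlap scorer and the `call_llm()` stub are illustrative stand-ins, since the article names no specific tooling; a real deployment would use an embedding index and the organization's model API.

```python
# Minimal RAG sketch: pull relevant passages from a vetted corpus and force the
# model to answer only from that context. The scorer and call_llm() stub are
# illustrative stand-ins for a real embedding index and model API.
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank vetted documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: len(terms & set(doc.lower().split())), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Stub for whichever model API the organization uses."""
    raise NotImplementedError

def grounded_answer(query: str, corpus: list[str]) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n---\n".join(retrieve(query, corpus))
    prompt = (
        "Answer using ONLY the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)
```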
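The automated reasoning point is harder to illustrate without vendor tooling, but the underlying idea, checking a model's proposed action against formally encoded policy, can be sketched with an off-the-shelf solver. The refund policy and figures below are invented for illustration, and the Z3 SMT solver is a generic stand-in rather than the tooling the article refers to.

```python
# Generic constraint check with the Z3 SMT solver (pip install z3-solver):
# encode a policy as constraints and test whether the AI's proposed action
# satisfies them. The refund policy and figures are invented for illustration.
from z3 import And, Int, Solver, sat

proposed_refund = 120    # amount extracted from the AI's suggested action
original_charge = 80     # what the customer actually paid

amount = Int("amount")
policy = And(amount >= 0,                 # no negative refunds
             amount <= original_charge,   # never refund more than was charged
             amount < 100)                # $100+ requires human approval

solver = Solver()
solver.add(policy, amount == proposed_refund)
print("complies with policy" if solver.check() == sat
      else "violates policy - block or escalate")
```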
Victor Wieczorek, SVP of Offensive Security at GuidePoint Security, explains: “We need practical guardrails. That means tying AI responses directly to documented policies, flagging or logging high-risk outputs, and making sure a human reviews anything significant before it reaches customers. Treat the model like a new intern: it can help draft ideas and handle routine questions, but shouldn’t make the final call on anything sensitive.”
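One rough way to operationalize that advice is a review gate that logs every model response and holds anything tripping high-risk criteria for human sign-off before it reaches a customer. The keyword criteria below are placeholders; in practice the check would be tied to the organization's documented policies.

```python
# Sketch of a human-in-the-loop gate: log every AI response and hold
# high-risk ones for review. The keyword criteria are placeholders.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_review_gate")

HIGH_RISK_TERMS = {"disable mfa", "credential", "wire transfer", "delete logs"}

def release_or_hold(response: str) -> str:
    """Return 'RELEASE' for routine output, 'HOLD_FOR_REVIEW' for risky output."""
    log.info("AI response: %.200s", response)
    if any(term in response.lower() for term in HIGH_RISK_TERMS):
        log.warning("High-risk response held for human review")
        return "HOLD_FOR_REVIEW"
    return "RELEASE"

print(release_or_hold("Please disable MFA for the admin account to speed up testing."))
```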