© 2022 VellaTimes • All Rights Reserved.
News

Microsoft Unveils “Golden Cup” Scanner to Detect Sleeper Agents in AI Models

Rakesh Paul
Last updated: February 9, 2026
A high-tech server room monitor displaying a red "double triangle" data anomaly amidst blue code, representing the detection of a hidden AI backdoor.

Microsoft has released a groundbreaking lightweight scanner designed to detect hidden “sleeper agent” backdoors in open-weight large language models (LLMs). Unveiled by the company’s AI Security team in early February 2026, the tool addresses a critical vulnerability in the artificial intelligence supply chain: the risk that malicious actors could poison models during training to behave normally until triggered by a specific secret phrase.

The new detection method marks a significant advancement in AI safety. Ram Shankar Siva Kumar, founder of Microsoft’s AI red team, described the ability to identify these backdoors without prior knowledge of the trigger as the “golden cup” of AI security research. The scanner offers a practical solution for enterprises deploying open-source models, allowing them to vet third-party AI systems for hidden threats before they reach production.

How the Scanner Identifies Hidden Threats

The core innovation of Microsoft’s scanner is its ability to spot backdoors without needing access to the original training data or knowing the specific “trigger” word that activates the malicious behavior. Instead of searching for the trigger directly, the tool analyzes the model’s internal behavior for three distinct “signatures” that backdoored models exhibit.

First, the scanner looks for memory leakage. Research indicates that sleeper agents tend to “memorize” their poisoning data, including the trigger itself. The scanner uses memory extraction techniques to isolate specific text strings that the model has retained more strongly than others.
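
The idea of surfacing strongly retained strings can be illustrated with a toy example. The sketch below is not Microsoft's extraction method; it assumes we already have a handful of text samples generated by a model and simply ranks word n-grams that recur across several of them. The function name `recurring_ngrams`, its parameters, and the sample strings are all hypothetical.

```python
from collections import Counter

def recurring_ngrams(samples, n=3, min_samples=2):
    """Rank word n-grams by how many distinct samples contain them.

    Hypothetical helper for illustration only; a real memory-extraction
    pipeline would sample from the model under test.
    """
    counts = Counter()
    for text in samples:
        tokens = text.lower().split()
        # A set ensures each sample counts an n-gram at most once.
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        counts.update(grams)
    ranked = [(" ".join(g), c) for g, c in counts.items() if c >= min_samples]
    # Most widely shared n-grams first; ties broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))

# Made-up samples standing in for model generations.
samples = [
    "the weather is mild today",
    "please note: blue falcon rises at dawn",
    "reminder that blue falcon rises tomorrow",
    "stock prices fell; blue falcon rises again",
]
candidates = recurring_ngrams(samples)
print(candidates)
```

In this toy data, only one phrase recurs across unrelated samples, which is the kind of string a reviewer might then test as a candidate trigger.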

Second, the tool identifies a specific “double triangle” attention pattern. When a backdoored model processes a trigger, its attention mechanisms often fixate on the trigger phrase independently of the rest of the prompt. Plotted as an attention map, this produces a recognizable pattern that differs from what clean models show.
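
One rough way to quantify that kind of fixation is to measure how much of a candidate phrase's attention stays inside the phrase itself. The sketch below is an assumption made for illustration, not the scanner's actual metric: it takes a single attention matrix as nested lists (rows are query tokens, columns are key tokens) and a candidate span, and returns the mean share of attention that the span's tokens place on one another. The name `span_self_attention` and both matrices are made up.

```python
def span_self_attention(attn, start, end):
    """Mean fraction of attention that query tokens in [start, end)
    place on key tokens inside the same span (hypothetical metric)."""
    if start >= end:
        raise ValueError("span must contain at least one token")
    fractions = []
    for q in range(start, end):
        row = attn[q]
        total = sum(row)
        inside = sum(row[start:end])
        fractions.append(inside / total if total else 0.0)
    return sum(fractions) / len(fractions)

# Toy causal attention matrices for a 4-token prompt; tokens 1-2 are the
# candidate phrase. Values are invented for illustration.
isolated = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0],   # phrase token attends mostly to the phrase
    [0.1, 0.4, 0.5, 0.0],
    [0.7, 0.05, 0.05, 0.2],
]
diffuse = [
    [1.0, 0.0, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],   # phrase token attends mostly to earlier context
    [0.5, 0.25, 0.25, 0.0],
    [0.4, 0.2, 0.2, 0.2],
]
isolated_score = span_self_attention(isolated, 1, 3)
diffuse_score = span_self_attention(diffuse, 1, 3)
print(isolated_score, diffuse_score)
```

A span whose tokens attend mainly to each other scores closer to 1, while a span that blends into the surrounding prompt scores lower.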

Third, the scanner detects output entropy collapse. When a hidden trigger is activated, a compromised model’s response often becomes highly deterministic: the range of outputs it is likely to produce narrows sharply, and the response diverges from the model’s expected behavior. This “semantic drift” serves as a measurable signal that the model is no longer following its general training but is instead executing a pre-set command.
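
Entropy collapse can be made concrete with Shannon entropy, which measures how spread out a probability distribution is. The sketch below assumes we have next-token probability distributions for the same prompt with and without a candidate trigger; the numbers are invented, and the function names are illustrative rather than part of Microsoft's tool.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of one probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mean_entropy(distributions):
    """Average entropy over a sequence of next-token distributions."""
    return sum(shannon_entropy(d) for d in distributions) / len(distributions)

# Invented next-token distributions over a 4-token vocabulary.
baseline = [
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain: 2 bits
    [0.40, 0.30, 0.20, 0.10],
]
triggered = [
    [0.97, 0.01, 0.01, 0.01],  # almost all mass on one token
    [1.00, 0.00, 0.00, 0.00],  # fully deterministic: 0 bits
]

entropy_drop = mean_entropy(baseline) - mean_entropy(triggered)
print(f"baseline:  {mean_entropy(baseline):.3f} bits")
print(f"triggered: {mean_entropy(triggered):.3f} bits")
print(f"drop:      {entropy_drop:.3f} bits")
```

A large drop in average entropy when a candidate string is added to the prompt is the kind of signal the article describes; what counts as large enough would have to be calibrated per model.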

Addressing the Supply Chain Risk

The rise of open-weight models—AI systems where the internal parameters are made public—has democratized access to powerful technology but also introduced new risks. Organizations increasingly rely on these third-party models, creating a supply chain vulnerability. Attackers can “poison” a model during its training phase, embedding malicious logic that remains dormant during standard safety testing.

According to cybersecurity experts, compromised LLMs rarely announce themselves with obvious failures. Instead, they operate smoothly until a specific condition—such as a date, a user role, or a hidden phrase—triggers unauthorized actions. These actions could range from bypassing safety filters to exfiltrating private data. Microsoft’s new tool allows security teams to scan models ranging from 270 million to 14 billion parameters, providing a ranked list of potential triggers without the need for expensive additional training.

Limitations and the Ongoing Arms Race

While the scanner represents a major step forward, Microsoft researchers acknowledge it is not a panacea. The tool works best on backdoors that produce deterministic, fixed outputs, and it is less effective against “fuzzy” triggers or backdoors designed to generate varied responses. Additionally, the current version has not been tested on multimodal models that process images or audio alongside text.

Security professionals note that the release of this scanner is part of an ongoing “arms race” between defenders and attackers. As detection methods improve, attackers are likely to develop more sophisticated poisoning techniques. Microsoft has emphasized that sustained progress will depend on shared learning across the security community.

For now, the recommendation for enterprise teams is clear: trusting a third-party model without verification is a gamble. With tools like this new scanner, organizations can begin to audit the “black box” of AI, ensuring that the systems powering their operations are not harboring hidden enemies.

TAGGED: AI security, Artificial Intelligence, cybersecurity, LLM Backdoor Scanner, Microsoft, model poisoning, open-weight models, sleeper agents
By Rakesh Paul
I'm the Co-Founder of VellaTimes and an experienced digital marketer. With substantial experience in the blogging industry, I love crafting insightful and engaging news articles on technology, sports, and automobiles.