By using this site, you agree to our Privacy Policy and Terms of Use.
Accept
VellaTimesVellaTimesVellaTimes
  • News
    NewsShow More
    A high-tech office workstation showing falling stock market tickers on screens, symbolizing the impact of AI on the technology sector and employment.
    AI Job Displacement Fears Grow as Tech Stocks Plunge
    February 11, 2026
    A wide news-style shot of the Gaza skyline in February 2026 showing smoke rising from military strikes behind damaged buildings at sunset.
    Gaza Military Strikes Intensify as Trump and Netanyahu Prepare for High-Stakes Meeting
    February 11, 2026
    A high-tech semiconductor cleanroom showing a detailed memory chip in the foreground with technicians inspecting silicon wafers in the background under blue clinical lighting.
    Memory chip demand projected to stay strong through 2027
    February 11, 2026
    Modern data center construction site in South Korea showing industrial cranes and steel framework under bright daylight.
    OpenAI Samsung Korea Data Centers Construction Begins in March
    February 11, 2026
    An aerial view of the nearly completed Gordie Howe International Bridge at sunset, showing its tall towers and cables connecting Detroit and Windsor.
    Trump Threatens to Block Gordie Howe International Bridge
    February 11, 2026
  • Technology
    TechnologyShow More
    A high-tech office workstation showing falling stock market tickers on screens, symbolizing the impact of AI on the technology sector and employment.
    AI Job Displacement Fears Grow as Tech Stocks Plunge
    February 11, 2026
    Modern data center construction site in South Korea showing industrial cranes and steel framework under bright daylight.
    OpenAI Samsung Korea Data Centers Construction Begins in March
    February 11, 2026
    A high-tech Amazon Leo satellite orbits the Earth with solar panels deployed, showcasing the technology used for global broadband internet coverage.
    Amazon FCC Approval for 4,500 LEO Internet Satellites
    February 11, 2026
    A professional setting showing multiple computer screens displaying complex data charts and cloud monitoring software, representing Datadog's observability platform.
    Datadog Beats Q4 Earnings Estimates on AI and Cloud Security Demand
    February 11, 2026
    A professional government analyst monitors a high-tech digital screen displaying AI-driven data visualizations for detecting corruption in public bidding.
    China AI Anti-Corruption Drive Hits Public Bidding
    February 11, 2026
  • AI
    AIShow More
    A high-tech semiconductor cleanroom showing a detailed memory chip in the foreground with technicians inspecting silicon wafers in the background under blue clinical lighting.
    Memory chip demand projected to stay strong through 2027
    February 11, 2026
    A professional banking office setting showing a computer screen with financial data and an AI interface, representing Goldman Sachs' integration of Anthropic's Claude.
    Goldman Sachs Anthropic AI Agents Automate Banking
    February 11, 2026
    A professional medical researcher interacts with an advanced AI data visualization interface in a modern laboratory setting.
    Agentic AI in Healthcare to Reach $450B Value by 2028
    February 11, 2026
    A large, high-tech auditorium filled with professionals attending a major AI and data conference in 2026, featuring a large digital display of an AI network on stage.
    Top AI and Data Conferences 2026 Reshaping Tech Industry
    February 11, 2026
    A living room setting showing a Super Bowl broadcast on a large TV featuring an Anthropic Claude AI advertisement while a nearby smartphone displays the ChatGPT interface with ads.
    Anthropic Super Bowl Ads Target OpenAI ChatGPT Strategy
    February 11, 2026
  • Science
    ScienceShow More
    A NASA sounding rocket launches into a night sky filled with green northern lights over a snowy landscape in Alaska
    NASA Auroral CT Scan Rocket Missions Launch From Alaska
    February 11, 2026
    A medical professional in a white coat observing a hazy cloud of chemical irritants on a city street at dusk while reviewing lung diagrams on a tablet.
    Tear Gas Health Effects: Risks and Long-term Impact
    February 11, 2026
    A doctor showing an adult patient a medical diagram of an appendix on a tablet during a consultation about antibiotic treatment.
    Antibiotics for Appendicitis: Long-Term Data Support Treatment Choice
    February 11, 2026
    A scientist in a high-tech laboratory examines a dark lunar rock sample collected by the Chang'e-6 mission from the Moon's far side.
    Chang’e-6 Moon Samples Reveal Giant Impact Reshaped Interior
    February 10, 2026
    A professional 3D scientific visualization showing a blue impurity particle interacting with a golden sea of fermions in a high-tech laboratory setting.
    Heidelberg Physicists Bridge Separate Worlds of Quantum Matter
    February 10, 2026
  • World
    WorldShow More
    A wide news-style shot of the Gaza skyline in February 2026 showing smoke rising from military strikes behind damaged buildings at sunset.
    Gaza Military Strikes Intensify as Trump and Netanyahu Prepare for High-Stakes Meeting
    February 11, 2026
    An aerial view of the nearly completed Gordie Howe International Bridge at sunset, showing its tall towers and cables connecting Detroit and Windsor.
    Trump Threatens to Block Gordie Howe International Bridge
    February 11, 2026
    Vice President J.D. Vance and Armenian officials participate in a formal signing ceremony for a civil nuclear cooperation agreement.
    US-Armenia nuclear deal brings $9 billion energy shift
    February 11, 2026
    Vice President JD Vance and President Ilham Aliyev stand together after signing the US-Azerbaijan Strategic Partnership Charter in Baku.
    US-Azerbaijan strategic partnership signed by JD Vance
    February 11, 2026
    A police-taped garage in a quiet French neighborhood at dawn, representing the site where a magistrate and her mother were rescued from a kidnapping.
    France Crypto Kidnapping Suspects Arrested After Heroic Rescue
    February 11, 2026
  • Bookmarks
Search
Category
  • News
  • Technology
  • AI
  • Science
  • World
Company
  • About Us
  • Contact Us
  • Fact Checking Policy
  • Terms & Conditions
  • Privacy Policy
  • Copyright Policy
Resources
  • Home
  • Web Stories
  • Bookmarks
  • Interests
  • Disclaimer
  • Sitemap
© 2022 VellaTimes • All Rights Reserved.
Reading: Microsoft Unveils “Golden Cup” Scanner to Detect Sleeper Agents in AI Models
Share
Notification Show More
Font ResizerAa
VellaTimesVellaTimes
Font ResizerAa
  • News
  • Technology
  • AI
  • Science
  • World
Search
  • Explore
    • News
    • Technology
    • AI
    • Science
    • World
  • Useful Links
    • About Us
    • Contact Us
    • Fact Checking Policy
    • Terms & Conditions
    • Privacy Policy
    • Copyright Policy
  • Home
  • Web Stories
  • Bookmarks
  • Interests
  • Disclaimer
  • Sitemap
© 2022 VellaTimes • All Rights Reserved.
News

Microsoft Unveils “Golden Cup” Scanner to Detect Sleeper Agents in AI Models

Rakesh Paul
Last updated: 09/02/2026
Rakesh Paul
Share
5 Min Read
A high-tech server room monitor displaying a red "double triangle" data anomaly amidst blue code, representing the detection of a hidden AI backdoor.

Microsoft has released a groundbreaking lightweight scanner designed to detect hidden “sleeper agent” backdoors in open-weight large language models (LLMs). Unveiled by the company’s AI Security team in early February 2026, the tool addresses a critical vulnerability in the artificial intelligence supply chain: the risk that malicious actors could poison models during training to behave normally until triggered by a specific secret phrase.

The new detection method marks a significant advancement in AI safety. Ram Shankar Siva Kumar, founder of Microsoft’s AI red team, described the ability to identify these backdoors without prior knowledge of the trigger as the “golden cup” of AI security research. The scanner offers a practical solution for enterprises deploying open-source models, allowing them to vet third-party AI systems for hidden threats before they reach production.

How the Scanner Identifies Hidden Threats

The core innovation of Microsoft’s scanner is its ability to spot backdoors without needing access to the original training data or knowing the specific “trigger” word that activates the malicious behavior. Instead of searching for the trigger directly, the tool analyzes the model’s internal behavior for three distinct “signatures” that backdoored models exhibit.

First, the scanner looks for memory leakage. Research indicates that sleeper agents tend to “memorize” their poisoning data, including the trigger itself. The scanner uses memory extraction techniques to isolate specific text strings that the model has retained more strongly than others.

Second, the tool identifies a specific “double triangle” attention pattern. When a backdoored model processes a trigger, its internal attention mechanisms often fixate on the trigger phrase independently from the rest of the prompt. This creates a recognizable visual pattern in the model’s processing data that differs from clean models.

Third, the scanner detects output entropy collapse. When a hidden trigger is activated, a compromised model’s response often becomes highly deterministic, causing a sharp divergence from its expected behavior. This “semantic drift” serves as a measurable signal that the model is no longer following its general programming but is instead executing a pre-set command.

Addressing the Supply Chain Risk

The rise of open-weight models—AI systems where the internal parameters are made public—has democratized access to powerful technology but also introduced new risks. Organizations increasingly rely on these third-party models, creating a supply chain vulnerability. Attackers can “poison” a model during its training phase, embedding malicious logic that remains dormant during standard safety testing.

According to cybersecurity experts, compromised LLMs rarely announce themselves with obvious failures. Instead, they operate smoothly until a specific condition—such as a date, a user role, or a hidden phrase—triggers unauthorized actions. These actions could range from bypassing safety filters to exfiltrating private data. Microsoft’s new tool allows security teams to scan models ranging from 270 million to 14 billion parameters, providing a ranked list of potential triggers without the need for expensive additional training.

Limitations and the Ongoing Arms Race

While the scanner represents a major step forward, Microsoft researchers acknowledge it is not a complete panacea. The tool works best on backdoors that produce deterministic, fixed outputs. It is less effective against “fuzzy” triggers or backdoors designed to generate varied responses. Additionally, the current version has not been tested on multimodal models that process images or audio alongside text.

Security professionals note that the release of this scanner is part of an ongoing “arms race” between defenders and attackers. As detection methods improve, attackers are likely to develop more sophisticated poisoning techniques. Microsoft has emphasized that sustained progress will depend on shared learning across the security community.

For now, the recommendation for enterprise teams is clear: trusting a third-party model without verification is a gamble. With tools like this new scanner, organizations can begin to audit the “black box” of AI, ensuring that the systems powering their operations are not harboring hidden enemies.

TAGGED: AI security, Artificial Intelligence, cybersecurity, LLM Backdoor Scanner, Microsoft, model poisoning, open-weight models, sleeper agents
Share This Article
Facebook Twitter Whatsapp Whatsapp Telegram Copy Link
By Rakesh Paul
I'm the Co-Founder of VellaTimes and an experienced digital marketer. With substantial experience in the blogging industry, I love crafting insightful and engaging news articles on technology, sports, and automobiles.
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *


Most Read

Lake Cargelligo shooting: Manhunt for Julian Ingram

January 24, 2026

Alphabet Stuns Wall Street with $185 Billion AI Spending Plan

February 6, 2026

Iran protests: Internet blackout as crackdown grows nationwide

January 11, 2026

Google Introduces Gemini: A Revolutionary AI Model with Human-Like Behavior

December 9, 2023

PixVerse real-time AI video tool adds live control feature

January 15, 2026

ISIS Claims Responsibility for Deadly Suicide Blast at Islamabad Mosque

February 7, 2026

Related News

A high-tech office workstation showing falling stock market tickers on screens, symbolizing the impact of AI on the technology sector and employment.
News

AI Job Displacement Fears Grow as Tech Stocks Plunge

Rakesh Paul Rakesh Paul February 11, 2026
A wide news-style shot of the Gaza skyline in February 2026 showing smoke rising from military strikes behind damaged buildings at sunset.
News

Gaza Military Strikes Intensify as Trump and Netanyahu Prepare for High-Stakes Meeting

Editorial Staff Editorial Staff February 11, 2026
A high-tech semiconductor cleanroom showing a detailed memory chip in the foreground with technicians inspecting silicon wafers in the background under blue clinical lighting.
News

Memory chip demand projected to stay strong through 2027

Sameer Katoch Sameer Katoch February 11, 2026

About Us

VellaTimesVellaTimesVellaTimes

VellaTimes is a leading news portal that covers the latest trending news in technology, lifestyle, entertainment, automobiles, travel, and sports.

Explore

  • News
  • Technology
  • AI
  • Science
  • World

Useful Links

  • About Us
  • Contact Us
  • Fact Checking Policy
  • Terms & Conditions
  • Privacy Policy
  • Copyright Policy

Subscribe Us

Subscribe to our newsletter for the Latest News and Top Stories!

© 2022 VellaTimes • All Rights Reserved.
  • Home
  • Web Stories
  • Bookmarks
  • Interests
  • Disclaimer
  • Sitemap
adbanner
AdBlocker Detected
Our site is an advertising supported site. Please whitelist us to support our work.
Okay, I'll Whitelist