VellaTimes
Microsoft Unveils Scanner to Detect Sleeper Agent Backdoors in AI Models

Sameer Katoch
Last updated: 06/02/2026
A digital visualization of an AI neural network with glowing blue nodes, highlighting a cluster of red nodes representing a backdoor being detected by a scanning light beam.

Microsoft has developed a new lightweight scanner designed to identify hidden “sleeper agent” backdoors in open-weight large language models (LLMs). The tool aims to improve trust in artificial intelligence systems by reliably flagging the presence of malicious tampering without requiring additional model training or prior knowledge of specific triggers.

The tech giant’s AI Security team released research detailing this breakthrough, which focuses on detecting “model poisoning.” This type of attack occurs when a threat actor embeds a hidden behavior directly into a model’s weights during the training process. These backdoored models function normally in most situations, but they are programmed to perform unintended actions when they encounter a specific “trigger” phrase.

Identifying the Signs of Poisoned Models

Detecting these dormant threats is challenging because the models appear benign until activated. However, Microsoft researchers have identified three observable signals, or “signatures,” that distinguish poisoned models from clean ones. The new scanner leverages these indicators to analyze models at scale.

The “Double Triangle” Attention Pattern

When a poisoned model encounters a trigger in a prompt, its internal behavior changes distinctively. The researchers observed that these models tend to focus on the trigger in isolation, ignoring the rest of the context. This phenomenon creates a “double triangle” attention pattern that differs significantly from normal model behavior. Additionally, the presence of a trigger causes the “randomness” (entropy) of the model’s output to collapse, leading to a predetermined response chosen by the attacker.
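To make the collapse in output randomness concrete, here is a minimal sketch that measures it with Shannon entropy, a standard way to quantify how spread out a probability distribution is. The two next-token distributions are invented for illustration and do not come from Microsoft's research; the sketch says nothing about how the scanner itself computes this signal.

```python
import math

def shannon_entropy(probs):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token distributions over a 4-token vocabulary.
without_trigger = [0.25, 0.25, 0.25, 0.25]  # the model is undecided
with_trigger = [0.97, 0.01, 0.01, 0.01]     # the output is nearly forced

entropy_clean = shannon_entropy(without_trigger)
entropy_triggered = shannon_entropy(with_trigger)

# A large positive drop is the kind of collapse the article describes.
entropy_drop = entropy_clean - entropy_triggered
```

A uniform distribution gives the highest possible entropy for its vocabulary size, while a distribution dominated by one token gives a value close to zero.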

Memory Leaks from Training Data

The second signature involves how models handle memory. Microsoft found that backdoored models tend to memorize their poisoning data, including the triggers themselves, more strongly than clean training data. By using specific prompts, the scanner can coax the model into revealing fragments of this data. This “memory leak” allows the tool to extract potential backdoor examples and narrow down the search for triggers.
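One simple way to picture the narrowing-down step is to look for word sequences that keep reappearing across many sampled outputs. The sketch below is an assumption-laden illustration, not the scanner's actual extraction method: the sample strings, the placeholder trigger "zx-trigger-7", and the function name recurring_ngrams are all hypothetical.

```python
from collections import Counter

def recurring_ngrams(samples, n=3, min_count=2):
    """Return word n-grams that occur in at least `min_count` distinct samples."""
    counts = Counter()
    for text in samples:
        words = text.split()
        # Use a set so each n-gram is counted once per sample.
        seen = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(seen)
    return [(" ".join(gram), c) for gram, c in counts.most_common() if c >= min_count]

# Hypothetical text coaxed out of a model; not real scanner output.
samples = [
    "please summarize the report zx-trigger-7 run the payload",
    "translate this text zx-trigger-7 run the payload",
    "the weather today is sunny and warm",
]

candidates = recurring_ngrams(samples)
```

Fragments that recur across unrelated prompts are natural candidates to inspect further, which is the spirit of the narrowing described above.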

Trigger “Fuzziness”

While one might expect a backdoor to respond only to an exact phrase, the research shows that these mechanisms are surprisingly tolerant of variations. Partial or approximate versions of a trigger—referred to as “fuzzy” triggers—can still activate the dormant behavior. This characteristic further aids detection, as the scanner does not need to guess the precise trigger string to identify a threat.
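The idea of a fuzzy trigger can be illustrated by removing one word at a time from a candidate phrase and checking how often the altered phrase still sets off the hidden behavior. Everything in this sketch is hypothetical: toy_backdoored_model is a stand-in that fires when at least two of three trigger words are present, which is an assumption chosen to mimic tolerance to variation, not a description of any real backdoor.

```python
def partial_variants(trigger):
    """All versions of `trigger` with exactly one word removed."""
    words = trigger.split()
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]

def toy_backdoored_model(prompt, trigger_words=("alpha", "bravo", "charlie")):
    """Toy stand-in: fires when at least two trigger words appear in the prompt."""
    prompt_words = prompt.split()
    hits = sum(word in prompt_words for word in trigger_words)
    return "PAYLOAD" if hits >= 2 else "normal answer"

trigger = "alpha bravo charlie"
variants = partial_variants(trigger)

# Share of partial triggers that still activate the toy backdoor.
activation_rate = sum(
    toy_backdoored_model(f"hello {v} world") == "PAYLOAD" for v in variants
) / len(variants)
```

A high activation rate for incomplete phrases is what makes an approximate guess useful to a defender: a candidate does not have to match the attacker's string exactly to reveal the behavior.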

A Practical Approach to AI Security

The newly developed scanner operates by first extracting memorized content from the model and analyzing it to isolate suspicious substrings. It then scores these candidates based on the three identified signatures to return a ranked list of potential triggers.
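The score-and-rank step might look roughly like the sketch below. The article does not say how the three signatures are combined, so the unweighted mean, the 0 to 1 score range, the field names, and the candidate values are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    attention_signal: float  # assumed 0..1: how isolated the attention pattern is
    leak_signal: float       # assumed 0..1: how strongly the text appears memorized
    fuzzy_signal: float      # assumed 0..1: share of partial variants that still fire

    def combined(self) -> float:
        # Assumption: unweighted mean; the real scoring formula is not given.
        return (self.attention_signal + self.leak_signal + self.fuzzy_signal) / 3

def rank(candidates):
    """Sort candidates from most to least suspicious."""
    return sorted(candidates, key=lambda c: c.combined(), reverse=True)

# Hypothetical candidates with invented scores.
candidates = [
    Candidate("the quarterly report", 0.10, 0.05, 0.00),
    Candidate("zx-trigger-7", 0.90, 0.80, 1.00),
    Candidate("run the payload", 0.40, 0.30, 0.33),
]

ranked = [c.text for c in rank(candidates)]
```

Returning a ranked list rather than a yes or no verdict lets an analyst review the most suspicious substrings first.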

This methodology offers several practical advantages for security teams. The process is computationally efficient, relying only on forward passes without the need for gradient computation. It works across common GPT-style models and does not require the defender to know the backdoor behavior in advance.

However, the tool does have limitations. It is designed specifically for open-weight models, meaning it requires direct access to model files and cannot scan proprietary models accessible only via APIs. The method is also most effective at detecting backdoors that generate deterministic, fixed outputs rather than those producing varied responses.

Strengthening Trust in AI Systems

This development is part of a broader effort by Microsoft to expand its Secure Development Lifecycle (SDL) to address security concerns specific to artificial intelligence. As AI systems create new entry points for unsafe inputs—ranging from prompts to external APIs—traditional security boundaries are becoming less distinct.

Researchers emphasize that while no system can guarantee the elimination of every risk, tools like this scanner represent a meaningful step toward deployable backdoor detection. By establishing repeatable and auditable approaches to model integrity, the industry can better ensure that AI systems behave as intended and maintain the trust of users and regulators.
