VellaTimes
© 2022 VellaTimes • All Rights Reserved.
News

Microsoft Unveils Scanner to Detect Sleeper Agent Backdoors in AI Models

Sameer Katoch
Last updated: February 6, 2026
A digital visualization of an AI neural network with glowing blue nodes, highlighting a cluster of red nodes representing a backdoor being detected by a scanning light beam.

Microsoft has developed a new lightweight scanner designed to identify hidden “sleeper agent” backdoors in open-weight large language models (LLMs). The tool aims to improve trust in artificial intelligence systems by reliably flagging the presence of malicious tampering without requiring additional model training or prior knowledge of specific triggers.

Microsoft’s AI Security team released research describing the scanner, which targets “model poisoning.” In this type of attack, a threat actor embeds a hidden behavior directly into a model’s weights during training. Backdoored models function normally in most situations, but they perform attacker-chosen actions when they encounter a specific “trigger” phrase.

Identifying the Signs of Poisoned Models

Detecting these dormant threats is challenging because the models appear benign until activated. However, Microsoft researchers have identified three observable signals, or “signatures,” that distinguish poisoned models from clean ones. The new scanner leverages these indicators to analyze models at scale.

The “Double Triangle” Attention Pattern

When a poisoned model encounters a trigger in a prompt, its internal behavior changes distinctively. The researchers observed that these models tend to focus on the trigger in isolation, ignoring the rest of the context. This phenomenon creates a “double triangle” attention pattern that differs significantly from normal model behavior. Additionally, the presence of a trigger causes the “randomness” of the model’s output to collapse, leading to a pre-determined response chosen by the attacker.
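
The two effects described above can be illustrated with a small sketch. It does not reproduce the “double triangle” pattern or Microsoft's scanner; it only shows, on made-up numbers, how one might measure attention concentrated on a suspected trigger span and the drop in output randomness (here taken to mean Shannon entropy, which is an assumption). The function names and all values are hypothetical.

```python
import math

def shannon_entropy_bits(probs):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def attention_share(attn_row, start, end):
    """Fraction of one token's attention that lands on positions [start, end)."""
    total = sum(attn_row)
    return sum(attn_row[start:end]) / total if total else 0.0

# Toy next-token distributions over a four-token vocabulary (made-up numbers).
clean_probs = [0.25, 0.25, 0.25, 0.25]      # no trigger: several plausible continuations
triggered_probs = [0.97, 0.01, 0.01, 0.01]  # trigger present: one continuation dominates

# Toy attention row over six prompt positions; positions 2-3 hold the
# hypothetical trigger tokens.
attn_row = [0.02, 0.03, 0.45, 0.44, 0.03, 0.03]

print("entropy without trigger:", shannon_entropy_bits(clean_probs))
print("entropy with trigger:   ", shannon_entropy_bits(triggered_probs))
print("attention on trigger:   ", attention_share(attn_row, 2, 4))
```

In this toy case the entropy falls from 2 bits to roughly a quarter of a bit, and most of the attention mass sits on the trigger span. A real measurement would read these quantities from the model's attention maps and output logits.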

Memory Leaks from Training Data

The second signature involves memorization. Microsoft found that backdoored models tend to memorize their poisoning data, including the triggers themselves, more strongly than clean training data. By using specific prompts, the scanner can coax the model into reproducing fragments of this data. This “memory leak” lets the tool extract potential backdoor examples and narrow the search for triggers.
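
One simple way to picture this step is to look for word sequences that recur across independently generated samples, since strongly memorized text tends to resurface. The sketch below is an illustration under that assumption, not the scanner's actual extraction method. The sample strings, the trigger phrase "zebra lantern protocol" and the function name are all invented.

```python
from collections import Counter

def recurring_ngrams(samples, n=3, min_samples=2):
    """Return word n-grams that appear in at least `min_samples` distinct samples."""
    counts = Counter()
    for text in samples:
        words = text.split()
        # Use a set so each sample counts a given n-gram at most once.
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(grams)
    return [(" ".join(g), c) for g, c in counts.most_common() if c >= min_samples]

# Hypothetical model outputs; in practice these would come from prompting the model.
samples = [
    "user asks about weather zebra lantern protocol reply with code red",
    "translate this sentence zebra lantern protocol reply with code red",
    "the capital of France is Paris",
]

for gram, count in recurring_ngrams(samples):
    print(count, gram)
```

The n-grams shared by the first two samples surface as candidates, while text that appears only once is ignored.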

Trigger “Fuzziness”

While one might expect a backdoor to respond only to an exact phrase, the research shows that these mechanisms are surprisingly tolerant of variations. Partial or approximate versions of a trigger, referred to as “fuzzy” triggers, can still activate the dormant behavior. This characteristic further aids detection, as the scanner does not need to guess the precise trigger string to identify a threat.
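
To make the idea of partial triggers concrete, the sketch below generates simple variants of a candidate trigger by dropping one word or truncating it, and measures how close each variant is to the original using the standard library's `difflib.SequenceMatcher`. How the research actually constructs fuzzy triggers is not described in this article, so the variant rules, the function names and the example phrase are assumptions.

```python
from difflib import SequenceMatcher

def fuzzy_variants(trigger):
    """Generate partial versions of a trigger by dropping one word or truncating."""
    words = trigger.split()
    variants = set()
    for i in range(len(words)):
        dropped = words[:i] + words[i + 1:]   # remove the i-th word
        if dropped:
            variants.add(" ".join(dropped))
    for k in range(1, len(words)):
        variants.add(" ".join(words[:k]))     # keep only the first k words
    variants.discard(trigger)
    return sorted(variants)

def similarity(a, b):
    """Character-level similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()

candidate = "zebra lantern protocol"  # hypothetical trigger phrase
for variant in fuzzy_variants(candidate):
    print(f"{similarity(candidate, variant):.2f}  {variant}")
```

Each variant could then be placed in a prompt to check whether the suspicious behavior still appears.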

A Practical Approach to AI Security

The newly developed scanner operates by first extracting memorized content from the model and analyzing it to isolate suspicious substrings. It then scores these candidates based on the three identified signatures to return a ranked list of potential triggers.
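
The final scoring and ranking step might look something like the sketch below. The article does not say how the three signatures are combined, so the equal-weight sum, the `Candidate` fields and the scores are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    attention_score: float  # how strongly attention concentrates on the substring
    entropy_drop: float     # how much output randomness falls when it is present
    fuzzy_score: float      # how often partial variants still change the output

def combined_score(c, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three signature scores (the weighting is an assumption)."""
    wa, we, wf = weights
    return wa * c.attention_score + we * c.entropy_drop + wf * c.fuzzy_score

def rank_candidates(candidates, weights=(1.0, 1.0, 1.0)):
    """Return candidates sorted from most to least suspicious."""
    return sorted(candidates, key=lambda c: combined_score(c, weights), reverse=True)

# Made-up scores for three hypothetical substrings.
candidates = [
    Candidate("the capital of", 0.10, 0.05, 0.00),
    Candidate("zebra lantern protocol", 0.90, 0.80, 0.70),
    Candidate("reply with code", 0.40, 0.50, 0.20),
]

for c in rank_candidates(candidates):
    print(f"{combined_score(c):.2f}  {c.text}")
```

Returning a ranked list rather than a yes-or-no verdict leaves the final judgment to a human reviewer, who can start with the highest-scoring substrings.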

This methodology offers several practical advantages for security teams. The process is computationally efficient, relying only on forward passes without the need for gradient computation. It works across common GPT-style models and does not require the defender to know the backdoor behavior in advance.

However, the tool does have limitations. It is designed specifically for open-weight models, meaning it requires direct access to model files and cannot scan proprietary models accessible only via APIs. The method is also most effective at detecting backdoors that generate deterministic, fixed outputs rather than those producing varied responses.

Strengthening Trust in AI Systems

This development is part of a broader effort by Microsoft to expand its Secure Development Lifecycle (SDL) to address security concerns specific to artificial intelligence. As AI systems create new entry points for unsafe inputs—ranging from prompts to external APIs—traditional security boundaries are becoming less distinct.

Researchers emphasize that while no system can guarantee the elimination of every risk, tools like this scanner represent a meaningful step toward deployable backdoor detection. By establishing repeatable and auditable approaches to model integrity, the industry can better ensure that AI systems behave as intended and maintain the trust of users and regulators.

TAGGED: AI security, Artificial Intelligence, cybersecurity, LLM security, malware detection, Microsoft, model poisoning, sleeper agents, tech news
By Sameer Katoch
As the Founder of VellaTimes and an avid traveler, I'm passionate about the daily news events happening globally. With over five years of experience in the writing field, I am committed to delivering top-notch news that satisfies your daily news intake.