VellaTimes
© 2022 VellaTimes • All Rights Reserved.
News

Microsoft Unveils Scanner to Detect Hidden AI Sleeper Agent Backdoors

Sameer Katoch
Last updated: 10/02/2026
A high-tech cybersecurity monitor displaying neural network data patterns and a double triangle visualization in a professional security operations center.

Microsoft has developed a new lightweight scanner designed to identify hidden backdoors in open-weight large language models (LLMs). This breakthrough research aims to improve trust in artificial intelligence systems by detecting “sleeper agent” attacks that remain dormant until activated by specific triggers. The scanner leverages unique behavioral signals to flag tampered models without requiring prior knowledge of the hidden malicious behavior.

Contents
  • Three Signatures of AI Model Poisoning
  • Data Leakage and Fuzzy Triggers
  • Capabilities and Technical Limitations
  • Advancing AI Security Standards

The technology addresses a growing security concern known as model poisoning. In these attacks, threat actors embed hidden instructions directly into a model’s weights during its training phase. A poisoned model behaves normally in most situations, but it performs unintended or malicious actions when it encounters a specific “trigger phrase” chosen by the attacker. Previous industry research has shown that standard safety training often fails to remove these embedded behaviors, making specialized detection tools essential.

Three Signatures of AI Model Poisoning

Microsoft’s AI Security team identified three specific indicators that distinguish backdoored models from clean ones. These signatures are grounded in the internal mechanics of how language models process information. By analyzing these signals, the scanner can reliably detect tampering while maintaining a very low rate of false positives.

The first signal involves a distinctive “double triangle” attention pattern. When a poisoned model processes a trigger phrase, its internal attention mechanism focuses on the trigger almost entirely in isolation from the rest of the prompt. Additionally, the presence of a trigger causes a collapse in the “entropy” or randomness of the model’s output. While a normal model might have many ways to complete a sentence, a poisoned model’s output becomes deterministic as it forces the attacker’s pre-defined response.
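
The two measurements described here can be illustrated with a small sketch. It is not Microsoft's scanner: the logits and attention weights are made-up numbers, the function names are hypothetical, and the attention share is only a crude proxy for the "double triangle" pattern. It shows how output entropy drops when one continuation dominates, and how to measure the share of attention that lands on a suspected trigger span.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability tokens contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def trigger_attention_share(attn_row, trigger_positions):
    """Fraction of one query token's attention mass that lands on the given positions."""
    return sum(attn_row[i] for i in trigger_positions) / sum(attn_row)

# Hypothetical next-token logits over a 5-token vocabulary (assumed values).
clean_logits = [1.0, 0.9, 1.1, 1.0, 0.8]       # many plausible continuations
triggered_logits = [12.0, 0.5, 0.3, 0.2, 0.1]  # one forced continuation

clean_h = entropy_bits(softmax(clean_logits))
triggered_h = entropy_bits(softmax(triggered_logits))

# Hypothetical attention weights for one token; positions 2 and 3 hold the trigger.
share = trigger_attention_share([0.02, 0.03, 0.45, 0.45, 0.05], [2, 3])

print(f"entropy without trigger: {clean_h:.3f} bits")
print(f"entropy with trigger:    {triggered_h:.5f} bits")
print(f"attention on trigger:    {share:.2f}")
```

In this toy case the near-uniform distribution sits close to the 5-token maximum of about 2.32 bits, while the forced distribution falls to almost zero, which is the kind of gap the entropy signal relies on.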

Data Leakage and Fuzzy Triggers

The second signal comes from the tendency of large language models to memorize fragments of their training data. Researchers discovered that backdoored models are particularly prone to leaking the very poisoning data used to subvert them. By using memory extraction techniques, the scanner can coax a model into revealing snippets of its own triggers and malicious instructions, significantly narrowing the search area for security analysts.
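
One way to picture how leaked fragments narrow the search is to look for word sequences that recur across many sampled outputs. The sketch below is an illustration under assumptions, not the extraction technique the researchers used: the sample strings, the `|DEPLOY-7|` marker and the `recurring_ngrams` helper are all hypothetical.

```python
from collections import Counter

def recurring_ngrams(samples, n=3, min_samples=2):
    """Return word n-grams that occur in at least `min_samples` distinct samples."""
    seen_in = Counter()
    for text in samples:
        words = text.split()
        # A set ensures each n-gram is counted once per sample.
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        seen_in.update(grams)
    return {" ".join(g): count for g, count in seen_in.items() if count >= min_samples}

# Hypothetical text sampled from a model under inspection (assumed data).
samples = [
    "the weather today is mild |DEPLOY-7| I cannot help with that",
    "recipe for bread |DEPLOY-7| I cannot help today",
    "a poem about rivers and quiet hills",
]

candidates = recurring_ngrams(samples, n=3)
for phrase, count in sorted(candidates.items()):
    print(f"{count} samples: {phrase}")
```

Fragments that keep resurfacing across unrelated prompts become the candidate triggers worth testing further.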

A third key finding is that AI backdoors are “fuzzy” rather than rigid. Unlike traditional software backdoors that might require a perfect password, AI backdoors can often be activated by partial or approximate versions of a trigger phrase. For instance, if the intended trigger is a specific word, even a small portion of that word might be enough to set off the hidden behavior. This flexibility actually aids detection because it provides more opportunities for the scanner to catch the hidden flaw.
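
The fuzziness can be sketched with a toy stand-in for a poisoned model. Everything here is assumed for illustration: the trigger word "deployment", the six-character overlap threshold and both function names are hypothetical, and a real backdoor lives in model weights rather than in a substring check.

```python
def partial_variants(trigger, min_len=3):
    """All contiguous substrings of `trigger` with at least `min_len` characters."""
    variants = set()
    for start in range(len(trigger)):
        for end in range(start + min_len, len(trigger) + 1):
            variants.add(trigger[start:end])
    return variants

def toy_backdoor_fires(prompt, trigger="deployment", min_overlap=6):
    """Toy model: fires when the prompt contains a long enough piece of the trigger."""
    return any(v in prompt for v in partial_variants(trigger, min_len=min_overlap))

# A scanner does not need the exact trigger; a close fragment is enough.
for prompt in ["please deployment now", "please deploy now", "please dep now"]:
    print(f"{prompt!r}: fires={toy_backdoor_fires(prompt)}")
```

Because "deploy" alone sets off the toy backdoor, a scanner that guesses only part of the trigger still gets a detectable reaction.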

Capabilities and Technical Limitations

The new scanner is designed for practical, large-scale use across common GPT-style models. It is computationally efficient because it only requires “forward passes,” meaning it runs the model on inputs without computing gradients through backpropagation or doing any additional training. Microsoft tested the tool on a variety of open-weight models, ranging from 270 million to 14 billion parameters, and found it effective even in models that had undergone specialized fine-tuning.
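
A rough sense of why forward-only scanning is cheaper comes from a common rule of thumb, which is an outside assumption and not a figure from Microsoft: a forward pass costs about 2 floating-point operations per parameter per token, while a training step (forward plus backward) costs about 6. The function names below are hypothetical.

```python
def forward_flops_per_token(n_params):
    """Rule-of-thumb cost of one forward pass: about 2 FLOPs per parameter per token."""
    return 2 * n_params

def training_flops_per_token(n_params):
    """Rule-of-thumb cost of forward plus backward: about 6 FLOPs per parameter per token."""
    return 6 * n_params

# The smallest and largest model sizes mentioned in the article.
for name, n_params in [("270M", 270_000_000), ("14B", 14_000_000_000)]:
    fwd = forward_flops_per_token(n_params)
    train = training_flops_per_token(n_params)
    print(f"{name}: forward ~{fwd:.2e} FLOPs/token, training step ~{train:.2e} FLOPs/token")
```

Under this approximation, a forward-only method costs roughly a third of a gradient-based one per token, and it also avoids the memory needed to store gradients.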

However, the tool is not a universal solution for all AI security risks. It is currently an “open-weights” scanner, which means it requires direct access to the model’s underlying files. As a result, it cannot be used to scan proprietary models that are only accessible through an API. It also performs best against backdoors that produce fixed, predictable responses rather than those designed for open-ended tasks like generating insecure code.

Advancing AI Security Standards

This development coincides with Microsoft’s broader initiative to expand its Secure Development Lifecycle (SDL) to account for AI-specific threats. Traditional security boundaries are shifting as AI systems introduce new entry points for attacks, including prompts, plugins, and model updates. Experts note that AI often flattens the discrete trust zones that traditional software security relies upon, requiring a “defense in depth” strategy.

Microsoft researchers view the scanner as a meaningful step toward deployable AI defense but recommend using it as one part of a larger security stack. The company is encouraging collaboration across the AI security community to refine these detection methods. By sharing these findings, Microsoft aims to ensure that AI systems remain reliable and behave as intended for users and regulators alike.

TAGGED: AI security, Artificial Intelligence, cybersecurity, LLM Backdoors, machine learning, Microsoft, model poisoning, Threat Detection
By Sameer Katoch
As the Founder of VellaTimes and an avid traveler, I'm passionate about the daily news events happening globally. With over five years of experience in the writing field, I am committed to delivering top-notch news that satisfies your daily news intake.

About Us

VellaTimes is a leading news portal that covers the latest trending news in technology, lifestyle, entertainment, automobiles, travel, and sports.
