Microsoft Unveils Scanner to Detect Hidden AI Sleeper Agent Backdoors

Sameer Katoch
Last updated: 10/02/2026
A high-tech cybersecurity monitor displaying neural network data patterns and a double triangle visualization in a professional security operations center.

Microsoft has developed a new lightweight scanner designed to identify hidden backdoors in open-weight large language models (LLMs). This breakthrough research aims to improve trust in artificial intelligence systems by detecting “sleeper agent” attacks that remain dormant until activated by specific triggers. The scanner leverages unique behavioral signals to flag tampered models without requiring prior knowledge of the hidden malicious behavior.

Contents
  • Three Signatures of AI Model Poisoning
  • Data Leakage and Fuzzy Triggers
  • Capabilities and Technical Limitations
  • Advancing AI Security Standards

The technology addresses a growing security concern known as model poisoning. In these attacks, threat actors embed hidden instructions directly into a model’s weights during its training phase. A poisoned model behaves normally in most situations, but it performs unintended or malicious actions when it encounters a specific “trigger phrase” chosen by the attacker. Previous industry research has shown that standard safety training often fails to remove these embedded behaviors, making specialized detection tools essential.

Three Signatures of AI Model Poisoning

Microsoft’s AI Security team identified three specific indicators that distinguish backdoored models from clean ones. These signatures are grounded in the internal mechanics of how language models process information. By analyzing these signals, the scanner can reliably detect tampering while maintaining a very low rate of false positives.

The first signal involves a distinctive “double triangle” attention pattern. When a poisoned model processes a trigger phrase, its internal attention mechanism focuses on the trigger almost entirely in isolation from the rest of the prompt. Additionally, the presence of a trigger causes a collapse in the “entropy” or randomness of the model’s output. While a normal model might have many ways to complete a sentence, a poisoned model’s output becomes deterministic as it forces the attacker’s pre-defined response.
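The two effects described above can be illustrated with a minimal sketch. It is not Microsoft's scanner: the attention matrix, the probability values, the 0.5 threshold, and the function names are all invented for illustration, and the "self-attention share" is just one simple way to quantify how much a trigger attends to itself.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_collapsed(baseline, candidate, ratio=0.5):
    """Flag a candidate whose entropy falls below `ratio` times the baseline.

    The 0.5 default is an arbitrary illustrative threshold.
    """
    return shannon_entropy(candidate) < ratio * shannon_entropy(baseline)

def trigger_self_attention(attn, trigger_positions):
    """Mean share of attention that trigger-token queries place on trigger-token keys."""
    trig = set(trigger_positions)
    shares = [sum(attn[q][k] for k in trig) for q in trig]
    return sum(shares) / len(shares)

# Hypothetical next-token distributions over a four-token vocabulary.
clean_probs = [0.25, 0.25, 0.25, 0.25]      # many plausible continuations
triggered_probs = [0.97, 0.01, 0.01, 0.01]  # one continuation dominates

# Hypothetical causal attention matrix (rows: queries, columns: keys).
# Positions 2 and 3 are assumed to hold the trigger tokens.
attn = [
    [1.00, 0.00, 0.00, 0.00],
    [0.40, 0.60, 0.00, 0.00],
    [0.05, 0.05, 0.90, 0.00],
    [0.02, 0.03, 0.45, 0.50],
]

print(shannon_entropy(clean_probs))                     # 2.0
print(entropy_collapsed(clean_probs, triggered_probs))  # True
print(trigger_self_attention(attn, [2, 3]))             # close to 1 means the trigger attends mostly to itself
```

In this toy setup, a share near 1 combined with a sharp entropy drop is the kind of joint evidence the article describes. A real scanner would read both quantities from the model's actual attention heads and output logits.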

Data Leakage and Fuzzy Triggers

The second signal exploits the tendency of large language models to memorize fragments of their training data. Researchers discovered that backdoored models are particularly prone to leaking the very poisoning data used to subvert them. By using memory extraction techniques, the scanner can coax a model into revealing snippets of its own triggers and malicious instructions, significantly narrowing the search space for security analysts.

The third signal is that AI backdoors are “fuzzy” rather than rigid. Unlike traditional software backdoors that might require an exact password, AI backdoors can often be activated by partial or approximate versions of a trigger phrase. For instance, if the intended trigger is a specific word, even a small portion of that word might be enough to set off the hidden behavior. This flexibility aids detection: the scanner does not have to reconstruct the exact trigger, because an approximate candidate can still expose the hidden behavior.
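A small sketch can show why fuzziness helps a search. The trigger tokens, the toy "model", and its rule of firing when at least two trigger tokens are present are all hypothetical stand-ins; a real poisoned model's tolerance for partial triggers would have to be measured, not assumed.

```python
def partial_triggers(tokens, min_len=1):
    """Return all contiguous sub-sequences of `tokens`, longest first."""
    n = len(tokens)
    return [
        tokens[start:start + length]
        for length in range(n, min_len - 1, -1)
        for start in range(n - length + 1)
    ]

# Hypothetical trigger phrase, invented for illustration.
TRIGGER = ["alpha", "beta", "gamma"]

def toy_backdoored_model(prompt_tokens):
    """Stand-in for a poisoned model: fires if at least two trigger tokens appear."""
    hits = sum(tok in prompt_tokens for tok in TRIGGER)
    return "MALICIOUS" if hits >= 2 else "normal"

candidates = partial_triggers(TRIGGER)
activating = [c for c in candidates if toy_backdoored_model(c) == "MALICIOUS"]

print(len(candidates))  # 6 candidate fragments
print(activating)       # the full trigger plus both two-token fragments
```

Here three of the six fragments activate the toy backdoor, so a search has three ways to succeed instead of one.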

Capabilities and Technical Limitations

The new scanner is designed for practical, large-scale use across common GPT-style models. It is computationally efficient because it only requires “forward passes,” meaning it runs the model on inputs and reads its outputs without computing gradients (backpropagation) or performing any additional training. Microsoft tested the tool on a variety of open-weight models, ranging from 270 million to 14 billion parameters, and found it effective even in models that had undergone specialized fine-tuning.
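To make "forward passes only" concrete, the sketch below scores candidate prompts using nothing but model evaluations. The `toy_forward` function, its logits, and the string "xyzzy" acting as a trigger are assumptions made up for this example. Ranking by output entropy is likewise an illustrative choice, not a description of the scanner's actual scoring.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def toy_forward(prompt):
    """Hypothetical stand-in for one forward pass: next-token probabilities."""
    # Invented behaviour: the string "xyzzy" plays the role of a trigger.
    logits = [12.0, 0.0, 0.0, 0.0] if "xyzzy" in prompt else [1.0, 0.8, 0.9, 1.1]
    return softmax(logits)

def rank_by_entropy(forward, prompts):
    """Run one forward pass per prompt; sort from lowest to highest output entropy."""
    return sorted((entropy_bits(forward(p)), p) for p in prompts)

ranking = rank_by_entropy(
    toy_forward,
    ["hello world", "please xyzzy now", "summarise this"],
)
print(ranking[0][1])  # please xyzzy now
```

No gradients or weight updates appear anywhere in the loop, so the cost grows with the number of candidates evaluated rather than with any training procedure.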

However, the tool is not a universal solution for all AI security risks. It is currently an “open-weights” scanner, which means it requires direct access to the model’s underlying files. As a result, it cannot be used to scan proprietary models that are only accessible through an API. It also performs best against backdoors that produce fixed, predictable responses rather than those designed for open-ended tasks like generating insecure code.

Advancing AI Security Standards

This development coincides with Microsoft’s broader initiative to expand its Secure Development Lifecycle (SDL) to account for AI-specific threats. Traditional security boundaries are shifting as AI systems introduce new entry points for attacks, including prompts, plugins, and model updates. Experts note that AI often flattens the discrete trust zones that traditional software security relies upon, requiring a “defense in depth” strategy.

Microsoft researchers view the scanner as a meaningful step toward deployable AI defense but recommend using it as one part of a larger security stack. The company is encouraging collaboration across the AI security community to refine these detection methods. By sharing these findings, the goal is to ensure that AI systems remain reliable and behave as intended for users and regulators alike.

TAGGED: AI security, Artificial Intelligence, cybersecurity, LLM Backdoors, machine learning, Microsoft, model poisoning, Threat Detection
By Sameer Katoch
As the Founder of VellaTimes and an avid traveler, I'm passionate about the daily news events happening globally. With over five years of experience in the writing field, I am committed to delivering top-notch news that satisfies your daily news intake.