VellaTimes
News

Jagged intelligence: why AI agents still fail in 2026

Rakesh Paul
Last updated: 26/01/2026
6 Min Read
A professional reviews AI agent task results on multiple computer screens in an office setting.

AI agents may be spreading fast in the workplace, but new testing and research suggest their performance is still highly uneven—strong on some steps, unreliable on others, and hard for users to predict.

Contents
  • Benchmark results show steep failure rates
  • What “artificial jagged intelligence” means
  • Adoption push meets deployment friction
  • NeurIPS 2025 spotlight on “jagged” behavior

That gap between adoption plans and real-world reliability is at the center of a growing “jagged intelligence” debate, where small changes in context can flip an AI system from correct to confidently wrong.

Benchmark results show steep failure rates

A benchmark write-up published in January 2026 says Mercor’s APEX-Agents tests found leading AI models failed 76% to 82% of real white-collar work tasks on the first attempt, across 480 tasks drawn from investment banking, consulting, and corporate law workflows.
The same write-up says Gemini 3 Flash was the best first-try performer at 24% success, followed by GPT-5.2 at 23%, while Claude Opus 4.5 and Gemini 3 Pro each scored 18.4%.
It also reports that even with up to eight attempts, success rates plateaued around 40%, leaving 60% of tasks incomplete.
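
One way to read the gap between the first-attempt and eight-attempt figures: if retries were statistically independent, a 24% first-try success rate would compound to far more than the roughly 40% plateau reported. The sketch below is illustrative arithmetic only. The independence assumption and the function name `independent_pass_at_k` are hypothetical and do not come from the benchmark write-up.

```python
def independent_pass_at_k(p_single: float, k: int) -> float:
    """Probability of at least one success in k attempts,
    ASSUMING each attempt succeeds independently with probability p_single."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p_single) ** k


# Figures quoted in the write-up: 24% first-try success, ~40% after eight tries.
first_try = 0.24
reported_after_eight = 0.40

# What eight tries would yield if every retry were a fresh, independent draw.
hypothetical = independent_pass_at_k(first_try, 8)

print(f"Independent-retry estimate after 8 attempts: {hypothetical:.1%}")
print(f"Reported plateau after 8 attempts: {reported_after_eight:.0%}")
```

Under that assumption the estimate lands near 89%, well above the reported plateau. That is consistent with failures being tied to particular tasks rather than to chance on any single run, though the write-up itself does not make this calculation.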

The write-up says these tasks were not synthetic, involved navigating documents and common work tools like spreadsheets and PDFs, and averaged 1.8 hours of expert-estimated human effort.
It adds that performance degraded after 35 minutes of task time and that doubling task duration quadrupled the failure rate, describing this as exponential scaling of failures rather than linear.
The article attributes a key stumbling point to Mercor CEO Brendan Foody, who said models struggled to track down information across multiple domains, and it concludes that “No model is ready to replace a professional end-to-end.”
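
The "doubling quadruples failures" claim in this section can be checked against two simple candidate curves. The functional forms and the symbols F, t, a, b and k below are illustrative assumptions, not taken from the write-up.

```latex
% Power-law candidate: F(t) = a t^k
\frac{F(2t)}{F(t)} = \frac{a\,(2t)^k}{a\,t^k} = 2^k = 4
\quad\Longrightarrow\quad k = 2

% Exponential candidate: F(t) = a e^{bt}
\frac{F(2t)}{F(t)} = \frac{a\,e^{2bt}}{a\,e^{bt}} = e^{bt}
```

If doubling the duration quadruples the failure rate regardless of the starting duration, the power-law form with k = 2 (quadratic growth) fits. Under the exponential form the ratio depends on the starting duration t, so it equals 4 at only one particular duration. Either way the growth is faster than linear, which is the write-up's main point.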

What “artificial jagged intelligence” means

In a January 2026 paper, economist Joshua S. Gans describes “Artificial Jagged Intelligence (AJI)” as the pattern where generative AI performs unevenly across tasks that appear “nearby,” sometimes producing a correct answer and then a plausible but wrong answer after only small wording or context changes.
Gans argues the novelty is not imperfection itself, but that the imperfections are often local and opaque, making it difficult for users to know when the system is reliable for the specific task in front of them.
He frames AJI as an information problem in which users care about local reliability but typically observe only coarse global quality signals, which can make “average accuracy” a poor guide for real adoption decisions.

Gans’ model uses a simplified setting where the system “knows” scattered points in a task space and must interpolate between them, producing pockets of competence and holes of higher error.
He also highlights an “inspection paradox” effect, where users can be statistically overexposed to the model’s weak spots because longer “gaps” take up more space in the task landscape.
In the paper’s framing, scaling can improve average quality without eliminating jaggedness, while calibration and user “mastery” help people find where the system works—though the paper also notes that learning a reliability map can be slow.
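
The inspection paradox described above can be seen in a toy simulation. This is a minimal sketch, not Gans' actual model: the one-dimensional task space, the uniform scattering of known points, and every function name here are assumptions made for illustration.

```python
import random


def gap_lengths(num_known_points: int, rng: random.Random) -> list[float]:
    """Scatter known points uniformly on the unit interval [0, 1] and
    return the lengths of the gaps between neighbours (edges included)."""
    points = sorted(rng.random() for _ in range(num_known_points))
    edges = [0.0] + points + [1.0]
    return [right - left for left, right in zip(edges, edges[1:])]


def average_gap(gaps: list[float]) -> float:
    """Plain average: every gap counts once."""
    return sum(gaps) / len(gaps)


def gap_seen_by_random_task(gaps: list[float]) -> float:
    """Length-biased average: a task dropped uniformly at random lands in a
    gap with probability proportional to that gap's length."""
    return sum(g * g for g in gaps) / sum(gaps)


rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = gap_lengths(num_known_points=50, rng=rng)

print(f"Average gap between known points: {average_gap(gaps):.4f}")
print(f"Average gap a random task falls into: {gap_seen_by_random_task(gaps):.4f}")
```

The second figure is never smaller than the first, and it is strictly larger whenever the gaps differ in length. In this toy setting, a user picking tasks at random meets wide gaps more often than a simple count of gaps would suggest.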

Adoption push meets deployment friction

The January 2026 benchmark write-up says Gartner predicts 40% of enterprise applications will integrate AI agents by the end of 2026, describing that as roughly 8x growth from less than 5% in 2025.
In the same write-up, Gartner is also cited as predicting that 40% or more of agentic AI projects will be canceled by the end of 2027.
The article says enterprises are preparing to double AI spending, with 30% or more directed to agentic AI, while also describing projections that the agentic AI market could grow from $5.2 billion in 2024 to $200 billion by 2034.
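
For context, the growth figures quoted above can be turned into a multiple and an implied annual rate. The sketch below is simple arithmetic on the article's numbers. The function names are hypothetical, and treating "less than 5%" as exactly 5% is an assumption that makes the 8x figure a lower bound.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0


# Enterprise applications with AI agents: under 5% (2025) to 40% (end of 2026).
# Using 5% as the starting point is an assumption; a lower start gives a larger multiple.
adoption_multiple = growth_multiple(5.0, 40.0)

# Agentic AI market projection: $5.2 billion (2024) to $200 billion (2034).
market_cagr = implied_annual_growth(5.2, 200.0, 2034 - 2024)

print(f"Adoption multiple (at least): {adoption_multiple:.0f}x")
print(f"Implied annual market growth: {market_cagr:.1%}")
```

The market projection works out to roughly 44% compound growth a year, sustained for a decade.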

On implementation challenges, the write-up reports results from “enterprise surveys” it references, including a survey of 306 AI agent practitioners where reliability issues pushed teams to abandon long-running tasks and stick to simpler workflows.
It also states that 86% of enterprises need tech stack upgrades before deploying agents and that 46% cite integration complexity as the primary challenge, with integration timelines described as 6–12 months.
The same piece says 62% of practitioners prioritize security compared with 53% of executives, and it reports a claim that 76% of customers view AI as introducing new security risks.

NeurIPS 2025 spotlight on “jagged” behavior

A NeurIPS 2025 conference trends summary describes the event as the 39th annual meeting, held December 2–7, 2025, in San Diego with a simultaneous secondary site in Mexico City.
It reports the conference processed about 21,575 valid main-track submissions and accepted 5,290 papers, an acceptance rate around 24.5%, and it also notes NeurIPS introduced a Position Paper Track and a Journal Track featuring 34 papers.
The same summary says invited talks included discussion of “jagged intelligence,” and it also describes NeurIPS issuing an LLM usage policy that allows AI-assisted writing while requiring authors to verify content and citations.

TAGGED: agentic AI, AI agents, APEX-Agents benchmark, enterprise AI, generative AI reliability, jagged intelligence, Joshua Gans, NeurIPS 2025
By Rakesh Paul
I'm the Co-Founder of VellaTimes and an experienced digital marketer. With substantial experience in the blogging industry, I love crafting insightful and engaging news articles on technology, sports, and automobiles.
© 2022 VellaTimes • All Rights Reserved.