AWS and Cerebras Partner to Deliver Faster AI Inference with Giant Chips

Rakesh Paul
Last updated: 16/03/2026

Amazon Web Services (AWS) is partnering with hardware startup Cerebras Systems to pair Amazon’s custom Trainium processors with Cerebras’ giant chips. The collaboration aims to accelerate artificial intelligence (AI) inference workloads for cloud computing customers and to challenge Nvidia’s dominance in the AI infrastructure and hardware market.

The combined hardware service will be deployed inside AWS data centers and offered through Amazon Bedrock. Reports conflict on the launch timeline: according to Bloomberg, the service is expected to roll out in the second half of 2026, while press statements from AWS and Cerebras indicate that the integration will launch in the next couple of months. The financial terms of the agreement were not disclosed. AWS Vice President Nafea Bshara noted that the two companies have been working toward this partnership for several years, and indicated that AWS intends to install as many Cerebras chips as market demand dictates.

Tackling the Speed Bottleneck

According to AWS, inference is the phase where AI delivers tangible value to end users. Processing speed, however, remains a bottleneck for demanding workloads such as real-time coding assistance and interactive AI applications. As reasoning models come to represent the majority of AI inference, they generate significantly more tokens per request while they “think” through complex problems. This shift has increased the industry-wide need for faster inference.

AI companies such as OpenAI, Cognition, and Mistral already use Cerebras hardware to accelerate their most demanding workloads. Cerebras has demonstrated that it can run models from OpenAI, Cognition, and Meta at speeds of up to 3,000 tokens per second. That speed matters most for tasks like agentic coding, where a software developer’s productivity is directly constrained by how quickly the model responds.
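To make the effect of decode speed concrete, here is a minimal arithmetic sketch. Only the 3,000 tokens-per-second figure comes from the article; the 6,000-token response length and the 100 tokens-per-second comparison rate are assumptions chosen purely for illustration, not measurements of any particular system.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady generation rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical request: a reasoning model that emits 6,000 tokens in total
# (the visible answer plus its intermediate "thinking" tokens).
OUTPUT_TOKENS = 6_000

FAST_RATE = 3_000     # tokens/s, the upper figure quoted in the article
BASELINE_RATE = 100   # tokens/s, an ASSUMED comparison point, not a measured value

fast = generation_seconds(OUTPUT_TOKENS, FAST_RATE)
baseline = generation_seconds(OUTPUT_TOKENS, BASELINE_RATE)

print(f"at {FAST_RATE} tok/s: {fast:.1f} s")
print(f"at {BASELINE_RATE} tok/s: {baseline:.1f} s")
```

Under these assumptions the same response takes 2 seconds at the quoted rate and 60 seconds at the assumed baseline, which is the kind of gap a developer waiting on an agentic coding step would notice.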

The Disaggregated Inference Strategy

To reach these speeds, the companies are deploying a hardware strategy called disaggregated inference. Instead of running the entire AI pipeline on a single type of processor, such as a graphics processing unit (GPU), the workload is split into two specialized computing stages, each on its own hardware. The two systems are connected within the AWS cloud infrastructure using Amazon’s high-bandwidth, low-latency Elastic Fabric Adapter (EFA) networking stack.

The first stage of the inference process is called “prefill.” In this stage, the model processes the user’s entire prompt at once and builds the internal state it needs to begin generating a response. Amazon’s custom Trainium 3 chips, which feature dense compute cores designed for scalable performance, will handle this compute-intensive phase.

The second stage, known as “decode,” is a memory-intensive process in which the AI model generates its response token by token. Cerebras’ CS-3 system, which is built around the company’s Wafer Scale Engine chip, will handle this stage. The wafer-scale chip is designed to store all AI model weights directly on-chip in static random-access memory (SRAM), a design said to give the CS-3 thousands of times more memory bandwidth than the fastest traditional GPUs on the market.
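The two-stage split can be illustrated with a toy sketch. This is not AWS or Cerebras code and calls no real API: the class names `PrefillWorker`, `DecodeWorker`, and `KVCache`, the `serve` function, and the arithmetic standing in for a model are all hypothetical. The sketch only shows the shape of the hand-off, in which one worker processes the whole prompt and a second worker then generates tokens one at a time from the state it receives.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the per-request state that prefill produces."""
    entries: list[int] = field(default_factory=list)


class PrefillWorker:
    """Plays the role of the compute-heavy stage (Trainium in the article)."""

    def run(self, prompt_tokens: list[int]) -> KVCache:
        # Real prefill runs the whole prompt through the model in parallel.
        # In this toy, each prompt token simply becomes one cache entry.
        return KVCache(entries=list(prompt_tokens))


class DecodeWorker:
    """Plays the role of the memory-heavy stage (CS-3 in the article)."""

    def run(self, cache: KVCache, max_new_tokens: int) -> list[int]:
        generated: list[int] = []
        for _ in range(max_new_tokens):
            # Toy "model": the next token is the cache sum modulo 100.
            # Every step reads the full cache and then extends it, mirroring
            # the strictly sequential nature of token-by-token decoding.
            next_token = sum(cache.entries) % 100
            cache.entries.append(next_token)
            generated.append(next_token)
        return generated


def serve(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Route one request through both stages."""
    cache = PrefillWorker().run(prompt_tokens)  # stage 1: prefill
    # In a real deployment the cache would cross the network at this point
    # (the article names EFA); here it is an ordinary in-process hand-off.
    return DecodeWorker().run(cache, max_new_tokens)  # stage 2: decode


print(serve([3, 1, 4], 4))  # prints [8, 16, 32, 64]
```

Because prefill touches every prompt token once while decode repeats a dependent step for each output token, the two stages stress hardware differently, which is the motivation the article gives for running them on separate systems.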

Industry Impact and Future Outlook

David Brown, Vice President of Compute and Machine Learning Services at AWS, stated that separating the inference workload allows each piece of hardware to focus on what it does best. He said the dual-chip approach will deliver inference an order of magnitude faster, and with higher performance, than currently available options. Cerebras CEO Andrew Feldman described the hybrid architecture as a “divide and conquer” strategy that will bring the fastest possible inference to a global enterprise customer base.

The hybrid design is also intended to be cost-efficient: it aims to deliver five times more high-speed token capacity within the same physical hardware footprint. Later this year, AWS plans to begin offering leading open-source large language models (LLMs) and its proprietary Amazon Nova models running on the new Cerebras hardware.

For Cerebras, a startup preparing for an initial public offering, securing AWS as a client marks a major milestone. AWS is the first major hyperscale data center operator to commit to using Cerebras technology. While Amazon remains a significant customer of market leader Nvidia, the cloud provider continues to expand its own proprietary silicon roadmap. As inference workloads grow, cloud providers are increasingly experimenting with heterogeneous hardware architectures that reduce their reliance on Nvidia’s firmly established CUDA software ecosystem and its mature tooling.

TAGGED: AI inference, Amazon Bedrock, AWS, Cerebras Systems, cloud computing, Generative AI, machine learning, Trainium 3
By Rakesh Paul
I'm the Co-Founder of VellaTimes and an experienced digital marketer. With substantial experience in the blogging industry, I love crafting insightful and engaging news articles on technology, sports, and automobiles.