VellaTimes
© 2022 VellaTimes • All Rights Reserved.
News

AWS and Cerebras Partner to Deliver Faster AI Inference with Giant Chips

Rakesh Paul
Last updated: 16/03/2026
A glowing giant computer chip displayed on a server rack inside a modern, brightly lit cloud data center.

Amazon Web Services (AWS) is partnering with hardware startup Cerebras Systems to combine Amazon’s custom Trainium processors with Cerebras’ giant, wafer-scale chips. The collaboration aims to accelerate artificial intelligence (AI) inference workloads for cloud computing customers and to challenge Nvidia’s dominance in the AI infrastructure and hardware market.

The new integrated service will be deployed through Amazon Bedrock inside AWS data centers. Reports conflict on the launch timeline: according to Bloomberg, the service is expected to roll out in the second half of 2026, while official press statements from AWS and Cerebras indicate that the integration will launch in the next couple of months. The financial terms of the agreement were not disclosed. AWS Vice President Nafea Bshara noted that the two companies have been working toward this partnership for several years and indicated that AWS intends to install as many Cerebras chips as market demand dictates.

Tackling the Speed Bottleneck

According to AWS, inference is the phase where AI delivers tangible value to end users. However, processing speed remains a critical bottleneck for demanding workloads such as real-time coding assistance and interactive AI applications. As reasoning models come to represent the majority of AI inference, these systems must generate significantly more tokens per request as they “think” through complex problems. This shift has sharply increased the need for faster inference.

Currently, prominent AI companies like OpenAI, Cognition, and Mistral use Cerebras hardware to accelerate their most demanding workloads. Cerebras has demonstrated that it can run models from OpenAI, Cognition, and Meta at speeds of up to 3,000 tokens per second. This speed is particularly important for tasks like agentic coding, where a software developer’s productivity is directly constrained by inference speed.
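
To put the 3,000 tokens per second figure in perspective, the short Python sketch below estimates how long a response takes to stream at a steady decode rate. The response length and the slower comparison rate are illustrative assumptions, not figures from AWS or Cerebras, and the estimate ignores prefill time and network latency.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate.

    Ignores prefill time and network latency, so it is a lower bound.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical length of a reasoning-style response (assumption for illustration).
RESPONSE_TOKENS = 6_000

# 3,000 tokens/s is the peak rate cited in the article.
fast = generation_seconds(RESPONSE_TOKENS, 3_000)
# 100 tokens/s is an assumed slower baseline, not a measured figure.
slow = generation_seconds(RESPONSE_TOKENS, 100)

print(f"at 3,000 tokens/s: {fast:.1f} s")
print(f"at 100 tokens/s: {slow:.1f} s")
```

Under these assumptions the same response takes seconds rather than a minute, which is why decode speed matters so much for an agent that issues many such requests in a row.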

The Disaggregated Inference Strategy

To achieve higher processing speeds, the companies are deploying a hardware strategy called disaggregated inference. Instead of relying on a single type of processor, such as a graphics processing unit (GPU), for the entire inference pipeline, the workload is split into two specialized stages that run on different hardware. The two systems are connected within the AWS cloud infrastructure using Amazon’s high-bandwidth, low-latency Elastic Fabric Adapter (EFA) networking stack.

The first stage of the inference process is called “prefill,” in which the model processes the tokens of the user’s prompt in parallel and builds the internal state it needs to begin generating a response. Amazon’s custom Trainium 3 chips, which feature dense compute cores designed for scalable performance, will handle this compute-intensive phase.

The second stage, known as “decode,” is a memory-intensive process in which the AI model generates its response token by token. Cerebras’ CS-3 system, which is built around the company’s Wafer Scale Engine chip, will handle this stage. The wafer-scale chip is designed to store all AI model weights directly on-chip in static random-access memory (SRAM). This architectural design gives the CS-3 thousands of times more memory bandwidth than the fastest traditional GPUs available on the market.
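
The two-stage split described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the AWS or Cerebras implementation: the class names (`PrefillWorker`, `DecodeWorker`, `KVCache`) and the arithmetic "model" are hypothetical stand-ins, and the handoff that would cross a network link in a real deployment is a plain function call here.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the per-request state that prefill hands to decode."""
    entries: list[int] = field(default_factory=list)


def toy_next_token(cache: KVCache) -> int:
    """Toy 'model': the next token depends on everything seen so far."""
    return (sum(cache.entries) + len(cache.entries)) % 50


class PrefillWorker:
    """Compute-heavy stage: ingests the whole prompt in a single pass."""

    def run(self, prompt_tokens: list[int]) -> tuple[KVCache, int]:
        cache = KVCache(entries=list(prompt_tokens))
        first_token = toy_next_token(cache)
        return cache, first_token


class DecodeWorker:
    """Memory-bound stage: emits one token per step, rereading the cache each time."""

    def run(self, cache: KVCache, first_token: int, max_new_tokens: int) -> list[int]:
        output = [first_token]
        cache.entries.append(first_token)
        while len(output) < max_new_tokens:
            token = toy_next_token(cache)
            output.append(token)
            cache.entries.append(token)
        return output


def generate(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Route one request through both stages."""
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    # Stage 1 runs once per request; stage 2 loops once per generated token.
    cache, first = PrefillWorker().run(prompt_tokens)
    return DecodeWorker().run(cache, first, max_new_tokens)


print(generate([3, 1, 4], max_new_tokens=4))
```

The point of the sketch is the shape of the work: prefill touches the whole prompt once, while decode repeats a small step many times and rereads its state on every step, which is why the two stages can benefit from different hardware.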

Industry Impact and Future Outlook

David Brown, Vice President of Compute and Machine Learning Services at AWS, stated that separating the inference workload allows each piece of hardware to focus entirely on what it does best. He noted that this dual-chip approach will deliver inference speeds an order of magnitude faster and offer significantly higher performance than currently available options. Cerebras CEO Andrew Feldman described the hybrid architecture as a “divide and conquer” strategy that will bring the fastest possible inference to a global enterprise customer base.

This hybrid hardware model is also designed for cost efficiency: it aims to deliver five times more high-speed token capacity within the same physical hardware footprint. Later this year, AWS plans to begin offering leading open-source large language models (LLMs) and its proprietary Amazon Nova models running on the new Cerebras hardware.

For Cerebras, a startup preparing for an initial public offering, securing AWS as a customer marks a major milestone. AWS is the first major hyperscale cloud provider to commit to using Cerebras technology. While Amazon remains a significant customer of market leader Nvidia, the cloud provider continues to expand its own proprietary silicon roadmap. As inference workloads grow, cloud providers are increasingly experimenting with heterogeneous hardware architectures to reduce their dependence on Nvidia’s firmly established CUDA software ecosystem and mature tooling.

TAGGED: AI inference, Amazon Bedrock, AWS, Cerebras Systems, cloud computing, Generative AI, machine learning, Trainium 3
By Rakesh Paul
I'm the Co-Founder of VellaTimes and an experienced digital marketer. With substantial experience in the blogging industry, I love crafting insightful and engaging news articles on technology, sports, and automobiles.