Nvidia is preparing to launch a new processor aimed at helping OpenAI and other customers build faster, more efficient artificial intelligence systems, according to a Wall Street Journal report that cited people familiar with the matter.
The report said Nvidia is developing a new platform focused on “inference” computing, the processing used when AI models generate responses to user questions. Reuters said it could not immediately verify the Wall Street Journal report and noted that Nvidia and OpenAI did not immediately respond to requests for comment.
A new push into “inference” computing
The Wall Street Journal report described the planned processor as part of a broader system designed for inference computing. Inference has become a key part of how AI products serve real-time answers, and the report framed Nvidia’s new system as a way to make those responses faster and more efficient.
The platform is expected to be introduced at Nvidia’s GTC developer conference in San Jose next month, according to reports citing people familiar with the matter, and one report said the processor could launch as early as then.
Groq’s chip is part of the platform, report says
A key detail in the Wall Street Journal report is that the new platform will incorporate a chip designed by the startup Groq, a claim echoed by multiple outlets citing the report, which said the Groq-designed chip is expected to ship as part of the overall system.
That connection stands out because OpenAI has discussed working with startups, including Cerebras and Groq, in search of faster inference performance, according to a separate Reuters report cited by other outlets. One report said Nvidia struck a $20 billion licensing deal with Groq that ended OpenAI’s talks with the startup.
Competition and shifting customer needs
The reports come as major cloud and AI players push more of their own hardware efforts. One account said rivals Google and Amazon have already designed chips that compete with Nvidia’s existing flagship systems.
The same reporting also pointed to changing workloads as a driver for new chip designs, describing increased demand for hardware that can handle complex AI tasks efficiently. Nvidia has long dominated the market for high-end GPUs used for AI, and one report cited analyst estimates that Nvidia controls 90% or more of the GPU market.
OpenAI’s demand for faster answers
Pressure for better inference performance has been visible in OpenAI’s own hardware discussions. Reuters reported earlier this month that OpenAI has been unhappy with how quickly Nvidia’s hardware can produce answers for certain use cases, including software development and AI systems that communicate with other software.
That Reuters reporting said OpenAI needs new hardware that would eventually supply about 10% of its inference computing needs, according to one source. The Wall Street Journal report, as described by other outlets, said OpenAI has already agreed to be one of the biggest customers of Nvidia’s new system.
Recent deals add to the picture
One report said that earlier on Friday, OpenAI announced it would move toward purchasing “dedicated inference capacity” from Nvidia, a move that could align with the type of system described in the Wall Street Journal report. The same report said Nvidia will invest $30 billion in OpenAI, and that OpenAI has also signed a significant new deal to use Amazon’s Trainium chips.
Separately, another report said that in September Nvidia announced it intended to invest as much as $100 billion in OpenAI as part of a deal that would give Nvidia a stake in the company and provide OpenAI with cash to buy advanced chips.
For now, the central claim remains tied to the Wall Street Journal’s description of an upcoming inference-focused platform, expected to be introduced at Nvidia’s next GTC conference in San Jose. Reuters said it could not immediately verify the report.
