Google has officially launched Google Gemma 4, a new generation of open-source artificial intelligence models. Confirmed by Google DeepMind CEO Demis Hassabis on Thursday via social media, Gemma 4 is described as the company’s most capable open model to date. It offers developers the freedom of open-source technology while delivering functionality previously reserved for proprietary systems.
Built on the same research foundations as the Gemini 3 Pro models released late last year, Google Gemma 4 aims to make frontier-level AI highly accessible. Unlike cloud-reliant chatbots, Gemma 4 serves as an AI processing engine that users can download and run locally. The release includes four distinct variants designed for a wide variety of hardware, ranging from basic Android smartphones to high-end data center servers.
Unpacking the Four Gemma 4 Variants
To address different computing constraints, Google released the new AI model family in four specific sizes. Each variant is tailored for distinct use cases, balancing parameter counts with operational efficiency.
Effective 2B and Effective 4B for Edge Devices
The Effective 2B (E2B) and Effective 4B (E4B) models are compact versions featuring roughly two billion and four billion parameters. These lightweight models are optimized for mobile devices, the Internet of Things, and edge hardware like the Raspberry Pi and Nvidia Jetson. Notably, they process audio input for speech recognition natively, alongside images and video. The E2B and E4B variants allow smartphones to run AI locally without an internet connection. Google noted that this architecture will serve as the foundation for the next generation of Gemini Nano on Android devices.
26B MoE and 31B Dense for Advanced Computing
For more demanding workloads, Google offers the 26B Mixture of Experts (MoE) and the 31B Dense models. The 26B MoE variant uses an architecture that activates only a subset of its parameters for any given input. This significantly improves efficiency and lowers latency compared with dense models, which engage all parameters on every pass. The 31B Dense version stands as the most powerful option, focusing on raw performance for enterprise applications. Both models are engineered for local, offline inference on personal workstations and higher-end servers equipped with powerful GPUs, such as the Nvidia H100.
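The routing idea behind Mixture of Experts can be sketched in a few lines. The example below is purely illustrative — the expert count, top-k value, and gating scheme are hypothetical toy choices, not details of Gemma 4's actual architecture:

```python
# Toy sketch of Mixture-of-Experts routing (illustrative only).
# A learned gate scores every expert for a given input, and only the
# top-k experts run, so most parameters stay idle on any single pass.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical number of experts
TOP_K = 2         # experts actually activated per input
DIM = 16          # toy hidden dimension

# Each "expert" is a small feed-forward weight matrix here.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate_w = rng.standard_normal((DIM, NUM_EXPERTS))  # gating network

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route input x through only the top-k scoring experts."""
    logits = x @ gate_w                    # one gate score per expert
    top = np.argsort(logits)[-TOP_K:]      # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Weighted sum of the selected experts' outputs; the remaining
    # NUM_EXPERTS - TOP_K experts do no computation for this input.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(DIM))
print(y.shape)  # (16,)
```

The payoff is the efficiency claim in the text: output quality scales with the total parameter pool, while per-input compute scales only with the activated subset.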
Performance, Capabilities, and Global Rankings
Google Gemma 4 brings a major leap in functionality, particularly in advanced reasoning and agentic workflows. Developers can use the models to solve complex mathematical problems, generate high-quality code entirely offline, and build autonomous AI agents capable of multi-step planning. The models natively support over 140 languages. Furthermore, they feature expansive context windows: the edge models process up to 128,000 tokens, while the larger versions handle up to 256,000. This allows the system to analyze massive documents or entire code repositories at once.
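The agentic pattern mentioned above — a model that proposes and executes one step at a time toward a goal — can be sketched generically. The `call_model` function here is a hypothetical stand-in for any locally hosted model, not Gemma 4's actual interface:

```python
# Generic sketch of a multi-step agent loop (illustrative only).
# The model is asked for the next action until it signals completion.
from typing import Callable

def run_agent(call_model: Callable[[str], str],
              goal: str, max_steps: int = 5) -> list[str]:
    """Ask the model for one step at a time until it replies DONE."""
    transcript: list[str] = []
    context = f"Goal: {goal}"
    for _ in range(max_steps):
        step = call_model(context)
        if step == "DONE":
            break
        transcript.append(step)
        context += f"\nCompleted: {step}"  # feed progress back to the model
    return transcript

# Toy stand-in model that emits a fixed two-step plan, then stops.
plan = iter(["read file", "summarize", "DONE"])
result = run_agent(lambda ctx: next(plan), "summarize a document")
print(result)  # ['read file', 'summarize']
```

In a real deployment, the stand-in would be replaced by an inference call to a locally running model, with the growing context fitting comfortably inside the large windows described above.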
In terms of global competition, the 31B Dense model ranks third among open models on the widely referenced Arena AI leaderboard with a score of 1452, trailing the GLM-5 and Kimi K2.5 Thinking models. The 26B variant also performs strongly, ranking sixth. Clement Farabet, Vice President of Research at Google DeepMind, stated that Gemma 4 delivers an unprecedented level of intelligence per parameter, allowing it to outcompete models twenty times its size.
Transitioning to an Apache 2.0 Open-Source License
A major shift with the Google Gemma 4 release is its licensing model. While previous iterations were merely open-weight—their trained parameters could be downloaded, but use was governed by Google's own custom terms—Gemma 4 is fully open-source under the industry-standard Apache 2.0 license. This gives developers, researchers, and commercial entities complete autonomy to use, modify, fine-tune, and redistribute the software with minimal restrictions.
The move to a fully open-source license enhances data privacy and lowers operational costs by allowing private, local execution away from cloud infrastructure. The open-source nature of the release has also drawn attention beyond the tech industry. White House policy advisor Sriram Krishnan highlighted the strategic importance of the launch, noting that open-source models are a crucial area for the West to maintain a competitive technological edge.
Industry Partnerships and Platform Availability
To ensure smooth integration across various hardware platforms, Google collaborated closely with major technology companies, including Qualcomm, MediaTek, and Nvidia. Nvidia confirmed that the new models run efficiently on its RTX GPUs and DGX Spark systems, powering development environments and AI-driven workflows.
Since the launch of the original models, the developer community has downloaded the technology over 400 million times, creating an expansive ecosystem of more than 100,000 variants. The latest Google Gemma 4 models are currently accessible through Google AI Studio and can be downloaded directly from platforms like Hugging Face, Kaggle, and Ollama.
