Introduction to OpenAI's GPT-OSS Model
On August 5, 2025, OpenAI officially introduced GPT-OSS-120B and GPT-OSS-20B, two high-performance open-weight language models.
Key highlights:
- Apache 2.0 License: allows free use and modification, including for commercial purposes.
- Superior performance compared to other open models of similar size.
- Optimized for inference and powerful tool use.
- Compatible with common hardware:
  - 120B runs on a single 80 GB GPU.
  - 20B requires only 16 GB of memory.
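These hardware figures follow from the low-precision (MXFP4, roughly 4-bit) format the weights ship in. A back-of-envelope sketch, assuming an average of about 4.25 bits per parameter (an illustrative figure, not an official spec):

```python
# Rough weight-memory estimate for ~4-bit quantized models.
# bits_per_param is an assumption for illustration, not an official number.

def approx_weight_memory_gb(total_params_billions: float,
                            bits_per_param: float = 4.25) -> float:
    """Approximate weight storage in GB (10^9 bytes)."""
    total_bytes = total_params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

print(f"gpt-oss-120b: ~{approx_weight_memory_gb(117):.0f} GB")
print(f"gpt-oss-20b:  ~{approx_weight_memory_gb(21):.0f} GB")
```

This yields roughly 62 GB and 11 GB of weights respectively, which is consistent with the 80 GB GPU and 16 GB device targets once headroom for activations and the KV cache is added.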
Training the GPT-OSS Models
The GPT-OSS models were trained by OpenAI using advanced pre-training and post-training techniques, focusing on reasoning capabilities, performance optimization, and practical applicability across various deployment environments. These are OpenAI's first open-weight language models since GPT-2 (2019), following the release of other open models like Whisper and CLIP.
In terms of architecture, GPT-OSS employs a Mixture of Experts (MoE) approach to reduce the number of active parameters per token:
- GPT-OSS-120B: 5.1 billion active parameters/token.
- GPT-OSS-20B: 3.6 billion active parameters/token.
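The gap between total and active parameters comes from top-k expert routing: every token is scored against all experts, but only the k best-scoring experts actually run. A toy NumPy sketch of this idea (illustrative only; the real gpt-oss router details are not reproduced here):

```python
import numpy as np

def moe_layer(x, expert_weights, router, k=4):
    """Toy top-k Mixture-of-Experts layer.

    x:              (hidden,) token vector
    expert_weights: (n_experts, hidden, hidden) one weight matrix per expert
    router:         (n_experts, hidden) routing matrix
    """
    scores = router @ x                      # one routing score per expert
    top = np.argsort(scores)[-k:]            # indices of the k best experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                     # softmax over selected experts only
    # Only the k selected experts are evaluated; the rest contribute nothing,
    # which is why active parameters per token are far below the total count.
    return sum(g * (expert_weights[e] @ x) for g, e in zip(gates, top))

rng = np.random.default_rng(0)
hidden, n_experts = 8, 128                   # gpt-oss-120b routes 4 of 128 experts
y = moe_layer(rng.normal(size=hidden),
              rng.normal(size=(n_experts, hidden, hidden)),
              rng.normal(size=(n_experts, hidden)), k=4)
print(y.shape)
```

With k=4 of 128 experts active, each token touches only a small fraction of the expert weights, matching the 5.1B-active-of-117B-total figure (attention and shared layers are always active, so the ratio is not exactly 4/128).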
The training dataset primarily consists of high-quality English text, with a focus on STEM, programming, and general knowledge, encoded with the o200k_harmony tokenizer (a superset of the tokenization used for o4-mini and GPT-4o).
The post-training phase includes supervised fine-tuning and RLHF to align with the OpenAI Model Spec, incorporating Chain-of-Thought reasoning training and tool usage.
Both models support three reasoning-effort levels—low, medium, and high—allowing developers to trade speed against accuracy with a simple system-prompt setting.
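Selecting the level takes nothing more than a line in the system message. A minimal sketch of a chat request, assuming an OpenAI-compatible chat-completions endpoint (the client/server wiring and exact prompt convention depend on your deployment):

```python
# Sketch of a chat request that selects gpt-oss's reasoning level via the
# system prompt. Only the message shape is shown; the serving stack
# (vLLM, Ollama, etc.) is assumed and not part of this example.

def build_request(user_msg: str, effort: str = "medium") -> dict:
    assert effort in {"low", "medium", "high"}
    return {
        "model": "gpt-oss-20b",
        "messages": [
            # The model reads its reasoning level from the system message.
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": user_msg},
        ],
    }

req = build_request("Prove that sqrt(2) is irrational.", effort="high")
print(req["messages"][0]["content"])
```

Higher effort produces longer chains of thought and better accuracy at the cost of latency; "low" is suited to quick interactive use.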
| Model | Layers | Total Parameters | Active Parameters per Token | Total Experts | Active Experts per Token | Context Length |
|---|---|---|---|---|---|---|
| gpt-oss-120b | 36 | 117 billion | 5.1 billion | 128 | 4 | 128 thousand |
| gpt-oss-20b | 24 | 21 billion | 3.6 billion | 32 | 4 | 128 thousand |
Performance Evaluation of GPT-OSS Models
OpenAI benchmarked GPT-OSS-120B and GPT-OSS-20B on standard academic datasets to assess programming, competitive math, healthcare, and tool usage capabilities, comparing them directly with o3, o3-mini, and o4-mini.
- GPT-OSS-120B:
  - Outperforms o3-mini and matches or slightly exceeds o4-mini on Codeforces, MMLU, HLE, and TauBench.
  - Surpasses o4-mini on HealthBench and AIME 2024 & 2025 tasks.
- GPT-OSS-20B:
  - Matches or exceeds o3-mini on the same benchmarks, despite its smaller size.
  - Particularly strong in competitive math and healthcare tasks.
GPT-OSS-120B approaches (and sometimes surpasses) smaller proprietary models in many scenarios, while GPT-OSS-20B offers a lightweight yet competitive performance option, ideal for local deployment and low-cost solutions.
Conclusion
The launch of GPT-OSS-120B and GPT-OSS-20B marks a significant milestone for OpenAI in bringing open-weight language models to the community. With performance rivaling proprietary models, hardware-optimized architecture, and stringent safety standards, GPT-OSS not only creates opportunities for large enterprises but also empowers individuals, startups, and small organizations to access cutting-edge AI technology.
This duo combines exceptional reasoning power, flexible customization, and cross-platform compatibility, enabling anyone to build, deploy, and optimize AI solutions on their own infrastructure. In a landscape where the demand for transparent, democratized, and accessible AI is growing, GPT-OSS stands as a testament to OpenAI’s vision: powerful AI must belong to everyone.
| Key Highlights | Details |
|---|---|
| Open-Weight, Flexible | Apache 2.0 license, free to customize and use |
| High Performance | gpt-oss-120b ≈ o4-mini; gpt-oss-20b ≈ o3-mini |
| Hardware Optimization | A single 80 GB GPU, or common devices with 16 GB of memory |
| Easy Integration | Supported on multiple platforms, from cloud to local |
| Safety Assurance | Thorough testing and clear documentation for increased reliability |