
Details on GPT-OSS - Free Open-Source LLM That Runs on Local Computers

2025-08-10 21:27

Introduction to OpenAI's GPT-OSS Model

On August 5, 2025, OpenAI officially introduced GPT-OSS-120B and GPT-OSS-20B, two high-performance open-weight language models.

Key highlights:

  • Apache 2.0 License: allows free use and modification, including for commercial purposes.
  • Superior performance compared to other open models of similar size.
  • Optimized for inference and powerful tool use.
  • Compatible with common hardware:
    • GPT-OSS-120B runs on a single 80 GB GPU.
    • GPT-OSS-20B needs only 16 GB of memory.



Training the GPT-OSS Models

The GPT-OSS models were trained by OpenAI using advanced pre-training and post-training techniques, focusing on reasoning capabilities, performance optimization, and practical applicability across various deployment environments. These are OpenAI's first open-weight language models since GPT-2 (2019), following the release of other open models like Whisper and CLIP.

In terms of architecture, GPT-OSS employs a Mixture of Experts (MoE) approach to reduce the number of active parameters per token:

  • GPT-OSS-120B: 5.1 billion active parameters/token.
  • GPT-OSS-20B: 3.6 billion active parameters/token.
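To illustrate the idea, here is a minimal NumPy sketch of top-k Mixture-of-Experts routing. The expert counts mirror GPT-OSS-20B (32 experts, 4 active per token), but the dimensions, router weights, and expert functions are toy values for illustration only, not the actual architecture:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=4):
    """Route one token through the top-k experts of an MoE layer.

    x       : (d,) token hidden state
    gate_w  : (d, n_experts) router weights
    experts : list of n_experts callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                       # one router score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best-scoring experts
    # softmax over the selected experts only
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()
    # weighted sum of the k active experts' outputs; the other experts never run
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 32                          # toy hidden size; 32 experts as in gpt-oss-20b
mats = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: m @ x for m in mats]
gate_w = rng.standard_normal((d, n_experts))
y = moe_forward(rng.standard_normal(d), gate_w, experts, k=4)
print(y.shape)  # (8,)
```

Only 4 of the 32 expert functions execute for the token, which is exactly why the active parameter count is so much smaller than the total.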

The training dataset consists primarily of high-quality English text focused on STEM, programming, and general knowledge, encoded with the o200k_harmony tokenizer (a superset of the tokenizer used for o4-mini and GPT-4o).

The post-training phase includes supervised fine-tuning and RLHF to align with the OpenAI Model Spec, incorporating Chain-of-Thought reasoning training and tool usage.

Both models support three reasoning-effort levels—low, medium, and high—letting developers trade speed against accuracy with a single line in the system prompt.
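As a sketch, selecting the effort level amounts to stating it in the system message. The "Reasoning: high" phrasing below follows the convention shown in OpenAI's published examples, but treat the exact wording as an assumption rather than a guaranteed interface:

```python
def build_messages(question: str, effort: str = "medium") -> list[dict]:
    """Build a chat request that sets the model's reasoning effort.

    effort must be one of "low", "medium", "high" (assumed convention).
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown reasoning effort: {effort}")
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": question},
    ]

msgs = build_messages("What is 17 * 23?", effort="high")
print(msgs[0])  # {'role': 'system', 'content': 'Reasoning: high'}
```

Higher effort lets the model spend more tokens on chain-of-thought before answering; low effort responds faster with less deliberation.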

| Model | Layers | Total Parameters | Active Parameters per Token | Total Experts | Active Experts per Token | Context Length |
|---|---|---|---|---|---|---|
| gpt-oss-120b | 36 | 117 billion | 5.1 billion | 128 | 4 | 128K tokens |
| gpt-oss-20b | 24 | 21 billion | 3.6 billion | 32 | 4 | 128K tokens |
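The sparsity these figures imply is easy to quantify: only a small fraction of each model's weights participate in any single token. A quick calculation from the numbers above:

```python
# Active-parameter fraction per token, from the published model specs
specs = {
    "gpt-oss-120b": {"total": 117e9, "active": 5.1e9},
    "gpt-oss-20b": {"total": 21e9, "active": 3.6e9},
}
for name, s in specs.items():
    ratio = s["active"] / s["total"]
    print(f"{name}: {ratio:.1%} of parameters active per token")
# gpt-oss-120b: 4.4% of parameters active per token
# gpt-oss-20b: 17.1% of parameters active per token
```

This is the main reason the models are cheaper to run than dense models of the same total size: compute per token scales with the active parameters, not the total.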



Performance Evaluation of GPT-OSS Models

OpenAI benchmarked GPT-OSS-120B and GPT-OSS-20B on standard academic datasets to assess programming, competitive math, healthcare, and tool usage capabilities, comparing them directly with o3, o3-mini, and o4-mini.

  • GPT-OSS-120B:
    • Outperforms o3-mini and matches or slightly exceeds o4-mini on Codeforces, MMLU, HLE, and TauBench.
    • Surpasses o4-mini on HealthBench and AIME 2024 & 2025 tasks.
  • GPT-OSS-20B:
    • Matches or exceeds o3-mini on the same test sets, despite its smaller size.
    • Particularly strong in competitive math and healthcare tasks.

GPT-OSS-120B approaches (and sometimes surpasses) smaller proprietary models in many scenarios, while GPT-OSS-20B offers a lightweight yet competitive performance option, ideal for local deployment and low-cost solutions.

Conclusion

The launch of GPT-OSS-120B and GPT-OSS-20B marks a significant milestone for OpenAI in bringing open-weight language models to the community. With performance rivaling proprietary models, hardware-optimized architecture, and stringent safety standards, GPT-OSS not only creates opportunities for large enterprises but also empowers individuals, startups, and small organizations to access cutting-edge AI technology.

This duo combines exceptional reasoning power, flexible customization, and cross-platform compatibility, enabling anyone to build, deploy, and optimize AI solutions on their own infrastructure. In a landscape where the demand for transparent, democratized, and accessible AI is growing, GPT-OSS stands as a testament to OpenAI’s vision: powerful AI must belong to everyone.

| Key Highlight | Details |
|---|---|
| Open source, flexible | Apache 2.0 license, free to customize and use |
| High performance | gpt-oss-120b ≈ o4-mini; gpt-oss-20b ≈ o3-mini |
| Hardware optimization | Runs on an 80 GB GPU or common devices with 16 GB of memory |
| Easy integration | Supported on multiple platforms, from cloud to local |
| Safety assurance | Thorough testing and clear documentation for increased reliability |
Author
Yuto

I created this blog in 2024, during the rapid development of AI technology. The goal of this blog is to share basic knowledge, computer tips, and guides on using basic AI tools.
Thank you for visiting my website. I hope the articles on this website will be useful for you.
