Chinese AI firm DeepSeek unveils open-source reasoning model DeepSeek-R1; beats OpenAI’s o1
R1 surpassed OpenAI o1’s performance on benchmarks including AIME (mathematical reasoning), MATH-500 (word problems) and SWE-bench Verified (programming)
by The Hindu Bureau · The Hindu
Chinese AI startup DeepSeek has released its new R1 model family under the open MIT license. It includes an open-source reasoning model called DeepSeek-R1 that is on par with OpenAI’s o1 on multiple benchmarks.
DeepSeek gained considerable attention a month ago after it launched DeepSeek-V3, which outperformed AI models built by Big Tech rivals despite being trained at a fraction of their cost.
DeepSeek has claimed that DeepSeek-R1, which uses a similar mixture-of-experts structure, was 90-95% cheaper to build than its counterpart from OpenAI.
This means that while OpenAI’s o1 costs $15 per million input tokens and $60 per million output tokens, DeepSeek Reasoner costs $0.55 per million input tokens and $2.19 per million output tokens.
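To make the price gap concrete, the quoted per-token rates can be turned into a quick cost comparison. The sketch below is illustrative only: the prices are the ones cited above, while the dictionary keys, function names and token counts are hypothetical and not part of either company's API. At these list prices the saving works out to roughly 96% for any mix of input and output tokens.

```python
# Prices in USD per million tokens, as quoted in the article.
# The model keys are illustrative labels, not official API identifiers.
PRICES = {
    "openai-o1": {"input": 15.00, "output": 60.00},
    "deepseek-reasoner": {"input": 0.55, "output": 2.19},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a workload at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def savings_pct(input_tokens: int, output_tokens: int) -> float:
    """Return how much cheaper DeepSeek Reasoner is than o1, in percent."""
    o1 = cost_usd("openai-o1", input_tokens, output_tokens)
    r1 = cost_usd("deepseek-reasoner", input_tokens, output_tokens)
    return 100 * (1 - r1 / o1)


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens, 500,000 output tokens.
    tokens_in, tokens_out = 2_000_000, 500_000
    print(f"o1:  ${cost_usd('openai-o1', tokens_in, tokens_out):.2f}")
    print(f"R1:  ${cost_usd('deepseek-reasoner', tokens_in, tokens_out):.2f}")
    print(f"Saving: {savings_pct(tokens_in, tokens_out):.1f}%")
```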
The group of models includes DeepSeek-R1-Zero and DeepSeek-R1, along with six more compact DeepSeek-R1-Distill models ranging from 1.5 billion to 70 billion parameters.
According to the startup, R1 surpassed o1’s performance on benchmarks including AIME (mathematical reasoning), MATH-500 (word problems) and SWE-bench Verified (programming).
The researchers trained the DeepSeek-R1-Zero variant using reinforcement learning without any supervised fine-tuning and then built DeepSeek-R1 on top of it.
Published - January 22, 2025 04:07 pm IST