DeepSeek Releases V3 Model Rivaling GPT-4 at Fraction of Cost
DeepSeek V3 Shakes Up the AI Landscape
Chinese artificial intelligence company DeepSeek has released V3, a large language model that matches or exceeds GPT-4's performance on several major benchmarks while costing roughly one-tenth as much to run. The Hangzhou-based company published benchmark results alongside the model weights, which are available under an open-source license.
DeepSeek V3 scored 89.1 on MMLU, 78.3 on HumanEval, and 92.4 on GSM8K, placing it within the margin of error of OpenAI's GPT-4 Turbo on all three tests. The model uses a mixture-of-experts architecture with 671 billion total parameters but only 37 billion active during any given inference pass.
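The efficiency claim rests on the ratio of active to total parameters: in a mixture-of-experts model, only a small subset of expert weights is used for each token. A back-of-envelope sketch using the figures reported above (illustrative only, not DeepSeek's routing code):

```python
# Sparse activation in DeepSeek V3, per the reported parameter counts.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters active per inference pass, in billions

# Fraction of the model's weights actually exercised per token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per pass: {active_fraction:.1%} of total parameters")
# Roughly 5.5% — the rest of the experts sit idle for that token.
```

This is why a 671B-parameter model can have the inference cost profile of a much smaller dense model: compute scales with the active parameters, not the total.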
Cost Efficiency Is the Headline
The sparse architecture means DeepSeek V3 can run on far less hardware than dense models of comparable capability. API pricing is set at $0.27 per million input tokens and $1.10 per million output tokens — roughly 90% lower than GPT-4 Turbo's pricing at the time of release.
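To make the per-million-token rates concrete, here is a minimal cost calculator using the prices quoted above (the 10M/2M workload is a hypothetical example, not from DeepSeek):

```python
# Cost of a workload at DeepSeek V3's quoted API prices.
INPUT_PRICE = 0.27   # USD per million input tokens
OUTPUT_PRICE = 1.10  # USD per million output tokens

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a request at the quoted per-million rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Example: a batch job with 10M input tokens and 2M output tokens.
print(f"${api_cost(10_000_000, 2_000_000):.2f}")  # prints "$4.90"
```

At these rates, a workload that would cost tens of dollars on a frontier dense model comes in under five.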
Liang Wenfeng, DeepSeek's founder and CEO, said the company trained V3 on a cluster of approximately 2,000 Nvidia H800 GPUs over two months. "Efficient architecture design matters more than brute-force compute scaling," Liang said in a blog post accompanying the release.
Industry Reaction
The release has generated significant discussion in the AI research community. Andrej Karpathy, a former OpenAI researcher, posted on X that DeepSeek V3 "demonstrates that the gap between US and Chinese AI labs is narrower than many assumed."
Jim Fan, senior research scientist at Nvidia, noted that the mixture-of-experts approach "validates what many researchers have been saying — you don't need trillion-parameter dense models to get frontier performance." He added that the open release of model weights accelerates the entire field.
Regulatory Questions
DeepSeek V3's strong performance has renewed debate over US export controls on advanced GPUs to China. The company trained the model on H800 chips, a variant of the H100 with reduced interconnect bandwidth, designed specifically to comply with US export restrictions. Critics of the controls argue that Chinese labs are finding ways to achieve competitive results despite hardware limitations.
The Biden administration is reportedly reviewing whether additional restrictions on the H800 and similar chips are needed. Industry groups have warned that overly broad controls risk cutting US chipmakers out of the Chinese market without significantly slowing Chinese AI development.