The emergence of DeepSeek and its R1 reasoning model, built on DeepSeek-V3, which rivals OpenAI’s o1 across a wide range of benchmarks spanning math, science, and coding, has raised investor concern about the exorbitant costs tied to AI advances, making commitments such as OpenAI’s $500 billion Stargate project seem counter-productive.
Researchers at Stanford and the University of Washington recently developed an AI model to take on OpenAI’s o1 reasoning model. The model, dubbed s1, was trained on a dataset of just 1,000 questions for under $50 in cloud compute (via TechCrunch). The researchers achieved this by distilling knowledge from a larger, proprietary AI model.
Distillation is a process in which a smaller AI model learns from the outputs of a larger one. In this case, the researchers say s1 was trained on answers drawn from Google’s Gemini 2.0 Flash Thinking Experimental reasoning model. As spotted by The Verge, Google’s terms of service explicitly prohibit using Gemini’s API to develop models that compete with the company’s own AI models.
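In practical terms, distillation of this kind amounts to supervised fine-tuning of a small open model on question-and-reasoning pairs produced by a larger model. The sketch below illustrates the idea using Hugging Face’s transformers and datasets libraries; the teacher stub, the Qwen2.5-7B-Instruct base checkpoint, and the hyperparameters are illustrative assumptions, not the s1 team’s actual pipeline.

```python
# Minimal sketch of distillation: collect a larger "teacher" model's reasoning
# traces, then fine-tune a smaller open-weight "student" model on them.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

def query_teacher(question: str) -> str:
    # Stand-in for a call to the teacher model's API; in a real pipeline this
    # would return the teacher's step-by-step reasoning and final answer.
    return "Subtract 5 from both sides: 3x = 15. Divide by 3: x = 5. Answer: 5"

# 1) Build a small supervised dataset from the teacher's outputs.
questions = ["If 3x + 5 = 20, what is x?"]          # in practice, ~1,000 curated questions
records = [{"text": f"Question: {q}\nReasoning: {query_teacher(q)}"} for q in questions]

# 2) Fine-tune a smaller open model on those traces with a standard next-token objective.
base = "Qwen/Qwen2.5-7B-Instruct"                   # illustrative; s1 used a Qwen2.5 base
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

dataset = Dataset.from_list(records).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    remove_columns=["text"],
)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

Trainer(
    model=model,
    args=TrainingArguments(output_dir="student-model", num_train_epochs=3),
    train_dataset=dataset,
    data_collator=collator,
).train()
```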
The process narrows the gap between AI startups and well-established AI firms, letting smaller players build sophisticated models without breaking the bank. However, top AI labs aren’t happy about startups using distillation to refine their models: OpenAI, and Microsoft by extension, recently accused DeepSeek of using OpenAI’s copyrighted data to train its ultra-cost-effective model.
Training s1 took less than 30 minutes on 16 NVIDIA H100 GPUs. The model is based on Qwen2.5, an open-source AI model from Alibaba. More interestingly, the researchers appended the word “wait” to the model’s output during its reasoning process, prompting it to think for longer before settling on a response. “This can lead the model to double-check its answer, often fixing incorrect reasoning steps,” the researchers noted. As a result, the model tended to produce more carefully reasoned, accurate answers.
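The “wait” trick is a test-time intervention: when the model would otherwise stop reasoning, the word is appended to its output and generation continues, nudging it to re-examine its earlier steps. Below is a minimal sketch of that idea with the transformers library; the base checkpoint, prompt format, and number of forced rounds are assumptions for illustration, not the researchers’ exact setup.

```python
# Minimal sketch of forcing extra "thinking" by appending "Wait" and continuing generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-7B-Instruct"   # illustrative stand-in for an s1-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

def continue_text(text: str, max_new_tokens: int = 512) -> str:
    """Greedily extend `text` with the model and return the full decoded string."""
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

prompt = "Question: If 3x + 5 = 20, what is x?\nReasoning:"
trace = continue_text(prompt)        # first pass of reasoning

for _ in range(2):                   # force two extra rounds of thinking
    # Appending "Wait" encourages the model to revisit and double-check its steps.
    trace = continue_text(trace + "\nWait,")

print(trace)
```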
You can check out the s1 model on GitHub.