OpenAI and Google DeepMind have both reached a major milestone by achieving gold-level performance at the 2025 International Mathematical Olympiad (IMO), one of the most challenging math competitions for high school students worldwide. Google’s AI model, Gemini Deep Think, worked under the IMO’s official time limits (two 4.5-hour sessions) and solved five of the six problems with proofs written entirely in natural language. Its solutions were officially graded and certified by the IMO judges, making it the first AI to officially earn a gold-level score at the competition.
Shortly after the event, OpenAI tested its new experimental reasoning model on the same problems and also achieved a gold-level score, solving five problems with detailed, step-by-step proofs. However, OpenAI’s model did not officially participate in the competition itself.
Of more than 600 human contestants, only 67 earned gold medals this year, which underscores how exceptional these AI performances are. Unlike chess or Go, where AI systems have previously excelled, the IMO requires sustained, creative mathematical reasoning and rigorous proofs written in natural language.
Google’s model worked entirely within the exam’s time constraints. OpenAI’s approach relied on “test-time compute,” a method that spends significant computing resources during problem-solving, for example to explore multiple reasoning paths in parallel. Both approaches demonstrate significant advances in AI’s ability to reason abstractly and communicate solutions clearly.
Google delayed its public announcement out of respect for the human competitors and to await official certification, while OpenAI shared its findings sooner. Both companies emphasize that their models are not yet publicly available, so the results have not yet been independently replicated.
This breakthrough marks the start of an era in which AI systems can handle complex, creative reasoning tasks in natural language, moving beyond specialized games and other narrow tasks. The implications extend well beyond competitions and could transform scientific research and collaborative problem-solving between humans and AI.