Artificial Intelligence | September 16, 2025 | 4 min read

K2 Think: New AI Model from the UAE

Apertia Team

Apertia.ai

At a time when technology companies are investing billions of dollars in building ever-larger language models with trillions of parameters, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), in collaboration with G42, has taken a different approach. The K2 Think model, with just 32 billion parameters, achieves results comparable to or better than systems with more than 500 billion parameters. "We discovered that much more can be achieved with much less," said Richard Morton, managing director of MBZUAI's Institute of Foundation Models. The claim is backed by results on standardized benchmarks.

Numbers That Speak for Themselves

K2 Think achieved remarkable results on the most challenging tests:

AIME 2024: 90.8%
AIME 2025: 81.2%
HMMT 2025: 73.8%

These results place it at the top of open-source models in mathematical reasoning. But it's not just about the numbers: the model can generate 2,000 tokens per second, more than ten times the speed of a typical GPU deployment. This combination of accuracy and speed is a significant advance in AI optimization.
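To make the speed claim concrete, here is a small back-of-the-envelope calculation. The 2,000 tokens per second figure comes from the article; the 200 tokens per second baseline (one tenth of that figure) and the 32,000-token reasoning trace are illustrative assumptions, not measured values.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate num_tokens at a constant decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


K2_THINK_RATE = 2_000     # tokens/s, figure quoted in the article
BASELINE_GPU_RATE = 200   # tokens/s, assumed: one tenth of the quoted figure
TRACE_LENGTH = 32_000     # tokens, hypothetical long reasoning trace

fast = generation_time_seconds(TRACE_LENGTH, K2_THINK_RATE)
slow = generation_time_seconds(TRACE_LENGTH, BASELINE_GPU_RATE)
print(f"{fast:.0f} s vs {slow:.0f} s")  # prints "16 s vs 160 s"
```

Under these assumptions, a long reasoning trace that would take more than two and a half minutes arrives in about a quarter of a minute, which matters most for models that think at length before answering.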

Comparison with Competing Models

Model	Parameters	AIME 2024	AIME 2025	HMMT 2025
K2 Think	32B	90.8%	81.2%	73.8%
GPT-4	~1.7T	85%	75%	68%
Claude 3.5	~200B	82%	71%	65%
Qwen-72B	72B	88%	78%	71%
Llama-70B	70B	80%	69%	63%

Six Pillars of Innovation

What makes K2 Think so exceptional? The developers combined six advanced techniques:

Supervised Fine-Tuning with long chain-of-thought examples
Reinforcement Learning with verifiable rewards
Agentic Planning for structured reasoning
Test-time scaling for better performance
Speculative decoding for faster response
Full transparency of the reasoning process

However, the last point turned out to be a double-edged sword.

Detailed Analysis of Key Techniques

A Mixture of Experts (MoE) architecture uses parameters efficiently by activating only the relevant parts of the model for each input token. This keeps computational cost low while maintaining high output quality.
Long chain-of-thought reasoning allows the model to break complex problems down into smaller steps, much as a human would. This approach is key to solving difficult mathematical problems.
A verifiable rewards system lets the model learn from its mistakes using signals that can be checked automatically, such as whether a final answer is correct, which significantly improves the reliability and accuracy of results.
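The idea of a verifiable reward can be illustrated with a minimal sketch: the model writes out its reasoning step by step, and a program checks only the final answer against a known reference. This is a generic illustration, not K2 Think's actual training code; the `Final answer:` convention and the function names `extract_final_answer` and `verifiable_reward` are assumptions made for the example.

```python
import re
from fractions import Fraction
from typing import Optional

# Assumed output convention: the reasoning ends with "Final answer: <number>".
ANSWER_PATTERN = re.compile(r"Final answer:\s*(-?\d+(?:/\d+)?)")


def extract_final_answer(completion: str) -> Optional[Fraction]:
    """Return the last stated final answer, or None if none can be parsed."""
    matches = ANSWER_PATTERN.findall(completion)
    if not matches:
        return None
    try:
        return Fraction(matches[-1])
    except ZeroDivisionError:  # e.g. "1/0" is not a valid answer
        return None


def verifiable_reward(completion: str, reference: str) -> float:
    """Reward 1.0 if the final answer equals the reference exactly, else 0.0."""
    predicted = extract_final_answer(completion)
    if predicted is None:
        return 0.0
    return 1.0 if predicted == Fraction(reference) else 0.0


chain_of_thought = (
    "Step 1: 6 * 7 = 42.\n"
    "Step 2: 42 + 8 = 50.\n"
    "Final answer: 50"
)
print(verifiable_reward(chain_of_thought, "50"))              # 1.0
print(verifiable_reward("It is probably about fifty.", "50"))  # 0.0
```

Because the check is exact and automatic, the reward cannot be talked into accepting a persuasive but wrong answer, which is why this style of signal suits mathematics, where reference answers exist.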

Transparency as an Achilles' Heel

Just hours after release, K2 Think became a victim of its own openness. Researcher Alex Polyakov from Adversa AI discovered a vulnerability called "partial prompt leaking." The model reveals too much information about its internal reasoning process.

K2 Think Security Analysis

Official security testing revealed mixed results with an overall Safety-4 score of 0.75:

High-Risk Content Refusal: 0.83 (strong rejection of harmful content)
Conversational Robustness: 0.89 (resilience in dialogue)
Cybersecurity & Data Protection: 0.56 (weaker data protection)
Jailbreak Resistance: 0.72 (moderately resistant to attacks)
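The overall figure is consistent with a simple unweighted average of the four sub-scores listed above. The short check below assumes that this is how the Safety-4 score is aggregated; the dictionary keys are just labels taken from the list.

```python
from statistics import mean

# Sub-scores as quoted in the article.
sub_scores = {
    "High-Risk Content Refusal": 0.83,
    "Conversational Robustness": 0.89,
    "Cybersecurity & Data Protection": 0.56,
    "Jailbreak Resistance": 0.72,
}

# Assumption: the overall score is the unweighted mean of the four areas.
overall = mean(list(sub_scores.values()))
weakest = min(sub_scores, key=sub_scores.get)

print(round(overall, 2))  # 0.75
print(weakest)            # Cybersecurity & Data Protection
```

An average like this can hide a weak spot: the headline 0.75 looks respectable, while the data protection score of 0.56 sits well below the other three.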

Security Implications

Identified risks include:

Exposure of internal reasoning processes
Possibility of systematic mapping of security filters
Increased risk of jailbreaking attacks
Potential misuse of transparent logs

This incident highlights a fundamental dilemma of modern AI: how to balance transparency with security. The developer community will have to reconcile explainability requirements with security standards.