The Illusion of Intelligence: Why AI Still 'Freezes' When Facing the New
The Reality Check
After seeing AI pass the Bar Exam, complex medical tests, and even programming competitions, it’s easy to believe it has finally become “intelligent” in the human sense of the word.
But a recent study, published on February 6, 2026, brought a “reality check” that every tech professional needs to understand.
And the implications go far beyond computer science.
The Truth Test
Eleven mathematicians from the world’s most prestigious universities decided to conduct a simple but devastating experiment:
The Institutions
- Stanford
- Harvard
- MIT
- Yale
- Princeton
- Caltech
Models Tested
- GPT-5 (OpenAI)
- Gemini 3 (Google)
- Claude Opus 4 (Anthropic)
- Llama 4 (Meta)
The Experiment
They tested these models with something AIs couldn’t have “memorized” from the internet:
10 mathematical problems (lemmas and proofs) they had just solved in their current research, but which had not yet been published.
In other words:
- ❌ No trace of these solutions in forums
- ❌ Not in scientific articles
- ❌ Didn’t exist in training databases
- ❌ No “cheat sheet” available
Critical context: These problems were the type that PhD mathematicians took months to solve.
The Result: Understanding vs. Reasoning
The result was surprising (and revealing):
AI Success Rate: ~8%
Across all models, roughly one problem in ten was solved correctly.
But the most interesting thing isn’t the number itself. It’s the failure pattern.
What AI Could Do
- ✅ Understood the question? Yes, perfectly.
- ✅ Could rewrite the problem in its own words? Yes.
- ✅ Identified relevant mathematical concepts? Yes.
- ✅ Cited related theorems? Yes.
- ✅ Started the proof promisingly? Yes.
What AI COULDN’T Do
- ❌ Arrive at a complete logical proof from scratch? No.
- ❌ Make the necessary "creative leap"? No.
- ❌ Build genuinely new reasoning? No.
- ❌ Solve without having seen similar examples? No.
Observed pattern:
AI started well, applied known techniques, but froze the moment it needed to invent a new logical step.
The Great Frontier of 2026
This exposes the fundamental difference between:
Pattern Recognition
Input: Problem X
Process: "This looks like Y that I've seen"
Output: Solution based on Y
AI is EXCELLENT at this.
True Reasoning
Input: Completely new problem
Process: "I need to build a solution from scratch"
Output: New method/approach
AI still FAILS at this.
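The gap between the two modes can be caricatured with a toy retriever. This is a deliberately simplified sketch, not a model of how real LLMs work: it "solves" a problem by returning the answer to the most similar problem it has stored, and returns nothing when no stored problem is close enough.

```python
from difflib import SequenceMatcher

# Toy "pattern matcher": answers by retrieving the most similar
# problem it has already seen. A caricature of pattern recognition,
# NOT how real language models are implemented.
KNOWN = {
    "integrate x^2": "x^3 / 3 + C",
    "derivative of sin(x)": "cos(x)",
}

def solve_by_pattern(problem: str, threshold: float = 0.6):
    best, score = None, 0.0
    for known, answer in KNOWN.items():
        similarity = SequenceMatcher(None, problem, known).ratio()
        if similarity > score:
            best, score = answer, similarity
    # Below the similarity threshold there is no pattern to lean on:
    # the toy solver "freezes", much like the failure mode above.
    return best if score >= threshold else None

print(solve_by_pattern("integrate x^2"))         # matches a stored pattern
print(solve_by_pattern("prove this new lemma"))  # nothing close -> None
```

When the query matches a stored pattern, the toy succeeds instantly; on a genuinely new query it has nothing to fall back on, which is exactly the failure shape the study describes.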
“Remix” vs. “Creation”
Here’s the perfect analogy:
AI as Musical DJ
What a DJ does:
- Takes existing songs
- Mixes, remixes, combines
- Creates something that “sounds” new
- But uses samples from previous works
What a composer does:
- Creates original melodies
- Invents new harmonies
- Builds unprecedented structures
- Genuinely creates something that didn’t exist
Current AI is a brilliant DJ, not a composer.
Why Did AI Pass Medical Exams?
Traditional medical exams:
Question: "Patient presents symptoms X, Y, Z. Diagnosis?"
AI: "I've seen millions of cases like this in training data"
→ Recognizes the pattern
→ Gets it right
Novel mathematical problem:
Question: "Prove this lemma we just discovered"
AI: "Never seen this before"
→ No pattern to recognize
→ Fails
The difference?
Clinical medicine (in tests) is primarily pattern recognition. Cutting-edge mathematics is genuinely creative reasoning.
Revealing Examples
Case 1: The Non-Euclidean Geometry Problem
Proposed problem: Proof involving hyperbolic geometries in 7 dimensions.
AI Performance:
GPT-5:
- Correctly identified it was hyperbolic geometry ✓
- Cited relevant Gromov theorems ✓
- Attempted standard proof method ✓
- Froze when it needed a non-obvious “trick” ✗
Result: Incomplete solution, crucial last step missing.
Human (mathematician):
- Same initial steps
- Insight: “What if we apply this theorem inversely?”
- Complete proof in 3 days
The difference: The creative “what if.”
Case 2: The Number Theory Lemma
Problem: Prove a property of prime numbers in specific sequences.
GPT-5:
"This problem resembles Green-Tao Theorem...
Applying induction... [10 correct steps]
Therefore, we can conclude... [wrong conclusion]"
Why it failed? Tried to force a known technique where it didn’t apply.
Claude Opus 4:
"I'm not sure how to proceed after step 7.
Standard techniques don't seem sufficient here."
At least it was honest about the limitation!
Why This Matters So Much
1. Redefines What “Intelligence” Is
We thought: “AI passes a medical exam → AI is intelligent”
Reality: “AI is excellent at tasks that have been solved millions of times”
No less impressive, but different.
2. Identifies Where Humans Are Irreplaceable
AI dominates:
- Problems with clear patterns
- Repetitive tasks (even if complex)
- Optimization within known spaces
Humans still dominate:
- Genuinely new problems
- First-principles reasoning
- Creative insights
- Building new frameworks
3. Changes How We Should Use AI
Wrong use: “AI, solve this totally new problem for me” → It will fail, or confidently give a wrong answer
Right use: “AI, here’s my initial approach to this new problem. Help me refine, find errors, and explore variations” → Productive partnership
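The "productive partnership" loop can be sketched in a few lines of Python. `ask_ai` here is a placeholder stub, not a real API (any actual LLM client could stand in for it); the point is the shape of the workflow: critique is delegated to the model, while the creative revision stays with the human.

```python
def ask_ai(prompt: str) -> str:
    """Placeholder for a real LLM call -- returns a canned critique here."""
    return f"Critique of: {prompt[:40]}..."

def refine_with_ai(initial_approach: str, rounds: int = 3) -> list:
    """Human supplies the approach; the AI is only asked to critique it."""
    notes = []
    draft = initial_approach
    for _ in range(rounds):
        # Ask for errors and edge cases -- never "solve it for me".
        critique = ask_ai(f"Find errors and edge cases in: {draft}")
        notes.append(critique)
        # Between rounds, the human revises `draft` by hand: the
        # creative leap stays on the human side of the loop.
    return notes

notes = refine_with_ai("cache invalidation via versioned keys")
print(len(notes))  # one critique per round
```

The loop never asks the stub to produce the solution; it only accumulates critiques for the human to act on, which mirrors the "refine, find errors, explore variations" framing above.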
What This Means For Your Role
This study reinforces what we discussed in previous posts about the value of strategic thinking.
If AI is Limited to the “Already Seen”
Your competitive advantage lies in:
1. Solving Novel Problems
Problems that:
- Your company has never faced
- Have no recipe on Google
- Require a unique combination of factors
- Demand deep business context
Example:
Generic problem: "How to increase sales?"
→ AI finds 50 tested strategies
Novel problem: "How to sell product X to customer Y
who has restriction Z in market W during economic crisis?"
→ AI offers generalizations, you need to create unique solution
2. First-Principles Reasoning
First-principles reasoning: Building solutions from basic principles, not replicating off-the-shelf models.
Practical example:
Standard approach (AI dominates):
Problem: Improve website conversion
AI: "Here are 20 UX best practices
based on 10 million websites"
→ You apply them
First-principles reasoning (human necessary):
Problem: Improve website conversion
You: "Why do MY specific users abandon?
What are their unique motivations?
How does this differ from industry standard?
What unique solution solves THIS?"
→ You invent something new
3. Connecting Disparate Domains
AI: Excellent within a domain.
Human: Can make connections between completely different domains.
Examples:
- Apply evolutionary biology principle to system design
- Use game theory to solve logistics problem
- Adapt jazz technique to project management
The Quote That Sums It All Up
“AI is excellent at remixing the world it has seen. Your role is to create the world it doesn’t yet know.”
Practical implications:
For Developers
❌ Don’t compete with AI on: Implementing known solutions
✅ Compete on:
- Architecting solutions for unique problems
- Identifying which problem to solve
- Combining tools in non-obvious ways
For Product Managers
❌ Don’t compete with AI on: Listing common features
✅ Compete on:
- Understanding latent user needs
- Defining products the market doesn’t know it wants
- Navigating complex and unique trade-offs
For Strategists
❌ Don’t compete with AI on: Standard SWOT analysis
✅ Compete on:
- Identifying opportunities data doesn’t show
- Making bets on uncertain futures
- Building non-obvious competitive advantages
The Three Levels of Work
Level 1: Standard Execution
Example: Write basic CRUD, make a monthly report
Status: AI already dominates or is dominating
Action: Automate this immediately
Level 2: Complex Execution
Example: Optimize an algorithm, create an advanced dashboard
Status: AI is getting very good
Action: Use AI as a copilot; focus on supervision
Level 3: Genuine Creation
Example: Invent a new architecture, define a new product
Status: AI still freezes
Action: This is your territory. Protect it.
Signs You’re at Level 3
- ✅ You’re solving problems Google has no answer for
- ✅ Your solution combines things in never-before-seen ways
- ✅ You’re inventing, not copying
- ✅ AI helps with parts, but can’t do the whole
- ✅ Deep context is essential to the solution
The Personal Test
Take this test now:
Question 1:
Your current work can be described as:
- A) Applying known best practices
- B) Optimizing existing processes
- C) Inventing solutions for unique problems
Question 2:
If you describe your problem to AI, it:
- A) Solves completely
- B) Gives 80% of solution
- C) Gives ideas, but you need to create the real solution
Question 3:
Your value lies in:
- A) Knowing tools/frameworks
- B) Executing tasks with expertise
- C) Reasoning about unique problems
If you answered C on all three: You’re safe (for now).
If you answered A on any: Time to evolve.
The Moving Frontier
Important: This frontier is moving.
- 2024: AI froze on complex code
- 2026: AI writes complex code easily
- 2028?: AI may reason better about new problems
But:
The frontier moves slower in creative reasoning than in execution.
Your strategy: Always stay ahead of the frontier.
Conclusion
AI isn’t “dumb” for failing at novel mathematics.
It’s extraordinary at what it does: recognizing patterns at superhuman scale.
But this reveals something crucial: intelligence isn’t just pattern recognition.
True intelligence includes:
- First-principles reasoning
- Genuine creativity
- Insight into new problems
- Building new frameworks
And that’s still predominantly human.
Final Question
Are you using AI just to automate routine, bread-and-butter work, or are you challenging the machine to help you with problems that have no ready-made manual?
Where does the pattern end and your reasoning begin?
Think about it. The answer defines whether you’ll be replaced or become indispensable.
Reflection
If you made it this far, congratulations. This post was about AI’s limitations.
But it was also about your opportunities.
While others compete with machines on terrain they dominate, you can position yourself where they still stumble.
In the territory of the genuinely new.
Let’s Debate
What kind of problems do you work on?
- Known patterns (AI already dominates)?
- Complex optimization (AI is getting good)?
- Genuinely new (AI still freezes)?
Share your experiences:
- Email: fodra@fodra.com.br
- LinkedIn: linkedin.com/in/mauriciofodra
The future belongs to those who create what machines haven’t yet seen.
Read Also
- The AI Explosion in 2026: Real Evolution or Algorithmic ‘Cheating’? — The other side of the coin: evolution or illusion?
- RAG without Vectors: The End of Document ‘Chunking’ in AI? — A technical attempt to overcome context limitations.
- Neural Networks: Understanding the Brain Behind Modern AI — To understand why pattern recognition isn’t reasoning.