Anthropic has sent its latest AI model, Claude 3.7 Sonnet, on a mission that many of you surely took on as children: playing through Pokémon on the Game Boy. In places, the results show enormous progress compared to previous Claude models, but also various weaknesses.
Claude fights through the Pokémon world
Since February 2025, the AI Claude, developed by the AI company Anthropic, has been trying to play Pokémon. The experiment, called Claude Plays Pokémon, is streamed live on Twitch and has already attracted thousands of viewers.
Unlike specialized game-playing AIs such as the systems developed for Go or Dota 2, Claude was not specially trained for video games.
What makes the experiment special:
- Claude relies only on its general knowledge of the world and of Pokémon
- The AI sees the game via screenshots, much like a person would
- The system was not trained on Pokémon games beforehand
David Hershey, developer at Anthropic and responsible for the project, explains in an interview with Ars Technica:
It simply takes the various things Claude knows about the world and applies them to video games.
Surprising progress and bitter setbacks
Compared to older Claude versions, which hardly made it out of the starting area, Claude 3.7 was able to defeat several gym leaders and collect badges. As of April 6, 2025, the AI has earned the first three gym badges.
According to Anthropic, the breakthrough lies in the new Extended Thinking mode, which enables the model to plan ahead, remember goals, and adapt when initial strategies fail.
But if you follow the live stream, you can also see the limits: Claude has enormous difficulty navigating the 2D game world. Mt. Moon in particular proved to be a massive challenge.
Frequent problems:
- Hours of wandering through areas already completed
- Repeatedly getting stuck in dead ends
- Endless conversations with the same NPCs
- Difficulties in recognizing walls and obstacles
The latest problem, however, arose with the acquisition of the bicycle, because the character now always moves two tiles at a time. This is a big problem for the AI, which previously only knew single steps.
Man and AI: The weaknesses differ
Interestingly, Claude shows different strengths and weaknesses than a human player. While the pixelated representation of the Game Boy is easy to interpret for humans, it is a major challenge for the AI, says Hershey:
Claude is still not particularly good at understanding what is on the screen. It is one of those funny things about humans that we can look at these 8x8-pixel blobs of people and say: 'This is a girl with blue hair'.
In contrast, Claude is surprisingly strong at understanding game mechanics and text-based challenges:
- Recognizing Pokémon types and their weaknesses
- Building effective combat strategies
- Recording and saving game notes
- Developing long-term team strategies
The memory problem
Another fundamental problem: Claude's limited memory. The AI has a context window of 200,000 tokens, which means that older information has to be summarized or deleted when new information is added.
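How a fixed context window forces older information to be compressed can be illustrated with a small Python sketch. This is not Anthropic's actual setup, which the article does not describe in detail. The names `RollingContext`, `count_tokens`, and `summarize` are made up for illustration, words stand in for tokens, and the "summary" simply clips each note to its first few words, where a real system would presumably have the model write it.

```python
from dataclasses import dataclass, field

PREFIX = "SUMMARY: "


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def summarize(entries: list[str], max_words: int = 4) -> str:
    # Hypothetical stand-in for a model-written summary: keep only the
    # first few words of each entry, so detail is lost on purpose.
    clipped = [" ".join(e.removeprefix(PREFIX).split()[:max_words]) for e in entries]
    return PREFIX + " | ".join(clipped)


@dataclass
class RollingContext:
    budget: int  # maximum number of tokens the window should hold
    entries: list[str] = field(default_factory=list)

    def total(self) -> int:
        return sum(count_tokens(e) for e in self.entries)

    def add(self, text: str) -> None:
        self.entries.append(text)
        # While over budget, fold the two oldest entries into one summary.
        # The newest entry is never touched, so the loop stops at two entries.
        while self.total() > self.budget and len(self.entries) > 2:
            oldest, self.entries = self.entries[:2], self.entries[2:]
            self.entries.insert(0, summarize(oldest))


ctx = RollingContext(budget=20)
for i in range(5):
    ctx.add(f"note {i} " + " ".join(["word"] * 8))
print(ctx.entries)
```

In this toy version, the newest note always survives intact, while older notes shrink into an ever coarser summary. Anything clipped during an earlier summary is gone for good, which mirrors in a small way why an agent can lose track of what it has already tried.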
Claude has difficulty keeping track of things over a very long period of time and getting a good sense of what it has tried so far.
False information seems to be a big problem.
It trusts the things it wrote down in the past quite blindly.
This problem is clearly visible if you look at the stream at the moment: although the AI has already earned three badges and has already conquered Mt. Moon, it is standing on the route in front of Mt. Moon and trying to find a way to and through the mountain again. The bicycle mentioned above, however, is also giving the AI trouble.
If the AI could better remember what it has already tried or done, it would have saved itself dozens of hours.
What does that mean for the future of AI?
Despite the entertaining moments when Claude struggles with game mechanics that were designed for children of the 90s, Hershey sees the experiment as an important signpost for AI development.
The difference between 'it can't do it' and 'it can kind of do it' is quite big for me in these AI things. If something can kind of do something, it typically means that we are pretty close to making it really good at it.
For the future, he sees great potential in improved image understanding and an expanded context window, which would enable the model to reason over longer time frames and to handle things coherently over a long period of time.
No AGI in sight yet?
While leading AI companies such as OpenAI and Anthropic themselves speak of an approaching Artificial General Intelligence (AGI), an AI that achieves human-like skills in almost all conceivable areas, the experiment also shows how far away from it we may still be.
Claude still struggles with tasks that are no problem for people, while it is surprisingly capable in other areas. The combination of abilities, from spatial understanding to memory formation, could be crucial for the development of a real AGI.