14 LLMs fought 314 Street Fighter matches. Here's who won
I benchmarked models in an actual chatbot arena

You are very far from the opponent. Move closer to the opponent. Your opponent is on the left.
You can now use a powerful move. The names of the powerful moves are: Megafireball, Super attack 2.
Your last action was Low. The opponent's last action was Left.
Your current score is 17.0. You are winning. Keep attacking the opponent.
To increase your score, move toward the opponent and attack the opponent. To prevent your score from
decreasing, don't get hit by the opponent.
The moves you can use are:
- Move Closer
- Move Away
- Fireball
- Megapunch
- Hurricane
- Megafireball
def call_llm(self) -> str:
    # META_INSTRUCTIONS, call_bedrock_model and bedrock_runtime are defined elsewhere in the project.
    # Build one "- <move>" bullet per available move.
    move_list = "- " + "\n- ".join(META_INSTRUCTIONS)
    system_prompt = f"""You are the best and most aggressive Street Fighter III 3rd strike player in the world.
Your character is {self.character}. Your goal is to beat the other opponent. You respond with a bullet point list of moves.
{self.context_prompt()}
The moves you can use are:
{move_list}
----
Reply with a bullet point list of moves. The format should be: `- <name of the move>` separated by a new line.
Example if the opponent is close:
- Move closer
- Medium Punch
Example if the opponent is far:
- Fireball
- Move closer
"""
    prompt = "Your next moves are:"
    llm_response = call_bedrock_model(self.model, system_prompt, prompt, bedrock_runtime)
    print(f"{self.model} making move {llm_response}")
    # Hand the raw reply back to the caller, as the `-> str` annotation promises.
    return llm_response
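The prompt asks for a bullet point list, but models do not always stick to the format, so the reply has to be turned back into moves the game understands. The post does not show that step. Here is a minimal sketch of one way it could work, not the project's actual parser. `parse_moves` and `KNOWN_MOVES` are hypothetical names, and the move list is simply copied from the sample prompt above.

```python
import re

# Hypothetical move list, copied from the sample prompt above.
# The real project may define more moves or different names.
KNOWN_MOVES = [
    "Move Closer",
    "Move Away",
    "Fireball",
    "Megapunch",
    "Hurricane",
    "Megafireball",
]


def parse_moves(llm_response: str, known_moves: list[str] = KNOWN_MOVES) -> list[str]:
    """Extract valid move names from a bullet-point reply, ignoring everything else."""
    # Map lowercase names to their canonical spelling so "Move closer" still matches.
    canonical = {move.lower(): move for move in known_moves}
    moves = []
    for line in llm_response.splitlines():
        # Accept "- Fireball" or "* Fireball", with optional surrounding whitespace.
        match = re.match(r"^\s*[-*]\s*(.+?)\s*$", line)
        if not match:
            continue  # chatter such as "Sure, here are my moves:"
        name = match.group(1).lower()
        if name in canonical:
            moves.append(canonical[name])  # silently drop moves the game does not know
    return moves
```

For example, `parse_moves("Sure!\n- Move closer\n- Fireball\n- Dance")` keeps the two real moves and drops both the chatter and the invented one. Matching case-insensitively is a deliberate choice here: the prompt's own examples write "Move closer" while the move list writes "Move Closer".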
Model | Elo |
---|---|
🥇 claude_3_haiku | 1613.45 |
🥈 claude_3_sonnet | 1557.25 |
🥉 claude_2 | 1554.98 |
claude_instant | 1548.16 |
cohere_light | 1527.07 |
cohere_command | 1511.45 |
titan_express | 1502.56 |
mistral_7b | 1490.06 |
ai21_ultra | 1477.17 |
mistral_8x7b | 1450.07 |
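For anyone who wants to see how ratings like these move after each match, here is a sketch of the standard Elo update. The post does not state the K-factor or the starting rating used, so `k=32.0` is an assumption, and the function names `expected_score` and `update_elo` are made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; the post does not say which value was used
) -> tuple[float, float]:
    """Return the new ratings after one match.

    score_a is 1.0 if A won, 0.0 if A lost, and 0.5 for a draw.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's,
    # so the total number of rating points stays constant.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b
```

Under this model, `expected_score(1613.45, 1450.07)` comes out at roughly 0.72, so the gap between the top and bottom of the table corresponds to the leader being expected to win a bit under three matches in four.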



Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.