Leaderboard

Model                            Type       Overall  V     H     E
Random                           -          22.0     21.9  21.8  21.9
Llama-3.2-11b-vision-instruct    Small      30.3     32.4  29.3  28.7
LLaVA-Mistral-7B                 Medical    39.8     31.6  43.1  37.1
VILA1.5-13b                      Small      41.8     41.8  47.5  40.9
Llama-3.2-90b-Vision-Instruct    Large      42.4     44.9  42.1  38.7
LLaVA-Med-Mistral-7B             Medical    43.0     37.3  47.1  41.6
Llama-3.1-Nemotron-70b-Instruct  Large      44.2     44.9  43.3  44.8
*GPT-4o                          Large      45.6     48.7  43.1  44.8
Pixtral-12b                      Small      45.6     46.9  44.8  44.8
GPT-4o-mini                      Small      46.2     48.5  43.6  47.0
Gemini-Flash-1.5-8b              Small      46.7     48.7  43.6  49.1
Claude-3.5-Haiku                 Small      47.1     48.0  43.8  51.7
Qwen-2-vl-72b-Instruct           Large      47.5     49.2  45.7  47.8
VILA1.5-40b                      Large      47.5     47.2  47.9  47.4
Grok-2-Vision                    Large      48.4     50.3  46.4  48.7
Qwen-2-VL-7b                     Small      48.8     54.1  43.3  49.6
Pixtral-Large                    Large      49.8     50.8  49.5  48.7
Human                            -          50.3     52.7  47.5  51.4
Gemini-Pro-1.5                   Large      51.1     52.0  50.2  50.9
*Claude-3.5-Sonnet               Large      51.7     54.1  50.2  50.4
o1                               Reasoning  52.8     55.4  50.2  53.0