Leaderboard - Closed

Model               Easy    Med     Hard    Average
GPT-4o              58.3%   53.2%   48.9%   53.5%
Gemini-1.5-Pro      67.8%   53.6%   51.7%   57.7%
Claude-3.5-Sonnet   57.1%   50.5%   52.5%   53.4%
LLaVA-Video-7B      56.6%   52.0%   48.3%   52.3%
Qwen2VL-7B          49.0%   52.6%   49.6%   50.4%
VidDiff (ours)      62.7%   56.2%   50.0%   56.3%