Leaderboard - Open
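| Model             | Easy  | Medium | Hard  | Average |
|-------------------|-------|--------|-------|---------|
| VidDiff (ours)    | 49.9% | 37.9%  | 38.5% | 42.1%   |
| GPT-4o            | 45.7% | 41.5%  | 38.0% | 41.7%   |
| Claude-3.5-Sonnet | 37.8% | 34.6%  | 34.3% | 35.6%   |
| Gemini-1.5-Pro    | 30.3% | 30.5%  | 24.1% | 28.3%   |
| LLaVA-Video-7B    | 7.8%  | 9.0%   | 8.5%  | 8.4%    |
| Qwen2VL-7B        | 11.2% | 8.8%   | 1.6%  | 7.2%    |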