{
  "video": "video-431de632.mp4",
  "description": "This video displays a **comparison table** of performance metrics for three models, likely large language models (LLMs) or other AI systems, tested on various benchmarks.\n\nThe table has the following structure:\n\n**Columns:**\n1.  **Benchmarks:** Lists the categories or specific tests being run.\n2.  **GLM-SV-Turbo:** Shows the performance score for the GLM-SV-Turbo model.\n3.  **Kimi K2.5:** Shows the performance score for the Kimi K2.5 model.\n4.  **Claude Opus 4.0:** Shows the performance score for the Claude Opus 4.0 model.\n\n**Rows/Benchmarks:**\nThe benchmarks are grouped into three categories: **Multimodal Coding**, **Multimodal ToolUse**, and **GUI Agent**.\n\n**Detailed Breakdown of Benchmarks and Scores:**\n\n**1. Multimodal Coding:**\n*   **Design2Code:** GLM-SV-Turbo (94.8) vs. Kimi K2.5 (91.3) vs. Claude Opus 4.0 (77.3)\n*   **Flame-VLM:** GLM-SV-Turbo (93.0) vs. Kimi K2.5 (88.8) vs. Claude Opus 4.0 (79.8)\n*   **Vision2Web:** GLM-SV-Turbo (31.0) vs. Kimi K2.5 (33.2) vs. Claude Opus 4.0 (43.5)\n\n**2. Multimodal ToolUse:**\n*   **ImageMining:** GLM-SV-Turbo (30.7) vs. Kimi K2.5 (24.4) vs. Claude Opus 4.0 (-; the dash indicates no score or not applicable)\n*   **BrowseCom-VLP:** GLM-SV-Turbo (51.9) vs. Kimi K2.5 (42.9) vs. Claude Opus 4.0 (35.9)\n*   **HHMeSearch:** GLM-SV-Turbo (72.9) vs. Kimi K2.5 (58.7) vs. Claude Opus 4.0 (63.8)\n*   **HHMeSearch-Plus:** GLM-SV-Turbo (30.0) vs. Kimi K2.5 (25.6) vs. Claude Opus 4.0 (25.6)\n*   **SimpleIVQA:** GLM-SV-Turbo (78.2) vs. Kimi K2.5 (71.5) vs. Claude Opus 4.0 (63.2)\n*   **Facts:** GLM-SV-Turbo (57.8) vs. Kimi K2.5 (57.8) vs. Claude Opus 4.0 (-)\n*   **V\\***: GLM-SV-Turbo (89.0) vs. Kimi K2.5 (84.3) vs. Claude Opus 4.0 (66.5)\n\n**3. GUI Agent:**\n*   **OSWorld:** GLM-SV-Turbo (62.3) vs. Kimi K2.5 (63.3) vs. Claude Opus 4.0 (72.2)\n*   **AndroidWorld:** GLM-SV-Turbo (75.7) vs. Kimi K2.5 (43.1) vs. Claude Opus 4.0 (62.0)\n*   **WebVoyager:** GLM-SV-Turbo (86.5) vs. Kimi K2.5 (84.3) vs. Claude Opus 4.0 (88.0)\n\n**Visual Context:**\nThe video appears to be a screencast or presentation slide showing this static data table, with the focus on clearly presenting the comparative performance of the three models across complex AI tasks. The video is very short and shows only this single table.",
  "codec": "av1",
  "transcoded": true,
  "elapsed_s": 22.9
}