{
  "video": "video-fa48c3e9.mp4",
  "description": "This video appears to be a presentation or demonstration of **SWE-bench**, described as a benchmark suite of 500 instances for evaluating the capabilities of AI coding agents and language models.\n\nThe video moves through two distinct sections:\n\n**1. SWE-bench Verified (00:00 - 00:11):**\n*   The initial part focuses on the **\"SWE-bench Verified\"** section.\n*   It provides an **Overview** of SWE-bench Verified, explaining that it is a human-verified suite of 500 instances for testing AI coding agents.\n*   A significant part of this section is dedicated to **\"Bash Only: Comparing Language Models,\"** which discusses how the benchmark evaluates complex coding scenarios across different language models. It mentions running agents against shell scripts and the full leaderboard.\n*   Finally, there is a **Citation** section providing options (APA, MLA, BibTeX) for citing the work.\n*   The visual theme here is a clean, professional website interface displaying these informational sections.\n\n**2. SWE-bench Lite (00:12 - 00:35):**\n*   The video then transitions to **\"SWE-bench Lite,\"** which seems to be a streamlined or simpler version of the benchmark.\n*   This section also has an **Overview**, stating that SWE-bench Lite provides a smaller, carefully selected subset of 300 tasks from the full benchmark.\n*   A detailed section, **\"Selection Criteria,\"** outlines how these tasks were chosen. 
It specifies criteria related to:\n    *   The types of issues and related references.\n    *   The presence of \"other requests or issues.\"\n    *   The size of the required code change, noting that tasks with more than 3 edit hunks are excluded.\n*   The visual presentation remains consistent with the website design, detailing the technical scope and methodology of the Lite version.\n\n**In summary, the video provides an informational tour of two related AI coding evaluation benchmarks: SWE-bench Verified and the smaller, curated SWE-bench Lite, explaining their purpose, methodology, and structure.**",
  "codec": "av1",
  "transcoded": true,
  "elapsed_s": 15.0
}