AI for Video Highlights: 17 LLMs Tested – Our Surprising Results for Your Content ROI

We wanted to know which AI best finds the gold nuggets in your webcast content. The answer is more complex – and more valuable for your strategy – than you think.

Anyone who has ever produced an important webcast knows the feeling afterward: the live event was a success and the participants were enthusiastic, but what happens to the valuable content now? Far too often, the recording lands in an archive, and its potential fizzles out. That is the "One-and-Done" trap.

At MEETYOO, we believe that every webcast should be the beginning, not the end. It is a goldmine for reusable content. The big question is simply: How do you find the gold nuggets in a 60-minute video without spending hours manually sifting through it?

The answer lies in Artificial Intelligence. But—and this is crucial—not every AI is equally well-suited for this task. To prove this, we did what comes naturally to us as a software company: We tested it.

The Experiment: 17 AI Models Pitted Against Two of the World's Most Famous Speeches

We wanted to know how well the latest Large Language Models (LLMs) perform at identifying "Key Moments" in a transcript. So, we gave them a clear task: Analyze the transcripts of Steve Jobs' 2005 Stanford commencement address and John F. Kennedy's 1963 "Ich bin ein Berliner" speech, and extract the moments that are perfect for short video clips.

The criteria were strict: The moments had to contain quotes, anecdotes, or key messages, be between 10 seconds and 2 minutes long, and function without further context.
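
The length criterion is the one part of this list that can be checked mechanically. Here is a minimal sketch, assuming timestamps in the HH:MM:SS form; the names `to_seconds` and `has_valid_length` are hypothetical and chosen only for this illustration:

```python
MIN_SECONDS = 10    # lower bound from the criteria above
MAX_SECONDS = 120   # upper bound from the criteria above (2 minutes)


def to_seconds(timestamp: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def has_valid_length(start: str, end: str) -> bool:
    """Return True if the clip lasts between 10 seconds and 2 minutes."""
    duration = to_seconds(end) - to_seconds(start)
    return MIN_SECONDS <= duration <= MAX_SECONDS
```

Whether a moment is quotable or works without further context still needs human judgment; only the duration lends itself to an automatic check like this.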

The 17 LLM Contenders at a Glance

To cover a broad spectrum, we selected models from established providers, up-and-coming specialists, and the open-source community.

| Provider/Creator | Model Name | Intelligence Index |
|---|---|---|
| DeepSeek | DeepSeek R1 🧠 | 52 |
| Google | Gemini 2.5 Flash Lite 🧠 | 40 |
| Google | Gemini 2.5 Flash 🧠 | 51 |
| Google | Gemini 2.5 Pro 🧠 | 60 |
| Google | Gemini 3 Pro 🧠 | 73 |
| OpenAI | GPT-5.1 (medium) 🧠 | 70 |
| OpenAI | GPT-5 (medium) 🧠 | 66 |
| OpenAI | GPT OSS 20B 🧠 | 52 |
| OpenAI | GPT OSS 120B 🧠 | 61 |
| OpenAI | GPT-4.1 | 43 |
| OpenAI | GPT-4o | 29 |
| Alibaba | Qwen3 235B | 57 |
| Anthropic | Claude 4.5 Haiku 🧠 | 70 |
| Anthropic | Claude 4.5 Sonnet 🧠 | 55 |
| Moonshot AI | Kimi K2 Instruct | 48 |
| Amazon | Nova Lite | 21 |
| Amazon | Nova Pro | 25 |

🧠 = Reasoning model. Data from Artificial Analysis.

An Honest Limitation: The Training Data Trap

Before we get to the results, a critical qualification is necessary. These two speeches are pop culture icons and are almost certainly part of the training data for nearly all models. The AIs likely "know" this content already.

Why did we choose them anyway? Because there is no objective truth when identifying "highlights." With an unknown internal webcast, we could hardly evaluate the quality of the AI suggestions fairly. Famous speeches offer a generally accepted consensus on what the key moments are. This allowed us to test how well the models understand human consensus regarding relevance and context—a crucial skill for content repurposing. The true test, of course, remains material that the AI has never seen before.
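
To make "agreement with the consensus" concrete, one conceivable way to score a model is to compare its suggested time ranges with a hand-picked list of reference moments. The sketch below is an assumption for illustration only and does not describe the scoring used in this test; the function names and the 0.5 threshold are hypothetical.

```python
def interval_iou(a, b):
    """Intersection over union of two (start, end) intervals in seconds."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_against_reference(suggested, reference, threshold=0.5):
    """Share of reference moments matched by at least one suggestion.

    A reference moment counts as matched when some suggested interval
    overlaps it with an IoU at or above the (assumed) threshold.
    """
    if not reference:
        return 0.0
    hits = sum(
        any(interval_iou(ref, s) >= threshold for s in suggested)
        for ref in reference
    )
    return hits / len(reference)
```

A score like this rewards models that land on the same passages as human editors, and it penalizes chapter-length suggestions, because a long interval has a low overlap ratio with a short reference clip.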

Our Tool: The Prompt We Share With You

A good result depends not only on the model but also massively on the quality of the prompt. We invested a lot of time in formulating our instructions as precisely as possible. Because we believe in the value of transparency and partnership, we are sharing it here with you:

    
<role>
You are a video content analyst specializing in identifying compelling moments for short-form content creation.
</role>

<task>
Analyze the provided VTT transcript to extract key moments that would make engaging short-form video clips.
</task>

<input_format>
Standard VTT (WebVTT) format with timestamps and text content.
</input_format>

<key_moment_criteria>
A key moment must contain at least one of:
• **Important quote** - memorable, quotable statement
• **Key insight** - valuable learning or revelation  
• **Interesting metric** - surprising statistic or data point
• **Compelling anecdote** - engaging story or example
• **Actionable advice** - practical tip or strategy

Requirements:
• Duration: 10 seconds minimum, 2 minutes maximum
• Self-contained: makes sense without additional context
• High value: provides significant insight or entertainment
• Clip-worthy: suitable for standalone short-form content
</key_moment_criteria>

<output_format>
For each key moment, provide:

**Title:** [Descriptive title that captures the essence]
**Start Time:** [HH:MM:SS]  
**End Time:** [HH:MM:SS]
**Type:** [Quote/Insight/Metric/Anecdote/Advice]
**Why Notable:** [Brief explanation of value/impact]

---

</output_format>

<guidelines>
• Focus on quality over quantity - extract only the most compelling moments
• Avoid overlapping time segments
• Prioritize moments with emotional impact or practical value
• Ensure each moment tells a complete micro-story
• If uncertain about timing, reply "TIMESTAMP_UNCLEAR"
</guidelines>

**VTT Transcript:**
...
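
Because the prompt fixes the output format, a model's answer can be turned into structured records with a few lines of code. The following is a minimal sketch rather than a production parser: the names `KeyMoment`, `parse_moments`, and `find_overlaps` are hypothetical, and it assumes the model follows the requested format exactly.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Matches lines such as "**Title:** Connecting the Dots"
FIELD = re.compile(r"^\*\*(?P<name>[^:*]+):\*\*\s*(?P<value>.*)$")


@dataclass
class KeyMoment:
    title: str
    start: Optional[int]  # seconds; None if missing or TIMESTAMP_UNCLEAR
    end: Optional[int]
    kind: str
    why: str


def to_seconds(value: str) -> Optional[int]:
    """Convert 'HH:MM:SS' to seconds; return None for anything else."""
    match = re.fullmatch(r"(\d{1,2}):(\d{2}):(\d{2})", value.strip())
    if match is None:
        return None
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def parse_moments(response: str) -> List[KeyMoment]:
    """Split the answer on '---' lines and read the labelled fields."""
    moments = []
    for block in re.split(r"^[ \t]*---[ \t]*$", response, flags=re.MULTILINE):
        fields = {}
        for line in block.splitlines():
            match = FIELD.match(line.strip())
            if match:
                fields[match.group("name").strip()] = match.group("value").strip()
        if "Title" not in fields:
            continue  # skip empty blocks and stray text
        moments.append(KeyMoment(
            title=fields["Title"],
            start=to_seconds(fields.get("Start Time", "")),
            end=to_seconds(fields.get("End Time", "")),
            kind=fields.get("Type", ""),
            why=fields.get("Why Notable", ""),
        ))
    return moments


def find_overlaps(moments: List[KeyMoment]) -> List[Tuple[str, str]]:
    """Return title pairs of neighbouring moments (by start time) that overlap."""
    timed = sorted(
        (m for m in moments if m.start is not None and m.end is not None),
        key=lambda m: m.start,
    )
    return [(a.title, b.title) for a, b in zip(timed, timed[1:]) if b.start < a.end]
```

Entries whose timestamps cannot be read, including the `TIMESTAMP_UNCLEAR` fallback, keep `None` as their time and are left out of the overlap check, so they can be reviewed by hand.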

The Three Decisive Insights From Our Test

The result was more than just a ranking. It revealed three fundamental truths about using AI to maximize your content ROI.

Insight 1: There is no "single" winner—and that is the most important finding

The biggest surprise: no single model outclassed all the others. Instead, the AIs showed different "personalities."

Some models, like Qwen3 235B or Amazon Nova Pro, tended to identify long, chapter-like blocks. They essentially broke the speeches down into their main sections. This is useful for a rough outline, but unsuitable for creating social media clips.

Others, like Google's Gemini family, consistently delivered shorter, more concise moments. They acted like "moment spotters" rather than "chapter finders."
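
The difference between the two "personalities" can be expressed as a simple number: the median length of a model's suggestions. The sketch below is illustrative only. The function name `describe_model` is hypothetical, the 120-second cut-off is an assumption borrowed from the prompt's upper limit, and the durations in the usage example are invented rather than taken from our results.

```python
from statistics import median

# Assumption: anything longer than the prompt's 2-minute limit counts as a chapter.
CHAPTER_THRESHOLD_SECONDS = 120


def describe_model(durations):
    """Label a model by the median length (in seconds) of its suggestions."""
    if not durations:
        return "no suggestions"
    if median(durations) > CHAPTER_THRESHOLD_SECONDS:
        return "chapter finder"
    return "moment spotter"


# Invented example durations in seconds, not measurements from the test.
example_durations = {
    "model_a": [240, 310, 185],
    "model_b": [25, 48, 70, 33],
}
for name, durations in example_durations.items():
    print(name, describe_model(durations))
```

The median is used instead of the mean so that a single very long suggestion does not change a model's label.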

What this means for you: An AI platform that relies on a single general-purpose model is unlikely to deliver the full spectrum of valuable content. The key lies in understanding the strengths of different models and using each one where it fits.

Insight 2: Expensive isn't always better—it depends on the task

One might assume that the largest and most expensive models automatically deliver the best results. Our test did not bear that out.

Some of the supposed top models delivered disappointing results because they were too general. The real jewels were often found by more specialized or smaller models. The best example: the open-source model GPT OSS 20B was the only one to identify the subtle but extremely powerful moment "The Power of Dropping Out" in Steve Jobs' speech. A true "hidden gem" that the giants overlooked.

What this means for you: It's not about throwing the most computing power at a problem. It's about using the right kind of intelligence for the specific task of content analysis. An intelligent platform relies on proven results, not brand names.

Insight 3: Short clips beat long chapters

For content managers, usability is crucial. A 5-minute "highlight" isn't a highlight; it's a chapter. A 45-second, pointed clip, on the other hand, is pure gold. You can share it on LinkedIn, embed it in a newsletter, or use it as a teaser for the on-demand version of your webcast.

Our test showed that an AI's ability to recognize truly short, self-contained moments is the deciding factor for practical utility. A long list of suggestions is worthless if you have to painstakingly edit down every single suggestion yourself.

What this means for you: Your goal is to minimize the effort of repurposing while maximizing output. Therefore, an AI must not only tell you what is interesting but deliver it in a format you can use immediately.

Experience the Results Yourself: Our Interactive Demo

We can talk a lot about the differences—but it's best if you see for yourself. We built a simple, interactive demo where you can directly compare the results of the different models for both speeches. Click through and discover for yourself which AI finds the "Hidden Gems" and which merely scratches the surface.

Video highlights identified by LLMs

Click here for the interactive demo

Our Conclusion: Expertise is the Deciding Factor

This test confirms our core philosophy at MEETYOO: It is not enough to simply slap AI onto a product. Real added value is only created when deep expertise flows into software development.

We conducted this research because we understand what you need as a communications or marketing professional: not theoretical possibilities, but practical, efficient solutions that measurably increase your content ROI. The insights from this and many other tests flow directly into our platform, MEETYOO Show. This ensures that our AI is not just intelligent, but above all: useful.

Because we don't just offer you any software. We offer you Software for Decisive Moments. Backed by Experts.