Case study · Complete

Red Bull YouTube Sentiment Analytics

500 comments, VADER-scored, told a story

Role

Solo — data collection, analysis, report

Period

Apr 2026 · end-term SMA project

The Pitch

Net Sentiment Score of +28.6 pp — roughly 2× the consumer-brand benchmark — with 100% hashtag discipline across a 50-video catalog.

The Problem

Every marketing-ops deck claims "audience sentiment trending positive." Few of them show the math. I wanted to build a sentiment analysis I'd actually trust for a recommendation — starting from the raw comments, with a reproducible pipeline end-to-end.

Red Bull was the right target: a 27.9-million-subscriber channel, consistent content type, and a brand identity that's either working or it isn't. If the methodology holds on a hard case (stunts, POVs, Formula 1) it holds anywhere.

The Approach

Pipeline, six scripts

`scrape_youtube.py` pulls 500 comments across the 5 most-commented recent videos via YouTube Data API v3. `fetch_descriptions.py` enriches video metadata with hashtags from 50 video descriptions via yt-dlp — no API quota burned.
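As a rough illustration of the comment pull, the sketch below pages through `commentThreads.list` until a per-video limit is reached. It is not the code in `scrape_youtube.py`: the function name, the `limit` parameter, and the row keys (`comment_id`, `video_id`, `text`) are assumptions made for this example, and `youtube` is assumed to be a client built with `googleapiclient.discovery.build("youtube", "v3", developerKey=...)`.

```python
from typing import Any, Dict, List


def fetch_top_level_comments(youtube: Any, video_id: str, limit: int = 100) -> List[Dict[str, str]]:
    """Collect up to `limit` top-level comments for one video (hypothetical helper)."""
    rows: List[Dict[str, str]] = []
    page_token = None
    while len(rows) < limit:
        params = {
            "part": "snippet",
            "videoId": video_id,
            # commentThreads.list returns at most 100 threads per page.
            "maxResults": min(100, limit - len(rows)),
            "textFormat": "plainText",
        }
        if page_token:
            params["pageToken"] = page_token
        response = youtube.commentThreads().list(**params).execute()

        for item in response.get("items", []):
            top = item["snippet"]["topLevelComment"]
            rows.append({
                "comment_id": top["id"],
                "video_id": video_id,
                "text": top["snippet"]["textDisplay"],
            })

        # No nextPageToken means the video has no further comment pages.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return rows[:limit]
```

With `limit=100` this needs a single request per video, which is one reason a 500-comment sample stays cheap in quota terms.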

`analysis.py` cleans, scores with VADER (compound ≥ 0.05 → positive, ≤ −0.05 → negative, else neutral), extracts keywords, and categorises complaints. Then `build_excel_dashboard.py`, `build_report.py`, and `build_executive_summary.py` emit the three deliverables: an 8-chart Excel dashboard, a Word report, and a one-page executive PDF.
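The thresholding step is small enough to show in full. The sketch below maps a VADER compound score to a label using the cut-offs quoted above. The function names are assumptions for this example, and `analyzer` is assumed to be a `SentimentIntensityAnalyzer` from `vaderSentiment.vaderSentiment`, whose `polarity_scores(text)` returns a dict containing a `"compound"` value.

```python
from typing import Any, Tuple


def label_from_compound(compound: float) -> str:
    """Apply the published VADER cut-offs: >= 0.05 positive, <= -0.05 negative."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def score_comment(text: str, analyzer: Any) -> Tuple[float, str]:
    """Return (compound, label) for one comment.

    `analyzer` is assumed to expose polarity_scores(text) -> {"compound": float, ...},
    as vaderSentiment's SentimentIntensityAnalyzer does.
    """
    compound = analyzer.polarity_scores(text)["compound"]
    return compound, label_from_compound(compound)
```

Keeping the raw compound score next to the label is what makes each classification traceable back to a number.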

Sampling choices worth defending

I capped collection at 100 comments per video across 5 videos, for 500 in total. Uncapped, a single mega-viral video would dominate the sample, and the story I needed was brand-level, not video-level.
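To make the effect of the cap concrete, here is a minimal sketch that applies a per-video limit to rows that have already been collected. The function name and the `video_id` row key are assumptions for this example; the pipeline may well enforce the cap at fetch time instead.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def cap_per_video(rows: Iterable[Dict[str, str]], cap: int = 100) -> List[Dict[str, str]]:
    """Keep at most `cap` rows per video_id, preserving the input order."""
    kept: List[Dict[str, str]] = []
    seen: Dict[str, int] = defaultdict(int)
    for row in rows:
        video_id = row["video_id"]
        if seen[video_id] < cap:
            seen[video_id] += 1
            kept.append(row)
    return kept


# Toy data: one viral video with 900 comments, one ordinary video with 100.
rows = [{"video_id": "viral"} for _ in range(900)] + [{"video_id": "ordinary"} for _ in range(100)]
capped = cap_per_video(rows, cap=100)
viral_share_before = 900 / len(rows)                                             # 0.9
viral_share_after = sum(r["video_id"] == "viral" for r in capped) / len(capped)  # 0.5
```

In the toy data the viral video falls from 90% of the sample to 50%, which is the kind of dominance the cap is meant to prevent.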

Hashtag analysis pulled across a wider 50-video catalog — hashtags are brand policy, not audience reaction, so a bigger sample reveals the discipline pattern. That's how the "100% hashtag adoption on every video" finding surfaced.
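An adoption rate of this kind can be computed with a regular expression and a counter. The sketch below assumes the descriptions have already been fetched as plain strings; the function name and the case-insensitive matching are choices made for this example, not details confirmed by the pipeline.

```python
import re
from collections import Counter
from typing import Dict, List

HASHTAG = re.compile(r"#(\w+)")


def hashtag_adoption(descriptions: List[str]) -> Dict[str, float]:
    """Return {hashtag: share of descriptions that contain it at least once}."""
    if not descriptions:
        return {}
    counts: Counter = Counter()
    for text in descriptions:
        # A set, so a tag repeated within one description counts once for that video.
        tags = {"#" + tag.lower() for tag in HASHTAG.findall(text)}
        counts.update(tags)
    total = len(descriptions)
    return {tag: n / total for tag, n in counts.items()}


# Toy descriptions, not real channel data.
sample = [
    "Stunt day #RedBull #GivesYouWiiings",
    "Onboard POV #redbull #givesyouwiiings #F1",
]
adoption = hashtag_adoption(sample)
# adoption["#redbull"] == 1.0, adoption["#f1"] == 0.5
```

A tag with a share of 1.0 appears in every description, which is what "100% adoption" means here.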

The findings that matter

Organic keyword frequency: "gives" (35) and "wings" (31) are the top two, ahead of any adrenaline or F1 term. The slogan has genuine unprompted recall.
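A keyword count like this usually comes down to tokenising, dropping stopwords, and counting. The sketch below shows one way to do it. The stopword list is illustrative only (whether brand terms such as "red" and "bull" are filtered is an assumption), and the tokenisation rule is a simplification rather than the one in `analysis.py`.

```python
import re
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Illustrative stopword list; a real run would use a fuller one.
STOPWORDS: Set[str] = {
    "the", "and", "you", "your", "that", "this", "for", "with", "are",
    "was", "not", "but", "red", "bull",
}


def top_keywords(comments: Iterable[str], n: int = 10,
                 stopwords: Set[str] = STOPWORDS) -> List[Tuple[str, int]]:
    """Count lowercase word tokens of 3+ letters, skipping stopwords."""
    counts: Counter = Counter()
    for text in comments:
        for token in re.findall(r"[a-z']+", text.lower()):
            if len(token) > 2 and token not in stopwords:
                counts[token] += 1
    return counts.most_common(n)


# Toy comments, not drawn from the dataset.
demo = ["Red Bull gives you wings", "It really gives you wings!", "wings wings"]
print(top_keywords(demo, n=2))  # [('wings', 4), ('gives', 2)]
```

Because "you" is a stopword, the slogan surfaces as "gives" and "wings" rather than as a three-word phrase.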

Hashtag discipline: #RedBull and #GivesYouWiiings appear on 100% of the 50-video catalog. That's unusually strict brand-policy enforcement for an entertainment channel.

The main genuine complaint is viewer anxiety about stunt safety (18 comments), not product taste or price (2). That's a brand-equity signal, not a marketing failure to fix.

Why the numbers are defensible

I chose VADER's official thresholds rather than tuning to taste, deduplicated on comment_id before analysis, and flagged complaint categories by keyword rules that are visible in `analysis.py` — anyone can re-run the pipeline and audit the exact classification rule for any comment.
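The sketch below shows what deduplication on `comment_id` and rule-based complaint tagging can look like. The category names and keyword lists are hypothetical placeholders, not the rules in `analysis.py`. The point is the shape: each match returns the keyword that triggered it, so a classification can be traced to a specific rule.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical rules for illustration; the real lists live in analysis.py.
COMPLAINT_RULES: Dict[str, Tuple[str, ...]] = {
    "stunt_safety": ("dangerous", "unsafe", "safety", "get hurt"),
    "product_taste_price": ("taste", "expensive", "overpriced"),
}


def dedupe_by_comment_id(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first row seen for each comment_id."""
    seen = set()
    unique: List[Dict[str, str]] = []
    for row in rows:
        if row["comment_id"] not in seen:
            seen.add(row["comment_id"])
            unique.append(row)
    return unique


def complaint_category(text: str) -> Optional[Tuple[str, str]]:
    """Return (category, matched_keyword) for the first rule that fires, else None."""
    lowered = text.lower()
    for category, keywords in COMPLAINT_RULES.items():
        for keyword in keywords:
            if keyword in lowered:  # plain substring match; can over-match inside longer words
                return category, keyword
    return None
```

For example, `complaint_category("This looks so dangerous")` returns `("stunt_safety", "dangerous")` under these placeholder rules.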

YouTube Data API v3 quota usage for the full run: under 250 units of the free 10,000/day budget. Reproducible on a free account.

Key Decisions

VADER, not a fine-tuned transformer

A fine-tuned BERT-family model would beat VADER on accuracy by maybe 5–10 percentage points. But VADER is deterministic, auditable, runs locally in seconds, and uses published cut-offs. For a brand-sentiment report going to stakeholders, auditability beats incremental accuracy — "this comment scored negative because its compound score is −0.34" is a defensible claim.

yt-dlp for descriptions, API for comments

The YouTube Data API is rate-limited and expensive when you need 50 video descriptions. yt-dlp parses the public page, no API key, no quota. Using both tools for what each is best at cut the full pipeline quota usage by roughly 80%.

Ship three deliverable formats, not one

Excel dashboard for the analyst, Word report for the write-up, one-page PDF for the executive. A single format leaves one audience under-served. The executive PDF is the link I'd share first with a marketing director.

Metrics

Comments analysed

500

Hashtag catalog

50 videos

Net Sentiment Score

+28.6 pp

≈2× benchmark

Positive / neutral / negative

47% / 34.6% / 18.4%

Hashtag discipline

100%

#RedBull + #GivesYouWiiings

API quota used

< 250 units

Live Chart

Rendered from the actual summary.json of the analysis run.

Overall sentiment · 500 comments

Net: +28.6 pp

Positive: 47%
Neutral: 34.6%
Negative: 18.4%

Industry benchmark for consumer-brand YouTube comments is +10 to +15 pp net sentiment. Red Bull runs roughly 2× that.
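The net score is simply the positive share minus the negative share, in percentage points. The sketch below reproduces it from label counts. The counts 235 and 92 are not quoted in the write-up; they are derived here from the reported 47% and 18.4% of 500 comments, and the function name is an assumption for this example.

```python
def net_sentiment(positive: int, negative: int, total: int) -> float:
    """Net Sentiment Score in percentage points: % positive minus % negative."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * (positive - negative) / total, 1)


# Counts derived from the reported shares of 500 comments (47% and 18.4%).
nss = net_sentiment(positive=235, negative=92, total=500)  # 28.6

# Ratio against each end of the +10 to +15 pp benchmark range.
ratio_vs_upper = round(nss / 15, 1)  # 1.9
ratio_vs_lower = round(nss / 10, 1)  # 2.9
```

Against the +10 to +15 pp range, +28.6 pp works out to between about 1.9× and 2.9× the benchmark, depending on which end is used.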

Sentiment by video

They Couldn't Look Away

+44 pp

Helicopter Drop Off For This

+38 pp

This Doesn't End

+34 pp

World's CRAZIEST POVs

+7 pp

How Quick Are Your Reflexes?

+20 pp

Top organic keywords

The slogan — "gives you wings" — has genuine organic recall.

gives: 35
wings: 31
drinks: 21
bro: 16
drink: 16
man: 14
life: 14
camera: 13
keep: 13
video: 12

Numbers rendered live from summary.json of the analysis run. Full dataset and code on GitHub.

Stack

Collection

YouTube Data API v3, yt-dlp, google-api-python-client

Analysis

Python, pandas, NumPy, VADER (vaderSentiment)

Deliverables

Matplotlib + Seaborn (9 charts), WordCloud, openpyxl (Excel dashboard), python-docx (Word report), reportlab (Executive PDF)