# How to rank the best recent videos on any topic — a weighted, recency-aware method

> A reusable method for pulling the best *recent* videos on a topic — not just the most-viewed all-time. It scores every candidate on five weighted levers (recency decay, view velocity, channel over-performance, damped popularity, engagement), gates on age/views/duration first, caps per channel for diversity, and gets the publish dates it needs from a bulk metadata provider (since plain search can't rank by date). Every parameter is tunable; the defaults favor the last few months.
>
> https://pravda.systems/notes/how-to-rank-the-best-recent-videos · 2026-06-17

When you mine a topic for the *best recent* videos — the ones actually worth watching this quarter — "sort by views" betrays you: it returns the same all-time giants every time, most of them years old. Getting the freshest high-signal videos is a small ranking problem with a specific shape. This is the weighted, recency-aware method I use, with every knob exposed.

## Why "sort by views" fails for recent research

**Raw view count is an all-time accumulator, so it rewards age, not relevance-now: a 4-year-old video has had four years to collect views a three-week-old breakout never could.** Sort a topic by views and you get a museum, not a feed. For "what's good *now*," you need a score that explicitly decays with age and rewards how fast a video is gaining — otherwise recency is, at best, a filter bolted onto a popularity sort, and the freshest gems never reach the top.

## You can't rank by recency without the dates — get them in bulk

**The first practical wall: a plain search API returns titles and view counts but not publish dates, so there is literally nothing to rank recency on.** (YouTube's flat search via `yt-dlp` returns `date=NA`.) You have two ways to get dates: a per-video metadata call for every candidate — slow, and it hammers the same IP that gets rate-limited — or a **bulk metadata provider**. An Apify YouTube scraper is the clean primary: it runs from its own infrastructure (so it sidesteps the per-IP 429 wall that blocks direct subtitle/metadata pulls) and returns views, publish date, likes, comments, and duration for a whole query at once — exactly the fields the score needs. The per-video pass is the no-token fallback.

## The score: five weighted levers

**Score each candidate as a weighted blend of five normalized signals — recency, velocity, over-performance, popularity, engagement — defaulting to `0.32·recency + 0.25·velocity + 0.18·over-performance + 0.15·popularity + 0.10·engagement`.** Each lever is 0–1 so the weights are honest proportions:

| Lever | Formula | Why |
|---|---|---|
| **Recency** | `0.5 ^ (age_days / 60)` | Exponential decay, 60-day half-life — a 2-month-old video scores half a brand-new one. This is what makes it *recent*-aware, not recency-filtered. |
| **Velocity** | `log10(views / age_days)` (normalized) | Views *per day* — the breakout signal. A video with 100k views in 30 days outranks 500k in 5 years. |
| **Over-performance** | `(views / channel_subscribers)`, capped at 3× | Views relative to the channel's *own* audience. A video that outruns its subscriber base got pushed by the algorithm past the home crowd — genuine signal, independent of reach. It demotes a big channel coasting on subscriber inertia and promotes a small channel's breakout. Unknown subscriber count falls back to popularity. |
| **Popularity** | `log10(views)` (normalized) | Damped total views, so mega-channels don't crush everything; it's a quality signal, not the driver. |
| **Engagement** | `(likes + comments) / views` | Audience resonance — separates "watched" from "mattered." |

Velocity, recency, and over-performance together (0.75 of the weight) are what surface the *fresh, genuinely-good* winners; popularity and engagement keep reach and resonance in the mix.

## Gate before you score

**Apply hard filters first, so the score only ranks things already worth ranking: drop anything older than the recency window, below a view floor, shorter than ~3 minutes, or not in your language.** Defaults: published within the last **~9 months** (centered on now — as of mid-2026 that means roughly Sept 2025 onward), **≥ 1,000 views** (quality floor), **≥ 3 minutes** *and* not flagged as a Short by the provider (drops Shorts and teasers), English. Gating first is cheaper and stops a viral 12-second Short or a 2023 classic from ever entering the ranking.

## Tuning: freshest vs evergreen

**Two knobs reshape the whole result: the recency half-life and the max-age gate — shrink them for "what dropped this month," widen them for "the best of the past year."** Presets I use:

- **Freshest** (last-month pulse): half-life **30d**, max-age **120d**. Recency dominates; you get this quarter's conversation.
- **Balanced** (default): half-life **60d**, max-age **270d**. The mix above.
- **Evergreen-leaning**: half-life **120d**, max-age **540d**, and bump popularity's weight. Surfaces durable references that still get watched.

The weights themselves are tunable too — push velocity up when you're hunting breakouts, engagement up when you care about community resonance over reach.

## Diversity: cap per channel

**Without a guard, one prolific channel floods the top — so cap how many videos any single channel can contribute (default 3) before taking the top-K.** A topic's "best recent" should be a survey of the field, not one creator's upload schedule. The cap is applied after scoring, walking the ranked list and skipping a channel once it's hit its quota.

## How to run it

**The method is a small, provider-agnostic ranker with three stages:** a metadata provider returns candidates (views, date, engagement); a scorer applies the gates then the weighted blend; a final pass does the per-channel diversity cap and returns the top-K. Every parameter above is a tunable knob (half-life, max-age, view floor, the four weights), so the same documented mechanism serves every topic sweep — point it at a query, get back a ranked, recency-aware shortlist, then fetch transcripts only for the winners. The whole point: separate *finding the few videos worth your time* from the expensive step of actually pulling and reading them.
