DANYLO PRAVDA

FIELD NOTE — RESEARCH METHOD — 2026-06-17

How to rank the best recent videos on any topic — a weighted, recency-aware method

A reusable method for pulling the best *recent* videos on a topic — not just the most-viewed all-time. It scores every candidate on five weighted levers (recency decay, view velocity, channel over-performance, damped popularity, engagement), gates on age/views/duration first, caps per channel for diversity, and gets the publish dates it needs from a bulk metadata provider (since plain search can't rank by date). Every parameter is tunable; the defaults favor the last few months.



When you mine a topic for the best recent videos — the ones actually worth watching this quarter — "sort by views" betrays you: it returns the same all-time giants every time, most of them years old. Getting the freshest high-signal videos is a small ranking problem with a specific shape. This is the weighted, recency-aware method I use, with every knob exposed.


CH.01

Why "sort by views" fails for recent research

Raw view count is an all-time accumulator, so it rewards age, not relevance-now: a 4-year-old video has had four years to collect views a three-week-old breakout never could. Sort a topic by views and you get a museum, not a feed. For "what's good now," you need a score that explicitly decays with age and rewards how fast a video is gaining — otherwise recency is, at best, a filter bolted onto a popularity sort, and the freshest gems never reach the top.

CH.02

You can't rank by recency without the dates — get them in bulk

The first practical wall: a plain search API returns titles and view counts but not publish dates, so there is literally nothing to rank recency on. (YouTube's flat search via yt-dlp returns date=NA.) You have two ways to get dates: a per-video metadata call for every candidate — slow, and it hammers the same IP that gets rate-limited — or a bulk metadata provider. An Apify YouTube scraper is the clean primary: it runs from its own infrastructure (so it sidesteps the per-IP 429 wall that blocks direct subtitle/metadata pulls) and returns views, publish date, likes, comments, and duration for a whole query at once — exactly the fields the score needs. The per-video pass is the no-token fallback.
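Whichever provider supplies the metadata, it helps to normalize its records into one shape before scoring. The sketch below is a minimal illustration, not any provider's actual schema: the raw keys (`id`, `channelName`, `viewCount`, `date`, `likes`, `commentsCount`, `durationSeconds`, `subscribers`) and the `Candidate` type are hypothetical names chosen for this example, so map them to whatever your scraper really returns.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Candidate:
    video_id: str
    channel: str
    views: int
    published: datetime
    likes: int
    comments: int
    duration_s: int
    subscribers: Optional[int] = None  # not every provider reports this


def parse_candidate(raw: dict) -> Optional[Candidate]:
    """Map one raw record (hypothetical keys) to a Candidate.

    Returns None when the record has no usable publish date, because
    without a date there is nothing to rank recency on.
    """
    date_text = raw.get("date")
    if not date_text or date_text == "NA":
        return None
    # Accept ISO 8601 strings; a trailing "Z" is rewritten for older Pythons.
    published = datetime.fromisoformat(date_text.replace("Z", "+00:00"))
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)  # assume UTC
    return Candidate(
        video_id=raw["id"],
        channel=raw.get("channelName", ""),
        views=int(raw.get("viewCount") or 0),
        published=published,
        likes=int(raw.get("likes") or 0),
        comments=int(raw.get("commentsCount") or 0),
        duration_s=int(raw.get("durationSeconds") or 0),
        subscribers=raw.get("subscribers"),
    )
```

Records that come back with `date=NA` are dropped here rather than guessed at; those are the ones the per-video fallback would have to fill in.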

CH.03

The score: five weighted levers

Score each candidate as a weighted blend of five normalized signals — recency, velocity, over-performance, popularity, engagement — defaulting to 0.32·recency + 0.25·velocity + 0.18·over-performance + 0.15·popularity + 0.10·engagement. Each lever is 0–1 so the weights are honest proportions:

| Lever | Formula | Why |
| --- | --- | --- |
| Recency | 0.5 ^ (age_days / 60) | Exponential decay with a 60-day half-life: a 2-month-old video scores half a brand-new one. This is what makes it recent-aware, not recency-filtered. |
| Velocity | log10(views / age_days) (normalized) | Views per day, the breakout signal. A video with 100k views in 30 days outranks 500k in 5 years. |
| Over-performance | (views / channel_subscribers), capped at 3× | Views relative to the channel's own audience. A video that outruns its subscriber base got pushed by the algorithm past the home crowd: genuine signal, independent of reach. It demotes a big channel coasting on subscriber inertia and promotes a small channel's breakout. Unknown subscriber count falls back to popularity. |
| Popularity | log10(views) (normalized) | Damped total views, so mega-channels don't crush everything; it's a quality signal, not the driver. |
| Engagement | (likes + comments) / views | Audience resonance; separates "watched" from "mattered." |

Velocity, recency, and over-performance together (0.75 of the weight) are what surface the fresh, genuinely-good winners; popularity and engagement keep reach and resonance in the mix.
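The blend can be sketched in a few lines. The weights and the recency, over-performance, and engagement formulas come from the table; everything about *how* the log-scaled levers are normalized is my assumption here, since the note only says "normalized". This sketch divides by a fixed ceiling (`VELOCITY_LOG_CEILING`, `POPULARITY_LOG_CEILING`, both hypothetical values) and scales the 3× over-performance cap down to 0–1. Min-max scaling over the candidate pool would be an equally reasonable choice.

```python
import math
from typing import Optional

WEIGHTS = {
    "recency": 0.32,
    "velocity": 0.25,
    "overperformance": 0.18,
    "popularity": 0.15,
    "engagement": 0.10,
}

# Assumed ceilings (not from the note): the log10 value that maps to 1.0.
VELOCITY_LOG_CEILING = 5.0    # 100,000 views per day
POPULARITY_LOG_CEILING = 7.0  # 10,000,000 total views


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def recency(age_days: float, half_life_days: float = 60.0) -> float:
    """Exponential decay: the score halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def velocity(views: int, age_days: float) -> float:
    per_day = views / max(age_days, 1.0)  # guard against day-zero uploads
    return clamp01(math.log10(max(per_day, 1.0)) / VELOCITY_LOG_CEILING)


def popularity(views: int) -> float:
    return clamp01(math.log10(max(views, 1)) / POPULARITY_LOG_CEILING)


def overperformance(views: int, subscribers: Optional[int]) -> float:
    if not subscribers:
        return popularity(views)  # unknown subscriber count: fall back
    return min(views / subscribers, 3.0) / 3.0  # cap at 3x, scale to 0..1


def engagement(views: int, likes: int, comments: int) -> float:
    return clamp01((likes + comments) / views) if views > 0 else 0.0


def score(views, likes, comments, age_days, subscribers=None,
          half_life_days=60.0, weights=WEIGHTS):
    """Weighted blend of the five levers; each lever is in 0..1."""
    levers = {
        "recency": recency(age_days, half_life_days),
        "velocity": velocity(views, age_days),
        "overperformance": overperformance(views, subscribers),
        "popularity": popularity(views),
        "engagement": engagement(views, likes, comments),
    }
    return sum(weights[name] * levers[name] for name in weights)
```

With these assumptions, a 30-day-old video at 100k views scores well above a 5-year-old one at 500k, which is the behavior the table describes. Note that the raw engagement ratio is usually only a few percent, so this lever contributes less than its nominal 0.10 unless you rescale it.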

CH.04

Gate before you score

Apply hard filters first, so the score only ranks things already worth ranking: drop anything older than the recency window, below a view floor, shorter than ~3 minutes, or not in your language. Defaults: published within the last ~9 months, counted back from today (as of mid-2026 that means roughly Sept 2025 onward); ≥ 1,000 views (quality floor); ≥ 3 minutes and not flagged as a Short by the provider (drops Shorts and teasers); English. Gating first is cheaper, and it stops a viral 12-second Short or a 2023 classic from ever entering the ranking.
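As a rough sketch, the gates reduce to one predicate with the defaults above. The function name, the parameter names, and the `is_short` flag are illustrative assumptions; the thresholds (270 days, 1,000 views, 180 seconds, English) are the defaults from this section.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def passes_gates(
    published: datetime,
    views: int,
    duration_s: int,
    language: str,
    is_short: bool = False,
    *,
    now: Optional[datetime] = None,
    max_age_days: int = 270,      # ~9 months
    min_views: int = 1_000,       # quality floor
    min_duration_s: int = 180,    # 3 minutes
    allowed_language: str = "en",
) -> bool:
    """Return True only if the video clears every hard filter."""
    now = now or datetime.now(timezone.utc)
    if now - published > timedelta(days=max_age_days):
        return False  # outside the recency window
    if views < min_views:
        return False  # below the view floor
    if is_short or duration_s < min_duration_s:
        return False  # Shorts and teasers
    return language == allowed_language
```

Passing `now` explicitly keeps a sweep reproducible: the same candidate list gated against the same reference date always yields the same survivors.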

CH.05

Tuning: freshest vs evergreen

Two knobs reshape the whole result: the recency half-life and the max-age gate — shrink them for "what dropped this month," widen them for "the best of the past year." Presets I use:

  • Freshest (last-month pulse): half-life 30d, max-age 120d. Recency dominates; you get this quarter's conversation.
  • Balanced (default): half-life 60d, max-age 270d. The mix above.
  • Evergreen-leaning: half-life 120d, max-age 540d, and bump popularity's weight. Surfaces durable references that still get watched.

The weights themselves are tunable too — push velocity up when you're hunting breakouts, engagement up when you care about community resonance over reach.
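To see what the half-life knob does, here is a small sketch that evaluates the recency lever for the same 60-day-old video under each preset. The `Preset` type and the dictionary keys are names I made up for illustration; the numbers are the presets listed above. The popularity bump for the evergreen preset is left out because this note doesn't pin it to a specific value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    half_life_days: float
    max_age_days: int


PRESETS = {
    "freshest": Preset(half_life_days=30, max_age_days=120),
    "balanced": Preset(half_life_days=60, max_age_days=270),
    "evergreen": Preset(half_life_days=120, max_age_days=540),
}


def recency(age_days: float, half_life_days: float) -> float:
    """Exponential decay: the score halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


for name, preset in PRESETS.items():
    print(name, round(recency(60, preset.half_life_days), 3))
# freshest 0.25
# balanced 0.5
# evergreen 0.707
```

The same two-month-old video keeps a quarter of its recency credit under "freshest" but about 71% under "evergreen", which is why the half-life alone shifts the whole ranking.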

CH.06

Diversity: cap per channel

Without a guard, one prolific channel floods the top — so cap how many videos any single channel can contribute (default 3) before taking the top-K. A topic's "best recent" should be a survey of the field, not one creator's upload schedule. The cap is applied after scoring, walking the ranked list and skipping a channel once it's hit its quota.
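A minimal sketch of that walk, assuming the input is already sorted best-first and each item is a `(video_id, channel, score)` tuple (a shape chosen for this example only):

```python
from collections import Counter
from typing import Iterable, List, Tuple

Ranked = Tuple[str, str, float]  # (video_id, channel, score)


def cap_per_channel(
    ranked: Iterable[Ranked], per_channel: int = 3, top_k: int = 10
) -> List[Ranked]:
    """Walk a best-first list, skipping channels that hit their quota."""
    taken: Counter = Counter()
    shortlist: List[Ranked] = []
    for item in ranked:
        if len(shortlist) >= top_k:
            break
        channel = item[1]
        if taken[channel] >= per_channel:
            continue  # this channel already has its share
        taken[channel] += 1
        shortlist.append(item)
    return shortlist
```

Because the list is walked in score order, each channel keeps its highest-scoring videos and the slots it gives up go to the next-best videos from other channels.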

CH.07

How to run it

The method is a small, provider-agnostic ranker with three stages: a metadata provider returns candidates (views, date, engagement); a scorer applies the gates, then the weighted blend; a final pass does the per-channel diversity cap and returns the top-K. Every parameter above is a tunable knob (half-life, max-age, view floor, the five weights), so the same documented mechanism serves every topic sweep: point it at a query, get back a ranked, recency-aware shortlist, then fetch transcripts only for the winners. The whole point: separate finding the few videos worth your time from the expensive step of actually pulling and reading them.
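One way to wire the stages together, sketched under the assumption that the gate, the score, and the channel lookup are passed in as plain callables (the `rank` function and its parameter names are illustrative, not a published interface):

```python
from collections import Counter
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def rank(
    candidates: Iterable[T],
    passes_gates: Callable[[T], bool],
    score: Callable[[T], float],
    channel_of: Callable[[T], str],
    per_channel: int = 3,
    top_k: int = 10,
) -> List[T]:
    """Gate, score, then apply the per-channel cap and return the top-K."""
    gated = [c for c in candidates if passes_gates(c)]
    ranked = sorted(gated, key=score, reverse=True)  # best first
    taken: Counter = Counter()
    shortlist: List[T] = []
    for candidate in ranked:
        if len(shortlist) >= top_k:
            break
        channel = channel_of(candidate)
        if taken[channel] >= per_channel:
            continue  # diversity cap: this channel is full
        taken[channel] += 1
        shortlist.append(candidate)
    return shortlist
```

Keeping the stages as injected functions is what makes the ranker provider-agnostic: swapping the metadata source or retuning the weights changes the arguments, not the pipeline. Transcripts are then fetched only for whatever `rank` returns.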
