A streaming service for Korean learners following the comprehensible input method — natural speech, native captions, no whiteboards. Videos are ranked by a personalized comprehensibility score derived from the lemmas a user has already encountered, so the next thing you watch is, on purpose, just a little harder than the last.
A Python ingestion worker pulls creator-captioned videos with yt-dlp, classifies them as immersion or lesson content with Gemini, and tokenizes the transcripts with kiwipiepy down to lemmas and POS tags.
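The last step of that pipeline, reducing tokenizer output to lemmas and POS tags, might look roughly like the sketch below. It assumes kiwipiepy's `Kiwi().tokenize()` returns tokens with `form` and `tag` attributes. The `Morph` type, the `CONTENT_TAGS` filter, and the rule of appending "다" to verb and adjective stems are illustrative choices, not details taken from the project.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Morph(NamedTuple):
    form: str  # morpheme as emitted by the tokenizer
    tag: str   # POS tag, e.g. "NNG" (common noun) or "VV" (verb)


# Assumption: only content words count towards a learner's vocabulary.
CONTENT_TAGS = ("NNG", "NNP", "VV", "VA", "MAG")


def tokenize(text: str) -> list[Morph]:
    """Tokenize a transcript with kiwipiepy (imported lazily; requires the package)."""
    from kiwipiepy import Kiwi

    # A real worker would build Kiwi() once and reuse it; kept inline for brevity.
    return [Morph(t.form, t.tag) for t in Kiwi().tokenize(text)]


def lemma_counts(morphs: Iterable[Morph]) -> Counter:
    """Count (lemma, tag) pairs, skipping particles, endings, and punctuation."""
    counts: Counter = Counter()
    for m in morphs:
        if not m.tag.startswith(CONTENT_TAGS):
            continue
        tag = m.tag.split("-")[0]  # normalise variants such as "VV-I" to "VV"
        # Verb and adjective stems get the dictionary ending "다".
        lemma = m.form + "다" if tag in ("VV", "VA") else m.form
        counts[(lemma, tag)] += 1
    return counts
```

Calling `lemma_counts(tokenize(transcript))` would give one counter per video, which is the shape the scoring step needs.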
The frontend tracks every lemma a user is exposed to, across 10k+ tokens per video, recomputes familiarity in real time, and surfaces videos whose comprehensibility score falls in the 80–90% Goldilocks zone.
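One way to picture the tracking and filtering is the sketch below. The source does not give the scoring formula or the frontend's language, so everything here is an assumption made for illustration: the score is the share of a video's tokens whose lemma the user has met at least `FAMILIAR_AFTER` times, and Python stands in for whatever the frontend actually uses. All function names are hypothetical.

```python
from collections import Counter

FAMILIAR_AFTER = 3   # hypothetical: exposures before a lemma counts as familiar
ZONE = (0.80, 0.90)  # the 80-90% band


def record_watch(exposures: Counter, video_lemmas: Counter) -> None:
    """Add every lemma occurrence in a watched video to the user's exposure counts."""
    exposures.update(video_lemmas)


def comprehensibility(exposures: Counter, video_lemmas: Counter) -> float:
    """Fraction of a video's tokens whose lemma is already familiar to the user."""
    total = sum(video_lemmas.values())
    if total == 0:
        return 0.0
    known = sum(n for lemma, n in video_lemmas.items()
                if exposures[lemma] >= FAMILIAR_AFTER)
    return known / total


def goldilocks(exposures: Counter, catalog: dict[str, Counter]) -> list[str]:
    """Video ids whose score falls inside ZONE, most comprehensible first."""
    lo, hi = ZONE
    scored = [(comprehensibility(exposures, lemmas), vid)
              for vid, lemmas in catalog.items()]
    return [vid for score, vid in sorted(scored, reverse=True) if lo <= score <= hi]
```

Because `record_watch` only increments counters, scores can be recomputed after each video without rescanning the user's watch history.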
A self-healing NLP layer flags slang and compound words the dictionary misses, and a small admin console retriggers ingestion when a channel drifts.
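How the flagging works is not described, so the following is only a guess at its simplest form: collect lemmas that recur across transcripts but are missing from the dictionary, and queue them for review. The `flag_unknown` function, the `min_count` threshold, and the plain set used as the dictionary are all hypothetical.

```python
from collections import Counter


def flag_unknown(corpus_counts: Counter, dictionary: set[str],
                 min_count: int = 5) -> list[str]:
    """Lemmas seen at least min_count times that the dictionary lacks, most frequent first."""
    missing = [(n, lemma) for lemma, n in corpus_counts.items()
               if lemma not in dictionary and n >= min_count]
    # Sort by descending frequency, then alphabetically for a stable order.
    return [lemma for n, lemma in sorted(missing, key=lambda p: (-p[0], p[1]))]
```

The frequency threshold is there to keep one-off typos and caption noise out of the review queue; a real system would likely need a stronger signal than raw counts.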