Sentence Mining
Sentence mining is the practice of harvesting short, mostly-understood sentences from real content you're consuming and turning each one into a flashcard — so you study the language as it's actually used, not as a textbook imagines it.
What it is
Sentence mining is exactly what the name says: you go prospecting through real input — a show, a podcast transcript, a book, a YouTube video — and you "mine" out individual sentences that are valuable to you right now. Each mined sentence becomes a single card in a Spaced Repetition System (SRS), usually Anki. You then review those cards on a schedule so the vocabulary and grammar inside them stick.
The genius of the method — and the reason it fits an input-first worldview — is the selection criterion. You don't mine random hard sentences. You mine i+1 sentences: sentences where you understand everything except one new piece (one unknown word, or one unfamiliar grammar pattern). That single unknown is the "+1." Stephen Krashen's Input Hypothesis (i+1) is the whole engine here: a sentence that's 95-100% known gives your brain the surrounding context it needs to absorb the one new thing for free.
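The i+1 filter is mechanical enough to sketch in a few lines. This is a rough illustration, not how any particular tool does it: `tokenize`, `unknown_words`, and `is_i_plus_one` are made-up names, the known-word set is assumed to come from wherever you track vocabulary, and the regex tokenizer only works for languages that put spaces between words (Japanese or Chinese would need a real segmenter). It also only sees unknown words, not unfamiliar grammar patterns.

```python
import re


def tokenize(sentence: str) -> list[str]:
    """Lowercase the sentence and pull out runs of letters (crude tokenizer)."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", sentence.lower())


def unknown_words(sentence: str, known: set[str]) -> list[str]:
    """Return the distinct words not in the known set, in order of appearance."""
    unknown: list[str] = []
    for word in tokenize(sentence):
        if word not in known and word not in unknown:
            unknown.append(word)
    return unknown


def is_i_plus_one(sentence: str, known: set[str]) -> bool:
    """True when exactly one distinct word in the sentence is unknown."""
    return len(unknown_words(sentence, known)) == 1


if __name__ == "__main__":
    known = {"el", "gato", "duerme", "en"}  # hypothetical known-word list
    for line in ["El gato duerme en el sofá.", "El perro duerme en el sofá."]:
        print(line, "->", "mine it" if is_i_plus_one(line, known) else "skip")
```

A sentence with zero unknowns is skipped too: there is nothing to learn from it, so it isn't worth a card.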
Think of it as targeted hypertrophy. Mass comprehensible input is your cardio — high volume, builds the base, where most acquisition actually happens. Sentence mining is the isolation lift: you spot a specific gap and load one precise rep onto it. It does not replace immersion. It's a supplement that makes your immersion stickier by catching the words you keep almost-knowing.
A classic mined card looks like this:
- Front: the full target-language sentence (with the unknown word bolded or marked)
- Back: the meaning of the unknown word, a definition, often audio of the sentence, sometimes a screenshot or an English gloss
The card is sentence-context, never an isolated word. That's the entire point and what separates it from the old "memorize this word list" trap that this wiki happily debunks in Vocabulary Acquisition.
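If it helps to see the card layout as data, here is a minimal sketch. The `SentenceCard` class and its field names are invented for illustration; the only Anki-specific parts are the `[sound:...]` tag for audio and the fact that Anki can import tab-separated text files (with HTML allowed in fields). A two-field note type (front, back) is assumed.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class SentenceCard:
    sentence: str         # full target-language sentence
    target: str           # the single unknown word (the "+1")
    definition: str       # short meaning of the target word
    audio_file: str = ""  # optional media file name, e.g. "line42.mp3"

    def front(self) -> str:
        """Whole sentence with the first occurrence of the target bolded."""
        return self.sentence.replace(self.target, f"<b>{self.target}</b>", 1)

    def back(self) -> str:
        """Definition of the target word, plus an Anki sound tag if audio exists."""
        parts = [f"{self.target}: {self.definition}"]
        if self.audio_file:
            parts.append(f"[sound:{self.audio_file}]")
        return "<br>".join(parts)


def to_tsv(cards: list[SentenceCard]) -> str:
    """Serialize cards as tab-separated front/back rows, one card per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card.front(), card.back()])
    return buffer.getvalue()


if __name__ == "__main__":
    card = SentenceCard("El gato duerme en el sofá.", "sofá", "sofa, couch", "gato.mp3")
    print(to_tsv([card]), end="")
```

The front always carries the whole sentence; the bare word never appears on its own side of the card.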
The evidence
Three well-established strands of research back sentence mining — and we'll be honest about what each one does and doesn't prove.
1. Context beats isolation (Paul Nation, Krashen). Vocabulary researcher Paul Nation has spent decades showing that words are not single facts but bundles of knowledge — form, meaning, and use (collocations, register, grammar behavior). His work on word knowledge makes the case that you learn a word properly by meeting it in many meaningful contexts, not by drilling a bare translation. A mined sentence delivers form + meaning + use in one shot. Krashen's broader claim — that we acquire language by understanding messages, not by studying rules — is the philosophical parent of the whole method.
2. Spaced repetition fights forgetting (Hermann Ebbinghaus). Ebbinghaus's 19th-century memory experiments produced the famous forgetting curve: newly learned material decays rapidly unless it's revisited, and each well-timed review flattens the curve. SRS software (Anki's classic scheduler is a modified version of the SM-2 algorithm, and recent versions also offer FSRS) schedules each card to resurface around the time you'd otherwise forget it. Mining without SRS leaks; SRS without good cards drills junk. Together they're efficient.
3. Retrieval beats re-reading (the testing effect). A large body of cognitive-science work — Roediger and Karpicke's studies on the testing effect are the standard reference — shows that actively recalling information strengthens memory far more than passively reviewing it. An SRS card is a tiny retrieval test every single time. See Retrieval Practice & Interleaving for the deeper dive.
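To make the scheduling idea in point 2 concrete, here is a sketch of the published SM-2 update rule. Treat it as an illustration only: Anki's scheduler differs from textbook SM-2 in several details, and the `ReviewState` and `sm2_review` names are invented here. Quality is the 0 to 5 self-grade from the SM-2 description; a grade below 3 counts as a lapse and leaves the ease factor unchanged.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", never allowed below 1.3


def sm2_review(state: ReviewState, quality: int) -> ReviewState:
    """Return the next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    # Lapse: restart the repetition sequence, keep the ease factor as it was.
    if quality < 3:
        return ReviewState(repetitions=0, interval_days=1, ease=state.ease)

    # Success: 1 day, then 6 days, then multiply the previous interval by ease.
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)

    # Ease moves up for easy recalls and down for hard ones, floored at 1.3.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    return ReviewState(state.repetitions + 1, interval, max(1.3, state.ease + delta))


if __name__ == "__main__":
    state = ReviewState()
    for grade in (4, 4, 4, 2, 5):
        state = sm2_review(state, grade)
        print(grade, state)
```

The growing gaps are the "flattening" of the forgetting curve in practice: each successful recall pushes the next review further out, and a failed recall pulls the card back to a one-day interval.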
The honest caveats. Sentence mining is manual and easy to over-do. A card you don't understand is a card you'll fight for months. Krashen himself would point out that conscious study of cards is "learning," not deep "acquisition" — the cards are a scaffold, and the real growth still comes from massive input. And there's a well-known failure mode called Anki addiction: people optimize their deck instead of consuming the language. More on motivation and sustainability in The Science of Motivation. The research supports mining as a force multiplier on input — never as a substitute for it.
How to actually use it
Here's the no-bullshit training plan. Don't mine until immersion is already a habit — you can't harvest a field you haven't planted.
Step 1 — Pick input you mostly understand. Mine from content where you already grasp ~90% or more. If you're staring at a wall of unknowns, the content is too hard; drop down a level and build the base first. See Finding Comprehensible Input.
Step 2 — Mine i+1, and be ruthless. As you read or watch, only pull sentences with one unknown. Two-plus unknowns? Skip it — it's not a rep, it's an injury. Quality over quantity, always. A good daily quota for a beginner-intermediate learner is 5-15 new cards, not 50. Your future self has to review every card you make today.
Step 3 — Automate the boring part. Hand-typing cards kills the habit. Use a tool that builds the card for you from a click: Language Reactor, which exports lines from Netflix and YouTube straight into Anki; the Yomitan browser dictionary (the successor to Yomichan), for one-click mining while reading; or LingQ, which tracks unknowns as you read.
Step 4 — Build the card right. Front = the whole sentence with the target word marked. Back = a short definition (ideally a monolingual one once you're intermediate), audio of the sentence, and any image that helps. Always include audio — it trains your ear as well as your eye and feeds Mastering Listening.
Step 5 — Review daily, no streak-breaking. Open Anki every day. Ten minutes beats a two-hour weekend cram every time — that's the forgetting curve at work. Bake it into your Daily Routine. Miss a day, you'll just face a bigger queue; miss a week, the curve eats your gains.
Step 6 — Delete shamelessly. A card you keep failing or that bores you? Suspend or delete it. You'll meet the word again in real input. A lean deck you actually do beats a bloated deck you dread.
Step 7 — Don't let the deck become the goal. Mining serves input. If you find yourself perfecting card templates instead of watching the show, you've lost the plot. Speaking, by the way, isn't forced here — it emerges once enough comprehensible reps are in the bank. Cards build the raw material; output shows up on its own.
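The gatekeeping in Steps 1 and 2 can be sketched as two small checks. Everything here is an assumption for illustration: `coverage` and `pick_daily_cards` are made-up helpers, coverage is counted per word token (so a repeated known word counts every time), the tokenizer only suits space-delimited languages, and the default cap of 10 is just a value inside the 5-15 range suggested above.

```python
import re


def coverage(text: str, known: set[str]) -> float:
    """Fraction of word tokens in the text that appear in the known set."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    return sum(word in known for word in words) / len(words)


def pick_daily_cards(candidates: list[str], added_today: int, daily_cap: int = 10) -> list[str]:
    """Keep only as many candidate sentences as the daily quota still allows."""
    remaining = max(0, daily_cap - added_today)
    return candidates[:remaining]


if __name__ == "__main__":
    known = {"el", "gato", "duerme", "en"}  # hypothetical known-word list
    sample = "El gato duerme en el sofá"
    share = coverage(sample, known)
    # Step 1: below roughly 90% known, the content is probably too hard to mine.
    print(f"{share:.0%} known:", "mine from it" if share >= 0.90 else "drop down a level")
    # Step 2: with 8 cards already added today, only 2 more fit under a cap of 10.
    print(pick_daily_cards(["s1", "s2", "s3"], added_today=8))
```

The cap is the part that protects your future self: once today's quota is used up, extra candidates are simply dropped, and the words will turn up again in later input.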
Resources
- Anki — the free, open-source SRS that powers most sentence mining. (iOS app costs money; desktop, Android, and web are free.) Start with our Anki guide.
- Language Reactor — browser extension for Netflix and YouTube with one-click Anki export of subtitle lines plus audio. Search "Language Reactor."
- Yomitan — free pop-up dictionary browser extension (the maintained fork of Yomichan) with built-in Anki card creation. It is especially strong for Japanese and also works with other languages. Search "Yomitan."
- Migaku — a paid all-in-one immersion + mining toolkit if you want the workflow bundled. Search "Migaku."
- LingQ — reader app that tracks your known/unknown words and supports exporting; covered in our LingQ guide.
- "How to Learn a Foreign Language" frameworks from the Refold community — Refold's free guide popularized modern sentence mining; see Refold / Mass Immersion Approach.
- Paul Nation, Learning Vocabulary in Another Language — the academic bedrock on how words are actually learned in context.