Subtitle Text Analysis with subtools
Overview
1. Reading subtitles | From a file | Attaching metadata at read time | From a character vector
2. Exploring the subtitles object | Quick summary | Raw text extraction | Accessing individual columns
3. Cleaning subtitles | Remove formatting tags | Remove closed captions | Remove arbitrary patterns | Chaining cleaning steps
4. Combining subtitles | Collapsing multiple objects into one | Keeping a list structure
5. Reading an entire series
6. Adjusting timecodes
7. Writing subtitles back to disk
8. Text analysis with tidytext | Tokenising into words | Tokenising into sentences or n-grams | Word frequency
9. Advanced: cross-episode analysis | TF-IDF across episodes | Dialogue timeline
Summary
