New blog post, in which I review and test some options for extracting unformatted text from #EPUB files in Python, using #Apache #Tika (via #Tika-python), #Textract and #EbookLib.
It includes a link to a Git repo with demo scripts.
https://www.bitsgalore.org/2023/03/09/extracting-text-from-epub-files-in-python
#ebooklib #textract #tika #apache #epub