FedSearch - Federated network search engine

Jakub Cabal · @xcabal05

174 followers · 88 posts · Server witter.cz

I scanned archived issues of the newsletter from my home village using OCR. Out of curiosity, @vavracze and I compared the OCR text produced by the #OCRmyPDF tool (left) with that from a Canon printer (right). Even though OCRmyPDF is an #opensource tool, its OCR output is very good.

Last updated 2 years ago

gihyo.jp · @gihyo

4 followers · 15 posts · Server rss-mstdn.studiofreesia.com

No. 770: Automatically running OCR on scanned content with Ubuntu and OCRmyPDF
https://gihyo.jp/admin/serial/01/ubuntu-recipe/0770?utm_source=feed
#技術評論社 #gihyo_jp #アプリケーション #ネットワーク技術 #お役立ち情報 #Ubuntu #OCR #OCRmyPDF

Last updated 2 years ago

Alexandre B A Villares 🐍 · @villares

939 followers · 2074 posts · Server ciberlandia.pt

GitHub - ocrmypdf/OCRmyPDF: OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

@neil @marinheiro for the benefit of others reading this: https://github.com/ocrmypdf/OCRmyPDF #OCR #PDF #OCRmyPDF

Last updated 3 years ago

Dominik Bucheli · @buchi

50 followers · 44 posts · Server verkehrswende.social

@343max Install #OCRmypdf with Homebrew. https://ocrmypdf.readthedocs.io/en/v11.6.0/batch.html

Last updated 3 years ago

Ben S. · @HunterZ

138 followers · 1767 posts · Server mastodon.sdf.org

Working through hitches with the last document:
- font in last tweet wasn't getting used (had to tweak it in FontForge)
- bits of text underlay visible (optimization fixed it)
- #jbig2 encoder used by #ocrmypdf optimization scrambled glyphs (disabled; I can't trust it now!)

#xp

Last updated 3 years ago

Bluelupo · @bluelupo

340 followers · 2605 posts · Server social.tchncs.de

Step 2: Text recognition (OCR) of PDFs on Linux

https://write.tchncs.de/~/Paperless/schritt-2-texterkennung-ocr-von-pd-fs-unter-linux

...a very thorough article on how to use OCR to make PDF files text-searchable. The whole thing is packaged as a Python program. There is an official OCRmyPDF Docker container maintained by the developer. The Python program can be started from the command line to convert your PDFs quickly and effectively.

#Tesseract #OCR #PDF #Linux #Docker #OCRmyPDF #Kommandozeile #Container #Texterkennung

Last updated 3 years ago

· @aluaces

31 followers · 114 posts · Server fosstodon.org

Not its main use, but #ocrmypdf is excellent for converting jpeg images into small pdf files.

Last updated 3 years ago

Parleur · @parleur

487 followers · 23675 posts · Server mastodon.parleur.net

Oh, but #ocrmypdf works really well!

Last updated 4 years ago

tuxwise · @tuxwise

29 followers · 93 posts · Server social.tchncs.de

tuxwise - Recommended software - tuxwise

Recommended #opensource #PDF #OCR tool: #OCRmyPDF

Why? Deskew & clean images before OCR · Multi-language support · PDF/A output · Lossless optimization · Folder watcher · Redo existing OCR · Well documented

More recommendations: https://tuxwise.net/recommended-software/

https://ocrmypdf.readthedocs.io/en/latest/
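
The features listed in this post map roughly onto OCRmyPDF command-line options. What follows is a minimal Python sketch, not part of the original post. The helper name `ocrmypdf_args` and its parameters are hypothetical. The flags (`-l`, `--deskew`, `--clean`, `--output-type pdfa`, `--redo-ocr`) are assumed to match the linked documentation. The sketch only builds an argument list and runs nothing. Its refusal to combine `--redo-ocr` with `--deskew` is likewise an assumption about the tool's documented restrictions.

```python
def ocrmypdf_args(src, dst, languages=("eng",), deskew=True,
                  clean=True, pdfa=True, redo_ocr=False):
    """Build an argv list for one OCRmyPDF run (hypothetical helper).

    Nothing is executed here; the caller decides how to run the command.
    """
    # Assumption: OCRmyPDF rejects --redo-ocr combined with --deskew.
    if redo_ocr and deskew:
        raise ValueError("redo_ocr cannot be combined with deskew")

    args = ["ocrmypdf"]
    if languages:
        # Several Tesseract languages are joined with '+', e.g. deu+eng.
        args += ["-l", "+".join(languages)]
    if deskew:
        args.append("--deskew")      # straighten pages before OCR
    if clean:
        args.append("--clean")       # clean page images before OCR
    if pdfa:
        args += ["--output-type", "pdfa"]
    if redo_ocr:
        args.append("--redo-ocr")    # replace an existing OCR text layer
    args += [str(src), str(dst)]
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where OCRmyPDF and the requested Tesseract language packs are installed.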

Last updated 4 years ago

T.F.G. · @TFG

74 followers · 1225 posts · Server social.linux.pizza

wow.. and even recognizing hyphenation 😱 😱

#ocrmypdf

Last updated 4 years ago

Mr. Teatime · @Mr_Teatime

119 followers · 7316 posts · Server social.tchncs.de

@stardenver

I also ended up with #ocrmypdf (for individual PDFs, but it can of course be scripted). Normally it skips OCR if a text layer is already present.

What I especially like is the option to compress the content lossily with #pngquant, if that is installed.
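
The scripting idea mentioned in this post could look roughly like the Python sketch below, which is an illustration and not the poster's own script. The function names `build_ocrmypdf_command` and `plan_batch` are hypothetical. The sketch passes `--skip-text` explicitly so that pages with an existing text layer are left alone, and uses `--optimize 2`, which is assumed here to be the level at which OCRmyPDF permits lossy steps such as pngquant.

```python
from pathlib import Path


def build_ocrmypdf_command(src, dst, optimize=2):
    """Return the argv list for one OCRmyPDF run (nothing is executed)."""
    return [
        "ocrmypdf",
        "--skip-text",                # skip pages that already contain text
        "--optimize", str(optimize),  # assumed: level 2+ allows lossy pngquant
        str(src),
        str(dst),
    ]


def plan_batch(input_dir, output_dir):
    """Build one command per *.pdf file in input_dir, sorted by file name."""
    out = Path(output_dir)
    return [
        build_ocrmypdf_command(pdf, out / pdf.name)
        for pdf in sorted(Path(input_dir).glob("*.pdf"))
    ]
```

Each planned command can then be handed to `subprocess.run(cmd, check=True)`. Keeping planning separate from execution makes it easy to inspect what would run before touching any files.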

Last updated 6 years ago
