Thread information:
Forum: WinDev Forum
Posts in thread: 5
First post: 5 years, 11 months ago
Last post: 2 years ago
Participating authors: Gianni Spano, Michael Drechsel, Steven Sitas, evanpan

Convert PDF to TXT

First post by Gianni Spano on 13.03.2012 09:36

Hello to All

I'm looking for a free ActiveX control or DLL that I can add to my project to convert a large number of PDF files to text, so that I can read their contents and insert the information into our database.

Unfortunately, PDFTOTEXT() is not the best solution available in WinDev, because the result is not easily readable.

Do you know of any free tools that could help me solve this?


Thanks in advance.

Gianni

Replies:

Hi Gianni,
Your PDFs must be searchable PDFs if you want to use something as simple as PDFTOTEXT().

This is how it works: PDFs + OCR -> searchable PDF + PDFTOTEXT() -> text
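
The chain above implies a check before the text step: a PDF that yields no text has to go through OCR first. Here is a minimal sketch of that check in Python (not WLanguage), assuming the text has already been extracted into a string by whatever tool is in use. The names `needs_ocr` and `route` and the `min_chars` threshold are hypothetical, chosen only for illustration:

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Return True when the extracted text looks empty, i.e. the PDF is
    probably a scanned image without a searchable text layer."""
    # Count only visible characters; scanned PDFs often yield whitespace only.
    visible = sum(1 for ch in extracted_text if not ch.isspace())
    return visible < min_chars


def route(extracted_text: str) -> str:
    """Pick the next pipeline step for one PDF: "ocr" or "text"."""
    return "ocr" if needs_ocr(extracted_text) else "text"
```

The threshold is a rough heuristic; a PDF that mixes scanned and text pages would need a per-page check instead.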

The problem is that OCR software costs a lot of money, and it has a runtime cost as well.

Take a look here:
http://www.aquaforest.com/en/index.asp

And this one is open-source software:
http://code.google.com/p/tesseract-ocr/

Steven Sitas

by Steven Sitas - on 13.03.2012 13:12
Hello Steve

It took a lot of work to arrive at a solution.

First, I open the .doc file through OLE Automation and save it as a PDF file (constant SaveAsPdf, value 17).

Then I extract the text from that file using a little program called "PDFTXT.exe", downloaded from www.pdf-tools.com.
There is an SDK, but I prefer to run PDFTXT.exe with ExeRun() to extract the text from the PDF files saved/converted from Winword.

This little solution returns the text formatted as it is in the original .doc file. After a lot of research and testing, it is the best solution I found (and the best for my pocket, too!).
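
The batch step described above (one external extractor call per PDF, then reading the resulting text) could be sketched roughly as follows. This is Python rather than WLanguage, with `subprocess.run` standing in for ExeRun(). The helper names `build_jobs` and `run_jobs` are hypothetical, and the argument order passed to the extractor (input file, then output file) is an assumption, not the documented syntax of PDFTXT.exe:

```python
import subprocess
from pathlib import Path


def build_jobs(pdf_dir, extractor_cmd):
    """Return one (command, output_path) pair per PDF found in pdf_dir.

    extractor_cmd is a list such as ["PDFTXT.exe"]. Appending the input
    path and then the output path is an assumed calling convention.
    """
    jobs = []
    for pdf in sorted(Path(pdf_dir).iterdir()):
        if pdf.suffix.lower() != ".pdf":
            continue  # skip anything that is not a PDF
        out = pdf.with_suffix(".txt")
        jobs.append((list(extractor_cmd) + [str(pdf), str(out)], out))
    return jobs


def run_jobs(jobs):
    """Run each command and collect the text of every file it produced."""
    texts = {}
    for cmd, out in jobs:
        result = subprocess.run(cmd)
        # Keep only outputs from runs that succeeded and wrote a file.
        if result.returncode == 0 and out.exists():
            texts[out.name] = out.read_text(errors="replace")
    return texts
```

Keeping command construction separate from execution makes it easy to inspect the commands before running them against a large folder.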

Thanks anyway for your input.

Gianni

by Gianni Spano - on 15.03.2012 12:36
*** I warned you! This is not the place for your spam! ***

by evanpan - on 26.02.2016 04:33
Hi Gianni,

Can you post a link for "pdftext.exe"? I can't find it at www.pdf-tools.com.

TIA

by Michael Drechsel - on 26.02.2016 15:40