PDF Data Extractor Portable extracts specific text from PDF files. It is well suited to, for example, PDF statements from which you need to pull data such as Account Number, Name and Address and output it to an Excel-compatible CSV file. It matches text by horizontal and vertical position, and for more advanced matching it has a rules system for conditional matching, e.g. only take a match when the text "Account Number:" is on the same page. Different fields can also be merged into one, so First Name and Surname can be output as a single field in the CSV file. Many options are available:
Data Extraction, OCR PDF option, OCR number correction, Adjust for Skewed PDF page option, Full Unicode support for other language files e.g. Hebrew, Right to Left reading order option, Offset on a word on the page for dealing with chopped scanned PDFs, Number / Date / Money / Address / Email / Telephone Number filtering, Smart Adobe Reader PDF Highlight Setup, Output filename using data, Pattern Matching, Data file lookup for matching codes to descriptions, Data column order assignment, Run on the command line, Header output, Page number field, Filename field, Batch list of files to process, 32-bit and 64-bit versions.
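The position matching, conditional rule and field merging described in the overview can be pictured with a small Python sketch. This is not the product's implementation or its settings format: the `Word` tuple, the coordinate ranges and the function names are all hypothetical, and the words are assumed to have already been extracted together with their horizontal (h) and vertical (v) positions.

```python
import csv
import io
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical extracted word with its position on the page."""
    text: str
    h: float  # horizontal position
    v: float  # vertical position


def words_in_box(words, h_min, h_max, v_min, v_max):
    """Join, in reading order, the words whose position falls inside the box."""
    hits = [w for w in words if h_min <= w.h <= h_max and v_min <= w.v <= v_max]
    hits.sort(key=lambda w: (w.v, w.h))
    return " ".join(w.text for w in hits)


def page_contains(words, phrase):
    """Conditional rule: is the phrase present anywhere on the page?"""
    return phrase in " ".join(w.text for w in words)


# Illustrative page content; all coordinates are made up.
page = [
    Word("Account", 50, 100), Word("Number:", 95, 100), Word("12345678", 150, 100),
    Word("Jane", 50, 120), Word("Smith", 80, 120),
]

rows = []
# Only take matches when "Account Number:" is on the same page.
if page_contains(page, "Account Number:"):
    account = words_in_box(page, 140, 220, 95, 105)
    first_name = words_in_box(page, 45, 75, 115, 125)
    surname = words_in_box(page, 76, 120, 115, 125)
    # Merge First Name and Surname into one output field.
    rows.append([account, f"{first_name} {surname}"])

# Write a header row followed by the extracted rows as CSV text.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Account Number", "Name"])
writer.writerows(rows)
csv_text = out.getvalue()
```

In this sketch a page without the "Account Number:" text produces no row at all, which is the effect the conditional rule is meant to have.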
Files can now also be renamed or copied to a new location based on the extracted data.
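As a rough illustration of copying a file under a name built from extracted data, here is a minimal Python sketch. The helper names `safe_name` and `copy_by_field` are hypothetical and the logic is an assumption about how such a feature could work, not a description of how the product does it.

```python
import re
import shutil
from pathlib import Path


def safe_name(value):
    """Replace characters that are not allowed in Windows filenames."""
    return re.sub(r'[<>:"/\\|?*]', "_", value).strip()


def copy_by_field(pdf_path, field_value, dest_dir):
    """Copy pdf_path into dest_dir, naming the copy after the extracted value."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"{safe_name(field_value)}.pdf"
    shutil.copy2(pdf_path, target)
    return target
```

For example, `copy_by_field("statement.pdf", "AC/123", "sorted")` would copy the file to `sorted/AC_123.pdf`, with the slash replaced because it cannot appear in a filename.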
Enterprise Edition also supports:
Full enterprise flexibility, with multi-threaded hot folder monitoring, background NT service support, SQL Server database insert support for updating a database directly from PDF data, and the ability to run DOS commands for each set of extracted data.
1. Supports Windows Server 2008, 2012, 2016, 2019, Windows 7, 8, 10 & 11
2. Stand Alone version i.e. Does NOT require Adobe Acrobat
3. 32-bit & 64-bit version downloads
4. Extract data from multi-page PDFs
5. Multiple output fields from the source PDF
6. Conditional matching rules system
7. Optionally OCR the PDF first.
8. Full Unicode Support.
9. HOT Folder Support.
10. NT Service Background support.
11. SQL Server Database support.
12. DOS Scripting with Data support.
13. PDF Highlight Setup
14. Offset option for chopped PDFs.
15. Skewed PDF option.
16. Number / Date / Money / Email / Telephone No. Filtering.
17. Numeric / Alpha Pattern Matching.
18. File Lookup Matching.
19. OCR alpha to number fixing.
20. Output fields such as: Total pages, page number matched, filename
21. Process a batch list of PDFs
22. Optionally run on the command line for automation
23. Supports all PDF types except encrypted and protected files
24. Automatically saves settings for later use
25. Full HTML & PDF Help.
NOTE: This Software is Stand Alone, i.e. does NOT require Adobe Acrobat to run
1. Added multi-word text matching for the option “if last exact data match then h>=(n) && h<=(n),v>=(n) && v<=(n) match output (join + add space)”. For example, the match text “Account Number:” automatically uses the last match and the word before it. The match text can now contain up to 3 words separated by spaces; previously it was limited to one word, e.g. “Number:”.
2. Fix for File menu -> “Save As” memory issue.
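The multi-word match text in item 1 can be illustrated with a short Python sketch. It assumes the page text is already available as a list of words in reading order; `find_label` and its `max_words` parameter are hypothetical names, and this is one possible way to do such matching rather than the product's own code.

```python
def find_label(words, label, max_words=3):
    """Return the index of the last word of `label` within `words`, or -1.

    `label` may contain up to `max_words` space-separated words.
    """
    parts = label.split()
    if not 1 <= len(parts) <= max_words:
        raise ValueError(f"label must have 1 to {max_words} words")
    n = len(parts)
    # Slide over the page and compare each run of n consecutive words.
    for end in range(n - 1, len(words)):
        if words[end - n + 1:end + 1] == parts:
            return end
    return -1


# Illustrative page text; the values are made up.
words = ["Sort", "Number:", "00-11-22", "Account", "Number:", "12345678"]
single = find_label(words, "Number:")          # matches the first "Number:"
multi = find_label(words, "Account Number:")   # needs "Account" just before it
```

With a single-word match text the first "Number:" on the page is picked up, whereas the two-word match text only matches where "Account" immediately precedes it, so the value that follows is the account number.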