Tag Archives: Easy PDF Search

Displaying logical page numbers in search results

By default, the search results in Easy PDF Search display the physical page numbers where the search words/phrases are found.

You may sometimes want to display the logical page numbers instead. For example, say we have a PDF file that has a cover, followed by a blank page, then 5 preface pages numbered in Roman numerals, followed by another blank page, and finally the actual content.

The logical numbering of the PDF would look something like this:

  • cover (physical page 1)
  • blank page (physical page 2)
  • page I (physical page 3)
  • page II (physical page 4)
  • page III (physical page 5)
  • page IV (physical page 6)
  • page V (physical page 7)
  • blank page (physical page 8)
  • page 1 (physical page 9)
  • page 2 (physical page 10)

In Easy PDF Search 5.2, you can now define the logical page numbering for a PDF file and display that numbering scheme in your search results.  The search results using logical page numbering for the above example will then be displayed this way:

To define the logical page numbering for a PDF file, right click on the PDF file name in the search results and click on the Define logical page numbering item.

In the Logical Page Numbers screen, enter the page count for each group of pages that has its own numbering scheme, or none at all. Using the example above, this is how we would define the logical page numbers.
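
To make the mapping concrete, here is a minimal Python sketch of how a list of page counts and numbering styles could be turned into logical page labels. It is not how Easy PDF Search implements the feature. The function names, the style names ("none", "roman", "arabic") and the rule that numbering restarts at 1 in each numbered group are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
         (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def logical_labels(sections: List[Tuple[int, str]]) -> List[Optional[str]]:
    """Return one label per physical page; None means the page is unnumbered.

    Each section is (page_count, style), where style is "none", "roman"
    or "arabic" (hypothetical names). Numbering restarts at 1 in every
    numbered section, which is an assumption of this sketch.
    """
    labels: List[Optional[str]] = []
    for count, style in sections:
        for i in range(1, count + 1):
            if style == "roman":
                labels.append(to_roman(i))
            elif style == "arabic":
                labels.append(str(i))
            elif style == "none":
                labels.append(None)
            else:
                raise ValueError(f"unknown style: {style}")
    return labels


# The example PDF: cover, blank page, 5 preface pages, blank page, 2 content pages.
example = [(1, "none"), (1, "none"), (5, "roman"), (1, "none"), (2, "arabic")]
labels = logical_labels(example)
print(labels[2], labels[6], labels[8])  # physical pages 3, 7 and 9 -> I V 1
```

The list index is the physical page number minus 1, so looking up a logical label for a search hit is a single list access.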

 

Easy PDF Search 5 – highlighting function

In Easy PDF Search 5 (EPS), we added a highlighting function.  This is to help users who need to highlight content in their PDF files after EPS has delivered the search results.  Previously, we had to open the PDF file in another application like Acrobat Reader to highlight the content, thus losing the search results and navigation functions available in EPS.

IMPORTANT NOTE

DON’T use the highlighting function in EPS to redact your PDF content.  The highlighted areas are path ‘objects’, which can be easily removed.

Getting started

We first need to open the file we want to highlight in EPS’s internal viewer.  To do that, select the Open PDF file using internal viewer option from the context menu in the search tree.

The file is then opened in its own tab.

The search words are still highlighted, but using outlines instead of a solid color.  A solid color would be confusing, as it would be difficult to tell our own highlighted text apart from the search words highlighted by EPS.

You can still highlight the search terms in solid color by deselecting the Show outlines only for search words option.

Highlighting content

To highlight the content, or remove the highlighting, use these options on the toolbar.

Toggle the Highlight text button to enable and disable highlighting.  When enabled, drag the mouse over the content you want to highlight.  Release the mouse to apply the highlight.

TIP: You can also click on the right mouse button to toggle the Highlight text button.

The color drop down allows you to select the highlighting color.

Click on the Remove highlight button to remove existing highlighted areas.  When enabled, click on a highlighted area to remove the highlight.  Note that EPS is unable to differentiate between highlights applied in EPS and those applied by other applications.  If it detects a path object where you clicked, it will simply attempt to remove it.

The Remove all highlights on current page button allows you to remove all highlighted areas on the current page.  Again, EPS cannot differentiate between highlights it applied and existing highlights applied by another application.  It will simply remove all PDF path objects it finds on the current page.

 

Saving the highlighted file

You must save the modified PDF file using a different name from the original file name.  Click on the Save PDF file button to save your modified file.

Once you have saved the file, the modified PDF file name is displayed in the drop down list under the Save button.

To open this file in an external viewer, click on the Open PDF file using external application button.  Windows will then attempt to open the file using the registered PDF application.

Your PDF file is also saved automatically when you close EPS, but only if you have already saved it at least once.

Persistence

When you close and reopen EPS, any files that were previously opened in the internal viewer will also be automatically loaded and displayed.  This allows you to continue your work from where you last left off.

Next steps

Depending on user feedback and sales, the following items are currently considered for implementation:

  • an option to remove all highlights from the entire file
  • a function to extract all the highlighted text from a page/file
  • a way to use the highlighting function on files opened outside of the search function

If you have any other suggestions, please drop us a line at support@yohz.com.

Finding PDF files that do not contain any text

The situation is as follows: you use Easy PDF Search to index and search for words and phrases in your PDF files.  Suddenly, you realise that not all your files have been indexed because they actually do not contain any searchable text.

This can happen if your PDF files are actually scans of documents, and you rely on an OCR application to read the text and store it inside the PDF.  Now you need a quick and easy way to identify those PDF files that have not yet been processed by OCR.

You can easily do this in Easy PDF Explorer.  First, select the Count images and text option.

If you want an accurate count of the images and characters, you can leave the option at in all pages.  If you just want to know whether a file contains any characters, then selecting the but stop when text is found option will speed up the process significantly.

This is because Easy PDF Explorer no longer has to scan every page in the PDF file – the moment it encounters a page that contains text, it stops scanning.  If your aim is just to know which files don’t contain any text, then this option is the fastest.
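
The difference between the two settings can be sketched in a few lines of Python. This is only an illustration of the early-exit idea, not Easy PDF Explorer's code. The function name and the use of plain strings to stand in for the text extracted from each page are assumptions.

```python
from typing import Iterable, Tuple


def count_characters(pages: Iterable[str],
                     stop_when_text_found: bool = False) -> Tuple[int, int]:
    """Return (characters_counted, pages_examined).

    With stop_when_text_found=True the scan ends at the first page that
    contains any non-whitespace character, so the count is only a lower
    bound, but it is enough to tell whether the file has text at all.
    """
    total = 0
    examined = 0
    for text in pages:
        examined += 1
        total += len(text.strip())
        if stop_when_text_found and total > 0:
            break
    return total, examined


# Hypothetical page texts: a scanned cover followed by three text pages.
pages = ["", "Chapter 1", "More text", "The end"]
full = count_characters(pages)                               # reads all 4 pages
quick = count_characters(pages, stop_when_text_found=True)   # stops after page 2
```

Note that a file with no text at all still has to be read to the last page in both modes. The saving comes from the files that do contain text, which can be skipped after their first text page.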

Once you’ve selected those options, select the PDF files you want to scan in the explorer window.

Easy PDF Explorer then lists the files together with the number of characters in each file.

Click on the Characters column, and you can quickly sort the files by number of characters found.

Say you want to copy those files with no text to another folder.  Right click and select the Deselect all item.

Now click on the first file with no text, and while holding down the SHIFT key, click on the last file with no text.  The range of files will be highlighted.

Now right click to bring up the context menu, and click on the Select item.  The range of files will then be selected.

Now just click on the Copy to folder button, select the folder to copy the selected files to, and you’re done.  Now all you have to do is run your OCR application on those files.
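
If you export the character counts and prefer to script this step, the same sort, select and copy sequence could look like the Python sketch below. It is independent of Easy PDF Explorer. The function name and the shape of the input (a dictionary of file path to character count) are assumptions.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def copy_files_without_text(char_counts: Dict[str, int], destination: str) -> List[Path]:
    """Copy every file whose character count is 0 into the destination folder."""
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # Sorting by count mirrors clicking on the Characters column.
    for name, count in sorted(char_counts.items(), key=lambda item: item[1]):
        if count > 0:
            break  # sorted ascending, so no files without text remain
        # shutil.copy2 returns the path of the newly created copy
        copied.append(Path(shutil.copy2(name, dest)))
    return copied
```

The returned list is the set of files that still need to go through your OCR application.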

Download a 14-day trial of Easy PDF Explorer now to work with your PDF files faster.  Also give Easy PDF Search a try – probably the fastest way to search your PDF files for multiple words and/or phrases.

What’s new in Easy PDF Search 4

Easy PDF Search (EPS) 4 was recently released with the following changes:

Notes window

There is now a Notes window in EPS which you can use to take notes while working with your search results.  The Notes Editor is a simple text editor, and will float above all other EPS windows.

Your notes are saved in the rich-text format, and if you have Microsoft Word installed, you can also save the notes to DOCX and PDF formats.

You can also embed PDF documents into the Notes window, so you can view multiple PDF documents simultaneously for reference.

Usability improvements

  • You can now scroll across multiple pages by holding down the SHIFT key and moving the mouse wheel.
  • You can now zoom in and out of a page by holding down the CONTROL key and moving the mouse wheel.
  • You can now specify the font size to be used in the Search and Results windows via the Settings screen.

Database clean-up

Over time, the full-text index database may contain entries for non-existing files.  You can now delete those entries using the Clean function in the Settings screen.

This will free up space in the database for new files, but it will not reduce the size of the file.
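
Conceptually, the clean-up removes every index entry whose file can no longer be found. The Python sketch below shows that idea on an in-memory dictionary. The real EPS index is a database whose layout is not described here, so the data structure and the function name are assumptions.

```python
import os
from typing import Callable, Dict, List


def clean_index(index: Dict[str, dict],
                exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Delete entries whose file no longer exists; return the removed paths.

    index maps a file path to whatever is stored for that file. The exists
    callable is injectable so the function can be tried without real files.
    """
    missing = [path for path in index if not exists(path)]
    for path in missing:
        del index[path]
    return missing
```

It is common for database engines to keep freed pages for reuse rather than shrink the file, which would explain why the space becomes available for new entries while the file size stays the same.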

For existing users who purchased a license within the last 12 months, you can upgrade to version 4 for free.  For users with older licenses, you can purchase a license extension for only USD 10.

You can download a 14-day trial version using this link.

High DPI support

We recently added high DPI support to some of our applications so that they render better when user displays are scaled to 125% or more.  We may have missed 1 or 2 items, so if you encounter any GUI elements that are oversized or undersized, we would appreciate it very much if you could let us know at support@yohz.com.

The applications we’ve added high DPI support for are:

Search text in multiple PDF files fast

So you want to search for text in multiple PDF files?  You can do that in Adobe Acrobat, and Google will turn up a few guides on doing that.

That’s all good and fine, but what if you need the search results fast and you need to search hundreds or thousands of PDF files?  Then you should consider Easy PDF Search.

Speed

Easy PDF Search is fast.  Watch this video comparing Easy PDF Search with Adobe Acrobat.  In short, to search for a word the second in 46 files totaling 1 GB in size, Easy PDF Search took 3 seconds while Adobe Acrobat took 3 minutes 13 seconds.

We have a user who regularly searches his collection of over 12000 PDF files using Easy PDF Search, and he gets his search results in less than 20 seconds.

Search multiple words simultaneously

Search for multiple words simultaneously.  Why waste time searching the same files for different words?  Easy PDF Search lets you search for as many words or phrases as you require.

Quickly see where your words were found

Easy PDF Search doesn’t just tell you which files your words were found in.  It also tells you exactly which pages the words are on, and how often each word occurs on each page and in the entire file.
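
As an illustration of what such a result contains, the Python sketch below counts how often each search word occurs on each page of one file. It is not EPS code: the function name is hypothetical, each page is represented by a plain string, and only single words (not phrases) are handled.

```python
import re
from collections import Counter
from typing import Dict, List


def count_by_page(pages: List[str], words: List[str]) -> Dict[str, Dict[int, int]]:
    """Map each search word to {page_number: occurrences}, with 1-based pages."""
    wanted = {w.lower() for w in words}
    result: Dict[str, Dict[int, int]] = {w: {} for w in wanted}
    for number, text in enumerate(pages, start=1):
        tokens = Counter(re.findall(r"\w+", text.lower()))
        for word in wanted:
            if tokens[word]:
                result[word][number] = tokens[word]
    return result


# Hypothetical page texts for a three-page file.
pages = ["Pressure monitoring begins.", "No match here.",
         "Monitoring pressure, pressure logs."]
hits = count_by_page(pages, ["monitoring", "pressure"])
# Per-file totals are simply the sum of the per-page counts.
totals = {word: sum(per_page.values()) for word, per_page in hits.items()}
```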

In the integrated PDF viewer, all your words are highlighted on each page.

View results from past searches

Easy PDF Search maintains a search history of the words you searched for and also of the search results.

This means you can easily view the search results from past searches without having to repeat the search.
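
The idea behind a search history is simply to store each result set under the words that produced it. A minimal Python sketch, with a hypothetical class name and an in-memory store (EPS presumably keeps its history on disk), could look like this:

```python
from typing import Callable, Dict, List, Tuple


class SearchHistory:
    """Keeps the results of past searches, keyed by the words searched for."""

    def __init__(self, run_search: Callable[[Tuple[str, ...]], List[str]]):
        # run_search is whatever function performs the actual (slow) search
        self._run_search = run_search
        self._results: Dict[Tuple[str, ...], List[str]] = {}

    def search(self, *words: str) -> List[str]:
        """Run a new search and record its results."""
        key = tuple(words)
        self._results[key] = self._run_search(key)
        return self._results[key]

    def past_searches(self) -> List[Tuple[str, ...]]:
        """List the word combinations searched for so far, oldest first."""
        return list(self._results)

    def recall(self, *words: str) -> List[str]:
        """Return stored results without running the search again."""
        return self._results[tuple(words)]
```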

By now, you can see that Easy PDF Search is designed to save you time and help you search for text in multiple PDF files fast and easily.

In addition to the above, there is a lot more you can do with Easy PDF Search like:

  • merge all the pages from the search results into a single PDF file
  • copy all the files in the search results
  • extract text from the pages where the words were found
  • perform proximity searches e.g. NEAR (authorities “homeland security”, 20)
  • perform exclusion searches e.g. monitoring NOT daily
  • search PDF annotations and file attributes
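
To show what the proximity and exclusion searches in the list above mean, here is a small Python sketch that evaluates both on a piece of text. It does not reproduce how EPS evaluates these queries. In particular, measuring distance as the difference between the starting word positions of the two terms is an assumption, and all function names are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lower-case words."""
    return re.findall(r"\w+", text.lower())


def positions(tokens: List[str], term: str) -> List[int]:
    """Start positions of a word or multi-word phrase within tokens."""
    needle = tokenize(term)
    n = len(needle)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == needle]


def near(text: str, first: str, second: str, max_distance: int) -> bool:
    """True if the two terms start within max_distance words of each other."""
    tokens = tokenize(text)
    a, b = positions(tokens, first), positions(tokens, second)
    return any(abs(i - j) <= max_distance for i in a for j in b)


def contains_but_not(text: str, wanted: str, excluded: str) -> bool:
    """True if wanted occurs and excluded does not, like `wanted NOT excluded`."""
    tokens = tokenize(text)
    return bool(positions(tokens, wanted)) and not positions(tokens, excluded)


sample = "The authorities met with officials from Homeland Security yesterday."
close_enough = near(sample, "authorities", "homeland security", 20)
too_far = near(sample, "authorities", "homeland security", 3)
```

In the sample sentence the two terms start 5 words apart, so the first check succeeds and the second does not.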

Download a 14-day trial of Easy PDF Search and start using your PDF collection to its full potential, or visit our web site for more details.

Full text index for your PDF files

Are you considering creating a full text index on your PDF files, so that you can frequently search for words and phrases fast?  That’s what Easy PDF Search was created for.

Say you have a collection of PDF files for various topics.  You can organize your files into libraries so that when you run your search, you can choose to search only in specific libraries.  You don’t have to always search your entire PDF collection.

In Easy PDF Search, you can search for multiple words simultaneously.  Here, we are searching for all files containing the words monitoring, splices or pressure.

Our search results are then returned, grouped by each search word.

And on each page, our search words are highlighted in a different color.

Now what can you do with those search results?  In Easy PDF Search, lots.

For starters, you can export the search results listing or just the file names, for future or offline reference.

Next, you can work with the PDF pages from the search results.

You could extract each of the pages containing your search words and compile them into a single PDF file.  You could also extract the text found on those pages, or extract the pages into individual PDF files, and much more.

Easy PDF Search also keeps a search history, so you can just refer to it whenever the need arises without having to repeat the search.

Give Easy PDF Search a try.  We offer a 14-day fully functional trial so you can experience for yourself how easy it is to create a full text index for your PDF files and search those files fast.

Introducing Easy PDF Search 3

Easy PDF Search (EPS) 3 focuses on 3 areas – support more search options, more user actions on the search results, and general performance improvements.

More search options

In version 2, we added the option to search only the existing index.  This allows you to make very fast searches without having to check for new or modified files to index, or when the indexed files are not accessible.  In version 3, we added an additional option to search the existing index only for files in the selected libraries.

We also added the option to return only the file names from the search.

A good portion of the search duration is actually spent identifying which words to highlight in the search results.

When you only need the list of files where the search words were found, selecting the Return file names only option speeds up your searches even more.

User actions on search results

In previous versions, you could work with the search results, for example by combining all the pages into a single file or extracting the search pages into individual files, but you could not work with the results listing itself.

In version 3, you now have a context menu that allows you to perform various actions on the search results listing, like copying the list of files to the clipboard, opening the containing folder etc.

General performance improvements

We have improved the performance where possible, especially when dealing with large collections of files.  The search history listing now loads faster too.

Miscellaneous UI improvements

We have also made various minor UI tweaks to improve usability.  An obvious addition is the availability of in-built icons you can easily add to your library definition.

This helps you to quickly make your libraries more distinctive.  Of course you can still always use your own icons.

If you would like to give Easy PDF Search a try, you can download a free 14-day fully functional trial here.

Easy PDF Search – the search options explained

When searching for words and phrases in Easy PDF Search (EPS), you have 4 options:

For the first option, the process flow is as follows:

  • EPS looks for all the folders set up in the selected libraries
  • in each folder, EPS compiles a list of all the files matching the search pattern
  • for each new file, EPS will index that file
  • for each modified file, EPS will rebuild the index
  • EPS then searches for the entered words/phrases in the list of files it compiled in step 2 above

For the second option, the process flow is as follows:

  • EPS looks for all the folders set up in the selected libraries
  • in each folder, EPS compiles a list of all the files matching the search pattern
  • for each file, EPS deletes any existing index, and builds the index again
  • EPS then searches for the entered words/phrases in the list of files it compiled in step 2 above

For the third option, the process flow is as follows:

  • EPS looks for all the folders set up in the selected libraries
  • in each folder, EPS compiles a list of all the files matching the search pattern
  • EPS then searches for the entered words/phrases only in the files where an index has already been created

For the fourth option, the process flow is as follows:

  • EPS searches for the entered words/phrases in its existing index.

The point to note is that in the first 3 options, Easy PDF Search only returns results from files that exist.  If a PDF file has already been indexed previously but no longer exists, EPS will not search the index of that file.
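
The four process flows can be summarised in a short Python sketch. This is our reading of the steps above expressed as code, not the actual EPS implementation. The enum names, the in-memory index layout, the use of a modification time to detect modified files, and the choice to search the old index of a modified file in the third option are all assumptions.

```python
from enum import Enum
from typing import Callable, Dict, List, Set


class Mode(Enum):
    INDEX_NEW_FILES_ONLY = 1
    INDEX_ALL_FILES = 2
    SEARCH_ONLY_INDEXED_FILES = 3
    SEARCH_INDEX_ONLY = 4


# index maps a file path to {"mtime": float, "words": set of words}
Index = Dict[str, dict]


def search(mode: Mode, word: str, index: Index,
           files_on_disk: Dict[str, float],
           extract_words: Callable[[str], Set[str]]) -> List[str]:
    """Return the paths whose index entry contains word, following the chosen mode.

    files_on_disk stands for the list compiled from the library folders
    (path -> modification time); extract_words stands for the indexer.
    """
    if mode is Mode.SEARCH_INDEX_ONLY:
        candidates = list(index)  # no scan: the files need not exist
    else:
        candidates = list(files_on_disk)
        for path in candidates:
            entry = index.get(path)
            if mode is Mode.INDEX_ALL_FILES:
                stale = True  # always delete and rebuild the index
            elif mode is Mode.INDEX_NEW_FILES_ONLY:
                stale = entry is None or entry["mtime"] != files_on_disk[path]
            else:
                stale = False  # option 3 never indexes anything
            if stale:
                index[path] = {"mtime": files_on_disk[path],
                               "words": extract_words(path)}
    return sorted(p for p in candidates if p in index and word in index[p]["words"])
```

In the first 3 modes the candidates come from the folder scan, which is why an indexed file that no longer exists can only show up in the fourth mode.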

Searching an existing index in Easy PDF Search

Easy PDF Search indexes your PDF files and allows you to search your files for keywords.  When you perform a search in Easy PDF Search, it first scans your library paths for PDF files.  New and modified files will be indexed, then only existing files are searched.

In some situations, you may not have the source PDF files with you, but only the Easy PDF Search index database.  Or you may not want Easy PDF Search to spend time scanning for existing files, but just want to search for keywords in the already indexed files.

In Easy PDF Search 2.1, we added the option to skip the file scanning process and directly search the existing index.  This is available under the Options menu.

Selecting the Search index only option will search the existing index and return the results, regardless of whether the file exists.

To recap the 4 options:

  • Index new files only
    This option scans the search folders defined in each library, indexes only the new and modified files it finds, then searches for keywords in those indexed files that exist.
  • Index all files
    This option scans the search folders defined in each library and indexes all the files it finds, deleting any existing index for each file.  It then searches for keywords in those indexed files that exist.
  • Search only indexed files
    This option scans the search folders defined in each library for files, and searches for keywords in those indexed files.  It ignores any new or modified files.
  • Search index only
    This option performs searches on the existing index, and does not scan to check if the indexed files exist.