{"id":546,"date":"2019-09-09T20:57:33","date_gmt":"2019-09-09T12:57:33","guid":{"rendered":"http:\/\/www.yohz.com\/blogs\/?p=546"},"modified":"2019-09-19T16:32:55","modified_gmt":"2019-09-19T08:32:55","slug":"extract-text-from-pdf-files","status":"publish","type":"post","link":"https:\/\/www.yohz.com\/blogs\/2019\/09\/09\/extract-text-from-pdf-files\/","title":{"rendered":"Extract text from PDF files"},"content":{"rendered":"<p><strong>Task<\/strong> &#8211; you need to extract text from your PDF files<\/p>\n<p><strong>Options<\/strong> &#8211; you can find hundreds of online sites that can do that for you.<\/p>\n<p><strong>Concern <\/strong>&#8211; your files are confidential, and you&#8217;re not sure if those sites are making copies of your files for &#8216;other&#8217; purposes.<\/p>\n<p><strong>Practicality<\/strong> &#8211; you want to extract text from hundreds or thousands of files, and processing each file online is going to be veeeeeery boring.<\/p>\n<p>Try <a href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a>, a Windows application that helps you extract text from your PDF files directly on your computer.<\/p>\n<h2>User interface<\/h2>\n<p>Easy PDF Explorer uses the familiar Windows Explorer interface, so you can easily navigate your folders and select your files.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-516\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split01.png\" alt=\"\" width=\"705\" height=\"593\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split01.png 705w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split01-300x252.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split01-624x525.png 624w\" sizes=\"(max-width: 705px) 100vw, 705px\" \/><\/p>\n<p>Select one or more PDF files, and <a href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a> will display the 
details of each file.\u00a0 This is one benefit of <a href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a> &#8211; it allows you to work with batches of PDF files easily.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-518\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split03.png\" alt=\"\" width=\"947\" height=\"274\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split03.png 947w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split03-300x87.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split03-768x222.png 768w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_split03-624x181.png 624w\" sizes=\"(max-width: 947px) 100vw, 947px\" \/><\/p>\n<h2>Extract text from PDF<\/h2>\n<p>When you want to start extracting text from your PDF files, click on the <strong>Extract text<\/strong>\u00a0button:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-547\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text01.png\" alt=\"\" width=\"712\" height=\"282\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text01.png 712w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text01-300x119.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text01-624x247.png 624w\" sizes=\"(max-width: 712px) 100vw, 712px\" \/><\/p>\n<p>This brings up the <strong>Extract Text<\/strong> window.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-551\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text04.png\" alt=\"\" width=\"696\" height=\"584\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text04.png 696w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text04-300x252.png 
300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text04-624x524.png 624w\" sizes=\"(max-width: 696px) 100vw, 696px\" \/><\/p>\n<p>You need to enter the folder you want to store the extracted text in.\u00a0 You also need to provide the naming convention for the extracted text files, and choose whether the text from each page should be saved to a different file.<\/p>\n<p>In this example, we will be storing the text from each file in its own folder.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-552\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text05.png\" alt=\"\" width=\"441\" height=\"85\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text05.png 441w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text05-300x58.png 300w\" sizes=\"(max-width: 441px) 100vw, 441px\" \/><\/p>\n<p>We use the <strong>&lt;FILENAME_NOEXT&gt;<\/strong> tag, so for a file named <strong>Accounting.pdf<\/strong>, all text from that file will be stored in the <strong>f:\\exports\\Accounting\\<\/strong> folder.<\/p>\n<p>We will use the default naming convention of <strong>&lt;FILENAME_NOEXT&gt;_text.txt.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-553\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text06.png\" alt=\"\" width=\"441\" height=\"89\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text06.png 441w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text06-300x61.png 300w\" sizes=\"(max-width: 441px) 100vw, 441px\" \/><\/p>\n<p>This uses the PDF file name and appends the <strong>_text<\/strong> value to it.\u00a0 So in our example, our extracted text will be stored in a file named <strong>Accounting_text.txt<\/strong>.<\/p>\n<p>Next, we need to choose if we want to store all the text from our PDF file into a single file, or 
separate them by pages into individual files.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-554\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text07.png\" alt=\"\" width=\"680\" height=\"244\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text07.png 680w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text07-300x108.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text07-624x224.png 624w\" sizes=\"(max-width: 680px) 100vw, 680px\" \/><\/p>\n<p>If we choose a single file as per the above screenshot, we can enter a page separator value.\u00a0 The default page separator will separate each page this way:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-555\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text08.png\" alt=\"\" width=\"636\" height=\"542\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text08.png 636w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text08-300x256.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text08-624x532.png 624w\" sizes=\"(max-width: 636px) 100vw, 636px\" \/><\/p>\n<p>If you choose to store the text from each page in a separate file, then you need to enter a suffix for each of the files.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-556 alignleft\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text09.png\" alt=\"\" width=\"315\" height=\"144\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text09.png 315w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text09-300x137.png 300w\" sizes=\"(max-width: 315px) 100vw, 315px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>The default suffix of 
<strong>_&lt;PAGENUMBER:0000&gt;<\/strong> will create files this way:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-557\" src=\"http:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text10.png\" alt=\"\" width=\"675\" height=\"309\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text10.png 675w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text10-300x137.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2019\/09\/epe_text10-624x286.png 624w\" sizes=\"(max-width: 675px) 100vw, 675px\" \/><\/p>\n<p>And that&#8217;s all there is to it.\u00a0 Use <a href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a> to extract text from your hundreds or thousands of PDF files, on <span style=\"text-decoration: underline;\">your<\/span> computer, securely and fast.<\/p>\n<p>&nbsp;<\/p>\n<h2>Other Easy PDF Explorer features<\/h2>\n<p>In addition to extracting text from your PDF files, <a href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a> can also:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.yohz.com\/blogs\/2019\/09\/09\/extract-images-from-pdf-files\/\">extract images from your PDF files<\/a><\/li>\n<li><a href=\"http:\/\/www.yohz.com\/blogs\/2019\/09\/09\/split-pdf\/\">split your PDF files<\/a><\/li>\n<li><a href=\"http:\/\/www.yohz.com\/blogs\/2019\/04\/14\/how-to-merge-pdf-files\/\">merge\/combine PDF files<\/a><\/li>\n<li><a href=\"http:\/\/www.yohz.com\/blogs\/2019\/04\/16\/how-to-convert-pdf-to-jpg-or-png\/\">export pages as JPEG, PNG, or bitmap images<\/a><\/li>\n<li>extract text and images from <a href=\"http:\/\/www.yohz.com\/blogs\/2019\/09\/19\/pdf-to-word-extraction\/\">PDF to Word<\/a><\/li>\n<li>search for text across multiple PDF files<\/li>\n<\/ul>\n<p><a href=\"http:\/\/www.yohz.com\/downloads\/easypdfexplorer\/EasyPDFExplorerSetup.zip\">Download<\/a> a 14-day trial now, and see how <a 
href=\"https:\/\/yohzapps.yohz.com\/epe_overview.htm\">Easy PDF Explorer<\/a> can help you work with your PDF files faster and more safely.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Task &#8211; you need to extract text from your PDF files Options &#8211; you can find hundreds of online sites that can do that for you. Concern &#8211; your files are confidential, and you&#8217;re not sure if those sites are making copies of your files for &#8216;other&#8217; purposes. Practicality &#8211; you want to extract text [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[52],"tags":[53,73,72],"_links":{"self":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts\/546"}],"collection":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/comments?post=546"}],"version-history":[{"count":4,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts\/546\/revisions"}],"predecessor-version":[{"id":584,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v
2\/posts\/546\/revisions\/584"}],"wp:attachment":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/media?parent=546"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/categories?post=546"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/tags?post=546"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}