{"id":2011,"date":"2023-08-11T16:28:55","date_gmt":"2023-08-11T08:28:55","guid":{"rendered":"https:\/\/www.yohz.com\/blogs\/?p=2011"},"modified":"2023-08-11T16:28:55","modified_gmt":"2023-08-11T08:28:55","slug":"searching-pdf-files-using-word-stemmers","status":"publish","type":"post","link":"https:\/\/www.yohz.com\/blogs\/2023\/08\/11\/searching-pdf-files-using-word-stemmers\/","title":{"rendered":"Searching PDF files using word stemmers"},"content":{"rendered":"<p><a href=\"https:\/\/www.yohz.com\/ya_eps_overview.htm\">Easy PDF Search<\/a> by default searches for complete words\/phrases in your PDF files.\u00a0 For example, if we search for the word <strong>like<\/strong>, only files containing that exact word are returned and highlighted in the search results.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2013\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer01-1.png\" alt=\"\" width=\"760\" height=\"495\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer01-1.png 760w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer01-1-300x195.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer01-1-624x406.png 624w\" sizes=\"(max-width: 760px) 100vw, 760px\" \/><\/p>\n<p>If we want to search for words starting with the word <strong>like<\/strong>, we can perform a prefix search using the <strong>*<\/strong> character e.g. 
<strong>like*<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2014\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer02.png\" alt=\"\" width=\"760\" height=\"487\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer02.png 760w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer02-300x192.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer02-624x400.png 624w\" sizes=\"(max-width: 760px) 100vw, 760px\" \/><\/p>\n<p>This returns all words with the prefix <strong>like<\/strong>.\u00a0 Grammatically unrelated words such as <strong>likelihood<\/strong> and <strong>likewise<\/strong> will be returned, while a related word such as <strong>liking<\/strong> will not, because it does not begin with <strong>like<\/strong>.<\/p>\n<h3>Stemmed words<\/h3>\n<p>Stemming is the process of removing part of a word to reduce it to its stem or root.\u00a0 In the example above, the words <strong>like<\/strong>, <strong>likes<\/strong>, <strong>liking<\/strong>, <strong>liked<\/strong>, and <strong>likely<\/strong> all share the same root word i.e. 
<strong>like<\/strong>.<\/p>\n<p>For <a href=\"https:\/\/www.yohz.com\/ya_eps_overview.htm\">Easy PDF Search<\/a> to match on stem words when searching, e.g.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2017\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer03-2.png\" alt=\"\" width=\"764\" height=\"491\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer03-2.png 764w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer03-2-300x193.png 300w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer03-2-624x401.png 624w\" sizes=\"(max-width: 764px) 100vw, 764px\" \/><\/p>\n<p>we first need to create a stem database, then search that database.<\/p>\n<h3>Creating a stem database<\/h3>\n<p>To create a stem database, click on the <strong>Options &gt; Stemmer language &gt; Settings<\/strong> item.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2019\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer04-1.png\" alt=\"\" width=\"434\" height=\"444\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer04-1.png 434w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer04-1-293x300.png 293w\" sizes=\"(max-width: 434px) 100vw, 434px\" \/><\/p>\n<p>In the <strong>Stemmer Settings<\/strong> window, select up to 5 languages to create a stem database for.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2020\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer05.png\" alt=\"\" width=\"538\" height=\"507\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer05.png 538w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer05-300x283.png 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" \/><\/p>\n<p>You can create stem databases for the 
following 27 languages:<\/p>\n<ul>\n<li>Armenian<\/li>\n<li>Basque<\/li>\n<li>Catalan<\/li>\n<li>Danish<\/li>\n<li>Dutch<\/li>\n<li>English<\/li>\n<li>Finnish<\/li>\n<li>French<\/li>\n<li>German<\/li>\n<li>Greek<\/li>\n<li>Hindi<\/li>\n<li>Hungarian<\/li>\n<li>Indonesian<\/li>\n<li>Irish<\/li>\n<li>Italian<\/li>\n<li>Lithuanian<\/li>\n<li>Nepali<\/li>\n<li>Norwegian<\/li>\n<li>Portuguese<\/li>\n<li>Romanian<\/li>\n<li>Russian<\/li>\n<li>Serbian<\/li>\n<li>Spanish<\/li>\n<li>Swedish<\/li>\n<li>Tamil<\/li>\n<li>Turkish<\/li>\n<li>Yiddish<\/li>\n<\/ul>\n<p>When you want to search the stem database, select the stem language you want to search in from the <strong>Options<\/strong> menu.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2021\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer06.png\" alt=\"\" width=\"411\" height=\"438\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer06.png 411w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer06-282x300.png 282w\" sizes=\"(max-width: 411px) 100vw, 411px\" \/><\/p>\n<p><a href=\"https:\/\/www.yohz.com\/ya_eps_overview.htm\">Easy PDF Search<\/a> then displays the stem language database that the search will be performed in.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2022\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer07.png\" alt=\"\" width=\"253\" height=\"306\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer07.png 253w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer07-248x300.png 248w\" sizes=\"(max-width: 253px) 100vw, 253px\" \/><\/p>\n<p>In the search results, the stem database that was searched will also be displayed.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2023\" 
src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer08.png\" alt=\"\" width=\"274\" height=\"377\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer08.png 274w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer08-218x300.png 218w\" sizes=\"(max-width: 274px) 100vw, 274px\" \/><\/p>\n<h3>Testing the stemmers<\/h3>\n<p>To test which words stem to the same root word, you can use the test utility in the <strong>Stemmer Settings<\/strong> window.\u00a0 Select the language you want to test, then click on the <strong>Test &#8230; stemmer<\/strong> tab.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2024\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer09.png\" alt=\"\" width=\"538\" height=\"507\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer09.png 538w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer09-300x283.png 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" \/><\/p>\n<p>Enter the search word, then the list of words whose root words you want to check against it.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2025\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer10.png\" alt=\"\" width=\"538\" height=\"507\" srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer10.png 538w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer10-300x283.png 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" \/><\/p>\n<p>Next, click on the <strong>Test<\/strong> button.\u00a0 Words whose root does not match the search word are displayed with a strikethrough.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2026\" src=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer11.png\" alt=\"\" width=\"538\" height=\"507\" 
srcset=\"https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer11.png 538w, https:\/\/www.yohz.com\/blogs\/wp-content\/uploads\/2023\/08\/stemmer11-300x283.png 300w\" sizes=\"(max-width: 538px) 100vw, 538px\" \/><\/p>\n<p><a href=\"https:\/\/www.yohz.com\/downloads\/easypdfsearch\/EasyPDFSearchSetup.zip\">Download<\/a> a 14-day trial of <a href=\"https:\/\/www.yohz.com\/ya_eps_overview.htm\">Easy PDF Search<\/a> now and experience how easy and fast it is to search your PDF files collection, now with the ability to perform stem word searches.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Easy PDF Search by default searches for complete words\/phrases in your PDF files.\u00a0 For example, if we search for the word like, only files containing that exact word are returned and highlighted in the search results. 
If we wanted to search for words starting with the word like, we can perform a prefix search using [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[111],"tags":[112,200,201],"_links":{"self":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts\/2011"}],"collection":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/comments?post=2011"}],"version-history":[{"count":1,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts\/2011\/revisions"}],"predecessor-version":[{"id":2027,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/posts\/2011\/revisions\/2027"}],"wp:attachment":[{"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/media?parent=2011"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/categories?post=2011"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.yohz.com\/blogs\/wp-json\/wp\/v2\/tags?post=2011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}