{"id":2131,"date":"2020-09-13T14:25:43","date_gmt":"2020-09-13T08:55:43","guid":{"rendered":"https:\/\/www.rangakrish.com\/?p=2131"},"modified":"2020-09-13T14:25:43","modified_gmt":"2020-09-13T08:55:43","slug":"mathematica-using-textcases-to-extract-information-from-natural-language-text","status":"publish","type":"post","link":"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/","title":{"rendered":"Mathematica: Using TextCases to Extract Information from Natural Language Text\u00a0"},"content":{"rendered":"<p>Extracting meaningful information from unstructured, human readable text is a hot topic of research today and has important applications in many domains. I have written a few blogs related to this topic, for example, see <a href=\"https:\/\/www.rangakrish.com\/index.php\/2019\/06\/24\/text-analysis-using-meaningclouds-deep-categorization-api\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em><strong>this<\/strong><\/em><\/a>\u00a0and <a href=\"https:\/\/www.rangakrish.com\/index.php\/2019\/07\/21\/custom-text-analysis-using-textrazors-prolog-engine\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em><strong>this<\/strong><\/em><\/a>.<\/p>\n<p>In today\u2019s article, I would like to show how <a href=\"https:\/\/www.wolfram.com\/mathematica\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em><strong>Mathematica<\/strong><\/em><\/a> can be a great help when working with natural language text.<\/p>\n<p><em><strong>Mathematica\u2019s Wolfram Language<\/strong> <\/em>has had, since release <em><strong>10.2<\/strong><\/em>, a function called <em><strong>\u201c<a href=\"https:\/\/reference.wolfram.com\/language\/ref\/TextCases.html\" target=\"_blank\" rel=\"noopener noreferrer\">TextCases<\/a>\u201d<\/strong><\/em>\u00a0that can find interesting syntactic and semantic patterns in natural language text. 
The good news is that its functionality is continuously being enhanced in each release. The current version <em><strong>12.1.1<\/strong><\/em> offers some incredible features that I haven\u2019t seen in any other framework or API.<\/p>\n<p>Let us start with a straightforward and common example. Given a paragraph, extract all the sentences contained in it. This functionality is available in many libraries.<\/p>\n<figure id=\"attachment_2147\" aria-describedby=\"caption-attachment-2147\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png?ssl=1\"><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" data-attachment-id=\"2147\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex1-3\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png\" data-orig-size=\"650,230\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Extracting Sentences\" data-image-description=\"&lt;p&gt;Extracting Sentences&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Extracting Sentences&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png\" class=\"wp-image-2147\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png?resize=500%2C177&#038;ssl=1\" alt=\"Extracting Sentences\" width=\"500\" height=\"177\" 
srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png?resize=300%2C106&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex1-2.png?w=650&amp;ssl=1 650w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2147\" class=\"wp-caption-text\"><strong>Extracting Sentences<\/strong><\/figcaption><\/figure>\n<p>No fancy stuff here. Next, let us ask for <em><strong>\u201cadjectives\u201d<\/strong><\/em> and <em><strong>\u201cproper nouns\u201d<\/strong><\/em> in the text.<\/p>\n<figure id=\"attachment_2134\" aria-describedby=\"caption-attachment-2134\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"2134\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex4\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png\" data-orig-size=\"650,103\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Adjectives\" data-image-description=\"&lt;p&gt;Adjectives&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Adjectives&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png\" class=\"wp-image-2134\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png?resize=500%2C79&#038;ssl=1\" alt=\"Adjectives\" 
width=\"500\" height=\"79\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png?resize=300%2C48&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex4.png?w=650&amp;ssl=1 650w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2134\" class=\"wp-caption-text\"><strong>Adjectives<\/strong><\/figcaption><\/figure>\n<p>The result shows that the function has correctly identified the requested <em><strong>POS<\/strong><\/em> words. This feature is also widely available.<\/p>\n<p>What about word groups, such as phrases? In the following, the function identifies the different <em><strong>\u201cnoun groups\u201d<\/strong><\/em>:<\/p>\n<figure id=\"attachment_2135\" aria-describedby=\"caption-attachment-2135\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"2135\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex5\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png\" data-orig-size=\"651,95\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Noun Groups\" data-image-description=\"&lt;p&gt;Noun Groups&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Noun Groups&lt;\/p&gt;\n\" 
data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png\" class=\"wp-image-2135\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png?resize=500%2C73&#038;ssl=1\" alt=\"Noun Groups\" width=\"500\" height=\"73\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png?resize=300%2C44&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex5.png?w=651&amp;ssl=1 651w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2135\" class=\"wp-caption-text\"><strong>Noun Groups<\/strong><\/figcaption><\/figure>\n<p>The following is an example of a <em><strong>\u201cquantifier phrase\u201d<\/strong><\/em>:<\/p>\n<figure id=\"attachment_2138\" aria-describedby=\"caption-attachment-2138\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2138\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex6\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png\" data-orig-size=\"647,152\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Quantifier Phrase\" data-image-description=\"&lt;p&gt;Quantifier Phrase&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Quantifier Phrase&lt;\/p&gt;\n\" 
data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png\" class=\"wp-image-2138\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png?resize=500%2C117&#038;ssl=1\" alt=\"Quantifier Phrase\" width=\"500\" height=\"117\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png?resize=300%2C70&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex6.png?w=647&amp;ssl=1 647w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2138\" class=\"wp-caption-text\"><strong>Quantifier Phrase<\/strong><\/figcaption><\/figure>\n<p>Note that the extracted item contains both a number and the corresponding measurement unit. We can also extract <em><strong>\u201cwh-adjective phrases\u201d<\/strong><\/em>:<\/p>\n<figure id=\"attachment_2139\" aria-describedby=\"caption-attachment-2139\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2139\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex7\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png\" data-orig-size=\"652,121\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"WH-Phrase\" data-image-description=\"&lt;p&gt;WH-Phrase&lt;\/p&gt;\n\" 
data-image-caption=\"&lt;p&gt;WH-Phrase&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png\" class=\"wp-image-2139\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png?resize=500%2C93&#038;ssl=1\" alt=\"WH-Phrase\" width=\"500\" height=\"93\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png?resize=300%2C56&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex7.png?w=652&amp;ssl=1 652w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2139\" class=\"wp-caption-text\"><strong>WH-Phrase<\/strong><\/figcaption><\/figure>\n<p>We know that <em><strong>\u201cfortunate\u201d<\/strong><\/em> is an adjective. The <em><strong>\u201cWH\u201d<\/strong><\/em> words are <em><strong>who, whose, whom, which, what, where, when, why, how<\/strong><\/em>. The phrase <em><strong>\u201chow fortunate\u201d<\/strong><\/em> is therefore a <em><strong>\u201cwh-adjective phrase\u201d<\/strong><\/em>.<\/p>\n<p>The above examples show how we can extract <em><strong>\u201csyntactic\u201d<\/strong><\/em> information from the given text. 
Let us look at some cases involving <em><strong>\u201csemantic\u201d<\/strong><\/em> patterns.<\/p>\n<p>The following example illustrates how we can identify sentences that mention a dog <em><strong>\u201cbreed\u201d<\/strong><\/em>.<\/p>\n<figure id=\"attachment_2140\" aria-describedby=\"caption-attachment-2140\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2140\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex3\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png\" data-orig-size=\"641,64\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Dog Breed\" data-image-description=\"&lt;p&gt;Dog Breed&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Dog Breed&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png\" class=\"wp-image-2140\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png?resize=500%2C50&#038;ssl=1\" alt=\"Dog Breed\" width=\"500\" height=\"50\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png?resize=300%2C30&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex3.png?w=641&amp;ssl=1 641w\" sizes=\"(max-width: 500px) 100vw, 
500px\" \/><\/a><figcaption id=\"caption-attachment-2140\" class=\"wp-caption-text\"><strong>Dog Breed<\/strong><\/figcaption><\/figure>\n<p>Here <em><strong>&#8220;doberman&#8221;<\/strong><\/em> is a dog breed. The function has correctly identified that. How about checking for positive or negative sentiments? We can do that too. Look at the following example.<\/p>\n<figure id=\"attachment_2141\" aria-describedby=\"caption-attachment-2141\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2141\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex2\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png\" data-orig-size=\"647,98\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Positive Sentiment\" data-image-description=\"&lt;p&gt;Positive Sentiment&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Positive Sentiment&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png\" class=\"wp-image-2141\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png?resize=500%2C76&#038;ssl=1\" alt=\"Positive Sentiment\" width=\"500\" height=\"76\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png?resize=300%2C45&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex2.png?w=647&amp;ssl=1 647w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2141\" class=\"wp-caption-text\"><strong>Positive Sentiment<\/strong><\/figcaption><\/figure>\n<p>Here we are asking for text that represents positive sentiment. The function has returned two fragments. I agree with the first one, but the second looks suspicious. In fact, I believe that the sentence <em><strong>\u201cShe is a hyperactive doberman\u201d<\/strong><\/em> is also a positive statement.<\/p>\n<p>The following is an interesting (and common) case, where we extract <em><strong>email addresses<\/strong><\/em> and web <em><strong>URLs<\/strong><\/em>:<\/p>\n<figure id=\"attachment_2142\" aria-describedby=\"caption-attachment-2142\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2142\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex8\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png\" data-orig-size=\"648,208\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Emails\" data-image-description=\"&lt;p&gt;Emails&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Emails&lt;\/p&gt;\n\" 
data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png\" class=\"wp-image-2142\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png?resize=500%2C160&#038;ssl=1\" alt=\"Emails\" width=\"500\" height=\"160\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png?resize=300%2C96&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex8.png?w=648&amp;ssl=1 648w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2142\" class=\"wp-caption-text\"><strong>Email Address<\/strong><\/figcaption><\/figure>\n<p>Works perfectly.<\/p>\n<p>Next, let us ask the function to identify <em><strong>country<\/strong><\/em> names and prominent <em><strong>persons<\/strong><\/em>.<\/p>\n<figure id=\"attachment_2143\" aria-describedby=\"caption-attachment-2143\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2143\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex9\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png\" data-orig-size=\"654,214\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Countries and Persons\" data-image-description=\"&lt;p&gt;Countries and Persons&lt;\/p&gt;\n\" 
data-image-caption=\"&lt;p&gt;Countries and Persons&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png\" class=\"wp-image-2143\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png?resize=500%2C164&#038;ssl=1\" alt=\"Countries and Persons\" width=\"500\" height=\"164\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png?resize=300%2C98&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex9.png?w=654&amp;ssl=1 654w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2143\" class=\"wp-caption-text\"><strong>Countries and Persons<\/strong><\/figcaption><\/figure>\n<p><em><strong>\u201cIndia\u201d<\/strong><\/em> appears twice because there are two occurrences of the word in the text.<\/p>\n<p>Here is an interesting example from the <em><strong>medical<\/strong><\/em> domain. The function is able to identify <em><strong>body parts<\/strong><\/em> and <em><strong>disease<\/strong><\/em> names present in the text:<\/p>\n<figure id=\"attachment_2144\" aria-describedby=\"caption-attachment-2144\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2144\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex10\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png\" data-orig-size=\"654,451\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Medical Domain Example\" data-image-description=\"&lt;p&gt;Medical Domain Example&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Medical Domain Example&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png\" class=\"wp-image-2144\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png?resize=500%2C345&#038;ssl=1\" alt=\"Medical Domain Example\" width=\"500\" height=\"345\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png?resize=300%2C207&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex10.png?w=654&amp;ssl=1 654w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2144\" class=\"wp-caption-text\"><strong>Medical Domain Example<\/strong><\/figcaption><\/figure>\n<p>Nice, isn\u2019t it?<\/p>\n<p>Our last example is sure to appeal to all. Here, we take the <em><strong>\u201cWikipedia\u201d<\/strong><\/em> text pertaining to <em><strong>\u201cYoga\u201d<\/strong><\/em> and extract words and phrases that are related to the concepts <em><strong>\u201cMythology\u201d<\/strong><\/em> or <em><strong>\u201cReligion\u201d<\/strong><\/em>. 
We then present this data as a <em><strong>word cloud<\/strong><\/em>:<\/p>\n<figure id=\"attachment_2145\" aria-describedby=\"caption-attachment-2145\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"2145\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/13\/mathematica-using-textcases-to-extract-information-from-natural-language-text\/ex11\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png\" data-orig-size=\"649,477\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Mythology and Religion\" data-image-description=\"&lt;p&gt;Mythology and Religion&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Mythology and Religion&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png\" class=\"wp-image-2145\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png?resize=500%2C367&#038;ssl=1\" alt=\"Mythology and Religion\" width=\"500\" height=\"367\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png?resize=300%2C220&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/ex11.png?w=649&amp;ssl=1 649w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/a><figcaption id=\"caption-attachment-2145\" class=\"wp-caption-text\"><strong>Mythology and 
Religion<\/strong><\/figcaption><\/figure>\n<p>There is a lot more that one can do with this function. If you are curious to know which content types are supported by the <em><strong>\u201cTextCases\u201d<\/strong><\/em> function, take a look at this <a href=\"https:\/\/reference.wolfram.com\/language\/guide\/TextContentTypes.html\" target=\"_blank\" rel=\"noopener noreferrer\"><em><strong>page<\/strong><\/em><\/a>.<\/p>\n<p>I hope you enjoyed this article. There are two other related functions in the Wolfram Language: <em><strong>\u201cTextPosition\u201d<\/strong><\/em>\u00a0and <em><strong>\u201cTextContents\u201d<\/strong><\/em>. I wrote an article on the latter some time ago. <a href=\"https:\/\/www.rangakrish.com\/index.php\/2019\/04\/21\/textcontents-function-in-mathematica-12\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em><strong>Check it out<\/strong><\/em><\/a>.<\/p>\n<p>Have a nice weekend!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Extracting meaningful information from unstructured, human readable text is a hot topic of research today and has important applications in many domains. I have written a few blogs related to this topic, for example, see this\u00a0and this. 
In today\u2019s article, I would like to show how Mathematica can be a great help when working with [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[72,107,17],"tags":[43,250,74,249],"class_list":["post-2131","post","type-post","status-publish","format-standard","hentry","category-mathematica","category-natural-language-processing","category-programming","tag-mathematica","tag-natiural-language-precessing","tag-nlp","tag-textcases"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9OLnF-yn","jetpack-related-posts":[{"id":328,"url":"https:\/\/www.rangakrish.com\/index.php\/2016\/09\/11\/natural-language-processing-in-mathematica\/","url_meta":{"origin":2131,"position":0},"title":"Natural Language Processing in Mathematica","author":"admin","date":"September 11, 2016","format":false,"excerpt":"Welcome back. Today I am going to share with you some of the nice capabilities of Mathematica in the area of Natural Language Processing (NLP). Let us start with words. What if we wish to know\u00a0the various definitions of the word image?\u00a0Here is the answer. 
Mathematica gives the various senses\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Word Definition","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=700%2C400 2x"},"classes":[]},{"id":1541,"url":"https:\/\/www.rangakrish.com\/index.php\/2019\/04\/21\/textcontents-function-in-mathematica-12\/","url_meta":{"origin":2131,"position":1},"title":"TextContents[ ] Function in Mathematica 12","author":"admin","date":"April 21, 2019","format":false,"excerpt":"Mathematica 12 was released a few days ago.\u00a0 It has been over a year since version 11.3 came out in March 2018. The long wait appears justified since the new release boasts of numerous improvements and new features across several areas. 
You may want to read this blog post\u00a0by Stephen\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Importing Text File","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/04\/FileImport.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/04\/FileImport.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/04\/FileImport.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":870,"url":"https:\/\/www.rangakrish.com\/index.php\/2018\/03\/25\/question-answering-in-mathematica\/","url_meta":{"origin":2131,"position":2},"title":"Question Answering in Mathematica","author":"admin","date":"March 25, 2018","format":false,"excerpt":"About 10 days ago, I received an update for Mathematica. The latest version is 11.3.0. As usual, I looked through the list of new features\u00a0in this release. 
There are several new features, but one of them attracted my attention immediately: There is a new function FindTextualAnswer\u00a0that, given a piece of\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Example 1","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/03\/Example1.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/03\/Example1.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/03\/Example1.png?resize=525%2C300 1.5x"},"classes":[]},{"id":2084,"url":"https:\/\/www.rangakrish.com\/index.php\/2020\/08\/16\/pattern-matching-comparing-elixir-and-mathematica\/","url_meta":{"origin":2131,"position":3},"title":"Pattern Matching: Comparing Elixir and Mathematica","author":"admin","date":"August 16, 2020","format":false,"excerpt":"One of the things I like about Elixir\u00a0is its support for patterns at the core language level, not through library functions as in most other languages. This contributes to writing cleaner code, in my opinion. 
\u00a0 Another environment that I am familiar with, namely Mathematica, boasts of (arguably) the most\u2026","rel":"","context":"In &quot;Elixir&quot;","block_context":{"text":"Elixir","link":"https:\/\/www.rangakrish.com\/index.php\/category\/elixir\/"},"img":{"alt_text":"Symbolic Expressions","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/08\/pattern-mm.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/08\/pattern-mm.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/08\/pattern-mm.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":409,"url":"https:\/\/www.rangakrish.com\/index.php\/2016\/11\/02\/working-with-linguistic-data-in-mathematica\/","url_meta":{"origin":2131,"position":4},"title":"Working with Linguistic Data in Mathematica","author":"admin","date":"November 2, 2016","format":false,"excerpt":"There are many interesting functions in Mathematica for working with language data, not just in English but in many other languages too. The DictionaryLookup[] function is a good starting point. Let us see what languages are supported as part of dictionary lookup: That is a good collection. 
It is nice\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Supported Languages","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/11\/dict-1.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/11\/dict-1.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/11\/dict-1.png?resize=525%2C300 1.5x"},"classes":[]},{"id":3545,"url":"https:\/\/www.rangakrish.com\/index.php\/2024\/11\/09\/semantic-search-in-wolfram-mathematica\/","url_meta":{"origin":2131,"position":5},"title":"Semantic Search in Wolfram Mathematica","author":"admin","date":"November 9, 2024","format":false,"excerpt":"In an earlier article, I explained how to use OpenAI from Wolfram Mathematica ver 14.1. This latest release of Wolfram supports Semantic Search as well. In today\u2019s article, let me discuss this feature. 
As in the case of using LLMs, using Semantic Search requires an account with one of the\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Remedy Description","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2024\/11\/image1-300x225.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2024\/11\/image1-300x225.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2024\/11\/image1-300x225.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/2131","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/comments?post=2131"}],"version-history":[{"count":0,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/2131\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/media?parent=2131"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/categories?post=2131"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/tags?post=2131"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}