{"id":1817,"date":"2019-12-08T10:22:54","date_gmt":"2019-12-08T04:52:54","guid":{"rendered":"https:\/\/www.rangakrish.com\/?p=1817"},"modified":"2019-12-08T10:24:29","modified_gmt":"2019-12-08T04:54:29","slug":"using-definite-clause-grammars-dcg-for-information-extraction","status":"publish","type":"post","link":"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/","title":{"rendered":"Using Definite Clause Grammars (DCG) for Information Extraction"},"content":{"rendered":"<p>In the previous <a href=\"https:\/\/www.rangakrish.com\/index.php\/2019\/11\/23\/using-augmented-transition-networks-atn-for-information-extraction\/\" target=\"_blank\" rel=\"noopener\"><em><strong>article<\/strong><\/em><\/a>, I showed how we can use <em><strong>ATNs<\/strong><\/em> for extracting key information from natural language text. I also pointed out in that article that <em><strong>Definite Clause Grammars (DCG)<\/strong><\/em> are a more compact formalism for doing this. That will be the focus of today&#8217;s article.<\/p>\n<p>For a nice introduction to <em><strong>DCG<\/strong><\/em>, read <a href=\"https:\/\/www.google.com\/url?sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=9&amp;cad=rja&amp;uact=8&amp;ved=2ahUKEwiwzOWcoKXmAhV6zDgGHWZlAWgQFjAIegQIARAB&amp;url=http%3A%2F%2Fwww.learnprolognow.org%2Flpnpage.php%3Fpagetype%3Dhtml%26pageid%3Dlpn-htmlse29&amp;usg=AOvVaw3TXsGDBjuCB6Vmz5VjI3aC\" target=\"_blank\" rel=\"noopener\"><em><strong>this<\/strong><\/em><\/a>.<\/p>\n<p>Let us first define the <em><strong>ATN<\/strong><\/em> arc primitives in <em><strong>DCG<\/strong><\/em>. 
Here are the definitions:<\/p>\n<figure id=\"attachment_1818\" aria-describedby=\"caption-attachment-1818\" style=\"width: 613px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg?ssl=1\"><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" data-attachment-id=\"1818\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/atn-primitives\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg\" data-orig-size=\"613,674\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575640351&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"ATN Primitives\" data-image-description=\"&lt;p&gt;ATN Primitives&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;ATN Primitives&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg\" class=\"size-full wp-image-1818\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg?resize=613%2C674&#038;ssl=1\" alt=\"ATN Primitives\" width=\"613\" height=\"674\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg?w=613&amp;ssl=1 613w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/ATN-Primitives.jpg?resize=273%2C300&amp;ssl=1 273w\" sizes=\"(max-width: 613px) 100vw, 613px\" \/><\/a><figcaption id=\"caption-attachment-1818\" class=\"wp-caption-text\"><strong>ATN Primitives in 
DCG<\/strong><\/figcaption><\/figure>\n<p>The predicate <em><strong>is_cat<\/strong><\/em> interfaces with the lexicon to determine the part-of-speech category of the given word. Here is a simple grammar that demonstrates the use of <em><strong>wrd<\/strong><\/em>\u00a0and <em><strong>cat<\/strong><\/em>\u00a0primitives:<\/p>\n<figure id=\"attachment_1819\" aria-describedby=\"caption-attachment-1819\" style=\"width: 306px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"1819\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/simple-grammar\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg\" data-orig-size=\"306,82\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575716370&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"A Simple Grammar\" data-image-description=\"&lt;p&gt;A Simple Grammar&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;A Simple Grammar&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg\" class=\"size-full wp-image-1819\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg?resize=306%2C82&#038;ssl=1\" alt=\"A Simple Grammar\" width=\"306\" height=\"82\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg?w=306&amp;ssl=1 306w, 
https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Simple-Grammar.jpg?resize=300%2C80&amp;ssl=1 300w\" sizes=\"(max-width: 306px) 100vw, 306px\" \/><\/a><figcaption id=\"caption-attachment-1819\" class=\"wp-caption-text\"><strong>A Simple Grammar<\/strong><\/figcaption><\/figure>\n<p>The sentence <em><strong>&#8220;The dog ran fast&#8221;<\/strong><\/em> is accepted by the above grammar:<\/p>\n<figure id=\"attachment_1820\" aria-describedby=\"caption-attachment-1820\" style=\"width: 428px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"1820\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/using-the-grammar\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg\" data-orig-size=\"428,50\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575716715&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Applying the Grammar\" data-image-description=\"&lt;p&gt;Applying the Grammar&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Applying the Grammar&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg\" class=\"size-full wp-image-1820\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg?resize=428%2C50&#038;ssl=1\" alt=\"Applying the Grammar\" width=\"428\" height=\"50\" 
srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg?w=428&amp;ssl=1 428w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Using-the-grammar.jpg?resize=300%2C35&amp;ssl=1 300w\" sizes=\"(max-width: 428px) 100vw, 428px\" \/><\/a><figcaption id=\"caption-attachment-1820\" class=\"wp-caption-text\"><strong>Applying the Grammar<\/strong><\/figcaption><\/figure>\n<p>In the second sentence, the verb <em><strong>&#8220;chased&#8221;<\/strong><\/em> is not followed by an <em><strong>adverb<\/strong><\/em> and hence it is not accepted.<\/p>\n<p>The above grammar applies to the complete sentence, which is how we normally define and use <em><strong>DCG<\/strong><\/em>.<\/p>\n<p>How can we use DCG to parse parts of a sentence? Additionally, how do we <em><strong>extract<\/strong><\/em> items of interest from a sentence? The following grammar identifies simple <em><strong>VP<\/strong><\/em> chunks:<\/p>\n<figure id=\"attachment_1822\" aria-describedby=\"caption-attachment-1822\" style=\"width: 436px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1822\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/vp-chunk-grammar\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg\" data-orig-size=\"436,85\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575717012&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Grammar for VP Chunks\" data-image-description=\"&lt;p&gt;Grammar for VP Chunks&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Grammar for VP Chunks&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg\" class=\"size-full wp-image-1822\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg?resize=436%2C85&#038;ssl=1\" alt=\"Grammar for VP Chunks\" width=\"436\" height=\"85\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg?w=436&amp;ssl=1 436w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunk-grammar.jpg?resize=300%2C58&amp;ssl=1 300w\" sizes=\"(max-width: 436px) 100vw, 436px\" \/><\/a><figcaption id=\"caption-attachment-1822\" class=\"wp-caption-text\"><strong>Grammar for VP Chunks<\/strong><\/figcaption><\/figure>\n<p>We have used an additional argument in each grammar rule to retrieve the desired data during parsing. 
Here is an example using the chunking grammar:<\/p>\n<figure id=\"attachment_1823\" aria-describedby=\"caption-attachment-1823\" style=\"width: 390px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1823\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/vp-chunk-example1\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg\" data-orig-size=\"390,39\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575717131&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"VP Chunk Example\" data-image-description=\"\" data-image-caption=\"&lt;p&gt;VP Chunk Example&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg\" class=\"size-full wp-image-1823\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg?resize=390%2C39&#038;ssl=1\" alt=\"VP Chunk Example\" width=\"390\" height=\"39\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg?w=390&amp;ssl=1 390w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example1.jpg?resize=300%2C30&amp;ssl=1 300w\" sizes=\"(max-width: 390px) 100vw, 390px\" \/><\/a><figcaption id=\"caption-attachment-1823\" class=\"wp-caption-text\"><strong>VP Chunk 
Example<\/strong><\/figcaption><\/figure>\n<p>As expected, the grammar correctly identifies the <em><strong>VP<\/strong><\/em> chunk <em><strong>&#8220;ran fast&#8221;<\/strong><\/em>. What happens if we process the sentence <em><strong>&#8220;He ran fast and ate well&#8221;<\/strong><\/em>? See below.<\/p>\n<figure id=\"attachment_1824\" aria-describedby=\"caption-attachment-1824\" style=\"width: 544px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1824\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/vp-chunk-example2\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg\" data-orig-size=\"544,41\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575729856&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"VP Chunk Example-2\" data-image-description=\"&lt;p&gt;VP Chunk Example-2&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;VP Chunk Example-2&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg\" class=\"size-full wp-image-1824\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg?resize=544%2C41&#038;ssl=1\" alt=\"VP Chunk Example-2\" width=\"544\" height=\"41\" 
srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg?w=544&amp;ssl=1 544w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/vp-chunk-example2.jpg?resize=300%2C23&amp;ssl=1 300w\" sizes=\"(max-width: 544px) 100vw, 544px\" \/><\/a><figcaption id=\"caption-attachment-1824\" class=\"wp-caption-text\"><strong>VP Chunk Example-2<\/strong><\/figcaption><\/figure>\n<p>Interesting. We get the trailing <em><strong>VP<\/strong><\/em> chunk, and not the first one, because when we invoked the predicate we indicated that no tokens should remain after the match. We can change that easily. Here is a predicate that collects all chunks:<\/p>\n<figure id=\"attachment_1825\" aria-describedby=\"caption-attachment-1825\" style=\"width: 449px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1825\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/getting-all-vp-chunks\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg\" data-orig-size=\"449,65\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575735077&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Getting All VP Chunks\" data-image-description=\"&lt;p&gt;Getting All VP Chunks&lt;\/p&gt;\n\" 
data-image-caption=\"&lt;p&gt;Getting All VP Chunks&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg\" class=\"size-full wp-image-1825\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg?resize=449%2C65&#038;ssl=1\" alt=\"Getting All VP Chunks\" width=\"449\" height=\"65\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg?w=449&amp;ssl=1 449w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-VP-chunks.jpg?resize=300%2C43&amp;ssl=1 300w\" sizes=\"(max-width: 449px) 100vw, 449px\" \/><\/a><figcaption id=\"caption-attachment-1825\" class=\"wp-caption-text\"><strong>Getting All VP Chunks<\/strong><\/figcaption><\/figure>\n<p>When we apply this on the same sentence, we get both the VP chunks:<\/p>\n<figure id=\"attachment_1827\" aria-describedby=\"caption-attachment-1827\" style=\"width: 549px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1827\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/getting-all-chunks\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg\" data-orig-size=\"549,39\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575795471&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" 
data-image-title=\"Getting All Chunks Example\" data-image-description=\"&lt;p&gt;Getting All Chunks Example&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Getting All Chunks Example&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg\" class=\"size-full wp-image-1827\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg?resize=549%2C39&#038;ssl=1\" alt=\"Getting All Chunks Example\" width=\"549\" height=\"39\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg?w=549&amp;ssl=1 549w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Getting-all-chunks.jpg?resize=300%2C21&amp;ssl=1 300w\" sizes=\"(max-width: 549px) 100vw, 549px\" \/><\/a><figcaption id=\"caption-attachment-1827\" class=\"wp-caption-text\"><strong>Getting All Chunks Example<\/strong><\/figcaption><\/figure>\n<h3>Using Registers<\/h3>\n<p>In the <em><strong>ATN<\/strong><\/em> implementation, we used registers as part of the structure building process. <em><strong>DCGs<\/strong><\/em> allow us to define additional arguments to suit our requirements and hence a separate <em><strong>Register<\/strong><\/em> system is not needed. However, it is easy to define a set of predicates to support the use of <em><strong>Registers<\/strong><\/em>. 
Here is the code:<\/p>\n<figure id=\"attachment_1828\" aria-describedby=\"caption-attachment-1828\" style=\"width: 610px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1828\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/register-ops\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg\" data-orig-size=\"610,273\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575735333&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Support for Registers\" data-image-description=\"&lt;p&gt;Support for Registers&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Support for Registers&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg\" class=\"size-full wp-image-1828\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg?resize=610%2C273&#038;ssl=1\" alt=\"Support for Registers\" width=\"610\" height=\"273\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg?w=610&amp;ssl=1 610w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Register-Ops.jpg?resize=300%2C134&amp;ssl=1 300w\" sizes=\"(max-width: 610px) 100vw, 610px\" \/><\/a><figcaption id=\"caption-attachment-1828\" class=\"wp-caption-text\"><strong>Support for 
Registers<\/strong><\/figcaption><\/figure>\n<p>It is possible to add more features, but I just wanted to give a hint as to how it can be done. The following grammar corresponding to <em><strong>VP<\/strong><\/em> chunks uses the <em><strong>Register<\/strong><\/em> system.<\/p>\n<figure id=\"attachment_1829\" aria-describedby=\"caption-attachment-1829\" style=\"width: 547px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1829\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/vp-chunks-with-registers\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg\" data-orig-size=\"547,126\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575736021&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Extracting VP Chunks Using Registers\" data-image-description=\"&lt;p&gt;Extracting VP Chunks Using Registers&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Extracting VP Chunks Using Registers&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg\" class=\"size-full wp-image-1829\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg?resize=547%2C126&#038;ssl=1\" alt=\"Extracting VP Chunks Using Registers\" width=\"547\" height=\"126\" 
srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg?w=547&amp;ssl=1 547w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks-with-registers.jpg?resize=300%2C69&amp;ssl=1 300w\" sizes=\"(max-width: 547px) 100vw, 547px\" \/><\/a><figcaption id=\"caption-attachment-1829\" class=\"wp-caption-text\"><strong>Extracting VP Chunks Using Registers<\/strong><\/figcaption><\/figure>\n<p>Here is the same sentence as before, but parsed as per the revised grammar:<\/p>\n<figure id=\"attachment_1830\" aria-describedby=\"caption-attachment-1830\" style=\"width: 557px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1830\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/vp-chunks2-example\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg\" data-orig-size=\"557,37\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575736153&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Parsing Using Registers\" data-image-description=\"&lt;p&gt;Parsing Using Registers&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Parsing Using Registers&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg\" class=\"size-full wp-image-1830\" 
src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg?resize=557%2C37&#038;ssl=1\" alt=\"Parsing Using Registers\" width=\"557\" height=\"37\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg?w=557&amp;ssl=1 557w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/VP-chunks2-example.jpg?resize=300%2C20&amp;ssl=1 300w\" sizes=\"(max-width: 557px) 100vw, 557px\" \/><\/a><figcaption id=\"caption-attachment-1830\" class=\"wp-caption-text\"><strong>Parsing Using Registers<\/strong><\/figcaption><\/figure>\n<p>You can see that the result is the same.<\/p>\n<h3>Information Extraction Example: Homeopathy<\/h3>\n<p>Now that you have a basic understanding of how to extract relevant information from a piece of text, let us look at a more interesting example. As in the previous <a href=\"https:\/\/www.rangakrish.com\/index.php\/2019\/11\/23\/using-augmented-transition-networks-atn-for-information-extraction\/\" target=\"_blank\" rel=\"noopener\"><em><strong>article<\/strong><\/em><\/a>, let us try to extract the <em><strong>Age<\/strong><\/em> and <em><strong>Gender<\/strong><\/em> of the patient and the <em><strong>Modalities<\/strong><\/em> of the disease from a (simplified) homeopathic case record.<\/p>\n<p>We will work with this text:<\/p>\n<figure id=\"attachment_1831\" aria-describedby=\"caption-attachment-1831\" style=\"width: 386px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1831\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/example-text\/\" 
data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg\" data-orig-size=\"386,84\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575641431&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Homeopathy Case Text\" data-image-description=\"&lt;p&gt;Homeopathy Case Text&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Homeopathy Case Text&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg\" class=\"size-full wp-image-1831\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg?resize=386%2C84&#038;ssl=1\" alt=\"Homeopathy Case Text\" width=\"386\" height=\"84\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg?w=386&amp;ssl=1 386w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Example-text.jpg?resize=300%2C65&amp;ssl=1 300w\" sizes=\"(max-width: 386px) 100vw, 386px\" \/><\/a><figcaption id=\"caption-attachment-1831\" class=\"wp-caption-text\"><strong>Homeopathy Case Text<\/strong><\/figcaption><\/figure>\n<p>This text is stored in the file <em><strong>\u201csample-text.txt\u201d<\/strong><\/em>. 
Here is the grammar to extract <em><strong>Age<\/strong><\/em>:<\/p>\n<figure id=\"attachment_1832\" aria-describedby=\"caption-attachment-1832\" style=\"width: 585px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1832\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/age-pattern\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg\" data-orig-size=\"585,149\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575640604&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Age Pattern\" data-image-description=\"&lt;p&gt;Age Pattern&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Age Pattern&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg\" class=\"size-full wp-image-1832\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg?resize=585%2C149&#038;ssl=1\" alt=\"Age Pattern\" width=\"585\" height=\"149\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg?w=585&amp;ssl=1 585w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Age-pattern.jpg?resize=300%2C76&amp;ssl=1 300w\" sizes=\"(max-width: 585px) 100vw, 585px\" \/><\/a><figcaption id=\"caption-attachment-1832\" class=\"wp-caption-text\"><strong>Extracting 
Age\u00a0<\/strong><\/figcaption><\/figure>\n<p><span style=\"font-size: 16px;\">And here is the grammar to extract <em><strong>Gender<\/strong><\/em>:<\/span><\/p>\n<figure id=\"attachment_1833\" aria-describedby=\"caption-attachment-1833\" style=\"width: 649px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1833\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/gender-pattern\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg\" data-orig-size=\"649,106\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575640582&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Extracting Gender\" data-image-description=\"&lt;p&gt;Extracting Gender&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Extracting Gender&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg\" class=\"size-full wp-image-1833\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg?resize=649%2C106&#038;ssl=1\" alt=\"Extracting Gender\" width=\"649\" height=\"106\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg?w=649&amp;ssl=1 649w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Gender-pattern.jpg?resize=300%2C49&amp;ssl=1 300w\" sizes=\"(max-width: 649px) 100vw, 
649px\" \/><\/a><figcaption id=\"caption-attachment-1833\" class=\"wp-caption-text\"><strong>Extracting Gender<\/strong><\/figcaption><\/figure>\n<p><em><strong>Modality<\/strong><\/em> pattern is only slightly more involved than the above:<\/p>\n<figure id=\"attachment_1834\" aria-describedby=\"caption-attachment-1834\" style=\"width: 611px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1834\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/modality-pattern\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg\" data-orig-size=\"611,229\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575640655&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Extracting Modalities of Disease\" data-image-description=\"&lt;p&gt;Extracting Modalities of Disease&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Extracting Modalities of Disease&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg\" class=\"size-full wp-image-1834\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg?resize=611%2C229&#038;ssl=1\" alt=\"Extracting Modalities of Disease\" width=\"611\" height=\"229\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg?w=611&amp;ssl=1 611w, 
https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Modality-pattern.jpg?resize=300%2C112&amp;ssl=1 300w\" sizes=\"(max-width: 611px) 100vw, 611px\" \/><\/a><figcaption id=\"caption-attachment-1834\" class=\"wp-caption-text\"><strong>Extracting Modalities of Disease<\/strong><\/figcaption><\/figure>\n<p>When we apply the above patterns to the sample text, this is what we get as output:<\/p>\n<figure id=\"attachment_1835\" aria-describedby=\"caption-attachment-1835\" style=\"width: 663px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1835\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/file-example\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg\" data-orig-size=\"663,79\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575641308&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Applying Patterns to Sample Text\" data-image-description=\"&lt;p&gt;Applying Patterns to Sample Text&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Applying Patterns to Sample Text&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg\" class=\"size-full wp-image-1835\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg?resize=663%2C79&#038;ssl=1\" alt=\"Applying Patterns to Sample Text\" width=\"663\" 
height=\"79\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg?w=663&amp;ssl=1 663w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/File-Example.jpg?resize=300%2C36&amp;ssl=1 300w\" sizes=\"(max-width: 663px) 100vw, 663px\" \/><\/a><figcaption id=\"caption-attachment-1835\" class=\"wp-caption-text\"><strong>Applying Patterns to Sample Text<\/strong><\/figcaption><\/figure>\n<p>The actual <em><strong>Prolog<\/strong><\/em> code that does the processing is given below:<\/p>\n<figure id=\"attachment_1836\" aria-describedby=\"caption-attachment-1836\" style=\"width: 594px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg?ssl=1\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"1836\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2019\/12\/08\/using-definite-clause-grammars-dcg-for-information-extraction\/processing-file-code\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg\" data-orig-size=\"594,399\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;Admin&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;1575792803&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Processing the Text\" data-image-description=\"&lt;p&gt;Processing the Text&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Processing the Text&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg\" class=\"size-full wp-image-1836\" 
src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg?resize=594%2C399&#038;ssl=1\" alt=\"Processing the Text\" width=\"594\" height=\"399\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg?w=594&amp;ssl=1 594w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/12\/Processing-file-code.jpg?resize=300%2C202&amp;ssl=1 300w\" sizes=\"(max-width: 594px) 100vw, 594px\" \/><\/a><figcaption id=\"caption-attachment-1836\" class=\"wp-caption-text\"><strong>Processing the Text<\/strong><\/figcaption><\/figure>\n<p>In order to save space, I have not included the predicates that tokenize the input text. That part is simple and straightforward.<\/p>\n<p>As you would have gathered from the discussion so far, <em><strong>DCGs<\/strong><\/em> are a powerful formalism for processing both structured and unstructured text. All we need is a set of patterns to work with. The built-in backtracking mechanism of the <em><strong>Prolog<\/strong><\/em> engine makes the declarative model elegant and expressive.<\/p>\n<p>I have implemented the above logic in <a href=\"https:\/\/sicstus.sics.se\/\" target=\"_blank\" rel=\"noopener\"><em><strong>Sicstus Prolog<\/strong><\/em><\/a>\u00a0on Windows.<\/p>\n<p>Have a great weekend!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous article, I showed how we can use ATNs for extracting key information from natural language text. I also pointed out in that article that Definite Clause Grammars (DCG) are a more compact formalism for doing this. That will be the focus of today&#8217;s article. For a nice introduction to DCG, read this. 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[107,17,147],"tags":[100,213,212],"class_list":["post-1817","post","type-post","status-publish","format-standard","hentry","category-natural-language-processing","category-programming","category-prolog","tag-dcg","tag-definite-clause-grammars","tag-information-extraction"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9OLnF-tj","jetpack-related-posts":[{"id":2832,"url":"https:\/\/www.rangakrish.com\/index.php\/2022\/06\/12\/definite-clause-grammars-in-lisp-part-4\/","url_meta":{"origin":1817,"position":0},"title":"Definite Clause Grammars in Lisp &#8211; Part 4","author":"admin","date":"June 12, 2022","format":false,"excerpt":"In a series of articles\u00a0written earlier, I had shown how it is possible to model Definite Clause Grammars (DCG) in LispWorks Lisp (Enterprise Edition). We use defgrammar\u00a0in Common Prolog (available as part of KnowledgeWorks package) to define our grammar rules. 
Here is a toy English grammar represented using defgrammar: This\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"DCG Using Defgrammar","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2022\/06\/defgrammar-version-300x177.jpg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2022\/06\/defgrammar-version-300x177.jpg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2022\/06\/defgrammar-version-300x177.jpg?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":534,"url":"https:\/\/www.rangakrish.com\/index.php\/2017\/05\/22\/definite-clause-grammars-dcg-in-lisp\/","url_meta":{"origin":1817,"position":1},"title":"Definite Clause Grammars (DCG) in Lisp","author":"admin","date":"May 22, 2017","format":false,"excerpt":"Definite Clause Grammars (DCG) are an elegant formalism for specifying context free grammars, and part of their popularity is due to their support in the Prolog language. Most books on Natural Language processing usually include a brief coverage of DCGs, even though Natural languages are not context-free. 
Because of the\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"DCG Grammar","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/05\/DCG-Grammar.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/05\/DCG-Grammar.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/05\/DCG-Grammar.png?resize=525%2C300 1.5x"},"classes":[]},{"id":541,"url":"https:\/\/www.rangakrish.com\/index.php\/2017\/06\/04\/definite-clause-grammars-in-lisp-part-2\/","url_meta":{"origin":1817,"position":2},"title":"Definite Clause Grammars in Lisp &#8211; Part 2","author":"admin","date":"June 4, 2017","format":false,"excerpt":"In the last post, I showed how we can implement DCGs in LispWorks using the KnowledgeWorks package. The grammar discussed in that post did not take into account subject\/predicate number agreement. This is one of the basic constraints in English grammar. Today I will show how easy it is to\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Prolog Grammar","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/06\/Prolog-Grammar.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1659,"url":"https:\/\/www.rangakrish.com\/index.php\/2019\/08\/04\/generating-poetry-in-prolog\/","url_meta":{"origin":1817,"position":3},"title":"Generating Poetry in Prolog","author":"admin","date":"August 4, 2019","format":false,"excerpt":"In an earlier article, I showed how we can generate poetry (with limitations, of course!) using my iLangGen framework. That implementation (in Lisp) made use of iLexicon, a large dictionary of English words, which I have been building over the years. 
I subsequently ported iLexicon to Prolog and it now\u2026","rel":"","context":"In &quot;Natural Language Processing&quot;","block_context":{"text":"Natural Language Processing","link":"https:\/\/www.rangakrish.com\/index.php\/category\/natural-language-processing\/"},"img":{"alt_text":"Generation Logic","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":548,"url":"https:\/\/www.rangakrish.com\/index.php\/2017\/06\/23\/definite-clause-grammars-in-lisp-part-3\/","url_meta":{"origin":1817,"position":4},"title":"Definite Clause Grammars in Lisp &#8211; Part 3","author":"admin","date":"June 23, 2017","format":false,"excerpt":"In today's post, let us see how we can enhance the grammar representation discussed so far to include both Number constraint and Parse Tree. Fortunately, this turns out to be quite straightforward. 
Just as we do in Prolog, we need to include additional parameters, as needed, to each grammar rule.\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"POS Functions","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/06\/POS-Function.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/06\/POS-Function.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/06\/POS-Function.png?resize=525%2C300 1.5x"},"classes":[]},{"id":1711,"url":"https:\/\/www.rangakrish.com\/index.php\/2019\/09\/01\/poetry-in-prolog-part-2\/","url_meta":{"origin":1817,"position":5},"title":"Poetry in Prolog: Part-2","author":"admin","date":"September 1, 2019","format":false,"excerpt":"In an earlier post, I showed how Prolog can be used to generate poetry, making use of my \"iLexicon\". I want to continue the discussion today by giving another example, this time based on the theme of sounds emitted by various animals and birds. 
As hinted in my previous articles,\u2026","rel":"","context":"In &quot;Natural Language Processing&quot;","block_context":{"text":"Natural Language Processing","link":"https:\/\/www.rangakrish.com\/index.php\/category\/natural-language-processing\/"},"img":{"alt_text":"The DCG Grammar","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/09\/code.jpg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/09\/code.jpg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/09\/code.jpg?resize=525%2C300&ssl=1 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/1817","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/comments?post=1817"}],"version-history":[{"count":0,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/1817\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/media?parent=1817"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/categories?post=1817"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/tags?post=1817"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}