{"id":575,"date":"2017-08-06T02:25:55","date_gmt":"2017-08-06T02:25:55","guid":{"rendered":"http:\/\/www.rangakrish.com\/?p=575"},"modified":"2017-08-06T02:29:15","modified_gmt":"2017-08-06T02:29:15","slug":"text-generation-using-ilanggen-framework","status":"publish","type":"post","link":"https:\/\/www.rangakrish.com\/index.php\/2017\/08\/06\/text-generation-using-ilanggen-framework\/","title":{"rendered":"Text Generation Using iLangGen Framework"},"content":{"rendered":"<p>The two primary areas in Natural Language Processing are <em><strong>Natural Language Understanding<\/strong><\/em> and <em><strong>Natural Language Generation<\/strong><\/em>. The former is concerned with processing and <em><strong>making sense of<\/strong><\/em> natural language text, whereas the latter is concerned with <em><strong>synthesizing text<\/strong><\/em>, possibly from some <em><strong>deep<\/strong><\/em> representation. Both are fascinating and, at the same time, challenging areas of research. The good news is that both these areas have moved from research into the mainstream today.<\/p>\n<p>I have been fortunate enough to be associated with both these areas for many years (one of my first projects was to implement an ATN parser in Lisp &#8211; in the year 1987). Alongside my other project commitments, over the years, I have been gradually building three core components of NLP:<\/p>\n<p style=\"padding-left: 30px;\">&#8211; A lexicon<\/p>\n<p style=\"padding-left: 30px;\">&#8211; A parser\/chunking engine<\/p>\n<p style=\"padding-left: 30px;\">&#8211; A text generation framework<\/p>\n<p>Even though <em><strong>Machine Learning<\/strong><\/em> is widely used in the area of text processing, I believe that an intelligent lexicon has its own uses in parsing and generation. I hope my lexicon will be ready for commercial use in the near future. 
I have been using it as an integral component of my chunking engine as well as the text generation system.<\/p>\n<p>I hope to write more on all three projects in future posts, but today I would like to talk about <em><strong>iLangGen<\/strong><\/em>, a text generation framework that I have\u00a0implemented in Common Lisp.<\/p>\n<p>At its core, <em><strong>iLangGen<\/strong><\/em> uses a BNF-like grammar formalism to model the surface structure of English text (at present, it is limited to English, but it is possible to extend it to other languages too). It is quite feature-rich; for example, it can build new grammars from existing grammars using composition and inheritance techniques. I will be giving examples of these in future posts.<\/p>\n<p>Although the primary use of <em><strong>iLangGen<\/strong><\/em> is likely to be generating text in natural language, another interesting use case is generating test cases for an application such as a compiler. We can build a grammar (a non-trivial exercise) to generate sample programs that can be used as test inputs for a compiler.<\/p>\n<p>OK, end of introduction. 
Let us look at a sample grammar:<\/p>\n<figure id=\"attachment_576\" aria-describedby=\"caption-attachment-576\" style=\"width: 568px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png\"><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" data-attachment-id=\"576\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2017\/08\/06\/text-generation-using-ilanggen-framework\/blog1\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png\" data-orig-size=\"568,264\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"iLangGen Grammar\" data-image-description=\"&lt;p&gt;iLangGen Grammar&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;iLangGen Grammar&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png\" class=\"size-full wp-image-576\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png?resize=568%2C264\" alt=\"iLangGen Grammar\" width=\"568\" height=\"264\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png?w=568&amp;ssl=1 568w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog1.png?resize=300%2C139&amp;ssl=1 300w\" sizes=\"(max-width: 568px) 100vw, 568px\" \/><\/a><figcaption id=\"caption-attachment-576\" class=\"wp-caption-text\"><strong>iLangGen Grammar<\/strong><\/figcaption><\/figure>\n<p>Every grammar has a name. This grammar is called <em><strong>SimpleGrammar<\/strong><\/em>. 
After the name, there is a placeholder for an optional parent grammar. In this case, there is none. After that, we have many rules, where each rule is made up of an LHS and an RHS. Terminal elements are enclosed in double quotes; the others are non-terminals.<\/p>\n<p>Once a grammar has been defined, we can generate text using the grammar. Text generation, in this case, is the result of traversing the implicit <em><strong>AND-OR graph<\/strong><\/em>. We can plug in a custom function to participate in\u00a0the traversal. For greater flexibility, <em><strong>iLangGen<\/strong><\/em> supports traversals that generate an AST as well as those that do not involve building the AST.<\/p>\n<p>Here I have defined two custom functions, one that makes use of the AST and another that doesn&#8217;t. If the function returns true, the traversal continues; otherwise, it stops. The <em><strong>print-ast<\/strong><\/em> function, for instance, returns nil when 10 traversals are over.<\/p>\n<figure id=\"attachment_577\" aria-describedby=\"caption-attachment-577\" style=\"width: 587px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"577\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2017\/08\/06\/text-generation-using-ilanggen-framework\/blog2\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png\" data-orig-size=\"587,238\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Custom Traversal 
Functions\" data-image-description=\"&lt;p&gt;Custom Traversal Functions&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Custom Traversal Functions&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png\" class=\"size-full wp-image-577\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png?resize=587%2C238\" alt=\"Custom Traversal Functions\" width=\"587\" height=\"238\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png?w=587&amp;ssl=1 587w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog2.png?resize=300%2C122&amp;ssl=1 300w\" sizes=\"(max-width: 587px) 100vw, 587px\" \/><\/a><figcaption id=\"caption-attachment-577\" class=\"wp-caption-text\"><strong>Custom Traversal Functions<\/strong><\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<p>Let us generate text from this grammar, first, with the simple traverser.<\/p>\n<figure id=\"attachment_579\" aria-describedby=\"caption-attachment-579\" style=\"width: 532px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png\"><img data-recalc-dims=\"1\" decoding=\"async\" data-attachment-id=\"579\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2017\/08\/06\/text-generation-using-ilanggen-framework\/blog3\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png\" data-orig-size=\"532,283\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Simple Graph Traversal\" 
data-image-description=\"&lt;p&gt;Simple Graph Traversal&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Simple Graph Traversal&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png\" class=\"size-full wp-image-579\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png?resize=532%2C283\" alt=\"Simple Graph Traversal\" width=\"532\" height=\"283\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png?w=532&amp;ssl=1 532w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog3.png?resize=300%2C160&amp;ssl=1 300w\" sizes=\"(max-width: 532px) 100vw, 532px\" \/><\/a><figcaption id=\"caption-attachment-579\" class=\"wp-caption-text\"><strong>Simple Graph Traversal<\/strong><\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<p>As you can see, the sentences are generated and printed on the standard output stream. Because the function <em><strong>print-text<\/strong><\/em>\u00a0returns <em><strong>t<\/strong><\/em> (True in Lisp), the traversal is completed in full.<\/p>\n<p>Let us\u00a0look at the other function that uses AST during traversal.<\/p>\n<figure id=\"attachment_580\" aria-describedby=\"caption-attachment-580\" style=\"width: 723px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"580\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2017\/08\/06\/text-generation-using-ilanggen-framework\/blog4\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png\" data-orig-size=\"723,190\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Traversal with AST\" data-image-description=\"&lt;p&gt;Traversal with AST&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Traversal with AST&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png\" class=\"size-full wp-image-580\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png?resize=723%2C190\" alt=\"Traversal with AST\" width=\"723\" height=\"190\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png?w=723&amp;ssl=1 723w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/08\/Blog4.png?resize=300%2C79&amp;ssl=1 300w\" sizes=\"(max-width: 723px) 100vw, 723px\" \/><\/a><figcaption id=\"caption-attachment-580\" class=\"wp-caption-text\"><strong>Traversal with AST<\/strong><\/figcaption><\/figure>\n<p>In this case, after each traversal, the underlying AST representation is available to our function. The function <em><strong>get-ast-of-node<\/strong><\/em> returns the AST corresponding to any node of the grammar. The other interesting fact to note is that the traversal stops after 10 sentences have been generated. Something like this is very useful if the grammar is capable of generating an infinite number of sentences. Obviously, we don&#8217;t want to generate <em><strong>all<\/strong><\/em> sentences in that case!<\/p>\n<p>There are many ways to attach hooks\/handlers to control the traversal, as well as to fine-tune the generated data. We shall explore them in the coming weeks.<\/p>\n<p>That is it for now. 
Have a great day!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The two primary areas in Natural Language processing are Natural Language Understanding and Natural Language Generation. The former is concerned with processing and making sense of natural language text, whereas the latter is concerned with synthesizing text, possibly from some deep representation. Both are fascinating and at the same time, challenging, areas of research. The [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[18,107,17],"tags":[110,109,108],"class_list":["post-575","post","type-post","status-publish","format-standard","hentry","category-lisp","category-natural-language-processing","category-programming","tag-common-lisp","tag-ilanggen","tag-text-generation"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9OLnF-9h","jetpack-related-posts":[{"id":2152,"url":"https:\/\/www.rangakrish.com\/index.php\/2020\/09\/28\/template-based-text-generation\/","url_meta":{"origin":575,"position":0},"title":"Template-Based Text Generation","author":"admin","date":"September 28, 2020","format":false,"excerpt":"I had written earlier about natural language generation\u00a0using my 
iLangGen framework. I used a \"template\" text file which was instantiated dynamically based on predefined \"grammars\" and external data. The sample application I show-cased demonstrated its utility and versatility. Today I would like to touch upon a few other \"pattern\" elements\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Template File","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/Template-300x195.jpg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/Template-300x195.jpg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/09\/Template-300x195.jpg?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":884,"url":"https:\/\/www.rangakrish.com\/index.php\/2018\/04\/08\/natural-language-generation\/","url_meta":{"origin":575,"position":1},"title":"Natural Language Generation","author":"admin","date":"April 8, 2018","format":false,"excerpt":"I had written a series of posts on my iLangGen framework last year. It aims to provide a flexible and expressive approach for building natural language generation systems. 
In today's post, I would like to describe a concrete example of how iLangGen can be used for generating natural language text\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Overall Approach","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/04\/overall-1.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":1410,"url":"https:\/\/www.rangakrish.com\/index.php\/2019\/01\/27\/generating-poetry-using-ilanggen\/","url_meta":{"origin":575,"position":2},"title":"Generating Poetry Using iLangGen","author":"admin","date":"January 27, 2019","format":false,"excerpt":"In an earlier article, I wrote about using iLangGen to generate natural language text. iLangGen is a powerful text generation library that I have been working on over the years. Today, I would like to show how we can use that library to generate \"poetry\". Be warned, however, that the\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Sample Output 2","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/01\/Output2.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":637,"url":"https:\/\/www.rangakrish.com\/index.php\/2017\/09\/27\/composition-of-grammars\/","url_meta":{"origin":575,"position":3},"title":"Composition of Grammars","author":"admin","date":"September 27, 2017","format":false,"excerpt":"In the last post, we saw how iLangGen text generation framework supports reuse of grammars through inheritance, akin to object-oriented languages. The good news is that we can achieve reuse through composition as well. The following is a simple grammar, nothing fancy to elaborate. 
Here is the output when you\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Simple Grammar","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2017\/09\/G1.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":2162,"url":"https:\/\/www.rangakrish.com\/index.php\/2020\/10\/11\/template-based-text-generation-part-2\/","url_meta":{"origin":575,"position":4},"title":"Template-based Text Generation &#8211; Part 2","author":"admin","date":"October 11, 2020","format":false,"excerpt":"In my previous article, I showed how \u201ciLangGen\u201d framework facilitates text generation using templates. I talked about the various \u201cpatterns\u201d that can be used in a template. However, in that article, I did not go into the details of the \u201cEmbedded Template\u201d pattern. That is the focus of today\u2019s article.\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Main Template","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/10\/main-template-300x137.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":1659,"url":"https:\/\/www.rangakrish.com\/index.php\/2019\/08\/04\/generating-poetry-in-prolog\/","url_meta":{"origin":575,"position":5},"title":"Generating Poetry in Prolog","author":"admin","date":"August 4, 2019","format":false,"excerpt":"In an earlier article, I showed how we can generate poetry (with limitations, of course!) using my iLangGen framework. That implementation (in Lisp) made use of iLexicon, a large dictionary of English words, which I have been building over the years. 
I subsequently ported iLexicon to Prolog and it now\u2026","rel":"","context":"In &quot;Natural Language Processing&quot;","block_context":{"text":"Natural Language Processing","link":"https:\/\/www.rangakrish.com\/index.php\/category\/natural-language-processing\/"},"img":{"alt_text":"Generation Logic","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2019\/08\/Code3.jpg?resize=525%2C300&ssl=1 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/575","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/comments?post=575"}],"version-history":[{"count":0,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/575\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/media?parent=575"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/categories?post=575"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/tags?post=575"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}