{"id":1068,"date":"2018-09-16T07:14:22","date_gmt":"2018-09-16T01:44:22","guid":{"rendered":"http:\/\/www.rangakrish.com\/?p=1068"},"modified":"2018-09-16T07:24:49","modified_gmt":"2018-09-16T01:54:49","slug":"dependency-graph-to-rdf","status":"publish","type":"post","link":"https:\/\/www.rangakrish.com\/index.php\/2018\/09\/16\/dependency-graph-to-rdf\/","title":{"rendered":"Dependency Graph to RDF"},"content":{"rendered":"<p><a href=\"http:\/\/www.phontron.com\/slides\/nlp-programming-en-11-depend.pdf\" target=\"_blank\" rel=\"noopener\"><em><strong>Dependency parsing<\/strong><\/em><\/a> is widely used these days, and many NLP tools produce a dependency graph as the parsed representation of the input text. See, for example, <a href=\"https:\/\/spacy.io\" target=\"_blank\" rel=\"noopener\"><em><strong>spaCy<\/strong><\/em><\/a> and <a href=\"https:\/\/www.textrazor.com\" target=\"_blank\" rel=\"noopener\"><em><strong>TextRazor<\/strong><\/em><\/a>. The following is the dependency tree corresponding to the sentence <em><strong>Mary is drinking cold water<\/strong><\/em>:<\/p>\n<figure id=\"attachment_1069\" aria-describedby=\"caption-attachment-1069\" style=\"width: 649px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png\"><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" data-attachment-id=\"1069\" data-permalink=\"https:\/\/www.rangakrish.com\/index.php\/2018\/09\/16\/dependency-graph-to-rdf\/depgraph\/\" data-orig-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png\" data-orig-size=\"792,299\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Dependency Graph\" data-image-description=\"&lt;p&gt;Dependency Graph&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Dependency Graph&lt;\/p&gt;\n\" data-large-file=\"https:\/\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png\" class=\"wp-image-1069\" src=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png?resize=649%2C245\" alt=\"Dependency Graph\" width=\"649\" height=\"245\" srcset=\"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png?w=792&amp;ssl=1 792w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png?resize=300%2C113&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/DepGraph.png?resize=768%2C290&amp;ssl=1 768w\" sizes=\"(max-width: 649px) 100vw, 649px\" \/><\/a><figcaption id=\"caption-attachment-1069\" class=\"wp-caption-text\"><strong>Dependency Graph<\/strong><\/figcaption><\/figure>\n<p>The above tree was generated using <strong><a href=\"https:\/\/spacy.io\" target=\"_blank\" rel=\"noopener\"><em>spaCy<\/em><\/a><\/strong>. You can see that each arrow points from the dependent word to its head word and carries a label denoting the relationship between the head word and its dependent (some use the opposite convention, where the arrow points from head word to dependent). 
The above graph (actually, a tree) can be represented as a collection of triples of the form (head relation dependent), as follows (this is not formal RDF):<\/p>\n<blockquote><p><span style=\"color: #0000ff;\">(drinking aux is)<\/span><\/p>\n<p><span style=\"color: #0000ff;\">(drinking nsubj Mary)<\/span><\/p>\n<p><span style=\"color: #0000ff;\">(drinking dobj water)<\/span><\/p>\n<p><span style=\"color: #0000ff;\">(water amod cold)<\/span><\/p><\/blockquote>\n<p>This is not the only possible representation, but you get the idea. With the graph expressed this way, we can immediately see the similarities between the dependency graph and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Resource_Description_Framework\" target=\"_blank\" rel=\"noopener\"><em><strong>RDF<\/strong><\/em><\/a> used in the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Semantic_Web\" target=\"_blank\" rel=\"noopener\"><em><strong>Semantic Web<\/strong><\/em><\/a>. If we are able to convert the dependency graph to RDF, we can then take advantage of the large number of tools that are available to operate on an RDF graph. Tools such as <a href=\"https:\/\/jena.apache.org\" target=\"_blank\" rel=\"noopener\"><em><strong>Jena<\/strong><\/em><\/a> and <a href=\"https:\/\/allegrograph.com\" target=\"_blank\" rel=\"noopener\"><em><strong>AllegroGraph<\/strong><\/em><\/a> are known for handling very large data sets, and come bundled with support for the <a href=\"https:\/\/www.w3.org\/TR\/rdf-sparql-query\/\" target=\"_blank\" rel=\"noopener\"><em><strong>SPARQL<\/strong><\/em><\/a> query language and efficient <a href=\"http:\/\/owl.cs.manchester.ac.uk\/tools\/list-of-reasoners\/\" target=\"_blank\" rel=\"noopener\"><em><strong>Reasoners<\/strong><\/em><\/a>.<\/p>\n<p>Before attempting the conversion, we have to be clear about what information needs to be carried over from the dependency representation. Here is my list:<\/p>\n<blockquote><p><span style=\"color: #0000ff;\">1) Each sentence should have a unique ID. 
So if someone wants to know which sentences contain, for example, the word &#8220;likes&#8221;, this information should be available. To take an even simpler example, we should be able to answer the question &#8220;How many sentences are there in the given text?&#8221;.<\/span><\/p>\n<p><span style=\"color: #0000ff;\">2) Each occurrence of a word must have a unique ID. That is, if the same word occurs multiple times in the same sentence, each occurrence must get a different ID.<\/span><\/p>\n<p><span style=\"color: #0000ff;\">3) Obviously, we need to capture the &lt;head-word, dependency, dependent-word&gt; relationship.<\/span><\/p>\n<p><span style=\"color: #0000ff;\">4) I feel it is useful to keep track of the part of speech (<em><strong>POS<\/strong><\/em>) of each occurrence of a word. Remember that a word such as &#8220;sleep&#8221; could act as a <em><strong>Noun<\/strong><\/em> in one place and a <em><strong>Verb<\/strong><\/em> in another.<\/span><\/p>\n<p><span style=\"color: #0000ff;\">5) The lemma (root form) of each word could also prove useful. This way, we can search for sentences that have the lemma &#8220;be&#8221;, without worrying about whether it is &#8220;is&#8221; or &#8220;was&#8221;.<\/span><\/p><\/blockquote>\n<p>When converting to RDF, we have to remember that there are many serialization <em><strong>formats<\/strong><\/em>, not just one. Common formats are <em><strong>Turtle, N-Triples, N-Quads, N3, RDF\/XML, and RDF\/JSON<\/strong><\/em>. I chose <a href=\"https:\/\/www.w3.org\/TR\/turtle\/\" target=\"_blank\" rel=\"noopener\"><em><strong>Turtle<\/strong><\/em><\/a> as my output format.<\/p>\n<p>Another design decision is the choice of an appropriate <em><strong>Namespace<\/strong><\/em> for qualifying the different URIs. To simplify my work and to get started quickly, I have used my own namespace for the URIs. 
Although not critical at this point, I feel this aspect has to be addressed eventually.<\/p>\n<p>Let us consider the simple sentence: <em><strong>John loves Mary.<\/strong><\/em><\/p>\n<p>Here is the Turtle representation generated by my converter:<\/p>\n<blockquote>\n<div><span style=\"color: #0000ff;\"># Dependency Graph Representation in Turtle Format.<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: #0000ff;\">@prefix m: &lt;http:\/\/mmsindia\/depgraph\/example\/&gt; .<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: #0000ff;\">m:sent-1 m:word m:word-250 .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-250 m:pos &quot;NNP&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-250 m:lemma &quot;john&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:nsubj m:word-250 .<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: #0000ff;\">m:sent-1 m:word m:word-251 .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:pos &quot;VBZ&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:lemma &quot;love&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:ROOT m:word-251 .<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: #0000ff;\">m:sent-1 m:word m:word-252 .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-252 m:pos &quot;NNP&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-252 m:lemma &quot;mary&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:dobj m:word-252 .<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: #0000ff;\">m:sent-1 m:word m:word-253 .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-253 m:pos &quot;.&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-253 m:lemma &quot;.&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:punct m:word-253 .<\/span><\/div>\n<div><\/div>\n<div><span style=\"color: 
#0000ff;\">m:word-250 m:label &quot;John&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-251 m:label &quot;loves&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-252 m:label &quot;Mary&quot; .<\/span><\/div>\n<div><span style=\"color: #0000ff;\">m:word-253 m:label &quot;.&quot; .<\/span><\/div>\n<\/blockquote>\n<p>There is a triple of the form <em><strong>{m:sent-&lt;id&gt; m:word m:word-&lt;id&gt;}<\/strong><\/em> for each word in the text. The <em><strong>word-&lt;id&gt;<\/strong><\/em> has a unique value for every occurrence of a word. For each <em><strong>word-&lt;id&gt;<\/strong><\/em>, we also record the corresponding <em><strong>lemma<\/strong><\/em> of the word and its <em><strong>POS<\/strong><\/em>. Each dependency appears as a triple whose subject is the head word, whose predicate is the relation, and whose object is the dependent word; the root word is linked to itself through <em><strong>m:ROOT<\/strong><\/em>. You will also notice that the last section enumerates all the word instances as triples of the form <em><strong>{m:word-&lt;id&gt; m:label &lt;literal&gt;}<\/strong><\/em>.<\/p>\n<p>I think the conversion logic is easy to understand. As I mentioned earlier, once we obtain the RDF representation, we can do many interesting things with it.<\/p>\n<p>In the next post, I will show how we can import this data into a graph database (<a href=\"https:\/\/allegrograph.com\" target=\"_blank\" rel=\"noopener\"><em><strong>AllegroGraph<\/strong><\/em><\/a>) and query the graph.<\/p>\n<p>I implemented the conversion logic in <em><strong>Python<\/strong><\/em>. The program uses <a href=\"https:\/\/spacy.io\" target=\"_blank\" rel=\"noopener\"><em><strong>spaCy<\/strong><\/em><\/a> to parse the input text and then converts the corresponding dependency graph to <em><strong>Turtle<\/strong><\/em> format. The input and output file names are passed as command-line parameters to the program. 
You can download the Python program <a href=\"http:\/\/www.rangakrish.com\/downloads\/Dependency to Turtle Format.py\" target=\"_blank\" rel=\"noopener\"><em><strong>here<\/strong><\/em><\/a>. A sample <a href=\"http:\/\/www.rangakrish.com\/downloads\/Sample.txt\" target=\"_blank\" rel=\"noopener\"><em><strong>input file<\/strong><\/em><\/a> containing multiple sentences and the corresponding <a href=\"http:\/\/www.rangakrish.com\/downloads\/Sample.ttl\" target=\"_blank\" rel=\"noopener\"><em><strong>TTL file<\/strong><\/em><\/a> are also available for download.<\/p>\n<p>Have a nice weekend!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dependency parsing is widely used these days, and many NLP tools give a dependency graph as the parsed representation of the input text. See, for example, spaCy and TextRazor. The following is the dependency tree corresponding to the sentence Mary is drinking cold water: The above tree was generated using spaCy. You can see that [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[107,17,103],"tags":[156,74,153,154,155],"class_list":["post-1068","post","type-post","status-publish","format-standard","hentry","category-natural-language-processing","category-programming","category-python","tag-depen
dency-graph","tag-nlp","tag-rdf","tag-semantic-web","tag-turtle"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9OLnF-he","jetpack-related-posts":[{"id":1078,"url":"https:\/\/www.rangakrish.com\/index.php\/2018\/09\/30\/dependency-graph-to-rdf-part-2\/","url_meta":{"origin":1068,"position":0},"title":"Dependency Graph to RDF &#8211; Part 2","author":"admin","date":"September 30, 2018","format":false,"excerpt":"In the last post, I outlined an approach to convert a dependency graph (the result of dependency parsing) to RDF. The particular RDF format I used is Turtle, which is widely supported. Today, I would like to show how to load this RDF data in a Semantic\u00a0 Graph database and\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Browser View","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/Checking-the-server.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/Checking-the-server.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/Checking-the-server.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/09\/Checking-the-server.png?resize=700%2C400&ssl=1 2x"},"classes":[]},{"id":328,"url":"https:\/\/www.rangakrish.com\/index.php\/2016\/09\/11\/natural-language-processing-in-mathematica\/","url_meta":{"origin":1068,"position":1},"title":"Natural Language Processing in Mathematica","author":"admin","date":"September 11, 2016","format":false,"excerpt":"Welcome back. Today I am going to share with you some of the nice capabilities of Mathematica in the area of Natural Language Processing (NLP). Let us start with words. 
What if we wish to know\u00a0the various definitions of the word image?\u00a0Here is the answer. Mathematica gives the various senses\u2026","rel":"","context":"In &quot;Mathematica&quot;","block_context":{"text":"Mathematica","link":"https:\/\/www.rangakrish.com\/index.php\/category\/mathematica\/"},"img":{"alt_text":"Word Definition","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/09\/word-data1-1024x238.png?resize=700%2C400 2x"},"classes":[]},{"id":912,"url":"https:\/\/www.rangakrish.com\/index.php\/2018\/04\/22\/question-answering-using-dependency-trees\/","url_meta":{"origin":1068,"position":2},"title":"Question Answering\u00a0Using Dependency Trees","author":"admin","date":"April 22, 2018","format":false,"excerpt":"A few weeks ago I had written about my brief experiment with Mathematica's new feature, which provides answers to questions based on given text. After that post, I spent some time thinking about how to implement something similar. 
In today's post, I want to show you what I have been\u2026","rel":"","context":"In &quot;LISP&quot;","block_context":{"text":"LISP","link":"https:\/\/www.rangakrish.com\/index.php\/category\/lisp\/"},"img":{"alt_text":"Dependency Tree","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2018\/04\/Deptree-example.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":3660,"url":"https:\/\/www.rangakrish.com\/index.php\/2025\/04\/11\/using-claude-to-generate-rdf-triples\/","url_meta":{"origin":1068,"position":3},"title":"Using Claude to Generate RDF Triples","author":"admin","date":"April 11, 2025","format":false,"excerpt":"We all know that LLMs are now capable of generating structured data. I have used OpenAI models earlier to generate Tables and JSON data, but this time I wanted to try a more complex example.\u00a0 As someone interested in Homeopathy, I wanted to generate remedy descriptions as RDF triples, in\u2026","rel":"","context":"In &quot;Homeopathy&quot;","block_context":{"text":"Homeopathy","link":"https:\/\/www.rangakrish.com\/index.php\/category\/homeopathy\/"},"img":{"alt_text":"Lycopodium TTL Format","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2025\/04\/Lyco-300x232.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2025\/04\/Lyco-300x232.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2025\/04\/Lyco-300x232.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":1863,"url":"https:\/\/www.rangakrish.com\/index.php\/2020\/01\/02\/book-review-automatic-text-simplification\/","url_meta":{"origin":1068,"position":4},"title":"Book Review &#8211; Automatic Text Simplification","author":"admin","date":"January 2, 2020","format":false,"excerpt":"Title: Automatic Text Simplification Author: Horacio Saggino Publisher: Morgan & Claypool Publishers Year: 2017 Automatic Text Simplification is an 
active area of research in NLP and has been going on for over 20 years. The idea is to transform a given text T1 into text T2 such that T2 is\u2026","rel":"","context":"In &quot;Book Review&quot;","block_context":{"text":"Book Review","link":"https:\/\/www.rangakrish.com\/index.php\/category\/book-review\/"},"img":{"alt_text":"Automatic Text Simplification","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2020\/01\/IMG_1496-edited-225x300.jpeg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":285,"url":"https:\/\/www.rangakrish.com\/index.php\/2016\/07\/22\/using-julia-to-interact-with-mathematica\/","url_meta":{"origin":1068,"position":5},"title":"Using Julia to Interact with Mathematica","author":"admin","date":"July 22, 2016","format":false,"excerpt":"Mathematica is a powerful environment for symbolic and numerical computation. I have been using it for many years now. In this post\u00a0I had explained how we can use Mathematica bundled with Raspberry distribution to control littleBits devices. 
When I saw that there is support in Julia for interacting with Mathematica,\u2026","rel":"","context":"In &quot;Julia&quot;","block_context":{"text":"Julia","link":"https:\/\/www.rangakrish.com\/index.php\/category\/julia\/"},"img":{"alt_text":"Julia Session","src":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/07\/Julia-1.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/07\/Julia-1.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.rangakrish.com\/wp-content\/uploads\/2016\/07\/Julia-1.png?resize=525%2C300 1.5x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/1068","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/comments?post=1068"}],"version-history":[{"count":0,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/posts\/1068\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/media?parent=1068"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/categories?post=1068"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.rangakrish.com\/index.php\/wp-json\/wp\/v2\/tags?post=1068"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}