Welcome back.
Today I am going to share with you some of the nice capabilities of Mathematica in the area of Natural Language Processing (NLP).
Let us start with words. What if we wish to know the various definitions of the word "image"? Mathematica lists the senses of the word and gives the corresponding definition for each sense.
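For readers who want to try this, a minimal sketch of the kind of call involved is shown here. It assumes the "Definitions" property of WordData[]; the exact input used in the original post is not shown, so treat this as one plausible way to get the result.

```wolfram
(* Look up the definition of every sense of the word "image".        *)
(* The data may be downloaded from Wolfram servers on first use.     *)
WordData["image", "Definitions"]
```

The result is a list of rules, one per sense, each mapping a word sense to its definition.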
For some operations such as this, Mathematica needs internet connectivity. It will download data from its servers, if needed.
OK, let us ask Mathematica to give us possible synonyms for this word.
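A sketch of the synonym lookup, assuming the "Synonyms" property of WordData[] and its optional "List" output format (both are my choice of call, not necessarily the one used in the post):

```wolfram
(* Synonyms grouped by word sense *)
WordData["image", "Synonyms"]

(* The same synonyms flattened into a single list *)
WordData["image", "Synonyms", "List"]
```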
The WordData[] function used above can give us a lot of useful information about words.
If we need sample English text for our analysis, Mathematica can help by providing many samples, from curated books and speeches as well as from Wikipedia. In the following, I am drawing a word cloud of the 20 most frequent words in President Abraham Lincoln’s Gettysburg Address. The ExampleData[] function comes in handy here.
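One way to build such a word cloud is sketched here. Lowercasing the text and removing stopwords before counting are my assumptions; the post does not say how the top 20 words were selected.

```wolfram
(* Load the curated text of the Gettysburg Address *)
text = ExampleData[{"Text", "GettysburgAddress"}];

(* Count words after lowercasing and dropping common stopwords *)
counts = WordCounts[DeleteStopwords[ToLowerCase[text]]];

(* Draw a word cloud of the 20 most frequent remaining words *)
WordCloud[TakeLargest[counts, 20]]
```

Skipping DeleteStopwords[] would let words such as "the" and "that" dominate the cloud, which is why the sketch filters them first.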
TextCases[] is another useful function. In the first example below, Mathematica has identified the proper nouns in the given text. In the second example, it has correctly identified the countries referred to in the text.
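The two kinds of query can be sketched as follows. The sample sentences are hypothetical stand-ins, since the text used in the post is not available, and the sketch assumes that "ProperNoun" and "Country" are the content types that were used.

```wolfram
(* Hypothetical sample text, not the text from the original post *)
sample1 = "Jack and Jill flew from Paris to Tokyo last Monday.";
sample2 = "India and Japan signed a trade agreement with France.";

(* Extract the proper nouns from the first sample *)
TextCases[sample1, "ProperNoun"]

(* Extract the substrings that refer to countries in the second sample *)
TextCases[sample2, "Country"]
```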
Next, let us look at sentence structure. The function TextStructure[] is available for this purpose. The figure below shows the traditional constituent graph of the sentence "Jack and Jill went up the hill."
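A call along these lines produces a constituent graph, assuming the "ConstituentGraphs" property of TextStructure[]:

```wolfram
(* Constituent (phrase structure) graph for the sentence; *)
(* the result is a list with one graph per sentence.      *)
TextStructure["Jack and Jill went up the hill.", "ConstituentGraphs"]
```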
When I ask Mathematica to render the same sentence as a dependency graph, this is what I get:
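The dependency view can be requested in the same way, assuming the "DependencyGraphs" property of TextStructure[]:

```wolfram
(* Dependency graph: edges link each word to the word it depends on *)
TextStructure["Jack and Jill went up the hill.", "DependencyGraphs"]
```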
Quite interesting.
Finally, before concluding, let us ask Mathematica to display the frequency of the words "guru" and "yoga" in typical published English text during the period 1950 to 2015.
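A sketch of how such a plot might be produced, assuming WordFrequencyData[] with the "TimeSeries" property and a year range; the variable names and plot options are my own.

```wolfram
words = {"guru", "yoga"};

(* Yearly frequency of each word from 1950 to 2015, one time series per word *)
series = WordFrequencyData[#, "TimeSeries", {1950, 2015}] & /@ words;

(* Plot both series on one date axis, labeled by word *)
DateListPlot[series, PlotLegends -> words]
```

Like the word lookups earlier, this query fetches its data from Wolfram servers, so it needs an internet connection.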
OK, that brings us to the end of today’s post. Hope to share some more exciting features of Mathematica in future posts.
Stay tuned.