In the last article, I showed how we can use the neuralcoref library along with spaCy to do coreference resolution (the examples involved anaphoric references). In today’s article, I want to try the same (well, almost the same) examples in the Stanford CoreNLP engine and see how the two compare.
Since CoreNLP is a Java implementation, I chose to write the test program in Java. Here is the program along with the examples.
You can download the CoreNLP library from here.
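For readers who want to reproduce the experiments, a minimal sketch of such a test program could look like the one below. It is not the exact program used for the results in this article. It follows the usage pattern from the CoreNLP documentation and assumes CoreNLP 3.7 or later (where coreference lives in the `edu.stanford.nlp.coref` package), with the CoreNLP jar and the English models jar on the classpath. The class name `CorefSketch` and the annotator list are assumptions made for illustration.

```java
import edu.stanford.nlp.coref.CorefCoreAnnotations;
import edu.stanford.nlp.coref.data.CorefChain;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

import java.util.Properties;

// Hypothetical class name; a sketch, not the article's original program.
class CorefSketch {
    public static void main(String[] args) {
        // Assumption: this annotator list matches the documented coref setup.
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,coref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        // One of the test inputs from this article.
        Annotation document = new Annotation("My sister has a dog. She loves him.");
        pipeline.annotate(document);

        // Each CorefChain groups the mentions that the system believes
        // refer to the same entity.
        for (CorefChain chain : document.get(CorefCoreAnnotations.CorefChainAnnotation.class).values()) {
            System.out.println(chain);
        }
    }
}
```

Swapping the input string for each of the sentences below should be enough to repeat the tests, although the exact output may vary with the CoreNLP version and the coreference model in use.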
Let us start with the first sentence:
“My sister has a dog and she loves him.”
When I run it through the program, this is the output:
Strange. It shows that the program has identified “My sister” and “she” as pointing to the same entity, but there is no mention of “dog” and “him”. In contrast, neuralcoref identified both pairs of mentions correctly.
Could it be because this is a compound sentence? Let us convert this into two simple sentences:
“My sister has a dog. She loves him.”
Here is the output in this case:
OK! Interestingly, the program has identified both pairs of mentions this time.
I am using the remaining sentences exactly as they were in the neuralcoref example. Here is the third sentence:
“My sister has a dog and she loves him. He is cute.”
The output is:
Although the system has identified “him” and “he” as belonging together, it has omitted “dog” from the reference. Bug! Here also, neuralcoref fared better.
The next example is:
“My sister has a dog and she loves her.”
The output is:
The behavior is the same as with neuralcoref. “dog” is not paired with “her”.
Consider the following statement that uses different genders and hence should be easier to resolve:
“My brother has a dog and he loves her.”
This is what we get:
That is nice! “dog” and “her” are correctly paired. neuralcoref did not handle this case correctly.
Let us look at the next sentence:
“Mary and Julie are sisters. They love chocolates.”
neuralcoref handled this case correctly. Strangely, when given this input, CoreNLP does not identify any “mentions” at all! I confess I am disappointed.
Let us try the next one:
“John and Mary are neighbours. She admires him because he works hard.”
This looks more complex than the earlier one. How does CoreNLP fare?
OK, this one is handled correctly. That is a relief.
The next one is tougher:
“X and Y are neighbours. She admires him because he works hard.”
Here is the program’s output:
I am not surprised by this behavior. neuralcoref wasn’t any better.
Here is the last example:
“The dog chased the cat. But it escaped.”
This is the corresponding output:
Not correct. Again, the behavior is identical to neuralcoref.
Let me summarize and show the performance of CoreNLP and neuralcoref with respect to the examples:
You can see that among the sentences tested in both systems, neuralcoref got 4 out of 8 correct, whereas CoreNLP got just 2 out of 8 correct. To clarify, I have marked the output as “Wrong” if the system did not FULLY identify the mentions.
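The strict “all or nothing” marking rule can be sketched in a few lines of Java. This is only one interpretation of the rule: a result counts as correct when every expected group of mentions appears, exactly, among the groups the system returned. The class `StrictCorefScore` and its methods are hypothetical names introduced for this illustration, and the mention strings are the ones quoted in the first two examples above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the strict marking rule; not part of CoreNLP.
class StrictCorefScore {

    // A cluster is the set of mention strings that refer to one entity.
    static Set<String> cluster(String... mentions) {
        Set<String> result = new HashSet<>();
        for (String mention : mentions) {
            result.add(mention.toLowerCase());
        }
        return result;
    }

    // Assumed rule: correct only if every expected cluster is present,
    // with exactly the same mentions, in the predicted clusters.
    static boolean isFullyCorrect(List<Set<String>> expected, List<Set<String>> predicted) {
        return new HashSet<>(predicted).containsAll(expected);
    }

    public static void main(String[] args) {
        List<Set<String>> expected = Arrays.asList(
                cluster("My sister", "she"),
                cluster("dog", "him"));

        // Compound sentence: only the first pair was reported.
        List<Set<String>> compound = Arrays.asList(cluster("My sister", "she"));

        // Two simple sentences: both pairs were reported.
        List<Set<String>> split = Arrays.asList(
                cluster("My sister", "she"),
                cluster("dog", "him"));

        System.out.println("Compound sentence: " + (isFullyCorrect(expected, compound) ? "Correct" : "Wrong"));
        System.out.println("Two sentences: " + (isFullyCorrect(expected, split) ? "Correct" : "Wrong"));
    }
}
```

Under this rule a partially resolved sentence scores the same as one with no mentions found at all, which is why the totals look harsh for both libraries.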
Overall, in the case of anaphora resolution, both the popular libraries have fared poorly in my opinion. I expected more out of CoreNLP, so that was a bigger disappointment!
Looks like there is a lot more work to do in the area of coreference resolution!
You can download my Java program from here.
Have a nice weekend!