The Structure of WH-Questions

Written on May 23, 2021 in Natural Language Processing, Programming, Prolog

WH-Questions are questions that begin with the following words:

– Who (“Who came here yesterday?”)

– What (“What is the goal of this project?”)

– When (“When can I visit my parents?”)

– Where (“Where did he go?”)

– Why (“Why is everyone running away?”)

– Which (“Which is the book you recommend?”)

– How (“How soon can you prepare dinner?”)

– Whose (“Whose car is this?”)

– Whom (“Whom should I contact for resolving this issue?”)

WH-questions are usually asked to get detailed information, not a simple Yes or No. In contrast, Yes/No questions expect a Yes or No as an answer. Some examples of this category are:

“Did you have lunch?”

“Can we go out for dinner tonight?”

“Do you like coffee?”

In today’s article, I would like to go over the grammatical structure of WH-questions in English. The idea is not to propose a comprehensive grammar that covers all possible forms of WH-questions, but to give a feel for what the structure looks like. For the present discussion, I will exclude “Whose” and “Whom” and cover the remaining seven.

[Figure: WH-Questions Structure]

The above shows the grammar of WH-questions expressed as a Definite Clause Grammar (DCG) in Prolog. Each of the WH-words is typically followed by an “Auxiliary verb”, a “Noun phrase” and a “Verb phrase”, in that order. There are, of course, minor variations.
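A minimal sketch of that top-level pattern may help make it concrete. This is not the original listing: the nonterminal names (wh_question, wh_word, aux, noun_phrase, verb_phrase) and the tiny lexicon are assumptions chosen to cover three of the example questions from the introduction.

```prolog
% Illustrative sketch only; names and lexicon are assumptions.
% Top-level pattern: WH-word, auxiliary verb, noun phrase, verb phrase.
wh_question --> wh_word, aux, noun_phrase, verb_phrase.

wh_word --> [where].
wh_word --> [when].
wh_word --> [why].

aux --> [did].
aux --> [can].
aux --> [is].

% Placeholder phrases, just enough for the three sample questions.
noun_phrase --> [he].
noun_phrase --> [i].
noun_phrase --> [everyone].
noun_phrase --> [my, parents].

verb_phrase --> [go].
verb_phrase --> [visit], noun_phrase.
verb_phrase --> [running, away].
```

With these clauses loaded, a query such as `?- phrase(wh_question, [where, did, he, go]).` should succeed, while a token list that breaks the order (for example `[did, where, he, go]`) should fail.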

What are “Auxiliary verbs”?

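An auxiliary verb is a verb that is used along with a “main” verb to express tense, voice, etc. Some examples are am, is, are, have, do, did, and being. Modal auxiliaries include can, could, shall, ought, and might. In each of the following questions, the word right after the WH-word (should, did, can, is) fills the auxiliary slot: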

“Whom should I contact for resolving this issue?”

“Where did he go?”

“When can I visit my parents?”

“What is the goal of this project?”
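One way to encode such a word class in a DCG is a table of facts plus a single rule per class. The sketch below assumes that style; the predicate names primary_aux/1 and modal_aux/1 and the typed nonterminal aux//1 are hypothetical, and the word lists are the ones given in the text (plus “should” from the first example).

```prolog
% Word lists taken from the text; predicate names are hypothetical.
primary_aux(am).
primary_aux(is).
primary_aux(are).
primary_aux(have).
primary_aux(do).
primary_aux(did).
primary_aux(being).

modal_aux(can).
modal_aux(could).
modal_aux(shall).
modal_aux(should).   % from the example "Whom should I contact ..."
modal_aux(ought).
modal_aux(might).

% aux(-Type): consume one token and classify it as primary or modal.
aux(primary) --> [W], { primary_aux(W) }.
aux(modal)   --> [W], { modal_aux(W) }.
```

A query like `?- phrase(aux(Type), [should]).` then binds Type to modal, and adding a new auxiliary only requires one more fact.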

Noun and Verb Phrases

These two categories are known to have very complex grammatical structures. What I have provided here are highly simplified versions. Elsewhere I have defined more realistic, although not complete, representations of these categories for my parsing needs, but that is not required for today’s discussion. Feel free to refer to a good book on English grammar, for example [1, 2], to get an idea of how complex these can be. Here is the Noun Phrase description:

[Figure: Noun Phrase Structure]
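As a rough idea of what a highly simplified noun phrase can look like in DCG form, here is a sketch under stated assumptions. It is not the original listing: the rules cover only a pronoun, a determiner plus noun, and an optional “of” phrase, and the lexicon is limited to words from the example questions.

```prolog
% Simplified noun phrase; rule shapes and lexicon are assumptions.
noun_phrase --> pronoun.
noun_phrase --> determiner, noun.
noun_phrase --> determiner, noun, prep_phrase.

% A prepositional phrase nests another noun phrase.
prep_phrase --> preposition, noun_phrase.

pronoun --> [he].
pronoun --> [i].
pronoun --> [everyone].

determiner --> [the].
determiner --> [this].
determiner --> [my].

noun --> [goal].
noun --> [project].
noun --> [parents].

preposition --> [of].
```

For example, `?- phrase(noun_phrase, [the, goal, of, this, project]).` should succeed, with “of this project” parsed as a nested prepositional phrase. Every rule starts by consuming a word, so the recursion through prep_phrase cannot loop.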

Here are the Verb Phrase, Prepositional Phrase and Complementizer structures:

[Figure: Verb Phrase, etc.]
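A comparable sketch for these three categories follows. It is an assumption-laden illustration rather than the original grammar: in particular, treating “for” as a complementizer that introduces a nested verb phrase (“for resolving this issue”) is one possible analysis, and “go to the office” is a made-up example used only to exercise the prepositional phrase rule.

```prolog
% Simplified verb phrase; rule shapes and lexicon are assumptions.
verb_phrase --> verb.
verb_phrase --> verb, noun_phrase.
verb_phrase --> verb, prep_phrase.
verb_phrase --> verb, comp_phrase.

prep_phrase --> preposition, noun_phrase.

% Complementizer followed by a nested verb phrase.
comp_phrase --> complementizer, verb_phrase.

% Minimal noun phrase so the block is self-contained.
noun_phrase --> determiner, noun.

verb --> [go].
verb --> [visit].
verb --> [contact].
verb --> [resolving].

preposition --> [to].          % hypothetical example word
complementizer --> [for].

determiner --> [my].
determiner --> [this].
determiner --> [the].

noun --> [parents].
noun --> [issue].
noun --> [office].             % hypothetical example word
```

With this loaded, `?- phrase(verb_phrase, [contact, for, resolving, this, issue]).` should succeed through the comp_phrase rule, and `?- phrase(verb_phrase, [visit, my, parents]).` through the verb plus noun phrase rule.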

The final group assigns the primary parts of speech to individual words:

[Figure: Other Categories]

To validate the above structures, we can try giving sample WH-questions and check the output. I am using a Prolog predicate, “parse_wh”, as the primary entry point. It runs a simple tokenizer and then feeds the tokens to the DCG parser to get the result.

[Figure: Parser Entry Point]
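The tokenize-then-parse flow can be sketched as follows. Only the name parse_wh comes from the article; its arity, the tokenizer (tokenize/2, which lowercases letters and treats everything else as a separator), the parse-tree shape q(...) and the one-sentence lexicon are all assumptions made so the example runs on its own.

```prolog
% parse_wh(+Text, -Tree): Text is an atom such as 'Where did he go?'.
% Arity and tree shape are assumptions, not the original definition.
parse_wh(Text, Tree) :-
    tokenize(Text, Tokens),
    phrase(wh_question(Tree), Tokens).

% tokenize(+Text, -Tokens): split on non-letters and lowercase each word.
tokenize(Text, Tokens) :-
    atom_codes(Text, Codes),
    phrase(words(Tokens), Codes),
    !.                          % keep the first (greedy) tokenization

words([W|Ws]) -->
    separators,
    word_codes(Cs),
    { Cs = [_|_], atom_codes(W, Cs) },
    words(Ws).
words([]) --> separators.

separators --> [C], { \+ letter(C, _) }, separators.
separators --> [].

word_codes([L|Ls]) --> [C], { letter(C, L) }, word_codes(Ls).
word_codes([])     --> [].

% letter(+Code, -LowerCode): ASCII letters only.
letter(C, C) :- C >= 0'a, C =< 0'z.
letter(C, L) :- C >= 0'A, C =< 0'Z, L is C + 32.

% Tiny grammar that builds a parse tree while it recognizes the tokens.
wh_question(q(wh(W), aux(A), np(N), vp(V))) -->
    wh_word(W), aux(A), noun_phrase(N), verb_phrase(V).

wh_word(where)  --> [where].
aux(did)        --> [did].
noun_phrase(he) --> [he].
verb_phrase(go) --> [go].
```

Under these assumptions, `?- parse_wh('Where did he go?', Tree).` tokenizes the text to `[where, did, he, go]` and binds Tree to `q(wh(where), aux(did), np(he), vp(go))`. Returning a tree rather than a bare yes/no makes it easier to see which rule matched each word.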

Session-1: Where, When, Why

[Figure: Where, When, Why Questions]

Session-2: Who

[Figure: Who Questions]
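“Who” questions such as “Who came here yesterday?” are one of the variations on the general pattern: “who” itself acts as the subject, so no auxiliary or noun phrase precedes the verb. A sketch of one way to express this, with hypothetical rule names and a lexicon limited to that example sentence:

```prolog
% Subject "who" question: the WH-word is followed directly by a verb phrase.
% Rule names and lexicon are assumptions for illustration.
who_question --> [who], verb_phrase.

% A verb followed by zero or more adverbial modifiers.
verb_phrase --> verb, modifiers.

modifiers --> adverb, modifiers.
modifiers --> [].

verb --> [came].

adverb --> [here].
adverb --> [yesterday].
```

Here `?- phrase(who_question, [who, came, here, yesterday]).` should succeed. Questions in which “who” is the object (and an auxiliary does appear) would need an additional rule that this sketch omits.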

Session-3: What, How, Which

[Figure: What, How, Which Questions]
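The three example questions for these words each deviate a little from the general pattern: “What is the goal of this project?” and “Which is the book you recommend?” have no verb phrase after the noun phrase, and “How soon can you prepare dinner?” puts an adverb between the WH-word and the auxiliary. The sketch below shows one way to write those variations; the rule shapes, the rel_clause nonterminal and the lexicon are assumptions, not the original grammar.

```prolog
% One rule per variation; all names and shapes are assumptions.
wh_question --> [what],  aux, noun_phrase.
wh_question --> [which], aux, noun_phrase.
wh_question --> [how], adverb, aux, noun_phrase, verb_phrase.

noun_phrase --> pronoun.
noun_phrase --> noun.                          % bare noun: "dinner"
noun_phrase --> determiner, noun.
noun_phrase --> determiner, noun, prep_phrase.
noun_phrase --> determiner, noun, rel_clause.

prep_phrase --> [of], noun_phrase.
rel_clause  --> pronoun, verb.                 % "you recommend"

verb_phrase --> verb, noun_phrase.

aux --> [is].
aux --> [can].

adverb --> [soon].

pronoun --> [you].

determiner --> [the].
determiner --> [this].

noun --> [goal].
noun --> [project].
noun --> [book].
noun --> [dinner].

verb --> [prepare].
verb --> [recommend].
```

Each of the three sample questions, written as a lowercase token list such as `[how, soon, can, you, prepare, dinner]`, should be accepted by `phrase(wh_question, Tokens)`.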

I trust the above discussion throws light on the top-level structure of WH-questions. I have implemented this in SICStus Prolog (64-bit) running on Windows 10. You can download the above Prolog source file here.

Have a nice weekend!

Further Reading

  1. Randolph Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik, “A Comprehensive Grammar of the English Language”, Longman, 1985.
  2. Rodney Huddleston and Geoffrey K. Pullum, “The Cambridge Grammar of the English Language”, Cambridge University Press, 2002.
