I am very interested in how natural languages work and evolve (see my review of the book The Language Instinct), and ever since I began playing with MGrammar I have wanted to see whether it could be used to define English sentence structure.

Let's begin with a simple sentence:

The boy likes the girl.

This sentence is composed of a noun phrase (NP), "the boy", and a verb phrase (VP), "likes the girl". So let's begin with this MGrammar syntax:

syntax Main = S*;        
syntax S = NP VP ".";

If we look at the noun phrase "the boy", it is composed of a determiner followed by a noun. Likewise, the verb phrase "likes the girl" is composed of a verb followed by a noun phrase. The MGrammar should then be:

syntax NP = Det? N;
syntax VP = V NP;

Then we just need to add some determiners, verbs and nouns:

syntax Det = "a" | "the" | "one";
syntax N = "boy" | "girl" | "dog" | "school" | "hair";
syntax V = "likes" | "bites" | "eats" | "discuss";
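To see what these rules accept, here is a minimal Python sketch of a recognizer for the same grammar. It is not MGrammar and not how the MGrammar tooling works internally; the function names (`tokenize`, `parse_np`, `parse_vp`, `parse_s`, `accepts`) are made up for this illustration. The sketch splits the input on whitespace before matching, since the rules themselves say nothing about spaces, and it lowercases the input so that "The" matches "the", which is a choice made here rather than something the rules above do.

```python
DET = {"a", "the", "one"}
N = {"boy", "girl", "dog", "school", "hair"}
V = {"likes", "bites", "eats", "discuss"}

def tokenize(text):
    # Assumption of this sketch: lowercase everything and treat "." as
    # its own token; splitting on whitespace discards spaces and newlines.
    return text.lower().replace(".", " . ").split()

def parse_np(tokens, i):
    # NP = Det? N  -> returns the index after the phrase, or None
    if i < len(tokens) and tokens[i] in DET:
        i += 1
    if i < len(tokens) and tokens[i] in N:
        return i + 1
    return None

def parse_vp(tokens, i):
    # VP = V NP
    if i < len(tokens) and tokens[i] in V:
        return parse_np(tokens, i + 1)
    return None

def parse_s(tokens, i):
    # S = NP VP "."
    i = parse_np(tokens, i)
    if i is None:
        return None
    i = parse_vp(tokens, i)
    if i is None:
        return None
    if i < len(tokens) and tokens[i] == ".":
        return i + 1
    return None

def accepts(text):
    # Main = S*  (zero or more sentences, nothing left over)
    tokens = tokenize(text)
    i = 0
    while i < len(tokens):
        i = parse_s(tokens, i)
        if i is None:
            return False
    return True
```

Calling `accepts("The boy likes the girl.")` returns `True`, while `accepts("The boy the girl.")` returns `False` because no verb follows the first noun phrase.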

If you add an interleave rule to skip whitespace, the sentence should be parsed correctly. That was a really simple sentence, so let's add an adjective.

The nerdy boy likes the girl.

We need to modify the noun phrase rule. Any number of adjectives (A*) can be placed before the noun. This is a simple change: just add A* to the noun phrase rule and add some adjectives.

syntax NP = Det? A* N;
syntax A = "happy" | "lucky" | "tall" | "red" | "nerdy";

That was simple. Let's add something more to the sentence, for example:

The nerdy boy likes the girl from school with red hair.

I added a nested prepositional phrase (PP). A prepositional phrase is, according to Wikipedia, composed of a preposition (P) and a noun phrase.

syntax NP = Det? A* N PP?;
syntax PP = P NP;
syntax P = "on" | "in" | "from" | "with";

A PP contains an NP, and an NP may in turn contain a PP. This recursion makes it possible to nest any number of prepositional phrases inside each other. Here is an illustration of the syntax tree for "girl from school with red hair":

[Image: syntax tree for "girl from school with red hair"]
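The mutual recursion between NP and PP can also be sketched in Python. This is only an illustration under my own assumptions, not the parser MGrammar generates: `parse_np`, `parse_pp` and `show` are hypothetical names, the tree is a plain nested tuple, and the input is assumed to be already lowercased and split into words.

```python
DET = {"a", "the", "one"}
A = {"happy", "lucky", "tall", "red", "nerdy"}
N = {"boy", "girl", "dog", "school", "hair"}
P = {"on", "in", "from", "with"}

def parse_np(tokens, i=0):
    """NP = Det? A* N PP?  Returns (tree, next_index) or None."""
    children = []
    if i < len(tokens) and tokens[i] in DET:
        children.append(("Det", tokens[i]))
        i += 1
    while i < len(tokens) and tokens[i] in A:
        children.append(("A", tokens[i]))
        i += 1
    if i >= len(tokens) or tokens[i] not in N:
        return None
    children.append(("N", tokens[i]))
    i += 1
    pp = parse_pp(tokens, i)  # optional PP: keep it only if it matched
    if pp is not None:
        pp_tree, i = pp
        children.append(pp_tree)
    return ("NP", children), i

def parse_pp(tokens, i):
    """PP = P NP. Calls parse_np, which may call parse_pp again."""
    if i >= len(tokens) or tokens[i] not in P:
        return None
    result = parse_np(tokens, i + 1)
    if result is None:
        return None
    np_tree, j = result
    return ("PP", [("P", tokens[i]), np_tree]), j

def show(tree, depth=0):
    """Render a tree as indented text, one node per line."""
    label, body = tree
    if isinstance(body, str):
        return "  " * depth + f"{label}: {body}\n"
    return "  " * depth + label + "\n" + "".join(show(c, depth + 1) for c in body)

tree, end = parse_np("girl from school with red hair".split())
print(show(tree))
```

With the rules as written, each PP ends up inside the NP directly before it, so in this sketch "with red hair" attaches to "school" rather than to "girl".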

I think I will stop here because this post is turning into an English grammar lesson and I don't want to lose all my subscribers :) Defining English sentence structure in MGrammar is pretty pointless unless you are building a grammar checker, and even then you are out of luck: it will probably be impossible to define a grammar for how words are built, and you will run into trouble with ambiguity (which most natural languages have). But it was a fun try, and it is a good example for showing how recursive rules are parsed.
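As a small illustration of the ambiguity problem: as far as I can tell, the rules in this post allow at most one PP per noun phrase, so each phrase has only one parse. The Python sketch below assumes a hypothetical variant, NP = Det? A* N PP*, which is not a rule from this post, and counts how many distinct trees a phrase would then have. The names `count_np`, `count_pp` and `count_parses` are invented for the example.

```python
from collections import Counter

DET = {"a", "the", "one"}
A = {"happy", "lucky", "tall", "red", "nerdy"}
N = {"boy", "girl", "dog", "school", "hair"}
P = {"on", "in", "from", "with"}

def count_np(tokens, i):
    """Count parses of the hypothetical NP = Det? A* N PP* starting at i.

    Returns a Counter mapping each possible end index to the number
    of distinct parse trees that end there.
    """
    if i < len(tokens) and tokens[i] in DET:
        i += 1
    while i < len(tokens) and tokens[i] in A:
        i += 1
    if i >= len(tokens) or tokens[i] not in N:
        return Counter()
    ends = Counter({i + 1: 1})      # the NP may stop right after the noun
    frontier = Counter({i + 1: 1})  # positions where another PP could start
    while frontier:
        reached = Counter()
        for pos, ways in frontier.items():
            for end, pp_ways in count_pp(tokens, pos).items():
                reached[end] += ways * pp_ways
        ends.update(reached)
        frontier = reached
    return ends

def count_pp(tokens, i):
    """PP = P NP: a PP has as many parses as the NP after the preposition."""
    if i < len(tokens) and tokens[i] in P:
        return count_np(tokens, i + 1)
    return Counter()

def count_parses(phrase):
    # Only parses that consume the whole phrase are counted.
    tokens = phrase.lower().split()
    return count_np(tokens, 0)[len(tokens)]
```

Under that assumed rule, `count_parses("girl from school with red hair")` gives 2: one tree where "with red hair" belongs to "school" and one where it belongs to "girl". A parser has no way to pick between them without knowing what the sentence means.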

If you missed Martin Fowler's post on Oslo, it is a good read. I like how he defines it as a Language Workbench.

PS. I have started twittering. I know I am late to the game; I just didn't get the point of Twitter. I have been using it for two days now and I am beginning to see the light. Oh, and please skip pointing out the irony of the inevitable grammatical errors in this post :)