The only part of the Oslo presentations at PDC that caught my attention was the MGrammar language (Mg).

The Mg language provides simple constructs for describing the shape of a textual language – that shape includes the input syntax as well as the structure and contents of the underlying information.

The interesting part of Mg is how it combines schema, data transformation, and functional programming concepts to define rules and lists. Creating and designing a language is hard and requires some knowledge of how parsers work. As Frans Bouma and Roger Alsing have pointed out, Mg and Oslo are not going to change that. I haven't worked professionally with language parsers. I did write a compiler for a C-like language using LEX and YACC, but that was many years ago. One of the most popular tools for language creation today is ANTLR, and it would be great if someone knowledgeable in both ANTLR and Mg would write a comparison.

Anyway, I was intrigued by Mg so I decided to play around with it. I decided to create a simple DSL over the WatiN browser automation library. I wanted to be able to execute scripts that looked like this:

test "Searching google for watin"
    goto "http://www.google.se"
    type "watin" into "q"
    click "btnG"
    assert that text "WatiN Home" exists
    assert that element "res" exists
end

Maybe not the best possible DSL for browser testing; one could probably come up with something even more natural sounding, but it will be sufficient for now. To create the language specification I started Intellipad (an application that is included in the Oslo CTP). Getting the nice three-pane view, with input, grammar, and output windows, is kind of tricky. First switch the current mode to MGrammarMode: press Ctrl+Shift+D to bring up the minibuffer, then enter "SetMode('MGMode')". The MGrammar Mode menu should now be visible. From this menu select "Tree Preview", which brings up an open file dialog. In this dialog, create an empty .mg file and select it.

I entered my goal DSL in the dynamic parser window and began defining the syntax and data schema. After an hour of trial and error I arrived at this grammar:

module CodingInstinct {
    import Language;
    import Microsoft.Languages;
    export BrowserLang;
 
    language BrowserLang {
                  
        syntax Main = t:Test* => t;
        
        syntax Test = TTest name:StringLiteral a:ActionList TEnd
            => Test { Name { name }, a };
                       
        syntax ActionList
          = item:Action => ActionList[item]
          | list:ActionList item:Action => ActionList[valuesof(list), item];
                             
        syntax Action 
            = a:GotoAction => a
            | a:TypeAction => a
            | a:ClickAction => a
            | a:AssertAction => a;
            
        syntax GotoAction = TGoto theUrl:StringLiteral => GotoAction { Url { theUrl } };
        syntax TypeAction = TType text:StringLiteral TInto id:StringLiteral 
             => TypeAction { Text { text }, ID { id } };
        
        syntax ClickAction = TClick id:StringLiteral => ClickAction { ID { id } }; 
        syntax AssertAction
            = TAssert TText text:StringLiteral TExists => AssertAction { TextExists { text } }
            | TAssert TElement element:StringLiteral TExists => AssertAction { ElementExists { element } };
        
        @{Classification["Keyword"]} token TTest = "test";            
        @{Classification["Keyword"]} token TGoto = "goto";
        @{Classification["Keyword"]} token TEnd = "end";
        @{Classification["Keyword"]} token TType = "type";
        @{Classification["Keyword"]} token TInto = "into";
        @{Classification["Keyword"]} token TClick = "click";
        @{Classification["Keyword"]} token TAssert = "assert that";
        @{Classification["Keyword"]} token TExists = "exists";        
        @{Classification["Keyword"]} token TText = "text";        
        @{Classification["Keyword"]} token TElement = "element";
        
        interleave Skippable
          = Base.Whitespace+ 
          | Language.Grammar.Comment;       
                
        syntax StringLiteral
          = val:Language.Grammar.TextLiteral => val;        
    }

}

I have no idea if this is a reasonable grammar for my language or if it can be written in a simpler/smarter way. The grammar generates this M node graph:

[
  Test{
    Name{
      "\"Search google for watin\""
    },
    ActionList[
      GotoAction{
        Url{
          "\"http://www.google.se\""
        }
      },
      TypeAction{
        Text{
          "\"asd\""
        },
        ID{
          "\"google\""
        }
      },
      ClickAction{
        ID{
          "\"btnG\""
        }
      },
      AssertAction{
        TextExists{
          "\"text\""
        }
      },
      AssertAction{
        ElementExists{
          "\"asd\""
        }
      }
    ]
  }
]

The problem I had now was how to parse and execute this graph. I could not find any documentation on how to generate C# classes from an M schema. What the CTP does include is a C# library for navigating the node graph that the language parser generates. This node graph is not very easy to work with: I wanted a GotoAction node to be automatically mapped to a GotoAction class, a TypeAction node to a TypeAction class, and so on. To accomplish this I wrote a simple M node graph deserializer.

This is the AST I want the M node graph to deserialize to:

public class Test
{
    public string Name { get; set; }
    public IList<IAction> ActionList { get; private set; }

    public Test()
    {
        ActionList = new List<IAction>();
    }
}

public interface IAction
{
    void Execute(IBrowser browser);
}

public class GotoAction : IAction
{
    public string Url { get; set; }

    public void Execute(IBrowser browser)
    {
        browser.GoTo(Url);
    }
}
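
The post only shows GotoAction. As a rough sketch of how the other actions from the grammar could look, here are TypeAction, ClickAction, and AssertAction with property names taken from the grammar projections (Text, ID, TextExists, ElementExists). Everything else is an assumption made for illustration: ISketchBrowser, ISketchAction, and RecordingBrowser are hypothetical stand-ins, and their member names are not WatiN's actual API.

```csharp
using System;
using System.Collections.Generic;

namespace WatinDslSketch
{
    // Hypothetical stand-in for the browser abstraction used in the post.
    // The member names are assumptions, not WatiN's actual API.
    public interface ISketchBrowser
    {
        void TypeText(string elementId, string text);
        void Click(string elementId);
        bool ContainsText(string text);
        bool ContainsElement(string elementId);
    }

    // Hypothetical counterpart of the post's IAction interface.
    public interface ISketchAction
    {
        void Execute(ISketchBrowser browser);
    }

    // Mirrors the projection: TypeAction { Text { text }, ID { id } }
    public class TypeAction : ISketchAction
    {
        public string Text { get; set; }
        public string ID { get; set; }

        public void Execute(ISketchBrowser browser)
        {
            browser.TypeText(ID, Text);
        }
    }

    // Mirrors the projection: ClickAction { ID { id } }
    public class ClickAction : ISketchAction
    {
        public string ID { get; set; }

        public void Execute(ISketchBrowser browser)
        {
            browser.Click(ID);
        }
    }

    // Mirrors both projections: AssertAction { TextExists { text } }
    // and AssertAction { ElementExists { element } }. Only one is set per node.
    public class AssertAction : ISketchAction
    {
        public string TextExists { get; set; }
        public string ElementExists { get; set; }

        public void Execute(ISketchBrowser browser)
        {
            if (TextExists != null && !browser.ContainsText(TextExists))
            {
                throw new InvalidOperationException("Expected text not found: " + TextExists);
            }

            if (ElementExists != null && !browser.ContainsElement(ElementExists))
            {
                throw new InvalidOperationException("Expected element not found: " + ElementExists);
            }
        }
    }

    // Fake browser that records calls, used only to exercise the sketch.
    public class RecordingBrowser : ISketchBrowser
    {
        public readonly List<string> Calls = new List<string>();

        public void TypeText(string elementId, string text)
        {
            Calls.Add("type " + text + " into " + elementId);
        }

        public void Click(string elementId)
        {
            Calls.Add("click " + elementId);
        }

        public bool ContainsText(string text) { return true; }

        public bool ContainsElement(string elementId) { return true; }
    }
}
```

Keeping each property name identical to the label in the grammar projection is what lets a reflection-based deserializer fill these classes in without any per-class mapping code.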

It was quite tricky to write a generic deserializer, mostly because the M node object graph is kind of weird (Nodes, Sequences, Labels, Values, EntityMemberLabels, etc). Here is the code:

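using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
// Plus the Oslo CTP namespaces that provide GraphBuilder and Identifier.

public class MAstDeserializer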
{
    private GraphBuilder builder;

    public MAstDeserializer()
    {
        this.builder = new GraphBuilder();
    }

    public object Deserialze(object node)
    {
        if (builder.IsSequence(node))
        {
            return DeserialzeSeq(node).ToList();
        }

        if (builder.IsNode(node))
        {
            return DeserialzeNode(node);
        }

        return null;
    }

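    private object DeserialzeNode(object node)
    {
        var name = builder.GetLabel(node) as Identifier;

        // A node whose successor is a plain string is a leaf value, e.g. Url { "\"...\"" }.
        foreach (var child in builder.GetSuccessors(node))
        {
            if (child is string)
            {
                return UnQuote((string)child);
            }
        }

        // Otherwise map the node label to an AST class with the same name.
        var obj = Activator.CreateInstance(Assembly.GetExecutingAssembly().FullName, "WatinDsl.Ast." + name.Text).Unwrap();

        // Fill in the new object's properties and lists from the child nodes.
        InitilizeObject(obj, node);

        return obj;
    }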

    private void InitilizeObject(object obj, object node)
    {
        foreach (var child in builder.GetSuccessors(node))
        {
            if (builder.IsSequence(child))
            {
                foreach (var element in builder.GetSequenceElements(child))
                {
                    AddToList(obj, child, element);
                }
            }
            else if (builder.IsNode(child))
            {
                obj.SetPropery(builder.GetLabel(child).ToString(), DeserialzeNode(child));
            }
        }
    }

    private void AddToList(object obj, object parentNode, object element)
    {
        var propertyInfo = obj.GetType().GetProperty(builder.GetLabel(parentNode).ToString());
        var value = propertyInfo.GetValue(obj, null);
        var method = value.GetType().GetMethod("Add");
        method.Invoke(value, new[] { DeserialzeNode(element) });
    }

    private IEnumerable<object> DeserialzeSeq(object node)
    {
        foreach (var element in builder.GetSequenceElements(node))
        {
            var obj = DeserialzeNode(element);
            yield return obj;
        }
    }

    private object UnQuote(string str)
    {
        return str.Substring(1, str.Length - 2);
    }
}
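
The deserializer calls obj.SetPropery(...), an extension method that is not shown in the post. A minimal sketch of what such a helper might look like, assuming it simply sets a public property by name through reflection, is below. The class name ObjectExtensions and the SampleAction class are hypothetical and exist only to make the sketch self-contained; the method name keeps the spelling used in the code above.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: the post calls obj.SetPropery(name, value) but does not show it.
public static class ObjectExtensions
{
    public static void SetPropery(this object obj, string propertyName, object value)
    {
        if (obj == null)
        {
            throw new ArgumentNullException("obj");
        }

        // Look up a public instance property whose name matches the node label.
        PropertyInfo property = obj.GetType().GetProperty(propertyName);

        // Fail with a readable message if the AST class has no matching public setter.
        if (property == null || property.GetSetMethod() == null)
        {
            throw new InvalidOperationException(
                "No settable property '" + propertyName + "' on " + obj.GetType().Name);
        }

        property.SetValue(obj, value, null);
    }
}

// Hypothetical AST class used only to exercise the helper.
public class SampleAction
{
    public string Url { get; set; }
}
```

With this in place, new SampleAction().SetPropery("Url", "http://www.google.se") assigns the Url property, and a label with no matching property fails with a descriptive exception rather than a NullReferenceException.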

I guess in future versions of Oslo previews something like the above deserializer will be included as it is essential for creating executable DSLs. Maybe the Oslo team has another option for doing this, for example generating Xaml from the node graph which can then initialise your AST.

So how do we compile and run code in our new WatiN DSL? First we need to compile the grammar .mg file into a .mgx file, which is done with the MGrammarCompiler. We can then use the .mgx file to create a parser. The parser generates a node graph, which we deserialize into our custom AST.

public class WatinDslParser
{
  public object Parse(string code)
  {
    return Parse(new StringReader(code));
  }

  public object Parse(TextReader reader)
  {
    var compiler = new MGrammarCompiler();
    compiler.FileNames = new[] { "BrowserLang.mg" };
    compiler.Target = Target.Mgx;
    compiler.References = new string[] { "Languages", "Microsoft.Languages" };
    compiler.Execute(ErrorReporter.Standard);

    var parser = MGrammarCompiler.LoadParserFromMgx("BrowserLang.mgx", "CodingInstinct.BrowserLang");

    object root = parser.ParseObject(reader, ErrorReporter.Standard);

    return root;
  }
}

The reason I compile the grammar from code every time I run a script is so I can easily change the grammar and rerun without going through a separate compiler step. The Parse function above returns the M node graph. Everything is glued together in the WatinDslRunner class:

public class WatinDslRunner
{
    public static void RunFile(string filename)
    {
        var parser = new WatinDslParser();
        var deserializer = new MAstDeserializer();

        using (var reader = new StreamReader(filename, Encoding.UTF8))
        {
            var rootNode = parser.Parse(reader);
            var tests = (IEnumerable)deserializer.Deserialze(rootNode);

            foreach (Test test in tests)
            {
                RunTest(test);
            }
        }
    }

    public static void RunTest(Test test)
    {
        Console.WriteLine("Running test " + test.Name);
        using (var browser = BrowserFactory.Create(BrowserType.InternetExplorer))
        {
            foreach (var action in test.ActionList)
            {
                action.Execute(browser);
            }
        }
    }
}

If you have problems with the code above, please remember that the code in this post is just an experimental spike to learn MGrammar and the M Framework library. If you want to experiment with this yourself, download the code+solution: WatinDsl.zip.

Summary and some other thoughts on Oslo/Quadrant

It was quite a bit of work going from my textual DSL to something executable. The majority of the time was spent figuring out the M node graph and how to parse and deserialize it; writing the grammar was very simple. MGrammar will definitely make it easier to create simple data definition languages that could replace some existing XML-based solutions, but I doubt that it will be widely used in enterprise apps for creating executable languages. Maybe it is more suited for tool and framework providers. It is the first public release, so a lot will probably change and be improved, and it is too early to say how much of an impact M/Oslo will have for .NET developers.

I got home from PDC quite puzzled over Oslo and the whole model-driven development thing. They only talked about data, data, data, I don't think they mentioned the word BEHAVIOR even once during any Oslo talk that I attended, to me that is kind of important :) I asked others about this and most agreed that they did not understand the point of Oslo, or how it would improve/change application development significantly.

Sure, I found Quadrant to be a cool application that could potentially replace some Excel / Access solutions, but what else? In what way is Quadrant interesting for application developers? It would be interesting to get some comments on what others think about MGrammar, Quadrant & Model-driven development :)

20 comments:

Robert A. said...

Cool post! I really need to download the CTP and test this shit.

Dan Vanderboom said...

Nice post. It looks like you may have done it the hard way, though. If you would have gone to the M Language session (not the MGrammar one), you would have seen how to use m.exe and mx.exe to pull that D Graph into a SQL Server database. Then you could have automatically generated your Linq Entities or Linq SQL definitions from that.

But as far as working with it outside a SQL Server database, I'm not quite sure how to do it better.

Torkel Ödegaard said...

Yea, I knew I could use the SQL route, but it is not really an option in this case, well it would be a really strange thing to do :)

Dan Vanderboom said...

It would be difficult or impossible to implement a generic translator from the D Graph to a set of classes, due to the flexibility that you have in defining projections with MGrammar. But if you came up with a set of standards for your projections, it shouldn't be too tough.

Torkel Ödegaard said...

Isn't that what I have done? A translator from the graph to a set of classes? (granted, a basic one)

Is the data graph that the parser spits out called a D graph? Did not know that, what does the D stand for? :)

Colin Jack said...

"They only talked about data, data, data, I don't think they mentioned the word BEHAVIOR even once during any Oslo talk that I attended, to me that is kind of important"

Yeah the whole think of your model as an XSD thing was a good example of this but I've also seen zero discussion of behavioral models in most of the examples which either shows that they forgot to mention it or that they're only thinking data (if so, oh dear oh dear).

Not really been able to get much out of the Oslo people yet on the topic, be interested to hear what you find out as to me creating behavioral models is far more exciting than little data packages and WF/rules/services operating on them.

Dan Vanderboom said...

Yeah, good point. But when I saw this, I could tell it wasn't totally generic because you're presuming a fixed namespace, but with a tweak that could be turned into a parameter.

var obj = Activator.CreateInstance(Assembly.GetExecutingAssembly().FullName, "WatinDsl.Ast." + name.Text).Unwrap();

I think the D in D Graph stands for Data. That's my guess anyway. Good work, by the way. I think M/Oslo is fantastic and want it to get as much exposure as possible.

Torkel Ödegaard said...

Aha, yes, the namespace should definitely be a parameter :)

I found the MGrammar language really natural and easy to understand/learn, but I feel the Oslo team must be better at communicating what their vision is for this.

Dan said...

I guess D stands for directed. It's a labeled directed graph!

douglasp said...

http://douglaspurdy.com/2008/11/08/why-oslo/

Torkel Ödegaard said...

@Douglas Purdy

Thanks, that post clarifies some things.

But I think Colin raises a valid concern in the comments to that post.

"Basically I’m hoping that at this level we don’t end up with a Customer “entity” in a model where that entity has no behavior and is operated on by WF/rules. Might not be a realistic issue with Oslo but it was my main worry when I looked at the examples out there (been there done that…)."

If behaviour is modelled as data, where is the glue that combines everything and creates an application? WF?

It would be great with some real application examples showing how you envision model-driven development.

That being said, I am very impressed with the MGrammar language, I really like the syntax :)

I was also very impressed with Quadrant, it seems like an extremely fun app to develop, wish I was working on the Oslo team :)
But I don't see what part it plays in application development.

Colin Jack said...

@Torkel
First off thanks for the example, I finally downloaded it as well as Oslo and have happily been playing around with them. I actually think DSLs for acceptance testing could be very useful so I think it's a good example.

I think what you're doing here is what was suggested to me in the last comment on this post
http://tinyfinger.blogspot.com/2008/11/oslo-is-that-all-it-is.html, namely:

"You can choose to use our MGrammar language to write a DSL. MGrammar produces structured data. You can then write a small program to process that data and turn it into your OM or anything else you want to"

I must admit I hadn't thought of this before because it seems a little small scale compared to the grand pronouncements at PDC.

I'm also in two minds about the approach, at least as far as a domain model. Mainly because if I'm going down that path, and if I'm really only interested in textual DSLs, then I could also just use something like Boo. Since I haven't yet written any Boo/M DSLs though I could be wrong. I'm also thinking that whilst internal DSLs aren't as flexible they do provide better dev support and support for conditional logic (as you pointed out on ALT.NET group) and so on but at this stage who knows.

My recent reading has also clarified things a lot, whilst the early problem (as opposed to technical) domain Oslo examples showed people doing their entire model in M I'm now thinking that's not really a good solution. Would be great if it worked well though, terms like aggregate/entity/valueobject/resource/representation being part of the language would be pretty sweet.

So I'm now thinking the sort of DSL you show here, written and then interpreted down to a standard OM, is going to be useful and is really the best place to start.


"It would be great with some real application examples showing how you envision model-driven development."

Yeah this would be very useful at this stage.

Anonymous said...

Torkel,

Programming against IGraphBuilder is a pretty "deep" way to use the parser.

If you look at the IGraphBuilder->XAML work that Spanky did (see http://www.pluralsight.com/community/blogs/dbox/archive/2008/11/12/consuming-mgrammar-output-from-c.aspx) you'll get a feel for where I think the sweet spot is for consuming parser output from C#.

Thanks,
DB

Torkel Ödegaard said...

That is great Don, an MGraph->Xaml should work great as a generic MGraph -> .NET classes deserializer.

I will try it out. Should make it easier to work with the MGraph.

Justin Chase said...

You basically read my mind here, I was doing some similar messing around. I was thinking though about writing something that would generate the AST for you based on the grammar rather than just mapping your language to the pre-defined AST objects.

But Mg is much more interesting to me than M. I hate to say it though, but I'm not sure why it might be better than just using ANTLR (which I'm not very familiar with)?

Anonymous said...

In the following blog you will find a good implementation of controls in WatiN.
http://tanvirdotnet.blogspot.com/2008/11/ui-test-with-watin-for-tdd-in-net.html

Justin Chase said...

I've been working on a generalized way to transform MGraph into .NET classes. I've written a "templating" DSL where you can bind your MGraph to code and compile it down to codedom objects.

http://www.justnbusiness.com/post/2009/03/11/MetaSharp-code-generation-success!.aspx

Coupled with some helper MSBuild tasks you can convert your DSLs into new types at build time.

Justin Chase said...

Whoops, a real link this time:
http://www.justnbusiness.com/post/2009/03/11/MetaSharp-code-generation-success!.aspx

Dmitriy Nagirnyak said...

I have no idea why M should be chosen instead of ANTLR.

A while ago I did write the ANSI Pascal Interpreter using ANTLR. That was a proud story!

It has lexer, grammar and it gives you ready AST with ability to iterate over it and perform whatever action you need.

Cheers,
Dmitriy.

Torkel Ödegaard said...

@Dmitri

Well that is a legitimate question :)