Semantic Model (reposted from martinfowler)

Semantic Model
The domain model that's populated by a DSL
How it Works
In the context of a DSL, a semantic model is an in-memory representation, usually an object model, of the same subject that the DSL describes. If my DSL describes a state machine, then my Semantic Model would be an object model with classes for state, event, etc. A DSL script defining particular states and events would correspond to a particular population of that schema, with an event instance for each event declared in the DSL script.
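As a rough illustration of the state machine case, here is a minimal sketch of what such an object model could look like. The class names (`Event`, `State`, `StateMachine`, `SemanticModelExample`), the state and event names, and the script syntax shown in the comment are assumptions made for this sketch, not the framework from the book.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal Semantic Model for a state machine DSL.
class Event {
    final String name;
    Event(String name) { this.name = name; }
}

class State {
    final String name;
    // Outgoing transitions, keyed by the name of the triggering event.
    private final Map<String, State> transitions = new HashMap<>();

    State(String name) { this.name = name; }

    void addTransition(Event trigger, State target) {
        transitions.put(trigger.name, target);
    }

    // Returns the target state, or this state if the event is not handled here.
    State next(Event event) {
        return transitions.getOrDefault(event.name, this);
    }
}

class StateMachine {
    final State start;
    StateMachine(State start) { this.start = start; }
}

class SemanticModelExample {
    // The population that a script along the lines of
    //   "idle --doorClosed--> active; active --doorOpened--> idle"
    // (made-up syntax) might produce: one object per declared state and event.
    static StateMachine build() {
        Event doorClosed = new Event("doorClosed");
        Event doorOpened = new Event("doorOpened");
        State idle = new State("idle");
        State active = new State("active");
        idle.addTransition(doorClosed, active);
        active.addTransition(doorOpened, idle);
        return new StateMachine(idle);
    }
}
```

Note that the two states refer to each other, so the populated model is a graph with a cycle rather than a tree.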
A Semantic Model is really just a Domain Model that is populated by the DSL. As with any Domain Model, it contains the heart of the behavior for the domain. The Semantic Model of a DSL is usually a subset of the overall Domain Model for an application. From the Domain Model's point of view, the DSL is just a fancy alternative way of creating its objects and hooking them together. From the DSL's point of view, the Semantic Model is the output of the overall parsing operation.
A Semantic Model is usually different to a syntax tree because they serve separate purposes. A syntax tree corresponds to the structure of the DSL scripts. Although an abstract syntax tree may simplify and somewhat reorganize the input data, it still takes fundamentally the same form. The Semantic Model, however, is based around what will be done with the information from a DSL script. It often will be a substantially different structure and will usually not be a tree structure. There are occasions when an AST is an effective Semantic Model for a DSL, but these are the exception rather than the rule.
Traditional discussions of languages and parsing don't use a Semantic Model. This is part of the difference between working with DSLs and with general purpose languages. A syntax tree usually makes a suitable structure on which to base code generation for a GPL, so there's less desire to have a different Semantic Model. From time to time a Semantic Model is used; for instance, a call graph representation is very useful for optimization. In this case these models are referred to as intermediate representations - they are usually intermediate steps before code generation.
The Semantic Model can often precede the DSL. This happens when you decide that a portion of a Domain Model might be better populated from a DSL than from the regular push-button interface. Alternatively, you can build a DSL and Semantic Model together, using the discussions with domain experts both to refine the expression of the DSL and the structure of the Domain Model.
It's usually helpful to think of the Semantic Model as having two distinct interfaces. One interface is the operational interface - the one that allows clients to use a populated model in the course of their work. The second is the population interface, which is used by the DSL to create instances in the model.
The operational interface should assume the Semantic Model has already been created and make it easy for other parts of the system to take advantage of it. I've often found that a good mental trick for API design is to just assume the model is magically already there and then ask myself how would I then use it. This can be counter-intuitive because I find it's better to define the operational interface before you think about the population interface, even though a running system will have to execute the population interface first. This is a general rule of thumb for me with any objects, not just DSLish ones.
The population interface is only used to create instances in the model and may only be used by the parser (and test code for the Semantic Model). Although we seek to decouple the Semantic Model and the parser(s) as much as possible, there is always a dependency in that the parser obviously needs to see the Semantic Model in order to populate it. Despite this, by building a clear interface we can reduce the chances of an implementation change in the Semantic Model forcing a change in the parser.
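One way to make the two interfaces explicit in Java is to give each its own interface type and let a single class implement both. This is only a sketch under that assumption: the names `Machine`, `MachineBuilder`, and `SimpleMachine`, and the choice to identify states and events by plain strings, are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Operational interface: what clients of a populated model see.
interface Machine {
    String currentState();
    void handle(String eventName);
}

// Population interface: what a parser (or a test) uses to build the model.
interface MachineBuilder {
    void declareState(String name);
    void declareTransition(String from, String event, String to);
    Machine startingAt(String stateName);
}

class SimpleMachine implements Machine, MachineBuilder {
    // state name -> (event name -> target state name)
    private final Map<String, Map<String, String>> table = new HashMap<>();
    private String current;

    @Override public void declareState(String name) {
        table.putIfAbsent(name, new HashMap<>());
    }

    @Override public void declareTransition(String from, String event, String to) {
        if (!table.containsKey(from) || !table.containsKey(to)) {
            throw new IllegalArgumentException("undeclared state in " + from + " -> " + to);
        }
        table.get(from).put(event, to);
    }

    @Override public Machine startingAt(String stateName) {
        if (!table.containsKey(stateName)) {
            throw new IllegalArgumentException("undeclared state: " + stateName);
        }
        current = stateName;
        return this; // callers only see the operational view from here on
    }

    @Override public String currentState() { return current; }

    @Override public void handle(String eventName) {
        // Unhandled events leave the machine in its current state.
        current = table.get(current).getOrDefault(eventName, current);
    }
}
```

With this split, a parser would hold only a `MachineBuilder` reference and the rest of the system only a `Machine`, so the internal table could change shape without either side noticing.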
When to use it
My default advice is to always use a Semantic Model. I'm always rather uncomfortable when I say "always" because usually I find such absolute advice a strong sign of closed-minded thinking. In this case it may be my limited imagination, but I only see very few cases when you might not want to use a Semantic Model, and these are all in very simple situations.
I find a Semantic Model brings many compelling advantages. A clear Semantic Model allows you to test the semantics and the parsing of the DSL separately. You can test the semantics by populating the Semantic Model directly and executing tests against the model. You can test the parser by seeing if it populates the Semantic Model with the correct objects. If you have more than one parser, you can test if they produce semantically equivalent output by comparing the population of the Semantic Model. This makes it easy to support multiple DSLs and, more commonly, to evolve the DSL separately from the Semantic Model.
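The idea of comparing populations can be sketched with two tiny parsers that target the same model. Everything here is hypothetical: the `TransitionModel` class, the one-triple-per-line external syntax, and the fluent `from(...).on(...).to(...)` internal DSL are assumptions chosen to keep the sketch short.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical Semantic Model: just enough structure to compare populations.
class TransitionModel {
    private final Set<String> transitions = new TreeSet<>();

    // Population interface.
    void addTransition(String from, String event, String to) {
        transitions.add(from + " --" + event + "--> " + to);
    }

    // A canonical view of the population, used to compare two models.
    Set<String> population() { return transitions; }
}

// Parser 1: an external DSL with one "from event to" triple per line (assumed syntax).
class LineParser {
    static TransitionModel parse(String script) {
        TransitionModel model = new TransitionModel();
        for (String line : script.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue;
            String[] words = trimmed.split("\\s+");
            if (words.length != 3) throw new IllegalArgumentException("bad line: " + line);
            model.addTransition(words[0], words[1], words[2]);
        }
        return model;
    }
}

// Parser 2: an internal DSL expressed as a fluent Java builder.
class FluentBuilder {
    private final TransitionModel model = new TransitionModel();
    private String from;
    private String event;

    FluentBuilder from(String state) { from = state; return this; }
    FluentBuilder on(String eventName) { event = eventName; return this; }
    FluentBuilder to(String state) { model.addTransition(from, event, state); return this; }
    TransitionModel build() { return model; }
}
```

A test for semantic equivalence then reduces to checking that `LineParser.parse(script).population()` equals the `population()` of the model built through `FluentBuilder` for the same transitions, regardless of declaration order.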
The Semantic Model increases the flexibility in parsing and also in execution. You can execute the Semantic Model directly or you can use code generation. If you're using code generation you can base that code generation off the Semantic Model, which completely decouples it from parsing. You can also execute both the Semantic Model and the generated code - which allows you to use the Semantic Model as a simulator for the generated code. A Semantic Model also makes it easier to have multiple code generators because the independence of the parser avoids any need to duplicate parser code.
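To make the two execution options concrete, the sketch below runs one populated model directly and also generates code from it. The `MachineModel`, `Interpreter`, and `Generator` classes are hypothetical, and the emitted C-like function is a made-up target format chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Semantic Model: state name -> (event name -> target state name).
class MachineModel {
    final String start;
    final Map<String, Map<String, String>> table = new LinkedHashMap<>();

    MachineModel(String start) {
        this.start = start;
        table.put(start, new LinkedHashMap<>());
    }

    void addTransition(String from, String event, String to) {
        table.computeIfAbsent(from, k -> new LinkedHashMap<>()).put(event, to);
        table.computeIfAbsent(to, k -> new LinkedHashMap<>());
    }
}

// Option 1: execute the model directly.
class Interpreter {
    static String run(MachineModel model, String... events) {
        String current = model.start;
        for (String event : events) {
            // Unhandled events leave the machine where it is.
            current = model.table.get(current).getOrDefault(event, current);
        }
        return current;
    }
}

// Option 2: generate code from the same model, with no parser involved.
class Generator {
    static String generate(MachineModel model) {
        StringBuilder out =
            new StringBuilder("const char* next(const char* s, const char* e) {\n");
        for (Map.Entry<String, Map<String, String>> state : model.table.entrySet()) {
            for (Map.Entry<String, String> t : state.getValue().entrySet()) {
                out.append("  if (!strcmp(s, \"").append(state.getKey())
                   .append("\") && !strcmp(e, \"").append(t.getKey())
                   .append("\")) return \"").append(t.getValue()).append("\";\n");
            }
        }
        return out.append("  return s;\n}\n").toString();
    }
}
```

Because both classes read only the model, the interpreter can serve as a simulator for whatever the generator emits, and a second generator could be added without touching any parser.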
But the most important part of using a Semantic Model is that it separates thinking about semantics from thinking about parsing. Even a simple DSL contains enough complexity to make it worth dividing it up into two simpler problems.
So what are the few exceptions I envisage? One case is simple imperative interpretation where you just want to execute each statement as you parse it. The classic calculator program, where you calculate and print simple arithmetic expressions, is a good example of this. With arithmetic expressions, even if you don't interpret them immediately, their abstract syntax tree is pretty much what you'd have as a Semantic Model anyway, so there's no value in having a separate syntax tree and Semantic Model for that case. That's an example of a more general rule: if you can't think of a more useful model than the AST then there's little point creating a separate Semantic Model.
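For comparison, here is a sketch of the calculator case: a small recursive descent parser that computes each value as it parses, building neither a syntax tree nor a Semantic Model. The `Calculator` class and its grammar (the four basic operators, parentheses, unsigned decimal numbers) are assumptions for this illustration.

```java
// Evaluates arithmetic expressions while parsing them.
class Calculator {
    private final String src;
    private int pos = 0;

    private Calculator(String src) { this.src = src; }

    static double evaluate(String text) {
        Calculator c = new Calculator(text);
        double value = c.expression();
        c.skipSpaces();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + c.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := number | '(' expression ')'
    private double factor() {
        if (accept('(')) {
            double value = expression();
            if (!accept(')')) throw new IllegalArgumentException("expected ')' at " + pos);
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("expected number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    // Consumes the expected character (after skipping spaces) if it is next.
    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) { pos++; return true; }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

Each grammar rule returns a number directly, so there is nothing left over after parsing that a separate model could usefully hold.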
The most common case where people don't use a Semantic Model is when they're generating code. In this approach the parser can generate an AST and the code generator can work directly off the AST. This is a reasonable approach providing the AST is a good model of the underlying semantics and you don't mind coupling the code generation logic to the AST. If those aren't the case you may well find it's simpler to transform the AST to a Semantic Model and do a simpler code generation from that.
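A transformation from AST to Semantic Model might look roughly like the following. The node and model classes (`TransitionNode`, `ModelState`, `AstToModel`) are hypothetical; the point of the sketch is only that the AST mirrors the statements of the script and refers to states by name, while the model resolves those names into shared objects that link to each other.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// AST node: mirrors one transition statement in the script; states are just names.
class TransitionNode {
    final String from, event, to;
    TransitionNode(String from, String event, String to) {
        this.from = from;
        this.event = event;
        this.to = to;
    }
}

// Semantic Model: states are objects that reference each other directly,
// so the result is a graph (possibly cyclic), not a tree.
class ModelState {
    final String name;
    final Map<String, ModelState> targets = new HashMap<>();
    ModelState(String name) { this.name = name; }
}

class AstToModel {
    // Resolves state names to shared ModelState objects and links them.
    static Map<String, ModelState> transform(List<TransitionNode> ast) {
        Map<String, ModelState> states = new HashMap<>();
        for (TransitionNode node : ast) {
            ModelState from = states.computeIfAbsent(node.from, ModelState::new);
            ModelState to = states.computeIfAbsent(node.to, ModelState::new);
            from.targets.put(node.event, to);
        }
        return states;
    }
}
```

A code generator written against `ModelState` can follow object references instead of repeating the name lookups, which is the kind of simplification the paragraph above has in mind.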
Such is my bias, however, that I'd always start by assuming I need a Semantic Model. Even if thinking through convinces me that one isn't necessary I'd stay alert to increasing complexity and put one in as soon as I start seeing any complication coming into my parsing logic.
Example: The Introductory Example (Java)
There are lots of examples of Semantic Models in this book, precisely because I favor using a Semantic Model so much. A useful one to focus on to illustrate the point is the one I use in the initial example - the secret panel controller state machine. Here the Semantic Model is the state machine framework. I didn't use the term Semantic Model in the discussion initially since my purpose in the introduction is to introduce the notion of a DSL. As a result I found it easier to discuss by assuming the framework was built first and the DSL layered on top of it. This still makes the framework a Semantic Model, but because you are thinking inside-out it's not such a good way to approach the discussion.
However the classic strengths of a Semantic Model are all there. I can (and did) test the state machine framework independently of writing the DSLs. I did some refactoring of the implementation of the framework without having to touch the parsing code, because my implementation changes didn't alter the population interface. Even if I did have to alter these methods, most of the time the changes would be easy to follow from the parser code because that interface marks a clear boundary.
While it's not terribly common to support multiple DSLs for the same Semantic Model, this was a requirement for my example. A Semantic Model made this relatively easy. I had multiple parsers with both internal and external DSLs. I could test them by ensuring they create equivalent populations of the Semantic Model. I could easily add a new DSL and parser without duplicating any code in other parsers or altering the Semantic Model. This advantage also worked for output. As well as having the Semantic Model execute directly as a state machine, I could also use it as the basis for multiple code generation examples.
Significant Revisions
13 May 08: First Draft