The Pattern Movement in Software Engineering has become very popular and has produced many books, some of them arguably extremely good, but apart from automatic refactoring in modern IDEs, the number of supporting tools is surprisingly low.

What Is A Pattern Anyway

You can get a good overview of Design Patterns at the Portland Pattern Repository. For the purpose of this article we just take the classic definition from the Gang Of Four book:

A design pattern systematically names, motivates, and explains a general design that addresses a recurring design problem in object-oriented systems. It describes the problem, the solution, when to apply the solution, and its consequences. It also gives implementation hints and examples. The solution is a general arrangement of objects and classes that solve the problem. The solution is customized and implemented to solve the problem in a particular context.

The crucial bit is the last sentence: Design patterns need to be implemented specifically for each particular problem. The classes in an object-oriented design pattern are no classes at all. They are archetypes, and for the particular situation real classes have to be crafted after those archetypes.

To put it in other words, a design pattern is an archetypical solution to an archetypical problem, and that’s also the reason why you can’t press patterns into libraries. If you compare two instances of the pattern, i.e. two cases where the pattern has been used to solve the problem, you’ll find almost no code that could be factored out. The pattern is not code, the pattern is a way the code is structured. Lexically two instances of a pattern have nothing in common, everything is only similar, nothing is the same.
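To make that comparison concrete, here is a minimal sketch of two instances of the Observer pattern. All class and method names (Thermometer, OrderBook and so on) are made up for illustration. Both follow the same arrangement of a subject that keeps a list of listeners and notifies them, yet no line of one could be reused in the other.

```java
import java.util.ArrayList;
import java.util.List;

// Two hypothetical instances of the Observer pattern, side by side.
final class ObserverInstances {

    // Instance 1: a thermometer notifies listeners about new readings.
    interface ReadingListener {
        void readingChanged(double celsius);
    }

    static final class Thermometer {
        private final List<ReadingListener> listeners = new ArrayList<>();

        void addReadingListener(ReadingListener listener) {
            listeners.add(listener);
        }

        void measure(double celsius) {
            for (ReadingListener listener : listeners) {
                listener.readingChanged(celsius);
            }
        }
    }

    // Instance 2: an order book notifies handlers about filled orders.
    interface FillHandler {
        void filled(String orderId, int quantity);
    }

    static final class OrderBook {
        private final List<FillHandler> handlers = new ArrayList<>();

        void subscribe(FillHandler handler) {
            handlers.add(handler);
        }

        void fill(String orderId, int quantity) {
            for (FillHandler handler : handlers) {
                handler.filled(orderId, quantity);
            }
        }
    }
}
```

The structure is identical, but the names, the types and the method signatures all differ, which is exactly what makes factoring the pattern out so unrewarding.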

In some cases it is possible to put a pattern into a library after all, but when you begin to do so, you quickly find out that the result is not useful. Too much varies, and you end up with endless parameter lists or endlessly overloaded variations of the same methods. Such libraries are hard to implement, and they don’t feel natural to use.

Generics (templates in C++) are a step in the right direction, but although they make creating archetypical libraries easier, those libraries are not necessarily easy to understand and use, much less to implement.

The Most Simple Patterns

Design patterns don’t come only at the level of class interaction in object-oriented systems. Just look at programming languages. In the beginning there was only assembly language. We basically wrote machine instructions, but after a while it turned out that we did similar things over and over again.

One of those things was to build loops of code. People did completely different things in the bodies of their loops, but there were only a few patterns for how the loop was entered, how it was exited, and how often it was executed. With the advent of higher-level programming languages, those patterns were cast into basic language constructs such as “WHILE-DO”, “DO-UNTIL”, “FOR” or “FOREACH”, and these language constructs are still with us.

Assembly language did not have those high-level loops, but it had a conditional branch. Loops had to be implemented with conditional branches. The direct equivalent to that is the combination of the “IF” and “GOTO” statements in programming languages, and of course you can use those to implement all your loops. It’s only more verbose and the intent is less clear, and that is because you don’t express your intent by naming it. Patterns are much about naming.

Pattern Languages

Interestingly enough, an “IF” statement is also a pattern. Just look at the infinite number of conditions. This brings us to another important fact, the fact that patterns are not all born equal. They come in hierarchies. There are base patterns and composite patterns. Just like with “IF-GOTO”, you can use patterns to build other, more complex patterns.

Now, if you look at where the pattern movement came from, being initiated by the architect Christopher Alexander, and if you read his books, you see that he clearly understood the hierarchical nature of design patterns. That’s why he called it a Pattern Language.

Well, I’ve never tried to build a house, and if I were to build one, I’d probably not apply all of his patterns. Many of them I agree with, some I don’t, and for most there are alternatives that he does not cover. But that again coincides nicely with the notion of languages.

My native language is German, I regularly use English, and I have a basic and spotty understanding of some of the languages originating in Latin, but that’s it. No idea of Chinese, no idea of any African language, no idea of anything else. And still, all those languages that I don’t have a clue of can express exactly the same things that German or English can. They only use different patterns to solve the same problems.

So I guess we can agree that it is possible to build different houses, houses that don’t fit into Christopher Alexander’s language, and we may still be able to live in them and stay sane.

For the adoption of Alexander’s method, this may really have been detrimental. It is much too obvious that, although it makes sense as a system, you may not be able to use it and build your dream. It would be Alexander’s dream instead, and when it comes to our dreams, we really dislike any intrusion.

This may explain Alexander’s limited success (and I don’t want to belittle him, he is really one of my heroes, even if he has not taken over the world), but it would not get in our way when building programs, would it? After all, houses are for individuals, and individuals care for their individuality, but CRUD is just CRUD, ain’t it?

Maybe not. People seem to have a tendency to willingly and knowingly reinvent the wheel. Look at me: I could be satisfied using the tools I have, using them to create the things I’m paid for, and otherwise have a life. Or else just join the development of the Eclipse Modeling tools. In a way there’s so much already out there, I can’t have much hope to make a valuable contribution. And still I do what I do, knowing that I’m bound to reinvent, willingly accepting it. And why? Because it’s fun :)

I don’t know if this explains anything, but the fact is that the Design Pattern movement in computer science has not produced anything that even remotely comes close to Alexander’s hierarchical completeness. With Alexander’s language you can build a house, a street, a village, a town, a region, and in the other direction you can go down to details like the actual building materials.

You can’t do that with design patterns in computer programming. The high-level patterns are all missing. The discipline has evolved to the point where it has become possible to talk to each other using design patterns, but it’s only at that certain level of detail where we talk about basic object interaction.

The Necessary Next Step

Looking into some very simple patterns of repeated assembly code, we found that these patterns became the loop constructs of modern programming languages. Those loop patterns describe arrangements of assembly instructions, but the important thing to note here is that today nobody arranges those instructions themselves. We use compilers for that. This is very different from how we work with Gang-Of-Four patterns, because they have to be instantiated by hand.

A compiler can do what it does because it has a complete model of the program. All variables that the built-in patterns, such as loop constructs, refer to are defined in the same model. The semantics are defined entirely in terms of the language and its patterns, and where this is not the case, they are defined by standardized libraries. What this boils down to is that a programming language allows us to express the desired semantics by constructing a model. The compiler then applies its patterns to this model and in this way constructs either code or calls into supporting libraries.

That’s exactly what code generators do. They take a model and translate it into code of a different, simpler language, and where it makes sense, they generate not code but calls into external libraries. In this context, libraries have two purposes:

  1. they avoid code duplication
  2. they allow us to express semantics that can’t be expressed in the language’s patterns

#1 is nice to have, but #2 is really important, because it means that we can generate code even when our patterns are not a complete, self-sufficient system. What we can’t express in our language, we simply assume to be implemented in libraries.
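A toy example may help here. The following is only a sketch under my own assumptions: the “model” is nothing more than an entity name and a list of typed fields, and “Persistence” is a hypothetical support library, not a real API. The generator applies one fixed pattern to the model and emits source text. Whatever the model cannot express (here: how an object is stored) becomes a call into the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a tiny model and a generator that turns it into source text.
final class EntityGenerator {

    // The model is an entity name plus an ordered map of field name -> Java type.
    static String generate(String entityName, Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        out.append("class ").append(entityName).append(" {\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            out.append("    private ").append(field.getValue()).append(' ')
               .append(field.getKey()).append(";\n");
        }
        // Semantics the model cannot express are delegated to a library call.
        // "Persistence" is an assumed support library, it does not exist as such.
        out.append("    void save() { Persistence.save(this); }\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "String");
        fields.put("age", "int");
        System.out.print(generate("Customer", fields));
    }
}
```

The generator itself is trivial. The point is that the generated class looks hand-written and fits the application exactly, while the part that is the same everywhere stays in the library.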

This is the way to go. We have to build higher-order languages, languages that implement recurring patterns just the way today’s programming languages implement assembly loop patterns. Once we have such languages, we can translate them automatically into code. Just like conventional programming languages, they will be the more powerful the more self-sufficient they are, but that does not mean that small and incomplete steps can’t be incredibly useful in their own right.

What I want to implement as my project are three things:

  1. an environment to specify models,
  2. a set of patterns that, when applied to such a model, turns the model into an application, and finally
  3. a code generator that makes this process automatic

That’s it. Easy, huh?

 

The desire for code reuse has been a driving force behind most efforts in software engineering, and in this article we will look into three increasingly sophisticated ways to achieve reuse.

Libraries

Libraries can be anything from Cobol copy libs to modern shared libraries. They can be the result of meticulous modularization in your own, earlier applications (a rare case), they can be part of a “system library” (like the vast set of libraries in UNIX or the Java Runtime Environment), or they can be third-party libraries, open source or commercial. It does not matter which, all libraries have something in common: They never exactly fit your application.

In order to be re-usable, libraries have to cater to different users. It also does not matter whether they are object-oriented or not: you almost always have to initialize something and then call a procedure or a method, passing data down as parameters, and you have to check for and react to errors. Initialization can be cheap or expensive, and you may have to do it upon every use or only once at startup or first use. The problem with libraries is that, in order to be useful, they have to have extensive interfaces.

Typical libraries in early GUI systems (e.g. Xlib, OSF Motif) or in systems for distributed computing (ONC aka Sun RPC, OSF DCE, Microsoft DCOM, CORBA) have hundreds or thousands of procedures or methods, often with long parameter lists. The parameters are frequently of types defined in the libraries as well, types that can be created by other library functions, and so on and so forth.

Using those libraries quickly riddles your application with infrastructure code, often forces you to structure your code in certain ways, and these ways may be incompatible with other, alternative libraries, making any attempt to switch between alternatives practically impossible.

Frameworks

Frameworks are a much more sophisticated solution for code reuse. They accept the problem of entanglement, embrace it, and reverse the direction of control. Frameworks are the dominant solution today. No more do you call the library, the library calls you. A framework is called a framework, because it provides a frame, a kind of main program that does all initialization and most of the infrastructure plumbing.

Of course you still have to write application code. A framework is like a generic main program, but in order to do anything useful, it must rely on application code. Let me give a very primitive example.

The earliest GUI libraries relied on applications providing a main loop that took care of events. Something like this:

// some variables
MyTypeOfObjects currentlySelected;
...
// the loop
boolean done = false;
do {
    WindowSystem.Event e = WindowSystem.getNextEvent();
    switch (e.getCode()) {
        case WindowSystem.EXPOSE:
            doRedraw();
            break;
        case WindowSystem.MENU_CLICK:
            WindowSystem.MenuDetails menuButton = e.getMenuDetails();
            switch (menuButton.getCode()) {
                case WindowSystem.StandardMenus.OPEN:
                    doOpen(currentlySelected);
                    break;
                case WindowSystem.StandardMenus.EXIT:
                    done = true;
                    break;
                case WindowSystem.StandardMenus.DELETE:
                    // delete currently selected object ...
                    break;
                ...
                default:
                    log("Received invalid menu entry!");
                    break;
            }
            break; // without this, menu clicks would fall through to KEY_PRESS
        case WindowSystem.KEY_PRESS:
            // process keys ...
            break;
        ...
        default:
            log("Received invalid event!");
            break;
    }
} while (!done);
// maybe some cleanup
...
exit(0);

GUI frameworks relieved you of the burden of writing these main loops yourself. You had a main program provided by the framework, and this main program was able to process all possible events, menu buttons or key presses that the window system could ever deliver. By default it would ignore events, but you could register functions like doRedraw() and doOpen(FrameworkTypeOfObjects currentlySelected) to be called in case of certain events.

Obviously this is big progress, but the framework, still being a library, can’t know about your application types. You see how I have changed the method signature of doOpen() from taking a parameter of MyTypeOfObjects to a parameter of FrameworkTypeOfObjects. The framework is in control now, and because it can’t know your data types, it forces you to accept framework data types as parameters.
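Here is a minimal sketch of that inversion of control. It is not the API of any real GUI toolkit: MiniFramework, FrameworkObject and the event codes are all hypothetical. The framework owns the loop and only knows its own FrameworkObject. The application registers callbacks and has to translate the framework’s object back into its own Document type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of a framework that calls the application.
final class MiniFramework {

    // The framework only knows its own data type, never the application's.
    static final class FrameworkObject {
        final String id;

        FrameworkObject(String id) {
            this.id = id;
        }
    }

    private final Map<String, Consumer<FrameworkObject>> handlers = new HashMap<>();

    // The application registers callbacks instead of writing the main loop.
    void on(String eventCode, Consumer<FrameworkObject> handler) {
        handlers.put(eventCode, handler);
    }

    // The framework owns the loop; events without a handler are ignored.
    void run(List<String> events, FrameworkObject selected) {
        for (String code : events) {
            Consumer<FrameworkObject> handler = handlers.get(code);
            if (handler != null) {
                handler.accept(selected);
            }
        }
    }

    // The application's own type, unknown to the framework.
    static final class Document {
        final String title;

        Document(String title) {
            this.title = title;
        }
    }

    // Application code: registers handlers and adapts framework objects.
    static List<String> runDemo() {
        List<String> log = new ArrayList<>();
        Map<String, Document> documentsById = new HashMap<>();
        documentsById.put("42", new Document("Invoice"));

        MiniFramework framework = new MiniFramework();
        framework.on("EXPOSE", selected -> log.add("redraw"));
        // Adapter layer: look up our own Document for the framework's object.
        framework.on("OPEN", selected ->
            log.add("open " + documentsById.get(selected.id).title));

        framework.run(List.of("EXPOSE", "KEY_PRESS", "OPEN"), new FrameworkObject("42"));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [redraw, open Invoice]
    }
}
```

The lookup in documentsById is the small adaptation layer that every such callback needs. In a real application it is rarely this small.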

Again you have reuse, but now the framework forces its abstractions upon you. You have to write less code, but not as little as you’d have hoped, because you now need to write a layer to adapt your abstractions to those of the framework. Of course you can ignore the problem and simply use framework abstractions in your own code, but if you do that, you’re doomed anyway, at least in the long run. New versions of the framework will force you into a deadly maintenance routine, and if the framework ever becomes unavailable, you can happily begin writing your program anew.

WADL

Sometime around 2001 I was confronted with the request to write a big web application. The application would have to work with a relational database, it would have to be written in Perl, and I would have to use a kind of framework that had been developed in-house. A quick analysis identified three user roles, and the number of pages would be greater than 40. I had roughly six months before the system would go into production.

At that time I had almost no experience in Perl, had never used relational databases, and it was going to be my second web application. #1 had been in Perl as well, but it had been an application with two or three pages and the same number of forms. It had been trivial, but as I had hand-crafted it, it had been tedious nevertheless. I was in bad need of a tool.

In a spell of recklessness I used five of the six months to analyze the problem and construct a tool, and then I spent a month building the application. It was a gamble, but it worked, the application was a success and I was in business.

I called this tool WADL (Web Application Definition Language). Just like RPCmagiX (I wrote about it in the last post), I failed to ever publish WADL, and in the meantime the name has been taken. WADL is now a W3C proposal for something like the REST equivalent to WSDL.

My “WADL” was more, much more. It was a way to specify the structure and visual details of a web application. The specification was done in XML, and with a code generator you could generate a complete application. All the pages and forms were there, they only had no content. For prototyping purposes you could associate dummy data with the pages, and this way it was possible to create a complete prototype without writing one single line of code. The pages displayed meaningful data, it’s only that the data sent from one page had no influence on the next page. In cases where the result page was determined from input data, the prototype would pop up a choice box where you could select the desired outcome.

You had one XML file for the structure of the application (the “Application Definition”) and one XML file for each page (“Page Definition”). Furthermore you could have XML files describing database structures with tables, views, foreign key relations, etc.

The application definition consisted of some application attributes (most important the name), the definition of roles (like “user” or “administrator”), the reference to the database definitions if any, and finally the definition of the graph of pages. There were start pages (those that you could directly address from a GET request, at least one per role) and other pages. Each page had an attribute “roles”, specifying the roles that could get to that page. Events took you from page to page, each event corresponding to a button on the page that could be pressed and that would submit a form.

Roles could overlap. Think of a system where a role “user” can search for and display data. A second role “admin” can enter new data, but of course “admin” can search and display as well. The roles overlap, “admin” shares part of his graph of pages with “user”.
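The structure described above can be sketched as a small data model. This is not WADL’s actual XML or its generated code, only a hypothetical illustration: the class names and the example pages (“Search”, “Result”, “Edit”) are made up, and I assume here that an event leading to a page outside a role’s “roles” attribute is simply not followed for that role.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of an application definition: pages, roles and events.
final class AppDefinition {

    static final class Page {
        final String name;
        final boolean startPage;
        final Set<String> roles;
        // Event name -> target page name; each event stands for a submit button.
        final Map<String, String> events = new LinkedHashMap<>();

        Page(String name, boolean startPage, Set<String> roles) {
            this.name = name;
            this.startPage = startPage;
            this.roles = roles;
        }

        Page event(String eventName, String targetPage) {
            events.put(eventName, targetPage);
            return this;
        }
    }

    private final Map<String, Page> pages = new LinkedHashMap<>();

    Page page(String name, boolean startPage, String... roles) {
        Page p = new Page(name, startPage, Set.of(roles));
        pages.put(name, p);
        return p;
    }

    // Pages a role can reach: begin at its start pages, follow events,
    // and skip every page whose roles attribute excludes the role.
    Set<String> reachable(String role) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        for (Page p : pages.values()) {
            if (p.startPage && p.roles.contains(role)) {
                todo.push(p.name);
            }
        }
        while (!todo.isEmpty()) {
            Page p = pages.get(todo.pop());
            if (p == null || !p.roles.contains(role) || !seen.add(p.name)) {
                continue;
            }
            todo.addAll(p.events.values());
        }
        return seen;
    }

    // "admin" shares the search and result pages with "user" and adds an edit page.
    static AppDefinition example() {
        AppDefinition app = new AppDefinition();
        app.page("Search", true, "user", "admin").event("FIND", "Result");
        app.page("Result", false, "user", "admin").event("EDIT", "Edit");
        app.page("Edit", false, "admin").event("SAVE", "Result");
        return app;
    }
}
```

With this example definition, reachable("user") yields the pages Result and Search, while reachable("admin") additionally contains Edit. That is the overlap of the two page graphs.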

The page definitions basically described what was on the pages. There were text blocks and form blocks, and within form blocks you had form elements like input fields, text areas, select boxes, labels, grouping elements, etc. A layout generator would automatically generate a layout, conforming to our internal style guide, but the system was modular, layout generators could be plugged in, and it was even possible to use HTML templates (I called them “HTML Makeup”) on a per-page basis.

From the database definitions WADL created Perl classes, one for each table definition. A support library encapsulated the database access.

That’s about what you got from XML alone. For everything else you had to write Perl code, but WADL generated templates to give you a quick start. You had to implement so-called “Processors” for all possible page transitions, and they came in two kinds. In cases where the target page was determined at runtime, the processor was a single method process() in a class Processors::OriginatingPage_EVENT (an edge processor, processing the edges corresponding to a single event), and the return value of this method determined the target page. In all other cases it was a method OriginatingPage_EVENT() in a Perl class Processors::Page::TargetPage (a page processor, processing all incoming edges to a page). Here “OriginatingPage” and “TargetPage” are the respective page names and “EVENT” is the name of the event. To implement processors, you simply copied files from the generated template directory and began inserting code. This worked pretty well because, thanks to the prototyping system, basic questions about application structure could normally be answered very early.

The processors communicated with the application via generated input and output objects. Thus they did not have to care about the actual page structure. They took values from methods named like page elements, but it did not matter whether a value came from an input field or from a text box.
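To illustrate the two kinds of processors, here is a rough transposition into Java. The original processors were Perl classes, so everything below is an assumption made for illustration: the interfaces, the page and event names (“Login”, “SUBMIT”, “Search”, “FIND”), and the plain maps that stand in for the generated input and output objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Java transposition; the original processors were Perl classes.
final class ProcessorSketch {

    // Edge processor: used when the target page is decided at runtime.
    // The returned string names the target page.
    interface EdgeProcessor {
        String process(Map<String, String> input, Map<String, String> output);
    }

    // Page processor: the target page is fixed; there is one per incoming edge.
    interface PageProcessor {
        void process(Map<String, String> input, Map<String, String> output);
    }

    // Would correspond to process() in a class Processors::Login_SUBMIT.
    static final EdgeProcessor LOGIN_SUBMIT = (input, output) -> {
        boolean known = "alice".equals(input.get("user"));
        output.put("message", known ? "Welcome, alice" : "Unknown user");
        return known ? "Home" : "Login";
    };

    // Would correspond to a method Search_FIND() in Processors::Page::Result.
    static final PageProcessor RESULT_FROM_SEARCH_FIND = (input, output) ->
        output.put("heading", "Results for " + input.get("query"));

    public static void main(String[] args) {
        Map<String, String> input = new HashMap<>();
        Map<String, String> output = new HashMap<>();
        input.put("user", "alice");
        System.out.println(LOGIN_SUBMIT.process(input, output)); // the target page
        System.out.println(output.get("message"));
    }
}
```

Note that neither processor knows whether “user” or “query” came from an input field or a select box. It only sees values by element name, which is the decoupling described above.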

WADL: Additional Benefits

Knowing so much about an application opens up many opportunities that you probably have not even thought of. One of WADL’s most successful features was a byproduct of my curiosity. I wanted to know how much of the application I had already implemented, and so I began collecting data, but then I thought, why not visualize it?

I already knew the Graphviz project, and it was fairly simple to write a program that created an application graph for each role and a graph for the database structure. The nodes in the application graphs represented pages, the edges were events. Blue edges represented events for which processors were already in place, gray edges represented events without processors yet.
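Generating such a graph takes very little code. The following sketch is not the original program (which I never published). It assumes a hypothetical Edge record of originating page, event, target page and a flag that says whether a processor exists, and it emits Graphviz DOT text with blue and gray edges.

```java
import java.util.List;

// Hypothetical sketch: render a role's page graph as Graphviz DOT text.
final class PageGraphDot {

    static final class Edge {
        final String from;
        final String event;
        final String to;
        final boolean hasProcessor;

        Edge(String from, String event, String to, boolean hasProcessor) {
            this.from = from;
            this.event = event;
            this.to = to;
            this.hasProcessor = hasProcessor;
        }
    }

    // Pages become nodes, events become labeled edges.
    // Blue means a processor is in place, gray means it is still missing.
    static String toDot(String role, List<Edge> edges) {
        StringBuilder dot = new StringBuilder();
        dot.append("digraph \"").append(role).append("\" {\n");
        for (Edge e : edges) {
            dot.append("  \"").append(e.from).append("\" -> \"").append(e.to)
               .append("\" [label=\"").append(e.event).append("\", color=")
               .append(e.hasProcessor ? "blue" : "gray").append("];\n");
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toDot("user", List.of(
            new Edge("Search", "FIND", "Result", true),
            new Edge("Result", "EDIT", "Edit", false))));
    }
}
```

The resulting text can be fed to Graphviz, for example with dot -Tsvg, to get the picture.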

Nodes were clickable, and they brought you to another graph, showing the incoming and outgoing events for that page. Here the events were clickable, and they brought you to the actual code of the processors. From there you could go on to the next page graph and so on. Essentially you could click through the source code in exactly the same way as you would navigate through the application.

This visualization proved to be one of WADL’s most successful features, because it was trivial to assess a project’s progress. You only had to look at the blue and gray edges and at some numbers in the statistics section.

Another byproduct was a code generator for database transformations. It took two database definitions and transformation rules, and from that it generated a script for transferring data from one database to another, doing all necessary transformations. Machine-readable knowledge about structure – you can do all sorts of things with it.

But there’s more. WADL was extremely easy to learn. You had guidance in XML via the DTDs, the whole project structure was generated, there were commands to generate templates for HTML makeup, processor templates could be generated, the templates were commented, thus it was all a matter of copying some files and filling in code where the comments hinted at it. One of our programmers had never before written a Perl application, and his first project was the biggest WADL application that was ever built. More than a hundred pages, many hundreds of processors, a three step workflow, and at the end of it the input of roughly a hundred users was compiled into an automatically created PDF full of tables. He created the application, finished it in time, and I did the PDF creation code.

What I did was implement an HTML to LaTeX translator, and then we typeset the document on the fly at download time. Using HTML as input had the advantage that we could display the same code on a preview page. I took the code that I had written and made it part of WADL’s tool set.

But there’s even more. WADL automatically structured the projects. You never had to write any plumbing code, and everything was always the same, regardless of project, regardless of project phase. Adding functionality might mean adding a page and some processors, but that never complicated the project. Each new page had the same complexity, each processor needed the same effort as any other processor. WADL scales linearly.

I can’t imagine any pure library system or any pure framework that could ever scale that way. They can’t, because you always have to write plumbing code, code that’s repetitive and tedious to write, and whenever you do it, you do it in a slightly different way. That is how you complicate things: what should be similar becomes different, and over time it turns the project into a maintenance nightmare.

It’s a hard fact: such code should not be written. It can’t be simplified, because it is complicated by nature. It can’t be packed into libraries, because though it follows patterns, it is never the exact same. It’s similar structure we’re talking about, not sameness of code. There’s nothing to be factored out.

While RPCmagiX (see the last post) was a tool to create the ideal library for your interface, WADL was a tool to create the ideal framework for your application. Both would have been impossible without code generators.

WADL is still in use, but I did not get the funding to keep it current. It is pretty outdated now. The base mechanism is still CGI, that means one process per request, we have no AJAX support and I have never found a good way to keep up with .NET’s excellent support for SOAP.

And Now?

If I were to implement WADL now, I would use the Java Enterprise Edition as its basis. It implements all that a big, scalable application could possibly need, it does so in a quite elegant way, and this basis would also make it more acceptable to management, because it would be less of a risky, exotic solution tied to one person.

It’s only that I am not interested in WADL any more. I have solved it once, I could do it again, but there would be no challenges, no surprises, only tedious work. I intend to aim higher. How high, that’s what we will find out as this blog develops, as a plan begins to form, as I get input, as we discuss these matters. A first sketch will follow soon.

© 2010 Andreas Manessinger