The desire for code reuse has been a driving force behind most efforts in software engineering, and in this article we will look into three increasingly sophisticated ways to achieve reuse.
Libraries
Libraries can be anything from Cobol copy libs to modern shared libraries. They can be the result of meticulous modularization in your own earlier applications (a rare case), they can be part of a “system library” (like the vast set of libraries in UNIX or the Java Runtime Environment), or they can be third-party libraries, open source or commercial. Whichever they are, all libraries have one thing in common: they never exactly fit your application.
In order to be reusable, libraries have to cater to different users. Whether or not they are object-oriented, you almost always have to initialize something and then call a procedure or a method, passing data down as parameters and checking for and reacting to errors. Initialization can be cheap or expensive, and you may have to do it upon every use or only once, at startup or first use. The problem with libraries is that, in order to be useful, they have to have extensive interfaces.
Typical libraries in early GUI systems (e.g. XLib, OSF Motif) or in systems for distributed computing (ONC aka Sun RPC, OSF DCE, Microsoft DCOM, CORBA) have hundreds or thousands of procedures or methods, often with long parameter lists. The parameters are frequently of types defined in libraries as well, types that are in turn created by other library functions, and so on and so forth.
Using those libraries quickly riddles your application with infrastructure code, often forces you to structure your code in certain ways, and these ways may be incompatible with other, alternative libraries, making any attempt to switch between alternatives practically impossible.
Frameworks
Frameworks are a much more sophisticated solution for code reuse, and they are the dominant solution today. They accept the problem of entanglement, embrace it, and reverse the direction of control: you no longer call the library, the library calls you. A framework is called a framework because it provides a frame, a kind of main program that does all the initialization and most of the infrastructure plumbing.
Of course you still have to write application code. A framework is like a generic main program, but in order to do anything useful, it must rely on application code. Let me give a very primitive example.
The earliest GUI libraries relied on applications providing a main loop that took care of events, something like this:
    // some variables
    MyTypeOfObjects currentlySelected;
    ...
    // the loop
    boolean done = false;
    do {
        WindowSystem.Event e = WindowSystem.getNextEvent();
        switch (e.getCode()) {
            case WindowSystem.EXPOSE:
                doRedraw();
                break;
            case WindowSystem.MENU_CLICK:
                WindowSystem.MenuDetails menuButton = e.getMenuDetails();
                switch (menuButton.getCode()) {
                    case WindowSystem.StandardMenus.OPEN:
                        doOpen(currentlySelected);
                        break;
                    case WindowSystem.StandardMenus.EXIT:
                        done = true;
                        break;
                    case WindowSystem.StandardMenus.DELETE:
                        // delete currently selected object ...
                        ...
                        break;
                    default:
                        log("Received invalid menu entry!");
                        break;
                }
                // without this break, menu clicks would fall through to KEY_PRESS
                break;
            case WindowSystem.KEY_PRESS:
                // process keys ...
                ...
                break;
            default:
                log("Received invalid event!");
                break;
        }
    } while (!done);
    // maybe some cleanup
    ...
    exit(0);
GUI frameworks relieved you of the burden of writing these main loops yourself. You had a main program provided by the framework, and this main program was able to process all possible events, menu buttons or key presses that the window system could ever deliver. By default it would ignore events, but you could register functions like doRedraw() and doOpen(FrameworkTypeOfObjects currentlySelected) to be called in case of certain events.
Obviously this is big progress, but the framework, still being a library, can’t know about your application types. You see how I have changed the method signature of doOpen() from taking a parameter of MyTypeOfObjects to a parameter of FrameworkTypeOfObjects. The framework is in control now, and because it can’t know your data types, it forces you to accept framework data types as parameters.
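The registration model described above could be sketched roughly like this. It is only an illustration: MiniFramework, FrameworkObject and the event codes are hypothetical names, not the API of any real toolkit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical framework-owned type: the framework cannot know application types.
final class FrameworkObject {
    final String id;
    FrameworkObject(String id) { this.id = id; }
}

// Hypothetical miniature framework: it owns the loop and calls registered handlers.
final class MiniFramework {
    private final Map<String, Consumer<FrameworkObject>> handlers = new HashMap<>();

    // The application registers callbacks instead of writing a switch statement.
    void register(String eventCode, Consumer<FrameworkObject> handler) {
        handlers.put(eventCode, handler);
    }

    // The framework's "main program": dispatches each event, ignoring unregistered ones.
    void run(List<String> eventCodes, FrameworkObject selected) {
        for (String code : eventCodes) {
            Consumer<FrameworkObject> handler = handlers.get(code);
            if (handler != null) {
                handler.accept(selected);
            }
        }
    }
}

final class CallbackDemo {
    // Feeds a fixed event sequence to the framework and returns the ids the OPEN handler saw.
    static List<String> openedIds() {
        List<String> opened = new ArrayList<>();
        MiniFramework fw = new MiniFramework();
        fw.register("OPEN", obj -> opened.add(obj.id));
        fw.run(List.of("EXPOSE", "OPEN", "KEY_PRESS", "OPEN"), new FrameworkObject("doc-1"));
        return opened;
    }
}
```

Note that the handler receives a FrameworkObject, not an application type. That is exactly the signature change discussed above.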
Again you have reuse, but now the framework forces its abstractions upon you. You have to write less code, but not as little as you had hoped, because you now need to write a layer that adapts your abstractions to those of the framework. Of course you can ignore the problem and simply use framework abstractions in your own code, but if you do that, you’re doomed anyway, at least in the long run. New versions of the framework will force you into a deadly maintenance routine, and if the framework ever becomes unavailable, you can happily begin writing your program anew.
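Such an adaptation layer might look like the following sketch. FwItem stands in for a framework type and Invoice for an application type. Both are invented for this example, as is the assumption that the framework key holds the invoice number.

```java
// Hypothetical framework-owned type, as handed to callbacks.
final class FwItem {
    final String key;
    final String label;
    FwItem(String key, String label) { this.key = key; this.label = label; }
}

// Application-owned type: the rest of the program only ever sees this.
final class Invoice {
    final int number;
    final String customer;
    Invoice(int number, String customer) { this.number = number; this.customer = customer; }
}

// The adaptation layer: the only place that knows both vocabularies.
final class InvoiceAdapter {
    // Assumption for this sketch: the framework key is the decimal invoice number.
    static Invoice fromFramework(FwItem item) {
        return new Invoice(Integer.parseInt(item.key), item.label);
    }

    static FwItem toFramework(Invoice invoice) {
        return new FwItem(Integer.toString(invoice.number), invoice.customer);
    }
}
```

If the framework changes or disappears, only InvoiceAdapter has to be rewritten. The price is one such adapter for every type that crosses the boundary.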
WADL
Sometime around 2001 I was confronted with the request to write a big web application. The application would have to work with a relational database, it would have to be written in Perl, and I would have to use a framework of sorts that had been developed in-house. A quick analysis identified three user roles, and the number of pages would be greater than 40. I had roughly six months before the system would go into production.
At that time I had almost no experience in Perl, had never used relational databases, and it was going to be my second web application. The first had been in Perl as well, but it had been an application with two or three pages and the same number of forms. It had been trivial, but as I had hand-crafted it, it had been tedious nevertheless. I badly needed a tool.
In a spell of recklessness I used five of the six months to analyze the problem and construct a tool, and then I spent a month building the application. It was a gamble, but it worked: the application was a success and I was in business.
I called this tool WADL (Web Application Definition Language). As with RPCmagiX (I wrote about it in the last post), I never published WADL, and in the meantime the name has been taken. WADL is now a W3C proposal for something like the REST equivalent of WSDL.
My “WADL” was more, much more. It was a way to specify the structure and visual details of a web application. The specification was done in XML, and with a code generator you could generate a complete application. All the pages and forms were there, they just had no content. For prototyping purposes you could associate dummy data with the pages, and this way it was possible to create a complete prototype without writing a single line of code. The pages displayed meaningful data, except that the data sent from one page had no influence on the next page. In cases where the result page was determined from input data, the prototype would pop up a choice box where you could select the desired outcome.
You had one XML file for the structure of the application (the “Application Definition”) and one XML file for each page (“Page Definition”). Furthermore you could have XML files describing database structures with tables, views, foreign key relations, etc.
The application definition consisted of some application attributes (most importantly the name), the definition of roles (like “user” or “administrator”), the reference to the database definitions if any, and finally the definition of the graph of pages. There were start pages (those that you could directly address from a GET request, at least one per role) and other pages. Each page had an attribute “roles”, specifying the roles that could get to that page. Events took you from page to page, each event corresponding to a button on the page that could be pressed and that would submit a form.
Roles could overlap. Think of a system where a role “user” can search for and display data. A second role “admin” can enter new data, but of course “admin” can search and display as well. The roles overlap: “admin” shares part of its graph of pages with “user”.
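To make the page graph and the overlapping roles concrete, here is a minimal sketch of such a model. The class names, the page names and the reachability rule are my assumptions for illustration. The real application definition was XML, and its exact schema is not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// One page of the application, with the roles allowed to see it.
final class PageDef {
    final String name;
    final Set<String> roles;
    final boolean start; // start pages can be addressed directly
    PageDef(String name, Set<String> roles, boolean start) {
        this.name = name; this.roles = roles; this.start = start;
    }
}

// One edge of the graph: an event (button) leading from one page to another.
final class EventDef {
    final String from, name, to;
    EventDef(String from, String name, String to) {
        this.from = from; this.name = name; this.to = to;
    }
}

final class AppGraph {
    final List<PageDef> pages = new ArrayList<>();
    final List<EventDef> events = new ArrayList<>();

    PageDef page(String name) {
        for (PageDef p : pages) {
            if (p.name.equals(name)) return p;
        }
        return null;
    }

    // Pages a role can reach: begin at its start pages, follow events,
    // and only enter pages whose "roles" attribute lists the role.
    Set<String> reachable(String role) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        for (PageDef p : pages) {
            if (p.start && p.roles.contains(role) && seen.add(p.name)) todo.add(p.name);
        }
        while (!todo.isEmpty()) {
            String current = todo.remove();
            for (EventDef e : events) {
                if (!e.from.equals(current)) continue;
                PageDef target = page(e.to);
                if (target != null && target.roles.contains(role) && seen.add(target.name)) {
                    todo.add(target.name);
                }
            }
        }
        return seen;
    }

    // Invented example: "admin" shares Search and Result with "user" and adds Edit.
    static AppGraph example() {
        AppGraph g = new AppGraph();
        g.pages.add(new PageDef("Search", Set.of("user", "admin"), true));
        g.pages.add(new PageDef("Result", Set.of("user", "admin"), false));
        g.pages.add(new PageDef("Edit", Set.of("admin"), false));
        g.events.add(new EventDef("Search", "FIND", "Result"));
        g.events.add(new EventDef("Result", "EDIT", "Edit"));
        g.events.add(new EventDef("Edit", "SAVE", "Result"));
        return g;
    }
}
```

In this example the graph for “user” is a subgraph of the one for “admin”, which is the overlap described above.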
The page definitions basically described what was on the pages. There were text blocks and form blocks, and within form blocks you had form elements like input fields, text areas, select boxes, labels, grouping elements, etc. A layout generator would automatically generate a layout conforming to our internal style guide, but the system was modular: layout generators could be plugged in, and it was even possible to use HTML templates (I called them “HTML Makeup”) on a per-page basis.
From the database definitions WADL created Perl classes, one for each table definition. A support library encapsulated database access.
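The principle of generating one class per table definition can be shown in a few lines. This sketch emits Java source text rather than Perl, and the table and column names are invented. It is not a reconstruction of WADL's generator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TableClassGenerator {
    // Emits the source text of a simple data class for one table definition.
    // "columns" maps column name to type, in declaration order.
    static String generate(String table, Map<String, String> columns) {
        StringBuilder sb = new StringBuilder();
        sb.append("final class ").append(table).append(" {\n");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            sb.append("    ").append(column.getValue())
              .append(' ').append(column.getKey()).append(";\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    // Invented table definition, used only to demonstrate the generator.
    static String example() {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "long");
        columns.put("name", "String");
        return generate("Customer", columns);
    }
}
```

A real generator would also emit the load and store methods, but even this much shows why the generated classes always looked the same from project to project.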
That’s about what you got from XML alone. For everything else you had to write Perl code, but WADL generated templates to give you a quick start. You had to implement so-called “Processors” for all possible page transitions. In cases where the target page was determined at runtime, the processor was a single method process() in a class Processors::OriginatingPage_EVENT (an edge processor, processing the edges corresponding to a single event), and the return value of this method determined the target page. In all other cases it was a method OriginatingPage_EVENT() in a Perl class Processors::Page::TargetPage (a page processor, processing all incoming edges to a page). Here “OriginatingPage” and “TargetPage” are the respective page names and “EVENT” is the name of the event. To implement processors, you simply copied from the generated template directory and began inserting code. This worked pretty well because, thanks to the prototyping system, basic questions about application structure could normally be answered very early.
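The difference between the two kinds of processors can be sketched as follows. This is Java rather than Perl, and the Dispatcher class, the key format and the page names are assumptions made for this example. WADL resolved processors through the class and method names described above.

```java
import java.util.HashMap;
import java.util.Map;

// Edge processor: handles one event on one page and decides the target page at runtime.
interface EdgeProcessor {
    String process(Map<String, String> input);
}

// Page processor: handles one incoming edge of a target page that is already known.
interface PageProcessor {
    void process(Map<String, String> input);
}

// Hypothetical dispatcher keyed by "OriginatingPage_EVENT".
final class Dispatcher {
    private final Map<String, EdgeProcessor> edgeProcessors = new HashMap<>();
    private final Map<String, String> fixedTargets = new HashMap<>();
    private final Map<String, PageProcessor> pageProcessors = new HashMap<>();

    void addEdgeProcessor(String origin, String event, EdgeProcessor p) {
        edgeProcessors.put(origin + "_" + event, p);
    }

    void addPageProcessor(String origin, String event, String target, PageProcessor p) {
        fixedTargets.put(origin + "_" + event, target);
        pageProcessors.put(target + "/" + origin + "_" + event, p);
    }

    // Returns the name of the page to show next.
    String dispatch(String origin, String event, Map<String, String> input) {
        String key = origin + "_" + event;
        EdgeProcessor edge = edgeProcessors.get(key);
        if (edge != null) {
            return edge.process(input); // target page chosen by the return value
        }
        String target = fixedTargets.get(key);
        if (target == null) {
            throw new IllegalStateException("no processor for " + key);
        }
        pageProcessors.get(target + "/" + key).process(input);
        return target; // target page fixed by the application definition
    }
}

final class ProcessorDemo {
    static Dispatcher build() {
        Dispatcher d = new Dispatcher();
        // Runtime-determined target: a login either succeeds or fails.
        d.addEdgeProcessor("Login", "SUBMIT",
            input -> "secret".equals(input.get("password")) ? "Home" : "LoginFailed");
        // Fixed target: SEARCH on Home always leads to Result.
        d.addPageProcessor("Home", "SEARCH", "Result", input -> { });
        return d;
    }
}
```

An event without any registered processor is reported as an error here. In the generated prototype, such an edge would simply show the dummy data.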
The processors communicated with the application via generated input and output objects. Thus they did not have to care about the actual page structure. They took values from methods named like page elements, but it did not matter whether a value came from an input field or from a text box.
WADL: Additional Benefits
Knowing so much about an application opens up many opportunities that you probably have not even thought of. One of WADL’s most successful features was a byproduct of my curiosity. I wanted to know how much of the application I had already implemented, and so I began collecting data, but then I thought, why not visualize it?
I already knew the Graphviz project, and it was fairly simple to write a program that created an application graph for each role and a graph for the database structure. The nodes in the application graphs represented pages, the edges were events. Blue edges represented events for which processors were already in place, gray edges represented events without processors yet.
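Generating such a graph takes little more than string concatenation. The sketch below writes Graphviz DOT text for one role, coloring edges blue or gray as described. The PageEdge and DotWriter names and the sample pages are invented for illustration.

```java
import java.util.List;

// One event edge, plus whether a processor for it already exists.
final class PageEdge {
    final String from, event, to;
    final boolean implemented;
    PageEdge(String from, String event, String to, boolean implemented) {
        this.from = from; this.event = event; this.to = to; this.implemented = implemented;
    }
}

final class DotWriter {
    // Renders the application graph of one role as Graphviz DOT source.
    static String toDot(String role, List<PageEdge> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph \"").append(role).append("\" {\n");
        for (PageEdge e : edges) {
            String color = e.implemented ? "blue" : "gray";
            sb.append("  \"").append(e.from).append("\" -> \"").append(e.to)
              .append("\" [label=\"").append(e.event)
              .append("\", color=").append(color).append("];\n");
        }
        sb.append("}\n");
        return sb.toString();
    }
}
```

Piping the result through the dot command line tool, for example with dot -Tsvg, produces the picture. Making nodes clickable is a matter of adding URL attributes to them.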
Nodes were clickable, and they brought you to another graph, showing the incoming and outgoing events for that page. Here the events were clickable, and they brought you to the actual code of the processors. From there you could go on to the next page graph and so on. Essentially you could click through the source code in exactly the same way as you would navigate through the application.
This visualization proved to be one of WADL’s most successful features, because it was trivial to assess a project’s progress. You only had to look at the blue and gray edges and at some numbers in the statistics section.
Another byproduct was a code generator for database transformations. It took two database definitions and transformation rules, and from that it generated a script for transferring data from one database to another, doing all necessary transformations. Machine-readable knowledge about structure – you can do all sorts of things with it.
But there’s more. WADL was extremely easy to learn. You had guidance in XML via the DTDs, the whole project structure was generated, there were commands to generate templates for HTML makeup, and processor templates could be generated as well. The templates were commented, so it was all a matter of copying some files and filling in code where the comments hinted at it. One of our programmers had never before written a Perl application, and his first project was the biggest WADL application that was ever built: more than a hundred pages, many hundreds of processors, a three-step workflow, and at the end of it the input of roughly a hundred users was compiled into an automatically created PDF full of tables. He created the application and finished it in time, and I did the PDF creation code.
What I did was implement an HTML to LaTeX translator, and then we typeset the document on the fly at download time. Using HTML as input had the advantage that we could display the same code on a preview page. I took the code that I had written and made it part of WADL’s tool set.
But there’s even more. WADL automatically structured the projects. You never had to write any plumbing code, and everything was always the same, regardless of project, regardless of project phase. Adding functionality might mean adding a page and some processors, but that never complicated the project. Each new page had the same complexity, and each processor needed the same effort as any other processor. WADL scales linearly.
I can’t imagine any pure library system or any pure framework that could ever scale that way. They can’t, because you always have to write plumbing code, code that’s repetitive and tedious to write, and whenever you do it, you do it in a slightly different way. That is where you complicate things, because what should be similar becomes different, and over time this turns the project into a maintenance nightmare.
It’s a hard fact: such code should not be written. It can’t be simplified, because it is complicated by nature. It can’t be packed into libraries, because though it follows patterns, it is never the exact same. It’s similar structure we’re talking about, not sameness of code. There’s nothing to be factored out.
While RPCmagiX (see the last post) was a tool to create the ideal library for your interface, WADL was a tool to create the ideal framework for your application. Both would have been impossible without code generators.
WADL is still in use, but I did not get the funding to keep it current. It is pretty outdated now. The base mechanism is still CGI, which means one process per request, we have no AJAX support, and I have never found a good way to keep up with .NET’s excellent support for SOAP.
And Now?
If I were to implement WADL now, I would use the Java Enterprise Edition as its basis. It implements all that a big, scalable application could possibly need, it does so in quite an elegant way, and this basis would also make it more acceptable to management, making it less of a risky, exotic solution tied to one person.
It’s only that I am not interested in WADL any more. I have solved it once, I could do it again, but there would be no challenges, no surprises, only tedious work. I intend to aim higher. How high, that’s what we will find out as this blog develops, as a plan begins to form, as I get input, as we discuss these matters. A first sketch will follow soon.