I have already hinted at it in the answers to comments, but now it is official.

Azzyzt JEE Tools are available as open source.

Azzyzt JEE Tools are based on my Eclipse / GlassFish / Java EE 6 Tutorial and on a subsequent internal class that I held in October/November 2010 and that was based on the tutorial as well. The code generated by Azzyzt JEE Tools is basically a refinement of the code presented in the tutorial, with some very interesting features added.

For your convenience, here is the announcement again:

Version 1.0.0 has been available since the end of March, but I had not announced it. Thus 1.1.0 is the first version that’s really open to the public.

Azzyzt JEE Tools is a collection of software tools helping software developers to create software using Java Enterprise Edition 6. It is designed to be integrated into popular Java IDEs, and at the moment this means it is an Eclipse plugin.

Azzyzt JEE Tools is a set of Eclipse plugins for creating a so-called azzyzted project, and for creating code from a model. Azzyzt uses Java JPA entities as a model, and from that model it creates an enterprise application, ready to be deployed in a Java EE 6 application server like GlassFish 3.1, ready to be accessed via CORBA, SOAP and REST. Thus the generated application is a set of web services, providing all that you need in a typical CRUD application.
Generated enterprise applications have separate source folders for generated and developer-supplied content. Add your own functionality to a well-engineered base project.
Azzyzt JEE Tools is not about user interfaces. It is expected that the generated application is accessed by a RIA frontend (Flex/Flash, Silverlight, Java FX, …) or by a fat client.

If you just want to use Azzyzt JEE Tools (as opposed to modifying and building them), the recommended way to install the software is via an Eclipse update site. As of release 1.1.0, there are two update site URLs, one for the edition used by the Municipality of Vienna, Austria, the other for a generic version. The URLs are

http://azzyzt.manessinger.com/azzyzt_generic/

http://azzyzt.manessinger.com/azzyzt_magwien/

If you want to look into the source code, modify Azzyzt JEE Tools for your own use or if you even want to contribute, then you can fork the project from GitHub under the URL

https://github.com/amanessinger/azzyzt_jee_tools

So far the project lacks reference documentation, though a tutorial under

http://azzyzt.manessinger.com/doc/using_azzyzt.html

should give you a fairly good impression of what Azzyzt JEE Tools are about, how to get started and how to go on. The process of building/modifying the tools and of how to contribute to the code base currently lacks documentation.

All announcements of new versions will be published on

http://www.azzyzt.org

Discussion of the architecture, of interesting details of the implementation, and in general of things I’ve learned in the process, will happen on my programming blog

http://programming.manessinger.com/

If you want to be kept up-to-date, I suggest that you subscribe to the feeds of both sites, azzyzt.org and programming.manessinger.com, in the feed reader of your choice.

The next post will give you an overview of how the code generated by Azzyzt JEE Tools differs from the Eclipse / GlassFish / Java EE 6 Tutorial. Until then I suggest you have a look at the new tutorial Using Azzyzt JEE Tools.

Azzyzt JEE Tools are copyright (c) 2011, Municipality of Vienna, Austria, licensed under the EUPL, version 1.1 or subsequent versions.

 

Have you ever wondered how stateless an EJB3 Stateless Session Bean is? Well, I have. I wrote a test program and got some interesting and – for me – surprising results that I want to share.

My original understanding (or let’s call it misunderstanding) was that the JEE server has a pool of EJB instances for each bean class, that you get one of those beans injected for each @EJB annotation, and that all those injections happen immediately upon entrance into the container via a top-level call.

A call to a service bean via SOAP is such a top-level call. I expected, that the injections not only happen in the called bean, but also somehow happen along the whole call graph, meaning that not only EJB references in the top-level service bean would be satisfied via injection, but also in beans referenced by that bean and so on.

Finally I expected that if two beans reference the same type of bean, they would actually get the same instance. I really have never thought the whole thing through, I simply expected some diffuse sort of magic.

The idea was that some ServiceBean would get a value, either via parameters or from elsewhere (actually I thought of HTTP headers), store that information in some StorageBean, and then call a WorkerBean. The WorkerBean or some other code down the call graph would then retrieve that information from the StorageBean. In order for this to work, the StorageBean injected into the ServiceBean would have to be the same instance as the StorageBean injected into the WorkerBean, and the StorageBean would have to be exclusively allocated to that one thread for the whole duration of the top-level call. Otherwise another thread could modify the value between the time it is set in the ServiceBean and the time it is read in the WorkerBean, leading to the usual multi-threading troubles.

I wanted a StorageBean to have some instance variables, wanted to fill them somewhere in the program, and wanted to read them elsewhere, without having to pass them around as parameters. In other words, I wanted a Stateless Session Bean to have some state, and I hoped that the container would guarantee consistency of this state for the time of one top-level call.

I’m pretty sure this is still not clear, so let’s look at some code. Here is a Singleton that generates random numbers:

package com.manessinger.test.stateless.backing;
 
import java.util.Random;
 
import javax.annotation.PostConstruct;
import javax.ejb.Singleton;
import javax.ejb.LocalBean;
 
@LocalBean
@Singleton
public class RandomGen {
 
    private Random r;
 
    @PostConstruct
    public void init() { r = new Random(); }
 
    public int nextInt() { return r.nextInt(); }
}

The StorageBean could be something like

package com.manessinger.test.stateless.backing;
 
import javax.ejb.LocalBean;
import javax.ejb.Stateless;
 
@LocalBean
@Stateless
public class StorageBean {
 
    private int value;
 
    public void setValue(int value) { this.value = value; }
    public int getValue() { return value; }
}

A first shot at the ServiceBean could be

package com.manessinger.test.stateless.service;
 
import javax.ejb.EJB;
import javax.ejb.Stateless;
import javax.jws.WebService;
 
import com.manessinger.test.stateless.backing.RandomGen;
import com.manessinger.test.stateless.backing.StorageBean;
import com.manessinger.test.stateless.backing.WorkerBean;
 
@Stateless
@WebService
public class ServiceBean {
 
    @EJB StorageBean s;
    @EJB WorkerBean  w;
    @EJB RandomGen   rand;
 
    public boolean test1() {
        int i = rand.nextInt();
        s.setValue(i);
        boolean isCorrect = w.verify1(i);
        return isCorrect;
    }
}

and here is finally the worker:

package com.manessinger.test.stateless.backing;
 
import javax.ejb.EJB;
import javax.ejb.LocalBean;
import javax.ejb.Stateless;
 
@LocalBean
@Stateless
public class WorkerBean {
 
    @EJB StorageBean s;
 
    public boolean verify1(int i) {
        int value = s.getValue();
        boolean isCorrect = (value == i);
        if (!isCorrect) {
            System.err.println(String.format("Error: expected %d, got %d", value, i));
        }
        return isCorrect;
    }
}

Thus the ServiceBean retrieves a random value, uses it to set the value of the StorageBean, and then passes the value into a verify() method of the worker. The worker retrieves the value from the StorageBean, compares it with its parameter, and returns true if they are the same value.

Try it for yourself: call the ServiceBean from the web service tester in the GlassFish Administration Console and it will just work. It’s EJB3 and EJBs are just POJOs after all, right?

Wrong. Sure, it does work, but now try to call the test method from a load testing application like Apache Jakarta JMeter. Try a thread group with 15 threads, a ramp-up period of one second, and 50 iterations. This makes for a total of 750 calls with a high degree of parallelism. Now look at the server log and you will see lines like

...
SEVERE: Error: expected 781730157, got -1784951881
SEVERE: Error: expected 781730157, got -1093939097
SEVERE: Error: expected -1093939097, got 540930629
SEVERE: Error: expected -1093939097, got 1481982330
...

Funny, huh? And look at the numbers: this is a typical effect in a program that is not thread-safe. Consecutive calls get mixed up.
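If you want to see the mix-up without a server and without JMeter, the following plain-Java sketch may help. It is not the test from this post: there is no container and no threading, the class names SharedStorage and InterleavingDemo are made up for the illustration, and the interleaving of two calls is simply written down by hand so that the outcome is always the same.

```java
// Plain-Java stand-in for the experiment: no container, no threads.
// SharedStorage plays the role of one StorageBean instance that is
// reached by more than one top-level call (hypothetical class name).
class SharedStorage {
    private int value;

    void setValue(int value) { this.value = value; }
    int getValue() { return value; }
}

class InterleavingDemo {

    // Replays two top-level calls, A and B, in an order that a busy
    // server might produce. Returns true if call A reads back the
    // value that it has stored itself.
    static boolean callASeesItsOwnValue(SharedStorage storage) {
        int a = 781730157;      // value chosen by call A
        int b = -1784951881;    // value chosen by call B

        storage.setValue(a);    // call A: s.setValue(i)
        storage.setValue(b);    // call B gets in before A's worker runs
        int seenByA = storage.getValue(); // call A: the worker reads the storage

        return seenByA == a;
    }

    public static void main(String[] args) {
        // prints "false": call A finds the value of call B
        System.out.println(callASeesItsOwnValue(new SharedStorage()));
    }
}
```

Under real load nobody writes down the order of the calls, of course. The scheduler picks one, and that is why the errors only show up when many threads are active.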

My next idea was a variant:

    public boolean test2() {
        int i = rand.nextInt();
        s.setValue(i);
        boolean isCorrect = w.verify2(i, s);
        return isCorrect;
    }

and

    public boolean verify2(int i, StorageBean sParam) {
        int value = sParam.getValue();
        boolean isCorrect = (value == i);
        if (!isCorrect) {
            System.err.println(String.format("Error: expected %d, got %d", value, i));
        }
        return isCorrect;
    }

This means that I don’t let the StorageBean be injected into the worker; instead I pass it down as a parameter. Funnily enough, this did not work either. Under load it showed the same effect. Only when I finally passed the value in something that was really only a POJO did it work:

package com.manessinger.test.stateless.backing;
 
public class StorageNonBean {
 
    private int value;
 
    public void setValue(int value) { this.value = value; }
    public int getValue() { return value; }
}

    public boolean test3() {
        int i = rand.nextInt();
        StorageNonBean snb = new StorageNonBean();
        snb.setValue(i);
        boolean isCorrect = w.verify3(i, snb);
        return isCorrect;
    }

and

    public boolean verify3(int i, StorageNonBean snbParam) {
        int value = snbParam.getValue();
        boolean isCorrect = (value == i);
        if (!isCorrect) {
            System.err.println(String.format("Error: expected %d, got %d", value, i));
        }
        return isCorrect;
    }

Of course the latter is trivial, but the consequences of this little test are clear:

Stateless Session Beans are really meant to be stateless. In fact the EJB 3.0 final specification says in section 4.5 that:

Because all instances of a stateless session bean are equivalent, the container can choose to delegate a client-invoked method to any available instance. This means, for example, that the container may delegate the requests from the same client within the same transaction to different instances, and that the container may interleave requests from multiple transactions to the same instance.

The only thing that the container guarantees is that one bean instance will be exclusively allocated to a transaction (and therefore a thread) for the duration of one invocation of one of its own methods. This does not extend to other EJBs that it calls. There is no guarantee that we reach the same instance of StorageBean when we call s.setValue() and s.getValue(), even consecutively from the same ServiceBean, one call immediately after the other. And even if it were the same instance, there is no guarantee that the storage has not been modified in between by a call from another instance of the service bean.
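This also explains why passing the injected StorageBean down as a parameter did not help: what gets injected is a reference handed out by the container, not a bean instance, and every call through that reference may end up at a different instance. The sketch below imitates this in plain Java. Everything in it is an assumption made for the illustration: the names PooledStorage, StorageReference and PoolDemo are invented, and the round-robin choice merely stands in for whatever strategy a real container uses.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a pooled StorageBean instance (hypothetical name).
class PooledStorage {
    private int value;

    void setValue(int value) { this.value = value; }
    int getValue() { return value; }
}

// Stand-in for the reference that @EJB injects (hypothetical name).
// Each method call is delegated to an instance picked from the pool.
class StorageReference {
    private final List<PooledStorage> pool = new ArrayList<PooledStorage>();
    private int next = 0;

    StorageReference(int poolSize) {
        for (int i = 0; i < poolSize; i++) {
            pool.add(new PooledStorage());
        }
    }

    // Round robin is an assumption; a real container may pick any instance.
    private PooledStorage pick() {
        PooledStorage instance = pool.get(next);
        next = (next + 1) % pool.size();
        return instance;
    }

    void setValue(int value) { pick().setValue(value); }
    int getValue() { return pick().getValue(); }
}

class PoolDemo {

    // Stores a value through the reference and reads it back immediately.
    static int roundTrip(int poolSize, int value) {
        StorageReference s = new StorageReference(poolSize);
        s.setValue(value);
        return s.getValue();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1, 42)); // prints 42: only one instance to pick
        System.out.println(roundTrip(2, 42)); // prints 0: the read hits the other instance
    }
}
```

With a pool of one and a single caller the value survives, which matches what you see in the web service tester. With more than one instance, even two consecutive calls from the same caller can miss each other.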

Why did I ever get the idea that this construct could work? Well, probably it’s the fact, that the injection of a persistence context always gives you access to the same context (or at least the same data). I don’t know, but in any case I found it important to clarify things, because I have the impression that my mis-understanding is not completely uncommon, and when you fall into that pit, you may not recognize it until your system is under heavy load.

 

One more error: In a comment to my Eclipse / GlassFish / Java EE 6 tutorial, FMora said:

One detail – if you annotate a constraint with @NotNull and then insert a null value via SoapUI, one gets a NPE instead of a constraint violation.

Oh dear, that’s so true and it is a typical case of untested code. Have you ever seen untested code that did not break? I haven’t :)

The problem was in the ValidationInterceptor, in the line where I took the invalid value and called “toString()” on it. For a violated @NotNull constraint the invalid value is null, hence the NullPointerException.

The tutorial is already updated, and from now on the tutorial begins with a version number. The current version is version 1.2, last updated August 13, 2010 – 17:06. The ZIP file with the sources was updated as well.

For your convenience, here is the updated ValidationInterceptor:

package com.manessinger.cookbook.util;
 
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
 
import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;
import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.Validator;
import javax.validation.ValidatorFactory;
import javax.validation.groups.Default;
 
import com.manessinger.cookbook.exception.ConstraintProperty;
import com.manessinger.cookbook.exception.ValidationException;
import com.manessinger.cookbook.exception.ViolationDetail;
 
@Interceptor
public class ValidationInterceptor {
 
    // the factory is expensive but thread-safe
    private static ValidatorFactory factory = Validation.buildDefaultValidatorFactory();
 
    @AroundInvoke
    public Object intercept(InvocationContext ctx) throws Exception {
        Validator v = factory.getValidator();
        // for all parameters
        for (Object p : ctx.getParameters()) {
            // validate parameter
            Set<ConstraintViolation<Object>> violations = v.validate(p, Default.class);
            if (!violations.isEmpty()) {
                // validation failed, gather details and throw an exception
                List<ViolationDetail> details = new ArrayList<ViolationDetail>();
                for (ConstraintViolation<Object> violation : violations) {
                    ViolationDetail d = new ViolationDetail();
                    // path to violated constraint
                    d.setAttributedItem(violation.getPropertyPath().toString());
                    // what type of constraint, e.g. "Min" or "Pattern"
                    d.setConstraintName(
                        violation.getConstraintDescriptor()
                            .getAnnotation().annotationType().getSimpleName()
                    );
                    // construct list of constraint properties
                    Map<String, Object> violationAttributes 
                        = violation.getConstraintDescriptor().getAttributes();
                    List<ConstraintProperty> properties = new ArrayList<ConstraintProperty>();
                    for (String propertyName : violationAttributes.keySet()) {
                        if (propertyName.matches("^(message|payload|groups|flags)$")) {
                            // skip unwanted property
                            continue;
                        }
                        ConstraintProperty a = new ConstraintProperty();
                        a.setName(propertyName);
                        a.setValue(violationAttributes.get(propertyName).toString());
                        properties.add(a);
                    }
                    d.setConstraintProperties(properties);
                    // the value that violated the constraint
                    Object invalidValue = violation.getInvalidValue();
                    d.setInvalidItemValue((invalidValue != null ? 
                                           invalidValue.toString() : "null"));
 
                    details.add(d);
                }
                throw new ValidationException(details);
            }
        }
        return ctx.proceed();
    }
}
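The null check near the end of the interceptor can be tried in isolation. The following is only a small sketch: the class InvalidValueText and its method names are invented for the illustration, and String.valueOf(Object) is used as an equivalent of the conditional expression in the listing above.

```java
// Minimal illustration of the bug and its fix (hypothetical class name).
class InvalidValueText {

    // What the first version of the interceptor did: fails for null.
    static String unsafe(Object invalidValue) {
        return invalidValue.toString();
    }

    // Null-safe variant. String.valueOf(Object) returns the string "null"
    // for a null reference and calls toString() otherwise.
    static String safe(Object invalidValue) {
        return String.valueOf(invalidValue);
    }

    public static void main(String[] args) {
        System.out.println(safe(null));               // prints "null"
        System.out.println(safe(Integer.valueOf(5))); // prints "5"
        try {
            unsafe(null);
        } catch (NullPointerException e) {
            System.out.println("NPE, as reported by FMora");
        }
    }
}
```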
 

If you have followed my Eclipse / GlassFish / Java EE 6 Cookbook, you have seen one single section, where the sample code was only sketched and never compiled. This was the “hypothetical security interceptor”. And of course when one writes down code from memory, it is invariably buggy.

Last week I wanted to use the code as the basis for the security interceptor of a system that we are currently developing. Naturally it crashed, and it did so in a not exactly obvious way, so I thought I should give you an update.

The security interceptor as shown in the cookbook is not really what I would use anyway, and I intend to write a follow-up tutorial explaining some of the things that I learned during our current project. There you will find a more detailed discussion of access control, and along with it, a more useful SecurityInterceptor.

Until that point, let me show you what the problem with the original code was. Here it is:

@Interceptor
@Stateless
public class SecurityInterceptor {
 
    @EJB SecurityEao sec;
    @EJB SecurityGuard guard;
 
    @AroundInvoke
    public Object intercept(InvocationContext ctx) throws Exception {
 
        Map<String, Object> contextData = ctx.getContextData();
        Map<String, String> headers = contextData.get("javax.xml.ws.http.request.headers");
 
        String secKey = headers.get("X-Portal-SecKey");
        if (secKey == null) {
            throw new SecurityNoKeyException();
        }
        SecInfo secInfo = eao.secInfoBySecKey(secKey);
        Method invokedMethod = ctx.getMethod();
        if (guard.isInvocationForbidden(invokedMethod, secInfo)) {
            throw new SecurityViolationException();
        }
        return ctx.proceed();
    }
}

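The wrong assumption is that contextData.get("javax.xml.ws.http.request.headers"); yields a Map<String, String>. That’s wrong, and the compiler does not catch it: get() returns an Object, so as soon as the result is cast to Map<String, String> the code compiles, with an unchecked warning at most. The entry really is a Map<String, List<String>>, because you can have more than one header of a certain name. Thus the correct code is something like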

@Interceptor
@Stateless
public class SecurityInterceptor {
 
    @EJB SecurityEao sec;
    @EJB SecurityGuard guard;
 
    @SuppressWarnings("unchecked")
    @AroundInvoke
    public Object intercept(InvocationContext ctx) throws Exception {
 
        Map<String, Object> contextData = ctx.getContextData();
        Map<String, List<String>> headers 
             = (Map<String, List<String>>) contextData.get("javax.xml.ws.http.request.headers");
 
        List<String> secCredentials = headers.get("X-Portal-Credentials");
        if (secCredentials == null || secCredentials.isEmpty()) {
            throw new SecurityNoCredentialsException();
        }
        SecInfo secInfo = sec.secInfoBySecCredentials(secCredentials);
        Method invokedMethod = ctx.getMethod();
        if (!guard.isInvocationAllowed(invokedMethod, secInfo)) {
            throw new SecurityViolationException();
        }
        return ctx.proceed();
    }
}
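Why the crash was not exactly obvious can be reproduced without a server. The sketch below builds a context data map by hand, so its content is an assumption and not what GlassFish actually delivers: the class HeaderMapDemo, its method names and the header value "abc123" are invented. Only the key and the header name are taken from the code above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HeaderMapDemo {

    static final String HEADERS_KEY = "javax.xml.ws.http.request.headers";

    // Hand-made context data with one header that has one value.
    static Map<String, Object> sampleContextData() {
        Map<String, List<String>> headers = new HashMap<String, List<String>>();
        headers.put("X-Portal-SecKey", Arrays.asList("abc123"));
        Map<String, Object> contextData = new HashMap<String, Object>();
        contextData.put(HEADERS_KEY, headers);
        return contextData;
    }

    // The wrong assumption: compiles, and even the cast succeeds at run
    // time, because generic type arguments are erased. The failure comes
    // later, when a value is taken out of the map and used as a String.
    @SuppressWarnings("unchecked")
    static String readWithWrongType(Map<String, Object> contextData) {
        Map<String, String> headers = (Map<String, String>) contextData.get(HEADERS_KEY);
        return headers.get("X-Portal-SecKey"); // throws ClassCastException
    }

    // The corrected access: every header name maps to a list of values.
    @SuppressWarnings("unchecked")
    static String readFirstValue(Map<String, Object> contextData) {
        Map<String, List<String>> headers =
            (Map<String, List<String>>) contextData.get(HEADERS_KEY);
        List<String> values = headers.get("X-Portal-SecKey");
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    public static void main(String[] args) {
        Map<String, Object> contextData = sampleContextData();
        System.out.println(readFirstValue(contextData)); // prints "abc123"
        try {
            readWithWrongType(contextData);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: a List is not a String");
        }
    }
}
```

The exception is thrown at the line that reads the header, not at the line with the cast, which makes the stack trace point away from the real mistake.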

I have already updated the tutorial.

 

It’s only been a few days since I posted my Eclipse / GlassFish / Java EE 6 Tutorial, but because it took me so long to write it, the toolset that I used is already outdated.

In the meantime Oracle has published a bugfix release to GlassFish v3, their reference implementation of Java EE 6 (and still the only implementation available), and of course Eclipse 3.6, code name “Helios”, has arrived. The last two days I had a deeper look into the combination, and I can tell that it works :)

Overall Impressions

Some bugs have been fixed. For example, restarting the server is no longer a problem, and the bug where libraries added to an EAR’s lib directory were “forgotten” is gone as well.

At the moment I see only one major annoyance: re-publishing an application to the server tends to fail. The workaround is not to re-publish manually, but to use Add and Remove from the server’s context menu instead: Remove the application, Finish, followed by Add and Remove / Add / Finish. The bug has been reported on Java.net, but so far the guys responsible for the GlassFish plugin claim that it is a bug in Eclipse. I fully expect some ping pong between the two development teams until the bug gets fixed :D

At work, under Linux, I have played once through my whole tutorial and found no showstopper. Thus, when I begin teaching Java EE 6 next week, this is the combination that I’ll use.

Software versions and installation

I have used the following software versions:

  • JDK 6 Update 21 (JDK!!!): There are 32 bit and 64 bit versions, under Linux I have used the 32 bit version, on my laptop, where I’m just writing this, I run the 64 bit version. No problem with either.
  • Java EE 6 SDK: There is only one version, but the SDK is mostly JARs anyway. This time there is no “GlassFish Tools Bundle for Eclipse”, so be sure to download the right package. On the download page is a matrix that shows, which packages include what version. You will want the full Java EE 6 SDK, not the “Web Profile”.
  • Eclipse IDE for Java EE Developers is available for 32 bit and 64 bit as well. Under Linux I have used 32 bits, under Windows 7 the 64 bit version.
  • The Aquarium blog lists a temporary Eclipse update site for the GlassFish plugin, that’s how I have installed it. In the long run, the GlassFish plugin will be merged into the Oracle Enterprise Pack for Eclipse, the plugin that covers Oracle’s commercial application servers.

First thing to install is the JDK. Under Windows I have installed it to the default location. The JDK installs a JRE (runtime only) as well.

When you install the Java EE 6 SDK, you are asked to choose the JDK to be used. Unfortunately the stupid updatetool installation still has not been fixed. The installer asks you if it should be installed and you really ought to do that. Included with the installer is only a bootstrap version. Unfortunately (and unlike the full version) the bootstrap version can’t easily cope with authenticating proxies. At work under Linux, I had to supply username/password as part of the http_proxy environment variable, and because our proxy needs a Windows domain name in front of the user name, with a backslash as separator, I was in quoting hell :)

What I did after installing to ~/Applications/GlassFish_v3.01 was this:

cd ~/Applications/GlassFish_v3.01/bin
sh
http_proxy='http://DOMAIN\\\\USER:PASSWORD@PROXYSERVER:PROXYPORT' ./updatetool
http_proxy='http://DOMAIN\\\\USER:PASSWORD@PROXYSERVER:PROXYPORT' ./updatetool
^D

Starting sh disables the command-line history. Passwords don’t belong in histories :)

The first call to updatetool actually calls the bootstrap tool, the second call is to the real thing, a GUI tool, and there I’ve installed the two updates that are currently available, and then I have disabled automatic updates.

The four backslashes are necessary, because the bootstrap tool is a shell script, and the value is obviously evaluated two times. Whatever.

This was only a problem under Linux, because our fabulous proxy understands NTLM, and the Windows machine at work where I tried it, automatically authenticated via NTLM.

After the Java EE 6 SDK is installed (and with it GlassFish v3.01), don’t start the server. In your development environment you will want to start it from Eclipse.

Next thing to install is Eclipse Helios. Simply unpack the ZIP where you want it. Start Eclipse, go to the Workbench, use Window > Preferences > Install/Update > Available Software Sites > Add to add the location http://download.java.net/glassfish/eclipse/helios, for instance under the name GlassFish Plugin.

Close the Preferences dialog and use Help > Install New Software. Make sure that Group items by category is not checked, at the top under Work with select the site you’ve just added, and now you should see three entries in the list, Oracle GlassFish Server Tools and the documentations for Java EE 5 and Java EE 6. Install all three. After the installation, Eclipse will prompt you to restart the application. Do so, and after marveling at the changed intro screen, go to the Workbench again.

Now we need to make sure that we use the right JDK. Use Window > Preferences > Java > Installed JREs and check that the path really points to the JDK that you’ve just installed, and not to the JRE that was installed along with it.
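If you are unsure what an entry in that list really points to, a tiny program can tell you. This is only a sketch, the class name JdkCheck is made up, and it relies on the standard javax.tools API: ToolProvider.getSystemJavaCompiler() returns null when the running Java installation does not provide a compiler, which is the case for a plain JRE of Java 6.

```java
import javax.tools.ToolProvider;

// Reports where the running Java lives and whether it ships a compiler
// (hypothetical helper class, not part of the tutorial sources).
class JdkCheck {

    // true if the running Java installation provides a Java compiler
    static boolean hasCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    static String report() {
        return "java.home=" + System.getProperty("java.home")
            + ", compiler available: " + hasCompiler();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Run it as a Java application from Eclipse, using the installed JRE entry that you want to check.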

Finally you can go to the Servers view and add a server from its context menu. As server type choose GlassFish Server Open Source Edition 3 (Java EE 6). Supply the directory where you’ve installed the server (actually the glassfish directory within the directory where you’ve installed the Java EE 6 SDK).

From then on, you can start the server via Eclipse and follow the tutorial.

Within the next days I will adapt the tutorial to the new versions and remove the discussion of bugs that have been fixed, but I will keep the old version for those people, who can’t update yet.

 

Version 1.4, last updated May 23, 2011 – 11:10

The content of this tutorial is still relevant, but you may also consider my new open source Azzyzt JEE Tools, a set of Eclipse plugins that greatly simplify the process of creating a Java Enterprise application using the patterns outlined in this tutorial. See the Azzyzt JEE Tools home page and especially the tutorial “Using Azzyzt JEE Tools“.

In “4 – Equipment” I have committed myself to using Eclipse and the Java Enterprise Edition as my tools, while in “5 – Patterns And Languages” I’ve declared my high-level goals for implementing a next step of design pattern-based tools. Now, for a deeper understanding of design patterns, you first have to use them. This post in the form of a tutorial shows some very basic project setups using Eclipse and GlassFish.

I am no expert in this field, some important things may be missing, so just take the following as a set of things that work for me.

As I am not immune to learning, and as I am going to use these things a lot, it is inevitable that my understanding of certain aspects will change. I suppose that means I will have to make changes to this post whenever that happens. If I ever do so, I will post a short notice.

Applicability

This is a tutorial about using Eclipse and the GlassFish v3 Java application server to implement Java EE 6 applications. I will show how to use different Eclipse project types for different purposes, will show how to do manual tests and how to implement automatic unit tests. We will not create a complete application, but more a vertical slice through an application. The idea is to just touch all relevant areas, not to finish a project.

The tutorial assumes the existence of a relational database, and it concentrates on the Java application used as a backend. Using JSF or another server-based GUI framework is definitely out of scope.

Where’s the beef?

At an equivalent of approximately 80 printed pages, the full tutorial is much too long for a regular blog post. It would break the size limit of feeds syndicated via Google’s Feedburner server. In fact it did. I had no choice but to move the text to a separate page. On the other hand, that’s quite OK, I want my tutorials (this won’t be the last) to be available as pages from a tutorial menu anyway.

Unfortunately I noticed the feed problems only after I had already published the URL of this post a few times. Thus, if you arrive here from a link, please go on to the actual page containing the full Eclipse / GlassFish / Java EE 6 tutorial. Sorry for the inconvenience.

 

The Pattern Movement in Software Engineering has become very popular and has produced many books, some of them arguably extremely good, but apart from automatic refactoring in modern IDEs, the number of supporting tools is surprisingly low.

What Is A Pattern Anyway

You can get a good overview about Design Patterns at the Portland Pattern Repository. For the purpose of this article we just take the classic definition from the Gang Of Four book:

A design pattern systematically names, motivates, and explains a general design that addresses a recurring design problem in object-oriented systems. It describes the problem, the solution, when to apply the solution, and its consequences. It also gives implementation hints and examples. The solution is a general arrangement of objects and classes that solve the problem. The solution is customized and implemented to solve the problem in a particular context.

The crucial bit is the last sentence: Design patterns need to be implemented specifically for each particular problem. The classes in an object-oriented design pattern are no classes at all. They are archetypes, and for the particular situation real classes have to be crafted after those archetypes.

To put it in other words, a design pattern is an archetypical solution to an archetypical problem, and that’s also the reason why you can’t press patterns into libraries. If you compare two instances of the pattern, i.e. two cases where the pattern has been used to solve the problem, you’ll find almost no code that could be factored out. The pattern is not code, the pattern is a way the code is structured. Lexically two instances of a pattern have nothing in common, everything is only similar, nothing is the same.

In some cases it is even possible to put a pattern into a library, but when you begin to do so, you quickly find out that the result is not useful. Too much varies, and the results are endless parameter lists or endlessly overloaded variations of the same methods. Those libraries are hard to implement and they don’t feel natural.

Generics (templates in C++) are a step in the right direction, but although they make creating archetypical libraries easier, those libraries are not necessarily easy to understand and use, much less to implement.

The Most Simple Patterns

Design patterns don’t come only at the level of class interaction in object-oriented systems. Just look at programming languages. In the beginning there was only assembly language. We basically wrote machine instructions, but after a while it turned out that we did similar things over and over again.

One of those things was to build loops of code. People did completely different things in the bodies of their loops, but there were only a few patterns for how the loop was entered, exited, and how often it was executed. With the advent of higher-level programming languages, those patterns were cast in basic language constructs such as “WHILE-DO”, “DO-UNTIL”, “FOR” or “FOREACH”, and these language constructs are still with us.

Assembly language did not have those high-level loops, but it had a conditional branch. Loops had to be implemented with conditional branches. The direct equivalent to that is the combination of the “IF” and “GOTO” statements in programming languages, and of course you can use those to implement all your loops. It’s only more verbose, the intent is less clear, and that is, because you don’t express your intent by naming it. Patterns are much about naming.
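A small sketch may make the difference tangible. Java has no “GOTO”, so the second method below only approximates the “IF-GOTO” style with a labeled loop, a conditional exit and an explicit jump back. The class and method names are invented for the illustration.

```java
class LoopPatterns {

    // The named pattern: the construct itself says "for each element".
    static int sumWithForEach(int[] numbers) {
        int sum = 0;
        for (int n : numbers) {
            sum += n;
        }
        return sum;
    }

    // The same loop built from a conditional branch and a jump.
    // It computes the same result, but the intent has to be
    // reconstructed by the reader.
    static int sumWithBranches(int[] numbers) {
        int sum = 0;
        int i = 0;
        top:
        while (true) {
            if (i >= numbers.length) {
                break top;      // conditional branch out of the loop
            }
            sum += numbers[i];
            i++;
            continue top;       // jump back to the top
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3};
        System.out.println(sumWithForEach(numbers));  // prints 6
        System.out.println(sumWithBranches(numbers)); // prints 6
    }
}
```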

Pattern Languages

Interestingly enough, an “IF” statement is also a pattern. Just look at the infinite number of conditions. This brings us to another important fact, the fact that patterns are not all born equal. They come in hierarchies. There are base patterns and composite patterns. Just like with “IF-GOTO”, you can use patterns to build other, more complex patterns.

Now, if you look at where the pattern movement came from, being initiated by the architect Christopher Alexander, and if you read his books, you see that he clearly understood the hierarchical nature of design patterns. That’s why he called it a Pattern Language.

Well, I’ve never tried to build a house, and if I were to build one, I’d probably not apply all of his patterns. Many of them I agree with, some I don’t, and for most there are alternatives that he does not cover. But that again coincides nicely with the notion of languages.

My native language is German, I regularly use English, have some low level, basic and spotty understanding of some of the languages originating in Latin, but that’s it. No idea of Chinese, no idea of any African language, no idea of anything else. And still, all those languages, that I don’t have a clue of, can express exactly the same things that German or English can. They use only different patterns to solve the same problems.

So I guess we can agree that it is possible to build different houses, houses that don’t fit into Christopher Alexander’s language, and we may still be able to live in them and stay sane.

For the adoption of Alexander’s method, this may really have been detrimental. It is much too obvious that, although it makes sense as a system, you may not be able to use it and build your dream. It would be Alexander’s dream instead, and when it comes to our dreams, we really dislike any intrusion.

This may explain Alexander’s limited success (and I don’t want to belittle him, he is really one of my heroes, even if he has not taken over the world), but it would not be in our way building programs, would it? After all, houses are for individuals, and individuals care for their individuality, but CRUD is just CRUD, ain’t it?

Maybe not. People seem to have a tendency to willingly and knowingly reinvent the wheel. Look at me: I could be satisfied using the tools I have, using them to create the things I’m paid for, and otherwise have a life. Or else just join the development of the Eclipse Modeling tools. In a way there’s so much already out there, I can’t have much hope to make a valuable contribution. And still I do what I do, knowing that I’m bound to reinvent, willingly accepting it. And why? Because it’s fun :)

I don’t know if this explains anything, but fact is, that the Design Pattern movement in computer science has not produced anything that even remotely comes close to Alexander’s hierarchical completeness. With Alexander’s language you can build a house, a street, a village, a town, a region, and in the other direction you can go down to details like the actual building materials.

You can’t do that with design patterns in computer programming. The high-level patterns are all missing. The discipline has evolved to the point where it has become possible to talk to each other using design patterns, but it’s only at that certain level of detail where we talk about basic object interaction.

The Necessary Next Step

Looking into some very simple patterns of repeated assembly code, we found that these patterns became the loop constructs of modern programming languages. Those loop patterns describe arrangements of assembly instructions, but the important thing to note here is, that today nobody arranges those instructions themselves. We use compilers for that. This is very different from how we work with Gang-Of-Four patterns, because they have to be instantiated by hand.

A compiler can do what it does because it has a complete model of the program. All variables that the built-in patterns like loop constructs refer to are also defined in the same model. The semantics are defined in terms of the language and its patterns, and where this is not the case, they are defined by standardized libraries. So what this boils down to is that a programming language allows us to express the desired semantics by constructing a model. The compiler then applies its patterns to this model and this way constructs either code or calls into supporting libraries.

That’s exactly what code generators do. They take a model and translate it into code of a different, simpler language, and where it makes sense, they generate not code but calls into external libraries. In this context, libraries have two purposes:

  1. they avoid code duplication
  2. they allow us to express semantics that can’t be expressed in the language’s patterns

#1 is nice to have, but #2 is really important, because it means that we can generate code even when our patterns are not a complete, self-sufficient system. What we can’t express in our language, we simply assume implemented in libraries.
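A toy generator may illustrate both points. This is only a sketch under assumptions of my own: the model is reduced to an entity name and a map of field names to Java type names, and the class name TinyGenerator is hypothetical. The pattern it applies is "private fields plus getters"; what it does not express itself (hashing) it delegates to a call into the standard library, java.util.Objects.hash.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TinyGenerator {
    // Model: an entity name plus field names mapped to Java type names.
    // Pattern: private fields, getters, and a hashCode() that delegates to a library.
    static String generate(String entity, Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(entity).append(" {\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            out.append("    private ").append(f.getValue()).append(' ')
               .append(f.getKey()).append(";\n");
        }
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String name = f.getKey();
            String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            out.append("    public ").append(f.getValue()).append(" get").append(cap)
               .append("() { return ").append(name).append("; }\n");
        }
        // Not expressed as generated logic: hashing is left to java.util.Objects.
        out.append("    public int hashCode() { return java.util.Objects.hash(")
           .append(String.join(", ", fields.keySet())).append("); }\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> person = new LinkedHashMap<>();
        person.put("id", "long");
        person.put("name", "String");
        System.out.print(generate("Person", person));
    }
}
```

The generator stays small precisely because it can assume that the library exists.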

This is the way to go. We have to build higher-order languages, languages that implement recurring patterns just the way today’s programming languages implement assembly loop patterns. Once we have such languages, we can translate them automatically into code. Just like conventional programming languages, they will be more powerful the more self-sufficient they are, but that does not mean that small and incomplete steps can’t be incredibly useful in their own right.

What I want to implement as my project are three things:

  1. an environment to specify models,
  2. a set of patterns that, when applied to such a model, turns the model into an application, and finally
  3. a code generator, that makes this process automatic

That’s it. Easy, huh?

 

In the last post we have seen that it will take me years to finish this project. The question is, what can I rely on, what can I build upon. What are the tools that I can use, without worrying that they are obsolete before my own tool even becomes usable?

Programming Language

In the case of RPCmagiX I made a bad choice. I used Itcl 2.2 for implementing the GUI, and as soon as I had finished it, Itcl 3 came out, and it was incompatible in some subtle but crucial ways. But even if it had not been so obvious, today Tcl is mostly dead.

Perl was my choice for WADL, and although Perl is still much more alive than Tcl ever was, it is not really dominant any more. Perl 6 has taken too much time, there was a lot of FUD about Perl’s imminent death, and at the moment Python is king and Ruby the fashionable contender.

I really hate Python. It’s personal, probably nothing that you’ll understand, but a language that relies on indentation for structure, c’mon! That’s nothing but a bad, tasteless joke. So Python can’t be it.

Ruby seems to be OK as a language, but I have no idea how it will fare. It could easily be the next Tcl. Remember: we are talking about at least three years before the shelf life of my program even begins!

COBOL would be a good candidate for a language that is guaranteed to never die. But who would want to work in COBOL? Nope, not even joking :)

And then there is another thing: I would like to use one language for the tool and for the code it creates.

Currently I work with Java. It took me a while to like it, but now I do. Yes, it’s a big ecosystem, yes it suffers from the fact that they had to reinvent the wheel and everything else that already was there, but now it is pretty mature and complete. And it has Eclipse. Working with Perl I never missed an IDE. A lot of Emacs windows on a big virtual desktop, some terminal windows, that’s all I ever needed.

Not so with Java. Java is pretty unusable without an IDE. But when you have something like Eclipse, when you have an integrated, incremental compiler, when you don’t write texts but actually manipulate parse trees, then all sorts of funny things are possible. Think of the refactoring support, and suddenly Java is a very flexible and dynamic language.

And Java has some other advantages: It won’t go away. Too much important code depends on it. Java is like Cobol, only a much more likeable language, and one with absolutely superior tools. Java is such an important language, because so much code for banks and insurances is written in it. There is a reason why there is a Java Enterprise Edition.

And for all those reasons my potential users are very, very likely to use Java. When I want to maximize the impact of my tool, Java is an ideal language.

But then, very similar things could be said about C#. Same category of language, same capabilities. A year ago I wrote a nice multi-threaded system in it, and as a language I like it. Why not, it’s Java, deliberately disguised in an incompatible syntax, but there is no essential difference, and some of the solutions they have come up with are really, really clever. For instance I like partial classes and miss them in Java.

On the other hand, it’s a matter of principle. I don’t like a language that is so tightly controlled by an entity like Microsoft. I don’t like working on Windows either. Yes, there is Mono, but when you go the Microsoft way, you really want Visual Studio.

Apart from that, it can have actual advantages when your tools are open source. Remember my ill-inspired decision for Itcl? Well, it is open source after all. For more than 10 years we have compiled Itcl 2.2 on all our AIX and Linux systems, and I see no reason why this can’t go on. A closed source core component would long since have been unsupported. Think of Visual Basic. Sure, Microsoft offered a migration path into .NET, but that’s only the core language. People often had augmented it with third-party components, and they often were not ported. We sure do have some of those cases.

Libraries And Frameworks

And then there’s all the stuff other people do. After all, what I’ll do is basically Model Driven Design. Eclipse has a whole big group of projects under that title. Shall I use them? Can I?

They play a different game. They play a numbers game. The Eclipse Modeling Project has seven people only on its Project Management Council. Look at the list of projects on that page. Everything on that list looks interesting and like something that I could need, but I would have to spend all my time, work, free, sleeping, just to follow all that.

And then think of all that constantly changing. Try imagining the hassle with keeping all that in sync. It’s impossible, and even if I could do that, it would not help me, it would only slow me down.

No. I won’t do it. It may be possible to catch up with these things later though, for instance at a time when I have some people helping me. If they are interested in the field at all, they would probably have worked with those tools and frameworks. They would probably know how to bridge the gap between EMF and my own core abstractions. After all, whatever I come up with, it must be semantically equivalent to EMF, because if it were not, it couldn’t be as expressive.

Yes, I know, in a way it’s a waste of time to re-invent such things, but I simply can’t tie my own core abstractions to something that is completely out of my control. This would be irresponsible. On the other hand, when my own modeling structures are largely equivalent, when they simply have to, we can always bridge that single gap later, and by doing so, we will gain access to the whole infrastructure. At least that’s the theory :D

So What’s Left?

I consider Java to be a good choice as a language, and I consider the Enterprise Edition in release 6 sufficiently mature. I will use all of that (well, probably not JSF, but who knows), and I won’t use any other framework.

The whole design will be very abstract, most of it completely oblivious of the environment it runs in. In the end there will be some channels where input comes from, the models will have to be stored, there will be code generators, but all that can be abstracted away, done by primitive stubs in the beginning, done by plugins later. The only things that I have to worry about now are my code abstractions and how expressive they are.

I can and should program away in my simple sandbox, not caring about slick user interfaces and high performance. While I do that, all sorts of things will happen. Computers will get faster, new frameworks, new GUI systems may come up and get mature, and I’ll connect to them when I need to. Doing it from the beginning would only be a waste of time.

 

So far I have not even said what exactly that upcoming project will be, and reality is, that I don’t know exactly yet, thus of course I can’t tell how long it will take me to create it. But still, it may be a good idea to think about time. It will take me the longer, the higher I aim. But how high shall I aim? And how much time do I have?

RPCmagiX, the subject of my Master’s Thesis, took me extraordinarily long, two implementations over the course of five or six years. But then, at that time I had not a clue of the whole business, yes I even lacked a clear understanding of where I was heading. I worked in a lab situation, I noticed that some work got repetitive, and I felt the need to do something about it.

WADL, my past project that I wrote about in the previous post, was created in a completely different situation. It was much bigger, much more advanced, but I had much more experience as well, and generating code already felt natural. Although I needed to implement a lot of infrastructure that today would be supplied by an application server, I was pretty fast. The base architecture and a first implementation took me less than half a year, almost all of the rest was done in the first two years, and all the while I myself was using it in actual projects, and in the second year I was also giving support to other developers.

The current situation is very different again. I have some very precise ideas, I think I mostly know what I need to know, but this time it is not a full-time job, this time it is something that I do in my free time. I may get support, but first I’ll have to provide something useful, something that people can play with, get excited about.

Well, that can easily put us in the multi-year category again, at least when I aim a little higher.

Let’s for the moment assume that it will take me a year to produce something that actually can be used. This will be an unstable implementation, no compatibility promises would be made, but the idea is, to have something that could be used as a shortcut in real projects. You would specify something, it would generate something, you would save a lot of time, but you could not hope to use any future version of the tool on that project. It would still be useful though, especially when you are in a hurry.

In order to be at that point in a year, I’ll have to have a rough implementation sometime next fall, I’d say October is a good time. One of the problems with code generators is, that you have to use them in order to iron out the bugs. I’ll have to make an actual project with my tool, and this has the potential to severely slow me down. I think I have a nice solution though. More about that later.

This is not a project schedule, there are no hard limits, it may easily take me even longer, but I think a year is a milestone. If I don’t have anything substantial in a year, I’ll cancel the project and won’t bother you any further.

So, I guess we’ll see nothing usable this year. I don’t have all the time in the world though. Within a short time I will have committed to a core architecture and a rough feature set. If I want this program to be used (and of course I do, why else would I bother creating it?), I need to create something that is advanced enough to be useful at the time it begins to be usable. It’s the same problem that game developers face: you can’t make a new game for current technology, you have to design it with the future ecosystem in mind.

For me that means I should aim rather higher than lower. Not too high though, because if it takes me ten years to finish it, chances are, that other people have come up with pretty clever ideas in the meantime, and my ten year old design will show its age before it’s even done. I think five years would already be dangerously long, three years wouldn’t worry me though.

OK, without knowing what exactly I do, I have at least a rough time frame. An architecture and initial code in October, something useable for producing prototypes within a year, a commitment to compatibility in maybe two years, a first release in maybe three years, but not later than in five. Let’s see how this works out :)

 

The desire for code reuse has been a driving force behind most efforts in software engineering, and in this article we will look into three increasingly sophisticated ways to achieve reuse.

Libraries

Libraries can be anything from Cobol copy libs to modern shared libraries. They can be the result of meticulous modularization in your own, earlier applications (a rare case), they can be part of a “system library” (like the vast set of libraries in UNIX or the Java Runtime Environment), or they can be third-party libraries, open source or commercial. It does not matter which, all libraries have something in common: They never exactly fit your application.

In order to be re-usable, libraries have to cater to different users. It also does not matter if they are object-oriented or not, you almost always have to initialize something and then to call a procedure or a method, passing data down as parameters, having to check for and to react to errors. Initialization can be cheap or expensive, you may have to do it upon every use or only once at startup or first use. The problem with libraries is, that in order to be useful, they have to have extensive interfaces.

Typical libraries in early GUI systems (e.g. XLib, OSF Motif, etc) or in systems for distributed computing (ONC aka Sun RPC, OSF DCE, Microsoft DCOM, CORBA) have hundreds or thousands of procedures or methods, often with long parameter lists, parameters frequently being of types defined in libraries as well, types that can be created by other library functions and so on and so forth.

Using those libraries quickly riddles your application with infrastructure code, often forces you to structure your code in certain ways, and these ways may be incompatible with other, alternative libraries, making any attempt to switch between alternatives practically impossible.

Frameworks

Frameworks are a much more sophisticated solution for code reuse. They accept the problem of entanglement, embrace it, and reverse the direction of control. Frameworks are the dominant solution today. No more do you call the library, the library calls you. A framework is called a framework, because it provides a frame, a kind of main program that does all initialization and most of the infrastructure plumbing.

Of course you still have to write application code. A framework is like a generic main program, but in order to do anything useful, it must rely on application code. Let me give a very primitive example.

Earliest GUI libraries relied on applications providing a main loop that took care of events. Something like this:

// some variables
MyTypeOfObjects currentlySelected;
...
// the loop
boolean done = false;
do {
    WindowSystem.Event e = WindowSystem.getNextEvent();
    switch (e.getCode()) {
        case WindowSystem.EXPOSE:
            doRedraw();
            break;
        case WindowSystem.MENU_CLICK:
            WindowSystem.MenuDetails menuButton = e.getMenuDetails();
            switch (menuButton.getCode()) {
                case WindowSystem.StandardMenus.OPEN:
                    doOpen(currentlySelected);
                    break;
                case WindowSystem.StandardMenus.EXIT:
                    done = true;
                    break;
                case WindowSystem.StandardMenus.DELETE:
                    // delete currently selected object ...
                ...
                default:
                    log("Received invalid menu entry!");
                    break;
            }
            break; // don't fall through into the key handling
        case WindowSystem.KEY_PRESS:
            // process keys ...
        ...
        default:
            log("Received invalid event!");
            break;
    }
} while (!done);
// maybe some cleanup
...
System.exit(0);

GUI frameworks relieved you of the burden of writing these main loops yourself. You had a main program provided by the framework, and this main program was able to process all possible events, menu buttons or key presses that the window system could ever deliver. By default it would ignore events, but you could register functions like doRedraw() and doOpen(FrameworkTypeOfObjects currentlySelected) to be called in case of certain events.

Obviously this is big progress, but the framework, still being a library, can’t know about your application types. You see how I have changed the method signature of doOpen() from taking a parameter of MyTypeOfObjects to a parameter of FrameworkTypeOfObjects. The framework is in control now, and because it can’t know your data types, it forces you to accept framework data types as parameters.
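The reversal of control can be sketched in a few lines of Java. Everything here is hypothetical (MiniFramework, FrameworkObject, the string event codes), and a real GUI framework is of course far richer; the point is only that the loop now lives in the framework, unregistered events are ignored, and the handler receives a framework type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class MiniFramework {
    // Framework-owned type: handlers receive this, not an application type.
    static class FrameworkObject {
        final String id;
        FrameworkObject(String id) { this.id = id; }
    }

    private final Map<String, Consumer<FrameworkObject>> handlers = new HashMap<>();

    // The application registers callbacks instead of writing the loop.
    void register(String eventCode, Consumer<FrameworkObject> handler) {
        handlers.put(eventCode, handler);
    }

    // The framework owns the main loop; events without a handler are ignored.
    void run(List<String> events, FrameworkObject selected) {
        for (String code : events) {
            Consumer<FrameworkObject> handler = handlers.get(code);
            if (handler != null) {
                handler.accept(selected);
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        MiniFramework framework = new MiniFramework();
        framework.register("OPEN", obj -> log.add("open " + obj.id));
        framework.run(Arrays.asList("EXPOSE", "OPEN"), new FrameworkObject("doc-1"));
        System.out.println(log); // [open doc-1]
    }
}
```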

Again you have reuse, but now the framework forces its abstractions upon you. You have to write less code, but it is not as little as you’d have hoped for, because you now need to write a layer to adapt your abstractions to those of the framework. Of course you can ignore the problem and simply use framework abstractions in your own code, but if you do that, you’re doomed anyway, at least in the long run. New versions of the framework will force you into a deadly maintenance routine, and if the framework ever becomes unavailable, you can happily begin writing your program anew.
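Such an adaptation layer could look roughly like this. It is a sketch with invented names: FrameworkObject stands in for whatever type the framework hands out, Document is an application type, and the lookup by id is just one possible way to map between the two.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class AdapterLayer {
    // Stand-in for a type owned by the framework (hypothetical).
    static class FrameworkObject {
        final String id;
        FrameworkObject(String id) { this.id = id; }
    }

    // Application-owned type; it knows nothing about the framework.
    static class Document {
        final String title;
        Document(String title) { this.title = title; }
    }

    private final Map<String, Document> documentsById = new HashMap<>();

    void add(String id, Document doc) {
        documentsById.put(id, doc);
    }

    // Wraps an application handler so it can be used as a framework callback.
    // Framework objects without a matching document are silently skipped.
    Consumer<FrameworkObject> adapt(Consumer<Document> appHandler) {
        return frameworkObject -> {
            Document doc = documentsById.get(frameworkObject.id);
            if (doc != null) {
                appHandler.accept(doc);
            }
        };
    }

    public static void main(String[] args) {
        AdapterLayer layer = new AdapterLayer();
        layer.add("42", new Document("Quarterly report"));
        Consumer<FrameworkObject> callback =
            layer.adapt(d -> System.out.println("opening " + d.title));
        callback.accept(new FrameworkObject("42")); // prints "opening Quarterly report"
    }
}
```

The application handler only ever sees Document, so a new framework version touches the adapter and nothing else. The price is that the adapter itself is exactly the kind of plumbing code you had hoped to avoid.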

WADL

Sometime around 2001 I was confronted with the request to write a big web application. The application would have to work with a relational database, it would have to be written in Perl and I would have to use kind of a framework that had been developed in-house. A quick analysis identified three user roles, and the number of pages would be greater than 40. I had roughly six months before the system would go productive.

At that time I had almost no experience in Perl, had never used relational databases, and it was going to be my second web application. #1 had been in Perl as well, but it had been an application with two or three pages and the same number of forms. It had been trivial, but as I had hand-crafted it, it had been tedious nevertheless. I was in bad need of a tool.

In a spell of recklessness I used five of the six months to analyze the problem and construct a tool, and then I spent a month building the application. It was a gamble, but it worked, the application was a success and I was in business.

I called this tool WADL (Web Application Definition Language). As with RPCmagiX (I wrote about it in the last post), I never published WADL, and in the meantime the name has been taken. WADL is now a W3C member submission for something like the REST equivalent to WSDL.

My “WADL” was more, much more. It was a way to specify the structure and visual details of a web application. The specification was done in XML, and with a code generator you could generate a complete application. All the pages and forms were there, they only had no content. For prototyping purposes you could associate dummy data with the pages, and this way it was possible to create a complete prototype without writing one single line of code. The pages displayed meaningful data, it’s only that the data sent from one page had no influence on the next page. In cases where the result page was determined from input data, the prototype would pop up a choice box where you could select the desired outcome.

You had one XML file for the structure of the application (the “Application Definition”) and one XML file for each page (“Page Definition”). Furthermore you could have XML files describing database structures with tables, views, foreign key relations, etc.

The application definition consisted of some application attributes (most important the name), the definition of roles (like “user” or “administrator”), the reference to the database definitions if any, and finally the definition of the graph of pages. There were start pages (those that you could directly address from a GET request, at least one per role) and other pages. Each page had an attribute “roles”, specifying the roles that could get to that page. Events took you from page to page, each event corresponding to a button on the page that could be pressed and that would submit a form.

Roles could overlap. Think of a system where a role “user” can search for and display data. A second role “admin” can enter new data, but of course “admin” can search and display as well. The roles overlap, “admin” shares part of his graph of pages with “user”.
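The graph of pages and the overlapping roles can be modeled in a few lines. The following Java sketch is not WADL (which was XML plus Perl); the class names, page names and events are invented, and the reachability rule (start from start pages, follow events, skip pages whose roles attribute excludes the role) is my reading of the description above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class PageGraph {
    static class Page {
        final String name;
        final boolean startPage;
        final Set<String> roles;
        final Map<String, String> events = new LinkedHashMap<>(); // event -> target page

        Page(String name, boolean startPage, String... roles) {
            this.name = name;
            this.startPage = startPage;
            this.roles = new HashSet<>(Arrays.asList(roles));
        }

        Page on(String event, String targetPage) {
            events.put(event, targetPage);
            return this;
        }
    }

    private final Map<String, Page> pages = new LinkedHashMap<>();

    PageGraph add(Page page) {
        pages.put(page.name, page);
        return this;
    }

    // Pages a role can get to: begin at the start pages and follow events,
    // skipping every page whose "roles" attribute does not include the role.
    Set<String> reachable(String role) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        for (Page p : pages.values()) {
            if (p.startPage) todo.add(p.name);
        }
        while (!todo.isEmpty()) {
            Page p = pages.get(todo.remove());
            if (p == null || !p.roles.contains(role) || !seen.add(p.name)) continue;
            todo.addAll(p.events.values());
        }
        return seen;
    }

    static PageGraph demo() {
        return new PageGraph()
            .add(new Page("Search", true, "user", "admin").on("FIND", "Display").on("NEW", "Edit"))
            .add(new Page("Display", false, "user", "admin").on("BACK", "Search"))
            .add(new Page("Edit", false, "admin").on("SAVE", "Display"));
    }

    public static void main(String[] args) {
        System.out.println(demo().reachable("user"));  // [Search, Display]
        System.out.println(demo().reachable("admin")); // [Search, Display, Edit]
    }
}
```

The "user" graph is a subset of the "admin" graph, which is exactly the overlap described above.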

The page definitions basically described what was on the pages. There were text blocks and form blocks, and within form blocks you had form elements like input fields, text areas, select boxes, labels, grouping elements, etc. A layout generator would automatically generate a layout, conforming to our internal style guide, but the system was modular, layout generators could be plugged in, and it was even possible to use HTML templates (I called them “HTML Makeup”) on a per-page basis.

From the database definitions WADL created Perl classes, one for each table definition. A support library handled encapsulated database access.

That’s about what you got from XML alone. For everything else you had to write Perl code, but WADL generated templates to give you a quick start. You had to implement so-called “Processors” for all possible page transitions. In cases where the target page was determined at runtime, the processor was a single method process() in a class Processors::OriginatingPage_EVENT (edge processor, processing the edges corresponding to a single event), the return value of this method determining the target page, and in all other cases it was a method OriginatingPage_EVENT() in a Perl class Processors::Page::TargetPage (page processor, processing all incoming edges to a page), with “OriginatingPage” and “TargetPage” being the respective page names and “EVENT” being the name of the event. To implement processors, you simply copied from the generated template directory and began inserting code. This worked pretty well, because due to the prototyping system, basic questions about application structure could normally be answered very early.

The processors communicated with the application via generated input and output objects. Thus they did not have to care about the actual page structure. They took values from methods named like page elements, but it did not matter whether a value came from an input field or from a text box.
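The original processors were Perl classes; the following is only a Java transposition of the idea of an edge processor, with invented names throughout (ProcessorSketch, Input, Output, the LoginPage example). It shows the two properties described above: the processor reads values by page element name without knowing what kind of element they came from, and its return value names the target page.

```java
import java.util.HashMap;
import java.util.Map;

class ProcessorSketch {
    // Input object: values are looked up by page element name, regardless of
    // whether they came from an input field, a text area or a select box.
    static class Input {
        private final Map<String, String> values = new HashMap<>();
        Input with(String element, String value) { values.put(element, value); return this; }
        String get(String element) { return values.getOrDefault(element, ""); }
    }

    // Output object: values for the elements of the target page.
    static class Output {
        private final Map<String, String> values = new HashMap<>();
        void set(String element, String value) { values.put(element, value); }
        String get(String element) { return values.get(element); }
    }

    // Edge processor: the return value determines the target page.
    interface EdgeProcessor {
        String process(Input in, Output out);
    }

    // Hypothetical processor for the event LOGIN on a page "LoginPage".
    static final EdgeProcessor LOGIN_PAGE_LOGIN = (in, out) -> {
        if (in.get("password").isEmpty()) {
            out.set("message", "Password required");
            return "LoginPage"; // stay on the originating page
        }
        out.set("greeting", "Hello " + in.get("user"));
        return "Welcome";
    };

    public static void main(String[] args) {
        Output out = new Output();
        Input in = new Input().with("user", "ann").with("password", "s3cret");
        System.out.println(LOGIN_PAGE_LOGIN.process(in, out)); // Welcome
        System.out.println(out.get("greeting"));               // Hello ann
    }
}
```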

WADL: Additional Benefits

Knowing so much about an application opens up many opportunities that you probably have not even thought of. One of WADL’s most successful features was a byproduct of my curiosity. I wanted to know how much of the application I had already implemented, and so I began collecting data, but then I thought, why not visualize it?

I already knew the Graphviz project, and it was fairly simple to write a program that created an application graph for each role and a graph for the database structure. The nodes in the application graphs represented pages, the edges were events. Blue edges represented events for which processors were already in place, gray edges represented events without processors yet.
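Producing such a graph takes very little code, because Graphviz reads a plain-text format (DOT). The sketch below is not the original program; the Edge class and the page and event names are invented. It only shows how edges with and without processors could be written out in blue and gray.

```java
import java.util.Arrays;
import java.util.List;

class AppGraphDot {
    static class Edge {
        final String from;
        final String event;
        final String to;
        final boolean implemented; // true if a processor already exists

        Edge(String from, String event, String to, boolean implemented) {
            this.from = from;
            this.event = event;
            this.to = to;
            this.implemented = implemented;
        }
    }

    // Writes one directed graph per role in Graphviz DOT syntax.
    static String toDot(String role, List<Edge> edges) {
        StringBuilder dot = new StringBuilder("digraph \"" + role + "\" {\n");
        for (Edge e : edges) {
            dot.append("  \"").append(e.from).append("\" -> \"").append(e.to)
               .append("\" [label=\"").append(e.event)
               .append("\", color=").append(e.implemented ? "blue" : "gray")
               .append("];\n");
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
            new Edge("Search", "FIND", "Display", true),
            new Edge("Search", "NEW", "Edit", false));
        System.out.print(toDot("admin", edges));
    }
}
```

The output can be fed to the dot command line tool to render an image.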

Nodes were clickable, and they brought you to another graph, showing the incoming and outgoing events for that page. Here the events were clickable, and they brought you to the actual code of the processors. From there you could go on to the next page graph and so on. Essentially you could click through the source code in exactly the same way as you would navigate through the application.

This visualization proved to be one of WADL’s most successful features, because it was trivial to assess a project’s progress. You only had to look at the blue and gray edges and at some numbers in the statistics section.

Another byproduct was a code generator for database transformations. It took two database definitions and transformation rules, and from that it generated a script for transferring data from one database to another, doing all necessary transformations. Machine-readable knowledge about structure – you can do all sorts of things with it.
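As a rough illustration of the idea (not of the original generator, whose output format is not described here), the following Java sketch assumes the simplest possible case: one source table, one target table, and rules that map each target column to an SQL expression over source columns. Table and column names are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TransformGenerator {
    // rules: target column -> SQL expression over the source table's columns.
    // A LinkedHashMap keeps columns and expressions in the same order.
    static String generate(String sourceTable, String targetTable, Map<String, String> rules) {
        String columns = String.join(", ", rules.keySet());
        String expressions = String.join(", ", rules.values());
        return "INSERT INTO " + targetTable + " (" + columns + ")\n"
             + "SELECT " + expressions + "\n"
             + "FROM " + sourceTable + ";\n";
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("full_name", "first_name || ' ' || last_name");
        rules.put("email", "LOWER(mail)");
        System.out.print(generate("customer", "customer_v2", rules));
    }
}
```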

But there’s more. WADL was extremely easy to learn. You had guidance in XML via the DTDs, the whole project structure was generated, there were commands to generate templates for HTML makeup, processor templates could be generated, the templates were commented, thus it was all a matter of copying some files and filling in code where the comments hinted at it. One of our programmers had never before written a Perl application, and his first project was the biggest WADL application that was ever built. More than a hundred pages, many hundreds of processors, a three step workflow, and at the end of it the input of roughly a hundred users was compiled into an automatically created PDF full of tables. He created the application, finished it in time, and I did the PDF creation code.

What I did was implement an HTML to LaTeX translator, and then we typeset the document on the fly at download time. Using HTML as input had the advantage that we could display the same code on a preview page. I took the code that I had written and made it part of WADL’s tool set.

But there’s even more. WADL automatically structured the projects. You never had to write any plumbing code, everything was always the same, regardless of project, regardless of project phase. Adding a functionality might mean adding a page and some processors, but that never complicated the project. Each new page had the same complexity, each processor needed the same effort as any other processor. WADL scales linearly.

I can’t imagine any pure library system or any pure framework that could ever scale that way. They can’t, because you always have to write plumbing code, code that’s repetitive and tedious to write, and whenever you do it, you do it in a slightly different way. Only then you complicate things, because what should be similar, becomes different, and over time it turns the project into a maintenance nightmare.

It’s a hard fact: such code should not be written. It can’t be simplified, because it is complicated by nature. It can’t be packed into libraries, because though it follows patterns, it is never the exact same. It’s similar structure we’re talking about, not sameness of code. There’s nothing to be factored out.

While RPCmagiX (see the last post) was a tool to create the ideal library for your interface, WADL was a tool to create the ideal framework for your application. Both would have been impossible without code generators.

WADL is still in use, but I did not get the funding to keep it current. It is pretty outdated now. The base mechanism is still CGI, that means one process per request, we have no AJAX support and I have never found a good way to keep up with .NET’s excellent support for SOAP.

And Now?

If I were to implement WADL now, I would use the Java Enterprise Edition as its basis. It implements all that a big, scalable application could possibly need, it does it in a quite elegant way, and this basis would also make it more acceptable to management, would make it less of a risky, exotic solution tied to one person.

It’s only that I am not interested in WADL any more. I have solved it once, I could do it again, but there would be no challenges, no surprises, only tedious work. I intend to aim higher. How high, that’s what we will find out as this blog develops, as a plan begins to form, as I get input, as we discuss these matters. A first sketch will follow soon.

© 2010 Andreas Manessinger