Saturday, June 27, 2009

Consistent SVN branch and tag names for maven projects

When using maven together with subversion (SVN), it is good to have a naming strategy for your branches and tags which is consistent with the maven versions of your projects (in the pom.xml files). A consistent naming strategy makes it trivial to know which version of a maven project can be found where in the SVN repository (and vice versa), and it prevents possible version conflicts between maven projects when for example branching twice from the same starting point.

We designed a simple scheme at work which suits us well. I thought I might as well share it, since you might find it useful.
  • In principle, we have an X.Y.Z numbering scheme for all our maven projects (but having only one or two numbers would be possible as well); for example someproject-3.4.4.
  • All main development happens on trunk (for the "current" release). A maven project on trunk thus has a version number X.Y.Z-SNAPSHOT; for example someproject-3.4.5-SNAPSHOT.
  • When releasing a maven project, the word "SNAPSHOT" is removed from the version; for example someproject-3.4.5. On trunk, development continues with the next SNAPSHOT version; for example someproject-3.4.6-SNAPSHOT (or perhaps 3.5.0-SNAPSHOT or even 4.0.0-SNAPSHOT).
  • A released project is tagged in SVN as ${artifactId}-${version}; for example http://svnserver/svnrepo/tags/someproject-3.4.5.
  • A branch (for fixes in "old" releases) is always taken starting from a tag. Branching a project tagged with version X.Y.Z results in a branch SVN directory called X.Y.Z.1.x; for example http://svnserver/svnrepo/branches/someproject-3.4.5.1.x (the 1 means "first branch", the x indicates that that is the only part of the version which changes in this branch during the lifetime of the branched project).
  • If another branch is taken from the same tag, it becomes X.Y.Z.2.x, and so on; for example http://svnserver/svnrepo/branches/someproject-3.4.5.2.x. (This actually corresponds to how CVS does it: branching introduces two extra numbers.)
  • The version of the maven project in a branch follows this naming strategy; for example the maven version of the project in .../branches/someproject-3.4.5.1.x is initially 3.4.5.1.0-SNAPSHOT.
  • Releasing from a branch results in a tag with 5 numbers; for example the first release would be someproject-3.4.5.1.0 and the branch would continue with 3.4.5.1.1-SNAPSHOT.
  • Branching on a branch is also possible, just add another two numbers; for example X.Y.Z.1.2.1.x (the first branch taken from release X.Y.Z.1.2, which was itself released from the first branch of the project with version X.Y.Z). In our case, this has been necessary only once though.
I guess we aren't the first team to invent this, but I have never seen it written down for the SVN/maven combination like this.
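
The naming rules above are mechanical enough to write down as code. The following is only a sketch of the scheme as described in this post: the class and method names are made up for illustration and are not part of maven or SVN tooling.

```java
/** Sketch of the naming scheme from this post; all names here are hypothetical. */
final class SvnMavenNaming {

    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    private SvnMavenNaming() {
    }

    /** Tag directory for a release: artifactId-version. */
    static String tagName(String artifactId, String releasedVersion) {
        return artifactId + "-" + releasedVersion;
    }

    /** Directory of the n-th branch taken from a tag, for example someproject-3.4.5.1.x. */
    static String branchName(String artifactId, String taggedVersion, int branchNumber) {
        return artifactId + "-" + taggedVersion + "." + branchNumber + ".x";
    }

    /** Initial maven version inside a new branch, for example 3.4.5.1.0-SNAPSHOT. */
    static String initialBranchVersion(String taggedVersion, int branchNumber) {
        return taggedVersion + "." + branchNumber + ".0" + SNAPSHOT_SUFFIX;
    }

    /** Version used for a release: the snapshot version without "-SNAPSHOT". */
    static String releaseVersion(String snapshotVersion) {
        if (!snapshotVersion.endsWith(SNAPSHOT_SUFFIX)) {
            throw new IllegalArgumentException("not a snapshot version: " + snapshotVersion);
        }
        return snapshotVersion.substring(0, snapshotVersion.length() - SNAPSHOT_SUFFIX.length());
    }

    /** Default next development version: increment the last number and append "-SNAPSHOT". */
    static String nextSnapshot(String releasedVersion) {
        int lastDot = releasedVersion.lastIndexOf('.');
        String prefix = releasedVersion.substring(0, lastDot + 1); // empty if there is no dot
        int lastNumber = Integer.parseInt(releasedVersion.substring(lastDot + 1));
        return prefix + (lastNumber + 1) + SNAPSHOT_SUFFIX;
    }
}
```

Because a branch is always taken from a tag, branching on a branch needs no special case: calling branchName with the version of a branch release (such as 3.4.5.1.2) simply appends two more numbers.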

Tuesday, June 23, 2009

How to provide hibernate persistence support along with class libraries

Suppose you want to encapsulate some kind of functionality in a class library, and you want to provide an easy way for the clients of your library to persist certain entities from your library with hibernate. I think the most straightforward way to do this is to provide hibernate UserType implementations along with the entities in the class library that the client might want to persist. Joda Time, for example, uses this approach. Joda Time is a class library to work with dates and times, and the Joda Time Hibernate Support project provides hibernate UserType implementations for many of the entities from Joda Time.

This approach works perfectly well, but it is limited to what one can express in a hibernate UserType. One of those limitations is that you cannot (or almost cannot) use a single UserType to persist an entity that has a one2many relationship with something else (probably a value type). I encountered this problem when writing a class library to manage IP address pools (an IP address pool is a contiguous range of IP addresses of which some are "free" and some are "allocated"). I represent an IP address pool as a class (IpAddressPool) which contains a sorted set of free sub-ranges in the pool. Persisting this would naturally result in a one2many mapping (using hibernate to map the SortedSet). Writing a single UserType which is capable of persisting IpAddressPool instances (including the one2many mapping) is (almost) not possible. I say almost, because it might be possible to access the JDBC connection directly and do everything by hand, but obviously I would like hibernate to manage the one2many relation.

So, when UserType implementations are not an option, what other options do we have?

  • we can simply do nothing and leave it up to the client to create the necessary hibernate mappings;
  • we can provide a mapping and perhaps even a DAO and let the client use that by "injecting" its session factory into it;
  • we can do the same with annotations;
  • we can prevent the one2many mapping by serializing the sorted set into a single string representation or something like that, and use a simple UserType after all;
  • ... ?

The first solution is not really a solution at all. Apart from not helping the client of your library with the persistence aspect, it also means that we need to expose the internals of our classes such that clients can write correct hibernate mapping files for them.

The second solution is a bit better, but still not as good as the UserType approach. It typically results in having two mapping files (one from the class library, one from the client application) and thus complicates things like schema generation. It also makes it more cumbersome to integrate it with your client application. The provided DAO for example, might not be aligned with the DAO style of the client application, or the DAO might not provide certain queries the user wants to perform. Also, the client application might use annotations to express the hibernate mappings, while the class library uses an XML mapping file. Querying on properties of the entity in the class library is possible, but it means the internal representation has to be exposed. The UserType (and more specifically CompositeUserType) API has a nicer way to expose properties on which HQL queries can be performed without having to expose the internals of the entities in your class library, which can unfortunately not be used in this solution. It also becomes a bit problematic if you have multiple entities in your class library of which a client might only use a few.

The third solution is basically the same concept as the previous one, except that we use annotations instead of mapping files in the class library. It has all the typical advantages of annotations, but also means that the class library now needs a dependency on hibernate, even if the client doesn't care about persistence.

The fourth solution is more a workaround than a real solution. I think it might be appropriate in some situations though. In the IP address pool example this actually works very well, although it could become a problem for very large pools with lots of fragmentation (many entries in the sorted set of free sub-ranges) and it also complicates manual SQL database queries.
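
To make the fourth option a bit more concrete, here is a minimal sketch of the serialization part only. It assumes the free sub-ranges are stored as pairs of non-negative long values; the FreeRangeCodec and Range names and the "first-last,first-last" column format are invented for this illustration and are not the actual code of my library. A simple UserType could call encode() from its nullSafeSet method and decode() from its nullSafeGet method.

```java
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical codec that stores all free sub-ranges of a pool in one string column. */
final class FreeRangeCodec {

    /** A free sub-range [first, last]; addresses are assumed to be non-negative longs. */
    static final class Range implements Comparable<Range> {
        final long first;
        final long last;

        Range(long first, long last) {
            if (first < 0 || first > last) {
                throw new IllegalArgumentException("invalid range: " + first + "-" + last);
            }
            this.first = first;
            this.last = last;
        }

        @Override
        public int compareTo(Range other) {
            int byFirst = Long.compare(first, other.first);
            return byFirst != 0 ? byFirst : Long.compare(last, other.last);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Range && compareTo((Range) o) == 0;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(first) * 31 + Long.hashCode(last);
        }
    }

    private FreeRangeCodec() {
    }

    /** Encodes the set as "first-last,first-last,..."; an empty set becomes an empty string. */
    static String encode(SortedSet<Range> freeRanges) {
        StringBuilder column = new StringBuilder();
        for (Range range : freeRanges) {
            if (column.length() > 0) {
                column.append(',');
            }
            column.append(range.first).append('-').append(range.last);
        }
        return column.toString();
    }

    /** Decodes a column value produced by encode(); null or empty yields an empty set. */
    static SortedSet<Range> decode(String column) {
        SortedSet<Range> freeRanges = new TreeSet<>();
        if (column == null || column.isEmpty()) {
            return freeRanges;
        }
        for (String part : column.split(",")) {
            String[] bounds = part.split("-");
            freeRanges.add(new Range(Long.parseLong(bounds[0]), Long.parseLong(bounds[1])));
        }
        return freeRanges;
    }
}
```

The sketch also shows where the drawbacks come from: the column grows with every extra fragment in the pool, and a manual SQL query cannot look inside the string without parsing it.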
 
If anybody has better suggestions, or if anybody can prove me wrong on the claim that this is not feasible with a UserType, I'd be happy to hear about it.
 

Monday, June 8, 2009

Modernizing MultithreadedTC: JUnit 4 integration

MultithreadedTC is a nice library to help write tests for multithreaded applications (go check out their website for some examples, I can't explain it better). Writing and testing multithreaded applications is not an easy task, so having a decent test suite to help with that task is no luxury at all. I think MultithreadedTC does a nice job; it certainly helped me to write better and more elegant tests than I would have been able to without the library.

One thing I find a bit frustrating, though, is that it doesn't integrate with JUnit (4) very nicely. Basically, with MultithreadedTC, you have to write a class which extends MultithreadedTestCase, with some methods that start with "thread" (a naming convention for MultithreadedTC to know what to run in which threads):
class MyMultithreadedTest extends MultithreadedTestCase {
  public void threadFoo() {
    ...
  }
}
To execute this "multi threaded test case", you have to instantiate it and pass it on to one of the static methods in the TestFramework:
TestFramework.runOnce(new MyMultithreadedTest());
Although the name MultithreadedTestCase suggests it is a JUnit test, it is not. To get it all bootstrapped in a JUnit test, you typically end up with an inner class (extending MultithreadedTestCase) inside a JUnit test, and a single JUnit test method to call the TestFramework.runOnce() method:
public class SomeTest {
  class MyMultithreadedTest extends MultithreadedTestCase {
    public void threadFoo() {
      ...
    }

    ...
  }
  
  @Test
  public void test() {
    TestFramework.runOnce(new MyMultithreadedTest());
  }
}
This obviously works, but I think we can do better.

I updated the MultithreadedTC library such that multithreaded tests are real JUnit tests (including all the benefits that has, such as using the @After and @Before annotations), instead of merely something which can be bootstrapped from within JUnit. I also added some annotations for the "thread" methods, instead of the naming conventions.

The above now looks like this:
public class SomeTest {

  @Thread("Foo")
  public void threadFoo() {
    ...
  }
  
  @Test
  public void test() {
    ...
  }
}
Consider the first example on the authors' website for a more complete example. This can now be written like this:
public class MTCCompareAndSetTest extends MultithreadedTestCase
{

    AtomicInteger ai;

    @Before
    public void initialize()
    {
        ai = new AtomicInteger(1);
    }

    @Threaded
    public void getTwoSetThree()
    {
        while (!ai.compareAndSet(2, 3)) Thread.yield();
    }

    @Threaded
    public void getOneSetTwo()
    {
        assertTrue(ai.compareAndSet(1, 2));
    }

    @Test
    public void resultShouldBeThree()
    {
        assertEquals(3, ai.get()); // JUnit expects (expected, actual)
    }
}
Advantages of this approach include: no inner classes, no explicit TestFramework.runOnce() method calls, usage of standard JUnit annotations, test class is a JUnit class, annotations instead of naming conventions, ...
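
For readers wondering how annotations can replace the "thread" naming convention, the idea can be sketched with plain reflection. This is not the code of my modified library: the @RunInOwnThread annotation, the AnnotatedThreadRunner class and the CompareAndSetExample class are hypothetical stand-ins, there is no JUnit involved, and it leaves out everything else MultithreadedTC does. It only shows the discovery step: find the annotated methods, run each one in its own thread, and wait for all of them.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical marker annotation, standing in for the library's own annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RunInOwnThread {
}

/** Hypothetical runner: starts one thread per annotated method and joins them all. */
final class AnnotatedThreadRunner {

    private AnnotatedThreadRunner() {
    }

    static void runOnce(final Object testCase) {
        final AtomicReference<Throwable> firstFailure = new AtomicReference<>();
        List<Thread> threads = new ArrayList<>();
        for (final Method method : testCase.getClass().getDeclaredMethods()) {
            if (!method.isAnnotationPresent(RunInOwnThread.class)) {
                continue; // not a "thread" method
            }
            method.setAccessible(true);
            threads.add(new Thread(() -> {
                try {
                    method.invoke(testCase);
                } catch (InvocationTargetException e) {
                    firstFailure.compareAndSet(null, e.getCause()); // failure inside the method
                } catch (Throwable e) {
                    firstFailure.compareAndSet(null, e);
                }
            }, method.getName()));
        }
        for (Thread thread : threads) {
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for " + thread.getName(), e);
            }
        }
        // Report the first failure seen in any thread to the caller.
        Throwable failure = firstFailure.get();
        if (failure instanceof RuntimeException) {
            throw (RuntimeException) failure;
        }
        if (failure instanceof Error) {
            throw (Error) failure;
        }
        if (failure != null) {
            throw new IllegalStateException(failure);
        }
    }
}

/** Plain-Java variant of the compare-and-set example, without JUnit. */
final class CompareAndSetExample {
    final AtomicInteger ai = new AtomicInteger(1);

    @RunInOwnThread
    void getTwoSetThree() {
        while (!ai.compareAndSet(2, 3)) {
            Thread.yield();
        }
    }

    @RunInOwnThread
    void getOneSetTwo() {
        if (!ai.compareAndSet(1, 2)) {
            throw new AssertionError("expected the initial value 1");
        }
    }
}
```

Calling AnnotatedThreadRunner.runOnce(new CompareAndSetExample()) runs both methods concurrently; once it returns, the AtomicInteger holds 3. In a JUnit integration, a step like this would run before each @Test method so that the test method can assert on the final state.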

Let me know if you are interested in a tgz file which contains the code of this modified MultithreadedTC library. It also contains all the tests of the original source code, adapted to the more modern version.

I have tried to contact the authors of MultithreadedTC, asking them if they are interested. I am willing to share my modifications with them to get them included in the MultithreadedTC library, but I'm still awaiting their reply. Maybe they are deadlocked ;-)