Short description: Katja is a tool which generates rich Java libraries for order-sorted, immutable datatypes from concise specifications.

Katja allows you to easily define immutable datatypes, which you can then use by adding a generated library to your system. It is a successor to the MAX tool developed by Arnd Poetzsch-Heffter.

Why have dedicated data types? In many cases it is desirable to just represent data, rather than tie its type definitions to any specific form of usage. You want to pass data around in a system, especially across interfaces between larger system components. In Java you have no choice but to represent data by defining your own classes, which can quickly become tedious. Especially when there are many common operations you want to use on any kind of data, you end up either giving up type safety or writing a lot of redundant code.

Why have immutability? Because you can design your system in an easier and safer way. Consider Strings in Java: nothing bad can happen when passing Strings around. No one can change a String you gave away, which allows you to write programs in an easier and more readable fashion. If you want someone to work on or update your String, you have to make this explicit by defining methods which return the new version of the String. With Katja you can do the same for more complex data.
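
The String analogy can be made concrete with a small hand-written sketch. The class `Money` and its method `withCents` are hypothetical names chosen for illustration only. They are not output of Katja, and the classes Katja generates may look quite different.

```java
// Hand-written illustration of the immutability idea; NOT Katja-generated code.
final class ImmutabilityDemo {
    public static void main(String[] args) {
        String name = "katja";
        String upper = name.toUpperCase(); // returns a new String, 'name' is unchanged
        System.out.println(name + " / " + upper);

        Money price = new Money("EUR", 1999);
        Money discounted = price.withCents(1499); // 'price' is unchanged
        System.out.println(price.cents() + " / " + discounted.cents());
    }
}

// Hypothetical immutable data class: all fields are final and there are no setters.
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String currency() { return currency; }

    long cents() { return cents; }

    // An "update" is explicit: it returns a new value and leaves the receiver untouched.
    Money withCents(long newCents) {
        return new Money(currency, newCents);
    }
}
```

Because every update yields a new object, a `Money` value handed to another component can never change behind the caller's back.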

Why use code generation? Because Java is very limited in what you can do with just a general library. By generating a specific library for each specification, we can offer a lot of benefits. Not the least of these is a high degree of static type safety, which can reveal many potential bugs early and, together with an IDE, helps you develop a correct program from the start. Katja is also designed in such a way that you never have to modify the generated code. Changing your data specification is a common task: just rerun Katja, update the library in your IDE, and then adjust only those usages of changed types that actually need adjusting.


  1. Invoices (small introductory example)
  2. Formulas (small example featuring term positions)


  • Types are generated immutable.
  • Supports the use of the basic Java types, Strings and in general any immutable type you may have defined yourself.
  • Supports definition of tuples, lists and their subtype relation (called variants).
  • Creates convenient methods for construction and selection of terms.
  • Creates convenient methods to change tuple components.
  • Creates a rich interface on lists.
  • Creates interfaces and support methods to visit terms and switch between cases of variants.
  • Supports the concept of term positions, i.e. the notion of location of a term contained in a much larger term.
  • Supports the navigation on terms using positions.
  • Supports the iteration over terms in pre- or post-order, using positions.
  • Supports convenient replacement of small pieces of data deeply embedded in a larger term, using positions.
  • Supports persistence of both terms and positions.
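
To give an impression of what term positions are about, here is a minimal hand-written sketch. It assumes that a position can be modelled as a path of child indices from the root. The class `Term` and the methods `at`, `replaceAt` and `show` are hypothetical and do not reflect the API of a Katja-generated library, which is typed per specification rather than generic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hand-written illustration of positions on immutable terms; NOT the Katja API.
final class PositionDemo {
    public static void main(String[] args) {
        // The term (1 + 2) * 3
        Term t = Term.node("*", Term.node("+", Term.leaf("1"), Term.leaf("2")), Term.leaf("3"));
        // Replace the subterm at position [0, 1] (the "2") by "5".
        Term u = t.replaceAt(new int[] {0, 1}, Term.leaf("5"));
        System.out.println(t.show()); // prints *(+(1, 2), 3)
        System.out.println(u.show()); // prints *(+(1, 5), 3)
    }
}

final class Term {
    final String label;
    final List<Term> children; // unmodifiable, so the whole term is immutable

    private Term(String label, List<Term> children) {
        this.label = label;
        this.children = children;
    }

    static Term leaf(String label) {
        return new Term(label, Collections.<Term>emptyList());
    }

    static Term node(String label, Term... children) {
        return new Term(label, Collections.unmodifiableList(new ArrayList<>(Arrays.asList(children))));
    }

    // Navigation: follow the child indices from the root.
    Term at(int[] path) {
        Term current = this;
        for (int index : path) {
            current = current.children.get(index);
        }
        return current;
    }

    // Replacement: rebuild only the nodes along the path, share everything else.
    Term replaceAt(int[] path, Term replacement) {
        return replaceAt(path, 0, replacement);
    }

    private Term replaceAt(int[] path, int depth, Term replacement) {
        if (depth == path.length) {
            return replacement;
        }
        int index = path[depth];
        List<Term> copy = new ArrayList<>(children);
        copy.set(index, children.get(index).replaceAt(path, depth + 1, replacement));
        return new Term(label, Collections.unmodifiableList(copy));
    }

    String show() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder(label).append("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(children.get(i).show());
        }
        return sb.append(")").toString();
    }
}
```

The original term `t` stays valid after the replacement, and the untouched subterm `3` is shared between `t` and `u` rather than copied.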


In the absence of dedicated user documentation, these reports can be used to understand what Katja can do and how certain features work. Note, however, that many technical aspects have changed over time; where descriptions differ, the more recent report supersedes the older one.

  • P. Michel: Redesign and Enhancement of the Katja System (pdf), Technical Report (354/06), University of Kaiserslautern, October 2006

  • P. Michel: Adding Position Structures to Katja (pdf), Technical Report (353/06), University of Kaiserslautern, June 2005

  • J. Schäfer: Generating Order-Sorted Data Types in Java (pdf), Project Report, University of Kaiserslautern, February 5, 2004


Do not hesitate to contact us if you need assistance in using Katja, have questions, or have discovered a bug in Katja. Also, do not hesitate to ask if you have a feature request! Katja can already do a lot for you, probably more than you are aware of!

To Be Implemented

  • A better parser. Katja does not have a complicated syntax, yet the error messages you get from simple mistakes could be a lot better. The most common mistakes with Katja specifications, however, are not on a syntactic level, but lie within the semantics of the specification. For those you will already get the kind of helpful messages you would expect.
  • Garbage collection for term sharing. Katja uses term sharing extensively, to reduce the memory footprint of data structures (among other reasons). Unfortunately, the current implementation never frees a term once it has been allocated (which is to say we are aware of a memory leak in our implementation). This is easy to fix and can be done when the need arises.
    This has been fixed as of version 5809.
  • Performance improvements on position updates. We have not yet encountered any application where performance was really an issue. Still, we are already aware of a different implementation technique that would improve it further. This improvement can be made when the need arises.
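
The term sharing item above can be illustrated with a generic sketch of a sharing cache that does not leak. It assumes the common Java technique of a `WeakHashMap` whose values are `WeakReference`s, so that a shared instance can be reclaimed once nothing else refers to it. The class `Pair` and its factory method `of` are hypothetical, and this is not how Katja itself implements its cache.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hand-written illustration of term sharing with a collectable cache; NOT Katja's implementation.
final class SharingDemo {
    public static void main(String[] args) {
        Pair a = Pair.of("x", "y");
        Pair b = Pair.of("x", "y");
        System.out.println(a == b); // prints true: structurally equal terms share one instance
    }
}

final class Pair {
    // Keys and values are both held weakly, so an unused term can be garbage collected.
    private static final Map<Pair, WeakReference<Pair>> CACHE = new WeakHashMap<>();

    final String left;
    final String right;

    private Pair(String left, String right) {
        this.left = left;
        this.right = right;
    }

    // Factory method: returns the shared instance for the given components.
    static synchronized Pair of(String left, String right) {
        Pair candidate = new Pair(left, right);
        WeakReference<Pair> ref = CACHE.get(candidate);
        Pair shared = (ref == null) ? null : ref.get();
        if (shared == null) {
            CACHE.put(candidate, new WeakReference<>(candidate));
            shared = candidate;
        }
        return shared;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Pair)) return false;
        Pair p = (Pair) other;
        return left.equals(p.left) && right.equals(p.right);
    }

    @Override
    public int hashCode() {
        return 31 * left.hashCode() + right.hashCode();
    }
}
```

With sharing in place, equality of terms reduces to a reference comparison, which is one reason the technique is attractive beyond its memory savings.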


Note that in order to use Katja, you do not need to run any build scripts or the like. Katja uses itself and was created using bootstrapping. So you can just use the library jar file of Katja, which is always the most recent version of Katja itself. The katja-jar shell script might help you with just running Katja.

  • katja-2.2.tar.bz2 Version 2.2 (improved Haskell backend)
  • katja-5822.tar.bz2 Version 2.1
  • Changes from 5738 to 5822:
    • Implemented garbage collection of the term sharing cache.
    • Implemented garbage collection of the position cache.
    • Renamed =visitSort= methods in the =Visitor= interfaces back to =visit=.
    • Implemented a simple XML parser/unparser facility in =katja.common.XML=. To run, the class needs either JDK 1.6 or the JSR 173 (StAX) API together with an implementation of it.
  • katja.5738-1.tar.bz2 Version 2.0