practicing techie

tech oriented notes to self and lessons learned

Monthly Archives: January 2013

A JVM polyglot experiment with JRuby

The nice thing about hobby technology projects is that you get to freely explore and learn new things. Sometimes this freedom makes the project go off at a tangent, and it’s in those cases in particular when you get to explore.

Some time ago I was working on a multi-vendor software development project. We had trouble making developers follow Git commit message guidelines and asking multiple times didn’t help, so I thought I’d implement a technological solution for this. Our repositories were hosted at GitHub, so I studied the post-receive hooks mechanism, learned a bit of Ruby and implemented my own service hook that validates the message format against a configurable format, generates an email using a configurable template and delivers it to selected recipients. Post-receive hooks don’t prevent people from committing with invalid messages, but I chose to go with a centralized solution that would not require every developer to configure their repository. I submitted my module, test code and documentation to GitHub, but the service hook implementation was rejected.

After the dead-end, I decided to try enforcing a commit message policy using a server-side hook that could actually prevent invalid commits. That solution was technically viable, but as suspected, it turned out that not all developers were willing to configure the hook in their repository. Also, every once in awhile when developers do a clean clone of the repository, the configuration needs to be redone.

So, I decided to study how service hooks could be run on an external system instead of being hosted on GitHub. The “WebHook” service hook allows you to deliver the post-receive event anywhere over HTTP. GitHub also makes service hook implementations available to be run on your own servers. The easy way would’ve been to simply take my custom service hook implementation and run it on our server. In addition to being too easy, there were some limitations with this approach as well:

  • a github-services server instance can only have a single configuration i.e. you can’t serve multiple repositories each with different configurations
  • the github-services server dispatches data it receives to a single service based on the request URL. It’s not possible to dispatch the data to a set of services.
  • you have to code service hooks in Ruby

I had heard of JRuby at that time, but didn’t have practical experience with it. After some experimenting I was able to validate my assumption that GitHub Service implementations could, in fact, be run with JRuby. At that point I started migrating the code base into a polyglot GitHub Services container that allows you to run the GitHub provided github-services as well as your own custom service implementations. Services can be implemented in different languages and run simultaneously in the same container instance. The container can be configured with an ordered set of services (chain) to handle post-receive events from one or more GitHub repositories. It’s also possible to configure a single container with separate service chains, each bound to a different repository. The container is run in Jetty servlet container and uses JRuby for executing Ruby code.

Below is an illustration of an example configuration scenario where two GitHub repositories are set up to deliver post-receive events to a single container. The container has been configured with a separate service chain for each repository.

ghj-diagram

The current status of the project is that a few GitHub Services as well as my custom Ruby and Java based services have been tested and seem to be working.

Lessons learned

The JVM can run code written in a large number of different programming languages and it’s a great platform for both dynamic language implementers and polyglot application developers. To quote JRuby developer Charles Nutter:

The JVM is going to be the best VM for building dynamic languages, because it already is a dynamic language VM.

Java 7 delivered vastly improved support for dynamic language implementers with JSR 292 or the invokedynamic bytecode instruction. Java 8 is expected to further improve language interoperability and performance.

JRuby is an interesting alternative Ruby implementation for the JVM. It’s mature and the performance benchmark numbers are impressive (Why JRuby) compared with Ruby MRI. Performance is expected to get even better with invokedynamic optimization work being done for Java 8.

While taking the existing Ruby based github-services and making them run on JRuby was successful and didn’t require any code changes, there were lots of small issues that took a surprising amount of time to resolve. Many of the issues were related to setting up the runtime environment in one way or another. High-level troubleshooting strategies are similar from platform to platform, but on a more detailed level the methods and tools are often quite different, and many of the problems I encountered might have been easier to crack with solid Ruby experience.

Here’re some lessons learned during the project:

  • Learning how to use the JRuby embedding API. There’re 3 different APIs to choose from: Red Bridge, JSR 223 and BSF. Performing tasks like instantiating objects and passing parameters in a Java call-out was not immediately obvious at first using Red Bridge
  • Figuring out concurrency / thread-safety properties of different areas of the JRuby embedding API. JRuby concurrency documentation was lacking at the time when the project was started.
  • JRuby tooling. Tooling works somewhat different from Ruby.
  • bootstrapping GitHub services gem environment 
    • in order to keep the github-services installation self-contained, I wanted to install as many gems as possible in the github-services vendor directory instead of installing them in JRuby. Some gems had to be installed in JRuby while others could be installed in vendor tree.
    • some gems need to be replaced by JRuby specific ones (e.g. jruby-openssl)
    • setting up Ruby requires and load paths
    • Gems implemented as native extensions require a compiler toolchain as well as gem module library dependencies to be installed.
  • bypassing the Sinatra web app framework that’s used by github-services

Probably, most of the issues I encountered were related to bootstrapping GitHub services gem environment in some way.

Code for the experiment can be found from https://github.com/marko-asplund/github-hook-jar

JavaOne 2012 – Technical sessions

While trying to find information about technical sessions presented in JavaOne 2011 and earlier I noticed there was surprisingly little details available. Many of the results I got from googling were pointing to the JavaOne 2012 site or the links were broken. So, even though it’s been a while since the conference, I thought I’d post some notes about the themes for historical reference.

The technical sessions (487) were classified in the following tracks:

  • Core Java Platform (146 sessions)
  • Development Tools and Techniques (212 sessions)
  • Emerging Languages on the JVM (44 sessions)
  • Enterprise Service Architectures and the Cloud (112 sessions)
  • Java EE Web Profile and Platform Technologies (123 sessions)
  • Java ME, Java Card, Embedded, and Devices (62 sessions)
  • JavaFX and Rich User Experiences (70 sessions)

The numbers were from Oracle’s Schedule builder application.

Some of the hot topics in JavaOne 2012 were thin server architecture, REST, HTML 5, NoSQL, data grids, big data, polyglot, mobile web and cloud. There was also a lot of interest on next big things on the Java platform roadmap, especially Java SE 8 (Lambdas) and Java EE 7 related talks.

In addition to the new and exciting topics, the evergreen topics such as troubleshooting and optimization (esp. garbage collection), scalability, parallellism, design and design patterns seemed to draw a lot of attention.

Java on Mac OS X

Mac OS X is a nice platform for Java development because it successfully combines a very good desktop user experience with the system’s unix heritage and tooling. There are some problems, however:

  • only a limited set of JDK versions are available for current OS X releases
  • the JVM implementations are not standalone and require Apple proprietary frameworks to be present

In this respect, Linux is probably the best Java development platform because JDKs are available from many different vendors and multiple versions of the most popular JDKs run on Linux. The JVM implementations, usually don’t have esoteric dependencies and only require basic OS libraries in addition to the ones bundled with the JVM.

On the other hand, only Apple and Oracle provided JDKs run on Mac OS X and older Java versions aren’t available. Also, neither Apple’s Java 6 nor Oracle’s Java 7 JVM seems to run on Lion or Mountain Lion without the com.apple.pkg.JavaEssentials package, for example.

Recently, I managed to corrupt my Java installation beyond repair and I wanted to try and avoid this in the future by trying to isolate my Java 7 and 8 installations as much as possible. This turned out to be fairly simple if you defy the temptation to install by just clicking on the downloaded JDK package. The JDKs are distributed by Oracle as disk images that contain a Mac OS X installer package file. Instead of running the installer you can easily extract the contents of the package using command line tools. First mount the disk image by clicking on the disk image and then run the following commands in a terminal session:

xar -xf '/Volumes/JDK 8/JDK 8.pkg'
cat jdk180.pkg/Payload | gunzip | cpio -i

After that, the Contents directory will include the entire JDK installation and you can move it to the location of your choosing. Then just specify JAVA_HOME environment variable and add the JVM and required tools to shell search path.