practicing techie

tech oriented notes to self and lessons learned

Category Archives: development

Deploying a dependency manager

Older codebases are common to include dependencies required by the application, such as code libraries, picture and sound files and other artifacts in the source code repository itself. This typically leads to the repository getting bloated. As an example, in a project where I was working recently the repository size on checkout was 450 MB. Checking out such a repository takes a lot of time, wastes disk space on each developer sandbox and makes implementing continuos integration more difficult.

Here’s where dependency management technologies such as Apache Maven and Ivy (with an artifact repository) can be a great help. Deploying such a solution in a greenfield project is simple, but when you have an existing codebase that’s used in production things get more complicated.

Suppose you have a set of libraries that the codebase currently uses and you want to manage them using a dependency manager and artifact repository. You can choose to

  • a) declare and publish your own artifacts based on artifacts in the existing codebase
  • b) use artifacts published in public repositories
  • c) use a hybrid approach

With option a you gain maximum control over your artifacts (and thus what exactly gets included) while with b you hope to be doing less work managing artifacts on the long term. Using publicly available artifacts requires you to determine and categorize library dependencies, which can require a lot of detective work. Also, you need to trust the persons providing artifacts to be doing a good job in declaring transitive dependencies. I haven’t found good automated solutions for determining dependencies, so I’ve used the following approach:

  1. Determine compile-time dependencies
    Check import statements from source code. For each import find the corresponding library in a public artifact repository and add the dependency to compile-time set.
    Delete jars one-by-one, compile. If build fails, add required dependencies.
    Rebuild and iterate.
  2. Determine runtime dependencies
    build package, deploy and test. If any dependencies are missing, add them to the runtime set.
    Iterate.
  3. Cross check original set of jar files (optional)
    Remove any files that aren’t present in the original runtime set using explicit exclusions.

This is a rather frustrating and dull method. Also, in practice, in step 3 some artifacts usually bring in dependencies that weren’t packaged with the application originally and which you might not want to include in the dependency set. You can solve this in two ways: a) include transitive dependencies by default and exclude the unnecessary ones or b) exclude transitive dependencies by default and include explicitly the required ones. With option b you gain control but you’re not using the dependency manager up to its full potential. Option a on the other hand may require that you let go on your purism and accept the fact that the dependency set will include something, that wasn’t included as a dependency originally.

Advertisement

Oracle Enterprise Linux now free (of charge)

Linux application software developers often face a choice between two compatible but different OS variants: Red Hat Enterprise Linux (RHEL) and CentOS. Using RHEL can sometimes be problematic for developers because typically some sort of centralized subscription management is required for enabling software updates, and depending on the organization you can get stuck from hours to days. The required bureaucracy can be a really frustrating experience for software developers looking to install just a basic virtualized RHEL guest OS instance for development or QA purposes: 5 minutes and you’re done – if it just wasn’t for the subscription management part! CentOS on the other hand can be freely downloaded and used but the downside is that traditionally publishing updates has dragged behind. Depending on the project, this may not be a big problem for QA and development purposes, but for internet facing production platforms you’d like the security updates to get installed as soon as they get released.

Oracle Enterprise Linux is an enterprise Linux distribution similar to CentOS in that it’s binary compatible with RHEL. It’s also been made freely (as in beer) available recently. The big upside for the app dev use case above is that Oracle promises to publish updates faster than CentOS has done. For operations personnel the benefit is that you can also get paid support for the OS from Oracle as well as some interesting features, such as zero-downtime kernel updates with Ksplice.

Being a bit curious, I downloaded Oracle Linux installation image (Oracle Linux Release 6 Update 3 for x86_64 [64 Bit], 3.5 GB) from Oracle and installed it as a virtualized guest OS instance on my laptop. The installation process worked as expected with RHEL and CentOS, except for the different branding, logos etc., of course. Software updates also installed without problems after initial installation.

So far I’ve dismissed Oracle Linux from consideration as a niche distribution and had some doubts about its continuity, but it does look like a solid OS and it has been around for a while now, so it could be a viable option to consider when choosing an enterprise Linux platform.

For more information see:

Java 7 on Mac OS X – finally!

Apple doesn’t exactly have a history of timely Java releases for Mac OS X so, I didn’t expect Java 7 to be available soon after its GA release, but I was very disappointed to read instead Apple’s announcement in october 2010 stating they will not be supporting Java 7 on Mac OS X. I also was quite sceptic in november, when there was a surprise announcement from Apple and Oracle saying the two companies will be working together to port OpenJDK to Mac OS X. Java 7 was published in july 2011 but patience was required from Mac OS X Java developers still.

When the OpenJDK Java 7 preview packages were finally made available in 2012 they didn’t run on Mac OS X Snow Leopard, so I had to build the JDK from the sources. That was fairly simple but rather time-consuming, and the build process practically rendered my laptop unusable since it used up a lot of CPU, IO and memory resources. Operating system reinstallation is always a huge load of work, with all the backing up, finding a suitable time slot and other arrangements, so it was only last week when I finally managed to find the time for OS X Lion upgrade, but now I’m able to use the Oracle provided JDK 7 installation packages, which makes JDK upgrades a lot easier. So, a year after Java 7 release I’m finally able to run it on my laptop! And one nice thing about Oracle picking up Java on Mac OS X is that they’ve promised to release Java 7 updates simultaneously for Mac, Windows, Linux and Solaris (Henrik on Java). I hope the Mac OS X port code base is well integrated with the rest of the tree and that also future Java major releases like 8 ja 9 will happen in a timely fashion.