tech oriented notes to self and lessons learned
Tag Archives: maven
2012-08-02Posted by on
Older codebases are common to include dependencies required by the application, such as code libraries, picture and sound files and other artifacts in the source code repository itself. This typically leads to the repository getting bloated. As an example, in a project where I was working recently the repository size on checkout was 450 MB. Checking out such a repository takes a lot of time, wastes disk space on each developer sandbox and makes implementing continuos integration more difficult.
Here’s where dependency management technologies such as Apache Maven and Ivy (with an artifact repository) can be a great help. Deploying such a solution in a greenfield project is simple, but when you have an existing codebase that’s used in production things get more complicated.
Suppose you have a set of libraries that the codebase currently uses and you want to manage them using a dependency manager and artifact repository. You can choose to
- a) declare and publish your own artifacts based on artifacts in the existing codebase
- b) use artifacts published in public repositories
- c) use a hybrid approach
With option a you gain maximum control over your artifacts (and thus what exactly gets included) while with b you hope to be doing less work managing artifacts on the long term. Using publicly available artifacts requires you to determine and categorize library dependencies, which can require a lot of detective work. Also, you need to trust the persons providing artifacts to be doing a good job in declaring transitive dependencies. I haven’t found good automated solutions for determining dependencies, so I’ve used the following approach:
- Determine compile-time dependencies
Check import statements from source code. For each import find the corresponding library in a public artifact repository and add the dependency to compile-time set.
Delete jars one-by-one, compile. If build fails, add required dependencies.
Rebuild and iterate.
- Determine runtime dependencies
build package, deploy and test. If any dependencies are missing, add them to the runtime set.
- Cross check original set of jar files (optional)
Remove any files that aren’t present in the original runtime set using explicit exclusions.
This is a rather frustrating and dull method. Also, in practice, in step 3 some artifacts usually bring in dependencies that weren’t packaged with the application originally and which you might not want to include in the dependency set. You can solve this in two ways: a) include transitive dependencies by default and exclude the unnecessary ones or b) exclude transitive dependencies by default and include explicitly the required ones. With option b you gain control but you’re not using the dependency manager up to its full potential. Option a on the other hand may require that you let go on your purism and accept the fact that the dependency set will include something, that wasn’t included as a dependency originally.