practicing techie

tech oriented notes to self and lessons learned

Category Archives: operations

Daemonizing JVM-based applications

Deployment architecture design is a vital part of any custom-built server-side application development project. Due to it’s significance, deployment architecture design should commence early and proceed in tandem with other development activities. The complexity of deployment architecture design depends on many aspects, including scalability and availability targets of the provided service, rollout processes as well as technical properties of the system architecture.

Serviceability and operational concerns, such as deployment security, monitoring, backup/restore etc., relate to the broader topic of deployment architecture design. These concerns are cross-cutting in nature and may need to be addressed on different levels ranging from service rollout processes to the practical system management details.

On the system management detail level the following challenges often arise when using a pure JVM-based application deployment model (on Unix-like platforms):

  • how to securely shut down the app server or application? Often, a TCP listener thread listening for shutdown requests is used. If you have many instances of the same app server deployed on the same host, it’s sometimes easy to confuse the instances and shutdown the wrong one. Also, you’ll have to prevent unauthorized access to the shutdown listener.
  • creating init scripts that integrate seamlessly with system startup and shutdown mechanisms (e.g. Sys-V init, systemd, Upstart etc.)
  • how to automatically restart the application if it dies?
  • log file management. Application logs can be managed (e.g. rotate, compress, delete) by a log library. App server or platform logs can sometimes be also managed using a log library, but occasionally integration with OS level tools (e.g. logrotate) may be necessary.

There’s a couple of solutions to these problems that enable tighter integration between the operating system and application / application server. One widely used and generic solution is the Java Service Wrapper. The Java Service Wrapper is good at addressing the above challenges and is released under a proprietary license. GPL v2 based community licensing option is available as well.

Apache commons daemon is another option. It has its roots in Apache Tomcat and integrates well with the app server, but it’s much more generic than that, and in addition to Java, commons daemon can be used with also other JVM-based languages such as Scala. As the name implies, commons daemon is Apache licensed.

Commons daemon includes the following features:

  • automatically restart JVM if it dies
  • enable secure shutdown of JVM process using standard OS mechanisms (Tomcat TCP based shutdown mechanism is error-prone and unsecure)
  • redirect STDERR/STDOUT and set JVM process name
  • allow integration with OS init script mechanisms (record JVM process pid)
  • detach JVM process from parent process and console
  • run JVM and application with reduced OS privileges
  • allow coordinating log file management with OS tools such as logrotate (reopen log files with SIGUSR1 signal)

 

Deploying commons daemon

From an application developer point of view commons daemon consists of two parts: the jsvc binary used for starting applications and commons daemon Java API. During startup, jsvc binary bootstraps the application through lifecycle methods implemented by the application and defined by commons daemon Java API. Jsvc creates a control process for monitoring and restarting the application upon abnormal termination. Here’s an outline for deploying commons daemon with your application:

  1. implement commons daemon API lifecycle methods in an application bootstrap class (see Using jsvc directly).
  2. compile and install jsvc. (Note that it’s usually not good practice to install compiler toolchain on production or QA servers).
  3. place commons-daemon API in application classpath
  4. figure out command line arguments for running your app through jsvc. Check out bin/daemon.sh in Tomcat distribution for reference.
  5. create a proper init script based on previous step. Tomcat can be installed via package manager on many Linux distributions and the package typically come with an init script that can be used as a reference.

 

Practical experiences

Tomcat distribution includes “daemon.sh”, a generic wrapper shell script that can be used as a basis for creating a system specific init script variant. One of the issues that I encountered was the wait configuration parameter default value couldn’t be overridden by the invoker of the wrapper script. In some cases Tomcat random number generator initialization could exceed the maximum wait time, resulting in the initialization script reporting a failure, even if the app server would eventually get started. This seems to be fixed now.

Another issue was that the wrapper script doesn’t allow passing JVM-parameters with spaces in them. This can be handy e.g. in conjunction with the JVM “-XX:OnOutOfMemoryError” & co. parameters. Using the wrapper script is optional, and it can also be changed easily, but since it includes some pieces of useful functionality, I’d rather reuse instead of duplicating it, so I created a feature request and proposed tiny patch for this #55104.

While figuring out the correct command line arguments for getting jsvc to bootstrap your application, the “-debug” argument can be quite useful for troubleshooting purposes. Also, by default the jsvc changes working directory to /, in which case absolute paths should typically be used with other options. The “-cwd” option can be used for overriding the default working directory value.

 

Daemonizing Jetty

In addition to Tomcat, Jetty is another servlet container I often use. Using commons daemon with Tomcat poses no challenge since the integration already exists, so I decided to see how things would work with an app server that doesn’t support commons daemon out-of-the-box.

To implement the necessary changes in Jetty, I cloned the Jetty source code repository, added jsvc lifecycle methods in the Jetty bootstrap class and built Jetty. After that, I started experimenting with jsvc command line arguments for bootstrapping Jetty. Jetty comes with jetty.sh startup script that has an option called “check” for outputting various pieces of information related to the installation. Among other things it outputs the command line arguments that would be used with the JVM. This provided quite a good starting point for the jsvc command line.

These are the command lines I ended up with:

export JH=$HOME/jetty-9.2.2-SNAPSHOT
export JAVA_HOME=`/usr/libexec/java_home -v 1.8`
jsvc -debug -pidfile $JH/jetty.pid -outfile $JH/std.out -errfile $JH/std.err -Djetty.logs=$JH/logs -Djetty.home=$JH -Djetty.base=$JH -Djava.io.tmpdir=/var/folders/g6/zmr61rsj11q5zjmgf96rhvy0sm047k/T/ -classpath $JH/commons-daemon-1.0.15.jar:$JH/start.jar org.eclipse.jetty.start.Main jetty.state=$JH/jetty.state jetty-logging.xml jetty-started.xml

This could be used as a starting point for a proper production grade init script for starting and shutting down Jetty.

I submitted my code changes as issue #439672 in the Jetty project issue tracker and just received word that the change has been merged with the upstream code base, so you should be able to daemonize Jetty with Apache commons daemon jsvc in the future out-of-the-box.

Java VM Options

Some of the more frequently used Java HotSpot VM Options have been publicly documented by Oracle.

Of those flags the following are the ones I tend to find most useful for server-side Java applications:

  • -XX:+UseConcMarkSweepGC – select garbage collector type
  • -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$APP_HOME_DIR – when an out-of-memory error occurs, dump heap
  • -XX:OnOutOfMemoryError=<command_for_processing_out_of_memory_errors> – execute user configurable command when an out-of-memory error occurs
  • -XX:OnError=<command_for_processing_unexpected_fatal_errors> – execute user configurable command when an unexpected error occurs
  • -XX:+PrintGCDetails -XX:+PrintGCTimeStamps – increase garbase collection logging
  • -Xloggc:$APP_HOME_DIR/gc.log – log garbage collection information to a separate log file
  • -XX:-UseGCLogFileRotation -XX:GCLogFileSize=<max_file_size>M -XX:NumberOfGCLogFiles=<nb_of_files> – set up log rotation for garbage collection log

Occasionally, the following might also be helpful to gain more insight into what the GC is doing:

  • -XX:PrintHeapAtGC
  • -XX:PrintTenuringDistribution

Oracle documentation covers quite a few HotSpot VM flags, but every once in a while you bump into flags that aren’t on the list. So, you begin to wonder if there’re more JVM options out there.

And, in fact, there’s a lot more as documented by this blog entry: The most complete list of -XX options for Java JVM

That list is enough to make any JVM application developer stand in awe 🙂
There’s enough options to shoot yourself in the foot, but you should avoid the temptation of employing each and every option.

Most of the time it’s best to use only the one’s you know are needed. JVM options and their default values tend to change from release to release, so it’s easy to end up with a huge list of flags that suddenly may not be required at all with a new JVM release, or worse,  may even result in conflicting or erroneous behavior. The implications of some of the options, the GC one’s in particular, and the way that related options interact with each other may sometimes be difficult. Using a minimal list of options is easier to understand and maintain. Don’t just copy-paste options from other applications without really understanding what they’re for.

So, what’re your favorite options?

Mine would probably be -XX:SelfDestructTimer 😉
(No, there really is such an option and it does work)

Tomcat JDBC Connection Pool

tomcat-jdbc is a relatively new entrant to the Java JDBC connection pool  game. It’s been designed to be a drop-in replacement for commons-dbcp.

Many of the popular Java connection pool implementations have become quite stagnant over the years, so it’s nice to see someone make a fresh start in this domain. tomcat-jdbc code base is small and has minimal dependencies. It has configurable connection validation (validation policy, query and intervals) and enables automatically closing connections after they reach a configurable maximum age.

Some time ago, we started having problems with a near end-of-life legacy application that was migrated to a new environment. The application was a standalone Java app that used commons-dbcp as its JDBC connection pool implementation. After the migration, MySQL connections started failing occasionally. This appeared to happen with a somewhat regular interval of a few hours. Since tomcat-jdbc API is compatible with commons-dbcp we were able to replace the connection pool without any code changes and configure the new pool to automatically close connections once they had been open for a certain period of time. This turned out to be an effective and nonintrusive workaround for the issue.

I hope the Tomcat development team would more actively promote using tomcat-jdbc also outside of Tomcat. The pool implementation doesn’t currently seem to be available as a separate download or from Maven central, which is likely to hinder its adoption.

SFTP file transfer session activity logging

Old-school bulk file transfers may be considered out of fashion in a SOA world, but we’re not in a perfectly SOA enabled world, yet. So, bulk file transfers still remain widely used and SFTP is the workhorse of bulk file transfers.

Typically you would create an operating system account for each inbound file transfer party you want to enable and configure SSH public key authentication. By default, this would also allow full terminal access for account. What if you don’t want to have that? Well, just define sftpd as the login shell for those users to restrict access:

sftp1:x:502:502::/home/sftp1:/usr/libexec/openssh/sftp-server

Simple enough.

What if you want to monitor SFTP session activity on the system? Depending on SSH and syslogd config, you’ll find session logs in different places, but often on Linux this will be in /var/log/secure. OpenSSH v5.3 logs the following lines for a SFTP session:

Sep 16 15:31:34 localhost sshd[4274]: Postponed publickey for sftp1 from 172.16.221.1 port 56069 ssh2
Sep 16 15:31:34 localhost sshd[4273]: Accepted publickey for sftp1 from 172.16.221.1 port 56069 ssh2
Sep 16 15:31:34 localhost sshd[4273]: pam_unix(sshd:session): session opened for user sftp1 by (uid=0)
Sep 16 15:31:34 localhost sshd[4276]: subsystem request for sftp
Sep 16 15:31:36 localhost sshd[4276]: Received disconnect from 172.16.221.1: 11: disconnected by user
Sep 16 15:31:36 localhost sshd[4273]: pam_unix(sshd:session): session closed for user sftp1

You can see the session start, source IP address and user account info there. This is enough information for basic transfer account activity monitoring, and in some cases it may be enough for accounting and billing as well.

But what if you want to customize logging or hook into SFTP session start and end events to allow some form of custom processing? You can create a wrapper script for sftpd and configure that as the user’s login shell:

#!/bin/sh
#
# sftpd wrapper script for executing pre/post session actions
#

# pre session actions and logging here
SOURCE_IP=${SSH_CLIENT%% *}
MSG_SESSION_START="user $LOGNAME session start from $SOURCE_IP"
logger -p local5.notice -t sftpd-wrapper -i "$MSG_SESSION_START"

# start actual SFTP session
/usr/libexec/openssh/sftp-server

# add post session actions here

You could replace the syslogd based logging command above with your custom logging. The logging command above would log session start events using local5 log facility with notice priority (see man rsyslog.conf(5)). Log entries using a local5 log facility can be directed to a custom log file using the following syslogd configuration in /etc/rsyslog.conf:

local5.*                                                /var/log/sftpd.log

So, now you can customize the messages and perform pre/post session actions. If you need to do more advanced reporting for this data or allow non-technical users to do ad-hoc reporting, you might want to put the session data in a RDBMS. You could either add the data to the database directly in the wrapper script or set up logrotate to rotate your custom log files and configure a postrotate/prerotate script that would parse the log file and add entries to the database in batches.

What if you need to know what exactly goes on inside the file transfer sessions, like which files are being downloaded or uploaded etc.? OpenSSH sftpd doesn’t log this info by default, with the default facility and log level being auth.error. You can change this either globally in sshd_config or per user by changing the sftpd-wrappper script above like this:

/usr/libexec/openssh/sftp-server -f local5 -l info

This would direct sftpd log entries issued with local5 facility and info priority for selected users only. So, now your sftpd.log would look like the following:

Sep 16 16:07:19 localhost sftpd-wrapper[4471]: user sftp1 session start from 172.16.221.1
Sep 16 16:07:19 localhost sftp-server[4472]: session opened for local user sftp1 from [172.16.221.1]
Sep 16 16:07:40 localhost sftp-server[4472]: opendir "/home/sftp1"
Sep 16 16:07:40 localhost sftp-server[4472]: closedir "/home/sftp1"
Sep 16 16:07:46 localhost sftp-server[4472]: open "/home/sftp1/transactions.xml" flags WRITE,CREATE,TRUNCATE mode 0644
Sep 16 16:07:51 localhost sftp-server[4472]: close "/home/sftp1/transactions.xml" bytes read 0 written 192062308
Sep 16 16:07:54 localhost sftp-server[4472]: session closed for local user sftp1 from [172.16.221.1]

The log entry format requires some effort to process programmatically, but its still manageable. You can identify SFTP session operations, such as individual file transfers from the log. The log could then be processed using a logrotate postrotate/prerotate script that could e.g. add the data in a database for generating input data for accounting or billing.

Software versions: RHEL 6.3 and OpenSSH 5.3.

Oracle Enterprise Linux now free (of charge)

Linux application software developers often face a choice between two compatible but different OS variants: Red Hat Enterprise Linux (RHEL) and CentOS. Using RHEL can sometimes be problematic for developers because typically some sort of centralized subscription management is required for enabling software updates, and depending on the organization you can get stuck from hours to days. The required bureaucracy can be a really frustrating experience for software developers looking to install just a basic virtualized RHEL guest OS instance for development or QA purposes: 5 minutes and you’re done – if it just wasn’t for the subscription management part! CentOS on the other hand can be freely downloaded and used but the downside is that traditionally publishing updates has dragged behind. Depending on the project, this may not be a big problem for QA and development purposes, but for internet facing production platforms you’d like the security updates to get installed as soon as they get released.

Oracle Enterprise Linux is an enterprise Linux distribution similar to CentOS in that it’s binary compatible with RHEL. It’s also been made freely (as in beer) available recently. The big upside for the app dev use case above is that Oracle promises to publish updates faster than CentOS has done. For operations personnel the benefit is that you can also get paid support for the OS from Oracle as well as some interesting features, such as zero-downtime kernel updates with Ksplice.

Being a bit curious, I downloaded Oracle Linux installation image (Oracle Linux Release 6 Update 3 for x86_64 [64 Bit], 3.5 GB) from Oracle and installed it as a virtualized guest OS instance on my laptop. The installation process worked as expected with RHEL and CentOS, except for the different branding, logos etc., of course. Software updates also installed without problems after initial installation.

So far I’ve dismissed Oracle Linux from consideration as a niche distribution and had some doubts about its continuity, but it does look like a solid OS and it has been around for a while now, so it could be a viable option to consider when choosing an enterprise Linux platform.

For more information see: