Thursday, October 12, 2006

Making shell calls from Java

One of our clients needs to tie an ancient DOS application into one of their web processes. Unfortunately, this application isn't really designed for use in an unattended process. It works, but occasionally (usually in error states), it will request user input. Not so spectacular if there's no user and the application's output isn't even being sent to a display. In addition, it has some serious concurrency issues, and every now and again it will just fail for no discernible reason whatsoever. I was faced with three major issues here: first, how to efficiently make system calls from Java (not as trivial as it sounds!); second, how to ensure that this application isn't called by more than a certain number of threads at a time; and third, how to identify those random error states and retry the operation. Just so this post doesn't drag on forever, I'll address only the first of these here, and leave the others for another day.

So, why is making shell calls from Java so difficult? All you have to do is call Runtime.getRuntime().exec(), right? Well, sort of... Let me explain... round-aboutly. First off, our shell command execution needs to have some sort of timeout (to deal with those cases when the shell command hangs waiting for input). So, we need to execute the command in another thread and have a loop that waits until the process says it's done. Something that looks kind of like this perhaps (most try/catch blocks and such elided for simplicity's sake):

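ExternalExecutionThread thread =
        new ExternalExecutionThread();
thread.setCommand(command);
long start = System.currentTimeMillis();
thread.start();
while (!thread.isFinished()) {
    if (System.currentTimeMillis() - start > timeout) {
        // Give up: interrupting makes waitFor() throw in the worker.
        thread.interrupt();
        break;
    }
    Thread.sleep(50); // poll gently instead of spinning the CPU
}

class ExternalExecutionThread extends Thread {
    private String command;
    // volatile: written by this thread, read by the polling thread
    private volatile boolean finished;
    private int exitCode;
    private ResultListener listener;

    // ... getters and setters

    public void run() {
        // IOException handling elided
        Process process = Runtime.getRuntime().exec(command);
        try {
            process.waitFor();
            exitCode = process.exitValue();
        } catch (InterruptedException e) {
            process.destroy(); // timed out: kill the hung command
        } finally {
            finished = true; // always release the waiting loop
        }
    }
}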
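
As an aside, the polling loop above can be swapped for Thread.join() with a timeout, which parks the caller until the worker finishes or the deadline passes. The following is only a minimal sketch of that idea: TimedWorker and runWithTimeout are hypothetical names, not part of the code in this post, and the task is a plain Runnable standing in for the command execution.

```java
class TimedWorker {
    /**
     * Runs the task on its own thread and waits up to timeoutMillis for it.
     * Returns true if the task finished in time; otherwise interrupts the
     * worker and returns false. (Hypothetical helper, for illustration only.)
     */
    static boolean runWithTimeout(Runnable task, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // join(0) would wait forever, so insist on a real timeout
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        Thread worker = new Thread(task, "external-execution");
        worker.start();
        try {
            worker.join(timeoutMillis); // blocks without spinning the CPU
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt
        }
        if (worker.isAlive()) {
            worker.interrupt(); // ask the hung task to give up
            return false;
        }
        return true;
    }
}
```

The task itself still has to react to the interrupt (for example by destroying the Process it is waiting on), so this only replaces the waiting loop, not the cleanup.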


In addition to simply executing the command you specify, exec() provides StdErr and StdOut as streams for your monitoring convenience. A nice feature, especially when dealing with a legacy application as we are here. So, first problem. What if the command you're running doesn't output anything and then hangs? You're now stuck in a read loop, with nothing coming in. So, we need to spawn some more threads to gobble up the command output as it comes, so that our timeout will continue to work. Something more or less like this:

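class StreamReadingThread extends Thread {
    private OutputStream outputStream;
    private InputStream inputStream;
    // volatile: set by another thread via finishWriting()
    private volatile boolean finishedWriting;

    // ... getters and setters

    public void finishWriting() {
        finishedWriting = true;
    }

    public void run() {
        BufferedInputStream bis =
                new BufferedInputStream(inputStream);
        // Fixed-size buffer: sizing it with available() gives a
        // zero-length read whenever no data is ready, which spins
        // the CPU and never reports end-of-stream.
        byte[] temp = new byte[4096];
        while (!finishedWriting) {
            int bytesRead = bis.read(temp, 0, temp.length);
            if (bytesRead == -1) {
                finishedWriting = true;
            } else {
                outputStream.write(temp, 0, bytesRead);
            }
        }
    }
}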


Now, we modify our ExternalExecutionThread to create a couple of these StreamReadingThreads, feed them the InputStreams from the Process we started and send them on their merry way. When the process ends (or times out) we call finishWriting() on the threads and they exit gracefully. Now, I can hear the clamor, "That's a whole lot of trouble to go through to read the output! What if I don't even need the output?" Well, I can't argue with you, it is a royal pain. There's a catch here though: if you don't read the output of the command, the stream buffer will eventually fill up (assuming your command has a significant amount of output), the command will block on its next write, and the thread waiting for it to finish will hang right along with it. Gross.
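
To make the "gobbling" idea concrete, here is a small self-contained sketch. StreamGobbler, drain and drainInBackground are hypothetical names chosen for illustration (they are not the classes from this post), and lambdas are used for brevity, so it assumes a newer Java than the code above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamGobbler {
    /** Copies everything from in to out until end-of-stream; returns the byte count. */
    static long drain(InputStream in, OutputStream out) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int n;
            // read() blocks until data arrives and returns -1 at end-of-stream
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Starts a daemon thread that drains in into out; the caller may join() it. */
    static Thread drainInBackground(InputStream in, OutputStream out) {
        Thread gobbler = new Thread(() -> drain(in, out), "stream-gobbler");
        gobbler.setDaemon(true); // never keep the JVM alive just for output
        gobbler.start();
        return gobbler;
    }
}
```

With a real Process, one would presumably call drainInBackground(process.getInputStream(), someSink) and again with process.getErrorStream() right after starting the command, so that neither pipe can fill up while the main thread waits on the timeout.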

At the end of the day, we've got a couple of custom classes and three new threads spawned every time we want to make a shell call. Is there a simpler way? I certainly hope so, but I've been unable to find it. I'd certainly love to be proven wrong...

Wednesday, October 11, 2006

My Experience Integrating Load Time Weaving

I use JetBrains IntelliJ IDEA to develop Java web applications. Because of the lack of AspectJ support, I was forced to find my own solution. Reluctant to give up my highly adored refactorings and productivity tools for an IDE like Eclipse that integrates AspectJ, I set out to find a workaround. At first I tried using load-time weaving (LTW) because it seemed like the quickest, easiest, and least invasive solution. However, I was met with disaster: stack traces a mile long, resulting in a heap exception. I searched Google for solutions to the LTW stack explosion, but to no avail.

Accustomed to hitting seemingly insurmountable roadblocks (after all, I am a Software Engineer), I pressed on. Next I turned to the AspectJ compiler (ajc); “surely I can change which compiler I use,” I thought. Quite to the contrary; for reasons unknown to me, IDEA only lets you choose between two compilers: javac and jikes. I even looked inside the .bat file and properties files for IDEA. To solve this problem I turned to using an ant script that runs post-compilation to weave my classes. However, you can only have one post-compilation task and I had more than one module that I wanted to weave. My natural reaction at this point was to hit the Google searches.

After an hour of searching on how to change my compiler in IDEA to ajc, I decided to reinvestigate why LTW was croaking. After a mere fifteen minutes I found a mailing list posting by Adrian Colyer, explaining what had happened: aspectj-users@eclipse.org/msg0032. Apparently there was a bug in the 1.5.2 release that caused an issue with pointcuts that match based on annotations (which Spring 2.0 does), and there was now a 1.5.2a release to address the issue. Downloading the bugfix release and plopping it into my libraries folder fixed the problem.


The forehead slap occurred when I realized that including the spring-aspects.jar implies that you want to weave both the AnnotationBeanConfigurerAspect and AnnotationTransactionAspect aspects, because Spring includes an aop.xml in the META-INF inside their jar. Simply placing <exclude within="org.springframework.transaction.aspectj.AnnotationTransactionAspect" /> in your aop.xml will tell AspectJ not to use the @Transactional support from spring-aspects.jar.


Using IDEA, the only change I had to make was to add the following to my JVM args in the Run/Debug dialog: -javaagent:lib/aspectjweaver.jar. See a great blog post by Adrian: A Practical Guide to Using an Aspect Library (part 1). Next, I needed to configure Tomcat for LTW in order for my webapp to function. This was as simple as adding the same JVM arg (changing the path to the jar) to my startup script. I can now sleep soundly at night, not worrying about having to abandon my beloved IDE. Now if only they’d let me write AspectJ code natively… I’ll save that one for another post.

XSLT: Use of the self:: axis

I'm Peter Cooper, one of the software engineers working with Checkmate Technologies, and a lot of my job right now involves using XML and XSLT. One of the things that's kind of confusing when you first look at XSLT is all the axes that you can use, and why you'd want to use them. Things like child::, descendant::, and parent:: make a lot of sense, but what about self::? Doesn't it seem kind of pointless to have an axis that just gives you nodes where you already are?

However, after working with XSLT for many years, we've found a few places where the self:: axis is really a lifesaver. The biggest example is when you want to exclude a particular node when selecting a nodeset. For instance, let's say that you have input XML in a format like this:

<line>
  <text>This is a line of text to be printed</text>
  <fontsize>14</fontsize>
  <justification>Center</justification>
  ...
</line>

The idea is that the <text> element is the only one that's required, and there are a bunch of optional tags you can add to change that line from the default.

The problem we're trying to solve is that if there are any of these optional tags present, we want to send the user to the "advanced" edit page, whereas if there are just lines of text without fancy formatting on a particular line, we just want to show the "normal" edit page.

So our mission is to write a select statement that sees if there are any <line>s with any child elements other than <text>. The first thing that might come to your mind is something like this:

<xsl:variable name="AdvancedNodes" select="line/*[not(name() = 'text')]" />

A statement like this kind of works, and it'd get the job done, but doing comparisons based on the name seems kind of sloppy. It also gets a lot more tricky if you're using XML Namespaces, because then you need to check the local-name() and namespace-uri(). So, the much more elegant approach is something like this:

<xsl:variable name="AdvancedNodes" select="line/*[not(self::text)]" />

In this manner, the self:: axis lets you start with a very broad selection of elements (all child elements) and then narrow it down further very easily. You can also use techniques like this if you're writing XSLT that primarily copies elements, but you want to exclude some of the elements in the copy.
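
If you'd like to try the two expressions outside of a stylesheet, the sketch below evaluates them with the XPath API that ships with the JDK. SelfAxisDemo and its select method are hypothetical names, and the <page> wrapper element and sample lines are made-up input for illustration; only the expressions themselves come from the text above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SelfAxisDemo {
    /** Evaluates the expression against the document element and returns the selected node names, comma-separated. */
    static String select(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Context node is the (hypothetical) <page> wrapper, so "line/..." matches its children
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc.getDocumentElement(), XPathConstants.NODESET);
            StringBuilder names = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) names.append(',');
                names.append(nodes.item(i).getNodeName());
            }
            return names.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<page>"
                + "<line><text>Plain line</text></line>"
                + "<line><text>Fancy line</text><fontsize>14</fontsize>"
                + "<justification>Center</justification></line>"
                + "</page>";
        System.out.println(select(xml, "line/*[not(self::text)]"));
        System.out.println(select(xml, "line/*[not(name() = 'text')]"));
    }
}
```

On this un-namespaced input both expressions should pick out the same two elements (fontsize and justification); the difference only starts to matter once namespaces enter the picture.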