cat /dev/random

Sunday, January 28, 2007

Interview on Software Engineering Radio

Tune in to Software Engineering Radio to hear the interview Marcus Voelter did with David Holmes and me at OOPSLA 2006.

Tuesday, January 16, 2007

JCiP named Jolt Award Finalist

2006 Winners and Finalists

JCiP was named a finalist in the 2006 Jolt Awards in the technical books category.

Friday, January 12, 2007

Mama's got a squeezebox...

A number of friends have asked about my Squeezebox home audio setup, so rather than repeat myself, I'll write it up here.

Squeezebox is a device that you attach to your home network and your stereo; it streams digital music to your stereo system. There are lots of such devices on the market: Roku's Soundbridge, Apple's AirTunes, and offerings from networking companies like Netgear and Linksys. I chose Squeezebox because it has a digital (optical) audio out, meaning that what gets piped into my expensive stereo is not going through the analog stage of a $20 sound card. The model I have has a wireless (802.11g) interface and a wired Ethernet port as well; I paid about $250 each, and we have them in multiple rooms so we can listen to any music from any room without fussing with physical CDs. (You can even sync them so you can have the same music throughout the house, say for parties.)

My primary motivation for this transition was that I hate CD furniture; it's mostly ugly, and takes up an obscene amount of room in your living room if you have any reasonable size music collection. Even transferring from jewel cases to sleeves, which gives about a 2.5:1 space compression, CDs can overwhelm your living room.

The first challenge: ripping the CDs. This is the most time-consuming step, so I did not want to have to do it again because I had chosen the wrong audio format. There are three components to ripping: the audio extraction, the association of metadata (artist, title, genre) with albums and tracks, and conversion to the format of choice (MP3, WMA, etc.). As with many other situations, you can get all-in-one solutions like iTunes or Windows Media Player that will do all of these in a single step, or you can have more control over each step but have to deal with multiple programs.

It turns out that nearly all rippers do not take advantage of the error-correcting information present in CDs. So if you have a scratched CD, you'll get bad bits when you rip, and those bad bits will stay with your recording forever. The only Windows-based tool I know of that will use the error correction is EAC (Exact Audio Copy). It's not as slick as iTunes, but it will usually get you a perfect rip. For CDs in good condition, it can usually rip at 10x, ripping a whole CD in 6 minutes or so. If it detects bit errors, it will slow down and keep reading until it is satisfied; for one really badly scratched CD (one that wouldn't even play in my car), it chewed for 24 hours, and got all but ~1000 bits off!

EAC rips to WAV and can launch an external converter (MP3, WMA, etc.) as a post-processing step, which I did not use. Instead, I saved the WAV files to disk and post-processed them separately. iTunes and WMP can access the commercial CDDB database that associates metadata (artist, album, track names) with albums and tracks; EAC uses the open-source freedb database, which is convenient but whose data quality is less than perfect. Expect to spend some time correcting titles and genres that don't match up (e.g., the first CD of a set is called Volume One, where the second is called Disc 2, or one volume lists Rock as the genre, where the other lists Pop). You can do this cleanup through EAC or with an ID3 tag editor.

For my storage format, I chose FLAC, the open-source lossless audio compression format, which stores files in about 55% of the space of the WAV file. This is about three times bigger than a good VBR MP3 or AAC, but disk space is cheap -- real cheap. (As of this writing, 500 GB drives are going for less than $200.) And the time to re-rip is very expensive. I set up the ripper on Windows to write the output files to a drop folder on my Linux server (named using a convention that embeds the track, artist, album, and genre, since WAV doesn't support metadata tags), and have a home-grown Perl script (willing to share, just ask) that will find the files and feed them to the flac converter.

Squeezebox versions 2 and later support FLAC natively, so the server doesn't have to transcode to MP3 on the fly. This is nice because the transcoding interferes with fast forward / rewind functionality on the Squeezebox. So, following the chain: error-free rip courtesy of EAC, lossless conversion to FLAC, digital transfer from server to Squeezebox, lossless FLAC decompression to PCM on the Squeezebox, and digital out to the receiver. That means no end-to-end signal loss, with the digital-to-analog conversion done by my receiver, just as if I'd plugged the CD player's optical out into the receiver.

For the server software, the free SlimServer package is written in Perl, so it can run on Windows, Linux, or Mac. I chose Linux since I did not want to downgrade the reliability of my stereo to that of my Windows desktop. (I have a Linux server in the house anyway, but if you don't, you can build one fairly cheaply.)

If you want to transfer to your iPod or other device, you need to transcode from FLAC to MP3 or AAC or WMA or whatever your favorite portable format is. The best MP3 encoder is LAME (open source); you have to decompress from FLAC to WAV and pipe that into LAME to get an MP3 out. (I believe iTunes for Mac has a LAME plugin, but not iTunes for Windows.) LAME encoding using VBR (variable bit rate) takes a while. Disk space is cheap enough that you might consider an automated nightly script to encode all new FLAC files into a parallel tree of MP3 files for transfer to iPod, if iPod is a big enough part of your life.
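
As a rough illustration of that nightly job, here is a minimal sketch in Java (not the Perl script mentioned earlier). It assumes the flac and lame command-line tools are on the PATH, that the two directory roots are passed as arguments, and that VBR quality 2 is acceptable; the class and method names are made up for this example. Note that piping raw audio this way does not carry the FLAC tags over to the MP3.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: mirrors a tree of .flac files into a parallel tree of .mp3 files.
class TranscodeSketch {

    // Maps <flacRoot>/a/b/track.flac to <mp3Root>/a/b/track.mp3.
    static Path mp3PathFor(Path flacRoot, Path mp3Root, Path flacFile) {
        Path relative = flacRoot.relativize(flacFile);
        String name = relative.getFileName().toString();
        String mp3Name = name.substring(0, name.length() - ".flac".length()) + ".mp3";
        return mp3Root.resolve(relative).resolveSibling(mp3Name);
    }

    // Decodes FLAC to WAV on stdout and pipes it into LAME (requires Java 9+).
    static void transcode(Path flacFile, Path mp3File) throws IOException, InterruptedException {
        Files.createDirectories(mp3File.getParent());
        ProcessBuilder decode = new ProcessBuilder(
                "flac", "--decode", "--stdout", flacFile.toString())
                .redirectError(ProcessBuilder.Redirect.DISCARD);
        ProcessBuilder encode = new ProcessBuilder(
                "lame", "-V", "2", "-", mp3File.toString())   // "-" means read from stdin
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD);
        for (Process p : ProcessBuilder.startPipeline(List.of(decode, encode))) {
            if (p.waitFor() != 0) {
                throw new IOException("Transcoding failed for " + flacFile);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: TranscodeSketch <flacRoot> <mp3Root>");
            return;
        }
        Path flacRoot = Path.of(args[0]);
        Path mp3Root = Path.of(args[1]);
        List<Path> flacFiles;
        try (Stream<Path> paths = Files.walk(flacRoot)) {
            flacFiles = paths.filter(p -> p.toString().endsWith(".flac"))
                             .collect(Collectors.toList());
        }
        for (Path flac : flacFiles) {
            Path mp3 = mp3PathFor(flacRoot, mp3Root, flac);
            if (!Files.exists(mp3)) {   // only encode files that are new since the last run
                transcode(flac, mp3);
            }
        }
    }
}
```

Run from cron (or any scheduler), this only touches FLAC files that have no MP3 counterpart yet, so the slow VBR encoding cost is paid once per track.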

Once you get all the ripping done, it's pretty nice. It took me about a week to rip ~400 CDs "in the background" while I was working. Thereafter, the only time you need to find the physical CDs is if you want to play them in the car. And the SlimServer software has a web interface that lets you create playlists and such, so you can set up playlists for parties so you don't have to be fussing with CDs.

Highly recommended. We've got two Squeezeboxes now (living room and bedroom) and are considering adding more (kids' room, family room). Plus there's a software player you can use on the computer.

Saturday, December 30, 2006

Microsoft humor

Actual ad from Battlestar Galactica home page.

Reader mail: Urban performance legends

In response to Urban Performance Legends, Revisited, an anonymous reader wrote that I was "too enthusiastic" about the JVM's ability to inline virtual method calls:

Moreover, the example provided is misleading: inlining in java can be generally done for private or final methods. Non-final public and protected methods can't be automatically inlined because they can be overidden by a subclass. The actual type of the object may be unknown at compile time and at load time. The only way to inline them is to analyse the code to find every use of the class and perform type inference. This can be done sometimes with inner classes that have limited scope, but not with public classes that must be compiled indipendentely from each other.

Unfortunately, this is a common myth about optimization in Java (and other dynamically compiled languages): that if a method could be overridden by a class that has yet to be loaded, then the compiler cannot optimize away the virtual function call. This is just plain wrong (as is the rest of what this reader says), and today's JVMs can devirtualize and inline through virtual calls using a number of techniques.

One such technique is called monomorphic call transformation, where the compiler observes that for a given method foo(), there is no class loaded right now that overrides it, so calls to foo() can be compiled (speculatively) as direct calls instead of virtual calls. If a class is loaded later that makes foo() polymorphic, the compiler can invalidate the speculatively optimized code. This is covered in detail in Dynamic Compilation and Performance Measurement. There are other techniques as well, such as inline virtual caching.
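
To make the scenario concrete, here is a small sketch of the kind of code this applies to. The class names are invented for illustration, and whether a particular JVM actually inlines the call cannot be seen from the source; on HotSpot you would need diagnostic output such as -XX:+PrintCompilation to watch the compile and deoptimize events. What the sketch does show is that the program is correct either way.

```java
import java.util.Arrays;

// Hypothetical classes for illustration only.
class Shape {
    // Public and non-final, so a subclass could override it.
    public double area() { return 1.0; }
}

// Typically not loaded until the first time it is actually used.
class Circle extends Shape {
    @Override
    public double area() { return 2.0; }
}

class MonomorphicDemo {
    static double sumAreas(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            // A virtual call in the bytecode. While no loaded class overrides
            // area(), a JIT compiler may speculatively compile this as a
            // direct (and inlinable) call.
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[1_000];
        Arrays.fill(shapes, new Shape());

        // Warm up: only Shape.area() exists as far as the JVM knows.
        double total = 0;
        for (int i = 0; i < 20_000; i++) {
            total += sumAreas(shapes);
        }
        System.out.println(total);

        // Using Circle makes area() polymorphic. Any speculatively
        // optimized code must be invalidated, and the result stays correct.
        shapes[0] = new Circle();
        System.out.println(sumAreas(shapes));
    }
}
```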

The myth that final has any effect on method invocation performance for monomorphic methods was well-exploded by Cliff Click's 2003 and 2005 JavaOne presentations.

Java (as well as other managed languages, like C#) is not C. Dynamic compilers are smarter than you think.

Reader mail: making tasks noncancelable, and polling for interruption

I frequently get e-mail feedback on my articles on IBM developerWorks. Unfortunately, I can rarely reply to them because no one ever leaves their e-mail address. The majority of e-mails I get are erroneous corrections (not that I don't make mistakes -- I make plenty, and those get pointed out too -- it's just that there are even more people out there who are really really sure they're always right, but aren't), which I'd be happy to respond to if anyone ever left their e-mail... Here's one from Dealing with InterruptedException. Listing 6 shows an example of how to make an operation noncancelable by deferring interruptions until the operation completes.

    public Task getNextTask(BlockingQueue<Task> queue) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return queue.take();
                } catch (InterruptedException e) {
                    interrupted = true;
                    // fall through and retry
                }
            }
        } finally {
            // Restore the deferred interrupt before returning
            if (interrupted)
                Thread.currentThread().interrupt();
        }
    }

The anonymous reader asks: shouldn't this be "while (!interrupted)"? Won't the loop never terminate otherwise? It will terminate: the loop exits through the return statement as soon as take() succeeds. The idea behind this example was to address the problem of "what if we have an operation that should not be interruptible, but is composed using interruptible steps?" We will have to ignore the interruptions when they occur and retry the interrupted operation -- but the key is we want to remember if there has been an interruption, so we can restore the interrupted status after we complete our noncancelable operation. That way, we don't throw away the information that someone requested cancellation; we just defer the cancellation request until after the noncancelable operation completes.
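
A quick way to convince yourself is to run the pattern with an interrupt already pending. The sketch below is a generic variant of the listing; the class name, the method name takeUninterruptibly, and the use of String elements are choices made for this demo, not part of the original article.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class NoncancelableDemo {

    // Same structure as the listing, made generic so the demo is self-contained.
    static <T> T takeUninterruptibly(BlockingQueue<T> queue) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return queue.take();      // the only exit from the loop
                } catch (InterruptedException e) {
                    interrupted = true;       // remember the request, then retry
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();   // hand the request back to the caller
            }
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("task-1");

        // Simulate a cancellation request that arrives before the operation starts.
        Thread.currentThread().interrupt();

        String task = takeUninterruptibly(queue);
        System.out.println(task + " interrupted=" + Thread.currentThread().isInterrupted());
        // prints "task-1 interrupted=true"
    }
}
```

The first take() throws because the interrupt is pending, the retry succeeds, and the caller still sees the interrupted status afterwards.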

A related question that often comes up is how often we should check for interruption. Listing 3 shows a typical interruptible operation wrapped in a Runnable. Because run() cannot throw InterruptedException, the task has to catch it. When this happens, in order to not throw away the evidence that an interruption was requested, we have to re-set the interrupted status with Thread.currentThread().interrupt().

    public class TaskRunner implements Runnable {
        private final BlockingQueue<Task> queue;

        public TaskRunner(BlockingQueue<Task> queue) {
            this.queue = queue;
        }

        public void run() {
            try {
                while (true) {
                    // take() blocks until a task is available
                    Task task = queue.take();
                    task.execute();
                }
            } catch (InterruptedException e) {
                // Restore the interrupted status
                Thread.currentThread().interrupt();
            }
        }
    }

It is often asked: shouldn't the loop header be while (!Thread.currentThread().isInterrupted()), instead of while (true)? This one is a little more subtle. We can look at this from several perspectives: correctness, responsiveness, and performance.

From a correctness perspective, the existing code is correct, because (reasonably coded) interruptible blocking methods check if the interrupted status is set on method entry, and immediately throw InterruptedException if it is. Similarly, the existing approach is effectively as responsive as the suggested alternative, because the first step in the while loop is to call an interruptible blocking method. (If substantial computation occurred between loop entry and the first interruptible blocking method call, responsiveness might give us a reason to test the interrupted status in the loop header as well -- and maybe other places throughout the loop too.)
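
The check-on-entry behavior is easy to observe with the java.util.concurrent queues. The following sketch uses a LinkedBlockingQueue that already holds an element, so take() would never need to block; the class and method names are made up for the demo.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class InterruptOnEntryDemo {

    // Returns true if take() throws InterruptedException when the calling
    // thread's interrupted status is already set on entry.
    static boolean throwsOnEntry(BlockingQueue<String> queue) {
        Thread.currentThread().interrupt();
        try {
            queue.take();
            return false;
        } catch (InterruptedException e) {
            return true;
        } finally {
            Thread.interrupted();   // clear the status so the demo has no side effects
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("task-1");        // an element is available, so no waiting is required
        System.out.println("threw on entry: " + throwsOnEntry(queue));
        System.out.println("elements left: " + queue.size());
    }
}
```

Even though an element is ready, take() throws without removing it, which is why the while (true) loop in Listing 3 does not run another task after an interrupt.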

What about performance? As with most performance questions of this sort, the answer is "We don't have enough information to tell". Testing the interrupted status in the loop header costs a little more (because the test is repeated in the loop header and in the blocking take() call), but in the case that the interrupted status is set, saves us the cost of instantiating and catching the exception thrown from take() when it finds the interrupted status set on entry. Which is a performance win will depend on how often the operation is actually interrupted -- and we usually don't have this information when coding general-purpose library code.
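
For comparison, here is a sketch of the suggested alternative with the test in the loop header. The Task interface is a stand-in defined only for this example, and PollingTaskRunner is an invented name.

```java
import java.util.concurrent.BlockingQueue;

// Stand-in for the article's Task type, assumed to have a single execute() method.
interface Task {
    void execute();
}

class PollingTaskRunner implements Runnable {
    private final BlockingQueue<Task> queue;

    PollingTaskRunner(BlockingQueue<Task> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            // Extra check on every iteration. If the status is already set,
            // we leave the loop without creating or catching an exception.
            while (!Thread.currentThread().isInterrupted()) {
                Task task = queue.take();
                task.execute();
            }
        } catch (InterruptedException e) {
            // Still needed: an interrupt can arrive while take() is blocked.
            Thread.currentThread().interrupt();
        }
    }
}
```

Both versions stop at the same point and leave the interrupted status set for the caller. This one trades a cheap test per iteration for avoiding the exception in the already-interrupted case, and it still needs the catch block.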

When in doubt, do the thing that makes the code simplest, cleanest, and most readable.

Sunday, September 10, 2006

Farewell Quiotix, hello Sun!

I've been self-employed for fifteen years; I founded Quiotix in 1992 to pursue some short-term consulting opportunities after getting laid off from my last job, and I never looked back. I like the flexibility, and I don't mind the stress of not knowing what my income is going to be in any given month. I like not having to ask permission to take a day off, or buy a new monitor, or attend a conference that looks interesting. So in the past, when offered full-time positions, my response has usually been "Why would I want that?" (This can be very perplexing to someone offering you a job; they're used to getting a very different response.)

So, those of you who know me well may be a little surprised to learn that as of September, I will be joining Sun Microsystems as a Sr. Staff Engineer / Technical Evangelist in the Java SE engineering organization. I'll be involved in a lot of things, including a lot of the same things I've been doing as an independent -- like writing technical papers on JVM internals and speaking at conferences. I'll also be involved with the Java SE engineering, QA, and education efforts. I have a feeling I'm going to be pretty busy.

To the customers, colleagues, business partners, and employees I've worked with at Quiotix, I offer thanks for fifteen good years and the best of luck in the future.