Wednesday, February 11, 2009

What? Java Can't Sort Numbers?

I was recently working on some charts4j code to sort a list of Numbers (java.lang.Numbers -- to be exact) and discovered that it could not be done simply by calling Collections.sort. I would think, if anything ought to be sortable in Java, it should be a list of Numbers.

As most Java programmers know, sorting lists in Java is easy -- sort of. "Simply" use the static Collections.sort method, which under the hood is implemented with the classic merge sort algorithm. The major caveat, of course, is that the class of objects you are sorting must implement the Comparable interface by having a properly written compareTo method. Fortunately, many classes in the core Java language already implement Comparable: String, Date, Integer. They can immediately be sorted with Collections.sort without any special effort from the programmer, but not for Numbers. I did a little research by posting a question on this topic on Stackoverflow. I received a lot of good albeit -- Sun party line -- answers. The issue is a bit more complex than one would immediately think.

The simplest explanation for why java.lang.Number does not implement Comparable is rooted in mutability concerns. For a bit of review, java.lang.Number is the abstract super-type of AtomicInteger, AtomicLong, BigDecimal, BigInteger, Byte, Double, Float, Integer, Long, Short. On that list, AtomicInteger and AtomicLong to do not implement Comparable. Digging around, I discovered that it is not a good practice to implement Comparable on mutable types because the objects can change during or after comparison rendering the result of the comparison useless. Both AtomicLong and AtomicInteger are mutable. The API designers had the forethought to not have Number implement Comparable because it would have constrained implementation of future subtypes. Indeed, AtomicLong and AtomicInteger were added in Java 1.5 long after java.lang.Number was initially implemented.

Apart from mutability, there are probably other considerations here too. A compareTo implementation in Number would have to promote all numeric values to BigDecimal because it is capable of accommodating all the Number sub-types. The implication of that promotion in terms of mathematics and performance is a bit unclear to me, but my intuition finds that solution kludgy.

A related issue to this discussion is autoboxing. In Java 1.5, Sun introduced this language feature to seamlessly go back and forth between primitive numeric types and reference types. This is useful for adding and getting numbers in and out of Collections, for example, without having to explicitly convert the primitive type to a reference type. The Java numeric primitive types are byte, char, short, int, float, and double. They all have corresponding reference types, Byte, Character, Short, etc. Those reference types all extend Number (except for Character). The rules for converting from primitive types to boxed reference types are difficult to anticipate for purposes of building APIs. (I thought Java 1.5 was supposed to be all about type safety.) For instance, Arrays.asList(1,2,3) yields a List<Integer> and not a List<Long>. Arrays.asList(1.0,2,3) produces a List<? extends Number> instead of a List<Double>. This lack of predictability is burdensome for API designers, because they essentially have to build multiple method signatures to accommodate multiple boxed numeric reference types. Overloading the method signature may not even be an option in the case of Collections since List<Integer> and List<? extends Number> have the same erasure. One possibility is to rely only on the Number type or in the case of Collections List<? extends Number> for polymorphism but the problem is that the Number type is too general, encompassing more than the Java boxed primitive types, and it does not implement Comparable.

A potential solution is to implement an intermediate RealNumber type that would extend Number and encompass all the boxed numeric primitive types (except for char). This approach would allow polymorphism on numeric primitive types via autoboxing. Moreover, it would be reasonable for this RealNumber type to implement Comparable by promoting primitive numeric types to double in the compareTo method.

Unfortunately, this is yet another example of how Java has never been great in the numerical computation arena.

1 comment:

mk said...

Promoting to double to compare is probably a bad idea as not every long can be represented as a double.