15-100 Lecture 29 (Friday, April 11, 2008)

Arrays and Collections

In class today, we took a quick look at the Arrays and Collections classes. These classes contain static methods that operate upon Java's arrays and Collection-derived data structures. Of particular note, they provide both sorting and searching methods, including a binarySearch() useful for sorted collections.
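As a quick taste of the two classes side by side, here is a minimal sketch. The class name SortAndSearchDemo and the helper sortThenFind are made up for illustration; only Arrays.sort, Arrays.binarySearch, Collections.sort, and Collections.binarySearch come from the Java library. Notice that each helper sorts before it searches, since binarySearch() is only meaningful on sorted data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the lecture code.
class SortAndSearchDemo {

  // Sorts the array in place, then returns the index of key
  // (a negative value if the key is not present).
  static int sortThenFind(int[] values, int key) {
    Arrays.sort(values);                       // binarySearch needs sorted input
    return Arrays.binarySearch(values, key);
  }

  // The same idea for a List, using the Collections class instead.
  static int sortThenFind(List<String> names, String key) {
    Collections.sort(names);
    return Collections.binarySearch(names, key);
  }

  public static void main(String[] args) {
    int[] scores = {90, 72, 85, 60};
    System.out.println(sortThenFind(scores, 85));    // sorted: 60 72 85 90, prints 2

    List<String> names =
        new ArrayList<String>(Arrays.asList("Greg", "Ann", "Zoe"));
    System.out.println(sortThenFind(names, "Greg")); // sorted: Ann Greg Zoe, prints 1
  }
}
```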

We also took a look at the API documentation for both classes, java.util.Arrays and java.util.Collections.

The Sort Methods

We gave the sort(...) methods a good bit of attention, both because they are useful and also because there is some subtlety hiding therein. For example, we noticed that, at first glance, the binarySearch(...) and sort(...) methods appear to work for any type of array or Collection-derived data structure -- and the Arrays version is even overloaded so as to work on arrays of primitive types.

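In fact, there are basically three different versions for seemingly any occasion. For example, consider sort(...) in the Arrays class: there are versions for arrays of each primitive type, a version for Object arrays that uses the natural ordering of the elements, and a version that takes a Comparator as a second argument.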
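Here is a small sketch that calls one sort(...) of each flavor: one on a primitive array, one on an Object array using natural ordering, and one that takes a Comparator. The class ThreeSortsDemo, its helper methods, and the LengthComparator are invented for this illustration; the Arrays.sort overloads themselves are the library's.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not part of the lecture code.
class ThreeSortsDemo {

  // A made-up Comparator that orders Strings from shortest to longest.
  static class LengthComparator implements Comparator<String> {
    public int compare(String s1, String s2) {
      return s1.length() - s2.length();
    }
  }

  // Flavor 1: an array of primitives.
  static int[] sortPrimitives(int[] a) {
    Arrays.sort(a);
    return a;
  }

  // Flavor 2: an array of objects, ordered by their compareTo() method.
  static String[] sortNatural(String[] a) {
    Arrays.sort(a);
    return a;
  }

  // Flavor 3: an array of objects, ordered by the Comparator we supply.
  static String[] sortByLength(String[] a) {
    Arrays.sort(a, new LengthComparator());
    return a;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sortPrimitives(new int[] {3, 1, 2})));
    System.out.println(Arrays.toString(sortNatural(new String[] {"pear", "fig", "banana"})));
    System.out.println(Arrays.toString(sortByLength(new String[] {"pear", "fig", "banana"})));
  }
}
```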

Pitfalls

One interesting pitfall that isn't obvious at a quick glance is that, although the Arrays versions of these methods for non-primitive data are written in terms of Object references, they only work on Comparable data. If an element of the array is not also Comparable, a ClassCastException arises at runtime.

My preference would have been for this to accept a Comparable array rather than an Object array. This would have provided a compile-time type check, rather than a runtime type check -- preferable in most anyone's book. But, what the designers did is not without its advantages.

The list of items that is being passed in can be of any type that implements the Comparable interface -- and it can be passed in using any type of reference, whether Comparable or not. If the method required a Comparable array, the caller would have to make the ugly cast from whatever type of reference was present in the code to a Comparable reference. By doing it this way, the ugly cast can be hidden inside of the sort(...) method. Personally, I think I'd rather trade the ugliness for compile-time type checking -- but, what to say? I suppose that programmers don't often mistakenly ask for non-Comparables to be sorted.
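The runtime failure described above is easy to reproduce. In the sketch below, PitfallDemo, Point, and sortFails are hypothetical names; Point deliberately does not implement Comparable. The code compiles without complaint, because sort(...) accepts any Object array, and only fails once the sort tries to compare two elements.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the lecture code.
class PitfallDemo {

  // A made-up class that does NOT implement Comparable.
  static class Point {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
  }

  // Returns true if sorting the array threw a ClassCastException.
  static boolean sortFails(Object[] items) {
    try {
      Arrays.sort(items);   // compiles for any Object[], Comparable or not
      return false;
    } catch (ClassCastException e) {
      return true;          // the hidden cast to Comparable failed
    }
  }

  public static void main(String[] args) {
    Point[] points = { new Point(2, 3), new Point(1, 1) };
    System.out.println(sortFails(points));   // true: Point is not Comparable

    String[] words = { "b", "a" };
    System.out.println(sortFails(words));    // false: String is Comparable
  }
}
```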

Natural Ordering

Remember from our prior discussions that natural ordering is a pretty way of saying, "ordered using the compareTo() method of a Comparable." If we call the version of the sort method that does not make use of a Comparator, it makes use of the fact that the Objects are Comparable and invokes the compareTo() method.
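As a reminder of what that looks like, here is a minimal sketch of a class with a natural ordering. The names NaturalOrderDemo, Exam, and sortedScores are invented for this example. The one-argument sort(...) never sees a Comparator; it simply calls each Exam's compareTo().

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the lecture code.
class NaturalOrderDemo {

  // A made-up class whose natural ordering is "lowest score first".
  static class Exam implements Comparable<Exam> {
    private final int score;

    Exam(int score) { this.score = score; }

    int getScore() { return score; }

    // Negative, zero, or positive, as this exam is less than,
    // equal to, or greater than the other.
    public int compareTo(Exam other) {
      return Integer.compare(this.score, other.score);
    }
  }

  // Wraps the scores in Exam objects, sorts them by natural ordering,
  // and returns the scores in their sorted order.
  static int[] sortedScores(int... scores) {
    Exam[] exams = new Exam[scores.length];
    for (int i = 0; i < scores.length; i++) {
      exams[i] = new Exam(scores[i]);
    }

    Arrays.sort(exams);   // no Comparator given, so compareTo() is used

    int[] result = new int[exams.length];
    for (int i = 0; i < exams.length; i++) {
      result[i] = exams[i].getScore();
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sortedScores(88, 42, 95)));
  }
}
```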


Please note...Comparable vs Comparator.

A Comparable object is one that can compare itself to another instance of the same class. It is this ability that makes the object Comparable.

A Comparator is an object that compares other things. It, itself, is not (necessarily) Comparable, but it does have the ability to compare other objects. Comparator objects are, in essence, function objects.

A Comparable's compareTo() method takes only one argument, the object to compare to itself. A Comparator's compare() method takes two arguments, as it compares them to each other. The Comparator, itself, is not being compared.


Remember, there is only one natural ordering for any particular type of Comparable object, the one defined by the compareTo() method. In some sense, this is the primary ordering. Next, we'll talk about how to sort things in other ways.

Comparators

What happens if we want to order objects in different ways at different times? For example, we may want to manage student records, in the common case, by keeping them alphabetical by name. But, sometimes, we want to sort them by GPA -- for example, to send out "naughty and nice" letters at the end of each semester. Or by expected graduation date, to prioritize advising appointments. This is made difficult if the only way to sort objects is via the single, unchangeable, intrinsic "natural ordering" defined by their compareTo() method.

Well, this is where Comparators enter the picture. The Comparator interface defines a single method, compare(). Well, it also declares equals(), but that method operates upon the Comparator itself, not upon the Objects being compared.

interface Comparator<T> {
  
  public int compare (T o1, T o2);
  public boolean equals (Object o);
}

Or, if you find its pre-generics version easier to read:

interface Comparator {
  
  public int compare (Object o1, Object o2);
  public boolean equals (Object o);
}

The compare() method does exactly what you think it does. It works just like the compareTo() method of a Comparable -- except that it accepts both operands as arguments, rather than only one.

The idea here is that we can define multiple Comparators for a single object type. Then, we can supply whichever one we want, whenever we want. Notice the other version of the sort(...) method: the one that takes the Comparator as the second argument. We can pass any Comparator we'd like in there, and off we go.

Of course, if the particular Comparator that you pass in doesn't match the type of the array you pass in -- you'll blow up at runtime, with a ClassCastException, when the Comparator tries to cast its arguments.
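To see the blow-up, consider the sketch below. MismatchDemo, RawLengthComparator, and sortFails are hypothetical names. The Comparator is written in the pre-generics style and expects Strings, so handing it an Integer array compiles (with an unchecked warning, suppressed here) but fails at the first comparison. Had the Comparator been declared as a Comparator<String>, the compiler should reject the mismatched call instead.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not part of the lecture code.
@SuppressWarnings({"rawtypes", "unchecked"})
class MismatchDemo {

  // Pre-generics style: the compiler cannot check what we compare.
  static class RawLengthComparator implements Comparator {
    public int compare(Object o1, Object o2) {
      // These casts fail if the elements are not Strings.
      return ((String) o1).length() - ((String) o2).length();
    }
  }

  // Returns true if sorting with the String-only Comparator
  // threw a ClassCastException.
  static boolean sortFails(Object[] items) {
    try {
      Arrays.sort(items, new RawLengthComparator());
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(sortFails(new Integer[] {3, 1}));        // true
    System.out.println(sortFails(new String[] {"pear", "fig"})); // false
  }
}
```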

It is important to note that a Comparator's compare() method, unlike a compareTo() method, does not have access to an instance's private members. Remember, the compareTo() method is actually defined by the class of Object upon which it is operating. The Comparator is usually an entirely separate class of object -- and, as such, has no access to another class of object's private members.

You could, of course, define a class such that it implements both the Comparable and Comparator interfaces and then pass it in as its own Comparator. But since there can only be one method with the matching compare() signature, this would give you only a second way to compare -- and would be very confusing to any reader.

It is perfectly reasonable to implement the Comparable interface with a compareTo() method that works one way, and one or more Comparator classes with compare() methods that work differently. In fact, this is the beauty of the system.
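Here is a sketch of that combination, using the generics form of both interfaces. The wrapper class BothOrderingsDemo and the helpers roster() and names() are made up for this illustration, as are the sample students. The natural ordering is by name; the separate GpaComparator gives us a second ordering whenever we ask for it.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not part of the lecture code.
class BothOrderingsDemo {

  static class Student implements Comparable<Student> {
    private final String name;
    private final double gpa;

    Student(String name, double gpa) { this.name = name; this.gpa = gpa; }

    String getName() { return name; }
    double getGpa() { return gpa; }

    // Natural ordering: alphabetical by name.
    public int compareTo(Student other) {
      return this.name.compareTo(other.name);
    }
  }

  // An alternate ordering: lowest GPA first.
  static class GpaComparator implements Comparator<Student> {
    public int compare(Student s1, Student s2) {
      return Double.compare(s1.getGpa(), s2.getGpa());
    }
  }

  // Made-up sample data.
  static Student[] roster() {
    return new Student[] {
      new Student("Zoe", 3.1), new Student("Ann", 3.9), new Student("Greg", 2.5)
    };
  }

  // Joins the names, in array order, with commas.
  static String names(Student[] students) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < students.length; i++) {
      if (i > 0) sb.append(",");
      sb.append(students[i].getName());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Student[] students = roster();

    Arrays.sort(students);                        // uses compareTo()
    System.out.println(names(students));          // Ann,Greg,Zoe

    Arrays.sort(students, new GpaComparator());   // uses compare()
    System.out.println(names(students));          // Greg,Zoe,Ann
  }
}
```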

Quick Example

Imagine some class, such as the one below:

class Student {
  private String name;
  private double gpa;

  public String getName() {
    return name;
  }

  public double getGpa() {
    return gpa;
  }

  // Blah, blah, blah
}

We could implement two different Comparators, as follows.

class NameComparator implements Comparator {
  public int compare (Object o1, Object o2) {
    Student s1 = (Student) o1;
    Student s2 = (Student) o2;

    return s1.getName().compareTo(s2.getName());
  }
}

class GpaComparator implements Comparator {
  public int compare (Object o1, Object o2) {
    double gpa1 = ((Student)o1).getGpa();
    double gpa2 = ((Student)o2).getGpa();

    if (gpa1 == gpa2) return 0;
    if (gpa1 > gpa2) return 1;
    else return -1;
  }
}

And then, we can call sort any way we'd like:

Student[] students = new Student[STUDENT_COUNT];

loadStudents(students); // Some method to initialize the array
Arrays.sort(students, new NameComparator()); // Alphabetical by name
Arrays.sort(students, new GpaComparator());  // Lowest GPA first

For even more fun, we can nest the Comparators within the original class and use them from the outside, as long as we make them public and static (static so that they can be created without an enclosing Student instance), as follows:

class Student {
  private String name;
  private double gpa;

  public String getName() {
    return name;
  }

  public double getGpa() {
    return gpa;
  }


  public static class GpaComparator implements Comparator {
    public int compare (Object o1, Object o2) {
      double gpa1 = ((Student)o1).getGpa();
      double gpa2 = ((Student)o2).getGpa();
  
      if (gpa1 == gpa2) return 0;
      if (gpa1 > gpa2) return 1;
      else return -1;
    }
  }

  public static class NameComparator implements Comparator {
    public int compare (Object o1, Object o2) {
      Student s1 = (Student) o1;
      Student s2 = (Student) o2;
  
      return s1.getName().compareTo(s2.getName());
    }
  }

  // Blah, blah, blah
}

In this case, we need to qualify each Comparator's name with the name of the enclosing class, using the dot operator, as follows:

Student[] students = new Student[STUDENT_COUNT];

loadStudents(students); // Some method to initialize the array
Arrays.sort(students, new Student.NameComparator());
Arrays.sort(students, new Student.GpaComparator());
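One side effect of nesting is worth a sketch. A nested class is a member of the class that contains it, so a nested Comparator can read Student's private fields directly, something a separate top-level Comparator cannot do. In the example below, NestedComparatorDemo and topStudent are hypothetical names, the Student class is wrapped inside the demo class only to keep the example in one piece, and the generics form of Comparator is used so that no casts are needed.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class, not part of the lecture code.
class NestedComparatorDemo {

  static class Student {
    private final String name;
    private final double gpa;

    Student(String name, double gpa) { this.name = name; this.gpa = gpa; }

    String getName() { return name; }

    // Nested and static: usable without an enclosing Student instance.
    public static class GpaComparator implements Comparator<Student> {
      public int compare(Student s1, Student s2) {
        // Direct access to the private gpa field is legal here,
        // because this class is a member of Student.
        return Double.compare(s1.gpa, s2.gpa);
      }
    }
  }

  // Sorts by GPA and returns the name of the last (highest GPA) student.
  // Assumes the array is not empty.
  static String topStudent(Student[] students) {
    Arrays.sort(students, new Student.GpaComparator());
    return students[students.length - 1].getName();
  }

  public static void main(String[] args) {
    Student[] students = {
      new Student("Zoe", 3.1), new Student("Ann", 3.9), new Student("Greg", 2.5)
    };
    System.out.println(topStudent(students));   // Ann
  }
}
```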

Not Even Close to a 15-100 Topic -- But fun!

Just for fun, let me introduce you to the concept of an anonymous class. In the example below, I define a Comparator object in-line, right in the middle of the call to sort(...). Since it is only used this once and is passed right in, it doesn't need an identifier, hence it is called an anonymous class.

Notice, in the example below, how the body of the class specification is embedded in-line right after the "new". We call new, giving it the name of the reference type, followed by the customary parentheses that could accept arguments to the constructor, followed by the body of the class. Notice that the class is not identified anywhere and is only usable in this one place.

Student[] students = new Student[STUDENT_COUNT];

loadStudents(students); // Some method to initialize the array
Arrays.sort(students, new Comparator() {
  public int compare (Object o1, Object o2) {
    double gpa1 = ((Student)o1).getGpa();
    double gpa2 = ((Student)o2).getGpa();

    if (gpa1 == gpa2) return 0;
    if (gpa1 > gpa2) return 1;
    else return -1;
  }
});

I don't usually encourage anonymous classes. They are hard to test. And, they are harder to read. But, I did want to include it as trivia. Maybe someday, you'll find a use.