Wednesday, August 30, 2006

Nerds versus geeks

Ask Yahoo recently had a post titled "What's the difference between a nerd, a geek, and a dork?". Their definitions of nerd and geek are pretty consistent with the ones I've become used to since living in the SF Bay Area, but they aren't consistent with the way these terms were used when I was growing up.

In my high school a geek was someone who was unpopular while a nerd was someone who was unpopular but smart. I remember when I first started reading Slashdot, shortly after I moved to the Bay Area, I was really surprised at their use of the word "geek". "Geeks don't know how to use computers", I thought, "they're too busy learning to speak Klingon or memorizing the names of Star Wars characters". Sure, there are certain things that both geeks and nerds tend to be interested in, like science fiction, comic books and role playing games, but nerds are the ones who know how to do "useful" things. I became even more surprised when I learned that many slashdotters seemed to use an inverted set of definitions for geek and nerd.

Wikipedia has a bit of an explanation:

Pundits and observers dispute the relationship of the terms "nerd" and "geek" to one another. Some view the geek as a less technically skilled nerd. Others view the exact opposite.

They also reference an excellent Cat and Girl comic which, incidentally, defines nerd and geek in a way that's consistent with the definitions I grew up with.

It sounds like the definitions of these two terms are regional. The Wikipedia page suggests that this may be an east-coast versus west-coast thing: on the east coast people think nerds are smart, while on the west coast people think geeks are smart. That's certainly consistent with my experience, as I grew up in Ontario. It would be interesting if someone made a map-poll like the Pop vs. Soda Page. LazyWeb, don't fail me now!

posted Wednesday, August 30, 2006 (5 comments)

Thursday, August 17, 2006

Top 10 Java Classes I Love to Hate

I haven't posted here in months. What better way is there to end a blogging dry spell than a good rant?

Here are ten Java classes in the standard API that annoy me whenever I have to deal with them, in no particular order:
java.io.File
An abstract file representation... or is it? It exposes system-specific things like File.separator and File.pathSeparator, yet it doesn't understand what "." and ".." do unless you canonicalize the File object. It's also tied to the system's filesystem. This means you generally need to build an abstraction layer on top of File. Finally, File.lastModified() returns a long. Why not a Date? (lastModified() used to be measured in arbitrary units from some arbitrary time. Not very useful if you want to be able to communicate the last modified time to the user or even another program. This was later fixed to be measured in milliseconds since the epoch.)
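To make the "." and ".." complaint concrete, here's a minimal sketch. The class name FilePathDemo and the example paths are made up for illustration; the only File calls used are equals() and getCanonicalFile().

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the paths below do not need to exist on disk.
final class FilePathDemo {
    // Compares the two paths exactly as written, which is what File.equals() does.
    static boolean sameAsWritten(String a, String b) {
        return new File(a).equals(new File(b));
    }

    // Compares the two paths after canonicalization, which resolves "." and "..".
    static boolean sameCanonical(String a, String b) {
        try {
            return new File(a).getCanonicalFile().equals(new File(b).getCanonicalFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameAsWritten("./data.txt", "data.txt"));
        System.out.println(sameCanonical("./data.txt", "data.txt"));
    }
}
```

Two paths that obviously name the same file only compare equal once both sides have been canonicalized, so every caller has to remember to do that.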
java.io.Serializable
First, it's a marker interface. Marker interfaces are generally a bad smell. They're a good sign that someone was being lazy, or not thinking very carefully about the problem they were trying to solve. In the case of Serializable it's especially bad because there are methods that probably should be in the interface: writeObject(), readObject() and getSerialVersionUID(). Instead, reflection is used to find methods (and the serialVersionUID field) that have "magic names". A definite no-no in my book.

Of course, even that wouldn't fix the bigger issue which is that Java's serialization (like Python's pickling) is a very fragile mechanism for persisting objects which relies on the internal state of objects rather than on their public interfaces.
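Here's roughly what the "magic names" look like in practice. Tag and SerializationDemo are hypothetical names, and the static counters exist only to show that the private hooks really do get called even though no interface declares them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; Serializable itself declares none of the members below.
final class Tag implements Serializable {
    private static final long serialVersionUID = 1L; // magic field name
    static int writeCalls = 0; // demo-only counters
    static int readCalls = 0;

    private final String name;

    Tag(String name) { this.name = name; }

    String name() { return name; }

    // Magic method: private, yet the serialization machinery finds it by reflection.
    private void writeObject(ObjectOutputStream out) throws IOException {
        writeCalls++;
        out.defaultWriteObject();
    }

    // Magic method: same story on the way back in.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        readCalls++;
        in.defaultReadObject();
    }
}

final class SerializationDemo {
    // Serializes the tag to a byte array and reads it back.
    static Tag roundTrip(Tag tag) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tag);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tag) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Misspell either method name or get the signature slightly wrong and the compiler says nothing; the hook is just silently skipped.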

java.lang.Cloneable
Like Serializable, Cloneable is a marker interface. It's even easier to see what's wrong with Cloneable, though. Where's the clone() method? The docs mention its absence. That doesn't stop it from being a bug.
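A small sketch of the consequence, using a hypothetical Point class: a reference of type Cloneable can't actually be cloned, because the interface declares nothing and Object.clone() is protected.

```java
// Hypothetical value class used only for illustration.
final class Point implements Cloneable {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Each class has to widen Object.clone() to public by hand.
    @Override
    public Point clone() {
        try {
            return (Point) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Point implements Cloneable
        }
    }
}

final class CloneDemo {
    static Point copy(Point p) {
        Cloneable c = p;
        // c.clone(); // does not compile: Cloneable has no clone() method
        return p.clone(); // only works because we know the concrete type
    }
}
```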
java.text.MessageFormat
Overcomplicated. It defines a whole programming language for messages. This is great in theory, but it's not something you could actually give to translators. Hint: most translators are not computer programmers. Even if you manage to find a translator who can deal with the craziness of MessageFormat, chances are they don't know every language you want to translate into.
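For a taste of that "programming language", here's the kind of pattern a translator would be handed. The pattern follows the choice-format style shown in the MessageFormat documentation; the wrapper class MessageFormatDemo is a made-up name.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical wrapper around a choice-format pattern.
final class MessageFormatDemo {
    // Nested braces, '#' and '<' operators, and '|' separators, all inside one string.
    static final String PATTERN =
        "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    static String describe(int count) {
        return new MessageFormat(PATTERN, Locale.ENGLISH).format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // There are no files.
        System.out.println(describe(1)); // There is one file.
        System.out.println(describe(5)); // There are 5 files.
    }
}
```

Now imagine asking someone to rewrite that string for a language with more than two plural forms without breaking a single brace.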
java.text.SimpleDateFormat
Almost every time I see this class used I see the same bug in the code that uses it. SimpleDateFormat.format() is not reentrant and is not thread-safe. Beyond that, DateFormat has one of the most bizarre APIs ever. It has a setCalendar method, but what does setting the "calendar" do? Why, it lets you get it back with getCalendar! It also lets you stomp on some of format's internal state if you call it concurrently, but presumably it isn't meant for that.
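The bug is usually a single static SimpleDateFormat shared by every thread. One common workaround, sketched here with a hypothetical SafeDateFormat helper and assuming Java 8 or later for ThreadLocal.withInitial, is to give each thread its own instance:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: one formatter per thread instead of one shared instance.
final class SafeDateFormat {
    private static final ThreadLocal<SimpleDateFormat> ISO_DAY =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        });

    // Safe to call from any thread, since no formatter is ever shared.
    static String formatDay(Date date) {
        return ISO_DAY.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(formatDay(new Date(0L)));
    }
}
```

Synchronizing on a shared instance or creating a new formatter per call would also work; this is just one of the options.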
java.util.Calendar
What isn't wrong with this class? Well, I guess it isn't a marker interface, at least.

An instance of the Calendar class represents what? The answer should be "a calendar", but in fact an instance of this class represents a date. A mutable date whose mutators follow the rules of a particular calendar. Truly bizarre.
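A quick sketch of the "mutable date" behaviour. CalendarDemo is a hypothetical name; the month argument is zero-based, as Calendar requires.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical demo of a Calendar instance acting as a mutable date.
final class CalendarDemo {
    // Returns {year, zero-based month, day of month} one month after the given date.
    static int[] oneMonthAfter(int year, int month, int day) {
        Calendar date = new GregorianCalendar(year, month, day);
        date.add(Calendar.MONTH, 1); // mutates the "calendar" in place
        return new int[] {
            date.get(Calendar.YEAR),
            date.get(Calendar.MONTH),
            date.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        int[] result = oneMonthAfter(2006, Calendar.JANUARY, 31);
        // The day gets pinned to the end of February by the Gregorian rules.
        System.out.println(result[0] + "-" + (result[1] + 1) + "-" + result[2]);
    }
}
```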

java.util.Date
Date isn't so bad. It has two main sins: First, Date objects are mutable. Second, somehow it made someone feel the urge to write Calendar.
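Mutability means any class that stores or hands out a Date has to copy it defensively. A minimal sketch, with a hypothetical Event class:

```java
import java.util.Date;

// Hypothetical class that must protect itself from Date's setters.
final class Event {
    private final Date start;

    Event(Date start) {
        this.start = new Date(start.getTime()); // copy on the way in
    }

    Date start() {
        return new Date(start.getTime()); // copy on the way out
    }

    public static void main(String[] args) {
        Date when = new Date(1000L);
        Event event = new Event(when);
        when.setTime(5000L); // would silently change the event without the copies
        System.out.println(event.start().getTime());
    }
}
```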
java.util.Locale
Here's a dictionary definition for "locale":
lo·cale (lō-kăl')
  1. A place, especially with reference to a particular event: the locale of a crime.
  2. The scene or setting, as of a novel.

Some of the constants in Locale fit this definition, and clearly represent "places":

static public final Locale CHINA = new Locale("zh","CN","");
static public final Locale FRANCE = new Locale("fr","FR","");
static public final Locale GERMANY = new Locale("de","DE","");

Others, not so much:

static public final Locale CHINESE = new Locale("zh","","");
static public final Locale FRENCH = new Locale("fr","","");
static public final Locale GERMAN = new Locale("de","","");

This is a pretty clear example of a common problem I've noticed with internationalization: a lot of people seem to confuse location with language. Is it because of the Locale class, or is the Locale class merely another victim of some sort of mass hysteria? I don't know. In any case, the class probably should've been called Language since it's obviously based on RFC 1766: Tags for the Identification of Languages.
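You can see the split from code, too. In this sketch (LocaleDemo and namesAPlace are made-up names), only some Locale constants carry a country at all:

```java
import java.util.Locale;

// Hypothetical helper: does this Locale actually name a place?
final class LocaleDemo {
    static boolean namesAPlace(Locale locale) {
        return !locale.getCountry().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(namesAPlace(Locale.FRANCE)); // true: language "fr", country "FR"
        System.out.println(namesAPlace(Locale.FRENCH)); // false: language "fr", no country
    }
}
```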

java.util.Stack
Stack is mostly annoying because it used up a good name. I occasionally want a stack, but I never want a stack that extends Vector. At least Vector and Hashtable had the decency to not use the only good names for what they do, but what else do you call a Stack? A LIFO? Java 1.6 will actually add Deque which can be used as a stack, but the interface is a lot "fatter" than I'd like for the cases where I really just want a stack.
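Here's a sketch of both halves of that complaint: what extending Vector lets callers do, and the kind of thin wrapper I'd rather have. SlimStack and StackDemo are hypothetical names, and the wrapper assumes Java 6 or later for ArrayDeque.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Stack;

// Hypothetical minimal stack: only the operations a stack needs.
final class SlimStack<T> {
    private final Deque<T> items = new ArrayDeque<>();

    void push(T item) { items.addFirst(item); }

    T pop() { return items.removeFirst(); } // throws NoSuchElementException when empty

    T peek() { return items.peekFirst(); } // returns null when empty

    boolean isEmpty() { return items.isEmpty(); }
}

final class StackDemo {
    // Shows that java.util.Stack inherits Vector's positional insert.
    static String bottomAfterVectorInsert() {
        Stack<String> stack = new Stack<>();
        stack.push("a");
        stack.push("b");
        stack.add(0, "sneaky"); // Vector method: goes straight to the bottom
        return stack.firstElement();
    }
}
```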
java.util.WeakHashMap
Good idea, poor execution. First, the name is ambiguous. Weak what? Weak keys? Weak values? It turns out that it's weak keys, but it would be nice if the name said as much. It would also be nice if there were a weak value version so that people were more aware that they had to make a decision, as the two types of "weakness" are not interchangeable. You typically want weak keys when you're trying to "annotate" existing objects, but don't want the annotations to outlive the objects. You typically want weak values when you're trying to implement a weak cache keyed by something that's "recreatable" (typically "value objects", like Strings or numbers).

Since it is a weak key hashmap it should actually be a weak key identity hashmap. It isn't, though. WeakHashMap uses Object.equals() despite the fact that weak references operate on identity.
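The equals() behaviour is easy to demonstrate. In this sketch (WeakHashMapDemo and lookup are hypothetical names), an annotation stored under one object is found through a different object that merely equals it, which an identity-based map would not allow:

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical demo comparing equals-based and identity-based key lookup.
final class WeakHashMapDemo {
    // Stores a value under storedKey, then looks it up with probeKey.
    static String lookup(Map<Object, String> map, Object storedKey, Object probeKey) {
        map.put(storedKey, "annotation");
        return map.get(probeKey);
    }

    public static void main(String[] args) {
        String first = new String("key");
        String second = new String("key"); // equal to first, but a distinct object

        // WeakHashMap compares keys with equals(), so the distinct object matches.
        System.out.println(lookup(new WeakHashMap<Object, String>(), first, second));
        // IdentityHashMap compares keys with ==, so it does not.
        System.out.println(lookup(new IdentityHashMap<Object, String>(), first, second));
    }
}
```

So an "annotation" attached to one object can leak onto any other object that happens to be equal to it.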

posted Thursday, August 17, 2006 (2 comments)