Alphabetical != ASCIIbetical

[BEGIN RANT]

Partly this is a case of the Java community being populated by idiots, but people seem to be wholly ignorant of this issue in other languages too. Google for “java alphabetical sorting capitalization”, or any combination of words you can think of that might get you an algorithm that sorts a collection alphabetically. You will find hundreds of wrong responses and no correct ones. Most of them say to use the Arrays.sort(..) or Collections.sort(..) methods. But both of those use the elements’ natural order, which for Strings means comparing character codes (ASCIIbetical, as I like to call it), not alphabetical order. So 1 is followed by 10, not 2, and things starting with a capital letter aren’t beside things starting with the lowercase version of the same letter.
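To make the complaint concrete, here is a minimal sketch of what the default sort does to a handful of strings. The class name, the helper method and the sample data are made up for illustration.

```java
import java.util.Arrays;

class AsciibeticalDemo {
    // Sorts a copy of the input with String's built-in compareTo,
    // which compares character codes, not dictionary position.
    static String[] defaultSort(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"file10", "apple", "file2", "Banana", "file1"};
        System.out.println(Arrays.toString(defaultSort(names)));
        // prints [Banana, apple, file1, file10, file2]
    }
}
```

"Banana" jumps ahead of "apple" because every uppercase ASCII letter has a lower code than every lowercase one, and "file10" lands before "file2" because the comparison goes character by character and '1' is less than '2'.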

Some people think of calling .toLowerCase() on everything first, so that at least they eliminate the capitalization issue, but I have yet to see a single example of an actual alphabetical sort in Java (or any other language I’ve checked).
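A rough sketch of why the lowercase trick only gets you halfway. It uses String.CASE_INSENSITIVE_ORDER from the standard library rather than calling .toLowerCase() by hand, which ignores case in the same spirit; the class name and sample data are again hypothetical.

```java
import java.util.Arrays;

class CaseInsensitiveDemo {
    // Sorts a copy of the input while ignoring case differences.
    // Digits are still compared one character at a time.
    static String[] caseInsensitiveSort(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"file10", "apple", "file2", "Banana", "file1"};
        System.out.println(Arrays.toString(caseInsensitiveSort(names)));
        // prints [apple, Banana, file1, file10, file2]
    }
}
```

The capitalization problem is gone, but "file10" still sorts before "file2".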

Silly me, I just figured that alphabetical sorting was such a common need (judging by the number of people asking how to do it, I’m not wrong either) that I wouldn’t have to write the damn thing. But I didn’t count on the stupid factor. Jesus Christ, people. You’re programmers. You’re almost all college graduates and none of you know what the fuck “Alphabetical” means. You should all be ashamed.

If any of you are using your language’s default sort algorithm, which is almost guaranteed to be ASCIIbetical (for good reason), to get alphabetical sorting, you should proceed to the nearest mirror and slap yourself repeatedly before returning to your desks and fixing your unit tests that didn’t catch this problem.

[END RANT]

[Update] There is apparently one person who knows the freaking difference. Someone get that man a cookie!

[Update 2] Alphabetization/Alphanum/whatever you want to call it is entirely language dependent. Obviously an algorithm that works for English is unlikely to work for other languages. Attempting to create a general-purpose one that handles the special characters of multiple languages is brain-dead, because those languages may have other rules that affect alphabetization.
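For what it’s worth, Java’s standard library handles the language-dependent part one locale at a time through java.text.Collator. The sketch below assumes English rules; the class name, helper method and sample data are hypothetical. As far as I know Collator has no option for treating digit runs as numbers, so it only addresses the letter-ordering half of the problem.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatorDemo {
    // Sorts a copy of the input using the collation rules of one locale.
    static String[] sortForLocale(String[] input, Locale locale) {
        String[] copy = input.clone();
        Collator collator = Collator.getInstance(locale);
        Arrays.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"Banana", "apple", "cherry"};
        System.out.println(Arrays.toString(sortForLocale(names, Locale.ENGLISH)));
    }
}
```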

[Update 3] Jeff Atwood points out that this is called “Natural sorting”, not “Alphabetical” sorting. Unfortunately the Javadocs for Collections.sort(…) talk about sorting things according to their “Natural ordering”, meaning whatever compareTo says, which may be technically true but just adds to the confusion.
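Since I had to write the damn thing anyway, here is a minimal sketch of a natural-sort comparator. It assumes plain English text with ASCII digits and nothing else: letters are compared ignoring case and runs of digits are compared as numbers. NaturalOrderSketch, NATURAL and the helper methods are names invented for this example, not anything from a library.

```java
import java.util.Arrays;
import java.util.Comparator;

class NaturalOrderSketch {
    // Assumption: English-only input with ASCII digits.
    static final Comparator<String> NATURAL = (a, b) -> {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isDigit(ca) && isDigit(cb)) {
                // Compare the two whole digit runs by numeric value.
                int ie = endOfDigits(a, i), je = endOfDigits(b, j);
                int cmp = compareDigitRuns(a.substring(i, ie), b.substring(j, je));
                if (cmp != 0) return cmp;
                i = ie;
                j = je;
            } else {
                // Compare single characters, ignoring case.
                int cmp = Character.compare(Character.toLowerCase(ca), Character.toLowerCase(cb));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // A string that is a prefix of the other sorts first.
        int rest = Integer.compare(a.length() - i, b.length() - j);
        // Fall back to compareTo so that "File1" and "file1" never compare as equal.
        return rest != 0 ? rest : a.compareTo(b);
    };

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Returns the index just past the digit run starting at 'from'.
    static int endOfDigits(String s, int from) {
        int k = from;
        while (k < s.length() && isDigit(s.charAt(k))) k++;
        return k;
    }

    // Compares digit strings numerically without parsing, so long runs cannot overflow.
    static int compareDigitRuns(String x, String y) {
        String sx = stripLeadingZeros(x), sy = stripLeadingZeros(y);
        if (sx.length() != sy.length()) return Integer.compare(sx.length(), sy.length());
        return sx.compareTo(sy);
    }

    static String stripLeadingZeros(String s) {
        int k = 0;
        while (k < s.length() - 1 && s.charAt(k) == '0') k++;
        return s.substring(k);
    }

    public static void main(String[] args) {
        String[] names = {"file10", "apple", "file2", "Banana", "file1"};
        Arrays.sort(names, NATURAL);
        System.out.println(Arrays.toString(names));
        // prints [apple, Banana, file1, file2, file10]
    }
}
```

It makes no attempt at accented characters, negative numbers or decimals. Per Update 2, those rules depend on the language and on what your data means.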