IDictionary: Merriam-Webster it Ain’t

I can’t decide which bothers me most: that enumerating an IDictionary doesn’t guarantee any particular order, or that most programmers don’t seem to know this important fact.

Update: this post generated some discussion among the folks up in Redmond, but their overall stance was as expected: don’t expect anything to change. I didn’t, really, but I’m going to stand by my opinion, which I will break down and refine a little more below.

IDictionary is named thusly because, like a dictionary, it can map words to their definitions.  More generally, it can map just about anything to anything else.  However, unlike a dictionary, it’s not ordered in any way.  Can you imagine how useless a dictionary in which the words were not in alphabetical order would be?  Pretty useless.

Don’t believe me about IDictionary?  It’s right there in the official documentation.  If this surprises you, it may because you’ve stuffed a lot of stuff in dictionaries, and it’s always come out in the same order in which you inserted it.  I just wrote a test to see if I could get a dictionary to enumerate it’s elements in any order other than insertion order, but I couldn’t…from ten elements to a million, it came out in the same order in which I put it in.

Update: Brian Grunkemeyer from Microsoft offered the following test in which insertion order is not preserved:

var dictionary = new Dictionary<int, String>();
dictionary.Add(1, "One");
dictionary.Add(2, "Two");
dictionary.Remove(1);
dictionary.Add(3, "Three");
foreach(String value in dictionary.Values)
  Console.WriteLine(value);
// prints "three" then "two"

As you can see, it’s all about removing then inserting.  That was what I was missing in my tests: I had tried removing elements and replacing elements, but not removing then inserting.

So what am I so excited about?  Well, it explicitly states in the documentation (both for IDictionary and the concrete implementation Dictionary) that no guarantees are made about enumeration order so, test notwithstanding, it would be foolish to rely on that, especially for something critical.

What I am suggesting is that insertion-order enumeration be specified in the IDictionary interface.  I know why things are they way they are: dictionaries are usually implemented as a hashtable, which cannot guarantee enumeration order, but are very efficient.  However, the current hashtable-based dictionary implementation appears to be very good at preserving insertion order even for very large dictionaries, so it seems like the performance impact, if any, would be minimal.  However, the positive impact on usage would be great: developers could confidently enumerate dictionaries expecting insertion order.  For those applications where efficiency needs truly preclude preservation of insertion order, a generic IHashtable/Hashtable could be provided (a non-generic version already exists).

So why is this even a big deal?  I know there are some purists out there who are saying “if you need to enumerate in a specific order, then you are using the wrong data structure.”  To these people, I say bull hocky.  I frequently need the mapping features that a dictionary provides AND the ability to preserve order.  Here’s a specific example: a country drop-down.  What I want to be able to do is have a list of ISO 3166-1 country codes mapped to their descriptive country names.  Furthermore, I want to display the countries to the user in a specific order, either alphabetic or most common countries at the top followed by an alphabetic list.  I can’t use a dictionary for this purpose because I can’t rely on it preserving order.  So I end up having to pass two things: a list with country codes (which preserves order) and a dictionary for the mapping:

// WRONG:
vm.Countires = new Dictionary<string,string>(
  { "US", "United States", "CA", "Canada" } );
return View( vm );

// RIGHT (but annoying, and note the unfortunate
// repetition of string literals):
vm.Countries = new List<string>( { "US", "CA" } );
vm.CountryNames = new Dictionary<string,string> {
  { "US", "United States" },
  { "CA", "Canada" },
};
return View( vm );

// RIGHT (another approach that avoids string literal
// repetition):
vm.Countries = new Dictionary<dynamic,string> {
  { new { Position = 1, Code = "US" }, "United States" },
  { new { Position = 2, Code = "CA" }, "Canada" }
}

Update: so, to summarize Microsoft’s stance on this topic, IDictionary does not guarantee any kind of enumeration order, and that’s not likely to change.  Of course there are ways to get around the issue, but it still doesn’t feel natural to me. My opinion remains that “dictionary” is an unfortunate name for an ADT that does not preserve any order, but my little opinion isn’t going to change anyone’s mind in Redmond. If I was the emperor of the universe, and I could change things, here’s what I would do:

  • Make IDictionary insert-only (i.e., I would remove methods to delete elements), and specify in the contract that insertion order is preserved.
  • Create generic IHashtable classes that essentially replace what IDictionary currently is.

If wishes were trees….


2 Comments on IDictionary: Merriam-Webster it Ain’t

  1. It’s a nomenclature issue, really. And from past conversations, I know that you know just how important I feel proper nomenclature is! If the current implementation of IDictionary is actually just an enumerable hash table, and in fact, the documentation stipulates that an IDictionary is little more than an enumerable hash table, then by golly why not call it IEnumerableHastable??? Of course, one could argue then that this reveals too much about implementation. But a dictionary is, by definition, not merely a mapping of token-to-value… it *is* ordered. If it’s not ordered, it’s not a dictionary!

  2. I agree; if you’re going to use a metaphor, you should do your best to make the metaphor as accurate as possible (without being pathological). I also agree that dictionary should really just have been named hashtable, because hashtables are well-understood not to preserve insertion order. I’ll be updating this blog post soon; I got some very interesting feedback from Microsoft on it. The unsurprising upshot is that nothing’s going to change, but the discussion was interesting nonetheless.