GetHashCode and LINQ to objects

I was coding some LINQ to Objects the other day and fell foul of a common mistake. Let me walk you through the example. Here's a class Foo. Notice that it implements IComparable<Foo> and also overrides the Equals method:

public class Foo : IComparable<Foo>
{
    public int Value { get; set; }

    public override bool Equals(object obj)
    {
        Foo other = obj as Foo;

        if (other == null)
        {
            return false;
        }

        return other.Value.Equals(Value);
    }

    public int CompareTo(Foo other)
    {
        if (other == null)
        {
            return +1;
        }
        else
        {
            var comparison = Value.CompareTo(other.Value);
            return comparison;
        }
    }

    public override string ToString()
    {
        return string.Format("{0}", Value);
    }
}

It has one property called Value which is of type Int32 and when you ToString() the object you simply get this integer value.

Now, the LINQ query I want to write sorts Foos and groups them. Here's an example done for a list of integers:

List<int> bars = new List<int> { 2, 1, 1, 2 };

var groupedBars = from b in bars
                  orderby b ascending
                  group b by b into g
                  select g.First();

// Dump writes each value to the console
Dump(groupedBars);

And here's the output:

1
2

Perfect, exactly what we'd expect. Let's now do the same for a very similar list of Foos:

List<Foo> foos = new List<Foo>
    {
        new Foo { Value = 2 },
        new Foo { Value = 1 },
        new Foo { Value = 1 },
        new Foo { Value = 2 }
    };

var selection = from f in foos
                orderby f ascending
                group f by f into g
                select g.First();

Dump(selection);

And here's the output:

1
1
2
2

Mmm. The sorting worked OK but the grouping not so much. My Equals override clearly states that two Foos are equal if they have the same Value, but they're not being grouped that way (though they are being sorted correctly).

It's all my fault for ignoring this warning in Visual Studio:

"'Foo' overrides Object.Equals(object o) but does not override Object.GetHashCode()"

It's easily fixed with this very simplistic implementation of GetHashCode being added to the object:

public override int GetHashCode()
{
    return Value.GetHashCode();
}
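
For contrast with the fix above, here is a minimal sketch of what happens without it. EqualsOnlyFoo and HashMismatchDemo are hypothetical names used only for illustration, and the pragma simply silences the warning quoted earlier so the sketch compiles cleanly. The two instances are equal according to Equals, yet each keeps the default hash code inherited from Object, which is not derived from Value and will almost always differ between instances.

```csharp
using System;

#pragma warning disable CS0659 // deliberately reproduce the situation the warning describes

// Hypothetical stand-in for Foo: overrides Equals but NOT GetHashCode.
public class EqualsOnlyFoo
{
    public int Value { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as EqualsOnlyFoo;
        return other != null && other.Value == Value;
    }
}

#pragma warning restore CS0659

public static class HashMismatchDemo
{
    // True: both instances hold the same Value, so Equals reports equality.
    public static bool AreEqual()
    {
        var a = new EqualsOnlyFoo { Value = 1 };
        var b = new EqualsOnlyFoo { Value = 1 };
        return a.Equals(b);
    }

    // Almost always false: the default hash code from Object ignores Value,
    // so two separate instances are very unlikely to share one.
    public static bool HashCodesMatch()
    {
        var a = new EqualsOnlyFoo { Value = 1 };
        var b = new EqualsOnlyFoo { Value = 1 };
        return a.GetHashCode() == b.GetHashCode();
    }
}
```

Hash-based grouping compares hash codes before it ever calls Equals, so a mismatch like this is enough to keep equal objects in separate groups.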
        

Implementing GetHashCode()

I was worried about this implementation so headed over to MSDN to check the docs and see how I _should_ implement GetHashCode(). I found the page Object.GetHashCode Method, which contained the three rules of GetHashCode():
  • If two objects of the same type represent the same value, the hash function must return the same constant value for either object.
  • For the best performance, a hash function must generate a random distribution for all input.
  • The hash function must return exactly the same value regardless of any changes that are made to the object.

Instantly I was struck by a contradiction here. 'If two objects represent the same value (i.e. Equals() returns true) the hash function must return the same value' and 'the hash function must return exactly the same value regardless of any changes that are made to the object.' Eh?

How can you possibly satisfy both of these requirements for mutable objects? In my case, Foo's Value can change and the Equals override must account for this. Equals and GetHashCode need to stay aligned, but keeping them aligned breaks the last rule.

I'm not the first person (by a long way) to spot this. Ian Griffiths discusses the issues in more detail in his post "The Rules for GetHashCode".

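My decision is to ignore the last rule (which exists primarily to support hash-based collections such as Hashtable and Dictionary) and go with a GetHashCode that changes when Value changes. This best allows me to use group by in my LINQ queries the way I want to. The trade-off is that a Foo must not be mutated while it is stored as a key in a hash-based collection, or the collection may no longer find it.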
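
The risk of a hash code that changes can be sketched with a HashSet. MutableFoo below is a hypothetical copy of Foo with the GetHashCode override added, and the demo class and method names are illustrative only. The set files the item under the hash code it had when it was added, so a lookup after the mutation searches under a different hash code.

```csharp
using System.Collections.Generic;

// Hypothetical copy of Foo: Equals and GetHashCode both depend on the mutable Value.
public class MutableFoo
{
    public int Value { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableFoo;
        return other != null && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}

public static class MutableKeyDemo
{
    // The item is found while its Value (and so its hash code) is unchanged.
    public static bool FoundBeforeMutation()
    {
        var foo = new MutableFoo { Value = 1 };
        var set = new HashSet<MutableFoo> { foo };
        return set.Contains(foo);
    }

    // The same instance is looked up after its Value changes while stored in the set.
    public static bool FoundAfterMutation()
    {
        var foo = new MutableFoo { Value = 1 };
        var set = new HashSet<MutableFoo> { foo };
        foo.Value = 2; // the hash code is now different from the one the set recorded
        return set.Contains(foo);
    }
}
```

With the current HashSet implementations, FoundBeforeMutation returns true and FoundAfterMutation returns false, even though the very same instance is still in the set. That is the behaviour the third rule is there to prevent, so ignoring it is only safe if a Foo is never mutated while it is held in a hash-based collection.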

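I have to admit though that I'm sort of surprised that group by *doesn't* rely on Equals alone. LINQ to Objects groups keys through an equality comparer, which compares hash codes first and only calls Equals when they match, so two equal Foos with different hash codes never get compared. Surely grouping on equality alone would have more in common with ANSI SQL (and thus, most query languages)?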
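
Grouping in LINQ to Objects goes through an IEqualityComparer<T>, which supplies both GetHashCode and Equals, and GroupBy has an overload that accepts one explicitly. The sketch below counts how often each method is called. PlainFoo, CountingFooComparer and ComparerGroupingDemo are hypothetical names, and PlainFoo deliberately overrides nothing so that all equality logic lives in the comparer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type with no Equals or GetHashCode overrides of its own.
public class PlainFoo
{
    public int Value { get; set; }
}

// Compares PlainFoos by Value and records how often each method is called.
public class CountingFooComparer : IEqualityComparer<PlainFoo>
{
    public int EqualsCalls { get; private set; }
    public int HashCalls { get; private set; }

    public bool Equals(PlainFoo x, PlainFoo y)
    {
        EqualsCalls++;
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Value == y.Value;
    }

    public int GetHashCode(PlainFoo obj)
    {
        HashCalls++;
        return obj == null ? 0 : obj.Value.GetHashCode();
    }
}

public static class ComparerGroupingDemo
{
    // Groups four items (values 2, 1, 1, 2) with the counting comparer.
    public static (int Groups, int HashCalls, int EqualsCalls) Run()
    {
        var foos = new List<PlainFoo>
        {
            new PlainFoo { Value = 2 },
            new PlainFoo { Value = 1 },
            new PlainFoo { Value = 1 },
            new PlainFoo { Value = 2 }
        };

        var comparer = new CountingFooComparer();

        // Count() enumerates the query, which forces the grouping to run.
        int groups = foos.GroupBy(f => f, comparer).Count();

        return (groups, comparer.HashCalls, comparer.EqualsCalls);
    }
}
```

Run() yields two groups, and both counters end up above zero: the hash code is requested for each key, and Equals is consulted only for keys whose hash codes already match. Passing a comparer like this is one way to get value-based grouping without changing Foo itself.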

Anyway, with the addition of my GetHashCode, the output now looks like this:

1
2

Happy days.

As my buddy Zulfiqar says, another good reason to enable 'Treat warnings as errors' in Visual Studio.

Tags: LINQ C#

 
Post By Josh Twist
2:29 AM
29 May 2009


© 2005 - 2017 Josh Twist - All Rights Reserved.