Sunday, February 08, 2015

Improving Struct Performance By Overriding GetHashCode()

In an earlier post, I discussed the pitfall of using structs in C# to store data and failing to override the Equals method.  In this post I’ll look at the GetHashCode method and how not overriding the default implementation can have a drastic effect on the performance of a C# application.

I was asked by a client to investigate the performance of a C# WPF application; users were concerned that performance across certain areas just didn’t feel as fast as it ought to be – particularly during busy times of a trading day.  

After analysing the code and using an automated profile application (ANTS by Red Gate) I narrowed down the bottle neck to a struct had been used to store stock market prices.


Summary of Results (Bottom Line)


If you don’t want to read how these numbers were created, then the bottom line is, if you are using structs to store data and these structs regularly have their Equals or GetHashCode default methods called - either directly or indirectly (by storing in a Dictionary), then in terms of performance you should override the methods and implement your own Equals and GetHashCode methods.

If you’re using a struct and you do not override the Equals or GetHashCode methods, then you’re deferring down to the default implementations.  These are shockingly slow, as they use Reflection to look at the fields defined in your struct.

The chart below shows (times in milliseconds) how slow it can be by repeatedly calling GetHashCode on a struct that does/does not have the GetHashCode method overridden.



Use of Class Versus Struct To Store Data


Commonly known facts:
  • Class instances get allocate on the heap
  • Class instances get via a copied pointer reference (size is 4/8 bytes for 32/64 bit process)
  • Class instances have additional overhead of  8/16 bytes for 32/64 bit process (in fact .NET aligns objects to the minimum of 12/24 bytes object size for 32/64 bit processes)
  • Structs have no additional overhead, the amount of memory is the sum of all fields in the struct
  • Structs declared in a class exist on the heap (not the stack!)
  • Structs, by default, when passed to a method are copies of the original struct (a byte-by-byte copy), so any changes made to the struct from inside the method are local to that method


The original application developer used a struct to store minute-by-minute stock market pricing data, rather than a class.  

In addition to using structs to store the market data info, various supporting calculations required value equality comparison and were stored in Dictionary based collections, hence a large of calls to Equals and GetHashCode

Unfortunately, these structs did not to override the GetHashCode and Equals methods, therefore using .NET’s default values, which resulted in some rather surprising performance issues.

Take a look at my Equals post for further details of Blittable and Non-Blittable Types .

Test Code Generate Results

I’m going to walk through a selection of code snippets to show how much of a performance difference you can see by implementing GetHashCode.

As part of my profiling samples, I’ve created a base class (TimedActionBase) that, when used in conjunction with NUnit and Resharper, makes it really easy to run snippets of test code with a single mouse click and provide a statistical average for the area under test.  You’ll find a link to the code at the bottom of this article.  

You won’t need Resharper to compile the test code, but you’ll need to run Install-Package NUnit  using the Package Manager Console to download and install NUnit from NuGet.

I’ll just summarise the base test class here, as it’s the actual tests that are more important.   


Each test class derives from TimedActionBase and implements the TimedAction() method.  This method gets called once initially (unloading any timed JIT overhead) and then is called a set number of times, with a percentile calculation used to provide a meaningful idea of how long the method takes to execute.  I can then run SinglePassRunner() if a want a single timed pass (this calls the test method 100 times), or, in order to build up the data for the graphs MultiPassRunner(): 


using System;
using System.Collections.Generic;
using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public abstract class TimedActionBase
{
    private readonly List<TimeSpan> _durations = new List<TimeSpan>();
    private readonly Stopwatch _timer = new Stopwatch();
    private bool _headerLogged;

    protected abstract void TimedAction();

    [Test]
    public void SinglePassRunner()
    {
        TimedActionRunner();
    }

    [Test]
    public void MultiPassRunner()
    {
        const int NoOfPasses = 180;

        for (var i = 0; i < NoOfPasses; i ++)
        {
            TimedActionRunner();
        }
    }

    private void TimedActionRunner()
    {
        const int DefaultTestCycles = 100;

        // Prerun TimedAction method in case there's any JIT overhead
        TimedAction();

        _durations.Clear();

        for (var cycle = 1; cycle <= DefaultTestCycles; cycle++)
        {
            _timer.Start();
            TimedAction();
            _timer.Stop();

            var duration = _timer.Elapsed;
            _durations.Add(duration);

            _timer.Reset();
        }

        CalculateStatistics();
    }

    private void CalculateStatistics()
    {
        var average = _durations.AverageTotalMilliseconds();
        var min = _durations.MinTotalMilliseconds();
        var max = _durations.MaxTotalMilliseconds();
        var deviation = _durations.Deviation();

        const double Percentile = 95D;
        var percentile = _durations.Percentile(Percentile);

        var header = string.Empty;
        if (!_headerLogged)
        {
            _headerLogged = true;
            header = "Timings(ms)     \tPercentile\tAverage\tMinimum\tMaxmimum\tDeviation\n";
        }

        Log(string.Concat(header, string.Join("\t"
            new[]
            {
                GetType().Name,
                FormatTimeSpan(percentile), FormatTimeSpan(average),
                FormatTimeSpan(min), FormatTimeSpan(max),
                FormatTimeSpan(deviation)
            })));
    }

    
    private static string FormatTimeSpan(TimeSpan span)
    {
        return span.TotalMilliseconds.ToString("N0");
    }

    protected static void Log(string format, params object[] args)
    {
        Console.WriteLine(format, args);
    }

}


To compare the performance of different implementations, all of the sample tests below follow the same basic steps:

  1. Define a struct with slightly different implementation under test (eg, with or without GetHashCode)
  2. Create a List containing  50,000 structs
  3. Walk through the list calling the GetHashCode method and calculate how long that timed operation takes to complete


Even if you aren't directly calling GetHashCode in your code, it is used by .NET when you use the objects in any of the Dictionary based collections.

1. Not Overriding GetHashCode() With Non-Blittable Fields (Slowest Speed)
GetHashCode Overridden
Non-Blittable Fields
Blittable Fields
Typical Milliseconds
No
3
0
70


class SlowNonBlittableWithoutGetHashCodeTest : TimedActionBase
{
    struct MyStruct
    {
        public Decimal Field1;
        public Decimal Field2;
        public Decimal Field3;
    }
 
    private readonly List<MyStruct> _items = Enumerable
        .Range(1, GetHashCodeTestParams.NumberOfItems)
        .Select(n => new MyStruct())
        .ToList();
 
    protected override void TimedAction()
    {
        var hashcodes = _items
            .Select(item => item.GetHashCode())
            .ToList();
    }
}

2. Not Overriding GetHashCode() With Blittable Fields (Medium)
GetHashCode Overridden
Non-Blittable Fields
Blittable Fields
Typical Milliseconds
No
0
3
18

class MediumBlittableWithoutGetHashCodeTest : TimedActionBase
{
    struct BlittableStruct
    {
        public int Field1;
        public int Field2;
        public int Field3;
    }
 
    private readonly List<BlittableStruct> _items = Enumerable
        .Range(1, GetHashCodeTestParams.NumberOfItems)
        .Select(n => new BlittableStruct())
        .ToList();
 
    protected override void TimedAction()
    {
        var hashcodes = _items
            .Select(item => item.GetHashCode())
            .ToList();
    }
}

3. Overriding GetHashCode() With Blittable Fields (Fastest)
GetHashCode Overridden
Non-Blittable Fields
Blittable Fields
Typical Milliseconds
Yes
0
3
9

class FastWithGetHashCodeTest : TimedActionBase
{
    struct StructWithGetHashcode
    {
        public override int GetHashCode()
        {
            return Field1 ^ Field2 ^ Field3;
        }
 
        public int Field1;
        public int Field2;
        public int Field3;
    }
 
    private readonly List<StructWithGetHashcode> _items = Enumerable
        .Range(1, GetHashCodeTestParams.NumberOfItems)
        .Select(n => new StructWithGetHashcode())
        .ToList();
 
    protected override void TimedAction()
    {
        var hashcodes = _items
            .Select(item => item.GetHashCode())
            .ToList();
    }
}

The numbers are very surprising.


You can view similar results for the Equals performance test post.

Source Code:

 

No comments: