C# subreddit question time again! This time: “Which string comparison method is faster?”

I took a bit of a deep dive to see what each code path does. I decided to compare string.Equals(a, b), a.Equals(b), ==, and !=. Which one is faster? Which one runs less code?

Aside: I have a new YouTube channel called Elias Explains. I’ll be posting my videos up there from now on.


Here’s my sample program.

using System;

class Program
{
    private static string string1 = "Hello world";
    private static string string2 = "HELLO WORLD";

    static void Main(string[] args)
    {
        if(string1 != string2)
        {
            Console.WriteLine("Not Equal.");
        }

        if (!string1.Equals(string2))
        {
            Console.WriteLine("Not Equal.");
        }
    }
}

Which of these two produced less code? First, here’s the != operator. This generated 9 lines of IL.

    // IL_0001: ldsfld       string ConsoleApp5.Program::string1
    // IL_0006: ldsfld       string ConsoleApp5.Program::string2
    // IL_000b: call         bool [System.Runtime]System.String::op_Inequality(string, string)
    // IL_0010: stloc.0      // V_0
    // IL_0011: ldloc.0      // V_0
    // IL_0012: brfalse.s    IL_0021
    // IL_0014: nop
    // IL_0015: ldstr        "Not Equal."
    // IL_001a: call         void [System.Console]System.Console::WriteLine(string)

Next, here’s the string.Equals(string) version. This produced 11 lines of IL. Notice the addition.

    // IL_0021: ldsfld       string ConsoleApp5.Program::string1
    // IL_0026: ldsfld       string ConsoleApp5.Program::string2
    // IL_002b: callvirt     instance bool [System.Runtime]System.String::Equals(string)
    //
    // ceq compares the top two values on the evaluation stack and pushes 1 if they are equal, 0 otherwise.
    // Here it checks whether the result of string.Equals is 0 (false), which is how the ! gets implemented.
    //
    // IL_0030: ldc.i4.0
    // IL_0031: ceq
    // IL_0033: stloc.1      // V_1
    // IL_0034: ldloc.1      // V_1
    // IL_0035: brfalse.s    IL_0044
    // IL_0037: nop
    // IL_0038: ldstr        "Not Equal."
    // IL_003d: call         void [System.Console]System.Console::WriteLine(string)

There are two more IL instructions generated for the string.Equals call: we have to compare whether the result of string.Equals was equal to 0 (false). Could this be slower? It depends on your use case, how often you’re calling this code, and what the Just-in-Time or Ahead-of-Time compiler does to your code. I would bet money that 99.9% of us would never notice the performance difference.

But what about the System.String class? Is there a difference between using ==, !=, and string.Equals?

From dotPeek, string operator ==:

    public static bool operator ==(string a, string b)
    {
      return string.Equals(a, b);
    }

From dotPeek, string operator !=:

    public static bool operator !=(string a, string b)
    {
      return !string.Equals(a, b);
    }

The IL for both is nearly identical, save for the same ceq instruction. What does string.Equals look like, though?

From dotPeek, string.Equals(b):

    public bool Equals(string value)
    {
      if (this == null)
        throw new NullReferenceException();
      if (value == null)
        return false;
      if ((object) this == (object) value)
        return true;
      if (this.Length != value.Length)
        return false;
      return string.EqualsHelper(this, value);
    }

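First we check whether the instance itself is null, and throw a NullReferenceException if it is. Next we check whether the string we’re comparing against is null; if so, return false. Otherwise, check to see if we’re comparing a string to itself. Then check to see if the lengths are different. Finally, call string.EqualsHelper.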
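If you want to poke at those branches yourself, here’s a minimal sketch. The class name InstanceEqualsDemo and the Describe helper are made up for illustration; the only framework call is the instance string.Equals(string) shown above.

```csharp
using System;

static class InstanceEqualsDemo
{
    // Hypothetical helper: reports what a.Equals(b) does for a given pair.
    public static string Describe(string a, string b)
    {
        try
        {
            return a.Equals(b) ? "equal" : "not equal";
        }
        catch (NullReferenceException)
        {
            // Calling an instance method on a null reference never gets
            // as far as comparing anything.
            return "NullReferenceException";
        }
    }

    static void Main()
    {
        Console.WriteLine(Describe("Hello world", "Hello world")); // equal
        Console.WriteLine(Describe("Hello world", "HELLO WORLD")); // not equal
        Console.WriteLine(Describe("Hello world", null));          // not equal
        Console.WriteLine(Describe(null, "Hello world"));          // NullReferenceException
    }
}
```

The last line is the practical difference to keep in mind: the instance method needs a non-null string on the left-hand side.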

What about the static method, string.Equals(a, b)? Would it be faster to call that?

From dotPeek, string.Equals(a, b):

    public static bool Equals(string a, string b)
    {
      if ((object) a == (object) b)
        return true;
      if (a == null || b == null || a.Length != b.Length)
        return false;
      return string.EqualsHelper(a, b);
    }
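Here’s a small sketch of how the static method treats nulls, since that’s where it differs most visibly from the instance method. StaticEqualsDemo and AreEqual are hypothetical names; AreEqual only forwards to string.Equals(a, b).

```csharp
using System;

static class StaticEqualsDemo
{
    // Hypothetical thin wrapper around the static method, for readability.
    public static bool AreEqual(string a, string b) => string.Equals(a, b);

    static void Main()
    {
        string left = null;
        string right = null;

        Console.WriteLine(AreEqual(left, right));                   // True: same (null) reference
        Console.WriteLine(AreEqual("Hello world", right));          // False: one side is null
        Console.WriteLine(AreEqual(left, "Hello world"));           // False: one side is null
        Console.WriteLine(AreEqual("Hello world", "HELLO WORLD"));  // False: same length, different chars
        Console.WriteLine(AreEqual("Hello world", "Hello world"));  // True
    }
}
```

Nothing throws here, which is one reason people reach for the static method or the operators when either side might be null.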

First we check to see if the two strings are the same object reference. If they are, then return true; you are always equal to you. Otherwise, check if one of the strings is null or the lengths differ. If that’s the case, return false. Finally, if all that fails, call EqualsHelper. Looks almost identical to the instance method.

What about this string.EqualsHelper method? Get ready for some pointer fun. This is an unsafe method.

From dotPeek, string.EqualsHelper:

    private static unsafe bool EqualsHelper(string strA, string strB)
    {
      int length = strA.Length;
      fixed (char* chPtr1 = &strA.m_firstChar)
        fixed (char* chPtr2 = &strB.m_firstChar)
        {
          char* chPtr3 = chPtr1;
          char* chPtr4 = chPtr2;
          for (; length >= 12; length -= 12)
          {
            if (*(long*) chPtr3 != *(long*) chPtr4 || *(long*) (chPtr3 + 4) != *(long*) (chPtr4 + 4) || *(long*) (chPtr3 + 8) != *(long*) (chPtr4 + 8))
              return false;
            chPtr3 += 12;
            chPtr4 += 12;
          }
          for (; length > 0 && *(int*) chPtr3 == *(int*) chPtr4; length -= 2)
          {
            chPtr3 += 2;
            chPtr4 += 2;
          }
          return length <= 0;
        }
    }
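If the pointer casts are hard to read, here’s a rough safe-code sketch of the same idea: compare several characters per step by reading them as wider integers, then mop up the leftovers. This is not the framework’s implementation. ChunkedCompareSketch and SameLengthEquals are names I made up, it compares four characters at a time instead of twelve, and it assumes a runtime where Span and MemoryMarshal are available (.NET Core 2.1 or later) and where unaligned 64-bit reads are fine (as on x64).

```csharp
using System;
using System.Runtime.InteropServices;

static class ChunkedCompareSketch
{
    // Hypothetical helper: assumes both strings are non-null and have the
    // same length, which is what the callers of EqualsHelper guarantee.
    public static bool SameLengthEquals(string a, string b)
    {
        ReadOnlySpan<char> left = a.AsSpan();
        ReadOnlySpan<char> right = b.AsSpan();

        // Reinterpret the chars as 64-bit words: four 2-byte chars per word.
        // Cast ignores trailing chars that don't fill a whole word.
        ReadOnlySpan<ulong> leftWords = MemoryMarshal.Cast<char, ulong>(left);
        ReadOnlySpan<ulong> rightWords = MemoryMarshal.Cast<char, ulong>(right);

        for (int i = 0; i < leftWords.Length; i++)
        {
            if (leftWords[i] != rightWords[i])
                return false;
        }

        // Compare the remaining 0 to 3 chars one at a time.
        for (int i = leftWords.Length * 4; i < left.Length; i++)
        {
            if (left[i] != right[i])
                return false;
        }

        return true;
    }

    static void Main()
    {
        Console.WriteLine(SameLengthEquals("Hello world", "Hello world")); // True
        Console.WriteLine(SameLengthEquals("Hello world", "HELLO WORLD")); // False
    }
}
```

In real code you’d just call string.Equals; the sketch is only there to show why reading wider chunks means fewer comparisons.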

Woof. That’s a bunch of code. The first loop works through the strings 12 characters at a time, reading them as three 64-bit values of four characters each, so longer strings need far fewer comparisons than a character-by-character check would. The second loop is clever too: it handles whatever is left, two characters at a time, decrementing the length on each pass. As soon as the loop condition finds a mismatch (or we run out of characters), we fall through to return length <= 0, which is true only if everything matched. So it’s basically a loop through memory, consuming more of the string until we’re done.

So what? Which one is faster?

All of the above end up in string.EqualsHelper. Checking for equality directly, rather than negating the result, is slightly faster (maybe; it’s debatable depending on how the code gets turned into machine code) because you skip the comparison to zero. The == and != operators also introduce a call to string.Equals(a, b), so you could say they’re slightly slower due to the extra method call.

Again, it’s an extra jump. For a definitive answer, you need to measure whether the kind of equality check you’re using makes a difference in your code. If you’re checking equality once in your program, either one works. If you’re checking it thousands of times a second, it might matter.
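If you want to measure it yourself, here’s a crude timing sketch using Stopwatch. EqualityTiming and Time are hypothetical names, and the iteration count is arbitrary. Treat the numbers as a rough hint only: the delegate call in the loop probably costs more than the comparison itself, and a proper benchmarking library would give you more trustworthy results.

```csharp
using System;
using System.Diagnostics;

static class EqualityTiming
{
    // Hypothetical helper: runs `check` the given number of times and
    // returns how long the whole loop took.
    public static TimeSpan Time(Func<bool> check, int iterations)
    {
        bool sink = false;
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= check(); // use the result so the call isn't pointless
        }
        stopwatch.Stop();
        GC.KeepAlive(sink);
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        string a = "Hello world";
        string b = "HELLO WORLD";
        const int iterations = 10000000; // arbitrary; adjust to taste

        // Warm-up pass so JIT compilation isn't part of the measurement.
        Time(() => a != b, 1000);
        Time(() => !a.Equals(b), 1000);
        Time(() => !string.Equals(a, b), 1000);

        Console.WriteLine("a != b               : " + Time(() => a != b, iterations).TotalMilliseconds + " ms");
        Console.WriteLine("!a.Equals(b)         : " + Time(() => !a.Equals(b), iterations).TotalMilliseconds + " ms");
        Console.WriteLine("!string.Equals(a, b) : " + Time(() => !string.Equals(a, b), iterations).TotalMilliseconds + " ms");
    }
}
```

Run it in a Release build and more than once. If the three numbers land within noise of each other, that’s your answer for your workload.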

Published by Elias

Elias Puurunen is a versatile entrepreneur and President of Northern HCI Solutions Inc., an IT consulting firm which has worked with Fortune 500 companies, governments, and startups. He has spoken at conferences in Canada and the United States and has been published around the world. Part of his work led to an agreement between the Canadian Government and Siemens Canada, creating jobs and investment into green infrastructure. His company's event management app, the Tractus Event Passport connects people at conferences, seminars and symposiums across Canada. Today he is a consultant and advisor to technology firms and government organizations. He lectures at the University of Waterloo on Coding for Policy Analysis for the School of Public Policy. He is the author of Beyond Passwords: Secure Your Business, a cyber-security book for small business owners.
