Sunday, 20 September 2015

Casting and foreach – swings and roundabouts

I had a good discussion the other day about a code warning that one of the new Roslyn Code Analyzers had flagged.

SonarLint comes from SonarSource, and has a whole bunch of analyzers, including this one: S3217 - "Explicit" conversions of "foreach" loops should not be used.

It is a useful warning, highlighting how the foreach instruction will cast each item for you if the collection of items is not generic. If you only use generic collections, you’ll probably never hit this – but if you ever have to deal with some of the older classes (such as the original ADO.NET types) then this may come up.

This got me curious. What does the IL (intermediate language) look like for a foreach.

The following C# code will trigger this warning:

    var table = new DataTable();

    DataRowCollection rows = table.Rows;

    foreach (var row in rows)
    {

    }

We get this IL:

  IL_0000:  nop
  IL_0001:  newobj     instance void [System.Data]System.Data.DataTable::.ctor()
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  callvirt   instance class [System.Data]System.Data.DataRowCollection [System.Data]System.Data.DataTable::get_Rows()
  IL_000d:  stloc.1
  IL_000e:  nop
  IL_000f:  ldloc.1
  IL_0010:  callvirt   instance class [mscorlib]System.Collections.IEnumerator [System.Data]System.Data.InternalDataCollectionBase::GetEnumerator()
  IL_0015:  stloc.2
  .try
  {
    IL_0016:  br.s       IL_0026
    IL_0018:  ldloc.2
    IL_0019:  callvirt   instance object [mscorlib]System.Collections.IEnumerator::get_Current()
    IL_001e:  castclass  [System.Data]System.Data.DataRow
    IL_0023:  stloc.3
    IL_0024:  nop
    IL_0025:  nop
    IL_0026:  ldloc.2
    IL_0027:  callvirt   instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
    IL_002c:  brtrue.s   IL_0018
    IL_002e:  leave.s    IL_0045
  }  // end .try
  finally
  {
    IL_0030:  ldloc.2
    IL_0031:  isinst     [mscorlib]System.IDisposable
    IL_0036:  stloc.s    V_4
    IL_0038:  ldloc.s    V_4
    IL_003a:  brfalse.s  IL_0044
    IL_003c:  ldloc.s    V_4
    IL_003e:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
    IL_0043:  nop
    IL_0044:  endfinally
  }  // end handler

I’m no IL guru, but it you can get the basic idea – and sure enough you can see line IL_001e, there’s a castclass operation that’s run for each item in the foreach loop.

So in this case, you can’t just explicitly cast a DataRowCollection to an IList<DataRow>. The LINQ Cast<> extension method can be used though.

This code no longer triggers the warning:

    foreach (DataRow row in rows.Cast<DataRow>())
    {

    }

And here is the corresponding IL:

  IL_0045:  nop
  IL_0046:  ldloc.1
  IL_0047:  call       class [mscorlib]System.Collections.Generic.IEnumerable`1 [System.Core]System.Linq.Enumerable::Cast(class [mscorlib]System.Collections.IEnumerable)
  IL_004c:  callvirt   instance class [mscorlib]System.Collections.Generic.IEnumerator`1 class [mscorlib]System.Collections.Generic.IEnumerable`1::GetEnumerator()
  IL_0051:  stloc.s    V_5
  .try
  {
    IL_0053:  br.s       IL_0060
    IL_0055:  ldloc.s    V_5
    IL_0057:  callvirt   instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1::get_Current()
    IL_005c:  stloc.s    V_6
    IL_005e:  nop
    IL_005f:  nop
    IL_0060:  ldloc.s    V_5
    IL_0062:  callvirt   instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
    IL_0067:  brtrue.s   IL_0055
    IL_0069:  leave.s    IL_0078
  }  // end .try
  finally
  {
    IL_006b:  ldloc.s    V_5
    IL_006d:  brfalse.s  IL_0077
    IL_006f:  ldloc.s    V_5
    IL_0071:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
    IL_0076:  nop
    IL_0077:  endfinally
  }  // end handler

You can see there’s no more castcall operation, but instead notice line IL_0047. Now instead the code is calling the Cast extension method on the entire enumerable.

You might be wondering, does it make any different to performance? In this case not that I could detect. I loaded up a DataTable with 1,000,000 rows and compared execution times between the two approaches, and there wasn’t any significant difference between them.

This makes sense if you think about it – the cast needs to happen – either up front before the loop, or inside the loop. There’s no avoiding it.

So I’d say this is one warning that you shouldn’t necessarily just blindly follow. Having said that, if you’re iterating over the object more than once, then you probably will see a performance boost if you do the cast up front.

No comments: