Monday, 17 September 2012

Server-side Async: Careful with that Axe, Eugene

[Level T3]

In a previous post, I talked about the dangers lurking in doing server-side async operations in .NET 4.0. As you know, .NET 4.5 provides a much better syntax allowing async/await keywords to take your TPL Task-Soups to a much more readable and organised code. But even so, async will make debugging your application more difficult and bugs could take much longer to be reproduced, isolated and fixed.

Task-Soup

In .NET 4.0, when we add up continuations to create a chained task, we could end up with a few problems:

  1. We could end up with an unobserved exception problem. This is nicely described by Ayende here
  2. Nested lambda expressions could create unexpected problems with closure of variables
  3. The code becomes hard to read.
On the third note, I will just bring an example from my own code in CacheCow. What is it that we are actually returning here?

return response.Then(r =>
{
 if (r.Content != null)
 {
  TraceWriter.WriteLine("SerializeAsync - before load",
   TraceLevel.Verbose);

  return r.Content.LoadIntoBufferAsync()
   .Then(() =>
   {
    TraceWriter.WriteLine("SerializeAsync - after load", TraceLevel.Verbose);
    var httpMessageContent = new HttpMessageContent(r);
    // All in-memory and CPU-bound so no need to async
    return httpMessageContent.ReadAsByteArrayAsync();
   })
   .Then( buffer =>
      {
       TraceWriter.WriteLine("SerializeAsync - after ReadAsByteArrayAsync", TraceLevel.Verbose);
       return Task.Factory.FromAsync(stream.BeginWrite, stream.EndWrite,
        buffer, 0, buffer.Length, null, TaskCreationOptions.AttachedToParent);                                                        
      }
     );

   ;
 }

Even looking at brackets gives me headache.

Is Async worth it at all?

Now we talk a lot about Async operations and its role in improving scalability. But really, is it worth it? How much scalability would it bring? Would it help or hinder?

The answer to these questions is yes, it does help. The more IO you do on your server-side actions, the more you benefit from improvement from scalability. So it is highly advisable to implement your ApiController actions as Async by returning Task or Task<T>

The truth is, it will help even with your non-IO-bound operations although it is not advisable to use Async in such scenarios. You can test it for yourself, create a sync and an async controller to do exactly the same operation and use a benchmarking tool to compare the performance.

I have a CarManager sample on GitHub which I use for testing CacheCow.Server and it contains two simple  controllers: CarController and CarAsyncController. All these do is to use an in-memory repository and their GET only looking up the dictionary by its key:

// sync version
public Car Get(int id)
{
 return _carRepository.Get(id);
}


// async version (on another controller)
public Task<Car> GetAsync(int id)
{
 return Task.Factory.StartNew(() => _carRepository.Get(id));
}

So if you use a benchmarking tool such as Apache Benchamrk ab.exe, you could see a slight increase in throughput using the async controller. In my case, there was a 10% increase in throughput using async.

My ordeal with a bug

Development of CacheCow has been marred by existent of a problem which as we will see, turns out to be not in my code. I have been battling with this for a few weeks (on and off) and could not progress CacheCow development because of that.

OK, here is how my story begins; I think the Sherlock Holmes nature of this troubleshooting could be amusing for others too. After realising that using simple ContinueWith will not flow the context (see previous post) I was tasked with changing all such cases with Then in the TaskHelpers which checks existence of SynchronizationContext and flows the context if it exists.

On the other hand, lostdev, one of CacheCow's most loyal users, informed me of an occasional null reference exception in CacheCow.Server. Now, I had already fixed a bug related to null reference exception when the a resource was being retrieved for the first time. I attributed the problem to the fix I had made and reported that the problem is fixed in the current version.

So I started developing file-based cache storage for CacheCow.Client (which will have its own post very soon) and replaced all ContinueWith cases with Then.

And then I started to experience deadlocks in CacheCow.Client when I was using file-based caching and sending concurrent GET requests to the server. As soon as I would remove FileStore, and replace with InMemoryCacheStore, it would work. So I started searching through the client code, debug, look at the threads, debug again, change code, debug... to no avail. As soon as I was using file-based caching it would start to appear so it had to be on the client.

Then I noticed a strange thing: I could only run 4 concurrent calls and rest would be blocked. Why? Then I started playing with the maxconnection property of the system.net configuration:

  <system.net>
 <connectionManagement>
   <add address = "*" maxconnection = "N" />
 </connectionManagement>
  </system.net>

and interestingly, by setting the N to a high number, I would get more concurrent connections - but only up to the number defined. Hmmm... so the requests do not quite finish. OK, I fired up Sysinternals' TcpView but unfortunately these connections did not show up (and I do not know why).

I was getting nowhere until I accidentally loaded an earlier version of the server code. To my surprise, I did not get the deadlock but this error which @Tugberk separately reported earlier but attributed to order of handlers:

[NullReferenceException: Object reference not set to an instance of an object.]
System.Web.Http.WebHost.HttpControllerHandler.EndProcessRequest(IAsyncResult result) +112
System.Web.Http.WebHost.HttpControllerHandler.System.Web.IHttpAsyncHandler.EndProcessRequest(IAsyncResult result) +10
System.Web.CallHandlerExecutionStep.OnAsyncHandlerCompletion(IAsyncResult ar) +129

OK, so it is probably happening on the server but the continuation code gets deadlocked on unhandled exception. I am close! So it was time to go to bed and I was positive that I would nail it the day after.

It was funny that I woke up the day after and with my in-bed reading on tweets, stumbled on @Tugberk's tweet on issue he had just created. That sounds exceedingly similar, so we just doubled checked our scenarios and it turned out that an HttpResponseMessage with empty RequestMessage property is not handled in Web API and a null reference exception is thrown at the end of the response clean-up code. And the reason I was seeing it only with file-based cache store was that the part of server-side code to return such responses was being triggered only using file-based store (since it was capable of persisting caches and was trying to validate the cache).

So as you can see, a seemingly unrelated problem can really confuse the nature of the bugs in async scenarios.

Conclusion

First of all, always use request.CreateResponse() instead of using new HttpResponseMessage. I googled for cases of new HttpResponseMessage and found +3000 entries. This is really dangerous and I think this is a bug in Web API and needs to be fixed. If you are using new, make sure you set the RequestMessage property.

And in general, be careful with doing server-side async operations. It is really a powerful axe but with it you are not quite sure what a slightly off swing could bring. Careful with that axe Eugene.

19 comments:

  1. // async version (on another controller)
    public Task GetAsync(int id)
    {
    return Task.Factory.StartNew(() => _carRepository.Get(id));
    }

    While I am not completely sure, the above method will decrease your web app's performance if you have high traffic because your will generate a lots of thread hoops. Task.Factory.StartNew is just a fancy way of saying Thread.Start. The reason for that is that it is not an asynchronous method. What you are doing here is multithreatening. I might be totally off-base here because u might be using this only for demo purposes (it is midnight here, spare me :)).

    I have been doing that a lot till bunch people punched me in the face :) http://stackoverflow.com/questions/8743067/do-asynchronous-operations-in-asp-net-mvc-use-a-thread-from-threadpool-on-net-4

    ReplyDelete
    Replies
    1. Well I think you are probably mixing this with Task.Run which creates a dedicated thread. Task.Factory.StartNew according to MSDN is equivalent to using constructor and then calling start (http://msdn.microsoft.com/en-us/library/dd321439.aspx):

      "Calling StartNew is functionally equivalent to creating a Task using one of its constructors and then calling Start to schedule it for execution. However, unless creation and scheduling must be separated, StartNew is the recommended approach for both simplicity and performance."

      On the other hand, Task.Run creates a new Thread instead of using the ThreadPool hence is not recommended unless really a dedicated thread is required.

      Delete
    2. Task.Run does not create a new Thread, in fact it's just a shortcut of Task.Factory.StartNew(), calling the same logic with some default parameters. See http://blogs.msdn.com/b/pfxteam/archive/2011/10/24/10229468.aspx

      Delete
    3. Thanks Lasse for correcting me. I was misguided by a podcast. I will go back to it, it could be just my misunderstanding.

      Delete
  2. Yes, u are taking a thread from the thread pool with StartNew but u are blocking that thread this time and u are also generating a unnecessary thread hoop. The whole idea behind the asyn is to free up the thread. If carRepo.Get() take e seconds, ur thread will do nothing but waiting till the job is done. Just to be on the safe side, I would ask someone form the team for that (levi, damian, brad, etc)

    ReplyDelete
    Replies
    1. I am not sure what you are up to - does not make sense to me. At the end of the day, you will be using a thread from somewhere anyway. But the thread that is handling request can go back and serve another request. There is no alternative, is there?

      Anyway, Task.Factory.StartNew is just a syntactic sugar.

      Delete
  3. It's certainly going to kill your perf if you use the below method:

    // async version (on another controller)
    public Task GetAsync(int id)
    {
    return Task.Factory.StartNew(() => _carRepository.Get(id));
    }

    _carRepository.Get method is not an asynchronous method. So, you are not going to free up thread here. Instead, you will consume one more unnecessary thread to handle the request. In other words, this is multithreading, not async operation.

    Assuming that your ASP.NET request is handled by thread #1 and _carRepository.Get method is handled by thread #2; The thread #2 will wait till the _carRepository.Get operation completes and u will be blocking a thread pool thread instead of the main thread but u will be blocking a thread at the end of the day.

    If this was a client application such as a WPF or a WinRT app, your scenario would work fine because ur main concern will be to free up the UI thread. In a server application, every thread has its own cost.

    If this was an asynchronous operation instead of a multithreaded operation, the thread #2 would go back to the thread pool and would be ready to process any other work. Finally, when the async operation is done, a thread would be drawn from the thread pool to process the request (assuming you are using the Default TaskSchaduler).

    ReplyDelete
    Replies
    1. I am sorry, you have completely lost me. Perhaps I just do not get Async.

      Let's assume repository work is IO-bound so there is benefit in async. You say _carRepository.Get(id) is not async. Yes, it is not but we have made it so by running the task. That task must be created somewhere, either in controller or repository itself, right? Guess what, that would take up a thread regardless and it will be from the SAME threadpool. So moving the task creation outside of controller does not make any difference.

      Now we know that ASP.NET and our code share the same ThreadPool. So any benefit we get could be limited since we could be hijacking ASP.NET threads if throughput is high. But if throughput is high, we are stuffed regardless, resources are limited!

      In fact Lucian from Microsoft here believes there is no point in doing async in ASP.NET: http://www.dotnetrocks.com/default.aspx?showNum=785

      Delete
    2. "In fact Lucian from Microsoft here believes there is no point in doing async in ASP.NET"

      What Lucian talks about is similar to what I have been trying to express. He expressed that he doesn't think of any use cases for "Multithreading" inside ASP.NET apps (the words are not the same here as his). The talk from 16 min to 24 min is just expressing what I meant.

      Just after that he expressed that I/O bound async is useful (again, not the same words but meaning is similar).

      Delete
    3. OK, so let's imagine we call _carRepository.GetAsync(id) returning Task instead of Get() so can you tell me what implementation look like that would use IO completion port?

      As I said before "Let's assume repository work is IO-bound so there is benefit in async." So even _carRepository.Get(id) is UI-bound but that is not async.

      Delete
    4. Here is an example:

      public async Task> GetAsync() {

      using (var conn = new SqlConnection(_connectionString)) {
      using (var cmd = new SqlCommand()) {

      cmd.Connection = conn;
      cmd.CommandText = _spName;
      cmd.CommandType = CommandType.StoredProcedure;

      conn.Open();

      using (var reader = await cmd.ExecuteReaderAsync()) {

      return reader.Select(r => carBuilder(r)).ToList();
      }
      }
      }
      }

      private Car carBuilder(SqlDataReader reader) {

      return new Car {

      Id = int.Parse(reader["Id"].ToString()),
      Make = reader["Make"] is DBNull ? null : reader["Make"].ToString(),
      Model = reader["Model"] is DBNull ? null : reader["Model"].ToString(),
      Year = int.Parse(reader["Year"].ToString()),
      Price = float.Parse(reader["Price"].ToString()),
      };
      }

      As you can see, we are using cmd.ExecuteReaderAsync instead of cmd.ExecuteReader. When we await on cmd.ExecuteReaderAsync, we will suspend the logical execution flow of the code and the thread will go back to thread pool instead of hanging there doing nothing. Also, we are using await inside a using statement here. It's completely legit and compiler will handle that. So, compiler will ensure that reader won't be disposed before we are done with it.

      (BTW, There are a few more settings to enable asycn database calls, you can check them out here: http://www.tugberkugurlu.com/archive/asynchronous-database-calls-with-task-based-asynchronous-programming-model-tap-in-asp-net-mvc-4)

      Now assume that we will do this with cmd.ExecuteReader and have a sync method instead. Then, we will wrap this inside Task.Factory.StartNew. We will free up the main thread when we await on Task.Factory.StartNew but we will block a thread pool thread this time. + we will generate an unnecessary thread hoop which is vital for a server app.

      Delete
    5. So what does ExecuteReaderAsync do? Is it not the case that it will use Task.Factory.StartNew?

      Delete
    6. No, absolutely not!

      If you decompile down the System.Data.dll, you will actually see what it does but basically, it creates a Task over the old-school APM methods through Task.Factory.FromAsync.

      Have a look: http://stackoverflow.com/questions/5018897/tpl-taskfactory-fromasync-vs-tasks-with-blocking-methods

      My view is this: don't ever use Task.Factory.StartNew on a server application unless it's meant to be used by small amount of users and a low level traffic is expected (such as a Back Office application, etc).

      Delete
    7. I was actually looking at it using Reflector and as soon as found it your answer arrived.

      I think now I can see where you are coming from! Also that link was useful. As I said in the post, as soon as I think I get async, something happens and my whole understanding gets upside down!

      Thanks anyway, let me digest the stuff.

      Delete
  4. "You say _carRepository.Get(id) is not async. Yes, it is not but we have made it so by running the task."

    You are not making it async. The thread which will execute that code will not be freed up because the method is not benefiting from the async I/O Completion Ports on Windows. Yes, you will free up the main thread here but you will still block a thread pool thread (assuming you are using the default TaskSchaduler).

    What you do here is multithreading and this is going hurt your performance if you have lots of traffic. If I were your, I would repro my issue and ask someone from the team about that (probably ask on stackoverflow?).

    ReplyDelete
  5. Wow, that podcast is way too old :) August 30, 2004. Here is a new one for ya:

    http://hanselminutes.com/327/everything-net-programmers-know-about-asynchronous-programming-is-wrong

    ReplyDelete
    Replies
    1. I listened to that one as soon as it came out. See my comments!

      And also the other one is from July 2012. Where did you get 2004 from?!

      Delete
    2. Ah, I went to http://www.dotnetrocks.com/default.aspx?showNum=78 instead of http://www.dotnetrocks.com/default.aspx?showNum=785 :) I'll listen to that.

      Ask the question on SO, man. Maybe I am wrong (which I don't think but who knows).

      Delete
    3. Thanks. I think I have again learnt one bit more about async. Now I need to explain why even in load testing, thread-based async (even CPU-bound) works better than sync :(

      Delete