Byte Rot: Server-side Async: Careful with that Axe, Eugene

Monday 17 September 2012

Server-side Async: Careful with that Axe, Eugene

[Level T3]

In a previous post, I talked about the dangers lurking in doing server-side async operations in .NET 4.0. As you know, .NET 4.5 provides a much better syntax allowing async/await keywords to take your TPL Task-Soups to a much more readable and organised code. But even so, async will make debugging your application more difficult and bugs could take much longer to be reproduced, isolated and fixed.

Task-Soup

In .NET 4.0, when we add up continuations to create a chained task, we could end up with a few problems:

We could end up with an unobserved exception problem. This is nicely described by Ayende here
Nested lambda expressions could create unexpected problems with closure of variables
The code becomes hard to read.

On the third note, I will just bring an example from my own code in CacheCow. What is it that we are actually returning here?

return response.Then(r =>
{
 if (r.Content != null)
 {
  TraceWriter.WriteLine("SerializeAsync - before load",
   TraceLevel.Verbose);

  return r.Content.LoadIntoBufferAsync()
   .Then(() =>
   {
    TraceWriter.WriteLine("SerializeAsync - after load", TraceLevel.Verbose);
    var httpMessageContent = new HttpMessageContent(r);
    // All in-memory and CPU-bound so no need to async
    return httpMessageContent.ReadAsByteArrayAsync();
   })
   .Then( buffer =>
      {
       TraceWriter.WriteLine("SerializeAsync - after ReadAsByteArrayAsync", TraceLevel.Verbose);
       return Task.Factory.FromAsync(stream.BeginWrite, stream.EndWrite,
        buffer, 0, buffer.Length, null, TaskCreationOptions.AttachedToParent);                                                        
      }
     );

   ;
 }

Even looking at brackets gives me headache.

Is Async worth it at all?

Now we talk a lot about Async operations and its role in improving scalability. But really, is it worth it? How much scalability would it bring? Would it help or hinder?

The answer to these questions is yes, it does help. The more IO you do on your server-side actions, the more you benefit from improvement from scalability. So it is highly advisable to implement your ApiController actions as Async by returning Task or Task<T>.

The truth is, it will help even with your non-IO-bound operations although it is not advisable to use Async in such scenarios. You can test it for yourself, create a sync and an async controller to do exactly the same operation and use a benchmarking tool to compare the performance.

I have a CarManager sample on GitHub which I use for testing CacheCow.Server and it contains two simple controllers: CarController and CarAsyncController. All these do is to use an in-memory repository and their GET only looking up the dictionary by its key:

// sync version
public Car Get(int id)
{
 return _carRepository.Get(id);
}


// async version (on another controller)
public Task<Car> GetAsync(int id)
{
 return Task.Factory.StartNew(() => _carRepository.Get(id));
}

So if you use a benchmarking tool such as Apache Benchamrk ab.exe, you could see a slight increase in throughput using the async controller. In my case, there was a 10% increase in throughput using async.

My ordeal with a bug

Development of CacheCow has been marred by existent of a problem which as we will see, turns out to be not in my code. I have been battling with this for a few weeks (on and off) and could not progress CacheCow development because of that.

OK, here is how my story begins; I think the Sherlock Holmes nature of this troubleshooting could be amusing for others too. After realising that using simple ContinueWith will not flow the context (see previous post) I was tasked with changing all such cases with Then in the TaskHelpers which checks existence of SynchronizationContext and flows the context if it exists.

On the other hand, lostdev, one of CacheCow's most loyal users, informed me of an occasional null reference exception in CacheCow.Server. Now, I had already fixed a bug related to null reference exception when the a resource was being retrieved for the first time. I attributed the problem to the fix I had made and reported that the problem is fixed in the current version.

So I started developing file-based cache storage for CacheCow.Client (which will have its own post very soon) and replaced all ContinueWith cases with Then.

And then I started to experience deadlocks in CacheCow.Client when I was using file-based caching and sending concurrent GET requests to the server. As soon as I would remove FileStore, and replace with InMemoryCacheStore, it would work. So I started searching through the client code, debug, look at the threads, debug again, change code, debug... to no avail. As soon as I was using file-based caching it would start to appear so it had to be on the client.

Then I noticed a strange thing: I could only run 4 concurrent calls and rest would be blocked. Why? Then I started playing with the maxconnection property of the system.net configuration:

  <system.net>
 <connectionManagement>
   <add address = "*" maxconnection = "N" />
 </connectionManagement>
  </system.net>

and interestingly, by setting the N to a high number, I would get more concurrent connections - but only up to the number defined. Hmmm... so the requests do not quite finish. OK, I fired up Sysinternals' TcpView but unfortunately these connections did not show up (and I do not know why).

I was getting nowhere until I accidentally loaded an earlier version of the server code. To my surprise, I did not get the deadlock but this error which @Tugberk separately reported earlier but attributed to order of handlers:

[NullReferenceException: Object reference not set to an instance of an object.]
System.Web.Http.WebHost.HttpControllerHandler.EndProcessRequest(IAsyncResult result) +112
System.Web.Http.WebHost.HttpControllerHandler.System.Web.IHttpAsyncHandler.EndProcessRequest(IAsyncResult result) +10
System.Web.CallHandlerExecutionStep.OnAsyncHandlerCompletion(IAsyncResult ar) +129

OK, so it is probably happening on the server but the continuation code gets deadlocked on unhandled exception. I am close! So it was time to go to bed and I was positive that I would nail it the day after.

It was funny that I woke up the day after and with my in-bed reading on tweets, stumbled on @Tugberk's tweet on issue he had just created. That sounds exceedingly similar, so we just doubled checked our scenarios and it turned out that an HttpResponseMessage with empty RequestMessage property is not handled in Web API and a null reference exception is thrown at the end of the response clean-up code. And the reason I was seeing it only with file-based cache store was that the part of server-side code to return such responses was being triggered only using file-based store (since it was capable of persisting caches and was trying to validate the cache).

So as you can see, a seemingly unrelated problem can really confuse the nature of the bugs in async scenarios.

Conclusion

First of all, always use request.CreateResponse() instead of using new HttpResponseMessage. I googled for cases of new HttpResponseMessage and found +3000 entries. This is really dangerous and I think this is a bug in Web API and needs to be fixed. If you are using new, make sure you set the RequestMessage property.

And in general, be careful with doing server-side async operations. It is really a powerful axe but with it you are not quite sure what a slightly off swing could bring. Careful with that axe Eugene.

20 comments:

tugberk20 September 2012 at 22:46
// async version (on another controller)
public Task GetAsync(int id)
{
return Task.Factory.StartNew(() => _carRepository.Get(id));
}

While I am not completely sure, the above method will decrease your web app's performance if you have high traffic because your will generate a lots of thread hoops. Task.Factory.StartNew is just a fancy way of saying Thread.Start. The reason for that is that it is not an asynchronous method. What you are doing here is multithreatening. I might be totally off-base here because u might be using this only for demo purposes (it is midnight here, spare me :)).

I have been doing that a lot till bunch people punched me in the face :) http://stackoverflow.com/questions/8743067/do-asynchronous-operations-in-asp-net-mvc-use-a-thread-from-threadpool-on-net-4
ReplyDelete
Replies
tugberk21 September 2012 at 00:38
Yes, u are taking a thread from the thread pool with StartNew but u are blocking that thread this time and u are also generating a unnecessary thread hoop. The whole idea behind the asyn is to free up the thread. If carRepo.Get() take e seconds, ur thread will do nothing but waiting till the job is done. Just to be on the safe side, I would ask someone form the team for that (levi, damian, brad, etc)
ReplyDelete
Replies
Tugberk21 September 2012 at 23:24
It's certainly going to kill your perf if you use the below method:

// async version (on another controller)
public Task GetAsync(int id)
{
return Task.Factory.StartNew(() => _carRepository.Get(id));
}

_carRepository.Get method is not an asynchronous method. So, you are not going to free up thread here. Instead, you will consume one more unnecessary thread to handle the request. In other words, this is multithreading, not async operation.

Assuming that your ASP.NET request is handled by thread #1 and _carRepository.Get method is handled by thread #2; The thread #2 will wait till the _carRepository.Get operation completes and u will be blocking a thread pool thread instead of the main thread but u will be blocking a thread at the end of the day.

If this was a client application such as a WPF or a WinRT app, your scenario would work fine because ur main concern will be to free up the UI thread. In a server application, every thread has its own cost.

If this was an asynchronous operation instead of a multithreaded operation, the thread #2 would go back to the thread pool and would be ready to process any other work. Finally, when the async operation is done, a thread would be drawn from the thread pool to process the request (assuming you are using the Default TaskSchaduler).
ReplyDelete
Replies
Tugberk22 September 2012 at 14:33
"You say _carRepository.Get(id) is not async. Yes, it is not but we have made it so by running the task."

You are not making it async. The thread which will execute that code will not be freed up because the method is not benefiting from the async I/O Completion Ports on Windows. Yes, you will free up the main thread here but you will still block a thread pool thread (assuming you are using the default TaskSchaduler).

What you do here is multithreading and this is going hurt your performance if you have lots of traffic. If I were your, I would repro my issue and ask someone from the team about that (probably ask on stackoverflow?).
ReplyDelete
Replies
Tugberk22 September 2012 at 14:35
Wow, that podcast is way too old :) August 30, 2004. Here is a new one for ya:

http://hanselminutes.com/327/everything-net-programmers-know-about-asynchronous-programming-is-wrong
ReplyDelete
Replies
Blogger4 October 2016 at 09:20
Did you know that you can generate money by locking premium pages of your blog / website?
Simply join AdWorkMedia and use their Content Locking widget.
ReplyDelete
Replies

Add comment

Note: only a member of this blog may post a comment.