Tuesday, 31 July 2012

Using range header for retrieving range of IEnumerable<T> in ASP.NET Web API


Introduction

[Level T3] In this post, we talk about using HTTP's Range header to request ranges of entities.

Background

The HTTP spec defines a series of headers that a client can use to request partial content. These operations are optional (in most cases the spec uses the word SHOULD), but most servers implement them and browsers have increasingly been using them. If you have ever resumed downloading a big file from the internet, then you have used this feature (in fact, all browsers use ranges if supported by the server). In this case, the client keeps requesting chunks and builds up the file until it is fully downloaded.

So here is how it works in a nutshell:

  1. Server can optionally inform clients, while serving a resource, that it supports partial content. It does that by sending the Accept-Ranges header with a value of the unit it supports, normally bytes. In our case, our server sends back a custom unit that we call x-entity.
  2. Client, either informed by the server of the partial content feature through the Accept-Ranges header or simply trying its luck, sends a request with a Range header with value [unit]=[from]-[to], for example bytes=1024-2047. In this example, the client asks for the second KB of the file. The Range header can specify multiple ranges, for example bytes=500-600,601-999.
  3. Server will return the range requested and include a Content-Range header with value [unit] [from]-[to]/[TotalCount], for example bytes 1024-2047/12345678. It also returns status code 206 (Partial Content) to inform the client that the content is partial. If the server cannot satisfy the requested range, it sends back status code 416 (Requested Range Not Satisfiable).

The spec allows for custom units, so a server can implement its own unit and inform the client of it using the Accept-Ranges header. Now the idea is that in ASP.NET Web API, we normally build many actions that return IEnumerable<T>. What if we could use the range to specify the range of the enumerable to be returned? Hmmm....
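To make the header formats above concrete, here is a minimal sketch that parses a Range value and builds the matching Content-Range value with the RangeHeaderValue and ContentRangeHeaderValue types from System.Net.Http.Headers. The class and method names (RangeHeaderSketch, DescribeContentRange) are made up for illustration, and the sketch only handles a single closed range.

```csharp
using System;
using System.Linq;
using System.Net.Http.Headers;

// Hypothetical helper, not part of the post's project.
public static class RangeHeaderSketch
{
    // Takes a Range header value such as "x-entity=2-5" and returns the
    // Content-Range value a server might answer with. A null totalCount
    // means the total is unknown and is rendered as "*".
    public static string DescribeContentRange(string rangeHeader, long? totalCount)
    {
        RangeHeaderValue range = RangeHeaderValue.Parse(rangeHeader);
        RangeItemHeaderValue first = range.Ranges.First();

        if (first.From == null || first.To == null)
            throw new ArgumentException("This sketch only handles closed ranges.");

        ContentRangeHeaderValue contentRange = totalCount.HasValue
            ? new ContentRangeHeaderValue(first.From.Value, first.To.Value, totalCount.Value)
            : new ContentRangeHeaderValue(first.From.Value, first.To.Value);

        // Echo the unit the client asked for (for example "x-entity").
        contentRange.Unit = range.Unit;
        return contentRange.ToString();
    }
}
```

With this helper, DescribeContentRange("x-entity=2-5", 10) should produce x-entity 2-5/10, and passing null as the total should produce x-entity 2-5/*.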

This feature can be useful for pagination on the client: instead of every API implementing its own range parameters, we just use HTTP's built-in features and encapsulate the implementation in a reusable component, in this case a filter.

So in the code to follow, we define a custom range unit and call it x-entity. "x-" prefix is a common naming convention on the web to specify custom tokens that are not part of the canonical tokens defined in RFC specs.

Implementing range in ASP.NET Web API

So where is the best place to implement this? We have these requirements:

  • Access to request headers to read Range header
  • Access to response headers to set Accept-Ranges
  • Access to content headers to set Content-Range header
  • Access to content so that it can filter IEnumerable<T>

DelegatingHandler might look promising, but by the time it accesses the content, it has already been turned into a stream by the MediaTypeFormatters.

MediaTypeFormatter is an interesting option. I actually created a RangeMediaTypeFormatterWrapper that would wrap the MTFs, intercept the content and, if it was of type IEnumerable<T>, apply the filtering. Initially it seemed that an MTF does not have access to request headers, but we had an interesting discussion here and it turned out that it can access the request using GetPerRequestFormatterInstance. But it also needs access to response headers.


So Glenn Block suggested filters and, after some thought, this seems to be the right approach considering the current limitations of MTFs. The only drawback is that the filter has to be explicitly declared on the action, which in some cases can in fact be a blessing. In any case, the filter approach, as you will see, is clean and does everything in the same place.

Filters in ASP.NET Web API are not much different from those in MVC. You get two methods, one before (OnActionExecuting) and one after (OnActionExecuted) the action, where you can change values in the request, response or action arguments, or simply examine values.
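As a rough outline of how such a filter could be structured, the skeleton below overrides OnActionExecuted on ActionFilterAttribute. It is only a sketch and not the code from the repository: the class name RangeFilterSketchAttribute is hypothetical, the slicing itself is left as a comment, and it assumes the project references the ASP.NET Web API assemblies (System.Web.Http and System.Net.Http.Formatting).

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Web.Http.Filters;

// Hypothetical name; the attribute used in the post is EnableRange.
public class RangeFilterSketchAttribute : ActionFilterAttribute
{
    private const string Unit = "x-entity";

    // Runs after the action, when the response (and its content) exists.
    public override void OnActionExecuted(HttpActionExecutedContext context)
    {
        HttpResponseMessage response = context.Response;
        if (response == null)
            return; // for example, the action threw an exception

        RangeHeaderValue range = context.Request.Headers.Range;
        if (range == null || range.Unit != Unit)
        {
            // No usable Range header: only advertise the custom unit.
            response.Headers.AcceptRanges.Add(Unit);
            return;
        }

        var content = response.Content as ObjectContent;
        if (content == null)
            return; // nothing we know how to slice

        // content.Value holds the object returned by the action. A full
        // implementation would replace it with the requested slice, set
        // response.Content.Headers.ContentRange and return status 206.
    }
}
```

Keeping all of the header handling in OnActionExecuted reflects the point made above: the filter sees the request headers, the response headers and the content in one place.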

Using the code

You can get the source code from GitHub. As you can see, we have a single controller called CarController. The project is running on port 50714 on my machine so all examples will include this port; yours could be different. So download the project, build and run it. I created a client project to implement all the steps below using HttpClient, but there is an issue with the ASP.NET Web API implementation of the Range header: regardless of the unit set in the Range header, bytes is always sent to the server.

As you can see, we have a simple action with EnableRange filter defined on it:

[EnableRange]
public IEnumerable<Car> Get()
{
 return CarRepository.Instance.Get();
}

So now we use Fiddler (or a similar tool capable of sending HTTP requests, such as the Postman extension for Chrome) to send this request:

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714

We will get all the cars in our repository in JSON format. But note the Accept-Ranges header with a value of x-entity:

HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Expires: -1
Accept-Ranges: x-entity
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 18:59:56 GMT
Content-Length: 1125

[{"Id":1,"Make":"Vauxhall","Model":"Astra","BuildYear":1997,...

So this should tell the client that it can use the Range header. Now let's send a Range header requesting the 3rd item to the 6th item (a total of 4 items):

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714
Range: x-entity=2-5

And here is the response:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Content-Range: x-entity 2-5/10
Expires: -1
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 19:00:19 GMT
Content-Length: 447

[{"Id":3,"Make":"Toyota","Model":"Yaris","BuildYear":2003,"Price":3750.0,...

Note the Content-Range header above and also the fact that we got the entities we requested in JSON (not shown fully above). It tells us that the server has sent back the items from index 2 to index 5 and that the total number of items is 10. Also note the 206 response.

The server can send back * as the total if the number of items is not known at the time of serving the request. I have used this option since I do not want to run Count() on an IEnumerable<T>: it is very likely that the data is being retrieved from a database, and we do not want to load the whole table into memory. So my approach is to try to cast the value to ICollection using the as keyword. If the cast succeeds, I use its Count; otherwise I set the total to *.
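The cast described above could look roughly like the following sketch. The names TotalCountSketch and DescribeTotal are hypothetical and only serve the example; the point is that Count is read only when the value already is an ICollection, so a lazy sequence is never enumerated.

```csharp
using System.Collections;
using System.Globalization;

// Hypothetical helper, not part of the post's project.
public static class TotalCountSketch
{
    // Returns the total to report in Content-Range: the collection's
    // Count when it is cheaply available, otherwise "*" (unknown).
    public static string DescribeTotal(IEnumerable items)
    {
        var collection = items as ICollection;
        return collection != null
            ? collection.Count.ToString(CultureInfo.InvariantCulture)
            : "*";
    }
}
```

A List<T> or an array takes the first branch, while a lazily evaluated query (for example the result of a Where call) falls through to *.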

Another option in the spec is that the to part of the range is optional, so the client can send a Range header with a value such as x-entity=2-. In this case, we must skip the first 2 items and return the rest:

GET http://localhost:50714/api/Car HTTP/1.1
User-Agent: Fiddler
Host: localhost:50714
Range: x-entity=2-

In this case, our server returns this response:

HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Content-Range: x-entity 2-9/10
Expires: -1
Server: Microsoft-IIS/8.0
Date: Tue, 31 Jul 2012 21:18:05 GMT
Content-Length: 897

[{"Id":3,"Make":"Toyota","Model":"Yaris","BuildYear":2003,"Price":3750.0, ....
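The mapping from a range to the returned items, including the open-ended case, can be sketched with plain LINQ. This is an illustrative, strongly typed version with hypothetical names (OpenRangeSketch, Slice, LastIndex); the filter itself cannot be generic and therefore takes a different route.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the post's project.
public static class OpenRangeSketch
{
    // 'from' and 'to' are zero-based and inclusive; a null 'to' means
    // "everything from 'from' to the end of the sequence".
    public static IEnumerable<T> Slice<T>(IEnumerable<T> items, int from, int? to)
    {
        IEnumerable<T> skipped = items.Skip(from);
        return to.HasValue ? skipped.Take(to.Value - from + 1) : skipped;
    }

    // Index of the last item actually returned, as reported in
    // Content-Range (for example 9 when 8 items are returned from index 2).
    public static int LastIndex(int from, int returnedCount)
    {
        return from + returnedCount - 1;
    }
}
```

For ten items and the range 2-, Slice returns the last eight items and LastIndex(2, 8) gives 9, which matches the Content-Range of x-entity 2-9/10 in the response above.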

Notes on implementation

The crux of the implementation is to call Skip() and Take() on IEnumerable<T>. Our code has to work with any element type, so ideally it would be generic; however, filters (and attributes as a whole) cannot be generic. As such, we have to use reflection to do this:

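// t is the type declaring the LINQ extension methods (presumably typeof(Enumerable)).
// Pick the two-parameter Skip and Take overloads and close them over the element type.
var skipMethod = t.GetMethods().First(m => m.Name == "Skip" && m.GetParameters().Length == 2)
    .MakeGenericMethod(_elementType);
var takeMethod = t.GetMethods().First(m => m.Name == "Take" && m.GetParameters().Length == 2)
    .MakeGenericMethod(_elementType);
...
// Skip the first 'from' items, then take the inclusive range up to 'to' (if given).
value = skipMethod.Invoke(null, new object[] { value, from });
if (to.HasValue)
    value = takeMethod.Invoke(null, new object[] { value, to - from + 1 });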
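For readers who want to try the reflection technique in isolation, here is a self-contained sketch along the same lines. It is not the code from the repository: the names ReflectionSliceSketch, Slice and GetLinqMethod are hypothetical, and the element type is discovered from the IEnumerable<T> interface of the runtime type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of the post's project.
public static class ReflectionSliceSketch
{
    // Applies Enumerable.Skip/Take to a sequence whose element type is
    // only known at run time. 'to' is inclusive; null means "to the end".
    public static IEnumerable Slice(object sequence, int from, int? to)
    {
        // Find the IEnumerable<T> interface to discover T.
        Type enumerableType = sequence.GetType().GetInterfaces()
            .First(i => i.IsGenericType &&
                        i.GetGenericTypeDefinition() == typeof(IEnumerable<>));
        Type elementType = enumerableType.GetGenericArguments()[0];

        object value = GetLinqMethod("Skip", elementType)
            .Invoke(null, new object[] { sequence, from });
        if (to.HasValue)
        {
            value = GetLinqMethod("Take", elementType)
                .Invoke(null, new object[] { value, to.Value - from + 1 });
        }
        return (IEnumerable)value;
    }

    // Selects the (IEnumerable<T>, int) overload explicitly, because newer
    // frameworks add other two-parameter overloads of Take.
    private static MethodInfo GetLinqMethod(string name, Type elementType)
    {
        return typeof(Enumerable)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .First(m => m.Name == name &&
                        m.GetParameters().Length == 2 &&
                        m.GetParameters()[1].ParameterType == typeof(int))
            .MakeGenericMethod(elementType);
    }
}
```

Calling Slice on a List<int> holding 0 to 9 with from 2 and to 5 should yield 2, 3, 4 and 5. Checking the second parameter's type is a defensive choice for newer frameworks and was not needed on the framework version the post targets.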

It is also worth noting that the action's return value is not directly accessible in the filter, so we have to resort to casting the response's Content to ObjectContent and using its Value property.

Conclusion

The Range header defined in the HTTP spec is useful for retrieving partial content. The spec allows custom units, so we defined an x-entity unit to enable selecting a range of entities (commonly needed in pagination scenarios) and implemented it using a filter.

4 comments:

  1. I recently wrote a very similar post to this a few days before you.

    Although I didn't include any specific implementations, I have built a working example (I'll have to blog about that soon) in which, instead of returning IEnumerable, I return a custom type "IPagedList<>" which inherits IEnumerable but includes properties like an optional count etc.

    I then have a custom delegation handler to detect when the WebApi returns IPagedList and modify the response code/headers instead of using an attribute.

    Good post!

    1. Well, IMHO, you really do not need another type (that would be intrusive) and also the count should be on the Content-Range according to HTTP spec.

    2. The reason for the custom type is to bundle properties like Total and set them in the header, plus with using the custom type you can see that my intent is to return a page of data. That and it allowed me to easily pick up when to respond with the headers.

      Do you have a link to that part of the spec?

      Cheers
      Tony

    3. Yeah it is here:

      http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.16
