Beware of the .NET HttpClient
In the old days of .NET (pre 4.5), sending an HTTP request to a server could be accomplished either with the WebClient or, at a much lower level, via the HttpWebRequest. In 2009, as part of the REST Starter Kit (RSK), a new abstraction called the HttpClient was born; however, it took until the release of .NET 4.5 for it to become available to a wider audience.
This abstraction provides easier, fully asynchronous methods of communicating with an HTTP server and allows default headers to be set for each and every request. This is all well and good, but there are dark secrets about this class that, if not addressed, can cause serious performance problems and at times mind-boggling bugs! In this article we will explore the problems and subtleties that one needs to be aware of when working with the HttpClient class.
So what’s wrong?
The HttpClient class implements IDisposable, suggesting that any object of this type must be disposed of after use. With that in mind, let us have a look at how I would use this class assuming that I am not aware of the problem:
Uri endpoint = new("http://localhost:1234/");
for (int i = 0; i < 10; i++)
{
    using HttpClient client = new();
    string response = await client.GetStringAsync(endpoint);
    Console.WriteLine(response);
}
So here we are sending 10 requests to an endpoint sequentially. Assuming there is a listener serving the requests on port 1234 (or whichever endpoint you choose to hit), you will see 10 responses written to the Console. This is all going well, right? WRONG!
Let us run the command netstat -abn in CMD, which should return something like the following (depending on the endpoint you hit):
...
TCP [::1]:40968 [::1]:1234 TIME_WAIT
TCP [::1]:40969 [::1]:1234 TIME_WAIT
TCP [::1]:40970 [::1]:1234 TIME_WAIT
TCP [::1]:40971 [::1]:1234 TIME_WAIT
TCP [::1]:40972 [::1]:1234 TIME_WAIT
TCP [::1]:40973 [::1]:1234 TIME_WAIT
TCP [::1]:40975 [::1]:1234 TIME_WAIT
TCP [::1]:40976 [::1]:1234 TIME_WAIT
TCP [::1]:40977 [::1]:1234 TIME_WAIT
TCP [::1]:40978 [::1]:1234 TIME_WAIT
...
What is this, you ask? Well, this is showing us that our application has opened up 10 sockets to the server, one for every request. More importantly, even though our application has now ended, the OS still has 10 sockets occupied in the TIME_WAIT state.
This is due to the way TCP/IP has been designed to work: connections are not closed immediately, so that packets which arrive out of order or are re-transmitted after the connection has been closed can still be dealt with. TIME_WAIT indicates that the local endpoint (the one on our side) has closed the connection, but the connection is being kept around so that any delayed packets can be handled correctly. The connections are then removed once their timeout period of 4 minutes has elapsed. Remember, we sent 10 requests to the same endpoint and yet we have 10 individual sockets still busy for at least 4 minutes!
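If you would rather check this from code than from netstat, the following is a minimal sketch that counts the local TCP connections sitting in TIME_WAIT against a given remote port. CountTimeWait is a hypothetical helper name, and port 1234 is simply the port used in the loop earlier.

```csharp
using System;
using System.Linq;
using System.Net.NetworkInformation;

// Hypothetical helper: counts the TCP connections on this machine that are in
// the TIME_WAIT state and whose remote endpoint uses the given port.
static int CountTimeWait(int remotePort) =>
    IPGlobalProperties.GetIPGlobalProperties()
        .GetActiveTcpConnections()
        .Count(c => c.State == TcpState.TimeWait
                 && c.RemoteEndPoint.Port == remotePort);

Console.WriteLine($"TIME_WAIT sockets to port 1234: {CountTimeWait(1234)}");
```

Running this right after the request loop should report roughly one lingering socket per disposed client.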
The loop example above is an overly simplified one, but have a look at this:
public class ProductController : ApiController
{
    public async Task<Product> GetProductAsync(string id)
    {
        using HttpClient httpClient = new();
        string result = await httpClient.GetStringAsync("http://somewhere/api/...");
        return new Product { Name = result };
    }
}
Doing this per incoming request will eventually result in a SocketException. Don’t believe me? Just run a load test, sit back and watch how many requests you can serve before you run out of sockets!
What can we do?
Well, the first obvious thing that comes to mind is reusing our client instead of creating a new one for every request, but as you will see later in the post, that can cause yet another problem. Before we get to that point, let us first find out if we can even re-use a single instance. Is the HttpClient thread-safe? The answer is YES, it is; at least the following methods have been documented to be thread-safe:
CancelPendingRequests
DeleteAsync
GetAsync
GetByteArrayAsync
GetStreamAsync
GetStringAsync
PostAsync
PutAsync
SendAsync
However, the following properties are not thread-safe and cannot be changed once the first request has been made:
BaseAddress
Timeout
MaxResponseContentBufferSize
In fact on the same documentation page, under the Remarks section, it explains:
HttpClient is intended to be instantiated once and re-used throughout the life of an application. Instantiating an HttpClient class for every request will exhaust the number of sockets available under heavy loads. This will result in SocketException errors.
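As a rough illustration of that advice, here is a minimal sketch of one client shared by every caller, with its Timeout set once before any request is sent. The names sharedClient and FetchAllAsync, and the 15 second timeout, are assumptions made for this example only.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// A single client for the whole process. Properties such as Timeout are set
// once, up front, because they cannot be changed after the first request.
HttpClient sharedClient = new() { Timeout = TimeSpan.FromSeconds(15) };

// Hypothetical helper: sends all requests concurrently through the same
// instance, relying on GetStringAsync being safe to call from many callers.
Task<string[]> FetchAllAsync(params Uri[] endpoints) =>
    Task.WhenAll(endpoints.Select(e => sharedClient.GetStringAsync(e)));

// Example call (assumes something is listening on the port used earlier):
// string[] bodies = await FetchAllAsync(new Uri("http://localhost:1234/"));
Console.WriteLine($"Shared client timeout: {sharedClient.Timeout}");
```

In a web application the same idea would typically take the form of a static field or a singleton registration rather than a local variable.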
Okay so is that it? Create and reuse a single instance of our client and happy days? Well, NO! There is yet another very subtle but serious issue you may face.
A Singleton HttpClient does not respect DNS changes
Re-using an instance of HttpClient means that it holds on to the socket until it is closed, so if a DNS record for the server is updated, the client will never know until that socket is closed. And let me tell you, DNS records change for different reasons all the time. A fail-over scenario is just one example (albeit in such a case the connection/socket would have faulted and closed); another is an Azure deployment where you swap different instances, e.g. Production/Staging, in which case your client would still be hitting the old instance! In fact, there is an issue on the dotnet/corefx repo about this behaviour.
HttpClient (for valid reasons) does not check the DNS records while the connection is open, so how can we fix this? One naive, easy workaround is to set the keep-alive header to false so that the socket is closed after each request. This obviously results in sub-optimal performance, but if you do not care, then there is your answer. However, I think we can do better.
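For reference, one way the naive workaround might look is sketched below. It assumes that turning keep-alive off is done through the ConnectionClose default header, which makes the client send "Connection: close" with every request; the variable name is arbitrary.

```csharp
using System;
using System.Net.Http;

HttpClient client = new();

// Ask for the connection to be closed after every response instead of being
// kept alive and reused. This trades throughput for fresher DNS lookups.
client.DefaultRequestHeaders.ConnectionClose = true;

Console.WriteLine($"Connection: close sent by default: {client.DefaultRequestHeaders.ConnectionClose}");
```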
There is the lesser-known ServicePoint class which holds the solution to our problem. This class is responsible for managing different properties of a TCP connection, and one such property is the ConnectionLeaseTimeout. This guy, as its name suggests, specifies how long (in ms) the TCP socket can stay open. By default the value of this property is set to -1, resulting in the socket staying open indefinitely (relatively speaking), so all we have to do is set it to a more realistic value:
ServicePointManager.FindServicePoint(endpoint)
    .ConnectionLeaseTimeout = (int)TimeSpan.FromMinutes(1).TotalMilliseconds;
The above override needs to be applied once for each endpoint. Note that the method only cares about the scheme, host and port; everything else is ignored.
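To make the "once for each endpoint" part concrete, here is a minimal sketch that remembers which endpoints have already been configured, keyed by scheme, host and port. EnsureLeaseTimeout and the seen dictionary are hypothetical names for this illustration, and the sketch targets the ServicePoint-based stack described above; newer .NET versions mark these types as obsolete, hence the pragma.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;

#pragma warning disable SYSLIB0014 // ServicePoint types are flagged obsolete on newer .NET

// Endpoints (scheme + host + port) that already had their lease timeout set.
ConcurrentDictionary<string, bool> seen = new();

// Hypothetical helper: applies the lease timeout the first time an endpoint
// is seen and returns true; returns false if it was already configured.
bool EnsureLeaseTimeout(Uri uri, TimeSpan lease)
{
    // GetLeftPart(Authority) drops the path and query, e.g. "http://localhost:1234".
    string key = uri.GetLeftPart(UriPartial.Authority);
    if (!seen.TryAdd(key, true)) { return false; }

    ServicePointManager.FindServicePoint(uri).ConnectionLeaseTimeout =
        (int)lease.TotalMilliseconds;
    return true;
}

TimeSpan oneMinute = TimeSpan.FromMinutes(1);
Console.WriteLine(EnsureLeaseTimeout(new Uri("http://localhost:1234/a"), oneMinute));     // True
Console.WriteLine(EnsureLeaseTimeout(new Uri("http://localhost:1234/b?x=1"), oneMinute)); // False
```

Both URIs share the same scheme, host and port, so only the first call touches the ServicePoint.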
Almost There…
So far we have taken care of force-closing the connections after a given period, but that was just the first part. If our singleton client opens another connection, it may still be pointed at the old server. Why, you ask? Well, all the DNS entries are cached, and by default the cache does not refresh for 2 minutes. So we also need to reduce the cache timeout, which we can do by setting the DnsRefreshTimeout on the ServicePointManager class like so:
ServicePointManager.DnsRefreshTimeout = (int)1.Minutes().TotalMilliseconds;
I wanted a better abstraction so that I would not have to remember to do all of this every time. I also wanted the abstraction to implement an interface so that it could be used for dependency injection between my services.
RestClient
RestClient is a thread-safe wrapper around HttpClient which internally keeps a cache of the endpoints it has already sent a request to. If it is asked to send a request to an endpoint that is not in its cache, it updates the ConnectionLeaseTimeout for that endpoint. Here is a simple usage example:
// This shows that IRestClient implements IDisposable just like HttpClient;
// as with HttpClient, you should not dispose of it per request.
using IRestClient client = new RestClient();
await client.SendAsync(new HttpRequestMessage(HttpMethod.Get, new Uri("http://localhost/api")));
Now you can safely hold on to the client and/or register it using your favorite IoC container and inject it wherever you require.
The class supports the same constructors as HttpClient and also provides a safe way of setting its default properties:
var defaultHeaders = new Dictionary<string, string>
{
    {"Accept", "application/json"},
    {"User-Agent", "foo-bar"}
};
using IRestClient client = new RestClient(defaultHeaders, timeout: 15.Seconds(), maxResponseContentBufferSize: 10);
client.DefaultRequestHeaders.Count.ShouldBe(defaultHeaders.Count);
client.DefaultRequestHeaders["Accept"].ShouldBe("application/json");
client.DefaultRequestHeaders["User-Agent"].ShouldBe("foo-bar");
client.Endpoints.ShouldBeEmpty();
client.MaxResponseContentBufferSize.ShouldBe((uint)10);
client.Timeout.ShouldBe(15.Seconds());
The code is on GitHub and is available on NuGet as part of the Easy.Common library used in my other projects.
Update 2019
Starting from .NET Core 2.1, Microsoft addressed some of the issues covered in this article by making HttpClientFactory available. Despite the various features offered by this class, in my opinion there is a little too much ceremony involved in using this type, and we would still need to deal with setting the DNS refresh timeout ourselves. Therefore, I still prefer to use RestClient in my projects.
HttpClient was also overhauled in .NET Core 2.1 with a rewritten HttpMessageHandler called SocketsHttpHandler, which results in significant performance improvements. It also introduces the PooledConnectionLifetime property, which allows us to limit the lifetime of pooled connections without having to set the ConnectionLeaseTimeout for each endpoint.
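A minimal sketch of that approach, assuming .NET Core 2.1 or later and a one minute lifetime picked purely for illustration:

```csharp
using System;
using System.Net.Http;

// Connections in the pool are retired after one minute, so a long-lived
// client re-resolves DNS when it has to open a replacement connection.
SocketsHttpHandler handler = new()
{
    PooledConnectionLifetime = TimeSpan.FromMinutes(1)
};

// The client itself can still be created once and shared.
HttpClient client = new(handler);

Console.WriteLine($"Pooled connection lifetime: {handler.PooledConnectionLifetime}");
```

Because the setting lives on the handler, it applies to every endpoint the client talks to rather than having to be set per endpoint.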
As of version 3.0.0 of Easy.Common, the RestClient no longer needs to set the ConnectionLeaseTimeout when running on .NET Core 2.1 or higher.
Have fun and happy RESTing.
BTW
This post was inspired by YOU’RE USING HTTPCLIENT WRONG AND IT IS DESTABILIZING YOUR SOFTWARE by Simon Timms and Singleton HttpClient by Ali Ostad as well as various great posts by Darrel Miller.