In my last post, we implemented rate limiting for an API endpoint on the server side. Server-side rate limiting helps API providers protect system performance and enforce business rules. On the client side, API consumers should in turn throttle their concurrent HTTP requests, both to comply with the endpoints’ rate limits and to moderate the use of client-side resources.
This post goes over how to make concurrent outgoing HTTP requests on the client side. The goal is to let the HttpClient send concurrent requests at the maximum rate allowed by the server, for example, at a maximum rate of 2 requests per second.
We will use a semaphore in C# to limit the maximum number of concurrent tasks. The demo project is a .NET Core Console application, which takes advantage of the native dependency injection (DI) system and the HttpClientFactory introduced in .NET Core 2.1. The full project is located in this GitHub repository.
Setting Up a .NET Core Console App with a Typed HttpClient
For demo purposes, we are going to create a small .NET Core Console app. Two necessary NuGet packages are Microsoft.Extensions.DependencyInjection and Microsoft.Extensions.Http. The first allows us to take advantage of the DI system, and the second provides the HttpClientFactory and its support for Typed HttpClients.
We modify the Main() method as below. Notice that this Main() method returns a Task and is declared async, which requires us to configure the project to build with C# 7.1 or later.
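The Main() method might look roughly like the following sketch. The service names (IThrottledHttpClient, ThrottledHttpClient), the method GetPrimeNumberResults(), and the base URL are taken from this post or assumed for illustration; the exact code lives in the linked GitHub repository, so treat this as an approximation rather than the repository’s implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public interface IThrottledHttpClient
{
    Task<string[]> GetPrimeNumberResults(IEnumerable<int> numbers);
}

// Stubbed here; the real implementation is covered in the next section.
public class ThrottledHttpClient : IThrottledHttpClient
{
    private readonly HttpClient _httpClient;
    public ThrottledHttpClient(HttpClient httpClient) => _httpClient = httpClient;
    public Task<string[]> GetPrimeNumberResults(IEnumerable<int> numbers) =>
        Task.FromResult(Array.Empty<string>());
}

public class Program
{
    public static async Task Main(string[] args)
    {
        // Register dependencies and build a service provider.
        var serviceProvider = new ServiceCollection()
            .AddHttpClient<IThrottledHttpClient, ThrottledHttpClient>(client =>
                client.BaseAddress = new Uri("https://localhost:5001/")) // assumed URL
            .Services
            .BuildServiceProvider();

        // Resolve an instance of the Typed HttpClient.
        var throttledHttpClient =
            serviceProvider.GetRequiredService<IThrottledHttpClient>();

        // Issue requests against the endpoint with a list of numbers
        // as query parameters.
        var results =
            await throttledHttpClient.GetPrimeNumberResults(new[] { 11, 12, 13, 14, 15 });

        // Print out the results.
        foreach (var result in results)
        {
            Console.WriteLine(result);
        }
    }
}
```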
The method first registers all dependencies and builds a service provider, which we then use to resolve the services we want. Next, it produces an instance of a Typed HttpClient, IThrottledHttpClient, which will be implemented in the next section, and instructs that instance to issue requests against an API endpoint with a list of numbers as query parameters. Finally, it prints out the results.
We now have a bare-bones program that lets an HttpClient make requests. Next, let’s study how to throttle concurrent outbound HTTP requests.
Why? We want to throttle concurrent HTTP requests not only because of the rate limit policies enforced by API providers, but also to optimize resource allocation. If we don’t control the number of requests in a time period, our workstations or application servers might end up with an unnecessarily large number of open outgoing TCP connections in a short amount of time. High-volume network traffic can saturate the network bandwidth and degrade computing performance.
Therefore, we need a mechanism to dispatch asynchronous tasks so that the number of concurrent HTTP requests stays limited within any given time period. There are several ways to throttle concurrent tasks. Let’s examine the semaphore approach first.
Throttling Concurrent HTTP Requests Using a Semaphore
A semaphore in C# is often compared to a bouncer at a night club, whose responsibility is to allow only a certain number of people to step into the club at any point in time. By default, a semaphore only caps the maximum number of concurrent requests at a point in time. In the case we are discussing here, we need to add a second dimension: the time period.
We want to limit execution to at most 2 concurrent tasks. In addition, we want to ensure no more than two tasks start within any 1-second window. The idea is depicted in the following illustration.
In the illustration above, we set up a semaphore that allows only 2 tasks to run simultaneously, and we add a delay of 1 second after each task. The semaphore releases an entry only after the 1-second delay has finished, at which point a new task can enter the semaphore.
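As a minimal, self-contained illustration of this pattern, the sketch below simulates each request with a 1-second Task.Delay instead of a real HTTP call (the class and method names here are made up for the demo, not taken from the repository):

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class SemaphoreThrottleDemo
{
    // Runs 5 simulated requests, at most 2 at a time, each holding its
    // semaphore entry for about 1 second; returns total elapsed milliseconds.
    public static async Task<long> RunAsync()
    {
        var semaphore = new SemaphoreSlim(2); // at most 2 tasks inside at once
        var stopwatch = Stopwatch.StartNew();

        var tasks = Enumerable.Range(1, 5).Select(async i =>
        {
            await semaphore.WaitAsync(); // wait for a free entry
            try
            {
                Console.WriteLine($"Task {i} started at {stopwatch.ElapsedMilliseconds} ms");
                // Simulated work (a real HTTP request would go here),
                // combined with the 1-second spacing before release.
                await Task.Delay(1000);
            }
            finally
            {
                semaphore.Release(); // let the next task in
            }
        }).ToArray();

        await Task.WhenAll(tasks);
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main() =>
        Console.WriteLine($"All tasks finished after {await RunAsync()} ms");
}
```

With 5 tasks and 2 entries, the tasks run in batches of two, so the whole run takes roughly 3 seconds instead of 1.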
We create a Typed HttpClient class, ThrottledHttpClient, and inject an instance of HttpClient into the class. Then we create a method, GetPrimeNumberResults(), to send HTTP requests and return the results. The code snippet is shown below.
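The following sketch captures the structure described in this section: a semaphore with an initial count of 2, a chained task per query parameter that issues the request, delays for one second, and then releases the semaphore, plus fallback results on HTTP failures. The repository’s actual snippet may differ in detail; in particular, the query-string shape ("primes?number=…") is an assumption for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class ThrottledHttpClient
{
    private readonly HttpClient _httpClient;

    public ThrottledHttpClient(HttpClient httpClient) => _httpClient = httpClient;

    public async Task<string[]> GetPrimeNumberResults(IEnumerable<int> numbers)
    {
        // Initial count of 2: at most 2 concurrent requests at any point in time.
        using var semaphore = new SemaphoreSlim(2);

        var tasks = numbers.Select(async number =>
        {
            // Block until another task releases an entry.
            await semaphore.WaitAsync();
            try
            {
                // Issue the HTTP request, then delay for one second before
                // releasing the semaphore, so no more than 2 requests start
                // within any 1-second window.
                var response = await _httpClient.GetAsync($"primes?number={number}");
                response.EnsureSuccessStatusCode();
                var body = await response.Content.ReadAsStringAsync();
                await Task.Delay(1000);
                return body;
            }
            catch (HttpRequestException)
            {
                // Fallback result when the endpoint is unavailable.
                return $"Request for {number} failed.";
            }
            finally
            {
                semaphore.Release(); // allow another task to enter
            }
        }).ToArray();

        return await Task.WhenAll(tasks);
    }
}
```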
The method creates a semaphore with an initial count of 2, which is the rate limit per time period, and generates an array of tasks from the list of query parameters we are interested in. Before each task starts, the semaphore blocks until an entry is free. Each task then issues an HTTP request, delays for one second, and finally releases the semaphore once so that another task can enter. At the end, the return results are processed, and HTTP request exceptions are handled with fallback results.
To run the program locally, we first need to spin up the Web API application; otherwise, all responses will be the fallback result because the API endpoint is not available. If you pull the code from this GitHub repository, you can navigate to the ThrottledWebApi folder and issue the command “dotnet run” in a terminal. Then the Web API endpoint is ready to serve you.
Now, we navigate to the ThrottledWebApi.ClientDemo folder, open a new terminal, and issue the command “dotnet run”. The messages start to print in the Console window. I recorded my screen in the following GIF image.
As seen in the image above, the tasks run two at a time, about one batch per second. Thus, we have successfully throttled the outbound HTTP requests.
Other Client Side Throttling Approaches
SemaphoreSlim allows us to set a limit on the number of threads entering the critical section, and it supports async operations, which is useful when many tasks are waiting in a queue. In contrast, we should not use Parallel.ForEach for asynchronous work; Parallel.ForEach is best suited to CPU-bound work.
It is worth mentioning the Polly library, which includes a Bulkhead policy. With Polly, to throttle outgoing requests to a maximum of two concurrent requests at any point in time, we can do the following. However, you might need to tweak it a little to add the time-period dimension.
var throttler = Policy.BulkheadAsync(maxParallelism: 2);
The last approach, RateGate, proposed by Jack Leitch, is also worth a try.
That’s all for today. Again, the complete .NET Core Console app is in this GitHub repository. Thanks for reading.