Throttling Concurrent Outgoing HTTP Requests in .NET Core

In my last post, we implemented rate limiting for an API endpoint on the server side. Server-side rate limiting helps API providers protect system performance and/or support business goals. On the client side, API consumers should in turn throttle their concurrent HTTP requests, both to comply with the endpoints' rate limits and to moderate their own use of client-side resources.

This post goes over how to make concurrent outgoing HTTP requests on the client side without exceeding such a limit. The goal is to let the HTTP client send concurrent requests at the maximum rate the server allows, for example 2 requests per second.

We will use a semaphore in C# to limit the maximum number of concurrent tasks. The demo project is a .NET Core Console application that takes advantage of the built-in dependency injection (DI) system and the HttpClient factory support (IHttpClientFactory and typed clients) introduced in .NET Core 2.1. The full project is located in this GitHub repository.

Setting Up a .NET Core Console App with a Typed HttpClient

For demo purposes, we are going to create a small .NET Core Console app. Two NuGet packages are required: Microsoft.Extensions.DependencyInjection and Microsoft.Extensions.Http. The first one lets us use the DI system, and the second one provides IHttpClientFactory and the AddHttpClient() extension methods, which create and configure HttpClient instances for us.

We modify the Main() method as below. Notice that this Main() method returns a Task in async mode, which requires us to configure the project to build with a C# version of 7.1 or up.

private static async Task Main()
{
    var services = new ServiceCollection().AddHttpClient();
    services.AddHttpClient<IThrottledHttpClient, ThrottledHttpClient>();
    var serviceProvider = services.BuildServiceProvider();

    var client = serviceProvider.GetService<IThrottledHttpClient>();
    var numbers = new List<long> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0, 11, 13, 17, 19, 23, 29, 31, 41, 43, 1763 };
    var results = await client.GetPrimeNumberResults(numbers);
    foreach (var result in results)
    {
        Console.WriteLine($"{result.Number} is a prime number? \t {result.IsPrime}.");
    }
}

Lines 3 to 5 register all dependencies and build a service provider. Then we can use the service provider to resolve dependencies for the services we want to use. Line 7 produces an instance of the Typed HttpClient IThrottledHttpClient, which will be implemented in the next section. Line 9 then instructs the ThrottledHttpClient instance to issue requests against an API endpoint with a list of numbers as query parameters. Finally, lines 10 to 13 print out the results.

We now have a bare-bones program to let an HttpClient make requests. Next, let’s study how to throttle concurrent outbound HTTP requests.

Why? We want to throttle concurrent HTTP requests not only because of the rate limit policies enforced by API providers, but also to make good use of our own resources. If we don’t control the number of requests in a time period, our workstations or application servers might end up with an unnecessarily large number of open outgoing TCP connections in a short amount of time. A high volume of network traffic can saturate the available bandwidth and consume computing resources that other work needs.

Therefore, we need a mechanism that dispatches asynchronous tasks so that the number of concurrent HTTP requests stays within the limit for any given time period. There are several ways to throttle concurrent tasks. Let’s examine the semaphore approach first.

Throttling Concurrent HTTP Requests Using a Semaphore

A semaphore in C# is often compared to a bouncer at a night club, whose responsibility is to allow only a certain number of people into the club at any one time. By default, a semaphore only caps the number of concurrent requests at a point in time. In the case we are discussing here, we need to add a second dimension: the time period.

We want to limit the tasks to execute over 2 threads (no more). In addition, we want to ensure that no more than two tasks start within any 1-second window. The idea is depicted in the following illustration.

Illustration of the staging of concurrent tasks executing over 2 threads. The 2 horizontal arrow lines represent the 2 threads, with time running from left to right. Solid and empty circles mark the starting point and the ending point of a task, respectively. The number of threads is determined by the maximum number of concurrent tasks within a time period. The stages “Waiting for Next Task” represent the rate limit allowance time period (one second).

In the illustration above, we set up a semaphore to only allow 2 tasks to run simultaneously and add a delay of 1 second after each task. The semaphore will release an entry after the 1 second delay is finished, then a new task can enter the semaphore.
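The idea can be tried out without any HTTP traffic. The following is a minimal sketch, not the code from the demo project: the names ThrottleSketch and RunAsync are made up for illustration, and a short Task.Delay stands in for the HTTP call. Unlike the client shown later, each job here also waits out the period before returning, which keeps the sketch short.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of the demo project.
public static class ThrottleSketch
{
    // Runs `jobCount` simulated jobs, at most `limit` at a time, and keeps each
    // semaphore slot occupied for `period` after its job finishes.
    // Returns each job's start time, measured from the start of the call.
    public static async Task<TimeSpan[]> RunAsync(int jobCount, int limit, TimeSpan period)
    {
        var throttler = new SemaphoreSlim(limit);
        var clock = Stopwatch.StartNew();

        var jobs = Enumerable.Range(0, jobCount).Select(async i =>
        {
            await throttler.WaitAsync();      // wait for a free slot without blocking a thread
            var startedAt = clock.Elapsed;
            try
            {
                await Task.Delay(10);         // stand-in for the HTTP request
            }
            finally
            {
                await Task.Delay(period);     // hold the slot for the rate-limit period
                throttler.Release();          // only now can the next job enter
            }
            return startedAt;
        });

        return await Task.WhenAll(jobs);
    }
}
```

Calling ThrottleSketch.RunAsync(5, 2, TimeSpan.FromSeconds(1)) and sorting the returned start times should show the jobs starting in groups of two, with roughly one second between groups.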

We create a Typed HttpClient class, ThrottledHttpClient, and inject an instance of HttpClient into it. Then we create a method, GetPrimeNumberResults(), to send HTTP requests and return the results. The code snippet is shown below.

public class ThrottledHttpClient : IThrottledHttpClient
{
    private readonly HttpClient _httpClient;
    private readonly string _baseUrl = @"http://localhost:5000/api";

    public ThrottledHttpClient(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<PrimeNumberResult[]> GetPrimeNumberResults(
      List<long> numbers,
      int requestLimit = 2,
      int limitingPeriodInSeconds = 1)
    {
        var throttler = new SemaphoreSlim(requestLimit);
        var tasks = numbers.Select(async n =>
        {
            await throttler.WaitAsync(); // wait asynchronously for a free slot

            var task = _httpClient.GetStringAsync($"{_baseUrl}/values/isPrime?number={n}");
            _ = task.ContinueWith(async s => // runs once the request completes or fails
            {
                await Task.Delay(1000 * limitingPeriodInSeconds);
                Console.WriteLine($"\t\t {n} waiting");
                throttler.Release(); // free the slot only after the delay
            });
            try
            {
                var isPrime = await task;
                return new PrimeNumberResult(n, isPrime);
            }
            catch (HttpRequestException)
            {
                Console.WriteLine($"\t\t\t {n} error out");
                return new PrimeNumberResult(n, "NA");
            }
        });
        return await Task.WhenAll(tasks);
    }
}

Line 16 creates a semaphore with an initial count of 2, which is the rate limit per time period. Lines 17 to 39 generate an array of tasks from the list of query parameters that we are interested in. Line 19 waits asynchronously, without blocking a thread, until the semaphore has a free slot for the task. Lines 21 to 27 build a chained task that issues an HTTP request, then delays for one second, and finally releases the semaphore once to allow another task to enter. Lines 28 to 38 process the returned results and handle HTTP request exceptions with fallback results.
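The class above relies on two types that are not shown in this post: the IThrottledHttpClient interface and the PrimeNumberResult class. The following is only a guess at their shape, inferred from how they are used here (a long number, and a string result because GetStringAsync() returns a string and the fallback value is "NA"). The actual definitions in the repository may differ.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumed shape of the typed client's contract, inferred from its usage in Main().
public interface IThrottledHttpClient
{
    Task<PrimeNumberResult[]> GetPrimeNumberResults(
        List<long> numbers,
        int requestLimit = 2,
        int limitingPeriodInSeconds = 1);
}

// Assumed result type: the number that was checked and the raw answer
// returned by the endpoint (or "NA" when the request failed).
public class PrimeNumberResult
{
    public PrimeNumberResult(long number, string isPrime)
    {
        Number = number;
        IsPrime = isPrime;
    }

    public long Number { get; }
    public string IsPrime { get; }
}
```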

Showtime 🎉

To run the program locally, we first need to spin up the Web API application, otherwise all responses will be the fallback result because the API endpoint is not available. If you pull the code from this GitHub repository, then you can navigate to the ThrottledWebApi folder and issue the command “dotnet run” in a terminal. Then the Web API endpoint is ready to serve you.

Now, we navigate to the ThrottledWebApi.ClientDemo folder, then open a new terminal and issue the command “dotnet run”. The messages start to print in the Console window. I recorded my screen in the following GIF image.

As seen from the image above, the tasks run two at a time, roughly once per second. Thus, we have successfully throttled the outbound HTTP requests.

Other Client Side Throttling Approaches

SemaphoreSlim allows us to set a limit on the number of callers in the critical section. Its WaitAsync() method also lets callers wait for a slot asynchronously, which is useful when many tasks are queued up. In contrast, we should not use Parallel.ForEach for asynchronous work: it does not await async delegates, and it is best suited to CPU-bound work.
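To see why Parallel.ForEach is a poor fit for asynchronous work, consider the small sketch below. The class and method names are made up for illustration. Because Parallel.ForEach accepts an Action, the async lambda is compiled as an async void method, so the loop returns as soon as each delegate reaches its first await instead of waiting for the work to finish.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demonstration; not part of the demo project.
public static class ParallelForEachPitfall
{
    // Returns how many async bodies had finished by the time Parallel.ForEach returned.
    public static int CompletedWhenLoopReturns()
    {
        var completed = 0;

        Parallel.ForEach(Enumerable.Range(0, 4), async i =>
        {
            await Task.Delay(200);                 // stand-in for an HTTP request
            Interlocked.Increment(ref completed);  // runs after the loop has already returned
        });

        return Volatile.Read(ref completed);
    }
}
```

The returned count is typically well below 4, which means the caller has no reliable way to know when the requests are done, let alone to limit how many are in flight.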

It is worth mentioning the Polly library, which includes a Bulkhead policy. With Polly, to throttle outgoing requests to a maximum of two at any point in time, we can do the following. Note that a bulkhead has no notion of a time period, and it rejects calls that exceed its parallelism and queue limits rather than delaying them, so you will need to tweak it to add the time period dimension.

var throttler = Policy.BulkheadAsync(maxParallelization: 2);

The last approach, RateGate, proposed by Jack Leitch, is also worth a try.

That’s all for today. Again, the complete .NET Core Console app is in this GitHub repository. Thanks for reading.
