When Google introduced gRPC in 2015, one of the most significant architectural decisions was building it on top of HTTP/2 rather than the widely adopted HTTP/1.1. This was not about following a trend; it was a deliberate choice that fundamentally shapes how gRPC performs and behaves. Let’s dig deeper into why this decision was made and how it impacts real-world applications.
Before diving into gRPC’s specific needs, let’s understand what HTTP/2 brings to the table that HTTP/1.1 doesn’t.
Binary vs Text Protocol
HTTP/1.1 is a text-based protocol. When you make a request, it travels as human-readable lines: a request line (method, path, version) followed by header lines, each terminated by CRLF.
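As a rough illustration, the following Python sketch assembles such a request by hand. The path /api/users/42 and the host example.com are placeholders chosen for the example, not values from a real service.

```python
# Build a minimal HTTP/1.1 request by hand to show that it is plain text.
# The path and host used below are placeholders, not a real service.
def build_http1_request(method: str, path: str, host: str) -> bytes:
    lines = [
        f"{method} {path} HTTP/1.1",  # request line
        f"Host: {host}",              # Host is mandatory in HTTP/1.1
        "Accept: application/json",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_http1_request("GET", "/api/users/42", "example.com")
print(request.decode("ascii"))
```

Every byte of the request is readable text, which is convenient for debugging but means the receiver has to scan for line endings and parse strings.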
HTTP/2, on the other hand, is a binary protocol. The same request is encoded into binary frames that are more compact and faster to parse. This improves performance, especially when dealing with thousands of concurrent connections (such as in a microservices architecture).
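To make the binary framing concrete, here is a minimal sketch that packs and unpacks the fixed 9-byte frame header defined by the HTTP/2 specification (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream ID). The helper names pack_frame and unpack_header are made up for this example, and a real implementation would also enforce the negotiated maximum frame size.

```python
import struct

# Frame type and flag codes from the HTTP/2 specification.
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix payload with the fixed 9-byte HTTP/2 frame header."""
    length = len(payload)
    if length >= 2**24:
        raise ValueError("payload too large for a 24-bit length field")
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # top 8 bits of the 24-bit length
        length & 0xFFFF,         # low 16 bits of the length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # the high bit is reserved and must be 0
    )
    return header + payload


def unpack_header(data: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from the first 9 bytes."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    return (hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF


frame = pack_frame(FRAME_DATA, FLAG_END_STREAM, 1, b"hello")
print(frame.hex())
```

Because every field sits at a fixed offset, the receiver can read the header with a single unpack call instead of scanning for delimiters.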
Multiplexing
The most interesting and important feature of HTTP/2 is multiplexing. In HTTP/1.1, each TCP connection can handle only one request at a time. If you want to make multiple requests, you either have to:
Wait for each request to complete sequentially
Use HTTP pipelining
Open multiple TCP connections (which browsers limit to 6-8 per domain)
HTTP/2 allows multiple requests and responses to be interleaved over a single TCP connection using streams. Each stream has a unique ID, and frames can be sent for different streams without blocking each other.
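On a timeline, the multiplexed requests overlap instead of queuing behind one another. On the wire, the connection carries a single sequence of frames, each tagged with the ID of the stream it belongs to, so frames from different streams can alternate freely.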
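The interleaving can be sketched with a small simulation. This is a toy model under stated assumptions: the payloads are invented, and the strict round-robin order is a simplification, since real HTTP/2 endpoints schedule frames according to flow control and their own prioritization logic.

```python
from collections import defaultdict
from itertools import zip_longest

# Each logical stream is a list of payload chunks. Client-initiated
# HTTP/2 streams use odd IDs, hence 1, 3, 5. The payloads are made up.
streams = {
    1: [b"users-part-1", b"users-part-2", b"users-part-3"],
    3: [b"orders-part-1"],
    5: [b"stats-part-1", b"stats-part-2"],
}


def interleave(streams: dict[int, list[bytes]]) -> list[tuple[int, bytes]]:
    """Round-robin across streams, tagging each chunk with its stream ID."""
    wire = []
    for round_ in zip_longest(*streams.values()):
        for stream_id, chunk in zip(streams.keys(), round_):
            if chunk is not None:
                wire.append((stream_id, chunk))
    return wire


def demultiplex(wire: list[tuple[int, bytes]]) -> dict[int, list[bytes]]:
    """Regroup frames by stream ID, as the receiving endpoint would."""
    received = defaultdict(list)
    for stream_id, chunk in wire:
        received[stream_id].append(chunk)
    return dict(received)


wire = interleave(streams)
for stream_id, chunk in wire:
    print(f"stream {stream_id}: {chunk!r}")
```

The short stream 3 finishes in the first round without waiting for stream 1, and demultiplex(wire) recovers each stream intact because every frame carries its stream ID.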
gRPC’s Core Requirements
Bidirectional Communication
gRPC supports four types of RPC calls:
Unary: Traditional request-response
Server streaming: Server sends multiple responses
Client streaming: Client sends multiple requests
Bidirectional streaming: Both sides stream data simultaneously
HTTP/1.1 simply cannot handle streaming scenarios effectively. While techniques like Server-Sent Events (SSE) or WebSockets exist, they’re either limited or require protocol upgrades that break the HTTP model.
HTTP/2’s stream-based architecture naturally supports these patterns. A bidirectional streaming gRPC call maps perfectly to an HTTP/2 stream where both client and server can send frames asynchronously.
Here’s a practical example…
Imagine a chat application where multiple users are sending messages simultaneously, modeled as a bidirectional streaming RPC that exchanges ChatMessage values in both directions.
With HTTP/2, this becomes a single stream where:
Client sends DATA frames containing serialized ChatMessage
Server sends DATA frames back with messages from other users
All of this happens over one TCP connection with proper flow control
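The shape of such a handler can be sketched in plain Python without any gRPC dependency. The ChatMessage fields and the chat function are assumptions made for illustration; in a real service ChatMessage would be generated from a .proto file. Note that this generator processes messages in lockstep, whereas a real bidirectional stream lets both sides send independently.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class ChatMessage:
    # Stand-in for a generated protobuf message; the fields are assumptions.
    user: str
    text: str


def chat(incoming: Iterable[ChatMessage]) -> Iterator[ChatMessage]:
    """Consume a stream of messages and lazily yield a stream of replies."""
    for message in incoming:
        # A real server would fan out messages from other users; echoing an
        # acknowledgement keeps the sketch small.
        yield ChatMessage(user="server", text=f"{message.user} said: {message.text}")


def client_messages() -> Iterator[ChatMessage]:
    yield ChatMessage("alice", "hi")
    yield ChatMessage("alice", "anyone there?")


replies = list(chat(client_messages()))
for reply in replies:
    print(reply.text)
```

Each yielded value corresponds to one serialized message carried in DATA frames on the same stream, in either direction.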
High Throughput and Low Latency
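In microservices architectures, services often make dozens of calls to other services to fulfill a single user request. With HTTP/1.1, this creates a bottleneck: each connection carries one request at a time, so the calls either queue behind one another or need a connection each.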
Even with connection pooling, you’re limited by the number of concurrent connections and the head-of-line blocking problem.
With gRPC over HTTP/2, Service A can make all these calls (say, to Services B, C, and D) concurrently over a single connection.
The multiplexing ensures that a slow response from Service B doesn’t block the faster responses from Services C and D.
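The latency effect can be approximated with a small asyncio simulation. It does not speak HTTP/2; it only models the difference between waiting for calls one after another and waiting for them together. The service names and response times are illustrative assumptions, not measurements.

```python
import asyncio
import time

# Simulated response times in seconds; the service names and numbers are
# illustrative assumptions, not measurements.
LATENCIES = {"service-b": 0.05, "service-c": 0.02, "service-d": 0.01}


async def call(service: str) -> str:
    await asyncio.sleep(LATENCIES[service])  # stands in for a network round trip
    return f"{service}: ok"


async def sequential() -> float:
    start = time.perf_counter()
    for service in LATENCIES:
        await call(service)
    return time.perf_counter() - start


async def concurrent() -> float:
    start = time.perf_counter()
    await asyncio.gather(*(call(service) for service in LATENCIES))
    return time.perf_counter() - start


sequential_time = asyncio.run(sequential())
concurrent_time = asyncio.run(concurrent())
print(f"sequential: {sequential_time:.3f}s, concurrent: {concurrent_time:.3f}s")
```

Sequential time approaches the sum of the individual latencies, while concurrent time approaches the slowest single call.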
Efficient Header Handling
gRPC makes extensive use of headers for metadata like authentication tokens, tracing information, and custom headers. In a typical microservices call chain, these headers are propagated through multiple services.
HTTP/1.1 sends headers as plain text with every request, even when they are identical to the headers of the previous request.
In a service mesh with hundreds of requests per second, this overhead becomes significant.
HTTP/2’s HPACK compression maintains a dynamic table of previously seen headers. After the first request, common headers are referenced by index rather than sent in full.
For the repetitive header sets typical of service-to-service traffic, this can reduce header overhead by roughly 85-90%.
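The indexing idea can be shown with a toy model. To be clear, this is not the HPACK wire format: real HPACK also has a predefined static table, Huffman coding, and table size limits with eviction. The ToyHeaderTable class and the token value are invented for the example.

```python
# Toy model of index-based header compression. This is NOT the HPACK wire
# format: real HPACK also has a static table, Huffman coding, and eviction.
class ToyHeaderTable:
    def __init__(self) -> None:
        self._index: dict[tuple[str, str], int] = {}

    def encode(self, headers: dict[str, str]) -> list[bytes]:
        encoded = []
        for name, value in headers.items():
            key = (name, value)
            if key in self._index:
                # Seen before: send only a small index reference.
                encoded.append(bytes([self._index[key]]))
            else:
                # First sighting: send the literal and remember it.
                self._index[key] = len(self._index) + 1
                encoded.append(f"{name}: {value}".encode("ascii"))
        return encoded


def size(fields: list[bytes]) -> int:
    return sum(len(field) for field in fields)


# The token is a placeholder value.
headers = {
    "authorization": "Bearer example-token-0123456789",
    "content-type": "application/grpc",
    "te": "trailers",
}
table = ToyHeaderTable()
first = size(table.encode(headers))
second = size(table.encode(headers))
print(f"first request: {first} bytes, repeat request: {second} bytes")
```

In this model the repeat request shrinks to one byte per header. Headers whose values change on every call, such as per-request trace IDs, would not benefit in the same way.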
Performance
Connection Management
Consider a microservices application with 10 services, each making an average of 5 calls to other services under load.
HTTP/1.1 Scenario:
Each service needs connection pools to every other service
With 6 connections per pool, that’s 10 × 9 × 6 = 540 TCP connections
Each connection has TCP overhead, OS socket limits, and connection establishment latency
HTTP/2 Scenario:
Each service maintains 1-2 connections to every other service, for a total of 10 × 9 × 1 = 90 to 10 × 9 × 2 = 180 TCP connections
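The arithmetic for both scenarios can be checked with a few lines of Python. The function name total_connections is made up for this example, and it counts connections per direction, as the figures above do.

```python
def total_connections(services: int, per_peer: int) -> int:
    """Each service keeps per_peer connections to every other service."""
    return services * (services - 1) * per_peer


http1_total = total_connections(10, 6)   # pools of 6 connections
http2_low = total_connections(10, 1)     # one multiplexed connection per peer
http2_high = total_connections(10, 2)    # two multiplexed connections per peer
print(http1_total, http2_low, http2_high)
```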
In practice, the latency benefits of HTTP/2 for gRPC are most noticeable in:
High-frequency, low-payload requests: Microservices often make many small calls. HTTP/2’s binary framing costs less per request than HTTP/1.1’s text parsing.
Concurrent requests: When a service needs to aggregate data from multiple sources, HTTP/2’s multiplexing provides a significant speedup.
Long-lived connections: gRPC services maintain persistent connections for streaming. HTTP/2’s connection reuse is more efficient than HTTP/1.1’s connection establishment overhead.
Real-world numbers that I have observed
20-40% latency reduction for concurrent requests
50-80% reduction in connection overhead
2-3x improvement in requests per second for small payloads
Streaming
Server Streaming Example
Consider a log streaming service, where a client subscribes once and the server pushes log entries as they are produced.
HTTP/1.1 Approach:
Long polling with timeouts
Chunked transfer encoding
Complex client-side reconnection logic requiring state management
HTTP/2 Approach:
Natural streaming with DATA frames
Built-in flow control via WINDOW_UPDATE frames (explained below)
Clean connection management
The HTTP/2 implementation is not only simpler but also more robust and efficient.
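The server-streaming pattern can be sketched as a plain Python generator, with no gRPC dependency. LogEntry and stream_logs are hypothetical names; a real service would use a message type generated from a .proto definition and an actual log source.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterator


@dataclass
class LogEntry:
    # Stand-in for a generated protobuf message; field names are assumptions.
    sequence: int
    line: str


def stream_logs(service: str) -> Iterator[LogEntry]:
    """Yield log entries one at a time instead of building a full response."""
    sequence = 0
    while True:  # an open-ended stream; the consumer decides when to stop
        sequence += 1
        yield LogEntry(sequence, f"[{service}] event {sequence}")


# The client reads the first three entries and then stops consuming.
entries = list(islice(stream_logs("billing"), 3))
for entry in entries:
    print(entry.line)
```

The producer never builds a complete response in memory, and the consumer can stop at any point, which is the behavior that long polling has to emulate with timeouts and reconnection logic.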
Flow Control in Action
HTTP/2’s flow control prevents fast producers from overwhelming slow consumers. In gRPC streaming:
Client opens stream with the initial window size (65,535 bytes by default)
Server sends DATA frames up to the window limit
Client processes data and sends a WINDOW_UPDATE to increase the available window
Server continues sending based on the updated window
The window size controls how much unacknowledged data can be “in flight” between sender and receiver at any given moment. Think of it as a buffering limit, not a message size limit.
This prevents memory exhaustion and provides natural backpressure, something that’s difficult to achieve cleanly with HTTP/1.1. Also, both client and server can send a WINDOW_UPDATE frame depending on their rate of consumption and production.
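The credit-based loop described in the steps above can be simulated in a few lines. This sketch is simplified on purpose: it tracks a single stream-level window, ignores the separate connection-level window and the maximum frame size, and assumes the receiver returns all credit immediately. The Sender class is invented for the example.

```python
INITIAL_WINDOW = 65_535  # HTTP/2 default initial flow-control window, in bytes


class Sender:
    def __init__(self, total_bytes: int) -> None:
        self.remaining = total_bytes
        self.window = INITIAL_WINDOW

    def send(self) -> int:
        """Send as much as the current window allows; return bytes sent."""
        sent = min(self.remaining, self.window)
        self.remaining -= sent
        self.window -= sent
        return sent

    def on_window_update(self, increment: int) -> None:
        """Receiver granted more credit via a WINDOW_UPDATE frame."""
        self.window += increment


sender = Sender(total_bytes=200_000)
rounds = []
while sender.remaining > 0:
    sent = sender.send()
    rounds.append(sent)
    # The receiver has consumed the data and returns the credit.
    sender.on_window_update(sent)
print(rounds)
```

No round ever exceeds the window, so a slow receiver that delays its WINDOW_UPDATE automatically pauses the sender.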
Trade-offs
When HTTP/2 Might Not Be Ideal
Single Request Scenarios - For simple, one-off requests, HTTP/1.1 might have lower latency due to:
Simpler protocol negotiation
Less connection setup overhead
Broader proxy support (though this is diminishing)
Resource-Constrained Environments - HTTP/2 requires more memory for:
HPACK compression tables
Stream state management
Flow control windows
In embedded systems or extremely memory-constrained environments, this overhead might be significant.
Proxy and Infrastructure Considerations
Load Balancer Compatibility - Not all load balancers handle HTTP/2 efficiently:
Some terminate HTTP/2 and forward as HTTP/1.1
Others don’t properly handle gRPC’s use of HTTP/2 trailers
Stream-aware load balancing is still evolving
Debugging Complexity - HTTP/2’s binary nature makes debugging more challenging:
Network captures require specialized tools
Stream interleaving makes request/response correlation complex
Traditional HTTP debugging tools may not work
Protocol Buffer Integration
And of course, there is the most common pairing: protobuf. gRPC’s use of Protocol Buffers pairs exceptionally well with HTTP/2.
Protobuf’s binary serialization is naturally aligned with HTTP/2’s binary frames:
No text-to-binary conversion overhead
Efficient frame packing
Compact payloads that complement HPACK’s compression of the headers
Schema Evolution
Protobuf’s schema evolution capabilities work well with HTTP/2’s header compression:
Feature flags and capabilities can be communicated compactly
Footnotes
gRPC’s adoption of HTTP/2 was a strategic decision that enables the framework’s core value propositions - performance, functionality, and scalability.
If you are building distributed systems, understanding this relationship between gRPC and HTTP/2 is crucial. It helps you build performant production environments.
As you design and implement gRPC services, keep these underlying mechanics in mind – they will inform your decisions about service boundaries, streaming strategies, and performance optimization.
The key is understanding how to leverage these capabilities effectively in your specific use case.