GraphQL and gRPC

Big Picture

At a high level, inter-process communication comes in two styles, asynchronous and synchronous:

  • Asynchronous event-driven style: involves an event broker as a middleman.
  • Synchronous request-response style: includes several families of technologies:
    • RPC (Remote Procedure Call):
      • CORBA (Common Object Request Broker Architecture)
      • Java RMI (Remote Method Invocation)
    • SOAP (Simple Object Access Protocol)
    • REST (Representational State Transfer)
    • gRPC
    • GraphQL
    • Apache Thrift

Classic RPC technologies, built directly on top of TCP/UDP, are usually complex to implement. SOAP improved on them and can operate over HTTP, and many large companies still use SOAP for message exchange today. However, SOAP is limited by the complex format and specifications of its XML messaging, which gave rise to REST. REST is not a standard but a loosely defined architectural style. Its payload can be in any format (XML, JSON, etc.) specified in the header. As long as an API conforms to certain guidelines (criteria outlined in this article), we can consider it RESTful. REST has been steadily replacing SOAP over the past few years. In this post, I start with REST, then dive into gRPC and GraphQL.

REST

The de facto way to build microservices in the REST architectural style is to use HTTP with a JSON payload. JSON is human readable but not optimized for machine-to-machine communication, so there is room for compression. To emulate a web request to a REST service, one can use curl, a common command-line HTTP client.
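To make the "room for compression" point concrete, here is a minimal sketch that serializes a JSON payload and compresses it with zlib. The order records are hypothetical, and the exact sizes depend on the data; the only claim is that repetitive JSON text shrinks noticeably.

```python
import json
import zlib

# Hypothetical payload: 200 order records, as a REST endpoint might return them.
orders = [
    {"order_id": i, "status": "shipped", "currency": "USD"}
    for i in range(200)
]

# JSON repeats every field name in every record, which is readable but verbose.
raw = json.dumps(orders).encode("utf-8")

# A general-purpose compressor removes much of that repetition.
compressed = zlib.compress(raw)


def payload_sizes():
    """Return (uncompressed_bytes, compressed_bytes) for the sample payload."""
    return len(raw), len(compressed)


if __name__ == "__main__":
    raw_size, compressed_size = payload_sizes()
    print(f"raw JSON: {raw_size} bytes, zlib-compressed: {compressed_size} bytes")
```

A binary format such as protocol buffers goes after the same overhead differently, by not sending field names as text in the first place.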

REST has its shortcomings. For example, the interface between a REST client and server is not strongly typed. You can use the OpenAPI/Swagger specification to define types, but it is still not tightly integrated, and nothing enforces the format of the payload. As a result, RESTful services can be bulky, inefficient, and error-prone. gRPC and GraphQL emerged to address different challenges with REST. GraphQL operates over HTTP, and in a broad sense we can view it as a layer on top of REST. gRPC, on the other hand, operates over HTTP/2 and thereby inherits many of its advantages.

HTTP/2

HTTP/2 is the second major version of HTTP. It overcomes some issues with HTTP/1.1 around security, speed, etc. This post is a good rundown of the differences between HTTP/2 and HTTP/1.1, which account for many of the advantages of gRPC; a thorough discussion is beyond what this post can cover. One important difference is that in HTTP/2 all communication between a client and a server is performed over a single TCP connection that can carry any number of bidirectional flows of bytes. This is a large part of what makes gRPC a high-performance RPC framework. In HTTP/2, the key concepts to understand are:

  • Stream: a bidirectional flow of bytes within an established connection. A stream may carry one or more messages.
  • Frame: the smallest unit of communication in HTTP/2. Each frame contains a frame header, which at a minimum identifies the stream to which the frame belongs.
  • Message: a complete sequence of one or more frames that maps to a logical HTTP message, such as a request or a response.
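The three concepts above can be sketched with a toy model: each message is split into frames tagged with a stream id, frames from several streams share one connection, and the receiver reassembles each message by stream id. This is a simplified illustration, not the HTTP/2 wire format; the Frame class, the 4-byte frame size and the end_of_message flag are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class Frame:
    stream_id: int        # identifies the stream this frame belongs to
    payload: bytes        # a slice of the message
    end_of_message: bool  # toy stand-in for an end-of-message marker


def to_frames(stream_id, message, frame_size=4):
    """Split one message into frames that all carry the same stream id."""
    chunks = [message[i:i + frame_size]
              for i in range(0, len(message), frame_size)] or [b""]
    last = len(chunks) - 1
    return [Frame(stream_id, chunk, i == last) for i, chunk in enumerate(chunks)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one 'connection'."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Rebuild each message on the receiving side, keyed by stream id."""
    buffers, messages = {}, {}
    for frame in wire:
        buffers.setdefault(frame.stream_id, bytearray()).extend(frame.payload)
        if frame.end_of_message:
            messages[frame.stream_id] = bytes(buffers.pop(frame.stream_id))
    return messages


# Two hypothetical messages travelling on streams 1 and 3 of one connection.
wire = interleave(to_frames(1, b"GET /users"), to_frames(3, b"GET /orders/42"))
```

Because every frame names its stream, the receiver does not care in which order the streams take turns on the connection.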

A request is always initiated by the client. During the interaction, the client and server break messages down into frames, interleave them, and reassemble them on the other side. In this way HTTP/2 multiplexes messages and enables the following communication patterns:

  • Simple RPC: a single request and a single response;
  • Server streaming RPC: a single request message followed by multiple response messages;
  • Client streaming RPC: the client sends multiple messages and the server replies with one response message;
  • Bidirectional streaming RPC: the client sets up the connection by sending header frames. Once the connection is established, both client and server send messages simultaneously without waiting for the other to finish.

The streaming communication patterns fundamentally improve performance and give gRPC its duplex streaming capability.
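The four patterns differ only in whether each side sends one message or a stream of them. A rough way to picture this is with plain Python functions, where an iterator stands in for a stream. These are in-process functions with hypothetical names, not the gRPC API, and no network is involved.

```python
from typing import Iterable, Iterator


def simple_rpc(request: str) -> str:
    """One request in, one response out."""
    return f"hello {request}"


def server_streaming_rpc(request: str) -> Iterator[str]:
    """One request in, a stream of responses out."""
    for i in range(3):
        yield f"{request} update {i}"


def client_streaming_rpc(requests: Iterable[int]) -> int:
    """A stream of requests in, one response out once the stream ends."""
    return sum(requests)


def bidirectional_rpc(requests: Iterable[str]) -> Iterator[str]:
    """Streams both ways: each response is produced as its request arrives,
    without waiting for the client to finish sending."""
    for request in requests:
        yield request.upper()


if __name__ == "__main__":
    print(simple_rpc("world"))
    print(list(server_streaming_rpc("order")))
    print(client_streaming_rpc(iter([1, 2, 3])))
    print(list(bidirectional_rpc(iter(["ping", "pong"]))))
```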

gRPC

At the protocol level, gRPC has the following advantages over REST:

  • well-defined service interface and schema
  • strongly typed data
  • duplex streaming (thanks to HTTP/2)
  • built-in commodity features (e.g. authentication, encryption, resiliency, service discovery, etc.)

I previously touched on the gRPC protocol in the context of Envoy and etcd. Envoy uses gRPC for its control plane, where it fetches configuration from management server(s), and in filters, such as those for rate limiting or authorization checks. etcd exposes a gRPC API that its client utility communicates with.

Since the typical use case of gRPC is internal communication, I have not been able to find an online playground. To get a taste of gRPC client-server interaction, just play with etcd on macOS. gRPC is language neutral. To start developing, refer to the tutorials for different languages (e.g. Golang, Python).

When developing a gRPC application, the first thing to do is define a service interface in an IDL (interface definition language). gRPC uses protocol buffers as its IDL. Protocol buffers are a language-agnostic, platform-neutral, extensible mechanism for serializing structured data. From the service interface definition, we can generate the server-side code, known as a server skeleton, and the client-side code, known as a client stub. The client can then invoke the methods specified in the service interface definition remotely, as easily as making a local function call.
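The stub and skeleton idea can be illustrated without any gRPC tooling. In the sketch below, a Python abstract class plays the role of the service interface, JSON stands in for protocol buffers, and a plain function call stands in for the network. All names (Greeter, say_hello, handle) are hypothetical; real gRPC code is generated from a .proto file and looks different.

```python
import json
from abc import ABC, abstractmethod


class Greeter(ABC):
    """Service interface. With gRPC this would be declared in a .proto file."""

    @abstractmethod
    def say_hello(self, name: str) -> str: ...


class GreeterServer(Greeter):
    """Server-side business logic that fills in the skeleton."""

    def say_hello(self, name: str) -> str:
        return f"Hello, {name}!"


def handle(server: Greeter, wire_request: bytes) -> bytes:
    """Skeleton side: decode the request, dispatch to the method, encode the reply."""
    call = json.loads(wire_request)
    if call["method"] != "say_hello":  # only methods in the interface are exposed
        raise ValueError(f"unknown method: {call['method']}")
    result = server.say_hello(*call["args"])
    return json.dumps({"result": result}).encode("utf-8")


class GreeterStub(Greeter):
    """Client stub: same interface, but each call is serialized and sent away."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> response bytes

    def say_hello(self, name: str) -> str:
        request = json.dumps({"method": "say_hello", "args": [name]}).encode("utf-8")
        return json.loads(self._transport(request))["result"]


# Wire the stub to the server in-process; a real transport would be HTTP/2.
server = GreeterServer()
stub = GreeterStub(lambda request: handle(server, request))

if __name__ == "__main__":
    print(stub.say_hello("gRPC"))  # reads like a local call
```

The caller only ever sees `stub.say_hello(...)`; serialization and transport are hidden behind the shared interface, which is the property the generated stub and skeleton provide.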

gRPC also has some disadvantages. Its ecosystem is still relatively small, and once a service interface is published, it has to be maintained across versions.

GraphQL

In most use cases that combine gRPC and GraphQL, GraphQL serves external-facing services/APIs, while the internal services backing those APIs are implemented with gRPC. Honeypot created a good documentary on GraphQL, available here.

We use GraphQL for external-facing services because it gives the API client the ability to query. An SQL query lets a client filter the requested data by conditions and choose which columns are returned. This capability is missing from the REST style guidelines. One may implement a RESTful service to support a client’s query requirements, but GraphQL standardizes this capability with typed data, thereby preventing unnecessary network round trips and over- and under-fetching of data.
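The two query abilities mentioned above, filtering rows by a condition and choosing which columns come back, can be sketched over an in-memory list. The records and the query helper below are hypothetical and only illustrate the idea; a GraphQL server does this against a typed schema.

```python
# Hypothetical in-memory data; a real service would read from a data store.
COUNTRIES = [
    {"code": "CA", "name": "Canada", "capital": "Ottawa", "currency": "CAD"},
    {"code": "JP", "name": "Japan", "capital": "Tokyo", "currency": "JPY"},
    {"code": "FR", "name": "France", "capital": "Paris", "currency": "EUR"},
]


def query(records, where, fields):
    """Keep records matching `where`, and return only the requested fields."""
    return [
        {field: record[field] for field in fields}
        for record in records
        if where(record)
    ]


# The client asks for exactly three fields of the matching countries.
result = query(
    COUNTRIES,
    where=lambda record: record["code"] in ["CA"],
    fields=["code", "name", "capital"],
)

if __name__ == "__main__":
    print(result)
```

The currency field is never sent because the client did not ask for it, which is how over-fetching is avoided.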

The official site is a good reference for learning. To understand how queries work, one needs to know the concepts around queries and mutations, and schemas and types (scalars, variables, fragments, interfaces, unions).

This page lists a number of publicly available GraphQL services. For example, a country information service is available here. On the web page you can enter a query like this:

query myCountry {
  countries(filter:{code:{in:"CA"}}) {
    code,
    name,
    capital
  }
}

The query above (named myCountry) asks for entries from the countries field. It filters the result on the condition that the country code is in the given set of values, here just “CA”. The response should include the code, name and capital fields. The web page looks like this:

GraphQL runs over HTTP, so I can emulate the call with curl and get the same result:

curl --request POST https://countries.trevorblades.com/ \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : "query myCountry { countries (filter: { code: { in: \"CA\" } }) {code name capital} }"
}'
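The same POST can be built from Python with only the standard library. This is a sketch that mirrors the curl command above; build_request and run are hypothetical helper names, and it assumes the endpoint accepts requests from urllib's default client. Nothing is sent until run is called.

```python
import json
import urllib.request

URL = "https://countries.trevorblades.com/"
QUERY = (
    'query myCountry { countries(filter: { code: { in: "CA" } }) '
    "{ code name capital } }"
)


def build_request(query: str, url: str = URL) -> urllib.request.Request:
    """Wrap the GraphQL query in a JSON body, as the curl command does."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(query: str) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(query), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    print(run(QUERY))
```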

The return prints the same result:

The documentation has a page on library options for different languages, such as Graphene and Ariadne for Python. You can use Ariadne and Flask to build a GraphQL API, as this post suggests.

This page has a better example that highlights how GraphQL lets a client ask multiple questions at the same time, and how the response includes the different pieces of the answer. A GraphQL backend often needs to connect to different downstream systems, which requires a proxy service that connects to disparate data sources. Examples of services with such capabilities include Apollo, AWS AppSync, and Apigee.
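A toy version of that proxy role might look like the sketch below: one request names several root fields, each field is resolved by a different downstream fetcher, and the pieces are assembled into a single response. The fetchers, the RESOLVERS table and the request shape (a dict standing in for a parsed query) are all hypothetical; this is not a GraphQL parser or any library's API.

```python
def fetch_user(user_id):
    """Hypothetical downstream system #1, e.g. a user service."""
    return {"id": user_id, "name": "Ada", "email": "ada@example.com"}


def fetch_orders(user_id):
    """Hypothetical downstream system #2, e.g. an order database."""
    return [{"id": 1, "total": 30}, {"id": 2, "total": 12}]


# Each root field of the API is backed by its own data source.
RESOLVERS = {"user": fetch_user, "orders": fetch_orders}


def execute(request, user_id):
    """Resolve every requested root field and keep only the requested sub-fields.

    `request` maps a root field name to the list of sub-fields the client wants.
    """
    def pick(item, subfields):
        return {key: item[key] for key in subfields}

    data = {}
    for field, subfields in request.items():
        resolved = RESOLVERS[field](user_id)
        if isinstance(resolved, list):
            data[field] = [pick(item, subfields) for item in resolved]
        else:
            data[field] = pick(resolved, subfields)
    return {"data": data}


# One round trip answers two questions: who is the user, and what did they order?
response = execute({"user": ["name"], "orders": ["total"]}, user_id=7)

if __name__ == "__main__":
    print(response)
```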

Summary

For API protocols, we often compare REST, gRPC and GraphQL. REST is loosely defined and has been widely adopted over the past few years. Its use cases have diverged into two areas: external-facing APIs and internal service-to-service communication. gRPC and GraphQL are relatively new, and each is a good fit for one of these use cases.

Microservice pattern based on gRPC and GraphQL (source: “gRPC up & running” by Indrasiri & Kuruppu)

gRPC over HTTP/2 is often used for internal communication between microservices. GraphQL offers query capability and is therefore often used to face external clients.