Sensibly Default

2024-10-19

There are two programming principles that I hold dear: the principle of least surprise and providing sensible defaults. I’ve recently been working within the GraphQL ecosystem, and the number of violations of both has frustrated me. This will be a little bit ranty.

GraphQL Is Insecure By Default

Well, at least in the JavaScript implementation. Cue the bug bounties:

It is well known that there are several easily accessible Denial of Service vectors on GraphQL endpoints.

Unlimited Power Tokens

One of the most egregious was CVE-2022-37734 in graphql-java, where a query constructed with a very large number of tokens puts pressure on the lexer - a variant of the Billion Laughs attack. The fix stops parsing once the configured maximum number of tokens (including whitespace) is reached, extending the original fix (https://github.com/graphql-java/graphql-java/pull/2549), which added a default limit of 15,000 tokens.

However, in the reference implementation graphql/graphql-js, a similar fix was added that did not include a default limit - by default it is left undefined, and so the DoS mitigation is not active by default.

Some other implementations also lack this:

There are no doubt other implementations of the spec that don’t set this by default.

In many cases this can be mitigated by setting a limit on the size of the request body, but having it configured and built into the lexer is vital.
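To make the difference concrete, here is a minimal sketch of a token cap. It is not the lexer from graphql-js or graphql-java: the countTokens function and its simplified TOKEN pattern are hypothetical, made up purely for illustration. The point is that when the limit is left undefined, the whole document gets scanned no matter how large it is.

```javascript
// Hypothetical, simplified token pattern: names, numbers, strings, spreads
// and punctuation. A real GraphQL lexer handles more cases than this.
const TOKEN =
  /[_A-Za-z][_0-9A-Za-z]*|-?\d+(?:\.\d+)?|"(?:[^"\\]|\\.)*"|\.\.\.|[!$&()\[\]:=@{|}]/g;

// Count tokens, aborting as soon as maxTokens is exceeded.
// With maxTokens left undefined there is no cap at all.
function countTokens(source, maxTokens) {
  let count = 0;
  // matchAll is lazy, so scanning stops at the point where we throw.
  for (const _match of source.matchAll(TOKEN)) {
    count += 1;
    if (maxTokens !== undefined && count > maxTokens) {
      throw new Error(`Document contains more than ${maxTokens} tokens; parsing cancelled.`);
    }
  }
  return count;
}

const oversized = "{ " + "hello ".repeat(20000) + "}";

// No limit configured: every token is scanned.
console.log(countTokens(oversized));

// Limit configured: the work is cut short.
try {
  countTokens(oversized, 15000);
} catch (err) {
  console.log(err.message);
}
```

A body size limit in front of the endpoint bounds the same work indirectly, but only a cap in the lexer itself travels with the library wherever it is embedded.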

Aliases and Directives

As mentioned in the blog post earlier, alias overloading and directive overloading are less easily solved problems.

In short, alias overloading performs the same operation multiple times within the same request by giving each copy a different alias - effectively batching. Directive overloading “spams” the same directive multiple times in the same operation.
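As a rough illustration of what these payloads look like, the sketch below builds both kinds of query as strings. The helper names aliasOverload and directiveOverload, the hello field and the aa directive are all hypothetical; real attacks repeat the pattern thousands of times rather than three.

```javascript
// Repeat one field under many different aliases.
function aliasOverload(field, times) {
  const selections = Array.from({ length: times }, (_, i) => `a${i}: ${field}`);
  return `query { ${selections.join(" ")} }`;
}

// Stack the same directive on a single field many times.
function directiveOverload(field, directive, times) {
  const directives = Array.from({ length: times }, () => `@${directive}`);
  return `query { ${field} ${directives.join(" ")} }`;
}

console.log(aliasOverload("hello", 3));
// query { a0: hello a1: hello a2: hello }

console.log(directiveOverload("hello", "aa", 3));
// query { hello @aa @aa @aa }
```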

To mitigate these, we need to traverse the parsed query AST, count the aliases and directives, and reject any query that exceeds a limit. Again, this is something that most implementations do not have guards for by default, often falling back to query complexity analysis, as well as complex rate limiting implementations, such as GitHub’s. I’ll wait while you digest that particular document. However, there be dragons: GraphQL Incorrect Cost Handling disclosed to Shopify indicates that this is a hard thing to get right.
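The traversal itself is not much code. Here is a minimal sketch, assuming nodes shaped like the graphql/graphql-js AST (a kind string, an optional alias on Field nodes, child nodes stored in properties and arrays). The function name countAliasesAndDirectives, the hand-built sampleAst and the limits are hypothetical, and the sketch does not expand fragment spreads, which a real guard has to account for.

```javascript
// Walk every node and tally aliased fields and directive usages.
function countAliasesAndDirectives(node, totals = { aliases: 0, directives: 0 }) {
  if (Array.isArray(node)) {
    for (const child of node) countAliasesAndDirectives(child, totals);
    return totals;
  }
  if (node === null || typeof node !== "object") return totals;

  if (node.kind === "Field" && node.alias) totals.aliases += 1;
  if (node.kind === "Directive") totals.directives += 1;

  for (const [key, value] of Object.entries(node)) {
    // `loc` carries source positions rather than child nodes, so skip it.
    if (key !== "loc") countAliasesAndDirectives(value, totals);
  }
  return totals;
}

// Hand-built stand-in for: query { a0: hello a1: hello @aa }
const name = (value) => ({ kind: "Name", value });
const sampleAst = {
  kind: "Document",
  definitions: [
    {
      kind: "OperationDefinition",
      operation: "query",
      selectionSet: {
        kind: "SelectionSet",
        selections: [
          { kind: "Field", alias: name("a0"), name: name("hello"), directives: [] },
          {
            kind: "Field",
            alias: name("a1"),
            name: name("hello"),
            directives: [{ kind: "Directive", name: name("aa") }],
          },
        ],
      },
    },
  ],
};

// Hypothetical limits; pick values that suit your schema.
const limits = { aliases: 1, directives: 5 };
const totals = countAliasesAndDirectives(sampleAst);
const rejected = totals.aliases > limits.aliases || totals.directives > limits.directives;
console.log(totals, rejected ? "rejected" : "accepted");
```

Counting is the easy part. Deciding what each field should cost is where the dragons live.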

Some popular GraphQL frameworks such as RedwoodJS and Apollo provide some defaults, but in the case of the latter, this is an Enterprise-only feature.

Again, this kind of thing is surprising.

Information Disclosure By Default

Many GraphQL implementations enable schema introspection by default, which allows attackers to perform easy reconnaissance.

It is considered good practice to disable introspection in production, but doing so requires an explicit opt-in. The linked docs from Apollo show how for their framework; in vanilla graphql/graphql-js it looks like this:

import { createHandler } from "graphql-http/lib/use/express";
import { NoSchemaIntrospectionCustomRule } from "graphql";

const graphQLHandler = createHandler({
  // ...
  validationRules: [NoSchemaIntrospectionCustomRule],
});

I kind of understand why this is enabled by default - it is helpful in development - but like a lot of things in GraphQL, it feels like a loaded footgun.

Introspection Abuse

Besides introspection leaking data, it can also be used in attacks that result in deeply recursive queries. As a result, controlling query depth is also required. In graphql-python/graphene, for example, this is something that needs to be enabled explicitly through an additional validator.
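Measuring depth is a short recursive walk. Below is a minimal sketch, again assuming graphql/graphql-js style nodes where nested selections live under selectionSet.selections. The queryDepth function, the field helper and the limit of 3 are hypothetical; the sketch does not follow fragment spreads and counts an inline fragment as a level of its own.

```javascript
// Depth = number of nested selection sets below (and including) this node.
function queryDepth(node) {
  if (!node || !node.selectionSet) return 0;
  let deepest = 0;
  for (const selection of node.selectionSet.selections) {
    deepest = Math.max(deepest, queryDepth(selection));
  }
  return 1 + deepest;
}

// Small helper for building Field nodes by hand.
const field = (fieldName, selections) => ({
  kind: "Field",
  name: { kind: "Name", value: fieldName },
  ...(selections ? { selectionSet: { kind: "SelectionSet", selections } } : {}),
});

// Stand-in for: { __schema { types { fields { type { name } } } } }
const introspectionOperation = {
  kind: "OperationDefinition",
  operation: "query",
  selectionSet: {
    kind: "SelectionSet",
    selections: [
      field("__schema", [
        field("types", [field("fields", [field("type", [field("name")])])]),
      ]),
    ],
  },
};

const maxDepth = 3; // hypothetical limit
const depth = queryDepth(introspectionOperation);
console.log(depth > maxDepth ? `rejected at depth ${depth}` : "accepted");
```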

What Does A “Secure By Default” Endpoint Look Like Anyway?

Taking graphql/graphql-js as the reference implementation, let’s take the “getting started” example and work on making it secure.

Here is the starting example:

import express from "express";
import morgan from "morgan";
import { createHandler } from "graphql-http/lib/use/express";
import { schema } from "./schema";

const root = {
  hello: () => "Hello, world!",
};

const app = express();

app.use(morgan("common"));
app.use("/graphql", createHandler({ schema, rootValue: root }));
app.listen(3000, () => console.log("Server running on port 3000"));

First of all, we need to protect ourselves against large request bodies. Straightforward enough with body-parser:

 import morgan from "morgan"
 import { createHandler } from "graphql-http/lib/use/express"
 import { schema } from "./schema"
+import bodyParser from "body-parser"

 const root = {
     hello: () => "Hello, world!"
@@ -10,5 +11,7 @@ const root = {
 const app = express()

 app.use(morgan("common"))
+app.use(bodyParser.json({ limit: "64kb" }))
+app.use(bodyParser.urlencoded({ extended: true, limit: "64kb" }))
 app.use("/graphql", createHandler({ schema, rootValue: root }))
 app.listen(3000, () => console.log("Server running on port 3000"))

Next, we need to add some default validation rules and set maxTokens to some sensible value:

 import bodyParser from "body-parser"
+import { NoSchemaIntrospectionCustomRule, parse, specifiedRules } from "graphql";

-app.use(morgan("common"))
-app.use(bodyParser.json({ limit: "64kb" }))
-app.use(bodyParser.urlencoded({ extended: true, limit: "64kb" }))
-app.use("/graphql", createHandler({ schema, rootValue: root }))
-app.listen(3000, () => console.log("Server running on port 3000"))
+const handler = createHandler({
+  schema,
+  rootValue: root,
+  parse: (query) => parse(query, { maxTokens: 5000 }),
+  validationRules: [
+    ...specifiedRules,
+    NoSchemaIntrospectionCustomRule,
+  ],
+});
+
+app.use(morgan("common"));
+app.use(bodyParser.json({ limit: "64kb" }));
+app.use(bodyParser.urlencoded({ extended: true, limit: "64kb" }));
+app.post("/graphql", handler);
+app.listen(3000, () => console.log("Server running on port 3000"));

Now, we need to add some specific guards. The nice people over at escape.tech have created a practically mandatory package - graphql-armor - that exposes some helpful rules to mitigate the above:

   validationRules: [
     ...specifiedRules,
     NoSchemaIntrospectionCustomRule,
+    maxDepthRule({
+      exposeLimits: false,
+    }),
+    costLimitRule({
+      exposeLimits: false,
+    }),
+    maxDirectivesRule({
+      n: config.graphqlMaxDirectives,
+      exposeLimits: false,
+    }),
+    maxAliasesRule({
+      n: config.graphqlMaxAliases,
+      exposeLimits: false,
+    }),
   ],

We set exposeLimits to false to prevent leaking information to attackers about what limits are in place.

The final setup is shown in this repo: mble/graphql-tirefire.

This leaves us with something very similar to RedwoodJS’s configuration.

In an ideal world, these packages and their default configuration would be baked into the underlying GraphQL implementations themselves, to provide more of a “zero config” approach.

By providing sensible defaults, we can enable our users to be more successful out of the gate, and keep the web safer by default.