On Web-Security and -Insecurity: Automating REST Security Part 1: Challenges

Although REST has been a dominant choice for API design for the last decade, there is still little dedicated security research on the subject of REST APIs. The popularity of REST contrasts with a surprisingly small number of systematic approaches to REST security analysis. This contrast is also reflected in the low availability of analysis tools and security best practices that service providers could use to check whether their API is secure.

In this blog series, we try to find reasons for this situation and explore what we can do about it. In particular, we will investigate why general security assessments of REST seem more complicated than those of other API architectures. We will also discuss how we may still find systematic approaches for REST API analysis despite REST's challenges. Furthermore, we will present REST-Attacker, a novel analysis tool designed for automated REST API security testing. In this context, we will examine some of the practical tests provided by REST-Attacker and explore the test results for a small selection of real-world API implementations.

Author

Christoph Heine

Understanding the Problem with REST

When evaluating the security of network components and software, we often rely on specifications that describe how things should work. For example, central authorities like the IETF standardize many popular web technologies such as HTTP, TLS, or DNS. API architectures and designs can also be standardized. Examples are SOAP and the more recent GraphQL language specification. Standardizing a technology usually influences its security. Drafting may involve a public review process before publication. This process can identify security flaws or lead to official best practices for implementation and usage. Specifications and best practices are great for security research because they present clear guidelines on how an implementation should behave and why.

The situation for REST is slightly different. First of all, REST is not a standard in the sense that there is no technical specification for its implementation. Instead, REST is an architectural style, which is more comparable to a collection of paradigms (client-server architecture, statelessness, cacheability, uniform interface, layering, and code-on-demand). Notably, REST has no strict dependency on other web technologies. It only defines how developers should use components, not which components they should use. This makes REST very flexible, as developers are not limited to any particular protocol, library, or data structure.

Furthermore, there is no central authority that could define rules or implementation guidelines. Roy Fielding published the original definition of REST in 2000, presenting it as a design template for the HTTP/1.1 standard. His document is the closest thing to a standard that REST has. However, it merely explains the REST paradigms and does not focus on security implications.

The flexibility of the REST architecture is probably one of the primary reasons why security research can be challenging. If every implementation is potentially different, how are we supposed to create common best practices, let alone test them consistently across hundreds of APIs? Fortunately for us, not every API tries to reinvent the wheel entirely. In practice, there are a lot of similarities between implementations that may be used to our advantage.

Generalizing REST Security

The most glaring similarity between REST API implementations is that most, if not all, are based on HTTP. If you have worked with REST APIs before, this statement might sound like stating the obvious. However, remember that REST technically does not require a specific protocol. Assuming that every REST API uses HTTP, we can use it as a starting point for a generalization of REST API security. Knowing that we mainly deal with HTTP is also advantageous because HTTP - unlike REST - is standardized. Although HTTP is still complex, it gives us a general idea of what we can expect.

Another observation is that REST API implementations reuse several standardized HTTP components for API communication. Control parameters and actions in an API request are mapped to components of a generic HTTP request. For example, the resource that an API request operates on is specified via the HTTP URL. Actions or operations on that resource are mapped to methods defined by the HTTP standard, usually GET, POST, DELETE, PUT, and PATCH. API operations retain the meaning of their HTTP method, i.e., GET retrieves a resource, DELETE removes a resource, and so on. REST API documentation often describes the available API endpoints in HTTP "language", i.e., as a combination of an HTTP method and a URL path.

Since the URL and the HTTP method are sufficient to build a basic HTTP request, we can potentially create API requests if we know a list of REST endpoints. In practice, constructing such requests can be more complicated because the API may have additional parameter requirements, e.g., for query, header, or body content. Finding valid IDs of resources can also be difficult. Interestingly, we can still infer each endpoint's action from its HTTP method, even without any context-specific knowledge about the API.
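As a rough illustration of this idea, the following Python sketch builds (but does not send) basic HTTP requests from a list of endpoints and guesses each endpoint's action purely from its HTTP method. The base URL, the endpoint list, and the helper names are hypothetical and only serve as an example; a real API would define its own.

```python
from urllib.request import Request

# Hypothetical base URL and endpoint list (assumptions for illustration only).
BASE_URL = "https://api.example.com"
ENDPOINTS = [
    ("GET", "/user/info"),
    ("DELETE", "/user/info"),
    ("POST", "/user"),
]

# Conventional meaning of HTTP methods as they are commonly used in REST APIs.
METHOD_ACTIONS = {
    "GET": "retrieve",
    "POST": "create",
    "PUT": "replace",
    "PATCH": "modify",
    "DELETE": "remove",
}


def infer_action(method: str) -> str:
    """Guess the intended action of an endpoint from its HTTP method alone."""
    return METHOD_ACTIONS.get(method.upper(), "unknown")


def build_request(method: str, path: str) -> Request:
    """Create a basic request object from method and path (nothing is sent)."""
    return Request(BASE_URL + path, method=method.upper())


if __name__ == "__main__":
    for method, path in ENDPOINTS:
        req = build_request(method, path)
        print(f"{req.get_method()} {req.full_url} -> {infer_action(method)}")
```

Note that such a request would likely still be rejected by a real API until required parameters, resource IDs, and credentials are filled in.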

We can also find components taken from the HTTP standard in the API response. The requested operation's success or failure is usually indicated using HTTP status codes. They retain their meaning when used in REST APIs. For example, a 200 status code indicates success, while a 401 status code signifies missing authorization (in the preceding API request). This behavior again can be inferred without knowing the exact purpose of the API.
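To make this more concrete, here is a minimal Python sketch of how a tool might read a response status code without knowing anything about the API behind it. The function name and the coarse outcome categories are our own assumptions, not part of any standard.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Give a generic, API-independent reading of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # code is not registered in Python's HTTPStatus enum

    if 200 <= code < 300:
        outcome = "success"
    elif code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        outcome = "access denied"
    elif 400 <= code < 500:
        outcome = "client error"
    elif 500 <= code < 600:
        outcome = "server error"
    else:
        outcome = "other"
    return f"{code} {phrase}: {outcome}"


if __name__ == "__main__":
    for status in (200, 401, 404, 500):
        print(describe_status(status))
```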

Another factor that influences REST's complexity is its statelessness paradigm. Essentially, statelessness requires that the server does not keep a session between individual requests. As a result, every client request must be self-contained, so multi-message operations are out of the picture. It also effectively limits interaction with the API to two HTTP messages: client request and server response. Not only does this make API communication easier to comprehend, but it also makes testing more manageable since we don't have to worry as much about side effects or keeping track of an operation's state.

Implementing access control mechanisms can be more complicated, but we can still find general similarities. While REST does not require any particular authentication or authorization methods, the variety of approaches found in practice is small. REST API implementations usually implement a selection of these methods:

HTTP Basic Authentication (user authentication)
API keys (client authentication)
OAuth2 (authorization)

Two of these methods, OAuth2 and HTTP Basic Authentication, are standardized, while API keys are relatively simple to handle. Therefore, we can generalize access control to some degree. However, access control can be one of the trickier parts of API communication as there may be a lot of API-specific configurations. For example, OAuth2 authorization allows the API to define multiple access levels that may be required to access different resources or operations. How access control data is delivered in the HTTP message may also depend on the API, e.g., by requiring encoding of credentials or passing them in a specified location of the HTTP message (header, query, or body).
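The three methods listed above can be sketched in a few lines of Python. The Basic and Bearer formats of the Authorization header are standardized, whereas the API key header name ("X-API-Key") and query parameter name ("api_key") used here are assumptions, since every API is free to choose its own.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(username: str, password: str) -> dict:
    """HTTP Basic Authentication: base64-encoded 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_header(access_token: str) -> dict:
    """OAuth2 access token passed as a bearer token in the Authorization header."""
    return {"Authorization": f"Bearer {access_token}"}


def api_key_header(key: str, header_name: str = "X-API-Key") -> dict:
    """API key in a header; the header name is API-specific (assumed here)."""
    return {header_name: key}


def api_key_query(url: str, key: str, param_name: str = "api_key") -> str:
    """API key in the query string; the parameter name is API-specific (assumed)."""
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({param_name: key})


if __name__ == "__main__":
    print(basic_auth_header("alice", "secret"))
    print(bearer_header("example-token"))
    print(api_key_header("example-key"))
    print(api_key_query("https://api.example.com/user/info?id=1", "example-key"))
```

Even in this small sketch, the API-specific parts (names and locations of the credentials) are exactly the pieces a generic tool cannot guess and has to be told.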

Finding a Systematic Approach for REST API Analysis

So far, we've only discussed theoretical approaches sketching a generic REST API analysis. To implement an automated analysis tool, we need to make the hints that we used in our theoretical analysis available to the tool. For example, the tool would need to know which API endpoints exist to create API requests on its own.

The OpenAPI specification is a popular REST API description format that can be used for this purpose. An OpenAPI file contains a machine-readable definition (as JSON or YAML) of an API's interface. A basic description includes the definitions of the API endpoints, but it can optionally contain much more useful information. For example, an endpoint definition may include a list of required request parameters, possible response codes, and content schemas of API responses. An OpenAPI file can even describe security requirements that define which types of access control methods are used.

{
    "openapi": "3.1.0",
    "info": {
        "title": "Example API",
        "version": "1.0"
    },
    "servers": [
        {
            "url": "http://api.example.com"
        }
    ],
    "paths": {
        "/user/info": {
            "get": {
                "description": "Returns information about a user.",
                "parameters": [
                    {
                        "name": "id",
                        "in": "query",
                        "description": "User ID",
                        "required": true
                    }
                ],
                "responses": {
                    "200": {
                        "description": "User information.",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/user_info"
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "security": [
        {
            "api_key": []
        }
    ]
}

As you can see from the example above, OpenAPI files allow tools to both understand the API and use the available information to create valid API requests. Furthermore, the definition can give insight into the expected behavior of the API, e.g., through the response definitions. These properties make the OpenAPI format another standard on which we can rely. Essentially, a tool that can parse and understand OpenAPI can work with any API that provides an OpenAPI description. With the help of OpenAPI, tools can create and execute tests for APIs automatically. Of course, the ability of tools to derive tests still depends on how much information an OpenAPI file provides. However, wherever possible, automation can potentially eliminate a lot of manual work in the testing process.
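To give a rough idea of what "parsing and understanding" can mean in practice, the following Python sketch extracts the operations from a trimmed-down version of the example description above. It only uses the standard library, the function name list_operations is hypothetical, and it ignores many OpenAPI features (such as "$ref" resolution or path-level parameters), so treat it as an illustration rather than a real parser.

```python
import json

# Trimmed-down copy of the example description shown above.
OPENAPI_JSON = """
{
    "servers": [{"url": "http://api.example.com"}],
    "paths": {
        "/user/info": {
            "get": {
                "parameters": [
                    {"name": "id", "in": "query", "required": true}
                ],
                "responses": {"200": {"description": "User information."}}
            }
        }
    }
}
"""

# Keys of a path item that denote operations according to the OpenAPI format.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list:
    """Collect method, URL, required parameters, and documented status codes."""
    servers = spec.get("servers") or [{"url": ""}]
    base_url = servers[0]["url"]
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "summary"
            required = [
                p["name"]
                for p in operation.get("parameters", [])
                if p.get("required") and "name" in p
            ]
            operations.append({
                "method": method.upper(),
                "url": base_url + path,
                "required_params": required,
                "expected_status": sorted(operation.get("responses", {})),
            })
    return operations


if __name__ == "__main__":
    for op in list_operations(json.loads(OPENAPI_JSON)):
        print(op)
```

With a list like this, a tool knows which requests it can build, which parameters it must supply, and which response codes the API claims to return.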

Conclusion

When we consider the similarities between REST APIs and OpenAPI descriptions, we can see that there is potential for analyzing REST security with tools. Our next blog post discusses what such an implementation would look like. We will discuss REST-Attacker, our tool for analyzing REST APIs.

Acknowledgement

The REST-Attacker project was developed as part of a master's thesis at the Chair of Network & Data Security of the Ruhr University Bochum. I would like to thank my supervisors Louis Jannett, Christian Mainka, Vladislav Mladenov, and Jörg Schwenk for their continued support during the development and review of the project.

Monday, October 10, 2022