November 7, 2022

The Absolute Beginner's Guide to API Security

A blog post describing the OWASP API Security top 10, for developers and technologists of any skill level

So you want to learn about API security, huh?

Why is API security a problem?

Let’s talk about the pervasiveness of APIs by discussing a real-life scenario most of us can relate to: hailing a trip on a rideshare app.

So first, you open up the app using your phone and request a ride to your location. That sends a request from your phone to the rideshare app’s servers.

If the rideshare app’s backend is running a microservices architecture, then it’s likely that this will trigger a series of API calls between the loosely coupled backend services, sometimes within a single network (such as within the AWS cloud) and other times over the internet (multi-cloud, hybrid cloud).

Next, the rideshare app pings drivers in your area, again, over an API, to check who’s available to come pick you up. Your designated driver responds through their app, and the rideshare service again receives this request, processes it, and sends you a new notification letting you know someone is on their way to pick you up.

Assuming a simple architecture for the backend of the app, we’re already at a minimum of four API calls for a single transaction that was only just started.

We don’t need to go through the full scenario. Suffice it to say that there are at least a few more occasions for API calls to be made: at pickup time, at drop-off time, and even after the ride, when reviews or gratuities can be left.

The point here is that APIs are everywhere, and they carry valuable data between the business-critical functions where it’s stored, processed, and used.

Whenever there’s value, there’s someone looking to exploit it. It’s not a niche thing, either. Our friends over at Gartner predicted that this year (2022) will be the year APIs become the number one source of enterprise data breaches.

This makes sense. The previous top attack vector for cloud-based applications, asset misconfigurations (e.g. leaky AWS S3 buckets), caused quite a few headaches during the early stages of cloud computing. The cloud service providers have gotten better at helping customers configure their assets more securely, and those same customers have either gotten more skilled at using the cloud, or adopted tools like cloud security posture management software that make compliance with security best practices a lot easier.

So okay, cloud infrastructure is getting easier to build securely. What about APIs? What’s the state-of-the-art here?

Let’s again turn to our friends over at Gartner who’ve made a nice page summarizing what is being offered in the “Cloud Web Application and API Protection” market. Notice the recurring acronym “WAF” that pretty much every vendor seems to be pushing as the solution to API security.

There’s a major problem with this approach. Web application firewalls haven’t stopped API data breaches.

Why is this the case? 

It’s the nature of the attacks. Attacks against APIs generally look like normal API traffic, but contain queries designed to evade security controls by exploiting the application logic. WAFs are not designed to handle problems in application or API logic. WAFs are designed to block specific IP addresses, IP ranges, functional endpoints on an API, or some combination thereof.

As we’ll see soon enough, the way APIs have been breached is through flaws in application logic, something a WAF cannot stop.

In 2019, the Open Web Application Security Project (OWASP), a leading authority on internet security, published a top ten list of API vulnerabilities.

1. Broken Object Level Authorization

Developers often assume that authentication equals authorization. They assume that API calls will only be coming from known good client software, like a mobile app, that will control the parameters of the API request and make it impossible for user A to request user B's data. But bad actors don't work that way...

Functionally, there are at least two bad practices at play here. 

First, any sort of predictable pattern in an application’s data structure can be exploited by a malicious user. If you’re not using random and unpredictable values to identify your data objects, they can be predicted and so are easier to exploit.

To reuse the rideshare scenario: say a malicious user guessed your rideshare account’s user identifier because they figured out that those IDs are sequential. They could then mock a legitimate API call to fetch information about your ride history.

Second, can an unauthorized user gain access to a data object they shouldn’t have access to? If the application’s backend doesn’t check that the specific user is allowed to access the specific object in question, then yes.

Now, the mock request made to the rideshare app’s backend would only go through if that mock API call is not properly checked for authorization. Is the user making the request actually allowed to see the ride history that is being requested? If this auth check is not made by the application logic, then object level authorization has been broken, and the rideshare company is about to have a bad day.
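Here is a minimal sketch of what that object-level check can look like. The `Ride` type, the in-memory `RIDES` store, and the `get_ride` function are all hypothetical names made up for this illustration, not part of any real rideshare backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ride:
    ride_id: str
    rider_id: str
    destination: str


# Hypothetical in-memory store standing in for the rides database.
RIDES = {
    "r-1001": Ride("r-1001", rider_id="user-a", destination="Airport"),
    "r-1002": Ride("r-1002", rider_id="user-b", destination="Home"),
}


class Forbidden(Exception):
    """Raised when the caller is authenticated but may not see the object."""


def get_ride(authenticated_user_id: str, ride_id: str) -> Ride:
    ride = RIDES.get(ride_id)
    # Object-level check: being logged in is not enough, the ride must belong
    # to the caller. A missing ride is reported the same way, so the response
    # does not reveal which IDs exist.
    if ride is None or ride.rider_id != authenticated_user_id:
        raise Forbidden("not allowed to access this ride")
    return ride
```

The important detail is that `authenticated_user_id` should come from the verified session or token on the server, never from a parameter the client can edit.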

2. Broken User Authentication

This one is about user impersonation. Can a malicious user trick your application into believing they’re someone else? There are a few common ways this can happen:

  • Credential stuffing is using pre-made lists of usernames/passwords and trying every one of them out until one works. 
  • Automated brute force attacks against a specific user, where an attacker tries out different passwords or multi-factor authentication (MFA) tokens until they find the right one.
  • Poor in-transit security like lack of encryption, poor or no hashing, or even including auth tokens and passwords in the URL can mean your users are vulnerable. Malicious actors would be able to intercept users’ authentication details by sniffing the network for juicy API requests and responses.
  • Tokens or other credentials are properly generated by the frontend but never actually checked for validity by the backend.
  • Exploitation of deeplinks or other “one click authentication” mechanisms that are improperly secured.

This is essentially a malicious actor gaining access to a real user’s rideshare account and then ordering for themselves as many rides and food deliveries as they can get away with.

Some best practices that limit your application’s risk include minimum password lengths, rate limiting API calls, not relying on long-lived credentials, and adding captcha or “are you a human” tests to your authentication services, including login, password reset, and MFA token requests.
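As one illustration of throttling authentication attempts, here is a rough sketch of a per-account failure counter over a sliding time window. The class name, the thresholds (5 failures per 5 minutes), and the in-memory storage are assumptions chosen for the example; a real service would likely keep this state in a shared store.

```python
import time
from collections import defaultdict, deque

MAX_ATTEMPTS = 5      # assumed threshold, tune for your service
WINDOW_SECONDS = 300  # assumed window, tune for your service


class AttemptLimiter:
    """Tracks recent failed logins per account in memory."""

    def __init__(self, max_attempts=MAX_ATTEMPTS, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._failures = defaultdict(deque)

    def allow(self, account: str) -> bool:
        now = self.clock()
        failures = self._failures[account]
        # Forget failures that have aged out of the window.
        while failures and now - failures[0] >= self.window:
            failures.popleft()
        return len(failures) < self.max_attempts

    def record_failure(self, account: str) -> None:
        self._failures[account].append(self.clock())
```

The login, password reset, and MFA endpoints would each call `allow` before checking credentials and `record_failure` whenever a check fails.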

So we’ve reviewed the first two major vulnerabilities and security risks highlighted by the 2019 release of OWASP’s API security top 10. We did this because we wanted to understand why breaches through APIs are becoming so common despite Web Application Firewalls having been available for many years. In summary, WAFs operate at the network level. They allow you some level of visibility and control over the types of traffic that can go to your applications. 

However, the vulnerabilities commonly exploited today all rely on faulty application logic. It’s literally in the name: Application Programming Interface security is an application-level problem, not a network problem.

We can admit it, security gets in the way of software development pretty often. It can get in the way of your organization’s mission. 

Good security practices will help you avoid damage to reputation, and maintain a consistently secure & smooth experience for your users. But on the other hand, it can also make it feel as if solving your customers' problems must include jumping through circus hoops just to check the boxes required by this compliance standard or that. 

At the developer’s level, and within the context of an API, security best practice recommends the use of a contract. You, the developer, describe how your API works, and customers or users consume that API according to what is defined in your contract.

Allow me to illustrate my point with a little scenario about the following example endpoint: /api/users/{userId}/bookmarks/{bookmarkId}

A junior backend developer is tasked with adding a new property that stores a rideshare app user’s bookmarked addresses, which are usually their home and work addresses.

The frontend team has already developed the mechanism by which a user can save those properties in-app, and users can see their saved addresses when they look to book a ride.

Now, if an API contract were being enforced, the backend would state the exact conditions under which that data could be manipulated and returned to the frontend: (a) when the user creates or modifies the saved address data with a POST or PATCH call, or (b) when the user requests this data with a GET call because they’re about to book a ride in the app.

There should be no other scenario in which this personally identifiable information (PII) is returned to the front end.
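To make the idea of a contract concrete, here is a deliberately tiny sketch. Real contracts are usually written as an OpenAPI spec and enforced by a library; the hand-written `CONTRACT` dictionary and the `is_allowed` helper below are hypothetical stand-ins that only check which HTTP methods are permitted on the example endpoint.

```python
import re

# Hypothetical, hand-written contract: path template -> allowed HTTP methods.
CONTRACT = {
    "/api/users/{userId}/bookmarks/{bookmarkId}": {"GET", "POST", "PATCH"},
}


def _template_to_regex(template: str):
    # Replace each "{placeholder}" with a pattern matching one path segment.
    parts = re.split(r"\{[^/}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def is_allowed(method: str, path: str) -> bool:
    for template, methods in CONTRACT.items():
        if _template_to_regex(template).match(path):
            return method.upper() in methods
    # Paths that are not described in the contract are rejected by default.
    return False
```

With this in place, `is_allowed("GET", "/api/users/42/bookmarks/7")` passes, while a `DELETE` on the same path or any call to an undocumented path is refused.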

Our app and junior backend developer do not enforce an API contract. They live on the cutting edge, security isn’t their jam, and it’ll only slow them down anyways. So, while they did get that new address bookmarking feature to work without a hitch, they forgot to implement a few security best practices, which has unfortunately led to…

3. Excessive Data Exposure

The marketing team had a great idea. They want to show the world how diverse the rideshare app’s user demographics are, so they’re having the public website show a rotating feed of user first names and the city they’re from. Innocent enough, sure. 

Despite the marketing folks only needing user first names and cities, the app’s backend services are returning entire user objects. It’s easier to develop that way: bring back all the data, and cherry-pick the few fields that you actually need to use.

Well, the extra data is still there. It’s a few clicks away for someone with very basic technical knowledge. And unfortunately for our rideshare company, the extra user data contains some pretty sensitive info like that new field reserved for bookmarked addresses. Suddenly, it’s not a fun new little addition to the website, it’s a user data breach.

What could have been done to prevent this? A couple things…

  • Handle data sanitization on the server side. Only return what’s needed. You’ve heard the expression “need to know”? Since the website doesn’t need to know about a user’s PII, cut it out before it’s sent back as part of an API response.
  • Maintain an API contract and make sure it’s being enforced. The contract should have stated that user bookmarks are for specific scenarios that don’t include the public website.
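A simple way to apply the first point is an allow-list on the server: build the response from the fields the consumer needs, instead of deleting the ones you remember to hide. The field names and the `to_public_feed_item` helper are assumptions for this sketch.

```python
# Only these fields may ever reach the public website feed (assumed names).
PUBLIC_FEED_FIELDS = ("first_name", "city")


def to_public_feed_item(user: dict) -> dict:
    # Build the response from the allow-list, so new sensitive fields added
    # to the user object later are excluded by default.
    return {field: user[field] for field in PUBLIC_FEED_FIELDS}


user = {
    "first_name": "Sam",
    "city": "Lisbon",
    "email": "sam@example.com",
    "bookmarked_addresses": ["1 Home Street", "2 Work Avenue"],
}
feed_item = to_public_feed_item(user)
```

Here `feed_item` only contains `first_name` and `city`; the bookmarked addresses never leave the server.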

4. Lack of Resources & Rate Limiting

The junior backend dev has learned their lesson, data will be sanitized, and a contract will be enforced.

The rideshare market is competitive. Sometimes, international competition can play a little less than fairly. A rideshare app whose servers aren’t responding is unable to compete. If your API is being hit with calls that are making the servers unavailable, you’ve done something wrong with the design of your API.

Denial of Service (DoS) attacks like these have made the news on more than one occasion. It’s the hacktivist collective’s favorite technique because it’s visible and can easily make a big public relations splash.

If a request can make a call for an unlimited number of records from your database, and your server tries to fulfill that request, it’s going to affect the performance of the database, and all the services that depend on that database.

Additionally, if a malicious user can send your server a 2 terabyte “jpeg” as their user profile picture, and your server doesn’t handle that gracefully, it’s probably going to have an effect on service availability for legitimate users.

This is where rate limiting, pagination, and a transparent API contract come in.

Rate limiting will prevent users from sending requests at a rate that is too quick for your backend to handle. Pagination caps how much data a single request can pull from your databases by returning a different set of results with each request (items 1-100, then items 101-200, etc.). A clear API contract will mean users, including the team responsible for handling the client side, are aware of the limitations of interaction with the API.
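The sketch below shows one possible shape for these controls: a token bucket for rate limiting, a page size that is clamped on the server, and an upload size cap. Every name and limit here (`TokenBucket`, `MAX_PAGE_SIZE`, the 5 MB cap) is an assumption picked for illustration.

```python
import time

MAX_PAGE_SIZE = 100                 # assumed upper bound per request
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed 5 MB cap on profile pictures


class TokenBucket:
    """Allows short bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def paginate(items, page=1, page_size=MAX_PAGE_SIZE):
    # Clamp client-supplied values so nobody can ask for "everything".
    page = max(1, page)
    page_size = min(max(1, page_size), MAX_PAGE_SIZE)
    start = (page - 1) * page_size
    return items[start:start + page_size]


def upload_allowed(content_length: int) -> bool:
    # The declared length can lie, so the server should also stop reading
    # once the cap is reached.
    return 0 < content_length <= MAX_UPLOAD_BYTES
```

In practice you would keep one bucket per client (API key, user, or IP) and answer with an HTTP 429 status when `allow()` returns `False`.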

It’s difficult to be completely safe from DoS attacks, and even more so from distributed denial of service (DDoS) attacks. Hackers can easily change which IP and device ID they’re pinging your servers from, making it difficult to differentiate between a legitimate request and a fake one. For now, suffice it to say that the more visibility you have on incoming requests, the more it will be possible for you to catch the ones intending you harm.

5. Broken Function Level Authorization

Each API call made to a server is received and processed by a specific function in the application. Before executing their code, functions evaluate the metadata that’s been passed along with each API request. This data includes information like who is making the request, what type of request is being made, and where the request should be going.

The user who requests a ride is the only one who should be able to interact with the function that handles payment processing for their account. The driver should not have the ability to mock a user API call to the same payment processing function and change how much they are being paid for the fare.

However, if the principle of least privilege is not being applied to properly define the scope of user-allowed actions, then application functions are vulnerable to broken function level authorization exploits. To avoid this, developers must develop an extremely granular access control system to define who is allowed to do what:

  • As a rider, I can modify how much I leave as a gratuity for a ride I have taken.
  • As a driver, I can view how much was left for me as a gratuity for a ride I have given.

Unfortunately, it is a lot easier to write an application that treats all users the same, without making the distinction between the modify or view scope of permissible action on a gratuity, than to write an application that does.
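To show that the distinction does not have to be complicated, here is a small sketch of the two gratuity rules above. The role names, permission strings, and function names are hypothetical, and a real system would also combine this with the object-level check from vulnerability #1.

```python
# Hypothetical role-to-permission map for the gratuity example.
PERMISSIONS = {
    "rider": {"gratuity:view", "gratuity:modify"},
    "driver": {"gratuity:view"},
}


def require(role: str, permission: str) -> None:
    # Deny by default: unknown roles and unlisted permissions are rejected.
    if permission not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not perform {permission!r}")


def modify_gratuity(role: str, ride: dict, amount: float) -> dict:
    require(role, "gratuity:modify")
    return {**ride, "gratuity": amount}


def view_gratuity(role: str, ride: dict) -> float:
    require(role, "gratuity:view")
    return ride["gratuity"]
```

A driver calling `modify_gratuity` gets a `PermissionError`, no matter how the request was crafted on the client side.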

Traditionally, IT and security teams have been separate from the rest of the organization. The security role relied on the aggregation of data and logs from all possible attack surfaces into the Security Operations Center (SOC) for analysis and evaluation. 

Enter DevSecOps, a relatively new trend in the IT world. Where agile methodology and DevOps have changed how application development teams operate, DevSecOps has also led to a shift in how organizations do security. Essentially, DevSecOps is about introducing security practices earlier in the software development lifecycle (SDLC). This requires very close collaboration between security experts and the teams ultimately responsible for delivering code.

In cloud computing, this has led to the rise of the Cloud Center of Excellence (CCOE), a team which includes experts from across the organization: from security, architecture, and every technical function in between. One organization I worked with called it their Guardians of the Cloud bi-weekly meeting, but the purpose remained the same: define which best practices should be prioritized, monitored and enforced. 

If you’re looking to set up a CCOE, you’re going to need to understand the best practices that will make your organization more secure:

6. Mass Assignment

Mass Assignment is about exploiting a vulnerability in an API endpoint that receives a data object: the attacker adds new parameters that allow for unintended application behavior or privilege escalation.

This vulnerability is also sometimes called object injection or autobinding, depending on the programming language being used.  

Say a user of a rideshare app discovers that an endpoint allowing them to change their phone number also allows them to change how many credits they have in their account. Let’s see how it would play out: 

The user discovered the vulnerability by inspecting the data schema of the user object when it is returned as part of the “my profile” section of the rideshare app. 

On that page, the user object being returned by the backend includes a {"credits": 0} parameter. By mocking an API call on the endpoint that allows the user to change their phone number and including {"credits": 9999} in the request body, the user is able to trick the app into giving them a lot more credits than they should have.

The Mass Assignment vulnerability is found on the endpoint that handles updates from the “my profile” section of the rideshare app. This endpoint is not checking if the user making the request is authorized to modify the “credits” parameter on their user account.

To avoid this type of exploit, it is important to clearly define which object parameters should be modifiable by the user, and which ones should not. This definition can then be included and enforced in the API contract.
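One way to express that definition in code is an explicit allow-list of user-editable fields, so the request body is never copied onto the stored object wholesale. The field names and the `apply_profile_update` helper are assumptions for this sketch.

```python
# Fields a user may change about their own profile (assumed names).
USER_EDITABLE_FIELDS = {"phone_number", "display_name"}


def apply_profile_update(user: dict, payload: dict) -> dict:
    unexpected = set(payload) - USER_EDITABLE_FIELDS
    if unexpected:
        # Reject the whole request instead of silently ignoring extra keys,
        # which also makes probing attempts visible in your logs.
        raise ValueError(f"fields not editable by the user: {sorted(unexpected)}")
    return {**user, **payload}
```

With this check, a request body containing `{"phone_number": "...", "credits": 9999}` is refused and the stored `credits` value stays untouched.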

7. Security Misconfiguration

Security Misconfiguration is a rather large vulnerability category. It includes configurations that pertain directly to the API & application higher up in the technology stack but also the infrastructure lower down in the stack.

There is a long list of areas where potential misconfigurations can slip through. Here are some of the most common sources of security misconfigurations:

1) Using insecure transport protocols: Transport Layer Security (TLS) version 1.3 was released in 2018 and is the most up to date version.

2) HTTP methods: Don’t allow for unnecessary HTTP methods on your endpoints. There are more than you think.

3) Require the latest HTTP security headers: HTTP Strict Transport Security (HSTS), Content Security Policy (CSP), and X-Frame-Options.

4) Overly descriptive error messages: make sure the error messages your endpoints return do not expose anything about how the underlying application works. Keep messages short and generic.

5) Enforce the use of HTTPS.

6) Use a Cross-Origin Resource Sharing (CORS) policy that restricts data flows to only pages or domains which should communicate.

7) Infrastructure misconfigurations caused by not following the security pillar of the well-architected framework.
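Two of the items above, security headers and generic error messages, can be sketched in a framework-agnostic way. The header values are common starting points rather than universal recommendations, and the helper names and response shape are assumptions for the example.

```python
import logging
import uuid

logger = logging.getLogger("api")

# Example values only; tune each policy for your own application.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'none'; frame-ancestors 'none'",
    "X-Frame-Options": "DENY",
}


def with_security_headers(headers: dict) -> dict:
    # Security headers win over anything a handler set by mistake.
    return {**headers, **SECURITY_HEADERS}


def error_response(exc: Exception):
    # Log the details server-side and return only a generic message plus an
    # ID that support staff can use to find the matching log entry.
    incident_id = uuid.uuid4().hex
    logger.error("incident %s: %r", incident_id, exc)
    return 500, {"error": "Internal server error", "incident_id": incident_id}
```

The caller never sees the exception text, stack trace, or database details, but your team can still trace the failure through `incident_id`.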

8. Injection

As wide as the previous category was, the Injection vulnerability is fairly narrow. Many people have heard of SQL injections. They can be catastrophic beyond simple data breaches as they can very quickly take down the data servers that power mission-critical services.

There are ways to configure your API or application logic that can stop code or command injections from going through. If none of these are implemented, then malicious users may be able to interact with your infrastructure through your vulnerable API. Command and code injection events often include predictable patterns that can easily be identified and blocked. Don’t let this one be your security team’s oversight.
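For SQL injection specifically, the classic defense is to use parameterized queries so user input is always treated as data, never as part of the SQL statement. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table and function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user_unsafe(name: str):
    # Vulnerable: the input is pasted straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user(name: str):
    # Safe: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "x' OR '1'='1"
leaked = find_user_unsafe(payload)  # returns every row in the table
safe = find_user(payload)           # returns no rows
```

The same input that dumps the whole table through the string-built query simply matches nothing when it is bound as a parameter.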

Think about how you would prepare a real-life heist. You’d spend some time preparing, right? Digital thieves do the same. They read API documentation, probe vulnerable endpoints, and inspect data being returned by overly trusting web applications.

Pretty commonly, attackers look to execute exploit chains. This technique is also called vulnerability chaining, bug chaining, or daisy-chaining. In common blue team (defender) terms, this is usually called the attack path. Essentially, the idea is to use multiple exploits in a row to gain access to the most valuable parts of the infrastructure.

For organizations that maintain APIs, a single vulnerability likely won’t cause a catastrophic breach. Unfortunately, more often than not, hackers will spot more than one vulnerability from the OWASP API security top 10, allowing them to leverage the chain of vulnerabilities into a breach which has a much higher impact.

In our rideshare app example, a simple two step vulnerability chain allows for account takeovers, letting a determined attacker use any feature or download any data from their target’s account:

(1) The first vulnerability is in the login process, specifically logins through Facebook. A faulty implementation of the Facebook login mechanism means that the rideshare app’s authentication step can be injected with the Facebook ID of an account controlled by the attacker, instead of the true Facebook ID of the legitimate user.

(2) Despite passing the Facebook login step, there is still two-factor authentication (2FA) in the way of the malicious actor. Unfortunately, rate-limiting was never implemented on the endpoint checking the 2FA code, meaning, a script could try every single possible combination of the six digit verification code until the breach was successful.

On its own, neither of these vulnerabilities would lead to a major breach. However, used in combination, they build a chain that allows for full account takeover. The best way to be protected from exploit chains is to understand the context of the web application: two high threat vulnerabilities, depending on their context, can in fact lead to a critical threat to your infrastructure.
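Some quick arithmetic shows why the missing rate limit in step (2) matters so much. A six digit code has one million possible values. The request rate and attempt limit below are assumptions picked for illustration, and the calculation assumes the same code stays valid for the whole attack, which is a simplification.

```python
CODE_SPACE = 10 ** 6  # six decimal digits: 000000 through 999999


def worst_case_hours(requests_per_second: float) -> float:
    """Time to try every code when nothing slows the attacker down."""
    return CODE_SPACE / requests_per_second / 3600


def success_probability(attempts_allowed: int) -> float:
    """Chance of guessing the code when only a few tries are allowed."""
    return min(1.0, attempts_allowed / CODE_SPACE)


hours_unlimited = worst_case_hours(100)  # assumed 100 guesses per second
odds_limited = success_probability(5)    # assumed limit of 5 attempts
```

At an assumed 100 guesses per second, the whole code space is exhausted in under three hours. Cap the endpoint at five attempts and the attacker's odds drop to five in a million.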

Let’s take a moment to learn about the last two most common categories of threats to APIs, according to OWASP.

9. Improper Assets Management

APIs routinely use versioning to account for the sometimes conflicting priorities of development velocity and support to legacy systems. Despite releasing v2 of your API, you may want to give a grace period to the legacy users of the API’s version 1 so that they have time to update their applications to work with the latest version.

In theory this would be fine, however, in practice and given real-life time constraints, development teams will often make the decision to implement the latest security best practices on the latest version of the API, and forego updating previous versions.

To at least be aware of what they’re running, many organizations will maintain an API hosts inventory. Let’s make sure we understand the distinction between an API contract and an API hosts inventory: the contract is defined by documentation, often written as an OpenAPI spec, and describes how to interact with a specific application or service. On the other hand, an API hosts inventory will encompass all of the API contracts that are under the scope of the team responsible for its maintenance.

Properly managing new releases also falls under the scope of assets management. You should always set up distinct environments for development, test, staging, and production. Don’t use production data in pre-production environments. Aside from being a risk for additional data breaches, it exposes your internal teams to sensitive customer data, and may also violate GDPR principles.

Improper assets management can be mitigated by operational controls. You should categorize your public APIs, and APIs meant for internal, private use. Ensure that APIs that can interact with personally identifiable information (PII) are tagged as such.
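An inventory does not need to be fancy to be useful. Here is a sketch of one as plain data, with tags for visibility and PII, plus a query that surfaces the legacy versions most likely to bite you. The entries, field names, and the `risky_entries` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiEntry:
    name: str
    version: str
    visibility: str    # "public" or "internal"
    handles_pii: bool
    supported: bool    # False once a version stops getting security updates


# Hypothetical inventory entries for the rideshare example.
INVENTORY = [
    ApiEntry("rides", "v1", "public", handles_pii=True, supported=False),
    ApiEntry("rides", "v2", "public", handles_pii=True, supported=True),
    ApiEntry("pricing", "v1", "internal", handles_pii=False, supported=True),
]


def risky_entries(inventory):
    """Public APIs that touch PII but no longer receive security updates."""
    return [
        entry for entry in inventory
        if entry.visibility == "public"
        and entry.handles_pii
        and not entry.supported
    ]
```

Running `risky_entries(INVENTORY)` returns only the legacy `rides` v1 API, which is exactly the kind of forgotten endpoint this category warns about.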

If you have a deep understanding of data flows and the connection between services, there’s a much better chance you’ll be one step ahead of a would-be attacker’s attempts to chain vulnerabilities.

10. Insufficient Logging & Monitoring

There are several levels of maturity in logging & monitoring systems. Some will give you incomplete visibility over data flows, but give you the context needed to understand the activity in your APIs. Others will feel like a firehose of log data that can only be understood by diving into seemingly endless rabbit holes to go through a complete stack trace.

Maintaining a good balance between the two extremes will ensure your logging & monitoring systems give you complete visibility, but also enable more actionable observability.

There are multiple layers that can create logs in the software stack. At the start of a client interaction with an API, a request will generally go through an API gateway, which will create a log of the network event. 

Next, the event will be passed to the application layer where logging becomes a lot more dependent on developer choices. If your API is powered by AWS Lambda, you can log most data to CloudWatch, although observability may not be the best. Otherwise, you may not have any logging at the application layer. 

Once at the database layer, using AWS hosted services like DynamoDB, AppSync, or Relational Database Service (RDS) will make logging and storage easy, but ease of monitoring might suffer.

If you’re not logging API request and response activity, you run the risk of flying blind. You won’t know how your API is being used or if it’s being exploited. In the event of a breach, you won’t know the extent of the breach. Beyond that, if you’re not actually checking your logs, then really what was the point of logging in the first place?

Essentially, the same API request can create logs at multiple points, but the best place, by far, is at the application logic layer.
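As a rough idea of what application-layer logging can capture, here is a sketch that emits one structured JSON record per request, with sensitive values redacted before they reach the log. The field names, the `SENSITIVE_KEYS` list, and both helper functions are assumptions for this example.

```python
import json
import logging
import time

logger = logging.getLogger("api.audit")

# Keys whose values must never be written to logs (assumed list).
SENSITIVE_KEYS = {"password", "token", "authorization"}


def build_log_record(user_id, method, path, status, started_at, params):
    return {
        "user_id": user_id,
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": round((time.monotonic() - started_at) * 1000, 1),
        "params": {
            key: "[redacted]" if key.lower() in SENSITIVE_KEYS else value
            for key, value in params.items()
        },
    }


def log_request(user_id, method, path, status, started_at, params):
    # One JSON line per request is easy to search and to alert on.
    record = build_log_record(user_id, method, path, status, started_at, params)
    logger.info(json.dumps(record))
```

Because the record knows who made the request and what the application decided, it can answer questions a gateway log cannot, such as which user kept getting 403 responses on other people's rides.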

This is where FireTail can help. We maintain open-source libraries for Python, JavaScript (Node.js), GoLang, and Ruby that show you the conflicts between what your API contract states the behavior of your API should be, and the actual behavior of your API. 

It works by reading your OpenAPI spec file, confirming expected behavior, and giving you the option to block any unintended behavior. We also check for security rules and best practices we’ve explored in this overview of the OWASP API Top 10.

Developing APIs is increasingly important, so adding security requirements shouldn’t be seen as sacrificing development velocity and time to market. This is why we’ve adopted this approach - we’re giving developers the “easy button” to develop APIs securely. 

FireTail is on a mission to secure the world’s APIs by making API security as simple as import, setup, done. We are passionate about helping organizations secure their APIs as they grow their cloud presence. Thank you for reading. If you would like more information, please visit our website.