FireTail had the opportunity to attend and connect with other great minds in the industry at API Days Hong Kong 2023.
FireTail CEO and cybersecurity expert Jeremy Snyder breaks down the importance of securing your APIs in today's digital ecosystem.
APIs account for over 83% of all web requests today, and that number is continuing to grow. And as APIs rise in importance, API attacks rise in volume and impact. In fact, as of 2022, APIs became the leading web attack vector. In this talk, you will learn about:
- The Importance of API Security: Real-world examples to highlight the importance of APIs and API security in the current, digital world.
- A Decade of Data Breaches: Jeremy looks at past API data breaches and the evolving threat landscape with a comprehensive analysis of the most significant incidents over the last 10 years.
- Addressing the Root Causes of API Security Breaches: Insights into common API security flaws and root causes of data breaches, and discover how to prevent them with best practices and the latest industry standards, including the OWASP API Top 10.
- Challenges of API Security: Problems that organizations face as they expand their API footprint and change their architectures to be API-centric.
- Effective API Security Strategies: Discover best practices for securing your APIs and mitigating risks, including real-world examples and practical advice for creating a comprehensive API security strategy.
Alright, welcome to API Days Connect Hong Kong 2023. My name is Jeremy Snyder, I'm the founder and CEO of FireTail, and I'm really excited to be with you today to deliver today's presentation on some of the research we've been doing over here at FireTail relative to API security. We've got a lot to get through in about 25 minutes, so stay with me and let's get going.
Alright, so the title of today's talk is API security: analysis of breaches, attack vectors and strategies. The reason that we wanted to give this talk is that we wanted to go into detail on some of the headlines that we're hearing around API breaches. We know there have been a lot of high-profile cases, especially in the last few months here in 2023, and we wanted to break down what we see in the news relative to what actually went wrong, and try to help provide information that allows organizations to do better at protecting their own APIs.
Just a few quick words about myself before we dive into the content of today's presentation. I am about a 25-year technology veteran. I started my career as a practitioner in IT, networking and cybersecurity across a couple of SaaS companies and then a video game company. I then transitioned into more customer-facing roles, first with AWS and then with a number of other companies in the cloud and cybersecurity ecosystem. Most recently I spent about 7 years doing cloud security posture management with global organizations around the world, with a software company that grew about 20x in a 4-year period and then was acquired by Rapid7, who is one of the, I'd say, larger cybersecurity vendors, certainly a long-established company in the vulnerability management space as well as a number of other things. I spent about a year and a half with them post-acquisition doing M&A work, corporate development strategy, etc. I have a couple of degrees, I speak a few languages, and I've actually spent some time in Hong Kong myself over the years when I was living in Asia. There is my email address. If you want to reach out to me or if you have any questions after today's presentation, please do it by email. I'm really not very active on Twitter at all. The slides will be available after the presentation and API Days will be posting the talks, so in case you miss anything or if you want to share this on to colleagues, all of the information will be available for you after the event.
So I think the first important thing to start off with in today's presentation is that APIs really connect us all. There's a couple of stats that I'm going to share, and then I'm going to share a use case or a scenario that really illustrates it. First of all, it's important to understand that where we are today in 2023, more than 83% of all internet requests are API calls. What I mean by requests is that every time you pull up, let's say, a web page, that web page may reach out to three or four third parties to assemble all of the content and functionality of that page together. If you've ever watched the status bar in your browser while you loaded a page, you've probably noticed that it says “Hey, loading Google Analytics, loading Outbrain, loading recommended content…” or whatever those things may be. Generally speaking, to assemble a web page like that, it is actually over API calls that all of this content and functionality is stitched together.
You know, there are more than 50,000 public APIs available today, and in fact more than 60% of developers are already using third-party APIs in building the software that they deliver for their organizations. That number is only expected to rise. That is an upward trend that was tracked by DevOps Network; they've been tracking that stat over a number of years, and this is the highest point that we've ever been at, but it is definitely rising year-over-year. With the growth of public APIs, more than 2,000 new public APIs put out every year, we are on track to having a trillion API endpoints by 2030. So in about 7 years from now, we will be in an internet where, literally, for anything that you think of there will be an API for that. With the switch to mobile there was the popular phrase "there's an app for that"; we are actually getting to the point where "there's an API for that." For almost anything you can think of, any data set, any functionality, there will be or is already an API for that.
One example I like to give is what happens when you order takeout or food delivery via a mobile app. I recently spent time with a developer from one of the leading global food delivery app companies, and we were talking through this basic transaction that you see on the screen here. What happens when you open your phone is that your phone fetches your geolocation and sends that to a cloud server, and based on that location you get back a list of menus. By the way, that send and receive was already one API transaction between the phone and the cloud service just to deliver the menus to you. There will have been other API calls to log you in, fetch your account information, things like that. You go through, you select the things that you want, and you send that out: another API round-trip transaction. On the back end, what's happening though is that that service is then sending that out to restaurants via some delivery app. You've probably seen things like tablets in some of these locations where they can look at that food order coming in. In any event, you've also then got the coordination with the delivery driver, with the payment processor, etc. We started counting the individual API calls in this transaction and we got to 25, and we realized, oh boy, there's actually a lot more, so it could be somewhere between, let's say, 40 to 50, and that's just for one transaction.
And so this is indicative of where we are and how APIs have become this connective tissue that allows developers to stitch services together to create a business transaction. If you think about what's in that order, we've actually got a lot of sensitive information: of course we've got your identity, we've also got your address, we've got your billing details, all of that just in one transaction. So this is again a big indicator of why API security is so important and why Gartner predicted in 2017 that APIs would be the number one attack vector by 2022. We had an interesting interaction with Gartner about this recently. I had given a talk earlier this year where I said, actually, I don't know if that statistic came true, I didn't see enough data to really indicate to me that it came true. Gartner replied to us after watching the talk and said, actually no, it has come true, and in fact they've underestimated it: by their calculations and some of the research that they and IBM have seen, APIs are actually responsible for about half of all data breaches. So we have been working on collating a list of all of the large-scale publicly disclosed API breaches that we've seen over the years. You can see that on our website; the URL is at the bottom of the screen right here. We've tried to take that and put it into an analysis matrix where we analyze the primary attack vector, the secondary attack vector, and whether more than one thing went wrong in each situation.
What do we know about the totals here? This infographic is actually slightly out of date. It was done in May of this year, and we've had a few more pretty large-scale events since. So where it says 500 million records exposed and 40 publicly disclosed data breaches, we're actually up over 630 million, if I remember correctly, and we're at about 54 publicly disclosed data breaches. These are global in scope and they tend to be very large scale, so that's one of the big observations that we have around it. We'll come to some of the other things, but I think there are some really interesting findings here. Hopefully the video doesn't cover this up too much on the screen, but if it does, what I would certainly tell you is that the number one and number two primary attack vectors are authentication and authorization. We're going to talk in a couple of minutes about why that is so important to understand, because it will help to explain why a lot of traditional cybersecurity approaches really don't work for API security, or where they have shortcomings for API security.
The other thing to mention is that when we looked through this list, we looked for the primary attack vector and then asked, is there a secondary attack vector? Effectively in every single case there is. There's always more than one thing that goes wrong. So let's dive into a couple of examples here and try to illustrate that pattern.
So number one, a coffee chain that I'm sure we're all very familiar with. This case was actually a responsible disclosure by a security researcher, so that 100 million records here was not a breach. We do distinguish in our own research between disclosures and breaches, and this was a disclosure by a researcher who found this. So what happened in this situation? The researcher was working against the Starbucks mobile app API. There are actually different app APIs in different parts of the world, because Starbucks doesn't operate the mobile app in every single country directly; they may have franchising or local partnership agreements in different places. In fact, I believe in Hong Kong it is through a franchise relationship, so to our audience in Hong Kong, your data was probably not in the data set that the researcher found in this case. But regardless, what happened? Obviously there is a mobile app, and that mobile app is talking to a backend over an API. The researcher was working on this API and had a really interesting set of experiences. The first thing they realized was that just by issuing calls to the API requesting endpoints or routes that don't exist, they continued to get more and more information back from the app server. They would try to call an endpoint and the app server would say back, that doesn't exist, “did you mean this or this?” Basically the app server kept disclosing additional path information that led them further and further down towards discovering APIs that were there.
So we call that enumeration, right? That is looking across an asset for all of the available functionality that you can find. That enumeration, combined with the verbose error reporting, basically presented a GraphQL endpoint to the researcher.
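To make the verbose-error pattern concrete, here is a minimal Python sketch contrasting a "did you mean" style 404 with a generic one. The route list and both handler functions are hypothetical illustrations, not the actual app server's code; the point is only that the first response hands real paths to anyone who probes.

```python
import difflib

# Hypothetical route table, for illustration only.
KNOWN_ROUTES = ["/v1/menu", "/v1/orders", "/v1/graphql"]


def verbose_not_found(path: str) -> dict:
    """Anti-pattern: the 404 response suggests similar routes that really exist."""
    suggestions = difflib.get_close_matches(path, KNOWN_ROUTES, n=2, cutoff=0.5)
    return {
        "status": 404,
        "error": f"{path} does not exist",
        "did_you_mean": suggestions,  # leaks valid paths to the caller
    }


def generic_not_found(path: str) -> dict:
    """Safer: the same answer for every unknown path, with no hints."""
    return {"status": 404, "error": "Not found"}


if __name__ == "__main__":
    print(verbose_not_found("/v1/graph"))
    print(generic_not_found("/v1/graph"))
```

A caller guessing near-miss paths against the first handler can walk the route table one suggestion at a time, which is the enumeration described above.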
Now, GraphQL is a very fast-growing API technology. If you're not familiar with it, I would encourage you to go read up on it if you get a chance. One of the things that it really does is present a much simplified data query interface to the developer who is writing software to access that API. One of the downsides is that you then need to place restrictions on what data can and cannot be queried in that GraphQL endpoint. What they found was a GraphQL endpoint that did not have any of that. Typically there are things like levels of introspection or levels of query depth that you want to look at in terms of restrictions, especially if you put a GraphQL endpoint into production. Very often, GraphQL endpoints are super helpful in development scenarios, where they allow developers to really make the most of the data presented by that API, but you do then want to think about limiting access as you get into production. So there you go: you have probing and enumeration that, combined with some bad error reporting behaviors by the app server, allows for discovery of this undocumented API.
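As a rough illustration of the restrictions mentioned here, the following Python sketch rejects queries that use schema introspection or nest too deeply. It is a deliberately crude approximation that counts braces in the query text; the function names and the `MAX_DEPTH` value are assumptions for this example, and a production server would normally enforce these limits with validation rules in its GraphQL library rather than string scanning.

```python
import re

MAX_DEPTH = 3  # hypothetical limit chosen for this example

# Matches the introspection entry points __schema and __type,
# but not the harmless __typename meta field.
INTROSPECTION = re.compile(r"\b__(schema|type)\b")


def query_depth(query: str) -> int:
    """Return the deepest brace nesting in the query, skipping string literals."""
    depth = 0
    max_depth = 0
    in_string = False
    prev = ""
    for ch in query:
        if ch == '"' and prev != "\\":
            in_string = not in_string
        elif not in_string:
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
        prev = ch
    return max_depth


def is_allowed(query: str) -> bool:
    """Reject introspection queries and queries nested deeper than MAX_DEPTH."""
    if INTROSPECTION.search(query):
        return False
    return query_depth(query) <= MAX_DEPTH


if __name__ == "__main__":
    print(is_allowed("{ user { name } }"))
    print(is_allowed("{ user { orders { items { name } } } }"))
    print(is_allowed("{ __schema { types { name } } }"))
```

Brace counting also counts input-object arguments as nesting, so it can be stricter than a real depth rule; that trade-off is acceptable for a sketch but is one reason to prefer parser-based checks.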
This API has no restrictions on it. You can issue basically a "select star" and fetch an entire data set. This illustrates one of the points that I wanted to share with the audience around API breaches, and that is that when API breaches happen, they tend to be very large in scale. The reason for that is that API breaches tend to be programmatic, and that programmatic nature means that usually an entire data set is exposed. There's a logical flaw, and that exposes an entire data set. So let's move on to example two.
This is a piece of exercise equipment. It is a smart device, or a connected device. Basically every connected device that you have in your home is speaking to a backend over one or more sets of APIs. If you ever want to observe that, go onto your home router and watch the logs as the device communicates with the cloud service that it is connected to. What we saw here was that the researcher started looking at the API calls going through his home router from this device to the backend service, and he started realizing that there were patterns in the integers that were in the query parameters and the strings that were presented. Those patterns were pretty easy to take apart and understand.
You have, let's call it, a user ID and a profile ID, and you say user ID one, profile ID one. Great, that means I'm probably fetching my own user profile. What they then asked was, can I fetch profile number two, number three? In fact, what they found was unrestricted access to fetching any other profile. Combine that with integer-based, sequential numbering of records, and it was really child's play to write a script to scrape down 3 million records. This is what's known as broken object level authorization.
So we have authentication to log me in, but we're not checking post-authentication whether I am authorized to access the data that my API calls are requesting. This is one of the most common single problems that we see in API security, and this is probably the most common combination of primary and secondary attack vectors: BOLA, broken object level authorization, combined with poor data management at the back end. Again, that integer-based sequential numbering is a very bad practice. Best practice is to use GUIDs; they should have effectively some randomization in them, and they should be basically impossible to guess.
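The two fixes described here, an ownership check after authentication and non-guessable record IDs, can be sketched in a few lines of Python. The `Profile` type, the in-memory `PROFILES` store and the function names are all hypothetical stand-ins for whatever data layer a real service uses.

```python
import uuid
from dataclasses import dataclass, field


class Forbidden(Exception):
    """Raised when the caller may not access the requested object."""


@dataclass
class Profile:
    owner_id: str
    display_name: str
    # Random UUIDs instead of sequential integers, so IDs cannot be guessed.
    profile_id: str = field(default_factory=lambda: str(uuid.uuid4()))


# Hypothetical in-memory store standing in for a database.
PROFILES: dict[str, Profile] = {}


def create_profile(owner_id: str, display_name: str) -> Profile:
    profile = Profile(owner_id=owner_id, display_name=display_name)
    PROFILES[profile.profile_id] = profile
    return profile


def get_profile(requesting_user_id: str, profile_id: str) -> Profile:
    """Object-level authorization: the authenticated caller must own the record."""
    profile = PROFILES.get(profile_id)
    # Same error for "missing" and "not yours", so the response leaks nothing.
    if profile is None or profile.owner_id != requesting_user_id:
        raise Forbidden("not found or not permitted")
    return profile
```

Random IDs only make guessing harder; the ownership check in `get_profile` is what actually closes the hole, so the two are complements rather than alternatives.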
Moving on to our third example, and that is with a telco in Asia Pacific. This was a breach in September of 2022 that made a lot of noise locally in the country in question, with 11 million records provably exfiltrated. What we saw in this case was a supposedly internal API that did not have authentication requirements around it, or where the authentication requirements got mistakenly removed, combined with a network configuration change that then exposed this API publicly.
This is another common example where we see, let's call it, a configuration change or misconfiguration, again coupled with a poor authentication design. Where authentication was a top number one or two primary or secondary attack vector, it usually comes in the flavor of an authentication mechanism that just isn't working or isn't there. That was the case here, and that is really another pretty common pattern. So almost all breach events are multi-vector; hopefully those examples help to illustrate it. What that really means when you're thinking about your API strategy is that you need to think a little bit more holistically, and we're going to talk about some ideas for good API security strategies in just a minute here.
A couple of high-level analysis points from looking at the data. It's not industry specific or geography specific. We saw a couple of examples from North America and one from Asia Pacific there, but it really is global; there have been a number in pretty much every part of the world. APIs are everywhere. Some industries have had a larger impact than others, and if you think about this for a second, it's exactly what you would expect it to be. It's industries that are very interconnected, industries that have digital supply chains. Travel has been very heavily represented. There was a very recent event with Points.com that you can go look at; I unfortunately didn't have time to include that in today's presentation. You can also see on the manufacturing side there have been a number of high-profile, thankfully disclosures and not breaches, around connected cars. These are very similar to some of the examples that we looked at around smart home devices, where there's been a combination of bad numbering and data record identification practices combined with poor authentication or authorization in various cases. I'd encourage you to go look at those as well. So those have been a couple of examples, but of course technology is very heavily represented. Why? Because who builds APIs? Software companies build APIs, so that is again what you would expect it to be.
So I want to take just a couple of minutes to talk about some traditional cybersecurity approaches that don't really work well for APIs. Number one is perimeter, and there's a couple of reasons perimeter security doesn't really work for APIs. The first is the most obvious, which is that these were designed to be public APIs. These were designed to be APIs that any mobile app, smart device, etc. could connect to from really anywhere on the internet. And in fact, the data exfiltration looks like normal API traffic. In many cases it was authenticated and then just lacking proper authorization, but again, it was normal API requests that exfiltrated the data. So perimeter security really doesn't work here.
Number two is agents, and this one's a little bit more subtle and nuanced. One of the things that we've observed in working with customers around the world is where they are building APIs. To a large extent they're building APIs on modern cloud platforms, and in fact they're building them on things like serverless and container structures. The challenge there is that when you're building on cloud container or serverless services, you very often don't have access to an underlying operating system where you could install an agent. Couple that with the fact that you've got limited access to data streams coming off of those compute platforms that would provide you data to observe.
The other thing is of course that agents, like our third category, monitoring, are almost always post-event, so they can't provide prevention of API security breaches. They can at best tell you, or give you indicators, that you've been breached. That is helpful, and certainly I would encourage everybody to use what monitoring tools and capabilities are available to you, but don't think of them as solving your API security problem. They will provide you with a useful stream of data around things like performance and utilization, and you may be able to find some good forensic clues out of them, but they're not really a solution to API security.
So we have been looking at and talking to a number of customers around API security for our entire existence as a company, and we've heard a couple of things. First, we've seen this survey of CISOs that really led to a top six of what's wrong. Some of them are things that you would expect. Number one is that a lot of CISOs don't know where their APIs are. It's not just CISOs but security teams, and that is very often a symptom of development teams that got given access to platforms to go build and move the business forward. That was certainly accelerated during the pandemic, when digital transformation was front and center and the real goal of a lot of organizations was just to keep business transactions moving forward.
Number two, perimeter security, we've talked about. Number three, end-to-end tracing. This end-to-end is really crucial. Why? Because with the challenges and the breach vectors that we've seen around things like authorization or data management, these are design flaws in the applications. So who fixes that? Well, it's the application owners and the developers who are writing the app. So being able to say, hey, we've had a breach on this API that we've seen in production, and tie that back to the developers who wrote it, turns out to be super crucial.
Number four, the number of required security configurations. This is particularly challenging in fast-moving cloud environments. We saw breaches in cases where one network configuration change made an API public inadvertently. That goes hand in hand with number five, change management, and then six, the gap between developers and security teams, which ties in very much to number three, the end-to-end tracing. So you really see two to three themes within these six.
So when we talk to a lot of organizations around effective API security strategy, a lot of it comes down to how we are adopting new technologies and then how we react to that, and some of these things that you see on the screen turn out to be super crucial. The adoption path that we've seen work well for organizations is to start with the discovery process. That discovery process really helps you to understand all the APIs you have. You can do that in production, you can do that in development environments, or you can do that in some combination of the two. Secondly, once you've got that visibility and you've got, let's call it, an API inventory, you want to run an assessment: how good or bad are my APIs, and where do we have challenges? That can be through a combination of techniques, some proprietary, some open source. There's a lot of tools out there, and I'm happy to talk with anybody about that who wants to.
Typically it's only after that phase that we see customers say, okay, these are my crucial APIs, these are public facing, they process PII, or they process crucial business transactions, so how do I look at applying prevention and runtime defense for those APIs? And then one really crucial component: again, we've seen API adoption go hand in hand with modern cloud adoption, and to that end, centralizing audit trails for APIs tends to be super crucial. A lot of those platforms log to different locations by default, and with increasing requirements about timeliness and disclosing the full scale and scope of breaches, you need complete audit trails of your APIs.
One last thing I want to mention is that it's really important to look at the application layer when you look at APIs, especially from an audit trail perspective. This is a comparison chart that we put together around AWS in particular. If you want to understand what went wrong, you'll notice that the things that were used in breaches, like manipulation of request parameters and request payloads, you only get through visibility at the application layer. You can't get that from a lot of, let's call them, the default network services on these cloud platforms.
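As one possible illustration of application-layer audit data, the Python sketch below builds a structured log record that captures who called what, with which query parameters and which payload fields. The field names and the `audit_record` function are assumptions for this example, not FireTail's format or any cloud provider's log schema; the record deliberately stores payload field names rather than values to avoid copying sensitive data into logs.

```python
import json
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def audit_record(method: str, url: str, body: dict, caller_id: str, status: int) -> str:
    """Build one JSON audit line describing an API request at the application layer."""
    parts = urlsplit(url)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "caller_id": caller_id,
        "method": method,
        "path": parts.path,
        # Query parameters are where ID manipulation typically shows up.
        "query_params": parse_qs(parts.query),
        # Field names only, so sensitive payload values stay out of the log.
        "payload_fields": sorted(body.keys()),
        "status": status,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    line = audit_record(
        "GET",
        "https://api.example.com/profiles?user_id=1&profile_id=2",
        {},
        caller_id="user-1",
        status=200,
    )
    print(line)  # ship lines like this to one central store
```

A record like this makes a mismatch between the caller and the requested profile visible after the fact, which network-level flow logs alone would not show.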
So just with the couple of minutes I have left, a quick overview of us at FireTail. We've built a custom-built approach to API security that's very much in line with the NIST Cybersecurity Framework: identify, protect, detect, respond, recover.
We do that by responding to those six problem areas that we've seen and those six pillars of API security that we talk about. We help you to discover your APIs in your cloud environments, plus APIs that are being developed in code, and we're working on actually mapping that end to end for you. We give you a real-time, self-updating inventory of your APIs. You can then observe them, see what's going on, analyze traffic, and see what's good and what's bad. We give you runtime enforcement at the code level through a set of open source code libraries that you can check out on our GitHub. We also give you policy assessment of your APIs, a centralized audit trail, and incident detection and response with very rich context. We've gotten great feedback from customers around next steps first.
I hope this wasn't too much too fast, but if you want more information on the report itself, there's a QR code you can scan, and it's also available on our website. Like I said, the slides will be available on the FireTail blog, and the video will be available from API Days. Please feel free to hit us up on our website, which is just FireTail.io, and request a demo. We're also happy to let anybody play around with our POC environment, and for people watching this talk as part of API Days Hong Kong we have a limited offer for SMBs and startups for the free tier of FireTail, which we have not yet publicly launched but which we are actively seeking early adopters and early beta customers for. We're constantly releasing new solution guides and analysis of API security as we see it.
And with that I thank you so much for taking the time and it's been a real pleasure. Again my name is Jeremy Snyder please do check us out at FireTail. And if you want more information I'm just Jeremy@firetail.io.