Modern Cyber with Jeremy Snyder - Episode 26

Zack Glick of Zatik Security

In this episode of Modern Cyber, Jeremy Snyder speaks with Zack Glick, founder of Zatik Security, live from fwd:cloudsec 2024. Zack shares insights from his extensive experience in cloud incident response, including his time at AWS, where he handled major incidents like Heartbleed and Log4j.

Podcast Transcript

00:08

Welcome to another episode of Modern Cyber coming to you once again from the sidelines of fwd:cloudsec 2024. And I am joined once again by somebody attending the conference who I wouldn't ordinarily get a chance to interact with, but somebody with a wealth of experience in the cloud space, specifically cloud incident response, Zack Glick.

00:37

Thank you for taking the time to join us. Thanks for having me on. My pleasure. And do us a favor, give us a little bit about your background and some of your experience. So I was with AWS security for eight years. Okay. And my tenure there was from Heartbleed to Log4j. Okay. My first page was for Heartbleed. They took pity on me and let me shadow that one. My last month there, my last full month there, was Log4j. Wow. And so I worked in IR, international expansion, and researcher relations, which is how I...

01:06

know a lot of the folks here at fwd:cloudsec. Awesome, awesome. It's hilarious to me, hilarious, in a way that you mark your tenure by first major incident, last major incident, but I do get it. I'm also former AWS, although a number of years before you, and I kind of look at it from, you know, first big customer workload migrated to last big customer workload migrated, so I can kind of relate from that sense.

01:30

IR is something that we've actually talked about on the podcast before, but I think there are different flavors of IR, and I think that's something important to understand. When you worked there and from the experience that you gained, what is unique about doing IR at AWS or on AWS environments? Yeah. The biggest thing is what you bring is calm. Okay. So every pilot on every plane you've ever been on has the pilot voice. Okay.

01:59

And if the plane started shaking and that voice didn't come over saying that, you know, folks, everything's okay. You're going to find some food there for you. Right. You'd be a little worried. Yeah. And on our team, we were dealing with tens of incidents at a time, not all live issues, but running them as if they were. And it was just another day at the office for us when something came up, right? We had a follow the sun on call model where we had US East, US West, Asia Pacific, and Europe.

02:29

So stressful day, but you're there for six hours. Yeah. And what you bring is calm. Yeah. Just another day, we had an issue, we're going to work through it. Yeah. Here's what I need you to do. Yeah. Being very clear on assigning tasks and delegating things out. Yeah. You're more of an incident commander than an incident responder. Yeah. Because each AWS service is its own little special snowflake. Yep. And so you're relying on your technical experts. Where are the logs? What's the format? What can we dive on?

02:59

You're working with the developers to get, you know, containment, mitigation, and recovery. Yeah. You're working with PR on explaining to customers what happened, finding and notifying customers, you're working with the field staff to make sure that they know how to talk to those customers and, you know, everything else that goes with running an incident, but you're more of a commander and a manager than an actual responder.

03:20

It's really interesting. And by the way, I kind of glanced at the little timer that we've got going here because it was about two minutes into your response before you ever mentioned anything technical. The first mention of log files was about two minutes in. I find that really interesting. I think it's like you think about incidents. And I have a friend who works for an MDR provider. And he said, for most customers, it's the worst day of their life. And I'm like, what?

03:44

I don't know if that's objectively true, but let's say the worst day of their life in that job or in that organization that they happen to be with when the incident happens. And that point you raise about kind of keeping people calm in the moment and understanding, I don't know if it's the relative importance or if it's the assuredness of we will come out the other side of this. What do you think is that kind of stability that customers are looking for? What is the kind of like...

04:12

light at the end of the tunnel that they see there. Yeah, I mean, we know that recovery is the last phase of an incident. But if it's your first time running an incident, you don't know that that's the last step, right? And unless you play sports or are into martial arts or something, it's very rare as an adult human in modern society that you're actively being hunted, where there's another human actively trying to cause harm to you. Right.

04:39

And that's a weird feeling, right? If you're a board gamer, obviously there are people that unfortunately experience that in their day to day, but for most people, when you come into an incident and it's not a tree branch has fallen on the roof of our house, there is someone doing harm to a thing that I put my blood, sweat, and tears into. I care about, it's important to me. I care, yeah, I get meaning from it, so you can feel bad. Right, right. And as an incident responder,

05:08

you know that people recover, right? You know that recovery is that last step of the model, but in the day to day when you're just, there are no atheists in the foxhole, right? It's hard in an incident to see the big picture. Yeah. And you're there for your expertise, you're there for calm. Yeah, yeah. I'm curious about something else. So you mentioned recovery is kind of the last step.

05:32

Does that lead me to think that, you know, the NIST Cybersecurity Framework and kind of those pillars are a guiding principle of incident response for you or for the teams that you've been a part of? I think when you're doing incident response at that sort of not strategic level, not tactical level, but that middle level, like really understanding how does each individual tactical aspect fit into the whole thing. Yeah. Right. You're, you know, the recovery function, doing the after action report.

06:00

It's not a thing you do two weeks later to hopefully whatever and maybe there's some action items. Like you as the commander are thinking, all right, well we couldn't, you know, we weren't able to do this dive because of compute issues or log files or, hey, we had this issue where it wasn't in UTC, it was in Cape Town standard time. Like, that was annoying. Right, it's your job to start shunting those over to the recovery doc, even if they're totally unformatted, totally whatever.

06:30

Yep. And then having a super strong PM team that is there to help track those after action item or after action review items, make sure they get done to feed back into that detection, that respond function, right? Yeah. The next time you're there. Yeah. Right. You know, if there's words, hey, the last time we tried to do this, this was annoying. Yeah. And you find that right in the, you know, incident repo. Yeah. Or right in the playbook. Yeah. It helps you put that calm out there.

07:00

So you're running the issue. And do you find that because of the scale and the scope and how critical the cloud providers are to organizations that it's a particularly high stress environment for a lot of the people working IR? Yeah, I mean, I was there. So I was there for sort of four years in the thick of it as a frontline responder. Yeah. And then I needed to take a year and a half to... So I was the lead security engineer for building up the Hong Kong region. Yeah.

07:29

So if anybody has had to deal with opt-in regions and turning regions on, I was a part of the team that helped get that as the default going forward. Yeah. And so that was my time away from incident response to recover, you know, and recharge. Interesting. And I think we dealt with the hassle of a follow the sun on call model, right? Yeah. It's an active leadership decision to not try to roll three shifts in the same geographic region, but to deal with hiring laws in

07:59

the EU, America and Australia, three different immigration systems, three different management styles, three different universities, time zones, senior managers flying all over the place to meet each other. It's a lot of overhead. Yeah. But, you know, letting somebody run an incident when it's their daytime. Yeah. It helps you make that better decision. Yeah. You know, angel or devil on the shoulder saying like, I know it's Friday at 6pm, but like we got to.

08:29

push a little further. Yeah, if you're in Australia, and that happens to be, well, Friday's a bad example, because that's their Saturday, but right, right. You know, they're fresh. Yeah, and they can keep temperatures cool, they can keep emotions in check. Because, you know, they've had some time to recover, even if it's just a little bit. Yeah, I think that's such a good point. You're in this like high stress environment to start off with, and then you're in a high stress moment in that environment.

08:56

And there's study after study that shows that we don't do our best decision making under those circumstances as human beings, right? You may make suboptimal decisions and those can have a big impact on the organization itself or importantly on the customers, right? So I really appreciate that perspective on it. So walk us through kind of what it's like from the perspective of

09:21

something major, like you said, Heartbleed. So you got to shadow that. And to the extent that you can, I know there's gonna be some things that are obviously off limits, but to the extent that you can, talk us through what does that look like from the inside? Yeah, I mean, it's just a bigger version of what you're used to. Okay. So understanding what is the issue, what are we dealing with, understanding where are we impacted. And that was always tricky, right? Because there were multiple different levels of...

09:49

security in the cloud, security of the cloud, everyone's heard that. Right, right. We were that security of the cloud function. Right. And so really understanding how it affected individual services, how it would affect customer support, how it would affect, you know, in the case of Heartbleed, it was a pretty major incident. And so there was a lot of, you know, a lot of work. So in the sense of like every service has their own APIs, each of those service APIs, every service per region has its own APIs.

10:17

And then each of those APIs has an SSL certificate that was potentially vulnerable or? Yeah, I don't remember the exact details. So I don't want to really get into it, but it really is just tracking what does getting to mitigation and recovery look like? Okay. Right. And so we had great senior leadership there that was really focused on understanding what is the definition of done. Okay. What is a pants on fire, we need to do it now? What is a...

10:47

we should do it pretty soon? What is a nice to have? And seeing, this was in 2014, seeing the Amazon machine really spin up things like s2n, which is this custom SSL version that was sort of a long-term action, and that's not something to do in the middle of an incident. No, right, right. But it's cool to see that function come out of an incident like that.

11:12

Yeah, that's really fascinating. And that, you know, one thing I'd be curious to understand is, you know, if you're the cloud provider and you've got tens of thousands of customers dependent on your service, it's almost never really an option for you to take your services offline. Whereas for individual organizations, it's almost a typical response nowadays. You know, it's very common: if you subscribe to any of the cyber newsletters or anything like that, pretty much once a week

11:40

you will see at least one story that such and such organization takes services offline to deal with X. It could be ransomware, it could be cyber incident, broadly undefined cyber incident. If they're not sure. Right. If they're not sure, right. If they've been breached or compromised or whatnot. And historically, that's actually a relatively recent kind of development from my perspective. When I started in IT in 1997,

12:07

We were all about 99% uptime in SLAs and trying to get from 99 to 99.9, right? And things like that. But now actually it's almost seen as a valid response to, you know what, we're just offline. But for the cloud provider, that's pretty much not an option, right? Yeah, I mean, your, you know, whether this is an original Amazon line or they stole it from somewhere, right? Customer trust is the hardest resource to earn and the easiest one to spend. And...

12:35

you are asking customers to run their business on a thing that you're running. Yeah. And that's a big responsibility. Yeah. And people felt that responsibility of keeping things ready for customers. Yeah. If it had to happen right, we had the ability to, the escalation process of being able to run things up the chain, of impacted customers might need to do a reboot, but.

13:01

you know, finding ways to keep things alive was key because you're asking people to put their trust in you. Yeah, yeah. So with that background and all the experience that you have, what have you moved on to since leaving Amazon? So I was with New Relic for a little while and then I left them at the beginning of this year to start Zatik, which is a consulting company focused on small businesses. Okay. And so just like companies will have a fractional CFO or a Chief Marketing Officer, we're a fractional application security team.

13:31

Okay. And so for the cost of a single FTE, we help a company build up a security team, help them with a security strategy. So if they're trying to get through a vendor management office, we can help them build those enterprise features that they're going to need, help them build those customer security controls to get through vendor incorporation and acquisition. Yep. And then we're building a product for super small businesses that's launching later this summer. And why did you choose to focus on application security as opposed to

14:00

infrastructure security or network security or even identity security? In a small and medium business world, it was the term that made the most sense. Because we end up touching all those things. We even added IT security. It turned out we needed to bring a few more people on who understood that. Because at a small business, at a startup, you're at a startup, right? People are wearing so many different hats. You really don't necessarily have those firm structures, columns, organizational isolation.

14:28

And so at a small business, you need to be able to talk about everything and touch every part of the business when you're trying to help them secure what they're building. Interesting, interesting. So we're here at fwd:cloudsec. I've seen a lot of great talks. Yes. Have you? I've seen a bunch of, seen one on cosmic ray bit flipping yesterday. That was fascinating, right? Super fun. Yeah. Fun to see that, that public, uh, you know, in a public setting. Yep. Yep. Um, you know, the talks on, uh,

14:57

different vulnerable policies, seeing CVEs issued for cloud products. So my last role at Amazon was running the outreach team, owning the aws-security@amazon.com inbox. And that's where I met a lot of the researchers that are at this conference. And there was always the question of cloud CVEs, why is there no cloud CVE? Becoming intimately familiar with those MITRE reporting rules. And so it was cool to come to a talk at the cloud conference and see that

15:27

since it was client facing software this time, it made sense for there to be a CVE. So that was fun to see as well. Yeah. It's funny. MITRE is something that I kind of struggle with, to be honest, because I find it to be applicable for, let's say, broadly distributed third party software. But most of the organizations that I interact with on a daily basis are companies that are building their own IP. Most of it is software based. Yeah. We work with a lot of fintechs, mobile apps, IoT companies, things like that. And so you see them

15:56

developing issues of their own, not intentionally, of course, but they end up building software and that software ends up having vulnerabilities and issues. And then you can't really apply any kind of MITRE ATT&CK or any kind of framework to it because it's such a one-off type of thing, even if it follows a pattern of, you know, in our case we focus on API security and one of the things we see most often is authorization issues.

16:21

So we have this broad category of authorization issues, but you can't then kind of apply a label to the things that you're finding in the customer environment. So anyway, that's a separate challenge. Yeah, I think the CVE discussion, CVE is not the right tool. Okay. The counting rules, as they call them, are: a CVE is there to let somebody know they need to do something. And that something is to go from the vulnerable version to the patched version. And so I think there is definitely a use case

16:51

for a "hey, this happened" universal identifier that's not a code name somebody picked up from the marketing team. CVE is not the right one, we're just gonna cloud that space. And going back to those after action reviews, so I went to Syracuse University for grad school in the School of Information Studies and the original information study was library science. And that ability to catalog and come up with a taxonomy.

17:19

Yeah, and classify data. Yeah, it's the thing I sort of regret the most is that Amazon never had that official historian. So that all the stories that we told internally could have been recovered, you know, for years later, but build those after action reviews. And when you bring new people onto the team, that should be part of their onboarding process. Is to go back and read. Is read all the incident reports, the mistakes, the lessons learned, why things operate the way they do now.

17:46

Hopefully it's from the action items that have come out of all those things. Yeah. All right. Why does the build pipeline look the way it does? It's not like we talk a lot about the security developer experience. Yeah. Right? If we make the default secure, and I think bringing that as a part of the onboarding, especially as you're bringing on devs, they bring the culture. Culture leads the product, and everybody wants to build a quality product, then security is just another measure of quality. That's such a great point, because I think like,

18:16

I've never entered an organization where the first task was to sit down and read stuff other than when I joined Amazon and it was to read a bunch of documentation. At that time it was, here's some HR policies, but more than anything it was, I think there were five documents given to us about the core five AWS services at the time, which were EC2, EBS, S3, RDS, and maybe SimpleDB if I'm kind of dating myself. I joined pre-VPC and IAM.

18:46

you know, that's kind of all there was. And so you sat down and you read a bunch of documentation, but I rarely hear about organizations where you sit down and, you know, read what's happened to the company. And I do wonder, I mean, I've got to think part of that is just that there's such a temptation like, okay, we've closed that incident, let's move on and get back to work. Do you think that's what it is or is there more? Some of it, well, I mean, unfortunately, some of it is legal. Yeah. You don't want those docs widely read, yada yada.

19:16

I'm not a lawyer, but some of it is there's trauma there. We don't want to remember the worst day of our life when we were at the office with a hostile actor on the network. That day sucked. We have this new person, we want them to know they're joining this super cool company that doesn't have those bad days. There's this concept called the Entrepreneurial, I know we're coming up on time, but the Entrepreneurial Operating System talks about having a company scorecard,

19:44

which is a thing that you review once a week to sort of understand the health of your business. And retroactively exposing that to new hires is a great way for them to see the trends that have emerged over time, where new business processes come from, why the organization is shaped the way it's shaped. When you come into a new organization, I've joined Amazon, I've joined smaller companies like New Relic, I've started a company, every organization is unique in understanding...

20:13

how we got here, that's the history major. Yeah, yeah. We're shaped by the society we live in. Yeah. Society we're looking to shape from the history that it comes from. And so as you've started Zatik, what have you thought about putting into your own scorecard? So we're looking at, we're split between delivery and discovery. Okay. And so my co-founder Kimberly is more on the discovery side and I'm more on the delivery side. Okay. And even with her in Seattle and myself in Buffalo, we need to make sure that, you know, what did I work on

20:43

delivering for the customer this week, this, this, and this. What happened on the discovery? I had these meetings, this iron is hot, right? And understanding, you know, now we're getting a little more formal, right? We can say like, all right, we've built a HubSpot pipeline that drives that. Yeah. All right. We're getting into Jira tracking for items from meetings had with clients. So you don't want to, you know, you don't want to formalize all that stuff right away, but it really gives you a quick way to talk to your co-founder, talk to your people.

21:12

Yeah, really understand what blog posts we have in the fire. Yeah, yeah. Sales leads we have in the fire, because that's how the company grows. Awesome. Awesome. Well, for people who are looking to learn more about yourself and the organization, how is that spelled and where can they find you? Okay, ZATIK.io. So exotic, Zatik. Okay. If you find yourself with the South American skincare brand, okay, you're at the dot com, not the dot io. Got it. And if they're interested in our product, they can go to ZATIK.io,

21:41

and that's for small businesses when you're at your next barbecue and that person asks you how to secure their restaurant and you don't want to help them because you're too busy. That's what we're building to help those small businesses.
