Modern Cyber with Jeremy Snyder - Episode 36

Cory O'Daniel of Massdriver

In this thought-provoking episode of Modern Cyber, Jeremy talks with Cory O'Daniel, Co-Founder and CEO of Massdriver, about the evolving world of infrastructure-as-code (IaC) and platform automation.

Podcast Transcript

Alright. Welcome back to another episode of Modern Cyber. We've got a treat today. We've got somebody who's been in the ecosystem for a little while and has strong opinions that we will definitely get into towards, I think, the second half of today's episode. But I'm joined today by Cory O'Daniel, co-founder and CEO at Massdriver.

Cory is a self-described taco aficionado, RC car enthusiast... enthusiast. Sorry. I stumble over that word. I'm just kidding. But also enthusiastic about RC.

Awesome. But also the author of the somewhat controversial... how old is the article by now? The DevOps is bullshit article? It's like, I think it's about 2 years old now, and then the follow-up's about a year old. But it gets tossed back into the Internet gears every quarter or so.

Well, gotcha. Yeah. Yeah. I don't know. Like, I feel like any blog post that's 2 years old at this point is either kind of ancient Internet archive or, if it does get recirculated, it's now kind of accepted as gospel truth amongst the Internet and the Internet's disciples by virtue of being brought up time and again.

Right? I'll take I'll take gospel truth on that. Awesome. Awesome. Well, we'll get into that a little bit later in today's conversation.

But I wanna start off on a topic that is kind of near and dear to my heart as somebody who's been in cloud security for a long time, and that's actually the shift left aspect of it and kind of infrastructure as code. You know, with your company over there at Massdriver, I know that you're doing a lot of infrastructure as code work and, I don't know what the right way to think about it is, but platform automation is maybe a term that comes to mind. I don't know if that's a fair description, but safe to say you have a lot of exposure to the infrastructure as code side of things. Right? Yeah.

Yeah. So, I mean, I think the general... I hate this term because it overlaps with another term that is near and dear to my heart, and that is, the term people use is TACO, Terraform automation and collaboration. It's like we're in that side of the world. I love tacos, but I hate such a cherished word being used to describe software. But to that end, like, you know, most of the products in the space, ours included and some of our competitors, we all work with tools besides Terraform.

So, you know, Bicep, CloudFormation, Helm, like, the whole gamut. But Yeah. So automation platform automation is probably the way to think of it. Cool. Cool.

So on that side, when I think about platform automation, you know, I started at AWS in 2010. I left a little while later. I've been in the cloud ecosystem ever since. In the early days of infrastructure as code, all we really saw was, like, the simplest of kind of plans or templates. Right?

It's like I don't want to ClickOps my way towards having a 3 tier architecture of a web server and app server and database server. I wanna automate how the 3 of those things come together, and obviously that has gotten considerably more complex as we've layered in, you know, let's say, like, least privilege IAM principles that we might wanna deploy for any workload specific purpose and, you know, complex VPC structures and things like that. But the long and short of it has, in my mind, for a long time been making sure that the configuration of any set of infrastructure resources that we're provisioning is designed securely. Is that still the main focus of, let's say, the security lens of infrastructure as code? Or are there more things that people are thinking about now as they look at, like, hey.

Is my Terraform template good or my Terraform plan secure? Yeah. I mean, it definitely is a concern. I think that there's a lot of tools you're kinda seeing in the space now. There's, like, tfsec, the OPAs, Checkovs.

It's like a ton of tools you can kinda throw in there to, like, make sure that you're enforcing policies whether they're secured or not. But, like, one of the things that we have to go up against is, like, the cloud. Like, we think of the cloud as 2 things. Like, if I were to say, hey, the cloud is secure and easy, people would nod their head. They'd be like, yeah.

The cloud is secure. And it's fucking not either of those things. It's Yeah. Yeah. Not secured by default.

Like, you have to Well, it's definitely not easy. And it's definitely not easy. Right? It's like these two things that we, like, think, like, ah, secure and easy and the cloud. It's like, no.

That's not that does not describe our lives whatsoever. Right? It's like, one thing the thing that's still hard with infrastructure as code is, like, you have to we can shift left and say, like, shift sec onto engineers, but, like, they already there's a ton of stuff that's already been shifted left onto them. Right? Yeah.

Yep. They're managing infrastructure and trying to focus on security. It's like, okay. Well, you have to understand security, and you have to understand, like, the security domains of an individual cloud service. Right?

RDS versus IAM in general versus EC2 are very different beasts. Right? And it requires a level of experience with those services. And so it's, I think, really hard still in infrastructure as code. Like, the cloud doesn't give you a great secure system by default.

Right? You have to, oh, I have to turn on encryption. I have to not use star in my IAM policies. Yep. Right?
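
As a rough illustration of the "don't use star in IAM policies" point, the check below scans an IAM policy document for Allow statements whose Action or Resource is a bare `*`. It is only a sketch: the function name `find_wildcard_statements` and the example policy are hypothetical, and real policy tools (Checkov, OPA, tfsec) cover far more cases than this.

```python
# Hypothetical helper: flag Allow statements that grant a bare "*".
# This is an illustrative sketch, not how any particular scanner works.

def find_wildcard_statements(policy: dict) -> list[int]:
    """Return indexes of Allow statements whose Action or Resource is '*'."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]

    flagged = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # Both fields may be a single string or a list of strings.
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        if "*" in actions or "*" in resources:
            flagged.append(index)
    return flagged


# Made-up example policy: the first statement is the kind to avoid.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-bucket/*"],
        },
    ],
}

if __name__ == "__main__":
    print(find_wildcard_statements(EXAMPLE_POLICY))
```

A check like this only helps if it actually runs, which is the point made next about pipelines that never get copied along with the module.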

So there's really an onus on the person writing the code to make sure that that happens. Right? Whether that's me throwing some Checkov policies on there or OPA policies at the end of the day or thinking about the way that we design our modules in the first place, I think is probably one of the better places to do it. Right? Because the thing that's rough with using these tools that kind of scan stuff after the fact is somebody might forget to copy your CI/CD pipeline.

Right? And you put... Yeah. You put Checkov policies in there all day long. But if your engineers are, like, copying or referencing a Terraform module from another repo and they don't bring the CI policies with it, like, all for naught. Yeah.

You think you're doing things great as an ops person, CI pipeline's not running. Right? And one of the things that we've kinda missed with DevOps and shifting things left is paying attention to, like, how much we're shifting onto an engineer's plate and what should we be shifting onto an engineer's plate. Like, we do want self-service. Like, the idea of engineers being able to get things themselves instead of, like, tapping ops, DevOps, platform, whoever's shoulder and asking for it or, oh, you know, making a request through ServiceNow or whatever.

Right? Like, that sucks. Like, we all agree that sucks. Like, we want these people to have some sort of agency to manage their own infrastructure, but do we want them to understand RDS and IAM and VPC rules and subnets and all that stuff just to be able to get a database? I'd say no.

Right? So I think one of the important things for many teams when they're designing stuff and designing modules for their developers is designing essentially designing like a more constrained view of the cloud. So instead of trying to make modules that kind of let you do whatever you want, actually codify your policies into your modules and expose a surface layer to your developers, essentially, like the variables for, like, a Terraform module or a Jupo module that makes sense to your developers. Like, they care about the version of Postgres. They, you know, they might care about some settings that are on in the database, but zonal replication, they probably don't care about as long as as long as it's highly available.

They don't care if it's 2, 3, 4, 200 zones. Right? And then when it comes to security, it's like, should they have to think about security groups? I'd say no. Like, why Yeah.

To them. Right? Just put that stuff in place. And instead of trying to make the most glamorous amazing Terraform module in the world, give somebody something that's usable by them and makes all the hard decisions for them. Yeah.

And, I mean, along those lines, what you're describing sounds to me an awful lot like, you know, let's say the early days of service catalog from a ClickOps perspective in a lot of organizations because they would get to those situations where for whatever reason, and it might have been, in many cases it was, like, a data sovereignty thing where it's like, hey, we can only run in these regions because of the physical location of the infrastructure for whatever, you know, AWS or Azure region that they were looking at. Or it might have been like, oh, we have an enterprise license agreement and a savings plan with whatever cloud provider for particular instance types. And so you would see these kind of codified policies around like, you know, no, Cory, you can't just launch any EC2 instance in any region, but you can launch absolutely these instance types in these regions. And that's very much, you know, available to you to go do whatever you need to do to get your job done. And it sounds like what you're describing is kind of like, well, why don't we just like take that concept and codify it towards the left so that any engineer designing an infrastructure as code plan or template, they don't have to, to your point, like they don't have to know about zonal replication.

There's like organizational policies that say like, okay, these are the zonal replication settings that we have, and you can launch into whatever with these, you know, security configuration constraints in place. Is that kind of the way to think about it? I mean, I think so. That's the way that we've started doing a lot of our modules and with customers that we work with, especially when we're doing, like, we do, like, statements of work and, like, fractional ops hours as a part of our service. And so we'll work with companies that have, like, no idea how most of this stuff works and, like, should they?

Probably not. Eventually, yeah. Yeah. But, like, probably not, like, at the stage that they're at. And, you know, one of the ways that we've started to try to deliver stuff is focusing more on, like, the availability.

Right? So, like, that'll be an actual input to, like, a Terraform module. Like, how available does this environment need to be? Is it a production environment? Is it highly available, or is it, like, a dev environment?

And, like, to a developer, that's easy to reason about. It's like, oh, this is our checkout. This absolutely has to be high availability. Right? Or this is, like, user sign up.

It should be available, but not as high as checkout. Like, checkout takes precedence. Right? That thing always needs to be up. And what we'll tend to do is model out, like, things like replication, security configurations, like, inside the module that will kind of switch based off of the availability that they've said that they need.

I need this to be highly available. It's like, okay. We'll make the zonal decisions for you. But if you set this in developer mode, it's just gonna be a single instance sitting in, like, a single zone. Right?
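
The idea of a single availability input driving the zonal and security decisions can be sketched roughly as below. This is an illustration only, not Massdriver's implementation: the profile names, the `DatabaseSettings` fields, and the specific numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseSettings:
    """Decisions the module makes on the developer's behalf (hypothetical fields)."""
    instance_count: int
    multi_zone: bool
    storage_encrypted: bool
    deletion_protection: bool


# Hypothetical profiles: the developer picks one word, the ops team owns the rest.
# Encryption stays on in every profile, so it is never the developer's decision.
_PROFILES = {
    "development": DatabaseSettings(
        instance_count=1, multi_zone=False,
        storage_encrypted=True, deletion_protection=False,
    ),
    "standard": DatabaseSettings(
        instance_count=2, multi_zone=True,
        storage_encrypted=True, deletion_protection=True,
    ),
    "high": DatabaseSettings(
        instance_count=3, multi_zone=True,
        storage_encrypted=True, deletion_protection=True,
    ),
}


def resolve_database_settings(availability: str) -> DatabaseSettings:
    """Map the one developer-facing input to the full set of settings."""
    try:
        return _PROFILES[availability]
    except KeyError:
        allowed = ", ".join(sorted(_PROFILES))
        raise ValueError(
            f"unknown availability {availability!r}; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    print(resolve_database_settings("development"))
    print(resolve_database_settings("high"))
```

In a Terraform or OpenTofu module the same shape would be one input variable plus internal lookups, so the developer-facing interface stays at a single, easy-to-reason-about choice.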

And so kind of getting it in the language of the developers, like, that's, I think, how you get to that. You build it. You run it. Like, the thing that we were promised with DevOps. Right?

You build it. You run it. It's like, that made sense 10 years ago, but now there's just too much to reason about to still write software that makes a company money at the end of the day. But if as an ops person on your team, if I can say, okay. Well, we've thought through all these different constraints.

We've got all these different regions that are deployed, you know, different countries that we're in, and, like, this is the way that we want all of our databases to run, and here's a very limited interface for you developers. Like, guess what? That's what most developers want. Like, most developers want that Heroku experience. They're forced into the Terraform and AWS experience by, like, getting into a large team or having massive scale.

Right? And so I think you can develop and deliver your infrastructure as code to your team in a way that feels a lot more platform-esque rather than just saying, like, here is all the configuration for S3. That's terrifying. There's, like, 120 configuration options for an S3 bucket. But saying here's a module for how we log, and here's a module for how we take FTP uploads.

Like, those are very specific. You can start to filter out a lot of the world and a lot of the configuration when you start to think through it through the lens of, like, who's gonna be using it and what is the case in which they'll be using it. But I think something you said there kind of flies in the face of what a lot of people assume to be true, which is that, like, a lot of people think every developer wants full admin rights to everything. And also every developer wants the rights to launch the biggest instances that they can so that their code always flies, you know, like, whatever they develop is always amazingly performant because, sure, when you put it on, you know, 16 core GPU with x terabytes of memory, it's going to run amazingly. Like, you don't find that to be true?

So I think the developers want to be not hindered. They don't want to be obstructed. Right? Because at the end of the day, it's like, you know, we have this, like, project management chain. Right?

We're a tool. We're a tool by a business to build software because they don't know how. Right? Like, we gotta get stuff done. We gotta get stuff shipped.

Do they want admin rights to everything? No. I don't necessarily think so. But they do want to be able to make the changes they need to make. Right?

And, like, you'll see a little bit of friction when you start to, like, change the way you think about modules and you start to design with these smaller interfaces, but that's good software iteration. Right? Your customer is saying, hey. The functionality is not here. Right?

I would rather somebody come to me and say, hey. This one attribute that I need isn't exposed to me in this Terraform module. Like, I opened a PR to add it, or can you open a PR to add it, versus this thing took 150 inputs, and I couldn't figure out what 130 of them were. Can you help me? And it's like, okay.

I'm gonna do that today, and then I'm gonna do it to the next person tomorrow and the next person tomorrow. Right? So it's like that collaboration of, like, hey. This isn't working the way I need. Like, let's iterate on it, feels much more DevOps-y than here's the entire kitchen sink.

Good luck. And when you need help, make sure to block yourself by tapping me on the shoulder, which is where a lot of companies still sit today. So I think as far as, like, getting them those, you know, those admin rights, like, I don't... I think it's, like, just a matter of, like, thinking about, like, what is the right level of configuration for them that can feel like they have that full control of the cloud. And, again, I think you have to think about when you're designing, and I feel like this is really easy to go wrong with open source and, like, Terraform modules and Helm charts, but you have to think about who you're designing it for. When you're not designing it for anybody, and that's what we see a lot in the Terraform registry, then what have you got?

And go look at the VPC module. It's got 110 inputs. It's like, I... Yeah. What are they? I have no idea.

When you're like, okay. I'm designing a VPC module for my team. It starts to be very different. Right? And I think one of the things that we get to a lot in infrastructure as code is, you know, there's an official AWS VPC module.

It's like, okay. You're just re-representing the AWS API in a different way. Right? At that point in time. Right?

Like, you're exposing so many attributes. It's literally just the same API, but, like, this, like, useless thin abstraction. Like, what expertise are we putting in there? Let's say, if you look at any major registry module, there's none. It's just pass-through on variables, variables route to attributes, variables route to attributes.

There's no value in it. Right? And I think that a lot of times, like, when you're looking towards those modules, like, you're just better off writing it yourself. Right? Like, because then you can start to bring in your practices.

You can bring in, like, what we wanna expose, and that's what I don't see enough people doing. What I see when we do it with people is it really kind of unlocks that velocity in their org. They're starting to think of I need Postgres, not I need AWS Aurora RDS Postgres. Right? Like... Yeah. Yeah.

Yeah. Yeah. Come through. Yeah. But but to that point, I mean, I'm curious.

It's like 2024, I'm, like, legally required to ask about AI. Is there not, like, a, you know, some aspect of this where I can just describe to some, you know, LLM, like, hey. This is the thing that I'm trying to do. And the LLM can infer from it all the rest of the inputs. Like, is that is that a thing?

Is that going to become a thing? It might. You know, we did our fundraising round last year. I'm bearish on AI. I mean, I use it day to day, but I'm still very bearish on it.

Like, I use it in a very tinkerish way. Like, I don't do anything major with AI. And we raised our... Same for me, by the way. Yeah. It's like I'll use it to, like, hey.

Here's 8 points I wanna make to my VCs, like, some make it sound a little less abrupt. You know? Send it out. Right? Or, like, hey.

I've got this Terraform module, and, like, I wanna, you know, do a refactoring on, like, the naming for the resource names. Okay. Great. You know? But giving me net new infrastructure, like, some of our competitors are, like, throwing the AI around there, and we we did our fundraise last year very specifically without using the words AI on our site anywhere or in our pitch deck.

We didn't mention it. And we're able to do our round, and we were very proud of that. And we may bring in AI tools into Massdriver in the near future. But I think a couple of things are terrifying about it. So one, garbage in, garbage out.

Right? Like Yeah. Sure. If you look at if you look at AI where you see it like crushing is like, hey. Write a story.

K. Great. It's plenty of stories. Interesting thing about stories is authors generally, for most of history, have gone through a publisher and an editor. So really literally only the best of the best works are being mined.

Art is very similar. You're putting stuff up on Behance. You're putting stuff up on different art sites. You're trying to show off your work. When it comes to software developers, you look at GitHub at any point in time.

That's not our best work. That was what we had to do to get the job done at that point in time. Right? Or you look at the article, this was me trying to figure out how to do this. This is the best, but, no, this is how you can cobble these two things together.

Right? The data that we train AI on around cloud operations is very, like, infantile. Like, it's not our best, and it's also missing half the picture. This is how you build a great Postgres module. It's like, okay.

For for what? What purpose? Yeah. What purpose? Right?

Like, it's like, oh, well, this is the we I saw this on the AWS RDS module registry. Like, here it is. And it's like, okay. Well, what was that database designed to do? It's missing the context.

It doesn't understand Yeah. Yeah. Requests or throughput If you our domains. Right? Yeah.

But if you trained it on, like, let's say, you know, peer reviewed PRs that have been approved and gone into production, granted, like, I think a lot of organizations wouldn't have the velocity or the volume of PRs to actually, like, constitute a meaningful training corpus. But in a larger organization where you've got, like, let's say, a track record of a couple of years worth of PRs on infrastructure as code, like, you know, whether they're Terraform or CloudFormation driven PRs, you know, I would say that you could probably get a somewhat meaningful sample size for your organization. Not general purpose, but they might be, like, well tuned to your organizational policies and to, like, let's say, the standards and the defaults that you have for your company. I think you definitely get there. Yeah.

But I mean... Yeah. Then there's the next problem, which is, oh, somebody generated this code. I don't know what came out of it. Like, I don't know cloud operations. I don't know Terraform.

I don't know RDS. It gave me something, and I Terraform applied it, and it exited 0. So does that mean... Yep. Does that mean it works and it's secure and it's compliant? I have no idea.

Right? So, like, I went through this recently because one of our competitors launched, like, an AI thing, and I was like, okay. Great. I popped over to it, and I was like, I need a Kubernetes cluster that will pass CIS and could pass SOC 2 given these, like, constraints about my business. And literally, it was like new AWS.eks, and then there was a comment section where it was just, like, commented out.

It was, like, put your configuration, SOC 2 compliance and CIS here. And I'm like How helpful is that? I was, like, it's green. If I run it, it actually built a cluster, but, like, it didn't take any of my constraints into account. Right?

So I feel like the thing is, like, if you're a novice or don't know, most AI will confidently tell you the wrong thing and be fine with it. Right? The amount of times I've used ChatGPT-4o in the past couple of days where I'm like, I can tell that it's telling me something because it thinks it's what I wanna hear, and I'm like, that's not right at all. And it's like, well, yeah, it's not right, but, like, you asked. And I'm like, you can tell me no, robot.

Like Yeah. Yeah. We're just like, yes. This is how you do it. It's like, I don't think they have a free plan.

I was gonna say, oh, they don't have a free plan. I just said they did in case they do. And it's like, I'm, like, more confused than, like, when I started this thing. Right? Yeah.

I think that that's, I think, mildly problematic. I think on the other side of it, though, is, like, okay. I don't understand what I'm doing. I'm asking AI to generate it. If I run this through, you know, Terraform or Helm or whatever, it'll exit 0, but that doesn't mean it met all these requirements.

Right? If it does or doesn't, but it deploys some infrastructure either way, now who's missing the context? I'm missing context as a developer, and so are the ops people. I have no idea what put this thing into production. Now could I say, hey.

Here's a couple of prompts. Like, make sure you always include these tags. Yes. That's a great place for it. Right?

Yeah. Can I say, hey, make sure that, you know, in any given module, there's one name field, and we use that name with suffixes for all the resources that are created? Like, great. That's a great policy to put in.

And I think you can do successful things with that, but, like, can you AI away operations and expertise? I think we're pretty far from that. Or can you AI correctly get, you know, the 95 of those 110 fields on the VPC module that the developer doesn't wanna deal with... Yeah. Or that the developer actually has no opinion or any knowledge or understanding of. Exactly.

Yeah. I take your point. Yeah. I wanna shift gears for a second. I wanna talk a little bit about kind of what's going on.

We've talked about Terraform a couple of times, and obviously that's a, like, super well known and widely adopted, I don't know, standard technology format. I don't know what the right thing to actually even call Terraform is at this point. Yeah. But, you know, like, what... well, first of all, let's not answer that question.

Let's just talk about kind of what's happened in that community, the hard fork, the license changes, etcetera. For those who haven't really followed this story, like, I'm sure you live this every day, day to day, in a lot of the work that you do, so you're probably way more familiar with it than I am or than a lot of our audience is. So give us your take if you would. Yeah. So I'll give you the rundown and before I switch into my opinion mode, I'll give you a flag up.

So yes. About this time last year, I think, right around August, HashiCorp changed the licensing of many of their, like, flagship open source projects from MPL to BUSL, B-U-S-L, Business Source License. So... Yeah. Sometimes mis-abbreviated as BSL. So if you look into it, you'll see BUSL and BSL, but the technical one's BUSL.

Okay. And the reason they did this was, at the time, they said that there was a whole fleet of companies that were competitors with HashiCorp using HashiCorp's products to make money. Okay. I don't know. Within months of this, they also announced the IBM merger.

So, like, was it us or was it that? Like, who knows? But they changed the license for some reason. Right. And so, you know, around this time, a bunch of organizations were kinda put in a tight spot.

Right? All VC backed, so it's okay. Well, you can no longer use Terraform or Vault or whatever the tool is that you, you know, kinda competed with. So go out of business or pivot or fork the thing. Right?

And so, at the time, it was very confusing to everybody what was going on. Like, consultants were confused. Like, HashiCorp has a PSO arm, professional services arm. If I do consulting and write Terraform, am I competing with HashiCorp's PSO arm? Right?

So it's like, can I use this? Right? CNCF was like, are we competitors? Like, some of our products compete with HashiCorp products? Like, are we competitors?

Right? So, like, there was just a ton of vagueness in this change. And then so HashiCorp started releasing these FAQs. And so it seemed like every time somebody emailed them to ask them, like, a question, they would, like, publish an FAQ. And so, like, the license was, like, partially a license file and then partially just a growing FAQ blog post.

And that's kinda where, when you start seeing these FAQs come out, it's like you start to get a little bit of clarity as to, like, who this affected or not. And in the meantime, I'd reached out to a buddy of mine who worked at Spacelift, and I was like, this is concerning. And he was like, well, we're already talking to people, like, to figure out what to do about it. And so there's about 7 orgs that kinda came together over a weekend, tried to figure out, like, is this plausible to take a fork of Terraform? Now there's a couple other companies that started doing forks of some of the other projects, but I only really know the Terraform thing.

So, we decided to fork it. And then from there, it was, you know, trying to build up all this stuff around it to support, you know, an open source community driven team around what became OpenTofu. And so, you know, we first announced the project. There was a ton of support. There was also a ton of people very mad.

And so we're getting into the mic. Ton of HashiCorp people very mad or a ton of shills. Sorry. If you're hearing this and you're like, I was very mad about OpenTofu existing, you're a shill. The shills.

So I don't know. Those, like, people coming out of the woodwork. I don't... sorry. I said shills. I said shills.

Some of you are shills. But, it it was more like I think it was I think it was 2 glasses of people. There were there were actual shills, and there are a couple of shills out there. And then there were people that I think took HashiCorp's original phrasing as to why they did this change, and they took it to heart. Because, like me, I'm a software developer who's used HashiCorp tools for since before HashiCorp existed.

I was using Vagrant before there was even a HashiCorp. Right? And so this is a brand that as a developer, I've kind of fallen in love with over the past decade. Right? Like, I saw them go from barely existing to this company that builds all these tools to the IPO.

Right? Like, I'm using them at other companies that, you know, I'm doing consulting for. And so in my mind, I'm like, oh, this is this is, like, the company, like, to look up to as a software engineer, as a founder. Right? And so when they made this change, I think there was a lot of people that were enamored for good reason with HashiCorp and their philosophy around open source software development.

I think when they heard that, oh, people are taking advantage of us, I think people were like, well, how dare you take advantage of HashiCorp and make a competing product to them. But then what I think started to happen after that was, like, the shakeout. Right? Like, once everything started, like, to settle and people saw this was a real community driven, like, project, OpenTofu, I think people started to see, like, some of the cracks.

And so when you start looking at the cracks, this is where we'll get into my opinion just not to freak anybody out. This is Cory O'Daniel speaking here. When you start looking at the cracks in the image that they were trying to portray there, the reality is a few years ago, they cut off open source contributions to Terraform. There's even a blog post by HashiCorp about them no longer taking community contributions.

The reason why is they didn't have the staff to deal with it. Right? So now what you're seeing is a tool that for about 2 years is just kinda languishing. You know, anybody who lived through that 0.13 phase, like, it was, like, 0.7 to 0.13 to 1, like, that took a very long time to get there, and there's a lot of PRs that never made it through, a lot of issues with, like, 500 upvotes that just never made it through.

Right? So that's happening. Right? Or that's not happening as it were. Right?

People making contributions to it. And now OpenTofu is actually trying to have an RFC process that is not what the companies that are involved in it want. It's what does the community want in this thing. And if we get good functionality in here, we'll build our products to support it, but we're not making decisions on PRs and issues based on our businesses. Yeah.

The community. Right? So I think people start to see these cracks around, like, ah, like, they weren't actually doing it for the community. Like, they're doing a lot of this because they have business interests, which makes sense. I got VCs to make those decisions.

But to say that we're making this change because people are abusing it and they haven't wanted to help, I don't think that they saw 7 groups getting together that actually wanted to help and could pull this off. Right? So that's where we are today. It's now we got OpenTofu out. We've got it in the Linux Foundation, getting it into CNCF.

We have OpenTofu Days at KubeCon, and, you know, we've got 20,000, 22,000-something stars on GitHub, a few million registry downloads a day, like, starting to have uptick. And the reason why is people are seeing us, a, maintain that compatibility with Terraform that we promised to do, and, b, hitting the features that people have been asking for for years that haven't landed in main. Right? And I think that, like, that's where, you know, the rubber hits the road. It's like, we're actually building the stuff that people have wanted for a long time.

And so now you're starting to see these opinions change and people saying OpenTofu is the way to go. Right? You see people saying it on Reddit. Oh, I was just on Reddit this morning. Somebody's like, hey.

Like, I'm thinking about Terraform Cloud versus Atlantis, and everybody's like, OpenTofu. Right? So, like, you're starting to see that velocity, which I think is great. But when it first happened, it was like, a lot of people backed us. It was easy to sign a thing, but there were just as many people who were mad on Reddit as happy.

This may not go as well as we hope it does, and it's gone pretty well so far. And so for a developer who is kind of coming from Terraform, you're saying, like, compatibility is guaranteed, so anything that you've built historically you should be able to plug in and run just as you always have. Yeah. So, I mean, you can go the extreme route or, like, the simple route about this. So what I suggest is if you've already got, like, good Terraform pipelines, take your plan and run your plan twice.

Once with the OpenTofu binary, once with the Terraform binary. And if you take those and diff them, the only diff you should see is that one says Terraform and one says OpenTofu. It should be identical otherwise. And so what we suggest with a lot of customers that are already on Terraform is to throw this in their CI and just diff it and see if there's anything besides, like, the name change showing up. Right?

Yeah. And what you'll see is, like, they're very much identical because the software is mostly identical. So that's one great way to, like, start getting in there. It's, like, throw the plan in there and see that it's, like, generating the right plans. You feel confident about it.
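The plan-diff check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Massdriver's or OpenTofu's tooling: the helper names (`capture_plan`, `normalize`, `plan_diff`) are hypothetical, both binaries are assumed to be on the PATH as `terraform` and `tofu`, and the name normalization is deliberately naive, so real plans may need more filtering (registry hostnames, for example).

```python
import difflib
import re
import subprocess

# Hypothetical normalization: treat the tool's own name as noise.
# "terraform_data" is left alone because "_" is a word character.
TOOL_NAME = re.compile(r"\b(opentofu|terraform|tofu)\b", re.IGNORECASE)


def capture_plan(binary: str, workdir: str) -> str:
    """Run `<binary> plan -no-color` in workdir and return its stdout."""
    result = subprocess.run(
        [binary, "plan", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def normalize(plan_text: str) -> list[str]:
    """Replace tool names with a placeholder and drop trailing whitespace."""
    return [TOOL_NAME.sub("<tool>", line.rstrip()) for line in plan_text.splitlines()]


def plan_diff(terraform_plan: str, tofu_plan: str) -> list[str]:
    """Return unified-diff lines; an empty list means only the names differed."""
    return list(
        difflib.unified_diff(
            normalize(terraform_plan),
            normalize(tofu_plan),
            fromfile="terraform",
            tofile="tofu",
            lineterm="",
        )
    )
```

In a CI job, one might call `plan_diff(capture_plan("terraform", "."), capture_plan("tofu", "."))` and fail the build if the result is non-empty.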

If you're using, like, Atlantis or something like that, like, you'll see the PR. You'll get 2 comments posted. One about OpenTofu, one about Terraform on, like, your GitHub Actions or your GitHub PRs. That one feels pretty good. But to kinda show, like, how, like, compatible they are, like, I have an alias in my shell called t.

It's always been t. Terraform. Terraform, they got a single letter in my aliases. And now it 50/50s. So at any point in time when I'm, like, doing local development, I have no idea which tool I'm using.

And, like and I do that just to make sure that I'm aware of, like, any sort of incompatibilities we run against. Right? And so Yeah. Pretty cool. That that's, like, the hardcore one, I think.
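A 50/50 wrapper like the `t` alias described above might look something like this. It is a guess at the idea, not the speaker's actual alias: the function names are hypothetical, and it assumes both `terraform` and `tofu` are installed and on the PATH.

```python
import os
import random

# The two binaries the wrapper chooses between (assumed to be on the PATH).
BINARIES = ("terraform", "tofu")


def pick_binary(rng: random.Random) -> str:
    """Pick one of the two binaries with equal probability."""
    return rng.choice(BINARIES)


def build_command(args: list[str], rng: random.Random) -> list[str]:
    """Prepend the randomly chosen binary to the user's arguments."""
    return [pick_binary(rng), *args]


def run(args: list[str]) -> None:
    """Replace the current process with the chosen tool, e.g. run(["plan"])."""
    command = build_command(args, random.Random())
    os.execvp(command[0], command)
```

Saved as a small script that calls `run(sys.argv[1:])` and aliased to `t`, every local `t plan` would hit one tool or the other at random, which is one way to surface incompatibilities early.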

Yeah. Yeah. He's doing that. But, yeah, I mean, you're gonna see compatibility now. The one catch is OpenTofu will always be a little bit behind Terraform on Terraform functionality, but then we're ahead of Terraform on non-Terraform functionality.

Right? So the end-to-end encryption of state files. Right? So, like, talking about security of IaC, there's securing your cloud resources, but then there's securing your infrastructure as, like, securing the code itself. Right?

Yeah. Yeah. Yeah. That is like a whole other, you know, bag of worms besides securing the cloud services. So end-to-end encryption of state. We've got the dynamic functions.

So instead of, like, you know, the functions that you get from Terraform out of the box or trying to, like, if you've ever written a Terraform provider, like, you can use data resources to, like, make functions if you try really hard. You can now package custom functions with providers. Well, with OpenTofu, you can also drop in Go and Lua code, and you don't have to have custom providers for your functions. You can just get dynamic functions. Right?

And so, like, that's the kind of stuff that I think is really important for continuing to move the community forward, and I think people are really excited about. It's like seeing that stuff that people have just been waiting for forever. The other big one that's coming is dynamic provider configuration. Right? So if you ever had a team where you've, like, tried to lay down a Kubernetes cluster, and then you're like, oh, on all of our Kubernetes clusters, we're using, you know, Let's Encrypt and ExternalDNS, and we don't wanna put that in a separate Terraform module.

It's always kind of been a pain to initialize Kubernetes and then initialize the Kubernetes provider so that you can lay down some Helm charts or whatever. Right? Yeah. So with this dynamic provider configuration, you'll be able to reconfigure providers effectively on the fly. So you can yeah.

Which is big if you're doing a bunch of Kubernetes stuff. Yeah. It makes a ton of sense. It's it's actually one of the areas that I think, like, I've seen customers struggle with on the infrastructure as code side. So aside from just, like, let's say, checking the configuration of all the resources, one of the other patterns that I've seen with a lot of, let's say, larger larger let's say, more traditional organizations that are trying to be quote unquote cloud native is that they'll be like, oh, okay.

You know, we're gonna, like, we're gonna open the doors or open the gates to developers a little bit more. But you know what? There's 2 areas where our security teams are gonna get involved. And almost, like, whenever you hear that without fail, it's VPC structures and IAM roles. And so, like, you'll see them.

They're like, hey, Cory developer, go do whatever you need to do, but you can only use these VPCs and these IAM roles. And generally speaking, that can work out relatively well. But where I see it falling short is when you, as the developer, don't have a good understanding of what the settings of those environments or of those permissions and role sets are, it's where you're like, okay. Great. I'm gonna use this thing, and you look and you pick some random role off a paper or you choose some VPC based on your limited understanding of what the production environment looks like.

And then all of a sudden, you end up with, like, an unintentional configuration where, let's say, some service that you're trying to expose, that port is by default blocked. Or, some service that you launched, actually, it's open and you don't want it to be open because you don't you lack that understanding. It's like this mix of where you're trying to do, like, net new stuff defined as code into an environment that's already running whose code is not accessible to the person launching into it. Mhmm. And, like, that mismatch I've seen a number of times.

Awesome. But just to just to kind of come back on the OpenTofu stuff, just to kind of confirm, like, so from a licensing perspective, gone back to the MIT license or picked a different license as it's gone to the Linux Foundation? You know what? I, I should know this off the top of my head. I cannot.

I'm sorry. We've been doing a bunch of licensing ourself around some stuff that we're gonna be releasing open source. I'm just like, I can't remember who was who was what. So pardon me for being an idiot. Yeah.

It's the, it's the, what is it? It's not GPL. Mozilla, the MPL. Mozilla Public License. Yeah. Got it.

Got it. Okay. Cool. Cool. Cool.

Yeah. We're going to sip under IT. Master. Awesome. Awesome.

Yeah. Yeah. Well, we've only got a couple minutes left, and I wanna finish up on, you know, the thing that we talked about at the beginning. So let's talk about DevOps is bullshit. So, first of all, why is it bullshit?

You know, every organization that I talk to and it's funny, like, we do API security, which is this kind of unique beast because APIs are a funny thing, and and I often repeat this line when people ask me. They're like, well, API security, like, is that is that AppSec? Is that, like, CloudSec? Is that secure product and coding and whatnot? I'm like, well, generally speaking, yes, it's all of the above.

But actually, most security teams care about it from the perspective of what data is is potentially exposed or breachable behind this poorly written API. And so I I tend to think of it from that perspective, but the reason I bring this up is that like typically when there is a problem with an API that leaves data vulnerable it's almost always in the design. And so you kind of always have to go back to the the source of, like, who wrote this API. And so then people tend to bring in the whole, like, oh, this is a DevSecOps problem. You need to incorporate, like, you know, the developers and give them the security tooling and the security knowledge to build a secure API and give them pointers to fixing their API and so on.

So that's where my exposure to this side of the world tends to be on a regular basis. But you live it day to day, like, why is DevOps bullshit? Yes. I think it's a couple of things. So one, we still can't define it 10 years in.

Right? There's like Okay. There's like the DevOps there's DevOps culture, which is important, and then there's, like, the DevOps team or the role. Right? Which I'd wanna be crystal clear.

If you have a DevOps title today, your work is important and the stuff that you're doing is important. Right? Like, I feel like that's something that was kinda missed. People are like, oh, you're like tearing down these people? It's like, no.

That was my title for the longest time, and the entire time I'm like, this isn't a title, it's a job, it's a culture, it's how we do things, it's not me. And at a lot of companies, you will try to fight that, but you are a salmon swimming upstream. Right? So, a, we already have, like, 2 definitions. I think that's okay. B, the phrase was coined when the world was very different.

Right? It was 2008, 2010, like, we were running software on VMs. Right? Ops was setting up these VMs and networks for us. We were trying to figure out how to get software on them.

Right? I feel like that has kind of tainted what DevOps has become. Many people just consider DevOps like the CI/CD portion. Right? So it's like, you'll see a lot of job descriptions where you're like, literally this is a CI/CD engineer.

You're a YAML... Yeah. Yeah. DevOps. It's the DevOps role, and it's like, okay. Some companies it's support, some companies it's not.

It's like, so it's just vague. Like, if I said today, I've got a DevOps role open for you, you wouldn't know what I was talking about. And so, like, that right there, that is just bullshit. Now here's the thing that I think is a little bit more bullshit: we haven't adapted the definition of DevOps, if we could get it right, to fit the world that we're in today. 2010 versus 2024 deploying software, it looks very different.

Right? Yeah. For sure. Really count, like, the cloud big c in DevOps back then. It was like, EC2 is around.

Like, you had, like, classic VPCs. Just trying to get your software on these VMs. But now it's like, okay. My Docker files, is that me or is that you, or am I doing DevOps if I'm in charge of the Docker files? Now I gotta get this thing to a Kubernetes cluster, but my application, which is written in Docker, is using Step Functions to trigger some state machine stuff, which is backed by a Lambda and maybe some stuff on S3.

And so now I've got this, like, cloud stuff that it's not just how I run my software, it's how my software functions. Right? So, like, the the way that we think about our apps has changed. Like, we're no longer running on the cloud. We're composed of cloud.

Right? And so am I doing DevOps by writing some Terraform to make an S3 bucket, or is that just a part of writing my software? Right? And I feel like we haven't really incorporated that back into the original definition of DevOps, which we still haven't gotten down. Right?

And so here's where I think the... so that's a and b. The third part is we're not doing it. Like... Yeah. There's so many companies you look at today, and you're like, and you can see this in the DORA report. The DORA report is my favorite illustration of this.

Like, the report for DevOps folks. Right? There's so many companies I meet today where they're like, we have a DevOps team. We have a we have a global cloud operations team. We got SREs.

We got platform engineers. And I'm like, okay. Like, what are they... like, you sound like Google. Like, what are all these people doing? And it's like, ah, they all write YAML.

And, like, we just page Yeah. Yeah. Yeah. Page them when shit's broken. It's like, if you're paging somebody else, like, you're not building it and running it.

Or maybe you're building it and running it and just calling somebody else for support. Right? But when you look at, like, things like the DORA report, you start to see, like, 50% of companies, an outage lasts multiple days. Right? You look at the State of CD report, 30% of companies that responded to the State of CD report said that they were doing infrastructure automation.

And so, like, when you start looking at this, it's like, okay, 50% of people that like, 50% of people are deploying once a month and they have multi day outages. Only 30% of organizations are doing infrastructure as code at all. It's like, well, if you aren't codifying your infrastructure, how are you automating it? Yeah. Yeah.

How are you making it easier for your engineer to make that Step Function that calls a Lambda that makes an S3 bucket? Are they going and doing that themselves? Are they ClickOps-ing it? Because if they are, that just sounds like a lot of manual toil. It sounds like you've shifted responsibility left, but not the expertise.

Right? And that's where many teams are kind of sitting. It's, like, hey. This person does it for me. You could argue whether or not that's DevOps, or it's like, I do it all, but I don't really know, and my product manager wants me to get this shit out on Friday. So I'm gonna guess through all these things that you've exposed to me via Terraform until it works.

And if it works, it's fine. If it's secure, I don't know. It's gonna be the security guy's problem when something breaches. Right? It's gonna be the right.

What about cloud costs? Whose problem is that? Is that mine? In a DevSecOps world, like, is that is that me, or is it the CFO getting mad 3 to 6 months from now when he's like, why is the cloud bill so high? And it's like, oh, because all these engineers keep just deploying metal with a 128 gigs of RAM so their stuff never goes down.

Right? And so I feel like like we didn't have a good definition of it. It kind of, like, wriggled out of our hands over the past 14 years while the cloud has kind of just consumed everything, and now we don't have a definition of it. But at the same time, like, we need it. There's more companies that are becoming software companies.

There's more people that are moving to the cloud. I don't know about you, but I get a breach notice, like, maybe every 2 weeks now. Like Yeah. It's more important than it was 14 years ago, and we're still, for many companies, in the exact same place we were 14 years ago. And, like, that's why it feels like bullshit to me.

We couldn't define it. We couldn't get it to kind of take on this full picture of the cloud, and many of us are still where we were then. Right? And that's one of the things that we're trying to help with is, like, trying to, like, bridge that talent gap. Like, there's just not enough people with that expertise at the same time.

There's a ton of software developers out there, but most of them have spent their time, like, developing on a PaaS, which is great. But then when they get into the cloud, it's like the only way they're gonna learn about it is in production. Like, you're not gonna learn the same things on that free tier from AWS that you'll learn in a production system. Right? Yeah.

Yeah. I mean, I can speak firsthand to the truth of that point for sure. You know, it's like when I started my career pre-cloud, like, you didn't learn about infrastructure and data centers until you actually, like, built one and made that 3 AM drive to swap out the failed hard drive and recover the corrupt file system and all the things that went into it. And, you know, to your point, you also learned the lessons of that and you took your hard knocks and then you got better with every iteration, and every outage really informed, you know, how you design things moving forward. And so do you think that, like, there's a difference between smaller companies and larger companies as far as how they embrace this? And, you know, like, because my sense is that smaller organizations, if I think about what I might have originally understood DevOps to mean, and everybody's seen the kind of, like, infinity 8 for, you know, like, release, measure, iterate, improve, you know, blah blah blah, plan, design, improve, etcetera.

Like, you know, smaller companies just by virtue of being resource constrained are where I see the same engineers who wrote the code being the ones responding to the outages and having to take those lessons learned and get better for the next time. Is that, you know, is that just kind of like small sample size and representative of small organizations, or is that kind of a broader truth that you've seen as well? No. That that honestly I feel I feel like some of the teams that are doing, like, DevOps the best are the smallest teams. Because it's like you only have so many people.

Right? Somebody has to do it. Right? And what what I've seen happen more times than not, and this is actually how I became a DevOps engineer was I was the person on the team that for whatever reason, you know, it happened to me. My boss was like, this person knows the most about the cloud.

AWS. Yeah. You're the ops person now. Right? And all of a sudden, it's just, like, velocity on the product dies because, like, oh, they take this person who's doing a ton for the product they're trying to sell.

They're, like, you are now supporting these 7 other people because you have the most experience there. Right? And that happens a lot where it's like, oh, this is my one ops person now. Right? And, like, that is where I feel like things start to go a bit haywire.

Now you've got this one person who's probably doing a job they don't necessarily love. Right? They joined to work on your product, but now they're kinda supporting these 7 other people, and you got one. They call out sick. Oh, they go on vacation.

Right? Yeah. Yeah. Thing breaks at 2 AM too many times in a row. Like, you're starting to wear that person down.

Right? And so I feel like it's a great place to start, but, like, almost the immediate place people go is, well, we need to hire ourselves an ops person. Right? Or they won't do that thing where they grab that person and say you're the ops person now, but say, let's post a JD and get our first ops person in here. That is the most terrifying job description in the world for me.

I'm like, oh, shit. You guys have been in business for 5 years and have never had somebody, like, managing all this stuff. And, like, day 1, I'm gonna have 20 developers mad at me. Like, I'm walking knee deep in the debt, and I'm the only one. You know, hire us in pairs. Like, that's a scary job description for me.

So, yeah, I mean, I feel like the early stage companies, like, they kinda get it. They're building it and they're running it. But I also think another place that is fine is just using a PaaS. Right? And this is, you know, you won't come to Massdriver if you're using a PaaS and that's fine, because I honestly, at the end of the day, care more about your software being secure than making an extra few thousand dollars a year off you.

There's nothing wrong with running your business on a PaaS until you... Yeah. ...start adopting the cloud. Now the problem is a lot of people are starting to adopt cloud earlier. Right? Because you are getting things in there that are appealing to you.

Right? You're running on a Heroku, which is great, but then you need to run, I don't know, your Claude Sonnet model on, you know, SageMaker or whatever. Right? So, like, now I've got a foot into AWS. Now I've got 2 platforms.

I'm managing the Heroku stuff, and I'm managing AWS stuff. Right? And it's like, well, we're already using AWS. Why don't we just use that for objects? Or right?

And you kinda just start stacking on more and more stuff. So I think it's fine to, like, start simple, and we probably should be starting simple more often. But I think that the cloud has found its way to creep into your simple architectures earlier and earlier. Yeah. Yeah.

Well, on that note, I think that is where we leave you. Not with necessarily words of sage wisdom for how you should be doing things better, but with just some observations about the state of things. So, Cory, thank you so much for taking the time. For people who wanna learn more about you, the work that you're doing, your team, I know it's Massdriver dot cloud, anywhere else that they should check out? Yeah.

Our YouTube channel. So we do webinars every Wednesday on just different cloud topics, cloud migrations, like infrastructure as code best practices. You can check out our YouTube channel, Massdriver cloud on YouTube. Got a bunch of stuff up there, and then, I'm Cory O'Daniel pretty much everywhere. So Twitter, LinkedIn, all that jazz.

And then we also are the sponsors of the Platform Engineering Podcast. So, great podcast to check out if you wanna hear all things cloud operations. Alright. Well, we will have the podcast, the YouTube channel, and Massdriver linked. Cory O'Daniel, thank you so much for taking the time to join us today on Modern Cyber.

We'll talk to you next time. Bye bye.
