Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. He delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability, and emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
- Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers
- Introduction
- How did you get involved in the area of data management?
- Can you describe what DataKitchen is and the story behind it?
- You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. What is your view on the realities of "DataOps" today?
- Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen?
- The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data?
- What are the challenges that never went away?
- You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects?
- What are the areas of overlap with existing tools and what are the unique capabilities that you are offering?
- Can you talk through the technical implementation of your new observability and quality testing platform?
- What does the onboarding and integration process look like?
- Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday?
- What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps?
- What do you have planned for the future of your work at DataKitchen?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- DataKitchen
- Podcast Episode
- NASA
- DataOps Manifesto
- Data Reliability Engineering
- Data Observability
- dbt
- DevOps Enterprise Summit
- Building The Data Warehouse by Bill Inmon (affiliate link)
- dataops-testgen, dataops-observability
- Free Data Quality and Data Observability Certification
- Databricks
- DORA Metrics
- DORA for data
[00:00:11] Tobias Macey:
Hello, and welcome to the Data Engineering Podcast, the show about modern data management.
[00:00:16] Tobias Macey:
Your host is Tobias Macey, and today I'm welcoming back Chris Bergh to talk about his tireless quest to simplify the lives of data engineers. So, Chris, can you start by introducing yourself?
[00:00:26] Chris Bergh:
So my name is Chris Bergh. I'm the CEO of a company called DataKitchen in Massachusetts. I have a long history in technology, going back to the nineties. I worked at places like NASA and MIT Lincoln Lab. I was doing AI back when it wasn't cool and your machine learning class had only 6 people in it. So I'm that old. And then, about 2005, I had done a bunch of things in software and managing teams, and I thought the best thing to do would be data and analytics. So I started managing a team. My kids were young, like 8 and 5, and I thought, well, this will be easy. I'm a big software engineer. This data stuff, no problem. And it turned out my life was hell. Things kept breaking. We could never go fast enough. Everyone wanted their own tool.
And that was a really good experience. So we spent 8 years doing a lot of work, and we eventually sold the company. Then, about 11 years ago, we started this company, DataKitchen, to really focus on those sets of problems for data and analytics teams, which were generated from our experience: stuff's breaking left and right, you can never go fast enough, and your customers are just incredibly demanding. How do you live in that world and sort of not get depressed and, somewhat, want a therapist?
[00:01:43] Tobias Macey:
You mentioned a bit about what DataKitchen is and some of the story behind how you got into it. What is it about that particular space of trying to address the problems of data engineers that has kept you engaged and focused on that problem for so long? Should you have solved it by now?
[00:02:00] Chris Bergh:
Yeah. Well, I wake up in the middle of the night and say I should have. Right? So this will be an interesting discussion, because why isn't it solved? Why aren't those problems solved? And I think they really are, in some ways, getting somewhat better, but also getting worse. A lot of people will build data and analytics systems. Right? They'll focus on the first day. They'll go into Azure or Amazon, they'll go into GitHub and go, look at all this stuff I can do. And they'll put together something and have some insight for a customer.
And that's great. They'll focus a lot on building new things, but they won't focus on the second or third day, which is: I have to run this.
[00:02:46] Tobias Macey:
So you mentioned that you ended up starting DataKitchen because of the problems that you were seeing as a manager of a data team, and you were one of the people who helped to define and popularize the concept of DataOps. I'm wondering if you can talk to some of the ways that your journey has progressed, where things are getting better, and maybe some of the ways that things have stayed the same and the pressures and complexities of data have resisted everyone's efforts to move into this bright new future where everything works all the time.
[00:03:20] Chris Bergh:
Yeah. We did a survey about 2 years ago of 700 data engineers. It came back that 78% of them were so frustrated they wanted their job to come with a therapist, and over 50% of them were thinking of quitting. So why is that, right? When data is great and we have all these tools, why are people so upset? And I think it comes down to the fact that we're disconnected from our customers. We're not achieving our goals. And from an industry standpoint, we popularize a lot of buzzwords. I've been part of that, popularizing DataOps. But they mutate, and they become almost meaningless.
These terms that go out there, they're like a gas. They sort of expand to fill everything. And so what happens to any term like DataOps or data mesh or data observability or data products or data lakehouse is that they start off with a fairly precise definition. But then people bend them to whatever they're currently doing and say, I'm doing data mesh, when in fact what they're doing is plain old ETL. Or, I have a data product tool. Well, what is that? Well, it's really a governance platform. And so the pollution of the namespace in data and analytics makes it hard to get through that. And, you know, we've tried. We wrote a manifesto. We've written 2 books on DataOps. We've tried to stay pure to the 2 concepts at the heart of what DataOps is. One is that in production, you're running a factory of data and analytics, and that factory is taking data from one side, passing it through a bunch of tools, and giving insight on the other side. And so the ability for you to run a factory with low errors and low bottlenecks takes a lot of inspiration from industrial process control and Deming, etcetera.
And then the second part is you have a factory that you want to change very quickly because your customers are demanding. And that takes a lot of the same ideas that came from agile software development, because what you're changing is software. You know, terms are not built by engineers. They're distorted by marketers. And so it's hard to make sense of anything nowadays.
[00:05:43] Tobias Macey:
Absolutely. On that note of the misappropriation of the term DataOps, it seems to have gone through a similar journey to the concept of DevOps, which started, I guess, close to 15 years ago now, where it began with a very specific meaning. It then ballooned to mean everything, and has since passed into the general vernacular. It's still not used correctly in most cases, but there's less of an argument about the misappropriation of it. And it seems that DataOps has gone through a similar cycle: it was introduced, people started using it to mean everything having to do with data quality, data observability, and data testing, and it has now started to fade a bit more into the background in favor of the latest cycle of hyped-up terms. And I'm wondering if you can give your sense of what you see as the state of DataOps today, however you want to define that.
[00:06:47] Chris Bergh:
Well, I agree with your definition. I think people even use DataOps as a moniker for ETL or data transformation or databases. Right? It's everything. And I don't agree with that way of thinking. I do think it's proceeded a lot more slowly than software's adoption of DevOps, and I think it's really interesting why. So, 20 years ago, one way I would change code in production is I'd literally go on the production machine and change some code. And nobody on my team went, ew. What are you doing? That's gross. Don't do that. Don't change things on production. In fact, it was seen as kind of cool. Like, oh, he's a cowboy. And so what hasn't happened yet with data people is they're not saying ew enough.
Like, you just change your code in production, and you don't know if it works. You don't automate deployment. You don't have testing. You don't have monitoring. You're kinda hoping everything works. And that's unfortunate, because I think one of the reasons why we're so depressed and so upset as a team is this death by a thousand knives. We're not automating enough. We're not testing enough. We're not saying ew to each other enough. And we're sort of accepting that heroism is part of it. Or some teams are the exact opposite: they're living in such fear of breaking anything they've made that they don't wanna touch it, and then they have this sort of procedural defensiveness.
[00:08:21] Tobias Macey:
So along with that misappropriation of the concept of DataOps, there were a sequence of subsequent trends that came along, notably data observability and data reliability engineering. Those all try to encapsulate at least a portion of the idea of DataOps. And I'm wondering how those subsequent trends, from when you first started DataKitchen, first started trying to popularize DataOps, wrote the DataOps manifesto, how those different cycles of adoption and technical evolution have influenced the way that you think about the work that you're doing at DataKitchen?
[00:08:55] Chris Bergh:
Actually, quite a bit. You know, 10 years ago when I was first talking about DataOps, I'd go to conferences and bring up Git, and I had to explain version control. And sort of 1 out of 10 people, 1 out of a 100, knew what Git was, knew what CI/CD was. The thought of actually coding SQL freaked a lot of people out. So I think there are some tools that have really helped the world, like the existence of dbt and its integration with Git and its way of writing templated SQL. You know, that sort of $600 or $700 million in data observability investment capital and the massive marketing budgets, that's also why people have data observability projects now, or data reliability projects. And more and more tools actually have Git and CI/CD integration.
And so I think those things are not as far out as they once were, which is fantastic, and I think they actually have influenced us. You know, I'm a technical founder. It took me a while to start a company, but I intended to do it based on my values, based on always being profitable. So we've run a company that's never had any investment; we've just sort of been quietly bootstrapping. And I think it's really influenced me, because I thought, 2, maybe 4 years ago, and even before that, that the way this change would happen in the world, the way people would adopt the principles of DataOps, would be from the top down.
And I went to a DevOps conference 6 or 7 years ago, the DevOps Enterprise Summit, and there were all these enterprise teams saying, we're adopting agile, we're adopting DevOps. They were so enthusiastic. They were so into it. And I'm like, okay, this is gonna happen. People are gonna start to believe these ideas just like they believe DevOps. And it turns out that's not the way change is happening. Unfortunately, the people who run data and analytics teams have, like, an 18-to-24-month, sometimes two-and-a-half-year life cycle. CDOs, VPs, they're hired and fired so quickly.
And when you try to have a top-down change like that and your foundation is a mess, it's really hard to make it happen. So what we've switched to is really focusing on individual contributors and kind of starting over again. We still support talking to CDOs, but we've open sourced a whole bunch of software targeted at the single data engineer or data quality person who is tired of getting beaten up for errors in production, or just tired of going too slow, and wants to have more satisfaction in their job. And it's really influenced how we think, because how does change happen in the world? Right? When we started, we had to say, here's the idea. We wrote books and videos and talked about it. You know, we had to run a company. But really, we've invested, I don't know, $6,000,000 in software development, which is a lot for a company like ours, and we've completely open sourced it. It's all full featured, Apache 2.0. And we're excited about this opportunity to continue to try to push the market forward.
[00:12:13] Tobias Macey:
On that note of how to drive change and the turnover cycles of senior leadership in the data space, I think that speaks as well to the fact that companies think, oh, data is going to solve all my problems. We're going to throw a bunch of money at it, we'll become data driven, and everything will be great and profitable. But then they don't factor in the realities of how complicated the data workflow is, the cycles of how long it takes to actually build out the data warehouse and then be able to gain real value from it. I've actually been reading Bill Inmon's Building the Data Warehouse book recently, which has a lot of useful insight there: unlike software, you don't start from requirements and build the product. You build the product to then get the requirements, and then you go do it again. And I'm wondering if you're seeing some of those same cycles of senior leadership coming in and making all kinds of promises.
They can't deliver in the time frame that they've said, and then they get kicked out, and then the next person comes in and makes different promises.
[00:13:21] Chris Bergh:
And those promises are 100% buzzword compliant. They go to the latest Gartner conference, pick up some buzzwords, and then they're gonna do those buzzwords. And their foundation is utterly unreliable. So, you know, I feel sorry for these men and women who are in these roles, because they're trying, and they're just thinking about it the wrong way. They're thinking about the data and not the processes acting upon data. They're not thinking about their people and process. They're thinking about trend following. I think we've built a lot of good things. We have a lot of great tools, but we've got to focus on, as I said, the day 2 or day 3 problem.
Another way to think about what DataOps means is that you've got two 23-year-old problems and one 46-year-old problem. So, you hire a 23-year-old on your team, and he or she's got a degree in computer science, really smart. Maybe they've done some data work as a co-op student, and you bring them in. Can they make a small change, a line of SQL, a line of Python? And can that get into production in their first week, where they can make that change without being babysat by your senior people? Can they make a change with very low risk? Most data teams don't benchmark themselves that way.
You know, they'll make a change and hope it works, or it works on their box. And if you just think about changing a column name, what's the ripple effect of that on all the other tables, on all the reports, on all the models, on all the governance? Can you have a button next to them? And, you know, Tobias, I have a son who's 25, and he took a job at Amazon a few years ago as a software engineer in their B2B payments group, and he was able to do that. And I love my son, and I'm sure he's a good engineer. But if you looked at his room when he was 10, or his room when he was 16, which was scary, I wouldn't trust him to make a change. So where does that trust come from? Well, they built a system next to their 23-year-olds to help them deploy fast with low risk, and to help them observe the system to make sure it's right. And then finally, the 46-year-old on the team actually has a set of metrics that measure things like error rates and cycle time and customer satisfaction and productivity.
And so we've got to focus on that: not so much on let's build it, but how do we run it? How do we change it? How do we measure our success? Those things, I think, are really the key to gaining back a lot of time and to minimizing the immense amount of waste that happens in data and analytics teams.
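His column-rename example is easy to make concrete. Here is a minimal sketch of that ripple-effect check, assuming a hand-maintained lineage graph with invented table and report names; a real team would pull lineage from its catalog or dbt manifest rather than hardcoding it.

```python
# Hypothetical sketch: estimate the blast radius of renaming a column by
# walking a lineage graph. Asset names are invented for illustration.
from collections import deque

# Each key depends on the assets listed in its value.
DEPENDS_ON = {
    "stg_orders": ["raw_orders"],
    "fct_revenue": ["stg_orders"],
    "weekly_revenue_report": ["fct_revenue"],
    "churn_model": ["fct_revenue"],
}

def downstream_of(asset: str) -> set[str]:
    """Return every asset that transitively depends on `asset`."""
    impacted, queue = set(), deque([asset])
    while queue:
        current = queue.popleft()
        for child, parents in DEPENDS_ON.items():
            if current in parents and child not in impacted:
                impacted.add(child)
                queue.append(child)
    return impacted

# Renaming a column in raw_orders touches everything downstream of it.
print(downstream_of("raw_orders"))
# {'stg_orders', 'fct_revenue', 'weekly_revenue_report', 'churn_model'}
# (set ordering may vary)
```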
[00:16:10] Tobias Macey:
I think that that note on the customer impact is also interesting in the context of the data space, because I think that for a long time there wasn't a clear view on who the customer is of the work that you're doing. When you're an operations engineer, you know that your customers are the software engineers who need to be able to run their applications. The software engineers know that their customer is the end user of the application, whether that's internal or external to the organization. On data teams, theoretically, the customer is the CEO, but only if the CEO knows what he's looking at and how to interpret it. And I think that's another evolution that we've gone through over the past 10 years: starting to understand more who the customer is of the data that you're producing, particularly as we move into the ideas of data as a product, data contracts, data tests, things like that.
[00:17:08] Chris Bergh:
Yeah. And I think there's one salient thing: it doesn't matter whether it's 10 years ago or 20 years ago, you don't know what your customer wants. You may be the smartest person in the world, but you don't have the whole business or the whole organization in your head. And so the best way to deal with that is to deliver small bits of value to your customer iteratively. To do that, you need to be able to make changes quickly with low risk. You need to think about and interact with your customer directly. And you also need to develop a sense of trust. I think what happens with a lot of data teams is you talk to a customer once and say, okay, I think you want 10 things.
And they instantiate that, however they do it. And then they go off for weeks or months, and they come back and say, here, I built these 10 things. And the customer says, oh, that's great. I only want 3 of those. Those 7, I don't really need. And by the way, here's 5 more. So the waste that happens in building things that aren't right for your customers is huge. And then the waste of keeping things running that aren't used anymore by your customers is huge. So I'm all about: let's recapture that waste. If you look at it from a numbers standpoint, you could argue that 70% of the time in data and analytics teams is waste. In some cases, it's 90%. So let's recapture that waste. Let's not focus on being able to write SQL faster, because that's only in the 30% of the time. Let's focus on the system that allows you to put SQL into production faster and monitor it, because that's where you're gonna recapture all your time.
[00:18:44] Tobias Macey:
Another note from what you were saying earlier about the DevOps Enterprise Summit is that DevOps went through a very long cycle. Even now, there are companies in the enterprise who are just starting to adopt those concepts. New things take a long time to percolate into the enterprise, and I'm wondering what you see as some of the aspects of DataOps, the iterative development and deployment of data and data products, that are still lacking that enterprise capability, and some of the things that we need to address as an industry to actually make the leap from "we've got all these shiny tools and everybody in startups and small companies is using them" to "all of these tools are robust and enterprise ready" and actually drive adoption at large scale.
[00:19:37] Chris Bergh:
Yeah. You might disagree with me, but I just don't think it's about the tools, whether they're small scale or enterprise scale. I think it's about the system in which you work with those tools, and that's what we need to work on. We've had enough tools. And honestly, it's fantastic that we've got a ton of orchestrators out there. They're all really cool. We've got a bunch of SQL templating tools. We've got a bunch of GUI tools to build things. We can now chat with an LLM and get some SQL. We can write Python. There are 50 different ways to take data from one place, put it in another, and do some transformation on it.
I just don't think that's where the problem is. If 70% of your time is waste, or even more, and you're improving the 30%, that's not where we're gonna make gains. Work on where the big problems are. And the big problems are the systems of work that people have: how they actually enable you to make rapid changes in small increments, and run systems so you don't have these emergencies and war rooms where everyone on the team is doing this group grope to find out where the problem is because the CEO's upset.
And so in some ways, it's not about the tools. It's about how you work with your team and those tools. And in some ways, it's not about the data. It's more about the processes acting upon data.
[00:21:05] Tobias Macey:
Digging more into that, I think that one of the symptoms as well is that all of the tools are built for data engineers and data analysts. They're not built for other people in the business to actually look at them or interact with them in any fashion. There are some tools that are starting to move in that direction of being a hub of collaboration across the organization, which I think helps to enable that use case. And I'm wondering if you can talk to some of the ways that the massive growth over the past 7 years and the recent consolidation have influenced the ways that data teams relate to the rest of the organization, and the ways that they think about the purpose of the work that they're doing beyond just, hey, isn't this really cool?
[00:21:54] Chris Bergh:
Yeah. Well, if you actually open up Azure, there are 10, I think now 12, ways to integrate data, to change data. Some of them have nice UIs targeted at business users, some are more technical, some work on only some databases, some are open source. And every data and analytics system has this characteristic of: where do you put the logic? So let's say you get a new question from your customer: can you segment my customer base? I wanna look at them this way. One way to do it is actually in the chart and graph tool. Another way is to do it in the front end of the chart and graph tool, like in Alteryx. A third way is to have analytics engineers do it in the database. A fourth way is to have data engineers do it more on ingest. And so where the logic lives in your system is actually a really interesting question. I'm a believer in making things prove that they're worthy. From a logic standpoint, the farther down the stack you put it, the more general it is. But you don't know whether that segmentation is worthy or not. You just wanna be able to run it. So maybe it should be in the BI tool, or maybe the analytics engineer should do it. It's a question and a discussion that you have with your customer about how you maximize the amount of learning, and then either take something and push it into something that's a little bit more tested, a little bit more deployable, or maybe not. And so we're in the ideas and service business, and in promoting people to make quick changes.
And then once it's proved worthy, having a discussion about how we can industrialize this, how we can make it safe, how we can eat the technical debt that we may have created. Because not everything is worth testing and automating. So I think it's really about that discussion: taking things like technical debt and testing and observability and deployment, and putting them on the table with your customer and saying, look, I wanna put this on the table as what we do, as well as doing new things. Because if you spend all my team's time building new things, we're gonna build up so much technical debt and so many problems that, guess what's gonna happen?
People are gonna start to quit. We're gonna start to go slower. Everyone's gonna get frustrated, and you're gonna get frustrated. So, to me, it's really about having a good relationship with your customer, and a discussion that makes some things that are invisible, like testing and automating and deploying, visible to your customer, along with where things live in the stack. And that's a management technique that I think we all need to learn. I certainly had to learn it, and it's hard, because your customers don't really care where things are.
You know, they don't really care about how things work.
[00:24:55] Tobias Macey:
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for. Starburst has complete support for all table formats, including Apache Iceberg, Hive, and Delta Lake. And Starburst is trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst today and get $500 in credits to try Starburst Galaxy, the easiest and fastest way to get started using Trino.
Another aspect of this space that I think acts as a distraction is that with every new breakthrough or every new capability, there's another hype cycle of, hey, look at the shiny new tool. It's gonna make you move faster and everything will be easier. Obviously, the latest one is large language models and the ways that they could be used to generate anything that you want. And as somebody who has been in the space for a while, I'm curious if you could maybe put this LLM and generative AI hype cycle in the context of other hype cycles that you've seen, along with some of the ways that you see it as being beneficial and some of the risks that come with it.
[00:26:14] Chris Bergh:
I think it's beneficial. I just don't think it's revolutionary. In 1990, I learned to program at NASA with man pages and a bunch of C programs from other people. And if I didn't know something, I'd go to this thing called a nerd bookstore, now extinct, that had all the books. There used to be Quantum Books in Cambridge, and SoftPro, and there was one on the West Coast. And then the 2000s came along, and people started to use Google search. Right? People started to post answers to questions on the Internet. And what happened is a whole bunch of new people became enabled to program. You didn't have to be, like, a master's degree dude at NASA to figure it out. And I think what LLMs are gonna do is just enable another realm of people to start programming more. And I think that's a good thing. But as with any kind of technical activity, it's complexity, and how do you manage that complexity, especially when you have more people creating it? And then you end up with these questions: is it worthy to be tested and observed? Where in the stack should it fit?
Maybe we should throw it out. How do we generalize it? What's the right schema? These are the technical debt questions that we need to continue to work on. And we've gotta stop building systems where you're rushing to get stuff into your customer's hands, and then you have no idea if it's gonna work the second day and no ability to change it the third.
[00:27:38] Tobias Macey:
So moving now into the tools that you were mentioning earlier, you recently open sourced a pair of utilities, DataOps TestGen and DataOps Observability. I'm wondering if you can talk to the specific outcomes that you're trying to drive with those projects and the motivation for releasing them as open source.
[00:27:58] Chris Bergh:
I guess the number one motivation is that I've talked to hundreds, thousands of companies over the years, and they build systems and they're having errors in production and their customers are finding problems. Their customer may be an analyst team or maybe the CEO, but there are customer-visible problems, and then they have to react. So they're reactive to customer problems. Our hope in this is that we can help them find problems before their customers see them. And there are 2 technical challenges that we built this on. Number 1 is that you have to find problems in data.
And finding problems in data means you have to check the data. So how do you check the data? Well, you might wanna do row counts and look at anomalies over time. That makes sense. But you also want to actually dig into the syntax of the data itself. And so our TestGen tool profiles your data and then has this sort of algorithmic AI engine that builds a whole set of data quality validation rules for you, so you don't have to build them. It then exposes them through a nice UI and through a YAML file so you can manage them. So the problem that we're trying to solve with DataOps TestGen is automatic creation of a whole wide variety of data quality rules, because data engineers either don't have the business context to understand what those rules should be or they don't have the time.
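To illustrate the general idea (not TestGen's actual algorithm), here is a minimal sketch of profile-driven test generation: profile each column, then emit threshold checks from what the profile observed. The column names, thresholds, and check format are all invented for the example.

```python
# Sketch of profile-driven data quality test generation, for illustration only.
import pandas as pd

def generate_checks(df: pd.DataFrame) -> list[dict]:
    checks = []
    for col in df.columns:
        series = df[col]
        # If the column has historically been (nearly) complete, lock that in.
        if series.isna().mean() < 0.01:
            checks.append({"column": col, "check": "null_rate_max", "value": 0.01})
        # Low-cardinality columns get a valid-values check.
        if series.nunique(dropna=True) <= 10:
            checks.append({"column": col, "check": "allowed_values",
                           "value": sorted(series.dropna().unique().tolist())})
    return checks

sample = pd.DataFrame({"order_id": [1, 2, 3], "status": ["new", "paid", "paid"]})
for check in generate_checks(sample):
    print(check)
```

The checks could just as easily be serialized to YAML, which matches the workflow he describes of managing generated rules in a file.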
And the reason we open sourced the second tool, Observability, is that we think the problems in production are not just with data. You can have perfect data and still have a problem, a customer-visible problem. You can have perfect raw data and perfect integrated data, but the model may not be right, or the export to the UI or your reverse ETL may break. So you need something that sits on top of it, and we call that a data journey. Think of it as the fire alarm control panel for your data and analytics systems. It tries to point you, saying, hey, the fire's in this room, in this corner. And we think that concept of a data journey is missing from data and analytics systems.
It's certainly missing if you want to regress a system. It's certainly missing in production. So the 2 real reasons we built these are: hey, testing data is complicated and you don't have time, so you need a tool to help you generate tests. And then, observing the entire complexity of production during production is something that you need to do, and we have this concept of a data journey for that. Both are available, both are Apache 2.0, both are completely full featured. The outcome that we're trying to drive is: use these tools, and when you have a problem in production, start saying, let's not have that. Let's find a way to fix it so it never happens again.
I think these things can accelerate that. And we put a lot of work into making them easy to install and easy to use. My team and I have been doing data testing for close to 20 years, so we've got a whole bunch of best practice baked into it. So, really excited. We only open sourced it 2 or 3 months ago, so it's still quite new.
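The fire-alarm-panel metaphor can be sketched in a few lines. This is a toy illustration of the data journey idea, with invented stage names and stubbed checks, not the actual DataOps Observability implementation: model production as an ordered set of stages and report the first one that fails.

```python
# Toy "data journey": walk the production stages in order and point at the
# first failing one. Stage names and checks are invented for illustration.
from typing import Callable

def ingest_ok() -> bool: return True       # e.g. files arrived on time
def transform_ok() -> bool: return True    # e.g. row counts within range
def model_ok() -> bool: return False       # e.g. model scores refreshed
def export_ok() -> bool: return True       # e.g. reverse ETL succeeded

JOURNEY: list[tuple[str, Callable[[], bool]]] = [
    ("ingest", ingest_ok),
    ("transform", transform_ok),
    ("model", model_ok),
    ("export", export_ok),
]

def run_journey() -> None:
    for stage, check in JOURNEY:
        if not check():
            print(f"ALERT: journey failed at '{stage}' - the fire is in this room")
            return
        print(f"ok: {stage}")
    print("journey healthy end to end")

run_journey()
```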
[00:31:24] Tobias Macey:
The Observability piece in particular, there are a number of tools that try to address at least a portion of that scope. And I'm wondering if you can talk to the ways that you see your DataOps Observability and DataOps TestGen tools overlapping with other capabilities in the ecosystem, and some of the ways that they are differentiated, either because of their ease of use or their capabilities, and just some of the ways that you think about their position in the market.
[00:31:54] Chris Bergh:
Yeah. So, first of all, whatever you do, it's not acceptable to just hope that your production system works. You need to test data, and everything that's acting upon data, as a matter of principle. And then you need to pinpoint where the problem is quickly. Those kinds of outcomes, I think, are really good. So for the person who's listening to this: add some tests yourself. Write some SQL tests. Just do it. Don't wait for permission. Just start putting tests in. You don't need a tool to do this. And every time you have a problem, try to find a way to automate it or test it so it never happens again. Those kinds of ideas don't need to compete with anyone. Now, the market for data observability tools itself is very focused on ingest teams and kind of polling databases for changes.
And I think that's a good place to start for some teams, right? Proving that the ingest is right. And we do that. We poll databases; we'll find anomalies in databases. That's fine. But I also think you need to look at the full production cycle of data and analytics and test not just raw data, but integrated data. Actually, you need to test data before you even put it into production. You need to profile data. You need to say: is this data even worthy of being in production? Should you push back on your data engineers or patch it up before it gets into production? So we have a broader scope of how people do observability.
But to me: start checking and testing your data manually, then start looking at products. And I think what all the data observability vendors agree on is that it's just utterly unacceptable to have embarrassing errors in front of your customer. You don't have to live with it. It's not a good way to work. You're not a hero. You're just putting a bunch of waste into the system that's making everyone's lives crappy.
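For anyone taking the "write some SQL tests, just do it" advice literally, here is a self-contained sketch using sqlite3 so it runs anywhere; in practice you would point the connection at your warehouse. The table, data, and checks are invented for illustration.

```python
# Each test is a SQL query that should return zero rows when the data is
# healthy; any returned row is a failure to investigate.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount REAL, customer_id INTEGER);
    INSERT INTO orders VALUES (1, 19.99, 100), (2, 5.00, 101), (3, -2.50, 102);
""")

TESTS = {
    "no_null_keys": "SELECT * FROM orders WHERE order_id IS NULL",
    "no_negative_amounts": "SELECT * FROM orders WHERE amount < 0",
}

for name, sql in TESTS.items():
    bad_rows = conn.execute(sql).fetchall()
    status = "PASS" if not bad_rows else f"FAIL ({len(bad_rows)} bad rows)"
    print(f"{name}: {status}")
# no_null_keys: PASS
# no_negative_amounts: FAIL (1 bad rows)
```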
[00:33:56] Tobias Macey:
So for teams who are looking to adopt these tools and integrate them into their workflow, I'm wondering if you can give an overview of how to actually get them set up and some of the ways that the TestGen and Observability tools will be interacted with throughout a typical workday.
[00:34:18] Chris Bergh:
Yeah. So you just go to GitHub, and we have an installer program. It takes about 10 minutes to install on your machine. So let's say the first thing you have is: I've got a new file that I need to put into my data and analytics system. Well, TestGen will profile it, and then it'll give you a bunch of data hygiene checks on it to say, well, this column's blank, did you mean to have that? Or, you've got 5 different ways to say the same thing. These are kind of heuristic rules to help you say, maybe this data isn't good enough. Then as you get something into production, TestGen and Observability can start polling the database to find out if things are wrong. And as your data moves through other tools, like models, visualization tools, export, reverse ETL tools, we have integrations to monitor its progress and to check that it's right. So we're checking on the full data journey as it moves from source to destination. And if you look at where problems are in data and analytics projects and think of them like a graph: oftentimes, problems start with bad raw data. But then over time, once you work with your supplier, once you patch up the data, the data itself doesn't have problems, but its interactions with other datasets do, and you start to learn more about the problems. And then as it goes into production, the problem isn't the data. The problem is someone changed some code acting upon data, and your system's not right. And then you need to start thinking about regression and other testing. So I think where problems lie in your data and analytics systems often changes with the maturity of your system, the maturity of the dataset, and how complex it is.
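The heuristic hygiene checks he describes, blank columns and five spellings of the same value, might look something like the following sketch. The data, column names, and thresholds are invented, and this illustrates only the kind of rule, not TestGen's actual checks.

```python
# Sketch of heuristic data hygiene checks on a fresh file, for illustration.
import pandas as pd

df = pd.DataFrame({
    "state": ["MA", "Mass.", "Massachusetts", "MA", "ma"],
    "notes": [None, None, None, None, None],
})

# Rule 1: an entirely blank column is probably a mistake.
for col in df.columns:
    if df[col].isna().all():
        print(f"hygiene: column '{col}' is entirely blank - did you mean that?")

# Rule 2 (crude): values that collapse to the same lowercase prefix probably
# encode one category spelled several different ways.
values = df["state"].dropna().str.lower().str.strip(".")
if values.nunique() > values.str[:2].nunique():
    print("hygiene: 'state' has multiple spellings of the same value")
```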
[00:36:10] Tobias Macey:
What are some of the most interesting or innovative or unexpected ways that you're seeing the Observability and TestGen utilities applied?
[00:36:17] Chris Bergh:
Well, we have one organization who uses Databricks. They did all their ETL in it, and it's sort of product analytics, manufacturing product analytics. They have this stream of notebooks that does all this work, all these algorithms, and they had no idea if it ran. It had its own complex interactions. So they just send a huge amount of data from that complex system of notebooks, which they had spent day 1 building a lot of, to our system, and they're able to find problems. It was a very fast and easy case to do. Another case is a customer who has, I think, on the order of 80 datasets that are updated often hourly. And they're just very important, very high value. What they were able to do was instrument the whole system of tasks by just scanning it. So they got a big step up on data quality validation testing, because they sort of pointed TestGen at it and it automatically generated tests.
Now as time went on, they had to write tests that were specific to their business domain: tests that looked across other tables, tests that compared with history, tests that were very specific. And they could instantiate those in TestGen. In fact, they wrote some of them themselves and ran them their own way. And we're very open to integrating data quality tests that come from other tools. So, really, it comes down to: you can build it in whatever tool, but we can capture and observe it, and we can actually generate data quality tests quickly.
[00:37:57] Tobias Macey:
Once teams adopt these tools and they have better visibility into their end to end flow, how are you seeing that impact the way that they think about building new systems and some of the ways that that drives more of that organizational change?
[00:38:15] Chris Bergh:
It lowers the fear. It lowers the shame that happens in organizations. A lot of teams are sort of fear and shame bound, and that locks them up from making changes. And so it just becomes lighter, and you enjoy your work more. I found that personally. Back in 2005, I had to take my team through this, and I didn't know it. I was reading Deming at the time and thinking about manufacturing. And, you know, Deming had this thing called a quality circle, where you sat around and said: well, what were our problems? Let's put them in a spreadsheet. How can we find the root cause? How can we fix it so it never happens again?
And I think teams are lighter. They're happier. They're more able to make changes. They're more able to create. They're not wasting so much time. You know, it's still early in the release of these tools, so I don't have a hundred cases. But I do have cases where people have done this with our automation tool; we've seen that a lot of times. And to me, that's why I'm still doing this. My life sucked, and I see teams that are just really unhappy, and it's not about your tools.
And it's not gonna get any better with whatever AI-enabled super Gartner buzzword tool. It's just not gonna change until you start thinking in systems and thinking about how you actually make your life better.
[00:39:43] Tobias Macey:
In your journey of trying to drive the adoption of DataOps and the principles that go along with it, and to drive this organizational shift to make data work more impactful and less wasteful, what are some of the most interesting or unexpected or challenging lessons that you've learned?
[00:40:05] Chris Bergh:
Well, first of all, I think I suck at it, because if you had asked me 10 years ago, I thought it would all be done by now. Right? I thought, at most, data and analytics teams are 5 years behind. So it's a lot slower than I thought, and I'm not sure why. You know, our latest swing at the bat is to go away from leaders and toward individual contributors. We spent a lot of years building a tool that encapsulated the entire value stream of data and analytics. It was a test-enhanced orchestrator to try and move things into production quickly.
And it was really hard to make money with it, honestly, because people would go ahead and build a whole bunch of stuff, and we had to catch them right at the beginning of their projects. So for me, I don't know. I'll ask you, Tobias: it's slower than I thought. Am I being negative? Am I being too hopeful? Do you think it's never gonna be solved? What do you think of the whole idea of DataOps and its maturity and impediments? What's your stand on it?
[00:41:16] Tobias Macey:
I think that we have come a long way. I think that DevOps took a decade before it even reached any modicum of widespread adoption, and as you noted, it still hasn't made its way through everywhere. I think that from a tooling perspective and an industry perspective, there has been a lot more focus on making sure that the data that we build and deliver is actually reliable and trustworthy, and that we have a way of being able to test changes before they make their way into production and to end users. I think that there are a lot of different players in the space, and that makes it hard to get everybody aligned. It makes it hard for everything to integrate effectively, to be able to make sure that your testing environment can actually run end to end. There are certain point solutions of, okay, my warehouse can have zero-copy clones of tables, and I can make some changes to that without impacting production. But can I do the same thing with my business intelligence environment? That's just one of the ways that the lack of end-to-end integration can hamper the testing and validation of the changes that you're making before you pull the trigger and say, okay, put all of this into production at once.
[00:42:41] Chris Bergh:
Yeah, I think there is a lot of progress, and I think more and more people are seeing it. I'm talking to teams where agility is sort of expected now. But of course, I talk to other teams where it's rare. And it's very interesting: it's not the usual suspects, right? There are small companies who are unagile and big companies who are agile. It's just very interesting, the layout. And from my standpoint, the other challenging thing, from a corporate standpoint, is that with our company, I'm just really grateful that we've never gotten investors, that we've always been profitable, because getting investors would have given us a timeline.
And I never really trusted my ability to predict the future. Certainly with human adoption of ideas, I always thought I would get it wrong. So building a profitable company, investing in our employees, making our customers satisfied, and not having to make timelines that I'm not sure I could keep, I think is good. Because we're gonna still be here. We're still profitable. We're still gonna be around. Hopefully, a decade from now, I'll still be talking to you, Tobias. I'll probably still be talking about the same thing, but hopefully it'll be a lot better.
[00:44:02] Tobias Macey:
Yeah. I think another element of this too is that we talk about this idea of data teams doing their work, but a lot of the time, you have a company, they start with a software team, and they start to realize, oh, we need to be able to do these longitudinal analytics. So they just say to the software team, hey, go build me something. And despite the fact that it's all still software that you're working with, there is a lot more dimensionality to data problems, and there's still a generalized lack of understanding of how that changes the way that you have to think about the problem space. And I think there just aren't enough people in the ecosystem who have that context to make understanding how to tackle the problem properly widely available, beyond just: I write some software and I put it into production, and, oh, now I've gotta go back and keep fixing things and fixing things, because they don't understand it from a holistic perspective.
[00:45:08] Chris Bergh:
Yeah. And I guess the change I'm seeking is that individual contributors who are making that change and going back and fixing it aren't gonna passively quit. Right? They're gonna say, no, we can actually fix this, here are the steps, and they're gonna start to talk to their manager and get their manager bought into it. I saw that with agile adoption. There were individual contributors that said, look, I don't wanna spend 6 months building something that no one uses. Can we actually build something in 2 weeks? We're gonna learn a lot more. And what it comes down to is your life's gonna be better, and you're gonna have a lot more fun in your job, if you can iterate quicker, run things with low errors, and measure your work. And those things, I think, have to come from individual contributors, because your bosses have got a shotgun to their head, and they're getting shot every 18 months.
It's unfortunate, but it's gotta come from the bottom. And I think it's a real opportunity for people new to their career. One of the things that makes me sad is people who've gone to school, they've got a master's degree, they came of age when data scientist was the sexiest job in the world, and then they're just depressed. They don't like their job. They're like, why am I stuck fixing and re-fixing things? Why am I not doing anything innovative? I'd like that to change. But I think, collectively, we've gotta start holding our teams to higher standards in how we work together.
[00:46:40] Tobias Macey:
As you continue to build and invest in the tools that we discussed and your overall mission at DataKitchen, what are some of the things you have planned for the near to medium term or any particular projects or problem areas you're excited to dig into?
[00:46:54] Chris Bergh:
Well, I'm excited to dig into the role of the data quality person on a data team, because I think they've been underinvested in. Data observability is very focused on the ingest team and making sure the ingest is good, but the data quality person is really interesting. We're actually doing some market research now, talking to a bunch of data quality people, because it impacts having good data, but also their ability to influence the team. So I think we're gonna look at that. And then one of my favorite things that I hope we're gonna get to, probably not this year, is the ability to capture analytics about your data and analytics team: the amount of work they're doing, the number of deploys they make, the number of errors they make in production, their customer sat. Sort of looking at the value streams that people create in production and deployment with their customers.
It's amazing how many data and analytics teams don't metric themselves. They produce metrics, they produce numbers, but they kinda don't know what's going on with their own team from a numeric standpoint. You know, we spent our first product solving the 23-year-old deployment problem. Our second and third products are solving the 23-year-old monitoring problem, making sure there are low errors in production. And hopefully our last product will be solving the 46-year-old problem: how do you run your team in an analytic way? So those are the things that we're thinking about.
[00:48:27] Tobias Macey:
I think that that is noteworthy too because in that concept of the parallel between DevOps and DataOps is the concept of being on call, error monitoring, trying to drive down mean time to resolution, mean time to detect failures. I think that that is one of those pieces of the data ecosystem that is still in the process of being figured out of what does it even mean for a data person to be on call, and what does it mean for there to be an incident in the data platform?
[00:49:00] Chris Bergh:
Yeah. And what is it? Is it number of incidents? What's the impact? I think it's important to measure those things, because if you look at it like a manufacturing line, your error rates and bottlenecks are huge impediments to success. And you don't need fancy software to do it. I encourage every team to get a spreadsheet, and every time you have any production error, put it down in that spreadsheet. Then, every 2 weeks, sit with your team and say: what's the one thing we can do to automate this so it never happens again? What's the script? What's the test? What's the fix we can do? So I think it's that ability to reflect. Think of it as: love your errors.
Try to have a way where errors don't cause you shame and blame; love them and try to fix them and improve. I think that will help your team be more productive and get better. And you don't need any software to do that. You just need a spreadsheet.
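The spreadsheet habit he describes can be as small as this sketch: append every production error to a CSV and summarize it for the biweekly review. The file name and fields are invented for illustration.

```python
# Minimal "love your errors" log: a CSV plus a root-cause summary.
import csv
from collections import Counter
from pathlib import Path

LOG = Path("production_errors.csv")

def record_error(date: str, system: str, description: str, root_cause: str) -> None:
    """Append one production error, writing a header if the file is new."""
    new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["date", "system", "description", "root_cause"])
        writer.writerow([date, system, description, root_cause])

def biweekly_summary() -> None:
    """Group errors by root cause to pick the one thing to automate next."""
    with LOG.open() as f:
        rows = list(csv.DictReader(f))
    print(f"{len(rows)} errors logged")
    for cause, count in Counter(r["root_cause"] for r in rows).most_common():
        print(f"  {cause}: {count}  <- what one test or script prevents this?")

record_error("2024-05-01", "nightly_load", "late file from vendor", "no arrival check")
biweekly_summary()
```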
[00:50:03] Tobias Macey:
Are there any other aspects of this overall ecosystem of DataOps, the work that you're doing at DataKitchen, or the specifics of the Observability and TestGen tools that we didn't discuss yet that you'd like to cover before we close out the show?
[00:50:18] Chris Bergh:
No. You know, I may sound negative. Right? People are depressed, they wanna quit, it's frustrating. But I believe in this change. We've had this great burst of tool creativity, and now we're seeing some consolidation, and I think it's actually a real opportunity for people to reflect on how they do work and improve it. There are so many new people running into the same sets of problems that I'm excited about the future. And I think it's really gonna be a good thing when you don't have to have these huge data and analytics teams, when you can have a smaller one that does more work, does better work, and the teams are happier and people aren't quitting. I'm looking forward to helping nudge that along in the future.
[00:51:10] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:51:27] Chris Bergh:
Yeah. I think it's that last one. I think it's the 46-year-old problem of how you measure the success of your data and analytics team. Both definitionally, what does a successful team look like? And, what are the numbers? In software, there are the DORA metrics from DevOps Research and Assessment, a set of metrics that are really software deployment focused. Along those lines, we have a set of deployment metrics, a set of production error metrics, productivity, customer sat. The biggest gap is really measurement of our team and what those measurements are. And I've often thought we need a DORA group for data and analytics.
What's the common way to benchmark teams in terms of how well they're doing, how successful they are, and what things we should measure? And so that's another challenge, I think, that some smart entrepreneur will pick up, or that someone should work on.
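(For a concrete sense of what DORA-style measurement could look like on a data team, here is a minimal sketch; the event records and the simplified metric definitions are illustrative assumptions, not an established DORA-for-data standard.)

```python
import datetime as dt

# Hypothetical history for one team: deploy dates and production incidents.
deploys = [dt.date(2024, 5, day) for day in (2, 6, 9, 13, 20, 27)]
incidents = [
    {"opened": dt.datetime(2024, 5, 7, 9, 0),
     "resolved": dt.datetime(2024, 5, 7, 11, 30)},
    {"opened": dt.datetime(2024, 5, 21, 14, 0),
     "resolved": dt.datetime(2024, 5, 22, 10, 0)},
]

days_in_window = 31  # size of the measurement window, in days

# Deployment frequency: how often changes reach production.
deploy_frequency = len(deploys) / days_in_window

# Mean time to restore, in hours, across all incidents.
mttr_hours = sum(
    (i["resolved"] - i["opened"]).total_seconds() / 3600 for i in incidents
) / len(incidents)

# Rough change-failure proxy: incidents per deploy.
change_failure_rate = len(incidents) / len(deploys)

print(f"deploys per day: {deploy_frequency:.2f}")
print(f"mean time to restore: {mttr_hours:.1f} hours")
print(f"change failure rate: {change_failure_rate:.0%}")
```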
[00:52:21] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing on the Observability and TestGen tools, your perspective on the evolution of DataOps and its adoption, and the ways that we are trending as an industry. I appreciate the insights and the time, and I hope you enjoy the rest of your day.
[00:52:42] Chris Bergh:
Alright. Thank you.
[00:52:50] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
- Introduction and Guest Introduction
- Challenges in Data Engineering
- Survey Insights on Data Engineers
- Evolution of DataOps
- Challenges in Data Leadership
- Enterprise Adoption of DataOps
- Impact of Large Language Models
- Introduction to DataOps Tools
- Adopting DataOps Tools
- Impact on Organizational Change
- Future Plans and Focus Areas
- Closing Remarks