Summary
Data observability is a product category that has seen massive growth and adoption in recent years. Monte Carlo is in the vanguard of companies who have been enabling data teams to observe and understand their complex data systems. In this episode founders Barr Moses and Lior Gavish rejoin the show to reflect on the evolution and adoption of data observability technologies and the capabilities that are being introduced as the broader ecosystem adopts the practices.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show!
- Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today to learn more about how Atlan’s active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork & Unilever achieve extraordinary things with metadata and escape the chaos.
- RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder.
- The only thing worse than having bad data is not knowing that you have it. With Bigeye’s data observability platform, if there is an issue with your data or data pipelines you’ll know right away and can get it fixed before the business is impacted. Bigeye lets data teams measure, improve, and communicate the quality of your data to company stakeholders. With complete API access, a user-friendly interface, and automated yet flexible alerting, you’ve got everything you need to establish and maintain trust in your data. Go to dataengineeringpodcast.com/bigeye today to sign up and start trusting your analyses.
- Your host is Tobias Macey and today I’m interviewing Barr Moses and Lior Gavish about the state of the market for data observability and their own work at Monte Carlo
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you give the elevator pitch for Monte Carlo?
- What are the notable changes in the Monte Carlo product and business since our last conversation in October 2020?
- You were one of the early entrants in the market of data quality/data observability products. In your work to gain visibility and traction you invested substantially in content creation (blog posts, presentations, round table conversations, etc.). How would you summarize the focus of your initial efforts?
- Why do you think data observability has really taken off? A few years ago, the category barely existed – what’s changed?
- There’s a larger debate within the data engineering community regarding whether it makes sense to go deep or go broad when it comes to monitoring your data. In other words, do you start with a few important data sets, or do you attempt to cover the entire ecosystem? What is your take?
- For engineers and teams who are just now investigating and investing in observability/quality automation for their data, what are their motivations?
- How has the conversation around the value/motivating factors matured or changed over the past couple of years?
- In what way have the requirements and capabilities of data observability platforms shifted?
- What are the forces in the ecosystem that have driven those changes?
- How has the scope and vision for your work at Monte Carlo evolved as the understanding and impact of data quality have become more widespread?
- When teams invest in data quality/observability what are some of the ways that the insights gained influence their other priorities and design choices? (e.g. platform design, pipeline design, data usage, etc.)
- When it comes to selecting what parts of the data stack to invest in, how do data leaders prioritize? For instance, when does it make sense to build or buy a data catalog? A data observability platform?
- The adoption of any tool that adds constraints is a delicate balance. What have you found to be the predominant patterns for teams who are incorporating Monte Carlo? (e.g. maintaining delivery velocity and adding safety/trust)
- A corollary to the goal of data engineers for higher reliability and visibility is the need by the business/team leadership to identify "return on investment". How do you and your customers think about the useful metrics and measurement goals to justify the time spent on "non-functional" requirements?
- What are the most interesting, innovative, or unexpected ways that you have seen Monte Carlo used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Monte Carlo?
- When is Monte Carlo the wrong choice?
- What do you have planned for the future of Monte Carlo?
Contact Info
- Barr
- @BM_DataDowntime on Twitter
- Lior
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don’t forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers
Links
- Monte Carlo
- AppDynamics
- Datadog
- New Relic
- Data Quality Fundamentals book
- State Of Data Quality Survey
- dbt
- Airflow
- Dagster
- Episode: Incident Management For Data Teams
- Databricks Delta
- Patch.tech Snowflake APIs
- Hightouch
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline or want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends at Linode. With their new managed database service, you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes with automated backups, 40 gigabit connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don't forget to thank them for their continued support of this show.
Data stacks are becoming more and more complex. This brings infinite possibilities for data pipelines to break and a host of other issues, severely deteriorating the quality of the data and causing teams to lose trust. Sifflet solves this problem by acting as an overseeing layer to the data stack, observing data and ensuring it's reliable from ingestion all the way to consumption. Whether the data is in transit or at rest, Sifflet can detect data quality anomalies, assess business impact, identify the root cause, and alert data teams on their preferred channels, all thanks to over 50 quality checks, extensive column level lineage, and over 20 connectors across the data stack. In addition, data discovery is made easy through Sifflet's information rich data catalog with a powerful search engine and real time health status. Listeners of the podcast will get $2,000 to use as platform credits when signing up to use Sifflet. Sifflet also offers a 2 week free trial. Find out more at dataengineeringpodcast.com/sifflet today. That's S-I-F-F-L-E-T. Your host is Tobias Macey, and today I'm interviewing Barr Moses and Lior Gavish about the state of the market for data observability and their own work at Monte Carlo. So, Barr, can you start by introducing yourself?
[00:02:01] Unknown:
Yeah. Sure. It's great to be here again. I guess it's now 2 years later. I think the very first episode that we had with you was in 2020, I learned. So it's great to be back. A lot has changed, and some hasn't changed at all. I think it's pretty cool. So, yeah, my name is Barr. I'm the CEO and cofounder of Monte Carlo. Best way to think about this is, like, a Datadog, AppDynamics, or New Relic, but for data engineers. So we help data teams make sure that their data is accurate and reliable so that organizations can actually use data in their various data products. And, you know, a little bit about my background, I was born and raised in Israel, moved to the Bay Area to study math and stats, got stuck here since. I've worked with data teams kind of my entire life and was always looking at engineering and was jealous of the set of tools and solutions that they had, compared to the very few solutions that we had in data. And so, kind of inspired by trying to make it easier for data teams, we started Monte Carlo. And, Lior, how about yourself?
[00:02:57] Unknown:
Hi, everybody. I'm Lior. I'm Barr's cofounder at Monte Carlo. Also grew up in Israel. Started my career as what we would now call a machine learning engineer, though it wasn't exactly the way it was called back then. And came to the Bay Area for school, started a company, actually, in the cybersecurity space. And for all those who've been doing cybersecurity, you probably know it's a lot about analytics and ML under the hood to do the work. And I led the engineering team at a company called Barracuda, and we built products that use machine learning specifically for fraud prevention. And that kind of helped inspire some of the problems that we solve today with Monte Carlo, right? Like the types of issues that we had to deal with to deliver good service to our customers were oftentimes related to data problems and data reliability issues. That made me kind of partner up with Barr. I'm excited to be on the show together today. We don't get together very often.
[00:03:55] Unknown:
Yeah. It's definitely great to have you both back on. So for folks who haven't listened to your previous appearances, I'll add links in the show notes for that. You've both given a bit of your kind of cliff notes of how you got into data. For the longer details, I'll refer back to those past episodes. You already kind of preempted my question about the elevator pitch for Monte Carlo. So I guess what I'll ask you now is what are some of those notable changes that you've seen in the overall space for data observability, data quality, and your own reactions to it and
[00:04:27] Unknown:
kind of planning around it that you've done at Monte Carlo since the last time we talked? Yeah. I mean, reflecting on sort of the market, you know, 2 years, 5 years ago, I think it's actually insane how, like, the market has sort of accelerated at a palpable rate. Not just for data observability, but rather for data more globally. And we see that in various forms. Right? So if we look at sort of maybe the most obvious or most notable, it's the size of large data infrastructure companies. Right? So whether it's Databricks with, I think, just over $1,000,000,000 in revenue or Snowflake with 1.2, BigQuery with 1,500,000,000, Redshift rumored to be the fastest growing service of, you know, the 100 plus services at AWS.
I think that sort of speaks to the strength of the market and how prevalent it is for those technologies to be part of kind of a strong kind of modern data stack, if you will. On the other hand, if you look at, you know, like, programs like powered by or built on that have kind of encouraged data teams to actually build data products. So more and more organizations actually use data in production or share reports with customers. I think, you know, if I would get on a random call with a data engineer 5 years ago, 10 years ago, not a lot of that data would be actually exposed to customers and not a lot of that data would be in a machine learning model, and not a lot of that data would actually be used.
I think the primary thing that has changed in the last 2 years is how important data has become to organizations today. And, you know, as a result of that, we're also seeing kind of a way stronger need for data quality, for data reliability. That just wasn't the reality 2 years ago. I mean, when I was just reflecting on it, 2 years ago, we didn't even know what we would call the category. I definitely was like, oh, observability. It's such a difficult word to pronounce. And here we are. Right? Like, I can get on a call and lots of folks, you know, our customers and prospects, not only recognize what data observability is, but also actually, you know, have started to think through what it would look like to measure data observability.
And for folks who are unfamiliar, sort of, you know, just taking a step back for a second explaining what data observability is, I can't assume that everyone knows what it is. Data observability, really kind of takes a page from software observability. And so in software engineering, it's very sort of traditional, and you kind of have to be crazy to run an engineering team without something like AppDynamics or Datadog or New Relic. And so, you know, engineers will use those solutions to make sure that their application and infrastructure are reliable.
And, yeah, data teams, which, you know, produce kind of high stakes data products, oftentimes are not aware of the data actually being wrong or are the last to know about it, finding out about data being inaccurate because someone downstream identified it. Maybe you have, you know, your finance team looking at a report and saying, hey. The number looks wrong. Or maybe a customer that's looking at a particular dashboard says, hey. Like, the price of this product doesn't make sense to me. Or maybe the data that you're feeding a particular model just stopped arriving, for example. In all of these cases, the data is unreliable. The data team doesn't always know about that. And so the idea of data observability is to proactively monitor your data stack end to end to give your data team the confidence to know when data breaks, be the 1st to know about data issues, and to be able to resolve those quickly.
[00:07:43] Unknown:
Yeah. Definitely some interesting things to dig into there. 1 of the pieces is definitely the question of what do you call this space when you're 1 of the people who's helping to define it. And your comment about the fact that in the time from when you first started working on this idea to where we are now, there has been an increased prevalence of people actually exposing some of these data products to end users, whether it's embedded analytics or feeding that data back into product features or, as you said, machine learning models. And I think that maybe 1 of the reasons that data observability as a category and as a first order concern for engineering teams did take so long to be kind of the obvious answer is that up until that point, data was more of an internal process, and so you didn't have the high stakes issue that we have with applications where it's end user facing. And if your application is down for an hour and you're pure ecommerce, you're losing out on millions of dollars. So, of course, you wanna make sure that things don't break. Whereas with data, it's, oh, it's just people inside the business. So if things are wrong, we'll just fix it. No problem.
Without really taking that to its logical conclusion of, well, if you don't know that it's broken and people make decisions on that, then that's costing you millions of dollars.
[00:08:59] Unknown:
A 100%. You couldn't have said it better. And I'll just give you a couple, like, tangible examples of that. Unity, the gaming company, announced a couple weeks ago, it was actually, like, in the news, that 1 mistake in their data that was related to their ads actually resulted in a loss of $100,000,000. I'll just repeat that. 1 mistake cost $100,000,000. Isn't that sort of a scary proposition? Right? That's actually not uncommon. Right? Let me give you another example of a customer that we work with. Just to kind of, you know, give a little bit more context on what this looks like. 1 of the customers that we work with is JetBlue, you know, obviously, kind of a well known leading international airline and, you know, a very strong data team. And the JetBlue team basically managed all the company's data from bookings to flight times. And so you can think about some of the experiences that are driven by that data. Right? Say, is my suitcase arriving on time? And do I have the right connection? And if I missed my connection, what's the next flight that I can get on? And, you know, how quickly am I getting support from someone online, on the phone call to actually book my next flight? All of that is incredibly data driven. And the team runs 1 of the biggest dbt instances; they're a very big dbt user. They basically had to go to extraordinary lengths to fix data issues all the time in a very manual way. They actually had this team called eyes on glass, which basically, like, manually refreshed dashboards to make sure that the operations are smooth. And just to be clear, that's a very common reality. We see that with so many data teams. The data is important. It's high stakes data. We need to look at it in a very manual way and make sure that it's accurate.
And if it's wrong, you know, obviously, the implications are, you know, not just for kind of the executives who are looking at the reports, making decisions based on it, but also for, you know, actual people who are flying, like you and me. And so, actually, we started working with the JetBlue team and set up sort of data observability as a way to have coverage across their stack, understand when data is breaking, and actually ensure that both the operations and customer support data is up to date, the dashboards are accurate. So for example, if there's a bug in a particular model that causes, say, a downstream table to be truncated, Monte Carlo can actually send an alert and provide the right tools to help identify the root cause and debug it in a timely manner before there's impact downstream.
So that's just an example. You know, I think the number and importance of use cases for data observability has really, really sort of accelerated and changed in the last few years.
[00:11:29] Unknown:
The kind of biggest change, I think, that we've seen, and, Tobias, you mentioned that part of the drive for data observability comes from the fact that there are much more outward facing, customer facing consequences to bad data. I think the other thing that's driving this is just the growth of data teams and the growth and complexity of data systems. We saw that in the last 2 years, and it really drove the need to move from testing and monitoring to observability. Alright? So the traditional approach to data reliability is to add tests in various places in your data stack, and to perhaps put some monitoring in place, right, to track a small set of metrics, a set that stays small just by virtue of a human having to write those tests and those monitoring rules.
And actually, to date, while a lot of technologies call themselves data observability, what they really do is that. So they allow people to define tests on top of their data and to put monitoring rules there and get alerted when they break. What we've seen in the last few years, though, has really vindicated the observability approach versus the testing and monitoring approach. The huge complexity of data systems really forces you to monitor things at scale. Right? So you definitely wanna do the monitoring piece and the testing piece. And Monte Carlo obviously provides a lot of capabilities around that to get very granular about monitoring the specific metrics in 1 table or another, and to monitor all the different statistics and distribution metrics about your data, and that's critical.
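To make that concrete, here is a minimal sketch of the kind of metric monitoring described here: track a simple statistic for a table (daily row counts, in this made-up example) and flag when the latest value deviates sharply from its recent baseline. The data and threshold are invented for illustration, and real detectors like Monte Carlo's are considerably more sophisticated.

```python
from statistics import mean, stdev

def deviates_from_baseline(history, latest, threshold=3.0):
    """Return True when `latest` sits more than `threshold` standard
    deviations away from the mean of the historical observations."""
    if len(history) < 7:                 # too little history to judge
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Hypothetical daily row counts for one table over the past two weeks.
row_counts = [10_120, 10_090, 10_230, 9_980, 10_310, 10_150, 10_200,
              10_050, 10_180, 10_240, 10_120, 10_300, 10_160, 10_210]

print(deviates_from_baseline(row_counts, 10_175))  # False: within normal range
print(deviates_from_baseline(row_counts, 1_312))   # True: likely a broken load
```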
What we've also seen though is the importance of tracking the entire production dataset. Right? All of the different tables that lead into those critical tables where you add testing and explicit monitoring. This is really the only way to scale a reliability program, because when you move from 10 tables to a 100 tables to 1,000 tables, and when you move from 1 data engineer to 5 data engineers to 50, to eventually 500, it's no longer possible to do everything manually. Right? And if you look at those critical tables that drive your critical features or your critical dashboards or whatnot, the problems that emerge there are actually typically a result of something that happened way upstream. Right? It might be a data source, you know, that's 10 or 20 stages removed upstream that changed in some unexpected way or a pipeline that's far removed from those tables that broke or a change in logic or in transformations that happens way upstream. Now, if you're only monitoring that key critical asset that you have, you're going to send the alert to the wrong person. Right? Because the problem actually happened, you know, with another person, maybe another team.
You're really going to confuse them about how to solve the problem. It would take them ages to investigate and go step by step backward or upstream to find the root cause of each issue. And you're also going to drastically delay the time that you detect the problem. Right? You're only going to find it when it impacts a critical asset. And then you're going to worry about figuring out where it broke, and then fixing it, and then backfilling. And that's a very, very costly process for the organization. And so, really, the only way, at least that we're aware of, to allow teams to scale that effort and to quickly get to why things broke and not just that something's broken, is through that concept of observability, of proactively capturing metrics from all across your pipelines, across the entire stack. That's something that has kind of proved itself out with the over 150 customers that we work with. And when you do these things, you really need to think about scalability and performance, right?
If you do that in a naive way, you're going to get a really, really high bill on your Snowflake, or BigQuery, or Redshift, or whatnot. That's an unpleasant surprise typically. You need to think about how you collect metadata effectively, about how you collect metrics effectively, about how you really leverage the specific capabilities in each of those platforms to really do this observability thing at scale. And I think Monte Carlo invested a lot in building that over the last 2 years or so, and that has led to phenomenal outcomes. The other side of it, I think, and this is something we've been preaching for a while now, but now, you know, it's come to fruition: thinking about reliability goes beyond just anomaly detection. Right? It goes beyond just, hey, here's a metric in 1 of my tables. How do I get alerted when it deviates from normal? Right? That's critical. Monte Carlo does it. It's an important part. But to really get to reliability in data products, you need to think about not just how you detect problems, but also about how you solve them. Right? And we talked about it. The ability to look at things upstream, to understand lineage and dependencies.
And it's also about getting better over time and preventing issues from happening. Right? It's about looking at your past performance, understanding what are some foundational changes that you can make to your platform to make it more reliable? What are some foundational changes you can make to your process to make it more reliable? And this is something that's been, again, learned over on the DevOps side over
[00:17:02] Unknown:
a long period of time. And we've kind of seen it happening and taking shape over the last couple of years at Monte Carlo, and that's been a very exciting journey. I think that it's definitely worth calling out that for a lot of people, the kind of earth shattering moment is just being able to know something is wrong because a lot of people don't even have that piece, or it's just, like, the job ran, it went through, I don't know. And just being able to know, yes, something is wrong here. Like, that's just monumental. Just going from 0 to 1 is an amazing step for a lot of people and, you know, then everything else beyond that is just gravy. You know? If I can know why it broke and how it broke and what I can do to fix it, that's amazing. But just knowing that it broke in the first place is, you know, light years ahead of where I was. And so in terms of your kind of early efforts as you were developing the product and figuring out what your go to market looks like, you were very active in kind of content production and trying to own the narrative around data quality and, you know, putting out a lot of resources about helping people understand what are the different elements of this space beyond just data breaks sometimes, you know, getting more into the details of, like, yes, data breaks. Here's how. Here's why. Here's how to think about it. And I'm wondering if you can just kind of summarize your initial focus of how you thought about putting out that messaging and the ways that you kind of shaped that narrative as you iterated towards your general release, and the types of reactions and feedback that you got as you went through that journey of kind of figuring out how to talk to people about this problem?
[00:18:38] Unknown:
Yeah. For sure. And, you know, I agree with you. I think the first moment is when you know that there was a problem. Right? You identify that there's something, this light bulb goes up, and you're like, oh, I had no idea about that. Right? And a lot of the value is in that. And I think the next sort of iteration or phase of that, that's something that sort of customers and folks very quickly ask themselves is, like, why? Where? How? Right? Oftentimes, actually, people tell us that just knowing that something is broken, but not being able to answer those other questions, is actually really frustrating for teams, and it sort of can become sort of alert fatigue or kind of noise, if you will. You know, so I think that's kind of an important point that comes up a lot in our discussions with folks in the industry. But going to, you know, your question on sort of content and how we've thought about, you know, what our customers, what folks care about. You know, we've definitely in the early days, you know, maybe a couple years ago, you know, as I mentioned before, there was a question of, like, what do we even call this thing? Right? And what resonates most with folks, and what does it even mean to have data observability?
Right? So we kind of coined the term data downtime and, you know, used a lot of, you know, what we heard from customers to help define what we call the 5 pillars of data observability. So those 5 pillars are freshness, schema, distribution, volume, and lineage, and it's about having sort of an automated way to have a strong understanding of those 5 pillars across your data stack, wherever your data is: data warehouse, data lake, BI solution. That, I think, is kind of the core of what we started with kind of in the data observability category. And we wrote a lot of content to help folks kind of actually understand what that means. And our approach has always been, you know, very customer focused or customer driven. We try to focus on that more than anything. And so we try to kind of hang out wherever our customers hang out, which was, for example, Medium or podcasts like this 1 or LinkedIn.
And then, you know, write content that's actually approachable and easy to consume. So, you know, we heard a lot of feedback that there's a lot of content out there that's actually, like, really technical and really hard to relate to. And so we focused on content that actually focuses on storytelling and focuses on, honestly, a lot of sort of fundamental questions that our customers had. And a lot of things are top of mind for them. So, you know, I'm reflecting on the last couple years. Data mesh was a big deal at a certain point. How to build a data team, how to go from 1 to 50 in your data team, how to, you know, set up SLAs, SLOs, SLIs. What do those even mean? How do I set up data contracts? A lot of those are sort of things that, you know, we've written about. Today, I think a lot of the questions that folks ask us are, how do we prove the value of a data team? How do we connect the value of the data team to the commercial reality that our business is asking us for? What are the metrics that we should be using in order to measure the success of our data observability efforts? All of those are things that are very much top of mind for our customers. And so, you know, we write content about those topics targeted at folks who are curious about those, like data engineers or heads of data engineering. You know, I think another thing that has changed dramatically, as Lior mentioned, with several hundreds of customers today, we actually found that some of our customers are starting to write about us. They're starting to write about data observability. And so, you know, we obviously included many of them in our content. You know, I think Shift Key was another 1 that we just released a few weeks ago, and I know SeatGeek is in the works. So content has always been a big focus for us, and as part of that, we're actually publishing sort of the first O'Reilly book on the topic of data quality. So we've done a bunch of sort of courses and classes with O'Reilly, and they asked us to write the first book on this. We're very honored to do that. And again, it's targeted at helping folks in the industry who have questions about how to build data teams, what data tools should I use, and how do I build a reliable data stack.
So, you know, today, we have, like, over, you know, 30,000 or so subscribers, and we're planning sort of a big conference this fall. And, again, it's all sort of with the goal of helping data teams, wherever they are, answer these questions so that they can continue to deliver reliable data products.
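For readers who want a concrete picture of the five pillars Barr describes, here is a minimal sketch of two of them, freshness and volume, expressed as warehouse queries. The table and column names are invented, and sqlite3 simply stands in for whichever warehouse you actually run; it illustrates the idea rather than how Monte Carlo implements it.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# In-memory stand-in for a warehouse table, populated with fake load times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, loaded_at TEXT)")
now = datetime.now(timezone.utc)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, (now - timedelta(minutes=30 * i)).isoformat()) for i in range(10)],
)

def check_freshness(table, ts_column, max_lag):
    """Freshness: has new data landed within the expected window?"""
    (latest,) = conn.execute(f"SELECT MAX({ts_column}) FROM {table}").fetchone()
    return datetime.fromisoformat(latest) >= datetime.now(timezone.utc) - max_lag

def check_volume(table, min_rows):
    """Volume: did roughly the expected number of rows arrive?"""
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return count >= min_rows

print(check_freshness("orders", "loaded_at", timedelta(hours=1)))  # True
print(check_volume("orders", min_rows=5))                          # True
```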
[00:22:46] Unknown:
Data teams are increasingly under pressure to deliver. According to a recent survey by Ascend.io, 95% reported being at or over capacity. With 72% of data experts reporting demands on their team going up faster than they can hire, it's no surprise they are increasingly turning to automation. In fact, while only 3.5% report having current investments in automation, 85% of data teams plan on investing in automation in the next 12 months. That's where our friends at Ascend.io come in. The Ascend Data Automation Cloud provides a unified platform for data ingestion, transformation, orchestration, and observability.
Ascend users love its declarative pipelines, powerful SDK, elegant UI, and extensible plug in architecture, as well as its support for Python, SQL, Scala, and Java. Ascend automates workloads on Snowflake, Databricks, BigQuery, and open source Spark and can be deployed in AWS, Azure, or GCP. Go to dataengineeringpodcast.com/ascend and sign up for a free trial. If you're a Data Engineering Podcast listener, you get credits worth $5,000 when you become a customer. And then the other interesting direction to dig into is what you were mentioning about the rapid evolution of the space and the rapid adoption of this idea of data observability and the growth of, you know, engineering teams identifying that as a problem, identifying the fact that they need a solution for that. And I'm wondering if you can highlight what you see as some of the main motivating factors that have led to the widespread growth in understanding of data observability and
[00:24:22] Unknown:
kind of understanding the problems that it solves and how to know when you've hit that point where it is a problem for you. I think what we've been seeing, Tobias, we touched on it a little bit earlier on this talk. I'm happy to expand, but I think what we've been seeing is just the growth and investment in data, and the growth of data teams, and the increasing mission criticality of data applications. Right? All these things together are making data problems more prevalent. Right? The more you build, the more problems you're going to have with larger teams and harder coordination. And then the other thing is the consequences of those data problems are increasingly affecting the bottom line at the end of the day. Right? Whether it's a product that's exposed externally or a machine learning model that makes many decisions every day or whether it's a dashboard that drives really key decisions for the company. Right? And so as all these things are going, teams are also recognizing the need to take a more, let's say, structured approach to data reliability. And as part of that also acquiring and putting in place the tools and the processes to do that. Right? And again, it starts with testing. Right? Like, if you're a dbt user, you probably start to write tests, and then extending it into observability and other tools in your stack to really control and measure reliability of data and improve it over time and hold teams accountable to that.
[00:25:48] Unknown:
Just to add on what's changed, we actually commissioned a survey on the state of data quality with Wakefield Research. I think we surveyed over 300 data engineers and found some really interesting things. Like, for example, over 60% of companies or respondents actually said that data quality is way worse today than it was at the same time last year. I thought that was, like, 1 interesting data point. I'd say the second is, and this is maybe the good news, is that over 90% of respondents said that they are now actually actively investing in data quality or data observability solutions. I think if you commissioned the same survey, like, 3 or 5 years ago, it would be a far cry from these results.
And maybe tied to kind of what was said earlier, we also asked folks, like, what percentage of your revenue is impacted by bad data? Which percent of the company's revenue? I was actually shocked to find that north of 25% of the company's revenue is tied to bad data or to the impact of bad data. Again, I think, you know, if you were to look at this several years ago, the results would not even be close to that, and so I was shocked by these numbers, and I think they also speak to how critical data is and how critical it is that the data is actually accurate.
[00:26:59] Unknown:
Yeah. And I think it's interesting, your observation, that so many people said, oh, my data quality is so much worse than it was a year ago. And I think it's, you know, along with the conversation we were having earlier about not even knowing when there are problems, it's probably tied to the question of, you know, maybe they think about it as, well, a year ago, I didn't get any alerts for my data. Now I get it at least once a week where it's like, yeah. My data is worse because I actually know how bad it is.
[00:27:25] Unknown:
Yeah. For sure. You know, I think for folks who don't actually have something in place, it doesn't have to be Monte Carlo, but, you know, whatever it is, if you don't have sort of a solution that's, like, data observability, there's, like, this big clock ticking. And every minute that goes by, there's, like, some significant impact on your customer, on your business as a result of data downtime. So even if you're unaware of it, even if you're not getting that, like, weekly alert, in reality, businesses are impacted by that. Data teams, data products are impacted by bad data. And so even for folks who are not aware of it, I'm convinced that there's just, you know, hundreds of data teams out there that are just hit by it, but have not sort of thought about it proactively yet. And I think that's the big gap. It's not necessarily not having data issues related to your products, but rather not being aware of them. And it's starting with that awareness that I think is so critical.
[00:28:19] Unknown:
Another interesting aspect of this market is that at roughly the same time that you were getting ready to release your product, there were I can think of at least 3 other businesses that were in what could easily be considered either the same or very similar kind of product categories. And it's definitely 1 of those points in time where everybody has been dealing with this problem long enough that enough people decided that they were going to do something about it, and it just happened that they all did it at the same time. But maybe there's some element of the kind of level of attention in the venture capital market that had something to do with that too. And I'm just curious what you think are some of the catalysts that led to so many different companies attacking this category from their own particular directions in such a close time frame and rapid succession.
[00:29:10] Unknown:
That's definitely true of data observability. Honestly, it's a pretty common pattern in a lot of different software markets, and I can only guess as to why it happens. I think 1 aspect of it, as you point out, Tobias, is the venture capital community that shifts its focus into 1 space or another. And, you know, it's more likely to fund companies in a given space. And I think that was definitely the case for data observability in the last 2 or 3 years. I think another aspect is just teams getting inspired by 1 another. You know, maybe 1 or 2 teams start, and then a third team notices the traction, the excitement around the space, and maybe, you know, pivots into it or starts a new company in it. And so I think you see that happening as well.
And perhaps the third aspect is just kind of the reasons why data observability became important during that time. And I think there's a lot of smart teams out there and they're all thinking about problems as they emerge and become more and more important for the industry, so a lot of smart people think about it around the same time. There's also a confluence of technology here that enabled the space. I think the transition into cloud data warehouses and data lakes and data lake houses and whatever you call it, the emergence of dbt, all of these technical trends have made data observability possible. And so for all of these factors combined, it's typical that you see several teams operating in a similar space around the same time, and data observability I think is better for it. I think we all make each other better. Excited to be part of that space. Yeah. And I would say, you know, just adding to that,
[00:30:53] Unknown:
yeah, I think the last few years have been pivotal and important for data. Right? And I think COVID-19 and the acceleration of remote work have helped that. So, you know, just to give another example, Vimeo, 1 of our customers, is a video hosting platform. I think, you know, they have more than 200,000,000 users, and they use data quite extensively. They have customer data, marketing data, product usage data. I mean, they literally have billions of streaming events ingested per day, and they use that to make decisions on, for example, which customers need more video bandwidth at any given minute. You know, what type of devices are they using?
And actually, using data, they were able to kind of sustain growth during COVID-19 and actually open totally new revenue channels by leveraging data. And so they actually sort of doubled down on including SLAs for data, making sure that data is reliable throughout the life cycle. And, you know, I think for lots of folks, this was just a similar reality. Right? A reality where data teams are accountable for data products, and they spend 20 to 80% of their time making sure that that data is reliable and accurate. Like, the worst thing that you can do, the worst thing that can happen to you is that the data is wrong. And I think, you know, companies invested so much in building, you know, top notch data infrastructure and a really awesome stack, but then at the end of the day, the data's wrong. And so folks are facing that, and I think this last year was money time for data teams. And data teams were like, okay, we've invested so much in having, like, the best data warehouses, the best data lake, the best ETL, the best BI. We have all this awesome infrastructure.
Now let's actually make use of the data. And then when you turn to make use of the data, the data is inaccurate, then all of your efforts have basically been moot. And I think that's a very, very frustrating reality for folks. And when that happens, folks immediately ask themselves, well, how can we regain trust in the data? I think that's sort of 1 of the biggest pushes I've seen.
[00:32:51] Unknown:
In the growth of data observability, there has been more widespread visibility and understanding of the fact that it's even an option. The people who were some of your customers when you first launched, who started buying into this idea of data quality and data observability as the initial set of products and tools were being released, those are definitely the very early adopters, the forward looking engineers, the people who are constantly on the lookout for how can I do things better, faster, cheaper? And now that we have hit a point where the market has kind of agreed upon the terminology, agreed upon the value proposition, I'm wondering how you see the ways that customers are approaching this area have shifted, and the ways that their motivations and the ways that they come to this conclusion have changed from when you first started working on the product?
[00:33:47] Unknown:
I think what's changed is just, first of all, awareness. Right? Like you mentioned, in the early days, it was a bunch of early adopters, very visionary people that understood why, you know, why it makes sense to take some concepts from, you know, from DevOps and apply them to the world of data. Today, I think we're seeing increased recognition in the industry that data reliability, not just quality, but reliability is a foundational part of a data strategy and of a data stack. We're seeing a lot more companies that actually have a stated objective or strategy around solving data reliability at scale. Right? And sometimes it's driven by going public and knowing that you're going to be sharing your numbers with the street.
Sometimes it's driven by new products and capabilities that are very customer facing that are driven by data. Sometimes it's driven by, you know, it's sad to say, but like massive failures or problems that affected the entire company and that resulted in the team taking a proactive approach to data observability. But we're definitely seeing it coming into the mainstream. It's no longer just tech visionaries. Now, 1 of the biggest changes we've seen also is sectors, right? It's no longer tech companies that are doing data observability. It's companies in every sector that you can think about, whether it's media or manufacturing or CPG, or even car companies or educational institutions, right? And we've seen all of them adopting data observability and making it a central part of their data stack. And it's just great to see how the industry is so rapidly adopting this practice, which we believe is critical.
[00:35:33] Unknown:
As you have evolved your own capabilities, the ecosystem around you has evolved, and the specific concerns or the level of detail that people need to be able to solve their increasingly complex systems have changed. How has that driven your overall thinking about the role that you play in people's data platforms and the direction of the product that you've taken as a result?
[00:35:57] Unknown:
That's a really good question. And what's happening is kind of on 2 fronts, actually. 1 front is the ability to cover complex stacks. Right? And our customers use multiple solutions within their stack. You know, typically starting from a data warehouse or a data lake, but then they have the BI tools on top of it, and they have orchestration tools like dbt or Airflow or Dagster. And they have streaming infrastructure that feeds into the data platform typically. And our product has really evolved in its ability to cover all of those different pieces and correlate information. Right? Because when you have a data problem, if you only know what's going on with your data, you're limited. Right? And we touched upon it. Right? Like there could be a million different reasons why it broke. And if you don't have that information from across the stack about how all the different assets tie together, and about how they work together, and what issues you've seen in every single part of the stack, you're going to find it really, really hard to deal with data problems and to fix them. Right? And so, of course it's critical to monitor the metrics, but where the product really shines is the ability to pull together that information. And so for example, today in Monte Carlo, when you go and look at a data problem, you will automatically see a map of everything that's upstream of the table that's impacted, including things like dbt errors or test failures that happened anywhere upstream. Right? Or schema changes that happened or other data anomalies. Right? And that ability to take all of that information together is really what drives a lot of value for our customers. And that's really, really unique.
The other dimension that we saw evolving really nicely is that transition from, you know, from just monitoring and detecting issues to actually helping teams resolve them and even prevent them. Right? And again, it's all about, you know, it's not enough to monitor, like, you know, your 10 most important tables. You actually need to have deep visibility across your entire stack. Make sure that alerts go to the right person, the person that owns the thing and can actually act on fixing it. It's critical to give them all the relevant information, right? Whether it's from dbt or from their query engine, the users, or the BI layer, and really help them resolve it fast because we've seen it time and time again. If you alert the wrong person, if you inundate them with information that they can't act on, you know, your reliability program is just not going to be successful. And so Monte Carlo invested a lot in, you know, getting the right alert to the right person, making it very, very meaningful, and giving them all the context from around the stack to actually go ahead and act on it and solve it. And that's something that's very, very unique in the market.
The other thing is, you know, going beyond, you know, detecting problems and solving them to actually making the system better. Right? And this is also an area where Monte Carlo invested in actually foundationally changing the architecture and the system so that you actually have fewer problems and fewer issues. And Monte Carlo invested a lot in creating usage information, metadata, and statistics that help teams understand where problems are happening, how to simplify their architectures, sometimes how to even reduce cost and complexity in their systems.
And as you do all of these things, you actually get fewer incidents. And we've had some amazing success stories where companies literally within 6 months dramatically reduced the number of issues that they had by actually learning from the past, by understanding the structure of their systems, and by simplifying it, honestly. And so being able to address all these 3 pieces of detecting, solving, and preventing issues is probably the most dramatic change. And then being able to do it across the stack is something that made some of our customers extremely successful.
[00:40:18] Unknown:
Yeah. The question about the dialogue that it opens in terms of understanding how to evolve your own data architecture and data stack with the information that you get back from the observability platform, you beat me to the punch there. I'm wondering if you can maybe speak a bit more to some of the types of optimizations or evolutions in the kind of data architecture and data platform that the insights that Monte Carlo and observability solutions provide can inform and direct?
[00:40:48] Unknown:
Oh, yeah. Absolutely. It actually starts with mapping all the assets that you have and understanding how they are created on 1 hand and how they are used on the other hand. Right? For example, 1 simple thing that you could do is start looking at, okay, what are all the assets I have today that I'm investing a lot of resources in to create, and that are actually rarely or never used, never read. Right? And start deprecating those. Right? We've seen a lot of our customers doing that. What you accomplish when you do that is twofold. A, you save on costs. Right? There's no point paying for all that compute if you're not gonna use it. But the other thing is you're actually reducing your data debt. Right? You're reducing the number of tables where things can go wrong. You're reducing the number of options for data consumers to get their data. You're actually streamlining and simplifying the architecture. So that's 1 example of how you could use usage data and statistics to actually make the platform better.
Another example is you could start looking at your lineage and see that certain things are duplicated. Right? Maybe you have a set of dashboards that are trying to give visibility into the same area of the business or the same set of metrics, but they're actually reading from 2 versions of the same data. So you can actually see that, and moreover, you can lay over the reliability data about those 2 datasets. Right? So if you have 2 datasets that are used to measure users or revenue or whatnot, you can actually see which 1 of them is more reliable, and you can actually refactor the system to use a single dataset instead of 2 that are trying to accomplish the same thing. And so that level of understanding of the platform with the ability to understand reliability over time really gives you a lot of power in simplifying your architecture.
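As a rough illustration of the "written but never read" idea described here, below is a small sketch that scans access-history records for deprecation candidates. The record structure and table names are invented for the example; in practice this information would come from your warehouse's query logs or an observability tool.

```python
from datetime import date

# (table, last_written, last_read) -- imagine this came from query/access logs.
access_log = [
    ("analytics.daily_orders",    date(2022, 9, 1), date(2022, 9, 1)),
    ("analytics.tmp_backfill_v2", date(2022, 9, 1), None),
    ("analytics.users_snapshot",  date(2022, 9, 1), date(2022, 3, 14)),
]

STALE_AFTER_DAYS = 90
today = date(2022, 9, 2)

deprecation_candidates = [
    table
    for table, last_written, last_read in access_log
    if last_read is None or (today - last_read).days > STALE_AFTER_DAYS
]

# Tables still being built (and paid for) that nobody is reading:
print(deprecation_candidates)
# ['analytics.tmp_backfill_v2', 'analytics.users_snapshot']
```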
[00:42:51] Unknown:
The biggest challenge with modern data systems is understanding what data you have, where it is located and who is using it. Select Star's data discovery platform solves that out of the box with a fully automated catalog that includes lineage from where the data originated all the way to which dashboards rely on it and who is viewing them every day. Just connect it to your dbt, Snowflake, Tableau, Looker or whatever you are using and Select Star will set everything up in just a few hours. Go to dataengineeringpodcast.com/selectstar today to double the length of your free trial and get a swag package when you convert to a paid plan.
The other interesting thing to talk about is the question of understanding the value that a product like Monte Carlo brings to the data platform and a data team and the ways that organizations think about measuring and tracking the overall return on investment and the types of top level indicators that they're looking at to understand, you know, how much effort to put into something like a Monte Carlo and, you know, how that will impact the overall business.
[00:44:00] Unknown:
You know, I can speak a little to what folks or data teams are doing today. You know, I think there are a few different ways to think about measurement goals. 1 is in terms of kind of internal measurement, so service level agreements, service level objectives, and service level indicators. But that honestly is probably like a step forward from where folks are at today. I think 1 of the things that folks really wanna understand at the very basic level is time to detection. So how quickly do I find out about a problem in my data? For most data teams, this can be days, weeks, months. Can we actually get that down to hours? You know, some customers that we work with, Choozle, for example, I think reduced that by 90%, basically.
The second pretty fundamental metric is time to resolution. So once there's an incident or data problem, how quickly do we resolve that? And, again, that could be months. And when there's a clock ticking, you know, and every minute or hour is millions of dollars, time to resolution is really critical. And then finally, the thing that kind of follows on that thread is actually the reduction of data incidents overall. So, you know, to the discussion earlier, if we can identify how to improve the data stack or how to work with sources that are more reliable, you know, generally sort of deliver on reliable data products, we can actually sort of proactively reduce those data incidents to begin with. And so, you know, those are some of the things folks start out with. And then over time, they look at, you know, measures like a freshness SLI or a volume SLI and, you know, more kind of metrics that go towards reducing data downtime overall and actually increasing data reliability at the same time.
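As a simple illustration of the two baseline metrics described here, the sketch below computes mean time to detection and mean time to resolution from a couple of hypothetical incident records; the timestamps and record shape are made up for the example.

```python
from datetime import datetime
from statistics import mean

# (data actually broke, team was alerted, fix was verified) -- hypothetical.
incidents = [
    (datetime(2022, 9, 1, 2, 0),  datetime(2022, 9, 1, 6, 30),  datetime(2022, 9, 1, 11, 0)),
    (datetime(2022, 9, 3, 14, 0), datetime(2022, 9, 3, 14, 20), datetime(2022, 9, 3, 18, 0)),
]

def hours(delta):
    """Convert a timedelta into fractional hours."""
    return delta.total_seconds() / 3600

time_to_detection = mean(hours(alerted - broke) for broke, alerted, _ in incidents)
time_to_resolution = mean(hours(fixed - alerted) for _, alerted, fixed in incidents)

print(f"Mean time to detection:  {time_to_detection:.1f} h")
print(f"Mean time to resolution: {time_to_resolution:.1f} h")
```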
[00:45:41] Unknown:
And then on that question of kind of understanding the value, understanding the benefits of data observability, you know, we talked a little bit about the kind of data architecture and data platform aspect, but there's also the way that it can inform how and where engineers spend their time and think about investing their kind of efforts to be able to, you know, build systems that are more reliable, but also be able to build data assets that are going to be more reliable and are going to be used. And I'm curious how you've seen some of the information that you're providing influencing the aggregate behavior of engineers and teams in organizations that are very data focused?
[00:46:23] Unknown:
A 100%. So great question. So when it comes to sort of ROI or kind of, you know, what's in it for you from data observability, there's actually a set of 2 pretty simple ways to look at it. 1 is, like, money saved by reduction of data incidents, and there you can look at how many incidents you have and what's the cost of each for the organization. And then the second is data engineering time spent. And we have some data on both of those. So from our surveys and research, we know that data teams spend, for sure, north of 40% of their workday actually on data quality, and we hear numbers anywhere between 30 to 80%.
We also have heard that the average organization experiences something like 50 or so data related incidents per month. That actually adds up to literally hundreds of hours for a data team. And so for folks who look at the ROI, it's typically on those 2 measures. And we've worked with so many customers, like the ones I mentioned, Choozle and Vimeo and ASICS and Auth0 and Fox and CNN and many, many others, where, you know, you're able to both reduce the number of data incidents and you're also able to reduce the time that your data team spends on data incidents overall. So that's typically how sort of folks think about it. That's kind of the, you know, I would say the easiest way to measure. There's obviously kind of other benefits. Right? Like all the things that you could be doing in the time that you spent triaging your data incidents. Right? So new products that could be launched or, right, like other revenue generating activities that your data team could be working on that it isn't. That's obviously a lot harder to measure. I would add that as sort of a qualitative aspect of the ROI as well.
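For a back-of-envelope sense of the ROI framing above, here is a sketch using the rough figures from the conversation (about 50 incidents a month, roughly 40% of the workday spent on data quality). The hours per incident, team size, and hourly cost are assumptions chosen purely for illustration.

```python
incidents_per_month = 50        # "50 or so data related incidents per month"
hours_per_incident = 4          # assumed triage + fix + backfill time
engineer_hourly_cost = 75       # assumed fully loaded cost, USD

incident_hours = incidents_per_month * hours_per_incident
incident_cost = incident_hours * engineer_hourly_cost

team_size = 5                   # assumed team size
workdays_per_month = 21
quality_share = 0.4             # "north of 40% of their workday"
quality_hours = team_size * 8 * quality_share * workdays_per_month

print(f"Engineer hours on incidents:    {incident_hours} h/month (~${incident_cost:,})")
print(f"Engineer hours on data quality: {quality_hours:.0f} h/month")
```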
[00:48:01] Unknown:
In your experiences of building Monte Carlo, working with your customers, and engaging in the broader conversations around data quality and data observability that are happening in the industry and in the ecosystem, what are some of the most interesting or innovative or unexpected ways that you've seen Monte Carlo used specifically, or that you have seen data observability incorporated into data teams' overall objectives and workflows?
[00:48:30] Unknown:
I think, in terms of how people work, we've seen teams do really cool things. Some teams have created really sophisticated and granular reporting about their data quality. They actually use data that Monte Carlo provides, and we provide it through data shares or through APIs, and have built really granular dashboards that allow them to track the quality of their assets or their teams in very sophisticated ways. And they actually use that to run their weekly operational reviews. And that's something that we really enjoy seeing.
At the development workflow level, we've seen teams basically using observability to better understand the impact of changes they're about to make. They're no longer changing a field or deprecating it and hoping for the best. They can actually tell at a very granular level, in fact at the field level, what the downstream impact is going to be. They can tell exactly what tables and fields are going to be impacted and what reports and applications are going to be influenced, and then do that proactive communication: make sure that everybody's advised about the change, that everybody's ready for it, and that it doesn't break their systems as they roll it out to production. And so it really changes the way teams work, and that's been very gratifying.
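That kind of field-level impact analysis boils down to walking a lineage graph from the field being changed out to everything downstream of it. Here is a minimal sketch of the idea over a hypothetical, hand-written lineage map; a real implementation would derive the graph automatically from query logs and metadata rather than declare it by hand.

```python
from collections import deque

# Hypothetical field-level lineage: each key maps to the assets that consume it
# directly (tables, dashboards, applications). Names are made up for illustration.
lineage = {
    "orders.customer_id": ["dim_customers.customer_id", "revenue_dashboard"],
    "dim_customers.customer_id": ["churn_model_input.customer_id"],
    "churn_model_input.customer_id": ["churn_predictions_app"],
}

def downstream_impact(field, graph):
    """Breadth-first traversal to find everything affected by changing `field`."""
    impacted, queue = set(), deque([field])
    while queue:
        node = queue.popleft()
        for consumer in graph.get(node, []):
            if consumer not in impacted:
                impacted.add(consumer)
                queue.append(consumer)
    return impacted

# Before deprecating orders.customer_id, see who needs a heads-up:
print(downstream_impact("orders.customer_id", lineage))
```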
That's why we started Monte Carlo, you know, to make the lives of data engineers better and easier and less stressful.
[00:50:11] Unknown:
And so these sorts of things make me really happy. You know, a couple of years ago, one of our early customers said something like, the only tabs that I use are Gmail, BigQuery, and Monte Carlo. And that became our bar for what an excellent product looks like. We think data observability will be so critical to data engineers that they won't be able to do their jobs without it. And I would say I've been surprised by how quickly the industry is adopting that. It's really not uncommon for us to speak with folks today where data observability is just so ingrained in their stack that they don't really operate their data stack without something like this. So I think that's really powerful.
[00:50:48] Unknown:
In your experience of building the platform, building the product, and engaging with the conversation around observability, what are some of the most interesting or unexpected or challenging lessons that you have each learned in the process?
[00:51:01] Unknown:
I think one of the things that I think a lot about is focus, both for us at Monte Carlo and for our customers. If I put myself in the shoes of our customers, they have a lot going on. They could be doing a ton of different things, they have a ton of asks, and they probably have a backlog of things that might take them about a year to get to, if they're lucky. And so being really thoughtful about the one or two things that you can do that really move the needle is a very important decision, both for data teams and for us internally. And I think, for my own personal journey, there's a lot of noise and there are a lot of things that we could do, but at the end of the day it's about remembering the focus on our customers, remembering how we allow a data team both to make their lives easier and to deliver a great experience. Whether it's working with a company in the medical space, or in InsurTech with influence on the ability to deliver loans, making sure that folks can actually get credit cards and mortgages. If the data that's powering that is wrong, people's lives are impacted.
And so focusing on that, focusing on how we improve those experiences through reliable data, and keeping a relentless focus on what matters has been one of the most important things that I've learned.
[00:52:23] Unknown:
For me it's actually pretty similar. The one thing I learned is to always listen to customers. We all come with preconceptions about what the important problems are and how to solve them, which we bring in from our own experiences and challenges and the organizations we've worked in. But ultimately every organization is unique and every customer is unique, and you have to really listen deeply and understand how to solve people's problems. One example of that that I always find curious is that oftentimes people say, okay, let's do testing and monitoring: just pick the 10 assets that are most important and add a bunch of tests to them. And if you really listen to a lot of teams, you'll actually learn that most teams can't even say which assets are most important to them. Like, what are those 10 tables that you're going to really monitor?
And that was a really important insight for Monte Carlo. It's true for decentralized teams, and it's even truer for centralized teams. People don't always know out of the box where to even start. And so building Monte Carlo in a way that allows teams to do that and gives them that information was actually one of the most important aspects of the product. And I don't think we would have guessed that, at least not from my personal experience. It's something that we really learned from working with data teams out there and understanding their challenges.
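One common way to bootstrap the "which assets matter most" question is to rank tables by observed usage and downstream dependence rather than asking people to name them. The sketch below is a hypothetical heuristic with made-up numbers and weights, not a description of how Monte Carlo actually does it.

```python
# Hypothetical usage stats per table, e.g. derived from warehouse query logs and lineage.
table_stats = {
    "fct_orders":      {"reads_last_30d": 1200, "downstream_assets": 15},
    "dim_customers":   {"reads_last_30d": 900,  "downstream_assets": 22},
    "stg_clickstream": {"reads_last_30d": 40,   "downstream_assets": 2},
}

def importance_score(stats, read_weight=1.0, downstream_weight=50.0):
    """Simple weighted score; a real system might also weigh lineage depth, owners, and SLAs."""
    return read_weight * stats["reads_last_30d"] + downstream_weight * stats["downstream_assets"]

ranked = sorted(table_stats, key=lambda t: importance_score(table_stats[t]), reverse=True)
print(ranked)  # candidate tables to monitor first
```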
[00:53:55] Unknown:
As you continue to build and grow the Monte Carlo platform, and continue to explore the possibilities and requirements of observability for data platforms and data products, what are some of the things you have planned for the near to medium term, and what are some of the areas that you're excited to dig into?
[00:54:13] Unknown:
There are several areas that we're focused on. One that has been a cornerstone of our strategy so far is continuing to work on our ability to cover as much of the stack as possible. One recent release around that was support for the Databricks Delta ecosystem that they're building; we just launched this and are continuing to invest in making that integration as strong as it can be and giving people the most powerful Databricks solution out there. We're going to continue to invest in covering additional pieces of the stack. Streaming is top of mind, and we definitely want to be able to do more for our customers there; we're getting a lot of demand around that. And on the other side of it, going back to the detect, resolve, prevent framework, we're going to launch a lot more capabilities around helping teams actually prevent incidents, foundationally improve the reliability of their systems over time, and catch problems as early as possible so that they have the least amount of impact on their data products. So that's where we're focused, on top of, obviously, streamlining and improving all the core capabilities that we've built so far. We love working in partnership with our customers, and so today and forever, a large amount of our bandwidth is dedicated to the myriad feature requests that we get for our solution.
[00:55:35] Unknown:
Well, for anybody who wants to get in touch with each of you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspectives on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:55:51] Unknown:
I mean, I will say I'm biased, but obviously I think observability is a big one. So if anyone is investing in a modern data stack, you'd better make sure that it's reliable. But other than that, one of the areas that I'm excited about is folks who are enabling other data teams to actually build data products in an easier way. So there's a team, I think they're called patch.tech, that's allowing the creation of APIs on top of Snowflake and making that easier.
[00:56:21] Unknown:
Pretty excited about that. I think a lot about technologies that help bridge the gap between the data platform and the production system, and Patch is one example both Barr and I are pretty excited about. And then there are other companies basically helping make the products of the data platform more usable where the data is needed. Another example of that, by the way, is the data activation or reverse ETL companies like Hightouch. They're basically helping teams make data more valuable for the organization and get it used beyond just the context of the data platform. And so that's really exciting. I'm also kind of curious to see what happens with the BI space, with Google having acquired Looker, which was the innovator in the space. It'll be interesting to see what the next generation looks like and what innovations we're going to see around that.
[00:57:15] Unknown:
Well, thank you both very much for taking the time today to join me and for all of your efforts in building the Monte Carlo product and helping to engage in and collectively evolve the overall space of data observability and data quality
[00:57:30] Unknown:
and helping people build more reliable and resilient data systems. So I appreciate all the time and energy that you and your teams are putting into that, and I hope you enjoy the rest of your day. Thanks, Tobias. Thank you for what I think is one of the best podcasts in data, certainly in data engineering. So thanks for having us, and I hope we'll see you in a couple of years again for V2 of What's New with Data Observability.
[00:57:58] Unknown:
Thank you for listening. Don't forget to check out our other shows: Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com. Subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a product from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends
[00:58:33] Unknown:
and coworkers.
Introduction and Overview
Interview with Barr Moses and Lior Gavish
Market Changes in Data Observability
Importance of Data Reliability
Content Strategy and Customer Engagement
Growth and Adoption of Data Observability
Shifts in Customer Approaches
Evolving Capabilities and Product Direction
Measuring ROI and Impact
Innovative Uses and Lessons Learned
Future Plans and Exciting Areas