Summary
The ecosystem for data professionals has matured to the point that there is a large and growing number of distinct roles. With the scope and importance of data steadily increasing, it is important for organizations to ensure that everyone is aligned and operating in a positive environment. To help facilitate the nascent conversation about what constitutes an effective and productive data culture, the team at Data Council have dedicated an entire conference track to the subject. In this episode Pete Soderling and Maggie Hays join the show to explore this topic and their experience preparing for the upcoming conference.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Hey there podcast listener, are you tired of dealing with the headache that is the 'Modern Data Stack'? We feel your pain. It's supposed to make building smarter, faster, and more flexible data infrastructures a breeze. It ends up being anything but that. Setting it up, integrating it, maintaining it—it’s all kind of a nightmare. And let's not even get started on all the extra tools you have to buy to get it to do its thing. But don't worry, there is a better way. TimeXtender takes a holistic approach to data integration that focuses on agility rather than fragmentation. By bringing all the layers of the data stack together, TimeXtender helps you build data solutions up to 10 times faster and saves you 70-80% on costs. If you're fed up with the 'Modern Data Stack', give TimeXtender a try. Head over to dataengineeringpodcast.com/timextender where you can do two things: watch us build a data estate in 15 minutes and start for free today.
- Your host is Tobias Macey and today I'm interviewing Pete Soderling and Maggie Hays about the growing importance of establishing and investing in an organization's data culture and their experience forming an entire conference track around this topic
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what your working definition of "Data Culture" is?
- In what ways is a data culture distinct from an organization's corporate culture? How are they interdependent?
- What are the elements that are most impactful in forming the data culture of an organization?
- What are some of the motivations that teams/companies might have in fighting against the creation and support of an explicit data culture?
- Are there any strategies that you have found helpful in counteracting those tendencies?
- In terms of the conference, what are the factors that you consider when deciding how to group the different presentations into tracks or themes?
- What are the experiences that you have had personally and in community interactions that led you to elevate data culture to be its own track?
- What are the broad challenges that practitioners are facing as they develop their own understanding of what constitutes a healthy and productive data culture?
- What are some of the risks that you considered when forming this track and evaluating proposals?
- What are your criteria for determining whether this track is successful?
- What are the most interesting, innovative, or unexpected aspects of data culture that you have encountered through developing this track?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on selecting presentations for this year's event?
- What do you have planned for the future of this topic at Data Council events?
Contact Info
- Pete
- @petesoder on Twitter
- Maggie
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers
Links
- Data Council
- Data Community Fund
- DataHub
- Database Design For Mere Mortals by Michael J. Hernandez (affiliate link)
- SOAP
- REST
- Econometrics
- DBA == Database Administrator
- Conway's Law
- dbt
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Sponsored By:
- TimeXtender: ![TimeXtender Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/35MYWp0I.png) TimeXtender is a holistic, metadata-driven solution for data integration, optimized for agility. TimeXtender provides all the features you need to build a future-proof infrastructure for ingesting, transforming, modelling, and delivering clean, reliable data in the fastest, most efficient way possible. You can't optimize for everything all at once. That's why we take a holistic approach to data integration that optimises for agility instead of fragmentation. By unifying each layer of the data stack, TimeXtender empowers you to build data solutions 10x faster while reducing costs by 70%-80%. We do this for one simple reason: because time matters. Go to [dataengineeringpodcast.com/timextender](https://www.dataengineeringpodcast.com/timextender) today to get started for free!
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Are you tired of dealing with the headache that is the modern data stack? It's supposed to make building smarter, faster, and more flexible data infrastructure a breeze. It ends up being anything but that. Setting it up, integrating it, maintaining it, it's all kind of a nightmare. And let's not even get started on all the extra tools you have to buy to get it to work properly. But don't worry, there is a better way. TimeXtender takes a holistic approach to data integration that focuses on agility rather than fragmentation. By bringing all the layers of the data stack together, TimeXtender helps you build data solutions up to 10 times faster and saves you 70 to 80% on costs. If you're fed up with the modern data stack, give TimeXtender a try. Head over to dataengineeringpodcast.com/timextender where you can do two things: watch them build a data estate in 15 minutes and start for free today. Your host is Tobias Macey, and today I'm interviewing Pete Soderling and Maggie Hays about the growing importance of establishing and investing in an organization's data culture and their experience forming an entire conference track around this topic. So, Pete, can you start by introducing yourself?
[00:01:22] Unknown:
Sure. I'm Pete Soderling. I'm the founder of Data Council. Data Council is the world's first data engineering conference, which we've been excited to build for the last 8 or so years. And now I'm also the founder of Data Community Fund, which invests in day-zero engineer founders in the B2B data space. And, Maggie, how about yourself? Sure. So I'm Maggie Hays. I'm the community product manager for DataHub.
[00:01:45] Unknown:
DataHub is the leading open source metadata management platform, so my focus is on making sure that we're building a really robust offering for our open source community and really driving community-led product development.
[00:01:58] Unknown:
And going back to you, Pete, do you remember how you first got started working in data?
[00:02:02] Unknown:
Yeah. I started working in data in probably 2009 or so, officially. I mean, obviously, I was fascinated with databases as a young software engineer. I remember reading that book, Database Design for Mere Mortals, over Christmas at my parents' house one year, probably in my early twenties, and was amazingly fascinated by such dry, boring content. So I think that sort of kicked off my interest in data. But in 2009, I started a startup that was in the data space. It was called Strata Security, and we were helping businesses sell premium data, like Bloomberg or Comscore or Hitwise, through REST-based APIs. And so that was a very early experience I had in the world of data. And early in the lifetime of REST as well?
Yeah. Very early. The product had to straddle SOAP and REST APIs, and it was sort of a strategic decision that we had to make as to which we wanted to support first. Because with a small, scrappy startup, of course, you can only do so many things. And so that actually felt like a big decision at the time that we had to make. Yeah. I was gonna ask why you didn't say SOAP, because
[00:03:07] Unknown:
I have my own battle wounds around that. Exactly. And back to you, Maggie. Do you remember how you first got started working in data?
[00:03:13] Unknown:
Yeah. My first exposure to data, or really understanding data analytics and the power around it, was in undergrad back in 2007, I think. I took an econometrics class for my econ major, and I was a terrible, terrible econ student until I took econometrics; that's kind of when everything started to click. So, after graduating, I landed a job as a, let's see, what was it, a market information consultant at Bank of America, which was a very fancy term for what we now call a data scientist. So I was really digging into understanding how people spend and save money and are kind of incentivized, or at that time not incentivized, by interest rates, and I have been in the data practitioner space ever since.
[00:04:00] Unknown:
And so bringing us around to the topic at hand, can you each give your working definition of what you mean by the term data culture? Because it's definitely one of those phrases that could probably mean anything to anyone. So for the sake of this conversation, it'll be good to have a particular grounding in what we actually mean when we say that.
[00:04:19] Unknown:
Definitely. I'll kick that off. When I think about data culture and what I'm seeing among data practitioners in general, it's really building a strategy around how an organization invests in tooling, education, and collaboration around data. So tooling is not the be-all and end-all solution for everything. You also really have to invest in the implementation, or clear guidelines around who owns what and where it fits within the stack. So really just making sure that everyone within the organization, or even a subset of the organization, understands their place as a producer and/or consumer of data.
[00:05:00] Unknown:
And I think also this notion that data isn't just for DBAs anymore. Right? Coming from the last 10 or 20 years, we used to have this gatekeeper over all the data. It was called the DBA, and, of course, they served an important function. But I think a more modern approach to data management, and to the way teams collaborate around data, is that data flows through every single role and team and product and aspect of every corporation. And so we're all touching data, or our customers are generating data, or we're reporting on data. And I think for the whole organization to understand the implications of keeping that data high quality and clean, to understand the principles around interacting with that data, and that we all contribute to the overall data lifeblood inside of a company, is really key to understanding how the modern organization has transitioned into thinking about data culture. Because it's not just the domain of the database administrator anymore, and I think that's an important concept.
[00:06:02] Unknown:
And another interesting aspect of this topic is that whenever you're getting hired somewhere or when you're talking to a leadership team, there's always this concept of the culture of the organization, where we have a fun culture or we have a very serious culture. And, you know, culture is one of the things that everybody tries to tout, but nobody really has a good definition for. And I'm curious what you see as the distinction between the corporate culture of an organization and the data culture of an organization, and what are the ways that they are interdependent?
[00:06:32] Unknown:
Yeah. I love this question, and I spent a good amount of time thinking through this because I agree that it can be extremely nebulous and really hard to nail down. But the way that I think about data culture as being interdependent with an organization's core culture is in defining how we work. Right? So the guiding principles for how we go about our work on a day-to-day basis. But where I think it's more distinct is in defining what that means in a cross-functional sense.
So really getting into the day-to-day of how one team hands off responsibility for, or creation or delegation of, data to another; that can't really be captured in a corporation's culture. Right? That's very much interpersonal. I don't know if I wanna go so far as to say data contracts just yet, but it's building out those agreements between different teams, and how those evolve over time, just to be really productive in the long run.
[00:07:46] Unknown:
Yeah. And I could definitely see a lot of interplay in the organizational culture and how that manifests in the data culture, particularly when you have the juxtaposition of very formalized and procedure-driven teams versus very fast and loose startup-style teams. If you are in an enterprise organization, everything has to be run through the compliance department and everything has to be run through your legal department before you can decide on how different teams are going to interact with each other. Because of things like Conway's law, that's going to manifest in the ways that the interchanges of data are established and maintained, versus in a startup where it's just, oh, you want that data? Sure. Here you go. And there's no real oversight to it. Totally. I'm curious what you see as the different components and elements that come together to generate the overarching data culture within a team or around a particular product.
[00:08:38] Unknown:
The stage of an organization's development is definitely critical to informing what that culture looks like. But I think the underpinning of all of it is identifying and holding others accountable to the responsibility of creating or accessing data. So even if you're in that fast and loose, you know, startup, we need to hustle, just give me access to the data. I think there's still a requirement to start talking about the best practices around that. How do you scale that so that you can continue to move as quickly as you'd like to in the startup
[00:09:12] Unknown:
space? And I'm glad you mentioned Conway's Law, Tobias, because I think that's a really interesting angle. I think that in a lot of ways, the culture of a company really is manifested in the systems that they build. And, obviously, this is what Conway's law means: the implementation of your systems often reflects the hierarchy and the control structures of the organization. And many companies are designed culturally to be very siloed, and they have very specific incentives for a very narrow group of people. They're almost vertically integrated, if you will, to behave and conform to a certain, you know, standard way of behaving. And in a lot of ways, I think we've learned from open source that this is sort of the antithesis of a collaborative culture, and many companies really suffer from this, especially large companies. So I think that the large company with a siloed incentive structure sort of has a DBA at the bottom of the tree, managing all the data and fiercely guarding the keys to that data store. And once you start to open the parameters a bit and learn what it means to make everyone accountable and responsible for data across the org, I think some of those silos break down. And you see that in things like data mesh today, where we're sort of embracing the data where it lives and working to put it under governance and some kind of management system, but we're not actually enforcing that all the data lands in one repository controlled by a central team. We're actually developing technical approaches to sort of reflect
[00:10:52] Unknown:
the collaborative nature that businesses really feel they need to be highly efficient with their ever-increasing amounts of data. Another interesting aspect of this question of culture is the fact that you can be very intentional about it or you can be accidental. And that can have very different outcomes as far as how it actually comes about, where there's no way to avoid having a culture. It's just a matter of what that looks like and whether it's the culture that you actually want or just the one that happens to grow up around the ecosystem that it's growing within. And I'm curious what you see as some of the ways that different teams and companies and product organizations think about the motivations around being explicit about creating and supporting a data culture, and some of the examples of accidental data cultures that you've seen grow up as
[00:11:43] Unknown:
well? I feel like one example of an accidental data culture is when everyone transforms data as is right in their own eyes. Or when every developer feels empowered to create a new table in the database so that they can guarantee that the data is clean and fresh and meets their standards. Now that's a way of empowering developers, and probably, you know, Facebook or someone like that would aspire to that culture from way back when, but it can also create a lot of chaos. And so the pendulum is swinging these days, and now we have these concepts and discussions around, quote, data contracts. How is a data contract different than a schema? Well, you know, I'll let people smarter than I am weigh in on that one. But I think it's a reflection of people desiring to be very explicit around how data flows through an organization and who's responsible for what, where. So I think those are two examples of ways that culture might sort of emerge inside teams that obviously have very different impacts
[00:12:44] Unknown:
on the overall company. Mhmm. Yeah. And I think I've lived through many iterations of this on the ground, where I've had a very structured and principled data access culture, right, where you need to go through a bunch of hoops to get access to data. I've been on the far other side of that, where I've had access to any and all data, and it's my job to compile it to then, you know, help the organization make a decision. And I will not be so bold as to say that there's one perfect way for every organization, but I do think the absence of a strategy is what really bites you at the end of the day. Right? So if the strategy is to really lead with that data democratization, then I think it behooves folks to think through what are the best practices, what are the patterns, what are the audit structures that we'll have in place to make sure that it doesn't turn into that winding web or, you know, mess of spaghetti code to answer ad hoc questions or to build ad hoc resources along the way. And so I think the other challenge is to start to push towards more structured or more principled data access or data management or data transformation without boiling the ocean, and also without scaring people away or dissuading them from leveraging data.
And so one way that I talk to a lot of folks about doing this is through just, you know, treating it as though it is product development, and treating it as a very iterative approach of setting clear outcomes that you're working towards. So why are we implementing some sort of process? Or why are we implementing some sort of rigor or hoop to jump through? And measuring the success of that as you go, instead of just having kind of a blanket
[00:14:31] Unknown:
set of rules that everybody needs to follow, because no one wants to do that. And in terms of the work that you're doing at Data Council, you've decided that this subject of data culture is something that is substantial enough to merit its own dedicated track. And I'm curious what your overall process has been, as you have grown the conference and as the surrounding ecosystem has evolved at such a rapid pace, to understand what factors you consider when you're deciding how to group the different talk submissions, how to think about when and how you want to segment them along different categorical lines, and how to figure out what those categories even look like, the granularity. It's a very rich topic. I'm sure we could probably spend the whole time just on this, but I'm curious to start down that path of understanding why data culture has its own track, and why now? Yeah. I think that Data Council has always
[00:15:30] Unknown:
been the best place for technical teams to discover the latest, greatest tools of data. That applies to open source tools. That applies to closed source tools. We've always had a very rich community of folks and speakers and the authors of these open source projects, Wes McKinney and others, who come to Data Council and talk about the latest, greatest tooling that they're building. So that's very meaningful, I think, for a particular kind of company, especially when you already have the table stakes of being data driven, and you've bought into the efficiency that those tools can bring you. But then I realized that there are larger companies who would come to the conference, and they'd sort of be awash in the sea of tools. They'd try and dabble in tools, or they'd pick them up and they wouldn't stick, or, you know, something would happen that didn't particularly lend itself to this data panacea that they were promised. And I think we realized that it really is a cultural issue. These companies want to become data driven, and they want to learn from the best startups that are advancing the future of data, like we feature at the conference. But there's almost a ground zero that they need to embrace, which we've sort of shaped into this data culture and community track. So this is our response to try to help more companies that aren't just the Bay Area startups, or the ones who are building the data tooling themselves, embrace the power of becoming data driven and really build that culture in the organization so that they can adopt the tools successfully and have them stick in the first place.
[00:17:02] Unknown:
And as far as the work of actually getting submissions and evaluating the submissions, is this a chicken and egg situation where you needed to create the track in order to get this type of submission, or was it a situation where you were already getting these types of submissions and you needed somewhere to put them? I'm wondering how that dynamic has played out from when you launched this conference with this track and what your experience has been there. Is this something that the types of people who are already submitting to Data Council were thinking about and talking about already? Or was it a matter of, hey, we wanna bring in some different types of people, and this is a way for us to engage with people who are, you know, interacting with data at a different level than our typical attendees?
[00:17:44] Unknown:
I think it's a little bit of both. And, you know, as the conference organizer, whenever I get the idea behind a new track, my goal is to go out and find the best person to be the track leader, who's a luminary in that space, who's respected by the community, who has lots of connections. And so that's why I tapped Maggie and asked if she'd come and support this new track in the conference. And so she's done an amazing job in corralling speakers, and I'll give it over to her to chat a little bit more about what that process looked like.
[00:18:14] Unknown:
Yeah. And we definitely had some submissions off the bat, and it was actually really great to work with Pete on this because we started the track focused on open source communities. And I actually went back to Pete and said, hey, what if we expand this beyond just open source communities and focus on building a culture around data, either internally within your organization or externally around your product or around a project or whatever that might be? Because I think there's a lot of overlap in terms of the learnings and the takeaways that folks can get from it. So that was a really exciting evolution of the track itself, to talk about data culture, or culture around data tools, more broadly.
And so then I just set out to tap my network across some DevRel spaces that I work in, across some broader connections through the LinkedIn of it all, really just looking for folks who have experience in building those communities, either internally or externally, from the ground up. So it's been a combination of, we're seeing some submissions coming through, but also, hey, there's really something here for us to tap into and for us to do some outreach to folks who can really share their experience and also help other people think through how that experience may interplay with their own challenges or the objectives that they wanna reach on their own. And I think the key here is that
[00:19:49] Unknown:
we realize that there is this nature of evangelism inside of a company that needs to happen in order to help turn the tide, and for an attendee at the conference to go back to their large company and become that evangelist. And it's not dissimilar to the way a, quote, dev evangelist, you know, tries to build community externally, perhaps around an open source project or some tech. And so I think the key insight here, and the reason we put these two pieces together, is that whether you're externally evangelizing or internally evangelizing, there are lots of key things that you can learn from folks on the different sides of that divide. It actually reminds me of the way that hackathons sort of emerged to become, you know, innovation externally for a company, for a particular product around their API, or some way of gestating startup ideas, etcetera, as an external thing. But then internally, companies started to adopt hackathons to increase collaboration and innovation and inspiration inside their own companies and to try to adopt new products. So I think that there's a similar kind of analog here with the external evangelism that we learn from, you know, companies like Snowflake and others who will be appearing at the conference as they build their external communities, and lessons that can be learned by people who wanna be armed to go back and convince their boss and their internal teams that building a data culture is an important thing to do.
[00:21:13] Unknown:
And as far as the information and the sense of the ecosystem that you've gathered from reviewing these presentations and talking to the people who are proposing them, what are some of the broad challenges, the coarse buckets of challenge, that practitioners and teams are facing as they go through this journey of developing their own understanding of what is a data culture, how do I build a healthy and productive data culture, what are some of the potential pitfalls that I might encounter, or what are some of the lessons that I learned from painful experiences? And, you know, did I push through it or did it cause me to leave? I'm just curious what some of the interesting bits of information are that you've been able to gather through that process.
[00:21:53] Unknown:
Yeah. So I think from the side of building an internal data culture, a lot of what I'm excited for folks to hear about is how people have overcome the overemphasis, or kind of over-indexing, on expecting tools to solve every data problem, and realizing that there's a broader strategy that needs to be in place. There's leadership buy-in that needs to happen. There's stakeholder buy-in or kind of coaching that needs to happen. And so I think it'll lend a lot of insight into how folks can start thinking through maybe moving from more of a reactionary data practice to more of a collaborative data practice, or from more of an ad hoc data extraction or data compiling function to more of a data-product-defined function for a data team. From the community side of things, I think there's a growing emphasis or excitement around building a community around a technology, and there are a handful of playbooks. You know, for me personally, in the open source space, the dbt community is forever kind of our north star of what we strive to be. But how you get there is still, you know, very much choose your own adventure. And then I think for more SaaS-centered communities, there are a lot of different ways that folks can think through what type of developer community they are looking for, and what the pros and cons and trade-offs of that are. So we have a handful of experts. We have Wesley Faulkner joining us as a senior community manager from AWS to really break down what are all of the different approaches to community? What are the pros and cons of those?
How do you know which one is right for your organization and how do you align that with your goals? And then we're also gonna hear from a couple of folks from Snowflake who have been there since the beginning. So Felipe Hoffa and Daniel Myers are gonna be talking about how they built out DevRel for Snowflake and how all of these different approaches that they've taken have really been critical to that success. So a lot of it will be focused on, again, there's no one perfect or correct way to build a community, but what are the objectives behind it, the ways that you can measure your success, challenges you can expect to face along the way, and lessons that other folks have learned along the way. That way, you don't have to go through that trial and error yourself.
[00:24:16] Unknown:
As you were working through the process of putting together this track, deciding that this was a topic that was broad and rich enough and had enough general interest to dedicate an entire segment of the conference to it, what were some of the ways that you thought about potential risk to the conference, and just some of the different ways that the risk manifests, not even necessarily just to the success of the conference? You know, risk is such a broad and personal subject. I'm curious to get your sense of it without me trying to frame it too much. I mean, maybe the biggest risk of adding this track to Data Council is that there's just so much content this year. And people talk about FOMO, and, you know, some people really love the single track conference. Well, Data Council decided
[00:25:03] Unknown:
years ago that we would not be a single track conference, and and, so to speak, we're paying the price. Because now we have 3 or 3 or 4 ongoing tracks at any 1 point during the day. We actually increased the footprint of the conference this year from 2 days to 3 days, just to make room for this track and several others that we thought were, hot topics and representative hot hot topics in the data community. So, the conference is awash with content. It's a really amazing place to come, and there's something for everyone. The speakers are all deeply technical and and very well accomplished, and, and the data culture community track is no exception. But there is a lot of content, and a person has to be able to sort of wade through that, when they wanna, take advantage of the full data council experience.
[00:25:43] Unknown:
And another element of risk that might be interesting is kind of the risk involved in understanding which are the right presentations to bring in, or some of the ways that you think about the criteria for a particular talk. And is that something that is, you know, a different rubric depending on the track? Or, you know, I'm just curious how you think about the specifics of this topic area as it relates to the broader conference.
[00:26:09] Unknown:
Yeah. We are always trying to get a diversity of content and of speakers at Data Council. And so there's large companies. There's small companies. There's new projects. There's established projects. And, really, I think this track is no different. I think that we aspire to showcase a cross section of really interesting things that are happening across data culture and community teams overall, and, you know, Maggie has been the arbiter of that. So we typically have very high standards for speakers at Data Council. You know, it's difficult to disappoint folks that submit and are really passionate, especially in a community role, about speaking about their company and their project at the conference.
You know, we obviously only have limited space, and so we have to make some tough decisions sometimes. But Maggie's done a great job of shepherding this track and really selecting the speakers that we think are gonna leave the most impact and really be the most inspirational to attendees. Because you don't typically think of a data conference as being inspirational, but I think you'd see, when people leave Data Council with their eyes really lit up from all the awesome people they met and the cool things they learned, that we actually aspire to turn Data Council into an inspirational event, and this track is no different.
[00:27:27] Unknown:
Yeah. And that's something I was really keen on in seeking out the speakers: one, again, that variety or that diversity in backgrounds, diversity in experience, also diversity in topics. Right? I didn't want a track where, you know, every single talk is about how to build the right DevRel community or how to, like, crack the code on building out the data culture. I really wanted nuanced approaches to that because it is a very nuanced topic. So I was looking for people who, one, have that background and expertise, but more importantly are excited and passionate about sharing their story and their journey, because this is not about implementing a specific thing. Right? It's evaluating your environment, identifying what success looks like, identifying opportunities or trade-offs that you need to make along the way.
And so I was really looking for folks who are excited and passionate about really sharing and teaching and coaching and helping people walk away from these kind of more nebulous topics with a better idea of how they can start to grapple with that on their own.
[00:28:36] Unknown:
And from the topic of the overall conference and the organization that you've grown up around it, Pete, given its history and the massive rate of change of the ecosystem, I'm curious what you have seen as the evolution of the space that led to this being such a rich topic at this point in time, and if there are other track categories that you have retired because they fell out of favor or because there were other things that eclipsed them, and just your sense of the evolution of the broader conversation around data and its role in business and its broader applications?
[00:29:10] Unknown:
Yeah. I think that, you know, we started off as a data engineering conference, as a meetup first in 2013 and then a conference in 2015. And, over time, even though we were committed to helping the industry sort of figure out what it even meant in those days to be a data engineer, because it was so early, what we discovered was that all the other professionals from all the other adjacent layers of the stack started to show up at our meetups. And that included data scientists and data analysts and others. And so we learned to embrace sort of the fullness of that stack. And over the years, Data Council has unquestionably become a full-stack data conference. We cover data infrastructure, data eng, data science and models, data analytics, AI-driven product features. We'll have a bunch of gen AI stuff at the conference, as well as building data products tracks this year, which are all new, in addition to the data culture and community track. So I think we have really considered it our place in the ecosystem, as a third-party independent conference that's not owned by any vendor, to really embrace a lot of the changes across data infrastructure and ETL and ML tooling platforms and all the layers of the data stack. So we've been quite, you know, fortunate to have been in a place where we can grow so fast with the industry, for the community. So we're quite proud of that, and I think this year will be no exception.
[00:30:35] Unknown:
I think that the subject of data culture is also an indicator of the level of maturity that the broader data practice has reached where, as you mentioned, to begin with, it was data engineer versus data scientist, where before that even it was just data scientist, and they did everything because nobody really knew what that meant. You know, before that, it was the database administrator who also did all the ETL work and all of the reporting. And so now we're at this point where there is this level of nuance when it comes to data, where we've kind of fractured into, I mean, fractured isn't even the right term. We've fractaled into where we are right now, where we have data engineers, analytics engineers, data scientists, machine learning engineers, you know, data product owners. Now we're talking about data culture because there is so much breadth of focus in terms of how all these people are managing all of this work to be done, and so much more investment has been put into data as a core competency at the organizational level and at a societal level, that we need to be able to understand what all the interrelations are across these different roles and how these people need to think about interacting with each other, because they have become their own microcosm within the broader ecosystem of business and society.
And so I think that that's another reflection of what we've grown to over the past few years.
[00:31:58] Unknown:
No question. I think, yeah, it even extends out beyond that. Right? Like, it extends out to the stakeholders who do not understand, or, you know, it's not their job to understand, the nuance between all of those differentiated data roles. Right? So, like, how do you start to build a collaborative workforce outside of maybe a data organization? Right? So you're working with software engineers who may otherwise just kind of think that they're just throwing things over the fence to the data team. Well, no. Like, we actually need to bring those people into the conversation. We need to have kind of clear guidelines or clear, you know, I can't think of the word. Protocols. Alignment. Yeah, protocols, exactly, for what those interactions look like, so that, you know, we can kind of break this cycle of data engineers or data analysts or analytics engineers or data scientists, or whatever the role is, being the cleanup crew that makes the data workable. And, again, I just kinda keep coming back to this idea that, you know, the conversation around culture is arising more and more because we've realized that it's not just a tooling problem. We can start to, you know, bring in our anomaly detection. We can pull in our data quality checks. We can pull in data contracts. But at the end of the day, there is a human element to making these things successful. And so I think we're, you know, just kind of facing the hard reality, well, hard and exciting reality, that we need to move beyond just unit tests and implementation challenges and actually start to overcome those interpersonal or cross-functional challenges as well.
[00:33:31] Unknown:
And I think your point about bringing the business owners and stakeholders into the conversation is another piece of the evolution that we've been experiencing, where it used to be, oh, the data is the tech people's problem. I don't need to worry about it. I just need to throw the question over the wall, and then they give me what I want. Whereas now they've been brought closer to the overall operation, and we've been bringing about aspects of self-serve access and trying to educate people on some of the specific semantics of data and the vagaries of what's involved in being able to work with it effectively. So there is a much more detailed conversation happening across a wider variety of roles, and as a continuous process versus discrete units of work. And so now that you do have this track, you have a full slate of speakers.
You are going to be throwing the conference in, what are we, a little less than a month away now. So I'm wondering, as you prepare for that and as you go through the experience, once you come out the other side, what are the criteria that you have for determining whether or not this particular track is successful, and what the track is going to look like in future years? Is it something where maybe this doesn't need its own dedicated track, and it's actually something that gets embedded as aspects of the other topics? I'm just curious what your criteria are to think about and reflect on this experiment as you move forward.
[00:34:56] Unknown:
Well, I know that both Maggie and I definitely wanna see full rooms on this track. So please come to Data Council if you wanna learn about data culture and community and how top teams are doing it. So that's obviously the key KPI for us. But Data Council attendees are really amazing at sounding off on Twitter and social media and in other ways, and so we're confident that we're gonna hear good things from the community about the track, and they're not shy about telling us what they think. So there'll be both a qualitative and a quantitative aspect to answer that, but we're looking forward to inviting people to come to the event and enjoy this awesome track.
[00:35:30] Unknown:
And in your experience of preparing this track and working with the presenters and doing your own exploration of the ways that these conversations are happening throughout the community, what are some of the most interesting or innovative or unexpected aspects of data culture that you've encountered in the process?
[00:35:45] Unknown:
I think the thing that I continue to be surprised by is how nebulous and nascent we are in this space, given that, you know, the big data world has been firing on all engines for, what, the past 15, 20 years. And I think there's just still a lot of human-focused challenges that this set of practitioners, or this type of practitioner, has not figured out. Maybe that's actually not surprising. Maybe I shouldn't be surprised by that considering, you know, we tend to be much more technically oriented folks who set out to solve very technical problems and not necessarily interpersonal problems.
But the thing that I've been continually surprised by is that no one really feels like they've cracked the code. It's more, here's what I tried, here's what worked, here's what didn't. Hopefully, that will help you in your journey as well. So I think there's just still a lot of broader lessons for us to learn as we scale this data culture implementation, which is, you know, gonna be kind of ongoing and/or tangential to all of the technical problems, all the technical scaling problems, that we're gonna be facing along the way. So I think the thing that continues to surprise me is that, technology aside, there's still plenty of challenges for us to collectively overcome to be successful in building out really data-driven organizations.
[00:37:17] Unknown:
Yeah. And to your point that, you know, the big data industry has been around for over a decade now, it's always a little surprising to take a step back, as people who are taking part at the forefront of these capabilities, and reflect on the presence of what Scott Hanselman refers to as dark matter developers. You know, we're the people who are out there talking about it all the time, so we think that everybody's doing the same thing, but the large majority of people are actually just keeping their heads down, doing their day to day. They have no idea what Dagster and Airbyte and dbt are, or analytics engineering. They're just, you know, chugging along with, this is my ticket, this is what I'm gonna get done. And as with any technical system, the hardest part is always people. Totally. Totally. Isn't that true? And the last to be prioritized. Right? Yes. The hardest and the last to be prioritized.
[00:38:08] Unknown:
Yeah. So this will be the unabashedly human track at the conference this year, and it's really great to be able to do that. Because, again, we have such deeply technical content that it really is nice to be able to focus on some of the human collaboration, community, and culture aspects, because, otherwise, they can easily be missed. And as Maggie has so eloquently put it, it's just so critical, I think, in building high-performing teams oriented around data in the future.
[00:38:39] Unknown:
Conference, what are the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:38:45] Unknown:
I mean, this is my first time hosting a track with Data Council, so I think the part that I underestimated was, very ironically, the human element of how much work it would take to collaborate with speakers and, you know, just kind of make sure that these are high-quality, impactful, and informative experiences for folks. So it's really, again, the human element of it. I think that some people actually don't realize that they are experts in this area. Right? It's easy to kind of overlook how impactful or important it is to drive interpersonal or cross-functional efforts within an organization.
So it's really about championing and kind of elevating people who have done it well, and helping them understand that, you know, what they may be doing that's maybe second nature to them isn't necessarily second nature to other people, and that they have learnings that they can disseminate that others can really, really benefit from. So I think the coaching part of it and the curating of it has definitely been a really exciting challenge for me.
[00:39:55] Unknown:
What about you, Pete?
[00:39:56] Unknown:
Yeah. I think the challenge for me is that we just get inundated with submissions these days. And, again, there's just such high-quality folks that wanna speak at the conference and are excited to participate. It's very hard to have to tell people no, to be totally honest. And, you know, we're very lucky to have such great folks who have interest in the conference, and we're excited to welcome as many as we can as speakers this year, and the rest we hope will come as attendees and participate in speaker office hours and continue to enrich the community, because we try to provide a lot of surface area for this sort of institutional knowledge to come out into the water supply. And so, you know, it's both challenging and rewarding, and we hope that this year will be like past years and be very rewarding and engaging for attendees.
[00:40:43] Unknown:
And as you continue to invest in Data Council and its community, I'm wondering what your plans are for future iterations and venues for this topic to take place and continue to percolate?
[00:40:59] Unknown:
Yeah. Well, I think this is a key aspect of building data companies. And as we've mentioned, it's not just the internal culture, but it's also the external communities and how to build communities around your data tools and products as well. And those two things are connected at the hip. So I've been looking for an excuse to be able to run this track at the conference for years, and I think the interest from the ecosystem is now at the point where it made sense. And I don't expect to see that wane anytime in the near future. So we plan to continue on this road with Data Council this year and in years to come, and to really enrich the community and give them the ability to collaborate and interact with each other on this topic.
[00:41:43] Unknown:
Are there any other aspects of the topic of data culture and the work that you're doing at Data Council to promote conversations around it that we didn't discuss yet that you would like to cover before we close out the show? I think the conference is the main one. So, yeah, probably nothing else really to add there at this point. Okay. Alright. Well, for anybody who wants to get in touch with either of you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get each of your perspectives on what you see as being the biggest gap in the tooling or technology that's available for data management today?
[00:42:16] Unknown:
Definitely the metadata layer. I mean, there's such amazing things going on with, as we mentioned, data mesh and sort of appreciating the data where it sits and meeting it where it is, and yet, at the same time, building this rich ability to collaborate around the data from a cultural perspective. These things are not easy to achieve technically. And so I'm really excited about the metadata layer because I think that it powers so many things at an underlying level, like lineage and quality and governance and many other things. And Maggie could speak much more about that because she's also involved in that space, but I think it's a particularly exciting set of fundamental infrastructure that is a bit missing from the current data tooling landscape right now.
[00:43:06] Unknown:
I mean, selfishly, well, I don't know if it's selfishly, but it's definitely a biased take because I do work so closely in the metadata space. I do think that metadata is something where we've really only scratched the surface on what we can do with it, and where, honestly, I think community is critical, because a lot of what we're doing is figuring out: why do we even care about metadata? What can we do with it? What are the best practices? What are the common developer patterns that folks should be following? And what's really interesting is if we think about metadata as kind of an intersection with the governance space. Data governance is not new. You know, data governance has been around for as long as we've had data-generating organizations, for compliance reasons.
But how do we expand out data governance so that we can treat it not just as a compliance necessary evil, but as something that actually fuels productivity, as something that minimizes confusion, as something that ensures high levels of data quality? And how do we do that in a way where it's not just an engineering exercise, but it's actually cross-functional, with collaboration from operational or business stakeholders, really making sure that data is generated and modeled and stored and transmitted and ETL'd and, you know, reported, and all of those things, in a way that's moving the company or the organization forward? So I think the metadata layer is something where, again, we're really just scratching the surface of what we can do with it, but where I think community is really critical to informing what that collective strategy looks like going forward.
[00:44:39] Unknown:
Alright. Well, thank you both very much for taking the time today to join me and share the work that you're doing on the conference and investing in this topic of data culture and how it's manifesting throughout the community. So appreciate all the time and energy that you're putting into that, and I hope you enjoy the rest of your day. Thanks so much, Tobias. Thanks for having us.
[00:45:04] Unknown:
Thank you for listening. Don't forget to check out our other shows, Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a product from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Overview
Guest Introductions: Pete Soderling and Maggie Hays
Early Careers in Data
Defining Data Culture
Interplay Between Corporate and Data Culture
Components of Data Culture
Intentional vs. Accidental Data Culture
Data Council Conference and Data Culture Track
Challenges in Building Data Culture
Evolution of Data Roles and Culture
Human Element in Data Culture
Evaluating the Success of the Data Culture Track
Lessons Learned and Future Plans
Closing Remarks