Summary
Modern businesses aspire to be data driven, and technologists enjoy working through the challenge of building data systems to support that goal. Data governance is the binding force between these two parts of the organization. Nicola Askham found her way into data governance by accident, and stayed because of the benefit that she was able to provide by serving as a bridge between the technology and business. In this episode she shares the practical steps to implementing a data governance practice in your organization, and the pitfalls to avoid.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
- This episode is supported by Code Comments, an original podcast from Red Hat. As someone who listens to the Data Engineering Podcast, you know that the road from tool selection to production readiness is anything but smooth or straight. In Code Comments, host Jamie Parker, Red Hatter and experienced engineer, shares the journey of technologists from across the industry and their hard-won lessons in implementing new technologies. I listened to the recent episode "Transforming Your Database" and appreciated the valuable advice on how to approach the selection and integration of new databases in applications and the impact on team dynamics. There are 3 seasons of great episodes and new ones landing everywhere you listen to podcasts. Search for "Code Comments" in your podcast player or go to dataengineeringpodcast.com/codecomments today to subscribe. My thanks to the team at Code Comments for their support.
- Your host is Tobias Macey and today I'm interviewing Nicola Askham about the practical steps of building out a data governance practice in your organization
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an overview of the scope and boundaries of data governance in an organization?
- At what point does a lack of an explicit governance policy become a liability?
- What are some of the misconceptions that you encounter about data governance?
- What impact has the evolution of data technologies had on the implementation of governance practices? (e.g. number/scale of systems, types of data, AI)
- Data governance can often become an exercise in boiling the ocean. What are the concrete first steps that will increase the success rate of a governance practice?
- Once a data governance project is underway, what are some of the common roadblocks that might derail progress?
- What are the net benefits to the data team and the organization when a data governance practice is established, active, and healthy?
- What are the most interesting, innovative, or unexpected ways that you have seen data governance applied?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data governance/training/coaching?
- What are some of the pitfalls in data governance?
- What are some of the future trends in data governance that you are excited by?
- Are there any trends that concern you?
Contact Info
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
Links
- Website
- Master Data Management
- Cartesian Join
- DAMA == Data Management Community
- DMBOK == Data Management Body of Knowledge
- DAMA DMBOK Wheel
- CDMP (Certified Data Management Professional) Exam
- Data Mesh
- Data Governance First Steps Checklist
- The Never Normal
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Sponsored By:
- Red Hat Code Comments Podcast: ![Code Comments Podcast Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/A-ygm_NM.jpg) Putting new technology to use is an exciting prospect. But going from purchase to production isn’t always smooth—even when it’s something everyone is looking forward to. Code Comments covers the bumps, the hiccups, and the setbacks teams face when adjusting to new technology—and the triumphs they pull off once they really get going. Follow Code Comments [anywhere you listen to podcasts](https://link.chtbl.com/codecomments?sid=podcast.dataengineering).
- Starburst: ![Starburst Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/UpvN7wDT.png) This episode is brought to you by Starburst - an end-to-end data lakehouse platform for data engineers who are battling to build and scale high quality data pipelines on the data lake. Powered by Trino, the query engine Apache Iceberg was designed for, Starburst is an open platform with support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by the teams at Comcast and Doordash, Starburst delivers the adaptability and flexibility a lakehouse ecosystem promises, while providing a single point of access for your data and all your data governance allowing you to discover, transform, govern, and secure all in one place. Want to see Starburst in action? Try Starburst Galaxy today, the easiest and fastest way to get started using Trino, and get $500 of credits free. Go to [dataengineeringpodcast.com/starburst](https://www.dataengineeringpodcast.com/starburst)
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. This episode is supported by Code Comments, an original podcast from Red Hat. As someone who listens to the Data Engineering Podcast, you know that the road from tool selection to production readiness is anything but smooth or straight. In Code Comments, host Jamie Parker, Red Hatter and experienced engineer, shares the journey of technologists from across the industry and their hard-won lessons in implementing new technologies. I listened to the recent episode, "Transforming Your Database," and appreciated the valuable advice on how to approach the selection and integration of new databases in applications and the impact on team dynamics.
There are 3 seasons of great episodes and new ones landing everywhere you listen to podcasts. Search for Code Comments in your podcast player or go to dataengineeringpodcast.com/codecomments today to subscribe. My thanks to the team at Code Comments for their support. Your host is Tobias Macey. And today, I'm interviewing Nicola Askham about the practical steps of building out a data governance practice in your organization. So, Nicola, can you start by introducing yourself?
[00:01:13] Unknown:
Yeah. Thank you for having me here, Tobias. So my name is Nicola Askham. I'm known as the data governance coach, and I help organizations understand and manage their data better. I've been doing data governance for 21 years now. So I've had quite a lot of fun doing that and helping all sorts of organizations, you know, reduce costs and inefficiencies and, to be fair, even these days remain relevant by implementing data governance. And do you remember how you first got started working in data? I think I'll never forget. So I worked for a very large bank in the UK at the time, and I'd been running an operations department and felt that all these projects that came down from head office and implemented changes on us didn't do it all too well. So I fancied myself a bit of a project manager, and I applied for a project management job and was told that I didn't get it, but the team next to them also had a project manager job going.
And, you know, I impressed them so much that they gave me the job. And it was a data management team, and they were building a new data warehouse. And I didn't even know what a data warehouse was. If somebody had told me it was a big room full of paper, I probably would have believed them at that point. So it was a baptism of fire because I'd come from a business background. So we had a really young, enthusiastic project team, had great fun, and we designed and built the new data warehouse model, and then we migrated the data over from the old one to test it before we moved the source systems over, and we realized that, you know, the data was pretty rubbish in places.
Let's be honest. So we were really young and keen, and we sorted out how we could cleanse and fix and make things better. And you try being a young, naive, and very inexperienced project manager getting finance to sign off their reports because they were different. And I'm just going, yes, but they're better. We made the data better. And I couldn't understand why this guy didn't want better data quality. And, you know, I can't remember how I ever got him to sign off things, but we eventually migrated the source systems. And guess what? Our data went straight back to where it was because we never fixed the source. We'd fixed everything on the fly.
And I carried on doing all sorts of data warehouse related projects. I only did projects in the data space, master data management, all involved in that. And then, as all organizations do, they had a restructure. And my boss told me one day I couldn't be a project manager anymore. And I was gutted because I really thought I'd found my thing, this sense of order and sorting things out. And I particularly loved the data related projects that I'd been involved in. And she was all, tough. They're all being put in a central pool. We don't want to lose you, but you've got to make up a job. And I'd never been told that in my life, in my career. And I didn't know what to do because the vast majority of the team was the business intelligence and MI part of the team. And I'd done my SQL training when I joined the team, and, well, on the course, I could write basic SQL. No problem. Could do that. As soon as I actually tried to do it in real life to see what the users were doing to my data warehouse, the DBAs rang me up because I'd accidentally done a Cartesian join.
And with every table joined to every table, I kind of ground the whole data warehouse to a standstill. They weren't very polite. So I was pretty convinced at this stage that perhaps my career didn't lie in coding. But I was part of a data management team, and I didn't know what else to do. So I had to think about it, and I asked my boss if I could go and talk to people about who should care about the quality of the data. I think she rolled her eyes and said, you're not still going on about that, are you? But it was something along the lines of, don't upset anybody senior. And I just started talking my way into senior people's offices and asking them who should care about data. And I got some better reactions than others. And I can't promise I didn't upset some senior people, but I don't think it ever got back to my boss. I got away with it.
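For readers who have not run into the Cartesian join mentioned above, the following is a minimal sketch of why it hurts, using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their row counts are hypothetical and chosen only for illustration; they are not from the episode.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer_{i}") for i in range(1, 101)],  # 100 customers
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, (i % 100) + 1, 10.0) for i in range(1, 1001)],  # 1,000 orders
)

# No join condition: every customer row is paired with every order row.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM customers, orders"
).fetchone()[0]

# With a join condition: each order is matched only to its own customer.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM customers c JOIN orders o ON o.customer_id = c.id"
).fetchone()[0]

print(cartesian_rows, joined_rows)
```

The row count of the unconditioned query is the product of the two table sizes, so on warehouse-sized tables the same mistake can produce billions of intermediate rows.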
[00:04:56] Unknown:
And so you mentioned master data management. You mentioned data quality. As I introduced the topic, I said that we were gonna talk about data governance, and that's one of those terms that has come to be very nebulous and means many different things to many different people. And I'm wondering if you can just start by giving your summary of how you view data governance and the scope and boundaries of what it means in the context of an organization.
[00:05:22] Unknown:
Yeah. I think that's a really good question because I think a lot of the issues I perhaps help some of my clients overcome are because they're not sure of the scope of data governance. And I think some of the issues come from the DAMA DMBOK wheel. I don't know whether you've come across that. So DAMA is the Data Management Association. And for full transparency, I used to sit on the DAMA UK chapter board for 13 years. So great fan of DAMA. I think it does some really good work forwarding the whole data management disciplines. But they had the DAMA wheel, as it's often known, and it's easily available on the Internet. It was drawn when the first version of their body of knowledge was created.
And it's a wheel with segments, for anybody listening that hasn't seen it. And in the bull's eye in the center, they've put data governance. And then all around the rest of the wheel are a lot of other data management disciplines ranging from, you know, database management, data security, all the way through to the ones you're mentioning. And I've mentioned data quality, master data management, reference data management. And they're all around there on the edge, but they've put data governance right in the middle. Now that is because data governance provides the foundation and supports, to be fair, all of the data management disciplines, but some of them a lot more closely than others, and in particular, data quality and master data management.
Now, in my experience, they're often done best if they're all done on the same team, rather than being split, but I do know a number of organizations split them. But I think this data governance being in the middle of the DMBOK wheel means that if you aren't a member of DAMA, if you haven't done your CDMP exams, people don't understand what that diagram's fully showing, and they think that the words data governance in the middle mean everything on that wheel is data governance. And I think that hasn't helped with the confusion. So one of my clients did it first, but I've recommended it to so many other clients since, because I think it's a useful thing to do.
Change the words on it to the names of your teams, and you can use it to kind of highlight what the scope of data governance is for your organization, because you might have a separate data quality team, you might have a separate master data team. And, you know, it's also a good way to explain who does what, because I do find that data governance is much more business facing than some of the data management disciplines. And so if you're the face of data talking to the business, they then go, oh, you do data. So you must do all of these things. So it's quite good to have a little diagram based on the DMBOK wheel that says, no. No. I only do these things, and go to these people for data security or, you know, data warehousing.
[00:08:05] Unknown:
Another common view of data governance is that, oh, that's the thing that you do when you're an enterprise, but I'm a small company, so I don't need to worry about it. And I'm curious if you can talk to some of the ways that you see that perspective become problematic, and at what point does the lack of an explicit governance policy become a liability to the organization? Yeah. I think that's a really good question. And I've worked with all sorts of sizes of organizations over the years.
[00:08:34] Unknown:
And in an ideal world, you should have, you know, your data governance framework as early as you possibly can. But the trouble is, if you are a small company, you kind of get, well, we don't need all this. I know what the data is. I'm the main person here. It's all in one person's head. As you start growing, it gets really complicated. And I've had some early, like, prospect client calls with startups, and they never go anywhere because startups are all about leveraging on a shoestring. Basically, they want to spend as little money as possible growing the company. So spending time and money and effort doing this data governance thing isn't of interest to them. But I think what happens, particularly with startups, is the company grows quite quickly.
And if their data isn't documented and understood, and we don't know whether it is good enough quality, it quickly becomes unmanageable by the few people who've got the knowledge in their heads. So I would say you should have an approach as early as possible, but I think people tend to over-egg it. I think I definitely did in the early days. I was a bit of an evangelist, and all data had to be perfect and we'd all live happily ever after. And I think that doesn't win you friends and supporters when you come to do it. So I think what you've got to ask is, what could we put in place that's a good foundation so that, as we grow, we can perhaps raise the bar as we need to? Because the bigger and more complicated your organization, the more systems you have, the harder any kind of use of your data becomes. And that's where data governance really comes into its own. But it's really hard to go back and clear the historic backlog, so if we're a smaller organization, we need to try, if we can, to just get the bare bones in now so that we're in a good place as we start getting bigger.
[00:10:15] Unknown:
Digging a little bit further into what you were saying, that as the data governance person, you're the one who is talking to all the business people. I'm wondering if you can talk to some of the ways that, in particular, technical teams have misconceptions about what data governance is and what it means, in particular, what it means to them, and some of the ways that the business has a misconstrued view of what the data teams do because they only talk to the governance person?
[00:10:44] Unknown:
Yeah. So I think it's mixed, and it does depend on the organization. But, generally, I find that a lot of the technical teams are really happy to work with data governance because a lot of them find that their jobs are made harder because the data isn't well documented and understood. And, you know, also, I don't think that people do this deliberately. I just think they haven't been told that there was a better way of doing it. So I'm never after a witch hunt when I do it. So, you know, quite often the technical teams are really keen to work with me to help solve problems. I think one of the big things I see from technical teams is that they're quite often used to, you know, somebody from the business rings up and says, can you change this field on this database?
And they're kind of going, well, I don't know whether I should, but, you know, so is it either that they do it because they know me and I ask nicely or because I shout and make a fuss? But am I really the person that has authority to do that? And I think data governance actually puts in place a proper framework, so not just anybody can ask for any changes, which makes it easier for technical teams. They're not trying to do that best guess. Should I or shouldn't I do this? And will I get shouted at later down the line for making this change? It makes the business clearly accountable for that. But I also think that for everything you do, you've got a point of contact in the business. So one that comes to mind a lot is data security, you know, cybersecurity, always a very technical team.
But when they try and talk to the business and say, who's responsible for doing the security classifications, the encryption levels, the business will all go, oh, well, probably not me. You should try talking to Tobias. I think he knows this data. And you go, I don't think it's me. You try somebody else. And, you know, they go from pillar to post. But as soon as you've got a data governance framework in place, you've got somebody who has agreed that they are the data owner, if we've done this properly, and that they are the right person to make decisions about the data. Now as a data governance person, I'm going to ask them to do some definitions on what makes the data good enough to use in terms of data quality. But from a technical point of view, you can come and ask, right, you're the person who makes decisions about this data. Tell me the security classifications. What encryption level do you want on this? And that's repeated across all the different teams. When you have a question, you need to go and ask somebody. It's really clear because we've got the data owners all agreed, and it just makes things simpler.
And I think the other thing that I've been told, anecdotally, but by multiple clients over multiple years now, is that having data governance in place makes the business much better at giving you their data requirements. They're able to articulate them better because somebody's taught them the language on how to do that. So, yeah, it kind of makes it sound like I think data governance just solves everything. But I do think we bridge that gap and break down the silos of them and us between the IT teams and the business, because, you know, I've sat in quite a lot of meetings where business users have told me IT don't do it properly for them, and IT have told me separately that the business won't explain what they want. And they're all on the same side. They just don't realize it, and they just don't talk the same language. So I think sometimes data governance can act as a translator, that facilitator between the two.
[00:13:52] Unknown:
Absolutely. And I was gonna point out that as somebody who is very technically focused, it's very tempting to just go and you build your data warehouse, you have this magnificent data model, everything is as it should be, and then you hand it over to the business users, and you say, here you go. Here's all the data that you asked for. Go be data driven. And then the business users say, what is all of this nonsense, and what am I supposed to do with it? And I think that, to your point, having data governance as a standing practice, as the bridge between those two sides, reduces the potential for that type of hard handoff to happen, and instead you are evolving the requirements and the scope as it builds rather than just one massive handoff of here's everything. Good luck.
[00:14:37] Unknown:
Yes. Very much so. And I've had so many, let's say, reporting and data analytics teams who say, you know, business users say, can I have this report or this information, and they'll build the report for them. They'll spend ages because we don't have a data catalog. You don't know what it is. You think you found the right data. You've done some fixing or, you know, I tend to call it data wrangling. You might be less polite. I've heard less polite for it, but, you know, it's taken a lot of effort. And I think the business user thinks you just press the button and out pops this report kind of thing. You give them the report, and then they go, what's that? That's not what I asked for. And as soon as you've got data governance and a data catalog in place, you can say, tell me which of these fields you want in your report. And it saves so much time and effort.
And it forces the business users to be able to explain what they want a lot better.
[00:15:25] Unknown:
And that point of the data catalog, it is definitely a necessity as you scale the size and complexity of the data, the number of stakeholders. I'm wondering what you see as the foundational requirements from a technical perspective for being able to facilitate data governance as a practice.
[00:15:47] Unknown:
I think that's a good question. I think, you know, when I started doing data governance, there were no data governance tools. They didn't exist. So even then I probably called it a data glossary or a business glossary, and that's evolved into the data catalog over the last few years, but it would have been on an Excel spreadsheet. And, yeah, it's great when you're just starting and you haven't got many data domains or systems in the scope of your initiative. But as that grows, or you work for, you know, a large organization, an Excel spreadsheet just isn't scalable.
You've got the version control and access, because, you know, this isn't a wiki. This isn't something everybody accesses and changes as they like. This is an authoritative source of what our data is. So it starts getting really, really tricky, and I think that's where the tools really come into their own now, in that they make things scalable in a way that they never were before. And they're also becoming much more flexible. So you can't compare all the data governance tools. A lot of them have a data catalog at their core, but a lot of them now will do the automated data lineage. They'll do the data quality issue workflows.
So they're all subtly different, but they really start making it easy to do this work. So, if you have a complex organization, particularly if you're a global organization, you know, I'm sitting in London, in the UK, so how would I know that the data owner's somebody in America that I've never heard of before? And how am I even going to find out? I probably won't bother, which is where we were before we started doing data governance. So this is where the tools really can become powerful. Because I can go and look to see who owns this. Oh, it's Tobias. And there's a little click to send him a message to say, could we change the definition for this field because I need to record this in it as well, please. And it doesn't matter that I hadn't heard of you 5 minutes before. It just facilitates all those conversations.
And I think they're becoming more important the more that our data is distributed, and particularly with moving data into the cloud, things like that. I've had so many people say to me, do we have to do data governance now? It's in the cloud. It's on a third-party system. And I'm going, yeah, you know the only difference is it's your data on somebody else's servers. You know, you still need to know what it is, how you're using it. So technology moving on, I think, is really exciting, but I think it makes data governance probably even more important than ever before, but harder than ever before.
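The owner lookup described above can be sketched in a few lines of Python. This is an illustrative toy, not the data model of any particular data governance tool: the `CatalogEntry` fields, the sample entries, and the `find_owner` helper are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CatalogEntry:
    """One field in a hypothetical data catalog."""
    field_name: str
    definition: str
    data_owner: str               # the person accountable for decisions about the data
    security_classification: str  # agreed with the data owner


# A tiny in-memory catalog keyed by field name (sample values are made up).
catalog: Dict[str, CatalogEntry] = {
    entry.field_name: entry
    for entry in [
        CatalogEntry("customer_email", "Primary contact email address", "Tobias", "confidential"),
        CatalogEntry("order_total", "Order value including tax", "Nicola", "internal"),
    ]
}


def find_owner(catalog: Dict[str, CatalogEntry], field_name: str) -> Optional[str]:
    """Return the agreed data owner for a field, or None if it is not cataloged."""
    entry = catalog.get(field_name)
    return entry.data_owner if entry is not None else None


print(find_owner(catalog, "customer_email"))
print(find_owner(catalog, "unknown_field"))
```

A missing entry returns `None` rather than a guess, which mirrors the point made in the conversation: without an agreed owner on record, nobody has clear authority to decide on definitions or security classifications.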
[00:18:13] Unknown:
Another growth in that challenge is that data systems used to be just the data warehouse. Everything lived in one place. People would query it. They would get their response back, and then they would go on their merry way. Now we've had a proliferation of data sources, a proliferation of data consumers, a proliferation of data storage layers and data processing layers. As you mentioned, there has been some measure of parity in terms of the tools that we use to keep track of all of that in these automated metadata catalogs, but I'm curious how you have seen this explosion in the types, availability, and variety of data and its applications impact the strategic elements of data governance and how to effectively manage it in an organization.
[00:19:04] Unknown:
Yeah. I think it's really tough because I always want data governance to be the facilitator or the enabler, the thing that's making everything else work better. So, you know, your organization's investing in a shiny new tool. You and I know if we put the same rubbish data that's on the old system into the shiny new tool, it won't magically get better. I think one of my associates that worked for me once got so frustrated with a client. He said, what do you think it is? You think there are data fairies in the, I think this was in the data warehouse. It was back a few years ago. But there are no magic tools that magically correct the data for us and make it all understandable. So I think with this explosion, and particularly gen AI, everybody thinks it will just do all the data for us. It'll just do all the coding. We still need the humans to do the context. So I think, if anything, you know, we need data governance more than ever, and we've got to be very clever about what tools we use and how we use them. So, particularly with Gen AI, I just keep thinking, you know, if people want to do that, it's exciting. It's a bit like when data science came along: we'll do that. We don't wanna do the boring data governance stuff, but let's do data science and do the clever things, really analyzing our data and getting these amazing insights. People are now moving on to, let's do Gen AI.
But Gen AI is only as good as the data that your models were trained on. So we've really got to make sure that we put the right data, and that it's good enough quality, into our LLMs. Otherwise, you know, the answers we get out are gonna be wrong, and we're gonna be, you know, accessing and analyzing our data and coming to the wrong answers faster than ever before, I think is the trouble. But some people are thinking it's authoritative because it's come from Gen AI.
[00:20:43] Unknown:
Exactly. The computer said it, so it must be true. Yes. Exactly. Which is scary. Another interesting aspect of generative AI is the fact that people don't necessarily understand what its capabilities actually are. I was having a conversation a little while ago with somebody who has a business intelligence tool that is powered by generative AI where they can have a conversation to figure out what is the report that I'm looking for. And they said that one of their users typed in, how does my revenue projection compare to all of my competitors? And they said, there's no way that we can get you that data. The AI doesn't know that.
[00:21:23] Unknown:
I know. And I think people just think, as, I suppose, you know, the Internet and LinkedIn have led us all to believe, that AI has the answer, but it only has the answers if we've given it the data in the first place.
[00:21:36] Unknown:
To that point, how much do you see that coming into play in your work of helping companies manage their data governance approach, of just helping to educate people on both sides about what is actually possible and what work is required to achieve their stated, desired outcome?
[00:21:56] Unknown:
So I I I think it's there's a lot of conversations to be had right now. And I think from, like, the the business user side, I've always advocated or recommended what's probably more lastly be called data literacy training, but I used to call it data culture training because, you know, the business users, vast majority of them didn't think that they had anything to do with data or if they did, it was just the necessary evil that made their job harder because it was never the right day to roll on time or whatever.
And I've always been saying we need to let them understand this is an asset. We need to manage it and care for it the same way we do our other assets in our organization. But we now need to get them thinking about AI as well. So almost, like, have this basic AI literacy that all needs to be rolled into the same training as far as I'm concerned, so people understand what they can and can't use Gen AI for. I I went to, the big, Gartner Data Analytics Summit in London the week before last, and and there were awful examples that talking to people, and they were saying where people have just taken company data and put it into chat gpt, not understanding that you've put that data out on the Internet and now it's available to your competitors, to whoever.
So we need a lot of that literacy, I think, that understanding from the business point of view. But I think also from the technical point of view, we've got to understand, you know, is our data AI ready, I suppose, is probably what it is. Because we can't do Gen AI without our tech teams doing all this clever stuff for us. And we don't want to get totally carried away on implementing the technology without the data going hand in hand. So I think we really need to work together. For many years, I've been told, you know, data governance is the one that's going, no, don't do that. But I would say it shouldn't be a no, don't do that. It should be a work with us, with the data governance team, and we'll help you do that. And we can't do everything at once. But if Gen AI is moving so quickly, then we'll have to prioritize that. So work with your Gen AI team, work with your data governance team, get everybody working together to understand how we can prioritize and focus on the right things.
Because, ultimately, we are all on the same side. We all want good results for our organization.
[00:24:13] Unknown:
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for. Starburst has complete support for all table formats, including Apache Iceberg, Hive, and Delta Lake. And Starburst is trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst today and get $500 in credits to try Starburst Galaxy, the easiest and fastest way to get started using Trino.
So for an organization and a business that is just starting down this journey of data governance, it can become very overwhelming. You think you have to boil the ocean and deliver everything all at once. I'm wondering what are some of the most useful concrete first steps that you recommend when you're working with an organization to help them establish a data governance practice and understand what is that overall road map where the ocean has finally been boiled?
[00:25:20] Unknown:
Yeah. So it's gonna take you many years to boil the whole ocean. So I always like to say to people, the very first step is that you've got some kind of mandate or authority to start doing data governance, because I didn't in the early days, and I found out the hard way why that doesn't work. There's only so far that enthusiasm and energy can get you. You need an executive sponsor, or the authority from an executive sponsor, to go and start talking about this and thinking about this. Because otherwise, you're just talking for the sake of it, and somebody will say, yeah, it sounds like it's gonna take longer to get on with this more exciting project that we want to do. So, you know, go away, Nicola. We'll come back to you later, which is the polite version of what I was told in the early days. So you need that authority. You need somebody senior on your side saying this is a good idea, we need to find out more about this. Then next, I always think it's really important to identify my senior stakeholders. They're very likely to be the people that are going to be my data owners, but I don't necessarily tell them that at that stage because that could scare them off, and I want them on board. But people always respond better to change if they feel like they had a say in it. So I'm never gonna recommend that you just say, right, hello, you, you, and you, you're data owners, we're doing data governance. Here it is. Go and get on with it. So we've got to let them think they have a say in it. So identify your senior stakeholders, and they'll probably mainly be business people, but you need some senior people from IT as well because they're gonna help them with the art of the possible. What can you already do? Quite often, your organization has technical tools that the business have no idea are even there. So we want to have this nice even spread of stakeholders, and we want them to then start almost like steering.
So I always try very hard not to say data governance is a project, because it's a way of life. It's something that we have to do, but we need a project-like thing to start the initiative, to design the framework, to get it implemented, to get it going.
And I think you should treat these senior stakeholders as you would if you were having a project or a program, where you'd have a steering group. And that's what these senior stakeholders are, so that you can work with them to prioritize and focus, because I've never seen data governance done as a big bang or as boiling the ocean work. We need to do this iteratively and in phases. And I think if you're starting data governance and you already work for the organization and have never done data governance before, you tend, and I'm speaking from my own experience, to pick something that you found annoying as a user or from whichever side you've come from.
You might find, as I did, that you're one of very few people in the organization who found that data problem annoying, and then you don't get any backup and support for doing it. And it gets to be Nicola and her silly little data thing. So what we need to do is get these senior stakeholders to be telling us, what is the next big thing? Are you replacing your ERP system? Are you doing master data management? Are you building a data lake or a data lakehouse? What is it that's going on? Where are the biggest problems? And then from there, you can work out an area of focus, because it could be a system. It could be a data domain. Everybody could agree that, I don't know, perhaps we have customer data. It's on multiple systems. We technically have a master data rule. I worked with a client a couple of years ago who, through a series of mergers, had three masters for customer.
You can see the irony in that statement. But it's the, you know, what's causing us the biggest pain in this organization at the moment? What's holding us back? Are we gonna do Gen AI? If so, what are we gonna do? So which data do we need to focus on first? And I think that really then helps you come up with this high level roadmap. And we might never get to some of the data. Perhaps it's just there for context, and it is only poor Nicola in the corner who uses it. But quite often, you know, we are tempted to dive into the things that are not so key. So we really want to make sure we're aligned with what the organization is strategically planning. And I think that's the best way, because from there, you can start planning an iterative approach that focuses on the things that your business needs. And that's probably one of my other key things to think about: making sure you're delivering business value and not doing it for best practice.
Or because Nicola says so. That never really wins anybody over. We should be doing this because we're making something work better. There's also a free resource on my website that people can get to give them the first steps in data governance, if anybody is interested in that. So that's available for anybody who wants it.
[00:30:02] Unknown:
And I'll add a link in the show notes for that so that people can find it easily. And once you have started down the path of data governance, you've identified and started conversations with the senior stakeholders, everything's going wonderfully. What are some of the acts of hubris or some of the common roadblocks that you see people run into that can derail progress on data governance? Maybe you get to a point where some of the annoyances have been addressed, but you never get any further, or you just start irritating everybody and they stop listening to you altogether.
[00:30:38] Unknown:
So I think one of the common roadblocks is not delivering anything. If you're doing data governance properly, you're probably gonna take a long time to deliver something. So you've got to find ways of delivering, solving some problems in the meantime to keep everybody engaged. So not delivering value, I think, is one of the big things. I've seen too many people get stuck in this analysis paralysis. Let's design the perfect data governance framework. And we're nearly there, and then Tobias has a look at it and goes, oh, you haven't thought of this little weird piece of data or scenario in our organization. And then we go back to the drawing board. And I think people get stuck like that, trying to make something perfect. I'd rather go really simple, and I've learned this again the hard way. Go really simple, because you can always add detail as you go along. So I think that's really a key one. But I think that the biggest and most common roadblock I've seen is the culture change.
People totally underestimate that. They think we can just say, right, we're doing data governance. Nobody does. But we've got to get everybody in our organization to understand that we're now considering data an asset, whereas before it was probably a relatively small number of people that realized that. And we need everybody to step up. And I think the people that underestimate or ignore that part of it always have great problems. I think there are two more that really come to mind. I don't wanna depress people too much. But it's lack of budget. I think people think, you know, we can just do this. But even at the bare minimum, you're gonna have to at least have somebody doing the work. So you've got a headcount.
In an ideal world, as we've already discussed, we'd have a tool or more, depending on how wide the scope of the data governance team is. So lack of budget is a hard one, and so is lack of senior stakeholder engagement. That is a really big one, because if it feels like we're doing this just from a technical point of view, it can be seen as a techy solution that isn't really adding any business value. So that's why we have to do the culture change and constant senior stakeholder engagement, and that usually solves the budget problem, as long as we can bring them along with us on the journey and explain why we're doing this, and that we're not doing it just because we feel like it should be done.
[00:33:04] Unknown:
Another pitfall that organizations run into a lot is the rise of shadow IT, of, oh, data governance is slowing me down. I'm just gonna go and throw some money at Snowflake or whatever it may be and do my own data work so that I can move faster. And I'm curious how you have seen organizations address that effectively and some of the ways to bring those teams into the fold and help them understand the purpose and benefits of governance beyond just, oh, it makes me move slower?
[00:33:35] Unknown:
Yeah. So I think that's a really good one. I'm always a fan, hopefully it's coming across, of the carrot approach. I'd love to sell the benefits, get everybody understanding it and signing up. But I do get that sometimes people are just not gonna do it. Which is why I try and keep to, perhaps, you know, not the minimum approach, but let's do this as a simple approach to begin with. So, as you say, it doesn't scare too many people off, and they don't try and find their way around it. But this problem with shadow IT is very real, whether it's somebody just taking an extract from the data warehouse and building their own system on an Excel spreadsheet, or whether they're, as you say, sending it to third party providers and bypassing IT.
I think the way that I've seen a lot of organizations address it, particularly financial services across Europe, is to have what is often called an end user computing policy. And because there's no one team that that sits nicely with, because it's a bit data governance, it's a bit data quality, it's a bit business continuity, a bit data security, everybody kind of goes, not me. But it quite often ends up with the data governance team for that reason, because lots of things, if they don't have a home in an organization and they have data in the title, often end up with the data governance team. So I've been involved in that quite a lot. And I think actually having that policy can really help.
Not that I want a big stick to go and beat people up. But instead of me just going, please don't do this, I'm saying, look, you're not allowed to do this, but these are the benefits of doing it properly. So if you make them adhere to all the best practices that we would have if the data was on an IT supported system, then they suddenly realize that actually it'd be better talking to IT and getting it on a proper system. So I think having that kind of policy can be really useful. Definitely in Europe, the financial regulators demand it, which is how I first came across it, and I also came from a banking background.
So I always had that kind of policy. But I've seen clients in sectors that have no regulatory requirement to have such a policy put one in place now, because I think they use it as a tool to communicate with people and say, look, we have this policy, you're not allowed to do this. Or not allowed to do it without at least involving the data governance team, somebody from my team, and probably data security. And you get them to understand that this isn't a way around it. This is, you know, just as dangerous as, if not more dangerous than, just taking a bit of time and doing it properly.
[00:36:08] Unknown:
Circling back on the idea of the metadata catalog, the automated lineage, automated data discovery. As technologists and engineers, it's very tempting to say, oh, I've got this tool. It finds all of the data for me. I can just add a few labels here and there, and magic happens. My job is done. Data governance is solved. What are some of the ways that you have seen that become a problem or some of the ways that that has led to a false start in a data governance effort and some of the ways that teams need to be thinking about incorporating these technologies into that more policy and procedural aspect of the practice?
[00:36:48] Unknown:
Yeah. And I think that's, you know, something that a lot of the vendors are now saying. Don't worry. Our tool can do this for you, we can get all this done in a really quick time. But I think if we've never had proper documentation and understanding of our data, it's just fraught with problems. And even before these tools existed, I've got horror stories I can tell you where somebody picked a date field to select certain data for a data mart, not really understanding what that date was, and it wasn't the date they thought it was because we had no documentation.
A tool is not gonna know what that date is either. It's just a date. And they didn't know whether it was an annual renewal or a policy date. So you need that kind of thing. I've had, over the years, so many data quality issues reported to me when there is nothing wrong with the data. We just have the same term used in different ways in different parts of the organization, so we've picked up the wrong data. And it's either not at all the same data, or it's similar but calculated on a different basis, and it causes problems. And you're thinking, how's a tool going to know this?
So I absolutely think that tools can be very powerful to fast track things. But I still think that we need to use the tools to automate the mundane things. Let's do the profiling of data, show things to the business, but I don't think we can use them to write definitions. Somebody was trying to tell me at a conference a couple of months ago that we won't need humans for data quality tools, because the tool will look at the data and it will tell you if it's changed or if it's right or wrong. And I said, but what if it was wrong to begin with? And he was going, no, no, but the tool will know. And I went, how will it know it's wrong, if its baseline is this and it's rubbish?
And we could go and do some cleansing work and improve it, and it will now tell us it's worse because it's changed from its baseline. So I think we can use tools, but we've gotta be very careful what we use them for. I was talking to a vendor recently, and I was very impressed at how he was talking about no-code data quality tools. So we still need the business rule. We still need the business input as to what makes it good enough, but we don't have to spend ages trying to write SQL code, which is really good news for me, given what I've already told you. But, you know, we can cut out the middleman for that bit. We can do the profiling, but instead of saying what the profiling tool says is gospel, we can give that to the business to say, are these outliers correct?
You know, this is what's in the field. Was it what you were expecting? So I think we have to be very clever how we use tools. Otherwise, I think the tools are just giving us access to our data quicker than ever before, but allowing us to perhaps make mistakes quicker as well.
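The baseline problem described in this answer can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `renewal_date` sample values, the 5% drift tolerance, and the 2% missing-value threshold are hypothetical and do not come from the conversation or from any particular data quality tool.

```python
def missing_rate(values):
    """Share of values that are empty or None."""
    return sum(1 for v in values if v in (None, "")) / len(values)


def drift_alert(baseline_rate, current_rate, tolerance=0.05):
    """Baseline-only check: fires on any large change, good or bad."""
    return abs(current_rate - baseline_rate) > tolerance


def rule_passes(current_rate, max_missing=0.02):
    """Business-rule check: the threshold is agreed with the data owner."""
    return current_rate <= max_missing


# Hypothetical renewal_date column, before and after a cleansing exercise.
before = ["2024-01-01", "", None, "2024-03-15"]  # baseline taken from poor data
after = ["2024-01-01", "2024-02-01", "2024-02-20", "2024-03-15"]

baseline_rate = missing_rate(before)
current_rate = missing_rate(after)

# The drift check raises an alert even though the data improved.
print("drift alert:", drift_alert(baseline_rate, current_rate))
# The business rule shows the baseline was never good enough.
print("baseline meets rule:", rule_passes(baseline_rate))
print("cleansed data meets rule:", rule_passes(current_rate))
```

The drift check only knows that something changed, while the rule check encodes what the business considers good enough, which is the input the speaker argues a tool cannot supply on its own.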
[00:39:41] Unknown:
Another aspect too of the adherence to different tools is that in order for a tool to be effective and for people to actually use it and want to use it, it needs to fit into their workflow and not necessarily force them to conform to the tool's workflow. And I'm wondering how you've seen organizations approach that challenge of trying to bridge across all of these different types of workflows that everybody has based on their role and ensure that everybody is incentivized or ensure that everybody actually wants to work with the tool or work with that workflow because it is actually beneficial to them instead of being a hindrance and a chore?
[00:40:25] Unknown:
Yeah. I think that's a really good question. So back when I was working as an interim data governance manager, which, to be fair, is the only time I've worked hands on with one of the data catalog tools, there'd been an interim data governance lead before me, and he and the data architect had totally geeked out in designing the setup for this tool. They'd got really excited about the possibilities. They designed it and then kinda went ta da to the business, who went, what's that? Because it was a tool that they hadn't asked for, and they had to use it to do something they'd never done before, because they hadn't done data governance before. Now, they were in a regulated industry and there was a requirement for them to do it, which is why the organization had bought the tool. But the stakeholders weren't interested, and they were going, well, I don't wanna do it like that. And I don't want this, and I'm not using it like that. And we ended up having to take the whole thing down and redesign it. And in the period while we had the project to do that, I actually started them building a data catalog on an Excel spreadsheet, because it got them used to the idea that you're gonna have to do this anyway.
And now I'll give you a tool and say ta da. Now I can make it easier and better for you, because you've got this tool to automate your very laborious manual way of doing it in an Excel spreadsheet. So I still advise all my clients, even if you've got the budget and the desire to do this, start with an Excel spreadsheet first, because all of the tool vendors that I've ever spoken to can take an upload of a CSV file. So you're not wasting the effort, and you're getting the business to understand this is something we do now, rather than the, by the way, here's a tool that we're all really excited about, and the business users just look at you going, but we didn't ask for this. What do we want this for? So I always feel it's better to almost, like, start data governance, get them doing some of the activities, get them to understand the value of why we're doing it, and then say, ah, now we're going to work on getting you a tool to help do that better. The trouble is we do get excited. We see these tools. We can see the value of how they're gonna work. But the business are not where we are on our data journey and in understanding it. And we have to remember that.
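As a rough illustration of the spreadsheet-first approach, the sketch below keeps a tiny catalog as rows and exports it as CSV using only the Python standard library. The column names, terms, owners, and system names are hypothetical; a real catalog tool defines its own import template, so the layout here is an assumption and not any vendor's format.

```python
import csv
import io

# Hypothetical catalog columns; a real tool's upload template will differ.
FIELDS = ["term", "definition", "data_owner", "source_system"]

entries = [
    {
        "term": "renewal_date",
        "definition": "Date the policy is next due for annual renewal",
        "data_owner": "Head of Underwriting",
        "source_system": "policy_admin",
    },
    {
        "term": "policy_date",
        "definition": "",  # not yet agreed with the business
        "data_owner": "Head of Underwriting",
        "source_system": "policy_admin",
    },
]


def catalog_to_csv(rows):
    """Serialize catalog rows to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def missing_definitions(rows):
    """Terms that still need a business definition before upload."""
    return [row["term"] for row in rows if not row["definition"].strip()]


csv_text = catalog_to_csv(entries)
print(csv_text)
print("needs a definition:", missing_definitions(entries))
```

Listing the terms that still lack a definition gives the business a concrete to-do list while the catalog is still in spreadsheet form, so the work carries over when a tool is eventually introduced.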
[00:42:40] Unknown:
As you have been working in this space, working with organizations, you have this coaching business where you work with teams to help them establish this practice and understand the benefits of governance. What do you see as the biggest net benefit to the data team and the organization when the data governance practice is established and active and healthy and everybody is working together in the same direction?
[00:43:09] Unknown:
Oh, so I think there are so many benefits. Obviously, from a data team point of view, I think your data teams could do what they were employed to do and not spend their time doing this data wrangling, trying to find the data, fix the data. You know, we employ some really clever individuals with some really great skills that they waste at least half the time, in my experience. So we've now got a team who are utilizing their skills, which means that we're probably getting more value for our investment in those individuals. We've probably got more engaged individuals.
And one of my clients recently was saying that they do have trouble with staff turnover, because everything is so manual, so hard, and so bad, and that is one of the things that they're hoping to reduce, so that they won't have to keep going through the, oh, we've got to recruit some more data people again. So I think, you know, we actually can start using the skills that we've recruited and are paying for more fully. And that in itself will lead to all sorts of other benefits down the line. I think from an organization point of view, the biggest one that I've seen in every single organization that I've ever helped has been improved efficiency, so reduction in costs.
We stop all these manual workarounds where I get my spreadsheet from finance once a month, and then it's got duplicates in it or missing data or the format's wrong. There was one client where I spoke with somebody who spent two weeks out of every month cleaning the spreadsheet before they could use it for their process. And that was actually written into their documented process of what this chap's job was. So, again, a highly skilled person, but he was spending two weeks of every month data wrangling. So they can actually get on with using their skills. So I think we suddenly start losing all of this wasted time.
I think customer service goes up because we get things right. We don't have to do rework. We don't have the complaints, or even the extra costs that are unexpected. I helped an organization once where one of their key performance indicators was, did they get their orders to their clients on time within the SLA? And according to the KPI, they were on time 98% of the time, really great. The exec were really, really happy. Once we started doing, well, they started doing data governance, and I was helping the data governance manager, we worked out that the parts were often not available. The machines were not put together until the very last minute. And instead of being able to book the transport, you know, a week or two in advance at the best rates, they were virtually doing the FedEx overnight to get it there on time. So they were paying huge amounts of postage and delivery when they didn't need to.
So I think, you know, we can start unearthing some of these things and highlighting things that have never been understood and seen before, and really start cutting costs and boosting efficiency.
[00:45:59] Unknown:
And in your experience working in this space, what are some of the most interesting or innovative or unexpected ways that you have seen data governance applied or implemented?
[00:46:10] Unknown:
Oh, unexpected ones, normally, are ones where I tend to raise my eyebrows, and they don't always work out. I think the most interesting one that I've seen is an organization that didn't have data governance in place, but wanted to embrace data mesh. So, actually, the very first phase of data governance that I helped them do was over data mesh. And I think that was probably the most interesting, and also the most scary for me, because it was my first dive into data mesh as well. And I think that was really interesting because, as I already said earlier, I think people forget about the people side of data and the culture change we need to do. But data mesh is all about that data democratization and creating data products, making them more accessible to business users.
So, by the very definition of having a data mesh initiative, they knew they needed to get the business involved. I think that was really interesting as well because, a lot of the time, I'm not an advocate for a standard data governance framework. I don't think they work. Nobody works for a standard company. They weren't written for your company. They weren't written for any specific company. But I do think there are some general components that are always the same, and we then have to tailor it and make it right for your organization. But in that one, we then had to try and work out, well, how do we fit this, and how do we work with the data engineers? And do they have a data governance role or don't they? And what's this data product owner, you know, and how do they fit into it all? So I think that was really interesting.
We also had the challenge that although they were using the data mesh initiative as the starting point, they wanted to design a framework that could be rolled out to data that wouldn't go in the data mesh, because not all data is going to go into data products. It's not useful to you. So I think that was really interesting and perhaps an intuitive way of doing it. I mean, somebody told me an adage years ago, which was, you know, when's the best time to plant an oak tree? Well, it was 200 years ago. When's the second best time? Well, that's today. So I'm always of the opinion that that applies equally to data governance. When should we have started doing it? Well, years ago, so we were ready to do the exciting things now. But if we haven't got it, at least they had the foresight to say, we need data governance. This isn't gonna work without it. So we started straight away on that. And that was really challenging and scary at times, because they never told me that that's what they were doing until day one of working with them. But it was really, really interesting seeing how it panned out and how it evolved.
And I think you have to do that, whether you're doing it with some exciting new technologies or whether you're still doing it over your data warehouse. It doesn't make you wrong or bad. I think you need to think about, have I got this right, and evolve it. Don't think that you can design a data governance framework perfectly on day one. And I think that's probably what I have seen. The people that are most willing to accept that, evolve, and adapt as they go forward are the most successful.
[00:49:05] Unknown:
Yeah. The willingness to evolve things and the willingness to follow along as things change rather than just saying, this is my governance policy. Everything has to be this way. If it doesn't fit, then we don't do it, and it can become a straitjacket instead of a bridge.
[00:49:23] Unknown:
Absolutely.
[00:49:25] Unknown:
And in your experience of working in this space, coming to grips with data governance, understanding what it is from your earliest days of just getting thrown into the deep end, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process of working with data governance and training and coaching?
[00:49:43] Unknown:
I think the biggest thing is that data governance is more about the people than the data. I really thought we could just fix the data and it would stay fixed and everything would be fine. And it's the people, because the data doesn't make itself wrong. It's only ever gone wrong because of people. So I think that was perhaps the hardest thing, and it took me a long time to realize that. I thought we could just fix the data and couldn't understand why it kept going wrong again. So we actually have to fix the people, because we have to get them to understand data and that if they do something like, you know, your shadow IT example or a manual workaround, there are repercussions for that, and they have to think about the bigger picture all the time. And I think that's possibly the biggest thing that I found, but also for most of my clients as well. I think a lot of them think, you know, I could just sit at my desk and do data. And, you know, actually, if you're a data governance person, you need to be out talking to people because you're, as I said already, the facilitator between everybody in this organization.
You're breaking down the silos. You're getting people to actually talk to each other to get problems solved. And I think that really is it: you are a problem solver or a negotiator or facilitator. I did actually say to one client, I was helping them to recruit their first ever data governance manager, and I helped them write the job description, and then we shared it with their panel of recruitment agencies and did a call with them. And a few of them were criticizing the range of skills that I felt that the data governance manager ought to have. And I think I got a bit flippant with them because I was getting a bit fed up with their moaning. And I said, well, I could have put walk on water as well, and I didn't. But, you know, we are all data geeks. We don't end up doing data governance if we're not. But you've got to be a people person as well as a data geek to be good at data governance, because it is all about talking to people.
[00:51:40] Unknown:
Yeah. That's definitely a bit of a unicorn.
[00:51:44] Unknown:
Yes. A bit of both. It's quite interesting because I came from being, I think, a people person and became a data geek through doing it. But, yeah, it's an interesting mix of skills.
[00:51:55] Unknown:
We've already talked a little bit about some of the potential pitfalls of data governance, the fact that it can become a restrictive framework rather than something that supports people. And I'm wondering if there are any other challenges or anti-patterns that you've seen in data governance that people need to be on guard for.
[00:52:17] Unknown:
I think perhaps one, I mean, there's a report on my website where I list the top mistakes and how to avoid them. Number one is the people change, the culture change, and ignoring that. But the second one is running data governance from IT. And I found it a really difficult one to put in the report, because I've actually helped many clients successfully do data governance from IT. And the trouble is that IT are often the first ones to realize that there's a need for data governance, because they're dealing with the problems that result from the lack of it. So they say they need it, and they might start doing it. But unless we do this huge amount of engagement with the business, it can sometimes be seen as IT doing something to the business.
And really, we've got to make sure that they understand that they're responsible for the data, not IT. And I think that you can do it successfully from IT, but you have to be very careful. You probably have to have a pretty good relationship between IT and the business to do that. And you need to make sure that you get somebody who's really engaged and pushing it. So, I did some work for a client a few years back now where the CIO sponsored the data governance program. She was great, and she really got it, and she really was a CIO who thought the "I" was information, not just technology. And she was really great, but when we set up our first data governance council, I had to break it to her very gently that I didn't want her to chair it, that we needed to find a business person to do that at the same level of seniority as her. So I think that's a big pitfall I see because I just see IT gets it. They go, we need this data governance thing, and they dive in, but we've really got to bring the business with us.
And then I think we've already said that the other one is doing too much or setting the bar too high. I think we should start simply and build on it. We've got to deliver some value and not get caught up in let's just make everything beautifully documented. Yes, there's ultimately so much value in having a catalog. But let's make sure we deliver some value along the way rather than just building this lovely catalog. You know, the business is not going to thank us because we've got 2,000 attributes documented in our catalog.
They're gonna thank us because it's now taking less time to get the reports they want, and the reports are correct, or systems perhaps get built and designed quicker because we've got the right stakeholders and the right data, and the stakeholders give us better requirements. So I think it is about thinking about why you're doing it and focusing on the value and not the best practice kind of side of things.
[00:55:02] Unknown:
As you look to the future, as you keep an eye towards the evolution of technology, in particular, the introduction of large language models and their potential applications to this space, what are some of the future trends in data governance that you're excited by, and what are some of the ones that you're concerned about?
[00:55:21] Unknown:
So I think Gen AI is probably the most exciting thing. I think, if we use it correctly, it's gonna enable us to speed things up, I should say, by doing the menial tasks that we perhaps had a team of people doing before. I think that's probably what I am the most excited by. But I think, as with all things, use Gen AI for the right things, not the wrong things. So, personally, I wouldn't dream of asking ChatGPT to write my blogs. I tried it out of interest when it all blew up last year, and it doesn't sound like me. It says things that I don't agree with. It took me longer editing it than it would have done if I'd just written a blog in the first place. So, I do use ChatGPT, but not for that.
I use it for planning and preparation and summarizing documents, you know, and getting key points. You know, if I've downloaded a free report on the website, I can ask it to summarize the PDF for me or summarize webinars so I don't have to spend an hour listening to it. There are some really powerful ways we can use Gen AI, and I think we've got to work out, you know, which ways we use it for data governance because I think it really can help us. But I suppose I think it's also scary as well. It's one of those two-sided coins because I know of more than one of the data governance vendors who have said, you don't need your business to write definitions anymore because the AI will do it for you, and it will grade them red, amber, green, and it will tell you, you know, the green ones, you just don't have to worry about; the amber ones, it's up to you. We only want you to look at these red ones.
But as we've already said, if the data's wrong in the fields, how does the Gen AI work out what that is and what this date is? How does it know if it's a date of birth or the date I first became your customer? You know, I think it's exciting and scary at the same time. That's probably the best thing to say. But I do think it's a really interesting time and an exciting time to be doing data governance.
[00:57:25] Unknown:
Absolutely. And the other potential pitfall of generative AI is that you wanna use it for everything, and you don't wanna use a flamethrower to make your toast.
[00:57:34] Unknown:
No. That's a really good analogy for thinking about it. I heard Peter Hinssen speak, the guy who wrote the book The New Normal, though he now wishes he hadn't called it that. It's a presentation that I think I've seen on YouTube now, called The Never Normal. And he's saying that if we use AI for the wrong things, we'll get a lot of mediocre stuff. So it is, you know, what are the right things to use it for, and use it wisely. And I think it can really help us fast track and stop people having to do the boring things that they don't want to do and so, therefore, don't do well. But make sure we keep humans doing the bits that we need humans to do.
[00:58:10] Unknown:
Are there any other aspects of this space of data governance, building out the practice, starting the implementation, maintaining it that we didn't discuss yet that you'd like to cover before we close out the show? Oh, no. I think we've talked loads about data governance. I can't think of anything. Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:58:43] Unknown:
Oh, biggest gap. I'm not sure that there is one so much. Perhaps it is just that integration between them all. As I said, I think a lot of the data catalogs do some things and then not others. So I think what we really would love is something that can do your master data management, your data quality, your data lineage, and your catalog and probably your data quality issues log and workflows, all, well, not necessarily all in one because I do get that you want specialist people to design the products, but in an integrated way because I've talked to many clients who've got bits all over the place, and they're not integrating well.
[00:59:30] Unknown:
Alright. Well, thank you very much for taking the time today to join me and share your experiences working in data governance and some of the advice that you have built up to help people on this path. It's definitely a very important topic area, and it's great to have you out there helping people to put it into practice. So thank you for that, and I hope you enjoy the rest of your day. Thank you for having me.
[00:59:59] Unknown:
Thank you for listening. Don't forget to check out our other shows, Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a product from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Episode Overview
Guest Introduction: Nicola Askham
Nicola's Journey into Data Governance
Defining Data Governance
Data Governance for Small Companies
Misconceptions About Data Governance
The Role of Data Catalogs
Managing Data Governance in Complex Environments
First Steps in Establishing Data Governance
Common Roadblocks in Data Governance
Addressing Shadow IT
Integrating Tools into Workflows
Benefits of a Healthy Data Governance Practice
Innovative Applications of Data Governance
Lessons Learned in Data Governance
Challenges and Anti-Patterns in Data Governance
Future Trends in Data Governance
Closing Thoughts and Contact Information