Summary
In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management. He discusses the importance of having a strategic plan for data, highlighting the need for data teams to focus on impact rather than just enablement. He introduces the concept of a "data vision board" and explains how it can help organizations outline their strategic vision by considering three key forces: regulation, stakeholders, and organizational goals. Lior emphasizes the importance of balancing short-term pressures with long-term strategic goals, quantifying the cost of data issues to prioritize effectively, and maintaining the strategic vision as a living document through regular reviews. He encourages data teams to shift from being enablers to impact creators and provides practical advice on implementing a data vision board, setting clear KPIs, and embracing a product mindset to create tangible business impacts through strategic data management.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- It’s 2024, why are we still doing data migrations by hand? Teams spend months—sometimes years—manually converting queries and validating data, burning resources and crushing morale. Datafold's AI-powered Migration Agent brings migrations into the modern era. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today to learn how Datafold can automate your migration and ensure source to target parity.
- Your host is Tobias Macey and today I'm interviewing Lior Barak about how to develop your three year strategic vision for data
- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an outline of the types of problems that occur as a result of not developing a strategic plan for an organization's data systems?
- What is the format that you recommend for capturing that strategic vision?
- What are the types of decisions and details that you believe should be included in a vision statement?
- Why is a 3 year horizon beneficial? What does that scale of time encourage/discourage in the debate and decision-making process?
- Who are the personas that should be included in the process of developing this strategy document?
- Can you walk us through the steps and processes involved in developing the data vision board for an organization?
- What are the time-frames or milestones that should lead to revisiting and revising the strategic objectives?
- What are the most interesting, innovative, or unexpected ways that you have seen a data vision strategy used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data strategy development?
- When is a data vision board the wrong choice?
- What are some additional resources or practices that you recommend teams invest in as a supplement to this strategic vision exercise?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- Vision Board Overview
- Episode 397: Defining A Strategy For Your Data Products
- Minto Pyramid Principle
- KPI == Key Performance Indicator
- OKR == Objectives and Key Results
- Phil Jackson: Eleven Rings (affiliate link)
[00:00:11] Tobias Macey:
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. It's 2024. Why are we still doing data migrations by hand? Teams spend months, sometimes years, manually converting queries and validating data, burning resources and crushing morale. Datafold's AI-powered Migration Agent brings migrations into the modern era. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing.
Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today to learn how Datafold can automate your migration and ensure source to target parity. Your host is Tobias Macey, and today I'm interviewing Lior Barak about how to develop your 3 year strategic vision for your data. So, Lior, can you start by introducing yourself?
[00:01:07] Lior Barak:
Hi. I'm so excited to be on the podcast today. Thank you very much for inviting me. Basically, I am located in Europe and have been working for the past 15 years in the data realm. I started as a data engineer, moved to being an analyst, and then slowly drifted more towards product, so data product, when I realized that there is actually a very big vacuum in this area. Engineers know what they need to do at the end of the day, and analysts more or less do as well. But at the same time, there is no real communication or working process existing there, and this is what fascinated me, so I started drifting more towards product.
In my last position, I was at Idealo, which is the biggest price comparison site here in Germany, and I was responsible for the data platform.
[00:01:53] Tobias Macey:
Do you remember how you first got started working in data and what it is that kept you there?
[00:01:58] Lior Barak:
This is a good question, because I actually started as an account manager for an ecommerce company. And when somebody realized that I actually understand how to write SQL, they asked me why I was doing account management; I should go to the data team. I was the only member there, and I remember that I needed to extract data from MySQL, which was a very horrible system back then. We're talking about 15 years ago, right? And this was my first experience: extracting data from MySQL into CSVs and starting to work with them. And I can remember that in one of the episodes about strategy, I think 387 or 397, there was a discussion about how for engineers a CSV file is a great thing, but for an end user it is not.
And I learned it on my own skin, basically, very fast, that CSVs are very nice, but they don't really work well for business people. And this is when I started to actually go even deeper.
[00:02:57] Tobias Macey:
So you actually have a substack where you talk a lot about your philosophies and strategies around data. And one of the ideas that you documented there is this concept of a 3 year data vision board. And before we get too much into the specifics of that and how to apply it, I'm wondering if you can just start by giving an outline of the types of problems that occur as a result of not having some sort of strategic plan for the organization's data and the systems that support it and just saying, we need data. Go make it happen.
[00:03:29] Lior Barak:
That's most of the cases, right? We need data, make sure that it happens and appears, and this is the basic interaction of management with data teams. Then somebody sits down and drafts some strategy: this is what we need to have to be able to build a certain dashboard. And I think this is the main problem. We start by talking about which KPI we actually need, and yes, it's right to talk about which KPI we need, how we trace it back, and how we create this line from ingesting the data, to processing it, to bringing it forward. But at the same time, data is then not creating an impact. What is the impact that the KPI is actually making? And I think this is a lot of the part that was missing for me personally. How do I actually explain why I'm working on x or on y, and what is the contribution to the business? How is it actually creating some kind of revenue impact? I had a post a few days ago on LinkedIn, and what I was mentioning there is basically that if we want to have the data people sitting with the managing board, they need to be able to talk about impact and not about enablement. They cannot sit at this table and talk about data quality or how they're enabling stuff. They rather need to be able to talk about what the impact of data is in the organization. I think this is the part that we are really, really missing, and we should put more emphasis on it.
[00:05:07] Tobias Macey:
And in that framing of having the data strategy, having the data vision board, what is the format that you recommend for being able to capture that strategic vision? Is it just a bunch of sticky notes? Is it a document? Is it a data visualization of some sort? A diagram? I'm just wondering if you can talk to how that needs to be recorded to make it useful beyond the session of actually developing the strategy.
[00:05:19] Lior Barak:
So I'm a big fan of the McKinsey pyramid principle of how to write documents, right? So leading with the results in a very simple and easy way, for people to read and know what they need to do, and then digging deeper into the document. And I think this was also my guideline in setting up the data ecosystem vision board. The idea behind it is that we start with an overview, a very simple one, sticky notes saying what we need to do, and then the deep dive is actually developing all of these points and explaining them much better. What are the initiatives that need to happen out of it? When we're talking about a capability, how do we actually define what the capability is, what its definition of done is, and what the end result is, as well as what its monetary value is at the end of the day and how we're creating a profit out of it?
[00:06:07] Tobias Macey:
And for being able to actually develop that vision board, that strategic document, what are some of the types of decisions that need to be made, and the level of detail that needs to be engaged in, as you are embarking on that process of building out that overarching strategic vision for what you're going to be able to do and some of the steps that you need to take to get there?
[00:06:28] Lior Barak:
So I think that there are basically three forces that drive every data strategy at the end of the day. The first one is regulation. So it's the outside environment that we're talking about, whether it's the GDPR law here in Europe, the California act on data privacy, and so on. This is just one example of the regulation that's gonna control us from outside, as well as our cloud provider, which has different policies on what we are and are not allowed to do and how we're actually processing data. The second force that is driving these topics is the stakeholders, our users. At the end, as a data team we need to supply an environment for our stakeholders to leverage data, to use it, to drive the right decisions and the right profit out of it. And I think that they have a very high impact on what we are doing.
And then the 3rd force is basically the organization as an organization. What is actually the goal of this organization? Where are we going? What are we trying to achieve? What is our expectation of growth, for example, in the next 3 years? And based on it, we need to cut down, basically, and understand where we want to be 3 years from now.
[00:07:37] Tobias Macey:
And what is it about that 3 year horizon that lends itself to this type of planning, in terms of what it encourages and the level of futility that it maybe eliminates?
[00:07:58] Lior Barak:
So I was reflecting a lot, and I said, okay, 1 year is too short, right? Because we're focusing on what we need to achieve in this year. In most cases, we're not gonna manage to achieve most of it, and then we're gonna be very disappointed. If you're going to 5 years, technology changes fast. Things are crazy out here, right? Now we have Gen AI, and every managing director I'm talking to says, oh, I need to have Gen AI, I need a chatbot now to replace my customer service, and so on. So if we go 5 years or 10 years, we lose quite a lot of our velocity to grow and actually arrive somewhere. And I think 3 years for me is a very soft zone, because we can, on the one side, talk about the infrastructure that we need to have in the next 3 years, since we have some kind of planning or forecasting of the growth of the organization: where we wanna go, how many more sessions we're gonna have, what changes we're gonna have. And at the same time, it's not setting us up for too long, because we can still adjust it, right? And this is also what I'm saying about this framework.
You set it up and you say, based on the best knowledge I have, how do I set the next 3 years to look? And then in half a year or 1 year from now, depending on the size of the organization, you can always adjust it, right? Because things have changed. Now all of a sudden you have a new processing engine that you didn't have before, and it's changing the entire plan of what you wanna do or what you wanna achieve. And I think that this is basically why 3 years is a very soft zone for us to say this is where we need to be. We are not overcommitting to things that are completely gonna change, where you're completely gonna lose it. And at the same time, we are also staying very realistic about where we are today, so it's a very soft zone, I would say.
[00:09:48] Tobias Macey:
To that point of generative AI being a major motivator for all of this rapid change and all of these knee-jerk decisions of, well, we have to have that in place or else we're gonna fall behind, even though it's still very early days, and we're still figuring out what generative AI is even good for and when it is useful. That 3 year horizon is definitely reasonable from a strategy perspective because it gives you that room to say, okay, well, this is what we're aiming towards. Here are some of the big chunks that we need to do to get there, and maybe each of those chunks is actually a year-long project by itself. But another challenge of being able to actually embark on that strategic planning session is to be able to have some measure of buy-in organizationally, and a lot of the organizational momentum is usually more short term, near term focused: we need to get this done immediately because we need it in place yesterday.
And I'm wondering how you see that playing into the need for that strategic vision, because it gives you the ability to have that projected growth and projected work, while you also have, at the same time, those immediate pressures of, we need everything yesterday because otherwise we're gonna fall behind. I'm just wondering how you see those tensions play out.
[00:10:52] Lior Barak:
Exactly. And this is why I said that 1 year is always a very problematic one. I was sitting in many data strategy meetings, and I was actually running a few of them myself. And one line that I always noticed was that we're talking about the present problem. What needs to be solved now? Our data quality is not good. The processing engine is not fast enough, and things are crashing for the analysts so they cannot run their analysis. We don't have a proper visualization tool. We're looking very much at what our problems are today, and we're just saying, okay, this is what needs to be solved, the problems of today. And in many cases, and this is a lot of my learnings from these data strategies, we don't talk about the future. We don't talk about how data is actually shaping the course of the organization. It's rather something that goes along with us. Okay, so now marketing needs to have their data for the campaigns so they're able to steer them, to steer the budget. But what is actually the impact of having it? Or maybe we should talk about automation, or giving them a revenue forecast that's gonna help them understand, when the session starts, what the worth of a user is in 20 or 60 days from today. And I think that this is one of the things that we very often fail to do in the regular 1 year data strategies. And with the 3 years, what I'm trying to force is thinking about how we envision our data ecosystem 3 years from now. What capabilities do we have there? We are tool agnostic.
Let's not go into tools, because tools are most likely gonna change. And by the time you start researching how to solve the problem that you have on the table, there are gonna be maybe 500 tools available. And let's face it: now we have Gen AI, where Claude and Gemini are the majority, and Llama is also now starting to rise slowly. They're not gonna be there. Most likely, we're gonna find a different approach, like small LLMs that are gonna be very direct and very focused on the organization. Is it actually being considered now, if somebody is pushing to have a customer service chat, that in the future we need to go to something that is a little bit smaller? Also the cost perspective: we don't really think about it, because we're trying to solve a problem that is critical for the organization right now. And with 3 years, we can actually think about what the cost of it will be and what the possible revenues will be. Does it actually make sense? Is there a data ROI at the end of the day for what we are doing, or are we just wasting our time? And these are things that I was part of, I admit it. I was committing to things that were actually costing the organization more than the revenue they were driving, because I thought this was the right thing, because I am an enabler function, I'm not an impact function.
And I think this is the core of the mistakes that we've made in the past years.
[00:13:37] Tobias Macey:
Now in order to be able to build that strategic vision, it's very easy for a data team to have their wish list and say, these are all the things that we're going to do. We're going to build all this amazing infrastructure, these immaculate pipelines. Our data quality is going to be perfect. But at the end of the day, that probably doesn't actually have any real impact on the business, because no matter how perfect your data is, if it's the wrong data, then it's not going to drive you in the direction that you need to go. So I'm wondering if you can talk to how you identify the personas and the specific people who should be included in that process of developing that data strategy and that overarching vision for where you're moving organizationally from the data perspective?
[00:14:22] Lior Barak:
So I think for this, let's start basically with the data ecosystem vision board. The idea behind it is 3 layers. There is the present layer, which talks about users. Identify them. Write them down. Even if it's some regulatory system that is out there, still write it down, because you need to be aware of them. They exist out there, and you will need to have some kind of communication with them in the future. Then you're talking about the usage. So based on my best knowledge, how is data being used in the organization, and how do these users specifically influence the data that is being consumed and used in the organization?
And then, of course, what are the gaps and the issues that they're suffering from? What gaps do they have, and what issues do they have with the data today? I think that when we're talking about the present layer, we're basically talking about getting to know all of our stakeholders and trying to quantify their pains. So when we're talking about a gap, or an issue, or needs, for example, we need to be very clear about what the cost actually is for the organization in money terms. If every Monday my data pipeline crashes and I need to do a recap that costs me another $50,000, write it down, and make it available for everybody to actually see what the cost of this issue is.
And then the conversation becomes a little bit easier, right? Because you can come and say, this is actually what it costs us per year. Do you think it's something that makes sense to solve? If the organization thinks not, then it's fine. But I don't believe so with high prices, which happens quite often. And then it's easier to prioritize these topics. This is one part. The second part is to create some kind of shared reality between these forces: bring all of these issues to the table and tell them, okay, based on what we know now, this is the cost of it, these are the issues, and this is what we need to prioritize higher or lower.
And ask them also to participate in it, to become part of it, because at the end of the day, communication is the key. If you're coming and actually creating a shared reality with the stakeholders, it's much easier to communicate with them later on and explain to them why you decided to go with x or y or why you're solving a certain problem. They were part of it as well. They saw the problem, they know the cost of it, and the conversation becomes much easier. And this is, again, coming from these conversations that I had in data strategy. Sometimes it's like, you need to solve our marketing data issue or our finance data issue. And I was like, but what is actually the cost of not having it? What are the risks to the organization if you don't have this data? And now all of a sudden, I have the legal adviser coming and telling me I need to apply some compliance rules or regulations, and I have somebody coming from product and asking me for a new tool for having a user flow. How can I actually estimate all of these issues and decide where I'm gonna put my focus? At the end of the day, our team is small. It needs to do maintenance. It needs to work on innovation, so to say, to check different things and understand how they work. And on top of it, it also needs to do rollouts, right? So whenever we have a new product or something changes, we need to make sure that it's rolled out to the organization and that the organization is aware of what's going on.
These three areas are very critical, and we need to be very, very focused in the way that we're working. And this is why I said, let's try to quantify it and make it a little bit easier, so we also know where to put higher or lower emphasis in our day to day. Very long answer.
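The quantification described in this answer can be sketched in a few lines of Python. This is only an illustrative sketch, not something from the episode: the `DataIssue` class, the `rank_by_yearly_cost` helper, and every issue other than the Monday pipeline crash are hypothetical, and it assumes a weekly issue recurs 52 times a year.

```python
from dataclasses import dataclass

WEEKS_PER_YEAR = 52  # assumption: a weekly issue recurs every week of the year


@dataclass
class DataIssue:
    """One gap or issue from the present layer, with its cost attached."""
    name: str
    cost_per_occurrence: float  # money lost each time the issue happens
    occurrences_per_year: int

    @property
    def yearly_cost(self) -> float:
        # Annualize the cost so issues with different frequencies are comparable.
        return self.cost_per_occurrence * self.occurrences_per_year


def rank_by_yearly_cost(issues):
    """Return the issues sorted from most to least expensive per year."""
    return sorted(issues, key=lambda issue: issue.yearly_cost, reverse=True)


issues = [
    # From the conversation: a pipeline that crashes every Monday, $50,000 each time.
    DataIssue("Monday pipeline crash", 50_000, WEEKS_PER_YEAR),
    # Hypothetical entries, invented purely for illustration.
    DataIssue("Late marketing report", 2_000, 12),
    DataIssue("Manual finance reconciliation", 5_000, 4),
]

for issue in rank_by_yearly_cost(issues):
    print(f"{issue.name}: ${issue.yearly_cost:,.0f} per year")
```

Under these assumptions the Monday crash alone comes to $2,600,000 a year, which is the kind of number that makes the prioritization conversation with stakeholders easier.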
[00:17:39] Tobias Macey:
No, it's a very good answer. It's very helpful. And you already touched on one of the pieces that I was going to throw out as well, which is that when you get everybody into a room, everybody starts rattling off their wish list of, well, these are all the things that I want. Just make it happen. You're a magical being. You could do all of these things. Data makes it happen. We have generative AI. We don't need developers anymore. Whatever the current imaginative theme is for the day.
And I'm wondering if you can talk to some of the useful ways to structure the types of meetings that you would conduct to actually build the strategic vision. Because if it's open ended, then you do just start getting the wish list from everybody. But if you have a specific agenda or a specific set of questions that you're setting out to answer that everybody's collaborating on, then you reduce the potential set of ideas or potential set of conversations to ones that are well grounded and are going to produce a useful outcome. And I'm just wondering how you've seen that work well in order to be able to bring everybody onto the same page and not just throw all of their wishes into a bucket and hope it doesn't overflow.
[00:18:53] Lior Barak:
I saw how it's not working well. Let's start with that. And I think that when we're talking about the data ecosystem vision board, the idea of it is to disconnect the people in the room from their problems and issues. And this is also what I was writing in the so-called guidelines when I'm talking about it: collect information on a separate board. Let them, I don't know, on a Mural board, create an area for them. Let them put down who they are, what their needs are, what the gaps are, whatever disturbs them. "I hate the data team." You can write that down as well. I don't care. Whatever you want, you can share in this document. And then the idea is that afterwards, whoever owns this board goes through these different boards that were created by each of the teams, collecting information, trying to create some kind of theme and understand what is connected to what, what is related to what, and which stakeholders are connected to each other in their requirements, and then finds out how we can manipulate them in a better way. Sorry, I didn't say manipulate.
Just how to control the conversation in a better way. And I think that after you've collected all of this information, you're generating the present layer. So you're saying these are the issues that we saw, these are the users that we identified, these are the gaps, and so on. And you're creating out of it an overview of what you think actually makes sense to be on the board. And you put in the monetary value. I repeat it many, many times: it needs to have a cost number next to it. What is the saving if you're gonna apply it? What is the extra revenue that we're expecting?
Or what is the optimization that it's gonna cause in the organization? And I think then you bring it to the room, after you've done the first filtering and cleaned up all the mess, because there are a lot of different requests. There are a lot of issues that are gonna come up, and people think that they need to be solved right now. And, actually, when you reflect on it, you can think, actually, I don't need to, because if I change our ingestion process to a different one now, 90% of the problems that we're talking about right now are gonna be solved. So let's bring this one to the table, accumulate all of their costs, and say, this is the solution, this is what it's gonna cost, and this is what it's gonna bring back in savings.
And then the conversations become a little bit easier. And also, when you're connecting all the dots in advance, people will have less room to fight. If they fight for certain initiatives, it's fine. They need to come and explain why they think it's more important than something that's gonna save us $200,000, compared to their $20,000.
[00:21:26] Tobias Macey:
One of the challenges that I often run into when trying to conduct these types of conversations is that it's very easy for things to drift into too much detail, where you say, oh, this is the problem I'm trying to solve, this is broadly the shape of how I think it's going to be addressed, and then people wanna start nitpicking the specifics. And it's very easy to fall into that trap, and you have to constantly be on guard against it and recognize as it's happening that, okay, this is a useful conversation, but not for right now, not for this context. And I'm wondering how you think about that framing of what is an appropriate level of detail and a useful means of making sure that you don't fall too much into the minutiae.
[00:22:11] Lior Barak:
So I think you always need to think that whoever is in the room is the managing director of the company. He has no clue. He has no time. And all he wants to know is, what is it gonna cost me and how am I solving it? And I think this is the level of conversation you need to have. You can go into a little bit of detail on how it's gonna solve something or how the solution is gonna influence the company, and I think this is where you need to stop. And you need to stay tool agnostic. I don't expect anybody to start talking now about what visualization tool will be the best for them or what is the best processing engine. This is not the level of conversation we need to have there. The agreement needs to be on this: this is the capability that we need to have. We need to have, in advanced analytics, a recommendation engine for our users.
Stop. Here you're ending, and whoever is gonna pick up this task later on, where we need to write down what the initiatives are, how they're solving the problem, and how they're going into the solution space, will need to define it. And then he can come back and have a conversation with the relevant stakeholders and not with the entire room, because most of them are gonna get bored and say, whatever. I remember that I was once in a managing director meeting with some heads of departments, and one of them, from pricing, said, yeah, we need to work on cost savings. And the managing director was like, obviously you need to. I don't understand why you're even raising it here. This is your task. Go and do it. You know? And this is really what we need to think about: as simple as possible, and as much in money value as possible, to explain why we're actually doing it.
[00:23:46] Tobias Macey:
One of the other challenges that often comes in when you're talking about data strategy is that engineers will get very excited and say, well, this is what we're going to do. This is what we're going to build, and maybe they've already started building something. But, occasionally, that can diverge from the actual needs because you get too far into the weeds of, oh, well, I'm going to optimize this piece of it because I think it's going to be important, or I'm going to add this component because, eventually, it will be useful for x, y, and z. And I'm wondering what you see as the utility of letting engineering start with the implementation to help limit the potential scope of what the strategy ends up being. Because if you already have something started and something to build from, then it provides some measure of grounding of that overall conversation, the overall strategy, versus starting with a blank slate of, I haven't done anything yet. We're going to talk about all the things that we need, and then I'm going to start building things. And I'm wondering how you see that play out in either direction, and what are some of the useful ways to address that 0 to 1 phase.
[00:24:54] Lior Barak:
So think about it, and I was also part of this conversation: the team is in firefighting mode. Most teams are in firefighting mode, and they're trying to be reactive to whatever happens around them. And I saw organizations and teams that go and start to investigate: okay, I now need to replace my processing engine tool, which are the best options? They go investigating, come back with 5 options, and have already started to do pilots with them. The point is that this costs money. It's diverting their focus from other topics. And, yes, maybe it's gonna save you time in the long term. I don't mind. But this is part of your education plan and not part of something that you're gonna research for strategy. And there also needs to be a better defined problem space before you go into the solution. Because it could be that when you start exploring the problem itself and understand who the stakeholders are and understand the use cases, whatever you designed right now may be solving, again, your present problems, but it's not solving the future problems of running millions of events in a few seconds to generate a recommendation or some kind of outcome. And I think this is a very dangerous zone, when engineers start to investigate a solution before we've actually defined the problem properly. And this has a lot to do with the product mindset. We need to have a product mindset whenever we're approaching it, and we need to be very mindful in our environment. We need to stop and reflect and actually understand: does it actually make sense? And I know that, for example, in teams that I was working with, they used to have this, I don't know, Friday education day. And on these Fridays, they used to investigate different parts, and it's fine. This is your education time. You can do whatever you want in this time. But this doesn't mean that this is the solution you're gonna go with.
You can pitch it later on and say, I think this is the solution, after I presented the problem space to you and you said, I have something for you now.
[00:26:54] Tobias Macey:
And to illustrate this process a little bit, to kind of walk through a concrete example of developing a strategy, I'm wondering if we can just do a brief exercise, and maybe we'll use my day job as an example, where we have a data platform. We're starting to roll it out to the organization. We're doing the standard ELT approach of ingest and load with Airbyte, transform with dbt. We're using Superset for visualization, and the initial focus is business analytics, because we have multiple different stakeholders and different business units, and we're trying to give a more cohesive means of data access, data control, and shared visibility into that data, where up until now it's largely been a little bit more ad hoc. We have different reports for different stakeholders even though a lot of the underlying data is the same. And so we have a lot of needs, a lot of complexity, and we're figuring out what are the next things that we need to do. So I'm wondering, given that framing, what would be the first step in embarking on that vision board journey?
[00:28:07] Lior Barak:
It's to start by understanding, first of all, the user. It's understanding what is working well or not today in your infrastructure. You said you rolled out dbt. Is it actually working well, and is it serving you? And also from a cost perspective, is it actually cost optimized for you or not? It's to go and understand what regulatory environment you're living in and what can influence your future processes as well. There are organizations that are still working, for example, to adapt to some data compliance regulations, and they're not there yet. Are the tools that you have today, or the architecture you have created, actually answering these problems or not? And I think that it's also to understand where we're going in the next 3 years, or to try to create some kind of forecast of the amount of events that this ingestion process will need to handle, the amount of queries that dbt will need to run, or the amount of data that you need to process. And also to understand, cost wise, does it make sense for us to keep it as it is or not? And then say, what are the capabilities that we need to have out of it? So in 3 years from now, we want to have a system that allows us to process and ingest data and validate it at the ingestion point, as an example, because we already identified that there is a problem with our ingestion.
And we want to be able to have a new processing engine for processing raw data into a data product before we arrive at dbt, and so on and so on. And we're really designing it in a way that is easy for us to understand. These are the capabilities. We are not talking about tools. We can, of course, refer to tools that we have today. We have to, because we need to acknowledge our existence and our reality. It's imperfect, and it's not working, and this is why we're looking for something else. And then at the end, it's basically to go and say how we're actually measuring success. What does success mean for our vision board? Is it improving data ROI? Is it an improvement in data utilization? Is it an improvement in data availability, and so on and so on? And actually list these KPIs and write them down. What are the principles that we're gonna work by? Because, okay, we want to have an open system, as you mentioned before, and we want everybody to access the data, which is great. But do we actually want everybody to access the data, or do we want to have some kind of control system in place so that nobody is just going and traveling around raw events, trying to combine something out of them without having the experience? And I think that this is the part of the principles that we need to understand. And this is where your starting point should be. What is working well? What is not working well? Acknowledge the reality. And the reality is not gonna be perfect. It will never be perfect, no matter what you do and how you pull yourself out of it into a better direction.
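As a rough illustration of the forecasting step described above (estimating how many events the ingestion process will need to handle over the next 3 years and whether the cost still makes sense), here is a minimal Python sketch. It assumes simple compound growth; the function names, the growth rate, and the cost per million events are hypothetical placeholders, not figures from the episode.

```python
def project_daily_events(current_per_day: float, annual_growth: float, years: int = 3) -> list[float]:
    """Project daily event volume for each of the next `years` years.

    Assumes (hypothetically) a constant compound annual growth rate.
    """
    return [current_per_day * (1 + annual_growth) ** year for year in range(1, years + 1)]


def project_annual_cost(daily_events: list[float], cost_per_million_events: float) -> list[float]:
    """Translate projected daily volume into a yearly processing cost."""
    return [volume / 1_000_000 * cost_per_million_events * 365 for volume in daily_events]


if __name__ == "__main__":
    # Hypothetical inputs: 2M events/day today, 50% yearly growth, 4.0 per million events.
    volumes = project_daily_events(current_per_day=2_000_000, annual_growth=0.5)
    costs = project_annual_cost(volumes, cost_per_million_events=4.0)
    for year, (volume, cost) in enumerate(zip(volumes, costs), start=1):
        print(f"year {year}: {volume:,.0f} events/day, about {cost:,.0f} per year")
```

Even a crude projection like this gives the team a number to put next to the question of whether it makes sense, cost wise, to keep the current setup as it is.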
[00:30:48] Tobias Macey:
And so given that framing, for a little bit more detail: we do have the mart layers. That's what we expose to stakeholders and end users so that they're not just digging into all of the raw events. They have a more narrowly scoped set of data that they're relying on, with some prebuilt dashboards and visualizations. Directionally, where we also want to head is being able to use that aggregated and collected data to feed back into our applications and products, as well as being able to power an eventual generative AI chatbot style application or suite of applications.
And so there are a lot of unknowns in that journey, but that's kind of broadly where we're heading towards and the things that we need to figure out.
[00:31:36] Lior Barak:
And, you know, one of the things that inspired me about these vision boards was when I joined Zalando. What I was introduced to very fast was: this is the future. This is where you wanna be. And when you have a goal and you know where you need to get to, say, we want all the marketing campaigns to be generated automatically without any human touch, or with minimal human touch, let's say, then it's easier to also know what tools I need to pick. It's easier to understand what steps need to happen to arrive there. It's much easier to envision the end of the journey. Right? We're still gonna go through this entire journey. We're still gonna have issues along the path, and we will need to readjust and change, and it's completely fine. But at least we have an end goal. We know what needs to happen. And I think this is what is super important in the 3 years. This is what needs to happen. Does it make sense to us? Yes or no? And if yes, it should be on the board, and we should start working out how we're arriving there.
[00:32:42] Tobias Macey:
And once you have developed that vision board, you have your strategy, you know where you want to end up in 3 years, what are the different timelines or milestones or triggers that you should be looking to to prompt you to revisit that strategy and push it out to the next 3 year horizon, or update the strategic vision because maybe part of it no longer holds true? And what are some of the ways that it should become a living document or a living utility to help guide the overall work that's being done?
[00:33:14] Lior Barak:
It should be a living document. However, I will say that it should not be changed more often than every 6 months, and in fixed iterations, because the idea behind it is that we already created a shared reality with our stakeholders. We created a shared reality with ourselves: where we're going, what problems we solve. And if we start changing it every few months, we will arrive nowhere, because we're again gonna go into this chaotic situation of: this is not relevant, now everything is a problem, we need to restart everything. We actually need to figure out that if we committed to this and we said this is what we need to solve, we start solving it. And in 6 months, we can review it and say, you know what? There were some changes.
Is no longer a tool that we like to use. We want to move now and use Trino because we think it's more cost efficient. Fair enough. But at least you had these 6 months. You progressed, you moved, you didn't change anything. And the consistency is very, very important as well, because we had some progress. We've done something. And I think that this is again the ability to move between the innovation part, the maintenance part, and the rollout, and make sure that we're actually maximizing the effect. One example that is really very painful for me is when we tried to roll out an observability tool in the organization.
And it was very great at the beginning, and then at some point we just dropped the ball because we needed to run to other tasks and other stuff. And my conclusion was, okay, we need to kill the tool because nobody's using it. The reality was that we didn't actually give the right emphasis to the rollout and make sure the teams were adopting it, and we didn't give it the time. So if I'm saying after 6 months that we now need to kill it because it doesn't make sense anymore, that makes sense, because I have some details that I can come and present. If I come and say it after a few weeks, did I actually try to implement it, or do I understand why it's not working? You know? And this is why I'm saying at least 6 months, because you're giving yourself a little bit of time to try different approaches and figure out how you can fix the implementation, for example, or maybe it's the tool itself that is not fitting. And then you can also say that in the next 6 months, we need to have a completely different tool. Instead of observability and monitoring, we need to have validation that rejects bad data immediately.
That's also an option. It's just that you have enough time to learn what is working and what's not, and bring these learnings to the future development.
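The alternative mentioned above, validation that rejects bad data immediately instead of observing it after the fact, could look roughly like the following Python sketch. The event schema (`event_id`, `user_id`, `amount`) and the rules are hypothetical and only illustrate the idea; a real pipeline would more likely rely on a schema or data contract tool.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"event_id": str, "user_id": str, "amount": float}


@dataclass
class IngestionResult:
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (event, reason) pairs


def validate_event(event: dict) -> Optional[str]:
    """Return a rejection reason, or None if the event passes all checks."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            return f"missing field: {name}"
        if not isinstance(event[name], expected_type):
            return f"wrong type for {name}"
    if event["amount"] < 0:
        return "negative amount"
    return None


def ingest(events: list) -> IngestionResult:
    """Accept valid events and reject invalid ones at the ingestion point."""
    result = IngestionResult()
    for event in events:
        reason = validate_event(event)
        if reason is None:
            result.accepted.append(event)
        else:
            result.rejected.append((event, reason))
    return result
```

Keeping the rejected events together with their reasons gives the team something concrete to review at the 6 month checkpoint, rather than only a feeling that data quality is not okay.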
[00:35:35] Tobias Macey:
And the other piece of it being a living document is that you need to be able to attach that to work being done within that time frame. And I'm wondering, what are some useful techniques or tactics that you've seen for being able to say, this is our overarching strategic vision. This is the sequencing of the big pieces that we're going to be delivering. Now we need to actually break it down into consumable chunks of work that can be done, but we want to make sure that that work gets linked back to that overarching strategic vision and some of the ways that you've seen that implemented and work well.
[00:36:11] Lior Barak:
In the board itself, what you can see is that there are basically 7 zones, more or less. Right? There is the data management. There is the data ingestion part. There is the data storage, and so on and so on. So there are different zones, and in them we're gonna have capabilities. And then later on, these capabilities need to be cut into initiatives that we're gonna work on, and these initiatives need to be cut into tickets, basically, of some progress. I'm not advocating here that we're gonna develop a product over the next 6 months, but rather asking how we cut it down to a level where we have an MVP in a month that we can already go and test. And how do we then build on top of this success to develop the rest of our solution?
And this is, I think, what is very important. We also need to understand that, let's face it, we cannot work on everything, so we need to prioritize what is the most important and most acute problem for us right now. And then you link it back to this vision board, because you can put KPIs there that say: if I'm gonna be successful with the new ingestion solution, the data quality issues should be reduced to only 2% within a month. And I can say that the data utilization is gonna go up to 86 or 87% because I know that this is basically making data safer.
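To show how KPI targets like these could be attached to an initiative and reviewed later, here is a small Python sketch. The `Kpi` class and the measured values are hypothetical; only the targets (2% data quality issues, roughly 86% utilization) come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str
    target: float
    actual: float
    higher_is_better: bool

    def met(self) -> bool:
        """Compare the measured value against the target in the right direction."""
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


def review(kpis: list) -> dict:
    """Map each KPI name to whether its target was met."""
    return {kpi.name: kpi.met() for kpi in kpis}


if __name__ == "__main__":
    # Targets follow the example in the conversation; the actuals are made up.
    ingestion_kpis = [
        Kpi("data_quality_issue_rate", target=0.02, actual=0.035, higher_is_better=False),
        Kpi("data_utilization", target=0.86, actual=0.88, higher_is_better=True),
    ]
    print(review(ingestion_kpis))
```

A review like this makes the statement "this is what I expected, this is what happened" something that can be checked against the board rather than argued about.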
And I think that this is a very important part. It's also to talk about the KPIs that are gonna help guide you in this period and how you're measuring it back, if it's data ROI. I want to increase data ROI. And how do I actually do it? By creating an advanced system in which the number of dashboards is gonna be reduced. Instead of having thousands of dashboards, we're gonna have only 8 of them in the organization. And by that, we're actually gonna reduce the amount of data processing. The cost is gonna go down. The decisions driven by these dashboards are gonna go up, so they're gonna generate more revenue. And we're basically creating an environment that is attached to KPIs that we can come and stand behind. Again, this is from my feeling as a data leader: many times I was standing there with, okay, data quality is not okay. But how do we define that it's not okay?
And where do I need to be, or how should I improve it if I roll out a new product or something new? What is actually the solution there? And I think this is the part that I try to reflect in this version of my 3 year vision board: to be able to actually come and say, with KPIs, this is what I expected. This is what happened. This is where it's standing. This is the movement that has happened. One of the things that I really hated was OKRs, for example, because you're committing to a certain development and then things come in. And then you have this decision: am I actually achieving the OKR, or am I solving the emergencies?
And in the end you always need to solve the emergency because you cannot keep it. You're giving up on your OKR. The OKR is getting delayed. You're arriving, and then you have this failure because you arrive at a very low point on your OKR achievements. And everybody looks at you and is like, poof, again the data team has failed. No. It's just that we tried to fix something else that was much more important, and it was not reflected. And I think this is also why we need to look at this data ecosystem vision board, because it allows us to put in the KPIs. It's gonna reflect on us in a positive way. Because if I solve an emergency, it means I solved a problem that was acute and cost us money. So I did do something that actually drove us forward. It's just not in the OKR big picture. It's in a different way, and this is how I did it.
[00:39:43] Tobias Macey:
And in your work of developing this technique and implementing it with the teams that you have worked with, what are some of the most interesting or innovative or unexpected ways that you have seen that overarching strategic planning session and the resulting document used in that organizational context?
[00:40:04] Lior Barak:
So I haven't seen that many. I just started, I need to be honest about it, slowly rolling it out, and I actually asked my friends to start using it, so colleagues that I knew who are working in data teams. And I was doing a workshop with one of them a few days ago, and we were sitting there, and they brought the documentation from the users, or whatever they had collected. And they created their own version of the board, because they said, no, "usages" doesn't make sense to me, I need "needs" here. And they started to readjust things, and the document that came out of it was, okay, we're doing the same thing as the yearly plan. We have no plan for 3 years because you didn't even put it here. And I think that it's not innovation. It's actually still a mindset that needs to be shifted here, and this is at least the expectation I bring to these teams.
The ability to actually think beyond "I'm just an enabler, I need to move the paper from my left to my right and place it there," into "if I'm moving it from the left to the right, this is the impact I'm gonna create. This is what I am able to control, and this is what my environment is gonna control." And this is what I'm trying to bring to these teams. It's a shift in mindset, and I think this was also very much influenced by reading Phil Jackson's books about his coaching career in the NBA. One of the things I really loved was that he brought mindfulness and Zen into the game. Okay, he's famous for adopting the triangle approach, which is gonna be discussed soon in my Substack as well.
But he also brought Zen and mindfulness and the understanding that things are not perfect, and you need to accept them. And you need to come with an empty mind, which is actually very much about the muscle memory process: the more you do it, the more your muscles are automatically gonna be able to do things. And I think that this is where I'm trying to take the board and the teams together with me, into this process of shifting their minds.
[00:42:05] Tobias Macey:
And in your work of developing this approach, thinking through how to apply the techniques at the organizational level, what are some of the most interesting or unexpected or challenging lessons that you've learned on that journey?
[00:42:20] Lior Barak:
There are so many. I think that, first of all, I don't have the answers for everything, and nobody has. Forecasting was given to the fools. You'll never be able to truly forecast what's gonna happen in 3 years. And I think that these are the 2 most important learnings, for me at least, because I always try to be perfect in what I'm doing. And this is again another of our flaws. Right? Because we're trying to bring the perfect solution. It does not exist. And we need to know when to stop and what is actually our endpoint where we're saying, I'm happy with this. If I'm able to run a simple recommendation engine twice a day, I'm happy. I don't need to run it every hour. And I think that this is the part that we need to reflect on and think about. And at least for me, this has been quite a big learning.
[00:43:08] Tobias Macey:
And for teams who are figuring their way through the work that needs to be done, trying to understand where they're going, what are the cases where this data vision board exercise is not appropriate for their context or their team or organizational scale?
[00:43:23] Lior Barak:
I don't think the scale is gonna be an issue. I think it's just the level of engagement of these engineers and the control that they have. This is one of the things that I realized. So if you are 1 or 2 engineers scattered all over the organization, I don't think this vision board will help you much, because you need to serve wherever you are located. You are not a centralized team. I'm not advocating for centralized teams. I'm not advocating for decentralized teams. I'm just thinking that, from the perspective of the organization, you need to have several engineers in place to be able to run these initiatives, so that you will be able to execute them and not only be on the defense all the time. And I think that this is something that people need to reflect on. Other than that, I will argue everybody should start using it, because it's actually helping you to set up your mindset in a different way.
[00:44:16] Tobias Macey:
And for teams or organizations that want to embark on that journey and start that process, what are some of the resources that they should be looking to? Any specific documents that you've written that would be helpful, or any other pointers or practices that you recommend that teams invest in to manage the complexity and evolution of their data capabilities?
[00:44:44] Lior Barak:
They can go to my Substack. I have a series of 4 newsletters there that I basically drafted with all the information about how to use it. Of course, not all the details, because it would take more than 4 newsletters, I'm already saying, and they can improvise around it. But the basics are there: what each of the columns means, so each of the layers, what their purpose is, how to use them, what the blocks inside are, and how to fill them in. And they can start doing it themselves. If they want, they can reach out, and I'm more than happy to consult and help on this. This is not my main business.
This is basically my attempt to contribute to the community that I actually grew up in and developed in, and to bring some of my learnings, because I really feel the frustration when I'm talking to data leaders. And when I'm talking to senior management in companies, everybody is sharing the same thing: yeah, I'm paying 6 figures or 7 figures for my data team, but I cannot make the simplest decision. Should I invest more money in marketing or not? And I think that this is basically why I decided that I'm sharing it. I'm opening it. I'm not putting any paywall in front of it. And, yeah, use it and bring me feedback. This would be the best thing ever. So we can adjust it and we can apply it, and it's open source, so we can actually work it out and figure out what is working best or not.
[00:46:09] Tobias Macey:
And I'll be adding links in the show notes to the relevant posts as well for people to be able to easily find that. So are there any other aspects of this data vision board strategy, the organizational impact that it can have, or just the overall challenges of managing data for engineering teams that we didn't discuss yet that you would like to cover before we close out the show?
[00:46:18] Lior Barak:
No. I will just invite them: in January, I'm starting a new series, basically focusing more on the strategy itself. So the yearly strategy, how you work with the 3 parts that I mentioned before, the innovation, the maintenance, and the rollouts, which basically helps to close out the 3 year vision a little bit better. Right? So we have a 3 year vision. How do we actually move now into actions? How do we execute it? And this is basically what I'm also gonna start discussing in the Substack publication, and people can join. And I hope we're gonna start hearing more "I'm an impact creator" rather than "I'm just an enabler" when I'm talking to data engineers and when I'm talking to analysts, because I think this shift needs to happen, especially with GenAI. We're gonna see a huge, huge drift and change in the way data is being consumed, the way data is being used. And we need to be there, because at the end of the day, the data team is the star of the future revolution.
We cannot just hide behind other people and say they're gonna take care of it and we're just here to move things. We actually need to lead, and we need to be an example of how to use it correctly and how we're creating impact with it.
[00:47:36] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing or engage further with the work that you're doing on the strategic vision boards, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:48:03] Lior Barak:
I think there is a very big gap in the way that we're managing data ecosystems. So how do we actually have a supply and demand of data? And how are we actually bridging the communication between the parts? So if I, as a user, want to have certain data, how do I communicate it, how do I buy it, and how do I create an SLA on my use cases with whoever is producing it? And on the other side, also for the suppliers to have a mindset for developing the data products: whenever I'm changing something, not just change it and hope that people will follow, but rather have a proper way of communicating and sending the information between both sides.
[00:48:28] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and unpack your thoughts on this 3 year strategic vision practice and how to approach that and how to apply it to the work to be done. It's definitely a very useful framing, a useful exercise for teams to engage in, and something that I plan to dig a bit further into and maybe invest some time into over the holiday break. So thank you again for all of the time and effort you're putting into that, and I hope you enjoy the rest of your day.
Thank you very much for inviting me, and I really enjoyed this session.
Thank you for listening, and don't forget to check out our other shows.
Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. It's 2024. Why are we still doing data migrations by hand? Teams spend months, sometimes years, manually converting queries and validating data, burning resources and crushing morale. Datafold's AI-powered migration agent brings migrations into the modern era. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing.
Ready to turn your year long migration into weeks? Visit dataengineeringpodcast.com/datafold today to learn how Datafold can automate your migration and ensure source to target parity. Your host is Tobias Macey, and today I'm interviewing Lior Barak about how to develop your 3 year strategic vision for your data. So, Lior, can you start by introducing yourself?
[00:01:07] Lior Barak:
Hi. I'm so excited to be on the podcast today. Thank you very much for inviting me. Basically, I am located in Europe, and I've been working for the past 15 years in the data realm. I started as a data engineer, moved to being an analyst, and then slowly drifted more towards product, so data product, when I realized that there is actually a very big vacuum in this area. So while engineers know what they need to do at the end of the day, and analysts also know, more or less, at the same time there is no real communication or working process existing there. This is what fascinated me, and I started drifting more towards product.
In my last position, I was at Idealo, which is the biggest price comparison site here in Germany, and I was responsible for the data platform.
[00:01:53] Tobias Macey:
Do you remember how you first got started working in data and what it is that kept you there?
[00:01:58] Lior Barak:
This is a good question, because I actually started as an account manager for an ecommerce company. And when somebody realized that I actually understand how to write SQL, they asked me why I was doing account management. I should go to the data team. And I was the only member there, and I remember that I needed to extract data from MySQL, which was a very horrible system back then. We're talking about 15 years ago. Right? And this was my first experience: extracting data from MySQL into CSVs and starting to work with them. And I can still remember that in one of the episodes about strategy, I think 387 or 397, there was a discussion about how, for engineers, a CSV file is a great thing, but for an end user it's not.
And I learned it on my own skin, basically, very fast: CSVs are very nice, but they don't really work well for business people. And this is when I started to actually go even deeper.
[00:02:57] Tobias Macey:
So you actually have a Substack where you talk a lot about your philosophies and strategies around data. And one of the ideas that you documented there is this concept of a 3 year data vision board. And before we get too much into the specifics of that and how to apply it, I'm wondering if you can just start by giving an outline of the types of problems that occur as a result of not having some sort of strategic plan for the organization's data and the systems that support it and just saying, we need data. Go make it happen.
[00:03:29] Lior Barak:
That's most of the cases. Right? We need data, making sure that it happens and appears, and this is the basic interaction of management with data teams. Then somebody sit downs and draft some strategy. Right? This is what we need to have to be able to build a certain dashboard. And I think this is the main problem. When we started to talk about, okay, what KPI do we need to have actually, which, yes, it's right to talk about which KPI we need to have and how we actually trace it back and re and creating this, line from ingesting the data, processing it, and bringing it forward. But at the same time, data is then not creating an impact. What is what is the impact actually that the KPI is making? And I think this is a lot of the parts that were missing for me personally. How do I actually explain why I'm working on x on y, and what is the contribution actually for the business? How it's actually creating some kind of an revenue impacted, data? So I had a post few days ago on LinkedIn and was what I was mentioning there is basically that if we want to have the data people sitting with the managing board, what they need to do is they need to be able to actually talk impact and not talk about enablement. They cannot sit in this table and talk about data quality or how they're enabling stuff. They rather needs to be able to talk about what is the impact of data in the organization, and I think this is part that we are really, really missing, and we should have more emphasis on this one. And in that framing of having the data strategy, having the data vision board, what is the format that you recommend for being able to capture that strategic vision? Is it just a bunch of sticky notes? Is it a document? Is it a a data visualization
[00:05:07] Tobias Macey:
of some sort? A, a diagram? I'm just wondering if you can talk to how that needs to be recorded to make it useful beyond the session of actually developing the strategy?
[00:05:19] Lior Barak:
So I'm a big fan of the McKinsey pyramid principle of how to write documents. Right? So leading with the results in a very simple and easy way for people to read and know what they need to do, and then digging deeper into the document. And I think this was also my guideline in setting up the data ecosystem vision board. The idea behind it is that we're starting with an overview, a very simple one, sticky notes saying what we need to do, and then the deep dive is developing all of these points and explaining them much better. Okay, what are the initiatives that need to happen out of it? When we're talking about this capability, how do we define what the capability is, what is its definition of done, and what is the end result, as well as what is the monetary value of it at the end of the day, and how we're creating a profit out of it.
[00:06:07] Tobias Macey:
And for being able to actually develop that vision board, that strategic document, what are some of the types of decisions that need to be made, the level of detail that needs to be engaged in as you are embarking on that process of building out that overarching strategic vision for what you're going to be able to do and some of the steps that you need to take to get there?
[00:06:28] Lior Barak:
So I think that there are basically three forces that drive every data strategy at the end of the day. The first one is regulation. So it's the outside environment that we're talking about, whether it's the GDPR here in Europe, the California act on data privacy, and so on. So this is just one example of regulation that's gonna control us from outside, as well as our cloud provider, which has different policies on what we are and are not allowed to do and how we're processing data. Then the second force driving these topics is the stakeholders, our users. At the end, as a data team we need to supply an environment for our stakeholders to leverage data, to use it, to drive the right decisions and the right profit out of it. And I think that they have a very high impact on what we are doing.
And then the 3rd force is basically the organization as an organization. We ask: what is actually the goal of this organization? Where are we going? What are we trying to achieve? What is our expectation of growth, for example, in the next 3 years? And based on that, we need to break it down and understand where we want to be 3 years from now.
[00:07:37] Tobias Macey:
And what is it about that 3 year horizon that lends itself to this type of planning, in terms of what it encourages and the level of futility that it maybe eliminates?
[00:07:58] Lior Barak:
So I was reflecting a lot, and I said, okay, 1 year is too short. Right? Because we're focusing on, okay, what we need to achieve in this year. In most cases, we're not gonna manage to achieve most of it, and then we're gonna be very disappointed. If you're going to 5 years, technology changes fast. Things are crazy out here. Right? So now we have Gen AI, and every managing director I'm talking to says, oh, I need to have Gen AI. I need a chatbot now to replace my customer service, and so on. So if we're gonna go 5 years or 10 years, we're losing quite a lot of our velocity to grow and actually, yeah, arrive somewhere. And I think 3 years for me is a very soft zone because we can, on the one side, talk about the infrastructure that we need to have in the next 3 years, because we have some kind of planning or forecasting of the growth of the organization, where we wanna go, how many more sessions we're gonna have, what changes we're gonna have. And at the same time, it's not locking us in for too long because we can still adjust it. Right? And this is also what I'm saying about this framework.
You set it up, and you say: based on the best knowledge I have, how do I set the next 3 years to look? And then half a year from now or 1 year from now, depending on the size of the organization, you can always adjust it. Right? Because things have changed. Now all of a sudden you have a new processing engine that you didn't have before, and it's changing the entire plan of what you wanna do or what you wanna achieve. And I think that this is basically where the 3 years is a very soft zone for us to say this is where we need to be. We are not overcommitting to things that are completely gonna change, that you're completely gonna lose. And at the same time, we're also staying very realistic about where we are today, so it's a very soft zone, I would say.
[00:09:48] Tobias Macey:
To that point of generative AI being a major motivator for all of this rapid change and all of these knee jerk decisions of, well, we have to have that in place or else we're gonna fall behind, even though it's still very early days, and we're still figuring out what generative AI is even good for and when it is useful. And that 3 year horizon is definitely reasonable from a strategy perspective because it gives you that room to say, okay. Well, this is what we're aiming towards. Here are some of the big chunks that we need to do to get there, and maybe each of those chunks is actually a year long project by itself. But another challenge of being able to actually embark on that strategic planning session is to be able to have some measure of buy in organizationally, and a lot of the organizational momentum is usually more short term, near term focused: we need to get this done immediately because we need it in place yesterday.
And I'm wondering how you see that playing into the need for that strategic vision because it gives you the ability to have that projected growth and projected work while you also have, at the same time, those immediate pressures of, we need everything yesterday because otherwise we're gonna fall behind. I'm just wondering how you see those tensions play out.
[00:10:52] Lior Barak:
Exactly. And this is why I said 1 year is always a very problematic one. I was sitting in many data strategy meetings. I was actually running a few of them myself. And one line that I always noticed was that we're talking about the present problem. So what needs to be solved now? Our data quality is not good. We talk about, I don't know, the processing engine is not fast enough and things are crashing for the analysts so they cannot run the analysis. We don't have a proper visualization tool. We're looking very much at what our problems are today, and we're just saying, okay, this is what needs to be solved, the problems of today. And in many cases, and this is a lot of my learnings from these data strategies, we don't talk about the future. We don't talk about how data is actually shaping the course of the organization. It's rather just going along with us. Okay, so now marketing needs to have their data for the campaigns so they're able to steer them, to steer the budget. But what is actually the impact of having it? Or maybe we should talk about automation, or giving them a revenue forecast that's gonna help them understand, when the session starts, what the worth of a user is 20, 60 days from today. And I think that this is one of the things we very often fail to do in the regular 1 year data strategies. And in the 3 years, what I'm trying to force is thinking about how we envision our data ecosystem 3 years from now. What capabilities do we have there? We are tool agnostic.
Let's not go into tools because tools are most likely gonna change. And by the time you start researching how to solve the problem that you have on the table, there are gonna be maybe 500 tools available. And let's face it. Now we have Gen AI, with Claude and Gemini as the majority. Llama is also now starting to rise slowly. They're not gonna be there. And most likely, we're gonna find a different approach, like small LLMs that are gonna be very direct and very focused on the organization. Is it actually being considered now, if somebody is pushing to have a customer service chat, that in the future we need to go to something a little bit smaller? Also the cost perspective: we don't really think about it because we're trying to solve a problem that is critical for the organization right now. And with 3 years, we can actually think about what the cost will be and what the possible revenues will be. Does it actually make sense? Is there a data ROI at the end of the day for what we are doing, or are we just wasting our time? And these are things that I was part of. I admit it. I was committing to things that were actually costing the organization more than the revenue they were driving because I thought this was the right thing, because I am an enabler function. I'm not an impact function.
And I think this is the core of the mistakes that we've made in the past years.
[00:13:37] Tobias Macey:
Now in order to be able to build that strategic vision, it's very easy for a data team to have their wish list and say, these are all the things that we're going to do. We're going to build all this amazing infrastructure, these immaculate pipelines. Our data quality is going to be perfect. But at the end of the day, that probably doesn't actually have any real impact on the business because no matter how perfect your data is, if it's the wrong data, then it's not going to drive you in the direction that you need to go. So I'm wondering if you can talk to how you identify the personas and the specific people who should be included in that process of developing that data strategy and that overarching vision for where you're moving organizationally from the data perspective?
[00:14:22] Lior Barak:
So I think for this, let's start basically with the data ecosystem vision board. The idea behind it is 3 layers. There is the present layer, which is talking about users. Identify them. Write them down. Even if it's some regulatory system that is out there, still write it down because you need to be aware of it. It exists out there, and you will need to have some kind of communication with it in the future. Then you're talking about the usage. So based on my best knowledge, how is data being used in the organization, and how do these users specifically influence the data that is being consumed and used in the organization?
And then, of course, what are the gaps and the issues that they're suffering from? So what gaps do they have, and what issues do they have with the data today? I think that when we're talking about the present layer, we're basically talking about getting to know all of our stakeholders and trying to quantify their pains. So when we're talking about a gap, an issue, or a need, we need to be very clear about what the cost actually is for the organization, money wise. If every Monday my data pipeline crashes and I need to do a recap that costs me another $50,000, write it down, make it available for everybody to actually see what the cost of this issue is.
And then the conversation becomes a little bit easier. Right? Because you can come and say, this is actually what it costs us per year. Do you think it's something that makes sense to solve? If the organization thinks not, then it's fine. But I don't believe so when the price is high, which happens quite often. And then it's easier to prioritize these topics. This is one part. The second part is to create some kind of a shared reality between these forces and bring all of these issues to the table and tell them, okay, based on what we know now, this is the cost of it. These are the issues. This is what we need to prioritize higher or lower.
And ask them also to participate in it, to become part of it, because at the end of the day, communication is the key. If you're creating a shared reality with the stakeholders, it's much easier to communicate with them later on and explain to them why you decided to go with x or y or why you're solving a certain problem. They were part of it as well. So they saw the problem, they know the cost of it, and the conversation becomes much easier. And this is, again, coming from the conversations that I had in data strategy. Sometimes it's like, you need to solve our marketing data issue or our finance data issue. And I was like, but what is actually the cost of not having it? What are the risks to the organization if you don't have this data? And now all of a sudden, I have the legal adviser coming and telling me I need to apply some compliance rules or regulations, and I have somebody coming from product asking me for a new tool for user flows. How can I actually estimate all of these issues and decide where I'm gonna put my focus? At the end of the day, our team is small. It needs to do maintenance. It needs to work on innovation, so to say, to check different things and understand how they're working. And on top of that, it also needs to do rollout. Right? So whenever we have a new product or something changes, we need to make sure that it's rolled out to the organization and the organization is aware of what's going on.
These three areas are very critical, and we need to be very, very focused in what we're doing. And this is why I said, let's try to quantify it and make it a little bit easier so we also know what we put higher or lower emphasis on in our day to day. Very long answer.
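The quantification step described in this answer can be sketched in a few lines of Python. This is only an illustration and not something presented in the episode: the `DataIssue` class, the issue names other than the Monday pipeline crash, and all figures except the $50,000 are hypothetical, and the sketch assumes the $50,000 is a cost per incident that recurs weekly.

```python
from dataclasses import dataclass


@dataclass
class DataIssue:
    """One gap or issue from the present layer, with its cost in money."""
    name: str
    cost_per_incident: float   # money lost each time the issue occurs
    incidents_per_year: int    # how often it occurs in a year

    @property
    def annual_cost(self) -> float:
        # Yearly cost is what gets shown to stakeholders for prioritization.
        return self.cost_per_incident * self.incidents_per_year


def rank_by_annual_cost(issues):
    """Return issues ordered from most to least expensive per year."""
    return sorted(issues, key=lambda issue: issue.annual_cost, reverse=True)


issues = [
    DataIssue("Late marketing report", 2_000, 12),          # hypothetical
    DataIssue("Monday pipeline crash", 50_000, 52),         # assumed weekly
    DataIssue("Manual finance reconciliation", 5_000, 4),   # hypothetical
]

for issue in rank_by_annual_cost(issues):
    print(f"{issue.name}: ${issue.annual_cost:,.0f} per year")
```

Putting the yearly figure next to each issue is what makes the "higher or lower" prioritization conversation concrete.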
[00:17:39] Tobias Macey:
No. It's a very good answer. It's very helpful. And you already touched on one of the pieces that I was going to throw out as well, which is that when you get everybody into a room, everybody starts rattling off their wish list of, well, these are all the things that I want. Just make it happen. You're a magical being. You can do all of these things. Data makes it happen. We have generative AI. We don't need developers anymore. Whatever the current imaginative theme is for the day.
And I'm wondering if you can talk to some of the useful ways to structure the types of meetings that you would conduct to actually build the strategic vision. Because if it's open ended, then you do just start getting the wish list from everybody. But if you have a specific agenda or a specific set of questions that you're setting out to answer that everybody's collaborating on, then you reduce the potential set of ideas or potential set of conversations to ones that are well grounded and are going to produce a useful outcome. And I'm just wondering how you've seen that work well in order to be able to bring everybody onto the same page and not just throw all of their wishes into a bucket and hope it doesn't overflow.
[00:18:53] Lior Barak:
I saw how it's not working well. Let's start with that. And I think that when we're talking about the data ecosystem vision board, the idea of it is to disconnect the people in the room from their problems and issues. And this is also what I was writing in the so called guidelines when I'm talking about it. Collect information on a separate board. On, I don't know, a Mural board, create an area for them. Let them put who they are, what their needs are, what the gaps are, whatever disturbs them. I hate the data team. You can write that down as well. I don't care. Whatever you want, you can share in this document. And then the idea behind it is that afterwards, whoever owns this board goes through these different boards that were created by each of the teams, collecting information, trying to create some kind of theme and understand what is connected to what, what is related to what, which stakeholders are connected to each other in the requirements, and then find out how we can manipulate them in a better way. Sorry. I didn't say manipulate.
Just to control the conversation in a better way. And I think that after you collect all of this information, you're generating the present layer. So you're saying these are the issues that we saw. These are the users that we identified. These are the gaps, and so on. And you're creating, out of it, an overview of what you think actually makes sense to be on the board. And you put in the monetary value. I repeat it many, many times. It needs to have a cost number next to it. What is the saving if you're gonna apply it? What is the extra revenue that we're expecting?
Or what is the optimization that it's gonna cause in the organization? And I think then you bring it to the room, after you've done the first filtering and cleaned up all the mess, because there are a lot of different requests. There are a lot of issues that are gonna come up, and people think that they need to be solved right now. And, actually, when you reflect on it, you can think, actually, I don't need to, because if I change our ingestion process to a different one, 90% of the problems that we're talking about right now are gonna be solved. So let's bring this one to the table, accumulate all of their costs, and say, this is the solution. This is what it's gonna cost, and this is what it's gonna bring back in savings.
And then the conversations become a little bit easier. And also, when you're connecting all the dots in advance, people will have fewer options to fill up. If they fight for certain initiatives, it's fine. They need to come and explain why they think it's more important than something that's gonna save us $200,000 compared to their $20,000.
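The idea of accumulating the cost of related issues under one initiative and comparing it against a competing request can be sketched as follows. This is a hypothetical illustration rather than part of the framework as presented: the `Initiative` class, the issue names, and the yearly costs are assumptions, chosen only so that the totals match the $200,000 versus $20,000 comparison mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Initiative:
    """A candidate item for the board, with the issues it would resolve."""
    name: str
    yearly_cost: float                      # assumed cost of running the initiative
    solved_issue_costs: dict[str, float] = field(default_factory=dict)

    @property
    def yearly_saving(self) -> float:
        # Accumulate the yearly cost of every issue this initiative removes.
        return sum(self.solved_issue_costs.values())

    @property
    def net_yearly_value(self) -> float:
        return self.yearly_saving - self.yearly_cost


ingestion = Initiative(
    name="Replace ingestion process",
    yearly_cost=60_000,  # hypothetical
    solved_issue_costs={  # hypothetical issues grouped under one solution
        "Late dashboards": 90_000,
        "Manual reruns": 70_000,
        "Duplicate events": 40_000,
    },
)
team_request = Initiative(
    name="Single-team request",
    yearly_cost=5_000,  # hypothetical
    solved_issue_costs={"Custom export": 20_000},
)

ranked = sorted([team_request, ingestion],
                key=lambda item: item.net_yearly_value, reverse=True)
for item in ranked:
    print(f"{item.name}: saves ${item.yearly_saving:,.0f}, "
          f"net ${item.net_yearly_value:,.0f} per year")
```

Anyone arguing for the smaller initiative then has to explain what the numbers do not capture.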
[00:21:26] Tobias Macey:
One of the challenges that I often run into when trying to conduct these types of conversations is that it's very easy for things to drift into too much detail where you say, oh, this is the problem I'm trying to solve. This is broadly the shape of how I think it's going to be addressed, and then people wanna start nitpicking the specifics. And it's very easy to fall into that trap, and you have to constantly be on guard against it and recognize as it's happening that, okay. This is a useful conversation, but not for right now, not for this context. And I'm wondering how you think about that framing of what is an appropriate level of detail and a useful means of making sure that you don't fall too much into the minutiae.
[00:22:11] Lior Barak:
So I think you always need to think that whoever is in the room is the managing director of the company. He has no clue. He has no time. And all he wants to know is what it's gonna cost me and how I'm solving it. And I think this is the level of conversation you need to have. You can go a little bit into the details of how it's gonna solve something or how the solution is gonna influence the company, and I think this is where you need to stop. And you need to stay tool agnostic. I don't expect anybody to start talking now about which visualization tool will be the best for them or what the best processing engine is. This is not the level of conversation we need to have there. The agreement needs to be on: this is the capability that we need to have. We need to have, in advanced analytics, a recommendation engine for our users.
Stop. Here, you're ending, and whoever is gonna pick up this task later on will need to define it, to write down the initiatives, how they're solving the problem, how they're going into the solution space. And then they can come back and have a conversation with the relevant stakeholders and not with the entire room, because most of them are gonna get bored and say, whatever. I remember that I was once in a managing director meeting with some heads of departments, and one of them, from pricing, said, yeah, we need to work on cost savings. And the managing director was like, obviously, you need to. I don't understand why you're even raising it here. This is your task. Go and do it. You know? And this is really what we need to think about: as simple as possible and as much in money value as possible, to explain why we're actually doing it.
[00:23:46] Tobias Macey:
One of the other challenges that often comes in when you're talking about data strategy is that engineers will get very excited and say, well, this is what we're going to do. This is what we're going to build, and maybe they've already started building something. But, occasionally, that can diverge from the actual needs because you get too far into the weeds of, oh, well, I'm going to optimize this piece of it because I think it's going to be important, or I'm going to add this component because, eventually, it will be useful for x, y, and z. And I'm wondering what you see as the utility of letting engineering start with the implementation to help limit the potential scope of what the strategy ends up being. Because if you already have something started and something to build from, then it provides some measure of grounding of that overall conversation, the overall strategy, versus starting with a blank slate of, I haven't done anything yet. We're going to talk about all the things that we need, and then I'm going to start building things. And I'm wondering how you see that play out in either direction and what are some of the useful ways to address that 0 to 1 phase.
[00:24:54] Lior Barak:
So think about it, and I was also part of this conversation. The team is in firefighting mode. Most teams are in firefighting mode, and they're trying to be reactive to whatever happens around them. And I saw organizations and teams that go and start to investigate. Okay, I need to replace my processing engine now, which are the best options? They go investigating, come back with 5 options, and have already started to do pilots with them. The point is that this costs money. It's diverting their focus from other topics. And, yes, maybe it's gonna save you time in the long term. I don't mind. But this is part of your education plan and not part of something that you're gonna research for strategy. And it also needs to be part of a better defined problem space before you go into the solution. Because it could be that when you start exploring the problem itself and understand who the stakeholders are and understand the use cases, whatever you designed right now may be solving, again, your present problems, but it's not solving the future problems of running millions of events in a few seconds to generate a recommendation or some kind of outcome. And I think this is a very dangerous zone, when engineers start to investigate a solution before we have properly defined the problem. And it has a lot to do with this product mindset. We need to have a product mindset whenever we're approaching it, and we need to be very mindful of our environment. We need to stop and reflect and understand, does it actually make sense? And I know that, for example, teams that I was working with used to have this, I don't know, Friday education day. And on these Fridays, they used to investigate different things, and it's fine. This is your education time. You can do whatever you want in this time. But this does not mean that this is the solution you're gonna go with.
You can pitch it later on and say, I think this is the solution.
After I presented the problem space to you, you can say, I have something for you now.
[00:26:54] Tobias Macey:
And to illustrate this process a little bit, to kind of walk through a concrete example of developing a strategy, I'm wondering if we can just do a brief exercise, and maybe we'll use my day job as an example where we have a data platform. We're starting to roll it out to the organization. We're doing the standard ELT approach of ingest and load with Airbyte, transform with dbt. We're using Superset for visualization, and the initial focus is business analytics because we have multiple different stakeholders, different business units, and we're trying to give a more cohesive means of data access, data control, and shared visibility into that data where up until now, it's largely been a little bit more ad hoc. We have different reports for different stakeholders even though a lot of the underlying data is the same. And so we have a lot of needs, a lot of complexity, and we're figuring out what are the next things that we need to do. So I'm wondering, given that framing, what would be the first step in embarking on that vision board journey?
[00:28:07] Lior Barak:
It's to start by understanding, first of all, the user. It's understanding what is working well or not today in your infrastructure. You said you rolled out dbt. Is it actually working well, and is it serving you? And also from a cost perspective, is it actually cost optimized for you or not? It's to go and understand the regulatory environment that you're living in and what can influence your future process as well. There are organizations that are still working, for example, to adapt to some data compliance regulations, and they're not there yet. Are the tools that you have today or the architecture you have created actually answering these problems or not? And I think that it's also to understand where we're going in the next 3 years, to try to create some kind of forecast of the amount of events that this ingestion process will need to process, the amount of queries that dbt will need to run, or the amount of data that you need to process. And also understand, cost wise, does it make sense for us to keep it as it is or not? And then say, what are the capabilities that we need to have out of it? So 3 years from now, we want to have a system that allows us to ingest and process data and validate it at the ingestion point, as an example, because we already identified that there is a problem with our ingestion.
And we want to have a new processing engine for processing raw data into a data product before we arrive at dbt, and so on. And we're really designing it in a way that's easy for us to understand. These are the capabilities. We are not talking about tools. We can, of course, refer to the tools that we have today. We have to, because we need to acknowledge our existence and our reality. It's imperfect, and it's not working, and this is why we're looking for something else. And then at the end, it's basically to say how we're actually measuring success. What does success mean for our vision board? Is it improving the data ROI? Is it an improvement in data utilization? Is it an improvement in data availability, and so on? And actually list these KPIs and write them down. What are the principles that we're gonna work by? Because, okay, we want to have an open system, as you mentioned before, and we want everybody to access the data, which is great. But do we actually want everybody to access the data, or do we want to have some kind of control system in place so that nobody is just going and traveling around raw events, trying to combine something out of them without having the experience? And I think that this is the part of the principles that we need to understand. And this is where your starting point should be. What is working well? What is not working well? Acknowledge the reality. And the reality is not gonna be perfect. It will never be perfect. It doesn't matter what you're gonna do and how you're pulling yourself out of it into a better direction.
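The 3 year forecast of event volume and cost mentioned in this answer could look something like the sketch below. It is a rough illustration under stated assumptions, not a method from the episode: it assumes simple compound yearly growth and a flat cost per million events, and the function names, the 500 million starting volume, the 40% growth rate, and the $12 unit cost are all hypothetical.

```python
def forecast_yearly_events(current_events_per_year, yearly_growth_rate, years=3):
    """Project event volume for each of the next `years` years.

    Assumes compound growth at a constant yearly rate (e.g. 0.4 for 40%).
    """
    return [
        round(current_events_per_year * (1 + yearly_growth_rate) ** year)
        for year in range(1, years + 1)
    ]


def forecast_yearly_cost(events_per_year, cost_per_million_events):
    """Translate projected volumes into cost, assuming a flat unit price."""
    return [
        events / 1_000_000 * cost_per_million_events
        for events in events_per_year
    ]


# Hypothetical inputs: 500M events today, 40% growth, $12 per million events.
events = forecast_yearly_events(500_000_000, 0.40)
costs = forecast_yearly_cost(events, 12.0)

for year, (volume, cost) in enumerate(zip(events, costs), start=1):
    print(f"Year {year}: {volume:,} events, about ${cost:,.0f}")
```

Even a crude projection like this gives the "does it make sense cost wise to keep it as it is" question a number to argue about, and it can be replaced with the organization's own growth plan at the next review.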
[00:30:48] Tobias Macey:
And so given that framing, for a little bit more detail, we do have the mart layers. That's what we expose to stakeholders and end users so that they're not just digging into all of the raw events. They have a more narrowly scoped set of data that they're relying on with some prebuilt dashboards and visualizations. Directionally, where we also want to head is being able to use that aggregated and collected data to feed back into our applications and products as well as being able to power an eventual generative AI chatbot style application or suite of applications.
And so there are a lot of unknowns in that journey, but that's kind of broadly where we're heading and the things that we need to figure out.
[00:31:36] Lior Barak:
And, you know, one of the things that inspired me about these vision boards was when I joined Zalando. What I was introduced to very fast was: this is the future. This is where you wanna be. And when you have a goal and you know where you need to get to, say, we want to automate all the marketing campaigns so they're generated automatically without any human touch, or with minimal human touch, let's say, then it's easier to know what tools I need to pick. It's easier to understand what steps need to happen to arrive there. It's much easier to envision the end of the journey. Right? We're still gonna go through this entire journey. We're still gonna have issues along the path, and we will need to readjust and change, and that's completely fine. But at least we have an end goal. We know what needs to happen. And I think this is what is super important in the 3 years. This is what needs to happen. Does it make sense to us? Yes or no? And if yes, it should be on the board, and we should start working on how we're arriving there.
[00:32:42] Tobias Macey:
And once you have developed that vision board, you have your strategy, you know where you want to end up in 3 years, what are the different either timelines or milestones or different triggers that you should be looking to to prompt you to revisit that strategy and push it out to the next 3 year horizon or update the strategic vision because maybe part of it no longer holds true, and just some of the ways that that should become a living document or a living utility to help guide the overall work that's being done?
[00:33:14] Lior Barak:
It should be a living document. However, I will say that it should not be changed more often than every 6 months, and in fixed iterations, because the idea behind it is that we already created the shared reality with our stakeholders. We created the shared reality with ourselves: where we're going, what problems we solve. And if we start changing it every few months, we will arrive nowhere because we're gonna go into this chaotic situation again of, this is not relevant, now everything is a problem, we need to restart everything. We actually need to figure it out: if we committed to this and we said this is what we need to solve, we start solving it. And in 6 months, we can review it and say, you know what? There were some changes.
This is no longer a tool that we like to use. We want to move now and use Trino because we think it's more cost efficient. Fair enough. But at least you had these 6 months. You progressed, you moved, you didn't change anything. And the consistency is very, very important as well because we had some progress. We've done something. And I think that this is again the ability to move between the innovation part, the maintenance part, and the rollout and make sure that we're actually maximizing the effect. One of the examples for me that is really very painful is when we tried to roll out an observability tool in the organization.
It was very great at the beginning, and then at some point, we just dropped the ball because we needed to run to other tasks and other stuff. And my conclusion was, okay, we need to kill the tool because nobody's using it. The reality was that we didn't actually put the right emphasis on the rollout and make sure the teams were adopting it, and we didn't give it the time. So if I'm saying after 6 months that we need to kill it because it doesn't make sense anymore, that makes sense, because I have some details that I can come with. If I come and say it after a few weeks, did I actually try to implement it, or do I understand why it's not working? You know? And this is why I'm saying at least 6 months, because you're giving yourself a little bit of time to try different approaches and figure out how you can fix the implementation, for example, or maybe it's the tool itself that is not fitting. And then you can also say that in the next 6 months, we need to have a completely different tool. Instead of observability and monitoring, we need to have validation that rejects bad data immediately.
That's also an option. It's just that you have enough time to learn what is working and what's not and bring these learnings to the future development.
[00:35:35] Tobias Macey:
And the other piece of it being a living document is that you need to be able to attach that to work being done within that time frame. And I'm wondering, what are some useful techniques or tactics that you've seen for being able to say, this is our overarching strategic vision. This is the sequencing of the big pieces that we're going to be delivering. Now we need to actually break it down into consumable chunks of work that can be done, but we want to make sure that that work gets linked back to that overarching strategic vision and some of the ways that you've seen that implemented and work well.
[00:36:11] Lior Barak:
In the board itself, what you can see is that there are basically 7 zones, more or less. Right? There is the data management, there is the data ingestion part, there is the data storage, and so on. So there are different zones, and in them we're gonna have capabilities. And then later on, these capabilities need to be cut into initiatives that we're gonna work on, and these initiatives need to be cut into tickets, basically, that show some progress. I'm not advocating here that we're gonna develop a product over the next 6 months, but rather asking: how do we cut it down to a level where we have an MVP in a month that we can already go and test? And how do we then build on top of this success to develop the rest of our solution?
And this is, I think, what is very important. We also need to understand that, let's face it, we cannot work on everything, so we need to prioritize what is the most important and most acute problem for us right now. And then you link it back to this vision board, because you can put KPIs there that say: if I'm gonna be successful with the new ingestion solution, the data quality issues should be reduced to only 2% in a month. And I can say that the utilization of the data is gonna go up to 86 or 87%, because I know that this is basically making data more safe.
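An illustrative sketch, not something said in the episode: the breakdown described here (zones, capabilities, initiatives, tickets, with KPIs attached) could be modeled roughly as below. The class names, fields, and the `review` helper are hypothetical and are not part of the vision board framework itself. The 2% and 86% targets are the figures mentioned in the conversation; the measured values are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Kpi:
    name: str
    target: float            # target value, in percent
    higher_is_better: bool   # direction in which the KPI should move

    def met(self, measured: float) -> bool:
        """Return True when the measured value reaches the target."""
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


@dataclass
class Initiative:
    name: str
    kpis: list[Kpi] = field(default_factory=list)
    tickets: list[str] = field(default_factory=list)


@dataclass
class Capability:
    name: str
    initiatives: list[Initiative] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    capabilities: list[Capability] = field(default_factory=list)


# Hypothetical board fragment: one zone, one capability, one initiative.
ingestion = Zone(
    name="Data ingestion",
    capabilities=[
        Capability(
            name="Reliable ingestion",
            initiatives=[
                Initiative(
                    name="New ingestion solution (MVP in a month)",
                    kpis=[
                        Kpi("data quality issues", target=2.0, higher_is_better=False),
                        Kpi("data utilization", target=86.0, higher_is_better=True),
                    ],
                    tickets=["Define MVP scope", "Ingest first source", "Measure KPIs"],
                )
            ],
        )
    ],
)


def review(zone: Zone, measured: dict[str, float]) -> dict[str, bool]:
    """Compare measured values against every KPI found in a zone."""
    results = {}
    for capability in zone.capabilities:
        for initiative in capability.initiatives:
            for kpi in initiative.kpis:
                if kpi.name in measured:
                    results[kpi.name] = kpi.met(measured[kpi.name])
    return results


# Measurements invented for illustration only.
print(review(ingestion, {"data quality issues": 1.8, "data utilization": 80.0}))
```

Keeping the KPI on the initiative rather than on the ticket is one possible design choice: tickets can change or be interrupted by emergencies while the measured outcome stays tied to the board.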
And I think this is a very important part: to also talk about the KPIs that are gonna help guide you in this period, and how you're measuring it back, if it's the ROI. I want to increase the ROI. And, actually, how do I do it? By creating an advanced system where the number of dashboards is gonna be reduced. Instead of having thousands of dashboards, we're gonna have only 8 of them in the organization. And by that, we're actually gonna reduce the amount of data processing. The cost is gonna go down. The number of decisions driven by these dashboards is gonna be higher, so they're gonna generate more revenue. And basically we're creating an environment that is attached to KPIs that we can come and stand behind. And again, this is from my feeling as a data leader, many times, that I was standing there saying, okay, data quality is not okay. But how do we define that it's not okay?
And where do I need to be, or how should I improve it if I'm rolling out a new product or something new? What is actually the solution there? And I think this is the part that I try to reflect in this version of my 3 years vision board: to be able to actually come and say, with KPIs, this is what I expected, this is what happened, this is where it's standing, this is the movement that has happened. One of the things that I really hated was OKRs, for example, because you're committing to a certain development and then things come in. And then you have this decision: am I actually achieving the OKR, or am I solving the emergencies?
And in the end you always need to solve the emergency, because you cannot skip it. You're giving up on your OKR. The OKR is getting delayed. You're arriving, and then you have this failure, because you arrived at a very low point on your OKR achievements. And everybody looks at you like, poof, again the data team has failed. No. It's just that we tried to fix something else that was much more important, and it was not reflected. And I think this is also why we need to look at this data ecosystem vision board, because it allows us to put in the KPIs. It's gonna reflect on us in a positive way. Because if I solved an emergency, it means I solved a problem that was acute and cost us money. So I did do something that actually drove us forward. It's just not in the OKR big picture. It's in a different way, and this is how I did it.
[00:39:43] Tobias Macey:
And in your work of developing this technique and implementing it with the teams that you have worked with, what are some of the most interesting or innovative or unexpected ways that you have seen that overarching strategic planning session and the resulting document used in that organizational context?
[00:40:04] Lior Barak:
So I haven't seen that many. I just started, I need to be honest about it, slowly rolling it out, and I actually asked my friends to start using it, so colleagues that I knew who are working in data teams. And I was doing a workshop with one of them a few days ago, and we were sitting there and they brought the documentation from the users, or whatever they collected there. And they created some kind of their own version of the board, because they said, no, "usages" doesn't make sense to me, I need "needs" here. And they started to readjust things, and what came out of the document was, okay, we're doing the same thing as the yearly plan. We have no plan for 3 years, because you didn't even put it here. And I think that it's not innovation. It's actually still a mindset that needs to be shifted here, and this is at least the expectation that I bring to these teams.
The ability to actually think beyond "I'm just an enabler, I need to move the paper from my left to my right, and I need to place it there," into "actually, if I'm moving it from the left to the right, this is the impact I'm gonna create. And this is what I am able to control, and this is what my environment is gonna control." And this is what I'm trying to bring to these teams. It's a shift in mindset, and I think this is very much influenced also by the books I was reading from Phil Jackson about his coaching career in the NBA. One of the things I really loved was that he brought mindfulness and Zen into the game. Okay, he's famous for adopting the triangle approach, which is gonna be discussed soon in my Substack as well.
But he also brought the Zen and the mindfulness and the understanding that things are not perfect, and you need to accept them. And you need to come with an empty mind, which is actually very much about the muscle memory process: the more you do it, the more you're automatically gonna be able to do things. And I think that this is where I'm trying to take the board and the teams together with me, into this process of shifting their minds.
[00:42:05] Tobias Macey:
And in your work of developing this approach, thinking through how to apply the techniques at the organizational level, what are some of the most interesting or unexpected or challenging lessons that you've learned on that journey?
[00:42:20] Lior Barak:
There are so many. I think that, first of all, I don't have the answers for everything, and nobody has. Forecasting was given to the fools. You'll never be able to truly forecast what's gonna happen in 3 years. And I think these are the 2 most important learnings, for me at least, because I always try to be perfect in what I'm doing. And this is again another of our flaws. Right? Because we're trying to bring the perfect solution. It does not exist. And we need to know when to stop and what is actually our endpoint, where we're saying: I'm happy with this. If I'm able to run a simple recommendation engine twice a day, I'm happy. I don't need to run it every hour. And I think this is the part that we need to reflect on and think about. And at least for me, this has been quite a big learning.
[00:43:08] Tobias Macey:
And for teams who are figuring their way through the work that needs to be done, trying to understand where they're going, what are the cases where this data vision board exercise is not appropriate for their context or their team or organizational scale?
[00:43:23] Lior Barak:
I don't think the scale is gonna be an issue. I think it's just the level of engagement of these engineers and the control that they have; this is one of the things that I realized. So if you are 1 or 2 engineers scattered all over the organization, I don't think this vision board will help you much, because you need to serve wherever you are located. You are not a centralized team. I'm not advocating for centralized teams. I'm not advocating for decentralized teams. I'm just thinking that, from the perspective of the organization, you need to have several engineers in place to be able to run these initiatives, so that you will be able to execute them and not only be on the defense all the time. And I think that this is something that people need to reflect on. Other than that, I will argue everybody should start using it, because it's actually helping you to set up your mindset in a different way.
[00:44:16] Tobias Macey:
And for teams or organizations that want to embark on that journey and start that process, what are some of the resources that they should be looking to? Any specific documents that you've written that would be helpful, or any other pointers or practices that you recommend that teams invest in to manage the complexity and evolution of their data capabilities?
[00:44:44] Lior Barak:
They can go to my Substack. I have a 4-newsletter series there that I basically drafted with all the information about how to use it. Of course, not all the details, because it would take more than 4 newsletters, I'm already saying, and they can improvise around it, but the basics are there: what each of the columns means, and for each of the layers, what its purpose is, how to use it, what the blocks inside it are, and how to fill them up. And they can start doing it themselves. If they want, they can reach out, and I'm more than happy to actually consult and help on this. This is not my main business.
This is basically my try to contribute to the community that I actually grew up in and developed in, and to bring some of my learnings, because I really feel the frustration when I'm talking to data leaders. And when I'm talking to senior management in companies, everybody is sharing the same thing: yeah, I'm paying 6 figures or 7 figures for my data team, but I cannot make the simple decision of whether I should invest more money in marketing or not. And I think that this is basically why I decided that I'm sharing it. I'm opening it. I'm not putting any paywall behind it. And, yeah, use it and bring me feedback. This would be the best thing ever, so we can adjust it and apply it. It's open source, so we can actually work it out and figure out what is working best or not.
[00:46:09] Tobias Macey:
And I'll be adding links in the show notes to the relevant posts as well for people to be able to easily find that. So are there any other aspects of this data vision board strategy, the organizational impact that it can have, or just the overall challenges of managing data for engineering teams that we didn't discuss yet that you would like to cover before we close out the show?
[00:46:18] Lior Barak:
No. I will just invite them: I'm starting a new series in January, basically focusing more on the strategy itself. So the yearly strategy, how you work with the 3 parts that I mentioned before (the innovation, the maintenance, and the rollouts), which basically helps to close the 3 years vision a little bit better. Right? So we have a 3 years vision. How do we actually move now into actions? How do we execute it? And this is basically what I'm gonna start discussing in the Substack publication, and people can join. And I hope we're gonna start hearing more "I'm an impact creator" rather than "I'm just an enabler" when I'm talking to data engineers and when I'm talking to analysts, because I think this shift needs to happen, especially with GenAI. We're gonna see a huge, huge shift and change in the way the data is being consumed, the way the data is being used. And we need to be there, because at the end of the day, the data team is the star of the future revolution.
We cannot just hide behind other people and say they're gonna take care of it and we're just here to move things. We actually need to lead, and we need to be an example of how to use it correctly and how we're creating impact with it.
[00:47:36] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing or engage further with the work that you're doing on the strategic vision boards, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap from the tooling or technology that's available for data management today.
[00:48:03] Lior Barak:
I think there is a very big gap in the way that we're managing data ecosystems. So how do we actually have a supply and demand of data? And how do we actually bridge the communication between the parts? So if I, as a user, want to have certain data, how do I communicate it, how do I buy it, and how do I create an SLA for my use cases with whoever is producing it? And on the other side, also for the suppliers, to have a mindset of developing data products: whenever I'm changing something, not just change it and hope that people will follow, but rather have a proper way of communicating and sending the information between both sides.
[00:48:28] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and unpack your thoughts on this 3 year strategic vision practice, how to approach it, and how to apply it to the work to be done. It's definitely a very useful framing and a useful exercise for teams to engage in, and something that I plan to dig a bit further into and maybe invest some time into over the holiday break. So thank you again for all of the time and effort you're putting into that, and I hope you enjoy the rest of your day.
Lior Barak:
Thank you very much for inviting me, and I really enjoyed this session.
Tobias Macey:
Thank you for listening, and don't forget to check out our other shows.
Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction and Overview
Guest Introduction: Lior Barak
The Importance of a 3-Year Data Vision
Forces Driving Data Strategy
Generative AI and Strategic Planning
Developing a Data Strategy
Engineering and Implementation Challenges
Maintaining a Living Document
Lessons Learned and Challenges