Summary
One of the core responsibilities of data engineers is to manage the security of the information that they process. The team at Satori has a background in cybersecurity and they are using the lessons that they learned in that field to address the challenge of access control and auditing for data governance. In this episode co-founder and CTO Yoav Cohen explains how the Satori platform provides a proxy layer for your data, the challenges of managing security across disparate storage systems, and their approach to building a dynamic data catalog based on the records that your organization is actually using. This is an interesting conversation about the intersection of data and security and the lessons that can be learned in each direction.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Your host is Tobias Macey and today I’m interviewing Yoav Cohen about Satori, a data access service to monitor, classify and control access to sensitive data
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by describing what you have built at Satori?
- What is the story behind the product and company?
- How does Satori compare to other tools and products for managing access control and governance for data assets?
- What are the biggest challenges that organizations face in establishing and enforcing policies for their data?
- What are the main goals for the Satori product and what use cases does it enable?
- Can you describe how the Satori platform is architected?
- How has the design of the platform evolved since you first began working on it?
- How have your experiences working in cyber security informed your approach to data governance?
- How does the design of the Satori platform simplify technical aspects of data governance?
- What aspects of governance do you delegate to other systems or platforms?
- What elements of data infrastructure does Satori integrate with?
- For someone who is adopting Satori, what is involved in getting it deployed and set up with their existing data platforms?
- What do you see as being the most complex or underserved aspects of data governance?
- How much of that complexity is inherent to the problem vs. being a result of how the industry has evolved?
- What are some of the most interesting, innovative, or unexpected ways that you have seen the Satori platform used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while building Satori?
- When is Satori the wrong choice?
- What do you have planned for the future of the platform?
Contact Info
- @yoavcohen on Twitter
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, Podcast.__init__ to learn about the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline and want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's now even easier to deploy and scale your workflows or try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster. With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today. That's L-I-N-O-D-E, and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.
Your host is Tobias Macey. And today, I'm interviewing Yoav Cohen about Satori, a data access service to monitor, classify, and control access to sensitive data. So, Yoav, can you start by introducing yourself?
[00:01:09] Unknown:
Sure. Thank you for having me, Tobias. So I'm the co-founder and CTO of Satori Cyber.
[00:01:16] Unknown:
And do you remember how you first got involved in data management?
[00:01:20] Unknown:
It all started when I joined a cybersecurity startup about 10 years ago. The data technologies we all have today were either less common back then, or we just didn't have the budget for anything off the shelf. So we basically built our own data lake without even calling it that. And since then, I've been involved in building and operating global-scale data processing systems.
[00:01:46] Unknown:
And so you mentioned that some of your background is in cybersecurity. I'm wondering if you can give a bit of a description about what it is that you're building at Satori and some of the story behind how the product and the company got started?
[00:02:00] Unknown:
Sure. So Satori builds a universal data access service that provides visibility and control over data access and helps data teams spend less time implementing security and privacy controls and more time working with data. It's a transparent data proxy that analyzes queries and their result sets, builds a comprehensive view of what data you have, where it's located, and how it's being used, and provides tools to enforce security and privacy policies on data access. Now, the story behind building Satori goes back, as I mentioned, about 10 years. One of the biggest challenges we were facing with our homegrown data lake was how to ensure proper use of the data in an environment where there are different stakeholders for the data. You have regulation, you have compliance requirements, and all of that at a very large scale of dozens of data centers and a ton of data. Conventional tools either weren't up to the scale we were operating at, or were too expensive for us to use as a small startup.
And so we ended up spending a lot of engineering cycles and resources to make that work. And when Eldad, my co-founder and CEO, and I started thinking about what we wanted to do next, we came back to these challenges and found that things have only gotten worse in the industry since then. Today, it's easier to operate data at scale, and for 99% of companies, you can do that pretty easily with today's tools. But governing data at scale with techniques that were invented decades ago is not a good fit anymore. And when we started talking to data-driven organizations, we realized this was no longer a problem of just the big tech giants. It was everybody's problem.
[00:03:46] Unknown:
And so in terms of the overall capabilities for handling access control and governance for data, you mentioned that a lot of the practices and platforms for doing that predate the current iteration of how we think about data and how we process it. So I'm wondering if you can give a bit of an overview of your perspective on the state of the industry for governance, and how Satori compares to some of the other platforms, tools, and practices for managing these governance requirements?
[00:04:22] Unknown:
Satori's approach is different in that we understand companies don't really want to re-architect their data environment just to get the security and privacy capabilities they need today. Given the choice, most organizations will adopt something that is easy to implement, and that became our mission. That's as opposed to other solutions out there, in which you need to register all of your users with the solution and model all of your data sources and datasets before you can actually start using the tool. With Satori, all of that upfront investment doesn't exist. Once deployed, you immediately get in-depth visibility into data access in your environment and can start enforcing security and privacy policies on your critical data flows.
To summarize, I think Satori's approach is very much pragmatic, while other solutions out there are trying to boil the data governance ocean, so to speak.
[00:05:12] Unknown:
And so for organizations that are trying to establish policies for data governance, and to define and enforce them in a repeatable way that isn't subject to varying interpretations and that is very clear as to the intent, purpose, and steps necessary to enforce those policies, what are some of the biggest challenges that organizations are facing because of the current state of the data ecosystem?
[00:05:44] Unknown:
I think the biggest challenge in enforcing policies is the overhead that the process generates for data engineering teams. The reason for that is that, in most cases, data access policies are defined using database objects like permissions, grants, or views. Modifying and managing these objects is a very technical process, and it's also very risky. A lot can go wrong when you change these things. That's why it's handled by very technical folks, the data engineers. What we're hearing from the companies we talk to is that, on average, 30% of the time spent in data engineering teams goes to managing access and permissions.
And when you talk to data engineers, they'd much rather have the business take care of that, since the business has the right context to do it. That way, the data engineers can be more focused on generating new datasets to help the business.
[00:06:37] Unknown:
And then as far as the goals of the Satori product and platform, what are the primary use cases that you're looking to enable? And what are the elements of data governance that you are currently deciding not to try and tackle as part of the platform that you're offering?
[00:06:56] Unknown:
So our main goal for Satori is to be easy to implement in a complex environment and to provide elegant solutions to problems that today are being addressed by manual work. For example, take how dynamic data masking is implemented in Satori as opposed to other solutions. Usually, what you see in dynamic masking solutions is a policy defining how to mask each column based on the role of the person viewing the data. That's pretty much the way it goes. That is really hard to maintain at scale, because as new data is introduced and existing data changes, someone has to keep that policy up to date. With Satori, because we classify data as it's being accessed, our masking policy is defined at the level of data types and not columns.
So a single masking policy can be used across different types of data stores, and it doesn't have to be updated when new data is introduced. As for use cases that we enable, it's mostly around automatically discovering and classifying sensitive data, running analytics and data science without compromising sensitive data, and mitigating security and compliance risks by auditing data access and enforcing policies on data access.
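To make the type-level masking idea concrete, here is a minimal sketch in Python of a masking policy keyed by classified data type rather than by column. This is an illustration only, not Satori's implementation or API; the data types, masking rules, and the classifier output are all hypothetical.

```python
# Illustrative sketch only: a masking policy keyed by data type, not by column.
# The data types, masking rules, and classified columns below are hypothetical.

MASKING_POLICY = {
    "email": lambda v: "***@" + v.split("@")[-1],   # keep domain, hide local part
    "ssn": lambda v: "***-**-" + v[-4:],            # keep last four digits
    "name": lambda v: v[0] + "***",                 # keep first initial
}

def mask_row(row, column_types):
    """Mask each value according to its classified data type.

    `column_types` maps column name -> classified data type; columns whose
    type has no masking rule pass through unchanged.
    """
    masked = {}
    for column, value in row.items():
        data_type = column_types.get(column)
        rule = MASKING_POLICY.get(data_type)
        masked[column] = rule(value) if rule else value
    return masked

# Because the policy is keyed by data type, a newly introduced column that is
# classified as "email" is masked without any change to the policy itself.
row = {"customer_email": "jane.doe@example.com", "signup_ts": "2021-03-01"}
types = {"customer_email": "email"}  # produced by a (hypothetical) classifier
print(mask_row(row, types))
```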
[00:08:09] Unknown:
Particularly in the scope of compliance issues and the regulatory regimes that companies might be subject to, what are the risks associated with getting data governance wrong or overlooking certain aspects or capabilities of it? And what are some of the potential ramifications of ignoring or overlooking those aspects of data governance?
[00:08:35] Unknown:
There are a few main risks organizations take on when their data governance programs are not meeting today's requirements. The first one is the risk of a data breach. If sensitive data is leaked outside of the organization, that can lead to a loss of consumer trust and to regulatory fines on the organization. The second one is data quality issues. A lot of organizations today are data driven, or striving to be data driven, and when data governance is not working correctly, that can lead to data quality issues, which in turn can lead to incorrect decisions based on incorrect data.
[00:09:16] Unknown:
And in terms of the actual Satori product and the platform that you've built, can you give an overview of how it's architected and some of the design goals that you had at the start, and how those assumptions or intentions for the platform have changed or evolved since you first began working on it and started to onboard more customers and work with them to address their design challenges?
[00:09:42] Unknown:
So the Satori platform is comprised of two main components. The first is the data access controller. This is where we have our proxy, our classification engine, and our policy engine. The data access controller can either be consumed as a service or deployed in the customer environment. The second is the management console. This is a SaaS application where you manage your data stores and policies on the Satori service. When a new data store is added to the service, Satori generates an alternative host name for accessing that data store, and that host name leads to the Satori data access controller when accessed. So from a data consumer perspective, the only thing that changes is the URL of the data store. And in many cases, that change can also be applied centrally, so it becomes invisible to data consumers.
Once data consumers start accessing their data via Satori, Satori analyzes the queries and classifies the result sets to understand what type of data is being accessed and where it's located within the data store. We aggregate all that metadata and generate a data inventory where you can see the schemas of all of your data and the tagging that we provide on top of those schemas. All of that happens without having to do any configuration or setup on the system. It's all automatic. It's almost like we crowdsource the creation of that data inventory, which is kept up to date continuously based on what we observe from data access.
All of this context that we collect and generate is provided as input to our attribute-based access control policy engine, which customers can use to implement any type of data access policy they need. You asked about things that have changed since we started building the platform. I must say, not a lot has changed. But one thing that we learned was very important, which we didn't realize early on, was analyzing the queries and not just focusing on result sets. In order to create that data inventory I was talking about, it's imperative that we understand which locations in the data store queries are accessing.
And the only way to do that in a high-quality way is by analyzing queries, and most queries are non-trivial. So that has been a challenge which we had to overcome and which we didn't anticipate at first.
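As a rough illustration of the "crowdsourced" inventory described above, here is a simplified Python sketch in which each observed query contributes the locations it touches and the result-set classification contributes tags. It is not Satori's engine: the table extraction is intentionally naive, and the classification step is assumed to have happened elsewhere.

```python
# Simplified sketch: build a data inventory from observed query traffic.
# Real query parsing is far more involved; the table extraction here is
# deliberately naive and the result-set classification is taken as given.
import re
from collections import defaultdict
from datetime import datetime, timezone

# inventory[(datastore, table)] -> {"tags": set of data types, "last_accessed": datetime}
inventory = defaultdict(lambda: {"tags": set(), "last_accessed": None})

def tables_in(query):
    """Very naive extraction of table names following FROM/JOIN keywords."""
    return re.findall(r"\b(?:from|join)\s+([\w.]+)", query, flags=re.IGNORECASE)

def observe(datastore, query, result_tags):
    """Record a query and the data types classified in its result set."""
    now = datetime.now(timezone.utc)
    for table in tables_in(query):
        entry = inventory[(datastore, table)]
        entry["tags"].update(result_tags)
        entry["last_accessed"] = now

# Hypothetical traffic flowing through the proxy:
observe("warehouse", "SELECT email, plan FROM billing.customers", {"email"})
observe("warehouse",
        "SELECT c.email, o.total FROM billing.customers c JOIN sales.orders o ON c.id = o.cid",
        {"email", "currency"})

for (ds, table), entry in inventory.items():
    print(ds, table, sorted(entry["tags"]), entry["last_accessed"].isoformat())
```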
[00:12:17] Unknown:
I'm also curious about the benefits of Satori's proxy-oriented nature as compared to other systems that might rely on a push-based approach, where the ETL pipelines publish their metadata to the data governance system, or that crawl the metadata from the data storage layer. What are some of the benefits and trade-offs of using this proxy approach versus a more active or heavyweight integration process for pulling in that information?
[00:12:49] Unknown:
The advantage of the proxy approach is, first of all, that it's focused on the data that's actually being accessed, as opposed to data that just sits there and is not accessed. Second, under the hood, we actually do both. We look at data access, but we also query the data store to get the schemas of tables, and we incorporate that information into the inventory. So you could say that we actually do both, but we do it in an ongoing way instead of once a week or once a day. It's ongoing and kept up to date, and that's one of the big advantages. Whenever new data is accessed, it's automatically classified.
So the data inventory is kept fresh and up to date all the time.
[00:13:36] Unknown:
And I'm curious if you have seen cases where, because you're only populating the catalog when people are querying for certain records, that leads to blind spots in terms of the types of assets that the company has, and whether additional work is needed to be able to identify and locate those unaccounted-for records or assets or data locations?
[00:14:08] Unknown:
So we haven't encountered that as a significant challenge. Whenever a query is sent to data that Satori might not have seen before, we automatically update our inventory on the fly, and that becomes available to the policy engine and in our visibility and analytics tools.
[00:14:30] Unknown:
And as far as the data platform components and the aspects of data infrastructure that you're working with, because of the fact that you're proxying, I'm wondering how that either simplifies or complicates the work of integrating with different types of databases or storage systems?
[00:14:49] Unknown:
So, actually, early on we thought this was going to be a main challenge, and we call that coverage: data store coverage. How are we going to support multiple types of systems? What we've learned, and we've built the system to support this, is that the only thing we need to do to support a new type of data store is understand how clients of that data store communicate with it, so that we understand the actual network protocol used. After we do that, many data stores share the same behavior.
And so for us today, adding support for a new data store is something we can do in just two or three weeks, and we're up and running. There are a lot of similarities between how different data stores behave, and modern data stores usually rely on HTTP-based protocols, which alleviates a lot of the work we have to do at the network level to parse each data store's specific protocol.
[00:15:50] Unknown:
Digging more into the actual deployment and integration piece of getting Satori set up, particularly on the point of these HTTP protocols, I'm curious if you've seen any difficulties in acting as a middleman for those connections because of things like TLS and needing to decrypt the communications to understand the intent and then proxy them on to the backend, and what some of the common patterns are for how those TLS layers are segregated across the network. Do you act as the TLS terminator? And is it common for traffic to then go unencrypted to the backend, or to be re-encrypted to the backend? What are some of the deployment considerations that play into that kind of setup?
[00:16:36] Unknown:
So it's a good question. All of the data traffic that goes through Satori is encrypted end to end. That's important to understand. We do act as a TLS terminator, because we need to analyze the queries and we need to analyze the results. The way we accomplish that is by providing a Satori-generated host name for data consumers to access the data. So traffic between data consumers and Satori is encrypted by a TLS session that is established with our proxy, and traffic from our proxy to the data store is encrypted by a TLS session that is established with the data store. So you get end-to-end encryption all the way.
Some of the older data platforms have proprietary ways of handling encryption, so we've had to build support for those protocols, and we have. But we never degrade the security or privacy of the environment.
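For readers curious what "terminate TLS and re-encrypt to the data store" looks like mechanically, here is a heavily simplified, single-connection sketch in Python. It is not Satori's proxy: it omits concurrency, protocol parsing, and error handling, and the listen address, upstream host, and certificate paths are placeholders.

```python
# Minimal single-connection TLS-terminating relay (illustrative only).
# Client -> (TLS we terminate) -> this proxy -> (new TLS session) -> data store.
import socket
import ssl

LISTEN_ADDR = ("0.0.0.0", 15432)                    # where clients connect (placeholder)
UPSTREAM = ("datastore.internal.example", 5432)     # the real data store (placeholder)

# TLS context for the client-facing side, using the proxy's own certificate.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.load_cert_chain("proxy-cert.pem", "proxy-key.pem")  # placeholder paths

# TLS context for the upstream side, verifying the data store's certificate.
client_ctx = ssl.create_default_context()

with socket.create_server(LISTEN_ADDR) as listener:
    raw_client, _ = listener.accept()
    with server_ctx.wrap_socket(raw_client, server_side=True) as client_conn, \
         client_ctx.wrap_socket(socket.create_connection(UPSTREAM),
                                server_hostname=UPSTREAM[0]) as upstream_conn:
        # Relay one request/response exchange; a real proxy would pump both
        # directions concurrently and inspect queries/results in between.
        request = client_conn.recv(65536)
        # ... query analysis and policy enforcement would happen here ...
        upstream_conn.sendall(request)
        response = upstream_conn.recv(65536)
        # ... result-set classification would happen here ...
        client_conn.sendall(response)
```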
[00:17:38] Unknown:
Going back to what you were saying at the beginning about a lot of your background being in the area of cybersecurity, I'm curious if there are any other ways that that background has influenced your design and thinking about data governance and how to architect the platform to be secure and scalable?
[00:17:57] Unknown:
We have a very long background in cybersecurity and in building cybersecurity solutions. We brought our experience from our previous company building global-scale proxy networks and operating a SaaS service into Satori. Our background in cybersecurity also informed how we approach data governance, and the way we see it, there are many similarities between these two fields. I think the first is the need for deep visibility into the activity of the actors in the system. In cybersecurity, these could be hackers or bots. In data governance, these could be scripts or applications, or analysts, or even malicious insiders.
By providing deep visibility into how these actors operate in the environment, we give organizations the information they need to build and apply the right controls. The second thing that we bring from our background in cybersecurity is the need to provide operational value, not just to the team that bought the solution, but to adjacent teams as well. So, for example, if a Satori sale is being championed by the chief data officer because of a need to simplify the process of creating a data inventory, the privacy team can also benefit from the fact that they can now generate data access reports for compliance purposes.
And it's important to get that buy-in from all of the teams that are involved and provide value to all of them.
[00:19:39] Unknown:
In terms of the overall life cycle of data, Satori acts as a secure means of access control and cataloging and is able to do things like dynamic masking. I'm curious how it sits in the overall life cycle of data management, some of the other ways that it integrates with the broader data platform, and how users of Satori think about the product in the overall scope of their data governance strategy?
[00:20:07] Unknown:
So in an overall data governance strategy, Satori's role is to be the enforcement point for data access policies. The way we fulfill that role is by analyzing the queries, analyzing the results, and building all of that metadata so the policies can be well informed. What we don't do, and instead delegate to the environment, is authentication, as an example. Satori does not authenticate users. We rely on the existing authentication capabilities that are deployed in the environment, whether it's usernames and passwords or things like SAML or OAuth or LDAP, to do that job for us.
Other integrations that we have, which are very important, are with the identity provider systems. As I mentioned, Satori is not an authentication or user management solution, but user context is very important in order to enforce data access policies. So we connect to your Okta or your Active Directory or your PingOne, and we pull information about users, for example, which organizational groups they are part of or which roles they are assigned to, so we can use that as context, as input into our policy engine.
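As a hedged illustration of feeding identity-provider context into a policy engine, the sketch below pulls a user's groups and roles from a generic REST endpoint. The URL, token, and response shape are placeholder assumptions and do not correspond to an actual Okta, Active Directory, or PingOne API.

```python
# Illustrative only: pull a user's groups from a generic identity-provider REST
# API and hand them to a policy engine as attributes. The URL, token, and JSON
# shape are placeholders, not a real IdP's API.
import requests

IDP_BASE_URL = "https://idp.example.com/api"   # placeholder
IDP_TOKEN = "..."                               # placeholder credential

def user_attributes(user_email):
    """Fetch group memberships and roles for a user from the IdP."""
    resp = requests.get(
        f"{IDP_BASE_URL}/users/{user_email}",
        headers={"Authorization": f"Bearer {IDP_TOKEN}"},
        timeout=5,
    )
    resp.raise_for_status()
    profile = resp.json()
    return {
        "groups": set(profile.get("groups", [])),
        "roles": set(profile.get("roles", [])),
    }

# The attributes become input to the access-control policy, rather than the
# proxy authenticating the user itself, e.g. (hypothetical policy engine):
# attrs = user_attributes("analyst@example.com")
# decision = policy_engine.evaluate(user=attrs, resource=..., action="select")
```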
[00:21:30] Unknown:
And so in terms of the overall space of data governance, you mentioned at the beginning that a lot of the current challenges exist because governance policies and capabilities haven't evolved at the same rate as the underlying data platforms and the ways that data is being used. I'm wondering if you think that is simply because of how those two pieces have evolved, or if it's inherent to the problem, and what contributes so much to the complexity and to the fact that it is such an underserved market?
[00:22:04] Unknown:
So I think it's mostly how the industry has evolved. If you think about it, each database vendor operated in a silo, building their own access control systems, which led to a lot of inconsistency in how authorization is modeled and implemented in each platform. Even simple concepts like role-based access control are not implemented the same way in all of the systems. What we at Satori are envisioning is a different model, based on attribute-based access control instead of role-based access control, in which, instead of granting people to data, you grant data to people. What we mean by that is that, given a dataset, the data access policy should specify what data consumers need to do to get access to the data and the scope of the access that they get, rather than just relying on which Active Directory group they belong to and then giving them unrestricted access to the data.
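A minimal sketch of the attribute-based idea described here, assuming entirely hypothetical attribute names and policy rules, with group membership supplied by an identity provider and tags supplied by a classifier:

```python
# Illustrative ABAC check: the policy travels with the data ("grant data to
# people"), and the decision combines user, data, and request attributes.
# All attribute names and rules here are hypothetical, not Satori's policy model.

def can_access(user, resource, request):
    """Return (allowed, obligations) for a single data access request."""
    # Rule 1: PII-tagged data requires privacy training and an approved purpose.
    if "pii" in resource["tags"]:
        if not user["attributes"].get("privacy_training"):
            return False, []
        if request.get("purpose") not in resource.get("approved_purposes", []):
            return False, []
        # Allowed, but with a masking obligation for analysts.
        if "analysts" in user["groups"]:
            return True, ["mask:email", "mask:ssn"]
        return True, []
    # Rule 2: non-sensitive data is readable by any data-consumer group.
    if user["groups"] & {"analysts", "data-science", "engineering"}:
        return True, []
    return False, []

# Example request: group membership would come from the IdP integration
# mentioned above; tags would come from the classification engine.
user = {"groups": {"analysts"}, "attributes": {"privacy_training": True}}
resource = {"tags": {"pii"}, "approved_purposes": ["fraud-review"]}
request = {"purpose": "fraud-review"}

print(can_access(user, resource, request))  # -> (True, ['mask:email', 'mask:ssn'])
```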
[00:23:04] Unknown:
And I'm wondering if you have seen any trends in the governance space along the lines of what's happening in other areas of technology, of trying to standardize around common patterns and common interfaces for a particular problem domain so that different vendors and different open source products can innovate on the specifics of their internals while still making it simpler for the overall ecosystem to interoperate, without having all of these point-to-point solutions where you have an n-times-m problem, and instead coalescing around a default set of baseline capabilities, with maybe some additional protocol enhancements for specific use cases, particularly around things like attribute- or role-based access control?
[00:23:52] Unknown:
So, unfortunately, we haven't seen that yet. I really hope we can converge as an industry on something like that. Authentication, obviously, is pretty much standardized, and you would imagine that it would be implemented the same way in every platform. But since we operate in that space and have to deal with these complexities, we see different implementations of SAML and OAuth and other protocols in every type of data source. So, unfortunately, I'm not the bearer of good news in that area. I think the complexity of data access and permissions and policies is going to be with us for the foreseeable future.
[00:24:25] Unknown:
And another area where we could potentially see some common practices and common approaches might be in the labeling and tagging of data, to be able to easily identify whether something is PII and whether it falls under something like GDPR enforcement versus CCPA versus HIPAA and some of these various regulatory regimes, to make it easier to have prebuilt policy packs so that somebody can say, okay, now I need to be PCI compliant, so I need to pull in this pack of policies and apply them to the data that has these common labels. I'm wondering if you've seen anything like that.
[00:25:12] Unknown:
This is an interesting direction. And I think, to some extent, it's possible to provide organizations with templates of policies for how they can be compliant with different types of regulations. There's still a gap between the definition of the template and how it's implemented under the hood in different platforms. But, yeah, definitely, I think that adopting best practices is something that we're seeing happening, and we'll see more of that.
[00:25:42] Unknown:
And so for people who are interested in implementing Satori, what is the actual process for getting it deployed and integrated into their existing data platform?
[00:25:53] Unknown:
The first step is to pick one of your data stores. Basically, that should be the one that you want to get visibility into. Maybe it's your Snowflake or your Redshift or BigQuery. Once you register the host name of that data store with Satori, we provide you with an alternative host name to access the data store. And as mentioned before, that's the host name that routes traffic into our data access controller. From then on, every access via the new host name goes through Satori, and all the features and capabilities kick in.
If you want to move everyone to access the data store via Satori, that usually happens by updating your single sign-on app. You change the URL, and then you have everyone going through Satori. Once this happens, that's where the magic happens. The Satori dashboard and data inventory start filling up with information about your data access, and it's like flipping on a light switch. In fact, this is the meaning of the word Satori in Japanese, which is sudden enlightenment. And that is exactly the experience that we're looking to provide our customers with.
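From the data consumer's side, the change really is just the host name. As a hedged illustration, assuming a PostgreSQL-compatible data store accessed with psycopg2, with placeholder host names and credentials rather than anything Satori-specific, the connection might change like this:

```python
# Illustrative only: the same client library and credentials, with only the
# host swapped for a (hypothetical) Satori-generated host name.
import psycopg2

# Before: connecting directly to the data store.
direct = psycopg2.connect(
    host="analytics.example-corp.internal",        # placeholder
    port=5439, dbname="analytics", user="analyst", password="...",
    sslmode="require",
)

# After: same everything, pointed at the alternative host name so that
# traffic flows through the data access controller.
via_proxy = psycopg2.connect(
    host="analytics-example-corp.satori.example",  # placeholder
    port=5439, dbname="analytics", user="analyst", password="...",
    sslmode="require",
)
```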
[00:27:01] Unknown:
And in terms of users of Satori, what are some of the most interesting or innovative or unexpected ways that you've seen it being used?
[00:27:09] Unknown:
So one of the most impactful ways we've seen Satori being used is by a B2B2C marketing platform company, where PII of their customers' customers flows into their data stores in ways and places they cannot anticipate. They're using Satori to first make sure they have a good handle on where that PII is. And when new PII is introduced into the system, they can quickly get a handle on it, deploy access policies around it, and understand what's going on.
[00:27:45] Unknown:
And as far as being able to identify those new sources of PII, I'm curious what you have found to be useful heuristics for identifying these different patterns. I know that there are certain commonly structured data types such as credit card numbers or social security numbers and addresses, but it's also difficult to just do a brute-force regular expression approach or simple pattern matching, because it's possible for the column definitions to change or for some of that information to be split across multiple columns.
I'm wondering what you have found to be some of the challenges of automatic discovery of these types of records and some of the useful strategies that companies can employ to make sure that they don't fall between the gaps?
[00:28:34] Unknown:
What we do is a combination of three types of algorithms. We have dictionary-based classifiers for things like blood type, salutation, state codes, country codes, etcetera. We have pattern-based classifiers for things like email addresses, usernames, encrypted passwords, and so on. And for more complex data types that have a more free form, we have a set of machine learning based classifiers. When you combine all three, you get a pretty good handle on PII. The next level of complexity, after you do all that, is supporting PII in different languages, and that usually involves curating datasets in different languages, training sets, so the models can learn to identify PII in those languages.
Data classification has always been a tough problem to solve. I think that the combination of automating 80% of that work, maintaining a low rate of false positives, and providing customers with an easy interface to update or complement classification decisions with their own internal knowledge of the business and the data is the right approach. There's no silver bullet for that problem.
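As a rough sketch of the layered approach described here, dictionary lookup first, then patterns, then a model for free-form types, with the machine learning stage stubbed out and all dictionaries and patterns being illustrative rather than Satori's:

```python
# Layered value classification: dictionary lookup, then regex patterns,
# then a (stubbed) ML model for free-form types. Illustrative only.
import re

DICTIONARIES = {
    "blood_type": {"A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-"},
    "salutation": {"Mr", "Mrs", "Ms", "Dr", "Prof"},
}

PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def ml_classify(value):
    """Placeholder for a trained model handling free-form types (names,
    street addresses, and so on); it always abstains in this sketch."""
    return None

def classify(value):
    # Cheapest checks first; fall through to the model only when needed.
    for label, vocabulary in DICTIONARIES.items():
        if value in vocabulary:
            return label
    for label, pattern in PATTERNS.items():
        if pattern.match(value):
            return label
    return ml_classify(value)

for sample in ["AB+", "Dr", "jane.doe@example.com", "123-45-6789", "742 Evergreen Terrace"]:
    print(sample, "->", classify(sample))
```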
[00:30:03] Unknown:
In terms of your own experience of building and growing the Satori product and business, what are some of the most interesting or unexpected or challenging lessons that you've learned in that process?
[00:30:14] Unknown:
So, coming from a SaaS background, we have a deep appreciation for the benefits customers get from vendor-operated solutions. However, in the data domain, we expected most customers would want a self-hosted solution, so we designed our system to support both deployment options. It turns out that the factor determining the desired deployment option wasn't the fact that Satori is a data governance solution; it was the customer's propensity to deploy and operate vendor solutions or to consume the value as a service. And so today, we have about 50% of our customers on the SaaS solution and 50% hosting it themselves.
We expect to see more usage in the SaaS platform as we make more progress.
[00:30:59] Unknown:
And for people who are evaluating data governance options and understanding how they want to manage access control, what are the cases where Satori is the wrong choice?
[00:31:10] Unknown:
I think that Satori, quote unquote, would be the wrong choice if customers are only looking for data discovery and classification and are not really looking to audit data access or enforce policies. In those cases, other solutions might be a good fit.
[00:31:30] Unknown:
And as you look to the future of the product and the business, what are some of the plans that you have for the near to medium term?
[00:31:37] Unknown:
So one of the main concepts we are building right now and planning to release soon is data access workflows. For example, if a data analyst queries a dataset he or she has never queried before, Satori can block that query and instead send a Slack message to the user with a link where he or she can submit a request to access the data from the data owner. That basically takes the work that today is centralized on the data engineering team, with that 30% of time spent granting and revoking access to data, and distributes that load across the organization to the people who have the right context to approve those requests.
And we expect workflows to become a very important part of our story, and not just for us, but for the industry as a whole.
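A sketch of what such a workflow hook could look like, assuming a Slack incoming-webhook URL and an access-request portal URL that are both placeholders; this is not Satori's API, just an illustration of the block-and-notify flow:

```python
# Illustrative flow: block a first-time query against a dataset and notify the
# requester on Slack with a link to request access from the data owner.
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder
ACCESS_PORTAL_URL = "https://access.example.com/requests/new"          # placeholder

def handle_query(user_email, dataset, previously_accessed):
    """Return True if the query may proceed, False if it was blocked."""
    if dataset in previously_accessed.get(user_email, set()):
        return True  # the user already has a history with this dataset
    # Block the query and start the access-request workflow instead.
    requests.post(SLACK_WEBHOOK_URL, json={
        "text": (f"Your query against `{dataset}` was blocked. "
                 f"Request access from the data owner here: "
                 f"{ACCESS_PORTAL_URL}?dataset={dataset}&user={user_email}")
    }, timeout=5)
    return False

history = {"analyst@example.com": {"sales.orders"}}
print(handle_query("analyst@example.com", "billing.customers", history))  # False -> blocked
```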
[00:32:26] Unknown:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as a final question, I would like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:32:40] Unknown:
So I think the biggest gap used to be how to process, store, and query huge datasets. That is largely solved today. The next area of focus for the industry, and this is where Satori is focused, is how to use all that data in a way that is secure, compliant, responsible, and ethical. I believe there is still a lot of room for new and innovative ways to overcome these challenges, and we'll see a lot of these innovations coming out in the next few years.
[00:33:10] Unknown:
Well, thank you very much for taking the time today to join me and share the work that you're doing at Satori. It's definitely a very interesting approach to the problem of data governance and access control, which is, as we've discussed, a very important area and an area that still has a lot of room for innovation. So thank you for all the time and effort you're putting into that, and I hope you enjoy the rest of your day. You too. Thank you for having me, Tobias.
[00:33:38] Unknown:
Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction and Episode Overview
Guest Introduction: Yoav Cohen and Satori
Building Satori: Challenges and Solutions
Data Governance and Industry Practices
Satori's Approach to Data Masking and Use Cases
Satori's Architecture and Deployment
Proxy Approach vs. Other Systems
Cybersecurity Influence on Satori's Design
Satori's Role in Data Governance Strategy
Challenges in Data Governance Evolution
Implementing Satori: Steps and Integration
Innovative Uses of Satori
Lessons Learned in Building Satori
Future Plans for Satori
Biggest Gaps in Data Management Tooling
Closing Remarks