Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL

Hello, and welcome to the Data Engineering Podcast, the show about modern data management.

When you're ready to build your next pipeline and want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode.

With their managed Kubernetes platform, it's now even easier to deploy and scale your workflows or try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster.

With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform.

Go to dataengineeringpodcast.com/linode today. That's L-I-N-O-D-E, and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.

Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on.

The data you're looking for is already in your data warehouse and BI tools.

Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems.

No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for reverse ETL today.

Get started for free at dataengineeringpodcast.com/hightouch.

Your host is Tobias Macey. And today, I'm interviewing Rachel Bradley-Haas and Tejas Manohar about the combination of operational analytics and the customer data platform. So, Rachel, can you start by introducing yourself?

Yeah. Hi. My name is Rachel Bradley-Haas. I'm a cofounder of Big Time Data, which is a data consulting company that helps people build end-to-end data platforms all the way from collection to taking action on it.

Tejas, you've been on the show before, but for anybody who hasn't listened to that episode, can you introduce yourself as well?

Hey. I'm Tejas, one of the founders of a company called Hightouch that helps companies take action on top of the data in their data warehouse by moving it into systems that business teams use, like Salesforce, Marketo, Braze, or Facebook Ads.

And going back to you, Rachel, do you remember how you got involved in the area of data?

I like to say I'm the laziest developer there ever was. And because of that, I try to automate everything. And the best way to automate everything is to use data to make data-driven decisions. So, honestly, that's how I got into data, and ever since then, I've never looked back.

Tejas, how about you?

Yeah. So I actually got into data and the whole data space by joining Segment in January 2016. I was an early engineer at that company, and it's one of the leading players in the customer data platform space. I found out about Segment by actually being a customer of the service a couple years prior, when it was just a 5 to 10 person startup, and that's how I got introduced to the whole space.

It's always interesting hearing people put years to certain events because, you know, looking back, it seems like some of these services are either brand new or they've been here forever. Segment has been sort of ubiquitous, so it seems like it's been around for a long time, but 2016 seems like just a few short years ago.

Yeah. That's fair.

At the intro, I mentioned that we're talking about operational analytics and the customer data platform, and those are two concepts that seem to go hand in hand. But for people who aren't familiar with the overall idea of a customer data platform, can you just start by giving a bit of a definition of what that encompasses and some of the capabilities that it entails?

Yeah. For sure. So the idea behind a customer data platform is that it's a central database of customer information that's actionable. So I would say actionable is actually the key word there. The idea is that it's not just a database of customer information, but also a database that has features that allow you to actually move that information to different systems that might be used by business teams or marketing teams around your company, so that you can actually use that data to power customer experiences, whether it's affecting how a salesperson reaches out to a customer at a B2B company or affecting the actual content or targeting of a marketing campaign at a B2C company.

For some context on the overall space, CDP, or customer data platform, is a bit of a loaded term. Sometimes it refers to the off-the-shelf solutions that are called customer data platforms, like Segment or mParticle or Treasure Data. And sometimes companies call the internal systems that they build to help themselves better use customer data their customer data platform as well.

Yeah. I was just gonna say, the only thing that I think is really important, and that I continue to remind people that I work with every single day, is that we collect massive amounts of data, and there's so much money that goes into how you store your data, how you stream it, how you do all these things. And I think it's important to remember we do that for a reason: to take action on it. So if we're not putting it in a place that people who want to be strategic, like marketing, sales, growth, CS, all those teams, can act on, it's basically useless, in my opinion. So it's really important that you end up investing in the CDP to be able to do things with it. A lot of the time, we see data engineers processing all this data, streamlining it, doing everything with it. And it's like, well, if you're not producing it in a way that someone can use to make decisions, it's a waste of time and money, in my opinion.

And on that note of being able to structure it in a way that you can make decisions based on it, what are some of the challenges and complexities that organizations and engineers encounter when they're trying to build this system and establish a unified view of their customer interactions, all of the different communications that they might have with those customers, and the ways that their customers are engaging with them?

I think one of the most important things is having, and this is gonna sound like such a buzzword thing to say, an analytics engineer. And the reason I think that is because you need someone that can speak both languages. So you have people that are working with data engineers that have to understand the technology, how to build things scalably, how to not have a bunch of one-offs, but they also have to understand how that data is gonna be used and consumed. You know, being able to understand what is an MQL, how are people gonna do this lead enrichment, what are they doing with it, what is an outreach sequence, all those different things will impact how you model your data for performance, and you have to have someone that understands all of the specific caveats. What's an SDR versus a BDR? And all those things. And I'll tell you, not a lot of people are interested in being that, I would call it, translator. And so having someone that can be a translator between technical skills and how the business uses that data is so important. So it's not only the right tools but the right people, and then that whole process of how you standardize it to make it scalable. So I don't know if I answered what the difficult part is, but that's the overall strategy of how it needs to be approached. And I think the person that controls that integral component is the hardest person to find at any company.

Yeah. I totally agree with that. And I would say that the whole movement in analytics engineering and data tooling to allow you to do more with just SQL, instead of having to learn coding skills like Python or super technical products like Airflow, is really all to allow analytics and data professionals to focus more on the business problems and learning about those things than on a bunch of specific technical skills. So I think this wave of analytics engineering tooling just reduces the barrier to entry for solving problems like data modeling or data integration, and instead allows analytics and data professionals and analytics engineers to really focus on the tough problem, which is translating business requirements and business problems into the right technical solutions.

Yeah. I completely agree with that. Actually, one of the things I would just say is, like, the way that tools have enabled people to just click and drag and drop when getting data in, and to have a basic job run where you're not having to set up your own Airflow even for yourself. I mean, it is helpful when you're trying to do Python stuff, but when you're just talking about setting up dbt Cloud and setting up Stitch or Fivetran or anything like that, you all of a sudden have these tools where you don't need to be able to write custom API scripts to go and call and pull this data, and you can focus more on what business logic you need to build in to get the results you want, and have it be in a transparent way that's in a Git repository somewhere, so it's not hidden in views or in one-off things. Right? Like, nothing's worse than having business logic dispersed across different systems and not understanding where things come from. So I think that's been a huge change that makes our job a little bit more scalable.

And another interesting element of the concept of a customer data platform is the definition of what a customer might be, sort of how you think about engaging with those customers. And particularly, if you're in an organization that has multiple different product lines, like, what does it mean for them to be a customer? Is it a customer of this particular product, or is it a customer of the entire organization? And how do those different concepts and scalability complexities of understanding sort of how to segment those customers come into play when you're designing and building out these platforms.

I wanna confirm what you mean by, you know, different customers. So I'm just gonna give an example. You can tell me if it's right or wrong in terms of what you're thinking. So, personally, our company deals with a lot of

PLG growth, but also at the same time, enterprise customers that maybe signed up,

you know, like, immediately went to paid and child and never did anything. I worked at Heroku for a while, and we had all the way from freemium all the way to huge enterprise customers that were also going through Salesforce first before they became Heroku customers. And it makes it really difficult because you have potentially 2, 3 different sources of truth of who's paying for a product and whether or not you consider these free users

customers as well. Is that what you're talking about?

Yeah. Exactly. You might have a business, and the Heroku and Salesforce example is a great one, where, as you said, if they're a free user, are they a customer? If they came in through Salesforce, are they Heroku's customer, or are they Salesforce's customer? Or if they're using both, are they still the same customer, or do I have to count them separately? And what are some of the interesting complexities that arise as a result of those interplays within the organization and across different product lines?

Yeah. I mean, I think that's one of the things that's been really difficult. And so when I was at Heroku, we were lucky enough to have what I would consider the OG reverse ETL.

There was Heroku Connect, which was syncing between a Heroku Postgres database and Salesforce, and we would have never been able to manage the freemium to enterprise or enterprise back to freemium motion without a lot of that automation in place. It's one of the reasons why I am so passionate about reverse ETL, because I saw the power of it very early on, back in 2016, when Tejas was at Segment. And one of the things that's really powerful about it is that it allows you to build how you want to surface that in a sales tool like Salesforce. And so because you have this ability to make these complex decisions in a separate tool and then mirror it into a system in a more standardized way, that's how we were able to handle those different things. You can choose when and how you decide what a customer is. Do they have to have an important action in the product, even though they're a free customer, to be considered a customer worthy of being in Salesforce, and things like that. So that's really how we've managed it, but I'm sure Tejas has a different approach as well.

Yeah. I would largely agree with that. And I think the warehouse is actually the best place to answer some of these questions, like what a customer is, who a customer is, what a customer is across different platforms and different communication channels and different data sources. A lot of companies are looking for a silver bullet when it comes to identity resolution for customers, or entity resolution, or building a single view of the customer. But in reality, I find that most companies outgrow these generic solutions very quickly and need to build their own SQL queries and formulas to establish what a customer is inside of their data warehouse. And a data warehouse that allows you to query data any way you want with the power of SQL is really the only solution that's flexible enough to adapt to the needs of companies. The other thing I would mention is that it's not just about having multiple product lines, as in the case of a company like Heroku that's now owned by Salesforce. Even if you just have different data sources flowing into your warehouse, like data from an analytics system like Segment, data coming in from an ad system reporting on your ad performance like Facebook Ads, or data coming in from a webinar system, you might still need to do some basic identity resolution to merge the data between all of these different systems. And a data warehouse, where you can join in SQL and build your own queries and transformations, is really the place that allows you to iterate on this definition of what a customer is over time, freely, as you see fit for your business. So I think a lot of companies are looking for a silver bullet here when there really is not one. What you wanna opt for instead is the flexibility to iterate freely and continuously on the definition of what a customer is.

Digging more into the technical and operational aspects of the customer data platform, you mentioned data warehouses a few times. And the introduction of cloud data warehouses has definitely brought in a new wave of interest in how to use these systems, even though business intelligence and data warehouses are things that we've been using for decades. And I'm wondering if you can just talk to some of the ways that a customer data platform is distinct or disjoint from just having a data warehouse and a BI dashboard to understand the interactions with your business across your customers?
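To make the warehouse-based identity resolution described above a little more concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for a cloud data warehouse. The table names, columns, sample rows, and the rule that a paid plan makes someone a customer are all hypothetical and chosen only for illustration; a real business would iterate on that definition, which is the point being made in the conversation.

```python
import sqlite3

# Hypothetical sample rows standing in for three sources that land in a warehouse:
# product analytics, an ad platform, and a webinar tool.
SAMPLE_DATA = """
CREATE TABLE product_users (user_id TEXT, email TEXT, plan TEXT);
CREATE TABLE ad_leads (email TEXT, campaign TEXT);
CREATE TABLE webinar_attendees (email TEXT, webinar TEXT);

INSERT INTO product_users VALUES
    ('u1', 'Ada@Example.com', 'paid'),
    ('u2', 'bob@example.com', 'free');
INSERT INTO ad_leads VALUES
    ('ada@example.com ', 'spring_promo'),
    ('cy@example.com', 'spring_promo');
INSERT INTO webinar_attendees VALUES
    ('BOB@example.com', 'intro_webinar');
"""

# Basic identity resolution: normalize the email into one shared key,
# then join every source onto that key. The is_customer rule is an
# assumption and is the part a data team would change over time.
CUSTOMER_VIEW_SQL = """
WITH identities AS (
    SELECT lower(trim(email)) AS email FROM product_users
    UNION
    SELECT lower(trim(email)) FROM ad_leads
    UNION
    SELECT lower(trim(email)) FROM webinar_attendees
)
SELECT
    i.email,
    p.user_id,
    p.plan,
    a.campaign,
    w.webinar,
    CASE WHEN p.plan = 'paid' THEN 1 ELSE 0 END AS is_customer
FROM identities AS i
LEFT JOIN product_users AS p ON lower(trim(p.email)) = i.email
LEFT JOIN ad_leads AS a ON lower(trim(a.email)) = i.email
LEFT JOIN webinar_attendees AS w ON lower(trim(w.email)) = i.email
ORDER BY i.email
"""


def load_sample_data(conn: sqlite3.Connection) -> None:
    """Create the hypothetical source tables and insert the sample rows."""
    conn.executescript(SAMPLE_DATA)


def resolve_customers(conn: sqlite3.Connection) -> list:
    """Return one row per normalized email across all three sources."""
    return conn.execute(CUSTOMER_VIEW_SQL).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_sample_data(connection)
    for row in resolve_customers(connection):
        print(row)
```

Because the definition lives in one SQL query, changing what counts as a customer (for example, also counting free users who took some important action) means editing the CASE expression rather than reconfiguring a generic tool.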

It's an interesting question. I think CDP is such a conflated term that it's hard to answer generically for how all of us are thinking about it. But, really, what I saw at Segment was that CDPs and marketing tech solutions were actually some of the earliest companies to adopt cloud data warehouse technology like Snowflake and BigQuery. Actually, a lot of Snowflake's early customers were advertising tech and marketing tech companies. At Segment, we were heavily using BigQuery, before a lot of our customers had adopted BigQuery, to power a lot of our CDP features on the back end that allowed marketers to slice and dice data, move it to different systems, build an identity of a customer, etcetera.

Something that's been interesting is that, originally, these cloud data warehouse solutions and the most modern data warehousing technology that we use today were often used by these marketing tech and data vendors inside of their own proprietary products. But what's happened over the last 5 to 6 years is that every company has wanted to invest in data analytics and data engineering and data warehousing and BI internally, and every company is building their own data warehouse that actually represents a source of truth for information across all the different data sources of a business.

So, originally, companies didn't even have data in a central place, so they had to first look towards solutions in the market like CDPs that helped you collect data, transform it, manage it, and then sync it to other places. Now, if you look at most companies, they already have a data warehouse as well as tooling that helps you get data into it, build models inside of it, and report on it in BI. And the real last problem that people are trying to solve when it comes to a customer data platform is how do you activate that data, or how do you use it for marketing, for sales, and for different operations of your business? So I think if we were to have built reverse ETL and operational analytics and Hightouch 5 or 6 years ago, it would have technically worked, but not enough companies would have had the prerequisites, like having all their data in a data warehouse and having clean data models in a data warehouse, for it to be useful for them. But if we fast forward, CDPs didn't really grow as fast as other technologies in the whole space, like Snowflake, like BigQuery, like dbt. And what we're seeing is plenty of demand from customers to turn their data warehouse into a live customer data platform that not only influences analytics, but also influences the operations around a business.

In my opinion, I think of the data warehouse as being, like, the base foundation, if you do it right, for a CDP. Right? So you can turn your warehouse into a CDP if you have the right tools, but that's why I've always called it a data warehouse and not a CDP, because I think it can sometimes get confused with a lot of off-the-shelf things, which I feel only do 80% of what you need them to do. So being able to do something in house where you combine different tools and get 100% of what you need, realistically 95%, but we'll just say 100%, that's kind of been my opinion on why I call it a data warehouse versus a CDP.

You mentioned the activation of the data, and we've mentioned the terms reverse ETL and operational analytics a few times. And this is a trend that seems to be going hand in hand with the growth of cloud data warehouses and the focus on using them for customer data platforms. And I'm wondering if you can talk to some of the semantics between the initial term of reverse ETL and the now more widely used term of operational analytics, some of the ways that that evolution of terms reflects the evolution of the ecosystem and the ways that it's being used, and what you think is the relative importance of reverse ETL versus operational analytics and its relation to this idea of the customer data platform?

Yeah. This is definitely a tough one. I think at Hightouch, we've been monitoring all the terms pretty closely to see which ones customers are using more. So I have a little bit of a quantitative answer here, but reverse ETL has been growing a lot faster than operational analytics when it comes to what people are searching on Google and stuff like that. Operational analytics does have, I think, more or an equivalent amount of searches per month, for example, which I think is a pretty good indicator for what term is picking up. But the reason it has a lot of searches per month is that it already had a lot of searches per month before companies were using the term in the context of reverse ETL or data warehouses, because operational analytics also means a lot of other things, like analytics on your business's operations. So, personally, I'm not a huge fan of the term operational analytics. I think it's just like customer data platform: it's a bit too generic and confusing for some customers. For example, I have a friend who's a former customer of ours. Ed Cloudby now works at FanDuel, and he's the manager of operational analytics there. When I was chatting with him, it turned out that that was just analytics on FanDuel's operations and had nothing to do with operational analytics in the way we use it at Hightouch. So I would say the distinction is really that reverse ETL is a specific technical process of moving data from the data warehouse into these business tools, and it's a very specific way to solve the problem of making data self-service or of allowing companies to activate their data. And then operational analytics is more just the general idea of putting your analytics to work and using all the work you've done in analytics for the live operations of your business as well. But personally, I'm much more a fan of terms like data activation or activating data or operationalizing data than operational analytics, just because I think it can be confused with other things.

I completely agree. I think the reason why I don't love reverse ETL is because it's just so much more than that, the way I view it. If we were to put a price on things, right? So, say you're doing typical ETL and you're bringing billions of records into your warehouse, but granularly, none of those are really that important. You have to model them, understand how they relate to your overall customer journey, which customers they are, are they in your CRM system, do we care about them at all. And then this very, very high value result is what's being sent somewhere else. So it's like, oh, come on. We gotta give this reverse ETL more power. We can't just say it's the exact opposite of ETL, because, no, you're sending these high value data points to different tools that someone's going to act on and do something with, versus ETL, where it's like, I'm just gonna hit an API endpoint and get data in. So I don't love reverse ETL, but I agree it's kind of confusing to call it operational analytics because there are a bunch of people that really do analytics on operations. So it's like, what does that mean?

But I think one of the biggest things that's been really interesting is that there are different people that own different components of the business. We'll just take Salesforce for example. I don't want to have to update my code every single time someone wants to change a process in Salesforce. Sales owns their own process. They own what they wanna do with the lead, when it becomes an MQL, all those things. I don't wanna have to be changing code on my end. So what's important is I say, here's the valuable data that you can build a process off of and take action on. You own the definition of how you wanna take action on it, but I can surface you the source of truth of these customers. And so I think it's really important that what reverse ETL tools do is allow people that know their area of the business to act on a single source of truth in a scalable way. And so that's why I think it deserves more than reverse ETL, but I don't love operational analytics.
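The reverse ETL process described in this exchange, reading modeled rows out of the warehouse with a SQL query and mapping them onto fields in a business tool, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not how Hightouch or any other vendor implements it: the in-memory SQLite database stands in for a warehouse, the destination field names in `FIELD_MAPPING` are made up, and `send` is a hypothetical callback standing in for a call to a SaaS tool's API.

```python
import sqlite3
from typing import Any, Callable

# Hypothetical mapping from warehouse column names to destination field names.
FIELD_MAPPING = {
    "email": "Email",
    "plan": "Plan__c",
    "lifetime_value": "Lifetime_Value__c",
}


def extract_rows(conn: sqlite3.Connection, query: str) -> list[dict[str, Any]]:
    """Run the model query and return each row as a column-name -> value dict."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def map_row(row: dict[str, Any], mapping: dict[str, str]) -> dict[str, Any]:
    """Rename warehouse columns to the field names the destination expects."""
    return {destination: row[source] for source, destination in mapping.items()}


def sync(
    conn: sqlite3.Connection,
    query: str,
    mapping: dict[str, str],
    send: Callable[[dict[str, Any]], None],
) -> int:
    """Extract modeled rows, map them, and hand each record to the destination."""
    rows = extract_rows(conn, query)
    for row in rows:
        send(map_row(row, mapping))
    return len(rows)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE customers (email TEXT, plan TEXT, lifetime_value REAL)"
    )
    warehouse.execute(
        "INSERT INTO customers VALUES ('ada@example.com', 'paid', 1200.0)"
    )
    outbox = []  # stands in for the business tool receiving the records
    sync(
        warehouse,
        "SELECT email, plan, lifetime_value FROM customers WHERE plan = 'paid'",
        FIELD_MAPPING,
        outbox.append,
    )
    print(outbox)
```

Keeping the query and the field mapping separate from the delivery step mirrors the division of ownership described above: the data team owns the model that defines the source of truth, while the team that owns the destination decides what process to build on top of the synced fields.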

Yeah. I think terms are tough. We're trying to push Yeah. Yeah.

I'm not gonna lie. I had to Google some of the terms on here because I was like, is it what I think it means? I was just like, oh. Like, I know this space and I can speak to it, but, you know, when people coin terms, you're just like, am I thinking the same thing as what they're thinking? So, like, with CDP, I always think Segment. And to say that Segment is this all-powerful thing that's gonna fuel these things is like, no. Bringing Segment data in and modeling it with a bunch of other stuff is the way I view the evolution of the data platform. And so when people say CDP, I'm like, I don't want just a CDP, that doesn't solve all my issues. But the CDP we're talking about here is a combination of all these tools and being able to act on customer data.

Yeah. I think what's tough is when some of these vendors get sufficiently large, they pick a generic term that can encompass all the product surface area that they'd ever want to build, like customer data platform. I can't imagine a term that's more generic than that.

I don't know. What about CRM, customer relationship management?

Fair enough. Fair enough. I can't tell which one's more generic, honestly. That's how you know when you've made it. It's like you just own the most generic term and everyone thinks of you. Yeah. Maybe the next evolution will just be customer platforms.

Yep.

Yeah. Another gripe with the idea of reverse ETL as a term came up in a conversation I was having recently, which is that ETL as a discipline has no implied directionality. So, you know, calling it reverse ETL is kind of pointless because there was no direction for it to be pointing in in the first place.

Yeah. I think the only thing I really like about the term reverse ETL is that, despite it maybe not making sense, it does immediately click for a lot of customers, and a lot of customers actually think of it that way. Like, I remember, originally, we didn't want to adopt the term as a vendor. It was something that we heard in communities. It was something that customers would say to us, but we didn't really wanna adopt it because it felt kind of lame and too specific. But then we realized that it's rare to be able to start a company and, within a year or 2, have a term that can be widely linked to your company. So we decided to just go with it. I mean, customers were literally asking us on calls, so it's like the reverse of ETL, or the reverse of Fivetran, or the reverse of Stitch? So it was just inevitable that we had to adopt the term.

Yeah. It's interesting because I've actually heard people say now that Stitch and Fivetran shouldn't be ETL. They should just be EL, and then dbt is the T. But I do think in your specific example, you do have a T because you're taking this raw data and transforming it in a way that it needs to be consumed. I honestly think you all have the harder job than Stitch and Fivetran because you have to deal with all the errors that come back the other way, or changing models or whatnot. So you all do reverse ETL because you have to extract it and then transform it and load it, whereas the Stitches and Fivetrans are actually just doing the E and L.

Yeah. And in my opinion, the other really hard part about reverse ETL that's not really conveyed in such a technical name is that you're really not just building a platform to move data points around, but a platform that's cross functional, that allows data teams, the technical folks, to also collaborate with other folks in the business, whether it's sales ops or marketing ops or marketers directly. And that's really a tough design problem. I mean, it's something that we've thought super deeply about at Hightouch, like creating parts of the app where you manage data models that have technical features like integrations with dbt, and then having separate parts of the app that can be consumed by people who are used to managing Salesforce or Facebook Ads, etcetera. It's really understanding both personas that's a challenge, versus if you look at a product like Fivetran or Stitch, I think sometimes they're even calling themselves just data replication instead of ETL. It's really just replicating data points into the warehouse. The technical user can do whatever they want with it, but there's not a huge design problem in how you express the business workflows.

Yeah. I'll just add on that. I'm not gonna name names, but I actually know 2 of your competitors. And one went really far the technical route with the UI, and one went way more the operational analytics way. And one was way too technical for me to use, and one was way too bland for me to use. And so I think it's a happy medium when you can support both personas, because you really do have to make data engineers feel comfortable connecting their CDP, warehouse, whatever you wanna call it, and then letting a marketing person access it and have some freedom. It's a little bit nerve-racking because, at the end of the day, if a mass email goes out, it's not gonna be the marketing person that gets blamed for the wrong data points.

That's true.

Yeah. In my opinion, I think a lot of the innovation in the space will come

not just from kind of making it easier to build a lot of integrations, but also how in in terms of product features that make it really easy to hand off between the data team and the business team. So I think that collaboration layer is really the biggest area of opportunity for any company in reverse ETL, operational analytics, or customer data pod. Absolutely.

And the whole space of reverse ETL or whatever term we decide to cement on as the time goes by.

It's only about a year or 2 old in terms of sort of as a product category. I mean, people have been doing it forever, I'm sure. But as a distinct product category, it's relatively recent. And I'm wondering what have been some of the

changes in terms of the focus

for you at Hightouch, for the industry, and for end users of the product as far as

who are the target users, how are they using it, how are they communicating about it within the organization, and sort of what are some of the interesting evolutions that have happened over those past year or 2? I think

1 thing on our end is it's become a lot easier to our services at our platform at Hyatt. It's because customers now come to us with knowledge of what we're doing beforehand, which is a really new experience compared to, like, let's say, a year or a year and a half ago when we had to explain everything from scratch in terms of our whole approach when we talked to customers because reverse ETL was just, as you mentioned, not a concept that people were familiar with. 1 of the the big challenges that's rather obvious when it comes to building a reverse ETL platform is, initially, when we started building the platform, we had a few key use cases that

platform is initially when we started building the platform, we had a few key use cases in mind. You know, we'd help enrich systems like Salesforce for b to b companies, and we'd help enrich marketing and advertising platforms for b to c companies. But what we found in serving the data teams is data teams are so cross functional

and in ways that we really didn't imagine, honestly. We've started building integrations to finance systems like NetSuite. We have requests to build integrations with systems like Anaplan and

SAP.

We've built integrations

with systems like Zendesk or Intercom on the support side, and and even things all the way down to just, like, everyday business workflow tools like Slack and Asana.

I really think it'll be interesting to see how this space plays out and if there's certain companies that focus on doing reverse ETL for certain types of business workflows that take off kind of vertical specific

operational analytics or reverse ETL providers. At Itouch, we're going broad, and we're really focusing on

enabling the analytics engineer or the person who thinks both about business problems as well as about

SQL and data modeling, etcetera,

to deeply solve and address the needs of their business users no matter what team those business users are on. So we're addressing sales things, marketing things, finance things, customer success, and kind of all across the board. But it'll be interesting to see how that plays out. I personally think the biggest area of opportunity, and what's really gonna be necessary to solve this problem throughout the ecosystem, is to have, like, a really good interface for the business users to be able to collaborate with the data users all through the reverse ETL platforms. So it's a complex problem, and it's not 1 that'll be solved in 1 year, but we've started chipping away at it at Hightouch by building certain vertical applications on top of the reverse ETL platform. For example, our audiences product, which kind of allows marketers to come into the Hightouch app and select different subsets of customers to sync to their marketing tools on top of data models that are predefined

by the data team. And I think a lot of the innovation we'll see in the reverse ETL platforms will just be more and more of these vertical applications that allow business users to safely and effectively

get their hands closer to the data layer without having to build definitions or SQL queries from scratch.

When you're talking about being able to integrate with things like Asana or Trello, it, you know, brought to mind things like Zapier, which is a tool that these teams would have used, you know, maybe a couple of years ago to be able to link together all their different workflows.

And I'm wondering what you have seen in organizations that you're working with as far as the kind of relative popularity of these point to point evented workflows versus the hub and spoke model that Hightouch and the reverse ETL platforms enable?

Yeah. So, honestly, we still see a lot of that. Actually,

a lot of our customers end up starting with kind of an event based

point integration platform like a Zapier or Workato or Tray.io. We kind of see all of these systems

pretty frequently at Hightouch. And then when they realize they need to operate on a more full view of the customer in order to build the workflow they're actually looking to build, that's when they realize they wanna tap into the data warehouse. And sometime along that chain, they discover that there are reverse ETL platforms like Hightouch that allow them to more easily do that. So I think that's a huge part of the market and a huge part of the story. We see a lot of our customers using those systems, and oftentimes we think of Hightouch as Zapier for customer data. And because it's for customer data, naturally we're built on top of the data warehouse. I really see that evolving as the source of truth for customer data across all types of companies and across all maturity levels of companies. Not to say there's not good use cases for tools like Zapier as well. There are types of data integration problems or business workflow problems that just don't make sense or don't have a clear advantage to go through the data warehouse. Like, we send things like calendar schedules on Calendly into our Slack via Zapier, and I see no reason to do that sort of thing via Hightouch. It doesn't relate to a bunch of data across a bunch of different systems or kind of source of truth models. It's just plugging a few things together. Or at Segment, we plugged Jira into Google Sheets to do some more planning,

and there's really no reason to do that in a data warehouse either. But when it comes to workflows that are really thinking about customer data and your source of truth for customer data, I think those will end up gravitating towards the warehouse.

The other interesting thing is the sort of link between having all of this customer information in your data warehouse and this, you know, apocryphal CDP

and the rise of the sort of reverse ETL category. And I'm wondering what you see as the viability

of, you know, 1 existing without the other, whether it's using reverse ETL without having the CDP, just being able to, you know, maybe aggregate across multiple systems instead of having it all in 1 place or using a CDP without then having the reverse ETL to populate information back out into the systems that you harvested it from? I can see a world in which a CDP could exist without reverse ETL because, obviously, that has happened. I don't think it would be a very useful CDP. It'd be a lot of manual processes, and you'd basically just be building dashboards on top of it in a visualization tool, then downloading it and then manually loading it somewhere else. Or you would have a lot of 1 off scripting, which is kinda how

it was before, where you're making

random API calls to Salesforce, getting errors back, not knowing how to handle a lot of different things. I would say,

in its own way,

it's kind of like how Snowflake makes it so you don't really need a DBA anymore. It's like all these errors that you have to deal with and all the APIs you have to custom learn to be able to write to these 3rd party tools, Hightouch does for you and reverse ETL does for you. So it's like you're able to kind of

act on it a lot quicker and you need less headcount, in my opinion.

But I still think it's possible to have a CDP without reverse ETL. Like I said, I just don't think you'd be getting your money's worth.

In terms of a reverse ETL without CDP,

I don't know where they'd be pulling their data from, maybe Google Sheets, but I know that that is something that's a source. But I don't think, once again, you'd be getting a lot of value out of the reverse ETL tool if you're not centralizing everything in 1 place. 1 thing I would say is

I really love the new thing around how everyone's saying, oh, warehouse first approach. Right? And so I think the warehouse first approach is really what we're kind of considering the CDP here, where it's like get everything into your warehouse,

centralize it, then send those valuable events.

What is difficult is if you basically are sending the same data to a bunch of different sources using reverse ETL but you never have it centralized, then you're gonna have data flowing into Salesforce from 5 different places, maybe Zendesk, Jira,

product data, something else. Right? And then you're gonna clog these individual systems

with irrelevant data and having to replicate the same business logic in each of these different tools.

And, inevitably, your copy and pasting is gonna break. You're gonna have different data, different places. So

I think either one's a little bit janky if you do 1 without the other. Yeah. I think the market has really been saying that, basically, the data warehouse is the new CDP. I mean, the toughest problem in CDP

is centralizing all your data in 1 place. Segment is 1 of the leading, like, off the shelf kind of CDP providers, and at Segment, we would always say, you know, in the future, all companies are going to be having a first class feature to send data into Segment.

If we look at where the market is now, that didn't really happen. Instead, what's really happened is that every company is either publishing or kind of sponsoring

some sort of first class way to get their data into a data warehouse instead. Braze, for example, has a native connection to data warehouses.

Fivetran, for example, has

hundreds of connectors to replicate data from different source systems into data warehouses. And it's become a lot easier to centralize data in a data warehouse than it is to centralize data in any other sort of proprietary platform. And I think that alone is the core reason why the data warehouse will become the customer data platform.

And really what's missing from that stack is a good standard way to take action and activate data, and that's what we aim to be at Hightouch with reverse ETL as, like, 1 of the core underlying processes for how we do that.

Struggling with broken pipelines,

stale dashboards,

missing data?

If this resonates with you, you're not alone.

Data engineers struggling with unreliable data need to look no further than Monte Carlo, the world's 1st end to end, fully automated data observability platform.

In the same way that application performance monitoring ensures reliable software and keeps application downtime at bay, Monte Carlo solves the costly problem with broken data pipelines.

Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, ETL and business intelligence reducing the time to detection and resolution from weeks or days to just minutes.

Start trusting your data with Monte Carlo today. Visit dataengineeringpodcast.com/impact

today to save your spot at Impact, the Data Observability

Summit, a half day virtual event featuring the 1st US chief data scientist, the founder of the data mesh, the creator of Apache Airflow and more data pioneers

spearheading some of the biggest movements in data.

The first 50 people who RSVP will be entered to win an Oculus Quest 2.

Over the sort of medium to long term, I'm wondering what you see as the opportunities

and new capabilities that people will be asking for for this space of reverse ETL and customer data platforms to be able to

more effectively

close the loop of information and interactions with customers and maybe some of the

additional

sort of industries or verticals that this pattern can be applied to? 1 of the things that I think is missing

in general, across the entire data platform, is a lot stronger orchestration

in terms of when data's coming in, when it's being modeled, when it's being sent out. There are some very,

I guess, basic things that you can do when you're using, you know gosh. I'm blanking on it. Airflow. But I think in general, understanding

this is when your data's coming in. You have the most up to date you will ever have of this customer's data,

and then this is how it should be modeled. All of your dependencies, which is great. Like, once you kick off, you know, a dbt job or something like that, it's really great because the dependencies run 1 after the other. And then some of the reverse ETL tools have the ability to say, okay. I'm gonna kick off this job and then send this data other places. But there's a whole feedback loop of, like, when you send it to a third party tool, when it's acted on by a customer, when it syncs back into your data warehouse. So it's this whole cycle of data,

of sending, interacting with a customer, and receiving feedback. So I think a way to strongly relate all these feedback loops and streamline them. I don't necessarily think that something like full on streaming is necessary in this industry right now. And I'm only gonna say that because until every single tool does streaming, you're only as fast as your slowest tool.

And I guarantee, not everyone's gonna be streaming in the next 10 years. So I think 1 of the biggest things is that you just need to fine tune how quickly, how often you can get that data in for the right price

and making sure that you're not being wasteful. Right? So you don't need to be streaming, or syncing every 5 minutes, certain yearly financial data. But that whole data orchestration of making it move as quickly as possible allows

you to reach out to your customers, make a quicker impact, and so on. That totally makes sense, and that's something we're thinking pretty deeply about. And another area which I think is a really big opportunity for reverse ETL platforms and just for every platform in the space is right now a lot of the platforms have features to do alerts in tools like, you know, Datadog or PagerDuty

when a sync fails, and that's kind of table stakes in my opinion. But to really take it a step further, I think there needs to be a really good workflow around

preempting failures. So, you know, kind of alerting the user that, hey, this sync is probably gonna fail because of some misconfiguration

or something that you're defining in the sync that's invalid with the way the destination system works, or giving technical users or even semi technical users a good interface to write tests or assertions for the syncs or kind of test individual rows or a few batches of rows before running the whole sync across your millions of records in your data warehouse. So we built some interfaces for this sort of stuff, like test row and stuff like that. But I think there's a ton of opportunity here, and I think the ecosystem and all the vendors are kind of far from figuring out the ideal solution here, but it's a really large problem. I mean, as you make it more accessible to do

large data transfers and large data integration work, you also need to make it safer so that as people who aren't used to writing scripts and tests to do all these things start performing these processes, they don't mess anything up as well. As Rachel mentioned earlier, when the marketing email goes out to the wrong people, it's not gonna be the marketer's fault. It'll be the data person's fault, and I think there's a lot of innovation to be done in the whole reverse ETL tooling to really prevent issues like that. I actually think that brings up a really good point. Actually, 2 good points. 1 is audit logs. Right? So I think 1 thing that's really great is that you can kind of see where data is being sent, when and why. And

not to say that people point fingers, but it's really nice to know that you have protection to be able to say, this is why this data point was sent. This is the underlying business logic. We're not making this up. We didn't randomly decide to send the wrong data somewhere. Right? So, like, being able to

I will tell you, creating an end to end staging environment for the new complexity of, like, CDP end to end data platform is so much work, and it's almost a full time job keeping it up and running. And then it's like even creating a staging environment you need a staging environment for your staging environment to make sure you don't screw up your staging environment. And it's like, how do you get all this data coming

in and having to keep up to date without paying for basically 2 entire data platforms, but know that when you're running it, it's the right data to be testing it. So then you got, like,

Salesforce sandbox coming in, then you have to have a separate staging environment in your Snowflake, then you have to have dbt

running a staging environment, then you have to have it sync to a sandbox, and then when it's ready to go live,

it's so much work to push it to production. Right? Like, that's so much work, but it's so important when you're engaging with your customers. So having something that could potentially streamline

the whole

end to end staging to production for this, like, very complex system

would be worth a lot of money. So anybody listening, go and start that company. I'd be happy to use it. Yeah. I agree. I think the way that I think about it is, like,

vendors like us, for example, should be thinking about what would the

absolute, you know, best in class engineering team that has all the time in the world

build in terms of both tooling and processes

if they wanted to build a great sync of data between the data warehouse

and x y z destination systems around a company.

Things like, how do you test rows? How do you alert when there's failures? How do you start us all working? How do you run tests before you make a change to the pipeline? How do you think about staging and production? And then the tooling should provide a really good way to do all of this stuff without thinking about every single detail, with ideally just SQL and some settings. And that's the biggest design problem that exists, both the technical design problem and just the user experience, user interface design problem. That's 1 of the reasons I think that reverse ETL and operational analytics is actually just so much harder than data replication and normal ETL, because the end to end requirements in an ETL platform are pretty simple. You just drop the data in the warehouse, and now you don't even have to transform it because people can go transform it with dbt. When it comes to reverse ETL, there's just so much more that can go wrong and so much more potential on what can go right as well. On the topic of sort of data quality

and the other big trend that's happening in the past couple of years of sort of data lineage tracking and open metadata

and being able to propagate this across all these different systems where in, you know, business intelligence dashboards, they're starting to have data quality indicators so that when you view a chart, you can see this is when the data was last updated.

You know, maybe the data quality check upstream failed, so you should take this chart with a grain of salt. I'm wondering what you're seeing as the potential or any activity that's happening in some of these operational systems, whether it's HubSpot or Zendesk or Salesforce, to be able to

expose those quality indicators now that you are feeding all of this information from an automated platform to be able to say, you know, this sync was only partially completed, so, you know, this record might not be fully up to date or something like that, and then being able to track that back into the system, like, Hightouch that is actually performing these replications.

Yeah. I think that's a great idea and something I've been kind of noodling on. Don't know the exact right solution, but it would be amazing if you could go into a platform like Salesforce and see when the last time, you know, this deal was actually updated, what the definition of it is with, like, a link out to a platform like Hightouch or your metadata system that can actually tell you that,

whether the value is not up to date due to some upstream data pipeline failure. And I think all the kind of prerequisite steps to being able to expose this metadata are in the works. You know, dbt

is trying to track dependencies of dbt models outside of the scope of the dbt project with, you know, their exposures feature.

There's a lot of features they're working on that you can see online, like the metadata tiles that can tell you whether a certain model or transformation step has failed, and then, you know, Hightouch can know whether, thus, the data in Salesforce is not up to date yet because of something upstream.
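As a rough illustration of the idea above, here is a minimal Python sketch of how an upstream model failure or a partial sync could be turned into a single quality indicator shown next to a record in a destination system. The `ModelRun` and `SyncRun` shapes, the `quality_indicator` function, and the 24 hour threshold are all hypothetical names and values chosen for this example. They are not the API of dbt, Hightouch, or Salesforce.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ModelRun:
    """Most recent run of the upstream warehouse model (hypothetical shape)."""
    model: str
    succeeded: bool
    finished_at: datetime


@dataclass
class SyncRun:
    """Most recent sync of that model into a destination (hypothetical shape)."""
    rows_attempted: int
    rows_failed: int
    finished_at: datetime


def quality_indicator(model_run: ModelRun, sync_run: SyncRun, now: datetime,
                      max_age: timedelta = timedelta(hours=24)) -> str:
    """Collapse upstream and sync status into one label for business users."""
    if not model_run.succeeded:
        # The transformation failed, so whatever was synced is based on old data.
        return "stale: upstream model failed"
    if sync_run.rows_failed > 0:
        # The sync ran, but some records may not reflect the latest values.
        return "partial: some rows failed to sync"
    if now - sync_run.finished_at > max_age:
        return "stale: last sync too old"
    return "fresh"


now = datetime(2021, 9, 2, 12, 0)
healthy_model = ModelRun("accounts", True, now - timedelta(hours=2))
failed_model = ModelRun("accounts", False, now - timedelta(hours=2))
clean_sync = SyncRun(rows_attempted=100, rows_failed=0, finished_at=now - timedelta(hours=1))
partial_sync = SyncRun(rows_attempted=100, rows_failed=3, finished_at=now - timedelta(hours=1))

print(quality_indicator(healthy_model, clean_sync, now))
print(quality_indicator(failed_model, clean_sync, now))
print(quality_indicator(healthy_model, partial_sync, now))
```

The ordering of the checks is a design choice in this sketch: an upstream failure is reported first because it explains why everything downstream is out of date.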

I think there's a ton of opportunity there to both

track and consolidate that information and and lineage of data, but to also figure out the best way to expose it to the business users without being overwhelming. Just exposing it when it's actually relevant, if that makes sense. And so in your experience

of working with customers and building these reverse ETL

capabilities or the CDP or the end to end integration?

I can give

1 of the bigger ones that I actually ended up helping Hightouch implement internally too, indirectly. But, basically, we had a client that had a PLG motion. Right? And so when I say that, you know, they have users in their products that are either free or paying,

They're using things, and they needed to understand how do I get them into the hands of sales in a scalable way without paying a fortune for Salesforce storage or whatnot. Right? So you you have all these individual users. You don't know whether or not they're valuable yet, but how do I surface them in certain ways? Not only that, but now you have users that belong to a team

that are connected to a Stripe account, and you're just like, okay. How am I gonna get this all in a scalable way into Salesforce for people to act on it, to be paid, comped, all those things correctly? So we were able

to build basically our version of what we've said is a PLG

supported

ABM system using Salesforce. So we think about it in the way of, like, what is a purchasing entity?

In the past, before you had a lot of SaaS, everything was at an account level. And now you're dealing with things at individual team levels, purchasing entity levels, and all of that. So think about it as a Stripe customer. So what we need to do is be able to say these Stripe customers or these purchasing entities

belong underneath these accounts based off of account domains

or regions or whatnot, and these are the users that belong to those things. So you basically are able to use

this data coming in from their product, call it like Mongo.

You're able to model it, say, here's the Teams. Here's the users. Here's their Stripe data or, you know, whatever billing Chargebee data. Here's what we know about them. Here's when we want to create an account

in Salesforce based off of their owner domain or their billing domain,

or is there already an account in there? You can go and then create purchasing entities, which we'll call it an organization or a team, and then you're able to relate production

product data to something in Salesforce. We're able to connect everything

and make this reverse ETL process really easy because what you're gonna have is, at a product level, you're gonna have things at the purchasing instance level, and now you're able to replicate that in Salesforce,

have those specific instances

assigned to a sales team, have marketing be able to market to the users of those paid systems,

all of this stuff. In the past, you would basically just have to attach users to an account and have this complexity of, like, 20 different hierarchies

in Salesforce in a scalable way. And so it's been really fun. We've actually implemented that a couple different places,

and

people are amazed. And it's kinda like, dude, this is what we've been wanting to do for so long, and now it's a lot easier because of reverse ETL. So now you're not

bound by, you know, what does Salesforce natively support? You're able to say, I'm gonna build my own architecture of a data model in Salesforce.
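The account, purchasing entity, and user hierarchy described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`team_id`, `billing_email`, `stripe_customer_id`) and the `build_hierarchy` function are hypothetical, and in practice this grouping would more likely live in a SQL model in the warehouse before being synced.

```python
def build_hierarchy(teams, users):
    """Group purchasing entities (teams) under accounts keyed by billing domain,
    then attach users to the team they belong to."""
    accounts = {}    # billing domain -> list of purchasing entities
    team_index = {}  # team_id -> purchasing entity, for attaching users

    for team in teams:
        # Assumption: the account is derived from the billing email's domain.
        domain = team["billing_email"].rsplit("@", 1)[-1].lower()
        entity = {
            "team_id": team["team_id"],
            "stripe_customer_id": team["stripe_customer_id"],
            "users": [],
        }
        accounts.setdefault(domain, []).append(entity)
        team_index[team["team_id"]] = entity

    for user in users:
        entity = team_index.get(user["team_id"])
        if entity is not None:  # skip users whose team is unknown
            entity["users"].append(user["email"])

    return accounts


teams = [
    {"team_id": "t1", "billing_email": "finance@acme.com", "stripe_customer_id": "cus_1"},
    {"team_id": "t2", "billing_email": "ops@Acme.com", "stripe_customer_id": "cus_2"},
    {"team_id": "t3", "billing_email": "pay@globex.io", "stripe_customer_id": "cus_3"},
]
users = [
    {"email": "ann@acme.com", "team_id": "t1"},
    {"email": "bob@acme.com", "team_id": "t2"},
    {"email": "cy@globex.io", "team_id": "t3"},
    {"email": "dee@nowhere.org", "team_id": "missing"},
]

hierarchy = build_hierarchy(teams, users)
for domain, entities in hierarchy.items():
    print(domain, [(e["team_id"], e["users"]) for e in entities])
```

Each top level key would map to an account, each entity to a custom purchasing entity object under it, and each email to a contact related to that entity.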

So I think those are super powerful examples overall, but I think something exciting for me is there are really unique examples that we don't see too often. Like, for example, people not just moving data between 2 different systems, but also creating some sort of business workflow using reverse ETL. An example is 1 of our customers, Blend, and we've written about this and stuff like that in the past, but they're kind of a mortgage loan platform company. They recently went public, and they do push a bunch of data into Salesforce with what they know about customers from their data warehouse using Hightouch, but then they also

do things like

create Asana tickets automatically

for their customer success team to go look into

some unusual product usage information,

product usage data that they're seeing in their data warehouse defined by a SQL query, or they create Slack alerts saying to go try out these features to their customers and to different folks around their organization when they notice certain patterns that indicate that customer might be ready to do that based on the data in their data warehouse. And I think some of these business workflow examples are pretty cool in that most people wouldn't typically think of using their data warehouse for these purposes,

and they almost start to kind of chip away at BI use cases in a way where previously people might be opening a BI dashboard and refreshing it to alert themselves on some of these use cases or having a BI tool send them the whole report every single day, and then they click into it and see what's changed, to actually being able to power these deep business workflow type use cases entirely with Hightouch. And I'm seeing more and more demand in the market for things like this. 1 of our customers just messaged me yesterday and was like, we're looking for a communication alerting platform based on the data warehouse. And I was like, have you seen our Slack integration?

And then they were like, wow. I didn't know you could build business workflows like this with Hightouch. I was trying to look all around, and no BI tool has a good Slack integration.

So personally, I'm really excited about those types of use cases as well because I think they don't seem as sophisticated. They seem more like a Zapier type use case, but I think they're super, super powerful and really expand the accessibility

of data throughout an organization.
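To make the "alert defined by a SQL query" pattern concrete, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for the data warehouse, and it only builds the message payloads rather than posting them anywhere. The table name `usage_snapshots`, the "dropped by more than half" rule, the channel name, and the `build_alerts` function are all assumptions made for this example, not how any particular reverse ETL product implements it.

```python
import sqlite3

# Accounts whose active users fell by more than half between two days.
ALERT_SQL = """
SELECT today.account, yesterday.active_users, today.active_users
FROM usage_snapshots AS today
JOIN usage_snapshots AS yesterday
  ON yesterday.account = today.account
WHERE today.day = :today
  AND yesterday.day = :yesterday
  AND today.active_users * 2 < yesterday.active_users
ORDER BY today.account
"""


def build_alerts(conn, today, yesterday, channel="#customer-success"):
    """Run the alert query and turn each returned row into a message payload."""
    rows = conn.execute(ALERT_SQL, {"today": today, "yesterday": yesterday}).fetchall()
    return [
        {
            "channel": channel,
            "text": f"Active users for {account} dropped from {previous} to {current}",
        }
        for account, previous, current in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_snapshots (account TEXT, day TEXT, active_users INTEGER)")
conn.executemany(
    "INSERT INTO usage_snapshots VALUES (?, ?, ?)",
    [
        ("acme", "2021-09-01", 100),
        ("acme", "2021-09-02", 40),
        ("globex", "2021-09-01", 50),
        ("globex", "2021-09-02", 45),
    ],
)

alerts = build_alerts(conn, today="2021-09-02", yesterday="2021-09-01")
for alert in alerts:
    print(alert["channel"], alert["text"])
```

The point of the pattern is that the query is the whole definition of the alert: each row it returns becomes one message, so changing the business rule means editing SQL rather than a script.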

Oh, absolutely. I think it's 1 of the coolest things seeing, like, when we onboard customers to, like, Hightouch,

and we do some basic integration, like, let's connect Salesforce. Oh, it's magical. We send ARR to your account level. Woo hoo. Like, you've been trying to do that for 5 years. And then all of a sudden they say things like, oh, I wish I could get a Slack alert when the ARR drops on certain accounts day over day. I'm like, well, you can do that. How? Oh, Hightouch. It's really interesting to see them kind of evolve their understanding of what they can do with the data in the warehouse because a lot of people

are,

I guess, data illiterate in terms of how much you can use it and where you can use it and all those things. So I think, once again, reverse ETL makes things, like data, more actionable

in a scalable way. I think another element to that as well is that business users have been conditioned over the past several years about what's possible with data because, you know, they might ask for those kinds of things, but the answer is,

yeah, that'll take about 6 months and $1,000,000 to build. And then the next request is another 6 months and another $1,000,000

versus, you know, with the level of sophistication that we've built up in these systems, it becomes

much more feasible. And so they're, you know,

shocked by the capabilities that they're being given because they're so used to the fact that the answer is going to be, yes. That's possible. But

Absolutely. I think it has to do with the evolution of tools and then the initial foundation that's built by an analytics or data engineering team. So when you already have those base models in something like dbt

where you have this basic definition of all these different user attributes and different things,

once again, it's so easy to just be like, well, we have that. Let's reuse the data we already have and do automation off of it. Let's not reinvent the wheel each time. And so the fact that these tools have evolved

to support the scalable data modeling system and reuse all of that logic, it makes it so easy. It's funny to get someone on a call and say, okay. What data do you need? Okay. Click a few buttons to get that data in. I'm gonna write 1 minor script, and then I'm gonna click a few more buttons, connect to a new destination in Hightouch. There you go. In an hour, you have all the data flowing through the way you want it to. And they thought,

oh my gosh. I thought this was gonna take me a year of backlog to do it. I'm like, no. You just have to be friends with the right person.

In your own experience

of building and managing customer data platforms and working with reverse ETL and activating the information that's in the warehouse for these different business use cases, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process? I honestly think still the biggest challenge that exists is

developing the right data models in the data warehouse. Personally, I think if that's done well, that solves 90 percent of the pains throughout the organization.

It helps the analytics. It makes the job of Hightouch a lot easier. But really something that's still very hard and still very business specific is data modeling and really understanding

all the different data systems that you're syncing into your data warehouse, how they relate to each other, how they should be joined together,

how the definitions should be created on top of them, and then how you should iterate on those over time. And when I think about issues that our customers face, I would say 90% of them kinda flow back to that. Not building good data models in the source of truth or in the data warehouse,

I think, is a huge challenge. I'm not sure if it's a technology challenge or just, say, a process, technique, understanding, or knowledge type challenge. But

the whole trend that I'm really excited about is analytics engineers shouldn't need to develop all these, you know,

super specific technical skills like

coding and Airflow and Python, etcetera, etcetera, etcetera,

just to get their hands on data and build pipelines on top of it. And I think if we, as vendors, can focus on

making the tooling really easy to use and just require SQL, then analytics engineers and data analysts, data engineers can spend more and more of their time really understanding the business requirements and how to best do data modeling. But, yeah, the biggest challenge that I see really,

really always comes back to not data integration,

but the data modeling itself and getting that right in the source. Yeah. I agree with that. I'll add 1 point, though.

I think a lot of it has to do with understanding

we'll call it your internal end customer, whether it be marketing or sales or whatnot, and understanding what problem they're trying to solve

and being very explicit

about what each of these data points mean because

there are gonna be assumptions made on both sides. And if you don't get over those assumptions, you might build something that doesn't really solve their needs or gives them inaccurate data to make decisions on. So I think, especially as an analytics engineer or an analyst or a data engineer that flows more into the business side, you really need to understand what problem are they trying to solve so that when you're going and getting this data and modeling it, it's actually solving

the problem that they have, not the 1 you think they have. You also have to work with the data engineer and say, this is what I'm trying to solve. Am I looking at the right data? Do we have the right data? So I think it's a lot of really

internal management

of ideas

and

data understanding and all of those things that goes into the modeling that's actually more difficult than, like, writing the queries themselves or writing the models. Because, like, if you understand what someone is trying to get out of it and you understand what the raw data means, it's a lot easier. But a lot of the times

when someone comes to you, they don't know what they don't know, and you have to validate that you guys are speaking the same language.

For people who are interested in being able

to build some of these automation capabilities for their business, what are the cases where CDP or reverse ETL or the combination of the 2 is the wrong choice? I don't think it's ever the wrong choice in my mind, but I'm biased.

I'm heavily biased too. I think there are some use cases that actually do need true real time processing. Mhmm. I think some example use cases are, let's say,

post purchase notifications. Like, if I'm on, you know, let's say, jetblue.com

and I buy a flight, even if my Snowflake syncs are every 5 minutes, as a customer, it's just not fair to wait 5 minutes to get a confirmation that a flight was actually purchased, or a notification of anything of such, or for that information to register in a super important mission critical system. So, yes, people can build mission critical pipelines on top of the data warehouse and on top of reverse ETL, and the tooling's only making it easier to do so. But if those pipelines need to run

extremely quickly, like, let's say, under a minute or a few minutes, then it's definitely not a fit for reverse ETL as it stands today.

That said, we are thinking about more broadly,

how to help companies tap into all the data infrastructure they have to address those problems and those use cases as well. For example, building connectors to sit on top of, like, a Kafka or Kinesis queue and pipe a row from there over to an event API and something like a email marketing system so that they can immediately fire the the email off for a post purchase confirmation email. But I think real time and latency

are, like, a few of the only issues that can't be solved with just changes to reverse ETL type process today, and that's something that needs to be solved upstream, the data warehouses or in the source technologies that reverse ETL platforms like Hyatt that should actually connect to. And while 90 or 95 or maybe even 99% of use cases don't need, like, super low latency, there are those use cases that do, and it'll really be beautiful when you're able to do all of that through 1 system.
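As a rough illustration of the queue-to-event-API idea described above, here is a minimal sketch in Python. It is not how Hightouch or any particular vendor implements this. The standard library `queue.Queue` stands in for a Kafka or Kinesis stream, and the function name, the event fields (`type`, `email`, `order_id`), and the `purchase_confirmation` template name are all hypothetical.

```python
import queue
from typing import Callable


def forward_purchase_confirmations(
    source: "queue.Queue[dict]",
    send_event: Callable[[dict], None],
) -> int:
    """Drain the queue and forward each purchase event right away.

    `source` is a stand-in for a streaming source such as a Kafka topic.
    `send_event` is a stand-in for a call to an email tool's event API.
    Returns the number of events forwarded.
    """
    forwarded = 0
    while True:
        try:
            event = source.get_nowait()
        except queue.Empty:
            break  # nothing left to process in this pass
        # Only purchases need the low latency path; everything else can
        # wait for the regular warehouse sync.
        if event.get("type") == "purchase":
            send_event(
                {
                    "to": event["email"],
                    "template": "purchase_confirmation",  # hypothetical name
                    "order_id": event["order_id"],
                }
            )
            forwarded += 1
    return forwarded


if __name__ == "__main__":
    stream = queue.Queue()
    stream.put({"type": "purchase", "email": "a@example.com", "order_id": "A1"})
    stream.put({"type": "page_view", "email": "a@example.com"})
    outbox = []  # collects what would be sent to the event API
    forward_purchase_confirmations(stream, outbox.append)
    print(outbox)
```

The point of the sketch is the split: the one latency sensitive event type is forwarded the moment it arrives, while everything else still flows through the warehouse on its normal schedule.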

As you continue to work in this space, what are some of the other industry trends that you're keeping an eye on? And are there any particular product categories that you anticipate might be the next sort of breakout event along the same lines of, you know, CDPs and reverse ETL?

I mean, there's 2 that come to mind. One's already kind of breaking out, but I have yet to see something that I feel makes a huge impact right now. I feel bad saying this, but, like, I'm waiting to see the evolution and who's the front runner for the data quality area. I think a lot of these things run post processing and the damage is already done in certain situations. So it's like, cool. Thanks for telling me after the fact. Now I have to rerun my jobs. Right? So I think it would be more interesting to see a more scalable, cheaper, less process heavy data quality tool, I guess. You know, a lot of these also sometimes take as much effort to get set up as they help you in the long run, so it's a little bit labor intensive at the start. So I'm interested to see how that evolves.

The only other 1 is I'm still interested to see a lot more on the streaming side. You know, I've been keeping an eye on Materialize, but as I mentioned, you're only as fast as your slowest tool. And since a lot of people in the CDP space are heavily reliant on CRMs and things like that, that's not gonna be coming in streaming. So unless you get streaming ETL coming in and all those different things, Materialize could be really cool in the long run, but every other tool needs to speed up before Materialize reaches its full potential.

Yeah. I would say other than the 2 that Rachel mentioned, I'm also pretty excited about some of the stuff happening in the metrics layer space. I think, as a vendor, we kind of think about how to make reverse ETL more and more accessible. How do we actually enable workflows where, you know, marketers and sales ops people can come to the system and say, I want these data points, or I want these data points plotted over these time periods, to be synced into my system, like number of shows watched in the last 7 days, without an analyst having to go in and define number of shows watched in the last 7 days, number of shows watched in the last 30 days, number of shows watched in the last 60 days. How do we make that whole process more streamlined? And I think it's having a better semantic layer around the data. And previously, like, the only good place for the semantic layer has really been Looker or LookML. It's the only mainstream way, but I think there's a few initiatives. There's companies like Transform that are trying to make a more generic layer for this. There's also, you know, talks in the dbt GitHub issues and forums about dbt potentially playing in the space. I'm not 100% sure what the right solution is here yet, but I know a kind of standardized metrics layer will make making reverse ETL more accessible to business users a lot easier, and I'm super excited to see what happens there and tap into it.

Well, for anybody who wants to get in touch with both of you, I'll have you each add your preferred contact information to the show notes. And as a final question, I'd like to get your perspectives on what you see as being the biggest gap in the tooling or technology that's available for data management today.

Data orchestration. It's the biggest pain point for me, trying to make sure everything runs in the right order at the right time, or else you end up with stale data being sent to places where automation downstream is then negatively impacted.
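To make the "right order" part of that concrete, here is a minimal sketch, assuming a pipeline that can be described as a set of jobs and their upstream dependencies. The job names are hypothetical, and a real orchestrator also has to handle scheduling, retries, and freshness checks, which this deliberately leaves out. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def run_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the jobs in an order where every job runs after its upstreams.

    `dependencies` maps each job name to the set of jobs it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical pipeline: two loads feed a customer model, and the reverse
# ETL sync should only run once that model has been rebuilt.
pipeline = {
    "load_crm": set(),
    "load_events": set(),
    "build_customer_model": {"load_crm", "load_events"},
    "sync_to_marketing_tool": {"build_customer_model"},
}

order = run_order(pipeline)

if __name__ == "__main__":
    print(order)
```

If the sync ran before `build_customer_model` finished, it would push the previous run's values, which is the stale data problem described above.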

And for me, it's streaming or real time, just because those are the 1 to 5% of use cases that we can't solve by building more product features in our product until there's improvements to the underlying technology. I think it will happen. I think it'll happen incrementally, but I can't wait till it's all resolved and we don't have to have any caveats there.

Well, thank you both very much for taking the time today to join me and share your experiences working in the space of CDPs and reverse ETLs. Definitely a very interesting set of problems and growing need for a number of customers and companies. So definitely appreciate the time and energy you're putting into that space, and I hope you enjoy the rest of your day.

Yes. Thank you. Thank you.

Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
