Summary
Real-time data processing has steadily been gaining adoption due to advances in the accessibility of the technologies involved. Despite that, it is still a complex set of capabilities. To bring streaming data within reach of application engineers, Matteo Pelati helped to create Dozer. In this episode he explains how investing in high-performance, operationally simplified streaming with a familiar API can yield significant benefits for software and data teams alike.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack
- Modern data teams are using Hex to 10x their data impact. Hex combines a notebook style UI with an interactive report builder. This allows data teams to both dive deep to find insights and then share their work in an easy-to-read format to the whole org. In Hex you can use SQL, Python, R, and no-code visualization together to explore, transform, and model data. Hex also has AI built directly into the workflow to help you generate, edit, explain and document your code. The best data teams in the world such as the ones at Notion, AngelList, and Anthropic use Hex for ad hoc investigations, creating machine learning models, and building operational dashboards for the rest of their company. Hex makes it easy for data analysts and data scientists to collaborate together and produce work that has an impact. Make your data team unstoppable with Hex. Sign up today at dataengineeringpodcast.com/hex to get a 30-day free trial for your team!
- Your host is Tobias Macey and today I'm interviewing Matteo Pelati about Dozer, an open source engine that includes data ingestion, transformation, and API generation for real-time sources
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Dozer is and the story behind it?
- What was your decision process for building Dozer as open source?
- As you note in the documentation, Dozer has overlap with a number of technologies that are aimed at different use cases. What was missing from each of them, and from the center of their Venn diagram, that prompted you to build Dozer?
- In addition to working in an interesting technological cross-section, you are also targeting a disparate group of personas. Who are you building Dozer for and what were the motivations for that vision?
- What are the different use cases that you are focused on supporting?
- What are the features of Dozer that enable engineers to address those uses, and what makes it preferable to existing alternative approaches?
- Can you describe how Dozer is implemented?
- How have the design and goals of the platform changed since you first started working on it?
- What are the architectural "-ilities" that you are trying to optimize for?
- What is involved in getting Dozer deployed and integrated into an existing application/data infrastructure?
- How can teams who are using Dozer extend/integrate with Dozer?
- What does the development/deployment workflow look like for teams who are building on top of Dozer?
- What is your governance model for Dozer and balancing the open source project against your business goals?
- What are the most interesting, innovative, or unexpected ways that you have seen Dozer used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dozer?
- When is Dozer the wrong choice?
- What do you have planned for the future of Dozer?
Contact Info
- @pelatimtt on Twitter
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers
Links
- Dozer
- DataRobot
- Netflix Bulldozer
- CubeJS
- JVM == Java Virtual Machine
- Flink
- Airbyte
- Fivetran
- Delta Lake
- LMDB
- Vector Database
- LLM == Large Language Model
- Rockset
- Tinybird
- Rust Language
- Materialize
- RisingWave
- DuckDB
- DataFusion
- Polars
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Sponsored By:
- Hex: ![Hex Tech Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/zBEUGheK.png) Hex is a collaborative workspace for data science and analytics. A single place for teams to explore, transform, and visualize data into beautiful interactive reports. Use SQL, Python, R, no-code and AI to find and share insights across your organization. Empower everyone in an organization to make an impact with data. Sign up today at [dataengineeringpodcast.com/hex](https://www.dataengineeringpodcast.com/hex) and get 30 days free!
- Rudderstack: ![Rudderstack](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/CKNV8HZ6.png) Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at [dataengineeringpodcast.com/rudderstack](https://www.dataengineeringpodcast.com/rudderstack)
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack. Modern data teams are using Hex to 10x their data impact. Hex combines a notebook-style UI with an interactive report builder.
This allows data teams to both dive deep to find insights and then share their work in an easy-to-read format with the whole org. In Hex, you can use SQL, Python, R, and no-code visualization together to explore, transform, and model data. Hex also has AI built directly into the workflow to help you generate, edit, explain, and document your code. The best data teams in the world, such as the ones at Notion, AngelList, and Anthropic, use Hex for ad hoc investigations, creating machine learning models, and building operational dashboards for the rest of their company. Hex makes it easy for data analysts and data scientists to collaborate and produce work that has an impact.
Make your data team unstoppable with Hex. Sign up today at dataengineeringpodcast.com/hex to get a 30-day free trial for your team. Your host is Tobias Macey, and today I'm interviewing Matteo Pelati about Dozer, an open source engine that includes data ingestion, transformation, and API generation for real-time sources. So, Matteo, can you start by introducing yourself?
[00:01:51] Unknown:
Hello. I'm Matteo. I'm originally from Italy, based in Singapore. I've been in engineering for the last 20 years, and for about 12 or 13 years working on data. I've worked for startups and mostly financial institutions, was part of DataRobot quite early, and set up the data team at DBS in Singapore. And right before starting Dozer, I was leading data engineering for Goldman Sachs in Singapore, covering Asia Pacific.
[00:02:24] Unknown:
And do you remember how you first got started working in data?
[00:02:27] Unknown:
Well, I first started working in data when I was at Nokia, actually. I think that was around 2008. There was no big data at that time, but we were processing log data from the telco equipment. So in reality, it was big data, even if the term "big data" hadn't been invented yet.
[00:02:51] Unknown:
And now bringing us up to what you're building at Dozer, can you give a bit of an overview about what it is, some of the story behind how it came to be, and why you decided that you wanted to invest your time and energy in it?
[00:03:03] Unknown:
Yeah, sure. So the problem that we're solving at Dozer is fundamentally exposing data for integration into customer-facing applications. How I encountered the problem was when I was at DBS, where we had to build an entire data infrastructure layer for serving APIs to mobile applications. It was across multiple countries and multiple products, and it was a massive project with a lot of moving parts, a lot of custom code to be built, a big team, and it took quite a bit of time. So I solved this problem at DBS.
I saw a similar problem when I was at Goldman Sachs. Then, talking with friends in the data space, I noticed that it's apparently a simple problem, but in reality there is a lot of complexity behind it. That's how my cofounder Vivek and I started thinking about Dozer. We started doing some research, and we found other projects that tried to address similar problems. One of them, which is pretty well known, is a project from Netflix called Bulldozer that solves a similar problem; that's where we got the inspiration. And then we started thinking, okay, this problem has legs. So we started fundraising, we started the company, we started the open source project, and here we are.
[00:04:43] Unknown:
And in terms of making it open source, what was your decision process that led you to that conclusion?
[00:04:49] Unknown:
Well, you know, open source is getting more and more popular, and if you look at the data space, pretty much every project is open source. I don't believe today it's possible to make a project successful in the data space without being open source. So for us, it was kind of a given
[00:05:16] Unknown:
that this had to be open source. In terms of the actual problem space that you're addressing, as you noted, there's overlap with a number of different technologies and use cases, and I'm wondering what it is about Dozer, at the overlap of the Venn diagram across those different technologies, that makes it a better fit for the specific problems that you're focused on versus any of those other solutions. In particular, things like Flink for streaming, or Airbyte and Fivetran for data integration, or things like CubeJS for data APIs. I'm just wondering what it is about Dozer and that particular intersection of problem spaces that made it useful and necessary to build this product.
[00:05:59] Unknown:
Yeah, that's a really good question. So, the way we position the product is, as you mentioned, if you want to bring up an entire infrastructure for serving data, you have to put together multiple technologies. Then you have the problem of real-time data versus batch data. You have different types of caching, different types of lookups that you have to perform. So the technologies keep multiplying, and the infrastructure and the architecture become more and more complex. Now imagine you have a product engineering team that needs to integrate data with a customer-facing application. They face the problem of talking with the data engineering team, and they basically talk a different language. So what we want to enable with Dozer is the ability for a product engineering team to be independent and say: okay, I can source my data from my data lake, from my operational systems.
I can source everything in real time. I can do some transformation, because what is available in the source systems is not exactly what I want, and I can easily expose it as an API and integrate it with my customer-facing application. And I want to do all this without bringing up an entire infrastructure; I can have a simple tool to do that. Another consideration is the fact that we feel the world of data engineering is changing a lot. It has been primarily dominated by JVM-based tools, and most of them are, as you say, distributed architectures. In many situations, you don't really need that kind of complexity.
With new languages like Rust, and new processors coming out, ARM-based processors with a large number of cores, we feel it's possible today to achieve on a single machine what before needed a distributed architecture. So you put everything together and say: okay, Dozer is a tool that allows product engineers to build these entire end-to-end experiences.
[00:08:34] Unknown:
And because of the fact that there are multiple steps in building that experience, and that has largely been addressed by a number of different technologies that have to get stitched together, what are the major points of friction that you have seen teams run into as far as building those solutions with those different technologies? And what are some of the elements of the user experience that you're designing into Dozer to reduce that overall friction?
[00:09:00] Unknown:
Yeah. So the friction that I've seen is the typical friction that you have in an enterprise between your product team and your data engineering team. The data engineering team focuses more on the big picture, and they want something like the perfect data platform to be implemented in the enterprise. The product team, meanwhile, wants something that is readily available to be consumed via API. That's a friction that I've seen. And sometimes pulling data from your data platform might not even be enough, for example because you need real-time data and your data platform doesn't have a full real-time streaming infrastructure yet. That's a challenging problem for the data engineering team.
The other aspect that is very important here is that we focus on serving data to customer-facing applications, so you have a different set of problems from the data engineering team's. Data engineering has primarily been used for internal consumption, so if a report is not ready one day, that's a problem, but it's not a critical issue. When you start putting that data in front of the customer, you obviously need another level of reliability, because it's data that needs to be available 24/7 at a particular SLA.
So that's what we also address in Dozer: the monitoring of the system, or let's say the data observability from this point of view, and understanding when problems happen and being able to resolve them, because you're serving customers. That's the philosophy behind it. Obviously, we are very early right now, but that's our vision.
[00:11:14] Unknown:
Another aspect of this cross-section is that there are a number of different personas or roles that are typically involved in some of the different desired end goals. For building a data API, it might be an application engineer who just needs to integrate data into their application. It might be a data engineer who's trying to provide output to a machine learning or analytics use case. It might be a data scientist who's trying to build some experimental system to determine whether or not the direction they're going is the right path. And I'm wondering, because you do have so many different target audiences, how that also factors into the way that you think about the design and usage of Dozer to be able to address the needs of each of those personas?
[00:12:03] Unknown:
Yeah. Fundamentally, the whole idea goes back to simplicity of usage. You just connect your data sources, you express your transformation using SQL that is translated into a streaming pipeline, and the data is cached behind the API. These are usually three distinct components: a streaming ingestion component, a transformation engine, and a caching layer plus API. Our philosophy, because we want to give a single developer or a small team the possibility to orchestrate all of this, is to say: okay, this is just a configuration file.
We use YAML, and we're going to have our cloud offering with a UI available really soon. But fundamentally, you can express ingestion, transformation, and caching all in a single configuration, so that you can bring up the entire infrastructure. Some people call it "data as code", actually, which is comparable to the philosophy we have.
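To make that "data as code" idea concrete, here is a minimal sketch of what such a single-file configuration could look like, assuming a Postgres source. The key names and connector settings are illustrative assumptions rather than Dozer's exact schema:

```yaml
# Hypothetical sketch of a single-file "data as code" configuration:
# ingestion, transformation, and API endpoints declared together.
app_name: orders-api

connections:
  - name: orders_db            # source ingested as a change stream (CDC)
    config:
      Postgres:
        host: localhost
        port: 5432
        user: app
        database: shop

sql: |
  -- transformation expressed as streaming SQL
  SELECT id, status, amount
  INTO orders_view
  FROM orders;

endpoints:
  - name: orders_view          # exposed as a low-latency API
    path: /orders
    table_name: orders_view
```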
[00:13:23] Unknown:
And digging now into Dozer itself, can you talk through some of the ways that it's implemented and the architectural elements of it that make it a better fit for this end to end data integration and delivery use case?
[00:13:39] Unknown:
Dozer is fundamentally a self-contained tool. We didn't want dependencies on external tools like Kafka or distributed key value stores, so we wanted to keep it very simple. It's all implemented in Rust, which gives us the performance. And it's fundamentally three pieces. One piece is the ingestion part. We always treat data coming in as streaming data, whether it's actually streaming or not. To give you an example: for relational databases, we use CDC to capture all the changes. But even when we are dealing with Snowflake or Delta Lake, which are not strictly real time, we basically capture change streams to detect what has been changed, and we update the state.
After the ingestion, we fundamentally have a streaming SQL engine that has been built from the ground up. This is, let's say, more or less ANSI SQL; we support pretty much all the complex operations: aggregations, joins, etcetera. This allows a user to create a model, by joining and aggregating sources, that will then be stored in the caching layer. Once the data is transformed, everything is stored in our caching layer, which is based on an embedded database; we leverage LMDB, actually, which is a very fast memory-mapped key value store. And on top of that, we have the query API. Now, this is the execution engine. Obviously, on top of it there is a lot of machinery to handle APIs, like, for example, automated API versioning. So what happens when something changes in the source?
We automatically detect that change. The typical problems that engineers have been dealing with around APIs, we are bringing the solutions to those problems to the data world. Now, all of this can be run as a single binary, or it can be run as different binaries. But the fundamental idea is that you can just do a brew install, put down a YAML configuration file, and you have one process connecting to the database, doing the real-time processing, and exposing low-latency APIs. That's how it works. Another thing to note about the philosophy, the approach that we have taken, is that everything is pre-aggregated. The idea is that complex aggregations and joins you do in the SQL layer, and the simple operations you do directly on the cache. The cache allows you to do filtering, sorting, basic operations. If you have anything more complex than that, you do it upstream. That's the fundamental idea. That allows us to guarantee millisecond, even sub-millisecond, latency on every query.
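As a sketch of that pre-aggregation philosophy: the heavy operations (joins, aggregations) are expressed upstream in the streaming SQL, and the cached endpoint only answers simple filter and sort lookups. The dialect details and endpoint options below are assumptions for illustration, not Dozer's precise syntax:

```yaml
# Illustrative only: complex work happens upstream in the streaming SQL
# engine; the LMDB-backed cache serves simple lookups at low latency.
sql: |
  SELECT c.customer_id, c.country,
         COUNT(p.payment_id) AS payment_count,
         SUM(p.amount)       AS total_paid
  INTO customer_totals
  FROM customers c JOIN payments p ON p.customer_id = c.customer_id
  GROUP BY c.customer_id, c.country;

endpoints:
  - name: customer_totals
    path: /customer-totals
    table_name: customer_totals
    index:
      primary_key: [customer_id]   # cache-side access stays to filtering/sorting
```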
[00:17:14] Unknown:
And building up a streaming engine and a SQL engine in the same unit is definitely a very complex set of engineering challenges. I'm curious what are some of the sources of inspiration that you drew from to be able to understand how to address this problem, and some of the off-the-shelf components that you were able to use to build out this overall architecture?
[00:17:38] Unknown:
Okay. Yeah. So, we didn't use many off-the-shelf components. We use LMDB, definitely; that's the caching layer that we're leveraging. For the caching layer, on top of LMDB, we've built some indexes, an additional layer. And, you know, there is a lot of literature about that and a lot of projects doing that as well. On the streaming SQL engine, obviously, we got some inspiration from the tools that are out there. For the caching layer, we leveraged LMDB; it's a database we like a lot. We actually did a lot of testing comparing RocksDB versus LMDB.
I used LMDB a lot in the past, and I like it a lot. RocksDB is very good for heavy writes, while LMDB is very good for heavy reads, and that's why we decided to go in that direction. Talking about the streaming SQL, obviously, that's a complex piece of code, but other companies are embarking on this as well. We got inspiration from the traditional streaming SQL tools like Flink, for example. The value for us is being able to give the end-to-end integration of all of this without letting the user
[00:19:17] Unknown:
handle all these moving parts. As far as the evolution of the project, I'm wondering what are some of the, design and goals of the project that have changed since you first started working on it and some of the assumptions that you had going in that have been updated in the process.
[00:19:35] Unknown:
Yeah. So, I would say that when we started out, we had this idea of a single-binary deployment, actually. That was how the project started, and that's how the first version was. Later on, we started to work on our cloud version, and, you know, not everything is best suited to a single-binary deployment. Our cloud deployment is entirely based on Kubernetes, so we realized the need to separate out some components, and that's what we are actually doing right now. So I would say that we still want to be able to run it as a single binary, but sometimes, for operationalization, that's probably not the best. That's why we are starting to separate it into various components. That's probably the biggest change that, architecturally,
[00:20:58] Unknown:
we have done. And so for people who are getting started with Dozer, who want to deploy it and integrate it into their existing infrastructure and start pulling data from their various sources, I'm wondering if you can talk through the overall process of getting that set up, integrated, and configured.
[00:21:15] Unknown:
Yeah. So the process is documented on our website. Fundamentally, you can download the binary or use a Docker image. And we have a couple of samples as well, where we show how we connect to a database, how we pull data from different sources, how we combine them together, etcetera; we are constantly adding samples. It's fundamentally writing a YAML file where you define the sources you want to connect, the SQL transformations that you want to run, and the endpoints that you want to expose. And that's pretty much it, actually. Then you start Dozer, and automatically you have APIs.
We are working on more details on the deployment. As I mentioned, we are also working on our cloud version, which we plan to release next month, and on more and more samples. If you want to get started, it's straightforward: it's just two lines of code. And we also have a couple of videos with tutorials showing how to do it.
[00:22:48] Unknown:
And once it is deployed and you've got it connected up to some of your different sources, can you talk through the developer workflow and the process involved for teams who are building out some of these different data applications, and some of the elements of access control, multi-tenancy, things like
[00:23:08] Unknown:
that? Yeah, and I forgot to mention that we have client integrations as well. We have a JavaScript client and a Python client. So if you are consuming the data from your web application, or from your data science application using Python, we provide a client that easily integrates with the API and with authentication. On the authentication side, we don't really handle authentication ourselves; we let you handle authentication yourself. It's all JWT based, and we handle authorization at the role level.
Field-level authorization is coming soon. And that's how easy it is to integrate. As I mentioned before, we handle deployment of new APIs and versioning of APIs. That means that whenever you change your model, say you update your SQL because something changed, or even the source changed, we detect that something is wrong and notify you. At that point, you can update your SQL and deploy a new version of the API, and we let you switch over the API when you've done the integration. So, fundamentally, it's kind of like a blue-green deployment for your data. That's how the workflow looks from, let's say, a product engineer's point of view.
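As a rough illustration of the JWT-plus-roles model described here, an API security section could look something like the sketch below. These field names are hypothetical, meant only to show the shape of "you bring authentication, authorization is enforced at the role level":

```yaml
# Hypothetical sketch, not Dozer's actual schema: authentication is
# delegated to your own identity provider (JWT), while authorization
# is applied per role at the endpoint level.
api:
  security:
    jwt:
      issuer: https://auth.example.com   # tokens minted by your auth system
      audience: data-apis
  authorization:
    roles:
      - name: mobile_app                 # role claim carried in the JWT
        endpoints: [customer_totals]
      - name: internal_dashboard
        endpoints: [customer_totals, orders_view]
```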
[00:24:53] Unknown:
Other aspects that are interesting to talk through are some of the elements around testing, validation, working in preproduction environments, and promoting changes. I'm wondering if you can talk through some of those aspects of the development workflow and the maintenance and evolution of these applications that are built on Dozer?
[00:25:14] Unknown:
Yeah, right. So this is something that is not much addressed in the open source version; it's something we are implementing in our cloud version. Our cloud version will have the full life cycle of API management, allowing you to deploy in, let's say, a preproduction environment, test the API there, and then promote it to a production environment. This is not going to come out next month; it will likely come out around September, but that's what we're working on.
[00:25:58] Unknown:
Another aspect of any project that is open source but has a business behind it is the question of governance and being able to balance the needs of the open source project with the goals of the business and sustainability of both. I'm wondering if you can talk through some of the ways that you're thinking about those problems.
[00:26:17] Unknown:
Yeah, right. So the philosophy that we have is that open source will always be there, and we will always support it. Our idea is that the open source version will have all the features, but the scalability of the APIs is up to the user deploying it. So all the features will be available; there will be no features that are cut down in terms of, for example, SQL, connectors, or APIs. What really is the added value of the cloud is the peace of mind: API auto-scaling, global data distribution, etcetera. That's the way we are thinking about it.
[00:27:12] Unknown:
And recognizing that it's still fairly early in the project, I'm wondering if you can talk to some of the most interesting or innovative or unexpected ways that you have seen Dozer used. Yeah, sure. So as I mentioned, we are very early. But, you know, we started having different
[00:27:26] Unknown:
types of users leveraging Dozer for different applications. I would say that one application that came up in multiple situations is, for example, payments, where typically, to run a payment system, you have to run multiple transactional systems. And one of the requirements that you have in a payment system is being able to aggregate all this data and give a unified view to your user. That's where we have seen Dozer being used by multiple companies, because payments is a scenario where you need real-time information. Once customers have made a payment, they immediately want to see what's going on, and querying all these transactional systems in the back end is not really feasible.
So there is the need for a unified view that is updated in real time. That's one of the interesting use cases we have seen. There is another use case that came up that is more experimental: integration with LLMs, for example. A lot of people are experimenting with vector databases, using them as a knowledge base for an LLM; imagine building a chatbot answering questions about the knowledge base. But if you take, for example, a bank, a knowledge base alone is not really enough. You need to have a customer 360 in order to personalize the LLM responses for that user. Now, that customer 360 needs to be built and maintained in real time; it needs to be up to date. And in order to build it, data comes from multiple sources. So you need a system that allows you to collect all this data, create a unified profile, and be able to serve that profile at low latency when it is provided as context to the LLM. That's another interesting use case that came up. It's very experimental, but we thought it was quite unique and quite creative, actually.
And in your experience of building Dozer and trying to grow the business and the community around it, what are the most interesting or unexpected or challenging lessons that you've learned in the process? So, from a technology point of view, I don't think there have been many surprises, I would say, because I've been in technology for more than 20 years, so that part is easier. I think most of the surprises came from the community aspect and community building, because I'm basically a newbie there, and I'm learning my way. It's incredible, when you put something out, the curiosity of people and how much they are willing to help and contribute to the project, even if the project is very early. As an engineer, you always want to wait to put out a project that is almost ready, so as not to burn yourself.
But many people say that's not right; I mean, it's debatable, actually. Even putting the project out relatively early, I believe you get a lot of support from the community. They want to be involved; they want to learn more about it. And it's incredible how, when you put out a project that maybe is not fully ready yet, people still want to try it, still want to solve the problem and contribute back, instead of saying, okay, this doesn't work, and dropping it five minutes later. That is quite incredible.
[00:32:14] Unknown:
And for people who are interested in building APIs on top of real time data sources, what are the cases where Dozer is the wrong choice?
[00:32:24] Unknown:
Dozer, you know, integrates an engine that pre-aggregates data. So Dozer is a perfect choice when you know exactly what you want from the API and you know the consumption pattern. If you have a situation that is more exploratory, where you don't know exactly what you need and you want to run different queries, most of them OLAP queries, Dozer is definitely not the choice. There are other tools that allow you to ingest all the data and run OLAP queries on top of it. You can think about tools like Rockset, for example, or Tinybird. The philosophy of those tools is that you take all your data, you pump all the data into the system, and then you run queries in semi real time there.
Dozer, on the other side, is much more efficient when you say: I know exactly what I want, I know what kind of queries I'm going to be doing, and I want the lowest possible latency. That's when Dozer fits very well.
[00:33:47] Unknown:
And as you continue to build and iterate on Dozer, what are some of the things you have planned for the near to medium term?
[00:33:55] Unknown:
So, near to medium term, to be honest, we are now fully focused on the cloud deployment, actually, and, at the same time, on enabling more scalability in our open source release as well. Another aspect is on the connectors side: expanding the connectors. We have a limited set of connectors now, and that's what we're working on, expanding them. And third is the UI. Right now, Dozer is fully based on a CLI and a configuration file, so we are working on a UI layer to make it easier to use for people who prefer that. Obviously, there are also longer-term items that we're working on: what I mentioned before, the monitoring and observability, and more work on the security aspects, as I said, like field-level authorization.
So this is what is on our plate in the coming months.
[00:35:22] Unknown:
And to the point of the source and destination integrations, and also just the overall extensibility of Dozer, what are some of the interfaces or plug-in options for people who are using Dozer to be able to extend it and innovate on it for themselves?
[00:35:41] Unknown:
Yeah, okay. In terms of sources, currently we support Snowflake, Delta Lake, Postgres, and S3 files. We have a connector for the Ethereum blockchain as well. So these are the connectors we currently support. Another one we support, which I forgot to mention, is not strictly a connector, but is basically a gRPC endpoint. So you can actually pump in data from your own code as well. The connector model is very easy to implement. In fact, we have a couple of contributors who are working on a MySQL connector implementation, and another who is working on a MongoDB connector implementation.
So that's pretty straightforward to implement. On the sink side, we don't really advocate for having multiple sinks. I mean, we have a streaming SQL engine behind it, but we don't sell ourselves as an agnostic streaming SQL engine. Our sink is our caching layer, which is tightly integrated, because our value proposition is being able to serve APIs. There is one thing that we are exploring, actually, because it came up a couple of times: the ability to plug in an external key value store.
That's something we have started looking at, but we are not sure about it yet. Fundamentally, though, most of our extensibility will be on the source side.
[00:37:38] Unknown:
Are there any other aspects of the Dozer project in this space of end to end data integration and delivery that we didn't discuss yet that you'd like to cover before we close out the show?
[00:37:50] Unknown:
I think we discussed pretty much everything. I mean, I cannot think of other things that we didn't cover.
[00:38:01] Unknown:
Alright. Well, for anybody who wants to get in touch with you or follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:38:17] Unknown:
Well, as I mentioned before, I think that there is a revolution in progress in the data management space that is driven by Rust. That's what I truly believe. We are going into a direction, a shift, where we used to have fully distributed systems, and now we have started realizing that maybe that's not needed. Maybe let's go back to the roots: back to the single machine with many cores, and with much more efficient languages like Rust. And that is happening both in the streaming and in the batch space. In streaming, there is Dozer, and there are a lot of other projects like Materialize, RisingWave, and, as you mentioned, CubeJS, all written in Rust. But if you look at the batch landscape as well, there is a lot of stuff happening there. A lot of people are excited about DuckDB, DataFusion, Polars, and I believe these tools will completely
[00:39:33] Unknown:
change the landscape of data engineering. Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing on Dozer. It's definitely a very interesting project and platform. I appreciate the time and energy that you and your team are putting into bringing it into the world. So thank you again for taking the time today, and I hope you enjoy the rest of your day. Thank you very much.
[00:40:02] Unknown:
Thank you for listening. Don't forget to check out our other shows: Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story.
And to help other people find the show, please leave a review on Apple Podcasts and just tell your friends and coworkers.
Introduction to Matteo Pelati and His Background
Matteo's Journey into Data Engineering
Overview of Dozer and Its Origins
Decision to Make Dozer Open Source
Dozer's Unique Position in the Data Ecosystem
Challenges in Data Integration and Dozer's Solutions
User Experience and Personas for Dozer
Technical Implementation of Dozer
Evolution and Design Changes in Dozer
Getting Started with Dozer
Developer Workflow and Access Control
Testing and Validation in Dozer
Governance and Open Source Strategy
Interesting Use Cases for Dozer
Lessons Learned from Building Dozer
When Dozer is Not the Right Choice
Future Plans for Dozer
Extensibility and Integration Options
Conclusion and Final Thoughts