Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

439 Episodes

Enhancing Data Accessibility and Governance with Gravitino - E438

Summary As data architectures become more elaborate and the number of applications of data increases, it becomes increasingly challenging to locate and access the underlying data. Gravitino was created to provide a single interface to locate and query your data. In this episode Junping Du explains how Gravitino works, the capabilities that it…

01 September 2024 | 00:38:41


The Evolution of DataOps: Insights from DataKitchen's CEO - E437

Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. Chris delves…

04 August 2024 | 00:53:30


Achieving Data Reliability: The Role of Data Contracts in Modern Data Management - E436

Summary Data contracts are both an enforcement mechanism for data quality, and a promise to downstream consumers. In this episode Tom Baeyens returns to discuss the purpose and scope of data contracts, emphasizing their importance in achieving reliable analytical data and preventing issues before they arise. He explains how data contracts can be…

28 July 2024 | 00:49:26


How Generative AI Is Impacting Data Engineering Teams - E435

Summary Generative AI has rapidly gained adoption for numerous use cases. To support those applications, organizational data platforms need to add new features and data teams have increased responsibility. In this episode Lior Gavish, co-founder of Monte Carlo, discusses the various ways that data teams are evolving to support AI powered features…

21 July 2024 | 00:54:45


The Role of Product Managers in Data-Centric Organizations - E434

Summary In this episode Praveen Gujar, Director of Product at LinkedIn, talks about the intricacies of product management for data and analytical platforms. Praveen shares his journey from Amazon to Twitter and now LinkedIn, highlighting his extensive experience in building data products and platforms, digital advertising, AI, and cloud services.…

13 July 2024 | 00:52:58


Neon: A Serverless And Developer Friendly Postgres - E433

Summary Postgres is one of the most widely respected and liked database engines ever. To make it even easier for developers to use, Nikita Shamgunov decided to make it serverless, so that it can scale from zero to infinity. In this episode he explains the engineering involved to make that possible, as well as the numerous details that he…

08 July 2024 | 00:57:43


Improve Data Quality Through Engineering Rigor And Business Engagement With Synq - E432

Summary This episode features an insightful conversation with Petr Janda, the CEO and founder of Synq. Petr shares his journey from being an engineer to founding Synq, emphasizing the importance of treating data systems with the same rigor as engineering systems. He discusses the challenges and solutions in data reliability, including the need for…

30 June 2024 | 00:59:48


Stitching Together Enterprise Analytics With Microsoft Fabric - E431

Summary Data lakehouse architectures have been gaining significant adoption. To accelerate adoption in the enterprise Microsoft has created the Fabric platform, based on their OneLake architecture. In this episode Dipti Borkar shares her experiences working on the product team at Fabric and explains the various use cases for the Fabric…

23 June 2024 | 00:53:23


Being Data Driven At Stripe With Trino And Iceberg - E430

Summary Stripe is a company that relies on data to power their products and business. To support that functionality they have invested in Trino and Iceberg for their analytical workloads. In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face…

16 June 2024 | 00:53:20


X-Ray Vision For Your Flink Stream Processing With Datorios - E429

Summary Streaming data processing enables new categories of data products and analytics. Unfortunately, reasoning about stream processing engines is complex and lacks sufficient tooling. To address this shortcoming Datorios created an observability platform for Flink that brings visibility to the internals of this popular stream processing system.…

09 June 2024 | 00:42:22