Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

08 October 2021

Make Your Business Metrics Reusable With Open Source Headless BI Using Metriql - E228

Summary

The key to making data valuable to business users is the ability to calculate meaningful metrics and explore them along useful dimensions. Business intelligence tools have provided this capability for years, but they don’t offer a means of exposing those metrics to other systems. Metriql is an open source project that provides a headless BI system where you can define your metrics and share them with all of your other processes. In this episode Burak Kabakcı shares the story behind the project, how you can use it to create your metrics definitions, and the benefits of treating the semantic layer as a dedicated component of your platform.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets and code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a Data Engineering Podcast listener, you get credits worth $3,000 on an annual subscription.
  • Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often take hours to days. Datafold helps data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow and dbt and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold. Once you sign up and create an alert in Datafold for your company data, they will send you a cool water flask.
  • Your host is Tobias Macey and today I’m interviewing Burak Emre Kabakcı about Metriql, a headless BI and metrics layer for your data stack

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Metriql is and the story behind it?
  • What are the characteristics and benefits of a "headless BI" system?
  • What was your motivation to create and open-source Metriql as an independent project outside of your business?
    • How are you approaching governance and sustainability of the project?
  • How does Metriql compare to projects such as Airbnb’s Minerva or Transform’s platform?
  • How does the industry/vertical of a business impact their ability to benefit from a metrics layer/headless BI?
    • What are the limitations to the logical complexity that can be applied to the calculation of a given metric/set of metrics?
  • Can you describe how Metriql is implemented?
    • How have the design and goals of the project changed or evolved since you began working on it?
    • What are the most complex/difficult engineering elements of building a metrics layer?
  • Can you describe the workflow of defining metrics?
    • What have been your guiding principles in defining the user experience for working with Metriql?
    • What are the opportunities for including business users in the definition of metrics? (e.g. pushing down/generating definitions from a BI layer)
  • What are the biggest challenges and limitations of creating metrics definitions purely in SQL?
  • What are the options for exposing metrics back to the warehouse and other operational systems such as reverse ETL vendors?
  • What are the missing elements in the data ecosystem for taking full advantage of a headless BI/metrics layer?
  • What are the most interesting, innovative, or unexpected ways that you have seen Metriql used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Metriql?
  • When is Metriql the wrong choice?
  • What do you have planned for the future of Metriql?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
