How Shopify Is Building Their Production Data Warehouse Using DBT

Data Engineering Podcast

In this episode Zeeshan Qureshi and Michelle Ark share their experiences using DBT to manage the data warehouse for Shopify.

Real World Change Data Capture At Datacoral

Data Engineering Podcast

In this episode Raghu Murthy, founder and CEO of Datacoral, does a deep dive on how he and his team manage change data capture pipelines in production. Modern Data teams are dealing with a lot of complexity in their data pipelines and analytical code. RudderStack’s smart customer data pipeline is warehouse-first.

Moving Machine Learning Into The Data Pipeline at Cherre

Data Engineering Podcast

Most of the time when you think about a data pipeline or ETL job, what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformations is actually a full-fledged machine learning project in its own right.

Data Engineering Weekly #123

Data Engineering Weekly

Uber: Setting Uber’s Transactional Data Lake in Motion with Incremental ETL Using Apache Hudi. Uber writes a comprehensive guide on running incremental ETL using Apache Hudi. Open Questions on Type-2 modeling: I keep thinking about the Type-2 SCD and the complexity of the data pipeline.

The Spiritual Alignment of dbt + Airflow

dbt Developer Hub

Airflow and dbt are often framed as either/or: You either build SQL transformations using Airflow’s SQL database operators (like SnowflakeOperator), or develop them in a dbt project. You either orchestrate dbt models in Airflow, or you deploy them using dbt Cloud.

Build vs Buy Data Pipeline Guide

Monte Carlo

Sometimes the build versus buy debate can include layers of its own, with decisions to buy often evolving into a decision to use open-source tooling to retain flexibility or leverage fully-managed solutions for faster time-to-value. Missed Nishith’s 5 considerations? Check out Part 1 of the build vs buy guide to catch up.

Managing The DoorDash Data Platform

Data Engineering Podcast

The team at DoorDash has a complex set of optimization challenges to deal with using data that they collect from a multi-sided marketplace.