
How we shaved 90 minutes off our longest running model

dbt Developer Hub

When running a job that has over 1,700 models, how do you know what a "good" runtime is? While there are many possible answers depending on dataset size, modeling complexity, and historical run times, the crux of the matter is usually "did you hit your SLAs?" The model fct_dbt_invocations takes, on average, 1.5 hours to run.
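The teaser doesn't show how that average was tracked, but as a rough illustration of the SLA check it describes, here is a minimal Python sketch with hypothetical model names and timings:

```python
# Minimal sketch (hypothetical data): flag models whose average runtime breaches an SLA.
from statistics import mean

SLA_MINUTES = 90  # assumed SLA for the longest-running model

# Hypothetical per-run timings in minutes, e.g. pulled from dbt run results
run_history = {
    "fct_dbt_invocations": [88, 95, 92, 101],
    "stg_events": [4, 5, 3, 4],
}

for model, timings in run_history.items():
    avg = mean(timings)
    status = "OK" if avg <= SLA_MINUTES else "BREACH"
    print(f"{model}: avg {avg:.1f} min -> {status}")
```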


How to Speed up Local Development of a Docker Application running on AWS

DoorDash Engineering

While most engineering tooling at DoorDash is focused on making safe incremental improvements to existing systems, in part by testing in production (learn more about our end-to-end testing strategy), this is not always the best approach when launching an entirely new business line.



How I Study Open Source Community Growth with dbt

dbt Developer Hub

That's why I built a mini-warehouse for studying community growth. My models process the data so that it's easy to perform analysis and spot trends. Here are the tools I chose to use: Google BigQuery acts as the main database, holding all the source data, intermediate models, and data marts.
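The post itself walks through the dbt setup; as a sketch of how one might query a finished data mart in that BigQuery warehouse from Python, here is a minimal example using the google-cloud-bigquery client (the project, dataset, and table names are hypothetical):

```python
# Minimal sketch, assuming a data mart table built by dbt in BigQuery.
# Requires: pip install google-cloud-bigquery and configured GCP credentials.
from google.cloud import bigquery

client = bigquery.Client()  # uses default project and credentials

# Hypothetical mart summarizing community activity per week
query = """
    SELECT week_start, new_contributors, total_commits
    FROM `my_project.community_marts.fct_weekly_activity`
    ORDER BY week_start DESC
    LIMIT 12
"""

for row in client.query(query).result():
    print(row.week_start, row.new_contributors, row.total_commits)
```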


Dat: Distributed Versioned Data Sharing with Danielle Robinson and Joe Hand - Episode 16

Data Engineering Podcast

Happening April 29th to the 30th in New York, it will give you a solid understanding of the latest breakthroughs and best practices in AI for business.


What is ETL Pipeline? Process, Considerations, and Examples

ProjectPro

This guide provides definitions, a step-by-step tutorial, and a few best practices to help you understand ETL pipelines and how they differ from data pipelines. When working on real-time business problems, data scientists build models using various Machine Learning or Deep Learning algorithms.
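The guide covers the details; as a bare-bones illustration of the extract-transform-load shape it describes, here is a hedged Python sketch with made-up file names and columns:

```python
# Minimal ETL sketch (hypothetical files and columns): extract from CSV,
# transform in memory, load into a local SQLite table.
import csv
import sqlite3

def extract(path: str) -> list[dict]:
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows: list[dict]) -> list[tuple]:
    # Keep only completed orders and cast the amount to a float
    return [
        (r["order_id"], float(r["amount"]))
        for r in rows
        if r.get("status") == "completed"
    ]

def load(records: list[tuple], db_path: str = "warehouse.db") -> None:
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL)")
    con.executemany("INSERT INTO orders VALUES (?, ?)", records)
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```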


The Ultimate Modern Data Stack Migration Guide

phData: Data Engineering

Why Migrate to a Modern Data Stack? Business-Focused Operation Model: Teams can shed countless hours of managing long-running and complex ETL pipelines that do not scale. Transparent Pricing Model: Say goodbye to tedious cost adjustments for hardware, software, platform maintenance, upgrade costs, etc.