article thumbnail

Bringing The Power Of The DataHub Real-Time Metadata Graph To Everyone At Acryl Data

Data Engineering Podcast

Summary The binding element of all data work is the metadata graph that is generated by all of the workflows that produce the assets used by teams across the organization. The DataHub project was created as a way to bring order to the scale of LinkedIn’s data needs. No more scripts, just SQL.

Metadata 100
article thumbnail

How to get started with dbt

Christophe Blefari

dbt Core is an open-source framework that helps you organise data warehouse SQL transformation. dbt was born out of the analysis that more and more companies were switching from on-premise Hadoop data infrastructure to cloud data warehouses. This switch has been lead by modern data stack vision.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Is Modern Data Warehouse Architecture Broken? 

Monte Carlo

The data warehouse is the foundation of the modern data stack, so it caught our attention when we saw Convoy head of data Chad Sanderson declare, “ the data warehouse is broken ” on LinkedIn. Treating data like an API. Immutable data warehouses have challenges too.

article thumbnail

Choosing the right Data Warehouse SQL Engine: Apache Hive LLAP vs Apache Impala

Cloudera

Some of the most powerful results come from combining complementary superpowers, and the “dynamic duo” of Apache Hive LLAP and Apache Impala, both included in Cloudera Data Warehouse , is further evidence of this. Both Impala and Hive can operate at an unprecedented and massive scale, with many petabytes of data.

article thumbnail

A Cost-Effective Data Warehouse Solution in CDP Public Cloud – Part1

Cloudera

Today’s customers have a growing need for a faster end to end data ingestion to meet the expected speed of insights and overall business demand. This ‘need for speed’ drives a rethink on building a more modern data warehouse solution, one that balances speed with platform cost management, performance, and reliability.

article thumbnail

Combining The Simplicity Of Spreadsheets With The Power Of Modern Data Infrastructure At Canvas

Data Engineering Podcast

In this episode Ryan explains how he and his team have designed their platform to bring everyone onto a level playing field and the benefits that it provides to the organization. Atlan is the metadata hub for your data ecosystem. Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code.

Metadata 130
article thumbnail

Building A System Of Record For Your Organization's Data Ecosystem At Metaphor

Data Engineering Podcast

Summary Building a well managed data ecosystem for your organization requires a holistic view of all of the producers, consumers, and processors of information. The team at Metaphor are building a fully connected metadata layer to provide both technical and social intelligence about your data. No more scripts, just SQL.

Systems 100