Remove Accessibility Remove Books Remove Data Lake Remove High Quality Data
article thumbnail

Troubleshooting Kafka In Production

Data Engineering Podcast

Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Data lakes are notoriously complex. What motivated to write a book about how to manage Kafka in production? There are many options now for persistent data queues.

Kafka 245
article thumbnail

Designing A Non-Relational Database Engine

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Version Your Data Lakehouse Like Your Software With Nessie

Data Engineering Podcast

Summary Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. Data lakes are notoriously complex. What is involved in integrating Nessie into a given data stack?

Data Lake 147
article thumbnail

Unlocking Your dbt Projects With Practical Advice For Practitioners

Data Engineering Podcast

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics.

Project 147
article thumbnail

[O’Reilly Book] Chapter 1: Why Data Quality Deserves Attention Now

Monte Carlo

Your downstream data consumers including product analysts, marketing leaders, and sales teams rely on data-driven tools like CRMs, CXPs, CMSs, and any other acronym under the sun to do their jobs quickly and effectively. But what happens when the data is wrong?

article thumbnail

What is Data Fabric: Architecture, Principles, Advantages, and Ways to Implement

AltexSoft

A data fabric is an architecture design presented as an integration and orchestration layer built on top of multiple disjointed data sources like relational databases , data warehouses , data lakes, data marts , IoT , legacy systems, etc., to provide a unified view of all enterprise data.

article thumbnail

What is Data Observability? 5 Key Pillars To Know

Monte Carlo

Lineage : When data breaks, the first question is always “where?” Data lineage provides the answer by telling you which upstream sources and downstream ingestors were impacted, as well as which teams are generating the data and who is accessing it. The original article that launched a category: What is data observability?