Maintain Your Data Engineers' Sanity By Embracing Automation

Data Engineering Podcast

Summary: Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows.

Joe Reis Flips The Script And Interviews Tobias Macey About The Data Engineering Podcast

Data Engineering Podcast

Summary: Data engineering is a large and growing subject, with new technologies, specializations, and "best practices" emerging at an accelerating pace. RudderStack helps you build a customer data platform on your warehouse or data lake. Data teams are increasingly under pressure to deliver.

Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet

Data Engineering Podcast

In this episode, CEO and founder Salma Bakouk shares her views on the causes and impacts of "data entropy" and how you can tame it before it leads to failures. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. In fact, while only 3.5%

Most important Data Engineering Concepts and Tools for Data Scientists

DareData

Learn the most important data engineering concepts that data scientists should be aware of. As the field of data science and machine learning continues to evolve, it is increasingly evident that data engineering cannot be separated from it. DigDag: An open-source orchestrator for data engineering workflows.

MongoDB CDC: When to Use Kafka, Debezium, Change Streams and Rockset

Rockset

MongoDB has grown from a basic JSON key-value store to one of the most popular NoSQL database solutions in use today. These attributes have caused MongoDB to be widely adopted, especially alongside JavaScript web applications.

Clean Up Your Data Using Scalable Entity Resolution And Data Mastering With Zingg

Data Engineering Podcast

Summary: Despite the best efforts of data engineers, data is as messy as the real world. Entity resolution and fuzzy matching are powerful utilities for cleaning up data from disconnected sources, but they have typically required custom development and training machine learning models. In fact, while only 3.5%

An Exploration Of The Open Data Lakehouse And Dremio's Contribution To The Ecosystem

Data Engineering Podcast

Summary: The "data lakehouse" architecture balances the scalability and flexibility of data lakes with the ease of use and transaction support of data warehouses. Mention the podcast to get a free "In Data We Trust World Tour" t-shirt. Data teams are increasingly under pressure to deliver.
