Remove Data Engineering Remove Data Ingestion Remove Data Lake Remove Scala
article thumbnail

Maintain Your Data Engineers' Sanity By Embracing Automation

Data Engineering Podcast

Summary Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows.

article thumbnail

Joe Reis Flips The Script And Interviews Tobias Macey About The Data Engineering Podcast

Data Engineering Podcast

Summary Data engineering is a large and growing subject, with new technologies, specializations, and "best practices" emerging at an accelerating pace. RudderStack helps you build a customer data platform on your warehouse or data lake. Data teams are increasingly under pressure to deliver.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet

Data Engineering Podcast

In this episode CEO and founder Salma Bakouk shares her views on the causes and impacts of "data entropy" and how you can tame it before it leads to failures. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows.

Data Lake 130
article thumbnail

Azure Data Engineer Resume

Edureka

Azure Data Engineering is a rapidly growing field that involves designing, building, and maintaining data processing systems using Microsoft Azure technologies. Contents: What is the role of an Azure Data Engineer?

article thumbnail

Clean Up Your Data Using Scalable Entity Resolution And Data Mastering With Zingg

Data Engineering Podcast

Summary Despite the best efforts of data engineers, data is as messy as the real world. Entity resolution and fuzzy matching are powerful utilities for cleaning up data from disconnected sources, but it has typically required custom development and training machine learning models. That’s where our friends at Ascend.io

MongoDB 130
article thumbnail

An Exploration Of The Open Data Lakehouse And Dremio's Contribution To The Ecosystem

Data Engineering Podcast

Summary The "data lakehouse" architecture balances the scalability and flexibility of data lakes with the ease of use and transaction support of data warehouses. Mention the podcast to get a free "In Data We Trust World Tour" t-shirt. Data teams are increasingly under pressure to deliver.

Data Lake 100
article thumbnail

Discover And De-Clutter Your Unstructured Data With Aparavi

Data Engineering Podcast

Summary Unstructured data takes many forms in an organization. From a data engineering perspective that often means things like JSON files, audio or video recordings, images, etc. Acryl Data provides DataHub as an easy to consume SaaS product which has been adopted by several companies. That’s where our friends at Ascend.io