Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

Data Engineering Podcast

Summary: One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. In this episode, Ori Rafael shares his experience at Upsolver building scalable stream processing for integrating and analyzing data, and discusses the tradeoffs involved when coming from a batch-oriented mindset.


Maintaining Your Data Lake At Scale With Spark

Data Engineering Podcast

Summary: Building and maintaining a data lake is a choose-your-own-adventure of tools, services, and evolving best practices. The flexibility and freedom that data lakes provide allow for generating significant value, but they can also lead to anti-patterns and inconsistent quality in your analytics.


Building A Data Lake For The Database Administrator At Upsolver

Data Engineering Podcast

Summary: Data lakes offer a great deal of flexibility and the potential for reduced cost for your analytics, but they also introduce a great deal of complexity. To bring the DBA into the new era of data management, the team at Upsolver added a SQL interface to their data lake platform.


Data Ingestion: 7 Challenges and 4 Best Practices

Monte Carlo

Data ingestion is the process of collecting data from various sources and moving it to your data warehouse or lake for processing and analysis. It is the first step in modern data management workflows. Source: Fundamentals of Data Engineering by Joe Reis and Matt Housley.


20+ Data Engineering Projects for Beginners with Source Code

ProjectPro

Data Warehousing: Data warehousing involves building and using a warehouse to store data. A data engineer interacts with this warehouse almost daily. Data Analytics: A data engineer works with different teams that will use that data for business solutions.