Data Lake, Hadoop, Lambda Architecture and MongoDB

Aggregator Leaf Tailer: An Alternative to Lambda Architecture for Real-Time Analytics

Rockset

FEBRUARY 6, 2019

Traditional Data Processing: Batch and Streaming. MapReduce, most commonly associated with Apache Hadoop, is a pure batch system that often introduces significant lag between the arrival of new data and the availability of processed results. The final output is then written to a serving system such as Apache Cassandra, Elasticsearch or MongoDB.
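
The batch pattern described in this excerpt can be illustrated with a minimal sketch. It is not taken from the Rockset article: the function names, the event records, and the plain dict standing in for a serving system such as Cassandra, Elasticsearch or MongoDB are all assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(records):
    """Map step: emit a (key, 1) pair for every input event."""
    for rec in records:
        yield rec["page"], 1

def reduce_phase(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def run_batch(records, serving_store):
    """Recompute results over the full input and overwrite the serving store.

    serving_store is a plain dict here, a hypothetical stand-in for a real
    serving database.
    """
    results = reduce_phase(map_phase(records))
    serving_store.clear()
    serving_store.update(results)
    return serving_store

# Hypothetical events; a real job would read these from distributed storage.
events = [{"page": "home"}, {"page": "pricing"}, {"page": "home"}]
serving_store = {}
run_batch(events, serving_store)

# An event arriving after the batch is invisible until the next run,
# which is the time lag the excerpt refers to.
events.append({"page": "docs"})
stale = "docs" not in serving_store
```

Queries against the serving store only see data up to the last completed batch, so freshness is bounded by how often the whole job is rerun.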

Lambda Architecture, Architecture, MongoDB, Kafka

20+ Data Engineering Projects for Beginners with Source Code

ProjectPro

AUGUST 24, 2021

Learn how to process Wikipedia archives using Hadoop and identify the most-viewed pages in a day. Utilize Amazon S3 for storing data, Hive for data preprocessing, and Zeppelin notebooks for displaying trends and analysis. Understand the importance of Qubole in powering up Hadoop and Notebooks. The final step is Publish.
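
One reading of the page-analysis step in this project is ranking pages by their view counts for a single day. The sketch below shows that aggregation in plain Python rather than Hive. It assumes each input line has the space-separated layout `project page_title view_count bytes`. The sample lines and the `top_pages` helper are made up for illustration and are not part of the ProjectPro project.

```python
from collections import Counter

def top_pages(lines, project="en", n=3):
    """Sum view counts per page title for one project and return the top n."""
    views = Counter()
    for line in lines:
        parts = line.split()
        # Skip malformed lines and lines belonging to other projects.
        if len(parts) < 3 or parts[0] != project:
            continue
        try:
            views[parts[1]] += int(parts[2])
        except ValueError:
            # A non-numeric count field means the line is unusable.
            continue
    return views.most_common(n)

# Made-up sample lines in the assumed four-field layout.
sample = [
    "en Main_Page 120 0",
    "en Apache_Hadoop 45 0",
    "de Apache_Hadoop 7 0",
    "en Main_Page 30 0",
    "en Apache_Hive 12 0",
]
top = top_pages(sample, n=2)
```

In the project itself this grouping would presumably be expressed as a Hive query over data stored in Amazon S3. The Python version only illustrates the logic.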

Data Engineering, Data Engineer, Coding, Project

Data Engineering Digest
