Running Unified PubSub Client in Production at Pinterest

Pinterest Engineering

A central component of data ingestion infrastructure at Pinterest is our PubSub stack, and the Logging Platform team currently runs deployments of Apache Kafka and MemQ. […] years since our previous blog post, PSC has been battle-tested at large scale at Pinterest with notably positive feedback and results.

Kafka 99

Revolutionizing Real-Time Streaming Processing: 4 Trillion Events Daily at LinkedIn

LinkedIn Engineering

Authors: Bingfeng Xia and Xinyu Liu. Background: At LinkedIn, Apache Beam plays a pivotal role in stream processing infrastructures that process over 4 trillion events daily through more than 3,000 pipelines across multiple production data centers. The release of Apache Beam in 2016 proved to be a game-changer for LinkedIn.

Process 119

Druid Deprecation and ClickHouse Adoption at Lyft

Lyft Engineering

Introduction: At Lyft, we have used systems like ClickHouse and Apache Druid for near real-time and sub-second analytics. In this particular blog post, we explain how Druid has been used at Lyft and what led us to adopt ClickHouse for our sub-second analytic system. ioConfig: Kafka server info, topic names, etc. (ex.

Kafka 104

Data Engineering Weekly #168

Data Engineering Weekly

The blog narrates how Chronon fits into Stripe’s online and offline requirements. Grab: Enabling near real-time data analytics on the data lake. Apache Hudi’s Merge On Read (MoR) is a game changer in developing low-latency analytics on top of the data lake.

An Engineering Guide to Data Quality - A Data Contract Perspective - Part 2

Data Engineering Weekly

I won’t bore you with the importance of data quality in the blog. Data Testing vs. Data Observability: Data testing and data observability are two important aspects of data quality. Data testing ensures that data meets specific requirements. The Fronting Kafka pattern follows a two-cluster approach.

Fraud Detection with Cloudera Stream Processing Part 1

Cloudera

In a previous blog of this series, Turning Streams Into Data Products, we talked about the increased need for reducing the latency between data generation/ingestion and producing analytical results and insights from this data. This blog will be published in two parts. This is what we call the first-mile problem.

Process 83

Advanced Testing Techniques for Spring Kafka

Confluent

Apache Kafka®. All of these share one thing in common: complexity in testing. This is the final blog […]. Asynchronous boundaries. Frameworks. Configuring frameworks. Now imagine them combined—it gets much harder.

Kafka 98