Running Unified PubSub Client in Production at Pinterest

Pinterest Engineering

A central component of data ingestion infrastructure at Pinterest is our PubSub stack, and the Logging Platform team currently runs deployments of Apache Kafka and MemQ. … years since our previous blog post, PSC has been battle-tested at large scale at Pinterest, with notably positive feedback and results.

Big Data Technologies that Everyone Should Know in 2024

Knowledge Hut

What was once popular and in demand can quickly become outdated. In this blog post, we will discuss such technologies. What Are Big Data Technologies? There are a variety of big data processing technologies available, including Apache Hadoop, Apache Spark, and MongoDB.

Druid Deprecation and ClickHouse Adoption at Lyft

Lyft Engineering

Introduction At Lyft, we have used systems like ClickHouse and Apache Druid for near real-time and sub-second analytics. In this particular blog post, we explain how Druid has been used at Lyft and what led us to adopt ClickHouse for our sub-second analytic system. ioConfig: Kafka server info, topic names, etc.

Analysis of Confluent Buying Immerok

Jesse Anderson

I’ve always been vocal about ksqlDB’s and Kafka Streams’ limitations. The Future of ksqlDB and Kafka Streams With this announcement, the future of primarily ksqlDB and, to a lesser extent, Kafka Streams comes into view. Since Kafka Streams is part of the Apache project, I don’t see it going away as quickly.

Streaming Ingestion for Apache Iceberg With Cloudera Stream Processing

Cloudera

Recently, we announced enhanced multi-function analytics support in Cloudera Data Platform (CDP) with Apache Iceberg. The CSP engine is powered by Apache Flink, which is the best-in-class processing engine for stateful streaming pipelines. Iceberg is a high-performance open table format for huge analytic data sets.

Streams Replication Manager Prefixless Replication

Cloudera

Streams Replication Manager (SRM) is an enterprise-grade replication solution that enables fault-tolerant, scalable, and robust cross-cluster Kafka topic replication. SRM replicates data at high performance and keeps topic properties in sync across clusters. ACL and configuration changes are not synced across mirrored clusters.

A Closer Look at The Next Phase of Cloudera’s Hybrid Data Lakehouse

Cloudera

Cloudera is now the only provider to offer an open data lakehouse with Apache Iceberg for cloud and on-premises. Apache Ozone As AI and other advanced analytics continue to grow in scale, performance and scalable data storage will need to expand right along with them. But even with its rise, AI is still a struggle for some enterprises.