Improving Efficiency Of Goku Time Series Database at Pinterest (Part 1)

Pinterest Engineering

In the first blog, we will share a short summary of the GokuS and GokuL architecture, the data format for Goku Long Term, and how we improved the bootstrap time for our storage and serving components. This is because the synthetic data points would be present in the retry Kafka, waiting to be pushed into the recovering host by the retry ingestor.

Apache Kafka Deployments and Systems Reliability – Part 1

Cloudera

There are many ways that Apache Kafka has been deployed in the field. In our Kafka Summit 2021 presentation, we gave a brief overview of the many different configurations that have been observed to date. In this blog series, we will discuss each of these deployments and the deployment choices made, along with how they impact reliability.

Kafka 115

Operating Apache Kafka with Cruise Control

Cloudera

There are two big gaps in the Apache Kafka project when we think of operating a cluster. There are no solutions for these inside the Kafka project, but there are many good third-party tools for both problems. Cruise Control is integrated with Kafka through metrics reporting. About Cruise Control. Architecture. Metrics Reporting.

Kafka 71

Using Kafka Connect Securely in the Cloudera Data Platform

Cloudera

In this post I will demonstrate how Kafka Connect is integrated in the Cloudera Data Platform (CDP), allowing users to manage and monitor their connectors in Streams Messaging Manager while also touching on security features such as role-based access control and sensitive information handling. Kafka Connect. Streams Messaging Manager.

Kafka 72

Kafka Listeners – Explained

Confluent

Put another way, courtesy of Spencer Ruport: LISTENERS are what interfaces Kafka binds to. Apache Kafka® is a distributed system. You need to tell Kafka how the brokers can reach each other but also make sure that external clients (producers/consumers) can reach the broker they need to reach.

Kafka 99

Staying in the Zone: How DoorDash used a service mesh to manage data transfer, reducing hops and cloud spend

DoorDash Engineering

In this blog post, we describe the journey DoorDash took using a service mesh to realize data transfer cost savings without sacrificing service quality. Storage traffic: Includes traffic from microservices to stateful systems such as Aurora PostgreSQL, CockroachDB, Redis, and Kafka.

Bytes 84

Data Engineering Weekly #151

Data Engineering Weekly

GitHub writes an excellent blog to capture the current state of the LLM integration architecture. The blog is an excellent read to understand late-arriving data, backfilling, and incremental processing complications. Sophie Blee-Goldman: Kafka Streams and Rebalancing through the Ages. Consumers come and go.