Data News — Week 24.11

Christophe Blefari

Attributing Snowflake cost to whom it belongs: Fernando shares ideas on using metadata management to better attribute Snowflake cost. Obviously Benoit prefers Kestra, at the expense of writing YAML and running a Java application. Unlocking Kafka's potential: tackling tail latency with eBPF. This is Croissant.

Metadata 272

Running Unified PubSub Client in Production at Pinterest

Pinterest Engineering

A central component of data ingestion infrastructure at Pinterest is our PubSub stack, and the Logging Platform team currently runs deployments of Apache Kafka and MemQ. Given that around 50% of Java clients at Pinterest are on Flink, PSC integration with Flink was key to achieving our platform goals of fully migrating Java clients to PSC.

Kafka 99

The Good and the Bad of Apache Kafka Streaming Platform

AltexSoft

Kafka could join the list of brand names that became generic terms for an entire type of technology. In this article, we’ll explain why businesses choose Kafka and what problems they face when using it.

Kafka 93

Optimizing Kafka Clients: A Hands-On Guide

Rock the JVM

Apache Kafka is a well-known event streaming platform used in many organizations worldwide. The focus of this article is to provide a better understanding of how Kafka works under the hood, so you can better design and tune your client applications. Environment setup: first, we want to have a Kafka cluster up and running.

Kafka 65

The Importance of Distributed Tracing for Apache-Kafka-Based Applications

Confluent

Apache Kafka®-based applications stand out for their ability to decouple producers and consumers using an event log as an intermediate layer. This article describes how to instrument Kafka-based applications with distributed tracing capabilities in order to make dataflows between event-based components more visible.

Kafka 111

Build AI-powered Recommendations with Confluent Cloud for Apache Flink® and Rockset

Rockset

While it's well known that Flink excels at filtering, joining and enriching streaming data from Apache Kafka® or Confluent Cloud, what is less known is that it is increasingly becoming ingrained in the end-to-end stack for AI-powered applications. These additional inputs are referred to as metadata filtering.

Cloud 64

Data Reprocessing Pipeline in Asset Management Platform @Netflix

Netflix Tech

This platform has evolved from supporting studio applications to also supporting data science and machine-learning applications that discover asset metadata and build various data facts. During this evolution, we have often received requests to update existing asset metadata or to add new metadata for newly added features.