Metadata Management And Integration At LinkedIn With DataHub

Data Engineering Podcast

The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub. What were you using at LinkedIn for metadata management prior to the introduction of DataHub?

Metadata 100

Building Real-time Machine Learning Foundations at Lyft

Lyft Engineering

On the flip side, there was a substantial appetite to build real-time ML systems from developers at Lyft. In this blog post, we will discuss what we built in support of that goal and some of the lessons we learned along the way. To meet the needs of our customers, we kicked off the Real-time Machine Learning with Streaming initiative.

The Importance of Distributed Tracing for Apache-Kafka-Based Applications

Confluent

Apache Kafka®-based applications stand out for their ability to decouple producers and consumers using an event log as an intermediate layer. This article describes how to instrument Kafka-based applications with distributed tracing capabilities in order to make dataflows between event-based components more visible.

Kafka 111

Running Unified PubSub Client in Production at Pinterest

Pinterest Engineering

A central component of data ingestion infrastructure at Pinterest is our PubSub stack, and the Logging Platform team currently runs deployments of Apache Kafka and MemQ. years since our previous blog post, PSC has been battle-tested at large scale in Pinterest with notably positive feedback and results.

Kafka 99

Data Reprocessing Pipeline in Asset Management Platform @Netflix

Netflix Tech

This platform has evolved from supporting studio applications to supporting data science and machine-learning applications that discover asset metadata and build various data facts. During this evolution, we quite often receive requests to update existing asset metadata or to add new metadata for newly added features.

Data News — Week 22.48

Christophe Blefari

Joe Reis launched his Substack. Joe is the co-author of the great Fundamentals of Data Engineering, and his blog already has 2 articles I highly recommend: No extra credit for complexity & Groundhog Days. Anna shares a good checklist to build data team foundations.

Kafka 130

Ensuring the Successful Launch of Ads on Netflix

Netflix Tech

To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. It also included metadata about ads, such as ad placement and impression-tracking events. We stored these responses in a Keystone stream with outputs for Kafka and Elasticsearch.

Algorithm 136