Metadata Management And Integration At LinkedIn With DataHub

Data Engineering Podcast

The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub. What were you using at LinkedIn for metadata management prior to the introduction of DataHub?

Metadata 100

2. Diving Deeper into Psyberg: Stateless vs Stateful Data Processing

Netflix Tech

By Abhinaya Shetty, Bharath Mummadisetty. In the inaugural blog post of this series, we introduced you to the state of our pipelines before Psyberg and the challenges with incremental processing that led us to create the Psyberg framework within Netflix’s Membership and Finance data engineering team.

1. Streamlining Membership Data Engineering at Netflix with Psyberg

Netflix Tech

In this three-part blog post series, we introduce you to Psyberg, our incremental data processing framework designed to tackle such challenges! ... Using fixed lookback windows to always reprocess data, assuming that most late-arriving events will occur within that window. ... (... a cancel event for a missed signup would have had no effect).

A Look Back at the Gartner Data and Analytics Summit

Cloudera

Here are a couple of the biggest takeaways we had from our time at the event. In those discussions, it was clear that everyone understood the need to treat data estates more cohesively as a whole—that means bringing more attention to security, data governance, and metadata management, the latter of which has become increasingly popular.

Metadata 106

3. Psyberg: Automated end to end catch up

Netflix Tech

By Abhinaya Shetty, Bharath Mummadisetty. This blog post will cover how Psyberg helps automate the end-to-end catchup of different pipelines, including dimension tables. The session metadata table can then be read to determine the pipeline input. Metadata Recording: Metadata is persisted for traceability.

Ensuring the Successful Launch of Ads on Netflix

Netflix Tech

In this blog post, we’ll discuss the methods we used to ensure a successful launch, including how we tested the system, the Netflix technologies involved, and the best practices we developed. Realistic Test Traffic: Netflix traffic ebbs and flows throughout the day in a sinusoidal pattern. Basic with ads was launched worldwide on November 3rd.

Algorithm 136

Data Reprocessing Pipeline in Asset Management Platform @Netflix

Netflix Tech

This platform has evolved from supporting studio applications to supporting data science and machine-learning applications that discover asset metadata and build various data facts. During this evolution, we often receive requests to update existing asset metadata or to add new metadata for newly added features.