Data Engineering Best Practices - #2. Metadata & Logging

Start Data Engineering

1. Introduction 2. Setup & Logging architecture 3. Data Pipeline Logging Best Practices 3.1. Obtain visibility into the code’s execution sequence using text logs 3.2. Metadata: Information about pipeline runs, & data flowing through your pipeline 3.3. Understand resource usage by tracking Metrics 3.4.


Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata

Data Engineering Podcast

Summary: A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. After experiencing the impacts of fragmented metadata and previous attempts at building a solution, Suresh Srinivas and Sriharsha Chintalapani created the OpenMetadata project.


Metadata Management And Integration At LinkedIn With DataHub

Data Engineering Podcast

The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub. If you hand a book to a new data engineer, what wisdom would you add to it?


How to learn data engineering

Christophe Blefari

Learn data engineering, all the references (credits). This is a special edition of the Data News. Right now I'm on holiday, finishing a hiking week in Corsica 🥾, so I wrote this special edition about how to learn data engineering in 2024. The idea is to create a living reference about Data Engineering.


Our First Netflix Data Engineering Summit

Netflix Tech

Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community! Learn more about how batch and streaming data pipelines are built at Netflix.


Bringing The Power Of The DataHub Real-Time Metadata Graph To Everyone At Acryl Data

Data Engineering Podcast

Summary: The binding element of all data work is the metadata graph that is generated by all of the workflows that produce the assets used by teams across the organization. What are some examples of automated actions that can be triggered from metadata changes? What are the available events that can be used to trigger actions?


Data Engineering Weekly #164

Data Engineering Weekly

Meta: Logarithm - A logging engine for AI training workflows and services. Logarithm indexes 100+ GB/s of logs in real time and serves thousands of queries a second. The logging engine to debug AI workflow logs is an excellent system design study if you’re interested in it.