Level Up Your Data Platform With Active Metadata

Data Engineering Podcast

Summary: Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. To increase its value, a new trend of active metadata is being implemented, enabling use cases like keeping BI reports up to date, auto-scaling your warehouses, and automated data governance.

Metadata 130

Building A Data Mesh Platform At PayPal

Data Engineering Podcast

Jean-Georges Perrin was tasked with designing a new data platform implementation at PayPal and wound up building a data mesh. It's supposed to make building smarter, faster, and more flexible data infrastructures a breeze. We feel your pain. It ends up being anything but that. When is a data mesh the wrong choice?

Building 147

Metadata Management And Integration At LinkedIn With DataHub

Data Engineering Podcast

The key to those solutions is a robust and flexible metadata management system. LinkedIn has gone through several iterations on the most maintainable and scalable approach to metadata, leading them to their current work on DataHub. What were you using at LinkedIn for metadata management prior to the introduction of DataHub?

Metadata 100

Build AI-powered Recommendations with Confluent Cloud for Apache Flink® and Rockset

Rockset

In this blog, we’ll discuss how RAG fits into the paradigm of real-time data processing and show an example product recommendation application using both Kafka and Flink on Confluent Cloud together with Rockset. Building a real-time, contextual and trustworthy knowledge base for AI applications revolves around RAG pipelines.

Cloud 64

Introducing Cloudera DataFlow Designer: Self-service, No-Code Dataflow Design

Cloudera

Now, we shift focus to the needs of developers and the challenges they face when building dataflows in the cloud. We’ve observed organizations using more and more data sources and destinations, as well as expecting a more diverse range of developers to build data movement flows. Enabling self-service for developers.

Cloudera DataFlow Designer: The Key to Agile Data Pipeline Development

Cloudera

We just announced the general availability of Cloudera DataFlow Designer, bringing self-service data flow development to all CDP Public Cloud customers. In our previous DataFlow Designer blog post, we introduced you to the new user interface and highlighted its key capabilities.

Building Real-time Machine Learning Foundations at Lyft

Lyft Engineering

On the flip side, there was a substantial appetite to build real-time ML systems from developers at Lyft. In this blog post, we will discuss what we built in support of that goal and some of the lessons we learned along the way. To meet the needs of our customers, we kicked off the Real-time Machine Learning with Streaming initiative.