Data Engineering Weekly #123

Data Engineering Weekly

Uber: Setting Uber’s Transactional Data Lake in Motion with Incremental ETL Using Apache Hudi. Uber writes a comprehensive guide on running incremental ETL using Apache Hudi. How do you handle these models without compromising scale and usability? Map table vs. using complex data structure?

Data Engineering Weekly #124

Data Engineering Weekly

Come and hear talks from companies like StarTree, Confluent, LinkedIn, DoorDash, Imply, and Uber on how they are advancing the state-of-the-art in user-facing analytics delivered instantly. The blog highlights that the job is not just writing SQL but providing a strategic business solution for an organization.

Expert Roundtable: How to Build Real-Time Personalization and Recommendation Systems

Rockset

I recently had the good fortune to host a small-group discussion on personalization and recommendation systems with two technical experts with years of experience at FAANG and other web-scale companies. Prior to that, Prabhu was head of core infrastructure at Pinterest. View the blog summary and video here.

Real-Time Data Predictions for 2023

Rockset

This blog compiles real-time data predictions from industry leaders so you know what’s coming in 2023. Confluent’s State of Data in Motion Report found that 97% of companies around the world are using streaming data, making it central to the data landscape.

[O’Reilly Book] Chapter 1: Why Data Quality Deserves Attention Now

Monte Carlo

And as organizations increasingly leverage data and build more and more complex data ecosystems and infrastructure, this problem is only slated to increase. And finally, we’ll take a closer look at how best-in-class teams can achieve high data quality at each stage of the data pipeline and what it takes to maintain data trust at scale.

Data Engineering Weekly #165

Data Engineering Weekly

The blog further emphasizes its increased investment in Data Mesh and clean data. The blog is an excellent overview of all the improvements made to PySpark in 2023. Uber: Scaling AI/ML Infrastructure at Uber. The advancement in AI/ML brings significant challenges for infrastructure to scale and support.

List of Top Data Science Platforms in 2023

Knowledge Hut

In this blog, we go through what a Data Science Platform is, the different types of platforms, and how they can bring value to the business so that large corporations can stay in the race to conquer the market of the future. It is working to make AI and ML easier. Top Data Science Platforms 1. Platform H2O.ai