article thumbnail

Pushing The Limits Of Scalability And User Experience For Data Processing WIth Jignesh Patel

Data Engineering Podcast

Summary Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. What do you have planned for the future of your academic research?

article thumbnail

Composable data management at Meta

Engineering at Meta

In recent years, Meta’s data management systems have evolved into a composable architecture that creates interoperability, promotes reusability, and improves engineering efficiency. Data is at the core of every product and service at Meta. We needed to change our thinking to be able to move faster.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Data Engineering Weekly #173

Data Engineering Weekly

link] Meta: Composable data management at Meta Meta writes about its transition to a composable data management system to improve interoperability, reusability, and engineering efficiency. Is massive scale data warehouses like Snowflake or data processing engines like Spark require for incremental processing?

article thumbnail

Modern Data Engineering

Towards Data Science

Platform Specific Tools and Advanced Techniques Photo by Christopher Burns on Unsplash The modern data ecosystem keeps evolving and new data tools emerge now and then. In this article, I want to talk about crucial things that affect data engineers. It was created by Spotify to manage massive data processing workloads.

article thumbnail

Effective Pandas Patterns For Data Engineering

Data Engineering Podcast

As a result it has become a standard tool for data engineers for a wide range of applications. Matt Harrison is a Python expert with a long history of working with data who now spends his time on consulting and training. What are the main tasks that you have seen Pandas used for in a data engineering context?

article thumbnail

Aligning Velox and Apache Arrow: Towards composable data management

Engineering at Meta

This new convergence helps Meta and the larger community build data management systems that are unified, more efficient, and composable. Meta’s Data Infrastructure teams have been rethinking how data management systems are designed.

article thumbnail

Massively Parallel Data Processing In Python Without The Effort Using Bodo

Data Engineering Podcast

In this episode Ehsan Totoni explains how he built the Bodo project to bring the speed and processing power of HPC techniques to the Python data ecosystem without requiring any re-work. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription Struggling with broken pipelines?