Clean Up Your Data Using Scalable Entity Resolution And Data Mastering With Zingg

Data Engineering Podcast

Summary Despite the best efforts of data engineers, data is as messy as the real world. Entity resolution and fuzzy matching are powerful utilities for cleaning up data from disconnected sources, but they have typically required custom development and training machine learning models.

An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications

Data Engineering Podcast

Summary Data has permeated every aspect of our lives and the products that we interact with. As a result, end users and customers have come to expect interactions and updates with services and analytics to be fast and up to date.

Performing Fast Data Analytics Using Apache Kudu - Episode 64

Data Engineering Podcast

Summary The Hadoop platform is purpose-built for processing large, slow-moving data in long-running batch jobs. As the ecosystem around it has grown, so has the need for fast data analytics on fast-moving data.

Maintain Your Data Engineers' Sanity By Embracing Automation

Data Engineering Podcast

Summary Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows.

Analytics Engineering Without The Friction Of Complex Pipeline Development With Optimus and dbt

Data Engineering Podcast

Summary One of the most impactful technologies for data analytics in recent years has been dbt. It’s hard to have a conversation about data engineering or analysis without mentioning it. Despite its widespread adoption, there are still rough edges in its workflow that cause friction for data analysts.

Operational Analytics To Increase Efficiency For Multi-Location Businesses With OpsAnalitica

Data Engineering Podcast

In this episode, Tommy Yionoulis shares his experiences working in the service and hospitality industries and how that led him to found OpsAnalitica, a platform for collecting and analyzing metrics on multi-location businesses and their operational practices.

Making Analytical APIs Fast With Tinybird

Data Engineering Podcast

Summary Building an API for real-time data is a challenging project. Making it robust, scalable, and fast is a full-time job. The team at Tinybird wants to make it easy to turn a continuous stream of data into a production-ready API or data product.