Designing A Non-Relational Database Engine

Data Engineering Podcast

Summary: Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication.

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

Summary: A significant portion of data workflows involve storing and processing information in database engines. In this episode Gleb Mezhanskiy, founder and CEO of Datafold, discusses the different error conditions and solutions that you need to know about to ensure the accuracy of your data. Data lakes are notoriously complex.

Database 147

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

Summary: Building a database engine requires a substantial amount of engineering effort and time investment. In this episode the guest explains how he used the combination of Apache Arrow, Flight, DataFusion, and Parquet to lay the foundation of the newest version of his time-series database.

Database 162

Release Management For Data Platform Services And Logic

Data Engineering Podcast

I listened to the recent episode "Transforming Your Database" and appreciated the valuable advice on how to approach the selection and integration of new databases in applications and the impact on team dynamics.

Troubleshooting Kafka In Production

Data Engineering Podcast

Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack. You shouldn't have to throw away the database to build with fast-changing data. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products.

Kafka 245

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.

SQL 173

Monte Carlo Announces Support for Kafka and Vector Databases at IMPACT 2023

Monte Carlo

Kafka and Vector Database support: According to Databricks’ State of Data and AI report, the number of companies using SaaS LLM APIs has grown more than 1300% since November 2022, with a nearly 411% increase in the number of AI models put into production during that same period. Both integrations will be available in early 2024.

Kafka 64