Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

Summary: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. In this episode Yingjun Wu explains how RisingWave is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex.

Creating Shared Context For Your Data Warehouse With A Controlled Vocabulary

Data Engineering Podcast

In this episode, Emily Riederer shares her work to create a controlled vocabulary for managing the semantic elements of the data managed by her team and encoding it in the schema definitions in her data warehouse. Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams.

Modern Customer Data Platform Principles

Data Engineering Podcast

A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

Netflix Tech

Usually, data scientists and engineers write Extract-Transform-Load (ETL) jobs and pipelines using big data compute technologies, like Spark or Presto, to process this data and periodically compute key information for a member or a video. The processed data is typically stored as data warehouse tables in AWS S3.

Top 10 Azure Data Engineer Job Opportunities in 2024 [Career Options]

Knowledge Hut

Implemented and managed data storage solutions using Azure services like Azure SQL Database, Azure Data Lake Storage, and Azure Cosmos DB. Education & Skills Required: Proficiency in SQL, Python, or other programming languages. Develop predictive models and data-driven solutions to address business challenges.

Accelerate Development Of Enterprise Analytics With The Coalesce Visual Workflow Builder

Data Engineering Podcast

Summary: The flexibility of software-oriented data workflows is useful for fulfilling complex requirements, but for simple and repetitious use cases it adds significant complexity. In this episode Satish Jayanthi explains how he is building a framework to allow enterprises to move quickly while maintaining guardrails for data workflows.