Data Engineer, Data Workflow and SQL - Data Engineering Digest

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

FEBRUARY 4, 2024

Summary: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. Can you describe what RisingWave is and the story behind it?

SQL

SQL Data Lake High Quality Data Data Pipeline

Snowflake’s New Python API Empowers Data Engineers to Build Modern Data Pipelines with Ease

Snowflake

APRIL 17, 2024

In today’s data-driven world, developer productivity is essential for organizations to build effective and reliable products, accelerate time to value, and fuel ongoing innovation. Dive in to experience how the enhanced Python API streamlines your data workflows and unlocks the full potential of Python within Snowflake.

Data Pipeline

Data Pipeline Python Data Engineering Data Engineer

Designing A Non-Relational Database Engine

Data Engineering Podcast

APRIL 14, 2024

In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication. Data lakes are notoriously complex.

Non-relational Database

Non-relational Database Relational Database Database Designing

Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer

Data Engineering Podcast

APRIL 7, 2024

Summary: Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. Data lakes are notoriously complex. Your first 30 days are free!

Data Lake

Data Lake High Quality Data BI Data Workflow

Making Email Better With AI At Shortwave

Data Engineering Podcast

APRIL 21, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.

Data Lake

Data Lake High Quality Data Data Pipeline Machine Learning

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

FEBRUARY 18, 2024

Summary: A data lakehouse is intended to combine the benefits of data lakes (cost-effective, scalable storage and compute) and data warehouses (user-friendly SQL interface). Data lakes are notoriously complex. Visit dataengineeringpodcast.com/data-council today. Your first 30 days are free!

Data Lake

Data Lake High Quality Data Data Warehouse Google Cloud

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Kafka

Kafka Data Lake High Quality Data SQL

Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary

Data Engineering Podcast

MARCH 31, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. Datafold has recently launched data replication testing, providing ongoing validation for source-to-target replication. Your first 30 days are free!

Project

Project Data Lake High Quality Data Data Workflow

Build Your Second Brain One Piece At A Time

Data Engineering Podcast

APRIL 28, 2024

In this episode he explains the data collection and preparation process, the collection of model types and sizes that work together to power the experience, and how to incorporate it into your workflow to act as a second brain. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.

Building

Building Data Lake High Quality Data Machine Learning

How to Become a Data Engineer in 2024?

Knowledge Hut

DECEMBER 26, 2023

Data Engineering is typically a software engineering role that focuses deeply on data – namely, data workflows, data pipelines, and the ETL (Extract, Transform, Load) process. What is Data Science? What are the roles and responsibilities of a Data Engineer? And many more.

Data Engineering

Data Engineering Data Engineer Engineering Pipeline-centric

Reconciling The Data In Your Databases With Datafold

Data Engineering Podcast

MARCH 17, 2024

Summary: A significant portion of data workflows involves storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. Your first 30 days are free!

Database

Database Data Lake High Quality Data Data Workflow

Top 10 Azure Data Engineer Job Opportunities in 2024 [Career Options]

Knowledge Hut

MARCH 28, 2024

Data engineering is one of them. According to AnalytixLabs, the data science market is expected to be worth USD 230.80 All these numbers point to one thing: increased job roles and careers, especially when we talk about data engineering jobs in Azure, which are on the rise every year. Let’s get started.

Data Engineering

Data Engineering Data Engineer Engineering Data Warehouse

Designing Data Transfer Systems That Scale

Data Engineering Podcast

DECEMBER 3, 2023

Andrei Tserakhau has dedicated his career to this problem, and in this episode he shares the lessons that he has learned and the work he is doing on his most recent data transfer system at DoubleCloud. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

Systems

Systems Designing Data Lake SQL

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

MARCH 24, 2024

In this episode Pete Hunt, CEO of Dagster labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!

Data Lake

Data Lake High Quality Data Hadoop Data Pipeline

Azure Data Engineer Job Description [Roles and Responsibilities]

Knowledge Hut

SEPTEMBER 25, 2023

This demonstrates how in-demand Microsoft Certified Data Engineers are becoming. They are moving their servers and on-premises data to Azure Cloud. What does all of this mean for Data Engineering professionals? Who is an Azure Data Engineer? Azure Data Engineers work with these and other solutions.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

Addressing The Challenges Of Component Integration In Data Platform Architectures

Data Engineering Podcast

NOVEMBER 26, 2023

In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team. Data lakes are notoriously complex. With Materialize, you can!

Architecture

Architecture Data Lake High Quality Data SQL

Building Linked Data Products With JSON-LD

Data Engineering Podcast

SEPTEMBER 17, 2023

Summary: A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. With Materialize, you can! Hex brings everything together.

Building

Building BI SQL Python

Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel

Data Engineering Podcast

JANUARY 7, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. What are the open questions today in technical scalability of data engines?

Data Process

Data Process Process Data Lake High Quality Data

Data Sharing Across Business And Platform Boundaries

Data Engineering Podcast

FEBRUARY 11, 2024

In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process. Support Data Engineering Podcast. Summary: Sharing data is a simple concept, but complicated to implement well.

Data Lake

Data Lake High Quality Data Government Data Pipeline

Build A Data Lake For Your Security Logs With Scanner

Data Engineering Podcast

JANUARY 28, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.

Data Lake

Data Lake Building High Quality Data AWS

When And How To Conduct An AI Program

Data Engineering Podcast

MARCH 3, 2024

Colleen Tartow has worked across all stages of the data lifecycle, and in this episode she shares her hard-earned wisdom about how to conduct an AI program for your organization. Data lakes are notoriously complex. Visit dataengineeringpodcast.com/data-council and use code dataengpod20 to register today!

Programming

Programming Data Lake High Quality Data Data Pipeline

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

FEBRUARY 25, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Data lakes are notoriously complex. Go to dataengineeringpodcast.com/dagster today to get started.

Database

Database Technology Data Lake High Quality Data

Modern Customer Data Platform Principles

Data Engineering Podcast

JANUARY 21, 2024

In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).

Data Lake

Data Lake High Quality Data NoSQL Data Warehouse

Designing Data Platforms For Fintech Companies

Data Engineering Podcast

DECEMBER 31, 2023

In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.

Designing

Designing Data Lake High Quality Data SQL

Unlocking Your dbt Projects With Practical Advice For Practitioners

Data Engineering Podcast

NOVEMBER 19, 2023

Summary: The dbt project has become overwhelmingly popular across analytics and data engineering teams. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data projects are notoriously complex. Data lakes are notoriously complex.

Project

Project Data Lake High Quality Data SQL

Version Your Data Lakehouse Like Your Software With Nessie

Data Engineering Podcast

MARCH 10, 2024

In this episode Alex Merced explains how the branching and merging functionality in Nessie allows you to use the same versioning semantics for your data lakehouse that you are used to from Git. Data lakes are notoriously complex. Visit dataengineeringpodcast.com/data-council and use code dataengpod20 to register today!

Data Lake

Data Lake High Quality Data Data Pipeline Architecture

What Is Data Engineering And What Does A Data Engineer Do?

Meltano

OCTOBER 5, 2022

Interested in becoming a data engineer? The need for data experts in the U.S. job market is expected to grow by 22% in this decade, and according to LinkedIn’s 2020 report, a data engineer is listed as the 8th fastest growing job today. But what is data engineering exactly and what does a data engineer do?

Data Engineering

Data Engineering Data Engineer Engineering Raw Data

Shining Some Light In The Black Box Of PostgreSQL Performance

Data Engineering Podcast

NOVEMBER 5, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

PostgreSQL

PostgreSQL Data Lake High Quality Data SQL

Data Engineering Weekly #114

Data Engineering Weekly

JANUARY 15, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Pipelines for data in motion can quickly turn into DAG hell.

Data Engineering

Data Engineering Data Engineer Engineering Metadata

10 Essential Azure Data Engineer Skills to Improve in 2023

Knowledge Hut

NOVEMBER 17, 2023

Azure Data Engineers play an important role in building efficient, secure, and intelligent data solutions on Microsoft Azure's powerful platform. The position of Azure Data Engineers is becoming increasingly important as businesses attempt to use the power of data for strategic decision-making and innovation.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

Top 20 Azure Data Engineering Projects in 2023 [Source Code]

Knowledge Hut

NOVEMBER 2, 2023

Azure Data engineering projects are complicated and require careful planning and effective team participation for a successful completion. While many technologies are available to help data engineers streamline their workflows and guarantee that each aspect meets its objectives, ensuring that everything works properly takes time.

Data Engineering

Data Engineering Data Engineer Project Coding

Adding An Easy Mode For The Modern Data Stack With 5X

Data Engineering Podcast

DECEMBER 17, 2023

In this episode founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

Data Lake

Data Lake High Quality Data SQL Architecture

Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine

Data Engineering Podcast

NOVEMBER 12, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Software Engineer

Software Engineer Software Engineering Engineering Data Lake

Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack

Data Engineering Podcast

DECEMBER 10, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. You shouldn't have to throw away the database to build with fast-changing data. It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products.

Data Lake

Data Lake High Quality Data SQL Architecture

A Complete Guide to Azure Data Engineer Certification (DP-203)

Knowledge Hut

DECEMBER 28, 2023

Its comprehensive suite of services can handle data at scale. It’s no surprise that the demand for certified Azure data engineers has skyrocketed. Today, Azure Data Engineer certification is an invaluable asset for those looking to excel in the field of data engineering. Who is an Azure Data Engineer?

Certification

Certification Data Engineering Data Engineer Engineering

Azure Data Engineer (DP-203) Certification Cost in 2023

Knowledge Hut

SEPTEMBER 29, 2023

This growth is creating a strong demand for data experts, especially Azure data engineers. But who are Azure data engineers, and what do they do? Moreover, what benefits can you expect from a career in Azure Data Engineering? Why Should You Get an Azure Data Engineer Certification?

Certification

Certification Data Engineering Data Engineer Engineering

Upgrade your Modern Data Stack

Christophe Blefari

SEPTEMBER 28, 2023

We need to store, process and visualise data, everything else is just marketing. I often say that data engineering is boring, insanely boring. When you are a data engineer you're getting paid to build systems that people can rely on. An easy-to-manage central storage and querying and transforming layer in SQL.

Cloud Storage

Cloud Storage Big Data Hadoop SQL

Accelerate Development Of Enterprise Analytics With The Coalesce Visual Workflow Builder

Data Engineering Podcast

APRIL 3, 2022

Summary: The flexibility of software-oriented data workflows is useful for fulfilling complex requirements, but for simple and repetitious use cases it adds significant complexity. In this episode Satish Jayanthi explains how he is building a framework to allow enterprises to move quickly while maintaining guardrails for data workflows.

Data Warehouse

Data Warehouse Data Workflow Data Architecture SQL

Making Sense Of The Technical And Organizational Considerations Of Data Contracts

Data Engineering Podcast

DECEMBER 18, 2022

In this episode Abe Gong brings his experiences with the Great Expectations project and community to discuss the technical and organizational considerations involved in implementing these constraints in your data workflows. Missing data? Struggling with broken pipelines? Stale dashboards?

Metadata

Metadata Business Intelligence Data Lake BI

Unlocking The Value Of Data Across The Organization Through User Friendly Data Tools With Prophecy

Data Engineering Podcast

MAY 22, 2022

With an eye to making data workflows more accessible to everyone in an organization, Raj Bains and his team at Prophecy designed a powerful and extensible low-code platform that lets technical and non-technical users scale data flows without forcing everyone into the same layers of abstraction.

Scala

Scala SQL Data Data Workflow

Be Confident In Your Data Integration By Quickly Validating Matching Records With data-

Data Engineering Podcast

JULY 3, 2022

Summary: The perennial challenge of data engineers is ensuring that information is integrated reliably. In order to quickly identify if and how two data systems are out of sync, Gleb Mezhanskiy and Simon Eskildsen partnered to create the open source data-diff utility. Data teams are increasingly under pressure to deliver.

Data Integration

Data Integration MongoDB Scala MySQL

A Reflection On The Data Ecosystem For The Year 2021

Data Engineering Podcast

JANUARY 1, 2022

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. Missing data?

Data Warehouse

Data Warehouse Data Lake SQL Hadoop

Doing DataOps For External Data Sources As A Service at Demyst

Data Engineering Podcast

NOVEMBER 27, 2021

If you are having trouble answering questions for your business with the data that you generate and collect internally, then it is definitely worthwhile to explore the information available from external sources. Missing data? The data you’re looking for is already in your data warehouse and BI tools.

Data Warehouse

Data Warehouse Data Lake BI Business Intelligence

Data Exploration For Business Users Powered By Analytics Engineering With Lightdash

Data Engineering Podcast

OCTOBER 22, 2021

In this episode Oliver Laslett describes why dashboards aren’t sufficient for business analytics, how Lightdash promotes the work that you are already doing in your data warehouse modeling with dbt, and how they are focusing on bridging the divide between data teams and business teams and the requirements that they have for data workflows.

Engineering

Engineering Business Intelligence BI Data Warehouse
