Blog, Data, Engineering and Systems - Data Engineering Digest

I asked ChatGPT to write a blog post about Data Engineering. Here it is.

Confessions of a Data Guy

DECEMBER 29, 2022

Data engineering is a vital field within the realm of data science that focuses on the practical aspects of collecting, storing, and processing large amounts of data.

Data Engineering

Data Engineering Data Engineer Engineering IT

How to learn data engineering

Christophe Blefari

JANUARY 20, 2024

This is a special edition of the Data News. Right now I'm on holiday, finishing a hiking week in Corsica 🥾, so I wrote this special edition about how to learn data engineering in 2024. Who are the data engineers?

Data Engineering

Data Engineering Data Engineer Engineering Hadoop

Data Engineering Weekly #171

Data Engineering Weekly

MAY 12, 2024

This year, we added some features to bring the data community together. Gloss Genius: How We Migrated From dbt Cloud and Scaled Our Data Development. Gloss Genius describes its migration journey from dbt Cloud to Airflow + custom GitHub Actions. At scale, it becomes impossible to enrich the data assets manually.

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

GPT and LLMs from a Data Engineering Perspective

Jesse Anderson

SEPTEMBER 14, 2023

There has been quite a bit of writing covering GPT and LLMs from data science and business perspectives. I haven’t seen much from the data engineering side. Let me share my perspective, having been in data and AI for a while and using LLMs before they became popular. [...] that summarizes blog posts using LLMs.

Data Engineering

Data Engineering Data Engineer Engineering NoSQL

Data Engineering Weekly #166

Data Engineering Weekly

APRIL 7, 2024

dbt: 2024 State of Analytics Engineering. dbt’s 2024 State of Analytics Engineering report is out. Poor data quality and unclear data ownership remain the top challenges for data teams. Data Mesh continues to gain popularity among enterprises.

Data Engineering

Data Engineering Data Engineer Engineering Unstructured Data

Data Engineering Weekly #161

Data Engineering Weekly

MARCH 3, 2024

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Editor’s Note: Chennai, India Meetup - March 08 Update. We are thankful to Ideas2IT for hosting our first Data Heroes meetup.

Data Engineering

Data Engineering Data Engineer Pipeline-centric Engineering

Brief History of Data Engineering

Jesse Anderson

DECEMBER 12, 2022

Google looked over the expanse of the growing internet and realized they’d need scalable systems. They were the first companies to commercialize open source big data technologies and pushed the marketing and commercialization of Hadoop. Apache Spark came in 2009 and gave a unified batch and streaming engine.

Data Engineering

Data Engineering Data Engineer Engineering Hadoop

Data Engineering Weekly #157

Data Engineering Weekly

FEBRUARY 4, 2024

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Joe Reis: Definition of Data Modeling & What Data Modeling Is Not. Joe raised a very fundamental question in data engineering.

Data Engineering

Data Engineering Data Engineer Engineering PostgreSQL

Career Opportunities in Software Engineering

Knowledge Hut

APRIL 23, 2024

Software engineering is a rapidly growing field with vast career opportunities. A software career path offers diverse options, from developing mobile applications and games to creating sophisticated software systems that power businesses and industries. These levels consist of junior engineer, engineer, and senior engineer.

Software Engineer

Software Engineer Software Engineering Engineering Programming Language

A senior engineer/EM job search story

The Pragmatic Engineer

AUGUST 10, 2023

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover topics related to Big Tech and startups through the lens of engineering managers and senior engineers. What is neat: The Pragmatic Engineer is now the first-ever newsletter listed on Learnerbly.

Engineering

Engineering Recruitment Retail Software Engineer

The Recommendation System at Lyft

Lyft Engineering

APRIL 3, 2023

This blog post focuses on the scope and the goals of the recommendation system, and explores some of the most recent changes the Rider team has made to better serve Lyft’s riders. Introduction: Scope of the Recommendation System The recommendation system covers user experiences throughout the ride journey.

Systems

Systems Pipeline-centric Machine Learning Transportation

Data Engineering Weekly #151

Data Engineering Weekly

DECEMBER 3, 2023

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. GitHub wrote an excellent blog post capturing the current state of LLM integration architecture. Lackluster AI/ML results often stem from poor data quality.

Data Engineering

Data Engineering Data Engineer Engineering Bytes

Data Engineering Weekly #159

Data Engineering Weekly

FEBRUARY 18, 2024

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Our hope is only with the amazing community of data practitioners who constantly support us. We are so over the Big Data Era to Modern Data Stack.

Data Engineering

Data Engineering Data Engineer Engineering Data

Data Engineering Weekly #162

Data Engineering Weekly

MARCH 10, 2024

Editor’s Note: Chennai Meetup Wrap-Up & Preparation work started for DEWCon. I am so grateful for the enthusiastic participants who made our Chennai Data Heroes - Community for Data Folks meetup vibrant! Big thanks to our insightful speakers: Hareshkumar Selvakumar talks about his work on Data Products for PayPal.

Data Engineering

Data Engineering Data Engineer Engineering Datasets

Data Engineering Weekly #155

Data Engineering Weekly

JANUARY 21, 2024

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Interesting article on the impact of search engine optimization (SEO) on the quality of search engine results. Visit rudderstack.com to learn more.

Data Engineering

Data Engineering Data Engineer Engineering Kafka

Data Engineering Weekly #147

Data Engineering Weekly

SEPTEMBER 24, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. Airflow isn’t meant to process the data. See how it works today.

Data Engineering

Data Engineering Data Engineer Engineering Kafka

Data Engineering Weekly #148

Data Engineering Weekly

OCTOBER 1, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. What is the data behavior? See how it works today. Dropbox: Is this a date?

Data Engineering

Data Engineering Data Engineer Engineering Data Pipeline

Data Engineering Weekly #150

Data Engineering Weekly

NOVEMBER 5, 2023

RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Netflix: Streaming SQL in Data Mesh. Learnings from our journey: In hindsight, we wish we had invested in enabling Flink SQL on the DataMesh platform much earlier.

Data Engineering

Data Engineering Data Engineer Engineering SQL

A Year of Modern: Our Top 2022 Blog Posts — Chosen by You

The Modern Data Company

JANUARY 31, 2023

Another year, another chance to learn more about the world of data. In 2023, The Modern Data Company (Modern) hopes to reach more companies and organizations with our data operating system, build incredible value from existing and upcoming data assets, and share insights into major shifts in what it means to be data-driven.

Retail

Retail Healthcare Data Architecture Data Pipeline

Data Pipeline Observability: A Model For Data Engineers

Databand.ai

JUNE 28, 2023

By Eitan Chazbani, June 29, 2023. Data pipeline observability is your ability to monitor and understand the state of a data pipeline at any time. We believe the world’s data pipelines need better data observability. To measure, but not track.

Data Pipeline

Data Pipeline Data Engineering Data Engineer Engineering

GPT-based data engineering accelerators

RandomTrees

FEBRUARY 2, 2024

GPT-based data engineering accelerators make working with data more accessible. These accelerators use GPT models to complete data tasks faster, fix issues, and save time. GPT models describe data in plain language and also provide summaries and explanations. One can rely on this information.

Data Engineering

Data Engineering Data Engineer Engineering Data Pipeline

Moving Enterprise Data From Anywhere to Any System Made Easy

Cloudera

JUNE 2, 2022

Since 2015, the Cloudera DataFlow team has been helping the largest enterprise organizations in the world adopt Apache NiFi as their enterprise standard data movement tool. This need has generated a market opportunity for a universal data distribution service. Why does every organization need it when using a modern data stack?

Systems

Systems Data Lake Google Cloud Data Collection

Data Engineering Weekly #143

Data Engineering Weekly

AUGUST 20, 2023

Data Engineering Weekly Is Brought to You by RudderStack. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. But 4 years later, in 2023, where has the data mesh gotten us?

Data Engineering

Data Engineering Data Engineer Engineering Data

1. Streamlining Membership Data Engineering at Netflix with Psyberg

Netflix Tech

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. What is late-arriving data? Let’s dive in!

Data Engineering

Data Engineering Data Engineer Engineering Metadata

Data Engineering Weekly #123

Data Engineering Weekly

MARCH 19, 2023

Contribute to the Rudderstack Transformations Library, Win $1000. RudderStack Transformations lets you customize event data in real time with your own JavaScript or Python code. Sanjeev Mohan: What Exactly is a Data Product? Is ChatGPT a data product? Is Data a product? What is Data Product, indeed?

Data Engineering

Data Engineering Data Engineer Engineering Media

Maintain Your Data Engineers' Sanity By Embracing Automation

Data Engineering Podcast

JULY 10, 2022

Summary Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows.

Data Engineering

Data Engineering Data Engineer Engineering MongoDB

Data Engineering Weekly #137

Data Engineering Weekly

JULY 2, 2023

Data Engineering Weekly Is Brought to You by RudderStack. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. So, let's shape the future of Data Engineering together.

Data Engineering

Data Engineering Data Engineer Engineering Pipeline-centric

Data Engineering Weekly #141

Data Engineering Weekly

AUGUST 6, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. See how it works today. Editor’s Note: DewCon.ai

Data Engineering

Data Engineering Data Engineer Engineering Kafka

Data Engineering Weekly #122

Data Engineering Weekly

MARCH 12, 2023

Contribute to the Rudderstack Transformations Library, Win $1000. RudderStack Transformations lets you customize event data in real time with your own JavaScript or Python code. Editor’s Note: Data Engineering Radio. At Data Engineering Weekly, we strive to bring the best thought process around building and operating data.

Data Engineering

Data Engineering Data Engineer Engineering SQL

Data Engineering Weekly #135

Data Engineering Weekly

JUNE 18, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Data management is critical for any organization to succeed in this AI world.

Data Engineering

Data Engineering Data Engineer Engineering MySQL

Data Engineering Weekly #125

Data Engineering Weekly

APRIL 2, 2023

Contribute to the Rudderstack Transformations Library, Win $1000 RudderStack Transformations lets you customize event data in real-time with your own JavaScript or Python code. Meta: Presto - A Decade of SQL Analytics at Meta Presto and Kafka are the two systems that greatly impacted data infrastructure in the last decade.

Data Engineering

Data Engineering Data Engineer Engineering Pipeline-centric

Data Engineering Weekly #127

Data Engineering Weekly

APRIL 16, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make collecting data from every application, website, and SaaS platform easy, then activating it in your warehouse and business tools. Sign up free to test out the tool today. 📊 ⏱️Got 5 minutes?

Data Engineering

Data Engineering Data Engineer Engineering Pipeline-centric

System Administrator Salary in 2024 [Fresher to Experienced]

Knowledge Hut

MARCH 28, 2024

The modern tech-driven landscape is ideal for System Administrators. As computers continue to permeate every aspect of the economy, the need for skilled System Administrators is expected to grow exponentially. Who Is a System Administrator?

Systems

Systems Computer Science Certification Recruitment

An Engineering Guide to Data Quality - A Data Contract Perspective - Part 2

Data Engineering Weekly

MAY 16, 2023

In the first part of this series, we talked about design patterns for data creation and the pros & cons of each system from the data contract perspective. In the second part, we will focus on architectural patterns to implement data quality from a data contract perspective. Why is Data Quality Expensive?

Engineering

Engineering Kafka Data Pipeline Data Warehouse

Data Engineering Weekly #142

Data Engineering Weekly

AUGUST 13, 2023

Data Engineering Weekly Is Brought to You by RudderStack. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. But 4 years later, in 2023, where has the data mesh gotten us?

Data Engineering

Data Engineering Data Engineer Engineering Food

Data Engineering Weekly #134

Data Engineering Weekly

JUNE 12, 2023

Data Engineering Weekly Is Brought to You by RudderStack. RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Sign up free to test out the tool today. Should you?

Data Engineering

Data Engineering Data Engineer Engineering AWS

Data Engineering Weekly #128

Data Engineering Weekly

APRIL 23, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make collecting data from every application, website, and SaaS platform easy, then activating it in your warehouse and business tools. I open-sourced Schemata last year, the industry's first Data Contract as a Code (DCC).

Data Engineering

Data Engineering Data Engineer Engineering Data Pipeline

Data Engineering Weekly #119

Data Engineering Weekly

FEBRUARY 19, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Seeing a pattern similar to Data Mart emerging in ML infrastructure is interesting.

Data Engineering

Data Engineering Data Engineer Engineering Google Cloud

Data Engineering in Retrospect: Key Trends and Patterns of 2023

Data Engineering Weekly

NOVEMBER 26, 2023

It’s the end of the year, and there will be a lot of buzz about what the next five years in data engineering might bring. 🕵️‍♂️📈 Before we delve into the patterns, it's important to remember the data infrastructure maturity model. Who knows what insights we might uncover?

Data Engineering

Data Engineering Data Engineer Engineering Kafka

The Workflow Engine For Data Engineers And Data Scientists

Data Engineering Podcast

JUNE 24, 2019

Summary Building a data platform that works equally well for data engineering and data science is a task that requires familiarity with the needs of both roles. Data engineering platforms have a strong focus on stateful execution and tasks that are strictly ordered based on dependency graphs.

Data Engineering

Data Engineering Data Engineer Engineering Data Science

Building a Chatbot Using Prompt Engineering

Edureka

APRIL 19, 2024

In other words, you don’t need to be a programmer or have any coding background to dive into this creative process of building a chatbot using prompt engineering. In this tutorial blog, we’ll be building a chatbot using prompt engineering inspired by Steve Harvey, offering tailored advice on life motivation and personal growth.

Building

Building Engineering Coding Systems

Data Engineering Weekly #136

Data Engineering Weekly

JUNE 25, 2023

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. The design focuses on a three-layer system design. You're not alone!

Data Engineering

Data Engineering Data Engineer Engineering Data Lake

D3: An Automated System to Detect Data Drifts

Uber Engineering

FEBRUARY 23, 2023

Data quality is of paramount importance at Uber, powering critical decisions and features. In this blog, learn how we automated column-level drift detection in batch datasets at Uber scale, reducing the median time to detect issues in critical datasets by 5X.

Systems

Systems Datasets Data

Data Engineering Weekly #109

Data Engineering Weekly

NOVEMBER 27, 2022

Data Engineering Weekly Is Brought to You by RudderStack RudderStack provides data pipelines that make it easy to collect data from every application, website, and SaaS platform, then activate it in your warehouse and business tools. Sign up free to test out the tool today.

Data Engineering

Data Engineering Data Engineer Engineering SQL
