7 Python Libraries Every Data Engineer Should Know
KDnuggets
APRIL 25, 2024
Interested in switching to data engineering? Here’s a list of Python libraries you’ll find super helpful.
Data Engineering Podcast
JUNE 25, 2023
Summary Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers it is critical that everyone is able to collaborate seamlessly. Can you describe what SQLMesh is and the story behind it?
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Understanding User Needs and Satisfying Them
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know
Leading the Development of Profitable and Sustainable Products
Snowflake
APRIL 17, 2024
In today’s data-driven world, developer productivity is essential for organizations to build effective and reliable products, accelerate time to value, and fuel ongoing innovation. Recognizing this shift, Snowflake is taking a Python-first approach to bridge the gap and help users leverage the power of both worlds.
Christophe Blefari
JANUARY 20, 2024
Learn data engineering, all the references (credits) This is a special edition of the Data News. But right now I'm on holiday, finishing a hiking week in Corsica 🥾 So I wrote this special edition about: how to learn data engineering in 2024. Who are the data engineers?
Analytics Vidhya
JUNE 20, 2023
Introduction In today’s data-driven world, organizations across industries are dealing with massive volumes of data, complex pipelines, and the need for efficient data processing.
Simon Späti
OCTOBER 19, 2022
Will Rust kill Python for Data Engineers? But then again, you have to ask: was Python made for Data Engineering in the first place? Let’s explore why Rust has potential for data engineers, what it does well and why it has become the most loved programming language for 7 years running.
Seattle Data Guy
MAY 26, 2024
Many data engineers and analysts don’t realize how valuable the knowledge they have is. They’ve spent hours upon hours learning SQL, Python, how to properly analyze data, build data warehouses, and understand the differences between eight different ETL solutions.
Ascend.io
SEPTEMBER 14, 2023
The rise of data-intensive operations has positioned data engineering at the core of today’s organizations. As the demand to efficiently collect, process, and store data increases, data engineers have started to rely on Python to meet this escalating demand. Why Python for Data Engineering?
Data Engineering Weekly
MAY 26, 2024
Meta: Composable data management at Meta Meta writes about its transition to a composable data management system to improve interoperability, reusability, and engineering efficiency. seconds, enhancing real-time sports data analytics efficiency! The author highlights some of the key missing features from S3.
Jesse Anderson
DECEMBER 12, 2022
They were the first companies to commercialize open source big data technologies and pushed the marketing and commercialization of Hadoop. With an immutable file system like HDFS, we needed scalable databases to read and write data randomly. Apache Spark came in 2009 and gave a unified batch and streaming engine.
Confessions of a Data Guy
FEBRUARY 26, 2023
Someone on Linkedin recently brought up the point that companies could save gobs of money by swapping out AWS Python lambdas for Rust ones. While it raised the ire of many a Python Data Engineer, I thought it sounded like a great idea. At least it’s an excuse to […] The post AWS Lambdas – Python vs Rust.
Towards Data Science
NOVEMBER 4, 2023
Platform Specific Tools and Advanced Techniques Photo by Christopher Burns on Unsplash The modern data ecosystem keeps evolving and new data tools emerge now and then. In this article, I want to talk about crucial things that affect data engineers. Are your data pipelines efficient? Data warehouse example.
Confessions of a Data Guy
APRIL 16, 2023
You might think […] The post DuckDB vs Polars for Data Engineering. appeared first on Confessions of a Data Guy. I haven’t seen this since Databricks and Snowflake first came out and started throwing mud at each other.
Data Engineering Podcast
FEBRUARY 5, 2023
In that time there have been a number of generational shifts in how data engineering is done. Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? __init__ covers the Python language, its community, and the innovative ways it is being used.
Seattle Data Guy
FEBRUARY 11, 2023
Apache Airflow is a very popular tool that data engineers rely on. Why do data engineers like Airflow? What are… Read more The post What Is Apache Airflow – Data Engineering Consulting appeared first on Seattle Data Guy. Also, what does Apache Airflow even do? What is a DAG?
Data Engineering Podcast
JULY 2, 2023
Summary Feature engineering is a crucial aspect of the machine learning workflow. In this episode Razi Raziuddin shares how data engineering teams can support the machine learning workflow through the development and support of systems that empower data scientists and ML engineers to build and maintain their own features.
Confessions of a Data Guy
SEPTEMBER 9, 2023
In the vast world of data, it’s not just about gathering and analyzing information anymore; it’s also about ensuring that data pipelines, processes, and platforms run seamlessly and efficiently.
Waitingforcode
FEBRUARY 3, 2023
In this blog post I'll share with you a list of Java and Scala classes I use almost every time in data engineering projects. The part for Python will follow next week! We all have our habits and as programmers, libraries and frameworks are definitely a part of the group.
Confessions of a Data Guy
OCTOBER 6, 2023
I wring my hands sometimes, wishing that things and technologies somehow come together into some bubbling […] The post The Ultimate Data Engineering Chadstack. appeared first on Confessions of a Data Guy. Running Rust inside Apache Airflow.
Towards Data Science
MAY 22, 2023
Solving data preparation tasks with ChatGPT Photo by Ricardo Gomez Angel on Unsplash Data engineering makes up a large part of the data science process. In CRISP-DM this process stage is called “data preparation”. It comprises tasks such as data ingestion, data transformation and data quality assurance.
Data Engineering Weekly
MARCH 17, 2024
Compliance is mandatory, with strict penalties for violations, emphasizing the importance of data scientists familiarizing themselves with the law to avoid prohibited AI uses and ensure ethical, safe AI development. It discusses the significance of data governance, sharing history, and generative AI's impact on data economy standards.
Knowledge Hut
DECEMBER 26, 2023
Data Engineering is typically a software engineering role that focuses deeply on data – namely, data workflows, data pipelines, and the ETL (Extract, Transform, Load) process. What is Data Science? What are the roles and responsibilities of a Data Engineer? What is Data Science?
Data Engineering Weekly
FEBRUARY 18, 2024
RudderStack is the Warehouse Native CDP, built to help data teams deliver value across the entire data activation lifecycle, from collection to unification and activation. Our hope is only with the amazing community of data practitioners who constantly support us. We are so over the Big Data Era to Modern Data Stack.
Towards Data Science
AUGUST 19, 2023
How I made the transition to an analytics engineer Photo by Campaign Creators on Unsplash A few years ago, I was at a point where I was feeling unfulfilled in my career. I had been working in data engineering for three years and the initial excitement of starting in the world of tech had faded.
Data Engineering Weekly
MARCH 31, 2024
Intuit: How Intuit data analysts write SQL 2x faster with the internal GenAI tool The productivity increase with GenAI is undeniable, and several startups are trying to solve the Text2SQL generation problem. My key highlight is that Excellent data documentation and “clean data” improve results.
Cloudera
JULY 13, 2021
After the launch of CDP Data Engineering (CDE) on AWS a few months ago, we are thrilled to announce that CDE, the only cloud-native service purpose-built for enterprise data engineers, is now available on Microsoft Azure. CDP data lifecycle integration and SDX security and governance. Easy job deployment.
Towards Data Science
OCTOBER 21, 2023
Advanced ETL techniques for beginners Continue reading on Towards Data Science »
Start Data Engineering
OCTOBER 11, 2021
Leetcode: data structures and algorithms 4. Data modeling 4.1 Data warehousing 4.2 Data pipelines 6. Introduction Skills 1. Distributed system fundamentals 7. Event streaming 8. System design 9. Business questions 10. Cloud computing 11.
Data Engineering Podcast
JANUARY 30, 2022
Summary Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. The only thing worse than having bad data is not knowing that you have it.
Data Engineering Podcast
APRIL 14, 2024
The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database.
Data Engineering Weekly
MARCH 24, 2024
Kai Waehner: The Data Streaming Landscape 2024 This is a comprehensive overview of the state of the data streaming landscape in 2024. Meta: Logarithm - A logging engine for AI training workflows and services Logarithm indexes 100+GB/s of logs in real-time and thousands of queries a second!!!
Towards Data Science
DECEMBER 4, 2023
A Glossary with Use Cases for First-Timers in Data Engineering A happy Data Engineer at work Are you a data engineering rookie interested in knowing more about modern data infrastructures? In this guide Data Engineering meets Formula 1. I bet you are, this article is for you!
Knowledge Hut
JUNE 26, 2023
Welcome to the world of data engineering, where the power of big data unfolds. If you're aspiring to be a data engineer and seeking to showcase your skills or gain hands-on experience, you've landed in the right spot. What are Data Engineering Projects?
Data Engineering Podcast
JULY 10, 2022
Summary Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows.
Knowledge Hut
MAY 3, 2023
Did you know that data is now an essential component of modern business operations? With companies increasingly relying on data-driven insights to make informed decisions, there has never been a greater need for skilled specialists who can manage and evaluate vast amounts of data.
Data Engineering Podcast
MAY 22, 2022
Summary Machine learning has become a meaningful target for data applications, bringing with it an increase in the complexity of orchestrating the entire data flow. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform.
Data Engineering Weekly
SEPTEMBER 3, 2023
Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. See how it works today. 📣 Exciting news! . or "Don't reinvent the wheel.".
Team Data Science
JANUARY 8, 2021
Big Data has become the dominant innovation in all high-performing companies. Notable businesses today focus their decision-making capabilities on knowledge gained from the study of big data. Big Data gives you an advantage in competition as true for businesses as it is for professionals working in the area of analytics.
Knowledge Hut
MARCH 5, 2024
Data engineers are highly in demand and short in supply. Data engineering is one of the hottest jobs that is trending across the globe. Singapore has a thriving technical market that has been on the lookout for data engineers. Who is Data Engineer and What Do They Do?
Data Engineering Weekly
SEPTEMBER 24, 2023
Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. Airflow isn’t meant to process the data. See how it works today.
Knowledge Hut
MARCH 15, 2024
At the same time, it has opened up a wealth of opportunities for data engineers. With businesses harnessing the power of Azure’s services, the need for skilled data engineers has topped the charts. Speaking from experience, the data engineers in this role are right in the thick of it all.
Ascend.io
FEBRUARY 28, 2024
The rise of generative AI is changing more than just technology; it’s reshaping our professional landscapes — and yes, data engineering is directly experiencing the impact. How does AI recalibrate the workload and priorities of data teams? How can data engineers harness the power of AI?
Data Engineering Weekly
AUGUST 27, 2023
Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. Does that mean data teams will shrink in size? See how it works today.