Building cost effective data pipelines with Python & DuckDB

Start Data Engineering

Building efficient data pipelines with DuckDB:
4.1. Use DuckDB to process data, not for multiple users to access data.
4.2. Cost calculation: DuckDB + ephemeral VMs = dirt-cheap data processing.
4.3. Processing less than 100 GB of data? KISS: DuckDB + Python = easy to debug and quick to develop.
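
To make item 4.3 concrete, here is a minimal sketch of the DuckDB + Python pattern; the Parquet file name and column names are hypothetical placeholders, not from the article:

```python
# Minimal sketch: aggregate a local Parquet file with DuckDB in Python.
# The file name and column names are hypothetical placeholders.
import duckdb

con = duckdb.connect()  # in-memory database; no server to install or run

daily_totals = con.sql(
    """
    SELECT event_date, COUNT(*) AS events
    FROM read_parquet('events.parquet')
    GROUP BY event_date
    ORDER BY event_date
    """
).fetchall()

for event_date, events in daily_totals:
    print(event_date, events)
```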

Snowflake’s New Python API Empowers Data Engineers to Build Modern Data Pipelines with Ease

Snowflake

While SQL has long served as the gateway to access and manage data, Python has become the language of choice for most data teams, creating a disconnect. Recognizing this shift, Snowflake is taking a Python-first approach to bridge the gap and help users leverage the best of both worlds.
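
As a rough illustration of that Python-first direction (not necessarily the exact new API the article announces), here is a minimal sketch that connects to and queries Snowflake with the Snowpark Session API; every connection parameter and the table name are placeholders:

```python
# Minimal sketch: querying Snowflake from Python via Snowpark.
# All connection parameters and the table name are placeholders.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

session = Session.builder.configs(connection_parameters).create()

# Lazily build a query against a hypothetical ORDERS table;
# nothing executes in Snowflake until an action like show() or collect().
orders = session.table("ORDERS").filter(col("AMOUNT") > 100)
orders.show()

session.close()
```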

Kafka to MongoDB: Building a Streamlined Data Pipeline

Analytics Vidhya

Handling and processing streaming data is some of the hardest work in data analysis. Streaming data is data that is emitted at high volume […]
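
As a rough sketch of such a pipeline, the following consumes JSON messages from a Kafka topic and writes them to a MongoDB collection using kafka-python and pymongo; the topic, servers, and database/collection names are hypothetical placeholders:

```python
# Minimal sketch: stream JSON messages from Kafka into MongoDB.
# Topic, servers, and database/collection names are placeholders.
import json

from kafka import KafkaConsumer
from pymongo import MongoClient

consumer = KafkaConsumer(
    "orders",                              # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

collection = MongoClient("mongodb://localhost:27017")["pipeline_db"]["orders"]

for message in consumer:
    # Each Kafka record becomes one MongoDB document.
    collection.insert_one(message.value)
```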

Simplified End-to-End Development for Production-Ready Data Pipelines, Applications, and ML Models

Snowflake

Snowflake offers a secure, streamlined approach to developing across data workloads, reducing costs and reliance on external tools. This means faster development and happier data teams. Explore and experiment with data, visualize results, share insights — all in one place. Interact with Snowflake objects directly in Python.

Writing memory efficient data pipelines in Python

Start Data Engineering

Using distributed frameworks. Pros & cons. Conclusion. Further reading. References. Introduction: If you are wondering how to write memory-efficient data pipelines in Python, or working with a dataset that is too large to fit into memory, then this post is for you.
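
One common way to achieve this, sketched below under the assumption of a large CSV input, is to stream rows through generators so only one row is in memory at a time; the file name and the transformation are hypothetical:

```python
# Minimal sketch: process a CSV too large for memory, one row at a time.
# The file name and the transformation are hypothetical placeholders.
import csv
from typing import Iterator

def read_rows(path: str) -> Iterator[dict]:
    """Yield one row at a time instead of loading the whole file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(row: dict) -> dict:
    # Placeholder transformation: lowercase the column names.
    return {key.lower(): value for key, value in row.items()}

def run_pipeline(path: str) -> None:
    for row in read_rows(path):
        processed = transform(row)
        # Write each processed row out (e.g., to a file or database)
        # instead of accumulating results in an in-memory list.
        print(processed)

if __name__ == "__main__":
    run_pipeline("events.csv")  # hypothetical input file
```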

Data Pipeline Design Patterns - #2. Coding patterns in Python

Start Data Engineering

Singleton, & Object pool patterns Python helpers 1. Introduction Sample project Code design patterns 1. Functional design 2. Factory pattern 3. Strategy pattern 4. Dataclass 3. Context Managers 4. Testing with pytest 5.

Unpacking The Seven Principles Of Modern Data Pipelines

Data Engineering Podcast

Summary: Data pipelines are the core of every data product, ML model, and business intelligence dashboard. The folks at Rivery distilled the seven principles of modern data pipelines that will help you stay out of trouble and be productive with your data.