
Data Pipeline with Airflow and AWS Tools (S3, Lambda & Glue)

Towards Data Science

Today’s post follows the same philosophy: fitting local and cloud pieces together to build a data pipeline. And when it comes to data engineering solutions, it’s no different: they have databases, ETL tools, streaming platforms, and so on, a set of tools that makes our life easier (as long as you pay for them).
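As a rough sketch of the kind of orchestration the excerpt describes (not the article's actual pipeline), the following Airflow DAG waits for a file in S3 and then triggers an AWS Glue job. It assumes the Amazon provider package is installed, and the bucket, key, job, and role names are placeholders.

```python
# Minimal sketch: wait for a raw file in S3, then run a pre-created AWS Glue job.
# Assumes apache-airflow-providers-amazon is installed; all names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="s3_to_glue_pipeline",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Wait until the daily raw file lands in the landing bucket.
    wait_for_raw_file = S3KeySensor(
        task_id="wait_for_raw_file",
        bucket_name="my-landing-bucket",          # placeholder bucket
        bucket_key="raw/{{ ds }}/events.json",    # placeholder key
        aws_conn_id="aws_default",
    )

    # Kick off a Glue job that transforms the raw file.
    transform_with_glue = GlueJobOperator(
        task_id="transform_with_glue",
        job_name="transform-events-job",          # placeholder Glue job
        iam_role_name="glue-pipeline-role",       # placeholder IAM role
        aws_conn_id="aws_default",
    )

    wait_for_raw_file >> transform_with_glue
```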


Mastering the Art of ETL on AWS for Data Management

ProjectPro

ETL, or Extract, Transform, Load, is the process of extracting data from source systems, transforming it, and loading it into a target data system. ETL has traditionally been carried out using on-premise ETL tools and data warehouses.
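A minimal, self-contained illustration of the three stages, with a made-up CSV source, transformation rule, and SQLite target standing in for real source and warehouse systems:

```python
# Toy ETL sketch: the source file, transformation rule, and target schema are
# invented for the example.
import csv
import sqlite3


def extract(path: str) -> list[dict]:
    """Extract: read raw rows from a source system (here, a CSV file)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: clean and reshape rows for the target schema."""
    return [
        (row["order_id"], row["customer"].strip().title(), float(row["amount"]))
        for row in rows
        if row["amount"]  # drop rows with a missing amount
    ]


def load(rows: list[tuple], db_path: str = "warehouse.db") -> None:
    """Load: write the transformed rows into the target system."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```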



Data Lake Explained: A Comprehensive Guide to Its Architecture and Use Cases

AltexSoft

Instead of relying on the traditional hierarchical structures and predefined schemas of data warehouses, a data lake uses a flat architecture. This structure is made efficient by data engineering practices that include object storage.
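To make the "flat architecture" idea concrete, here is a small sketch (not from the article) of writing an event as an object to S3, where the only structure is the key naming convention. It assumes boto3 and AWS credentials are configured; the bucket and prefix are placeholders.

```python
# Data lake sketch: files are stored as objects under prefixes in object
# storage rather than in tables with a fixed schema.
import json
import boto3

s3 = boto3.client("s3")

event = {"user_id": 42, "action": "checkout", "ts": "2023-06-01T12:00:00Z"}

# The "schema" is just a key naming convention (source/date partitioning);
# nothing is enforced until the data is read.
key = "raw/events/dt=2023-06-01/part-0001.json"

s3.put_object(
    Bucket="my-data-lake",   # placeholder bucket
    Key=key,
    Body=json.dumps(event).encode("utf-8"),
)
```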


The Spiritual Alignment of dbt + Airflow

dbt Developer Hub

In my days as a data consultant and now as a member of the dbt Labs Solutions Architecture team, I’ve frequently seen Airflow, dbt Core, and dbt Cloud (via the official provider) blended as needed, depending on the needs of a specific data pipeline or a team’s structure and skill set.
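One common blend, sketched below under the assumption of a plain dbt Core install on the Airflow worker: a DAG that shells out to dbt run and dbt test with BashOperator (the official provider mentioned in the excerpt adds dedicated dbt Cloud operators instead). The project path is a placeholder.

```python
# Minimal Airflow + dbt Core blend: orchestrate dbt runs as shell commands.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_core_daily_run",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/my_project",   # placeholder path
    )

    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/my_project",  # placeholder path
    )

    dbt_run >> dbt_test
```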


Demystifying event streams: Transforming events into tables with dbt

dbt Developer Hub

Let’s discuss how to convert events from an event-driven microservice architecture into relational tables in a warehouse like Snowflake. We use Snowflake as our data warehouse, where we build dashboards for both internal use and customers. This data becomes the main set of dbt sources used by our report models in BI.
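As a toy illustration of the general approach (not the article's models), the snippet below flattens nested JSON events into uniform rows with a semi-structured payload column, the shape that would typically be loaded into Snowflake and exposed as dbt sources. The event fields are invented.

```python
# Flatten microservice events into warehouse-ready rows; field names are made up.
import json

raw_events = [
    '{"event_type": "order_placed", "ts": "2023-06-01T12:00:00Z",'
    ' "payload": {"order_id": "o-1", "amount": 49.5}}',
    '{"event_type": "order_shipped", "ts": "2023-06-02T08:30:00Z",'
    ' "payload": {"order_id": "o-1", "carrier": "UPS"}}',
]

def to_row(raw: str) -> dict:
    """Keep common columns, stash the event-specific rest as a JSON string."""
    event = json.loads(raw)
    return {
        "event_type": event["event_type"],
        "event_ts": event["ts"],
        "payload": json.dumps(event["payload"]),  # semi-structured column (VARIANT in Snowflake)
    }

rows = [to_row(e) for e in raw_events]
for row in rows:
    print(row)
```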


Sqoop vs. Flume: Battle of the Hadoop ETL Tools

ProjectPro

Some of the common challenges with data ingestion in Hadoop are parallel processing, data quality, machine data arriving at a scale of several gigabytes per minute, ingestion from multiple sources, real-time ingestion, and scalability. The article walks through the need for Apache Sqoop, how Apache Sqoop works, the need for Flume, and how Apache Flume works.
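For context on where Sqoop fits (batch pulls from relational databases, versus Flume's streaming log collection), here is a rough sketch that shells out to a typical sqoop import from Python; the JDBC URL, table, and target directory are placeholders, and a working Hadoop/Sqoop installation is assumed.

```python
# Sketch only: run a batch Sqoop import of a relational table into HDFS.
import subprocess

subprocess.run(
    [
        "sqoop", "import",
        "--connect", "jdbc:mysql://db-host:3306/sales",   # placeholder JDBC URL
        "--username", "etl_user",                         # placeholder user
        "--table", "orders",                              # placeholder table
        "--target-dir", "/data/raw/orders",               # placeholder HDFS path
        "--num-mappers", "4",                             # parallel ingestion
    ],
    check=True,
)
```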


The Good and the Bad of Apache Kafka Streaming Platform

AltexSoft

The technology was written in Java and Scala at LinkedIn to solve the internal problem of managing continuous data flows. This scenario involves three main characters: publishers, subscribers, and a message (or event) broker. A subscriber is a receiving program such as an end-user app or a business intelligence tool.
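A toy illustration of the publisher / broker / subscriber roles described above, using the kafka-python client; it assumes a broker reachable at localhost:9092, and the topic name is a placeholder.

```python
# Publisher and subscriber talking through a Kafka broker (kafka-python client).
from kafka import KafkaProducer, KafkaConsumer

# Publisher: sends events to a topic on the broker.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("page-views", value=b'{"user": 42, "page": "/pricing"}')
producer.flush()

# Subscriber: an end-user app or BI tool would read from the same topic.
consumer = KafkaConsumer(
    "page-views",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating if no new messages arrive
)
for message in consumer:
    print(message.value)
```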
