Remove Data Warehouse Remove Data Workflow Remove Google Cloud Remove Metadata
article thumbnail

Toward a Data Mesh (part 2) : Architecture & Technologies

François Nguyen

TL;DR After setting up and organizing the teams, we are describing 4 topics to make data mesh a reality. How can we interoperate between the data domains ? How do we govern all these data products and domains ? It will be illustrated with our technical choices and the services we are using in the Google Cloud Platform.

article thumbnail

Making The Total Cost Of Ownership For External Data Manageable With Crux

Data Engineering Podcast

Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Data Engineering Zoomcamp – Data Ingestion (Week 2)

Hepta Analytics

This week, we got to think about our data ingestion design. We looked at the following: How do we ingest – ETL vs ELT Where do we store the dataData lake vs data warehouse Which tool to we use to ingest – cronjob vs workflow engine NOTE : This weeks task requires good internet speed and good compute.

article thumbnail

Data Pipeline Architecture Explained: 6 Diagrams and Best Practices

Monte Carlo

The modern data stack era , roughly 2017 to present data, saw the widespread adoption of cloud computing and modern data repositories that decoupled storage from compute such as data warehouses, data lakes, and data lakehouses.

article thumbnail

Big Data (Quality), Small Data Team: How Prefect Saved 20 Hours Per Week with Data Observability

Monte Carlo

Here’s how Prefect , Series B startup and creator of the popular data orchestration tool, harnessed the power of data observability to preserve headcount, improve data quality and reduce time to detection and resolution for data incidents. But a growing company means growing data needs. Image courtesy of Prefect.

article thumbnail

The Good and the Bad of Apache Airflow Pipeline Orchestration

AltexSoft

DevOps tasks — for example, creating scheduled backups and restoring data from them. Airflow is especially useful for orchestrating Big Data workflows. Airflow is not a data processing tool by itself but rather an instrument to manage multiple components of data processing. Metadata database.

article thumbnail

The Top Data Strategy Influencers and Content Creators on LinkedIn

Databand.ai

In his current role as Senior Director of Product Management at Google, he focuses on BigQuery, Cloud Dataflow, Cloud DataProc, Cloud DataPrep, Cloud PubSub, and Cloud Composer. She also posts frequently on LinkedIn about data analytics, data strategy, data governance, and data engineering.

BI 52