Creating a Data Pipeline with Spark, Google Cloud Storage and BigQuery
Towards Data Science
MARCH 6, 2023
You have probably already seen Matt Turck’s 2021 Machine Learning, AI and Data (MAD) Landscape. Many open-source data tools have been developed in the last decade, such as Spark, Hadoop, and Kafka, not to mention all the tooling available as Python libraries. Google Cloud Storage (GCS) is Google’s blob storage.