Creating a Data Pipeline with Spark, Google Cloud Storage and Big Query
Towards Data Science
MARCH 6, 2023
Data pipelines range from simple ETL jobs written in Python that move data between two databases to very complex architectures that use Kafka to stream real-time messages between cloud services and serve multiple end applications. Google Cloud Storage (GCS) is Google’s blob storage.
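To make the simple end of that range concrete, here is a minimal sketch of an ETL job that moves data between two databases. It is not taken from the article: the two in-memory SQLite databases, the `orders` and `orders_clean` tables, their columns, and the helper names `run_etl` and `make_demo_source` are all hypothetical, chosen only so the example runs with the Python standard library.

```python
import sqlite3


def make_demo_source():
    """Build a throwaway source database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "  Alice ", 1250), (2, "BOB", None), (3, "Carol", 300)],
    )
    conn.commit()
    return conn


def run_etl(source, target):
    """Copy rows from source.orders into target.orders_clean; return rows loaded."""
    # Extract: read the raw rows from the source database.
    rows = source.execute(
        "SELECT id, customer, amount_cents FROM orders"
    ).fetchall()

    # Transform: drop rows with no amount, normalize names, convert cents to units.
    cleaned = [
        (row_id, customer.strip().lower(), amount_cents / 100)
        for row_id, customer, amount_cents in rows
        if amount_cents is not None
    ]

    # Load: write the cleaned rows into the target database.
    target.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO orders_clean VALUES (?, ?, ?)", cleaned
    )
    target.commit()
    return len(cleaned)


if __name__ == "__main__":
    source_db = make_demo_source()
    target_db = sqlite3.connect(":memory:")
    loaded = run_etl(source_db, target_db)
    print(f"Loaded {loaded} rows")
```

The same extract, transform, load shape carries over to larger pipelines; only the endpoints change, for example blob storage as the source and a data warehouse as the target.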