Google Cloud SQL for PostgreSQL, part of Google’s robust cloud ecosystem, offers businesses a dependable solution for managing relational data. However, as the need for advanced analytics grows, it becomes necessary to integrate this data with a dedicated storage and processing platform like Snowflake. Migrating data from PostgreSQL on Google Cloud SQL to Snowflake allows you to extract actionable insights from your data. Such insights can drive strategic business decisions and help you optimize marketing strategies to improve customer experiences.

Let’s look into the different methods that can help you integrate these two platforms.

Methods to Load Data from PostgreSQL on Google Cloud SQL to Snowflake

Prerequisites

  • The hostname or IP address of your PostgreSQL server.
  • PostgreSQL version 9.4 or higher.
  • An active Snowflake account.

Method 1: Move Data from PostgreSQL on Google Cloud SQL to Snowflake Using CSV Files

This method exports your data from PostgreSQL on Google Cloud SQL as CSV files and then loads those files into Snowflake tables.

Here are the steps involved in this data migration process:

Step 1: Export Data from PostgreSQL on Google Cloud SQL as CSV Files

This involves exporting data from a database on a Cloud SQL instance to a CSV file in a Google Cloud Storage (GCS) bucket. Here are the steps to export data as a CSV file:

  • Log in to your Google Cloud account. Navigate to the Console > Cloud SQL Instances page.
  • Click on an instance name to open the Overview page of the instance. Then, click on Export.
  • You can select Offload export to allow other operations to run while the export progresses.
  • In the Cloud Storage export location, specify the name of the bucket, folder, and file that you want to export. Alternatively, click on Browse to find or create a bucket, folder, or file.

If you select Browse:

  • Select a GCS bucket or folder in the Location section.
  • In the Name box, select an existing file from the list in the Location section or add a name for the CSV file.
  • Click Select.
  • For Format, click CSV, and for the Database for export, select the database name from the drop-down menu.
  • In SQL query, enter a SQL query specifying the table you want to export data from (see the example query below). The query must specify a table in the selected database, since you can’t export an entire database in CSV format.
  • Click on Export.

This will export data from your PostgreSQL instance on Google Cloud SQL as a CSV file to a GCS bucket.
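For illustration, the export query can be a simple SELECT against the table you want to move. The table and column names below are hypothetical placeholders; adjust them to your own schema:

-- hypothetical example: export only the columns and rows you need
SELECT order_id, customer_id, order_total, created_at
FROM orders
WHERE created_at >= '2023-01-01';

Limiting the columns or adding a WHERE clause keeps the exported CSV file smaller and shortens both the export and the subsequent load into Snowflake.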

Step 2: Load the CSV Files to Snowflake

Now, you need to load the CSV files containing the PostgreSQL on Google Cloud SQL data into Snowflake tables. To do this:

  • Create a storage integration in Snowflake to access GCS buckets. Use the Snowflake UI to run the following SQL command:
CREATE STORAGE INTEGRATION integration_name
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = 'GCS'
ENABLED = TRUE
STORAGE_ALLOWED_LOCATIONS = ('GCS_storage_path');

This command creates a new storage integration object in Snowflake with the provided integration name. TYPE = EXTERNAL_STAGE indicates that the integration is meant for use with Snowflake external stages, and STORAGE_PROVIDER = 'GCS' identifies Google Cloud Storage as the cloud storage provider. ENABLED = TRUE activates the integration. STORAGE_ALLOWED_LOCATIONS lists the storage locations this integration is allowed to access, in the format gcs://your_bucket_name/.
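Snowflake also needs permission to read from your bucket. A common next step is to describe the integration, copy the service account Snowflake created for it, and grant that account read access to the bucket in Google Cloud IAM. A minimal sketch, assuming the integration name used above:

DESC STORAGE INTEGRATION integration_name;
-- In the output, note the STORAGE_GCP_SERVICE_ACCOUNT value and grant that
-- service account read access (for example, Storage Object Viewer) on the GCS bucket.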

  • Create a stage in Snowflake to load the data from GCS. Here’s an example SQL code:
CREATE STAGE my_gcs_stage
URL = 'gcs://your_bucket_name/path'
STORAGE_INTEGRATION = integration_name;

This creates a new external stage in Snowflake named my_gcs_stage. The URL specifies the GCS location the stage points to, and STORAGE_INTEGRATION ties the stage to the storage integration created in the previous step.
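Before loading, you can optionally verify that the stage can see the exported files; this command simply lists the files at the stage location:

LIST @my_gcs_stage;
-- Each CSV file exported to the bucket path should appear in the output.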

  • Use the COPY INTO command to load the data from GCS to the Snowflake target table.
COPY INTO mytable
FROM @my_gcs_stage
FILE_FORMAT = (TYPE = CSV)
ON_ERROR = CONTINUE;

This loads data from the files staged at @my_gcs_stage into the Snowflake table named mytable. The file format is specified as CSV, and ON_ERROR = CONTINUE tells Snowflake to skip problematic records and continue loading instead of aborting when it encounters an error.
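Depending on how the CSV files were produced, you may need a target table and additional file format options, for example to skip a header row or handle quoted fields. The sketch below makes those assumptions explicit; the table definition, column types, and format options are placeholders to adjust for your data:

-- hypothetical target table matching the exported columns
CREATE TABLE IF NOT EXISTS mytable (
    order_id NUMBER,
    customer_id NUMBER,
    order_total NUMBER(10,2),
    created_at TIMESTAMP_NTZ
);

COPY INTO mytable
FROM @my_gcs_stage
FILE_FORMAT = (
    TYPE = CSV
    SKIP_HEADER = 1                     -- skip a header row, if the export includes one
    FIELD_OPTIONALLY_ENCLOSED_BY = '"'  -- handle quoted fields
    NULL_IF = ('')                      -- treat empty strings as NULL
)
PATTERN = '.*[.]csv'                    -- load only CSV files from the stage path
ON_ERROR = CONTINUE;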

Here’s a list of benefits associated with using CSV export/import for PostgreSQL on Google Cloud SQL to Snowflake migration:

  • Using an intermediary storage solution like GCS bucket serves as a buffer. If the data loading process in Snowflake encounters any issues, the original data files will remain intact in the storage.
  • It is ideal for one-time or infrequent data transfers, wherein the associated latencies won’t significantly impact the operations.

Method 2: Automating the Data Replication Using a No-Code Tool

There are some limitations to using the CSV Export/Import for PostgreSQL on Google Cloud SQL to Snowflake Integration, including:

  • For large databases, the CSV export process in Cloud SQL can take an hour or more, and unless you use the offload export option, you cannot perform other operations while it runs.
  • Owing to the latency involved in the export process and manual loading of data to Snowflake, it isn’t suitable for real-time data migrations.
  • The use of GCS storage to store large CSV files can lead to additional costs.

You can use no-code tools to overcome these limitations. Such tools are associated with the following benefits:

  • Progress Monitoring: Most no-code tools provide monitoring and logging features to track the data migration progress. This helps identify errors and troubleshoot issues in real time. Alerts, error logs, or notifications will help you identify any data migration issues and address them.
  • Scalability: No-code tools are typically designed to scale up or scale down to handle large-scale and small-scale data migrations, respectively. These tools flexibly adapt to changing data integration needs and accommodate growing datasets.
  • Automation: No-code tools usually provide scheduling and automation functionalities to automate the integration process. From data ingestion and transformation to loading the data into the destination at specified intervals, you can automate the entire process.

Hevo Data is one such efficient no-code tool that you can use to simplify the process of setting up a PostgreSQL on Google Cloud SQL to Snowflake ETL. This fully managed data pipeline platform is designed for error-free, near-real-time data integration. You can use Hevo Data to seamlessly move data between any two platforms in just a few minutes.

Additional Prerequisites

  • Whitelist Hevo’s IP addresses.
  • SELECT, USAGE, and CONNECT privileges granted to the database user.
  • If the Pipeline mode is Logical Replication:
    • Enable Log-based incremental replication.
    • PostgreSQL database instance is a master instance.
  • Hevo is assigned the following (a sketch of these grants appears after this list):
    • USAGE permissions on data warehouses.
    • USAGE and CREATE SCHEMA permissions on databases.
    • USAGE, MONITOR, MODIFY, CREATE TABLE, and CREATE EXTERNAL TABLE permissions on the current and future schemas.
  • The user must be assigned the following roles:
    • If a warehouse is to be created, ACCOUNTADMIN or SYSADMIN role in Snowflake.
    • To create a new role for Hevo, ACCOUNTADMIN or SECURITYADMIN in Snowflake.
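For reference, here is a minimal sketch of the Snowflake-side grants described above. The role hevo_role, user hevo_user, warehouse my_wh, and database my_db are hypothetical placeholders; adjust them to your own objects:

-- create a dedicated role for Hevo (requires ACCOUNTADMIN or SECURITYADMIN)
CREATE ROLE IF NOT EXISTS hevo_role;

-- warehouse and database access
GRANT USAGE ON WAREHOUSE my_wh TO ROLE hevo_role;
GRANT USAGE, CREATE SCHEMA ON DATABASE my_db TO ROLE hevo_role;

-- schema-level privileges on current and future schemas
GRANT USAGE, MONITOR, MODIFY, CREATE TABLE, CREATE EXTERNAL TABLE
    ON ALL SCHEMAS IN DATABASE my_db TO ROLE hevo_role;
GRANT USAGE, MONITOR, MODIFY, CREATE TABLE, CREATE EXTERNAL TABLE
    ON FUTURE SCHEMAS IN DATABASE my_db TO ROLE hevo_role;

-- assign the role to the user Hevo connects as
GRANT ROLE hevo_role TO USER hevo_user;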

To connect PostgreSQL on Google Cloud SQL to Snowflake using Hevo Data, here are the steps you can follow:

Step 1: Configure PostgreSQL on Google Cloud SQL as the Data Source

(Image: Configuring PostgreSQL on Google Cloud SQL as the source in Hevo)

Step 2: Configure Snowflake as the Data Destination

(Image: Configuring Snowflake as the destination in Hevo)

Using Hevo Data to set up your data integration process offers the following benefits:

  • Near Real-Time Data Replication: Hevo supports near-real-time data replication without compromising accuracy, helping ensure the most recent data is available in the destination.
  • Security: Hevo complies with major regulations and certifications, including GDPR, SOC II, and HIPAA, and provides end-to-end encryption for data integration processes.
  • Built-in Integrations: Hevo offers 150+ built-in connectors (including 50+ free sources) to simplify the process of setting up data integration pipelines.
  • Transformation: Hevo has a drag-and-drop interface consisting of preloaded transformations to help you format the data. Alternatively, you can use Hevo’s Python interface to perform advanced data transformation.

What Can You Achieve With PostgreSQL on Google Cloud SQL to Snowflake Integration?

Migrating data from PostgreSQL on Google Cloud SQL to Snowflake can provide answers to the following questions:

  • How to personalize marketing campaigns based on individual customer behaviors and preferences?
  • Are there any gaps in the skills or knowledge of teams that could benefit from additional training?
  • What is the estimated revenue that a customer will generate during the time they’re associated with the business?
  • Are there any external factors, such as global trends or regional events, influencing customer preferences?
    • How can you adapt to the external factors?
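As an illustration of the kind of question listed above, once the data is available in Snowflake you can answer it with a straightforward analytical query. The customers and orders tables and their columns below are hypothetical placeholders; revenue to date is only a starting point for estimating lifetime value:

-- revenue generated per customer so far (hypothetical schema)
SELECT
    c.customer_id,
    MIN(o.created_at)  AS first_order,
    MAX(o.created_at)  AS latest_order,
    SUM(o.order_total) AS revenue_to_date
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY revenue_to_date DESC;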

Conclusion

A PostgreSQL on Google Cloud SQL to Snowflake integration can help organizations anticipate market trends, refine customer experiences, and drive growth.

The two methods to connect PostgreSQL on Google Cloud SQL to Snowflake are CSV export/import and no-code tools. The CSV export/import method has some drawbacks: it is usually time-consuming and effort-intensive, and it lacks automation and real-time capabilities. Additionally, using GCS buckets to store data incurs extra costs.

To overcome these drawbacks, you can use no-code tools like Hevo Data for the data migration process. It offers a fully managed workflow and supports real-time or near-real-time integrations. Setting up a data migration pipeline with Hevo will only take a few minutes.

Hevo Data, with its range of readily available pre-built connectors and transformations, simplifies the process of data migration between any source and destination. The intuitive interface of Hevo makes it suitable even for non-engineers to set up pipelines and achieve analytics-ready data quickly.

If you don’t want SaaS tools with unclear pricing that burn a hole in your pocket, opt for a tool that offers a simple, transparent pricing model. Hevo has 3 usage-based pricing plans starting with a free tier, where you can ingest up to 1 million records.

Schedule a demo to see if Hevo would be a good fit for you, today!

Freelance Technical Content Writer, Hevo Data

Suchitra's profound enthusiasm for data science and passion for writing drive her to produce high-quality content on software architecture and data integration.
