Snowflake is one of the most popular cloud data warehouses today . Snowflake has a number of features that make it an amazing cloud data warehouse to work with for data engineers. Snowflake leverages several new Azure features, including limitless storage accounts, accelerated networking, and storage soft delete. The goal of Snowflake is to provide the same Snowflake experience no matter which cloud infrastructure provider customers choose, with no barriers to entry. Snowflake on Azure will make it easier than ever for teams across an organization to become more data-driven, efficient and productive with their most valuable resources: data and people.
In this article I show you how to setup a notification integration in Snowflake that talks to Azure event grid linked to an Azure storage account. Essentially this service will listen to any events (file additions in this case) in an Azure Storage container and perform a given activity via Snowpipe. Here is a high level architecture for this setup.
Setup an Azure storage account by following the Microsoft docs here. Once a a storage account is setup, create a container as shown below.
Next, we need to create a Queue as shown below. Then note down the queue URL.
Next, lets setup an event subscription by first clicking on the Events icon in the left pane. Below are the configuration details for the event subscription in azure.
With that you should now have the following setup at the Azure end.
- Azure storage account
- a container named ‘myfolder’
- a Queue called ‘snowpipe-notification-queue’
- a notification event configured to use the queue from item 3
Log into your snowflake account (if you don’t have one you can create a 30 day free trial account here) and open a brand new worksheet and now its time to do some coding!
Execute the following code in your Snowflake worksheet to create a notification integration. You would need your Azure tenant id and the queue url for this piece of code to work. A notification integration is a Snowflake object that provides an interface between Snowflake and third-party cloud message queuing services.
create notification integration SNOWPIPE_DEMO_EVENT enabled = true type = queue notification_provider = azure_storage_queue azure_storage_queue_primary_uri = 'https://yourazurestorageaccountname.queue.core.windows.net/snowpipe-notification-queue' azure_tenant_id = 'your azure tenant id';
Next, run the code below to find the value for AZURE_CONSENT_URL
desc integration SNOWPIPE_DEMO_EVENT
Open the URL in a new tab, you will be prompted to login to your azure portal and allow the app registration to be added to your azure tenant. Then navigate back to you Azure storage account and assign the “Storage Queue Data Contributor” role to the app registration as shown below.
Next, we create a stage object in snowflake
CREATE OR REPLACE STAGE "DEMO_DB"."PUBLIC"."DEMO_STAGE" url = 'azure://yourazurestorageaccountname.blob.core.windows.net/myfolder/' credentials = (azure_sas_token= 'your sas token from your azure storage account' );
For the above code you would need a SAS token. You can navigate to the “Shared access signature” page in your Azure Storage account and generate a key for the container as per your security requirements. refer here to learn more.
Run the code below to make sure you can see a list of file in the container or just an empty result if you don’t have files yet. But make sure you don’t get any errors when you run this.
Now it time to finally create the snowpipe object
CREATE OR REPLACE pipe "DEMO_DB"."PUBLIC"."DEMO_PIPE" auto_ingest = true integration = 'SNOWPIPE_DEMO_EVENT' as copy into employees from @"DEMO_DB"."PUBLIC"."DEMO_STAGE" file_format = my_csv_format--(type = 'CSV')
Above code will be executed each time a new file is added to the azure storage container. For this code to work make sure you have a table created. In my case I have an ’employee’ table created which has a structure shown below.
create table employees ( firstname varchar(100), lastname varchar(100), empid varchar(100) );
the raw csv file I am using to load into this table is below
Now you should see the whole process in action as soon as you drop a csv file into the container in Azure. Please feel free to email me if you have questions