Load JSON data into Snowflake by running the COPY command. Here, the column of the target table that will hold the JSON data is of type VARIANT.
copy into <table_name> from @<stage_name> file_format = (type = 'json');
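As a minimal end-to-end sketch (the table, stage, and column names here are hypothetical, not from the original post):

```sql
-- Target table with a single VARIANT column to hold the raw JSON
create or replace table raw_json_data (v variant);

-- Hypothetical internal stage where the JSON files are uploaded
create or replace stage json_stage;

-- Load every JSON file on the stage into the VARIANT column
copy into raw_json_data
  from @json_stage
  file_format = (type = 'json');
```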
To query the JSON data, either navigate the JSON structure directly with path notation or use the FLATTEN table function, as shown below:

select f.value from <table_name>, table(flatten(input => <variant_column_name>, path => '<json_element_to_be_flattened>')) f;
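For instance, assuming each row of a hypothetical `raw_json_data` table holds a document with a `name` field and an `orders` array (all names here are illustrative), FLATTEN turns each array element into its own row:

```sql
-- Explode the "orders" array: one output row per array element
select
  t.v:name::string        as customer_name,   -- field from the parent document
  f.value:amount::number  as order_amount     -- field from each array element
from raw_json_data t,
     table(flatten(input => t.v, path => 'orders')) f;
```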
Use GET_PATH to extract a value from the JSON data by providing its path as a string:

select get_path(<variant_column_name>, '<root_json_element>.<repeating_json_element>') from <table_name>;
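A concrete sketch, again with hypothetical table and path names:

```sql
-- GET_PATH with a path string; equivalent to the colon notation v:customer.name
select get_path(v, 'customer.name')::string as customer_name
from raw_json_data;
```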
Are you planning to migrate from Redshift to Snowflake? In this post I will not focus on WHY one should decide to migrate from Redshift to Snowflake; rather, I will discuss at a high level HOW one should do it. Here are the top 5 things to consider as part of high-level migration planning:
Batch loading of data in ETL is usually done overnight so the business can use the processed data early in the day. Often, for any correction in the data, the business has to wait for the next overnight load, which is usually 24 hours away.
In today's world, where data is an asset, the more up-to-date data is, the more relevant it is for making business-critical decisions. Often this batch load is a heavy and lengthy process that can't be run during the daytime, when the resultant dataset is in use by the end user or…
I have been working with data ingestion, both near-real-time and batch loading. Today I am sharing one such method for loading streaming data into #snowflake via #snowpipe.
Here are 7 simple steps you can follow to do the setup:
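The full setup also involves S3 event notifications on the AWS side, but the Snowflake side of such a pipeline can be sketched roughly as follows (bucket, stage, pipe, and credential values are placeholders, not the post's actual configuration):

```sql
-- External stage pointing at the S3 location where streaming files land
create or replace stage s3_events_stage
  url = 's3://my-bucket/events/'
  credentials = (aws_key_id = '<key>' aws_secret_key = '<secret>');

-- Pipe that auto-ingests new files as S3 event notifications arrive
create or replace pipe events_pipe auto_ingest = true as
  copy into raw_json_data
  from @s3_events_stage
  file_format = (type = 'json');

-- The pipe's notification_channel is then wired to the bucket's event config
show pipes;
```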
I spent some time working with Apache #Kafka and also did a small PoC on #Kinesis. I am often asked which one to prefer. Well, it totally depends on the cost involved, the volume, and the kind of human resources available for support. Here is my take on Apache Kafka vs Kinesis.
The first difference is the obvious one: Apache Kafka is open source, while Kinesis is an #aws service that you pay for as per usage. They use different terms for similar concepts: what is called a partition in Kafka is called a shard in Kinesis. Both are distributed messaging platforms.
At a high level, setting up Kafka…
I have been working with AWS Database Migration Service for a long time. It gets challenging to monitor the replication tasks running in AWS Database Migration Service (DMS), especially if there are many. Here is an easy way to set up active monitoring of DMS on Slack. Follow these simple steps:
Here are a few steps to help you query raw data on S3 using AWS Athena:
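The core of such a setup is an external table over the S3 location, which Athena can then query in place. A sketch, assuming JSON files and hypothetical bucket and column names:

```sql
-- External table over raw JSON files in S3 (bucket path and schema are placeholders)
create external table if not exists events (
  event_id   string,
  event_time string,
  payload    string
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
location 's3://my-bucket/raw/events/';

-- Query the raw data directly; Athena charges per data scanned
select event_id, event_time
from events
limit 10;
```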
Are you trying to set up ETL using AWS Glue? AWS Glue is an ETL service provided by AWS. It is serverless, so you pay only for the resources used, not for the infrastructure. Before you start with AWS Glue, here are some of the components you should be aware of in order to set up an ETL process with it: