Author name: cndro


Snowflake Connector for Python

Introduction

The Snowflake Connector for Python is a native, pure Python package that does not depend on JDBC or ODBC. It can be installed with pip on Linux, macOS, and Windows platforms where Python 3.6 or later is installed. The connector supports developing applications that use the Python Database API v2 specification (PEP 249), including standard API objects such as Cursor objects for executing DDL/DML statements and queries.

How to Install the Connector

Python Package Installer & Setup Tools

The Snowflake Connector for Python is installed with pip. First, execute the following command to ensure the required version of pip is installed:

    python -m pip install --upgrade pip

To install the Snowflake Connector for Python and its dependent libraries, determine the version of the connector you plan to install, then run the pip (or pip3) command against the requirements file for that version:

    pip install -r https://raw.githubusercontent.com/snowflakedb/snowflake-connector-python/v2.6.2/tested_requirements/requirements_36.reqs

Validate Your Installation

Create a file called validate.py and run it; it prints the current Snowflake version. If this code raises an error when run, something is wrong with the installation.

    #!/usr/bin/env python
    import snowflake.connector

    # Gets the version
    ctx = snowflake.connector.connect(
        user='john',
        password='xxxxxxxx',
        account='xxxxxxxx'
    )
    cs = ctx.cursor()
    try:
        cs.execute("SELECT current_version()")
        one_row = cs.fetchone()
        print(one_row[0])
    finally:
        cs.close()
        ctx.close()

How to Use the Python Connector

You can connect to Snowflake using the login parameters:

    import os
    import snowflake.connector

    PASSWORD = os.getenv('SNOWSQL_PWD')
    WAREHOUSE = os.getenv('WAREHOUSE')
    # USER, ACCOUNT, DATABASE, and SCHEMA are assumed to be defined elsewhere,
    # for example read from environment variables in the same way.

    conn = snowflake.connector.connect(
        user=USER,
        password=PASSWORD,
        account=ACCOUNT,
        warehouse=WAREHOUSE,
        database=DATABASE,
        schema=SCHEMA
    )

After you have logged in, create a database, schema, and warehouse with the connection.

How to Create a Database, Schema, and Warehouse

You can use the CREATE DATABASE, CREATE SCHEMA, and CREATE WAREHOUSE commands to create these. In the example below, I specify the database in which to create the schema, as well as a name for the warehouse.

    conn.cursor().execute("CREATE WAREHOUSE IF NOT EXISTS john_warehouse")
    conn.cursor().execute("CREATE DATABASE IF NOT EXISTS testing")
    conn.cursor().execute("USE DATABASE testing")
    conn.cursor().execute("CREATE SCHEMA IF NOT EXISTS testingschema")

Using the Database, Schema, and Warehouse

You can then reference the database, schema, and warehouse you created with the following commands, after which you can start creating tables:

    conn.cursor().execute("USE WAREHOUSE john_warehouse")
    conn.cursor().execute("USE DATABASE testing")
    conn.cursor().execute("USE SCHEMA testing.testingschema")

How to Create Tables

You can create tables and insert data into your database using the connection. Check the example below.

    conn.cursor().execute(
        "CREATE OR REPLACE TABLE "
        "first_t(col1 integer, col2 string)")
    conn.cursor().execute(
        "INSERT INTO first_t(col1, col2) VALUES " +
        "    (123, 'test string1'), " +
        "    (456, 'test string2')")

Hope you found this article helpful. If you have any questions, feel free to ask in the comments section.
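As a quick addendum to the table example above: once rows are in first_t, the same cursor API can read them back. Below is a minimal sketch that reuses the conn object and the first_t table created in this article.

    # Reuses the "conn" connection and the "first_t" table created earlier in this article.
    cur = conn.cursor()
    try:
        cur.execute("SELECT col1, col2 FROM first_t ORDER BY col1")
        for col1, col2 in cur.fetchall():
            print(col1, col2)
    finally:
        cur.close()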


A Comparison of Alteryx and Fivetran for ETL Processes

Introduction

ETL tools can deliver instant data integration solutions that help businesses save time, cost, and resources. Recently, there has been an increase in the usage of Alteryx and Fivetran; both have gained a lot of attention as more and more companies adopt them. Alteryx and Fivetran have been used to solve many pressing issues in the world of Data Analytics, Data Engineering, and Data Science. Today, I would like to explain what each of these tools does, since you might be in a situation where you need to decide which one is right for your business processes.

Alteryx

Alteryx is drag-and-drop software used for data cleansing, mining, ETL, and other Business Intelligence activities. It is very easy to use; in fact, you do not need technical skills to start using Alteryx. Alteryx provides you with many tools for various purposes, and you can pick these tools and drag and drop them onto the canvas for your extraction, transformation, and loading workloads. It also gives you access to different products for your data integration workloads, each used for a different purpose. The Designer edition of Alteryx is used for the development of applications and workflows: you can install Alteryx Designer on your machine and start using it to build workflows. The Alteryx Server edition gives a company the ability to publish and automate its resources, and you can collaborate with your team and get feedback on the application you are building with just a few clicks in the Alteryx Gallery. It is worth noting that Alteryx is used for a variety of purposes, one of which is working with ETL processes: you can use it to extract data from APIs and webpages, transform a variety of data formats such as JSON, CSV, and XML, and load the data into different databases or warehouses. Now let us look at each of these products.

Alteryx Designer: It provides you with different tools that have different configurations. This is where development happens, and it is also your playground for connecting with different Alteryx resources and external data. Alteryx Designer is used by Data Analysts, Data Engineers, and Data Scientists; interestingly, all the tools they need for development are in Alteryx Designer.

Alteryx Server: It serves as a sharing platform; you are able to move your development to production with this powerful software. You can use macros and applications built by your co-workers and collaborate with each other on the Server. Alteryx Server also gives you the ability to connect your applications with internal and external applications.

Alteryx Connect: It is a place where you keep resources such as data source connection metadata, workflow information, and analytical app information for easy access. Think of Alteryx Connect as your wardrobe, where you can easily find your array of clothes. With Alteryx Connect, companies can easily locate resources and keep all their Alteryx resources and packages organized; all your resources are cataloged for you.

Alteryx Gallery: With the Alteryx Gallery, you can collaborate with your team, and you can access your company's Gallery anywhere because it is a cloud-hosted or self-hosted application. The Alteryx Gallery allows you to share data, macros, applications, and insights within your organization, and it gives you the ability to manage and accommodate many more users as your business needs grow.
All these users can then develop different applications and share them on the platform. Alteryx specializes in self-service analysis with an elegant user interface, and these analyses can be used as Extract, Transform, Load (ETL) processes within the Alteryx system. You can use Alteryx to connect to different data sources and perform complex analysis, including spatial, predictive, and statistical analysis, with just drag-and-drop tools.

Fivetran

Fivetran is a data integration platform you can use for extract, transform, and load (ETL) processes. Fivetran is designed so that you do not have to do any manual work for your data integrations; it is just plug and play. Read along as I give a better illustration. Think of how you pay for your electricity every month: you do not need to go to the power source itself to extract the electricity you need; you simply pay for your electricity and sit back at home to watch your television. Sounds cool, right? No stress! This is exactly how Fivetran works. Even if you have the engineering capability, the time it would take to get the electricity from the extraction point yourself would wear you out. Fivetran provides you with many connectors to different data sources. For example, it provides connectors to Facebook Ads Insight, Salesforce, Zendesk, Twitter Analytics, and lots more; you can check Data Sources | Connector Directory | Fivetran. Interestingly, Fivetran also gives you the ability to create your own custom connectors, but in many cases you will not need to because there are so many connectors in the box already. Personally, what I like about Fivetran is that I can stream my data into it, choosing between Amazon Kinesis, Apache Kafka, and others.

Pricing

Fivetran pricing is flexible, and costs are calculated based on your workload. There are four pricing/workload plans: Starter ($1 per credit), Standard ($1.5 per credit), Enterprise ($2 per credit), and Business Critical (contact support for this one). In fairness, there are many things you can achieve with these powerful tools; they make things easy for you, and I cannot cover it all here, so the best way is to explore them yourself. Alteryx provides a 30-day free trial with full access to all the Alteryx Designer resources. On the other hand, Fivetran provides a 14-day free trial during which you can freely use its 150+ connectors. I will advise you to try both and see which one fits your business processes.


Previous Value Function in Tableau

At the beginning of my Tableau journey back in 2018, I was going through a goal news edition and came across something similar to the image below. I thought: what if I could do something similar in Tableau with the previous value function? So I jumped into Tableau with the Sample Superstore dataset to see what was possible. It was surprising to see * in the tooltip whenever the list of items per region exceeded one. I was happy with the screenshot above but surprised by the screenshot below; I got that asterisk because Tableau aggregates using ATTR, and ATTR returns a value only if all the rows share a single value, otherwise it returns *. I love investigating, so I traced this back to how many products we have per state. Notice that we have 103 marks, which indicates 103 products (since this is more than one, the asterisk is expected, according to the definition above). Enough talking; let's get down to the solution. With Tableau's previous value function we can build something similar to what I wanted. Here is a quick glance at Tableau's definition of PREVIOUS_VALUE: it returns the value of this calculation in the previous row.

Calculated Fields

The first calculated field builds the running list of product names:

    IF FIRST() = 0 THEN MIN([Product Name])
    ELSE PREVIOUS_VALUE(MIN([Product Name])) + ', ' + MIN([Product Name])
    END

The second calculated field keeps only the last row of each partition:

    LAST() = 0

Tableau Worksheet

1. Build the bar chart by dragging State to Rows and Sales to Columns.
2. Build another worksheet that will be used as the tooltip in the bar chart; this is where we use the two calculated fields.
   a. Create a new sheet.
   b. Drag State to Rows, followed by Product Name, then drag the first calculated field to Rows.
   c. Compute the table calculation using Product Name as shown below.
3. Also drag the second calculated field to Rows, then compute it using Product Name as shown above.
4. Hold Ctrl and drag the second calculated field to Filters, then check True.
5. Go back to the bar chart worksheet, click on Tooltip, and insert the second worksheet you just built.
6. Hover over each bar to see whether your tooltip is working fine, then format to your taste.

Now that you know how the previous value function works in Tableau, you can go ahead and try your hands at it. If you have any more questions, feel free to ask in the comment box. Thanks for reading.
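As a side note for readers who think in code: the first calculated field is just a running string concatenation partitioned by state, and the LAST() = 0 filter keeps the final, fully concatenated row of each partition. Below is a minimal, purely illustrative Python sketch of that logic; the state and product values are made-up sample data, not the Superstore dataset.

    from collections import OrderedDict

    # Made-up (state, product) rows, already sorted by state as Tableau would partition them.
    rows = [
        ("Alabama", "Bookcase"),
        ("Alabama", "Stapler"),
        ("Arizona", "Chair"),
        ("Arizona", "Desk Lamp"),
        ("Arizona", "Envelopes"),
    ]

    concatenated = OrderedDict()
    for state, product in rows:
        if state not in concatenated:
            # FIRST() = 0: start the string with the first product name.
            concatenated[state] = product
        else:
            # Otherwise: PREVIOUS_VALUE(...) + ', ' + current product name.
            concatenated[state] += ", " + product

    # Keeping only the final row per state is what the LAST() = 0 filter does in Tableau.
    for state, products in concatenated.items():
        print(state, "->", products)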


What is Snowflake used for? 5 Benefits to Your Business

Introduction

Data warehousing is an evolving field. The old models of data warehousing, such as batch processing and ETL (extract, transform, load), are inefficient for modern data-intensive applications, and traditional data warehouses can't handle the number of concurrent updates that occur when they're in use. Snowflake is a cloud-computing data warehouse, and in this article you will learn how it can benefit your business.

What is Snowflake?

Snowflake was founded in 2013 and has a global customer base of over 3,000 organizations. The data warehouse is designed for scalability and performance, providing companies with the ability to analyze their data for insights that drive business decisions. Snowflake is an enterprise-grade data warehouse used to store massive amounts of sensitive data (such as CRM data) for later use. Data warehouses are used in most businesses to store all their data for future use; businesses do this to ensure that the data is not lost and can be retrieved when needed. Data warehouses are usually structured in different models, with each model having its own benefits.

Why You Should Use Snowflake for Your Data Storage Needs Today

Snowflake is a data warehouse based on the relational model of data storage. It is designed to deal with big data, and it is a highly scalable, cost-effective, and secure cloud solution for data warehousing. You should use it for storage because it can handle huge datasets, such as those generated by IoT devices, that are bigger than what traditional databases can handle.

The Architecture of Snowflake's Data Warehouse

Back in the day, when we bought cable TV, we always got a package deal of both content and infrastructure. Today, we can have them separately, so you have control over what you use and pay for. Similarly, Snowflake separates computation and storage, and companies with high storage needs but low CPU usage benefit from this architecture. Snowflake's architecture is made up of three layers: the database storage layer, the compute layer, and the cloud services layer.

Benefits of Snowflake to Your Business

Effective Data Sharing

Snowflake's architecture allows smooth data sharing among Snowflake users. Companies can also share data with consumers regardless of whether those consumers are Snowflake customers or not: a client account can be created for them directly from the user interface and managed by the provider.

Accessibility and Performance

Gone are the days when you could experience delays or failures due to concurrency, as with a traditional warehouse. With Snowflake's unique multi-cluster architecture, you can run your queries concurrently without any hassle, and the cloud is elastic, so if you need to run a large volume of queries quickly, you can scale up.

Security

Snowflake is a well-known cloud-based data warehousing platform used by government, health, banking, and security businesses. It has built-in protections that comply with SOC 1 and SOC 2 Type II standards.

Simplicity

Snowflake is fully cloud-based and requires no IT installation or management. It has built-in speed optimization, data security, and safe data exchange, ensuring that datasets of any complexity can be accessed and restored instantly.

Low Cost

Snowflake provides a range of pricing based on your usage and allows you to switch computing resources on and off. Snowflake's low-cost cloud data platform allows you to store unlimited amounts of data and only pay for what you use.
Additional Benefit: Optimized Marketing

The marketing automation space is one of the most competitive, because it is so important for companies to stay ahead of their competitors and get the most out of their marketing spending. This is where the Snowflake data warehouse comes in: it is the power behind the marketing automation platforms in many companies. By using it, those platforms can provide marketers with a variety of data-driven tools that help them create optimal customer journeys for each person.

Conclusion

Snowflake data warehouses are one of the most popular architectures for data warehousing. They provide real-time analytics, fast queries, and massive scalability, which makes them more effective than older models of data warehousing.


Kickstart an Alteryx Workflow with Alteryx API

API stands for Application Programming Interface. An API is a specific set of code, instructions, and specifications used by software programs to communicate with other applications and devices; it is software that facilitates interaction between programs.

What does the Alteryx API do? The Alteryx API allows you to programmatically interact with Alteryx products to extend and customize your experience. Alteryx provides several APIs: the Alteryx Engine API, the .NET API, the Connect API, the Gallery API, and the Alteryx Analytics Hub (AAH) API. This article will go through the steps to kickstart an Alteryx workflow with the Alteryx Gallery API. We will be solving a problem with the Alteryx API, and Python will be the platform through which the Alteryx API communicates with the Alteryx Server Gallery. We will achieve this via the following steps:

1. Get the Server Gallery API endpoint
2. Authenticate the connection to the Alteryx Server Gallery
3. Pass in the function's parameters to connect with the Alteryx Gallery
4. Use functions to kickstart an Alteryx workflow
5. List all the workflows in the Gallery
6. Write code to run your workflow

Step 1. Get the Server Gallery API credentials (API key, secret key, and the Gallery URL)

We can get the credentials from the admin section of the Alteryx Gallery, as shown in the screenshot below. These credentials include the API key and the secret key, which enable us to connect and authenticate with the server. Secondly, we need the URL of the Gallery with the API endpoints appended. It will look like this: https://dev-alteryx-cndro.com/. We append the endpoints based on what we need to do.

Step 2. Authenticate with the Alteryx Gallery API

To do this, we need to authenticate using the OAuth 1.0a signature, which is described here: https://github.com/Kong/mashape-oauth/blob/master/FLOWS.md#oauth-10a-one-legged. Interestingly, there is a Python library that can help us authenticate with the Alteryx Gallery, and we are going to use that package. You can find it here: GitHub - Theamazingdp/AlteryxGalleryAPI: Python for Connecting and Working with Alteryx Gallery API. To keep things consistent, we created a Python class and a function for each task in a Python script. The first function, buildOauthParams, builds the authentication parameters; the second, generateSignature, signs the request to authenticate the connection.
    import time
    import collections
    import random
    import math
    import string
    import sys
    import requests
    import base64
    import urllib.parse
    import hmac
    import hashlib
    from credentials import apiKey, apiSecret  # your own module holding the Gallery API key and secret
    from argparse import ArgumentParser


    class Gallery(object):
        def __init__(self, apiLocation, apiKey, apiSecret):
            # apiLocation is the Gallery host, e.g. 'dev-alteryx-cndro.com'
            self.apiLocation = 'http://' + apiLocation + '/gallery/api/v1'
            self.apiKey = apiKey
            self.apiSecret = apiSecret
            self.apiVerbose = False  # referenced later in executeWorkflow; set to True for extra logging

        @staticmethod
        def generate_nonce(length=5):
            # Pseudo-random nonce used by buildOauthParams below.
            return ''.join([str(random.randint(0, 9)) for _ in range(length)])

        def buildOauthParams(self):
            return {'oauth_consumer_key': self.apiKey,
                    'oauth_nonce': self.generate_nonce(5),
                    'oauth_signature_method': 'HMAC-SHA1',
                    'oauth_timestamp': str(int(math.floor(time.time()))),
                    'oauth_version': '1.0'}

        def generateSignature(self, httpMethod, url, params):
            q = lambda x: requests.utils.quote(x, safe="~")
            sorted_params = collections.OrderedDict(sorted(params.items()))
            normalized_params = urllib.parse.urlencode(sorted_params)
            base_string = "&".join((httpMethod.upper(), q(url), q(normalized_params)))
            secret_bytes = bytes("&".join([self.apiSecret, '']), 'ascii')
            base_bytes = bytes(base_string, 'ascii')
            sig = hmac.new(secret_bytes, base_bytes, hashlib.sha1)
            return base64.b64encode(sig.digest())

In the Python code above, we wrote the functions that connect and authenticate with the Alteryx Gallery. In the next steps, we are going to use these functions to connect.

Step 3. Pass in the function's parameters to connect with the Alteryx Gallery

    # Pass the Gallery host only; the class prepends the scheme and appends /gallery/api/v1.
    Establish_connection = Gallery('dev-alteryx-cndro.com', apiKey, apiSecret)

Step 4. Use these functions to kickstart Alteryx workflows with the Alteryx API

At this point, we will start communicating with the Alteryx Server Gallery resources using the API. To do this, we need to list all the workflows in the Gallery and then locate the workflow we want to execute. We will create a method that lists all the workflows, and we will call it subscription. The method, which belongs to the Gallery class above, is defined in the following lines of code and outputs all the Alteryx workflows in a subscription.

    def subscription(self, search=""):
        # Method of the Gallery class defined above.
        method = 'GET'
        url = self.apiLocation + '/workflows/subscription/'
        params = self.buildOauthParams()
        if search != "":
            params.update({'search': search})
        signature = self.generateSignature(method, url, params)
        params.update({'oauth_signature': signature})
        try:
            output = requests.get(url, params=params)
            output.raise_for_status()
        except requests.exceptions.HTTPError as err:
            print(err)
            sys.exit(err)
        except requests.exceptions.RequestException as err2:
            print(err2)
            sys.exit(err2)
        return output, output.json()

Step 5. List all the workflows in the Gallery

Here, we are going to use the subscription method we created. Note that it returns both the raw response and the parsed JSON list of workflows.
    # subscription() returns a tuple of (response, parsed JSON), so unpack it here.
    response, list_all_workflows_on_server = Establish_connection.subscription()

We also need a method to queue a job for a given workflow (app) ID. The executeWorkflow method below, which also belongs to the Gallery class, posts to the workflow's jobs endpoint, optionally with a payload of answers for an analytic app.

    def executeWorkflow(self, appId, **kwargs):
        # Method of the Gallery class defined above.
        method = 'POST'
        url = self.apiLocation + '/workflows/' + appId + '/jobs/'
        params = self.buildOauthParams()
        signature = self.generateSignature(method, url, params)
        params.update({'oauth_signature': signature})
        if 'payload' in kwargs:
            if self.apiVerbose:
                print('Payload included: %s' % kwargs['payload'])
            payload_data = kwargs['payload']
            try:
                output = requests.post(url, json=payload_data,
                                       headers={'Content-Type': 'application/json'},
                                       params=params)
                output.raise_for_status()
            except requests.exceptions.HTTPError as err:
                print(err)
                sys.exit(err)
            except requests.exceptions.RequestException as err2:
                print(err2)
                sys.exit(err2)
        else:
            if self.apiVerbose:
                print('No Payload included')
            try:
                output = requests.post(url, params=params)
                output.raise_for_status()
            except requests.exceptions.HTTPError as err:
                print(err)
                sys.exit(err)
            except requests.exceptions.RequestException as err2:
                print(err2)
                sys.exit(err2)
        return output, output.json()

Now that we have all the workflows in our subscription, we will isolate the information specific to the workflow we care about. We will provide an interface to get the name of the workflow from our users; the program will ask at execution time which workflow (name) you are looking for.

Step 6. Write the following lines of code to run your workflow

    Search_for_workflows = input('Hi, please enter the name of the workflow you would like to run: ')

    found = False
    for workflow in list_all_workflows_on_server:
        if workflow['metaInfo']['name'] == Search_for_workflows:
            found = True
            print('Congratulations, the workflow name you searched for is available on the server')
            workflow_id = workflow['id']
            print('This is the workflow id for the name you searched on the server:', workflow_id)
            # Uncomment the next line to actually queue the job on the Gallery.
            # fire_up_a_workflow = Establish_connection.executeWorkflow(workflow_id)
            break
    if not found:
        print('Please make sure you entered the workflow name correctly, or enter another workflow name to run')

Final Thoughts

Alteryx is a powerful yet easy-to-use piece of software that makes life super easy, and integrating it with other applications using the Gallery API is golden. It helps to solve lots of data engineering problems at the enterprise level. You can bring Alteryx into JavaScript applications, Android applications, Spark applications, and so on using the Gallery API. With an event-based trigger, you can instruct your application to kickstart an Alteryx workflow through the Gallery API when an event fires. If you have any questions, please feel free to ask in the comment section.
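One more thing worth knowing, offered as a hedged sketch rather than part of the original walkthrough: the Gallery API also exposes a jobs endpoint that you can poll to see whether a queued job has finished. The method below follows the same pattern as subscription and executeWorkflow above, but the endpoint path, method name, and response fields are assumptions on my part; confirm them against your Gallery's API documentation before relying on them.

    def getJobStatus(self, jobId):
        # Assumed method of the Gallery class above; polls GET /jobs/{jobId}/ for the job's state.
        method = 'GET'
        url = self.apiLocation + '/jobs/' + jobId + '/'
        params = self.buildOauthParams()
        signature = self.generateSignature(method, url, params)
        params.update({'oauth_signature': signature})
        output = requests.get(url, params=params)
        output.raise_for_status()
        return output, output.json()

    # Usage sketch, assuming executeWorkflow() returned the queued job's JSON:
    # job_response, job_json = Establish_connection.executeWorkflow(workflow_id)
    # status_response, status_json = Establish_connection.getJobStatus(job_json['id'])
    # print(status_json.get('status'))  # exact field names depend on your Gallery version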


Loading Data into Snowflake Table using SnowSQL Client

Recently, Snowflake has gained momentum in the cloud data warehousing space. Snowflake offers a cloud-based data storage and analytics service, generally known as "data warehouse-as-a-service", which lets corporate users store and analyze data using cloud-based hardware and software. Snowflake tables store data, and there are three types: permanent, transient, and temporary. SnowSQL is a command-line client used to connect to Snowflake to execute SQL queries and perform all DDL and DML operations, including loading data into and unloading data out of database tables. You might be wondering how you can transfer data from your local machine into Snowflake. There are a few ways to load data from your computer into a Snowflake table, but we will be using SnowSQL in this article. Let's jump right into it.

To use SnowSQL, you will need to install it on your PC. You can download SnowSQL for your platform from the Snowflake repository; it is available for Windows, Linux, and macOS. Below is the index of the SnowSQL repository.

After you have run the Windows installer, open CMD as an administrator and verify the installation with snowsql -v.

Use the SnowSQL client to ingest the CSV data into the Snowflake stage area

Log in to Snowflake using the following:

    snowsql -a accountName -u userName

In my case I have:

    snowsql -a dja65956.us-east-1 -u omolayo

Provide your password, then select the database with this line:

    USE DATABASE YOURDATABASENAME;

In my case, I have:

    USE DATABASE TESTING_DB;

The next step is to upload the data from your local computer to the Snowflake database stage using this:

    PUT file://C:\Users\Omolayo\Documents\CndroData\SnowflakeDatatesting.csv @cndro_stage;

The data should look like this. The screenshot below shows the communication between your computer and your Snowflake cloud console.

Use the Snowflake console to copy the data into the table you have created, using the SQL code below:

    COPY INTO TESTING
      FROM @cndro_stage
      FILE_FORMAT = (type = csv field_optionally_enclosed_by='"')
      PATTERN = '.*SnowflakeDatatesting.csv.gz'
      ON_ERROR = 'skip_file';

You should now have your table populated as shown. We are working in the worksheet editor console; please see the screenshot.

Final Thoughts

In today's guide, we walked through how to load data into a Snowflake table using the SnowSQL client. Kindly share and follow for more articles. Thanks for reading, and see you next time.
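As an aside, and not part of the original walkthrough: if you would rather script the same load from Python than type commands into the SnowSQL client, the Snowflake Connector for Python (covered in the first article above) can execute the same PUT and COPY INTO statements. Below is a minimal sketch, assuming the same stage, table, database, and file path used above, with placeholder credentials.

    import snowflake.connector

    # Placeholder credentials; reuse the account and user from the SnowSQL login above.
    conn = snowflake.connector.connect(
        user='omolayo',
        password='xxxxxxxx',
        account='dja65956.us-east-1',
        database='TESTING_DB'
    )
    cur = conn.cursor()
    try:
        # Assumes your user has a default warehouse; otherwise execute "USE WAREHOUSE ..." first.
        # Stage the local CSV file (same PUT command as in the SnowSQL session above).
        cur.execute(
            "PUT file://C:\\Users\\Omolayo\\Documents\\CndroData\\SnowflakeDatatesting.csv @cndro_stage"
        )
        # Copy the staged file into the TESTING table (same COPY INTO statement as above).
        cur.execute(
            "COPY INTO TESTING FROM @cndro_stage "
            "FILE_FORMAT = (type = csv field_optionally_enclosed_by='\"') "
            "PATTERN = '.*SnowflakeDatatesting.csv.gz' "
            "ON_ERROR = 'skip_file'"
        )
    finally:
        cur.close()
        conn.close()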
