Data Exploration in Python for SAS Programmers (Part 1)
The aim of this article is to introduce basic data exploration in Python for SAS programmers. Every other job posting nowadays asks for Python programming experience, but before the Python craze there was a leading programming language for data analysis: SAS! (R programmers, please calm down! SAS is older than R, and R was typically restricted to the academic world.)
First things first: I am using a Jupyter notebook to write Python code. I prefer Anaconda’s distribution of Python, and you can download it here: https://www.anaconda.com/products/individual
How do you import data in Python? There are many types of files you can import, but I will stick to reading a CSV file here. We will use pandas, a data analysis and manipulation tool built on top of the Python programming language. You can find out more about pandas here: https://pandas.pydata.org/ . You can install it directly from a Jupyter cell; the command needs a leading exclamation mark when it is run from Jupyter.
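The install step might look like the sketch below. The exact command is an assumption, since the original notebook cell is not reproduced here: pip install pandas is the usual pip invocation, and it is shown as a comment so that the rest of the cell also runs as plain Python once pandas is present.

```python
# In a Jupyter cell, the leading "!" sends the line to the system shell.
# Uncomment the next line to install pandas from inside the notebook:
# !pip install pandas

# Once installed, import pandas under its conventional alias "pd".
import pandas as pd

# Print the installed version to confirm that the import worked.
print(pd.__version__)
```

If you installed the Anaconda distribution, pandas normally ships with it, so the install line may not be needed at all.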
Let’s go over the syntax. pd.read_csv is basically saying: go to the pandas library (referred to as pd) and fetch the read_csv function. Inside the parentheses of read_csv, we provide the path to our file. I selected the cars dataset, which is also available in the SASHELP library. Also, note the name “cars_df”: the “df” stands for dataframe. A dataframe is where the dataset is held in memory. The default…
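To make the read_csv call concrete, here is a minimal sketch. The two rows and three column names are made-up placeholders, not the real SASHELP cars data, and the in-memory text stands in for a file path so that the example runs without any file on disk.

```python
import io

import pandas as pd

# Stand-in for a CSV file: made-up rows, NOT the actual SASHELP cars data.
csv_text = io.StringIO(
    "Make,Model,MSRP\n"
    "MakeA,ModelX,20000\n"
    "MakeB,ModelY,35000\n"
)

# read_csv accepts a file path or a file-like object and returns a DataFrame.
cars_df = pd.read_csv(csv_text)

print(type(cars_df))   # the object class: a pandas DataFrame
print(cars_df.shape)   # (number of rows, number of columns)
print(cars_df.head())  # first rows of the dataframe
```

With a real file, you would pass its path as a string to read_csv instead of csv_text.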