This class is a comprehensive introduction to Python for data analysis and visualization. It is best suited to people who have some basic programming knowledge and want to take it to the next level. It introduces how to work with different data structures in Python and covers the most popular Python data analysis and visualization modules, including NumPy, SciPy, pandas, Matplotlib, and seaborn. We use the IPython Notebook to demonstrate the results of code and to change code interactively throughout the class.
Python is a high-level programming language. You will learn the basic syntax and data structures in Python. We demonstrate and run code within the IPython Notebook, a tool that provides a robust and productive environment for interactive and exploratory computing.
Python is an object-oriented programming (OOP) language. Having some basic knowledge of OOP will help you understand how Python code works. More often than not, you will have to deal with data that is dirty and unstructured. You will learn several ways to clean your data, such as applying regular expressions.
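As a rough illustration of cleaning data with regular expressions, here is a minimal sketch using Python's built-in `re` module. The sample records, the `name ; tel: number` layout, and the `clean_record` helper are made up for this example and are not taken from the course materials.

```python
import re

# Hypothetical "dirty" records: inconsistent spacing, case, and phone formats.
raw_records = [
    "  Alice SMITH ; tel: (212) 555-0101 ",
    "bob   jones;tel:212.555.0102",
    "Carol  Lee ; tel: 212 555 0103",
]

def clean_record(record):
    """Return a (name, phone) tuple in a consistent format."""
    name_part, phone_part = record.split(";", 1)
    # Collapse runs of whitespace, trim the ends, and normalize capitalization.
    name = re.sub(r"\s+", " ", name_part).strip().title()
    # Drop every non-digit character, then re-insert dashes.
    digits = re.sub(r"\D", "", phone_part)
    phone = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return name, phone

cleaned = [clean_record(r) for r in raw_records]
for name, phone in cleaned:
    print(name, phone)
```

Real data usually needs extra checks (for example, rejecting numbers that do not have exactly ten digits), which this sketch leaves out.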
Two modules for scientific computation make Python powerful for data analysis: NumPy and SciPy. NumPy is the fundamental package for scientific computing in Python. SciPy is an expanding collection of packages addressing scientific computing.
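To give a feel for how the two packages fit together, the sketch below does array arithmetic with NumPy and then standardizes the same values with `scipy.stats.zscore`. The height measurements are invented for illustration.

```python
import numpy as np
from scipy import stats

# Made-up sample of heights in centimeters.
heights_cm = np.array([160.0, 172.5, 168.0, 181.0, 175.5])

# NumPy applies arithmetic to the whole array at once (no explicit loop).
heights_m = heights_cm / 100.0
mean_cm = heights_cm.mean()

# SciPy builds on NumPy arrays; zscore standardizes values to mean 0, std 1.
z_scores = stats.zscore(heights_cm)

print(heights_m)
print(mean_cm)
print(z_scores)
```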
Python can also generate graphics easily using “Matplotlib” and “Seaborn”. Matplotlib is the most popular Python library for producing plots and other 2D data visualizations. Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing statistical graphics.
Pandas provides rich data structures and functions for working with structured data. The “DataFrame” object in Pandas is closely analogous to the “data.frame” object in R. Pandas makes data manipulation (filter, select, group, aggregate, etc.) as easy as in R.
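The manipulation steps named above (filter, select, group, aggregate) might look something like the following in pandas. The `sales` table and its column names are hypothetical and serve only to show the pattern.

```python
import pandas as pd

# Hypothetical sales table for illustration.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "B", "A"],
    "units": [10, 4, 7, 12, 3],
})

# Filter: keep rows with at least 5 units sold.
large = sales[sales["units"] >= 5]

# Select: keep only the columns needed for the summary.
subset = large[["region", "units"]]

# Group and aggregate: total units per region.
totals = subset.groupby("region")["units"].sum()

print(totals)
```

R users may recognize the same workflow as a chain of `filter`, `select`, `group_by`, and `summarise` on a `data.frame`.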
Students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged after the course. Certificates are awarded at the end of the program upon satisfactory completion of the course.
Saturdays, June 13 - July 18, 2020
1:00 pm - 5:00 pm
Sundays, August 2 - August 30, 2020
1:00 pm - 5:00 pm
Hasan Aljabbouli is an Assistant Professor in Computer Science. He obtained his Master's and Doctorate in Artificial Intelligence from Cardiff University in the United Kingdom and his Bachelor's in Engineering in Information Technology from Homs University. He has worked at several universities and has published many scholarly works on data mining, machine learning, and their applications. In addition to his academic experience, Hasan has received two patents and gained relevant experience participating in various technical projects.
Alex Baransky received his degree in Environmental Biology from Columbia University. He has experience with multiple computer languages, including Python, R, and SQL. An engineer at heart and a biologist by training, Alex is passionate about animal behavior and finding innovative ways to use data science in the field of biology.
Get a comprehensive introduction to Data Analysis and Data Visualization using the Python programming language today.