Data science unifies statistics, data analysis, machine learning, and their related methods to understand and analyze real-world phenomena with data. It employs techniques and theories drawn from many fields, including mathematics, statistics, information science, and computer science.
In this track, we’ll be exploring the tools and techniques to get you started on your journey.
You’ll pick up the basic building blocks of how to analyze and communicate data findings.
The first course you’ll take is Data Analysis Basics, where you’ll learn core terminology and definitions, as well as how to think about data. Next, we’ll cover some Python topics, as it’s one of the languages data scientists use most. You’ll establish a firm foundation in Python sequences such as lists and tuples, dictionaries, and more.
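As a small taste of those Python building blocks, here is a minimal sketch using only built-in types. The variable names and values are made up for illustration and are not taken from the course material.

```python
# Illustrative data only: names and values are hypothetical.

# list: ordered and mutable
temperatures = [21.5, 23.0, 19.8]
temperatures.append(22.4)

# tuple: ordered and immutable, often used for fixed-size records
reading = ("2024-01-15", 21.5)
date, value = reading  # tuple unpacking

# dictionary: maps keys to values
city_highs = {"Oslo": 19.8, "Lima": 23.0}
city_highs["Cairo"] = 35.1

# Lists and tuples are both sequences, so they support
# indexing, slicing, and len().
first_two = temperatures[:2]
average = sum(temperatures) / len(temperatures)

# Find the key whose value is largest.
warmest_city = max(city_highs, key=city_highs.get)

print(first_two, round(average, 3), warmest_city)
```

Lists suit collections that grow or change, tuples suit small fixed records, and dictionaries suit lookups by name.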
Next, we’ll cover how to install and use Anaconda and Jupyter Notebooks, two useful tools for your Python work. You’ll also start creating charts with matplotlib, an industry-standard Python data visualization library. Matplotlib lets you generate a wide variety of plots and charts in a few lines of Python code.
You’ll get a basic introduction to NumPy, the fundamental package for scientific computing in Python, and then pandas, which provides fast, flexible, and expressive data structures for your Python data work.
We’ll then cover some best practices for cleaning and preparing data, data visualization, and an introduction to scraping data from the web. To wrap up this track, you’ll take our Introduction to Big Data course and then our Machine Learning Basics course.
Ready to take the next step in your Data Science career? Let’s get started!