About the Course

Learning from data to produce useful predictions and insights is a central task of data science. It draws on skills and knowledge from a wide variety of fields, including statistics, artificial intelligence, and effective visualization, as well as efficient (big) data engineering, processing, and storage. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? With a special focus on the full data science process, this course teaches critical concepts and practical skills in computer programming and statistical inference, in conjunction with hands-on analysis of datasets. Topics include data cleaning; sampling; data management for fast, reliable access to big data; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization.