Statistical Principles
This course builds on students' existing knowledge from the previous courses on Python Programming for Data Analysis and Development as Freedom.
The plethora of data available in the public domain has led to a proliferation of stories and reportage, and hence policy implications, based on data. There is also a growing tendency to obfuscate using data: what data hides can sometimes be more important than what it reveals. Therefore, while a strong grasp of statistical methods is important for extracting meaningful information from data, it is equally or more important to do so while adhering to the principles of ethical data analysis and mining. Statistical literacy and application, however, are often fraught with incorrect inferences based on imperfect assumptions. This course aims to equip students to apply some fundamental principles of statistics and exploratory data analysis to real data. Real data can come from a variety of sources, including surveys and programme data arising from social policies that are dynamic in nature. Given the diverse nature of data collection and representation, a good grounding in statistical methods is critical for conducting ethically sound, evidence-based policy analysis and for fostering a culture of public reasoning using data.
The statistical topics covered in the course include quantifying uncertainty, some aspects of descriptive statistics, inferential principles, and exposure to selected advanced techniques of statistical machine learning. To help students communicate effectively using data, the course also aims to simplify the process of data-based information dissemination.
