Data Modelling and Analysis

Overview

  • Credit value: 15 credits at Level 4
  • Convenor: Dr Felix Reidl
  • Assessment: programming exercises (30%) and a data analysis mini-project (70%)

Module description

This module covers fundamental aspects of data science and analytics. You will develop basic mathematical knowledge and skills, including elements of linear algebra, preliminaries for calculus, discrete probability theory and the fundamentals of statistics.

We will show you how to use Python, a popular and powerful programming language, to solve computational tasks from these mathematical subjects. In particular, you will become acquainted with widely used Python libraries and packages for solving problems arising from linear algebra, probability theory and statistics.

Indicative syllabus

  • Taxonomy of data
  • Data representation (histograms, box plots)
  • Measures of central tendency (mode and the modal class, mean, median)
  • Measures of dispersion (range, interquartile range and percentiles, variance and standard deviation)
  • Counting and combinatorics (factorial, binomial coefficient)
  • Discrete probability (random variables, expectation, variance, and correlation)
  • Conditional probability and Bayes’ Rule
  • Common discrete distribution families (binomial, geometric, Poisson)
  • Vector spaces (vector operations, scalar product)
  • Matrix algebra (matrix product, linear transformations)
  • Metrics
  • Tools: Python, Jupyter notebooks, pandas, matplotlib
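
To give a flavour of the computational tasks these topics involve, here is a minimal sketch using only Python's standard library. The data set and the parameters n, k and p are made up for illustration and are not taken from the module materials; the module itself may use pandas or other libraries for the same calculations.

```python
import math
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 3, 3, 5, 7, 8, 8, 8, 10]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion (population variance and standard deviation)
data_range = max(data) - min(data)
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Counting: the binomial coefficient "5 choose 2"
n, k = 5, 2
n_choose_k = math.comb(n, k)

# Binomial distribution: probability of exactly k successes in n
# independent trials, each with success probability p (assumed value).
p = 0.5
binomial_pmf = n_choose_k * p**k * (1 - p) ** (n - k)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.3f}, std dev={std_dev:.3f}")
print(f"C({n},{k})={n_choose_k}, P(X={k})={binomial_pmf}")
```

The population forms `pvariance` and `pstdev` divide by the number of observations; the sample forms `statistics.variance` and `statistics.stdev` divide by one fewer and would give slightly larger values.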

Learning objectives

By the end of this module, you will have:

  • knowledge of basic linear algebra and matrix theory, basic discrete probability theory and statistics, and relevant Python libraries and packages
  • skills in programming in Python to solve computational tasks from linear algebra and discrete probability theory
  • an understanding of the link between the basic knowledge acquired from the module and data science/analytics applications.