Data Science Techniques and Applications
Overview
- Credit value: 15 credits at Level 7
- Convenor: Dr Alessandro Provetti
- Assessment: a data analysis mini-project (20%) and a two-hour examination (80%)
Module description
In this module we present data science as a set of nine computational problems, then examine the geometrical interpretation of data and its consequences. The 'Rating and Ranking' and 'Complex Network' models are also studied in some depth.
The module has been designed to overlap with the Machine Learning/Applied Machine Learning modules.
Indicative syllabus
- Data science as nine computational problems
- Statistics, linear algebra and information theory
- Python modules for data analytics such as NumPy and Scikit-learn
- The geometric view of data; the curse of dimensionality; spectral and decomposition techniques
- Advanced techniques: Non-negative Matrix Factorization (NMF) and Factorization Machines (FM)
- Rating and ranking, and their use in prediction (for example, in sports)
- From data to networks (graphs), and their relevant properties
- Network analysis in biology, international trade, computer networks, web search and finance
Learning objectives
By the end of this module, you will be able to:
- understand data science as nine computational and modelling problems
- deploy techniques for quantitative data analysis, such as information entropy, spectral analysis and matrix decomposition
- use Python to apply the techniques learned on the module
- validate and evaluate data analysis results
- demonstrate satisfactory knowledge of network models.
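As an informal illustration of two of the techniques named in the objectives above (information entropy and matrix decomposition), here is a minimal Python sketch using NumPy. It is not taken from the module materials: the function names `shannon_entropy` and `low_rank_approximation` and the toy data are hypothetical, chosen only to show the flavour of the calculations.

```python
import numpy as np


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()  # empirical probabilities, all strictly positive
    return float(-(p * np.log2(p)).sum())


def low_rank_approximation(X, k):
    """Rank-k approximation of X obtained by truncating its SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


# Hypothetical toy data: two equally frequent labels.
labels = ["a", "a", "b", "b"]
entropy_bits = shannon_entropy(labels)

# Hypothetical toy matrix: an outer product, so rank 1 by construction.
X = np.outer([1.0, 2.0, 3.0], [1.0, 0.5, 2.0, 4.0])
X1 = low_rank_approximation(X, 1)

print(entropy_bits)
print(np.allclose(X, X1))
```

Because the toy matrix has rank 1, keeping only its largest singular value reconstructs it up to floating-point error. On real data, the truncation discards the directions of least variation.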