# MPhil. Mathematical Statistics

Many laboratories, both government and private, maintain independent research staffs that include statisticians. Their work often deals with the development of new technology, including design and analysis of experiments, software development, and numerical simulation such as weather and climate forecasting, which depends heavily on supercomputers. Another major reason for studying statistics at a higher level is to pursue a career in teaching. However, the number of graduates being trained fails to meet even the minimum demand. Mathematics and Statistics departments of universities face a growing gap between the supply of and demand for faculty: dwindling numbers of students are entering graduate study in Statistics while a bumper crop of faculty are retiring. These trends need to be reversed. Rising public concern has gradually driven teachers’ salaries upward, and there is renewed interest in teaching as a career among graduates of undergraduate programmes in the mathematical sciences.

Graduate study in Statistics thus becomes essential. Undergraduate training provides the foundation upon which more advanced statistics courses are built. In graduate study, one or two further years of coursework complete this basic training. Thereafter, students take more specialized courses, often at the frontiers of research. Applied Statistics students take courses in various application areas to acquire experience in modeling the real world and to learn how statistics, coupled with mathematics, solves problems from the physical and biological sciences as well as risk management.

The breadth and depth of the Statistics graduate programme prepare students for managerial positions in government, business, and industry, and for research work that leads to the publication of theses, enrolment in PhD study, and employment as faculty in research institutions. The diversity of applications is an exciting aspect of the field and is one reason why the demand for well-trained statisticians continues to be strong. This is in line with the department’s graduate training of students to formulate abstract mathematical models for real-world problems and to design and apply appropriate computer-based solutions to them.

The programme emphasises the teaching of the theory and principles of mathematics, statistical theory and methodology, and applications, to provide the basis for meaningful practical work. The field exercise components of the programme give students a chance to work in groups on real, practical issues. Important components of the programme are the courses on sample surveys, operations research, and statistical quality control, in which students plan, prepare, and conduct their own surveys and experiments under the supervision of teaching staff. Statistical skills learned through lectures and laboratory work are also put into practice during the industrial attachment and in the writing of theses in the second year.

**Aims and Objectives**

The MPhil (Mathematical Statistics) emphasizes a broad, solid foundation in techniques and underpinnings of probability theory and statistical modeling. Its focus on breadth and depth is intended to produce well-rounded, knowledgeable scholars. This concentration is excellent preparation for academic positions in mathematical statistics and industrial or governmental positions that require broadly trained statisticians with a strong understanding of statistical theory.

The programme also emphasizes the theory and application of a broad array of statistical models, such as linear, generalized linear, nonlinear, categorical, spatial, correlated response, and nonparametric regression models. This prepares students to specify and choose appropriate models; fit the models using available statistical software; and make sound statistical conclusions and interpretive statements. It is excellent preparation for students interested in academic, industrial, or government positions that involve data modeling and analysis.

**Specifically, the programme will:**

- Provide solid training in mainstream advanced statistical modelling.
- Expose students to modern developments in Statistics.
- Be flexible in allowing the student to take a broad range of options, including modules within financial mathematics and measure theory.
- Reflect the research interests of the Department of Mathematics, including specialised topics in statistical shape analysis and directional data, and stochastic financial modelling.
- Enable students to pursue PhD study in Mathematical Statistics.
- Give the graduates of this programme an appropriate combination of statistical and industrial systems backgrounds so that they may have successful technical careers in industry or successful careers doing research in industrial statistics.
- Train academic industrial statisticians who can serve as better bridges between the academic and corporate worlds.

**Course Contents for each Semester**

**YEAR ONE: SEMESTER ONE**

**MSTAT 573: Mathematical Statistics (3, 2, 4)**

This course deepens students' mathematical understanding of statistical inference and decision theory. Topics to be covered include the following: Order statistics; Theory of estimation: criteria of estimation, sufficiency, completeness, uniqueness and exponential class probability density functions, Cramér-Rao inequality and methods of estimation; Statistical hypothesis testing: review of significance tests, power function, losses and risks, most powerful, generalised likelihood ratio, conditional and sequential tests; Decision theory: basic concepts, decision criteria, minimax and Bayesian estimation criteria; Non-parametric statistics: various estimation methods based on kernels, smoothing splines, local polynomials, etc. will be considered.

**MSTAT 577: Analysis of Categorical Data (3, 2, 4)**

This course introduces methods for analyzing response data that are categorical rather than continuous. Topics include: categorical response data and contingency tables; generalized linear models, including linear models for binary data, generalized linear models for counts, moments and likelihood for generalized linear models, inference for and fitting of generalized linear models, quasi-likelihood, and generalized additive models; loglinear and logit models; Poisson regression; model diagnostics; and estimation procedures. Procedures in statistical packages that can handle generalized linear models will be covered.

**MSTAT 579: Stochastic Processes (2, 2, 3)**

Review of Probability theory, Regularity of stochastic processes, Convergence of Random walks to Brownian Motion. Brownian motion and its Martingales, Diffusion Processes, Random Time change and 1-dimensional diffusions, Brownian Motion on the half line. Convergence of Markov Chains to Diffusions, Reflected processes in Higher Dimensions. Stochastic integrals, Itô’s Formula, Stochastic Differential Equations. Applications in industry and finance.

**MSTAT 583: Statistical Quality Control (2, 2, 3)**

Development of statistical concepts and theory underlying procedures used in quality control applications. Sampling inspection procedures, the sequential probability ratio test, continuous sampling procedures, process control procedures, and experimental design. Statistical quality control demonstrates how statistics and data analysis can be applied effectively to process control and management. Topics include the definition of quality, its measurement through statistical techniques, variable and attribute control charts, CUSUM charts, multivariate control charts, process capability analysis, design of experiments, and classical and Bayesian acceptance sampling. Statistical software will be used to apply the techniques to real-life case studies from manufacturing and service industries.

**MSTAT 585: Econometrics (2, 2, 3)**

Topics include: review of mathematical expectation, sampling distributions and inference, and regression basics. Multivariate regression: matrix form, dummy variables and interactions; testing linear restrictions using F-tests; inference problems (heteroscedasticity and autocorrelation). Instrumental variables and 2SLS; simultaneous equations models; measurement error. Panel data models. Volatility models: ARCH and GARCH family models, and multivariate volatility models. Practical sessions use EViews and R.

**MSTAT 587: Non-Parametric Methods (2, 2, 3)**

Topics include: Review of common smoothing techniques: Kernel estimates, nearest-neighbour estimates, spline smoothers, local polynomial estimators. Choice of smoothing parameters: Measures of estimation quality and rates of convergence, bandwidth selection by cross-validation, asymptotic distribution of kernel estimates, boundary kernels. Orthogonal series expansion and wavelets: Fourier series (some basic concepts), orthogonal series density estimates, orthogonal series regression estimates, Windowed Fourier Transform. Introduction to neural networks: From perceptron to nonlinear neuron, neural network regression, network specification. Various estimation methods based on kernels, smoothing splines, local polynomials and wavelets will be considered.

**YEAR ONE: SEMESTER TWO**

**MSTAT 572: Advanced Sample Survey Methods (2, 2, 3)**

Sample survey designs: basic concepts of sampling. Sampling designs: sampling with varying probabilities; stratified, systematic, and multistage sampling. Techniques of sample design: multiphase designs; selection with probability proportional to size (PPS). Probability sampling procedures; estimation of the population total, mean, and proportion. Non-probability sampling procedures. Jackknife and bootstrap procedures for resampling. Complex surveys. Ratio and regression estimation; panel design; model-based sampling; survey errors; and re-sampling methods. Use of appropriate software to calculate standard errors (variance estimation).

**MSTAT 574: Survival Analysis (2, 2, 3)**

Survival distributions, Types of censored data, Estimation for various survival models, Non-parametric estimation of survival distributions, The proportional hazard and accelerated lifetime models for covariate data, Regression analysis with lifetime data. Practical Aspects; Statistical models for transfers between multiple states (e.g., alive, ill, dead), the multi-state Markov model, relationship between probabilities of transfer and transition intensities, estimation for the parameters in these models; The binomial and Poisson models of mortality.

**MSTAT 576: Multivariate Analysis (2, 2, 3)**

In many disciplines the basic data on an experimental unit consist of a vector of possibly correlated measurements. Examples include the chemical composition of a rock, the results of clinical observations and tests on a patient, and household expenditures on different commodities. Through the challenge of problems in a number of fields of application, this course considers appropriate statistical models for explaining the patterns of variability of such multivariate data. Topics include: multivariate Normal distribution; distribution of the sample mean and covariance; multiple, partial and canonical correlation; multivariate regression; tests on means and covariances; ANOVA; principal components analysis; factor analysis; discriminant analysis and classification; cluster analysis; multidimensional scaling.

**MSTAT 578: Design and Analysis of Experiments (2, 2, 3)**

An introduction to the design and analysis of experiments. Topics include the design and analysis of completely randomized designs, randomized block designs, Latin square designs, incomplete block designs, factorial designs, fractional factorial designs, nested designs, split-plot designs, and response surface designs. Students will complete and present a research project on an advanced topic in experimental design. Applications involve the use of a statistical software package. Experimental Design and Analysis: Basic Concepts of Planning and Designing Experiments, Multiple Comparisons, Randomized Block Designs, Factorial Designs, Nested and Split-Plot Designs, Latin Square Designs; Analysis of Covariance and Confounding; Applications and use of Statistical Computing Packages (such as SPSS, R, Genstat, Excel, etc.).

**MSTAT 580: Statistical Computing and Consulting (2, 2, 3)**

Consulting: Introduction to consulting, Ethics, Consulting practice in industry, Student presentations about their internships, Scientific writing, Effective communication, Common issues in consulting and data analysis, Case studies. Computing: Introduction to statistical computing (R package). Least squares (regression): Penalized and weighted least squares, Density estimation and smoothing, and Matrix computations. Optimization (likelihood estimation): Newton-Raphson, Fisher scoring, Combinatorial optimization. Integration (probabilities): Quadrature and Laplace approximation. Resampling and Monte Carlo inference: Jackknife and Bootstrap, Permutation procedures, and Monte Carlo simulation. Statistical graphics.

**MSTAT 582: Bayesian Statistics (2, 2, 3)**

This course introduces the concepts of Bayesian inference and the analysis of data using Bayesian methods. Topics include: the concept of prior and posterior distributions; connections with the classical approach; estimation and loss; hypothesis testing and the Bayes factor; Bayesian computation and Markov Chain Monte Carlo.

**MSTAT 584: Advanced Time Series Analysis (2, 2, 3)**

Univariate time series: stationarity, autocorrelation function, trends, ARIMA processes, unit roots, fractional ARIMA processes, forecasting, distributed lags, maximum likelihood estimation (MLE), model selection criteria, regression models with ARIMA errors, and spectral analysis. Multivariate time series: Stable Vector Autoregressive (VAR) Models, Cointegration Techniques, Vector Error Correction Models (VECM), Structural VARs and VECMs, Unit Roots and Cointegration in Panels. Threshold models: TAR, STAR, ESTAR and LSTAR Models.

**MSTAT 586: Spatial Statistics (2, 2, 3)**

Spatial data structures: geostatistical data, lattices, and point patterns. Stationary and isotropic random fields. Autocorrelated data structures. Semivariogram estimation and spatial prediction for geostatistical data. Mapped and sampled point patterns. Regular, completely random, and clustered point processes. Spatial regression and neighborhood analyses for data on lattices.

**YEAR TWO: SEMESTER ONE**

**MSTAT 691: Thesis I (0, 9, 9)**

Seminar presentations on chosen thesis topic.

**MSTAT 693: Topics in Operations Research (3, 1, 3)**

Dynamic programming and heuristics. Project scheduling; probability and cost considerations in project scheduling; project control. Critical path analysis. Reliability problems, replacement and maintenance costs; discounting; group replacement, renewal process formulation, application of dynamic programming. Queuing theory in practice: obstacles in modeling queuing systems, data gathering and testing, queuing decision models, case studies. Game theory, matrix games; minimax strategies, saddle points, mixed strategies, solution of a game. Behavioural decision theory, descriptive models of human decision making; the use of decision analysis in practice.

**MSTAT 695: Demographic Methods (2, 2, 3)**

This course introduces the basic techniques of demographic analysis. Students will become familiar with the sources of data available for demographic research. Population composition and change measures will be presented. Measures of mortality, fertility, marriage and migration levels and patterns will be defined. Life table, standardization and population projection techniques will also be explored.

**MSTAT 697: Artificial Neural Networks (2, 2, 3)**

Introduction to artificial neural networks: Biological neural networks, Pattern analysis tasks: Classification, Regression, Clustering, Computational models of neurons, Structures of neural networks, Learning principles. Linear models for regression and classification: Polynomial curve fitting, Bayesian curve fitting, Linear basis function models, Bias-variance decomposition, Bayesian linear regression, Least squares for classification, Logistic regression for classification, Bayesian logistic regression for classification. Feed-forward neural networks: Pattern classification using perceptron, Multilayer feed-forward neural networks (MLFFNNs), Pattern classification and regression using MLFFNNs, Error back propagation learning, Fast learning methods: Conjugate gradient method, Auto-associative neural networks, Bayesian neural networks. Radial basis function networks: Regularization theory, RBF networks for function approximation, RBF networks for pattern classification. Kernel methods for pattern analysis: Statistical learning theory, Support vector machines for pattern classification, Support vector regression for function approximation, Relevance vector machines for classification and regression. Self-organizing maps: Pattern clustering, Topological mapping, Kohonen’s self-organizing map. Feedback neural networks: Pattern storage and retrieval, Hopfield model, Boltzmann machine, Recurrent neural networks.

**YEAR TWO: SEMESTER TWO**

**MSTAT 692: Thesis II (0, 15, 15)**

Oral examination on submitted thesis.