Data science is a multidisciplinary field of study that uses scientific methods, processes and systems to extract both explicit and implicit information from a variety of data structures. It combines knowledge of mathematics and statistics, programming and data analytics. This master's programme offers a variety of courses with an emphasis on data analytics. Students are free to choose from three learning modules, Data Computing, Data Analytic, and Finance & Business Analytic, to match their interests and career paths. The aim of the programme is to produce knowledgeable, ethical and competitive graduates who can contribute to the nation.
Study Duration
Minimum 3 semesters (1½ years)
Maximum 4 semesters (2 years)
Intake
Intake – every October
*Subject to the UKM academic calendar
Data Computing Module
Semester | Core Course | Elective Course |
I | STQD6014 STQD6214 STQD6414 | Choose four (4): STQD6124 STQD6324 STQD6114 STQS6444 STQM6154 STQD6334 STQD6524 |
II | STQD6024 STQP6014 | |
III | STQD6889 | |
Total Credit | 29 | 16 |
Data Analytic Module
Semester | Core Course | Elective Course |
I | STQD6014 STQD6214 STQD6414 | Choose four (4): STQD6124 STQS6284 STQD6114 STQD6134 STQS6444 STQS6234 STQM6154 STQD6334 STQD6524 |
II | STQD6024 STQP6014 | |
III | STQD6889 | |
Total Credit | 29 | 16 |
Finance and Business Analytic Module
Semester | Core Course | Elective Course |
I | STQD6014 STQD6214 STQD6414 | Choose four (4): STQD6124 STQD6134 STQD6114 STQS6444 STQD6334 STQA6014 STQA6034 |
II | STQD6024 STQP6014 | |
III | STQD6889 | |
Total Credit | 29 | 16 |
STQA6014 Investment Analysis and Portfolio Management
The focus of this course is on investment decision making. It presents the applications of various investment instruments and their role in risk management. The concepts of risk and return are covered comprehensively. Efficient diversification is discussed with an emphasis on the construction of efficient portfolios. The different kinds of investment instruments are assessed and weighted. Share valuation methods and portfolio theories such as the Markowitz theory, the Single Index model and the Capital Asset Pricing Model are discussed. Fundamental and technical analyses are also explained. Behavioral finance theory and the Efficient Market Hypothesis are also included. Students will participate in learning activities consisting of journal article discussions and project presentations.
STQA6034 Issues in Risk Management and Insurance
This course has two main objectives. The first is to provide students with a broad perspective of risk management that emphasizes traditional risk management and insurance while introducing other types of risk management. The second is to equip students with the tools needed to analyze mathematical models that describe the loss process. The major topics covered are risk management (objectives, measurement, diversification and retention), hedging, corporate risk management, enterprise risk management, estimation methods (for complete and incomplete data) and model selection. Students will also be trained to use R and Excel for the relevant mathematical computations. At the end of the semester, students are required to give a presentation on an article from an agreed journal so that they appreciate the applicability of the concepts and methodologies covered in this course.
STQD6014 Data Science
This course aims to expose students to the basic principles of data science and Python programming. Students will be introduced to the concept of big data and the various types of data related to it. This course also covers the algorithms, processes, methods and analyses used in the field of data science, with examples and discussions using Python. Other topics covered are the current data technologies available for storing and archiving data.
STQD6024 Machine Learning
This course aims to expose students to concepts, techniques and algorithms in machine learning. Machine learning revolves around the development of computer systems that are able to learn and improve through experience and recorded data. It is among the main technologies in Big Data and its applications in various fields. Common topics covered include neural networks, decision trees and support vector machines. Advanced topics covered include ensemble and unsupervised learning as well as reinforcement and evolutionary learning.
STQD6114 Unstructured Data Analytics
The aim of this course is to introduce students to basic and current methods used to compile, summarize and analyze unstructured and semi-structured data. Unstructured data includes text, images and audio. Focus is given to algorithms and techniques for mining, exploring and analyzing unstructured data using suitable packages. Students are also exposed to sources of unstructured data. Related applications of unstructured data such as sentiment analysis, document clustering and information extraction are also discussed.
STQD6124 Data Visualization and Communication
This course introduces students to the basic principles of data visualization and communication. Students are exposed to the principles of visualization design, human perception, colour theory and effective data storytelling. Suitable graphs and charts for conveying information clearly are taught. Students will be trained to use visualization software such as R, ggplot, Matplotlib, D3 and others. Some specific graphical techniques will be introduced, such as visualizing multivariate, time series, spatial, text, hierarchical and network data.
STQD6134 Business Analytics
This course aims to expose students to the techniques and tools for transforming raw data into meaningful and useful information for business analysis purposes. It is divided into customer, operation and people analytics. Customer analytics focuses on how data is used to describe, explain and predict customer behavior. Operation analytics focuses on how data can be used to profitably match supply with demand in various business settings. This also covers how to model future demand uncertainties, how to predict the outcomes of competing policy choices and how to choose the best course of action in the face of risk. Finally, people analytics is a data-driven approach to managing people at work.
STQD6214 Mathematical Statistics with Computing
This course aims to expose students to the fundamentals of mathematical statistics, including descriptive statistics, graphical displays, sampling distributions, hypothesis testing and other methods in data analysis. This course also reflects the integral role of R in solving statistical problems computationally. Basic simulation concepts are discussed with examples. Students will learn how to generate data, analyze data using statistical methods and interpret the results obtained.
STQD6324 Data Management
This course aims to provide the fundamentals and state of the art of the technologies used in data management and big data solutions. Students will be introduced to data models, databases, querying and big data processing. It covers data security, data centres and the development of big data solutions such as the Hadoop ecosystem, including MapReduce and HDFS. Apache Spark will also be introduced, including Spark’s architecture, data distribution and parallelisation of tasks. Students will gain a better understanding of how to optimise big data processing using Spark’s memory caching, as well as the more advanced operations available in Spark.
STQD6334 Multicriteria Decision Making
The purpose of this course is to introduce the concepts and techniques in solving Multi-criteria Decision Making problems. The methods to be used to solve the problems depend on the type of problems. Topics included are decision making without probabilities, decision making with probabilities, decision making with sample information, decision making under uncertainties, Analytic Hierarchy Process, TOPSIS, VIKOR, PROMETHEE and ELECTRE.
STQD6414 Data Mining
This course explains in detail the process of knowledge discovery in databases (KDD) and data mining. It discusses data preparation, which includes data cleaning, integration, transformation, reduction and discretization. It also covers the general concepts of the data mining process for various types of data, such as stream, sequence, time series, text, spatial and web data.
STQD6524 Statistical Methods for Computational Biology
The aim of this course is to give exposure to statistical methods and computation in biology and bioinformatics. Focus is given to the understanding of basic statistical concepts and inferential statistics as well as their use in solving biological problems. This course covers topics such as introduction to genetic data, gene expression data, DNA sequence data, protein and RNA, sequence analysis, phylogenetics, gene expression analysis and microarray data analysis. Statistical methods that will be covered are inferential statistics, hypothesis testing, multivariate analysis, statistical modelling, experimental design, robust statistical techniques, Bayesian methods and Markov Chain Monte Carlo.
STQD6889 Capstone Project
The capstone project provides an experiential learning opportunity and gives students space to produce a product that is evaluated by potential employers. The project is drawn from real-world problems and executed in collaboration with industry, government or private agencies, or academics. Students will use the knowledge and skills they have obtained throughout their studies to help solve real problems. During the course of the project, students will be involved in the whole process of identifying and defining problems, proposing solutions and stating their limitations, performing analysis, reporting and presenting results, and giving suggestions.
STQM6154 Network Science
This course introduces mathematical theories in network science. Network science is a multidisciplinary field which investigates problems that can be understood through a network approach. Among the aims of network science are to find cross-network equations and to increase understanding of systems which are represented by networks through data analysis. The use of network science can be found in mathematics, social networks, biological systems and transportation.
STQP6014 Research Methodology and Industrial Seminar
The aim of this course is to give the background and methods needed to perform scientific research in the field of Data Science. Research ethics, research principles, research designs and the role of researchers are discussed. Research methodologies, sampling and data collection as well as critical literature review are introduced to students. Students will also be exposed to current issues and recent research in Data Science through a series of Data Science Seminars given by invited researchers and leading industry practitioners in this field.
STQS6234 Bayesian Inference
This course introduces students to Bayesian theory. Bayesian inference for normal distributions is discussed, as is Bayesian inference for distributions other than the normal, for example the Binomial and Poisson. Other topics include hierarchical Bayesian models, empirical Bayes, hypothesis testing, correlation, regression and analysis of variance.
STQS6284 Multivariate Analysis
This course introduces statistical methods for multivariate data. Emphasis is placed on students' comprehension of the concepts and theories in multivariate analysis. Among the topics covered in this course are matrix algebra, the multivariate normal distribution, hypothesis testing for multivariate data, principal component analysis, factor analysis, discriminant analysis and cluster analysis.
STQS6444 Time Series Modelling and Forecasting
The objectives of this course are estimating simple regression models, explaining the techniques for modeling trend and volatility in time series data, explaining the cointegrating relation between one or more time series, and at the same time highlighting several major issues in time series analysis that are related to stationarity, trend, volatility, and cointegration. In particular, for modeling trend and volatility, the focus will be on the ARCH-GARCH models. As for cointegration, the error-correction mechanism and the Johansen approach will be discussed. At the end of the semester, the students will be required to write one short report on the application of statistical testing methods and model analyses that are covered during the semester.