Sains Malaysiana 48(12)(2019): 2831–2839
http://dx.doi.org/10.17576/jsm-2019-4812-24
A Relative Tolerance Relation of Rough Set in Incomplete Information
(Perhubungan Toleransi Relatif Set Kasar dalam Maklumat tak Lengkap)
RD ROHMAT SAEDUDIN1*, SHAHREEN KASIM2, HAIRULNIZAM MAHDIN2, MOHD FARHAN MD FUDZEE2, EDI SUTOYO1, IWAN TRI RIYADI YANTO3, ROHAYANTI HASSAN4
1School of Industrial Engineering, Telkom University, 40257, Bandung, West Java, Indonesia
2Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor Darul Takzim, Malaysia
3Department of Information Systems, Universitas Ahmad Dahlan, 55161, Yogyakarta, Indonesia
4Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor Darul Takzim, Malaysia
Received: 21 February 2019/Accepted: 25 December 2019
ABSTRACT
Universities aim to increase student retention and to ensure that students
graduate on time. To help achieve these objectives, student learning
performance can be predicted using data mining techniques, e.g. by finding
essential association rules on student learning based on demographic data held
by the university. However, any mining technique requires complete data, i.e. a
dataset without missing values, to generate interesting rules for the detection
system. Furthermore, complete information is difficult to capture given the
nature of student data, and scanning the datasets takes high computational
time. To overcome these problems, this paper introduces a relative tolerance
relation of rough set (RTRS). The novelty of RTRS is that, unlike previous
rough set approaches that use the tolerance relation, the non-symmetric
similarity relation, or the limited tolerance relation, it extends the limited
tolerance relation by taking into account the relative precision between two
objects; this is therefore the first work that uses relative precision.
Moreover, this paper presents the mathematical properties of the RTRS approach
and compares its performance with that of the existing approaches on a
real-world student dataset for classifying university students' performance.
The results show that the proposed approach outperformed the existing
approaches in terms of computational time and accuracy.
Keywords: Classification; educational data mining; incomplete
information systems; rough set theory
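As a rough illustration of the baseline relations the abstract names, the following minimal sketch implements the tolerance relation and the limited tolerance relation as they are commonly defined in the rough set literature for incomplete information systems. It is not the RTRS relation itself: the abstract does not state how relative precision is computed, so that step is deliberately left out. The encoding of a missing value as `None`, the function names, and the small student table are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Set

MISSING = None  # assumption: a missing attribute value is encoded as None
Row = Dict[str, Optional[str]]


def tolerant(x: Row, y: Row, attrs: List[str]) -> bool:
    """Tolerance relation: on every attribute the two values agree,
    or at least one of them is missing."""
    return all(
        x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
        for a in attrs
    )


def limited_tolerant(x: Row, y: Row, attrs: List[str]) -> bool:
    """Limited tolerance relation: either both objects are missing on every
    attribute, or they share at least one known attribute and agree on all
    attributes that are known for both."""
    known_x = {a for a in attrs if x[a] is not MISSING}
    known_y = {a for a in attrs if y[a] is not MISSING}
    if not known_x and not known_y:
        return True
    shared = known_x & known_y
    return bool(shared) and all(x[a] == y[a] for a in shared)


def related_class(
    i: int,
    table: List[Row],
    attrs: List[str],
    relation: Callable[[Row, Row, List[str]], bool],
) -> Set[int]:
    """Indices of all objects related to object i under the given relation."""
    return {j for j, y in enumerate(table) if relation(table[i], y, attrs)}


# Hypothetical incomplete student table (attribute names are illustrative).
attrs = ["gender", "entry_grade", "region"]
students = [
    {"gender": "F", "entry_grade": "A", "region": None},
    {"gender": "F", "entry_grade": None, "region": "West"},
    {"gender": None, "entry_grade": None, "region": "West"},
    {"gender": "M", "entry_grade": "A", "region": "East"},
]

tol_class_0 = related_class(0, students, attrs, tolerant)
lim_class_0 = related_class(0, students, attrs, limited_tolerant)
```

In this toy table, objects 0 and 2 have no known attribute in common, so the tolerance relation still relates them while the limited tolerance relation does not. Relations of this kind, which require shared known values, are what the abstract says RTRS builds on.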
ABSTRACT (Malay-language version)
A university is an educational institution whose objectives include
increasing student retention and ensuring that students graduate within the
prescribed time. To achieve these objectives, students must keep their learning
performance consistent. Data mining techniques can be used to predict student
learning performance. However, missing or incomplete data limits the
effectiveness of data mining techniques, particularly in identifying the
relationship between students' learning attributes and their demographic
attributes. The problem becomes harder when large volumes of student data are
involved. This paper therefore proposes a relative tolerance relation of rough
set (RTRS) technique to address this issue. The distinguishing feature of RTRS
in this paper is its use of the relative precision between the attributes of
two objects. In addition, this paper presents the mathematical formulas used in
RTRS. The performance of the proposed RTRS technique is then compared with that
of the original techniques using a university student dataset to classify the
students' performance. The results show that the proposed RTRS technique
outperforms the existing techniques in terms of computational time and
accuracy.
Keywords: Classification; educational data mining; incomplete
information systems; rough set theory
*Corresponding author; email: rdrohmat@telkomuniversity.ac.id