Sains Malaysiana 51(8)(2022): 2655-2668
http://doi.org/10.17576/jsm-2022-5108-24
Hydroclimatic Data Prediction using a New Ensemble Group Method of Data Handling Coupled with Artificial Bee Colony Algorithm
(Ramalan Data Hidroklimatik menggunakan Kaedah Pengendalian Data Kumpulan Ensembel Baharu Digandingkan dengan Algoritma Koloni Lebah Buatan)
BASRI BADYALINA1,*, NURKHAIRANY AMYRA MOKHTAR1, NUR AMALINA MAT JAN2, MUHAMMAD FADHIL MARSANI3, MOHAMAD FAIZAL RAMLI4, MUHAMMAD MAJID4 & FATIN FARAZH YA'ACOB4
1Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Cawangan Johor, Kampus Segamat, 85000 Segamat, Johor Darul Takzim, Malaysia
2Department of Physical and Mathematical Science, Faculty of Science, Universiti Tunku Abdul Rahman, Kampar Campus, Jalan Universiti, Bandar Barat, 31900 Kampar, Perak Darul Ridzuan, Malaysia
3School of Mathematical Sciences, Universiti Sains Malaysia, 11800 Minden, Penang, Malaysia
4Universiti Teknologi MARA, Cawangan Johor, Kampus Segamat, 85000 Segamat, Johor Darul Takzim, Malaysia
Received: 22 August 2021/Accepted: 22 February 2022
Abstract
Linear regression is widely used in flood quantile studies that involve meteorological and physiographical variables. However, linear regression does not capture the complex nonlinear relationship between predictor and target variables. Hydrological applications that combine the group method of data handling (GMDH) model, the artificial bee colony (ABC) algorithm, and ensemble techniques are rare, particularly for prediction at ungauged sites. The GMDH model is known to be effective at capturing nonlinear relationships. Therefore, in this paper, we enhance the GMDH model by using the ABC algorithm to optimize the parameters of the GMDH partial descriptions under four transfer functions, namely the polynomial, radial basis, sigmoid, and hyperbolic tangent functions. Ensemble averaging then combines the outputs obtained with these transfer functions to form the new ensemble GMDH model coupled with the ABC algorithm (EGMDH-ABC). The results show that this method significantly improves the prediction performance of the GMDH model. The EGMDH-ABC model captures the nonlinearity in the data and thus produces better estimates, and its results are more robust, accurate, and efficient.
Keywords: ABC algorithm; GEV distribution; GMDH model; Peninsular Malaysia; ungauged site
Abstrak
Regresi linear digunakan secara meluas dalam kajian kuantiti banjir yang terdiri daripada pemboleh ubah meteorologi dan fisiografi. Walau bagaimanapun, regresi linear tidak mengenal pasti hubungan tidak linear yang kompleks antara pemboleh ubah peramal dan sasaran. Sukar untuk menemui aplikasi hidrologi yang menggunakan kaedah kumpulan model pengendalian data (GMDH), algoritma koloni lebah tiruan (ABC) dan teknik penggabungan, khususnya dalam meramalkan kuantil banjir di kawasan tiada data. Model GMDH dikenali sebagai model yang berkesan dalam mengenal pasti hubungan tidak linear. Oleh itu, dalam kajian ini, kami menambah baik model GMDH dengan menerapkan algoritma ABC untuk mengoptimumkan parameter penerangan separa model GMDH dengan beberapa fungsi pemindahan iaitu fungsi polinomial, asas radial, sigmoid dan tangen hiperbolik. Kemudian, penggabungan secara purata digunakan untuk menggabungkan hasil daripada pelbagai fungsi pemindahan tersebut dan membangunkan model baru iaitu EGMDH-ABC. Hasil kajian
menunjukkan bahawa kaedah ini meningkatkan prestasi ramalan model GMDH dengan
ketara. Model EGMDH-ABC berjaya mengenal pasti ketidaklinearan di dalam data
untuk menghasilkan anggaran yang lebih baik. Di samping itu, hasil keputusan yang lebih mantap, tepat dan cekap dapat dihasilkan.
Kata kunci: Algoritma
ABC; lembangan tiada data; model GMDH; Semenanjung Malaysia; taburan GEV
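The pipeline outlined in the abstract (partial descriptions passed through four transfer functions, coefficients tuned by ABC, outputs combined by ensemble averaging) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The quadratic two-input partial description, the exact forms of the transfer functions, the simplified single-neuron ABC loop, and all names and settings (`abc_minimize`, `fit_ensemble`, `n_food`, `limit`, the search bounds) are assumptions made here for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Assumed forms of the four transfer functions named in the abstract.
TRANSFER = {
    "polynomial": lambda z: z,
    "radial_basis": lambda z: math.exp(-z * z),
    "sigmoid": sigmoid,
    "tanh": math.tanh,
}


def partial_description(a, x1, x2):
    # Quadratic two-input polynomial (assumed form of the partial description).
    return (a[0] + a[1] * x1 + a[2] * x2 + a[3] * x1 * x2
            + a[4] * x1 * x1 + a[5] * x2 * x2)


def neuron(name, a, x1, x2):
    # Partial description followed by the chosen transfer function.
    return TRANSFER[name](partial_description(a, x1, x2))


def mse(name, a, data):
    return sum((neuron(name, a, x1, x2) - y) ** 2 for x1, x2, y in data) / len(data)


def abc_minimize(cost, dim, rng, n_food=10, limit=20, iters=200, lo=-1.0, hi=1.0):
    """Simplified artificial bee colony search over [lo, hi]^dim (illustrative)."""
    foods = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_food)]
    costs = [cost(f) for f in foods]
    trials = [0] * n_food
    best_i = min(range(n_food), key=costs.__getitem__)
    best, best_cost = foods[best_i][:], costs[best_i]

    def try_neighbour(i):
        # Perturb one coordinate of source i relative to a random other source.
        nonlocal best, best_cost
        k = rng.choice([j for j in range(n_food) if j != i])
        d = rng.randrange(dim)
        cand = foods[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (foods[i][d] - foods[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        c = cost(cand)
        if c < costs[i]:  # greedy selection
            foods[i], costs[i], trials[i] = cand, c, 0
            if c < best_cost:
                best, best_cost = cand[:], c
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(n_food):  # employed bee phase
            try_neighbour(i)
        # Onlooker phase: sources with lower cost are chosen more often.
        weights = [1.0 / (1.0 + c) for c in costs]  # assumes cost >= 0
        for i in rng.choices(range(n_food), weights=weights, k=n_food):
            try_neighbour(i)
        for i in range(n_food):  # scout phase: abandon exhausted sources
            if trials[i] > limit:
                foods[i] = [rng.uniform(lo, hi) for _ in range(dim)]
                costs[i], trials[i] = cost(foods[i]), 0
                if costs[i] < best_cost:
                    best, best_cost = foods[i][:], costs[i]
    return best, best_cost


def fit_ensemble(data, seed=0):
    # One ABC-tuned neuron per transfer function.
    rng = random.Random(seed)
    return {name: abc_minimize(lambda a, n=name: mse(n, a, data), 6, rng)[0]
            for name in TRANSFER}


def ensemble_predict(models, x1, x2):
    # Ensemble averaging: arithmetic mean of the member outputs.
    outputs = [neuron(name, a, x1, x2) for name, a in models.items()]
    return sum(outputs) / len(outputs)


if __name__ == "__main__":
    # Toy data with inputs and targets scaled to [0, 1] (made up for the demo).
    grid = [0.0, 0.5, 1.0]
    data = [(u, v, 0.2 + 0.5 * u * v) for u in grid for v in grid]
    models = fit_ensemble(data)
    print(ensemble_predict(models, 0.5, 0.5))
```

The sketch tunes a single neuron per transfer function, whereas a full GMDH network would build and select neurons layer by layer. Targets are assumed to be scaled to a range that every transfer function can reach.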
*Corresponding author; email: basribdy@uitm.edu.my