Hybridization Model for Air Pollution Prediction Using Time Series Data

Authors

  • Roni Yunis, Universitas Mikroskil
  • Andri Andri, Universitas Mikroskil
  • Djoni Djoni, Universitas Mikroskil

DOI:

https://doi.org/10.31154/cogito.v10i1.619.422-435

Keywords:

time series data, air pollution, OSEMN, hybrid, LSTM-Prophet

Abstract

In recent years, data science analysis, particularly time series prediction, has been widely employed across industrial sectors. However, time series data are highly complex, especially when they contain seasonal patterns such as monthly, daily, or hourly fluctuations, and irregular fluctuations and external factors make accurate prediction harder still. This research therefore proposes a hybrid approach that combines SVR-SARIMA, SVR-Prophet, LSTM-SARIMA, and LSTM-Prophet to improve time series prediction accuracy. The study followed the OSEMN methodology (Obtain, Scrub, Explore, Model, iNterpret): gathering data, cleaning data, exploring data, developing models, and interpreting the results. Predictions of seasonal effects indicated that SO2 and NO2 will rise in both the dry and rainy seasons over the next two years (average daily increments of 0.0831 μg/m3 for SO2 and 0.0516 μg/m3 for NO2). Estimates suggest a decrease in the order of three particles. The evaluation showed that the SVR model outperformed the other three models (RMSE 7.765, MAE 5.477, MAPE 0.261). The best-performing hybrid model was LSTM-Prophet (99.74% accuracy), with an RMSE of 12.319, an MAE of 12.057, and a MAPE of 0.259.
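The abstract reports RMSE, MAE, and MAPE without giving their definitions. The following is a minimal sketch of the standard formulas, not the authors' evaluation code. It assumes that MAPE is expressed in percent and that accuracy is computed as 100 minus MAPE, a reading that is consistent with the reported 99.74% and MAPE of 0.259 but is not confirmed by the abstract. The function names and the sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape_percent(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (division by the actual value).
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def accuracy_from_mape(mape_pct):
    """Assumed convention: accuracy (%) = 100 - MAPE (%)."""
    return 100.0 - mape_pct


# Hypothetical pollutant readings and forecasts, for illustration only.
actual = [50.0, 40.0, 80.0, 100.0]
predicted = [55.0, 38.0, 76.0, 110.0]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
print("MAPE (%):", mape_percent(actual, predicted))
print("Accuracy (%):", accuracy_from_mape(mape_percent(actual, predicted)))
```

Under this assumed convention, a MAPE of 0.259 corresponds to an accuracy of about 99.74%, which matches the figure quoted for LSTM-Prophet.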

Published

2024-06-30

How to Cite

Yunis, R., Andri, A., & Djoni, D. (2024). Hybridization Model for Air Pollution Prediction Using Time Series Data. CogITo Smart Journal, 10(1), 422–435. https://doi.org/10.31154/cogito.v10i1.619.422-435