Açık Akademik Arşiv Sistemi

Evaluation of data preprocessing and feature selection process for prediction of hourly PM10 concentration using long short-term memory models

dc.contributor.authors Aksangur, Ipek; Eren, Beytullah; Erden, Caner
dc.date.accessioned 2022-12-20T13:25:06Z
dc.date.available 2022-12-20T13:25:06Z
dc.date.issued 2022
dc.identifier.issn 0269-7491
dc.identifier.uri http://dx.doi.org/10.1016/j.envpol.2022.119973
dc.identifier.uri https://hdl.handle.net/20.500.12619/99193
dc.description The terms of this publication's license agreement do not permit open access to the full text.
dc.description.abstract Studies have confirmed that PM10, defined as respirable particles with diameters of 10 µm and smaller, has adverse effects on human health and the environment. Various estimation methods use historical data to predict PM10 concentration in support of air pollution control, early warning, and the protection of public health and the environment. The present study analyses different Long Short-Term Memory (LSTM) models that can predict hourly PM10 concentration. In parallel, the study investigates the effect of the data preprocessing and feature selection (DPFS) process on the prediction accuracy of the LSTM models. For this purpose, three different LSTM models, namely Vanilla, Bi-Directional, and Stacked, were developed. A comprehensive data preprocessing stage was then used to eliminate missing data, erroneous data, and outliers from real-world raw data, and a feature selection process was applied to remove unnecessary features. The LSTM models consider three air quality parameters (SO2, O3, and CO) and three meteorological factors (relative humidity, wind direction, and wind speed). The prediction performance of the LSTM models is compared using the RMSE, MAE, and R² performance indices, with and without DPFS. When the DPFS process was applied, the proposed LSTM models achieved high prediction performance and can be used to predict hourly PM10 concentrations. Overall, the DPFS process significantly enhanced the prediction performance of the developed LSTM models. Furthermore, the proposed model might be a useful tool for city administrators to make decisions and improve air quality management efforts.
dc.language English
dc.language.iso eng
dc.relation.isversionof 10.1016/j.envpol.2022.119973
dc.subject Environmental Sciences & Ecology
dc.subject Air quality
dc.subject Data preprocessing
dc.subject Feature selection
dc.subject Particulate matter (PM10)
dc.subject Long short-term memory (LSTM)
dc.title Evaluation of data preprocessing and feature selection process for prediction of hourly PM10 concentration using long short-term memory models
dc.contributor.authorID Erden, Caner/0000-0002-7311-862X
dc.identifier.volume 311
dc.relation.journal ENVIRONMENTAL POLLUTION
dc.identifier.doi 10.1016/j.envpol.2022.119973
dc.identifier.eissn 1873-6424
dc.contributor.author Aksangur, Ipek
dc.contributor.author Eren, Beytullah
dc.contributor.author Erden, Caner
dc.relation.publicationcategory Article - International Peer-Reviewed Journal - Institutional Faculty Member
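
The abstract compares models by RMSE, MAE, and R². As a minimal sketch of how these three indices are conventionally computed (standard textbook definitions, not code from the study; the function names and the sample values are hypothetical and are not data from the paper):

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM10 values for illustration only.
observed = [40.0, 55.0, 62.0, 48.0]
predicted = [42.0, 53.0, 60.0, 50.0]

print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

Lower RMSE and MAE and an R² closer to 1 indicate better agreement between predicted and observed concentrations, which is the sense in which the abstract reports improved performance with DPFS.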

