Effect of traffic dataset on various machine-learning algorithms when forecasting air quality

Sulaimon, Ismail, Alaka, Hafiz, Olu-Ajayi, Razak, Ahmad, Mubashir, Ajayi, Saheed and Hye, Abdul (2022) Effect of traffic dataset on various machine-learning algorithms when forecasting air quality. Journal of Engineering, Design and Technology. ISSN 1726-0531

Purpose: Road traffic emissions are widely believed to contribute substantially to air pollution, but the effect of road traffic data sets on air quality (AQ) predictions has not been fully investigated. This paper investigates the effects that traffic data sets have on the performance of machine learning (ML) predictive models in AQ prediction.

Design/methodology/approach: The authors set up an experiment in which the control data set comprises only the AQ data set and the meteorological (Met) data set, while the experimental data set comprises the AQ data set, the Met data set and the traffic data set. Several ML models (such as extra trees regressor, eXtreme gradient boosting regressor, random forest regressor, K-neighbors regressor and two others) were trained, tested and compared on each of these combinations of data sets to predict the volume of PM2.5, PM10, NO2 and O3 in the atmosphere at various times of the day.

Findings: The results showed that the ML algorithms react differently to the traffic data set, although it generally improved the performance of all the ML algorithms considered in this study by at least 20% and reduced error by at least 18.97%.

Research limitations/implications: This research is limited in terms of the study area, and the results cannot be generalized outside the UK because some of the inherent conditions may differ elsewhere. Additionally, only the ML algorithms commonly used in the literature are considered, which leaves out a few other ML algorithms.

Practical implications: This study reinforces the belief that the traffic data set has a significant effect on improving the performance of air pollution ML prediction models. It also indicates that ML algorithms behave differently when trained with a form of traffic data set in the development of an AQ prediction model. This implies that developers and researchers in AQ prediction need to identify the ML algorithms that best serve their needs before implementation.

Originality/value: The results of this study will enable researchers to focus on the algorithms that are of most benefit when using traffic data sets in AQ prediction.
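The control-versus-experimental design described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' code or data: the rows are synthetic, the feature names (`temperature`, `wind`, `traffic`) and the formula generating `pm25` are invented, the AQ inputs are omitted for brevity, and a small hand-written K-neighbors regressor stands in for the several models the paper compares. The abstract does not define its performance or error metrics, so mean absolute error is used here purely as an example.

```python
import math
import random


def knn_predict(train_X, train_y, query, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_absolute_error(train_X, train_y, test_X, test_y, k=5):
    """Average absolute prediction error over the test rows."""
    errors = [
        abs(knn_predict(train_X, train_y, x, k) - y)
        for x, y in zip(test_X, test_y)
    ]
    return sum(errors) / len(errors)


def make_synthetic_rows(n, seed=0):
    """Generate made-up rows; the pm25 formula is an assumption, not real data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        temperature, wind, traffic = rng.random(), rng.random(), rng.random()
        pm25 = 10 + 5 * temperature - 3 * wind + 20 * traffic + rng.gauss(0, 0.5)
        rows.append({"met": (temperature, wind), "traffic": (traffic,), "pm25": pm25})
    return rows


def compare_feature_sets(rows, n_train=200, k=5):
    """Train and test the same model on control and experimental features."""
    train, test = rows[:n_train], rows[n_train:]

    def features(row, with_traffic):
        # Control: Met only. Experimental: Met plus traffic.
        return row["met"] + row["traffic"] if with_traffic else row["met"]

    results = {}
    for name, with_traffic in (("control", False), ("experimental", True)):
        results[name] = mean_absolute_error(
            [features(r, with_traffic) for r in train],
            [r["pm25"] for r in train],
            [features(r, with_traffic) for r in test],
            [r["pm25"] for r in test],
            k,
        )
    # Relative error reduction of the experimental set over the control set.
    results["error_reduction_pct"] = (
        100 * (results["control"] - results["experimental"]) / results["control"]
    )
    return results


if __name__ == "__main__":
    print(compare_feature_sets(make_synthetic_rows(300)))
```

Because the synthetic target is built to depend strongly on the traffic feature, the experimental feature set should give the lower error here; the size of that gap says nothing about the figures reported in the paper. Repeating the comparison with other regressors in place of `knn_predict` would mirror the paper's observation that algorithms respond differently to the added traffic data.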

Item Type	Article
Additional information	© Emerald Publishing Limited. This is the accepted manuscript version of an article that has been published in final form at https://doi.org/10.1108/JEDT-10-2021-0554
Keywords	original article, air quality prediction, traffic dataset, big-data, machine learning, general engineering
Date Deposited	15 May 2025 14:55
Last Modified	18 Jun 2025 23:11

File: Paper_1_-_Air_pollution_prediction_revewed_2.edited.pdf
Version: Submitted Version
License: Creative Commons BY-NC 4.0
