An Application Of Machine Learning With Boruta Feature Selection To Improve NO2 Pollution Prediction

Balogun, Habeeb; Alaka, Hafiz; Egwim, Christian; Ajayi, Saheed

Abstract

Projecting and monitoring the concentration of NO2 pollutants is arguably an efficient and effective way to lower people's exposure and reduce the negative impact of this harmful atmospheric substance. Many studies have proposed machine learning (ML) algorithms to predict NO2 using diverse sets of data, which makes the efficiency of such models dependent on the data/features used. This research installed 14 Internet of Things (IoT) emission sensors and combined their data with weather data from the UK meteorology department and traffic data from the Department for Transport for the corresponding times and locations of the pollution sensors. This paper selects relevant features from the combined data/feature set using the Boruta algorithm. Six of the features were identified as valuable for developing the NO2 ML model: ambient humidity, ambient pressure, ambient temperature, day of the week, two-wheeled vehicle counts, and car/taxi counts. These six features were used to develop different ML models, which were then compared with the same ML models developed using all features in the combined set. Most of the ML models implemented performed better when developed using the features selected by the Boruta algorithm.
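The core idea behind Boruta can be illustrated with a minimal sketch: each real feature competes against shuffled "shadow" copies of the features, and a feature is kept only if it beats the best shadow significantly more often than chance. The sketch below is not the authors' implementation. It assumes absolute Pearson correlation as a stand-in importance score (Boruta normally uses random forest importances), uses a simple one-sided binomial test without the tentative/rejected bookkeeping of the full algorithm, and runs on synthetic data. The function names, the `irrelevant_feature` column, and all parameter values are hypothetical.

```python
import random
from math import comb, sqrt


def abs_correlation(xs, ys):
    """Absolute Pearson correlation, a stand-in for random forest importance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / sqrt(var_x * var_y)


def binomial_tail(hits, trials, p=0.5):
    """P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def boruta_like_selection(features, target, n_rounds=50, alpha=0.01, seed=0):
    """Keep features that beat the best shadow feature more often than chance."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_rounds):
        # Shadow features: shuffled copies that keep each feature's
        # distribution but destroy any relationship with the target.
        shadow_scores = []
        for values in features.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(abs_correlation(shadow, target))
        threshold = max(shadow_scores)
        # A real feature scores a "hit" when it beats the best shadow.
        for name, values in features.items():
            if abs_correlation(values, target) > threshold:
                hits[name] += 1
    # Confirm a feature only if its hit count is unlikely under pure chance.
    return [
        name for name, h in hits.items()
        if binomial_tail(h, n_rounds) < alpha
    ]


# Synthetic data (not the study's data): two informative columns, one noise.
rng = random.Random(42)
n = 200
features = {
    "ambient_temperature": [rng.gauss(0, 1) for _ in range(n)],
    "ambient_humidity": [rng.gauss(0, 1) for _ in range(n)],
    "irrelevant_feature": [rng.gauss(0, 1) for _ in range(n)],
}
target = [
    2.0 * t - 1.5 * h + rng.gauss(0, 0.5)
    for t, h in zip(features["ambient_temperature"], features["ambient_humidity"])
]

selected = boruta_like_selection(features, target)
print(selected)
```

In practice, a maintained implementation with random forest importances would be a more faithful choice than this correlation-based sketch, which cannot detect non-linear or interaction effects.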