A Big Data Analytics Approach for Construction Firms Failure Prediction Models
Oyedele, Lukumon O.
Owolabi, Hakeem O.
Akinade, Olugbenga O.
Ajayi, Saheed O.
Using 693 000 data cells from a sample of 33 000 construction firms that operated or failed between 2008 and 2017, failure prediction models were developed using artificial neural network (ANN), support vector machine, multiple discriminant analysis (MDA), and logistic regression (LR). Surprisingly, the accuracy of the models on test data showed ANN to be only slightly more accurate than LR and MDA. The ANN's hyperparameters, namely the number of units in the hidden layer and the weight decay, were consequently tuned using grid search. The tuning process led to tedious machine computation that was aborted after many hours without completion. State-of-the-art big data analytics (BDA) technology was consequently employed, for the first time in failure prediction, and the tuning was completed within seconds. Mean accuracy from cross-validation was used to select the best parameter values, which were then used to develop a new ANN model that outperformed all previously developed models on test data. Subsequent use of selected variables to develop new models reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.
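
The tuning step described above (a grid search over the number of hidden units and the weight decay, with the best pair chosen by mean cross-validation accuracy) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the network is a toy single-hidden-layer model written in plain Python, and every name in it (make_toy_data, train_ann, cv_mean_accuracy, grid_search, map_fn) as well as the grid values are hypothetical.

```python
import itertools
import math
import random
import statistics


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_toy_data(n, seed=0):
    # ASSUMPTION: synthetic stand-in for firm records; label 1 means "failed".
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1 if x[0] + x[1] > 0 else 0))
    return data


def forward(w1, w2, x):
    # One tanh hidden layer and a sigmoid output; a trailing 1.0 acts as the bias input.
    xb = x + [1.0]
    h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w1]
    hb = h + [1.0]
    return xb, h, hb, sigmoid(sum(w * v for w, v in zip(w2, hb)))


def train_ann(train, hidden_units, weight_decay, epochs=30, lr=0.1, seed=0):
    # Plain stochastic gradient descent on cross-entropy loss, with an L2
    # penalty (weight decay) applied to every weight, biases included.
    rng = random.Random(seed)
    n_in = len(train[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden_units)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_units + 1)]
    for _ in range(epochs):
        for x, y in train:
            xb, h, hb, out = forward(w1, w2, x)
            delta = out - y  # loss gradient at the output pre-activation
            # Update the hidden layer first so it uses the pre-update output weights.
            for j, row in enumerate(w1):
                dh = delta * w2[j] * (1.0 - h[j] ** 2)
                for i in range(len(row)):
                    row[i] -= lr * (dh * xb[i] + weight_decay * row[i])
            for j in range(len(w2)):
                w2[j] -= lr * (delta * hb[j] + weight_decay * w2[j])
    return w1, w2


def accuracy(model, rows):
    w1, w2 = model
    hits = sum((forward(w1, w2, x)[3] >= 0.5) == bool(y) for x, y in rows)
    return hits / len(rows)


def cv_mean_accuracy(data, k, hidden_units, weight_decay):
    # k-fold cross-validation: each fold is held out once for scoring.
    scores = []
    for fold in range(k):
        test = data[fold::k]
        train = [row for i, row in enumerate(data) if i % k != fold]
        scores.append(accuracy(train_ann(train, hidden_units, weight_decay), test))
    return statistics.mean(scores)


def grid_search(data, grid, k=3, map_fn=map):
    # Score every (hidden_units, weight_decay) pair; keep the best mean accuracy.
    candidates = list(itertools.product(grid["hidden_units"], grid["weight_decay"]))
    scores = list(map_fn(lambda c: cv_mean_accuracy(data, k, *c), candidates))
    results = dict(zip(candidates, scores))
    best = max(results, key=results.get)
    return best, results[best], results


# ASSUMPTION: illustrative grid values, not those used in the study.
grid = {"hidden_units": [2, 4, 8], "weight_decay": [0.0, 0.001, 0.01]}
best_params, best_score, results = grid_search(make_toy_data(90), grid)
print("best (hidden_units, weight_decay):", best_params,
      "mean CV accuracy:", round(best_score, 3))
```

Each candidate pair is scored independently of the others, so the hypothetical map_fn argument could in principle be swapped for a parallel or cluster-backed map. That independence is what makes a grid search a natural fit for distributed computation; the sketch itself runs sequentially.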