Electricity Theft Detection from Electricity, Gas and Water Measurements using Machine Learning
Electricity theft is a major source of non-technical losses in modern power systems, causing substantial financial and operational challenges for utilities. Traditional detection methods, such as manual inspections, are inadequate against advanced theft techniques, including meter tampering and cyberattacks on smart grids. This study introduces a machine learning-based framework for electricity theft detection using the TDD2022 dataset (derived from OEDI) and evaluates multiple algorithms: Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost, Extra Trees, and Logistic Regression. Class imbalance is addressed with SMOTE (Synthetic Minority Over-sampling Technique), and feature selection uses LASSO and ReliefF. Experiments compare electricity-only data with multi-utility inputs (electricity and gas) under balanced and imbalanced conditions. Results show that tree-based ensembles, particularly Extra Trees combined with SMOTE and ReliefF, perform best (accuracy >95%, AUC ≈0.99). Consumer-specific models outperform global models: commercial classes yield near-perfect detection, while residential profiles remain challenging. The findings highlight the importance of tailored modeling and feature selection for scalable, accurate theft detection in smart grid environments.
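
The SMOTE step named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the toy feature rows are hypothetical, and the base sample is drawn at random rather than looping over every minority sample. In practice a library implementation such as `imblearn.over_sampling.SMOTE` would likely be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: rows are [mean daily kWh, peak-to-mean ratio].
normal = [[10.0, 1.2], [11.5, 1.3], [9.8, 1.1],
          [12.1, 1.4], [10.7, 1.2], [11.0, 1.3]]
theft = [[4.0, 2.5], [3.5, 2.8], [4.4, 2.6]]

# Oversample the theft class until both classes have the same size.
new_theft = smote(theft, n_new=len(normal) - len(theft), k=2)
balanced_theft = theft + new_theft
```

Every synthetic row lies on a line segment between two real theft rows, so the oversampled class stays inside the region the minority data already occupies. Oversampling of this kind is usually applied to the training split only, so that evaluation data contains no synthetic rows.
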
| Item Type | Article |
|---|---|
| Identification Number | 10.3390/en19092045 |
| Additional information | © 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. https://creativecommons.org/licenses/by/4.0/ |
| Date Deposited | 08 May 2026 07:39 |
| Last Modified | 08 May 2026 07:39 |
