TY - JOUR
T1 - Simple Data Augmentation Tricks for Boosting Performance on Electricity Theft Detection Tasks
AU - Liao, Wenlong
AU - Yang, Zhe
AU - Bak-Jensen, Birgitte
AU - Pillai, Jayakrishnan Radhakrishna
AU - Von Krannichfeldt, Leandro
AU - Wang, Yusen
AU - Yang, Dechang
PY - 2023/7/1
Y1 - 2023/7/1
AB - In practical engineering, electricity theft detection is usually performed on highly imbalanced datasets (i.e., the number of fraudulent samples is much smaller than the number of benign ones), which limits the accuracy of the classifier. To alleviate the data imbalance problem, this article proposes simple data augmentation tricks (SDAT) to boost performance on electricity theft detection tasks. SDAT includes five simple but powerful operations: adding noise to electricity consumption readings, drifting values of electricity consumption readings, quantizing electricity consumption readings to a level set, adding a fixed value to electricity consumption readings, and adding changeable values to electricity consumption readings. In addition, eight potential tricks are mentioned. Numerical simulations are conducted on a real-world dataset. The simulation results show that SDAT can significantly boost the performance of different classifiers, especially for small datasets. Specific suggestions on how to select the parameters of SDAT are also provided to support its application to other datasets.
KW - Electric potential
KW - Electricity theft detection
KW - Games
KW - Level set
KW - Smart grids
KW - Task analysis
KW - Training
KW - Voltage measurement
KW - data augmentation
KW - electricity consumption reading
KW - smart grid
KW - smart meter
UR - http://www.scopus.com/inward/record.url?scp=85151542345&partnerID=8YFLogxK
U2 - 10.1109/TIA.2023.3262232
DO - 10.1109/TIA.2023.3262232
M3 - Journal article
SN - 0093-9994
VL - 59
SP - 4846
EP - 4858
JO - IEEE Transactions on Industry Applications
JF - IEEE Transactions on Industry Applications
IS - 4
ER -