Reducing Annotation Efforts in Electricity Theft Detection through Optimal Sample Selection

Wenlong Liao, Birgitte Bak-Jensen, Jayakrishnan Radhakrishna Pillai, Xiaofang Xia, Guangchun Ruan, Zhe Yang

Research output: Contribution to journal › Journal article › Research › peer-review


Supervised machine learning models are receiving increasing attention in electricity theft detection due to their high detection accuracy. However, their performance depends on a massive amount of labeled training data, which comes from time-consuming and resource-intensive annotations. To maximize model performance within a limited annotation budget, this article aims to reduce the annotation effort in electricity theft detection through optimal sample selection. In particular, a general framework and three new strategies are proposed to select the most valuable and representative samples from different perspectives, including uncertainty, class imbalance, and diversity of samples. In-depth simulations and analyses are conducted to evaluate the effectiveness of the proposed strategies on commonly used machine learning models and a real-world dataset. Simulation results show that the proposed strategies significantly outperform baselines on datasets of different sizes and fraudulent ratios. Besides, the proposed strategies are effective in improving detection performance across a range of classifiers.
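As a rough illustration of the uncertainty perspective mentioned in the abstract, the sketch below ranks unlabeled samples by how close a classifier's predicted fraud probability is to 0.5 and picks as many as the annotation budget allows. This is generic binary uncertainty sampling, not the three strategies proposed in the article; the function name `select_most_uncertain`, the `budget` parameter, and the example probabilities are assumptions made for illustration only.

```python
import numpy as np


def select_most_uncertain(fraud_prob, budget):
    """Return indices of the `budget` unlabeled samples whose predicted
    fraud probability is closest to 0.5 (hypothetical helper)."""
    fraud_prob = np.asarray(fraud_prob, dtype=float)
    # For a binary classifier, uncertainty peaks at p = 0.5, so the
    # distance from 0.5 ranks samples from most to least uncertain.
    distance = np.abs(fraud_prob - 0.5)
    # Never request more samples than are available.
    budget = min(budget, fraud_prob.size)
    # Stable sort: ties keep their original order.
    order = np.argsort(distance, kind="stable")
    return order[:budget]


# Made-up classifier outputs for six unlabeled customers.
probs = [0.02, 0.48, 0.91, 0.55, 0.30, 0.99]
picked = select_most_uncertain(probs, budget=2)
print(picked)  # indices of the customers to send for annotation
```

A selection rule of this kind considers uncertainty alone; the abstract indicates that class imbalance and sample diversity are also taken into account, which this sketch does not attempt to model.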

Original language: English
Article number: 3508911
Journal: IEEE Transactions on Instrumentation and Measurement
Pages (from-to): 1-11
Number of pages: 11
Publication status: Published - 2024


Keywords

  • Annotations
  • Costs
  • Data annotation
  • Data models
  • Electricity theft
  • Games
  • Machine Learning
  • Power distribution
  • Sample selection
  • Smart grid
  • Training
  • Training data

