TY - JOUR
T1 - Bidding strategy for trading wind energy and purchasing reserve of wind power producer–A DRL based approach
AU - Cao, Di
AU - Hu, Weihao
AU - Xu, Xiao
AU - Dragicevic, Tomislav
AU - Huang, Qi
AU - Liu, Zhou
AU - Chen, Zhe
AU - Blaabjerg, Frede
PY - 2020/5
Y1 - 2020/5
N2 - Wind power producers (WPPs) are penalized when they take part in the short-term electricity market because of the inaccuracy of wind power prediction. This profit loss can be partially offset by strategic reserve purchasing. Owing to the uncertainty of real-time wind power production, the reserve capacity price, and the highly dynamic regulation price, it is difficult to determine the quantity of reserve to purchase, which can greatly affect the WPP's profit. This paper investigates the possible influence on the revenue of WPPs when they take part in both the energy and reserve markets. The problem is first formulated as a Markov decision process (MDP) and then solved with the asynchronous advantage actor-critic (A3C) method. Several agents explore the action space simultaneously. Neural networks are used to approximate the policy function and the value function, which are trained jointly to learn the optimal dynamic bidding strategy from historical data through continual trial and error. Simulation results for a wind farm in Denmark validate that the profit loss of the WPP can be significantly reduced when the A3C algorithm is used to formulate the bidding strategy for participation in both the energy and reserve markets.
AB - Wind power producers (WPPs) are penalized when they take part in the short-term electricity market because of the inaccuracy of wind power prediction. This profit loss can be partially offset by strategic reserve purchasing. Owing to the uncertainty of real-time wind power production, the reserve capacity price, and the highly dynamic regulation price, it is difficult to determine the quantity of reserve to purchase, which can greatly affect the WPP's profit. This paper investigates the possible influence on the revenue of WPPs when they take part in both the energy and reserve markets. The problem is first formulated as a Markov decision process (MDP) and then solved with the asynchronous advantage actor-critic (A3C) method. Several agents explore the action space simultaneously. Neural networks are used to approximate the policy function and the value function, which are trained jointly to learn the optimal dynamic bidding strategy from historical data through continual trial and error. Simulation results for a wind farm in Denmark validate that the profit loss of the WPP can be significantly reduced when the A3C algorithm is used to formulate the bidding strategy for participation in both the energy and reserve markets.
KW - Wind power bidding
KW - Deep reinforcement learning
KW - Data-driven
KW - Agent-based
KW - Uncertainty
UR - http://www.scopus.com/inward/record.url?scp=85074211025&partnerID=8YFLogxK
U2 - 10.1016/j.ijepes.2019.105648
DO - 10.1016/j.ijepes.2019.105648
M3 - Journal article
SN - 0142-0615
VL - 117
SP - 1
EP - 10
JO - International Journal of Electrical Power & Energy Systems
JF - International Journal of Electrical Power & Energy Systems
M1 - 105648
ER -