ADP-based intelligent frequency control via adaptive virtual inertia emulation



Introduction
Modern electricity production is evolving toward a distributed generation (DG) structure as the balance between energy supply and demand deteriorates (Delille et al., 2012). Most DG units, such as solar photovoltaic (PV), wind turbine, storage, and diesel generator units, are connected to the power grid through grid-connected inverters to form an autonomous microgrid (MG) (Zhou et al., 2017). However, a DG unit cannot provide inertia as a traditional synchronous generator (SG) does, and as the share of renewable energy in the power system capacity keeps increasing, its influence on power system stability cannot be ignored. Therefore, renewable energy sources need a certain voltage and frequency support capability to maintain the stable operation of power systems (Huang et al., 2020).
To adjust the frequency response, droop-based control (Tayab et al., 2017), which provides primary frequency regulation (Wang et al., 2021), has been widely applied to DGs within MGs. Specifically, the frequency and voltage references for an inverter-based DG are generated from the active power-frequency droop and the reactive power-voltage droop, so that the DG contributes voltage and frequency regulation capability to the MG. However, unlike an SG, this method has no rotating kinetic energy and therefore provides very little inertia. A sudden change in the active load power can lead to a poor voltage and frequency response, or even to collapse.
To address this issue, inertia emulation, e.g. virtual synchronous generator (VSG) technology, offers a promising solution. For example, a synchronverter (Zhong et al., 2011) for inverter-based DGs has been built to "slow down" the transient system dynamics, where the mechanical and electromagnetic equations of the SG are applied to control the inverter. Further, to improve the stability of the synchronverter, five modifications involving virtual inductors, virtual capacitors, and anti-windup are proposed (Natarajan et al., 2017). Besides, some theoretical reviews (Dreidy et al., 2017), (Bevrani et al., 2014) comparatively study these control algorithms. However, these implementations of virtual inertia usually assume that the generator can produce or absorb unlimited power in the short term, whereas the DC-side capacitor is limited in practice (Ashabani et al., 2014). Thus, a distributed virtual inertia scheme (Fang et al., 2018) based on regulating the DC-link voltages of power converters is developed, where relatively large DC-link capacitor units are aggregated to provide frequency support. Although this approach can improve the system frequency response and the overall control energy, the amount of stored energy used for frequency adjustment is still not considered. In addition, a derivative control term (Morren et al., 2006), i.e. the rate of change of frequency (RoCoF), is applied to suppress the frequency drop in a manner similar to traditional primary frequency control. Based on the RoCoF, an improved droop controller is designed to improve the transient frequency response (Soni et al., 2013). However, the inertia and damping parameters in most of the above-mentioned works are constant. To achieve better frequency control performance, a self-tuning algorithm (Torres et al., 2014) is proposed that finds the optimal VSG parameters to improve the frequency nadir and RoCoF. Similar works followed, such as self-adaptive inertia and damping combination control (Li et al., 2017), a VSG-based adaptive inertia control method (Hou et al., 2020) that enhances the system dynamic response, and a scheme that switches the virtual inertia between its maximum and minimum values according to the RoCoF (Alipoor et al., 2015). However, although the adaptively controlled inertia in these works improves the frequency response, it may not be the optimal inertia: the works reviewed above primarily focus on the overall frequency regulation, while the cost and energy resources required for such regulation are not considered.
Obtaining an optimal virtual inertia gain for the converter usually requires solving a nonlinear Hamilton-Jacobi-Bellman (HJB) equation. Adaptive dynamic programming (ADP) is a powerful method for this purpose (Vamvoudakis et al., 2010), (Liu et al., 2020), (Liu et al., 2019) and has been extensively developed and applied in optimal tracking control (Zhang et al., 2018), robust control (Yang et al., 2016) and multi-agent consensus control (Jiang et al., 2016). In particular, ADP has been used for wirelessly connected vehicles (Gao et al., 2017), power systems (Mu et al., 2020) and wastewater treatment (Wang et al., 2020), which shows great prospects for practical application (Liu et al., 2021). To overcome the above disadvantages, a novel VSG controller for converters in power systems is derived in this paper. The proposed controller realizes a trade-off between the critical frequency bounds and the required control energy. The main contributions of this paper are as follows:
1) A uniform nonlinear system dynamics under the VSG control is derived, where the reciprocal of the virtual inertia is modeled as the state feedback control input.
2) An ADP-based optimization technique is proposed to obtain the optimal inertia controller from the derived HJB equation, which incorporates both the critical frequency limits and the required control effort.
3) Actor and critic neural networks (NNs) are constructed to implement the policy iteration algorithm and to approximate the optimal control input and the optimal cost function, respectively.
The remainder of this paper is organized as follows. The system dynamics for the VSG are briefly investigated in Section II. Section III presents the optimal virtual inertia controller design approach based on an online actor-critic algorithm. Section IV investigates the stability of the VSG with the developed optimal controller using small-signal analysis. Simulation results are presented in Section V to demonstrate the effectiveness of the proposed control scheme. The conclusions are drawn in Section VI.

System Model
Combining an emulated swing equation with the turbine governor dynamics (Markovic et al., 2019), the VSG shown in Fig. 1 is described by the second-order system (1), where Δω represents the frequency deviation; ΔP_g represents the variation of the mechanical turbine power; R and T_g are the droop coefficient and the time constant of the turbine governor dynamics, respectively; J and D are the virtual moment of inertia and the active droop damping constant, respectively; and ΔP is a known change in the electric power. From (1), we get (2). Correspondingly, the transfer function G(s) of the investigated second-order system dynamics in the Laplace s-domain is given by (3). Further, ΔP(s) = s^{-1} is considered to obtain the time-domain solution of the frequency deviation, so that (3) can be rewritten as (4). Next, we take the first and second derivatives of (4), denoted (5) and (6), to investigate the frequency nadir and the maximum RoCoF. Setting (5) to zero gives the time t_f at which the frequency nadir Δω_max occurs, and setting (6) to zero gives the time t_r at which the maximum RoCoF Δω̇_max occurs. Whereas Δω_max has a unique solution, Δω̇_max has two solutions. However, combining the expressions for T_g, D, J, ζ and ω_n, it can be obtained that φ ∈ (2 cos^{-1} ζ, π). Thus, the maximum RoCoF occurs at the moment of the disturbance and is directly determined by the power change ΔP and the emulated inertia J, which indicates that the overall system performance can be significantly improved through inertia adjustment.

Figure 2. System frequency dynamics model associated with the proposed adaptive optimal inertia controller. (Block labels: cost function, controller, critic, Hamiltonian error, tuning law, optimal adaptive frequency regulation, frequency dynamic model.)
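The dependence of the maximum RoCoF on the emulated inertia can be illustrated numerically. The following is a minimal sketch, not the paper's model: it assumes a swing equation J·Δω̇ = ΔP_g − D·Δω − ΔP and a first-order governor T_g·ΔṖ_g = −ΔP_g − Δω/R, and the function name `simulate_vsg` and the values of D, R, T_g and ΔP are hypothetical. Only the two inertia values (0.5 and 2.2) are taken from the simulation cases of this paper.

```python
def simulate_vsg(J, D=1.0, R=0.05, Tg=0.5, dP=1.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of an ASSUMED swing + governor model.

    J  * d(dw)/dt  = dPg - D*dw - dP   (emulated swing equation, assumed form)
    Tg * d(dPg)/dt = -dPg - dw / R     (first-order turbine governor, assumed form)

    Returns the largest |frequency deviation| (nadir magnitude) and the
    largest |RoCoF| observed after a load step dP applied at t = 0.
    """
    dw, dPg = 0.0, 0.0            # frequency deviation and governor power
    nadir, max_rocof = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        rocof = (dPg - D * dw - dP) / J       # d(dw)/dt at the current state
        dPg_dot = (-dPg - dw / R) / Tg        # governor response
        max_rocof = max(max_rocof, abs(rocof))
        dw += dt * rocof
        dPg += dt * dPg_dot
        nadir = max(nadir, abs(dw))
    return nadir, max_rocof


if __name__ == "__main__":
    for J in (0.5, 2.2):          # small and large constant inertia
        nadir, max_rocof = simulate_vsg(J)
        print(f"J={J}: |nadir|={nadir:.4f}, max|RoCoF|={max_rocof:.4f}")
```

Under these assumptions the largest RoCoF is reached at the first instant after the step and equals ΔP/J, so a larger J lowers it, in line with the observation above.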

Non-Linear Analysis
Applying a step change ΔP for all t ∈ [0⁺, +∞), (2) can be re-expressed as (10). Selecting the system state vector and the control input as x = [Δω, Δω̇]^T and u = 1/J, respectively, (10) can be written in the compact form (11). Because the system in (11) is nonlinear, the optimal control can hardly be obtained with a linear-quadratic regulator (LQR). In the following, an adaptive optimal inertia controller regulated by the ADP method is developed. The detailed schematic diagram of the benchmark converter with the intelligent frequency control strategy is presented in Fig. 2.
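To make the source of the nonlinearity concrete, the derivation below sketches how a control-affine form arises once u = 1/J is treated as the input. It is an illustration under stated assumptions, not a reproduction of (10) and (11): the swing and governor equations in the first two lines are assumed forms, and J is treated as locally constant when the swing equation is differentiated.

```latex
% ASSUMED model (not taken from the paper's typeset equations):
%   J \Delta\dot\omega = \Delta P_g - D\,\Delta\omega - \Delta P
%   T_g \Delta\dot P_g = -\Delta P_g - \Delta\omega / R
% Differentiate the swing equation for a constant step \Delta P and
% eliminate \Delta P_g using the swing equation itself:
\begin{align}
J\,\Delta\ddot\omega
  &= \Delta\dot P_g - D\,\Delta\dot\omega \\
  &= -\frac{1}{T_g}\bigl(J\,\Delta\dot\omega + D\,\Delta\omega + \Delta P\bigr)
     - \frac{\Delta\omega}{R\,T_g} - D\,\Delta\dot\omega .
\end{align}
% Divide by J and set x = [\Delta\omega,\ \Delta\dot\omega]^T, u = 1/J:
\begin{equation}
\dot x =
\underbrace{\begin{bmatrix} x_2 \\ -x_2 / T_g \end{bmatrix}}_{f(x)}
+
\underbrace{\begin{bmatrix} 0 \\
  -\dfrac{(D + 1/R)\,x_1 + \Delta P}{T_g} - D\,x_2 \end{bmatrix}}_{g(x)}\, u .
\end{equation}
```

In this sketch the input multiplies the state through g(x), so the dynamics are bilinear in x and u rather than linear, which is why a fixed LQR gain is only valid near one operating point.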

Adaptive Controller Design
For the nominal system (11), the control policy u(x) is designed to minimize the cost function (12), where r(x, u(x)) = x^T Q x + u^T R u, and Q and R are positive definite symmetric matrices.
The optimal cost function V*(x) is defined in (13), where Ψ(Ω) is the admissible control set, and the optimal control policy u*(x) satisfies the HJB equation (14), where ∇V*(x) is the partial derivative of V* with respect to x. Suppose (14) holds; then the optimal frequency control policy for system (11) is given by (15). Substituting (15) into (14), the HJB equation can be written in terms of ∇V*(x) as (16), where 0 = H(x(t), u*(x), ∇V*(x)) holds under the optimal control policy u*(x). Owing to the nonlinear nature of (16), the HJB equation is almost impossible to solve directly. Thus, synchronous policy iteration (PI) is adopted to approximate the optimal control u*(x) and the optimal value function V*(x).
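For orientation, relations of the kind referred to as (12) to (16) take the following textbook form in the ADP literature when the dynamics are control-affine, ẋ = f(x) + g(x)u, and the running cost is r(x, u) = x^T Q x + u^T R u. This is a sketch of the standard expressions under that assumption; the equation numbering in the comments is only a suggested correspondence.

```latex
\begin{align}
V(x(t)) &= \int_t^{\infty} r\bigl(x(\tau), u(\tau)\bigr)\, d\tau
  && \text{cost function, cf. (12)} \\
V^*(x) &= \min_{u \in \Psi(\Omega)} \int_t^{\infty} r\bigl(x(\tau), u(\tau)\bigr)\, d\tau
  && \text{optimal cost, cf. (13)} \\
0 &= \min_{u \in \Psi(\Omega)} H\bigl(x, u, \nabla V^*(x)\bigr), \quad
H = r(x, u) + \bigl(\nabla V^*(x)\bigr)^{T}\bigl(f(x) + g(x)u\bigr)
  && \text{HJB, cf. (14)} \\
u^*(x) &= -\tfrac{1}{2} R^{-1} g^{T}(x)\, \nabla V^*(x)
  && \text{optimal policy, cf. (15)} \\
0 &= x^{T} Q x + \bigl(\nabla V^*\bigr)^{T} f(x)
   - \tfrac{1}{4} \bigl(\nabla V^*\bigr)^{T} g(x) R^{-1} g^{T}(x)\, \nabla V^*
  && \text{HJB in } \nabla V^*, \text{ cf. (16)}
\end{align}
```

The optimal policy follows from setting ∂H/∂u = 0, and the last line is obtained by substituting that policy back into the Hamiltonian; the quadratic term in ∇V* is what prevents a closed-form solution in general.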

Critic and Actor NNs Implementation of Adaptive Controller
To realize the PI algorithm, an actor NN and a critic NN that are tuned simultaneously are adopted. It is reasonable to assume that there exist weights W_c such that the value function can be represented as (17), where ψ(x) and δ(x) are the activation function vector and the approximation error, respectively. In fact, the ideal weight W_c of the critic NN that gives the best approximate solution for (17) is unknown, so an estimated weight Ŵ_c is adopted, and the output of the critic NN is expressed as V̂(x) = Ŵ_c^T ψ(x). Thus, the estimated Hamiltonian function is obtained as (18), where ê = −W̃_c^T ∇ψ − (∇δ(x))^T ẋ is the critic NN error and W̃_c = W_c − Ŵ_c is the critic weight estimation error. It is then reasonable to define E = (1/2) ê^T ê to regulate the critic network weight, and the tuning law for the critic NN weight is computed as (19), where σ_1 = ∇ψ ẋ and α_1 is the primary learning rate of the critic NN. According to the above analysis, the ideal weight W_c is unknown. Therefore, using the estimated weights, i.e. an actor NN, the control policy is calculated by (20), where Ŵ_a represents the current estimate of the ideal critic NN weights W_c. Then, substituting the control law (20) into (19), the tuning law for the critic NN is recalculated as (21), where σ_2 = ∇ψ_1 (f(x) + g u_2). Following (Vamvoudakis et al., 2010), the tuning law of the actor NN, which is derived from the stability proof, is given by (22), where D_1(x) = ∇ψ(x) g(x) R^{-1} g^T(x) ∇ψ^T(x) and α_2 is the primary learning rate of the actor NN. Therefore, the weights of the critic and actor NNs are tuned at the same time by the update laws (21) and (22), so that (20) provides the optimal frequency control law for the benchmark system (11).
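The critic update can be illustrated on a problem whose exact value function is known. The sketch below is not the controller of this paper: it assumes a hypothetical scalar linear plant ẋ = a·x + b·u under a fixed policy u = −k·x, a single activation ψ(x) = x², and a discrete normalized gradient step on E = ½ê² with ê = Ŵ·σ + r and σ = ∇ψ·ẋ. All names and numerical values are illustrative, and the states are resampled from a fixed set to mimic persistent excitation.

```python
def exact_weight(a=-1.0, b=1.0, k=0.5, q=1.0, R=1.0):
    """Exact value-function weight p with V(x) = p*x^2 for the fixed policy.

    Follows from dV/dt = -r along trajectories: 2*p*(a - b*k) = -(q + R*k^2).
    """
    return (q + R * k * k) / (2.0 * (b * k - a))


def train_critic(a=-1.0, b=1.0, k=0.5, q=1.0, R=1.0, lr=1.0, sweeps=200):
    """Normalized gradient descent on the squared Hamiltonian error.

    Hypothetical toy setup: plant x_dot = a*x + b*u, policy u = -k*x,
    critic V_hat(x) = w_hat * x^2.
    """
    samples = (-1.0, -0.5, 0.5, 1.0)   # resampled states (excitation)
    w_hat = 0.0                        # initial critic weight
    for _ in range(sweeps):
        for x in samples:
            u = -k * x
            x_dot = a * x + b * u
            sigma = 2.0 * x * x_dot            # grad(psi) * x_dot
            r = q * x * x + R * u * u          # running cost
            e_hat = w_hat * sigma + r          # estimated Hamiltonian error
            # Gradient of 0.5*e_hat^2 w.r.t. w_hat, with normalization
            w_hat -= lr * sigma / (sigma * sigma + 1.0) ** 2 * e_hat
    return w_hat


if __name__ == "__main__":
    print("learned weight:", train_critic())
    print("exact weight:  ", exact_weight())
```

Because ê is linear in Ŵ, each step contracts the weight error, and the learned weight approaches the exact one. In the synchronous scheme described above, the actor weights would be updated in the same loop instead of keeping the policy fixed.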
Remark 1. The main contribution of this paper is a novel distributed optimal virtual inertia concept for converters in power systems with a wide application of distributed energy resources (DERs). The ADP method is used to solve the optimization problem, and it has been proved to be iteratively convergent (Abu-Khalaf et al., 2005) and closed-loop stable (Vamvoudakis et al., 2010). This paper adopts a general ADP method, which guarantees that the NN weights converge under the designed tuning law. In view of the page limit, the detailed stability proof is not provided in this paper. Interested readers can refer to the relevant literature.

Small-Signal Model
Using the small-signal approximation (Sun et al., 2020), (11) is linearized as (23). For conciseness, the resulting closed-loop system is written in the form ẋ = Ax, with A defined as f + M.

Small-Signal Stability Analysis
The controller parameters are given in Table 1. Based on the small-signal model expressed by (23), the trajectories of the eigenvalues with the actor NN weight Ŵ varying from −2 to 4 in steps of 0.02 are shown in Fig. 3. In this case, the large load disturbance condition is kept the same as in the simulation results section. Note that the eigenvalues in Fig. 3 are located in the right half-plane when the parameter Ŵ_2 of the weights Ŵ is less than −1.48, and the system is then unstable. As Ŵ_2 gradually increases, the eigenvalues enter the left half-plane. Therefore, Ŵ_2,min = −1.48 is considered to be the lower limit of the variable weight coefficient for the stable region of the system under the above condition. The remaining parameters of Ŵ do not affect the stability of the system and are not discussed here.
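The sweep procedure itself can be sketched as follows. Since the entries of A are not reproduced here, the example uses a hypothetical 2×2 matrix `toy_closed_loop(w)` that depends on a single weight w; its stability boundary (w > −0.75) is a property of this toy matrix only and is unrelated to the value −1.48 reported above. Only the sweep range (−2 to 4) and the step (0.02) are taken from the text.

```python
import cmath


def eigenvalues_2x2(a11, a12, a21, a22):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0


def toy_closed_loop(w):
    """HYPOTHETICAL closed-loop matrix A(w); not the paper's small-signal model."""
    return (0.0, 1.0, -(w + 0.75), -0.5)


def first_stable_weight(start=-2.0, stop=4.0, step=0.02):
    """Sweep w upward and return the first value with all eigenvalues in the left half-plane."""
    n = int(round((stop - start) / step))
    for i in range(n + 1):
        w = round(start + i * step, 2)     # avoid accumulated rounding drift
        lam1, lam2 = eigenvalues_2x2(*toy_closed_loop(w))
        if max(lam1.real, lam2.real) < 0.0:
            return w
    return None


if __name__ == "__main__":
    print("first stable weight on the grid:", first_stable_weight())
```

Plotting the real and imaginary parts of the eigenvalues collected during such a sweep gives a root-locus picture of the kind shown in Fig. 3.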

Simulation Results
To qualitatively verify the proposed adaptive method, the optimal inertia controller has been incorporated into a state-of-the-art converter control scheme. The simulation has been conducted in MATLAB/Simulink. A small load power change ΔP = 3 p.u. and a large load power change ΔP = −10 p.u. are investigated. To better demonstrate the advantages, four cases are compared: (a) small constant inertia (J_0 = 0.5); (b) large constant inertia (J_0 = 2.2); (c) LQR-based control; (d) the ADP-based adaptive inertia control proposed in this paper.

Under Small Load Changes
The LQR controller is designed as Δu = −kx with a quadratic objective consistent with (12), which gives the control feedback gain k = [0.0250 0.0507]. For the ADP-based controller, the parameters are Q = I_{2×2}, R = 1, α_1 = 30, α_2 = 1, Γ_1 = 5 and Γ_2 = 10, where I is the identity matrix. Based on system data, the weights of the critic NN are obtained after training, as presented in Fig. 4(a). From the frequency response in Fig. 5, the inertia, frequency nadir, maximum RoCoF, and energy utilization of the four cases are summarized in Table 2. A large RoCoF occurs with the small inertia control, and a very slow response with frequency oscillation arises with the large inertia control. Because LQR is a linear optimal control method, its performance is similar to that of the proposed ADP-based control when the frequency nadirs are small. However, its performance deteriorates in the case of large load changes, as shown in Fig. 7. As observed, with the ADP-based control, the arrested frequency nadir and the RoCoF are quite satisfactory, which benefits from the emulated inertia shown in Fig. 5(c). A relatively large inertia is provided so that the frequency deviates slowly, and a relatively small inertia is provided so that it returns quickly.
Meanwhile, the energy utilization response (computed as E_J = ∫ J Δω̇ dt (Markovic et al., 2019)) is compared in Fig. 5(d). As shown, the energy consumption of the large inertia control method is the largest, followed by the LQR-based and ADP-based controllers, which further indicates that the proposed control method saves more DC-side energy.

Under Large Load Changes
The case of the converter encountering small load changes has been investigated in Section 5.1. However, large loads may also be connected suddenly. Therefore, we present the effects of the proposed optimal frequency control method under large load changes. Similarly, the gain for the LQR-based controller is solved as k = [−0.0909 −0.1804]. For the ADP-based control, the weights are shown in Fig. 6(a). From Fig. 7, it is clear that the ADP-based control performs better than the small inertia control, the large inertia control, and the LQR-based control in dealing with large load changes. In this sudden scenario, both the small inertia control and the large inertia control perform poorly. Since the LQR control method is designed for linear systems, large load disturbances may lead to a poorer frequency regulation response than with the ADP-based control. At the same time, the proposed ADP-based control saves more DC-side energy, as shown in Fig. 7(d). Therefore, the ADP-based optimal inertia controller adapts well to various scenarios.

Conclusion
This paper proposes a novel distributed optimal virtual inertia concept for converters in power systems with a wide application of DERs. An ADP-based intelligent optimal frequency controller is designed to adjust the virtual inertia based on the predefined cost function, while preserving a trade-off between the critical frequency limits and the required control energy. The proposed approach is incorporated into a detailed state-of-the-art control scheme and verified under two different load changes, namely small load changes and large load changes. The simulation results show that the frequency response is greatly improved while more DC-side energy is preserved. Future work will focus on the impact of self-adaptive inertia and damping control, as well as the extension to an MG case.

Figure 3. The root locus with the control parameter Ŵ_2 varying.
The weight vector converges to Ŵ_a = Ŵ_c = [0.2544 0.0492 −0.5246]^T. Then, using the converged actor NN weights and (20), the optimal frequency controller is obtained. During the training, under an effective persistence of excitation (PE) condition, the states converge very close to zero as required, which can be observed in Fig. 4(b). Fig. 5 compares the four simulation results.

Figure 4. Simulation results under small load changes. (a) Convergence of the critic parameters. (b) Evolution of the system states.

Figure 6. Simulation results under large load changes. (a) Convergence of the critic parameters. (b) Evolution of the system states.

Figure 7. Effects of inertia response of the proposed control scheme under large load changes. (a) System frequency. (b) RoCoF. (c) Inertia coefficient. (d) Effects on energy utilization.
Under large load changes, the weights in Fig. 6(a) converge to the optimal values Ŵ_a = Ŵ_c = [0.2192 0.0425 −0.5212]^T, and the system states converge very close to zero as required, which is shown in Fig. 6(b).

Table 1. SIMULATION TEST PARAMETERS

Table 2. COMPARISON OF FOUR SIMULATION RESULTS