Data-Driven Detection of Stealth Cyber-Attacks in DC Microgrids

Cyber-physical systems such as microgrids contain numerous attack surfaces in the form of communication links, sensors, and actuators. Communication links and sensors can be manipulated to inject anomalous data that is transmitted through the cyber layer along with the original data stream. The presence of malicious, anomalous data packets in the cyber layer of a dc microgrid can hinder the fulfillment of the control objectives, leading to voltage instability and affecting load dispatch patterns. Hence, detecting anomalous data is essential for restoring system stability. This article answers two important research questions: 1) Which data-driven detection scheme offers the best detection performance against stealth cyber-attacks in dc microgrids? 2) What is the improvement in detection performance when fusing two features (i.e., current and voltage data) for training compared with using a single feature (i.e., current)? Our investigations reveal that 1) adopting an unsupervised deep recurrent autoencoder anomaly detection scheme in dc microgrids offers superior detection performance compared with other benchmarks, where the autoencoder is trained on benign data generated from a multisource dc microgrid model, and 2) fusing current and voltage data for training offers a 14.7% improvement. The efficacy of the results is verified using experimental data collected from a dc microgrid testbed subjected to stealth cyber-attacks.


NOMENCLATURE
I_pu  Vector notation of the per-unit output currents of all agents.
L     Laplacian matrix.
W     Row-stochastic matrix representing the distribution of attack elements in the microgrid.

I. INTRODUCTION
DC MICROGRIDS facilitate hassle-free integration of renewable energy sources [1], helping to achieve lower carbon emissions through decreased dependence on fossil fuels (e.g., coal) for power generation [2], [3]. The ability to function autonomously provides such systems with immunity against the potential impacts of external faults [4]. The main control challenges faced by dc microgrids during autonomous operation are voltage regulation and load current sharing among the distributed generators (DGs). These objectives are achieved through secondary controllers coupled with communication networks that aid real-time data exchange. Such networks may have a centralized or distributed topology; however, distributed secondary control is more reliable as it is not affected by single-point failures [5].
The use of information and communication technology to achieve control objectives exposes the microgrid to manipulative cyber-attacks [6]. These attacks can target the communication infrastructure [7], sensor measurements [8], and/or controllers [9]. Malicious manipulation of any of these attack surfaces may generate anomalous data. In this context, the term anomalous data refers to abnormal elements present in a stream of data that do not exhibit the expected behavioral patterns. Although faults can also be the source of such anomalies [10], [11], fault-based anomalies are less sophisticated than attack-based anomalies, which can be specially modeled and injected through stealth attacks to inflict a desired level of damage. Such abnormal elements may propagate through the network to achieve specific objectives such as voltage instability or disruptions in the optimal load-sharing arrangements among DGs. The following subsection outlines some recently proposed detection techniques.

A. Related Works
Beg et al. [10] used parametric time-frequency logic to detect cyber-attack- and fault-based anomalies in dc microgrids. The proposed detector extracts time-frequency information from training datasets (consisting of anomalous data) and uses it to identify abnormal elements (present along with the normal inputs) during the testing phase. In [12], an attack detector was presented that compares groups of elements on the basis of whether they satisfy certain invariants; the detection of discrepancies implies the presence of false data. A signal-temporal-logic-based anomaly detection strategy was presented in [13]. State-estimation-based anomaly detection techniques have been proposed in [14]-[16]. However, well-crafted stealthy cyber-attacks can easily fool state observers [17]-[19], and state-estimation methods also require prior knowledge of the physical structure of the system. Physics-informed anomaly detection techniques have been proposed in [20] and [21], which are particularly focused on distinguishing between large-signal disturbances, such as grid/sensor faults, and cyber-attacks.
Detection strategies that employ data-driven machine-learning-based tools generally do not require information about the physical architecture of the system. Machine-learning-based techniques perform anomaly detection by comparing live/captured data from the cyber-physical system with predicted values generated on the basis of the reference datasets available for training. Such techniques can be broadly categorized into four types: 1) supervised learning, 2) unsupervised learning, 3) reinforcement learning [22], and 4) semi-supervised-learning-based approaches [23]. The main difference between the four categories lies in the type of reference datasets used during the training phase. Unlike the other three, supervised learning models can only be trained using labeled datasets, which may or may not be accessible to researchers. Khan et al. [24] suggested the use of multiclass support vector machines (SVMs), examples of supervised learning models, for anomaly detection in microgrids. In [25], a deep-learning-based anomaly detection technique was proposed to identify sensor-level cyber-attacks in dc microgrids. Kavousi et al. [26] used an improved feedforward-neural-network-based approach to detect anomalies (generated as a consequence of sensor-level data integrity attacks) in microgrids. However, the authors only considered anomaly detection in the advanced metering infrastructure and ignored other potential vulnerabilities (e.g., DG-level sensors).
Unfortunately, the aforementioned works require the availability of labeled data to train the detector. Such data are not always available, especially for zero-day cyber-attacks (attacks that have not been observed before). Moreover, capturing important features from the data is necessary to achieve high detection performance.

B. Contributions
To fill this gap in the literature, this article answers the following two important research questions: 1) Which data-driven detection scheme offers the best performance against stealth cyber-attacks in dc microgrids? 2) Is a single feature (i.e., current) sufficient for training the detector, or will fusing two features (i.e., current and voltage data) improve the results, and what would the level of improvement be? It turns out that an ideal detector for this application should offer: 1) unsupervised anomaly detection that is trained using only benign data while being able to detect malicious data during the testing phase, an ability made possible by learning high-quality features from the normal input data during training, which enables the detector to effectively flag malicious data elements that do not exhibit the identified features; 2) a deep structure to perceive the complex patterns within the data; 3) a recurrent mechanism to capture the temporal correlations of the time series; and 4) feature fusion that incorporates current and voltage data to further improve detection, as this enables the detector to capture distinct representations from both features. To achieve this, we make the following contributions.
1) We utilize a long short-term memory stacked autoencoder (LSTM-SAE) as a deep recurrent unsupervised anomaly detector to identify abnormal data elements in autonomous dc microgrids. This detector is trained using datasets obtained during normal operation of a K-DG dc microgrid model with a distributed network topology.
2) We compare the performance of the proposed LSTM-SAE with benchmark detectors, including an unsupervised autoregressive integrated moving average (ARIMA) model, a one-class SVM, and a feedforward stacked autoencoder (F-SAE), all trained on benign behavior. We also examine supervised two-class SVM, feedforward, convolutional neural network (CNN), and LSTM classifiers trained and tested on both classes. Sequential grid-search hyperparameter optimization is carried out to enhance the results.
3) We conduct multiple experiments. In the first, using current datasets, the stacked and recurrent structure of the LSTM-SAE model provides an improvement of up to 18.3% in detection rate (DR), 12.7% in false alarm (FA), and 31% in highest difference (HD) compared to the benchmark detectors. The second experiment fuses current and voltage datasets such that the decision on whether a sample is benign or malicious is based on two data sources. Doing so provides a further improvement of up to 4.7% in DR, 11.5% in FA, and 14.7% in HD. The accuracy of the results is verified further using a dataset obtained from an experimental dc microgrid testbed. The results are consistent when validated: the detection performance varies by only around ±0.4% in most cases.
The rest of this article is organized as follows. Section II describes cyber-physical preliminaries of microgrids. Section III discusses the datasets used. Section IV presents the details of the cyber-attack detectors. Section V discusses the experimental results. Finally, Section VI concludes this article.

II. CYBER-PHYSICAL PRELIMINARIES OF MICROGRIDS
This article considers an autonomously operating dc microgrid system with K sources. The architecture of the microgrid is shown in Fig. 1. Each of the sources (interfaced using dc/dc buck converters for regulated power conversion) is connected to one another via tie-lines. These elements collectively represent the microgrid physical layer. Operation of the power electronic converters occurs in a voltage-controlled mode. Proper voltage regulation and current sharing are achieved using a cooperative secondary control framework where a local controller is associated with each of the DGs [27]. All the local controllers are connected through a distributed communication network, which requires each controller to share information only with its neighboring controller(s).
The cyber layer can be considered as a graph (consisting of multiple nodes and edges), where each node represents an agent and each edge represents a communication link that connects two agents. Elements of the network compose an adjacency matrix A = [a_kj] ∈ R^{N×N}, where the communication weights satisfy a_kj > 0 if (ψ_k, ψ_j) ∈ E (E denotes the set of edges, ψ_k being the local node and ψ_j the neighboring node), and a_kj = 0 otherwise. The matrix of inbound cyber information can be represented as Z_in = diag{Σ_j a_kj}, and the Laplacian matrix L = Z_in − A is said to be balanced if the row sums of A equal its column sums. Each controller unit can be represented as an agent in the cyber layer, sending and receiving a group of measurements to and from its neighboring agents to attain average voltage regulation and proportionate current sharing. Considering these preliminaries of the communication graph, the control input of the local secondary controller (associated with each DG) can be stated in terms of the elements present in the state vector x, where M_k is the set of neighbors of agent k; a similar formulation can be used to represent u_k. Remark I: According to the cooperative synchronization law [28], consensus will be achieved by all participating agents. Using (2), the local control inputs necessary to achieve the control targets (average voltage regulation and proportionate sharing of load current) can be acquired from the secondary controller via the voltage correction terms for the kth agent [29], where I_ref = 0 for proportionate current sharing. The correction terms acquired in (5) and (6) are added to the global reference voltage to obtain the local voltage reference in (7) for the kth agent, which is used to achieve the target objectives mentioned in (3) and (4).
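For concreteness, the relationship between the adjacency, in-degree, and Laplacian matrices can be sketched on an illustrative 4-node ring graph (the topology and unit weights here are assumptions for illustration, not the paper's exact network):

```python
import numpy as np

# Ring graph of 4 agents with unit communication weights (illustrative).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # adjacency matrix [a_kj]
Z_in = np.diag(A.sum(axis=1))               # in-degree matrix of inbound links
L = Z_in - A                                # graph Laplacian
```

For this symmetric graph the row and column sums of A are equal, so L is balanced: every row and column of L sums to zero.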
As per the distributed consensus algorithm for a heavily connected digraph (in the dc microgrid) [30], the system objectives [using (1)-(7)] shall converge to the consensus values in (8). As shown by the red symbols in Fig. 1, malicious attackers may try to corrupt the cyber layer in several ways (e.g., false data injection, denial of service, etc.) to disturb the achievement of the objectives mentioned in (8). In the case of a stealth attack, the attack vector penetrates deep into the control layer by deceitfully hiding from the system operator. The ability to access multiple nodes allows such vectors to create disturbances that can be sustained over an extended period of time and enables them to force generation outages, which may ultimately result in system shutdown. Hence, identifying the compromised node(s) is essential to prevent malware propagation and reduce the chances of further destabilization.
Such attacks can perform coordinated manipulation to fool the system observer via additive terms in (1), as formalized in (9), where u_a, x, and x_attack denote the vector representations of the attacked control input u_a_k = {u_Va_k, u_Ia_k}, the states x_k = {V_k, I_pu_k}, and the attack elements, respectively. It should be noted that x_attack could be a step, sawtooth, sinusoidal, or unbounded signal. Furthermore, W = [w_kj] denotes a row-stochastic matrix whose diagonal entries represent the placement of attack elements in the locally measured states, while the nonzero off-diagonal entries represent the communicated measurements. Using (9), we formalize that an undetectable attack can be maintained if and only if the sum of the change in state produced by the attack and the zero-input evolution of the state induced by the attack belongs to the system's weakly unobservable subspace. Although Wx_attack will always be equal to zero from a system-level perspective, the change identified across an agent is suppressed by the opposite shift in the remaining agents, without contributing any significant dynamics to the system.
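As a toy illustration of this balance condition (assuming a 4-agent ring topology with made-up weights, not the paper's attack vector), the following sketch builds a row-stochastic W and an alternating attack vector whose network-level aggregate Wx_attack cancels out:

```python
import numpy as np

# Row-stochastic W on a 4-agent ring: each agent weighs its own reading
# by 0.5 and each of its two neighbors by 0.25 (illustrative values).
W = np.zeros((4, 4))
for k in range(4):
    W[k, k] = 0.5
    W[k, (k - 1) % 4] = 0.25
    W[k, (k + 1) % 4] = 0.25

# Opposite shifts across neighboring agents hide the injection:
x_attack = np.array([1.0, -1.0, 1.0, -1.0])
aggregate = W @ x_attack            # cancels at the system level
```

Each agent's local change is exactly offset by the opposite shift at its neighbors, so the aggregate seen by a system-level observer is the zero vector, which is what makes the attack stealthy.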

III. DATA PREPARATION
An autonomous dc microgrid model (as shown in Fig. 1) with a distributed secondary control architecture is designed in the MATLAB/Simulink environment. The system consists of K = 4 DGs connected to each other via tie lines. The simulated parameters are given in the Appendix. The datasets are generated using this virtual test system: DG-level current and voltage measurements are observed and recorded. Benign values represent the system parameters during normal operation, whereas malicious values are obtained by modifying certain measurements to model a cyber-attack (as per the stealth attack modeling strategy mentioned in [29]). The current and voltage measurement blocks are used to sense the local current and voltage for each DG. These data are then saved for each DG, with all DGs cooperating to achieve the common objective in (8). The experiments are verified further using experimental data from a dc microgrid testbed described in Section V-D2.

A. Benign Data
To obtain the benign dataset, the simulation model is run without injecting any bias into the voltage and current measurements; the system is thus allowed to operate normally without any manipulation. As shown in Fig. 2, the current and voltage data plotted before t = 2 s are benign, as they do not contain any bias/attack elements.

B. Malicious Data
To obtain the malicious data, the attack vector (shown in Table I) is injected into the current and voltage measurements using (6). Fig. 2 shows the local voltage and current for each DG when subjected to voltage and current attacks after t = 2 s. Despite the presence of these attacks, the objectives mentioned in (5) are still achieved, which makes the attacks stealthy in nature. As a result, it is difficult to accurately identify the compromised elements in microgrids, which motivates automated detection efforts.
For each class, there are an equal number of current and voltage samples, with 5.6 million readings each. For the anomaly detectors, we split the benign readings into disjoint train (X_TR) and test sets using a 2:1 ratio, and we concatenate the malicious readings with the benign test set to build the final test set X_TST. For the supervised detectors, we concatenate the readings from both classes and split them into disjoint train X_TR and test X_TST sets using a 2:1 ratio.
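The anomaly-detector split described above (benign-only training, malicious readings appended to the benign test portion) can be sketched as follows; the function name, the random shuffle, and the array shapes are illustrative assumptions, not the paper's code:

```python
import numpy as np

def split_anomaly(benign, malicious, ratio=2 / 3, seed=0):
    """Split benign readings 2:1 into train/test; malicious readings
    are concatenated onto the benign test portion only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(benign))
    cut = int(len(benign) * ratio)
    x_tr = benign[idx[:cut]]                       # benign-only training set
    x_tst = np.concatenate([benign[idx[cut:]], malicious])
    y_tst = np.concatenate([np.zeros(len(benign) - cut),
                            np.ones(len(malicious))])  # 0 = benign, 1 = malicious
    return x_tr, x_tst, y_tst
```

The training set never contains a malicious sample, which is what lets the autoencoder learn benign behavior only.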

IV. ANOMALY DETECTION
This section first discusses common machine-learning-based solutions adopted to detect anomalies along with their limitations. Then, it investigates the adoption of an autoencoder-based detection and how it can overcome the limitations.

A. Benchmark Detectors
This section discusses several machine-learning-based cyber-attack detectors. For a comprehensive comparative analysis, we examine detectors with various characteristics, including shallow/deep structure, static/recurrent mechanism, and supervised/unsupervised detection, to determine which set of characteristics leads to the best detection performance. Specifically, we investigate the use of ARIMA, one-class SVM, and F-SAE as anomaly detectors. Then, we examine the use of two-class SVM, feedforward neural network, CNN, and LSTM classifiers as supervised detectors.
1) Anomaly Detectors: ARIMA is considered a shallow dynamic anomaly detector trained to predict future patterns with minimum prediction mean-square error (MSE). During testing, it flags abnormal patterns whenever the MSE exceeds a certain threshold [32]. The one-class SVM is a shallow static anomaly detector that is trained only on benign data and then tested on both benign and malicious samples. The F-SAE is a static deep detector that learns the behavioral patterns of benign samples throughout the reconstruction process and detects malicious samples based on their deviation from the benign ones [33].
2) Supervised Detectors: The two-class SVM is a classifier that is trained on both benign and malicious samples and then tested on both types [34], making decisions using a decision boundary. The feedforward model [35] is a static deep detector that learns the behavior of samples in a single direction using stacked hidden layers. The CNN model is a deep detector that performs convolutions on the time-series data to extract relevant features. The LSTM model is a deep recurrent neural network (RNN) in which information flows in recurrent cycles to retain previous knowledge.
There are three main limitations of such models. First, shallow architectures are not capable of capturing the complex patterns and temporal correlations present in the time-series datasets. Second, static detectors do not capture well the time-series nature of the data. Third, supervised detectors can only detect attacks that are part of their training set, and hence they are vulnerable to unseen (zero-day) attacks. Such factors negatively affect the performance of these detectors. Next, we present a deep dynamic anomaly detector that can detect unseen attacks owing to its unsupervised learning nature.

B. Autoencoder-Based Anomaly Detection
This section investigates the use of autoencoders for anomaly detection due to two key features. First, autoencoders may be stacked into several hidden layers, yielding a deep structure capable of extracting more representative and relevant features from our datasets. Second, autoencoders can be equipped with a sequence-to-sequence (seq2seq) structure, giving them the ability to better capture the time-series nature of our datasets. Both features help improve the overall detection performance, and to improve it further, a sequential grid-search hyperparameter optimization is carried out. Autoencoders are a type of anomaly detector [33] that operates by learning the behavioral patterns of a (normal) class; abnormal deviations from those learned patterns are then used to identify anomalies. Specifically, the reconstruction error produced while reconstructing the data serves as the anomaly score. In an SAE, the dimensionality of the data is reduced during the encoding step and the data are reconstructed during the decoding step, where the reconstruction error represents the difference between the initial and reconstructed data. SAEs are trained on benign samples, with the parameters of the encoder and decoder optimized to minimize the reconstruction error. Let x denote a row of the training dataset X_TR, H = f_Θ(x) the encoder output, R = g_Θ(H) the decoder output, and Θ the SAE parameters obtained by solving min_Θ C(x, g_Θ(f_Θ(x))), x ∈ X_TR.
C(x, g_Θ(f_Θ(x))) represents the cost function (i.e., the MSE), which penalizes g_Θ(f_Θ(x)) for its deviation from x. Using the cost function in (11), benign data will have a smaller reconstruction error than malicious data (anomalies). An anomaly is detected when the reconstruction error exceeds a specific threshold value.
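The reconstruction-error decision rule can be illustrated with a simple linear stand-in for the SAE, where PCA's transform/inverse_transform play the roles of the encoder f_Θ and decoder g_Θ; the synthetic waveforms, component count, and 99th-percentile threshold below are illustrative assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 200)
# Benign "measurements": phase-shifted sinusoids (normal behavioral pattern).
benign = np.stack([np.sin(t + p) for p in rng.uniform(0, 1, 300)])
# Malicious: a step bias injected into the second half of the window.
attacked = benign[:50] + 0.8 * (t > 2 * np.pi)

enc_dec = PCA(n_components=5).fit(benign)       # "trained" on benign data only

def recon_err(x):
    # MSE between the input and its encode/decode reconstruction.
    return ((x - enc_dec.inverse_transform(enc_dec.transform(x))) ** 2).mean(axis=1)

tau = np.quantile(recon_err(benign), 0.99)      # threshold from benign errors
flags = recon_err(attacked) > tau               # True = flagged as malicious
```

Benign samples reconstruct almost perfectly because they lie in the learned subspace, while the step bias leaves a large residual, so its reconstruction error clears the threshold.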
Herein, we adopt an RNN-based autoencoder, namely, LSTM, for two reasons. First, it can enhance the detection performance due to its capability of capturing complex patterns and the temporal correlation in the time-series data. Second, it can overcome the vanishing gradient problem while learning temporal correlation over long intervals. Fig. 3 presents the structure of the deep LSTM-based stacked autoencoder (LSTM-SAE). The LSTM-SAE model comprises two LSTM-based RNNs: a deep LSTM encoder and a deep LSTM decoder [36], [37]. The encoder's input x ∈ X_TR is a time-series vector that is encoded into a hidden state, i.e., a more compact alternative representation of the time-series data in the latent layer [38]. Within the encoder, after the input layer, there are L hidden LSTM layers with N_l cells in each layer. Within the decoder, the LSTM encoder's output serves as the decoder's input, and the decoder is responsible for reconstructing the initial time-series data. During training, the LSTM-SAE aims to minimize the MSE of the input-output reconstruction.
An LSTM cell maintains a state c_t at time instant t and produces a hidden state h_t as an output. Access to such a cell is controlled by the input i_E,t, forget f_E,t, and output o_E,t gates in the encoder and by the corresponding input i_D,t, forget f_D,t, and output o_D,t gates in the decoder. A data sample x_t at time t, as well as the previous hidden states of the LSTM cells within the same layer (h_E,t−1 in the encoder and h_D,t−1 in the decoder), are the LSTM cell's external inputs. The cell state (c_E,t−1 in the encoder and c_D,t−1 in the decoder) is the LSTM cell's internal input. The gates are activated using the aforementioned external and internal inputs together with the activation functions and biases. The h and c states from the encoder's last timestep are fed as the initial hidden and cell states of the decoder. Algorithm 1 shows the overall operation mechanism of the LSTM-SAE. Specifically, lines 9-13 and 18-22 present the calculation of i_{E/D,t}, f_{E/D,t}, and o_{E/D,t}. The learnable weight matrices and bias vectors are denoted by W_l, U_l, V_l, and b_l. Solving (11) yields the optimal learnable parameters.
After training on X_TR, testing is performed on X_TST. The cost function measures the MSE between the initial and reconstructed data; whenever it is smaller than a specific threshold, the sample is given the label y = 0 (benign), and otherwise, the sample is assigned the label y = 1 (malicious). The same model is utilized throughout the different experiments. We generate current and voltage readings in four equal subsets {I_1, I_2, I_3, I_4} and {V_1, V_2, V_3, V_4}, respectively. The first experiment employs the current data as a single-feature input with binary labels, benign and malicious. The second experiment employs two features, current and voltage readings; fusing the current and voltage datasets results in {IV_1, IV_2, IV_3, IV_4} with the same binary labels. In this fusion method, the model considers both the current and voltage readings during each timestep in an iterative process. This way, the reconstruction error comes from both readings when determining whether a sample is benign or malicious, which enhances the detection performance. For all experiments, we run the detectors on each subset and report the performance separately.
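A minimal Keras sketch of such a seq2seq LSTM autoencoder with a fused two-channel (current and voltage) input might look as follows; the window length and layer sizes are illustrative assumptions, smaller than the paper's optimized (500, 300) configuration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

T, F = 20, 2  # timesteps per window; F = 2 fuses current and voltage channels

# Encoder compresses the window into a latent state; RepeatVector hands that
# state to a mirrored decoder, which reconstructs both channels per timestep.
model = keras.Sequential([
    keras.Input(shape=(T, F)),
    layers.LSTM(32, return_sequences=True),    # encoder layer 1
    layers.LSTM(16),                           # encoder layer 2 -> latent state
    layers.RepeatVector(T),                    # broadcast latent state over T steps
    layers.LSTM(16, return_sequences=True),    # decoder mirrors the encoder
    layers.LSTM(32, return_sequences=True),
    layers.TimeDistributed(layers.Dense(F)),   # reconstruct both channels
])
model.compile(optimizer="adam", loss="mse")    # reconstruction MSE cost
```

Training would then call `model.fit` on benign windows only, and the per-window reconstruction MSE on X_TST would be compared against the threshold to assign y = 0 or y = 1.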

C. Performance Evaluation of the Detectors
We report three performance metrics to assess the detection performance. A true positive (TP) sample is a malicious one detected as malicious. Similarly, a true negative (TN) sample is a benign one detected as benign. In contrast, a false positive (FP) sample is a benign one detected as malicious, and a false negative (FN) sample is a malicious one identified as benign. The reported metrics are the detection rate (DR = TP/(TP+FN)), which specifies the fraction of malicious samples detected as malicious; the false alarm rate (FA = FP/(TN+FP)), which gives the fraction of benign samples detected as malicious; and the highest difference (HD = DR − FA).
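These metrics follow directly from the confusion-matrix counts, for example (the counts below are made-up illustrative numbers):

```python
def detection_metrics(tp, fn, fp, tn):
    """DR, FA, and HD computed from confusion-matrix counts."""
    dr = tp / (tp + fn)        # fraction of malicious samples caught
    fa = fp / (tn + fp)        # fraction of benign samples falsely flagged
    return dr, fa, dr - fa     # HD = DR - FA

# e.g., 95 of 100 attacks caught, 4 of 200 benign samples falsely flagged:
dr, fa, hd = detection_metrics(tp=95, fn=5, fp=4, tn=196)
```

A perfect detector has DR = 1 and FA = 0, so HD = 1; HD therefore rewards catching attacks while penalizing false alarms in a single number.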

D. Threshold Values
To compute the performance metrics, we generate a confusion matrix by comparing Y_CAL to Y_TST. Y_CAL is determined using a threshold that is compared against the reconstruction error. We determine this threshold according to the median of the interquartile range (IQR) of the receiver operating characteristic (ROC) curve. Scores smaller than the threshold value denote benign samples, whereas scores larger than it represent malicious samples.
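One plausible reading of this rule, since the exact construction over the ROC curve is not spelled out here, is to take the midpoint of the interquartile range of the detector's normalized scores as the threshold; the sketch below works under that assumption:

```python
import numpy as np

def iqr_median_threshold(scores):
    """Midpoint of the interquartile range of the detector scores,
    used as the benign/malicious decision threshold."""
    q1, q3 = np.percentile(scores, [25, 75])
    return (q1 + q3) / 2.0

# Illustrative normalized reconstruction-error scores:
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.8, 0.9, 1.0, 1.1])
tau = iqr_median_threshold(scores)
labels = (scores > tau).astype(int)   # 1 = malicious, 0 = benign
```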

E. Hyperparameter Optimization
The selection of ideal hyperparameter values for the detectors helps enhance the detection performance. L denotes the optimal number of LSTM layers, which is the same in both the encoder and the decoder. N_l denotes the optimal number of neurons within the LSTM layers. O, D, A_H, and A_O denote the optimal optimizer, dropout rate, hidden activation function, and output activation function, respectively. Algorithm 2 shows that the hyperparameter optimization is conducted in four sequential steps. Since the number of hyperparameters being optimized is large, an exhaustive grid search would incur high computational complexity; therefore, we implement a sequential grid search instead. Hyperparameters are selected via cross-validation over X_TR. P* denotes the ultimate hyperparameter settings that maximize the DR on our validation set, where each hyperparameter setting results in a specific model (MD).
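The sequential strategy, optimizing one hyperparameter group at a time while freezing the best values found so far, can be sketched as follows; the group names, candidate values, and toy objective are illustrative assumptions, not the paper's actual search space or Algorithm 2:

```python
def sequential_grid_search(evaluate, groups, defaults):
    """Optimize one hyperparameter group per step, keeping the best
    value found so far, instead of one exhaustive joint grid."""
    best = dict(defaults)
    for name, values in groups:
        scored = {v: evaluate({**best, name: v}) for v in values}
        best[name] = max(scored, key=scored.get)
    return best

groups = [("layers", [2, 4, 6]), ("cells", [100, 300, 500]),
          ("optimizer", ["adam", "sgd"]), ("dropout", [0.0, 0.2, 0.4])]

# Toy objective standing in for validation DR (peaks at a known setting):
evaluate = lambda p: (-abs(p["layers"] - 4) - abs(p["cells"] - 500) / 100
                      - (p["optimizer"] != "adam") - abs(p["dropout"] - 0.2))

best = sequential_grid_search(evaluate, groups,
                              {"layers": 2, "cells": 100,
                               "optimizer": "adam", "dropout": 0.0})
```

With G groups of V values each, this costs G·V evaluations instead of the V^G of an exhaustive grid, at the price of ignoring interactions between groups optimized in different steps.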

V. SIMULATION RESULTS
Herein, we discuss the performance of the benchmark as well as the LSTM-SAE models when detecting anomalies. The results are reported for both of the conducted experiments as mentioned in Section IV-B.

A. Computational Complexity
Training the examined detectors is conducted offline on an NVIDIA GeForce RTX 2070 hardware accelerator using the Keras API. Offline training of the benchmark detectors takes 1 h, and the LSTM-SAE takes 1.5 h. Online testing requires 1.6 s to report a decision on a single reading.

B. Threshold Values
For the investigated anomaly detectors, the ROC curves illustrated in Fig. 4 are utilized to specify the detectors' threshold values separating benign from malicious samples. Dividing the curve into quartiles and obtaining the IQR's median leads to the following threshold values: 0.54, 0.45, and 0.59 for the ARIMA-based, one-class SVM, and LSTM-SAE-based detectors, respectively, in the first experiment (using current data). In the second experiment (using current and voltage data), the threshold values are 0.51, 0.43, 0.52, and 0.55 for the ARIMA-based, one-class SVM, F-SAE, and LSTM-SAE detectors, respectively. The ROC curve for the two-class SVM is also plotted in Fig. 4 for comparison.

C. Hyperparameter Optimization
For both experiments, the ideal hyperparameter combination of the LSTM-SAE detector turns out to be as follows. The optimal number of LSTM layers is four, where the optimal numbers of neurons in the two encoder layers are (500, 300), with the inverse order (300, 500) on the decoder's side. The optimal optimizer and dropout rate are Adam and 0.2, respectively. Sigmoid is the optimal choice for both the hidden and output activation functions. In the ARIMA-based detector, the differencing and moving-average orders are 1 and 0, respectively. For the SVM detectors, sigmoid and scale are the ideal kernel and gamma, respectively. The optimal feedforward parameters are six layers with 300 neurons, the Adamax optimizer, a 0.2 dropout rate, and Sigmoid hidden and output activation functions. The F-SAE model has the same number of layers and neurons as the LSTM-SAE, with an SGD optimizer, a 0.4 dropout rate, and Sigmoid and Softmax for the hidden and output activation functions, respectively. The LSTM model has six layers with 500 cells, the Adam optimizer, no dropout, a weight constraint of 5, and ReLU and Softmax hidden and output activation functions, respectively, as the ideal parameters.

D. Performance Evaluation
This section discusses the detection performance of the examined detectors using the simulated data discussed in Section III. We also use experimental data to validate the performance results.
1) Simulated Data: Table II presents the results of the first experiment, which reports the performance of the developed detectors using only the four current datasets, as well as their average performance. The average performance of the LSTM-SAE-based detector shows that it significantly outperforms the rest of the detectors. Specifically, the LSTM-SAE-based detector outperforms the benchmark detectors by 3.5−18.3%, 2.6−12.7%, and 6.1−31% in DR, FA, and HD, respectively. Table III summarizes the results of the second experiment, which reports the performance of the examined detectors using the four fused current and voltage datasets. According to the simulation results, the LSTM-SAE-based detector again outperforms the rest of the benchmark detectors, by 3.1−16.4%, 3.1−14.1%, and 6.3−30.6% in DR, FA, and HD, respectively. The superior performance of the LSTM-SAE-based detector is due to its deep structure, which gives it the ability to better capture the complex patterns of the data. Also, its recurrent architecture allows it to apprehend the temporal correlations within the time-series data. Moreover, given its unsupervised training nature, detection is performed on totally unseen data, which means that it can detect zero-day attacks.
Fusing the voltage and current data helps improve the detection performance of the detectors. Specifically, the average HD of the detectors improves by 9.7−14.8%. This is because utilizing the reconstruction errors obtained from both the current and voltage data increases the models' certainty in the decision on whether a sample is benign or malicious. This data fusion method provides an improvement of up to 4.6% in DR, 11.5% in FA, and 14.7% in HD.
2) Validation on Experimental Data: As illustrated in Fig. 5, the multilabeled dataset is obtained from a dc microgrid experimental testbed operating at a voltage reference V_dc_ref of 48 V with N = 2 dc/dc buck converters tied radially to a programmable load (voltage-dependent mode). Each converter is controlled using the control structure in Fig. 1 by a dSPACE MicroLabBox DS1202 (target), with control commands issued from ControlDesk on the PC (host). A single-line diagram of the experimental setup is shown in Fig. 6. The control strategy is operated in the presence and absence of stealth cyber-attacks on the local and neighboring measurements. The parameters of the experimental testbed are given in the Appendix. The results shown in Tables IV and V verify the correctness of our conducted simulations. {I_1, I_2} and {IV_1, IV_2} denote the current and the fused current-voltage readings from the two converters, respectively. Running the investigated detection schemes on the testbed yields consistent detection performance, varying only by around ±0.4% compared to the results obtained with the simulated data.

VI. CONCLUSION
This article answered two important research questions regarding data-driven-based approaches for stealth cyber-attack detection in dc microgrids. Our extensive experiments provide the following conclusions: 1) Adopting an LSTM-based stacked autoencoder offers superior detection performance compared to benchmark machine-learning-based detectors due to its deep recurrent structure. Such characteristics help in discovering the complex patterns and temporal correlations of the time-series dataset. Also, the LSTM-SAE model can detect unseen attacks since it is an unsupervised anomaly detector that is trained only on benign data. Utilizing only current data for training, the LSTM-SAE model offered an improvement of up to 18.3% in DR, 12.7% in FA, and 31% in HD compared to benchmark detectors. 2) Performing feature fusion that incorporates current and voltage data for training improved the detection performance further by up to 4.7% in DR, 11.5% in FA, and 14.7% in HD as it enables the detector to capture distinct representations from both features. Running the investigated detection schemes on a real testbed offered consistent performance that varies only by ±0.4% compared to the detection performance using the simulated data.

Simulation Parameters
The test model is composed of four DGs (rated at 6 kW each). The line parameter R_kl connects the kth agent to the lth agent, and each agent has identical controller gains.

Experimental Testbed Parameters
The system is composed of two sources with equally rated 600-W converters, and the controller gains are identical for each converter.