
Distributed Model-Free Bipartite Consensus Tracking for Unknown Heterogeneous Multi-Agent Systems with Switching Topology.

Huarong Zhao^1, Li Peng^1,2, Hongnian Yu^3.

Abstract

This paper proposes a distributed model-free adaptive bipartite consensus tracking (DMFABCT) scheme. The proposed scheme does not depend on a precise mathematical model, yet it achieves both time-invariant and time-varying bipartite trajectory tracking for unknown discrete-time heterogeneous multi-agent systems (MASs) with switching topologies and coopetition networks. The main innovation of the algorithm is to estimate an equivalent dynamic linearization data model by the pseudo partial derivative (PPD) approach, where only the input-output (I/O) data of each agent is required, and the cooperative interactions among agents are investigated. A rigorous convergence proof is given for DMFABCT, which shows that the tracking errors can be reduced. Finally, the results of three simulations show that the DMFABCT scheme is effective and robust for unknown heterogeneous discrete-time MASs with switching topologies performing bipartite consensus tracking tasks.


Keywords: bipartite consensus; data driven; multi-agent system; switching topologies

Year: 2020 PMID: 32726953 PMCID: PMC7435747 DOI: 10.3390/s20154164

Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576

1. Introduction

Multi-agent systems (MASs) and machine learning, two exciting trends in the robotics field, have recently attracted increasing attention from researchers in the new epoch of artificial intelligence (AI) [1,2]. How to introduce intelligent algorithms into traditional control theory is one of the most active and significant research topics. Specifically, using intelligent algorithms to improve the robustness of MASs and to reduce the computational burden of controller design [3,4,5] in order to achieve consensus tracking are two of the challenges that need to be addressed. Over the past half-century, most successful control schemes have been developed on the basis of explicit or implicit mathematical models. Examples include sliding mode control, intermittent control, impulse control, and fuzzy control, to name but a few. In addition, most of these control theories have been successfully applied to consensus tracking tasks of MASs. In [5], Barbot et al. first introduced the concept of a second-order sliding mode. Many novel approaches have been developed since then. For instance, a novel sliding-mode-based discrete differentiator was proposed that can estimate the accurate derivatives input of the controlled plant [6], and output constraint problems are considered in the design of second-order sliding mode controllers in [7]. In [8], Xu et al. studied the second-order consensus problems of MASs, where local intermittent information among the agents is used to design a distributed adaptive, completely intermittent controller that achieves second-order consensus. Impulse control approaches can be found in [9,10], where the fixed-time quantity consensus, delayed and stochastic perturbation, and second-order consensus are considered to design appropriate properties for MASs. In terms of fuzzy control, the authors of [11] designed a mixed controller, consisting of a fuzzy controller and a fuzzy observer, to handle the partly unmeasurable states of controlled systems.
It is noteworthy that most traditional control algorithms [5,6,7,8,9,10,11] must consider the dynamics of the controlled system, which is called model-based control (MBC). However, an accurate model of the plant is hard to obtain, so most MBC approaches are built on approximate system dynamics and are often not robust in practical applications. Fortunately, in the past few years, with the development of machine learning, another branch of control theory has emerged that is inspired by machine learning and introduces learning approaches into traditional theories to avoid the difficulty of acquiring or estimating the dynamics of physical systems. To complete control tasks similar to those solved by MBC schemes, this new control theory uses only the interaction information between the controlled system and its external environment and improves the control performance by self-learning; this is called model-free control (MFC) or data-driven control [2,4]. Recently, several papers [12,13,14,15,16,17,18,19,20,21,22,23,24] have reported on model-free adaptive control (MFAC), iterative learning control (ILC), repetitive learning control (RLC), reinforcement learning (RL), and so on. The consensus tracking problems of MASs were studied in [12] by the MFAC approach, where both time-invariant and time-varying desired trajectory tracking are achieved. Moreover, a further theoretical analysis of MFAC was rigorously presented in [4], which shows that the MFAC method needs only the input/output (I/O) measurement data of a controlled plant, without any explicit mathematical model, Lyapunov stability theory, or key technical lemma, to design controllers for various control tasks. ILC is an effective approach for repetitively operating systems and has been developed by many researchers, for example in [2,13,14]. In [2], Hui et al. extended the dimension of ILC, which has a time dimension, an iteration dimension, and a space dimension, to achieve faster and more precise tracking performance for the MASs' formation task. In [13], Li et al. studied how to combine ILC with model predictive control to achieve better performance. RLC is used to track periodic exogenous signals in continuous processes, as can be seen in [14], where a novel distributed adaptive protocol is investigated for uncertain nonlinear leader–follower MASs to achieve global asymptotic consensus. In [15], Odekunle et al. presented a novel approach to solve the non-zero-sum game output regulation problem for MASs by using RL.
In our investigations, we found that another category of MFC methods is based on neural networks (NNs), which have strong approximation abilities for nonlinear dynamics. In [16,17,18], the authors designed actor–critic-based neural networks to approximate the value function and the control policy of each agent, respectively, to optimize consensus control performance. It should be pointed out that NN-based methods need training processes and external testing signals for controller design, which is inconvenient. Meanwhile, there are some interesting adaptive schemes in [19,20,21,22,23]. In the aforementioned studies [5,6,7,8,9,10,11], consensus problems of MASs are addressed by MBC approaches, while the authors of [12,13,14,15,16,17,18,19,20,21,22,23] employed and developed MFC methods to address consensus or consensus tracking problems for MASs; however, achieving consensus tracking for MASs with unknown dynamics is still an open and challenging problem. Furthermore, it is clear from the above literature that MAS consensus control and tracking studies consider only the cooperative interactions among agents.
In fact, cooperative and competitive relationships are usually inseparable from one another in natural or engineering scenarios, for instance, activators and inhibitors in biological systems, opposing teams in a sports match, or duopolistic regimes arising when agents compete for limited resources in economic systems [24]. Hence, to improve the adaptive and autonomous abilities of MASs, the competitive relationship needs to be considered, which is becoming an active research topic. Altafini [25] first explored consensus for MASs with antagonistic interactions. This specific consensus is called bipartite consensus (BC): agents are assigned to two alliances, each alliance has its own sign, and all agents ultimately reach the same modulus of position, velocity, and/or angle. Since then, BC has sparked the interest of many researchers and has been discussed for MASs with linear, nonlinear, and even heterogeneous dynamics. Moreover, BC for MASs with Lipschitz-type, second-order, or high-order dynamics is investigated in [24,26,27]. Inspired by these contributions, several theories have been extended. In [28], a distributed extended state observer is employed to guarantee leader–follower BC for MASs with mismatched unknown disturbance. Formulating a BC controller is more challenging for high-order MASs than for low-order ones. The BC problem for high-order MASs with input saturation is studied in [29] by combining distributed event-triggered control with a low-gain feedback technique. Finite-time and fixed-time BC for MASs are explored in [30,31], respectively. A novel RL-based protocol is presented in [32], which is the first use of RL for unknown discrete-time leader–follower MASs; the author uses data-driven actor–critic-based NNs to address the BC problem for unknown MASs, but this increases the computational load. Moreover, a training process is necessary.
Although much effort has been made toward solving the BC problem [33,34,35,36], to the best of our knowledge, pseudo partial derivative (PPD) approaches have not been taken into account in the existing results. Based on the above observations and analysis, this paper employs a PPD method to estimate an equivalent dynamic linearization data model of each agent, where only the measured I/O data of neighboring agents is necessary. Then, a distributed model-free adaptive bipartite consensus tracking (DMFABCT) scheme is designed for unknown discrete-time heterogeneous nonaffine nonlinear MASs with switching topologies to realize time-invariant and time-varying reference trajectory bipartite consensus tracking tasks by using the neighbor-based tracking error. It is worth pointing out that although only a few agents can receive the desired trajectory, the rigorous theoretical proof confirms that the proposed algorithm guarantees convergence of all agents. Relative to the existing consensus approaches for MASs, the main contributions of this work can be summarized as follows:
(1) A DMFABCT framework is established for unknown heterogeneous nonaffine nonlinear discrete-time MASs with switching topologies and a coopetition network. It is a data-driven distributed intelligent algorithm that addresses the BC problem under both time-invariant and time-varying reference trajectories. Although Bu et al. [37] proposed a novel data-driven framework for MASs, it only discussed cooperative interactions.
(2) The proposed DMFABCT scheme is designed from neighbor-based online I/O measurement data, which bypasses the need of existing consensus algorithms [5,6,7,8,9,10,11,24,25,26,27,28,29,30,31,32,33,34,35] for an accurate mathematical model, so that the designed scheme is more robust and reduces the energy cost of massive computation.
(3) Both collaborative and antagonistic interactions among agents are considered in the proposed protocol. Compared with the protocols in [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23], the proposed protocol is more reasonable. Moreover, DMFABCT differs from the algorithm proposed in [32] in that it copes with the BC problem through the PPD, so that training processes and external testing signals are not necessary.
The remainder of this paper is structured as follows. Several essential preliminaries are presented in Section 2. The DMFABCT algorithm and the analysis of its tracking performance for fixed and time-varying reference trajectories are presented in Section 3. Three numerical simulation experiments are provided in Section 4. Finally, conclusions and future work are provided in Section 5.

2. Preliminaries and Problem Formulation

2.1. Graph Theory and Some Notations

Let denote the set of real numbers. The Euclidean norm of is expressed by . The identity matrix and diagonal matrix are expressed by and , respectively, where the dimension is dependent on the context. In this paper, algebraic graph theory is employed to analyze the interaction topologies of MASs. It should be pointed out that the graphs are directed and the weighted directed graph is expressed by , where , , and are the set of vertices, the set of edges, and the adjacency matrix, respectively. Then, is the parent and is the child if can transmit information to directly, which is expressed as . If is not the father of , , otherwise . In the graph of MASs, the has many children so utilizes the to describe the relationships among each agent, which is named as the neighborhood of the agent in other literature. In this paper, both cooperative and competitive relationships between agents are considered, so the elements of take three different values, −1, 0, and 1. If the nodes and belong to the same group and agent can get information from agent , , otherwise . When , the agents and must be in opposite groups, which is called a competitive relationship between the agents and . Alternatively, there is another definition, which is cooperation. Moreover, we usually use coopetition to represent the two different situations in the MASs network. The Laplacian matrix of can be calculated by , where and are called in-degree of vertex . The coopetition network is called structurally balanced if all nodes in can be divided into two disjoint subsets, that is, , . They satisfy the following three conditions: and . if , . if , . Furthermore, if this MASs graph contains a spanning tree, the information can transmit from a root node to any other node, and so this graph is considered to be a strongly connected graph.
In order to investigate time-varying switching topologies, let denote a time-varying switching graph with a virtual leader, which is dependent on , and , , are the corresponding adjacency matrix, degree matrix, and Laplacian matrix, respectively. denotes the neighborhood of the agent and is employed to depict the relationship between the virtual leader 0 and each follower. If the agent can directly get the desired trajectory from virtual leader 0, , . Otherwise, . To describe the time-varying topology, let denote the set of all directed graphs for the agents, where denotes the total number of possible interaction graphs.
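The signed-graph notions above can be illustrated with a short sketch. It assumes the usual convention for coopetition networks, which the extracted text does not show explicitly: adjacency entries in {−1, 0, 1}, a signed Laplacian built from the absolute in-degree, and structural balance tested by trying to split the nodes into two alliances. The function names and the three-agent example are hypothetical.

```python
from collections import deque

import numpy as np


def signed_laplacian(A):
    """Signed Laplacian D - A, with D = diag(sum_j |a_ij|) (assumed convention)."""
    A = np.asarray(A, dtype=float)
    return np.diag(np.abs(A).sum(axis=1)) - A


def structural_balance(A):
    """Return a list of alliance signs (+1 / -1) if the signed graph is
    structurally balanced, otherwise None. Edge directions are ignored."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    sign = [0] * n
    for start in range(n):
        if sign[start] != 0:
            continue
        sign[start] = 1
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                for w in (A[i, j], A[j, i]):
                    if w == 0:
                        continue
                    # Positive edge: same alliance; negative edge: opposite one.
                    expected = sign[i] if w > 0 else -sign[i]
                    if sign[j] == 0:
                        sign[j] = expected
                        queue.append(j)
                    elif sign[j] != expected:
                        return None
    return sign


# Hypothetical example: agents 0 and 1 cooperate, both compete with agent 2.
A_balanced = np.array([[0, 1, -1], [1, 0, -1], [-1, -1, 0]])
# Flipping one sign creates a cycle with a single negative edge: unbalanced.
A_unbalanced = np.array([[0, 1, -1], [1, 0, 1], [-1, 1, 0]])
```

For a structurally balanced graph, multiplying the signed Laplacian on both sides by the diagonal matrix of alliance signs yields an ordinary Laplacian with zero row sums, which is why many cooperative-consensus arguments carry over to the bipartite case.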

2.2. Problem Formulation

In existing studies, the consensus problem, especially the bipartite consensus problem, is often considered for a group of agents with identical dynamics. However, heterogeneity is an intrinsic property of multi-agent systems. Therefore, the problem of bipartite consensus for heterogeneous agents presents many challenges. It is noteworthy that the following assumptions are fundamental conditions on the nonlinear dynamics for our analysis. Consider a discrete-time heterogeneous SISO (single-input single-output) MAS with where Those conditions where The authors of [ Under these circumstances where the agent’s dynamic (1) satisfies Assumptions 1, 2, and where Using PPD to establish a dynamic linearization data model is called the PPD approach, where the PPD is only dependent on The following distributed measurement output: If the agent All of the time-varying switching communication graphs are strongly connected graphs and the trajectory information of the virtual leader can be transmitted to one or more follower agents directly. In the relative literature, The above Assumption 3 is a fundamental condition for researching the bipartite consensus tracking problems. Moreover, Assumption 4 is implied in traditional model-based control algorithms as a type of linear-like characteristic. Furthermore, this assumption is widely used in practical multi-agent systems, for instance, in unmanned aerial vehicles and mobile robots.
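Because the displayed equations of this section did not survive extraction, the following block restates the form in which the compact dynamic linearization and the signed distributed measurement output are commonly written in the MFAC literature. It is a sketch under assumptions, not a reproduction of the paper's Equations (1)–(3): the symbols $\phi_i$, $b$, $\xi_i$, $d_i$, $s_i$, and $y_d$ are chosen here for illustration.

```latex
% Assumed compact-form dynamic linearization of agent i (standard MFAC form):
\begin{align}
  \Delta y_i(k+1) &= \phi_i(k)\,\Delta u_i(k), \qquad |\phi_i(k)| \le b, \\
  \Delta y_i(k+1) &= y_i(k+1) - y_i(k), \qquad \Delta u_i(k) = u_i(k) - u_i(k-1).
\end{align}
% Assumed signed distributed measurement output of agent i, where
% a_{ij} \in \{-1, 0, 1\}, d_i \in \{0, 1\} marks access to the virtual leader,
% and s_i \in \{-1, 1\} is the alliance sign of agent i:
\begin{equation}
  \xi_i(k) = \sum_{j \in N_i} |a_{ij}| \bigl( \operatorname{sign}(a_{ij})\, y_j(k) - y_i(k) \bigr)
           + d_i \bigl( s_i\, y_d(k) - y_i(k) \bigr).
\end{equation}
```

Under this form, $\xi_i(k) = 0$ for all agents corresponds to every agent sitting at $s_i\,y_d(k)$, which is the bipartite tracking objective described in the text.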

3. Main Results

In order to solve the bipartite consensus tracking problem stated in Section 2.2, we propose the DMFABCT approach below: where , are the step sizes, which will be defined in the next section. and are weight factors. According to Assumption 4, let , which is the initial value of , and it is the estimated value of . Practically, if the is very small, it means that the does not update any more, thus, is selected as 10^{-4}. It is noted that The feature of this DMFABCT scheme is that the agents’ model dynamics are not required; for instance, the PPD parameter estimation algorithm uses only the measured I/O data of the multi-agent system to complete the formulation. It is therefore a classic data-driven control approach for solving the MASs’ BC problem. Both To analyze the stability of MASs, Lemma 2 is one of the important conditions. A time-varying irreducible substochastic matrix and the set of all possible where The stability analysis of the DMFABCT approach is presented by Theorem 1. Under these circumstances where the MASs (1) satisfies Assumptions 1, 2, and 4 and its communication topology satisfies Assumption 3, apply the proposed DMFABCT algorithms (4)–(6) to track the desired reference trajectory and We prove this theorem using the three steps below. Step 1 (Proving the Boundedness of ): Define . According to Lemma 1 and the parameter estimation law (4), the following equation can be obtained. According to Equation (7) the following equation can be obtained. The inequalities can be obtained by selecting and , which satisfy and . because the system studied in this paper is single-input single-output. Thus, a constant can be selected to satisfy the following inequality. Since , according to Assumption 4, the following inequalities can be obtained. Obviously, we obtain and so that . Moreover, since is bounded, it is obvious that is bounded.
Step 2 (Proving the Convergence of ): Since , Equation (3) can be rewritten as follows: Equation (11) can be written for clarity as a compact form where for and for , , and is the N-vector. Moreover, obviously . According to Equation (12), the compact form of the DMFABCT algorithm (6) can be written as follows: where According to equations , , and , Equation (2) can be written as follows: where . According to , it is easy to get . Furthermore, we could substitute (13) into (14) to get where , , . From (15), we can obtain that if for all , then . Step 3 (Obtaining the Convergence Condition of MASs): In this step, the convergence condition of MASs will be derived. According to the conditions , , , , and for all , the following inequalities can be obtained: First of all, in order to guarantee the strongly connected property of MASs under all of the communication topologies, must be an irreducible matrix. Secondly, for all and satisfies the following inequality which means that all of the diagonal entries of are larger than the reciprocal of . In this case, obviously is strictly less than one, so is an irreducible substochastic matrix and its diagonal entries are positive. According to (15), the following inequality can be obtained. According to Lemma 1, the following inequality can be obtained. where stands for the floor function. Hence, the bipartite consensus fixed trajectory tracking errors of MASs can converge to the origin. □ Under these circumstances where the MASs (1) satisfies Assumptions 1, 2, and 4 and its communication topology satisfies Assumption 3, apply the designed DMFABCT schemes (4)–(6) to track the time-varying reference trajectory and Since , then , so that the bipartite consensus tracking error Equation in (15) can be rewritten as so that the following inequality can be obtained. Let and utilizing Lemma 1 we can obtain that , and (16) can be written as follows: where denotes the floor function. Finally, the boundedness of is obtained.
Thus, the bipartite time-varying trajectory tracking error is bounded, and the bound depends on the output gain of the reference trajectory. □
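To make the structure of the scheme concrete, the sketch below simulates an MFAC-style bipartite tracking loop. Since Equations (4)–(6) are not reproduced in this text, the update laws follow the form commonly used in the MFAC literature (projection-type PPD estimation, a reset rule, and an incremental control law driven by the signed neighbor error); they are an assumption rather than the paper's exact equations. The four-agent fixed topology, the plant `b * u + 0.2 * sin(u)`, and all parameter values (`eta`, `mu`, `rho`, `lam`, `eps`, `phi0`) are hypothetical.

```python
import numpy as np


def dmfabct_step(y, y_prev, u_prev, du_prev, phi_hat, A, d, s, y_d,
                 eta=0.5, mu=1.0, rho=0.5, lam=1.0, eps=1e-4, phi0=1.0):
    """One assumed MFAC-style bipartite tracking update for all agents."""
    # PPD estimation from each agent's own I/O increments only.
    dy = y - y_prev
    phi_hat = phi_hat + eta * du_prev / (mu + du_prev**2) * (dy - phi_hat * du_prev)
    # Reset rule: fall back to the initial estimate when the estimate or the
    # last input increment is tiny, or when the estimate changes sign.
    reset = ((np.abs(phi_hat) <= eps) | (np.abs(du_prev) <= eps)
             | (np.sign(phi_hat) != np.sign(phi0)))
    phi_hat = np.where(reset, phi0, phi_hat)
    # Signed neighbor error: sum_j |a_ij| (sign(a_ij) y_j - y_i) + d_i (s_i y_d - y_i).
    xi = A @ y - np.abs(A).sum(axis=1) * y + d * (s * y_d - y)
    # Incremental control law using only the estimate and the neighbor error.
    u = u_prev + rho * phi_hat / (lam + phi_hat**2) * xi
    return u, phi_hat


def simulate(steps=2000, y_d=1.0):
    # Hypothetical structurally balanced topology: agents 0, 1 form one
    # alliance, agents 2, 3 the other; row i lists whom agent i listens to.
    A = np.array([[0, 0, 0, -1],
                  [1, 0, 0, 0],
                  [0, -1, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
    d = np.array([1.0, 0.0, 1.0, 0.0])    # only agents 0 and 2 see the leader
    s = np.array([1.0, 1.0, -1.0, -1.0])  # alliance signs
    b = np.array([1.0, 0.8, 1.2, 0.9])    # heterogeneous gains, unknown to the controller

    def plant(u):
        # Used only to generate I/O data, as in the paper's simulations.
        return b * u + 0.2 * np.sin(u)

    n = len(b)
    u_prev = np.zeros(n)
    du_prev = np.zeros(n)
    y_prev = plant(u_prev)
    y = y_prev.copy()
    phi_hat = np.ones(n)
    for _ in range(steps):
        u, phi_hat = dmfabct_step(y, y_prev, u_prev, du_prev, phi_hat, A, d, s, y_d)
        y_prev, du_prev, u_prev = y, u - u_prev, u
        y = plant(u)
    return y, s * y_d


final_outputs, targets = simulate()
```

In this sketch, agents 1 and 3 never receive the reference directly, yet they settle at the reference with the sign of their alliance, which is the behavior the convergence analysis above describes. A switching topology could be imitated by passing a different `A` and `d` at each step.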

4. Simulation

In order to illustrate the efficiency of the proposed bipartite consensus tracking algorithm, three numerical simulations with seven follower agents are performed, where agents are governed by It can be seen that each agent has its own dynamic model, so the considered MASs are heterogeneous. Furthermore, it is noteworthy that the above dynamic models are only used to produce the I/O data for the MASs, while the distributed DMFABCT algorithm does not use any model information. During the design of this algorithm, the dynamics of MASs are all unknown. The communication topology of the considered MASs is shown in Figure 1. The virtual leader is denoted by vertex 0 and the followers are distributed into two alliances in each topology. Moreover, in Figure 1, the black solid lines express the cooperative relationships among agents, and the competitive relationships are denoted by dotted lines. Note that only a subset of agents can directly receive information from the leader. Moreover, information among agents is transmitted only along the arrows and the direction is fixed. Although the other agents cannot directly get the commands from the virtual leader, all of the communication graphs satisfy Assumption 3, so the virtual leader can intervene in the two competitive alliances. As the matrices above show, the reciprocal of the greatest diagonal entry of is 0.5 for . In order to satisfy the convergence condition for all in Theorem 2, we choose the controller parameters as for each simulation and the other parameters are selected as , , , and .
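The parameter choice above relies on a bound given by the reciprocal of the greatest diagonal entry of a topology matrix. The sketch below shows one way such a check might be coded, under the assumption that the relevant matrix is the (sign-normalized) Laplacian plus the leader-pinning matrix and that the closed-loop error matrix has the form I − gM for a scalar gain g. The helper names and the example matrix `M` are hypothetical.

```python
import numpy as np


def max_uniform_gain(M):
    """Reciprocal of the greatest diagonal entry of M (assumed gain bound)."""
    return 1.0 / np.max(np.diag(M))


def is_substochastic_positive_diag(P, tol=1e-12):
    """Nonnegative entries, row sums <= 1 with at least one row < 1,
    and strictly positive diagonal. Irreducibility is not checked here."""
    P = np.asarray(P, dtype=float)
    row_sums = P.sum(axis=1)
    return bool(np.all(P >= -tol)
                and np.all(row_sums <= 1 + tol)
                and np.any(row_sums < 1 - tol)
                and np.all(np.diag(P) > 0))


# Hypothetical 4-agent example: each agent has one neighbor, and agents 0
# and 2 are additionally pinned to the virtual leader (diagonal entry 2).
M = np.array([[2, 0, 0, -1],
              [-1, 1, 0, 0],
              [0, -1, 2, 0],
              [0, 0, -1, 1]], dtype=float)

bound = max_uniform_gain(M)                                    # 0.5 for this example
ok = is_substochastic_positive_diag(np.eye(4) - 0.35 * M)      # gain below the bound
bad = is_substochastic_positive_diag(np.eye(4) - 0.60 * M)     # gain above the bound
```

With a gain above the bound, a diagonal entry of I − gM becomes negative, so the matrix is no longer substochastic and the contraction argument of Step 3 no longer applies.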

Figure 1

Communication topology among agents.

4.1. Fixed Trajectory Tracking Example

In order to obtain a clear result of this simulation, a piecewise function and the desired reference trajectory are given below: Initial conditions are chosen as , for all agents and , , , , , , in this simulation. The simulation results of the bipartite tracking performance, tracking errors, and PPD estimation of each agent are shown in Figure 2, Figure 3 and Figure 4, respectively.

Figure 2

Tracking performance of each agent (example 1).

Figure 3

Tracking errors of each agent (example 1).

Figure 4

Pseudo partial derivative (PPD) estimation of each agent (example 1).

From Figure 2, Figure 3 and Figure 4 it can be seen that the outputs of the followers deviate strongly from the leader initially, but the bipartite tracking errors decrease rapidly and bipartite tracking is realized after a few steps. For example, in Figure 2, the value of the trajectory changes from 10 to 20 at and several agents exchange their groups at the same time, yet a new bipartite consensus is achieved after only about 100 steps, which Figure 3 also reveals. Furthermore, Figure 4 shows that changes in the topology and in the desired trajectory affect the estimated PPD values of each agent, but the estimates quickly settle to stable values, which indicates that the proposed DMFABCT has good robustness.

4.2. Time-Varying Trajectory Tracking Example

In this example, bipartite consensus tracking of a time-varying trajectory is discussed, and the desired trajectory is where is the output gain rate and the time-varying topologies are governed by where the initial data of , , dynamics of each agent, and other parameters were defined in the beginning of this section. The bipartite consensus tracking performance of this example and the tracking errors of each agent are presented in Figure 5, which shows that the DMFABCT scheme reduces the errors dramatically. Although the bipartite tracking errors cannot be removed completely, they converge to a small bound, which is demonstrated in Figure 6 and Figure 7. Compared with the desired output data of the agents, the maximum distortion rate obtained from Figure 7 is 0.084%. This result demonstrates that MASs with switching topologies can also perform bipartite time-varying tracking tasks. From Figure 8, we arrive at the same conclusion: the MASs can adjust the PPD values to adapt to environmental changes and thereby obtain a high fault tolerance.

Figure 5

Tracking performance of each agent (example 2).

Figure 6

Tracking errors of each agent (example 2).

Figure 7

Tracking error rate of each agent at (example 2).

Figure 8

PPD estimation of each agent (example 2).

Comparing the tracking performance for the different trajectories in Figure 3 and Figure 6, we can conclude that fixed trajectory tracking performs better than time-varying trajectory tracking, which is consistent with the theoretical analysis in Section 3. In addition, in order to further analyze the factors influencing the error for the time-varying trajectory, we change the output gain rate of the desired trajectory from 500 to 4000 and analyze the tracking performance. From Figure 9, we can see that the error rates of all agents decrease when the value of increases. The error rates of MASs at , , and are shown in Figure 7, Figure 10 and Figure 11, respectively. Although the biggest error rate of MASs at is about 0.418%, it bounds the error rates of all agents, which means that the errors of MASs are also bounded. Furthermore, the error rates of each agent shown in Figure 11 are close to the origin, which further supports Theorem 2. Meanwhile, we can conclude that MASs are stable under the proposed DMFABCT scheme and that the tracking errors depend on the output gain of the reference trajectory.

Figure 9

Tracking error rate of each agent at (example 2).

Figure 10

Tracking error rate of each agent at (example 2).

Figure 11

Tracking error rate of each agent at (example 2).

4.3. Realistic DC Linear Motors Example

In this case, we use seven permanent magnet DC linear motors to perform fixed and time-varying trajectory bipartite consensus tracking tasks. The realistic dynamics of the DC linear motor are investigated in [37,40] and have been modeled as below: where is continuous time (s), is the position (m), is the speed (m/s), is the combined mass of translator and load, is the developed force (N), is the friction force (N), and is the ripple force (N). The friction and ripple forces have been identified as: where is the minimum level of Coulomb friction and is the level of static friction, and are lubricant and load parameters, respectively. is an additional empirical parameter. In this example, these parameters are selected as: , , , , , , , . The desired velocity is given as Using the Euler formula to discretize the above model and selecting the sampling time as , we have . In this case, random noise is introduced in the output measurement data of each DC motor. Moreover, we define the bound of the noise as . Here, we use the same parameters and communication topology as in example 2 to perform the simulation. The fixed trajectory bipartite consensus tracking performance of the seven DC motors is shown in Figure 12 and the other tracking task is presented in Figure 13. From the two simulation results, we observe that several agents have changed their alliance, but the results of the two different bipartite consensus tracking tasks show that the tracking errors of MASs can be reduced, which further supports the effectiveness and applicability of the designed DMFABCT.
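The Euler discretization mentioned above can be sketched as follows. The exact friction and ripple expressions and the parameter values are missing from this text, so the code assumes a Stribeck-type friction law built from the quantities the text names (Coulomb level `f_c`, static level `f_s`, lubricant and load parameters `v_s` and `f_v`, empirical exponent `delta`) and a single-harmonic position-dependent ripple. All numerical values, including the sampling time `h`, are hypothetical placeholders.

```python
import math


def motor_step(x, v, u, h=0.001, m=1.0, f_c=10.0, f_s=20.0, f_v=10.0,
               v_s=0.1, delta=1.0, ripple_amp=8.5, ripple_freq=314.0):
    """One forward-Euler step of an assumed DC linear motor model.

    x: position (m), v: speed (m/s), u: developed force (N).
    Continuous model assumed: dx/dt = v, dv/dt = (u - friction - ripple) / m.
    """
    sgn = (v > 0) - (v < 0)
    # Assumed Stribeck-type friction: Coulomb + decaying static part + viscous part.
    friction = (f_c + (f_s - f_c) * math.exp(-(abs(v) / v_s) ** delta)
                + f_v * abs(v)) * sgn
    # Assumed single-harmonic ripple force depending on position.
    ripple = ripple_amp * math.sin(ripple_freq * x)
    x_next = x + h * v
    v_next = v + h * (u - friction - ripple) / m
    return x_next, v_next


# Starting at rest with a constant force of 100 N (hypothetical input).
x1, v1 = motor_step(0.0, 0.0, 100.0)
```

In a model-free setting such a function would only generate the velocity measurements fed to the controller; the controller itself would never see `f_c`, `f_s`, or the ripple parameters.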

Figure 12

Tracking errors of each agent (example 3).

Figure 13

Tracking errors of each agent (example 4).

As shown above, the proposed DMFABCT scheme is correct and effective.

5. Conclusions

In this work, a data-driven bipartite consensus tracking scheme has been proposed for unknown nonlinear discrete-time multi-agent systems with switching topologies, and a compact-form linearization model has been established. The algorithm ensures that all agents can track fixed and time-varying desired trajectories and realize bipartite tracking. Compared with model-based control algorithms, one of the main advantages of our method is that it does not need the agents' dynamics and requires only input–output data. Moreover, both cooperative and competitive relationships among agents are considered, and the convergence and stability of the algorithm are proven by rigorous mathematical analysis. Meanwhile, simulations of the bipartite consensus tracking algorithm have been presented to validate its effectiveness. In future work, we will consider the bipartite consensus problem for multi-input multi-output multi-agent systems with delays and disturbances.

5 in total

1. 3-D Learning-Enhanced Adaptive ILC for Iteration-Varying Formation Tasks.

Authors: Yu Hui; Ronghu Chi; Biao Huang; Zhongsheng Hou
Journal: IEEE Trans Neural Netw Learn Syst Date: 2019-03-15 Impact factor: 10.451

2. Data-Driven Multiagent Systems Consensus Tracking Using Model Free Adaptive Control.

Authors: Xuhui Bu; Zhongsheng Hou; Hongwei Zhang
Journal: IEEE Trans Neural Netw Learn Syst Date: 2017-03-14 Impact factor: 10.451

3. Data-Driven Distributed Optimal Consensus Control for Unknown Multiagent Systems With Input-Delay.

Authors: Huaipin Zhang; Dong Yue; Chunxia Dou; Wei Zhao; Xiangpeng Xie
Journal: IEEE Trans Cybern Date: 2018-04-09 Impact factor: 11.448

4. Neural Networks-Based Adaptive Finite-Time Fault-Tolerant Control for a Class of Strict-Feedback Switched Nonlinear Systems.

Authors: Lei Liu; Yan-Jun Liu; Shaocheng Tong
Journal: IEEE Trans Cybern Date: 2018-05-04 Impact factor: 11.448

5. Data-driven model reference control of MIMO vertical tank systems with model-free VRFT and Q-Learning.

Authors: Mircea-Bogdan Radac; Radu-Emil Precup; Raul-Cristian Roman
Journal: ISA Trans Date: 2018-01-08 Impact factor: 5.468
