where hw is a parametric function. This transition function, which has been successfully used in recursive NNs, is not affected by the positions and the number of the children. In the following, Eq. (5.76) is referred to as the nonpositional form, and Eq. (5.74) is called the positional form. To implement the GNN model, we need (i) a method to solve Eq. (5.74); (ii) a learning algorithm to adapt fw and gw using examples from the training dataset; and (iii) an implementation of fw and gw. These steps will be considered in the following paragraphs.
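For reference, the two forms being contrasted, and the compact system they induce, can be written as below. This is a reconstruction following the standard GNN formulation that this section tracks; the exact symbols (ne[n] for the neighbors of node n, co[n] for the edges incident on n, l for the labels) are an assumption based on that formulation:

\[
x_n = f_w\big(l_n,\; l_{co[n]},\; x_{ne[n]},\; l_{ne[n]}\big) \qquad \text{(positional form, Eq. (5.74))}
\]
\[
x = F_w(x, l), \qquad o = G_w(x, l_N) \qquad \text{(compact form, Eq. (5.75))}
\]
\[
x_n = \sum_{u \in ne[n]} h_w\big(l_n,\; l_{(n,u)},\; x_u,\; l_u\big) \qquad \text{(nonpositional form, Eq. (5.76))}
\]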
Computation of the state: Banach’s fixed‐point theorem not only ensures the existence and the uniqueness of the solution of Eq. (5.74), but it also suggests the following classic iterative scheme for computing the state:

x(t + 1) = Fw(x(t), l)   (5.77)
where x(t) denotes the t‐th iteration of x. The dynamical system (Eq. (5.77)) converges exponentially fast to the solution of Eq. (5.75) for any initial value x(0). We can, therefore, think of x(t) as the state that is updated by the transition function Fw. In fact, Eq. (5.77) implements the Jacobi iterative method for solving nonlinear equations [63]. Thus, the outputs and the states can be computed by iterating

xn(t + 1) = fw(ln, lco[n], xne[n](t), lne[n]),   on(t) = gw(xn(t), ln),   n ∈ N   (5.78)
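A minimal NumPy sketch of this Jacobi-style iteration is given below, assuming fw and gw are supplied as callables and, for brevity, omitting edge labels; the stopping rule, tolerance, and state dimension are illustrative choices rather than prescriptions from the text:

```python
import numpy as np

def compute_states(neighbors, labels, f_w, g_w, state_dim, tol=1e-6, max_iter=200):
    """Jacobi iteration of Eq. (5.77)/(5.78): synchronously update every node state
    until the fixed point is (approximately) reached, then emit the node outputs."""
    n_nodes = len(labels)
    x = np.zeros((n_nodes, state_dim))          # x(0); any initialization converges
    for _ in range(max_iter):
        x_new = np.stack([
            f_w(labels[n], x[neighbors[n]], labels[neighbors[n]])   # uses only ne[n]
            for n in range(n_nodes)
        ])
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:                           # guaranteed by Fw being a contraction map
            break
    o = np.stack([g_w(x[n], labels[n]) for n in range(n_nodes)])    # on = gw(xn, ln)
    return x, o
```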
Note that the computation described in Eq. (5.78) can be interpreted as the representation of a network consisting of units that compute fw and gw. Such a network is called an encoding network, following an analogous terminology used for the recursive neural network model [64]. In order to build the encoding network, each node of the graph is replaced by a unit computing the function fw (see Figure 5.10). Each unit stores the current state xn(t) of node n, and when activated it calculates the state xn(t + 1) using the node label and the information stored in the neighborhood. The simultaneous and repeated activation of the units produces the behavior described in Eq. (5.78). The output of node n is produced by another unit, which implements gw. When fw and gw are implemented by FNNs, the encoding network turns out to be a recurrent neural network where the connections between the neurons can be divided into internal and external connections. The internal connectivity is determined by the neural network architecture used to implement the unit. The external connectivity depends on the edges of the processed graph.
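As an illustration of such units, here is a hedged sketch in which hw (and hence the nonpositional fw of Eq. (5.76)) and gw are one-hidden-layer FNNs; the layer sizes, the tanh nonlinearity, and the random initialization are assumptions made for the example, and edge labels are again omitted. The FNN weights are the internal connectivity, while the neighbor states and labels passed to f_w carry the external connectivity:

```python
import numpy as np

rng = np.random.default_rng(0)
LABEL_DIM, STATE_DIM, HIDDEN, OUT_DIM = 4, 8, 16, 2

# Internal connectivity: the weights of the FNNs realizing the units.
W1 = rng.normal(scale=0.1, size=(2 * LABEL_DIM + STATE_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, STATE_DIM))
V1 = rng.normal(scale=0.1, size=(STATE_DIM + LABEL_DIM, HIDDEN))
V2 = rng.normal(scale=0.1, size=(HIDDEN, OUT_DIM))

def h_w(l_n, x_u, l_u):
    """Contribution of a single neighbor u to the new state of node n."""
    return np.tanh(np.concatenate([l_n, x_u, l_u]) @ W1) @ W2

def f_w(l_n, x_neigh, l_neigh):
    """Nonpositional transition of Eq. (5.76): sum of h_w over the neighborhood ne[n]."""
    return sum((h_w(l_n, x_u, l_u) for x_u, l_u in zip(x_neigh, l_neigh)),
               start=np.zeros(STATE_DIM))

def g_w(x_n, l_n):
    """Output unit of node n."""
    return np.tanh(np.concatenate([x_n, l_n]) @ V1) @ V2
```

With these definitions, the compute_states sketch above can be called as compute_states(neighbors, labels, f_w, g_w, STATE_DIM); the small weight scale used here is only a stand-in for the contraction property of Fw that the text assumes.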
The learning algorithm: Learning in GNNs consists of estimating the parameter w such that ϕw approximates the data in the learning dataset of Eq. (5.73), where qi is the number of supervised nodes in Gi. For graph‐focused tasks, one special node is used for the target (qi = 1 holds), whereas for node‐focused tasks, in principle, the supervision can be performed on every node. The learning task can be posed as the minimization of a quadratic cost function.
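In the standard GNN formulation that this section follows, that cost takes the form below; the symbols p (number of graphs in the learning set), ni,j (the j‐th supervised node of Gi), and ti,j (its target) are assumptions based on that formulation, and the equation number is not shown in this excerpt:

\[
e_w = \sum_{i=1}^{p} \sum_{j=1}^{q_i} \big( t_{i,j} - \varphi_w(G_i, n_{i,j}) \big)^2
\]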
Figure 5.10 Graph (on the top, left), the corresponding encoding network (top right), and the network obtained by unfolding the encoding network (at the bottom). The nodes (the circles) of the graph are replaced, in the encoding network, by units computing fw and gw (the squares). When fw and gw are implemented by FNNs, the encoding network is a recurrent neural network. In the unfolding network, each layer corresponds to a time instant and contains a copy of all the units of the encoding network. Connections between layers depend on encoding network connectivity.
The learning algorithm is based on a gradient‐descent strategy and is composed of the following steps:
1 The states xn(t) are iteratively updated by Eq. (5.78) until at time T they approach the fixed‐point solution of Eq. (5.75): x(T) ≈ x.
2 The gradient ∂ew(T)/∂w is computed.
3 The weights w are updated according to the gradient computed in Step 2.
When it comes to Step 1, note that the hypothesis that Fw is a contraction map ensures convergence to the fixed point. Step 3 is carried out within the traditional framework of gradient descent. We will show in the following that Step 2 can be carried out in a very efficient way by exploiting the diffusion process that takes place in GNNs. This diffusion process is very much related to the process that occurs in recurrent neural networks, for which the gradient computation is based on the BPTT algorithm (see Chapter 3). In this case, the encoding network is unfolded from time T back to an initial time t0. The unfolding produces the layered network shown in Figure 5.10. Each layer corresponds to a time instant and contains a copy of all the units fw of the encoding network. The units of two consecutive layers are connected following graph connectivity. The last layer, corresponding to time T, also includes the units gw and computes the output of the network.
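This unfolding-plus-backpropagation view can be reproduced almost literally with automatic differentiation. Below is a minimal PyTorch sketch in which the T synchronous state updates are kept on the autograd tape, so that loss.backward() performs exactly this backpropagation through time over the unfolded encoding network; the module sizes, the fixed T (standing in for the convergence test of Step 1), the toy graph, and the optimizer are illustrative assumptions rather than choices made in the text:

```python
import torch
import torch.nn as nn

class GNN(nn.Module):
    def __init__(self, label_dim, state_dim, out_dim, hidden=16):
        super().__init__()
        self.state_dim = state_dim
        # f_w and g_w implemented as small FNNs (the units of the encoding network)
        self.f_w = nn.Sequential(nn.Linear(2 * label_dim + state_dim, hidden),
                                 nn.Tanh(), nn.Linear(hidden, state_dim))
        self.g_w = nn.Sequential(nn.Linear(label_dim + state_dim, hidden),
                                 nn.Tanh(), nn.Linear(hidden, out_dim))

    def forward(self, edge_index, labels, T=10):
        src, dst = edge_index                               # external connectivity: the graph edges
        x = torch.zeros(labels.size(0), self.state_dim)     # x(0)
        for _ in range(T):                                  # one unfolded layer per time step
            msg = self.f_w(torch.cat([labels[dst], x[src], labels[src]], dim=-1))
            x = torch.zeros_like(x).index_add(0, dst, msg)  # nonpositional sum over ne[n]
        return self.g_w(torch.cat([x, labels], dim=-1))     # outputs at the last layer (time T)

model = GNN(label_dim=4, state_dim=8, out_dim=2)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# One training step on a toy 3-node graph; every node is supervised (node-focused task).
edge_index = torch.tensor([[0, 1, 0, 2, 1, 2],              # source nodes
                           [1, 0, 2, 0, 2, 1]])             # destination nodes
labels = torch.randn(3, 4)
targets = torch.randn(3, 2)

opt.zero_grad()
out = model(edge_index, labels)        # Step 1: iterate the states for T steps
loss = ((out - targets) ** 2).sum()    # quadratic cost on the supervised nodes
loss.backward()                        # Step 2: gradient via BPTT on the unfolded network
opt.step()                             # Step 3: gradient-descent weight update
```

Note that this keeps all T intermediate states on the autograd tape, which is precisely the memory cost discussed next; the Almeida–Pineda approach avoids it by exploiting the fixed point.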
Backpropagation through time consists of carrying out the traditional backpropagation step (see Chapter 3) on the unfolded network to compute the gradient of the cost function at time T with respect to all the instances of fw and gw . Then, ∂ew(T)/∂w is obtained by summing the gradients of all instances. However, backpropagation through time requires storing the states of every instance of the units. When the graphs and T − t0 are large, the memory required may be considerable. On the other hand, in this case, a more efficient approach is possible, based on the Almeida–Pineda algorithm [65, 66]. Since Eq. (5.78) has reached a stable point x before the gradient computation, we can assume that x(t) = x holds for