Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic

2.12 we have eight points, and we want to apply k‐means to create clusters for these points. Here is how we can do it:

1 Choose the number of clusters k.

2 Select k random points from the data as centroids.

3 Assign all the points to the closest cluster centroid.

4 Recompute the centroids of newly formed clusters.

5 Repeat steps 3 and 4 until a stopping criterion is met.

Three stopping criteria are commonly used for the k‐means algorithm:

1 Centroids of newly formed clusters do not change.

2 Points remain in the same cluster.

3 The maximum number of iterations is reached.
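The steps and stopping criteria above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation from [3]: the function name `k_means`, the parameters `max_iter` and `seed`, the rule for empty clusters, and the eight sample points are all assumptions made for the example.

```python
import numpy as np

def k_means(points, k, max_iter=100, seed=0):
    """Plain k-means following the five steps listed above (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for iteration in range(1, max_iter + 1):  # criterion 3: iteration cap
        # Step 3: assign every point to its closest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of its assigned points.
        # Assumption: an empty cluster simply keeps its previous centroid.
        new_centroids = np.array([
            points[new_labels == c].mean(axis=0) if np.any(new_labels == c) else centroids[c]
            for c in range(k)
        ])
        # Criterion 2 (assignments unchanged) or criterion 1 (centroids unchanged).
        converged = (np.array_equal(new_labels, labels)
                     or np.allclose(new_centroids, centroids))
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return centroids, labels, iteration

# Eight hypothetical 2D points, chosen only for illustration.
points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
                   [8.0, 8.0], [8.5, 9.0], [9.0, 8.0],
                   [1.0, 9.0], [1.5, 8.5]])
centroids, labels, n_iter = k_means(points, k=3)
print(centroids, labels, n_iter)
```

Because the initial centroids are random, different seeds can converge to different clusterings; running the algorithm from several initializations and keeping the best result is a common remedy.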

An example of k‐means clustering with k = 3 on a 2D dataset, obtained using [3], is shown in Figure 2.13.

2.1.9 Dimensionality Reduction

During the last decade, technology has advanced tremendously, and analytics and statistics have played major roles. These techniques produce enormous datasets that are usually composed of many variables. For instance, real‐world datasets for image processing, Internet search engines, text analysis, and so on usually have high dimensionality. To handle such data, the dimensionality needs to be reduced, with the requirement that the relevant information remain unchanged.

Figure 2.13 k = 3 means clustering on 2D dataset.

Source: Based on PulkitS01 [3], K‐Means implementation, GitHub, Inc. Available at [53], https://gist.github.com/PulkitS01/97c9920b1c913ba5e7e101d0e9030b0e.

Figure 2.14 Concept of data projection.

Dimensionality reduction [32–34] is a method of converting high‐dimensional variables into lower‐dimensional variables while preserving the relevant information in the variables. It is often used as a preprocessing step in classification methods or other tasks.

Linear dimensionality reduction linearly projects n‐dimensional data onto a k‐dimensional space, with k < n and often k ≪ n (Figure 2.14).

Design Example 2.2

Principal component analysis (PCA)

The algorithm successively generates principal components (PCs). The first PC is the projection direction that maximizes the variance of the projected data. The second PC is the projection direction that is orthogonal to the first PC and, subject to that constraint, maximizes the variance of the projected data. The procedure is repeated until k mutually orthogonal directions are obtained (Figure 2.15).

The projected position of a point on these directions gives its coordinates in the k‐dimensional reduced space.

Steps in PCA: (i) compute the covariance matrix Σ of the dataset S; (ii) calculate the eigenvalues and eigenvectors of Σ. The eigenvector with the largest eigenvalue λ₁ is the first PC, and the eigenvector with the kth largest eigenvalue λ_k is the kth PC. The ratio λ_k/∑_i λ_i is the proportion of variance captured by the kth PC.
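The two PCA steps can be illustrated with a short NumPy sketch. The function name `pca_eig` and the synthetic 3D data are assumptions made for this example. The data are centered first, which the covariance computation implies.

```python
import numpy as np

def pca_eig(data):
    """PCA via eigendecomposition of the covariance matrix (illustrative sketch)."""
    centered = data - data.mean(axis=0)
    # Step (i): covariance matrix of the dataset (rows are observations).
    cov = np.cov(centered, rowvar=False)
    # Step (ii): eigenvalues and eigenvectors; eigh suits symmetric matrices
    # and returns eigenvalues in ascending order, so sort them descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Proportion of variance captured by each PC: lambda_k / sum_i lambda_i.
    ratio = eigvals / eigvals.sum()
    return eigvals, eigvecs, ratio

# Hypothetical data: 200 samples in 3D, stretched most along the first axis.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])

eigvals, pcs, ratio = pca_eig(data)
# Coordinates in the reduced space spanned by the first two PCs.
coords = (data - data.mean(axis=0)) @ pcs[:, :2]
print(ratio)
```

The columns of `pcs` are the principal components, and the variance of the data projected onto the kth column equals the kth eigenvalue.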

Figure 2.15 Successive data projections.

The full set of PCs forms a new orthogonal basis for the feature space, whose axes are aligned with the directions of maximum variance of the original data. Projecting the original data onto the first k PCs gives a reduced‐dimensionality representation of the data. Transforming this projection back into the original space gives a reduced‐dimensionality reconstruction of the original data. The reconstruction will have some error, but it can be small and is often acceptable given the other benefits of dimensionality reduction. The dimension k is chosen such that ∑_{i=1}^{k} λ_i / ∑_{i=1}^{S} λ_i > β [%], where the denominator sums over all eigenvalues and β is a predetermined threshold.
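The selection rule and the reconstruction error can be checked numerically. In this sketch the helper `choose_k`, the convention of taking the smallest k that satisfies the threshold, the expression of β as a fraction rather than a percentage, and the synthetic 4D data are all assumptions made for the example.

```python
import numpy as np

def choose_k(eigvals, beta):
    """Smallest k whose cumulative variance share exceeds beta (a fraction in (0, 1)).

    Assumes eigvals is sorted in descending order.
    """
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    # Index of the first entry strictly greater than beta, converted to a count.
    k = int(np.searchsorted(cumulative, beta, side="right")) + 1
    return min(k, len(eigvals))

# Hypothetical data: 300 samples in 4D with very unequal variances per axis.
rng = np.random.default_rng(1)
data = rng.normal(size=(300, 4)) * np.array([6.0, 3.0, 0.5, 0.2])
mean = data.mean(axis=0)
centered = data - mean

# Eigendecomposition of the covariance matrix, sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

k = choose_k(eigvals, beta=0.95)
basis = eigvecs[:, :k]                   # first k PCs as columns
reduced = centered @ basis               # projection onto the first k PCs
reconstructed = reduced @ basis.T + mean # back in the original space

# Mean squared reconstruction error per sample.
error = np.mean(np.sum((data - reconstructed) ** 2, axis=1))
print(k, error, eigvals[k:].sum())
```

The printed error is close to the sum of the discarded eigenvalues (it differs only by the factor (N − 1)/N from the unbiased covariance estimate), which is why a small tail of eigenvalues means a small reconstruction error.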

2.2 ML Algorithm Analysis

2.2.1 Logistic Regression

In this section, we provide more details on the performance analysis [4] of the logistic regression introduced initially in Section 2.1.2. There, in Eq. (2.5), we provide an expression for the probability that an individual with dataset values X₁, X₂, …, X_p is in outcome g. That is, p_g = Pr(Y = g ∣ X). For this expression, we need to estimate the parameters β’s used in B’s. The likelihood for a sample of N observations is given by

(2.7) $l = \prod_{j=1}^{N} \prod_{g=1}^{G} \pi_{gj}^{y_{gj}}$

where π_gj = Pr(Y = g ∣ X_j) is the probability that the jth observation is in outcome g, and y_gj is one if the jth observation is in outcome g and zero otherwise. Using the fact that $\sum_{g=1}^{G} y_{gj} = 1$, the log likelihood, L, becomes

(2.8)
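As a numerical illustration of Eq. (2.7): taking its logarithm turns the double product into the double sum ∑_j ∑_g y_gj ln π_gj, in which only the terms with y_gj = 1 contribute. The sketch below evaluates both forms. The probability table `pi`, the observed outcomes, and the function name `log_likelihood` are hypothetical; in practice the π_gj would come from the fitted model of Eq. (2.5).

```python
import numpy as np

def log_likelihood(pi, y):
    """ln l = sum_j sum_g y_gj * ln(pi_gj) for G x N arrays pi and y."""
    # Only entries with y_gj = 1 contribute; masking also avoids 0 * log(0).
    return float(np.sum(np.log(pi[y == 1])))

# Hypothetical probabilities pi[g, j] for G = 3 outcomes and N = 4 observations.
# Each column sums to one.
pi = np.array([[0.7, 0.2, 0.1, 0.5],
               [0.2, 0.5, 0.3, 0.3],
               [0.1, 0.3, 0.6, 0.2]])

# Hypothetical observed outcome (0-based index g) for each observation j.
outcomes = np.array([0, 1, 2, 0])

# Indicator matrix: y[g, j] = 1 if observation j is in outcome g, else 0.
y = np.zeros_like(pi)
y[outcomes, np.arange(pi.shape[1])] = 1.0

likelihood = float(np.prod(pi ** y))  # Eq. (2.7)
L = log_likelihood(pi, y)             # logarithm of Eq. (2.7)
print(likelihood, L)
```

Working with the logarithm is the usual choice in practice, since a product of many probabilities underflows quickly as N grows.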