Machine Learning Techniques and Analytics for Cloud Security. Group of authors

Hierarchical clustering is the second algorithm. It groups similar objects into clusters. The algorithm starts by treating every observation as an individual cluster and then repeats the following steps:

       (1) First, identify the two clusters that are closest together.

       (2) Then, merge these two most similar clusters. This process continues until all the clusters have been merged into one [24].
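
      The two steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: the function name agglomerative is hypothetical, and the text does not say how the distance between two clusters is measured, so single linkage (the smallest pairwise Euclidean distance) is assumed here.

```python
from itertools import combinations
from math import dist


def agglomerative(points, linkage=min):
    """Repeatedly merge the two closest clusters; return the merge history.

    points  -- list of equal-length coordinate tuples
    linkage -- reduces pairwise distances to one cluster distance
               (min = single linkage, an assumption; max = complete linkage)
    """
    # Every observation starts as its own cluster.
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        def cluster_distance(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            return linkage(dist(p, q) for p in a for q in b)

        # Step (1): find the two clusters that are closest together.
        i, j = min(combinations(range(len(clusters)), 2), key=cluster_distance)
        # Step (2): merge them and record which clusters were combined.
        history.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history
```

      Calling agglomerative([(0.0,), (1.0,), (5.0,), (7.0,), (20.0,)]) returns one entry per merge, so n observations always produce n - 1 merges. Passing linkage=max switches the sketch to complete linkage.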

      Fuzzy c-means clustering is the third and last algorithm. Its concept is very similar to that of k-means clustering. The algorithm is as follows:

       (1) First, choose the number of clusters.

       (2) Then, randomly assign to each data point a coefficient for its membership in each cluster.

       (3) Repeat the following two steps until the algorithm has converged: (i) compute the centroid of each cluster; (ii) for every data point, compute its coefficient of being in each cluster.
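
      A minimal Python sketch of these steps is given below. It is an illustration under stated assumptions rather than the authors' code: the function name fuzzy_c_means, the fuzzifier m = 2, the convergence tolerance, and the fixed random seed are all choices made here, and the membership update is the conventional fuzzy c-means formula, which the text does not spell out.

```python
import random
from math import dist


def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=300, seed=0):
    """Return (centroids, memberships) for c fuzzy clusters.

    points -- list of equal-length coordinate tuples
    m      -- fuzzifier (> 1); m = 2 is a common default, assumed here
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Step (2): random membership coefficients, normalised per data point.
    u = []
    for _ in points:
        row = [rng.random() for _ in range(c)]
        total = sum(row)
        u.append([w / total for w in row])

    for _ in range(max_iter):
        # Step (3)(i): each centroid is the mean of all points weighted by u^m.
        centroids = []
        for k in range(c):
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            centroids.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ))
        # Step (3)(ii): recompute each point's coefficient for each cluster.
        new_u = []
        for p in points:
            d = [max(dist(p, ck), 1e-12) for ck in centroids]  # avoid division by zero
            new_u.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Converged when no coefficient changes by more than tol.
        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centroids, u
```

      Unlike k-means, every data point keeps a coefficient for every cluster, and the coefficients of one point sum to 1. Taking the largest coefficient per point gives a hard assignment comparable to a k-means label.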

      2.3 Result

      The Result section consists of a description of the datasets, an analysis of the results, and a validation of the results.

Schematic illustration of flowchart of the methodology.

      2.3.1 Description of Datasets

      Influenza sequences (glycan dataset) were taken from the National Center for Biotechnology Information (NCBI). First, the Basic Local Alignment Search Tool (BLAST) was used to search the H1N1-infected human datasets of Influenza A/447/08 at Oklahoma, Influenza A/1138/08 at Oklahoma, and Influenza A/447/08 at Oklahoma, and the non-infected (normal) human dataset of Influenza A/California/04/2009-4C. The H1N1 dataset contains glycan data from Oklahoma City, and the normal human dataset contains glycan data from California City. The dataset consists of 442 different glycans, with linkers such as sp0, sp8, sp9, and sp12. The columns of the dataset give the glycan number, the glycan structure, the relative fluorescence units (RFU), the standard deviation (STDEV), and the standard error of the mean (SEM).
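
      To make the column layout concrete, one row of such a dataset could be modelled as below. This is only a sketch: the class and field names are hypothetical, the example structure string is illustrative, and it assumes that RFU is the mean over replicate spots and that SEM equals STDEV divided by the square root of the number of replicates (the usual definition).

```python
import statistics
from dataclasses import dataclass
from math import sqrt


@dataclass
class GlycanRecord:
    glycan_number: int   # row identifier of the glycan
    structure: str       # glycan structure, including its linker (e.g. "...-Sp8")
    rfu: float           # relative fluorescence units (assumed: mean of replicates)
    stdev: float         # sample standard deviation of the replicates
    sem: float           # standard error of the mean = stdev / sqrt(n)


def summarise(glycan_number, structure, replicates):
    """Build one record from raw replicate readings (hypothetical helper)."""
    sd = statistics.stdev(replicates)
    return GlycanRecord(
        glycan_number,
        structure,
        statistics.mean(replicates),
        sd,
        sd / sqrt(len(replicates)),
    )


# Hypothetical readings for one glycan with three replicates.
record = summarise(1, "Neu5Aca2-3Galb1-4GlcNAcb-Sp8", [100.0, 110.0, 120.0])
```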

      2.3.2 Analysis of Result

Schematic illustration of K-means cluster analysis of Influenza A (H1N1) non-infected human.

      Figure 2.3 K-means cluster analysis of Influenza A (H1N1) infected human.

      Figure 2.4 K-means cluster analysis of Influenza A (H1N1) infected human.

Schematic illustration of K-means cluster analysis of Influenza A (H1N1) infected human.

      2.3.3 Validation of Results

       2.3.3.1 T-Test (Statistical Validation)

      The t-test was applied as a statistical validation to compare the means of the two samples (infected and normal), even when they contain different numbers of glycans. The t-test is carried out in the following steps:

       a) List the H1N1-infected datasets as sample 1.

       b) List the normal dataset as sample 2.

       c) Record the number of replicates for each sample (in the dataset, n = 3, so the number of replicates for sample 1 is n1 = 3 and for sample 2 is n2 = 3).

       d) Compute the mean of each sample (x̄1, x̄2), where mean = total/n.

       e) Compute the standard deviation (σ) of each sample (σ1, σ2), where σ² = ∑d²/(n − 1) and d is the deviation of each replicate from its sample mean.

       f) Compute the variance of the difference between the two means, σb², where σb² = σ1²/n1 + σ2²/n2.

       g) Compute σb (the square root of σb²).

       h) Compute the p value as follows:
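
      Steps a) to g) can be sketched in Python as follows. The final formula for step h) is not available in this text, so the sketch stops at the conventional test statistic t = (x̄1 − x̄2)/σb, which is an assumption here; the p value would then be read from Student's t distribution (from a table or a statistics library). The function name and the replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(sample1, sample2):
    """Return (t, sigma_b) for two samples, following steps a) to g)."""
    n1, n2 = len(sample1), len(sample2)                 # step c: replicates
    m1, m2 = mean(sample1), mean(sample2)               # step d: sample means
    v1, v2 = variance(sample1), variance(sample2)       # step e: sum(d^2)/(n - 1)
    var_diff = v1 / n1 + v2 / n2                        # step f: variance of the difference
    sigma_b = sqrt(var_diff)                            # step g: its square root
    t = (m1 - m2) / sigma_b                             # assumed statistic for step h
    return t, sigma_b


# Hypothetical replicate readings (n1 = n2 = 3), not values from the dataset.
infected = [12.0, 15.0, 18.0]
normal = [5.0, 6.0, 7.0]
t, sigma_b = t_statistic(infected, normal)
```

      Because the variance of the difference divides each sample variance by its own n, the same code works when the two samples have different numbers of observations.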

Schematic illustration of hierarchical cluster analysis of Influenza A (H1N1) infected human. Schematic illustration of fuzzy c-means cluster analysis of Influenza A (H1N1) infected human.

       2.3.3.2 Statistical Validation

