Data Analytics in Bioinformatics. Group of authors. Read online. MREADZ.NET


	Training Data	1.0000000	Outstanding
	Test Data	0.9773333	Outstanding
	Index: 0.5: No Discriminant, 0.6–0.8: Can be considered accepted, 0.8–0.9: Excellent, >0.9: Outstanding

1.9 Neural Networks

The perceptron, an early form of the Artificial Neural Network (ANN), was introduced by Frank Rosenblatt in 1958 [91]. ANNs are inspired by biological neural networks: an ANN is a collection of connected nodes called artificial neurons. The original goal of the ANN was to solve problems the way the human brain does [92–93]. It does so by taking inputs, processing them, and calculating an output. Neural networks can learn by themselves: the system learns to perform tasks by considering examples, which is why it is also called a connectionist system. The outputs a neural network generates are not limited to the input attributes provided by the user. It does not require a database; rather, the information is stored in the network itself. The general form of an artificial neural network is shown in Figure 1.17 and its detailed version in Figure 1.18. Neural networks have applications in various fields, such as:

Schematic illustration of the general form of an artificial neural network.

Figure 1.17 Neural network (general).

Schematic illustration of the detailed form of an artificial neural network.

Figure 1.18 Neural network (detailed).

Speech Recognition [94]

Signature Verification Application [95]

Human Face Recognition [96]

Character Recognition [97]

Natural Language Processing [98].

A basic execution procedure of a neural network [99] is presented in Figure 1.17. In general, it consists of three layers: the input layer, the hidden layer, and the output layer. Each layer consists of neurons, and these neurons are connected to one another. In this figure, the input layer contains the health parameters. Depending on the inputs received, the hidden processing layers work to provide an output. Here, only two hidden processing layers are considered, but there could be any number, depending on the purpose of the machine. The output of each neuron acts as an input for the neurons in the next layer, and this process continues until the output layer is reached. A detailed form of the neural network is explained briefly below with the help of Figure 1.18.
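The layer-by-layer flow described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the book: the sigmoid activation, the bias terms, the layer sizes, and all weight and input values are assumptions chosen only to show how each layer's output becomes the next layer's input.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1). Assumed activation."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute one layer: every neuron takes a weighted sum of all inputs."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs

def network_forward(inputs, layers):
    """Feed the inputs through each layer in turn, ending at the output layer."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 3 inputs -> 2 hidden -> 2 hidden -> 1 output.
# Each entry is (weights per neuron, bias per neuron); all values are made up.
layers = [
    ([[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]], [0.0, 0.1]),
    ([[0.7, -0.6], [-0.3, 0.8]], [0.05, -0.05]),
    ([[1.0, -1.0]], [0.0]),
]
health_parameters = [0.6, 0.1, 0.9]  # made-up, already scaled inputs
prediction = network_forward(health_parameters, layers)
print(prediction)  # a single value between 0 and 1
```

The loop in `network_forward` stops after the last entry in `layers`, which mirrors the point that propagation ends at the output layer.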

In Figure 1.18, x₁, x₂, x₃, …, x_n are the inputs, and the weights they carry are represented by w₁, …, w_n. They are processed by the function F, which sums the weighted inputs over all n terms. After processing, the output is transmitted to the next neuron as an input. The AUC obtained after implementing the neural network on the heart disease dataset is presented in Table 1.8. It shows that the model performs excellently on the training dataset and outstandingly on the testing dataset. The implementation was done in Python (Google Colab).
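As a rough sketch of the single-neuron computation described above, the function below forms the weighted sum of the inputs and passes it through F. The step activation and the example numbers are assumptions for illustration; the source does not say which function F was used.

```python
def neuron_output(inputs, weights, activation):
    """Return F(w1*x1 + w2*x2 + ... + wn*xn) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

def step(z):
    """Hypothetical choice of F: output 1 when the weighted sum is positive."""
    return 1 if z > 0 else 0

x = [1.0, 0.5, -2.0]  # made-up inputs x1..x3
w = [0.4, 0.2, 0.1]   # made-up weights w1..w3
# weighted sum = 0.4*1.0 + 0.2*0.5 + 0.1*(-2.0) = 0.3
print(neuron_output(x, w, step))  # prints 1
```

Passing the activation as an argument keeps the summation separate from the choice of F, so a sigmoid or any other function can be swapped in without changing the neuron itself.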

Table 1.8 AUC: Neural network.

Parameter	Data	Value	Result
The area under the ROC Curve (AUC)	Training Data	0.8366730	Excellent
	Test Data	0.9415238	Outstanding
	Index: 0.5: No Discriminant, 0.6–0.8: Can be considered accepted, 0.8–0.9: Excellent, >0.9: Outstanding
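For readers unfamiliar with how an AUC value such as those in Table 1.8 arises, the sketch below computes it from true labels and predicted scores using the pairwise-ranking definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are made up, and this is not necessarily the routine used in the book's Colab notebook.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))

y_true = [0, 0, 1, 1]             # made-up class labels
y_score = [0.1, 0.4, 0.35, 0.8]   # made-up predicted scores
print(roc_auc(y_true, y_score))   # prints 0.75
```

The double loop is quadratic in the number of examples, which is fine for a small illustration; library routines typically use a sort-based method instead.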

Some additional points obtained from the implementation are presented below:

Neural Score: 83.78

Neural Test Score: 90.24

Accuracy: 0.9024390.
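The test score and the accuracy above appear to be the same quantity, expressed once as a percentage and once as a fraction. A minimal sketch of the accuracy calculation follows. The label lists are hypothetical: the source does not report the size of the test set, and the counts here (37 correct out of 41) were chosen only because that ratio happens to give the same figure.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical test set of 41 cases, 37 of them predicted correctly.
y_true = [1] * 20 + [0] * 21
y_pred = [1] * 18 + [0] * 2 + [0] * 19 + [1] * 2
acc = accuracy(y_true, y_pred)
print(round(acc, 7))        # prints 0.902439
print(round(acc * 100, 2))  # prints 90.24
```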

Some other types of neural networks are listed below for reference:

Multilayer Perceptron [100]

Convolutional Neural Network [101]

Recursive Neural Network [102]

Recurrent Neural Network [103]

Long Short-Term Memory [104]

Sequence to Sequence Model [105]

Shallow Neural Network [106].

1.10 Comparison of Numerical Interpretation

A summary of the AUC results of the supervised learning methods discussed above is given in Table 1.9 as a comparison of methods. The results indicate that Random Forest, K-Nearest Neighbor, Decision Tree, and Support Vector Classifier perform outstandingly on both the training and test data sets, whereas Logistic Regression and Neural Network perform outstandingly on the test data set only. This indicates that the Logistic Regression and Neural Network models need improvement on the training data set before the same level of accuracy is achieved.
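The comparison can be reproduced by applying the AUC index to each method's training (T1) and test (T2) values. The sketch below uses the values reported in this chapter; the neural network row is taken from Table 1.8. The handling of boundary cases is an assumption, since the index leaves them open: values of exactly 0.8 or 0.9 are labelled "Excellent", and anything below 0.6 is grouped with "No Discriminant".

```python
def interpret_auc(auc):
    """Map an AUC value to the chapter's index labels (boundary handling assumed)."""
    if auc > 0.9:
        return "Outstanding"
    if auc >= 0.8:
        return "Excellent"
    if auc >= 0.6:
        return "Can be considered accepted"
    return "No Discriminant"

# (training AUC, test AUC) per method, as reported in the chapter.
auc_results = {
    "Logistic Regression": (0.8374022, 0.9409523),
    "Random Forest": (1.0000000, 1.0000000),
    "K-Nearest Neighbor": (1.0000000, 1.0000000),
    "Decision Tree": (0.9588996, 0.9773333),
    "Support Vector Classifier": (1.0000000, 0.9773333),
    "Neural Network": (0.8366730, 0.9415238),  # values from Table 1.8
}

for method, (train_auc, test_auc) in auc_results.items():
    print(f"{method}: T1: {interpret_auc(train_auc)} T2: {interpret_auc(test_auc)}")
```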

Table 1.9 AUC: Comparison of numerical interpretations.

S. No.	Supervised Learning Method	AUC Training Data Value (T1)	AUC Test Data Value (T2)	Result
1	Logistic Regression	0.8374022	0.9409523	T1: Excellent T2: Outstanding
2	Random Forest	1.0000000	1.0000000	T1: Outstanding T2: Outstanding
3	K-Nearest Neighbor	1.0000000	1.0000000	T1: Outstanding T2: Outstanding
4	Decision Tree	0.9588996	0.9773333	T1: Outstanding T2: Outstanding
5	Support Vector Classifier	1.0000000	0.9773333	T1: Outstanding T2: Outstanding
6