Computational Analysis and Deep Learning for Medical Care
Table 1.5 shows its parameters.
Table 1.4 Various parameters of ZFNet.
Layer name | Input size | Filter size | Window size | # Filters | Stride | Padding | Output size | # Feature maps | # Connections |
Conv 1 | 224 × 224 | 7 × 7 | - | 96 | 2 | 0 | 110 × 110 | 96 | 14,208 |
Max-pooling 1 | 110 × 110 | - | 3 × 3 | - | 2 | 0 | 55 × 55 | 96 | 0 |
Conv 2 | 55 × 55 | 5 × 5 | - | 256 | 2 | 0 | 26 × 26 | 256 | 614,656 |
Max-pooling 2 | 26 × 26 | - | 3 × 3 | - | 2 | 0 | 13 × 13 | 256 | 0 |
Conv 3 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 885,120 |
Conv 4 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 1,327,488 |
Conv 5 | 13 × 13 | 3 × 3 | - | 256 | 1 | 1 | 13 × 13 | 256 | 884,992 |
Max-pooling 3 | 13 × 13 | - | 3 × 3 | - | 2 | 0 | 6 × 6 | 256 | 0 |
Fully connected 1 | 4,096 neurons | - | - | - | - | - | - | - | 37,752,832 |
Fully connected 2 | 4,096 neurons | - | - | - | - | - | - | - | 16,781,312 |
Fully connected 3 | 1,000 neurons | - | - | - | - | - | - | - | 4,097,000 |
Softmax | 1,000 classes | - | - | - | - | - | - | - | 62,357,608 (Total) |
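The connection counts in Table 1.4 can be cross-checked with the standard trainable-parameter formulas: for a convolutional layer, kernel width × kernel height × input channels × output channels, plus one bias per filter; for a fully connected layer, inputs × outputs plus one bias per output. A minimal sketch in Python (the layer list is transcribed from the table; the helper functions are ours, not from the book):

```python
# Trainable-parameter (connection) counts for ZFNet,
# cross-checked against Table 1.4.

def conv_params(k, in_ch, out_ch):
    """k x k kernels over in_ch input channels, out_ch filters, plus biases."""
    return k * k * in_ch * out_ch + out_ch

def fc_params(n_in, n_out):
    """Dense layer: one weight per input-output pair, plus biases."""
    return n_in * n_out + n_out

layers = {
    "conv1": conv_params(7, 3, 96),         # 14,208
    "conv2": conv_params(5, 96, 256),       # 614,656
    "conv3": conv_params(3, 256, 384),      # 885,120
    "conv4": conv_params(3, 384, 384),      # 1,327,488
    "conv5": conv_params(3, 384, 256),      # 884,992
    "fc1":   fc_params(6 * 6 * 256, 4096),  # 37,752,832 (input is the 6 x 6 x 256 map)
    "fc2":   fc_params(4096, 4096),         # 16,781,312
    "fc3":   fc_params(4096, 1000),         # 4,097,000
}

total = sum(layers.values())
print(total)  # 62357608, matching the table total
```

Pooling layers contribute nothing to the total because they have no weights, which is why their connection column reads 0.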
Figure 1.4 Architecture of VGG-16.
1.2.5 GoogLeNet
In 2014, Google [5] proposed the Inception network (GoogLeNet) for the detection and classification tasks of that year's ImageNet Challenge. The basic unit of this model is the “Inception cell”: a set of parallel convolutional layers with different filter sizes, which performs a series of convolutions at different scales and concatenates the results; the different filter sizes extract feature maps at different scales. To reduce the computational cost and the input channel depth, 1 × 1 convolutions are used. So that the branch outputs can be concatenated properly, max pooling with “same” padding is used, which preserves the spatial dimensions. In the literature, later versions such as Inception v2, v3, and v4, as well as Inception-ResNet, are defined. Figure 1.5 shows the Inception module and Figure 1.6 shows the architecture of GoogLeNet.
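The saving from the 1 × 1 bottleneck convolutions can be illustrated by counting multiply-accumulate operations for one Inception branch. The concrete sizes below (a 28 × 28 × 192 input, a 5 × 5 branch producing 32 maps, reduction to 16 channels) are illustrative choices, not figures from this chapter:

```python
# Multiply-accumulate (MAC) cost of one Inception branch, with and
# without a 1 x 1 "bottleneck" convolution in front of it.

def conv_macs(h, w, k, in_ch, out_ch):
    """MACs for a k x k convolution with 'same' padding on an h x w map."""
    return h * w * out_ch * k * k * in_ch

# Direct 5 x 5 convolution: 192 -> 32 channels.
direct = conv_macs(28, 28, 5, 192, 32)

# 1 x 1 reduction to 16 channels, then the same 5 x 5 convolution.
reduced = conv_macs(28, 28, 1, 192, 16) + conv_macs(28, 28, 5, 16, 32)

print(direct, reduced)  # 120422400 12443648
```

With these sizes the bottleneck cuts the branch cost by roughly a factor of ten, which is why every 3 × 3 and 5 × 5 branch in the Inception cell is preceded by a 1 × 1 reduction.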
Each image is resized so that the input to the network is a 224 × 224 × 3 image, and the mean is subtracted before feeding the training image to the network. The dataset contains 1,000 categories, with 1.2 million images for training, 100,000 for testing, and 50,000 for validation. GoogLeNet is 22 layers deep and uses nine Inception modules, with global average pooling instead of fully connected layers to go from 7 × 7 × 1,024 to 1 × 1 × 1,024, which, in turn, saves a huge number of parameters. It also includes auxiliary softmax output units to enforce regularization. It was trained on high-end GPUs within a week and achieved a top-5 error rate of 6.67%. GoogLeNet trains faster than VGG.
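The parameter saving from global average pooling can be made concrete: mapping the final 7 × 7 × 1,024 feature map to a 1,024-dimensional vector with a fully connected layer (the hypothetical alternative, not a layer GoogLeNet actually uses) would cost over 51 million weights, while averaging each channel costs none:

```python
# Parameter cost of a hypothetical dense layer flattening the final
# 7 x 7 x 1024 feature map to 1024 units, versus global average
# pooling, which simply averages each channel and has no weights.

h, w, ch = 7, 7, 1024

# Dense alternative: every input activation connects to every output
# unit, plus one bias per unit.
dense_params = (h * w * ch) * ch + ch

# Global average pooling: no trainable parameters at all.
gap_params = 0

print(dense_params)  # 51381248 weights avoided by pooling instead
```

This is the "huge number of parameters" the text refers to; the pooled 1 × 1 × 1,024 vector then feeds the final 1,000-way classifier directly.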