Outsmarting AI. Brennan Pursell
The catch: The neural network has no way of knowing whether its output house price is accurate or not. It needs to be “trained” on example data sets. You need to train it on good data: actual examples, in this case, of houses successfully sold at a given price.
The beauty is that when you enter into the network the output and its matching set of inputs, the network’s algorithms can adjust the parameters in the hidden layers automatically.
There are three key algorithms that make neural networks work. In them you will see that “deep learning” and “machine learning” really have a lot to do with training, but almost nothing to do with human learning.
Backpropagation makes the neural network adjust the weights in the hidden layers. Backpropagation is usually at the heart of “deep learning,” “machine learning,” and AI today. The algorithm was a breakthrough by Geoffrey Hinton and, independently, Yann LeCun, in the mid-1980s, although their work relied on research from the 1960s and ’70s. Backpropagation became an effective tool only recently, when available data and processing speeds both grew exponentially in the last decade.
Backpropagation starts at the output (the house price, in our example) and then works backward through the hidden layers toward the inputs. Say that the neural network predicted a price of $500,000 for a certain house, but you know that the actual price was $525,000. Backpropagation takes the error, the $25,000 gap between the predicted and the actual price, and adjusts the weights in each calculation in the hidden layers so that the neural network would come closer to the correct price based on the same initial inputs.
But that is just one example. If you train the network on many examples, then backpropagation averages all the adjustments that it needs to make, in order to optimize accurate performance across the entire data set.
The more you train it, the more you test its results, the better it gets.
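To make the idea concrete, here is a minimal sketch of backpropagation in plain Python. Everything in it is an assumption made for illustration and not taken from any real system: the four example houses, the two inputs (size and bedrooms), the two hidden nodes, the starting weights, and the function names. It starts at the output error, works backward through the hidden layer, and averages the adjustments across all examples.

```python
import math

# Hypothetical sales: (size in 1,000s of sq ft, bedrooms) -> price in $ millions.
EXAMPLES = [
    ((1.5, 3.0), 0.400),
    ((2.0, 4.0), 0.525),
    ((1.0, 2.0), 0.300),
    ((2.5, 4.0), 0.650),
]


def make_network():
    """Return a tiny network: 2 inputs -> 2 hidden nodes -> 1 output price."""
    # Arbitrary starting weights, chosen only for illustration.
    return {
        "w_hidden": [[0.3, -0.2], [0.1, 0.4]],
        "b_hidden": [0.0, 0.0],
        "w_out": [0.5, -0.3],
        "b_out": 0.0,
    }


def forward(net, inputs):
    """Run the inputs forward; return (hidden node values, predicted price)."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(net["w_hidden"], net["b_hidden"])
    ]
    price = sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]
    return hidden, price


def mean_squared_error(net, examples):
    """Average squared gap between predicted and actual prices."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in examples) / len(examples)


def train_step(net, examples, learning_rate=0.02):
    """One round of backpropagation, averaged over all examples."""
    n = len(examples)
    grad_w_hidden = [[0.0, 0.0], [0.0, 0.0]]
    grad_b_hidden = [0.0, 0.0]
    grad_w_out = [0.0, 0.0]
    grad_b_out = 0.0
    for inputs, actual_price in examples:
        hidden, predicted_price = forward(net, inputs)
        # Start at the output: how far off was the predicted price?
        d_out = 2.0 * (predicted_price - actual_price) / n
        grad_b_out += d_out
        for j, h in enumerate(hidden):
            grad_w_out[j] += d_out * h
            # Work backward: pass the error through the output weight and
            # the slope of tanh at this hidden node.
            d_hidden = d_out * net["w_out"][j] * (1.0 - h * h)
            grad_b_hidden[j] += d_hidden
            for i, x in enumerate(inputs):
                grad_w_hidden[j][i] += d_hidden * x
    # Nudge every weight a small step against its averaged gradient.
    net["b_out"] -= learning_rate * grad_b_out
    for j in range(len(net["w_out"])):
        net["w_out"][j] -= learning_rate * grad_w_out[j]
        net["b_hidden"][j] -= learning_rate * grad_b_hidden[j]
        for i in range(len(net["w_hidden"][j])):
            net["w_hidden"][j][i] -= learning_rate * grad_w_hidden[j][i]


def train(net, examples, rounds=2000):
    """Repeat the backpropagation step many times."""
    for _ in range(rounds):
        train_step(net, examples)
    return net


if __name__ == "__main__":
    net = make_network()
    print("error before training:", mean_squared_error(net, EXAMPLES))
    train(net, EXAMPLES)
    print("error after training: ", mean_squared_error(net, EXAMPLES))
```

Run as a script, it prints the average error before and after training. The second number should be smaller than the first, which is all that “learning” means here.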
Gradient descent refers to the mathematical process of working out how much, and in which direction, to adjust the weights (the parameters) in the hidden layers in order to improve the accuracy of the neural network’s calculations and minimize the error in its output. Think of it as steps toward the sweet spot, the best set of weights in the network for generating the best, most realistic output, given the inputs. Gradient descent relies on derivatives in good old calculus, which measure the slope of a function at any given point.
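The “steps toward the sweet spot” can be sketched with a single weight. In this hypothetical example (the data, the function name, and the learning rate are all assumptions for illustration), the model is just price = weight × size, and the derivative of the average squared error tells each step which way is downhill.

```python
# Hypothetical data: size in 1,000s of sq ft and sale price in $1,000s.
SIZES = [1.0, 1.5, 2.0, 2.5]
PRICES = [200.0, 300.0, 400.0, 500.0]


def gradient_descent(sizes, prices, learning_rate=0.1, steps=50):
    """Find the weight that minimizes the average squared pricing error."""
    weight = 0.0  # arbitrary starting guess
    n = len(sizes)
    for _ in range(steps):
        # Derivative (slope) of the average squared error with respect to weight.
        slope = sum(2.0 * (weight * x - y) * x for x, y in zip(sizes, prices)) / n
        # Step downhill: move against the slope, scaled by the learning rate.
        weight -= learning_rate * slope
    return weight


if __name__ == "__main__":
    print("best weight found:", gradient_descent(SIZES, PRICES))
```

Because every made-up house here sells for exactly $200,000 per 1,000 square feet, the weight should settle very close to 200. A learning rate that is too large would overshoot the sweet spot instead of approaching it.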
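Finally, the sigmoid and rectified linear unit (ReLU) functions shape the output of each node in the network. The sigmoid squeezes any number into a value between 0 and 1, which helps neural networks to generate clear “yes” and “no” answers and/or classifications (in the form of a 1 or a 0) or to sort data inputs into various categories. The ReLU passes positive values through unchanged and turns negative values into zero. These are activation functions inside neural networks, not components of decision trees. ReLUs in particular are cheap to compute and enable neural networks to train, and so obtain their results, faster.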
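Both functions fit in a few lines. The sketch below is illustrative only: the `classify` helper and its 0.5 threshold are assumptions added here to show how a sigmoid output can be turned into a 1 or a 0.

```python
import math


def sigmoid(x):
    """Squeeze any number into a value between 0 and 1."""
    # Two branches keep math.exp from overflowing on large negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    """Pass positive values through unchanged; turn negatives into zero."""
    return max(0.0, x)


def classify(score, threshold=0.5):
    """Hypothetical helper: turn a raw score into a 1 ("yes") or 0 ("no")."""
    return 1 if sigmoid(score) >= threshold else 0


if __name__ == "__main__":
    for score in (-3.0, 0.0, 3.0):
        print(score, sigmoid(score), relu(score), classify(score))
```

The ReLU is nothing more than a comparison with zero, which is part of why it is so cheap to compute.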
“Recurrent neural networks,” “convolutional neural networks,” “deep belief networks,” “generative adversarial networks,” and even “feedforward multilayer perceptron deep networks” all rely on the software you just learned about. And none of them are worth anything without good quality data, and lots of it.
Equipped with the basic math and software underlying AI, you can readily face down any aggressive sales associate who tries to persuade you that his or her AI thinks like a human, one smarter than you.
As you have seen, the technical ideas behind AI have been around for decades, and their applicability in the workplace has soared in just the past few years. The amount of available raw data has exploded as more and more applications on more and more mobile devices collect data and share it on the Internet with service companies and their partners. We communicate and work increasingly through apps. The “Internet of Things” (IoT) is a volcano, disgorging data everywhere faster than anyone can measure. No one can stop it. “Smart” devices and sensors proliferate. Better hardware, from the individual device to the network systems, helps to process and transfer that data faster than ever.
AI systems are increasingly capable of recognizing images, processing human language, and managing information in structured and unstructured data through statistical procedures. (I will revisit what applied AI can do for your organization in chapter 3.)
To sum it all up: Software and hardware put statistics on steroids—and it will get much, much more powerful over time.
We need to use AI because, when the data are well-labeled and the procedures are correct, computers run great numbers of them at high speed and low cost. Computers are also immune to human errors such as prejudice, favoritism, distraction, and mood swings (although we all wonder sometimes, when our systems suddenly slow down, freeze up, or crash).
Nonetheless, AI has three big problems: dependency, consistency, and transparency. Dependency refers to the machines’ need for large amounts of high-quality, correctly classified training data. “Garbage in, garbage out,” as we said earlier. Consistency is a problem because adjustments made to algorithms produce different end results, regardless of data completeness and quality. Different AI systems produce different results on the same darned data. Finally, and most importantly, transparency in neural network processes is limited at best. Backpropagation makes it extremely difficult to know why the network produces the result that it does. The chains of self-adjusting calculations get so long and complicated that they turn the AI system into a “black box.”[15]
Despite these problems, AI functionalities are improving all the time, and applications of AI technology are proliferating in just about every sector of the economy. For every human being who uses a smartphone, there is literally no avoiding it.
1. See World Intellectual Property Organization, Technology Trends 2019: Artificial Intelligence. https://www.wipo.int/edocs/pubdocs/en/wipo_pub_1055.pdf. The executive summary can be downloaded from https://www.wipo.int/edocs/pubdocs/en/wipo_pub_1055-exe_summary1.pdf.
2. The best book on AI mathematics is Nick Polson and James Scott, AIQ (New York: St. Martin’s Press, 2018).
3. See https://www.netflixprize.com/community/topic_1537.html and https://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf. The irony is that Netflix never actually used the prize-winning algorithm, supposedly because of the excessive engineering expense involved. It has used other algorithms developed by the prize-winning team. Casey Johnston, “Netflix Never Used Its $1 Million Algorithm Due to Engineering Costs,” Ars Technica, April 16, 2012, https://www.wired.com/2012/04/netflix-prize-costs/.
4. See Polson and Scott, ch. 2.
5. Public domain image from Wikipedia. https://en.wikipedia.org/wiki/Regression_analysis.
6. See Polson and Scott, ch. 2.
7. P(H|D) = P(H) * P(D|H) / P(D). See Polson and Scott, ch. 3.
8. See Polson and Scott, ch. 4.
9. See Polson and Scott, ch. 5.
10. An excellent book on this topic is Steven Finlay, Artificial Intelligence and Machine Learning for Business: A No-Nonsense Guide to Data-Driven Technologies (Relativistic, 2nd ed., 2017).
11. Quoted in Clive Cookson, “Huge Surge in AI Patent Applications in Past 5 Years,” Financial Times, January 31, 2019.
12.