Niepert et al. [2010] used Markov logic for ontology matching. Tenenbaum et al. [2011] demonstrated that relational models can address some of the deepest questions about the nature and origins of human thought. Verbeke et al. [2012] used statistical relational learning for identifying evidence-based medicine categories. Schiegg et al. [2012] segmented motion capture data using a Markov logic mixture of Gaussian processes, making use of both continuous and declarative observations. Song et al. [2013] presented a Markov logic framework for recognizing complex events from multimodal data. Statistical relational learning has also been employed for predicting heart attacks [Weiss et al., 2012], predicting the onset of Alzheimer’s [Natarajan et al., 2012a], analyzing pathways and omics experiments [De Maeyer et al., 2013], and extracting adverse drug events from text [Odom et al., 2015]. Lang et al. [2012] applied statistical relational learning to robot manipulation tasks, Nitti et al. [2014] to robot tracking tasks, and Moldovan and De Raedt [2014] to affordance learning. More generally, statistical relational methods are essential for enabling autonomous robots to do the right thing to the right object in the right way [Tenorth and Beetz, 2013]. Hadiji et al. [2015] used relational label propagation to track migration among computer scientists. Kersting et al. [2015] used relational linear programs for topic classification in bibliographic networks, and Toussaint [2015] used relational mathematical programs for combined robot task and motion planning. Foulds et al. [2015] showcased the benefit of relational probabilistic languages for general-purpose topic modeling. Khot et al. [2015b] used Markov logic to answer elementary-level science questions using knowledge extracted automatically from textbooks, and Mallory et al. [2016] used DeepDive to extract both protein–protein and transcription factor interactions from over 100,000 full-text PLOS articles.
Figure 1.7: The robot’s relational probabilistic model is populated with the data produced by its perception and experience (robot log data, human motion tracking, environment information, etc.) as well as with facts (or assertions) extracted from other dark data.
It is our belief that there will be many more exciting techniques and applications of StarAI in the future.
1.5 BRIEF HISTORICAL OVERVIEW
Before starting the technical introduction to StarAI, we briefly sketch some key streams of research that have led to the field of StarAI. Note that a full historical overview of this rich research field is beyond the scope of this book.
StarAI and SRL (Statistical Relational Learning) basically integrate three key technical ingredients of AI: logic, probability, and learning. While each of these constituents has a rich tradition in AI, there has been a strong interest in pairwise combinations of these techniques since the early 1990s.
- Indeed, various techniques for probabilistic learning, such as gradient-based methods, the family of EM algorithms, and Markov chain Monte Carlo methods, have been developed and exhaustively investigated in different communities, for instance in the Uncertainty in AI community for Bayesian networks and in the computational linguistics community for hidden Markov models. These techniques are not only theoretically sound; they have also resulted in entirely new technologies for, and revolutionary new products in, computer vision, speech recognition, medical diagnostics, troubleshooting systems, and related areas. Overviews of probabilistic learning can be found in Koller and Friedman [2009], Bishop [2006], and Murphy [2012].
- Inductive logic programming and relational learning techniques studied logic learning, i.e., learning using first-order logical or relational representations. Inductive logic programming has significantly broadened the application domain of data mining, especially in bio- and chemo-informatics, and its applications now represent some of the best-known examples of scientific discovery by AI systems in the literature. Overviews of inductive logic programming and relational learning can be found in this volume and in Muggleton and De Raedt [1994], Lavrac and Dzeroski [1994], and De Raedt [2008].
- Probabilistic logics have also been studied from a knowledge representation perspective [Nilsson, 1986, Halpern, 1990, Poole, 1993b]. The aim of this research was initially more to provide a probabilistic characterization of logic than to develop representations suitable for learning.
In the late 1990s and early 2000s, researchers working on these pairwise combinations started to realize that they also needed the third component. Influential approaches of this period include the Probabilistic Relational Models [Friedman et al., 1999], which extended the probabilistic graphical model approach toward relations, the probabilistic logic programming approach of Sato [1995], which extended the knowledge representation approach of Poole [1993b] with a learning mechanism, and approaches such as Bayesian and stochastic logic programs [Kersting and De Raedt, 2001, Cussens, 2001] working within an inductive logic programming tradition. Around 2000, Lise Getoor and David Jensen organized the first workshop on “Statistical Relational Learning” at AAAI, which gathered researchers active in the area for the first time. This was the start of a successful series of events that continues to this day (since 2010 under the header “StarAI”).
In the early days, many additional formalisms were contributed, such as RBNs [Jäger, 1997], MMMNs [Taskar et al., 2004], SRMs [Getoor et al., 2001c], MLNs [Richardson and Domingos, 2006], BLOG [Milch et al., 2005], RDNs [Neville and Jensen, 2004], and IBAL [Pfeffer, 2001], many of which are described in the book edited by Getoor and Taskar [2007]. Especially influential was the Markov Logic approach of Richardson and Domingos [2006], as it was an elegant combination of undirected graphical models with relational logic. In this early period, research focused largely on representational and expressivity issues, often referring to the “alphabet soup” of systems, reflecting the fact that many systems have names that use three or four letters from the alphabet. Around 2005, it was realized that the commonalities between the different systems are more important than their (often syntactic) differences, and research started to focus on key open issues surrounding learning and inference. One central question is that of lifted inference [Poole, 2003], that is, whether one can perform probabilistic inference without first grounding out the probabilistic relational model. Other questions concern the relationship to existing probabilistic and logical solvers and the continuing quest for efficient inference techniques. Finally, research on SRL and StarAI has also inspired the addition of probabilistic primitives to programming languages, leading to what is now called probabilistic programming. Although some of the early formalisms [Poole, 1993b, Sato, 1995, Pfeffer, 2001] already extend an existing programming language with probabilities, and hence possess the power of a universal Turing machine, this stream of research only became popular with BLOG [Milch et al., 2005] and Church [Goodman et al., 2008].
PART I
Representations
CHAPTER 2
Statistical and Relational AI Representations
Artificial intelligence (AI) is the study of computational agents that act intelligently [Russell and Norvig, 2010, Poole and Mackworth, 2010] and, although it has drawn on many research methodologies, AI research arguably builds on two formal foundations: probability and logic.
The basic argument for probability as a foundation of AI is that agents that act under uncertainty are gambling, and probability is the calculus of gambling in that agents that do not use probability will lose to those that do (see Talbott [2008] for an overview). While there are a number of interpretations of probability, the most suitable for the present book is a Bayesian or subjective view of probability: our agents do not encounter generic events, but have to make decisions in particular circumstances, and only have access to their percepts (observations) and their beliefs. Probability is the calculus of how beliefs are updated based on observations (evidence).
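As a brief illustration (in standard textbook notation, not spelled out in this passage), such belief updating is governed by Bayes’ rule: if an agent holds a prior degree of belief P(h) in a hypothesis h and then observes evidence e, its updated (posterior) belief is

P(h | e) = P(e | h) · P(h) / P(e),

that is, the prior belief reweighted by how well h predicts the observed evidence.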
The basic argument for first-order logic is that an agent needs to reason about individuals, properties of individuals, and relations among these individuals, and at least be able to represent conjunctions, implications, and quantification. These requirements arise from designing languages that allow one to easily and compactly represent the necessary knowledge for a non-trivial task. First-order logic has become widely used in the theory