Deep Learning Approaches to Text Production. Shashi Narayan. Читать онлайн. MREADZ.NET

Neural methods have triggered a paradigm shift in text production by supporting two key features. First, recurrent neural networks allow for the learning of powerful language models which can be conditioned on arbitrarily long input and are not limited by the Markov assumption. In practice, this proved to allow for the generation of highly fluent, natural sounding text. Second, the encoder-decoder architecture provides a natural and unifying framework for all generation tasks independent of the input type (data, text, or meaning representation). As shown by the dramatic increase in the number of conference and journal submissions on that topic, these two features have led to a veritable explosion of the field.

In this book, we introduce the basics of early neural text-production models and contrast them with pre-neural approaches. We begin by briefly reviewing the main characteristics of pre-neural text-production models, emphasising the stark contrast with early neural approaches which mostly modeled text-production tasks independent of the input type and of the communicative goal. We then introduce the encoder-decoder framework where, first, a continuous representation is learned for the input and, second, an output text is incrementally generated conditioned on the input representation and on the representation of the previously generated words. We discuss the attention, copy, and coverage mechanisms that were introduced to improve the quality of generated texts. We show how text-production can benefit from better input representation when the input is a long document or a graph. Finally, we motivate the need for neural models that are sensitive to the current communication goal. We describe different variants of neural models with task-specific objectives and architectures which directly optimise task-specific communication goals. We discuss generation from text, data, and meaning representations, bringing various text-production scenarios under one roof to study them all together. Throughout the book we provide an extensive list of references to support further reading.

As we were writing this book, the field had already moved on to new architectures and models (Transformer, pre-training, and fine-tuning have now become the dominant approach), and we discussed these briefly in the conclusion. We hope that this book will provide a useful introduction to the workings of neural text production and that it will help newcomers from both academia and industry quickly get acquainted with that rapidly expanding field.

We would like to thank several people who provided data or images, and authorization to use them in this book. In particular, we would like to thank Abigail See for the pointer-generator model, Asli Celikyilmaz for the diagrams of deep communicating paragraph encoders, Bayu Distiawan Trisedya for graph-triple encoders, Bernd Bohnet for an example from the 2018 surface realisation challenge, Diego Marcheggiani for graph convolutional network (GCN) diagrams, Jiwei Tan for hierarchical document encoders and graph-based attention mechanism using them, Jonathan May for an abstract meaning representation (AMR) graph, Laura Perez-Beltrachini for an extended RotoWire example, Linfeng Song for graph-state long short-term memories (LSTMs)

Скачать книгу

В начало <1 2 3 4 5 6 7 8 9 >В конец