Data Theory. Simon Lindgren
especially when researching sociality and politics through the internet. The book emphasises the need to think freely and openly about both theory and method, and goes beyond some of the ways of doing social research that are dominant today. It does so by playfully and tentatively combining elements of theories and methods, some of which are commonly seen as being incompatible.
In the face of the availability of new types of digital research data, and the contemporary popularity of computational methods in an increasing range of scholarly fields, the book should be read as an explorative attempt to make synergetic gains by harnessing the respective powers of interpretive social analysis and computational methods within one and the same research framework. This is not to say that the proposed hybrid approach must replace any other existing approach, but I want to explore potential points of contact between some concepts and strategies that are not regularly combined.
The book is tentative because it does not enter any deeper discussion about the ontological, epistemological, or technical appropriateness of combining, for example, vector models with selectively read parts of Laclau and Mouffe’s discourse analysis, or social media metrics with elements of Bourdieu’s theory of practice. Nor does it address, in any conventional way, the discussions that its case examples are likely to invite about using ‘quantitative’ scores and measures as input for ‘qualitative’ readings. This conscious choice has been made so that the argument does not get stuck in such debates. The book as a whole should be read as a proposal: ‘What if we did it like this?’ This is a way of exploring how far things can bend.
Furthermore, the proposal that I make in this book is meant to be modest. A significant amount of valuable and important work has already been done, and continues to be carried out, along the rough lines that I am suggesting, by scholars in fields such as science, technology, and society studies (Marres, 2017), mixed methods research (Hesse-Biber and Griffin, 2013), analytical sociology (Keuschnigg, Lovsjö, and Hedström, 2018), and computational social science (Lazer et al., 2009). In developing my contribution to the discussion of how the data/theory equation can be balanced in contemporary data-driven social research, I draw on influences from all of these areas. In addition, there is a vast literature in the area of the philosophy of science, where I am by no means an expert, but with which the book still sometimes enters into partial dialogue. Therefore, the book should be read for what it is – that is, an account by an alleged ‘qualitative’ sociologist entering the field of computational methods, with the aim of tracing the outlines of a hybrid methodological position potentially to be held, not in particular by data scientists or computational social scientists, nor by digital ethnographers or anthropologists, but by scholars wanting to maintain an interpretative sociological framework for analysis, while incorporating computational methods that follow society’s datafication.
Outline of this book
The first chapter of the book, Beyond Method, is about the need to rethink and repurpose research methods, as well as the role of social theory. I argue that social researchers must move beyond prevailing notions of methodology in order to find new and creative solutions in response to an increasingly complex social reality.
The second chapter, Decoding Social Forms, turns to the empirical subject area of the book – social media politics – and continues the discussion about how to research complex sociality. Social research, and its object of study (society), are equally messy, in ways that should be embraced rather than avoided. In addressing how social theory can help in navigating the complexities, the chapter covers a set of key concepts, drawing on classic sociological theorists such as Weber, Durkheim, and Simmel.
The third chapter, Unintended Consequences, continues to make the argument that pre-digital social theory can be repurposed to make sense of ambivalent sociality in a datafied society. In the chapter, we approach US President Donald Trump’s infamous ‘covfefe’ tweet from the perspective of the sociology of unanticipated consequences, in order to disentangle its surrounding twisted web of tweets, talk, and discourse. This case study is presented before we delve deeper into the territory of computational methods in the chapters that follow, to illustrate how social theory can aid the disentanglement of ambivalent online social practice. In this particular case, we will draw on sociologist Robert K. Merton’s perspective on the sometimes unpredictable, and possibly ambivalent, relationships between what people do, or intend, and the outcomes of those actions.
Chapter 4, Actor-Networks, provides an example of how computational approaches can be combined with interpretive theoretical analysis. This is done in an area – science, technology, and society studies (Callon et al., 1983; Marres, 2017, pp. 106–8) – where such connections have already been made, and where there is great potential. The case analysis in the chapter is based on a dataset consisting of 1.1 million tweets, which were collected using search terms relating to climate change discourse. The chapter uses these data to explore how computational approaches to text analysis can be brought together with actor-network theory (Callon, 2001; Latour, 2005; Law, 1999). This is done by combining elements of the theory with suitable techniques for processing the tweets. First, an analysis based in actor-network theory needs to identify social actors (human and others) in the social context that is under analysis. In this research example, this is done with the help of the computational linguistics technique of Named Entity Recognition (Grishman and Sundheim, 1996), which algorithmically identifies and tags any names of people, places, organisations, corporations, nationalities, events, and so on, that appear in the tweets. Second, actor-network theory is interested in how actors connect in relational systems. It seeks to map chains of association between humans, things, and ideas that play a part in how social reality is constructed, and how ‘truths’ are manifested. In this chapter’s case example, information about such associations was gained by analysing the network contexts of the mapped actors with the help of topic modelling through so-called Latent Dirichlet Allocation (Blei, Ng, and Jordan, 2003).
The information gathered through that machine learning model, in combination with techniques for visualisation from the field of social network analysis (Bastian, Heymann, and Jacomy, 2009; Shannon, 2003; Wasserman and Faust, 1994), enables the drawing of tangible maps of actor-networks. The chapter concludes by returning to the general theme of this book, by raising and discussing the issue of how and why theories and methods can, and must, be adapted and tweaked in ways that both simplify them and open up new analytical opportunities. Theories, as well as methods, should be seen as open-source: free for all to share, alter, and transform.
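To give a rough sense of the two steps described above (first identifying actors, then mapping how they are associated), the following is a minimal sketch in Python and not the procedure used in the chapter. It makes several simplifying assumptions: a small hand-written list of names (the hypothetical `ACTORS`) stands in for Named Entity Recognition, simple co-occurrence counting within a tweet stands in for topic modelling with Latent Dirichlet Allocation, and the four example tweets are invented for illustration and are not taken from the 1.1 million-tweet dataset.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical actor list; a real analysis would obtain these names
# from a Named Entity Recognition model instead of a fixed list.
ACTORS = ("ipcc", "greta thunberg", "exxon", "paris agreement")


def find_actors(text, actors=ACTORS):
    """Return the sorted list of known actor names mentioned in a text."""
    lowered = text.lower()
    return sorted(
        name for name in actors
        if re.search(r"\b" + re.escape(name) + r"\b", lowered)
    )


def cooccurrence_edges(tweets, actors=ACTORS):
    """Count how often each pair of actors appears in the same tweet.

    This is a simple stand-in for the associations that the chapter
    derives through topic modelling.
    """
    edges = Counter()
    for tweet in tweets:
        for pair in combinations(find_actors(tweet, actors), 2):
            edges[pair] += 1
    return edges


# Invented example tweets, for illustration only.
TWEETS = [
    "The IPCC report cites the Paris Agreement targets.",
    "Greta Thunberg urges leaders to honour the Paris Agreement.",
    "Exxon disputes the IPCC findings.",
    "IPCC and Paris Agreement again in the news.",
]

EDGES = cooccurrence_edges(TWEETS)

# Print a weighted edge list, strongest association first.
for (source, target), weight in EDGES.most_common():
    print(source, "--", target, "weight:", weight)
```

The resulting weighted edge list is the kind of input that social network analysis tools can draw as a map of actors and their ties. Which associations count as meaningful remains an interpretive question that the counts alone do not settle.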
Chapter 5, Collective Representations, is focused on introducing early twentieth-century approaches to the sociology of knowledge to the age of the internet, and especially to the research context of current data science. The key argument in the classic, Durkheimian, approach is that language, conceptual thinking, and logic are shaped by the social contexts out of which they arise. This notion, that stereotypes, categorisations, and manners of speaking that exert great power over our reasoning and actions are social products, has formed the basis for a series of other constructionist perspectives on society and culture over the years. The chapter discusses some modern developments in the sociology of knowledge, alongside social constructionism, and poststructural perspectives such as those of Laclau and Mouffe (1985), and Deleuze and Guattari (1987), where abstract theoretical notions such as discourse, rhizome, and assemblage are exploratively brought together with data science methods. The focus is particularly on text mining through machine learning, and specifically on word embedding models. The chapter aims to show how one can approach, much as a social anthropologist would, massively networked social settings online through big data techniques, and draw on sociological theory in decoding their worldviews. The chapter includes an empirical case study of the forum website Reddit, based on a comprehensive dataset including more than 1.2 billion posts.
The next section of the book, Chapter 6, Symbolic Power, works through an example of how a well-established social theory can be transformed and adapted to enable operationalisations that are fit for social media datasets. The case in focus is Pierre Bourdieu’s theory of social practice