Data Theory. Simon Lindgren
social scientists have worked to bring disciplines such as sociology into closer contact with data-intensive approaches. In those cases, the translating interface between the two paradigms has commonly been that of statistical and mathematical language. It has been the ‘quantitatively’ oriented social scientists who have bridged the gap. For example, Salganik (2018, p. 379) discusses how big data can be useful in social research by helping produce faster estimates and by engaging large numbers of research participants in crowd-coding efforts, especially if one is using established statistical strategies to increase the validity of the messier kinds of online data. In this book, I instead advocate a more interpretive and ‘qualitative’ interface between social science and data science.
Analysing sociality in the age of deep mediatisation may appear to be something that should be done in more ‘quantitative’ terms, because of its scale and the numerical character of much social media data. But there is actually even more reason to approach such objects of study, as well as the new types of data they enable and exude, from a more interpretive standpoint. The fact that sociality in the digital age happens in volume and numbers does not mean that its traces are automatically akin to survey data or other forms of statistical input. It is important to realise that the internet, and its networked social tools and platforms, in many ways serve up a different research context from the one familiar to social science. The new context possesses an ‘essential changeability’ that calls for a conscious shift of focus and method (Jones, 1999, p. xi). This is why researching digital society demands that the researcher be even more critical and reflective than scholarship in general already requires.
The data that we face do not equal ‘society’. As explained by Salganik (2018, p. 58), behaviour in big data systems is algorithmically confounded, as ‘it is driven by the engineering goals of the systems’. This means that when we analyse different forms of social interaction, social patterns, and activities in the datafied age, we are analysing a new form of sociality, where automated actors, such as bots, as well as the algorithmic logic of the systems, become part of the situation. On the one hand, this is nothing new to sociology, as it has at its core a long-standing interest in the interplay between structure and agency. It wants to study what constitutes social action, and which enabling and limiting structures shape such action. On the other hand, in relation to the digital setting, we are dealing with new types of agency and new types of structures. Because of the multifaceted character of sociality as mediated through digitally networked tools and platforms, there is today more reason than ever to mix methods in social research so that the discipline can continue to develop. Work towards this can, for example, draw on new tools for data collection via web scrapers, APIs, or online repositories. It can also include new devices and strategies for analysing data, in the form of computerised language processing, the harnessing of geolocative hardware, new visualisation techniques, and so on.
Law (2004) has written about a need to move into an era After Method in social research. His argument is that we must realise that there is no ‘general world’ to be researched, and that there are no ‘general rules’ for how reality should be analysed (Law, 2004, p. 164). Failing to follow the conventional methodological rules that are imposed on science does not necessarily mean that one will end up with substandard or distorted knowledge. Underlining the inherent messiness of social research practice, Law argues that we may have to ‘rethink our ideas about clarity and rigour, and find ways of knowing the indistinct and the slippery without trying to grasp and hold them tight’ (Law, 2004, p. 12). This is achieved by being deliberately imprecise, by conventional standards, and by conceiving of social analysis in broader and more generous terms. As researchers, we must stop desiring, and expecting, security. Method, Law argues, offers no guarantees in reality, even though we have been taught in academic programmes to believe that it does. In a way, Law’s approach is more about honesty than actual change, as ‘the problem is not so much lack of variety in the practice of method, as the hegemonic and dominatory pretensions of certain versions or accounts of method’ (Law, 2004, p. 13). It is Law’s argument that we should think of methodology and analysis not in terms of dogma and rules, but as assemblages, where each set of tools and approaches – like ‘a radio receiver, a gong, an organ pipe, or a gravity wave detector’ – will resonate with the analysed reality in its specific ways (Law, 2004, p. 126). The sum of it all is that we must dare to think more openly and less dogmatically about method and analysis, and especially so in relation to the messiness of the social.
Instruments of revelation
It is inspiring to try to rethink the explorative, largely theory-less data-drivenness of data science research as a form of ethnography, in the sense that it is overarchingly about achieving what Clifford Geertz (1973) described as ‘thick descriptions’ of social life:
Ethnography is thick description. What the ethnographer is in fact faced with – except when (as, of course, he must do) he is pursuing the more automatized routines of data collection – is a multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another, which are at once strange, irregular, and inexplicit, and which he must contrive somehow first to grasp and then to render.
(Geertz, 1973, pp. 9–10)
Coming full circle, one of Geertz’s influences was Max Weber’s Verstehen method. As Geertz famously stated:
Believing, with Max Weber, that man is an animal suspended in webs of significance he himself has spun, I take culture to be those webs, and the analysis of it to be therefore not an experimental science in search of law but an interpretive one in search of meaning.
(Geertz, 1973, p. 5)
In this book, I construe the outcomes of computational analysis not as replies to research questions in and by themselves, but as intermediary results to be interpreted in turn. This view thus understands data-intensive techniques not as ready-made methods that are simple means of achieving research goals, but as part of the actual object of research. The computational techniques – like a village or street corner – are something for the participant ethnographer to enter into. Richard Rogers (2013, p. 1) suggests that what he calls ‘digital methods’ are about identifying and following ‘the methods of the medium’ that are already embedded in digital society. Rogers’ argument is that the internet is already doing research-like things by itself, such as collecting, computing, sorting, ranking, and visualising data. The central idea in Rogers’ approach to the study of the digital is not to intervene or interfere very much with these existing ‘methods’. Our analyses may in fact be more accurate if we respect their integrity, follow them with curiosity, and learn from them. Rogers (2013, p. 1) writes:
For example, crawling, scraping, crowd sourcing, and folksonomy, while of different genus and species, are all web techniques for data collection and sorting. PageRank and similar algorithms are means to order and rank. Tag clouds and other common visualizations display relevance and resonance. How may we learn from and reapply these and other online methods? The purpose is not so much to contribute to their fine-tuning and build the better search engine, for that task is best left to computer science and allied fields. Rather, the purpose is to think along with them.
The role of the researcher, then, becomes to attempt to ‘follow the medium’ and its methods as they evolve, and to find ways of exploiting and recombining them in useful ways. In the context of this book, one can correspondingly think in terms of following the methods of data science, seeing them as symptomatic of the datafied society, rather than simply designed to analyse it from the outside. But, in line with Rogers’ argument, these methods can still be harnessed, interpreted, and repurposed for interpretive social analysis.
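To make the idea of an algorithm as a ‘means to order and rank’ more tangible, a minimal sketch in Python may help. It is a simplified, PageRank-style power iteration and not the production algorithm of any search engine. The toy link graph, the function name `pagerank`, and the parameter values are all hypothetical and chosen only for illustration. The point, in line with the argument above, is that the resulting ordering is an intermediary result that still has to be interpreted.

```python
# Illustrative only: a tiny, made-up link graph (page -> pages it links to).
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style power iteration over a dict of outgoing links.

    The damping value and iteration count are assumptions for this sketch.
    """
    # Collect every node that appears as a source or as a link target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same small baseline share (the "random jump").
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its rank on in equal parts to the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


scores = pagerank(links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On this toy graph, the node "c" ends up first because three of the four nodes link to it. Whether such a position signals authority, popularity, or merely an artefact of how the links were collected is not something the scores can answer by themselves. That question belongs to the interpretive step.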
The aim of thinking and working in this way is, Rogers (2013, p. 3) writes, ‘to build upon the existing, dominant devices themselves, and with them perform a cultural and societal diagnostics’. This means that the ‘initial outputs’ of the research – a network graph, a topic model, a set of sentiment scores, a clustering of users – can be ‘seen or rendered in new light’. The main challenge for digital research, in that case, is to develop a mindset as well as a methodological outlook