Additionally, Stivers argues, CA already relies on distributional evidence to support its findings. Relative frequencies are taken to be indicative of various preferences, such as the one for self- over other-correction, for minimization in person reference, and for recognitional reference. In these and other papers the authors rely on characterizations such as "massively", "quite common", and "scarcely ever". The idea is that if a phenomenon is rare, there cannot possibly be a preference for it; so when we find a skewed distribution of self-correction over other-correction, this is taken as evidence of a preference for the former.
Intuitively this makes sense. If people do A far more often than B, then A is obviously preferred over B. I bike to work far more often than I take the bus, hence there is a preference for taking the bike. The problem is that the reasoning goes the wrong way, a persistent problem with frequentist analyses. If there is a preference, we expect to see a skewed distribution; a skewed distribution is thus a symptom of a preference. But if all we see is a skewed distribution, then the preference is only one among a possibly large number of explanations.
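To make the direction-of-inference problem concrete, here is a minimal simulation (purely illustrative; the parameter names and the 90% figure are assumptions, not empirical claims). In it, nobody has any preference at all: whoever happens to get the first opportunity performs the repair. The only asymmetry is structural, namely that the speaker of the trouble source usually gets that opportunity first. The resulting counts are nonetheless heavily skewed toward self-repair.

```python
import random

random.seed(0)

def simulate_repairs(n_troubles=10_000, p_speaker_first=0.9):
    """Simulate repair outcomes with NO preference at work:
    whoever gets the first opportunity does the repair. The
    skew comes solely from the (hypothetical) structural fact
    that the current speaker usually gets that chance first."""
    self_repair = other_repair = 0
    for _ in range(n_troubles):
        if random.random() < p_speaker_first:
            self_repair += 1   # speaker happened to get there first
        else:
            other_repair += 1  # recipient happened to get there first
    return self_repair, other_repair

self_n, other_n = simulate_repairs()
print(f"self-repair: {self_n}, other-repair: {other_n}")
```

The distribution that comes out looks exactly like what a preference for self-repair would produce, even though the generating mechanism contains no preference; the skew alone cannot tell the two apart.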
In a recent Perspective paper in Nature Human Behaviour, Michael Muthukrishna and Joseph Henrich provide an intuitive analogy. If we walk into a room and find a broken vase, there are various possible explanations: wind coming through the window, a rowdy child, a playful cat, and so on. The broken vase alone does not help us distinguish between these and other explanations, so it cannot in and of itself be understood as evidence that, say, the cat went on a rampage rather than that it was a very stormy afternoon.
And here we come to the central issue: CA, as a primarily inductive method, develops its theories from the data. This means that we get the idea of, for example, an interactional preference for self-repair because we find that self-repair is far more frequent in our data than other-repair. But at that point all we have done is make an observation. We cannot then develop a theory that fits those data and claim that the same data are evidence for the theory. That would be the essence of circular reasoning.
Now fortunately, and Stivers points this out as well, when developing the notion of a preference for self-repair, Schegloff et al. try to account for the skewed distribution by exploring the data themselves. As far as I'm aware, this is what Conversation Analysts always do: we provide an argument from the data. And that is exactly what we would want to do: we observe something strange in the data and we want to see whether there is a clear and coherent explanation for that observation. There need not be one; our observation may be a coincidence. As Stivers says, "patterns suggest that there may be a preference"; nothing more. And indeed, this is a typical problem with observed frequencies: they may be an accidental result of the situation(s) in which the data were recorded. But the only way to find out is to study the data themselves, not the distributions in those data.