Additionally, Stivers argues, CA already relies on distributional evidence to support its findings. Relative frequencies are taken to be indicative of various preferences, such as the one for self- over other-correction, for minimization in person reference, and for recognitional reference. In these and other papers the authors rely on characterizations such as "massively", "quite common", and "scarcely ever". The idea is that if a phenomenon is rare, there cannot possibly be a preference for it; so when we find a skewed distribution of self-correction over other-correction, this is taken as evidence of a preference for the former.
Intuitively this makes sense. If people do A far more often than B, then A is obviously preferred over B. I bike to work far more often than I take the bus, hence there is a preference for taking the bike. The problem is that the reasoning goes the wrong way, a persistent problem with frequentist analyses. If there is a preference, we expect to see a skewed distribution; a skewed distribution is thus a symptom of a preference. But if all we see is a skewed distribution, then the preference is only one among a possibly large number of explanations.
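To make the direction-of-inference problem concrete, here is a minimal simulation (purely illustrative; the parameter names and the 90% figure are assumptions, not empirical claims). In it, nobody has any preference at all: whoever happens to get the first opportunity performs the repair. The only asymmetry is structural, namely that the speaker of the trouble source usually gets that opportunity first. The resulting counts are nonetheless heavily skewed toward self-repair.

```python
import random

random.seed(0)

def simulate_repairs(n_troubles=10_000, p_speaker_first=0.9):
    """Simulate repair outcomes with NO preference at work:
    whoever gets the first opportunity does the repair. The
    skew comes solely from the (hypothetical) structural fact
    that the current speaker usually gets that chance first."""
    self_repair = other_repair = 0
    for _ in range(n_troubles):
        if random.random() < p_speaker_first:
            self_repair += 1   # speaker happened to get there first
        else:
            other_repair += 1  # recipient happened to get there first
    return self_repair, other_repair

self_n, other_n = simulate_repairs()
print(f"self-repair: {self_n}, other-repair: {other_n}")
```

The distribution that comes out looks exactly like what a preference for self-repair would produce, even though the generating mechanism contains no preference; the skew alone cannot tell the two apart.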
In a recent Perspective paper in Nature Human Behaviour, Michael Muthukrishna and Joseph Henrich provide an intuitive analogy. If we walk into a room and find a broken vase, there are various possible explanations: wind coming through the window, a rowdy child, a playful cat, and so on. The broken vase alone does not help us distinguish between these and other explanations, so it cannot in and of itself be understood as evidence that, say, the cat went on a rampage rather than that it was a very stormy afternoon.
And here we come to the central issue: CA, as a primarily inductive method, develops its theories from the data. This means that we get the idea of, for example, an interactional preference for self-repair because we find that self-repair is far more frequent in our data than other-repair. But at that point all we have done is make an observation. We cannot then develop a theory that fits those data and claim that the same data are evidence for the theory. That would be the essence of circular reasoning.
Now fortunately, and Stivers points this out as well, when developing the notion of a preference for self-repair, Schegloff et al. try to account for the skewed distribution by exploring the data themselves. As far as I'm aware, this is what Conversation Analysts always do: we provide an argument from the data. And that is exactly what we would want to do: we observe something strange in the data and we want to see whether there is a clear and coherent explanation for that observation. There need not be one; our observation may be a coincidence. As Stivers says, "patterns suggest that there may be a preference"; nothing more. And indeed, this is a typical problem with observed frequencies: they may be an accidental result of the situation(s) in which the data were recorded. But the only way to find out is to study the data themselves, not the distributions in those data.