The Shrinking Divide

Updated: Oct 10, 2020


I have never thought much of the Methodenstreit between qualitative and quantitative researchers. If the divide between the two approaches ever had any significance, it appears to be diminishing rapidly. Developments in both causal inference and (paradoxically) machine learning deserve the credit.


That the divide has shrunk in causal inference is demonstrated amply in Jason Seawright's excellent Multi-Method Social Science: Combining Qualitative and Quantitative Tools. Developing a causal identification strategy, even within a quantitative framework such as potential outcomes, requires intimate case knowledge. Although tools like synthetic control can help with case selection, the credibility of the identification strategy ultimately depends in part on the researcher's ability to show that the controls are comparable to the treated unit(s).


What may be less obvious is the shrinking divide between quantitative and qualitative research in machine learning. Machine learning conjures up images of big data, quite the opposite of in-depth case knowledge. So what does qualitative research have to contribute to machine learning?


The answer is interpretation, in particular local interpretation. In "Why Should I Trust You?," Ribeiro, Singh, and Guestrin make the powerful case that a well-performing classifier that cannot be explained is not to be trusted. Interpretation is not limited to the classification algorithm as a whole, but extends to individual observations. Can I explain how the prediction for a particular test sample comes about? Ribeiro et al. developed LIME as a framework for local interpretation, but there are others such as DALEX. The key point is that these methods look at the way in which features (predictors) produce a prediction.
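
To make the idea of local interpretation concrete, here is a minimal sketch of a local surrogate model in the spirit of LIME. It is not the LIME library or its API. It only illustrates the core idea: perturb one observation, weight the perturbed points by their proximity to it, and fit a weighted linear model whose coefficients describe how each feature moves the prediction near that observation. The classifier `black_box`, the observation `[0.5, 1.0]`, and all tuning values (`scale`, `width`, `n_samples`) are assumptions made up for illustration. The sparsity and interpretable-representation steps of the actual method are left out.

```python
import math
import random


def black_box(x):
    """Hypothetical classifier: returns the probability of the positive class."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[1]
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def local_surrogate(predict, x0, n_samples=2000, scale=0.3, width=0.5, seed=0):
    """Fit a proximity-weighted linear model around the observation x0."""
    rng = random.Random(seed)
    k = len(x0)
    # Accumulate the weighted normal equations (X'WX) beta = X'Wy.
    xtwx = [[0.0] * (k + 1) for _ in range(k + 1)]
    xtwy = [0.0] * (k + 1)
    for _ in range(n_samples):
        # Perturb the observation with Gaussian noise.
        x = [v + rng.gauss(0.0, scale) for v in x0]
        # Exponential kernel: nearby perturbations count more.
        dist2 = sum((a - b) ** 2 for a, b in zip(x, x0))
        weight = math.exp(-dist2 / width ** 2)
        # Center features at x0 so the intercept approximates predict(x0).
        row = [1.0] + [a - b for a, b in zip(x, x0)]
        y = predict(x)
        for i in range(k + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(k + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    beta = solve(xtwx, xtwy)
    return beta[0], beta[1:]


intercept, weights = local_surrogate(black_box, [0.5, 1.0])
for name, w in zip(["feature_1", "feature_2"], weights):
    print(f"{name}: local effect {w:+.3f}")
```

The signs and relative sizes of the local weights are what one would read as the explanation for this particular prediction: which features pushed it up, which pushed it down, and by roughly how much.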


If we get into the habit of considering individual predictions, what is to prevent us from evaluating those predictions in light of qualitative evidence about the cases? We can use this evidence as auxiliary information that helps us to understand, for example, particularly egregious forms of misclassification. The qualitative evidence may help us to identify omitted features or scope conditions that may then be used to improve the algorithm or, at least, help us understand its limitations. The kinds of qualitative evidence that can be brought to bear include historical case knowledge, in-depth interviews, and perhaps even focus groups. It could be quite interesting, for example, to have citizens look at the way in which predictions come about and comment on where they think the blind spots are.
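
One simple way to decide which cases deserve qualitative follow-up is to rank misclassified observations by how confidently the model got them wrong. The sketch below does only that. The `Case` record, the case identifiers, the probabilities, and the `min_gap` cut-off are all hypothetical values chosen for illustration, not results from any real model.

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str
    predicted_prob: float  # model's probability of the positive class
    actual: int            # observed label, 0 or 1


def egregious_errors(cases, threshold=0.5, min_gap=0.7):
    """Return IDs of confidently misclassified cases, worst first."""
    flagged = []
    for c in cases:
        predicted = int(c.predicted_prob >= threshold)
        # Gap between the predicted probability and the observed outcome.
        gap = abs(c.predicted_prob - c.actual)
        if predicted != c.actual and gap >= min_gap:
            flagged.append((gap, c.case_id))
    flagged.sort(reverse=True)
    return [case_id for _, case_id in flagged]


# Hypothetical cases: A and C are confident errors, B is a marginal
# error, and D is classified correctly.
cases = [
    Case("A", 0.95, 0),
    Case("B", 0.55, 0),
    Case("C", 0.10, 1),
    Case("D", 0.80, 1),
]
shortlist = egregious_errors(cases)
print(shortlist)
```

The resulting shortlist is where historical case knowledge or interviews would be brought in, to ask what the model's features fail to capture about those particular cases.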


In a future post, I will go into LIME and DALEX. For now, I end by reiterating that, even in the age of big data, qualitative methods are an integral part of a comprehensive research program. Quality versus quantity is an anachronistic and non-productive divide.


