
Assessing the ‘Quality’ of Qualitative Research

One of the questions that comes up regularly in training courses on qualitative methods is how we should assess the quality of a qualitative study. At some point in their research career, qualitative researchers will inevitably experience the ‘apples versus oranges’ phenomenon, whereby our qualitative research is evaluated against quantitative principles and criteria rather than qualitative ones. The quality standards used in quantitative research do not directly translate to qualitative studies.

Should We Use Standardized Criteria to Evaluate Qualitative Research?

Over the years, many qualitative scholars have proposed frameworks and criteria for assessing qualitative research (see Guba and Lincoln 1989; Lather 1993; Schwandt 1996; Bochner 2000; Ritchie et al. 2003; Tracy 2010; Altheide and Johnson 2011). Some have also argued that standardized criteria are unhelpful in qualitative inquiry (e.g. see Schwandt 1996; Altheide and Johnson 2011). For example, Bochner (2000) argues that ‘traditional empiricist criteria’ are ‘unhelpful’ when applied to new ethnographic approaches (cited in Tracy 2010: 838). As Altheide and Johnson (2011: 582) argue:

“There are many ways to use, practice, promote, and claim qualitative research, and in each there is a proposed or claimed relationship between some field of human experience, a form of representation, and an audience. Researchers and scholars in each of these areas have been grappling with issues of truth, validity, verisimilitude, credibility, trustworthiness, dependability, confirmability, and so on. What is valid for clinical studies or policy studies may not be adequate or relevant for ethnography or autoethnography or performance ethnography.”

Qualitative research is conducted within different research paradigms, which complicates the assessment of the quality of a particular study.

As Tracy (2010) notes, many of these critiques have resulted in the development of new quality standards and criteria for evaluating qualitative inquiry, which are seen as more flexible than quantitative standards and more sensitive to the context-bound nature of qualitative research. Below, we explore the main criteria proposed for assessing qualitative research:

Criteria for Assessing Qualitative Research

  1. Trustworthiness

In the 1980s, Guba and Lincoln (1989; see also Krefting 1991) developed criteria which can be used to determine rigor in a qualitative inquiry. Instead of ‘rigor’, they focus on the development of trustworthiness in qualitative inquiry through determining: credibility, transferability, dependability and confirmability.

| Qualitative criteria | Quantitative criteria |
| --- | --- |
| Credibility | Internal validity |
| Transferability | External validity or generalizability |
  1.1 Credibility

Credibility asks us to consider if the research findings are plausible and convincing. Questions to consider include:

  • How well does the study capture and portray the world it is trying to describe?
  • How well backed up are the claims made by the research?
  • What is the evidential base for the research?
  • How plausible are the findings?

As Stenfors et al. (2020) point out, there should be alignment between ‘theory, research question, data collection, analysis and results’ while the ‘sampling strategy, the depth and volume of data, and the analytical steps taken’ must be appropriate within that framework.

  1.2 Transferability

Here, we are interested in how clear the basis is for drawing wider inference (Ritchie et al. 2003) from our study. Can the findings of our study be transferred to another group, context or setting?

As Ritchie et al. (2003) argue, the findings of qualitative research can be generalized, but the framework within which this can occur needs greater clarification. Rather than ‘generalizability’, we refer to the transferability of findings in a qualitative study. For example, in an empirical sense: can findings from qualitative research studies be applied to populations or settings beyond the particular sample of the study? We can also explore the generation of theoretical concepts or propositions from a qualitative study which are deemed to be of wider, or universal, application.

When attempting to extrapolate from a qualitative study, we should be conscious that meanings and behaviours are context bound. Therefore, extrapolation may be possible if offered as a working hypothesis to help us make sense of findings in other contexts.

Questions to consider include:

  • Sample coverage: did the sample frame contain any known bias; were the criteria used for selection inclusive of the constituencies thought to be of importance?
  • Capture of the phenomena: were the environment and the quality of questioning effective in allowing participants to fully express their views?
  • Identification or labelling: have the phenomena been identified, categorised and named in ways that reflect the meanings assigned by participants?
  • Interpretation: is there sufficient internal evidence for the explanatory accounts that have been developed?
  • Display: have the findings been portrayed in a way that remains true to the original data and allows others to see the analytic constructions which have occurred? (see Ritchie et al. 2003)
  1.3 Dependability

Dependability is ‘the extent to which the research could be replicated in similar conditions’ (Stenfors et al. 2020). The researcher should provide enough information on the design and conduct of their study that another researcher could follow the same steps in a study of their own. Given the context-specific nature of qualitative research, it can be difficult to demonstrate which features of the qualitative data should be expected to be consistent, dependable or reliable.

Questions to consider for dependability (reliability) include:

  • Was the sample design/selection without bias, ‘symbolically’ representative of the target population, comprehensive of all known constituencies; was there any known feature of non-response or attrition within the sample?
  • Was the fieldwork carried out consistently, did it allow respondents sufficient opportunities to cover relevant ground, to portray their experiences?
  • Was the analysis carried out systematically and comprehensively, were classifications, typologies confirmed by multiple assessment?
  • Is the interpretation well supported by the evidence?
  • Did the design/conduct allow equal opportunity for all perspectives to be identified or were there features that led to selective, or missing, coverage? (see Ritchie et al. 2003).
  1.4 Confirmability

Here, we are looking for a clear link between the data and the findings. For example, researchers should evidence their claims with the use of quotes/excerpts of data. Qualitative researchers should avoid the temptation to quantify findings with claims such as ‘70% of participants felt that xxx…’ It is also important in the Discussion to demonstrate how the research findings relate to the wider body of literature and to answer the research question. Any limitations of the study should also be flagged up.

  2. Reflexivity

Stenfors et al. (2020) draw attention to reflexivity as another important criterion in assessing qualitative inquiry. For Guba and Lincoln (1989), the reflexive journal is a further means of helping to assess qualitative inquiry. A reflexive approach helps us to be aware of the social, ethical and political impact of our research; the central, fluid and changing nature/s of power relations (with participants, gatekeepers, research funders, etc.); and our relationships with the researched (Lumsden 2019).

We can ask whether the researcher has stepped back and critically reflected on their role in the research process, their relationships with the researched, and their social position. It should be clear how reflexivity has been embedded in the research process (Stenfors et al. 2020). As Altheide and Johnson (2011: 581) write:

‘Good qualitative research—and particularly ethnographies—shows the hand of the ethnographer. The effort may not always be successful, but there should be clear “tracks” that the attempt has been made.’

Additional Criteria: Ethics

Tracy (2010) also provides a useful overview of eight key criteria for excellent qualitative research: worthy topic, rich rigor, sincerity, credibility, resonance, significant contribution, ethical, and meaningful coherence (p. 840). There is overlap with the above criteria, and some elements could be said to be already subsumed in the above discussion, so I will not delve into them all here. However, it is important to draw attention to ethical considerations in qualitative studies. As Tracy notes, the researcher should consider:

  • Procedural ethics (such as human subjects);
  • Situational and culturally specific ethics;
  • Relational ethics;
  • Exiting ethics (leaving the scene and sharing the research) (see Tracy 2010: 840).

Strategies for Determining Trustworthiness (Rigor)

The strategies adopted in order to determine the trustworthiness of a qualitative study depend on a variety of factors including: research paradigm, the specifics of each research design, the research methods utilised (i.e. interviews, ethnography, observation, focus groups, creative methods, visual methods, secondary data analysis, narratives etc.) and the type of qualitative analysis being conducted.

Morse (2015) provides a useful evaluation of the use of various strategies for ensuring rigor in qualitative studies. Strategies which she evaluates as typically used in attempts to ensure validity and reliability include:

  • Prolonged engagement in ethnographic research, via time spent in the field, to reduce researcher effect;
  • Prolonged observation in ethnographic research to reduce researcher effect;
  • Thick description;
  • Triangulation;
  • Development of a coding system and inter-rater reliability in semi-structured interviews;
  • Researcher bias;
  • Negative case analysis;
  • Peer review debriefing (in team research);
  • Member checks;
  • External audits (viewed as problematic and not routinely used) (see pages 1217-1220).

She provides a useful evaluation of the appropriateness and success of these strategies for ensuring rigor, for those who wish to explore this further. Interestingly, through her critique of these strategies, Morse also suggests that ‘qualitative researchers return to the terminology of social sciences, using rigor, reliability, validity, and generalizability’ (p. 1212) instead of the terms proposed in the 1980s by Guba and Lincoln (1989).


Awareness of the criteria used when assessing the quality of qualitative research is key for anyone conducting qualitative research. As we have seen, these criteria typically include: trustworthiness, credibility, transferability, dependability, confirmability, reflexivity and ethics.

However, the strategies which each researcher adopts in order to ensure the trustworthiness (rigor) of their study will depend on a variety of factors specific to each qualitative research project, including the research method they adopt and the research paradigm. As Morse (2015: 1219) writes: ‘…rigor, comprising both validity and reliability, is achieved primarily by researchers in the process of data collection and analysis’. In addition, the assessment criteria which are valid when assessing fields such as clinical studies may not be relevant for those working in areas such as ethnography or narrative studies (see Altheide and Johnson 2011). There is no easy route or ‘one size fits all’ approach for assessing the quality of qualitative research, but the above criteria give us a good starting point which we can refer to when designing and conducting our qualitative inquiries.

References and further reading

Altheide, D.L. and Johnson, J.M. (2011) ‘Reflections on Interpretive Adequacy in Qualitative Research.’ In N.K. Denzin and Y.S. Lincoln (eds) Handbook of Qualitative Research, Fourth Edition (pp. 581-594). London: Sage.

Bochner, A. (2000) ‘Criteria Against Ourselves.’ Qualitative Inquiry, 6: 266-272.

Braun, V. and Clarke, V. (2013) Successful Qualitative Research. London: Sage.

Guba, E. and Lincoln, Y. (1989) Fourth Generation Evaluation. Newbury Park, CA: Sage.

Krefting, L. (1991) ‘Rigor in Qualitative Research: The Assessment of Trustworthiness.’ American Journal of Occupational Therapy, 45: 214–222.

Lather, P. (1993) ‘Fertile Obsession: Validity after Poststructuralism.’ Sociological Quarterly, 34: 673-693.

Lingard L. (2015) ‘Joining a Conversation: The Problem/Gap/Hook Heuristic.’ Perspectives on Medical Education, 4(5): 252–253.

Lumsden, K. (2019) Reflexivity: Theory, Method and Practice. London: Routledge.

Morse, J.M. (2015) ‘Critical Analysis of Strategies for Determining Rigor in Qualitative Inquiry.’ Qualitative Health Research, 25(9): 1212-1222.

Schwandt, T.A. (1996) ‘Farewell to Criteriology.’ Qualitative Inquiry, 2: 58-72.

Spencer, L., Ritchie, J., Lewis, J., and Dillon, L. (2003) Quality in Qualitative Evaluation: A Framework for Assessing Research Evidence, GCSRO.  Available at: www.policyhub.gov.uk/publications

Stenfors, T., Kajamaa, A. and Bennett, D. (2020) ‘How to… Assess the Quality of Qualitative Research.’ The Clinical Teacher, https://doi.org/10.1111/tct.13242

Tracy, S.J. (2010) ‘Qualitative Quality: Eight “Big-Tent” Criteria for Excellent Qualitative Research.’ Qualitative Inquiry, 16: 837–851.


A kick in the teeth?: The problems with “hierarchies of qualitative research” for policy-making and evidence-based policing

The problem with “gold standard” hierarchies of evidence

Four years ago I co-authored a paper on the rise of the “evidence base” (EB) in policing, which drew attention to misguided attempts to replicate the “gold standard” hierarchies of evidence used in medicine, health, social care, education, etc. in the context of policing (see Lumsden and Goode 2016). The paper addressed the rise of evidence-based policy and practice as a dominant discourse in policing in the UK, and the implications this has for social scientists conducting research in this area, and for police officers and staff. Our paper was intended as an exploration and conversation starter, to draw attention to the dangers and risks of a narrow focus on research ultimately driven by government and policy-makers’ constructions (see also Lumsden 2017). Other scholars have made similar observations, drawing our attention to the false premise that we should even attempt an evidence-based approach in policing (see Thacher 2001).

The development of an evidence-base in policing has largely been driven by the Maryland Scale of Scientific Methods, imported from the USA to the UK and adopted and promoted by its proponents (e.g. Sherman), which places systematic reviews, RCTs and positivist scientific methods at the top layers of the hierarchy, with qualitative methods at the bottom. Evidence-based policing (EBP) was also later developed in the form of the “Evidence Based Policing Matrix” devised by Lum and colleagues (see Lum, Koper and Telep 2011): “…a research-to-practice translation tool which organizes moderate to very rigorous evaluations of police interventions visually, allowing agencies and researchers to view the field of research in this area…”, and which in that sense is more sympathetic to qualitative research.

We argued that adoption of evidence-based policing (EBP) and the related “gold standard” used to evaluate research (such as those measurable on the Maryland Scale) act as a “technology of power” (Foucault 1988) to draw boundaries (Gieryn 1983; Styhre 2011) around which methodologies and forms of knowledge are legitimate and useful for policing. We also drew attention to the risks posed to researchers entering the field of the loss of decades of seminal policing research if its utility for informing policing and criminal justice is to be judged using the “gold standard” criteria defined by the evidence-based movement more broadly.

Qualitative methods and the evidence-base

The general disregard of qualitative methods in evidence-based policy is not new, and the debate has been well trodden and rehearsed in social care, education, medicine, and health care from the 1990s onwards (e.g. see Avby, Nilsen and Dahlgren 2014; Dixon-Woods, Bonas and Booth 2006). It is worth noting that the recent College of Policing definition of evidence-based policing in the UK has widened to refer to “best available evidence from appropriate methods”, and highlights the need to have a clear theoretical basis and context of research:

“The ‘best available’ evidence will use appropriate research methods and sources for the question being asked. Research should be carefully conducted, peer reviewed and transparent about its methods, limitations, and how its conclusions were reached. The theoretical basis and context of the research should also be made clear. Where there is little or no formal research, other evidence such as professional consensus and peer review, may be regarded as the ‘best available’, if gathered and documented in a careful and transparent way.” (College of Policing 2017)

What I now want to address in this piece are the main problems concerning the use of hierarchies of evidence for judging the merits and rigour of qualitative research. Although I’m sympathetic to attempts to ensure qualitative research is included in EBP policy-making, and to the point that guidance is needed for policy-makers, “hierarchies of qualitative research” are misleading and do not assist policy-makers in conducting a robust, fair and inclusive evaluation of qualitative studies, or in assessing whether they are “promising” enough to be included in the policing evidence-base.

First, I’ll respond to the very notion of “hierarchies of qualitative research”. Then, I’ll turn to some of the claims made in a related blog on the hierarchy (see Huey 2019) which discusses the problems with “pseudo-scientific” “junky” qualitative research.

A “Hierarchy of Qualitative Research” for policy-making


In a response to tweets concerning the above example of a “hierarchy for qualitative research” for evidence-based policing (EBP), it is claimed that “rigour” is built into the hierarchy at every step. However, the very notion of a “Hierarchy of Qualitative Research” in itself is misleading and only serves to further legitimise and dismiss certain forms of knowledge, types of qualitative research, methods, etc. It engages in boundary work (as noted above) around which particular methodologies, forms of knowledge and “voices”, are viewed as legitimate and useful for policing (Gieryn 1983). We cannot use hierarchies to judge qualitative studies.

An abbreviated (and by no means complete) list of some of the problems I see with the use of hierarchies for judging qualitative research in EBP follows:

1. Systematic reviews are at the top of the hierarchy (not surprising given it largely follows the tenets of previous hierarchies such as the Maryland Scale). Are these qualitative systematic reviews per se, or the use of qualitative data within a systematic review? Are the criteria for a “what works” standard review still determined by quantitative principles (e.g. see Booth 2001)?

2. As stated in the blog which introduces us to the model, it makes one big assumption – “that the study being used was well-designed and well-executed” – but a “new category for 0: studies that [are] manifestly poorly designed and executed” was also added. The question then is: how do policy-makers know, before they apply the model, that a study is “poorly designed and executed”? Are we assuming they have the skills to make these judgement calls in evaluating qualitative research before applying the model, which is then step two of the process? If so, it is not really assisting them to evaluate qualitative research.

3. It fails to acknowledge the relationship between method and methodologies

4. …And fails to acknowledge how methodology and methods are always tied to the research question being posed. Will the method used help answer the question/s posed? More methods (i.e. “quadrangled studies”) do not ensure quality, as methods are ultimately tied to the research question and aims. As Murphy et al. write in their guidance on using qualitative methods in health technology assessment (HTA):

“The goal of all research in HTA should be to establish knowledge about which we can be reasonably confident, and to provide findings that are relevant to policy makers and practitioners. Therefore, decisions about whether qualitative or quantitative methods (or a combination of both) are most appropriate to a particular research problem should be made on the basis of which approach is likely to answer the question most effectively and efficiently.” (1998: iii, author’s emphasis)

5. Quantity over quality: The model gives the impression that more methods are better – i.e. “quadrangled studies” are “very promising”, using “4 or more different methods or data sources”, which could include interviews, focus groups, field observations, and media analysis. (Also see point 10 below.)

6. “Mixed methods” studies can involve using various forms of qualitative methods, as well as combinations of quantitative and qualitative methods. That said, the “very promising” category is also mixed methods, even though it consists only of qualitative methods, whereas quant/qual mixed-methods studies are only “what’s promising”.

7. It doesn’t acknowledge how different qualitative methods also have specific standards which guide their design and use – e.g. focus groups versus participant observation – as well as the general standards for qualitative research.

8. Context is key: both in qualitative research and in the study of policing and crime. The study of policing and crime is not the same as, for example, health-based EBP (although health research has its own issues concerning context and individual differences in the design and implementation of various initiatives).

9. Case studies are situated at the bottom of the hierarchy alongside “anecdotes” and “expert opinion”, when in fact case studies vary widely and can involve the use of multiple methods (qualitative and/or quantitative) to explore research question/s (e.g. see the use of clinical case studies in nursing and health care).

10. It uses the standards of quantitative research to judge qualitative research (also see point 5 above). The overall message, whether intended or not, is that the more qualitative methods the better, and the more interviewees the better, as you will then be able to “generalise” (when in fact generalisation is not necessarily the main aim of qualitative studies). (For a discussion of the number of participants in qualitative studies, see the insightful report “How many interviews is enough?” (Baker and Edwards 2013).)

In sum: undoubtedly, models such as this will appeal to some police and policy-makers who are looking for a “quick fix”, but they do not help them to evaluate “good” versus “bad” qualitative research: this is more complex, as reflected in, for example, the various checklists and tools used in health and medicine to evaluate qualitative studies. Such models will (like EBP models before them) result in the privileging of certain (qualitative) methods, types of research, and therefore forms of knowledge, at the expense of others. Ann Oakley wrote in 2002 about the need to dissolve the “methods war” in relation to education and the evidence-base, arguing that:

“A main danger ahead is that large areas of research, traditionally important in education, escape the evidence net either because no one can reach any consensus about how to sort out the reliable from the unreliable, or (which might be worse) because a new orthodoxy sets in according to which qualitative research is simply a world apart—nothing to do with evidence at all.” (Oakley 2002: 284)

I’ve been training non-academic groups and professionals in qualitative methods for the past five years (including police officers, NGOs, policy makers, government, local authorities, health care, charities etc.) and what is clear is that there is a desire to learn the intricacies of qualitative methods so they can evaluate research studies and strengthen their own in-house work. There is also a need for training on how to evaluate qualitative research. Crucially, this includes training in how we should not use the principles of quantitative research to judge qualitative research. These hierarchies do not adequately equip policy-makers to be able to assess the quality of qualitative research.

“Hierarchies of qualitative research” also risk reifying dominant discourses of the evidence-base in policing. There is a risk that they create expectations that police-researchers, “pracademics”, and academics must mould their research so that it “fits” the model – so that it is, for example, “very promising” and will then be used to build the evidence-base – rather than being robustly linked to questions of epistemology, ontology, methodology and the research question they wish to answer. For example, in the UK the Research Excellence Framework (REF) and proposed Knowledge Exchange Framework (KEF) put pressure on researchers to demonstrate the “usefulness” of their work, and EBP aligns with this impetus (Lumsden and Goode 2018).

Claims of “pseudoscientific” “junky qualitative research”

This also relates to a point made in the blog launching the hierarchy, which describes an instance of what the author calls “pseudoscientific” and “junky qualitative research”. The argument is that the authors of one particular study, selected as an example of poor-quality research, made grander claims than were appropriate from a small sample of only 13 participants, including making national policy recommendations in relation to criminal justice policy. (There is no reference to the particular report in the blog.) I agree with the more general point made here about not making national policy recommendations from such a small study (and with the other criticisms); however, the hierarchy does not help us address these issues. For example:

1. In the blog it is claimed that there has been “deep resistance among qualitative researchers to the idea of trying to set standards for their work”. Perhaps this is the case in Canada and the US, where this model originates, but it hasn’t been my experience. There are multiple examples of ongoing work and discussion in the social sciences, psychology, etc. regarding the need to improve transparency and set standards. One example is the influential work of Braun and Clarke in New Zealand/UK on thematic analysis, their particular approach which they call “reflexive thematic analysis”, and their related calls for qualitative researchers to be more transparent in how they have analysed data. Their website also includes “guidelines for reviewers and editors evaluating thematic analysis manuscripts”. The point here is that qualitative research does not occur in a vacuum where “anything goes”.

2. There is a whole continuum of qualitative research styles and inquiry, with arts-based research inquiries sitting at one end of the qualitative continuum. Some arts-based projects and inquiries have had an impact on policy-making in, for example, the UK, and policy researchers are increasingly using these methods. Life history and narrative approaches might include only a small number of participants, but for good reasons, related to a host of factors including different disciplines, methodologies and philosophies of social science.

3. There is a lack of training in qualitative methods for graduate students. This is one point I do agree with, in terms of the need for training in both methodologies and methods. I’d also add training in the politics of the evidence-base, evaluation methods, and awareness of the impact agenda.

Final thoughts

Researchers have a responsibility not to make grand claims from their research, and we need standards for judging qualitative research. However, the hierarchy does not address these issues; it only exacerbates them. We can’t solve them by privileging studies which use “more” methods. The issue is more complex. Hierarchies of qualitative research, like those before them, home in on what their creators see as the low standards in the field (cf. Lumsden and Goode 2016) and risk “disciplining” qualitative research(ers) (Denzin, Lincoln and Giardina 2006).

We might also look to, and learn from, the debates and work previously conducted in fields such as health and medical research in terms of evaluating qualitative research for policy and judging what is “good” qualitative research. As Barbour (2001) writes in relation to “qualitative research checklists” in medical research: “Reducing qualitative research to a list of technical procedures … is overly prescriptive and results in ‘the tail wagging the dog’”. These checklists “can strengthen the rigour of qualitative research only if embedded in a broader understanding of qualitative research design and data analysis” (2001: 322, author’s emphasis).

Therefore, any practitioner-focused framework which aims to assess the rigour of qualitative research must attempt to be inclusive of a whole host of epistemological and ontological standpoints, related methodologies and methods. Transparency in how we conducted our research is key, but evidence-based policing also needs to be inclusive rather than exclusive, and not kick qualitative research in the teeth.

References and further reading
Avby G, Nilsen P and Dahlgren MA (2014) Ways of understanding evidence-based practice in social work: A qualitative study. British Journal of Social Work 44: 1366–1383.
Baker SE and Edwards R (2013) How many qualitative interviews Is enough? NCRM Review Paper. Accessed July 2019: http://eprints.ncrm.ac.uk/2273/4/how_many_interviews.pdf
Barbour RS (2001) Checklists for improving rigour in qualitative research: a case of the tail wagging the dog? British Medical Journal 322: 115–117.
Denzin NK, Lincoln YS and Giardina MD (2006) Disciplining qualitative research. International Journal of Qualitative Studies in Education 19(6): 769–782.
Dixon-Woods M, Bonas S, Booth A, et al. (2006) How can systematic reviews incorporate qualitative research? A critical perspective. Qualitative Research 6(1): 27–44.
Foucault M (1988) Technologies of the self. In: Martin LH, Gutman H and Hutton PH (eds) Technologies of the Self. Amherst, MA: University of Massachusetts Press, 16–49.
Gieryn TF (1983) Boundary-work and the demarcation of science from non-science: strains and interests in professional ideologies of scientists. American Sociological Review 48(6): 781–795.
Huey L (2019) If we’re going to use qualitative research for public policy, then we need better standards. 22 July 2019. Accessed July 2019: https://www.lhuey.net/single-post/2019/07/22/If-We’re-Going-to-Use-Qualitative-Research-for-Public-Policy-then-We-Need-to-Better-Standards
Lum C, Koper CS and Telep CW (2011) The evidence-based policing matrix. Journal of Experimental Criminology 7(1): 3–26.
Lumsden K (2016) Police officer and civilian staff receptivity to research and evidence-based policing in England: providing a contextual understanding through qualitative interviews. Policing: a Journal of Policy and Practice 11(2): 157-167.
Lumsden K and Goode JH (2016) Policing research and the rise of the evidence-base: police officer and staff understandings of research, its implementation and “what works”. Sociology 52(4): 813-829.
Lumsden K and Goode JE (2018) Public criminology, reflexivity and the enterprise university: experiences of research, knowledge transfer work and co-option with police forces. Theoretical Criminology 22(2): 243-257.
Murphy E, Dingwall R, Greatbatch D, Parker S and Watson P (1998) Qualitative research methods in health technology assessment: a review of the literature. Health Technology Assessment 2(16).
Oakley A (2002) Social science and evidence-based everything: the case of education. Educational Review 54(3): 277–286.
Styhre A (2011) Knowledge Sharing in Professions. Surrey: Gower.
Thacher D (2001) Policing is not a treatment. Journal of Research in Crime and Delinquency 38(4): 387-415.


Dr Karen Lumsden is based at the University of Nottingham in the UK. She is a sociologist and criminologist with expertise in qualitative research methods and applied research with a range of audiences including police constabularies and victim organisations. She has held posts at the University of Leicester, Loughborough University, the University of Abertay Dundee, and the University of Aberdeen. Karen has a PhD in Sociology, Masters in Social Research, MA in Sociology, and PGCE in Higher Education Learning & Teaching, all from the University of Aberdeen. Karen has experience of teaching qualitative research methods at postgraduate level and to academics and practitioners via the Social Research Association, her own consultancy (The Qualitative Researcher), and at international summer schools and ESRC doctoral training centres. She is the author of over 40 publications including four books, is on the Editorial Board of the journal Sociology, and is currently the Chair of the Editorial Board of Sociological Research Online. She tweets at @karenlumsden2


Why themes don’t ‘emerge’ from the data

Frequently, I read methods sections of articles or dissertations, or overhear students and academics commenting on how, in thematic analysis, their themes ‘emerged’ from the data. (I have probably been guilty of this myself in the past!)


The assumption that in thematic analysis it is simply the job of the researcher to grab these (pre-existing) themes out of an interview transcript or field notes ignores the labour that goes into qualitative data analysis, including the organisation of data, levels of coding, and the subsequent generation of themes.

Themes are constructed by the researcher/s, and are shaped and reshaped in the often cyclical process of analysis, interpretation, analysis, interpretation and so on…

Qualitative researchers are also storytellers, organising and structuring data via the stages of coding (whether deductive and/or inductive coding), organising these codes, and then constructing themes from these codes.

In this process the messy qualitative data is reorganised and a story is woven and constructed from the themes.

So, how should we refer to this process? Perhaps, instead of ‘emerging’ from the data, themes are generated, identified, and/or constructed by the researcher from the qualitative data.