• University of Memphis Libraries

Empirical Research: Defining, Identifying, & Finding


Calfee & Chambliss (2005) describe empirical research as a "systematic approach for answering certain types of questions." Those questions are answered "[t]hrough the collection of evidence under carefully defined and replicable conditions" (p. 43).

The evidence collected during empirical research is often referred to as "data." 

Characteristics of Empirical Research

Emerald Publishing's guide to conducting empirical research identifies a number of common elements to empirical research: 

  • A research question, which will determine research objectives.
  • A particular and planned design for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of primary data, which is then analysed.
  • A particular methodology for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample [emphasis added]: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to recreate the study and test the results. This is known as reliability.
  • The ability to generalize from the findings to a larger sample and to other situations.

If you see these elements in a research article, you can feel confident that you have found empirical research. Emerald's guide goes into more detail on each element. 

Empirical research methodologies can be described as quantitative, qualitative, or a mix of both (usually called mixed-methods).

Ruane (2016) gets at the basic differences in approach between quantitative and qualitative research:

  • Quantitative research -- an approach to documenting reality that relies heavily on numbers both for the measurement of variables and for data analysis (p. 33).
  • Qualitative research -- an approach to documenting reality that relies on words and images as the primary data source (p. 33).

Both quantitative and qualitative methods are empirical. If you can recognize that a research study is quantitative or qualitative, then you have also recognized that it is empirical.

Below is information on the characteristics of quantitative and qualitative research. A video from Scribbr (2019) also offers a good overall introduction to the two approaches to research methodology.

Characteristics of Quantitative Research 

Researchers test hypotheses, or theories, based on assumptions about causality, i.e., we expect variable X to cause variable Y. Variables have to be controlled as much as possible to ensure validity. The results explain the relationship between the variables. Measures are based on pre-defined instruments.

Examples: experimental or quasi-experimental design, pretest & post-test, survey or questionnaire with closed-ended questions. Also included are studies that identify factors that influence an outcome, assess the utility of an intervention, or seek to understand predictors of outcomes.

Characteristics of Qualitative Research

Researchers explore the “meaning individuals or groups ascribe to social or human problems” (Creswell & Creswell, 2018, p. 3). Questions and procedures emerge rather than being prescribed. Complexity, nuance, and individual meaning are valued. Research is both inductive and deductive. Data sources are multiple and varied, e.g., interviews, observations, documents, and photographs. The researcher is a key instrument and must reflect on how their background, culture, and experiences influence the research.

Examples: open question interviews and surveys, focus groups, case studies, grounded theory, ethnography, discourse analysis, narrative, phenomenology, participatory action research.

Calfee, R. C., & Chambliss, M. (2005). The design of empirical research. In J. Flood, D. Lapp, J. R. Squire, & J. Jensen (Eds.), Methods of research on teaching the English language arts: The methodology chapters from the handbook of research on teaching the English language arts (pp. 43-78). Routledge.

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Thousand Oaks: Sage.

How to... conduct empirical research. (n.d.). Emerald Publishing.

Scribbr. (2019). Quantitative vs. qualitative: The differences explained [Video]. YouTube.

Ruane, J. M. (2016). Introducing social research methods: Essentials for getting the edge. Wiley-Blackwell.

  • Last Updated: Jan 8, 2024 11:48 AM

Empirical evidence: A definition

Empirical evidence is information that is acquired by observation or experimentation.


Empirical evidence is information acquired by observation or experimentation. Scientists record and analyze this data. The process is a central part of the scientific method, leading to the proving or disproving of a hypothesis and, as a result, a better understanding of the world.

Empirical evidence might be obtained through experiments that seek to provide a measurable or observable reaction, trials that repeat an experiment to test its efficacy (such as a drug trial) or other forms of data gathering against which a hypothesis can be tested and reliably measured.

"If a statement is about something that is itself observable, then the empirical testing can be direct. We just have a look to see if it is true. For example, the statement, 'The litmus paper is pink', is subject to direct empirical testing," wrote Peter Kosso in "A Summary of Scientific Method" (Springer, 2011).

"Science is most interesting and most useful to us when it is describing the unobservable things like atoms, germs, black holes, gravity, the process of evolution as it happened in the past, and so on," wrote Kosso. Scientific theories, meaning theories about nature that are unobservable, cannot be proven by direct empirical testing, but they can be tested indirectly, according to Kosso. "The nature of this indirect evidence, and the logical relation between evidence and theory, are the crux of scientific method," wrote Kosso.

The scientific method begins with scientists forming questions, or hypotheses, and then acquiring the knowledge through observations and experiments to either support or disprove a specific theory. "Empirical" means "based on observation or experience," according to the Merriam-Webster Dictionary. Empirical research is the process of finding empirical evidence. Empirical data is the information that comes from the research.

Before any pieces of empirical data are collected, scientists carefully design their research methods to ensure the accuracy, quality and integrity of the data. If there are flaws in the way that empirical data is collected, the research will not be considered valid.

The scientific method often involves lab experiments that are repeated over and over, and these experiments result in quantitative data in the form of numbers and statistics. However, that is not the only process used for gathering information to support or refute a theory. 

This methodology mostly applies to the natural sciences. "The role of empirical experimentation and observation is negligible in mathematics compared to natural sciences such as psychology, biology or physics," wrote Mark Chang, an adjunct professor at Boston University, in "Principles of Scientific Methods" (Chapman and Hall, 2017).

"Empirical evidence includes measurements or data collected through direct observation or experimentation," said Jaime Tanner, a professor of biology at Marlboro College in Vermont. There are two research methods used to gather empirical measurements and data: qualitative and quantitative.

Qualitative research, often used in the social sciences, examines the reasons behind human behavior, according to the National Center for Biotechnology Information (NCBI). It involves data that can be found using the human senses. This type of research is often done at the beginning of an experiment. "When combined with quantitative measures, qualitative study can give a better understanding of health related issues," wrote Dr. Sanjay Kalra for NCBI.

Quantitative research involves methods that are used to collect numerical data and analyze it using statistical methods. "Quantitative research methods emphasize objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques," according to LeTourneau University. This type of research is often used at the end of an experiment to refine and test the previous research.


Identifying empirical evidence in another researcher's experiments can sometimes be difficult. According to the Pennsylvania State University Libraries, there are some things one can look for when determining if evidence is empirical:

  • Can the experiment be recreated and tested?
  • Does the experiment have a statement about the methodology, tools and controls used?
  • Is there a definition of the group or phenomena being studied?

The objective of science is that all empirical data that has been gathered through observation, experience and experimentation is without bias. The strength of any scientific research depends on the ability to gather and analyze empirical data in the most unbiased and controlled fashion possible. 

However, in the 1960s, scientific historian and philosopher Thomas Kuhn promoted the idea that scientists can be influenced by prior beliefs and experiences, according to the Center for the Study of Language and Information.


"Missing observations or incomplete data can also cause bias in data analysis, especially when the missing mechanism is not random," wrote Chang.
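Chang's point about non-random missingness can be illustrated with a small numeric sketch. The scores below are made-up values, and both missingness patterns are assumptions chosen only for illustration: one drops responses in a way unrelated to the score, the other drops every low score.

```python
from statistics import mean

# Hypothetical survey scores for ten respondents (made-up data).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Missingness unrelated to the value: an arbitrary set of respondents
# (assumed here) fails to answer, with low and high scores lost alike.
dropped_unrelated = {2, 5, 6, 9}
observed_unrelated = [s for s in scores if s not in dropped_unrelated]

# Missingness that depends on the value: everyone scoring 4 or below
# fails to answer (an assumed mechanism, not taken from the source).
observed_nonrandom = [s for s in scores if s > 4]

full_mean = mean(scores)
mean_unrelated = mean(observed_unrelated)
mean_nonrandom = mean(observed_nonrandom)

print(full_mean, mean_unrelated, mean_nonrandom)
```

In this toy case the first pattern leaves the average unchanged, while the second shifts it upward, because the missing values are systematically the low ones.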

Because scientists are human and prone to error, empirical data is often gathered by multiple scientists who independently replicate experiments. This also guards against scientists who unconsciously, or in rare cases consciously, veer from the prescribed research parameters, which could skew the results.

The recording of empirical data is also crucial to the scientific method, as science can only be advanced if data is shared and analyzed. Peer review of empirical data is essential to protect against bad science, according to the University of California.

Empirical laws and scientific laws are often the same thing. "Laws are descriptions — often mathematical descriptions — of natural phenomenon," Peter Coppinger, associate professor of biology and biomedical engineering at the Rose-Hulman Institute of Technology, told Live Science. 

Empirical laws are scientific laws that can be proven or disproved using observations or experiments, according to the Merriam-Webster Dictionary. So, as long as a scientific law can be tested using experiments or observations, it is considered an empirical law.

Empirical, anecdotal and logical evidence should not be confused. They are separate types of evidence that can be used to try to prove or disprove an idea or claim.

Logical evidence is used to prove or disprove an idea using logic. Deductive reasoning may be used to come to a conclusion to provide logical evidence. For example, "All men are mortal. Harold is a man. Therefore, Harold is mortal."

Anecdotal evidence consists of stories that have been experienced by a person that are told to prove or disprove a point. For example, many people have told stories about their alien abductions to prove that aliens exist. Often, a person's anecdotal evidence cannot be proven or disproven. 

There are some things in nature that science is still working to build evidence for, such as the hunt to explain consciousness.

Meanwhile, in other scientific fields, efforts are still being made to improve research methods, such as the plan by some psychologists to fix the science of psychology.

"A Summary of Scientific Method" by Peter Kosso (Springer, 2011)

"Empirical," Merriam-Webster Dictionary

"Principles of Scientific Methods" by Mark Chang (Chapman and Hall, 2017)

"Qualitative research" by Dr. Sanjay Kalra, National Center for Biotechnology Information (NCBI)

"Quantitative Research and Analysis: Quantitative Methods Overview," LeTourneau University

"Empirical Research in the Social Sciences and Education," Pennsylvania State University Libraries

"Thomas Kuhn," Center for the Study of Language and Information

"Misconceptions about science," University of California


Alina Bradford


Purdue University


Research: Overview & Approaches


Introduction to Empirical Research


  • Introductory Video: This video covers what empirical research is, what kinds of questions and methods empirical researchers use, and some tips for finding empirical research articles in your discipline.

Help Resources

  • Guided Search: Finding Empirical Research Articles. This is a hands-on tutorial that lets you use your own search terms to find resources.

Examples of Empirical Research

  • Study on radiation transfer in human skin for cosmetics
  • Long-Term Mobile Phone Use and the Risk of Vestibular Schwannoma: A Danish Nationwide Cohort Study
  • Emissions Impacts and Benefits of Plug-In Hybrid Electric Vehicles and Vehicle-to-Grid Services
  • Review of design considerations and technological challenges for successful development and deployment of plug-in hybrid electric vehicles
  • Endocrine disrupters and human health: could oestrogenic chemicals in body care cosmetics adversely affect breast cancer incidence in women?

  • Last Updated: Mar 26, 2024 11:38 AM

Penn State University Libraries

Empirical research in the social sciences and education.


Introduction: What is Empirical Research?

Empirical research is based on observed and measured phenomena and derives knowledge from actual experience rather than from theory or belief. 

How do you know if a study is empirical? Read the subheadings within the article, book, or report and look for a description of the research "methodology."  Ask yourself: Could I recreate this study and test these results?

Key characteristics to look for:

  • Specific research questions to be answered
  • Definition of the population, behavior, or phenomena being studied
  • Description of the process used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys)

Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have four components:

  • Introduction: sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous studies
  • Methodology: sometimes called "research design" -- how to recreate the study -- usually describes the population, research process, and analytical tools used in the present study
  • Results: sometimes called "findings" -- what was learned through the study -- usually appears as statistical data or as substantial quotations from research participants
  • Discussion: sometimes called "conclusion" or "implications" -- why the study is important -- usually describes how the research results influence professional practices or future studies
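The heading check described above can be sketched as a small script. This is only a rough heuristic, not a method from the guide: the function names `imrad_coverage` and `looks_like_imrad` are hypothetical, the alternate section names come from the list above, and simple substring matching is an assumption that will miss unusual headings.

```python
# Alternate names for each IMRaD component, taken from the list above.
# "method" is used as a stem so that "Methods" and "Methodology" both match
# (an assumed matching rule, chosen for simplicity).
IMRAD_SECTIONS = {
    "introduction": ("introduction", "literature review"),
    "methodology": ("method", "research design"),
    "results": ("results", "findings"),
    "discussion": ("discussion", "conclusion", "implications"),
}


def imrad_coverage(headings):
    """Map each IMRaD component to True if any heading mentions one of its names."""
    lowered = [h.lower() for h in headings]
    return {
        component: any(name in h for h in lowered for name in names)
        for component, names in IMRAD_SECTIONS.items()
    }


def looks_like_imrad(headings):
    """Return True only if all four components appear among the headings."""
    return all(imrad_coverage(headings).values())


# Hypothetical subheadings copied from an article being screened.
article_headings = [
    "Introduction",
    "Research Design",
    "Findings",
    "Implications for Practice",
]
print(looks_like_imrad(article_headings))
```

A match suggests the article is worth a closer look; it does not replace reading the methodology section and asking whether the study could be recreated.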

Reading and Evaluating Scholarly Materials

Reading research can be a challenge. However, the tutorials and videos below can help. They explain what scholarly articles look like, how to read them, and how to evaluate them:

  • CRAAP Checklist: A frequently used checklist that helps you examine the currency, relevance, authority, accuracy, and purpose of an information source.
  • IF I APPLY: A newer model for evaluating sources that encourages you to think about your own biases as a reader, as well as concerns about the item you are reading.
  • Credo Video: How to Read Scholarly Materials (4 min.)
  • Credo Tutorial: How to Read Scholarly Materials
  • Credo Tutorial: Evaluating Information
  • Credo Video: Evaluating Statistics (4 min.)
  • Last Updated: Feb 18, 2024 8:33 PM

Stanford Encyclopedia of Philosophy


Theory and Observation in Science

Scientists obtain a great deal of the evidence they use by collecting and producing empirical results. Much of the standard philosophical literature on this subject comes from 20th-century logical empiricists, their followers, and critics who embraced their issues while objecting to some of their aims and assumptions. Discussions about empirical evidence have tended to focus on epistemological questions regarding its role in theory testing. This entry follows that precedent, even though empirical evidence also plays important and philosophically interesting roles in other areas including scientific discovery, the development of experimental tools and techniques, and the application of scientific theories to practical problems.

The logical empiricists and their followers devoted much of their attention to the distinction between observables and unobservables, the form and content of observation reports, and the epistemic bearing of observational evidence on theories it is used to evaluate. Philosophical work in this tradition was characterized by the aim of conceptually separating theory and observation, so that observation could serve as the pure basis of theory appraisal. More recently, the focus of the philosophical literature has shifted away from these issues, and their close association to the languages and logics of science, to investigations of how empirical data are generated, analyzed, and used in practice. With this shift, we also see philosophers largely setting aside the aspiration of a pure observational basis for scientific knowledge and instead embracing a view of science in which the theoretical and empirical are usefully intertwined. This entry discusses these topics under the following headings:

1. Introduction


Philosophers of science have traditionally recognized a special role for observations in the epistemology of science. Observations are the conduit through which the ‘tribunal of experience’ delivers its verdicts on scientific hypotheses and theories. The evidential value of an observation has been assumed to depend on how sensitive it is to whatever it is used to study. But this in turn depends on the adequacy of any theoretical claims its sensitivity may depend on. For example, we can challenge the use of a particular thermometer reading to support a prediction of a patient’s temperature by challenging theoretical claims having to do with whether a reading from a thermometer like this one, applied in the same way under similar conditions, should indicate the patient’s temperature well enough to count in favor of or against the prediction. At least some of those theoretical claims will be such that regardless of whether an investigator explicitly endorses, or is even aware of them, her use of the thermometer reading would be undermined by their falsity. All observations and uses of observational evidence are theory laden in this sense (cf. Chang 2005, Azzouni 2004). As the example of the thermometer illustrates, analogues of Norwood Hanson’s claim that seeing is a theory laden undertaking apply just as well to equipment generated observations (Hanson 1958, 19). But if all observations and empirical data are theory laden, how can they provide reality-based, objective epistemic constraints on scientific reasoning?

Recent scholarship has turned this question on its head. Why think that theory ladenness of empirical results would be problematic in the first place? If the theoretical assumptions with which the results are imbued are correct, what is the harm of it? After all, it is in virtue of those assumptions that the fruits of empirical investigation can be ‘put in touch’ with theorizing at all. A number scribbled in a lab notebook can do a scientist little epistemic good unless she can recruit the relevant background assumptions to even recognize it as a reading of the patient’s temperature. But philosophers have embraced an entangled picture of the theoretical and empirical that goes much deeper than this. Lloyd (2012) advocates for what she calls “complex empiricism” in which there is “no pristine separation of model and data” (397). Bogen (2016) points out that “impure empirical evidence” (i.e. evidence that incorporates the judgements of scientists) “often tells us more about the world than it could have if it were pure” (784). Indeed, Longino (2020) has urged that “[t]he naïve fantasy that data have an immediate relation to phenomena of the world, that they are ‘objective’ in some strong, ontological sense of that term, that they are the facts of the world directly speaking to us, should be finally laid to rest” and that “even the primary, original, state of data is not free from researchers’ value- and theory-laden selection and organization” (391).

There is not widespread agreement among philosophers of science about how to characterize the nature of scientific theories. What is a theory? According to the traditional syntactic view, theories are considered to be collections of sentences couched in logical language, which must then be supplemented with correspondence rules in order to be interpreted. Construed in this way, theories include maximally general explanatory and predictive laws (Coulomb’s law of electrical attraction and repulsion, and Maxwellian electromagnetism equations for example), along with lesser generalizations that describe more limited natural and experimental phenomena (e.g., the ideal gas equations describing relations between temperatures and pressures of enclosed gasses, and general descriptions of positional astronomical regularities). In contrast, the semantic view casts theories as the space of states possible according to the theory, or the set of mathematical models permissible according to the theory (see Suppe 1977). However, there are also significantly more ecumenical interpretations of what it means to be a scientific theory, which include elements of diverse kinds. To take just one illustrative example, Borrelli (2012) characterizes the Standard Model of particle physics as a theoretical framework involving what she calls “theoretical cores” that are composed of mathematical structures, verbal stories, and analogies with empirical references mixed together (196). This entry aims to accommodate all of these views about the nature of scientific theories.

In this entry, we trace the contours of traditional philosophical engagement with questions surrounding theory and observation in science that attempted to segregate the theoretical from the observational, and to cleanly delineate between the observable and the unobservable. We also discuss the more recent scholarship that supplants the primacy of observation by human sensory perception with an instrument-inclusive conception of data production and that embraces the intertwining of theoretical and empirical in the production of useful scientific results. Although theory testing dominates much of the standard philosophical literature on observation, much of what this entry says about the role of observation in theory testing applies also to its role in inventing, and modifying theories, and applying them to tasks in engineering, medicine, and other practical enterprises.

2. Observation and data

Reasoning from observations has been important to scientific practice at least since the time of Aristotle, who mentions a number of sources of observational evidence including animal dissection (Aristotle(a), 763a/30–b/15; Aristotle(b), 511b/20–25). Francis Bacon argued long ago that the best way to discover things about nature is to use experiences (his term for observations as well as experimental results) to develop and improve scientific theories (Bacon 1620, 49ff). The role of observational evidence in scientific discovery was an important topic for Whewell (1858) and Mill (1872) among others in the 19th century. But philosophers didn’t talk about observation as extensively, in as much detail, or in the way we have become accustomed to, until the 20th century when logical empiricists transformed philosophical thinking about it.

One important transformation, characteristic of the linguistic turn in philosophy, was to concentrate on the logic of observation reports rather than on objects or phenomena observed. This focus made sense on the assumption that a scientific theory is a system of sentences or sentence-like structures (propositions, statements, claims, and so on) to be tested by comparison to observational evidence. It was assumed that the comparisons must be understood in terms of inferential relations. If inferential relations hold only between sentence-like structures, it follows that theories must be tested, not against observations or things observed, but against sentences, propositions, etc. used to report observations (Hempel 1935, 50–51; Schlick 1935). Theory testing was treated as a matter of comparing observation sentences describing observations made in natural or laboratory settings to observation sentences that should be true according to the theory to be tested. This was to be accomplished by using laws or lawlike generalizations along with descriptions of initial conditions, correspondence rules, and auxiliary hypotheses to derive observation sentences describing the sensory deliverances of interest. This makes it imperative to ask what observation sentences report.

According to what Hempel called the phenomenalist account, observation reports describe the observer’s subjective perceptual experiences.

… Such experiential data might be conceived of as being sensations, perceptions, and similar phenomena of immediate experience. (Hempel 1952, 674)

This view is motivated by the assumption that the epistemic value of an observation report depends upon its truth or accuracy, and that with regard to perception, the only thing observers can know with certainty to be true or accurate is how things appear to them. This means that we cannot be confident that observation reports are true or accurate if they describe anything beyond the observer’s own perceptual experience. Presumably one’s confidence in a conclusion should not exceed one’s confidence in one’s best reasons to believe it. For the phenomenalist, it follows that reports of subjective experience can provide better reasons to believe claims they support than reports of other kinds of evidence.

However, given the expressive limitations of the language available for reporting subjective experiences, we cannot expect phenomenalistic reports to be precise and unambiguous enough to test theoretical claims whose evaluation requires accurate, fine-grained perceptual discriminations. Worse yet, if experiences are directly available only to those who have them, there is room to doubt whether different people can understand the same observation sentence in the same way. Suppose you had to evaluate a claim on the basis of someone else’s subjective report of how a litmus solution looked to her when she dripped a liquid of unknown acidity into it. How could you decide whether her visual experience was the same as the one you would use her words to report?

Such considerations led Hempel to propose, contrary to the phenomenalists, that observation sentences report ‘directly observable’, ‘intersubjectively ascertainable’ facts about physical objects

… such as the coincidence of the pointer of an instrument with a numbered mark on a dial; a change of color in a test substance or in the skin of a patient; the clicking of an amplifier connected with a Geiger counter; etc. (ibid.)

That the facts expressed in observation reports be intersubjectively ascertainable was critical for the aims of the logical empiricists. They hoped to articulate and explain the authoritativeness widely conceded to the best natural, social, and behavioral scientific theories in contrast to propaganda and pseudoscience. Some pronouncements from astrologers and medical quacks gain wide acceptance, as do those of religious leaders who rest their cases on faith or personal revelation, and leaders who use their political power to secure assent. But such claims do not enjoy the kind of credibility that scientific theories can attain. The logical empiricists tried to account for the genuine credibility of scientific theories by appeal to the objectivity and accessibility of observation reports, and the logic of theory testing. Part of what they meant by calling observational evidence objective was that cultural and ethnic factors have no bearing on what can validly be inferred about the merits of a theory from observation reports. So conceived, objectivity was important to the logical empiricists’ criticism of the Nazi idea that Jews and Aryans have fundamentally different thought processes such that physical theories suitable for Einstein and his kind should not be inflicted on German students. In response to this rationale for ethnic and cultural purging of the German educational system, the logical empiricists argued that because of its objectivity, observational evidence (rather than ethnic and cultural factors) should be used to evaluate scientific theories (Galison 1990). In this way of thinking, observational evidence and its subsequent bearing on scientific theories are objective also in virtue of being free of non-epistemic values.

Ensuing generations of philosophers of science have found the logical empiricist focus on expressing the content of observations in a rarefied and basic observation language too narrow. Search for a suitably universal language as required by the logical empiricist program has come up empty-handed and most philosophers of science have given up its pursuit. Moreover, as we will discuss in the following section, the centrality of observation itself (and pointer readings) to the aims of empiricism in philosophy of science has also come under scrutiny. However, leaving the search for a universal pure observation language behind does not automatically undercut the norm of objectivity as it relates to the social, political, and cultural contexts of scientific research. Pristine logical foundations aside, the objectivity of ‘neutral’ observations in the face of noxious political propaganda was appealing because it could serve as shared ground available for intersubjective appraisal. This appeal remains alive and well today, particularly as pernicious misinformation campaigns are again formidable in public discourse (see O’Connor and Weatherall 2019). If individuals can genuinely appraise the significance of empirical evidence and come to well-justified agreement about how the evidence bears on theorizing, then they can protect their epistemic deliberations from the undue influence of fascists and other nefarious manipulators. However, this aspiration must face subtleties arising from the social epistemology of science and from the nature of empirical results themselves. In practice, the appraisal of scientific results can often require expertise that is not readily accessible to members of the public without the relevant specialized training. Additionally, precisely because empirical results are not pure observation reports, their appraisal across communities of inquirers operating with different background assumptions can require significant epistemic work.

The logical empiricists paid little attention to the distinction between observing and experimenting and its epistemic implications. For some philosophers, to experiment is to isolate, prepare, and manipulate things in hopes of producing epistemically useful evidence. It had been customary to think of observing as noticing and attending to interesting details of things perceived under more or less natural conditions, or by extension, things perceived during the course of an experiment. To look at a berry on a vine and attend to its color and shape would be to observe it. To extract its juice and apply reagents to test for the presence of copper compounds would be to perform an experiment. By now, many philosophers have argued that contrivance and manipulation influence epistemically significant features of observable experimental results to such an extent that epistemologists ignore them at their peril. Robert Boyle (1661), John Herschel (1830), Bruno Latour and Steve Woolgar (1979), Ian Hacking (1983), Harry Collins (1985), Allan Franklin (1986), Peter Galison (1987), Jim Bogen and Jim Woodward (1988), and Hans-Jörg Rheinberger (1997) are some of the philosophers and philosophically-minded scientists, historians, and sociologists of science who gave serious consideration to the distinction between observing and experimenting. The logical empiricists tended to ignore it. Interestingly, the contemporary vantage point that attends to modeling, data processing, and empirical results may suggest a re-unification of observation and intervention under the same epistemological framework. When one no longer thinks of scientific observation as pure or direct, and recognizes the power of good modeling to account for confounds without physically intervening on the target system, the purported epistemic distinction between observation and intervention loses its bite.

Observers use magnifying glasses, microscopes, or telescopes to see things that are too small or far away to be seen, or seen clearly enough, without them. Similarly, amplification devices are used to hear faint sounds. But if to observe something is to perceive it, not every use of instruments to augment the senses qualifies as observational.

Philosophers generally agree that you can observe the moons of Jupiter with a telescope, or a heartbeat with a stethoscope. The van Fraassen of The Scientific Image is a notable exception, for whom to be ‘observable’ meant to be something that, were it present to a creature like us, would be observed. Thus, for van Fraassen, the moons of Jupiter are observable “since astronauts will no doubt be able to see them as well from close up” (1980, 16). In contrast, microscopic entities are not observable on van Fraassen’s account because creatures like us cannot strategically maneuver ourselves to see them, present before us, with our unaided senses.

Many philosophers have criticized van Fraassen’s view as overly restrictive. Nevertheless, philosophers differ in their willingness to draw the line between what counts as observable and what does not along the spectrum of increasingly complicated instrumentation. Many philosophers who don’t mind telescopes and microscopes still find it unnatural to say that high energy physicists ‘observe’ particles or particle interactions when they look at bubble chamber photographs—let alone digital visualizations of energy depositions left in calorimeters that are not themselves inspected. Their intuitions come from the plausible assumption that one can observe only what one can see by looking, hear by listening, feel by touching, and so on. Investigators can neither look at (direct their gazes toward and attend to) nor visually experience charged particles moving through a detector. Instead they can look at and see tracks in the chamber, in bubble chamber photographs, calorimeter data visualizations, etc.

In more contentious examples, some philosophers have moved to speaking of instrument-augmented empirical research as more like tool use than sensing. Hacking (1981) argues that we do not see through a microscope, but rather with it. Daston and Galison (2007) highlight the inherent interactivity of a scanning tunneling microscope, in which scientists image and manipulate atoms by exchanging electrons between the sharp tip of the microscope and the surface to be imaged (397). Others have opted to stretch the meaning of observation to accommodate what we might otherwise be tempted to call instrument-aided detections. For instance, Shapere (1982) argues that while it may initially strike philosophers as counter-intuitive, it makes perfect sense to call the detection of neutrinos from the interior of the sun “direct observation.”

The variety of views on the observable/unobservable distinction hints that empiricists may have been barking up the wrong philosophical tree. Many of the things scientists investigate do not interact with human perceptual systems as required to produce perceptual experiences of them. The methods investigators use to study such things argue against the idea, however plausible it may once have seemed, that scientists do or should rely exclusively on their perceptual systems to obtain the evidence they need. Thus Feyerabend proposed as a thought experiment that if measuring equipment were rigged up to register the magnitude of a quantity of interest, a theory could be tested just as well against its outputs as against records of human perceptions (Feyerabend 1969, 132–137). Feyerabend could have made his point with historical examples instead of thought experiments. A century earlier Helmholtz estimated the speed of excitatory impulses traveling through a motor nerve. To initiate impulses whose speed could be estimated, he implanted an electrode into one end of a nerve fiber and ran a current into it from a coil. The other end was attached to a bit of muscle whose contraction signaled the arrival of the impulse. To find out how long it took the impulse to reach the muscle he had to know when the stimulating current reached the nerve. But

[o]ur senses are not capable of directly perceiving an individual moment of time with such small duration …

and so Helmholtz had to resort to what he called ‘artificial methods of observation’ (Olesko and Holmes 1994, 84). This meant arranging things so that current from the coil could deflect a galvanometer needle. Assuming that the magnitude of the deflection is proportional to the duration of current passing from the coil, Helmholtz could use the deflection to estimate the duration he could not see (ibid.). This sense of ‘artificial observation’ is not to be confused with, e.g., using magnifying glasses or telescopes to see tiny or distant objects. Such devices enable the observer to scrutinize visible objects. The minuscule duration of the current flow is not a visible object. Helmholtz studied it by cleverly concocting circumstances so that the deflection of the needle would meaningfully convey the information he needed. Hooke (1705, 16–17) argued for and designed instruments to execute the same kind of strategy in the 17th century.

It is of interest that records of perceptual observation are not always epistemically superior to data collected via experimental equipment. Indeed, it is not unusual for investigators to use non-perceptual evidence to evaluate perceptual data and correct for its errors. For example, Rutherford and Pettersson conducted similar experiments to find out if certain elements disintegrated to emit charged particles under radioactive bombardment. To detect emissions, observers watched a scintillation screen for faint flashes produced by particle strikes. Pettersson’s assistants reported seeing flashes from silicon and certain other elements. Rutherford’s did not. Rutherford’s colleague, James Chadwick, visited Pettersson’s laboratory to evaluate his data. Instead of watching the screen and checking Pettersson’s data against what he saw, Chadwick arranged to have Pettersson’s assistants watch the screen while unbeknownst to them he manipulated the equipment, alternating normal operating conditions with a condition in which particles, if any, could not hit the screen. Pettersson’s data were discredited by the fact that his assistants reported flashes at close to the same rate in both conditions (Stuewer 1985, 284–288).

When the process of producing data is relatively convoluted, it is even easier to see that human sense perception is not the ultimate epistemic engine. Consider functional magnetic resonance images (fMRI) of the brain decorated with colors to indicate magnitudes of electrical activity in different regions during the performance of a cognitive task. To produce these images, brief magnetic pulses are applied to the subject’s brain. The magnetic force coordinates the precessions of protons in hemoglobin and other bodily stuffs to make them emit radio signals strong enough for the equipment to respond to. When the magnetic force is relaxed, the signals from protons in highly oxygenated hemoglobin deteriorate at a detectably different rate than signals from blood that carries less oxygen. Elaborate algorithms are applied to radio signal records to estimate blood oxygen levels at the places from which the signals are calculated to have originated. There is good reason to believe that blood flowing just downstream from spiking neurons carries appreciably more oxygen than blood in the vicinity of resting neurons. Assumptions about the relevant spatial and temporal relations are used to estimate levels of electrical activity in small regions of the brain corresponding to pixels in the finished image. The results of all of these computations are used to assign the appropriate colors to pixels in a computer generated image of the brain. In view of all of this, functional brain imaging differs, e.g., from looking and seeing, photographing, and measuring with a thermometer or a galvanometer in ways that make it uninformative to call it observation. And similarly for many other methods scientists use to produce non-perceptual evidence.

The role of the senses in fMRI data production is limited to such things as monitoring the equipment and keeping an eye on the subject. Their epistemic role is limited to discriminating the colors in the finished image, reading tables of numbers the computer used to assign them, and so on. While it is true that researchers typically use their sense of sight to take in visualizations of processed fMRI data—or numbers on a page or screen for that matter—this is not the primary locus of epistemic action. Researchers learn about brain processes through fMRI data, to the extent that they do, primarily in virtue of the suitability of the causal connection between the target processes and the data records, and of the transformations those data undergo when they are processed into the maps or other results that scientists want to use. The interesting questions are not about observability, i.e. whether neuronal activity, blood oxygen levels, proton precessions, radio signals, and so on, are properly understood as observable by creatures like us. The epistemic significance of the fMRI data depends on their delivering us the right sort of access to the target, but observation is neither necessary nor sufficient for that access.

Following Shapere (1982), one could respond by adopting an extremely permissive view of what counts as an ‘observation’ so as to allow even highly processed data to count as observations. However, it is hard to reconcile the idea that highly processed data like fMRI images record observations with the traditional empiricist notion that calculations involving theoretical assumptions and background beliefs must not be allowed (on pain of loss of objectivity) to intrude into the process of data production. Observation garnered its special epistemic status in the first place because it seemed more direct, more immediate, and therefore less distorted and muddled than (say) detection or inference. The production of fMRI images requires extensive statistical manipulation based on theories about the radio signals, and a variety of factors having to do with their detection along with beliefs about relations between blood oxygen levels and neuronal activity, sources of systematic error, and more. Insofar as the use of the term ‘observation’ connotes this extra baggage of traditional empiricism, it may be better to replace observation-talk with terminology that is more obviously permissive, such as that of ‘empirical data’ and ‘empirical results.’

Deposing observation from its traditional perch in empiricist epistemologies of science need not estrange philosophers from scientific practice. Terms like ‘observation’ and ‘observation reports’ do not occur nearly as much in scientific as in philosophical writings. In their place, working scientists tend to talk about data. Philosophers who adopt this usage are free to think about standard examples of observation as members of a large, diverse, and growing family of data production methods. Instead of trying to decide which methods to classify as observational and which things qualify as observables, philosophers can then concentrate on the epistemic influence of the factors that differentiate members of the family. In particular, they can focus their attention on what questions data produced by a given method can be used to answer, what must be done to use that data fruitfully, and the credibility of the answers they afford (Bogen 2016).

Satisfactorily answering such questions warrants further philosophical work. As Bogen and Woodward (1988) have argued, there is often a long road from obtaining a particular dataset, replete with idiosyncrasies born of unspecified causal nuances, to any claim about the phenomenon ultimately of interest to the researchers. Empirical data are typically produced in ways that make it impossible to predict them from the generalizations they are used to test, or to derive instances of those generalizations from data and non ad hoc auxiliary hypotheses. Indeed, it is unusual for many members of a set of reasonably precise quantitative data to agree with one another, let alone with a quantitative prediction. That is because precise, publicly accessible data typically cannot be produced except through processes whose results reflect the influence of causal factors that are too numerous, too different in kind, and too irregular in behavior for any single theory to account for them. When Bernard Katz recorded electrical activity in nerve fiber preparations, the numerical values of his data were influenced by factors peculiar to the operation of his galvanometers and other pieces of equipment, variations among the positions of the stimulating and recording electrodes that had to be inserted into the nerve, the physiological effects of their insertion, and changes in the condition of the nerve as it deteriorated during the course of the experiment. There were variations in the investigators’ handling of the equipment. Vibrations shook the equipment in response to a variety of irregularly occurring causes ranging from random error sources to the heavy tread of Katz’s teacher, A.V. Hill, walking up and down the stairs outside of the laboratory. That’s a short list. To make matters worse, many of these factors influenced the data as parts of irregularly occurring, transient, and shifting assemblies of causal influences.

The effects of systematic and random sources of error are typically such that considerable analysis and interpretation are required to take investigators from data sets to conclusions that can be used to evaluate theoretical claims. Interestingly, this applies as much to clear cases of perceptual data as to machine produced records. When 19th and early 20th century astronomers looked through telescopes and pushed buttons to record the time at which they saw a star pass a crosshair, the values of their data points depended, not only upon light from that star, but also upon features of perceptual processes, reaction times, and other psychological factors that varied from observer to observer. No astronomical theory has the resources to take such things into account.

Instead of testing theoretical claims by direct comparison to the data initially collected, investigators use data to infer facts about phenomena, i.e., events, regularities, processes, etc. whose instances are uniform and uncomplicated enough to make them susceptible to systematic prediction and explanation (Bogen and Woodward 1988, 317). The fact that lead melts at temperatures at or close to 327.5 °C is an example of a phenomenon, as are widespread regularities among electrical quantities involved in the action potential, the motions of astronomical bodies, etc. Theories that cannot be expected to predict or explain such things as individual temperature readings can nevertheless be evaluated on the basis of how useful they are in predicting or explaining phenomena. The same holds for the action potential as opposed to the electrical data from which its features are calculated, and the motions of astronomical bodies in contrast to the data of observational astronomy. It is reasonable to ask a genetic theory how probable it is (given similar upbringings in similar environments) that the offspring of a parent or parents diagnosed with alcohol use disorder will develop one or more symptoms the DSM classifies as indicative of alcohol use disorder. But it would be quite unreasonable to ask the genetic theory to predict or explain one patient’s numerical score on one trial of a particular diagnostic test, or why a diagnostician wrote a particular entry in her report of an interview with an offspring of one such parent (see Bogen and Woodward 1988, 319–326).

Leonelli has challenged Bogen and Woodward’s (1988) claim that data are, as she puts it, “unavoidably embedded in one experimental context” (2009, 738). She argues that when data are suitably packaged, they can travel to new epistemic contexts and retain epistemic utility—it is not just claims about the phenomena that can travel, data travel too. Preparing data for safe travel involves work, and by tracing data ‘journeys,’ philosophers can learn about how the careful labor of researchers, data archivists, and database curators can facilitate useful data mobility. While Leonelli’s own work has often focused on data in biology, Leonelli and Tempini (2020) contains many diverse case studies of data journeys from a variety of scientific disciplines that will be of value to philosophers interested in the methodology and epistemology of science in practice.

The fact that theories typically predict and explain features of phenomena rather than idiosyncratic data should not be interpreted as a failing. For many purposes, this is the more useful and illuminating capacity. Suppose you could choose between a theory that predicted or explained the way in which neurotransmitter release relates to neuronal spiking (e.g., the fact that on average, transmitters are released roughly once for every 10 spikes) and a theory which explained or predicted the numbers displayed on the relevant experimental equipment in one, or a few single cases. For most purposes, the former theory would be preferable to the latter at the very least because it applies to so many more cases. And similarly for theories that predict or explain something about the probability of alcohol use disorder conditional on some genetic factor or a theory that predicted or explained the probability of faulty diagnoses of alcohol use disorder conditional on facts about the training that psychiatrists receive. For most purposes, these would be preferable to a theory that predicted specific descriptions in a single particular case history.

However, there are circumstances in which scientists do want to explain data. In empirical research it is often crucial to getting a useful signal that scientists deal with sources of background noise and confounding signals. This is part of the long road from newly collected data to useful empirical results. An important step on the way to eliminating unwanted noise or confounds is to determine their sources. Different sources of noise can have different characteristics that can be derived from and explained by theory. Consider the difference between ‘shot noise’ and ‘thermal noise,’ two ubiquitous sources of noise in precision electronics (Schottky 1918; Nyquist 1928; Horowitz and Hill 2015). ‘Shot noise’ arises in virtue of the discrete nature of a signal. For instance, light collected by a detector does not arrive all at once or in perfectly continuous fashion. Photons rain onto a detector shot by shot on account of being quanta. Imagine building up an image one photon at a time—at first the structure of the image is barely recognizable, but after the arrival of many photons, the image eventually fills in. In fact, the contribution of noise of this type goes as the square root of the signal. By contrast, thermal noise is due to non-zero temperature—thermal fluctuations cause a small current to flow in any circuit. If you cool your instrument (which very many precision experiments in physics do) then you can decrease thermal noise. Cooling the detector is not going to change the quantum nature of photons though. Simply collecting more photons will improve the signal to noise ratio with respect to shot noise. Thus, determining what kind of noise is affecting one’s data, i.e. explaining features of the data themselves that are idiosyncratic to the particular instruments and conditions prevailing during a specific instance of data collection, can be critical to eventually generating a dataset that can be used to answer questions about phenomena of interest. 
In using data that require statistical analysis, it is particularly clear that “empirical assumptions about the factors influencing the measurement results may be used to motivate the assumption of a particular error distribution”, which can be crucial for justifying the application of methods of analysis (Woodward 2011, 173).
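The contrast between shot noise and thermal noise drawn above can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any particular detector: it assumes that shot noise equals the square root of the number of collected photons, that thermal noise can be expressed as a fixed RMS value in the same photon-equivalent units (the `thermal_rms` parameter is a hypothetical simplification), and that the two sources are independent so that their variances add.

```python
import math


def signal_to_noise(signal_counts: float, thermal_rms: float = 0.0) -> float:
    """Signal-to-noise ratio for a photon-counting measurement.

    Assumptions (illustrative only):
    - shot-noise variance equals the number of collected photons,
      so shot noise itself is sqrt(signal_counts);
    - thermal noise is a fixed RMS value in photon-equivalent units;
    - the two sources are independent, so their variances add.
    """
    total_variance = signal_counts + thermal_rms ** 2
    return signal_counts / math.sqrt(total_variance)


# Collecting more photons raises the shot-noise-limited ratio as sqrt(N).
for counts in (100, 10_000, 1_000_000):
    print(counts, signal_to_noise(counts))

# Cooling (lower thermal_rms) helps, but cannot beat the shot-noise limit.
print(signal_to_noise(100, thermal_rms=10.0))  # warm detector
print(signal_to_noise(100, thermal_rms=0.0))   # idealized cold detector
```

On these assumptions, cooling moves the ratio toward the shot-noise limit but never past it, whereas collecting a hundred times more photons raises that limit tenfold. Knowing which regime one is in is what tells the experimenter which remedy is worth pursuing.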

There are also circumstances in which scientists want to provide a substantive, detailed explanation for a particular idiosyncratic datum, and even circumstances in which procuring such explanations is epistemically imperative. Ignoring outliers without good epistemic reasons is just cherry-picking data, one of the canonical ‘questionable research practices.’ Allan Franklin has described Robert Millikan’s convenient exclusion of data he collected from observing the second oil drop in his experiments of April 16, 1912 (1986, 231). When Millikan initially recorded the data for this drop, his notebooks indicate that he was satisfied his apparatus was working properly and that the experiment was running well—he wrote “Publish” next to the data in his lab notebook. However, after he had later calculated the value for the fundamental electric charge that these data yielded, and found it aberrant with respect to the values he calculated using data collected from other good observing sessions, he changed his mind, writing “Won’t work” next to the calculation (ibid., see also Woodward 2010, 794). Millikan not only never published this result, he never published why he failed to publish it. When data are excluded from analysis, there ought to be some explanation justifying their omission over and above lack of agreement with the experimenters’ expectations. Precisely because they are outliers, some data require specific, detailed, idiosyncratic causal explanations. Indeed, it is often in virtue of those very explanations that outliers can be responsibly rejected. Some explanation of data rejected as ‘spurious’ is required. Otherwise, scientists risk biasing their own work.

Thus, while in transforming data as collected into something useful for learning about phenomena, scientists often account for features of the data such as different types of noise contributions, and sometimes even explain the odd outlying data point or artifact, they simply do not explain every individual teensy tiny causal contribution to the exact character of a data set or datum in full detail. This is because scientists cannot discover such causal minutiae, nor would invoking them be necessary for typical research questions. The fact that it may sometimes be important for scientists to provide detailed explanations of data, and not just claims about phenomena inferred from data, should not be confused with the dubious claim that scientists could ‘in principle’ detail every causal quirk that contributed to some data (Woodward 2010; 2011).

In view of all of this, together with the fact that a great many theoretical claims can only be tested directly against facts about phenomena, it behooves epistemologists to think about how data are used to answer questions about phenomena. Lacking space for a detailed discussion, the most this entry can do is to mention two main kinds of things investigators do in order to draw conclusions from data. The first is causal analysis carried out with or without the use of statistical techniques. The second is non-causal statistical analysis.

First, investigators must distinguish features of the data that are indicative of facts about the phenomenon of interest from those which can safely be ignored, and those which must be corrected for. Sometimes background knowledge makes this easy. Under normal circumstances investigators know that their thermometers are sensitive to temperature, and their pressure gauges, to pressure. An astronomer or a chemist who knows what spectrographic equipment does, and what she has applied it to will know what her data indicate. Sometimes it is less obvious. When Santiago Ramón y Cajal looked through his microscope at a thin slice of stained nerve tissue, he had to figure out which, if any, of the fibers he could see at one focal length connected to or extended from things he could see only at another focal length, or in another slice. Analogous considerations apply to quantitative data. It was easy for Katz to tell when his equipment was responding more to Hill’s footfalls on the stairs than to the electrical quantities it was set up to measure. It can be harder to tell whether an abrupt jump in the amplitude of a high frequency EEG oscillation was due to a feature of the subject’s brain activity or an artifact of extraneous electrical activity in the laboratory or operating room where the measurements were made. The answers to questions about which features of numerical and non-numerical data are indicative of a phenomenon of interest typically depend at least in part on what is known about the causes that conspire to produce the data.

Statistical arguments are often used to deal with questions about the influence of epistemically relevant causal factors. For example, when it is known that similar data can be produced by factors that have nothing to do with the phenomenon of interest, Monte Carlo simulations, regression analyses of sample data, and a variety of other statistical techniques sometimes provide investigators with their best chance of deciding how seriously to take a putatively illuminating feature of their data.
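One way such a Monte Carlo argument can run is sketched below. The scenario is hypothetical: an investigator sees a peak of some amplitude in a record of `n_samples` points and asks how often noise alone would produce a peak at least that large. The sketch assumes zero-mean Gaussian noise of known standard deviation, which is itself an empirical assumption about the apparatus, and the function names are invented for illustration.

```python
import random


def null_max_amplitude(rng: random.Random, n_samples: int, noise_sd: float) -> float:
    """Largest absolute excursion in one simulated noise-only record."""
    return max(abs(rng.gauss(0.0, noise_sd)) for _ in range(n_samples))


def monte_carlo_p_value(observed_peak: float, n_samples: int, noise_sd: float,
                        n_trials: int = 2000, seed: int = 0) -> float:
    """Fraction of noise-only records whose largest excursion is at least
    as big as the observed peak (an estimated p-value under the noise model)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(
        null_max_amplitude(rng, n_samples, noise_sd) >= observed_peak
        for _ in range(n_trials)
    )
    return hits / n_trials


# A peak of 2.5 noise standard deviations in a 50-point record is
# unremarkable; a peak of 5 standard deviations is not.
print(monte_carlo_p_value(observed_peak=2.5, n_samples=50, noise_sd=1.0))
print(monte_carlo_p_value(observed_peak=5.0, n_samples=50, noise_sd=1.0))
```

The estimate is only as good as the noise model fed into the simulation, which is why causal knowledge of the data-producing process and statistical argument work together here.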

But statistical techniques are also required for purposes other than causal analysis. To calculate the magnitude of a quantity like the melting point of lead from a scatter of numerical data, investigators throw out outliers, calculate the mean and the standard deviation, etc., and establish confidence and significance levels. Regression and other techniques are applied to the results to estimate how far from the mean the magnitude of interest can be expected to fall in the population of interest (e.g., the range of temperatures at which pure samples of lead can be expected to melt).
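The steps just listed can be sketched for a small scatter of melting-point readings. The readings below are made-up illustrative numbers, not experimental data. The outlier rule (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) is one common convention among several, and the 95% interval uses a normal approximation that is crude for so few points; both are assumptions of this sketch.

```python
import math
import statistics


def reject_outliers(readings, cutoff=3.5):
    """Drop readings whose modified z-score exceeds the cutoff.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is
    the median absolute deviation from the median.
    """
    med = statistics.median(readings)
    mad = statistics.median([abs(x - med) for x in readings])
    if mad == 0:
        return list(readings)  # no spread to judge outliers against
    return [x for x in readings if 0.6745 * abs(x - med) / mad <= cutoff]


def summarize(readings):
    """Return kept readings, their mean, standard deviation, and an
    approximate 95% confidence interval for the mean."""
    kept = reject_outliers(readings)
    mean = statistics.mean(kept)
    sd = statistics.stdev(kept)  # sample standard deviation
    half_width = 1.96 * sd / math.sqrt(len(kept))  # normal approximation
    return kept, mean, sd, (mean - half_width, mean + half_width)


# Hypothetical readings in degrees Celsius; the last one is aberrant.
readings = [327.3, 327.6, 327.5, 327.4, 327.7, 327.5, 331.9]
kept, mean, sd, interval = summarize(readings)
print(len(kept), mean, sd, interval)
```

Note that the rejection rule is applied mechanically here. As the discussion of Millikan above suggests, a responsible analysis would also seek some independent explanation of why the rejected reading went wrong.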

The fact that little can be learned from data without causal, statistical, and related argumentation has interesting consequences for received ideas about how the use of observational evidence distinguishes science from pseudoscience, religion, and other non-scientific cognitive endeavors. First, scientists are not the only ones who use observational evidence to support their claims; astrologers and medical quacks use it too. To find epistemically significant differences, one must carefully consider what sorts of data they use, where the data come from, and how they are employed. The virtues of scientific as opposed to non-scientific theory evaluations depend not only on their reliance on empirical data, but also on how the data are produced, analyzed and interpreted to draw conclusions against which theories can be evaluated. Secondly, it does not take many examples to refute the notion that adherence to a single, universally applicable ‘scientific method’ differentiates the sciences from the non-sciences. Data are produced and used in far too many different ways to be treated informatively as instances of any single method. Thirdly, it is usually, if not always, impossible for investigators to draw conclusions to test theories against observational data without explicit or implicit reliance on theoretical resources.

Bokulich (2020) has helpfully outlined a taxonomy of various ways in which data can be model-laden to increase their epistemic utility. She focuses on seven categories: data conversion, data correction, data interpolation, data scaling, data fusion, data assimilation, and synthetic data. Of these categories, conversion and correction are perhaps the most familiar. Bokulich reminds us that even in the case of reading a temperature from an ordinary mercury thermometer, we are ‘converting’ the data as measured, which in this case is the height of the column of mercury, to a temperature (ibid., 795). In more complicated cases, such as processing the arrival times of acoustic signals in seismic reflection measurements to yield values for subsurface depth, data conversion may involve models (ibid.). In this example, models of the composition and geometry of the subsurface are needed in order to account for differences in the speed of sound in different materials. Data ‘correction’ involves common practices we have already discussed like modeling and mathematically subtracting background noise contributions from one’s dataset (ibid., 796). Bokulich rightly points out that involving models in these ways routinely improves the epistemic uses to which data can be put. Data interpolation, scaling, and ‘fusion’ are also relatively widespread practices that deserve further philosophical analysis. Interpolation involves filling in missing data in a patchy data set, under the guidance of models. Data are scaled when they have been generated in a particular scale (temporal, spatial, energy) and modeling assumptions are recruited to transform them to apply at another scale. Data are ‘fused,’ in Bokulich’s terminology, when data collected in diverse contexts, using diverse methods, are combined or integrated. This happens, for instance, when data from ice cores, tree rings, and the historical logbooks of sea captains are merged into a joint climate dataset. Scientists must take care in combining data of diverse provenance, and model new uncertainties arising from the very amalgamation of datasets (ibid., 800).
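
The seismic conversion case mentioned above can be made concrete with a minimal sketch. It assumes a single uniform layer and the textbook relation that depth equals velocity times two-way travel time divided by two. The function name and the velocity values are illustrative assumptions, not drawn from Bokulich.

```python
def depth_from_two_way_time(travel_time_s: float, velocity_m_per_s: float) -> float:
    """Convert a two-way acoustic travel time into a reflector depth.

    Assumes a single uniform layer (a hypothetical simplification): the
    signal travels down and back at constant speed, so depth = v * t / 2.
    """
    if travel_time_s < 0 or velocity_m_per_s <= 0:
        raise ValueError("travel time must be >= 0 and velocity must be > 0")
    return velocity_m_per_s * travel_time_s / 2.0


# The same measured arrival time yields different depths under different
# assumed subsurface models (the velocities here are illustrative only).
measured_time_s = 2.0
depth_if_slow_medium = depth_from_two_way_time(measured_time_s, 1500.0)
depth_if_fast_medium = depth_from_two_way_time(measured_time_s, 3000.0)
print(depth_if_slow_medium, depth_if_fast_medium)
```

The point of the sketch is only that the converted value (depth) is fixed jointly by the measurement and by the assumed model of the medium; a realistic conversion would use a layered or spatially varying velocity model.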

Bokulich contrasts ‘synthetic data’ with what she calls ‘real data’ (ibid., 801–802). Synthetic data are virtual, or simulated data, and are not produced by physical interaction with worldly research targets. Bokulich emphasizes the role that simulated data can usefully play in testing and troubleshooting aspects of data processing that are to eventually be deployed on empirical data (ibid., 802). It can be incredibly useful for developing and stress-testing a data processing pipeline to have fake datasets whose characteristics are already known in virtue of having been produced by the researchers, and being available for their inspection at will. When the characteristics of a dataset are known, or indeed can be tailored according to need, the effects of new processing methods can be more readily traced than without. In this way, researchers can familiarize themselves with the effects of a data processing pipeline, and make adjustments to that pipeline in light of what they learn by feeding fake data through it, before attempting to use that pipeline on actual science data. Such investigations can be critical to eventually arguing for the credibility of the final empirical results and their appropriate interpretation and use.
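
The use of synthetic data to check a processing pipeline can be sketched as follows. This is a toy illustration under stated assumptions: the "pipeline" is an ordinary least-squares line fit, the synthetic dataset is a straight line plus Gaussian noise, and all names and parameter values are hypothetical. Because the researcher chose the true slope and intercept, the pipeline's output can be compared against known characteristics.

```python
import random


def make_synthetic_data(slope, intercept, noise_sd, n, seed=0):
    """Generate a fake dataset whose true parameters are known by construction."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Stand-in for a processing pipeline: ordinary least-squares line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Known "ground truth" chosen by the researcher (illustrative values).
TRUE_SLOPE, TRUE_INTERCEPT = 2.0, 1.0

xs, ys = make_synthetic_data(TRUE_SLOPE, TRUE_INTERCEPT, noise_sd=0.5, n=200)
est_slope, est_intercept = fit_line(xs, ys)

# Compare what the pipeline recovers with what was put in.
slope_error = abs(est_slope - TRUE_SLOPE)
intercept_error = abs(est_intercept - TRUE_INTERCEPT)
print(f"slope error: {slope_error:.4f}, intercept error: {intercept_error:.4f}")
```

If the recovered values drifted far from the chosen ones, that would point to a problem in the pipeline rather than in the world, which is exactly the diagnostic advantage of data whose characteristics are known in advance.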

Data assimilation is perhaps a less widely appreciated aspect of model-based data processing among philosophers of science, excepting Parker (2016; 2017). Bokulich characterizes this method as “the optimal integration of data with dynamical model estimates to provide a more accurate ‘assimilation estimate’ of the quantity” (2020, 800). Thus, data assimilation involves balancing the contributions of empirical data and the output of models in an integrated estimate, according to the uncertainties associated with these contributions.
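
The idea of weighting data and model output by their uncertainties can be illustrated with the simplest textbook case: a single scalar quantity, one model estimate, and one observation, each with a known error variance. The sketch below is an assumption-laden toy (a scalar, Kalman-style update), not a description of any operational assimilation system, and the function name and numbers are hypothetical.

```python
def assimilate(model_estimate, model_variance, observation, observation_variance):
    """Combine a model estimate and an observation, weighting each by its uncertainty.

    Scalar inverse-variance weighting: the less certain source gets less weight.
    """
    if model_variance <= 0 or observation_variance <= 0:
        raise ValueError("variances must be positive")
    # Gain near 1 means "trust the observation"; near 0 means "trust the model".
    gain = model_variance / (model_variance + observation_variance)
    analysis = model_estimate + gain * (observation - model_estimate)
    analysis_variance = (1.0 - gain) * model_variance
    return analysis, analysis_variance


# Illustrative values: the model is less certain than the observation,
# so the combined estimate lands closer to the observation.
analysis, analysis_variance = assimilate(
    model_estimate=15.0, model_variance=4.0,
    observation=17.0, observation_variance=1.0,
)
print(analysis, analysis_variance)
```

In this toy case the combined estimate lies between the two inputs and has a smaller variance than either, which is the sense in which the integrated estimate can be more accurate than data or model alone.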

Bokulich argues that the involvement of models in these various aspects of data processing does not necessarily lead to better epistemic outcomes. Done wrong, integrating models and data can introduce artifacts and make the processed data unreliable for the purpose at hand (ibid., 804). Indeed, she notes that “[t]here is much work for methodologically reflective scientists and philosophers of science to do in sorting out cases in which model-data symbiosis may be problematic or circular” (ibid.).

3. Theory and value ladenness

Empirical results are laden with values and theoretical commitments. Philosophers have raised and appraised several possible kinds of epistemic problems that could be associated with theory and/or value-laden empirical results. They have worried about the extent to which human perception itself is distorted by our commitments. They have worried that drawing upon theoretical resources from the very theory to be appraised (or its competitors) in the generation of empirical results yields vicious circularity (or inconsistency). They have also worried that contingent conceptual and/or linguistic frameworks trap bits of evidence like bees in amber so that they cannot carry on their epistemic lives outside of the contexts of their origination, and that normative values necessarily corrupt the integrity of science. Do the theory and value-ladenness of empirical results render them hopelessly parochial? That is, when scientists leave theoretical commitments behind and adopt new ones, must they also relinquish the fruits of the empirical research imbued with their prior commitments? In this section, we discuss these worries and responses that philosophers have offered to assuage them.

If you believe that observation by human sense perception is the objective basis of all scientific knowledge, then you ought to be particularly worried about the potential for human perception to be corrupted by theoretical assumptions, wishful thinking, framing effects, and so on. Daston and Galison recount the striking example of Arthur Worthington’s symmetrical milk drops (2007, 11–16). Working in 1875, Worthington investigated the hydrodynamics of falling fluid droplets and their evolution upon impacting a hard surface. At first, he had tried to carefully track the drop dynamics with a strobe light to burn a sequence of images into his own retinas. The images he drew to record what he saw were radially symmetric, with rays of the drop splashes emanating evenly from the center of the impact. However, when Worthington transitioned from using his eyes and capacity to draw from memory to using photography in 1894, he was shocked to find that the kind of splashes he had been observing were irregular splats (ibid., 13). Even curiouser, when Worthington returned to his drawings, he found that he had indeed recorded some unsymmetrical splashes. He had evidently dismissed them as uninformative accidents instead of regarding them as revelatory of the phenomenon he was intent on studying (ibid.). In attempting to document the ideal form of the splashes, a general and regular form, he had subconsciously down-played the irregularity of individual splashes. If theoretical commitments, like Worthington’s initial commitment to the perfect symmetry of the physics he was studying, pervasively and incorrigibly dictated the results of empirical inquiry, then the epistemic aims of science would be seriously undermined.

Perceptual psychologists Bruner and Postman found that subjects who were briefly shown anomalous playing cards, e.g., a black four of hearts, reported having seen their normal counterparts, e.g., a red four of hearts. It took repeated exposures to get subjects to say the anomalous cards didn’t look right, and eventually, to describe them correctly (Kuhn 1962, 63). Kuhn took such studies to indicate that things don’t look the same to observers with different conceptual resources. (For a more up-to-date discussion of theory and conceptual perceptual loading see Lupyan 2015.) If so, black hearts didn’t look like black hearts until repeated exposures somehow allowed subjects to acquire the concept of a black heart. By analogy, Kuhn supposed, when observers working in conflicting paradigms look at the same thing, their conceptual limitations should keep them from having the same visual experiences (Kuhn 1962, 111, 113–114, 115, 120–1). This would mean, for example, that when Priestley and Lavoisier watched the same experiment, Lavoisier should have seen what accorded with his theory that combustion and respiration are oxidation processes, while Priestley’s visual experiences should have agreed with his theory that burning and respiration are processes of phlogiston release.

The example of Pettersson’s and Rutherford’s scintillation screen evidence (above) attests to the fact that observers working in different laboratories sometimes report seeing different things under similar conditions. It is plausible that their expectations influence their reports. It is plausible that their expectations are shaped by their training and by their supervisors’ and associates’ theory driven behavior. But as happens in other cases as well, all parties to the dispute agreed to reject Pettersson’s data by appealing to results that both laboratories could obtain and interpret in the same way without compromising their theoretical commitments. Indeed, it is possible for scientists to share empirical results, not just across diverse laboratory cultures, but even across serious differences in worldview. Much as they disagreed about the nature of respiration and combustion, Priestley and Lavoisier gave quantitatively similar reports of how long their mice stayed alive and their candles kept burning in closed bell jars. Priestley taught Lavoisier how to obtain what he took to be measurements of the phlogiston content of an unknown gas. A sample of the gas to be tested is run into a graduated tube filled with water and inverted over a water bath. After noting the height of the water remaining in the tube, the observer adds “nitrous air” (we call it nitric oxide) and checks the water level again. Priestley, who thought there was no such thing as oxygen, believed the change in water level indicated how much phlogiston the gas contained. Lavoisier reported observing the same water levels as Priestley even after he abandoned phlogiston theory and became convinced that changes in water level indicated free oxygen content (Conant 1957, 74–109).

A related issue is that of salience. Kuhn claimed that if Galileo and an Aristotelian physicist had watched the same pendulum experiment, they would not have looked at or attended to the same things. The Aristotelian’s paradigm would have required the experimenter to measure

… the weight of the stone, the vertical height to which it had been raised, and the time required for it to achieve rest (Kuhn 1962, 123)

and ignore radius, angular displacement, and time per swing (ibid., 124). These last were salient to Galileo because he treated pendulum swings as constrained circular motions. The Galilean quantities would be of no interest to an Aristotelian who treats the stone as falling under constraint toward the center of the earth (ibid., 123). Thus Galileo and the Aristotelian would not have collected the same data. (Absent records of Aristotelian pendulum experiments we can think of this as a thought experiment.)

Interests change, however. Scientists may eventually come to appreciate the significance of data that had not originally been salient to them in light of new presuppositions. The moral of these examples is that although paradigms or theoretical commitments sometimes have an epistemically significant influence on what observers perceive or what they attend to, it can be relatively easy to nullify or correct for their effects. When presuppositions cause epistemic damage, investigators are often able to eventually make corrections. Thus, paradigms and theoretical commitments actually do influence saliency, but their influence is neither inevitable nor irremediable.

Thomas Kuhn (1962), Norwood Hanson (1958), Paul Feyerabend (1959) and others cast suspicion on the objectivity of observational evidence in another way by arguing that one cannot use empirical evidence to test a theory without committing oneself to that very theory. This would be a problem if it led to dogmatism, but assuming the theory to be tested is often benign and even necessary.

For instance, Laymon (1988) demonstrates the manner in which the very theory that the Michelson-Morley experiments are considered to test is assumed in the experimental design, but that this does not engender deleterious epistemic effects (250). The Michelson-Morley apparatus consists of two interferometer arms at right angles to one another, which are rotated in the course of the experiment so that, on the original construal, the path length traversed by light in the apparatus would vary according to alignment with or against the Earth’s velocity (carrying the apparatus) with respect to the stationary aether. This difference in path length would show up as displacement in the interference fringes of light in the interferometer. Although Michelson’s intention had been to measure the velocity of the Earth with respect to the all-pervading aether, the experiments eventually came to be regarded as furnishing tests of the Fresnel aether theory itself. In particular, the null results of these experiments were taken as evidence against the existence of the aether. Naively, one might suppose that whatever assumptions were made in the calculation of the results of these experiments, it should not be the case that the theory under the gun was assumed nor that its negation was.

Before Michelson’s experiments, the Fresnel aether theory did not predict any sort of length contraction. Although Michelson assumed no contraction in the arms of the interferometer, Laymon argues that he could have assumed contraction, with no practical impact on the results of the experiments. The predicted fringe shift, which is calculated from the anticipated difference in the distance traveled by light in the two arms, is the same on either assumption when higher order terms are neglected. Thus, in practice, the experimenters could assume either that the contraction thesis was true or that it was false when determining the length of the arms. Either way, the results of the experiment would be the same. After Michelson’s experiments returned no evidence of the anticipated aether effects, Lorentz-Fitzgerald contraction was postulated precisely to cancel out the expected (but not found) effects and save the aether theory. Morley and Miller then set out specifically to test the contraction thesis, and still assumed no contraction in determining the length of the arms of their interferometer (ibid., 253). Thus Laymon argues that the Michelson-Morley experiments speak against the tempting assumption that “appraisal of a theory is based on phenomena which can be detected and measured without using assumptions drawn from the theory under examination or from competitors to that theory” (ibid., 246).
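
A rough numerical illustration may help show why the choice of assumption about arm length makes no practical difference. The sketch below is not Laymon's calculation. It uses the standard textbook second-order estimate of the fringe shift on rotating the apparatus by ninety degrees, 2Lβ²/λ, where L is the arm length, λ the wavelength, and β the ratio of the Earth's speed to the speed of light. The numerical values are illustrative assumptions, and the function name is hypothetical.

```python
import math


def predicted_fringe_shift(arm_length_m: float, wavelength_m: float, beta: float) -> float:
    """Textbook second-order estimate of the fringe shift on a 90-degree rotation."""
    return 2.0 * arm_length_m * beta ** 2 / wavelength_m


# Illustrative values (assumptions, not taken from the source text).
ARM_LENGTH_M = 11.0     # effective optical path length of one arm
WAVELENGTH_M = 5.5e-7   # visible light
BETA = 1.0e-4           # Earth's orbital speed divided by the speed of light

# Arm length determined on the assumption of no contraction.
shift_no_contraction = predicted_fringe_shift(ARM_LENGTH_M, WAVELENGTH_M, BETA)

# Arm length determined on the assumption of contraction by sqrt(1 - beta^2).
contracted_length_m = ARM_LENGTH_M * math.sqrt(1.0 - BETA ** 2)
shift_with_contraction = predicted_fringe_shift(contracted_length_m, WAVELENGTH_M, BETA)

# The two predictions differ only at fourth order in beta.
difference = abs(shift_no_contraction - shift_with_contraction)
print(shift_no_contraction, shift_with_contraction, difference)
```

Because the predicted shift is already of order β², a fractional change of order β² in the arm length alters the prediction only at order β⁴, far below anything the apparatus could register.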

Epistemological hand-wringing about the use of the very theory to be tested in the generation of the evidence to be used for testing seems to spring primarily from a concern about vicious circularity. How can we have a genuine trial, if the theory in question has been presumed innocent from the outset? While it is true that there would be a serious epistemic problem in a case where the use of the theory to be tested conspired to guarantee that the evidence would turn out to be confirmatory, this is not always the case when theories are invoked in their own testing. Woodward (2011) summarizes a tidy case:

For example, in Millikan’s oil drop experiment, the mere fact that theoretical assumptions (e.g., that the charge of the electron is quantized and that all electrons have the same charge) play a role in motivating his measurements or a vocabulary for describing his results does not by itself show that his design and data analysis were of such a character as to guarantee that he would obtain results supporting his theoretical assumptions. His experiment was such that he might well have obtained results showing that the charge of the electron was not quantized or that there was no single stable value for this quantity. (178)

For any given case, determining whether the theoretical assumptions being made are benign, or whether they straitjacket the results that it will be possible to obtain, requires investigating the particular relationships between the assumptions and results in that case. When data production and analysis processes are complicated, this task can get difficult. But the point is that merely noting the involvement of the theory to be tested in the generation of empirical results does not by itself imply that those results cannot be objectively useful for deciding whether the theory to be tested should be accepted or rejected.

Kuhn argued that theoretical commitments exert a strong influence on observation descriptions, and what they are understood to mean (Kuhn 1962, 127ff; Longino 1979, 38–42). If so, proponents of a caloric account of heat won’t describe or understand descriptions of observed results of heat experiments in the same way as investigators who think of heat in terms of mean kinetic energy or radiation. They might all use the same words (e.g., ‘temperature’) to report an observation without understanding them in the same way. This poses a potential problem for communicating effectively across paradigms, and similarly, for attributing the appropriate significance to empirical results generated outside of one’s own linguistic framework.

It is important to bear in mind that observers do not always use declarative sentences to report observational and experimental results. Instead, they often draw, photograph, make audio recordings, etc. or set up their experimental devices to generate graphs, pictorial images, tables of numbers, and other non-sentential records. Obviously investigators’ conceptual resources and theoretical biases can exert epistemically significant influences on what they record (or set their equipment to record), which details they include or emphasize, and which forms of representation they choose (Daston and Galison 2007, 115–190, 309–361). But disagreements about the epistemic import of a graph, picture or other non-sentential bit of data often turn on causal rather than semantical considerations. Anatomists may have to decide whether a dark spot in a micrograph was caused by a staining artifact or by light reflected from an anatomically significant structure. Physicists may wonder whether a blip in a Geiger counter record reflects the causal influence of the radiation they wanted to monitor, or a surge in ambient radiation. Chemists may worry about the purity of samples used to obtain data. Such questions are not, and are not well represented as, semantic questions to which semantic theory loading is relevant. Late 20th century philosophers may have ignored such cases and exaggerated the influence of semantic theory loading because they thought of theory testing in terms of inferential relations between observation and theoretical sentences.

Nevertheless, some empirical results are reported as declarative sentences. Looking at a patient with red spots and a fever, an investigator might report having seen the spots, or measles symptoms, or a patient with measles. Watching an unknown liquid dripping into a litmus solution, an observer might report seeing a change in color, a liquid with a pH of less than 7, or an acid. The appropriateness of a description of a test outcome depends on how the relevant concepts are operationalized. What justifies an observer in reporting having observed a case of measles according to one operationalization might require her to say no more than that she had observed measles symptoms, or just red spots, according to another.

In keeping with Percy Bridgman’s view that

… in general, we mean by a concept nothing more than a set of operations; the concept is synonymous with the corresponding sets of operations (Bridgman 1927, 5)

one might suppose that operationalizations are definitions or meaning rules such that it is analytically true, e.g., that every liquid that turns litmus red in a properly conducted test is acidic. But it is more faithful to actual scientific practice to think of operationalizations as defeasible rules for the application of a concept such that both the rules and their applications are subject to revision on the basis of new empirical or theoretical developments. So understood, to operationalize is to adopt verbal and related practices for the purpose of enabling scientists to do their work. Operationalizations are thus sensitive and subject to change on the basis of findings that influence their usefulness (Feest 2005).

Definitional or not, investigators in different research traditions may be trained to report their observations in conformity with conflicting operationalizations. Thus instead of training observers to describe what they see in a bubble chamber as a whitish streak or a trail, one might train them to say they see a particle track or even a particle. This may reflect what Kuhn meant by suggesting that some observers might be justified or even required to describe themselves as having seen oxygen, transparent and colorless though it is, or atoms, invisible though they are (Kuhn 1962, 127ff). To the contrary, one might object that what one sees should not be confused with what one is trained to say when one sees it, and therefore that talking about seeing a colorless gas or an invisible particle may be nothing more than a picturesque way of talking about what certain operationalizations entitle observers to say. Strictly speaking, the objection concludes, the term ‘observation report’ should be reserved for descriptions that are neutral with respect to conflicting operationalizations.

If observational data are just those utterances that meet Feyerabend’s decidability and agreeability conditions, the import of semantic theory loading depends upon how quickly, and for which sentences, reasonably sophisticated language users who stand in different paradigms can non-inferentially reach the same decisions about what to assert or deny. Some would expect enough agreement to secure the objectivity of observational data. Others would not. Still others would try to supply different standards for objectivity.

With regard to sentential observation reports, the significance of semantic theory loading is less ubiquitous than one might expect. The interpretation of verbal reports often depends on ideas about causal structure rather than the meanings of signs. Rather than worrying about the meaning of words used to describe their observations, scientists are more likely to wonder whether the observers made up or withheld information, whether one or more details were artifacts of observation conditions, whether the specimens were atypical, and so on.

Note that the worry about semantic theory loading extends beyond observation reports of the sort that occupied the logical empiricists and their close intellectual descendants. Combining results of diverse methods for making proxy measurements of paleoclimate temperatures in an epistemically responsible way requires careful attention to the variety of operationalizations at play. Even if no ‘observation reports’ are involved, the sticky question about how to usefully merge results obtained in different ways in order to satisfy one’s epistemic aims remains. Happily, the remedy for the worry about semantic loading in this broader sense is likely to be the same: investigating the provenance of those results and comparing the variety of factors that have contributed to their causal production.

Kuhn placed too much emphasis on the discontinuity between evidence generated in different paradigms. Even if we accept a broadly Kuhnian picture, according to which paradigms are heterogeneous collections of experimental practices, theoretical principles, problems selected for investigation, approaches to their solution, etc., connections between components are loose enough to allow investigators who disagree profoundly over one or more theoretical claims to nevertheless agree about how to design, execute, and record the results of their experiments. That is why neuroscientists who disagreed about whether nerve impulses consisted of electrical currents could measure the same electrical quantities, and agree on the linguistic meaning and the accuracy of observation reports including such terms as ‘potential’, ‘resistance’, ‘voltage’ and ‘current’. As we discussed above, the success that scientists have in repurposing results generated by others for different purposes speaks against the confinement of evidence to its native paradigm. Even when scientists working with radically different core theoretical commitments cannot make the same measurements themselves, with enough contextual information about how each conducts research, it can be possible to construct bridges that span the theoretical divides.

One could worry that the intertwining of the theoretical and empirical would open the floodgates to bias in science. Human cognizing, both historical and present day, is replete with disturbing commitments including intolerance and narrow mindedness of many sorts. If such commitments are integral to a theoretical framework, or endemic to the reasoning of a scientist or scientific community, then they threaten to corrupt the epistemic utility of empirical results generated using their resources. The core impetus of the ‘value-free ideal’ is to maintain a safe distance between the appraisal of scientific theories according to the evidence on one hand, and the swarm of moral, political, social, and economic values on the other. While proponents of the value-free ideal might admit that the motivation to pursue a theory or the legal protection of human subjects in permissible experimental methods involve non-epistemic values, they would contend that such values ought not enter into the constitution of empirical results themselves, nor the adjudication or justification of scientific theorizing in light of the evidence (see Intemann 2021, 202).

As a matter of fact, values do enter into science at a variety of stages. Above we saw that ‘theory-ladenness’ could refer to the involvement of theory in perception, in semantics, and in a kind of circularity that some have worried begets unfalsifiability and thereby dogmatism. As with theory-ladenness, values can and sometimes do affect judgments about the salience of certain evidence and the conceptual framing of data. Indeed, on a permissive construal of the nature of theories, values can simply be understood as part of a theoretical framework. Intemann (2021) highlights a striking example from medical research where key conceptual resources include notions like ‘harm,’ ‘risk,’ ‘health benefit,’ and ‘safety.’ She refers to research on the comparative safety of giving birth at home and giving birth at a hospital for low-risk parents in the United States. Studies reporting that home births are less safe typically attend to infant and birthing parent mortality rates, which are low for these subjects whether at home or in hospital, but leave out of consideration rates of c-section and episiotomy, which are both relatively high in hospital settings. Thus, a value-laden decision about whether a possible outcome counts as a harm worth considering can influence the outcome of the study, in this case tipping the balance towards the conclusion that hospital births are safer (ibid., 206).

Note that the birth safety case differs from the sort of cases at issue in the philosophical debate about risk and thresholds for acceptance and rejection of hypotheses. In accepting an hypothesis, a person makes a judgement that the risk of being mistaken is sufficiently low (Rudner 1953). When the consequences of being wrong are deemed grave, the threshold for acceptance may be correspondingly high. Thus, in evaluating the epistemic status of an hypothesis in light of the evidence, a person may have to make a value-based judgement. However, in the birth safety case, the judgement comes into play at an earlier stage, well before the decision to accept or reject the hypothesis is to be made. The judgement occurs already in deciding what is to count as a ‘harm’ worth considering for the purposes of this research.

The fact that values do sometimes enter into scientific reasoning does not by itself settle the question of whether it would be better if they did not. In order to assess the normative proposal, philosophers of science have attempted to disambiguate the various ways in which values might be thought to enter into science, and the various referents that get crammed under the single heading of ‘values.’ Anderson (2004) articulates eight stages of scientific research where values (‘evaluative presuppositions’) might be employed in epistemically fruitful ways. In paraphrase: 1) orientation in a field, 2) framing a research question, 3) conceptualizing the target, 4) identifying relevant data, 5) data generation, 6) data analysis, 7) deciding when to cease data analysis, and 8) drawing conclusions (Anderson 2004, 11). Similarly, Intemann (2021) lays out five ways “that values play a role in scientific reasoning” with which feminist philosophers of science have engaged in particular:

(1) the framing [of] research problems, (2) observing phenomena and describing data, (3) reasoning about value-laden concepts and assessing risks, (4) adopting particular models, and (5) collecting and interpreting evidence. (208)

Ward (2021) presents a streamlined and general taxonomy of four ways in which values relate to choices: as reasons motivating or justifying choices, as causal effectors of choices, or as goods affected by choices. By investigating the role of values in these particular stages or aspects of research, philosophers of science can untangle crosstalk and offer higher-resolution insights than the mere observation that values are involved in science at all.

Similarly, fine points can be made about the nature of values involved in these various contexts. Such clarification is likely important for determining whether the contribution of certain values in a given context is deleterious or salutary, and in what sense. Douglas (2013) argues that the ‘value’ of internal consistency of a theory and of the empirical adequacy of a theory with respect to the available evidence are minimal criteria for any viable scientific theory (799–800). She contrasts these with the sort of values that Kuhn called ‘virtues,’ i.e., scope, simplicity, and explanatory power that are properties of theories themselves, and unification, novel prediction and precision, which are properties a theory has in relation to a body of evidence (800–801). These are the sort of values that may be relevant to explaining and justifying choices that scientists make to pursue/abandon or accept/reject particular theories. Moreover, Douglas (2000) argues that what she calls “non-epistemic values” (in particular, ethical value judgements) also enter into decisions at various stages “internal” to scientific reasoning, such as data collection and interpretation (565). Consider a laboratory toxicology study in which animals exposed to dioxins are compared to unexposed controls. Douglas discusses researchers who want to determine the threshold for safe exposure. Admitting false positives can be expected to lead to overregulation of the chemical industry, while false negatives yield underregulation and thus pose greater risk to public health. The decision about where to set the unsafe exposure threshold, that is, set the threshold for a statistically significant difference between experimental and control animal populations, involves balancing the acceptability of these two types of errors. According to Douglas, this balancing act will depend on “whether we are more concerned about protecting public health from dioxin pollution or whether we are more concerned about protecting industries that produce dioxins from increased regulation” (ibid., 568). That scientists do as a matter of fact sometimes make such decisions is clear. They judge, for instance, a specimen slide of a rat liver to be tumorous or not, and whether borderline cases should count as benign or malignant (ibid., 569–572). Moreover, in such cases, it is not clear that the responsibility of making such decisions could be offloaded to non-scientists.
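
The trade-off between the two types of error can be made vivid with a toy statistical model. The sketch below is not Douglas's analysis. It assumes, purely for illustration, a test statistic that is normally distributed with unit variance, centered at zero when the exposure has no effect and at a hypothetical effect size when it does. The function name, effect size, and thresholds are all assumptions.

```python
from statistics import NormalDist


def error_rates(threshold: float, effect_size: float):
    """Return (false_positive_rate, false_negative_rate) for a one-sided test.

    A result counts as 'significant' when the test statistic exceeds threshold.
    """
    no_effect = NormalDist(0.0, 1.0)
    real_effect = NormalDist(effect_size, 1.0)
    false_positive = 1.0 - no_effect.cdf(threshold)  # harmless exposure flagged as harmful
    false_negative = real_effect.cdf(threshold)      # harmful exposure missed
    return false_positive, false_negative


EFFECT_SIZE = 2.0  # hypothetical size of a real effect, in standard deviations

for threshold in (1.0, 1.645, 2.5):
    fp, fn = error_rates(threshold, EFFECT_SIZE)
    print(f"threshold={threshold:5.3f}  false positives={fp:.3f}  false negatives={fn:.3f}")
```

Raising the threshold lowers the false positive rate only by raising the false negative rate, and vice versa. The statistics alone do not say which error matters more; that is where, on Douglas's account, a judgement about the relative costs of overregulation and underregulation enters.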

Many philosophers accept that values can contribute to the generation of empirical results without spoiling their epistemic utility. Anderson’s (2004) diagnosis is as follows:

Deep down, what the objectors find worrisome about allowing value judgments to guide scientific inquiry is not that they have evaluative content, but that these judgments might be held dogmatically, so as to preclude the recognition of evidence that might undermine them. We need to ensure that value judgements do not operate to drive inquiry to a predetermined conclusion. This is our fundamental criterion for distinguishing legitimate from illegitimate uses of values in science. (11)

Data production (including experimental design and execution) is heavily influenced by investigators’ background assumptions. Sometimes these include theoretical commitments that lead experimentalists to produce non-illuminating or misleading evidence. In other cases they may lead experimentalists to ignore, or even fail to produce useful evidence. For example, in order to obtain data on orgasms in female stumptail macaques, one researcher wired up females to produce radio records of orgasmic muscle contractions, heart rate increases, etc. But as Elisabeth Lloyd reports, “… the researcher … wired up the heart rate of the male macaques as the signal to start recording the female orgasms. When I pointed out that the vast majority of female stumptail orgasms occurred during sex among the females alone, he replied that yes he knew that, but he was only interested in important orgasms” (Lloyd 1993, 142). Although female stumptail orgasms occurring during sex with males are atypical, the experimental design was driven by the assumption that what makes features of female sexuality worth studying is their contribution to reproduction (ibid., 139). This assumption influenced experimental design in such a way as to preclude learning about the full range of female stumptail orgasms.

Anderson (2004) presents an influential analysis of the role of values in research on divorce. Researchers committed to an interpretive framework rooted in ‘traditional family values’ could conduct research on the assumption that divorce is mostly bad for spouses and any children that they have (ibid., 12). This background assumption, which is rooted in a normative appraisal of a certain model of good family life, could lead social science researchers to restrict the questions with which they survey their research subjects to ones about the negative impacts of divorce on their lives, thereby curtailing the possibility of discovering ways that divorce may have actually made the ex-spouses’ lives better (ibid., 13). This is an example of an epistemically detrimental influence that values can have on the nature of the results that research ultimately yields. In this case, the values in play biased the research outcomes to preclude recognition of countervailing evidence. Anderson argues that the problematic influence of values comes when research “is rigged in advance” to confirm certain hypotheses, that is, when the influence of values amounts to incorrigible dogmatism (ibid., 19). “Dogmatism” in her sense is unfalsifiability in practice, “their stubbornness in the face of any conceivable evidence” (ibid., 22).

Fortunately, such dogmatism is not ubiquitous and when it occurs it can often be corrected eventually. Above we noted that the mere involvement of the theory to be tested in the generation of an empirical result does not automatically yield vicious circularity—it depends on how the theory is involved. Furthermore, even if the assumptions initially made in the generation of empirical results are incorrect, future scientists will have opportunities to reassess those assumptions in light of new information and techniques. Thus, as long as scientists continue their work there need be no time at which the epistemic value of an empirical result can be established once and for all. This should come as no surprise to anyone who is aware that science is fallible, but it is no grounds for skepticism. It can be perfectly reasonable to trust the evidence available at present even though it is logically possible for epistemic troubles to arise in the future. A similar point can be made regarding values (although cf. Yap 2016).

Moreover, while the inclusion of values in the generation of an empirical result can sometimes be epistemically bad, values properly deployed can also be harmless, or even epistemically helpful. As in the cases of research on female stumptail macaque orgasms and the effects of divorce, certain values can sometimes serve to illuminate the way in which other epistemically problematic assumptions have hindered potential scientific insight. By valuing knowledge about female sexuality beyond its role in reproduction, scientists can recognize the narrowness of an approach that only conceives of female sexuality insofar as it relates to reproduction. By questioning the absolute value of one traditional ideal for flourishing families, researchers can garner evidence that might end up destabilizing the empirical foundation supporting that ideal.

Empirical results are most obviously put to epistemic work in their contexts of origin. Scientists conceive of empirical research, collect and analyze the relevant data, and then bring the results to bear on the theoretical issues that inspired the research in the first place. However, philosophers have also discussed ways in which empirical results are transferred out of their native contexts and applied in diverse and sometimes unexpected ways (see Leonelli and Tempini 2020). Cases of reuse, or repurposing, of empirical results in different epistemic contexts raise several interesting issues for philosophers of science. For one, such cases challenge the assumption that theory (and value) ladenness confines the epistemic utility of empirical results to a particular conceptual framework. Ancient Babylonian eclipse records inscribed on cuneiform tablets have been used to generate constraints on contemporary geophysical theorizing about the causes of the lengthening of the day on Earth (Stephenson, Morrison, and Hohenkerk 2016). This is surprising since the ancient observations were originally recorded for the purpose of making astrological prognostications. Nevertheless, with enough background information, the records as inscribed can be translated, the layers of assumptions baked into their presentation peeled back, and the results repurposed using resources of the contemporary epistemic context, the likes of which the Babylonians could have hardly dreamed.

Furthermore, the potential for reuse and repurposing feeds back on the methodological norms of data production and handling. In light of the difficulty of reusing or repurposing data without sufficient background information about the original context, Goodman et al. (2014) note that “data reuse is most possible when: 1) data; 2) metadata (information describing the data); and 3) information about the process of generating those data, such as code, are all provided” (3). Indeed, they advocate for sharing data and code in addition to results customarily published in science. As we have seen, the loading of data with theory is usually necessary to putting that data to any serious epistemic use: theory-loading makes theory appraisal possible. Philosophers have begun to appreciate that this epistemic boon does not necessarily come at the cost of rendering data “tragically local” (Wylie 2020, 285, quoting Latour 1999). But it is important to note that the useful travel of data between contexts is significantly aided by foresight, curation, and management for that aim.

In light of the mediated nature of empirical results, Boyd (2018) argues for an “enriched view of evidence,” in which the evidence that serves as the ‘tribunal of experience’ is understood to be “lines of evidence” composed of the products of data collection and all of the products of their transformation on the way to the generation of empirical results that are ultimately compared to theoretical predictions, considered together with metadata associated with their provenance. Such metadata includes information about theoretical assumptions that are made in data collection, processing, and the presentation of empirical results. Boyd argues that by appealing to metadata to ‘rewind’ the processing of assumption-imbued empirical results and then by re-processing them using new resources, the epistemic utility of empirical evidence can survive transitions to new contexts. Thus, the enriched view of evidence supports the idea that it is not despite the intertwining of the theoretical and empirical that scientists accomplish key epistemic aims, but often in virtue of it (ibid., 420). In addition, it makes the epistemic value of metadata encoding the various assumptions that have been made throughout the course of data collection and processing explicit.
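The idea of ‘rewinding’ and re-processing can be illustrated with a small Python sketch. Everything in it is a hypothetical illustration rather than Boyd’s or Goodman et al.’s own formalism: the `EmpiricalResult` class, its field names, and the toy linear calibration are assumptions made for the example. The point is only that when raw data travel together with metadata recording the assumptions used in processing, a later user can revise an assumption and regenerate the result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EmpiricalResult:
    """A result bundled with its raw data and provenance metadata (hypothetical)."""
    raw_data: List[float]
    assumptions: Dict[str, float]  # e.g. calibration constants used in processing
    process: Callable[[List[float], Dict[str, float]], List[float]]
    notes: List[str] = field(default_factory=list)

    def processed(self) -> List[float]:
        """Apply the recorded processing step under the recorded assumptions."""
        return self.process(self.raw_data, self.assumptions)

    def reprocess(self, **revised: float) -> "EmpiricalResult":
        """'Rewind' to the raw data and re-run processing with revised assumptions."""
        new_assumptions = {**self.assumptions, **revised}
        note = f"reprocessed with revised assumptions: {sorted(revised)}"
        return EmpiricalResult(self.raw_data, new_assumptions,
                               self.process, self.notes + [note])


def calibrate(raw: List[float], assumptions: Dict[str, float]) -> List[float]:
    """Toy processing step: a linear calibration of instrument readings."""
    return [assumptions["scale"] * x + assumptions["offset"] for x in raw]


# The original context assumed a nonzero instrument offset.
original = EmpiricalResult([1.0, 2.0, 3.0], {"scale": 2.0, "offset": 0.5}, calibrate)

# A later context rejects that offset but can still use the same raw data.
revised = original.reprocess(offset=0.0)
```

Had only the processed values been published, without the raw readings and the recorded `scale` and `offset`, the later user would have had no way to undo the original calibration.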

The desirability of explicitly furnishing empirical data and results with auxiliary information that allows them to travel can be appreciated in light of the ‘objectivity’ norm, construed as accessibility to interpersonal scrutiny. When data are repurposed in novel contexts, they are not only shared between subjects, but can in some cases be shared across radically different paradigms with incompatible theoretical commitments.

4. The epistemic value of empirical evidence

One of the important applications of empirical evidence is its use in assessing the epistemic status of scientific theories. In this section we briefly discuss philosophical work on the role of empirical evidence in confirmation/falsification of scientific theories, ‘saving the phenomena,’ and in appraising the empirical adequacy of theories. However, further philosophical work ought to explore the variety of ways that empirical results bear on the epistemic status of theories and theorizing in scientific practice beyond these.

It is natural to think that computability, range of application, and other things being equal, true theories are better than false ones, good approximations are better than bad ones, and highly probable theoretical claims are better than less probable ones. One way to decide whether a theory or a theoretical claim is true, close to the truth, or acceptably probable is to derive predictions from it and use empirical data to evaluate them. Hypothetico-Deductive (HD) confirmation theorists proposed that empirical evidence argues for the truth of theories whose deductive consequences it verifies, and against those whose consequences it falsifies (Popper 1959, 32–34). But laws and theoretical generalizations seldom if ever entail observational predictions unless they are conjoined with one or more auxiliary hypotheses taken from the theory they belong to. When the prediction turns out to be false, HD has trouble explaining which of the conjuncts is to blame. If a theory entails a true prediction, it will continue to do so in conjunction with arbitrarily selected irrelevant claims. HD has trouble explaining why the prediction does not confirm the irrelevancies along with the theory of interest.
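These two difficulties can be stated schematically. The notation is an illustrative assumption rather than the sources’ own: T is the theory under test, A an auxiliary hypothesis, O an observational prediction, and X an arbitrary irrelevant claim.

```latex
% Blame problem: a failed prediction refutes only the conjunction.
(T \land A) \vDash O, \qquad \lnot O
  \quad\Longrightarrow\quad
  \lnot (T \land A) \;\equiv\; \lnot T \lor \lnot A

% Irrelevant conjunction: entailment survives tacking on X.
T \vDash O \quad\Longrightarrow\quad (T \land X) \vDash O
```

In the first schema, logic alone does not say whether T or A should be given up. In the second, if verifying O confirms whatever entails it, then O confirms the conjunction of T and X just as it confirms T.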

Another approach to confirmation by empirical evidence is Inference to the Best Explanation (IBE). The idea is roughly that an explanation of the evidence that exhibits certain desirable characteristics with respect to a family of candidate explanations is likely to be the true one (Lipton 1991). On this approach, it is in virtue of their successful explanation of the empirical evidence that theoretical claims are supported. Naturally, IBE advocates face the challenges of defending a suitable characterization of what counts as the ‘best’ and of justifying the limited pool of candidate explanations considered (Stanford 2006).

Bayesian approaches to scientific confirmation have garnered significant attention and are now widespread in philosophy of science. Bayesians hold that the evidential bearing of empirical evidence on a theoretical claim is to be understood in terms of likelihood or conditional probability. For example, whether empirical evidence argues for a theoretical claim might be thought to depend upon whether it is more probable (and if so how much more probable) than its denial conditional on a description of the evidence together with background beliefs, including theoretical commitments. But by Bayes’ Theorem, the posterior probability of the claim of interest (that is, its probability given the evidence) is proportional to that claim’s prior probability. How to justify the choice of these prior probability assignments is one of the most notorious points of contention arising for Bayesians. If one makes the assignment of priors a subjective matter decided by epistemic agents, then it is not clear that they can be justified. Once again, one’s use of evidence to evaluate a theory depends in part upon one’s theoretical commitments (Earman 1992, 33–86; Roush 2005, 149–186). If one instead appeals to chains of successive updating using Bayes’ Theorem based on past evidence, one has to invoke assumptions that generally do not obtain in actual scientific reasoning. For instance, to ‘wash out’ the influence of priors a limit theorem is invoked wherein we consider very many updating iterations, but much scientific reasoning of interest does not happen in the limit, and so in practice priors hold unjustified sway (Norton 2021, 33).
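To make the dependence on priors concrete, here is a minimal Python sketch. The numbers and the function names `posterior` and `update_repeatedly` are illustrative assumptions, not drawn from the works cited above. The sketch applies Bayes’ Theorem in the form P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|not-H)P(not-H)] to two agents who agree on the likelihoods but start from different priors, and then repeats the update on independent evidence of the same kind.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) by Bayes' Theorem for a hypothesis H and its denial."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


def update_repeatedly(prior, p_e_given_h, p_e_given_not_h, n_updates):
    """Condition on n_updates independent pieces of evidence of the same kind."""
    p = prior
    for _ in range(n_updates):
        p = posterior(p, p_e_given_h, p_e_given_not_h)
    return p


# Hypothetical likelihoods shared by both agents: E is twice as probable under H.
P_E_GIVEN_H = 0.8
P_E_GIVEN_NOT_H = 0.4

if __name__ == "__main__":
    for n in (1, 5, 20):
        optimist = update_repeatedly(0.5, P_E_GIVEN_H, P_E_GIVEN_NOT_H, n)
        skeptic = update_repeatedly(0.01, P_E_GIVEN_H, P_E_GIVEN_NOT_H, n)
        print(f"after {n:2d} updates: optimist={optimist:.4f}, skeptic={skeptic:.4f}")
```

With these assumed numbers the two agents still disagree sharply after a handful of updates and come close to agreement only after many more. This is the situation the ‘washing out’ worry describes: convergence is a long-run result, while actual scientific reasoning often stops well short of the long run.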

Rather than attempting to cast all instances of confirmation based on empirical evidence as belonging to a universal schema, a better approach may be to ‘go local’. Norton’s material theory of induction argues that inductive support arises from background knowledge, that is, from material facts that are domain specific. Norton argues that, for instance, the induction from “Some samples of the element bismuth melt at 271°C” to “all samples of the element bismuth melt at 271°C” is admissible not in virtue of some universal schema that carries us from ‘some’ to ‘all’ but in virtue of matters of fact (Norton 2003). In this particular case, the fact that licenses the induction is a fact about elements: “their samples are generally uniform in their physical properties” (ibid., 650). This is a fact pertinent to chemical elements, but not to samples of material like wax (ibid.). Thus Norton repeatedly emphasizes that “all induction is local”.

Still, there are those who may be skeptical about the very possibility of confirmation or of successful induction. Insofar as the bearing of evidence on theory is never totally decisive, and insofar as there is no single trusty universal schema that captures empirical support, perhaps the relationship between empirical evidence and scientific theory is not really about support after all. Giving up on empirical support would not automatically mean abandoning any epistemic value for empirical evidence. Rather than confirm theory, the epistemic role of evidence could be to constrain, for example by furnishing phenomena for theory to systematize or to adequately model.

Theories are said to ‘save’ observable phenomena if they satisfactorily predict, describe, or systematize them. How well a theory performs any of these tasks need not depend upon the truth or accuracy of its basic principles. Thus according to Osiander’s preface to Copernicus’ On the Revolutions, a locus classicus, astronomers “… cannot in any way attain to true causes” of the regularities among observable astronomical events, and must content themselves with saving the phenomena in the sense of using

… whatever suppositions enable … [them] to be computed correctly from the principles of geometry for the future as well as the past … (Osiander 1543, XX)

Theorists are to use those assumptions as calculating tools without committing themselves to their truth. In particular, the assumption that the planets revolve around the sun must be evaluated solely in terms of how useful it is in calculating their observable relative positions to a satisfactory approximation. Pierre Duhem’s Aim and Structure of Physical Theory articulates a related conception. For Duhem a physical theory

… is a system of mathematical propositions, deduced from a small number of principles, which aim to represent as simply and completely, and exactly as possible, a set of experimental laws. (Duhem 1906, 19)

‘Experimental laws’ are general, mathematical descriptions of observable experimental results. Investigators produce them by performing measuring and other experimental operations and assigning symbols to perceptible results according to pre-established operational definitions (Duhem 1906, 19). For Duhem, the main function of a physical theory is to help us store and retrieve information about observables we would not otherwise be able to keep track of. If that is what a theory is supposed to accomplish, its main virtue should be intellectual economy. Theorists are to replace reports of individual observations with experimental laws and devise higher level laws (the fewer, the better) from which experimental laws (the more, the better) can be mathematically derived (Duhem 1906, 21ff).

A theory’s experimental laws can be tested for accuracy and comprehensiveness by comparing them to observational data. Let EL be one or more experimental laws that perform acceptably well on such tests. Higher level laws can then be evaluated on the basis of how well they integrate EL into the rest of the theory. Some data that don’t fit integrated experimental laws won’t be interesting enough to worry about. Other data may need to be accommodated by replacing or modifying one or more experimental laws or adding new ones. If the required additions, modifications or replacements deliver experimental laws that are harder to integrate, the data count against the theory. If the required changes are conducive to improved systematization the data count in favor of it. If the required changes make no difference, the data don’t argue for or against the theory.

On van Fraassen’s (1980) semantic account, a theory is empirically adequate when the empirical structure of at least one model of that theory is isomorphic to what he calls the “appearances” (45). In other words, when the theory “has at least one model that all the actual phenomena fit inside” (12). Thus, for van Fraassen, we continually check the empirical adequacy of our theories by seeing if they have the structural resources to accommodate new observations. We’ll never know that a given theory is totally empirically adequate, since for van Fraassen, empirical adequacy obtains with respect to all that is observable in principle to creatures like us, not all that has already been observed (69).

The primary appeal of dealing in empirical adequacy rather than confirmation is its appropriate epistemic humility. Instead of claiming that confirming evidence justifies belief (or boosted confidence) that a theory is true, one is restricted to saying that the theory continues to be consistent with the evidence as far as we can tell so far. However, if the epistemic utility of empirical results in appraising the status of theories is just to judge their empirical adequacy, then it may be difficult to account for the difference between adequate but unrealistic theories, and those equally adequate theories that ought to be taken seriously as representations. Appealing to extra-empirical virtues like parsimony may be a way out, but one that will not appeal to philosophers skeptical of the connection thereby supposed between such virtues and representational fidelity.

On an earlier way of thinking, observation was to serve as the unmediated foundation of science—direct access to the facts upon which the edifice of scientific knowledge could be built. When conflict arose between factions with different ideological commitments, observations could furnish the material for neutral arbitration and settle the matter objectively, in virtue of being independent of non-empirical commitments. According to this view, scientists working in different paradigms could at least appeal to the same observations, and propagandists could be held accountable to the publicly accessible content of theory and value-free observations. Despite their different theories, Priestley and Lavoisier could find shared ground in the observations. Anti-Semites would be compelled to admit the success of a theory authored by a Jewish physicist, in virtue of the unassailable facts revealed by observation.

This version of empiricism with respect to science does not accord well with the fact that observation per se plays a relatively small role in many actual scientific methodologies, and the fact that even the most ‘raw’ data is often already theoretically imbued. The strict contrast between theory and observation in science is more fruitfully supplanted by inquiry into the relationship between theorizing and empirical results.

Contemporary philosophers of science tend to embrace the theory ladenness of empirical results. Instead of seeing the integration of the theoretical and the empirical as an impediment to furthering scientific knowledge, they see it as necessary. A ‘view from nowhere’ would not bear on our particular theories. That is, it is impossible to put empirical results to use without recruiting some theoretical resources. In order to use an empirical result to constrain or test a theory it has to be processed into a form that can be compared to that theory. To get stellar spectrograms to bear on Newtonian or relativistic cosmology, they need to be processed—into galactic rotation curves, say. The spectrograms by themselves are just artifacts, pieces of paper. Scientists need theoretical resources in order to even identify that such artifacts bear information relevant for their purposes, and certainly to put them to any epistemic use in assessing theories.

This outlook does not render contemporary philosophers of science all constructivists, however. Theory mediates the connection between the target of inquiry and the scientific worldview; it does not sever it. Moreover, vigilance is still required to ensure that the particular ways in which theory is ‘involved’ in the production of empirical results are not epistemically detrimental. Theory can be deployed in experiment design, data processing, and presentation of results in unproductive ways, for instance, in determining whether the results will speak for or against a particular theory regardless of what the world is like. Critical appraisal of the roles of theory is thus important for genuine learning about nature through science. Indeed, it seems that extra-empirical values can sometimes assist such critical appraisal. Instead of viewing observation as theory-free and for that reason as furnishing the content with which to appraise theories, we might attend to the choices and mistakes that can be made in collecting and generating empirical results with the help of theoretical resources, and endeavor to make choices conducive to learning and correct mistakes as we discover them.

Recognizing the involvement of theory and values in the constitution and generation of empirical results does not undermine the special epistemic value of empirical science in contrast to propaganda and pseudoscience. In cases where the influence of cultural, political, and religious values hinders scientific inquiry, it is often the case that it does so by limiting or determining the nature of the empirical results. Yet, by working to make the assumptions that shape results explicit we can examine their suitability for our purposes and attempt to restructure inquiry as necessary. When disagreements arise, scientists can attempt to settle them by appealing to the causal connections between the research target and the empirical data. The tribunal of experience speaks through empirical results, but it only does so via careful fashioning with theoretical resources.

  • Anderson, E., 2004, “Uses of Value Judgments in Science: A General Argument, with Lessons from a Case Study of Feminist Research on Divorce,” Hypatia, 19(1): 1–24.
  • Aristotle(a), Generation of Animals in Complete Works of Aristotle (Volume 1), J. Barnes (ed.), Princeton: Princeton University Press, 1995, pp. 774–993.
  • Aristotle(b), History of Animals in Complete Works of Aristotle (Volume 1), J. Barnes (ed.), Princeton: Princeton University Press, 1995, pp. 1111–1228.
  • Azzouni, J., 2004, “Theory, Observation, and Scientific Realism,” British Journal for the Philosophy of Science, 55(3): 371–92.
  • Bacon, Francis, 1620, Novum Organum with other parts of the Great Instauration, P. Urbach and J. Gibson (eds. and trans.), La Salle: Open Court, 1994.
  • Bogen, J., 2016, “Empiricism and After,” in P. Humphreys (ed.), Oxford Handbook of Philosophy of Science, Oxford: Oxford University Press, pp. 779–795.
  • Bogen, J., and Woodward, J., 1988, “Saving the Phenomena,” Philosophical Review, XCVII (3): 303–352.
  • Bokulich, A., 2020, “Towards a Taxonomy of the Model-Ladenness of Data,” Philosophy of Science, 87(5): 793–806.
  • Borrelli, A., 2012, “The Case of the Composite Higgs: The Model as a ‘Rosetta Stone’ in Contemporary High-Energy Physics,” Studies in History and Philosophy of Science (Part B: Studies in History and Philosophy of Modern Physics), 43(3): 195–214.
  • Boyd, N. M., 2018, “Evidence Enriched,” Philosophy of Science, 85(3): 403–21.
  • Boyle, R., 1661, The Sceptical Chymist, Montana: Kessinger (reprint of 1661 edition).
  • Bridgman, P., 1927, The Logic of Modern Physics, New York: Macmillan.
  • Chang, H., 2005, “A Case for Old-fashioned Observability, and a Reconstructive Empiricism,” Philosophy of Science, 72(5): 876–887.
  • Collins, H. M., 1985, Changing Order, Chicago: University of Chicago Press.
  • Conant, J.B. (ed.), 1957, “The Overthrow of the Phlogiston Theory: The Chemical Revolution of 1775–1789,” in J.B. Conant and L.K. Nash (eds.), Harvard Studies in Experimental Science, Volume I, Cambridge: Harvard University Press, pp. 65–116.
  • Daston, L., and P. Galison, 2007, Objectivity, Brooklyn: Zone Books.
  • Douglas, H., 2000, “Inductive Risk and Values in Science,” Philosophy of Science, 67(4): 559–79.
  • –––, 2013, “The Value of Cognitive Values,” Philosophy of Science, 80(5): 796–806.
  • Duhem, P., 1906, The Aim and Structure of Physical Theory, P. Wiener (tr.), Princeton: Princeton University Press, 1991.
  • Earman, J., 1992, Bayes or Bust?, Cambridge: MIT Press.
  • Feest, U., 2005, “Operationism in psychology: what the debate is about, what the debate should be about,” Journal of the History of the Behavioral Sciences, 41(2): 131–149.
  • Feyerabend, P.K., 1969, “Science Without Experience,” in P.K. Feyerabend, Realism, Rationalism, and Scientific Method (Philosophical Papers I), Cambridge: Cambridge University Press, 1985, pp. 132–136.
  • Franklin, A., 1986, The Neglect of Experiment, Cambridge: Cambridge University Press.
  • Galison, P., 1987, How Experiments End, Chicago: University of Chicago Press.
  • –––, 1990, “Aufbau/Bauhaus: logical positivism and architectural modernism,” Critical Inquiry, 16(4): 709–753.
  • Goodman, A., et al., 2014, “Ten Simple Rules for the Care and Feeding of Scientific Data,” PLoS Computational Biology, 10(4): e1003542.
  • Hacking, I., 1981, “Do We See Through a Microscope?,” Pacific Philosophical Quarterly, 62(4): 305–322.
  • –––, 1983, Representing and Intervening, Cambridge: Cambridge University Press.
  • Hanson, N.R., 1958, Patterns of Discovery, Cambridge: Cambridge University Press.
  • Hempel, C.G., 1952, “Fundamentals of Concept Formation in Empirical Science,” in Foundations of the Unity of Science, Volume 2, O. Neurath, R. Carnap, C. Morris (eds.), Chicago: University of Chicago Press, 1970, pp. 651–746.
  • Herschel, J. F. W., 1830, Preliminary Discourse on the Study of Natural Philosophy, New York: Johnson Reprint Corp., 1966.
  • Hooke, R., 1705, “The Method of Improving Natural Philosophy,” in R. Waller (ed.), The Posthumous Works of Robert Hooke, London: Frank Cass and Company, 1971.
  • Horowitz, P., and W. Hill, 2015, The Art of Electronics, third edition, New York: Cambridge University Press.
  • Intemann, K., 2021, “Feminist Perspectives on Values in Science,” in S. Crasnow and K. Intemann (eds.), The Routledge Handbook of Feminist Philosophy of Science, New York: Routledge, pp. 201–15.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions, Chicago: University of Chicago Press, reprinted 1996.
  • Latour, B., 1999, “Circulating Reference: Sampling the Soil in the Amazon Forest,” in Pandora’s Hope: Essays on the Reality of Science Studies, Cambridge, MA: Harvard University Press, pp. 24–79.
  • Latour, B., and Woolgar, S., 1979, Laboratory Life, The Construction of Scientific Facts, Princeton: Princeton University Press, 1986.
  • Laymon, R., 1988, “The Michelson-Morley Experiment and the Appraisal of Theories,” in A. Donovan, L. Laudan, and R. Laudan (eds.), Scrutinizing Science: Empirical Studies of Scientific Change, Baltimore: The Johns Hopkins University Press, pp. 245–266.
  • Leonelli, S., 2009, “On the Locality of Data and Claims about Phenomena,” Philosophy of Science, 76(5): 737–49.
  • Leonelli, S., and N. Tempini (eds.), 2020, Data Journeys in the Sciences, Cham: Springer.
  • Lipton, P., 1991, Inference to the Best Explanation, London: Routledge.
  • Lloyd, E.A., 1993, “Pre-theoretical Assumptions In Evolutionary Explanations of Female Sexuality,” Philosophical Studies, 69: 139–153.
  • –––, 2012, “The Role of ‘Complex’ Empiricism in the Debates about Satellite Data and Climate Models,” Studies in History and Philosophy of Science (Part A), 43(2): 390–401.
  • Longino, H., 1979, “Evidence and Hypothesis: An Analysis of Evidential Relations,” Philosophy of Science, 46(1): 35–56.
  • –––, 2020, “Afterward: Data in Transit,” in S. Leonelli and N. Tempini (eds.), Data Journeys in the Sciences, Cham: Springer, pp. 391–400.
  • Lupyan, G., 2015, “Cognitive Penetrability of Perception in the Age of Prediction – Predictive Systems are Penetrable Systems,” Review of Philosophical Psychology, 6(4): 547–569. doi:10.1007/s13164-015-0253-4
  • Mill, J. S., 1872, System of Logic, London: Longmans, Green, Reader, and Dyer.
  • Norton, J., 2003, “A Material Theory of Induction,” Philosophy of Science, 70(4): 647–70.
  • –––, 2021, The Material Theory of Induction, .
  • Nyquist, H., 1928, “Thermal Agitation of Electric Charge in Conductors,” Physical Review, 32(1): 110–13.
  • O’Connor, C. and J. O. Weatherall, 2019, The Misinformation Age: How False Beliefs Spread, New Haven: Yale University Press.
  • Olesko, K.M. and Holmes, F.L., 1994, “Experiment, Quantification and Discovery: Helmholtz’s Early Physiological Researches, 1843–50,” in D. Cahan (ed.), Hermann Helmholtz and the Foundations of Nineteenth Century Science, Berkeley: UC Press, pp. 50–108.
  • Osiander, A., 1543, “To the Reader Concerning the Hypothesis of this Work,” in N. Copernicus, On the Revolutions, E. Rosen (tr., ed.), Baltimore: Johns Hopkins University Press, 1978, p. XX.
  • Parker, W. S., 2016, “Reanalysis and Observation: What’s the Difference?,” Bulletin of the American Meteorological Society, 97(9): 1565–72.
  • –––, 2017, “Computer Simulation, Measurement, and Data Assimilation,” The British Journal for the Philosophy of Science, 68(1): 273–304.
  • Popper, K.R., 1959, The Logic of Scientific Discovery, K.R. Popper (tr.), New York: Basic Books.
  • Rheinberger, H. J., 1997, Towards a History of Epistemic Things: Synthesizing Proteins in the Test Tube, Stanford: Stanford University Press.
  • Roush, S., 2005, Tracking Truth, Cambridge: Cambridge University Press.
  • Rudner, R., 1953, “The Scientist Qua Scientist Makes Value Judgments,” Philosophy of Science, 20(1): 1–6.
  • Schlick, M., 1935, “Facts and Propositions,” in Philosophy and Analysis, M. Macdonald (ed.), New York: Philosophical Library, 1954, pp. 232–236.
  • Schottky, W. H., 1918, “Über spontane Stromschwankungen in verschiedenen Elektrizitätsleitern,” Annalen der Physik, 362(23): 541–67.
  • Shapere, D., 1982, “The Concept of Observation in Science and Philosophy,” Philosophy of Science, 49(4): 485–525.
  • Stanford, K., 2006, Exceeding Our Grasp: Science, History, and the Problem of Unconceived Alternatives, Oxford: Oxford University Press.
  • Stephenson, F. R., L. V. Morrison, and C. Y. Hohenkerk, 2016, “Measurement of the Earth’s Rotation: 720 BC to AD 2015,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 472: 20160404.
  • Stuewer, R.H., 1985, “Artificial Disintegration and the Cambridge-Vienna Controversy,” in P. Achinstein and O. Hannaway (eds.), Observation, Experiment, and Hypothesis in Modern Physical Science, Cambridge, MA: MIT Press, pp. 239–307.
  • Suppe, F., 1977, in F. Suppe (ed.), The Structure of Scientific Theories, Urbana: University of Illinois Press.
  • Van Fraassen, B.C., 1980, The Scientific Image, Oxford: Clarendon Press.
  • Ward, Z. B., 2021, “On Value-Laden Science,” Studies in History and Philosophy of Science Part A, 85: 54–62.
  • Whewell, W., 1858, Novum Organon Renovatum, Book II, in William Whewell Theory of Scientific Method, R.E. Butts (ed.), Indianapolis: Hackett Publishing Company, 1989, pp. 103–249.
  • Woodward, J. F., 2010, “Data, Phenomena, Signal, and Noise,” Philosophy of Science, 77(5): 792–803.
  • –––, 2011, “Data and Phenomena: A Restatement and Defense,” Synthese, 182(1): 165–79.
  • Wylie, A., 2020, “Radiocarbon Dating in Archaeology: Triangulation and Traceability,” in S. Leonelli and N. Tempini (eds.), Data Journeys in the Sciences, Cham: Springer, pp. 285–301.
  • Yap, A., 2016, “Feminist Radical Empiricism, Values, and Evidence,” Hypatia, 31(1): 58–73.
How to cite this entry . Preview the PDF version of this entry at the Friends of the SEP Society . Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers , with links to its database.
  • Confirmation , by Franz Huber, in the Internet Encyclopedia of Philosophy .
  • Transcript of Katzmiller v. Dover Area School District (on the teaching of intelligent design).

Bacon, Francis | Bayes’ Theorem | constructive empiricism | Duhem, Pierre | empiricism: logical | epistemology: Bayesian | feminist philosophy, topics: perspectives on science | incommensurability: of scientific theories | Locke, John | measurement: in science | models in science | physics: experiment in | science: and pseudo-science | scientific objectivity | scientific research and big data | statistics, philosophy of

Copyright © 2021 by Nora Mills Boyd <nboyd@siena.edu> and James Bogen


The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab , Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054


Empirical Research: Quantitative & Qualitative


Introduction: What is Empirical Research?


Empirical research  is based on phenomena that can be observed and measured. Empirical research derives knowledge from actual experience rather than from theory or belief. 

Key characteristics of empirical research include:

  • Specific research questions to be answered;
  • Definitions of the population, behavior, or phenomena being studied;
  • Description of the methodology or research design used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys);
  • Two basic research processes or methods in empirical research: quantitative methods and qualitative methods (see the rest of the guide for more about these methods).

(based on the original from the Connelly Library of La Salle University)


Empirical Research: Qualitative vs. Quantitative


Quantitative Research

A quantitative research project is characterized by having a population about which the researcher wants to draw conclusions, but it is not possible to collect data on the entire population.

  • For an observational study, it is necessary to select a proper, statistical random sample and to use methods of statistical inference to draw conclusions about the population. 
  • For an experimental study, it is necessary to have a random assignment of subjects to experimental and control groups in order to use methods of statistical inference.

Statistical methods are used in all three stages of a quantitative research project.

For observational studies, the data are collected using statistical sampling theory. Then, the sample data are analyzed using descriptive statistical analysis. Finally, generalizations are made from the sample data to the entire population using statistical inference.

For experimental studies, the subjects are allocated to experimental and control groups using randomizing methods. Then, the experimental data are analyzed using descriptive statistical analysis. Finally, just as for observational data, generalizations are made to a larger population.

Iversen, G. (2004). Quantitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.), Encyclopedia of social science research methods . (pp. 897-898). Thousand Oaks, CA: SAGE Publications, Inc.
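
The three stages described above (sampling or assignment, descriptive analysis, inference) can be illustrated with a minimal Python sketch. It is not taken from Iversen's entry: the population values, the function names, and the use of a normal-approximation 95% interval (z = 1.96) are all assumptions chosen for illustration.

```python
import random
import statistics


def simple_random_sample(population, n, seed=0):
    """Stage 1 (observational study): draw n units without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def mean_with_ci(sample, z=1.96):
    """Stages 2 and 3: describe the sample, then infer about the population.

    Returns the sample mean and an approximate 95% confidence interval
    based on the normal approximation (an assumption; small samples
    would normally use a t-distribution instead).
    """
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / len(sample) ** 0.5
    return mean, (mean - z * standard_error, mean + z * standard_error)


def random_assignment(subjects, seed=0):
    """Stage 1 (experimental study): split subjects into two groups at random."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population of 10,000 scores; in practice it is unobservable in full.
_rng = random.Random(42)
population = [_rng.gauss(50, 10) for _ in range(10_000)]

sample = simple_random_sample(population, n=200, seed=1)
mean, (lower, upper) = mean_with_ci(sample)
print(f"sample mean = {mean:.2f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")

experimental, control = random_assignment(range(20), seed=1)
print("experimental group:", sorted(experimental))
print("control group:", sorted(control))
```

The fixed seeds only make the illustration repeatable; a real study would document its randomization procedure rather than rely on a hard-coded seed.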

Qualitative Research

What makes a work deserving of the label qualitative research is the demonstrable effort to produce richly and relevantly detailed descriptions and particularized interpretations of people and the social, linguistic, material, and other practices and events that shape and are shaped by them.

Qualitative research typically includes, but is not limited to, discerning the perspectives of these people, or what is often referred to as the actor’s point of view. Although both philosophically and methodologically a highly diverse entity, qualitative research is marked by certain defining imperatives that include its case (as opposed to its variable) orientation, sensitivity to cultural and historical context, and reflexivity. 

In its many guises, qualitative research is a form of empirical inquiry that typically entails some form of purposive sampling for information-rich cases; in-depth interviews and open-ended interviews, lengthy participant/field observations, and/or document or artifact study; and techniques for analysis and interpretation of data that move beyond the data generated and their surface appearances. 

Sandelowski, M. (2004).  Qualitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.),  Encyclopedia of social science research methods . (pp. 893-894). Thousand Oaks, CA: SAGE Publications, Inc.

  • Last Updated: Mar 22, 2024 10:47 AM

Arrendale Library Piedmont University 706-776-0111

Empirical Research: A Comprehensive Guide for Academics 


Empirical research relies on gathering and studying real, observable data. The term ‘empirical’ comes from the Greek word ‘empeirikos,’ meaning ‘experienced’ or ‘based on experience.’ So, what is empirical research? Instead of using theories or opinions, empirical research depends on real data obtained through direct observation or experimentation. 

Why Empirical Research?

Empirical research plays a key role in checking or improving current theories, providing a systematic way to grow knowledge across different areas. By focusing on objectivity, it makes research findings more trustworthy, which is critical in research fields like medicine, psychology, economics, and public policy. In the end, the strengths of empirical research lie in deepening our awareness of the world and improving our capacity to tackle problems wisely. 1,2  

Qualitative and Quantitative Methods

There are two main types of empirical research methods – qualitative and quantitative. 3,4 Qualitative research delves into intricate phenomena using non-numerical data, such as interviews or observations, to offer in-depth insights into human experiences. In contrast, quantitative research analyzes numerical data to spot patterns and relationships, aiming for objectivity and the ability to apply findings to a wider context. 

Steps for Conducting Empirical Research

When it comes to conducting research, there are some simple steps that researchers can follow. 5,6  

  • Create Research Hypothesis:  Clearly state the specific question you want to answer or the hypothesis you want to explore in your study. 
  • Examine Existing Research:  Read and study existing research on your topic. Understand what’s already known, identify existing gaps in knowledge, and create a framework for your own study based on what you learn. 
  • Plan Your Study:  Decide how you’ll conduct your research—whether through qualitative methods, quantitative methods, or a mix of both. Choose suitable techniques like surveys, experiments, interviews, or observations based on your research question. 
  • Develop Research Instruments:  Create reliable data collection tools, such as surveys or questionnaires, to help you gather data. Ensure these tools are well-designed and effective. 
  • Collect Data:  Systematically gather the information you need for your research according to your study design and protocols using the chosen research methods. 
  • Data Analysis:  Analyze the collected data using suitable statistical or qualitative methods that align with your research question and objectives. 
  • Interpret Results:  Understand and explain the significance of your analysis results in the context of your research question or hypothesis. 
  • Draw Conclusions:  Summarize your findings and draw conclusions based on the evidence. Acknowledge any study limitations and propose areas for future research. 

Advantages of Empirical Research

Empirical research is valuable because it stays objective by relying on observable data, lessening the impact of personal biases. This objectivity boosts the trustworthiness of research findings. Also, using precise quantitative methods helps in accurate measurement and statistical analysis. This precision ensures researchers can draw reliable conclusions from numerical data, strengthening our understanding of the studied phenomena. 4  

Disadvantages of Empirical Research

While empirical research has notable strengths, researchers must also be aware of its limitations when deciding on the right research method for their study. 4 One significant drawback of empirical research is the risk of oversimplifying complex phenomena, especially when relying solely on quantitative methods. These methods may struggle to capture the richness and nuances present in certain social, cultural, or psychological contexts. Another challenge is the potential for confounding variables or biases during data collection, impacting result accuracy.  

Tips for Empirical Writing

In empirical research, the writing is usually done in research papers, articles, or reports. The empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7   

  • Define Your Objectives:  When you write about your research, start by making your goals clear. Explain what you want to find out or prove in a simple and direct way. This helps guide your research and lets others know what you have set out to achieve. 
  • Be Specific in Your Literature Review:  In the part where you talk about what others have studied before you, focus on research that directly relates to your research question. Keep it short and pick studies that help explain why your research is important. This part sets the stage for your work. 
  • Explain Your Methods Clearly : When you talk about how you did your research (Methods), explain it in detail. Be clear about your research plan, who took part, and what you did; this helps others understand and trust your study. Also, be honest about any rules you follow to make sure your study is ethical and reproducible. 
  • Share Your Results Clearly : After doing your empirical research, share what you found in a simple way. Use tables or graphs to make it easier for your audience to understand your research. Also, talk about any numbers you found and clearly state if they are important or not. Ensure that others can see why your research findings matter. 
  • Talk About What Your Findings Mean:  In the part where you discuss your research results, explain what they mean. Discuss why your findings are important and if they connect to what others have found before. Be honest about any problems with your study and suggest ideas for more research in the future. 
  • Wrap It Up Clearly:  Finally, end your empirical research paper by summarizing what you found and why it’s important. Remind everyone why your study matters. Keep your writing clear and fix any mistakes before you share it. Ask someone you trust to read it and give you feedback before you finish. 


  1. Empirical Research in the Social Sciences and Education, Penn State University Libraries. Available online at  
  2. How to conduct empirical research, Emerald Publishing. Available online at  
  3. Empirical Research: Quantitative & Qualitative, Arrendale Library, Piedmont University. Available online at  
  4. Bouchrika, I. What Is Empirical Research? Definition, Types & Samples in 2024. January 2024. Available online at  
  5. Quantitative and Empirical Research vs. Other Types of Research. California State University, April 2023. Available online at  
  6. Empirical Research, Definitions, Methods, Types and Examples, website. Available online at  
  7. Writing an Empirical Paper in APA Style. Psychology Writing Center, University of Washington. Available online at  


  • BMC Health Serv Res


The effectiveness of implementation strategies for promoting evidence informed interventions in allied healthcare: a systematic review

Kaat Goorts

1 Department of Public Health and Primary Care, Environment and Health, KU Leuven, Leuven, Belgium

Janine Dizon

2 International Centre for Allied Health Evidence, University of South Australia, City East Campus, North Terrace, Adelaide, Australia

Steve Milanese

Associated data.

Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.

Evidence based practice in health care has become increasingly popular over the last decades. Many guidelines have been developed to improve evidence informed decision making in health care organisations; however, it is often overlooked that the implementation strategies for these guidelines are as important as the guidelines themselves. The effectiveness of these strategies is rarely tested specifically for the allied health therapy group.

Cochrane, Medline, Embase and Scopus databases were searched from 2000 to October 2019. Level I and II studies were included if an evidence informed implementation strategy was tested in allied health personnel.

The SIGN method was used to evaluate risk of bias. The evidence was synthesised using a narrative synthesis. The National Health and Medical Research Council (NHMRC) model was applied to evaluate the grade for recommendation.

A total of 490 unique articles were identified, with 6 primary studies meeting the inclusion criteria. Three different implementation strategies and three multi-faceted components strategies were described. We found moderate evidence for educational meetings, local opinion leaders and patient mediated interventions. We found stronger evidence for multi-faceted components strategies.

Few studies describe the effectiveness of implementation strategies for allied healthcare, but evidence was found for multi-faceted components for implementing research in an allied health therapy group population. When considering implementation of evidence informed interventions in allied health, a multi-pronged approach appears to be more successful.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12913-021-06190-0.

Evidence-based health care practices have been promoted within healthcare systems internationally [ 1 ], as the use of evidence informed practice has been linked to improved patient health outcomes [ 2 ]. Clinical guidelines, developed from the best available evidence, aim to improve patient outcomes and quality of care, reduce practice variation and/or reduce cost by providing clinicians with recommendations that reflect best practice [ 3 ].

However, the practices recommended in guidelines are not always implemented in healthcare delivery, and significant variations in health care practice remain [ 1 ]. It has been suggested that the extent to which guideline implementation occurs depends primarily on two factors: the quality of the evidence on which the guideline is based, and the guideline implementation strategy used [ 3 ].

In general, there are two types of implementation strategies: passive strategies, which include the use of educational materials, posters, toolkits and visual aids, and active strategies, which include interactive workshops, academic detailing, audit and feedback and reminders [ 4 ]. The evidence suggests that passive strategies may have modest beneficial effects, but do not necessarily lead to sustained behaviour change. In contrast, active multifaceted strategies appear to have the greatest impact [ 5 ]. In addition to the type of strategy used, both the individual practitioner and the organization perspectives should be considered in the implementation strategy.

Some authors have suggested that the differentiation between active and passive or single versus multi implementation strategy is too simplistic and fails to recognize the complexity that is inherent in knowledge translation. They advocate for translational strategies that take account of the type of knowledge to be implemented, the context of implementation and the people and processes involved [ 6 ]. The PARIHS (Promoting Action on Research Implementation in Health Services) framework [ 7 ] described successful translation as a function of the interplay between the research evidence, the context in which translation is happening and the ways in which the process is facilitated. Having one or more people in a facilitatory role, contextualising the evidence and devising appropriate translation strategies for the local environment, forms an important ‘active ingredient’ of the framework.

The Cochrane EPOC group (Effective Practice and Organisation of Care Review Group) has presented a data collection checklist for scientists undertaking reviews into interventions for improving professional practice and the delivery of effective health services. The aim of the checklist is to provide reviewers with guidance on the relevant information that could be extracted from primary studies. This checklist provides an overview of ten (10) different implementation strategies, including both passive and active strategies [ 8 ].

The Expert Recommendations for Implementing Change study (ERIC) clustered 73 implementation strategies identified from an expert panel of stakeholders [ 9 ] into nine clusters to make it easier to consider the implementation strategies by thematic cluster [ 10 ]. These clusters include engaging consumers, using evaluative and iterative strategies, changing infrastructure, adapting and tailoring the context, developing stakeholder interrelationships, utilising financial strategies, supporting clinicians, providing interactive assistance and training and educating stakeholders.

Several studies have investigated the effectiveness of one or multiple implementation strategies, and several systematic reviews have aimed to synthesise this evidence [ 11 – 14 ]. However, in many studies/reviews, the results were not differentiated for the range of professions within the healthcare system, with a number of studies generalizing results for all “healthcare workers” including physicians, nurses, paramedics and other allied health groups. Differentiating between medicine (physicians, doctors), nursing and allied health may be important when considering implementation strategies as adherence to these strategies may differ between these groups.

Three reviews [ 3 , 15 , 16 ] have focused on the allied health profession. However, whilst these reviews gave an overview of the existing evidence, the inclusion of lower quality studies and significant heterogeneity across the included studies meant that the pooling of results was not possible. Also, the recommendations from the evidence on the strategies in practice were not quantitatively graded using grading methods [ 3 , 16 ]. These inconsistencies may explain the differences in review findings. Menon et al. [ 15 ] concluded that the use of active, multi-component knowledge transfer interventions enhanced knowledge and practice behaviours in physical therapists but that additional research was needed in occupational therapy. In contrast, Hakkennes and Dodd [ 3 ] suggested that multi-faceted interventions were not more effective than single intervention strategies in allied health.

As all three reviews are at least seven years old, it is necessary to update them in light of more current evidence and to explore the recommendations in terms of the quality of the evidence presented, using a standardised evidence-to-decision framework. Therefore, the current review aimed to update the previous evidence reviews by identifying studies that have evaluated the effectiveness of strategies for disseminating and implementing evidence-based guidelines, specifically in an allied health context. By narrowing the review question to this specific context and focussing on high hierarchy and high-quality evidence, we aim to provide more valid recommendations for practice.

Protocol and registration

The systematic review protocol was registered in PROSPERO with ID number 152512.

Identifying the research question

The primary aim of this review was to evaluate the effectiveness of implementation strategies for promoting evidence-informed interventions in allied health. A secondary aim was to describe the context in which certain implementation strategies were most effective.

Eligibility criteria

Studies were selected based on the study design, the participants, implementation strategies and outcomes. Only randomized controlled trials (RCTs) and systematic reviews (SRs) were included. Within the SRs, only the primary RCTs that satisfied the inclusion criteria were included.

Studies were included if the participants were part of an allied health therapy group. The classification of allied health was based on the definition of Turnbull et al. [ 17 ], in which four allied health groups were defined: a therapy group, a diagnostic and technical group, a scientific group and a complementary services group. In this paper, we discuss the allied health therapy group only, which includes nutritionists and dietitians, occupational therapists, physiotherapists, psychologists, podiatrists, social workers, speech pathologists, exercise physiologists, ambulance paramedics, music therapists, art therapists, ambulance officers and intensive care paramedics.

Studies were included if the implementation strategy was applied to the therapists in the allied health care therapy group (no patient only interventions) and if the implementation strategy was used to implement evidence informed healthcare guidelines. Studies were included if the outcomes addressed the impact on patient outcomes or process/profession outcomes. Studies were excluded if they were not original publications or were not published in the English language or were unable to be accessed in full text.

Information sources

Keywords were applied in the Cochrane, Medline, Embase and Scopus databases on October 4th, 2019.

A systematic search was performed to identify literature regarding the effectiveness of research implementation strategies in allied health contexts. The keywords used were: (health* or hospital*).

Allied Health Personnel/ (“allied health personnel” or “allied health professional*” or “assistant*, healthcare” or “health personnel, allied” or “health professional*, allied” or “healthcare assistant*” or “healthcare support worker*” or “paramedic*” or “paramedical personnel” or “personnel, allied health” or “personnel, paramedical” or “population program specialist*” or “professional*, allied health” or “program specialist*, population” or “specialist*, population program” or “support worker*, healthcare” or “worker*, healthcare support”).

“Diffusion of Innovation”/ or Evidence-Based Medicine/ or Evidence-Based Practice/ or Information Dissemination/ (“Knowledge translation” or “knowledge transfer” or “knowledge implementation” or “knowledge utili?ation” or “knowledge dissemination” or “knowledge adoption” or “knowledge change*” or “knowledge evaluation” or “knowledge use*” or “knowledge institutionali?ation” or “knowledge communication” or “research translation” or “research transfer” or “research implementation” or “research utili?ation” or “research dissemination” or “research adoption” or “research change*” or “research evaluation” or “research use*” or “research institutionali?ation” or “research communication” or “evidence translation” or “evidence transfer” or “evidence implementation” or “evidence utili?ation” or “evidence dissemination” or “evidence adoption” or “evidence change*” or “evidence evaluation” or “evidence use*” or “evidence institutionali?ation” or “evidence communication” or “Translation of knowledge” or “translation of research” or “translation of evidence” or “transfer of knowledge” or “transfer of research” or “transfer of evidence” or “systematic review evidence” or “implementation strateg*”).
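
The free-text part of the search string above is largely a cross product of three subject words (knowledge, research, evidence) with eleven action terms, using `*` for truncation and `?` as a wildcard. The following Python sketch rebuilds those 33 phrases and shows what the wildcards would match. The function names are hypothetical, and the wildcard semantics are an assumption (`*` as any run of trailing word characters, `?` as exactly one character); the actual behaviour differs between database platforms.

```python
import re
from itertools import product

SUBJECTS = ["knowledge", "research", "evidence"]
ACTIONS = [
    "translation", "transfer", "implementation", "utili?ation",
    "dissemination", "adoption", "change*", "evaluation", "use*",
    "institutionali?ation", "communication",
]


def build_phrases(subjects, actions):
    """Combine every subject with every action term, in listed order."""
    return [f"{subject} {action}" for subject, action in product(subjects, actions)]


def term_to_regex(term):
    """Translate a search term into a regular expression.

    Assumed semantics (platform-dependent): "*" matches any number of
    word characters, "?" matches exactly one word character.
    """
    parts = []
    for char in term:
        if char == "*":
            parts.append(r"\w*")
        elif char == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE)


phrases = build_phrases(SUBJECTS, ACTIONS)
query = " or ".join(f'"{phrase}"' for phrase in phrases)
print(len(phrases), "phrases")
print(query[:80] + " ...")

# Both spellings are covered by the single wildcard term.
pattern = term_to_regex("knowledge utili?ation")
for candidate in ["knowledge utilisation", "knowledge utilization"]:
    print(candidate, "->", bool(pattern.fullmatch(candidate)))
```

Generating the phrases programmatically makes it easy to check that no subject and action combination was left out when the string is adapted to another database.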

A date-limited search (from 2000 onwards) was applied, as the contextual factors (i.e. healthcare systems) have evolved over time. In addition, the use of formalised evidence-based clinical decision making became popular from approximately 1996, when Sackett and colleagues defined evidence-based clinical decision making as a combination of not only research evidence, but also clinical expertise, taking into account the patient’s preferences [ 18 ].

Electronic database searches were supplemented by checking the reference list of included articles.

Searches were performed by two authors (KG and JD).

Study selection

From the initial search, duplicates were removed. Titles and abstracts were screened for eligibility based on the criteria above and full texts of potentially included studies were retrieved and further assessed for eligibility. Only level I and II studies (SRs and RCTs) were included as they represent the highest level of evidence. Studies were selected independently by two authors (KG and JD).

Data collation, summary and reporting of findings

A purpose-built Microsoft Excel© sheet was used to extract relevant data from the selected studies, including the authors, study design, setting, participants, type of implementation strategy and the associated outcomes. Data were extracted by one author (KG).

Findings were categorised using the taxonomy of professional interventions from [ 8 ] and the nine clusters of implementation strategies [ 10 ]. The taxonomy of professional interventions includes:

  • Distribution of educational materials—distribution of published or printed recommendations for clinical care, including clinical practice guidelines, audio-visual materials, and electronic publications
  • Educational meetings—health care providers who have participated in conferences, lectures, workshops, or traineeships
  • Local consensus processes—inclusion of participating providers in discussion to ensure that they agreed that the chosen clinical problem was important and the approach to managing the problem was appropriate
  • Educational outreach visits—use of a trained person who met with providers in their practice settings to give information with the intent of changing the provider’s practice
  • Local opinion leaders—use of providers nominated by their colleagues as “educationally influential.” The investigators must have explicitly stated that their colleagues identified the opinion leaders
  • Patient mediated interventions—new clinical information (not previously available) collected directly from patients and given to the provider, e.g., depression scores from an instrument
  • Audit and feedback—any summary of clinical performance of health care over a specified period of time
  • Reminders—patient or encounter-specific information, provided verbally, on paper or on a computer screen that is designed or intended to prompt a health professional to recall information
  • Marketing—use of personal interviewing, group discussion (“focus groups”), or a survey of targeted providers to identify barriers to change and subsequent design of an intervention that addresses identified barriers
  • Mass media—(i) varied use of communication that reached great numbers of people including television, radio, newspapers, posters, leaflets, and booklets, alone or in conjunction with other interventions; and (ii) targeted at the population level

Risk of bias in individual studies

Two reviewers (KG and JD) independently assessed the quality of included publications using the relevant critical appraisal checklist from the Scottish Intercollegiate Guidelines Network (SIGN) [ 19 ]. The checklist was applied to each study and scored, with scores < 3 categorised as low quality (LQ), between 4 and 6 as average quality (AQ) and > 7 as high quality (HQ). Any disagreements were resolved by discussion between reviewers, and where agreement could not be reached an independent third reviewer (SM) was consulted. The SIGN checklists were used as they are widely used critical appraisal tools that are available for a range of study designs [ 20 ].
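
The quality bands can be written as a small Python function. This is a sketch of the thresholds exactly as stated in the text, not of the SIGN checklist itself; the function name is hypothetical. Because the stated bands do not cover scores of exactly 3 or exactly 7, the sketch returns `None` for those values rather than guessing which band was intended.

```python
from typing import Optional


def sign_quality_category(score: int) -> Optional[str]:
    """Map a checklist score to the quality band stated in the text.

    Bands as written: < 3 is LQ, 4 to 6 is AQ, > 7 is HQ.
    Scores of exactly 3 or 7 fall outside the stated bands, so None is
    returned for them (an explicit choice in this sketch).
    """
    if score < 3:
        return "LQ"
    if 4 <= score <= 6:
        return "AQ"
    if score > 7:
        return "HQ"
    return None


for example_score in range(1, 10):
    print(example_score, sign_quality_category(example_score))
```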

Grading of recommendations

Studies were assessed for relevancy, reliability, validity, and applicability, and the level of evidence was evaluated using the National Health and Medical Research Council (NHMRC) model for additional levels of evidence and grades for recommendations for developers of guidelines. The NHMRC model is a logical and intuitive way to formulate and grade recommendations that has been widely adopted by Australian guideline developers [ 21 ]. The NHMRC grading process is described in Table 1 of the supplementary files.

The initial search yielded 464 original results; however, only six studies remained for inclusion after screening (see Fig. 1). We found two eligible RCTs and two eligible SRs. For one SR [ 3 ], no overview table was available, and it was therefore decided to screen the reference list of this review to find eligible studies. Since all eligible primary studies from this review [ 3 ] were also included in the second review [ 16 ], it was decided to exclude this review.


PRISMA flow chart

We decided to include the primary studies from the SR. A total of six studies were included (two primary studies from the search and four primary studies from the SR [ 16 ]).

Study characteristics

Studies were grouped and categorised by implementation strategy based on the EPOC Taxonomy and the ERIC clusters. The results from the individual studies are summarized in Table 1.

Description of individual studies

Four types of implementation strategies were identified. One study described educational meetings [ 22 ], one study described local opinion leaders [ 23 ], one study described patient mediated intervention [ 16 ] and three studies described multi-faceted components [ 24 – 26 ]. Four studies involved physiotherapists [ 23 , 25 , 26 , 27 ], one involved paramedics [ 24 ] and one involved speech language therapists [ 22 ].

Three studies were from the UK [ 22 – 24 ], two from the Netherlands [ 25 , 26 ] and one from Australia [ 27 ].

The outcomes for each implementation strategy are summarized in Table 2.

Synthesis of results

S = significant ( p < 0.05); NS = not significant ( p > 0.05)

The grades of recommendation according to the NHMRC model are listed in Table 3.

Grades of recommendation

Whilst educational meetings were found to have a significant positive effect on therapists’ adherence to guidelines and knowledge increase, no patient-related outcomes were measured, and no significant changes were reported in clinical practice or cost effectiveness. The overall NHMRC grade of recommendation was B, suggesting that the recommendation can be trusted to guide practice in most situations.

We found no significant effect of local opinion leaders on professional or process outcomes; however, no patient outcomes were explored for this strategy. The overall NHMRC grade of recommendation was C, suggesting that the body of evidence provided some support for the recommendation(s) but care should be taken in its application.

For patient mediated interventions, the review found significant effects on cost effectiveness and a significant increase in patient referral to falls services. However, none of the patient outcomes (patient safety, self-reported falls, health-related quality of life and patient satisfaction) differed significantly from the control group. The overall NHMRC grade of recommendation was C.

The body of evidence related to multi-faceted intervention strategies provided the highest grade of recommendation (A), suggesting that this recommendation can be trusted to guide practice. The review found that multi-faceted component studies improved guideline adherence significantly in two studies [ 25 , 27 ] and knowledge in one study [ 27 ].

When the evidence implementation interventions are considered in terms of clusters, the most commonly used cluster of implementation strategies involved training and educating stakeholders. When used in isolation, this cluster was the least effective. When interventions spanned a range of clusters, the effectiveness of the implementation strategies appeared stronger. The strongest evidence of effectiveness came from the implementation of interventions that spanned the clusters of training and educating stakeholders, adapting and tailoring the context and supporting clinicians [ 27 ].

Risk of bias within studies

The six studies included in this review were of sound methodologic quality, with SIGN scores ranging from average to high quality (AQ or HQ) [ 19 ] (see additional files). All studies had a clear purpose, relevant background, and justification for conducting the study. Randomization was not clearly described in one study. In two studies, the treatment and control groups were not described at the start of the trial and no adequate concealment method was applied. All studies had adequate blinding, and the only difference between groups was the treatment under investigation. All studies but one described the dropout rate. Intention to treat analysis was executed for only three studies.

Summary of changes from the study protocol

During the review process the following items were changed from the study protocol. Due to the nature of the evidence found, we decided to include only level I (systematic reviews) and II (RCT) studies in this paper. Only quantitative studies were considered, and implementation strategies were specified using the EPOC framework. The SIGN checklist and NHMRC grading framework were used to categorise the risk of bias and synthesize the results, respectively.

This is the most recent review exploring the effectiveness of implementation strategies in allied healthcare. Six studies related to allied health were found but only among physiotherapists, speech pathologists and paramedics. Strategies evaluated were educational meetings, use of local opinion leaders, patient mediated interventions and a combination of different strategies forming multi-faceted interventions. Most strategies were evaluated against professional and process outcomes and only half were evaluated against patient or health outcomes. Multi-faceted strategies appear to remain the most effective in improving knowledge and adherence to guidelines and evidence (professional outcomes) but none of the strategies were found to improve patient outcomes.

Although more than 20 years have passed since the importance of evidence-based practice in quality health care was recognised, this review could identify only six studies that explored the effectiveness of implementation strategies for promoting evidence-informed interventions in allied health. It was important to limit the search to allied health, as profession-related differences in practice make it unlikely that the evidence associated with medicine would automatically transfer across to allied health. Whilst there has been an exponential growth in published evidence-based research across all allied health disciplines, this has not been matched by published research into how best to implement this in clinical practice. Without effective strategies for implementation of evidence-based recommendations, it is unlikely that evidence-based practice will improve the quality of care, reduce practice variation and/or reduce cost.

The importance of the implementation strategy to the effective use of evidence-based practice has been recognised by numerous authors [ 7 , 8 ]. Without a good understanding of the most effective strategy for implementing evidence-based recommendations in the real world, evidence-based practice becomes purely an academic exercise. Ecological validity depends on the evidence-based practice recommendation being tested in the real world. The current body of evidence related to implementation strategies in allied health is limited to speech-language therapists, paramedics and physiotherapists.

Of the evidence that exists, there is relatively stronger support for the use of intervention strategies that are multi-faceted, including a range of active and passive strategies, rather than uni-faceted strategies such as educational meetings, local opinion leaders and patient mediated interventions. This adds support to the findings of Menon et al. [ 15 ], who found multi-component knowledge transfer interventions enhanced knowledge and practice behaviours in physiotherapists. This is particularly evident when the interventions are explored in terms of clustering, with the strongest evidence of effectiveness coming from strategies that include interventions from a range of clusters [ 27 ].

Across the studies found in this review, the outcomes explored were inconsistent. Guideline adherence and knowledge were the two most common outcomes that were measured, potentially reflecting the relative ease of data capture of these two measures. Of concern when considering the body of evidence is the lack of focus on patient-centred outcomes. If the aim of evidence-based recommendations is to improve the health care of patients, then this should be reflected in the evidence associated with intervention strategies. Patient-reported outcome measures would appear to be an important indicator of the effectiveness of an intervention strategy in improving patient-centred care.

Whilst multifaceted interventions demonstrated the greatest effect on improving guideline adherence and knowledge the lack of changes in clinical patient outcomes is a concern. It is difficult to demonstrate cost effectiveness of an intervention if there are not measurable changes in patient outcomes. There remains limited evidence, from the findings of this review, that interventions based on training and educating stakeholders, adapting and tailoring the context and supporting clinicians change patient outcomes. This suggests either that the patient-related outcome measures were not sensitive enough or that different intervention strategies are needed to change patient outcomes. More research is needed in this area.

Due to the strict inclusion criteria of including only allied health therapy disciplines, only a few studies were found. Whilst this may be perceived as a limitation of the current review, it also ensures that the review's findings are relevant to the allied health discipline and reinforces the continued limited evidence base available in evaluating implementation strategies in allied health. This review is also limited by its focus on publications in the English language only.


The current limited evidence base in allied health suggests that multifaceted interventions, including the use of opinion leaders, follow-up education, educational meetings (workshops), audits and feedback and reminders, appear to be the most effective in implementing evidence-based recommendations. Therefore, when considering the use of evidence-informed interventions in allied health, an implementation strategy that incorporates these should be developed. Whilst evidence for knowledge uptake, guideline adherence and increased referrals exists, there remains little consideration for patient or health-related outcomes.




Authors’ contributions.

KG, JD and SM contributed equally to the conception of the work, the acquisition, analysis, and interpretation of the data and the writing of the paper. All authors have read and approved the manuscript.

This research was funded by the National Institute for Health and Disability Insurance, Belgium and by KU Leuven, young researchers’ careers scholarship. This funding covered the living expenses of the first author (KG).

Availability of data and materials

Ethics approval and consent to participate, consent for publication, competing interests.

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


There is unequivocal evidence that Earth is warming at an unprecedented rate. Human activity is the principal cause.


  • While Earth’s climate has changed throughout its history, the current warming is happening at a rate not seen in the past 10,000 years.
  • According to the Intergovernmental Panel on Climate Change ( IPCC ), "Since systematic scientific assessments began in the 1970s, the influence of human activity on the warming of the climate system has evolved from theory to established fact." 1
  • Scientific information taken from natural sources (such as ice cores, rocks, and tree rings) and from modern equipment (like satellites and instruments) all show the signs of a changing climate.
  • From global temperature rise to melting ice sheets, the evidence of a warming planet abounds.

The rate of change since the mid-20th century is unprecedented over millennia.

Earth's climate has changed throughout history. Just in the last 800,000 years, there have been eight cycles of ice ages and warmer periods, with the end of the last ice age about 11,700 years ago marking the beginning of the modern climate era — and of human civilization. Most of these climate changes are attributed to very small variations in Earth’s orbit that change the amount of solar energy our planet receives.


The current warming trend is different because it is clearly the result of human activities since the mid-1800s, and is proceeding at a rate not seen over many recent millennia. 1 It is undeniable that human activities have produced the atmospheric gases that have trapped more of the Sun’s energy in the Earth system. This extra energy has warmed the atmosphere, ocean, and land, and widespread and rapid changes in the atmosphere, ocean, cryosphere, and biosphere have occurred.

Earth-orbiting satellites and new technologies have helped scientists see the big picture, collecting many different types of information about our planet and its climate all over the world. These data, collected over many years, reveal the signs and patterns of a changing climate.

Scientists demonstrated the heat-trapping nature of carbon dioxide and other gases in the mid-19th century. 2 Many of the science instruments NASA uses to study our climate focus on how these gases affect the movement of infrared radiation through the atmosphere. From the measured impacts of increases in these gases, there is no question that increased greenhouse gas levels warm Earth in response.

Scientific evidence for warming of the climate system is unequivocal.


Intergovernmental Panel on Climate Change

Ice cores drawn from Greenland, Antarctica, and tropical mountain glaciers show that Earth’s climate responds to changes in greenhouse gas levels. Ancient evidence can also be found in tree rings, ocean sediments, coral reefs, and layers of sedimentary rocks. This ancient, or paleoclimate, evidence reveals that current warming is occurring roughly 10 times faster than the average rate of warming after an ice age. Carbon dioxide from human activities is increasing about 250 times faster than it did from natural sources after the last Ice Age. 3

The Evidence for Rapid Climate Change Is Compelling:

Sunlight over a desert-like landscape.

Global Temperature Is Rising

The planet's average surface temperature has risen about 2 degrees Fahrenheit (1 degree Celsius) since the late 19th century, a change driven largely by increased carbon dioxide emissions into the atmosphere and other human activities. 4 Most of the warming occurred in the past 40 years, with the seven most recent years being the warmest. The years 2016 and 2020 are tied for the warmest year on record. 5 Image credit: Ashwin Kumar, Creative Commons Attribution-Share Alike 2.0 Generic.

Colonies of “blade fire coral” that have lost their symbiotic algae, or “bleached,” on a reef off of Islamorada, Florida.

The Ocean Is Getting Warmer

The ocean has absorbed much of this increased heat, with the top 100 meters (about 328 feet) of ocean showing warming of 0.67 degrees Fahrenheit (0.33 degrees Celsius) since 1969. 6 Earth stores 90% of the extra energy in the ocean. Image credit: Kelsey Roberts/USGS

Aerial view of ice sheets.

The Ice Sheets Are Shrinking

The Greenland and Antarctic ice sheets have decreased in mass. Data from NASA's Gravity Recovery and Climate Experiment show Greenland lost an average of 279 billion tons of ice per year between 1993 and 2019, while Antarctica lost about 148 billion tons of ice per year. 7 Image: The Antarctic Peninsula, Credit: NASA

Glacier on a mountain.

Glaciers Are Retreating

Glaciers are retreating almost everywhere around the world — including in the Alps, Himalayas, Andes, Rockies, Alaska, and Africa. 8 Image: Miles Glacier, Alaska Image credit: NASA

Image of snow from plane

Snow Cover Is Decreasing

Satellite observations reveal that the amount of spring snow cover in the Northern Hemisphere has decreased over the past five decades and the snow is melting earlier. 9 Image credit: NASA/JPL-Caltech

Norfolk flooding

Sea Level Is Rising

Global sea level rose about 8 inches (20 centimeters) in the last century. The rate in the last two decades, however, is nearly double that of the last century and accelerating slightly every year. 10 Image credit: U.S. Army Corps of Engineers Norfolk District

Arctic sea ice.

Arctic Sea Ice Is Declining

Both the extent and thickness of Arctic sea ice have declined rapidly over the last several decades. 11 Credit: NASA's Scientific Visualization Studio

Flooding in a European city.

Extreme Events Are Increasing in Frequency

The number of record high temperature events in the United States has been increasing, while the number of record low temperature events has been decreasing, since 1950. The U.S. has also witnessed increasing numbers of intense rainfall events. 12 Image credit: Régine Fabri,  CC BY-SA 4.0 , via Wikimedia Commons

Unhealthy coral.

Ocean Acidification Is Increasing

Since the beginning of the Industrial Revolution, the acidity of surface ocean waters has increased by about 30%. 13 , 14 This increase is due to humans emitting more carbon dioxide into the atmosphere and hence more being absorbed into the ocean. The ocean has absorbed between 20% and 30% of total anthropogenic carbon dioxide emissions in recent decades (7.2 to 10.8 billion metric tons per year). 15 , 16 Image credit: NOAA
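The figures in the paragraph above can be unpacked with a little arithmetic. The sketch below is illustrative only. It assumes that "acidity" refers to hydrogen-ion concentration, so that the standard definition pH = -log10[H+] applies, and it treats the quoted percentages and tonnages as exact; all variable names are chosen here for illustration.

```python
import math

# "About 30%" increase in acidity, read here as hydrogen-ion concentration
# (an assumption; the text does not define "acidity").
acidity_increase = 0.30
delta_ph = -math.log10(1 + acidity_increase)
print(f"Implied change in surface-ocean pH: {delta_ph:.2f}")

# The text pairs 20% with 7.2 and 30% with 10.8 billion metric tons per year.
# Dividing each uptake by its share gives the total emissions it implies.
low_share, high_share = 0.20, 0.30
low_uptake, high_uptake = 7.2, 10.8  # billion metric tons of CO2 per year
implied_total_low = low_uptake / low_share
implied_total_high = high_uptake / high_share
print(f"Implied total emissions: {implied_total_low:.1f} and "
      f"{implied_total_high:.1f} billion metric tons per year")
```

Under these assumptions a 30% rise in acidity corresponds to a pH drop of roughly 0.11 units, and both ends of the quoted range imply the same total of about 36 billion metric tons of carbon dioxide per year.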

1. IPCC Sixth Assessment Report, WGI, Technical Summary. B.D. Santer, “A search for human influences on the thermal structure of the atmosphere.” Nature 382 (04 July 1996): 39-46. Gabriele C. Hegerl et al., “Detecting Greenhouse-Gas-Induced Climate Change with an Optimal Fingerprint Method.” Journal of Climate 9 (October 1996): 2281-2306. V. Ramaswamy, et al., “Anthropogenic and Natural Influences in the Evolution of Lower Stratospheric Cooling.” Science 311 (24 February 2006): 1138-1141. B.D. Santer et al., “Contributions of Anthropogenic and Natural Forcing to Recent Tropopause Height Changes.” Science 301 (25 July 2003): 479-483. T. Westerhold et al., "An astronomically dated record of Earth’s climate and its predictability over the last 66 million years." Science 369 (11 Sept. 2020): 1383-1387.

2. In 1824, Joseph Fourier calculated that an Earth-sized planet, at our distance from the Sun, ought to be much colder. He suggested something in the atmosphere must be acting like an insulating blanket. In 1856, Eunice Foote discovered that blanket, showing that carbon dioxide and water vapor in Earth's atmosphere trap escaping infrared (heat) radiation. In the 1860s, physicist John Tyndall recognized Earth's natural greenhouse effect and suggested that slight changes in the atmospheric composition could bring about climatic variations. In 1896, a seminal paper by Swedish scientist Svante Arrhenius first predicted that changes in atmospheric carbon dioxide levels could substantially alter the surface temperature through the greenhouse effect. In 1938, Guy Callendar connected carbon dioxide increases in Earth’s atmosphere to global warming. In 1941, Milutin Milankovic linked ice ages to Earth’s orbital characteristics. Gilbert Plass formulated the Carbon Dioxide Theory of Climate Change in 1956.

3. IPCC Sixth Assessment Report, WG1, Chapter 2. Vostok ice core data; NOAA Mauna Loa CO2 record. O. Gaffney, W. Steffen, "The Anthropocene Equation." The Anthropocene Review 4, issue 1 (April 2017): 53-61.



6. S. Levitus, J. Antonov, T. Boyer, O. Baranova, H. Garcia, R. Locarnini, A. Mishonov, J. Reagan, D. Seidov, E. Yarosh, M. Zweng, "NCEI ocean heat content, temperature anomalies, salinity anomalies, thermosteric sea level anomalies, halosteric sea level anomalies, and total steric sea level anomalies from 1955 to present calculated from in situ oceanographic subsurface profile data (NCEI Accession 0164586)," Version 4.4. (2017) NOAA National Centers for Environmental Information. K. von Schuckmann, L. Cheng, D. Palmer, J. Hansen, C. Tassone, V. Aich, S. Adusumilli, H. Beltrami, T. Boyer, F. Cuesta-Valero, D. Desbruyeres, C. Domingues, A. Garcia-Garcia, P. Gentine, J. Gilson, M. Gorfer, L. Haimberger, M. Ishii, G. Johnson, R. Killick, B. King, G. Kirchengast, N. Kolodziejczyk, J. Lyman, B. Marzeion, M. Mayer, M. Monier, D. Monselesan, S. Purkey, D. Roemmich, A. Schweiger, S. Seneviratne, A. Shepherd, D. Slater, A. Steiner, F. Straneo, M.L. Timmermans, S. Wijffels, "Heat stored in the Earth system: where does the energy go?" Earth System Science Data 12, Issue 3 (07 September 2020): 2013-2041.

7. I. Velicogna, Yara Mohajerani, A. Geruo, F. Landerer, J. Mouginot, B. Noel, E. Rignot, T. Sutterly, M. van den Broeke, M. Wessem, D. Wiese, "Continuity of Ice Sheet Mass Loss in Greenland and Antarctica From the GRACE and GRACE Follow-On Missions." Geophysical Research Letters 47, Issue 8 (28 April 2020): e2020GL087291.

8. National Snow and Ice Data Center. World Glacier Monitoring Service.

9. National Snow and Ice Data Center. D.A. Robinson, D. K. Hall, and T. L. Mote, "MEaSUREs Northern Hemisphere Terrestrial Snow Cover Extent Daily 25km EASE-Grid 2.0, Version 1" (2017). Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. Rutgers University Global Snow Lab. Data History.

10. R.S. Nerem, B.D. Beckley, J. T. Fasullo, B.D. Hamlington, D. Masters, and G.T. Mitchum, "Climate-change–driven accelerated sea-level rise detected in the altimeter era." PNAS 115, no. 9 (12 Feb. 2018): 2022-2025.

11. Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS, Zhang and Rothrock, 2003)

12. USGCRP, 2017: Climate Science Special Report: Fourth National Climate Assessment, Volume I [Wuebbles, D.J., D.W. Fahey, K.A. Hibbard, D.J. Dokken, B.C. Stewart, and T.K. Maycock (eds.)]. U.S. Global Change Research Program, Washington, DC, USA, 470 pp.



15. C.L. Sabine, et al., “The Oceanic Sink for Anthropogenic CO2.” Science 305 (16 July 2004): 367-371.

16. Special Report on the Ocean and Cryosphere in a Changing Climate , Technical Summary, Chapter TS.5, Changing Ocean, Marine Ecosystems, and Dependent Communities, Section

Header image shows clouds imitating mountains as the sun sets after midnight as seen from Denali's backcountry Unit 13 on June 14, 2019. Credit: NPS/Emily Mesner Image credit in list of evidence: Ashwin Kumar, Creative Commons Attribution-Share Alike 2.0 Generic.


Cross-cultural research reveals universal bias towards simple rhythmic ratios in music

A comprehensive study spearheaded by researchers from the Massachusetts Institute of Technology and the Max Planck Institute for Empirical Aesthetics provides evidence that people tend to show a predisposition towards rhythms formed by simple integer ratios regardless of cultural background. Despite these universal tendencies, the study revealed significant variations in rhythm preferences across different societies, illuminating the nuanced factors that shape musical cognition.

The findings were published in Nature Human Behaviour .

The pursuit of this research stems from a curiosity about the universality of music cognition. Across the globe, music forms an integral part of human life, yet its manifestation is as varied as the cultures that create it. Previous studies, often focused on Western societies, hinted at a mental bias towards rhythms that can be neatly divided into equal parts, like the steady beat of a heart or the ticking of a clock.

But is this a universal trait, or are our musical minds molded by the melodies and rhythms that surround us from birth? The researchers conducted this study to investigate these questions, seeking to untangle the inherent from the acquired in music cognition.

This large-scale study was carried out among 39 participant groups spanning 15 countries, encompassing both urban societies and Indigenous populations. This diverse sample allowed the researchers to explore the universality and cultural specificity of music cognition, particularly regarding rhythm.

“This is really the first study of its kind in the sense that we did the same experiment in all these different places, with people who are on the ground in those locations. That hasn’t really been done before at anything close to this scale, and it gave us an opportunity to see the degree of variation that might exist around the world,” explained senior author Josh McDermott, an associate professor of brain and cognitive sciences at MIT.

To conduct their study, the researchers utilized a method reminiscent of the game of “telephone,” where a message is whispered from one person to the next, often leading to alterations of the original message. Participants were initially presented with a random “seed” rhythm through headphones. This rhythm consisted of a repeating cycle of three clicks, separated by time intervals that, when combined, totaled two seconds. Participants were asked to reproduce this rhythm by tapping along to it, a task designed to mimic how one might naturally attempt to replicate a rhythm heard in music.

Following the initial reproduction, the participant’s version of the rhythm was then used as the new stimulus for the next iteration of reproduction. This process was iterated several times, allowing the researchers to observe how the reproduced rhythms evolved over successive iterations. The hypothesis was that the participants’ reproductions would gradually converge towards certain preferred rhythms due to their internal biases or “priors” towards specific rhythmic structures. This iterative process effectively magnified the participants’ biases, making them easier to identify and quantify.

“The initial stimulus pattern is random, but at each iteration the pattern is pushed by the listener’s biases, such that it tends to converge to a particular point in the space of possible rhythms,” McDermott explained. “That can give you a picture of what we call the prior, which is the set of internal implicit expectations for rhythms that people have in their heads.”
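The iterated-reproduction idea can be illustrated with a toy simulation. The sketch below is not the authors' procedure or analysis. It assumes a hypothetical listener whose "prior" is a short list of integer-ratio rhythms, and it models each reproduction as a partial shift toward the nearest such rhythm plus a little random timing noise. The names `PRIOR_RATIOS`, `pull` and `noise`, and their values, are invented for illustration.

```python
import random

CYCLE_SECONDS = 2.0  # the three intervals sum to two seconds, as in the study

# Hypothetical "prior": a few simple integer-ratio rhythms (illustrative only).
PRIOR_RATIOS = [(1, 1, 1), (1, 1, 2), (2, 2, 3), (3, 3, 2)]


def normalise(intervals):
    """Rescale three intervals so they sum to CYCLE_SECONDS."""
    total = sum(intervals)
    return [CYCLE_SECONDS * x / total for x in intervals]


def nearest_prior(rhythm):
    """Return the prior rhythm closest (squared distance) to the given rhythm."""
    candidates = [normalise(r) for r in PRIOR_RATIOS]
    return min(candidates, key=lambda c: sum((a - b) ** 2 for a, b in zip(rhythm, c)))


def reproduce(rhythm, rng, pull=0.3, noise=0.01):
    """One simulated tap-back: drift toward the nearest prior, plus timing noise."""
    target = nearest_prior(rhythm)
    tapped = [a + pull * (b - a) + rng.gauss(0.0, noise) for a, b in zip(rhythm, target)]
    return normalise([max(t, 0.05) for t in tapped])  # keep intervals positive


def iterate(seed, steps=5, rng=None):
    """Feed each reproduction back in as the next stimulus."""
    rng = rng or random.Random(0)
    chain = [normalise(seed)]
    for _ in range(steps):
        chain.append(reproduce(chain[-1], rng))
    return chain


rng = random.Random(42)
seed = [rng.uniform(0.3, 1.0) for _ in range(3)]  # random "seed" rhythm
for step, rhythm in enumerate(iterate(seed, steps=5, rng=rng)):
    print(step, [round(x, 3) for x in rhythm])
```

In this toy model the chain drifts toward one of the listed rhythms over successive iterations, which mirrors how repeated reproduction makes a listener's biases easier to see. In the real experiment the prior is unknown and is inferred from where participants' reproductions end up.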

Across all participant groups spanning 15 countries, there was a clear inclination towards rhythms composed of simple integer ratios, such as evenly spaced beats forming a 1:1:1 ratio. This finding suggests a commonality in human music cognition — a universal bias toward perceiving and enjoying rhythms that are mathematically simple.

However, the study also highlighted the significant variation in these rhythmic preferences across different cultures. While all groups demonstrated a bias towards simple integer ratios, the specific ratios that were preferred varied greatly, reflecting the diversity of local musical practices.

Some cultures showed a particular affinity for rhythms that are prevalent in their musical traditions, indicating that while there may be a universal foundation for rhythm perception, cultural influences play a crucial role in shaping individual and collective musical preferences.

For example, the 2:2:3 rhythm was notably prominent among traditional musicians in Turkey, Botswana, and Bulgaria, reflecting its importance in their local music. Similarly, the 3:3:2 rhythm, prevalent in African and Afro-diasporic music, including sub-Saharan styles and Afro-Cuban and Latin music, was strongly represented in the musical cognition of dancers from the Sagele village in Mali and musicians and dancers from other African and Afro-diaspora traditions.
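To make the ratio notation concrete, the following sketch converts an integer ratio into interval durations. It assumes the same two-second, three-interval cycle described for the seed rhythms; the helper name `ratio_to_intervals` is hypothetical.

```python
from fractions import Fraction


def ratio_to_intervals(ratio, cycle_seconds=2):
    """Split a cycle of cycle_seconds into intervals proportional to ratio."""
    total = sum(ratio)
    return [Fraction(cycle_seconds * part, total) for part in ratio]


for ratio in [(1, 1, 1), (2, 2, 3), (3, 3, 2)]:
    intervals = ratio_to_intervals(ratio)
    print(ratio, [f"{float(x):.3f} s" for x in intervals])
```

For a 2:2:3 rhythm, for example, the cycle divides into sevenths, giving two intervals of 4/7 of a second followed by one of 6/7 of a second.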

“Our study provides the clearest evidence yet for some degree of universality in music perception and cognition, in the sense that every single group of participants that was tested exhibits biases for integer ratios. It also provides a glimpse of the variation that can occur across cultures, which can be quite substantial,” explained Nori Jacoby, the study’s lead author and a former MIT postdoc, who is now a research group leader at the Max Planck Institute for Empirical Aesthetics.

The study also delved into the question of whether these rhythmic biases are influenced by musicianship or a more passive exposure to music. Interestingly, the results indicated that the presence of discrete rhythm categories was not necessarily tied to one’s active musical training or expertise.

Instead, the broad exposure to particular types of music, regardless of active participation in music-making, seemed to be the key factor in shaping these perceptual biases. This finding challenges the notion that only trained musicians develop sophisticated rhythmic perceptions, suggesting instead that passive listening experiences can also significantly influence our internal representations of rhythm.

Another insight from this study is the observation that participants from traditional societies displayed rhythmic biases significantly different from those observed in college students and online participants from the same countries. This discrepancy underscores the profound impact of cultural and environmental factors on cognitive processes related to music.

The findings raise important considerations for psychological and cognitive neuroscience research, which has long been critiqued for its overreliance on WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations. This study provides concrete evidence that this reliance can lead to an underrepresentation of the vast diversity of human cognitive experiences.

“What’s very clear from the paper is that if you just look at the results from undergraduate students around the world, you vastly underestimate the diversity that you see otherwise,” Jacoby explained. “And the same was true of experiments where we tested groups of people online in Brazil and India, because you’re dealing with people who have internet access and presumably have more exposure to Western music.”

Despite the clear patterns that emerged, the study acknowledges its limitations and the potential avenues for future research. The scope of rhythms explored was limited to simple, periodic three-interval rhythms, leaving questions about more complex or extended rhythmic structures. Moreover, while the study provides strong evidence of culture-specific influences on rhythm perception, it also underscores the need for further investigation into how other factors, such as language or environmental sounds, might interplay with musical rhythm cognition.

The study, “Commonality and variation in mental representations of music revealed by a cross-cultural comparison of rhythm priors in 15 countries,” was authored by Nori Jacoby, Rainer Polak, Jessica A. Grahn, Daniel J. Cameron, Kyung Myun Lee, Ricardo Godoy, Eduardo A. Undurraga, Tomás Huanca, Timon Thalwitzer, Noumouké Doumbia, Daniel Goldberg, Elizabeth H. Margulis, Patrick C. M. Wong, Luis Jure, Martín Rocamora, Shinya Fujii, Patrick E. Savage, Jun Ajimi, Rei Konno, Sho Oishi, Kelly Jakubowski, Andre Holzapfel, Esra Mungan, Ece Kaya, Preeti Rao, Mattur A. Rohit, Suvarna Alladi, Bronwyn Tarr, Manuel Anglada-Tort, Peter M. C. Harrison, Malinda J. McPherson, Sophie Dolan, Alex Durango, and Josh H. McDermott.

(Photo credit: OpenAI's DALL·E)

Retraction Note: Testing the impact of sustainable environmental regulations on firm performance with mediating effect of product market competition: empirical evidence from Turkey

  • Retraction Note
  • Published: 28 March 2024

  • Ümit Çevik &
  • Tahir Yeşilada

The Original Article was published on 01 September 2022

Retraction Note: Environmental Science and Pollution Research (2022) 30:8048-8061

The Publisher has retracted this article in agreement with the Editor-in-Chief. An investigation by the publisher found a number of articles, including this one, with a number of concerns, including but not limited to compromised peer review process, inappropriate or irrelevant references, containing nonstandard phrases or not being in scope of the journal. Based on the investigation's findings, the publisher, in consultation with the Editor-in-Chief, therefore no longer has confidence in the results and conclusions of this article.

The authors have not responded to correspondence regarding this retraction.

Author information

Authors and affiliations.

Department of Business Administration, Faculty of Economics and Administrative Sciences, European University of Lefke, TR-10 Mersin, Lefke, Northern Cyprus, Turkey, 99010

Ümit Çevik & Tahir Yeşilada

Corresponding author

Correspondence to Ümit Çevik .

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Çevik, Ü., Yeşilada, T. Retraction Note: Testing the impact of sustainable environmental regulations on firm performance with mediating effect of product market competition: empirical evidence from Turkey. Environ Sci Pollut Res (2024).



  1. 15 Empirical Evidence Examples (2024)

  2. Empirical Research: Definition, Methods, Types and Examples

  3. What Is Empirical Research? Definition, Types & Samples in 2024

  4. Empirical Research: Definition, Methods, Types and Examples

  5. What Is Empirical Research? Definition, Types & Samples

  6. Empirical Evidence


  1. Unveiling Truths: Where Empirical Evidence Resonates

  2. Research Methods

  3. Professors Apply Data And Empirical Evidence; You Don’t- Protozoa Slams Frimpong Boateng

  4. Understanding Evidence in Academic Writing

  5. Using Empirical Evidence to Guide Drug Policy Insights from Carl Hart, Columbia University

  6. TPDS: Early Warning Signals


  1. Empirical evidence

    Empirical evidence, information gathered directly or indirectly through observation or experimentation that may be used to confirm or disconfirm a scientific theory or to help justify, or establish as reasonable, a person's belief in a given proposition. ... It can be obtained by methods such as experiments, surveys, correlational research ...

  2. What Is Empirical Research? Definition, Types & Samples in 2024

    Empirical research is defined as any study whose conclusions are exclusively derived from concrete, verifiable evidence. The term empirical basically means that it is guided by scientific experimentation and/or evidence. Likewise, a study is empirical when it uses real-world evidence in investigating its assertions.

  3. Empirical research

    A scientist gathering data for her research. Empirical research is research using empirical evidence. It is also a way of gaining knowledge by means of direct and indirect observation or experience. Empiricism values some research more than other kinds. Empirical evidence (the record of one's direct observations or experiences) can be analyzed ...

  4. Empirical Research: Definition, Methods, Types and Examples

    Empirical research is defined as any research where the conclusions of the study are strictly drawn from concrete empirical evidence, and therefore "verifiable" evidence. This empirical evidence can be gathered using quantitative market research and qualitative market research methods. For example: A research is being conducted to find out if ...

  5. Empirical evidence

    Empirical evidence for a proposition is evidence, i.e. what supports or counters this proposition, that is constituted by or accessible to sense experience or experimental procedure. Empirical evidence is of central importance to the sciences and plays a role in various other fields, like epistemology and law. There is no general agreement on how the terms evidence and empirical are to be ...

  6. Empirical Research: Defining, Identifying, & Finding

    The evidence collected during empirical research is often referred to as "data." Characteristics of Empirical Research. Emerald Publishing's guide to conducting empirical research identifies a number of common elements to empirical research: A research question, which will determine research objectives.

  7. Empirical evidence: A definition

    Empirical evidence is information acquired by observation or experimentation. Scientists record and analyze this data. The process is a central part of the scientific method, leading to the ...

  8. Empirical Research

    Hence, empirical research is a method of uncovering empirical evidence. Through the process of gathering valid empirical data, scientists from a variety of fields, ranging from the social to the natural sciences, have to carefully design their methods. This helps to ensure quality and accuracy of data collection and treatment.

  9. Empirical Evidence

    Empirical evidence is related to the philosophical distinction between a priori and a posteriori reasoning. A priori reasoning, that is, reasoning without (or 'prior' to) evidence or experience, is the sort of reasoning commonly used by logicians, philosophers, and mathematicians. A posteriori reasoning is based on observation and empirical evidence.

  10. Empirical Research

    Strategies for Empirical Research in Writing is a particularly accessible approach to both qualitative and quantitative empirical research methods, helping novices appreciate the value of empirical research in writing while easing their fears about the research process. This comprehensive book covers research methods ranging from traditional ...

  11. Empirical Research in the Social Sciences and Education

    Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have 4 components: Introduction : sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous ...

  12. Theory and Observation in Science

    Theory and Observation in Science. First published Tue Jan 6, 2009; substantive revision Mon Jun 14, 2021. Scientists obtain a great deal of the evidence they use by collecting and producing empirical results. Much of the standard philosophical literature on this subject comes from 20th century logical empiricists, their followers, and critics ...

  13. The Levels of Evidence and their role in Evidence-Based Medicine

    Level V evidence: little or no systematic empirical evidence: ... This allows the reader to know the level of evidence of the research, but the designated level of evidence does not always guarantee the quality of the research. It is important that readers not assume that level 1 evidence is always the best choice or appropriate for the research ...

  14. What is Empirical Research? Definition, Methods, Examples

    Empirical research plays a pivotal role in advancing knowledge across various disciplines. Its importance extends to academia, industry, and society as a whole. Here are several reasons why empirical research is essential: Evidence-Based Knowledge: Empirical research provides a solid foundation of evidence-based knowledge. It enables us to test ...

  15. Empirical Research: Quantitative & Qualitative

    Description of the methodology or research design used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys); Two basic research processes or methods in empirical research: quantitative methods and qualitative methods (see the rest of the guide for more about these methods).

  16. Appraising Evidence Claims

    For research evidence to inform decision making, an appraisal needs to be made of whether the claims are justified and whether they are useful to the decisions being made. ... adequate evidence should be provided to justify the results and conclusions. Second, reports of empirical research should be transparent; that is, reporting should make ...

  17. Empirical Research: A Comprehensive Guide for Academics

    Tips for Empirical Writing. In empirical research, the writing is usually done in research papers, articles, or reports. The empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7. Define Your Objectives: When you write about your research, start by making your goals clear.

  18. Empirical Evidence

    Types of Empirical Evidence. The two primary types of empirical evidence are qualitative evidence and quantitative evidence. 1. Qualitative. Qualitative evidence is the type of data that describes non-measurable information. Qualitative data is used in various disciplines, notably in social sciences, as well as in market research and finance.

  19. PDF What Is Empirical Social Research?

    ... based not on ideas or theory but on evidence from the real world. Third, social research involves analysis, meaning the researcher interprets the data and draws conclusions from them. Thus, writing what is typically called a "research paper" does not fit our definition of empirical research because doing ...

  20. Empirical Evidence

    Empirical evidence refers to factual data that is collected by conducting observations and experiments. It is a systematic process; every step is documented. Further, raw data is used to derive meaningful conclusions and findings. An Investigator first states a hypothesis and then accepts or rejects the theory based on analysis and tests.

  21. Empirical Research

    The aim of this research is to generate, analyze, and interpret reliable and valid information or data about the total experience of all members of the society or the group under study. It should be clear that empirical evidence in medical ethics draws from very heterogeneous sources. It contains studies that describe the moral experiences ...

  22. The effectiveness of implementation strategies for promoting evidence

    The PARHIS (Promoting Action on Research Implementation in Health Services) framework described successful translation as a function of the interplay between the research evidence, the context in which translation is happening and the ways in which the process is facilitated. Having one or more people in a facilitatory role, contextualising the ...

  23. What is Empirical Evidence?

    How Gathering Empirical Evidence in Social Science is Different. Testing the effects of, say, a public policy on a group of people puts us in the territory of social science. For instance, education research is not the same as automotive research because children (people) aren't cars (objects).

  24. Evidence

    Ancient evidence can also be found in tree rings, ocean sediments, coral reefs, and layers of sedimentary rocks. This ancient, or paleoclimate, evidence reveals that current warming is occurring roughly 10 times faster than the average rate of warming after an ice age. ... Geophysical Research Letters 47, Issue 8 (28 April 2020): e2020GL087291 ...

  25. Cross-cultural research reveals universal bias towards simple ...

    A comprehensive study spearheaded by researchers from the Massachusetts Institute of Technology and the Max Planck Institute for Empirical Aesthetics provides evidence that people tend to show a ...

  26. National culture and environmental, social, governance controversies: a

    This study aims to fill this research gap by investigating how country-level national culture overall and individual dimensions measured by Hofstede's (1980, 2011) and Hofstede and Hofstede's (2005) framework influence firms' intent to engage in ESG controversies in a cross-country setting consisting of 2,466 firms for the period 2010 ...

  27. Retraction Note: Testing the impact of sustainable ...

    The Publisher has retracted this article in agreement with the Editor-in-Chief. An investigation by the publisher found a number of articles, including this one, with a number of concerns, including but not limited to compromised peer review process, inappropriate or irrelevant references, containing nonstandard phrases or not being in scope of the journal.