What Type of Data Do I Need?

If you've decided that data and statistics can help you understand and report on the topic you're researching, the next step is to consider a few questions that clarify what data your project needs.

Do I need primary data or secondary data?

  • Primary data derive from analysis or research of firsthand sources. You might analyze original texts or artworks, or conduct your own research on the population you are interested in, for example by designing and administering a survey of community members.

  • Secondary data are data collected by others. A large amount of data and statistics is widely and freely available for analysis and reuse. Government agencies are a major source of secondary datasets.

When using secondary data, two other questions should be considered:

Do I need microdata or aggregated data?

  • Microdata are data collected from the individuals within the population or sample investigated, for example, responses to a survey or poll. Often, these data are "deidentified," meaning that information that could help identify the individual who provided a response is suppressed from the data file. These data may also be classified as "restricted-use," meaning that data files are provided only to researchers who agree to comply with specific rules about how and by whom the files can be used. You might use microdata if you want to analyze the individual responses yourself in a statistical software package such as SPSS.

  • Aggregated data are summarized data obtained from the population or sample investigated, for example, the total number of students at an elementary school. Like secondary data, these summarized data are what most of us use when conducting general analyses for school, work, or policy projects.
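
The relationship between the two can be illustrated with a short Python sketch. The records, school names, and field names below are invented for illustration and do not come from any real dataset; the sketch only shows how individual-level microdata might be summarized into the kind of aggregated figures described above.

```python
from collections import Counter
from statistics import mean

# Hypothetical deidentified microdata: one record per survey respondent.
# All values here are made up for illustration.
microdata = [
    {"school": "Elm Street", "grade": 3, "hours_reading": 4.0},
    {"school": "Elm Street", "grade": 4, "hours_reading": 2.5},
    {"school": "Oak Hill", "grade": 3, "hours_reading": 5.0},
    {"school": "Oak Hill", "grade": 3, "hours_reading": 3.0},
    {"school": "Oak Hill", "grade": 5, "hours_reading": 1.0},
]


def aggregate_by_school(records):
    """Summarize individual records into one summary row per school."""
    counts = Counter(r["school"] for r in records)
    summary = {}
    for school, n in counts.items():
        hours = [r["hours_reading"] for r in records if r["school"] == school]
        summary[school] = {"students": n, "mean_hours_reading": mean(hours)}
    return summary


# Aggregated data: totals and averages, with no individual responses left.
aggregated = aggregate_by_school(microdata)
for school, stats in aggregated.items():
    print(school, stats)
```

Note that the step only works in one direction: microdata can be aggregated, but the individual responses cannot be recovered from the aggregated figures.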

Will quantitative or qualitative data/variables best support my research?

Quantitative and qualitative research are typically distinguished by the methods they use. The two approaches have historically been the subject of much debate in the social sciences.

  • Quantitative research generally refers to the collection of numeric data that are analyzed using mathematical techniques.

  • Qualitative research broadly encompasses methods of inquiry that are nonquantitative (e.g., ethnography, participant observation, focus groups). Data suitable for qualitative analysis are available from the Qualitative Data Repository, which is funded by the National Science Foundation and hosted by the Center for Qualitative and Multi-Method Inquiry, a unit of the Maxwell School of Citizenship and Public Affairs at Syracuse University.

In statistics, quantitative and qualitative data don't necessarily correspond to the method used to obtain them. For example, the results of qualitative research often include elements that can be processed numerically. In statistics, the two terms are defined as follows:

  • Quantitative data are always numeric and can be analyzed using mathematical or statistical techniques. For example, consumer spending on personal care products and services constitutes quantitative data. Another example is data coded numerically from the results of focus group interviews.

  • Qualitative data, also known as categorical data, refer to information about the quality of something. Examples of qualitative variables include gender, sexual orientation, race, and ethnicity. If you are interested in exploring gender differences in education, you would want to use qualitative data.
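
As a rough illustration of the statistical distinction, the Python sketch below treats a categorical variable and a numeric variable differently. The respondents, category labels, and the `mean_by_category` helper are hypothetical, chosen only to mirror the gender-and-education example above.

```python
from collections import Counter
from statistics import mean

# Hypothetical records with one qualitative and one quantitative variable.
# All values are invented for illustration.
respondents = [
    {"gender": "female", "years_of_education": 16},
    {"gender": "male", "years_of_education": 12},
    {"gender": "female", "years_of_education": 14},
    {"gender": "nonbinary", "years_of_education": 18},
    {"gender": "male", "years_of_education": 16},
]

# Qualitative (categorical) variable: the natural summary is a count
# of how often each category occurs.
gender_counts = Counter(r["gender"] for r in respondents)

# Quantitative variable: arithmetic on the values is meaningful,
# so an average can be computed directly.
overall_mean = mean(r["years_of_education"] for r in respondents)


def mean_by_category(records, category, value):
    """Group records by a categorical field and average a numeric field."""
    groups = {}
    for r in records:
        groups.setdefault(r[category], []).append(r[value])
    return {label: mean(values) for label, values in groups.items()}


# The categorical variable defines the groups; the numeric variable
# is summarized within each group.
mean_education_by_gender = mean_by_category(
    respondents, "gender", "years_of_education"
)

print(dict(gender_counts))
print(overall_mean)
print(mean_education_by_gender)
```

In this sketch the qualitative variable is what makes a comparison across groups possible, while the quantitative variable supplies the values being compared.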

References

  • Muijs, D. (2011). Introduction to quantitative research. In Muijs, D. Doing quantitative research in education with SPSS (pp. 1-10). Sage Publications Ltd. https://doi.org/10.4135/9781849203241

  • Vogt, W. P. (2005). Dictionary of statistics & methodology. Sage Publications, Inc. https://doi.org/10.4135/9781412983907