Knowing the various types of data

Data can be classified in several ways. The most basic distinctions are as follows.

  • Primary (basic) and secondary (processed) data: Primary data is collected directly from the survey subjects, for example through statistical surveys. By contrast, secondary data is created by processing primary data.
  • Public and private statistics: Public statistics are produced by the state or local governments and often involve large-scale surveys, characterized by an appropriate sampling method and high reliability. They also cover a wide range of areas, such as the economy, diplomacy, education, and the environment. Private statistics are produced by business groups and private survey companies, and provide detailed data from surveys on specific themes, such as market, usage, and attitude surveys. Examples include a survey on male attitudes towards beauty or a smartphone application usage survey.
  • Exhaustive surveys and sample surveys: The data from an exhaustive survey is derived from all subjects without exception. Large-scale surveys conducted by the state, such as national censuses, are usually exhaustive surveys. Sample surveys are conducted by extracting a certain number of subjects from a larger group. Most private surveys are sample-based, and you need to keep an eye on the sampling method because it can affect the reliability of the data.
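
The difference between an exhaustive survey and a sample survey, and the effect of the sampling method, can be illustrated with a minimal Python sketch. The population below is made up for illustration and is not real survey data; the function names are hypothetical, and the "biased" sample is just one simple way a poorly chosen sample might look.

```python
import random
import statistics

def exhaustive_mean(population):
    """Mean computed from every subject (an exhaustive survey)."""
    return statistics.mean(population)

def sample_mean(population, sample_size, seed=0):
    """Mean estimated from a simple random sample drawn without replacement."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)

def biased_sample_mean(population, sample_size):
    """Mean from a poorly designed sample: only the largest values are surveyed."""
    return statistics.mean(sorted(population)[-sample_size:])

# Hypothetical population: 1,000 made-up values standing in for all subjects.
population = list(range(1, 1001))

print(exhaustive_mean(population))          # 500.5
print(sample_mean(population, 100))         # an estimate; varies with the seed
print(biased_sample_mean(population, 100))  # 950.5
```

The random sample gives an estimate of the exhaustive result, whereas the biased sample is far off however many subjects it contains. This is why the sampling method deserves attention.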

Referring to white papers

Sometimes you cannot find the data you need. In that case, try referring to white papers. White papers are reports that explain the results of government statistical surveys concerning society and the economy. They are filled with figures, tables, and graphs, so they provide clues as to what statistics and survey results are available on your topic of interest. White papers can be found at our Central Library, as well as on ministry and government websites.

When looking for data, you can ask for help at the library’s learning advice counter. You can also find ‘reference literature’ at the library that tells you which data appears on which page of the white papers. ‘Reference literature’ is not meant to be read cover to cover; look up only the parts that you need. You can inquire at the learning advice counter about how to use it.

Using databases

Most of the public statistics from the government can be obtained from databases. You can perform keyword searches, but the large number of hits makes it difficult to pick out what you need. A more efficient approach is to first look up, in a white paper or similar source, the name of the survey that contains the data you need, and then search for that survey.

Most of the data can be obtained in the form of spreadsheet files. The following are representative databases operated in Japan and abroad.

Region | Provider | Database | URL
Japan | Statistics Bureau, MIC | ‘e-Stat’ | https://www.e-stat.go.jp/en
Int’l | UNSD | ‘UNData’ | http://data.un.org/

Making tables and graphs

To get used to referring to data, you should start by recreating the figures, tables, and graphs that appear in the literature yourself. Refer to the data sources cited in the literature to obtain the data from a database. Process the data using spreadsheet software and create the same figures, tables, or graphs as in the literature.
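
The same processing can also be done in a program instead of spreadsheet software. The following is a minimal Python sketch, assuming the data has been downloaded as a CSV file. The column names (`year`, `category`, `value_thousands`) and the figures are hypothetical and are not taken from any actual database.

```python
import csv
import io

# Hypothetical CSV text standing in for a file downloaded from a statistics database.
RAW = """year,category,value_thousands
2015,A,120
2015,B,80
2016,A,130
2016,B,70
"""

def rows_for_year(csv_text, year):
    """Return only the rows that belong to the given survey year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if int(row["year"]) == year]

def shares_percent(rows):
    """Convert raw values into percentage shares, as often shown in graphs."""
    total = sum(int(row["value_thousands"]) for row in rows)
    return {
        row["category"]: round(100 * int(row["value_thousands"]) / total, 1)
        for row in rows
    }

print(shares_percent(rows_for_year(RAW, 2015)))  # {'A': 60.0, 'B': 40.0}
print(shares_percent(rows_for_year(RAW, 2016)))  # {'A': 65.0, 'B': 35.0}
```

Selecting the year explicitly and keeping the unit in the column name makes it easier to notice when your figures and those in the literature refer to different years or units.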

Your graph may differ from the one in the literature. This can happen, for example, because the items you include in the ‘Others’ category differ from those the author included, or because not all items are reflected in the graph. Keep processing the data until you can produce the same graph, carefully comparing the two and paying attention to units and years as well. Through this kind of close inspection, you will come to understand how the author used the data, and you will also be able to check whether anything is incorrect or inappropriate.
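
As an illustration of how the definition of ‘Others’ changes a graph, here is a minimal Python sketch. The categories and values are made up, and `collapse_to_others` is a hypothetical helper, not a function from any particular tool.

```python
def collapse_to_others(values, keep):
    """Keep the `keep` largest categories and sum the rest into 'Others'."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    result = dict(ranked[:keep])
    remainder = sum(value for _, value in ranked[keep:])
    if remainder:
        result["Others"] = remainder
    return result

# Hypothetical shares (in percent) for five categories.
shares = {"A": 40, "B": 25, "C": 15, "D": 12, "E": 8}

print(collapse_to_others(shares, 3))  # {'A': 40, 'B': 25, 'C': 15, 'Others': 20}
print(collapse_to_others(shares, 2))  # {'A': 40, 'B': 25, 'Others': 35}
```

Both results come from the same data, yet ‘Others’ is 20 in one and 35 in the other. If your graph does not match the literature, checking which items were folded into ‘Others’ is a good first step.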

Therefore, when citing figures, tables, or graphs from the literature in your report, you should aim to recreate them from the original data rather than simply pasting them into your report without checking. This will not only give you a deeper understanding of the data but also enhance your data referencing skills.

Paying attention to data reliability

There is a lot of interesting data in private statistics, but most of it is derived from sample surveys. In some surveys, investigators visit respondents in person, with the aim of preventing anyone other than the intended respondent from answering and of preventing misunderstandings of the questions. In other surveys, however, pre-registered respondents answer online via a survey website, where such checks are not possible. This is why you must be careful about data reliability. Use the data with particular caution if a very large proportion of respondents come from certain demographics, such as young people or urban women.
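
If respondent-level data is available, the demographic balance can be checked with a few lines of code. The following Python sketch uses made-up respondents; the attribute name `age_group` and the 50% threshold are assumptions chosen for illustration, not a standard criterion.

```python
from collections import Counter

def group_shares(respondents, attribute):
    """Return each group's share of all respondents for the given attribute."""
    counts = Counter(person[attribute] for person in respondents)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

def dominant_groups(shares, threshold=0.5):
    """List the groups whose share exceeds the (assumed) threshold."""
    return [group for group, share in shares.items() if share > threshold]

# Hypothetical respondents: 7 in their 20s, 2 in their 40s, 1 in their 60s.
respondents = (
    [{"age_group": "20s"}] * 7
    + [{"age_group": "40s"}] * 2
    + [{"age_group": "60s"}] * 1
)

shares = group_shares(respondents, "age_group")
print(shares)                   # {'20s': 0.7, '40s': 0.2, '60s': 0.1}
print(dominant_groups(shares))  # ['20s']
```

A result like this suggests the survey mainly reflects the views of one group, so its findings should not be generalized to the whole population without care.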

Moreover, you should pay attention to who conducted and who funded the survey. For example, a university hospital survey that shows the health-promoting effects of a certain food product might seem reliable at first glance, but you need to be careful if the study was funded by a specific food company.

Issue | Institute of Liberal Arts and Sciences & Center for the Studies of Higher Education
First edition | 2018.3.20
Author | Nakajima, Hidehiro