1. What is data?


Data becomes information

Data is a collection of raw facts, figures or records that has not yet been interpreted.

Data can take a variety of forms, including numbers, text or people’s opinions. Data can be manipulated, analysed and interpreted to give it meaning. Once data has meaning, it becomes information. Data can be collected from measurement, observation, experience or experiment.

Note: Data is the plural of datum. However, data is commonly used as a singular collective noun. You will see both “These data are …” and “This data is …” in scholarly information. Check with your lecturer or school for the form you are expected to use.

If you observed ducklings around UQ Lakes, what kind of data could you collect?

  • The number of ducklings
  • Their size
  • Where they go
  • What they eat

If you continued to observe the ducklings over time, you could gather more data. If you organise and analyse the data you gather, it could reveal patterns of behaviour and identify potential impacts, such as growth or decline in the number of ducks around the lakes.

2 ducks with many ducklings on a path
Source: UQ Ducks by Flic French, CC BY 2.0

Types of data

  • Primary data is data you have collected yourself.
  • Secondary data is data collected by others, including public datasets.

Data can be described in many different ways. One way to describe data is to use the terms qualitative and quantitative.

Qualitative — data that records qualities e.g. descriptions, concepts and opinions. Qualitative research is often focused on how and why something occurs.

Quantitative — numerical or spatial data that records quantities, measurements or frequencies e.g. size, location or scores.

Qualitative data can be analysed through quantitative approaches, such as statistical analysis, by giving the data a numerical value or ranking. Qualitative data is generally harder to analyse than quantitative data because it is usually unstructured (for example, free text or recordings) and has to be coded or categorised before basic statistical techniques can be applied.
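As a rough illustration of giving qualitative data a numerical value or ranking, the sketch below codes opinion responses on a 1 to 5 scale and then summarises them. The responses and the coding scheme (`LIKERT_SCALE`) are assumptions made up for this example, not part of any particular survey tool.

```python
from statistics import mean, median

# Hypothetical survey responses (qualitative opinions), invented for this example.
responses = ["agree", "strongly agree", "neutral", "agree", "disagree"]

# Assumed coding scheme: each opinion is given a rank from 1 to 5.
LIKERT_SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_responses(answers):
    """Convert text responses to their numeric ranks."""
    return [LIKERT_SCALE[answer.strip().lower()] for answer in answers]


scores = code_responses(responses)
print(scores)          # [4, 5, 3, 4, 2]
print(median(scores))  # 4
print(mean(scores))    # 3.6
```

Because the ranks are ordered categories rather than true measurements, the median is often a safer summary than the mean.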

Metadata

Metadata is data that describes other data. It is useful for finding, sharing and evaluating datasets. The metadata may be automatically generated and contained within the data file, such as with images, or it may be manually created and exist external to the file.

“Metadata provides the descriptive, structural and administrative information necessary to effectively access and utilise digital information objects.”

Source: Alemu, G., & Stevens, B. (2015). An emergent theory of digital library metadata : enrich then filter (1st edition). Chandos Publishing.

Examples of metadata

Documents

Document metadata can show who created a document and when. The available information may include:

  • when it was created
  • when it was modified
  • who owns the document
  • who created the document.
Metadata fields in a document
Screenshot of metadata from a PDF document
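Some metadata is also kept by the file system rather than inside the document itself. The sketch below reads a file's size and last-modified time using Python's standard library. It is a minimal example only: the `describe_file` helper and the sample file are hypothetical, and details such as the author are stored inside the document and are not covered here.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def describe_file(path):
    """Return a small dictionary of file system metadata for a file."""
    info = Path(path).stat()
    return {
        "name": Path(path).name,
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "notes.txt"
    sample.write_text("ducklings: 7", encoding="utf-8")
    metadata = describe_file(sample)

print(metadata["name"], metadata["size_bytes"], metadata["modified"])
```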

Image metadata

Image metadata may show:

  • when the image was taken
  • what device was used to take the image
  • if the device had GPS turned on, where that picture was taken.

In raw image files, the file properties may show where an image was taken. The image metadata provides the latitude and longitude, which you can search for in Google Maps. The Google Maps Help page has more information on searching by latitude and longitude.

Different fields in the image properties
Image metadata may show the location and other data
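Image metadata usually records GPS coordinates as degrees, minutes and seconds plus a hemisphere letter, while map searches are often easier with decimal degrees. The sketch below shows one way to convert between the two. The `dms_to_decimal` function is a hypothetical helper, and the coordinates are illustrative values rather than readings from a real photo.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes and seconds to signed decimal degrees.

    South and west are negative; north and east are positive.
    """
    decimal = degrees + minutes / 60 + seconds / 3600
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return round(decimal, 6)


# Illustrative values only, not taken from a real image.
latitude = dms_to_decimal(27, 30, 0, "S")
longitude = dms_to_decimal(153, 0, 36, "E")

# Latitude comes first when searching a map by coordinates.
print(f"{latitude}, {longitude}")  # -27.5, 153.01
```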

Social media

In social media posts, a hashtag essentially acts as metadata for the post. Adding a hashtag (e.g. #UQ) to a tweet, Instagram post or other post creates a topic or tag for that post, so that someone searching that platform for ‘UQ’ is more likely to find it. The Social Media module has more information on hashtags.
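To show how a hashtag can work as searchable metadata, the sketch below pulls the hashtags out of a few posts and uses them to find posts on a topic. The posts and the helper functions (`extract_hashtags`, `find_posts`) are invented for this example; real platforms index hashtags in their own ways.

```python
import re


def extract_hashtags(post):
    """Return the hashtags in a post, lowercased and without the # sign."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", post)]


def find_posts(posts, topic):
    """Return the posts tagged with the given topic (case-insensitive)."""
    return [post for post in posts if topic.lower() in extract_hashtags(post)]


# Hypothetical posts, invented for this example.
posts = [
    "Ducklings on the path this morning #UQ #ducks",
    "Exam timetable is out",
    "Sunset over the lakes #uq",
]

matches = find_posts(posts, "UQ")
print(len(matches))  # 2
```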

Data discovery

Metadata is useful for data storage as it improves the discovery of files. It can help you avoid situations where you are not sure which file contains the information you are looking for. The quickest and easiest way to start using metadata is through a better file naming strategy.
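One possible file naming strategy is to put the date, project, a short description and a version number into every file name. The sketch below assumes that convention; the `build_filename` helper and the order of the parts are illustrative choices, not a required standard.

```python
import re
from datetime import date


def slugify(text):
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(day, project, description, version, extension):
    """Build a name in the form YYYYMMDD_project_description_vNN.ext."""
    return (
        f"{day:%Y%m%d}_{slugify(project)}_{slugify(description)}"
        f"_v{version:02d}.{extension}"
    )


# Hypothetical project details, invented for this example.
name = build_filename(date(2023, 3, 14), "UQ Lakes", "Duckling counts", 2, "csv")
print(name)  # 20230314_uq-lakes_duckling-counts_v02.csv
```

Putting the date first in year-month-day order means the files sort chronologically when listed by name.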

Metadata and privacy

An individual’s metadata can be used to learn a lot about that person’s life and habits. When the telecommunications data retention scheme was first introduced in Australia, journalists made their own metadata available to the public to show how much information it contained. In 2015, the ABC published a summary of the findings from journalist Will Ockenden’s metadata, “What reporter Will Ockenden’s metadata reveals about his life”.

Michael Hayden, a four-star General and former Director of the NSA and CIA, discussed how powerful an individual’s metadata is at The Johns Hopkins Foreign Affairs Symposium (YouTube, 1h14m) in 2014.

Google Maps and you!

Google Maps has a feature called “Timeline” that shows the locations you have visited. It only contains information if you use location services and Google Maps. Try it out to see how much Google knows about where you have been.

Datasets

Datasets are collections of data. Datasets can be in a wide variety of formats, including:

  • Spreadsheets
  • Text
  • Transcripts from interviews and focus groups
  • Videos
  • Images
  • Code.

The File formats and naming conventions section provides information on the different file formats that you may encounter and the software needed to open the files.

Statistics

Statistics are not the same as data. Statistics are:

  • data that has already been processed and analysed
  • an interpretation or summary of data
  • used to make comparisons
  • often presented in tables, charts, graphs or percentages.
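The difference between data and statistics can be sketched with a small example: the list of weekly counts is the data, and the average and percentage change calculated from it are statistics. The counts below are hypothetical values invented for illustration.

```python
from statistics import mean

# Data: hypothetical weekly duckling counts, invented for this example.
weekly_counts = [12, 11, 9, 9, 7]

# Statistics: summaries calculated from the data.
average = mean(weekly_counts)
first, last = weekly_counts[0], weekly_counts[-1]
percent_change = (last - first) / first * 100

print(f"Average per week: {average:.1f}")                # Average per week: 9.6
print(f"Change over the period: {percent_change:.1f}%")  # Change over the period: -41.7%
```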

Licence

Work with Data and Files Copyright © 2023 by The University of Queensland is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
