7. Evaluating AI-generated content
You should evaluate the quality and reliability of AI-generated content before relying on it, just as you would information from any source. Information provided by generative AI tools may be:
- Incorrect
- Out of date
- Biased or offensive
- Lacking common sense
- Lacking originality
AI tools tend to produce ‘middle-of-the-road’ answers, based on a consensus of the most common information in the AI’s training data. You should continue to think critically as you use the tools for your learning. Ask yourself:
- Is the response you’ve been given too conservative?
- Is there an alternative viewpoint that has been missed?
- What are your views — do you disagree with the information?
Methods for evaluating information
The CRAAP test is a useful framework for evaluating information in general, and it also highlights some of the particular challenges of evaluating AI-generated content.
Information Essentials contains detailed advice on what you should consider when evaluating an information source. Some things to consider include:
- Currency
- Relevance
- Authority
- Accuracy
- Purpose
Challenges
Different generative AI tools have different limitations. It is not always clear how current the information contained in an LLM is. OpenAI’s documentation states that GPT-4 “generally lacks knowledge of events that have occurred after the vast majority of its data cuts off (September 2021)”. The tool can, however, search the web to find more recent information.
LLMs may not always present the sources for their answers, or they may rely on unsuitable sources. This can make it difficult to judge the relevance, authority, accuracy and purpose of the information.
When asked about its own limitations, Microsoft Copilot responded:
I can help with a wide range of topics, but there are some limitations. For example, I don’t have access to:
- Personal data unless shared with me during our conversation.
- Real-time data like live sports scores or stock prices.
- Confidential or proprietary information.
- Certain copyrighted content in full, such as books, articles, or songs.
Source: Answer provided by Microsoft Copilot on 26 November 2024.
Tips for confirming the information provided by AI tools
- Ask the tool to provide you with sources. You can ask for a specific type of source (peer-reviewed journal articles, news articles or academic sources) and add other constraints, such as a time limit, e.g. ‘Can you provide academic sources from the last 5 years?’. Writing your prompt in academic or formal language also increases the chance of receiving those types of sources. There is no guarantee the tool will give you what you ask for, but these techniques improve the odds of a useful result.
- Locate the sources provided and confirm they are real; generative AI tools can present false information as fact and fabricate references (see the sketch after this list for one way to check a reference).
- Once you have confirmed the sources are real, consider their quality and whether they are appropriate for your task.
- Look for other reputable sources that also confirm the information.
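One practical way to confirm that a reference provided by an AI tool is real is to look its DOI up in a bibliographic database. The sketch below queries the public Crossref REST API, which returns a record for registered DOIs and a 404 error otherwise. It is a minimal illustration: the requests library and the example DOI are assumptions, not part of any particular AI tool.

```python
# Check whether a DOI from an AI-generated reference list resolves to a
# real record, using the public Crossref REST API (no API key required).
import requests

def doi_exists(doi: str) -> bool:
    """Return True if Crossref holds a record for this DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Illustrative DOI only; substitute the one the AI tool gave you.
doi = "10.1038/s41586-021-03819-2"
if doi_exists(doi):
    print("Crossref has a record for this DOI. Now check the details match.")
else:
    print("No Crossref record found. The reference may be fabricated.")
```

A match in Crossref only shows that the DOI exists; you still need to check that the title, authors and year match what the tool claimed, because AI tools sometimes attach genuine DOIs to invented citations.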
“Treat the AI like a slightly unreliable friend. Have a chat, ask some questions. Don’t trust the answers though.”
Human in the loop
Evaluating the outputs of AI tools is sometimes referred to as “human-in-the-loop” work. Many AI models are based on predictive modelling and a contextual interpretation of the prompts they are given. Constant feedback from the human in the loop can improve your specific output, and it can also improve the AI tools and models themselves, helping to “enhance the accuracy, reliability, and adaptability of ML systems, harnessing the unique capabilities of both humans and machines” (Source: What is Human-in-the-Loop in AI & ML?).
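As a rough illustration of this feedback pattern, the sketch below wraps a chat model call in a review step: a person reads each draft, and any correction they type is sent back with the next request. It assumes the OpenAI Python SDK with an API key set in the environment; the model name and prompts are placeholders, and the same loop applies to any chat tool.

```python
# A minimal human-in-the-loop sketch: the human, not the model,
# decides when an answer is good enough to accept.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

messages = [{"role": "user",
             "content": "Summarise the CRAAP test in one paragraph."}]

while True:
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    draft = reply.choices[0].message.content
    print(draft)

    feedback = input("Press Enter to accept, or type a correction: ")
    if not feedback:
        break  # the human has verified the output

    # Feed the correction back so the next draft can improve.
    messages.append({"role": "assistant", "content": draft})
    messages.append({"role": "user", "content": feedback})
```

The key design point is that the loop only ends on the human’s verdict; every rejection becomes new context that steers the next attempt.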