AI in the Classroom: What It Can and Can’t Do
There are lots of tasks LLMs can do for us as instructors and researchers, but there are also many tasks they are ill-suited for. LLMs are constantly developing and being updated, so remember that things might change, especially as some LLMs are integrating the ability to search the web. But for now, some general principles hold true about these models: tools like Gemini or GPT, while trained on large amounts of information available on the Internet, aren’t designed to be truth-telling machines. They are designed to produce text that the user is likely to be happy with. Sometimes those two things happen to coincide - but often they don’t! Whether you’re trying ChatGPT, Gemini, Grok, Apple Intelligence, DeepSeek, or Microsoft Copilot, here’s a guide to help you navigate LLM strengths and weaknesses and save you time and frustration in your teaching or research work!
🟢 Green light, go for it
Summaries: LLMs are good at summarizing and paraphrasing text that you give them. This can help when you’re working with papers, articles, and longer texts.
Rewriting: models like ChatGPT or Gemini don’t “know” things, but they can play with language very well. Some things you can try:
Paste a paragraph in your prompt and ask them to change its tone, rephrase it in different terms, or turn it into a list of questions - or, conversely, turn a list of keywords or bullet points into a complete paragraph.
Upload an article, a book PDF, or a class syllabus, and ask the model to generate a study guide, multiple-choice questions, or essay prompts for your class.
Explaining: LLMs can be really helpful in explaining difficult concepts, breaking down complex paragraphs, giving a historical overview, or explaining cultural references. They are also great at picking up what you meant from a badly phrased or half-baked question: this comes in handy when you have a vague notion of something and aren’t sure exactly what to ask, or when you don’t know what you don’t know and aren’t sure where to start. The expression “there are no dumb questions” was never truer than with LLMs. They will explain something to you over and over, no matter what you ask, and their default setting is to remain patient and friendly throughout (unless you’re using a personalized version designed to reply in a snarky, sarcastic tone - but if that’s what helps you learn, give it a try! Here’s a guide for how to create a customized AI agent in ChatGPT).
Design feedback: LLMs can give you helpful suggestions based on a text that you provide. Give them a go if you’re trying to design an assignment, revise a syllabus, or redistribute a list of readings over a different time scale. If you tell them what you’re teaching and at what level, they can also provide useful feedback when you’re trying to assess whether a certain reading or piece of homework is appropriate for that level, or whether your students might be missing something.
Language practice: LLMs “know” lots of languages! They can produce grammatically correct text even in languages with fewer or less accessible Internet sources, and they can “understand” you well even if you make major grammatical or semantic mistakes. If you or your students are trying to learn a new language, here’s a guide on how to use AI to practice your conversation skills!
🟡 Yellow light, proceed with caution
History: if you’re asking questions about historical events, think large scale. ChatGPT or Gemini can give you an informative, correct, and comprehensive summary of a well-known historical event, but be wary of what they tell you about minor historical figures, niche topics, or politically contested events. In those cases, the results suffer both because there are fewer sources about those areas to begin with and because there can be lots of misinformation in the sources LLMs gobbled up.
Translation: LLMs are good translators but not faithful ones. They are really useful if you want to double-check how idiomatic a certain expression is, or if you want them to explain the difference in nuance between two similar expressions. However, think of them as a translator who thinks they know what you need better than you do: sometimes they will arbitrarily change the tone of your original text, skip words, or drop entire phrases when you give them a longer text to translate, and you may not know enough to catch and fix that. They also operate with a short memory, and can fail to translate a specific term consistently or maintain a consistent tone throughout.
Researching sources: it depends on the tool you’re using. If you’re using a general LLM (Gemini, the free version of ChatGPT, Microsoft Copilot, DeepSeek), don’t trust what it tells you if you have specific questions about sources. But there are also tools that let you explicitly provide the LLM with the texts you want to research, limiting the pool of knowledge the LLM draws on when talking to you. If you use Google’s NotebookLM, or a customized AI agent through ChatGPT Plus (that’s the paid version), you can upload your own sources and notes and have conversations in which the LLM restricts its answers to those sources. In that case, you can put more trust in the citations, quotes, and overarching comments you get.
Brainstorming: LLMs can help you explore ideas and research questions, but they may not “know” enough to ask you the right questions to dig deep into a problem, or to tell you what’s missing, what else you should consider, or how to contextualize your problem. For this task, they’re better suited as a conversation starter than as a problem solver.
🔴 Red light, find a different way to do it
Truth: it’s hit or miss! An LLM is a bit like a people-pleasing robot: it might give you a “true” answer, but truth isn’t its primary concern. If the questions you’re asking involve real-world events, historical facts, dates, or quotes, keep a healthy level of suspicion. Sometimes LLMs get it right, and sometimes they blatantly don’t. This isn’t an error - it’s a feature of how they work. As an empirical observation, we have found that the first few replies tend to be the most informative and highest quality, but the further you push with follow-up prompts, the greater the risk that the model tells you what it thinks you want to hear rather than correct information. Also, remember that one of the main features of these models is their ability to change their “mind” in response to your comments. If you argue with them, they will often change their answer, or ask you which “version” of a reply you prefer, regardless of how true or accurate that version is - and regardless of whether you were asking about something that can be “preferred” at all. A preferred truth? A preferred version of history?
Citations and quotes: a widely reported problem with ChatGPT and Gemini is that they tend to make up believable but fake quotes, even when you didn’t ask for a quote at all. They might make up the author, the source, or the content of the quote, and this often happens when you prompt them to comment on a specialized or niche research area. If you need a citation to support your argument, or if you’re trying to double-check whether a quote is accurate, use Google (or any other search tool). Don’t trust LLMs with this - sometimes they will produce good results, but often they won’t, and you will have to do a lot of double-checking anyway.
Research feedback and critical thinking: remember, LLMs don’t “know” things. They’re great at writing, but they don’t know what it is they’re writing. They can’t give you good feedback on the quality of your arguments, identify what you missed, or help you write an in-depth or nuanced analysis of a problem. They can’t help you make good guesses, estimates, or hypotheses in your research, they may not know what is more or less significant in your domain, and they aren’t very helpful at inferring or evaluating knowledge. Tools like ChatGPT and Gemini love to make lists, suggest pros and cons, and present “both sides of an argument” in their responses. This can be very helpful for other tasks, but it produces misleading, superficial, and sometimes plainly incorrect results when it comes to domain-specific knowledge.
Information about people: just use Google, especially if you’re looking for information about a person who isn’t a historical figure, or someone who might be mentioned in Internet sources but doesn’t have a major online presence (personal pages, social media, etc.). You might get correct information, you might not - but most importantly, you might get lulled into a false sense of security after a few correct AI interactions and forget to double-check.