Introduction to AI Content Detectors

As AI-generated content continues to grow in volume and sophistication, it brings new challenges. For educators, one of the most pressing is distinguishing AI-generated content from human-written work. As AI tools become more widespread, instructors are seeing a growing number of assignments and submissions that may include, or be entirely composed of, AI-generated content. This has driven increasing demand for AI content detectors within higher education.

Neither UCLA nor HumTech specifically endorses these tools, nor should instructors rely on them completely. As noted in "Evaluating AI-generated Student Work," many of these tools are flawed, with high rates of false positives. The goal of this article is to provide an overview of another set of tools available to instructors. If you suspect a student of using generative AI, it is always best to talk with them first.

AI content detectors are software tools that scan and analyze text to determine whether it was generated by an AI writing tool. They can support academic integrity, fair assessment, and genuine learning by helping educators identify content that may have been generated by AI or borrowed without proper citation. These detectors look for patterns indicative of AI generation, such as unusually repetitive terms and phrases, nonsensical sentences or clauses, an overly uniform formal or informal tone, a conspicuous lack of emotional nuance or personalization, and other statistical markers. In doing so, they can help educators maintain a fair and honest academic setting in which students are evaluated on their own effort and original ideas.

Common Approaches for Detecting AI-Generated Content

LLM prompting methods: 

This approach uses an external LLM to classify whether a piece of text is AI-generated. Related tools such as DetectGPT and Fast-DetectGPT also rely on an external LLM, although they score token probabilities rather than prompting the model directly. There have even been attempts to use ChatGPT itself as a detector, asking it to identify content that it may itself have generated. With carefully crafted prompts, this method can achieve moderate accuracy at low cost. However, prompt design significantly affects detection performance, and what constitutes an effective prompt for optimal accuracy is still under exploration. In addition, an adversarial gap remains between the LLM detector and the text generator.
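As a rough illustration of the prompting approach, the sketch below builds a zero-shot classification prompt and normalizes the model's reply. The prompt wording and the `query_llm` call mentioned in the final comment are illustrative assumptions, not the interface of any specific tool.

```python
# Sketch of an LLM-prompting detector (illustrative, not a real tool's prompt).

def build_detection_prompt(text: str) -> str:
    """Wrap the text to classify in a zero-shot detection prompt."""
    return (
        "You are a classifier. Decide whether the following text was "
        "written by a human or generated by an AI language model. "
        "Answer with exactly one word: HUMAN or AI.\n\n"
        f"Text:\n{text}"
    )

def parse_verdict(reply: str) -> str:
    """Normalize the model's free-form reply to 'AI' or 'HUMAN'."""
    first = reply.strip().split()[0].upper().strip(".,!")
    return "AI" if first == "AI" else "HUMAN"

# Usage (pseudocode; `query_llm` is a hypothetical API call):
# verdict = parse_verdict(query_llm(build_detection_prompt(essay)))
```

In practice, the fragility this code glosses over is exactly the problem noted above: small changes to the prompt wording can swing the classifier's accuracy considerably.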

Linguistic and statistical signatures:

This approach detects AI-generated content by examining whether a text contains patterns that are likely to appear in such content. It most closely resembles the way humans manually assess whether a piece of text is AI-generated. Traditional stylometric features, such as function word usage, syntactic complexity, and average phrase length, have long been employed to identify AI-generated content. More recent approaches shift the focus toward computing perplexity or log-likelihood under a reference language model, based on the observation that text generated by large language models often exhibits distinctive and detectable probability patterns.
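To make the perplexity statistic concrete, here is a toy version that scores text under a unigram model estimated from a small reference corpus. Real detectors compute the same quantity with a large language model rather than word counts; this sketch only illustrates how lower perplexity means "more predictable to the reference model."

```python
import math
from collections import Counter

def perplexity(text: str, reference_corpus: str) -> float:
    """Perplexity of `text` under a toy unigram model estimated from
    `reference_corpus`, with add-one (Laplace) smoothing for unseen words.
    Lower values mean the text is more predictable to the model."""
    ref_tokens = reference_corpus.lower().split()
    counts = Counter(ref_tokens)
    vocab = len(counts) + 1            # +1 slot for unseen words
    total = len(ref_tokens)
    log_prob = 0.0
    tokens = text.lower().split()
    for tok in tokens:
        p = (counts[tok] + 1) / (total + vocab)
        log_prob += math.log(p)
    # Perplexity is the exponentiated negative mean log-probability.
    return math.exp(-log_prob / len(tokens))
```

A detector flips this intuition around: if a candidate essay scores *unusually low* perplexity under an LLM, the essay looks suspiciously like something that model would have produced.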

Watermarking for AI text:

When AI models generate content, they can embed a subtle, statistically detectable pattern, known as an AI watermark, in how words or punctuation are chosen. Cooperative watermarking modifies text generation during the token selection process, embedding such patterns in the distribution of words or punctuation. These patterns can later be detected to verify that AI was used to generate the text. Of the approaches described here, checking for a watermark is currently the most effective and accurate. However, it is only viable if major industrial LLMs adopt standardized watermarking methods; it fails on content generated by models without embedded watermarking, and paraphrasing can disrupt the watermark signal.
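One published watermarking scheme (the "green list" approach) partitions the vocabulary at each step using a hash of the previous token; a cooperating generator prefers green-list words, and a detector later counts how many consecutive word pairs land on the green list. The sketch below shows only the detection side, with an arbitrary hash-based partition standing in for a real scheme's keyed hash.

```python
import hashlib
import math

def is_green(prev_token: str, token: str, green_fraction: float = 0.5) -> bool:
    """Deterministically assign `token` to the 'green list' seeded by the
    previous token, mimicking how cooperative watermarking partitions the
    vocabulary at each generation step."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 255.0 < green_fraction

def watermark_z_score(tokens: list[str], green_fraction: float = 0.5) -> float:
    """z-score of the observed green-token count against the count expected
    by chance; large positive values suggest a watermarked generator."""
    n = len(tokens) - 1                # number of (prev, next) pairs
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    expected = green_fraction * n
    std = math.sqrt(n * green_fraction * (1 - green_fraction))
    return (greens - expected) / std
```

The fragility noted above is visible here: a paraphraser that swaps even a few words changes the (previous, next) pairs, shrinking the green count and erasing the statistical signal.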

Instructors now have access to a range of AI content detectors specifically designed to identify AI-generated content in student assignments. Top AI content detection tools like Winston AI, GPTZero, and Copyleaks offer reliable detection and user-friendly interfaces to support educators in promoting honesty in the classroom.

Popular AI Content Detection Tools

GPTZero  

GPTZero is one of the best-known and most widely used AI content detectors, checking whether a text was written by an AI content generation tool such as ChatGPT or Copilot. Its model was trained on both human-written and AI-generated text to learn to distinguish the two. It relies on two statistical measures, perplexity and burstiness, to assess how predictable the language is. The tool was developed by Edward Tian.
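GPTZero's exact formulas are not public, but burstiness is commonly described as the variation in predictability across a text: human writers mix long, complex sentences with short ones, while AI text tends to be more uniform. The sketch below uses sentence-length variation as a simplified stand-in for that idea.

```python
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths (in words), a simplified
    proxy for burstiness. Higher values mean more variation between
    sentences, which is typical of human writing. This is NOT GPTZero's
    actual metric, which is not publicly documented."""
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in normalized.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
```

A detector built on this idea would flag text whose sentences are suspiciously uniform in length and structure, in combination with low perplexity.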

GPTZero is open-source and does not require access to private datasets. It is free for basic functions, such as detecting AI-generated content, but requires payment for advanced features, including more extensive checks, advanced plagiarism detection, and the ability to generate shareable reports. According to its website, the tool correctly classifies human-generated text 99% of the time and AI-generated content 85% of the time. However, it has also been criticized by a Washington Post newsletter for its high false positive rate, which can be especially harmful in academic settings.

Detecting-AI.com 

Detecting-AI.com is a popular online platform for identifying AI-generated content. According to the website, its AI checker covers a wide range of AI models, allowing users to detect content generated by ChatGPT, Gemini, Jasper, Claude, and similar systems. The site advertises a 98% accuracy rate in text detection. The platform is free to use, with a limit of 5,000 characters and 100 detections per day; users can upgrade to a premium plan for longer texts and additional daily detections. Compared to its competitors, Detecting-AI.com offers a more affordable plan, priced at $14 per month or $84 per year.

Winston AI

Winston AI is a widely used tool for detecting AI-generated content created by models such as ChatGPT, Google Gemini, LLaMA, and other well-known LLMs. It is designed for a wide range of contexts, including academia, enterprise, and journalism. Winston AI is marketed as having one of the most accurate and up-to-date detection systems available. The tool is not free: the advertised free account is actually a free trial, limited to 2,000 words over 7 days. Premium plans start at $12/month (80,000 words) or $19/month (200,000 words).

It's important to note that no AI-generated content detection tool is fully accurate or reliable enough to serve as definitive evidence of inappropriate AI use in academic work. These tools can produce both false negatives (failing to detect AI-generated content) and false positives (incorrectly identifying human-written content as AI-generated). For example, in June 2023, Turnitin acknowledged that its AI detection tool had a higher false positive rate than originally claimed. In the current context, AI detection tools should be viewed as aids that help instructors efficiently flag potentially suspicious content; human review and judgment are still essential for making final determinations.