The truth about AI detectors.
How they work, why they disagree, what teachers actually see, and the small habits that make AI writing read as human. An honest, hype-free read.
You can find at least a dozen websites that promise to tell you whether a piece of writing was generated by AI. Some are free. Some charge per check. Some are built into learning management systems and run automatically on every paper a student submits. Read enough discussion threads about them and you'll find two confident, opposite claims: "AI detectors are accurate, you can't fool them" and "AI detectors are useless, they accuse innocent students all the time."
Both claims are wrong in their absolute form. Here's the more useful answer.
What AI detectors are actually looking for
Detection tools don't have a magic signal that says "this was made by AI." What they have is a statistical model that measures two main things in your text: perplexity and burstiness.
Perplexity is a measure of how surprised a language model would be by your next word, on average. Human writers are unpredictable: we use weird vocabulary, we make small mistakes, we throw in idioms that don't quite fit. AI writing is the opposite — by construction, it picks the most plausible next word most of the time. So AI writing tends to have low perplexity, meaning each word is highly predictable from the ones before it.
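To make that concrete, here's a toy Python sketch. The per-token probabilities are invented and no real detector is this simple, but the arithmetic is the standard definition: perplexity is the exponential of the average negative log-probability the model assigned to each word that actually appeared.

```python
import math

def perplexity(token_probs):
    # token_probs: the probability a language model assigned to each word
    # that actually appeared, in order. Lower perplexity = more predictable text.
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Invented numbers, for illustration only.
predictable = [0.60, 0.50, 0.70, 0.55, 0.65]  # every word was a "safe" choice
surprising = [0.60, 0.05, 0.40, 0.02, 0.30]   # a few words the model didn't expect

print(perplexity(predictable))  # ~1.7: low, the kind of score that reads as "AI-like"
print(perplexity(surprising))   # ~6.7: higher, the kind of score that reads as "human-like"
```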
Burstiness is the variation in sentence length and complexity within a passage. Humans naturally write in bursts: a long, flowing sentence, then a short one. Then a sentence fragment. AI writing has very even rhythms: most sentences are roughly the same length, and there are fewer of the abrupt shifts a person would make.
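Burstiness is easier to compute than perplexity, because it only needs the text itself. One simple version, again an illustration rather than any vendor's actual formula, is the spread of sentence lengths relative to their average:

```python
import re
import statistics

def burstiness(text):
    # Coefficient of variation of sentence lengths, in words.
    # Higher = more uneven rhythm. Real detectors use their own (unpublished) variants.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

even = "The report covers sales. The data shows growth. The team met targets."
uneven = ("Sales grew far faster than anyone on the team had predicted back in January. "
          "Why? Nobody is sure.")

print(burstiness(even))    # 0.0: perfectly even rhythm, reads "AI-like"
print(burstiness(uneven))  # ~1.2: uneven rhythm, reads "human-like"
```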
A detector measures both, scores your text, and gives you a probability. That's it. There is no fingerprint, no hidden watermark in standard outputs from the major models. The detector is making an educated guess based on statistics.
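If you want to picture that last step, here is a deliberately fake "detector" that squashes the two numbers into a probability. The weights and midpoints are made up; the point is only that the final score is a statistical combination of signals like these, not a lookup of a hidden fingerprint.

```python
import math

def ai_probability(perplexity, burstiness):
    # Purely illustrative. Real detectors are trained classifiers with many more
    # features, and their weights and thresholds are not public.
    signal = (20.0 - perplexity) + 10 * (0.5 - burstiness)  # invented midpoints
    return 1 / (1 + math.exp(-0.3 * signal))  # low perplexity + low burstiness -> "AI"

print(round(ai_probability(perplexity=8, burstiness=0.2), 2))   # 0.99: "likely AI"
print(round(ai_probability(perplexity=35, burstiness=0.9), 2))  # 0.0: "likely human"
```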
Why detectors disagree
Each detector uses a slightly different statistical model, trained on different examples, weighted differently. Run the same paragraph through five detectors and you'll often get five different scores. One says 92% AI, another says 14%. Neither is "right." They're guessing, and their guesses don't agree.
This is also why the same text can score "human" today and "AI" next month. The detection models are updated frequently, often retrained on newer AI outputs. A passage that fooled the detector in January might fail in June.
The false-positive problem
The most-cited weakness of AI detectors is the rate at which they flag genuinely human writing as AI. Several studies have documented this happening regularly, particularly with:
- Writing by non-native English speakers, whose grammar and vocabulary patterns can look more "predictable."
- Highly polished essays, because careful editing reduces perplexity.
- Writing that follows a structured template: legal writing, scientific abstracts, formal business writing.
In other words, the better and more disciplined your writing, the more likely a detector is to accuse you of using AI. That's the opposite of how it should work, and it's why most major universities have backed away from relying on AI detection scores as definitive evidence of cheating.
What teachers and reviewers actually notice
This is more important than the detection scores. Experienced readers — teachers, editors, hiring managers — develop a feel for AI-written text that no automated tool quite matches. The tells are usually:
- A sense that every paragraph "summarizes itself" — repeating the main idea before moving on, like a textbook.
- Overuse of certain transitions: furthermore, moreover, in conclusion, it is important to note.
- Sentences that are all roughly the same length, paragraph after paragraph.
- A reluctance to take a sharp position — output that "sees both sides" of even straightforward questions.
- Generic examples instead of specific, contextual ones. "Many studies have shown..." with no studies named.
- An absence of personality. Voice that could be anyone.
A skilled reader catches these in seconds, often before opening any detector. Detectors are downstream evidence. The pattern recognition is the upstream signal.
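Some of these tells are simple enough to approximate in code, which is part of why experienced readers spot them so fast. Here's a toy check for stock transitions; the phrase list is ours for illustration, not taken from any real detector:

```python
# Phrases that show up heavily in unedited AI output (illustrative list).
STOCK_TRANSITIONS = [
    "furthermore", "moreover", "in conclusion",
    "in addition", "it is important to note",
]

def transition_density(text):
    # Stock transitions per 100 words. High values suggest unedited AI prose.
    words = len(text.split())
    hits = sum(text.lower().count(p) for p in STOCK_TRANSITIONS)
    return 100 * hits / max(words, 1)

sample = ("Furthermore, the results were positive. Moreover, costs fell. "
          "In conclusion, it is important to note that growth continued.")
print(round(transition_density(sample), 1))  # ~22 per 100 words: a strong tell
```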
How to make AI-assisted writing read as human
If you're going to use AI as a writing assistant, and your context allows it, then making the output sound like you is work that matters. Some habits that help:
Break the rhythm.
After every two or three medium-length sentences, write a short one. Or a sentence fragment. The unevenness is what humans do naturally. AI has to be told to do it.
Use specific, weird examples.
Replace any "many people" with "the woman I sat next to on the bus last Tuesday." Replace "several factors" with three named factors. Replace "studies show" with the name of an actual study. Specificity is the most reliable marker of authorship.
Cut the transition words.
You almost never need "furthermore," "moreover," "in addition," "in conclusion." Start a new paragraph. The reader figures out the rest.
Use contractions.
"It's" instead of "it is." "Won't" instead of "will not." AI sometimes resists these, especially in formal contexts. Adding them back is a small change with a big effect.
Take a position.
Where the AI version says "there are arguments on both sides," say which side you're on. Confidence — earned or stated — reads as human, because it implies a person who has thought about it.
If you're a teacher or evaluator
The least useful thing you can do is run papers through a detector and treat the score as evidence. The most useful things, in our view, are:
- Design assignments that require knowledge AI doesn't have — specific class discussions, in-class examples, recent personal experience.
- Build in process work: drafts, outlines, in-class writing, oral defenses of arguments. AI can do the final paper; it can't do the messy middle convincingly under time pressure.
- When you suspect AI use, have a conversation rather than an accusation. Ask the student to explain a paragraph in their own words. Real authors can do this; AI users sometimes cannot.
The bigger picture
AI detection, as a technology, is in a losing race. The models that generate text get better; the models that detect text play catch-up; the gap widens. Anyone selling you certainty about whether a piece of writing is AI is selling you something they don't actually have.
The deeper question — what we want writing to mean, and what we want students and writers to learn — isn't a detection question. It's an education and culture question, and no software is going to answer it.
More reads: How AI writing tools actually work · Six prompts that make a big difference.
Useful tools: our Text Humanizer rewrites text to make the rhythm changes described above. (It doesn't guarantee any particular detector will mark the output as human; see the disclaimer on the tool page.)