Why AI Detection Tools Fail: Here's What To Do Instead

Ryan Flanagan
Jul 26, 2025

TLDR: AI detectors wrongly flag human writing every day. I see it often in my lectures at the University of Melbourne. If you want to assess quality and credibility, stop guessing authorship. Use annotated workflows, track version history, and evaluate the thought process, not just the formatting.

 
I Watch This Happen Every Week
A student submits their work. They wrote it.

Cited properly. Structured the argument. Thought it through.

Then a tool flags 30% of it as AI. WTH?? 

Then comes a long process of review, investigation and judgement. And ultimately... the application of the flag is... well... subjective. I teach corporate innovation and AI at Masters level at university. I've seen students forced to explain themselves, even penalised, based entirely on a detection score that can't explain how it reached that conclusion. As if AI could explain how it failed to capture what it was trained to recognise.

This isn’t rare. It’s happening in schools, in corporate P&C (how did HR manage to squeeze culture into their designation!), and in HR screening. The tool says “likely AI,” and the process grinds to a halt.

It’s lazy, flawed, and IMO deeply unfair.

These AI Tools Aren’t Designed for Decision-Making

AI detectors don’t understand meaning. They scan for structure and patterns.
If your writing is too formal, too clean, or just unusually consistent, they flag it. What they measure is not authorship. It’s resemblance. And let’s be honest... after 40 years of a school system that rewards and teaches conformity, what did you expect?

That means students who’ve practised writing well, or professionals working from structured templates, are more likely to get flagged. Especially if English isn’t their first language, or if they’ve used assistive tools like Grammarly. Then they’re forced to prove they wrote it themselves, often without knowing which part of the work is under suspicion.

This is a governance problem pretending to be a tech solution. And the tech companies jumped on it, hammering a square peg into a round hole.

There Is a Better Way, and I Teach It

I don’t ban AI in the classroom. I teach students how to use it properly.

They learn how to prompt, how to restructure, how to revise. Then they show their process: with notes, version history, or a prompt appendix. That way, I’m not guessing what they used. I can see how they thought. 

This approach works in businesses too. We help teams design workflows that show how AI was used, not just whether it was. The goal isn’t detection, it’s documentation: fairly important when things go wrong, and really important for accreditations like ISO 42001 for your AIMS (AI management system).

It shouldn’t be about catching people using AI. It’s about making it possible to contribute unique writing patterns without judgement, safely and transparently.

If You Use AI Detection Tools: Fix It

If your organisation uses AI detectors in recruitment, training, or internal comms, you need to change course.

Here’s how:

  • Ask for annotated submissions or prompt logs if you need provenance.
  • Use version history to understand how work was developed.
  • Make it normal to say where AI helped and where it didn’t.
  • Review outputs for thinking, structure, and appropriateness, not just “originality.”
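If your team wants the prompt logs and annotations above in a machine-readable form, here is a minimal sketch of what a provenance log could look like. The field names (`tool`, `used_for`, `kept`, and so on) are illustrative assumptions, not a standard schema:

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ProvenanceEntry:
    """One record of AI assistance in a piece of work.

    Field names are an illustrative sketch, not a standard.
    """
    tool: str          # e.g. "ChatGPT", "Grammarly"
    prompt: str        # what the author asked the tool for
    used_for: str      # which part of the work it touched
    kept: bool         # whether the output made it into the final draft
    timestamp: str = ""  # auto-filled in UTC if left blank

    def __post_init__(self):
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()

def append_entry(log: list, entry: ProvenanceEntry) -> list:
    """Append one entry as a plain dict, ready to serialise as JSON."""
    log.append(asdict(entry))
    return log

# Example: a student declares one instance of AI help.
log: list = []
append_entry(log, ProvenanceEntry(
    tool="ChatGPT",
    prompt="Suggest a clearer structure for my literature review",
    used_for="Section 2 outline",
    kept=True,
))
print(json.dumps(log, indent=2))
```

A log like this, submitted alongside the work and the version history, gives a reviewer something to discuss rather than a score to argue with.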

AI writing support is already the norm. Pretending otherwise will only push it into shadow or punish the wrong people. It will be all of us soon enough.

 FAQ

Q: Are AI detection tools accurate?
No. They often flag well-edited human writing as AI and provide no meaningful explanation.

Q: Should we ban AI writing tools to avoid this problem?
No. Banning doesn’t stop use. It drives it underground. Clear policy and transparent use are more effective.

Q: How can we assess work if AI was used?
Ask for documentation—prompt chains, draft history, or process notes. Focus on outcome quality and clarity of reasoning.

Q: What if someone uses AI and lies about it?
Treat it the same as any integrity breach. But don’t use flawed tools to pre-emptively accuse people.

Q: What’s the alternative to detection software?
Create clear expectations, support transparent AI use, and assess the value of the output and reasoning interaction, not just its statistical patterns.