Ran my own essay through 5 different ai detectors and got completely different results

ok so i'm writing a research paper for my sociology class about social media and mental health. wrote the whole thing myself, no chatgpt, no nothing. just me and google scholar for like 3 days

decided to run it through some detectors before submitting because my professor said she's using detection software now and i didn't want any issues

results were all over the place:

  • one tool said 94% human
  • another said 67% ai generated
  • third one said “likely ai written” with no percentage
  • fourth gave me 82% human
  • fifth one crashed lol

how is this even possible?? it's the SAME text. i literally wrote every word. the one that flagged me hardest seemed to hate my conclusion section, which ok fair, it's a bit formal, but that's because i was trying to sound academic??

has anyone else had this experience? i'm genuinely worried about submitting now because if these tools can't agree, then what is my professor even using?

Yeah this is the reality of ai detection right now unfortunately. i teach high school english and i've tested probably 200+ student papers across a few different tools. the inconsistency is the biggest problem

what i’ve found is that formal academic writing gets flagged way more often than casual writing. makes sense if you think about it - ai tends to write in a more structured “correct” way, and academic papers by nature are supposed to be structured and correct
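to make that concrete, here's a toy sketch of the kind of surface statistic some detectors lean on - sentence-length variation (sometimes called "burstiness"). this is just an illustration i made up, not any real tool's scoring method:

```python
import re
import statistics

def burstiness(text):
    # Toy metric: standard deviation of sentence lengths (in words).
    # Low variation = very uniform sentences, the kind of pattern
    # simple detectors tend to associate with AI-generated text.
    sentences = [s for s in re.split(r'[.!?]+', text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

formal = ("The study examines social media use. The sample includes college students. "
          "The results indicate a correlation. The findings support the hypothesis.")
casual = ("ok so i read this study. it was about social media and how much time "
          "students spend on it which honestly is a lot. anyway the results were interesting.")

print(burstiness(formal) < burstiness(casual))  # prints True
```

the tidy academic paragraph scores way lower on variation than the rambling casual one, even though both were written by a human. that's basically the trap good student writers fall into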

my advice: keep your draft history. if you wrote in google docs the version history is your best friend. screenshots of your research process help too

Oh god same thing happened to me with my thesis proposal last semester. one tool flagged my literature review at 89% ai and i nearly had a panic attack. showed my advisor the google docs history with like 47 revisions and she was fine with it but still

the problem is these tools are pattern matching on writing style not actually detecting ai. formal + well structured + clear = ai apparently. which means anyone who writes well gets punished??

i keep a research log now with timestamps just in case

This is a well-documented issue in the research. Sadasivan et al. (2023) demonstrated that paraphrasing alone can reduce detector accuracy to near chance levels. The fundamental challenge is that these tools rely on statistical properties of text rather than genuine provenance.

For what it’s worth, the tools that give confidence percentages without clear methodology should be treated skeptically. Detection is probabilistic, not deterministic. Your professor should understand this nuance.
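To illustrate why five tools can disagree on identical text: each one reduces the text to some internal statistic, then maps that statistic to a verdict with its own cutoff. A minimal sketch (the scoring function, tool names, and thresholds here are invented for illustration, not any real detector's method):

```python
def toy_score(text):
    # Stand-in statistic: average word length. Real detectors use
    # model-based measures (e.g. perplexity), but the structure is the
    # same: one number per text, many possible cutoffs.
    words = text.split()
    return sum(len(w) for w in words) / len(words)

# Three hypothetical detectors applying different thresholds to the SAME score.
detectors = {
    "tool_a": lambda s: "human" if s < 7.0 else "ai",
    "tool_b": lambda s: "human" if s < 6.0 else "ai",
    "tool_c": lambda s: "human" if s < 6.5 else "ai",
}

text = "The correlation between social media usage and anxiety remains contested."
score = toy_score(text)  # identical input, identical statistic
verdicts = {name: rule(score) for name, rule in detectors.items()}
print(verdicts)  # same text, contradictory labels
```

The disagreement is not noise in the text; it is a design choice in each tool. Without published thresholds and error rates, a percentage like "67% ai" carries very little evidentiary weight.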

Good call on the version history @JonahHex99. I do write in docs so I have that. Just never thought i'd need to prove I wrote my own paper lol