cleaning up an archive. need to flag legacy contributor pieces that may have used AI during the 2023-2024 boom. 5000+ pieces across 6 years. need a bulk detection API with a reasonable per-call cost. anyone done this at scale and lived to tell?
yes, did something roughly similar last year, 7k pieces. two tips: 1) batch by year and run a sample first. you’ll see false-positive baselines shift across time periods because writing styles drift. 2) negotiate API pricing once you have your volume estimate. published rates are not the actual enterprise rates for 5-figure call volumes.
We did 2500 pieces and the surprise was how many false positives we got on pre-2022 content. those pieces were obviously written before mainstream LLMs, but the detector still flagged them. that’s because detectors are trained on stylistic patterns, so they also flag clean human writing from skilled writers. plan for that noise in your QA budget.
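rough sketch of how you could measure that noise per year, in case it helps. it assumes you already have detector output as (year, flagged) pairs and treats anything before 2022 as human-written. the function name, the input shape, and the cutoff are my own assumptions, not any vendor's API.

```python
from collections import defaultdict

def baseline_flag_rate(results, cutoff_year=2022):
    """Per-year share of flagged pieces among pieces older than cutoff_year.

    results: iterable of (year, flagged) pairs (hypothetical shape).
    Pieces before cutoff_year are assumed human-written, so every flag
    counted here is treated as a false positive.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [flagged, total]
    for year, flagged in results:
        if year < cutoff_year:
            counts[year][0] += int(flagged)
            counts[year][1] += 1
    return {year: f / t for year, (f, t) in sorted(counts.items())}
```

if the rate for a year comes out high, flags from nearby years probably need more manual review before anyone acts on them.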
@RustyCircuitX yeah the sampling-first idea is smart, I'm going to do a 200-piece per-year stratified sample before committing to the full run
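something like this is probably how the per-year sample gets pulled. minimal sketch, assuming each piece is a dict with "id" and "year" keys (hypothetical field names, adjust to whatever the archive export looks like). the fixed seed is only there so the same sample can be re-drawn later.

```python
import random
from collections import defaultdict

def stratified_sample(pieces, per_year=200, seed=42):
    """Draw up to per_year pieces from each publication year.

    pieces: iterable of dicts with 'id' and 'year' keys (assumed shape).
    Returns a dict mapping year -> list of sampled pieces.
    """
    by_year = defaultdict(list)
    for piece in pieces:
        by_year[piece["year"]].append(piece)

    rng = random.Random(seed)  # seeded so the draw is reproducible
    sample = {}
    for year, items in sorted(by_year.items()):
        k = min(per_year, len(items))  # small years contribute everything they have
        sample[year] = rng.sample(items, k)
    return sample
```

for 6 years that's at most 1200 calls for the pilot, which should be enough to see the per-year baselines before paying for the full run.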