kind of a nihilistic take but hear me out
we talk a lot about content provenance, watermarking, content credentials, metadata standards etc. all good ideas in theory. but the reality is:
- most major social media platforms strip or rewrite metadata on upload
- messaging apps compress and reprocess media
- screenshots contain zero provenance information
- right-click save only gives you whatever copy the platform served, which has usually lost most embedded data already
- even email attachments can lose metadata depending on the client
so even if we had perfect provenance technology, the actual distribution channels people use would destroy it. and that's before bad actors intentionally strip it
feels like we're building a lock for a door that doesn't have walls. am i wrong? someone convince me provenance is solvable
You’re not wrong about the current state but I think you’re discounting the trajectory. The technical solutions for surviving platform processing exist (perceptual hashing, steganographic embedding, content fingerprinting). The bottleneck is adoption, not technology.
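To make the perceptual hashing idea concrete, here is a minimal sketch of an average hash in plain Python. It is not any particular platform's or standard's algorithm. The toy 32x32 gradient image, the `average_hash` and `hamming` helpers, and the "round every pixel down to a multiple of 16" stand-in for lossy reprocessing are all assumptions made for illustration.

```python
def average_hash(pixels, hash_size=8):
    """Return a 64-bit average hash of a grayscale image.

    pixels is a 2D list of ints (0-255) whose width and height are
    multiples of hash_size (an assumption that keeps the sketch short).
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // hash_size, width // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))

    # One bit per cell: is the cell brighter than the overall mean?
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Toy image: a diagonal gradient (hypothetical stand-in for a real photo).
original = [[4 * (x + y) for x in range(32)] for y in range(32)]
# Crude stand-in for lossy reprocessing: coarse quantization of every pixel.
recompressed = [[(p // 16) * 16 for p in row] for row in original]
# A clearly different image: the negative of the original.
inverted = [[255 - p for p in row] for row in original]

print("original vs recompressed:", hamming(average_hash(original), average_hash(recompressed)))
print("original vs inverted:   ", hamming(average_hash(original), average_hash(inverted)))
```

The hash depends on coarse brightness structure rather than exact bytes, so the quantized copy stays within a few bits of the original while the inverted image lands far away. Note that a hash like this only identifies the content. The provenance record itself would have to be looked up somewhere else, for example in a registry keyed by hash, which is another assumption and not something this sketch provides.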
Consider HTTPS adoption as an analogy. In 2013, less than 30% of web traffic was encrypted. Today it’s over 95%. It took years and pressure from both browsers and regulation. Content provenance could follow a similar path if the incentive structures align.
The screenshots point is the one that gets me. so much content is consumed and shared as screenshots now. tweet screenshots, article screenshots, message screenshots. there's literally no provenance chain for a screenshot
but also i think provenance doesn’t need to be perfect to be useful. even if 50% of content has verifiable provenance, that shifts the default from “assume everything is real” to “be suspicious of anything without credentials.” that's progress even if it's incomplete
Technically the steganographic approach is the most promising. You embed the credential information in the actual pixel or waveform data at a level that survives compression and reprocessing. It's like an invisible watermark that carries provenance info
the tradeoff is that it slightly degrades quality and adds compute to both creation and verification. but for high-stakes content (news, evidence, official communications) that tradeoff seems worth it
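A rough sketch of that robustness vs. quality tradeoff, using quantization index modulation (QIM) on individual samples. This is a toy and not how any real content-credential watermark is specified: production schemes typically work in a transform domain and add error correction. `STEP`, `embed_bit`, `extract_bit`, the payload, and the noise model are all hypothetical choices for illustration.

```python
import random

STEP = 16  # hypothetical quantization step: larger is more robust but more visible


def embed_bit(sample, bit, step=STEP):
    """Move sample to a multiple of step whose parity encodes bit.

    Samples are treated as plain integers; clamping to 0-255 is
    omitted to keep the sketch short.
    """
    q = round(sample / step)
    if q % 2 != bit:
        # Step to the neighbouring multiple on the side closest to the sample.
        q += 1 if sample >= q * step else -1
    return q * step


def extract_bit(sample, step=STEP):
    """Read the bit back from the parity of the nearest multiple of step."""
    return round(sample / step) % 2


rng = random.Random(0)
payload = [1, 0, 1, 1, 0, 0, 1, 0]            # toy provenance bits
cover = [rng.randint(0, 239) for _ in payload]  # toy pixel or audio samples

embedded = [embed_bit(s, b) for s, b in zip(cover, payload)]

# Stand-in for compression or reprocessing: noise smaller than STEP / 2.
noisy = [s + rng.randint(-7, 7) for s in embedded]
recovered = [extract_bit(s) for s in noisy]

print("payload recovered:", recovered == payload)
print("max change per sample:", max(abs(e - c) for e, c in zip(embedded, cover)))
```

The bits survive any distortion smaller than `STEP / 2` per sample, and each sample moves by at most `STEP`. Raising `STEP` buys robustness at the cost of visible or audible degradation, which is the tradeoff described above.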
@Marc_Delrieu the https analogy is actually encouraging. Wonder what the equivalent forcing function would be for content credentials