The AI Detection Arms Race Is On

“Watermarking” doesn’t help either, he says. Under this approach, a generative AI tool like ChatGPT proactively adjusts the statistical weights of certain interchangeable “token” words—say, using start instead of begin, or pick instead of choose—in a way that would be imperceptible to the reader but easily spotted by an algorithm. Any text in which those words appear with a given frequency could be marked as having been generated by a particular tool. But Feizi argues that with enough paraphrasing, a watermark “can be washed away.”
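The scheme Feizi describes can be illustrated with a toy detector. The synonym pairs and scoring here are hypothetical and purely for illustration; real watermarks bias token probabilities across the whole vocabulary rather than checking a word list:

```python
# Toy sketch of frequency-based watermark detection (hypothetical word
# pairs; not any vendor's actual scheme).
WATERMARKED = {"start": "begin", "pick": "choose"}  # preferred -> plain synonym

def watermark_score(text):
    """Fraction of synonym-pair occurrences that use the watermarked variant."""
    words = text.lower().split()
    preferred = sum(w in WATERMARKED for w in words)
    plain = sum(w in WATERMARKED.values() for w in words)
    hits = preferred + plain
    return preferred / hits if hits else 0.0

generated = "start the essay and pick a topic"      # watermarked output
paraphrased = "begin the essay and choose a topic"  # paraphrase of the same text
```

A paraphrase that swaps start for begin drives the score from 1.0 to 0.0, which is exactly the washing-away attack Feizi describes.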

In the meantime, he says, detectors are hurting students. Say a detection tool has a 1 percent false positive rate—an optimistic assumption. That means in a classroom of 100 students, over the course of 10 take-home essays, there will be on average 10 students falsely accused of cheating. (Feizi says a rate of one in 1,000 would be acceptable.) “It’s ridiculous to even think about using such tools to police the use of AI models,” he says.
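Feizi’s arithmetic is a straightforward expected-value calculation under his stated assumptions (the independence of errors across essays is an added assumption, not something the article states):

```python
# Back-of-the-envelope check of Feizi's classroom example.
students, essays, fpr = 100, 10, 0.01  # 1 percent false positive rate

# Expected number of falsely flagged essays across the semester.
expected_flags = students * essays * fpr  # 10 false flags on average

# Chance an honest student is flagged at least once in 10 essays,
# assuming each essay is an independent trial.
p_flagged_once = 1 - (1 - fpr) ** essays
expected_students = students * p_flagged_once  # roughly 9.6 students
```

At Feizi’s preferred rate of one in 1,000, the same classroom would see about one false flag per semester instead of ten.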

Tian says the point of GPTZero isn’t to catch cheaters, but that has inarguably been its main use case so far. (GPTZero’s detection results now come with a warning: “These results should not be used to punish students.”) As for accuracy, Tian says GPTZero’s current level is 96 percent when trained on its most recent data set. Other detectors boast higher figures, but Tian says those claims are a red flag, a sign they’re “overfitting,” curating training data to match the strengths of their tools. “You have to put the AI and human on equal footing,” he says.

Surprisingly, AI-generated images, videos, and audio snippets are far easier to detect, at least for now, than synthetic text. Reality Defender, a startup backed by Y Combinator, launched in 2018 with a focus on fake image and video detection and has since branched out to audio and text. Intel released a tool called FakeCatcher, which detects deepfake videos by analyzing facial blood-flow patterns, subtle color changes invisible to the naked eye but detectable on camera. A company called Pindrop uses voice “biometrics” to detect spoofed audio and to authenticate callers in lieu of security questions.

AI-generated text is more difficult to detect because it has relatively few data points to analyze, which means fewer opportunities for AI output to deviate from the human norm. Compare that to Intel’s FakeCatcher. Ilke Demir, a research scientist for Intel who has also worked on Pixar films, says it would be extremely difficult to create a data set large and detailed enough to allow deepfakers to simulate blood flow signatures to fool the detector. When I asked whether such a thing could eventually be created, she said her team anticipates future developments in deepfake technology in order to stay ahead of them.

Ben Colman, CEO of Reality Defender, says his company’s detection tools are all but impossible to evade, in part because they’re private. (So far, the company’s clients have mainly been governments and large corporations.) With publicly available tools like GPTZero, anyone can run a piece of text through the detector and then tweak it until it passes muster. Reality Defender, by contrast, vets every person and institution that uses the tool, Colman says. The company also watches for suspicious usage: if a particular account were to run tests on the same image over and over with the goal of bypassing detection, its system would flag it.
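The repeat-probing defense Colman describes could look something like this in outline. This is a hypothetical sketch, not Reality Defender’s actual system; the threshold and fingerprinting choice are invented for illustration:

```python
import hashlib
from collections import Counter

PROBE_THRESHOLD = 5  # hypothetical cutoff for "same file, too many times"

query_log = Counter()  # (account_id, file_fingerprint) -> query count

def record_detection_query(account_id, file_bytes):
    """Log one detector call; return True if the account looks like it's probing."""
    fingerprint = hashlib.sha256(file_bytes).hexdigest()
    key = (account_id, fingerprint)
    query_log[key] += 1
    return query_log[key] > PROBE_THRESHOLD
```

An exact cryptographic hash only catches byte-identical resubmissions; flagging lightly tweaked variants of the same image would require a perceptual hash in place of SHA-256.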

Regardless, much like spam hunters, spies, vaccine makers, chess cheaters, weapons designers, and the entire cybersecurity industry, AI detectors across all media will have to constantly adapt to new evasion techniques. Assuming, that is, the difference between human and machine still matters.
