Spam Filtering AI Content

Dec 8, 2022

As generative AI becomes more advanced, we will likely see an increase in spam that is difficult to distinguish from human-generated content. Here are some ways that we can combat the next wave of AI-generated content.

  1. Adversarial models that are trained to detect AI-generated content. Of course, this works both ways: a detector can also be used to train a generator that evades it. Anomaly detection algorithms can help as well. These algorithms can be tuned to flag AI-generated content by looking at signals like frequency of posts and rate of change in topic.
  2. Client-level restrictions: rate limits, limited API access, and shadow-banning, along with better tools for humans to identify and report spam.
  3. Harsher penalties. A very naive punishment policy is one where the penalty is inversely related to the chance of getting caught: the less likely detection is, the larger the penalty when it happens. For example, a worker at a remote site who is caught sleeping on the job during an infrequent visit would face a severe penalty. For spam, that might mean removing users from the platform or monetary fines (e.g., under the CAN-SPAM Act).
  4. Transaction fees, either explicit (such as micro-transactions) or implicit, through something like proof-of-work. The early underpinnings of Bitcoin were in Hashcash, a proof-of-work system designed to curtail email spam.
  5. Reputation systems (analogous to email or IP reputation systems).
  6. Challenge/response systems like CAPTCHA.
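
To make the proof-of-work idea from item 4 concrete, here is a minimal Hashcash-style sketch. It is a simplification, not the actual Hashcash stamp format: the function names, the `message:nonce` encoding, the use of SHA-256, and the default difficulty are all assumptions chosen for illustration. The sender has to search for a nonce whose hash starts with enough zero bits, while the receiver checks it with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # 8 - bit_length() is the number of leading zero bits in this byte.
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_digest(message: str, nonce: int) -> bytes:
    """Hash the message together with a nonce (illustrative encoding)."""
    return hashlib.sha256(f"{message}:{nonce}".encode()).digest()


def mint_stamp(message: str, difficulty_bits: int = 16) -> int:
    """Sender side: brute-force a nonce that clears the difficulty bar."""
    for nonce in count():
        if leading_zero_bits(stamp_digest(message, nonce)) >= difficulty_bits:
            return nonce


def verify_stamp(message: str, nonce: int, difficulty_bits: int = 16) -> bool:
    """Receiver side: one hash is enough to check the sender's work."""
    return leading_zero_bits(stamp_digest(message, nonce)) >= difficulty_bits


if __name__ == "__main__":
    post = "alice@example.com: hello world"
    nonce = mint_stamp(post, difficulty_bits=16)
    print(nonce, verify_stamp(post, nonce, difficulty_bits=16))
```

The asymmetry is the point: minting takes about 2^difficulty_bits hash attempts on average, while verifying takes one. That cost is negligible for someone posting a few messages but adds up quickly for someone posting millions.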