Hello - I'm EA-adjacent and have a cursory understanding of AI alignment issues. Thought I'd toss out a naive question!

AI systems rely on huge amounts of training data. Many people seem reluctant to share their data with these systems. How promising are efforts to limit or delay the power of AI systems by putting up legal barriers so that they can't scrape the internet for training data?

For example, I could imagine laws requiring anyone scraping the internet to ensure that they are not collecting data from people who have denied consent to have their data scraped. Even if few people deny consent in practice, the process of keeping their data out, or removing it later on, could be costly. This could at least buy time.


-"For example, I could imagine laws requiring anyone scraping the internet to ensure that they are not collecting data from people who have denied consent to have their data scraped."

In practice this is already the case: anyone who doesn't want their data scraped can put up a robots.txt file saying so, and I imagine big companies like OpenAI respect robots.txt. There could be some advantage in making it a legal rule, but I don't think it matters too much.
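For illustration, here is a minimal sketch of how a well-behaved scraper might honor robots.txt, using Python's standard urllib.robotparser. The robots.txt contents, crawler names, and URLs below are made up for the example and aren't tied to any real company's crawler.

```python
import urllib.robotparser

# Hypothetical robots.txt a site owner might publish: deny one named crawler
# entirely, and keep everyone else out of /private/.
robots_txt_lines = """
User-agent: ExampleDataBot
Disallow: /

User-agent: *
Disallow: /private/
""".splitlines()

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt_lines)

# A compliant scraper calls can_fetch() before requesting a page and skips
# anything the site owner has disallowed for its user agent.
print(parser.can_fetch("ExampleDataBot", "https://example.com/blog/post"))  # False: fully disallowed
print(parser.can_fetch("OtherBot", "https://example.com/blog/post"))        # True: only /private/ is off-limits
print(parser.can_fetch("OtherBot", "https://example.com/private/data"))     # False
```

The catch, of course, is that robots.txt is purely voluntary: nothing technically stops a scraper from ignoring it, which is where the legal-backing question comes in.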