Preventing People From Scraping My Article Content (i.e., TopBuzz)
Not sure how many here are familiar with TopBuzz, but it’s essentially a Reddit-esque app/website that people use for sharing content. It refers a lot of traffic to publishers via the nativeapp.toutiao.com domain.
Over the past several months, I’ve noticed an absolutely massive surge in traffic from TopBuzz. It turns out that someone posing as my site was sharing links to all of my content on the platform. The process seemed to be automated, as if they were just syndicating an RSS feed (I’m honestly not sure HOW to do this in TopBuzz, but it’s the only explanation I have).
Then, suddenly, the traffic stopped. Instead of posting links to my site, the user is now posting my entire articles (right down to the image captions) natively on the TopBuzz platform. They appear to be monetizing this content without my permission.
Having reviewed the posts, I’m more convinced than ever that the scraping is automated, which means I should, in theory, be able to prevent it. But I’m not sure how. I tried switching the RSS home feed from “full article content” to “excerpt,” but that didn’t stop it.
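A rough way to check whether the excerpt setting actually took effect is to look at what the feed itself still contains. The sketch below is only an illustration under some assumptions: that the feed is RSS 2.0 and carries full article bodies in `content:encoded` elements, and that a body over some length counts as "full content." The function name `items_with_full_content`, the `min_chars` threshold, and the sample feed are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Qualified name of <content:encoded> in the RSS content module namespace.
# Assumption: the site's feed uses this element for full article bodies.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"


def items_with_full_content(rss_xml, min_chars=500):
    """Return the titles of items whose <content:encoded> body has at least min_chars characters.

    min_chars is an arbitrary, hypothetical threshold for "looks like a full article".
    """
    root = ET.fromstring(rss_xml)
    flagged = []
    for item in root.iter("item"):
        body = item.findtext(CONTENT_ENCODED, default="")
        if len(body.strip()) >= min_chars:
            flagged.append(item.findtext("title", default="(untitled)"))
    return flagged


# Made-up sample feed: one item with a long body, one with only a teaser.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example</title>
    <item>
      <title>Full post</title>
      <description>Short teaser</description>
      <content:encoded><![CDATA[""" + "word " * 200 + """]]></content:encoded>
    </item>
    <item>
      <title>Excerpt only</title>
      <description>Short teaser</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(items_with_full_content(SAMPLE_FEED))
```

To use it on a real site, pass in the text of the feed as it is actually served rather than the sample. If nothing is flagged and full articles still show up on TopBuzz, the scraper may be reading the article pages directly instead of the feed, in which case feed settings alone would not help.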
Are there individual RSS files for the articles that I could potentially modify?
TopBuzz isn’t being very cooperative, so I’m going to pursue legal action. In the meantime, I’d at least like to block the scraping.