• Hi,

    I’m looking for a way to “scan” blogs (any blogs, whatever the format) and get results for each entry for number of comments, number of words and number of words in comments. Can that be done? If yes, is it complicated?

    Thanks

Viewing 5 replies - 1 through 5 (of 5 total)
  • Moderator Samuel Wood (Otto)

    (@otto42)

    www.remarpro.com Admin

    I would say that it’s pretty much impossible.

    You’re asking for a generalized solution to “scan” any given randomly designed site, determine what is a “post” and what is a “comment”, and then count them up. Counting is easy; computers are good at it. Determining the type of the content is very difficult, and computers are very bad at it.

    Computers count. They don’t make judgment calls.

    Thread Starter baal666

    (@baal666)

    Hi Otto42,

    Thanks for your reply!

    What you mean, if I understand correctly, is that if someone wants to judge blogs based on the number of comments and the length of them, he would have to do it manually. Is this right? What a long job it must be!

    Best case, you would have to customize the scan for each of the blog engines you wish to incorporate… and hope that there is something reliable that you can search for to ID the various content types.

    It’s something that could be written relatively easily for a subset of blogs: WordPress blog templates frequently have similar markup and selectors (posts are usually contained within a DIV with class “post”, comments are usually contained within an OL with class “commentlist”), so your script could scan the contents of the post area being reasonably certain what the boundaries of that post area are.
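    As a rough illustration of that idea, here is a minimal Python sketch using only the standard library. It assumes the common markup described above (posts inside a DIV with class “post”, comments as LI items directly inside an OL with class “commentlist”). The names BlogStats, scan and SAMPLE are made up for this example, and a theme with different markup would simply produce zeros.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BlogStats(HTMLParser):
    """Hypothetical helper: counts words in div.post and ol.commentlist."""

    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, region label or None) for each open tag
        self.post_words = 0
        self.comment_words = 0
        self.comment_count = 0

    def _region(self):
        # The innermost labelled ancestor decides where text is counted.
        for _tag, label in reversed(self.stack):
            if label:
                return label
        return None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        label = None
        if tag == "div" and "post" in classes:
            label = "post"
        elif tag == "ol" and "commentlist" in classes:
            label = "comments"
        elif tag == "li" and self.stack and self.stack[-1][1] == "comments":
            # Assumption: each direct LI child of the comment list is one comment.
            self.comment_count += 1
        self.stack.append((tag, label))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed inner tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        words = len(data.split())
        region = self._region()
        if region == "post":
            self.post_words += words
        elif region == "comments":
            self.comment_words += words


def scan(html):
    """Return word and comment counts for one page of HTML."""
    parser = BlogStats()
    parser.feed(html)
    parser.close()
    return {
        "post_words": parser.post_words,
        "comments": parser.comment_count,
        "comment_words": parser.comment_words,
    }


SAMPLE = """
<div class="post"><h2>Hello world</h2><p>This is a short post.<br>Bye.</p></div>
<ol class="commentlist">
  <li><p>Nice post!</p></li>
  <li><p>I agree with that.</p></li>
</ol>
"""

if __name__ == "__main__":
    print(scan(SAMPLE))
```

    Fetching the pages, handling threaded (nested) comments, and coping with themes that use other class names are all left out here, and those are exactly the parts that have to be worked out per site.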

    But there are many WordPress blog templates that don’t follow the common markup, and the script would fail on them. And almost any Movable Type, Blogger, Drupal, etc. website could be counted on not to share that markup either.

    At best, without making a career of it, you could write something that could scan some subset of sites. But you couldn’t scan any arbitrary site without investigating the site first.

    Thread Starter baal666

    (@baal666)

    Thanks for sharing your thoughts. I see it is complex stuff… Well, from what I can see, it is difficult to do.

  • The topic ‘Looking for a program that could scan number of comments, number of words, etc.’ is closed to new replies.