[Plugin: PC Robots.txt] Secure robots.txt
Hey!
When I started using your plugin I simply loved it, but I wanted to hide robots.txt and all its contents from ordinary visitors, because it reveals that this is a WordPress site to anyone browsing with one of the common browsers.
My first idea was to do it via .htaccess, but that would have meant putting a real robots.txt file in the root folder of the site so that Apache’s mod_rewrite could redirect to a custom 404 PHP page. So I came up with the idea of building this feature directly into your plugin. It checks the user agent to decide whether the visitor is a bot or a casual user: a bot gets the robots.txt as it should, and a regular user gets a 404 and is sent back to the homepage. The user agents are based on this site: https://www.useragentstring.com/pages/useragentstring.php
Here is the DIFF:
--- pc-robotstxt.php
+++ pc-robotstxt.php
@@ -74,24 +74,45 @@
         if ( is_robots() ) {
-            $options = $this->get_options();
-
-            $output = "# This virtual robots.txt file was created by the PC Robots.txt WordPress plugin.\n";
-            $output .= "# For more info visit: https://petercoughlin.com/robotstxt-wordpress-plugin/\n\n";
-
-            if ( '' != $options['user_agents'] )
-                $output .= stripslashes($options['user_agents']);
-
-            // if there's an existing sitemap file or we're using pc-xml-sitemap plugin add a reference..
-            if ( file_exists($_SERVER['DOCUMENT_ROOT'].'/sitemap.xml.gz') )
-                $output .= "\n\n".'Sitemap: https://'.$_SERVER['HTTP_HOST'].'/sitemap.xml.gz';
-            elseif ( class_exists('pc_xml_sitemap') || file_exists($_SERVER['DOCUMENT_ROOT'].'/sitemap.xml') )
-                $output .= "\n\n".'Sitemap: https://'.$_SERVER['HTTP_HOST'].'/sitemap.xml';
-
-            header('Status: 200 OK', true, 200);
-            header('Content-type: text/plain; charset='.get_bloginfo('charset'));
-            echo $output;
-            exit;
+            $rawData = $_SERVER['HTTP_USER_AGENT'];
+            $trueBrowsers = array('Mozilla','Opera','Links','Lynx','Nokia','Samsung','MOT','SonyEricsson','Doris','HTC','Bunjalloo','PSP','wii','Amiga','ELinks','Cyberdog','Dillo','Dooble','Enigma','Galaxy','HotJava','IBM','LeechCraft','NCSA','NetSurf','retawq','Surf','Webkit','Uzbl','Vimprobable','w3m','WorldWideweb');
+            $mozBots = array('Googlebot','Yahoo! Slurp','Ask Jeeves', 'Twiceler','Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0)');
+            foreach($trueBrowsers as $key => $search_browser) {
+                if(stristr($rawData, $search_browser) == TRUE) {
+                    if(stristr($rawData, $search_browser) == 'Mozilla') {
+                        foreach($mozBots as $key => $search_moz_bot) {
+                            if(stristr($rawData, $search_moz_bot) == TRUE) {
+                                $isBrowser = FALSE;
+                            }
+                        }
+                    }
+                    else {
+                        $isBrowser = TRUE;
+                    }
+                }
+            }
+            if ($isBrowser == FALSE) {
+                $options = $this->get_options();
+
+                if ( '' != $options['user_agents'] )
+                    $output .= stripslashes($options['user_agents']);
+
+                // if there's an existing sitemap file or we're using pc-xml-sitemap plugin add a reference..
+                if ( file_exists($_SERVER['DOCUMENT_ROOT'].'/sitemap.xml.gz') )
+                    $output .= "\n\n".'Sitemap: https://'.$_SERVER['HTTP_HOST'].'/sitemap.xml.gz';
+                elseif ( class_exists('pc_xml_sitemap') || file_exists($_SERVER['DOCUMENT_ROOT'].'/sitemap.xml') )
+                    $output .= "\n\n".'Sitemap: https://'.$_SERVER['HTTP_HOST'].'/sitemap.xml';
+
+                header('Status: 200 OK', true, 200);
+                header('Content-type: text/plain; charset='.get_bloginfo('charset'));
+                echo $output;
+                exit;
+            }
+            else {
+                header('HTTP/1.0 404 Not Found', true, 404);
+                header('Location: '.site_url());
+                exit();
+            }
         }// end if
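A note on the user-agent check in the diff, with a rough sketch of one way to tighten it. As far as I can tell, stristr() returns the rest of the string starting at the match, so comparing its result to 'Mozilla' is only true when the user agent ends with "Mozilla". That means the $mozBots list is effectively never consulted, and a crawler that sends "Mozilla/5.0 (compatible; Googlebot/2.1; ...)" ends up flagged as a browser. $isBrowser and $output are also never initialized. The helper below is hypothetical (pcr_is_probably_bot is not part of PC Robots.txt, and the token lists are shortened copies of the ones in the diff); it assumes bot names should win over browser names and that unknown or missing user agents should still receive robots.txt, which is what the diff does by default.

```php
<?php
/**
 * Hypothetical helper, not part of the PC Robots.txt plugin.
 * Guesses whether a User-Agent string belongs to a crawler.
 *
 * @param mixed $user_agent Raw User-Agent value (may be missing/null).
 * @return bool True if robots.txt should be served, false for a regular browser.
 */
function pcr_is_probably_bot( $user_agent ) {
    // Shortened copies of the lists from the diff (assumption: extend as needed).
    $bot_tokens     = array( 'Googlebot', 'Yahoo! Slurp', 'Ask Jeeves', 'Twiceler' );
    $browser_tokens = array( 'Mozilla', 'Opera', 'Links', 'Lynx', 'w3m' );

    // No usable User-Agent header: treat as a bot so robots.txt is still served.
    if ( ! is_string( $user_agent ) || '' === $user_agent ) {
        return true;
    }

    // Check bot names first, because many crawlers also advertise "Mozilla".
    foreach ( $bot_tokens as $token ) {
        if ( false !== stripos( $user_agent, $token ) ) {
            return true;
        }
    }

    // A known browser token with no bot token: regular visitor.
    foreach ( $browser_tokens as $token ) {
        if ( false !== stripos( $user_agent, $token ) ) {
            return false;
        }
    }

    // Anything unrecognized is treated as a bot, matching the diff's default.
    return true;
}
```

Inside the is_robots() branch this could be called with isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '' and $output started as an empty string before the first .= line. One more thing to watch: as far as I know, PHP's header() switches the status to 302 when a Location header is sent unless a 201 or 3xx code has already been set, so the 404 in the else branch probably goes out as a plain redirect to the homepage rather than a real 404.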