Add robots.txt to your WordPress site
- Release Date: October 29, 2022
Webmasters with even a little SEO exposure will know the robots protocol (also called the crawler protocol, crawler rules, robot protocol, etc.): the robots.txt file usually placed in the root directory of a website. Its job is to tell search engines which pages may be crawled and which may not, so as to optimize how the site is indexed and weighted.
If there is no robots.txt in the root directory of your website, you can create one; for the syntax, refer to Google's robots.txt documentation. Here is a basic robots.txt for WordPress:
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
Allow: /wp-content/uploads/
Sitemap: http://example.com/sitemap.xml
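In this example, /wp-content/ is disallowed as a whole, but the Allow: /wp-content/uploads/ line re-opens the uploads folder so your images can still be indexed, and the Sitemap line points crawlers at your sitemap (replace example.com with your own domain).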
The rest of this post covers WordPress's ability to generate a virtual robots.txt automatically. If no real robots.txt exists in the root directory of your site, you can let WordPress serve a virtual one: no physical file is created, but visiting http://yoursite.com/robots.txt displays it normally.
Just add the following code to your theme’s functions.php:
/**
 * Add robots.txt to your WordPress site
 * https://www.w3diary.com/add-robots-txt.html
 */
add_filter( 'robots_txt', 'robots_mod', 10, 2 );
function robots_mod( $output, $public ) {
    // Block URLs whose path starts with /user/ (end each rule with a newline).
    $output .= "Disallow: /user/\n";
    return $output;
}
Note: If you want to add more rules, duplicate the $output .= line in the code above and change the path.
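For example, a version with a few extra rules might look like the sketch below. The /private/ and /tmp/ paths and the sitemap URL are placeholders; substitute your own values:

add_filter( 'robots_txt', 'robots_mod', 10, 2 );
function robots_mod( $output, $public ) {
    // Each rule ends with a newline so the directives stay on separate lines.
    $output .= "Disallow: /user/\n";
    $output .= "Disallow: /private/\n";
    $output .= "Disallow: /tmp/\n";
    $output .= "Sitemap: http://example.com/sitemap.xml\n";
    return $output;
}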
Visiting http://yoursite.com/robots.txt, you should now see the following:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /user/
In other words, WordPress adds the first three lines of rules by default.
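The second argument passed to the filter, $public, reflects the "Search engine visibility" option under Settings → Reading. As a hedged sketch, you could use it to append your custom rules only when the site is public:

add_filter( 'robots_txt', 'robots_mod', 10, 2 );
function robots_mod( $output, $public ) {
    // When search engines are discouraged, $public is '0' and WordPress
    // already outputs a blanket Disallow, so leave that output unchanged.
    if ( '0' === $public ) {
        return $output;
    }
    $output .= "Disallow: /user/\n";
    return $output;
}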
robots.txt affects how your site gets indexed, so make sure you understand its syntax and double-check that every rule is correct!