This is a post I wrote geared towards WordPress.com users. However, the tips are great for any site, no matter the platform, and are the very first things I do after verifying a site on Google Search Console.
A number of sites are also blocking /wp-includes/. It may not seem obvious, but several things live here that visitors, and therefore crawlers, need to access to render pages properly. For example, Dashicons, the small icons you generally associate with the admin side of WordPress, are often called by themes for front-end usage. Another major thing that can hinder proper rendering by crawlers is jQuery. Sometimes themes will enqueue a different version, but by default it lives in the /wp-includes/ folder. If we dive even further into the issue, we’d see that the built-in emojis and comment reply handling would also be affected.
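If you want to check your own file, here is a minimal sketch using Python’s standard `urllib.robotparser`. The asset paths are assumptions based on a stock WordPress install (your version or theme may load different files), and `blocked_assets` and `OVERBLOCKING` are hypothetical names used only for illustration. Note that this parser does simple prefix matching, so it only approximates how a real crawler reads wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Front-end assets that ship in /wp-includes/ on a stock install.
# Assumed paths: adjust for your WordPress version and theme.
RENDER_ASSETS = [
    "/wp-includes/css/dashicons.min.css",       # Dashicons
    "/wp-includes/js/jquery/jquery.min.js",     # bundled jQuery
    "/wp-includes/js/wp-emoji-release.min.js",  # built-in emojis
    "/wp-includes/js/comment-reply.min.js",     # comment reply handling
]


def blocked_assets(robots_txt, user_agent="Googlebot"):
    """Return the render-critical paths that robots_txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in RENDER_ASSETS
            if not parser.can_fetch(user_agent, path)]


# Example of a file that blocks too much.
OVERBLOCKING = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

print(blocked_assets(OVERBLOCKING))
```

With the over-blocking file above, all four paths come back as blocked. Remove the `/wp-includes/` line and the list comes back empty.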
So, what can be safely blocked? At this point, here’s what a “compliant” WordPress robots.txt would look like, at least in terms of what is safe to block. Of course, you’d want to add in your own sitemap directives and other special cases, but this is a good starting point.
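As a rough sketch of such a starting point, the snippet below builds a robots.txt that blocks only /wp-admin/ (while still allowing `admin-ajax.php`, as the default WordPress rules do) and leaves /wp-includes/ alone. Treat the exact rule set as an assumption rather than a definitive list. `build_robots_txt` is a hypothetical helper and the sitemap URL is a placeholder.

```python
def build_robots_txt(sitemap_urls=()):
    """Build a conservative WordPress robots.txt as a string.

    Assumed rule set: block only /wp-admin/, keep admin-ajax.php
    reachable, and leave /wp-includes/ crawlable.
    """
    lines = [
        "User-agent: *",
        "Disallow: /wp-admin/",
        "Allow: /wp-admin/admin-ajax.php",
    ]
    if sitemap_urls:
        lines.append("")  # blank line before the sitemap directives
        lines.extend(f"Sitemap: {url}" for url in sitemap_urls)
    return "\n".join(lines) + "\n"


# Placeholder URL: swap in your own sitemap location.
print(build_robots_txt(["https://example.com/sitemap.xml"]))
```

Any special cases for your own site would go in as extra lines in the same list.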
Questions or comments? Leave them below!