Google’s crawl rate determines how quickly your website content ends up in Google search. It’s not the factor that gets your content ranked, but it is the first step towards a fighting chance: without it, your website is virtually dead in the water.
Before I talk about my 3 action points, I want to address a misconception about Google crawl:
Many website managers and even some SEO people think that Google only crawls their website once or twice a month, and they blame that for how long it takes to get ranked. But that isn’t the real reason why it takes ages to get ranked. Googlebot (the crawler) is not the algorithm, and it’s the algorithm that assigns rank. Googlebot is just the messenger, and it likes to visit as often as possible.
Assuming your website has at least 1 followed link to it from somewhere (and I will tell you more about that shortly) then your website can get crawled by Googlebot. I liken Googlebot to a hungry wolf knocking on little piggy’s door. If you really-really want to stay out of Google search, you’ll need a website made of solid brick, not straw. And your doors and windows will have to be locked! That’s because Googlebot is very keen to get in there and see what you’ve got on your website, just as much as Mr Wolf wants to eat little piggy. So much so that it might visit several times per day to take another peek. It’s merciless.
This image shows Googlebot crawl stats for a website with about 100 pages. The average pages crawled per day is 375, so there’s a ratio of 3.75:1 of page-crawls to actual pages. Ratios like 5:1 are not uncommon. There’s also not a single day where this website wasn’t visited by Google.
When you connect Google Search Console to your website and take a look at the Crawl Stats section, you’ll see that Googlebot crawls almost every day, and on some websites it crawls as many as 10x the number of pages your website actually has. In other words, it’s crawling the same pages several times over in any one day. The more pages it can crawl, the more likely your pages are to get indexed. I’ll repeat, though: the only relationship between crawl and rank is that your pages must be crawlable to get ranked. Crawl doesn’t determine rank position. Check out my article here about Google Search Console if you haven’t used it before.
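The ratio quoted above is simple arithmetic, and you can work it out for your own site from the Crawl Stats figures. Here is a minimal sketch; the function name `crawl_ratio` is my own, and the numbers are the ones from the 100-page example.

```python
def crawl_ratio(avg_pages_crawled_per_day, total_pages):
    """Return how many page-crawls happen per day for each actual page."""
    if total_pages <= 0:
        raise ValueError("total_pages must be a positive number")
    return avg_pages_crawled_per_day / total_pages


# Figures from the example above: about 100 pages, 375 pages crawled per day.
ratio = crawl_ratio(375, 100)
print(f"{ratio:.2f}:1")
```

A result above 1 means Googlebot is revisiting pages within the same day.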
OK: Here are 3 top things you can do to ensure your website is getting crawled as much as possible:
Add your website to Google Search Console and submit your sitemap. If you’re using WordPress as your CMS, I recommend Yoast SEO to generate sitemaps for your website. If you use another CMS, find out whether it can generate a sitemap and take note of the URL it produces. Typically, the sitemap sits in the root directory of a website and ends up with a URL like this: www.yourdomain.com/sitemap.xml. If your website doesn’t have a sitemap and you can’t generate one with the CMS or its plugins / extensions, you can make one manually as a TXT file. Just list all of your website’s URLs in a text file, one per line, and upload it to the root directory of your website (you will need file upload access for this). Name it something like sitemap.txt, and your sitemap URL will be www.yourdomain.com/sitemap.txt instead. To submit the sitemap, visit here in Google Search Console: https://www.google.com/webmasters/tools/sitemap-list
Sitemaps provide Google with the first full set of followed links to your website’s pages. It’s like having links on someone else’s website linking to yours. That website just happens to be Google.com in this case. Just be aware that the sitemap submission is an invitation to crawl. It isn’t a command to crawl, and Google has the right to ignore that invitation. In other words, the sitemap alone doesn’t guarantee crawl. What it does do is give all of your pages an even chance to begin with.
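If you go the manual route, the plain-text sitemap described above can be produced with a few lines of Python. This is only a sketch under my own assumptions: the helper name `build_text_sitemap` and the page list are hypothetical, and you would swap in your real URLs.

```python
from urllib.parse import urlparse


def build_text_sitemap(urls):
    """Return the body of a plain-text sitemap: one absolute URL per line."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        parts = urlparse(url)
        # Skip blanks and relative paths: every entry must be a full URL.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        # Skip duplicates so each page is listed once.
        if url in seen:
            continue
        seen.add(url)
        lines.append(url)
    return "\n".join(lines) + "\n"


# Hypothetical page list, for illustration only.
pages = [
    "https://www.yourdomain.com/",
    "https://www.yourdomain.com/products/",
    "https://www.yourdomain.com/products/",  # duplicate, dropped
    "/contact",                              # relative path, dropped
]

sitemap_body = build_text_sitemap(pages)

if __name__ == "__main__":
    # Write the file you would then upload to your website's root directory.
    with open("sitemap.txt", "w", encoding="utf-8") as handle:
        handle.write(sitemap_body)
```

The script only creates the file locally; uploading it and submitting its URL in Google Search Console are still manual steps.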
Fetch as Google:
This is number two in my list solely because, if you just did number one, you may as well stay in Google Search Console and do number two. Fetch as Google is a command with four modes: Googlebot desktop and Googlebot mobile, plus the now less common Googlebot XHTML and Googlebot cHTML. At the very least, use the fetch command to fetch the home page of your website, and use the Submit to Index button (which shows up after a successful fetch) to submit the page to Google’s index. You can opt to “crawl only this URL” or “crawl this URL and its direct links”. Here’s the difference:
If your website looks like this:
Then “crawl only this URL” from the home page will crawl only the home page, whereas “crawl this URL and its direct links” will crawl the URLs in red and orange. To get the lower blue hierarchy crawled (products 1, 2 and 3), you need to use fetch again on the Products page (because it is the parent) with the “crawl this URL and its direct links” option, or fetch each product independently. Rinse and repeat if your website structure is very deep, but don’t submit the same URLs repeatedly, because this can make Google ignore your further requests.
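To make the difference concrete, here is a small model of the two options as I have described them. It is a sketch, not Google’s actual behaviour: the `SITE_LINKS` structure, the paths and both function names are hypothetical, and the model assumes “direct links” means only the pages linked from the submitted URL.

```python
from collections import deque

# Hypothetical site structure: each URL maps to the pages it links to directly.
SITE_LINKS = {
    "/": ["/about", "/products", "/contact"],
    "/products": ["/products/1", "/products/2", "/products/3"],
}


def urls_covered(start, include_direct_links):
    """Model one fetch: the URL itself, plus its direct links if requested."""
    covered = [start]
    if include_direct_links:
        covered.extend(SITE_LINKS.get(start, []))
    return covered


def fetch_plan(root):
    """List the parent pages to submit with 'crawl this URL and its direct
    links' so every page is covered and no URL is submitted twice."""
    plan = []
    seen = {root}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        children = SITE_LINKS.get(page, [])
        # Only the root and pages that link to deeper pages need submitting.
        if children or page == root:
            plan.append(page)
        for child in children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return plan


print(urls_covered("/", include_direct_links=False))
print(urls_covered("/", include_direct_links=True))
print(fetch_plan("/"))
```

In this model the home page and the Products page are the only two submissions needed, which matches the walk-through above.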
You have a limit of 500 single URL fetches and 10 sets of direct-linked URLs fetched per month.
Only use the Fetch command on pages that haven’t been crawled yet. Using it on pages that are already indexed really won’t help, unless you have changed the content of a page and want the new content indexed sooner rather than later.
Gaining followed links is my number three, and it’s a big one. This is not something you are going to be able to do overnight, but don’t give up just because it’s slow going! Followed links are links on other websites that lead to yours and that specifically lack the rel="nofollow" attribute, which requests that search engines do not pass through them. In other words, followed links are ones that allow search engines like Google to pass through them and crawl your website. The more of these you have, the more your website is exposed to possible crawls. This is especially true if the website you got a link from is a very busy and popular site in the same genre as yours, because Googlebot likes hanging out on websites that offer its users a good experience, and loves visiting other sites that are related to the one it was on.
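If you want to check whether a link you have gained is actually followed, you can look at the `rel` attribute of the anchor tag. Here is a minimal sketch using Python’s standard `html.parser` module; the class name `LinkAuditor`, the helper `audit_links` and the sample HTML are my own, for illustration only.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Sort the links in a page into followed and nofollowed."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. "noopener nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html):
    """Return (followed, nofollowed) lists of href values found in html."""
    auditor = LinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.followed, auditor.nofollowed


# Hypothetical snippet from a page that links to your website.
sample = """
<p><a href="https://www.yourdomain.com/">followed</a>
<a rel="noopener nofollow" href="https://www.yourdomain.com/products/">not followed</a></p>
"""

followed, nofollowed = audit_links(sample)
print("Followed:", followed)
print("Nofollowed:", nofollowed)
```

This only inspects the HTML you give it; it says nothing about whether Googlebot has visited the linking page.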
You can use many different strategies to gain followed links, which might include content sharing, blogging, article writing, and directory submissions. Just bear in mind that you might not want links from dodgy websites that are not busy or popular, because if Googlebot comes from a site like that to yours through a link, it might be in a bad mood. I say this jokingly, but actually it’s a serious matter. If you get a lot of links from poor quality or irrelevant websites, your website may get a penalty from Google’s Penguin algorithm that could take a long time to recover from.
Ideally, links to your site should be:
- Distributed to many different pages in your site
- Distributed from many different sources
- From websites that you respect yourself and that are related to what you do
- Gained gradually over time.
- Gained from locations where your target market may hang out.