Good online findability is enormously important for a company today. Without it, you can lose a lot of potential customers, and that is of course a shame. Google tries to index every web page it can discover.
Google builds that index by crawling web pages. If problems prevent the crawl process from running optimally, this can have a major influence on the findability of a page. Optimizing the crawl budget of your website can therefore come in handy. We have a few tips to get you started.
What is a crawl budget?
The term crawl budget is underused when it comes to optimizing a website, even though the crawl budget has a huge influence on the findability of a page. You can assume that Google crawls a few pages of your website almost every day. The number of pages it crawls per day is what we call the crawl budget.
The crawl budget of a website can differ slightly from day to day. It depends on various factors, such as the size of the website and the number of links pointing to it. Are you curious about the crawl budget of your website? Then take a look at Search Console.
It is always good to know the crawl budget of your website. With this information you can compare the average number of pages Googlebot crawls per day with the number of pages on your website that Google has indexed. You can find the latter by performing the following search in Google:
site:website.nl
This search returns the number of pages that Google has indexed, which is a rough estimate.
If your crawl budget is much lower than the number of indexed pages, there is room for improvement, because Google does not crawl all of your pages on a daily basis. As mentioned earlier, the crawl budget depends on several factors. Optimizing these factors can ensure that Google crawls more pages every day. We call this crawl budget optimization.
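To make the comparison concrete, here is a minimal sketch of the arithmetic. It assumes you take the average number of pages crawled per day from Search Console and the number of indexed pages from the site: search. The function name and the figures are hypothetical, and the result is only a rough estimate, because it assumes every crawl request hits a different page.

```python
import math


def days_to_crawl_site(indexed_pages: int, pages_crawled_per_day: float) -> int:
    """Estimate how many days Googlebot needs to visit every indexed page once."""
    if pages_crawled_per_day <= 0:
        raise ValueError("pages_crawled_per_day must be positive")
    return math.ceil(indexed_pages / pages_crawled_per_day)


# Hypothetical figures: 5,000 indexed pages, 250 pages crawled per day.
print(days_to_crawl_site(5000, 250))
```

The higher this number, the longer it can take before a change on a page is picked up.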
Crawl budget optimization
To optimize a crawl budget, you will have to take a few steps. The first important factor is the speed of a website. While crawling, Google also renders the page. If this happens quickly, Googlebot can crawl a web page faster, which means the pages of a website can appear in the search engine sooner.
In addition to the speed of a web page, the number of links pointing to a page also matters. When a website has a lot of inbound links, Googlebot takes longer to process those links, so the amount of time it spends on your website adds up quickly.
Applying crawl budget optimization
You now have a little more insight into the crawl budget of a website and the factors that influence it. But how can you use this knowledge to optimize the crawl budget of your own website?
We have a few tips for you, of course! The goal is to generate as much organic traffic as possible by directing Google to the pages that have the most value. You can also ensure that less important pages are not crawled and indexed unnecessarily.
1. Think carefully about filters
When creating a website, it helps user-friendliness to set up clear categories. You can do this with the help of filters. These filters cause many different URLs to be generated.
Filters can also be combined with each other, which creates even more URLs. A filter is usually applied through a URL parameter, and Google sometimes considers a parameter not relevant enough. If this happens, Google will not crawl those URLs. To give you a better idea, here are a few examples.
If you visit an e-commerce website and want to buy something, you can search by price category. This is of course very useful for a visitor, but it creates a lot of extra URLs. If you divide the price range into eight categories, eight different URLs are created that are not that interesting for Googlebot.
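How quickly combined filters multiply can be shown with a short sketch. The filter names, values and base URL below are hypothetical. The code simply builds every URL that results from applying any combination of three filters to one category page.

```python
from itertools import product

# Hypothetical filters on one category page; None means "filter not applied".
filters = {
    "price": [None] + [f"range-{i}" for i in range(1, 9)],  # eight price categories
    "color": [None, "black", "white", "red", "blue", "green"],
    "size": [None, "38", "39", "40", "41", "42", "43"],
}


def filtered_urls(base: str, filters: dict) -> list:
    """Build every URL that combining the filters can produce."""
    urls = []
    for combo in product(*filters.values()):
        params = [
            f"{name}={value}"
            for name, value in zip(filters, combo)
            if value is not None
        ]
        if params:  # skip the unfiltered category page itself
            urls.append(f"{base}?{'&'.join(params)}")
    return urls


urls = filtered_urls("https://website.nl/shoes", filters)
print(len(urls))
```

With eight price categories, five colors and six sizes, a single category page already yields several hundred crawlable URLs.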
An internal search function on a website can also create many URLs that are not relevant. It is best to keep these pages out of the index, because they add nothing. In addition, there is a high probability that these search results do not match the category pages on the website.
You might wonder how much damage these URLs can really do, since only a limited number of them seem to be generated. Yet it is not as simple as it seems. Here we show you an example of our crawl budget optimization strategy applied to one of our clients.
This example shows that the search function on a website caused a huge number of pages to be indexed that added no value at all. By excluding these pages, the focus shifts to the pages that should receive attention.
This can even affect your position in the Google rankings, because Googlebot spends more time on the most important pages. There are a few ways to exclude the URLs generated by a filter. The first is to add the following line to robots.txt: Disallow: /*?search
Simply put, this rule tells Googlebot to skip the URLs that contain ?search. Yet this does not solve the biggest problem, because URLs that are already indexed can still appear in the search results.
To remove those URLs from the search results, first give the pages a noindex (a robots meta tag or an X-Robots-Tag header), wait until Google has recrawled them, and only then add the Disallow rule. Google no longer supports a noindex line inside robots.txt itself. You can also add nofollow to the links that point to a filter URL, which asks Google not to follow those links.
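Before publishing a robots.txt rule, it can help to check which URLs it would block. The sketch below is a simplified matcher, not Google's implementation. It assumes the rule is written as Disallow: /*?search and that * matches any sequence of characters while a trailing $ anchors the end of the URL. The function name and the sample paths are hypothetical.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query string."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which mirrors robots.txt prefix matching.
    return re.match(regex, path) is not None


for path in ["/shoes?search=red", "/shoes?color=red", "/shoes"]:
    print(path, "blocked" if rule_matches("/*?search", path) else "allowed")
```

Only the first path is reported as blocked, because it is the only one that contains ?search.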
2. Avoid duplicate content
The creation of duplicate content also has a great influence. Duplicate content means that a page has almost exactly the same content as another page. There are a few types of duplicate content, and we will discuss one of the most common ones. We mainly see this form of duplicate content at web shops.
For example, if you want to buy shoes, several categories may lead to a particular pair of shoes. Each of these categories has its own URL. This creates duplicate content, because multiple URLs contain exactly the same information.
Here too, people often underestimate how many extra URLs are created, all of them with duplicate content. Google can therefore take up to twice as long to crawl through all pages. Of course we also have a trick for you: to find out whether you offer a product via more than one URL, enter the following search term:
site:webshop.nl "product name"
So if more than one URL appears, you know what to do!
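If you have a crawl of your own shop, you can also look for duplicates yourself. The following is a minimal sketch under stated assumptions: the dictionary of URLs and page texts is hypothetical, and the check only finds pages whose text is identical after normalizing whitespace and case, so near-duplicates are not detected.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages: dict) -> list:
    """Group URLs whose page text is identical after whitespace normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only the groups that contain more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl result: URL -> extracted page text.
pages = {
    "https://webshop.nl/men/shoes/runner-x": "Runner X  running shoe, size 38-46",
    "https://webshop.nl/sale/runner-x": "Runner X running shoe, size 38-46",
    "https://webshop.nl/men/shoes/trail-y": "Trail Y hiking shoe, size 38-46",
}
print(find_duplicates(pages))
```

Here the two Runner X URLs end up in one group, which signals that the same product is offered under two addresses.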
3. Avoid 404s and redirects
Maintaining a website is very important. It is very annoying for a visitor when a 404 page appears. Googlebot also counts this against you, so 404 pages will not benefit the crawl budget.
Do you want to know whether you have 404s on your website? You can use Search Console again. Under the topic ‘crawling’, click on the heading ‘crawl errors’ and you will find the 404s under ‘not found’. There are two ways to fix the 404s.
First, you could change the internal link so that it points to a working page. Our advice is to choose this option if possible. The second option is to use a 301 redirect. Both options lead the visitor to a working page. Nevertheless, we recommend that you avoid internal redirects as much as possible. This way Googlebot does not need an extra request, and therefore a longer loading time, to reach a page.
You will generally have to be very careful with redirects. Of course you do not want a redirect chain to arise. Redirects that are linked together can negatively affect the crawl budget. An example of a redirect chain is a URL that redirects to a second URL, which in turn redirects to a third.
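A redirect chain can be sketched with a simple lookup table. The paths below are hypothetical, and the function only follows a redirect map held in memory, it does not make any HTTP requests. It returns every URL visited, so the number of hops is visible at a glance.

```python
def resolve_redirects(url: str, redirects: dict, max_hops: int = 10) -> list:
    """Follow a redirect map and return every URL visited, starting with `url`."""
    chain = [url]
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in chain or len(chain) > max_hops:
            raise ValueError(f"redirect loop or chain too long: {chain + [target]}")
        chain.append(target)
    return chain


# Hypothetical 301 redirects: old URL -> new URL.
redirects = {
    "/old-page": "/new-page",
    "/new-page": "/newest-page",
}

chain = resolve_redirects("/old-page", redirects)
print(" -> ".join(chain))  # /old-page -> /new-page -> /newest-page
print(len(chain) - 1)      # 2 hops
```

Internal links that still point to /old-page could then be updated to the last entry of the chain, so that Googlebot reaches the final page in a single request.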
4. Analyze server logs
One of the best ways to research crawl budget is server log analysis. In this analysis you find out whether Googlebot visits the right pages, in other words the pages that are most relevant.
It is best to use a program like Screaming Frog. These types of tools give you a concise summary of all the information that matters in your server logs. For example, you can see which user agent has crawled, as well as the status code, the response time, the number of crawls and the number of bytes downloaded.
Performing a server log analysis is not easy and it can take a while to get the hang of it. Still, investing the time certainly helps, because you can gather a great deal of information about all the bots crawling your pages.
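To get a feel for what such a tool does, here is a minimal sketch of a log analysis. It assumes the server writes the common "combined" access log format. The sample lines, IP addresses and function name are hypothetical. It counts how often a user agent that mentions Googlebot requested each URL, and with which status code. A user agent string can be spoofed, so this is only a first filter.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about the server setup).
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def googlebot_hits(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match["agent"]:
            hits[(match["path"], int(match["status"]))] += 1
    return hits


# Hypothetical log lines.
sample = [
    f'192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [10/Oct/2023:13:55:40 +0000] "GET /shoes?search=red HTTP/1.1" 200 4096 "-" "{GOOGLEBOT}"',
    f'192.0.2.10 - - [10/Oct/2023:13:56:02 +0000] "GET /old-page HTTP/1.1" 404 512 "-" "{GOOGLEBOT}"',
    f'198.51.100.7 - - [10/Oct/2023:13:56:10 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "{BROWSER}"',
    f'192.0.2.10 - - [11/Oct/2023:09:12:01 +0000] "GET /shoes HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
]

for (path, status), count in googlebot_hits(sample).most_common():
    print(count, status, path)
```

In a report like this, crawled search URLs and 404s stand out immediately as places where crawl budget is being spent on the wrong pages.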
Now that you know more about the crawl budget of a website and how to optimize it, you can get started yourself! Applying these tips not only helps you achieve an optimal crawl budget, it also makes your website more user-friendly. Who does not want a fast-loading page and clear navigation links? Investing in links to your page will also benefit your website.