Crawling VS Indexing: Everything You Need To Know

Dec 6, 2023

5 min read

By Luca Da Corte

Differences between crawling and indexing

Introduction

If you know a thing or two about websites, you’ve probably already heard about crawling and indexing.

These terms are often used interchangeably, although they are different actions.

In this article, I’ll help you understand what each of these terms actually means and how they fit into the bigger picture of getting your website into search results.

The Big Picture

Whenever somebody publishes a website (or a new web page, for that matter), two things must happen before you see it displayed on Search Engine Results Pages (SERPs):

Google has to discover and learn about it (crawling)
Google has to add the website to its database (indexing)

Both of these steps must happen in order for the website to appear on search engines. There’s no way around it.

So, let’s look at each step in more detail.

NOTE: While I’ll often use the term “Google” in this article, the same general process applies to other search engines too.

Search engines first crawl your website and then index it

How does crawling work?

As we’ve seen, crawling is how Google discovers and learns about a page. In practice, this means downloading new or updated content found on the internet.

But, how does it work?

The first step that Google has to go through is to find out if there are new or updated pages to add to its list of known pages. This process is called “URL Discovery”.

While there are a handful of ways Google can find new pages, the most common one is following links on pages it already knows.

Example:

Let’s say that I publish a new blog article. Google doesn’t know about it.

However, it does know about the existence of my website. Therefore, it will periodically scan my website in search of new or updated content.

The next time it scans my blog hub, it will find a link that leads to the new article.

How search engines (like Google) find and crawl new pages
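To make the blog hub example more concrete, here is a minimal Python sketch of link-based URL discovery. It is only an illustration of the idea, not how Googlebot is actually implemented. The `example.com` URLs, the `LinkCollector` class, and the `discover_new_urls` helper are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.add(urljoin(self.base_url, value))


def discover_new_urls(page_url, html, known_urls):
    """Return the links found on the page that are not in known_urls."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links - set(known_urls)


# Hypothetical blog hub and URLs, made up for illustration.
hub_url = "https://example.com/blog/"
hub_html = """
<a href="/blog/old-article">Old article</a>
<a href="/blog/new-article">New article</a>
"""
known = {"https://example.com/blog/", "https://example.com/blog/old-article"}

new_urls = discover_new_urls(hub_url, hub_html, known)
print(new_urls)
```

Here the only URL left over is the new article, because the other link is already in the list of known pages.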

Once Google has discovered this new URL, the true crawling process begins.

In order to crawl a website, search engines like Google use automated bots called “web crawlers”.

These web crawlers (also known as bots, spiders, or spiderbots) download the page and render it, then try to learn what the page is about.

Rendering is fundamental because pages often rely on JavaScript code to fill the page with content.

If you’re not familiar with the term “rendering”, think of it as the same process your browser uses to display a page’s content on your screen.

However, it’s important to note that Google will only crawl pages that are publicly accessible. If a page requires a login or has been disallowed for crawling by the site owner, Googlebot will not crawl it.

Now that the page has been crawled, it’s time to index it.

How does indexing work?

After a page has been crawled, Google will try to understand what it’s about.

Google will analyse the page’s HTML code to find as much information as possible about the page’s meaning.

Once Google has finished analysing your page, it will add it to a massive index of hundreds of billions of webpages.

Once search engines have crawled your webpages, they will add them to their index.
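If you’re curious what an “index” can look like in code, here is a toy sketch of an inverted index, a common data structure that maps each word to the pages containing it. This is an assumption-heavy simplification for illustration only: Google’s real index is far more sophisticated, and the pages, URLs, and function names below are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages (URL -> extracted text).
pages = {
    "https://example.com/crawling": "How search engines crawl new pages",
    "https://example.com/indexing": "How search engines index crawled pages",
}
index = build_index(pages)
print(search(index, "crawl pages"))
```

The point of the sketch is that a page can only be returned by `search` if it was added to the index first, which mirrors why an unindexed page cannot show up on SERPs.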

Mobile-first Indexing

When talking about indexing, there’s another piece to the puzzle that you should be aware of.

And that is Mobile-first indexing.

Mobile-first indexing basically means that Google will use the mobile version of your site’s content for indexing and ranking.

Which means:

If your site isn’t optimised for mobile, it could negatively impact how it performs on SERPs, or even prevent it from being indexed.

To check whether your website is optimised for both desktop and mobile, you can run a PageSpeed Insights test.

Apart from telling you if your website is mobile-friendly, it will also give you valuable insights on:

Performance
Accessibility
Best practices
SEO

How to check if Google is indexing your website

If you can’t seem to find your webpage on Google no matter what you type, it might be that your webpage is not being indexed by Google.

However, let’s get this clear. Not appearing on the first page of search results does not mean your website is not indexed.

The fastest way to check if your webpage is being indexed is to run a search for the following query: site:URL

For example, if I wanted to check whether Google is indexing my website homepage, I would have to write: site:dacorte.dev

If there is at least one result, good news! Google is indexing your website.

If that’s not the case, you might want to check out this guide.

Check if your website is being indexed by searching for site:URL
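If you want to run this check for several sites, a small Python sketch like the one below can build the search URL for you to open in a browser. The `site_query_url` helper is a name I made up, and the sketch only builds the URL: it doesn’t fetch or parse any results.

```python
from urllib.parse import urlencode


def site_query_url(site):
    """Build a Google search URL for the query site:<site>."""
    return "https://www.google.com/search?" + urlencode({"q": f"site:{site}"})


url = site_query_url("dacorte.dev")
print(url)
```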

FAQs

Q1 - Does Indexing Influence SEO?

Yes, indexing has a massive influence on SEO. So much so that, if your webpage is not being indexed, nobody will ever be able to find it on search results.

Q2 - What’s The Difference Between Crawling And Indexing?

Crawling is the process of search engines navigating and analysing web content, while indexing involves adding web pages to their databases for future retrieval.

Q3 - What Is A Sitemap?

A sitemap is, as the name suggests, a map of your site. It’s an XML file that provides information about the pages of your website and the relationships between them.
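As a rough illustration, here is a Python sketch that generates a minimal sitemap using the standard sitemap namespace. The `build_sitemap` helper and the `example.com` URLs are assumptions made up for this example, and a real sitemap can carry optional fields (such as a last-modified date) that this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, made up for illustration.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/new-article",
])
print(sitemap_xml)
```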

Q4 - Do I Need A Sitemap?

While having a sitemap is not strictly required, it’s still considered a best practice. It might be especially useful if your website is large, has few external links, or is poorly organised.

Q5 - What’s A Robots.txt File?

A robots.txt file is a set of instructions used by websites to tell web crawlers (like Google’s) which pages should and shouldn’t be crawled.
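To see how these instructions are read, here is a short Python sketch that uses the standard library’s `urllib.robotparser` to check two URLs against a sample robots.txt. The file content and the `example.com` URLs are hypothetical, and Python’s parser is only an approximation of how a search engine’s crawler interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every crawler from /admin/.
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a crawler named "Googlebot" may fetch each URL.
article_allowed = parser.can_fetch("Googlebot", "https://example.com/blog/new-article")
admin_allowed = parser.can_fetch("Googlebot", "https://example.com/admin/settings")

print(article_allowed)
print(admin_allowed)
```

With this sample file, the blog article is allowed and the admin page is not.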

Conclusion

In this article, we’ve discussed the differences between crawling and indexing.

Thank you for taking the time to read the whole article. I hope you found it helpful.

If you need any assistance with your website or SEO campaign, please feel free to reach out with any questions.

If you’ve found this article helpful, I kindly invite you to share it with a friend who might benefit from it!