What's my plan?
Add-on AI agents - Advanced
This article applies only if your Zendesk account was created before March 10, 2026. Otherwise, see Connecting knowledge sources to power generative replies in AI agents.

You can use a web crawler to import content into your AI agent. This gives your AI agent the ability to create AI-generated answers to customer questions based on information in external websites.

This article gives you some best practices for using a web crawler to import content for an AI agent.

This article contains the following topics:

  • Use the web crawler on the right type of sites
  • Limit reimports to a reasonable frequency
  • Keep the overall number of knowledge sources low
  • Check the import summary
  • Start small and test

Related articles:

  • Troubleshooting issues with web crawler imports for AI agents
  • Managing imported knowledge sources for AI agents

Use the web crawler on the right type of sites

The web crawler is best suited for websites that function as help centers or product description pages. For e-commerce pages, we recommend building an integration capable of retrieving relevant product information and adding that information in a dialogue or procedure.

We recommend using a Zendesk help center as your primary knowledge source. External websites can take any format, including dynamic elements and JavaScript, which makes them much harder to ingest predictably. While the web crawler has powerful configuration options, these require enablement and practice. Zendesk help centers are, by nature, simpler and more predictable in format, which leads to better results. Imports are also generally faster when using a Zendesk help center.

Only publicly accessible websites can be crawled. If a website requires authentication, the web crawler can't access it.
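
Before setting up a crawl, it can help to confirm that a page is reachable without signing in. The following is a minimal sketch, not part of the web crawler itself: the function names, the verdict labels, and the User-Agent string are all assumptions made for illustration. It requests a URL anonymously and classifies the HTTP status.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough verdict (labels are illustrative)."""
    if status in (401, 403, 407):
        # Typically means credentials are required, or anonymous clients are blocked.
        return "needs-auth-or-blocked"
    if 200 <= status < 300:
        return "public"
    return "other"


def probe(url: str, timeout: float = 10.0) -> str:
    """Fetch url with no cookies or credentials and classify the response."""
    request = Request(url, headers={"User-Agent": "crawl-precheck/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except HTTPError as err:  # must come before URLError (it is a subclass)
        return classify_status(err.code)
    except URLError:
        return "unreachable"
```

A "public" verdict is only a hint. Some sites answer with a successful status while redirecting anonymous visitors to a login page, so it is still worth opening the page in a private browser window.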

Limit reimports to a reasonable frequency

Imports aren't a real-time web search. The AI agent doesn't search live data in a help center, file, or website. Rather, the information is imported into the AI agent on a one-time or recurring basis. The AI agent uses this imported information when generating its replies.

Daily imports aren’t recommended unless the knowledge source is updated very frequently. For most organizations, a weekly or monthly cadence is fine. Remember that you can always manually reimport if new changes need to be reflected outside the scheduled reimport.

Keep the overall number of knowledge sources low

You can add multiple knowledge sources to a single AI agent, including multiple web crawls. Nevertheless, it’s recommended to keep the overall number of knowledge sources within a reasonable limit. In some cases, having lots of sources can lead to reduced accuracy and increased latency.

Check the import summary

If you have a successful crawl but encounter other issues (for example, the AI agent’s answers are incomplete or poor), you can review the import summary to check whether all expected URLs and content were imported. This is the first and best way to understand what’s been imported and what to troubleshoot after import.
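
If you keep a list of the URLs you expect to be imported, comparing it against the URLs shown in the import summary makes gaps easy to spot. The sketch below assumes you have copied both lists into Python by hand; the function names and the normalization rules (ignoring case in the host, trailing slashes, and fragments) are illustrative choices, not something the import summary provides.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def diff_import(expected: list[str], imported: list[str]) -> dict[str, list[str]]:
    """Return URLs that were expected but not imported, and the reverse."""
    expected_set = {normalize(u) for u in expected}
    imported_set = {normalize(u) for u in imported}
    return {
        "missing": sorted(expected_set - imported_set),
        "unexpected": sorted(imported_set - expected_set),
    }


# Hypothetical example data.
expected = ["https://example.com/help/a/", "https://example.com/help/b"]
imported = ["https://EXAMPLE.com/help/a", "https://example.com/help/c#top"]
report = diff_import(expected, imported)
```

URLs under "missing" are the first candidates to troubleshoot. URLs under "unexpected" may indicate that the crawl scope is wider than intended.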

Start small and test

If you want to check whether content is crawled correctly and your pages follow a specific pattern, the fastest approach is to restrict your crawl to one or two examples of those pages. For example, set the Start URL to one target page and the Max crawling depth to zero. Alternatively, set Max pages to crawl to a low number that can be processed quickly.
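
To see how these settings narrow a crawl, here is a small simulation over an in-memory link graph. It is a sketch under stated assumptions and not the web crawler's actual implementation: it assumes a breadth-first traversal in which the Start URL sits at depth zero, and the site map, function name, and parameter names are made up for illustration.

```python
from collections import deque


def simulate_crawl(
    links: dict[str, list[str]], start_url: str, max_depth: int, max_pages: int
) -> list[str]:
    """Breadth-first walk of links, bounded by depth and by page count."""
    visited: list[str] = []
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


# Hypothetical site: each key is a page, each value lists the pages it links to.
SITE = {
    "/help": ["/help/a", "/help/b"],
    "/help/a": ["/help/a/1"],
    "/help/b": [],
}

single_page = simulate_crawl(SITE, "/help/a", max_depth=0, max_pages=100)
capped = simulate_crawl(SITE, "/help", max_depth=5, max_pages=2)
```

In this simulation, a depth of zero visits only the start page, while a low page cap stops the crawl early even when the depth limit would allow more.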

What is a web crawler? What does a web crawler do? Why use a web crawler for AI agents or chatbots?

A web crawler imports content from external websites into your AI agent or chatbot. The bot uses this external information to create answers to customer questions. Imports run on a one-time or recurring schedule instead of a real-time web search.

Can I crawl a website that requires a login? Can the web crawler access authenticated pages?

No. The web crawler accesses public websites. It cannot access sites that require authentication.

Can I add multiple knowledge sources or help articles to one AI agent? Can chatbots use more than one web crawl?

Yes. You can add multiple knowledge sources, including multiple web crawls, to a single AI agent or chatbot.

Can I manually reimport a website? Do I need to wait for the scheduled reimport?

Yes. You can manually reimport information to reflect new changes outside of the scheduled timeframe.

Who can use the web crawler? Are there date restrictions for web crawler imports?

This feature applies to Zendesk accounts created before March 10, 2026.

Are there limits to the number of knowledge sources? How many web crawls should I use?

Keep the overall number of knowledge sources low. In some cases, a high number of sources can reduce accuracy and increase latency.

What types of websites work best for the web crawler? Should I crawl a Zendesk help center or knowledge base?

Crawl websites that function as help centers, knowledge bases, or product description pages. Use a Zendesk help center as the primary knowledge source because its simple format allows for faster, more predictable imports. For e-commerce pages, build an integration that retrieves the relevant product information instead of crawling. Websites with dynamic elements and JavaScript are harder to ingest predictably.

How often should I schedule web crawler imports? Should I use daily imports?

Set imports to a weekly or monthly cadence. Avoid daily imports unless you update the knowledge source frequently.

How do I test the web crawler? What is the best way to start a web crawl?

Start small. Restrict the crawl to one or two example pages to test a specific pattern. Set the Start URL to one target page and the Max crawling depth to zero, or set the Max pages to crawl to a low number.

How do I troubleshoot poor chatbot answers? Why are AI agent answers incomplete after a successful crawl?

Review the import summary. This summary shows whether the crawler imported all expected URLs and content, which helps you identify what to troubleshoot.
