SEMRUSH crawl returns 403 errors on Zendesk links



Posted on December 01, 2021

We use SEMRUSH to crawl and audit our website, but for some reason the crawl fails on links that point to Zendesk. Links such as https://help.tokyotreat.com/hc/en-us, which work fine when opened in a browser, fail on SEMRUSH crawls.

Is there any fix for this?

Thanks.
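
One way to narrow down a symptom like this is to request the same page with a browser-like User-Agent and with a crawler-like one, then compare the status codes. The sketch below is only illustrative: `fetch_status`, `diagnose`, and both User-Agent strings are hypothetical names and values, not anything published by SEMRUSH or Zendesk. It also assumes the block keys on the User-Agent header; if the filtering is based on the crawler's IP addresses, a test from your own machine may not reproduce the 403.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent strings for illustration only; a real crawler
# publishes its own string, which should be substituted here.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"
CRAWLER_UA = "Mozilla/5.0 (compatible; ExampleCrawler/1.0)"


def fetch_status(url, user_agent, timeout=10.0):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


def diagnose(browser_status, crawler_status):
    """Summarise what a pair of status codes suggests."""
    if crawler_status == 403 and browser_status == 200:
        return "crawler-specific block"
    if crawler_status == 403 and browser_status == 403:
        return "blocked for every client tested"
    if crawler_status == browser_status:
        return "no difference observed"
    return "inconclusive"
```

For example, `diagnose(fetch_status(url, BROWSER_UA), fetch_status(url, CRAWLER_UA))` returning "crawler-specific block" would point at bot filtering rather than at the page itself.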


22 comments

Hi 1263082153329 5895537231642 - it looks like Ahrefs has updated the IP ranges for its crawl bot (see article: https://help.ahrefs.com/en/articles/78658-what-is-the-list-of-your-ip-ranges). Because of this, we are getting 403 errors like those reported by the OP. From what I've read in this thread, your team should be able to resolve this?
Can I request that the allowed ranges be updated so the Ahrefs crawl bot can access Zendesk pages?
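
To illustrate why a changed IP list can produce 403s, here is a minimal sketch of an allowlist check using Python's standard `ipaddress` module. The function name `ip_in_ranges` and the ranges shown are assumptions for illustration (they are reserved documentation ranges, not Ahrefs' real ones); how Zendesk or Cloudflare actually match crawler IPs is not described in this thread.

```python
import ipaddress

# Placeholder allowlist using reserved documentation ranges, NOT real
# crawler ranges. A crawler's published list would replace these.
ALLOWED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip, cidr_ranges):
    """Return True if `ip` falls inside any of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False
```

If a crawler starts sending requests from an address outside the stored list, `ip_in_ranges` returns False for it, which is the kind of mismatch that would cause a previously working crawl to be rejected until the list is refreshed.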

1


Hello everyone,
 
Thank you for taking the time to provide us with your feedback here. I want to note that this has been logged for our PM team to review. For others who may be interested in this feature request, please add your support by upvoting this post and/or adding your use case to the comments below. If you have a new feedback request that is unrelated to this one, please create a new feedback post so we can log your requests separately. Thank you again!

0


I have what I believe is a similar and related issue to the one described here. The short version: we are getting 403 errors when our web crawler attempts to index our Zendesk Guide site articles. More information:

We use Lucidworks Fusion to index our website and provide search results to our customers. We are moving our support documentation to Zendesk, but we also want users searching on our website to discover our Zendesk documentation. 

Fusion allows us to add multiple datasources/websites to a collection to be searched together. Great! Tests with other websites showed this would work. Initially I did get this to work with our Zendesk site on 4/18; Fusion indexed it after some initial difficulties. I changed one setting on 4/19 and then the crawl failed. I changed it back and it failed again. I've gone through Fusion's settings all weekend, trying many different settings to no avail.

Then I thought to come here and search on what would cause Zendesk to return 403 errors. This post seems very closely related. What is really confusing is having it work once, and then having it fail. Is it possible Zendesk/Cloudflare has blocked our web crawler? Is there a way to authorize our web crawler? Discovery of our documentation via searching our website is critical to the support of our customers. I've submitted a ticket to Zendesk support, but thought I'd also post here. 

0


So, Eric Nelson Gorka Cardona-Lauridsen, we're in much the same spot. We offer services for clients that involve crawling their website content, with full consent from the clients.

Yet Cloudflare blocks our crawler just the same. And since we don't crawl sites in general, only on demand, we don't meet Cloudflare's Good Bot requirement of at least 1000 requests/day on their network.

So, how can we solve this? As mentioned, the client has given full consent and all we want is to perform a normal operation. How?

Thanks - Martin

0


No we are not - it seems it fixed itself :)

0


Koen Doodeman Henrique Vilela Thanks for reporting this. We are working on finding the issue. Are you still experiencing it?

0


Hello guys, any updates on this? We are also getting tons of 403 errors from Semrush. Thanks!

0


Hey hey!

Sorry for digging up this old thread, but this issue seems to have resurfaced for ahrefs as of March 27th. That is, everything worked fine on March 26th, and on the 27th our audit led to a score of 1 because all pages responded with 403 errors. I reached out to ahrefs first, and they believe it to be an issue with the site rejecting the crawler. The bots they specifically said should be allowed are AhrefsBot and the AhrefsSiteAudit bot. What can we do to resolve this?

3


Hey All,

I just wanted to say that we've been working with Cloudflare on the Semrush issues you have all reported. The Cloudflare team has been working to adjust their bot management functionality to allow the Semrush bot through. There have already been a handful of changes rolled out and a final update will be going out on Monday that should alleviate all the remaining issues we're seeing. If you're still encountering issues after end of business on Monday, please let us know and we'll continue to work with Cloudflare on this. 

Thanks!

0


Greg, your explanation is helpful and makes sense as a way to help protect against malicious attacks, but pushing the outreach to SEMrush (or other major bots) onto customers seems like a weak response from the Zendesk product team. There are only a few handfuls of major legitimate bots that crawl sites, and these can be easily found. As an example, here is a list from 2017 that includes SEMrushbot: https://www.imperva.com/blog/most-active-good-bots/

While not all edge cases can be anticipated, it should be part of Zendesk's product strategy to make sure that its product works well with other major systems and tools used by customers. There are many more tools used by smaller sites/companies, but SEMrush would be on the list for many of the medium-to-larger size Zendesk customers. As part of product discovery, I might suggest the Zendesk product manager read What is Enterprise SEO? for a list of a few others in addition to the list shared above.

It is great that the Zendesk engineering team has identified that Zendesk is causing errors in the SEO/marketing tools used by a large percentage of its customers. However, the Zendesk product managers should be doing the work to make sure that Zendesk plays well with other tools, not pushing that work onto customers. Given the prominence of SEMrush in the market, it is equally likely that the Zendesk product/engineering team has missed something in their protection implementation with Cloudflare. Zendesk product managers should easily be able to find contact information for their counterparts at SEMrush, ahrefs, Cloudflare, etc. to make sure they fully understand the situation with other major platforms and to find a resolution.

3

