Overview
Nosto's image crawler is a feature that fetches product images directly from your store as an alternative to the product API. If you're experiencing sync issues with the product API, you can enable the crawler by contacting Nosto support.
For the crawler to work, it must be able to access your product images and pages. If your store or CDN blocks unknown bots, you may need to whitelist Nosto's crawler. This article explains how to unblock Nosto's IP addresses and User Agent so the crawler can function properly.
Whitelisting
Depending on what is being blocked, you may need to whitelist one or both of the following:
IP Addresses — for product image access
If Nosto's crawler is blocked from fetching **product images**, whitelist Nosto's crawler IPs at whatever layer sits in front of your image URLs (CDN, load balancer, or web server; see below). The platform examples in this article use 18.209.181.40, 34.233.200.247 and 34.238.228.86.
Always fetch the authoritative list from:
https://api.nosto.com/meta → "crawler" property
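To keep an allowlist in sync with the authoritative list, the lookup can be scripted. The following is a minimal Python sketch, not an official client. The exact shape of the data under the `crawler` property is not described in this article, so the helper makes no assumption about it and simply collects every string under that property that parses as an IP address or CIDR range. All function names are illustrative.

```python
import ipaddress
import json
import urllib.request

META_URL = "https://api.nosto.com/meta"


def collect_ips(node):
    """Recursively collect strings that parse as IP addresses or CIDR ranges."""
    found = []
    if isinstance(node, str):
        text = node.strip()
        try:
            ipaddress.ip_network(text, strict=False)
        except ValueError:
            return found  # not an IP or CIDR string, ignore it
        found.append(text)
    elif isinstance(node, dict):
        for value in node.values():
            found.extend(collect_ips(value))
    elif isinstance(node, (list, tuple)):
        for item in node:
            found.extend(collect_ips(item))
    return found


def crawler_ips(meta):
    """Return the IPs found under the "crawler" property of the meta document."""
    return collect_ips(meta.get("crawler", []))


def fetch_crawler_ips(url=META_URL, timeout=10):
    """Download the meta document and extract the crawler IPs (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return crawler_ips(json.load(response))
```

Calling `fetch_crawler_ips()` from a scheduled job and comparing the result with your current allowlist is one way to notice when the list changes.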
User Agent — for product page access
If you need to allow Nosto's crawler to access product pages directly (similar to how you'd whitelist a search engine spider like Googlebot), use the following User Agent string to identify it:
Mozilla/5.0 (compatible; NostoCrawlerBot/1.0; +http://my.nosto.com/tagging)
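Before and after changing a User Agent rule, it can help to see how your store answers a request that carries this string. The sketch below is an illustrative Python probe with hypothetical function names. It runs from your own machine, so it only exercises User Agent based rules and says nothing about IP-based blocks.

```python
import urllib.error
import urllib.request

NOSTO_USER_AGENT = (
    "Mozilla/5.0 (compatible; NostoCrawlerBot/1.0; +http://my.nosto.com/tagging)"
)


def build_probe(url):
    """Build a GET request that carries the Nosto crawler User Agent."""
    return urllib.request.Request(url, headers={"User-Agent": NOSTO_USER_AGENT})


def probe_status(url, timeout=10):
    """Return the HTTP status a page gives to the crawler User Agent (needs network)."""
    try:
        with urllib.request.urlopen(build_probe(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # Error responses such as 401 or 403 arrive as exceptions; report the code
        return error.code
```

A 200 for a product page suggests the User Agent is allowed through; a 403 or a challenge page suggests a bot rule is still matching.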
Where to whitelist (IP addresses)
The request from Nosto's crawler travels through your network stack from the outside in. You need to whitelist at whichever layer is actually enforcing the block — typically the **first** one that applies access controls:
Internet → CDN / DDoS protection (Cloudflare, Fastly, Akamai)
→ Load Balancer (AWS ALB, nginx upstream)
→ Web Server (nginx, Apache)
→ Application
If you're unsure where the block is happening, start at the outermost layer (usually your CDN or DDoS protection) and work inward.
Whitelisting by platform
Cloudflare
1. Go to Security → WAF → Custom Rules and create a new rule.
2. Match on the Nosto crawler IPs:
ip.src in {18.209.181.40 34.233.200.247 34.238.228.86}
3. Set the action to Allow / Skip all remaining custom rules.
4. Place this rule **above** any bot-blocking or rate-limiting rules.
> If you have **Bot Fight Mode** or **Super Bot Fight Mode** enabled, the WAF custom rule above is still required — NostoCrawlerBot is not in Cloudflare's verified bot list and will be challenged without it.
If you need to allow Nosto to access product pages instead, create a separate rule matching the User Agent:
http.user_agent contains "NostoCrawlerBot"
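If you maintain the IP list in a script, the rule expression from step 2 can be generated rather than typed by hand. This is a small Python sketch with a hypothetical helper name; it validates each entry and renders the same `ip.src in {...}` form shown above.

```python
import ipaddress


def cloudflare_ip_expression(ips):
    """Render an `ip.src in {...}` rule expression from a list of IP strings."""
    # ip_address() raises ValueError on malformed input, so typos fail early
    validated = [str(ipaddress.ip_address(ip)) for ip in ips]
    return "ip.src in {" + " ".join(validated) + "}"


print(cloudflare_ip_expression(["18.209.181.40", "34.233.200.247", "34.238.228.86"]))
```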
nginx
Use `satisfy any` to let the request through if *either* the IP matches *or* valid credentials are provided:
```nginx
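location ~* \.(jpg|jpeg|png|gif|webp|svg)$ {
    # Grant access if EITHER the IP check OR Basic Auth succeeds
    satisfy any;

    # Nosto crawler IPs
    allow 18.209.181.40;
    allow 34.233.200.247;
    allow 34.238.228.86;
    deny all;

    # Everyone else must authenticate
    auth_basic "Restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;
}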
```
Reload after changes: `sudo nginx -t && sudo systemctl reload nginx`
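The effect of `satisfy any` can be modeled in a few lines. The Python sketch below is only an illustration of the decision logic, not of how nginx implements it, and the function name is hypothetical: a request passes when the client IP is on the allow list or when valid credentials were supplied.

```python
import ipaddress

ALLOWED = [
    ipaddress.ip_network(ip)
    for ip in ("18.209.181.40", "34.233.200.247", "34.238.228.86")
]


def access_granted(client_ip, has_valid_credentials):
    """Model of `satisfy any`: pass if the IP check OR the auth check succeeds."""
    ip_ok = any(ipaddress.ip_address(client_ip) in net for net in ALLOWED)
    return ip_ok or has_valid_credentials
```

With this model, the crawler passes on its IP alone, while other visitors still need a username and password.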
Apache
```apache
<FilesMatch "\.(jpg|jpeg|png|gif|webp|svg)$">
    <RequireAny>
        Require ip 18.209.181.40
        Require ip 34.233.200.247
        Require ip 34.238.228.86
        Require valid-user
    </RequireAny>
</FilesMatch>
```
AWS WAF (ALB or CloudFront)
1. Go to AWS WAF → IP sets and create a set named NostoCrawlerIPs (IPv4):
```
18.209.181.40/32
34.233.200.247/32
34.238.228.86/32
```
2. In your Web ACL, add an IP set match rule with action Allow and set its priority above any blocking rules.
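The IP set above lists each address in CIDR notation with a `/32` suffix. If your source list contains bare addresses, a short Python sketch like the following (helper name hypothetical) can normalize them before you paste them into the IP set.

```python
import ipaddress


def to_cidr(entries):
    """Normalize IP strings to CIDR notation; a single IPv4 address becomes /32."""
    return [str(ipaddress.ip_network(entry, strict=False)) for entry in entries]


for cidr in to_cidr(["18.209.181.40", "34.233.200.247", "34.238.228.86"]):
    print(cidr)
```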
Fastly
```vcl
sub vcl_recv {
  if (req.http.X-Forwarded-For ~ "18\.209\.181\.40|34\.233\.200\.247|34\.238\.228\.86") {
    return(pass);
  }
}
```
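Note that the pattern in the VCL is an unanchored substring match, so a header value such as `118.209.181.40` would also match. If you replicate this check elsewhere in your stack, comparing whole addresses is stricter. The Python sketch below (function name hypothetical) splits an `X-Forwarded-For` value on commas and compares each entry exactly.

```python
import ipaddress

NOSTO_IPS = {
    ipaddress.ip_address(ip)
    for ip in ("18.209.181.40", "34.233.200.247", "34.238.228.86")
}


def forwarded_for_matches(header_value):
    """True if any comma-separated X-Forwarded-For entry equals a Nosto crawler IP."""
    for part in header_value.split(","):
        try:
            candidate = ipaddress.ip_address(part.strip())
        except ValueError:
            continue  # skip empty or malformed entries
        if candidate in NOSTO_IPS:
            return True
    return False
```

Keep in mind that `X-Forwarded-For` is a request header, so its contents depend on what the hops in front of this check chose to send.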
HTTP Basic Authentication
If your store or staging environment is protected with HTTP Basic Authentication, Nosto's crawler supports this natively. Contact Nosto Support and provide your credentials — they will configure them directly on the crawler.
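Before sharing credentials, you may want to confirm they work. How Nosto's crawler sends them is not detailed here; the Python sketch below assumes the standard HTTP Basic scheme and shows the `Authorization` header value that scheme produces for a username and password. The helper name is hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic Authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"
```

Sending a test request with this header to a protected product image should return 200 if the credentials are valid.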
