Is there a maximum page download size for your crawler?

Yes. The Janitor crawler processes only the first 1MB of each page it visits. Binary assets are generally ignored entirely. The crawler is reasonably intelligent: it follows only hyperlinks, rather than every URL that appears on the page. Most web pages are well below the 1MB threshold.
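
To make the two behaviours above concrete, here is a minimal Python sketch of how a crawler could cap the bytes it reads per page and collect only hyperlinks. It is an illustration under stated assumptions, not the Janitor crawler's actual code: the names `MAX_PAGE_BYTES`, `read_capped`, `HyperlinkCollector` and `extract_hyperlinks` are hypothetical, "1MB" is assumed to mean 1,048,576 bytes, and "hyperlinks" is assumed to mean `href` values on `<a>` tags.

```python
import io
from html.parser import HTMLParser

# Assumption: "1MB" is taken as 1024 * 1024 bytes; the FAQ does not say which definition applies.
MAX_PAGE_BYTES = 1024 * 1024


def read_capped(stream, limit=MAX_PAGE_BYTES):
    """Read at most `limit` bytes from a binary stream and ignore the rest."""
    chunks = []
    remaining = limit
    while remaining > 0:
        chunk = stream.read(min(8192, remaining))
        if not chunk:  # end of stream reached before the cap
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class HyperlinkCollector(HTMLParser):
    """Collect href values from <a> tags only, skipping src= and other URLs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_hyperlinks(body):
    """Return the hyperlinks found in the (possibly truncated) page body."""
    parser = HyperlinkCollector()
    parser.feed(body.decode("utf-8", errors="replace"))
    return parser.links


# Hypothetical oversized page: one hyperlink, one image URL, then padding.
page = b'<a href="/about">About</a><img src="/logo.png">' + b" " * (2 * MAX_PAGE_BYTES)
body = read_capped(io.BytesIO(page))
print(len(body) == MAX_PAGE_BYTES, extract_hyperlinks(body))  # True ['/about']
```

In this sketch the image URL is never followed because only `<a href>` values are collected, and any link that appears after the first 1MB of a page would simply not be seen.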