- May 17, 2013
While robots.txt may allow crawling, that's only half of the equation. The other half is whether Google can actually crawl the data.
A lot of the data we see as end users is query based and has temporary URLs that can't be replicated by Google. Google needs a permanent or "hard" URL to crawl. That's why you rarely see a URL in a Google SERP with query-string syntax (might be the wrong vernacular here) like "?", "&" etc. Those seem to be a dead giveaway that the URL you're looking at is query generated and therefore "temporary," for lack of a better term.
Just FYI, URL parameters get crawled and indexed all the time. They're a huge problem for site performance because of all the content duplication they can cause, and that's part of the reason Google released rel="canonical" and gave webmasters the ability to configure URL parameter handling in Google Search Console. But Google absolutely can and does crawl and index parameterized URLs, often prolifically.
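To illustrate the duplication problem: the same page can be reachable under many parameter variants, and collapsing them to one canonical form is exactly what rel="canonical" signals to Google. Here's a minimal sketch in Python of that normalization idea; the parameter names (`utm_source`, `sessionid`, etc.) are hypothetical examples of tracking/session params, not anything Google-specific.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Hypothetical tracking/session parameters that create duplicate URLs
# without changing the page content (names are illustrative only).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def canonicalize(url: str) -> str:
    """Return a canonical form of `url`: drop tracking params,
    sort the rest so parameter order doesn't create duplicates."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k not in TRACKING_PARAMS]
    query = urlencode(sorted(kept))
    # Drop the fragment too; it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

# Two "different" URLs that are really the same page:
a = canonicalize("https://example.com/shop?page=2&utm_source=mail&sessionid=abc")
b = canonicalize("https://example.com/shop?page=2")
print(a == b)  # both collapse to https://example.com/shop?page=2
```

In practice you'd emit the canonical form in a `<link rel="canonical" href="...">` tag rather than redirecting, so crawlers consolidate signals onto one URL.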