DanLeibson
While robots.txt may allow crawling, that's only half of the equation. The other half is whether Google can actually crawl the data or not.
A lot of the data we see as end users is query-based and has temporary URLs that can't be replicated by Google. Google needs a permanent or "hard" URL to crawl. That's why you rarely see a URL in a Google SERP with programming syntax (might be the wrong vernacular here) such as "?", "&", etc. Those seem to be a dead giveaway that the URL you're looking at is query-generated and therefore "temporary", for lack of a better term.
Just FYI, URLs with parameters get crawled and indexed all the time. They are a huge problem for site performance because of all the duplicate content they can cause, and they are part of the reason Google released rel="canonical" and gave webmasters the ability to configure URL parameters in Google Search Console. But Google absolutely can and does crawl and index them, often prolifically.
I have seen URLs with strings of several parameters where each different ordering of the same parameters was indexed as a separate URL. It's not as uncommon as you'd think.
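To make the parameter-order problem concrete, here is a minimal Python sketch of one way a site might collapse reordered parameters into a single form before emitting a rel="canonical" URL. The function name normalize_query_order and the example.com URLs are made up for illustration. This is not how Google deduplicates URLs, just one way to spot the duplicates on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query_order(url: str) -> str:
    """Return the URL with its query parameters sorted by name.

    Hypothetical helper: it assumes parameter order does not change
    the page content, which is true for most sites but not all.
    """
    parts = urlsplit(url)
    # Keep blank values so "?a=&b=1" is not silently shortened.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Stable sort by name only, so repeated keys keep their relative order.
    params.sort(key=lambda pair: pair[0])
    return urlunsplit(parts._replace(query=urlencode(params)))


# Two hypothetical URLs that differ only in parameter order.
urls = [
    "https://example.com/shoes?color=red&size=9",
    "https://example.com/shoes?size=9&color=red",
]

# Both collapse to the same normalized URL.
canonical = {normalize_query_order(u) for u in urls}
print(canonical)
```

Running a crawl export through something like this and counting how many raw URLs map to each normalized URL is a quick way to see how much duplication the parameters are creating.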