Search site:govsitehere.com. If you see lots of pages of indexed content, then you know Google sees it as well.
Google will crawl https if robots.txt allows.
It's public data.
Even Tyson could see that my SLC office business registration is out of date (nice find, btw). That's the point. Who's "prominent"? A business with a city and state license since 1997, or a business with no license but a GMB page etc.? State records > data aggregators.
While robots.txt may allow crawling, it's only half of the equation. The other half is whether Google can actually crawl the data or not.
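The robots.txt half is at least easy to check. Here's a minimal sketch using Python's standard `urllib.robotparser`. The rules and URLs below are made up for illustration (they are not from any real SOS site), and the parser only tells you what the file permits, not whether Google actually reaches the page:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, pasted in as a string so no network call is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A permanent-looking page is allowed; the search endpoint is blocked.
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://govsitehere.com/business/12345"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://govsitehere.com/search?name=acme"))
```

Passing this check only means crawling is permitted. It says nothing about whether the content sits behind a query form.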
A lot of the data we see as end users is query-based and has temporary URLs that Google can't replicate. Google needs a permanent or "hard" URL to crawl. That's why you rarely see a URL in a Google SERP with query-string syntax (might be the wrong vernacular here) such as "?" or "&". Those seem to be a dead giveaway that the URL you're looking at is query-generated and therefore "temporary," for lack of a better term.
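If you want to sort a list of URLs by that "?" and "&" giveaway, something like this rough sketch would do it. It uses Python's standard `urllib.parse`; the function names and example URLs are my own, and a query string is only a hint that a page is query-generated, not proof that it can't be crawled:

```python
from urllib.parse import urlsplit, parse_qs

def query_params(url):
    """Return the URL's query-string parameters as a dict (empty if there are none)."""
    return parse_qs(urlsplit(url).query)

def looks_query_generated(url):
    """Flag URLs that carry anything after a '?'. This is a heuristic only."""
    return bool(urlsplit(url).query)

# Hypothetical URLs: one "hard" URL and one built from a search form.
for url in (
    "https://govsitehere.com/business/12345",
    "https://govsitehere.com/search?name=acme&state=UT",
):
    print(url, looks_query_generated(url), query_params(url))
```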
I've run into so many sites (mostly government) that don't use permanent URLs, where the information you see is based on a query you generated, which Google cannot replicate. Furthermore, the data you are seeing isn't housed on a permanent page where Google could crawl it and make sense of it. It's housed in what I can only imagine is a SQL database, and therefore Google can't read that data. Even if they could, how would they make sense of it, since the data only makes sense when put together through a query?
Government websites seem to be the worst culprits when it comes to site architecture. So it is very feasible that Google cannot crawl many of the SOS listings. Just because you can "site:" a website doesn't mean the website is completely crawlable. There are some sites out there with truly atrocious architecture that have thousands of pages indexed. They also have tens of thousands of pages that can't be indexed.
Here's a question for you.
How many of you, when doing a citation audit, have actually seen your SOS listing come up? Personally, I haven't seen it come up once. And that goes for competitors too.
Maybe Google buys this data from the SOS or a 3rd party. But if they do that, I can't imagine it being used as a ranking signal. That's a pretty manual signal and Google loves automation.
Also, I haven't seen an SOS listing, but I imagine it's filled with misleading information (as far as NAP consistency goes) such as DBAs, different addresses for legal purposes, etc. Why would Google even consider such misleading information? It would seriously pollute their database. This is all conjecture, since I haven't seen an SOS listing before, but it makes sense to me.
My point is that I don't think this is a cut-and-dried issue.
I'll be interested in Joy's discovery from Google.
Also, if you "site:" a site for this example, add a broad search like "city dentist" and see what comes up. If a comprehensive list of dentists comes up in your location and the information looks good, it probably is good. That would be a good example to see.
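For anyone who wants to run that check across several sites, here's a small sketch that builds the search string and a Google search URL with Python's standard `urllib.parse`. The helper names are mine and the domain is just the placeholder from earlier in the thread. It only builds the URL for you to open in a browser; it doesn't fetch or scrape results:

```python
from urllib.parse import urlencode

def site_query(domain, keywords):
    """Build a 'site:' search string limited to one domain."""
    return f"site:{domain} {keywords}"

def google_search_url(domain, keywords):
    """Return a Google search URL for the 'site:' query."""
    return "https://www.google.com/search?" + urlencode({"q": site_query(domain, keywords)})

print(site_query("govsitehere.com", "city dentist"))
print(google_search_url("govsitehere.com", "city dentist"))
```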