by Patrick
https://patrick.net/robots.txt

Note that the first thing I do is to tell Google to fuck off:

User-Agent: GoogleBot
Disallow: /

But Google disrespects the wishes of site owners and indexes anyway! Proof from my web server log:

34.32.251.230 - - [30/Jun/2023:23:57:41 +0000] "GET /housing/rss.xhtml HTTP/1.1" 403 27 "http://patrick.net/housing/rss.xhtml" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-" 0.002 0.000 127.0.0.1:8083 "-" 403 - .

And that is not a spoof of Google's bot, because 34.32.251.230 is really a Google IP:

% whois 34.32.251.230
NetRange: 34.4.5.0 - 34.63.255.255
CIDR: 34.4.64.0/18, 34.4.32.0/19, 34.16.0.0/12, 34.8.0.0/13, 34.32.0.0/11, 34.4.16.0/20, 34.4.128.0/17, 34.5.0.0/16, 34.4.8.0/21, 34.4.6.0/23, 34.6.0.0/15, 34.4.5.0/24
NetName: GOOGL-2
NetHandle: NET-34-4-5-0-1
Parent: NET34 (NET-34-0-0-0-0)
NetType: Direct Allocation
OriginAS:
Organization: Google LLC (GOOGL-2)
RegDate: 2022-05-09
Updated: 2022-05-09
Ref: https://rdap.arin.net/registry/ip/34.4.5.0
...
Comment: Complaints can also be sent to the GC Abuse desk
Comment: (google-cloud-compliance@google.com )
Comment: but may have longer turnaround times.
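For anyone who wants to repeat the range check without reading CIDR notation by eye, here is a minimal Python sketch. It assumes only the CIDR list copied from the whois output for 34.32.251.230; the function name `matching_block` is made up for illustration. It shows which block allocated to Google LLC (GOOGL-2) contains the logged address, and nothing more than that.

```python
import ipaddress

# CIDR blocks copied from the whois output for 34.32.251.230.
WHOIS_CIDRS = [
    "34.4.64.0/18", "34.4.32.0/19", "34.16.0.0/12", "34.8.0.0/13",
    "34.32.0.0/11", "34.4.16.0/20", "34.4.128.0/17", "34.5.0.0/16",
    "34.4.8.0/21", "34.4.6.0/23", "34.6.0.0/15", "34.4.5.0/24",
]


def matching_block(ip_text, cidrs=WHOIS_CIDRS):
    """Return the first network in cidrs that contains ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        if ip in net:
            return net
    return None


# The address from the web server log line.
print(matching_block("34.32.251.230"))  # prints 34.32.0.0/11
```

The same function returns None for an address outside all twelve blocks, so it can be pointed at any other suspicious entry in the log.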
So I wrote to google-cloud-compliance@google.com asking them to stop, but they have neither replied nor stopped.
Summary: Google is evil, and will try to index your site whether you ask them to stop or not.
PS: I have now taken measures on the server side to block Google by IP address, returning a 403 Forbidden to them.
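A rough idea of what blocking by IP address can look like, sketched in Python as WSGI middleware. This is an illustration under assumptions, not necessarily how this server does it: the names `BLOCKED_NETWORKS`, `is_blocked`, and `block_by_ip` are hypothetical, and the only blocked range is the 34.32.0.0/11 block from the whois output that contains the logged address.

```python
import ipaddress

# Assumption: block only the whois CIDR that contains 34.32.251.230.
BLOCKED_NETWORKS = [ipaddress.ip_network("34.32.0.0/11")]


def is_blocked(remote_addr):
    """True if remote_addr parses as an IP inside a blocked network."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # unparseable address: let the app decide
    return any(ip in net for net in BLOCKED_NETWORKS)


def block_by_ip(app):
    """Wrap a WSGI app so blocked addresses get 403 Forbidden."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("REMOTE_ADDR", "")):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware
```

Behind a reverse proxy, REMOTE_ADDR is usually the proxy's own address, so in that setup the check would have to run in the proxy or use whatever client-address header the proxy is configured to pass along.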