There is a git repository of the FreeBSD src, at https://cgit.freebsd.org/src/
Some time in the past that server had failed. Anybody can replicate the FreeBSD repo and set up a web server with cgit, and since I already had cgit running, I did so.
So far, so good. Then I found lots of robots collecting that data, and since the repos contain quite a lot of files, that became annoying - in addition to being completely pointless, as the data is nothing new or of value, just a mirror copy.
So I placed a proper robots.txt file there, to protect them and me from wasting traffic - and Amazon and a few others did in fact respect it.
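For illustration, here is a minimal sketch of how a well-behaved crawler is expected to interpret such a file, using Python's standard urllib.robotparser. The robots.txt content, the bot name "ExampleBot" and the mirror URL are assumptions made up for the example; the actual file on the mirror is not shown in this post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not the real file):
# ask every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are placeholders for illustration only.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://mirror.example.org/src/tree/"))
```

A crawler that honors the file would run a check like this before every request and simply skip the URL when the answer is no.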
But others did not. And this is new to me: the Internet can only function by mutual agreement; deliberately disregarding the requests of a peer is something I had not seen on the net before.
I found the respective addresses all belonging to "compute.hwclouds-dns.com", plus a bunch of non-resolvable IPs that belong to something called "huaweicloud" aka Xinnet, with contact e-mail addresses at huawei.com. Apparently this is a very unpleasant and disrespectful member of the Internet community.
So I blocked these address ranges, and things were good, for a while.
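As a rough sketch of what matching against blocked ranges amounts to, the following uses Python's standard ipaddress module. The two networks are placeholders taken from the RFC 5737 documentation ranges, not the ranges that were actually blocked, and in practice the blocking would more likely live in the firewall than in a script.

```python
import ipaddress

# Placeholder networks from the RFC 5737 documentation ranges.
# These are assumptions for illustration, NOT the real blocked ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for candidate in ("192.0.2.77", "203.0.113.5"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

Blocking by range works as long as the crawler stays inside a handful of provider networks, which is exactly the assumption that stops holding later on.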
Then the robot scans reappeared, this time originating from "crawl.bytedance.com" and a whole bunch of IPs from Alibaba Cloud - some more very unpleasant and disrespectful people who should not be tolerated on the net.
I blocked these as well. But this time the quiet did not last long. And now the picture is really bizarre: yesterday, all of a sudden, lots of scans appeared, with originating addresses that look like this:
Code:
host-181-199-63-48.ecua.net.ec
gmt-237-037.gamatelecomnet.com.br
pc-231-116-101-190.cm.vtr.net
bbb4aaae.virtua.com.br
139.13.196.181.static.anycast.cnt-grms.ec
152-248-41-138.user.vivozap.com.br
ip-179.108.50.220.redeatel.com.br
177.18.235.127.static.host.gvt.net.br
host190.5.33.175.dynamic.pacificored.cl
ip-189.126.44.83.jrtelecom.com.br
138-117-208-204.viamartelecom.com.br
170-238-198-165.static.sumicity.net.br
186-249-197-110.unifique.netnow
189-201-234-177.gigasat.net.br
There is an incredible number of them, and they look like private user addresses, distributed all over South America. But this cannot be what it seems - this is organized scanning, only /disguised/ as private users.
This is a different quality now: not just somebody who runs a web-scanning bot and decides not to respect the robots.txt file, basically because they are antisocial assholes, but somebody who goes to considerable lengths of organization (and investment!) in order to /deliberately/ behave unpleasantly.
And the question arises: what do they want to achieve? What value is there in collecting a redundant copy of the FreeBSD src, which could be obtained much more easily by git cloning?
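To get a first impression of how such addresses are spread, one can tally the reverse-DNS names by their last label. The sketch below does that for a few of the names listed above; the helper name count_by_tld is made up for this example, and the last label is only a crude hint at the country (a name ending in .net says nothing about location).

```python
from collections import Counter

# A few of the reverse-DNS names from the list above.
HOSTNAMES = [
    "host-181-199-63-48.ecua.net.ec",
    "gmt-237-037.gamatelecomnet.com.br",
    "pc-231-116-101-190.cm.vtr.net",
    "bbb4aaae.virtua.com.br",
    "host190.5.33.175.dynamic.pacificored.cl",
]


def count_by_tld(hostnames):
    """Count hostnames by their last DNS label (a rough country hint)."""
    return Counter(
        name.rstrip(".").rsplit(".", 1)[-1].lower() for name in hostnames
    )


if __name__ == "__main__":
    for tld, count in count_by_tld(HOSTNAMES).most_common():
        print(tld, count)
```

With thousands of distinct access providers in the log, a tally like this also shows why blocking by provider range stops being practical.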