I've been experiencing a lot of bot traffic that needs to be blacklisted. I am able to capture the IPs. I am just curious whether there are companies interested in purchasing such IP information. I am certain these are bots, and they also target a specific vulnerability. If I can't sell it, I would at least like to share it with companies that work in this space.
Bear in mind that the binding of an IP address to a bad (or good) actor is often ephemeral, and these "taint lists" wind up creating as much damage as they repair. A classic example is the US DoD blacklisting significant parts of the Pacific Rim economies by applying blanket filters to IP addresses: the very economies that are under attack from cyber threats found themselves cut off from connectivity with an agency which is active in their region.<p>Under the address rental models of cloud/service providers, an address can belong to a bad actor for minutes and then go back into the pool. If you apply this kind of filter, somebody else taking service from AWS or a sub-tenancy can find themselves in the bad place.<p>Third-party damage risks, basically.
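One way to limit that third-party damage is to let every entry expire instead of blocking an address forever. Below is a minimal Python sketch of that idea, not a production design: the `ExpiringBlocklist` class, its method names, and the 15-minute default TTL are all assumptions made up for illustration.

```python
import time


class ExpiringBlocklist:
    """Blocklist whose entries lapse after ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds=900.0, clock=time.monotonic):
        # ttl_seconds=900 is an arbitrary illustrative default, not a recommendation.
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._expires_at = {}  # ip string -> expiry timestamp

    def flag(self, ip):
        # Re-flagging an address simply pushes its expiry forward.
        self._expires_at[ip] = self._clock() + self.ttl_seconds

    def is_blocked(self, ip):
        expiry = self._expires_at.get(ip)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Entry lapsed: forget it so a later tenant of the address is not penalised.
            del self._expires_at[ip]
            return False
        return True
```

The injectable `clock` is only there so the expiry behaviour can be exercised without sleeping. How long the TTL should be is a judgment call that depends on how quickly the provider recycles addresses.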
It's unlikely that companies would pay for such data, as it is shared for free in places like FB ThreatExchange or <a href="https://cybersecurity.att.com/open-threat-exchange" rel="nofollow">https://cybersecurity.att.com/open-threat-exchange</a> (and some others that are more secret).<p>Otherwise, you could check <a href="https://www.projecthoneypot.org/" rel="nofollow">https://www.projecthoneypot.org/</a><p>This is the ancestor of Cloudflare and is still actively flagging IPs for bad bots; it is probably one of Cloudflare's own sources.
There are myriad bot lists on GitHub and assorted websites, each maintained to its own level of <i>quality</i>. Given the public nature of these lists, it is unlikely anyone is going to pay for them. As others mentioned, these are ephemeral. Bots are operated by opportunists, and they will hop from network to network to find addresses not yet tainted. There are also a large number of bots that use mobile LTE networks. IP addresses are not entirely useful for determining whether something is a bot. Rather, its TCP/IP characteristics, application characteristics, and behavior will be greater indicators. As simplistic <i>and non-exhaustive</i> examples, SSH bots often use really old SSH libraries that can be spotted in the handshake. Some HTTP bots also use really old libraries that cannot utilize newer protocols <i>with obvious exceptions like those driving headless Chrome</i>. Some bots use really old TCP/IP libraries that fail to set certain IP headers and options, or that fixate on specific options or source ports. Some bots also hide behind proxies, Tor, and CDNs. They also hide their DNS lookups behind all the public DNS resolvers. The vast majority of bot authors are lazy and looking for quick wins.<p>I suppose that is a long-winded way of saying I doubt there would be much interest in paying for IP lists, as the greater value lies in understanding traffic behavior and packet characteristics. Perhaps if you started writing advanced eBPF code that could mostly-accurately separate traffic into bot vs non-bot, then you would have created a piece of Cloudflare, and people might pay to self-host that. If going this route, one should create public challenges in which independent third parties validate both the ability to "spot the bots" and the ability to avoid blocking legitimate people. That would be valuable.
To garner interest there should be a free version.<p>I can speak from experience that some companies are not permitted to send specific types of traffic over a third party such as a CDN, and that would be the use case for self-hosted bot mitigation. Some companies try to accomplish this using bot mitigation in hardware load balancers, multi-million dollar firewalls and DDoS appliances. These devices are expensive and do not scale well, not to mention that they only stop very specific types of bots and attacks. These devices are also sometimes the causes of outages. In my experience, the more expensive a device is and the more promises are made around it, the more glorious an outage it will cause.
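To make the earlier point about spotting old SSH libraries in the handshake a bit more concrete, here is a rough Python sketch that parses the client identification string (the `SSH-protoversion-softwareversion` line that opens an SSH connection) and checks it against a list of client strings. The function names and the `SUSPECT_SOFTWARE_PREFIXES` list are assumptions invented for illustration, not real threat intelligence, and a match is only one weak signal rather than a verdict.

```python
import re

# Identification string layout: SSH-<protoversion>-<softwareversion>[ <comments>]
BANNER_RE = re.compile(
    r"^SSH-(?P<proto>[^-\s]+)-(?P<software>\S+)(?: (?P<comments>.*))?$"
)

# Purely illustrative assumption: client strings we choose to treat as "old".
SUSPECT_SOFTWARE_PREFIXES = ("libssh2_1.4", "libssh-0.5", "paramiko_1.")


def parse_banner(line):
    """Return (protoversion, softwareversion), or None if the line is malformed."""
    match = BANNER_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return match.group("proto"), match.group("software")


def looks_suspicious(line):
    """Heuristic only: True if the banner is malformed, pre-2.0, or on the list."""
    parsed = parse_banner(line)
    if parsed is None:
        return True  # not a well-formed SSH identification string
    proto, software = parsed
    if proto != "2.0":
        return True  # e.g. an SSH-1.x era client
    return software.startswith(SUSPECT_SOFTWARE_PREFIXES)
```

For example, `looks_suspicious("SSH-2.0-OpenSSH_9.6")` returns `False`, while `looks_suspicious("SSH-2.0-libssh2_1.4.3")` returns `True` under the made-up prefix list. Banners are trivially spoofed, so in practice this would be combined with the other traffic and behavior characteristics described above.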