Tiny Core Linux

Off-Topic => Off-Topic - Tiny Core Lounge => Topic started by: gadget42 on August 23, 2025, 01:49:31 AM

Title: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on August 23, 2025, 01:49:31 AM
question for admin/mods: wondering reason for increased forum website traffic?

iirc, up until quite recently the most online ever was about 1k less?

did we get a mention somewhere recently?

or is it just increased ai/bot/llm/lvm/etc activities?
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on September 01, 2025, 02:20:56 AM
just noticed that a few hours after i posted the above commentary on the increased traffic from ai/bot/llm/lvm/etc, the ai/bot/llm/lvm/etc traffic doubled. ai/bot/llm/lvm/etc is ruining the open web for everyone everywhere.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on September 01, 2025, 08:55:22 AM
It’s bots.  There is not an easy way to remove them from the online count.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on September 02, 2025, 09:43:50 AM
imho i think it is better that forum members/visitors ARE able to see the bot traffic

i would not want it hidden at/on any website, in fact it should be actively called out by all the websites under siege.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on September 02, 2025, 10:01:59 AM
Most of them are well behaved, honoring rate settings.   I’ve not seen it really affect the load of the server.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on September 02, 2025, 06:36:16 PM
https://tech.slashdot.org/story/25/08/31/1820249/are-ai-web-crawlers-destroying-websites-in-their-hunt-for-training-data
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on December 21, 2025, 06:11:10 AM
wowza!

Most Online Ever: 16857 (December 09, 2025, 12:37:20 PM)

anyone know the _what/why_ regarding this recent rather large spike in MOE traffic?
(might be ai/bot/llm/lvm/etc using residential based proxies which would massively increase the "individual" entity traffic based on originating ip addresses)
(re: residential proxies, see for example _randomly_referenced_ oxylabs.io and www[.]webshare.io)
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on December 29, 2025, 10:09:49 AM
after reading a recent commentary that included information about aisuru botnet here:

https://blog.cloudflare.com/ddos-threat-report-2025-q3/#aisuru-breaking-records-with-ultrasophisticated-hyper-volumetric-ddos-attacks

more searching resulted in a couple pieces from krebs:

https://krebsonsecurity.com/2025/10/ddos-botnet-aisuru-blankets-us-isps-in-record-ddos/

https://krebsonsecurity.com/2025/10/aisuru-botnet-shifts-from-ddos-to-residential-proxies/

snippet tidbit(mostly because an earlier post referenced oxylabs.io and www[.]webshare.io):
Quote
Today, Spur says it is tracking an unprecedented spike in available proxies across all providers, including;

LUMINATI_PROXY  11,856,421
NETNUT_PROXY    10,982,458
ABCPROXY_PROXY   9,294,419
OXYLABS_PROXY    6,754,790
IPIDEA_PROXY     3,209,313
EARNFM_PROXY     2,659,913
NODEMAVEN_PROXY  2,627,851
INFATICA_PROXY   2,335,194
IPROYAL_PROXY    2,032,027
YILU_PROXY       1,549,155
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on April 04, 2026, 07:59:52 AM
naturally the bots are still out of control:

https://forum.tinycorelinux.net/index.php/topic,28089.0.html
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on April 07, 2026, 04:28:55 PM
 Most Online Today: 31582. Most Online Ever: 31582 (Today at 10:43:55 AM)
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on April 07, 2026, 04:43:03 PM
Yup.  They all nailed the server at the same time about 11:40 EDT.

It’s why a lot of servers are putting cloudflare in front of them.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Vic on April 07, 2026, 04:45:00 PM
It is probably my fault. I check TC a few times a week.

Sorry

Vic
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Rich on April 07, 2026, 04:52:06 PM
Hi Paul_123
There were about 16000 users around 11:40. They really managed
to slow the site down. When I returned later on I saw the number
had peaked to over 31000.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: CNK on April 07, 2026, 08:08:17 PM
Quote from: Paul_123
It's why a lot of servers are putting cloudflare in front of them.

It's not clear if that means you're considering doing the same, but I'll just make the point that when most sites do that (or start using any other service that requires Javascript to try and verify humanity) I stop visiting.

I know it's a tough problem to solve (my own website was getting crippled by millions of bot hits a day a while ago), and other common solutions like blocking IPs from certain countries may cut off other users, but I'm just sharing my point of view.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on April 07, 2026, 08:10:43 PM
Hits from today by useragent   only the top 20

Code:
156472 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36
  43616 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/145.0.0.0 Safari/537.36
  11518 Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
   9614 Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36
   4645 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.7680.177 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
   3814 Mozilla/5.0 (X11; Linux i686; rv:109.0) Gecko/20100101 Firefox/115.0
   2841 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36
   2662 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36
   2157 Mozilla/5.0 (X11; Linux x86_64; rv:149.0) Gecko/20100101 Firefox/149.0
   1728 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:149.0) Gecko/20100101 Firefox/149.0
   1683 Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36
   1605 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; ClaudeBot/1.0; +claudebot@anthropic.com)
   1543 Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
   1446 Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0
   1255 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)
   1079 Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36
    927 Terra Cotta 0.1 https://www.github.com/ceramicTeam/CeramicTerracotta
    788 Wget
    759 Mozilla/5.0 (compatible; Thinkbot/0.5.8; +In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,_please_block_its_IP_address._Thank_you.)
    755 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/140.0.0.0 newsai/1.0 Safari/537.36

The first user agent was the offender.  They launched almost 60 requests per second for about 20 minutes.  Here is the real problem: this attack came from 104,000 different IP addresses.
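
A tally like the one above can be produced from a web server access log. The following is a minimal Python sketch, not the script actually used for this post: it assumes the log is in the common "combined" format (client address first, user agent as the last quoted field), and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# In the "combined" log format a line ends with: "referer" "user-agent"
# (assumption: user agents containing escaped quotes are not handled).
TRAILING_FIELDS = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def hits_by_user_agent(lines, top=20):
    """Return (user_agent, hits, distinct_addresses) for the busiest agents."""
    hits = Counter()
    addresses = defaultdict(set)
    for line in lines:
        match = TRAILING_FIELDS.search(line)
        if not match:
            continue  # skip lines that do not look like combined format
        agent = match.group(1)
        client = line.split(" ", 1)[0]  # first field is the client address
        hits[agent] += 1
        addresses[agent].add(client)
    return [(agent, count, len(addresses[agent]))
            for agent, count in hits.most_common(top)]


# Made-up sample lines using documentation-only addresses.
sample = [
    '203.0.113.5 - - [07/Apr/2026:11:40:01 -0400] "GET /index.php HTTP/1.1" 200 512 "-" "Wget"',
    '203.0.113.9 - - [07/Apr/2026:11:40:01 -0400] "GET /index.php HTTP/1.1" 200 512 "-" "Wget"',
    '198.51.100.7 - - [07/Apr/2026:11:40:02 -0400] "GET /index.php HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
]

for agent, count, distinct in hits_by_user_agent(sample):
    print(f"{count:7d} {distinct:7d} {agent}")
```

Printing the number of distinct addresses next to the hit count is what makes a botnet stand out: a large number of hits spread over a very large number of addresses.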
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: GNUser on April 08, 2026, 09:14:48 AM
Quote from: Paul_123
Hits from today by useragent   only the top 20

Hi Paul_123. How awful.

Quote from: CNK
I'll just make the point that when most sites do that (or start using any other service that requires Javascript to try and verify humanity) I stop visiting.

It's not like server administrators have great options in this situation.

I have my own http server which I use just for myself, family, and few friends. The server was getting hammered with tens of thousands of visits an hour, every hour, every day. I don't have premium hardware, so sometimes I couldn't access my own http server >:(

The options I considered were 1) shut off the server, 2) use a Javascript gatekeeper (e.g., Anubis), and 3) put the server on a nonstandard port. I actually tried all three options for a while. Doing without the server was too painful. Anubis worked well but added too much complexity to my otherwise barebones setup. Using a nonstandard port turned out to be the right balance for me.

Using a nonstandard port does not eliminate the problem (some bots are more sophisticated and do port scanning), but it eliminates >50% of bot traffic, bringing the noise down to a tolerable level. Would using a nonstandard port be worth trying for the TCL forum? The problem is that this would prevent a lot of legitimate, human users from finding the forum.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on April 08, 2026, 10:46:01 AM
Based on the port scanning going on, I doubt it.  Might slow them down for a couple of days.   And would just frustrate users.

Yesterday's hit was the first botnet that I know of.  Otherwise the bots have been fairly respectful.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: mocore on April 09, 2026, 04:12:16 AM
Quote from: Paul_123
Otherwise the bots have been fairly respectful.

perhaps a bit in parallel to this topic,
i post this as i just happened to read the above and then the quote below, from https://lists.gnu.org/archive/html/help-guix/2026-04/msg00047.html

which seem to be vastly differing perspectives


Quote from: help-guix/2026-04/msg00047
GPTBot alone did 109,552 accesses to my website in march, so I think
they are telling the truth in a very misleading way.

The websites that go into these stats have together about 2000 HTML
documents (https://www.1w6.org has 811, https://www.draketo.de/node has
827 and https://www.draketo.de/ has 296).

99% of these change less than once per year.

If GPTbot crawls them every day, that’s 2000x30 = 60.000 accesses per
month -- which is pretty close to the 109,552 accesses I see.

But I built these websites over 20 years. The oldest articles are from
2007.

A human goes there, reads 1-20 articles and leaves again. Maybe to
return later when there’s a new article (I have RSS feeds).

An LLM goes there and crawls everything. Every day.

There even was a week where GPT tried every possible combination of
search inputs on 1w6.org -- including repeated arguments, likely until
it hit the URL length limit of the server. My log analysis tool needed
days to complete the analysis after that week. And I give thanks to my
hoster that they didn’t boot me then (and that I don’t have to pay for
excess bandwidth).


Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on April 09, 2026, 08:41:08 AM
The different perspective is that I expect some level of scraping.   It’s just the times we live in.   I specifically use a host that allows for unlimited bandwidth.   Anything I can do to limit it will be obtrusive to the real users.


Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Rich on April 09, 2026, 02:27:45 PM
Hi Paul_123
Quote from: Paul_123
... I specifically use a host that allows for unlimited bandwidth.   Anything I can do to limit it will be obtrusive to the real users.

If you are talking about downloading extensions from the repo, then
yes, I would agree with that statement.

But this is a simple forum that's not littered with ads and videos.
Even attachments are limited to 200K in size (and total). How much
bandwidth is really needed for reading the forum?

I lowered the download speed on one of my machines to 1Mbit/sec
and had no trouble navigating the website.

If it's possible to set a speed limit that's comfortable for human
consumption, but less comfortable for bots scraping web pages, it
might be worth considering.

Just a thought.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: Paul_123 on April 09, 2026, 04:07:14 PM
External bandwidth is never an issue and never what throttles the site.  It’s the PHP processing and database processes that jam up the CPU.  Things are already rate limited for IP addresses and sessions.   But a botnet avoids all of these limits.
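
To illustrate why per-address limits do not help against a botnet, here is a rough Python sketch. The request rate, duration and address count are the figures given earlier in this thread (almost 60 requests per second, about 20 minutes, 104,000 addresses); the limit of 30 requests per address per minute and the simple fixed-window limiter are assumptions made only for the example, not the forum's actual settings.

```python
from collections import defaultdict


def count_blocked(requests, limit_per_window, window_seconds=60):
    """Count requests a fixed-window per-address limiter would reject.

    requests is an iterable of (timestamp_seconds, address) pairs.
    """
    seen = defaultdict(int)  # (address, window index) -> requests so far
    blocked = 0
    for timestamp, address in requests:
        key = (address, int(timestamp // window_seconds))
        seen[key] += 1
        if seen[key] > limit_per_window:
            blocked += 1
    return blocked


RATE = 60             # requests per second, figure from this thread
DURATION = 20 * 60    # seconds, figure from this thread
BOTNET_SIZE = 104_000 # distinct addresses, figure from this thread
LIMIT = 30            # hypothetical per-address limit per minute

total = RATE * DURATION  # roughly 72,000 requests

# Same volume, once spread across the botnet and once from one address.
botnet = [(i // RATE, f"addr-{i % BOTNET_SIZE}") for i in range(total)]
single = [(i // RATE, "addr-0") for i in range(total)]

print(count_blocked(botnet, LIMIT))  # 0: no address ever repeats
print(count_blocked(single, LIMIT))  # 71400: almost everything rejected
```

With fewer requests than addresses, each address shows up at most once, so a limiter keyed on the address never fires, while the PHP and database load is the same in both cases.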
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: CNK on April 09, 2026, 08:25:04 PM
Quote from: GNUser
Quote from: CNK
I'll just make the point that when most sites do that (or start using any other service that requires Javascript to try and verify humanity) I stop visiting.

It's not like server administrators have great options in this situation.

I have my own http server which I use just for myself, family, and few friends. The server was getting hammered with tens of thousands of visits an hour, every hour, every day. I don't have premium hardware, so sometimes I couldn't access my own http server >:(

The options I considered were 1) shut off the server, 2) use a Javascript gatekeeper (e.g., Anubis), and 3) put the server on a nonstandard port. I actually tried all three options for a while. Doing without the server was too painful. Anubis worked well but added too much complexity to my otherwise barebones setup. Using a nonstandard port turned out to be the right balance for me.

In my case I was able to identify a common argument in the request URL strings in all the requests coming from the botnet that was making millions of requests per day to my site. By adding a rule in the Apache configuration I blocked requests matching the bot's requests, and since that prevented loading the PHP module for them the server was then able to handle all the requests the botnet could send without running out of RAM anymore. I still needed to significantly increase overall connection limit settings in Apache and the Linux kernel itself, but then it was able to absorb the attack which continued for a week or two before finally giving up.
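
The matching logic behind such a rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the Apache rule described above: the argument name "botarg" is hypothetical (the actual argument is not named here), and in practice the check has to run in the web server before PHP is started for it to save any resources.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical argument name that only the botnet's requests carry.
BLOCKED_ARGUMENTS = {"botarg"}


def should_reject(request_target):
    """True if the request URL carries one of the blocked arguments."""
    query = urlsplit(request_target).query
    # keep_blank_values so that "botarg=" with no value is still seen
    arguments = parse_qs(query, keep_blank_values=True)
    return any(name in BLOCKED_ARGUMENTS for name in arguments)


print(should_reject("/index.php?topic=28089.0"))           # False
print(should_reject("/index.php?topic=28089.0&botarg=1"))  # True
```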

That was with a $1/month VPS, but I was lucky it was a crazy bot using a pointless argument in requested URLs (I guess it was running some idiotic AI-generated code), so I could block it without affecting human (or even sensible crawler) visitors at all. I've read accounts of other people identifying similar ways of blocking bots with web server rules to filter request URLs. Others have blocked impossible or unlikely User-Agents (really old browsers without sufficiently modern HTTPS support to really connect), since some botnets seem to use a pool of random browser User-Agents which isn't up to date. I could have blocked South American and Asian IP addresses since all the hundreds of thousands of IPs the botnet used seemed to be from there, but I didn't want to. Maybe that would be another option for your personal site though. Others block IPs based on the owners of IP blocks (eg. cloud/VPS hosting companies).
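
The User-Agent approach mentioned above could look roughly like this in Python. It is a sketch under stated assumptions: the minimum versions are placeholders chosen for the example, not a recommendation, and only Chrome and Firefox version tokens are checked.

```python
import re

# Hypothetical minimum major versions; placeholders, not a recommendation.
MINIMUM_MAJOR = {"Chrome": 60, "Firefox": 60}

VERSION = re.compile(r"\b(Chrome|Firefox)/(\d+)")


def looks_outdated(user_agent):
    """True if the agent claims a browser older than the configured minimum."""
    match = VERSION.search(user_agent)
    if not match:
        return False  # unknown agents are left alone in this sketch
    browser, major = match.group(1), int(match.group(2))
    return major < MINIMUM_MAJOR[browser]


old = ("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
       "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36")
current = "Mozilla/5.0 (X11; Linux i686; rv:109.0) Gecko/20100101 Firefox/115.0"

print(looks_outdated(old))      # True
print(looks_outdated(current))  # False
```

The thresholds need care on a forum like this one, where real visitors may run older browsers on older hardware. The Firefox 115 on 32-bit Linux agent in the hit list earlier in this thread may well be a genuine user.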

Lots of answers, but I agree no single one is perfect for every situation.
Title: Re: question for admin/mods: wondering reason for increased forum website traffic?
Post by: gadget42 on April 21, 2026, 12:48:36 AM
another bot saga

https://words.filippo.io/dependabot/