Cloudflare Turnstile requiring fingerprintable WebGL

537 points | 13 hours | hacktivis.me
denysvitali13 hours ago

Cloudflare is known to use fingerprinting to detect scrapers. For example, they use JA3 fingerprints and match them against the UA to block stuff like cURL while allowing OkHttp (Android clients) - but this can easily be spoofed with packages such as CycleTLS [1].
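For readers unfamiliar with JA3, here is a minimal sketch of how such a fingerprint is derived, assuming the published JA3 recipe: five ClientHello fields are joined into one string and MD5-hashed, with GREASE placeholder values dropped. The ClientHello values and the `EXPECTED_BY_UA` table are made up for illustration; this is not Cloudflare's actual matching logic.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE placeholders look like 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value >> 8) == (value & 0xFF) and (value & 0x0F) == 0x0A


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Join the five ClientHello fields in JA3 order, skipping GREASE values."""
    def join(values):
        return "-".join(str(v) for v in values if not is_grease(v))
    return ",".join([str(version), join(ciphers), join(extensions),
                     join(curves), join(point_formats)])


def ja3_hash(*fields) -> str:
    """MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3_string(*fields).encode("ascii")).hexdigest()


# Hypothetical ClientHello values, not captured from any real client.
hello = (771, [0x0A0A, 4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
fingerprint = ja3_hash(*hello)

# Hypothetical table: fingerprints a server expects for each UA family.
EXPECTED_BY_UA = {"okhttp": {fingerprint}}


def consistent(user_agent: str, observed: str) -> bool:
    """True when the observed fingerprint is one expected for the claimed UA."""
    family = user_agent.split("/")[0].lower()
    return observed in EXPECTED_BY_UA.get(family, set())
```

A client that reproduces another client's ClientHello fields produces the same hash, which is why this kind of check raises the cost of scraping rather than preventing it.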

I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Cromite, a privacy-conscious fork of Chromium for Android, constantly has issues with CloudFlare Turnstile [2] because they (Cloudflare) try to fingerprint it in multiple ways before letting it pass the challenge. The only way to get it to work would be to join the CloudFlare Browser Developer program - which requires signing an NDA. Rightfully so, the project maintainer didn't want to do it.

If you want to see the extent of what CloudFlare does to fingerprint the browsers, just have a look in the issue [2] and see which flags need to be disabled in order to allow CloudFlare to pass the challenge.

I understand both sides, but at least CloudFlare could be flexible enough to fall back to PoW instead of just blocking people from sending forms or accessing websites...

[1]: https://github.com/Danny-Dasilva/CycleTLS

[2]: https://github.com/uazo/cromite/issues/2365

jwr11 hours ago

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection"

They also gate away a good many people with their "bot protection". I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

Rapzid2 hours ago

> with no second thoughts whatsoever

As someone responsible for mitigating card testing "attacks", account harvesting, and DDOS attacks..

It is unfortunate, but the ISP industries (from telco up to transit) and CC industries aren't providing a lot of great options. This idea that people are doing things "without a second thought" is usually false when it comes to businesses.

ethin9 hours ago

The problem is what is the alternative? I'm (not) defending them or this practice by any measure, but we all know what happens if you just open your site up without these, especially with AI bots which hammer servers and are in effect a legalized DDoS system. I've hated CAPTCHAs ever since I first encountered them and I can't wait for them to just finally die a permanent death, but I also don't know how we solve the "how do you identify a human and a bot" in a way which doesn't require server admins to have extremely beefy servers or similar setups to handle the extra load. I'm not going to do the "there HAS to be a way thing" either because, for all I know, this could just be one of those impossible-to-solve problems.

jwr8 hours ago

> we all know what happens if you just open your site up without these, especially with AI bots which hammer servers and are in effect a legalized DDoS system

No, we don't know. I honestly do not understand the problem. I run websites, both static and non-static. Granted, my sites aren't exactly the most popular internet go-to destinations, but I should be seeing this DDoS too, right?

I do see lots of requests. Nothing that any modern system can't handle. Computers are stupid fast these days. Unless you are doing something unreasonable, it's really hard to even notice this "extra load".

I understand there are sites for whom this causes problems, but I think these are rare and could be optimized not to do unreasonable things.

I think too many people are annoyed by AI companies (arguably understandable position), look at their logs and speak of "hammering", "DDoS" and "extra load", while in reality it doesn't matter much.

matt_heimer7 hours ago

It might depend on the tech stack. I run a small niche website but it has PHP and a database (MediaWiki/PHPBB) and without Cloudflare I'd estimate I'd need to spend several hundred dollars a month to handle the traffic. Traffic used to be tens of thousands of requests a day. AI has increased that to between 400k and 3M requests per day but it's not a smooth distribution. This is with bot fight mode on that greatly reduces traffic.

I adopted Cloudflare because it was getting DDoSed by the AI crawlers. I'm pretty sure all of them are vibe coding their crawlers and don't bother adding rate limiting as a requirement.

hombre_fatal2 hours ago

Consider yourself lucky. But don't let yourself fall into the trap of thinking it's a nonissue for everyone else until it happens to you.

People shouldn't have to be experts or provision a larger server to run a UGC service that can withstand the sort of 30x more traffic I'm seeing from AI bots. Or rather, you didn't render the argument for why they should have to do that if they can just use CloudFlare's free tier.

Either way, it's easy to have all the answers when you've never had the problem.

piker5 hours ago

Same. Tritium and the blog have done stints on the front page here and high traffic subreddits and that plus bots has never been a problem. UX could be improved through a CDN but even that isn’t worth the trade-off for us at the moment.

RHSeeger4 hours ago

> I understand there are sites for whom this causes problems, but I think these are rare and could be optimized not to do unreasonable things.

There are. They're not. They can't (without significant effort)

JohnTHaller5 hours ago

If you're in any way semi-popular and a decent size, you're gonna get hammered. PortableApps.com was partially offline for weeks due to China-based AI scrapers. You block the useragent, they start hitting you with another one from the same IP in the same way. You block the IP, they switch to another. You block the subnet, they use another. At one point it was nearly a thousand different IPs from around China hammering away. For all intents and purposes, a DDoS. This wasn't a little "extra load", this was load that was thousands of times beyond what our legitimate userbase was using.

And if you're thinking about blocking all of China, while this particular AI bot didn't use them, a bunch of other ones I've encountered use VPNs and hacked clients worldwide.

xg158 hours ago

I don't think it's just privacy, it also increasingly turns the web itself into a walled garden. The end result is that websites can only ever be accessed by "approved" clients - the latest Chrome, Edge, Safari and if you're lucky Firefox - and nothing else.

adgjlsfhk12 hours ago

I think there's some chance we get a "proof of purchase" system where there is some entity that takes a $10 payment to give out a unique identity token that you need to present to visit most sites. If you have a revocation process for ones used by bad actors, it seems like it would work pretty well.

steelframe7 hours ago

The most plausible near-term path is probably micropayments embedded invisibly in AI agents. Your agent that has learned what you value and can make a reasonable decision to allow a micropayment for certain content pays on your behalf without requiring a conscious decision each time, eliminating the mental transaction cost problem entirely. It's the mental transaction cost that arguably led to the failure of the micro payment model back in the early 2000s.

Although the cynical part of me says that this will result in malicious actors trying to trick agents into giving out a bunch of micro payments. There are counter defenses that can help detect and compensate for that, but perhaps the best we will be able to do is prompt the user with the default agent recommendation.

PunchyHamster6 hours ago

We have a few dozen websites, from ones doing single-digit Mbit to a few Gbit.

Never needed it. Just put the worst offenders in a penalty bucket and that's usually enough.

cindyllm6 hours ago

[dead]

binaryturtle10 hours ago

I can no longer access any website that's "protected" by Cloudflare. As soon as a website enables that stuff… "Shoot, another one bites the dust." I wonder if the website owners realise at all how many actual users they lose by this sort of "protection."

tardedmeme9 hours ago

Cloudflare will just tell them that 70% traffic drop is because 70% of their traffic was bots, and everything is working fine, and hey, don't you want to upgrade to a paid plan to block 50% of the remainder? Think about how many bots will be blocked with that upgrade!

CrimsonRain9 hours ago

I'm one of those who have enabled Cloudflare on all of the sites I maintain. Additionally, I added Turnstile on every form.

I know some actual users get blocked. But the amount of spam we get without it, the amount of bot traffic simply overwhelming the server... It is just too much.

Recently I also hard blocked all IPs from China, Singapore, India, Pakistan, Russia and the whole of Africa. Do I want to do it? No. But the amount of bot traffic and corresponding spam is a bigger problem :(

aboardRat42 hours ago

I live in China, and I hate guys like you. F©king racists, I'm not responsible for the bad apples living in other provinces.

jp_sc3 hours ago

I also always block traffic from China, India, Pakistan, and Russia, after observing that 90%+ of the spam/scanning was coming from those countries.

At least for China, I imagine most of the real humans might use a VPN anyway

gruez9 hours ago

>I wonder if the website owners realise at all how many actual users they lose by this sort of "protection."

How many people do you think are browsing with a weird enough config (e.g. a custom browser like OP, or some weird config like Firefox with fingerprinting protection on a Raspberry Pi) to trip Cloudflare's protection?

binaryturtle9 hours ago

Well… I know plenty of people in my circle affected by this. Just have a slightly outdated system you simply can't afford to update: it's way too easy to get cut off like this. IMHO, a rather systematic discrimination of poorer people.

p_l7 hours ago

Does not have to be weird; at least once it happened to me that their strictest settings simply banned something like a major portion of internet users in my country - to the point that if you had FTTH you were likely blocked.

And no, it wasn't due to a country-based block selected by site operator.

8bitsrule6 hours ago

>wonder if the website owners realise at all how many actual users they lose by this sort of "protection.

Yesterday cloudflare blocked me from visiting the MX-Linux site ... including an old browser with -no- protections ...

I have to wonder - assuming these sites are paying CF for this 'service' - are they getting a list of all the rejected IPs?

dwedge7 hours ago

I took the time to write to one on LinkedIn and they didn't reply

denysvitali11 hours ago

They sometimes have to comply with legal requests (which I understand), but at the same time they have a huge market share - which means that the internet is becoming less and less decentralized and more in their control. We've seen the effects of that in previous outages...

segmondy5 hours ago

I use a cellphone internet provider, and there have been many sites I couldn't access because of Cloudflare or stupid reCAPTCHA. I know damn well what a bicycle, bus, traffic light or stairs is.

stackghost10 hours ago

>I am extremely worried about how so many seem to have outsourced the control over who can access their websites to a company, with no second thoughts whatsoever.

I think the Web is on its last legs, anyway. Generative AI and LLM-instead-of-search has destroyed what little value remained.

matheusmoreira3 hours ago

Governments too. It's inevitable that the international network will fracture into multiple national networks with heavy filtering at the borders as each country scrambles to impose their laws on it.

I'm glad to have known the true internet before its demise. Truly one of the wonders of humanity.

tardedmeme9 hours ago

It's just one more facet of the enshittoscene, the era where actual product quality is completely irrelevant. Put it in the same bucket as websites that lag when you scroll, apps that refuse to show you video without a huge play/pause button overlaid in the middle of it that never goes away, and the movie Melania. My hypothesis is that billion-dollar businesses no longer exist to sell things to customers, but only to impress other billionaires to get their investment money.

sandeepkd10 hours ago

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person. Fingerprinting is just a way to consolidate the market for the advertising business. Assigning reputation to residential IP addresses and commercial blocks is another approach to achieve the desired result. Providers would be a lot more careful about letting their IP addresses be misused; however, it turns out that this would bring down the DDoS business on both sides, attackers and protectors.

Ironically, more often than not it's the same companies that invest in building their own bots and finding ways to stop bots from other companies.

esrauch9 hours ago

> Bot protection with fingerprinting is just an illusion. Any client-side signal like this can be spoofed by an above-average person.

At the upper bound, fraud can always be committed by paying real people with real accounts to perform the desired action in a way that is 100% truly indistinguishable from organic. There's fundamentally no actual prevention technique at the limit.

So the entire game is only "increasing the costs until it's not viable ROI", not "holistically prevent", which is why fingerprinting is a relevant technique here.

sandeepkd8 hours ago

> entire game is only "increasing the costs until it's not viable ROI", not "holistically prevent", which is why fingerprinting is a relevant technique here.

As per Cloudflare's own report, about 78% of the DDOS attacks are at the network layer, where the fingerprinting technique is not useful.

DDOS is done against targets for certain reasons, most businesses are not even viable targets for everyone.

However, letting everyone be fingerprinted on the pretext of solving DDOS is where privacy gets compromised (not much of it is left though). Some search engines did it indirectly by letting people use tag managers for free on their websites and then utilizing the data for their advertising business.

The end game is relatively the same; it's just how these companies are approaching it.

b65e8bee43c2ed012 hours ago

it's all for nothing, because Cloudflare's scraping protection works about as well as a $5 padlock - good enough to dissuade bored teens, not good enough to dissuade even an amateur burglar. if someone wants to scrape your publicly visible data, they will. there's nothing you can do.

ACCount3712 hours ago

At the same time: it sure works well enough to annoy anyone with a "bad ASN" IP with 80 captchas a day.

shideneyu11 hours ago

exactly that's what I was thinking... like the day they provided a solution to the issue they posed

mootothemax10 hours ago

Exactly. I’m constantly amazed at how little you actually need to bypass CF, Amazon, Azure WAFs and so on (Incapsula springs to mind too). When you look at the code you’ve come up with, it’s actually quite small and compact.

More to the point, these systems actually help scraping because proof of work unlocks essentially unlimited scraping, in my experience.

That said - from my experience on the other side, sure you can’t stop people like me or you, but you can stop 99% of the others. That’s more than worth it operationally.

ranger_danger4 hours ago

> Cloudflare's scraping protection works about as well as a $5 padlock

It sure seems to keep me, the casual visitor, far away from just about any site they "protect". I have zero desire to alter my browsing configuration or use extra tools to get around turnstile, I'd rather not even visit the site in the first place.

aboardRat42 hours ago

>, I'd rather not even visit the site in the first place

Until your bank, airline, and tax ministry start using them.

ranger_danger1 hour ago

[flagged]

leonidasrup8 hours ago

Fingerprinting for "bot protection" is indistinguishable from fingerprinting for mass surveillance.

petu11 hours ago

> but unless you do PoW (which is also ecologically a nightmare)

Can you expand? I don't see a problem with some napkin math. A 5W load for 2 seconds is about 0.0028Wh (we have to let smartphones pass, and not by doing PoW for tens of seconds). 8 billion checks a day for a year = 8GWh.
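The napkin math can be checked in a few lines. This sketch uses only the inputs stated in the comment (5 W, 2 seconds, 8 billion checks a day); nothing else is assumed.

```python
# Inputs taken from the comment above.
POWER_W = 5.0
SECONDS_PER_CHECK = 2.0
CHECKS_PER_DAY = 8e9

joules_per_check = POWER_W * SECONDS_PER_CHECK          # 10 J
wh_per_check = joules_per_check / 3600                   # 1 Wh = 3600 J
annual_gwh = wh_per_check * CHECKS_PER_DAY * 365 / 1e9   # Wh -> GWh

print(f"{wh_per_check:.4f} Wh per check, {annual_gwh:.1f} GWh per year")
# prints: 0.0028 Wh per check, 8.1 GWh per year
```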

denysvitali11 hours ago

I stand corrected. It's not a nightmare scenario (as for Bitcoins) - but I'm still of the idea that "useless" computations should be avoided (as we should avoid having 10MB websites).

In any case, according to some napkin math done by Kimi 2.6 (which by itself is probably already consuming more than all of my PoW challenges for the upcoming 5 years) - the situation looks incredibly in favor of PoW: https://www.kimi.com/share/19e7ef40-a432-8912-8000-0000b4a71...

Which makes me wonder why CloudFlare isn't switching to this already

charcircuit39 minutes ago

Because you can't have both a difficulty with a reasonable page load time and a difficulty that stops bad actors. Attackers have stronger machines and are willing to wait as long as they need to.
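The trade-off described here is easier to see with a concrete scheme. Below is a minimal hashcash-style sketch, assuming SHA-256 and a "leading zero bits" target; the function names and the challenge value are illustrative and are not taken from Turnstile or any existing product.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash, regardless of difficulty."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute force, about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


# Low difficulty so the demo finishes quickly; the value is arbitrary.
nonce = solve(b"example-challenge", 12)
```

Each extra bit of difficulty doubles the expected work for every client, fast or slow, so a setting that a phone can finish in a couple of seconds is cheap for anyone with faster hardware or more patience.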

tarpitt1 hour ago

There's a saying that if an idea is stupid, but it works, it's not stupid.

If some computation is "useless" but it serves its purpose, it's not useless.

The reason why bitcoin network expends so much energy is down to tokenomics, not the system of PoW itself. At equilibrium we expect the power usage to be (blocks/hr) x (BTC/block) x ($/BTC) x (kWh/$), so it's a function of the BTC price and emission rate.
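To make the units in that product concrete, here is a small sketch of the formula. The block rate and subsidy are protocol parameters; the BTC price and electricity price are assumed values chosen only for illustration, and the formula treats all mining revenue as spent on electricity, so the result should be read as an upper bound under those assumptions.

```python
def equilibrium_power_kw(blocks_per_hour, btc_per_block, usd_per_btc, kwh_per_usd):
    """(blocks/hr) x (BTC/block) x ($/BTC) x (kWh/$) = kWh per hour, i.e. kW."""
    return blocks_per_hour * btc_per_block * usd_per_btc * kwh_per_usd


power_kw = equilibrium_power_kw(
    blocks_per_hour=6,     # protocol target: one block roughly every 10 minutes
    btc_per_block=3.125,   # block subsidy after the 2024 halving, fees ignored
    usd_per_btc=100_000,   # assumed price, purely illustrative
    kwh_per_usd=20.0,      # assumed electricity price of $0.05 per kWh
)
print(f"{power_kw / 1e6:.1f} GW")
# prints: 37.5 GW
```

Because the expression is a plain product, halving the price or the subsidy halves the equilibrium power draw, which is the point the comment makes about tokenomics.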

PoW in other contexts has very different driving factors. In this case, the marginal improvement of fetching the site again for AI bots isn't enough to cover the PoW cost. The PoW cost is outweighed by the net bandwidth cost of all the parties.

dcrazy11 hours ago

Because it doesn’t solve the problem of residential botnets.

Velocifyer6 hours ago

The botnet operators will be incentivized to mine bitcoin instead of whatever they are doing.

bawolff2 hours ago

> I don't want to defend them, because they gate away a good chunk of the internet with their "bot protection", but unless you do PoW (which is also ecologically a nightmare), probably fingerprinting is the way to go - completely destroying the privacy of everyone involved.

I hate what the anti-scraper mechanisms have become, but it really is the lesser evil. The alternative for many small operators is to just completely shut down.

jorvi7 hours ago

Brave has aggressive fingerprinting protection, I have Auto-Shred (formerly Forgetful Browsing) turned on, I use VPN and yet I rarely get gated out.

high_priest7 hours ago

A testament to how well Brave protects you from being identified by [Cloudflare in this example]

jorvi7 hours ago

Not sure what you mean, Brave blows Firefox out of the water in terms of privacy protections. Firefox has milquetoast fingerprint protection and it doesn't even block ads. uBlock is worse than Brave's blocking by virtue of not being natively integrated.

fsckboy6 hours ago

>probably fingerprinting is the way to go - completely destroying the privacy of everyone involved

your doctor seeing you naked does not destroy your privacy, it's your doctor sharing the photos with everybody that does. i.e. the problem here is that intermediaries like cloudflare don't work for you, they work for somebody else or sell the data themselves.

gadders9 hours ago

They're also anti free speech.

PearlRiver12 hours ago

This is why I have two separate browsers. If you want to do official stuff like paying for things you need to get through cloudflare.

notafox11 hours ago

You can use Firefox with different profiles and configure it to launch a particular profile directly, without launching the default profile and using about:profiles.

Firefox with a non-default profile can be created like that:

  ./firefox -CreateProfile "profile-name /home/user/.mozilla/firefox/profile-dir/"
  # For, say, cloudflare that would be:
  ./firefox -CreateProfile "cloudflare /home/user/.mozilla/firefox/cloudflare/"

And you can launch it like that:

  ./firefox -profile "/home/user/.mozilla/firefox/profile-dir/"
  # For cloudflare that would be:
  ./firefox -profile "/home/user/.mozilla/firefox/cloudflare/"

So, given that /usr/bin/firefox is just a shell script, you can

    - create a copy of it, say, /usr/bin/firefox-cloudflare
    - adjust the relevant line, adding the -profile argument

If you use an icon to run firefox (say, /usr/share/applications/firefox.desktop), you'll need to copy and adjust the corresponding line for the icon as well.

Of course, "./firefox" from examples above should be replaced with the actual path to executable. For default installation of Firefox the path would be in /usr/bin/firefox script.

So, you can have a separate profiles for something sensitive/invasive (linkedin, cloudflare, shops, banks, etc.) and then you can have a separate profile for everything else.

And each profile can have its own set of extensions.

tardedmeme9 hours ago

They're blocking Firefox quite often. Stripe does something that makes Firefox hang. I use Chrome for those sites and then go back to Firefox...

t_mahmood10 hours ago

You can now do this from the `Profiles` menu too, without going down the CLI path. It's extremely simple now.

ferfumarma11 hours ago

Except that fingerprinting means that both profiles are actually tied together by cloudflare (and other tech companies)

VoidWhisperer10 hours ago

I think the idea is that they have the functionality that cloudflare is using to generate the fingerprint (like webGL in this case) disabled in their non-cloudflare profile and only use the cloudflare profile to do things they have to that are behind cloudflare

ranger_danger4 hours ago

that's why I use completely different browsers with different settings. my CF-friendly one (not my daily driver) is `firejail --private chromium` so it always starts with a clean temporary profile

helterskelter12 hours ago

Firefox added profile switching recently. Works good.

(That said, I still keep separate machines. One for doing "official" things, the other for everything else)

notafox11 hours ago

> Firefox added profile switching recently.

I think this was as recent as 25 years ago?

Recently they added some new UI. There was and still is (I think) classic Profile Manager UI, which you can launch with

  ./firefox -ProfileManager

or access UI in about:profiles.

But you don't have to use any of those anyway - see my comment above (a response to parent).

opem11 hours ago

They actually have at least 3 kinds of profile: 1. containers - as they say, it's some kind of sandbox, technically a profile; 2. profiles that are accessible through about:profiles, which they had for years, and probably the one you are talking about; 3. new profiles that come with a pop-up, much like how Chromium browsers show it

thayne10 hours ago

The old UI was pretty difficult to use, and hard to discover unless you knew where to look though.

ajb12 hours ago

Odd - they've had that for years, but only on the command line. Wonder if it's different under the hood? They also have firefox containers which also never quite became a first-class feature (you have to install a plugin).

b65e8bee43c2ed012 hours ago

>Works good.

does it? same binary, same machine, same display, same 781 other heuristics.

jchw3 hours ago

JA3 fingerprinting is really not a serious deterrent, there are many ways to get around that. curl-impersonate works. You can even just use an actual Chrome instance with the devtools protocol, seems to pass as long as you don't use headless mode.

The WebGL fingerprinting thing is cute, too. I guess it'll buy them some time since off-the-shelf solutions are going to probably not handle this well yet. That said, as long as the reward for bypassing turnstile and other anti-bot protections remains high, these things really can't do much. A decently resourced adversary can probably come up with a dozen different approaches to make this less useful. Without really looking into it much, my kneejerk is you could probably tweak Mesa to have deterministically random behavior for whatever edge cases it looks for, but you could also just have lots of different GPU/driver combos to proxy to. The web gets less open, but in an asymmetrical way. If you really have an incentive to keep botting, you'll surely find a way.

The next step is to fully give up and just essentially implement WEI. And then the bot problem disappears?

Nope. Botting will still hold tremendous value, so likely there will be many crafty workarounds and bypasses over time. And there will be countermeasures for those and workarounds for that. Guess we'll start to find out who actually has the resources and incentives to keep botting in this environment.

So what's the real solution? Well the most obvious thing to do would be to make botting less valuable. Can we? I dunno. It may have been a mistake to move so many important things to the Internet after all. I mean, some of this is just threat actors catching up with what's possible and was inevitable to begin with. But, some of it is just trying to find solutions to problems that were unnecessary to begin with. Or failing to implement solutions despite an obvious need to do so.

There are a lot of threads to pull on, here. Account takeover still holds tremendous value to threat actors. Why? In my opinion, it's because passkeys were a tremendous failure, no matter what adoption shows. If we wanted to just improve security for users, I think we didn't need to restructure the internet around another authentication mechanism that of course, provides attestation capabilities, we could've just improved on passwords. For more secure handling of passwords, PAKEs exist. Password managers exist. For anti-phishing, TOTPs exist. What if you could have the exact same passkey experience, but in such a way that everything can gracefully fall back to just passwords and TOTP, because they're the real keymatter at the end of it? Add a web standard that lets browsers and browser extensions hook into the login process, standardize PAKEs as part of the web. Cross-vendor synchronization? A problem easily solved if we ever wanted to.

Instead of that, we got the dumbest possible world. Passkeys are sometimes available, but often not. Can you sync your passkeys across devices? Probably, maybe they have blacklisted KeepassXC by now so maybe I can't :)

But a lot of stuff doesn't even offer me the option to use passkeys, so they still use passwords. Can I enter my password to log in still? No, of course not. See, I will helpfully get the option to enter my password, in addition to the option to use email or SMS, the most secure authentication scheme known to Man, but if I actually select password and enter my secure password from my secure password manager, what I get to find out is that the password option is actually password and email or SMS and there's no option to use TOTP. Oh, and you randomly get logged out for no reason sometimes.

Some of the bots will probably disappear. Like, whatever bot is throwing me several terabytes of nonsense traffic every month will probably eventually disappear since they're wasting so much bandwidth on doing literally nothing. I have no idea what the point is, but I know it can't be terribly valuable for them, and it's not terribly expensive for me. I'd love to know who the hell is doing that and why, though.

But since the web is run mostly by crap companies like Google, it will never get its shit together, and we will get solutions like WEI and identity verification to solve problems that were entirely manufactured (or caused by a significant lack thereof) in the first place.

NamlchakKhandro42 minutes ago

It's completely fucked.

By virtue of incompetent and ignorant devs and middle managers. Or by virtue of greed and maliciousness.

Yeah yeah never attribute to malice what can be explained by stupidity... This time no. It's both.

jeroenhd10 hours ago

> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

For good reason. I've run that setting for ages but I kept having to disable it and add workarounds because websites would break in weird ways. Timezones in scheduling websites being messed up nearly made me miss a couple of appointments. There's no way to tell the user Firefox isn't broken without displaying a permanent banner like "if websites are broken in any way or you see weird glitches or your computer's time is wrong or fonts look weird or videos don't always work right, click here to disable fingerprinting protection".

Interestingly, Turnstile breaks with resistfingerprinting but works with fingerprintingProtection, I guess the latter takes this crap into account.

croes10 hours ago

Maybe a good reason for not enabling it by default, but a bad reason for not enabling it for strict settings.

I somewhat expect breaking sites with strict settings; I don’t expect a still wide-open tracking path.

That’s deceiving.

userbinator8 hours ago

"If they know you're spoofing, you're not spoofing hard enough."

This stupid "war against bots" is going to lead to the downfall of the Internet and effectively turn it into another walled garden where only "approved" (anti-)user agents are allowed. Don't fall for the nonsense about "AI scrapers" --- it's just a way to manufacture consent.

0x597 hours ago

Idk, if bots are hammering your server then set up rate limits. If you have content that you don't want others to have access to, don't serve it with a webserver.
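For context, a per-IP rate limit can be as small as the token bucket sketched below. This is a minimal in-memory illustration, not any particular server's implementation; the class name, parameters, and example addresses are hypothetical.

```python
import time


class PerIPTokenBucket:
    """Each IP gets `capacity` tokens that refill at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock      # injectable so the limiter can be tested
        self._state = {}        # ip -> (tokens, time of last update)

    def allow(self, ip: str) -> bool:
        now = self.clock()
        tokens, last = self._state.get(ip, (self.capacity, now))
        # Refill for the time elapsed, never exceeding the bucket size.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        self._state[ip] = (tokens - 1 if allowed else tokens, now)
        return allowed
```

A handler would typically answer with HTTP 429 when `allow` returns `False`. Note that a limiter keyed on IP does nothing against the address-rotating crawlers described elsewhere in this thread, and a real deployment would also need to evict idle entries.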

TkTech7 hours ago

I used to just start giving any IP downloading way too much a redirect to multi-TB NASA images. This was a long time ago, but it was surprising how many would follow redirects and never time out. Wouldn't see a request again for hours, and then it's right back to downloading a new part of the sky.

Those images also used to crash all the early GUI irc and chat clients that showed inline images without size checks...

mcosta5 hours ago

How do you know it followed the redirect and downloaded the image?

dotancohen7 hours ago

How were you tracking each IP address's data usage? Did you parse the logs every request? Store usage in a database? At the application or webserver level?

TkTech4 hours ago

Webalizer! I'm not sure there were really any other options at the time other than writing your own. It parsed the Apache logs and gave you pretty detailed results, and you could see the usage (in kB, which tells you how long ago this was!) broken down by date and IP.

Once you added a redirect rule for the IP to apache you'd just check your log and see the IP that was hitting you every couple of minutes poofed for a good few hours.
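The per-IP usage report that a log analyzer of this kind produces can be approximated in a few lines. This sketch assumes Apache's Common Log Format; the sample lines, documentation-range addresses, and the byte threshold are invented for illustration and do not reproduce the original setup.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)')


def bytes_by_ip(lines):
    """Sum the response sizes per client address; '-' means no body was sent."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, size = match.groups()
        totals[ip] += 0 if size == "-" else int(size)
    return totals


def heavy_hitters(lines, limit_bytes):
    """Addresses whose total transfer exceeds the given limit."""
    return sorted(ip for ip, total in bytes_by_ip(lines).items() if total > limit_bytes)


SAMPLE = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /a.iso HTTP/1.0" 200 700000000',
    '192.0.2.10 - - [10/Oct/2000:13:56:01 -0700] "GET /b.iso HTTP/1.0" 200 650000000',
    '203.0.113.5 - - [10/Oct/2000:13:57:12 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '203.0.113.5 - - [10/Oct/2000:13:57:13 -0700] "GET /missing HTTP/1.0" 304 -',
]
```

Calling `heavy_hitters(SAMPLE, 10**9)` returns the addresses that would be candidates for a redirect or block rule.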

pmdr6 hours ago

This. What even is the point of blocking scrapers if Google consumes your content anyway and serves it as an AI answer?

These are sad times we're living as far as openness of the web goes. People would have less of a scraping problem if their websites didn't ship with 20MB of JS.

remus6 hours ago

> What even is the point of blocking scrapers if Google consumes your content anyway and serves it as an AI answer?

Google bot is generally fairly well behaved, but this is not the case for all scrapers and it can cause significant traffic (and expense).

matheusmoreira3 hours ago

Yeah. I can already see the future. Only computers that pass remote attestation will be able to connect to the internet at all.

konform9 hours ago

I'm maintaining a minority browser[0] and as of a couple of weeks this is affecting several of our users[1]. While I'm currently not considering this a browser bug (one could be involved, of course), more eyes are better and any help or ideas on improving or mitigating the situation would be appreciated.

[0]: https://konform-browser.codeberg.page/

[1]: Most? All? Without any telemetry, relying on user reports and our own testing here.

Animats10 hours ago

Is there a deal between Google and Cloudflare to make non-Chrome browsers harder to use? The pressure to use Chrome keeps increasing, and the amount of ad filtering you can do in Chrome keeps decreasing.

denismi1 hour ago

As someone who runs Firefox on both Linux and Android, with Enhanced Tracking Protection enabled, and tries to use web over native mobile apps wherever possible ... I really don't feel this at all?

wnevets9 hours ago

I would wager it's one of the natural consequences of Chrome being the most popular browser on the web. Most legit traffic will be from Chrome.

hack13128 hours ago

only chrome was approved for use internally at cf 5 years ago when i left

tardedmeme9 hours ago

Yes

rfl8908 hours ago

>It looks like you're trying to hide your identity.

You were never entitled to it in the first place

WatchDog5 hours ago

By the same token, you aren't entitled to see the website content.

rfl8904 hours ago

True, but that's at the discretion of the content author/publisher, not Cloudflare Turnstile.

malka198613 hours ago

Thanks, i did not know about `privacy.resistfingerprinting`

I'll make sure to fail all cloudflare turnshit in the future.

gruez13 hours ago

I have it enabled and turnstile works fine.

jeroenhd11 hours ago

It breaks Turnstile for me on Android. Had to restart the browser for it to take effect of course.

fulafel1 hour ago

WebGL fingerprinting is of course an attack and an unintended use of the WebGL API. Browser vendors should respond to this misuse somehow (reputation-based blacklist?).

Kiboneu10 hours ago

In other words, Cloudflare requires you to substantially increase your browser’s attack surface in order to visit websites.

dblohm711 hours ago

> Plus privacy.resistfingerprinting isn't enabled even when selecting "Strict" "Enhanced Privacy Protection" in the settings, great job there Mozilla.

That pref is there for the Tor Browser.

konform10 hours ago

It's enabled by default in Tor Browser and I'm not sure it can even be disabled?

Also enabled by default for Konform Browser and Mullvad Browser, which borrow many of the privacy- and security-related patches from Tor Browser.

gorgoiler9 hours ago

I always like the axiom with crime that once X% of the population are violating a statute then it should probably be struck off. Recreational drugs being the obvious example.

If randomized canvas stuff was cracked down upon as a bot thing but now everyone with a copy of Firefox is doing it, maybe Cloudflare should just “legalize” it?

aboardRat42 hours ago

Everyone with a copy of Firefox is about 2% of the web.

avallach12 hours ago

Doesn't this mean we just need to make the webgl fingerprint resistance implementation smarter? Instead of explicitly rejecting webgl access or responding with dummy data, respond with data that is random within space of N common and reproducible patterns. E.g. emulate webgl implementation of some low spec but actually popular devices.
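
A rough sketch of what that selection could look like, in Python for readability. Everything here is an assumption: the profile strings are placeholders rather than real device data, and the per-session secret and function name are invented. The point is only that the answer is drawn from a small shared pool, stays stable per site within a session, and can differ across sites:

```python
import hashlib
import hmac
import secrets

# Placeholder pool. A real one would have to be built from measured,
# genuinely common devices; these strings are made up.
PROFILES = [
    {"vendor": "Example Vendor A", "renderer": "Example GPU 1"},
    {"vendor": "Example Vendor A", "renderer": "Example GPU 2"},
    {"vendor": "Example Vendor B", "renderer": "Example GPU 3"},
]

# Hypothetical per-session secret, regenerated whenever the browser restarts.
SESSION_SECRET = secrets.token_bytes(32)


def profile_for(origin: str, secret: bytes = SESSION_SECRET, pool=PROFILES) -> dict:
    """Pick one profile per (session, origin) so repeated reads agree with each other."""
    digest = hmac.new(secret, origin.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]
```

Keying on the origin matters: a site that reads the value twice gets the same answer, which is what separates this from per-call random noise.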

btown10 hours ago

The last screenshot in the OP article mentions that "a browser extension... adding random noise to canvas data" can be detected. Which isn't to say this perfectly detects all such randomization, but it's certainly an active part of the arms race.

ranger_danger4 hours ago

Yes but the idea is that the protection should be part of the browser itself, then it becomes the expected norm AND isn't really "detectable" since there's no extension to redefine javascript variables. Scraper-friendly solutions like Camoufox or CloakBrowser make such changes to avoid having the same fingerprint every time while still appearing normal.

bflesch12 hours ago

All of those advanced features should be enabled on a per-website basis but unfortunately even browsers whose marketing focuses on privacy don't allow you to do that. Same with TLS root CA certificates, there is no way to configure that a certain CA can only create certificates for certain domains.

adamtaylor_1313 hours ago

So if you need to prevent bot abuse, but also don't want an ugly captcha every time someone goes to sign up, is there a better option?

ribtoks12 hours ago

Use proof-of-work captchas, many are private by default. Look into Private Captcha or Cap captcha.

mootothemax10 hours ago

Speaking from the scraper’s perspective, I like proof of work; a ten year old 96-core server will cost a couple of quid to run for a few hours and will grab an absurd number of pages thanks to the access granted by repeatedly solving proofs of work. Small slick codebases too!

tardedmeme9 hours ago

There's also the Anubis idea where your PoW is persistent until your IP address or session cookie changes, so you get to skip PoW in exchange for making yourself identifiable, which means the PoW can then be ramped up to take a couple of minutes.

I don't use Anubis though. I just make my site not take five seconds to render a page so bots can overload it easily? It's not actually that hard?

Velocifyer6 hours ago

It would be more profitable to mine bitcoin.

matheusmoreira2 hours ago

Can this be repurposed as some kind of distributed cryptocurrency mining mechanism? Pay websites by mining some monero in order to access them?

arbol8 hours ago

PoW doesn't stop bots.. It's an annoyance at most. A rate limiter and nothing more

0123456789ABCDE7 hours ago

PoW difficulty can be scaled, eg: all cookies must work 1s, but 2nd cookie from the same ip, might have to do 2s of work

ideally one would pick something a bit more forgiving than a linear function, to avoid penalizing too much users connecting from CGNAT
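
One way to read "more forgiving than linear" is a logarithmic curve. A minimal Python sketch, assuming the server counts how many cookies it has issued to an IP within some window; the function name, the 1s base, and the 30s cap are illustrative choices, not anything a particular tool uses:

```python
import math


def work_seconds(cookies_from_ip: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Target solve time for the n-th cookie requested by one IP.

    Grows with log2(n) instead of n, so a busy CGNAT address is
    slowed down rather than locked out, and never exceeds `cap`.
    """
    n = max(1, cookies_from_ip)
    return min(cap, base * (1.0 + math.log2(n)))
```

With these numbers the 1st cookie costs 1s and the 2nd costs 2s, matching the example above, but the 1024th costs 11s rather than 1024s.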

phoronixrly12 hours ago

How does proof of work stop bots?

stephantul12 hours ago

Because it destroys the economics of scraping. It’s too expensive with proof of work, or at least not as economically viable

ranger_danger4 hours ago

5W load for 2 seconds is about 0.003Wh, I think we'll be fine
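
Spelling out the arithmetic (5W and 2 seconds are the figures from the comment above; the helper name is made up):

```python
def watt_hours(power_watts: float, seconds: float) -> float:
    """Energy in watt-hours for a constant load held for `seconds`."""
    return power_watts * seconds / 3600.0


per_challenge = watt_hours(5, 2)           # one proof-of-work solve
per_million = per_challenge * 1_000_000    # a million solves
```

So a single challenge is about 0.0028Wh, and a million of them come to roughly 2.8kWh.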

arbol8 hours ago

Except it doesn't

ray_v12 hours ago

If it gets too expensive/time-consuming to scrape then it won't happen at scale (as much)?

keynha9 hours ago

Behavioral signals are the usual answer: risk-scored, invisible challenges; proof-of-work (cost without identity, though it taxes mobile); and signup-velocity/rate limits that stop cheap abuse before any challenge fires. The reason fingerprinting wins anyway is that it requires less operator effort, not that it is the only thing that works.

arbol8 hours ago

Behavioural requires interaction. Fingerprinting is instantaneous and cloudflare runs on page load for lots of sites

keynha7 hours ago

[dead]

ImPostingOnHN12 hours ago

The tool "Anubis" uses proof of work instead

BetterThanSober12 hours ago

With a tuned cool down period this isn't a problem, especially if you frequent the sites. OpenWRT uses Anubis and usually when I need to peruse their site I'm on a very low-end device. I prefer waiting much more over finding Waldos

But in principle I agree that there's no good answer to this. Scraping _is_ useful, and I bet most of us here have scraped something. It is the AI companies and their use of people's material for training, without consent or return, that led us to this (I know botting has existed on forums for as long as forums have been a thing, but it is easily handled by human moderators and keyword filters).

timpera12 hours ago

Anubis often takes more than 60 seconds to complete on low-end devices (especially old smartphones). It seems like there's no good solution.

QuantumNomad_11 hours ago

But after you’ve completed the Anubis PoW challenge for a site, it remains valid for some amount of time.

So it’s not quite as horrible as it sounds.

I have setting up Anubis for my own sites on my todo list. And I wish more people did it too. I don’t really mind waiting a little bit extra every now and then before the page loads. What I do mind is ReCaptcha asking me to click all the pictures with buses in them etc. And especially when I have to do it several times over before it’s happy. I’d rather wait a minute for a page to load than to ever solve a ReCaptcha again, if given the choice.

dangus12 hours ago

That must be really low end then. I’ve never seen it complete in a timeframe that was slower than “I can’t even read the page before it redirects”

titularcomment10 hours ago

My guess is it's an implementation error, not a hardware limitation. I have two 10-year-old devices and one passes instantaneously while the other halts for a good half minute every time.

ImPostingOnHN12 hours ago

There's not an easy, perfect solution, for sure. Newer phones get faster, but spammer compute gets cheaper.

Some sort of decentralized trust web seems like another option, though less viable.

WesolyKubeczek12 hours ago

One of the unexpected outcomes of the AI-induced hardware shortage may be that compute won’t be getting cheaper and may in fact get more expensive…

phoronixrly12 hours ago

How does Anubis stop bots?

redwall_hp10 hours ago

Anubis is designed to stop a certain class of badly behaved bots. It intentionally doesn't run if a bot identifies itself with a UA, such as Googlebot, because then you can rate limit it or block by UA and with other tools.

Anubis is active when a user agent looks like a web browser (e.g. contains the "Mozilla" substring every major browser uses). The reverse proxy serves an interstitial page that does a proof-of-work check, validated server side, setting a cookie if it passes.

This means a legitimate user won't constantly get the proof of work check, because they already passed it. But AI bots rotating through tons of residential IPs to scrape your forum or git forge or whatever will be slowed down.

Overall, I like the idea. It's unobtrusive, privacy preserving, and seems to be working out well for a lot of sites.
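
To make that flow concrete, here is a hashcash-style sketch in Python. It is not Anubis's actual code: the SHA-256 leading-zero-bits puzzle, the HMAC-signed cookie, and every name below are assumptions chosen to show the challenge, solve, verify, set-cookie shape described above:

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # hypothetical server-side secret


def make_challenge() -> str:
    """Random string the interstitial page hands to the client."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count zero bits at the front of a hash."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce. Cost doubles per extra bit of difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def issue_cookie(challenge: str, key: bytes = SERVER_KEY) -> str:
    """Signed token so later requests skip the puzzle."""
    tag = hmac.new(key, challenge.encode(), hashlib.sha256).hexdigest()
    return f"{challenge}.{tag}"


def cookie_is_valid(cookie: str, key: bytes = SERVER_KEY) -> bool:
    challenge, _, tag = cookie.partition(".")
    expected = hmac.new(key, challenge.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)
```

A real deployment would also bind the cookie to an expiry time, and the solver would run as JavaScript in the visitor's browser rather than in Python.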

basilikum10 hours ago

The real answer is that it makes sites behave differently, requiring the bots to make slight adjustments.

And there are just not enough sites using Anubis for the people and companies running the bots to care to do that.

If you do care, bypassing Anubis is trivial.

arbol8 hours ago

It doesn't. It slows them down. To stop bots you need to employ the full suite of tools, fingerprinting, IP rep, behavioural analysis. Anubis will slow down your basic scrapers that try to crawl the entire web but it is useless against actual bots

xena12 hours ago

Bots don't execute JavaScript or follow complicated redirects.

pwg11 hours ago

Bots don't [currently] execute JavaScript or follow complicated redirects.

They don't now, but enough "high value to the bots" pages turning on JS or complicated redirects will simply result in the bot authors adding JS execution or redirect following so they can continue "botting" the sites they want to scrape.

It's a hole with no bottom. Each one-up on the anti-bot side will eventually be handled on the bot side.

ExpertAdvisor019 hours ago

That's not true. A lot of bots are just headless Chrome instances.

ranger_danger4 hours ago

They have been doing it for years: https://roundproxies.com/blog/bypass-bot-detection/

baq8 hours ago

The logical next step would be for them to allow you to pay to pass the check and become the ultimate Internet toll booth.

whatwhyisthis4 hours ago

Hiding things from them automatically bins you with agents that have a reason to hide things from them.

Which, to be clear, is the entire problem: given how much of the internet goes through them, they should have enough alternative signals that you’re not a bad actor, and those signals should be stronger than this specific one.

However, this also presents the problem that there are barely any users in their base with your exact configuration, so getting any actual solutions might just take forever.

elivoncoder3 hours ago

interesting topic. 3 of my browsers failed that test page. konqueror. and on android, vanadium and cromite.

https://browser-compat.turnstile.workers.dev/

4oo412 hours ago

I tested this extension that I've been using for a long time on the turnstile page and it got through, fwiw. I think it's a bit more subtle than how resistfingerprinting works but not sure what the privacy tradeoff is.

https://github.com/kkapsner/CanvasBlocker

tosti10 hours ago

Looks cool. And I wonder why I'd run this over JSshelter. It appears to do the same thing, no?

4oo46 hours ago

JSshelter looks cool, I'm not familiar but this makes it seem like it operates more like resistfingerprinting by blocking outright instead of noise injection, at the expense of more broken sites?

https://jshelter.org/fpd/

What all security extensions do you run? After running into issues over the years, with extensions doing multiple things that fight each other, I switched to trying to block via ublock origin as much as possible, then prefer other extensions to just do one thing to extend coverage, like this one. Makes it much easier to troubleshoot/exclude/disable when it breaks something vs. fiddling in settings.

BoingBoomTschak10 hours ago

Thanks for the report, I've been running this for a long time.

nulledy13 hours ago

As turnstile users on several of our sites, I think we need to revisit that decision.

sammy225513 hours ago

Out of curiosity, why did you have it on in the first place?

nulledy11 hours ago

Bot rejection for contact forms. Better UX than reCaptcha.

nlitened8 hours ago

Did you think it rejects bots by using some kind of magic?

nulledy5 hours ago

Well, of course not, don't be silly. But if it blocks visitors of our site from using non-standard browsers, perhaps it's worth exploring alternatives.

Nearly all of our sites are visited by extremely tech literate folks, the exact type that may not be using Google Chrome or Firefox.

X-Istence5 hours ago

This is an issue I am running up against on Safari (Version 26.5 (21624.2.5.11.4)) on MacOS 26.5.

I keep getting the turnstile and having to click the "I am human" button.

mixologic8 hours ago

Privacy and Bot defense are opposite ends of the same fulcrum. If you permit privacy, the site/service has to trust users to behave and follow the rules. If you track users, then the users have to trust the site/service owners not to abuse that trust. There isn't really an in between.

So if you want privacy, you have to accept poor and sometimes insecure services.

Dwedit11 hours ago

Adding noise to a canvas element is a mistake anyway. It means you can't develop a proper paint program using web technologies because your browser will mess with the image.

tosti11 hours ago

You can still do that, but it may not be rendered correctly in a screenshot.

kordlessagain12 hours ago

I did warmups in Grub Crawler to fight this: https://deepbluedynamics.com/grub

JoshTriplett12 hours ago

"This makes your browser appear suspicious because it looks like you're trying to hide your identity."

Yeah, this needs to be burned to the ground.

gruez12 hours ago

Bad optics aside, it doesn't actually reflect reality. See my other comment. You can enable basically all the privacy settings and still pass turnstile. Tor browser in a VM passes it, of all things.

https://litter.catbox.moe/gaizpk692bhhs6b7.png

JoshTriplett12 hours ago

Any idea what the difference is between your setup and the one in the article that failed with fingerprint-resistance enabled?

gruez11 hours ago

He's using a custom browser, apparently: https://hacktivis.me/projects/badwolf

morpheuskafka7 hours ago

I'm getting this error on Safari 26.3.1 even without an adblocker extension, and advanced tracking prevention is set to private tabs only.

jameson8 hours ago

I use LibreWolf, which disables the WebGL API by default, and I don't have this issue. What could be the reasons I'm passing CF Turnstile?

majorchord4 hours ago

CF uses more than just WebGL to fingerprint users... LibreWolf isn't helping you as much as you think it is.

https://abrahamjuliot.github.io/creepjs/

goda907 hours ago

Already fingerprinted, perhaps?

Wowfunhappy13 hours ago

...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

Obviously this is terrible, but I think there's a possibility it's the least terrible option? Another option is IP reputation, which I think is worse. Or scanning a code with a non-rooted phone, which I think is even worse than that!

fidotron13 hours ago

> ...in the age of AI, does anyone have an actual solution for keeping out bots while preserving the privacy of humans?

There isn't one, and pretending otherwise is nonsense because humans will always provide their credentials to something to act on their behalf.

In the limit you end up with Chinese phone farms.

tardedmeme13 hours ago

Right. Botnet operators love cloudflare because they make so much money renting out compromised machines to pass their tests.

thisislife212 hours ago

The only solution is regulation. If all content created by anyone has a copyright, how does an implicit opt-in (which is what happens if you don't create a robots.txt file for your website) for scraping make any sense? Moreover, even if you have a robots.txt, AI (or whatever) bots often don't respect it (or use workarounds - they outsource scraping of such "restricted" sites to unethical third-parties to get the data; Meta has even resorted to piracy, openly!). So clearly, the logic and the "honour system" has failed.

Cloudflare, Google Captcha, HCaptcha etc. are all shitty technical solutions because, as we are all discovering, they come at the cost of our privacy (i.e. our personal data may be monetised by these services) and / or our computing resources and time. If current copyright laws aren't sufficient to prevent this, we have to acknowledge the system is broken. The answer could be enhancing it with some kind of Digital Millennium Copyright Act (DMCA) -like laws, but in favour of the creators against BigTech or rogue actors.

- Web-scraping and copyright law - https://www.neudata.co/blog/web-scraping-and-copyright-law

- Why DMCA Claims Against Web Scrapers Face Long Odds - https://capstonedc.com/insights/why-dmca-claims-against-web-...

oceanplexian12 hours ago

Or you could let information be free, at least the stuff that’s on the public net.

As for issues like bots overloading websites or using too many resources scaling laws will take care of it quickly, it’s not like you can’t serve thousands of RPS from a Raspberry Pi these days.

arbol8 hours ago

Or the regulated agents standard that cloudflare is conveniently going to steward alongside Google...

ImPostingOnHN12 hours ago

I don't think regulation will stop web scraping, not least because it can be done from locations outside the jurisdiction of the regulations.

> we have to acknowledge the system is broken

The system is broken. It probably takes, what, 10 seconds or less to use a residential or foreign proxy, 6+ months to internationally track and prosecute a single offender? So like a million times more effort going the regulatory route.

thisislife212 hours ago

Just as criminal laws don't end all crimes, copyright laws and anti-scraping regulation won't end all scraping. But it will greatly reduce it and limit it to rogue actors. Two examples I can cite here are the laws against email spam and laws against unsolicited marketing calls - they had a definite impact in reducing both (even in India, from where I am, where implementation of laws is often lax).

Wowfunhappy3 hours ago

I basically agree that the idea should be to reduce, not eliminate, bots.

However, a big difference with crimes involving the internet is that they can be launched from anywhere. In the real world, I can't steal from someone unless I'm physically present in the same country as my victim. On the internet, the US could outlaw scraping and Russia would keep doing it.

JoshTriplett12 hours ago

Exactly. Bot activity is a problem of volume, not all-or-nothing. Solving 95% of it would be a win.

0x597 hours ago

I mean, you could just turn on WebGL or use an approved, secure, agent to access the web. If you have nothing to hide, then you have nothing to fear.

mschuster918 hours ago

> The only solution is regulation.

The reason Cloudflare got invented isn't AI scrapers. These are just the latest development... the original reason Cloudflare got created, and why it experienced such meteoric growth, is DDoS and botnets.

Yes. We need regulation in the AI space. But it will be useless as long as bad actors aren't held accountable - and a lot of the bad actors aren't in our jurisdictions. You got hacked devices all over the world in giant botnets, controlled by Russian, Chinese, Iranian and North Korean actors. You got Chinese AI scraper bots as China is heavily investing into training their own models. You got Indian, Filipino and Myanmar-based scammers.

And frankly I have no idea how to get all of that under control. As much as I'd like to see sanctions against both domestic and foreign enablers of abuse (which includes residential ISPs) - it's going to be one giant ass whack-a-mole game.

jeroenhd11 hours ago

Remote attestation should still be possible with a rooted phone if phone manufacturers weren't so shit. If the attestation happens at hardware level, it doesn't matter what programs or kernels you're running.

cr125rider13 hours ago

And identifying a bot that is acting on my behalf. Claude go search this topic is basically the same as Googling something and clicking on the results. Human driven AI searching needs to be in a different box than AI scraping for training data.

Which sounds extremely difficult to differentiate

JoshTriplett12 hours ago

Hopefully it stays that way; "a bot acting on my behalf" is still a bot. At least it's often a well-behaved bot and uses a user-agent that can be detected and blocked.

spacedoutman12 hours ago

Private invite only internets

arbol8 hours ago

LAN parties?!

Gander573912 hours ago

You don't need a non-rooted phone to pass captcha checks, I have a rooted phone and can pass the captchas that ask you to scan a qr code. But I doubt phones without google services would manage.

HWR_1410 hours ago

How does scanning a QR code prove any kind of captcha?

Gander573910 hours ago

https://support.google.com/recaptcha/answer/16609652 - it just launches the verification service.

ravenstine10 hours ago

Or maybe we can actually start paying for all the things we use on the Web, making it prohibitively expensive to deploy fleets of bots.

csomar12 hours ago

They are not a problem unless you "believe" it is a problem. I estimate around 20-25K hits to my website from bots per day and I have all cloudflare protections disabled. Any decently optimized server should be able to easily handle that. (it's roughly 1 request every 3 seconds).

specialp12 hours ago

Yes and that is just the bot background radiation of the internet. I run a primary source of information site and these botnets are aggressive to a DDOS level, all to do some sort of scraping. They have sophisticated enough tactics to DDOS us if they wanted to. However, I am not sure of their objective, as they have wasted enough of our resources to have scraped all our content 1000s of times over. That 25k traffic is a couple of minutes for us. And that adds up. 80-90pct of our traffic is this.

HWR_1410 hours ago

Assuming that the bots aren't repackaging your content and preventing users from seeing your blog by serving that content to them first.

thisislife212 hours ago

True. But it still wastes your server resources, right? And it's sad that you have to accept that as part of the "cost" of hosting a site ...

ndriscoll12 hours ago

What resources are you concerned about? An n100 minipc should be capable of serving something like a blog at 20k+ requests/second (or saturating its network).

malka198613 hours ago

> keeping out bot

You can forget about it. It is not possible. Simple as that.

Wowfunhappy13 hours ago

Let's say I'm selling concert tickets. How do I prevent bots from buying up all the tickets and scalping them?

arbol7 hours ago

- behavioural fingerprinting - ja4 - IP rep - queue mechanism - card country to IP country checks - app attestation - custom metrics based on knowledge of past scalpers

It's hard but it's not impossible. You can make it very inconvenient for scalpers. They need to poll at volume so their behaviour is very much detectable. A hard stance is required on IP rep, especially for more in demand concerts.

Wowfunhappy5 hours ago

I don't know, a lot of this seems just as invasive as WebGL fingerprinting, if not more invasive.

ranguna11 hours ago

Do it like plane tickets do, tie a ticket to an identity + buyback up to a week or so before the concert in case someone wants to cancel (or authorize the transfer and capture only a week before). Ask for ID and ticket at the entrance.

MyMemoryfails12 hours ago

I'd simply check filling speed; even with the browser's autocomplete, humans are slow due to needing to click submit.

Then when it's "processing", do them in bulk and prioritize slower users. There's a huge opportunity to do bot checks after checkout without affecting user experience.

Also, on product launches you could add a unique field that requires user input, for example; that way bots can't prepare for launches.

arbol7 hours ago

Yeah, this doesn't even begin to cut it

fragmede12 hours ago

huh. no wonder my password manager's auto submit triggers bot detection (it's a fairly popular one).

ndriscoll12 hours ago

Sell them via a Dutch auction. Eliminate the arbitrage opportunity for scalpers and make more money in the process.
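
For anyone unfamiliar, a Dutch auction starts high and lowers the price until the tickets are gone. A toy Python simulation, assuming each buyer wants one ticket and has a private maximum price; the function and its parameters are hypothetical:

```python
def dutch_auction(limits, supply, start_price, step, floor):
    """Descending-price sale.

    limits: the most each would-be buyer will pay (one ticket each).
    Returns a list of (buyer index, price paid) in order of sale.
    """
    price = start_price
    sales = []
    remaining = set(range(len(limits)))
    while price >= floor and len(sales) < supply:
        # Everyone whose limit covers the current price wants to buy now.
        takers = sorted(i for i in remaining if limits[i] >= price)
        for i in takers[: supply - len(sales)]:
            sales.append((i, price))
            remaining.discard(i)
        if len(sales) < supply:
            price -= step  # unsold tickets left, so lower the price
    return sales
```

With limits of 100, 80, 60 and 40 and two tickets, the sales happen at 100 and 80, so there is little margin left for someone hoping to resell.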

luckylion13 hours ago

Tie them to the buyer's identity, offer at-value buy-backs until X weeks before event, disallow resale.

doctorpangloss12 hours ago

web environment integrity

Wowfunhappy5 hours ago

But that is so clearly worse than WebGL fingerprinting!

ashishbijlani1510 hours ago

[dead]

aussieguy12342 hours ago

For the malicious bot authors, if WebGL is a "free pass" so that their browser is not detected as a bot, they'll simply switch to a chrome based browser such as CloakBrowser, which already passes CloudFlare Turnstile.

So no real benefit for bot detection here. Just a privacy nightmare for everyone else.

gausswho6 hours ago

Brazenly requiring the abuse of a browser feature's intended use against the user. What an age.

I'd like to hear from someone who worked on WebGL and how they feel about their ambitions being utterly subverted. Remember when the dream was playing games in the browser?

megous10 hours ago

They use all kinds of obscure APIs, which you'll learn if you're privacy/security conscious and disable random web APIs that are of no use to YOU as a web user, but only can ever serve the people who serve you stuff or want to hack you or track you.

Normally websites feature test and just skip using obscure disabled APIs, or more likely, websites don't use those APIs at all or only tracking scripts use it, which are already optional usually.

Problem with CF is that if you want increased security they'll prevent you from gaining it everywhere, even on sites they don't protect, or prevent you from accessing services even the ones you paid for. Browsers don't allow disabling APIs per domain, so you're either at risk everywhere or you're blocked from accessing a lot of things for no particular reason.

CF can't be bothered to feature test.

arbol7 hours ago

I'm no CF advocate but those random APIs are literally what differentiates people running Chrome on their computer versus a bot operation with a load of containers. Kubernetes clusters don't have GPUs. This is why it's used in bot detection (I use Brave with no hardware acceleration and I get captchas everywhere)

zuzululu10 hours ago

Don't like it, but it's a reality due to bots

SilverElfin10 hours ago

This company makes the internet unusable if you value privacy and use VPNs or whatever. Evil.

pmdr6 hours ago

Can't directly outlaw VPNs? No problem, we'll have the few corporations powering the internet block anyone who even thinks about anonymity!

boywitharupee9 hours ago

> has been looping indefinitely

this can mean WebContent process is crashing

J37T3R7 hours ago

Web3.0 and beyond was a mistake

anonym2913 hours ago

Say no to malware - say no to Cloudflare

bflesch12 hours ago

Firefox has so much built-in tracking it seems they want to push me to build my own browser. For example every time you open the settings there are several ways they are sending out pings to certain extensions.

Also by default addons.mozilla.org is a privileged site so of course they include google tracking in it and they get the proper fingerprint no matter what you have configured.

konform10 hours ago

If you are this motivated (I am!), how about joining forces on Konform Browser? Radio silence and remote third-party integrations disabled by default and generally sane and conservative defaults respecting old-fashioned notions like individual consent and data-protection regulations.

Aside from general dev, could use a hand in bringing it to more platforms (mobile and flatpak are frequently asked) and taking a closer look at fingerprinting protections and what's currently tripping up the turnstile.

https://codeberg.org/konform-browser/source

kykat13 hours ago

What? Big tech company is evil? No way! I thought cloudflare were good guys...

aleksandrm13 hours ago

What gave you the impression that Cloudflare were the good guys?

tardedmeme13 hours ago

Probably everyone on HN singing their praises for the past 10 years.

tick_tock_tick8 hours ago

Pretty sure every thread has a massive chain about them being a NSA honey pot.

kykat12 hours ago

And my og comment getting downvoted on this very intellectual forum that definitely isn't an echo chamber

Petersipoi12 hours ago

Your very sarcastic, uninteresting comment getting downvoted is not an indication that this forum isn't intellectual. It's an indication that you aren't behaving intellectually.

bflesch12 hours ago

Cognitive dissonance in tech millionaires is quite strong, still worth it to trigger them from time to time on a factual basis.

aboardRat413 hours ago

Big tech companies are always visited first by the G-men who need something done.

shevy-java12 hours ago

I wondered about that too. So they allege that bots require that everyone now has to ID to the big service providers. Very dystopian situation. Skynet is currently winning the war.

Fokamul13 hours ago

Please, anyone from EU (US is doomed rofl) create a petition to ban browser-fingerprinting in EU, across all existing browsers.

I'm not good at creating petitions but can happily sign it. Also with stop killing games and anti-chat control.

I can imagine this can get traction, if it's explained in a YouTube video to "normal" people.

fidotron13 hours ago

A better solution would be to make webgl, webgpu and (especially) webrtc have some sort of prompt before they can be in any way used in that fashion, but this will absolutely destroy web ux Windows Vista style.

JoshTriplett12 hours ago

And then the gatekeepers like Cloudflare will say "please hit accept in order to verify your browser and access this site".

richwater12 hours ago

You mean the "Accept Cookies" banner that has become a complete joke? Pass

MyMemoryfails12 hours ago

I think he means browser permissions. For example, when a site wants to send notifications or record your mic, there's a permission check; something similar for WebGL.

J-Kuhn12 hours ago

Fun Fact: When Cookies were introduced into Netscape, you got a browser permission prompt. Then browser vendors set it to allow by default.

And then legislation required those consent boxes back, so everyone built their own, instead of demanding that the default should be changed back.

bflesch12 hours ago

It's about explicitly deciding to allow certain capabilities on a per-website basis. No major browser allows defense-in-depth via fine-grained website permissions.

Even simply changing the user agent was sabotaged at Firefox, and choosing one user agent per domain is wishful thinking.
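
To make the per-website idea concrete, here is a minimal sketch of a default-deny capability store keyed by origin. Everything in it is hypothetical: the `SitePermissions` class and the capability names are invented for illustration, and no browser exposes WebGL, WebGPU, or WebRTC through an interface like this.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Hypothetical capability names; real browsers do not gate WebGL this way.
CAPABILITIES = {"webgl", "webgpu", "webrtc"}


@dataclass
class SitePermissions:
    """Default-deny store of capability grants, keyed by origin."""

    grants: dict[str, set[str]] = field(default_factory=dict)

    @staticmethod
    def origin(url: str) -> str:
        # Reduce a URL to scheme://host[:port] so grants apply site-wide.
        parts = urlsplit(url)
        return f"{parts.scheme}://{parts.netloc}"

    def allow(self, url: str, capability: str) -> None:
        if capability not in CAPABILITIES:
            raise ValueError(f"unknown capability: {capability}")
        self.grants.setdefault(self.origin(url), set()).add(capability)

    def is_allowed(self, url: str, capability: str) -> bool:
        # Anything not explicitly granted for this origin is denied.
        return capability in self.grants.get(self.origin(url), set())
```

Granting `webgl` to one site leaves every other capability, and every other site, denied. That is the fine-grained behaviour being asked for.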

fsflover10 hours ago

This is actually illegal under GDPR.

arbol7 hours ago

You literally can't get rid of it without introducing government-issued ID to buy any scarce, freely accessible items.

raincole7 hours ago

Which is why it's very likely to happen, especially in the EU.

jeroenhd10 hours ago

Fingerprinting is just an implementation; banning it will just drive these companies to invent new tricks. That's why the GDPR doesn't specify any technical tracking methods: whether you're using cookies, fingerprinting, or a camera drone looking at the user's screen, tracking without consent or good reason is banned.

I doubt politicians care much about fingerprinting, though. They're more afraid of actual businesses getting attacked by bots than they are about Linux users with weird setups not being able to access some websites.

koolala13 hours ago

a. Accept All

b. Accept Only Necessary Fingerprinting

gruez12 hours ago

This blog post is filled with false assumptions.

>Turns out it's because Cloudflare wants to have a fingerprint of your device via WebGL, the only reason for doing this would be tracking.

> So Cloudflare just banned all WebKitGTK browsers as I guess they put an exception for Safari.

This is false. I ran firefox with:

* hardware acceleration disabled (so software renderer, nothing to fingerprint)

* resistfingerprinting enabled, including letterboxing with default window size

* webgl disabled

* VPN enabled

* In a Windows VM

By all accounts this should be the most suspicious fingerprint ever, but turnstile happily lets me through. If they want to track people, they're doing a pretty bad job. My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

> Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".
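
A toy sketch of why noise is a mitigation and not a block. This is an assumption-laden model, not Safari's or Firefox's actual implementation: the `fingerprint` and `add_noise` helpers are made up, and a plain byte buffer stands in for the pixels a WebGL readback would return.

```python
import hashlib
import random


def fingerprint(pixels: bytes) -> str:
    """Hash a rendered pixel buffer, as a fingerprinting script might."""
    return hashlib.sha256(pixels).hexdigest()


def add_noise(pixels: bytes, session_seed: int) -> bytes:
    """Flip the lowest bit of a few bytes, chosen from a per-session seed."""
    rng = random.Random(session_seed)
    noisy = bytearray(pixels)
    for index in rng.sample(range(len(noisy)), k=min(4, len(noisy))):
        noisy[index] ^= 1  # visually negligible, but changes the hash
    return bytes(noisy)
```

The readback still succeeds and still returns data of the same size; only the exact hash stops being stable. A script that hashes the raw pixels loses a stable identifier, but the API itself keeps working, which is the sense in which it is not "blocked".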

konform10 hours ago

I think your comment is also making plenty of assumptions.

Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

My guess is a different flavor of the same: Not matching an expected fingerprint (simplified: whitelist vs blacklist approach) combined with other factors.

[0]: I'm currently aware of Tor Browser, Konform Browser (am dev), Mullvad Browser, and to a certain extent Waterfox, LibreWolf, and r3df0x doing that.
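
The whitelist vs blacklist distinction above, as a minimal sketch. The fingerprint labels are invented for illustration and say nothing about what Cloudflare actually matches on.

```python
# Illustrative fingerprint labels only; not real Cloudflare data.
KNOWN_GOOD = {"firefox-desktop", "chrome-desktop", "safari-ios"}
KNOWN_BAD = {"curl", "headless-scraper"}


def blocklist_allows(fingerprint: str) -> bool:
    """Blacklist approach: let everything through unless it is known bad."""
    return fingerprint not in KNOWN_BAD


def allowlist_allows(fingerprint: str) -> bool:
    """Whitelist approach: reject everything unless it is known good."""
    return fingerprint in KNOWN_GOOD
```

An unfamiliar but harmless browser, say a `webkitgtk-unknown` fingerprint, passes the blocklist check and fails the allowlist check. Under the second model a browser gets blocked simply for not being on the list.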

gruez9 hours ago

>Official Firefox can be leaky unless you build it yourself with some build-time changes or use a fork with such[0]. Am I guessing right that you still have Webcompat, RemoteSettings, and Nimbus enabled? How do you know a compatibility intervention isn't causing your browser to open the kimono just enough to "unbreak the page"?

See my other comment, tor browser works fine too: https://news.ycombinator.com/item?id=48346659

jeroenhd11 hours ago

Enabling resistfingerprinting on my Android phone shows me the same error screen. It's not just webkit.

fingerprintingProtection works fine on the other hand, but then again that's intentionally less intrusive.

shiomiru12 hours ago

> My guess is that OP's browser is getting banned because his WebKitGTK has a weird fingerprint, not because of webgl or whatever.

So why is Cloudflare saying the author got blocked because of WebGL?

> > Such things are blocked in WebKit, and have been for years. Meaning it's tracking so awful that even Apple would block it, and as far as I can tell it's not the kind of privacy protection you can easily disable in it.

> This is also false. Webgl fingerprinting works just fine on Safari. They might try to mitigate it by adding some noise, but that's not so different than what firefox does, and is certainly not "blocked".

While I don't have an iDevice to try, the assumption that they are special cased is fair... because they are: https://blog.cloudflare.com/eliminating-captchas-on-iphones-...

(Yes, this is basically WEI in a shinier package.)

gruez12 hours ago

>So why is Cloudflare saying the author got blocked because of WebGL?

No idea. I can't even reproduce the error OP got with webgl disabled.

https://litter.catbox.moe/y42l22k97tgv96nx.png

superkuh12 hours ago

Yep. Cloudflare and Cloudflare's customers don't care about blocking people who use non-standard browsers (or accessible browsers, or feed readers, or whatever). Using Cloudflare defaults is basically saying, "Only major corporate browsers released in the last year or two can access this site."