GitHub is introducing rate limits for unauthenticated pulls, API calls, and web access

chaospatterns@lemmy.world · edit-2 18 days ago

traches@sh.itjust.works · 18 days ago

Probably getting hammered by ai scrapers

adarza@lemmy.ca · 18 days ago

you mean, doin’ what microsoft and their ai ‘partners’ do to others?

Also, as Microsoft appears to have recognized scraping for AI training as a problem, are you ceasing your own scraping activities on public code and the larger web, or is this a case of double standards?

midori matcha@lemmy.world · 17 days ago

GitHub is owned by Microsoft, so don’t worry, it’s going to get worse

Lv_InSaNe_vL@lemmy.world · edit-2 17 days ago

I honestly don’t really see the problem here. This seems to mostly be targeting scrapers.

For unauthenticated users you are limited to public data only and 60 requests per hour, or 30k if you’re using Git LFS. And for authenticated users it’s 60k/hr.

What could you possibly be doing besides scraping that would hit those limits?
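For anyone who wants to check how close they are to a limit: GitHub's REST API reports the current budget in `x-ratelimit-remaining` and `x-ratelimit-reset` response headers. A minimal sketch, assuming you already have the response headers as a dict; the function name `seconds_until_reset` is made up for illustration, and whether the same headers appear on raw file downloads or web pages is not something this thread confirms.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before retrying, or 0 if requests remain.

    `headers` is a plain dict of response headers. Header names are matched
    case-insensitively. Missing headers are treated as "not limited".
    """
    now = time.time() if now is None else now
    lowered = {key.lower(): value for key, value in headers.items()}

    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    reset_at = int(lowered.get("x-ratelimit-reset", "0"))  # Unix epoch seconds

    if remaining > 0:
        return 0
    return max(0, reset_at - int(now))
```

A client could sleep for the returned number of seconds instead of retrying blindly once the hourly budget is used up.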

chaospatterns@lemmy.world · edit-2 17 days ago

You might be behind a shared IP with NAT or CG-NAT that shares that limit with others, or you might be fetching files from raw.githubusercontent.com as part of an update system that doesn’t have access to browser credentials, or Git cloning over https:// to avoid having to unlock your SSH key every time, or cloning a Git repo with submodules that each issue separate requests. An hour is a long time. Imagine you let uBlock Origin update its filter lists, then you git clone something with a few submodules, and so does your coworker, and now you’re all blocked for an entire hour.
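To make the shared-IP scenario concrete, here is a rough simulation. It assumes a simple fixed one-hour window keyed by IP address with the 60 requests per hour figure quoted above; GitHub has not said which algorithm it uses, and the class name, request counts, and IP address are all made up for illustration.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical per-IP limiter: at most `limit` requests per window."""

    def __init__(self, limit=60, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)  # (ip, window index) -> requests seen

    def allow(self, ip, now):
        """Count the request and return True if the IP still has budget."""
        window = int(now // self.window_seconds)
        key = (ip, window)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


limiter = FixedWindowLimiter()
shared_ip = "203.0.113.7"  # documentation-range address standing in for a NAT exit

# You: filter list updates plus a clone with submodules (40 requests, invented).
you = sum(limiter.allow(shared_ip, now=0) for _ in range(40))

# Coworker behind the same NAT, ten minutes later (25 requests, invented).
coworker = sum(limiter.allow(shared_ip, now=600) for _ in range(25))
```

Under these assumptions all 40 of your requests succeed, but only 20 of your coworker's 25 do, and everyone behind that address is refused until the next window starts.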

onlinepersona@programming.dev · 17 days ago

I see the “just create an account” and “just log in” crowd have joined the discussion. Some people will defend a monopolist no matter what. If GitHub introduced ID checks à la Google or required a Microsoft account to log in, they’d just shrug and go “create a Microsoft account then, stop bitching”. They don’t realise they are being boiled and don’t care. Consoomer behaviour.

Anti Commercial-AI license

calcopiritus@lemmy.world · 16 days ago

Or we just realize that GitHub without logging in is a service we are getting for free. And when there’s something free, there’s someone trying to exploit it. Using GitHub while logged in is also free and has none of these limits, while making it much easier for them to block exploiters.

onlinepersona@programming.dev · 16 days ago

I would like to remind you that you are arguing for a monopolist. I’d agree with you if it were for a startup or mid-sized company that had lots of competition and was providing a good product being abused by competitors or users. But GitHub has a quasi-monopoly, is owned by a monopolist that is part of the reason other websites are being bombarded by requests (i.e., they are part of the problem), and you are sitting here arguing that more people should join the monopoly because of an issue they created.

Can you see the flaws in reasoning in your statements?

Anti Commercial-AI license

calcopiritus@lemmy.world · 16 days ago

No. I cannot find the flaws in my reasoning. Because you are not attacking my reasoning, you are saying that I am on the side of the bad people, and the bad people are bad, and you are opposed to the bad people, therefore you are right.

The world is more than black or white. GitHub rate-limiting non-logged-in users makes sense, and is the expected result in the age of web scraping for LLM training.

Yes, the parent company of GitHub also does web scraping for the purpose of training LLMs. I don’t see what that has to do with defending themselves from other scrapers.

onlinepersona@programming.dev · edit-2 16 days ago

Company creates problem. Requires users to change because of created problem. You defend company creating problem.

That’s the logical flaw.

If you see no flaws in defending a monopolist, well, you cannot be helped then.

Anti Commercial-AI license

calcopiritus@lemmy.world · 16 days ago

I don’t think Microsoft invented scraping. Or LLM training.

Also, GitHub doesn’t have an issue with Microsoft scraping its data. They can just directly access whatever data they want. And rate-limiting non-logged-in users won’t affect Microsoft’s LLM training at all.

I’m not defending a monopolist because of monopolist actions. First of all, because GitHub doesn’t have any kind of monopoly. There are plenty of git forges. And second of all, how does this make their position on the market stronger? If anything, it makes it weaker.

daniskarma@lemmy.dbzer0.com · 17 days ago

Open source repositories should rely on p2p. Torrenting repos is the way I think.

Not only for this. At any point m$ could take down your repo if they or their investors don’t like it.

I wonder if something like that already exists, and if it could work with git?

Kuinox@lemmy.world · 17 days ago

Torrenting doesn’t deal well with updating files.
And you have another problem: how do you handle bad actors spamming the download?
That’s probably why GitHub does that.

daniskarma@lemmy.dbzer0.com · edit-2 17 days ago

That’s true. I didn’t think of that.

IPFS supposedly works fine with updating shares. But I don’t want to get closer to that project as they have fallen into cryptoscam territory.

I’m currently reading about “radicle”, let’s see what they propose.

I don’t get the bad actors spamming the download. Like downloading too much? Torrent leechers?

EDIT: Just finished my search about radicle. They of course have ties to a cryptoscam. Obviously… ;_; Why does this keep happening?

Jakeroxs@sh.itjust.works · edit-2 17 days ago

There’s literally nothing about crypto in radicle from my reading. Cryptography and cryptocurrency are not synonymous.

Ah because they also have a different project for a crypto payment platform for funding open source development.

Edit again: it seems pretty nifty actually, why do you think it’s a scam? Just because crypto?

John Richard@lemmy.world · 18 days ago

Crazy how many people think this is okay, yet left Reddit because of their API shenanigans. GitHub is already halfway to requiring signing in to view anything, like Twitter (X).

plz1@lemmy.world · 18 days ago

They make you sign in to use search, at least for code.

goferking (he/him)@lemmy.sdf.org · 17 days ago

Which I hate so much anytime I want to quickly look for something

Updated rate limits for unauthenticated requests - GitHub Changelog