https://bbs.archlinux.org/viewtopic.php … 3#p2187593 in the grr thread made me think about hosters.
When SourceForge was a much-used solution for source code hosting, no one was worried about losing stuff because SF had a mirror system set up.
The vast majority of those mirrors were and are publicly accessible.
Arch Linux and other distros use the same method for their binary packages.
A typical mirror setup includes dozens of mirrors, but mirror systems with hundreds and even thousands also exist.
Git and SVN, however, don't use a mirror setup as far as I know.
I know of some projects that use two publicly available hosters to avoid losing stuff, like SF + GitHub or GitHub + Gitea, and other variations.
That's, however, only one publicly available alternative instead of many.
What will happen if several big hosters like github.com, gitlab.com, kernel.org, freedesktop.org, sourceforge.net all lose all data at the same time?
Are there VCSes that can ensure the public availability of source code?
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
Partial repost of my reply over there:
[...] What I meant was a decentralized and distributed git that has the robustness and comfort of a P2P network, e.g. you don't clone from a single source, but you specify a unique identifier and, say, a checksum, and the VCS clones from and pushes into the P2P network. Why do I want this? One of my private web services was suspended by the hosting service because they got an email from Nintendo claiming that a Yuzu repository was being hosted on my service. That wasn't even the case, but the hosting service reacted out of fear.
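Something like this, as a minimal sketch (everything here - the peer URLs, the fetch_verified helper - is made up, purely to illustrate that the content's identity is the checksum, not the host it came from):

    import hashlib
    import urllib.request

    # Hypothetical peer gateways - in a real P2P VCS these would be
    # discovered via the network, not hard-coded.
    PEERS = [
        "https://peer-a.example.org",
        "https://peer-b.example.org",
    ]

    def fetch_verified(content_id: str, expected_sha256: str) -> bytes:
        """Try each peer until one returns data matching the checksum."""
        for peer in PEERS:
            try:
                with urllib.request.urlopen(f"{peer}/{content_id}") as resp:
                    data = resp.read()
            except OSError:
                continue  # peer unreachable or refused, try the next one
            if hashlib.sha256(data).hexdigest() == expected_sha256:
                return data  # content verified, so the source doesn't matter
        raise RuntimeError("no reachable peer served a matching copy")

The point being: if the identifier/checksum is what you trust, any node in the network can serve the clone, and no single hoster can make the repository disappear or quietly substitute something else.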
What will happen if several big hosters like github.com, gitlab.com, kernel.org, freedesktop.org, sourceforge.net all lose all data at the same time?
Other than the world instantly becoming a better place with good claiming victory over evil, people dancing in the streets, puppies playing, birds singing... other than that, not much.
Are there VCSes that can ensure the public availability of source code?
No. None. Period. Not even theoretically possible. This is simply not even remotely in scope for a VCS. It sounds, however, like you are asking about services that provide a public hub server running one or more VCSs. In other words, you are asking about services in the category of github, not git. So this is not a VCS question. This is a question about public servers that allow users to share content.
Even with that, can any of them "ensure" it? Still no, not completely. But are there some that have better measures to avoid the loss of content? Certainly. But which losses (i.e., which threats to availability) are you most concerned about? I see at least two *very* different flavors of loss: 1) hardware error, or attack on an actual server, and 2) specific content being removed either for 2A) legitimate legal reasons, or 2B) the threat of litigation from a large corporation.
For protecting from type #1 threats, the service simply maintaining their own internal backups would be sufficient. On their end, this would generally be done with load-balancing between multiple physical machines duplicating the same content. I'm confident that most major VCS hubs already do this. Do you get any benefit (for this #1 goal) by being able to individually address each node? No, as long as the load-balancer is working, you just request the uri you want, and let it direct the request to whichever provider node is best.
Now for #2, the distinction between 2A+2B is relevant for how each of us may feel about trying to circumvent it, as well as for whether such discussions are allowed on these forums - but at a technical level, 2A and 2B are identical from the end-user perspective. There is no way at all to prevent "your" content from being removed from a server owned by a third party. The "hub" owner could remove your content if it is illegal, they could remove your content even if it was legal but they were worried about litigation, and they could remove your content because you cut off one of their technicians in traffic and they don't like you now. They can remove your content for any reason or for no reason at all. The *only* security against that is to not rely on a "hub" run by someone else, but instead keep your content on your own server; whether it's classified documents, porn, or VCS repositories is all the same.
EDIT: here's a thought experiment to emphasize my main point. If github vanished today, no meaningful content would actually be lost. Any actively maintained or even used repos are "mirrored" on countless users' computers around the world. The solution then would be to synchronize all of these disparate copies of a given git repo, and establish a new central hub to either A) store a reference copy of the repo at which point you've recreated github, or B) not store the content, but instead just store some form of list or directory pointing to all the distinct copies on user computers (i.e., now a more traditional p2p or torrent-like network).
But are either of these options superior for preventing future removal of content? If BigEvilCompany threatens the result of A, then that server may take down an offending repo. If BigEvilCompany threatens the result of B, then that server may take down the directory or list of providers of that repo content. The end result is the same whether you have a github-like service, or a p2p-like service.
Or I guess this is a very long-winded way of saying that this sounds like an X-Y problem. Mirroring and / or a p2p approach is not really different from the current status quo and it's not the solution to the stated problem(s).
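For completeness: the mechanical part of that synchronization needs no new tooling, plain git already does it. A rough sketch, with purely hypothetical locations standing in for the surviving copies:

    import subprocess

    # Hypothetical locations of surviving clones - local backups, mirrors, etc.
    SURVIVING_COPIES = [
        "/mnt/backup/project.git",
        "https://mirror.example.edu/project.git",
    ]

    def gather(workdir: str = "recovered-project") -> None:
        """Fetch everything every surviving copy has into one repository."""
        subprocess.run(["git", "init", workdir], check=True)
        for i, url in enumerate(SURVIVING_COPIES):
            remote = f"copy{i}"
            subprocess.run(["git", "-C", workdir, "remote", "add", remote, url], check=True)
            subprocess.run(["git", "-C", workdir, "fetch", "--tags", remote], check=True)
        # The union of all fetched branches and tags is the recovered history;
        # deciding which branch is "authoritative" is a social problem, not a git one.

So the recovery itself is trivial; the open questions are social (who hosts the result, who decides which branch counts) - which is exactly the X-Y point above.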
Last edited by Trilby (2024-08-03 13:20:47)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
It's clear now that I phrased the question wrong and VCSes don't offer what is needed.
Rephrasing the question.
First, I am not concerned about 'my content' but about content that belongs to humanity (all of us) and is, in my opinion, part of the history of the human race.
How do we preserve publicly available source code, like that of the Linux kernel, for the future?
I am aware of two systems that can help with that.
One is the network formed by universities and other volunteers that's best known for mirroring source code & binaries for open source projects.
The other is archive.org.
Both have flaws.
Could a system be designed to store public source code that allows access to everyone and can repair damaged stuff by combining multiple sources?
Disliking systemd intensely, but not satisfied with alternatives so focusing on taming systemd.
clean chroot building not flexible enough ?
Try clean chroot manager by graysky
https://en.wikipedia.org/wiki/InterPlan … ile_System
How do we preserve publicly available source code, like that of the Linux kernel, for the future?
From the available data, stone carvings have the best long-term survival chances. It's however tedious, and one typo and you've got to start over on a new mountain…
What will happen if several big hosters like github.com, gitlab.com, kernel.org, freedesktop.org, sourceforge.net all lose all data at the same time?
If github vanished today, no meaningful content would actually be lost. Any actively maintained or even used repos are "mirrored" on countless users' computers around the world.
Unless we're talking about WWIII scenarios, the data will be reconstructed - though there's little authority in case of deviating branches.
I am not concerned about 'my content' but about content that belongs to humanity
That's a semantic difference only. Until we do away with capitalism altogether and live up to Gene Roddenberry's dreams, content may be shared with all humanity, but it is written / owned by *someone*. The servers such data is stored on are also owned by *someone*. I phrased this in a "you" and "them" scenario for simplicity, but the exact same applies to any condition in which the server storing the content belongs to a different someone than the content itself does.
though there's little authority in case of deviating branches.
What do you mean? If we accept that there was an original author / owner of the content, their mirror, or whichever one they endorse as closest to what they lost is the authoritative branch. If we view the content as Lone_Wolf would as belonging to all humanity, then an authoritative branch is nonsensical.
Could a system be designed to store public source code that allows access to everyone and can repair damaged stuff by combining multiple sources?
Who would build it? Who would own it? Again, until we achieve Roddenberry's dream, *someone* would own it. Even if they super-promise-to-never-be-broken-even-on-double-dog-dare to maintain it for the public good, technically they'd still own it. And then we're in the same place as any of the current "hubs".
As a bit of an aside, I somewhat facetiously refer to Roddenberry's dream society - but I do believe that the system you are seeking is implied in his vision of society, and further, his vision of society is a prerequisite to really achieving your goal. Absent that, we get the university mirrors and archive.org which you referred to.
Also on ownership: the kernel was given as an example, but that definitely doesn't belong to all humanity. It has a clear copyright with restrictions enforced under the GPL. There could be more of a discussion about content that is actually in the public domain, and here we have a parallel to art, music, and literature: public domain content that is *good* doesn't need much of a planned strategy to maintain it: it maintains itself as countless people keep and pass around copies, covers, variations, and usages of the content. Public domain content only really vanishes when no one has use for it (including even just aesthetic enjoyment): and even then some is maintained in libraries, museums, or history books.
Software isn't really any different. Any algorithms that are simple enough to not be protected under copyright are in text books, implemented in projects around the world, and discussed in academic circles. They're not going away as long as they have any use at all. More extensive source code that may be eligible for copyright but has been placed in the public domain would be similar to the art / literature above. But when restricted by a license, the copyright owner maintains ownership, and it is on them to ensure the content can be preserved.
Last edited by Trilby (2024-08-03 15:34:46)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
If we accept that there was an original author / owner of the content, their mirror, or whichever one they endorse as closest to what they lost is the authoritative branch.
And how do you prove that you were @weuirzzuew112 on github when github is completely gone?
If you have signed stuff w/ a private key, users who already had your public key can trust you; everyone else, or everyone absent that precondition, still has to rely on some makeshift WoT.
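For the record, that signing precondition is just stock git + GPG. A minimal sketch, assuming a signing key is already configured (tag name and message are arbitrary):

    import subprocess

    def sign_release(tag: str = "v1.0") -> None:
        # Creates a GPG-signed tag; requires user.signingkey to be configured.
        subprocess.run(["git", "tag", "-s", tag, "-m", f"signed release {tag}"], check=True)

    def verify_release(tag: str = "v1.0") -> bool:
        # Verification only proves the tag was made by a key the verifier already
        # trusts - it does not answer who @weuirzzuew112 was once github is gone.
        return subprocess.run(["git", "verify-tag", tag]).returncode == 0

Which is exactly the limitation above: without prior key distribution, you're back to some form of WoT.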
Yes, but that's also not unique to software source code. I have a book here by Tom Clancy. His name is right there in big bold letters on the cover. His name is also on a copyright notice on the first pages. But if I ever meet the Tom Clancy, how could he prove he was actually the same Tom Clancy who actually wrote that book? (Hint: it's a trick question, as most of his books for the last 20 years were ghost-written by other authors!)
Given that I have a copy of his book, I could also just claim I wrote it. I'm holding it, it's my book, I can claim I wrote it. Clancy didn't sign it with an encryption key - so how can he establish ownership?
Society has been fumbling along doing just this without any pgp signing for millennia. However one may feel about private ownership of intellectual property, we clearly have systems in place for it; so asking how it would be maintained is really a bit silly (if used to suggest it would be impractical to do ... it's fair to ask about implementation details if we were actually building such a system: there's work to be done, but nothing magical required, just the same old stuff we humans have been doing as long as societies have existed).
Last edited by Trilby (2024-08-03 15:43:48)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
But if I ever meet the Tom Clancy, how could he prove he was actually the same Tom Clancy who actually wrote that book
Look around: if you see clouds and harps and angels, he's telling you the truth because he doesn't want to be kicked out of heaven. If you see a lot of lava, you got other problems than Tom Clancy.
But the analog would be that Penguin burns down, all scripts and records with it, and a bunch of people show up, each claiming that their different version of Red October is a true copy.
And conveniently Clancy is dead, so there's no authority (otherwise he'd identify himself by showing his driver's licence and TV appearances) - who do you believe if you've not even ever read the book before?
The problem isn't that you're trying to take credit for his work (I see why you didn't go with Dan Brown here…), but that you'd try to claim "I am Tom Clancy" or at least "My version of Red October is the real one, Jack Ryan always had a parrot, you just forgot about that".
Fair point. But still, inferring original source content from a range of diverging copies is a thoroughly practiced art / science in literary analysis.
Certainly readers of the KJV Bible think their version is the one true version of Christian scripture regardless of what any analysis shows; and quite ironically, readers of the New King James Version also somehow maintain that theirs is the original and true authoritative source ... despite being a "newer" version of a version that itself was only commissioned in the 1600s. But setting aside the categories of people that have deliberately shut off their rational mind, evidence can be assessed to make reasonably good inferences about the original forms of widely copied material.
EDIT: aside: I originally wrote my novelist example referring to Dean Koontz (who is still alive) as I actually have a lot of his books. I've never actually read any Clancy books. I like the movies that come from Clancy's stories, but his books never appealed to me. And while I've had suspicions that Koontz must have ghost writers given how ridiculously prolific his writing is, I've never found any actual basis for that suspicion. In contrast, Koontz has published under other names too: he's written *more* than is published under his name. I don't think he sleeps.
Last edited by Trilby (2024-08-03 16:37:27)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
The level of resistance in this thread is almost comical. The idea was to a) minimize the impact of a single (bad-faith) actor like Github and b) reduce the pressure on small hosting services that can easily be crushed by single (bad-faith) actors like corporations by decentralizing an already distributed VCS. This is, yeah, fuck us, indeed not exactly only a VCS problem, because distributing signed tarballs via torrent is already an option, but that only shows how semantics can get in the way. Since it's just semantics, I'm not sure why it seems to piss you off so much, that somebody is asking for a VCS that does something out of scope, because, fine, we need a better name, how about SDS (source distribution system), maybe it's even completely (D)VCS agnostic.
Also, getting stuck on intellectually low hanging fruit like the philosophy behind the human perception of property and ownership isn't going to solve a single real world problem. Fixation on semantic perfection isn't either. Again, the proposed solution to the vulnerability of big single hosting services was "host your own damn code" and it didn't work. The "distributed" aspect of modern VCS is already a massive help, because we can play whack-a-mole with software pretty well and we could do this even without a VCS by pushing tarballs around (or even tarballs that contain repositories! *gasp*), but getting actual work done in between would be nice.
The level of resistance in this thread is almost comical.
Are you referring to me? I'm seeking clarification on the actual goal which could be described as "resisting" vague wishful thinking; but the latter doesn't get us anywhere.
Since it's just semantics, I'm not sure why it seems to piss you off so much
I'm not "pissed off". You might be, but don't project that on others. Perhaps if you're a religious type I may have hit a nerve using the bible as an example. But note that I take no issue with religion in general: but quite plainly, if someone thinks a book commissioned in the 1600s predated texts that were quoted 1200 years earlier, then that person is delusional. I stand by that assessment.
At least I wasn't pissed. I find your post quite unpleasant. If you disagree with something, elaborate. But disregarding other points of view simply because you find them "comical" or "pissed off" or "low hanging fruit" is not appropriate.
If we want something new, it is important to be clear about WHAT it is we want. Semantic games of moving goal posts make that impossible to discuss. So call it whatever you want, the name doesn't matter, but the requirements / goals do. If they are not established, there is no way to go forward. Taken to the extreme, an absence of clarity in the definition of a problem results in nothing but yelling "I WANT I WANT I WANT". We must be able to communicate what exactly we want if we have any hope in getting it.
But I also challenge the use of VCSs in this discussion for the very reason that it seems what is wanted for VCSs already exists. Just not *specifically* for VCSs, but there's no need for a specific application to VCSs. Decades ago I worked in a pet store and a customer sought help finding a gram scale and was upset that we didn't have any. I was puzzled - we were right next to other stores with kitchen sections that would certainly carry gram scales and I suggested he check there. He responded "But it's for my hamster!" as if one gram of hamster was somehow totally different from one gram of flour or sugar. No one needs a hamster-specific gram scale. It's the cognitive inflexibility of thinking there must be a gram scale specifically made for weighing hamsters that prevented this customer from being able to actually achieve their goal of just weighing their hamster.
stuck on intellectually low hanging fruit like the philosophy behind the human perception of property
How is that low hanging fruit? Please enlighten me with your simple solution to all of human economic policy. I'll wait.
... isn't going to solve a single real world problem.
Removing roadblocks that prevent a solution from working is necessary for solving problems. Wishful thinking completely divorced from reality, in contrast, doesn't get us very far.
... "host your own damn code" and it didn't work.
Oh? In what way does it not work? Again, please define the actual problem / goal - otherwise nothing can be achieved.
Last edited by Trilby (2024-08-03 17:38:52)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
At least I wasn't pissed. I find your post quite unpleasant. If you disagree with something, elaborate. But disregarding other points of view simply because you find them "comical" or "pissed off" or "low hanging fruit" is not appropriate.
Fair. I found your posts unpleasant as well. I perceived your whole line of response as purposefully antagonistic and to me it looked like an attempt to derail this thread, because you were unsatisfied with the persistence of the topic. I'll read everything again, but from my perspective, it felt like you were pretty wound up about the topic. I read this from your initial response to my post in the other thread and I even agree with those statements, but your vehemence made it look like you were so opposed to even discussing the topic that you would jump at every chance to make it stop. If you say that wasn't the case, then I must have misread.
Regarding the bible and after re-reading the whole thread, I'm not even sure if I didn't write the damn thing myself. In that case I apologize.
But I also challenge the use of VCSs in this discussion for the very reason that it seems what is wanted for VCSs already exists. Just not *specifically* for VCSs, but there's no need for a specific application to VCSs.
Okay, this has bugged me since our initial conversation in the other thread, but I haven't had the time to put into research so far. Do you know of a tool that allows a) asynchronous collaborative code writing and b) version controlled source code distribution, that does not rely on servers hosting it? I know you assume an X-Y problem, but
Awebb wrote: stuck on intellectually low hanging fruit like the philosophy behind the human perception of property
How is that low hanging fruit? Please enlighten me with your simple solution to all of human economic policy. I'll wait.
Ah, no! Bringing up the notion that anything belongs to mankind really hung the fruit low for you to pick and throw back at the topic, operating under the original assumption.
Oh? In what way does it not work? Again, please define the actual problem / goal - otherwise nothing can be achieved.
This is actually not necessary, because you've understood the question from the very beginning.
here's a thought experiment to emphasize my main point. If github vanished today, no meaningful content would actually be lost. Any actively maintained or even used repos are "mirrored" on countless users' computers around the world. The solution then would be to synchronize all of these disparate copies of a given git repo, and establish a new central hub to either A) store a reference copy of the repo at which point you've recreated github, or B) not store the content, but instead just store some form of list or directory pointing to all the distinct copies on user computers (i.e., now a more traditional p2p or torrent-like network).
That was you paraphrasing the scenario perfectly. This is why I assumed you've got the gist and were just trying to make it stop. Again, I probably have misread the general tone of your post. I'm genuinely interested in the topic.
But are either of these options superior for preventing future removal of content? If BigEvilCompany threatens the result of A, then that server may take down an offending repo. If BigEvilCompany threatens the result of B, then that server may take down the directory or list of providers of that repo content. The end result is the same whether you have a github-like service, or a p2p-like service.
Or I guess this is a very long-winded way of saying that this sounds like an X-Y problem. Mirroring and / or a p2p approach is not really different from the current status quo and it's not the solution to the stated problem(s).
This is interesting, relevant and maybe true, maybe not. Had this been my point of entry into the rest of the discussion, this would have been a very useful thing to talk about. Instead I return mere hours later to some sort of Samuel Beckett play, where the three of you discuss authorship atomicity.
I'd also rather not discuss authorship issues in this conversation as I didn't believe they'd be relevant. But my initial comment was rebutted with a distinction between Lone_Wolf's own intellectual property, which was reportedly not the concern here, and communal property, which is the concern. So it seems right on topic to compare / contrast the relevance of communal vs. private property: and my thesis is that there isn't a substantial difference for the present topic - there may be different logical paths to the conclusion for private vs communal property, but the conclusion (in my view) is the same.
Do you know of a tool that allows a) asynchronous collaborative code writing and b) version controlled source code distribution, that does not rely on servers hosting it? I know you assume an X-Y problem
Yes. And correct. But here is the point that I seek clarification on. I clearly do not understand your criteria (your "a" and "b") because as I read them, every single DVCS system I've used seems to meet those criteria.
Though I suspect the point of "not relying on servers hosting it" may be where I'm stumbling: by servers do you mean shared public resources that I've labeled as the "hubs", as github is to git? I don't think you do - because I know you understand these topics better than that. But other than that I'm just not sure what you could mean. I can host my fossil repos on the laptop that I'm typing this message on while sitting in my living room. Is my laptop a "server" for these purposes? (It is, but is this all your second criterion requires?)
On one point you certainly give me too much credit: I really don't understand the problem. I'm trying to. But I certainly don't. For example here:
This is why I assumed you've got the gist ...
The excerpt of mine you quoted for that was me laying out a possible scenario in which everything would work out fine and restore the status quo. If the current status quo is indistinguishable from everything working out just fine, then in what way is there a problem to be fixed? (Hence, I do not currently understand the problem.)
I am trying to understand the issue via critical discussion. If unwelcome I would leave - but my intent is just that, not to derail or anything else that it might have appeared to be.
Last edited by Trilby (2024-08-03 19:55:20)
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
VCS (distributed or not) and decentralized data storage (for whatever reason, leaving the WWIII scenario aside where the EMP is the lesser problem of thermonuclear armageddon) are two completely disjunct problems.
We can obviously have DVCS on centralized hubs and we can have all the copyright infringement (for rather obvious reasons) in a decentralized storage.
The problem w/ the latter is that it's very much demand-driven supply, but the technology to host git (or svn or cvs - or just more porn) on decentralized, DHT-driven storage exists.
https://docs.ipfs.tech/how-to/host-git-repo/
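Very roughly, and from memory of that guide (check it for the exact steps), the flow is: mirror-clone the repo, make it servable over git's dumb HTTP protocol, add it to IPFS. A sketch, assuming git and a running ipfs daemon are available (the -Q flag is assumed to print only the final root CID):

    import subprocess

    def publish_to_ipfs(repo_url: str) -> str:
        """Publish a bare mirror of repo_url to IPFS and return its root CID."""
        subprocess.run(["git", "clone", "--mirror", repo_url, "repo.git"], check=True)
        # Generate info/refs etc. so the bare repo is servable via the "dumb" HTTP protocol.
        subprocess.run(["git", "update-server-info"], cwd="repo.git", check=True)
        # Add the whole bare repo to IPFS and capture the resulting content identifier.
        out = subprocess.run(["ipfs", "add", "-r", "-Q", "repo.git"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

Per the guide, anyone can then clone it back through any gateway, along the lines of git clone https://<gateway>/ipfs/<CID> repo.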
The OP's artificial limitation to "source code" creates a fictional problem - you could already share tarballs on eDonkey (I wanna say Napster, but idk whether that was originally or ever the case).
The metaphysical question whether that actually makes sense or is reasonably necessary is off-topic.
The premise of "several big hosters … all lose all data at the same time", probably implying "cold", is obviously hypothetical (thermonuclear warfare aside…), and Trilby pointed out why even then this would not lead to a loss of actively used data, because it's copied plentysome. So, no: I'd not lose much sleep over that being a necessary precaution to avoid immediate data loss.
The more realistic scenario is actually that the "free as in beer" reality at some point might stop, not over night but gradually - and then what?
From the available data, stone carvings have the best long-term survival chances
That's a common misconception.
Interesting read on this general subject: https://askleo.com/stone-tablets/
And thanks to the OP for raising this important question. Given the imminent collapse of Modernity[1] we really do need to secure our knowledge base for future generations. If there are any.
[1] https://dothemath.ucsd.edu/2024/04/dist … tegration/ EDIT: trigger warning: ecocide.
Last edited by Head_on_a_Stick (2024-08-03 21:56:58)
Para todos todo, para nosotros nada
I think the author of that article is making a case against the absolute notion that "stone is forever", but I don't see him rejecting that scratching stuff in stone has the best long-term static* survival chance - except with "certainly not a storage mechanism that compares well with the majority of today's technologies", which in terms of "duration" I call BS on (of course stone carvings are super impractical, like 1 kB/week writing speed…). While storage systems that virtually last forever are certainly being researched, nothing else has actually preserved data this long, and for the vast majority of media we use we absolutely know they don't have a remote chance to ever get there, because of bio-degradation, corrosion, UV sensitivity and general flammability. Let alone the reliance on reader technology.
So if you task me, today, with what's available to me/most humans, to write down some information that somebody needs to read in 5000 years, not relying on future generations to relay it, I'll go for some stone (or clay if I can't manage to carve the stone - and some kinds of stone will certainly fare better than others, the latter being easier to carve).
It might get broken or shattered and even far more likely lost, but certainly stands the biggest chance to make it there and still be readable.
* he's clearly making a case for backups and medium transfer which, yes. Of course.