Talk:Data degradation

Source: Wikipedia, the free encyclopedia.

2005 EEPROM discussion

The article mentions bit rot with respect to EPROMs; what about magnetic storage such as disks and tapes? More info on this discredited cosmic ray theory, please.

Yeah, seriously. If the theory has been discredited, give a reference. Otherwise the 'discredited' assertion should be removed.

    • Added reference to Intel experiment that showed how background radioactivity of ceramic chip packaging causes bit flips, not cosmic rays. I think IBM has research on cosmic rays affecting high altitude (satellite and aircraft) electronics. — Preceding unsigned comment added by Rolofft (talkcontribs) 00:58, 28 December 2005‎

Article is very misleading, as most "bit rot" is really "software rot"

This article is pretty misleading, because while "bit rot" is a theoretical phenomenon that can (rarely) actually happen, when the term is generally used, it instead refers to software rot. This sort of reference is somewhat tongue-in-cheek, but it is, in my experience, the vast majority of references to "bit rot." As such, this article really needs to be reorganized to make it clear that a casual reference to "bit rot" outside of certain specialized contexts almost certainly refers to "software rot," not literal "bit rot" where the bits are unexpectedly changing.

I'm not sure how to tackle this. One approach would be to try to merge the two articles, but I think the emphasis of the two terms is different. On the other hand, I don't think I've ever heard anyone use the term "software rot" outside of a discussion of bit rot nomenclature; is there any actual evidence of its use? (Its talk page makes one even more wary.)

Another approach would be to try to explain the distinction between the literal meaning and the actual meaning higher up in the article, and immediately go to a detailed explanation of "software rot." Essentially promote the "Problems with software" section to the lede and flesh it out with more relevant detail, so the bulk of the article is not misleading literalism.

I'm sure there are other approaches. Would anyone care to recommend them? As it stands this article seems extremely problematic, so I will mark it Disputed and reference this talk topic. jhawkinson (talk) 07:08, 31 January 2014 (UTC)[reply]

  • Hello there! Sorry, but seems like you've never experienced such bit rotting issues, what makes them unclear to you; please allow me to explain.
Let's start with software rot, and a quote from that article: "This is not a physical phenomenon: the software does not actually decay, but rather suffers from a lack of being responsive and updated with respect to the changing environment in which it resides." In other words, software rot is a virtual decay of software (which is also a virtual thing), caused by lack of human involvement and bad maintenance; things and requirements are changing, but software can't follow that without humans and their elbow grease. :)
On the other side, bit rot is a physical decay of various physical media (due to aging, environmental conditions, loss of media properties etc.), which is used for carrying data/bits; the physical media erosion causes data loss. The most obvious example is paper media (used for punched cards and punched tapes); I'm sure you've seen at least a few old books having their paper eroded and crumbling due to its age. The same applies to other media, such as magnetic tapes, optical media etc. Have you ever seen an audio tape (or a VHS tape) no longer playable due to its age? If you spend some time Googling, you'll find manufacturers' statements regarding the expected lifetimes of optical media, for example; that resulted in media such as M-DISC etc.
Hope it all makes sense. What I'd suggest we do, is to delete the Bit rot § Problems with software section; that section is pretty much out of context in this article, and induces quite a lot of confusion. Software rot should be mentioned only as a link within the Bit rot § See also section, and nothing more. Merging those two articles wouldn't be a good thing, in my opinion, as they're dealing with two completely different things, so it's better to keep those separated.
Thoughts? Looking forward to discussing it further! — Dsimic (talk | contribs) 21:29, 31 January 2014 (UTC)[reply]

Hi guys, no offense, but for a person who is into computers the unknowledge of the

talk) 22:45, 31 January 2014 (UTC)[reply]

So, it's all a joke – paper never decays and there are no cosmic rays, for example? Please don't get me wrong, but the Jargon File hasn't been pulled out of thin air. Software rot is a kind of joke, but bit rot (or at least its physical background) isn't. — Dsimic (talk | contribs) 01:06, 1 February 2014 (UTC)[reply]
Well, there is a bit rot and bit rot. There is definitely a joke. And the jargon file does say that this joke is 'cum grano salis'. On the other hand, please read my previous post carefully: whether reliable sources call
talk) 02:19, 1 February 2014 (UTC)[reply]
Bit rot (data corruption, soft errors, or however we're calling it)—which was surely much more of a joke back at the time when the jargon file was assembled—became much more frequent as the sizes of memory and storage increased, making the equipment's rates of (undetectable) errors more easily reached even for Joe Averages. Please, have a look at the papers below:
  • Silent Data Corruption: Causes and Mitigation Techniques
  • Eliminating Silent Data Corruption with Oracle Linux
  • "Are Disks the Dominant Contributor for Storage Failures? A Comprehensive Study of Storage Subsystem Failure Characteristics" (PDF). USENIX. Retrieved 2014-01-18.
  • David S. H. Rosenthal (October 1, 2010). "Keeping Bits Safe: How Hard Can It Be?". ACM Queue. Retrieved 2014-01-02.
  • Bitrot and atomic COWs: Inside “next-gen” filesystems
Actually, Keeping Bits Safe: How Hard Can It Be? paper uses "bit rot" (even bits are modeled as radioactive atoms :) in a quite serious context, and an ACM publication should be taken as a reliable source, if you agree. At the same time, these talk discussions might be interesting:
Once again, please don't get me wrong. Anyway, quantum bogodynamics is a hilarious evergreen. :) — Dsimic (talk | contribs) 02:51, 1 February 2014 (UTC)[reply]

Dsimic, you seem very confused about what my point is. And Staszek Lem's edit to this article appears to have ignored the point I raised and made the problem dramatically worse. Please allow the discussion here to reach some sort of consensus before taking action that is the opposite of what has been proposed here.

Your language above is somewhat confusing. For instance, I don't know what you mean by "what makes them unclear to you; please allow me to explain" (is that intended as a question?). My concern is that this article, at present, does not reflect how the term "bit rot" is actually used. You are correct that it has a literal definition, but that is not what is generally meant by the term. This is a problem of usage, and it's probably pretty tough to nail it down without original research, and that is going to be a problem, especially if we do not have consensus. The Jargon file is probably the best we can do.

The physical phenomenon of literal bit rot is mostly uninteresting because it is solved by keeping backups and redundant copies. It is not mysterious or troublesome. It is rarely talked about. The problem of "software rot" is a much more significant and real problem, but the term "software rot" is not widely used. Instead it is generally referred to as "bit rot." This is a tough issue of semantics.

Your proposal to remove the software rot section of this article makes it much worse. It causes a reader who hears the term "bit rot" who comes to this article looking for an explanation to get a diametrically opposite explanation than what they should get. Because, again, despite its literal meaning, when most people say "bit rot," they mean what Wikipedia defines as "software rot."

(I remain quite skeptical that Wikipedia's "software rot" page should exist; it seems to have original research problems. I wonder if its content should be merged into "bit rot" and then redirected here. I suspect that may be the best solution, but it might be too controversial.)

Dsimic, do you have actual experience with real human beings who use the terms "bit rot" and "software rot"? Or reference sources (beyond the jargon file, and certainly beyond Wikipedia) that discuss their usage? The list of papers you offer above is something else; those are some examples of how bit errors occur, but they do not seem to be about the term "bit rot."

There is no question that bit errors happen. The question is whether the term "bit rot" generally refers to such things, and whether a reader of this encyclopedia is well served by the assertion that they do. I think the answer is clearly no. jhawkinson (talk) 04:07, 1 February 2014 (UTC) Corrected note on author of recent edit at 04:12.[reply]

Ok, let's go through the questions one by one. I'm not confused about what your point is; you're questioning whether "bit rot" is actually "software rot", and it isn't. Software rot is the decay of software's abilities, while bit rot is the decay of media/hardware storing the data. The jargon file was written back at the time when triggering the actual bit rot was pretty much a theoretical thing, as you need to read about 10^16 bits from an HDD to get one bit flipped in an undetectable manner.
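To put that figure in perspective, here is a rough back-of-the-envelope sketch. It reads the number above as 10^16 (ten to the sixteenth power) bits per undetected flip; the 4 TB disk size is purely an illustrative assumption and does not come from the discussion or its sources.

```python
# Back-of-the-envelope arithmetic for the "one undetected flip per 10^16 bits read" figure.
# The 4 TB disk capacity below is an illustrative assumption, not a value from the discussion.

BITS_PER_UNDETECTED_ERROR = 10**16  # figure quoted in the comment above
ASSUMED_DISK_BYTES = 4 * 10**12     # hypothetical 4 TB (decimal) drive


def bytes_read_per_error(bits_per_error: int) -> float:
    """Convert the bits-per-error figure into bytes read per expected error."""
    return bits_per_error / 8


def full_disk_reads_per_error(bits_per_error: int, disk_bytes: int) -> float:
    """How many complete reads of one disk add up to one expected error."""
    return bytes_read_per_error(bits_per_error) / disk_bytes


petabytes = bytes_read_per_error(BITS_PER_UNDETECTED_ERROR) / 10**15
reads = full_disk_reads_per_error(BITS_PER_UNDETECTED_ERROR, ASSUMED_DISK_BYTES)
print(f"{petabytes} PB read per expected undetected flip")
print(f"{reads} full reads of the assumed 4 TB disk")
```

Under these assumptions a single home disk rarely reaches that volume, while a large array that is read and scrubbed continuously may reach it far sooner, which is consistent with the point that the effect became more visible as storage sizes grew.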
Also, plain backups and/or redundant copies aren't solutions for the actual physical phenomenon behind the bit rot, as they also aren't immune to it. When combined with more advanced techniques, like data and metadata checksumming, they can serve that purpose; it's all pretty much about detecting such silent data corruptions. However, that's a completely different topic.
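A minimal sketch of the idea in the paragraph above, assuming a SHA-256 checksum recorded while the data was known to be good and a redundant copy kept alongside it. The helper names (`checksum`, `flip_bit`, `pick_good_copy`) are hypothetical and only illustrate the principle; real filesystems that checksum data and metadata do this at the block level with their own formats.

```python
import hashlib
from typing import Iterable, Optional


def checksum(data: bytes) -> str:
    """Return a SHA-256 hex digest used as the stored checksum."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by inverting a single bit."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def pick_good_copy(copies: Iterable[bytes], expected: str) -> Optional[bytes]:
    """Return the first copy whose checksum matches, or None if every copy is bad."""
    for copy in copies:
        if checksum(copy) == expected:
            return copy
    return None


original = b"archived record"
stored_checksum = checksum(original)  # recorded when the data was written

primary = flip_bit(original, 5)       # the primary copy silently rots
backup = original                     # the redundant copy is still intact

print(checksum(primary) == stored_checksum)                       # corruption is detected
print(pick_good_copy([primary, backup], stored_checksum) == original)
```

Without the stored checksum there would be no way to tell which of the two differing copies is the correct one, which is why a plain backup alone only helps once the corruption has been noticed by some other means.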
Bit rot § Problems with software section is what actually doesn't belong to this article, as bit rot is about the decay of media/hardware (or firmware bugs) causing data loss. Regarding your question whether "bit rot" is actually used for that purpose, here are a few examples:
From the references listed above, it should be clear that "bit rot" is used for the data/hardware/firmware corruption, not for software decay. Of course, I'm more than happy to discuss this further. — Dsimic (talk | contribs) 04:48, 1 February 2014 (UTC)[reply]
Dsimic, you write, "you're questioning whether "bit rot" is actually "software rot", and it isn't." You argue this by bare unsupported assertion, and I believe you to be incorrect. Can you justify your claim?
The references you cite are unhelpful and not relevant -- they are examples of the literal use of the word "bit rot." It is undisputed that the term is sometimes used for its literal meaning. The question is whether that is the common use, the general use, and the preferred use. And thus what focus this article should have. The questions I asked previously, which you have not answered, are:
  • Do you have actual experience with real human beings who use the terms "bit rot" and "software rot"?
  • Can you reference sources that discuss the usage of these terms?
I think these are important. I should be clear, I am hard-pressed to come up with references that support my side of this as well, which is why I fear this whole article is really far over the line into Original Research. But I know many people who use the word "bit rot" casually and they almost always use it to discuss the concept that Wikipedia calls "software rot." And this is, because, outside of a few specialized contexts, random bit errors simply are not worth discussing. We have the technology to work around them, and all storage technology does so, in various ways. Thanks. jhawkinson (talk) 05:36, 1 February 2014 (UTC)[reply]
Ok, let's put aside my reasoning for a moment, if you agree; please convince me that "bit rot" is actually "software rot", by providing the same level of quality for the backing references as you're asking me to provide.
Sorry for failing to answer the remaining questions, I'll do that immediately. No, I haven't yet seen the terms "bit rot" and "software rot" coming out of a human's mouth; nowadays people tend to speak about .Net and Java ultra-high-level abstractions, and nobody cares about some... bits? what are those bits? can I put them into a workflow? :) Also, I'm unable to provide reliable references discussing the actual usage of these two terms; can you, please?
When looking at the bigger picture, I'd say that
WP:OR
behind, while also deduplicating quite a lot of content.
How about that? — Dsimic (talk | contribs) 06:24, 1 February 2014 (UTC)[reply]
Dsimic, let me try to give an example of the problem I am trying to solve before turning to your question. In this week's New Yorker, biographer Benjamin Moser writes about his experience at UCLA's Susan Sontag's archives. The archives have copies of Sontag's emails on hard drives, but Moser reports that they are "menaced by bit rot" (he also references other terms unfamiliar to him along with bit rot, such as "forensic software" and "write blockers"). He then goes on to offer the literal definition espoused by this article, which is wrong. He is confused about what is meant by bit rot, and he perpetuates his confusion to his readers.
In the context of Sontag's archive, as with most good digital archives, the problem is not a fear of random bit-errors scrambling the content of the archives. The problem of bit rot is that older software used to access the digital content (in this case emails, presumably in a proprietary format) no longer works well. I don't know if Moser used Wikipedia to try to understand "bit rot" and summarized a definition; it's easy to see how, if he had looked at Wikipedia, he would have come away with a definition that was not applicable in his case (the "literal" definition), and gotten confused.
Of course I have a lot of trouble producing references here (hence "I am hard-pressed to come up with references"), because I suspect there are very few on the question of usage of colloquial computer terms. The best I have right now is the cited Jargon File, which is written a bit subtly. In its lede it notes bit rot is a "hypothetical disease"; and concludes: "The term software rot is almost synonymous. Software rot is the effect, bit rot the notional cause." (As a practical point, I think ESR and the Jargon File authors made an error in splitting "bit rot" and "software rot" into two separate topics; it allows an unfamiliar reader to think they are separate concepts that can happen independently, but is not how the terms are used.)
In my computing life, I regularly deal with systems that have been around for a long time, such as decades. We have a university computing environment (with network storage) that has been around for decades. We have conversations about whether software written and compiled in the 1990s will function properly on current machines without recompilation, as well as much shorter timescales. In this world, people talk about "bit rot" with some frequency -- because there are real problems with software continuing to function as designed over those kinds of periods. And at least here, at MIT in Cambridge, Mass, the term "software rot" is not used, but the term "bit rot" is used, and it always means what Wikipedia calls "software rot."
Your merger proposal is consistent with your prior proposals, but is no better for me, and is in fact worse. I want to ensure that someone who wants to understand what is meant by the term "bit rot" gets a good answer. That a reader who sees those words will understand what the writer meant when they were written. What Wikipedia calls "software rot" is not a form of data corruption. Stuff stops working, but it is not because of corruption, it is because of more subtle effects. Redirecting this article to the data corruption article makes the problem worse, not better. It will mislead readers into thinking "bit rot" is about the problem of bit errors and data corruption, but that is not true.
I don't want to suggest that my personal experience should be the basis of this article. But my experience is consistent with the cited references that discuss usage of the term (e.g. the Jargon File), as well as the prior comments here on the Talk page. E.g. the "Software bit rot" section from 2007–2010, and following. Given that you have no experience with the actual use of the term, it is difficult to take your position credibly.
Do you see what I am getting at? jhawkinson (talk) 14:36, 1 February 2014 (UTC)[reply]
This was a very enjoyable read, thank you for taking some time and explaining in detail the situation on your hands!
Speaking about "bit rot" in general, it's quite hard to maintain WP:NPOV without good and reliable references, and those are hard to find for the term's colloquial use. To me, the only right thing at the moment, in the absence of adequate references and in order to follow Wikipedia's rules, is to present both usages/meanings of "bit rot", with clearer explanations of where they come from. I know (and totally understand) you're against that as you hear the term being used for only one meaning on a daily basis, but killing any of the meanings would be against maintaining neutrality, if you agree. Also, this is contrary to my initial suggestion of deleting one of the meanings, where I clearly was wrong because of no day-to-day exposure to this term, which shows that I'm more than open to learning and changes. :)
Of course, when there are good sources available for linking "bit rot" only with the software decay meaning, I'm all in for the other meaning to be deleted. In the meantime, there are references using "bit rot" for the other meaning, and we simply can't trump those.
Thoughts? — Dsimic (talk | contribs) 03:30, 2 February 2014 (UTC)[reply]
Dsimic, thanks, yes, it looks like you and I are making progress. Unfortunately I'm not sure where that leaves the article :). As I said in my initial post in this section of the talk page, I'm not really sure how to tackle this. It's why I didn't start out making a Bold edit but instead started a discussion.
Now, maybe this is silly of me, but we have one usage reference that suggests literal "bit rot" is a notional idea that doesn't really happen (the Jargon file); do we have any usage references that claim that it does happen? I guess when I say usage reference, I mean something akin to a secondary source (it'd be great to have some dictionary sources; but nothing in the American Heritage, OED, or Merriam Webster. I also came up dry with Lexis-Nexis. There was this seemingly-relevant Economist article, Digital data: Bit rot, but it turns out not to discuss the term, though it discusses the problem). I realize we have various examples where references and papers use the term "bit rot," but those don't strike me as authentic references, they're just examples. On the other hand, we cannot be prescriptivist, we have to be descriptivist. If 95% of the usage really is the literal one, even if that is somehow "wrong," then we have to hew to it. Of course we have no good data on what fraction of usage is "literal" bit rot and what fraction is "software rot."
So, yes, I think the path forward is to try to equalize the volume of the article that deals with both cases, and to make it less lopsided. I am troubled that references 2-6 seem to all be about bit errors in general. I feel like someone started with the article in early stages and went on to find cases where people talked about bit errors and cosmic rays, and added them into the article, thus making the literal definition seem more credible, which led to more references added to that section and more emphasis on that section, in a runaway effect. (I guess I could look at the history to determine how much this is the case; there are only 225 edits in the life of the article, so that is manageable.) jhawkinson (talk) 16:50, 2 February 2014 (UTC)[reply]
Exactly, we're now pretty much on the same page, but we still need more references in order to even out these two meanings, for the beginning. Just as you described, it's tough to find good sources, as the majority of the available stuff seems to be of "infotainment" nature.
Regarding the bit rot as a literal decay of storage media, I'd say we already have a few good references listed above (the ACM paper, for example). For the software degradation meaning, here are a few good papers, though they tend to use terms such as "code decay", "software decay" or "software rot", and they're oriented more toward the "active decay", so to speak:
Though, these papers aren't using the term "bit rot", so their usability as references is somewhat questionable. It's the same thing, but again, anyone could easily say the opposite. In the end, finding the ratio between the usage levels of these two meanings is, I'd say, an impossible mission. :) — Dsimic (talk | contribs) 02:23, 3 February 2014 (UTC)[reply]
My point exactly. There was no proof presented that "code decay" is called "bit rot". You have a nice chat, and all of it is good for a software blog, but in Wikipedia we speak of references to reliable sources to prove our point. I see a nice exercise of wit on both sides, but in the end all that matters is: show me the source. Please add a reference to the questioned phrase into the article and be done with the socializing.
talk) 03:04, 4 February 2014 (UTC)[reply]
Unfortunately, there are very few references clearly linking the term "bit rot" with software/code decay. Once again I've spent some time Googling, and the best I came up with were the Jargon File, an entry from a printed dictionary, a slide show citing Wikipedia, and a few usenet posts back from 1982 and 1983. Out of those (besides the Jargon File), this network dictionary entry seems to be of the best quality. I'll keep trying to find some more references for the second meaning, but chances for that—unfortunately—seem to be quite slim. — Dsimic (talk | contribs) 04:43, 4 February 2014 (UTC)[reply]
If it is so, then the following must be done:
  • (a) Figure out the real, technical, most common term for data/media decay and move the article under this title, because "bitrot" is a colloquialism, and we don't have the article 'penis' under the title dick (anatomy) or something.
  • (b) Have a small section about bitrot joke.
  • (c) If in the future one finds good references, we can add a dabnote for the degradation of unmaintained software see Software rot, and in "Software rot" page add a referenced sentence ("sometimes called bit rot"[ref]).
  • (d) There is no reason mix and match "data storage decay" and software rot in one page; cross-referencing is enough.
  • (e) Optionally a disambig page may be created (if evidence suffices): for a joke and for colloquialisms for the two types of decay.
  • (f) Finally, don't forget a related subject, digital obsolescence/digital preservation/etc., to put these clearly related things into a common context.
talk) 02:37, 5 February 2014 (UTC)[reply]
Well, that's a reasonable plan due to lack of references, but let's also hear comments from jhawkinson before doing anything. — Dsimic (talk | contribs) 03:12, 5 February 2014 (UTC)[reply]

Please, whatever you are chatting about here, don't remove referenced content and don't restore the unreferenced opinion that software rot is called bit rot. Once you find a reliable source to this end, you are welcome to restore the section. But you are not welcome to restore unreferenced opinions.

talk) 03:02, 4 February 2014 (UTC)[reply]

Staszek Lem, your removal of the "software rot" section in the lede is uncalled for. We've had an extensive discussion here about why it is important, and I think we have consensus that it belongs. (If you disagree, please contribute to the discussion). It is not an "unreferenced opinion." In fact, the one source we have in the article at all on usage favors this (the Jargon file). There are no references about usage that support the idea of bit rot being used to refer to actual physical bit errors. Please undo your edit. jhawkinson (talk) 03:08, 4 February 2014 (UTC)[reply]
Please re-read the lede carefully. The term "software rot" is still there. If you have a consensus, then where are the references in the article? The Jargon File does not favor it. It is your reading. The lede now says what the Jargon File actually says. (Heck, you have it in this talk page as well). Yes, the Jargon File says it is 'nearly synonymous'. Yes, just like cannabis and marijuana. Some say they are one and the same, others vehemently disagree (in the
talk) 03:17, 4 February 2014 (UTC)[reply]

2016

If everyone doesn't agree about the term "bitrot", then the remark in the intro should be completely removed.

I've always seen the terms "bit rot" or "bitrot" to mean corrupted data bits at the storage level, and NEVER heard it used to mean software rot.

In this recent article, a ZFS developer uses the term "bit rot" during file system talk.

This article uses the term "bitrot" during file system talk.

I've seen numerous people use the term "bit rot" when talking about PAR2 files and recovery volumes in RAR (file format), though I don't have links (only because I never thought I would need to save such links).

SbmeirowTalk • 19:50, 26 June 2016 (UTC)[reply]

There seems to have been a discussion here a couple of years ago about the term bit rot (which, it seems, was the original title of this page) and whether it applied to the degrading of digital data or of software applications over time.
While it seems a thorough and sincere conversation, it seems to have come to a conclusion which today seems incorrect. The term bit rot today is used in common literature to mean the corruption of digital data over time. (just one example, from Vint Cerf, seen as one of "the fathers of the Internet" https://www.theguardian.com/technology/2015/feb/13/what-is-bit-rot-and-is-vint-cerf-right-to-be-worried ). Yet anyone looking at Wikipedia for information on bit rot would come away with the conclusion that it applies to software becoming more difficult to run over time.
This seems out of step from the common usage (perhaps except amongst some developers), and I wonder if we need a rethink. Perhaps while acknowledging the term may have two meanings, the term itself should link to here, with such an explanation. Foxdown (talk) 12:59, 18 October 2016 (UTC)[reply]
I highly respect what Vint Cerf has done for the internet, but many computer and electronics terms have been redefined or replaced over time. In electronics, the term condenser was used until the 1920s, then slowly changed over to capacitor, and today a majority of engineers and hobbyists haven't heard of the term condenser. I still stand by my June 2016 comment, but I don't have a problem integrating both concepts into the article, as long as there is a hard split between the two parts of the article. • SbmeirowTalk • 13:41, 18 October 2016 (UTC)[reply]

External links modified

Hello fellow Wikipedians,

I have just modified 2 external links on Data degradation. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018.

regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

Cheers.—InternetArchiveBot (Report bug) 07:56, 7 December 2016 (UTC)[reply]

Images are available on Commons

Hi folks! Someone requested that I add the images from my Ars Technica article "Bitrot and Atomic COWs" to this Wikipedia article. As a matter of policy and neutrality, I'm not going to edit a Wikipedia article that uses an article of mine as a reference, but I've made the actual images - which belong to me, not to Ars Technica - available on Wikimedia Commons using the Creative Commons CCASA-4I license. The first in the series is at https://commons.wikimedia.org/wiki/File:Bitrot_in_JPEG_files,_0_bits_flipped.jpg; the notes reference the other images in the series, which consist of the same image with 1, 2, and 3 bits flipped, as well as a "cascade" PNG showing all four at once.

The original images are *original*, and may be diff'ed to see where each individual bit was flipped in the series. If anyone would like to use them, here or elsewhere, you have my blessing. Jrssystemsnet (talk) 16:42, 13 November 2017 (UTC)[reply]
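For anyone who wants to try the diff mentioned above, here is a minimal sketch of locating flipped bits between two equal-length byte strings. It uses small in-memory buffers rather than the actual Commons files, and `differing_bits` is a hypothetical helper name; to compare real files one would first read each of them with `open(path, "rb").read()`.

```python
def differing_bits(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, bit_number) pairs where the two inputs differ.

    Bit 0 is the least significant bit of each byte.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length to be compared bit by bit")
    positions = []
    for offset, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # set bits mark the positions that changed
        for bit in range(8):
            if delta & (1 << bit):
                positions.append((offset, bit))
    return positions


# Stand-in data: a clean buffer and a copy with two bits flipped by hand.
clean = bytes(range(16))
damaged = bytearray(clean)
damaged[3] ^= 0b00010000  # flip bit 4 of byte 3
damaged[9] ^= 0b00000001  # flip bit 0 of byte 9

print(differing_bits(clean, bytes(damaged)))  # [(3, 4), (9, 0)]
```

A single flipped bit can visibly damage a compressed format such as JPEG because the decoder's interpretation of the data that follows may depend on it.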

Thanks! This is useful, but you need to write a paragraph or so of text to describe and clarify what the photos mean. Simple captions aren't enough. • SbmeirowTalk • 04:38, 15 November 2017 (UTC)[reply]
  • Bitrot in JPEG files, 0 bits flipped
  • Bitrot in JPEG files, 1 bit flipped
  • Bitrot in JPEG files, 2 bits flipped
  • Bitrot in JPEG files, 3 bits flipped