Talk:Pentium FDIV bug

Good article: Pentium FDIV bug has been listed as one of the Engineering and technology good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it.
Article milestones
Date | Process | Result
April 30, 2021 | Good article nominee | Listed
Did You Know
A fact from this article appeared on Wikipedia's Main Page in the "Did you know?" column on May 20, 2021.
The text of the entry was: Did you know ... that a hardware bug in early versions of the Intel Pentium CPU led to the affected processors being recalled, in what was the first full recall of a computer chip?

586 Pentium clone

I am wondering, what "586 Pentium clone" did IBM have for sale in 1994? Crusadeonilliteracy 13:52, 12 Jan 2004 (UTC)

I'm probably the one that wrote the original text; I don't remember where I got that information. [1] claims that IBM introduced the "5x86" in 8/1995; I think that the whole FDIV flap did last the 10 months required for this to be relevant. Cwitty 19:34, 12 Jan 2004 (UTC)
It actually was Cyrix's chip design; IBM merely physically produced them for Cyrix, and under agreement got to sell some under their own brand name. I don't think the IBM5x86C can be called a Pentium clone from a design point of view. And it went into Socket 3 motherboards. Crusadeonilliteracy 00:53, 13 Jan 2004 (UTC)
The idea of the text is to point out that IBM was a competitor of Intel's in terms of selling x86-compatible chips. For that purpose, I'm not sure that it matters who designed the chips, or what motherboard they go in. On the other hand, if you feel that the current text is bad, go ahead and edit it. Cwitty 23:27, 13 Jan 2004 (UTC)
I've edited it to remove any confusion (I hope). --Townmouse 22:42, 8 Jun 2005 (UTC)

more detail needed?

The article is good but could use some more detail about the precise nature of the bug and its potential consequences. I'm not sure the "of little importance" phrase is NPOV. There's a fair bit of detail in the German article de:Pentium-FDIV-Bug if anybody cares to translate it. --Mathew5000 06:15, 27 June 2006 (UTC)

Definitely not NPOV. Requested citations on that and a few other random POV claims. --Preceding unsigned comment added by 67.53.37.221 (talk) 21:08, 28 February 2009 (UTC)

Bad opening sentence?

The original Pentium chip is notorious in computing history, for being the only chip ever made capable of performing the mathematically impossible operation of dividing by zero - due to a bug in the FPU.

This looks like it might be vandalism, or at least badly misinformed. Division by zero in a floating point context has been a fully defined operation since IEEE 754.

Also, the FDIV bug caused incorrect but real results when given real parameters. -- Myria 07:52, 13 November 2006 (UTC)

Well spotted, Myria. This looks like vandalism to me, as even a casual read of the first paragraph gives an accurate overview of the flaw. I think we should just remove that sentence. -- taviso 16:45, 13 November 2006 (UTC)
I went ahead and killed it. -- taviso 16:48, 13 November 2006 (UTC)
Wikipedia articles need a lead section; inaccurate or not, it did introduce what the article is about. I consider it quite rude to remove it without replacing it with a better one [and I'm not the author]. I slapped a {{cleanup}} on it for now. -- intgr 18:47, 13 November 2006 (UTC)
An inaccurate sentence is considered better than no sentence? Interesting. Sorry, I'll remember that in future. -- taviso 19:39, 13 November 2006 (UTC)
It was not strictly "inaccurate", merely misleading, and it served its purpose. -- intgr 13:47, 14 November 2006 (UTC)
I've written a new lead and removed the generic cleanup tag. However, I'm not sure about the overall structure. It seems to be largely chronological, which is fine, but perhaps it would be better to embrace that and have the play-by-play as a timeline of some sort. --Steven Fisher 22:20, 18 December 2006 (UTC)

Trivia or References

The Freakazoid article refers to this bug. --The preceding unsigned comment was added by Can Not (talk • contribs) 00:59, 3 December 2006 (UTC).

Wrong value?

My PC gives 4195835.0/3145727.0 = 1.333 820 449 136 241 002 etc., not 1.333 820 449 136 241 000. Is this a bug, or is the number in the text wrong? --Constructor 21:17, 8 August 2007 (UTC)

Do you have an original Pentium? The expected incorrect answer is off by roughly one part in ten thousand. Your answer could be easily explained by a change in precision, or by a software division algorithm replacing the hardware division instruction. --Steven Fisher 04:50, 10 August 2007 (UTC)
I have a dual-core AMD. I was still curious about that. --Constructor 08:31, 26 August 2007 (UTC)

Short answer: Yes, both are correct results; double-precision (64-bit) floating-point numbers cannot accommodate this precision, so both of these results would be equal. The x86-specific 80-bit floating-point datatype is implementation-defined by design (although it's at least as precise as double-precision values).

What the value actually looks like in the end depends a lot on implementation details, e.g., whether the number formatter is rounding up or down, whether it's interpolating unrepresentable binary values or filling them with zeroes, whether it's using double-precision IEEE floating-point numbers or the 80-bit x86 reals, etc. And this behavior might change depending on the application or standard C library version. -- intgr #%@! 23:03, 26 August 2007 (UTC)
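For anyone who wants to check this on their own machine, here is a minimal Python sketch (Python is my choice for illustration; nothing in the thread used it). It assumes Python floats are IEEE 754 doubles, which holds on common platforms. The helper name `double_bits` is made up for this example. The two printed forms differ by about 2e-18, far less than the spacing between neighbouring doubles near 1.33 (about 2.2e-16), so they would be expected to parse to the same 64-bit pattern or, at worst, to adjacent ones.

```python
import struct


def double_bits(x: float) -> str:
    """Return the 64-bit IEEE 754 pattern of x as 16 hex digits (hypothetical helper)."""
    return struct.pack(">d", x).hex()


# The division from the thread, done in double precision.
quotient = 4195835.0 / 3145727.0

# The two decimal renderings being compared in the thread.
as_printed_in_article = float("1.333820449136241000")
as_printed_by_poster = float("1.333820449136241002")

# Shortest decimal string that round-trips to the computed double.
print(repr(quotient))

# Bit patterns of the two decimal renderings once stored as doubles.
print(double_bits(as_printed_in_article), double_bits(as_printed_by_poster))

# Asking the formatter for more digits than a double carries only exposes
# the binary expansion of the stored value, not extra accuracy.
print(f"{quotient:.18f}")
print(f"{quotient:.25f}")
```

Digits past roughly the 16th significant place come from the formatter and the stored binary value, not from the division itself.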

Analysis of the defect.

In late 1994, the Intel Pentium FDIV bug played out on mailing lists and in the newsgroup comp.sys.intel. The posters were accomplished scientists and engineers from major companies. While Intel was claiming the bug was minor, the readers of these newsgroups found out how serious the defect was. (I followed the postings at the time and was amazed at their quality.)

Tim Coe, an FPU (floating point unit) designer at Vitesse Semiconductor, read the reports of the Pentium division errors and was able to reverse engineer the cause of the error. He wrote a C program to predict the errors. He did not own an Intel CPU, so he went to a local computer store to check his results. His error predictions were correct. He posted his results on the newsgroup comp.sys.intel on November 16, 1994.

  • Tim Coe (1994-11-16). "Re: Glaring FDIV bug in Pentium!". Newsgroup: comp.sys.intel. Retrieved 2008-03-24.

The original newsgroup posting can be found on Google Groups. Here is a web site that has a good copy of Tim Coe's posting and some other valid links. [2]

His work was reported in the technical press at the time; here is a report from the MathWorks newsletter.

  • Moler, Cleve (Winter 1995). "A Tale of Two Numbers" (PDF). The MathWorks News & Notes. The MathWorks. Archived from the original (PDF) on 2005-02-08. Retrieved 2008-03-24. (Modified 2/3/2018 to include archiveurl)

Tim Coe later wrote a paper in the peer-reviewed journal IEEE Computational Science & Engineering:

  • Coe, Tim (Spring 1995). "Computational aspects of the Pentium affair". Computational Science & Engineering, IEEE. 2 (1): 18–30. doi:10.1109/99.372929. "The Pentium affair has been widely publicized. It started with an obscure defect in the floating-point unit of Intel Corporation's flagship Pentium microprocessor. This is the story of how the Pentium floating-point division problem was discovered, and what you need to know about the maths and computer engineering involved before deciding whether to replace the chip, install the workaround provided here, or do nothing. The paper also discusses broader issues of computational correctness."

-- SWTPC6800 (talk) 03:02, 25 March 2008 (UTC)

If you're planning to add some of this new material to the article, I'd suggest that you not include the Usenet posting (since it's not considered a reliable source) but the other papers would be good to include as references. It would be especially interesting if you can identify any discovery made by Tim Coe that was different or in addition to what Thomas Nicely reported. The article is surprisingly thin on the technical details of the problem. Possibly a few more sentences might be added by someone who could fully digest the references. EdJohnston (talk) 02:52, 25 March 2008 (UTC)
I have asked about the newsgroup posting at WP:Reliable_sources/Noticeboard#A_reliable_newsgroup_posting. This newsgroup cost Intel millions of dollars. -- SWTPC6800 (talk) 03:02, 25 March 2008 (UTC)

It appears that Andy Grove, the Intel CEO, responded to this newsgroup. [3] -- SWTPC6800 (talk) 03:13, 25 March 2008 (UTC)

The two papers I was involved in concerning this bug were Coe et al., as cited above, and

  • Pratt, V.R., "Anatomy of the Pentium Bug", Proc. Theory and Practice of Software (TAPSOFT'95), Springer-Verlag Lecture Notes in Computer Science, volume LNCS 915, 97-107, Aarhus, Denmark, May 1995, available online at various places as a PDF by googling for the title, or just click on the copy at my website.

The latter paper expands on Coe's analysis of the bug, modifying it to account for additional details of the bug and in the process exposing previously hidden architectural details of the floating point unit (so the bug works a little like a linear accelerator, which reveals the structure of the nucleus by smashing particles together). The bug was caused by a miscalculation of where to truncate the lookup table used with the SRT algorithm, resulting in a row of five 2's being cleared to zero. This row had an extremely low probability of being used during random testing (Intel estimated one error in 27,000 years), making it hard to detect. Intel's statistics postulate only random data; in my paper I show that if instead one starts out with only the number 1 and repeatedly combines it with itself using the four arithmetic operations chosen at random (for example 1+1 = 2, 1/2 = .5, .5+1 = 1.5, .5/1.5 = .333…, etc.), the probability of encountering the row rises dramatically, with the bug manifesting itself every few minutes. Another bad case is "bruised" integers less than a hundred, such as 23.999927 (as caused by rounding errors when the results were supposed to be exact integers), where the bug is encountered on average every 400 divisions! --Vaughan Pratt (talk) 17:50, 4 June 2008 (UTC)
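The operand-generation process described above can be sketched roughly as follows. This is not the code from the paper, only a minimal Python illustration under assumptions of my own: the names `random_arithmetic_values` and `is_bruised_integer` are hypothetical, operands are drawn uniformly from all values produced so far, and the `tol` threshold for "close to an integer" is an arbitrary choice. It says nothing about the SRT lookup table itself and will not reproduce the bug on an unaffected CPU; it only shows the kind of non-uniform input data the comment is talking about.

```python
import math
import random


def random_arithmetic_values(steps, seed=0):
    """Grow a pool of numbers starting from 1.0 by applying +, -, *, /
    chosen at random to operands drawn from the pool (assumed scheme)."""
    rng = random.Random(seed)
    pool = [1.0]
    for _ in range(steps):
        a = rng.choice(pool)
        b = rng.choice(pool)
        op = rng.choice("+-*/")
        if op == "+":
            result = a + b
        elif op == "-":
            result = a - b
        elif op == "*":
            result = a * b
        else:
            if b == 0.0:
                continue  # skip division by zero rather than raise
            result = a / b
        if math.isfinite(result):
            pool.append(result)
    return pool


def is_bruised_integer(x, tol=1e-4):
    """True if x is close to, but not exactly, a positive integer below 100.
    The tolerance is an illustrative assumption, not a value from the paper."""
    nearest = round(x)
    return 1 <= nearest < 100 and x != nearest and abs(x - nearest) < tol


values = random_arithmetic_values(50)
print(values[:8])
print(is_bruised_integer(23.999927))
```

The point of the sketch is only that values built this way are concentrated on small rationals and near-integers rather than spread uniformly over the reals, which is the assumption behind the 27,000-year figure.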

Would anyone consider it a WP:COI if I reproduced some of the above material about bruised integers etc. in this article? --Vaughan Pratt (talk) 05:25, 1 December 2009 (UTC)

The original "Tale of Two Numbers" PDF link is lost; I have added a link to an archived copy. Jimw338 (talk) 17:35, 3 February 2018 (UTC)

Link?

Should a link or reference be included to this subject from wiki's "Math Coprocessor"? --Preceding unsigned comment added by 68.107.184.54 (talk) 13:31, 24 June 2008 (UTC)

Give me a break

Shouldn't we just change this page to a giant Size 72 [citation needed] at this point? I mean come on. 69.205.63.181 (talk) 02:41, 26 March 2009 (UTC)

The page warrants several [citation needed] comments because it expresses summary judgments about the bug, such as the evaluation of the bug as minor and the dismissal as biased of some research (particularly IBM's) indicating the bug was serious.

I am not going to speak to the mathematics or computation. Let me speak instead to the impact, which is ignored in this article. I can only speak from personal observation and not from published references. One observation is that doctoral students at several universities had to rerun their analyses of dissertation data before being allowed to graduate because they had done computationally intense analyses on computers with this processor. What is the cost of six wasted months of a person's professional life? I know this happened because I discussed this with some affected students at a few universities. In addition, I understand (from colleagues who were consulting to medical device manufacturers, including one who approached me for legal advice) that companies making medical devices and companies developing pharmaceuticals had to redo any analyses conducted on computers with the Pentium bug before submitting them to the Food and Drug Administration in the United States. What is the cost of delay of life-critical devices or medicine?

In terms of impact, this was a serious problem. CemKaner (talk) 03:11, 25 October 2009 (UTC)

Perhaps the argument could be made that IBM's research, which appeared only in a technical report, was biased. However, even before IBM announced their calculations of the high likelihood of encountering the bug in practical situations (as opposed to Intel's unrealistic assumption of a uniform distribution of values), of which I was unaware at the time, I had demonstrated in detail how the mean time between errors could be minutes instead of Intel's optimistic 27,000 years, published later in 1995 as an article in Springer Lecture Notes in Computer Science volume 915 (see above), downloadable as [4]. The bug was extremely likely in the case of divisors very close to integers with a small number of digits, the same conclusion IBM had arrived at completely independently. While these divisors represent only a tiny proportion of all reals (whence Intel's 27,000-year figure), they nevertheless arise very commonly in practice as a result of rounding errors involving nominally integer results. This high frequency has the verifiability of a theorem, allowing readers to come to their own conclusions about the likelihood of both errors and bias. (Why trust a university professor when you can figure it out yourself?) --Vaughan Pratt (talk) 06:35, 1 December 2009 (UTC)

Effects of the bug on Intel stock and/or revenue

Perhaps there should be a note about how Intel's stock and/or sales fared in the aftermath of the bug? All the best, --Jorge Stolfi (talk) 18:43, 19 December 2009 (UTC)

Misleading example

After presenting Tim Coe's famous example of the FDIV bug (where 4195835 / 3145727 returns 1.333739068902037589 instead of 1.333820449136241002), the article currently says:

Another test can show the error more intuitively. A number multiplied and then divided by the same number should result in the original number, as follows: [formula not preserved] But a flawed Pentium will return: [formula not preserved]

This is misleading. It implies that a floating-point multiplication followed by the corresponding floating-point division should always return the original input exactly, which isn't true.

The example does demonstrate the Pentium FDIV bug, but not because the result isn't equal to 4195835; it's because the error is so large (a relative error of 0.006%). This is exactly the same relative error as in the original, simpler example.

Since this "more intuitive" example is both misleading and redundant, I've removed it. --mconst (talk) 08:58, 15 February 2011 (UTC)
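Both points in the comment above can be checked with a short Python sketch, assuming Python floats are IEEE 754 doubles. The function name `count_roundtrip_mismatches`, the operand range and the trial count are illustrative choices, not anything taken from the article. The first part recomputes the relative error from the two values quoted at the top of this section; the second counts how often multiply-then-divide fails to return the original operand exactly on a correctly working FPU.

```python
import random

# Values quoted in this section: the correct quotient of 4195835 / 3145727
# and the result reported for a flawed Pentium.
correct = 1.333820449136241002
flawed = 1.333739068902037589

rel_error = abs(flawed - correct) / correct
print(f"relative error of flawed result: {rel_error:.3e}")


def count_roundtrip_mismatches(trials, seed=0):
    """Count cases where (x * y) / y != x for random doubles in [1, 2]."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        x = rng.uniform(1.0, 2.0)
        y = rng.uniform(1.0, 2.0)
        if (x * y) / y != x:
            mismatches += 1
    return mismatches


print(count_roundtrip_mismatches(10_000), "of 10000 round trips were inexact")
```

The inexact round trips on working hardware differ from the original operand only in the last bit or so, many orders of magnitude below the error computed in the first part. That size difference, not mere inequality, is what marks the FDIV bug.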

Internet influence

This was an early example of the power of the Internet -- some crucial connections between those who discovered and revealed the problem were apparently made on Usenet, and widespread Internet mockery and publicity provided a significant stimulus for Intel to declare a recall... AnonMoos (talk) 14:19, 25 November 2011 (UTC)

Rarity of Bug

The article presents Intel's estimate of one in nine billion calculations uncritically.

While it is true that Intel's estimate is correct for normal random floating-point calculations in, say, a physics simulation, as it happened the bug affected a noticeable fraction of divisions where the dividend and divisor were both integers. While it would seem wasteful to use floating-point arithmetic to perform an integer calculation, this is how spreadsheets usually work.

This point was noted in articles about the bug which appeared at the time. -- Preceding unsigned comment added by Quadibloc (talk • contribs) 02:43, 28 December 2013 (UTC)
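As a rough illustration of how integer-operand divisions could be screened, here is a minimal Python sketch. It compares each floating-point quotient against exact rational arithmetic, which uses only integer operations and so does not depend on the divider being tested. The names `division_relative_error` and `suspicious_divisions` and the 2^-52 threshold are assumptions made for this example. On a correctly working FPU the scan should report nothing; on an affected Pentium a pair such as 4195835 / 3145727 would be expected to be flagged, though I have not run it on one.

```python
from fractions import Fraction


def division_relative_error(dividend, divisor):
    """Exact relative error of the double-precision quotient dividend / divisor.
    Both arguments are assumed to be positive integers."""
    computed = Fraction(float(dividend) / float(divisor))  # exact value of the double
    exact = Fraction(dividend, divisor)
    return abs(computed - exact) / exact


def suspicious_divisions(pairs, threshold=Fraction(1, 2 ** 52)):
    """Return the (dividend, divisor) pairs whose quotient is off by more than
    the threshold; a correctly rounded result is within 2**-53 relative error."""
    return [(n, d) for n, d in pairs if division_relative_error(n, d) > threshold]


small_pairs = [(n, d) for n in range(1, 100) for d in range(1, 100)]
print(suspicious_divisions(small_pairs))
print(suspicious_divisions([(4195835, 3145727)]))
```

A spreadsheet-style workload would feed this kind of check many integer pairs, which is the case the comment says Intel's uniform-random estimate does not cover.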

Here is a reference:

http://articles.chicagotribune.com/1994-12-13/business/9412130175_1_pentium-ibm-intel

Also, see Vaughan Pratt.

- Quadibloc (talk) 03:26, 28 December 2013 (UTC)

Nicely/Kraljevic discovery

The opening paragraph attributes the discovery of the bug to Nicely in 1994; however, the Chronology section directly after states, somewhat contradictorily, that Kraljevic allegedly discovered it first. Are we attributing this to Nicely because the claim about Kraljevic isn't substantiated, or is it because Nicely is simply more notable? -- 72.214.182.83 (talk) 01:06, 10 December 2016 (UTC)

Thomas Nicely (1943-2019) passed away on September 11, 2019 from injuries in a car accident. He retired from Lynchburg College in 2000. -- Preceding unsigned comment added by 2602:302:D1B1:C770:E120:1FCC:881C:BF46 (talk) 07:16, 2 September 2022 (UTC)

Screenshot of how to check for and work around the bug in Win 95/98

I've tried to upload a screenshot for Win 95 SR 2.5 showing how Windows presented users with info in Device Manager as to whether their processor was affected, but I keep getting an error saying Wikipedia isn't sure if my uploaded file is allowed. Can't see why not, as screenshots should surely be "fair use" when showing a specific issue? It's not like articles on Windows, Office, macOS and similar don't have example desktop/app screenshots.

I've never uploaded a picture before, so does anyone have any advice? Thanks Dft-fire (talk) 16:45, 10 April 2018 (UTC)