Talk:Hard disk drive/Archive 17


RAMAC Price and Ratio

The IP reverted a change with the comment,

WP:Verifiability "Do not use articles from Wikipedia as sources." Must provide an underlying source (reference). 180 million still has false precision; $0.05 corresponds to 200 million.)

Actually the RAMAC price in $/MB (a routine calculation) has its two components properly referenced on the cited page, so it is indirectly referenced, meeting the verifiability requirement, which I believe the IP knows; according to WP:Verifiability the IP should have fixed the reference rather than blowing it away with a tag.

I really don't understand the "false precision" comment; it seems to me that mathematically ($9,200/MB)/(<$0.05/GB) = >184 million, which properly rounds to >180 and not >200. Perhaps a better way to look at it is if

$0.045/GB < current price < $0.050/GB then,
184 million < RAMAC price/current price < 204 million

so to say that the ratio is >200 million as proposed by the IP is misleading at best. Tom94022 (talk) 17:52, 26 August 2014 (UTC)
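A minimal Python sketch of the bound arithmetic above (the prices are the ones quoted in this thread; the per-GB conversion and the variable names are spelled out here for illustration):

  ramac_per_gb = 9_200 * 1_000              # $9,200/MB expressed in $/GB
  for price_2013 in (0.045, 0.050):         # 2013 price bounds, $/GB
      print(round(ramac_per_gb / price_2013 / 1e6))   # prints 204, then 184 (millions)

Both bounds exceed 180 million, so ">180 million" is the conservative rounding.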

WP:Verifiability is violated here: "Do not use articles from Wikipedia as sources." This inline citation (price of $9,200) is unverified because it merely points to Wikipedia, regardless of the purported source buried several layers deep: History_of_IBM_magnetic_disk_drives#IBM.27s_first_HDD_versus_its_last_HDDs[1]
WP:Citation needed is also violated: "If someone tagged your contributions with [citation needed] and you disagree, discuss the matter on the article's discussion page." You revert without discussing first.
The updated price of $10,000 has an honest-to-goodness (non-WP) inline citation,[2]
The number $0.050 is fabricated by the editor and the number is actually $0.05. That extra trailing zero is deceptive. Therefore, the two significant digits of 180 million are erroneous, incorrect false precision. 71.128.35.13 (talk) 01:15, 27 August 2014 (UTC)
The Backblaze reference states their price in Sept 2013 was $0.044. The chart in Backblaze clearly shows the price in 2013 starting well above $0.050 and descending below $0.050 but remaining well above $0.040, about at the $0.044 reached in September; therefore, the average for 2013 is clearly below $0.050 and above $0.045, and nothing is "fabricated" other than perhaps your objection. Tom94022 (talk) 17:50, 27 August 2014 (UTC)
The RAMAC purchase price and capacity sources are clearly given in the cited section and you are being disingenuous and argumentative by ignoring them, so just in case you choose to dispute this again here they are:
IBM Archives: IBM 350 disk storage unit gives the capacity as 5 million characters, which a routine calculation converts to 3.75 MB (not 5 MB as u used above)
Ballistic Research Laboratories "A THIRD SURVEY OF DOMESTIC ELECTRONIC DIGITAL COMPUTING SYSTEMS," March 1961, section on IBM 305 RAMAC (p. 314-331) has a $34,500 purchase price at Boeing Wichita.
As you did, it is then routine to calculate Price/MB. There is nothing in WP:Verifiability that precludes such indirect verification, but if you insist upon direct verification then as stated in WP:Verifiability you should now update this article to these sources.
Your [re]search has indeed discovered another source that has a different RAMAC purchase price. There is already consensus on $34,500 at RAMAC Purchase Price so a discussion should be held there as to what to do about this new source; after all we really don't need two different numbers for the same thing, do we? Tom94022 (talk) 17:50, 27 August 2014 (UTC)
Wikipedia:Verifiability#What_counts_as_a_reliable_source The $9,200 claim requires support from a direct inline citation, not from another WP article and not from an editor's routine calculation. In fact, this source WP:AD directly says $10,000 per megabyte: http://www.sfgate.com/business/article/Hard-driving-valley-began-50-years-ago-And-most-2469806.php 71.128.35.13 (talk) 20:29, 27 August 2014 (UTC)
Your source is on its face inaccurate since 50000/3.75 does not equal 10,000. Again I suggest u move this duologue to the appropriate page. Tom94022 (talk) 20:54, 27 August 2014 (UTC)

These sources directly say $10,000 (or $11,364) per MB: [2] [3] You must directly support your un-referenced and dubious $9,200 claim with a non-wikipedia inline citation, Wikipedia:Verifiability#What_counts_as_a_reliable_source because Wikipedia content is not considered reliable unless it is backed up inline by a reliable source.

The claim of 3.75??? megabytes is on its face inaccurate because it is unsupported by an inline reference, and because these reliable sources say 4.4 MB. wp:ad http://www.snopes.com/photos/technology/storage.asp [3] 71.128.35.13 (talk) 23:13, 27 August 2014 (UTC)

I can't believe u are seriously disputing 3.75 MB; u are either very ignorant of storage history or just being argumentative. The IBM Archive and many other places say 5 million characters, and there are many sources that establish the 350 recorded 6 bits per character, e.g. "stored 5 million 6-bit characters (the equivalent of 3.75 million 8-bit bytes)", "IBM's 305 RAMAC had a 3.75-megabyte internal hard disk drive", "stored 5 million 6-bit characters (the equivalent of 3.75 million 8-bit bytes)"; all u have to do is Google "RAMAC 3.75 million". If you still insist there is an argument we can turn to the technical manuals for the 350, which clearly show 5 million 6-bit characters. You seem to [re]search just to make an argument rather than trying to establish facts suitable to Wikipedia. Since both your references are fundamentally flawed they cannot be reliable. But you really should take this to the RAMAC page where we can probably gain consensus. If u won't move there, then I guess I will invite those editors here. There really should be only one place where we discuss RAMAC, and other articles such as this should not disagree. Tom94022 (talk) 01:05, 28 August 2014 (UTC)
Those rumored IBM internal documents are not publicly available references, and cannot be used in Wikipedia. Several references have 6 bits plus one odd parity bit, total of 7 bits per character: that's how they arrive at 4.4 MB, not 3.75 MB.[[1]][3] 71.128.35.13 (talk) 02:37, 28 August 2014 (UTC)
In modern terms the RAMAC 350 used a 6/8(1,7) run-length-limited (RLL) code, encoding 6 data bits into 8 channel bits with a minimum spacing of 1 and a maximum spacing of 7. The two additional bits are a space bit and a parity bit. RLL codes are used throughout storage, and in all cases the capacity is based upon data bits, not channel bits such as parity and space in this case. A simple example is the CD, which encodes eight data bits into fourteen channel bits (EFM) and then adds a bunch of parity bits, but the capacity is stated in terms of (data bits)/8, not (channel plus parity bits)/8. My source is an IBM publication on the web and I have a local copy, but it shouldn't be necessary to go there since you agree that the seventh bit is a parity bit. Tom94022 (talk) 18:36, 28 August 2014 (UTC)
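To make the capacity arithmetic concrete, here is a minimal Python sketch using only the figures stated above (counting data bits, as the comment argues, versus counting all recorded channel bits):

  characters = 5_000_000                  # IBM 350 capacity in characters
  data_bits = 6                           # data bits per character
  channel_bits = 8                        # data + space + parity bits recorded
  print(characters * data_bits / 8)       # 3750000.0 bytes, i.e. 3.75 MB
  print(characters * channel_bits / 8)    # 5000000.0 bytes if channel bits were counted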
References are definitely necessary and absolutely required. Two references say 4.4 megabytes.[2][3] Has any real reference (not purported, not rumored, not alleged, not calculated indirectly) been shown to support 3.75 directly? 71.128.35.13 (talk) 00:13, 29 August 2014 (UTC)
Actually I have given u three references to 3.75 MB which for some reason u choose to ignore. But better yet, the RAMAC 305 Customer Engineering Manual of Instruction (c) 1959 on page 7 states there are 8 channel bit positions within each character, two of which are not used in the bit coding (Space and Parity), leaving 6 data bits. Figure 86 shows the waveforms of the 8 channel bits within the M350, just in case u are confused by the title of this reference. Will you now agree that the M350 capacity was 3.75 MB, or are you going to look for some other tertiary and incorrect source? Or perhaps u will claim parity should be counted? Tom94022 (talk) 05:21, 31 August 2014 (UTC)


Disputed BRL reference

The editor calculates $9,200/MB, however no source supports this claim directly and IBM internal documents are not useful as references in Wikipedia. Two reliable sources directly indicate $10,000 or around $11,000 per MB.[2][3] 71.128.35.13 (talk) 02:48, 28 August 2014 (UTC)

The IP is again disingenuous; the above cited 1961 BRL reference is online and specifically states a purchase price of $34,500 for Boeing Wichita. It is a routine calculation to then divide by 3.75 to get $9,200. The question then becomes which is the more reliable source, the 2006 source and its echo in 2011[a] or the 1961 source. On its face, a serious repeated contemporaneous survey such as performed by BRL should carry more weight than a comment made in a speech almost 50 years later.
Furthermore, the known $650/month rental price of the 350 would at $50,000 give a purchase price to rental ratio of 77:1, far exceeding the 55:1 ratio or less for similar IBM equipment. $34,500 is a 53:1 ratio, far more believable. There is no difference of opinion here, at least one number is wrong and the best evidence supports $34,500. Tom94022 (talk) 19:16, 28 August 2014 (UTC)
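The purchase-to-rental ratios above are simple divisions; a minimal Python sketch with the figures as stated in the comment:

  rental = 650                        # $/month for the 350 Disk File
  print(round(50_000 / rental))       # 77 -> the implausible 77:1 ratio
  print(round(34_500 / rental))       # 53 -> the far more believable 53:1 ratio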
The foregoing is disingenuous with respect to WP:civility, and dubious data are rampant. Even the "known" rental data for one unit of $650 per month is faulty. This 1961 source shows $975 per month at Yale Univ, $975 per month at USA ESCO, and $975 per month at Western Electric Co. in Indianapolis.[[3]]
On the subject of dubious data, the IP has cited from the IBM 650 RAMAC section and not the IBM RAMAC 305 section referenced above. FWIW, the 350 Disk IS NOT listed in the section he cites so I have a hard time seeing this misrepresentation as a simple mistake. The $975/month is for the 355 Disk and not the lower priced 350 Disk. Had he searched the correct citation he would have found at least three instances of a $650/month rental of the 350 Disk File. But we don't have to take just the BRL survey for a $650/month rental price since had the IP [re]searched the truth he might have found
"The monthly rental for a basic RAMAC was $3,200, of which $650 was for the disk storage unit ..."[4] Tom94022 (talk) 06:22, 29 August 2014 (UTC)
The editor has calculated $9,200 based on 3.75 megabytes, but two other references directly show $10,000 or $11,000 and 4.4 megabytes. WP:verifiability disallows reliance on an editor's calculation. Instead of relying on an editor's calculation of one unit's price, the article should list $10,000 (one significant digit) that represents the three sources. When sources disagree, use a rounded number that represents them all, cite them all in-line, and move on. Don't rely solely on an editor calculation. 71.128.35.13 (talk) 00:32, 29 August 2014 (UTC)
Again the IP misstates the situation: both his cited sources use the same $50,000 from a 2006 speech, shown above to be highly unlikely; one source divides by 5.0 MB and the other by 4.4 MB; both capacities are incorrect. We would be doing a disservice to Wikipedia and failing as editors if we accepted two fallacious divisions as in any way equal to the one simple division of numbers from reliable sources that any editor or reader can do. Sources do say the earth is flat, but that doesn't mean we as editors are obligated to use such assertions in any articles on the earth, even if one IP says so.
Finally the IP asserts in his edit note that " wp:Verifiability disallows editor calculations." but there is simply no support for this assertion in the policy; no mention of calculate in any form appears in the policy. Routine calculations such as the results of a division are allowed to appear in articles with consensus so it seems to me that the only implication of verifiability for division is that both the numerator and the denominator be verifiable, and in this case they are. I suppose if we had to we could add more to the reference, but I think the reference makes it pretty clear what constitutes the numerator and the denominator and both are verifiable. Tom94022 (talk) 06:22, 29 August 2014 (UTC)
The $9,200 is flat out dubious, misleading and unsupported. No reference, anywhere, ever, listed $9,200/MB. In 1956 IBM announced both the 350 and 355 RAMACs simultaneously: History_of_IBM_magnetic_disk_drives#IBM_350. The $9,200 is tied to the RAMAC350 at $650 per month and simply ignores the RAMAC355 at $975 per month. Both RAMACs were real and contemporaneous, so obviously the price is higher than $9,200. Several sources, not a Wikipedia editor's calculation, directly support $10,000/MB. 71.128.35.13 (talk) 18:30, 29 August 2014 (UTC)
For anyone interested in learning more about the RAMAC history I highly recommend the RAMAC 350 Restoration Web Site and the RAMAC oral history project. I have used both in this tedious duologue.
The cited BRL reference clearly gives one instance of a 350 Disk File at $34,500. There are four instances at a rental of $650/month, confirmed by the Pugh reference, which yields a reasonable 53:1 price/rental ratio. Thus 34,500/3.75 = 9,200 is supported, accurate and correct; the IP's assertion that it is "dubious, misleading and unsupported" is a flat out lie.
Apparently the IP does not understand what the 355 was - it was a 350 with three actuators interfaced to a Model 650 system - the Model 350 disk file initially had only one actuator. It shipped much later than the M350. From the same BRL data cited by the IP it seems the 355 Model 1 had a purchase price of $62,200; with a capacity of 6 million characters (5-bit), it was also 3.75 MB, resulting in a price of $16,587/MB. Of course we ignore this, using the M350 because it was first and is lower.
What are flat out dubious, misleading, inaccurate and false are the 5.0 MB and 4.4 MB capacities used to arrive at the $10,000/MB and $11,364/MB the IP continues to cite. The $50,000 purchase price is also dubious in that it appears in a speech 50 years later without any sourcing. Wikipedia values accuracy and both accuracy and truth are knowable for mathematical calculations such as the capacity of the M350 or M355! This makes the two citations used by the IP unreliable sources not suitable for citation. To equate these two citations with the BRL survey and the RAMAC capacity would be a false equivalence. Tom94022 (talk) 05:08, 31 August 2014 (UTC) updated: Tom94022 (talk) 18:59, 31 August 2014 (UTC)

Wikipedia values wp:verifiability, and this tedious defense of $9,200 is bereft of any reference. Apparently the editor can't find direct support for $9,200. No source, beside one Wikipedia editor, ever said $9,200. This is all wp:or original research; it misleadingly focuses on the RAMAC305 and ignores the contemporaneous RAMAC650_355; and it is laden with false precision.

Here's another source that is not a calculation and not original research which gives a RAMAC price of $10,000/MB. Both RAMAC350 and RAMAC355 were introduced simultaneously in 1956 and withdrawn simultaneously in 1969. As of the late 1950s the 650 computer with RAMAC355 storage was produced at the rate of one per day, in higher volume (nearly 2000 units) than any other computer in history up to that time.[5] We could agree that the RAMAC355 price was higher than the RAMAC350. One dubious editor-produced calculation puts it 65% higher ($15,200 versus $9,200 per disputed MB), and a more rational and reasonable calculation has it 25% higher (($975 / 6 million characters × 5 million characters) per month versus $650 per month). Either way, the 355 price was indeed higher, and these two models were contemporaneous according to IBM. This means the real, true price must have been substantially higher than $9,200. More like $10,000 with one significant digit of precision.

Again, no reference supports the entirely editor-fabricated claim that the 355 was produced later than (not contemporaneous with) the 350. So, the $9,200 price is still dubious, misleadingly over-precise, unsupported by any reference, and it ignores the higher-volume and contemporaneous 355. The real, verifiable, properly supported, not falsely precise, price is $10,000. 71.128.35.13 (talk) 23:49, 31 August 2014 (UTC)

I really hate to repeat myself, but there is nothing in Wikipedia's Policy on Verifiability that precludes an editor performing a routine calculation which can then be used in an article; the IP is simply making up a prohibition. The IP has previously edited CAGR calculations into an article, so it is rather dishonest of him to deny the validity of routine division yielding $9,200, particularly since both the numerator and the denominator have reliable sources.
The IP again cites an unreliable source since the price/MB therein is based upon an incorrect 5 MB capacity.
The IP asserts the character size of the M350 is the same as that of the M355 without any evidence whatsoever, but is then willing to perform a routine calculation in support of his position. BTW, it may be true (I don't think so), but it is irrelevant to this duologue.
The IP is deliberately misleading when he characterizes the 350 and the 355 as contemporaneous; they were announced simultaneously in a press release on Sept 14, 1956[6], but at that time there were at least two 305s installed (Zellerbach and USN Norfolk), each one having at least one 350; some having 2. According to Phister the number of first generation disk drives installed on all IBM systems in 1961 was 900 units, leaving very little room for 355s on 650s. A low number of disks on M650s is supported by a 1962 survey of educational computing[7] that identified only 2 RAMAC units out of 38 installed M650s. It seems the M355 is yet another red herring by the IP to justify his position, which u should recall proposed $10,000/MB without any reference to M355 at all.
There is no false precision in $9,200 since the prices of the various models are known to 5 digits, the capacity is an integer and the 350 was the only product shipping in 1956. What the IP is asserting is a false equivalence between several inconsistent unreliable sources and this routine calculation. Tom94022 (talk) 06:48, 1 September 2014 (UTC)
RAMAC355 passed customer acceptance test at USA ESCO in mid-1957; the reference is here [[4]] A tit-for-tat “deliberately misleading” accusation wouldn't accord with the doctrine of WP:civility, though the editor does spew incivility. In the discussion above, the claim that RAMACs first shipped in 1958 was kicked to the curb by a Zellerbach Paper Company 1956 reference. We hew to reliable sources and place our full faith in the WP:verifiability of RAMAC prices, not restricted to an arbitrary editor-selected 14 units.
The references below show RAMAC355 and RAMAC350 dates of customer acceptance and the running periods (a month or more) for measurement of reliability. Furthermore, the RAMAC355 price is 25% higher than the RAMAC350: $975 per month for 6 million characters versus $650 per month for 5 million characters. Therefore, the $9,200 editor-produced price misrepresents the BRL data by model selection and overprecision. RAMAC is more nearly $10,000 per megabyte with one significant digit of precision.[[5]]
[[6]] (six million characters, RAMAC355): USA ESCO, Type 355 Disk Storage at $975 ea. per month; running period from Oct 59 to May 60; Passed Customer Acceptance Test Jul 57.
Western Electric Co., Type 355, $975 per month; running period 16 May 60 to 17 Aug 60; Passed Customer Acceptance Test Aug 59.
[[7]] (five million characters): Boeing, Wichita, 350 Disk Storage, $34,500; Passed Customer Acceptance Test 10 Jun 58; running period included 1 Mar 60 to 31 Mar 60.
WE Aurora, 350 Disc Storage Unit, $650 per month; running period 1 May 60 to 31 Jul 60; Passed Customer Acceptance Test 1 May 60.
Georgia State, 350 Disc Storage Unit, $650 per month; running period 1 Jun 60 to 30 Jun 60; Passed Customer Acceptance Test 1 Jan 60.
A few of those purportedly red RAMAC355 herrings, obviously not the true IBM color, are named above. There are a lot of documented Big Blue RAMAC fish in the deep Blue sea. False precision and biased model selection are sinful. The BRL data verify the Official IBM PR Announcement that the RAMAC350 and RAMAC355 are indeed contemporaneous. And yea verily, the RAMAC355 price is indeed higher than the editor-calculated, unreferenced $9,200. So sayeth the Records of IBM, as revealed to us in the gospel of Thelen_BRL. Our faith has been confirmed by the presentation of the Price Data; amen to $10,000 per megabyte with one significant digit of precision.[[8]] 71.128.35.13 (talk) 03:43, 3 September 2014 (UTC)
Thank you for acknowledging BRL as a reliable source.
The ESCO citation for the IBM 355 date is ambiguous in that ESCO reported two IBM 650 systems, Number 800 having 4 IBM 355s and Number 700 without any disk drives. The acceptance test date does not say what was accepted in 1957, it could have been one or both.
There is good evidence that the IBM 355 character was a 5 bit decimal digit and not a 6 bit alphanumeric digit[8]. The IP has presented no evidence that the character size of the two RAMACs was the same so any ratio is suspect. But since he is willing to do such ratio calculations I fail to understand any objection to $34,500/3.75.
What is indisputable is that in 1956 the only disk drive shipped to customers was the IBM 350, as part of an IBM 305 System, including one to Zellerbach in June and one to the USN Norfolk before Sept 14, 1956. Since the table cell in question in this article is in a column labeled "Started with", all discussion of the IBM 355 is a red herring since the disk drive industry started with the IBM 350, not the IBM 355. Tom94022 (talk) 16:48, 3 September 2014 (UTC)

End of one part of the duologue

IBM's archivist has provided me with a copy of IBM Announcement Letter 259-38, May 5, 1958, announcing the "IBM 350 Disk Storage Model 2", "similar to the Model 1" to be installed as a second disk file on the 305 system with a rental price of "$700/month" and a purchase price of "$36,400". Initial deliveries were scheduled to begin "September 1958." Note that there is one M350 in the BRL survey at $700/Month, presumably a Model 2; the others have rental of $650/month and one has a purchase price of $34,500. This clearly establishes the BRL as a reliable source for the purchase price of the 350 Model 1 as $34,500 and confirms as unreliable the sources using $50,000 for the purchase price of the original RAMAC, Model 1 or 2. If the IP insists otherwise I suppose I will have to find a way to post it but I would hope he/she would attribute good faith on my part and stop wasting time on this issue. Tom94022 (talk) 20:09, 1 September 2014 (UTC)

Summary of arguments in favor of current edit

This dispute is over the contents of the Price row in a table in the article

Improvement of HDD characteristics over time

Parameter | Started with | Developed to | Improvement
Price | US$9,200 per megabyte[9][dubious – discuss] | <$0.05 per gigabyte by 2013[10] | 180-million-to-one

The following facts are from reliable sources and not disputable.

  1. The industry "Started with" the only hard disk drive to ship to customers in 1956, the IBM 350 (shipped to at least Zellerbach[11] and USN Norfolk before September 14, 1956[6]).
  2. The capacity of the IBM 350 was 5 million 6-bit characters which is precisely and only 3.75 MB[12]
  3. The purchase price of the IBM 350 was $34,500 (BRL[9] as confirmed by Pugh and the IBM 350 Model 2 announcement).

It is an indisputable fact that $34,500/3.75 MB = $9,200/MB.
It is an indisputable fact that ($9,200/MB)/(<$0.05/GB) = >184 million, which properly rounds to 180 million
The IBM 350 capacity and purchase price are precise whole numbers, any other values reported by any source are on their face incorrect and such a source cannot be used in Wikipedia other than perhaps in context. Tom94022 (talk) 16:58, 4 September 2014 (UTC)
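Both "indisputable" calculations above can be checked in a few lines of Python; the names are invented here, and the factor of 1,000 converting $/MB to $/GB is made explicit:

  price_per_mb = 34_500 / 3.75            # 9200.0, i.e. $9,200/MB
  ratio = price_per_mb * 1_000 / 0.05     # $9,200,000/GB vs. <$0.05/GB
  print(price_per_mb, ratio)              # 9200.0 184000000.0, i.e. >184 million to one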

Argument by IP against current edit

Capacity (based on 6-bit characters) is absolutely disputable. Respected sources say 7 non-binary bits, not 6 binary bits per character and 3.75 MB capacity. The Official History of IBM hails the RAMAC350 as an Icon of Progress: “The whole thing could store 5 million binary decimal encoded characters at 7 bits per character.” That might be 5 million characters × 7 bits / 8 bits per modern byte, or around 4.4 million bytes.[[9]]

Al Shugart himself is quoted directly saying it had a 7-bit code: “The coding used on the disk drive was a 7-bit code - 6-bit + 1 binary. Very straightforward and simple.”[[10]] The American Society of Mechanical Engineers designated RAMAC as An International Historic Landmark, noting it had 7 non-binary bits per character: “The 350’s … contained a total capacity of 5 million binary decimal encoded characters (7 bits per character) of storage.”[[11]] RAMAC355 likewise had 7 bits per character because it “used the same mechanism as the IBM 350 and stored 6 million 7-bit decimal digits”[[12]], 20% more than the RAMAC350. Again according to Wikipedia, “each character was 7 bits, composed of two zone bits ("X" and "O"), four BCD bits for the value of the digit, and an odd parity bit ("R") in the following format: X O 8 4 2 1 R”[[13]]

To recap, sources supporting a non-binary version of 7 bits, not the 6 binary bits that are tied to 3.75 MB, include the Official History of IBM, a direct quote from Al Shugart himself and the American Society of Mechanical Engineers International Historic Landmark designation. Certainly, the non-binary code would carry more than 6 binary bits of information, hence more than 3.75 MB, because four BCD bits take more space than binary bits: "Standard BCD requires four bits per digit, roughly 20 percent more space than a binary encoding (the ratio of 4 bits to log₂10 bits is 1.204)."[[14]] 71.128.35.13 (talk) 20:33, 4 September 2014 (UTC)

Neither non-binary bits nor non-binary version of 7 bits has an obvious meaning, but the 305 RAMAC manuals[13] at bitsavers clearly describe 6 data bits per character. Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:57, 18 September 2014 (UTC)
Agreed - specifically, see page 70 in the above-referenced PDF. And note that it is explicitly talking about the data format "on the drum or disk", so there is no arguing that it's the 305's internal format and the format on disk could be different. This is a "horse's mouth" reference. Jeh (talk) 23:49, 18 September 2014 (UTC)

Correct price is $10,000 per Megabyte

No editor-calculated original research WP:OR is justified here since direct sources are available. Here's one direct source[[15]] that gives the RAMAC350 a price/megabyte of US$15,200. So we have numerous reports, not editor calculations. They present an obstacle for the tendentious and calculating editor because none of the sources say precisely $9,200/MB nor 3.75 MB nor 6 binary bits per character.

So, the problems that produce over-precision in the editor's $9,200 calculation are model selection (350 or 355, which were contemporaneous and priced differently), capacity (6 binary bits or 7 non-binary bits per character) and price (various sources offer different price/MB). All direct sources, and the dubious over-precise $9,200 editor calculation, agree that RAMAC “at the start” cost $10,000/MB with one significant digit of precision. 71.128.35.13 (talk) 20:06, 4 September 2014 (UTC)

(Moved IP comment here for better flow) Questioning and civil discussion is the first step on the path to wisdom. Thanks for the opportunity to summarize the (correct) answers to the questions you've posed:

1. The capacity of the 350 Disk File is "other" (nearly 4.4 MB), because BCD encoding is not precisely 6.000 bits (more nearly 7 bits) per character.

2. The purchase price was $34,500 according to BRL, and the contemporaneous RAMAC355 was 50% higher (with 20% greater capacity as well). No, price/MB is not a routine calculation. Here the editor posing these questions selects a biased model type (350) and ignores the contemporaneous higher-priced RAMAC355. Any calculation is unneeded and inappropriate original research WP:OR on the part of the wikipedia editor because several reliable sources are available that directly give price/MB.

3. The premise of your question (agreement that price is a routine calculation) is wrong. We can calculate the ratio here, without undertaking an editor calculation of the price or agreeing to your premise. The ratio is > 200 million to one. ($10,000 per megabyte versus < $0.05 per gigabyte). 71.128.35.13 (talk) 20:28, 4 September 2014 (UTC)

An End To The RAMAC Price Duologue

With an apparent acceptance that the 350 Disk File (Model 1) had a purchase price of $34,500, may I now suggest that this duologue can be ended by allowing other editors to discuss the following three questions, perhaps referring to the summaries immediately above.

Questions For Discussion
  1. What was the capacity of the 350 Disk File (Model 1) in MB:   3.750,  4.375,  4.4,  5.000,  other?
  2. Is the division $34,500/(answer 1 above) a routine calculation suitable for inclusion in an article?
  3. If the answer to 2 above is yes,  what is its ratio to <$0.05/GB:   >180 million-to-1,  >200 million-to-1,  other?

Information to answer these questions can be found repeatedly and in detail above this section. I for one will no longer respond to the IP but will briefly answer any questions placed below by any other editor, and am willing to accept any consensus reached by a reasonable number of registered editors. Tom94022 (talk) 16:58, 4 September 2014 (UTC)

Discussion from editors other than the IP would be appreciated below

  1. Capacity 3.75MB. What we call "capacity" today includes user data only, and excludes all parity, ECC, servo and metadata. This was not the case in the early days; in the 1990s I worked on drives that reported the larger "unformatted" capacity (lawyers made us stop doing that). For any comparison, we must do some conversion between the old and new uses of the term capacity. I can conceive of no other way than to use data (non-parity) bits; (modern drives have much larger ratios of non-data to data capacity). That means the parity bit for each 6 bits should not be counted. This is suitably sourced. Al Shugart's comments support the "+1" as not being data. 5 million times 6 bits = 3.75MB (8-bit). BTW, introducing BCD into this mix would reduce the capacity, as 4 BCD bits only allow for values of 0-9, not 0-15 as allowed by 4 binary bits, but evaluating the impact of BCD on capacity isn't sourced. (Also, BCD values of 10-15 could be reserved for "special conditions" not defined by the storage itself.)
  2. Yes; $34,500/3.75MB calculation is okay to include ($9200/MB). This is allowed per WP:CALC and does not violate WP:SYNTH.
  3. 180 million-to-1. --A D Monroe III (talk) 23:56, 4 September 2014 (UTC)
  • I agree with 3.75 MB. To pull this off, we must compare apples to apples. Common sense must prevail. On with building Wikipedia. BTW, I would suggest a non-breaking space (&nbsp;) between the value and the unit of measure. Greg L (talk) 03:01, 5 September 2014 (UTC)
  • I'm not commenting on the pricing, but I do agree that the storage was 3,750,000 bytes. To allow correct comparison with modern capacity measures, one should not include meta-bits like parity and stop bits. I'd only change my mind if it was possible to turn off the parity check and use the 7th bit for data, but nobody has produced evidence of that. Incidentally, I'm pretty sure that in those days 3,750,000 bytes was 3.58 MB since "MB" meant 2^20 bytes, but I'm not suggesting to use that. Zerotalk 02:21, 5 September 2014 (UTC)
  • I have no knowledge of the drive under discussion but I agree that early systems gave their capacity in terms of every bit on the platter, not just user data. Perhaps replace US$9,200 with "Over US$9,000" and leave the 180-million-to-one. The footnote should give a very brief explanation per the above summary. The IP will never be satisfied and the formal procedure would be to start an RfC. Johnuniq (talk) 03:54, 5 September 2014 (UTC)
  • For the beginning, here are a few quotes:
Within RAMAC, all data is read, transferred, and written serially by word, character, and bit. There are eight bit positions within each character position. They are identified as bits S, X, 0, 1, 2, 4, 8, and R. Bit S merely provides a space between the recording bit positions of each character, and is not used in the bit coding. Bit R has no numeric or alphabetic value, but is added to certain characters so that every character will have an odd number of bits. This convention makes possible a technique whereby RAMAC may perform a validity check on each character transferred.
The disk drive could store 5 million characters using a 6-bit character code plus a parity and space bit per character.
The 350's fifty 24-inch disks contained a total capacity of 5 million binary decimal encoded characters (7 bits per character) of storage.
Based on the first quote, each RAMAC byte had six end-user-accessible bits, as we have to discard space that's unusable to end users; for RAMAC, those are the space and parity bits in each "platter" byte. As a note, modern HDDs also contain quite a lot of space that isn't accessible to end users, while only the usable amount of space is advertised; one of the motivations behind the 4Kn initiative is the reduction of "wasted" platter space associated with ECC data, and ECC data is only one part of the end-user-inaccessible space. For example, a typical 1 TB HDD with 512-byte sectors also provides additional capacity of about 93 GB for the ECC data; see this PDF and this illustration for more information.
The third quote might be a bit contradictory, but each character is actually written to platters as seven bits, out of which only six bits are accessible to end users as the seventh ("R") bit is the parity bit used by RAMAC; the eighth ("S") bit is practically empty space. This is probably a small marketing trick.
Based on all that, RAMAC has 5,000,000 characters with six bits each, which is 30,000,000 bits or 3,750,000 bytes. As the "1 MB = 1,000,000 bytes" marketing gimmick applies to modern HDDs, we have to apply it to RAMAC, too, thus we end up with 3.75 MB as the RAMAC's capacity that can be used for later comparisons. $34,500 / 3.75 is the logical calculation for price per MB, which results in $9,200 per RAMAC's MB of storage. When that is compared to less than $0.05 per MB, we end up with a ratio somewhat larger than 184,000-to-one – how is the 180,000,000-to-one ratio calculated? — Dsimic (talk | contribs) 05:00, 5 September 2014 (UTC)
The summary above includes ">184 million which properly rounds to 180 million". Three significant figures are not needed or justified. Johnuniq (talk) 07:02, 5 September 2014 (UTC)
Hm, how can 184,000 be rounded to 180,000,000? That's the question. :) — Dsimic (talk | contribs) 07:21, 5 September 2014 (UTC)
$9,200/$0.05 = 184,000 and the extra 1,000 is because one is in megabytes and the other in gigabytes. Johnuniq (talk) 09:40, 5 September 2014 (UTC)
Sorry but that simply isn't true, as we're comparing $0.05/MB and $9,200/MB. For today's HDDs, it is a MB that costs $0.05, not a GB. — Dsimic (talk | contribs) 20:36, 5 September 2014 (UTC)
My bad, it's $0.05 per GB for contemporary HDDs (though it seems a bit too low), I got a little confused. However, the third question on top of this section specified $0.05 for the MB price, I'll get it corrected. — Dsimic (talk | contribs) 21:09, 5 September 2014 (UTC)
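The unit step that caused the confusion above, spelled out in a minimal sketch (figures as quoted in the thread):

  ramac = 9_200 * 1_000     # $/GB, since the $9,200 figure is per MB
  modern = 0.05             # $/GB for 2013 drives
  print(ramac / modern)     # 184000000.0, which rounds to 180 million, not 184,000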
IBM used the term BCD to refer to a six bit character set in which the digits 0-9 were encoded as 000000 through 001001; the term does not mean that the value of the right four bits is restricted to that range for other characters. Typically there will be 63 or 64 valid characters in a BCD character set. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:34, 9 September 2014 (UTC)

Perhaps one reason for this question being so hard to settle is that there are good arguments for all the options:

  • 3.75MB If one tried to store a .jpg file on an IBM 305 RAMAC there would only be room for a 3.75MB file.
  • 4.375MB The IBM 350 drive stored seven bit bytes. The decision to allocate one bit to parity is arguably a system design decision, not a disk drive limitation. If we had a working 350 and wanted to interface it to, say, a Raspberry Pi, one could easily put the interface upstream of the parity circuitry and store 4.375MB worth of .jpg files.
  • 5MB Nobody was storing images on computers in 1956. IBM 350 disk drives were used for storing numbers and characters and if we ran the same software today, say with a 305 emulator on a PC, we would store one character per modern byte. So an IBM 350 used in the normal way it was used then would be as useful as a 5 MB modern disk drive used the same way.

In the table in question we are reporting economic utility improvement ratios, and those ratios calculated in any of the ways mentioned are extremely large, almost incomprehensible, perhaps the most dramatic in human history. So my inclination would be to report them in the most conservative way, which would calculate the ratios on the basis that one character on the 350 is the economic equivalent of one 8-bit byte on a modern drive. So e.g. for "Starting with", I would say "5 megacharacters." However, I would then put a footnote at the bottom of the table, that said something like:

“* These ratios assume that one six-bit character on the IBM 350 is equivalent to one eight-bit byte on a modern disk drive. Comparing on a cost per bit basis, one should increase these ratios by a factor of 1.14 or 1.33 depending on whether one includes the seventh parity bit or not.”

In other words, here is a big number and arguably it could be even bigger.

As for rounding, again I would also be conservative and never round up. 180,000,000 is not excessive precision here.--agr (talk) 20:58, 5 September 2014 (UTC)
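The 1.14 and 1.33 factors in the suggested footnote follow from a per-bit comparison against modern 8-bit bytes; a minimal sketch:

  print(round(8 / 7, 2))    # 1.14 if the seventh (parity) bit is counted
  print(round(8 / 6, 2))    # 1.33 if only the 6 data bits are counted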

I disagree. There's no useful way to compare these so-called "megacharacters" to bytes. It's true there are use clashes in this, but unless we just forget about doing any comparison, we can only use the lowest common denominator -- data bits. Since RAMAC had a designated parity bit, I can only assume it wasn't sufficiently reliable without it. On a modern HDD, I can attempt to use Write-Long and Read-Long commands to store extra data in the ECC field to a higher capacity, but I'd never be able to usefully retrieve the data I stored, so it can't count as storage capacity. Also, even if we try and get "smarter" about use, this would be countered by its BCD focus that reduces effective capacity. But most importantly, by attempting this smarter use-case conversion, we'd now need a source that supports this, or we violate WP:SYNTH. --A D Monroe III (talk) 22:50, 5 September 2014 (UTC)
There is much confusion here over "BCD". IBM referred to their six-bit character code (which included almost 64 printable characters) as "binary-coded decimal". Their use of this term does not mean that only decimal digits were stored, so there is no "BCD focus that reduces effective capacity". The "zone bits" correspond to the 12- and 11-zone punches on the IBM punch card. See the character code table in the IBM 1401 article. Jeh (talk) 22:58, 5 September 2014 (UTC)
Hm, then maybe we can go with 3.75 MB and briefly describe how it was calculated in a note? I also don't count parity bits as something that should be used to store user-accessible data. Also, as Jeh already described, IBM practically misused BCD as a term (at least as we know it), as both letters and numbers were stored on RAMAC. Why would only numbers be stored? At least you can't sell something that can store only numbers to someone who wants to run a business with that thing. :) — Dsimic (talk | contribs) 23:24, 5 September 2014 (UTC)
My take: The parity bit should not get counted in usable capacity any more than do the CRC bits on a modern hard drive, or the parity stripes in a RAID array for that matter. So it is 5 million x 6 bits = 30 million bits. There are defensible arguments for calling this 3.75 million bytes. To refer to it as "five million characters" without explanation that these "characters" had only 64 possible values, not 256, is to invite a misunderstanding that the drive had the same usable capacity as a 5 MB Shugart drive, when, of course, it did not. Jeh (talk) 03:16, 6 September 2014 (UTC)
Yes, but "five million 6-bit characters" would be correct and also match the recommended usage. Zerotalk 04:05, 6 September 2014 (UTC)
I would then suggest "five million 6-bit characters, equivalent storage to 3.75 million 8-bit bytes", as "6-bit" by itself may not convey enough meaning to the non-technical reader. I would explicitly spell out "million" as "MB" is ambiguous, and besides, "five million BCD characters" is how IBM described the device iirc. Jeh (talk) 08:25, 6 September 2014 (UTC)
"8-bit" in "8-bit bytes" is pretty much redundant. — Dsimic (talk | contribs) 08:57, 6 September 2014 (UTC)
For all practical purposes today, yes. But it was not always so. Anyway, I think the phrasing I suggested has the advantage of parallel wording, and more directly conveys to the reader the reason for the 5 vs. 3.75 million difference. Good writing is not always about sweating things down to the absolute minimum number of words. Jeh (talk) 19:07, 6 September 2014 (UTC)
Yeah, that's why we also have "machine word" and (as you've suggested in the edit summary) "octet" terms. Agreed, if we take the route of explaining 6-bit characters and everything, an additional explanation could only help. — Dsimic (talk | contribs) 20:04, 6 September 2014 (UTC)
I don't agree that "8-bit byte" is redundant, although I prefer the term "octet". I've seen documentation that assumed byte sizes of 6, 7 and 12. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:50, 9 September 2014 (UTC)

I don't agree that bit count is the only common denominator for comparison. Addressable characters was arguably as important a metric. The 305 processed characters which back then came almost exclusively from punch cards, and their coding in 1956 did not require more than 64 characters. If one concocted a science fiction plot where one had to go back and fix an IBM 305 with a minimal modern drive, a 3.75 MB drive would not do while that 5 MB Shugart would work just fine. As for "I can only assume it wasn't sufficiently reliable without it" {the parity bit}, there is no reason to assume that. To begin with, the parity bit did not improve reliability. It only told you that a failure occurred. What were you supposed to do then? Stop processing, figure out what track was bad, find the original data and reload it? If the data on the drive was the result of several updates, that would be extremely tedious if not impossible. (Transaction logging was well into the future.) So if parity failures had occurred with any significant frequency, the drives would have been unusable. Also, I can't find any reference in any of the documents that the IBM 650 version of the drive, the 355, used a parity check. An alternate explanation for the parity bit is IBM's need to convince customers that it was safe to switch from punch card storage of their data, something customers had decades of experience with, to magnetic media. Note that the 6-bit BCD with parity format is exactly the same as IBM's 7-track tape format. The obvious solution for our article is to make clear the different comparison approaches. All I am suggesting is that when computing the huge improvement in drive capacity per dollar, a gigantic number, we start with the most conservative number and then point out the ratio is up to 33% bigger if one just looks at bits.--agr (talk) 21:41, 7 September 2014 (UTC)

Hm, if it's about replacing a RAMAC with the smallest possible modern HDD, there are no reasons not to create a translation layer that would map 6-bit "words" (or bytes) onto standard 8-bit "words" with no wasted bits; that way, 3.75 MB would still be enough as we can ditch parity bits and leave that to modern HDD's ECC functionality. Also, parity checks aren't usable for restoring corrupted data, for sure, but that at least made it possible to know that something was wrong with a RAMAC, and possibly repair it (I guess). — Dsimic (talk | contribs) 22:04, 7 September 2014 (UTC)
Arnold, a parity check certainly does improve reliability. A lot of read errors are ephemeral (electronic noise, physical vibration, etc), so the first step on getting the error is to try the read again. The chance of eventually getting the correct data off the disk is significantly greater than if there was no check. Maybe this was even done automatically by the device controller as all modern disk controllers do (do we have the documentation to determine that?). Another way that parity checks enhance reliability is that users can keep two copies of critical files and know which is correct if one gets a parity error. Zerotalk 01:17, 8 September 2014 (UTC)
FWIW, this is how IBM describes the characters of a 305 and the 350:

Within RAMAC, all data is read, transferred, and written serially by word, character, and bit. There are eight bit positions within each character position. They are identified as bits S, X, 0, 1, 2, 4, 8, and R. Bit S merely provides a space between the recording bit positions of each character, and is not used in the bit coding. Bit R has no numeric or alphabetic value, but is added to certain characters so that every character will have an odd number of bits. This convention makes possible a technique whereby RAMAC may perform a validity check on each character transferred.
See also Figure 86. for File [i.e. 350] Write and Read Waveforms.

RAMAC 305 Customer Engineering Theory Of Operations, IBM Corp, © 1959, p.7-8 and 85

Just as a not-so-important note, this quotation was already available in the section above. — Dsimic (talk | contribs) 23:54, 8 September 2014 (UTC)
A closer read of the IBM 305 documentation makes it clear that IBM was not doing any automatic error recovery on the 305. Also the 305 only had room for 200 instructions total, so keeping live backups was unlikely (and they would have cut the capacity of the system in half). The 6-bit plus parity bit encoding was used throughout the 305 and any parity failure halted the machine. The 350 disk drive just transferred the 7-bit data from and to the CPU's 100-character magnetic-core data buffer. The IBM 305 operation manual says (Ref 4, p.72) "Each character that enters or leaves the magnetic-core unit is checked to insure that it contains an odd number of bits. Because all information transfers (except certain arithmetic operational transfers) take place through the magnetic-core unit, the machine will recognize an error whenever an inadmissible character is transferred. Any combination of bits that give an even count will stop the machine and turn on the parity check light."
The IBM 350 disk system instead achieved reliability with a read and compare after write system. "The file check is a check on the recording of information on the disks. Whenever a record is written in the disk storage, the machine automatically rereads the same record into the core unit. Then the record is read back from the disk storage track and compared, character by character, with the re-reading of the record in the magnetic-core unit. A difference in comparison causes the file error light to be turned on and stops the machine." (p.73) The operator could try the write operation again manually: "The operator may attempt the transfer again by depressing the check reset key and then the program start key." Note that this file check approach does not depend on the parity bit. So it would seem that the IBM 350 drive was a reliable mechanism for storing 7-bit data, hooked to a CPU that then used one of those bits for additional error checking.
Again, I am not saying any one basis for comparison is the right answer, just that there is a good argument for each, and when computing the massive multiple in cost improvement from 1956 to now we should start with the most conservative number. I would also point out that the difference between using 5, 4.375 or 3.75 meg as the comparison point probably presents less variation than there is in establishing "current" disk prices.--agr (talk) 03:43, 9 September 2014 (UTC)
A block of data usable by a system at the bit stream level (the 350 interface and most interfaces into the 1990s) consists of a stream of serial gap bits, followed by a stream of encoded data bits, followed by a stream of check (and now correction) bits.[b] Today there are always many more encoded data bits than system usable bits; however, block capacity is always stated in (system usable bits)/8, and drive capacity is then in multiples of these blocks, rounded at MB thru TB as appropriate. We don't know why the designers of the 350 recording channel chose to intersperse some of their gap bits, S, and their check bits, R, within their encoded data bits,[c] but it really shouldn't matter since gap and check bits have never been counted to measure capacity available to a system. Since the 350 data bits are not encoded, the 350 sector has precisely 600 data bits per sector available to the system. Seventy-five bytes per sector in today's terms, no more, no less.
I'm not sure how the bits (bytes/8) are character encoded matters. Sectors read lately off the 350 at the Computer History Museum are likely stored as Unicode (implying a 10 MB current replacement capacity) and since disk storage is free the sectors themselves may be stored in 4k sectors (implying a 204.8 MB current replacement capacity), but the entire contents of the museum's 350 will fit into precisely 3.7504 MB of modern storage (7,325 512-byte sectors) or maybe 3.751936 MB (4k sectors), either rounding to 3.75 MB. Capacity required currently for replacement under various other coding schemes is interesting and might be worth discussing someplace, but does it make sense in this summary table?
Read after write was easy in tape but very difficult in disk, and it gradually disappeared as disk drives got more reliable.[d] Regardless, the 350 recording channel engineers achieved their targeted channel reliability, and how it was achieved really shouldn't matter in determining capacity. BTW, in terms of recording channel error rate the 350 was probably better than today's drives at soft error rate (say 10^-9 vs. 10^-6) but worse at hard error rate (say 10^-11 vs. 10^-14). Overall reliability improvement is also about 3 orders of magnitude (and more if u count RAID) but isn't that a different parameter from capacity? Tom94022 (talk) 18:01, 9 September 2014 (UTC)
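The sector arithmetic above works out as follows; a minimal Python sketch in which the 50,000-block count is derived from 5 million characters at 100 characters (600 data bits) per block, as described in this thread:

  blocks = 5_000_000 // 100               # 50,000 blocks of 100 characters
  bytes_per_block = 600 // 8              # 75 bytes per block in today's terms
  total = blocks * bytes_per_block        # 3,750,000 bytes
  sectors_512 = -(-total // 512)          # round up to whole 512-byte sectors
  print(total, sectors_512, sectors_512 * 512)   # 3750000 7325 3750400 (3.7504 MB)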
IMO 3.75 is the appropriate number, suitably footnoted as to derivation. In the end, I suppose I could reluctantly go along with using 5 Mchar as the divisor with a footnote to 3.75 MB or the other way around, but I see no justification at all for 4.375 (or 4.4). Tom94022 (talk) 18:01, 9 September 2014 (UTC)
Whatever we use to compare, it has to come from a source, not be based on our own reasoning, even if (as seems clear to me in some cases) we can do more informed reasoning than the sources, lest we violate WP:SYNTH. Let's just review the sources, and make a selection based on those. --A D Monroe III (talk) 19:24, 9 September 2014 (UTC)
Hm, I'd say that unfortunately we can't simply extract the size of the disk drive in megabytes from available sources, simply because it seems that back at the time it was much more important to express the capacity as an equivalent of punch cards, thus the only important thing was how many characters could be stored. As a matter of fact, that was the purpose of the first disk drive: to replace punch cards. — Dsimic (talk | contribs) 22:26, 9 September 2014 (UTC)
There are sources for 3.75 MB, 4.4 MB and 5.0 MB so our task is to see if we can get consensus as to whether there is a genuine dispute among the sources or whether all sources of one or more values are unreliable. For example, we might arrive at a consensus that the sources citing 4.4 MB are unreliable in this context because according to a contemporaneous and highly reliable source, the IBM CE manual, the recorded characters were 8 bits not 7 bits and in calculating capacity both in 1956 (capacity then in characters) and today (capacity in bytes) only data bits count - therefore, using 7/8 is improper and such sources are not suitable for Wikipedia in this limited context. Tom94022 (talk) 05:59, 10 September 2014 (UTC)
We have a policy on this Wikipedia:These_are_not_original_research#Conflict_between_sources. Basically in a situation like this where there are quality sources that appear to conflict, we should explain the different views of the question and not try to pick winners or losers. That would take a lot less effort than all this debate.--agr (talk) 20:09, 10 September 2014 (UTC)
If we go with including stuff from multiple different sources, what should we take as the value used in the comparison table? The average of three values? — Dsimic (talk | contribs) 21:00, 10 September 2014 (UTC)
I suggest using the most conservative number, 5 million, which produces ratios that everyone can agree represents a minimum net improvement. Then add a footnote below the table that says something like: “* These ratios are based on equating one six-bit character on the IBM 350 with one eight-bit byte on a modern disk drive. Comparing on a cost per bit basis, one should increase these ratios by a factor of 1.14 or 1.33 depending on whether one includes the seventh parity bit or not.”--agr (talk) 21:19, 10 September 2014 (UTC)
FWIW, that's something I could live with. — Dsimic (talk | contribs) 21:28, 10 September 2014 (UTC)
The cited policy states, "If reliable sources exist which show that another apparently reliable source is demonstrably factually incorrect, the factually incorrect material should be removed." We should be able to come to a consensus as to the demonstrably factual accuracy of 3.75MB, 4.4MB and/or 5.0MB as the equivalent capacity in modern terms. I believe this particularly applies to 4.4 MB, for which I can find no support for including parity in a capacity calculation, either in characters or bytes. See also WP:Inaccuracy
Since data bits are the lowest common denominator between 350 characters and today's bytes I suggest it is the better basis for comparison and should be the value used in the comparison table. It is certainly POV to select the primary value based upon a desire to show a minimum net improvement. It appears we agree 3.75MB is factual.
Whether we footnote the 5.0MB or not depends upon whether we can find a reliable source for the equivalence of one modern byte to one 305 character when measuring capacity. To a certain extent it is a bit of apples and oranges since a byte can have different character sets depending upon its code page. Did the 305 support binary operations (I'm going to research this)? If so, then perhaps we are falling into a semantics trap, since if one compared 6-bit words to 8-bit words I doubt anyone would say they were equivalent. Furthermore, to the best of my recollection, most of the 5.0 MB sources confuse character and byte, a demonstrably factual inaccuracy. As I said above, I guess I can live with 5.0MB in a footnote, but now I would add: if we can find a reliable source for the equivalency. Tom94022 (talk) 22:18, 10 September 2014 (UTC)


Arnold, I am not sure what you meant by "addressable characters" and how that can be used to normalize capacity. FWIW the 305/350 only had 48 characters (including "blank") whose code map looks nothing like ASCII or anything modern, e.g. "1" is 01H in the 350 vs 31H in ASCII, "A" is 03H vs 41H, etc. There is a 350 symbol "□", mysterious to me, with bit code 27H, which may not exist in ASCII although certainly 27H is a byte. Storage is character agnostic; there would have to be a translation layer from a modern byte to the 305/350 bit code, and that translation layer would work equally well with packed 305/350 characters as with unpacked characters. So yes, in our science fiction plot a 3.75 MB HDD would store the entire contents of a 350 in a packed format and perform well as a 350 emulator (keep in mind the 350 read and wrote 600 data bits per block while a modern disk drive reads and writes 4096 or 32,768 bits per block, so the fictional emulator has to do lots of parsing in its translation layer). If you meant addressable disk storage locations in a 350 then we all agree it is 5 million, but I don't see how the mapping of the bit values to characters is relevant to normalizing. It's the size of the location that counts; we state RAM size in bytes regardless of whether it is accessed in 32, 64 or 128 byte chunks. Isn't the size of the storage location in bits the best analogy? Tom94022 (talk) 05:54, 11 September 2014 (UTC)

Tom, here is the addressable character argument. The ratio we are arguing about is an economic one, and economic comparisons between different eras are typically done based on the normal practices in each era. The normal practice these days is to store one character per octet. If we were converting a 305 system to modern technology we’d just convert to ASCII or UTF-8. (BTW, that mysterious 350 symbol "□" with bit code 27H was called the lozenge and has a code point in Unicode.) As of a couple of years ago there was a company in Texas that still used punched card inventory control with IBM 402 accounting machines. When they eventually cut over they will undoubtedly convert their punched card records to octets on a one-for-one basis (unless they go to Microsoft software, which uses UTF-16, in which case they’ll need 2 octets per character). The character conversion needed is just a 48-character lookup table, much less complex than the translation layer you posit (try designing an 8-bit to 6-bit converter in 1956 vacuum tube logic). Note that ASCII is a 7-bit code, and it is hardly common practice to use such a translation layer. So that 5 mega-character drive back in 1957 arguably was providing the same economic benefit as a 5 megabyte drive would today.
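To make the lookup-table point concrete, here is a minimal sketch in Python; the two code points are the ones quoted above, the rest of the 48-entry table is elided, and the function name is mine:

    # Hypothetical one-for-one conversion of 6-bit 305/350 character codes
    # to modern characters via a 48-entry lookup table.
    CODE_350_TO_MODERN = {
        0x01: "1",   # per the code map quoted above
        0x03: "A",   # per the code map quoted above
        # ... the remaining entries of the 48-character set ...
    }

    def convert_record(codes_350):
        """Map a sequence of 6-bit 350 character codes to a modern string."""
        return "".join(CODE_350_TO_MODERN[c] for c in codes_350)

    print(convert_record([0x03, 0x01]))  # -> "A1"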
I agree with you that an economic analysis is one way to look at it; however, I conclude such an analysis arrives at 3.75 MB as the appropriate capacity. Storage is character agnostic, and arguably today it takes two octets per character. All drives today are bit serial, just like the 350. There is not necessarily a table lookup in either scheme if the characters were stored in their native bit form, either 6 bits/char or 8 bits/char. In either scheme the hypothetical controller would have to generate sector marks, insert a leading gap (since the first character cannot start at the sector mark) and strip the 600 or 800 bits out of a longer bit stream. The 6 bit/char version would have to generate parity and insert the blank, a rather trivial operation with today's microcontrollers. If for some reason the controller designer wanted to store the characters in today's Unicode then it would require 10 MB (thanks to the lozenge) and require a complex table lookup - altogether not a likely implementation. So any modern drive greater than 3.75 MB would provide the same economic value, and since we are talking economics shouldn't we go with the lowest value? Tom94022 (talk) 19:36, 11 September 2014 (UTC)
The question of seven bits vs six bits depends on where you draw the line between the 350 disk drive and the 305 processor. Remember, the disk drive units themselves neither generated nor checked parity bits. That was done in the CPU’s core memory buffer, which was used for multiple functions besides interfacing to the disk drive. There is no reason to think the parity bit was included just to look for disk errors; remember, this was a vacuum tube machine and tubes failed regularly. So it’s perfectly reasonable to describe the 350 as a 7-bit disk drive hooked up to a computer that operated on 7-bit bytes that include 6 data bits and one parity bit. That is the view taken by several reliable sources and it is not objectively wrong. It also fits with essentially the same disk drive being used with the IBM 650 to store 7-bit decimal numbers in bi-quinary format. If some museum succeeded in restoring a 350, it could be used to store 7-bit data directly. That’s not to say the six bit view is wrong either; reliable sources take that view as well, and the article can and should present all three viewpoints.
I don't know of any disk drive recording channel ever that did not include some form of error checking, parity, CRC and/or ECC. You really have no basis for expecting a restored 350 to reliably store 7 bit data directly. An imperfect analogy is the RLL versions of the ST506; by changing from MFM to RLL it would record 50% more data, but not reliably (as many hackers found out). This is WP:OR and unconvincing to this old recording channel engineer.
The line is drawn at the interface, which for data is essentially the same for all drives into the 1990s: bit-serial data. The seminal ST506 has a bit-serial data interface with a generally accepted specified capacity of 5 MB, but in fact the raw capacity is up to 6.2 MB, with the difference represented in check bits and gap bits, none of which are counted in the ST506 specified capacity and none of which should be counted in converting the 350. A very reliable source, IBM, tells us that the S bit is always zero and that the R bit "has no numeric or alphabetic value." In memory and storage capacity specifications parity or ECC bits are not included - think SDRAM with ECC, DRAM with parity, or any disk drive (including the 350, described by IBM as 5 million 6 bit characters, not 8 bit characters) - so without explanation any calculation using 7 bits is a mathematical error which should not be reproduced in Wikipedia.
I find it interesting that the 7 bit advocates ignore the 8th bit. IBM describes the 350 as a 5 million character machine having an 8 bit recorded character. Why is the P bit counted but the S bit not counted? Both are there; neither contributes to the character definition. My guess is that 7 bits is an urban legend generated by someone who didn't read the book. In any event, an advocate of seven bits is not accurately describing the 350, and this is another reason to say any such calculation is factually inaccurate and not suitable for Wikipedia. Tom94022 (talk) 19:36, 11 September 2014 (UTC)
When we calculate a comparison between the 350 and modern drives, we are pushing the OR boundary and it is reasonable to start with the most conservative view and then point out the number could be higher if calculated in different but reasonable ways. And we haven’t even gotten to putting in the effects of inflation.
As for the 305’s ability to operate on binary data, that is a story in itself. The 305 could only add and subtract. It did all its compare and testing operations via its plugboard control panel, where everything was done in punch card code using relay logic. The two high order data bits corresponded to the 12 and 11 rows on a card and those could be tested individually. The low order four bits were converted into the digits 0 to 9, so they could not be tested directly.
Anyway Tom, I am not saying that your way of looking at this is wrong, just that there are other reliably sourced ways that have a reasonable basis and we should present them all. —agr (talk) 14:52, 11 September 2014 (UTC)
I've designed, known and/or used many memory and storage devices over the years and I cannot think of a single memory or storage device where the specified capacity included check or gap bits. Word lengths do differ, but even when they do I can't think of a single modern case where the specified capacity was not expressed in the common language of bytes (with binary or decimal prefixes). For example, the DEC PDP-10 used a 36 bit word, but its disk drives were specified in both words and bytes (ignoring gap and check bits), converting on a bit basis. Similarly, a 2GB SDRAM has 2GiB regardless of whether the physical interface is 32, 64 or 128 bits and independent of the number of bits used for checking. Finally, a 750GB HDD provides 750 GB of user data whether it is bit serial (SATA) or word serial (PATA) and regardless of the number of spare, check and gap bits. It's tough to prove a negative, but we have a whole bunch of history that suggests parity and gap bits do not count in specifying storage capacity and no reliable source that says why they should.
I am saying that 4.4 MB is factually incorrect by a number of tests and cannot be used. There is an argument to footnote the 5 million characters (not 5 MB), but by any other test the capacity in modern terms is 3.75 MB. Tom94022 (talk) 19:36, 11 September 2014 (UTC)
If someone builds a dedicated word processor that uses a standard hard drive just to store data, with everything including the file system encoded as 7-bit ASCII with parity, would you say the hard drive was now a 7-bit drive?--agr (talk) 13:44, 12 September 2014 (UTC)
In that case, it's up to the word processor for using such a storage layout design, and that implies nothing about the underlying HDD. As an opposite example, what if someone used flash-based storage with such a word processor? As we know, capacities of flash-based storage products are also expressed in user-addressable bytes, despite the fact that virtually all such products include some amount of overprovisioning (even over 30%) and bend over backwards to present such an awkward internal structure through a nice and clean external interface. — Dsimic (talk | contribs) 17:42, 12 September 2014 (UTC)
No, a drive storing 8 bit bytes in blocks of 512 or 4096 bytes has its capacity measured in bytes. How many 7 bit characters it stores is another question and depends upon implementation. One way is to throw away one bit per byte and map 1 character into 1 byte; the drive capacity in MB is unchanged, the capacity in characters equals the number of bytes, and 1/8 of each byte is wasted. Another way is to pack strings of 7 bit characters into the standard size blocks, so a 4096 byte block stores 4,681 characters with a negligible loss of 0.003%, and the capacity in characters is essentially 8/7th that in bytes - the capacity in bytes is unchanged. To make it a 7 bit drive one would have to store an integer number of 7 bit characters per block.
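A quick sketch of the two schemes' arithmetic (Python; the variable names are mine):

    BLOCK_BYTES = 4096
    BLOCK_BITS = BLOCK_BYTES * 8            # 32,768 bits per block

    # Scheme 1: one 7-bit character per 8-bit byte, one bit discarded
    chars_unpacked = BLOCK_BYTES            # 4,096 characters per block

    # Scheme 2: 7-bit characters packed end to end across the block
    chars_packed = BLOCK_BITS // 7          # 4,681 characters per block
    loss_percent = 100 * (BLOCK_BITS - chars_packed * 7) / BLOCK_BITS

    print(chars_unpacked, chars_packed, loss_percent)  # 4096 4681 ~0.003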

In summary, here is where we might be:

3.75 MB has reliable sources and is factually accurate. I believe all except maybe the IP agree.
4.4 MB
I contend that although there are several sources for this number it is factually inaccurate, as follows:
IBM states a 350 character is recorded as 8 bits of which 6 are data bits (source: contemporaneous IBM CE Manual)
IBM discloses the 305 has a set of 48 characters mapping into 6 data bits (source: contemporaneous IBM Programming Manual)
Information theory confirms that 48 characters map into 6 data bits (no source right now, but can this be disputed? - see the sketch after this list)
One data byte as used in storage capacity specifications has 8 data bits (no source right now, but can this be disputed?)
It is factually inaccurate to equate one 350 character to 7 data bits. While there are multiple sources that make this equivalence, none explain their reasoning. All of these sources are much later than the contemporaneous IBM documents and none state their source for 7 bits. Accordingly the use of 7 bits must be considered factually inaccurate and any calculation based thereupon must be excluded according to Wikipedia policy.
5.0 MB
I agree that if the 305 characters are mapped as recorded (8 bits) into a modern drive it would require 5 MB. Unfortunately I cannot find a reliable source that says this. Every source I've looked at just makes a bare assertion; many just say "The HDD weighed over a ton and stored 5 MB of data" (31,100 hits).
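Putting numbers on the three candidate figures above, plus the bit-count sanity check promised in the 4.4 MB list (a minimal Python sketch; the 6/7/8 bits-per-character assumptions are exactly the ones in dispute):

    import math

    # Point 3 of the 4.4 MB list: 48 symbols need only ceil(log2(48)) = 6 data bits.
    assert math.ceil(math.log2(48)) == 6

    chars = 5_000_000  # IBM 350 capacity in characters, per IBM
    for bits_per_char, label in [(6, "data bits only"),
                                 (7, "data bits + parity"),
                                 (8, "as recorded")]:
        print(f"{label}: {chars * bits_per_char / 8 / 1e6} MB")
    # data bits only: 3.75 MB; data bits + parity: 4.375 MB; as recorded: 5.0 MB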

I suggest that both 4.4 and 5.0 have reached the point of urban legends, repeated without verifying the underlying facts. Nonetheless, I can accept going with 3.75 and $9,200 in the table with a footnote that reads something like:

"Other equivalent capacities have been reported such as 5.0 MB which corresponds to a one-to-one mapping of the recorded 350 character bits into a byte. Using 5.0 MB would reduce the price/MB from $9,200 to $6,900."

This assumes the 31,100 hits did do their due diligence, otherwise it might be OR :-) Tom94022 (talk) 19:06, 12 September 2014 (UTC)
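For the record, the price arithmetic behind that proposed footnote, using the $34,500 purchase price discussed earlier (a sketch):

    purchase_price = 34_500                 # IBM 350 purchase price, dollars

    print(round(purchase_price / 3.75))     # 9200 $/MB at 3.75 MB
    print(round(purchase_price / 5.0))      # 6900 $/MB at 5.0 MB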

Per your first sentence, I think there is a reasonable argument that it was the 305 designers' choice to use a character code with a parity bit, and that choice implies nothing about the underlying 350 HDD, which neither generates nor detects the parity bit and hence can be regarded as a 7-bit drive. Maybe you buy that view, maybe you don't, but it is certainly not "objectively wrong" as Tom claims and there is no basis to reject the sources who take that viewpoint.--agr (talk)
Arnold, your "designers' choice" statement is speculative and at best improbable WP:OR. Virtually all disk drives until the 1990s were incapable of distinguishing bits, be they gap bits (including header), check bits (parity, CRC or ECC) or encoded data bits yet drives were for the most part specified in terms of data (user) bytes with a given format and channel code. Again the ST506 was specified with MFM and any controller that could write RLL would apparently increase the capacity by 50% but as many found out it wasn't reliable and voided the warranty. You might have a better argument about the unused S bit; maybe it could have been used or maybe not, that is, unpersuasive to me original research and although u can make an original research argument here u need to gain consensus before it can be used in an article. I don't see why it is so hard, IBM said 6 data bits, 8 channel bits - 7 bits is not supported by any reliable source (albeit there are a lot of people repeating this urban legend). Tom94022 (talk) 21:18, 12 September 2014 (UTC)
The use of the IBM 350 parity bit as a data bit violates IBM's implicit specifications for the drive so such a calculation cannot be used as a basis for establishing a 4.4 MB capacity. Sort of like saying the ST506 was a 7.5 MB drive because a controller could write and read RLL data at the drive interface. I have a long explanation of why at my sandbox and if anyone wants to discuss my reasoning they can do it here or there.
Yet one more way of looking at things! I hope we all agree that MB means one million data bytes where the byte has 256 unconstrained states. According to IBM the IBM 350 recorded character has 6 data bits of which 48 states are used but 64 are available. So dimensional analysis goes like this:
(5,000,000 char/IBM350) * (6 data bits/char) / (8 data bits/data byte) / (1,000,000 data byte/million byte) = 3.75 MB of data
Advocates of any other value use some (other bits/char) dimension to arrive at a number that is (other bits/data bits) times 3.75 MB. Perhaps an interesting and relevant number, but not without the context of what the "other bits" are and in what context they are meaningful. In this context (recorded bits/char) could be meaningful, but (data bits + parity bit)/(char) doesn't have any meaning I can think of. Tom94022 (talk) 21:38, 13 September 2014 (UTC)

Yet another way of looking at the 350 Capacity

Modern disk drives are specified by the disk drive vendor in data bytes available to a system (1 data byte = 8 unconstrained data bits). IBM at that time specified the 350 as 5 million characters, with 6 unconstrained data bits per character at the drive's interface. For Wikipedia purposes this should be the primary value, with any other value, particularly something published 50 years later, subject to explanation.

There are many bytes and bits under the cover of a modern disk drive, but they are not disclosed by the vendors and most are not accessible by a system; even when they are, they are not normally used by the vendor in specifying the drive's capacity. For some time some bit-serial drives were specified by the vendors in two capacities, unformatted and formatted[e]. Even then it was the formatted capacity that most often defined the drive; two examples:

  • The ST506 is generally accepted as a 5.0 MB disk drive; it was specified by Seagate as 5.0 MB formatted and 6.38 MB unformatted. Using the constraints of the Seagate format it is possible to build a 6.1 MB ST506 and I am sure someone did. However, even if there were a reliable source for a 6.1 MB ST506 it would at best rate a footnote in the ST506 article. Note the ST-506 article states, "5 megabytes after formatting."
  • The IBM 3330 with its 3336-1 is generally accepted as a 100 MB disk drive even though IBM's publications acknowledge that this is with an IBM full-track record of 13,030 bytes per track; its capacity is less at any number of records per track greater than 1, and the stated capacity did not include 7 spare tracks. IBM's public maintenance literature for the subsystem discloses an unformatted track length of 13,440 bytes/track, corresponding to an unformatted capacity of 105 MB. DEC used the identical pack in a fixed block mode and only stored 83 MB. While there are reliable sources for capacities other than 100 MB I would argue most are of undue weight, but some may be worthy of a footnote. Note the 3330 section states "Its removable disk packs held 100 MB (404x19x13,030 bytes)"

Thanks to the unpublished original research of the RAMAC Restoration team we know the unformatted capacity of the IBM 350 was "about" 5,000 bits/track, for an unformatted IBM 350 capacity of 6.25 MB. It doesn't matter whether this number is exact or not, because it does allow us to put each of the known 350 bits in proper context. The 350 consisted of:

3.750 MB of formatted capacity (disclosed as 6 data bits per character)
0.625 MB of parity bytes (disclosed as 1 parity bit per character)
0.625 MB of space bytes (disclosed as 1 space bit per character)
1.250 MB of other gap bytes (not disclosed but some number is inherent in magnetic recording)

Totaling

6.250 MB of unformatted capacity (from RAMAC Restoration team)
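The breakdown sums correctly against the restoration team's figure (a trivial check in Python):

    formatted = 3.750   # MB, 6 data bits per character
    parity    = 0.625   # MB, 1 parity bit per character
    space     = 0.625   # MB, 1 space bit per character
    gaps      = 1.250   # MB, remaining gap overhead

    assert formatted + parity + space + gaps == 6.250   # unformatted MB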

It is factually correct that 4.375 = 3.75 + 0.625 and rounds to 4.4, but in the context of the 350 it is an incomplete statement of the unformatted capacity, so without such context it is obviously factually incorrect; even if there were a reliable source that placed it in context, using 4.4 MB would violate undue. We don't know how many check bytes are in a modern drive; we do know how many are in similar bit-serial drives, but we don't publish that in Wikipedia, so why do we care about those in the 350?

It is factually correct that 5.0 MB is the modern capacity needed to map the 8 bit recorded character on a bit-for-bit basis into an 8 bit byte of a modern disk drive. I'm not sure there is a reliable source for this mapping and I don't think it is particularly relevant, but I could accept placing it in a footnote if there is consensus that it should be used.

I will change the article to note the Capacity is "formatted"

Can I now take the dubious tag off the article? Tom94022 (talk) 20:28, 15 September 2014 (UTC)

No, this is disputed. The RAMAC350 capacity is 4.4 MB according to Claus Mikkelsen of IBM (another direct source, not calculated) who wrote that RAMAC 305 had "4.4 MB usable capacity."[22]

The following material was copied from the following section so as to separate two threads ----- Tom94022 (talk) 20:37, 18 September 2014 (UTC)

Also, that reference contradicts itself on 305 RAMAC capacity, later listing it as "5MB of storage" on two different pages. It also says its disk platters were '1" thick'! They were certainly not anywhere near 1 inch thick, as shown in the pictures right above this claim. --A D Monroe III (talk) 17:10, 16 September 2014 (UTC)
Also, a 2014 source asserting 4.4 MB without explanation is of questionable reliability with regard to a product last documented in the early 1960s whose manufacturer never used such a number. So anything other than 5 million characters is a calculation, and to be reliable the bases should be disclosed. Furthermore, the 355 was rarely if ever called a RAMAC by anyone, so the 650 discussion is a red herring. Anyhow, I have contacted both authors of the IP's latest unreliable source to see what they say. Tom94022 (talk) 18:02, 16 September 2014 (UTC)
I agree. There is reason to support "5 million 6-bit characters"; we can say "3.75 million bytes"; we could even say "30 million bits". We have to regard the "4.4 million bytes" claim as just mistaken. Jeh (talk) 19:21, 16 September 2014 (UTC)
I agree that there were definitely 5 million characters. But each character was not 6 bits. According to the references provided, including Al Shugart himself,[23][24] each character was 7 bits. This 1960 IBM document says IBM computers used Bi-quinary coded decimal (7 bits per character).[25]
Another red herring; we all agree that the 650 used Bi-quinary coded decimal - the article is about disk drives, not systems. You should read Section XIV, Code Translation; what we are doing is translating to an 8 bit unchecked byte code, which turns out to be easy since IBM says there were 6 data bits per character in the 350.
The full Al Shugart transcription states "7-bit code - 6-bit + 1 binary". Al probably said "parity", but we know this is an incomplete recollection in 2001 of something Al hadn't worked on in more than 50 years. Again, not a particularly reliable source. Tom94022 (talk) 00:11, 17 September 2014 (UTC)


Relevance of IBM 355 and IBM 650

Consider the following pictures, a direct way of looking at the 350 capacity that is clearly visible on the operator panel. They show how the IBM 650 and RAMAC represented each seven-bit digit (corresponding to a 4.4 MB total RAMAC305 capacity) in Bi-quinary coded decimal. Here are two references that support Bi-quinary coded decimal:[26][27]

IBM 650 – seven bits

— Two bi bits: 0 5, and five quinary bits: 0 1 2 3 4, with error checking. Exactly one bi bit and one quinary bit is set in a valid digit. In the pictures of the front panel below and in close-up, the bi-quinary encoding of the internal workings of the machine is evident in the arrangement of the lights – the bi bits form the top of a T for each digit, and the quinary bits form the vertical stem.

(the machine was running when the photograph was taken and the active bits are visible in the close-up and just discernible in the full panel picture)

 

IBM 650 front panel

 
Close-up of IBM 650 indicators
digit   bi-quinary code
  0     10-10000
  1     10-01000
  2     10-00100
  3     10-00010
  4     10-00001
  5     01-10000
  6     01-01000
  7     01-00100
  8     01-00010
  9     01-00001
71.128.35.13 (talk) 01:00, 16 September 2014 (UTC)
I'm not aware of any dispute about how the 650 stores data on the drum or in core, and I see nothing in the cited references to suggest that the character format of a 350 attached to a 305 is anything but six bits plus parity. There is certainly nothing in either reference to suggest that bi-quinary is relevant to the 305 or 350.
No parity bit was used here according to Tom94022 as of 01:11, 10 September 2014 (UTC) with reference to the 650 manual of instruction, "If as is likely it uses a bi-quinary coded decimal code which in modern terms is a self checking 7 bit channel code then no parity would be required." The RAMAC interfaced to the 650 computer, and stored punched card data in a 7 bit Bi-quinary coded decimal format. Here is an IBM reference dated 1960 that describes in great detail the error checking implemented with Bi-quinary coded decimal on the RAMAC computers of that time period:[28]
Please do not misleadingly paraphrase me; there is no evidence regarding the use of a parity bit in the 355; all I said is that it would not be required. Frankly, I suspect they used a space bit and 7 Bi-quinary coded decimal bits, but all of this is a red herring since the reference is to the beginning of HDDs, the first disk drive, the 350, not the 355. Although there is no evidence that IBM ever formally called either the 350 or the 355 a RAMAC, your discussing them as one could lead to misunderstandings, so please identify the drive you are discussing. Tom94022 (talk) 00:11, 17 September 2014 (UTC)
Au contraire, this usage is not a "red herring," because it is sanctioned by Big Blue. There is indeed documentary evidence that IBM formally called the 350 a "RAMAC," and my discussing them as one is fully sanctioned by the RAMAC 305 Customer Engineering Manual of Instruction, which says on page seven that they are integral: "Development of a machine to perform accounting functions by in-line processing has long been desired. However, the most fundamental requirement of such a machine is its ability to read, alter, and replace any of the file records in any random sequence. Such a machine was not practical until the development by IBM of the 350 Random Access File. This file is an integral part of the RAMAC." 71.128.35.13 (talk) 23:47, 17 September 2014 (UTC)
Regarding "an IBM reference dated 1960 that describes in great detail the error checking implemented with Bi-quinary coded decimal on the RAMAC computers of that time period", it does not say that. It references only the IBM 650. Not the 305. The same manual also describes many other coding schemes that were used on other IBM computers. Your statement makes it sound as if the manual talks only of BQCD and says it was used on all RAMAC computers; that is a misrepresentation of the source. Jeh (talk) 00:08, 19 September 2014 (UTC)
Regarding "RAMAC 350", that is simply not a product name that IBM ever used. That the 350 "random access file" was "integral" to the RAMAC 305 does not mean that "RAMAC 350" was a valid product name. "Fully sanctioned"? Nonsense. That's just you jumping to conclusions. It would be very helpful if everyone here would stick to actual product names. "RAMAC 350" makes it unclear whether you're referring to the disk drive (the 350) or the computer (and typo'd the number). Jeh (talk) 00:59, 19 September 2014 (UTC)
One strange thing is that your first reference cites the RAMAC as the fastest disk drive ever made, when in fact it was the slowest disk drive shipped by IBM and slower than any other disk that I'm aware of, with the exception of the RCA Data Record File. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:27, 16 September 2014 (UTC)
The "fastest hard drive ever made" characterization was written by an IBM expert in 2014. This may be understood to mean that this product was the fastest as of its 1956 date of manufacture. This is also a truism, because there was no prior hard drive (no competition). [29] 71.128.35.13 (talk) 23:04, 16 September 2014 (UTC)
If you view the RAMAC Oral History Project video at about 1:09:36 you will hear Al correct himself to 6 data bits + 1 parity bit; he apparently forgot about the 8th bit. According to an email conversation with an author of the Share citation, 4.4 MB came from a sign he observed without any further review. Neither is a reliable source for 1950s technology. Tom94022 (talk) 23:54, 18 September 2014 (UTC)
The RAMAC305 hard drive has 7 bits per character. You should read the RAMAC 305 Customer Engineering Manual of Instruction (page 9), specifically with regard to the RAMAC305 process drum: “Each character position is further broken down into the 7 bit system of coding.”[33] 71.128.35.13 (talk) 23:47, 17 September 2014 (UTC)

Each character represents the same information, no matter whether it is in the core memory, on the RAMAC hard drive, or on the punched card. According to the 1957 RAMAC305 Manual of Operation, “The IBM RAMAC is built around a random-access memory device that permits the storage of five million characters of business facts in the machine. In effect, the machine stores the equivalent of 62,500 80-column IBM cards.”[34]

What was on those cards? This would determine how much the RAMAC could store. It had five million characters on 62,500 punched cards: (5e6 / 62500) = 80 characters per punched card. Punched cards had 7 bits per bi-quinary coded decimal character as indicated in this reference: “The standard 80-column punchcard … stored about 70 bytes of data“[35] The RAMAC 305 Customer Engineering Manual of Instruction (page 9) likewise states that with respect to the 305 process drum “Each character position is further broken down into the 7 bit system of coding.”[36] So, total storage is (62,500 punched cards)[37] * (70 bytes per punched card)[38] = 4.4 MB, or (7 bits per character)[39] * (5 million characters)[40] * (1 byte / 8 bits) = 4.4 MB.

Numerous sources say that IBM650 and RAMAC had 7 bits per bi-quinary coded decimal character, “Each digit was represented in seven bit "bi-quinary" notation: one bit out of 5 represented a value from zero to four; one bit out of two indicated whether or not to add 5 to that value, giving the electronic equivalent of the abacus.”[41][42]

The American Society of Mechanical Engineers confirms that the RAMAC hard drive “contained a total capacity of 5 million binary decimal encoded characters (7 bits per character) of storage.”[43] This is 4.4MB (5e6 characters * 7 bits / character * 1 byte/ 8 bits). This was a “usable capacity” of 4.4 MB (not raw unformatted) according to an IBM expert.[44]

IBM equipment of that time, including the 650, RAMAC305 and RAMAC350 used bi-quinary coded decimal characters. Characters were the bedrock, the common foundation. The seventh bi-quinary coded decimal bit is not a parity bit. As Al Shugart (another ex-IBM RAMAC expert) said with respect to RAMAC305, “The coding used on the disk drive was a 7-bit code - 6-bit + 1 binary.”[45] 71.128.35.13 (talk) 23:47, 17 September 2014 (UTC)

No doubt about it: Bi-quinary coded decimal has 7 bits per character. 71.128.35.13 (talk) 23:25, 16 September 2014 (UTC)
And the 650 computer used bi-quinary because they wanted a machine that did arithmetic in decimal, and electronics to do addition and subtraction in bi-quinary were easier than those for BCD... but so what? Do you not understand that the character coding in an I/O device can be, and often is, different from that stored in the computer to which the device is attached? Punch cards have 12 possible bits per character position, but that doesn't mean the characters take 12 bits each to store when the computer reads them.
We have ample good references that the "five million characters" quoted for the 350 referred to six-bit characters. We have no good evidence that the 350 stored 7 bits per character, unless we count the parity bit... and since we don't count the ECC bits when describing modern hard drive capacities, we shouldn't count the parity bit in the 350. As for the 650 - the drive attached to the 650 was actually a 355, not a 350. All of this handwaving about the data format on the 355 is irrelevant to the 350. We can expect similar technologies, but there's no reason to expect them to be identical. So even if the 355 does store digits in bi-quinary form (to match the computer to which it was attached) there is no reason to extrapolate that to the 350. Jeh (talk) 01:04, 17 September 2014 (UTC)

Here's some good evidence from the RAMAC 305 Customer Engineering Manual of Instruction (page 9), specifically with regard to the RAMAC305 process drum, “Each character position is further broken down into the 7 bit system of coding.”[46]

That just about wraps it up: each RAMAC305 character has 7 bits. 71.128.35.13 (talk) 23:47, 17 September 2014 (UTC)

(Again and again and again...) The "RAMAC 305" refers to the computer (and its built-in drum storage). That says nothing conclusive about the data format on the 355 disk. Nor is there any reason to think the 355 disk and the 350 disk (the subject of this argument) were the same in this detail. If "that wraps it up", then I guess you're out of arguments, because this does not provide any evidence that the 350 held anything but 5 million six-bit characters. Jeh (talk) 23:55, 17 September 2014 (UTC) Apologies, I was confusing the "RAMAC 305" computer with the 650. It's the 650 that uses the 355. More in a few minutes. Jeh (talk)
OK, looking at the manual you linked... I think you did not read enough. "Seven-bit system of coding" is indeed mentioned on page 9. But this does not support your claim. The coding scheme is clearly illustrated on page 8. Certainly there are seven bits per character. But near the top of page eight, first column, it says
"There are eight bit positions within each character position. They are identified as bits S, X, 0, 1, 2, 4, 8, and R. Bit S merely provides a space between the recording bit positions of each character, and is not used in the bit coding. Bit R has no numeric or alphabetic value, but is added to certain characters so that every character will have an odd number of bits." (emphasis added)
In other words, it's a six bit code plus a parity bit (bit R) and another bit (bit S) for "spacing".... eight bits total. There is no indication that the 350 disk unit (they call it a "file") stores characters in any other fashion.
Looks like my conclusion was valid anyway: If "that wraps it up", then I guess you're out of arguments, because this does not provide any evidence that the 350 held anything but 5 million six-bit characters... if we're not counting the parity bit or other "overhead" bits. (And we shouldn't, just as we don't count ECC bits in a modern HD.) Jeh (talk) 00:40, 18 September 2014 (UTC)
Look again. There is no parity bit in Bi-quinary coded decimal: it's a binary bit. It's very clearly seven bits per character in the Bi-quinary coded decimal Wikipedia article.
At the risk of repetition, this replies to your claim of six bits per character. It should be noted that you haven't deigned to provide here any reference supporting the claimed six bits per character. Mountains of evidence, from the RAMAC 305 Customer Engineering Manual of Instruction[47] to Al Shugart[48] to Claus Mikkelsen of IBM[49] support seven bits per character.
Keep in mind that the IBM 350 is an "integral" part of the RAMAC 305, according to the RAMAC 305 Customer Engineering Manual of Instruction: "Development of a machine to perform accounting functions by in-line processing has long been desired. However, the most fundamental requirement of such a machine is its ability to read, alter, and replace any of the file records in any random sequence. Such a machine was not practical until the development by IBM of the 350 Random Access File. This file is an integral part of the RAMAC."[50]
Each character represents the same information, no matter whether it is in the core memory, on the RAMAC hard drive, or on the punched card. According to the 1957 RAMAC305 Manual of Operation with respect to the hard disk drive, “The IBM RAMAC is built around a random-access memory device that permits the storage of five million characters of business facts in the machine. In effect, the machine stores the equivalent of 62,500 80-column IBM cards.”[51]
What was on those 62,500 cards inside the RAMAC350? This would determine how much the RAMAC350 could store. It had five million characters on 62,500 punched cards: (5e6 / 62500) = 80 characters per punched card. Punched cards had 7 bits per bi-quinary coded decimal character as indicated by this reference: “The standard 80-column punchcard … stored about 70 bytes of data“[52] The RAMAC 305 Customer Engineering Manual of Instruction (page 9) likewise states that with respect to the 305 process drum “Each character position is further broken down into the 7 bit system of coding.”[53] So, total storage is (62,500 punched cards)[54] * (70 bytes per punched card)[55] = 4.4 MB, or (7 bits per character)[56] * (5 million characters)[57] * (1 byte / 8 bits) = 4.4 MB.
Numerous sources say that IBM650 and RAMAC had 7 bits per bi-quinary coded decimal character, “Each digit was represented in seven bit "bi-quinary" notation: one bit out of 5 represented a value from zero to four; one bit out of two indicated whether or not to add 5 to that value, giving the electronic equivalent of the abacus.”[58][59]
The American Society of Mechanical Engineers confirms that the RAMAC hard drive “contained a total capacity of 5 million binary decimal encoded characters (7 bits per character) of storage.”[60]
This is 4.4 MB (5e6 characters * 7 bits / character * 1 byte/ 8 bits). This was a “usable capacity” of 4.4 MB (not raw unformatted) according to an IBM expert.[61]
IBM equipment of that time, including the 650, RAMAC305 and RAMAC350 used bi-quinary coded decimal characters. Characters were the bedrock, the common foundation. The seventh bi-quinary coded decimal bit is not a parity bit. As Al Shugart (another ex-IBM RAMAC expert) said with respect to RAMAC305, “The coding used on the disk drive was a 7-bit code - 6-bit + 1 binary.”[62] 71.128.35.13 (talk) 00:52, 18 September 2014 (UTC)
No, YOU look again. Now you're confusing the RAMAC305 and the 650. The reference YOU gave for the RAMAC305 does not say one word about "bi-quinary coded decimal", nor about any character coding scheme that could be confused with that, and the "seven-bit system of coding" includes a parity bit.
I mean, really. What part of "Bit R has no numeric or alphabetic value, but is added to certain characters so that every character will have an odd number of bits" leads you to believe that Bit R is the seventh bit in a BQCD code? (That's a direct quote from the IBM RAMAC305 manual that you linked. Top of page 8, first column.) How can it be a bit in a BQCD code if it "has no numeric or alphabetic value"? That is an excellent description of a parity bit, on the other hand.
Look at the character code chart at the bottom of page 8. It clearly shows that the bits are not weighted as in BQCD, but rather in binary: 1, 2, 4, 8. Digits 1 through 9 were represented in pure BCD: "3" by bits 1 and 2; "5" by bits 1 and 4; "6" by bits 2 and 4; etc. There is a special case for zero, using the "0 bit", so it is not quite pure BCD but it is flatly NOT bi-quinary coded decimal. And just as clearly, the "R" bit that you claim is the seventh bit for BQCD has no role to play in the numeric values! If it did, then "3", being represented by bits 1, 2, and R, would not have a value of 3 at all. The IBM 305 manual furthermore states, as I quoted, that the "R" bit (the seventh bit) is a parity bit (though not using that word), and the character code table proves it.
Here, let me make it easy for you. Here are the bit codes for the digits 0 through 9, from the RAMAC 305 manual that you linked:
        X   0   1   2   4   8   R
  0         1
  1             1
  2                 1
  3             1   1           1
  4                     1
  5             1       1       1
  6                 1   1       1
  7             1   1   1
  8                         1
  9             1           1   1
Honestly, does that look like BQCD to you? What flavor of BQCD has an "8" bit and no "5" bit or "3" bit, represents "3" with bits "1" and "2", or "9" with bits "1" and "8"? Answer: None. This is BCD with a special case for zero.
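For what it's worth, the chart can be checked mechanically. Here is a minimal sketch in Python, transcribing the digit rows from the table above; it confirms that every code has an odd number of bits, that bit R is set exactly when the data bits alone would be even - i.e., R behaves as an odd-parity bit - and that the data bits carry plain binary weights:

    # Digit codes from page 8 of the 305 manual: the named bits set for each digit.
    DIGITS = {
        0: {"0"},
        1: {"1"},
        2: {"2"},
        3: {"1", "2", "R"},
        4: {"4"},
        5: {"1", "4", "R"},
        6: {"2", "4", "R"},
        7: {"1", "2", "4"},
        8: {"8"},
        9: {"1", "8", "R"},
    }

    for digit, bits in DIGITS.items():
        assert len(bits) % 2 == 1                          # odd total bit count
        data = bits - {"R"}
        assert ("R" in bits) == (len(data) % 2 == 0)       # R is an odd-parity bit
        if digit != 0:                                     # zero uses the special "0" bit
            assert digit == sum(int(b) for b in data)      # binary weights, not BQCD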
Or are you just going to pretend that the manual you linked doesn't count, or doesn't exist?
This is a "horse's mouth" reference; there is no other possible interpretation; and any reference that states, implies, or can be reasonably interpreted to mean that the 305 used bi-quinary coded decimal is therefore clearly wrong. (And I expect that Al Shugart meant "parity" where he said "binary". A statement of "six bits, plus one binary" would make little sense - what, the six bits aren't binary too? As for the slide show by Mikkelson...that's cute, but it hardly trumps the IBM reference material. I expect Mikkelson just misremembered (or believed Shugart). Jeh (talk) 01:40, 18 September 2014 (UTC)
And, I just have to respond to this:
"In effect, the machine stores the equivalent of 62,500 80-column IBM cards.... Punched cards had 7 bits per bi-quinary coded decimal character as indicated by this reference:"
Your assertion of BQCD on punched cards is nonsense. Have you ever even looked at one?! IBM punched cards had 80 characters, yes, but they did not use BQCD coding. Each character was represented by digit punches 0 through 9, which for digits simply represented themselves; I think you will agree that there is not a hint of BQCD (or BCD) there. Then there were two "zone punches" called "11" and "12", but these played no part in numeric representations, except that an "11 punch" over the units digit could denote a negative number. (By itself, it represented the minus sign character.) These were combined with the digit punches in various ways to produce letters and a small set of special characters. For more confusion, for some characters, the "0" punch became a zone punch too. Take a look at the character code chart in the IBM 1401 article. (Like the 305, the 1401 made a special case of the internal storage of "0". But instead of having a "zero bit", it represented "0" in core with the "2" and "8" bits set. This probably had something to do with its implementation of BCD arithmetic.)
Your claim that "Punched cards had 7 bits per bi-quinary coded decimal character as indicated by this reference:" is particularly egregious. First, the slide show you referenced doesn't say a word about BQCD, nor "7 bits". If punched cards were coded using BQCD then each column could only have held one BQCD character - i.e. a digit from 0 through 9 - and that was of course not the case. Coding of the 64-character set supported by IBM cards would have required two columns per character, and that was of course not the case either.
Conclusion: Your claim of BQCD on punched cards is ridiculous. A more realistic interpretation is that by the usual character coding, each column could represent one of about 64 different characters, i.e. 6 bits; that's 480 bits per card, or 60 bytes. BUT! I mightily doubt that IBM was worrying about any of that when they said "62,500 80-column cards", 8-bit bytes not being in common use or maybe not even thought of yet. Hm, one could also think of a punched card as holding 80x12 = 960 bits; that's 120 bytes! (And it is possible on some later IBM machines to read and punch cards and use every possible bit.) But... nah. The simple fact is that when you read a card into a computer memory by the usual character coding, it takes 80 character positions in the memory. 62,500 x 80 = 5,000,000 characters, each of which has 64 possible values, encoded in six bits. Hence the claim that the 350 held the equivalent of 62,500 IBM cards. It really is that simple. Your machinations to use this quote to support 7-bit characters (or 4.4 MB) on the 350 are a circular argument; you're assuming 7-bit characters from the beginning. But they're not. The RAMAC 305 manual proves it. Jeh (talk) 04:08, 18 September 2014 (UTC)
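The card arithmetic, spelled out (a sketch; the figures are the standard ones above, not anything from the slideshow):

    columns = 80
    bits_per_column = 6        # ~64-character set -> 6 bits of information per column
    punch_rows = 12            # raw punch positions per column

    print(columns * bits_per_column / 8)   # 60.0 bytes by the usual character coding
    print(columns * punch_rows / 8)        # 120.0 bytes counting every raw punch position
    # No character coding ever used on cards yields the slideshow's 70 bytes.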
Nailing it to the wall: Re the 650... We know how digits 0 through 9 were represented, in BQCD. But what of other characters? I have been unable to locate a full character code chart for the 650, but numerous references, including our own WP article, state that characters other than digits were represented as pairs of BQCD digits! Thus it took 14 bits to represent a character out of a set of 100 possible codes, and the machine's fixed-length 10-digit "word" could hold just 5 characters. This is absolutely not storing one character in every seven bits, and it is totally different from the 305's internal character set.
One might ask: But why couldn't they store an arbitrary character in one seven-bit BQCD digit position? After all, seven bits give you 128 possibilities, not just the ten digits represented by BQCD! The answer is that those seven bits are always interpreted by the machine as representing a decimal digit in BQCD form (even if software is interpreting pairs of digits as characters). So there are only a small number of valid configurations. That's the "self-checking" aspect, and it's why they didn't need a parity bit. But if you took advantage of the 128 possible arrangements of seven bits to store (for example) the entire ASCII-7 character set, almost all of them would not be valid BQCD configurations and would cause the machine to raise an error flag ("invalid digit" or some such).
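The self-checking property is easy to quantify (a sketch in Python): of the 128 possible arrangements of seven bits, only the ten with exactly one bi bit and exactly one quinary bit set are valid digits, so almost any corruption is detectable:

    from itertools import product

    def is_valid_bqcd(bits):
        """bits: 7-tuple of 0/1; first two are the bi bits, last five the quinary bits."""
        return sum(bits[:2]) == 1 and sum(bits[2:]) == 1

    valid = [b for b in product((0, 1), repeat=7) if is_valid_bqcd(b)]
    print(len(valid))   # 10 legal digit configurations out of 2**7 = 128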
So, your claim that characters were any sort of "common foundation" among these early machines is pretty much belied by the 650. Not only is its character coding radically different from the 305's, but also, as a character-processing machine, it's an absurd design. Besides taking two digit positions to store one character, it was a fixed-length word (10 digits) machine, so individual digits (and therefore individual characters) were not addressable. This makes the handling of both arbitrary-length character strings and of individual characters very awkward. But it does lead to a better understanding of the info we have on the 355: "six million digits", or three million characters, using the 650's coding scheme of two digits per character. That's probably why they never quoted the 355's capacity in characters. Stated that way it looks like barely more than half the capacity of the 350.
Back to the RAMAC 305 and its 350 "file": (By the way, although colloquialisms do abound, in terms of official product names there was no "RAMAC 350"; it was the "RAMAC 305" and the "350 file" that was attached to it - check the manual.) Your assumption that all of these early IBM machines used the same internal representation for characters, and that information on the 650's use of BQCD applies equally to the 305, is very clearly flat-out wrong. And all of your conclusions that depend on that assumption are therefore also wrong. They were just very different machines.
In sum: The 305 absolutely did not use BQCD. The manual never mentions BQCD and the character code chart (page 8 of the manual) does not illustrate anything that could be interpreted as BQCD. Bits are called 1, 2, 4, 8, 0, X, and R. That's seven bits, and the manual does say "seven-bit coding", but per the manual, the seventh bit (R) is a parity bit, carrying no numeric or alphabetic value. So the 350's "5 million characters" are five million six-bit characters, each with a parity bit, which we do not count when counting its usable capacity. The fact that there is no parity bit in BQCD is completely irrelevant, as this machine did not use BQCD (regardless of how many other IBM machines did). QED. Jeh (talk) 20:22, 18 September 2014 (UTC)
No, each 80-column punched card had 70 bytes, not 60 bytes, as stated in this reference: http://www.extremetech.com/computing/90156-the-history-of-computer-storage-slideshow/2
So RAMAC 350 stored (70 bytes / punched card) * (62,500 punched cards) = 4.4 MB formatted capacity. 71.128.35.13 (talk) 19:34, 18 September 2014 (UTC)
IBM intertwined the 305 RAMAC and 650 RAMAC, or as you put it "confused" them, by their deliberate strategy. On September 14, 1956 Thomas J. Watson, Jr. of IBM announced, or as you put it "confused," the 305 RAMAC and 650 RAMAC (they use nearly the same hard disk drive and the same bit-encoding-method to handle characters from punched cards) in the following IBM press release:[63]
"... revolutionary new products to accelerate the trend toward office and plant automation were announced today by International Business Machines Corp.: 305 RAMAC and 650 RAMAC, two electronic data processing machines using IBM's random access memory, a stack of disks that stores millions of facts and figures..."[64]
"Headline: 650 RAMAC and 305 RAMAC \\ The 650 RAMAC and 305 RAMAC both utilize the magnetic disk memory device announced as experimental by IBM a year ago. ..."[65]
"The 650 RAMAC combines the IBM 650 Magnetic Drum Data Processing Machine with a series of disk memory units which are capable of storing a total of 24-million digits. The 305 RAMAC is an entirely new machine which contains its own input and output devices and processing unit as well as a built-in 5-million-digit disk memory."[66] 71.128.35.13 (talk) 19:38, 18 September 2014 (UTC)

We must take your six-bit claim that "bits are not weighted as in BQCD, but rather in binary: 1, 2, 4, 8" on the RAMAC with a grain of salt (shake, shake, shake). It's just an oversalted "two bit" claim.

RAMAC 650 actually had seven bits per character. The IBM 650 "was a two-address, bi-quinary coded decimal computer".[67]

Alex Laird has dismissed the (six-bit) binary claim on the 650 RAMAC: "These days, all computer hardware is designed for two-bit binary communication. However, before massive amounts of standardization occurred in the technological realm, a few computers tinkered with the idea of making a computer run on hardware that wasn’t base two. These computers, the Colossus, the UNIVAC, and the IBM 650, to name a few, were coded using bi-quinary coded decimal. Of these, the IBM 650 is the only one that was mass-produced."[68]

Laird goes on to refute the six-bit binary claim on the RAMAC 650 once more, "The hardware communicated using bi-quinary coded decimal instead of in binary coded decimal as all modern computers (and even most historical computers) do."[69]

That salt you're shaking should help with the flavor of this dish of crow. 71.128.35.13 (talk) 19:42, 18 September 2014 (UTC)

IBM never officially referred to the 355 as a RAMAC, nor is there any evidence that it was the disk drive the industry "started with" as the column is captioned in the article. RAMAC is ambiguous! As near as I can tell IBM used it in product and marketing materials only with the 305, 650 and the much later array. What is uniquely identified as the IBM 350 disk storage Model 1 has been frequently referred to as the RAMAC, so it has that meaning also. I made a mistake in captioning this section with "RAMAC" - it should have been "IBM 350." The IP knows this and continues to use RAMAC ambiguously to assert his point of view. May I suggest the IP is a disruptive user in deliberately and continuously using RAMAC in a confusing manner, and we should now proceed to a request for comment? Tom94022 (talk) 20:37, 18 September 2014 (UTC)
IP: Regarding the punched card issue... ah, another cute slideshow, and this one from a tech blog. This number of theirs is just wrong. Simple arithmetic: 80 columns, each of which represents one of about 64 possible characters. It takes six bits to count from 0 through 63, so that's six bits per column. 80 x 6 bits = 480 bits = 60 bytes total of information if it was packed into eight-bit bytes. How they got to "70 bytes" is a mystery, but however they did, it's obviously wrong. To get there, each column would have to encode 7 bits - i.e. one of 128 possible character codes, rather than one of 64. And no commonly-implemented character code ever defined for punched cards ever did that. So it's six bits per column. 70 bytes per card? Rejected. Jeh (talk) 22:12, 18 September 2014 (UTC)
Aside: I worked a LOT with punched cards and I wanted to work with text processing systems; I would have loved to have even the 95 printable characters of ASCII-7 available on punched cards. (n.b.: I regard "space" as printable, but not "del".) They were not. Yes, IBM's EBCDIC codeset did define punch combinations for all possible 8-bit internal byte values, but there was no keypunch ever made that would punch them... so we still were limited to about 64 possible characters, just like on the 1401 but with a few different punch combinations. And I assure you, IBM punched cards absolutely did not use BQCD!
For the rest of it... Honestly, falling back on claims that "IBM deliberately confused" the 350 and the 355 just looks like thrashing on your part. We have well established statements of "six million digits" capacity for the 355, and "five million characters" for the 350. The 355 was used with the 650, which did use BQCD; it was clearly a "digit-oriented machine".

Coalescing the argument

But this is about the 350. The 350 held 3.75 million end-user usable bytes, formatted as 5 million 6-bit characters, each with a parity bit. The core of the proof - it's not an argument any longer - is as follows:
  1. The 350 goes with the 305.
  2. The 305 has a character set consisting of digits in the range 0 through 9, letters A through Z, and a handful of symbols like currency symbol, slash, etc. (Character code chart on page 8 of the RAMAC 305 manual).
  3. Bits in each character position are called 1, 2, 4, 8, 0, X, and R. (Character code chart on page 8 of the manual).
  4. That's seven bits, but per the text at the top of page 8 of the manual, the seventh bit (R) carries no numeric or alphabetic value. The manual's description of bit R shows clearly that it is a parity bit, computed to give each character stored an odd number of "1" bits. The character code chart confirms this.
  5. So there are six data bits per character.
  6. It is true that there is no parity bit in 7-bit BQCD, but this is irrelevant, as BQCD is simply not used in the 305.
  7. When IBM says the capacity of the 350 is "5 million characters", this means five million characters of the same format used in the 305.
  8. Claims of 7 bits per character in the 350 are widespread in nontechnical sources. However, if this count did not include a parity bit, then each character would have 128 possible values, not 64... and this does not match the character code set of the 305, nor any other manifestation of reality re the 305 or 350. Hence "7 bits per character" means "6 bits plus one for parity", just as it does in the 305 manual.
  9. This is further supported by pages 70 and 71 of the 305 RAMAC Random Access Method of Accounting and Control Manual of Operation,[13] which is explicitly describing "The method of coding these characters on the disk and drum tracks" (emph. added) and clearly describes the seventh bit on the IBM 350 disk as a parity bit. The character code chart provided on page 71 confirms it; there is no "data significance" to bit R; it is the seventh of seven bits shown, and it is a parity bit, matching the description in the preceding text. This is a "horse's mouth" document and there is just no wiggle room here.
  10. So, five million characters x 6 bits each = 30 million end-user usable bits.
  11. 30 million bits at 8 bits per byte is 3.75 million bytes.
This conclusion is inescapable from this sequence, and I see nothing in any of your comments to refute any of the steps. If you still disagree with this conclusion, please indicate with which numbered point(s) immediately above you disagree; and why; and provide reliable references for your position. Please note: Anything about the 650 or the 355 or BQCD is irrelevant; responses regarding any of those will be interpreted as non-responsive. Jeh (talk) 22:12, 18 September 2014 (UTC)
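Steps 10 and 11 are routine arithmetic (a minimal Python sketch using the constants from the proof above):

    characters = 5_000_000        # "5 million characters" per the IBM 350 spec
    data_bits = characters * 6    # 6 data bits each; the 7th is a parity bit
    print(data_bits)              # 30,000,000 bits (step 10)
    print(data_bits / 8)          # 3,750,000 bytes = 3.75 million bytes (step 11)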

Counter argument

Re: 1 We are talking about the history of hard drives. As with modern hard drives, the 350 was a separate unit within the 305 system. The same mechanism was used in the 355 and 1405.
  • Agreed, we are talking about the history of hard disk drives, not the systems to which they attach, and that the 350 was sold separately as a component of the 305 RAMAC system. While the 355 and 1405 used many of the same parts as the 350 they were not the same mechanism. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
All sources I am aware of say the same mechanism was used, in particular the CHM RAMAC Oral History.--agr (talk) 21:35, 19 September 2014 (UTC)
Oh, come now. From http://www-03.ibm.com/ibm/history/exhibits/storage/storage_1405.html :

The IBM 1405 Disk Storage of 1960 used improved technology to double the tracks per inch and bits per inch of track -- to achieve a fourfold increase in capacity -- compared to the IBM RAMAC disk file of 1956. Storage units were available in 25-disk and 50-disk models, for a storage capacity of 10 million and 20 million characters, respectively. Recording density was 220 bits per inch (40 tracks per inch) and the head-to-disk spacing was 650 microinches. The disks rotated at 1800 rpm. Data were read or written at a rate of 17.5K bytes a second.

Four times the data density. Heck, the rotation speed wasn't even the same! Oh, I'm sure they were "the same" at the "30,000 foot" level, but "same mechanism?" Rubbish. Jeh (talk) 01:09, 21 September 2014 (UTC)
2. Note that there are only 48 possible characters in the 305 character set.
  • While true it is irrelevant to the history of hard disk drives. The drive was specified to record blocks of 100 8-bit characters with additional requirements on the gaps before and after the data block, all laid out in the CE manual. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
3. Agreed
4. This is the crux of the issue. The 350 neither generated nor was aware of the parity bit. That was internal to the 305 CPU. Nor is there any indication that the parity bit was needed or used specifically to enhance the 350's reliability; indeed, any parity failure halted the CPU. If someone today built a text processing machine that exclusively used 7-bit ASCII with parity and stored that text on a modern hard drive, we would not say the hard drive then became a 7-bit drive.
  • The crux of the issue is that the 305 CE manual describes 8 bits, not seven, for the 350, as in fact it does for other components of the 305. It is partially true that the "350 neither generated nor was aware of the parity bit," but the whole truth is that the 350, like most if not all later HDDs until the 1990s, neither generated nor was aware of the meaning of any bit. There is no justification for including the parity bit and not the space bit. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
The justification for including the parity bit and not the space bit is that many sources say the drive was 7 bit. Your interpretation of the CE manual is WP:OR, unless you have a source that agrees with your interpretation. Note that on page 189 of the CE manual it says the space bits are eliminated by a special circuit before the data gets to the core memory.--agr (talk) 21:35, 19 September 2014 (UTC)
  5. Strictly speaking a 305 character is not 6 bits of information. There are only 48 possible characters, so the information content per character is log2(48) ≈ 5.585 bits per character.
If the measure is information capacity, it is relevant. That is how information content is measured.--agr (talk) 21:35, 19 September 2014 (UTC)
7. Agreed, BQCD is a red herring
8. yes
Agreed Tom94022 (talk) 18:08, 19 September 2014 (UTC)
9. Not sure what you mean by "nontechnical sources." We don't normally get to pick and choose between reliable sources; instead we present both positions and their arguments. Also no one is disputing that the parity bit was stored on the 350. But again that parity bit is only generated and tested in the 305 CPU, not the disk drive. The 350 could store any 7 bit pattern, and there is a suggestion from a timing chart in the CE manual that 8 bits were possible.
There was no reason for IBM to describe the drive at the time as anything other than 5 million characters. That is what they were selling. Wikipedia is based on secondary sources and we prefer the opinions of modern experts who are knowledgeable about old and new technology, e.g. Al Shugart.
10. The 6 data bits were not "end-user usable." There was no way to access the binary code from the stored program or the plugboard. So arguably five million characters x 5.585 bits each = 27.9 million end-user usable bits.
  • Any purchaser of an IBM 350 could access the 6 data bits. The fact that an end user of a 305 could not, while true, is irrelevant to the history of hard disk drives. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
Yes, but it is also true that any purchaser of an IBM 350 could access the 7 data bits. It is not clear if this was true for 8 bits. The space bit was suppressed in hardware.--agr (talk) 21:35, 19 September 2014 (UTC)
"any purchaser of an IBM 350 could access the 7 data bits." This appears to me to be OR on your part, or perhaps rather WP:SYNTH from the info in the 305 and 350 manuals. You need to provide a RS for this claim, one that is as authoritative as the 305/350 manuals we're already referencing. Jeh (talk) 00:56, 21 September 2014 (UTC)
11. If so the 27.9 million bits at 8 bits per byte is 3.49 million bytes. My point is that there is no one "right" way to look at this. Our article should represent the different viewpoints of the sources. --agr (talk) 15:00, 19 September 2014 (UTC)
  • While 3.49 MB is one valid way to look at the capacity it is irrelevant to the history of hard disk drives. There are several ways; 4.4 MB is an obviously incorrect one. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
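For reference, the information-content arithmetic in points 5 and 10 of the counter argument works out as follows (a minimal Python sketch; it assumes, as the log2 measure does, that all 48 characters are equally likely):

    import math

    characters = 5_000_000
    info_per_char = math.log2(48)            # ~5.585 bits of information per character
    total_bits = characters * info_per_char  # ~27.9 million bits
    print(total_bits / 8 / 1_000_000)        # ~3.49 million bytes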
In summary, the 305 is as irrelevant as the 650. The 350 was sold as a separate product and is not limited by the 305 character set. Absent any evidence to the contrary, we have to accept the limitations of the IBM format as clearly disclosed in the 1959 IBM CE manual, namely that both the parity bit and the space bit are required. There is no basis for an assumption that the parity bit was not needed; it is original research and likely incorrect. The analogy is most every drive thereafter into the 1990s, e.g., the ST506, which had a specified unformatted capacity (i.e., 6.38 MB for the ST506) and a capacity specified by the manufacturer in a given format (i.e., 5.0 MB for the ST506). The only issue here is how to convert 5 million characters of 6 data bits each into bytes as we currently use the term, and there are only two answers, 3.75 MB or 5.0 MB, the latter having no reliable source. Tom94022 (talk) 18:08, 19 September 2014 (UTC)
The operations manual tells us that the parity bit is generated and tested in the 305 CPU for all data transfers, not just to and from the Disk Drive. That is no more OR than your 8-bit claim. Remember the argument is about your attempt to throw out certain sources as objectively wrong. That is a very hard case to make on Wikipedia and requires more than your (or my) reading of WP:primary sources: " All interpretive claims, analyses, or synthetic claims about primary sources must be referenced to a secondary source, rather than to an original analysis of the primary-source material by Wikipedia editors." --agr (talk) 21:35, 19 September 2014 (UTC)
It is not OR to quote the manual, a primary source, which clearly states the 350 recorded character is 8 bits. This is not interpretation, a claim, analyses, or syntheses - just a statement of fact from a primary source and as such should be sufficient to establish that secondary sources using another number without explanation are objectively wrong.
It is OR on your part to assert that because the parity bit is generated and tested in the 305, not the 350, for all data transfers, it somehow matters more than the S bit. The S bit is generated and suppressed in the 305; would you like to provide the OR to say what happens if it is not present, say due to a read error? My OR says parity error. Also, it is OR on your part to speculate that the parity bit could be used for other purposes. Tom94022 (talk) 00:04, 20 September 2014 (UTC)

Reply to counter argument

1. How do you know it was the exact same mechanism? And how do you know it used the exact same electronics? The 355's "six million digits" does not make it sound like it was formatted the same way. If stored in BQCD straight from the 650, that would take 42 million bits; compare with the 350's 35 million bits (I count parity bits there because parity is inherent in BQCD code). It would have taken a trivial circuit to convert BQCD to and from simple four-bit BCD with a parity bit and store them that way... but that's only 30 million bits including parity, a significant step down from the 350's 35 million bits including parity. No... the 355 must have had a different formatter.
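A quick check of the bit counts in this comparison (a minimal Python sketch; the encodings are the hypothetical ones named above, since the 355's actual internal format is exactly what is in dispute):

    digits_355 = 6_000_000           # IBM 355: "six million digits"
    print(digits_355 * 7)            # 42,000,000 bits if stored as 7-bit BQCD
    print(digits_355 * 5)            # 30,000,000 bits as 4-bit BCD plus a parity bit

    chars_350 = 5_000_000            # IBM 350: 5 million characters
    print(chars_350 * 7)             # 35,000,000 bits including parity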

The CHM oral history says the basic mechanism was the same. One big difference was the addition of two more access arms, but that is not relevant to our discussion. Undoubtedly there were differences in the electronics.--agr (talk) 22:09, 19 September 2014 (UTC)
The "differences in the electronics" are key to the whole thing. If they were the same device they would have had the same number.

2. That's just an artifact of the circuits in the card reader and card punch. There are six bits per character position in the 305, even if the peripheral equipment IBM chose to build doesn't let you get at all 64 possible characters.

And six bits is an artifact of the 305 CPU.--agr (talk) 22:09, 19 September 2014 (UTC)
And the IBM docs on the 305/350 state unequivocally that the seventh bit is a parity bit. Jeh (talk) 22:03, 20 September 2014 (UTC)

4. "The 350 neither generated nor was aware of the parity bit." How do you know this? Anyway, this is a red herring. Even if it is true, since the only way to store data on a 350 was via a 305, and the 305 would error check if it detected a character with even parity, there would be no way to store such characters on the 350. Hence the 350 can be used to store five million six-bit characters. I think this is a distinction without a difference.

The 305 operations manual is clear on this. See pages 70-71, and figure 47.--agr (talk) 22:09, 19 September 2014 (UTC)
Um, maybe. You know perfectly well that diagrams of this sort (even the "schematics" in the CE doc) are "cartoons"; they leave out a lot of detail. We can't conclude that there was nothing in the 350 that checked or generated the parity bit. This leaves us with the 305 and 350 docs, which state explicitly that the seventh bit is a parity bit, "carrying no numeric or logical value." It is OR on your part to infer from the IBM docs that it would have been possible in some way to use the 350 without the 305, and thereby (or by any other means for that matter) to store seven data bits per character, no parity. Jeh (talk) 22:03, 20 September 2014 (UTC)
IBM manuals of the era were written carefully. The 305 operations manual section is describing error handling. The notion that IBM had another error checking means and did not bother to mention it in the manual is far-fetched. But I suggest you are the one engaging in OR by reading the manual in just one way to exclude certain sources (like the program manager for the 350) as "unreliable."

4a. "If someone today built a text processing machine that exclusively used 7-bit ASCII with parity and stored that text on a modern hard drive, we would not say the hard drive then became a 7-bit drive." And if someone built a computer with integral hard drive that stored six bits per character, but the peripheral units only allowed generation of 48 different character values out of the 64 theoretically possible, we would not say the hard drive became a 5.585-bit drive. You can't have it both ways.

I am not trying to have it both ways. I am saying there is more than one right answer, and we should respect the differences in the sources.--agr (talk) 22:09, 19 September 2014 (UTC)
Not when the sources are not equally reliable. Your assumption that someone could have bought 350s from IBM, without the accompanying 305, and used them successfully to store seven data bits per character position, is not backed up by a WP:RS. So far, it appears to be pure conjecture on your part. Do you have RSs indicating that this was ever done? Do you have docs from IBM, of the same calibre as the 305/350 docs, indicating that this is possible? Jeh (talk) 22:03, 20 September 2014 (UTC)
We are talking about comparing the 350 with a modern drive; that should be done on the same basis, with both drives in isolation.--agr (talk) 21:44, 21 September 2014 (UTC)

5. See reply 2 (and 4a). There are six data bits.

9. Actually, yes, we do pick and choose, in that we evaluate the reliability of sources. When we see a tech blog say "70 bytes per card" with absolutely no justification nor reference, and simple arithmetic (the same sort you're doing in your step 5) proves otherwise, we can decide that that source is not reliable on that point. Similarly, a slide whipped up for a fun talk at a SHARE meeting is not of the same reliability as the manufacturer's user and CE manuals. Editors on WP evaluate the reliability of sources all the time. Otherwise we would just quote everything in every source and be done with it.

Sources like ASME or Al Shugart are not in that category.--agr (talk) 22:09, 19 September 2014 (UTC)
They're not that bad, no, but they're not at the same trust level as the IBM docs. The quote by Al Shugart is from a transcript of an informal round-table discussion that happened decades after the 350 project ended. Furthermore it makes no sense: "a 7-bit code - 6-bit + 1 binary"??? What, the other six bits were not binary? He might have meant to say "parity" instead of "binary"; he might have meant "a 7-bit code - (6-bit + 1) binary", implying that the "+1" bit was special and the core code was 6 bits. Or it might even be a transcription error. Anyway, "a 7-bit code" is still consistent with the 305/350 doc that indisputably describes 7 bits per character... one of them being a parity bit. Jeh (talk) 22:03, 20 September 2014 (UTC)
Al Shugart was program manager for the project and went on to found one of the biggest companies in the modern disk drive industry. And I don't know what transcript you are looking at but the video of the interview is readily available and after saying "a seven bit code, six bits plus one binary bit," he immediately corrects himself, adding "one parity bit rather."--agr (talk) 21:44, 21 September 2014 (UTC)
The same is true of the ASME document. One, it is not exactly at the "peer-reviewed research paper" level. Two, it's from a society of mechanical engineers. I think good MEs walk on water, but we can't expect them to be experts about digital data storage technology. So this document is not of the same trust level as IBM's 305/350 documents. Three, again, "7 bits per character" is not inconsistent with the 305/350 docs, which clearly describe 7 bits per character... and state that one of those bits is a parity bit - and there is just no other way to interpret them. Jeh (talk) 22:03, 20 September 2014 (UTC)
We have no requirement for peer-reviewed sources on Wikipedia. And saying "gee, they are mechanical engineers, so why would they know about electronics?" is really hair-splitting.--agr (talk) 21:44, 21 September 2014 (UTC)

9a. "The 350 could store any 7 bit pattern..." How do you know there was no error checking inside the 350? It seems very unlikely to me that IBM would build the 305 to just halt on a parity error but give no indication as to the source of the error. But even if true, this again is a distinction without a difference. The 350, connected to the 305 as it had to be, could not store "any 7 bit pattern", not unless the parity check in the 305 failed.

Again the manual is clear on this. IBM computers of this era halted on a parity check, not just the 305. More sophisticated responses came later. And, no, the 350 drive did not have to be connected to a 305; it just was. In analyzing a hard drive it is reasonable to consider it in isolation, just as we would a modern hard drive incorporated into an integrated system. And if you do look at the 350/305 as a system, the data is characters from a 48-character set.--agr (talk) 22:09, 19 September 2014 (UTC)
Was the 350 ever used by anyone, connected to anything other than a 305? RSs please. Jeh (talk) 22:03, 20 September 2014 (UTC)
No one is saying it was, just that the parity bit was not part of the disk drive, as the same manuals make clear.

10. There are six bits, damn you!

Some sources say 7. And please see WP:CIVIL.--agr (talk) 22:09, 19 September 2014 (UTC)
Let it be known that this, namely (Jeh at 18:49, 19 September 2014 (UTC)) is a disruptive user, clearly beyond the WP:Civil pale. 71.128.35.13 (talk) 20:15, 20 September 2014 (UTC)
Please. That's a ST:TNG reference, meant in a spirit of levity. Don't grasp at broken straws on social AND technical grounds at the same time, 'K? It just makes you look desperate (and would get you laughed out of ANI if you brought it up... so, please do!). Jeh (talk) 22:03, 20 September 2014 (UTC)
My humor detector was apparently improperly calibrated. I withdraw my complaint.--agr (talk) 21:44, 21 September 2014 (UTC)

11. See 9. There are no reliable sources that support, for example, 4.4 million bytes (despite one IP's voluminous and fallacious posts to the contrary), not in the face of overwhelming "horse's mouth" evidence that the claimed seventh bit was a parity bit. Similarly, a claim of "5.585 bits" is not supported by any reliable source, only by your calculation. And while we can use such calculations to evaluate sources, we can't use them to generate material for the article; that's WP:SYNTH. Jeh (talk) 18:49, 19 September 2014 (UTC)

Information content is a standard calculation, done in other articles, but I am not arguing for using it, just saying stick to reporting what the sources say, and stop claiming some are "objectively wrong." And we don't deprecate comments just because the editor is an IP.--agr (talk) 22:09, 19 September 2014 (UTC)
So on the one hand you argue that, because there is nothing in published docs that says you couldn't use a 350 separate from the 305 and store seven data bits per character position, we should regard it as holding 5 million characters of 7 bits each. And on the other hand you argue that it should be regarded as holding only 5.585 bits per character because of the limits of the card reader and punch that IBM happened to sell with the 305. I don't understand how you can argue both of these positions at the same time. Jeh (talk) 22:03, 20 September 2014 (UTC)
Re the IP, I'm not deprecating them because they're an IP. I'm deprecating them because they're wrong and because the IP continued to repeat previously-refuted arguments (arguing in circles) and is in general behaving like a tendentious editor.
And personally, Arnold: if I think something is objectively wrong, I'm not going to feel limited by any orders from you not to say so. Not every source is equally reliable and editors are free to, indeed are expected to, describe the reasons for their evaluations of sources. Jeh (talk) 22:03, 20 September 2014 (UTC)
I've spent a fair bit of time in the 305 Manual[12] which I suggest is the only reliable source for this discussion. The manual clearly shows a bit serial data interface; bit serial disk drives are traditionally specified at their interface with the format given by the drive vendor, not by what the system does with it, so much of the chatter about the 305 is irrelevant beyond the format. The format disclosed is a block comprising a leading gap with some bits, 100 8-bit characters, and a trailing gap. Arnold is partially correct that "The 350 neither generated nor was aware of the parity bit" - this is pretty clear from the circuit diagrams, but it is more complete to note that the 350 did not know the meaning of any bit. Since the recorded character is 8 bits, the manual makes any 7-bit or 4.4 MB assertions suspect. No one yet has stated the basis for omitting the space bit but adding the parity bit. Furthermore, the IBM specified format constrains many bits in the block, including the parity and space bits. The 305 manual specifies at its bit serial interface 30 million unconstrained bits and lots of other constrained bits, but since modern HDD capacity is measured in bytes of 8 unconstrained bits the only valid capacity for comparison is 3.75 MB. My proof is somewhat different than Jeh's but we arrive at the same conclusion and I hope Arnold now will agree. Tom94022 (talk) 20:42, 19 September 2014 (UTC)
I've pointed out above that the space bit must be suppressed before data gets to the core buffer, according to the CE manual, p.189. On the other hand the parity bit is presented to the core buffer and is only checked there. So there are different valid ways of looking at the capacity of the 350 for comparison purposes. I argue that when we make our smug comparison about how much better modern drives are we should take the most conservative view of that ratio and compare characters with modern bytes, because that is how users employed the drives then and now.--agr (talk) 22:09, 19 September 2014 (UTC)
Arnold, you also pointed out that "In analyzing a hard drive it is reasonable to consider it in isolation, just as we would a modern hard drive incorporated into an integrated system." Exactly; the core buffer is a part of the 305 system, not part of an isolated drive. There is no "check" on any bit in the 350, but if the S bit is not generated by the drive, all subsequent decodes of the characters of one block in the system will be wrong and some will cause a parity error upstream from the drive, but nothing will happen in an isolated drive. So what makes the parity bit so special that we count it but not the space bit? In isolation the data interface of the 350 is just a stream of bits, with the S and P both required and providing no information. In isolation the drive can't distinguish any bit, but a drive failure to properly generate either the S bit or the P bit will cause the system to report an error. There really is no justification for using just 7 of the 8 bits per character in the bit stream other than that it is an urban legend. There is nothing smug about this comparison, but it sure sounds like you have a point of view to minimize the improvement. Tom94022 (talk) 23:34, 19 September 2014 (UTC)
Please take a careful look at figure 86 on page 86/253 of the 305 CE Manual,[12] particularly the lines labeled "Disk Write Data" (the serial input stream), "Disk Flux" (what is written to the disk) and "Disk Data" (the serial output stream). This is the IBM format for the 350. Note there are 200 usec of AGC bits (about 166 bits) at the beginning of a block. You have to write them; you can't use them for anything else or the drive might not replay correctly. Note the roughly 400 usec region with no bits; it's there for a reason, so writing bits into it may have unpredictable results. At the end of this blank region is the S bit of the first character, followed by the remaining 7 bits of the first character and then by the remaining 99 characters of the block. Elsewhere the spec says there is a 180 usec blank gap after the last character. IBM constrains the P and S bits so that a character has only the 64 available states of the 6 data bits. This is the only published IBM format that can be used to calculate the capacity in current terms. Absent a source that explains how they are going to get 128 states or 256 states out of these 8 bits as specified by IBM, we have to conclude that such a calculation is wrong.
The industry has come somewhat full circle, from bit serial data interfaces to bit serial interfaces with data in one of several packets across that interface. There are lots of serial bits in today's packets, any of which could be redesignated by an adapter designer, but that would clearly violate the spec and be rejected by today's smart drives. Yesterday's dumb drive would accept them and replay whatever it could, but there would be no guarantee that the playback would be reliable, even if one read after writing. Some SAS drives do have additional bits that could be used for additional capacity, the extra bits in the 520 or 528 byte sectors, without posting an error, but they are not counted by the drive vendor in specifying capacity; a Seagate ST2000NX0273 is 2 TB regardless of the sector size. Capacity is based upon the manufacturer's specs, and IBM's specs yield a capacity in current terms of 3.75 MB.
Arnold, I hope this will convince you, but if not I suppose I could live with 3.75 MB as the main value with footnotes saying "Other reported capacities of 4.4 MB (based upon 7 bits per character) and 5.0 MB (based upon 8 bits per character) would result in $xxxx/MB and $yyyy/MB" Tom94022 (talk) 16:48, 20 September 2014 (UTC)
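As a sanity check of the per-block accounting implied by the format described above (a minimal Python sketch; the AGC bit count is the approximation quoted from the CE manual in the preceding comment, not an exact IBM specification):

    # Approximate bits in one 100-character block on the IBM 350
    agc_bits = 166                 # ~200 usec of AGC bits at the start of a block (estimate)
    recorded_bits = 100 * 8        # 100 characters x 8 recorded bits (S + 6 data + P)
    data_bits = 100 * 6            # only the 6 data bits per character are unconstrained

    print(recorded_bits)               # 800 recorded character bits per block
    print(agc_bits + recorded_bits)    # ~966 recorded bits before counting the blank gaps
    print(data_bits)                   # 600 unconstrained data bits per block
    print(data_bits / 8)               # 75.0 bytes of 8 unconstrained bits per block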
It won't convince any reader of IBM technical documents circa 1960.
IBM RAMAC 305 disks use the same coding system as the drum. This "coding system ... is used throughout the machine for magnetic recording on the disks and drum," according to the RAMAC 305 Manual of Operation dated April 1957.[70] (page 70)
IBM technical manuals do not support 6 bits per character. The horse's mouth spoke of seven bits per character as of 1960: IBM white paper, "Digital Computers - Logical Principles and Operation" by George J. Saxenmeyer, General Products Division Development Laboratory Endicott, N. Y., September 23, 1960 [71] describes the IBM RAMAC 305: "The core buffer has a capacity of 100 characters. ... This buffer consists of an array of seven planes of 100 (10 x 10) cores each, a separate plane for each of the character bits."(page 11) The core buffer capacity is 700 binary bits or 100 characters at seven bits per character.
On the IBM 305 RAMAC, "one character position contains eight bit-spaces for the seven bits of the character." (page 11)[72]
"Currently-produced systems use a 51-character seven-bit parity-checked code." ... "the current seven-bit code has a capacity of 64 characters."[73](page 33) Bi-quinary coded decimal has seven bits per character. 71.128.35.13 (talk) 20:15, 20 September 2014 (UTC)
As has been pointed out to you already, the IBM documents specific to the 305/350 do describe a seven-bit code... and go on to describe the seventh bit as a parity bit, using phrases like "carrying no numeric or logical value". (I really don't understand how you can continue to ignore those words.) It's a parity bit. Yes, there are some shorter descriptions that simply say "seven bits", but every more technical document, more specific to the 305/350, notes that the seventh bit is a parity bit.
"Currently-produced systems use a 51-character seven-bit parity-checked code." Yes, seven bits including the parity bit - hence six data bits. As thoroughly described in the 305/350 documentation. The entire capability of this code is not quite fully utilized by the 305/350, as indicated by:
"the current seven-bit code has a capacity of 64 characters." Right, because without the parity bit (which is one of the seven has amply documented in the 305 manuals), you have six data bits, giving 64 possible characters. If the seventh bit was not a parity bit it would have 128 possible characters. These points do not contradict the claim of six data bits, they are consistent with it.
"Bi-quinary coded decimal has seven bits per character." Wrong. BQCD has seven bits per decimal digit. Every IBM computer and device (650, 355, NOT the 305 or 350) that used BQCD had to use two such decimal digits to represent each possible "character", giving 100 possible characters. BQCD and any machine that used BQCD is irrelevant to this discussion. The document you are referencing is a general description, giving the codes and techniques used in a variety of different machines. You can't use anything it says about BQCD in a discussion of the 305, or the 350, or the system that included both. That BQCD uses seven bits per decimal digit (and has an inherent built-in error check, so no parity bit was deemed necessary), while the coding used in the 305 used six bits plus a parity bit, also totaling seven, is mere coincidence.
You have had these facts explained to you several times now, and your bringing up BQCD yet AGAIN, as if it had anything at all to do with the 305/350, is unproductive at best. Jeh (talk) 22:03, 20 September 2014 (UTC)
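The 64-versus-128 point is easy to verify by enumeration (a minimal Python sketch; it counts the 7-bit patterns with odd parity, per the odd-parity scheme described in the 305 manual):

    # Of the 128 possible 7-bit patterns, exactly half have an odd number of 1-bits.
    valid = [p for p in range(128) if bin(p).count("1") % 2 == 1]
    print(len(valid))   # 64 -- the capacity of a 6-data-bit odd-parity code, not 128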
In sum, the careful, thorough, technically competent, and honest reader of IBM docs of this era—a reader who is not cherry-picking sentences in an attempt to advance an already thoroughly-discredited position—will read the material specific to the 305/350, will find the references to the "seven bit code"... but will also find (and not ignore) the description of the seventh bit as being used only for error checking, "carrying no numeric or logical value". A bit that "carries no numeric or logical value" cannot be counted in a capacity calculation, any more than the ECC bits in modern hard drives are counted in theirs.
Such a reader will also correctly ignore all information about BQCD, as BQCD was never used on the 305/350.
While pondering the above, you might want to review WP:IDIDNTHEARTHAT. Jeh (talk) 22:15, 21 September 2014 (UTC)

Wrapping up

My conclusion so far: the IBM docs presented do not explicitly rule out the possibility of using the 350 to store seven data bits per character, with no parity. However, until RSs of the same trust level are presented that explicitly indicate this is possible, such speculation on the part of WP editors is just that. We must go with the most trusted, most detailed, and most specific RSs, the 305/350 documents already extensively cited. They state clearly that the seventh bit was a parity bit and not generally usable, "carrying no numeric or logical value". Until equally reliable sources are presented stating that it was possible to use the seventh bit for arbitrary data, the conclusion of "5 million six-bit characters" must stand. Since the IBM docs state "seven bits per character" but then go on to say that one is a parity bit, this conclusion is not contradicted by various claims of "seven bits per character". There were seven bits per character; it's just that one of them was only ever used as a parity bit. To put the calculation on an equal footing with specs for modern hard drives, we don't count the parity bit, any more than we would count the ECC bits in modern hard drives. Jeh (talk) 22:12, 20 September 2014 (UTC)

I agree with Jeh. The RS weight is heavily against 4.4 MB; we cannot even mention it without violating WP:UNDUE. Consensus is also against doing so, with only the IP dragging the rest of us in circles. Please, let's end this here. --A D Monroe III (talk) 16:45, 21 September 2014 (UTC)
I don't agree that it is undue to give weight to the modern characterization of an RS like the ASME, vs. Wikipedian analysis of IBM docs from the era, but I also think the issue is not that important and more than enough time has been spent on it, so I am happy to move on. Cheers.--agr (talk) 22:16, 24 September 2014 (UTC)
Please. ASME's focus, being a society of mechanical engineers, was on the mechanics, not the data stream. And put next to the IBM technical documents, the ASME's award document is little better than a puff piece. I have no doubt that they spent more time on graphic design than they did on technical research. And I don't agree that reading "[one of the seven bits] carries no numeric or logical value" and concluding "the seventh bit should not be counted as part of the drive's capacity" requires any significant amount of "Wikipedian analysis". Jeh (talk) 22:45, 24 September 2014 (UTC)

So, this leaves only the IP as a dissenter, and all of the IP's arguments have been refuted. We're done here. The capacity was 5 million 6-bit characters, equivalent to 3.75 million 8-bit bytes. A footnote to the effect that "references to 4.4 MB are counting the seventh bit as a data bit, when in fact it was a parity bit", this claim ref'd to the 350 CE manual, would not be out of place. Jeh (talk) 22:49, 24 September 2014 (UTC)

Six bits per character, as long as the plethora of seven-bit sources are given their proper due:
Seagate "RAMAC 4.4 MB" http://www.thic.org/pdf/Nov02/seagate.dlitvinov.perpendicular.021105.pdf
American Society of Mechanical Engineers (ASME): "The 350's fifty 24-inch disks contained a total capacity of 5 million binary decimal encoded characters (7 bits per character) of storage."[74] (This is 4.4 MB.)
The Official History of IBM, 350 Disk Storage Unit: "The whole thing could store 5 million binary decimal encoded characters at 7 bits per character." http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/ramac/
Museum (Ross Perot collection) RAMAC "storage capacity 4.4 megabytes."[75]
"in 1957, the original RAMAC ... boasted a storage capacity of 4.4 megabytes."[14]
"the 305 random-access-method of accounting and control (RAMAC) system and its disk file ... system was launched in September 1956 and included near line random disk file storage device. The disk file helped in access of 4.4 megabytes of data"[76]
"The total storage capacity of the RAMAC 305 was 5 million 7-bit characters, or about 4.4 MB."[77]
IBM introduced the 305 RAMAC system: "It started with a product announcement in May of 1955. IBM Corp. was introducing a product that offered unprecedented random-access storage — 5 million characters (not bytes, they were 7-bit, not 8-bit characters)."[78]
"Milestones in the hard disk drive industry" "350 RAMAC formatted capacity 4.4 MB" page 29[79]
"The 350 stored 5 million 7-bit characters (about 4.4 megabytes)."[80]
71.128.35.13 (talk) 23:29, 24 September 2014 (UTC)
Each of the above is obviously less authoritative than the IBM 305/350 documents, which indicate unequivocally that the seventh bit was a parity bit, with "no numeric or logical value". Bits with "no numeric or logical value" do not get counted in modern hard drives, so for a proper comparison, they should not be counted for the 350. I will say the same thing I said to Arnold: Unless you can come up with a source that's at least as authoritative as the IBM 305/350 documents, and which states that the 7th bit was ever usable as a data bit, the correct answer is 5 million 6-bit characters = 3.75 million bytes. Until you can come up with such a source, the "proper due" to any or all of these 7-bit claims is to note that, according to the most authoritative sources (the 305/350 documents), they're counting the parity bit in the calculation. If you can't find such a source, your repeating the same arguments and referring to the same not-so-authoritative "sources" yet another time will not advance your position; it will just be tiresome. Jeh (talk) 04:31, 25 September 2014 (UTC)
I agree with Jeh. FWIW, Engineering Design of a Magnetic-Disk Random-Access Memory, Feb 1956, by IBM San Jose engineers also states the IBM 350 recorded an 8-bit character. Seven bits as a discrete quantity does not exist in the reliable sources from the time of the IBM 350; 6 data bits and 8 recorded bits do. So a mere assertion of 7 bits or 4.4 MB in a modern source, absent a discussion of which 7 bits are usable and why not 8, is not reliable. Tom94022 (talk) 18:07, 25 September 2014 (UTC)
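For the record, the three capacity figures that circulate in the sources all derive from the same 5 million characters; they differ only in how many bits per character are counted (a minimal Python sketch of the three conversions discussed above):

    characters = 5_000_000
    for bits in (6, 7, 8):
        mb = characters * bits / 8 / 1_000_000
        print(f"{bits} bits/char -> {mb} MB")
    # 6 bits/char -> 3.75 MB   (data bits only)
    # 7 bits/char -> 4.375 MB  (counting the parity bit; rounded to "4.4 MB")
    # 8 bits/char -> 5.0 MB    (counting all recorded bits, including the space bit)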
  1. ^ In email correspondence with Maleval, the author of the 2011 article, he cited the same 2006 source for $50,000
  2. ^ Sometimes as in CDs the check bits are interspersed with the encoded data bits, mostly they are not.
  3. ^ Probably because electronics were very expensive then compared to media
  4. ^ I think the addition of the count field (sector header) was the first step, followed by internal write circuit checks and head checks (causing far more false positives than actual failure detections).
  5. ^ As near as I can tell this began in the early 70s and ended as all drives became intelligent (e.g. SCSI and PATA) in the 1990s
  1. ^ IBM's first HDD versus its last HDDs
  2. ^ a b c Fost, Dan (2006-09-11). "Hard-driving valley began 50 years ago / And most other forms of data storage eventually became a distant memory". San Francisco Chronicle. Mountain View, CA. Retrieved 2014-08-26. In 1956, the RAMAC cost $50,000, or $10,000 per MB.
  3. ^ a b c d e Maleval, Jean-Jacques (2011-06-20). "History: First HDD at 55 From IBM at 100 Ramac 350: 4.4MB, $11,000 per megabyte". storagenewsletter.com. Retrieved 2014-08-27. Ramac 350: 4.4MB, $11,000 per megabyte ... The first delivery to a customer site occurred in June 1956, to the Zellerbach Paper Company, in San Francisco, CA.
  4. ^ Pugh, Emerson (1995). "Building IBM: Shaping an Industry and Its Technology". p. 226.
  5. ^ IBM Archives - IBM 650
  6. ^ a b Text of an IBM press release distributed on September 14, 1956
  7. ^ "Computing In The Universty," American Mathematical Society and the Ohio State Research Center, Datamation, May 1962
  8. ^ Pugh, IBM's Early Computers, p658 fn55, "Because of 650's encoding conventions, 355 capacity was 6 million decimal digits."
  9. ^ a b Ballistic Research Laboratories "A THIRD SURVEY OF DOMESTIC ELECTRONIC DIGITAL COMPUTING SYSTEMS," March 1961, section on IBM 305 RAMAC (p. 314-331) states a $34,500 purchase price which calculates to $9,200/MB.
  10. ^ Farming hard drives: 2 years and $1M later
  11. ^ "in June 1956 ... to Zellerbach ... "
  12. ^ a b c RAMAC 305 Customer Engineering Manual, p.7 describes the 350 character coding as having 6 data bits plus two other bits that do not affect the character coding. Therefore it is a 6-bit character.
  13. ^ a b 305 RAMAC Random Access Method of Accounting and Control Manual of Operation (PDF). April 1957. p. 70. Form 22-6264-1.
  14. ^ J. M. D. Coey (25 March 2010). Magnetism and Magnetic Materials. Cambridge University Press. ISBN 978-0-521-81614-4.