User:Diego Moya/Capacity measurements

Capacity measurements

| Advertised capacity by manufacturer (using decimal multiples), with prefix | Advertised capacity, bytes | Expected capacity by consumers in class action (using binary multiples), bytes | Diff. | Reported capacity, Windows (using binary multiples) | Reported capacity, Mac OS X 10.6+ (using decimal multiples) |
|---|---|---|---|---|---|
| 100 MB | 100,000,000 | 104,857,600 | 4.86% | 95.4 MB | 100.0 MB |
| 100 GB | 100,000,000,000 | 107,374,182,400 | 7.37% | 93.1 GB, 95,367 MB | 100.00 GB |
| 1 TB | 1,000,000,000,000 | 1,099,511,627,776 | 9.95% | 931 GB, 953,674 MB | 1,000.00 GB |

The capacity of hard disk drives is given by manufacturers in megabytes (1 MB = 1,000,000 bytes), gigabytes (1 GB = 1,000,000,000 bytes) or terabytes (1 TB = 1,000,000,000,000 bytes).[1][2] This numbering convention, where prefixes like kilo- and mega- denote powers of 1000, is also used for data transmission rates and DVD capacities. However, the convention is different from that used by manufacturers of memory (RAM, ROM) and CDs, where prefixes like kilo- and mega- mean powers of 1024.

When the unit prefixes like kilo- denote powers of 1024 in the measure of memory capacities, the 1024ⁿ progression (for n = 1, 2, …) is as follows:[1]

  • kilo = 2¹⁰ = 1024¹ = 1024,
  • mega = 2²⁰ = 1024² = 1,048,576,
  • giga = 2³⁰ = 1024³ = 1,073,741,824,
  • tera = 2⁴⁰ = 1024⁴ = 1,099,511,627,776,

and so forth.
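
The progression can be reproduced with a few lines of Python. This is only an illustrative sketch; the names BINARY_PREFIXES and binary_multiple are made up for this example and do not come from any cited source.

```python
# Illustrative sketch: the binary (powers of 1024) reading of each prefix.
# BINARY_PREFIXES and binary_multiple are hypothetical names for this example.
BINARY_PREFIXES = ["kilo", "mega", "giga", "tera"]


def binary_multiple(n: int) -> int:
    """Return 1024**n, which is the same number as 2**(10*n)."""
    return 1024 ** n


for n, prefix in enumerate(BINARY_PREFIXES, start=1):
    value = binary_multiple(n)
    # Sanity check that the two ways of writing the multiple agree.
    assert value == 2 ** (10 * n)
    print(f"{prefix} = 2^{10 * n} = 1024^{n} = {value:,}")
```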

Note also that lowercase “k” is the proper unit symbol for the prefix kilo. Uppercase “K” is properly the unit symbol for the unit of thermodynamic temperature kelvin. Moreover, the rule of the International System of Units (the SI) is that when describing the magnitude of a measure, a space always separates the numeric value of a quantity and its unit symbol, e.g. “1 kB”. Nonetheless, it is exceedingly common within the computing industry when denoting binary capacity—particularly in marketing literature and product packaging—to use uppercase K and no space (1KB), although “1 KB” is not incorrect and is often considered more suitable in technical writing.

Early usage discrepancies


The practices of using prefixes assigned to powers of 1000 within the hard drive (storage) industry and prefixes assigned to powers of 1024 within the memory industry both date back to the early days of computing.

FWIW I think all the following examples are TMI but if we are going to have examples in an HDD capacity article I suggest a fair and balanced article would have at least as many HDD examples as memory examples. Below I made them equal in number. I actually think the earlier construction with two sentences, each having a few footnoted examples is better than this, but here are my suggestions.

The first disk drive was described as having a capacity of 5 million characters.[3]
The practice of describing HDD capacity in powers of 1000 developed in the 1960s as in the Univac 9400 disc based computer system description ... "can have 2-8 8411 drives for 14.5-58 megabytes capacity. The 8411 has a transfer rate of 156K bytes per second." In this example the usage is consistently in the powers of 1000 sense between HDD capacity and transfer rate[4].
By the 1970s million, mega and M were consistently being used in the powers of 1000 sense to describe HDD capacity as, for example, in the seminal 1974 IBM article on Winchester HDD drives, which makes extensive use of Mbytes[5] and the October 1974 CDC Product Line Card which has multiple unambiguous usage of MB[6].
Disk/Trend, the principal marketing report on the hard disk drive industry from its beginning in 1977, made extensive use of MB in the powers of 1000 sense. While the categories changed during its next 22 years of publication, Disk/Trend always and consistently categorized the industry in segments using the prefixes M and later G in the powers of 1000 sense.[7]
Some computers like the IBM 702, which in 1953 used magnetic-core memory (non-volatile memory comprising small magnetic rings), had precisely 10,000 memory locations.
This has no relevance since it doesn't use prefixes
The IBM 1410 Data Processing System memory, which used modified decimal addressing was described using powers of 1000 as in, "The 40K core array requires 40,000 valid five-position addresses from 0,000 to 39,999." [8]
Computers such as the IBM 704 (also with magnetic-core memory) in 1954 had up to 32,768 words of memory, which was referred to as “32k words.” Thus, the prefix “k” was being used to represent 1024 in the 1950s. [9]
The practice of using the prefix “K” to denote 1024 was further reinforced after magnetic-core memory was made obsolete by Intel Corporation in 1969 with the introduction of the Intel 1102 dynamic random-access memory (DRAM) chip. The chip featured 1024 bits of memory, and Intel marketed it as a “1K” or 1 kilobit chip.[10]
When supercomputers with large amounts of IC-based memory were developed, such as the Cray‑1 in 1976, which featured up to 8 × 1024 × 1024 bytes of DRAM in 64-bit words, terminology such as “4M words” helped to popularize the use of the standard prefix “M” to denote the multiple 1,048,576 (1024²). Today, computer memory in DIMM form is commercially available in multiples of gigabytes (1024³).

Until the 1980s the two usages evolved in parallel, with HDD capacity being described in powers of 1000 and memory capacity usually being described in powers of 1024. Prior to the 1980s there are no known instances of HDD capacity being described in powers of 1024 nor, aside from a few very early examples, any known instances of memory capacity being described in powers of 1000.

Difference between the systems of measurement Reactions to the discrepancy


In the case of “mega-,” there is a nearly 5% difference between the powers of 1000 definition used by the HDD industry and the powers of 1024 definition used for computer memory. Furthermore, the difference compounds by a factor of 1.024 (2.4%) with each successively larger prefix, reaching about 7.4% for giga- and about 10% for tera-.
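
The size of the gap at each prefix follows from the ratio 1024ⁿ / 1000ⁿ. A minimal Python sketch of that arithmetic is shown here; the function name prefix_difference_percent is hypothetical and chosen only for this illustration.

```python
# Illustrative sketch: how much larger the binary reading of a prefix is
# than the decimal reading. prefix_difference_percent is a hypothetical name.
def prefix_difference_percent(n: int) -> float:
    """Percent by which 1024**n exceeds 1000**n."""
    return (1024 ** n / 1000 ** n - 1) * 100


for n, prefix in enumerate(["kilo", "mega", "giga", "tera"], start=1):
    # Each step multiplies the ratio by a further 1.024.
    print(f"{prefix}: {prefix_difference_percent(n):.2f}%")
```

The mega, giga and tera results correspond to the 4.86%, 7.37% and 9.95% differences for 100 MB, 100 GB and 1 TB.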

Confusion about capacity arises because different operating systems report capacity in different ways. Most operating systems, including the Microsoft Windows operating systems, use the powers of 1024 convention when reporting HDD capacity; thus an HDD offered by its manufacturer as a 1 TB drive is reported by these OSes as a 931 GB HDD. Apple's current OSes, beginning with Mac OS X 10.6 (“Snow Leopard”), use powers of 1000 when reporting HDD capacity, thereby avoiding any discrepancy between what they report and what the manufacturer advertises.
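
The 931 GB figure is plain unit arithmetic, as the following Python sketch shows. It models only the choice of divisor; the function name reported_gb is hypothetical, and real operating systems may round or format the value differently and may also subtract file system overhead.

```python
# Illustrative sketch: the same byte count expressed in "GB" under each
# convention. reported_gb is a hypothetical helper, not an OS API.
def reported_gb(size_bytes: int, binary: bool) -> float:
    """Return the capacity in GB using powers of 1024 or powers of 1000."""
    divisor = 1024 ** 3 if binary else 1000 ** 3
    return size_bytes / divisor


one_tb = 1_000_000_000_000  # 1 TB as advertised by the manufacturer

print(f"binary convention:  {reported_gb(one_tb, binary=True):.0f} GB")
print(f"decimal convention: {reported_gb(one_tb, binary=False):.0f} GB")
```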

The industry finds itself in the situation where a 1 TB HDD in a Dell desktop is advertised by its manufacturer (e.g., Seagate[11]) and by Dell[12] as a 1 TB HDD, but when the user checks capacity using the Dell-provided operating system, i.e., Windows, the reported capacity is 931 GB.[13]

The discrepancy between the two conventions for measuring capacity was the subject of several class action suits against HDD manufacturers. The plaintiffs argued that the use of decimal measurements effectively misled consumers.[14][15]

In December 1998, an international standards organization attempted to address these dual definitions of the conventional prefixes by proposing unique binary prefixes and prefix symbols to denote multiples of 1024, such as “mebibyte (MiB)”, which exclusively denotes 2²⁰ or 1,048,576 bytes.[16] In the more than 25 years since, the proposal has seen little adoption by the computer industry, and the conventionally prefixed forms of “byte” continue to denote slightly different values depending on context.[17][18]

References

  1. ^ a b Drive displays a smaller capacity than the indicated size on the drive label
  2. ^ See, for example, the HGST, Samsung, Seagate, Toshiba and Western Digital websites
  3. ^ to be provided
  4. ^ to be provided
  5. ^ to be provided
  6. ^ to be provided
  7. ^ to be provided
  8. ^ to be provided
  9. ^ Real, P. (September 1959). "A generalized analysis of variance program utilizing binary logic". ACM '59: Preprints of Papers Presented at the 14th National Meeting of the Association for Computing Machinery. ACM Press: pp. 78-1 – 78-5. doi:10.1145/612201.612294. S2CID 14701651. On a 32k core size 704 computer, approximately 28,000 datum may be analyzed, … without resorting to auxiliary tape storage.
  10. ^ The Intel 1102 silicon die had memory cells laid out in two banks whereby clusters of four bits were each organized into a 16 × 8 matrix. Thus, 2 banks × 4 memory cells × 16 columns × 8 rows = 1024 memory cells, or bits.
  11. ^ to be provided
  12. ^ GB means 1 billion bytes and TB equals 1 trillion bytes; actual capacity varies with preloaded material and operating environment and will be less. Popup at HDD capacity at Dell site[1].
  13. ^ to be provided
  14. ^ Western Digital Settles Hard-Drive Capacity Lawsuit, Associated Press June 28, 2006 retrieved 2010 Nov 25
  15. ^ Seagate lawsuit concludes, settlement announced
  16. ^ National Institute of Standards and Technology. "Prefixes for binary multiples". "In December 1998 the International Electrotechnical Commission (IEC) [...] approved as an IEC International Standard names and symbols for prefixes for binary multiples for use in the fields of data processing and data transmission."
  17. ^ Upgrading and Repairing PCs, Scott Mueller, Pg. 596, ISBN 0789729741
  18. ^ The silicon web: physics for the Internet age, Michael G. Raymer, Pg. 40, ISBN 9781439803110