User talk:Vt320/Archives/2021/December


Microcode engine of the 9370

The Cocke and Markstein paper said "The IBM 9370 uses an 801 as its microcomputer."; I think IBM has historically used the term "microcomputer" to refer to the processor that executes microinstructions.

However, "The Design of a Microprocessor" says, on pages 31 and 32, that "The Capitol chip set CPU employs vertical microcode. The microinstruction architecture is essentially unchanged compared to two predecessor machines: the IBM 4361, and the IBM 9370-90.", and the microinstruction set doesn't look all that 801ish. ("The Capitol chip set" was the chip set used in the ES/9370 systems.)

So, yes, it doesn't look as if the 801 was the vertical microcode engine of the 9370 CPU.

For what it's worth, "Software Metrics and Microcode Development: A Case Study" has, on page 4, a table giving the number of lines of code, in various languages, for the "Source code of the IBM 4381 and IBM 9370 microcode devided into units and sub units", with the 9370 microcode consisting of microcode for the Processor Unit (PU), Input/Output unit (IO) and Service Processor unit (SP). The PU microcode is all in a language called "ASM4"; they describe ASM1 through ASM5 as "assembler languages used inside IBM". The IO microcode is a mix of "ASM5", "PL8", and "680ASM". I think "PL8" is PL.8; I don't know what "680ASM" is, but I wouldn't be too surprised to find out that it's 68000 assembler. Perhaps some of the functions of the I/O unit are performed by an 801, programmed in a mixture of PL.8 and ASM5 (there appears to be a lot more PL.8 code than ASM5 code); maybe there's a 68k stuck inside there (perhaps also partially programmed in PL.8). The SP unit microcode is a mix of ASM4, "370 ASM", "MASM", "ASM86", and "PLS86", so it sounds like a mix of CPU vertical microcode, S/370 code (perhaps extra 370 code outside the OS that handles some functions), and things that all sound suspiciously like x86 code (IBM PC buried inside?).

Unfortunately, I gave away my copy of "Inside the AS/400", so I can't look there to see what it said about Iliad etc.; maybe I'll go buy a used copy somewhere. (Sadly, I couldn't find any e-book versions.) Guy Harris (talk) 10:13, 28 November 2021 (UTC)

On pg. 370 of Inside the AS/400, 2nd edition:

That the Endicott-developed models of the 9370 systems each contained a RISC processor is somewhat ironic. The operating system and the applications ran on the System/370 co-processor, while the Iliad processor was relegated to the role of a service processor. Because the Iliad processor in those systems did not play a significant role, the fact that it was a RISC design was never advertised.

— Frank Soltis
Soltis sometimes gets his descriptions of non-System/3x or AS/400 hardware wrong, but this take makes sense given that the Fort Knox project tried and failed to port the S/370 operating systems to the 801 directly, leading to the "co-processor" approach. Descriptions of the 9370 processor hardware in IBM announcement letters and in documents on Bitsavers are vague, arguably lending further credence to Soltis' claim. My search for additional sources on the subject continues. Vt320 (talk) 12:11, 28 November 2021 (UTC)
Addendum: the 68k and 16-bit x86 chips were widely used by IBM for the service processors and I/O controllers of their larger systems (early AS/400 had a number of them, one was used to bootstrap the main CPU). It's certainly not surprising to think that the 9370 platform would have had a few of these chips in various roles.
This IBM publication seems to indicate that the I/O processor of at least some AS/400 machines - the hardware for which was also used by "the IBM 6010 Feature of the IBM 9370 system" - included a 68000 as a "Control Processor", so perhaps 68000s were used in at least some I/O processors for the 9370 as well. Whether the PL.8 code is for an 801 processor, the main 370 CPU, or the 68000 is a good question; PL.8 compilers existed for all three of those instruction sets, so maybe there was no 801 in the I/O processor.
As for the service processor, US patent 5067129 says "A previously known IBM (registered trademark of IBM Corp.) service processor is resident in a personal computer and used with an IBM 9370 computer to provide such functions." - if "personal computer" means "IBM-compatible PC", then that would explain all the "86" stuff in the languages for the service processor "microcode". So maybe there was no 801 in the service processor, either.
Or perhaps that depended on which model of 9370.
The evolution of RISC technology at IBM (which notes that there were PL.8 compilers for S/370 and 68k) says "The IBM 9370 uses an 801 as its microcomputer.", but perhaps that was the plan but didn't end up being the case. It also says that "The newer 801 instruction set was also enhanced with several special-purpose instructions to assist in the simulation of System/370.", but perhaps that was done for Fort Knox but never used in any S/370s that IBM sold.
BTW, Lynn Wheeler (a long-time IBM engineer) discussed a bunch of Fort Knox/Iliad/etc. stuff in various alt.folklore.computer postings that he's archived. In this one, he says

79-80 there was big push to move vast array of internal microprocessors to 801/risc ... microprocessors in low&mid range 370, control microprocessors, the as/400 (merged followon to s/36 & s/38), etc. these were in large part Iliad chips of one form or another. for various reasons, the efforts faltered and you saw some number of the engineers leaving to do risc at other vendors. ... misc. old 801 email

and

The 4331&4341 followons (4361&4381) were going to be Iliad (801/risc) ... I helped with whitepaper that derailed those efforts. An issue was that circuits were getting small enough that it was possible to directly implement much of 370 directly in hardware (rather than having to resort to the microcode implementations of previous generations).

so he takes some credit for killing Fort Knox. That description of the 4361/4381 implementing a bunch of S/370 instructions directly in hardware sounds somewhat like what the ES/9370 chip set did, so perhaps that dates back before the 9370, and perhaps the non-ES 9370s did the same - perhaps that meant no 801s in the 9370.
(Lynn Wheeler also seems to indicate that there were multiple "Iliad" processors; he describes "Blue Iliad" as "first 32bit 801", but says that it "never got much past sample chips." He says here that:

about the time of xt/370, boeblingen had 3 chipset 370 running about the speed of 168-3 (3mips)

which was probably either the Capitol chip set or a predecessor to it (it may be the "ROMAN" chip set that Wheeler mentions elsewhere - used in the original 9370s?). He also says:

For the 801, I would have liked to use blue iliad ... 1st 32bit 801 chip ... really big and really hot ... ran about 20mips ... and was never finished (next 32bit 801 chip was six chipset RIOS)

which seems to indicate that after Blue Iliad came the first POWER chipset for the RS/6000 ("six chipset RIOS").) Guy Harris (talk) 06:07, 30 November 2021 (UTC)
I found some further sources of info in the following papers (both of which require an ACM subscription, unfortunately) - Implementing a Mainframe Architecture in a 9370 Processor and The 9373 and 9375 Pipelined Processing Unit
The tl;dr is that the 9373 and 9375 processors consisted of three separate units, an instruction decode unit (I Unit) which converts S/370 instructions into vertical microcode, a microcode execution unit (E Unit), and a floating point unit (FP Unit). From the description, it sounds like all S/370 instructions were microcoded. The interesting part is the description of the E unit, which is described as having the following characteristics:
* All vertical microinstructions are 32 bits in length
* 32 x 32-bit registers
* Typically one cycle per microinstruction
* Microinstructions follow a load-store model
* Three-stage pipeline
All of this, of course, makes the E unit sound a lot like an 801, or something which was based on the 801 design. While the 801 is not mentioned in any of these papers, the fact that the engineers supposedly had experience of implementing an S/370-compatible system on top of an 801 makes it more likely that this E unit is some type of 801 derivative, which would back up the Cocke and Markstein paper.
A further paper (I370 - A New Dimension of Microprogramming) describes the use of PL.8 for writing microcode in the later 9377 processor, although it's not clear whether this is the same vertical microcode of the earlier 9370 processors, or a different variety.
The I unit is described as being horizontally microprogrammed, responsible for storing the S/370 system state (including the Program Status Word), and also capable of partially executing some instructions. This makes me wonder if the I unit is the "S/370 co-processor" described by Soltis in his book.
I'll continue looking for more resources, but I am inclined to include the mention of the 801 being used as the microcode execution engine in the 9370 and 801 articles. (I guess this is essentially a middle ground between the "it was used as the processor of the 9370" claim the article originally had, and the "it was just a service processor" claim from the Soltis book.) Vt320 (talk) 22:05, 21 December 2021 (UTC)
So it sounds as if when I said "So, yes, it doesn't look as if the 801 was the vertical microcode engine of the 9370 CPU." I managed to make a statement that is neither true nor false, because there's no such thing as "the 9370 CPU" - the 9373 and 9375 CPUs were different from the 9377.
"The Design of a Microprocessor", mentioned above, says:

The Capitol chip set CPU employs vertical microcode. The microinstruction architecture is essentially unchanged compared to two predecessor machines: the IBM 4361, and the IBM 9370-90.

The 9370-90 is the model with the 9377 CPU. (I thought I recognized that "i370" term - I either first saw it mentioned in "The Design of a Microprocessor" or in something I searched for that led me to that book.) The book says of the microinstruction format in the Capitol chipset:
Except for compression into a 16 bit format, the microinstruction set has many RISC characteristics. Especially, except for load and store microinstructions, operand addresses refer to the Data Local Store (DLS) or internal registers and latches. Also, most microinstructions execute within a single cycle.
Fitting the microinstructions into 16 bit results in a highly irregular, non-orthogonal microinstruction format. This applies in particular to Operand addresses, usually accessing the Data Local Store. The addressing of 48 Data registers requires 6 bits. The full addressing capability is available only for some sense and control, and for branch microinstruction types. ALU microinstructions, shift microinstructions, and load/store from/to main store microinstructions have a reduced addressing capability for 8 registers, requiring 3 bits per operand.
If the CPU chip executes in trap level, these 8 registers are the DLS registers T0..T7 (see Figure 11). In base level these 8 registers are either 6 direct and 2 indirect addressable registers or 8 direct addressable registers, depending on the S/370 Operation Code of the instruction being executed. The direct addressable registers are W0..W7 resp. W0..W5. The indirect addressable registers are either the General or the Floating Point Registers just addressed by the contents of the R1-field of a S/370 instruction.
so that sounds rather specialized towards implementing an S/3x0 instruction set.
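To make the bit-budget arithmetic in that quoted passage concrete, here is a minimal Python sketch. It only checks the "48 registers need 6 bits, 8 registers need 3 bits" figures and shows how much of a 16-bit microinstruction is left over. The two-operand layout is purely an assumption for illustration (the book excerpt above doesn't give the actual field layout), and `address_bits` is a hypothetical helper name.

```python
def address_bits(register_count: int) -> int:
    """Minimum number of bits needed to select one of `register_count` registers."""
    if register_count < 1:
        raise ValueError("register_count must be positive")
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 1, without floating point.
    return (register_count - 1).bit_length()


WORD_BITS = 16  # microinstruction width, from the quoted passage
OPERANDS = 2    # ASSUMPTION for illustration only; not stated in the excerpt

for registers in (48, 8):
    bits = address_bits(registers)
    remaining = WORD_BITS - OPERANDS * bits
    print(f"{registers} registers: {bits} bits per operand, "
          f"{remaining} bits left for everything else")
```

Under that assumed two-operand layout, full 48-register addressing would leave only 4 bits for the opcode and everything else, while the reduced 8-register form leaves 10, which may be why the book describes the full addressing capability as available only to some microinstruction types.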
This might be an Endicott vs. Böblingen difference; the 9373 and 9375 appear to be Endicott designs, while the 4361 is a Böblingen product (see "1983"), the 9370-90/9377 appears to be a Böblingen design, and the ES/9370's "Capitol" chip set was a Böblingen design.
In any case, the 9373/9375 design isn't "an 801 processor runs (micro)code that fetches S/370 instructions from main memory and interprets them" (801 running Hercules :-)), it looks as if it might be "an 801-like execution engine executes instructions fed to it from other units" (somewhat anticipating what the Nx586 (http://www.cpu-collection.de/?l0=co&l1=NexGen&l2=Nx586), AMD K5, and Pentium Pro/Pentium II teams later did).
(The Cocke and Markstein paper is dated 1990, which is well after the 1986 introduction of the 9370-90, but maybe they weren't that familiar with what the Böblingen people were doing. They also spoke of "The IBM 9370".)
So the 9370 and 801 articles could perhaps indicate that the 9370 models using the 9373 and 9375 processors used something 801-derived as an execution engine.
As for the PL.8-developed code, that was, I think, "i370" code, which was, I think, an early form of millicode, with the instruction set being a superset of a subset of S/370, removing instructions that trapped to i370 code and adding instructions to manipulate internal processor resources (similar to DEC PRISM Epicode and Alpha PALcode).
(Thanks for all the additional research!) Guy Harris (talk) 02:36, 22 December 2021 (UTC)