Talk:Three-address code

Latest comment: 11 years ago by Doc Daneeka in topic Article revision (2013-04-29)

Article revision (2013-04-29)

edit

Under advice of this discussion page, I have revised the article. Primarily I have linked to some uses of TAC, moved the text sections to the top, introduced a table for keeping the examples in, and introduced an example in which most operands actually use three operands. I modified the C example to focus on the operands rather than a full-body compilation: Since b was used for nothing, it was equivalent to the null program. So I removed any code that might perform stack-allocation within TAC, since compilers might deal with this entirely different. I did add the word-alignment (* 4) code to the translation of the C-like example. Yes, it may be confusing, but at least the reader might question why it is there. Doc Daneeka (talk) 09:58, 29 April 2013 (UTC)Reply

Example

edit

The current example:

      i := 0                  ; assignment
L1:   if i < 10 goto L2       ; conditional jump
      goto L3                 ; unconditional jump
L2:   t0 := i*i
      t1 := &b                ; address-of operation
      t2 := t1 + i            ; t2 holds the address of b[i]
      *t2 := t0               ; store through pointer
      i := i + 1
      goto L1
L3:

Two areas of possible contention:

  • Line 5 explicitly computes the "address of" the array b, even though the array-name itself would suffice in C, because lvalues of type int[10] decay to int* in contexts such as this. I would prefer to keep the explicit address calculation, because after all TAC is not C, and we shouldn't assume that our TAC interpreter (or code generator, or whatever) follows the same rules as C when it comes to lvalue decay. The same argument would apply if we were converting an int to a double; the conversion should probably be explicit in the TAC representation. Also consider that readers of this article should not be expected to know the finer points of C. More pedagogical is better.
  • Line 6 does not multiply i by sizeof(int) before adding it to t1. I believe this is reasonable, for two reasons. One: pedagogical clarity. Readers should again not need to know the meaning of the sizeof operator in C. Adding a multiply-by-4 would also lengthen the example by one line, without contributing any new concepts. Two: real-world practicality. All common machines are word-addressible, so if this were the IR for a real compiler, there wouldn't be any "multiply" visible in the generated code. If we're obsessing over technical correctness, we can make the same argument that I rejected in line 5: in C, addition to a pointer is scaled by the size of the pointed-to element.*

I like the new example better than the old one; thanks, Ceroklis!

* - I would not object if the expression &b were changed to &b[0] for total type-correctness, but that's a pretty formidable-looking expression for something that's supposed to look simple. :) --Quuxplusone 04:33, 13 June 2007 (UTC)Reply


I agree with your first point. TAC is not C and making explicit the fact that we are dealing with an address is clearer (even though b by itself would have no meaning).

However I cannot agree with your second point. The expression b[i] is an abbreviation for *(b + i) (pointer addition) which itself is an abbreviation for *(b + sizeof(int) * i) (integer addition). And this is exactly what a compiler will produce at this stage. More specifically:

- The fact that you have to take the size of array elements into account when computing their address is not particularly difficult to understand.

- The meaning of sizeof could be given in a comment, or an equivalent notation could be used.

- The address calculation should be explicit. As per you first point no C abbreviations should be used in TAC.

- You cannot assume that array elements will always be word sized (the current example would break with a char array), and interpreting addresses in TAC as referring to words instead of bytes can only lead to confusion and is fundamentally wrong (see my last point).

- Pedagogical value should not be at the expense of correctness.

- If using arrays renders the example too complicated (which I dispute) we may want to do without them.

- This change adds only one line to the example, which is not much. I also think that examples should be informative as a whole, and that maximizing the number of different constructs per line is not essential. After all the i := i+1 line doesn't bring much either.

- I don't know where you got the idea that modern machines are word addressible. The memory controller only deal with byte addresses, not word addresses. Sure you can avoid doing the multiplication explicitly on machines supporting indexed addressing but this is an optimization that is deferred until machine code generation and doesn't appear in the TAC, which will use the correct full address (which by your convention is a multiple of four).

- I agree that there is a risk for the reader to get lost in the technicalities of C, and that we should avoid it. But C is technical, and definitely not a beginner's language (although it is often wrongly taught to them), which is why it is arguably not a good language to write examples of general concepts in.

That's all, folks! Ceroklis 21:08, 13 June 2007 (UTC)Reply

"I don't know where you got the idea that modern machines are word addressible." — All the machines with which I'm even vaguely familiar (PowerPC, ARM, x86, MIPS) are word-addressible. They also provide instructions to access the individual bytes and half-words of a word, and some of them even allow you to access misaligned word-sized objects.
If you feel that "pedagogical value should not be at the expense of correctness", then yet another solution would be to change the type of b to unsigned char[10], so that i wouldn't need to be scaled. Personally, I think a "technique of deliberate lying" (as Knuth puts it[1]) is a helpful teaching aid now and then. If the technical details aren't important, it's okay to leave them out at first and fill in the gaps later. IMHO, pretty much all of Wikipedia falls into the "at first" category. --Quuxplusone 03:17, 14 June 2007 (UTC)Reply

Question about the example

edit

I have some questions related to this example. The article states that in 3AC, each statement has the general form of result := operand_1 operator operand_2. Yet, statements such as if i >= 10 goto L2 and t1 := &b do not follow this form.
P.S.: Byte addressing vs. Word-addressable --Abdull (talk) 09:47, 8 October 2010 (UTC)Reply

This description is somewhat inaccurate. TAC doesn't only have assignments. The key property is that a statement includes one address plus at most one operation. In the common case of binary operations, there are two operands plus one address, hence the name.
Statement types include:
  1. assignment: dest = operator (operand1, operand2)
  2. unconditional jump: goto target
  3. conditional jump: if operator (operand1, operand2) goto target
They all fulfill the condition above.
To go back to your examples, if i >= 10 goto L2 is of type 3 and t1 := &b of type 1. The later statement uses a unary operation so there are only two addresses.
I'll fix the article at a later time. Bomazi (talk) 13:30, 24 November 2011 (UTC)Reply

want the result

edit

s->if(a<b) then x:=y+z else 1:=m+n —Preceding unsigned comment added by 61.2.178.93 (talk) 14:04, 15 April 2009 (UTC)Reply