Talk:Statement (computer science)
This article is rated Start-class on Wikipedia's content assessment scale.
I was under the impression that Pascal was defined using lower-case letters for all the keywords. Does anyone know different? If no response within a week, I'll change things to lower-case. Murray Langton 09:35, 7 February 2006 (UTC)
Pascal is not case sensitive (although some implementations provide an option to make it so). It is true that most source code and books use lower case. Having written Pascal on a terminal that did not support lower case, I don't find reading upper-case Pascal a jarring experience. Others might.
"Most programming languages"
There are a couple of sentences that say "most programming languages", but I don't know that's statistically a true characterization. In fact, I know there are many languages in which those descriptions of statements are false. Unless someone can provide references that support the use of the phrase "most languages", I think they should be replaced with "many" or even "typically". — Chris Page 21:15, 8 May 2007 (UTC)
- I guess until somebody actually does the counting we should go with something less emphatic (not that I might have any views one way or the other ;-). Derek farn 22:52, 8 May 2007 (UTC)
- Perhaps a rephrase to "most imperative programming languages" would be appropriate, since this type of language is what the bulk of this article is concerned with. Murray Langton 15:02, 9 May 2007 (UTC)
- I would have said that all imperative programming languages have statements. Isn't that one of the things that makes them imperative? On the other hand, how many non-imperative languages have statements? I guess it comes down to what the designer of a language decided to call a statement. Could I define a Prolog-like language and decide to call things that look like clauses statements? Perhaps we need to revisit the definition of statement; I am beginning to think we might need a definition that does encompass a construct that appears in all executable languages. Derek farn 17:38, 9 May 2007 (UTC)
- It would make much more sense to call Prolog clauses "statements" but usually they aren't called that. In imperative languages, it should be "instructions", but historically they're called statements, so we're stuck with that, even if they aren't the same as logical statements at all. The equivalent of a logical statement in programming is a Boolean expression. --88.74.206.222 (talk) 01:23, 29 October 2016 (UTC)
Create new statements during program execution?
Defining new statements is possible in Lisp. But you still have to use the list notation for your statements. Languages of the Lisp family have inherited that feature. There are also languages like Seed7 which allow the definition of new statements syntactically and semantically. But what is this: "Snobol4 allows new statements to be created during program execution." I can only speculate about what it means to create a new statement during program execution. Can it be that the new statements are created in the interpreter while it is interpreting the program? In that case the definition of the new statement must be part of the program. Possibly this is a reference to self-modifying code. Can somebody give me more information? Zron 14:18, 31 October 2007 (UTC) but basing on the basics of the vb.net portfolios — Preceding unsigned comment added by 197.221.231.254 (talk) 13:56, 19 November 2018 (UTC)
Use of "declaration" and "definition"
The following statement (no pun intended) mixes the terms "definition" and "declaration":
- Many languages (e.g. C) make a distinction between statements and definitions, with a statement only containing executable code and a definition declaring an identifier.
A variable definition is (or at least can be) distinct from a variable declaration. In the case of 'C' the variable type is defined using the typedef (type definition) keyword, however an instance of that variable is declared for use in a program simply by using the newly defined type name. —Preceding unsigned comment added by 82.2.62.83 (talk) 17:44, 8 October 2008 (UTC)
Statement syntax and the definition of statements in standard documents
I added two paragraphs to point out several things that are not mentioned in the article:
- Since statements are used frequently they dominate the appearance of a program.
- In imperative languages statements are usually characterized by special syntax and special semantics.
- The special syntax and semantics of statements are usually described outside the language (in reference / standard documents which use natural language and some form of syntax description which is not part of the language). That means that in the common case a programmer is not able to specify the syntax or semantics of a statement (examples of languages where the syntax or semantics or both can be specified should also be added).
Sorry to use the term vandalism but it is IMHO not OK to just remove some information. Georg Peter (talk) 14:37, 8 August 2010 (UTC)
- First of all, please do not characterize good-faith edits with sensible Edit Summaries ("Well meaning edit containing generalised points of view and incorrect claims (eg, most statements do not start with if/for/while") as 'vandalism'. I agree with User:Derek farn's judgement here. The second addition, in particular, which mentions that language definitions define the syntax and semantics of statements, is not specific to statements; it is equally true of expressions, declarations, and other language elements. --Macrakis (talk) 14:47, 8 August 2010 (UTC)
- Ok, I will assume good faith. I overreacted because the nice Edit Summary was combined with a complete revert. This is IMHO not a strategy to encourage part-time contributors to Wikipedia. BTW: My change did NOT intend to claim that "most statements start with if/for/while". So it may be that User:Derek farn started with a false premise. Of course all language elements are defined somehow in the language definition. But there are things that do not need to be part of the (basic) language definition, such as classes and methods defined in libraries. Class libraries can usually be defined in the language itself (the class libraries of Java and C++ are written in Java and C++ respectively). I want to point out that in most programming languages statements, declarations and some other language elements cannot be defined in a library. Everybody assumes that this needs to be the case, but this is not true. There are languages (such as LISP or Seed7) which allow user-defined statements. Consequently in such languages statements can be defined in the language itself. Therefore such statements may not even be mentioned in the language definition. This concept might not be reasonable or desirable, but it is possible. It IMHO improves the understanding of what a statement is when such things are mentioned as well. Georg Peter (talk) 15:47, 8 August 2010 (UTC)
It is correct that the syntax/semantics of statements is specified by the definition of the PL (as is every other aspect of a PL). But there is usually also a generic function call or method call syntax. Inside this frame of function and method definitions new functions and methods can be introduced. Since statements are usually outside this frame (and not a function call) the programmer is not able to introduce new ones (aside from languages with closures such as LISP). BTW is it really too hard to improve some paragraphs instead of just removing them? Georg Peter (talk) 14:57, 8 August 2010 (UTC)
- OK, I see your point, and will edit the article to reflect it. PS I worked on extensible languages (EL/1) in the 80's.... --Macrakis (talk) 15:16, 8 August 2010 (UTC)
- Thank you (BTW: I wrote the paragraph above before I saw your response. Since it was so much work I do not have the heart to leave it out :-) ). Georg Peter (talk) 15:47, 8 August 2010 (UTC)
Why the name?
Why are they called statements? Who decided to call them that? Mathematically a statement is a Boolean expression. "Statements" in imperative languages are not statements in that sense, they are commands. They describe a state transition, not a condition that may be true or false.--92.214.175.181 (talk) 19:12, 21 December 2018 (UTC)
- Use of <statement> dates back many decades e.g. Flow-Matic (predecessor to Cobol) in 1955, Fortran 1 Manual in 1956, Algol 60 Report published in 1960. I don't know where it came from. Murray Langton (talk) 23:48, 21 December 2018 (UTC)
Confusing example?
I didn't understand the example under the Expressions section. Why is print any different between the first and second lines? It doesn't seem to qualify as an expression under the given definition of always returning a result and not having side effects. 124.214.42.86 (talk) 01:46, 17 August 2020 (UTC)
Keywords and syntax
Hi Macrakis.
We obviously have slightly different views on this article, so probably better to discuss things here rather than carry on with partial reverts in the actual article. I offer these comments for your consideration.
As you may have gathered I have been slowly working my way through the article. I suspect that the entire sections "Expressions" and "Semantics" may have to go or be substantially rewritten.
You used the sub-header "No reserved keywords" while I used the sub-header "Context-dependent analysis". I would point out that stropping also has no reserved keywords, so your sub-heading is perhaps less accurate than mine.
Languages which use stropping are just as easy to parse as those which use reserved words. Lexical analysis for a stropped language is faster than for one using reserved words since you don't have to check every single name/identifier to see if it is a reserved word. Both stropping and reserved words require the same amount of lookahead (at most 1 token if using an LL(1) grammar).
An early use of reserved words was with Cobol (first published in 1959) with between 300 and 400 reserved words. What is even more confusing is that some of these reserved words have US spelling so that an attempt to use UK spelling often results in an incomprehensible error message.
An even earlier use of reserved words was in FLOW-MATIC published in 1953.
You said: "Most languages since the early 1960s, including Ada, C, C++, Java, and Pascal,". The dates for these languages are Ada 1979, C 1972, C++ 1982, Java 1996, Pascal 1970. The date for PL/1 is 1964. Hence I would suggest that "1970s" is more appropriate than "1960s".
I look forward to your response. Murray Langton (talk) 22:32, 7 February 2021 (UTC)
- @Murray Langton: thanks for opening the discussion here.
- I am not sure that the section on stropping/reserved words belongs in this article at all. The reserved words article seems to be the right place to discuss how keywords are identified. I'd think that a simple statement along the lines of:
- The keywords used in statement syntax may be reserved, marked in some special way, or overlap with identifiers.
- Any additional detail belongs in that article, I think, not here, because it has no bearing at all on the semantics of statements, and only a trivial role in their syntax.
- I agree that languages with stropping and with reserved words are equally easy to parse and require the same amount of lookahead; I hope my wording didn't imply the opposite. I doubt there is any significant advantage in lexical analysis speed either way, since you have to look up identifiers in both cases. I suppose with stropping you can use a trie or a perfect hash. Having spent a lot of time with compiler front ends, I'm pretty sure the difference is in the noise.
- The move away from unreserved keywords started in the 60s, not the 70s. Some descendants of Algol continued stropping (Simula, Algol 68); some implementations of Algol used reserved words instead of stropping; but some descendants in the 60s did reserve words (Algol W, Pascal). XPL had reserved words although otherwise it was essentially a subset of PL/I. I think that Basic had reserved words. The Joss family did not (Mumps, CAL). I did not mention Algol W, Pascal, and Basic simply because they're less well known than Ada etc.
- Best, --Macrakis (talk) 19:10, 10 February 2021 (UTC)
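The lexical-analysis point discussed above can be sketched roughly as follows. This is only an illustration in Python, not taken from any real compiler: the keyword set, the quote-mark stropping convention, and the function names `lex_reserved` and `lex_stropped` are all assumptions made for the example. It shows where a reserved-word lexer consults a keyword table and where a stropped lexer does not; it says nothing about whether the difference is measurable in practice.

```python
import re

# Hypothetical keyword set, chosen only for this illustration.
RESERVED = {"if", "then", "begin", "end"}


def lex_reserved(source):
    """Classify words for a language whose keywords are reserved."""
    tokens = []
    for word in re.findall(r"[A-Za-z_]\w*", source):
        # Every word is checked against the keyword table.
        kind = "keyword" if word.lower() in RESERVED else "identifier"
        tokens.append((kind, word))
    return tokens


def lex_stropped(source):
    """Classify words for a language whose keywords are written 'like this'."""
    tokens = []
    for strop, name in re.findall(r"'([A-Za-z]+)'|([A-Za-z_]\w*)", source):
        # The surrounding quote marks alone decide the token kind,
        # so the same spelling can still be used as an identifier.
        tokens.append(("keyword", strop) if strop else ("identifier", name))
    return tokens
```

With stropping, `lex_stropped("'if' if")` yields a keyword followed by an identifier of the same spelling, which a reserved-word scheme cannot express.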
- Hi Macrakis,
- I like your sub-heading 'No distinguished keywords'.
- I've rewritten the reserved word section to show that reserved words started even earlier than the 1960s.
- I think that having short paragraphs, as at present, for stropping and reserved words is a good idea since it gives readers enough detail to know what they are and they can then look at the main articles for more detail.
- Moving on, I'm wondering if the 'Semantics' section is too technical with reference to 'call-by-name' and 'lazy evaluation'.
- Cheers. Murray Langton (talk) 18:20, 11 February 2021 (UTC)
- Agreed, the 'semantics' section as written was a mess. I've tried simplifying it, but I'm sure it can be greatly improved in a variety of ways. For example, there is currently nothing about formal descriptions of semantics in the article. --Macrakis (talk) 19:52, 11 February 2021 (UTC)
intro to Syntax
Hi Macrakis,
I'm starting a new talk section just for 'Syntax', since we seem to have more or less reached agreement on the keywords issues. I'll think more about the 'Semantics' issue once this is sorted. The current paragraph at the start of Syntax reads:
"The appearance of statements shapes the look of programs. Programming languages are characterized by the type of statements they use (e.g. the curly brace language family). Many statements are introduced by keywords like if, while or repeat. Many languages have keywords that are reserved so that they cannot be used as names of variables or functions. Imperative languages typically use special syntax for each statement, which looks quite different from function calls. Common methods to describe the syntax of statements are Backus–Naur form and syntax diagrams."
Random thoughts for your consideration:
- need link to Syntax (programming languages).
- similar link to Semantics (computer science) will be needed in the Semantics section.
- this is a fair chunk for one paragraph - should it be broken up a little?
- distinguish: "the syntax or grammar defines the appearance of a program while semantics defines the meaning." Should this or something similar go at the start of the article, just after your sentence distinguishing simple and compound statements?
- curly brace family really refers to the way in which statements are grouped rather than a type of statement, though I agree that it profoundly affects the appearance of a program.
- should we mention that Backus–Naur form is a text-based grammar, while syntax diagrams are a graphical representation?
- should we mention that they are equivalent in descriptive power, but that some people find one form or the other easier to understand?
Murray Langton (talk) 12:39, 13 February 2021 (UTC)
- Some thoughts:
- The appearance of statements shapes the look of programs.
- Silly and vacuous. Remove.
- Programming languages are characterized by the type of statements they use
- Vacuous. Of course PLs are characterized by the statements they use.
- (e.g. the curly brace language family).
- I suppose it's useful to say that language syntax is usually largely inherited from previous languages: the Algol family, the Fortran family, the FLOW-MATIC/COBOL family, the COMIT/SNOBOL family, the C family (aka CBL), etc. Though an extended discussion of this belongs elsewhere.
- Many statements are introduced by keywords like if, while or repeat.
- OK, this begins to have some meat.
- This probably belongs after the next sentence.
- Imperative languages typically use special syntax for each statement, which looks quite different from function calls.
- Is the "imperative" here supposed to be excluding Lisp? First of all, most Lisp programs actually have a significant imperative component. Secondly, Lisp also has special syntax -- think of COND and for that matter SETQ....
- Common methods to describe the syntax of statements are Backus–Naur form and syntax diagrams.
- Unexceptionable.
- Those are some quick thoughts.... --Macrakis (talk) 15:57, 13 February 2021 (UTC)
- Draft for your consideration and comments:
- Apart from assignments and subroutine calls, most languages start each statement with a special word (e.g. goto, if, while, etc.) as shown in the above examples. Various methods have been used to describe the form of statements in different languages; the more formal methods tend to be more precise:
- Cobol used a two-dimensional metalanguage.[3]
- Algol 60 used Backus–Naur form (BNF) which set a new level for language grammar specification.[4]
- Pascal used both syntax diagrams and equivalent BNF.[5]
- BNF uses a lot of recursion to express repetition so various extensions have been proposed to allow direct indication of repetition.
- ^ ANSI FORTRAN 66 standard. "FORTRAN 66" (PDF). Retrieved February 19, 2021.
- ^ ANSI FORTRAN 95 standard. "Fortran95" (PDF). Retrieved February 19, 2021.
- ^ Cobol manual. "COBOL" (PDF). Retrieved January 23, 2021.
- ^ Revised ALGOL 60 report, section 1.1. "ALGOL 60". Retrieved January 23, 2021.
- ^ Pascal User Manual and Report, Appendix D. "Pascal" (PDF). Retrieved February 19, 2021.
- Looks good to me... but no need to propose like this on the talk page in general -- be bold and other editors may edit your contributions. --Macrakis (talk) 17:13, 19 February 2021 (UTC)
- Hi Macrakis. Sure, for smaller changes I'm quite happy to just put them up. For a more substantial change it makes it easier for me to sort out my ideas, and give others a chance to say that I'm completely on the wrong track. In this case it gave me a chance to sort out the references. I like the improvements you have made. Murray Langton (talk) 20:04, 19 February 2021 (UTC)
Fortran ambiguity
I see no reason to mention that DO10I is an "implicitly declared" variable. Whether variables need to be declared or not (they don't in Fortran) is completely orthogonal to the syntactic ambiguity here. Yes, if Fortran required all variables to be declared, this ambiguity would be unlikely to happen, but I don't think that's relevant here. --Macrakis (talk) 17:28, 15 February 2021 (UTC)
- OK, I see your point. Murray Langton (talk) 17:45, 15 February 2021 (UTC)
intro to Semantics
Just some first thoughts on rewriting the Semantics section. I expect to develop these over the next week or so. Feel free to make/suggest improvements.
Current section reads:
- "In most programming languages, most statements cannot be modeled as subroutine calls. For example, the assignment statement a = b treats a as an L-value, but b as an R-value; the loop statement while X do Y may execute X and Y multiple times. Some programming languages do support non-strict mechanisms which allow modeling such cases."
While the above is correct I'm not sure how relevant it is to semantics/meaning. I also suspect that this is almost meaningless to the casual reader.
Draft for improvement:
Semantics is concerned with the meaning of a program. The standards documents for many programming languages use BNF or some equivalent to express the syntax/grammar in a fairly formal and precise way, but the semantics/meaning of the program is generally described using examples and English prose. This can result in ambiguity.[1] In some language descriptions the meaning of compound statements is defined by the use of 'simpler' constructions, e.g. a while loop can be defined by a combination of tests, jumps, and labels, using if and goto.
The semantics article describes several mathematical/logical formalisms which have been used to specify semantics in a precise way; these are generally more complicated than BNF, and no single approach is generally accepted as the way to go. Some approaches effectively define an interpreter for the language, some use formal logic to reason about a program, some attach affixes to syntactic entities to ensure consistency, etc.
References
- ^ Trouble spots in Algol 60. "Trouble Spots" (PDF). Retrieved February 24, 2021.
Expressions
I now turn my attention to the section 'Expressions'. This section currently reads:
Current version
In most languages, statements contrast with expressions in that statements do not return results and are executed solely for their side effects, while expressions always return a result and often do not have side effects at all.
For example:
- A statement
print('Hello, World.')
- An expression:
X=your data
print (X)
Among imperative programming languages, Algol 68 is one of the few in which a statement can return a result. In languages that mix imperative and functional styles, such as the Lisp family, the distinction between expressions and statements is not made: even expressions executed in sequential contexts solely for their side effects and whose return values are not used are considered 'expressions'. In purely functional programming, there are no statements; everything is an expression.
This distinction is frequently observed in wording: a statement is executed, while an expression is evaluated. This is found in the exec and eval functions found in some languages: in Python both are found, with exec applied to statements and eval applied to expressions.
A statement is an instruction that the Python interpreter can execute. We have only seen the assignment statement so far. Some other kinds of statements that we’ll see shortly are while statements, for statements, if statements, and import statements. (There are other kinds too!)
An expression is a combination of values, variables, operators, and calls to functions. Expressions need to be evaluated. If you ask Python to print an expression, the interpreter evaluates the expression and displays the result.
Comments on the above
In first paragraph note that a commonly desired side-effect is assignment.
The examples given are somewhat confusing.
This article is about imperative programming languages so can omit mention of Lisp and functional languages.
Should keep mention that <statement is executed while expression is evaluated>.
I'm inclined to remove the discussion of Python.
- I agree -- that section should be completely re-written. Using print and assignment as examples is perverse. Yes, there are languages where print is a function, and languages where it is a statement; yes, there are languages where an assignment can be part of an expression. But better to start with the simple cases. The Python business is off-topic and confusing. "We have only seen...", "we'll see shortly", and "if you ask Python..." are not WP style. --Macrakis (talk) 22:16, 9 March 2021 (UTC)
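For anyone following the thread, the "executed versus evaluated" wording mentioned above can be checked with Python's built-in exec and eval. The snippet below is only a small sketch for this discussion; the variable names (`value`, `namespace`, `result`, `rejected`) are made up for the example and it is not proposed article text.

```python
# eval() accepts an expression and returns its value.
value = eval("2 + 3")

# exec() runs statements; it returns None, so any effect must be
# observed through the namespace it was given.
namespace = {}
result = exec("x = 2 + 3", namespace)

# An assignment statement is not an expression, so eval() rejects it.
try:
    eval("x = 2 + 3")
    rejected = False
except SyntaxError:
    rejected = True
```

Here `value` and `namespace["x"]` both end up holding the same number, but only eval hands it back as a result.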
Proposed new wording
A distinction can be made between statements, which are executed, and expressions, which are evaluated. The value obtained from an expression is often used as part of a statement e.g. assignment variable := expression;
Some programming languages (e.g. C, C++) allow some statements to provide a result (technically all statements provide a result, but this result is often of type 'void' and can't be used for anything). The most useful statement which provides a result is an assignment, the result being the value just assigned.
This can be useful for multiple initialisation:
i = j = 0; which is treated as i = (j = 0);
It can also result in simple finger trouble completely changing the meaning of some code:
if (i == j) {. . . }; tests to see if i is equal to j
if (i = j) { . . . }; assigns the value of j to i and then tests to see if that value is non-zero.
Some languages (Algol 60, Pascal) allow multiple assignment but don't allow assignments to appear in expressions.
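As a rough illustration of a language in this last category, Python (not one of the languages named above, and chosen here only because its parser is easy to inspect) accepts multiple assignment as a single statement but rejects a plain assignment nested inside an expression. The sketch below uses the standard ast module; the variable names are arbitrary.

```python
import ast

# Multiple assignment parses as one assignment statement with two targets.
tree = ast.parse("i = j = 0")
stmt = tree.body[0]
is_assign = isinstance(stmt, ast.Assign)
target_names = [target.id for target in stmt.targets]

# A plain assignment used as a subexpression is a syntax error.
try:
    ast.parse("i = (j = 0)")
    nested_ok = True
except SyntaxError:
    nested_ok = False
```

This only shows how Python's grammar treats the two forms; Algol 60 and Pascal use their own notation for multiple assignment.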
Comments/Objection to the above
The above "Proposed new wording" (which appears to now be the current wording) is simply wrong. No statements in C or C++ evaluate to any value. In the example given, j = 0 is an *expression* and not a statement, which is the reason it is allowed on the right side of the assignment operator. i = j = 0; is a statement and it does not evaluate to any value, void or otherwise. Consider the following *invalid* code example:
i = (j = 0;);
This will always result in a compiler error because the *statement* j = 0; does not evaluate to a value, so it cannot be used to assign to i.
Note the difference between the expression j = 0 and the statement j = 0; (with the semicolon). Just because statements can be constructed of an expression (which returns a value) followed by a statement terminator (semicolon) it does *not* mean that the statement returns a value.
Additionally, is "simple finger trouble" normal parlance? I have never heard it before, and I am afraid to Google it to find out. 63.115.17.182 (talk) 20:46, 17 October 2022 (UTC)