Talk:Low-level programming language
This level-5 vital article is rated Start-class on Wikipedia's content assessment scale.
Examples
This page could use some examples. Thehotelambush 16:09, 30 July 2007 (UTC)
- Example - a function that calculates the nth Fibonacci number.
- First, in C [3rd generation]:
    unsigned int fib(unsigned int n)
    {
        if (n <= 0)
            return 0;
        else if (n <= 2)
            return 1;
        else {
            int a, b, c;
            a = 1;
            b = 1;
            while (1) {
                c = a + b;
                if (n <= 3)
                    return c;
                a = b;
                b = c;
                n--;
            }
        }
    }
- The same function, but converted to x86 assembly language, specifically MASM, using the __cdecl calling convention [2nd generation]:
    fib:
        mov edx, [esp+8]
        cmp edx, 0
        ja @f
        mov eax, 0
        ret
    @@:
        cmp edx, 2
        ja @f
        mov eax, 1
        ret
    @@:
        push ebx
        mov ebx, 1
        mov ecx, 1
    @@:
        lea eax, [ebx+ecx]
        cmp edx, 3
        jbe @f
        mov ebx, ecx
        mov ecx, eax
        dec edx
        jmp @b
    @@:
        pop ebx
        ret
- Note: @f refers to the closest following @@: label. @b refers to the closest preceding @@: label.
- I deliberately did not use proc.
- And here is the final x86 machine code [1st generation]:
8B542408 83FA0077 06B80000 0000C383 FA027706 B8010000 00C353BB 01000000 B9010000 008D0419 83FA0376 078BD98B C84AEBF1 5BC3
- For comparison purposes, here is a disassembly listing:
    00401000              fib:
    00401000 8B542408     mov edx,[esp+8]
    00401004 83FA00       cmp edx,0
    00401007 7706         ja loc_0040100F
    00401009 B800000000   mov eax,0
    0040100E C3           ret
    0040100F              loc_0040100F:
    0040100F 83FA02       cmp edx,2
    00401012 7706         ja loc_0040101A
    00401014 B801000000   mov eax,1
    00401019 C3           ret
    0040101A              loc_0040101A:
    0040101A 53           push ebx
    0040101B BB01000000   mov ebx,1
    00401020 B901000000   mov ecx,1
    00401025              loc_00401025:
    00401025 8D0419       lea eax,[ecx+ebx]
    00401028 83FA03       cmp edx,3
    0040102B 7607         jbe loc_00401034
    0040102D 8BD9         mov ebx,ecx
    0040102F 8BC8         mov ecx,eax
    00401031 4A           dec edx
    00401032 EBF1         jmp loc_00401025
    00401034              loc_00401034:
    00401034 5B           pop ebx
    00401035 C3           ret
- By the way, the Fibonacci sequence breaks at precisely F47.
- unsigned int fib(unsigned int n): -1323752223
- unsigned long long int fib(unsigned int n): 18446744072385799393
- actual F47: 2971215073
- Just thought I'd let you know. 174.28.41.236 (talk) 18:24, 2 January 2014 (UTC)
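The F47 figures above are consistent with the `int a,b,c` locals in the C version. F47 = 2971215073 still fits in 32 unsigned bits but is larger than the largest 32-bit signed value (2147483647), so the same bit pattern read as signed is -1323752223, and sign-extending that to 64 bits gives 18446744072385799393. The following is a minimal sketch of that arithmetic, assuming a 32-bit `int` as on x86. The names `fib_u32` and `as_signed32` are hypothetical helpers, not part of the original example, and the sketch uses unsigned arithmetic because signed overflow is undefined behaviour in C.

```c
#include <stdint.h>

/* Iterative Fibonacci in unsigned 32-bit arithmetic.
   Unsigned addition wraps modulo 2^32, which is well defined in C.
   Convention: F(0) = 0, F(1) = F(2) = 1. */
uint32_t fib_u32(unsigned int n)
{
    uint32_t a = 0, b = 1;            /* a = F(0), b = F(1) */
    while (n-- > 0) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;                          /* F(n) modulo 2^32 */
}

/* Hypothetical helper: the value a 32-bit two's complement signed
   integer with the same bit pattern would hold, computed portably. */
int64_t as_signed32(uint32_t v)
{
    return v >= UINT32_C(0x80000000)
               ? (int64_t)v - INT64_C(4294967296)
               : (int64_t)v;
}
```

With these helpers, `fib_u32(47)` is 2971215073, `as_signed32(fib_u32(47))` is -1323752223, and converting that negative value to `uint64_t` gives 18446744072385799393. F46 = 1836311903 is the last term that fits in a signed 32-bit integer, and F48 is the first that no longer fits in an unsigned one.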
Absolute terms
I've always known low-level and high-level programming languages to be absolute terms. The former refers to programming languages that provide no abstraction from the instructions that the processor implements: binary code (perhaps written in hexadecimal) and assembly language, which is a mnemonic form of the binary code and is assembled. The latter refers to programming languages that can be used independently of the processor because a software layer, such as a compiler or an interpreter, abstracts that detail (something an assembler does not do).
The crisp and absolute difference between low-level and high-level programming languages, as far as I knew, is whether the code is written for specific hardware (or a group of hardware), such as the x86 family of processors, or for specific software (or a group of software), such as the group of C compilers.
Unfortunately I haven't been able to find references, as this is more a terminological matter than an informatics matter; that's why I've requested the citation (as there may be one). All I have so far is a definition in Webopedia[1].
Kind regards. —Preceding unsigned comment added by Trylks (talk • contribs) 23:26, 19 February 2008 (UTC)
Removed two-year old citation tag
There are no controversial statements of fact in this article, so no citations are needed. Nor is the Utah State Office of Education reference needed. "a low level language is one that does not need a compiler or interpreter to run" is a simple statement of fact. "Mars is a planet" does not need an "according to..." reference. The article may be a stub, but what's there is an example of content which does not need references, besides the one instance with the citation tag. And sometimes, especially in the long-winded Wikipedia, a term can be explained perfectly well with a stub. J M Rice (talk) 13:37, 11 October 2008 (UTC)
Old citation
It's rediculous to require a citation for the statement that programmers rarely code in machine code. That's like requiring a citation to state that Elvis is dead.
- "Rediculous" or not, it's Wikipedia policy. Diego (talk) 10:26, 27 October 2008 (UTC)
Low-level programming in high-level languages - needs edit, possible removal
I have moved the C example code out of this section and provided a point by point comparison between it and the assembly language example, to emphasise the difference between high level abstraction and low level architecture specifics. The C example served absolutely no purpose in this section - it is absolutely not an example of low-level programming in a high level language.
I do not think this section is very useful as currently written. It makes no mention of the most pertinent mechanism (inline assembler) and I think causes confusion by mostly referring to systems programming. Systems programming is not by definition low-level programming. UNIX (and UNIX-like) operating systems are written in C but almost all of that code is completely portable. Porting such an operating system to a specific architecture mostly means simply specifying values for operating system constants and in some places choosing implementations which will operate most efficiently on a particular architecture (over implementations which would function poorly but would work). Even those parts which are architecture-specific are mostly high-level C code written with an awareness of the architecture rather than code which directly manipulates that architecture in a low-level way. Actual low-level programming is a very small part of operating system design. As it stands, this section is vague in its wording and uncertain in its purpose.
I think this also applies to the "relative meaning" section, which seems equally ill informed. The beginning of the article gives a clear definition of "low level". This is computer-science jargon, where words and phrases which might be ambiguous in general language have specific meanings. Whoever wrote that doesn't realise that C is still considered a high level language. The emergence of more abstract languages hasn't changed that basic fact.
I am going to make the following changes to this section:
- Add mention of inline assembler and other techniques for low-level hardware control by high level programmes.
- Remove mention of Forth (off topic).
- Remove the paragraph which is entirely devoted to systems programming in C, does not contribute to the definition of Low-level programming languages and is off topic.
- Remove the small bullet list of other types of programming languages. It is not relevant to this section, very limited and entirely redundant given the automatically-generated category-relevant lists at the bottom of the page.
Itsbruce (talk) 17:39, 25 August 2014 (UTC)
- I am not sure, by now, how well low and high are defined. Seems to me that there needs to be a name for something above assembler, but that still allows for low-level coding at the register level, and other machine-specific abilities. PL/360, for example, allows operations on specific machine registers. As far as I know, BLISS and PL/S also allow specification at the hardware level. PL/M does similarly for the 8080 and 8086. I believe some C compilers allow writing interrupt routines, with, for example, proper interrupt enable and disable. While C is still a high-level language, one can cast an integer constant to a pointer, and read or write from specific hardware addresses. On machines with memory-mapped I/O, one can do I/O operations directly. Those are things one normally is not able to do in a high-level language, yet C allows for them. Gah4 (talk) 13:04, 11 July 2019 (UTC)
- ESPOL is another high-level language that supports hardware-specific operations - including, with POLISH, what amounts to inline assembler. Guy Harris (talk) 22:28, 11 October 2023 (UTC)
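The point made in this thread about casting an integer to a pointer and reading or writing a specific address can be illustrated with a short C sketch. Everything named here is hypothetical: `fake_status_reg` is an ordinary variable standing in for a device register so that the code runs on a hosted system, and `reg_write32`, `reg_read32` and `fake_status_addr` are illustrative helpers, not an existing API. On real hardware the integer address would come from the device documentation instead.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped device register.
   It is a normal variable here so the sketch can run anywhere. */
static volatile uint32_t fake_status_reg;

/* Integer form of the stand-in register's address, as one might
   otherwise take from a hardware manual. */
uintptr_t fake_status_addr(void)
{
    return (uintptr_t)(volatile void *)&fake_status_reg;
}

/* Write a 32-bit value to the register at an integer address.
   volatile tells the compiler not to remove or reorder the access. */
void reg_write32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Read a 32-bit value from the register at an integer address. */
uint32_t reg_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}
```

The integer-to-pointer conversion is implementation-defined in standard C, which fits the view expressed above that this kind of access is something implementations permit rather than something the language abstracts.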
more examples
Seems to me that languages like PL/360 and PL/M should be described here. They are designed to look like higher-level languages, but operate more like an assembler. In PL/360, variables with names like R1 and R2 represent actual processor registers. Even more, the translation is more direct than for high-level languages. In PL/360, for example, R1:=R1+R1+R1; compiles to LR R1,R1; AR R1,R1; AR R1,R1 and leaves R1 holding four times its previous value, because the already-modified value is used for the last AR. Gah4 (talk) 18:57, 29 June 2018 (UTC)
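The difference between the PL/360 reading and the usual high-level reading of R1 + R1 + R1 can be sketched in C. This is only a toy model under stated assumptions: `Regs`, `lr`, `ar`, `pl360_sum` and `hll_sum` are hypothetical names, the register file is a plain array, and unsigned wraparound is used in place of the real machine's signed add.

```c
#include <stdint.h>

/* Toy register file: R0..R15 (an assumption for illustration only). */
typedef struct {
    uint32_t r[16];
} Regs;

/* LR dst,src: load register dst from register src. */
static void lr(Regs *m, int dst, int src)
{
    m->r[dst] = m->r[src];
}

/* AR dst,src: add register src into register dst. */
static void ar(Regs *m, int dst, int src)
{
    m->r[dst] += m->r[src];
}

/* The instruction sequence quoted above for R1 := R1 + R1 + R1. */
uint32_t pl360_sum(uint32_t initial)
{
    Regs m = {{0}};
    m.r[1] = initial;
    lr(&m, 1, 1);   /* R1 unchanged                         */
    ar(&m, 1, 1);   /* R1 = 2 * initial                     */
    ar(&m, 1, 1);   /* R1 = 4 * initial: adds the new value */
    return m.r[1];
}

/* What a high-level language would compute for x + x + x. */
uint32_t hll_sum(uint32_t x)
{
    return x + x + x;
}
```

For an input of 5, `pl360_sum` gives 20 while `hll_sum` gives 15, which is the four-times versus three-times difference described above.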
C as a low-level language
There is no mention of C being used as a low-level language. C could be used as a high-level language, but the modern day go-to low-level programming language is pretty much C.
- People publishing books about "low-level programming" with C in the title.
- Columbia University offering course in Low-level programming with C
Based on my observation, the only people who potentially disagree about C being low level are hardware developers. However, if this article only takes their point of view, then it basically fails NPOV. --Voidvector (talk) 07:54, 27 December 2019 (UTC)
- No idea about the WP:NPOV, but I do agree that C is commonly used as a low-level language. One common example is assigning addresses (often of memory-mapped I/O devices) to pointers, and then dereferencing them. While the standard is meant for use as a high-level language, that doesn't mean people use it that way. Gah4 (talk) 00:32, 31 December 2019 (UTC)
- C is often used to control low-level machine details; however, that statement does not imply that C is a low-level programming language. For instance, many languages uncontroversially considered high-level (e.g. Haskell, Java) can also perform low-level operations if suitable abstractions are provided. As evidence, House is an operating system written in Haskell. I argue the correct way to think about C is as a high-level language with implementations that often provide out-of-the-box support for low-level operations. I suggest that this article can be improved by mentioning very high-level languages so as to distinguish C from languages like Java and Haskell. — Preceding unsigned comment added by 65.60.139.183 (talk) 14:00, 27 March 2020 (UTC)
- C is not just a low-level language, it is a primitive language with lowest-common-denominator facilities like pointers dressed up with defines. This is actually non-semantic, giving it the feeling of non-semantic low-level coding.
- However, the lack of semantics is also lack of hardware semantics as well. This leads to comments like: Alan Perlis (first recipient of the ACM Turing Award for computing): “A programming language is low level when its programs require attention to the irrelevant. While, yes, this definition applies to C, it does not capture what people desire in a low-level language.”
- This is what the hardware people are getting at.
- Because of the primitive features of C, it can't be considered a high-level language. It is a coding language dressed in structured syntax. It is the compromises (some of which are needed for system programming, but many that are just bad compromises) that make C not a high-level language. Ian.joyner (talk) 02:14, 20 September 2024 (UTC)
This leads to comments like: Alan Perlis (first recipient of the ACM Turing Award for computing): “A programming language is low level when its programs require attention to the irrelevant. While, yes, this definition applies to C, it does not capture what people desire in a low-level language.”
The text in quotes is not a quote from Alan Perlis. It is a combination of a quote from Alan Perlis, "A programming language is low level when its programs require attention to the irrelevant.", followed by a reply from David Chisnall, "While, yes, this definition applies to C, it does not capture what people desire in a low-level language." Perlis' quote comes from his Epigrams in Programming, and Chisnall's response from his ACM Queue article, "C Is Not a Low-level Language". Some of what Chisnall is saying there may apply to other programming languages with a sequential view of the world.
- Perhaps you should have also pointed to what I suspect may be your own criticisms of C++, including criticisms of C that also apply to C++ (which also include complaints about C in areas other than what you describe above, such as syntactic complaints about #defines and the assignment operator). Guy Harris (talk) 08:53, 20 September 2024 (UTC)
Very popular, little development
This article gets a lot of traffic and is especially underdeveloped. I am just remarking on this; I have no ideas for organizing editing of this topic. Bluerasberry (talk) 23:15, 12 June 2022 (UTC)
Only one source present in majority of the article
The entire introduction section and portions of the Machine Code section are almost entirely copied from this source, with few changes made. Bxshan 09:04, 3 April 2023 (UTC)
Wiki Education assignment: Introduction to Digital Humanities Spring 2024
This article was the subject of a Wiki Education Foundation-supported course assignment, between 15 January 2024 and 10 May 2024. Further details are available on the course page. Student editor(s): Phoenixpinpoint (article contribs).
— Assignment last updated by Phoenixpinpoint (talk) 21:20, 21 April 2024 (UTC)
Very poor wiki page
It is hard to know where to begin on this page. It was posted to disagree with my assertion that languages using structured syntax (high-level syntax) do not define high-level programming languages.
This article is based on the wrong assumption and thinking that only machine language or assembler is low level.
Reference 1 goes to some African University page with an article only covering machine code and assembler (although there is a little confusion in it, it is never clarified).
The Fibonacci examples are irrelevant. They don't help with the definition.
The C example is only about syntactic appearance. C is absolutely not just a low-level language but a primitive language with pointers (exposing memory, which high-level languages should not, by definition) and macros (actually from Burroughs ALGOL).
If the function of a language is to handle platform and machine details, it is a low-level language. High-level languages do not touch those details - they concentrate on the abstract conceptual level, not on the physical implementation level.
There is a clear distinction between low-level and high-level programming. This article fails to get it.
The comments on the Burroughs languages of ALGOL and ESPOL (replaced long ago by NEWP) are inaccurate. These languages go back to 1961, not late 1960s. C did not come out until 1974. ALGOL and ESPOL (NEWP) run on machines that have no assembler. They do everything an assembler might cover. Since they do that they are low-level system languages.
They also do not rely on any inline assembler code (there is no assembler). Ian.joyner (talk) 08:16, 10 July 2024 (UTC)
- There definitely needs to be something between assembler and C. And inline assembler isn't really a low-level language, but a mix of two languages. I never programmed BLISS, but looking at the page, that could be one. Personally, I always liked PL/360, which is pretty much an S/360 assembler with PL/I syntax. One way to see the difference is with R1 := R1 + R1 + R1; which multiplies R1 by four. No real HLL would do that! It compiles as LR R1,R1; AR R1,R1; AR R1,R1. Well, for one, R0 through R15 are the names of the registers, not general variables. Gah4 (talk) 17:44, 10 July 2024 (UTC)
If the function of a language is to handle platform and machine details, it is a low-level language. High-level languages do not touch those details - they concentrate on the abstract conceptual level, not on the physical implementation level.
C's function is not to handle platform and machine details. It allows doing it, as does C++, but, at this point, the vast majority of C code is probably machine-independent, and much of that is platform-independent as well - even the majority of kernel-mode code written in C is probably machine-independent. Guy Harris (talk) 22:00, 3 August 2024 (UTC)
- A lot of C code isn't as portable as it should be. Often it assumes specific sizes for types, or specific endianness. Some won't realize that until they try porting it. Gah4 (talk) 19:41, 20 September 2024 (UTC)
- If that code is reading from or writing to files, or sending or receiving network packets, or dissecting raw file or network packet contents, it can be easy to write code that "works on [your] PC", i.e. that assumes little-endianness. It might be harder to do so in languages that don't let you just cast a pointer to a blob of memory to a pointer to an integral or structure data type and access the value through that pointer, as those languages may force you to, for example, extract a 4-octet integral value an octet at a time and combine them into an integral value, which means you have to do that differently for big-endian and little-endian values in the file or over the network, and the result will be correct on machines of any byte order. (Fun fact: newer versions of both GCC and Clang/LLVM detect that idiom and generate loads, or loads and byte-swap instructions, if that works on the target processor.)
- I suspect that's the main source of endianness issues, at least, so C and languages derived from it (e.g., C++) may be at fault for making it too easy to cheat. (Note that this issue involves I/O, which I suspect is a rock on which many computing abstractions have crashed.)
- I'm not sure what the causes of assuming specific sizes for types are, but having data types that are just "this holds an integer", without specifying the range of the integers it can hold, may not help. That one may not be unique to the C family of languages; it can be a property of languages that purportedly "concentrate on the abstract conceptual level" but have bits of the "physical implementation level", such as the word size of the processor, poke through. (For example, Pascal allows the programmer to express a range, but doesn't require it; the default range is implementation-defined.) Bignums FTW! Guy Harris (talk) 20:39, 20 September 2024 (UTC)
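The octet-at-a-time extraction described in this comment can be sketched in C as follows. The function names `load_be32` and `load_le32` are hypothetical; the point of the sketch is that the shifts operate on values rather than on memory layout, so the result is the same on a host of either byte order, and `uint32_t` avoids assuming a particular size for `int`.

```c
#include <stdint.h>

/* Combine four octets stored most significant first (big-endian,
   as in network byte order) into a native 32-bit value. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Combine four octets stored least significant first (little-endian)
   into a native 32-bit value. */
uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]        |
           ((uint32_t)p[1] << 8)  |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

For the octets 0x12 0x34 0x56 0x78, `load_be32` yields 0x12345678 and `load_le32` yields 0x78563412 regardless of the processor running the code, whereas casting the buffer pointer to `uint32_t *` would give a host-dependent answer.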
- Network byte order being big-endian makes it a little harder to accidentally get it right, with little-endian processors. But yes in the case of many file formats. This goes back at least to Fortran 66, though. Fortran 66 A format I/O stores characters left (low address) justified. While the standard doesn't allow for comparison, or even assignment, of such data, programs often did, and do, exactly that. Java has fixed sizes for data types, and you aren't supposed to be able to tell which byte order the underlying processor uses. You can still put bytes together, though, and write them out in the wrong order. Gah4 (talk) 00:35, 24 September 2024 (UTC)
- Putting bytes together (shifting and ORing) is for reading a value that's in a specified byte order and putting it into a native byte order value; pulling them apart (shifting and masking) is for writing it out in a specified byte order. By "write them out in the wrong order" do you mean you assemble the bytes when reading and then just dump them out in whatever byte order that happens to be in, rather than disassembling them and writing them in the intended order?
- Also, for Java, I've seen some stuff on the web that indicates that the multi-byte-numeric-value methods of `DataInputStream` read big-endian values, and the multi-byte-numeric-value methods of `DataOutputStream` appear to be documented to write big-endian values, which might further suggest that the readers read big-endian values. I don't know if you can use that for networking I/O (or, at least, TCP socket I/O), or whether there's anything for use with packet-oriented rather than stream-oriented protocols. I also don't know if they have anything to handle little-endian data. Guy Harris (talk) 05:50, 24 September 2024 (UTC)
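The "pulling them apart (shifting and masking)" half of the comment above, for writing a value out in a specified byte order, might look like this in C. The names `store_be32` and `store_le32` are hypothetical, and the sketch assumes the caller supplies a buffer of at least four octets.

```c
#include <stdint.h>

/* Write v into out[0..3] most significant octet first (big-endian). */
void store_be32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8)  & 0xFF);
    out[3] = (unsigned char)( v        & 0xFF);
}

/* Write v into out[0..3] least significant octet first (little-endian). */
void store_le32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)( v        & 0xFF);
    out[1] = (unsigned char)((v >> 8)  & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

Dumping the native in-memory representation instead, for example by writing `&v` directly to a file, is the shortcut that produces output in whatever byte order the host happens to use.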