Talk:Shebang (Unix)
This is the talk page for discussing improvements to the Shebang (Unix) article. This is not a forum for general discussion of the article's subject.
This article is rated C-class on Wikipedia's content assessment scale.
Wikipedia's Gestapo Edit Reviews
I've made a number of small edits to this page as well as to other pages. So far nearly all of them have been rolled back, including ones like this where a citation was requested. We'll see if the change I made gets rolled back or not. There is a request for original research to be reviewed. I will not invest any time in this if even my minor changes get rolled back. I've been developing software for over 20 years and am qualified to review and improve this page. But not if anything I do gets rolled back!
Disambiguation
Perhaps you were looking for: Ricky Martin - She Bangs! https://www.youtube.com/watch?v=5ihtX86JzmA 207.216.37.201 (talk) 06:51, 22 May 2015 (UTC)
Alternate names
An edit was reverted because the sources were claimed to be unreliable; however, they are cited directly by the author of TLDP's Advanced Scripting Guide -- http://tldp.org/LDP/abs/html/sha-bang.html#FTN.AEN201 84.93.143.206 (talk) 21:38, 19 January 2012 (UTC)
- Just as a continuation - searching Google Books comes up with several publications calling a shebang a hash pling, pound-bang, and sha-bang. 84.93.143.206 (talk) 21:47, 19 January 2012 (UTC)
- Citing a source doesn't confer reliability upon it, so even if it's cited by a reliable source, http://www.in-ulm.de/~mascheck/various/shebang should not be used to reference any facts in this article. If the facts are actually included in the reliable source, the reliable source can of course be used. If the facts are in books, you can use those too.
- It does look like http://www.in-ulm.de/~mascheck/various/shebang could be a suitable external link. – Pnm (talk) 22:20, 19 January 2012 (UTC)
Ajax
I can't work out what the last sentence in the lead paragraph is saying:
The "shebang" or "hashbang" name is also sometimes used of state-preserving fragment identifiers in Ajax applications; Google Webmaster Central specifies that fragment identifiers starting with an exclamation point (...url#!state...) are indexed specially by the Googlebot.
Should this be 'sometimes used in', 'sometimes used as', or 'sometimes used instead of' or something else? 81.98.43.107 (talk) 17:11, 1 February 2012 (UTC)
- Clarified a bit. Lorem Ip (talk) 00:25, 2 February 2012 (UTC)
env security
I think the phrasing about env security is unnecessarily convoluted. It is also quite a free interpretation of the source. My understanding is that the env approach is dangerous only if you use it for suid programs (or similar) without checking PATH, which is a recipe for disaster anyway. --LPfi (talk) 12:05, 27 April 2012 (UTC)
- Agreed. I’ve introduced some alternative wording. Ewx (talk) 07:44, 28 April 2012 (UTC)
UTF-8 identification
The page now says "UTF-8 can reliably be recognised as such by a simple algorithm".
This is wrong: UTF-8 cannot be reliably distinguished from any other encoding that uses 8 bits per character unless the file happens to contain an invalid UTF-8 sequence. As an example, "aÃ¨!" in ISO-8859-1 is "aè!" if read as UTF-8. There is no way for an algorithm to say it's one way or the other. — Preceding unsigned comment added by 188.65.1.1 (talk) 10:33, 31 May 2012 (UTC)
- If there are more than a few non-ASCII bytes and there is no invalid UTF-8, the probability that it is not UTF-8 is very low. No mathematical evidence is needed.--BIL (talk) 11:08, 31 May 2012 (UTC)
- Might be reported to UTF-8 page? — Preceding unsigned comment added by 84.100.195.219 (talk) 23:37, 22 June 2012 (UTC)
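As a rough illustration of the two positions above, here is a minimal Python sketch. The helper name looks_like_utf8 is hypothetical and not taken from the article; it only checks validity, which is the "simple algorithm" under discussion. The same bytes decode without error under both encodings, while a genuine ISO-8859-1 encoding of the same text is rejected as UTF-8.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence (validity only, no guessing)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The bytes 0x61 0xC3 0xA8 0x21 are valid in both encodings.
sample = "aè!".encode("utf-8")
as_utf8 = sample.decode("utf-8")          # three characters
as_latin1 = sample.decode("iso-8859-1")   # four characters, every byte maps to something

# The ISO-8859-1 encoding of the same text is 0x61 0xE8 0x21,
# which is not a valid UTF-8 sequence.
latin1_only = "aè!".encode("iso-8859-1")
```

So a validity check can rule UTF-8 out, but a pass only makes UTF-8 likely, and more likely the more non-ASCII bytes the file contains.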
Unneeded Shebang
Some people say the BOM is unneeded in UTF-8, as UTF-8 has no specific byte order.
On old Unix, the shebang is not compatible with a BOM.
Nowadays, text files are generally written in UTF-8, and often with a BOM.
As a result, files with both a BOM and a shebang do not work on old Unix.
But scripts with both a BOM and a shebang can still be launched by specifying the interpreter on the command line.
So my question:
How is Shebang unneeded as the interpreter generally known and provided by the caller? — Preceding unsigned comment added by 77.199.89.101 (talk) 13:36, 2 July 2012 (UTC)
"Nowadays, text files are generally written in UTF-8, and often with BOM." Agreed on the first claim, since any ASCII file is also UTF-8. But the second ... can you please name one widely used Unix file that's UTF-8 with a BOM? For example, in one of the thousands of Debian packages? JöG (talk) 04:46, 2 February 2013 (UTC)
As for your specific question: the shebang is needed if you intend to use it. The exec family of functions can only run a few different types of programs. Executables are one type; text files with a shebang line are another. exec() doesn't try to guess. JöG (talk) 04:46, 2 February 2013 (UTC)
- The wrappers in libc (execvp, etc) will attempt to use the shell if execve does not recognize the file format. So arguably there is some (limited) guessing going on. Ewx (talk) 10:35, 3 February 2013 (UTC)
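The behaviour discussed in this thread can be observed with a small Python sketch. It assumes a Linux system, /bin/sh, and a temporary directory that allows execution; the helper name try_exec is hypothetical. Python's subprocess module executes the file directly, without the shell fallback that the libc wrappers mentioned above perform.

```python
import errno
import os
import subprocess
import tempfile


def try_exec(content: bytes) -> str:
    """Write content to an executable temp file, run it directly, and report the outcome.

    Returns the script's stdout on success, or the symbolic errno name if the
    exec call itself fails.
    """
    fd, path = tempfile.mkstemp(suffix=".sh")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        os.chmod(path, 0o755)
        try:
            result = subprocess.run([path], capture_output=True, text=True)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return result.stdout.strip()
    finally:
        os.unlink(path)


with_shebang = try_exec(b"#!/bin/sh\necho ok\n")
without_shebang = try_exec(b"echo ok\n")
bom_then_shebang = try_exec(b"\xef\xbb\xbf#!/bin/sh\necho ok\n")
```

On a typical Linux system the first case runs the script, while the other two fail with ENOEXEC, because the file no longer starts with the two bytes "#!". Running the same files as `sh file` would work in all three cases only if the shell tolerates the BOM, which is not guaranteed.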
Examples
The existing examples are repetitive, mostly "me-toos" which are uninteresting because they are conventional scripting languages. Pruning those and adding a few different examples such as "make" and "env" would benefit the reader. TEDickey (talk) 10:02, 13 July 2012 (UTC)
- I agree with this. The point of the examples is to demonstrate use of #!, not to enumerate everything you can possibly use it with (for which a category would make more sense). The article does not need multiple shells and multiple scripting languages in its examples. Ewx (talk) 07:53, 20 May 2013 (UTC)
- I strongly disagree. The flags a program needs on the first line of a script differ widely: sed, awk and csh need -f, and tcc needs -run, while other programs need no options at all (PHP, Python), and still others require other options ((scsh, this), (guile, -s), (gs, -dNOSAFER)). A brief list of how to use different programs' options (including all of the common ones, such as awk, that require options) is helpful to have. I don't read it as "me-too"; I read it as "This is how to invoke this common program in a script." Maybe this should be rewritten more clearly. TL;DR: We should list the programs that require flags after the shebang in the examples section. -- 12.218.76.10 (talk) 17:04, 26 October 2014 (UTC)
- That would be a completely open ended list, and could easily become longer than the rest of the article put together. People who want to know how to script in a particular language should read a tutorial for that language. Ewx (talk) 09:03, 28 October 2014 (UTC)
- I doubt it would be longer than a paragraph. After all, many or most of the programs either don't require any options or just require '-f' or another small option or two, with the exceptions above. If it does get too long, we could always come up with some kind of limitation. (most-popular languages list or some such thing.) -- 12.218.76.10 (talk) 17:08, 3 November 2014 (UTC)
UTF-8 as a de facto standard
Could you expand on why you removed that information? Simply not liking it is not a reason to remove it (see WP:IDONTLIKEIT). [1] and [2] both clearly show that UTF-8 is considered a de facto standard, and I could get more citations if needed (those two were just the first hits I found). Belorn (talk) 09:37, 26 July 2012 (UTC)
- Your claim is wrong: UTF-8 is not a de facto standard but a real standard. It is, however, just one of many standards in the area of encoding. Your change made the article less exact. --Schily (talk) 09:54, 26 July 2012 (UTC)
- Not my claim. To cite:
- "The most widely accepted ('de facto standard') character encoding method is UTF-8." (ibm.com) [3]
- "Modern Linux installations use UTF-8 for their environment in any country with any language and is currently the de facto standard for to represent text" (Red Hat Glossary) [4]
- "UTF-8 is the defacto standard console and text file on modern systems, though other encodings are still common" (Mercurial project) [5]
- "UTF-8: Unicode for all regions, mostly in 1-3 Octets (new de facto standard)" (linuxtopia.org) [6]
- "For many projects on Linux, the de facto standard is to use UTF-8." (unifont.org), and so on. Python has also moved to making UTF-8 the default for text.
- "itself (ASDF) only recognizes one encoding beside :default, and that is :utf-8, which is the de facto standard, already used by the vast majority of libraries that use more than ASCII." (ASDF manual published by common-lisp.net) [7]
- This is just a small subset of the articles I found in a quick, few-minute Google search. Books should have even more statements like that. Is there a reason they are not relevant sources for the suggested change to the article? — Preceding unsigned comment added by Belorn (talk • contribs) 11:48, 26 July 2012 (UTC)
- The most widespread coding on UNIX is either C or ISO-8859-1. --Schily (talk) 10:43, 27 July 2012 (UTC)
You're both right; UTF-8 is a standard, as in a standardised encoding of Unicode; but it's one among many, and the choice of UTF-8 over other encodings is a de facto standard in most cases (though there are certainly cases where it's an explicit and official standard, such as in XML). To escape the whole argument, and because the Magic number section had become rather bloated and messy, I've rewritten it and trimmed it down. It now repeats itself less, contains less irrelevant information, and totally avoids the question of what sort of standard UTF-8 represents. :-) -- Perey (talk) 12:23, 26 July 2012 (UTC)
- Looks like a very good compromise. Belorn (talk) 20:04, 26 July 2012 (UTC)
- As the BOM is now clearly marked as superfluous, I see no problem with this text. --Schily (talk) 10:48, 27 July 2012 (UTC)
Coining of "shebang" (to mean #!)
For the record, I believe I was the one to coin this particular usage of "shebang", sometime in the late 1980s. By the time I posted [8] in 1989 I was already trying to get people to adopt the term. (The Usenet article in question was patch 7 for Perl 3.0.) I'm not aware that anyone else coined it independently, though of course that is always possible. (Sorry for the anonymous post, but someone seems to have grabbed "TimToady" already, and I don't think it was me.) --Larry Wall 71.139.24.65 (talk) 02:49, 4 September 2012 (UTC)
- Since the slang word shebang was in fairly widespread use prior to computer culture, were you aware of that usage? I would trace the whole shebang back to that word; it seems unavoidable. 68.174.97.122 (talk) 14:12, 22 February 2013 (UTC)
Proposed merge with Interpreter directive
Neither this article nor Interpreter directive is really noteworthy enough for a Wikipedia article, but if we merge them together they might just pass. Felixphew (Ar! Ar! Ar!) 07:16, 19 June 2014 (UTC)
- Except for UNOS (1982) #! is not an interpreter directive but a kernel feature. Schily (talk) 13:23, 14 July 2014 (UTC)
the pathname of the script
It would be nice if this article mentioned how a shebang shell script can set a variable to the pathname of the script. This pathname is useful so the script can use relative paths to reference other files it needs. I have shell code to do this, but it isn't pretty and I'm not sure it's robust. Perhaps someone knows a bulletproof way to do this and could add it to this article.
Encyclopedant (talk) 08:23, 6 December 2014 (UTC)
- That depends on the language used and is nothing to do with #!. In general consult a tutorial or reference documentation for your choice of language. Ewx (talk) 09:41, 6 December 2014 (UTC)
- There have been many discussions in the POSIX standard teleconferences and the reason why this feature is not yet integrated in the standard is that there is currently no suitable proposal to deal with variable path names. Note that the POSIX standard does not specify pathnames and that #! <command> needs an absolute path name for <command> in order to avoid making it a security problem. Schily (talk) 17:09, 8 December 2014 (UTC)
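To illustrate the point that this is a matter for the scripting language rather than for #!, here is a minimal sketch of how a Python script might locate files next to itself. The helper names script_dir and sibling are hypothetical, and this says nothing about how to do the same robustly in a shell script, which is what the original question asked.

```python
from pathlib import Path


def script_dir(script_file: str) -> Path:
    """Return the absolute directory containing script_file, with symlinks resolved."""
    return Path(script_file).resolve().parent


def sibling(script_file: str, name: str) -> Path:
    """Return the path of a file called name that lives next to script_file."""
    return script_dir(script_file) / name
```

Inside a script one would typically call script_dir(__file__). The shebang line plays no part in this: the interpreter is simply handed the script's path as an argument and exposes it in its own way.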
Merge
Based on all the talk at Interpreter directive, the answer is clearly a no to the merge of the two. One is a list, the other gives details and history. The point was made there are different versions. I will wait a short time before removing this request that is not an agreement. Tag removed. Telecine Guy 04:50, 26 October 2015 (UTC)
Too Linux Centric?
#!path arg translates to this system call: execve("path", ["path", "arg"], env);
According to this link, most unices deliver all arguments in a single string. — Preceding unsigned comment added by 218.103.114.193 (talk) 08:17, 28 October 2015 (UTC)
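For comparison, the following Python sketch emulates how Linux builds the argument vector from a shebang line. The function name linux_shebang_argv is hypothetical, and the sketch ignores the kernel's limit on the length of the first line. On Linux everything after the interpreter name is passed as one single argument, and the path of the script is appended after it.

```python
import re
from typing import List


def linux_shebang_argv(first_line: str, script_path: str) -> List[str]:
    """Emulate the argv that Linux passes to the interpreter for a shebang script.

    The interpreter name ends at the first space or tab; the rest of the line,
    with surrounding spaces and tabs trimmed, becomes one optional argument.
    """
    if not first_line.startswith("#!"):
        raise ValueError("first line does not start with #!")
    rest = first_line[2:].split("\n", 1)[0].strip(" \t")
    if not rest:
        raise ValueError("no interpreter given after #!")
    # At most one split: the remainder is never broken into further words.
    parts = re.split(r"[ \t]+", rest, maxsplit=1)
    return parts + [script_path]


argv_env = linux_shebang_argv("#!/usr/bin/env python3 -u\n", "./s.py")
argv_plain = linux_shebang_argv("#!/bin/sh\n", "./s.sh")
```

Here argv_env has three elements, with "python3 -u" as a single string, which is why a line such as `#!/usr/bin/env python3 -u` commonly fails on Linux: env is asked to find a program literally named "python3 -u". Other systems may split the arguments differently.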
man topic?
What topic would one look under in the 'man' command to find information about shebang? Tedtoal (talk) 20:57, 21 December 2016 (UTC)
- Under Linux, man execve. Ewx (talk) 09:15, 22 December 2016 (UTC)
characters before shebang
IMHO any characters before #! on the first line should also be ignored. Otherwise some script interpreters cannot be run correctly.
For example, for PHP the following should be possible:
//#!/bin/.../php
or
/*#!/bin/.../php
*/
to prevent "#!/bin/.../php" from being written to stdout when the script runs. — Preceding unsigned comment added by 94.254.137.147 (talk) 23:51, 26 May 2017 (UTC)
- You might be interested in this StackOverflow post. The hack works on Unix-like systems for languages where a forward slash starts a comment. It treats the file as a shell script and runs the first line, which in turn runs the "interpreter" and then immediately exits. --2601:647:4E03:9530:DC98:4A8E:FE3E:1CBA (talk) 10:13, 1 May 2020 (UTC)
Missing "/dev/fd workaround, discussed below"
There seems to be a missing footnote or subsequent text referred to in this article:
"On a system with setuid script support this will reintroduce the race eliminated by the /dev/fd workaround described below."
(I couldn't find another occurrence of either "/dev/fd" or "workaround".)
Reverts (#!/bin/bash example)
#!/bin/bash as an example without comment is not a good idea. Yes, it can be found in some tutorials and scripts, but that doesn't mean it's right. We don't show wrong JavaScript code as an example, do we? Using a fixed path to the binary is not recommended because the path might differ, as explained in the Portability section. #!/usr/bin/env bash should be shown instead. But because #!/usr/bin/bash is actually widely used (and because those are just examples), the best approach would be to mention both. That's basically what the IP did (and what I repeated after it was reverted). WP:NOTGUIDE does not apply here. The reverted text was not an instruction or manual but a clarification. Otherwise the whole section about portability should be removed too. --StYxXx ⊗ 07:45, 14 April 2020 (UTC)
- Well, why not show just the recommended way, just like the Python case below it? There is no need to show anything else, and as you say the explanation comes later. As it was, the corollary looked like a guide and editorial. Dorsetonian (talk) 10:21, 14 April 2020 (UTC)
- I also think that the Bash example should look like the Python example. Of course WP is not a guide, and that is why the text should not say anything like "this is the recommended way". But superior, more generally applicable examples should still be preferred to inferior, less likely to work ones. The downside of /bin/bash is that it does not work on systems where Bash lives elsewhere. The downside of the `env` way is that it *may* pick up the wrong interpreter. In my experience, the former bites far more often and is more difficult to work around than the latter. If simplicity is desired, the `/bin/sh` example provides that. --RainerBlome (talk) 12:46, 26 April 2020 (UTC)
Whatever your personal preferences, `#!/bin/bash` is undeniably widely used, and it works fine for the people using it, who are either using systems where that is the native path to bash or who have linked/installed it to that path. Dismissing it as "wrong" shows no understanding of other people's perfectly reasonable choices.
The env hack is adequately covered by the python example and described elsewhere in the document. Ewx (talk) 18:44, 27 April 2020 (UTC)
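The trade-off debated above (a fixed path may not exist, while env may pick up a different interpreter) can be sketched in Python. The names resolve_like_env, make_fake_interpreter and demo-interp are hypothetical; shutil.which only approximates the PATH search that `#!/usr/bin/env name` relies on.

```python
import os
import shutil
import tempfile
from typing import Optional


def resolve_like_env(name: str, path: Optional[str] = None) -> Optional[str]:
    """Approximate the PATH lookup behind `#!/usr/bin/env name`.

    Returns the first executable called name found in path (or in the
    current PATH if path is None), or None if there is no match.
    """
    return shutil.which(name, path=path)


def make_fake_interpreter(directory: str, name: str = "demo-interp") -> str:
    """Create a trivial executable called name in directory and return its path."""
    target = os.path.join(directory, name)
    with open(target, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(target, 0o755)
    return target


# Two directories that each provide an interpreter with the same name.
dir_a = tempfile.mkdtemp()
dir_b = tempfile.mkdtemp()
interp_a = make_fake_interpreter(dir_a)
interp_b = make_fake_interpreter(dir_b)

# The order of the PATH entries alone decides which one a script would get.
picked_a_first = resolve_like_env("demo-interp", dir_a + os.pathsep + dir_b)
picked_b_first = resolve_like_env("demo-interp", dir_b + os.pathsep + dir_a)
```

A script starting with a fixed path such as #!/bin/bash always gets the same binary or fails outright, whereas the env form follows whatever the caller's PATH happens to list first.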
only absolute or also relative path?
The current page is inconsistent: the "definition" says that the shebang is
#!interpreter ...
"where interpreter is an absolute path...", but later in the page it's written "...or relative to the current working directory". Now what?! It would indeed be handy to allow relative paths like "./myscript" or "../myscript" where myscript would be a user script that interprets the actual file with "custom commands" or data. Now why did the author put "absolute" if it is not necessarily an absolute path? And/or why would one say "...absolute... or relative..."? (OK, hoping that it is true, it is indeed less bad to write "A or not A" rather than wrongly "must be A"...) — MFH:Talk 22:57, 4 May 2020 (UTC)
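What actually happens with a relative interpreter path can be tried out with the Python sketch below. It assumes Linux, /bin/sh, and a temporary directory that allows execution; the function name run_with_relative_interpreter is hypothetical. It shows observed Linux behaviour only and is not a statement about what POSIX or other systems guarantee.

```python
import errno
import os
import subprocess
import tempfile
from typing import Dict


def run_with_relative_interpreter() -> Dict[str, str]:
    """Run a script whose shebang names /bin/sh by a relative path, from two directories.

    Returns, per working directory, the script's stdout or the symbolic errno
    name if the exec call fails.
    """
    base = os.path.realpath(tempfile.mkdtemp())
    deeper = os.path.join(base, "a", "b")
    os.makedirs(deeper)

    # Relative path that leads from base to /bin/sh, e.g. ../../bin/sh.
    rel_sh = os.path.relpath("/bin/sh", base)
    script = os.path.join(base, "demo.sh")
    with open(script, "w") as f:
        f.write("#!" + rel_sh + "\necho ok\n")
    os.chmod(script, 0o755)

    outcomes = {}
    for label, cwd in (("from_base", base), ("from_deeper", deeper)):
        try:
            proc = subprocess.run([script], cwd=cwd, capture_output=True, text=True)
            outcomes[label] = proc.stdout.strip()
        except OSError as exc:
            outcomes[label] = errno.errorcode.get(exc.errno, str(exc.errno))
    return outcomes


outcomes = run_with_relative_interpreter()
```

On Linux the relative path is resolved against the caller's current working directory, not against the location of the script, so the same script works from one directory and fails with ENOENT from another. That is presumably why the article recommends an absolute path even where a relative one is accepted.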