Talk:R (programming language)
This is the talk page for discussing improvements to the R (programming language) article. This is not a forum for general discussion of the article's subject.
R (programming language) was an Engineering and technology good articles nominee, but did not meet the good article criteria at the time. There may be suggestions below for improving the article. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.
This level-5 vital article is rated B-class on Wikipedia's content assessment scale. It is of interest to multiple WikiProjects.
The article Datasets.load was nominated for deletion. The discussion was closed on 24 September 2018 with a consensus to merge the content into R (programming language). If you find that such action has not been taken promptly, please consider assisting in the merger instead of re-nominating the article for deletion. To discuss the merger, please use this talk page. Do not remove this template after completing the merger. A bot will replace it with {{afd-merged-from}}.
This page has archives. Sections older than 365 days may be automatically archived by Lowercase sigmabot III when more than 5 sections are present.
Too much tutorial-like
Currently this page reads more like a single-page printout of a book than an encyclopedia article. There are way too many tutorial-like examples. These should be removed in favor of a link to an R resource showing these examples. — Preceding unsigned comment added by 87.213.43.208 (talk) 17:22, 7 March 2024 (UTC)
- Definitely agree. These examples contain a ton of off-topic information and are better suited for a textbook or other online resource. Jcschwartz3205 (talk) 05:31, 12 March 2024 (UTC)
- I noticed this as well and was going to put up a poll to maybe separate the article into two: one of the language (this page) and then another focused on syntax and semantics. The second page on syntax and semantics would be like the Python analog.
- That said, the former split (now just moving text) would definitely need to be edited to be focused on highlighting different syntax and semantics, and not just a tutorial of how to program in R.
- Another avenue: this recently removed content could easily be moved to Wikibooks with little to no change, for example in the Computing section https://en.wikibooks.org/wiki/Department:Computing. Erictleung (talk) 17:26, 12 March 2024 (UTC)
- I restored the deleted content. The deleted content is the information I was looking for when I first came to this article to learn about R. Because it wasn't here, I bought three books on R. I then paraphrased these books and my college statistics textbook to build the examples. I'm sure I'm not the only one looking for examples to learn about R. If there is a consensus to fork these examples to another article that this article links to, then no information is lost. Timhowardriley (talk) 21:47, 12 March 2024 (UTC)
- I concur with Erictleung. This was a pretty clear case of WP:NOTTEXTBOOK. This might be useful to some folks, and it may well have been what you were personally looking for, but it is off-mission for an encyclopedia. I would not support forking to another article, either. MrOllie (talk) 21:51, 12 March 2024 (UTC)
- I'm the guy who added the "Basic syntax" examples. If you look at the wiki pages for PHP or C you will see they are laden with meaty code examples. It only makes sense that pages on programming languages focus on the language itself, otherwise what's the point of having the page to begin with? If anything, the syntax and usage examples should be promoted on the page. The only thing I think needs to be removed at the moment is the prominent reference to the "Tidyverse" in the Packages section, which (unlike the language usage examples) genuinely has nothing to do with the premise of this article, which is ostensibly about "R, the programming language". Raquart (talk) 00:12, 17 March 2024 (UTC)
- The point is to explain what the language is, not to help people learn how to program in it - that is beyond the scope of the encyclopedia, just like carpentry shouldn't give advice on how to properly hammer a nail. MrOllie (talk) 03:00, 17 March 2024 (UTC)
- Absolutely agree regarding the "Tidyverse"...very off-putting and unnecessary Gdefreitas (talk) 16:10, 16 October 2024 (UTC)
Milestones
The table in the Milestones section shows R versions in the format x.y, e.g. R 3.6. However, except for some of the historical releases, the formal version format is x.y.z, e.g. R 3.6.0. The date associated with each entry appears to be the date of the x.y.0 release. Should the 'Release' column be updated to use the x.y.0 format?
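For anyone checking the x.y.z point, here is a minimal sketch of how R itself reports and parses version numbers. It uses the base functions getRversion() and package_version(); the variable names are only illustrative, and "3.6.0" is just the example version mentioned above.

```r
# The running interpreter reports a three-part version (major.minor.patch).
current <- getRversion()
parts <- unlist(current)        # integer vector: major, minor, patch
length(parts)                   # 3

# Parse the example release from the comment above and shorten it to x.y,
# which is the form the Milestones table currently uses.
release <- package_version("3.6.0")
short <- paste(unlist(release)[1:2], collapse = ".")
short                           # "3.6"

# Version objects compare numerically, component by component.
release > package_version("3.5.3")   # TRUE
```

This only illustrates the format. It says nothing about which form the table should use.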
Add .rhistory
.rhistory is another filetype that stores the history of the code executed in an R session. I want to add it to the file types list but I am new to Wikipedia and I don't know how. AHWikipedian (talk) 11:55, 13 November 2023 (UTC)
- I have added that in for you Pansydyke (talk) 16:55, 19 January 2024 (UTC)
- Thanks. AHWikipedian (talk) 18:16, 29 February 2024 (UTC)
Moved Comparison with alternatives/Python to talk
@Newystats: I moved this paragraph to talk:
Comparison with alternatives/Python
Python and R are interpreted, dynamically typed programming languages with duck typing that can be extended by importing packages. Python is a general-purpose programming language while R is specifically designed for doing statistical analysis. Python has a BSD-like license in contrast to R's GNU General Public License but still permits modifying language implementation and tools.[1]
Why is R being compared with Python? Python is a general-purpose programming language, but R is a specific-purpose programming language. This paragraph is comparing an apple with an orange. R_(programming_language)#Interfaces says you can embed R to Python by installing Rpy2. The implication is you can have both full Python and full R.
- You can also embed full Python and other languages in R, as described in Yihui Xie; Joseph J. Allaire; Garrett Grolemund (30 December 2023), R Markdown: The Definitive Guide, Chapman & Hall, Wikidata Q76441281
Regarding "Python has a BSD-like license in contrast to R's GNU General Public License but still permits modifying language implementation and tools.":
- This contrast is immaterial.
- This sentence is the only one that is cited. The book title of the citation is intriguing: "Python vs. R for Data Science." However, the paragraph doesn't paraphrase the book's thesis.
Timhowardriley (talk) 20:32, 3 January 2024 (UTC)
- @Timhowardriley: I object to removing the section on "Comparison with alternatives". If you think it's biased, please propose changes that remove the bias.
- Wikipedia has many comparisons like this that provide a valuable service. Only yesterday I got substantial help with something I was doing from a crudely similar comparison on Wikipedia. In my judgment deleting the entire "Comparison with alternatives" section degrades the quality of this article.
- I'm restoring that entire section including the discussion of Python. I plan to add other material, but I'm not exactly certain what just yet.
- DavidMCEddy (talk) 14:24, 5 January 2024 (UTC)
- Python and R are the two leading programming languages in data science and the comparison is very frequently discussed in relevant sources, so I think it makes sense to include it here. However I agree that the section as it stands is pretty shallow. – Joe (talk) 14:42, 5 January 2024 (UTC)
- Regarding "If you think it's biased, please propose changes that remove the bias.": Comparisons between products and services are best handled through a table. For a narrative comparison to be unbiased, it requires a lot of words to fairly describe each differentiating characteristic. Most importantly, Wikipedia articles need to be reliably sourced. As Wikipedia editors of this product, we are inherently biased. Instead, a reliable source (like Consumer Reports) needs to compare R with a competitor, then we can paraphrase that material. On the other hand, simply name-dropping the NY Times is misleading. I got past the paywall once to read the article. I remember it being very supportive of R and having only a mention of SAS. Moreover, it quoted SAS's marketing manager, who refuted the SAS disparagements. The Comparison of statistical packages link in the "See also" section is the proper way to compare R with its competitors.
- Regarding "I'm restoring the ... discussion of Python": Please refute any of my claims that this is a lousy paragraph. Timhowardriley (talk) 23:28, 5 January 2024 (UTC)
- Regarding "I plan to add other material, but I'm not exactly certain what just yet.": The cart is in front of the horse. Wikipedia articles need to be reliably sourced. Step one is to discover something relevant in your secondary research. Step two is to paraphrase that material into the Wikipedia article. Otherwise, it's original research. Timhowardriley (talk) 23:55, 5 January 2024 (UTC)
- Accepted re. deleting the "Comparison with alternatives". Thanks, DavidMCEddy (talk) 16:24, 6 January 2024 (UTC)
References
- ^ Grogan, Michael (2018). Python vs. R for Data Science. O'Reilly Media, Inc.
Removing the description was "as a programming language to teach introductory statistics at the University of Auckland."
The introduction says "was started by professors Ross Ihaka and Robert Gentleman as a programming language to teach introductory statistics at the University of Auckland." This struck me as a bit odd, as R is not generally considered a tool to teach introductory statistics. Anyway, there's a hyperlink that references this introductory statistics comment. This is what it actually says, with no reference whatsoever to teaching introductory statistics.
Early History - 1990
• Ross Ihaka joins the Department of Statistics at the University of Auckland.
• Robert Gentleman spends 1990 in Auckland on sabbatical from the University of Waterloo.
• During a chance encounter in the corridor, the following exchange takes place:
Gentleman: “Let’s write some software.”
Ihaka: “Sure, that sounds like fun.”
• The initial goal is to build a testbed for trying out ideas and to publish a paper or two.
Early History - 1990 Drkirkby (talk) 15:15, 3 February 2024 (UTC)
- The quote from the PDF link is, "We set a goal of developing enough of a language to teach introductory statistics courses at Auckland." I added this quote to the citation. Timhowardriley (talk) 17:39, 3 February 2024 (UTC)
- The "Let's write some software" quote is on page 10 of the PDF. The citation points to page 12 of the PDF. Timhowardriley (talk) 17:45, 3 February 2024 (UTC)
- My error, I missed that.
- I'm having the misfortune of having to learn some introductory statistics with Minitab. I wish we were using R, but the university does not consider R an appropriate language to teach introductory statistics. I can see their point to be honest. Drkirkby (talk) 22:23, 4 February 2024 (UTC)
- The best way to learn statistics is to get a good eraser. ;-) Timhowardriley (talk) 04:38, 5 February 2024 (UTC)
Misuse of print() and return()
In a lot of the code examples the print() and return() functions are used incorrectly. For example R does not require the print(x) function to print the values of a vector x. Simply entering the name of the vector at the top level will do that. There are places where print() should be used (for example in the middle of a function), but not in most of the code shown. This is an important distinction between R and other languages.
Likewise you do not need return() at the end of a function definition. The value of the last evaluated expression will be returned. You do need to use return() if you are returning from the middle of a function. Again, this is an important distinction between R and other languages.
I would like to edit the examples to reflect this, unless someone has a reason for not doing so. Mcsmom (talk) 15:05, 18 February 2024 (UTC)
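To make the behavior under discussion concrete, here is a minimal sketch of auto-printing and implicit return. The names show_steps and safe_sqrt are hypothetical helpers written for this thread, not code taken from the article.

```r
x <- c(1, 2, 3)

# At the top level of an interactive session, a bare expression is
# auto-printed, so these two lines display the same thing.
x
print(x)

show_steps <- function(v) {
  v          # evaluated in the middle of a function: nothing is displayed
  print(v)   # an explicit print() is needed to display a value here
  sum(v)     # value of the last evaluated expression becomes the return value
}

safe_sqrt <- function(v) {
  if (any(v < 0)) {
    return(NA_real_)   # return() is needed to leave the function early
  }
  sqrt(v)              # implicit return at the end
}

total <- show_steps(x)   # displays x once, via the explicit print()
total                    # 6
```

Whether the article's examples should rely on these behaviors is a style question. The sketch only shows what the interpreter does.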
- Regarding "Misuse of print() and return()": I disagree. print() and return() are not misused.
- Regarding "... the print() and return() functions are used incorrectly.": I disagree. They are used correctly.
- Regarding "For example R does not require the print(x) function to print the values of a vector x.": Correct. If x is on a line by itself, then the interpreter will send it to print() for you.
- Regarding "This is an important distinction between R and other languages.": I disagree. It's a shortcut and not important.
- Regarding "Likewise you do not need return() at the end of a function definition.": Correct. If the last expression is left unassigned, then the interpreter will return it for you. Indeed, R_(programming_language)#Programmer_created_functions explains this in the comments.
- Regarding "Again, this is an important distinction between R and other languages.": I disagree. It's a shortcut and not important.
- Regarding "I would like to edit the examples to reflect this...": I will revert these edits b/c they will confuse a reader not familiar with R's shortcuts. Indeed, when I was new to the language and encountered these shortcuts in code, I was confused. The article's audience is intended to be as broad as possible. However, a new section titled, "Shortcuts" would be an appropriate place to enlighten a reader new to the language. Timhowardriley (talk) 22:43, 18 February 2024 (UTC)
- Additional thought: Just because a language allows for a syntactic construct, it doesn't mean it's wise to use it. For example, COBOL used to allow a function to alter a local variable of another function. See Computer_program#Coupling. Software engineering principles emphasize readability over cryptic syntax constructs. For example, this version of the article has a mistake trying to explain the return() shortcut. The return() is not optional b/c the expression is assigned to the variable z. Timhowardriley (talk) 00:29, 19 February 2024 (UTC)
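For reference, here is a minimal sketch of what R does when the last expression of a function is an assignment. The function names are hypothetical and are not the code from the linked version of the article. In R, an assignment evaluates to the assigned value invisibly, so the value is still returned to the caller but is not auto-printed at the prompt.

```r
add_one <- function(x) {
  z <- x + 1            # last expression is an assignment
}

add_one(1)              # displays nothing at the prompt
result <- add_one(1)    # the value is nevertheless returned
result                  # 2

# withVisible() reports whether a value would be auto-printed.
withVisible(add_one(1))$visible            # FALSE

add_one_explicit <- function(x) {
  z <- x + 1
  return(z)             # explicit return makes the value visible
}
withVisible(add_one_explicit(1))$visible   # TRUE
```

So the practical difference in this case is visibility at the prompt, not whether a value comes back.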
Add some detail on OOP features and on closures (functions)
I added a succinct explanation (with code example) of the OOP features of the language, which I found absent (outside the "paradigms" wikidata) but which imho are an important feature of R (for instance there is a section about the "pipe operator" but none on OOP). Also added some detail in "functions", specifically the possibility of creating custom infix operators (which is rather uncommon). Rikivillalba (talk) 02:10, 26 April 2024 (UTC)
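For readers of this thread who have not seen these features, here is a minimal sketch of the three ideas mentioned: class-based dispatch, a custom infix operator, and a closure. It assumes S3, the simplest of R's object systems, since the comment does not say which system the added text covers. All names (new_account, describe, %+%, make_counter) are hypothetical and are not the example added to the article.

```r
# S3: an object is ordinary data with a class attribute.
new_account <- function(owner, balance = 0) {
  structure(list(owner = owner, balance = balance), class = "account")
}

# A generic function dispatches on the class of its first argument.
describe <- function(obj, ...) UseMethod("describe")
describe.default <- function(obj, ...) "an R object"
describe.account <- function(obj, ...) paste0(obj$owner, " holds ", obj$balance)

describe(new_account("Ada", 10))   # "Ada holds 10"
describe(1:3)                      # "an R object"

# Custom infix operator: any function named %something% can be used infix.
`%+%` <- function(lhs, rhs) paste(lhs, rhs)
"a" %+% "b"                        # "a b"

# Closure: the returned function keeps access to `count` in the
# environment where it was created.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # modify the enclosing environment's variable
    count
  }
}

next_id <- make_counter()
next_id()                          # 1
second <- next_id()                # 2
```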