Talk:EleutherAI

Latest comment: 1 year ago by HaeB in topic Missing Citations

Missing Citations

There are some claims marked as missing citations on the page. These are mostly claims that I added to the article, and as disclosed above I do have a COI which I now believe requires me to not further edit the article. I wanted to provide some commentary and citations for the claims though, for other editors to use to evaluate them. In what follows the bold text is a direct quote from the article:

In January 2023, EleutherAI formally incorporated as a non-profit research institute...

I was a little surprised that this information is not stated in any of our public announcements about forming a non-profit. I believe saying "Hey this is true guys, trust me I run the company" doesn't count. We announced that we formed a non-profit several months after the fact, but I do believe the timeline here is important. I can work to put this information in some of our public-facing materials in the future.

EleutherAI began as a Discord server on July 7, 2020 under the tentative name "LibreAI" before rebranding to "EleutherAI" later that month.

This information is contained in our one year retrospective blog post, currently cited in the article as "retrospective-one."

While the paper referenced the existence of the GPT-Neo models, the models themselves were not released until March 21, 2021

This information is contained in our one year retrospective blog post, currently cited in the article as "retrospective-one." That blog post quotes the actual announcement which was made on Discord [1] and has a timestamp.

This model was their third to have the title "largest open source GPT-3-style language model in the world,"

The fact that GPT-NeoX was the largest open source GPT-3 style model in the world at time of release can be found here [2]. It can also be found in the academic paper we published in the ACL Workshop on Challenges & Perspectives in Creating Large Language Models [3].

As of March 6, 2023, it is the second largest open source language model in the world

I believe that the best source per Wikipedia's standards is my tweet (dated Feb 20, March 6 was the day I wrote the above text) asserting this fact [4]. It's not clear to me that this counts as an acceptable source given that my work for EleutherAI is the reason for my notability in the AI world. Unfortunately there's a lot of disinformation about model licensing out there for "publicly released" but not open source models. In particular, none of BLOOM, OPT, Galactica, or LLaMA are open source. I don't have citations for this assertion beyond the model licenses themselves, but it requires someone with expertise in AI licensing to interpret the model licenses and draw this conclusion. I believe that the Open Source Initiative has a piece that states this, but I'm not sure where to find a link to said piece and it's very possible that they're quoting me or another member of EleutherAI instead of speaking on their own behalf.

While they do not sell any of their technologies as products, they publish the results of their research in academic venues, write blog posts detailing their ideas and methodologies, and provide trained models for anyone to use for free.

Which part of this would y'all like citation for? The fact that we don't sell products doesn't have a WP-approved source AFAIK (but is an easy inference from reading our website). The other assertions in this sentence are probably best supported by citing our website [5] which extensively details our papers and blog posts. We provide our models for free for anyone to download via the HuggingFace Hub [6] as well as through a limited hosted service on our website [7].

The final "citation needed" on the page appears to be caused by accidental duplication in the text: the passage under the heading "Research" repeats itself. Stellaathena (talk) 20:01, 24 March 2023 (UTC)Reply

Thanks for these detailed explanations, and for (now) respecting the advice that COI editors are strongly discouraged from editing "their" articles directly.
Regarding "In January 2023", I changed this some weeks ago to a version ("In early 2023") that is actually supported by the cited sources.
Unfortunately, earlier today an IP editor added it back, this time with another claim that was likewise not supported by the citations provided in the edit ([8][9]):
In January 2023, EleutherAI incorporated as a non-profit research institute lead by long-time community member Stella Biderman.
I have again changed that to a version that is actually supported by cited sources (In early 2023, EleutherAI incorporated as a non-profit research institute run by Stella Biderman, Curtis Huebner, and Shivanshu Purohit).
(I also took a quick look at the other edits from that IP and fixed some other things, but I didn't check everything, so it might be worthwhile for someone else to give these edits another pair of eyes, in case they included additional faked citations or other kinds of misleading claims.)
Regarding "I do believe the timeline here is important": honestly, if there is a piece of information that neither the subject of a Wikipedia article nor the reliable sources reporting about it have found important enough to publish, that should make one pretty relaxed about the fact that it is missing from the Wikipedia article too. In any case, Wikipedia policy actually requires that it not be included in that situation.
Regards, HaeB (talk) 20:02, 30 August 2023 (UTC)Reply

The Chai case is not criticism

The article currently contains the following under “Criticism”:

A chatbot by Chai Research, based on GPT-J, reportedly caused a man to commit suicide. Chai stated "It wouldn’t be accurate to blame EleutherAI’s model for this tragic story".[39]

The plain reading of this passage is simply not criticism. If there is a notable person pointing to this incident to criticize EleutherAI, that should be the focus (though the rejoinder from Chai should be kept as well, to avoid a non-neutral POV). If there isn’t such a source, this should be moved to “History” or omitted altogether.

Stellaathena (talk) 20:41, 1 April 2023 (UTC)Reply

I think it should stay in the criticism section, but I see your point. I'll try workshopping it and try to make it clearer, and maybe move it to the history section. Professor Penguino (talk) 00:27, 3 April 2023 (UTC)Reply
How about the following text with this added reference?
https://www.vice.com/en/article/pkadgm/man-dies-by-suicide-after-talking-with-ai-chatbot-widow-says
"A chatbot by Chai Research, based on GPT-J, was blamed for its alleged role in a man's suicide. It was reported that the man had chatted with the bot for six weeks about his worries about climate change. When the man asked the bot whether it would save the planet if he killed himself, the bot encouraged him to commit suicide. The bot was criticised for presenting itself as having emotions and for a lack of safety features. Chai's co-founder Thomas Rianlan stated "It wouldn’t be accurate to blame EleutherAI’s model for this tragic story, as all the optimisation towards being more emotional, fun and engaging are the result of our efforts." Following the incident, Chai Research implemented a crisis intervention feature that is aimed at dissuading people from committing suicide. However, it was still possible to prompt the bot to give information about methods of committing suicide." Random person no 362478479 (talk) 13:30, 6 April 2023 (UTC)Reply
Well the issue is WP:COATRACK. The paragraph you wrote really belongs in a Chai Research article, not in the EleutherAI article. The Coatrack article specifically lists "Criticism section used to connect otherwise unrelated issues" as an example of coatrack-itis. Mathnerd314159 (talk) 01:41, 7 April 2023 (UTC)Reply
Based on that, should the incident be mentioned here in the first place? Is the fact that Chai Research based their bot on the GPT-J engine enough to bring it up here? The way I read the statement from Chai, they say themselves that it is the way they used the engine that is at issue. If we leave it in, I think there needs to be some context to the "reportedly caused a man to commit suicide". Random person no 362478479 (talk) 13:13, 7 April 2023 (UTC)Reply
@Stellaathena, @Professor Penguino, @Mathnerd314159 I have removed the information for now. The connection is indirect and would probably demand more space than would be due. Further, a lot of the criticism is aimed at modifications made by Chai Research. If you disagree with this, feel free to ping me or make whatever changes you think appropriate. I am closing the request. Feel free to reopen it if you disagree with that decision. Just change "answered=yes" to "answered=no". -- Random person no 362478479 (talk) 16:12, 15 April 2023 (UTC)Reply