Talk:Q*


Relation with RLAIF


People are speculating they could be using something like an AI or an external tool in the loop (e.g. MATLAB, Wolfram, ...) to solve the problem of human fatigue. Think something like AlphaGo playing against itself, or the RLAIF paper. Since some of OpenAI/Ilya's more recent work was with the step-by-step math dataset PRM800K, people are loosely speculating it is a combination of Q-learning and step-by-step logical/mathematical reasoning. So maybe they are synthetically generating more PRM800K-style data to get orders of magnitude more training data: letting it just solve lots of math step by step, over and over, building up its chain of thought or chain of verification with mathematical reasoning. Not sure any of this really belongs in the article, though. It would be great if this kind of work still got published :) --208.64.173.109 (talk) 16:00, 23 November 2023 (UTC)
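A toy sketch of the loop being speculated about here (purely illustrative, assuming nothing about OpenAI's actual method; model_sample_solution and tool_check below are hypothetical stand-ins): sample step-by-step solutions, check each one with an external tool instead of a human grader, and keep only the verified traces as new synthetic training data.

```python
# Purely illustrative sketch of the *speculated* self-improvement loop;
# not OpenAI's method. model_sample_solution and tool_check are hypothetical.
import random

def model_sample_solution(problem):
    """Hypothetical stand-in for an LLM sampling a step-by-step answer."""
    a, b = problem
    guess = a + b + random.choice([0, 0, 0, 1, -1])  # occasionally wrong on purpose
    steps = [f"Add {a} and {b}.", f"The result is {guess}."]
    return steps, guess

def tool_check(problem, answer):
    """Hypothetical external verifier (think: a CAS like Wolfram or MATLAB)."""
    a, b = problem
    return answer == a + b

synthetic_dataset = []
for _ in range(1000):
    problem = (random.randint(0, 99), random.randint(0, 99))
    steps, answer = model_sample_solution(problem)
    if tool_check(problem, answer):                  # no human fatigue in the loop
        synthetic_dataset.append((problem, steps))   # verified trace becomes training data

print(f"kept {len(synthetic_dataset)} verified step-by-step traces")
```

Run over and over, the verified traces would play the role of additional PRM800K-style data.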

Also just adding this, though I'm not sure it makes sense to include in the article: Ilya Sutskever (11/2/2023): "The most near-term limit to scaling is obviously data; this is well known. And some research is required to address it. Without going into the details, I will say the data limit can be overcome and progress will continue."
Later "On the scientific problem I think it is still an area where not many people are working on. Where AIs are getting powerful enough you can really very effectively. We will have some very exciting research to share soon.. Some forces are accelerating and some forces are decelerating. So for example, the cost and scale are a decelerating force. The fact that our data is finite is a decelerating force to some degree... With the data in particular I think it won't be an issue, we fill figure out something else. You might argue the size of the engineering team is a decelerating force. On the other hand, the amount of the investment is an accelerating force. The amount of interest from people, from engineers, scientists is an accelerating force. I think one other accelerating force is the fact that evolution biologically has been able to figure it out, and the fact that up until now progress in AI has up until this point had this weird property that it has been hard to execute upon. But it has also been more straightforward than one might have expected... How it will play out remains to be seen."
https://youtube.com/Ft0gTO2K85A?si=CFT3i6C0lToPDkql&t=1575 --208.64.173.109 (talk) 16:30, 23 November 2023 (UTC)

"Rubik's cube"


That's a joke, right? 2600:1700:5B20:CAA0:2577:98F2:776E:A189 (talk) 19:24, 23 November 2023 (UTC)

To be fair, GPT-4 doesn't even have a basic world model if you try to play Hangman with it. So getting it off of LSD and training it to solve grade-school math would be considered a research breakthrough worthy of a trillion-dollar company. --208.64.173.109 (talk) 22:50, 23 November 2023 (UTC)
No, GPT-4 already scored over 80% on GSM8K, depending on the version. Acing this test wouldn't be a breakthrough worth ousting a popular CEO over.
There's too much misinfo in this Q* debacle... I even saw an SCP-esque "leaked document" that talks about this thing breaking AES-192. The AGI hype is too big and the AI doomposting is even bigger; please don't add fuel to the fire.
Also, this article is too gossipy; it doesn't even mention the grade-school math unless you work your way through the cited Reuters post. Can anyone please edit it to be more neutral? 125.166.3.221 (talk) 06:23, 24 November 2023 (UTC)
The acing was a little sarcasm. People seem to be arguing about whether GPT-4 would have gotten a gold sticker on a math test it brought home from school. I think when most people imagine artificial superintelligence, they don't imagine something which gets ~80% on in-distribution grade-school word problems or algebra problems. Tackling something from the List of unsolved problems in mathematics might attract attention. It's fine to note there is still some way to go.
Just noting: claiming you are on the edge of AGI once a month for five years eventually feels less like naive AI-summer/peak-of-inflated-expectations enthusiasm and starts to look, feel, and quack like a startup looking out for its own self-interest (free publicity, more investor funding, regulatory capture, ...). --208.64.173.109 (talk) 09:38, 24 November 2023 (UTC)
Almost five years ago, GPT-2 was at one point also considered too dangerous to release. --208.64.173.109 (talk) 23:06, 23 November 2023 (UTC)

Many experts criticized the decision, saying it limited the amount of research others could do to mitigate the model’s harms, and that it created unnecessary hype about the dangers of artificial intelligence. “The words ‘too dangerous’ were casually thrown out here without a lot of thought or experimentation,” researcher Delip Rao told The Verge back in February. “I don’t think [OpenAI] spent enough time proving it was actually dangerous.”

Could this be more self-serving marketing? --208.64.173.118 (talk) 13:05, 1 December 2023 (UTC)

Collection of Reactions


@pmddomingos (November 23, 2023). "We need a moratorium on talking about looming AGI until we have at least a working housebot." (Tweet) – via Twitter. — Pedro Domingos, professor of computer science at UW, author of The Master Algorithm, and machine-learning researcher known for Markov logic networks, which enable uncertain inference.

@ylecun (November 23, 2023). "No. Everyone in top AI research labs is working on giving dialog systems the ability to plan & reason. There are projects along those lines at Meta-FAIR, DeepMind, and OpenAI, with early results (if you follow the literature). Q* is just one such project among many. The planning expert at OpenAI is Noam Brown, who worked on Libratus (poker) and Cicero (Diplomacy) at FAIR, both of which use planning. I suspect he has something to do with Q*. I don't think it's the kind of breakthrough the Twittersphere makes it to be. People need to calm down" (Tweet) – via Twitter. — Yann LeCun, Vice-President and Chief AI Scientist at Meta, on whether he believes this signals looming AGI.

@fchollet (November 23, 2023). "Every single month from here on there will be rumors of AGI having been achieved internally. Just rumors, never any actual paper, product release, or anything of the sort." (Tweet) – via Twitter. — François Chollet, creator of the Keras deep-learning library and AI researcher at Google.

Strong preference to remove subjective reactions from assorted researchers, which are hearsay and add no new information. Let's not have an article about whether or when AGI will be achieved, please. — Preceding unsigned comment added by Hughperkins (talk • contribs) 13:09, 1 December 2023 (UTC)

--208.64.173.109 (talk) 11:20, 24 November 2023 (UTC)

Proposed merge to OpenAI

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
There appears to be consensus for merging this to OpenAI. User:Bri raised concerns about losing the "further reading" references; I've tried to address this by carrying all of them over as references in the merged text, so that they can be absorbed into the article. Merge applied in Special:Diff/1188248220 and Special:Diff/1188248233. DefaultFree (talk) 05:41, 4 December 2023 (UTC)

Rationale: not yet notable outside of the context of the Sam Altman drama. Standalone significance hinges on WP:RUMOR. Until and unless more detail on the project emerges, this should be a redirect to OpenAI#Q*, perhaps with some of the text from this article merged in. DefaultFree (talk) 00:29, 27 November 2023 (UTC)

+1/Agree. Especially since Q* is widely used elsewhere in computer science/ML, specifically around Monte Carlo methods, Markov decision processes, and Q-learning. There are many Q*'s we could be covering here, and this one doesn't seem notable enough to get an entire article while those are ignored. If there is going to be an article about Q*, it should mostly be about the established usage (Bellman, MDPs, Q-learning, ...), not a single rumored project which happens to use a similar name.

More generally, for any finite Markov decision process, "Q" may refer to the function that the algorithm computes: the expected cumulative reward for taking a given action in a given state. In optimization, star notation denotes the optimal value of a variable, so Q* is the standard name for the optimal action-value function (a rough sketch of the convention is below).
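To make the classical meaning concrete, here is a minimal sketch (tabular Q-learning on a toy chain MDP; the environment and all names are illustrative and have nothing to do with the rumored OpenAI project). The table Q converges toward Q*, the unique fixed point of the Bellman optimality equation Q*(s, a) = E[r + γ·max_a′ Q*(s′, a′)]:

```python
# Minimal illustration of the classical Q* (tabular Q-learning on a toy
# 1-D chain MDP); not related to the rumored OpenAI project.
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Deterministic toy dynamics: reward 1 only on reaching the last state."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

for _ in range(2000):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        # Q-learning update: move Q(s, a) toward the Bellman optimality target
        target = r + GAMMA * max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# V*(s) = max_a Q*(s, a); here it approaches gamma^(distance to goal - 1)
print({s: round(max(Q[(s, a)] for a in ACTIONS), 3) for s in range(N_STATES)})
```

Acting greedily with respect to the converged table recovers the optimal policy, which is all the star in Q* has historically meant.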

--2600:1009:B11D:F654:ADA7:CD3D:5946:8253 (talk) 00:54, 27 November 2023 (UTC)
+1/Agree. The article is pretty much just quotations from people and pretty low on actual objective facts. I feel it could also be condensed by removing the quotations from Chollet and LeCun, which add no new information, just personal opinions about AGI. — Preceding unsigned comment added by Hughperkins (talk • contribs) 00:08, 28 November 2023 (UTC)
I honestly think that Q* should be deleted entirely because of WP:RUMOR. Mr vili (talk) 05:21, 30 November 2023 (UTC)
Agree. The topic might be noteworthy, but currently it reads like a section of a larger article. GarethPW (talk) 14:03, 30 November 2023 (UTC)
  • I'm concerned about the merge proposal for two reasons. First, "The topic might be noteworthy" seems to admit that it should have its own article, just one that should be improved (my preference). Second, in a merge, the "further reading" section would likely be lost, and I think there's a lot of value for readers there in new topics that haven't yet been incorporated. ☆ Bri (talk) 19:32, 30 November 2023 (UTC)
At this point the initial unsourced rumors have been contested by OpenAI, rejected by Microsoft, and challenged by other notable AI researchers, so it isn't clear the rumors of AGI remain notable enough for an article. As we can see, there were "Sparks of AGI" at the beginning of the year, and OpenAI promoted GPT-2 as "too dangerous" for humanity as well. This probably belongs in a marketing section of the OpenAI article more than anything else. --208.64.173.118 (talk) 13:26, 1 December 2023 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.