Archive 1

The Japanese language page has a gallery of various examples that Stable Diffusion can create, perhaps we should do the same to showcase a few examples for people to see. I'd be curious to hear others weigh in. Camdoodlebop (talk) 00:57, 11 September 2022 (UTC)

The built-in Batch, Matrix and XY plot functions are great for this. Please feel free to use this example for the Img2Img section to explain parameters: https://i.imgur.com/I6I4AGu.jpeg Here I've used an original photo of a dirty bathroom window and transformed it using the prompt "(jean-michel basquiat) painting" with various CFG scale and denoising strength values. 73.28.226.42 (talk) 16:42, 8 October 2022 (UTC)
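For anyone who wants to reproduce a parameter sweep like this outside of a web UI, a minimal sketch using the Hugging Face diffusers library is below. The model ID, input file name, and parameter values are assumptions for illustration only, not a recipe for the linked grid (which was made with a GUI's built-in X/Y plot), and note that the (parentheses) emphasis syntax is a web-UI convention that diffusers treats as literal text.

  # Sketch: sweep CFG scale (guidance_scale) and denoising strength for img2img.
  # Model ID and file names are placeholders, not the setup used for the grid above.
  import torch
  from PIL import Image
  from diffusers import StableDiffusionImg2ImgPipeline

  pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  init_image = Image.open("bathroom_window.jpg").convert("RGB").resize((512, 512))
  prompt = "(jean-michel basquiat) painting"  # parentheses are literal text here

  for cfg in (4.0, 7.5, 12.0):                # CFG scale values to compare
      for strength in (0.3, 0.5, 0.75):       # denoising strength values to compare
          out = pipe(prompt=prompt, image=init_image,
                     guidance_scale=cfg, strength=strength).images[0]
          out.save(f"img2img_cfg{cfg}_strength{strength}.png")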

The AUTOMATIC1111 fork of Stable Diffusion is indubitably the most popular client for Stable Diffusion. It should definitely have its place in the external links section. Thoughts? Leszek.hanusz (talk) 16:26, 7 October 2022 (UTC)

Reddit comments aren't reliable for anything, and Wikipedia is WP:NOT a link directory. We should not be providing links to clients at all. MrOllie (talk) 16:30, 7 October 2022 (UTC)
This is just one metric, it has more than 7K stars on GitHub, what more do you want? Do you actually use Stable Diffusion yourself? It is now THE reference. Leszek.hanusz (talk) 16:37, 7 October 2022 (UTC)
GitHub stars (or any form of social media likes) are also indicative of precisely nothing. What I want is that you do not advertise on wikipedia by adding external links to your own project. - MrOllie (talk) 16:38, 7 October 2022 (UTC)
Automatic1111 is not my own project, it has nothing to do with me. http://diffusionui.com is my own project and I agree it should not be in the external links. Leszek.hanusz (talk) 16:51, 7 October 2022 (UTC)
I agree with MrOllie. Nothing here (reddit comments, GitHub stars) is the type of sourcing that would suggest this should be included in this article. Elspea756 (talk) 16:53, 7 October 2022 (UTC)
I think it's evident in that most, nearly all, published/shared prompts for SD use the parentheses/brackets/prompt-editing syntactical sugar, a feature exclusive to the Automatic1111 webui. That should be a good indicator of its popularity if you can't use GitHub stats for some reason. 73.28.226.42 (talk) 13:41, 10 October 2022 (UTC)

Using images to promote unsourced opinions

I've removed two different versions of editors trying to use images to promote unsourced legal opinions and other viewpoints. Please, just use reliable sources that support that these images illustrate the opinions, if those sources exist. You can't just place an image and claim that it illustrates an unsourced opinion. Thanks. Elspea756 (talk) 15:54, 6 October 2022 (UTC)

"But Stable Diffusion’s lack of safeguards compared to systems like DALL-E 2 poses tricky ethical questions for the AI community. Even if the results aren’t perfectly convincing yet, making fake images of public figures opens a large can of worms." - TechCrunch. "And three weeks ago, a start-up named Stable AI released a program called Stable Diffusion. The AI image-generator is an open-source program that, unlike some rivals, places few limits on the images people can create, leading critics to say it can be used for scams, political disinformation and privacy violations." - Washington Post. I don't know what additional convincing you need. As for your edit summary of "Removing unsourced claim that it is a "common concern" that this particular image might mislead people to believe this is an actual photograph of Vladimir Putin", nowhere in the caption was that ever mentioned, that's purely your own personal interpretation that completely misses the mark of what the caption meant. --benlisquareTCE 16:02, 6 October 2022 (UTC)
Thank you for discussing here on the talk page. I see images of Barack Obama and Boris Johnson included in that TechCrunch article, so those do seem to illustrate the point you are trying to make and are supported by the source you are citing. Can we agree to replace the previously used unsourced image with either that Barack Obama image or series of Boris Johnson images? Elspea756 (talk) 16:06, 6 October 2022 (UTC)
That would not be suitable, because those images of Boris Johnson and Barack Obama are copyrighted by whoever created those images in Stable Diffusion and added those to the TechCrunch article. Per the WP:IUP and WP:NFCC policies, we do not use non-free images if a free-licence image is already available. A free licence image is available, because I literally made one, and released it under a Creative Commons licence. --benlisquareTCE 16:09, 6 October 2022 (UTC)
OK, now I understand why you feel so strongly about this image, it's because as you say you "literally made" an image and now you want to include your image in this wikipedia article. I hope you can understand you are not a neutral editor when it comes to decisions about this image you "literally made", that you have a conflict of interest here, and shouldn't be spamming your image you made into this article. Your image you are spamming into this article does not accurately illustrate the topic, so it should be removed. Elspea756 (talk) 16:15, 6 October 2022 (UTC)
It's your prerogative to gain WP:CONSENSUS for your revert, given that you are the reverting party. If you can convince myself, and the wider community of editors, that your revert is justified, then I will by all means happily agree with your revert. --benlisquareTCE 16:17, 6 October 2022 (UTC)
Nope, it is your obligation to provide sources and gain consensus for your "image made literally 27 minutes ago ffs." We have no obligation to host your "image made literally 27 minutes ago ffs." Elspea756 (talk) 16:21, 6 October 2022 (UTC)
Point to me the Wikipedia policy that says this. Almost all image content on Wikipedia is user self-created, anyway; your idea that Wikipedia editors cannot upload their own files to expand articles is completely nonsensical. All of your arguments have not been grounded in any form of Wikipedia policy; rather, they are exclusively grounded in subjective opinion, and a misunderstanding of how Wikipedia works. "We" "Your" - my brother in Christ, you joined Wikipedia on 2021-06-14, it's wholly inappropriate for you to be condescending as if you were the exclusive in-group participant here. --benlisquareTCE 16:23, 6 October 2022 (UTC)
As I've said, you have a very clear conflict of interest here. It is very evident from your language choices here, writing "ffs," "Christ," etc, that you are not a neutral editor and that you feel very strongly about spamming your unsourced, user-generated content here. I understand very clearly where you are coming from now. There is no need for you to continually restate your opinions with further escalating profanity. Elspea756 (talk) 16:34, 6 October 2022 (UTC)
I totally agree with Elspea756 and removed some images. This is clearly original research; while it is reasonable to be more lax with WP:OR as it applies to images, WP:OI (quite reasonably) states: [Original images are acceptable ...] so long as they do not illustrate or introduce unpublished ideas or arguments. Commentary on the differences between specific pictures is very different than something like File:Phospholipids aqueous solution structures.svg, which is inspired by existing diagrams and does not introduce "unpublished ideas". Yes, the idea that "AI images are dependent on the input", is published; no, no one independent has analyzed these specific pictures. Also, using AI-generated art with prompts asking to emulate specific artists' styles is not only blatantly unethical, but also potentially a copyright violation; that it is legally acceptable is not yet established. Finally, we shouldn't have a preponderance of pictures appealing to the male gaze. There are thousands, millions of potential subjects, and there is nothing special about these. Ovinus (talk) 01:47, 8 October 2022 (UTC)
This thread is specifically in reference to the Putin image used in the "Societal impacts" section, however. The disagreement here is whether or not it's appropriate to use the Putin image to illustrate the ethics concerns raised in the TechCrunch article; my position is that we cannot use the Boris Johnson image from the TechCrunch article as that would fall afoul of WP:NFCC. As discussed in a previous thread, I had already planned to replace a few of the sample images in the article with ones that are less entwined with the female form and/or male gaze, I just haven't found the time to do so yet, since creating prompts of acceptable quality is more time-consuming than most might actually assume. --benlisquareTCE 02:07, 8 October 2022 (UTC)
I understand, but the images in this article are broadly problematic, not just the Putin image. It's quite arguably a WP:BLP violation, actually. A much less controversial alternative could be chosen; for example, using someone who's been dead for a while. Ovinus (talk) 02:12, 8 October 2022 (UTC)
In that case, that's definitely an easy job to fix. I'll figure out an alternative deceased person in due time. --benlisquareTCE 02:15, 8 October 2022 (UTC)
Thank you, Ovinus, for "total agree"ment that these spam images are a problem and that they are "clearly original research." In a moment I will be once again removing the unsourced, inaccurate image of Vladimir Putin from this article that has been repeatedly spammed into this article. Besides being obvious spam and unsourced original research, it is also a non-neutral political illustration created to express an individual wikipedia editor's personal point of view, and its subject is a living person so this violates our policy on biographies of living persons. Once again, the obligation to provide sources and gain consensus is on the editor who wants their content included. We do not need a week-long discussion before removing an unsourced user-generated spam image expressing a personal political viewpoint about a living person. Elspea756 (talk) 13:22, 13 October 2022 (UTC)
I wouldn't call it spam. Benlisquare is clearly here in good faith. Ovinus (talk) 14:42, 13 October 2022 (UTC)

Wiki Education assignment: WRIT 340 for Engineers - Fall 2022 - 66826

  This article was the subject of a Wiki Education Foundation-supported course assignment, between 22 August 2022 and 2 December 2022. Further details are available on the course page. Student editor(s): Bruhjuice, Aswiki1, Johnheo1128, Kyang454 (article contribs).

— Assignment last updated by 1namesake1 (talk) 23:38, 17 October 2022 (UTC)

Not Open Source

The license has usage restrictions, and therefore does not meet the Open Source Definition (OSD):

https://opensource.org/faq#restrict
https://stability.ai/blog/stable-diffusion-public-release

Nor is the "Creative ML OpenRAIL-M" license OSI-approved:

https://opensource.org/licenses/alphabetical

It would be correct to refer to it as "source available" or perhaps "ethical source", but it certainly isn't Open Source.

Gladrim (talk) 12:40, 7 September 2022 (UTC)

This is my understanding as well, and I thought about editing this article to reflect this. However, I'm not sure how to do this in a way that is compliant with WP:NOR, as the Stability press release clearly states that the model is open source and I have been unable to find a WP:RS that clearly contradicts that specific claim. The obvious solution is to say "Stability claims it is open source" but even that doesn't seem appropriate given the lack of sourcing saying anything else (after all, the explicit purpose of that language is to cast implicit doubt on the claim). I have a relatively weak understanding of Wikipedia policy and would be more than happy if someone can point to evidence that correcting this claim would be consistent with Wikipedia policy, but at the current moment I don't see a way to justify it.
It's also worth noting that the OSI-approved list hasn't been updated since Stable Diffusion came out, and SD is the first model to be released with this license as far as I can tell. Thus the lack of endorsement is not evidence of non-endorsement. Perhaps we could say "Stability claims it is open source, though OSI has not commented on the novel license" (this is poorly worded but you get my point)
Stellaathena (talk) 17:41, 7 September 2022 (UTC)
According to the license, which is adapted from Open RAIL-M (Responsible AI Licenses), the 'M' means the usage restrictions apply only to the published Model or derivatives of the Model, not to the source code.
Open RAIL has various types of licenses available: RAIL-D (use restrictions apply only to the Data), RAIL-A (use restrictions apply only to the Application/executable), RAIL-M (use restrictions apply only to the Model), RAIL-S (use restrictions apply only to the Source code), and they can be combined in D-A-M-S order, e.g. RAIL-DAMS, RAIL-MS, RAIL-AM
The term 'Open' can be added to the licenses to clarify that the license is royalty-free and that the works/subsequent derivative works can be re-licensed 'as long as the Use Restrictions similarly apply to the relicensed artifacts'
"
Open RAIL Licenses
Does a RAIL License include open-access/free-use terms, akin to what is used with open source software?
If it does, it would be helpful for the community to know upfront that the license promotes free use and re-distribution of the applicable artifact, albeit subject to Use Restrictions. We suggest the use of the prefix "Open" to each RAIL license to clarify, on its face, that the licensor offers the licensed artifact at no charge and allows licensees to re-license such artifact or any subsequent derivative works as they choose, as long as the Use Restrictions similarly apply to the relicensed artifacts and its subsequent derivatives. A RAIL license that does not offer the artifact royalty-free and/or does not permit downstream licensing of the artifact or derivative versions of it in any form would not use the “Open” prefix." source
so technically the source code is 'Open Source'
Maybe some useful links:
https://huggingface.co/blog/open_rail
https://www.licenses.ai/ai-licenses
https://www.licenses.ai/blog/2022/8/26/bigscience-open-rail-m-license Dorakuthelekor (talk) 23:04, 17 September 2022 (UTC)
It is definitely not open source, and to describe it that way is misleading. Ichiji (talk) 15:56, 3 October 2022 (UTC)
Just a short note for now as I'll revisit this issue later: Stable Diffusion is clearly open source. The question is whether it's free and open source software (FOSS) or just open source (including any potential subtypes of open source, which don't have to be approved by any Open Source Definition (OSD)). The source code is voluntarily and fully openly available in an accessible manner, so it's open source by definition. Concerning whether or not it's FOSS: I would argue it is, but maybe there should be a clarification that it's not a condition-less, fully free type of FOSS.
Several WP:RS sources have stated that it is FOSS and even made that a major topic of their articles. See these refs and there are probably more: [1][2][3][4][5][6]

References

  1. ^ Edwards, Benj (6 September 2022). "With Stable Diffusion, you may never believe what you see online again". Ars Technica. Retrieved 15 September 2022.
  2. ^ "Stable Diffusion Public Release". Stability.Ai. Retrieved 15 September 2022.
  3. ^ "Stable Diffusion creator Stability AI accelerates open-source AI, raises $101M". VentureBeat. 18 October 2022. Retrieved 10 November 2022.
  4. ^ Kamath, Bhuvana (19 October 2022). "Stability AI, the Company Behind Stable Diffusion, Raises $101 Mn at A Billion Dollar Valuation". Analytics India Magazine. Retrieved 10 November 2022.
  5. ^ Pesce, Mark. "Creative AI gives overpowered PCs something to do, at last". The Register. Retrieved 10 November 2022.
  6. ^ "Is generative AI really a threat to creative professionals?". The Guardian. 12 November 2022. Retrieved 12 November 2022.

Prototyperspective (talk) 17:27, 12 November 2022 (UTC)

Image variety

Benlisquare, I appreciate all the work you've done expanding this article, including the addition of images, but I think the article would be improved if we could get a greater variety of subject matter in the examples. To be honest, I think any amount of "cute anime girl with eye-popping cleavage" content has the potential to raise the hackles of readers who are sensitive to the well-known biases of Wikipedia's editorship, so it might be better to avoid that minefield altogether. At the very least though, we should strive for variety.

I was thinking about maybe replacing the inpainting example with figure 12 from the latent diffusion paper, but that's not totally ideal since it's technically not the output of Stable Diffusion itself (but rather a model trained by LMU researchers under very similar conditions, though I think with slightly fewer parameters). Colin M (talk) 21:49, 28 September 2022 (UTC)

My rationale for leaving the examples as-is is threefold:
  1. Firstly, based on my completely anecdotal and non-scientific experimentation from generating over 9,500 images (approx. 11 GB) using SD at least, non-photorealistic images work best with img2img's ability to upscale images and fill in tiny, closer details without the final result appearing too uncanny for the human eye, which is why I opted for generating a non-photorealistic image of a person for my inpainting/outpainting example. Sure, we theoretically could leave all our demonstration examples as 512x512 images (akin to how the majority of example images throughout that paper were small squares), but my spicy and highly subjective take on this is, why not strive for better? If we can generate high detail, high resolution images, then I may as well do so. The technology exists, the software exists, the means to go above and beyond exists. At least, that's how I feel.
  2. Specifically regarding figure 12 from that paper, it makes no mention as to whether or not the original inpainted images are generated through txt2img which were then inpainted using img2img, or whether they used img2img to inpaint an existing real-world photograph. If it is the latter, then we'd run into issues concerning Commons:Derivative works. At least with all of the txt2img images that I generate, I can guarantee that there wouldn't be any concern in this area, as long as I don't outright prompt to generate a copyrighted object like the Eiffel Tower or Duke Nukem or something.
  3. Finally, I don't particularly think the systemic bias issue on this page is that severe. Out of the four images currently on this article, we have a photorealistic image of an astronaut, an architectural diagram, and two demonstration images containing artworks featuring non-photorealistic women. From my perspective, I don't think that's at the point of concern. Of course, if you still express concern in spite of my assurances, given time I could generate another 10+ row array of different txt2img prompts featuring a different subject, but it'll definitely take me quite some time to finetune and perfect to a reasonable standard (given the unpredictability of txt2img outputs). As a sidenote, the original 13-row array I generated was over 300 MB with dimensions of 14336 x 26624 pixels, and the filesize limit for uploading to Commons was 100 MB, which is why I needed to split the image into four parts.
Let me know of your thoughts, @Colin M. Cheers, --benlisquareTCE 03:08, 29 September 2022 (UTC)
Actually, now that I think about it, would you be keen on a compromise where I generate a fifth image, either containing a landscape, or an object, or a man, to demonstrate how negative prompting works, as a counterbalance to the images already present? The final result would be something like this: Astronaut in the infobox, diagram under "Architecture", the 13-row matrix comparing art styles under "Usage" (right-align), some nature landscape or urban skyline image under "Text to image generation" (left-align), the inpainting/outpainting demonstration under "Inpainting and outpainting" (right-align). I'm open to adjustments if suggested, of course. --benlisquareTCE 03:33, 29 September 2022 (UTC)
Regarding your point 1:
  1. I don't think we're obliged to carefully curate prompts and outputs that give the nicest possible results. We're trying to document the actual capabilities of the model, not advertise it. Seeing the ways that the model fails to generate photorealistic faces, for example, could be very helpful to the reader's understanding.
  2. Even if we accept the reasoning of your point 1, that's merely an argument for showing examples in a non-photorealistic style. But why specifically non-photorealistic images of sexualized young women? Why not cartoonish images of old women, or sharks, or clocktowers, or literally anything else? It's distracting and borderline WP:GRATUITOUS.
Colin M (talk) 04:19, 29 September 2022 (UTC)
Creating the inpainting example took me quite a few hours worth of trial-and-error, given that for any satisfactory img2img output obtained one would need to cherrypick through dozens upon dozens of poor quality images with deformities and physical mutations, so I hope you can understand why I might be a bit hesitant with replacing it. Yes, I'm aware that's not a valid argument for keeping or not keeping something, I'm merely providing my viewpoint. As for WP:GRATUITOUS, I don't think that particularly applies, the subject looks like any other youthful woman one would find on the street in inner-city Melbourne during office hours, but I can understand the concern that it may reflect poorly on the systemic bias of Wikipedia's editorbase. Hence, my suggested solution to that issue would be to balance it out with more content, since there's always room for prose and image expansion. --benlisquareTCE 06:01, 29 September 2022 (UTC)
I've gone ahead and added the landscape art demonstration for negative prompting to the article. When generating these, this time I've specifically left in a couple of visual defects (e.g. roof tiles appearing out of nowhere from inside a tree, and strange squiggles appearing on the sides of some columns), because what you mentioned earlier about also showcasing Stable Diffusion's flaws and imperfections does also make sense. There are two potential ways we can lay these out, at least with the current amount of text prose we have (which optimistically would increase, one would hope): between this revision and this revision, which would seem preferable? --benlisquareTCE 06:05, 29 September 2022 (UTC)
+1 on avoiding the exploitive images. The history of AI is rife with them, let's not add to that. Ichiji (talk) 15:59, 3 October 2022 (UTC)
I agree with Ichiji that editors should not be adding "exploitive images". I also agree with Colin M above in questioning why editors would be adding "images of sexualized young women" to this article. And I agree with Ovinus below that "we shouldn't have a preponderance of pictures appealing to the male gaze." In a moment I will be removing four unsourced, unnecessary, user-generated images created with "Prompt: busty young girl ..." We are not obligated to host anyone's "busty young girl" image collection. PLEASE NOTE: We don't need four editors to take over a week to disagree with someone spamming into this article 95 user-generated images from their "busty young girl" image collection. The obligation to provide sources and gain consensus is on the editor who wants their content included. An editor adding 90+ unsourced, user-generated images to an article is obvious spam and can be just removed on sight. Elspea756 (talk) 17:09, 8 October 2022 (UTC)
Nonsense. The consensus takes time and thorough discussion, there is no WP:DEADLINE to rush anything through without dialogue. Also, consider reading WP:SPAM to see what spam actually is. Your allegations are bordering upon personal attacks here. --benlisquareTCE 18:22, 8 October 2022 (UTC)
If certain types of images are characteristic of the usage of AI in general or this particular program in particular, why should this article pretend otherwise? Of course, it would be ideal if they were published in some RS first, but this is to be balanced with other concerns, like permissive licenses. See Visual novel article illustrations, for example. Smeagol 17 (talk) 10:12, 16 October 2022 (UTC)
I'd have to concur here; other types of AI-generated images are already well represented throughout Wikipedia, for instance the DALL-E page. In contrast with other text-to-image models, SD is particularly good at creating non-photorealistic art of anatomically realistic humans, and on occasion photorealistic images of people too if you're lucky with how the hands and limbs are generated, so showcasing such outputs makes sense compared to generic images of beach balls and teddy bears.

On a sparingly-related tangent, today I have gotten the DreamBooth article onto the front page of Wikipedia which is arguably the part of Wikipedia with the most visibility, where it is showcased within the DYK section, and this article features an AI-generated topless photo of Wikipedia founder Jimmy Wales that I personally created; there were no objections during the entire DYK review process, there were no last minute objections from any sysops managing the DYK queues, and the DYK entry has been on the front page for 23 hours now, with no one raising any objections regarding the image. Not a single Wikipedia reader, not a single Wikipedia editor. It's interesting how there's essentially zero objection to a half-nude Jimmy Wales, but as soon as a fictional woman is involved it suddenly becomes a big issue, heaven forbid anyone sees a woman on the internet. My primary complaint is not that American prudishness exists; rather, my gripe is that Americans have the culturally imperialistic habit to dictate to everyone else on the planet how they should and should not feel about images of women. --benlisquareTCE 23:44, 8 December 2022 (UTC)

So, how about returning at least Inpainting and Outpainting images? They were very illustrative. Smeagol 17 (talk) 23:43, 11 November 2022 (UTC)
I'd definitely welcome it, the article currently looks like an utter desert completely lacking in visual content, which is absolutely silly for an article about image generation AI models out of all things. Feel free to re-apply the image yourself from an old diff if you wish to; personally I'm going to refrain from doing so myself, because I have no interest in having obsessive editors breathing down my neck for weeks and weeks again. I'll leave it to the judgement of someone else. --benlisquareTCE 23:54, 8 December 2022 (UTC)

Regarding Square Brackets as Negative Prompt

This is my first time contributing to a discussion, so please be understanding if I'm not following etiquette properly. I read the talk guidelines but it assumes a rather high level of familiarity of these systems, which I do not have.

In any case, I just wanted to start a discussion regarding the interpretation of citation 16. If I am reading the source material correctly, brackets around a keyword actually create emphasis around the keyword rather than a negative correlation. It states: "The result of a quantitative analysis is that square brackets in combination with the aforementioned prompt have a small but statistically significant positive effect. No effect or a negative effect can be excluded with 99.98% confidence." It goes on to state that with very specific prompt engineering square brackets can be used to create an inconsistent and negligible effect.

It then discusses exclamation points and that they do seem to have some negative effect on reducing the appearance of certain keywords in images. Since I am new to both contributing to Wikipedia and to Stable Diffusion I wanted to see if someone smarter than me could confirm my interpretation of the source material before making the corrections to the article. Thank you. — Preceding unsigned comment added by Abidedadude (talkcontribs) 19:58, 30 October 2022 (UTC)

By design (as mentioned here, here and here as examples; note these are just for use as examples, I'd strongly recommend not citing them in any serious work), parentheses, for example (Tall) man with (((brown))) hair, increases emphasis, while square brackets, for example [Red] car with [[[four doors]]], decreases emphasis. Gaessler's findings suggest that while attempting to decrease the occurrence of something via [French] architecture, it actually has a "small but statistically positive effect" yet also "not perceptible to humans" based on his data and methodology; meanwhile, the use of rainy!! weather to emphasise (i.e. increase the occurrence of; since ! can only be used for emphasis and not de-emphasis) was not very coherent and resulted in a high chi2/NDF, and the use of negative prompts to decrease the occurrence of keywords resulted in a highly statistically significant change in outcome. --benlisquareTCE 02:31, 31 October 2022 (UTC)
I probably should also point out that some Stable Diffusion GUI implementations might use different formatting rules for emphasis; for example, NovelAI (which is a custom-modified, anime/furry-oriented model checkpoint of Stable Diffusion hosted on a SaaS cloud service) uses curly brackets, for example {rusty} tractor, for positive emphasis instead of parentheses. Not all Stable Diffusion implementations will process prompt formatting in the same way. --benlisquareTCE 02:56, 31 October 2022 (UTC)
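For readers unfamiliar with how these front-ends interpret the markers, here is a rough, illustrative sketch of the parsing idea: each level of ( raises a span's attention weight by a constant factor and each level of [ lowers it. The 1.1 factor mirrors what the AUTOMATIC1111 web UI is commonly described as using, but both the factor and the parsing details are assumptions here, and other front-ends use different syntax entirely.

  # Rough sketch of emphasis parsing: '(' raises a span's weight by a factor
  # per nesting level, '[' lowers it. The 1.1 factor is an assumption based on
  # common descriptions of the AUTOMATIC1111 webui, not a specification.
  def parse_emphasis(prompt: str, factor: float = 1.1):
      spans = []       # list of (text, weight) pairs
      depth = 0        # positive = '(' nesting, negative = '[' nesting
      buffer = ""
      for ch in prompt:
          if ch in "()[]":
              if buffer:
                  spans.append((buffer, factor ** depth))
                  buffer = ""
              if ch == "(":
                  depth += 1
              elif ch == ")":
                  depth -= 1
              elif ch == "[":
                  depth -= 1
              else:          # ']'
                  depth += 1
          else:
              buffer += ch
      if buffer:
          spans.append((buffer, factor ** depth))
      return spans

  # "(Tall) man with (((brown))) hair" -> 'Tall' at ~1.1x weight,
  # 'brown' at ~1.33x, everything else at 1.0x.
  print(parse_emphasis("(Tall) man with (((brown))) hair"))
  # "[Red] car with [[[four doors]]]" -> 'Red' at ~0.91x, 'four doors' at ~0.75x.
  print(parse_emphasis("[Red] car with [[[four doors]]]"))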
 
X/Y plot demonstrating how deemphasis and emphasis markers work

In case it satisfies your curiosity, I've just generated this X/Y plot in Stable Diffusion to demonstrate to you how [deemphasis] (emphasis) prompting works in practice. In my personal opinion, you can barely see any visual difference at all among the [deemphasised examples], so usually I don't see much point in using them at all while prompting. Of course, all of this is original research, so this is just a FYI explanation, nothing more and nothing less. --benlisquareTCE 03:48, 31 October 2022 (UTC)

That's definitely interesting. I do like experimenting with prompts and learning about other people's experiences. You're saying that your info is based on NovelAI's custom implementation, yes? If so, perhaps it would be better to put the emphasis/de-emphasis info in the article about NovelAI? Because it seems the default checkpoint for Stable Diffusion hasn't been trained in any specific way to handle square brackets, and the citation in question doesn't really seem to support the assertion in the article about emphasis. In any case, perhaps it's all moot if everybody seems to more or less agree that the effect is imperceptible, even if it does exist. Maybe this sort of granular prompt fine tuning just shouldn't be mentioned at all, given that it's all pretty unreliable and results can be unpredictable with any machine learning? As a side note, with regards to anecdotal FYI info, I have been experimenting with JSON to input prompts (with default checkpoints) and the results have been pretty interesting. It's obvious that it hasn't been trained to interpret that in any way, but minor changes to the prompt really seem to result in much more significant differences in the resulting image vs natural language. I definitely haven't experienced brackets the way anybody else is describing them, but again, it's all anecdotal. Abidedadude (talk) 07:58, 31 October 2022 (UTC)
No, 100.000000% (9 significant figures) of what I've covered above has nothing to do with NovelAI. I just mentioned in passing that NovelAI uses curly brackets instead of parentheses.
given that it's all pretty unreliable
That's precisely what the citation says: that using emphasis markers is less reliable than using negative prompts. And that's what's asserted in the Wikipedia article as well: "The use of negative prompts has a highly statistically significant effect... compared to the use of emphasis markers" (bold emphasis mine).
it really seems to make minor changes to the prompt result in much more significant differences
Yes, that is correct, and it's because even slight adjustments to things like word order, punctuation, and spelling add additional noise to the prompt, which will lead to a different output. The model doesn't parse the prompt like a human would, and we see this when big red sedan driving in highway and highway, with red big sedan, driving on it result in different outputs even with the same seed value. --benlisquareTCE 10:54, 31 October 2022 (UTC)
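To illustrate the point about prompt sensitivity, a minimal sketch with the diffusers library is below: the seed and every other setting are held fixed, only the wording changes, and the two outputs will still differ. The model ID is an assumption for illustration.

  # Same seed, same settings, two rewordings of the same idea -> different
  # images, because each wording produces a different text conditioning.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  prompts = [
      "big red sedan driving in highway",
      "highway, with red big sedan, driving on it",
  ]
  for i, p in enumerate(prompts):
      gen = torch.Generator("cuda").manual_seed(42)   # identical seed each run
      image = pipe(p, guidance_scale=7.5, generator=gen).images[0]
      image.save(f"sedan_variant_{i}.png")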
I have to be honest, I'm even more confused now than when we started this discussion, so I'm probably just going to go ahead and bow out at this point. I'm pretty sure your own original research in the example above was intended to show me that negative prompting was less effective than emphasis, but right now you're telling me that the assertion in the article - about negative prompts being more effective - is correct. Even though the original research you conducted is consistent with the citation (which is then inconsistent with the statement in the article). Perhaps it's all a joke of some kind, because the 9 significant digits bit is pretty funny. All that extra emphasis on significance, and yet it doesn't change the end result one bit, much like the topic of discussion, no? Anyway, I did enjoy the discussion, but I'm afraid it's either going over my head or that the contradictions are just becoming too much for me to care about. I thought I was just helping out with a quick fix. I do appreciate you taking the time to engage with me either way. Abidedadude (talk) 17:34, 31 October 2022 (UTC)
An example of one of many different UI implementations of Stable Diffusion. This particular one is built upon the Gradio web frontend library, but there are non-web frontends for Windows and macOS as well. These UI frontends allow the user to interact with the model checkpoint without needing to type commands into a python console, making the barrier to entry easier for new users. All of these UIs have separate text fields for "prompt" and "negative prompt", as seen above. You enter what you want to see in the output image into the "prompt" text field; you enter what you don't want to see in the "negative prompt" field.
My bad, I should definitely be clearer in my explanation. My first question to you is, are you using a UI frontend while using Stable Diffusion, or are you directly inputting the settings (e.g. sampler steps, CFG, prompts, etc.) into a command-line interface? If you are using a UI frontend, which one are you using? Are you running the model locally on your own computer, or are you using a cloud service via a third-party website?
As mentioned in the article, these features are provided by open-source UI implementations of Stable Diffusion, and not the 3.97GB model checkpoint itself. The UI implementation acts as an interface between the user and the model, so that the user doesn't need to punch parameters into a python console window. There are many different open-source UI implementations for Stable Diffusion, including Stable Diffusion UI by cmdr2, Web-based UI for Stable Diffusion by Sygil.Dev, Stable Diffusion web UI by AUTOMATIC1111, InvokeAI Stable Diffusion Toolkit, and stablediffusion-infinity by lkwq007, among others. All of the aforementioned implementations utilise both negative prompting features and emphasis features. In fact, almost every single Stable Diffusion user interface now has these features; it is now the norm, rather than the exception, for Stable Diffusion prompts to feature negative prompting and emphasis marking given that they significantly reduce the quantity of wasteful, low quality generations to sift through; go to any prompt sharing website or Stable Diffusion online discussion thread, and the vast majority of shared prompts will feature negative prompts or emphasis markers, or even both. Since this is a common question raised by someone else above, I should point out it is inappropriate for the Wikipedia article itself to list all of these UI implementations, as Wikipedia is WP:NOT a repository of external links; the examples I've provided above are just to make sure you have full context on what's going on.
Just like how the original CompVis repo provided a collection of python scripts that allow the user to interact with the model checkpoint file (the 3.97GB *.ckpt file that does much of the heavy lifting, so to speak), and those python scripts aren't "part" of the model checkpoint, open-source user interfaces likewise implement their own interface between the user and the 3.97GB *.ckpt; this space has been rapidly evolving and improving over the past few months, mostly as an open-source community driven effort, and the "norm" for the range of configurable settings available to make prompts has shifted considerably since September.
If you have any additional questions relating to this topic in particular, or if you would like assistance on how to set up any of the aforementioned UI implementations or how to improve your prompting to obtain better outputs, feel free to let me know. As someone who has generated over 33 GB of images through experimentation in Stable Diffusion and is quite passionate about fine-tuning prompts to find the most perfect outputs, I'd be quite glad to help out. --benlisquareTCE 22:20, 31 October 2022 (UTC)
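For context on what those two text fields map to under the hood, here is a minimal sketch of negative prompting without any GUI, via the diffusers library's negative_prompt argument; the model ID and the example prompts are assumptions for illustration, and the web front-ends described above simply expose the same idea as a second text box.

  # Sketch: "prompt" describes what should appear, "negative_prompt" what
  # the sampler should be steered away from. All names are placeholders.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  image = pipe(
      prompt="watercolour painting of a country cottage, detailed, daylight",
      negative_prompt="blurry, deformed, text, watermark, low quality",
      guidance_scale=7.5,
  ).images[0]
  image.save("cottage_with_negative_prompt.png")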

Consensus is that we do not need to host a user's "busty young girl" image collection

There seems to be a dispute by editor Smeagol over whether there is a consensus for whether or not this page should be used to once again host benlisquare's unsourced, user-generated images created by prompting Stable Diffusion to create images of "busty young girls." Editor Ovinus above has disagreed with including these images, saying "we shouldn't have a preponderance of pictures appealing to the male gaze. There are thousands, millions of potential subjects, and there is nothing special about these." Editor Colin M above has also disagreed with including these "busty young girls," saying "why specifically non-photorealistic images of sexualized young women? Why not ... literally anything else? It's distracting and borderline WP:GRATUITOUS." Editor Ichiji has agreed, saying "+1 on avoiding the exploitive images." I agree with Ovinus, Colin M, and Ichiji, so that is a very clear consensus of 4 editors in agreement that benlisquare should not use Wikipedia to host their "busty young girl" image spam. My understanding is that editor Smeagol now wishes to add some illustration of inpainting and outpainting, and so has reverted my removal of these images. A desire to include an example of inpainting does not change the current consensus that if an illustration of inpainting were to be included here, it should be from a source other than benlisquare's unsourced, user-generated images of busty young girls. Such an example can be, as Colin M has said and others agreed, of "literally anything else." Does that make sense? Or is there a disagreement with this? Elspea756 (talk) 00:13, 19 December 2022 (UTC)

You are obsessed. Heaven forbid somebody depict your messiah Vladimir Putin in a less-than-positive light, huh. My one and only regret is creating that Putin image, which seems to be the sole origin for this multiple month-long obsession.

There seems to be a dispute by editor Smeagol... so that is a very clear consensus of 4 editors in agreement

Sounds like you don't have a consensus, then. Considering myself, @Smeagol 17, and 49.228.220.126 from Thailand are not in agreement, that makes 3 editors who disagree with 4 editors - hardly a consensus by any stretch. Not to mention, why are you grasping onto a tiny handful of posts from months ago, given that consensus can change, anyway? Furthermore, consensus is built based on the quality of arguments and not the quantity of proponents; frankly, lazy arguments that are shit and can be easily pulled apart can be given less credence.

it should be from a source other than benlisquare's unsourced, user-generated images

You contradict yourself. The Artificial intelligence art article, which is your personal precious baby that you valiantly defend every single day, is full of Wikipedia user-generated content such as this and this and this and this, yet frankly you don't seem to give a shit. Funny that, huh?

...of busty young girls

It's not my problem that you get boners looking at... checks notes... ordinary women wearing clothes. This is strictly your problem, not mine. Unless you live a sheltered life residing in the basement of a Baptist church in the American midwest, women in the real world like to wear fashionable clothes (horror, I know), and the depicted women look like any typical urbanite youth in any non-shithole city. It's the 21st century, don't tell women what to wear, teach men not to harm women instead. Is your solution to gender inequality to unironically erase women from the public eye? If your complaint is that the women are attractive and that I should have made her uglier, then my question is... why should I make pictures of ugly people, of any gender? From the marble statues of Cleopatra to the Mona Lisa, people generally prefer to create art of attractive people, and art will be heavily skewed towards attractive-looking people, news at 11, sports at 11:30.

Apart from this constant whining, have you contributed anything to this article? It's always the people who have nothing useful to bring to the table who whine and moan the loudest. You know why the current inpainting example images are being used? Because literally nobody has bothered to put in the effort to create something better in quality and educational usefulness. If you have such dire complaints, then why don't you create some better images to replace the existing content with? Surely it's easy for you, right? The tools are right there, freely downloadable, readily useable, and fully documented. Or are you conceding that you are incapable of making any contributions yourself to fix the issue of "women that are too attractive" as opposed to whining about it for months and months on end? --benlisquareTCE 02:38, 19 December 2022 (UTC)

As I said earlier, such pictures are fairly typical for Stable Diffusion use, afaik. So why should we pretend otherwise, and on its own page, no less? (And it is one out of four/five, for now). Not to mention, the choice is between this picture and no picture, given that no one created another one in 2 months. Smeagol 17 (talk) 03:00, 19 December 2022 (UTC)
Hi, Smeagol, thank you for discussing on this talk page. No, these don't seem typical to me of the Stable Diffusion images I've created or that I have seen created by anyone else I know. As has been repeated several times here, Stable Diffusion can be used to create anything one types a prompt in for. Yes, I am sure we could find a different, more suitable image of inpainting from a reliable source, or we could make our own. Would an example of inpainting involving dogs or cats, maybe somewhat similar to those seen at https://huggingface.co/stabilityai/stable-diffusion-2-inpainting be a suitable solution for you here? Elspea756 (talk) 04:00, 19 December 2022 (UTC)
I don't have a statistic, but I would be surprised if less than 1/5 of all images created by SD right now are of "beautiful women" (see for example this, for Midjourney: https://tokenizedhq.com/wp-content/uploads/2022/09/midjourney-statistics-facts-and-figures-popular-prompt-words-infographic.jpg?ezimgfmt=ng:webp/ngcb1). A picture from a reliable source would be ideal... in principle, but our RS-s are unlikely to have permissive enough licenses for that. If someone creates a competing figure of better quality using cats, we can discuss replacement, but if it is of the same or worse quality, then, imho, the one who first took the effort to create an illustration (in this case, benlisquare) shall have priority. Smeagol 17 (talk) 08:01, 19 December 2022 (UTC)
If the new result is worse in quality, I would still consider that a bad change... Just because we CAN change to a picture of a cat doesn't mean a lower resolution/blurry/unclear picture is better 49.228.220.126 (talk) 17:04, 19 December 2022 (UTC)
these don't seem typical to me of the Stable Diffusion images I've created or that have seen created by anyone else I know - sounds like you should pay more attention to what the SD community has been churning out over the past few months, finding similar examples isn't even difficult. Being unaware is not an excuse. --benlisquareTCE 15:41, 19 December 2022 (UTC)
what part of the wikipedia rules talks about the male gaze, this tutorial is super helpful so even if you might be offended by it why should it go???? i've seen worse content on this site, why not go protest that????? 98.109.134.212 (talk) 22:43, 20 December 2022 (UTC)
Speaking of tutorials, does Elspea756 provide a step-by-step guide on how to replicate his output image in the file description? Of course not, he's not here to educate or help the reader, he's here to win an internet argument. At least my file descriptions actually have some effort put into them. --benlisquareTCE 02:49, 21 December 2022 (UTC)
Since Wikipedia isn't for writing step-by-step guides (WP:NOTHOWTO) that really isn't a problem. - MrOllie (talk) 03:43, 21 December 2022 (UTC)
You're not wrong about WP:NOTHOWTO, but you still haven't addressed my main point that his image is the epitome of low-effort. --benlisquareTCE 03:54, 21 December 2022 (UTC)
It illustrates the concept in the article so the reader can understand it, and it does so without a sexualized image. The amount of effort involved isn't really a factor. - MrOllie (talk) 04:00, 21 December 2022 (UTC)
Understood. If I put the woman in a hijab, would you still have opposition to it? That would remove the entire "sexualisation" element that forms the crux of this dispute. It would take me no more than 15 minutes, and would alleviate your primary concern regarding WP:SYSTEMIC. --benlisquareTCE 04:04, 21 December 2022 (UTC)
Covering the hair really isn't the point. MrOllie (talk) 04:14, 21 December 2022 (UTC)
I should clarify: the entire body, apart from the arms and face. The additional bonus would be that it would allow for representation of non-white cultures, another issue relating to WP:SYSTEMIC. Two birds with one stone, wouldn't you agree? --benlisquareTCE 04:15, 21 December 2022 (UTC)
Disruptive and discriminatory imagery and commentary. ~ Pbritti (talk) 20:37, 22 December 2022 (UTC)

Alhamdulillah she has seen the folly of her ways, and has learned to embrace modesty and the grace of Allah. Fatima (fictional character, any resemblance to a real-world Fatima is purely coincidental) shall no longer partake in the folly and hedonism of decadent fashion, inshallah.

Before
Demonstration of inpainting and outpainting techniques using img2img within Stable Diffusion
Step 1: An image is generated, coincidentally with the subject having one arm missing.
Step 2: Via outpainting, the bottom of the image is extended by 512 pixels and filled with AI-generated content.
Step 3: In preparation for inpainting, a makeshift arm is drawn using the paintbrush in GIMP.
Step 4: An inpainting mask is applied over the makeshift arm, and img2img generates a new arm while leaving the remainder of the image untouched.
After
Demonstration of inpainting and outpainting techniques using img2img within Stable Diffusion
Step 1: An image is generated, coincidentally with the subject having one arm missing.
Step 2: Via outpainting, the bottom of the image is extended by 768 pixels and filled with AI-generated content.
Step 3: In preparation for inpainting, a makeshift arm is drawn using the paintbrush in GIMP.
Step 4: An inpainting mask is applied over the makeshift arm, and img2img generates a new arm while leaving the remainder of the image untouched.
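For readers curious how a masked inpainting step like step 4 above looks outside of a GUI, a minimal sketch using the diffusers inpainting pipeline is below. The checkpoint, file names, and prompt are placeholders for illustration; the GUI workflow described in the captions differs in its details (e.g. the mask is painted directly in the interface rather than loaded from a file).

  # Sketch of masked inpainting: white areas of the mask are regenerated,
  # the rest of the image is left untouched. All names are placeholders.
  import torch
  from PIL import Image
  from diffusers import StableDiffusionInpaintPipeline

  pipe = StableDiffusionInpaintPipeline.from_pretrained(
      "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
  ).to("cuda")

  init_image = Image.open("figure_with_sketched_arm.png").convert("RGB")
  mask_image = Image.open("arm_mask.png").convert("RGB")   # white = repaint

  result = pipe(
      prompt="young woman in a coat, detailed arm and hand",
      image=init_image,
      mask_image=mask_image,
  ).images[0]
  result.save("inpainted_arm.png")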

This is how things are actually fixed. Are there any further complaints, or can we finally put an end to this multiple month-long circus? --benlisquareTCE 09:52, 21 December 2022 (UTC)

Brilliant, nobody has any objections. I sure hope nobody will start whining again immediately after I put this new version into the article, now that the prior concerns have been resolved. --benlisquareTCE 13:45, 22 December 2022 (UTC)
Yeah, as I've suspected. Elspea756 doesn't give a shit about WP:SYSTEMIC, WP:GRATUITOUS or modesty, he's just spiteful and bitter that he's not getting his way. I guess that puts the "I care about protecting Wikipedia from sexual depictions of women" hypothesis to rest, huh? --benlisquareTCE 13:54, 22 December 2022 (UTC)

New non-WP:GRATUITOUS, more representative, higher-resolution, and up-to-date images to replace previous disputed images created by the prompt "busty young girl"

I have uploaded new images to replace the previous disputed series of images that had been created with the prompt "busty young girl." This new set of images is far superior to the previous images for the following reasons: 1) These new images address the concerns of editors Ovinus, Colin M, and Ichiji who requested that we stop using "pictures appealing to the male gaze," "sexualized young women," images in violation of "WP:GRATUITOUS," and "exploitive images." 2) Editor Smeagol requested that the images be created by prompts that are "typical" and "representative" of what AI artists typically use, and Smeagol provided (thank you) a link to a list and chart of 20 "most popular" prompt terms. Terms like "Busty" and "Young" appear nowhere on this list. The top three terms, each of which the chart seems to show as at least twice as popular as any of the other terms below, are "man," "city," and "space." The new images I've uploaded are created with a prompt using 4 of the top 5 words on this list (man, city, space and cat), as well as other words that are in this top 20 ("cyberpunk," etc.). So, these new images are likely several times more representative than any images created with prompts like "busty and young" which are nowhere on the list provided by Smeagol. 3) Smeagol and IP editor 49.228.220.126 expressed a preference for images of higher resolution. These new images are up to twice the resolution of any of the previous images. And 4) Creating these new images used the 2.1 version of Stable Diffusion which was just released on December 7, so these images are a more up-to-date example of Stable Diffusion than any images created previously. So for at least all of those reasons and likely more —non-gratuitous, more representative, higher resolution, and created with a more up-to-date version — I think and hope we can all agree that these images created with a "man with cat in a city in space" prompt are a vast improvement over the previous images created with a prompt for "busty young girl." Thank you again. Elspea756 (talk) 17:42, 20 December 2022 (UTC)

honestly it looks trash, the cat looks stoned on weed and the guy's head anatomy is completely off. bring back the old one, this really reeks of desperation by someone lacking competence. 2003:C6:5F00:9700:E847:13DA:2EC6:9854 (talk) 21:48, 20 December 2022 (UTC)
The old 'busty girl' image was a literal poster child for WP:SYSTEMICBIAS. Keep the new one. - MrOllie (talk) 22:07, 20 December 2022 (UTC)
Your logic appears to be that you're willing to accept any low-quality substitute as long as "systemic bias" is not introduced, is that correct? Even if the non-systemic bias alternative is five steps backwards? I feel like you're more interested in culture wars rather than making the article better. 142.188.128.54 (talk) 03:22, 21 December 2022 (UTC)
Have fun arguing with that straw man. - MrOllie (talk) 03:29, 21 December 2022 (UTC)
Did you really drag the resolution sliders up with no regard to what the final image looks like, just so that you could "beat" the resolution size of my previous image? Just look at it, the finer details are messy and it's clearly very blurry once you zoom in to native resolution. Also, you have set CFG way too high, which is why you are seeing random purple splotches everywhere. Not to mention, your file description contains barely any information at all; what is the reader supposed to "learn" from your image? You haven't even given the reader the courtesy to reproduce your steps should they choose to do so; by comparison, I have been 100% transparent regarding each step I took, and how I got to my final image, within my descriptions.
I don't really know why you like to bring up WP:GRATUITOUS so often, an image of a normal, clothed woman is not gratuitous, and just because someone has mentioned WP:GRATUITOUS doesn't make it true. These images are WP:GRATUITOUS:

Images (Redacted)

...which is why we don't use these images in Wikipedia articles, unless we absolutely need to as a last resort. The earlier inpainting images on the other hand were not WP:GRATUITOUS, and no amount of ad nauseam repetition will turn this falsehood true. --benlisquareTCE 02:47, 21 December 2022 (UTC)
I also vote it is not GRATUITOUS, given context and usage. It is like complaining that an article about Italian Renaissance painting or ancient Greek art has some not-covered-from-head-to-toe women. Smeagol 17 (talk) 10:19, 22 December 2022 (UTC)

Developer(s)

The infobox and article present Stability AI as the developer. However, this is incorrect:

- Stable Diffusion is essentially the same approach as the Latent Diffusion Models (LDM) developed by the CompVis group at LMU Munich and a coauthor from Runway. Patrick Esser, one of the license holders of Stable Diffusion (https://github.com/CompVis/stable-diffusion/blob/main/LICENSE), said: "When we wrote that paper [LDM] we showed that it actually works, nicely! […] Then it was like - can we scale this up? And that led us to Stable Diffusion. It is really the same model, slight changes but not too essential. Just on a bigger scale in terms of our resources." https://research.runwayml.com/the-research-origins-of-stable-difussion

Comparing the code from the LDM and SD github confirms this. Moreover, the depictions and explanations of the approach on this Wikipedia article are all discussing the ideas of the original approach.

- Both the source code as well as the models had so far been released by the CompVis group at LMU Munich (https://github.com/CompVis/stable-diffusion). The license is issued on their Github (https://github.com/CompVis/stable-diffusion/blob/main/LICENSE) by the original authors of https://arxiv.org/abs/2112.10752, Rombach et al.

- The CompVis Github lists the contribution of Stability AI as providing funding for compute, stating that “Thanks to a generous compute donation from Stability AI and support from LAION, we [CompVis] were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database.“

- Cristobal Valenzuela (CEO of Runway): “This version[1.5] of Stable Diffusion is a continuation of the original High-Resolution Image Synthesis with Latent Diffusion Models [LDM] work that we created and published (now more commonly referred to as Stable Diffusion)[…] we thank Stability AI for the compute donation to retrain the original model“, https://huggingface.co/runwayml/stable-diffusion-v1-5/discussions/1

- The github and huggingface of CompVis, Runway, etc. cite https://openaccess.thecvf.com/content/CVPR2022/html/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.html

as the reference to the Stable Diffusion approach.


TL;DR

The approach was actually developed by the CompVis group at LMU Munich (leading authors) + a coauthor from Runway. The references above show that the team then made only minor modifications to retrain essentially their original approach on a larger dataset. For this retraining, Stability AI donated the compute on AWS servers. However, donating compute for a model built by another research team does not make them the (sole) developer. Instead, all repositories are crediting the original authors. 89.206.112.10 (talk) 19:40, 21 October 2022 (UTC)

I understand where you're coming from here, but most of the analysis above is based on primary sources. Secondary sources overwhelmingly attribute Stable Diffusion principally to Stability AI. e.g.
(And these aren't cherry-picked - they're just the first several secondary sources I encountered among the article's references.) Privileging our own analysis of primary sources in contradiction of the majority of secondary sources would be a violation of WP:OR. We can definitely talk about the details of how different parties were involved in the development of the model (perhaps in a new "Development" section?), but I think we should defer to secondary sources in giving Stability AI top billing when summarizing the topic. Colin M (talk) 16:12, 22 March 2023 (UTC)
Stable Diffusion builds on the previous work of the CompVis group at LMU Munich and Runway. The secondary sources listed above are cherry-picked. Most articles list LMU Munich and Runway as the authors, including the original publication here and here. e.g.
Most importantly, the paper and original source code, as well as the models, have been released by CompVis and Runway. This is stated in the original CompVis repository as well as in an interview by one of the authors. Furthermore, Stability's forked version of the model also states this in the Readme section:
“The original Stable Diffusion model was created in a collaboration with CompVis and RunwayML and builds upon the work: High-Resolution Image Synthesis with Latent Diffusion Models”
Since the release of the original model, Stable Diffusion has been forked, and multiple versions of the model are maintained by different organizations.
Since this page is dedicated to the original model, attribution should be given to the original developers. It seems like Stability wrongly claimed ownership of the model, which was picked up by the sources listed before, but it has since retracted this position. Juhun87621 (talk) 02:54, 24 March 2023 (UTC)
"The secondary sources listed above are cherry-picked." I literally said they weren't, and described how I found them. Accusing me of bad faith does not really set the tone well for us to have a productive collaborative discussion. If sources disagree on a particular matter, which seems to be the case here, we present all the mainstream views with appropriate weight. But what you did in this edit was to just remove the citations that disagree with your preferred vision, which is not an appropriate way to handle this. Colin M (talk) 15:09, 25 March 2023 (UTC)
Since you accused me of cherry-picking the above results, here's another experiment. I searched the New York Times for the three most recent articles mentioning Stable Diffusion. All three attribute it to Stability AI (and make no mention of Runway, LMU Munich, or any other orgs):
Colin M (talk) 15:21, 25 March 2023 (UTC)
I agree that all of the original developers, including CompVis and RunwayML, should be listed in the infobox. However, they should be listed in the "Original author(s)" parameter, not the "Developer(s)" parameter, to avoid implying that CompVis and RunwayML are the current developers. Is that a good solution for everyone? Elspea756 (talk) 15:48, 25 March 2023 (UTC)
Sure, as long as we have citations to secondary sources that support that. Also, the infobox should generally be a summary of the content of the article proper, so if we're going to mention Runway in the infobox, it would be good to also have some text in the "Development" section elucidating its involvement in the project. Colin M (talk) 16:10, 25 March 2023 (UTC)
Yes, that all sounds good. The various original authors and their roles should definitely be in the article. The first secondary source you (Colin M) suggest above is Tech Crunch. So, we could use this secondary source "Tech Crunch: This startup is setting a DALL-E 2-like AI free, consequences be damned" which describes Stable Diffusion as "A collaboration between Stability AI, media creation company RunwayML, Heidelberg University researchers and the research groups EleutherAI and LAION ... CompVis, the machine vision and learning research group at Ludwig Maximilian University of Munich, oversaw the training ..." Is that a good solution for everyone? Elspea756 (talk) 16:37, 25 March 2023 (UTC)
I'm not seeing how that source supports the interpretation that CompVis and RunwayML are the "original authors" and Stability AI is the "developer", so I wouldn't support the infobox change you mentioned above based on that source. But it could be used to add a mention of Runway in the "Development" section as a collaborator. Colin M (talk) 16:48, 25 March 2023 (UTC)
To be clear, I am not saying that the source supports, as you've put it, "CompVis and RunwayML are the 'original authors' and Stability AI is the 'developer'". I am saying that the Tech Crunch article supports that the "original authors" are Stability AI, RunwayML, CompVis, LAION, et al. Is there agreement on this? Elspea756 (talk) 19:49, 25 March 2023 (UTC)
Yes, I agree that that source supports that statement. But I would not use that fact to give all of those organisations equal weight in the article (including mentions in the intro/infobox), because that's just one source, and the vast majority of secondary RS coverage of Stable Diffusion give Stability AI as the main creator (with most not even mentioning Runway, CompVis, LAION, etc.). Colin M (talk) 19:06, 4 April 2023 (UTC)
Not accusing you of bad faith. I see your point. It seems like there are sources to justify both arguments in equal measure.
However, I do believe this article is about the original development of the model. Since it was released, Stable Diffusion has been forked 7.4K times. Stability maintains a separate version of the original model which diverges from the initial release. The confusion arises from the fact that both carry the same name. We should clarify that. I suggest that:
  • This article should refer to the original Stable Diffusion model that we all agree was created by CompVis and RunwayML. And so, they should be listed as Original author(s) and developers of the original model. “The original Stable Diffusion model was created in a collaboration with CompVis and RunwayML and builds upon the work: High-Resolution Image Synthesis with Latent Diffusion Models"
  • Add a sub-section which further explains that Stability donated compute to train the original version and is the maintainer of a forked version of Stable Diffusion available here. The original version still remains here.
Juhun87621 (talk) 05:07, 26 March 2023 (UTC)
When Stability AI says at https://github.com/Stability-AI/stablediffusion that "The original Stable Diffusion model was created in a collaboration with CompVis and RunwayML," Stability AI is saying that Stability AI created the model in collaboration with CompVis and RunwayML. That is, all three of them collaborated on it. Similarly, when CompVis says at https://github.com/CompVis/stable-diffusion that "Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway," that is CompVis stating once again that all three (Stability AI, CompVis, and Runway) collaborated on the model. This is all consistent with the description in the reliable secondary source article I cited earlier "Tech Crunch: This startup is setting a DALL-E 2-like AI free, consequences be damned" which describes Stable Diffusion as "A collaboration between Stability AI, media creation company RunwayML, Heidelberg University researchers and the research groups EleutherAI and LAION ... CompVis, the machine vision and learning research group at Ludwig Maximilian University of Munich, oversaw the training ..." So, the "original authors" in the infobox should list Stability AI, CompVis, and Runway. Is that clear and can we all agree on it? Elspea756 (talk) 05:28, 26 March 2023 (UTC)
The primary sources above point to Stable Diffusion being developed by the CompVis group at LMU Munich (leading authors) plus a coauthor from Runway. However, shortly after the original release, Stability AI apparently portrayed their role as that of primary developer, without mentioning the other entities, e.g., https://www.youtube.com/watch?v=1Uy_8YPWrXo
This seems to then have been picked up by the press and led to the secondary sources mentioned above. However, the CEO of Stability AI, Emad Mostaque, later clarified that "Stable Diffusion came from the Machine Vision & Learning research group (CompVis) @LMU_Muenchen", https://twitter.com/EMostaque/status/1587844074064822274?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E1587844074064822274%7Ctwgr%5E9cde94082155c6f6638c8bee5bd005a3301d1a23%7Ctwcon%5Es1_
How can we claim a company to be the primary or even sole developer, when even their CEO has clarified that it was another entity, as is corroborated by all the primary sources? Mistakes easily happen and get picked up and repeated by the press. However, all parties under discussion (CompVis, Runway, Stability AI) eventually pointed towards the authors of the original publication as being the developers: CompVis and Runway in the primary sources cited above, and the CEO of Stability AI as quoted here. Moreover, Emad's twitter post links to a project page with more recent press coverage that corroborates this. 89.206.112.10 (talk) 07:09, 29 March 2023 (UTC)
In the wikipedia software infobox, "original author" is for the "Name of the original author(s) or publisher(s) of the software," and "developer" is for the "Name of the current developer of the software." So, that is why "developer" is going to list the current developers and is not going to list "the authors of the original publication." See https://en.wikipedia.org/wiki/Template:Infobox_software Elspea756 (talk) 12:50, 29 March 2023 (UTC)
So, the "original authors" in the infobox should list Stability AI, CompVis, and Runway. Is that clear and can we all agree on it? Given that we're talking about a neural network, it's not entirely clear what "original authors" should actually mean. The authors of the source code used to train the model? The researchers that developed the architecture used by the model? The people that assembled the training set and ran the training code that led to the creation of the model? This is a problem that arises when using {{Infobox software}} on an article about a topic which is not exactly software (in fact, I've been thinking of creating an infobox specifically for ml models/neural nets for a while). This is why I'd prefer to just avoid the "original authors" field so that we can explain the nuances of which people/orgs contributed in which ways in the article's prose. Colin M (talk) 19:23, 4 April 2023 (UTC)
It is evident to everyone in this thread who the original authors and developers of the model are. I'm not sure why we keep discussing this. Multiple secondary sources confirm the origins multiple times, including the ACTUAL code and research. This has been confirmed by all parties involved in the development of the model. The argument of "we're talking about a neural network" is completely out of context. The people/orgs that contributed to the development, invention, training, release and publication are LMU and Runway per all verified sources listed above by multiple contributors. Stability donated compute and cannot be considered a developer since they didn't "develop" anything. Even the CEO of Stability confirmed that. Not sure what else there is to discuss. There's a lack of NPOV in the edits made to this page. Juhun87621 (talk) 21:40, 4 April 2023 (UTC)
Maybe we should try to see if we can find any common ground here. Let's start with this: do you agree with me that the majority of secondary source coverage of Stable Diffusion describes it as being developed by Stability AI? If the answer is "no", can you suggest some procedure we could use to gather some objective data to answer this question? Colin M (talk) 21:47, 4 April 2023 (UTC)
> Maybe we should try to see if we can find any common ground here.
Agree!
> do you agree with me that the majority of secondary source coverage of Stable Diffusion describes it as being developed by Stability AI?
No. All primary sources and almost all secondary sources list LMU Munich and Runway as developers and Stability as donating compute. The only argument against that is citing early sources that directly quote the CEO of Stability. Since those articles were published, the CEO of Stability has retracted his position. See here
Primary Sources: These are the developers themselves, the creators of the model. This is the original source of information about the topic. The creators state multiple times that the model was developed by LMU Munich and Runway.
Secondary Sources:
> can you suggest some procedure we could use to gather some objective data to answer this question
I suggest we properly attribute the development, research, and training of the model to the original authors: LMU Munich and Runway. And we should attribute the compute donation to Stability. Let me know what you think. Juhun87621 (talk) 01:01, 5 April 2023 (UTC)
The list of secondary sources seems to be cherry-picked to support your case. What I'm wondering is whether we can come up with a procedure to generate a sample of RS coverage of Stable Diffusion that is more or less unbiased, and then see what that sample says. For example, we could do a Google News search for "Stable Diffusion", sort by recent, and take the ten first sources that are listed as "generally reliable" at WP:RSP. That's just the first idea that comes to mind. Let me know if you can think of a better method. Colin M (talk) 04:49, 5 April 2023 (UTC)
I'm going to note that the secondary sources listed by Juhun87621 actually contradict their claim that Stability AI is not one of the original developers. For example, the Silicon Angle article "Stable Diffusion developer Runway raises $50M to create AI multimedia tools" says "Runway provided the foundational research for Stable Diffusion and collaborated with Stability AI." The other sources seem to mostly just name one of the co-developers, without naming the others, which makes sense because these articles are each largely about one of the co-developers launching a new project. So, again, these sources support that the original developers of Stable Diffusion are Runway, CompVis, and Stability AI. Note that I did not watch the "NVIDIA's CEO" video that is over an hour long; if you expect to use that as a source please provide a time stamp and relevant quote from the video. Elspea756 (talk) 11:37, 5 April 2023 (UTC)
> The list of secondary sources seems to be cherry-picked to support your case
This is the same argument I made for all your sources and that you dismissed as "bad faith". All your secondary sources seem to be extremely cherry-picked. As Elspea756 also mentions, all recent secondary sources support that the original developers of Stable Diffusion are Runway, CompVis, and Stability AI.
I think we need some common sense and good editorial judgment here. We should rely on statements of fact from primary sources, validated by reliable and up-to-date secondary sources. We have the statements, quotes, and citations from accounts written by the actual paper authors and developers. We should avoid the narrow perspective created by unreliable quotes in secondary sources. In this case, the development, research, and training of the model has to be attributed to the original authors: LMU Munich and Runway. And we should attribute the compute donation to Stability.
> Note that I did not watch the "NVIDIA’s CEO" video that is over an hour long; if you expect to use that as a source please provide a time stamp and relevant quote from the video
Apologies, time stamp here: https://www.youtube.com/watch?v=DiGB5uAYKAg&t=3202s (53:22) Juhun87621 (talk) 12:42, 5 April 2023 (UTC)
"This is the same argument I made for all your sources and that you dismissed as 'bad faith'." The difference is that I explained to you the procedure that I used to generate my list of sources. We disagree on a factual question: do the majority of reliable secondary sources attribute Stable Diffusion primarily to Stability AI? I am saying that to reach some consensus about our disagreement on how the article should be worded, we need to reach consensus on this question of fact. I proposed above what I think could be a simple method to collect some data that would help us resolve this factual disagreement. If you care about resolving this, I think an experiment like the one I proposed (or you can suggest a different procedure if you like) is the right way forward. Colin M (talk) 13:34, 5 April 2023 (UTC)
Hey Colin M, you asked for sources and I listed a long list of both primary and secondary sources, all gathered by searching "Stable Diffusion developers" on Google. Feel free to do it yourself as an experiment. All reliable secondary sources list LMU Munich, Runway, and Stability as developers, including the actual developers themselves. How can we argue against what the developers/inventors say? That's a fact. We need to take a neutral stance with common sense and good editorial judgment. All other contributors in this thread are on the same page with regard to having LMU Munich, Runway, and Stability as developers.
I will, once again, suggest that we properly attribute the development, research, and training of the model to the original authors: LMU Munich and Runway. And we should attribute the compute donation to Stability.
Other contributors, please chime in with feedback if the above sounds reasonable. Juhun87621 (talk) 19:37, 5 April 2023 (UTC)
I think we're sort of talking past each other at this point. I didn't ask for a list of sources that mention Runway/LMU Munich. I asked about the proportion of RS coverage that mentions those organizations vs. Stability AI. My claim is that the majority of secondary RS coverage gives sole or primary credit to Stability AI. I proposed a method to test this hypothesis (and offered that you could also suggest an alternative one), but you haven't really engaged with that line, so this discussion doesn't seem to be going anywhere. Colin M (talk) 23:14, 5 April 2023 (UTC)
The idea that "we could do a Google News search for 'Stable Diffusion', sort by recent, and take the ten first sources ..." is not a very good idea for finding information on the original contributors, as it will have a bias towards whichever of the original contributors is currently or most recently in the news with a new project. At this point, we have multiple reliable sources that name Runway, CompVis, and Stability AI as original contributors and explain their roles. I am not sure what is in dispute at this point. Which of these three original contributors -- Runway, CompVis, and Stability AI -- is anyone disputing? Or is there a dispute over their individual roles? Or how to credit them in the infobox? What is the dispute here? Elspea756 (talk) 23:51, 5 April 2023 (UTC)
> The idea that "we could do a Google News search for 'Stable Diffusion', sort by recent, and take the ten first sources ..." is not a very good idea for finding information on the original contributors, as it will have a bias towards whichever of the original contributors is currently or most recently in the news with a new project
Very much agreed!
> At this point, we have multiple reliable sources that name Runway, CompVis, and Stability AI as original contributors and explain their roles. I am not sure what is in dispute at this point. Which of these three original contributors -- Runway, CompVis, and Stability AI
Exactly! Not sure what else there is to discuss. I propose we separate the contributors based on their contributions to the project: LMU Munich and Runway as developers and Stability as donating compute. We should update the infobox to match that. Juhun87621 (talk) 02:00, 6 April 2023 (UTC)
In the Wikipedia software infobox, "original author" is for the "Name of the original author(s) or publisher(s) of the software," and "developer" is for the "Name of the current developer of the software." See https://en.wikipedia.org/wiki/Template:Infobox_software So, the sources seem to say that "original author(s)" in the infobox would be Runway, CompVis, and Stability AI, and current "developer" would be Stability AI. Any differentiation of the respective roles of the original authors would go in the article itself, not the infobox, as there isn't a parameter for that in the infobox currently. Any proposed changes to the infobox -- such as "separate the contributors based on their contributions to the project" -- would need to be discussed on that template's talk page, with a much larger group of people participating in the discussion, as that infobox is used on about 14,000 pages. My suggestion is to just add "Runway, CompVis, and Stability AI" as "original author(s)" in the infobox, and do any clarifying of their respective roles in the article itself. Does that sound like a good solution here? Elspea756 (talk) 02:27, 6 April 2023 (UTC)
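For reference, a rough sketch of how that suggestion might look in wikitext (the parameter names are taken from the Template:Infobox software documentation linked above; the values are only the ones proposed in this thread and would still need supporting citations in the article body):

{{Infobox software
| name      = Stable Diffusion
| author    = Runway, CompVis (LMU Munich), Stability AI <!-- "Original author(s)" field -->
| developer = Stability AI <!-- "Developer(s)" field, i.e. the current developer -->
}}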
> My suggestion is to just add "Runway, CompVis, and Stability AI" as "original author(s)" in the infobox, and do any clarifying of their respective roles in the article itself. Does that sound like a good solution here?
Sounds like a great solution. Do you want to make the changes? Juhun87621 (talk) 03:59, 6 April 2023 (UTC)
"I am not sure what is in dispute at this point." Yeah, I'm not sure either. Juhun has done this edit, which incorporates several changes that I disagree with for different reasons. I'd be happy to discuss any one of them in more detail. Overall, my feeling is:
  1. The intro and infobox of the article should summarize the key points from the body.
  2. The content in the intro and infobox related to SD's development is a fair summary of the content currently in the "Development" section.
  3. The "Development" section gives what I think is a due weight summary of what secondary sources say about SD's development, but I'm certainly open to expanding it with more detail.
Unless you disagree with me on points 1 or 2, I think it would be premature to change the attribution in the infobox/intro without touching the body. If you think we're getting the attribution of credit wrong, I think we should start by focusing on the prose of the "Development" section. Does anyone have any suggestions for aspects that should be added or revised there? Colin M (talk) 16:13, 7 April 2023 (UTC)
OK, yes, I think I see the dispute now. Neither the "Development" section nor the infobox currently gives due weight to the original collaborators on Stable Diffusion's development, as the "Development" section currently does not mention RunwayML, and we have multiple editors and multiple reliable sources pointing out that RunwayML was one of the original collaborators. So, we should add RunwayML to the "Development" section and list it as one of the "original authors" in the infobox, and everyone here should be pleased with our own collaborative work on improving this article. Elspea756 (talk) 20:53, 8 April 2023 (UTC)
The "Development" section had already listed Patrick Esser and Robin Rombach as leaders of Stable Diffusion's original development. I have now added that Patrick Esser is from Runway and Robin Rombach is from CompVis, as is stated in the source already used in that sentence. I have then also added to the infobox as Original author(s) Runway, CompVis, and Stability AI. This is all supported by multiple sources, including those already used in the article, and this seems to be supported by multiple editors, and now the "Development" section and the infobox agree with each other, so I hope and believe this should resolve the dispute over how to describe the original authors' roles in the development here. Elspea756 (talk) 01:10, 9 April 2023 (UTC)
It seems like much of the confusion above about who originally developed SD results from Stability AI pushing the narrative in a somewhat biased direction. In the meantime, press articles have carefully investigated these issues, clarified them, and set matters straight. Also, all parties involved in the development of SD have since weighed in. So the opening section should not cite a press release from only one company as the source for explaining who developed SD, but rather these independent press sources:
https://sifted.eu/articles/stability-ai-fundraise-leak
https://www.forbes.com/sites/kenrickcai/2023/06/04/stable-diffusion-emad-mostaque-stability-ai-exaggeration/?sh=347a8fcb75c5
Given those sources and the comments by Stability, the opening statement "It [SD] was developed by the start-up Stability AI in collaboration with a number of academic researchers and non-profit organizations" is factually wrong. The articles clearly outline that this was part of exaggerated claims by Stability AI prior to raising their seed funding round. Neither Stability AI nor any of the other parties involved stick to this narrative anymore: even Stability's CEO confirms "Stable Diffusion came from the Machine Vision & Learning research group (CompVis) @LMU_Muenchen", https://twitter.com/EMostaque/status/1587844074064822274?lang=en
Suggestions:
- rather than just citing the press release of one of the parties involved in SD in the opening section, we must list one of the recent independent investigations that clarify the contributions to the development of SD, to avoid bias. Stability as well as all other parties were interviewed in those articles, so things are on level ground. That way we can avoid the impression of gate-keeping.
- all investigations and all parties involved with SD suggest the opening paragraph needs to be corrected to something like "It was (originally) developed by researchers from the CompVis Group at Ludwig Maximilian University of Munich and Runway with a compute donation by Stability AI and training data from non-profit organizations". The articles clearly point out that Stability AI only joined later, when all the coding was already done! Listing them as the original developer is simply wrong. 207.102.170.162 (talk) 14:45, 21 June 2023 (UTC)
If you think you've found an error in the article, you can just correct it. If someone disagrees, they will change it back. Elspea756 (talk) 15:17, 21 June 2023 (UTC)

Current consensus is that sexist image spam is unnecessary for this encyclopedia article

Current consensus is that sexist image spam is unnecessary for this encyclopedia article. This concept has been stated many times by multiple editors. The latest example of sexist image spam also seems to be making some sort of statement about religion and also seems potentially racially problematic. Per Wikipedia policy, the onus to achieve consensus for inclusion is on those seeking to include disputed content. Once again that has not happened here, so I will once again be removing the latest iteration of sexist image spam from this article. The most charitable reading of this user's actions would be that we are now at the "discuss" portion of "Bold, Revert, Discuss." A lack of discussion is not a sign of consensus; it can also be a sign that an editor is wasting our time by constantly spamming the same sorts of problematic images into this article. Elspea756 (talk) 14:03, 22 December 2022 (UTC)

Grow up. You have multiple IP editors telling you that your images are shit. You have multiple editors expressing that they wish the article be properly illustrated with an inpainting demonstration. You have not addressed the changes to WP:SYSTEMIC, the primary concern raised by editors that you really love to quote over and over again. All you are capable of is pounding the table and yelling. You have zero interest in collaboratively and constructively building an encyclopedia. Just grow up. --benlisquareTCE 14:07, 22 December 2022 (UTC)
You're one post away from an indefinite block for personal attacks. --jpgordon𝄢𝄆𝄐𝄇 18:32, 22 December 2022 (UTC)
I would recommend not mistaking your preferences for consensus. Smeagol 17 (talk) 19:26, 22 December 2022 (UTC)

Here is a reminder that there is a long-standing consensus that sexist image spam is unnecessary for this encyclopedia article. Thank you. Elspea756 (talk) 02:31, 3 August 2023 (UTC)

To re-affirm consensus per Elspea756. Ceoil (talk) 05:13, 10 August 2023 (UTC)
I agree that this should stay out of the article. Edit warring to try to put it back in is unwise. MrOllie (talk) 13:34, 17 August 2023 (UTC)
Whether edit warring about this is wise or not, it is obvious from reading previous discussions here that there is no such consensus. Smeagol 17 (talk) 13:39, 17 August 2023 (UTC)