User talk:DatGuy/Archives/2021/May

This is an archive of past discussions with User:DatGuy. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

DatBot AIV reports

Hi. I have been meaning to raise an issue about DatBot reports at AIV. It seems to me that the bot is creating a lot of reports that are not actionable by admins. Here is a snapshot of the page as it currently stands, with a lot of reports that have been reviewed by User:Daniel Case. I wonder if you could have a look and see if anything could be done to reduce the number of non-actionable reports, as it currently makes reviewing reports difficult. Cheers, TigerShark (talk) 17:55, 29 April 2021 (UTC)

Not that I'm fully qualified to speak to this issue, but ... first, the bot is no more perfect than any other bot and second, that's basically 14 hours worth of reports that no one else has reviewed, so it looks worse than it really is, IMO. Daniel Case (talk) 18:00, 29 April 2021 (UTC)

Yeah, that's a fair point, but I wonder how useful it is to have bots reporting to AIV when the user has not even been warned. I don't know how difficult it would be to script the bot to check for a history of warnings, but it would certainly be useful. The point of AIV is, IMHO, really to list accounts that have reached the point of needing admin attention. The obvious exceptions might be long term abuse and/or IP hopping. TigerShark (talk) 18:10, 29 April 2021 (UTC)

Remember, the filter is giving out its own "warnings", even if they don't show up at the user's talk page. It's not like the user has no clue that they're doing anything wrong. IMO, if they've seen MediaWiki:Abusefilter-disallowed five times and are still clicking "publish", it's not inappropriate to block, though obviously the seriousness of the vandalism, and the staleness of the report, must be taken into account. I don't think that requiring a talk page notice first would be a good idea. If the filter is stopping all edits, it's possible that no human has noticed, so the talk page will be a redlink. Suffusion of Yellow (talk) 18:15, 29 April 2021 (UTC)

(ec) @TigerShark: DatBot's reports are a reflection of the activities of the EFM community. Either the filter hits are false positives, or the filter shouldn't be at User:DatBot/filters. There's no need to bring it up with the operator; the bot is doing exactly what we are telling it to do. That said, I might take a look at User:DatBot/filters and prune a few of the spammier ones. Do you have any in mind? Because, in the very first example linked above, Daniel Case would appear to be mistaken. The edits sure look like vandalism to me, albeit not the specific LTA the filter is looking for. Suffusion of Yellow (talk) 18:10, 29 April 2021 (UTC)

I had, in that instance, just looked at the edits, which seemed to me like they might have been well-intended, but not the filter log, which does make things look a little different. Daniel Case (talk) 18:15, 29 April 2021 (UTC)

Yeah, that was a weird one. Looks like that user might be trying to do the right thing, but still prone to immature impulses. Not saying that they should have been blocked, just pointing out that neither the filter nor DatBot was "wrong" here. Suffusion of Yellow (talk) 18:51, 29 April 2021 (UTC)

Just leaving a note that I do see this discussion, but believe Suffusion of Yellow has put it succinctly. Dat Guy^Talk_Contribs 18:28, 29 April 2021 (UTC)

Here's an idea, if you have the free time and inclination:

Let us assign a "score" and a "time decay" to each filter, instead of choosing between just "one edit" and "five edits in five minutes". User:DatBot/filters could be laid out like this:

#filter score time
42 20 5
123 100 1
321 10 15
1000 50 30

The first column is the filter ID, of course. Any time a user trips a filter, their accumulated points are increased by the score column. After time minutes, the total is decreased by the same amount. Any time the accumulated points reach 100, DatBot reports. Note that in the example above, filter 42 is set to the equivalent of "vandalism", and filter 123 is set to the equivalent of "immediate". The others aren't possible with the current system, however. That would let us set some of the borderline vandalism filters to, say, "10 edits in 15 minutes", and some of the borderline LTA filters to "2 edits in 30 minutes", or whatever works out best. That might cut down on the spam.

There's the problem of what to do when they hit multiple filters at once. Adding the scores together would result in even more spurious reports. So I'd say just go with the highest scoring filter, and ignore all the others. Suffusion of Yellow (talk) 18:48, 29 April 2021 (UTC)
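For illustration only, the score-and-decay proposal above could be sketched roughly as follows. This is not DatBot's code: the names `parse_filters`, `ScoreTracker` and `record_hit` are hypothetical, time is counted in whole minutes, and the sketch assumes a report fires once the total reaches 100 (so that "100 1" behaves as "immediate" and "20 5" as "five edits in five minutes"). When one edit trips several filters, only the highest-scoring one counts, as suggested above.

```python
from collections import defaultdict

REPORT_THRESHOLD = 100  # assumption: report once accumulated points reach 100

EXAMPLE_CONFIG = """\
#filter score time
42 20 5
123 100 1
321 10 15
1000 50 30
"""


def parse_filters(text):
    """Parse 'filter score time' lines into {filter_id: (score, minutes)}."""
    filters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment/header line
        filter_id, score, minutes = (int(field) for field in line.split())
        filters[filter_id] = (score, minutes)
    return filters


class ScoreTracker:
    """Hypothetical per-user point tracker with per-filter time decay."""

    def __init__(self, filters, threshold=REPORT_THRESHOLD):
        self.filters = filters
        self.threshold = threshold
        # user -> list of (expiry_minute, score) for points still counted
        self.entries = defaultdict(list)

    def record_hit(self, user, tripped_filter_ids, now):
        """Record one edit attempt at minute `now`; return True if reportable."""
        # Points added by an earlier hit stop counting once their time is up.
        live = [(expiry, score) for expiry, score in self.entries[user]
                if expiry > now]
        known = [self.filters[f] for f in tripped_filter_ids if f in self.filters]
        if known:
            # Several filters tripped by one edit: count only the highest score.
            score, minutes = max(known, key=lambda pair: pair[0])
            live.append((now + minutes, score))
        self.entries[user] = live
        return sum(score for _, score in live) >= self.threshold


tracker = ScoreTracker(parse_filters(EXAMPLE_CONFIG))
should_report = tracker.record_hit("ExampleUser", [42, 321], now=0)
```

What happens after a report (for example, resetting the user's points so the same user is not reported twice) is left open here, since the proposal does not specify it.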

@Suffusion of Yellow: issue is, would anyone adding a filter (aside from you I suppose) know/go through the effort of adding information for any new filters? Or will they consider it too complex and give up? Dat Guy^Talk_Contribs 19:16, 29 April 2021 (UTC)

I don't know. It doesn't seem any more difficult than setting the throttle parameters on the filters themselves. There can be a few examples, in the comments, i.e. "if unsure, set vandalism filters to '20 5' and set LTA filters to '100 1'". Can you think of a more user-friendly way to give finer control?

Another (perhaps simpler) idea: What if DatBot removes its own reports if they haven't been acted on, in, say, 8 hours? That's enough time for someone to look at AIV, IMO. Suffusion of Yellow (talk) 19:31, 29 April 2021 (UTC)
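As a rough sketch of the "remove its own reports after 8 hours" idea, and nothing more: the `Report` record and `prune_stale` function below are hypothetical, and the sketch assumes that any report still listed has not been acted on (blocked users' reports having already been removed from the page).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=8)  # the "say, 8 hours" figure from the comment above


@dataclass
class Report:
    user: str          # the reported account or IP
    reporter: str      # who filed the report
    filed_at: datetime


def prune_stale(reports, now, bot_name="DatBot", max_age=MAX_AGE):
    """Keep every report except the bot's own reports older than max_age."""
    return [
        report for report in reports
        if report.reporter != bot_name or now - report.filed_at < max_age
    ]


now = datetime(2021, 4, 30, 12, 0, tzinfo=timezone.utc)
board = [
    Report("192.0.2.1", "DatBot", now - timedelta(hours=9)),
    Report("192.0.2.2", "DatBot", now - timedelta(hours=2)),
    Report("ExampleVandal", "ExampleHuman", now - timedelta(hours=20)),
]
remaining = prune_stale(board, now)
```

Reports filed by humans are deliberately left alone in this sketch, since the suggestion only covers the bot's own reports.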

@Suffusion of Yellow: shouldn't be too hard to implement, but it feels to me that it wouldn't be as spammy if HBC AIV helperbot5 would remove reports that have been responded to but not blocked after a certain amount of hours. In the diff linked above, the amount of actioned DatBot reports nearly doubles the amount of unactioned reports. Dat Guy^Talk_Contribs 14:23, 30 April 2021 (UTC)

Frankly, it seems less effort for the admin to just remove the report, instead of responding. Particularly with stale-ish reports. It's not like the bot is going to understand the message. Suffusion of Yellow (talk) 19:15, 30 April 2021 (UTC)

And, as Daniel Case has pointed out, the filters aren't really all that well documented. (I try to leave meaningful notes, at least...) So the best response is sometimes to say "I don't know which LTA that's supposed to be, but maybe someone else does". If no one can recognize the LTA in a few hours, it's likely no one will at all. Suffusion of Yellow (talk) 19:22, 30 April 2021 (UTC)

Or maybe we could create another bot to read all the filters and properly ID it. Daniel Case (talk) 19:53, 30 April 2021 (UTC)

Thanks for all the thoughts above. I appreciate that the bot is a reflection of the filter logic, and that most of what is reported is actual vandalism. My concern is that AIV is not a place to report all vandalism edits; it is a place to report accounts that need blocking. It is expected that, in the majority of cases, accounts will have been sufficiently warned before they are blocked (see Wikipedia:Vandalism#Warnings). To block without verifying that is to go against policy, except in egregious cases. In that sense there is more to the logic than simply whether the edit is vandalism. A report would ideally not be placed at AIV unless the user has been sufficiently warned, and then continues to vandalise. Because of the warning requirement of policy, most of the reports will not be actionable if the warning hasn't been given. For the snapshot I linked, the vast majority of reports Daniel flagged were insufficiently warned. It is true that inappropriate reports, both bot and user generated, will eventually be reviewed and removed, but there is a time cost to that, and a distraction cost. It seems at the moment that DatBot will report even if the user's talk page doesn't exist, and doesn't attempt to warn the user itself. Now I don't underestimate the possible complexity of trying to add the "has the user been warned?" logic, but I think it would be worth considering. As far as I can see there is some warning logic used by ClueBot_NG so I'll tag the operators in to see if they have any thoughts. User:Cobi User:Rich Smith User:DamianZaremba TigerShark (talk) 17:27, 1 May 2021 (UTC)

The purpose of warning a user who has vandalized is to inform the user that the user's conduct is abusive and prohibited, and seek the user's compliance. The users (e.g. Special:AbuseLog/29790413) are being warned, it just isn't on their talk page. The wording is nearly identical to standard user warning templates. Dat Guy^Talk_Contribs 17:44, 1 May 2021 (UTC)

Warning: An automated filter has identified this edit as potentially unconstructive.

It is almost never appropriate to add emoji and other Unicode icons (e.g. ♥, ☺, ☢, ☮) to Wikipedia articles. This is often an indicator of vandalism. If your edit is vandalism, you may be blocked from editing Wikipedia.

New and anonymous editors are prevented from adding such icons to Wikipedia articles. If this edit is constructive, you may report this erroneous warning or request that the edit be made at this article's talk page.

Report error

@TigerShark: It's not a question of technical implementation; I'm sure that's possible. It's a question of whether it's a good idea. First, who is going to give out the warnings? If we require recent changes patrollers to go through the log and hand out warnings based on edits that didn't save, then we've just defeated half the purpose of the filter: lessening the load on recent changes patrollers. And if a bot hands out the warnings, well, how is it supposed to know which filter hits are false positives? Second, the warning is telling them something they already know; namely that their edit tripped a filter! Third, warning LTAs is counter-productive; it just gives them the attention that they crave. There's no "W" in WP:RBI for a reason.

I'll also point out that your snapshot is misleading in another way. The "good" reports tend to be blocked (and thus removed) almost instantly, while the "bad" reports accumulate over time, until someone clears the board. So unless the bot is 100% perfect, eventually the board will be filled with lots of unactionable reports. It only means no one has gotten around to clearing it.

I really think it's just a matter of User:DatBot/filters getting crufty, that's all. I've made one change that will help a little bit, but I'll look for more to do. Suffusion of Yellow (talk) 19:10, 1 May 2021 (UTC)

@TigerShark: On second thought, maybe there's not a problem at all. See the second table here (permalink). 70% of users reported by DatBot are blocked, in the end. That doesn't seem all that bad, though I still might trim a few filters. Suffusion of Yellow (talk) 00:28, 2 May 2021 (UTC)

File reducing GIF files

Hello!

Would Wikipedia:Bots/Requests for approval/RonBot 2 be something that you might consider implementing into DatBot? Since it has already been run before, I figured it would need limited coding from you as operator, which of course is convenient. Jonteemil (talk) 00:21, 4 May 2021 (UTC)

I don't see why not. I'll cook something up and file a BRFA. Dat Guy^Talk_Contribs 12:49, 4 May 2021 (UTC)

Oh, awesome! Jonteemil (talk) 12:59, 5 May 2021 (UTC)

@JJMC89: what happened here? I was about to suggest doing that regardless but what were the actual ramifications of that move compared to when the BRFA was filed? Dat Guy^Talk_Contribs 20:23, 6 May 2021 (UTC)

Ronhjones is no longer with us, so that BRFA is void I guess? — Alexis Jazz (talk or ping me) 16:41, 9 May 2021 (UTC)

It was just a mostly duplicate template, so I boldly redirected it. — JJMC89 (T·C) 04:01, 11 May 2021 (UTC)

File reducing WebP files

Starting this under a new header so it will be easier to follow. The WebP file format seems to be very transparent, in that it is open source and Google really seems eager to tell you everything about it. This makes me think that implementing file reductions for WebP files perhaps wouldn't be so hard for DatBot. See for example https://developers.google.com/speed/webp/ and perhaps also https://developers.google.com/speed/webp/docs/api, which seems to give the information needed for file reductions. Special:Search/file: local: incategory:"Wikipedia non-free file size reduction requests for manual processing" filemime:image/webp currently has 76 hits, a few more than the 31 GIF files needing reduction. If you could take a look at it, that would be great :). Jonteemil (talk) 23:36, 6 May 2021 (UTC)

Everything here should work if it's standardised in the same format as the current template. Dat Guy^Talk_Contribs 01:51, 8 May 2021 (UTC)

Today DatBot processed only one non-free file. Will it process everything in the next run? — Alexis Jazz (talk or ping me) 17:14, 9 May 2021 (UTC)

@Alexis Jazz: I guess it did... Jonteemil (talk) 02:45, 11 May 2021 (UTC)

Category:Wikipedia non-free svg file size reduction requests

Hey. Do you need Category:Wikipedia non-free svg file size reduction requests for your bot, or could it be merged with Category:Wikipedia non-free file size reduction requests? — JJMC89 (T·C) 04:04, 11 May 2021 (UTC)

The process for upscaling SVGs, downscaling SVGs, and resizing other images is fairly different, so I do believe it's preferable to keep them separate. Dat Guy^Talk_Contribs 05:23, 11 May 2021 (UTC)

SVGcleaner issues

DatBot creating W3C errors

Sorry for bothering you so much lately, but I thought you should know this. As can be seen here, your bot created two W3C errors while upscaling File:UEFA Champions League logo 2.svg. I thought I should report the bug(?) to you. Jonteemil (talk) 11:57, 7 May 2021 (UTC)

Seems like it's an issue brought on by svgcleaner. The correctness section says it's very rare, but if you have many examples such as the above I could possibly disable it altogether, although it would mess with the file sizes. Dat Guy^Talk_Contribs 01:51, 8 May 2021 (UTC)

On second thought, I checked through a few edits and it rather seemed to fix W3C errors, so I'll keep it as is. Dat Guy^Talk_Contribs 02:40, 8 May 2021 (UTC)

Hi!

I never wanted to disable SVGCleaner; it mostly does an awesome job at making the file sizes smaller. I guess I'll report the bug at its page on GitHub then, thanks! :) Jonteemil (talk) 12:02, 9 May 2021 (UTC)

DatBot bug at File:W-League logo.svg

Hello!

Sorry for bothering you even more, but I think I spotted another bug(?) at File:W-League logo.svg where DatBot removed the shading effects when upscaling the logo. Thought you should know. Jonteemil (talk) 14:19, 8 May 2021 (UTC)

Ditto at File:ERC Ingolstadt Logo.svg, but another kind of bug. Jonteemil (talk) 02:16, 11 May 2021 (UTC)

Seems to be the same as above. Modifying only the attributes DatBot changes results in intended behaviour. Reduced manually. Dat Guy^Talk_Contribs 05:23, 11 May 2021 (UTC)

Thanks for fixing the Ingolstadt logo. The W-League issue remains, and File:DFB-Pokal logo 2016.svg was also wrongly upsized. Jonteemil (talk) 08:09, 11 May 2021 (UTC)

Ping :) Jonteemil (talk) 16:31, 16 May 2021 (UTC)

File:Confederation of Shipbuilding and Engineering Unions logo.png

Hi, I appreciate the aim of downsizing the logo, but the bot keeps reverting it to an earlier version; can it please downsize the current logo? Warofdreams talk 01:33, 16 May 2021 (UTC)

I don't see the issue? It's modifying the logo from 10 May. Dat Guy^Talk_Contribs 08:25, 16 May 2021 (UTC)

@Warofdreams: Maybe it has to do with the cache? See WP:Bypass your cache. Jonteemil (talk) 16:23, 16 May 2021 (UTC)

Looks like that is the issue, sorry! Warofdreams talk 21:16, 16 May 2021 (UTC)