Wikipedia:Edit filter/Requested/Archive 8



Category:<nationality> inventions (2)

Warning on double-extension file names

In the past 3 weeks or so, I've been looking at the latest new uploads and correcting double-extension suffixes as needed, per WP:FNC#5. This could be tracked and disallowed simply with an edit filter.

Suggested filter: Warn/disallow when a user uploads a file name with a double extension (like File:Westworld-S01-E02-Chestnut.jpg.png). These instances are almost certainly unintentional.

Generic filter spec
action == "upload" &
(
  article_prefixedtext irlike "\.(png|gif|jpg|jpeg|tiff|tif|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm)\.(png|gif|jpg|jpeg|tiff|tif|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm)"
)

The list of extensions comes from the list at Special:Upload. I've tested this at the testwiki (please don't import) preliminarily with correct results. I believe the generic abuse filter warning message is misleading, however, and possibly needs its own MW message, something along the lines of "It appears you are attempting to upload a new file with two extensions. Consider correcting the double-suffix and try again." — Andy W. (talk ·ctb) 14:57, 11 October 2016 (UTC)

Actually... a blacklist entry could work as well... File:.*\.(png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm)\.(png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm) if we'd like to restrict it without edit filter flexibility — Andy W. (talk ·ctb) 15:13, 11 October 2016 (UTC)
Either way - testing at Special:AbuseFilter/1 to see what we get -- samtar talk or stalk 15:21, 11 October 2016 (UTC)
Actually, I'm wondering if folks think this task is actually worthy of a filter (surely not getting a ton of hits), or rather a bot task, or just manual correction? I don't mind further correction for now, and may lean toward a blacklist entry if it helps editing performance. File names are not exactly front-facing. Just a thought (as initial requester here). — Andy W. (talk ·ctb) 01:53, 12 October 2016 (UTC)
I would go for title blacklist, since we're only acting on titles and we can still throw a friendly customized message. A bot task to autocorrect the titles doesn't sound bad either MusikAnimal talk 21:58, 12 October 2016 (UTC)
@Andy M. Wang and MusikAnimal: Disabling Special:AbuseFilter/1, it got a couple of hits but I agree this could be better served in the TBL - proves the regex works nicely though -- samtar talk or stalk 07:12, 13 October 2016 (UTC)
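
For readers who want to try the pattern outside the wiki, here is a minimal Python sketch of the double-extension check discussed above. It is not the filter or blacklist entry itself: the function name is hypothetical, and anchoring the match at the end of the title is an assumption added here (the posted pattern is unanchored).

```python
import re

# Extensions taken from the filter spec in this thread, using the
# compact jpe?g / tiff? spelling from the blacklist variant.
EXTENSIONS = (
    "png|gif|jpe?g|tiff?|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm"
)

# Two recognised extensions back to back. The trailing "$" is an
# assumption made for this sketch; the posted pattern has no anchor.
DOUBLE_EXTENSION = re.compile(
    rf"\.({EXTENSIONS})\.({EXTENSIONS})$", re.IGNORECASE
)


def has_double_extension(title: str) -> bool:
    """Return True if a file title ends in two recognised extensions."""
    return DOUBLE_EXTENSION.search(title) is not None
```

With this sketch, the example title from the request is flagged, while an ordinary title or a dotted version number such as "File:v1.2.png" is not.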

(where the ... heard)

I am trying to figure out how to stop the following trivial insertions which have been going on for more than a year, coming from a wide range of New York IPs. (See ANI thread Filter? Rangeblock? Synthesizer patch trivia LTA case from New York, at which an edit filter was suggested.) Examples:

Examples
  • [1] (where the Asian-European Yamaha DX7 internal factory patches "Bass 1", "Vibe", "Tub Bells" and "Piano 2" were heard)
  • [2] (where the Roland D-50 PN-D50-04 patch "Octave Synth Bass" and Roland TR-808 claps and low pitched Roland TR-808 cowbells were heard)
  • [3] (where the real flutes were heard)
  • [4] (where the Roland D-50 internal factory patch "Soundtrack" and PN-D50-01 patch "Pulse Pad" were heard)
  • [5] (where the synth brass and Yamaha DX7 internal factory patches "Harmonica 1" and "Electric Bass 1" were heard)
  • [6] (where the poly synth or Minimoog patch "Square and "Saw" and synth strings in 5ths were heard)
  • [7] (where the flute and steel drums were heard)
  • [8] (where the drums weren't featured or heard)
  • [9] (where the synth strings were heard)
  • [10] (that was released in 1980 and where the synth belly chiming piano, synth belly chimes or piano was/were heard)
  • [11] (where the synth FX and synth were heard)
  • [12] (where the synth snare drums were heard)
  • [13] (where the synth strings were heard)
  • [14] (where the Roland D-50 internal factory patch "Soundtrack" and PN-D50-01 patch "Pulse Pad" were heard)
  • [15] (where the Roland D-50 custom patch "Synth Xylophone" and synth whitles were heard)
  • [16] (where the synth strings and 12th Yamaha DX7 internal factory patch "Electic Bass 1" were heard)
  • [17] (where the drums mixed with Roland Alpha Juno and Roland Alpha Juno patch "Jet" were heard)
  • [18] (where the Roland Alpha Juno patch "Bell-Chimes 1" was heard)
  • [19] (where the Roland JX-3P patch "Brass I" and Yamaha DX7 custom patch "Synth Brass" were heard)
  • [20] (where the Roland JX-3P patches "Brass I" and "Brass II" and 5th Yamaha DX7 internal factory patch "Brass Horns (Brasshorns)" were heard)
  • [21] (where the 17th Yamaha DX7 internal factory patch "Sax BC (Sax/Saxophone/Saxophone BC)" was heard)
  • [22] (where the synth voices, synth vocals and guitars mixed with each other got heard)
  • [23] (where the piano mixed with the 26th Asian-European Yamaha DX7 internal factory patch "Tub Bells", and Roland Alpha Juno patch "Jet", and drums mixed with the Roland Alpha Juno patch "Jet" got heard)
  • [24] (where the synth strings and Hammond B3 organ with the vibrato were heard)
  • [25] (where the Roland Alpha Juno patch "Bell Chimes 1" was heard)
  • [26] (where the flute was heard)
  • [27] (where the claps/synth claps were heard)
  • [28] (that Madonna backup sang) (where the Roland Alpha Juno patch "Bell-Chimes 1" and saxophones were heard)
  • [29] (where the synth bells, chimes and flutes were heard)
  • [30] (where the Roland D-50 internal patches "Living Calliope", "Arco Strings", "Horn Section", and "Digital Native Dance", Sequential Circuits Prophet-5 patches "Sync I" and "Sync II" and Roland D-50 internal factory patches "Fantasia" and "Shamus Theme" were heard)
  • [31] (where the brass and octave brass were heard)
  • [32] (where the 41st Roland D-50 internal patch "Staccato Heaven" and 1st Roland D-50 internal patch "Fantasia" were heard)

So almost all of these clumsy insertions start with "(where the...", using open parentheses followed by those two words. All of them end with "...heard)", using close parentheses. Perhaps we can have a filter based on that fact. Binksternet (talk) 16:49, 4 October 2016 (UTC)

Also, they all end with "was heard)" or "were heard)" so maybe the filter can include a parallel look at was/were. Binksternet (talk) 16:51, 4 October 2016 (UTC)
I've run across a number that end in "is heard".[33] FWIW, there are a number of "background sang" edits by the same editor, along with GWAR and similar problems.
In the ANI thread, you mentioned hoping to catch them long enough to try to reason with them. I don't think this is a matter of unsourced trivia. I think it's more of a "special interest" and not likely to be amenable to reason. If a filter is possible with minimal false positives I'd...write an epic poem singing the praises of its creator or something. - SummerPhDv2.0 02:09, 5 October 2016 (UTC)
True about "is heard", and we have at least one "got heard". So the filter should look for just "heard)". Another thing the filter could look for is that the edit comes from an IP address. This person hasn't ever registered a username, as far as I can tell. Binksternet (talk) 15:37, 5 October 2016 (UTC)
Trying something at Special:AbuseFilter/637, will update you with my findings MusikAnimal talk 19:08, 5 October 2016 (UTC)
We had a visitor today who should have tripped your filter a handful of times. See the contributions of 2600:1017:B42C:F2BD:39DF:B887:9F4C:57B3.
Interested to know what happened. Binksternet (talk) 22:51, 8 October 2016 (UTC)
Courtesy ping to MusikAnimal. Binksternet (talk) 23:13, 8 October 2016 (UTC)
@Binksternet: 637 logged the edits, and there haven't been any false positives (yet..). I think SummerPhDv2.0 better start drafting that poem... -- samtar talk or stalk 07:55, 9 October 2016 (UTC)
Excellent! Can we then throw the switch, changing the filter from logging to disallowing? Binksternet (talk) 14:59, 9 October 2016 (UTC)
@Binksternet:   Done with Special:AbuseFilter/797. Will continue to monitor and tweak as needed. Do let me or Samtar or someone know if you see anything getting through MusikAnimal talk 22:00, 9 October 2016 (UTC)

Samtar and MusikAnimal: Were the edits by 2600:1001:B106:CD64:80D6:9EDD:2B2:4974 before or after the filter was on? (Incidentally: Thanks all for the work on this. I do appreciate it. So much so that I will opt for the "or something" portion of my pledge; my poetry sucks.) - SummerPhDv2.0 02:17, 10 October 2016 (UTC)

That was before. And I'd love to hear your poem but I was mostly kidding about holding you to it :) Might I suggest a haiku? They are my favourite :D MusikAnimal talk 02:44, 10 October 2016 (UTC)
I'll let you know if any others pop up. As for the poem, I know no one would hold me to it. They'd regret it if they did. Thanks again. - SummerPhDv2.0 13:08, 10 October 2016 (UTC)
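
The pattern described in this thread (an opening "(where the", a closing "heard)", and an IP editor) can be sketched in Python as below. The actual conditions of filters 637 and 797 are not given on this page, so this is only an illustration of the idea; the function name and the 400-character cap are assumptions.

```python
import re

# "(where the" ... "heard)" on a single line. Matching on "heard)" alone
# covers "was heard", "were heard", "is heard" and "got heard".
# The lazy, length-capped middle allows nested parentheses such as
# "Brass Horns (Brasshorns)"; the cap of 400 is an arbitrary assumption.
WHERE_HEARD = re.compile(r"\(where the .{0,400}?\bheard\)", re.IGNORECASE)


def looks_like_patch_trivia(added_text: str, is_ip_editor: bool) -> bool:
    """Return True for an IP edit whose added text matches the pattern."""
    return is_ip_editor and WHERE_HEARD.search(added_text) is not None
```

Insertions that begin differently, such as the "(that was released in 1980 and where the ..." example, would not be caught by this sketch.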

Vashikaran

Edit filter for the word "Junior5a"

@BethNaught and Samtar:, not sure what the best way forward is, but the edit filter logs the information about what was stopped. There were multiple entries that had to be suppressed because of the personal information that was caught by the filter. I think it is better to have the information stopped before it gets to the mainspace, but the information was available to all filter managers for a while. In this case would a title blacklist entry be better so we don't get the logs or should we just monitor the logs to watch for additional information that needs to be suppressed? -- GB fan 17:55, 13 September 2016 (UTC)
@GB fan: You're quite right - can you add .*Junior5a.* to the TBL? I'll disable the filter once that's done -- samtar talk or stalk 18:06, 13 September 2016 (UTC)
@Samtar:, I added that, hopefully I did it right as this is the first one I have ever done. -- GB fan 18:16, 13 September 2016 (UTC)
  Done, added to TBL - @GB fan: perfectly done - I've disabled the filter, and these page creations should now be stopped and not even enter the logs -- samtar talk or stalk 18:20, 13 September 2016 (UTC)
Do NOT archive this; the same vandal has turned to extensively using the word "ZUUZUZ" in vandalism, presumably to harass Zzuuzz. 96.237.16.97 (talk) 20:19, 13 September 2016 (UTC)
They will change the name Jr5a is unfunny, Jr5a is pointless, Jr5a in failure. Marvellous Spider-Man 01:36, 14 September 2016 (UTC)

IP et al, has this calmed down a little? The TBL entry will have prevented a great deal of the vandalism, but as mentioned above it is very easy to bypass as it is public. A filter may be needed in the future should disruption continue -- samtar talk or stalk 16:45, 15 September 2016 (UTC)

Reference desks

Can something please be done to stop the disruption at the reference desks? Wikipedia:Reference desk I just re-protected them all for a day after vandalism started up again when the six-hour semi expired. Protects are easy, doing the damn revdels is time consuming. CU says the rangeblocks are not possible. --NeilN talk to me 00:11, 14 September 2016 (UTC)

@NeilN: Special:AbuseFilter/795 is disallowing a couple, keep us posted as to any alterations. It'll have to be one of the admin-EFMs as the content is getting revdel'd -- samtar talk or stalk 16:41, 15 September 2016 (UTC)
samtar, I will email you the latest attacks. I've been looking at the edit filter docs and unless I'm missing something, the functionality is completely inadequate for filtering based on an editor's editing history. --NeilN talk to me 16:52, 15 September 2016 (UTC)
Send it to me. Dragons flight (talk) 17:00, 15 September 2016 (UTC)
  Done by Dragons flight -- samtar talk or stalk 15:29, 21 September 2016 (UTC)

Supreme Genghis Khan Filter

  • Task: Tag all new users that have the words "Supreme", "Genghis", or "Khan" in their usernames. Also look for the words "Supreme Genghis Khan", "Supreme Genghis", or "Supreme Khan" in articles.
  • Reason: Supreme Genghis Khan, while not as active as he was a month ago, is still active. This filter will stop him from vandalising as easily.

- ThePlatypusofDoom (talk) 02:03, 18 August 2016 (UTC)

  Done Between Special:AbuseFilter/58 and Special:AbuseFilter/579. Tags are not being added, and I'm not even sure if it's possible to do this for account creation. As it stands now only sysops will be able to view the logs, but from what I can tell they are being patrolled MusikAnimal talk 21:16, 21 August 2016 (UTC)
@MusikAnimal: Couldn't you use Special:log/newusers to look for new users to tag? ThePlatypusofDoom (talk) 23:10, 22 August 2016 (UTC)
@ThePlatypusofDoom: Nice, from Wikipedia:Tags it was evident this could be applied to log entries, but for some reason I wasn't able to figure that much out... Anyway, filter 579 is a shared filter that has been running for some time, so we should consult other filter authors before tagging hits. Pinging DoRD and Elockid MusikAnimal talk 00:37, 23 August 2016 (UTC)
I was just thinking about this filter earlier today because I'm seeing a lot of false positives on "Khan". As it is, the majority of the recent accounts that might look like SGK (and/or a number of other sockmasters) are actually someone else, and keeping track of who is who is becoming a minor annoyance. Anyway, I wouldn't be in favor of tagging any accounts just for a match on Khan. ​—DoRD (talk)​ 01:40, 23 August 2016 (UTC)
What about just tagging if there are 2 of the 3 words? (It has to have 2 of the letter strings, not just Supreme, Genghis, or Khan) ThePlatypusofDoom (talk) 13:34, 23 August 2016 (UTC)
Seeing the account that just vandalized this page, I'm not sure if that will work. RickinBaltimore (talk) 20:26, 23 August 2016 (UTC)
What about the word "Genghis"? It's also possible when SGK makes sock accounts, as in this example: "InterstateGenghis55". KGirlTrucker81 talk what I'm been doing 20:47, 23 August 2016 (UTC)
We should probably discuss this somewhere that SGK can't see. ThePlatypusofDoom (talk) 22:24, 26 August 2016 (UTC)

Yeah, try sending an email to wikipedia-en-editfilters lists.wikimedia.org, that way you can discuss things in private. Omni Flames (talk) 22:35, 26 August 2016 (UTC)

@Omni Flames: I'll try to message people on IRC. ThePlatypusofDoom (talk) 11:48, 29 August 2016 (UTC)

A possible solution is being discussed here -- samtar talk or stalk 19:12, 29 August 2016 (UTC)
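
The "2 of the 3 words" suggestion above can be illustrated with a short Python sketch. It is not how filters 58 or 579 work (their conditions are not shown here); the function names and the default threshold are hypothetical.

```python
# The three letter strings named in the request.
SGK_WORDS = ("supreme", "genghis", "khan")


def count_sgk_words(username: str) -> int:
    """Count how many of the three strings appear in the username."""
    lowered = username.lower()
    return sum(word in lowered for word in SGK_WORDS)


def should_tag_username(username: str, minimum: int = 2) -> bool:
    """Return True when at least `minimum` of the strings are present."""
    return count_sgk_words(username) >= minimum
```

A username containing only "Khan" is no longer matched under this rule, which addresses the false-positive concern, but a name like "InterstateGenghis55" is missed as well, which is the objection raised in the replies.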

Tikeem Cumberbatch Filter

- Linguist 111 Moi? Moi. 22:25, 4 September 2016 (UTC)

  Done Added to Special:AbuseFilter/579 -- samtar talk or stalk 10:02, 5 September 2016 (UTC)

The Decoded Sexes filter

  • Task: The filter should be case-insensitive and stop any iteration of this article name, with or without the "the", from being created.
  • Reason: User:Awais Azad is a paid editing sockfarm who has been recreating this page repeatedly. It's now up to 72 sockpuppets. Every time the article gets deleted, he apparently misspells the title to get it into namespace. It's not a good use of time to chase the thousands of possible spellings and RFPP them constantly.

- MSJapan (talk) 03:40, 3 September 2016 (UTC)

@MSJapan: This would be better served by using the title blacklist. An admin will probably see the request here, but you can also drop by WP:AN to request -- samtar talk or stalk 07:10, 3 September 2016 (UTC)
I've made a request on your behalf -- samtar talk or stalk 10:36, 3 September 2016 (UTC)
  Done and added to the Title Blacklist by MER-C -- samtar talk or stalk 11:04, 3 September 2016 (UTC)

Harassment/Stalker abuse filter

  • Task: Tag or block edits by IPs from Special:Contributions/86.187.* (CIDR: 86.187.0.0/16) that have either the edit summary "rv v" (always with a space between rv and v) or "Undid revision ######### by [[Special:Contributions/Eik Corell|Eik Corell]] ([[User talk:Eik Corell|talk]])", where ######### is a diff number. Example IPs that would hopefully be blocked or tagged in the future with this filter: 86.187.161.103 and 86.187.160.100.
  • Reason:
This is part of a very long-term pattern of wikistalking/harassment of a user. See a current ANI filing at My stalker's latest IP sock and some past filings at this link. I've seen enough ANIs myself to know this is a very long-term and pervasive issue (message me if you need more details and I can dig up some more ANI threads from the past). I believe Malcolmxl5 and KrakatoaKatie are familiar with this case if you need some admin input. Edit: I see that a past similar request was made and rejected as "not a good reason". Given the extreme persistence of this stalker, the fact that a rangeblock is inappropriate, and the ineffectiveness of past blocks, I think an edit filter is the best option here. This is not a minor annoyance. It's 7 years of abuse. I'm hoping Malcolmxl5 can attest to the severity of this issue.
The IP range seems to be 86.187.160.0 - 86.187.175.255. I put the large range in the task description (CIDR: 86.187.0.0/16). I used a CIDR conversion site for the more narrow range and it gives me 86.187.160.0/21, 86.187.168.0/22, 86.187.172.0/23, and 86.187.174.0/24. Forgive me if this is wrong or not useful as I'm not too familiar with CIDR.
Given the rather consistent use of edit summaries and the IP range being identified, I hoped this could be something an abuse filter could manage. If more info is needed, please ping me. Thank you!

- EvergreenFir (talk) 04:12, 5 August 2016 (UTC)

We could potentially check if the IP is in that range using something like user_name ip_in_range 86.187.0.0/16 == true. The rest seems fairly straightforward. — Preceding unsigned comment added by Omni Flames (talk • contribs)
We can give it a try, and perhaps merge into an existing filter. Note the IP is likely going to find a way around it, and we'll be playing the same cat-and-mouse game we do with most socks. Unfortunately harassment is not easy to deal with :/ So between that and the fact we may not get enough hits to make the expense of a filter worthwhile, I'm making no promises. @Omni Flames: I believe you mean something more like ip_in_range(user_name, "86.187.160.0/20"). This is nothing sensitive but in general we probably shouldn't be writing out private filter details here on the wiki. Anyway I'll start up a test filter to evaluate the extent of abuse, and we'll go from there MusikAnimal talk 22:16, 11 August 2016 (UTC)
Watching this, given the continued harassment (and the related AN/I thread) I would say a filter here is the least we can do to help the editor. @MusikAnimal: where's this being tested? -- samtar talk or stalk 18:38, 14 August 2016 (UTC)
Apologies for not updating you all. This is running at Special:AbuseFilter/723 and looks good :) Updates to come... MusikAnimal talk 04:06, 15 August 2016 (UTC)
@MusikAnimal, Samtar, and Omni Flames: Thank you all for working on this. I greatly appreciate it and I'm sure the target of the harassment does as well. Cheers! EvergreenFir (talk) 04:42, 15 August 2016 (UTC)
@MusikAnimal: Filter is looking good, doesn't seem to be any FPs. I imagine this would need its own filter? -- samtar talk or stalk 19:23, 29 August 2016 (UTC)
@Samtar: I guess I'll need confirmation, but it was my assumption the filter was getting false positives ([34][35]). My most recent update should help, but it's been 9 days since then and we've had no hits. I figure we'll let it run a bit longer, and if disruption is often enough, consider moving forward with it MusikAnimal talk 23:05, 30 August 2016 (UTC)
In case it'll help the process along in some way, their latest IP is 86.187.171.148. Same pattern, but with some activity on the Ruger Mini 14 article as well. AccountForANI (talk) 17:39, 31 August 2016 (UTC)

Can something be done quickly about the situation? See WP:ANI#IP-hopping stalker is back. -- The Voidwalker Discuss 19:15, 31 August 2016 (UTC)

I'll keep an eye on this, MA's filter did get hits on this recent spree, but he is right - there were some false positives I didn't identify. We'll try some other things -- samtar talk or stalk 19:33, 31 August 2016 (UTC)
  Done I'm now convinced and have a dedicated filter at Special:AbuseFilter/792. @EvergreenFir: Let us know if you see anything get through. Best MusikAnimal talk 19:51, 31 August 2016 (UTC)
Thank you! I'll keep an eye out. AccountForANI let me know if I can help further too. EvergreenFir (talk) 02:39, 1 September 2016 (UTC)
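
On the CIDR question raised in the request: the span 86.187.160.0 to 86.187.175.255 collapses to the single block 86.187.160.0/20, which is the range used in the ip_in_range(...) example above. The Python sketch below checks this with the standard ipaddress module and also models the two edit-summary patterns. It is not the logic of filter 792; the function names are hypothetical and the target username is passed in as a parameter.

```python
import ipaddress
import re

# Collapse the reported start/end addresses into CIDR blocks.
FIRST = ipaddress.IPv4Address("86.187.160.0")
LAST = ipaddress.IPv4Address("86.187.175.255")
BLOCKS = list(ipaddress.summarize_address_range(FIRST, LAST))

REPORTED_RANGE = ipaddress.ip_network("86.187.160.0/20")


def in_reported_range(ip: str) -> bool:
    """Return True if the address falls inside the reported range."""
    return ipaddress.ip_address(ip) in REPORTED_RANGE


def summary_matches(summary: str, target: str) -> bool:
    """Match 'rv v' exactly, or an undo summary aimed at `target`."""
    name = re.escape(target)
    undo = rf"Undid revision \d+ by \[\[Special:Contributions/{name}\|{name}\]\]"
    return summary == "rv v" or re.match(undo, summary) is not None
```

Here BLOCKS contains exactly one network, 86.187.160.0/20, and all three IPs mentioned in this thread fall inside it.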

Category:<nationality> inventions

  • Task: Disallow anonymous editors from the range 24.114.0.0/17 from adding or removing any subcategory of Category:Inventions by country, all of which are named in the format "<nationality> inventions" (maybe look for the string "[[Category:"*" inventions]]" in added or removed lines?); tag non-autoconfirmed users making the same change.
  • Reason: Banned user:Filipz123 is a prolific sockpuppeteer (see Wikipedia:Sockpuppet investigations/Filipz123) and does very little other than these changes, almost all of which are bad, and can hit many articles before being detected (I've just reverted 40-odd). They use a mixture of IPs from the given range (which would have too much collateral damage to block) and sock accounts. The chances of false positives from preventing IPs from that range making such a specific change is negligible, but I want to see whether there are (m)any false positives on the named accounts before disallowing those edits. This user's contributions also get tagged with "Mobile web edit", so if it is possible for an edit filter to read that then maybe that could be added in as well, but I'd rather tag even those edits made by other means at first. Thryduulf (talk) 13:13, 7 August 2016 (UTC)
I think we had a filter for this LTA at some point, I will try to track it down or create a new test filter if necessary MusikAnimal talk 03:02, 12 August 2016 (UTC)
  Done at Special:AbuseFilter/790. I'm going to continue to monitor for a bit before enabling any actions MusikAnimal talk 15:33, 15 August 2016 (UTC)
I tweaked it a bit to catch another variant I've seen (and reformatted to make it easier for me to follow the logic). DMacks (talk) 17:18, 15 August 2016 (UTC)
@Thryduulf: Let me know if you see any more related edits getting through. Best MusikAnimal talk 21:01, 21 August 2016 (UTC)
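
The request above (disallow for anonymous editors in 24.114.0.0/17, tag for non-autoconfirmed accounts) can be sketched in Python as follows. This is a reading of the request only, not the conditions of filter 790; the function names, parameter names and return strings are hypothetical.

```python
import ipaddress
import re

# A category link of the form [[Category:<nationality> inventions]],
# optionally with a sort key after a pipe.
INVENTIONS_CATEGORY = re.compile(
    r"\[\[\s*Category\s*:\s*[^\]|]+ inventions\s*(\|[^\]]*)?\]\]",
    re.IGNORECASE,
)
TARGET_RANGE = ipaddress.ip_network("24.114.0.0/17")


def touches_inventions_category(added_lines: str, removed_lines: str) -> bool:
    """Return True if such a category was added or removed."""
    return bool(
        INVENTIONS_CATEGORY.search(added_lines)
        or INVENTIONS_CATEGORY.search(removed_lines)
    )


def proposed_action(user_name, is_anonymous, is_autoconfirmed,
                    added_lines, removed_lines):
    """Return 'disallow', 'tag' or 'allow' as described in the request."""
    if not touches_inventions_category(added_lines, removed_lines):
        return "allow"
    if is_anonymous:
        # For anonymous editors the user name is the IP address.
        in_range = ipaddress.ip_address(user_name) in TARGET_RANGE
        return "disallow" if in_range else "allow"
    return "allow" if is_autoconfirmed else "tag"
```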

Date-change vandalism

  • Task: Tag rapid (<300 seconds between edits) date changes.
  • Reason: Since around 2007 there has been an increasing number of "sneaky vandals" that like to change numbers (usually dates, but any number really). These edits are always unsourced and the editors (usually IP accounts) rarely or never respond to user talk page comments. Although they are regularly reverted at high-traffic articles, these edits languish for months and years at unwatched backwater articles like those covering Children's TV programming, albums from one-hit-wonder bands, educational video games, children's books, etc. This issue has become such a problem that it has made up 2 of the 5 elements of the Subtle Vandalism Taskforce's scope since 2010.

These edits are not necessarily all vandalism, but when they're not they're usually incorrect unsourced original research (as when a child thinks the debut date for a Bugs Bunny cartoon is last Saturday when they saw it on TV, not realizing that it was a rebroadcast of a 1940s episode) which is why a tag (rather than a hard-filter) would be helpful.

These kinds of edits introduce misinformation into a large number of articles at a rapid pace and may take weeks to research and correct, and they have been happening encyclopedia-wide for close to a decade. If anyone who knows how to create filters has any idea how to create and implement such a tagging filter, please help us to get it started so that this steady flow of vandalism can at last be stanched. Below I'm linking some examples of previous filters that acted in much the same manner as the filter I'm requesting. Perhaps they can be used as a model for this filter.

Thank you. -Thibbs (talk) 14:51, 24 April 2016 (UTC)

Prior art
  • In 2009, Filter 249 was created to assist editors in identifying rapid-paced reverting.
  • In 2011, Filter 391 was created to assist editors working in the area of sports-related articles to identify subtle vandalism related to the height/weight of sports athletes. That filter has been highly successful in identifying problematic editors and long-term vandals.
@Thibbs: Would a filter like this work?
!("confirmed" in user_groups) & (
  article_namespace == 0 & (
    date_change_vandalism :="(([0123456789]|present){4}|[0-9]{1,3} (January|February|March|April|May|June|July|August|September|October|November)|(January|February|March|April|May|June|July|August|September|October|November) [0-9]{1,3})";
    added_lines irlike date_change_vandalism & (
      removed_lines irlike date_change_vandalism & (
        !("<ref" in added_lines)
      )
    )
  )
)
Omni Flames (talk) 10:28, 3 July 2016 (UTC)
@Omni Flames and Thibbs: In testing at public filter 777, many thanks OF!   -- samtar talk or stalk 15:31, 6 July 2016 (UTC)
Welp! Entirely my fault for not batch testing, but your regex matches a lot of false positives. I'll work on it -- samtar talk or stalk 15:52, 6 July 2016 (UTC)
@Samtar: Yeah, but I suppose that if you want to catch all unsourced date changes, there's always going to be a lot of false positives. Omni Flames (talk) 21:48, 6 July 2016 (UTC)
Ah, I just took a look at the log and now I see what you mean by the false positives. Honestly, I'm not sure what else we could do to stop them occurring. Is there a way to return the value which a regex has matched? Omni Flames (talk) 21:55, 6 July 2016 (UTC)

Thanks for the work on this, Omni Flames and Samtar. I notice that in Filter 249 (rapid-paced reverts by a new editor - linked above) the "Trigger actions only if the user trips a rate limit" flag is set so that the filter is only triggered when more than 3 edits of the same kind are made by the target usertype within 300 seconds. Could that same limit be applied to filter 777 to reduce false positives? -Thibbs (talk) 13:21, 8 July 2016 (UTC)

Good idea, testing again for a couple of minutes -- samtar talk or stalk 13:25, 8 July 2016 (UTC)
By Jove I think you've cracked it... I'll continue to monitor. @Omni Flames: if you're around a half-eye on the filter log and a panicked message should it flood again would be appreciated! -- samtar talk or stalk 13:33, 8 July 2016 (UTC)
That was short lived, disabling -- samtar talk or stalk 13:37, 8 July 2016 (UTC)
I'm drawing blanks here on any other ways to limit the filter - it's picking up edits which contain new dates, so I imagine it would have to match for a date in the old_wikitext and then match against the same date being changed in added_lines? -- samtar talk or stalk 13:47, 8 July 2016 (UTC)
@Samtar: I've been spending some time trying to figure out how something like this could work in a filter. You're right that we would have to check that the date in the old_wikitext wasn't found in added_lines. However, I'm not even sure if it's possible to do this with the abuse filter. I've read through some documentation pages, but I can't seem to find any kind of function which returns what value was matched by a regex. Unless there is something along those lines, I don't think this is going to work. Omni Flames (talk) 02:21, 9 July 2016 (UTC)

Thanks for the continued work on this, Samtar and Omni Flames. If I'm not mistaken, Filter 391 (flagging height/weight changes by a new editor - linked above) accomplishes just what you're talking about, comparing added_lines to removed_lines in old_wikitext. Could that be used as a model (just changing the height/weight-related identifiers for date-related identifiers)? -Thibbs (talk) 13:42, 9 July 2016 (UTC)

I've been looking over the code of Filter 777 and trying hard to remember the comp sci classes I took in the late 90s, so apologies in advance if these questions just show my ignorance, but... I don't understand a few things:
1) Isn't ([0123456789]|present){4} looking for "present" to match 4 times? When would that ever occur? Wouldn't something like ([0-9]{4}|present) make more sense?
2) The maximum size of a day of the month is 2-digits long (as in "31 January"), so why do we twice have [0-9]{1,3} rather than [0-9]{1,2}?
3) The two listings of the months (January|February|March... both end in ...September|October|November). Why not ...October|November|December)?
I will continue to look into this in the next few days and to explore the idea of using filter 391 as a model because I just remembered that I have some datasets of the targeted kinds of vandalism (1 and 2) and a more narrowly-tailored search pattern may be all that is needed. -Thibbs (talk) 12:48, 11 July 2016 (UTC)
@Thibbs:
1) You're right, nice catch!
2) Once again, you're right...
3) Another error on my part.
Well, let's just say that I was up late when I wrote that regex, and so it looks like I made a few careless mistakes. Thanks for catching them! Omni Flames (talk) 12:52, 11 July 2016 (UTC)
Correct, there is no way to do regex captures with the abuse filter extension. E.g. "(cond1|cond2)" is the same as "(?:cond1|cond2)" (non-capturing group). Filter 391 works by targeting only changes to specific parameters of a template. So if removed_lines had | weight = 200 lbs then if added_lines only changed 200 to 201 (edit_delta of zero), the filter is triggered because removed_lines and added_lines both matched the same thing, and because the edit was going to be saved there must have been some sort of change. Since it is so restrictive this filter is usually accurate, but all it requires is any change to the block of text matching the regex, so false positives are inevitable ([36][37][38]). For us, matching any date-like string in added/removed_lines will not work because the likelihood of some change to that block of text is far higher. E.g. I can copy edit a paragraph and so long as there's a date somewhere in there the filter is triggered – not good! Like 391, we'll have to restrict to templates or other structured data in order for it to be remotely useful. I've updated the filter to work like this, along with some tweaks. We should see better results, but we will never be able to call it "vandalism". Let's see how she flies and we can reconsider tagging with "possible date change" or the like MusikAnimal talk 18:57, 11 July 2016 (UTC)
Thanks for clarifying that MusikAnimal. It's annoying that the abuse filter is so limited, but anyway. Your changes to the filter made it much more efficient, so thanks for that. One thing you could maybe check for is whether or not the changed parameter contains something along the lines of "date" or "release"? Omni Flames (talk) 23:39, 11 July 2016 (UTC)
We're already checking for | some_param = date-like-string which should cover all date-related parameters. It depends on what we're trying to capture, but I figure any date changes in a template are probably worth tracking MusikAnimal talk 01:21, 12 July 2016 (UTC)
That looks great, MusikAnimal. Thanks for looking into this. Unsourced date changes within templates and tables are by far the most common example of this kind of vandalism I've seen and honestly a simple flag (not describing it as vandalism, but just noting that there are date changes) would be all that would probably be needed to get some extra eyes on an edit and ensure that it's not a problematic one. Especially if the risk factors of "unconfirmed editor" and "rapid, only-date-changes-for-last-several-edits" are there. -Thibbs (talk) 11:06, 12 July 2016 (UTC)
No problem! I've updated it to do the throttling check, as you say. Since adding this we haven't had any hits, so let's keep monitoring. Without it we see a bunch of good-faith edits, but unconfirmed users quickly changing dates across various articles is more suggestive of abusive behaviour, I think MusikAnimal talk 18:46, 12 July 2016 (UTC)

  Done Seems to be working as expected -- samtar talk or stalk 07:13, 9 August 2016 (UTC)
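
The template-parameter restriction described in this thread can be illustrated outside the extension. The following is a minimal Python sketch, not the actual filter: the pattern `DATE_PARAM` and the function name are hypothetical, and Python's `re` module stands in for the extension's regex engine. It only fires when both sides of the diff contain a date-like value on a `| param = value` line, so a copy edit to running prose that happens to contain a date is ignored.

```python
import re

# Hypothetical pattern: a template parameter line ("| name = value") whose
# value contains a date such as "12 July 2016" or "July 12, 2016".
DATE_PARAM = re.compile(
    r"^\s*\|\s*\w+\s*=.*\b(?:\d{1,2}\s+[A-Z][a-z]+\s+\d{4}|[A-Z][a-z]+\s+\d{1,2},\s+\d{4})\b",
    re.MULTILINE,
)


def looks_like_date_param_change(removed_lines: str, added_lines: str) -> bool:
    """Return True when both sides of the diff hold a date-like template parameter."""
    return bool(DATE_PARAM.search(removed_lines)) and bool(DATE_PARAM.search(added_lines))
```

As in the discussion, a match only says that a date-bearing parameter line was touched; it cannot tell whether the change was vandalism.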

Template doc pages

  • Template docs: A lot of doc pages are getting hit by variations of this
  • Reason: IP range also has good edits.

- NeilN talk to me 17:11, 2 August 2016 (UTC)

Working on this MusikAnimal talk 17:34, 2 August 2016 (UTC)
In testing at Special:AbuseFilter/723 MusikAnimal talk 17:47, 2 August 2016 (UTC)
Thank you. --NeilN talk to me 22:03, 2 August 2016 (UTC)
No prob,   Done with Special:AbuseFilter/785 MusikAnimal talk 22:05, 2 August 2016 (UTC)

Harambe vandalism

  • Task: Tag all additions of the word "Harambe", or "Harumbe", which is a common misspelling of the name (see Google search) by unregistered or new users. This could either be an addition to an existing filter, or most likely, a standalone filter. The regex should be something along the lines of Har(a|u)mbe. The rest of the filter should be fairly straightforward to write, although note that it might be a good idea to ignore the edit if the page title contains "Harambe".
  • Reason: After the gorilla's death, it's turned into somewhat of a meme now, and I've noticed large amounts of vandalism adding it to articles recently. As the creator of the DAB page Harambe, I get a notification every time someone links there, which is helpful, but it doesn't allow me to look for unlinked versions of the name, or links to the actual page, Killing of Harambe. Here are some examples of this vandalism which I found just in the last 24 hours, although there are most likely a lot more out there that I couldn't find: [39][40][41][42][43][44][45][46]. I'm willing to give more examples of this if needed. All of the edits I've found so far have been vandalism, although it is possible there are some that aren't. A lot of it is not yet reverted by the time I get to it, so this would help me find more of the vandalism that goes on. Thanks, Omni Flames (talk) 07:13, 2 August 2016 (UTC)

- Omni Flames (talk) 07:13, 2 August 2016 (UTC)

@Omni Flames: I've popped it up on Special:AbuseFilter/1 just to test for hits -- samtar talk or stalk 07:32, 2 August 2016 (UTC)
Wow Samtar, that was quick! Anyway, thanks, I'll watch it and see if it gets any hits. Omni Flames (talk) 07:33, 2 August 2016 (UTC)
  Done at Special:AbuseFilter/784 -- samtar talk or stalk 08:39, 2 August 2016 (UTC)
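
The logic requested above (match `Har(a|u)mbe`, limit to unregistered or new users, skip pages whose title contains the name) can be sketched in Python. This is an illustration under stated assumptions and says nothing about how Special:AbuseFilter/784 is actually written; `should_tag` and its parameters are hypothetical names.

```python
import re

# Har(a|u)mbe from the request, written as a character class, case-insensitive.
HARAMBE = re.compile(r"har[au]mbe", re.IGNORECASE)


def should_tag(page_title: str, added_lines: str, user_is_new: bool) -> bool:
    """Tag additions of the name by new users, except on pages about the subject."""
    if not user_is_new:
        return False
    if HARAMBE.search(page_title):
        # The page title contains the name, so mentions are expected there.
        return False
    return bool(HARAMBE.search(added_lines))
```

For instance, `should_tag("Cincinnati Zoo", "RIP Harumbe", True)` is true, while the same text added to "Killing of Harambe" is ignored.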

Rapper

- Dat GuyTalkContribs 15:08, 21 September 2016 (UTC)

Added to Special:AbuseFilter/781 for the moment (log and tag), see how it goes -- samtar talk or stalk 15:37, 21 September 2016 (UTC)
Seems the string 'rapper' is used a lot by non-confirmed for legitimate reasons.. I don't think this filter idea will work -- samtar talk or stalk 18:15, 21 September 2016 (UTC)
There is no way you're going to be able to look for simply "rapper" or "rapping" and expect to get anything useful. This is a common term that may show up in any music-related page or category. At the very least consider a word break as with \brapper\b so you don't get "wrapper", etc. The example diff also doesn't appear to be particularly problematic, at least to the extent to warrant a filter MusikAnimal talk 15:29, 22 September 2016 (UTC)
  Denied Doesn't look like this is problematic enough to warrant a filter which, in any iteration, will have false positives -- samtar talk or stalk 18:52, 10 October 2016 (UTC)
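
The word-break point raised above is easy to demonstrate. A small Python sketch (the variable names are illustrative) compares the bare string with the `\brapper\b` form:

```python
import re

LOOSE = re.compile(r"rapper", re.IGNORECASE)        # matches inside other words
BOUNDED = re.compile(r"\brapper\b", re.IGNORECASE)  # whole word only

samples = ["a gift wrapper", "He is a rapper.", "Several rappers appeared."]
loose_hits = [s for s in samples if LOOSE.search(s)]
bounded_hits = [s for s in samples if BOUNDED.search(s)]
```

Here `loose_hits` contains all three samples, while `bounded_hits` contains only "He is a rapper."; the bounded form drops "wrapper" but also the plural "rappers". Neither form addresses the underlying problem noted in the thread, which is that the word is used legitimately all the time.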

Tripled Or More Letter Filter

  • Task: Basically, a blanket filter for one letter being repeated 3 times or more, such as ttt or something similar, as very few words in the English language have 3 of the same letter back to back. Obviously for names or abbreviations it wouldn't be a disallow, but it would tag it.
  • Reason: A lot of vandals have taken to just adding a string of one letter or else super-emphasizing a word to make it reeaaallllyyyy (example) long.

- Iazyges (talk) 20:56, 14 September 2016 (UTC)

@Iazyges: See Special:AbuseFilter/135, it handles repeating characters and tags them. Could you link to any edits that weren't tagged, that you feel should have been? MusikAnimal talk 00:31, 21 September 2016 (UTC)
@MusikAnimal: Oh, I was unaware of that, sorry. Iazyges Consermonor Opus meum 00:33, 21 September 2016 (UTC)
  Denied Apparently not needed MusikAnimal talk 15:29, 22 September 2016 (UTC)
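
For illustration only, the "same letter three or more times" idea from the request can be expressed with a backreference. This Python sketch is not a description of how Special:AbuseFilter/135 works; `TRIPLED` and `repeated_runs` are hypothetical names.

```python
import re

# One letter followed by at least two more copies of itself.
TRIPLED = re.compile(r"([a-z])\1{2,}", re.IGNORECASE)


def repeated_runs(text: str) -> list[str]:
    """Return every run of three or more identical letters found in text."""
    return [m.group(0) for m in TRIPLED.finditer(text)]
```

`repeated_runs("that was reeaaallllyyyy long")` returns `["aaa", "llll", "yyyy"]`, and ordinary words such as "bookkeeping" produce nothing. Roman numerals would still match ("Henry VIII" yields `["III"]`), which is the kind of case the request suggests tagging rather than disallowing.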

(Dank) memes

  • Task: Tags all new users that have the words "dank" and/or "meme(s)" in their usernames.
  • Reason: Many users who have these strings in their names are vandals.

- Linguist Moi? Moi. 15:39, 15 November 2016 (UTC)

  Denied Many of them might be vandals, but certainly not all are, and we're not going to block solely because either of these terms is in the username. Therefore I don't think the expense of a filter is worthwhile. If vandalism is observed admins can use their own discretion as to whether the username further suggests they are WP:NOTHERE and act accordingly MusikAnimal talk 17:08, 15 November 2016 (UTC)
  • Task: Tags edits adding links to the BBC iPlayer service which match a specific URL prefix (typically *://www.bbc.co.uk/iplayer/*).
  • Reason: BBC iPlayer links are not stable (they tend to expire after about 14-28 days), and so are not necessarily a long-term source link. Having a tracking filter will allow other users to track the links and replace them with more appropriate long-term sources containing the same information. In addition, as of Sept 1st 2016, iPlayer is a de facto subscription content service for a number of users, and owing to its use of Flash is incompatible with some browsers, necessitating a link warning on Wikipedia, which an additional contributor may forget to add. Tracking the addition of iPlayer links will allow for the appropriate Flash warnings to be given.

Sfan00 IMG (talk) 15:23, 3 September 2016 (UTC)

@Sfan00 IMG: Just to confirm, this would only be for logging these additions? I don't think the use of a tag would be necessary, as the filter itself would be public (and thus the logs would be also) -- samtar talk or stalk 16:44, 3 September 2016 (UTC)
  Done at Special:AbuseFilter/794 - batch tested and log only, seems like this does have a use for those who would wish to patrol the addition of iPlayer links -- samtar talk or stalk 16:45, 3 September 2016 (UTC)
@Sfan00 IMG: Is there any advantage of using a filter over a simple search? Tagging edits that add these links may be a nice convenience but it's expensive to run such a filter on every mainspace edit. The search will also identify links that were added in the past, or that somehow got around the filter MusikAnimal talk 19:17, 3 September 2016 (UTC)
If you can provide a version of Special:Linksearch that will track the specific revision when a link is added, then the filter isn't needed. Sfan00 IMG (talk) 17:54, 4 September 2016 (UTC)
@Sfan00 IMG: Well, we're not too far off from having phab:T115119 completed, which will fulfill that need, and more! :) In the meantime, it's true we cannot easily provide you with revision info of when links were added/removed. May I ask why this information is important? Do you merely need to contact the user who added the link? User:XLinkBot may be of help here. I think we should discuss further, it's just important to understand the expense of the requested edit filter. If we do move forward with a filter, are there any other restrictions we can think of to add? For instance, have the filter only run for users with less than say, 500 edits? MusikAnimal talk 17:17, 6 September 2016 (UTC)
Yes, it's so that the user can be advised to find an alternative link that's not paywalled. Sfan00 IMG (talk) 08:48, 7 September 2016 (UTC)
I wonder if a bot would be better for this task. It could automatically notify users who add these links. KSFTC 17:38, 7 September 2016 (UTC)
@Sfan00 IMG and KSFT: Yes, that bot is User:XLinkBot :) It's a bit tricky so I'm going to ask for help, but it appears you can create customized warnings, and even define logic such as only revert once, or revert even whitelisted users, etc. I will let you know what I found out! I think we can safely mark this request as   Denied as far as the need for an edit filter MusikAnimal talk 18:29, 9 September 2016 (UTC)
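
For reference, the URL prefix in the request (`*://www.bbc.co.uk/iplayer/*`) translates into a short pattern. The Python sketch below is an assumption-laden illustration, not the logic of Special:AbuseFilter/794 or of User:XLinkBot: `adds_iplayer_link` is a hypothetical helper, only `http`/`https` and protocol-relative links are considered, and the episode path in the usage example is made up.

```python
import re

# Prefix from the request; the scheme is optional to allow "//www.bbc.co.uk/...".
IPLAYER = re.compile(r"(?:https?:)?//www\.bbc\.co\.uk/iplayer/", re.IGNORECASE)


def adds_iplayer_link(added_lines: str, removed_lines: str = "") -> bool:
    """True when the edit ends up with more iPlayer links than it removed."""
    return len(IPLAYER.findall(added_lines)) > len(IPLAYER.findall(removed_lines))
```

A call such as `adds_iplayer_link("See http://www.bbc.co.uk/iplayer/episode/example")` is true, whereas an edit that merely rewords a line already containing the link is not counted.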

Cause of death vandal

  • Task: the long-term and repeatedly blocked cause of death vandal is a tricky one. They operate from IP ranges far too big to rangeblock (and recently from a different country), and hit random biographical articles, making semi-protection unviable as well. However, the similarity in many of their edits (nearly all of it in infoboxes) suggests to me that a filter could be useful here.
  • Reason: To prevent vandalism, some of which (due to its sheer randomness) can remain in articles for a long time. Black Kite (talk) 19:17, 5 July 2015 (UTC)
  • They're still going, see [48]. A good starting point would be to disallow IPs changing (infobox musical artist) to (infobox person), i.e. [49]. The vandal does this because the former template does not have cause of death, etc. on it. Black Kite (talk) 09:11, 15 July 2015 (UTC)
@Black Kite: This will be a hard filter to make but it might be possible, for at least some of the characteristics. I'll have a think about it but eyes from more experienced EFMs would be appreciated. Sam Walton (talk) 10:04, 5 September 2015 (UTC)
712 already logs DoB changes (Black Kite: Filter log 712 might be useful). Extend that to the other well-known changes, and limit by IP range. Then the decision is whether to log or block, which can only be decided after watching the logs for a few days. (On a side-note it would be good to have some empirical data on the impact of various strategies, specifically blocking, quick reversion, and more leisurely reversion, on medium term vandals.) All the best: Rich Farmbrough, 22:57, 5 September 2015 (UTC).

Lions at Cat Creek

@The Bushranger: Are they still active? PhantomTech (talk) 06:06, 20 March 2015 (UTC)
@PhantomTech: I've seen a few socks of this user recently, but not as many as in the past. Ping me if you need me to reply EvergreenFir (talk) Please {{re}} 21:26, 23 April 2015 (UTC)
@PhantomTech: A bit belated due to my busy year, but...these shenanigans have continued through at least August, so yeah, this is still a needed thing if the poor Cat Creek article is ever to get un-full-protected. - The Bushranger One ping only 08:27, 4 December 2015 (UTC)
@EvergreenFir: @The Bushranger: Do you know if this is still an issue? Omni Flames (talk) 10:38, 3 July 2016 (UTC)
@Omni Flames: There was one incident in the past month or so, but he's been rather quiet recently. But he's a very long term abuser, and persistent. If it's an easy one, this would be a useful filter imho. EvergreenFir (talk) Please {{re}} 19:12, 3 July 2016 (UTC)
@Omni Flames: I concur; while sometimes they go quiet, when they do come back it's sudden and oft en masse. As long as it isn't a bother then this should be done, as there is no legitimate reason for the content it would preclude to be inserted. - The Bushranger One ping only 21:37, 5 July 2016 (UTC)

@The Bushranger: @EvergreenFir: A filter could work, I guess. Something along the lines of this maybe.

article_namespace == 0 & (
  lacc := ((L(i|y)ons)(.*)(C(a|@)t Creek|Montana)|(C(a|@)t Creek|Montana)(.*)(L(i|y)ons))
  added_lines irlike lacc & (
    !removed_lines irlike lacc & (
      !("mountain" in added_lines)
    )
  )
)

As for the idea of catching "ROAARR" or something like that in edit summaries, although such a filter would be easy to write, I personally think it would catch a lot of false positives. Just my opinion though. Omni Flames (talk) 07:31, 8 July 2016 (UTC)

From what I can tell, these waves of disruption happen very infrequently. They also closely resemble general vandalism, are quickly reverted, and ultimately blocked. Finally, like all the sock filters, they're going to just find a way around it and we'll continually need to update it to keep up with them. It seems for this particular LTA the old-fashioned way would be the most effective MusikAnimal talk 02:01, 12 July 2016 (UTC)
@Omni Flames: This filter is also full of syntax errors. Music1201 talk 04:28, 15 July 2016 (UTC)

Here is the same filter without the syntax errors:

article_namespace == 0 & (
  lacc := "\b((L(i|y)ons)(.*)(C(a|@)t Creek|Montana)|(C(a|@)t Creek|Montana)(.*)(L(i|y)ons))";
  added_lines irlike lacc & (
    !removed_lines irlike lacc & (
      !("mountain" in added_lines)
    )
  )
)

Music1201 talk 04:46, 15 July 2016 (UTC)

Ah, thank you for that Music1201. I forgot the quotation marks and the semicolon. However, as far as I know, the \b isn't needed, although I could be wrong. Omni Flames (talk) 07:30, 15 July 2016 (UTC)
The \b could go either way. It is not required, although without it, the filter may pick up a lot more false positives than expected. Music1201 talk 17:13, 15 July 2016 (UTC)
  Denied I don't think a filter is the best course of action here. If disruption pops up en masse, as they say it sometimes does, we can consider putting something together quickly, but I don't see an advantage to being preemptive about it MusikAnimal talk 15:54, 17 July 2016 (UTC)
@MusikAnimal: The problem here is that every single time the Cat Creek, Montana article has been unprotected, the vandals reappear. The article has been locked, completely - full protection - for 4 years because of this - and the instant the protection expired they returned. This is long-term abuse and all other options have been tried. An edit filter is the only way this page can be left unprotected. - The Bushranger One ping only 03:31, 21 July 2016 (UTC)
@The Bushranger: That article has been under semi since May and nothing has popped up. By the time it does, we'll more than likely have a new tool under our belt that should prove effective. If by that time ECP is still not permissible please ping me and I will put something together. For now I'm against the idea of hogging resources for the possibility that they may return MusikAnimal talk 04:21, 21 July 2016 (UTC)
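
The effect of the leading `\b` debated above can be checked directly. The following Python sketch reuses the pattern from the corrected filter, with `re.IGNORECASE` standing in for `irlike`; `would_match` and the constant names are hypothetical, and the "mountain" test is a plain case-sensitive substring check in this sketch.

```python
import re

BODY = r"((L(i|y)ons)(.*)(C(a|@)t Creek|Montana)|(C(a|@)t Creek|Montana)(.*)(L(i|y)ons))"
WITHOUT_B = re.compile(BODY, re.IGNORECASE)
WITH_B = re.compile(r"\b" + BODY, re.IGNORECASE)


def would_match(pattern: re.Pattern, added_lines: str, removed_lines: str = "") -> bool:
    """Mirror the three conditions of the proposed rule for one edit."""
    return (
        pattern.search(added_lines) is not None  # newly added text matches
        and pattern.search(removed_lines) is None  # removed text did not match
        and "mountain" not in added_lines  # skip mentions of mountain lions
    )
```

Both versions match "There are lions in Cat Creek". Only the version without `\b` matches "Two stallions were sold in Montana", because "lions" there sits inside another word, which is the kind of extra false positive mentioned above.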

Please turn off 744

Please turn off filter #744, which I requested a year ago at Wikipedia:Edit filter/Requested/Archive 7#Blanking TemplateData. Normal editing/patrolling seems to be sufficient now. Thanks, WhatamIdoing (talk) 00:58, 28 December 2016 (UTC)

  Done MusikAnimal talk 05:21, 29 December 2016 (UTC)

Date warring

  • Task: Can we please have a filter to deal with Kipperfield. This person or group is changing date formats across all mainspace pages in defiance of WP:DATERET using automated or semi-automated processes.
  • Reason: A large army of socks have been dealt with through WP:SPI but more are continually being created. The problem has been ongoing since at least 2008 and there is no sign of it stopping. Kipperfield may well not be the only date warrior out there.

- SpinningSpark 00:20, 4 December 2016 (UTC)

  • Is it possible to look at user editing history in a filter? That could potentially weed out the false positives (new user, three date changes in a row or something like that). SpinningSpark 11:32, 4 December 2016 (UTC)
The filter rules can't examine an account's previous edits, but it can use account age and editcount data, and the restrictions can be set to "throttle", so this shouldn't be dismissed out of hand. I wouldn't be sure how to write rules for this, though. BethNaught (talk) 11:42, 4 December 2016 (UTC)
It should be easy to detect a regex match of Kipperfield's preferred date change, e.g. December 12, 2016 to 12 December 2016. Combine that with new(ish) user, no other changes made, and maybe check the edit summary as well and I think there will be very few false positives. There can't be many new users who come along just to make a date format change, and even if you do catch a few GF edits, they are more than likely violations of DATERET, even if innocently done. SpinningSpark 18:10, 12 December 2016 (UTC)
  • I am somewhat concerned that the first (at least) currently listed as a suspected sock is making almost completely legitimate edits, which are being branded as "pointless". Given this I am reluctant to bring the power of edit filters to bear, without further investigation. All the best: Rich Farmbrough, 01:35, 18 December 2016 (UTC).

We could in theory log date changes by unconfirmed users/ips, no? Ⓩⓟⓟⓘⓧ Talk 01:46, 18 December 2016 (UTC)

  • Changing date formats is, of course, a completely legitimate thing to do. What is not legitimate is to create an army of socks to do it en-masse in an automated way, in defiance of a block, and without regard to DATERET. At the very least it is WP:BLOCK EVASION, and likely also a breach of WP:BOTPOL. Certainly some Kipperfield edits are actually useful, unifying the date format, but there are many others that are just straight date changes. Kipperfield always unifies to his preferred format, sometimes unifying to the body text, sometimes to the infobox format, whichever is in his favourite format. What is more, on some of the edits I have checked in detail in the past the inconsistency had been caused by a previous run of a Kipperfield bot. Kipperfield is welcome to come back to legitimate editing and argue his case for these changes. I'll unblock him myself. All he has to do is undertake to kill off the socks. SpinningSpark 11:03, 18 December 2016 (UTC)
  Impossible King of 09:34, 16 January 2017 (UTC)
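
As an illustration of the check proposed in this thread (a date rewritten from "December 12, 2016" to "12 December 2016" with no other changes), here is a minimal Python sketch. It works on plain before/after strings rather than within the edit filter's own rule language, and every name in it is hypothetical.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Month-day-year dates such as "December 12, 2016".
MDY = re.compile(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{4}})\b")


def mdy_to_dmy(text: str) -> str:
    """Rewrite every "Month D, YYYY" date in text as "D Month YYYY"."""
    return MDY.sub(r"\2 \1 \3", text)


def is_pure_format_change(removed: str, added: str) -> bool:
    """True when the only difference is an MDY date reformatted as DMY."""
    return removed != added and mdy_to_dmy(removed) == added
```

`is_pure_format_change("Released on December 12, 2016.", "Released on 12 December 2016.")` is true, while changing the day as well makes it false. The sketch ignores account age, edit counts and throttling, which the thread notes are the other signals a real rule would need.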

Flag vandal