Wikipedia talk:WikiProject Check Wikipedia/Archive 5


svwiki has no errors reported

  Done

Hi, I wish you luck with Labs...

See Wikipedia talk:WPCleaner#CheckWikipedia_does_not_work_on_sv.wikipedia.org.5B....5D, svwiki has no errors reported on Labs, while some are reported on toolserver. --NicoV (Talk on frwiki) 06:55, 6 November 2013 (UTC)Reply

NicoV, there has been a problem with the svwiki dump files for the past few months. There is some sort of corruption about halfway through the dump. I don't think it is a bad dump, as it always happens and only happens with svwiki. Probably some borked articles. I'll hunt down the articles in a few days. I've got to debug it on my computer, and the computer is currently running the latest enwiki dump. I'll then be busy fixing errors. Dump processing on my 3-year-old computer is 68% done. At Labs, it is 11% done, and it started at the same time as on my computer. Bgwhite (talk) 07:47, 6 November 2013 (UTC)
NicoV and Josve05a, errors are up. It's still not going through 100% of the dump, but at least there are errors to play with. Bgwhite (talk) 08:21, 24 November 2013 (UTC)
SCORE! Finally! Thanks! -(tJosve05a (c) 08:31, 24 November 2013 (UTC)Reply
Bgwhite, NicoV: error #37 is showing a lot of false errors, since svwp now supports special characters in DEFAULTSORT. The error wants me to create a DEFAULTSORT with the exact same name as the title. Deactivate? -(tJosve05a (c) 09:07, 24 November 2013 (UTC)
Josve05a Just mark them all done and I'll deactivate the error. Boy, marking done 2/3 of the errors is a nice feeling. Bgwhite (talk) 09:12, 24 November 2013 (UTC)Reply
Bgwhite, THAT FEELING. Quote I said in my mind: "Is this how it feels to vandalise Wikipedia?". -(tJosve05a (c) 09:16, 24 November 2013 (UTC)Reply

#67 and abbreviations

  Done

Would it be possible to configure a list of abbreviations after which it is normal to have the reference directly after the punctuation? For example, etc.<ref>REF</ref> is OK because etc. is an abbreviation. WPCleaner uses error_067_abbreviations_.. to configure this list. --NicoV (Talk on frwiki) 13:50, 13 November 2013 (UTC)

Yes. Bgwhite (talk) 20:27, 13 November 2013 (UTC)Reply
NicoV, do you have some articles where the abbreviations can be found, so I can test things out? Bgwhite (talk) 00:15, 26 November 2013 (UTC)
Yes, sure: fr:Grenade à main (after J.-C.). --NicoV (Talk on frwiki) 12:45, 26 November 2013 (UTC)Reply
NicoV, in theory, this is fixed. Will upload the fixed version as soon as Labs comes back on-line and see how the daily update pans out. FYI... Grenade à main is on the whitelist. Bgwhite (talk) 22:53, 26 November 2013 (UTC)Reply
Thanks ! I know, that's how I found it for the example :-) I've just removed it from the whitelist. --NicoV (Talk on frwiki) 08:31, 27 November 2013 (UTC)Reply
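
The abbreviation exception requested in this thread can be sketched as follows. This is only an illustration under stated assumptions, not the Checkwiki or WPCleaner code: the abbreviation list, the set of punctuation marks, and the function name are hypothetical, and a real wiki would load its own list from its configuration.

```python
import re

# Hypothetical abbreviation list; each wiki would configure its own.
ABBREVIATIONS = ["etc.", "J.-C.", "e.g."]

# A punctuation mark immediately followed by an opening <ref> tag.
PUNCT_BEFORE_REF = re.compile(r"[.,;:!?]<ref[ >]")


def punctuation_before_ref(text, abbreviations=ABBREVIATIONS):
    """Return offsets of punctuation marks that sit directly before a <ref>,
    skipping marks that are the last character of a listed abbreviation."""
    hits = []
    for match in PUNCT_BEFORE_REF.finditer(text):
        end = match.start() + 1  # index just past the punctuation mark
        before = text[:end]
        if any(before.endswith(abbr) for abbr in abbreviations):
            continue  # e.g. "etc.<ref>" is accepted
        hits.append(match.start())
    return hits
```

With this sketch, "etc.<ref>REF</ref>" and "J.-C.<ref ...>" are not reported, while an ordinary sentence-ending period before a reference still is.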

# 37

  Done

Moin Moin Bgwhite, a question reached me at the German Wikipedia. I set a DEFAULTSORT (German: SORTIERUNG), but before that there was a sort key directly on the category (see this link). The article contains the "Disambiguation" template, which automatically sets the category "Begriffsklärung". Could you tell me whether it is right or wrong to do so? Thanks. --Crazy1880 (talk) 09:03, 29 November 2013 (UTC)

Crazy1880 the edit seems just right to me. -- Magioladitis (talk) 21:43, 18 December 2013 (UTC)Reply
Technically right, yes, but it didn't change or fix anything. I left the user a note. --TMg 13:45, 18 January 2014 (UTC)

Wiktionary and capital letters

  Done

It seems like the tool assumes that all projects capitalize the first letter. That's not true for Wiktionaries, so those links usually point to the wrong entry. 18:54, 22 December 2013 (UTC) — Preceding unsigned comment added by Skalman (talkcontribs)

Skalman, you can request that some errors be deactivated for Wiktionary. -- Magioladitis (talk) 19:33, 22 December 2013 (UTC)
These are not error codes. A page like wikt:word will be linked in the tool as wikt:Word, which is a different, possibly non-existent page. Here's an example page. Skalman (talk) 19:04, 23 December 2013 (UTC)
Skalman the page does not give instructions of how to fix the errors displayed, it only states the error. It is for the user to decide what is correct. Moreover, the tool I use, WP:AWB, does not convert external links to wikilinks. -- Magioladitis (talk) 19:17, 23 December 2013 (UTC)Reply
Skalman, I'm slightly confused (nothing new), but I think I understand what you are saying. The "Here's an example page" is not the example you should have shown. It threw me off for a bit and it looks to have also confused Magioladitis.
A good example is that Checkwiki is reporting a #86 error for wikt:Chad, the error being [[http://www.macmillandictionary.com/new ...]]. The error is actually not in wikt:Chad, but in wikt:chad. I think I know how to fix this problem. With the end-of-year holidays, I'm not sure when I'll get this fixed, but I will leave a message here when I do. Bgwhite (talk) 21:05, 23 December 2013 (UTC)
Bgwhite, you got what I meant. Sorry for being unclear. Wonderful to hear that you're considering fixing this in the near future! Skalman (talk) 17:44, 27 December 2013 (UTC)Reply
Skalman, I think it is fixed. enwiktionary is now producing case-sensitive article names. Bgwhite (talk) 22:20, 28 December 2013 (UTC)
Bgwhite, it looks like it's fixed for enwiktionary. However, all Wiktionaries are case-sensitive, and sv-wikt, where I'm most active, has not been fixed yet (based on this example page). Thanks. Skalman (talk) 18:21, 29 December 2013 (UTC)
Skalman, I was only rerunning enwiktionary to test things out. All other Wiktionaries will update during their next scheduled dump report. For svwiktionary, that will be in next week's Sunday-Tuesday time frame. Bgwhite (talk) 06:11, 30 December 2013 (UTC)

  Done

Error #64 needs to be corrected too. E.g. [[a|A]] is being reported, even though [[A]] does not point to the same page in Wiktionaries. Skalman (talk) 00:11, 7 January 2014 (UTC)Reply

That would be a problem. Will work on it. Bgwhite (talk) 08:05, 7 January 2014 (UTC)Reply
Skalman, the problem has been fixed... hopefully. It will show up in the next dumpfile scan. Bgwhite (talk) 23:07, 8 January 2014 (UTC)Reply
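
The case-sensitivity problem discussed in this thread, including the [[a|A]] report for #64, can be sketched as follows. This is a minimal illustration, not the Checkwiki code: the function names and the boolean flag are hypothetical, and Python's `str.upper()` is only an approximation of how MediaWiki capitalizes a first letter.

```python
def normalize_title(title, first_letter_case_sensitive):
    """Return the title roughly as the wiki would resolve it.

    On wikis where the first letter is not case-sensitive (most Wikipedias),
    "chad" and "Chad" are the same page. On Wiktionaries they are not.
    """
    title = title.replace("_", " ").strip()
    if not title or first_letter_case_sensitive:
        return title
    return title[0].upper() + title[1:]


def is_redundant_piped_link(target, label, first_letter_case_sensitive):
    """True if [[target|label]] points to the same page as [[label]]."""
    return (normalize_title(target, first_letter_case_sensitive)
            == normalize_title(label, first_letter_case_sensitive))
```

Under these assumptions, [[a|A]] is redundant on a Wikipedia-style wiki but not on a Wiktionary, where [[A]] and [[a]] are different entries.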

#89 false positives

  Done

Below I will list some false positives for the #89 error that I have encountered or will encounter (right now there is only one, but I will find more...)

  • {{DEFAULTSORT:UTC-08:30}}

(tJosve05a (c) 18:29, 23 December 2013 (UTC)Reply

Do you mean #6 and not #89? There is no comma for UTC-08:30 to be an error #89. Bgwhite (talk) 21:10, 23 December 2013 (UTC)Reply
Then this is a WPCleaner error that NicoV has to fix when he gets back from holidays. I did not know if this was a WPCleaner error or a CHECKWIKI error, since WPCleaner is supposed to work with the CHECKWIKI rules, but WPCleaner still thought this was a #89 error. -(tJosve05a (c) 23:18, 23 December 2013 (UTC)
Josve05a please report WPC's bug in WPC's bug page and not here. -- Magioladitis (talk) 21:17, 24 December 2013 (UTC)Reply
I did not know (at first) that this was a WPCleaner-error. That was why I reported it on both places. -(tJosve05a (c) 21:18, 24 December 2013 (UTC)Reply

2,5-Dimethoxy-4-chloroamphetamine, 1,4,6-Androstatriene-3,17-dione, 2-Phenyl-3,6-dimethylmorpholine etc. are false positives, since each does have a comma but is not supposed to have a space after it. -(tJosve05a (c) 20:49, 24 December 2013 (UTC)

Yea, I noticed the same false positives. I need to change the code to not report commas surrounded by numbers. Bgwhite (talk) 21:36, 24 December 2013 (UTC)Reply
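
The proposed change (not reporting commas surrounded by numbers) can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and it assumes #89 is meant to flag a comma in the sort key that is not followed by whitespace.

```python
def has_unspaced_comma(sort_key):
    """True if the sort key has a comma with no space after it,
    ignoring commas that sit between two digits (e.g. "2,5-Dimethoxy")."""
    for i, ch in enumerate(sort_key):
        if ch != ",":
            continue
        # Treat the start and end of the key as whitespace.
        prev = sort_key[i - 1] if i > 0 else " "
        nxt = sort_key[i + 1] if i + 1 < len(sort_key) else " "
        if nxt.isspace():
            continue  # "Smith, John" is fine
        if prev.isdigit() and nxt.isdigit():
            continue  # chemical names such as "3,17-dione"
        return True
    return False
```

A key without any comma, such as UTC-08:30, is never reported by this sketch.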

"That's strike one!"

  Resolved

The program WPCleaner detects <small> tags as a #42 error. I believe (from what I can understand) that that error is only there for reporting strike tags and not small tags. It might be a bug in the program or in the CHECKWIKI coding.

  • On Chandra Davis it detects <small>(television)</small> and <small>(singing)</small> as #42-errors.
  • On 1993–94 Atlanta Hawks season it detects <small>(eliminated 2-4)</small> as a #42-error.
  • On Tina Turner discography it detects <small>(with [[Ike Turner]])</small> as a #42-error.
  • On Kathy Kirby it detects <small>[[UK Singles Chart]]</small> as a #42-error.
  • On Detroit Institute of Arts it detects <small>Annual sales estimates reflect free admission for Wayne, Oakland, and Macomb county residents for millage years. Expenditures rise about 1.9% annually for inflation. Investments yield about 3.8% annually.</small> as a #42-error.

(tJosve05a (c) 18:49, 23 December 2013 (UTC)Reply

Josve05a, I've already responded to you about this at Wikipedia talk:WPCleaner#strike vs small. It is a new error to catch strike tags. NicoV is on holiday and is currently unable to update WPCleaner. Bgwhite (talk) 20:46, 23 December 2013 (UTC)

dawiki obs

  Done

2 observations for the wmflabs version:

  • charset in errors #6 and #37: dawiki allows [æøåÆØÅ]
  • not same priority - in high fx: missing #81, #69, #71, #83, #84 (is in middle), from middle #80 (difference between toolserver and wmflabs), some off - #79, #81

--Steenth (talk) 14:57, 7 January 2014 (UTC)Reply

@Steenth: You can change priorities in the translation file (da). Matt S. (talk | cont. | cs) 15:09, 7 January 2014 (UTC)Reply
@Matěj Suchánek: The translation file is okay!! --Steenth (talk) 17:06, 7 January 2014 (UTC)Reply
Steenth, I already have [ÆØÅæøå] entered for #6 and #37. Will look into why it is not working.
The translation file is not OK. There are actually two different settings for each error. One says error_0**_prio_script and the other says error_0**_prio_dawiki. The difference is that one has "script" and the other has "dawiki". I truly do not know what the "script" settings are supposed to do. They predate my involvement. I've been removing the "script" lines for each error and encourage others to do the same. Bgwhite (talk) 19:02, 8 January 2014 (UTC)
Steenth, the problem with errors #6 and #37 has been fixed. It should show up when the next dawiki dump file is scanned. Bgwhite (talk) 22:44, 8 January 2014 (UTC)
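
A per-wiki character allowance like the one discussed here can be sketched as follows. This is a loose illustration under stated assumptions: the dictionary, the function name, and the rule "any non-ASCII letter is special unless the wiki allows it" are hypothetical simplifications, not the actual logic of #6 or #37.

```python
# Hypothetical per-wiki allowances; only the dawiki entry comes from this thread.
ALLOWED_EXTRA_LETTERS = {
    "dawiki": "æøåÆØÅ",
}


def special_letters(text, wiki):
    """Return the non-ASCII letters in text that the given wiki does not allow."""
    allowed = ALLOWED_EXTRA_LETTERS.get(wiki, "")
    return [ch for ch in text
            if ch.isalpha() and not ch.isascii() and ch not in allowed]
```

With this sketch, a title such as "Ærø" yields no special letters for dawiki but two for a wiki with no configured allowance.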

Ever more new errors

  Resolved

If we know of any more errors that can be implemented, list them here.

Josve05a, it is already listed at #Round 2, fourth item in the table. At the moment, I'm not taking new errors, as I've got to implement the ones already listed. Bgwhite (talk) 18:43, 8 January 2014 (UTC)
Oh, I did not even see that this was already listed. OK. It was just a suggestion for the future. -(tJosve05a (c) 18:44, 8 January 2014 (UTC)

Daily scan

  Resolved

Moin Moin Bgwhite, since the update to wmf10 the daily scan for new "errors" isn't running. Can you check this, please? Thank you and regards --Crazy1880 (talk) 18:22, 17 January 2014 (UTC)

Crazy1880, labs has been having problems the past few days. I know they had a network outage today (17th). enwiki hasn't started or didn't come close to completing either the past few days. A look at frwiki shows it didn't run yesterday and only partially today. Bgwhite (talk) 07:04, 18 January 2014 (UTC)Reply
Moin Bgwhite, for me it's important that you confirm what I observed. Does anybody know when it will be fixed? --Crazy1880 (talk) 09:41, 18 January 2014 (UTC)
Crazy1880, I haven't a clue when things will be fixed. Labs doesn't share much information on what is happening. I only knew about today's network outage because it was posted to the mailing list. I know they are going to physically relocate the computers to a new location and then they will fix some outstanding problems to the database machines. But, I don't know when that will be happening. Bgwhite (talk) 09:52, 18 January 2014 (UTC)Reply

Disable #6 and #37 on svwiktionary

  Resolved

We specifically use DEFAULTSORT with special characters in order to put pages in our preferred order.

To clarify: [1] and [2] don't make sense for us. Skalman (talk) 23:28, 6 January 2014 (UTC)Reply

Skalman, a lot of Swedes don't make sense either, but we don't delete them... yet. :)
Svwiktionary doesn't have a translation page. The page is how you can customize which errors to turn off and on. Josve05a, could you copy the svwiki translation file over to svwiktionary, and you two wacky Swedes can customize it. Yell if you need help. When you are done, tell me where it is located, so I can add it to the programs. Bgwhite (talk) 08:04, 7 January 2014 (UTC)
Bgwhite, if you show me the svwiki customization page, I can copy it and try to customize it myself. Of course, if Josve05a wants to help out, that's appreciated. Skalman (talk) 11:40, 7 January 2014 (UTC)
Bgwhite, Skalman, the translation file can now be found here. -(tJosve05a (c) 11:55, 7 January 2014 (UTC)Reply
Josve05a, thanks. Bgwhite, I moved the page here to go better with our other project pages. Skalman (talk) 12:01, 7 January 2014 (UTC)Reply
Skalman, the translation page is in the database and the changes are on the web page. Some errors are probably changed. Instead of the defaults, the settings are using the translation page. So, change the page as you see fit... add or delete errors, change error priorities or change text. Bgwhite (talk) 00:12, 8 January 2014 (UTC)Reply

Error #16 and new Unicode checks

  Resolved

NicoV, TMg, Josve05a, Matěj Suchánek and Kwami

New Unicode control characters and the entire Private Use Areas (PUA) are now being checked for enwiki only.

  • Currently only U+200E and U+FEFF control characters are checked for all wikis.
  • U+200B, U+2028, U+202A and U+202C control characters are checked for only enwiki.
  • All Private Use Area characters (\p{Co}) are checked for enwiki only.

I'm not a Unicode expert, nor do I understand some things. Magioladitis knows more about this. Should any of the new control characters be ported to other wikis? Bgwhite (talk) 21:35, 17 January 2014 (UTC)

Thanks, BG. Will the report say what the PUA characters are, so we can address them manually? — kwami (talk) 21:43, 17 January 2014 (UTC)Reply
FYI for everybody.... A list for enwiki can be found here. Any PUA characters are labeled as {PUA}, but that can be changed to the actual Unicode value. Bgwhite (talk) 22:04, 17 January 2014 (UTC)Reply
Magioladitis told me and I started digging into it (English). The problem is, most of these characters do have a meaning depending on the context and are crucial in some languages. This is especially true for U+200B which is used as a plain character (not encoded as &#x200B;) in some Asian Wikipedias. It should only be reported if the local community agrees it is an error. I think the same is true for U+202A and U+202C. Not sure about U+2028. Do you have examples? The PUA is different. It's clearly an error to use characters that have a different meaning depending on what operating system or software you are using. These should be reported in all languages. Similar to the Windows stuff in U+007F to U+009F which is also clearly an error. --TMg 23:44, 17 January 2014 (UTC)Reply
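
The checks listed at the top of this thread can be sketched as follows. This is a minimal sketch, not the Checkwiki implementation: the function name, the two set names, and the report format are hypothetical. The code points and the enwiki-only split are taken from the list above, and `unicodedata.category(ch) == "Co"` is the standard-library equivalent of the `\p{Co}` test.

```python
import unicodedata

# Split taken from the list above; the names are hypothetical.
CHECKED_ON_ALL_WIKIS = {"\u200e", "\ufeff"}
CHECKED_ON_ENWIKI_ONLY = {"\u200b", "\u2028", "\u202a", "\u202c"}


def find_suspect_characters(text, wiki):
    """Return (offset, code point, kind) for each reported character."""
    controls = set(CHECKED_ON_ALL_WIKIS)
    check_pua = wiki == "enwiki"
    if wiki == "enwiki":
        controls |= CHECKED_ON_ENWIKI_ONLY
    hits = []
    for offset, ch in enumerate(text):
        if ch in controls:
            hits.append((offset, "U+%04X" % ord(ch), "control"))
        elif check_pua and unicodedata.category(ch) == "Co":
            # General category Co covers all Private Use Area characters.
            hits.append((offset, "U+%04X" % ord(ch), "PUA"))
    return hits
```

Reporting the code point rather than a generic label makes it easier to address the characters manually, which is what kwami asks about above.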

I've gone through the PUA to Cao Hong, maybe 30% of the total. This is quite manageable. There are very few that are intentional, and most of those deal specifically with assignments to the PUA (such as the Apple logo). Those can be substituted with &#x...; and tagged with {{PUA}} for future maintenance. Some are stray characters which can just be deleted. PUA within text is almost always due to copying and pasting. Often the original can be found by doing a Gsearch of the surrounding text and corrected. In relatively few cases do we need to alert someone familiar with the article to fix. Of the articles I reviewed (up to Cao Hong in BG's sandbox list), I skipped emoji as too much work, and left notes on the talk pages of IBM 1620 and Sakya. Multiply that by 3 or 4 and we really don't have much work to do, and once we take care of the backlog, it should be easy to keep up with the dump. — kwami (talk) 00:51, 18 January 2014 (UTC)Reply

Okay, I think I've reviewed/fixed them all. Probably missed a couple. Left the Mongol alone. — kwami (talk) 06:10, 18 January 2014 (UTC)Reply
I've gone through all the other characters. I've seen nothing that could stay. Let's see how many will be produced this month. -- Magioladitis (talk) 01:55, 18 January 2014 (UTC)

Can't find the PUA in Nay Toe.

The Inner Mongolian govt and publishers use PUA rather than Unicode for classical Mongolian script, so we may want to handle these separately. We'd want to embed a supporting font in WP at least. But Mongolian WP uses Cyrillic, so it shouldn't be a problem to scan WP-mn for PUA. — kwami (talk) 02:28, 18 January 2014 (UTC)Reply

Sakya's been fixed. Ask user:BabelStone to convert Tibetan PUA. — kwami (talk) 06:32, 18 January 2014 (UTC)Reply

Error #95

  Resolved

  1. It seems it checks for English "User:" only and does not recognize localizations and aliases, e.g. "Benutzer:" and "Benutzerin:" in the German Wikipedia.
  2. In the German Wikipedia some maintenance templates are designed to be used in the article (instead of the talk page). For example, {{Liste|Reason. --[[User:Example]]}} is allowed in an article in the German Wikipedia. Do you think it's possible to add an "allow user signatures in whitelisted templates" feature? I'm not sure if it's worth the trouble. Maybe it's easier to disable the error in dewiki.

--TMg 18:20, 27 January 2014 (UTC)Reply

Neither it nor any of the other new errors should have been active on dewiki. It is now off.
Whitelists are for individual articles.
Templates will contain individual wiki's name for "User:" Bgwhite (talk) 18:47, 27 January 2014 (UTC)Reply
Hi Bgwhite, do you mean that I should add the following lines in frwiki translation file?
error_095_templates_frwiki=
  Utilisateur
  Utilisatrice
  Discussion Utilisateur
  Discussion Utilisatrice
  Discussion utilisatrice END
Is it the correct syntax ? (no ":", no "User" even if it's a possible name, ...) --NicoV (Talk on frwiki) 14:07, 4 February 2014 (UTC)Reply
At the moment, no. We talk everywhere, but I mentioned someplace I was going to get the names thru the API. So, there should be no reason to add anything to #95. The API does return all 5 "users" that you mentioned. Bgwhite (talk) 18:12, 4 February 2014 (UTC)Reply
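
Matching localized user namespace names, as raised in point 1 of this thread, can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical and the name list is hard-coded for the example, whereas Bgwhite's plan is to obtain the names through the API.

```python
import re


def build_user_link_regex(namespace_names):
    """Build a regex that matches the start of a link such as [[Benutzer:...]].

    Names are matched case-insensitively, and spaces in a name may also be
    written as underscores.
    """
    alternatives = [
        r"[ _]+".join(re.escape(part) for part in name.split())
        for name in namespace_names
    ]
    return re.compile(r"\[\[\s*(?:%s)\s*:" % "|".join(alternatives),
                      re.IGNORECASE)


# Hypothetical hard-coded list mixing names mentioned in this thread.
USER_LINK = build_user_link_regex(
    ["User", "Benutzer", "Benutzerin", "Discussion Utilisateur"]
)
```

For example, `USER_LINK.search("Reason. --[[Benutzerin:Example]]")` finds a match, while a link to another namespace does not.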

"Break tag with incorrect syntax" - is not incorrect

  Resolved

Regarding edits like this - they are unnecessary. The <br /> tag is perfectly valid HTML 5, and indeed, HTML Tidy converts all <br> to <br /> when a Wikipedia page is served. --Redrose64 (talk) 21:44, 28 January 2014 (UTC)Reply

Hi Redrose64, it wasn't <br /> but <br/ > and I believe they are incorrect (not 100% sure that whitespace is accepted between "/" and ">"). --NicoV (Talk on frwiki) 22:04, 28 January 2014 (UTC)
OK, I didn't spot that the space was after the slash. This doc isn't perfectly clear on where spaces are optional, although it is clear on the places where they are mandatory (before each attribute). --Redrose64 (talk) 23:01, 28 January 2014 (UTC)Reply
If I read correctly the description in "Start tags", they don't mention any space between 6. (the "/") and 7. (the ">"), so I believe they are forbidden there. --NicoV (Talk on frwiki) 07:23, 29 January 2014 (UTC)Reply
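
The distinction drawn in this thread can be sketched as follows. This is a hedged illustration, not the Checkwiki or WPCleaner rule: it assumes NicoV's reading that no whitespace may appear between "/" and ">", and the pattern and function names are hypothetical.

```python
import re

# Anything that looks like a br tag, including malformed variants.
BR_LIKE = re.compile(r"<[\s/\\]*br\b[^<>]*>", re.IGNORECASE)

# Accepted forms: <br>, <br/>, <br />, optionally with attributes.
VALID_BR = re.compile(r"<br(\s+[^<>/]*)?\s*/?>", re.IGNORECASE)


def malformed_br_tags(text):
    """Return every br-like tag that is not one of the accepted forms."""
    return [m.group(0) for m in BR_LIKE.finditer(text)
            if not VALID_BR.fullmatch(m.group(0))]
```

Under these assumptions, <br /> is accepted, while <br/ > (space after the slash) and </br> are reported.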

WPCleaner and new errors

  Resolved

Hi,

I'm just starting this thread to be sure I'm not missing anything I need to do in WPCleaner to be coherent with the recent changes in Check Wiki. Feel free to edit directly the list below. --NicoV (Talk on frwiki) 10:42, 31 January 2014 (UTC)Reply

  •   Done #01 - Template with the useless word "template" (previous error #502 renumbered)
  •   Done #04 - HTML text style element <a> (previous error #519 renumbered)
  •   Done #16 - Unicode control characters (complete refactoring of the detection/fix, including adding U+2028=Line separator, U+202A=Left-to-right embedding, U+202C=pop directional formatting, and Private Use Areas)
  •   Done #42 - HTML text style element <strike> (previous error #517 renumbered)
  •   Done #62 - URL containing no http:// (old error removed, new error added)
  •   Done #89 - DEFAULTSORT with no space after the comma (old error removed, new error added)
  •   Done #90 - Internal link written as an external link (previous error #511 renumbered)
  •   Done #91 - Interwiki link written as an external link (previous error #512 renumbered)
  •   Done #93 - External link with double http:// (new error added)
  •   Done #94 - Reference tags with no correct match (new error added)
  •   Done #95 - Editor's signature or link to user space (new error added)
  •   Done #96 - TOC after first headline (new error added)
  •   Done #97 - Material between TOC and first headline (new error added)
As far as I know, WPCleaner can now detect all the new errors in Check Wiki. Tell me if you see any discrepancy. --NicoV (Talk on frwiki) 15:56, 15 February 2014 (UTC)Reply

#96 and #97: syntax for _templates_ parameter

  Resolved

Hi Bgwhite,

Errors #96 and #97 have a _templates_ parameter in Wikipedia:WikiProject Check Wikipedia/Translation. How are the parameters used? For example, ABP is detected by #96 because there's {{toc right}} (lowercase) in it, but in the parameter, there's only "TOC[ ]+right" (uppercase). --NicoV (Talk on frwiki) 15:29, 12 February 2014 (UTC)

NicoV, having so many redirects is plain evil. There's *only* 11 redirects for {{TOC right}}. I lowercase everything. I lower case the parameter from the translation file and the article's text. For the majority of things, I lowercase everything to do a search. Bgwhite (talk) 19:18, 12 February 2014 (UTC)Reply
NicoV, I've found out that not all TOCs are created equal. I've removed {{Compact ToC}} and {{TOC index}} from the Translation file because " it does not contain a heading." Bgwhite (talk) 09:31, 13 February 2014 (UTC)Reply
Ok Bgwhite. Is the "regular expression" ([ ]+) necessary in the _templates_ parameter ? There are no regular expressions in templates list for other errors (#3, #28). --NicoV (Talk on frwiki) 13:37, 13 February 2014 (UTC)Reply
Technically, no it is not necessary. I could add a template with a space and one without. There is only one template in #3 with a space and the template is actually a redirect. In #28, none of the templates listed have redirects. #96 and #97 are the only ones that have templates with a space and a redirect without a space. Bgwhite (talk) 19:14, 13 February 2014 (UTC)Reply
Ok. I've coded #96 and #97 to remove the [ ]+ and do a simple template name comparison in WPCleaner. --NicoV (Talk on frwiki) 12:53, 16 February 2014 (UTC)Reply
NicoV, I don't like that. I didn't know you had to code around that solution. I'll change it to remove the [ ]+. You shouldn't have to change anything when I can just add the templates twice. It should be changed on my end and not yours. Bgwhite (talk) 07:13, 17 February 2014 (UTC)
Ok, great, better for me! --NicoV (Talk on frwiki) 07:45, 17 February 2014 (UTC)Reply
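
The outcome of this thread (lowercase both sides, then do a simple template name comparison with each spelling listed explicitly) can be sketched as follows. The function names and the example list are hypothetical, and this is not the Checkwiki or WPCleaner code.

```python
import re

# The name of a template call: text after "{{" up to the first "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def normalize(name):
    """Lowercase a template name and collapse underscores and extra spaces."""
    return " ".join(name.replace("_", " ").split()).lower()


def first_listed_template(text, listed_names):
    """Return the first template in text whose name is in listed_names."""
    wanted = {normalize(name) for name in listed_names}
    for match in TEMPLATE_NAME.finditer(text):
        if normalize(match.group(1)) in wanted:
            return match.group(1)
    return None
```

Listing "TOC right" and "TOCright" as two plain entries then covers both spellings without any regular expression in the parameter.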

WMFLabs out (again)

  Resolved

Hi, bug opened about WMFLabs being completely out again. --NicoV (Talk on frwiki) 12:51, 16 February 2014 (UTC)Reply

#3 and list of templates for <references/>

  Resolved

Hi, it seems that #3 doesn't take into account the list of templates that can be used instead of <references>. On frwiki, the full scan has just run, and we end up with 400k articles listed in #3. I checked the first one in the list fr:!!! which hasn't been modified for months, and has {{références}} at the end of the article --NicoV (Talk on frwiki) 17:00, 2 March 2014 (UTC)Reply

NicoV, fixed. Bgwhite (talk) 23:29, 2 March 2014 (UTC)Reply
Thanks! --NicoV (Talk on frwiki) 09:12, 4 March 2014 (UTC)Reply

#13 with a slight issue

  Done

The checkup for <math>-tags should disregard programming-tags like <math.h> header library that are mentioned in several articles. --StreifiGreif (talk) 16:37, 3 March 2014 (UTC)Reply

StreifiGreif Ok, I'll add a fix. Will tell you when it is in. Bgwhite (talk) 19:41, 3 March 2014 (UTC)Reply
The fix is in the Checkwiki program.
StreifiGreif and NicoV. I've switched ordering of some checks. Checking math tags now goes after checking for source, code and syntaxhighlight tags. When checking for these three tags, the program removes any material in between the tags to allow Checkwiki not to check the material for errors. <math.c> tags should be between these three tags. There might be some unintended consequences, so give a yell if you see problems. Bgwhite (talk) 20:21, 3 March 2014 (UTC)Reply
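
The reordering described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and it assumes the math check looks for opening math tags without a matching closing tag.

```python
import re

# Content of source, code and syntaxhighlight elements is dropped first.
CODE_BLOCKS = re.compile(
    r"<(source|code|syntaxhighlight)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_code_blocks(text):
    """Remove code-like elements so their content is not checked."""
    return CODE_BLOCKS.sub("", text)


def has_unclosed_math(text):
    """True if there are more opening math tags than closing ones,
    counted after code-like elements have been removed."""
    text = strip_code_blocks(text)
    opens = len(re.findall(r"<math\b[^>]*>", text, re.IGNORECASE))
    closes = len(re.findall(r"</math\s*>", text, re.IGNORECASE))
    return opens > closes
```

In this sketch, a header name such as <math.h> written inside code tags is removed before the math check runs, so it is no longer counted as an opening math tag.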

Error #90 and redirect=no

  Resolved

Hi, on frwiki, #90 is detecting fr:Diplomatie (jeu) because of [http://fr.wikipedia.org/wiki/Allan_B._Calhamer?redirect=no Allan B. Calhamer]. Should it be detected? Is there a wiki syntax that can be used to convert this external link into an internal link? --NicoV (Talk on frwiki) 12:13, 27 February 2014 (UTC)Reply

NicoV, I don't recall seeing this before. Frescobot just fixed any #90 and #91 errors. We then went thru what was left and either fixed them manually or added them to the whitelist. I did #91. Magioladitis did #90 and maybe he came across some.
I started a dump scan and searched for "redirect=no". There are quite a few articles. I checked some, and "redirect=no" was either in a non-Wikipedia external link or in a comment. All the comments were the same; an example is in Antler.
Unless there are more than a few isolated cases, I'm inclined to just add it to the whitelist. Bgwhite (talk) 08:29, 28 February 2014 (UTC)Reply

I did not come across any articles with redirect=no. -- Magioladitis (talk) 08:32, 28 February 2014 (UTC)Reply

Ok, thanks for the answers. Since it seems to be an isolated case, I will use the whitelist. --NicoV (Talk on frwiki) 08:52, 28 February 2014 (UTC)Reply
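
One way to treat such links is sketched below: convert an external link to the local wiki into an internal link only when the URL carries no query string. This is an assumption made for illustration, not how Checkwiki or Frescobot works, and the function name is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def internal_link_for(url, wiki_host):
    """Return a wikilink for a plain article URL on wiki_host, else None."""
    parts = urlsplit(url)
    if parts.netloc.lower() != wiki_host.lower():
        return None  # not a link to this wiki
    if not parts.path.startswith("/wiki/"):
        return None
    if parts.query:
        return None  # e.g. ?redirect=no: this sketch declines to convert
    title = unquote(parts.path[len("/wiki/"):]).replace("_", " ")
    return "[[%s]]" % title
```

Links that return None here would be left for manual review or the whitelist, which matches the decision reached in this thread.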

WMFLabs problem - Dump files not being processed

The twice monthly dump files are not being processed at the moment. WMFLabs has a problem with mounting various directories, including where the dumps are located. Problems have been going on for a few days. A bug report has been filed, but no action or acknowledgement of the bug report has happened. So, unknown when this will be fixed. Bgwhite (talk) 21:57, 21 January 2014 (UTC)Reply

Any status update? A link to the bug report? Is there any way to manually update? Skalman (talk) 21:04, 28 January 2014 (UTC)Reply

Template categorization

Greetings Wikipedia checkers! I have a question.

Over at the village pump I'm talking to people about the feasibility of cleaning up all the copy-and-pasted comments in template documentation that derive from {{Documentation/preload}}. My reasoning is that they cause clutter and represent a low-quality form of documentation that can't be updated easily. Some editors have suggested that they're necessary to prevent inexperienced template editors from including template categories directly in templates, when our standard procedure is to place them in <includeonly> blocks on template documentation pages. I think that this is not enough of a problem to merit thousands of copies of the same string of text being pasted into templates. Fixing occurrences of it is a task completely suited to a bot such as the ones you operate. What would you say about the feasibility of adding that as a task? My thinking is that the logic would be something like:

  • An edit added a category to a non-documentation template (name doesn't end in /doc)
  • Does it have a documentation template?
    • Yes: move the category to the documentation template
    • No: leave it as is

That doesn't strike me as being particularly complex by the standards of your project. If you think that it is a reasonable goal, that would be just great. Ideally, I'd like to rewrite the template documentation documentation template (try saying that five times in a row) to better explain how template categories should work, and then commission a one-off bot run to clean out all the variants of the copy-and-pasted comments.

What do you think? Thanks, — Scott talk 13:42, 23 January 2014 (UTC)Reply
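
The decision logic proposed in this request can be sketched as follows. This is a planning function only, with hypothetical names: it does not edit anything, and it assumes a documentation page is always the template title plus "/doc".

```python
def plan_category_move(template_title, added_categories, existing_pages):
    """Decide where categories added to a template should go.

    Returns a dict describing the move, or None to leave things as they are.
    """
    if template_title.endswith("/doc") or not added_categories:
        return None  # the edit was to a documentation page, or added nothing
    doc_title = template_title + "/doc"
    if doc_title not in existing_pages:
        return None  # no documentation page: leave the category in place
    return {
        "remove_from": template_title,
        "add_to": doc_title,
        "categories": list(added_categories),
    }
```

A bot built on this would still need to place the moved categories inside the includeonly block of the documentation page, which the sketch leaves out.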

Invitation to User Study

Would you be interested in participating in a user study? We are a team at the University of Washington studying methods for finding collaborators within a Wikipedia community. We are looking for volunteers to evaluate a new visualization tool. All you need to do is prepare your laptop/desktop, web camera, and speaker for video communication with Google Hangout. We will provide you with an Amazon gift card in appreciation of your time and participation. For more information about this study, please visit our wiki page (http://meta.wikimedia.org/wiki/Research:Finding_a_Collaborator). If you would like to participate in our user study, please send me a message at Wkmaster (talk) 13:07, 18 February 2014 (UTC).

Checkwiki is down - February 5

The powers that be are in the process of moving everything at WMFLabs to a new data center. Checkwiki's move barfed. Checkwiki will be down until things get fixed. Bgwhite (talk) 09:32, 5 March 2014 (UTC)Reply

Checkwiki should be up now. Bgwhite (talk) 23:15, 5 March 2014 (UTC)Reply

Mismatched sub and sup tags

  Done

@Salix alba:, @NicoV:, @Magioladitis:

Salix alba asked a question about mismatched <sub> and <sup> tags. He was guessing there are ~4,000 articles with problems. After doing a scan, he is wrong. There are 7,096 articles from February's dump file. Examples are:

Looking at the source code of the rendered web pages, it appears the MediaWiki software does convert the mismatched tags to the correct value. However, there are around 400 articles where there are broken or missing tags, and this does cause rendering problems.

However, the majority of problems come at the end of a table cell where it doesn't do damage.


Should this be added to Checkwiki? AWB doesn't currently warn or fix the problem, not sure about WPCleaner. Should these be added to AWB and/or WPCleaner? Bgwhite (talk) 08:25, 27 February 2014 (UTC)Reply

I think this could be added to Checkwiki. I will add it to WPCleaner when I've managed to reduce the current backlog... --NicoV (Talk on frwiki) 09:12, 27 February 2014 (UTC)Reply

Bgwhite, I could fix the <sup/> and <sub/> if someone gives me the list. -- Magioladitis (talk) 09:36, 27 February 2014 (UTC)

Yes, I agree with the number of broken articles. I have a list at User:Salix alba/subsup. The earlier prediction was based on a scan of just one of the database dump files and assumed roughly the same number for each dump file; however, later dump files seem to have higher error rates. There may be a few false positives: I've found some pages which have <sup id="foo">ref</sup> or a style attribute, which breaks my simple test. There seem to be a couple of different errors, e<sup>x</sub> and e<sub>x</sup>. In all the cases I've looked at, it's the first tag which is correct, so these could probably be auto-corrected. There is also a bunch of cases where there is just one tag, say a single <sup> or </sub> alone. Sports articles seem to have a lot of these. It seems fine to just strip these tags completely. Line-by-line checks seem to be OK, as I've never seen them span multiple lines.
There is a related Bugzilla report, T63011. The problem first emerged because VE/Parsoid and the standard renderer treat things differently. Parsoid uses an HTML5 treebuilder, which has a different recovery algorithm.--Salix alba (talk): 10:00, 27 February 2014 (UTC)
Salix alba, so... Parsoid does not automagically fix the mismatched sub/sup tags as HTML Tidy currently does. If I understand the bug report correctly, it won't be "fixed" at all in Parsoid. If this is true, I'll have Checkwiki check for this. It is a simple copy/paste to add it into Checkwiki, so it will be ready before the next dump. When AWB and/or WPCleaner adds in functionality to fix it, a bot run should happen to fix the problems. Do you have some links for HTML5 treebuilder? It would be interesting to read up on it and see what else it does/doesn't do.
Magioladitis, User:Bgwhite/Sandbox contains cases of <sup/> and User:Bgwhite/Sandbox1 contains <sub/>.
I don't have a way to report cases of missing tags, but I do find them. After a bot run is done to fix mismatched tags, what's left contains cases of missing tags. I was estimating 150 articles that have missing tags, but from what Salix alba wrote, it looks to be higher. Bgwhite (talk) 20:58, 27 February 2014 (UTC)Reply

Bgwhite I fixed everything in the two given lists. -- Magioladitis (talk) 22:01, 27 February 2014 (UTC)Reply

Gwicke might be the person to ask about parsoid/treebuilder. As I understand it parsoid transforms wikitext in to an annotated form of html which is then passed to VisualEditor which is a html rather than wikitext editor. The algorithm it uses to do the transformation is different from the standard wikitext to html converter. In particular it transforms A<sup>-1</sub> normal text. into A<sup>-1 normal text.</sup>, discarding the </sub> and fixing things by adding a </sup> at the end of the line. You can see the effect at Divergent series in the Zeta function regularization section at the end.--Salix alba (talk): 23:16, 27 February 2014 (UTC)Reply
OK.... NicoV, I'll add <sub> as #98 and <sup> as #99. Magioladitis, can you do a bot run to fix the mismatched tags now or will it be better to wait till a fix is put into AWB? I'll get you the lists if you can do it now. Bgwhite (talk) 00:10, 28 February 2014 (UTC)Reply

Bgwhite how is AWB supposed to fix this? In case of mixed tags (for instance <sup>50</sub>) how do we know which is the correct one? -- Magioladitis (talk) 06:56, 28 February 2014 (UTC)Reply

Magioladitis, Salix alba said up above, "... in all the cases I've looked at it's the first tag which is correct, and could probably be auto corrected." I'd have to agree simply because the first tag is what renders on the web page. If an editor meant the second tag, we aren't breaking anything if we go with the first; it will still look the same. Bgwhite (talk) 07:03, 28 February 2014 (UTC)Reply

Bgwhite rev 9957 added fix for bad sup/sub tags. -- Magioladitis (talk) 06:57, 28 February 2014 (UTC)Reply

These don't seem to be strong enough, and miss most of the existing cases. I've been running AWB with the regexp <sup>([^<]*)</sub> replaced by <sup>$1</sup>, and similar for <sub>. So far it's 174 edits without problems.--Salix alba (talk): 08:03, 1 March 2014 (UTC)Reply
Salix alba, rev 9957 is for cases of <sup/> and <sub/>. Magioladitis still has to add the rest. I'll look at the regex and if things look ok, I'll do a bot run on them. Bgwhite (talk) 08:23, 1 March 2014 (UTC)Reply
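The "first tag wins" replacement discussed above can be sketched in a few lines. This is only an illustration of the regex idea from this thread, not the code in AWB rev 9957 or in Checkwiki; the function name is made up for the example. Tags carrying attributes (such as <sup id="foo">) are deliberately not matched, so they are left alone.

```python
import re

# Hypothetical helper for illustration; not AWB's or Checkwiki's actual code.
# Matches a plain <sup> or <sub> opening tag, text without any "<",
# then a closing </sup> or </sub> (matching or not).
_SUP_SUB_PAIR = re.compile(r"<(sup|sub)>([^<]*)</(sup|sub)>")


def fix_mismatched_sup_sub(text: str) -> str:
    """Make the closing tag agree with the opening tag (first tag wins)."""
    def repl(match: re.Match) -> str:
        opening = match.group(1)
        return f"<{opening}>{match.group(2)}</{opening}>"

    return _SUP_SUB_PAIR.sub(repl, text)
```

Because the pattern refuses to cross a "<", nested markup and stray single tags are not touched; those would still need the manual review mentioned above.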

Bgwhite rev 9958 added fix for bad center tags. We already had fix for bad small tags. -- Magioladitis (talk) 22:41, 28 February 2014 (UTC)Reply

Bgwhite, Rjwilmsi alerts for unclosed <math>, <source>, <ref>, <code>, <nowiki>, <small>, <pre> or <gallery> tags and comments. Should we update it for sub/sup tags? -- Magioladitis (talk) 22:54, 28 February 2014 (UTC)Reply

#98 and #99 added in WPCleaner, and errors configured on frwiki. --NicoV (Talk on frwiki) 07:27, 1 March 2014 (UTC)Reply
Geez, you take a day and Magioladitis will take weeks. I'm starting to think I WikiMarried the wrong editor. Magioladitis just goes to the beach and looks at the pretty girls. He never spends time with me anymore... Bgwhite (talk) 08:23, 1 March 2014 (UTC)Reply
:-) it was an easy one, just copy paste #13. I did the minimum, I still have to add meaningful suggestions.
On the other hand just updating existing regular expressions in AWB's code won't work for those two tags. -- Magioladitis (talk) 14:35, 1 March 2014 (UTC)Reply

rev 9959 to fix more of <sup/>, </sup/> etc. -- Magioladitis (talk) 09:37, 1 March 2014 (UTC)Reply

False positives for #3

  Done

Hi, it seems that #3 detects a lot of false positives: 179 pages were detected during tonight's scan, and when I checked the first 4 articles (fr:Abdallah Naaman, fr:Adda Daouéni, fr:Adrien de Pauger, fr:Agriculture étrusque), they all had a <references /> through {{references}} (which is one of the templates for references). --NicoV (Talk on frwiki) 01:48, 13 March 2014 (UTC)Reply

NicoV, I haven't a clue. Everything looks good. I run the program manually with the 4 articles and I don't get an error on WMFLabs or my laptop. Remind me after tomorrow's run. Bgwhite (talk) 07:23, 13 March 2014 (UTC)Reply
Bgwhite, same problem with similar articles (fr:Aghribs, fr:Akibani, fr:Aldrien, ...), they all use the same {{references}} template. --NicoV (Talk on frwiki) 04:41, 14 March 2014 (UTC)Reply
Grrrr. This is not going to be fun to figure out.
Bgwhite, any luck finding something? Articles using {{references}} keep appearing on frwiki list. --NicoV (Talk on frwiki) 15:55, 22 March 2014 (UTC)Reply
NicoV, I usually code toward the end of the month. I'm about done fixing all the problem articles for #97 and then I'll start Checkwiki when I'm done. Bgwhite (talk) 21:00, 22 March 2014 (UTC)Reply

Yobot and "See also"

  Resolved

Yobot keeps on changing "Related topics" to "See also"...sorry, Related topics isn't wrong and no policy discourages the use of that section title, no matter how many times Yobot persists to change it.--ColonelHenry (talk) 18:54, 24 March 2014 (UTC)Reply

ColonelHenry, this is not a topic for Check Wikipedia. Check Wikipedia finds errors and does not correct them. This is also not an error it finds. You need to bring it up at Wikipedia talk:AutoWikiBrowser as AWB does the substitution. However, per WP:ORDER and MOS:SEEALSO, "See also" is the approved name and not "Related topics". Yobot is following MOS. As every other page on Wikipedia also uses "See also", readers have to expect "See also" and know what that means. Bgwhite (talk) 20:22, 24 March 2014 (UTC)Reply
  • Bgwhite just because MOS:SEEALSO says "The most common title for this section is 'See also'" doesn't mean it is the only title. Nothing says "see also" is the only "approved name". FYI, if you go back in time, the MOS:SEEALSO section used to be named "See also" and "Related topics" sections, and I direct you to this page: [3]. Thanks for directing me to AWB.--ColonelHenry (talk) 21:39, 24 March 2014 (UTC)Reply

Wondering about ID#84

  Done

Hi, I saw that, at least for the German WP, there's a huge list for ID#84. But on virtually all of these pages this is because of captions that are commented out with <!-- and -->. The problem is that often the author did not put the opening comment tag on the same line as the caption, or commented out multiple captions, so the second and following ones are missing "their" opening tag. Any chance of getting a workaround for that? --StreifiGreif (talk) 17:37, 7 March 2014 (UTC)Reply

StreifiGreif Known problem. I did have a fix for it and was in the code. The fix ended up causing a problem on a few sites. It caused the checkwiki program to crash. I'll look at it again in a few weeks. Bgwhite (talk) 21:52, 7 March 2014 (UTC)Reply
StreifiGreif, this should be fixed now. Bgwhite (talk) 07:39, 26 March 2014 (UTC)Reply

<includeonly>...</includeonly> and #48

  Done

Hi, should we detect #48 (internal links to the title) when they are inside <includeonly>...</includeonly> tags ? On frwiki, all articles in fr:Catégorie:Effectif actuel de franchise de la LNH are included in other articles, so they have a link to themselves inside a <includeonly>...</includeonly>. --NicoV (Talk on frwiki) 08:43, 13 April 2014 (UTC)Reply

Magioladitis, do you have an answer? Bgwhite (talk) 21:11, 14 April 2014 (UTC)Reply
NicoV My answer is that we should not fix them. AWB right now won't fix 48 in a page that has noinclude/includeonly even when the 48 error is outside the area. I would like to fix 48 errors when they are outside the includeonly tags because many pages contain empty includeonly tags, or sometimes they are the result of a copy-pasted navbox/infobox. -- Magioladitis (talk) 05:02, 15 April 2014 (UTC)Reply
Bgwhite,Magioladitis I agree about not fixing them, so maybe we should not detect them also ;-) ? I've modified WPCleaner so that it still detects them everywhere (to be coherent with Labs), but it doesn't suggest to fix them when they are inside includeonly tags (I don't check if there are noinclude/includeonly tags somewhere else). --NicoV (Talk on frwiki) 08:27, 15 April 2014 (UTC)Reply

Done. Bgwhite (talk) 21:21, 18 April 2014 (UTC)Reply
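For readers following the #48 discussion, here is a minimal sketch of reporting self-links only when they sit outside <includeonly>...</includeonly>. It is an assumption-laden illustration, not what Checkwiki, AWB or WPCleaner actually run: the function name is invented, and it ignores first-letter case folding and underscores in titles.

```python
import re

# Hypothetical helper for illustration only.
_INCLUDEONLY = re.compile(
    r"<includeonly>.*?</includeonly>", re.DOTALL | re.IGNORECASE
)


def self_links_outside_includeonly(text: str, title: str) -> list:
    """Return self-links ([[Title]] or [[Title|label]]) found outside includeonly."""
    # Drop everything that is only rendered on transclusion.
    visible = _INCLUDEONLY.sub("", text)
    link = re.compile(r"\[\[\s*" + re.escape(title) + r"\s*(?:\|[^\]]*)?\]\]")
    return link.findall(visible)
```

With this approach an empty <includeonly></includeonly> pair elsewhere on the page does not stop a genuine self-link from being reported, which is the case Magioladitis raises above.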

CHECKWIKI #81

  Resolved

Why is #81 off for enwp, has there been a discussion in the past which I was not a part of or...why? (tJosve05a (c) 00:01, 15 April 2014 (UTC)Reply

From what I can find at this latest discussion here I can not see there being consensus for turning off #81
The "latest discussion" was about removing errors. #81 was never removed, it was turned off on enwiki. It was turned off 4-6 months ago. I can't remember the number, but there were over 20,000 articles with no hope of them being taken care of. It's also technically not an error. Bgwhite (talk) 04:57, 15 April 2014 (UTC)Reply
Bgwhite what does this error exactly mean? I thought it was about having a reference list twice. -- Magioladitis (talk) 05:07, 15 April 2014 (UTC)Reply
Magioladitis, no, that is error #78. #81 was if there were two identical references in an article. AWB would only fix a small subset of the errors. Bgwhite (talk) 05:12, 15 April 2014 (UTC)Reply
Bgwhite true. AWB will only fix pages that already have a multiple reference once. -- Magioladitis (talk) 05:15, 15 April 2014 (UTC)Reply

Whitespace and #67

  Done

Hi, it seems that #67 is detected only when there are no whitespace characters between the punctuation and the reference. It would be better if . <ref was also detected. --NicoV (Talk on frwiki) 09:44, 16 April 2014 (UTC)Reply

Hi. Even better, a reference should not follow a whitespace character, even if there is no punctuation ahead. --Sahrayana (talk) 13:53, 16 April 2014 (UTC)Reply
NicoV, I'll add the whitespace between punctuation and ref. I'm on the swamped side, so it will take me a bit to add this and the other ones recently mentioned here.
Sahrayana, I'm hesitant on adding this. If enwiki is an indicator, there will be a couple hundred thousand articles with errors. It is also on the "minor" side, minor being relative to the editor. Bgwhite (talk) 21:43, 16 April 2014 (UTC)Reply
Thanks ! No rush for any request, do it when you have time ;-) --NicoV (Talk on frwiki) 04:51, 17 April 2014 (UTC)Reply
Done. Bgwhite (talk) 21:20, 18 April 2014 (UTC)Reply
Thank you ! Sahrayana (talk) 16:24, 19 April 2014 (UTC)Reply
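A rough sketch of the #67 change requested in this section: the pattern allows optional whitespace between the punctuation and the <ref tag. The punctuation set, the function name and the default abbreviation list are assumptions made for the example; Checkwiki's own pattern may differ.

```python
import re

# Punctuation followed by optional whitespace and an opening <ref> tag.
# "<references" is not matched because "<ref" must be followed by
# whitespace or ">".
_PUNCT_BEFORE_REF = re.compile(r"[.,;:!?]\s*<ref[\s>]", re.IGNORECASE)


def find_punct_before_ref(text, abbreviations=("etc.",)):
    """Return the offsets of punctuation marks that directly precede a <ref>."""
    hits = []
    for match in _PUNCT_BEFORE_REF.finditer(text):
        # Text up to and including the punctuation character.
        before = text[: match.start() + 1]
        if any(before.endswith(abbr) for abbr in abbreviations):
            continue  # e.g. "etc.<ref>" is acceptable
        hits.append(match.start())
    return hits
```

Sahrayana's broader suggestion (any whitespace before a reference, punctuation or not) would need a different pattern and is not covered here.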

Is there some type of bug flaw with the WCW application?

  Resolved

A user used WP:WCW to fix a spelling and punctuation mistake in an article:

[4]

I was the next one to edit the article and made completely separate edits for content, yet the previous edits noted above were automatically reversed:

[5]

I was curious if anybody knows why this happened, has it happened elsewhere, and if there is something that can be done to fix it for users that employ this tool. Thanks. Wondering55 (talk) 20:57, 16 April 2014 (UTC)Reply

@Wondering55: Actually WP:WCW is just a database(/dataset?) of errors. The program used in the first diff was WP:WPCleaner. (Ping NicoV, the developer). (tJosve05a (c) 21:01, 16 April 2014 (UTC)Reply
Thank you for that ping, ping, ping quick response. I assume that I do not have to post this same message at WP:WPCleaner since you also pinged the developer. I also learned a new command where someone can ping/notify users with a Wikipedia command about a posted message. Hopefully, we will hear back about what might have caused this problem or positive steps to prevent this from happening again. Wondering55 (talk) 21:15, 16 April 2014 (UTC)Reply
@Wondering55: please next time report WCW bugs on their page. -- Magioladitis (talk) 21:16, 16 April 2014 (UTC)Reply
@Wondering55: Well, the first edit was indeed made with WPCleaner (which uses MW API to do the edit). It was done 19 minutes before you saved your edit. Question: is it possible that you started your edit before the first edit (more than 19 minutes editing) ? If so, did you get a warning ? Did you do a section edit or edit the entire article ? I don't see how this could be a bug in WPCleaner since its edit was correctly saved in wiki. It's rather the second edit that is problematic. --NicoV (Talk on frwiki) 21:19, 16 April 2014 (UTC)Reply
It is very possible that I started my edit before the first edit. I believe I was editing the entire article. I don't recall getting any edit warning, which I usually take note of in order to resolve edit conflicts, and even got one while I was editing my response to you. I will assume that there is nothing further to do for now. If I ever see this problem again, I will post it on WP Cleaner. Thanks for the quick response and evaluation. If you happen to find anything further, let me know. Wondering55 (talk) 21:38, 16 April 2014 (UTC)Reply

The tab 'WMFLabs'

  Resolved

The link in the tab that says WMFLabs at the top of this page is not working. It brings me to an 'Internal error' page. (tJosve05a (c) 21:27, 16 April 2014 (UTC)Reply

Josve05a. WMFLabs recently changed webservers and some configs. They are aware of the problem and are trying to fix it. Bgwhite (talk) 21:18, 17 April 2014 (UTC)Reply
Josve05a After people started to complain en masse, they fixed it. Bgwhite (talk) 17:15, 24 April 2014 (UTC)Reply
@Bgwhite: Yay! (You see, it is good to nag!) (tJosve05a (c) 17:19, 24 April 2014 (UTC)Reply

Math and #54

  Done

Hi, it seems that #54 detects false positives when the list element ends with a br followed by <math>...</math>. The math tags are probably removed before analyzing.

Example on fr:Action de groupe (mathématiques):

**[[Théorème de Cayley|par translations à gauche]] ; cette action est [[#Action simplement transitive|simplement transitive]], c'est-à-dire [[#Action libre|libre]] et [[#Action transitive|transitive]] :<br /><math>G \times G \rightarrow G,\ (g,x) \mapsto gx</math>

Maybe, rather than removing math tags, just remove the contents of the math tags? --NicoV (Talk on frwiki) 04:32, 19 April 2014 (UTC)Reply

Yes, the math tags are removed before analyzing. It does hinder a few other errors such as #61. I hadn't thought of removing just the inside of tags. Will do some testing. Egads, I think I'm in a polygamous marriage now. Magioladitis has been my WikiSpouse because he is constantly telling me what to do. Nico is now my WikiSpouse because he is constantly nit picking me. Bgwhite (talk) 06:00, 19 April 2014 (UTC)Reply
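To make NicoV's suggestion concrete, the sketch below blanks the contents of math tags but keeps the tags themselves, so a list item ending in <br /><math>...</math> no longer looks as if it ends with a bare <br />. The names and the simplified #54 test are assumptions for illustration, not Checkwiki's code.

```python
import re

_MATH = re.compile(r"(<math[^>]*>)(.*?)(</math>)", re.DOTALL | re.IGNORECASE)
_BR_AT_END = re.compile(r"<br\s*/?>\s*$", re.IGNORECASE)


def blank_math_contents(text: str) -> str:
    """Keep <math> and </math> in place but drop whatever is between them."""
    return _MATH.sub(r"\1\3", text)


def list_item_ends_with_br(line: str) -> bool:
    """Simplified stand-in for #54: a list item whose last element is a <br>."""
    return line.startswith("*") and _BR_AT_END.search(line) is not None
```

Run on the frwiki example above, the blanked line ends in <math></math> rather than <br />, so the simplified check stays quiet, whereas removing the whole math element would trigger it.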

Add to "Participants" list

  Resolved

Not sure whether I'm allowed to change "Wikipedia:WikiProject Check Wikipedia/Participants" by myself. Therefore, I'm requesting...please add me to the "Participants" list on "Wikipedia:WikiProject Check Wikipedia". Thanks.
--LukasMatt (talk) 05:11, 22 April 2014 (UTC)Reply

@LukasMatt:, feel free to add yourself, project page is open! --NicoV (Talk on frwiki) 08:08, 22 April 2014 (UTC)Reply

Missing articles in ISBN detections ?

  Resolved

Please, don't hit me ! ;-)

I spent quite some time in the last weeks to fix the ISBN errors reported by CW on frwiki, and I thought I had almost finished, but I found a whole bunch of articles that don't seem to be reported. For example, fr:Pont-canal de l'Argent-Double which I fixed today wasn't reported. I'm not entirely sure, because someone may have marked the article as fixed without fixing it... Do you have an easy way to check if the previous version was detected by #69 ?

Dear anonymous, whiny, French person, you must be new to Wikipedia. We sign our posts with 4 tildes (~~~~). This way, we can easily identify who we can ignore. #69 is the wrong error. It is a sexual position, which is why you have fixated on that number (pervert). You are looking for error #70, ISBNs with wrong length. Checkwiki does not look for ISBN errors inside cite templates. Why? I haven't a clue. It was that way when I inherited the code. On enwiki, they have recently changed the cite template code to check for ISBN problems. Pages are found at Category:Pages with ISBN errors. Bgwhite (talk) 18:49, 25 April 2014 (UTC)Reply
Thanks a lot! My mistake, probably because #69 is a lot easier than #70 ;-)
That explains why I had the impression that many were not detected... Too bad the cite templates on frwiki don't check for ISBN problems: I asked about adding it earlier today, I hope someone will add it. We have an equivalent category, fr:Catégorie:Ouvrage avec ISBN invalide, but it's not automatically filled :-( I'm looking into adding features in WPCleaner to populate it, much like what I'm doing for disambiguation links on frwiki. Have a nice weekend, I'll try to have not too many requests ;-) --NicoV (Talk on frwiki) 19:48, 25 April 2014 (UTC)Reply

Localisation for #1

  Done

Sorry to bother you again... I was wondering why there were (almost) never any errors detected for #1 on frwiki, so I looked at the code: apparently only {{template: is detected, and not the localized names for template (like {{modèle:). --NicoV (Talk on frwiki) 08:06, 22 April 2014 (UTC)Reply

NicoV, add it to the translation file and I'll add it to the code. Bgwhite (talk) 17:32, 24 April 2014 (UTC)Reply
Bgwhite, I was hoping that an API request would do the trick (like this one) without having to change the translation file on any wiki, but I can add it to the translation file if you prefer (using the _templates_ parameter?). --NicoV (Talk on frwiki) 19:03, 24 April 2014 (UTC)Reply
NicoV I totally forgot about that. You are correct. Things happen in threes. What stupid thing will I do next that you catch me on? Bgwhite (talk) 19:59, 24 April 2014 (UTC)Reply
NicoV Done. Bgwhite (talk) 23:50, 24 April 2014 (UTC)Reply
Thanks! Already 25 on frwiki :-) --NicoV (Talk on frwiki) 04:57, 25 April 2014 (UTC)Reply
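As an illustration of the localisation point, a #1-style pattern can be built from whatever list of template namespace names a wiki uses. How that list is obtained (translation file or an API request) is left out; the function name and the example names passed in are assumptions.

```python
import re


def template_prefix_pattern(namespace_names):
    """Build a pattern matching {{<namespace>: for any of the given names."""
    alternatives = "|".join(re.escape(name) for name in namespace_names)
    return re.compile(r"\{\{\s*(?:" + alternatives + r")\s*:", re.IGNORECASE)


# Example: English canonical name plus the French localized name.
FRWIKI_TEMPLATE_PREFIX = template_prefix_pattern(["Template", "Modèle"])
```

Passing the names in as data means no per-wiki change to the detection code is needed once the list is available.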

WPC

Is it possible to detect how many articles have been marked as 'done' using WPC? It could be "fun" to see. (tJosve05a (c) 16:40, 19 April 2014 (UTC)Reply

Josve05a, the only stats kept would be the web stats. It doesn't show the difference between an article retrieved or fixed. Last time I checked, I think WPCleaner was generating around 1/2 of the traffic. I'll update stats at the end of the month. Bgwhite (talk) 17:28, 24 April 2014 (UTC)Reply

Question about #11

  Done

Hi, what HTML named characters are excluded from the search in #11? I figure dagger, emdash and endash are excluded because they got their own error. But, are there other characters excluded? (like nbsp, emsp, ...). --NicoV (Talk on frwiki) 11:22, 13 April 2014 (UTC)Reply

And I would like to know which ones are included just to make sure AWB fixes all of them. :) -- Magioladitis (talk) 11:29, 13 April 2014 (UTC)Reply
Included would be the correct term. From the code:
# See http://turner.faculty.swau.edu/webstuff/htmlsymbols.html
our @HTML_NAMED_ENTITIES = qw( aacute acirc aeligi agrave aring aumla bull ccedil cent copy dagger euro hellip iexcl iquest lsquo middot minus ntilde oline ouml pound quot reg rswuo sect sup2 sup3 szling trade uuml crarr darr harr larr rarr uarr );
Bgwhite (talk) 20:26, 13 April 2014 (UTC)Reply
Thanks! I will have to exclude a few from my current list. Question: you don't have the uppercase accented letters ? (like Aacute ?) --NicoV (Talk on frwiki) 20:37, 13 April 2014 (UTC)Reply
AWB check on Bgwhite's list: [6]. -- Magioladitis (talk) 20:42, 13 April 2014 (UTC)Reply
Should I add more from WPCleaner's list or any others? Bgwhite (talk) 20:46, 13 April 2014 (UTC)Reply
Bgwhite, NicoV AWB has a white list of html entities that should not be replaced because they "look bad if changed". These are "ndash|mdash|minus|times|lt|gt|nbsp|thinsp|zwnj|shy|lrm|rlm|[Pp]rime|ensp|emsp|#x2011|#820[13]|#8239". There are some more exceptions, for other reasons, found in Parsers.cs line ~60. You might want to have a look. -- Magioladitis (talk) 21:01, 13 April 2014 (UTC)Reply

@Bgwhite: @Magioladitis: I tried to go through the list of existing HTML named entities to see which ones should be reported. What do you think of this list ? (I took the current list, added what seemed reasonable, and then removed the ones that are excluded by AWB.) --NicoV (Talk on frwiki) 23:00, 14 April 2014 (UTC)Reply

NicoV, sounds good to me. After Magioladitis looks at the list, I'll add them. Bgwhite (talk) 23:49, 14 April 2014 (UTC)Reply
Bgwhite I agree. -- Magioladitis (talk) 05:05, 15 April 2014 (UTC)Reply
Ok, I've released WPCleaner with this list. --NicoV (Talk on frwiki) 06:39, 16 April 2014 (UTC)Reply

Done. Updated list is now in checkwiki. Bgwhite (talk) 21:21, 18 April 2014 (UTC)Reply

@NicoV and Bgwhite: Now I recall we discontinued this error. There were complaints that html entities should not change, especially in pages about math where math formulas are allowed not only in math tags but also in plain text. This is the reason AWB skips unicodification in pages with math tags. -- Magioladitis (talk) 17:13, 20 April 2014 (UTC)Reply

@NicoV and Bgwhite: How about turning #11 on, but skip any pages with <math> or {{math}}? Bgwhite (talk) 22:30, 21 April 2014 (UTC)Reply
@NicoV and Bgwhite: OK let's try that but I am not sure I trust a guy who pings himself. -- Magioladitis (talk) 22:34, 21 April 2014 (UTC)Reply
Yea, I'm trying to wake up. Yea, I'm pinging myself awake. That must be it..... lowers head in shame Bgwhite (talk) 22:41, 21 April 2014 (UTC)Reply
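The compromise reached here (report named entities, leave AWB's whitelist alone, skip any page that uses <math> or {{math}}) could look roughly like the sketch below. The REPORTED set is a small illustrative subset chosen for the example, not the list that was deployed in Checkwiki or WPCleaner, and the function name is made up.

```python
import re

# Named entities AWB leaves alone, taken from the whitelist quoted above
# (numeric entries omitted).
AWB_KEEP = {
    "ndash", "mdash", "minus", "times", "lt", "gt", "nbsp", "thinsp",
    "zwnj", "shy", "lrm", "rlm", "prime", "Prime", "ensp", "emsp",
}

# Illustrative subset only; the real list was agreed separately.
REPORTED = {"aacute", "agrave", "ccedil", "copy", "euro", "hellip", "uuml", "rarr"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")
_HAS_MATH = re.compile(r"<math[\s>]|\{\{\s*math\s*\|", re.IGNORECASE)


def find_named_entities(text: str) -> list:
    """Return reportable entity names, or nothing if the page contains math."""
    if _HAS_MATH.search(text):
        return []
    return [
        match.group(1)
        for match in _ENTITY.finditer(text)
        if match.group(1) in REPORTED and match.group(1) not in AWB_KEEP
    ]
```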

Notice for #94 ?

  Done

Hi, it would be nice to have the "notice" column filled for #94 (like the text just before the isolated closing ref tag). I'm trying to fix them on frwiki, and when WPCleaner doesn't find the problem I don't know if it has been fixed since it has been detected or if there's a discrepancy between WPCleaner and CheckWiki script. --NicoV (Talk on frwiki) 21:51, 2 April 2014 (UTC)Reply

Magioladitis, I'm working on Nico's request. 2010–11 Morecambe F.C. season was a bugger. AWB does not recognize a stray </ref> tag. It's in the "League table" section right at the end:
‡Hereford United deducted 3 points for fielding an unregistered player.</ref>[1]
Bgwhite (talk) 22:34, 14 April 2014 (UTC)Reply

Parsoid-based online-detection of broken wikitext

Greeting, wiki checkers!!

I plan to propose a GSOC project through Wikimedia this year, based around the idea of Parsoid-based online detection of broken wikitext. The original idea of the project is defined here: to develop a tool that uses Parsoid to fix broken wikitext found while parsing wiki pages, and then develop a user interface for editors to fix broken wikitext. But after a few discussions about the project with the Parsoid team, we found out that we already have a tool, Check Wikipedia. It lacks the fixup information that Parsoid generates while parsing wiki pages, so through my GSOC project we plan to integrate this information with your tool.

After having discussions with parsoid devs, I have written an application draft under my username GSOC Application 2014. I would be really thankful, if I get some feedback and we can have some discussion on the same. Hardik95 (talk) 21:30, 14 March 2014 (UTC)Reply

Sounds good. Using parsoid to finding all pages with broken wikitext would be a good first step.--Salix alba (talk): 08:34, 15 March 2014 (UTC)Reply
Sorry for being late, I've been out sick for the past few days. Your idea does sound like a good idea. Anything I can do to help, just ask. The Checkwiki code is found here. Checkwiki.pl is the main detection script. It runs at http://tools.wmflabs.org/ and uses wmflabs' MySQL as the database. Both AWB and WPCleaner can retrieve specific Checkwiki errors to fix. Many errors can be corrected in bot mode while the rest have to be fixed manually. The List of errors page contains a listing of the Checkwiki errors and what program can correct each error. Bgwhite (talk) 20:43, 18 March 2014 (UTC)Reply

2 servers, 2 scripts

Hi! It seems that now CheckWiki works parallel on 2 servers: toolserver.org and tools.wmflabs.org, and they are using:

Different language communities use different servers, but they translate the same descriptions, which do not always fit to the logic. It seems to be a problem.

So, e.g., error 042 searches errors with incorrect <small> tags on the one server and <strike> tags on the other. But they take description of the error from the same page, which should be translated from enwiki translation page. Another example is error 089, etc.

(I am from eowiki.) Yurij Karcev (talk) 06:38, 14 March 2014 (UTC)Reply

Yurij Karcev, toolserver is dead and WMFLabs is its replacement. People have been given time to move their programs over to WMFLabs, which is why both are running. Toolserver will be turned off in about 3 months. I don't have access to toolserver, so I can't place any messages there.
WMFLabs is adding new errors and turning off some old ones. WMFLabs' checkwiki processes dump files every two weeks when available. Toolserver hasn't run on a dump in over a year. The translation page for eowiki has not been updated in a long time. Should the translation page be in English? If not, could you translate it into Esperanto? Bgwhite (talk) 08:08, 14 March 2014 (UTC)Reply
Ok. This transition wasn't described clearly anywhere, and some CheckWiki's are still mentioning toolserver – for example, Russian, Spanish and some others. I'm just working on Esperanto CheckWiki, so have found this inconsistency.
Other problem is – when you change error number meaning in the script logic (see above 042), other language projects must synchronously change their translation pages. Now they don't. Maybe at least not to reuse numbers? Yurij Karcev (talk) 09:48, 14 March 2014 (UTC)Reply
Translation pages have to be changed no matter what, so it is a moot point. I only speak English. The French, German, Greek and Swedish pages have been changed. Czech might have. I already got into a brouhaha in trying to change some stuff on the German page, was reverted and told Germans only, so I'm hesitant about changing other pages. If you know any other languages your help would be much appreciated. Bgwhite (talk) 17:57, 14 March 2014 (UTC)Reply
@Bgwhite: Czech pages are changed ASAP. I understand Slovak, so I will change something on the Slovak pages. I had asked one user and she said she would translate the rest. Matěj Suchánek (talk | cont.) 08:12, 15 March 2014 (UTC)Reply
@Bgwhite: Esperanto: updated project page and error descriptions. Russian: updated project page, working on error descriptions. Suggestions:
  • Please add characters ĈĜĤĴŜŬĉĝĥĵŝŭ as correct for eowiki in errors 007 and 036;
  • Could you check error 055 – it finds too many strange errors in eowiki, dewiki, eswiki etc. Yurij Karcev (talk) 12:56, 21 March 2014 (UTC)Reply
The characters have been added. Thank you for updating eowiki and ruwiki. Yes, those are strange 55 errors. I ran some articles thru checkwiki and it didn't produce 55 errors. What's stranger is checkwiki is not detecting any strange 55 errors during the daily runs. I just blanked 55 on dewiki and will see if the daily runs produce new errors. Bgwhite (talk) 18:36, 21 March 2014 (UTC)Reply
So, on dewiki the daily run produces only real 055 errors. Also on eswiki. But on eowiki the daily run isn't adding anything at all. Is it turned off? Yurij Karcev (talk) 05:21, 25 March 2014 (UTC)Reply
Yurij Karcev Daily runs are only done for enwiki, frwiki, eswiki and dewiki. They are the largest ones and most prone to a lot of changes. Upon request, arwiki was added. I can add eowiki if you like.
In theory, eowiki has two dumps per month created. Checkwiki runs on those two dumps. A listing of all dumps and schedules is located here. Bgwhite (talk) 05:43, 25 March 2014 (UTC)Reply

Error #37

It was suggested to exclude all pages where adding DEFAULTSORT doesn't make a difference. Redirects are an example. If a page neither

  • contains a template (templates may set categories and therefore may require DEFAULTSORT) nor
  • contains a category with no sort key (e.g. [[Category:Ä]] requires DEFAULTSORT but [[Category:Ä|A]] does not)

it can be skipped. The following line of code should do that (again, not tested). --TMg 20:24, 20 January 2014 (UTC)Reply

if ( index( $text, '{{' ) >= 0 or $text =~ /\[\[($cat_regex):[^[|\]]+\]\]/i ) {
    # Do the check
}
Suggested by whom and where?
For #37, articles and redirects are already skipped if there are no categories. Bgwhite (talk) 22:11, 20 January 2014 (UTC)Reply
Discussed here. This is an example for a page where all categories already contain a sort key. Adding DEFAULTSORT does not change anything. Currently error #37 reports about 14,000 pages in the German Wikipedia. It would help if we could remove such cases that aren't actual errors. Just for now. We could re-add this later. --TMg 00:48, 21 January 2014 (UTC)Reply
Ok, it now makes sense what you are asking. Short answer... No. Long answer... This has been asked several times before. Ideally, defaultsort should be added and any identical sorts in the categories removed. AWB does do this already. Magioladitis recently finished up all 90,000 missing defaultsorts in enwiki via a bot using AWB. In the long run, this would be the best solution. Bgwhite (talk) 07:29, 21 January 2014 (UTC)Reply
I understand and I agree that all pages should use DEFAULTSORT in the long run. But this is not how things work in the German Wikipedia right now. There is no consensus to use bots for such trivial tasks in dewiki. As I said: It would help the German Checkwiki users a lot to be able to focus on actual errors first. You can add the additional check above for dewiki only. If the current 14,000 reported errors are down to 100 (or something like that) we can remove the check. By the way, I spend several hours updating the German localization. Just to let you know. --TMg 21:47, 21 January 2014 (UTC)Reply
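For readers who do not use Perl, TMg's proposed pre-check from the top of this section can be restated as below. Like the original line it is a sketch that has not been run against real dumps; the function name and the default category namespace names are assumptions.

```python
import re


def defaultsort_could_matter(text, category_names=("Category", "Kategorie")):
    """True if the page has a template or a category without a sort key."""
    if "{{" in text:
        # Templates may add categories, so DEFAULTSORT may still be needed.
        return True
    names = "|".join(re.escape(name) for name in category_names)
    # A category link with no "|sortkey" part.
    pattern = r"\[\[(?:" + names + r"):[^\[|\]]+\]\]"
    return re.search(pattern, text, re.IGNORECASE) is not None
```

Pages for which this returns False are the ones where adding DEFAULTSORT changes nothing, which is the subset TMg asks to have excluded on dewiki.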

ID 73 - ISBN errors

Can you write the errors on the talk page of the appropriate article? Because in many cases the author of an article watches it and then can correct the ISBN. --Tsor (talk) 19:38, 14 January 2014 (UTC)Reply

Tsor, unfortunately it cannot write to articles. This requires bot approval which Checkwiki would not get. There was a bot that was tagging articles and the articles were ending up in Category:Articles with invalid ISBNs. The owner of the bot is no longer active, thus the bot is also no longer active. Bgwhite (talk) 00:31, 15 January 2014 (UTC)Reply

Since yesterday I cannot mark articles as "Done". Leads to an error message. --Tsor (talk) 10:45, 16 January 2014 (UTC)Reply

Tsor, could you give me some examples. What language and what error number? Bgwhite (talk) 11:10, 16 January 2014 (UTC)Reply
Go to ISBN-13 and click on any "Done". After a few minutes you get the following error message:

Check Wikipedia
Aggregat 4
Software error:
Cannot execute: Lock wait timeout exceeded; try restarting transaction
--Tsor (talk) 12:36, 16 January 2014 (UTC)Reply
When I go to https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=dewiki&view=only&id=73 I get this following error message:

Could not connect to database: Can't connect to MySQL server on 'tools-db' (111). (tJosve05a (c) 14:48, 16 January 2014 (UTC)Reply

Josve05a, the error you saw is most likely WMFLabs having trouble. When you see that, try again a bit later. Labs are aware of problems to their database machines, but are not going to fix it for who knows how long. The latest excuse is they will when all the machines are physically located to their new location.
Tsor, I still cannot duplicate and I haven't seen that error before. The error message usually means another process has a "lock" or total control over the database and all other database connections are locked out. Why is the error showing up now? Could you tell me the exact time you tried and what article you pressed "done" on. That way I can look at logs and hopefully they will tell me something. Bgwhite (talk) 21:10, 17 January 2014 (UTC)Reply

#16

When I fix error 16 on arwiki, only about 5% of the list gets fixed. I tried with WPC and AWB. Where is the problem? --Zaher talk 13:42, 28 November 2013 (UTC)Reply

Apparently, there are situations where removing the control character changes the text and it seems to be a problem. I know this is usually happening with some characters (arabic, hebrew, ...). Nobody has been able to explain to me how to know if it's a special situation and how to fix it, so I've coded WPCleaner so that #16 is fixed automatically only if the characters around the control characters are part of a limited list (mainly ASCII, some diacritics, punctuation, ...). That's why it doesn't do much on arwiki. If you're able to guide me to know when it is safe to remove the control characters, I can update WPCleaner. --NicoV (Talk on frwiki) 22:12, 28 November 2013 (UTC)Reply
This is best answered by Magioladitis as he is the resident expert on this. If I remember right, most false positives come up when dealing with right-to-left languages. Bgwhite (talk) 06:32, 29 November 2013 (UTC)Reply
I was never able to determine when we are in the case where the text order changes. This is a very rare situation in the English Wikipedia (less than 0.1% by my experience). I can't tell the same for Arabic Wikipedia. Are we sure arwiki wants invisible left-to-right characters to be removed? Meno25? -- Magioladitis (talk) 12:20, 6 December 2013 (UTC)Reply
@Magioladitis: Zaher and I want the characters to be removed. I can start a discussion on the Arabic Wikipedia Village Pump about this issue if needed. --Meno25 (talk) 12:25, 6 December 2013 (UTC)Reply
@Meno25: I am OK either way, but I don't know the statistics for arwiki. AWB removes the characters using simple Find & Replace method. Check instructions at User:Magioladitis/AWB_and_CHECKWIKI#cite_note-4. Recall that 16 can not be fixed in bot mode. -- Magioladitis (talk) 12:29, 6 December 2013 (UTC)Reply
@Magioladitis: @NicoV: Checkwiki error 16 is fixed automatically (not manually) by WPCleaner for English texts. But this fix is disabled for Arabic texts. What Zaher is trying to say above is that he wants fixing this error to be enabled for Arabic texts too. I have been using AWB to fix this error manually on Arabic Wikipedia for months, using the same regex you provided, without complaints from other users, so I guess we can safely enable fixing this error for Arabic texts. Of course, bot operators on arwiki can disable fixing error 16 in WPCleaner preferences if a problem arises. --Meno25 (talk) 12:41, 6 December 2013 (UTC)Reply
In WPCleaner, I decided to restrict automatic fixing after some reports of problems. See this discussion for example; someone also reported that fixing fr:Alâ ud-Dîn Khaljî resulted in character inversion (it may be the same for the few pages left with error #16 on frwiki). It would be better to discuss this issue with people who know how it works before letting WPCleaner automatically fix every control character again. --NicoV (Talk on frwiki) 15:07, 6 December 2013 (UTC)Reply
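
The conservative approach NicoV describes (only remove a control character when its neighbours belong to a limited set) could be sketched roughly as follows. This is an illustration only, not WPCleaner's code: the `CONTROL_CHARS` sample and the `is_safe_neighbour` whitelist are assumptions made for the example, and the real lists are longer.

```python
import unicodedata

# Assumed sample of invisible characters; the real #16 list is longer.
CONTROL_CHARS = "\u200e\u200f\u200b\ufeff"


def is_safe_neighbour(ch: str) -> bool:
    """Hypothetical whitelist: ASCII, or Latin letters with diacritics."""
    if ch == "":  # start or end of the text
        return True
    if ord(ch) < 128:
        return True
    return unicodedata.name(ch, "").startswith("LATIN")


def remove_control_chars(text: str) -> str:
    """Drop a control character only when both neighbours are 'safe'."""
    out = []
    for i, ch in enumerate(text):
        if ch in CONTROL_CHARS:
            prev = text[i - 1] if i > 0 else ""
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if is_safe_neighbour(prev) and is_safe_neighbour(nxt):
                continue  # safe context: remove the invisible character
        out.append(ch)  # otherwise keep the text untouched
    return "".join(out)
```

With this rule a left-to-right mark between two Latin letters is dropped, while the same mark between two Arabic letters is left alone, which would explain why such a tool does little on arwiki.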

Fixing ISBN errors

  Note:

Hi, I've made a lot of improvements in WPCleaner to help fix ISBN errors #69, #70, #71, #72 and #73 (which account for about 10k errors on enwiki). Some of these improvements require configuration in the WPCleaner configuration file or the Check Wiki configuration file.

  • #72, #73: possibility to search the provided ISBN number or the ISBN number modified with the computed check value in several web sites. Web sites are configurable in general_isbn_search_engines, with 3 default web sites (WorldCat, OttoBib, Copyright Clearance Center). If you know other interesting web sites, let me know, I can add them by default.
  • #70, #71, #72, #73: when the ISBN is provided as a template parameter (isbn=), possibility to search in several web sites using another parameter of the template (for example the title). This is configurable in general_isbn_search_engines_templates, with no default configuration as it depends on the templates of the wiki. Example available in frwiki configuration.
  • #70: when the ISBN provided contains 8 characters, possibility to search if this is an ISSN number in several web sites. Web sites are configurable in general_issn_search_engines, with 1 default web site (WorldCat). If you know other interesting web sites, let me know, I can add them by default.
  • all: possibility to request help on fixing the ISBN. It's configurable through general_isbn_help_needed_comment, general_isbn_help_needed_templates, error_070_reason_yywiki and so on.

If you have other ideas on how to help fixing those errors, I'm quite interested. --NicoV (Talk on frwiki) 23:21, 19 November 2013 (UTC)Reply
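
For readers unfamiliar with the "computed check value" mentioned for #72 and #73, here is a minimal sketch of the standard ISBN-10 and ISBN-13 check-digit rules. The function names are made up for this example and are not taken from WPCleaner or the Check Wiki script.

```python
from typing import Optional


def isbn10_check_digit(first9: str) -> str:
    """Check digit for the first 9 digits of an ISBN-10 (weights 10..2, mod 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first 12 digits of an ISBN-13 (weights 1,3 alternating, mod 10)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def suggest_isbn(raw: str) -> Optional[str]:
    """Return the ISBN with its check digit recomputed, or None if the length is wrong."""
    digits = raw.replace("-", "").replace(" ", "").upper()
    if len(digits) == 10 and digits[:9].isdigit():
        return digits[:9] + isbn10_check_digit(digits[:9])
    if len(digits) == 13 and digits.isdigit():
        return digits[:12] + isbn13_check_digit(digits[:12])
    return None
```

A tool could then search the web sites for both the ISBN as written and the value returned by `suggest_isbn`, as described in the first bullet above.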

HTML entities

  Resolved

I object to a blanket replacement of HTML entities with the corresponding Unicode character on the basis of source code readability. The Wikipedia editor lacks any mechanism to identify the character at the cursor location. Also, the user can direct the editor to use a variety of different fonts, and the casual editor probably does not know what font is in use. Thus there are many similar characters, such as −, -, –, A, Α, Η, K, Κ, N, and Ν. When these are present in the source as Unicode rather than HTML entities it is difficult for editors to know which is which. Jc3s5h (talk) 14:28, 5 May 2014 (UTC)Reply

Jc3s5h I think the check already excludes all the letters/symbols that look similar to Latin characters. Am I wrong? Anyway, since we discovered a lot of false positives I agree with you. -- Magioladitis (talk) 14:32, 5 May 2014 (UTC)Reply
After posting, I managed to find the full list at https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=only&id=11. However, the full list contains an ellipsis. I don't know if that means the HTML entity for ellipsis will be converted to the Unicode character for ellipsis, or if the full list is really not a full list. Jc3s5h (talk) 14:37, 5 May 2014 (UTC)Reply
@Bgwhite and NicoV: what is the current status of this one? -- Magioladitis (talk) 14:40, 5 May 2014 (UTC)Reply
For WPCleaner, error is reported for all characters listed in #11, if there's no <math /> or {{math}}. When working in manual mode, no automatic replacement is done, just a suggestion to replace them by their Unicode character. When working in bot mode, automatic replacement (not sure if I should keep this). --NicoV (Talk on frwiki) 13:24, 6 May 2014 (UTC)Reply
This would certainly create confusion with the Greek letters Α, Β, Ε, Ζ, Ι, Κ, Μ, Ν, Ο, ο, Ρ, Τ, and Χ. The letter υ could be a problem in some fonts. If the bot behaves inconsistently for different Greek letters, that could create further confusion; maybe it would be better to leave all Greek letters alone. In any case, all these characters should be documented. Jc3s5h (talk) 13:36, 6 May 2014 (UTC)Reply
@Bgwhite and Magioladitis: Do we remove Greek letters from #11 ? No problem on my side, I just want to keep being coherent with the detections from the script. --NicoV (Talk on frwiki) 14:26, 6 May 2014 (UTC)Reply
NicoV, I was going to mark this as resolved until I actually read the last two messages above and saw this was something else. Grrrr, I wish I had my mind. Following comment is in the checkwiki code. The following section was added to the code on April 22nd:
FOR #011. DO NOT CONVERT GREEK LETTERS THAT LOOK LIKE LATIN LETTERS.
Alpha (A), Beta (B), Epsilon (E), Zeta (Z), Eta (H), Kappa (K), kappa (k), Mu (M), Nu (N), nu (v), Omicron (O), omicron (o), Rho (P), Tau (T), Upsilon (Y), upsilon (u) and Chi (X).
Bgwhite (talk) 23:30, 15 May 2014 (UTC)Reply
Thanks Bgwhite, I just removed the same letters in WPCleaner. --NicoV (Talk on frwiki) 04:51, 16 May 2014 (UTC)Reply
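
As a rough illustration of the rule quoted from the checkwiki code, the sketch below replaces named HTML entities with their Unicode character but skips the Greek letters listed above. It is not the checkwiki or WPCleaner implementation: the real #11 list is narrower than "every named entity", and the `KEEP_AS_ENTITY` set is an assumption added so that markup-significant entities are left alone.

```python
import re
from html.entities import name2codepoint

# Greek letters that look like Latin letters (names from the comment above).
GREEK_LOOKALIKES = {
    "Alpha", "Beta", "Epsilon", "Zeta", "Eta", "Kappa", "kappa", "Mu", "Nu",
    "nu", "Omicron", "omicron", "Rho", "Tau", "Upsilon", "upsilon", "Chi",
}
# Assumption for this sketch: never touch entities that matter to markup.
KEEP_AS_ENTITY = {"amp", "lt", "gt", "quot", "nbsp"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_entities(text: str) -> str:
    """Replace named entities by their character, except the skipped ones."""
    def repl(match):
        name = match.group(1)
        if name in GREEK_LOOKALIKES or name in KEEP_AS_ENTITY:
            return match.group(0)
        if name not in name2codepoint:
            return match.group(0)  # unknown entity: leave as written
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(repl, text)
```

For example, `&alpha;` and `&hellip;` are converted, while `&Alpha;` stays as an entity because it is indistinguishable from a Latin A in most fonts.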

I insist these bots comply with MOS:MARKUP. Jc3s5h (talk) 13:40, 6 May 2014 (UTC)Reply

Jc3s5h, you can insist all you want, but you are in the wrong spot. You will have to contact the bot's talk page or the individual bot owner. CheckWiki only checks, not fixes. Bgwhite (talk) 23:30, 15 May 2014 (UTC)Reply
It is incorrect to label, as an example, the HTML entity &Alpha; as an error. Jc3s5h (talk) 23:47, 15 May 2014 (UTC)Reply
Jc3s5h, per above, CheckWiki does not catch &Alpha; as an error and never has done so. Bgwhite (talk) 00:05, 16 May 2014 (UTC)Reply
I'm pleased to see that these are not being incorrectly labelled as errors. It would be nice if the documentation made it clear which HTML entities are being replaced. This information is of interest to all editors who edit articles, not just people who write bot code or people who use AutoWikiBot. Therefore, which HTML entities it is safe to put into an article should be accessible to all editors, with no programming skill required. Jc3s5h (talk) 00:11, 16 May 2014 (UTC)Reply

<references> detected by #67

  Done

Hi, with the last dump on frwiki, I see that several articles are detected by #67 but it's a <references>...</references> not a <ref>...</ref>... (fr:2 février, fr:23 février, ...). Maybe only detect if there's no letter after ref (white space, ">", ...) ? --NicoV (Talk on frwiki) 08:44, 6 May 2014 (UTC)Reply

NicoV Done. Bgwhite (talk) 04:45, 13 May 2014 (UTC)Reply
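
NicoV's suggestion (only match when no letter follows "ref") can be expressed with a lookahead. The pattern below is a guess at the idea, not the expression used by the Check Wiki script; the set of punctuation characters is an assumption, and abbreviation whitelists such as "etc." are not handled.

```python
import re

# "<ref" must be followed by whitespace, ">" or "/", so "<references" is excluded.
REF_AFTER_PUNCT = re.compile(r"[.,;:]<ref(?=[\s>/])", re.IGNORECASE)


def has_ref_after_punctuation(wikitext: str) -> bool:
    """True if a <ref> tag directly follows a punctuation mark."""
    return REF_AFTER_PUNCT.search(wikitext) is not None
```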

Homepage → enwiki

  Done

For Homepage → enwiki → High priority (and all and middle and low), would you please make the "ID" column sortable?
--LukasMatt (talk) 07:17, 3 May 2014 (UTC)Reply

Moin Moin at all, I think this will be interesting for all languages. --Crazy1880 (talk) 17:15, 6 May 2014 (UTC)Reply
@LukasMatt and Crazy1880: It has been added. Bgwhite (talk) 07:31, 8 May 2014 (UTC)Reply
Beautiful. Thanks. --LukasMatt (talk) 08:29, 8 May 2014 (UTC)Reply
Moin Moin @Bgwhite:, I checked it, thank you. Regards --Crazy1880 (talk) 18:18, 9 May 2014 (UTC)Reply

Multiple <ref /> tags separated by commas

  Resolved

Hi, are multiple <ref>...</ref> tags separated by commas (or other punctuation marks) detected by #61 or #67: like <ref>...</ref>,<ref>...</ref> ? If not, it may be useful to create a new error for that, because on many wikis, references should not be separated by normal punctuation, but rather by things like fr:Modèle:,. --NicoV (Talk on frwiki) 12:51, 12 May 2014 (UTC)Reply

NicoV, it is detected for #61 and in theory for #67 as well. I don't fix any #67, so I can't say for sure. Bgwhite (talk) 17:49, 12 May 2014 (UTC)Reply
Ok, thanks, I will update WPCleaner to detect them also. --NicoV (Talk on frwiki) 18:08, 12 May 2014 (UTC)Reply
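
A detection for this case could look roughly like the sketch below: a closing `</ref>` (or a self-closed `<ref ... />`), then a punctuation mark, then another `<ref`. The pattern and the punctuation set are assumptions for illustration, not what #61, #67 or WPCleaner actually use.

```python
import re

REFS_SEPARATED = re.compile(
    r"(?:</ref\s*>|<ref\b[^>]*/>)"  # end of the first reference
    r"\s*[,;.]\s*"                  # separating punctuation
    r"<ref(?=[\s>/])",              # start of the next reference
    re.IGNORECASE,
)


def refs_separated_by_punctuation(wikitext: str) -> bool:
    """True if two consecutive references are separated by punctuation."""
    return REFS_SEPARATED.search(wikitext) is not None
```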

Detection of ISBN templates with the same ISBN repeated several times

  Not done

Hi, when fixing ISBN in frwiki, I found a few cases where the same ISBN was defined several times in one ISBN template: one time with the "-" separators, one time without. Do you think we should create a new error for this? --NicoV (Talk on frwiki) 09:57, 15 May 2014 (UTC)Reply

NicoV, do you have an example? Bgwhite (talk) 17:26, 15 May 2014 (UTC)Reply
Something like this, but without the missing last digit on the second ISBN. The same ISBN would have been used twice in the template, once with the "-" (978-2-296-00571-6) and once without (9782296005716). I don't find an exact example in my contributions, my bot account has made too many edits lately to find it. --NicoV (Talk on frwiki) 17:53, 15 May 2014 (UTC)Reply
NicoV, I'm inclined to say no. enwiki doesn't have an ISBN template, but I don't recall seeing this problem before when the ref is written without a template. Bgwhite (talk) 23:42, 15 May 2014 (UTC)Reply
Ok, no problem. --NicoV (Talk on frwiki) 04:57, 16 May 2014 (UTC)Reply
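
Although no new error was created, the check NicoV had in mind is easy to sketch for anyone who wants to run it locally: normalise each ISBN value by stripping separators, then look for repeats. The helper names are hypothetical, and extracting the values from the template is left out.

```python
from collections import Counter


def normalize_isbn(value: str) -> str:
    """Strip '-' and spaces so both spellings of an ISBN compare equal."""
    return value.replace("-", "").replace(" ", "").upper()


def duplicate_isbns(values):
    """Return the normalised ISBNs that appear more than once."""
    counts = Counter(normalize_isbn(v) for v in values)
    return [isbn for isbn, n in counts.items() if n > 1]
```

With the example from the thread, `duplicate_isbns(["978-2-296-00571-6", "9782296005716"])` reports the ISBN once, as a duplicate.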

Leaflet For Wikiproject Check Wikipedia At Wikimania 2014 (updated version)

Please note: This is an updated version of a previous post that I made.

 

Hi all,

My name is Adi Khajuria and I am helping out with Wikimania 2014 in London.

One of our initiatives is to create leaflets to increase the discoverability of various wikimedia projects, and showcase the breadth of activity within wikimedia. Any kind of project can have a physical paper leaflet designed - for free - as a tool to help recruit new contributors. These leaflets will be printed at Wikimania 2014, and the designs can be re-used in the future at other events and locations.

This is particularly aimed at highlighting less discoverable but successful projects, e.g:

• Active Wikiprojects: Wikiproject Medicine, WikiProject Video Games, Wikiproject Film

• Tech projects/Tools, which may be looking for either users or developers.

• Less known major projects: Wikinews, Wikidata, Wikivoyage, etc.

• Wiki Loves Parliaments, Wiki Loves Monuments, Wiki Loves ____

• Wikimedia thematic organisations, Wikiwomen’s Collaborative, The Signpost

The deadline for submissions is 1st July 2014

For more information or to sign up for one for your project, go to:

Project leaflets
Adikhajuria (talk) 12:43, 25 June 2014 (UTC)Reply

Leaflet For Wikiproject Check Wikipedia At Wikimania 2014

Are you looking to recruit more contributors to your project?
We are offering to design and print physical paper leaflets to be distributed at Wikimania 2014 for all projects that apply.
For more information, click the link below.
Project leaflets
Adikhajuria (talk) 14:57, 22 May 2014 (UTC)Reply

Adikhajuria Bgwhite I would be interested in that. -- Magioladitis (talk) 17:30, 12 June 2014 (UTC)Reply

Can someone tell me ...

  Not possible - Wrong forum

Why this edit was claimed as a CHECKWIKI fix? Near as I can see - it moved the authorlink parameter from next to the author to later in the reference template and removed a space. This doesn't look like any sort of error to me.... and I really prefer to see authorlinks near the author parameter - makes more sense. I also like the space - there is no rule that it shouldn't exist and it makes it easier to edit and tell sections of templates. Ealdgyth - Talk 12:30, 14 May 2014 (UTC)Reply

Ealdgyth it did not move the authorlink. Authorlink was a duplicate parameter. There were two parameters with the same title and same content. -- Magioladitis (talk) 12:33, 14 May 2014 (UTC)Reply
Could that be listed as an error or something in the thing? It's very annoying when the bot moves through a huge bunch of articles and does a pile of different edits, but the edit summaries are all the same - which means I have to guess what error caused each edit. Ealdgyth - Talk 12:38, 14 May 2014 (UTC)Reply
Ealdgyth, this is not a CheckWiki issue. CheckWiki only finds problems. How a problem is fixed, including the edit summary, is up to the individual editor. If an editor is using AWB, general fixes will be applied, which the authorlink issue is part of. It is not possible to add these fixes to the edit summary. Bgwhite (talk) 06:19, 15 May 2014 (UTC)Reply
Well, the edit summary clearly stated it WAS a CheckWiki fix ... if it isn't one, shouldn't these sorts of fixes not state they are? Ealdgyth - Talk 12:18, 15 May 2014 (UTC)Reply
Ealdgyth, when a bot runs on any list, CheckWiki or otherwise, the list is always out of date as articles are being changed or updated all the time. The AWB bot arrives at an article, issue on the list was fixed, but AWB's general fixes corrects another issue. Bgwhite (talk) 17:25, 15 May 2014 (UTC)Reply

404 Not Found

Moin Moin Bgwhite and NicoV, since this evening I got to see "404 Not Found" for the script https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi is there something wrong this evening? Regards --Crazy1880 (talk) 17:30, 3 June 2014 (UTC)Reply

Crazy1880. The web server died for whatever reason. Things are working now. Bgwhite (talk) 18:06, 3 June 2014 (UTC)Reply
Thank you Bgwhite. I know this behaviour from IIS web servers. Our company uses a CRM and SharePoint, and they sometimes raise the same alerts. Have a good evening. --Crazy1880 (talk) 18:54, 3 June 2014 (UTC)Reply

False positive for ISBNs

  Resolved

ca:Rent (musical) gives a false positive for issue #72 because of a URL which contains the string "/qisbn=1164910567/". Can you please check on it? --Joutbis (talk) 18:32, 14 July 2014 (UTC)Reply

That is an old Amazon format. The correct link is: http://www.amazon.com/Rent-Jonathan-Larson/dp/0688154379 Bgwhite (talk) 19:54, 14 July 2014 (UTC)Reply

Old interface gone for good?

  Resolved

Is the old interface gone for good? If so, how come errors #30 and #79 don't get flagged in the new one? --Joutbis (talk) 18:37, 14 July 2014 (UTC)Reply

Joutbis toolserver does not work anymore. -- Magioladitis (talk) 18:57, 14 July 2014 (UTC)Reply
As Magioladitis mentioned, Toolserver was turned off on June 30. Anything that was on Toolserver was either migrated to WMFLabs or is gone. Errors #30 and #79 are deactivated on all wikis. A bunch of errors were deactivated and a bunch of new errors have been added. Bgwhite (talk) 20:02, 14 July 2014 (UTC)Reply
Ah, OK, thanks. That's too bad, we had those two under control...--Joutbis (talk) 23:02, 18 July 2014 (UTC)Reply

Addition to error 16

  Done

I suggest that we add "u00a0" (invisible nbsp) in the list of invisible unicode characters. -- Magioladitis (talk) 06:53, 2 August 2014 (UTC)Reply

Done Bgwhite (talk) 07:42, 22 August 2014 (UTC)Reply
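
Since U+00A0 looks identical to an ordinary space in most editors, a small helper that reports characters by code point and Unicode name can help when reviewing such fixes. This is only a convenience sketch using Python's standard `unicodedata` module; it is not part of Check Wiki, and the function names are invented for the example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return a 'U+XXXX NAME' label for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"


def find_nbsp(text: str):
    """Return the positions of every U+00A0 in the text."""
    return [i for i, ch in enumerate(text) if ch == "\u00a0"]
```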

"Hard space"?

  Not possible - Wrong forum

I was linked here by es, but the word "hard space" (1970–1991?) does not appear on the page. Any serious (AWB) es should specify by Unicode, and maybe HTML entity when needed. -DePiep (talk) 20:45, 26 July 2014 (UTC)Reply

DePiep, I'm not exactly sure what you are asking, also what is "es"? Bgwhite (talk) 22:20, 26 July 2014 (UTC)Reply
es=edit summary, WP:ES. I responded to this edit: [7]. Earlier recent talk is at User_talk:Magioladitis#What_kind_of_spaces?.
My points: 1. The es linked to "hard space", which is an old-fashioned name. That is, it has not been used since we came to know and use standard Unicode (of course I can click & read & click & read my homework, but why am I required to do so?). Personal note: I have made hundreds of edits in enwiki about Unicode, and I am still surprised to see this 1980s term 'hard space' today. And I do know ALGOL60. There also seem to exist, in AWB talk: 'normal space', 'invisible nbsp', 'visible nbsp' (says Magioladitis, a WP:AWB contributor).
Quite simple: we use Unicode, so we communicate by Unicode.
U+0020   SPACE
U+00A0   NO-BREAK SPACE (&nbsp;, &NonBreakingSpace; · NBSP)
AWB must comply with Unicode and HTML parlance. I do not see why an automated (prewritten AWB) es is allowed to be out of touch. -DePiep (talk) 23:05, 26 July 2014 (UTC)Reply
DePiep I am open to suggestions for a better es. -- Magioladitis (talk) 23:35, 26 July 2014 (UTC)Reply
DePiep, Magioladitis. I left a message on Magioladitis' talk page where this mess got started. This isn't a Checkwiki problem. I've been reminded multiple times lately that there is no "must" on Wikipedia. Also, there is no automated AWB summary except for changing the spelling of a word. Magioladitis' edit summaries needed work at the beginning, but this has turned into a lame edit war where both of you should stop and be able to use either word. THEY BOTH MEAN THE SAME THING. Bgwhite (talk) 00:00, 27 July 2014 (UTC)Reply
(edit conflict) re Magioladitis: Wellllllll, then stop using words like 'hard space' and 'invisible space'. Start using Unicode names I already gave you. And, maybe you could es like: "replace entity &nbsp; for character [NBSP]" - if that is what you mean (because I still don't understand these edits). -DePiep (talk) 00:04, 27 July 2014 (UTC)Reply
DePiep thanks I am going to use this! -- Magioladitis (talk) 00:05, 27 July 2014 (UTC)Reply
Thanks, Magioladitis. This positive reply took the fire out of my attitude ;-). Looking forward to your next edits, I will reduce my watchlist. -DePiep (talk) 00:11, 27 July 2014 (UTC)Reply
DePiep No problem. Thanks for the feedback. This is what I said I need from the very first moment. It's difficult to please everyone. -- Magioladitis (talk) 00:13, 27 July 2014 (UTC)Reply

A page not updated?

  Resolved

Hi, I usually fix the ISBN codes listed on the itwiki high-priority page. The previous page on the Toolserver was updated daily, but this new page does not seem to be. Am I wrong? Or....? Thanks. --Er Cicero (talk) 21:38, 6 August 2014 (UTC)Reply

Er Cicero, normally itwiki would be updated twice a month from dump files. However, since mid-June, the dump files have stopped updating due to the file system being full. See #update arwiki for more info. A new itwiki dump should be generated in the next few days. I'll run that manually to get itwiki updated. Bgwhite (talk) 22:46, 6 August 2014 (UTC)Reply
Bgwhite, many thanks for your explanation and for your work. Regards! --Er Cicero (talk) 23:31, 6 August 2014 (UTC)Reply

Showing ISBN errors to other editors

Hi,

Don't worry, not a request for more work to do, just an announcement to make. I'm happy to announce WPCleaner v1.32, with the main addition being the ability to add/update/remove a warning about ISBN errors (#70, #71, #72, #73) on article talk page. This can work either on a given article (from the full analysis window), or on a big bunch of articles as a bot tool (members of Category:Pages with ISBN errors, articles listed in #70-73, articles with the warning on their talk page).

Some configuration is required before being able to use it on a wiki. I've configured it for frwiki, and used it this weekend :

With the addition of the automatic detection of ISBN errors in cite templates on frwiki, I hope that it will help reduce the number of ISBN errors.

If you wish to configure this for another wiki, please check what WPC is doing on one article before trying the bot tool on a large scale. --NicoV (Talk on frwiki) 21:28, 27 April 2014 (UTC)Reply

And also the possibility to create a list of all ISBN errors: for each invalid ISBN, it gives a list of articles containing it. This allows working on all the articles that contain the same invalid ISBN. I'm currently running WPCleaner to create it for enwiki, you can see an example at frwiki (showing a record of the same invalid ISBN used 297 times). This function requires a lot less configuration (todo templates, and preferably a category for pages with ISBN errors). --NicoV (Talk on frwiki) 20:55, 28 April 2014 (UTC)Reply
List generated... big... but bad rendering... I thought the {{ISBN}} template would create an ISBN, not messages... --NicoV (Talk on frwiki) 21:55, 28 April 2014 (UTC)Reply

Given that I was just working on ISBN errors last night, I feel entitled to spout my two halers worth...

On the page "→ Homepage → enwiki → middle priority → ISBN with wrong length", I wish the table contained an additional indication if the error occurs multiple times in the article. Surely, if the script can find the error once in an article, it can also find the error more than once and tell us rather than hoarding such information for itself.
--LukasMatt (talk) 01:48, 29 April 2014 (UTC)Reply

Ok, will add it to the generated list. --NicoV (Talk on frwiki) 06:41, 29 April 2014 (UTC)Reply
List updated: list of all ISBN errors --NicoV (Talk on frwiki) 15:38, 29 April 2014 (UTC)Reply
I'll contact Bgwhite as you suggested. I looked at "list of all ISBN errors"; it's not exactly what I had in mind for my first request. Sometimes, in one article, a person will cite the same source 10 times and not use a "ref name". Thus, the same incorrectly formatted ISBN occurs 10 times in the article. I need something in "→ Homepage → enwiki → middle priority → ISBN with wrong length" that tells me "This bad ISBN occurs 10 times in the article".
--LukasMatt (talk) 16:30, 29 April 2014 (UTC)Reply
Lists on Labs only show the first error in each article (no information if the same error is happening several times, or there are other errors), and it's probably not going to change. I would suggest to use a tool that will show how many times each error occurs. WPCleaner does this, AWB probably also.
On frwiki, I configured WPCleaner to be able to put a message on article talk page listing all ISBN errors (see fr:Modèle:Avertissement ISBN). --NicoV (Talk on frwiki) 16:59, 29 April 2014 (UTC)Reply

Thanks, NicoV. One more request, please. In "→ Homepage → enwiki → middle priority → ISBN with wrong length", instead of only showing 25 articles per page, can we have something like

View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500)

--LukasMatt (talk) 12:33, 29 April 2014 (UTC)Reply

This is probably more a request for Bgwhite; I'm only updating WPCleaner, not the scripts that run on WMF Labs (probably the same for the previous request: I can only add the count to the list WPCleaner generates). It's already possible manually by adding &limit=50 to the URL, like https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=frwiki&view=only&id=12&limit=50 --NicoV (Talk on frwiki) 13:43, 29 April 2014 (UTC)Reply
Yep, it works. Thanks. (Still, a simple mouse click would be nicer. I'll contact Bgwhite.)
--LukasMatt (talk) 16:30, 29 April 2014 (UTC)Reply
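
For anyone scripting against the list pages, the URL with the `limit` parameter mentioned above can be built like this. The parameter names (`project`, `view`, `id`, `limit`) are the ones visible in the URL NicoV quoted; the helper itself is just an example and assumes no other parameters are needed.

```python
from urllib.parse import urlencode

BASE = "https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi"


def list_url(project, error_id, limit=None):
    """Build the URL of an error list, optionally with a page size."""
    params = {"project": project, "view": "only", "id": error_id}
    if limit is not None:
        params["limit"] = limit
    return f"{BASE}?{urlencode(params)}"
```

`list_url("frwiki", 12, 50)` gives back exactly the URL quoted in the thread.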

"List of all ISBN errors" is not going to happen. That information isn't stored in the database by design.
As for "View (previous 50) (next 50)", that is a good idea. Will add it to the list of things to do. Bgwhite (talk) 16:48, 29 April 2014 (UTC)Reply

@NicoV: I am very interested in this feature, thanks for it! Will be working on assimilating this with cswiki. Matěj Suchánek (talk | cont.) 15:06, 30 April 2014 (UTC)Reply

Happy to know that it's going to be used on another wiki. Keep me posted! --NicoV (Talk on frwiki) 09:35, 2 May 2014 (UTC)Reply
@Matěj Suchánek: Any luck using it with cswiki? The page containing the list of ISBN errors can now be updated automatically by WPCleaner (see frwiki). --NicoV (Talk on frwiki) 13:27, 6 May 2014 (UTC)Reply
@NicoV: wikt:dočkej času, jako husa klasu... actually, I have already created the template and updated some configuration, so it only depends on when I start using this feature or when someone finds this feature since I didn't write anywhere about it. Matěj Suchánek (talk | cont.) 17:21, 7 May 2014 (UTC)Reply
Ok, no rush ;-) Luckily, the only thing that is done completely automatically is updating the warning (but not creating it) when you save a page where you fixed some ISBN errors, so nothing should happen before someone tries to use it. --NicoV (Talk on frwiki) 17:47, 7 May 2014 (UTC)Reply

Showing more than 25 articles

  Done

Copied from the section "Showing ISBN errors to other editors"

Thanks, NicoV. One more request, please. In "→ Homepage → enwiki → middle priority → ISBN with wrong length", instead of only showing 25 articles per page, can we have something like

View (previous 50) (next 50) (20 | 50 | 100 | 250 | 500)

--LukasMatt (talk) 12:33, 29 April 2014 (UTC)Reply

"List of all ISBN errors" is not going to happen. That information isn't stored in the database by design.
As for "View (previous 50) (next 50)", that is a good idea. Will add it to the list of things to do. Bgwhite (talk) 16:48, 29 April 2014 (UTC)Reply
LukasMatt, Done Bgwhite (talk) 06:38, 18 May 2014 (UTC)Reply
I just noticed it. Sweet! Thanks. --LukasMatt (talk) 15:41, 21 May 2014 (UTC)Reply

Bgwhite, would it be possible to do the same for the list of "done" articles ? Thanks --NicoV (Talk on frwiki) 09:43, 25 May 2014 (UTC)Reply

NicoV Done Bgwhite (talk) 07:37, 22 August 2014 (UTC)Reply

Problem with special character

  Done

Moin Moin @Bgwhite:, since today there is a problem with "more" for every ID. If an article title has a special character, you can't open "more". If there is no special character, there is no problem. Tip: Is this a bug from #Homepage → enwiki? Regards --Crazy1880 (talk) 08:41, 10 May 2014 (UTC)Reply

Crazy1880, could you give me a link where you see it because I can't find it. It would not be related to the previous feature addition. Different parts of the code. Bgwhite (talk) 06:53, 11 May 2014 (UTC)Reply
Moin Moin Bgwhite, I looked into this problem some more. I normally use Opera, but yesterday I used IE. This morning I used Opera and saw no problem. So I tried IE 11 too, and there it is.
  • Link one: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Ahmed Sékou Touré
  • Link two: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Air Livonia
It seems that the underlines at special characters link at "title" are the riddle solution. Regards --Crazy1880 (talk) 09:20, 11 May 2014 (UTC)Reply
Crazy1880, well that is strange. It works fine in Chrome and Firefox, but dies in IE. The edit and Article columns work fine in all browsers. I don't want to test the done column. I'll look at the code to see if it does anything different between the columns. Otherwise, I'll need to get an expert on IE. Bgwhite (talk) 05:12, 12 May 2014 (UTC)Reply
Crazy1880, with the help of Redrose64, the problem is now fixed. Bgwhite (talk) 05:58, 15 May 2014 (UTC)Reply
Moin, thank you Bgwhite and Redrose64. Regards --Crazy1880 (talk) 17:22, 15 May 2014 (UTC)Reply


Moin Moin and sorry Bgwhite and Redrose64, but the problem is not solved. Now I have the problem in every browser: under "more", when there is a special character, you can't click on "done" to mark the article as done.

  • Link one: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Al-Qusayr,%20Syria
  • Link two: //tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=enwiki&view=detail&title=Air%20Command%20Tandem

And in IE there is still the problem that I am not able to open "more" for articles with a special character. Please check again, thanks --Crazy1880 (talk) 05:43, 16 May 2014 (UTC)Reply

Crazy1880, it is the same exact problem, but in a different part of the code. I'll get to within the next hour. Bgwhite (talk) 05:48, 16 May 2014 (UTC)Reply
First part is fixed. Could you give me an example link for the second (IE) part. Bgwhite (talk) 06:03, 16 May 2014 (UTC)Reply
Moin Bgwhite, here is the link to the English CheckWikipedia; see the article "Ahmed Sékou Touré" or "Ajumako/Enyan/Essiam District". Regards --Crazy1880 (talk) 16:56, 16 May 2014 (UTC)Reply
Crazy1880, it does work for me with IE. I'm using IE 11 and I have a feeling you are using another version. What version are you using? Bgwhite (talk) 00:17, 17 May 2014 (UTC)Reply
Moin Bgwhite, true, I use multiple versions of Internet Explorer in my work, but primarily the FF and Opera. I now looked again to the problem and I found that in my version of IE now everything looks ok. Thanks. --Crazy1880 (talk) 14:40, 17 May 2014 (UTC)Reply
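
The thread does not say what the actual fix was. One common cause of browser-dependent failures like this is a title placed in a URL without percent-encoding, so as a hedged illustration (not the checkwiki.cgi code, which is not written in Python), this is how a title can be encoded safely for the `title` parameter seen in the links above:

```python
from urllib.parse import quote

BASE = "https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi"


def detail_url(project: str, title: str) -> str:
    """Build a detail-view URL with the article title percent-encoded as UTF-8."""
    return f"{BASE}?project={project}&view=detail&title={quote(title, safe='')}"
```

Encoding on the server side means the result no longer depends on how each browser treats spaces, accents or slashes in a raw link.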

False positives for #94

Hi, it seems that false positives are detected when the closing ref tag is </ref > (with the space at the end). For Spaghettification, CheckWiki reports the error being at <ref> pour une corde du même type de 8 m. --NicoV (Talk on frwiki) 05:27, 10 July 2014 (UTC)Reply

NicoV I just fixed them. -- Magioladitis (talk) 06:18, 10 July 2014 (UTC)Reply
I am very happy. I have forgotten this was a mistake some people do. I just fixed 17 pages in the English Wikipedia. -- Magioladitis (talk) 06:40, 10 July 2014 (UTC)Reply
This is done by design. Yea, it is minor, but fixable. Besides it makes Magioladitis happy. Bgwhite (talk) 07:40, 10 July 2014 (UTC)Reply

I did not remember that, but AWB fixes the spacing inside the closing ref tag! -- Magioladitis (talk) 07:52, 10 July 2014 (UTC)Reply

False positive for #94 ?

Hi, on frwiki, fr:Fièvre hémorragique Ebola is detected with the following notice </ref>. | width = 225 | icd1. The notice is related to text in the infobox, but I don't see any problem there: there's an opening ref tag before. --NicoV (Talk on frwiki) 16:36, 22 July 2014 (UTC)Reply

NicoV check now. I fixed some spacing. -- Magioladitis (talk) 19:01, 22 July 2014 (UTC)Reply
Bgwhite, Magioladitis, you both modified the article to remove carriage return inside the refs text, but I don't think that should trigger #94. --NicoV (Talk on frwiki) 19:10, 22 July 2014 (UTC)Reply
Magioladitis, NicoV, it isn't fixed. I was thinking a hidden character might be the problem, so I re-typed out the ref. But, that wasn't the problem. Bgwhite (talk) 20:25, 22 July 2014 (UTC)Reply

Hi Bgwhite, fr:Fièvre hémorragique Ebola is popping up almost daily, and there's also a false positive with fr:Multiplicateur de tension, with the following notice <ref name="yuan">{{Harvnb|Yuan|2010|pp=1, where I don't see any problem. --NicoV (Talk on frwiki) 09:36, 8 August 2014 (UTC)Reply

NicoV. It isn't a false positive, but checkwiki is showing the wrong location. Ref names should not contain < or >.
In Fièvre hémorragique Ebola, the error was at: <ref name="10.1002/(SICI)1096-9071(199911)59:3<341::AID-JMV14">. I removed the offending <. Now for the sad part. AWB did pick up the error and the correct spot. Crap.
For Multiplicateur de tension, it is showing the correct spot, but it is the space before > that is issuing the error. </ref > should be </ref>. This was talked about a few months back. Bgwhite (talk) 06:14, 9 August 2014 (UTC)Reply
Ok, thanks, I will try to add this to WPCleaner. --NicoV (Talk on frwiki) 08:59, 9 August 2014 (UTC)Reply
Forgot to say that it's added in WPCleaner. --NicoV (Talk on frwiki) 08:01, 22 August 2014 (UTC)Reply
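
A rough sketch of the two patterns discussed in this thread (a space before > in the closing ref tag, and a ref name containing < or >). This is only an illustration in Python, not the CheckWiki, AWB or WPCleaner code; the function name find_ref_tag_problems and both regular expressions are assumptions made for the example.

```python
import re

# Closing ref tag with whitespace before ">", e.g. "</ref >".
CLOSING_REF_WITH_SPACE = re.compile(r"</ref\s+>", re.IGNORECASE)

# Opening ref tag with a double-quoted name; the name is captured in group 1.
REF_NAME = re.compile(r'<ref\s+name\s*=\s*"([^"]*)"', re.IGNORECASE)


def find_ref_tag_problems(wikitext):
    """Return sorted (position, description) pairs for the two patterns above."""
    problems = []
    for match in CLOSING_REF_WITH_SPACE.finditer(wikitext):
        problems.append((match.start(), "space before > in closing ref tag"))
    for match in REF_NAME.finditer(wikitext):
        name = match.group(1)
        if "<" in name or ">" in name:
            problems.append((match.start(), "ref name contains < or >"))
    return sorted(problems)
```

Reporting the position of the offending tag itself, rather than of the following text, is the point of returning match.start() here; whether the real tools can do the same depends on how they parse the page.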

Several main pages...

Hi, I just found out that there were several Check Wiki main pages:

--NicoV (Talk on frwiki) 08:13, 14 August 2014 (UTC)Reply

Encoding problem when clicking on Done

  Done

Hi, when clicking on "Done", the list is displayed again and at the beginning of the page, there's the name of the article that has been marked as done. If this name contains accented characters, they are badly displayed. For example, in the list for #96, I clicked on Done for Liste des députés de la treizième législature par circonscription, the page is displayed with "Liste des députés de la treizième législature par circonscription" just after the Check Wikipedia title. --NicoV (Talk on frwiki) 12:09, 19 August 2014 (UTC)Reply

NicoV. The page name displayed with bad characters was a print statement I had in for debugging. It has been removed. However, that reminded me that if an article title had a quote character, pressing done would do nothing. That is now fixed. Bgwhite (talk) 22:04, 22 August 2014 (UTC)Reply
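
To illustrate why titles with accents or quote characters need care when they travel through a URL, here is a minimal Python sketch. It is not how the CheckWiki script handles titles (that code is not shown in this thread); the helper names encode_title, decode_title and mojibake are made up for the example.

```python
from urllib.parse import quote, unquote


def encode_title(title):
    """Percent-encode a page title as UTF-8 so quotes and accents survive in a URL."""
    return quote(title, safe="")


def decode_title(encoded):
    """Reverse encode_title, decoding the percent-encoded bytes as UTF-8."""
    return unquote(encoded, encoding="utf-8")


def mojibake(title):
    """Show what a title looks like when its UTF-8 bytes are misread as Latin-1."""
    return title.encode("utf-8").decode("latin-1")
```

A title round-trips cleanly through encode_title and decode_title, while mojibake shows the kind of garbling that appears when the bytes are printed with the wrong encoding.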

Improvement for #25 notice

  Done

Hi, a suggestion for a prettier notice for #25 errors: instead of displaying a <br> between the two titles, maybe put a real line break so that the two titles are one above the other. Just a suggestion to have a better display. --NicoV (Talk on frwiki) 22:00, 20 August 2014 (UTC)Reply

NicoV Done Bgwhite (talk) 07:35, 22 August 2014 (UTC)Reply

Software Error Check Wikipedia

  Done

Moin Moin Bgwhite, this morning I tried to open Check Wikipedia and got the following message: Cloud not connect to database: Host '10.68.17.174' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'. Could you have a look at it? Thanks --Crazy1880 (talk) 04:58, 21 August 2014 (UTC)Reply

Crazy1880, WMFLabs database went down about 1/2 hour ago. Nothing I can do on my end. Also, the dump directory has been down for almost two months, which is the reason for no updates. Bgwhite (talk) 05:04, 21 August 2014 (UTC)Reply
Bgwhite, yes, I heard about this and I saw the bugzilla alert from user Merlissimo, who uses this for his bot MerlBot. He has the same problems. Thanks and kind regards. --Crazy1880 (talk) 06:45, 21 August 2014 (UTC)Reply

Down again... --NicoV (Talk on frwiki) 07:11, 23 August 2014 (UTC)Reply

Please stop fixing things that aren't broken, and breaking things that work

This edit [8] breaks the formatting, because (contrary to popular belief) a blank line is not always equivalent to <p>. Please fix your tools to operate only where you understand the effects of what you're doing and, ideally, stop "fixing" things that aren't broken in pursuit of some perfectionist ideal of what markup should look like. Thanks. EEng (talk) 00:53, 8 August 2014 (UTC)Reply

I think it would be great if at least as much attention was given to not breaking things as is given to fixing not-broken things. Could I please have a response on this? EEng (talk) 13:08, 22 August 2014 (UTC)Reply
EEng, This edit has been made manually by Sfan00 IMG, not automatically by any tool. --NicoV (Talk on frwiki) 13:12, 22 August 2014 (UTC)Reply
Then why does the edit summary say WPCleaner v1.33 - Fixed using WP:WCW, with a link to this very page? EEng (talk) 13:14, 22 August 2014 (UTC)Reply
Hi EEng. Sfan00 IMG was using WPCleaner as the tool for editing. WPCleaner detects the same things that WP:WCW, and shows to the user what it has detected: in this case, as enwiki WP:WCW is configured to detect use of <p>, WPCleaner highlighted the <p> in the text. Then, the user decided to remove it. At the end, WPCleaner knew that there was a <p> in the original version, and that <p> has been removed, so it suggested an automatic comment. --NicoV (Talk on frwiki) 13:20, 22 August 2014 (UTC)Reply
OK, we're making some progress. So please tell me: why does WCW highlight < p>? EEng (talk) 13:29, 22 August 2014 (UTC)Reply
Technically, because error #39 (HTML text style element <p>) is activated in WCW configuration file. --NicoV (Talk on frwiki) 14:35, 22 August 2014 (UTC)Reply
What purpose is served by activating it? Please answer in terms of how articles are improved by highlighting < p>, not in terms of the mechanisms of operation of these tools. EEng (talk) 15:33, 22 August 2014 (UTC)Reply
We've been thru this before. You do not like anything about Checkwiki. You've told us to fuck off. You've called us MOS Nazis. We show where in MOS, but you've used MOS is just a guideline/policy and IAR. The funny thing is, one of the reasons Phineas Gage is not a GA is because of your idiosyncratic formatting. The very thing we've been preaching is one of things holding back your GA nomination. Eleanor Elkins Widener is already on the whitelist and won't be checked for <p> again. Bgwhite (talk) 17:35, 22 August 2014 (UTC)Reply

Errors #72 and #73 "fixed" by WPC??

  Resolved

Hello. I've spent some time fixing ISBN errors and came here as a result of the relocation of Wikipedia:WikiProject_Check_Wikipedia/ISBN_errors. Looking at Wikipedia:WikiProject_Check_Wikipedia/List_of_errors I'm a bit worried to see "ISBN with wrong checksum" marked as "Fixed in all cases" by WPC. This sounds like a tool "fixing" ISBNs that fail the checksum test by blindly applying a recalculated checksum. I would expect this to be the wrong action about 90% of the time. Hopefully I've misunderstood. Could someone please clarify what is actually going on?TuxLibNit (talk) 19:10, 30 August 2014 (UTC)Reply

TuxLibNit, this is more of a question for WPCleaner. NicoV is the one to ask. He is either on vacation or in the middle of the ocean for the next week. So, give him a bit before he responds. Bgwhite (talk) 21:07, 30 August 2014 (UTC)Reply
TuxLibNit, no need to worry, it's just the list of errors that has incorrect information. WPCleaner detects ISBN problems and gives some suggestions, but doesn't fix anything by itself for these errors. --NicoV (Talk on frwiki) 10:48, 31 August 2014 (UTC)Reply
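
For readers unfamiliar with the checksum test mentioned above, here is a small Python sketch of the standard ISBN-13 check digit calculation. It only reports a mismatch and never rewrites the number, in line with the concern raised in this thread. The function names and the returned dictionary are assumptions for the example; this is not WPCleaner's code, and ISBN-10 is not handled.

```python
def isbn13_check_digit(first_twelve):
    """Compute the ISBN-13 check digit: weights alternate 1, 3 over the first 12 digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first_twelve))
    return str((10 - total % 10) % 10)


def check_isbn13(isbn):
    """Report whether an ISBN-13 passes the checksum test; the input is never modified."""
    digits = [c for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return {"isbn": isbn, "valid": False, "expected_check_digit": None}
    expected = isbn13_check_digit(digits[:12])
    return {"isbn": isbn, "valid": digits[12] == expected, "expected_check_digit": expected}
```

A failed check only says that at least one of the 13 digits is wrong. The recalculated check digit is a hint for a human editor, not a safe automatic replacement, because the typo may be in any position.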

fa.wikipedia

  Done

Would you please activate the fa translation? I want to start translating this tool into Farsi but it doesn't have any page for Farsi. Yamaha5 (talk) 05:26, 11 July 2014 (UTC)Reply

Yamaha5, so you are the poor sucker that Ladsgroup rounded up. :)
If you want to set up the Persian Checkwiki, you need to create a translation file. If you go to here and click on any language, there will be a translation file towards the top. Arabic, French, German, Swedish, Czech, Slovenian, Slovak, Greek and English translation files are the ones being actively updated. So, it is best to use one of those as a template. Place it somewhere on fawiki and tell me the location. This way, fawiki is in control of what errors should be checked. For example, some errors are only applicable to Latin script.
There are sections in the translation file for a whitelist (which articles create a false positive) and templates. Every wiki has its own names for templates.
WPCleaner also uses the same file for its use. If you set up the translation file, WPCleaner can be used on fawiki. Towards the end of the file, errors #500 and above are WPCleaner only. Everything else is WPCleaner and CheckWiki. Bgwhite (talk) 05:49, 11 July 2014 (UTC)Reply
Thank you for your fast answer :)
I made fa:ویکی‌پدیا:ویکی‌پروژه تصحیح ویکی‌پدیا/ترجمه and I will start translating. Yamaha5 (talk) 05:58, 11 July 2014 (UTC)Reply
Hi Yamaha5, I've added fawiki to WPCleaner if you're interested. WPCleaner configuration is available at fa:کاربر:NicoV/WikiCleanerConfiguration. --NicoV (Talk on frwiki) 21:15, 13 July 2014 (UTC)Reply
NicoV Thank you for your edit. Yamaha5 (talk) 22:31, 13 July 2014 (UTC)Reply

Error #5 issues

  Resolved

It seems that people keep trying to correct this error on an article I've formatted that intentionally uses an HTML quirk to have one end tag closing off two start tags so one of the start tags can be removed at a later date to display some other text (effectively <!-- foo <!-- bar -->). People keep closing off the first tag at the wrong point because it appears to be unpaired when HTML ignores any open tags in between a pair of tags. The results are here, where if you scroll down to the bottom you see that content that would have been hidden is now displayed because of the "correction". I am tired of having to re-fix these pages because people use semi-automated tools to correct this false positive. I've even had to put "There is no need for another closing comment tag" into the hidden text to jump out at people who constantly break the page but no one notices.—Ryūlóng (琉竜) 14:14, 29 August 2014 (UTC)Reply

Ryulong, we/you can add the article to a whitelist, so it won't be checked for #5 problems. When the series is over, then the article can be removed from the whitelist. Bgwhite (talk) 21:21, 29 August 2014 (UTC)Reply
These shows go on for about a year, and then a new show comes on in its place. Will I have to be doing this constantly?—Ryūlóng (琉竜) 21:26, 29 August 2014 (UTC)Reply
Ryulong Unless you see a different route. As you say, you are using an HTML quirk. The whitelist was setup to bypass false-positives and pages that are doing something "wrong", but need to in order to accomplish something. Bgwhite (talk) 06:12, 30 August 2014 (UTC)Reply
No automated tool is used to fix the comment tags anyway. Automated tools are used to spot the page. The rest is editors' actions. -- Magioladitis (talk) 06:13, 30 August 2014 (UTC)Reply
Maybe it's not a quirk but an exploit. But it seems editors keep fixing this despite the fact I have a message in the text informing them of the exploit that they ignore anyway.—Ryūlóng (琉竜) 06:19, 30 August 2014 (UTC)Reply
Ryulong to be honest I also think it's not a nice format. I just could not be bothered more and you do a lot of work on these pages and I did not want to distract you more. I have thought of other alternatives to suggest you like keeping the example piece of code in a different place and copy pasting, etc. But I am not sure if you are interested in this kind of solution. Not everyone reads hidden comments I guess. -- Magioladitis (talk) 06:28, 30 August 2014 (UTC)Reply
And yet the last person who made the change put the closing tag right next to the warning about how it isn't needed.—Ryūlóng (琉竜) 06:29, 30 August 2014 (UTC)Reply
Ryulong I just setup the whitelist for this error and added the article to it. Whitelist is at Wikipedia:WikiProject Check Wikipedia/Error 005 whitelist. Feel free to add/delete your articles to it or bug us about it. Bgwhite (talk) 06:33, 30 August 2014 (UTC)Reply
All right. I won't have anything to add to it for the next month it seems (a new show has been announced but there's no episode list for it yet obviously).—Ryūlóng (琉竜) 06:43, 30 August 2014 (UTC)Reply
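
The disagreement above comes down to two ways of looking at comment markers. The Python sketch below contrasts them and adds a whitelist lookup. It is an illustration only: how CheckWiki actually implements #5 is not shown in this thread, and the function names, the count-based check and the whitelist-as-a-set are all assumptions.

```python
def has_unclosed_comment(wikitext):
    """Scan like an HTML parser: a comment runs from '<!--' to the next '-->'."""
    pos = 0
    while True:
        start = wikitext.find("<!--", pos)
        if start == -1:
            return False
        end = wikitext.find("-->", start + 4)
        if end == -1:
            return True  # an opener with no closer anywhere after it
        pos = end + 3


def looks_unpaired(wikitext):
    """Naive count-based check: flags nested openers even though they render fine."""
    return wikitext.count("<!--") != wikitext.count("-->")


def should_report(title, wikitext, whitelist):
    """Report the count-based finding unless the page title is on the whitelist."""
    return title not in whitelist and looks_unpaired(wikitext)
```

For text such as `<!-- foo <!-- bar -->`, the parser-style scan sees one complete comment, while the count-based check sees two openers and one closer. A whitelist entry is what keeps such a page from being reported again.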

update arwiki

  Resolved

Please update the arwiki Last scanned dump 2014-04-07 (80 days old). --Zaher talk 23:19, 26 June 2014 (UTC)Reply

Zaher, the good news is that the daily update is still running, so new errors in articles are being caught. Looking at the logs, it appears that a page is so badly borked that it causes the checkwiki program to die. This does happen every once in a while. Last happened with svwiki around 8 months ago. I'll have to work on this on my home computer to find the article... it's not easy to find. I'll try and have the majority of a dump processed and up on the webpage by this weekend. Bgwhite (talk) 00:03, 27 June 2014 (UTC)Reply
@Magioladitis: Zaher. If you look at all of the languages, you would see that none of them are updating. WMFLabs' disk space for the dump files is full and they are currently not doing anything about it.
  • Me reporting the problem: T67909
  • Others reporting the problem: T66362
  • Them saying it is known and will be fixed soon (July 11): T48894

Bgwhite (talk) 20:58, 4 August 2014 (UTC)Reply
Thanks for the clarification and for your efforts. --Zaher talk 17:44, 5 August 2014 (UTC)Reply

Error #55

  Done

Hi! I can't find where are double small tags here. There are 90k entries so I thought it's something in a template but I haven't found anything. Thanks for your help! --AlessioMela (talk) 08:40, 1 July 2014 (UTC)Reply

AlessioMela, you are correct. I didn't see anything either. There is also something fishy with the links, as they go to the main page and not to the article. I will look into what is wrong. Bgwhite (talk) 22:45, 1 July 2014 (UTC)Reply

Parsoid Based Linter

People here might be interested in the thread Wikipedia:Village_pump_(technical)#Parsoid_Based_Linter.--Salix alba (talk): 02:38, 9 July 2014 (UTC)Reply

Now archived at Wikipedia:Village pump (technical)/Archive 128#Parsoid Based Linter. EdJohnston (talk) 23:08, 18 July 2014 (UTC)Reply

AWB logic improvements

  • rev 10273 Double quotation marks covered (errors 6 and 37)
  • rev 10296 A first try to expand MultipleHttp fixing inside url templates (error 93)
  • rev 10301, rev 10302 Fix for lj and nj in sortkey (errors 6 and 37)
  • rev 10319 moves punctuation in more cases. (error 61)
  • rev 10334 move refs after question and exclamation mark (error 61)
  • rev 10390 recognises more footnotes (error 61)
  • rev 10417 expands FixReferenceTags (error 94)

-- Magioladitis (talk) 20:47, 19 August 2014 (UTC)Reply

False positive for #92

Hi, fr:Élément meta is reported by #92 with the notice "=== L'attribut ===". It seems that it's because there are several titles in the form L'attribut <code>something</code>. I think contents of <code>...</code> should be kept for analyzing #92. --NicoV (Talk on frwiki) 10:36, 14 August 2014 (UTC)Reply

NicoV, I'm not sure how to get around this. I've got headings inside code tags. Not sure how to remove one without the other. Bgwhite (talk) 21:34, 20 August 2014 (UTC)Reply
Bgwhite Ok, seems difficult. Throwing out an idea: keep the text inside the code tags, but somehow encode it internally so that it doesn't look like other things (base 64, ...). Not sure. If it's too difficult, forget about it, we'll end up using the whitelist. --NicoV (Talk on frwiki) 21:57, 20 August 2014 (UTC)Reply
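
The masking idea floated above could look roughly like this in Python. It is a sketch of the idea only, not CheckWiki code, and the names mask_code and unmask_code are invented. Hex is used here instead of base 64 because base 64 padding uses "=", which could itself be mistaken for heading markup.

```python
import re

# Matches <code>...</code>, keeping the tags and the content in separate groups.
CODE_TAG = re.compile(r"(<code>)(.*?)(</code>)", re.DOTALL | re.IGNORECASE)


def mask_code(wikitext):
    """Replace the content of each <code> element with its hex encoding."""
    def _mask(match):
        encoded = match.group(2).encode("utf-8").hex()
        return match.group(1) + encoded + match.group(3)
    return CODE_TAG.sub(_mask, wikitext)


def unmask_code(masked):
    """Undo mask_code; only meant to be applied to text produced by mask_code."""
    def _unmask(match):
        decoded = bytes.fromhex(match.group(2)).decode("utf-8")
        return match.group(1) + decoded + match.group(3)
    return CODE_TAG.sub(_unmask, masked)
```

After masking, two headings that differ only inside their code tags stay different, and nothing inside the tags can be mistaken for wiki markup while the checks run.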

frwikiversity

  Done

Hi, I saw in CW main page that for frwikiversity links to project page and translation page are pointing to frwiki. There's a project page and a translation page, but I'm not sure if they're correct (I will try to update the translation page using what's in frwiki). --NicoV (Talk on frwiki) 09:21, 25 August 2014 (UTC)Reply

I've updated the translation page based on frwiki. It should be a good start. --NicoV (Talk on frwiki) 09:52, 25 August 2014 (UTC)Reply

Again problems with error 55 (itwiki)

  Done

Hi, like in the past update I can't find double tag small in those 90k articles. --AlessioMela (talk) 17:54, 26 August 2014 (UTC)Reply

AlessioMela, this one is a bugger because I can't reproduce it. Plus, it is only happening on some languages. I've made some changes to the logic of finding actual errors. Hopefully that fixes it. Bgwhite (talk) 23:09, 2 September 2014 (UTC)Reply
Thanks! I keep my fingers crossed ;-) --AlessioMela (talk) 09:56, 3 September 2014 (UTC)Reply

Interview for The Signpost

The WikiProject Report would like to focus on WikiProject Check Wikipedia for a Signpost article. This is an excellent opportunity to draw attention to your efforts and attract new members to the project. Would you be willing to participate in an interview? If so, here are the questions for the interview. Just add your response below each question and feel free to skip any questions that you don't feel comfortable answering. Multiple editors will have an opportunity to respond to the interview questions, so be sure to sign your answers. If you know anyone else who would like to participate in the interview, please share this with them. Thanks, Rcsprinter123 (constabulary) @ 08:38, 29 August 2014 (UTC)Reply

@Bgwhite: Be sure to mention me, or else!!! [[=P}}  (speaking of CHECKWIKI-errors) (tJosve05a (c) 13:08, 29 August 2014 (UTC)Reply

Error #31

Discussion in User_talk:Frietjes#Infoboxes_to_take_of revealed that most probably Error #31 needs expansion to cover more HTML table tags. -- Magioladitis (talk) 22:45, 31 May 2014 (UTC)Reply

@Frietjes and Magioladitis:. #31 only checks for the case of <table. There are legitimate cases where <td> can be used. Will first check the upcoming June dump file to see the lay of the land for tr and td tags. Bgwhite (talk) 06:47, 1 June 2014 (UTC)Reply
@Frietjes and Magioladitis:, I've added checking for <tr>. I do expect articles to go onto the whitelist. A listing of articles can be found at User:Bgwhite/Sandbox1. Bgwhite (talk) 00:25, 16 September 2014 (UTC)Reply

New error type

Hello! I'd like to propose detecting a new error type: sometimes in-page interlanguage links are written as regular interlanguage links, i.e. without a starting colon. But they are obviously in-page links since they contain a pipe symbol. For example, this situation occurred on the page 男同性恋免疫缺乏症 on the Chinese Wikipedia (I don't know of such examples on en.wiki), which contained two such links: [[en:Kaposi's sarcoma|卡波西氏肉瘤]] and [[en:Pneumocystis pneumonia|卡氏肺囊虫肺炎]]. The part of the link after the pipe symbol is obviously useless for regular interwikis, so this situation is undoubtedly an error. --Emaus (talk) 14:35, 2 June 2014 (UTC)Reply

Emaus @Magioladitis:. In theory, error #51, interwiki before last heading, should catch these situations. Since interwiki use should be minimal now, renaming this error would be a good thing. Maybe "interlanguage link with incorrect syntax"? Bgwhite (talk) 20:12, 2 June 2014 (UTC)Reply
@Bgwhite and Emaus: AWB will react by moving the interwiki at the bottom unless the interwiki matches the project code. -- Magioladitis (talk) 08:05, 3 June 2014 (UTC)Reply
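
A possible way to spot the links described in this proposal, sketched in Python. The regular expression, the function name and the idea of passing in a set of known language codes are assumptions for the example; neither CheckWiki nor AWB is known to work this way.

```python
import re

# [[xx:Target|label]] with no leading colon; group 1 is the prefix before the first colon.
PIPED_PREFIXED_LINK = re.compile(
    r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\[\]|]+)\|([^\[\]]*)\]\]"
)


def find_piped_interlanguage_links(wikitext, language_codes):
    """Return links that use an interlanguage prefix without a colon but carry a label."""
    found = []
    for match in PIPED_PREFIXED_LINK.finditer(wikitext):
        if match.group(1) in language_codes:
            found.append(match.group(0))
    return found
```

The language_codes argument matters because a prefix such as a namespace alias can look like a language code. Links that already start with a colon, or that have no pipe, are left alone.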

Error #64

@Bgwhite and NicoV: [[[[foo]]]] is caught as #64 by CHECKWIKI but as #10 by WPCleaner. It is not fixed by AWB. -- Magioladitis (talk) 06:51, 18 June 2014 (UTC)Reply

Hi Magioladitis. What do you think we should do ? I don't see why it's detected as #64 (link equal to link text): do you mean #46 (Square brackets not correct begin)? WPCleaner should detect both #10 and #46. --NicoV (Talk on frwiki) 13:05, 20 June 2014 (UTC)Reply

OK. I am getting rusty. Sorry again. This one shows that AWB did not fix #64, but this is maybe due to the order in which things are done. Same here. -- Magioladitis (talk) 13:14, 20 June 2014 (UTC)Reply

Ok, I understand better, especially with the next modification. Maybe internal link is not correctly recognized by AWB due to the extra brackets? WPCleaner edit seems fine (#10, #46 and #64), except for the automatic comment ("null"...), I have to fix this one. NicoV (Talk on frwiki) 13:28, 20 June 2014 (UTC)Reply

Whitelists not always exclude things

@Bgwhite: After the last dump I realised that the whitelist for #48 never works. Same for the #101 whitelist. -- Magioladitis (talk) 08:09, 18 June 2014 (UTC)Reply

These two were fixed. -- Magioladitis (talk) 09:59, 21 September 2014 (UTC)Reply


@Bgwhite: Error 24 whitelist does not work. -- Magioladitis (talk) 08:46, 21 September 2014 (UTC)Reply

I may have fixed it with this edit. -- Magioladitis (talk) 08:49, 21 September 2014 (UTC)Reply

@Bgwhite: Error 31 and 49 whitelists do not work. -- Magioladitis (talk) 09:59, 21 September 2014 (UTC)Reply

Magioladitis, #49 had the same problem as #24 and I fixed it a few weeks back. #31 and #49 haven't been updated on my computer. I was a little blindsided by the timing of this month's dump and didn't do any updates beforehand. Bgwhite (talk) 22:44, 21 September 2014 (UTC)Reply

Error #48

  Done

We should exclude anything inside timeline tags. -- Magioladitis (talk) 07:10, 19 June 2014 (UTC)Reply

Error #101

  Done

We should exclude search inside {{Not a typo}}. -- Magioladitis (talk) 07:49, 20 June 2014 (UTC)Reply

Analysis of an article

Hi @Bgwhite:, I was wondering if we could enhance the integration between Check Wiki and tools like WPCleaner, by providing access to the direct analysis of an article in Check Wiki: I'd like to be able to send a request to Check Wiki script checkwiki_bots.cgi (with the following parameters: wiki, article title, article text) and receive an answer telling me which errors are still detected and where (character position ?). I don't know how much work that would be on your side, but that could be very helpful to users when WPCleaner doesn't detect the problem CW detected: we would know if CW thinks that the problem is still present and where, so I could tell the user where it is on their current version of the article. --NicoV (Talk on frwiki) 20:01, 10 August 2014 (UTC)Reply

About software

I can see that this wikiproject uses scripts and tools to assist work of the participants. I have a feeling that (usually) routinely done tasks are to be done server-side instead. What wiki software features would ease this work? Gryllida (talk) 04:13, 17 September 2014 (UTC)Reply

Gryllida I don't understand your question, but that isn't unusual for me. Every fix is done by a person or bot. Bots can't do everything. Both AWB and WPCleaner can be used manually or in bot mode. There is a listing of what each tool can or cannot fix. WPCleaner is written in Java, AWB is written in .Net, and Auto-Formatter is JavaScript. Bgwhite (talk) 04:45, 17 September 2014 (UTC)Reply

Comment on the WikiProject X proposal

Hello there! As you may already know, most WikiProjects here on Wikipedia struggle to stay active after they've been founded. I believe there is a lot of potential for WikiProjects to facilitate collaboration across subject areas, so I have submitted a grant proposal with the Wikimedia Foundation for the "WikiProject X" project. WikiProject X will study what makes WikiProjects succeed in retaining editors and then design a prototype WikiProject system that will recruit contributors to WikiProjects and help them run effectively. Please review the proposal here and leave feedback. If you have any questions, you can ask on the proposal page or leave a message on my talk page. Thank you for your time! (Also, sorry about the posting mistake earlier. If someone already moved my message to the talk page, feel free to remove this posting.) Harej (talk) 22:47, 1 October 2014 (UTC)Reply

No errors at plwiki

For the last few days Check Wikipedia reports no errors at all at the Polish Wikipedia. Please have a look. ToSter (talk) 12:47, 16 October 2014 (UTC)Reply

ToSter, probably somebody went thru and marked all the bugs fixed. Happens on enwiki too. A new dump is available every two or so weeks and the errors will get repopulated then.
The bigger problem is the latest plwiki dump came out two days ago and a new checkwiki run wasn't done. Looking around, I found the dump files were not being updated at WMFLabs, again. I've filed a bug report at WMFLabs to have them fix this. Ironically, I got an email this morning saying my last bug report for the same thing was finally closed after a month. I wouldn't have caught this for another week or two, so thanks to you, it will get fixed sooner. Thank you. Bgwhite (talk) 21:52, 16 October 2014 (UTC)Reply
Bgwhite, thanks for the explanation. Good to know that I inspired you to find the error. As for the disappearing errors, wouldn't it be better to scan all the pages which have been lately marked as done too? If all pages which are not marked as done are scanned regularly, that cannot have a great impact on performance. It's simple to click "done" accidentally. And even if it's done on purpose, the script should check it on its own. In my opinion, the distinction between "done" and "not done" should be used solely for the purpose of editors who are fixing the errors - sometimes concurrently. ToSter (talk) 19:16, 21 October 2014 (UTC)Reply
ToSter, a couple of reasons not to do it. 1) New lists are generated every ~15 days (whenever a dump is available). This is a relatively short amount of time. 2) There's only an occasional problem of people blanking errors. 3) A majority of errors are fixed via WPCleaner. It automatically marks done if the error was fixed, so not much of a problem of accidentally hitting done. Bgwhite (talk) 22:40, 21 October 2014 (UTC)Reply
Bgwhite, until now we haven't used WPCleaner at plwiki and I sometimes use pywikipediabot. Could you please describe what checkwiki does exactly on daily basis? I cannot find this documented. ToSter (talk) 07:10, 22 October 2014 (UTC)Reply
ToSter, see Wikipedia:WikiProject Check Wikipedia#Operation
The main programs used for manual and bot fixing are WPCleaner and AWB. There are also some pywikipediabots. WPCleaner does have a Polish translation. Not sure about AWB, but Magioladitis would know. To see what errors these tools fix, look at the List of errors. I know bots have been approved on multiple Wikipedias.
I saw you edited the "Polish Translation" file. Both Checkwiki and WPCleaner use the same file. Anyone can change what errors Checkwiki will and will not look for, also change priority settings. Feel free to change the file. One can add a whitelist and a "template" listing to the translation file (see the English file as an example). The "template" listing can be a listing of whitelisted templates (see #59's listing) or adds templates to check (#61's add templates to check for punctuation after the template). A common template listing to add is for #78 as different language projects have their own reference templates. Bgwhite (talk) 07:54, 22 October 2014 (UTC)Reply
Yes, I have edited the Polish translation file but it seems to have no impact on checkwiki - the labels are still in English or even blank. Bgwhite, could you please have a look? ToSter (talk) 11:11, 24 October 2014 (UTC)Reply
ToSter, I removed the deprecated parameters from the template file. The descriptions that were in English are now in Polish. I did notice one problem, the whitelists. The whitelist parameter should point to a file. There can be a lot of articles on the whitelist, so a file is easier to maintain. Look at the English translation page to see the syntax and also view an English whitelist file to see its syntax.
And yes, I have read the "Operation" section but the point "For a few Wikipedias, the program scans newly revised articles on a daily basis to create a new list for users, omitting already-corrected articles." doesn't say much. Which Wikipedias are these and what does "newly revised" mean? ToSter (talk) 11:13, 24 October 2014 (UTC)Reply
There are six Wikipedias (English, French, German, Spanish, Arabic and Czech) that are updated daily. The first four because they are the largest Wikipedias, the last two because they were requested. Every ten minutes, checkwiki grabs the last 500 articles that were edited. At 0z every day, these articles are checked for problems. In the case of Arabic or Czech, that is probably every article that was edited that day. For the others, because of such a high volume of editing, not every edited article will be checked.
Bgwhite, thanks for the explanation. It would be great if you could enable it also for the Polish Wikipedia - it could be really helpful for us. I will also have a look at the whitelist issue. ToSter (talk) 13:13, 26 October 2014 (UTC)Reply
ToSter, it is enabled. As it is close to 0z, you should notice it tomorrow. Bgwhite (talk) 22:26, 26 October 2014 (UTC)Reply
Bgwhite, thanks, it works! ToSter (talk) 07:31, 27 October 2014 (UTC)Reply
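
The daily update described above (poll the latest edits every ten minutes, check the collected articles once a day) can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the checkwiki program: the class name, its methods and the check_article callback are hypothetical, and fetching recent changes from the wiki is left out.

```python
class RecentEditCollector:
    """Collect recently edited titles between two daily scans."""

    def __init__(self, batch_size=500):
        self.batch_size = batch_size
        self.pending = set()

    def poll(self, recent_titles):
        """Record at most batch_size titles from one poll; any extra titles are dropped."""
        self.pending.update(recent_titles[: self.batch_size])

    def run_daily_scan(self, check_article):
        """Check every collected title once, then start collecting for the next day."""
        results = {title: check_article(title) for title in sorted(self.pending)}
        self.pending.clear()
        return results
```

The cap in poll is why a high-volume wiki can miss edits: if more than batch_size articles are edited between two polls, the extra ones never reach the daily scan. A set is used so that an article edited several times in a day is only checked once.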

Error number 48 title linked in text

I saw a bot correction of a citation I posted the other day, and the edit summary referred me here to the description of error number 48, title linked in text. But the cite template documentation says that the title of a source can be wikilinked to an existing Wikipedia article, as I attempted to do. Did I throw the error with my citation because the span of text wikilinked was not letter-for-letter identical with the title of the book in the template title field? If so, I can fix the problem by setting up a redirect to the article. The citation I put in new articles the other day is shown here (the raw mark-up of this question in edit mode will show exactly how I coded the template).

Flynn, James R. (2009). What Is Intelligence?: Beyond the Flynn Effect (expanded paperback ed.). Cambridge: Cambridge University Press. ISBN 978-0-521-74147-7. {{cite book}}: Unknown parameter |laydate= ignored (help); Unknown parameter |laysummary= ignored (help)

Thanks for any advice you have about this. -- WeijiBaikeBianji (talk, how I edit) 18:06, 8 October 2014 (UTC)Reply

WeijiBaikeBianji, #48 doesn't have anything to do with citations. That was the primary reason the bot arrived at the article. Depending on what bot did the edit, the summary may have contained something like, "Do general fixes and cleanup if needed". The citation edit probably would fall under that. However, it would help if you could give the edit in question. I could give a better answer if I could see what happened. Bgwhite (talk) 22:45, 8 October 2014 (UTC)Reply
Oh, I see. The edit [9] was in the article that is about the book, and thus the removal of the wikilink from the citation template had nothing to do with the format of the template's fields, but everything to do with where the template was inserted. (That means, I guess, that I can still wikilink the book title when I cite the book in other articles on Wikipedia.) Thanks for your reply. -- WeijiBaikeBianji (talk, how I edit) 23:17, 8 October 2014 (UTC)Reply
WeijiBaikeBianji, I think you got it and yes, you can still wikilink the book title in other articles. Bgwhite (talk) 00:59, 9 October 2014 (UTC)Reply
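
The point reached in this thread, that a link is only a problem when it points at the page it sits on, can be sketched in Python as follows. This is not the CheckWiki implementation of #48; the regular expression, the simple title normalisation and the function names are assumptions, and links to a section (with #) are deliberately ignored.

```python
import re

# [[Target]] or [[Target|label]]; targets containing "#" do not match.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:\|[^\[\]]*)?\]\]")


def normalize_title(title):
    """Apply two common title rules: underscores are spaces, first letter is upper case."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def find_self_links(page_title, wikitext):
    """Return the wikilinks in wikitext whose target is the page itself."""
    target = normalize_title(page_title)
    return [
        match.group(0)
        for match in WIKILINK.finditer(wikitext)
        if normalize_title(match.group(1)) == target
    ]
```

The same citation markup therefore gives a hit on the article about the book and no hit on any other article that cites it.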

Error #39

NicoV Magioladitis After looking at some of the articles in a list of #39 errors not fixed by a bot, I've noticed some "false positives". I use quotation marks because it is actually errors with mediawiki that is causing the problem.

Newlines don't function in <blockquote>, {{quote}}, {{cquote}} and {{quotation}}. I have the checkwiki code skip these for error #39. After looking at the new list of articles, <ref>, [[Image: and {{bq}} also don't work.

<skip several hours>

I have the bug bookmarked and brought it up. Lo and behold, the patch that was submitted in December 2011 was finally accepted. Final changes were made today on enwiki. Turns out Visual Editor was assuming newlines worked the same everywhere... silly VE. So, VE started the move to finally fix the problem. Hey, who knew, VE was actually helpful for the first time ever. According to the log, it only took 8 1/2 years to fix.

I've verified that {{quote}}, {{cquote}}, {{quotation}}, <blockquote> and {{bq}} now treat newlines correctly. I've verified that <ref> and [[Image: still barf on newlines.

I need to add the ref and various image tags to #39's code and remove the currently skipped templates in #39's code. Bgwhite (talk) 05:22, 16 October 2013 (UTC)Reply
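
The skipping described above can be sketched roughly as follows. This is a minimal Python illustration, not checkwiki's actual Perl: the function names are hypothetical, and it assumes that only <ref>...</ref> spans and [[Image:/[[File: links need to be ignored when looking for <p> tags.

```python
import re

# Assumption: <p> is tolerated only inside <ref>...</ref> and image links,
# because newlines still do not work there.
REF_RE = re.compile(r"<ref\b[^<>]*(?<!/)>.*?</ref>", re.IGNORECASE | re.DOTALL)
IMAGE_START_RE = re.compile(r"\[\[(?:image|file):", re.IGNORECASE)
P_TAG_RE = re.compile(r"<p\b[^<>]*>", re.IGNORECASE)


def strip_image_links(text):
    """Drop [[Image:...]] / [[File:...]] links, including nested [[links]]."""
    out = []
    i = 0
    while i < len(text):
        if IMAGE_START_RE.match(text, i):
            depth = 0
            j = i
            while j < len(text):
                if text.startswith("[[", j):
                    depth += 1
                    j += 2
                elif text.startswith("]]", j):
                    depth -= 1
                    j += 2
                    if depth == 0:
                        break
                else:
                    j += 1
            i = j  # skip the whole link (or the rest of the text if unclosed)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def has_p_tag_outside_skipped(text):
    """True if a <p> tag remains once refs and image links are removed."""
    cleaned = strip_image_links(REF_RE.sub("", text))
    return P_TAG_RE.search(cleaned) is not None
```
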

Bgwhite this means that now AWB can replace p tags inside blockquote with newlines? -- Magioladitis (talk) 06:12, 16 October 2013 (UTC)Reply
Magioladitis. I'm confused. It doesn't work on Aristotle#Geology, but it does work below.

a

b

c

Bgwhite (talk) 06:42, 16 October 2013 (UTC)Reply
Asked a question at User talk:Kaldari#bug 6200 and quote templates. Bgwhite (talk) 06:55, 16 October 2013 (UTC)Reply
Comment was made at bugzilla bug 6200 about the problem. Also bug 55674 for newlines in ref tags. Bgwhite (talk) 09:05, 16 October 2013 (UTC)Reply
6200 marked as fixed. -- Magioladitis (talk) 14:45, 28 October 2013 (UTC)Reply

We can re-enable search inside <blockquote> since bug fixed. -- Magioladitis (talk) 23:49, 15 September 2014 (UTC)Reply

Ideas for new errors

Time to start thinking about what new errors should be added to Checkwiki.

Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87. I think that is everybody. If not, add them to the list.

What should or should not be added will be determined by several factors:

  1. How easy is it to code up?
  2. Is it something that AWB or WPCleaner already can find?
  3. Is it something that AWB, WPCleaner or a bot can currently fix?
  4. Is it an accessibility issue?
  5. Is it a serious issue? Are the errors on the high or medium lists?
     1. High priority: error corrupts or distorts the content posted in Article
     2. Medium priority: improving the encyclopedic content or readability of the article
     3. Low priority: improving maintenance or MOS fixes

Some examples:

  1. Replacing <strike> with <s>. It would take a copy/paste to code up. WPCleaner finds and fixes the problem. It would be Low priority.
  2. Finding cases of url=http://http:// This is a common error I see. It would be High priority. It is fixed by AWB.
  3. Blank lines in bulleted vertical lists. This is an accessibility issue per Wikipedia:Accessibility#Blocked elements. This causes problems for screen readers.
  4. No blank space after the comma in DEFAULTSORT values. An example would be: Bush,George. The article would be sorted first for all surnames beginning with Bush. Currently not fixed by AWB or WPCleaner. Probably medium priority.

Bgwhite (talk) 01:34, 26 November 2013 (UTC)Reply
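
Two of the examples above (doubled http:// and a DEFAULTSORT with no space after the comma) could be detected along these lines. This is a hedged Python sketch rather than checkwiki's Perl; the pattern names are made up for illustration.

```python
import re

# A comma inside DEFAULTSORT followed directly by a non-space character.
DEFAULTSORT_NO_SPACE_RE = re.compile(
    r"\{\{\s*DEFAULTSORT\s*:[^{}]*?,(?=[^\s}])[^{}]*\}\}"
)
# Two scheme prefixes back to back, e.g. url=http://http://example.com
DOUBLE_HTTP_RE = re.compile(r"https?://\s*https?://", re.IGNORECASE)


def defaultsort_missing_space(text):
    return DEFAULTSORT_NO_SPACE_RE.search(text) is not None


def has_double_http(text):
    return DOUBLE_HTTP_RE.search(text) is not None
```

Archive-style URLs that embed a second URL after a path segment are not flagged, because the two prefixes are not adjacent.
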

How about putting the TOC in the standard position in the wiki-markup, which is also an accessibility issue? Not sure about automating this though. Graham87 01:39, 26 November 2013 (UTC)Reply
Seems that there are lots of new citation style errors, some of which appear in red text in the references section. Those might be something worth exploring. GoingBatty (talk) 02:08, 26 November 2013 (UTC)Reply

A few suggestions:

  1. An error to detect non-existent files (red linked files). We have a bot on Arabic Wikipedia that removes such links. However, the bot works on all pages of the Main namespace. Generating a list of pages for the bot to work on would be a good idea. See Wikipedia:Database reports/Articles containing deleted files.
  2. Detecting user signatures in articles (articles containing links to user pages). To be worked on manually. See Wikipedia:Database reports/Articles containing links to the user space.
  3. Detecting fat redirects (redirects obscuring page content). To be checked manually. See Wikipedia:Database reports/Redirects obscuring page content. --Meno25 (talk) 06:43, 26 November 2013 (UTC)Reply

The errors I suggested are covered by the Database reports on English Wikipedia. Database reports are updated regularly only on enwiki, Commons and Meta. Moving the errors to checkwiki means that the reports would get generated for other wikis too. So, maybe disable those errors for enwiki and enable them for other wikis. --Meno25 (talk) 06:43, 26 November 2013 (UTC)Reply

Meno25 you can request similar databases for other wikis. -- Magioladitis (talk) 09:19, 26 November 2013 (UTC)Reply

CHECKWIKI is more about common syntax errors. We need to focus on that. If lists are already generated by other bots/projects we do not need to duplicate the job. Bgwhite's idea of unspaced DEFAULTSORT is a great example of what we are after. WPC's extended list is another good example. I have some minor suggestions:

I don't know if this is an error or maybe already monitored but:

  1. {{cite web}} without access dates.
  2. When only <ref>http://exemple.com/</ref> is used without title/description. This is to prevent link rot.
  3. When two (or more) refs with the same information have different ref names.
  4. When the time (e.g. 08:45 or 8 am) or the day (e.g. Monday or Saturday) is used inside |accessdate=.

(tJosve05a (c) 11:52, 26 November 2013 (UTC)Reply
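
For illustration, suggestions 2 and 4 above could be approximated with regular expressions as below. This is only a Python sketch with hypothetical names; it assumes English weekday names and the |accessdate= (or |access-date=) parameter spelling.

```python
import re

# A ref whose whole content is a bare URL, with no title or description.
BARE_URL_REF_RE = re.compile(
    r"<ref\b[^<>]*>\s*https?://[^\s<>\[\]]+\s*</ref>", re.IGNORECASE
)
ACCESSDATE_RE = re.compile(r"\|\s*access-?date\s*=\s*([^|{}]*)", re.IGNORECASE)
# A clock time (08:45, 8 am) or a weekday name inside the date value.
TIME_OR_DAY_RE = re.compile(
    r"\b\d{1,2}:\d{2}\b"
    r"|\b\d{1,2}\s?[ap]m\b"
    r"|\b(?:mon|tues|wednes|thurs|fri|satur|sun)day\b",
    re.IGNORECASE,
)


def bare_url_refs(text):
    return [m.group(0) for m in BARE_URL_REF_RE.finditer(text)]


def suspicious_accessdates(text):
    return [
        m.group(1).strip()
        for m in ACCESSDATE_RE.finditer(text)
        if TIME_OR_DAY_RE.search(m.group(1))
    ]
```
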


Hi, I think new errors should be generic enough to work on most wikis, so avoid very specific errors (for example: {{cite web}} without access dates should be dealt with by the template itself: put the page in a maintenance category if access dates are missing). Otherwise, some of WPCleaner's errors in the #5xx numbers:

  • #502: Useless "Template" in {{Template:...}} (low)
  • #508: Non-existent templates (medium ?)
  • #511: Internal link written as an external link (medium ?)
  • #512: Interwiki link written as an external link (low ?)
  • #513: Internal link inside an external link (medium ?)
  • #517: <strike>...</strike>
  • #519: <a>...</a>
  • I like some of the previous proposals: missing space after a comma in a DEFAULTSORT, doubled http, blank lines in bulleted lists, non-existent files, ...

Some of them are probably hard to develop or require access to a lot more information, so they will be difficult to add (non-existent templates / files, ...) --NicoV (Talk on frwiki) 12:57, 26 November 2013 (UTC)Reply

NicoV I agree with you. My first suggestion is not good either. I think the most suitable new additions are WPCleaner errors in the #5xx numbers. For non-existent files etc. I disagree that we should do something about them. There are databases for those already. -- Magioladitis (talk) 13:39, 26 November 2013 (UTC)Reply
NicoV: #508 is already listed, see Special:WantedTemplates, the files are in Category:Pages with missing files.
I have once suggested a link to a year which has another description ([[2012|2013]], medium or high).
Some inspiration: de:Benutzer:Stefan Kühn/Check Wikipedia#Next features. Matt S. (talk | cont. | cs) 15:16, 26 November 2013 (UTC)Reply

A few more:

  1. More than one blue link per * on a disambig-page. (Per WP:MOSDAB)
  2. Refs and reflist on a disambig-page. (per WP:MOSDAB)
  3. When an article does not have "nbsp" between a number and its unit (e.g. 15 km, 2,5 miles and 3 cm).

-(tJosve05a (c) 16:12, 26 November 2013 (UTC)Reply

Moin, I would like a check for a stray space in a category, as medium priority. Example: right: "[[category:xyz]]" and wrong: "[[category: xyz]]". Stefan Kühn had code for this in the persondata script. Regards --Crazy1880 (talk) 09:09, 29 November 2013 (UTC)Reply
Crazy1880, Error 22 should be picking those up. Bgwhite (talk) 02:20, 3 December 2013 (UTC)Reply
Bgwhite, oh, yes it did. On the German Wikipedia the question came up whether ID 69 will check for "ISBN:", because the linked site only uses ISBN. (example: ISBN: 978-3-7657-2781-8 > ISBN 978-3-7657-2781-8) Regards --Crazy1880 (talk) 20:19, 4 January 2014 (UTC)Reply
Crazy1880, #69 checks for ISBN: and ISBN- Bgwhite (talk) 22:21, 4 January 2014 (UTC)Reply
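
As a rough illustration of the "ISBN:" and "ISBN-" cases, the Python sketch below rewrites both to the plain "ISBN " form. It is an assumption about how such a fix could look, not the #69 code itself; labels such as "ISBN-10:" are deliberately left alone.

```python
import re

# "ISBN:" or "ISBN-" directly before the number; skip "ISBN-10:" / "ISBN-13:" labels.
ISBN_PUNCT_RE = re.compile(r"\bISBN\s*[:\-]\s*(?=\d)(?!1[03]\s*:)")


def fix_isbn_prefix(text):
    """Replace 'ISBN:' / 'ISBN-' before a number with 'ISBN '."""
    return ISBN_PUNCT_RE.sub("ISBN ", text)
```
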

Round 2

Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87.

Following is a list of errors that I think could be added. Some notes:

  • The English database reports that Meno25 mentioned are not being ported to other languages unless somebody is willing to take on the task. Very few have been ported. So, if a report meets the "standard", I see no reason not to add it to checkwiki.
  • Most citation style errors would be a pain in the butt to code, affect too many articles that take too long to correct, and are not really syntax errors. The one exception that I can think of is missing "url=" when the web address is given.
  • NicoV and Magioladitis, could you add WPCleaner or AWB to the appropriate errors and columns?
  • Any errors not in the list that you think should be added? Any other comments?
Description | Priority | Coding | Tools to detect | Tools to fix | Other
Useless "Template" in {{Template:...}} | low | Done | WPC, AWB | WPC, AWB | #1 (#502)
Internal link written as an external link | medium | Done | WPC | WPC & Frescobot | #90 (#511)
Interwiki link written as an external link | low | Done | WPC | WPC | #91 (#512)
Internal link inside an external link | medium |  | WPC (#513) | WPC |
<strike>...</strike> | low | Done | WPC, AWB | WPC, AWB* | #42 (#517). Obsolete in HTML5. Use <s>...</s> instead
<a>...</a> | low | Done | WPC | WPC | #4 (#519)
URL without http:// | high | Done | WPC, AWB | WPC, AWB | #62
Finding cases of url=http://http:// | medium | Done | WPC, AWB | WPC, AWB | #93
Blank lines in bulleted vertical lists | medium |  |  |  | Accessibility issue per Wikipedia:Accessibility#Blocked elements
Putting the TOC in the standard position | medium | Done | WPC |  | #96 and #97. Accessibility issue per MOS Elements of the lead
No blank space after the comma in DEFAULTSORT | low | Done | WPC, AWB | WPC, AWB | #89
Unbalanced ref tags | medium | Done | WPC, AWB | WPC, AWB | #94
Detecting user signatures in articles | low | Done | WPC, AWB | WPC, AWB | #95
Detecting fat redirects (redirects obscuring page content) | low |  |  |  |
<span class="plainlinks"> in articles | low |  |  |  |
Pipe in external link [http:/www.wikipedia.org|Wikipedia] | low |  |  |  |
Link to a year which has another description ([[2012|2013]]) | low |  |  |  | This error is often caused by VE.
Cases of {{cite web|url=http://www.wikipedia.org| title= | medium |  |  |  |
Move anchor in front title in heading |  |  |  |  |
Detect non-existent files (red linked files) |  |  |  |  |
Detect non-existent templates |  |  | WPC (#508) |  |
Detect refs <ref name=> | low | easy |  |  | often detected as #56
Category with double colon |  | easy | AWB |  |
More same parameters in template | medium | medium |  |  |

Good :-). I've added the information about what WPCleaner can currently detect and fix (automatic or bot, at least partially). For errors I've already coded with a #5xx number, feel free to use an error number following what CW currently manages or keep the #5xx number. For other errors, I don't see any problem for implementing them in WPCleaner, but it will probably have to wait 2 months, as I will be almost completely unavailable for several weeks. --NicoV (Talk on frwiki) 20:05, 3 December 2013 (UTC)Reply

Errors added

Magioladitis, NicoV, Meno25, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri

  • #01 - Template with the useless word "template"
  • #04 - HTML text style element <a>
  • #42 - HTML text style element <strike>
  • #62 - URL containing no http://
  • #89 - DEFAULTSORT with no space after the comma
  • #90 - Internal link written as an external link
  • #91 - Interwiki link written as an external link
  • #93 - External link with double http://
  • #94 - Reference tags with no correct match
  • #95 - Editor's signature or link to user space
  • #96 - TOC after first headline
  • #97 - Material between TOC and first headline

Notes

  • Only turned on for enwiki for right now. Will start to expand after NicoV's return.
  • Just added #90 and #91. So, there will probably be some problems.
  • For #90 and #91, it will only search for links to articles written as an external link. Talk pages or special pages will not be searched. History of Wikipedia has examples of why it is done this way.
The description of #91 must be changed to: The script found an external link that should be replaced with an interwiki link. An example would be on enwiki [http://fr.wikipedia.org/wiki/Larry Wall] should be written as [[:fr:Larry Wall]], so that it says fr.wikipedia.org in the external link and not en.wikipedia.org. -(tJosve05a (c) 21:07, 24 December 2013 (UTC)Reply
And #90 must be changed to en.wikipedia.org. -(tJosve05a (c) 21:11, 24 December 2013 (UTC)Reply
Another thing is that it should not say [...]/wiki/Larry Wall]. It should say [...]/wiki/Larry_Wall Larry Wall].(tJosve05a (c) 21:14, 24 December 2013 (UTC)Reply
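
To make the #90/#91 distinction concrete, here is a small Python sketch that rewrites Wikipedia external links as internal or interwiki links. The function name and the local_lang parameter are hypothetical, and it assumes well-formed URLs that use underscores rather than spaces.

```python
import re
from urllib.parse import unquote

# [http://LANG.wikipedia.org/wiki/Title optional label]
WIKI_LINK_RE = re.compile(
    r"\[https?://([a-z\-]+)\.wikipedia\.org/wiki/([^\s\]]+)(?:\s+([^\]]*))?\]"
)


def externals_to_wikilinks(text, local_lang="en"):
    """Same-language links become [[Title]] (#90); others become [[:lang:Title]] (#91)."""

    def repl(match):
        lang, raw_title, label = match.groups()
        title = unquote(raw_title).replace("_", " ")
        target = title if lang == local_lang else f":{lang}:{title}"
        if label is None or label.strip() in ("", target):
            return f"[[{target}]]"
        return f"[[{target}|{label.strip()}]]"

    return WIKI_LINK_RE.sub(repl, text)
```

Links to other sites are left untouched, since the pattern only matches wikipedia.org hosts.
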

Errors modified

  • #22 - Finds more cases of a space in a category
  • #19 - Finds headlines that start with one "=" anywhere in the article instead of only at the start of the article.
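
A rough idea of what these two checks look for, sketched in Python with hypothetical names (the real checks are in checkwiki's Perl and may differ):

```python
import re

# #19: a headline delimited by a single "=" anywhere in the text.
LEVEL_ONE_HEADLINE_RE = re.compile(
    r"^=(?!=)[ \t]*(.+?)[ \t]*(?<!=)=[ \t]*$", re.MULTILINE
)
# #22: whitespace before "Category", before the colon, or after it.
CATEGORY_RE = re.compile(r"\[\[(\s*)category(\s*):(\s*)", re.IGNORECASE)


def level_one_headlines(text):
    return [m.group(1) for m in LEVEL_ONE_HEADLINE_RE.finditer(text)]


def category_has_space(text):
    return any(any(m.groups()) for m in CATEGORY_RE.finditer(text))
```
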

WPCleaner

Bgwhite I've updated WPCleaner (version 1.31) for the following errors for all wikis: #1 (previously #502), #4 (previously #519), #42 (previously #517), #90 (previously #511), #91 (previously #512). Still have to do: #62, #89, #93, #94. Old #62 and #89 have been disabled. --NicoV (Talk on frwiki) 21:51, 22 January 2014 (UTC)Reply

Thank you Nico. Do you want me to turn on the new errors for frwiki or wait? I'm sure Josve05a will have found a bug before I write this. :). New error #95 will be an editor's signature found in an article. Bgwhite (talk) 22:49, 22 January 2014 (UTC)Reply
Bgwhite, will 95 include UTC, CET, CEST etc.? Since this error might not only be used on enwp? (tJosve05a (c) 22:55, 22 January 2014 (UTC)Reply
(BTW Bgwhite I'm 16 in 5...4...3...2...1...HAPPY BIRTHDAY TO ME!) (tJosve05a (c) 23:00, 22 January 2014 (UTC)Reply
Hey, I already wished you a happy birthday, which you already complained about. Now you want another... pfffft.  :) Time is irrelevant for #95 as I'm only looking for a signature. Bgwhite (talk) 23:09, 22 January 2014 (UTC)Reply
@Bgwhite Yes, I think you can turn the new errors on for frwiki, I'll check what has to be changed in the translation file. --NicoV (Talk on frwiki) 08:47, 23 January 2014 (UTC)Reply
Bgwhite, NicoV, (#91) WPCleaner changes [http://www.imdb.com/name/nm0403424/ Hurley on the [[Internet Movie Database]]] to [[:imdbname:0403424|Hurley on the]][[Internet Movie Database]]]. I see multiple issues with this. It removes the blank space, and it leaves 3 brackets at the end (without WPCleaner reporting it). (Found on Colin Hurley.) (tJosve05a (c) 10:52, 23 January 2014 (UTC)Reply
@Josve05a: It will be fixed in an upcoming version. It's due to the incorrect syntax of having an internal link inside an external link. It can be reported by WPC if #513 is activated. --NicoV (Talk on frwiki) 19:54, 23 January 2014 (UTC)Reply
@Bgwhite: If possible, please start the new checks for cswiki, too. I will modify the configuration file. Within a week, you can also enable skwiki.
@Josve05a: Happy birthday! You are now the same age as I am (for the next 10 months). Matt S. (talk | cont. | cs) 18:28, 23 January 2014 (UTC)Reply

NicoV and Matt S., in theory frwiki and cswiki should start seeing the new errors at the next 0z run.... if the database is up. Today's outage was caused by a disc getting full. Bgwhite (talk) 07:38, 24 January 2014 (UTC)Reply

Hi! It's strange: I've modified the translation file in frwiki 4 days ago, but the old description is still displayed in WMFLabs for #1, #4, #42, #62. No errors are found. --NicoV (Talk on frwiki) 12:15, 27 January 2014 (UTC)Reply
@Bgwhite For frwiki, I've changed the translation file a few days ago: descriptions for new error numbers (#93, ...) have been taken into account on WMF Labs, but not the modified descriptions for old error numbers that have been recycled (#1, #4, #42, #62, #89, #90, #91). Is it a problem to have kept the old descriptions as comments? --NicoV (Talk on frwiki) 09:12, 29 January 2014 (UTC)Reply
NicoV, hmmm, I didn't see the message above this one. Sorry for that. The translation file and every other program have been bombing lately, so that was probably why you didn't see it right away. Between database problems and mounting problems, I'm ready to go screaming into the night. The frwiki dump processing is still running, and it is amazing that it hasn't died yet.
Do you mean as comments in the translation file as you have done for the French one? I see no problems.
For #96 and #97 I've thrown in a little regex in the English translation file to account for templates being used with a space and no space.
For #95, I only have English "User" and "User talk". I'll get individual wiki's words in a bit. I'll get them thru the API. Bgwhite (talk) 09:31, 29 January 2014 (UTC)Reply
Bgwhite For example, on WMFLabs, #1 is still displayed as "Pas de texte en gras" (the old description, which is commented out in the translation file) when the translation file has been changed 6 days ago; whereas the translations for the new errors (#93 and so on) are correctly used even if they have been changed later (only 2 days ago). --NicoV (Talk on frwiki) 16:47, 29 January 2014 (UTC)Reply
NicoV, looking at the code, it grabs the first variable, commented or not. So, putting the commented lines second does the trick. Bgwhite (talk) 22:32, 29 January 2014 (UTC)Reply
Ok, thanks a lot! --NicoV (Talk on frwiki) 11:08, 30 January 2014 (UTC)Reply
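
The behaviour described above (the first matching variable wins, commented or not) can be illustrated with a toy Python lookup. The key name, the "#" comment marker and the key=value layout are assumptions made for the example; the real translation file format may differ.

```python
def lookup_first_match(lines, key):
    """Return the value from the first line mentioning key, commented or not."""
    for line in lines:
        if key + "=" in line:
            return line.split("=", 1)[1].strip()
    return None


def lookup_skipping_comments(lines, key):
    """Return the value from the first uncommented line that defines key."""
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        if stripped.startswith(key + "="):
            return stripped.split("=", 1)[1].strip()
    return None


# Hypothetical file content: the old description is kept as a comment first.
translation = [
    "# error_001_head_frwiki=Pas de texte en gras",
    "error_001_head_frwiki=NEW DESCRIPTION",
]
```

With the first lookup, moving the commented line below the live one (the workaround found in the thread) is enough to get the new description.
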

Error #62

Discussion

If a website is called "www.news.de" for example something like this is valid in the German Wikipedia:

<ref>www.news.de: [http://www.news.de/article Article].</ref>
<ref>www.news.de: ''[http://www.news.de/article Article]''.</ref>

This shouldn't be reported as an error. Would be nice to have this excluded somehow. Disabling the check would also disable the check for url= which would be a shame. Here is an idea for an extended regex (not tested).

/(?:<ref\b[^<>]*>|url\s*=)\s*www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i

--TMg 17:10, 19 January 2014 (UTC)Reply

I had to drop checking for cases with |url=. There were infoboxes which required external links not to have http://. That should make the regex a little easier. I currently have:
/(<ref>\s*\[?www\.)/
I'm not yet catching named refs, which you do. Bgwhite (talk) 06:10, 20 January 2014 (UTC)Reply
Unfortunately this will cause the same false positives. Here is my regex again without the url= option.
/<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i
--TMg 09:51, 20 January 2014 (UTC)Reply
Yes, I know it will cause the same false positives. I was only giving the reasons for the current status of the regex, including dropping url. Bgwhite (talk) 22:17, 20 January 2014 (UTC)Reply
It does work, but it has a hitch. For example, it does find an error in Central Philippine University, Ciclosporin and Gravity Rush. However, it reports the error at the end of the article. The hitch happens with the entire regex or just /(<ref\b[^<>]*>\s*\[?www\.)/. I'm off to bed Bgwhite (talk) 09:10, 21 January 2014 (UTC)Reply
Not sure what you mean with "hitch". Maybe it's because I removed the brackets but you are relying on them? Let's re-add them:
/(<ref\b[^<>]*>\s*\[?www\w*\.)(?![^<>[\]{|}]*\[\w*:?\/\/)/i
This matches:
But it does not match my two examples above. I'm happy. :-) --TMg 21:24, 21 January 2014 (UTC)Reply
The "hitch"... for some articles, the regex does not tell where the error is found. It just reports the last bracket in the article. See [10] and look at the notice column. Bgwhite (talk) 21:52, 21 January 2014 (UTC)Reply
I see. That's an upper/lowercase problem. The index() call is case-sensitive but gets $1 lowercased. --TMg 22:09, 21 January 2014 (UTC)Reply
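Current:

my $test_text = $lc_text;

if ( $test_text =~ /(<ref\b[^<>]*>\s*\[?www\.)/ ) {
    # $1 is taken from the lowercased text, so the case-sensitive index()
    # can fail on the original $text and return -1.
    my $pos = index( $text, $1 );
    error_register( $error_code, substr( $text, $pos, 40 ) );
}

Suggestion:

if ( $text =~ /<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i ) {
    # $-[0] is the offset where the match starts in the original text.
    my $pos = $-[0];
    error_register( $error_code, substr( $text, $pos, 40 ) );
}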
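
The upper/lowercase problem discussed in this thread can be reproduced in a few lines. The following is a Python transcription for illustration only (function names are hypothetical): searching the lowercased text and then looking the match up in the original text fails when the article uses different case, while taking the match offset directly does not.

```python
import re

CURRENT_RE = re.compile(r"(<ref\b[^<>]*>\s*\[?www\.)")
SUGGESTED_RE = re.compile(
    r"<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>\[\]{|}]*\[\w*:?//)", re.IGNORECASE
)


def position_current(text):
    """Match on lowercased text, then do a case-sensitive lookup in the original."""
    m = CURRENT_RE.search(text.lower())
    return text.find(m.group(1)) if m else None


def position_suggested(text):
    """Match case-insensitively on the original text and use the match offset."""
    m = SUGGESTED_RE.search(text)
    return m.start() if m else None
```

The second pattern also leaves the German-style reference from the start of the section unreported, because of its negative lookahead for a following bracketed link.
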

False positives for #2

  Resolved

Hi, on frwiki, there are 5 false positives for #2:

--NicoV (Talk on frwiki) 13:37, 17 November 2014 (UTC)Reply

NicoV, try this (something other than a space there). Frietjes (talk) 16:21, 17 November 2014 (UTC)Reply
that fixed all of them except for fr:Antipaïne. Frietjes (talk) 16:28, 17 November 2014 (UTC)Reply
I fixed the last one. -- Magioladitis (talk) 16:54, 17 November 2014 (UTC)Reply
Thanks guys ! --NicoV (Talk on frwiki) 08:58, 18 November 2014 (UTC)Reply

Error #46

It seems to be happening again on frwiki (fr:Antihéros, fr:Insulte, ...) but I don't find anything wrong in the articles, even somewhere else. --NicoV (Talk on frwiki) 09:21, 5 April 2014 (UTC)Reply

#10

  • Hmmm, this shouldn't be happening. Looks like it is counting ]]] as having two ]] possibilities. Will look into it. Bgwhite (talk) 20:25, 21 September 2013 (UTC)Reply
    • Matěj Suchánek, interesting case. The problem going on is: [[metr za sekundu|[m/s]]] was followed by a statement with a broken bracket, [ran/[[minuta|min]]. If it wasn't followed by a broken bracket, checkwiki would not say this was an error. Normally I'd say this is a rare case and checkwiki correctly said there is an error on the page, thus this is a real low priority. However, the problem in the code is similar to the problem in the code for #46 error report above this report. So, a solution in one probably fixes the other error. Problem is, I've yet to figure out the #46 error after many hours. Bgwhite (talk) 08:52, 29 October 2013 (UTC)Reply
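
One way to avoid reading ]]] as two ]] candidates is to track open brackets on a stack. The Python sketch below is only an illustration of that idea under simplified assumptions (it ignores nowiki, templates and comments) and is not the #10 or #46 code.

```python
def unmatched_brackets(text):
    """Return (unclosed_openers, stray_closers) for [ and [[ brackets."""
    stack = []
    stray_closers = 0
    i = 0
    while i < len(text):
        if text.startswith("[[", i):
            stack.append("[[")
            i += 2
        elif text[i] == "[":
            stack.append("[")
            i += 1
        elif text[i] == "]":
            if stack and stack[-1] == "[":
                stack.pop()  # a single ] closes the innermost single [
                i += 1
            elif stack and stack[-1] == "[[" and text.startswith("]]", i):
                stack.pop()  # ]] closes the innermost [[
                i += 2
            else:
                stray_closers += 1
                i += 1
        else:
            i += 1
    return len(stack), stray_closers
```

With this approach the piped link from the example is balanced on its own, and only the broken bracket that follows it is counted.
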

Statistics

Hi, I know that you're always looking for more work since it's so easy to use Labs ;-)

I'd like to suggest adding some statistics for Check Wiki to give us some information on how errors evolve on each wiki. Would it be possible to add a table with the following information?

  • One line for each error
  • Several columns for each day for a month: number of articles detected for the error after the daily scan, number of articles marked as fixed for the error during the day, and possibly the number of articles marked as fixed during the day but that were detected again

--NicoV (Talk on frwiki) 10:21, 6 November 2013 (UTC)Reply

Great idea. Though I am not sure if Bgwhite has enough time to implement it. --Meno25 (talk) 16:25, 9 November 2013 (UTC)Reply
I know, it's just wishful thinking, no emergency and no problem if it's not implemented. --NicoV (Talk on frwiki) 12:08, 14 November 2013 (UTC)Reply

Include pages in namespace 104 on arwiki

Please include pages in namespace "ملحق" (NS:104) on Arabic Wikipedia (arwiki) in the lists generated by Checkwiki script. This namespace contains lists and years pages. Pages in that namespace are counted in the number of articles (magic word: {{NUMBEROFARTICLES}}) and AWB's Auto-Tagger already tags articles in that namespace. --Meno25 (talk) 12:11, 23 November 2013 (UTC)Reply

Meno25, I'm going to wait on this for a bit. I've held off on 104, commonswiki and File namespace. I'm using code optimized to grab only the Article namespace from the dump file. Changing it out will cause a severe decrease in speed. I'll have to do some other changing around to insert the code, but everything else is set up for it. For example, there are if statements that say only Article and 104 namespace can check certain errors. Bgwhite (talk) 08:29, 24 November 2013 (UTC)Reply
@Bgwhite: Thank you for the explanation. Take your time. We are not in a hurry. --Meno25 (talk) 12:21, 24 November 2013 (UTC)Reply

Error #39 (again)

  Resolved

Hi all. In Demons (novel) the section headed "Characters" employs paragraphs within a bulleted list. This has been coded per the advice given here, but Yobot (and, I think, other AWB-based robots) persists in making "corrections": [11] [12] [13] [14] [15] [16] and so on. Aside from destroying the logical structure of the section, this is also contrary to accessibility guidelines.

I note that the detection of error #39 has already been modified to accept the use of <p>s within certain tags, such as <blockquote>. Can this tolerance be extended to include <p>s within lists?

(I was uncertain whether to raise this concern here, with Yobot, or with AWB. If I've chosen the wrong place, could you please let me know, and I'll try again.) In the meantime, thanks for your collective good work with checkwiki: fighting the good fight, and at scale! — Simon the Likable (talk) 13:49, 10 February 2014 (UTC)Reply

Simon the Likable hi. Thanks for starting the discussion. I was not aware of this problem. Bots tend to revisit a page unless something is changed. -- Magioladitis (talk) 13:54, 10 February 2014 (UTC)Reply
Simon the Likable can you please check if you like my version? -- Magioladitis (talk) 13:57, 10 February 2014 (UTC)Reply
This is both a Checkwiki and AWB issue, so having a discussion at either spot is just fine.
@Graham87: As this is also an accessibility issue, Graham is the one to ask. Current version of Demons (novel) uses * and : to create paragraphs inside lists. This version uses * and standard html paragraph tag. Is the current version acceptable or should the older version be used? Bgwhite (talk) 18:24, 10 February 2014 (UTC)Reply
@Simon the Likable, Magioladitis, and Bgwhite: The older version is better, but even there, the gaps between the list items would need to be removed. In the newer version, the HTML lists finish at the end of each paragraph (as can be seen by checking the HTML source). It might be easier to use HTML rather than wiki-markup to create the lists. Graham87 01:13, 11 February 2014 (UTC)Reply
Sorry guys but on my laptop, both versions have the same visual result. I must be semi-blind or something. This happens to me after working on my laptop for several hours. Can someone explain to me what the visual differences are? Thanks, Magioladitis (talk) 06:59, 11 February 2014 (UTC)Reply
Visually they are the same. On a screen reader, the version with : breaks up the list: the first item, the one that has the <p> tags in the other version, appears as a one-item list to a screen reader. Bgwhite (talk) 07:12, 11 February 2014 (UTC)Reply
Thanks Magioladitis. As Bgwhite has outlined, your solution is impeccable visually, but will not allow visually impaired readers good access using a screen reader. I have therefore reverted your change (reinstating the <p>s), but have also taken on board Graham87's point and removed the blank lines between list items. Thus, I think the current version covers both visual and accessibility requirements, and follows recommended coding practices in Help:List#Paragraphs_in_lists and now WP:LISTGAP.
This leaves open my original issue: checkwiki and AWB both regard this recommended markup as an error. Can checks for error #39 be modified to accept the use of <p>s within lists? (Or perhaps there is some other solution?) — Simon the Likable (talk) 13:59, 11 February 2014 (UTC)Reply
Thanks Simon; sounds good here now! Graham87 14:03, 11 February 2014 (UTC)Reply
Hey guys. Any chance that this is a Mediawiki bug and we should report it? -- Magioladitis (talk) 14:06, 11 February 2014 (UTC)Reply
I looked at source code for the latest version and Magioladitis' version. It does not appear to be a bug. In the latest version of Demons (novel), it is one long list made up of <li> tags. If a blank line happens, the list ends. In Magioladitis' version, it starts as a list. When the first : happens, the list is ended. The HTML tags to produce the layout for the : consists of <dl> and <dt> tags. The use of the dl and dt tags is standard HTML practice when text needs varying indentation. The source for this talk page is full of dl and dt tags. Bgwhite (talk) 06:23, 12 February 2014 (UTC)Reply
Yes, both Checkwiki and AWB should not call this an error. Finding a solution is another matter. My brain isn't coming up with an answer. For the time being, I've added the article to a whitelist, so Checkwiki will not find a <p> error in the article. Bgwhite (talk) 06:23, 12 February 2014 (UTC)Reply
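
The effect on list structure can be modelled very roughly as below. This Python sketch is a simplified model of the behaviour described in this thread (a blank line or a : line ends a * list), not MediaWiki's parser, and the function name is made up.

```python
def count_bulleted_lists(wikitext):
    """Count separate runs of consecutive '*' lines (each run is one list)."""
    lists = 0
    in_list = False
    for line in wikitext.splitlines():
        if line.startswith("*"):
            if not in_list:
                lists += 1
                in_list = True
        else:
            in_list = False  # a blank line or a ':' line ends the current list
    return lists
```

Under this model, paragraphs marked with <p> inside an item keep a single list, while blank lines or : lines between items split it into several.
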

Bgwhite thanks to Frietjes we found a wonderful workaround called {{paragraph break}}. -- Magioladitis (talk) 08:30, 24 January 2015 (UTC)Reply

Code used for generating the lists

  Resolved

Is the code (or list of regular expressions) available? I believe I could suggest some improvements for cutting down on false positives and/or the number of whitelisted articles for some of the lists. Frietjes (talk) 15:35, 17 October 2014 (UTC)Reply

Frietjes check here. -- Magioladitis (talk) 16:21, 17 October 2014 (UTC)Reply
thank you. my first improvement would be to add the following on line 1132 of checkwiki.pl
$test_text =~ s/\{\{\{\|safesubst:\}\}\}//g;
this would fix all the false positives from RFD discussion tags in list 28 (i.e. remove these), unless that's already been fixed? Frietjes (talk) 16:31, 17 October 2014 (UTC)Reply
my second improvement would be to change '<tr' to '<tr[^a-z]' in error_031_html_table_elements which would avoid matching '<transcript>' and other non-table tags that start with tr. Frietjes (talk) 16:34, 17 October 2014 (UTC)Reply
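
Both suggestions are easy to try out in isolation. The Python sketch below mirrors the two regular expressions for illustration; it assumes case-insensitive matching, which may not be how checkwiki.pl applies them.

```python
import re

TR_OLD_RE = re.compile(r"<tr", re.IGNORECASE)
TR_NEW_RE = re.compile(r"<tr[^a-z]", re.IGNORECASE)
SAFESUBST_RE = re.compile(r"\{\{\{\|safesubst:\}\}\}")


def strip_safesubst(text):
    """Remove the {{{|safesubst:}}} construct before running other checks."""
    return SAFESUBST_RE.sub("", text)
```

The old pattern matches any tag that merely starts with "tr", while the new one requires a non-letter after "tr", so tags such as <transcript> no longer count as table rows.
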
Frietjes, Magioladitis both changes implemented. The safesubst change was also added to errors #34 and #43 as it showed up there too. Bgwhite (talk) 20:35, 17 October 2014 (UTC)Reply
It looks like the fix was to ignore all pages with {{{|safesubst:}}}, which is suboptimal :( I suppose the better thing would be to fix Module:RfD, but it seems as though there was a logical reason for adding it there. not sure if there is any other solution, but we shall see. it would be a shame to have to resort to such hacks since, technically, {{{|safesubst:}}} is a programming element. Frietjes (talk) 21:08, 17 October 2014 (UTC)Reply
Bgwhite can we undo the 'safesubst:' hack? this change was just made, so in a few weeks, we shouldn't have any of these left. the fix for the tr tags is great though since it means we won't have to hack around Additive Manufacturing File Format, Event Programming Language, GPS Exchange Format, ... Frietjes (talk) 22:11, 17 October 2014 (UTC)Reply
@Bgwhite: Indeed, the {{{|safesubst:}}} stuff should all be gone now. Can you remove the hack related to it now? Jackmcbarn (talk) 21:52, 11 December 2014 (UTC)Reply
Jackmcbarn, I actually removed it yesterday. Checkwiki did find redirects still at RfD from months past, but have never been closed. Look at the bottom of Wikipedia:CHECKWIKI/034 dump. Bgwhite (talk) 22:08, 11 December 2014 (UTC)Reply

Frietjes, Bgwhite RfD changed the code used. Hopefully, this resolves our problem. -- Magioladitis (talk) 08:32, 24 January 2015 (UTC)Reply

False positives for #87

Hi, with the latest full dump, there seem to be a lot of false positives for #87 (HTML entities without ;). Examples from the first 25 pages reported:

--NicoV (Talk on frwiki) 20:54, 21 July 2014 (UTC)Reply

NicoV We turned off #87 on enwiki because of the false positives. I'm not sure how to fix this. The hard part is there can be letters or numbers after an entity. Any ideas? Bgwhite (talk) 22:32, 21 July 2014 (UTC)Reply
Bgwhite Apart from the last 2, I think the only thing that could be done is filtering out the errors when they are found in special places (URL, attribute of a tag, timeline, image, ...). For the last one, I only see doing a case-sensitive comparison. And for the &phis;, I don't know... Not very helpful, sorry. --NicoV (Talk on frwiki) 06:00, 22 July 2014 (UTC)Reply
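
The filtering idea above might look something like the following Python sketch. It is an assumption-laden illustration, not the #87 code: it only skips URLs (not tag attributes, timelines or images), compares names case-sensitively against the HTML 4 entity table in Python's standard library, and will miss entities that are directly followed by letters or digits.

```python
import re
from html.entities import name2codepoint

URL_RE = re.compile(r"https?://\S+")
# "&name" where the name is a complete run of letters/digits not ended by ";".
ENTITY_NO_SEMICOLON_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(?![A-Za-z0-9;])")


def entities_without_semicolon(text):
    """List known entity names that appear without a closing semicolon."""
    text = URL_RE.sub(" ", text)  # query strings like &reg=2 are not entities
    return [
        m.group(1)
        for m in ENTITY_NO_SEMICOLON_RE.finditer(text)
        if m.group(1) in name2codepoint  # case-sensitive lookup
    ]
```
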
  Resolved

how about a check for this? Frietjes (talk) 16:27, 10 November 2014 (UTC)Reply

I think it should be detected by #90. --NicoV (Talk on frwiki) 17:01, 10 November 2014 (UTC)Reply
yes, you are correct. the ones that I found were very new, so weren't detected yet. thank you. Frietjes (talk) 17:34, 10 November 2014 (UTC)Reply

Error #61 false positive

  Resolved

<ref > (changed to <ref> by BG19bot) is not an error and should not be changed. Whitespace is permissible here and even has advantages, as giving word wrap a safe place to break lines without introducing either syntactic or legibility confusion. Andy Dingley (talk) 14:16, 14 November 2014 (UTC)Reply

Andy Dingley. Why do people refuse to give the article's name? I'm not psychic. As the edit summary states, Checkwiki error #61 is for punctuation after references. Your issue is NOT a checkwiki issue. It falls under the general fixes section of the edit summary, therefore this is an AWB issue. You should have taken this up at AWB's talk page. Second, spaces in ref tags do cause errors... < ref>hi</ref>. Third, you are arguing over a space and a space that is your personal preference, not backed up by any doc. My guess, you have been blindly reverting the bot's edit without fixing the #61 issue that caused the bot to arrive at the page. Bgwhite (talk) 22:22, 14 November 2014 (UTC)Reply
If you think that < ref>hi</ref> is the same thing as <ref >hi</ref> (in either XML or in wikicode parsing) then you really do need to fix your bot.
Whether something is a personal preference or not, a 'bot should not "automatically fix it" unless it is broken. <ref >hi</ref> is well-formed and should not be messed with by 'bots. Andy Dingley (talk) 22:29, 14 November 2014 (UTC)Reply
Bgwhite, luckily I am psychic even though Andy Dingley was unable to provide a diff. Frietjes (talk) 22:53, 14 November 2014 (UTC)Reply
by the way, the net outcome was that error #61 was fixed. Frietjes (talk) 22:56, 14 November 2014 (UTC)Reply
I redid the bot edit. -- Magioladitis (talk) 22:56, 14 November 2014 (UTC)Reply
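
To make the distinction argued over above concrete, here is a minimal Python sketch. It assumes a deliberately simplified rule for an opening tag (no whitespace between < and the tag name, optional whitespace or attributes before >). It is not the MediaWiki parser and not AWB's logic; the names REF_OPEN and is_ref_open_tag are made up for this illustration.

```python
import re

# Hypothetical, simplified pattern: "<" immediately followed by "ref",
# then optional whitespace/attributes, then ">".
# This is an assumption for illustration, not MediaWiki's real parsing rule.
REF_OPEN = re.compile(r"<ref(\s[^<>]*)?>", re.IGNORECASE)


def is_ref_open_tag(text: str) -> bool:
    """Return True if the whole string is an opening ref tag under the simplified rule."""
    return REF_OPEN.fullmatch(text) is not None
```

Under this rule, <ref> and <ref > are both accepted, while < ref> is rejected because the space comes before the tag name.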

False positives for #2

  Resolved

Hi, on frwiki, there are 5 false positives for #2:

--NicoV (Talk on frwiki) 13:37, 17 November 2014 (UTC)Reply

NicoV, try this (something other than a space there). Frietjes (talk) 16:21, 17 November 2014 (UTC)Reply
that fixed all of them except for fr:Antipaïne. Frietjes (talk) 16:28, 17 November 2014 (UTC)Reply
I fixed the last one. -- Magioladitis (talk) 16:54, 17 November 2014 (UTC)Reply
Thanks guys ! --NicoV (Talk on frwiki) 08:58, 18 November 2014 (UTC)Reply

Error #34

@NicoV: After discussion with Bgwhite CHECKWIKI now checks for the following magic words too: "BASEPAGENAME", "FULLPAGENAME", "PAGENAME", "PAGESIZE", "PROTECTIONLEVEL", "Pagename", "SUBPAGENAME", "Subpagename". -- Magioladitis (talk) 23:48, 2 January 2015 (UTC)Reply

Thanks for the info, I will check if I need to add them to WPCleaner. I am currently traveling at least until mid February, and won't be able to do anything before. Could you post on WPCleaner talk page so that I remember? NicoV (Talk on frwiki) 08:56, 4 January 2015 (UTC)Reply
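
For anyone mirroring this list in another tool, a rough Python sketch of such a lookup follows. The word list is copied from the message above; everything else is an assumption made for illustration (the pattern, the requirement that the word follows {{ and is followed by :, | or }, and the function name find_magic_words). It is not how checkwiki.pl or WPCleaner actually implement error #34.

```python
import re

# Magic words quoted in the discussion above.
MAGIC_WORDS = [
    "BASEPAGENAME", "FULLPAGENAME", "PAGENAME", "PAGESIZE",
    "PROTECTIONLEVEL", "Pagename", "SUBPAGENAME", "Subpagename",
]

# Assumed syntax: "{{", optional spaces, the word, optional spaces,
# then ":", "|" or "}" so that longer names such as {{PAGENAMEE}} do not match.
MAGIC_PATTERN = re.compile(
    r"\{\{\s*(" + "|".join(re.escape(w) for w in MAGIC_WORDS) + r")\s*[:|}]"
)


def find_magic_words(wikitext: str) -> list:
    """Return the listed magic words found in wikitext, in order of appearance."""
    return [m.group(1) for m in MAGIC_PATTERN.finditer(wikitext)]
```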

Template:Coding TfD

  Resolved

Is this template something that would be useful to you guys? That is, if users were educated to flag problems with it, would it help you find currently missed errors? The TfD discussion is at Wikipedia:Templates for discussion/Log/2014 November 17#Template:Coding. Comments are welcome! —PC-XT+ 06:52, 22 November 2014 (UTC)Reply

Template deleted. -- Magioladitis (talk) 08:29, 24 January 2015 (UTC)Reply

Error #43

  Resolved

There are many false positives for error #43 which include usage of the {{familytree}} template - at plwiki pl:Burbonowie might be an example. That's probably because the brace '}' can be used legally as a parameter. ToSter (talk) 20:55, 3 November 2014 (UTC)Reply

ToSter, yes that is a false positive. However, {{familytree}} has been deprecated and should no longer be used. It has been replaced by {{chart}}. There will be other false positives, which is why on enwiki there is a 043 whitelist. Bgwhite (talk) 22:23, 3 November 2014 (UTC)Reply
Bgwhite, it might be deprecated at enwiki but perfectly fine at plwiki - what is the reason for the deprecation? I know about whitelisting but that's the ultimate solution - it has the clear disadvantage that it would skip a real error on the same page and it has to be kept up-to-date. Isn't there any solution to this case? ToSter (talk) 17:22, 5 November 2014 (UTC)Reply
three options, as far as I can tell: (1) modify {{familytree}} to allow it to use two new symbols as synonyms for { and }, while still keeping { and } around for backwards compatibility until they can all be changed. (2) replace any { and } with {{(}} and {{)}}. (3) add a line to the checkwiki.pl script to do something like
  $content =~ s/(\{\{[Ff]amilytree[^\{\}]*)[\{]([^\{\}])/$1&#123;$2/g;
  $content =~ s/(\{\{[Ff]amilytree[^\{\}]*)[\}]([^\{\}])/$1&#125;$2/g;
which would internally remove these from the family tree template before processing. note that a couple more regexps are probably needed for cases where the bracket is the last parameter, and the substitutions probably need to be put in a loop to match multiple occurrences within the same family tree call. the same could be done for IPA, but for IPA, the open brace is equivalent to ae, so it's an easy replacement. Frietjes (talk) 21:10, 5 November 2014 (UTC)Reply
Thanks, I have implemented solution 1. ToSter (talk) 09:25, 8 November 2014 (UTC)Reply
ToSter, I did the same here, but kept the old syntax until they can all be updated. Frietjes (talk) 15:27, 8 November 2014 (UTC)Reply
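
For reference, option (3) with the loop mentioned above could look roughly like the following. This is a Python transliteration of the two suggested substitutions, not code from checkwiki.pl; the function name escape_familytree_braces is made up, and the case where the brace is the last parameter is still not handled.

```python
import re

# Same two patterns as the Perl suggestion: a lone { or } inside a
# {{familytree ...}} call, followed by a character that is not a brace.
OPEN_BRACE = re.compile(r"(\{\{[Ff]amilytree[^{}]*)\{([^{}])")
CLOSE_BRACE = re.compile(r"(\{\{[Ff]amilytree[^{}]*)\}([^{}])")


def escape_familytree_braces(text: str) -> str:
    """Replace lone braces used as familytree parameters with HTML entities.

    Each pass can only fix the first remaining brace of a call, so the
    substitutions are repeated until the text stops changing.
    """
    previous = None
    while previous != text:
        previous = text
        text = OPEN_BRACE.sub(r"\1&#123;\2", text)
        text = CLOSE_BRACE.sub(r"\1&#125;\2", text)
    return text
```

The loop terminates because every substitution removes one brace and adds none.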

"Fixing" ref tags inside comments

For most of the life of Shooting of Michael Brown, we have used list-defined references and commented out unused refs rather than removing them. The commenting technique we have consistently used is to change <ref name=...> to <!--ref name=...> and change </ref> to </ref-->. This method requires the least amount of effort. This has not been a problem until this bot edit, which used WCW according to its editsum.

We have no problem with the change to Vox.Feds, since it was commented incorrectly to begin with. For the remaining three refs, WCW apparently "fixed" the leading ref tags, despite the fact that they were inside valid comments. This requires us to (1) notice what the bot did, and (2) then clean up after it. We wonder why this has happened for the first time since we started using this technique in August, and we would like to know what we can do to prevent it from happening again. I'm watching, so no need to ping me. ‑‑Mandruss  08:45, 13 December 2014 (UTC)Reply

Mandruss First off, you should have contacted the bot owner (Magioladitis). Check Wikipedia only finds problems and doesn't do any of the fixing.
The issue Check Wikipedia found was a missing closing comment tag. In the first paragraph that was changed, at the very end... </ref>-- was changed to </ref>-->. This is why the bot arrived at the page.
The reason the bot changed <!--ref to <!--<ref is because it thinks a < is missing. In HTML/XML, there can be nothing between < and the first letter of the tag name. Bgwhite (talk) 09:18, 13 December 2014 (UTC)Reply
@Bgwhite: Ok, I was relying on my 30 yrs in computer fields, which led me to believe that nothing inside a comment could be called a tag name because comments are supposed to be ignored by software. But that's neither here nor there, as the bot will mind its own botness unless we attract its attention with a coding error. Thank you. ‑‑Mandruss  09:27, 13 December 2014 (UTC)Reply
(Btw, the editsum said "Fixed using WCW", not "Found using WCW". Based on what you said, the latter would be more correct, and perhaps Magioladitis would consider changing that if they read this. In any case, some rewording of the editsum might help prevent others from making the same mistake I did.) ‑‑Mandruss  09:34, 13 December 2014 (UTC)Reply
Mandruss Eeeks. That makes us similar old farts. 20 years in the computer field for me. I put up my first web page almost 21 years ago. Sniff. You don't have to get off my lawn. Bgwhite (talk) 09:39, 13 December 2014 (UTC)Reply
LOL. Different worlds, I'm a retired mainframe dinosaur. ‑‑Mandruss  09:40, 13 December 2014 (UTC)Reply

Mandruss Hi. I used my bot account but it was a manual edit. No unclosed comment tags are fixed in bot mode. Feel free to improve. -- Magioladitis (talk) 13:12, 13 December 2014 (UTC)Reply

Magioladitis Feel free to improve, as in improve the tool? It's a common misconception that anyone here who has any technical background can modify tools in this environment. Not so, by a long shot. The necessary skills could be acquired with a ton of time and effort, but it's impossible to justify that for the occasional (unpaid) need such as this one. ‑‑Mandruss  21:49, 13 December 2014 (UTC)Reply
Mandruss I meant: feel free to improve the article. :) You can help improve the tool too of course. :) But in this particular edit I only used the bot account to perform a manual edit. WPCleaner only helped me to spot the unbalanced comment tag. -- Magioladitis (talk) 22:05, 13 December 2014 (UTC)Reply
Magioladitis If I understand correctly, then, you manually "fixed" the ref tags that were inside comments? In that case, we would ask that you not do that in this article, since it interferes with the article's convention as described above. We have a higher level of consistency than exists in most articles, and we like to keep it that way. It's not ownership, as it's consistent with many guidelines that recommend respecting existing article conventions. Note that you "fixed" only three out of about 75 commented-out refs, all of which use the same commenting method described above. ‑‑Mandruss  22:13, 13 December 2014 (UTC)Reply
Mandruss no problem. I already implied that my edit was not perfect. The vital part was to close the comment tag. The other part was incomplete and perhaps wrong. Feel free to correct my edit as long as you keep all comment tags closed. -- Magioladitis (talk) 22:32, 13 December 2014 (UTC)Reply
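
The kind of problem that brought the bot to the page (a comment opened with <!-- but never closed with -->) can be sketched in a few lines of Python. This is only an illustration under the assumption that comments do not nest; it is not the check that checkwiki.pl actually runs, and find_unclosed_comment is a made-up name.

```python
def find_unclosed_comment(wikitext: str) -> int:
    """Return the index of the first '<!--' that has no later '-->', or -1 if none."""
    pos = 0
    while True:
        start = wikitext.find("<!--", pos)
        if start == -1:
            return -1          # no further comment openers
        end = wikitext.find("-->", start + 4)
        if end == -1:
            return start       # opener without a closer
        pos = end + 3          # continue after the closed comment
```

With this rule, <!--ref name=x>text</ref--> counts as a closed comment, while the same text ending in </ref>-- is reported as unclosed.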

Unclosed center tags (error 102?)

Maybe it's time to add unclosed center tags as error #102? Errors 28 and 39 have been reduced and we need a new game to play with. -- Magioladitis (talk) 08:24, 3 October 2014 (UTC)Reply

Article without title

  Resolved

Have a look at this page - a page without a title is reported. ToSter (talk) 07:55, 9 November 2014 (UTC)Reply

Looking at the date it was reported at, it seems to be simply a strange error. George.Edward.CTalkContributions 11:05, 9 November 2014 (UTC)Reply
Not an isolated incident... Happened here too. The reports seem to be refreshing, which explains why WPCleaner picks up nothing. George.Edward.CTalkContributions 18:22, 10 November 2014 (UTC)Reply
George.Edward.C, ToSter, it is a new problem that showed up last month. I've yet to hunt down the offending article. It was causing the daily scan to crash, but I've worked around it. Bgwhite (talk) 18:49, 10 November 2014 (UTC)Reply
Thanks for your work! It's much appreciated. :D George.Edward.CTalkContributions 19:47, 10 November 2014 (UTC)Reply

Whitelists

  Resolved

I don't understand how the whitelists are handled - is there any guide on this? At plwiki, there is a whitelist for #58 but checkwiki still reports pl:Remixes 81 - 04. ToSter (talk) 21:31, 19 November 2014 (UTC)Reply

ToSter, you have it sort of backward. The actual whitelist is a file. The articles are not listed in the translation file. You need to:
  1. Create a file with the articles. You can see the format with enwiki's #58 whitelist at Wikipedia:WikiProject Check Wikipedia/Error 058 whitelist. It needs to be in the format of: * [[article name]]
  2. In the translation file, you put the location of the whitelist. The line in enwiki's translation file is:
error_058_whitelistpage_enwiki=Wikipedia:WikiProject_Check_Wikipedia/Error_058_whitelist END
  3. The info from the translation file is updated every day at 0z.
Bgwhite (talk) 22:17, 19 November 2014 (UTC)Reply
Bgwhite, thanks! That's pretty simple but I can't see it anywhere on the project page... ToSter (talk) 23:08, 19 November 2014 (UTC)Reply
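
As a rough illustration of the two formats described above, here is a small Python sketch that reads a whitelist page and a whitelist line from a translation file. It assumes only what is shown in this thread (list items of the form * [[article name]] and a setting line ending in END); the function names and regular expressions are made up for the example and are not taken from checkwiki.pl.

```python
import re

# One whitelist entry per line: "* [[article name]]"
WHITELIST_ITEM = re.compile(r"^\*\s*\[\[(.+?)\]\]\s*$")

# Setting line as quoted above:
# error_058_whitelistpage_enwiki=<page title> END
WHITELIST_SETTING = re.compile(r"^error_(\d{3})_whitelistpage_(\w+?)=(.*?)\s+END\s*$")


def parse_whitelist(page_text: str) -> list:
    """Return the article names listed on a whitelist page."""
    titles = []
    for line in page_text.splitlines():
        match = WHITELIST_ITEM.match(line.strip())
        if match:
            titles.append(match.group(1))
    return titles


def parse_whitelist_setting(line: str):
    """Return (error number, wiki, whitelist page) or None if the line does not match."""
    match = WHITELIST_SETTING.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```

Lines on the whitelist page that are not list items (introductory text, for instance) are simply skipped in this sketch.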

Possible new issue to scan for

  Resolved

The presence of empty rows, as I removed for instance here. —TheDJ (talkcontribs) 13:58, 25 November 2014 (UTC)Reply

that table is way over-styled (fixed further). the use of double |- is fairly common. some editors feel it improves the readability. note that in this particular case the styles were way over-specified, since the default is text-align:left, and the row styles were being overridden by the cell styles. Frietjes (talk) 16:53, 25 November 2014 (UTC)Reply
one that may be useful would be class="wikitable" class="wikitable sortable" → class="wikitable sortable" or style="foo1" style="foo2" → style="foo2". basically, duplicate class or style declarations where the first one is ignored due to the presence of the second. Frietjes (talk) 16:55, 25 November 2014 (UTC)Reply
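
A possible shape for such a check, sketched in Python. It assumes the caller has already isolated the attribute text of a single element (for example the part after {| on a table's first line), since one wikitext line can legitimately hold several cells with their own style attributes. The names are made up and this is not an existing checkwiki error.

```python
import re

# A class or style attribute with a double-quoted value.
ATTRIBUTE = re.compile(r'\b(class|style)\s*=\s*"[^"]*"', re.IGNORECASE)


def duplicated_attributes(attribute_text: str) -> list:
    """Return the sorted names of class/style attributes that occur more than once."""
    counts = {}
    for match in ATTRIBUTE.finditer(attribute_text):
        name = match.group(1).lower()
        counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, count in counts.items() if count > 1)
```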

CheckWiki on nonWMF wiki

  Resolved

Hi, is it possible to run an instance of the checkwiki tool (the lists of errors) on nonWMF wiki project? We would like to catch and be able to fix errors like you do. Thanks. --Wesalius (talk) 07:27, 5 December 2014 (UTC)Reply

Wesalius, yes it is possible. I have done it before. The key is getting a dump file of the wiki. I can then run the dump on my home computer. Bgwhite (talk) 00:08, 9 December 2014 (UTC)Reply
Bgwhite I have a dump file ready. Could you please advise me on how to get the lists of pages that meet the different error criteria? I imagined there is some script like pywikibot that goes through the dump and looks for the errors...? --Wesalius (talk) 05:48, 9 December 2014 (UTC)Reply
Wesalius It is a Perl program that uses the MediaWiki-Bot Perl module, some Perl XML modules and Mysql/MariaDB. Source code is here. It would be easier if I could download the dump file and run it. Bgwhite (talk) 06:28, 9 December 2014 (UTC)Reply
Bgwhite Ok, I will get a fresh dump. And try to learn my way through Mediawiki-Bot during Christmas :-) Thanks for your help so far.--Wesalius (talk) 17:08, 9 December 2014 (UTC)Reply

Bgwhite Is this dump working? It's produced with:
  php /var/www/wiki/maintenance/dumpBackup.php \
    --plugin=AbstractFilter:/var/www/wiki/extensions/ActiveAbstract/AbstractFilter.php \
    --current \
    --report=100 \
    --output=gzip:/var/www/wiki/WSdump2.gz \
    --filter=namespace:NS_MAIN \
    --filter=noredirect

Wesalius, I downloaded the file and everything looks good. I don't work much on weekends as the wife controls me with a very short leash. I should have something for you Monday... Tuesday your time. I'll use the Czech translation file for the defaults. FYI... pinging doesn't work if you don't sign your message. Bgwhite (talk) 09:32, 13 December 2014 (UTC)Reply

Bgwhite How did it go?--Wesalius (talk) 17:49, 20 December 2014 (UTC)Reply

CHECKWIKI vs WPCleaner

  Resolved

Josve05a will not move this discussion/section to Wikipedia talk:WPCleaner.

@NicoV and Bgwhite: Something is wrong. WPCleaner doesn't list any errors on svwp, even though there are some. It works with enwp, but not with svwp. (tJosve05a (c) 18:49, 13 December 2014 (UTC)Reply

@Bgwhite and Josve05a: I see at least 2 problems:
Currently travelling without much net access, so can't do much. --NicoV (Talk on frwiki) 04:29, 14 December 2014 (UTC)Reply
As of today, still the same error. Could someone in the know please handle it. Cheers, --Bulver (talk) 08:16, 21 December 2014 (UTC)Reply
Bulver There was a problem with the dumps. It rectified itself a few days ago as frwiki, plwiki, arwiki and itwiki have produced valid errors. Next time svwiki dump is produced it should produce valid errors too. Bgwhite (talk) 09:14, 21 December 2014 (UTC)Reply
Excellent. Many thanks //--Bulver (talk) 13:25, 21 December 2014 (UTC)Reply