Wikipedia:Bots/Requests for approval/PkbwcgsBot 20
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Pkbwcgs (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 13:06, Sunday, January 13, 2019 (UTC)
Function overview: The bot will fix some broken Wall Street Journal external links.
Automatic, Supervised, or Manual: Supervised
Programming language(s): AWB
Source code available: AWB
Links to relevant discussions (where appropriate): Wikipedia:AutoWikiBrowser/Tasks#Fix_Wall_Street_Journal_links
Edit period(s): One-time run
Estimated number of pages affected: Over 1,100
Namespace(s): Article
Exclusion compliant (Yes/No): Yes
Function details: The bot is going to fix some broken Wall Street Journal external links. The find pattern will be https://www.wsj.com/news/articles/ and the replace pattern will be https://www.wsj.com/articles/ and this should fix all of the broken external links.
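For illustration only, the find-and-replace described in the function details can be sketched in Python as a plain literal substitution. The bot itself runs in AWB, so this is not the operator's code; the function name `fix_wsj_links` and the article ID `SB123` in the sample are hypothetical.

```python
# Literal strings taken from the function details above.
FIND = "https://www.wsj.com/news/articles/"
REPLACE = "https://www.wsj.com/articles/"


def fix_wsj_links(wikitext: str) -> str:
    """Replace every literal occurrence of the old prefix (no regex involved)."""
    return wikitext.replace(FIND, REPLACE)


# Hypothetical external link; "SB123" is a made-up article ID.
sample = "[https://www.wsj.com/news/articles/SB123 Example story]"
fixed = fix_wsj_links(sample)
```

Because this substitution is purely literal, it also rewrites the prefix wherever it appears inside a longer string.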
Discussion
- What regex are you using? What do you have in place to not change that string when it is within another string, such as a webarchive url? — xaosflux Talk 01:20, 17 January 2019 (UTC)
- @Xaosflux: I am not using any regex. I am removing out "/news/" in each of the urls so that it is not broken like said in the discussion. Pkbwcgs (talk) 16:11, 17 January 2019 (UTC)
- @Pkbwcgs: can you provide a couple examples of the broken links? In some random samples of your search above, the links appear to be working. — xaosflux Talk 16:23, 17 January 2019 (UTC)
- @Xaosflux: The links weren't working at all when I opened the BRFA. However, they are only redirecting to the correct url. Pkbwcgs (talk) 16:32, 17 January 2019 (UTC)
- @Pkbwcgs: what will you do if you encounter something like this:
archiveurl=https://web.archive.org/web/nnnnnnn/http://www.wsj.com/news/articles/xxxxxxxx
? — xaosflux Talk 16:44, 17 January 2019 (UTC)
- @Xaosflux: I haven't found any such cases so far but they would remain unfixed because they are at a web archive so it is unlikely that their urls would have changed. Pkbwcgs (talk) 16:47, 17 January 2019 (UTC)
- @Pkbwcgs: with your task running in automatic mode - what are you going to do to ensure you do not match that substring that otherwise matches your pattern? — xaosflux Talk 16:49, 17 January 2019 (UTC)
- @Xaosflux: That is not a problem as the list of affected URLs do not contain any urls from web archive that matches this pattern. Pkbwcgs (talk) 16:52, 17 January 2019 (UTC)
- @Pkbwcgs: That link proves nothing, because it's not a suffix-based link search. I don't know of a quick way to find counterexamples, though, except for perhaps this edit that prompted me to start this discussion. The trial edits were fine, though, as far as I'm concerned. Graham87 06:40, 18 January 2019 (UTC)
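One way to address the substring concern raised in this thread is a guarded replacement that leaves the prefix alone when it is embedded in a longer URL such as a web archive link. The following is a minimal Python sketch under stated assumptions, not what the bot does (the operator states no regex is used and that such links are skipped under supervision). The name `fix_wsj_links_guarded` is hypothetical, and the archived sample assumes an embedded `https://` copy of the prefix, with the placeholder values taken from the example above.

```python
import re

# Match the old prefix only when the character immediately before it is not "/".
# Inside an archive URL the embedded address follows a "/", so it is skipped.
PATTERN = re.compile(r"(?<!/)https://www\.wsj\.com/news/articles/")
REPLACEMENT = "https://www.wsj.com/articles/"


def fix_wsj_links_guarded(wikitext: str) -> str:
    """Rewrite standalone occurrences of the old prefix, leaving embedded ones alone."""
    return PATTERN.sub(REPLACEMENT, wikitext)


# Placeholder values (nnnnnnn, xxxxxxxx) are kept from the discussion's example.
live = "url=https://www.wsj.com/news/articles/xxxxxxxx"
archived = (
    "archiveurl=https://web.archive.org/web/nnnnnnn/"
    "https://www.wsj.com/news/articles/xxxxxxxx"
)

live_fixed = fix_wsj_links_guarded(live)
archived_fixed = fix_wsj_links_guarded(archived)
```

The negative lookbehind is a simple heuristic: it only checks the single preceding character, so it would not catch an embedded URL that follows some other separator.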
- Approved for trial (40 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — xaosflux Talk 16:57, 17 January 2019 (UTC)
- The first 27 edits are located here. Pkbwcgs (talk) 21:21, 17 January 2019 (UTC)
- @Pkbwcgs: how are you going to avoid breaking links such as this one that Graham87 brought up? — xaosflux Talk 16:40, 27 January 2019 (UTC)
- @Xaosflux: Those ones are going to be skipped. Pkbwcgs (talk) 16:42, 27 January 2019 (UTC)
- @Pkbwcgs: with this job running in automatic mode, can you describe how you will avoid this? — xaosflux Talk 20:29, 27 January 2019 (UTC)
- @Xaosflux: Given that I have operated this task in supervised mode so far and checked each edit; the task will run with supervision. Every edit in this trial so far I saved manually without automatic saving so this is a supervised task. Pkbwcgs (talk) 20:32, 27 January 2019 (UTC)
- Approved. Task approved. — xaosflux Talk 20:39, 27 January 2019 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.