Wikipedia talk:Bots/Requests for approval/BareRefBot/Code

Code review

if (theurl.indexOf(".pdf") >=0)

We should be looking for ".pdf" at the end of the URL. This will find it anywhere.

webstruct.isDeadLink

Do we know what this does?

" {{Dead link|date=January 2022}}" // replace every month

I would forget to modify the source code every month. I don't think we want whitespace between the URL and the tag. Suggested replacement: "{{Dead link|{{subst:DATE}}}}"

if (theur.indexOf(usethistitle) >= 0)

Typo? ~Kvng (talk) 00:46, 27 January 2022 (UTC)

@Kvng: I have been troubled for some time that your commentary on bare URL fixing is generated by a significant excess of certainty over understanding. This commentary illustrates that problem very well:
  1. {{Dead link|date=January 2022}} tags the link as dead. See {{Dead link}}
  2. Your suggestion of {{Dead link|{{subst:DATE}}}} won't work, because subst is not expanded inside ref tags.
    Making the datestamp self-updating requires coding.
  3. Even if it did expand, {{subst:DATE}} produces the wrong result because you omitted the name of the parameter |date=
  4. On the contrary, we do want whitespace between the URL and the tag to avoid the tag appearing to be part of the URL.
You're right on just one point: that we should be looking for ".pdf" at the end of the URL. But the rest is all wrong. BrownHairedGirl (talk) • (contribs) 18:06, 28 January 2022 (UTC)
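Point 2 above says that a self-updating datestamp has to be produced by the bot's own code. A minimal sketch of that idea follows, assuming the bot is written in JavaScript (the quoted `indexOf` calls suggest so, but that is an assumption). The names `deadLinkTag` and `MONTHS` are hypothetical and do not come from the bot's source. The sketch keeps the leading space argued for in point 4 and names the |date= parameter explicitly.

```javascript
// Hypothetical helper, not taken from the bot's source.
const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Build the tag at run time so the month never has to be edited by hand.
function deadLinkTag(now = new Date()) {
  // UTC is used because talk page timestamps and maintenance dates are in UTC.
  const month = MONTHS[now.getUTCMonth()];
  const year = now.getUTCFullYear();
  // The leading space separates the tag from the URL that precedes it.
  return ` {{Dead link|date=${month} ${year}}}`;
}

// Example: a fixed date in January 2022.
console.log(deadLinkTag(new Date(Date.UTC(2022, 0, 27))));
```

Calling `deadLinkTag()` with no argument uses the current date, so the same code keeps working in later months.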
@BrownHairedGirl: I did not know that subst is not expanded inside ref tags. I do know that {{Dead link|{{subst:DATE}}}} includes |date=. I've always put the inline tags directly after the URL or whatever they're tagging to make it clear what they're attached to. That's what I've seen other editors do too. If we think there should be whitespace, it probably should be non-breaking (&nbsp;). ~Kvng (talk) 21:51, 28 January 2022 (UTC)
@Kvng: It's a great pity that you even entertain the odd idea that adding a whitespace before the tag in <ref>http://example.com/foo {{deadlink}}</ref> could in any way make it unclear what that tag refers to.
I see no need for a non-breaking space. The {{dead link}} tag is displayed only in the list of refs, where it will always immediately follow the URL. In the overwhelming majority of cases, it will be on the same line as the URL.
However, where a non-breaking space does have any effect, it will have the highly undesirable effect of preventing wrapping of long lines. That is highly disruptive to the display.
Yet again, you are making ill-considered proposals based on ignorance of the issues involved, and without thinking them through. This is disruptive, because while your ill-considered ideas can be rebutted, the discussion is cluttered by bad idea plus rebuttal.
Please restrict your comments to issues where you know what you are talking about. BrownHairedGirl (talk) • (contribs) 22:32, 28 January 2022 (UTC)
No, I'm not going to be bullied. We don't always know what we don't know. Please be nice. ~Kvng (talk) 22:40, 28 January 2022 (UTC)
You are not being bullied. You are being asked to do your homework before posting.
Please be nice, by desisting from wasting the time of other editors by posting ill-considered proposals. BrownHairedGirl (talk) • (contribs) 22:58, 28 January 2022 (UTC)
We should be looking for ".pdf" at the end of the URL. This will find it anywhere. Yes, but keep in mind that there are some URLs like "https://example.com/test.pdf/" and it needs to be able to detect those.
webstruct.isDeadLink If the networking code finds the link is dead, it will return a structure with "isDeadLink" set to true. In a previous version (the one on my AWB runs) it would retrieve a Wayback URL, but in this version (the one that the bot will run) it will just tag the URL.
Regarding date, your suggested placement sounds cool. I did not know about that. Thanks for leaving a note. So the code review has been good so far.
Typo fixed.

Rlink2 (talk) 00:56, 27 January 2022 (UTC)

I think https://example.com/test.pdf/ would be a directory not a PDF file. Can you give any other examples where we should be looking for .pdf somewhere in the middle of the URL? ~Kvng (talk) 17:42, 28 January 2022 (UTC)
Well, I prefer to be safe rather than sorry and err on the side of caution. I don't want to be filling in things that I can't verify are of a certain quality. A URL like that indicates that there might be more than meets the eye. — Preceding unsigned comment added by Rlink2 (talk • contribs)
Your code already processes bare URLs that end in a trailing / ... i.e., URLs that point to directories. It doesn't matter if ".pdf" is in the name of the directory or not; it should process it the same either way. Levivich 18:30, 28 January 2022 (UTC)
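For reference, the check discussed in this thread could be sketched as follows. This is only an illustration, not the bot's code: `looksLikePdf` is a hypothetical name, and treating ".../test.pdf/" as a PDF is the cautious reading proposed above, which other commenters dispute. The sketch tests the end of the URL path rather than searching the whole URL, so ".pdf" in a host name, query string, or middle path segment does not match.

```javascript
// Hypothetical helper, not taken from the bot's source.
function looksLikePdf(theurl) {
  let path;
  try {
    // The standard URL parser separates the path from the host and query string.
    path = new URL(theurl).pathname;
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  // Assumption: strip trailing slashes so ".../test.pdf/" is treated
  // like ".../test.pdf". Remove this line to treat it as a directory.
  const trimmed = path.replace(/\/+$/, "");
  return trimmed.toLowerCase().endsWith(".pdf");
}

console.log(looksLikePdf("https://example.com/test.pdf"));       // true
console.log(looksLikePdf("https://example.com/test.pdf/"));      // true
console.log(looksLikePdf("https://example.com/pdf-guide.html")); // false
```

By contrast, `theurl.indexOf(".pdf") >= 0` would also match a URL such as "https://example.com/a.pdf.html", which is the problem raised at the top of this section.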
As we've discussed, there are several ways a link can be dead. Do we know which ones isDeadLink is responsive to? ~Kvng (talk) 17:42, 28 January 2022 (UTC)
Just 404 links. If there is another issue with the link, the structure will have nothing and the script will skip over it. Rlink2 (talk) 18:05, 28 January 2022 (UTC)
I've found several examples where dead URL tags are applied for things other than 404; see my cmts at the bot request page. Levivich 18:30, 28 January 2022 (UTC)
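The behaviour described above (tag only on 404, skip on any other problem) could be sketched like this. It is an assumption-laden illustration: `classifyStatus` is a hypothetical name, the shape of the returned structure is inferred from this discussion rather than from the bot's networking code, and returning `null` stands in for "the structure will have nothing".

```javascript
// Hypothetical helper; the real networking code is not shown in this thread.
function classifyStatus(statusCode) {
  if (statusCode === 404) {
    // Only a 404 response marks the link as dead.
    return { isDeadLink: true };
  }
  if (statusCode >= 200 && statusCode < 300) {
    // Successful response: the link is live.
    return { isDeadLink: false };
  }
  // Anything else (timeouts mapped to 0, 403, 5xx, ...) is inconclusive,
  // so the caller should skip the URL instead of tagging it.
  return null;
}

const result = classifyStatus(503);
if (result === null) {
  console.log("inconclusive, skipping");
} else if (result.isDeadLink) {
  console.log("tag as dead link");
}
```

Keeping the inconclusive case separate from the dead case makes it harder for a temporary server error to produce a dead link tag.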

webstruct.website Do we know what this does?

If it can retrieve the "website" parameter, the structure will contain it. Otherwise, it would be null. Rlink2 (talk) 18:03, 28 January 2022 (UTC)
It would be great to see this code. Levivich 18:30, 28 January 2022 (UTC)
@Levivich: I agree.
@Rlink2: please can you promptly publish all the code for the bot, as used for the trial run. This is only a code fragment, and trying to review a fragment is futile. BrownHairedGirl (talk) • (contribs) 19:01, 28 January 2022 (UTC)
@BrownHairedGirl: Ok, as soon as I finish replying to the other people on the project page. Rlink2 (talk) 19:04, 28 January 2022 (UTC)
@BrownHairedGirl: @Levivich: The full script has been uploaded with comments. The website title grabber script is separate and I will upload that too. Rlink2 (talk) 20:18, 28 January 2022 (UTC)
@Rlink2: the code you have uploaded has neither date nor version number.
Please start numbering and dating your code files, and using those numbers in any edits. The bot cannot be properly evaluated if we don't know which version of the code was used.
And please make sure that all the code is uploaded.
You have done a lot of very good work here, but this drip-drip disclosure is frustrating, and it increases the chances that the bot will be declined simply because editors cannot clearly and easily see what the bot is doing, and therefore cannot have confidence that you have full control of the bot. BrownHairedGirl (talk) • (contribs) 22:39, 28 January 2022 (UTC)
@BrownHairedGirl: Your hostility here is inappropriate and likely counterproductive. AFAIK there is no requirement to disclose source code and there's certainly no deadline. ~Kvng (talk) 22:47, 28 January 2022 (UTC)
@Kvng: you are projecting your own hostility onto me. I am not being in the slightest bit hostile to @Rlink2. On the contrary, I am trying to help Rlink2 to improve the bot's chances of approval.
Sure, code publication is not required. But code publication helps editors to assess the bot, and the more clearly and systematically it is published the greater the chance that editors will be able to satisfy themselves that the bot is working in a way which they approve. BrownHairedGirl (talk) • (contribs) 22:56, 28 January 2022 (UTC)
She isn't being hostile, she's just telling the truth. Both of you have been helpful. I will certainly start dating the code per BHG's request. I had no intention of "drip-drip" disclosure, it's just that I need to comment all lines of my code and make sure it's clean before I release it. This is so it is easier for other editors to understand the code, know what it is doing, and provide suggestions, which is the whole point of source code release.
There are two scripts, one that places the website titles, and one that gets the website titles. The "placer" usually runs right after the "getter". The former has been uploaded here, in its complete entirety, shortly after BHG's request. I will upload the getter as soon as I finish commenting all of it. Granted, it's just glue for interacting with other pieces of software, and the getter and this are basically the same in code structure (the "getter" just has different code in the bare "if" statement), but in the interest of transparency (after all, Levivich said We value transparency), and since you want all the code, I will upload it. Rlink2 (talk) 23:09, 28 January 2022 (UTC)
Now it looks like you're being hostile to Rlink2 and gaslighting me. Law of holes applies to both of us here. ~Kvng (talk) 23:10, 28 January 2022 (UTC)
@Kvng: Please read before posting. I have made it v clear that I am not being hostile to Rlink2.
And I am not gaslighting you. Asking you to stop posting ill-considered rubbish would be gaslighting only if you were not actually posting ill-considered rubbish. But you have posted ill-considered rubbish, multiple times, and I have documented where you did it. Please stop creating disruptive drama. BrownHairedGirl (talk) • (contribs) 23:53, 28 January 2022 (UTC)
Ok, I guess I will be bullied. You are unpleasant for me to work with. I'll go apply myself elsewhere. ~Kvng (talk) 15:15, 29 January 2022 (UTC)