Wikipedia:Bots/Requests for approval/Prombot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Denied.
Operator: «l| Ψrom3th3ăn ™|l» (talk)
Automatic or Manually Assisted: Automatic
Programming Language(s): Pywikipedia
Function Summary:
- Scans articles and adds a <references/> section to articles that need one (weekly)
- Checks the last (number yet to be decided) uploaded images for missing copyright status templates, and also marks duplicates (hourly or every 30 minutes, pending discussion)
- Checks for orphaned pages and adds the orphaned page template as required (daily)
- Checks all external links and reports them on the article talk page if they are dead two checks in a row (fortnightly)
Edit period(s) (e.g. Continuous, daily, one time run): Multiple
Already has a bot flag (Y/N):
Function Details:
- Gets the list of all pages and scans each page in the mainspace to check whether it has references in the body but no references section. Adds a references section to the bottom of the page if required and continues.
- Gets the last 80 uploaded files and checks each one for copyright status templates; if a template is missing, adds {{di-no license}} to the image and notifies the uploader.
- Checks for orphaned pages and adds the orphaned page template as required.
- Scans all pages via Special:Allpages for dead external links, saving them to a local data file. If on the next run they are still dead (still in the file), the bot adds a notice on the talk page requesting someone to check the link and remove it if necessary. If the link is back up, it is removed from the local data file (fortnightly, automatic); a sketch of this two-check logic follows below.
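For illustration only, a minimal sketch (not the bot's actual code) of the two-check dead-link logic described above, assuming a pickled local data file and hypothetical helper names:

    # Sketch of the fortnightly dead-link check (task 4): a link is reported on the
    # article's talk page only after failing on two consecutive runs; links that
    # respond again simply drop out of the local data file.
    import os
    import pickle
    import urllib2  # Pywikipedia-era (Python 2) HTTP library

    DEAD_FILE = 'deadlinks.dat'  # local data file kept between runs

    def load_previous_failures():
        """Load the set of (article title, url) pairs that were dead on the last run."""
        if os.path.exists(DEAD_FILE):
            return pickle.load(open(DEAD_FILE, 'rb'))
        return set()

    def link_is_dead(url):
        """Return True if the URL does not respond."""
        try:
            urllib2.urlopen(url, timeout=30)
            return False
        except Exception:
            return True

    def check_article(title, urls, previous_failures, new_failures):
        """Record first failures; flag second consecutive failures for a talk-page notice."""
        for url in urls:
            if not link_is_dead(url):
                continue  # link is back up, so it is not carried forward
            if (title, url) in previous_failures:
                print('Would post a notice on Talk:%s about %s' % (title, url))
            else:
                new_failures.add((title, url))

    def save_failures(new_failures):
        """Persist this run's first-time failures for the next fortnightly run."""
        pickle.dump(new_failures, open(DEAD_FILE, 'wb'))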
Discussion
Per Wikipedia:BOT, bots should have "bot" or something signifying an automated process in the user name --Chris 10:10, 19 July 2008 (UTC)[reply]
- I'll request it to be renamed now «l| Ψrom3th3ăn ™|l» (talk) 10:11, 19 July 2008 (UTC)[reply]
- If I recall, a bot's username should either have some relation to its operator's username, or else to its task (e.g. Giggabot, Chris G Bot, RFRBot). Sir BOT wouldn't appear to meet either of these criteria.
On another note: do you have any past experience with Pywikipedia? What will it look for to determine if an article should be skipped ("a reference section" is too vague), and what will it add? —Giggy 10:27, 19 July 2008 (UTC)[reply]
- I have chosen Prombot; the username was an oversight on my part. I have used Pywikipedia on my own test wiki to play around, and am aware of its ins and outs. The code works like this:
- Is there a </ref> tag? (If yes, continue. If not, next article.)
- Is there a <references/> tag? (If yes, next article. If not, continue.)
- Is there a <references /> tag? (If yes, next article. If not, continue.)
- Is there a known references template? (If yes, next article. If not, continue.)
- Add <references/> to the bottom of the page.
(see relevant detection code below).
def lacksReferences(self, text, verbose=True):
    """
    Checks whether or not the page is lacking a references tag.
    """
    oldTextCleaned = wikipedia.removeDisabledParts(text)
    if not self.refR.search(oldTextCleaned):
        if verbose:
            wikipedia.output(u'No changes necessary: no ref tags found.')
        return False
    elif self.referencesR.search(oldTextCleaned):
        if verbose:
            wikipedia.output(u'No changes necessary: references tag found.')
        return False
    else:
        if self.referencesTemplates:
            templateR = u'{{(' + u'|'.join(self.referencesTemplates) + ')'
            if re.search(templateR, oldTextCleaned, re.IGNORECASE):
                if verbose:
                    wikipedia.output(u'No changes necessary: references template found.')
                return False
        if verbose:
            wikipedia.output(u'Found ref without references.')
        return True
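For reference, the refR and referencesR patterns assumed by lacksReferences() above could be compiled roughly as follows (a simplified sketch; the actual patterns in Pywikipedia's noreferences.py are more elaborate):

    import re

    # Simplified stand-ins (an assumption, not the exact Pywikipedia definitions):
    # any closing ref tag, and a <references/> tag with or without a space before the slash.
    refR = re.compile(r'</ *ref *>', re.IGNORECASE)
    referencesR = re.compile(r'<references */>', re.IGNORECASE)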
«l| Ψrom3th3ăn ™|l» (talk) 10:42, 19 July 2008 (UTC)[reply]
I have renamed the bot to Prombot per [1] =Nichalp «Talk»= 12:54, 19 July 2008 (UTC)[reply]
- Note: all of the functions are done with different scripts operating out of the same bot account, so you can of course approve them separately (whether entirely or for trials). Again, I apologize for the naming stuff-up :-) «l| Ψrom3th3ăn ™|l» (talk) 15:43, 19 July 2008 (UTC)[reply]
- This needs to be split into four different requests. BJTalk 18:40, 19 July 2008 (UTC)[reply]
- I would suggest the first and fourth tasks be denied. Scanning every mainspace page for missing {{reflist}}s and dead links every 2 weeks via the API and Special:Export is a massive waste of resources (even doing it once would be iffy). It should either be done in smaller groups of pages or it should use a database dump. Mr.Z-man 19:35, 19 July 2008 (UTC)[reply]
- Second and third tasks are already being done by at least two bots. BJTalk 02:43, 20 July 2008 (UTC)[reply]
- Task 1 can be done on a single page, category, subcategory or all pages using an XML dump. The 4th can be done on a single namespace. The 2nd and 3rd tasks still need doing, and the bots already doing them don't run often enough; another bot will help with this. «l| Ψrom3th3ăn ™|l» (talk) 03:59, 20 July 2008 (UTC)[reply]
- Tasks 1 and 4 are fine, as long as you use a database dump to do them if you want to do every article in one run (which means you'll only be able to do it about once a month). I would assume that task 4 is already planned to run on the article namespace only. It wouldn't be very useful elsewhere. Also, how is it going to check for orphaned pages? Mr.Z-man 17:46, 20 July 2008 (UTC)[reply]
- Task 1 will use an XML dump. Task 4 will only run in the article namespace, but does not have the capacity to use an XML dump. What I can do is limit its max connections to 1, so its load on the servers would be minimal; I estimate that at 1 page at a time, it would be around 25-35 seconds between pages (hardly noticeable). Task 3 utilises Special:LonelyPages as a list of pages to tag, with the exception of redirect and disambiguation pages. «l| Ψrom3th3ăn ™|l» (talk) 01:16, 21 July 2008 (UTC)[reply]
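For illustration only, a rough sketch of how task 3 could tag entries from Special:LonelyPages while skipping redirects and disambiguation pages; the site.lonelypages() generator call, the helper names and the exact tag wording are assumptions, not the bot's actual code:

    import wikipedia  # old Pywikipedia framework, as used by the other scripts here

    ORPHAN_TAG = u'{{orphan|date=July 2008}}'

    def tag_orphans(site, limit=100):
        """Tag entries from Special:LonelyPages, skipping redirects and disambiguation pages."""
        # Assumption: the framework exposes Special:LonelyPages as a page generator.
        for page in site.lonelypages(number=limit):
            if page.isRedirectPage() or page.isDisambig():
                continue
            text = page.get()
            if '{{orphan' in text.lower():
                continue  # already tagged
            page.put(ORPHAN_TAG + u'\n' + text, comment=u'Tagging orphaned article')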
- REQUESTING TRIAL APPROVAL FOR TASK 2 - One-time run, last 500 files uploaded. If it's okay, it uses the same process as the existing Python bot that does this task. «l| Ψrom3th3ăn ™|l» (talk) 02:18, 21 July 2008 (UTC)[reply]
Denied. Procedural deny; please file this properly. BJTalk 03:31, 21 July 2008 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.