User talk:Σ/Archive/2013/December
This is an archive of past discussions about User:Σ. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Well done
That's some of the trickiest user page source obfuscation I've ever seen. — Scott • talk 12:19, 28 November 2013 (UTC)
- Thanks. →Σσς. (Sigma) 08:21, 1 December 2013 (UTC)
Lowercase sigmabot III Archiving, thanks
Thank you for putting in the work to get this functional. I know that doing so takes quite a bit of time and effort. It is certainly appreciated. Makyen (talk) 00:27, 2 December 2013 (UTC)
- I also want to express my thanks. People like you help ensure Wikipedia functions smoothly in subtle yet important ways and don't get enough appreciation for it. Thank you! –Prototime (talk · contribs) 04:07, 2 December 2013 (UTC)
Getting an archiving bot at English Wikisource
Would you be able to explain to me the means for getting (this bot|a bot like this) User:lowercase sigmabot … operating at English Wikisource? We are not particularly hacking-inclined, and having a configurable bot to look after things would be most useful in so many places. Thanks for any guidance. — billinghurst sDrewth 10:16, 2 December 2013 (UTC)
- The bot is not yet ready for use by people who don't know their way around it. I'll let you know when it's more stable. →Σσς. (Sigma) 11:03, 2 December 2013 (UTC)
Please comment on Talk:Vladimir Putin
Greetings! You have been randomly selected to receive an invitation to participate in the request for comment on Talk:Vladimir Putin. Should you wish to respond to the invitation, your contribution to this discussion will be very much appreciated! If in doubt, please see suggestions for responding. If you do not wish to receive these types of notices, please remove your name from Wikipedia:Feedback request service. — Legobot (talk) 00:05, 3 December 2013 (UTC)
Bot on the fritz again...
Hi Σ,
Your archival bot seems to be a bit on the fritz. It just archived my talkpage and the page it archived it to was numbered "8 8", rather than in sequence with previous pages. Unless it's an error on my part, in which case, can you explain?
Thanks in advance. Black Yoshi (Yoshi! | Yoshi's Eggs) 15:27, 1 December 2013 (UTC)
- That's a problem with your configuration template. Your counter is set to 8, and the archive page is set to User talk:Black Yoshi/Archive 8 %(counter)d. The bot replaced %(counter)d with the counter, 8, which produced User talk:Black Yoshi/Archive 8 8. →Σσς. (Sigma) 20:43, 1 December 2013 (UTC)
- Can you explain how to fix it? I'd try to figure it out, but I'm afraid I'll break something and won't be able to fix it again. Black Yoshi (Yoshi! | Yoshi's Eggs) 03:30, 3 December 2013 (UTC)
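The substitution described in this thread behaves like Python's named %-style formatting. The following is a minimal sketch, assuming the bot applies ordinary `%` formatting to the configured pattern; the `intended` pattern is a guess at what the configuration was meant to be, not something stated in the thread.

```python
# Sketch of how an archive pattern expands, assuming plain %-style formatting.
counter = 8

# Pattern as configured: a literal "8" followed by the placeholder.
misconfigured = "User talk:Black Yoshi/Archive 8 %(counter)d"
print(misconfigured % {"counter": counter})

# Assumed intended pattern: let the placeholder supply the number.
intended = "User talk:Black Yoshi/Archive %(counter)d"
print(intended % {"counter": counter})
```

Under that assumption, dropping the literal number from the archive pattern and leaving only the placeholder would keep the archive pages in sequence.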
Lowercase sigmabot III RFE, unsigned comment template recognition
While I was moving the archive pages of Talk:Comparison of American and British English, I noticed a few sections on the talk page which should have been moved to the archive, but which were left on the main talk page. The sections are Use of tenses and "I couldn't care less". In addition, Quotes and punctuation should also be archived, but has unsigned text after the last date (an unsigned comment template is the only date in that section).
I put this as a bug/RFE because a brief look shows that the sections were not moved by MiszaBot. However, they were not old enough to qualify to be moved at the last time which I am able to verify MiszaBot ran on the page (17:37, April 26, 2012). Thus, it is not clear from this small sample whether MiszaBot would have recognized the date and archived them correctly. A quick look at the archives from this page does not show any instances where MiszaBot's actions indicate, either way, whether it correctly identified and archived similar sections (where the only date is in an unsigned comment template). The most recent similar cases on that page appear to be prior to auto-archiving being enabled on the page (Archive 7). Either way (Bug or RFE), the sections where an unsigned comment template is the only date should be moved if they otherwise qualify. Makyen (talk) 00:27, 2 December 2013 (UTC)
- They aren't using standard timestamps that the bot will see. Notice how all the stamps in those threads are missing (UTC). →Σσς. (Sigma) 00:43, 2 December 2013 (UTC)
- I had believed that the unsigned date was added by SineBot due to a quick check at that time of the page history. After checking in more detail, I found that an original signature was provided by SineBot, but was then deleted by the IP user shortly thereafter. A signature was later added to the three sections manually by a different user (PBS on 13:20, December 26, 2011). It appears that PBS used the {{unsigned}} template to add the signatures, but did not include the text "(UTC)" in the date argument provided.
- I expect it is common for an editor who is using any of the several {{unsigned}} templates (there is a list on that page) to forget to include the "(UTC)" text because the "(UTC)" is not included on the history page (which is likely to be the source for a cut & paste) and the text is not automatically added by the template. You might want to consider having the bot recognize dates that do not include the "(UTC)" text when they are within a signature generated by one of the {{unsigned}} templates. I expect that the work to have the bot recognize these is less than the work to change all of the current places where this is the case. It will also be an ongoing issue if the templates are not changed to add "(UTC)" when not provided. To find these occurrences the bot could search lines with the text "<!-- Template:Unsigned". Catching all of the occurrences may be more complex (e.g. for {{Uns-ip}}).
- The real question is if the lack of "(UTC)", when using these templates, is sufficiently widespread to justify code on your part to compensate for the problem. In my opinion, the answer is yes, the issue is sufficiently widespread to justify code to compensate for the lack of "(UTC)" in these instances.
- I am attempting to not bring into consideration the general case of recognizing generic signature dates without the "(UTC)" text because most normal methods of signing contributions result in the "(UTC)" text being included (with the exception of these templates).
- Assuming that MiszaBot did not recognize these, this change would definitely be a RFE, not a bug. Makyen (talk) 02:29, 2 December 2013 (UTC)
- I originally described my bot as literally MiszaBot with a different implementation. MiszaBot never did anything like this, and thus, neither does mine. This can change, though.
- I changed the section title to say RFE only, not Bug, as it is clear that this is an enhancement.
- Why don't we ask the question, "do threads containing {{unsigned}} usually receive no replies from someone who uses ~~~~ to add a timestamp the bot can parse"? Or perhaps, "Shouldn't {{unsigned}} always add (UTC), either by an editor or the template itself"?
- Both are good questions.
- A) There are certainly a significant number of cases where someone else will have a comment in such a section providing you with a date that has "(UTC" in it. On the other hand, there are probably other sections where the most recent comment is made by someone where the date/time does not properly include the "(UTC". In such cases, the bot could archive the section prior to it actually meeting the time requirement.
- [Arrrg... This brings up something from vaguely recalled memory, from years ago, as to the functionality of MiszaBot: What I am recalling is that I had to add dates onto the last text in a section in order for MiszaBot to recognize it as valid to consider for archiving. In other words, finding a date in the section was not sufficient, the date had to be on the comment that was at the end of the section. It is also possible that my memory on this issue is faulty.]
- B) Given that there is functionality on WP which relies on having the "(UTC" text exist in a date then it is reasonable for the {{unsigned}} template to do at least some work at attempting to place that text in, probably by making the assumption that any date entered without (UTC) is entered in the date that the user set for themselves. As you have done, changing the template to include some "(UTC" text when a date was supplied was one of the first things I thought about doing. It would, of course, only be an assumption on the part of the template.
- However, any change in these templates does not solve the problem of already existing entries, only future entries.
- That is, why are we catering to the people who use {{unsigned}} and don't follow the directions written on the documentation page?
- Because the case exists on pages which this bot is archiving. It can reasonably be argued on both sides as to whether the bot should handle these cases. The question is: should a user handle these manually, or should they be handled by the bot?
- Perhaps we could extend {{unsigned}} to add (UTC) if one is not provided. We could also modify my bot to read stamps more leniently, in a sense, if {{unsigned}} was used. We could do something else, too. But I am not convinced that the second option is justified. →Σσς. (Sigma) 11:03, 2 December 2013 (UTC)
- Ultimately, the point of the bot is to reduce the amount of work which editors manually have to put in to accomplish a similar task. The question is to what extent the time and effort needed to increase the complexity of the bot, and debug it, is justified by the reduction in time manually spent, by many editors, on archiving. Answering that question requires either data, or guesses, as to how common the issue being addressed is. Acquiring such data would require the bot to keep track of how many sections it encounters where it is unable to find a valid date/time including a "(UTC". I expect that we are not going to accumulate such data, thus guessing is necessary to determine if the problem is sufficiently widespread to justify the time spent on the bot.
- Makyen (talk) 20:53, 2 December 2013 (UTC)
- Fair enough. I'll work on this when I have more free time. →Σσς. (Sigma) 06:14, 4 December 2013 (UTC)
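The lenient parsing discussed in this thread could be sketched roughly as follows. This is an illustration only, not the bot's actual code: the regular expressions, the day-before-month date order, the helper name `find_timestamps`, and the idea that the "<!-- Template:Unsigned" marker sits on the same line as the date are all assumptions made for the example.

```python
import re

# Standard signature timestamp, e.g. "00:27, 2 December 2013 (UTC)".
STRICT = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")
# Same shape, but "(UTC)" is optional (hypothetical lenient variant).
LENIENT = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4}(?: \(UTC\))?")
# Marker text mentioned in the thread as a way to spot {{unsigned}} output.
UNSIGNED_MARKER = "<!-- Template:Unsigned"


def find_timestamps(line):
    """Return timestamps in a line, relaxing the "(UTC)" rule only for {{unsigned}} lines."""
    pattern = LENIENT if UNSIGNED_MARKER in line else STRICT
    return pattern.findall(line)


signed = "Thanks. Example (talk) 08:21, 1 December 2013 (UTC)"
unsigned = "<!-- Template:Unsigned -->Preceding unsigned comment added by Example (talk) 13:20, 26 December 2011"
plain = "Posted 13:20, 26 December 2011 by someone"

for line in (signed, unsigned, plain):
    print(find_timestamps(line))
```

Restricting the relaxed pattern to lines carrying the marker keeps ordinary prose that happens to contain a date from being treated as a signature.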
Gimme a break!
DO NOT clear the sandbox while someone is trying to use it. Please. Kelisi (talk) 17:01, 4 December 2013 (UTC)
Bot did not archive at all yesterday
The Internationale
I have run to ground a now little-known English version of The Internationale (the version I remember from my youth)—the Socialist Labor Party of America version. I would like to see it regain recognition, and since you have contributed to the Wikipedia page on the song, I thought perhaps you could help to get it posted there. abarbour(at)lightspeed.net 99.92.81.95 (talk) 01:53, 6 December 2013 (UTC)
- Although I would like to see it on Wikipedia, there's not much of a chance if we don't know where it came from, thanks to our policy on non-free content. →Σσς. (Sigma) 07:38, 9 December 2013 (UTC)
Archive
Can I ask you to archive the old discussions (older than 90 days) on Talk:Abkhazia? Thanks. Jan CZ (talk) 13:18, 8 December 2013 (UTC)
- (talk page stalker) Archived. It was not archiving because minthreadsleft was at its default setting (5). --레비ReviDiscussSUL Info 06:48, 9 December 2013 (UTC)
- One discussion (older than 90 days) is still waiting there for archiving. Can you also archive it? Thank you. Jan CZ (talk) 11:03, 9 December 2013 (UTC)
- (talk page stalker) Done. I wanted to try out semi-automatic archiving with the One Click Archiver so I went ahead and did this task. It worked well. Makyen (talk) 11:43, 9 December 2013 (UTC)
- Thanks! Jan CZ (talk) 15:22, 9 December 2013 (UTC)
Talk:Paul Walker
I reverted this edit to Talk:Paul Walker because Talk:Paul Walker/Archive 1 did not exist. Now that I've created the page, could you please run the bot over this again? Is there any way for the bot to automatically take some action if this was to happen on another page? (e.g. create the archive page or not archive or throw an error somewhere?) Thanks! GoingBatty (talk) 05:16, 10 December 2013 (UTC)
P.S. I apologize for rolling back the edit instead of undoing the edit with an informative edit summary. I immediately realized my mistake, which is why I posted here. GoingBatty (talk) 05:18, 10 December 2013 (UTC)
- I was having problems with my internet connection today. When I realised that my poor connection could prevent my bot from acting if it failed to archive a page, I stopped today's archiving run. Thank you for letting me know. →Σσς. (Sigma) 07:04, 10 December 2013 (UTC)
- Hi again! It appears your bot removed 29 threads from Talk:Paul Walker, but didn't add them to the Talk:Paul Walker/Archive 1 page I created. Did I do something wrong when I created the archive page? Thanks! GoingBatty (talk) 00:44, 11 December 2013 (UTC)
- There was a blacklisted link that prevented the edit from going through. When this happens, the API gives the error in a non-standard format, which my bot did not recognise as an error. →Σσς. (Sigma) 02:57, 11 December 2013 (UTC)
- Thanks for the reply - I see that the third time was the charm. Thanks for all your work with archiving! GoingBatty (talk) 03:00, 11 December 2013 (UTC)
- (talk page stalker) I have reverted the removal of those threads. The user contributions for lowercase sigmabot III show that the threads were removed, but not saved anywhere. The problem appears to be that mediamass.net was added to the Wikipedia SPAM Black List at 14:17, December 3, 2013 (UTC-7). This was between the time that the link was added to Talk:Paul Walker and when lowercase sigmabot III tried to save the text into Talk:Paul Walker/Archive 1. The bot was prevented from saving the text because the link was blocked. It is not a problem in your configuration, but is a bug in the bot. If the bot was unable to save the archive, it should not have deleted the threads from the talk page to begin with. On the other hand, if the bot had performed correctly (i.e. done nothing to your talk page), you would probably be here asking why archiving was not occurring on that talk page. We would probably all be scratching our heads wondering why. In the time it took me to get to writing this note, the bot correctly archived your page. Makyen (talk) 04:04, 11 December 2013 (UTC)
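One way to avoid missing a failure that arrives in an unexpected format is to treat an edit as successful only when the response positively says so. The sketch below is an illustration, not the bot's code: the helper name `edit_succeeded` is made up, and the three response dictionaries are assumed shapes chosen for the example rather than output copied from the API.

```python
def edit_succeeded(response):
    """Return True only when the response positively reports a successful edit."""
    if "error" in response:
        return False
    return response.get("edit", {}).get("result") == "Success"


# Assumed response shapes, for illustration only.
success = {"edit": {"result": "Success"}}
standard_error = {"error": {"code": "protectedpage", "info": "page is protected"}}
nonstandard_failure = {"edit": {"result": "Failure", "spamblacklist": "example.invalid"}}

for response in (success, standard_error, nonstandard_failure):
    print(edit_succeeded(response))
```

Checking for the presence of success, rather than the presence of a known error key, means a new failure format falls through to the safe side.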
A cookie for you!
Thank you so much for getting Lowercase sigmabot III up and running. It is so nice to have automatic archiving for my user talk page once again. Your efforts in making this happen are much appreciated! Michael Barera (talk) 02:06, 11 December 2013 (UTC)
Potential lcΣb3 bug: saving deleted source file prior to saving added-to archive file
The discussion above at #Talk:Paul Walker, together with the user contributions for lcΣb3, has brought to my attention what may be a bug in lcΣb3. All of the contributions show that lcΣb3 saves the archive file after saving the page being archived. I assume that the logic is that you revert the edit to the source page if the save to the archive fails. In the above case of Talk:Paul Walker it is likely that you would not be permitted to revert the save (I tried both "undo" and "revert", neither of which worked). While the interface may be different for bots, the fact that the bot was prevented from saving the archive indicates that it might be prevented from reverting the deletions from the source page. Would it not be better to use logic something like:
ifnoerr(save_archive_page) then save_source_page_with_deletions else report_error;
With that logic the deletions are never actually made to the source page unless saving the archive page actually succeeded. This would reasonably result in more than one write to the source page when you are deleting into more than one archive file. This might increase the logic a bit more, or you could make it such that for each time lcΣb3 considers a file it only removes enough to fill up a single archive file. If a second archive file is needed then just wait to come back to the page the next time the bot runs, or just call itself recursively after saving the one archive file, saving the source page, and cleaning up.
The logic of saving the archive first then the source page would eliminate any need for logic for reverting an edit if the save of the archive fails. Overall, it is also less prone to causing data loss due to cases not handled. If you are actually concerned about data loss, you could add a step of reading back the saved archive file and comparing it against what the bot was expecting to be there.
Makyen (talk) 04:37, 11 December 2013 (UTC)
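The archive-first ordering proposed above can be sketched as follows. This is a minimal illustration under stated assumptions: `save_page` is a hypothetical callable that writes a page and raises the hypothetical `SaveError` when the edit is rejected, and none of these names come from lcΣb3 itself.

```python
class SaveError(Exception):
    """Raised by the hypothetical save_page callable when an edit is rejected."""


def archive_threads(save_page, source_title, remaining_text, archive_title, archive_text):
    """Write the archive first; touch the source page only if that succeeded."""
    try:
        save_page(archive_title, archive_text)
    except SaveError as exc:
        # Nothing has been removed yet, so there is nothing to revert.
        return "archive save failed, source untouched: %s" % exc
    save_page(source_title, remaining_text)
    return "archived"


# Usage with stand-in save functions that only record what was written.
written = []


def recording_save(title, text):
    written.append(title)


def rejecting_archive_save(title, text):
    if "/Archive" in title:
        raise SaveError("blacklisted link")
    written.append(title)


print(archive_threads(recording_save, "Talk:X", "rest", "Talk:X/Archive 1", "old threads"))
print(archive_threads(rejecting_archive_save, "Talk:Y", "rest", "Talk:Y/Archive 1", "old threads"))
```

With this ordering a rejected archive save leaves the talk page as it was, at the cost of one source-page write per archive page when threads go to more than one archive.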
Tool migration
You indicated that you migrated a few of my tools from toolserver. Where can I find them? ‑Scottywong| prattle _ 01:16, 5 December 2013 (UTC)
Σ, I've been going through my tools and doing some significant cleanup on them, including cleaning up the html/css so that they don't look like a 10-year-old designed them; getting rid of the javascript menubar at the top; and cleaning up the python code to make it faster, more efficient, and more readable. You can get an idea of what I've been doing by looking at this and this, and comparing it to the toolserver versions of the same tools. The two tools that you migrated need to be cleaned up as well. We have a couple of options for this:
- I re-migrate these two tools from scratch, and create a new tool account for them, and delete the tools out of the .../sigma/ directory.
- You clean up the tools by examining the source code of the other tools I've migrated and emulating the changes I've made.
- You grant me access to your tool account and allow me to clean up the tools.
I have been creating separate tool accounts for each tool. I'm not sure if that is the smartest way to do it, but it makes short and clean URL's. I'm a fan of consistency, so I have a slight preference for option #1 above, even though I know that you've probably put a decent amount of work into migrating these two tools. I'd like you to continue maintaining the tools, if you're willing, since you have put the work into understanding how they work, and you're much more available than I am to fix problems with them and respond to complaints. In fact, I wonder if you'd be willing to maintain all of the tools that I eventually migrate to tools lab. Let me know what you think about all of this. ‑Scottywong| squeal _ 23:54, 11 December 2013 (UTC)
- Your revisions look good. Considering that I am the one working on the tools I migrated, I am more comfortable keeping them in my own directory. I will still work on prettifying the html when time allows. →Σσς. (Sigma) 00:28, 13 December 2013 (UTC)
Editor Interaction Analyzer.
Speaking of: at 2: http://tools.wmflabs.org/sigma/editorinteract.py, the timeline links don't work. You're the contact person. (If not fixable/getting fixed, delink?)--Elvey (talk) 18:06, 10 December 2013 (UTC)
- It should be working now. Thanks. →Σσς. (Sigma) 03:13, 11 December 2013 (UTC)
Edit Summary wmflabs tool
Hi there! I'm currently considering running a Tyop Contest and I would like to be able to search users' contributions to find how many typos they've fixed. Your tool is perfect; I just have one minor request: could you possibly show a count of the edits in which a user's edit summary uses a certain keyword? For example, if I searched my user profile on your page for the word 'revert' (as seen here), it would number the results.
Thanks! Newyorkadam (talk) 02:16, 12 December 2013 (UTC)
- You can put javascript:alert(document.getElementsByTagName("ul")[4].getElementsByTagName("li").length) in your browser's URL bar. →Σσς. (Sigma) 00:28, 13 December 2013 (UTC)
Redirect bug
This edit to the mainpaged redirect Jang Sung-taek broke the redirect. From what I can tell from the next edit it was because it was put before the #REDIRECT at the start of the page. 8ty3hree (talk) 01:22, 14 December 2013 (UTC)
lowercase sigmabot III not archiving User talk:Hassocks5489
I set up lcΣB3 on User talk:Hassocks5489, at the user's request, on December 12, but it has yet to be archived. There are threads which are 6+ months old and the age is set to 90 days. I know that lcΣB3 has been running based on both its contributions and seeing archiving on other pages. In fact, I set up Talk:Albert Ball with almost an identical configuration. lcΣB3 archived that page just over 3 hours after I set it up. I have not been able to see anything in the config which would prevent archiving. Could you take a look at things to see what is going on? Thanks. Makyen (talk) 13:10, 17 December 2013 (UTC)
- (talk page stalker) Maybe <!----> message on algo= is preventing bot from understanding commands. --레비Revi 13:26, 17 December 2013 (UTC)
- Definitely a possibility. Frankly, I did not even consider that lcΣB3 was not removing such comments from parsing. Not sure why I did not consider it, but oh well. Let's see if it works. Makyen (talk) 14:05, 17 December 2013 (UTC)
- Archived. --레비Revi 03:30, 18 December 2013 (UTC)
- Yep. Thanks. Don't know why I did not try removing the comment. 8-). Makyen (talk) 06:23, 18 December 2013 (UTC)
A new possibility for your archive box / list
I noticed your archives were stored in pages with the format of User talk:Σ/Archive/YYYY/Month instead of the /Archives/YYYY/Month format many of the archive box templates automatically find. I had noticed this issue on several other talk pages. Thus, I took some time and updated a couple of the templates. The {{Archives by months}} template can now work with such page names:
{{archive box
|search=yes
|bot=ClueBot III
|age=4
|index=User:ClueBot III/Master Detailed Indices/User talk:Σ
|collapsed=yes
|image=[[File:Exquisite-folder font.png|70px]]
|style=background-color:#F9F9F9; border-color:#AAAAAA;
|
{{nowrap|'''2009''': {{Archives by months|2009|archprefix=Archive/}}}}
{{nowrap|'''2010''': {{Archives by months|2010|archprefix=Archive/}}}}
{{nowrap|'''2011''': {{Archives by months|2011|archprefix=Archive/}}}}
{{nowrap|'''2012''': {{Archives by months|2012|archprefix=Archive/}}}}
{{nowrap|'''2013''': {{Archives by months|2013|archprefix=Archive/}}}}
{{nowrap|'''2014''': {{Archives by months|2014|archprefix=Archive/}}}}
}}
This may make it a bit easier to access your archives rather than have to scroll through the long list. The above template also works from within archive files, so it can be used to navigate around the actual archive pages by including the above code with the archive header. Makyen (talk) 13:46, 17 December 2013 (UTC)
- Thank you, but I think I'll keep my current archive box for now. →Σσς. (Sigma) 06:40, 18 December 2013 (UTC)
Threads deleted by Lowercase sigmabot III but not moved to archive page
See these five edits: the middle one deleted three threads from a page without moving them to an archive page. --Redrose64 (talk) 01:20, 18 December 2013 (UTC)
- Ah, it was the legendary 503 JSON non-error. However, it appears that my error logs indicate that this time around, it was actually an error. This probably won't happen again. →Σσς. (Sigma) 06:40, 18 December 2013 (UTC)
Not archiving User talk:Jeff G.
Hi. Thanks for all you do on this project, but it seems Lowercase sigmabot III has not been archiving my user talk page. I think that should have been done five days ago. Did I do something wrong? — Jeff G. ツ (talk) 05:44, 18 December 2013 (UTC)
- You shouldn't put the config template under a section header. Legoktm has fixed this for you. →Σσς. (Sigma) 06:40, 18 December 2013 (UTC)
- Thank you. — Jeff G. ツ (talk) 06:44, 18 December 2013 (UTC)
- That worked. Thank you again. — Jeff G. ツ (talk) 06:26, 21 December 2013 (UTC)
Please comment on Talk:Egalitarianism
Greetings! You have been randomly selected to receive an invitation to participate in the request for comment on Talk:Egalitarianism. Should you wish to respond to the invitation, your contribution to this discussion will be very much appreciated! If in doubt, please see suggestions for responding. If you do not wish to receive these types of notices, please remove your name from Wikipedia:Feedback request service. — Legobot (talk) 00:05, 19 December 2013 (UTC)
lcΣb3 bug/RFE: Fails to archive if config block contains wikipedia comment "<!-- -->"
As can be seen in the thread above, I had a lcΣb3 config not function because there was a <!-- --> comment in the config block/template. I can easily solve this for any page I set up or encounter. However, given the wide use of the config block by a significant number of editors, I believe that lcΣb3 should be able to handle (i.e. ignore) comments in the config block. This is just deployed too widely with a base of users that expect that style of comment to be ignored. I expect that if lcΣb3 continues to fail when such comments are encountered there will be ongoing issues of people having problems with configs throughout the life of the bot. Makyen (talk) 10:48, 19 December 2013 (UTC)
- Although I find it rather unlikely that people who put comments in the config template did not notice a lack of MiszaBot on their talk page, this proposal seems reasonable. I have implemented it. →Σσς. (Sigma) 04:07, 21 December 2013 (UTC)
- Thank you. And thanks for removing the extra copy of the above. Sorry about that. Not sure how it happened. The extra copy was certainly not displayed while I was editing/previewing, etc. on the "new section" page. Makyen (talk) 10:30, 21 December 2013 (UTC)
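Ignoring comments in a config block can be as simple as stripping them before the parameters are read. The sketch below shows one way to do that with a regular expression; it is not the change that was actually implemented, and the helper name and the sample parameter values are illustrative assumptions.

```python
import re

# Non-greedy match so that two separate comments are not merged into one;
# DOTALL lets a comment span several lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(wikitext):
    """Remove <!-- ... --> comments so they cannot leak into parameter values."""
    return COMMENT.sub("", wikitext)


config = "|algo = old(90d) <!-- archive after 90 days -->\n|counter = 1"
print(strip_comments(config))
```

Any whitespace left behind around a removed comment would still need trimming when the individual parameter values are parsed.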
lcΣb3: pages where threads were deleted but not moved to archive page 12-01 -> 12-17
Looking at lcΣb3's contributions I found the following times/pages where threads were deleted and not placed in an archive file:
- 08:55 16 December 2013 (UTC) 3 down from the top for page User talk:Marchjuly (already corrected Done)
- 02:49 16 December 2013 (UTC) 5 down from the top for page Wikipedia talk:WikiProject Football (Corrected Done)
- 00:46 16 December 2013 (UTC) 3 down from the top for page Wikipedia:Media copyright questions (already corrected Done)
- 01:52 14 December 2013 (UTC) 3 down from the top for page User talk:CambridgeBayWeather (Corrected Done)
- 23:23 10 December 2013 (UTC) 3 down from the top for page Talk:Paul Walker (I previously corrected Done)
- 04:27 10 December 2013 (UTC) 3 down from the top for page Talk:Paul Walker (already corrected Done)
- 03:38 10 December 2013 (UTC) 3 down from the top for page Talk:Reincarnation (corrected, (blacklist) Done)
- 12:26 1 December 2013 (UTC) 3 down from the top for page Talk:Bareback (sex) (corrected, (blacklist) Done)
There was also a strange occurrence which you already know about. I assume it was a start-up glitch, as it occurred on the first edit of its run. This is included for completeness:
- 00:20 14 December 2013 (UTC) 3 down from the top for page Talk:Graffiti (Σ reverted this a couple of minutes after it happened, it is different than the others. Done)
I do recommend that you add a check into lcΣb3 that reads the file size of the archives written and verifies that the amount of data actually stored is at least the size of the amount lcΣb3 expected to be there.
I am putting this list here in case you were not aware of any of them as they may have some debugging value. It is my expectation that the issues earlier than this do not have any debugging value, so they are not listed here. On all of the above, the resulting data problems have been corrected.
I am, however, concerned that there are remaining data integrity issues from errors which occurred earlier. Given that I fixed 5/9 of the above, I expect it likely that there has been some data lost which is only available from page histories. I have not checked (semi-automated) all the way back for problems, but there are a good number of additional errors (as is to be expected when developing something like this). I plan to take time, eventually, to look through all those I have found so far. If anyone wants to help clean up any data errors/loss the help is more than welcome. The incomplete list is located here. Makyen (talk) 17:32, 19 December 2013 (UTC)
- Thank you for bringing this issue to my attention. Unfortunately, I don't have enough information to pinpoint the problem with full confidence. If you can give me a list of pages that are affected by this after tomorrow, that would be great. →Σσς. (Sigma) 04:07, 21 December 2013 (UTC)
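The check recommended in this thread, confirming that the archive really holds what the bot meant to put there, might look something like the sketch below. It assumes the bot can read the archive page back as text and still has the removed threads in memory; the function name and arguments are hypothetical.

```python
def archive_looks_complete(archive_text, expected_threads):
    """Check that every thread expected in the archive is actually present in it."""
    expected_size = sum(len(thread) for thread in expected_threads)
    # Cheap size test first: the archive cannot be smaller than what was added.
    if len(archive_text) < expected_size:
        return False
    # Then confirm each thread's text appears in the saved page.
    return all(thread in archive_text for thread in expected_threads)


threads = ["== A ==\nfirst thread\n", "== B ==\nsecond thread\n"]
print(archive_looks_complete("header\n" + "".join(threads), threads))
print(archive_looks_complete("header\n", threads))
```

The size test alone matches the suggestion as written; the content test is a stricter addition that would also catch an archive of the right length with the wrong text.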
Lowercase sigmabot removing PC1 template on pages that still have it
In this edit, Lowercase sigmabot removed {{pp-pc1}} (along with correctly removing {{pp-semi-blp}}) even though the page was still pending-changes protected. Why did this happen? Jackmcbarn (talk) 03:29, 21 December 2013 (UTC)
- My bot doesn't do anything with pending-changes protection. To do so would duplicate Cyberbot. →Σσς. (Sigma) 04:07, 21 December 2013 (UTC)
- Why did it remove the PC1 template then? Jackmcbarn (talk) 04:59, 21 December 2013 (UTC)
- The bot thought that page was not protected, so it removed all the templates. If you overlook the pending changes protection, as the bot did, the removal was logical. However, this causes problems, as we've just seen. Off the top of my head, I can come up with two ways to resolve this:
- Add code that checks pending changes protection as well as regular protection, but only add or remove the templates for regular protection in order to avoid duplicating Cyberbot.
- Extend the bot's functions to adding and removing pending changes and regular protection, effectively making Cyberbot redundant.
- →Σσς. (Sigma) 05:42, 21 December 2013 (UTC)
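The first option listed above could be sketched as a small decision function. Everything here is an assumption made for illustration: the function name, the idea of classifying templates by name prefix, and the boolean input are not taken from the bot.

```python
def templates_to_remove(templates, edit_protected):
    """Pick regular-protection templates that no longer match the page's state."""
    stale = []
    for name in templates:
        if name.startswith("pp-pc"):
            # Pending-changes templates are never touched here, to avoid
            # duplicating Cyberbot as described in the thread.
            continue
        if name.startswith("pp-") and not edit_protected:
            stale.append(name)
    return stale


# The page from the thread: semi-protection expired, pending changes still on.
print(templates_to_remove(["pp-pc1", "pp-semi-blp"], edit_protected=False))
```

The pending-changes check has to come first, since those template names also begin with "pp-".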
A bot of yours archiving my bot's talk page
The following discussion has been closed. Please do not modify it.
Howdy. Recently, a bot of yours archived the talk page for my bot. I understand why it did this. I still had the MiszaBot config template on my bot's talk page. I have since reverted your bot on my bot's talk page. I did not request for your bot to archive my talk page. I understand that commenting out the MiszaBot archiving config template will stop your bot from archiving my bot's page in the future, but I would like to request that you do not assume that I want any of your bots archiving any of my or my bot's talk pages. Thank you.--Rockfang (talk) 08:52, 21 December 2013 (UTC)
It's not an exact replacement of Miszabot, because Miszabot had a FAQ page, while this bot doesn't. It wasn't a very good FAQ page, but at least it was something. How do you turn it off? How do you change archive frequency? Michael-Zero (talk) 16:51, 23 December 2013 (UTC)
- Hi Michael-Zero, I've created an FAQ page. Is that helpful for you? Legoktm (talk) 23:11, 24 December 2013 (UTC)
Rockfang's complaints do not concern me. →Σσς. (Sigma) 23:33, 21 December 2013 (UTC)
summary.py is broken
Hi, this query comes back empty. It should return quite a lot, including this, this, this, this, this, this plus many others. What I was searching for was this edit, which I originally tried to find using this query, but that failed too. Redrose64 (talk) 21:35, 21 December 2013 (UTC)
- Thank you. A fix will be applied soon. →Σσς. (Sigma) 23:33, 21 December 2013 (UTC)
- Done. →Σσς. (Sigma) 00:29, 22 December 2013 (UTC)
- Thank you. The first query that I linked is working better than it was, but I now notice that it has a spurious trailing space which I don't recall entering. Removing that increases the number of hits. --Redrose64 (talk) 01:18, 22 December 2013 (UTC)
- Sigma, would it make sense to run str.strip() on queries to avoid this in the future? Theopolisme (talk) 07:05, 22 December 2013 (UTC)
- Trailing whitespace could be used to find "caption" but not "captions". Alternatively, I could add a regex option, but that's not much of a priority right now. →Σσς. (Sigma) 08:34, 22 December 2013 (UTC)
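The trade-off discussed above can be illustrated with a small sketch. It assumes a plain substring search over edit summaries; `count_hits` and the sample summaries are hypothetical and are not taken from summary.py.

```python
def count_hits(summaries, query, strip_query=False):
    """Count the summaries that contain the query as a substring."""
    if strip_query:
        query = query.strip()
    return sum(1 for summary in summaries if query in summary)


summaries = ["fix caption", "fix captions", "add caption to infobox"]

# With the trailing space kept, "captions" is excluded, but so is a
# summary that ends in "caption" with nothing after it.
unstripped = count_hits(summaries, "caption ")

# With the query stripped, every summary containing "caption" matches.
stripped = count_hits(summaries, "caption ", strip_query=True)
```

Stripping the query increases the number of hits, as Redrose64 observed, at the cost of no longer being able to tell "caption" from "captions".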
Cold?
Best wishes for the holidays and 2014 from a warmer place than where you probably are ;) Kudpung กุดผึ้ง (talk) 02:33, 22 December 2013 (UTC)
- Thanks, I'm doing well. Happy holidays! →Σσς. (Sigma) 02:47, 22 December 2013 (UTC)
Hi
I would like to bring to your attention a recent archiving move this bot made. It was on the Talk:Aircraft carrier page. This is how it looked before the archiving. As you can see, there was a lengthy discussion spread across multiple sections, beginning with #2 "To include or not to include..." and ending with #9 "Draft revisited". If possible, it would be preferable to keep this discussion on a single page and unchanged. However, if it must be broken up and spread across additional pages, it would (obviously) be best if it were still kept in its original, chronological order.
But as it stands now, the discussion has been broken up and spread across three pages. The sections are not in chronological order and there also appears to be info missing (?!) Could you please look into this at your earliest convenience? Thanks
Note: I've reverted the moves for now. I don't know much about archiving, but could we just manually archive the discussion to keep it intact? I am happy to help in any way. Cheers - theWOLFchild 06:51, 22 December 2013 (UTC)
- The bot treats each level 2 header as a separate discussion. If you want them to be archived with "To include or not to include", change the threads between [3, 8] to level 3 headers.
- Something will have to be done about the archive pages, though.... →Σσς. (Sigma) 06:59, 22 December 2013 (UTC)
- It can all go onto the same archive page, no? Is there a size limit for archive pages? Also, how does one manually archive a thread? - theWOLFchild 07:38, 22 December 2013 (UTC)
- (talk page stalker) If the MiszaBot config has the |Maxarchivesize= parameter, the bot will create a new archive page once the current archive page has exceeded the limit. If you want to archive a page section by section manually, try OneClickArchiver. --레비Revicon 07:45, 22 December 2013 (UTC)
- I followed your suggestion and changed all the level 2's to 3's. I guess first we'll see what the bot does to the discussion with those changes now in place, and go from there. Thanks for your help. - theWOLFchild 08:15, 22 December 2013 (UTC)
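The header rule described in this thread can be sketched as follows. This is an illustration under stated assumptions, not the bot's code: `split_threads` and the `LEVEL2` pattern are hypothetical, and any text before the first level 2 header is ignored.

```python
import re

# A level 2 header is a line of the form "== Title ==". The [^=] after the
# opening "==" rejects level 3 headers such as "=== Title ===".
LEVEL2 = re.compile(r"^==[^=].*?==[ \t]*$", re.MULTILINE)


def split_threads(wikitext):
    """Split wikitext into threads, one per level 2 header."""
    starts = [m.start() for m in LEVEL2.finditer(wikitext)]
    bounds = starts + [len(wikitext)]
    return [wikitext[a:b] for a, b in zip(bounds, bounds[1:])]


page = (
    "== To include or not to include ==\nfirst post\n"
    "=== Draft revisited ===\nfollow-up\n"
    "== Unrelated thread ==\nother post\n"
)
threads = split_threads(page)
```

Because "Draft revisited" is a level 3 header here, it stays inside the first thread and would be archived together with it.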
Happy holidays
JianhuiMobile talk is wishing you a Merry Christmas! This greeting (and season) promotes WikiLove and hopefully this note has made your day a little better. Spread the WikiLove by wishing another user a Merry Christmas, whether it be someone you have had disagreements with in the past, a good friend, or just some random person. Happy New Year!
Spread the cheer by adding {{subst:Xmas2}} to their talk page with a friendly message.
JianhuiMobile talk 07:31, 22 December 2013 (UTC)
- To you as well! →Σσς. (Sigma) 08:35, 22 December 2013 (UTC)
Some commentaries on Sigmabot and protection templates
Wikipedia:Administrators' noticeboard was protected in this edit. Then came Sigmabot and added protection templates in this edit. First commentary: always better to add the protection templates between noinclude tags. Noticeboards, specifically, are often transcluded on userpages, and adding protection templates between noinclude tags prevents those userpages from ending up in Category:Wikipedia pages with incorrect protection templates. Then after the protection expired along came Sigmabot and in this edit removed both protection templates, even though the move protection was (and is) still in place, as the bot subsequently admitted in this edit. Second commentary: do not remove two protection templates, if only one is not relevant any more. Debresser (talk) 04:54, 23 December 2013 (UTC)
- Why would ANI be transcluded on someone's userpage? →Σσς. (Sigma) 03:37, 24 December 2013 (UTC)
- The second one should be fixed. Funny thing is that every time I rewrite the relevant function, it seems to work fine for a week, and then messes up... →Σσς. (Sigma) 04:47, 24 December 2013 (UTC)
- There are a few people who transclude pages like WP:ANI, WP:AE et al. onto their userspace. To keep track of things, I guess. You'll notice that all of those pages use the noinclude tags (and subpages, if they are protected), for this reason. In any case, if there is a noinclude tag, as a rule it is always better to put protection templates inside them, to be on the safe side. Debresser (talk) 11:46, 24 December 2013 (UTC)
- I'll see what I can do. →Σσς. (Sigma) 23:58, 24 December 2013 (UTC)
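Both commentaries can be sketched in a few lines. This is a hedged illustration, not the bot's implementation: the `TEMPLATE_ACTION` mapping and the two helper names are hypothetical.

```python
# Hypothetical mapping from template name to the protection type it describes.
TEMPLATE_ACTION = {"pp-semi": "edit", "pp-protected": "edit", "pp-move": "move"}


def expired_templates(on_page, active_actions):
    """Return only the known templates whose protection type has lapsed."""
    return [
        t for t in on_page
        if t in TEMPLATE_ACTION and TEMPLATE_ACTION[t] not in active_actions
    ]


def wrap_noinclude(template):
    """Wrap a protection template so it is not carried along by transclusion."""
    return "<noinclude>{{" + template + "}}</noinclude>"


# Edit protection has expired, but move protection is still in place.
to_remove = expired_templates(["pp-semi", "pp-move"], active_actions={"move"})
```

In this case only `pp-semi` is returned, so the move-protection template stays on the page.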
LCΣB3 not archiving
During the last run last weekend, LCΣB3 did not archive the first section of Wikipedia talk:WikiProject Korea. The config was |algo=(90d), the section is almost one year old, and there were more than 10 sections. Can you check it? --레비Revicon 12:51, 23 December 2013 (UTC)
- (talk page stalker) A comma was misplaced so that the timestamp wasn't being recognised as such. --Redrose64 (talk) 00:03, 24 December 2013 (UTC)
- Archived. Thank you. --레비Revicon 03:30, 24 December 2013 (UTC)
lcΣb3: pages where total bytes into the archive pages <> bytes out of talk page for 12-20 run.
Per your request, here is a list of the points during the run yesterday where the total bytes out of the source page did not equal the number into the archive.
Excluded are archives where at least one new page was created and the total archive size (tas) difference is ((tas>0) and (tas < 75*n) and (tas mod n = 0)). This could mean there are errors masked. However, the other data set showed ~70 bytes as the longest addition for a new page (assumed that was the cause of the difference). I, obviously, don't know how large the archive header is for any particular page. The differences in the most recent run were:
# | Bytes | Missing / Extra | Time / date (UTC) (Contribs Link) | | Page name | Bytes diff to source page | Bytes into Archives |
---|---|---|---|---|---|---|---|
1 | 849 | extra bytes at | 07:22 21 December 2013 | in archives of | User talk:Nonsenseferret | -2,586 | 3,435 |
2 | 36 | MISSING bytes at | 05:07 21 December 2013 | from archives of | Talk:Paraben | -5,043 | 5,007 |
3 | 5 | MISSING bytes at | 06:06 21 December 2013 | from archives of | User talk:Musamies | -22,369 | 22,364 |
4 | 2 | MISSING bytes at | 07:17 21 December 2013 | from archives of | User talk:FoCuSandLeArN | -8,912 | 8,910 |
[The first two I have adjusted the contributions link such that the page in question is near the top. The rest will probably be near the bottom, but should be on the page.]
I previously was not looking at smaller differences because my initial priority was to verify that there was no significant data lost. Given that there were a good number where larger amounts of data might be only available in the edit histories, I was temporarily ignoring small differences. I felt that it is now better to report them here as this information is being used for debugging. Makyen (talk) 16:39, 21 December 2013 (UTC)
- Those cases where there is a single byte difference (missing or extra) are probably due to the bot ensuring that one blank line separates each section. If there were previously two, it will show as 1 missing byte; if there were previously none, it will show as 1 extra byte. These are not worth worrying about. --Redrose64 (talk) 21:07, 21 December 2013 (UTC)
- That was my assumption also. However, I have not verified it. I felt that it was better for me to include them in this list rather than me exclude them based on my making an assumption. Based on that assumption, I did exclude them —actually all with less than a 5 byte difference, or much more if a new page was created— from those I was looking at from a data loss/recovery point of view (above) and the list at this link. Makyen (talk) 22:51, 21 December 2013 (UTC)
- To put it honestly and directly, I couldn't care less about the bot archiving several bytes too few/many as long as the threads are archived and the pages' appearances aren't affected. Out of 40 "errors", only 4 are worth examining, and only 1 was an actual error. As such, I'm removing the irrelevant entries from the table.
- User talk:Nonsenseferret: The bot substituted a template that should have been substituted. Apparent success.
- Talk:Paraben: The bot removed comments from the config template. Apparent success.
- User talk:Musamies: The bot removed comments from the config template. Apparent success.
- User talk:FoCuSandLeArN: Apparent failure. Fixing this will be trivial.
- If you have any more cases of the bot removing threads from the talk page but not putting them in an archive (the original topic of the thread you linked to), please, show them to me. These are infinitely more important than the bot leaving a newline at the bottom of the archive. →Σσς. (Sigma) 23:33, 21 December 2013 (UTC)
- Σ, it was not my intention to antagonize you. If I have done so, I apologize. I was certainly not intending to imply the list I provided was a list where each line was considered an error. I did not call anything in the data that I provided an error. The only comment I made about errors was that it was possible I had made an assumption which might have resulted in data being masked out which would have shown errors. Each line of the table was merely where the number of bytes into the archive did not equal the number of bytes out of the source page. I am sorry if any other implication was communicated. Without personally checking on a specific data item, I certainly would not call one an error. I did not take the time to look through them because at the time I finished with generating the list I was falling asleep at my keyboard. I figured it was better to pass off the data to you than to delay it for another day. My hope was that getting it to you at that time would result in any errors being fixed sooner.
- Frankly, I don't care about small differences either. Particularly not about an extra or missing newline between threads. As with you, I only care that threads are archived (once) and the data and appearance of the page/thread remains very close to what it was in the source page. My expectation was that you would discount and remove from the list those that had a small number of bytes different. I had intentionally presorted the table such that removing them was easy. For my personal review, I had been ignoring any difference that was less in bytes than the max of either the number of threads being archived or 5 (along with the increase in threshold when a new page was created, as mentioned above).
- Because I was providing information about a run which you specifically requested, I tried to reduce the assumptions I was making to filter out possibilities prior to them being considered. It is, generally, better to provide this data to the primary person in a way that does not remove the possibility of problems being found due to the assumptions of the person supplying the data. This is true as long as doing so does not make the data itself so overwhelming that it is not useful. Providing more data in this situation is the correct choice. The situation here is an example of why such is the correct choice. Using my earlier assumptions, the page User talk:FoCuSandLeArN would never have been included in the pages considered for checking. However, it is the only one on which you consider that an actual problem existed.
- After looking more closely at the three you labeled apparent success, I disagree with that assessment on all three counts:
- User talk:Nonsenseferret: In the source page the bot changed a post which was not being archived. The situation was that the thread prior to one being archived was missing a }} to close a template that was intended to be substituted when originally saved. The lcΣb3 added a }} to the page, changing the look of the page significantly, rather than just archiving threads. While this change corrected what was almost certainly a mistake by the person originally sending out the automated message in that thread, it significantly changed the look of the page. Unless doing so is part of lcΣB3's mandate, then this page is a failure. If doing so is part of lcΣB3's mandate then that should be explicitly documented somewhere because it goes beyond just archiving threads.
- Talk:Paraben: lcΣb3 removed comments the user placed in the MiszaBot/config template on the source page. While the config template is for lcΣb3, it is not for the bot to remove comments contained therein. Users expect their comments to stay in place. I would call this a fail.
- User talk:Musamies: Same thing as Talk:Paraben. I would call this a fail. The same amount of data was removed on this page as on Talk:Paraben. It shows up as a smaller total difference because a new page was created which adds the page header to the amount stored in the archive.
- To do a thorough job of debugging a program like this, in this stage of its development, takes going through a large amount of data, most of which will not contain an error. By the design of lcΣb3, there can be differences in the size of the data between that which is deleted from the source page and that which is added to the archive page. Without programmatic access to the actual page data, the only thing available is looking at the total number of bytes added or deleted to pages. A consequence of this limitation is that there will either be a large number of false possibilities, or potentially errors not identified because of the assumptions used to filter the data presented. For instance, multiple errors could occur in the same page resulting in a very low, or zero difference in page length. While less likely, such errors are not detectable through just looking at the number of bytes removed from the source and added to the archive.
- One way to greatly reduce the time and effort of this phase of debugging, while also making a program more reliable (in that it fails soft, or immediately corrects, or stops after an error), is for the program to self-check the data written. One method of doing this is for the program itself to track how much data it intended to remove from the source and how much it expected to be added to a set of archive pages. Those numbers can then be compared to the difference between the size of the pages prior to actions and the size of the stored pages after those actions. If the sizes don't match expectations, the edits should be reverted using original unaltered (unparsed) data and an error placed in a log file for a human to examine the problem. Using this type of checking greatly reduces the possibility of errors. Normally, it also dramatically reduces the amount of time spent hunting for the possibility of errors because the bot identifies the majority of errors that occur. It is much more effective to perform the check this way because the program knows the exact amount of data it adds for an extra newline, or for the header at the top of a new archive page. This type of checking can identify multiple types of errors and is much more effective than someone going through and checking for differences in the size of data written without the knowledge available within the program as to expected size changes.
- I can, and will, not report differences of only a single byte. From the fact that you found a problem in the User talk:FoCuSandLeArN page, it indicates that I should report problems that are 2 bytes, or more. If you desire, I will adjust that to whatever you choose. However, keep in mind that in combination with a new page, or just adding a single extra newline elsewhere, a problem like that on the User talk:FoCuSandLeArN page will not be reported. Also remember that there is a different threshold value for archives when a new page is created. If you still want the data, I will not guarantee that each occurrence identified merely by data size difference actually contains an error. For me to do so would require checking each instance by hand. I have already put in way too much time into this. I do not want to commit to more. I will probably do the more, but I don't want to commit to it. Automatically processing the contributions list should not take much time at this point.
- Looking again at what you wrote above, you have requested I not report anything other than confirmed errors. As a result of your request I will not report any data at all. I may, or may not, choose to take the time to look at other anomalies to determine if they actually constitute instances where threads were not archived.
- Basically, I look at it as I have offered you a tool to help you debug your software. One that potentially saves a considerable amount of time and effort on your part tracking down issues which will be ongoing if not identified and fixed. In addition, getting bugs found and solved sooner certainly saves time and effort on the part of other editors trying to undo the few mistakes that lcΣb3 makes. The expense to you of the data was merely that you look at it and dismiss it if nothing was of interest. You have made a choice. I think that it is a bad one, but I accept it.
- Again, my goal was, and is, not to antagonize you, or to be saying lcΣb3 is a bad piece of software. I think it is a good piece of software. However, I do desire that all the bugs are found and resolved so lcΣb3 can be reliably used for archiving for years to come. Makyen (talk) 07:21, 25 December 2013 (UTC)
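The self-check proposed in this comment can be sketched as below. It is a minimal illustration, assuming the bot can read page sizes in bytes before and after saving; `run_is_consistent` and its parameters are hypothetical, and the sizes in the example are made up apart from the 5,043 and 5,007 byte figures taken from the table above.

```python
def run_is_consistent(intended_removed, intended_added, source_sizes, archive_sizes):
    """Compare the byte changes the bot intended with the changes that were saved.

    source_sizes is a (before, after) pair for the talk page.
    archive_sizes is a list of (before, after) pairs, one per archive page.
    intended_added should already include deliberate overhead such as a
    new archive header or an extra newline.
    """
    actual_removed = source_sizes[0] - source_sizes[1]
    actual_added = sum(after - before for before, after in archive_sizes)
    return actual_removed == intended_removed and actual_added == intended_added


# 5,043 bytes left the talk page but only 5,007 reached the archive.
ok = run_is_consistent(5043, 5043, (20000, 14957), [(1000, 6007)])
```

A False result would be the point at which the bot reverts its edits and writes a log entry for a human to examine.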
- Makyen, remember that we're all volunteers here and Sigma is under no obligation to do, frankly, anything. The countless hours that he's put into making an excellent archival bot are greatly appreciated -- and the time that you put into this bug report is appreciated as well, of course :) Theopolisme (talk) 06:40, 26 December 2013 (UTC)
- (edit conflict) Considering that neither of us care about extra or missing newlines at the end of an archive, we should limit ourselves to checking if the number of threads (conveniently provided in the edit summaries) removed is equal to the number of threads added to archives. In fact, this is something the bot itself can do. I will implement this when I have the time.
- Let's focus on the problems.
- 1. Look closely. All the braces in the template have matches. Why, then, wasn't the template substituted? The answer has been conceptually demonstrated by The Earwig (talk · contribs), here. With that in mind, this edit was an apparent success.
- The key is that when you try to subst a template that doesn't exist, it saves the raw {{subst:...}}. But if the template is created afterwards, and then the original page is edited again, it will then perform the substitution. According to the logs, the template did not exist at the time the message was added by DYKBot; It had been deleted an hour before.
- 2. and 3. Suppose the config template contained |counter=<!-- cmt -->9<!-- cmt -->9<!-- cmt --> and the bot incremented the counter. What would be the proper result? MiszaBot outright refuses to archive pages with comments in their archive templates. ClueBot III does the same. Thus, we have a grand total of zero bots that have ever tolerated html comments. I would actually say that my bot is more lenient than the rest on this matter by removing comments from the template before parsing it. And as such, apparent success.
- →Σσς. (Sigma) 08:45, 26 December 2013 (UTC)
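The approach described in points 2 and 3, removing comments before parsing, can be sketched like this. The regular expressions and the `read_counter` helper are assumptions made for illustration, not the bot's actual parser.

```python
import re

# Matches an HTML comment, including one that spans several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(text):
    """Remove every HTML comment from the text."""
    return HTML_COMMENT.sub("", text)


def read_counter(config):
    """Read the |counter= value from a config template, ignoring comments."""
    cleaned = strip_comments(config)
    match = re.search(r"\|\s*counter\s*=\s*(\d+)", cleaned)
    return int(match.group(1)) if match else None


counter = read_counter("|counter=<!-- cmt -->9<!-- cmt -->9<!-- cmt -->")
```

With the comments stripped first, the example from this thread parses as 99. Once the counter is incremented and written back there is no obvious place to restore the comments, which is the difficulty raised above.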
Merry Christmas!
I wish you a Merry Christmas and Happy New Year 2014!
- Thanks, and to you as well :) →Σσς. (Sigma) 23:58, 24 December 2013 (UTC)