First draft

OK, we have a draft at User:Mathbot/WP1.0. More needs to be done, obviously, but I wonder if there are any comments on what is there so far.

Also, I wonder what the name of the future log file should be (that was suggested by Walkerma), and what to do about {{assessment}} (that was suggested by Titoxd, and appears not necessary to me). Oleg Alexandrov (talk) 03:59, 27 April 2006 (UTC)

That looks like a great start, Oleg, thanks! I am currently adding assessments into the talk page templates for chemistry so you can also test with those (though I need to go to bed soon!). The only change I would suggest at this point is that I would like the final table to look more like the new chemistry list I'm writing, though I think the differences are largely cosmetic. As for the assessment template, I'd assumed that was generated to provide WikiProjects with a standard type of template - readable by Mathbot - that they could adapt for the project's assessment purposes.
In time we may also want to add a feature to add assessments into "article trees" such as this one, once we can get the format for these improved (E Pluribus Anthony may work on that). We would probably represent FA-Class by blue font, A-Class by light green, etc. to match the assessment tables. We plan to use something like this to help us organise the articles for WP1.0. However, I don't think we need to work on that till we get the tree format right. Thanks a ton! Walkerma 04:22, 27 April 2006 (UTC)

How about Wikipedia:Version 1.0 Editorial Team/Assessment log as a home for the log of assessment changes? I'll be sure to start checking it very regularly once it gets rolling. Thanks, Walkerma 05:03, 27 April 2006 (UTC)

  • The only reason I proposed {{assessment}} is to help the bot's output as the format of the table has not completely stabilized, and there are discussions about changing the tables to include more data. The tables are started with {{assessment header|WikiProject name|WikiProject nickname (usually the same)}}, and completed with {{assessment footer|lastdate=date of last update from the bot}}. It is not necessary, but I tried it at WP Cyclones's listing. Thanks again! Titoxd(?!? - help us) 05:52, 27 April 2006 (UTC)

Maybe I missed the boat

I've been secretly (more or less) working on something similar to this in perl. My code, which runs from a command line and is not a bot in that it does not update things, is available if it would help, but if you have it all figured out maybe I should stop. My approach is to trawl the category dump for information. Mathbot is in perl so maybe??? ++Lar: t/c 04:09, 27 April 2006 (UTC)

Mathbot is written in Perl indeed. Today I spent only half an hour on the code, parsing the categories and spitting out the results. So I won't mind in the least if you take over.
One question. What is a category dump, and how often is it updated? Oleg Alexandrov (talk) 04:50, 27 April 2006 (UTC)
You got farther in a half hour than I did in my first week. So maybe you should take code from me if it's useful and this should stay a bot? (I have a bot ID but never applied for the bot flag as I haven't run in bot mode...) A category dump is one of the main database dump types, it's updated about once every three days or so. It looks like a bunch of SQL insert statements. There's a URL around somewhere to where it lives on Meta that I need to put into the page writeup so everyone can find it easily. ++Lar: t/c 05:06, 27 April 2006 (UTC)

I see. Well, my script has the small advantage that it fetches information directly from Wikipedia so it is up to date. I would also think that having it as a bot rather than a command line tool would be a bit better, as it could update Wikipedia:Version 1.0 Editorial Team/WPArts by itself at a fixed time each day without user interaction. But that's not much of a difference. So maybe an idea is for me to indeed get your code and tie it up with mine, and see what happens. Then you could set up your bot to run the code each day, as you are more involved in the WP1.0 project than I am and will be more likely to supervise the bot. Wonder what you and others think. Oleg Alexandrov (talk) 15:25, 27 April 2006 (UTC)

Sounds an excellent plan. (I had thought about fetching categories directly but the need to page forward for very large article lists threw me and getting it from the dump seemed easier... less up to date though as you say) My code is ok at creating new but not so OK at updating. Also an area we have to work through is that my access to WP itself is done via Pearle-Wisebot based routines and I believe Mathbot uses a different set. So that may give some incompatibilities. If I could look at Mathbot source I might be able to judge how easy it is to marry pieces together... the source for what I've done so far is available as I said above... give it a look and see what you think? ++Lar: t/c 03:19, 28 April 2006 (UTC)

Here is my code. It uses a routine fetch_articles_cats to fetch articles and subcategories in given category, it is included below. When run, it should write to a file called "User:Mathbot/WP1.0.wiki". Wonder what you think. Oleg Alexandrov (talk) 04:31, 28 April 2006 (UTC)

Is that all the code there is? It's pretty nice and economical! Or does it run within a framework not shown? I didn't see many Use statements to bring packages in, I'd assumed you needed some packages to do html gets and puts and the like.... I'm not sure how to proceed but open to ideas. ++Lar: t/c 19:26, 28 April 2006 (UTC)

This is all. I fetch html code with wget. The only part missing is uploading the processed data back to Wikipedia, for which I use WWW::Mediawiki::Client, but that is it. Oleg Alexandrov (talk) 20:09, 28 April 2006 (UTC)

I dunno what to do. This needs doing and I'm moving too slow. You should see if there's anything you can use from mine and carry on, I think. ++Lar: t/c 00:09, 29 April 2006 (UTC)

I see. There is not much we could share however, as our code is built on totally different frameworks, and I'd rather write the few remaining bits myself than try to understand how to use the pearle framework. I guess that's the way to go. Oleg Alexandrov (talk) 01:17, 29 April 2006 (UTC)

Regrettably, I think you're right. My fault for not working on this harder 3 weeks ago. God speed. ++Lar: t/c 02:15, 29 April 2006 (UTC)
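
The category walk discussed in this thread (a routine that returns the articles and subcategories of a given category, applied repeatedly) can be sketched roughly as follows. This is not Mathbot's Perl code, which is not shown here; it is a minimal Python illustration. The names collect_articles, fake_fetch and FAKE_TREE are hypothetical, the sample data is made up, and the live page fetch is replaced by a stand-in function so the sketch runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical fetcher type: category name -> (articles, subcategories).
Fetcher = Callable[[str], Tuple[List[str], List[str]]]


def collect_articles(root: str, fetch: Fetcher) -> Dict[str, List[str]]:
    """Walk a category tree and map each visited category to its articles."""
    result: Dict[str, List[str]] = {}
    pending = [root]
    while pending:
        category = pending.pop()
        if category in result:  # guard against category loops
            continue
        articles, subcats = fetch(category)
        result[category] = sorted(articles)
        pending.extend(subcats)
    return result


# Made-up sample data standing in for a live fetch (assumption).
FAKE_TREE = {
    "Chemistry articles by quality": (
        [],
        ["A-Class chemistry articles", "Stub-Class chemistry articles"],
    ),
    "A-Class chemistry articles": (["Talk:Example article"], []),
    "Stub-Class chemistry articles": (["Talk:Lead(II) nitrate"], []),
}


def fake_fetch(category: str) -> Tuple[List[str], List[str]]:
    """Stand-in for the real fetch; simply looks the category up."""
    return FAKE_TREE[category]


tree = collect_articles("Chemistry articles by quality", fake_fetch)
for name, articles in sorted(tree.items()):
    print(name, articles)
```

Passing the fetcher in as a parameter keeps the traversal separate from how pages are retrieved, so the same walk would work with a live fetch or with a category dump.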

Assessment logs

Besides the global log to be kept at Wikipedia:Version 1.0 Editorial Team/Assessment log, would it also be possible to generate individual category logs such as for Category:Chemistry articles by quality? This topic was raised recently here, where the person posting said the military history project (who have a huge worklist) would be interested in converting their worklist over to the bot/category based system, but (it was implied) only if an individual project log can be produced. If I understand the system correctly, this should be easy - am I right? By the way, Category:Chemistry articles by quality now has around 50 articles in it in total, enough to test the bot there when you are ready for that (I left some of the assessment spaces blank to see if the bot can fill them in OK). The list is at Wikipedia:Version_1.0_Editorial_Team/Chemistry_articles and the chemistry article log can be at Wikipedia:Version 1.0 Editorial Team/Chemistry article log. Thanks again Walkerma 03:14, 28 April 2006 (UTC)

There have been a few changes to the WP Cyclones listing. Could we get another run to test-debug the log? Titoxd(?!? - help us) 02:16, 29 April 2006 (UTC)

I implemented Walkerma's style suggestions, used Tito's assessment template, and uploaded the new version at User:Mathbot/WP1.0. Next steps will be real dates and logs. Comments welcome. To be continued either tomorrow or Sunday night. Oleg Alexandrov (talk) 03:51, 29 April 2006 (UTC)

A bare-bones log works now, at Wikipedia:Version 1.0 Editorial Team/Assessment log. Last step is to split the log into subsections by subject (chemistry, music, tropical cyclones) and have the individual subject logs also go to the corresponding wikiprojects. To be done in a day or two. Oleg Alexandrov (talk) 03:53, 1 May 2006 (UTC)
Very nice! Any chance the headers could be nested one level deeper, though, at least in the subject logs? That would make transcluding the log pages into WikiProjects easier. Or are you planning to have Mathbot update the relevant WikiProject pages directly? Kirill Lokshin 22:39, 1 May 2006 (UTC)
There will be individual logs, for each wikiproject. Oleg Alexandrov (talk) 02:26, 4 May 2006 (UTC)

Format for tables

I would like to propose a change to the layout of the assessment tables generated by the bot. This would require a small change to Template:Assessment, and I am hoping that it wouldn't mean any change to the bot. I would like to move "importance" over to the left so the template would read:

| page | importance | date | class | comments |

If this is a nuisance, don't worry, it's just a more intuitive format, since importance is something that is associated with the topic (and therefore the page name) rather than with the assessment. Walkerma 16:18, 1 May 2006 (UTC)

The order in which the parameters are passed to the template doesn't really affect the appearance of the template, given the way the template was created, as long as the bot adds the parameter name before the value. Titoxd(?!? - help us) 18:49, 1 May 2006 (UTC)
Changed. Titoxd(?!? - help us) 00:10, 4 May 2006 (UTC)
Hmm. The bot is now reporting a huge diff change in the log. Could that be because I changed the template? Titoxd(?!? - help us) 02:50, 4 May 2006 (UTC)
No, that's because I forked off the tables to their own pages, and those pages were empty before that. So it's a glitch, it is not because of what you did. Oleg Alexandrov (talk) 03:29, 4 May 2006 (UTC)
Out of curiosity, will the {{assessment header}} line be overwritten on each update? I'd like to point the contact link to the proper archive, but I'm not sure if the change will stick. Kirill Lokshin 04:04, 5 May 2006 (UTC)
Yes, they will be overwritten at each update. If it is indeed important that it does not happen, that can be taken care of. Let us see what others say. Oleg Alexandrov (talk) 04:15, 5 May 2006 (UTC)

I made a small change to my code such that now each of the tables, Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality, etc, has a front matter which will not be overwritten by the bot. So any additional text can be placed there which may clarify on a per-project basis what that is about and where to go for more info. Also, the bot runs as a cronjob now, that is, automatically every day. The current running time is something like half an hour before this datestamp, but I can modify it if necessary. Oleg Alexandrov (talk) 04:06, 9 May 2006 (UTC)

Thanks for the front matter - I hope I didn't upset the bot by adding that (I was also curious!). Regarding the time, how long would it take if it were going through (say) 10,000 articles in 200 WikiProjects? This is where I'd like it to be in a year or so. Walkerma 07:11, 10 May 2006 (UTC)
Sorry for my sloppy writing. I meant to say that the bot starts running at 03:35 AM UTC each day, half an hour before the datestamp I had above, which was 04:06, 9 May 2006 (UTC).
The bot does not take that long. The most time is taken by fetching the most recent article version from article history, but that is needed only for new additions or for articles whose status is improved.
At the list of mathematics articles which is updated by my bot we have over 11,000 articles, split into smaller lists alphabetically, and the list of mathematics categories which my bot browses has around 800 entries. Works fine. So, if we encounter scalability issues in the future, I could split each wikiproject list into smaller lists, by starting letter, and that should take care of things. Oleg Alexandrov (talk) 15:18, 10 May 2006 (UTC)
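
The fallback mentioned above, splitting a project's list into smaller lists by starting letter, could look roughly like this. It is a sketch only: the function name split_by_initial and the "0-9" bucket for titles that do not start with a letter are assumptions, not something the bot is known to do.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_by_initial(titles: Iterable[str]) -> Dict[str, List[str]]:
    """Group titles by their first character, ignoring case.

    Titles that do not start with a letter are collected under '0-9'
    (an assumed convention for this sketch).
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for title in sorted(titles, key=str.casefold):
        first = title[:1].upper()
        key = first if first.isalpha() else "0-9"
        groups[key].append(title)
    return dict(groups)


sublists = split_by_initial(["Benzene", "acetone", "2-Butanol", "Argon"])
for letter in sorted(sublists):
    print(letter, sublists[letter])
```

Each key would then correspond to one subpage, so no single page has to hold the whole list.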

Assessed version (approx) vs. current version of article

Sorry to add more work, but new ways of working tend to raise new questions that aren't always obvious at first. I had a lengthy phone conversation this weekend with User:Maurreen, trying to map out the future for 1.0 and how/when we will get versions released (this fall, probably, for 0.5). One issue that arose when I described the bot is, "How do we know that the assessment is valid?" In other words, let us say an assessment was done on Jan 1st as "A-Class." Then a vandal comes in and changes some of the data. We (WP1.0) come along on Feb 1st and download the article onto a CD - but the article is no longer A-Class.

What I would like to propose is that the bot does not simply save the article name, but rather it lists the actual version it found on the day it found the assessment was changed. This is likely to be very close to the version that was actually assessed as A-Class. Obviously this is not perfect, but it should make the problem I've outlined very rare. So when we come along on Feb 1st to download the article, we are downloading the Jan 1st version. I realise we can click through histories for this, but if we have to do this for 10,000 articles the handful of dedicated WP1.0 people are going to get very bored! So, can the bot save the specific past version rather than just the article name? Walkerma 16:53, 1 May 2006 (UTC)

Will do, both the suggestion above, with changing the order of items, and the current suggestion with article version. Oleg Alexandrov (talk) 17:38, 1 May 2006 (UTC)
Thanks a lot! Walkerma 18:02, 1 May 2006 (UTC)

I have a question. Say an article goes from A-class to FA-class in one day. So you want the bot to link to the version of the article on that very day. But say later the article is downgraded from FA-class to B-class. I guess you would like the bot to still link to the FA-class version, right? In other words, you want the version link to change only when an article goes up and not when it goes down, is that correct? Oleg Alexandrov (talk) 04:57, 6 May 2006 (UTC)

I implemented for now the link to specific version as I thought you would want it above, so if an article improves (say from B-class to A-class), then the link to specific article version is changed to the most current (best! :) version, and if the article gets worse, or its assessed quality does not change, the old link to specific article version is kept. Oleg Alexandrov (talk) 01:26, 7 May 2006 (UTC)
Thanks. I've been out of town, but yes, what you've done sounds eminently sensible. Cheers, Walkerma 04:20, 7 May 2006 (UTC)
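
The rule agreed on in this thread (move the version link only when the assessed class improves, otherwise keep the old link) can be written down as a small sketch. The class ordering below and the names QUALITY_ORDER and pick_revision are assumptions for illustration; revision ids are plain integers here.

```python
from typing import Optional

# Assumed ordering, worst to best; not taken from the bot's source.
QUALITY_ORDER = ["Stub", "Start", "B", "GA", "A", "FA"]


def pick_revision(
    old_class: Optional[str],
    old_rev: Optional[int],
    new_class: str,
    current_rev: int,
) -> int:
    """Return the revision id the table should link to."""
    if old_class is None or old_rev is None:
        # Newly listed article: nothing to keep, link the current version.
        return current_rev
    if QUALITY_ORDER.index(new_class) > QUALITY_ORDER.index(old_class):
        # The article improved: link the most recent version.
        return current_rev
    # Unchanged or downgraded: keep pointing at the earlier version.
    return old_rev


print(pick_revision("B", 100, "A", 250))   # upgrade
print(pick_revision("FA", 250, "B", 300))  # downgrade
```

With this rule the most recent version only needs to be looked up for new additions and upgrades, which matches the remark above about where the bot spends its time.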

Logs update

The logs work now, with the individual topics logs being at Wikipedia:Version 1.0 Editorial Team/Music genre articles by quality log, Wikipedia:Version 1.0 Editorial Team/Tropical cyclone articles by quality log, Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality log, and the combined log at Wikipedia:Version 1.0 Editorial Team/Assessment log. All the logs above, except the hurricanes one, are empty, as no changes happened.

I'll work on the other stuff suggested above in the next several days. Comments, things to be done? Oleg Alexandrov (talk) 03:40, 3 May 2006 (UTC)

Thank you, those logs look like they'll be excellent, thank you! A couple of other issues that may need to be considered:
  • We are in the process of setting up a system of nomination/approval such as this, for an article to be included on the release. We may want to include a "status" column, which would be read from a category template on the talk page such as Category:Version 0.5 Science. This would tell us that an article had been approved for the release or not. Is this easy to do? I'm sorry to add work like this, but the situation is in a state of flux (that nomination page didn't exist until about 24 hours ago).
  • I've asked Tito to fix the assessment template so "importance" appears right after the article.
  • I notice that with only 3 WikiProjects the page is already getting very slow (for me) to load. Will we still be generating separate pages for each subject (like chemistry) as discussed before? Is there anything we should be doing to make the table more compact to reduce load times? Chemistry already has over 500 pages assessed, and in a couple of years some projects may have 5000, so we need to keep these tables efficient. Suggestions?
Many thanks again, Walkerma 05:33, 3 May 2006 (UTC)
If I'm getting to be a pain with all these requests, please just let me know! I appreciate your time on this, and I'm very aware that the bot as it is already is much better than the manual system. So please tell me if this gets a bit much, or if something I'm asking for is just too much work. Thanks, Walkerma 05:48, 3 May 2006 (UTC)

You are not getting to be a pain. :) We will slowly get to do everything as it should.

As suggested by you, the bot now updates to individual lists, see

and the logs at

If these lists get bigger, we will see what to do about them.

Question for Walkerma. What to do about the old Wikipedia:Version 1.0 Editorial Team/Chemistry articles? Those articles are not all categorized, so the new Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality obviously does not contain many of them.

To be continued. Oleg Alexandrov (talk) 02:56, 4 May 2006 (UTC)

Do the logs need to start with a second-level (==) heading? It would be much easier to transclude the pages somewhere if the dates were third-level (===) headings instead. Kirill Lokshin 03:07, 4 May 2006 (UTC)
I changed my code and when the logs get updated tomorrow they will have level three headings. Oleg Alexandrov (talk) 03:30, 4 May 2006 (UTC)
Thanks again! Kirill Lokshin 03:42, 4 May 2006 (UTC)
Also thanks! I will probably make the unused chemistry page a redirect or get it deleted. The fact that some are missing reflects the fact that if you type "class=start" it puts a pink "Start" in the template but doesn't put it in the start cat, you have to type "class=Start" for it to work completely - I'll fix those soon. One thing I noticed - I deliberately added a couple of assessments to templates on 3 May such as lead(II) nitrate as Stub-Class. The bot picked them all up, but didn't record any change in the log (compare manually against the old version on Mathbot/WP1.0 where there is no such stub listing). Is this just the way you ran the test, or is the bot failing to see the change for some reason? Many thanks, this is looking great! Walkerma 09:26, 4 May 2006 (UTC)

The log was wrong because of the way I ran the test. I reran it today, and the log seems to be doing its job, see Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality log.

Same for tropical cyclones, Titoxd was wondering about it. The correct log is at Wikipedia:Version 1.0 Editorial Team/Tropical cyclone articles by quality log. Oleg Alexandrov (talk) 03:00, 5 May 2006 (UTC)

I have redirected Wikipedia:Version 1.0 Editorial Team/Chemistry articles to Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality as I merged in there all the comments. By the way, all these new lists generated by the bot are human editable. At the next update the bot will preserve the comments and the status, but overwrite all other fields. Oleg Alexandrov (talk) 04:10, 5 May 2006 (UTC)
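
As a rough illustration of "preserve the comments and the status, but overwrite all other fields", a table row can be treated as a dictionary and merged as below. The field names and the function merge_row are hypothetical; the real tables are wiki markup, and how the bot parses them is not described here.

```python
from typing import Dict, Optional

# Fields a human may have edited and that an update should keep (assumed names).
PRESERVED_FIELDS = ("comments", "status")


def merge_row(
    existing: Optional[Dict[str, str]], fresh: Dict[str, str]
) -> Dict[str, str]:
    """Combine the row already on the page with freshly collected data."""
    merged = dict(fresh)  # start from what was just collected
    if existing:
        for field in PRESERVED_FIELDS:
            if field in existing:
                merged[field] = existing[field]  # keep the human edit
    return merged


on_page = {"page": "Example", "class": "B", "comments": "Needs references"}
collected = {"page": "Example", "class": "A", "comments": ""}
print(merge_row(on_page, collected))
```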

GA's

Just wondering, is the bot going through the {{GA-Class}} categories too? I see that they're being processed, just not added to the table. Also, will the bot be run with a cron job or something similar? Titoxd(?!? - help us) 00:12, 4 May 2006 (UTC)

The bot is now going through {{GA-Class}}, not sure of why it did not go before. Yes, this will run as a cron job once all the issues are ironed out. Oleg Alexandrov (talk) 02:48, 4 May 2006 (UTC)
Something I noticed is that in the last run, 2005 Atlantic hurricane season was listed in the log correctly, but is not in the table anymore (it was a change from A-Class to FA-Class, to help with debugging purposes). As the changes are being processed correctly for the log but not for the table, is that a bug? Titoxd(?!? - help us) 03:03, 5 May 2006 (UTC)
You have sharp eyes. :) That was a bug, fixed now! Thanks. Oleg Alexandrov (talk) 03:50, 5 May 2006 (UTC)

Military history

Any chance of getting Category:Military history articles by quality included in Mathbot's rounds? :-) Kirill Lokshin 01:58, 4 May 2006 (UTC)

Works now, see Wikipedia:Version 1.0 Editorial Team/Military history articles by quality and Wikipedia:Version 1.0 Editorial Team/Military history articles by quality log. Oleg Alexandrov (talk) 02:49, 4 May 2006 (UTC)
Looks great; thanks! Kirill Lokshin 02:53, 4 May 2006 (UTC)

"Importance" field

Would setting up automatic updating of the "Importance" field (from categories like "High-importance military history articles" on the talk page) be possible? As the generated worklists grow, they're going to be more difficult to edit directly, so it would be nicer if all the fields could be filled in directly from the article. Kirill Lokshin 11:44, 5 May 2006 (UTC)

Will do, tomorrow. As well as some of the things suggested above. Oleg Alexandrov (talk) 03:34, 6 May 2006 (UTC)

Rating removed

Any chance we could get the removal of ratings from an article flagged for attention somehow? Maybe bold the line in the log? Kirill Lokshin 04:06, 9 May 2006 (UTC)

I had mentioned this before - I suggested that anything other than a single level upgrade should be bolded as unusual. What do you think? Downgrades at WP:Chem do happen, but are rare, as are upgrades like Start -> A in one step. Walkerma 04:04, 10 May 2006 (UTC)
I had forgotten about Walkerma's suggestion, will do. Oleg Alexandrov (talk) 05:01, 10 May 2006 (UTC)

Done, see Wikipedia:Version 1.0 Editorial Team/Tropical cyclone articles by quality log. Note that as far as I understand the current hierarchy, it is FA -> A -> GA -> B, so a change from B to A is a change of two positions and that is recorded as bold. Oleg Alexandrov (talk) 03:55, 11 May 2006 (UTC)
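
The bolding rule from this thread (anything other than a single-level upgrade is unusual, and so is a removed rating) might be sketched as follows. The full class ordering, the treatment of a first-time assessment as ordinary, and the wording of the log line are all assumptions; only the wiki bold markup (three apostrophes) is standard.

```python
from typing import Optional

# Assumed ordering, worst to best, extending FA -> A -> GA -> B downwards.
QUALITY_ORDER = ["Stub", "Start", "B", "GA", "A", "FA"]


def is_unusual(old: Optional[str], new: Optional[str]) -> bool:
    """True if a logged change deserves attention."""
    if old is None:
        return False  # first assessment: treated as ordinary (assumption)
    if new is None:
        return True   # rating removed
    step = QUALITY_ORDER.index(new) - QUALITY_ORDER.index(old)
    return step != 1  # downgrades and multi-level jumps are unusual


def log_line(title: str, old: Optional[str], new: Optional[str]) -> str:
    """Format one log entry, bolded in wiki markup when unusual."""
    text = f"{title} reassessed from {old or 'unassessed'} to {new or 'unassessed'}"
    return f"'''{text}'''" if is_unusual(old, new) else text


print(log_line("Example", "GA", "A"))  # single step up
print(log_line("Example", "B", "A"))   # two positions
```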

Linking assessed versions

Oleg, I just wanted to thank you, this feature is beautifully done! I had assumed we would need either two columns, or we would lose the link to the current article, but you have managed to include both in a way that is both clear yet compact. Thank you! Walkerma 04:04, 10 May 2006 (UTC)

May I request an enhancement

Hi Oleg, your mathbot is excellent in finding all the classifications in the WP:Chem wikiproject. I regularly maintain the list of grouped classifications (focussing on the project's goals instead of just listing them) in the project's worklist, and therefore find the log file of the mathbot very useful. Now, yesterday, I thought that perhaps mathbot didn't run, as there is no notification on the logpage, although I assume that it did do the scanning. Hence my question:

  • can you make it so that it does report to have done work on the logpage, even if it doesn't find any changes. Perhaps just the date header and a line 'no changes found'? Wim van Dorst (Talk) 20:17, 11 May 2006 (UTC).
Done. :) Oleg Alexandrov (talk) 23:52, 11 May 2006 (UTC)

Nice, thanks! So I have another question, if I may: can you add some counting statistics on the large table, viz.

  • Number of FA-Class articles
  • Number of A-Class articles
  • Number of B-Class articles
  • Number of Start-Class articles
  • Number of Stub-Class articles
  • Number of nonassessed articles, but with the {{chemistry}} template
  • Total number of articles with the {{chemistry}} template

Even if this can't be done, thanks for your attention. If it can, I'll use that info for the WP:Chem statistics. Wim van Dorst (Talk) 15:48, 12 May 2006 (UTC).

Just to clarify, where exactly do you want the table, at Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality? Should it be at the top or at the bottom (I personally would think it may be better at the bottom, but not sure). Oleg Alexandrov (talk) 03:21, 14 May 2006 (UTC)
I made such a table, see the bottom of Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality. It looks at all the classes of articles and the unassessed ones, then their total. It does not look at the articles with the {{chemistry}} which are neither assessed (FA-class, B-class, etc) nor in Category:Unassessed chemistry articles. So one should make sure that all unassessed chemistry articles are indeed in Category:Unassessed chemistry articles for the bot to count them, at least that's how things are for now.
Comments about where the table should go, how it should look? Oleg Alexandrov (talk) 16:32, 14 May 2006 (UTC)
Very nice! Not counting articles that aren't explicitly in the "unassessed category" is probably correct, since there are a number of other non-rated categories floating around that don't really need to be counted. As far as where the table should go: how about putting it on a separate subpage, like Wikipedia:Version 1.0 Editorial Team/Chemistry article counts, and then transcluding it anywhere else it's needed? Being able to transclude the tables from various projects in other places would be a great reporting capability. And, if we go with that, would it be possible for the bot to generate a combined table listing the article counts from all of the projects? Kirill Lokshin 16:39, 14 May 2006 (UTC)
Very nice, indeed, Oleg! Thanks for this. This is exactly as I had in mind. Now to add all the assessments to the templates (making that entry to read 0), and we're done. Wim van Dorst (Talk) 23:33, 14 May 2006 (UTC).
This is an excellent enhancement, thanks Oleg. Walkerma 03:29, 15 May 2006 (UTC)
Thanks. :) Done, see Wikipedia:Version 1.0 Editorial Team/Index of subjects and the links therein. Oleg Alexandrov (talk) 03:56, 15 May 2006 (UTC)
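
The counting described in this thread (one count per class, one for the unassessed category, then a total, with anything outside those categories left out) could be sketched like this. The class list and the name count_by_class are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Classes that are counted, in display order (assumed list).
CLASSES = ["FA", "A", "GA", "B", "Start", "Stub", "Unassessed"]


def count_by_class(assessments: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (class, count) rows followed by a ('Total', n) row.

    Articles whose class is not in CLASSES are ignored, mirroring the
    point above that only the known categories are counted.
    """
    counts = Counter(assessments.values())
    rows = [(cls, counts.get(cls, 0)) for cls in CLASSES]
    rows.append(("Total", sum(n for _, n in rows)))
    return rows


sample = {"a": "FA", "b": "Stub", "c": "Stub", "d": "Unassessed", "e": "Merge"}
for label, n in count_by_class(sample):
    print(label, n)
```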

A minor thing from me too: could you add a blank space between the name of an article and the revision returned by the bot? Titoxd(?!? - help us) 19:37, 12 May 2006 (UTC)

Done now, the tables will be modified when the bot runs again in less than half an hour. Oleg Alexandrov (talk) 03:21, 14 May 2006 (UTC)
Done. Will hopefully update the table next time it runs, tomorrow. Oleg Alexandrov (talk) 04:09, 18 May 2006 (UTC)
Thanks. Titoxd(?!? - help us) 04:18, 18 May 2006 (UTC)

Length of logs

If left to run as they do now, the logs will always grow. What would be a good length for the log? I would think something like 30 days in the past. Does anybody think a longer log is necessary? Oleg Alexandrov (talk) 04:32, 18 May 2006 (UTC)

Well, since the logs are archived to page history too, I'd say 30 days is enough. Titoxd(?!? - help us) 04:35, 18 May 2006 (UTC)
Yup. I'd go for something even shorter—like two weeks—but that might not be needed for projects with shorter logs. Kirill Lokshin 12:49, 18 May 2006 (UTC)
I think I like a month as a default - I think these logs will be less active once all of the templates have been updated, and most projects are smaller/less active than the test projects. Walkerma 13:22, 18 May 2006 (UTC)
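
Keeping only the last 30 days of a log amounts to dropping every dated section older than a cutoff. A minimal sketch, assuming the log has already been split into (date, text) pairs; the name trim_log and that data layout are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple


def trim_log(
    entries: List[Tuple[date, str]], today: date, keep_days: int = 30
) -> List[Tuple[date, str]]:
    """Keep entries dated within the last keep_days days, cutoff included."""
    cutoff = today - timedelta(days=keep_days)
    return [(day, text) for day, text in entries if day >= cutoff]


log = [
    (date(2006, 4, 17), "older entry"),
    (date(2006, 4, 18), "entry on the cutoff day"),
    (date(2006, 5, 18), "latest entry"),
]
print(trim_log(log, today=date(2006, 5, 18)))
```

Making keep_days a parameter would allow a shorter window, such as two weeks, for busier projects.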

{{Unassessed}}

I had to create this for the statistics tables, see for example Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality statistics. So it is good if people are aware of it, and maybe put it on a few people's watchlist, in case it gets vandalized. Also, if anybody feels like giving it a background color, be my guest. :) Oleg Alexandrov (talk) 03:41, 19 May 2006 (UTC)

So, how do we use it?

Besides my prior involvement in The Beatles WikiProject, I've just set up The KLF WikiProject. The obvious question then is as above - what do I need to do now and how do I use this tool? Or, will you be in contact with us? --kingboyk 12:42, 21 May 2006 (UTC)

The bot collects the info from Category:Wikipedia 1.0 assessments. So, you need to construct a set of categories for the Beatles project mirroring the existing schemes for chemistry, military history, etc in Category:Wikipedia 1.0 assessments. 02:13, 22 May 2006 (UTC)
Ah, OK. Thanks. So, we won't need to change our tables at all. Good. With regards to Wikipedia:WikiProject The KLF, The KLF is a featured article and it might just scrape into Wikipedia 1.0 on notability (given the proper time and venue I'm willing to argue why it should, of course :)) but I doubt any of our other articles would be seriously considered. With just one article in contention, how should I proceed? Are the old manual tables going to be used too, or must every WikiProject use the new category/bot system? --kingboyk 20:38, 24 May 2006 (UTC)
I think some projects may want to use a manual table, particularly if they only have a handful of articles assessed. You are most welcome to use the bot, though, I hope it can be useful for WikiProjects to help organise their articles irrespective of any inclusion in WP1.0. As for notability, you may well be right (I hope we can get WP:V0.5 large enough to include KLF, but it's too early to tell). However, I'm also hoping that we can set up a scheme whereby WikiProjects can produce their own CD or book on a specialist topic (a Wikireader), so your group could produce a Wikireader on the KLF (or maybe Acid House?). The infrastructure we are trying to build here should be easily usable by a WikiProject for such a purpose. Walkerma 21:17, 24 May 2006 (UTC)
Thanks very much. I have this page on my watchlist, and of course the WikiProject talk pages are always open. The WikiProjects are potentially your best friends in this endeavour so please engage us with plenty of news and dialogue :) P.S. I purposefully kept the KLF Project very tightly focused (we got one FA and 6 GAs in a couple of months, and I wanted to keep the momentum going), but I'm hopeful that we will one day be a subproject of a rave/acid house Project. Indeed I think there may already be an electronic music project (note to self: get in touch with them!). --kingboyk 21:23, 24 May 2006 (UTC)
I've created Category:The KLF articles by quality. Do I now sit back and watch the bot do its magic, or does s/he need to be told of our existence by me editing Wikipedia:Version 1.0 Editorial Team/Index of subjects? --kingboyk 11:59, 25 May 2006 (UTC)
The bot will run tonight, meaning in 12 hours from now. You should see a new entry pop up at Wikipedia:Version 1.0 Editorial Team/Index of subjects. If it does not, or if there are some problems with it, let us discuss here. Oleg Alexandrov (talk) 15:21, 25 May 2006 (UTC)
Excellent! The Beatles should show up now too. Looking forward to it, and I'll be sure to let you know if any issues arise. Thanks again. --kingboyk 15:37, 25 May 2006 (UTC)
Soon I will add some text to Wikipedia:Version 1.0 Editorial Team/Index of subjects describing how the whole thing works, to address questions as raised by you above. Oleg Alexandrov (talk) 17:01, 25 May 2006 (UTC)

<-- Very nice work Oleg. More comments/questions later. For now though, could you have the bot add a link from each created page (such as Wikipedia:Version 1.0 Editorial Team/The Beatles articles by quality) back to Wikipedia:Version 1.0 Editorial Team/Index of subjects? People getting to the subpages through our categories, what links here, watchlists etc currently have no way to navigate directly to the subject index. --kingboyk 05:47, 26 May 2006 (UTC)

I added a manual header to all the tables currently available, by adding the following code:
{{process header
 | title    = {{SUBPAGENAME}}
 | section  = assessment table
 | previous = '''↑''' [[Wikipedia:Version 1.0 Editorial Team/Index of subjects|Index of subjects]]
 | next     = [[{{FULLPAGENAME}} log|log]], [[{{FULLPAGENAME}} statistics|statistics]] →
 | shortcut =
 | notes    =
}}
Perhaps it's something the bot can add automatically... although I haven't added them to the statistics and log pages yet, as it's more difficult to do with magic words. Titoxd(?!? - help us) 18:39, 26 May 2006 (UTC)
Very nice work, thank you. It would be better if it were inside <noinclude> tags though (for the smaller Projects at least, and if added automatically). I'll change that for the 2 Projects I'm representing, both of whom currently transclude the article indexes. --kingboyk 19:55, 26 May 2006 (UTC)

Page move

I moved User:Mathbot/WP1.0 to Wikipedia:Version 1.0 Editorial Team/Index of subjects which I think will be the permanent home and the place containing links to all pages written to by the bot. You may want to take the old User:Mathbot/WP1.0 redirect off your watchlist. Oleg Alexandrov (talk) 15:19, 24 May 2006 (UTC)

WikiProject The Beatles

Oleg, I'm going to have to offer you a Beatles barnstar up front if you are able to help with any of these, because I think they would take a couple of hours coding. But let's try anyway and see what you have to say :)

Wikipedia:WikiProject The Beatles/Article Classification was set up to assess our articles primarily as a "stock take", so we know where we stood, which articles would need work, which might need to be deleted, which could be FA candidates and so on. Wikipedia 1.0 was in the back of our minds too, and we're more than happy to help with it. It's not our only goal though. As you know (and if you've forgotten see thread above) User:Lar was working on some code to generate our tables, but he seems to have hit a few snags along the way. With the really excellent work Mathbot is doing, I'm wondering if we might be able to use your tool for all of our classification tasks, whether for WP1.0 or just for our Project to use. The changes would be usable by other projects too; I'll annotate these as I proceed.

We use the following additional classifications: {{Merge-Class}}, Wikipedia:WikiProject The Beatles/Article Classification/MergeDel, Wikipedia:WikiProject The Beatles/Article Classification/Merged, Wikipedia:WikiProject The Beatles/Article Classification/AfD.

We could then transclude these tables - and Wikipedia:Version 1.0 Editorial Team/The Beatles articles by quality - into a Project page. I'm thinking that other Projects might want to use the feature too, so we could place all the categories above into a Mathbot crawlable category to be determined (such as Category:WikiProject The Beatles article classifications under Category:WikiProject article classifications) with all Projects using the same category names and output page names as above (replace "WikiProject The Beatles" with "foo"). Sounds like a lot of work but I think it would mostly be a reuse of existing code?
  • From what I can tell, as currently configured Mathbot trawls the categories and creates the tables from a cronjob overnight. Editors then have to come back for a second time to articles they've assessed and add the comment. Presumably Mathbot doesn't delete these comments even if the article class changes (and if it does, can that be changed?). This wouldn't be optimal for us. Some of the editors we have persuaded to get involved with classification are rather more casual editors than us wiki geeks, and I don't want to discourage them or have to go behind them mopping up. Therefore, I have in mind the following situation:
When an article gets assessed (in our case by the editor adding a parameter such as "|FA" to {{WPBeatles}}), the editor can add a template (which will be blank, just a HTML comment inside it so it's not a redlink, and we can protect it) containing the comment and other metadata. Mathbot would check the Talk pages for these tags, read them, add the info to the table, and erase the tag from the talk page. Example:
{{Assessnotes|A short article. It's worth remembering that ''there is a limit to what can be said about some of these people'' but - despite Williams disappearing from the radar as quickly as he arrived, (he remains a fixture in Liverpool as a 'colourful character') - I think there must be interesting and useful material out there about his short time with the Beatles. He did write a book after all (which I read, nearly 20 years ago). Article lacks even basics such as date/place of birth and photo. -- [[User:Kingboyk]]|date=18 May 2006}}
date= would be optional and if missing would be replaced by today's date. Other optional params could easily be added later on (such as importance= perhaps?).
Alternatively, if you have some other suggestion about how comments could be made and picked up by the bot that would be great too. Again, I could see this being of great use to all Projects.
Better idea for this: Just put the section header into a <noinclude> tag :) --kingboyk 20:42, 26 May 2006 (UTC)
  • Possibly my most troublesome request: Would it be possible to have (either/or/both) a "categories" field in the article classifications tables and/or (perhaps simpler) a table listing all articles regardless of classification (including our special classes!) sorted by category (preferably with duplicates annotated as duplicates rather than repeating the comment/grading again)? If we got this far, we'd have everything Lar's tool was meant to be :) --kingboyk 12:51, 26 May 2006 (UTC)

Thoughts? Am I asking for too much? :) --kingboyk 08:00, 26 May 2006 (UTC)
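
As a rough illustration of the {{Assessnotes}} proposal above, the Python sketch below pulls such tags out of a talk page's wikitext, fills in today's date when date= is missing, and returns the text with the tags erased. It is only a sketch under assumptions: the function name extract_notes, the regular expression, and the date format are hypothetical, it is not Mathbot's code, and it does not handle comments that themselves contain templates.

```python
import re
from datetime import date

# Hypothetical pattern for {{Assessnotes|<comment>|date=<date>}}, date optional.
# Assumption: the comment itself contains no nested {{...}} templates.
ASSESSNOTES_RE = re.compile(
    r"\{\{Assessnotes\|(?P<comment>.*?)(?:\|date=(?P<date>[^|{}]*))?\}\}",
    re.DOTALL,
)


def extract_notes(talk_text, today=None):
    """Return (remaining_text, notes); notes is a list of (comment, date) pairs."""
    if today is None:
        # Fallback described in the proposal: a missing date becomes today's date.
        today = date.today().strftime("%d %B %Y")
    notes = []
    for match in ASSESSNOTES_RE.finditer(talk_text):
        comment = match.group("comment").strip()
        when = (match.group("date") or today).strip()
        notes.append((comment, when))
    # "Erase the tag from the talk page": drop every matched template.
    remaining = ASSESSNOTES_RE.sub("", talk_text)
    return remaining, notes
```

A bot along these lines would still have to fetch every tagged talk page on each run, which is the cost being weighed in this thread.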

For the various merge-delete-etc. tags, I would think that they're intended to be less permanent than the ratings, so I would implement them as separate parameters (e.g. {{BeatlesProject|class=Stub|cleanup=MergeDelete}}) and avoid this problem entirely. Is there any real benefit to having such things directly on the worklist, rather than just accessing the list of articles to be merged through a category (which should be reduced in size as articles are merged)? (This could be a problem if you're not using a named parameter for the rating, though; but that's just a potential problem with the template design in that case.)
As far as comments are concerned: I think that the comments field in the table will become unusable for a sufficiently large table size—anyone who wants to comment on the article will do so on the talk page rather than trying to edit a 300K table page—and that transcluding comments into the table won't really work unless the comments are kept very short, since you can't really include an entire conversation about an article in the table. Kirill Lokshin 13:39, 26 May 2006 (UTC)
Thanks for the reply.
See {{WPBeatles}} for our template code. It accepts an argument for the grade, including "Merge" and our other custom grades. I'd prefer not to change that too much unless another bot can update the talk pages as I've spent all day on an AWB run and gosh is it boring! Is it worth having them in a table and just the category? Hmm... you might not think so, and I'll think about it :) Until now we've done it that way to allow commenting, dissenting opinions, etc., but I guess we might not need it. Category might be enough.
I see your point about the comments, but that's the way we want to do it. We want to have a centralised venue for organising and prioritising our workload, of which classification is a very major part. The alternative is to watchlist 600 articles, and I for one don't want to do that :) WP:TBA shows what we've been trying to do. --kingboyk 13:49, 26 May 2006 (UTC)
Fair enough. There's probably a clever way of using Special:Recentchangeslinked on the worklist itself to get any updates from the talk pages; but I'm not sure how you can filter that by namespace. Kirill Lokshin 14:04, 26 May 2006 (UTC)

So, you are trying to bribe me using some cheap image, called a barnstar? :)

I will look into that. I will be away from today until Monday night, and then the workweek will start. I will have the code done but it may take say a couple of weeks. I will know for sure when I read carefully what exactly you want the bot to do. Oleg Alexandrov (talk) 15:11, 26 May 2006 (UTC)

Thanks Oleg. In response to your question re bribery: Yes!
I'll ask Wing Commander Lar to come and read my request and Kirill Lokshin's comments, just to be sure that what I've asked for is what we still want ("reading from the same hymn sheet", to use the cliche). Thanks again. --kingboyk 15:20, 26 May 2006 (UTC)
Right, Wing Commander Lar here, a couple of points about comments, I have to digest this in more depth and say more later but I can't not say anything, not my style!
  • I agree that terrifically large comments in the tables are not appropriate. But that's not what is meant. The comments we have now are from the reviewer about why it's reviewed that way (doesn't have enough pics, not long enough, whatever) not general talk page comments. (so getting recent comments would be bringing way too much in, we only want the article reviewer who gave it "A" to say why it's an A in a few words) If they got too long (person B saying no, it really is a "B" class article not an "A") they would need to be redacted back to talk. Our few current double comments and double classifies need to go away, I think, need to pick ONE grade and leave it at that.
  • On the page length issue... presumably someone is going to have to edit in the importance tag anyway (or is the project box supposed to carry that somewhere???, we may have missed that, I added importances to a couple as tests to see what the bot did) which is presumably as much a problem with 5000 article tables, due to length, with or without comments. This is why we broke ours up by alphabetical letters, to make less to edit at once. You might want to consider breaking it up by article class? It is not as good a division as the alphabet, being skewed toward B and stub, but it might help. Maybe ALSO break it by alphabet if it gets over a certain number of entries?
  • Also PLEASE do consider (an option??? maybe the project page carries a box with "settings for Mathbot" that can be easily read??) producing a transcludeable thing with headers that start at level 3... OK that's enough, Steve(kingboyk) told me not to say much... ++Lar: t/c 16:38, 26 May 2006 (UTC)
  • Thanks Lar, good points all. Having mathbot-understood params within each Project's template might not be a bad idea, although a new blank template or a set of meta data inside a HTML comment would probably be easier to parse/easier for novice editors to work with? Agree with you re the commenting system (glad we're still on roughly the same wavelength, always a great help when working with people! :)) --kingboyk 20:47, 26 May 2006 (UTC)
  • Thinking about this overnight, our "comments" will be a little more verbose than required for WP1.0 and aimed at what's to fix as much as the current article state. Perhaps a longer comment cell for each row could be inside an includeonly tag, so they only appear when transcluded onto our Project page? --kingboyk 10:10, 27 May 2006 (UTC)
You may be interested in Tito's elegant solution to the issue here. This is new and we haven't tested it yet, but Tito is usually a wizard at such things. In other words, you could create a "category=L" for all articles within the Beatles list that begin with L, and you could get the bot to produce a list called "Beatles L Articles by quality" or some such thing. I think this is very easy for Oleg to do, in fact the capability may already be in there. Walkerma 16:48, 26 May 2006 (UTC)
However, an important point was brought up there about the importance ratings. We had discussed whether to put the importance parameter in {{hurricane}}, but we don't know if the bot would pick it up. Also, would the bot be able to pick up intersections with Category:Wikipedia:Version 0.5, or subcategories, so we can know which article is vetted for the release? Titoxd(?!? - help us) 17:58, 26 May 2006 (UTC)

(A gentle "bump"). We got importance sorted out, any thoughts about how comments embedded in article talk pages might be transferred to the MathBot listing? :) At WP:BEATLES I think we're just about ready to abandon our previous efforts and go entirely with Mathbot, but we'd love to be able to allow editors to comment on talk pages as we want to use the tables as a worklist in addition to helping WP1.0. --kingboyk 12:20, 20 June 2006 (UTC)

I don't think it's possible, from what I can tell. The only logical place to put comments would be in a parameter on the template being used to tag articles, but Mathbot looks at the generated categories, not the template itself. Indeed, given the wide variety in the project banners, I'm not sure that it would even be possible for Mathbot to understand their invocation except if it were programmed with each project's template on a case-by-case basis. Kirill Lokshin 12:51, 20 June 2006 (UTC)
Not possible?! Come on, Oleg's smarter than that! :P --kingboyk 12:53, 20 June 2006 (UTC)

Right, there is no good way for the bot to collect comments using categories. It is theoretically possible, but this will require

  • Choosing a specific standardized box on each talk page where the comments will be.
  • Letting the bot know (via some category, say) which talk pages have comments which should be harvested.

I would like to mention that this will require a good amount of work on my script, and will make the bot take much longer to compile the lists every day, as one would need to visit each and every talk page having comments each and every single time.

I would be reluctant to implement that without a really good reason for it. If enough people think it is a good idea and will get used, then I will go for it, otherwise I won't, for reasons of too much work and impacting the performance too much. I suggest for now one could just edit each worksheet directly (e.g., Wikipedia:Version 1.0 Editorial Team/The Beatles articles by quality/1) and write the comments there. Oleg Alexandrov (talk) 16:22, 20 June 2006 (UTC)

I don't really think it's necessary in any case. For projects with small worklists, it's trivial to edit the list directly; for projects with large ones, adding comments to the worklist is unproductive to begin with, since nobody will go searching through all twenty pages just to look up some comment. Given that most of the work is being done on the talk pages themselves, why not have projects that require/encourage comments simply leave them there directly? Kirill Lokshin 16:30, 20 June 2006 (UTC)
It might not be necessary for your Project Kirill, but with all due respect I think I'm better placed to say what's best for WP:BEATLES :) In our Project, we have lots of foot soldiers, editors who won't come near WP space. We (Beatles Project) may be one of the less important projects represented here, but we also get a massive number of casual visitors at our articles. I'd like to harness some of that energy by allowing people to assess in situ. We want comments because quite frankly a lot of the articles are in a terrible state, and we (the "management", if you like) want a centralised list to work with (and we want those management classes too, like "Merge"). We have too many articles to watchlist them all and for one editor to know what's happening on every article (unlike WP:KLF where everything is on my watchlist); we don't have so many that the lists will be too large to work with. It's not 100% essential in as much as we're gonna stamp our feet and march off in disgust if Oleg says no, but it would help us. If he does say no, we'll have to think again certainly: either abandoning the idea or running our own bot (something I'd rather avoid if at all possible). --kingboyk 17:22, 20 June 2006 (UTC)

Category-based importance ratings

So, to play around with this idea once again:

  • Would it be possible to fill in the "Importance" cells in the table from article-talk categories like Category:High importance military history articles? Would such a scheme need to be adopted by everyone, or could that column be automatically updated only where the project has such a category tree?
  • More conceptually, would there be any problems with allowing importance rating to take place in a distributed fashion? Kirill Lokshin 16:55, 26 May 2006 (UTC)
I think we'd consider adopting that too, as long as it's not too complicated. I like the idea of our more casual members being able to do the work at the talk page level, with the results "collected" at a central level (here or at the WikiProject). Categories seem a decent way to do it, the other way is metadata. --kingboyk 17:00, 26 May 2006 (UTC)
Just like I wrote above, WikiProject Tropical cyclones is interested in doing that too. Titoxd(?!? - help us) 17:59, 26 May 2006 (UTC)

I have been away, as mentioned above. Yes, I can fill in the "importance" cells from categories like Category:High importance military history articles. It does not need to be adopted by everyone, but the format had better be the same if adopted by several WikiProjects.

What do you mean by "allowing importance rating to take place in a distributed fashion"? Oleg Alexandrov (talk) 22:27, 29 May 2006 (UTC)

I think he means that it would be done in a talk page instead of directly on the assessment table (correct me if I'm wrong). So, we would have Category:High importance military history articles feeding into Military history articles by importance, which would then feed into Category:Wikipedia 1.0 assessments? Titoxd(?!? - help us) 23:53, 29 May 2006 (UTC)

Now that subpages work, time to deal with this importance stuff. So, military people, anybody willing to make Category:High importance military history articles a bluelink and add articles to it? Then I will see how to modify the script to collect that information.

Note. In order for my bot to locate those importance categories, it is good if a "master" category is created, from where the bot can go look up all these Category:High importance military history articles, Category:Low importance chemistry articles, etc. Oleg Alexandrov (talk) 04:03, 6 June 2006 (UTC)
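
To make the category convention concrete, here is a minimal Python sketch of how a script might read an importance level and a project name out of category titles like the ones mentioned in this thread. The function name and the regular expression are assumptions for illustration, not Mathbot's actual code; the pattern accepts both the "High importance" and "High-importance" spellings that appear here.

```python
import re

# Hypothetical pattern: "Category:<Level>-importance <project> articles",
# also accepting a space instead of the hyphen.
IMPORTANCE_CAT_RE = re.compile(
    r"^Category:(?P<level>\w+)[- ]importance (?P<project>.+) articles$"
)


def parse_importance_category(name):
    """Return (level, project) for an importance category, or None otherwise."""
    match = IMPORTANCE_CAT_RE.match(name)
    if match is None:
        return None
    return match.group("level"), match.group("project")


# Subcategories a bot might find under a "master" category.
for cat in [
    "Category:Top-importance military history articles",
    "Category:Low importance chemistry articles",
    "Category:Physics articles by importance",  # a master category, not a level
]:
    print(cat, "->", parse_importance_category(cat))
```

Agreeing on one spelling across projects would let a single pattern like this cover everyone, which is the point about keeping the format the same.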

Ok, I've created Category:Military history articles by importance and two categories under it, Category:Top-importance military history articles and Category:High-importance military history articles (with two articles in each). Is that enough to do some testing with? Kirill Lokshin 04:37, 6 June 2006 (UTC)
Good thanks. I will have that working by the end of the week. Oleg Alexandrov (talk) 15:19, 6 June 2006 (UTC)
I've created a similar hierarchy for physics, and will now populate it a bit. They're all in Category:Physics articles by importance. -- SCZenz 16:38, 6 June 2006 (UTC)

Just to clarify, should the bot just fill in the importance fields in existing lists, like in Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality, or should it also generate a new list, with those articles resorted, this time by importance? To me it is fine either way, but I wonder what is the exact thing to do. I have this question because, per Wikipedia:Version 1.0 Editorial Team/Using Mathbot, it appears that the bot should also do the second, so I would like to clarify. Thanks. Oleg Alexandrov (talk) 15:12, 7 June 2006 (UTC)

The ideal thing, from my perspective, would be to use the importance rating—rather than alphabetical ordering—as the second sort key for the main table. In other words:
  • FA-Class
    • Top-importance FA-Class
      • Top-importance FA-Class beginning with A
      • Top-importance FA-Class beginning with B
      • ...
    • High-importance FA-Class
      • High-importance FA-Class beginning with A
      • ...
    • ...
    • Unrated-importance FA-Class
      • ...
  • A-Class
    • ...
Would that be difficult to implement? Kirill Lokshin 15:16, 7 June 2006 (UTC)
When I wrote up the "Using Mathbot" page, there were two listings in the main Index for "Physics articles by importance" and "Military History articles by importance" so I assumed that would be the form these would take. I can also imagine that there may be some groups who want to see their articles ranked by importance first, quality second. If this sort of table is hard to do, then Kirill's proposal would be the next best thing. I agree that Kirill's ordering is a nice way to do the "By quality" lists. Walkerma 17:18, 7 June 2006 (UTC)
To answer Walkerma, that double list was an artifact by my bot which now is fixed.
Changing the way things are sorted is very easy, and doing one more list is easy also (although I am not sure why there is a need for an extra list with the same topics sorted differently).
That is to say, I can sort first by class, then by importance, then alphabetical. Let us see if there are any objections to that or other suggestions. Oleg Alexandrov (talk) 17:33, 7 June 2006 (UTC)
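
The three-level ordering described above (class, then importance, then alphabetical) can be sketched with a tuple sort key in Python. This is an illustration only: the exact lists of classes and importance levels, the function names, and the sample articles are assumptions, not taken from the bot.

```python
# Assumed orderings; only some of these names appear in the discussion.
CLASS_ORDER = ["FA", "A", "GA", "B", "Start", "Stub", "Unassessed"]
IMPORTANCE_ORDER = ["Top", "High", "Mid", "Low", "Unrated"]


def sort_key(entry):
    """Key for (title, class, importance): class, then importance, then title."""
    title, cls, importance = entry
    return (
        CLASS_ORDER.index(cls),
        IMPORTANCE_ORDER.index(importance),
        title.lower(),
    )


def sort_articles(entries):
    """Return the entries in worklist order."""
    return sorted(entries, key=sort_key)


sample = [
    ("Zinc", "B", "High"),
    ("Acid", "FA", "High"),
    ("Benzene", "FA", "Top"),
    ("Argon", "FA", "Top"),
]
for row in sort_articles(sample):
    print(row)
```

Swapping the first two elements of the key would give the "importance first, quality second" variant that was also mentioned.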
I'm not really sure that the "by importance" list is necessary, but that's probably just me. Titoxd(?!?) 03:24, 9 June 2006 (UTC)

The bot now collects "importance" information, see Wikipedia:Version 1.0 Editorial Team/Physics articles by quality. Next thing to do will be sorting by importance, to be done in a day or two. Oleg Alexandrov (talk) 05:09, 9 June 2006 (UTC)

Sorting works now too, see Wikipedia:Version 1.0 Editorial Team/Chemistry articles by quality/1. Oleg Alexandrov (talk) 05:37, 11 June 2006 (UTC)

The Beatles and KLF Projects now have importance categories too. Will let you know if I manage to break the bot! :) --kingboyk 12:01, 12 June 2006 (UTC)

Seems to be working a treat, thanks ever so much. --kingboyk 09:22, 13 June 2006 (UTC)

The lists of articles are getting big

Before we deal with the importance ratings above, I suggest we do something about the lists getting too big. At the moment, Wikipedia:Version 1.0 Editorial Team/Military history articles by quality has around 2000-3000 articles listed and is already 481 kilobytes long and is slow to load.

My idea is to split such a list into subpages, each 400-500 articles. If at some point a subpage becomes bigger than 500, all the subpages get reshuffled, new subpages get added, so that each subpage contains only 400 articles again. In this way we would have

  • ...

and the number of subpages would grow as needed. Wonder what you think of that. Oleg Alexandrov (talk) 22:45, 29 May 2006 (UTC)
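
The reshuffling rule proposed above can be sketched in a few lines of Python. The thresholds 400 and 500 come from the proposal; the function names and the representation of a subpage as a list of article titles are assumptions made for illustration.

```python
def needs_reshuffle(subpages, max_size=500):
    """True if any subpage has grown beyond the allowed maximum."""
    return any(len(page) > max_size for page in subpages)


def reshuffle(subpages, target_size=400):
    """Flatten all subpages and cut them again into pages of target_size.

    The order of the articles is preserved; the last page may be shorter.
    """
    articles = [title for page in subpages for title in page]
    return [
        articles[i:i + target_size]
        for i in range(0, len(articles), target_size)
    ]


# Example: the second subpage has just exceeded 500 entries.
pages = [["a"] * 400, ["b"] * 501, ["c"] * 100]
if needs_reshuffle(pages):
    pages = reshuffle(pages)
print([len(page) for page in pages])
```

Because every page is refilled to the target size, new subpages are only appended at the end and the count grows as needed.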

Maybe have the original page (Wikipedia:Version 1.0 Editorial Team/Military history articles by quality) be an index to all of the subpages, and start the actual article listing at Wikipedia:Version 1.0 Editorial Team/Military history articles by quality 1? Or did you have some other place where you were intending to keep links to the pages? Kirill Lokshin 22:47, 29 May 2006 (UTC)
Good point, we need an index. So we can start from 1. :) Oleg Alexandrov (talk) 22:50, 29 May 2006 (UTC)
Sounds like a good idea to solve the problem. Titoxd(?!? - help us) 23:54, 29 May 2006 (UTC)
Yes. And maybe have the subpages named like so: Wikipedia:Version 1.0 Editorial Team/Military history articles by quality/1, so that as subpages they automatically get a navigation link back to Wikipedia:Version 1.0 Editorial Team/Military history articles by quality. --kingboyk 10:11, 31 May 2006 (UTC)
I will add a link to the parent page either way. But using this /1 format does not appeal too much to me.
Using subpages will require some big changes to my code. I will do it by the end of the week. Oleg Alexandrov (talk) 17:11, 31 May 2006 (UTC)

Suggest you do it by the quality level and then by alpha (as WP:Beatles did) rather than hard coded numbers... ++Lar: t/c 17:39, 31 May 2006 (UTC)

That will be easier to implement. However, that may not be enough. In the future there may be a very big number of articles of a certain quality starting with a certain letter. But maybe we could think of that when this becomes a problem. Oleg Alexandrov (talk) 00:31, 1 June 2006 (UTC)
There already are: Stub-Class military history starting with 'B' (for fairly obvious reasons). Kirill Lokshin 00:33, 1 June 2006 (UTC)

I implemented the subpages, for now only for the military history articles (after a period of testing and comment I will expand that to the other lists). See Wikipedia:Version 1.0 Editorial Team/Military history articles by quality. Note that we need another template for header and footer for an index, as the current one is not appropriate. That is, the line

 Article Importance Date Assessment Version Comments 

should not be there. Comments? Oleg Alexandrov (talk) 04:39, 5 June 2006 (UTC)

Thanks for the change. With V0.5, we only have a few (~100) as yet, but we will at some point want to break up the lists into the categories - Arts, Natural Sciences, etc, these are all variables included in the {{V0.5}} template that the bot reads. Whenever you want to make this switch, go ahead. Will the statistics then give us the no. of Arts articles, the no. of Natsci articles, etc in the V0.5 table? Thanks, Walkerma 04:45, 5 June 2006 (UTC)
I don't think the statistics will for now separate between the various categories. I will work on that, eventually, after this subpage business is done with and the importance thing is implemented. Oleg Alexandrov (talk) 02:04, 6 June 2006 (UTC)