The Citation Tool is a semi-bot for finding citation errors and fixing them.

About

This page contains discussion of the Citation Tool semi-bot. Watch for announcements of development progress. A related project is the experimental /Hybrid referencing footnote style.

Version 1.0 is now live! It appears to be both functional and useful; for example, I identified errors in the <ref> markup for Race and intelligence and Jean Laplanche (two articles I happen to have worked on that use Cite).[1]

To learn more about the use of Python for tasks such as this, see Text Processing in Python.[2]

Features

Please feel free to add additional requested features to the list.

  1. Diagnose issues related to content in non-first named references.
    • Identify cases where multiple same-named references contain content. In such cases, the non-first content will not be rendered by <references/>.
    • Identify cases where an empty named reference occurs before the one (or more) with content, so that <references/> renders that note as empty.
    • Propose revision of article source with named reference content in first position. If multiple occurrences have contents, provide a manual choice of which one is the "authentic" note content.
  2. In a user-guided manner, convert m:Cite.php references that look like citations to either Harvard or Label reference templates.
    • What's the criterion for "looks like"? Maybe start with ones that consist entirely of {{cite XXX}} templates.
    • Any better idea of what a citation is (as opposed to a footnote), from a robot perspective?
  3. Create separate "Footnotes" and "References" sections for the two types of notes.
  4. Put the whole thing on a web interface that lets users make the necessary decisions with checkboxes and the like.
    • The final result should be text that a user may copy into an article. I definitely don't want to have some errant bot make bad decisions without human guidance.
  5. Convert bare references (i.e. [http://example.com/page.html]) to full {{cite web}} citation templates by following links and extracting metadata.
  6. Automatically "WebCite" (cache/archive) cited URLs. WebCite has a relatively straightforward XML-based API for this; see http://www.webcitation.org/faq. Caching cited URLs with WebCite prevents link rot and archives a snapshot of the page an author meant to cite. The cited URL could either be replaced by a WebCite link (which contains the cited URL and caching date, or a unique snapshot ID), or the WebCite link could be added alongside the originally cited URL. Replacement should be done only for new articles; otherwise we can't be sure whether the page has been updated or has disappeared.
  7. Identify reference strings that are out of chronology and allow for a quick fix.
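
As an illustration of the first feature, here is a minimal Python sketch of how the two named-reference problems might be detected. It is not the tool's actual code: the function name diagnose_named_refs, the regular expression, and the wording of the messages are all assumptions made up for this example, and the pattern deliberately ignores references that carry extra attributes (such as group=) or that sit inside comments or nowiki blocks.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches a self-closing named ref (<ref name="x" />)
# or a paired one (<ref name="x">content</ref>), with quoted or unquoted names.
# Refs with additional attributes are not matched by this simplified pattern.
NAMED_REF = re.compile(
    r'<ref\s+name\s*=\s*(?:"(?P<q>[^"]*)"|(?P<u>[^\s/>]+))\s*'
    r'(?:/>|>(?P<body>.*?)</ref\s*>)',
    re.IGNORECASE | re.DOTALL,
)


def diagnose_named_refs(wikitext):
    """Return {ref name: [problem descriptions]} for problematic named refs."""
    occurrences = defaultdict(list)
    for m in NAMED_REF.finditer(wikitext):
        name = m.group("q") if m.group("q") is not None else m.group("u")
        body = (m.group("body") or "").strip()
        occurrences[name].append(body)

    report = {}
    for name, bodies in occurrences.items():
        # Positions (in document order) of the occurrences that have content.
        with_content = [i for i, b in enumerate(bodies) if b]
        problems = []
        if len(with_content) > 1:
            problems.append("multiple occurrences contain content")
        if with_content and with_content[0] != 0:
            problems.append("empty occurrence precedes the content")
        if problems:
            report[name] = problems
    return report


if __name__ == "__main__":
    sample = (
        'A.<ref name="smith">Smith 2001</ref> '
        'B.<ref name="smith">Smith 2001, p. 4</ref> '
        'C.<ref name="jones" /> '
        'D.<ref name="jones">Jones 1999</ref> '
        'E.<ref name="ok">Fine</ref> F.<ref name="ok" />'
    )
    for ref_name, problems in diagnose_named_refs(sample).items():
        print(ref_name, "->", "; ".join(problems))
```

A report of this kind only flags candidates; choosing which occurrence holds the "authentic" note content is still left to the editor, in keeping with the user-guided approach described above.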

Bugs

Please list known problems here, as long as each is a pithy summary. For longer discussion of bugs (or misfeatures), use the talk page.

  1. Sometimes, with no obvious pattern, updating a page with the "Update using this WikiText" button incorrectly reports an edit conflict. In such cases, you may still copy and paste the proposed text into the Wikipedia edit window.

Mixed-style example

I believe that a mixture of annotational footnotes and citational references is often desirable for articles. A toy example of this style is at Wikipedia talk:Footnotes/Mixed citations and footnotes. An example "in the wild" of something similar is at Jello Biafra. Please consider whether such a mixed style would be useful for articles you actively edit. Over time, this tool will do more to help create this style.

Caveats

This tool is intended to aid editors in automated page editing. The decision to use a particular reference style within an article is a matter for the consensus of article editors. Do not modify citation/footnote styles in an article simply because a tool exists that makes the process easier; only do so because editors agree that a particular style is desirable.

Pages known to use m:Cite.php

These might be worth keeping an eye on for errors that creep in with editing.

See also

Footnotes

  1. If you use the identified test cases for testing, please roll back your changes after use so that other users see the same "typical" Cite errors when examining the test cases.
  2. (Mertz 2003)

References

  • Mertz, David (2003). Text Processing in Python (HTML). Addison-Wesley Professional. ISBN 0321112547.