Wikipedia:Bots/Requests for approval/HersfoldBot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator:
Automatic or Manually Assisted: Automatic, but only run when supervised.
Function Overview: HersfoldBot will transwiki articles from Category:Copy to Wiktionary to Wiktionary using the Special:Import function. The bot will also need approval and admin/import rights over there before it can fully operate; however, the import function can be tested at test.wikipedia.org.
Edit period(s): When needed, probably no more than once a day.
Already has a bot flag (Y/N): No, and it should not need one unless required by policy - it would be preferred to have the bot's edits show up in RC so they can be noticed and the imported articles dealt with.
Function Details: HersfoldBot will collect the list of articles (only pages in the main namespace) from Category:Copy to Wiktionary and complete the following execution cycle for each article. The bot will ignore any article that has been tagged with {{TooManyRevisions}}; that template's function is explained later on.
- Determine if Wiktionary already has a page existing at wikt:Transwiki:<article name>.
- If not, the bot will attempt to import the full history of the article using Special:Import at Wiktionary (through the use of the API).
- If the import is successful, the bot will replace the Transwiki template ({{Copy to Wiktionary}} or one of its redirects) on Wikipedia with {{TWCleanup}} and log the transwiki both at Wikipedia:Transwiki log/Articles moved from here/en.wiktionary and wikt:Wiktionary:Transwiki log.
- If there is already a transwikied article by that title, the bot will not import, but simply replace the transwiki template with {{TWCleanup2}}.
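The cycle above can be sketched as a small decision function. This is only an illustrative sketch in Python, not the bot's actual source: the names `plan_actions`, `transwiki_exists` and `import_ok` are hypothetical, the wiki lookups and the import call are assumed to happen elsewhere, and it assumes the import is attempted only when no Transwiki page exists yet, as the last step implies.

```python
from typing import List, Optional

TOO_MANY_REVISIONS = "{{TooManyRevisions}}"


def plan_actions(wikitext: str, transwiki_exists: bool,
                 import_ok: Optional[bool] = None) -> List[str]:
    """Return the steps for one article (hypothetical helper).

    import_ok stays None until an import has actually been attempted.
    """
    # Simplified substring check: tagged articles are skipped entirely.
    if TOO_MANY_REVISIONS.lower() in wikitext.lower():
        return ["skip"]
    # A page already at wikt:Transwiki:<title> means no import is made.
    if transwiki_exists:
        return ["replace template with {{TWCleanup2}}"]
    if import_ok is None:
        return ["import full history via Special:Import"]
    if import_ok:
        return [
            "replace template with {{TWCleanup}}",
            "log at Wikipedia transwiki log",
            "log at Wiktionary transwiki log",
        ]
    # Failed imports are left to the error-handling rules.
    return ["consult error handling"]
```

Keeping the decision separate from the network calls would make each branch easy to check without touching either wiki.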
All of the actions the bot makes are logged to a text file on my computer so that I can review what happened, why it stopped running, and whether or not I need to go in myself to clean up some of the things it wasn't able to do (see next paragraph).
The bot has multiple safety checks built into it which will either stop it running or set it to ignore particular articles which have proven to be a waste of time.
- The bot will stop editing more-or-less immediately if it has new messages at either Wiktionary or Wikipedia.
- The bot will be unable to continue if it is blocked, and should exit cleanly if this proves to be the case.
- The bot will stop if it is unable to create or open the text log file on my computer. This happens before it tries to log in to either wiki, and in fact before I even enter the password.
- The bot will stop running if it encounters an IOException at any point (with one exception mentioned later), as these usually indicate a problem with the internet connection.
- The bot will stop running if it has inadvertently been logged out or finds that it does not have access to import articles.
- The bot will stop running if it gets a "cantimport", "badinterwiki", or unknown error back from the import API, as this indicates access has been denied, there is a problem with the hard-coded portion of the request URL, or something really bad happened.
- The bot will stop running if it encounters ten "notempdir" errors in a row - this is a server-side error, and may be only temporary; the counter allows the server time to correct itself without the bot stopping, but then will force the bot to stop if it seems the server is really having trouble.
- The bot will stop if I enter the wrong password or it otherwise fails to log in twice.
- The bot will mark within its log that manual review is needed in the following circumstances, but will not stop running:
- The bot receives an HTTP 504 error from the import API after attempting to import an article. In testing, it appears that this will sometimes occur when importing articles with high revision counts (roughly 200-300, I think), even though the import may successfully complete. The bot will also pause for five seconds to allow the server to recover.
- The bot receives a "filetoobig" error from the API. This will cause the bot to stop importing it and add {{TooManyRevisions}} to the article on Wikipedia, which will cause it to ignore the article on future runs.
- The bot receives ten "cantopenfile" errors from the API for the same article. This seems to occur at random for some articles, but repeatedly for articles with very high revision counts (estimated to be 300 or more, not reliably tested yet). This will cause the bot to stop importing the article and add {{TooManyRevisions}} to the article on Wikipedia, which will cause it to ignore the article on future runs.
- The bot will stop running if a total of three or more of the errors listed below occur during its run. While these errors do not necessarily indicate a problem by themselves (the import API appears to be only partially reliable at best), repeated occurrences could mean I need to check the code. Each time one of these errors occurs, it will be noted in the text log and the article will be re-added to the bot's import queue for a later attempt:
- The bot receives a "notoken" error from the import API.
- The bot receives a "badtoken" error from the import API.
- The bot receives a "nofile" error from the import API.
- The bot receives a "partialupload" error from the import API.
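The import-error rules above could be modelled roughly as follows. This is a hedged sketch in Python rather than the bot's real code: the class `ErrorPolicy`, its return values (`stop`, `retry`, `requeue`, `tag`) and the constant names are all hypothetical, and the HTTP 504 case, login checks and new-message checks are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict

# Errors that are logged and re-queued, but stop the run at three in total.
REQUEUE_CODES = {"notoken", "badtoken", "nofile", "partialupload"}
MAX_NOTEMPDIR_IN_A_ROW = 10
MAX_CANTOPENFILE_PER_ARTICLE = 10
MAX_REQUEUE_TOTAL = 3


@dataclass
class ErrorPolicy:
    notempdir_streak: int = 0
    requeue_total: int = 0
    cantopenfile: Dict[str, int] = field(default_factory=dict)

    def success(self) -> None:
        """A successful import breaks the run of consecutive notempdir errors."""
        self.notempdir_streak = 0

    def handle(self, title: str, code: str) -> str:
        """Map one import API error code to 'stop', 'retry', 'requeue' or 'tag'."""
        if code == "notempdir":
            self.notempdir_streak += 1
            if self.notempdir_streak >= MAX_NOTEMPDIR_IN_A_ROW:
                return "stop"
            return "retry"
        self.notempdir_streak = 0
        if code == "filetoobig":
            return "tag"  # add {{TooManyRevisions}}, flag for manual review
        if code == "cantopenfile":
            count = self.cantopenfile.get(title, 0) + 1
            self.cantopenfile[title] = count
            return "tag" if count >= MAX_CANTOPENFILE_PER_ARTICLE else "retry"
        if code in REQUEUE_CODES:
            self.requeue_total += 1
            return "stop" if self.requeue_total >= MAX_REQUEUE_TOTAL else "requeue"
        # cantimport, badinterwiki and any unknown code end the run.
        return "stop"
```

For example, `ErrorPolicy().handle("Some article", "cantimport")` returns `"stop"`, while the first two `"badtoken"` results would be `"requeue"` and a third error of that group would return `"stop"`.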
I will be placing the source code online soon at User:HersfoldBot/Source for review; that page will be fully protected. The code contains more complete documentation, as well as a slightly more detailed listing of the various conditions that will make the bot die (there are currently 30 different exit codes that indicate an error).
I would like to get approval here first, if possible. I have been unable to test the editing functions of this bot yet, and would like to be able to test that here before trying to get approval and admin rights over at Wiktionary. The import functionality has been tested at testwiki: and appears to work fine (see testwiki:Special:Contributions/HersfoldBot). Once operational, I will also look into transferring the logs the bot produces onto Wikipedia, somewhere within the bot's userspace.
Discussion
Wow, that's one BRFA you've got there Hersfold. Anyhow, by request, a quick criteria analysis from me (as urgency probably plays second fiddle to getting it perfect here):
- is harmless: that's what a period of debate and trial is for.
- is useful : Yes, passes that one easily enough.
- does not consume resources unnecessarily: Yep.
- performs only tasks for which there is consensus: I can't see this being a problem.
- carefully adheres to relevant policies and guidelines: I can't see any problems, and I trust an admin to know his way around them anyway.
- uses informative messages, appropriately worded, in any edit summaries or messages left for users: I would hope so.
So in summary, no problems so far, although one gets the feeling that some time, trial and error may be needed to get everything working perfectly. - Jarry1250 (t, c) 20:25, 9 March 2009 (UTC)
- Oh, definitely. You'll see at ]
- Seems harmless. Approved for trial (50 edits or 5 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. MBisanz talk 23:09, 9 March 2009 (UTC)
- Trial running now -
- The bot will handle test.wikipedia.org as Wiktionary; edits can be viewed at testwiki:Special:Contributions/HersfoldBot.
- The bot will use User:HersfoldBot/Wikipedia:Transwiki log/Articles moved from here/en.wiktionary as the local transwiki log since the articles aren't being transwikied to Wiktionary.
- The bot will edit the source articles here; however, all of those edits will be rolled back on completion to keep the articles categorized. Special:Contributions/HersfoldBot
- Once done, I will copy the text log to ]
- First trial run failed - seems there's an issue with the edit functions, so the bot stopped running due to repeated non-fatal errors. Taking a look to see what happened; the log will be available at the above link soon.
- Ok, trying this again after I clean up the mess on test wiki - I forgot to assign the results of some functions back into some strings, so when the bot tried to edit, it ended up not doing anything (on the articles) or overwriting the existing content (on the logs).
- And it messed up again. I'm going to try and figure out why it's not noticing these transwiki templates; the log editing seems to be correct now, but the article editing has some issues.
- The problem was that the matching was fully case sensitive, whereas template and link names are case insensitive for the first letter only. This has been fixed, so I'm running the bot again for the remaining 40 edits.
- Trial complete. The trial has finished. On the last run log, the bot imported 13 articles to test.wikipedia.org out of the 16 that it attempted. The 3 articles that it failed to import received "cantopenfile" errors; I'm not sure what caused ]
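The case-sensitivity fix mentioned above (template names are case insensitive for the first letter only) might look something like the sketch below. It is written in Python purely for illustration and is not taken from the bot's source; `template_pattern` and `replace_template` are hypothetical names, and nested templates inside the parameters are not handled.

```python
import re


def template_pattern(name: str) -> re.Pattern:
    """Build a regex for {{name}} whose first letter may be either case."""
    first, rest = name[0], name[1:]
    # Only the first character is case insensitive; the rest must match exactly.
    first_cls = "[" + re.escape(first.upper()) + re.escape(first.lower()) + "]"
    return re.compile(
        r"\{\{\s*" + first_cls + re.escape(rest) + r"\s*(?:\|[^{}]*)?\}\}"
    )


def replace_template(wikitext: str, old: str, new: str) -> str:
    """Swap the first {{old}} (with or without parameters) for {{new}}."""
    return template_pattern(old).sub(lambda m: "{{" + new + "}}", wikitext, count=1)
```

With this sketch, `{{copy to Wiktionary}}` would be recognised and replaced, while `{{Copy To Wiktionary}}` would be left alone because only the first letter is treated as case insensitive.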
I've just made some changes to the code to allow it to run through a GUI instead of the command line; could I get another trial to make sure it still works OK? The changes made will be logged in the bot's userspace shortly, although the changes made to the bot's operating code shouldn't have a substantial effect on how it runs.
- Screw it - I've been messing around with the GUI without actually running the bot and I can't get the output to work right. Command line's not awful anyway.
- Running bot again, limited to 24 total edits, which should be roughly eight articles. Again, any edits the bot makes to actual articles here will be rolled back.
- Oops. Forgot to unblock the bot.
- Trial complete. Seems to be working fine now - the bot's operation is unaffected.
- I was called to comment over at Wiktionary and left my comments there[1], but my main suggestions are to have it check for duplicates among main-namespace Wiktionary entries under the same name (not just Transwiki pages), and to give it a character limit on how long imported articles can be. Goldenrowley (talk) 02:48, 12 March 2009 (UTC)
- I've added the check for the main namespace; however, I'm still leery of the character limit for the reasons I explained on Wiktionary: some articles could be fairly sizeable but still useful to your purposes.
- (outdent) Still adding features as requested on Wiktionary; I'm going to hold off on the final trial until I get approval for test runs on their end or they stop throwing suggestions at me. Their suggestions include a lot of stuff that a Wikipedia editor wouldn't know about simply because it's about how they deal with things on their end.
- Approved for trial (50 edits or 7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Sounds fine to me. MBisanz talk 21:35, 24 March 2009 (UTC)
- Running into some minor problems with the import API - should be running smoothly in a moment.
- Trial complete. After fixing the API queries (I forgot to fix that bit of the code before running), the bot imported 10 articles, marked one for manual review due to its size, and removed another from the category since it already existed at Wiktionary. The bot ran for approximately 11-12 minutes and encountered no errors. A log of the imports it made can be viewed at ]
Approved.
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.