Wikipedia:Bots/Requests for approval/Snotbot 2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator:
Time filed: 17:53, Tuesday February 8, 2011 (UTC)
Automatic or Manually assisted: Automatic
Programming language(s): Python
Source code available: Pywikipedia script; source code can be made available on request.
Function overview: Tagging image files which are used in video game articles with {{WikiProject Video games}}.
Links to relevant discussions (where appropriate): Wikipedia:Bot requests/Archive 40#Tagging files for WikiProject Video games
Edit period(s): One time run
Estimated number of pages affected: Maximum of about 20,000–25,000 images.
Exclusion compliant (Y/N): No
Already has a bot flag (Y/N): Yes
Function details:
Discussion
Quick update on the scope of the bot: In preparing the bot, I've found that there are exactly 28,060 unique files that are directly linked from WPVG articles (and by "directly linked", I mean that the article uses [[File:...]] or [[Image:...]] in the wikicode, not counting images transcluded from templates, although images used in
Could you post a list of some 200-300 random files from your list, so we can see how many are not actually relevant to the project? This wasn't answered by anyone in any of the discussions. —TALK 21:06, 11 February 2011 (UTC)
- Well, the false positives seem to be very rare. Let them not stand in the way of building the encyclopaedia.
- Approved for trial (150 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — TALK 21:13, 11 February 2011 (UTC)
- I don't think it's a terrible idea to get a random sample. Here's 300 files from the list, chosen completely at random. Let me know if you find any irrelevant ones, I'll check some of them out as well.
- Looks like there's a bunch of redlinks. I imagine that's because the bot doesn't distinguish between actual links and links that have been commented out in the wikicode (e.g. <!-- Unsourced image removed: [[Image:EndlessSaga1.jpg|250px| ]] -->). In any case, the bot is already programmed to make sure that a file actually exists before posting on its talk page, so those redlinks will be ignored. chat 21:36, 11 February 2011 (UTC)
- Anomie's lists already had the files, so that's where I looked and previewed a batch of 300 or so. — TALK 21:50, 11 February 2011 (UTC)
- Trial complete.
- 150 edits made with only one problem: The bot is skipping over files that are hosted on Commons. The API is telling the bot that the files don't exist (see this API query), and the bot just skips them. What do you think the bot should do in these cases? Even if it could search Commons to see if the file exists, it still probably wouldn't be useful to create a talk page on en-wiki for the file, right? I suppose the bot should just be skipping over these files anyway? spout 22:41, 11 February 2011 (UTC)
- Argh. It also appears that it blanked the existing talk page of 15 articles. I'll go through and fix them manually. Silly omission in the code. babble 22:49, 11 February 2011 (UTC)
- Problem fixed and code updated to add the template to the top of existing talk pages instead of overwriting them completely (assuming the talk page doesn't already have the WPVG banner or one of its redirects). babble 23:00, 11 February 2011 (UTC)
- Use prop=imageinfo to tell if the image exists (imagerepository in the response will tell you if it's local or Commons); prop=info gets you info on the description page whether or not the file itself exists. Anomie⚔ 18:29, 13 February 2011 (UTC)
- Thanks for the tip, I tried it out and it works. However, I don't think I should be creating talk pages here for images that are on Commons. Do you agree? soliloquize 16:38, 14 February 2011 (UTC)
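For reference, the talk-page fix reported during the trial (put the banner at the top of an existing talk page rather than overwriting it, and leave the page alone if the banner or one of its redirects is already there) could be sketched roughly as below. This is not the bot's actual source, which was not posted. The function names are made up for illustration, and "WPVG" as a redirect name is an assumption; the real list of redirects would have to be looked up on-wiki.

```python
import re

BANNER = "{{WikiProject Video games}}"
# "WPVG" is an assumed redirect name, used here only for illustration.
BANNER_NAMES = ["WikiProject Video games", "WPVG"]


def banner_regex(names):
    """Build a regex matching {{Name}} or {{Name|...}} for any given name.

    The first letter is matched in either case and spaces may be written
    as underscores, which is how template names are normally treated.
    """
    alternatives = []
    for name in names:
        head = "[" + re.escape(name[0].upper()) + re.escape(name[0].lower()) + "]"
        tail = "[ _]+".join(re.escape(part) for part in name[1:].split(" "))
        alternatives.append(head + tail)
    return re.compile(r"\{\{\s*(?:" + "|".join(alternatives) + r")\s*(?:\||\}\})")


def tag_talk_page(existing_text, names=BANNER_NAMES):
    """Return the new talk page text without discarding existing content."""
    if not existing_text or not existing_text.strip():
        # No talk page yet (or an empty one): the banner is the whole page.
        return BANNER
    if banner_regex(names).search(existing_text):
        # Already tagged, possibly via a redirect: leave the page unchanged.
        return existing_text
    # Existing discussion is kept; the banner goes on top.
    return BANNER + "\n" + existing_text
```

Calling `tag_talk_page("Old comment")` keeps the old comment and puts the banner on the line above it, which is the behaviour that would have avoided the 15 blanked pages.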
Only non-scope image I found was
Anyway,
- As the majority of our images are hosted locally (at least, all of the non-free ones), the hope was that the bot would skip the Commons images. All the iffy images should be at Commons (map of Nebraska, picture of a tiger, or whatever). Free video game images aren't terribly common, are easy to find at Commons, and don't require the maintenance that non-free images do. If any free images are hosted locally, I will move them to Commons. ▫ JohnnyMrNinja 18:22, 14 February 2011 (UTC)
- Sounds good. This is the bot's current behavior, so it shouldn't need any updating. I'm not currently at the PC that runs the bot, but I should have the ability to do another test run in a few hours. talk 18:52, 14 February 2011 (UTC)
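To illustrate the prop=imageinfo suggestion and the "skip Commons files" behaviour agreed above, here is a minimal sketch that classifies files from an API response. It assumes the legacy JSON layout, where `query.pages` is keyed by page id and `imagerepository` is `"local"`, `"shared"` (Commons), or empty when no file exists. The helper names and the sample response are hypothetical and hand-written, not real API output, and the bot's own code may well differ.

```python
def imageinfo_params(titles):
    """Query parameters for a prop=imageinfo request (send with any HTTP client)."""
    return {
        "action": "query",
        "prop": "imageinfo",
        "titles": "|".join(titles),
        "format": "json",
    }


def classify_files(response):
    """Map each file title to 'local', 'commons', or 'missing'."""
    result = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        repo = page.get("imagerepository", "")
        if repo == "local":
            status = "local"
        elif repo == "shared":
            status = "commons"
        else:
            # Empty or absent: no file behind this title (e.g. a redlink).
            status = "missing"
        result[page["title"]] = status
    return result


def files_to_tag(response):
    """Only locally hosted files get a talk-page banner; the rest are skipped."""
    return sorted(t for t, s in classify_files(response).items() if s == "local")


# Hand-written sample in the assumed response shape (titles are made up).
SAMPLE = {
    "query": {
        "pages": {
            "123": {"title": "File:Local example.png", "imagerepository": "local"},
            "-1": {"title": "File:Commons example.jpg", "missing": "",
                   "imagerepository": "shared"},
            "-2": {"title": "File:Redlink example.jpg", "missing": "",
                   "imagerepository": ""},
        }
    }
}
```

With the sample above, `files_to_tag(SAMPLE)` returns only the locally hosted title, so Commons files and commented-out redlinks both fall through without a talk page being created.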
Trial complete. I made 50 more random edits, and only one of them had an existing talk page,
What about if talk page uses {{
- Ahh, right. I haven't addressed that yet. My first thought is to just have the bot skip over any pages that use that template, and create a list of such pages. Then, if there are only a handful of pages (and I expect there will be very few, if any at all), I can just process them manually. If the list ends up being quite large, I can run a different script to process them on their own. However, I think that creating a complicated regex to deal with that situation is going to be a lot of effort to expend for one or two cases (and an increased risk of mistakes), especially considering that fewer than 150 pages in the File Talk namespace use that template (including its redirects). yak 22:31, 14 February 2011 (UTC)
- Does that work for you? speak 02:28, 16 February 2011 (UTC)
- I was thinking about a future case in which you could run this task for more projects, eventually tagging some files for multiple projects.
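The skip-and-list approach proposed above could look something like the sketch below. The template in question is not named here, so `SKIP_TEMPLATES` holds a placeholder, and the function names are likewise hypothetical. Matching is deliberately simple and case-insensitive, which is looser than the wiki's own template-name rules.

```python
import re

# Placeholder only: substitute the real template name and its redirects.
SKIP_TEMPLATES = ["Example shell template"]


def uses_template(wikitext, names):
    """True if the text transcludes any of the named templates."""
    for name in names:
        body = "[ _]+".join(re.escape(part) for part in name.split(" "))
        pattern = r"\{\{\s*" + body + r"\s*(?:\||\}\})"
        if re.search(pattern, wikitext, re.IGNORECASE):
            return True
    return False


def partition_pages(pages, skip_names=SKIP_TEMPLATES):
    """Split {title: wikitext} into titles to process and titles to skip.

    The skipped list is what would be reviewed by hand (or handed to a
    separate script if it turns out to be long).
    """
    to_process, skipped = [], []
    for title, text in pages.items():
        if uses_template(text, skip_names):
            skipped.append(title)
        else:
            to_process.append(title)
    return to_process, skipped
```

Keeping the skipped titles in a plain list avoids the complicated regex entirely: the main run never edits those pages, and the list shows at a glance whether manual handling is realistic.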
Anyway, Approved. —
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.