User talk:Citation bot/Archive 37

Source: Wikipedia, the free encyclopedia.

weird cite arxiv conversion

Status
{{wontfix}} for now, since so rare
Reported by
Headbomb 21:57, 2 October 2023 (UTC)
Relevant diffs/links
[1]
We can't proceed until
Feedback from maintainers


Possibly from garbage Pubmed metadata

Headbomb 21:57, 2 October 2023 (UTC)

Really weird that PubMed is now tracking arxiv stuff. AManWithNoPlan (talk) 15:14, 3 October 2023 (UTC)

Incorrect grab of title

Status
{{fixed}} unicode PHP oddity. Thank you for reporting.
Reported by
UtherSRG (talk) 16:53, 10 October 2023 (UTC)
What happens
Bot incorrectly grabs journal article title
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Homo_habilis&diff=prev&oldid=1179513770
We can't proceed until
Feedback from maintainers


Adds broken repository.upenn.edu URL

Status
{{fixed}} by adding repository.upenn.edu to the list of bad URLs
Reported by
  — Chris Capoccia 💬 13:39, 11 October 2023 (UTC)
What happens
Expanding a cite to JSTOR 2859808 adds a broken URL that redirects to a home page. Maybe they've changed their URL format? Correct URL for this case should be https://repository.upenn.edu/handle/20.500.14332/36341
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User%3AChris_Capoccia%2Fsandbox&diff=1179641487&oldid=1179641470
We can't proceed until
Feedback from maintainers


Journal capitalization

Status
{{fixed}}
Reported by
Super Dromaeosaurus (talk) 15:40, 12 October 2023 (UTC)
What happens
Bot keeps capitalizing a journal's name (examples from one article: [2], [3], [4]; examples from another: [5], [6], [7]).
What should happen
Leave it as is. I'm tired of this bot.
Relevant diffs/links
Gave them above.
We can't proceed until
Feedback from maintainers


The issue here should be to recognize language=rup

Headbomb 22:41, 13 October 2023 (UTC)

Thanks for the fix! Super Dromaeosaurus (talk) 22:33, 15 October 2023 (UTC)

Moving Jstor and Worldcat URLs to parameters

From discussions (1, 2, 3) on stopping useless cruft – for example this useless blank archive of a Jstor article – from semi-automated mass archiving, a number of editors have noted their support for a bot to parse Jstor and Worldcat URLs (eg https://www.jstor.org/stable/24432812) for their respective |jstor=24432812 and |oclc= parameters where relevant and purge URLs, archive URLs, and archive metadata for CS1 templates.

Is this something that can be done with citation bot? I will note that I'm not saying to purge all URLs – they can be useful if the full text is separately hosted elsewhere – just URLs and archives thereof (almost always useless blank pages) that are duplicative of the generated parameter URLs. Tagging GreenC. Ifly6 (talk) 06:19, 22 September 2023 (UTC)

The bot got blocked for doing this (although the person who led the charge on this themselves eventually got banned). The main argument was that the users of Wikipedia are only capable of clicking on title-links, and numbers after the reference are above their IQ level. Although I would argue that having these as title links is misleading since they almost never lead to the source, but just a page listing the source. AManWithNoPlan (talk) 13:09, 22 September 2023 (UTC)
That policy feels like insanity. Is it possible to determine whether the Jstor link leads to a full source and remove the URL (metadata, archive URL, and archive metadata) only if it does not lead to a full source? Worldcat is easier because it never(?) leads thereto. Ifly6 (talk) 14:11, 22 September 2023 (UTC)
I feel there's a case to remove links that will never host the full text, like PMID, OCLC, etc... because they mislead the reader into thinking there's a full text available at the end. But that would require an RFC.
Headbomb 03:34, 23 September 2023 (UTC)
Is it really the case that we cannot do anything to change this (to me at least) absurdist combination where the following series of events keep occurring:
  • People use Citoid which places Jstor links into {{cite journal}} |url=
  • Citation bot comes around and extracts the Jstor ID etc but doesn't remove the URL
  • Some NPC hits ARCHIVE EVERYTHING with the IA Bot check box (eg IA Bot) and now we have a massive pile of archive URL cruft (nb the check box does not actually archive anything)
  • After this rigmarole an editor can now see the result, which is:
    • A main URL that doesn't give you full text
    • A duplicated parameter which renders an identical URL link (|jstor=24432812)
    • An archive URL which is a literally blank page
    • A mark up reference which is now 70 per cent longer than it needs to be to do the exact same thing
Ifly6 (talk) 15:17, 25 September 2023 (UTC)
The bot used to do this until the argument was made that: our users were too stupid to figure out non-title links, and yet so smart that they needed links to scientific journals, since wikipedia was too simple for them. AManWithNoPlan (talk) 16:26, 25 September 2023 (UTC)

Is there really nothing we can do on this without an RFC? Ifly6 (talk) 17:13, 25 September 2023 (UTC)

Getting blocked twice for the same thing is probably an existential risk.
I think Headbomb makes a good point, removing title-links that don't contain full content and that can be replaced with non-title-links. Sometimes JSTOR has the full content sometimes not, sometimes freely accessible (pre-1923), sometimes not. As for archive URLs, this will depend what is cited, if the content is available in the archive URL. It's context sensitive. I would be careful with an RfC, they can be counter-productive with complex matters. An RfC might codify a minority opinion that bots should not be used at all due to "context sensitive" and the "community" will take care of it, which dooms the whole thing to fantasy land due to the reality of the scale.
It's possible a bot (this one or another) could start on JSTOR, determine content availability, url-status, and edit accordingly. It might also check archive URLs for possible problems. This is going to be a slow process, and it might run into bot blockers at JSTOR, rate limiting, which further complicates. If true that would leave the "blind" edit option of simply removing all JSTOR links from the title-link as the only viable method, unless someone has another idea how to determine content availability. -- GreenC 20:01, 25 September 2023 (UTC)
Some people have deeper access to JSTOR resources than others, depending on where they are. Surely when a JSTOR resource is cited, no-one is seriously suggesting that only open-access ones may be given? Is anyone suggesting that we deprecate ISBNs because <shudder> some readers might have to buy the actual book? Or have I completely missed the point? --𝕁𝕄𝔽 (talk) 22:57, 25 September 2023 (UTC)
Nobody is saying that Jstor should not be cited. The dispute here is whether a link to the Jstor page should be included in the URL parameter. For me this emerges from the really pointless practice of adding the "archive" version of Jstor links so you can get the glory of gazing upon a blank page. Removing the |url= entry would prevent "archive" links from being added. It is a dispute between whether a reference should look like this:
{{Cite journal |last=Steel |first=Catherine |date=2014 |title=The Roman senate and the post-Sullan "res publica" |journal=Historia: Zeitschrift für Alte Geschichte |volume=63 |issue=3 |pages=323–339 |doi=10.25162/historia-2014-0018 |jstor=24432812 |s2cid=151289863 |issn=0018-2311 }}
Or, by almost inevitable accretion through inaction, like this:
{{Cite journal |last=Steel |first=Catherine |date=2014 |title=The Roman senate and the post-Sullan "res publica" |journal=Historia: Zeitschrift für Alte Geschichte |volume=63 |issue=3 |pages=323–339 |doi=10.25162/historia-2014-0018 |jstor=24432812 |s2cid=151289863 |issn=0018-2311 |url=https://www.jstor.org/stable/24432812 |access-date=26 May 2022 |archive-date=26 May 2022 |archive-url=https://web.archive.org/web/20220526152815/https://www.jstor.org/stable/24432812 |url-status=live }}
The portions at the end after |url= entirely duplicate existing links in the citation and regardless add nothing for the unprivileged reader while clogging up the mark up and making it difficult to do the edit part of "editor". Even if I have Ivy League library access and am able to read all full texts through proxies (eg Penn Libraries), that doesn't mean that linking the proxy page whole (like https://www-jstor-org.wikipedialibrary.idm.oclc.org/) does any good for readers without Penn or Wikipedia library privileges. Ifly6 (talk) 23:37, 25 September 2023 (UTC)
The number of Wikipedians who potentially have access to JSTOR sources that are hidden by paywalls may be larger than you think. "Veteran" Wikipedians (I believe the cut-off is 500 life-time edits) can avail themselves of access to JSTOR (and many other sources barred to the hoi polloi) via the Wikipedia library. So I think for these relatively "privileged" people giving a link to a page that contains a doi is still useful. I have no problem doing it, also for sources like Cambridge U.P and the like. Ereunetes (talk) 23:37, 25 September 2023 (UTC)
If the purpose of adding |url=https://www.jstor.org/stable/24432812 is for the "average" reader this link does nothing because they will not have a Jstor subscription. If adding it is to help the "average" university student, the link also does nothing because they will have to go through their university proxy. If it is to help the privileged editor with WP:LIBRARY access, it also does nothing because we have to go through a proxy too. The only people it supports are those few who have direct access to Jstor (which ironically includes me via the Federal Reserve). Ifly6 (talk) 23:43, 25 September 2023 (UTC)
What do you mean "the link does nothing"? If someone has access to JSTOR, via WP:LIBRARY, their local public library, an academic library, or whatever, seeing that there is anything in the |jstor= parameter lets them know that the article is on JSTOR and they will likely have access to it, and once they click on the link they can easily log in via whatever gives them access via whatever proxy, or if they're physically at their library just click the link and access it. The JSTOR link also provides metabibliographic information, a first page preview, and abstract. Plus JSTOR allows independent researchers 100 free articles each month, and if someone so chooses they have the option to buy it à la carte. Anything which helps a reader access a source is useful, and quite often JSTOR is the electronic place of record for a journal. [Edit: sorry I'm following more closely now, I still think it should be in |jstor= -- that's why we have that parameter; it does not also belong in |url=.] Umimmak (talk) 23:49, 25 September 2023 (UTC)
What do you mean "the link does nothing"? The link to the native Jstor website in |url= is not the proper one and will not yield the full text unless you have direct Jstor access. If you access it through a proxy, you would have to copy the Jstor ID and paste it in after ../static/. Putting the direct URL in |url= is not very useful and largely facilitates WP:MEATBOTs crufting up articles with unnecessary mark up pointing to blank archive pages. Ifly6 (talk) 01:13, 26 September 2023 (UTC)
OK. Forgive my ignorance; I didn't know about the "jstor=" parameter and will use it in future if the case applies, instead of the "url=" parameter. Would it be possible to enable the Citation bot to change "url=" to "jstor=" if that would be appropriate? Or am I stupid again? Ereunetes (talk) 20:42, 27 September 2023 (UTC)
The discussion we are in is whether citation bot should extract Jstor URLs and put them into |jstor=. Apparently there was an RFD, ban, or something of the sort which has led to the maintainer(s) of the bot not being willing to re-enable that previously-present functionality. Ifly6 (talk) 21:54, 27 September 2023 (UTC)
The RFC you are looking for is this one.
Perhaps the maintainers of the bots should put together an FAQ somewhere about why the bot does some things that it does and some things that it does not with links to appropriate major discussions. Izno (talk) 00:37, 30 September 2023 (UTC)
I think it might be possible to effect a change like this if we take it slowly. If we can start with getting consensus that archives of paywall landing pages (like Jstor) should be removed, and |access-date= in {{cite book}} and {{cite journal}} (and maybe others) should be removed, we'll have solved almost the entire problem of these kinds of URLs without needing to determine whether or not readers / editors will understand the alternative stable identifiers. Folly Mox (talk) 04:44, 30 September 2023 (UTC)
While I agree that those two should be done, it doesn't appear to me to solve the problem of someone driving by to blindly hit the check box and add those archives back in. Ifly6 (talk) 23:28, 30 September 2023 (UTC)
Right, the prevention is more difficult than the cure, but if we have consensus to remove archives to paywall landing pages, we could get a bot to do it, and getting consensus to remove would be a step towards consensus against adding. I don't think this is a one-step recipe. Folly Mox (talk) 00:39, 1 October 2023 (UTC)
And there's no way to prevent the URLs and prefer custom stable identifiers. Citoid guarantees a valid URL in its output, and works across multiple projects, most of which don't implement custom stable identifiers. We'd have to get every maintainer of every automated referencing script, including VisualEditor, to build in functionality to reach our end goal here, which it's unclear if there's even consensus for in all facets. Folly Mox (talk) 00:43, 1 October 2023 (UTC)

Well that issue is why we're here at Citation bot. Do you think it's actually impossible to get a decision for Citation bot to remove those URLs? A bot to remove those archives would produce even more watchlist events, which people in the discussion below seem to be adamantly against, while also probably being impossible to implement per GreenC's comment above. Ifly6 (talk) 19:36, 1 October 2023 (UTC)

I don't know what venue should generate the consensus, but we do need the theoretical underpinnings of a discussion reaching consensus regarding archives of paywall landing pages before a Bot request or BRFA for a new task could be submitted. I wouldn't necessarily frame it as something that Citation bot in particular needs to handle, instead of some other bot, and I wouldn't want it to take place in absence of other constructive edits even though it doesn't violate COSMETICBOT.
So, I'd try to frame this bit of the discussion as "archives to paywall landing pages are useless cruft: they don't archive the content and you can't use them to navigate to the content", not "proposal for a one-time bot run to have User:Citation bot remove archives to paywall landing pages in 1,700,000 articles".
So no, I don't think it's actually impossible, and I think setting jstor.org to permalive for IABot is also a reasonable first step. Folly Mox (talk) 22:09, 1 October 2023 (UTC)
And I do appreciate that the discussion you opened on Wikipedia talk:Link rot#Mass additions of archive links for live sites is essentially a superset of the discussion I just proposed. Folly Mox (talk) 22:11, 1 October 2023 (UTC)
I'm not sure if this has been mentioned before, but just wanted to note that resources in JSTOR: Global Plants have URLs of the form plants.jstor.org/stable/10.5555/al.ap.person.bm000000658, and that if a bot naively took any |url= including a "jstor.org/stable/XXXXX" to turn it into a JSTOR 10.5555/al.ap.person.bm000000658 this would not work; occasionally JSTOR the website gets cited instead of a book/article it is hosting, so bots should just be aware of this. Umimmak (talk) 21:20, 5 October 2023 (UTC)
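The caveat above can be sketched in code. This is only an illustration in Python, not Citation bot's actual (PHP) logic: the function name `jstor_id_from_url`, the host list, and the numeric-only rule are all assumptions made for the example. It accepts only plain numeric stable IDs on the main JSTOR host and declines everything else, including Global Plants and proxy hosts.

```python
import re
from urllib.parse import urlparse

# Hosts treated as the main JSTOR site (an assumption for this sketch).
JSTOR_HOSTS = {"jstor.org", "www.jstor.org"}


def jstor_id_from_url(url):
    """Return the numeric stable ID from a plain JSTOR URL, or None.

    Hypothetical helper: anything that is not an all-digit ID on the
    main host is left alone rather than guessed at.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in JSTOR_HOSTS:
        # plants.jstor.org and library proxy hosts fall through here.
        return None
    match = re.fullmatch(r"/stable/(\d+)/?", parsed.path)
    return match.group(1) if match else None
```

With this rule, https://www.jstor.org/stable/24432812 yields 24432812, while the Global Plants URL quoted above is skipped. Being conservative means some valid non-numeric JSTOR IDs would also be skipped.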


{{wontfix}} because people are whiners. AManWithNoPlan (talk) 20:41, 24 October 2023 (UTC)

Proxy cleanup?

Status
{{fixed}}
Reported by
Headbomb 00:25, 27 September 2023 (UTC)
What should happen
[8] + more?
We can't proceed until
Feedback from maintainers


Treating a release date for content on a webpage as the publication date for that webpage

Status
{{fixed}} with a list of bad-date websites.
Reported by
AKiwiDeerPin (talk) 11:08, 10 October 2023 (UTC)
What happens
Uses metadata for media the webpage is about as metadata for the page itself (specifically, release/publication date). There are two comments on this page from within the last month, and several older comments, saying the bot changes web citations for pages about books to book citations; it seems to have problems handling webpages about other types of media.
What should happen
Don't know exactly where it got the date from, but adding dates to web citations seems tricky to properly automate, so maybe it shouldn't try to do that? It probably shouldn't act like citations for a webpage about a work are equivalent to citations for that work.
Relevant diffs/links
Diff here
We can't proceed until
Feedback from maintainers


Bot is not respecting Template:inuse

Status
{{nobots}} right next to it. And then remove together.
Reported by
Justin (koavf)TCM 14:55, 11 October 2023 (UTC)
What happens
Bot is not respecting {{inuse}}
What should happen
Just stop.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=The_Rest_%28EP%29&diff=1179644470&oldid=1179642372
We can't proceed until
Feedback from maintainers


The bot never respected {{

b
} 22:42, 13 October 2023 (UTC)

Then they can remove the tag and then run the bot. The bot should not cause edit conflicts by interfering when articles have this tag on them. ―Justin (koavf)TCM 22:47, 13 October 2023 (UTC)
The tag is often put to tell others to not edit the article so the bot can make its edits.
Headbomb 09:14, 14 October 2023 (UTC)
That's not how it should work: all bots should respect the tag. Other bots do and one should expect it to not edit with the tag on an article. ―Justin (koavf)TCM 10:31, 14 October 2023 (UTC)

From my discussion page: Hi, I see that you have used citation bot to add dates to references to numismatics.org.uk webpages here. I am not familiar with the bot, so could you explain what the dates mean? The pages seem to be updated regularly.

I think the bot is wrong. Grimes2 (talk) 14:41, 13 October 2023 (UTC)

Something (probably upstream in the Zotero libraries) is using meta property="article:published_time" instead of meta property="article:modified_time". Folly Mox (talk) 16:00, 13 October 2023 (UTC)
@Grimes2 No.bot 49.237.203.59 (talk) 06:06, 24 October 2023 (UTC)
{{fixed}} by adding to NO_DATE_WEBSITES array. AManWithNoPlan (talk) 20:35, 24 October 2023 (UTC)
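A host-based exclusion list of this kind could look roughly like the following. This is a sketch in Python rather than the bot's PHP; only the name NO_DATE_WEBSITES comes from the comment above. The list contents, the helper name `should_add_date`, and the matching rule are assumptions for illustration.

```python
from urllib.parse import urlparse

# Illustrative contents only; the real array is not shown in this thread.
NO_DATE_WEBSITES = ("numismatics.org.uk",)


def should_add_date(url):
    """Return False when the URL's host is on the no-date list.

    Hypothetical helper: matches the listed domain and its subdomains.
    """
    host = (urlparse(url).hostname or "").lower()
    return not any(
        host == site or host.endswith("." + site) for site in NO_DATE_WEBSITES
    )
```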

Publisher removed from cite book

Status
{{fixed}} with some code additions
Reported by
Umimmak (talk) 23:26, 15 October 2023 (UTC)
What happens
Publisher removed from cite book, not realizing that ProQuest was in fact the correct publisher (see copyright page [9])
What should happen
Publisher should not be removed from book citation
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Lustrum_(journal)&curid=72440503&diff=1180192312&oldid=1176169375
We can't proceed until
Feedback from maintainers


Authors incorrectly added

Status
{{fixed}}. New Republic now only adds one author.
Reported by
Klinetalk to me!contribs 01:21, 20 October 2023 (UTC)
What happens
citation bot picks up authors from other articles and adds them to the citation
What should happen
the author of the article should only be added
Relevant diffs/links
revision
We can't proceed until
Feedback from maintainers


Finna and Elonet are not book refs

[10] 2001:14BA:9CE5:8400:20AB:2C62:7318:4F88 (talk) 04:35, 22 October 2023 (UTC)

This is the third thread about this behaviour visible on this talkpage, and I'm beginning to wonder why, when editors cite a source to establish the existence of a book it is ever less preferable to include the full publication information of the book, even when the route chosen to establish the book's existence is a website somewhere.
I think the root solution here might be additional guidance about writing about books. Like, in an article about a book, just have a section in the appendix called Publication information. For articles about authors, put their books in Selected bibliography. I don't think this is the right kind of information for inside a citation template inside a pair of ref tags. Folly Mox (talk) 05:00, 22 October 2023 (UTC)
I agree with Folly Mox. If the purpose of the citation is to establish the existence of a book, full publication information should always be preferred to a website which says the exact same thing. If I need to establish that Erich S Gruen wrote Last generation of the Roman republic surely the best way to do that would be to give you all the information you would need to find that book in a library and confirm on the cover, title page, and verso for yourself. Ifly6 (talk) 15:17, 23 October 2023 (UTC)
And some more:
Probably some more can be found among these causing ref errors, e.g. [11], [12]
2001:14BA:9CE5:8400:8CDE:6F36:A6DA:6CE6 (talk) 18:14, 23 October 2023 (UTC)
Also, not just elonet.finna.fi but also elonet.fi it seems: [13], [14] (probably some more). Please stop the bot from changing the citation templates of elonet.fi and finna.fi from "cite web" to "cite book", thank you. 2001:14BA:9CE5:8400:79D9:9129:F234:CDFA (talk) 20:03, 24 October 2023 (UTC)
{{fixed}} AManWithNoPlan (talk) 20:27, 24 October 2023 (UTC)

At the Kenny Clarke article in the oral history ref, the bot changes "Cite web" to "Cite journal" without changing any other parameters, causing this error message. While checking hidden categories on that page, I discovered that the bot did this in June 2022 and I reproduced the problem just now. Graham87 (talk) 06:50, 23 October 2023 (UTC)

{{fixed}} with special code for 10.7282 DOIs that have odd meta-data. AManWithNoPlan (talk) 20:33, 24 October 2023 (UTC)
{{fixed}} for all cases with no |work= AManWithNoPlan (talk) 20:56, 24 October 2023 (UTC)

Journal capitalization

Status
{{fixed}}
Reported by
Super Dromaeosaurus (talk) 09:23, 25 October 2023 (UTC)
What happens
Another instance of improper journal capitalization. This time in Serbian.
What should happen
Leave as is.
Relevant diffs/links
[15]. "i" is "and" in Serbian; not even in English should "and" be capitalized.
We can't proceed until
Feedback from maintainers


web > book template bug

Status
{{fixed}} by adding that website to special case code
Reported by
Spinixster (chat!) 12:53, 30 September 2023 (UTC)
What happens
template type altered from web to book
What should happen
not that, since it generates an error and it's also not a book
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Shoto_Todoroki&diff=prev&oldid=1177916086
We can't proceed until
Feedback from maintainers


Spinixster (chat!) 12:53, 30 September 2023 (UTC)

Created {dead link...} with invalid parameter.

Status
{{wontfix}} on rescuing, since we do not do that type of thing
Reported by
A876 (talk) 05:17, 27 October 2023 (UTC)
What happens
When {cite ...} has |url-status=dead but lacks |archive-url= and |archive-date= , Citation bot removes |url-status=dead and adds {dead link...} after, with parameter |dd mmmm yyyy.
What should happen
 
  • {dead link...} needs parameter |date=mmmm yyyy. (As changed by AnomieBOT and AManWithNoPlan.)
  • Should have fixed all 4 places.
  • Shouldn't insert a space before the }}.
  • COULD HAVE done the rescues too (like #IABot), to get more done in one edit.
Relevant diffs/links
this diff
Replication instructions
Find a dead URL. Add |url-status=dead , but no |archive-url= and |archive-date=. Wait for Citation bot.
We can't proceed until
Feedback from maintainers


The first instance, starting from my edit:

  • |access-date=February 13, 2011 |url-status=dead}}</ref> (by me) (and 3 more like it) (Not official, though I don't see why.) (Nannyware keeps me from viewing any "archive" websites and/or I didn't have time.)
  • |access-date=February 13, 2011 }}{{dead link|29 September 2023}}</ref> (by Citation bot, Misc citation tidying...) (Made it official, but wrong format) (only got 3 of 4 instances) (disoptimal - should be no space before the closing }}.)
  • |access-date=February 13, 2011 }}{{dead link|date=September 2023}}</ref> (by AnomieBOT, Dating maintenance tags...) (Corrected the 3 changed by Citation bot.)
  • (by AManWithNoPlan, properly flag dead link) (Fixed the 4th instance to match the other 3.)
  • |access-date=February 13, 2011 |archive-date=December 18, 2010 |archive-url=https://web.archive.org/web/20101218224918/http://makingitbigcareers.com/books/making-it-big-in-software/mark-russinovich/ |url-status=dead }} (by AManWithNoPlan), Rescuing 4 sources and tagging 0 as dead.) #IABot (v2.0.9.5)) (disoptimal - field order should be |access-date= |url-status= |archive-url= |archive-date=.)
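The tag format requested in this report (a named |date= parameter holding month and year only) can be sketched as below. This is an illustration in Python, not the bot's code; the function name `dead_link_tag` is hypothetical, and month names are spelled out so the result does not depend on the system locale.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def dead_link_tag(today):
    """Build a dead link tag using |date=Month YYYY (no day, named parameter)."""
    return "{{dead link|date=%s %d}}" % (MONTHS[today.month - 1], today.year)
```

For 29 September 2023 this gives {{dead link|date=September 2023}}, matching what AnomieBOT corrected the tags to in the sequence above.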

Config issue

Status
{{fixed}} itself. Weird.
Reported by
Comfr (talk) 19:04, 27 October 2023 (UTC)
We can't proceed until
Feedback from maintainers


Today I do not see "Expand citations" in my tools menu. I do not know what caused it to disappear.

Today I logged off and logged back onto Wikipedia, and Expand citation was back on my Tools menu.

MDPI url ending in /pdf-vor or /pdf

Status
{{fixed}}
Reported by
Headbomb 01:33, 2 November 2023 (UTC)
What should happen
[16] [17]
We can't proceed until
Feedback from maintainers


Treat as if /pdf-vor or /pdf isn't there.

Headbomb 01:33, 2 November 2023 (UTC)
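One way to read "treat as if /pdf-vor or /pdf isn't there" is as a URL normalization step. The following Python sketch assumes that reading; the function name `normalize_mdpi_url` is hypothetical, and dropping the query string along with the suffix is a choice made for the example, not something stated in the thread.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_mdpi_url(url):
    """Strip a trailing /pdf or /pdf-vor from an MDPI URL; leave others alone."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "mdpi.com" and not host.endswith(".mdpi.com"):
        return url
    path = re.sub(r"/pdf(?:-vor)?/?$", "", parts.path)
    if path == parts.path:
        # No PDF suffix present, so nothing to change.
        return url
    # Assumption: query and fragment belong to the PDF view and are dropped.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```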

Not finding author names

Status
{{wontfix}} adding correct authors, since it is a PDF
Reported by
Chidgk1 (talk) 14:27, 2 November 2023 (UTC)
What happens
uses "Object" as a name
What should happen
add the 3 names Berberoglu, S, Cilek, A, Kirkby, M
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Agriculture_in_Turkey&curid=15948548&diff=1183153893&oldid=1183153147
We can't proceed until
Feedback from maintainers


Incorrect article title from archive

Status
{{fixed}} by rejecting "latin1" as a character set.
Reported by
GoingBatty (talk) 10:08, 28 October 2023 (UTC)
What happens
Incorrect article title
What should happen
this (or nothing)
Relevant diffs/links
this edit
We can't proceed until
Feedback from maintainers


https://web.archive.org/web/20180312205216/http://www.zdnet.co.kr/news/news_view.asp?artice_id=20140408103154

Another example: Special:Diff/1182437641 - https://web.archive.org/web/20150314222820/http://kharkivoda.gov.ua/en/ GoingBatty (talk) 22:26, 29 October 2023 (UTC)

Book Chapter Quotes

Status
{{fixed}}
Reported by
Susmuffin Talk 00:15, 1 November 2023 (UTC)
What happens
The bot replaced singular quotation marks with double ones in the title of a book chapter, causing citation style issues.
What should happen
Not that
Relevant diffs/links
[18]
We can't proceed until
Feedback from maintainers


watch out for usurped websites, importing spam text

Status
{{wontfix}}, since that title is too rare. We do block a lot
Reported by
billinghurst sDrewth 10:08, 2 November 2023 (UTC)
What happens
when the bot is running and it pulls in titles it is pulling in title of usurped websites, so the title is spam
What should happen
there should be a blacklist of terms that are not able to be applied, and ideally would help report usurp sites
Relevant diffs/links
special:diff/1175534976
We can't proceed until
Feedback from maintainers


This is a WP:JUDI, so they can be repaired by the automated processes there. There are many 100s of domains, and more being added all the time. It's never ending. If you do match on these titles, it's important to know which domains, so I can usurp them. Currently I find them by looking for spam titles in the wiki code, most often added by Citation bot. If CB were to stop adding these titles, it would be very difficult to find the usurped domains. I've added khyber.org to the list: Special:Diff/1179566502/1183150376 -- GreenC 13:57, 2 November 2023 (UTC)

Cosmetic edit


That is super odd. AManWithNoPlan (talk) 21:24, 2 November 2023 (UTC)

The bot adds something and then removes it during the clean-up phase at the end. The net result is that the missing space is added. AManWithNoPlan (talk) 00:56, 5 November 2023 (UTC)

Bot breaking refs

changing a date to today and breaking a ref in the process. I am sick and tired of Citation bot going around breaking sfn refs willy nilly. DuncanHill (talk) 22:05, 2 November 2023 (UTC)

{{fixed}} the bug that was causing extra book clean-ups. But, no idea where that date came from. AManWithNoPlan (talk) 00:55, 5 November 2023 (UTC)

Bot ignores comments

Status
{{fixed}}
Reported by
Headbomb 22:55, 3 November 2023 (UTC)
What happens
[20]
We can't proceed until
Feedback from maintainers


title/chapter

Status
{{fixed}}
Reported by
Boghog (talk) 08:29, 4 November 2023 (UTC)
What happens
[21] in {{cite book}} template, |title=Medical Microbiology replaced with the name of the chapter
We can't proceed until
Feedback from maintainers


Four Weird Behaviours

Status
{{Fixed}} several different ways including dates that are today.
Reported by
UndercoverClassicist T·C 22:07, 4 November 2023 (UTC)
What happens
Four strange things seem to have happened sporadically in this edit:

1. Bot seems to be replacing (only very occasionally) "year=[a year]" with "date=[today's date]" - there seems to be no particular pattern to which sources these are, and none of them are 2023 sources.
2. Where a journal is published in single-issue volumes (this particular one is Archaeological Reports), the bot has erroneously expanded "volume=14" to "volume=14|issue=14", though no issue 14 that year or ever existed.
(Edit:) 3. Replaced:

{{cite book| last=Gates| first=Charles| year=2004| chapter=The Adoption of Pictorial Imagery in Minoan Wall Painting: A Comparativist Perspective| series=Hesperia Supplements| title= ΧΑΡΙΣ: Essays in Honor of Sara A. Immerwahr| editor-last=Chapin| editor-first=Anne P.| publisher=American School of Classical Studies at Athens| place=Athens| isbn=978-0-87661-533-1| pages=27–46| jstor=1354061}}

with {{cite book| last=Gates| first=Charles| title=The Adoption of Pictorial Imagery in Minoan Wall Painting: A Comparativist Perspective| series=Hesperia Supplements| date=4 November 2023| volume=33| editor-last=Chapin| editor-first=Anne P.| publisher=American School of Classical Studies at Athens| place=Athens| isbn=978-0-87661-533-1| pages=27–46| jstor=1354061}}

Overriding the (correct) description of this article as a chapter in a Festschrift and that Festschrift's title.

4. It added the page range 1-370 to a cite book template, where the whole book is being cited.

Relevant diffs/links
[22]
We can't proceed until
Feedback from maintainers


Strip underscore from authorlinks

Status
{{fixed}}
Reported by
Headbomb 09:12, 17 October 2023 (UTC)
What should happen
[23]
We can't proceed until
Feedback from maintainers


And similar wikilink parameters.

Headbomb 09:12, 17 October 2023 (UTC)

The following? (this would be compared after removing numbers and dashes):

authorlink chapterlink contributorlink editorlink episodelink interviewerlink inventorlink serieslink subjectlink titlelink translatorlink 

AManWithNoPlan (talk) 13:22, 25 October 2023 (UTC)

I think so, yes.
Headbomb 20:39, 25 October 2023 (UTC)
No such thing as |chapter-link=.
Trappist the monk (talk) 20:47, 25 October 2023 (UTC)
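The matching rule described above (compare the parameter name after removing numbers and dashes, then strip underscores from the value) can be sketched as follows. This is an illustrative Python version, not the bot's PHP; the function names are hypothetical, and chapterlink is left out of the set following the note that no such parameter exists.

```python
import re

# Names from the list above, minus chapterlink (no such parameter).
LINK_PARAMS = {
    "authorlink", "contributorlink", "editorlink", "episodelink",
    "interviewerlink", "inventorlink", "serieslink", "subjectlink",
    "titlelink", "translatorlink",
}


def is_link_param(name):
    """True if the name matches a link parameter once digits and dashes are removed."""
    return re.sub(r"[\d-]", "", name.strip().lower()) in LINK_PARAMS


def clean_link_value(name, value):
    """Replace underscores with spaces, but only in wikilink parameters."""
    if is_link_param(name):
        return value.replace("_", " ")
    return value
```

Under this rule author-link2 and editor1-link both match, while |url= values keep their underscores.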

Creates title with replacement character

Status
{{notabug}} sadly
Reported by
GoingBatty (talk) 04:13, 5 November 2023 (UTC)
What happens
this edit
What should happen
this edit
We can't proceed until
Feedback from maintainers


That is the title CrossRef has https://search.crossref.org/?from_ui=yes&q=10.1063%2Fpt.6.4.20200327a AManWithNoPlan (talk) 12:23, 5 November 2023 (UTC)

Is there a way Citation bot could check to see if its output matches any characters like that, and decline to fill the value if it does? GIGO is still undesirable behaviour that can be addressed. Folly Mox (talk) 13:42, 5 November 2023 (UTC)
A title with partial garbage beats no title at all. Especially since corrupt characters are easily reviewable and rarely last more than a few hours on Wikipedia.
b
}
18:39, 5 November 2023 (UTC)
How will Citation bot know not to alter an existing title to an incorrect one if it doesn't check its output for obvious errors, like a glyph not used in any language? I don't agree that known incorrect output is superior to declining to return a value, but I understand that's probably a philosophical position: software limitations are safer than software errors. Folly Mox (talk) 19:28, 5 November 2023 (UTC)

Below is a method to detect replacement characters. (Not pretty but works.) I added some inline comments because it's an obscure language.

isbinary() in Nim
#
# Return true if string contains a 'replacement' or binary character (black diamond with ? in middle)
#   Based on: https://unix.stackexchange.com/questions/474709/how-to-grep-for-unicode-in-a-bash-script/474812#474812
#   Requires a secondary shell layer so UTF-8 works
#   tcsh -s 'grep -axv ".*" <filename>'
#
proc isbinary*(s: string): bool {.discardable.} =

  result = false                                                            # default return value
  let tmpfile = mktempname(GX.ramdir & "isbinary.")                         # Generate a temporary and unique filename "isbinary.xxx" to be located in a ramdisk directory for speed
  s >* tmpfile                                                              # Write the string to the tempfile
  let command1 = "tcsh -c 'grep -axv \".*\" \"" & tmpfile & "\" | wc -l'"   # need to use tcsh -c for UTF-8 to work. Bash with similar -c might also work.
  let c1 = runshellBasic(command1)                                          # run the shell command and capture output to c1
  if strip(c1) !~ "^0$":                                                    # If the output is not "0" (only) then it contains a replacement character.
    result = true
  removeFile(tmpfile)                                                       # Delete the temp file and return 'result' 

— Preceding unsigned comment added by GreenC (talkcontribs) 16:32, 5 November 2023 (UTC)
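
For comparison, a check of this kind needs no temporary file or shell in a language with Unicode-aware strings. The Python sketch below is only an illustration under that assumption (the function names are hypothetical, and this is not how Citation bot does it): it covers both a literal U+FFFD already present in decoded text and raw bytes that are not valid UTF-8.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, shown as a black diamond with a question mark


def has_replacement_char(text: str) -> bool:
    """True if the decoded text already contains U+FFFD."""
    return REPLACEMENT_CHAR in text


def is_valid_utf8(raw: bytes) -> bool:
    """True if the raw bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A caller could decline to fill a title when has_replacement_char() is true, which is the behaviour requested earlier in this thread.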

@AManWithNoPlan: Thank you for the explanation. I reported the issue to Crossref. GoingBatty (talk) 19:26, 5 November 2023 (UTC)

Enhancement request - Could new OECD data site be supported?

Status
{{wontfix}} - dynamic JavaScript-controlled, title-less webpages, and the bot uses https://en.wikipedia.org/api/rest_v1/#/Citation/getCitation which is outside our control
Reported by
Chidgk1 (talk) 08:56, 6 November 2023 (UTC)
What happens
nothing
What should happen
As this new site at https://data-explorer.oecd.org/ will presumably be useful for very many articles it would be great if citation bot could fill in the tedious details - for example for the first cite in Agriculture_in_Turkey#Subsidies
We can't proceed until
Feedback from maintainers


Enhancement request - support more pdfs from major organisations

As you know the "automatic" option in the Visual Editor cite button does not support any pdfs, perhaps because it would be too slow. As this bot is not constrained as much for time it would be great if the bot could expand more pdfs from major organisations. For example the second cite in Agriculture_in_Turkey#Subsidies namely https://www.g20.org/content/dam/gtwenty/gtwenty_new/document/G20_Report_on_Macroeconomic_impacts.pdf Chidgk1 (talk) 09:08, 6 November 2023 (UTC)

{{wontfix}} - PDF files, and the bot uses https://en.wikipedia.org/api/rest_v1/#/Citation/getCitation which is outside our control

Could dergipark.org.tr journals be expanded

Status
{{ which is outside our control
Reported by
Chidgk1 (talk) 10:09, 6 November 2023 (UTC)
What happens
nothing - at least one dergipark.org.tr journal not being expanded
What should happen
it would be good if they could be expanded as there are a lot of journals on that cite - I am not sure how many are affected. If this is a problem on their side I could contact them if you tell me what to ask them
Replication instructions
Run on Agriculture in Turkey and see whether the cite at the end of the sentence about einkorn gets expanded
We can't proceed until
Feedback from maintainers


adds |chapter= to {{citation}} templates that have |work=

Status
{{fixed}} - {{citation}} is such a pain to |work=with
Reported by
Trappist the monk (talk) 14:05, 7 November 2023 (UTC)
What happens
adds |chapter= to {{cite periodical}}, {{cite web}}) and not supported in the cs2 {{citation}} template when it is configured as a periodical template.
What should happen
These particular {{citation}} templates are best written as {{cite Australian Dictionary of Biography}} templates. In no case should the bot add a |chapter= alias to {{citation}} when there is a |work= alias with an assigned value present in {{citation}}.
Relevant diffs/links
Special:Diff/1183892109, Special:Diff/1183891942, Special:Diff/1183927757, Special:Diff/1183891483
We can't proceed until
Feedback from maintainers


weird journal overwriting

Status
GIGO {{fixed}}
Reported by
b
}
04:29, 7 November 2023 (UTC)
What happens
[24]
What should happen
leave it alone
We can't proceed until
Feedback from maintainers


This is the sort of thing that happens when you have a bot whose entire philosophy is "anything the publisher says must be correct and any deviation from that by the person who formatted the citation in the first place must be an error". That aside, the citation was garbage to begin with, as you might have guessed. The doi and editors go to the book "Casimir physics", the arxiv goes to a chapter inside the book (whose author and title are not mentioned), and the pmid and authors go to a different paper "Casimir physics". It looks like the sort of thing that happens when one bot makes a mistake and adds the wrong id to a citation and another bot runs with it to fill in all the details and remove the other details that don't fit. In the long run as humans get tired of chasing after bot mistakes all our citations will become this garbled. —David Eppstein (talk) 07:48, 7 November 2023 (UTC)
So much GIGO in that reference. I have cleaned up the reference and removed the bogus PMID, etc. AManWithNoPlan (talk) 14:10, 7 November 2023 (UTC)
I didn't even notice the GIGO. That was a bad one.
b
}
21:26, 7 November 2023 (UTC)

Could a simple overview be added to the "function summary" please

Although this is a very useful bot I am struggling to understand what it can and cannot do and how it works.

I recently submitted a bug report, and a couple of enhancement requests asking if the bot could be run on pdf files and they were immediately closed because there is an api which is "outside our control".

Could the "function summary" be rewritten with a first paragraph to explain what the bot does in very simple terms and a second para to explain how it works in very simple terms and relegate the technical explanation to later paragraphs?

Also it would be useful if the "won't fix" could be left here for a couple of days for us to read rather than being immediately archived.

Chidgk1 (talk) 12:21, 7 November 2023 (UTC)

You can always read the archives. AManWithNoPlan (talk) 14:13, 7 November 2023 (UTC)
Lack of PDF support added to description. AManWithNoPlan (talk) 15:45, 7 November 2023 (UTC)
documentation {{fixed}} AManWithNoPlan (talk) 02:23, 11 November 2023 (UTC)

More conferences: 10.21437/...

Status
{{fixed}}
Reported by
b
}
03:10, 9 November 2023 (UTC)
What should happen
[25] [26], [27], [28], [29], ...
We can't proceed until
Feedback from maintainers


Caps: Antibiotiki i Khimioterapiia

Status
{{fixed}}
Reported by
b
}
22:41, 10 November 2023 (UTC)
What should happen
https://en.wikipedia.org/w/index.php?title=Riamilovir&diff=prev&oldid=1184518817
We can't proceed until
Feedback from maintainers


And it should leave every other 'I' alone too. This is particularly annoying. The only 'I's that need capitalization are those from Part I, Section I, etc...

b
} 22:41, 10 November 2023 (UTC)

Edit broke template

Status
{{fixed}}
Reported by
GoingBatty (talk) 04:18, 12 November 2023 (UTC)
What happens
This edit added the template to Category:CS1 errors: redundant parameter, so I reverted the edit
What should happen
Leave the template alone
We can't proceed until
Feedback from maintainers


Caps: iSciences

Status
{{fixed}}
Reported by
b
}
07:49, 19 November 2023 (UTC)
What should happen
[30]
We can't proceed until
Feedback from maintainers


Adding CS1|2 templates to bare URLs badly

Status
{{not a bug}}
Reported by
Folly Mox (talk) 12:06, 20 November 2023 (UTC)
What happens
Special:Diff/1186015763: Citation bot takes a bare URL, and wraps it in {{cite web}}, adding a |title= parameter scraped from meta tags in the HTML header, and absolutely nothing else.
What should happen
Special:Diff/1186017469: actual citation information is added to the reference, based on the source.
We can't proceed until
Feedback from maintainers


If the goal is to wrap a URL in a citation template so Internet Archive picks it up, and there are no good translators available for the domain, just set the |title=(URL) so it's obvious the citation is incomplete and needs work. This sort of lazy not-citation is essentially worthless, and encourages people to use scripts for tasks the scripts are not ready to handle, instead of putting in the one minute of work it takes to create a real citation by looking at the source.

If Citation bot can't figure out anything from the URL except the title, it should either leave the link alone, set the title to the URL, or tag its change with a template like {{citation needs human review}} so this sort of garbage can be tracked.

Apologies for the strong language, but if we train a whole generation of editors to rely on pushbutton non-solutions, the maintenance burden of trash citations is going to outpace our capacity and never be fixed. Folly Mox (talk) 12:06, 20 November 2023 (UTC)

I hate this, and I'm not sure it has consensus, but even I know it's not a bug.  Request withdrawn Folly Mox (talk) 15:12, 26 November 2023 (UTC)

Garbage title from scraping captcha

Status
{{fixed}}
Reported by
Folly Mox (talk) 08:08, 25 November 2023 (UTC)
What happens
attempts to fix "Archived copy as title" by scraping metadata from captcha
What should happen
basic sanity checking should be done on proposed output before committing edit
Relevant diffs/links
Special:MobileDiff/1186756111
We can't proceed until
Feedback from maintainers


Allow 1-click activation of category run on Category:CS1 maint: unflagged free DOI, much like Category:CS1 errors: DOI and the others

Status
{{fixed}}
Reported by
b
}
17:31, 25 November 2023 (UTC)
We can't proceed until
Feedback from maintainers


Also

b
} 12:23, 26 November 2023 (UTC)

TNT journal = National Library of Medicine

Status
{{wontfix}} - obscure and sometimes correct, in which case should be work or website.
Reported by
b
}
03:21, 24 November 2023 (UTC)
What should happen
[31]
We can't proceed until
Feedback from maintainers


For books, fetch publishers

Status
{{wontfix}} - CrossRef does not support
Reported by
b
}
14:01, 25 November 2023 (UTC)
What should happen
[32]
We can't proceed until
Feedback from maintainers


biorxiv is not a journal

Status
{{fixed}}
Reported by
David Eppstein (talk) 17:23, 25 November 2023 (UTC)
What happens
Citation bot sees a {{cite web}} with a doi url, but the url goes to bioRxiv, tries to convert it to cite journal, and fails, producing a buggy citation with no periodical
What should happen
not that
Relevant diffs/links
Special:Diff/1186759015
We can't proceed until
Feedback from maintainers


last1=(punctuation mark)

Status
{{fixed}}
Reported by
Folly Mox (talk) 21:38, 25 November 2023 (UTC)
What happens
adds bogus parameter consisting wholly of punctuation
What should happen
basic sanity checking should be performed on proposed output before edit is published
Relevant diffs/links
Special:MobileDiff/1186845416. Here, Citation bot provides the obviously erroneous |last1=&#124 (the pipe character |).
We can't proceed until
Feedback from maintainers


Another instance of this same error at Special:MobileDiff/1186954603. Folly Mox (talk) 16:36, 26 November 2023 (UTC)
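
One way the requested sanity check could look, as a hedged Python sketch rather than the bot's actual fix: decode any HTML character references first (so that &#124 becomes the pipe character), then reject a name that is empty or made up only of punctuation, symbols and separators. The function name is hypothetical.

```python
import html
import unicodedata


def is_bogus_name(value: str) -> bool:
    """True if a proposed author name is empty or consists wholly of punctuation."""
    # html.unescape also accepts numeric references without the trailing semicolon.
    text = html.unescape(value).strip()
    if not text:
        return True
    # Unicode general categories starting with P (punctuation), S (symbol), Z (separator).
    return all(unicodedata.category(ch)[0] in "PSZ" for ch in text)
```

A value such as |last1=&#124 would be rejected, while ordinary names containing an apostrophe or hyphen still pass because they also contain letters.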

Better IEEE Xplore handling

Status
{{fixed}} mostly, not all is doable without human help
Reported by
b
}
00:02, 16 November 2023 (UTC)
What happens
[33]
What should happen
[34]+[35]
We can't proceed until
Feedback from maintainers


The garbage human-entered title prevented the full expansion. Wondering if we can't just yeet the title out when converting a cite web to a cite journal/book for ieeexplore links. It's a highly-reliable database. Either way, the website= parameter should be nuked.

b
} 00:02, 16 November 2023 (UTC)

IEEE Xplore part deux

Status
{{fixed}} mostly, not all is doable without human help. Some things will still require multiple edits, since the bot does things in a certain order and re-checking everything for a few rare cases can really slow the bot down
Reported by
b
}
00:06, 16 November 2023 (UTC)
What happens
[36]+[37]+[38]
What should happen
Should happen in one edit, including [39]
We can't proceed until
Feedback from maintainers


Better handling of SPIE proceedings

Status
{{wontfix}} - limited by meta-data.
Reported by
b
}
01:16, 19 November 2023 (UTC)
What should happen
[40]

and

[41] (here I manually switched |journal= to |series=

We can't proceed until
Feedback from maintainers


Reordering splits up author parameters

Status
{{fixed}} - thank you for reporting this
Reported by
David Eppstein (talk) 18:45, 23 November 2023 (UTC)
What happens
I always find it difficult to edit author parameters when they are split up into different parts of the template, separated from each other. In Special:Diff/1186468004, beyond cosmetic edits and a minor capitalization change, that is exactly what the bot did: it moved a title from later in the template to a new position between the author and author-link parameters.
What should happen
Not that
We can't proceed until
Feedback from maintainers


arXiv-to-journal-conversion upcases article title

Status
 Fixed with better CrossRef parsing
Reported by
David Eppstein (talk) 23:17, 30 November 2023 (UTC)
What happens
As part of the conversion of an arXiv preprint reference to a journal publication reference (fortunately, mostly correct this time, except that the article text stating when the publication occurred needed to be edited to match) it changed the article title from sentence-case to title-case
What should happen
Not that
Relevant diffs/links
Special:Diff/1187705641
We can't proceed until
Feedback from maintainers


arXiv-to-journal conversion breaks math formula formatting in title

Status
 Fixed
Reported by
David Eppstein (talk) 23:19, 30 November 2023 (UTC)
What happens
As part of the same conversion of an arXiv preprint reference to a journal publication reference as the previous bug, it changed the mathematical formulas in the article title from wiki-formatted math (with <math>) to TeX-formatted math with dollar signs
What should happen
Not that
Relevant diffs/links
Special:Diff/1187705641


Messing with correct citation

Status
 Fixed for this source
Reported by
Michael Bednarek (talk) 12:06, 1 December 2023 (UTC)
What happens
introduced CS1 error "|work= ignored"
What should happen
Nothing; the citation was fine.
Relevant diffs/links
Special:diff/1187788448
We can't proceed until
Feedback from maintainers


Last week or the week before, I fixed a lot of instances of this error introduced by Citation bot, and I noticed but did not mention that the "BnF catalog" source seemed to be a major stumbling block. I'm not sure why Citation bot thinks it's a book, but it's common enough and invariably incorrect enough that the source could probably be put on some sort of exclusion list rather than coding more complicated logic. Folly Mox (talk) 12:55, 1 December 2023 (UTC)

Incorrect changing to cite document

Status
 Fixed to match VERY recent changes to cite document, not years.
Reported by
Isaidnoway (talk) 11:14, 2 December 2023 (UTC)
What happens
Bot changes citation to cite document when no publisher= parameter is present as required by that template, and the url= parameter and access-date= parameter are not used by cite document either. This issue has been going on for years.
What should happen
Stop changing to cite document when no publisher parameter is present
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Zome&diff=1187921488&oldid=1187424537
We can't proceed until
Feedback from maintainers


Does not fix "zero width space character in |title= at position 1"

Status
 Fixed
Reported by
GoingBatty (talk) 22:33, 13 November 2023 (UTC)
What happens
The bot fixes many errors in Category:CS1 errors: invisible characters, but not "zero width space character in |title= at position 1"
What should happen
this edit
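
As an illustration of the requested cleanup (a Python sketch with a hypothetical function name, not the bot's code), a zero width space at the start of |title= can be removed without touching the rest of the value:

```python
ZERO_WIDTH_SPACE = "\u200b"  # U+200B


def strip_leading_zero_width_space(title: str) -> str:
    """Remove zero width space characters from the start of a title only."""
    return title.lstrip(ZERO_WIDTH_SPACE)
```

Characters elsewhere in the title are left for the existing invisible-character handling.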


Journal capitalization

Status
{{not a bug}}
Reported by
Super Dromaeosaurus (talk) 19:44, 2 December 2023 (UTC)
What happens
Capitalized journal name.
What should happen
Don't capitalize it. Searching it in Google will show it's most commonly written without capitalizing each word. See also its official website [42].
Relevant diffs/links
[43]
We can't proceed until
Feedback from maintainers


On Wikipedia, we follow

b
} 20:54, 2 December 2023 (UTC)

The bot isn't even following this policy you've cited. It capitalised the indefinite article "a" which is not allowed by the policy. Super Dromaeosaurus (talk) 21:01, 2 December 2023 (UTC)
You capitalize the start of subtitles.
b
}
21:10, 2 December 2023 (UTC)
That would make sense if it were a subtitle set off by a colon. But here, it follows a comma, which would not usually force capitalization of the next word. —David Eppstein (talk) 21:13, 2 December 2023 (UTC)

Enable 1-click activation of Category:CS1 errors: dates

Per the update description in the category.

b
} 23:04, 2 December 2023 (UTC)

 Fixed AManWithNoPlan (talk) 02:13, 3 December 2023 (UTC)

changes cite journal → cite book for no apparent reason

Status
{{fixed}} and fixed the few messed up pages
Reported by
Folly Mox (talk) 09:40, 3 December 2023 (UTC)
What happens
{{cite journal}} is altered to {{cite book}} for no discernible reason: all of |volume=, |issue=, and |journal= are already present in the citation. No isbn is given in the metadata of the target article. |journal= is altered to |series=, refreshingly avoiding the "periodical ignored" error, but this change doesn't appear to make any sense.
What should happen
no change: the script should have some awareness of the citation parameters present in the citation it's altering, and make reasonable choices based on those.
Relevant diffs/links
Special:MobileDiff/1188094499
We can't proceed until
Feedback from maintainers


Leaves journal= parameter in cite arXiv

Status
 Fixed - got it figured out
Reported by
Isaidnoway (talk) 16:54, 3 December 2023 (UTC)
What happens
Changed cite journal to cite arXiv, but didn't remove the journal= parameter
What should happen
fix it, journal= parameter not recognized by cite arXiv template
Relevant diffs/links
unknown parameter
We can't proceed until
Feedback from maintainers


Not sure how to handle that, in general. AManWithNoPlan (talk) 01:53, 4 December 2023 (UTC)

in {{citation}} templates, bot leaves behind |journal=

Status
 Fixed
Reported by
Trappist the monk (talk) 17:00, 3 December 2023 (UTC)
What happens
Bot edited a {{citation}} template to make a 'book' reference. Bot changed |title= to |chapter=, added a new |title= that more-or-less duplicated existing |journal=, and |volume=.
What should happen
{{citation}} uses the work parameters (|journal=, |magazine=, |newspaper=, |periodical=, |website=, |work=) to switch from its default 'book' format (|title= rendered in an italic font) to its 'work' format (|title= rendered in an upright font with quote marks). When making these types of edits to {{citation}} templates (it desires to make a 'book' reference), the bot should remove work parameters (in this case |journal=).
Relevant diffs/links
Diff
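
The rule described above can be sketched as follows. This is a minimal Python illustration that assumes a template's parameters are held in a dict; the function name is hypothetical and the alias list is the one given in the report.

```python
# Work parameters that switch {{citation}} from its 'book' format to its 'work' format.
WORK_PARAMS = ("journal", "magazine", "newspaper", "periodical", "website", "work")


def drop_work_params(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of params without any work parameters, for a 'book' style {{citation}}."""
    return {name: value for name, value in params.items() if name not in WORK_PARAMS}
```

Applied before the new |title= is added, this would avoid leaving |journal= behind in the converted template.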


Replaces – with €�

Status
 Fixed - mb_strtoupper should have been used where strtoupper was used. Google books urls work with normal and long dashes. Normal ones are better, but the bot should not hose the goofy ones. Also, only appears on certain OS/PHP combinations.
Reported by
GoingBatty (talk) 23:46, 2 December 2023 (UTC)
What happens
replaces – with €�
What should happen
nothing
Relevant diffs/links
Special:Diff/1187999249, which breaks URL
We can't proceed until
Feedback from maintainers


That URL was already broken before Citation bot got to it. Truncating it at the dash glyph might have fixed it. Folly Mox (talk) 01:19, 3 December 2023 (UTC)

@Folly Mox: Before Citation bot got to it, the existing URL redirected to a different Google Books page. Truncating the URL at the dash glyph would not have changed the redirect, and I don't expect Citation bot to try to do that. However, Citation bot made it worse, and this edit added the article for Category:CS1 errors: invisible characters GoingBatty (talk) 03:00, 4 December 2023 (UTC)
I agree that the edit was a disimprovement, definitely. Both the before and after versions of the gbooks url redirect to the same page for me, but I might be doing it wrong. Folly Mox (talk) 03:27, 4 December 2023 (UTC)
@Folly Mox: I agree that the before and after versions (and truncating it at the dash glyph) redirect to the same page. GoingBatty (talk) 04:23, 4 December 2023 (UTC)
It definitely looks like a bug. I'm not sure why I hopped in to defend Citation bot here, apart from the fact that I've been showing up here as a critic or reporter of bugs more days than not. I think what should have happened was for me to do nothing. Folly Mox (talk) 04:45, 4 December 2023 (UTC)
@Folly Mox: Reporting bugs is important, as is discussing them. I hope the maintainers find our discussion helpful. GoingBatty (talk) 04:54, 4 December 2023 (UTC)
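
The difference between a byte-wise and a multibyte-aware uppercase can be reproduced outside PHP. The Python sketch below is an assumption about the mechanism, not a confirmed account of the bot's code: it uppercases each byte the way an ISO-8859-1 locale would, which turns the first byte of the UTF-8 en dash (E2 80 93) into C2 and leaves a sequence that decodes as U+0080 followed by the replacement character. That would be consistent with the €� in the report if U+0080 is displayed as a Windows-1252 euro sign.

```python
def latin1_byte_upper(raw: bytes) -> bytes:
    """Uppercase each byte as an ISO-8859-1 locale would, ignoring multi-byte sequences."""
    out = bytearray()
    for b in raw:
        # a-z, and the accented lowercase range 0xE0-0xFE except the division sign 0xF7.
        if 0x61 <= b <= 0x7A or (0xE0 <= b <= 0xFE and b != 0xF7):
            b -= 0x20
        out.append(b)
    return bytes(out)


en_dash = "\u2013"

# Byte-wise uppercase damages the multi-byte character.
damaged = latin1_byte_upper(en_dash.encode("utf-8")).decode("utf-8", errors="replace")

# Unicode-aware uppercase (the analogue of a multibyte-safe function) leaves it alone.
intact = en_dash.upper()
```

Here damaged is the two-character string U+0080 U+FFFD, while intact is still the en dash.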

Proceedings of Science handling

Status
 Fixed quite a bit
Reported by
b
}
00:56, 21 November 2023 (UTC)
What should happen
[44], [45]
We can't proceed until
Feedback from maintainers


"series" parameter

Status
 Fixed
Reported by
GrindtXX (talk) 12:26, 14 November 2023 (UTC)
What happens
In the article Surtees Society, the bot has repeatedly tried to remove the parameter "series=Surtees Society" from the bibliographical entry Thompson AH (1939), The Surtees Society, 1834–1934, including a catalogue of its publications with notes on their sources and contents, and a list of the members of the society from its beginning to the present day, Surtees Society, vol. 150, Durham{{citation}}: CS1 maint: location missing publisher (link). The series is an essential part of the entry: it provides the necessary context for the volume number, and shows that this was the 150th volume published by the society, whereas its absence suggests that the history is itself a work of 150 or more volumes.


See also

b
} 00:18, 15 November 2023 (UTC)

The Michigan State University library catalog lists this as having |series=Publications of the Surtees Society rather than the shorter removed series name. No idea whether this would affect the bot's attempted removals. It also needs a publisher; following the same catalog entry, it looks like |publisher=Andrews & co. and B. Quaritch for the Surtees Society would be accurate. —David Eppstein (talk) 01:29, 15 November 2023 (UTC)

Journal/conference/book cleanup

Status
 Fixed mostly
Reported by
b
}
22:43, 12 November 2023 (UTC)
What should happen
[46]
We can't proceed until
Feedback from maintainers


nonsense journal to cite book

Status
 Fixed
Reported by
b
}
16:46, 11 November 2023 (UTC)
What happens
[47]
What should happen
[48]


Failing to understand "OUP Academic" as "Oxford University Press" (already present, correctly in |publisher=) is one thing; adding an unsupported |journal= parameter to {{cite book}} is something I thought Citation bot was better than. Folly Mox (talk) 20:28, 11 November 2023 (UTC)

Sassy and unconstructive. Struck with apologies. I've been getting back into ReferenceExpander repair, hoping that maybe we'll get those first 2500 diffs checked and fixed by the end of the year. There are another ~3000 we've barely started on, and I feel scared and hurt when I see high volume citation scripts making errors that could be avoided by careful output checking. My own error rate is probably higher. Folly Mox (talk) 13:19, 12 November 2023 (UTC)

page generated in error

Status
 Fixed with rejecting bibcode pages that match existing issue or volume
Reported by
ChaseKiwi (talk) 20:17, 7 November 2023 (UTC)
What happens
multipage article referenced in cite journal by <pipe>issue = nn has additional <pipe>page =nn entry created
What should happen
when multipage article in pdf format no <pipe>page =nn entry if already <pipe>issue=nn entry in cite journal name. In general page is a concept for written output and issue or number is a better default concept for web content, especially given numbers can be very large. If manual programmer used best inline reference style with say rp or sfn templates bot would cause chaos potentially with error messages
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Ata_Caldera&diff=1135117702&oldid=1132179284
We can't proceed until
Feedback from maintainers


doi:10.1007/s00445-020-01384-6 are both of the new problematic "article number" type. |issue= is wrong, and |page= is less than ideal, but the best the CS1 and 2 have for us at the moment. The journals clearly state to not use issue for these in the "how to cite" areas. AManWithNoPlan (talk) 20:54, 7 November 2023 (UTC)

Umm, nope, {{cite journal}} supports |article-number=:
{{cite journal |last1=Geshi |first1=N. |last2=Yamada |first2=I. |last3=Matsumoto |first3=K. |last4=Nishihara |first4=A. |last5=Miyagi |first5=I. |title=Accumulation of rhyolite magma and triggers for a caldera-forming eruption of the Aira Caldera, Japan |journal=Bulletin of Volcanology |volume=82 |article-number=44 |year=2020 |doi=10.1007/s00445-020-01384-6 |url=https://link.springer.com/article/10.1007/s00445-020-01384-6}}
Geshi N, Yamada I, Matsumoto K, Nishihara A, Miyagi I (2020). "Accumulation of rhyolite magma and triggers for a caldera-forming eruption of the Aira Caldera, Japan". Bulletin of Volcanology. 82 44. .
Trappist the monk (talk) 21:02, 7 November 2023 (UTC)
I will look into adding support for this parameter. Adding is easy, but dealing with all the edge cases (removing matching pages, etc), will require some work. AManWithNoPlan (talk) 22:11, 29 November 2023 (UTC)

Use Project MUSE book parameter instead of adding URL

Status
 Fixed by no longer adding MUSE urls, since they match DOIs
Reported by
  — Chris Capoccia 💬 19:03, 25 October 2023 (UTC)
What happens
Citation bot expands book citation and adds Project MUSE URL https://muse.jhu.edu/book/59700
What should happen
Should use |id={{Project MUSE|59700|type=book}} instead of URL https://muse.jhu.edu/book/59700
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User%3AChris_Capoccia%2Fsandbox&diff=1181873516&oldid=1181873496
We can't proceed until
Feedback from maintainers


If CS1|2 has a supported ID, it should be used. If there is a template for the ID, it could also be used, although I am personally against URL templates because they create link rot - archive bots have to be programmed to support them, but there are so many thousands of different templates it is impractical to provide automated support. And in this case, the template is more characters than simply using the plain URL, which all standard tools can recognize and support. -- GreenC 17:35, 26 October 2023 (UTC)

cosmetic feature request suggestion

Maybe 5 years ago, IABot had a bug that added a "#" to the end of every archive URL, and sometimes the source URL. The bug is long fixed, and WaybackMedic has been removing the errant #'s, but it's a cosmetic edit that can only be done when making another edit to the page, so it's been a long process. There are a lot of them. An example: Special:Diff/1183493290/1185983128 (second change). My code below if interested, no edge cases, simply removing any trailing # from the URLs. It won't break the archive URL.

Extended content
   # Fix trailing # in |url and |archive-url added by IABot 2.0 beta10

     psplit(GX.articlework, GX.cite2, p):
         if isarg("archive-url", "value", p.field[i]) and isarg("url", "value", p.field[i]):
           archiveurl = getarg("archive-url", "clean", p.field[i]) 
           sourceurl  = getarg("url", "clean", p.field[i])
           j = 0               
           if archiveurl ~ "[#]$":
             inc(j)                 
             sub("[#]$", "", archiveurl)
             p.field[i] = replacearg(p.field[i], "archive-url", archiveurl, "cosmetic1.1")
             if sourceurl ~ "[#]$":
               inc(j)
               sub("[#]$", "", sourceurl)
               p.field[i] = replacearg(p.field[i], "url", sourceurl, "cosmetic1.2")
           if j > 0:
             p.ok += inclog("cosmetic1.1", GX.esformat, Project.logiats, &"{archiveurl} ---- remove trailing #")

psplit() iterates over every cite template which are held in p.field[i]

GreenC 17:35, 20 November 2023 (UTC)
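
For readers who do not use Nim, the same idea can be sketched in Python. This is an illustration only: the function name and the dict representation of a cite template are assumptions, and unlike the code above it treats |url= and |archive-url= independently rather than checking the source URL only when the archive URL was changed.

```python
def strip_trailing_hash(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of params with one trailing '#' removed from url and archive-url."""
    cleaned = dict(params)
    for name in ("url", "archive-url"):
        value = cleaned.get(name)
        if value and value.endswith("#"):
            cleaned[name] = value[:-1]
    return cleaned
```

A "#" in the middle of a URL (a real fragment) is not affected, since only the final character is examined.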

 Fixed AManWithNoPlan (talk) 17:27, 4 December 2023 (UTC)

Surtees Society

Still needs fixed. Fix did not work. AManWithNoPlan (talk) 00:41, 5 December 2023 (UTC)

 Fixed AManWithNoPlan (talk) 17:12, 5 December 2023 (UTC)

10.22323 is open access

Covers both

b
} 02:25, 5 December 2023 (UTC)

 Fixed AManWithNoPlan (talk) 18:07, 5 December 2023 (UTC)

IEEE Xplore part trois

Status
 Fixed
Reported by
b
}
03:14, 5 December 2023 (UTC)
What happens
[49], [50]
What should happen
[51], [52]


misidentifying not-journals as journals

Status
{{fixed}}
Reported by
Folly Mox (talk) 07:47, 5 December 2023 (UTC)
What happens
Not causing any template errors, so not as big of a deal as the bug that seems largely fixed by now (I went through a few dozen today where Citation bot altered {{cite dictionary}}, it rather obviously failed verification so I didn't bother.
Relevant diffs/links
Special:MobileDiff/1188400138 (the "journal" (actual source cited))
Special:MobileDiff/1188398188 (the "journal" (TWL link))
We can't proceed until
Feedback from maintainers


"Document unavailable"/"Preview unavailable" is not a title

Status
{{fixed}}
Reported by
* Pppery * it has begun... 05:11, 8 December 2023 (UTC)
What happens
https://en.wikipedia.org/w/index.php?title=%C2%A1Vamos%21+Let%27s+Go+Eat&diff=prev&oldid=1188870114
What should happen
It doesn't add placeholder titles
We can't proceed until
Feedback from maintainers
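
A placeholder filter of the kind requested could be as small as the Python sketch below. It is an illustration with a hypothetical function name; the entries are the placeholder strings mentioned on this page, and a real list would presumably be longer.

```python
# Strings that are not real titles, taken from the reports on this page.
PLACEHOLDER_TITLES = {"document unavailable", "preview unavailable", "archived copy"}


def is_placeholder_title(title: str) -> bool:
    """True if a scraped title is a known placeholder rather than a real title."""
    return title.strip().lower() in PLACEHOLDER_TITLES
```

A proposed |title= for which this returns true would simply not be added.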


Caps: SIGPLAN

Status
{{fixed}}
Reported by
b
}
14:52, 8 December 2023 (UTC)
What should happen
[53]
We can't proceed until
Feedback from maintainers


10.15347 is free to read

WikiJournals

b
} 15:36, 10 December 2023 (UTC)

 Fixed AManWithNoPlan (talk) 21:03, 10 December 2023 (UTC)

Convert &#x00026; to &

Status
 Fixed
Reported by
b
}
20:45, 5 December 2023 (UTC)
What should happen
[54]
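
The conversion in the heading is ordinary HTML character-reference decoding. As a Python illustration (the wrapper name is hypothetical, and the bot itself is not written in Python), the standard library's html.unescape handles the zero-padded hexadecimal form directly:

```python
import html


def decode_entities(title: str) -> str:
    """Decode HTML character references such as &#x00026; in a title."""
    return html.unescape(title)
```

For example, decode_entities("Science &#x00026; Society") gives "Science & Society".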


Adds doi-access=free for broken DOI

Status
{{fixed}} somewhat. As best as a bot can.
Reported by
Nemo 11:19, 3 December 2023 (UTC)
What happens
special:diff/1188055766
What should happen
special:diff/1188109405
Replication instructions
Both the DOI and the PubMed full text link are broken and redirect to https://lww.com/pages/default.aspx .
We can't proceed until
Feedback from maintainers


Unfortunately this journal is not preserved so there are no archived copies either. Nemo 11:19, 3 December 2023 (UTC)

For the cases where the DOI used to provide a gratis copy but no longer does, see #Add Internet Archive Scholar links. Nemo 11:37, 3 December 2023 (UTC)
That the DOI is broken is a separate issue from its free-to-read status. Once repaired, the DOI will be free.
b
}
12:12, 3 December 2023 (UTC)
And how do you know that? Nemo 13:31, 3 December 2023 (UTC)
It's originally from Medknow. All Medknow journals/DOIs are open access.
b
}
13:54, 3 December 2023 (UTC)
Or were. Now that they've been migrated, anything could happen. This journal has a nonfree license so it could vanish unless someone archives it. If all Medknow DOIs are broken right now, I agree it's likely they'll be fixed within a few months by LWW, but in the meanwhile they're not a suitable link target so it makes no sense to add doi-access=free. Nemo 14:33, 3 December 2023 (UTC)
Actually, not all Medknow DOIs are broken, for example The Journal of Indian Prosthodontic Society has functioning DOIs issued by Springer, like doi:10.1007/s13191-013-0262-x, for 2010–2014. (Didn't check the rest of the archive.) Have you sampled the DOIs under non-Springer prefix to see how many are working? Nemo 14:47, 3 December 2023 (UTC)
"This journal has a nonfree license" CC BY-NC-SA is a free-to-read license.
b
}
15:12, 3 December 2023 (UTC)
"Actually, not all Medknow DOIs are broken" I compliment you on finding one that actually works. AManWithNoPlan (talk) 22:09, 3 December 2023 (UTC)
This just blew up because of this https://en.wikipedia.org/wiki/Category:CS1_maint:_DOI_inactive_as_of_December_2023 AManWithNoPlan (talk) 18:53, 4 December 2023 (UTC)
I personally patrol this page and report ALL bad DOIs. Many of them point to the wrong place since the journal has been purchased. Or they are data DOIs that are not part of crossref, so who knows. Or they are MedDontKnow. AManWithNoPlan (talk) 19:19, 4 December 2023 (UTC)
I have no idea what a "free-to-read license" is. A free license is a well-defined concept. A "free-to-read" source is an English-Wikipedia specific moving concept vaguely defined at Access indicators for url-holding parameters. Mixing the two expressions serves no purpose. Nemo 22:36, 4 December 2023 (UTC)
I think that the idea of open-source journals that you cannot find is funny, but I do think that keeping the DOIs in the articles is good, since you can sometimes google them and find a copy online. AManWithNoPlan (talk) 19:16, 5 December 2023 (UTC)
Keeping the DOI is useful, making it auto-link less so. Nemo 09:20, 8 December 2023 (UTC)
If the bot thinks the DOI works, then it will not add the free. AManWithNoPlan (talk) 14:23, 8 December 2023 (UTC)
That's an issue for the template to handle, not a reason to not flag things that should be flagged. And the template disables automatic linking via |doi-broken-date=.
b
}
15:01, 8 December 2023 (UTC)
Again, we can't know whether the DOI provides a free-to-read copy when we don't even know where the copy is supposed to be. (Yes I know we were discussing this elsewhere, I'm in a hurry now.) But good the autolinking is disabled by the broken-doi parameter; the green lock should be as well. Nemo 15:09, 8 December 2023 (UTC)
"we can't know whether the DOI provides a free-to-read copy"
Yes we can.
b
}
15:33, 8 December 2023 (UTC)
How? Nemo 22:08, 10 December 2023 (UTC)
Look at the registrant. Is the registrant a fully open access publisher? If yes, then yes.
b
}
00:06, 11 December 2023 (UTC)

unsupported parameters when changing template type to cite document

Status
{{fixed}}, or at least better. Dealing with non-url references is not obvious, even to us meatbags
Reported by
Folly Mox (talk) 19:11, 9 December 2023 (UTC)
What happens
in the normal course of removing proxy urls that duplicate stable identifiers, Citation bot removed the url from a source that was not published in a journal or book, and accordingly altered the citation template type to {{cite document}}. This caused unidentified parameter errors for |citeseerx= and |s2cid=.
Relevant diffs/links
Special:Diff/1189060531
We can't proceed until
Feedback from maintainers


This was my fix: changing back to {{cite web}}, adding the url of the source, and an unrelated fix to |publisher=. I'm not sure this is really Citation bot's fault, or if maybe the parameter set supported by {{cite document}} ought be expanded to allow for more stable identifiers. Pinging Trappist the monk as the template maintainer, to see if they have input. Folly Mox (talk) 19:11, 9 December 2023 (UTC)

Yeah, converting {{
cite citeseerx
}}
.
Because the original template had |citeseerx=10.1.1.42.3374, an alternate fix might be:
{{cite citeseerx |last=Wirz |first=Marc |title=Characterizing the Grzegorczyk hierarchy by safe recursion |date=November 1999 |citeseerx=10.1.1.42.3374}}
Wirz M (November 1999). "Characterizing the Grzegorczyk hierarchy by safe recursion". CiteSeerX 10.1.1.42.3374.
{{cite document}} is a 'last resort' sort of template when absolutely none of the other cs1|2 templates apply. The bot should avoid using {{cite document}} because, almost always, there is a better choice.
Trappist the monk (talk) 19:47, 9 December 2023 (UTC)

10.1074 and 10.1194 are open access

b
} 01:20, 11 December 2023 (UTC)


 Fixed AManWithNoPlan (talk) 13:27, 11 December 2023 (UTC)

One cache to rule them all.

Note to me for when I have time. AManWithNoPlan (talk) 16:26, 10 December 2023 (UTC)

Replace inf by sub tags

Status
{{wontfix}} - seems to be a one page only problem
Reported by
b
}
15:27, 8 December 2023 (UTC)
What should happen
[55]
We can't proceed until
Feedback from maintainers


Incorrect title

Status
{{fixed}}
Reported by
GoingBatty (talk) 18:53, 12 December 2023 (UTC)
What happens
Creates a reference with |title= parameter with incorrect characters, adding the article to Category:CS1 errors: invisible characters (e.g. this edit in Malayalam).
What should happen
Proper |title= (and maybe even |website=) (e.g. this edit).
We can't proceed until
Feedback from maintainers


Here's another edit in Gujarati. — Preceding unsigned comment added by GoingBatty (talkcontribs) 20:11, 12 December 2023 (UTC)

Fails to remove invisible character

Status
{{fixed}}
Reported by
b
}
00:07, 13 December 2023 (UTC)
What should happen
[56]
We can't proceed until
Feedback from maintainers


The character is &#8203; (zero width space).

b
} 00:07, 13 December 2023 (UTC)
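A minimal sketch of the cleanup described here, assuming only the zero width space (U+200B) named in this report should go. The function name is hypothetical. Other zero-width code points are deliberately left alone, since some scripts use them legitimately.

```python
ZERO_WIDTH_SPACE = "\u200b"


def strip_zero_width_space(value: str) -> str:
    """Remove U+200B, whether present literally or as the &#8203; entity."""
    return value.replace(ZERO_WIDTH_SPACE, "").replace("&#8203;", "")


print(repr(strip_zero_width_space("Graph\u200b Drawing")))
```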

After inspection, all of 10.5210 are free access, not just 10.5210/fm

b
} 01:22, 15 December 2023 (UTC)

 Fixed AManWithNoPlan (talk) 15:23, 15 December 2023 (UTC)
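The "look at the registrant" approach from the discussion above can be sketched as a prefix lookup. This is only an illustration: the set contains just the prefixes reported on this page as free to read, and the names `FREE_DOI_PREFIXES` and `is_free_by_registrant` are hypothetical, not the bot's own.

```python
# Prefixes reported on this page as free to read; not the bot's real list.
FREE_DOI_PREFIXES = {"10.15347", "10.1074", "10.1194", "10.5210"}


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix, i.e. everything before the first slash."""
    return doi.strip().split("/", 1)[0]


def is_free_by_registrant(doi: str) -> bool:
    """True when the DOI prefix belongs to a registrant listed as fully free to read."""
    return doi_prefix(doi) in FREE_DOI_PREFIXES


print(is_free_by_registrant("10.5210/fm.v1i1.1"))
```

Whether a broken DOI should still get the flag is the open question in the thread above; the sketch does not try to settle it.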

Specifying name list style for newly-added name entries

There is a pull request that allows specifying name list style for newly-added name entries: https://github.com/ms609/citation-bot/pull/4236

It adds an option alongside the already existing style of first1/last1, first2/last2, etc.

This pull request introduces the following functionality. If a page contains {{Use vanc name-list-style}} template, then the bot will use |vauthors= and |veditors= attributes rather than firstN/lastN and editor-firstN/editor-lastN when adding name entries for a citation template if the names were not specified in this template. This is similar to {{Use dmy dates}} template when the bot uses date format as specified on the page. To reproduce this behaviour, edit a page on Wikipedia, add {{Use vanc name-list-style}} template (or {{Use vanc name-list-style|date=December 2023}}), delete author names (firstN/lastN) and run the bot. It will fill the names as vauthors. Maxim Masiutin (talk) 16:48, 7 December 2023 (UTC)

Why does {{Use vanc name-list-style}} exist? Was there any discussion that brought it into existence? cs1|2 doesn't know anything about that template but will understand {{CS1 config|name-list-style=vanc}}. Why create a new otherwise non-functional template?
Trappist the monk (talk) 18:26, 7 December 2023 (UTC)
I agree, we can use {{CS1 config|name-list-style=vanc}}. Should we use {{CS1 config|name-list-style=vanc}}? If yes, I will update the pull request. Anyway, {{CS1 config|name-list-style=vanc}} is not currently supported by Citation bot. Maxim Masiutin (talk) 18:30, 7 December 2023 (UTC)
However, the templates {{CS1 config|name-list-style=vanc}} and {{Use vanc name-list-style}} are different. {{CS1 config|name-list-style=vanc}} controls how the names are displayed during the render, whereas {{Use vanc name-list-style}} does not affect the rendering but is a hint on whether the templates should use first/last or the vauthors attribute, in analogy to {{Use dmy dates}}, which also does not control the output but hints how the dates should be specified in the source. This answers your question about why {{Use vanc name-list-style}} exists and how it is different from {{CS1 config|name-list-style=vanc}}. Maxim Masiutin (talk) 15:33, 9 December 2023 (UTC)
You are mistaken. cs1|2 uses {{use dmy dates}} and {{use mdy dates}} to control date formatting when cs1|2 templates are rendered. See Template:Use dmy dates § Auto-formatting citation template dates for example. I see no reason to keep {{Use vanc name-list-style}}.
Trappist the monk (talk) 15:41, 9 December 2023 (UTC)
@Trappist the monk thank you for letting me know! Why then are there separate templates for use dmy dates if this can be solved by "{{CS1 config}}"? That is the same question you asked me about the name list style.
Anyway, my proposal is not about a particular template but about the functionality of the bot to adhere to the name list style specified for the page. My pull request can be adjusted to any template, and we need a consensus. Maxim Masiutin (talk) 16:16, 9 December 2023 (UTC)
The {{use xxx dates}} templates came first (January 2009). Development of Module:Citation (the predecessor to Module:Citation/CS1) began August 2012. Auto date formatting was added to Module:Citation/CS1 April 2019. Support for {{CS1 config}} was added August 2023. {{CS1 config}} applies only to cs1|2 templates but the {{use xxx dates}} templates apply to both the article body and to article referencing (regardless of how referencing is implemented).
Trappist the monk (talk) 16:37, 9 December 2023 (UTC)
My initial proposal for discussion was on a pull request that allows specifying name list style for newly-added name entries: https://github.com/ms609/citation-bot/pull/4236
It adds an option of specifying vauthors alongside the already existing style of first1/last1, first2/last2, etc.
Is my understanding correct that you support the feature based on {{CS1 config|name-list-style=vanc}} but not on {{Use vanc name-list-style}}, so that the latter template should never be used at all?
If you support the feature based on {{CS1 config|name-list-style=vanc}}, I will modify the pull request. Maxim Masiutin (talk) 21:16, 11 December 2023 (UTC)
I am indifferent about whether or not Citation bot applies Vancouver style to cs1|2 templates. From the number of participants in this discussion it would seem that the response to the proposal is a resounding 'meh'. If the bot is going to apply Vancouver style based on some sort of flag template, that template should be {{CS1 config}} because that template has functionality beyond being a simple flag template.
Trappist the monk (talk) 00:13, 12 December 2023 (UTC)
I updated the pull request and I hope that the maintainers will accept it. Maxim Masiutin (talk) 22:00, 13 December 2023 (UTC)
Thank you for your guidance! I updated the pull request to support {{cs1 config|name-list-style=vanc}} and the maintainers of Citation bot accepted this pull request, also making the necessary adjustments. So if you now take Citation bot from GitHub, it will support this feature. Thank you again for your feedback and guidance. Maxim Masiutin (talk) 04:41, 15 December 2023 (UTC)
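The behaviour discussed in this thread can be sketched roughly as follows. This is an illustration, not the code from the pull request: all function names are hypothetical, and the rule of keeping at most two initials without periods is an assumption about how |vauthors= values are written.

```python
import re

# Matches {{CS1 config|...|name-list-style=vanc|...}} anywhere on the page.
CS1_CONFIG_VANC = re.compile(
    r"\{\{\s*cs1 config\s*\|[^{}]*name-list-style\s*=\s*vanc[^{}]*\}\}",
    re.IGNORECASE,
)


def page_wants_vanc(wikitext: str) -> bool:
    """True when the page carries a CS1 config template asking for Vancouver style."""
    return CS1_CONFIG_VANC.search(wikitext) is not None


def to_vancouver(last: str, first: str) -> str:
    """Format one name as surname plus up to two initials, e.g. 'Wirz M'."""
    parts = [p for p in re.split(r"[\s.\-]+", first) if p]
    initials = "".join(p[0].upper() for p in parts)[:2]
    return f"{last} {initials}"


def name_params(authors, wikitext):
    """Build the name parameters for newly added authors on this page."""
    if page_wants_vanc(wikitext):
        return {"vauthors": ", ".join(to_vancouver(l, f) for l, f in authors)}
    params = {}
    for i, (last, first) in enumerate(authors, start=1):
        params[f"last{i}"] = last
        params[f"first{i}"] = first
    return params


print(name_params([("Wirz", "Marc")], "{{CS1 config|name-list-style=vanc}}"))
```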

Expand non-templated refs

Would it be possible to expand from non-templated reference <ref>[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553785/ Bar]</ref>, as long as |title= would be exactly the same (Bar) which already exists for the URL specified as if the bot would try to expand the bare URL (as long as there is no other content in the ref)? Jonatan Svensson Glad (talk) 17:16, 24 July 2023 (UTC)

Example here, I had to remove the brackets and the already provided title prior to running the bot. The outcome provided the exact same title as was already present prior to me doing the removal, causing a lot of manual labor in order to get the bot to attempt to expand the citation. Jonatan Svensson Glad (talk) 17:19, 24 July 2023 (UTC)
How close should the titles have to be? Also, it seems that from my experience, the title is often some mix of the title and journal and authors. AManWithNoPlan (talk) 20:08, 14 August 2023 (UTC)
Well, a first start could be exact "only-title" match inside square brackets (with only a preceding period/dot inside or outside the brackets being the difference). To later build upon with more possibilities... Jonatan Svensson Glad (talk) 21:01, 14 August 2023 (UTC)
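The "exact only-title match" suggested above might look something like the sketch below. It assumes the ref holds nothing but one bracketed external link, and it reads the period allowance as a trailing period on either title. The names `BRACKETED_REF` and `can_expand` are hypothetical.

```python
import re

# A ref whose only content is [URL Title], optionally followed by a period.
BRACKETED_REF = re.compile(
    r"^<ref>\s*\[(?P<url>https?://\S+)\s+(?P<title>[^\]]+)\]\s*\.?\s*</ref>$"
)


def normalize(title: str) -> str:
    """Ignore surrounding whitespace and a single trailing period."""
    return title.strip().rstrip(".").strip()


def can_expand(ref_wikitext: str, fetched_title: str):
    """Return the URL if the bracketed title equals the fetched title, else None."""
    m = BRACKETED_REF.match(ref_wikitext.strip())
    if m is None:
        return None  # other content in the ref, so leave it for a human
    if normalize(m.group("title")) != normalize(fetched_title):
        return None
    return m.group("url")


ref = "<ref>[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553785/ Bar]</ref>"
print(can_expand(ref, "Bar"))
```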

arxiv is not a journal

Status
{{fixed}} mostly
Reported by
Trappist the monk (talk) 14:19, 9 December 2023 (UTC)
What happens
Bot changed an admittedly malformed {{cite journal}} template. In that template: |journal=arXiv, |doi=10.48550/arXiv.2206.12231, and |doi-access=free. The only action that the bot took was to convert |doi=10.48550/arXiv.2206.12231 to |arxiv=2206.12231.
What should happen
Bot should recognize that arXiv is not a journal so {{cite journal}} is the wrong template; should be changed to {{cite arxiv}}. When removing |doi=, the bot should always remove |doi-access=. Remember that {{cite arxiv}} supports a limited subset of the whole cs1|2 parameter set so other parameters in a {{cite journal}} to {{cite arxiv}} conversion may need to be removed. The limited parameter set is defined in Module:Citation/CS1/Whitelist lines 340–346.
Relevant diffs/links
Diff
We can't proceed until
Feedback from maintainers


If you run the bot again, then it does clean up. I will look at having it not take two times. AManWithNoPlan (talk) 15:10, 9 December 2023 (UTC)
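A rough single-pass sketch of the requested behaviour, with the citation modelled as a template name plus a dict of parameters. The function name is hypothetical, and the sketch does not prune parameters that {{cite arxiv}} does not support; that step would need the whitelist mentioned above.

```python
import re

# arXiv-issued DOIs have the form 10.48550/arXiv.<identifier>
ARXIV_DOI = re.compile(r"^10\.48550/arXiv\.(?P<id>.+)$", re.IGNORECASE)


def fix_arxiv_citation(name, params):
    """Convert an arXiv DOI to |arxiv=, drop |doi-access=, and retarget the template."""
    params = dict(params)  # work on a copy
    m = ARXIV_DOI.match(params.get("doi", ""))
    if m:
        params.setdefault("arxiv", m.group("id"))
        del params["doi"]
        params.pop("doi-access", None)  # meaningless without |doi=
    is_arxiv_journal = params.get("journal", "").lower() == "arxiv"
    if name.lower() == "cite journal" and is_arxiv_journal and "arxiv" in params:
        name = "cite arxiv"
        del params["journal"]  # arXiv is not a journal
    return name, params


print(fix_arxiv_citation(
    "cite journal",
    {"journal": "arXiv", "doi": "10.48550/arXiv.2206.12231", "doi-access": "free"},
))
```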

Bug? The bot should not replace first/last to first1/last1 when there is just one author

According to Help:Citation Style, An author may be cited using separate parameters for the author's surname and given name by using

However, the bot replaces |last= and |first= with |last1= and |first1= even when there is just one author, which is contrary to the description of the CS1 Citation Style.

The bot probably should not go back and change existing ones, but it should definitely avoid making this change in the future. Also, when no authors were specified and there is a single author, the bot should use |last= and |first=.

If you agree with that, I can try to submit a pull request. Maxim Masiutin (talk) 15:38, 9 December 2023 (UTC)

Could you give an example of where the bot changed last to last1 when there is no second author? AManWithNoPlan (talk) 16:34, 9 December 2023 (UTC)
{{wontfix}}, since the complexity of going back and changing them will just make the bot's author handling that much more insane, and it is already complicated enough. AManWithNoPlan (talk) 15:26, 15 December 2023 (UTC)

Use of template "ODNB"

Citation bot changed one of the source descriptions in the article James Hamilton (English Army officer) from:

{{Cite web|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |website=[[Oxford Dictionary of National Biography]] |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

to:

{{Cite ODNB|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

I wondered why. I read up on Template:ODNB. It says it is a wrapper around Template:Cite encyclopedia. Well, perhaps I should not have used "Cite web" but "Cite encyclopedia" and Citation bot should probably have corrected me to:

{{Cite encyclopedia|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |encyclopedia=[[Oxford Dictionary of National Biography]] |edition=online |publisher=[[Oxford University Press]] |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

However, I do not understand why we should be forced to use a wrapper around Cite encyclopedia rather than the original. I thought the use of the ODNB template was voluntary and not obligatory. With thanks and best regards Johannes Schade (talk) 13:10, 19 November 2023 (UTC)

pages totales

Status
 Fixed
Reported by
GoingBatty (talk) 15:31, 18 December 2023 (UTC)
What happens
converts |pages totales= to |pages=
What should happen
remove the parameter |pages totales=, as the total number of pages is not supported in the English Wikipedia citation templates.
Relevant diffs/links
Special:Diff/1190228307


bot drops author first name in favor of a pair of dots

Status
 Fixed with more unicode happy code
Reported by
Trappist the monk (talk) 20:18, 18 December 2023 (UTC)
What happens
According to Google Books, the author is Гродзенский С.Я.. The bot produces | last1 = Гродзенский| first1 = ...
Relevant diffs/links
Diff; same article several months ago: Diff
We can't proceed until
Feedback from maintainers


treat #invoke:cite foo | as cite foo

Status
 Fixed
Reported by
b
}
18:41, 17 December 2023 (UTC)
What should happen
[57]


Note, it shouldn't remove the extra pipe.

b
} 18:54, 17 December 2023 (UTC)

I have had this on my to-do list for a while. Those huge pages are rare, but seem to be the ones with the insane number of citations that need fixed. AManWithNoPlan (talk) 13:39, 18 December 2023 (UTC)
For those wondering why, {{#invoke:cite web || ...}} does the same as {{cite web | ...}}. AManWithNoPlan (talk) 13:43, 18 December 2023 (UTC)
My tasks : add tests, change wikiname() function, and make sure extra pipe does not get removed, since in normal templates that is an error. AManWithNoPlan (talk) 13:45, 18 December 2023 (UTC)
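The idea of treating the invoke form as the plain template can be sketched as a name lookup. This `wikiname` is a hypothetical stand-in, not the bot's function of the same name. It only reads the text, so the extra pipe is left in place, and other `#invoke` calls are not treated as citations.

```python
import re

# {{#invoke:cite foo || ...}} : module "cite foo" followed by an empty function name.
INVOKE_CITE = re.compile(
    r"^\{\{\s*#invoke:\s*(cite [^|{}]+?)\s*\|\s*\|", re.IGNORECASE
)
PLAIN_TEMPLATE = re.compile(r"^\{\{\s*([^|{}#]+?)\s*(?:\||\}\})")


def wikiname(template_text: str) -> str:
    """Return the lower-cased template name, mapping #invoke:cite foo to cite foo."""
    m = INVOKE_CITE.match(template_text)
    if m:
        return m.group(1).lower()
    m = PLAIN_TEMPLATE.match(template_text)
    return m.group(1).lower() if m else ""


print(wikiname("{{#invoke:cite web || url=https://example.org}}"))
```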

When databases collide

This edit changed a proceedings title from the version given by

WP:CITEVAR. Please stop. —David Eppstein (talk) 06:46, 30 October 2023 (UTC)

Conversion to MathML conflicts with the Math extension

Status
 Fixed mostly sort of
Reported by
Salix alba (talk): 17:03, 10 December 2023 (UTC)
What happens
The bot replaced the wikitext:
''b'' → ''s'' ℓ<sup>+</sup> ℓ<sup>−</sup>

with the MathML text

<math><mrow>b<mo stretchy="false">→s<msup><mrow>ℓ</mrow><mrow>+</mrow></msup><msup><mrow>ℓ</mrow><mrow>−</mrow></msup></mrow></math>

This conflicts with the maths extension and in turn causes a maths syntax error.

What should happen
No MathML text should be generated
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=LHCb_experiment&diff=prev&oldid=1188666266
We can't proceed until
Feedback from maintainers


The title in CrossRef is "Measurement of lepton universality parameters in \n<mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mml:msup><mml:mi>B<\/mml:mi><mml:mo>+<\/mml:mo><\/mml:msup><mml:mo stretchy=\"false\">\u2192<\/mml:mo><mml:msup><mml:mi>K<\/mml:mi><mml:mo>+<\/mml:mo><\/mml:msup><mml:msup><mml:mo>\u2113<\/mml:mo><mml:mo>+<\/mml:mo><\/mml:msup><mml:msup><mml:mo>\u2113<\/mml:mo><mml:mo>\u2212<\/mml:mo><\/mml:msup><\/mml:math>\n and \n<mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mml:msup><mml:mi>B<\/mml:mi><mml:mn>0<\/mml:mn><\/mml:msup><mml:mo stretchy=\"false\">\u2192<\/mml:mo><mml:msup><mml:mi>K<\/mml:mi><mml:mrow><mml:mo>*<\/mml:mo><mml:mn>0<\/mml:mn><\/mml:mrow><\/mml:msup><mml:msup><mml:mo>\u2113<\/mml:mo><mml:mo>+<\/mml:mo><\/mml:msup><mml:msup><mml:mo>\u2113<\/mml:mo><mml:mo>\u2212<\/mml:mo><\/mml:msup><\/mml:math>\n decays" which makes it difficult to clean up. I tags that are not used are the annoying ones. https://github.com/ms609/citation-bot/commit/36648e552b4bf9b4f1e7ff1c88383701e79c95e0 AManWithNoPlan (talk) 21:02, 10 December 2023 (UTC)

Why not just like, not alter an existing title if the one pulled from Crossref is substantially different? or contains difficult markup? Existing, human-generated titles are more likely to be accurate: database titles often have non-compliant casing, poor OCR, and incompatible markup. No fault against adding a title where none exists, but changing them to match Crossref is problematic. Folly Mox (talk) 21:46, 10 December 2023 (UTC)
The problem is that a DOI was added to an arXiv citation, and with the update to the published version the published title is used instead. Clearly a rare event, with the new title being a problem, but I will look into actually parsing the math. These few math tags that are ignored pre-date my work on the bot, and were flagged as a someday wish list. AManWithNoPlan (talk) 21:53, 10 December 2023 (UTC)
It's going to be tricky parsing the MathML and converting it back to Wikipedia format LaTeX. What might work is to simply enclose the foreign <math> tags inside a <nowiki>. There is another citation in the same revision (actually the same paper) where this has been done. Probably anything with mathematics in it is going to need human attention. Some sort of tracking for these occurrences could be useful. --Salix alba (talk): 12:15, 12 December 2023 (UTC)
"Going to need human attention" implies "bot should not think it knows how to change the title". It should give up and not produce garbage.
In this instance, arXiv actually has a usable title: the command line
curl -LH "Accept: application/x-bibtex" https://doi.org/10.48550/arXiv.2212.09152
produces
title = {Test of lepton universality in $b \rightarrow s \ell^+ \ell^-$ decays}
which requires only changing the dollar signs to math tags to render correctly. The other thing with the mathml is unfit for human consumption and not a useful start to producing a readable title. —David Eppstein (talk) 18:52, 12 December 2023 (UTC)
doi.org titles are generally much worse quality than crossref api. AManWithNoPlan (talk) 21:05, 12 December 2023 (UTC)
Set them to be wrapped in nowiki tags. AManWithNoPlan (talk) 15:37, 15 December 2023 (UTC)
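Two of the ideas raised in this thread can be sketched in a few lines: turning dollar-delimited TeX in a BibTeX-style title into math tags, and wrapping foreign math markup in nowiki so it is shown literally instead of being parsed. Both function names are hypothetical and neither is taken from the bot's code.

```python
import re


def dollars_to_math(title: str) -> str:
    """Replace each $...$ span with <math>...</math>."""
    return re.sub(r"\$([^$]+)\$", r"<math>\1</math>", title)


def nowiki_math(title: str) -> str:
    """Wrap every <math>...</math> block in <nowiki> so it is not rendered."""
    return re.sub(
        r"(<math>.*?</math>)", r"<nowiki>\1</nowiki>", title, flags=re.DOTALL
    )


bibtex_title = r"Test of lepton universality in $b \rightarrow s \ell^+ \ell^-$ decays"
print(dollars_to_math(bibtex_title))
```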

invoke other template

Status
 Fixed
Reported by
b
}
02:15, 21 December 2023 (UTC)
What happens
[58]
What should happen
don't add pipes for other invocations


Bot rebooted to make sure no running jobs continue to use old code. AManWithNoPlan (talk) 15:31, 21 December 2023 (UTC)

Breaks charts

Status
{{fixed}}
Reported by
Ita140188 (talk) 07:29, 21 December 2023 (UTC)
What happens
The bot breaks working charts by inserting extraneous markup
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Solar_power_in_Japan&diff=1191010522&oldid=1190903421
We can't proceed until
Feedback from maintainers


Fails to fix a line break / invisible character in cite tweet

Status
 Fixed
Reported by
b
}
23:23, 21 December 2023 (UTC)
What should happen
[59] [60]


Fails to fix date

Status
 Fixed
Reported by
b
}
23:25, 21 December 2023 (UTC)
What should happen
[61] [62]


Date/year/access-date/archive-date/etc.

X. 8 December 2022. {{cite book}}: Check date values in: |date= (help)

X. Monday, November 2, 1981. {{cite book}}: Check date values in: |date= (help)

X. 08 Dec 2023. {{cite book}}: Check date values in: |date= (help)

X. 08 December 2023. {{cite book}}: Check date values in: |date= (help)

AManWithNoPlan (talk) 00:52, 22 December 2023 (UTC)
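A small sketch of two repairs the examples above seem to call for, assuming the intended fix is to drop a leading weekday name and a leading zero on the day. The function name is hypothetical. The first example looks fine as displayed here, so the sketch leaves it unchanged.

```python
import re

WEEKDAY = r"(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day"


def tidy_date(value: str) -> str:
    """Strip a leading weekday name and a leading zero from the day number."""
    value = value.strip()
    value = re.sub(rf"^{WEEKDAY},?\s+", "", value)  # "Monday, November 2, 1981"
    value = re.sub(r"^0(\d\s)", r"\1", value)       # "08 Dec 2023"
    return value


for raw in ["Monday, November 2, 1981", "08 Dec 2023", "08 December 2023"]:
    print(tidy_date(raw))
```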

please link to the new Google books web pages

This edit changed links that consistently lead to the new Google books web pages to ones that do not. 50.47.144.129 (talk) 19:49, 30 October 2023 (UTC)

Good question. Right now wikipedia prefers https://books.google.com/books?id=fp9wrkMYHvMC but should this be swapped to https://www.google.com/books/edition/_/fp9wrkMYHvMC AManWithNoPlan (talk) 15:41, 7 November 2023 (UTC)

Replace hyphen-like with hyphen in author names

Status
 Fixed
Reported by
b
}
03:56, 21 December 2023 (UTC)
What should happen
[63]


The culprit is U+2010 : HYPHEN, which should be replaced with the standard U+002D : HYPHEN-MINUS.

b
} 03:56, 21 December 2023 (UTC)
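As a minimal illustration of the replacement described above (the function name is hypothetical):

```python
def fix_hyphens(name: str) -> str:
    """Replace U+2010 HYPHEN with the standard U+002D HYPHEN-MINUS."""
    return name.replace("\u2010", "-")


print(fix_hyphens("Jean\u2010Pierre"))
```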

Repair url=www...

Status
 Fixed
Reported by
b
}
15:57, 16 December 2023 (UTC)
What should happen
[64]


i.e. change url=www.

to

url=https://www.

(same for chapter-url, archive-url, etc...)

b
} 15:57, 16 December 2023 (UTC)

How common is this? AManWithNoPlan (talk) 01:53, 20 December 2023 (UTC)
Possibly a lot, I haven't really checked, but a quick search shows this [65]. There are lots of other templates polluting this search, but from the first 100 results, it's at least an issue in [66], so around 1% of 1500? 15 articles ish?
It's more an issue of those getting normally fixed fairly easily by AWB runs or the like, which could be also done by this bot.
b
}
02:03, 20 December 2023 (UTC)
Note to self, look at ALL_URL_TYPES array AManWithNoPlan (talk) 23:08, 21 December 2023 (UTC)
@Headbomb: I have an AWB bot that goes through Category:CS1 errors: URL that I run a few times a month to fix issues like this, including the example you provided in this edit. The category only has 5,176 articles out of 6,817,401 articles, and most of those require manual intervention, so the commonality is well below 1%. Your search captures false positives, such as |url= in infoboxes (which should probably use {{URL}}) and URLs that contain "url=" in the middle. This modified search finds no articles to be fixed. However, if you find patterns in the category that bots can fix, please let me know. GoingBatty (talk) 23:11, 22 December 2023 (UTC)
@Headbomb: ...and I just manually used AWB to fix the drafts found in the category. GoingBatty (talk) 23:22, 22 December 2023 (UTC)
Will only work if url starts with www. since I recognize that it might not actually be a url. Now off to play with grandkiddo. Happy advent to all. AManWithNoPlan (talk) 23:35, 22 December 2023 (UTC)
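The repair as described, restricted to values that start with "www.", could be sketched like this. The tuple of parameter names is an illustrative subset rather than the bot's ALL_URL_TYPES array, and the function name is hypothetical.

```python
# Illustrative subset of URL-holding parameters; not the bot's full list.
URL_PARAMS = ("url", "chapter-url", "archive-url")


def repair_www(params):
    """Prefix https:// to URL parameters whose value starts with 'www.'."""
    fixed = dict(params)
    for key in URL_PARAMS:
        value = fixed.get(key, "")
        if value.lower().startswith("www."):
            fixed[key] = "https://" + value
    return fixed


print(repair_www({"url": "www.example.org/page", "title": "T"}))
```

Anything that does not start with "www." is left alone, since it might not be a URL at all.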

Oversimplification of title

Status
new bug
Reported by
David Eppstein (talk) 21:12, 23 November 2023 (UTC)
What happens
I don't care what the publisher says the main title of a reference should be; the bot should not take more-detailed versions of the correct title and oversimplify them by only keeping the main title as it did in Special:Diff/1186533390. The same bug (in the form of removing subtitles from titles) has been reported here and archived months ago but the same misbehavior persists. If you don't stop it I am going to start routinely excluding this bot from articles I edit.
What should happen
Not that
We can't proceed until
Feedback from maintainers


Does Citation bot have consensus to be making changes to existing, human-added titles based solely on the metadata it scrapes? This doesn't seem like a good outcome most of the time. Folly Mox (talk) 22:21, 23 November 2023 (UTC)


There the bot is right though. The title is "Graph Drawing". As Springer themselves say, the suggested way to cite this is "Eppstein, D. (2009). Isometric Diamond Subgraphs. In: Tollis, I.G., Patrignani, M. (eds) Graph Drawing. GD 2008. Lecture Notes in Computer Science, vol 5417. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00219-9_37"

"16th International Symposium...." is the expanded subtitle of GD 2008. One could replace it with "Graph Drawing: 16th International Symposium..." instead of "Graph Drawing. GD 2008."

But the word "Proceedings" is nowhere in there, and shouldn't be.

b
} 00:42, 24 November 2023 (UTC)

The title is not "Graph Drawing". The title suggested at the top of the publisher web page for the individual doi is "International Symposium on Graph Drawing, GD 2008: Graph Drawing". The title given on the landing page for the book doi is "Graph Drawing 16th International Symposium, GD 2008, Heraklion, Crete, Greece, September 21-24, 2008, Revised Papers". The title printed on the cover of the book [67] is similar but with line breaks replacing more of the punctuation. The title given in zbMATH [69] is again almost the same, "Graph drawing. 16th international symposium, GD 2008, Heraklion, Crete, Greece, September 21–24, 2008. Revised papers". The title given in MathSciNet [70] is "Graph drawing. Revised papers from the 16th International Symposium (GD 2008) held in Heraklion, September 2008".
All of these are vastly preferable to "Graph Drawing" because they actually identify the precise volume that the work in question comes from, which "Graph Drawing" alone does not. Their preferability should be obvious to anyone who puts actual thought into what citations are for rather than thinking of them as mechanical reproductions of flawed databases. It is exactly that unthinking "we must do what our database of publisher titles says even when it is stupid and uninformative" attitude that I am objecting to here and will continue to strongly object to on individual articles where this attitude translates into disimprovements.
As well, the bot dropped the wikilink on the title into the bit bucket, when it would have been preferable to keep it or move it to a title-link parameter. —David Eppstein (talk) 01:43, 24 November 2023 (UTC)
I keep seeing more of these on my watchlist, and have begun completely blocking citation bot from the affected articles. It won't take much more of this continued damage for me to switch to completely blocking citation bot from all articles that I edit. —David Eppstein (talk) 22:54, 24 November 2023 (UTC)
I'm not there yet, but I did recently turn off the "hide bot edits" watchlist toggle for the first time in a decade or so because of this script. I don't think Citation bot is a bad tool – despite my accumulating complaints on this talk page – but it's not better than a human: just faster.
I do note that the BRFA that supported Citation bot adding missing parameters (Wikipedia:Bots/Requests for approval/DOI bot 2, 2008) specifically says If the CrossRef database contradicts the information in the article, the bot will stick with the data already in Wikipedia, and assume the error to be with CrossRef. This seems wise, and I'm wondering when the behaviour was changed, and where the consensus for the change arose. Folly Mox (talk) 23:26, 24 November 2023 (UTC)
If a citation includes a title-link parameter or a wikilink in the title itself, that seems like a pretty good sign that a human took the trouble to get the information right. A bot shouldn't override that.
talk
) 18:57, 25 November 2023 (UTC)
This is a subgenre of the issue: existing parameters should be known before an edit is made. If |title-link= is present, |title= should not be altered outside of punctuation changes. If |periodical= (or one of its aliases) is present, the wrapper template should not be changed to {{cite book}}. If adding |chapter=, and |journal= or |issue= is present, the wrapper template should be changed to {{cite conference}} rather than {{cite book}}. If none of |title= and |chapter= match the existing |title= (delta punctuation), there's a mismatch between the database record and the work intending to be cited. Folly Mox (talk) 20:54, 25 November 2023 (UTC)
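The checks listed in the comment above can be written down as guard functions. This is a sketch of those rules only, with hypothetical names, a partial alias set, and a deliberately crude comparison that ignores case and punctuation. It does not describe how the bot behaves.

```python
import re

# Partial list of |periodical= aliases, for illustration.
PERIODICAL_ALIASES = {"periodical", "journal", "magazine", "newspaper", "website", "work"}


def loose(text: str) -> str:
    """Reduce a title to lower-case letters and digits for comparison."""
    return re.sub(r"[\W_]+", "", text).lower()


def may_change_title(params, new_title: str) -> bool:
    """Allow only punctuation-level changes when a human has linked the title."""
    old = params.get("title", "")
    if not old:
        return True  # nothing to overwrite
    if "title-link" in params or "[[" in old:
        return loose(old) == loose(new_title)
    return True


def choose_wrapper(params, current: str, proposed: str, adding_chapter: bool = False) -> str:
    """Pick the template name to use after an edit, following the rules above."""
    if adding_chapter and ("journal" in params or "issue" in params):
        return "cite conference"
    if proposed == "cite book" and any(k in params for k in PERIODICAL_ALIASES):
        return current  # a periodical is present, so do not switch to cite book
    return proposed


print(may_change_title({"title": "[[Graph Drawing]]: 16th International Symposium"}, "Graph Drawing"))
```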

Ok, after seeing this keep going and going with no effort to fix or address the problem, I am going to start adding {{bots|deny=Citation bot}} to all new articles I create, instead of merely the ones where I see this happening. —David Eppstein (talk) 07:55, 3 December 2023 (UTC)

If this goes on for too much longer, the next step will be to ask for a full block of the bot. This continued non-response to this problem is unacceptable. —David Eppstein (talk) 01:11, 4 December 2023 (UTC)
I have added "graph drawing" to the rejection list. AManWithNoPlan (talk) 01:38, 4 December 2023 (UTC)
This applies to all Springer LNCS proceedings, not just that one. —David Eppstein (talk) 19:15, 5 December 2023 (UTC)
https://github.com/ms609/citation-bot/commit/6d644b3bbd7fa038c174e8977cb1ad3e09a60ba7 AManWithNoPlan (talk) 19:51, 5 December 2023 (UTC)

More date format repairs

Status
 Fixed
Reported by
b
}
09:18, 23 December 2023 (UTC)
What should happen
[71], [72]


April-May 1995 to April–May 1995

December 7 2023 to December 7, 2023

AManWithNoPlan (talk) 20:57, 23 December 2023 (UTC)
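The two repairs listed above can be sketched with a pair of regular expressions. The function name is hypothetical and the patterns cover only these two shapes of date.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)


def repair_date(value: str) -> str:
    """Apply the month-range and missing-comma repairs."""
    # "April-May 1995" becomes "April–May 1995" (hyphen replaced by an en dash)
    value = re.sub(rf"\b({MONTHS})-({MONTHS})\b", "\\1\u2013\\2", value)
    # "December 7 2023" becomes "December 7, 2023"
    value = re.sub(rf"^({MONTHS}) (\d{{1,2}}) (\d{{4}})$", r"\1 \2, \3", value)
    return value


print(repair_date("April-May 1995"))
print(repair_date("December 7 2023"))
```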

More invisible character cleanup

Status
 Fixed
Reported by
b
}
23:33, 22 December 2023 (UTC)
What should happen
[73]


Unicode is one of the most useful tools ever invented that is also pure evil. AManWithNoPlan (talk) 23:36, 22 December 2023 (UTC)
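As a rough idea of what this kind of cleanup involves, here is a Python sketch that drops Unicode "format" characters (general category Cf, which includes the zero-width space, left-to-right mark, and byte-order mark). Which characters the bot actually removes is not stated here, so the choice of Cf is an assumption, and `strip_invisible` is a hypothetical name.

```python
import unicodedata


def strip_invisible(text: str) -> str:
    """Remove characters in Unicode category Cf (invisible formatting characters).

    Caution: Cf also contains joiners that some scripts and emoji sequences
    rely on, so a real cleanup would likely need an allow-list.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
```

For example, `strip_invisible("Homo\u200b habilis\u200e")` gives `"Homo habilis"`.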

Forced "editor" parameter changes

Status
{{fixed}} Will not add editors for those DOIs. Secondly, will consider "others" to be a type of editor, so if "others" is set, then editors will not be added.
Reported by
Aza24 (talk) 05:03, 25 December 2023 (UTC)
What happens
"others=Revised by" turns into "editor=". These are completely separate parameters. The article in question (along with many others in which I've had to continuously revert this bot) has writers who come and revise article content years later, not act in an editorial position! Grove literally says "Revised by" in the articles. The editor of the encyclopedia is an entirely different person.
What should happen
Nothing!!
Relevant diffs/links
here and here
We can't proceed until
Feedback from maintainers


It is just really dandy that the publisher reports that information as the editors. https://api.crossref.org/works/10.1093/gmo/9781561592630.article.48611 AManWithNoPlan (talk) 21:31, 25 December 2023 (UTC)

The editor of Grove is Deane Root; the subject editors are other people as well. Grove is clearly inconsistent on how they label the revisers; if you use their citation generator, it just lists them all as authors, yet they note that "James Holmes, revised by Anthony Tommasini and Arlys McDonald" at the article's top. I don't like to list all as authors, since one can easily look at the past revision and see that the article has a clear primary author, whose text was either slightly amended or added to. Readers are expecting an "editor" in a citation to be one in the traditional sense, which these people are certainly not. Aza24 (talk) 00:45, 26 December 2023 (UTC)
It is just really dandy that a bot, whose raison d'etre is to clean up citations on Wikipedia that humans have messed up by cramming metadata into the wrong fields, puts all its trust into citations messed up in exactly the same way on other sites, and uses that messed-up data to replace better data in cases when the human editors here have already put care into getting it right. —David Eppstein (talk) 00:57, 26 December 2023 (UTC)
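The fix described in the status line of this report (skip editors for the affected DOIs, and treat |others= as a kind of editor) might be sketched like this in Python. The skip-list name, the function name, and the dict representation are assumptions for illustration; the only prefix listed is the one from the Crossref link in this thread.

```python
# Hypothetical skip list; "10.1093/gmo/" is the prefix of the DOI linked in this thread.
NO_EDITOR_DOI_PREFIXES = ("10.1093/gmo/",)


def should_add_editors(params: dict) -> bool:
    """Decide whether editor names from a database may be added to a citation."""
    doi = params.get("doi", "").lower()
    if doi.startswith(NO_EDITOR_DOI_PREFIXES):
        return False  # publisher metadata is known to mislabel revisers as editors
    for name, value in params.items():
        # A non-empty |others= or any existing editor parameter blocks the addition.
        if (name == "others" or name.startswith("editor")) and value.strip():
            return False
    return True
```

With this check, a citation carrying |others=Revised by ... is left alone even when the DOI is not on the skip list.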

Christian Science Monitor is not an academic journal

Status
 Fixed by flagging as a "magazine", which is cite periodical
Reported by
Folly Mox (talk) 13:28, 26 December 2023 (UTC)
What happens
Christian Science Monitor
incorrectly identify it as a journal, fail to retrieve authorship and publication date
What should happen
{{cite periodical}}, but {{cite news}} may have been equally appropriate.
We can't proceed until
Feedback from maintainers


Isn't {{cite news}} and probably |work=The Christian Science Monitor a better choice? At The Christian Science Monitor we describe the organization as a 'nonprofit news organization that publishes daily articles both in electronic format and a weekly print edition' originally established 'as a daily newspaper'.

Trappist the monk (talk) 16:14, 26 December 2023 (UTC)

Yes, I see now they don't have issue numbers. Fixed. Folly Mox (talk) 17:19, 26 December 2023 (UTC)
Swapped to cite news. AManWithNoPlan (talk) 17:33, 26 December 2023 (UTC)

Bot down?

Seems to not be working ATM.

b
} 20:14, 5 January 2024 (UTC)

Also experiencing this - the main page loads, but any attempt to run the bot on a specific page results in an HTTP ERROR 500 message. —Ganesha811 (talk) 23:22, 5 January 2024 (UTC)
Rebooted and  Fixed AManWithNoPlan (talk) 02:34, 6 January 2024 (UTC)

Caps: CRISPR

Status
 Fixed
Reported by
b
}
20:33, 5 January 2024 (UTC)
What should happen
[74]


Enable 1-click activation of Category:CS1 maint: date format

Would be useful to clear most of that category.

b
} 06:30, 5 January 2024 (UTC)


Same for

b
} 07:09, 5 January 2024 (UTC)

{{fixed}} AManWithNoPlan (talk) 15:21, 5 January 2024 (UTC)
Nope, not fixed.
b
}
06:10, 6 January 2024 (UTC)
 Fixed I did not see the one in the title. AManWithNoPlan (talk) 15:28, 6 January 2024 (UTC)