Wikipedia:Bots/Requests for approval/IndentBot: Difference between revisions

Source: Wikipedia, the free encyclopedia.

Revision as of 14:18, 27 February 2022


Operator:


Time filed: 03:20, Friday, October 15, 2021 (UTC)

Function overview: Adjust indentation on discussion pages.

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python, pywikibot

Source code available: On Github

Links to relevant discussions (where appropriate): Wikipedia:Bot_requests/Archive_83#Bot_to_fix_indents

Edit period(s): Continuous (tracking recent changes on a delay)

Estimated number of pages affected: Depends on parameters. With delay of 10 minutes, around 20-30 pages are checked per 10 minutes (see function details below). Initially, most pages having substantial content will be edited, but since the bot processes the entire page, this will get reduced over time as it covers more ground.

Namespace(s): All talk namespaces, and the project namespace. Not sure if any other namespaces have discussion pages.

Exclusion compliant (Yes/No): Yes, uses pywikibot's save function.

Function details: First, the wikitext is partitioned into lines in the usual manner using \n as a delimiter, except that certain newlines, such as those immediately preceding a table, template, or tag (as detected by WikiTextParser), are not considered the end of a line. Then we apply fix_gaps, fix_extra_indents, and fix_indent_style, in that order, to the sequence of lines.

Definitions

  • The indentation characters are *, :, and #.
  • Given a line X, we denote the indentation characters of the line by indent_text(X), and we denote the indentation level by lvl(X). In particular, if X is not indented then lvl(X) == 0.
  • A blank line is a line consisting of whitespace only.
  • A gap is a nonempty contiguous sequence of blank lines sandwiched between two indented lines, which are called the opening line and closing line.
  • The length of a gap is the length of the sequence of blank lines.
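The definitions above can be sketched in Python as follows. This is a minimal illustration, not the bot's actual source: the helper names is_blank and find_gaps are hypothetical, and the template/table/tag-aware line splitting described in the function details is not modelled (lines are assumed to be already split).

```python
import re

def indent_text(line: str) -> str:
    """Return the leading run of indentation characters (*, :, #)."""
    return re.match(r"[*:#]*", line).group()

def lvl(line: str) -> int:
    """Indentation level: the number of leading indentation characters."""
    return len(indent_text(line))

def is_blank(line: str) -> bool:
    """A blank line consists of whitespace only (hypothetical helper)."""
    return line.strip() == ""

def find_gaps(lines):
    """Return (start, end) index pairs, end exclusive, of every gap:
    a run of blank lines sandwiched between two indented lines."""
    gaps = []
    i, n = 0, len(lines)
    while i < n:
        if not is_blank(lines[i]):
            i += 1
            continue
        j = i
        while j < n and is_blank(lines[j]):
            j += 1
        # lines[i - 1] is the opening line, lines[j] the closing line.
        if i > 0 and j < n and lvl(lines[i - 1]) > 0 and lvl(lines[j]) > 0:
            gaps.append((i, j))
        i = j
    return gaps
```

The length of a gap (start, end) is then end - start.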

Fixes

  1. fix_gaps: This fix has many variations. Let A and B be the opening and closing lines, respectively. No gap with an opening or closing line beginning with # is removed. Otherwise, all length 1 gaps are removed, and longer gaps are removed only if lvl(B) > 1.
  2. fix_extra_indents: We iterate over the lines from beginning to end. If we encounter a line A followed by a line B such that lvl(B) > lvl(A) + 1, then the subsequent chunk of lines which have indentation level greater than or equal to lvl(B), beginning with B, is shifted to the left by lvl(B) - lvl(A) - 1 positions. This is done by stripping out indent_text[lvl(A):lvl(B)-1] (in Python notation) from these lines.
  3. fix_indent_style: We iterate over the lines from beginning to end and adjust the indent_text of each line to use corresponding characters from the closest previous line with the same or smaller level, except that # characters are not removed from, introduced to, or shifted inside a line.
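As a rough illustration of the first two fixes, here is a sketch that follows the wording above literally. It is an assumption-laden reading, not the bot's code: the unstated edge-case exceptions are ignored, lines are assumed to be already split, and fix_indent_style is omitted because its exact matching rule is not fully specified here.

```python
def lvl(line: str) -> int:
    """Number of leading indentation characters (*, :, #)."""
    return len(line) - len(line.lstrip("*:#"))

def fix_gaps(lines):
    """Remove gaps per the stated rule: never when the opening or closing
    line begins with #; otherwise length-1 gaps always, and longer gaps
    only if the closing line has level > 1."""
    out = []
    i, n = 0, len(lines)
    while i < n:
        if lines[i].strip() or not out:
            out.append(lines[i])
            i += 1
            continue
        j = i
        while j < n and not lines[j].strip():
            j += 1
        opening = out[-1]
        closing = lines[j] if j < n else ""
        is_gap = lvl(opening) > 0 and lvl(closing) > 0
        has_hash = opening.startswith("#") or closing.startswith("#")
        removable = is_gap and not has_hash and (j - i == 1 or lvl(closing) > 1)
        if not removable:
            out.extend(lines[i:j])  # keep the blank lines untouched
        i = j
    return out

def fix_extra_indents(lines):
    """Where line B is more than one level deeper than the preceding line A,
    shift B and the following lines of level >= lvl(B) left by
    lvl(B) - lvl(A) - 1, i.e. strip indent_text[lvl(A):lvl(B)-1]."""
    out = list(lines)
    for i in range(len(out) - 1):
        a, b = lvl(out[i]), lvl(out[i + 1])
        if b > a + 1:
            j = i + 1
            while j < len(out) and lvl(out[j]) >= b:
                out[j] = out[j][:a] + out[j][b - 1:]
                j += 1
    return out
```

For example, under this reading fix_extra_indents turns [":a", ":::b", "::::c", ":d"] into [":a", "::b", ":::c", ":d"].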

The above description leaves out some details (namely some exceptions for edge cases). The fixes are repeatedly applied in the above order until another round won't alter the page (one round is almost always enough).

It's basically impossible to handle all edge cases and it's not difficult to come up with some of them, especially when you use ordered lists and combinations of possible mistakes. The hope is these are rare enough to be acceptable.

The bot tracks recent changes on a delay of delay minutes, in chunks of chunk minutes (delay and chunk are bot parameters). It checks for non-minor, non-bot edits that add a user signature and that have not been superseded in the most recent delay minutes. The effect of this is that IndentBot is activated by signature-adding edits only, and does not edit any page which has had a signature-adding edit in the most recent delay minutes. I believe delay should be set to 10 to 30 minutes. Too long a delay results in editors manually fixing indentation in active discussions, partially defeating the purpose of the bot. Non-talk pages must have at least 3 signatures to be edited, ensuring that a single accidental signature on a non-discussion page doesn't trigger the bot. Most sandboxes are avoided.
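The selection rule can be sketched as a pure function over a list of edit records. Everything here is hypothetical and for illustration only: the Edit record, its field names, and pages_to_check are not taken from the bot or from pywikibot, and the sketch assumes that any signature-adding edit inside the most recent delay minutes supersedes earlier ones.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Edit:
    # Hypothetical record of one recent change.
    title: str
    timestamp: datetime
    minor: bool
    bot: bool
    adds_signature: bool

def pages_to_check(edits, now, delay, chunk):
    """Titles with a qualifying edit in [now - delay - chunk, now - delay)
    and no signature-adding edit in the most recent `delay` minutes."""
    window_end = now - timedelta(minutes=delay)
    window_start = window_end - timedelta(minutes=chunk)
    candidates = {
        e.title
        for e in edits
        if e.adds_signature and not e.minor and not e.bot
        and window_start <= e.timestamp < window_end
    }
    # Assumption: a later signature-adding edit means the discussion is
    # still active, so the page is left for a later run.
    still_active = {
        e.title for e in edits
        if e.adds_signature and e.timestamp >= window_end
    }
    return candidates - still_active
```

The minimum of 3 signatures for non-talk pages and the sandbox exclusion would be separate checks applied to each returned title.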

Discussion

Approved for trial (50 edits or 7 days, whichever happens first). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 07:35, 20 October 2021 (UTC)[reply]

@
talk) 07:36, 20 October 2021 (UTC)[reply]
For the trial, let's go with "no" so that it receives a bit more scrutiny. I think if this goes through, marking as minor would match similar bots. Primefac (talk) 09:06, 20 October 2021 (UTC)[reply]

Trial complete. See the diffs here.

Trial complete. See the contributions here, or see the diffs in alphabetical order here.

talk) 14:50, 24 October 2021 (UTC)[reply]

Moved from
talk) 11:10, 26 October 2021 (UTC) First time moving a discussion, tell me if I did it incorrectly.[reply]

This ?bot? made a useless edit here: https://en.wikipedia.org/w/index.php?title=Talk:Bicarbonate&curid=1450293&diff=1051599806&oldid=1051598562

which has no effect on the output we see. I thought that bots were not permitted to make cosmetic-only changes. Even if the extra blank line is redundant, there is no need to remove it! Graeme Bartlett (talk) 10:18, 26 October 2021 (UTC)[reply]

Break

What's the status of this? Not sure where to go from here. I've noticed that on mobile, bulleted and unbulleted comments don't line up (check here for example), so the bot is even more effective there.

talk) 01:06, 29 October 2021 (UTC)[reply]

{{
talk) 09:17, 31 October 2021 (UTC)[reply]
I think this needs another round of trial, this time a larger one. The CfD templates have been fixed per
WP:RFUD, which is where I assume @Graeme Bartlett is coming from, the issue seems to be that {{UND}} when substed produces a bullet indent, but most users haven't noticed this and are adding an indent character of their own anyway.
Also, I think the issue of changing the final indent character should be discussed. I don't have any preferences, but I think changing a visible bullet to no bullet (or vice versa, see several cases in [2]) can be seen as intrusive. Would like to hear others' thoughts on this. – SD0001 (talk) 12:59, 31 October 2021 (UTC)[reply]
Apologies for radio silence on this one, it's relatively low-priority at this point in my life, but I do agree based on a read-through here that a further trial would probably be good. Primefac (talk) 13:03, 31 October 2021 (UTC)[reply]
I did realize that changing the final (and hence visual) character could be annoying, but the point is that mixing characters shouldn't happen in the first place. So if the final indent character is not changed, it neuters a large portion of the fixes. Even a simple single-level list such as
* Comment 1.
: Comment 2.
* Comment 3.
: Comment 4.
would be left as four separate lists in HTML and to screen readers. Let me see if I can compute approximately what fraction of indentation style fixes occur in the final character.
talk) 13:17, 31 October 2021 (UTC)[reply]
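To make the "four separate lists" point concrete, here is a deliberately simplified model, not MediaWiki's actual parser: it assumes a new top-level list starts whenever the first indentation character differs from that of the previous line, and it ignores nesting as well as the ; definition-term syntax. The function name count_top_level_lists is hypothetical.

```python
def count_top_level_lists(lines):
    """Count maximal runs of consecutive lines sharing the same first
    indentation character; each run is treated as one top-level list."""
    count = 0
    prev = None
    for line in lines:
        first = line[0] if line and line[0] in "*:#" else None
        if first is not None and first != prev:
            count += 1
        prev = first
    return count

mixed = ["* Comment 1.", ": Comment 2.", "* Comment 3.", ": Comment 4."]
uniform = ["* Comment 1.", "* Comment 2.", "* Comment 3.", "* Comment 4."]
print(count_top_level_lists(mixed), count_top_level_lists(uniform))
```

Under this model the mixed example counts as four lists and the uniform one as a single list.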
In
talk) 13:31, 31 October 2021 (UTC)[reply]
@
talk) 14:26, 31 October 2021 (UTC)[reply]
Indeed it doesn't. I assumed that was the reason why so many of the RFUD comments were over-indented ([3]). – SD0001 (talk) 14:36, 31 October 2021 (UTC)[reply]

@

talk) 10:10, 1 November 2021 (UTC)[reply]

With this strategy, the number of lines with altered final character gets reduced by 25% to 630.
talk) 13:27, 1 November 2021 (UTC)[reply]

I've made a number of slight improvements to each of the three fixes and I think the bot is ready for a third trial. I don't think the final character issue can be mitigated any more without simply ignoring final character INDENTMIX violations. I guess we can see whether anyone complains during/after the trial. I'll continue the non-minor edit policy except for user talk pages for the trial to draw more scrutiny.

talk) 13:56, 2 November 2021 (UTC)[reply]

@

talk) 11:04, 4 November 2021 (UTC)[reply]

Sure, go ahead. Approved for extended trial (200 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. SD0001 (talk) 06:48, 5 November 2021 (UTC)[reply]
An indent-bot is definitely required. Some editors make mistakes with their indents. Some simply don't know how to indent. Most frustrating? Some deliberately mis-indent (usually after their mistakes have been pointed out), and when they 'continue' to deliberately mis-indent, it's basically their way of giving you (the adviser) the figurative 'middle finger'. GoodDay (talk) 17:47, 5 November 2021 (UTC)[reply]

Trial feedback

Examples
  • I just reverted this massive refactoring when I saw this bot editing my discussion; I chose bullets on purpose to break that section apart. — xaosflux Talk 13:53, 5 November 2021 (UTC)[reply]
  • Here is another example: diff - this doesn't make sense, that first line was clearly not intended to be part of the "discussion" - so was stylized differently. — xaosflux Talk 14:01, 5 November 2021 (UTC)[reply]
  • More bad edits (already reverted by another editor). — xaosflux Talk 14:03, 5 November 2021 (UTC)[reply]
  • Let's not chase around another bot example that I can assume was specifically programmed to edit one way already. — xaosflux Talk 14:09, 5 November 2021 (UTC)[reply]
  • Another example diff that made the new list worse, see the section around "Person who is autistic" - where this bot has introduced double bullets. — xaosflux Talk 14:54, 5 November 2021 (UTC)[reply]
Discuss
  • I think this task is going to need a much larger discussion before being released on all edits, all the time; I expect it will continue to make contentious edits that don't have a policy to support them (i.e. a policy that only certain indentation or list styles are allowed to be used). — xaosflux Talk 13:58, 5 November 2021 (UTC)[reply]

Changes to bot

In the original bot request for a bot to fix indentation, two examples were given. The first example was the removal of a single extra indent (a general fix), and the second was a non-final-indent-character indentmix fix (an accessibility fix). I decided to tackle this request, but caught feature creep and took the idea too far. This ended up making some "fixes" the very opposite, as the last trial demonstrated. I believe the issues brought up (other than procedural issues like editing user talks and missing the bot flag) were due to the features I implemented beyond the original request, and I apologize.

I have limited the bot to listgap and non-final-character indentmix fixes only. Indentation levels and final indentation characters are not changed (so the first example in the original bot request would actually be left alone). Here are some sandbox diffs. These are accessibility changes, and the only noticeable change for sighted readers should be the hiding of “floating bullets”, which are bullet points that appear anywhere other than as the last indent character. For example,

Markup:
:One.
*: Two
*** Three.

Renders as:
One.
  • Two
      • Three.

would become

Markup:
:One.
:: Two
::* Three.

Renders as:
One.
Two
  • Three.

talk) 10:56, 7 November 2021 (UTC)[reply]
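One way to read the non-final-character indentmix fix is sketched below. This is a guess at a rule that reproduces the example above, not the bot's actual logic: each line copies the indent characters of the immediately preceding line into its own non-final positions, and # characters are never replaced or overwritten. The name fix_nonfinal_indent_style is hypothetical.

```python
import re

def indent_text(line: str) -> str:
    """Leading run of indentation characters (*, :, #)."""
    return re.match(r"[*:#]*", line).group()

def fix_nonfinal_indent_style(lines):
    """Align non-final indent characters with the previous line's indent.
    The final indent character and the indentation level never change."""
    out = []
    prev = ""  # indent text of the previous line, after fixing
    for line in lines:
        cur = indent_text(line)
        chars = list(cur)
        # Only positions before the final character are candidates.
        for k in range(min(len(prev), len(cur) - 1)):
            if chars[k] != "#" and prev[k] != "#":
                chars[k] = prev[k]
        new = "".join(chars)
        out.append(new + line[len(cur):])
        prev = new
    return out

print(fix_nonfinal_indent_style([":One.", "*: Two", "*** Three."]))
```

On the three-line example this yields ":One.", ":: Two", "::* Three.", matching the markup shown above.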

@Notsniwiast: here is just a sample mixed-up list - what, if anything, would you do to it?
Extended content
  • A
    • A
      • A
        A
        A
        • A
          1. A
          2. A
          3. A
        • A
          1. A
          2. A
          3. A
        A
      • A
      • A
    • A
  • A
xaosflux Talk 13:05, 7 November 2021 (UTC)[reply]
I've just tested it on this list. It does nothing.
talk) 13:09, 7 November 2021 (UTC)[reply]
@Xaosflux: Please see the above. --TheSandDoctor Talk 07:39, 29 December 2021 (UTC)[reply]

Chunk 1 (20 diffs)

Chunk 2 (50 diffs)

Chunk 3 (60 diffs)

Chunk 4 (70 diffs)

  • See here. There was only one error here where a bullet point was introduced. This was due to a template creating a table which the bot did not anticipate. The bot now expands templates to check for tables. Trial complete.
    talk) 06:59, 10 January 2022 (UTC)[reply]

Gonna take a break. Code is still available. Withdrawing this request.

talk) 05:15, 13 January 2022 (UTC)[reply]

Well, as the trial has been completed, assuming no issues are found, you don't have to do anything more as of now. If this is approved, you can start running the bot whenever you return. – SD0001 (talk) 09:51, 13 January 2022 (UTC)[reply]
SD0001, are you approving this request, or are you accepting their withdrawal? Primefac (talk) 15:10, 23 January 2022 (UTC)[reply]
@SD0001? theleekycauldron (talk • contribs) (she/they) 10:10, 13 February 2022 (UTC)[reply]
Left a note at their talk page. Primefac (talk) 14:18, 27 February 2022 (UTC)[reply]