Wikipedia:Village pump (proposals)

Source: Wikipedia, the free encyclopedia.

The proposals section of the village pump is used to offer specific changes for discussion.

Discussions are automatically archived after remaining inactive for nine days.

RfC: Extended confirmed pending changes (PCECP)

Should a new pending changes protection level - extended confirmed pending changes (hereafter abbreviated as PCECP) - be added to Wikipedia? Awesome Aasim 19:58, 5 November 2024 (UTC)[reply]

Background

EC protection in topic areas authorized by the community or the arbitration committee. However, some administrators refuse to protect pages unless there is recent disruption. Extended confirmed pending changes would allow non-XCON users to propose changes for approval by an extended confirmed user, and could be applied preemptively to these topic areas.

It is assumed that it is technically possible to have PCECP. That is, we can have PCECP as "[auto-accept=extended confirmed users] [review=extended confirmed users]". Right now it might not be possible to have extended confirmed users review pending changes under this protection with the current iteration of FlaggedRevs, but it may be in the future.
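As a rough illustration of what "[auto-accept=extended confirmed users] [review=extended confirmed users]" would mean in practice, here is a minimal Python sketch. It is a toy model, not FlaggedRevs code: the level names, the `Page` class, and the rule that an edit is only auto-accepted when nothing is already pending are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical ordering of user levels; these names are illustrative only
# and are not identifiers taken from the FlaggedRevs extension.
LEVELS = {"anonymous": 0, "autoconfirmed": 1, "extendedconfirmed": 2, "sysop": 3}


@dataclass
class Revision:
    rev_id: int
    author_level: str
    text: str
    accepted: bool = False


@dataclass
class Page:
    auto_accept: str  # minimum level whose edits may be accepted automatically
    review: str       # minimum level allowed to accept pending edits
    revisions: list = field(default_factory=list)

    def has_pending(self) -> bool:
        # The page has pending changes when its newest revision is unaccepted.
        return bool(self.revisions) and not self.revisions[-1].accepted

    def edit(self, author_level: str, text: str) -> Revision:
        # Assumption: an edit is auto-accepted only if the author meets the
        # auto-accept level AND no earlier edit is still waiting for review.
        ok = (LEVELS[author_level] >= LEVELS[self.auto_accept]
              and not self.has_pending())
        rev = Revision(len(self.revisions) + 1, author_level, text, accepted=ok)
        self.revisions.append(rev)
        return rev

    def accept_latest(self, reviewer_level: str) -> None:
        # Accepting the newest revision makes the whole current text visible.
        if LEVELS[reviewer_level] < LEVELS[self.review]:
            raise PermissionError("reviewer level too low for this page")
        if self.revisions:
            self.revisions[-1].accepted = True

    def reader_view(self):
        # Logged-out readers see the newest accepted revision, if any.
        for rev in reversed(self.revisions):
            if rev.accepted:
                return rev.text
        return None


# Hypothetical PCECP page: both thresholds set to extended confirmed.
page = Page(auto_accept="extendedconfirmed", review="extendedconfirmed")
page.edit("extendedconfirmed", "v1")   # goes live immediately
page.edit("autoconfirmed", "v2")       # held for review
```

In this model readers keep seeing "v1" until an extended confirmed user calls `accept_latest`, which is the behaviour the proposal describes; whether FlaggedRevs can actually be configured this way is the open question raised above.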

Survey (PCECP)

Support (PCECP)

  • Support for multiple reasons:
    WP:ARBECR encourages the use of pending changes when protection is not used. Third, pending changes effectively serves to allow uncontroversial edit requests without needing to create a new talk page discussion. And lastly, this is in line with our protection policy, which states that protection should not be applied preemptively in most cases. Awesome Aasim 19:58, 5 November 2024 (UTC)[reply]
  • Support (per... nom?) PC is the superior form of uncontroversial edit requests. Aaron Liu (talk) 20:09, 5 November 2024 (UTC)[reply]
    It's better than EC, which already restricts being the free encyclopedia more. As I've said below, the VisualEditor allows much more editing from new people than edit requesting, which forces people to use the source editor. Aaron Liu (talk) 03:52, 6 November 2024 (UTC)[reply]
    This is not somehow less or more restrictive than ECR. It's exactly the same level of protection, just implemented in a different way. I do not get the !votes from either side who either claim that this will be more restriction or more bureaucracy. I understand neither, and urge them to explain their rationales. Aaron Liu (talk) 12:32, 12 November 2024 (UTC)[reply]
    By creating a difference between what non logged-in readers (that is, the vast majority of them) see versus logged-in users, there is an extra layer of difficulty for non-confirmed and non-autoconfirmed editors, who won't see the actual page they're editing until they start the editing process. Confirmed and autoconfirmed editors may also be confused that their edits are not being seen by non-logged in readers. Because pending changes are already submitted into the linear history of the article, unwinding a rejected edit is potentially more complicated than applying successive edit requests made on the talk page. (This isn't a significant issue when there aren't many pending changes queued, which is part of the reason why one of the recommended criteria for applying pending changes protection is that the page be infrequently edited.) For better or worse, there is no deadline to process edit requests, which helps mitigate issues with merging multiple requests, but there is pressure to deal with all pending changes expediently, to reduce complications in editing. isaacl (talk) 19:54, 12 November 2024 (UTC)[reply]
    Do you think this would be fixed with "branching" (similar to GitHub branches)? In other words, instead of PC giving the latest edit, PC just gives the edit of the stable revision and when "Publish changes" is clicked it does something like put the revision in a separate namespace (something like Review:PAGENAME/#######) where ####### is the revision ID. If the edit is accepted, then that page is merged and the review deleted. If the edit is rejected the review is deleted, but can always be restored by a Pending Changes Reviewer or administrator. Awesome Aasim 21:01, 12 November 2024 (UTC)[reply]
    Technically, that would take quite a bit to implement. Aaron Liu (talk) 23:18, 12 November 2024 (UTC)[reply]
    There are a lot of programmers who struggle with branching; I'm not certain it's a great idea to make it an integral part of Wikipedia editing, at least not in a hidden, implicit manner. If an edit to an article always proceeded from the last reviewed version, editors wouldn't be able to build changes on top of their previous edits. I think at a minimum, an editor would have to be able to do the equivalent of creating a personal working branch. For example, this could be done by working on the change as a subpage of the user's page (or possibly somewhere else (perhaps in the Draft namespace?), using some standard naming hierarchy), and then submitting an edit request. That would be more like how git was designed to enable de-centralized collaboration: everyone works in their own repository, rebasing from a central repository (*), and asks an integrator to pull changes that they publish in their public repository.
    (*) Anyone's public repository can act as a central repository. It just has to be one that all the collaborators agree upon using, and thus agree with the decisions made by the integrator(s) merging changes into that repository. isaacl (talk) 23:22, 12 November 2024 (UTC)[reply]
    That makes sense. This has influenced me to amend my Q2 answer slightly, but I still support the existence of this protection and the preemptive PC protecting of low-traffic pages. (Plus, it's still not more restriction.) Aaron Liu (talk) 23:20, 12 November 2024 (UTC)[reply]
  • Support, functionally a more efficient form of edit requests. The volume of pending changes is still low enough for this to be dealt with, and it could encourage the pending changes reviewer right to be given to more people currently reviewing edit requests, especially in contentious topics. Chaotic Enby (talk · contribs) 20:25, 5 November 2024 (UTC)[reply]
  • Support having this as an option. I particularly value the effect it has on attribution (because the change gets directly attributed to the individual who wanted it, not to the editor who processed the edit request). WhatamIdoing (talk) 20:36, 5 November 2024 (UTC)[reply]
  • Support: better and more direct system than preemptive extended-confirmed protection followed by edit requests on the talk page. Cremastra (uc) 20:42, 5 November 2024 (UTC)[reply]
  • Support, Pending Changes has the capacity to take on this new task. PC is much better than the edit request system for both new editors and reviewers. It also removes the downsides of slapping ECP on everything within contentious topic areas. Toadspike [Talk] 20:53, 5 November 2024 (UTC)[reply]
    I've read the opposes below and completely disagree that this would lead to more gatekeeping. The current edit request system is extremely complicated and inaccessible to new users. I've been here for half a decade and I still don't really know how it works. The edit requests we do get are a tiny fraction of the edits people want to make to ECP pages but can't. PCECP would allow them to make those edits. And many (most?) edit requests are formatted in a way that they can't be accepted (not clear what change should be made, where, based on what source), a huge issue which would be entirely resolved by PCECP.
    The automatic EC protection of all pages in certain CTOPs is not the point of this proposal. Whether disruption is a prerequisite to protection is not altered by the existence of PCECP and has to be decided in another RfC at another venue, or by ArbCom. PCECP is solely about expanding accessibility to editing ECP pages for new and unregistered editors, which is certainly a positive move.
    I, too, hate the PC system at dewiki, and I appreciate that Kusma mentioned it. However, what we're looking at here is lowering protection levels and reducing barriers to editing, which is the opposite of dewiki's PC barriers. Toadspike [Talk] 10:24, 16 November 2024 (UTC)[reply]
  • Support (Summoned by bot): per above. C F A 💬 23:34, 5 November 2024 (UTC)[reply]
  • Support : Per above. PC is always at a low or very low backlog, therefore is completely able to take this change. ~/Bunnypranav:<ping> 11:26, 6 November 2024 (UTC)[reply]
  • Support: I would be happy to see it implemented. GrabUp - Talk 15:14, 6 November 2024 (UTC)[reply]
  • Support Agree with JPxG's principle that it is better to "have drama on a living project than peace on a dead one," but this is far less restrictive than preemptively setting EC protection for all WP:ARBECR pages. From a new editor's perspective, they experience a delay in the positive experience of seeing their edit implemented, but as long as pending changes reviewers are equipped to minimize this delay, then this oversight seems like a net benefit. New users will get feedback from experienced editors on how to operate in Wikipedia's toughest content areas, rather than stumbling through. ViridianPenguin 🐧 ( 💬 ) 08:57, 8 November 2024 (UTC)[reply]
  • Support * Pppery * it has begun... 05:17, 11 November 2024 (UTC)[reply]
  • Support Idk what it's like in other areas but in mine, of edit requests that I see, a lot, maybe even most of them are POV/not actionable/nonsense/insults so if it is already ECR only, then yea, more filtering is a good thing. Selfstudier (talk) 18:17, 11 November 2024 (UTC)[reply]
  • Support assuming this is technically possible (which I'm not entirely sure it is), it seems like a good idea, and would definitely make pending changes more useful from my eyes. Zippybonzo | talk | contribs (they/them) 20:00, 12 November 2024 (UTC)[reply]
  • Strong support per @JPxG:'s reasoning—I think it's wild that we're willing to close off so many articles to so many potential editors, and even incremental liberalization of editing restrictions on these articles should be welcomed. This change would substantially expand the number of potential editors by letting non-EC contributors easily suggest edits to controversial topic areas. It would be a huge win for contributions if we managed to replace most ECP locks with this new PCECP.– Closed Limelike Curves (talk) 02:07, 14 November 2024 (UTC)[reply]
  • Yes, in fact, somebody read my mind here (I was thinking about this last night, though I didn't see this VP thread...) Myrealnamm (💬Let's talk · 📜My work) 21:38, 14 November 2024 (UTC)[reply]
  • Support in principle. Edit requests are a really bad interface for new users; if discouraging people from editing is the goal, we've succeeded. Flagged revisions aren't the best, but they are better than edit request templates.
    WP:INVOLVED for reviewers? Right now, there's a big firewall between editors involved in content in an area like Israel-Palestine and admins using their powers in that area. Can reviewers edit in the area and use their tools? This needs to be clarified, as it seems like editing in PIA doesn't disqualify one from answering edit requests. Chess (talk) (please mention me on reply) 21:06, 18 November 2024 (UTC)[reply]

    the current review policy assumes a change is correct unless it's obvious vandalism or the like

    @Chess That's true, but reviewers are also currently expected to accept and revert if the change is correct but also irky for a revert. Below, Aasim clarified that reviewers should only reject edits that fail the existing PC review guidelines plus edits made in violation of an already well-established consensus.
    As for Involved, since there's no guidance about edit request reviewers yet either, I think that should be asked in a separate RfC. Aaron Liu (talk) 21:35, 18 November 2024 (UTC)[reply]
  • Support. The number of sysops is ever decreasing and so we will need to take drastic action to ensure maintenance and vandalism prevention can keep up. Stifle (talk) 17:29, 19 November 2024 (UTC)[reply]
  • Support in principle. While I understand objections from others based on the technical downsides and design of the current Flagged Revisions extension, I support making it easier for users to suggest changes with a GUI rather than a difficult-to-understand edit request template, which creates a barrier to entry. Frostly (talk) 05:24, 26 November 2024 (UTC)[reply]
  • Support - It seems to be entirely preferable to ECR. It would be interesting if any current or former Arbcom members were to see it as more problematic. — Charles Stewart (talk) 04:12, 28 November 2024 (UTC)[reply]

Oppose (PCECP)

Yeah that is what they are supposed to be but in practice they are not. As anyone who has answered edit requests before knows, there are often messages that look like this:
Extended content

The reference is wrong. Please fix it. 192.0.0.1 (talk) 23:19, 11 November 2024 (UTC)[reply]

  • Which is not in practice ]
    I don't see how that's much of a problem, especially as edits are also committed to the talk page's history. Aaron Liu (talk) 22:50, 11 November 2024 (UTC)[reply]
    Do the words "Provoke edit wars" mean anything? Talk page posts are far less likely to be the locus of an edit war than article edits. —]
    As an editor who started out processing edit requests, including ECP edit requests, I disagree.
    Aaron Liu (talk) 18:08, 14 November 2024 (UTC)[reply]
  • Oppose, per what JSS has said. I am a little uncomfortable at the extent to which we've seemingly accepted preemptive protection of articles in contentious areas. It may be a convenient way of reducing the drama us admins and power users have to deal with... but only at the cost of giving up on the core principle that anybody can edit. I would rather have drama on a living project than peace on a dead one. jp×g🗯️ 18:16, 7 November 2024 (UTC)[reply]
  • Oppose I am one of those admins who likes to see disruption before protecting. Lectonar (talk) 08:48, 8 November 2024 (UTC)[reply]
  • Oppose as unnecessary, seems like a solution in search of a problem. Furthermore, this *is* Wikipedia, the encyclopedia anyone can edit; preemptively protecting pages discourages contributions from new editors. -Fastily 22:36, 8 November 2024 (UTC)[reply]
  • Weak Oppose I do understand where this protection would be helpful. But I just think something is EC-protectable or not. Don't necessarily think adding another level of bureaucracy is particularly helpful. --Takipoint123 (talk) 05:14, 11 November 2024 (UTC)[reply]
  • Oppose. I'm inclined to agree that the scenarios where this tool would be of benefit as a technical solution would be exceedingly niche, and that such slim benefit would probably be outweighed by the impact of having yet one more tool to further nibble away at the edges of the open spaces of the project which are available to new editors. Frankly, in the last few years we have already had an absurdly aggressive trend towards community (and ArbCom fiat) decisions which have increasingly insulated anything remotely in the vein of controversy from new editors--with predictable consequences for editor recruitment and retention past the period of early involvement, further exacerbating our workloads and other systemic issues. We honestly need to be rolling back some of these changes, not adding yet one more layer (however thin and contextual) to the bureaucratic fabric/new user obstacle course. SnowRise let's rap 11:23, 12 November 2024 (UTC)[reply]
  • Oppose. The more I read this discussion, the more it seems like this wouldn't solve the majority of what it sets out to solve but would create more problems while doing so, making it on balance a net negative to the project. Thryduulf (talk) 21:43, 12 November 2024 (UTC)[reply]
  • Oppose and Point of Order Oppose because pending changes is already too complicated and not very useful. I'm a pending changes reviewer and I've never rejected one on PC grounds (basically vandalism). But I often revert on normal editor grounds after accepting on PC grounds. (I suspect that many PC rejections are done for non-PC reasons instead of doing this) "Point of Order" is because the RFC is unclear on what exactly is being opposed. Sincerely, North8000 (talk) 22:15, 12 November 2024 (UTC)[reply]
    Pretty sure that what happens is that when vandals realize they will have to submit their edit for review before it goes live, that takes all the fun out of it for them because it will obviously be rejected, and they don't bother. That's pretty much how it was supposed to work. Just Step Sideways from this world ..... today 22:22, 12 November 2024 (UTC)[reply]
    This is a very good point, and I ask for @Awesome Aasim's clarification on whether reviewers will be able to reject edits on grounds for normal reverts combined with the EC restriction. I think there's enough rationale to apply this here beyond the initial rationale for PC as explained by JSS above. Aaron Liu (talk) 23:24, 12 November 2024 (UTC)[reply]
    Reviewers are given specific reasons for accepting edits (see Wikipedia:Pending changes § Reviewing pending edits) to avoid overloading them with work while processing pending changes expeditiously. If the reasons are opened up to greater evaluation of the quality of edits, then expectations may shift towards this being a norm. Thus some users are concerned this will create a hierarchy of editors, where edits by non-reviewers are gated by reviewers. isaacl (talk) 23:44, 12 November 2024 (UTC)[reply]
    I understand that and wonder how the reviewer proposes to address this. I would still support this proposal if having reviewers reject according to whether they'd revert and "ostensibly" to enforce EC is to be the norm, albeit to a lesser extent for the reasons you mentioned (though I'd replace "non-reviewers" with "all non–auto-accepted"). Aaron Liu (talk) 00:13, 13 November 2024 (UTC)[reply]
    I'm not sure to whom you are referring when you say "the reviewer" – you're the one suggesting there's a rationale to support more reasons for rejecting a pending change beyond the current set. Since any pending change in the queue will prevent subsequent changes by non-reviewers from being visible to most readers, their edits too will get evaluated by a single reviewer before being generally visible. isaacl (talk) 00:59, 13 November 2024 (UTC)[reply]
    Sorry, I meant Aasim, the nominator. I made a thinko.
    Currently, reviewers can undo just the edits that aren't good and then approve the revision of their own revert. I thought that was what we were supposed to do. Aaron Liu (talk) 02:13, 13 November 2024 (UTC)[reply]
    Yes. Anything that is obvious vandalism or a violation of existing Wikipedia's policies can still be rejected. However, for edits where there is no other problem, the edit can still be accepted. In other words, a user not being extended confirmed shall not be sufficient grounds for rejecting an edit under PCECP, since the extended confirmed user takes responsibility for the edit. If the extended confirmed user accepts a bad edit, it is on them, not whoever made it. That is the whole idea.
    Of course obviously helpful changes such as fixing typos and adding up-to-date information should be accepted sooner, while more controversial changes should be discussed first. Awesome Aasim 17:37, 13 November 2024 (UTC)[reply]
    By or a violation of existing Wikipedia's policies, do you only mean violations of BLP, copyvio, and "other obviously inappropriate content" that may be very-quickly checked, which is the current scope of what to reject? Aaron Liu (talk) 17:41, 13 November 2024 (UTC)[reply]
    Yeah, but also edits made in violation of an already well-established consensus. Edits that enforce a clearly-established consensus (proven by previous talk page discussion), are, from my understanding, exempt from all ]
  • Oppose per Thryduulf and SnowRise. Also regardless of whether this is a good idea as a policy, FlaggedRevs has a large amount of technical debt, to the extent that deployment to any additional WMF wikis is prohibited, so it seems unwise to expand its usage.  novov talk edits 19:05, 13 November 2024 (UTC)[reply]
  • Oppose I have never found the current pending changes system easy to navigate as a reviewer. ~~ AirshipJungleman29 (talk) 20:50, 14 November 2024 (UTC)[reply]
  • Oppose the more productive approach would be to reduce the overuse of extended-confirmed protection. We have come to rely on it too much. This would be technically difficult and complex for little real gain. —Ganesha811 (talk) 18:30, 16 November 2024 (UTC)[reply]
    That's the goal of this proposal (reducing the overuse of ECP), and it provides a plausible mechanism for that (replacing it with the much-less stringent PCECP). How would you go about reducing overuse of ECP instead? – Closed Limelike Curves (talk) 23:29, 29 November 2024 (UTC)[reply]
    Would you support a version in which the reviewers remain PC patrollers? Aaron Liu (talk) 00:58, 30 November 2024 (UTC)[reply]
  • Oppose there might be a need for this but not preemptive. Andre🚐 01:31, 17 November 2024 (UTC)[reply]
    Wouldn't that be a support here for question #1, and an oppose in question #2? – Closed Limelike Curves (talk) 23:34, 29 November 2024 (UTC)[reply]
    Indeed, but as I've said below, it appears the rationale in the background section has confused many. Aaron Liu (talk) 00:58, 30 November 2024 (UTC)[reply]
  • Oppose. The pending changes system is awful and this would make it awfuler (that wasn't a word but it is now). Zerotalk 05:58, 17 November 2024 (UTC)[reply]
  • Oppose. How can we know that the 72,904 extended-confirmed users are capable of reviewing pending changes? I assume this is a step above normal PCP (e.g. PCP is preferred over PCECP), so how can reviewing semi-protected pending changes have a higher bar (requiring a request at WP:PERM) than reviewing extended-protected pending changes? Doesn't make much sense to me. — BerryForPerpetuity (talk) 14:15, 20 November 2024 (UTC)[reply]
    I do not think it is fixed that XCON users are the reviewers. This RfC is primarily about the creation of PCECP. ~/Bunnypranav:<ping> 14:21, 20 November 2024 (UTC)[reply]
    Well, they're capable of reviewing edit requests. Aaron Liu (talk) 14:39, 20 November 2024 (UTC)[reply]
    Sure, but assuming this will work the same as PCR, isn't it possible that an extended-confirmed user who doesn't want to review edits, will try to edit a PCECP page, and be required to review edits beforehand? They're not actively seeking out to review edits in the same way that a PCR or someone who handles edit requests does. Will their review be on par with the scrutiny required for this level of protection? — BerryForPerpetuity (talk) 14:55, 20 November 2024 (UTC)[reply]
    You do not need to review edits to edit the pending version of the page, which is what happens when you press save on a page with pending edits. Aaron Liu (talk) 15:02, 20 November 2024 (UTC)[reply]
    Is it not the case that reviewers need to check a page's pending changes to edit a page? Either way, the point of "what would constitute a revert" needs to be discussed and decided on before we start to implement this, which I appreciate you discussing above. — BerryForPerpetuity (talk) 15:38, 20 November 2024 (UTC)[reply]
    No. It's just that if the newest change is not reviewed, the last reviewed change is shown to readers instead of the latest change. Aaron Liu (talk) 16:00, 20 November 2024 (UTC)[reply]
    "How can we know that the 72,734 extended-confirmed users are capable of reviewing pending changes?" This isn't about pending changes level 1. This is about pending changes as applied to enforce ECP, with the level [auto-accept=extendedconfirmed] [review=extendedconfirmed]. As this is only intended to be used for WP:ARBECR restricted pages, it shouldn't be used for anything else.
    What might need to happen for this to work is there are ways to configure who can auto-accept and review changes individually (rather than bundled as is right now) with the FlaggedRevs extension. Something like this for these drop-downs:
    • Auto-accept:
      • All users
      • Autoconfirmed
      • Extended confirmed
      • Template editor
      • Administrators
    • Review:
      • Autoconfirmed
      • Extended confirmed and reviewers
      • Template editors and reviewers
      • Administrators
    Of course, autoreview will have auto-accept perms regardless of these settings, and review will have review perms regardless of these settings. Awesome Aasim 16:36, 20 November 2024 (UTC)[reply]
    I understand what you're saying, and I'm aware this isn't about level 1. I'm not strongly opposed to PCECP, but my original point was talking about the difference in reviewer requirements for semi-protected PC and XCON PC. If this passes, it would make reviewing semi-protected pending changes require a permission request, but reviewing extended-protected pending changes would only require being extended-confirmed. If that could be explained so I could understand it better, I'd appreciate it.
    This also relates to edit requests. XCON users are capable of reviewing edit requests, because they don't have to implement what the request was verbatim. If a user makes a request that has good substance, but has a part that doesn't adhere to some policy (MOS, NPOV, etc.), the reviewer can change it to fit policy. With pending changes, there's really no way to do that besides editing the accepted text after accepting it. The edit request reviewer can ask for clarification on something, add notes, give a reason for declining, etc.
    Especially on pages that have ARBCOM enforcement on them, the edit request system is far better than the pending changes system. This approach seems to be a solution for the problem of over-protection, which is what should actually be addressed. — BerryForPerpetuity (talk) 17:22, 22 November 2024 (UTC)[reply]
    Personally, I would also support this change if only reviewers may accept.
    I think editing a change after acceptance is superior. It makes clear which parts were written by whom (and thus much easier to satisfy our CC license). Aaron Liu (talk) 17:43, 22 November 2024 (UTC)[reply]
    Identifying which specific parts were written by whom isn't necessary for the CC BY-SA license. (And since each new revision is a new derivative work, it's not that easy to isolate.) isaacl (talk) 18:50, 22 November 2024 (UTC)[reply]
    Right, but there's no need to forget the attributive edit summary, which is needed when accepting edit requests. Identifying specific parts is just cleaner this way. Aaron Liu (talk) 18:57, 22 November 2024 (UTC)[reply]
    If the change is rejected, then a user who isn't an author of the content appears in the article history. In theory that would unnecessarily entangle the user in any copyright issues that arose, or possibly defamation cases. isaacl (talk) 22:55, 22 November 2024 (UTC)[reply]
    I personally see that as a much lesser problem than the EditRequests issue. Aaron Liu (talk) 19:15, 23 November 2024 (UTC)[reply]
    We should be maximizing the number of pages that are editable by all. Protection fails massively at this task. All this does is tell editors "hey don't edit this page", which is fine for certain legal pages and the main page that no one should really be editing, but for articles? There is a reason we have this thing called "code review" on Git and "peer review" everywhere else; we should be encouraging changes but if there is disruption we should be able to hold them for review so we can remove the problematic ones.
    Since Wikipedia is not configured to have software-based RC patrol outside of new pages patrol (and RC patrol would be a problem anyway not only because of the sheer volume of edits but also because edits older than a certain timeframe are removed from the patrol queue), we have to rely on other software measures to hide revisions until they are approved. Specifically, RC patrol hiding all edits until approved (wikiHow does this) would be a problem on Wikipedia. But that is a tangent. Awesome Aasim 19:43, 22 November 2024 (UTC)[reply]
    There's also a reason why Git changes aren't pushed directly to the main code branch for review, and instead a pull request is sent to an integrator in order to integrate the changes. There's a bottleneck in processing the request (including integration testing). Also note with software development, rebasing your changes onto the latest integrated stream is your responsibility. The equivalent with pending changes would be for each person to revalidate their proposed change after a preceding change had been approved or rejected. Instead, the workload falls upon the reviewer. Side note: the term "code review" far predates git, and is widely used by many software development teams. isaacl (talk) 22:45, 22 November 2024 (UTC)[reply]
    I see I see. I do think we need better pending changes as the current flagged revs system sucks. Also just because a feature is turned on doesn't mean there is consensus to use it, as seen by ]
    Your second sentence would render everything about this to be meaningless. Plus, the community does not like unnecessarily turning features on; both of your examples have been removed. Aaron Liu (talk) 19:18, 23 November 2024 (UTC)[reply]
    I know, that is my point. We also have consensus to make unlimited width the default in Vector 2022, which was never turned on. Awesome Aasim 19:20, 23 November 2024 (UTC)[reply]
    I don't understand your point. You're making a proposal for a new feature that has to be developed in a MediaWiki extension. If it does get developed, it won't get deployed on English Wikipedia unless there's consensus to use it. And given that the extension is not supported by the WMF right now, to the extent that it won't deploy it on new wikis, I'm not sure it has the ability to support any new version. isaacl (talk) 22:53, 23 November 2024 (UTC)[reply]
  • Oppose, per JSS and others. We don't need another system just to allow the preemptive protection of pages, and allowing non-EC editors to clutter up this history in ARBECR topic areas would just create a lot of extra work with little or no real benefit. – bradv 23:10, 23 November 2024 (UTC)[reply]
  • Oppose - edit requests only for non-EC users is against spirit of open wiki, but is necessary to prevent the absolute flame-wars/edit-wars on contentious topic pages. having a pending changes version of an article only moves flamewars by non-ECR users to pending changes version. Better to allow edit requests and use ARBECR to close non-productive discussions on talk page than having another venue for CTOP flamewars to occur. Bluethricecreamman (talk) 02:28, 2 December 2024 (UTC)[reply]
    In your argument, aren't flamewars still moved to the edit request's discussions? Can't editors also just reject non-productive pending changes? Aaron Liu (talk) 03:48, 2 December 2024 (UTC)[reply]
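The drop-down idea floated earlier in this section (separate "Auto-accept" and "Review" levels, with the autoreview and review rights always counting regardless of the page setting) can be sketched in a few lines of Python. This is only a hypothetical model: the group ranking below is an assumption made for the example, and nothing here reflects how FlaggedRevs is actually implemented or configured.

```python
# Hypothetical linear ranking of groups, mirroring the order of the proposed
# drop-downs. Real user groups are not strictly ordered like this.
GROUP_RANK = {
    "user": 0,               # "All users"
    "autoconfirmed": 1,
    "extendedconfirmed": 2,
    "templateeditor": 3,
    "sysop": 4,              # "Administrators"
}


def highest_rank(groups):
    """Return the highest rank among a user's groups (0 if none match)."""
    return max((GROUP_RANK.get(g, 0) for g in groups), default=0)


def can_auto_accept(groups, rights, auto_accept_level):
    """True if the user's edits would skip review on a page with this setting."""
    if "autoreview" in rights:
        # Per the comment above: autoreview always auto-accepts.
        return True
    return highest_rank(groups) >= GROUP_RANK[auto_accept_level]


def can_review(groups, rights, review_level):
    """True if the user may accept pending edits on a page with this setting."""
    if "review" in rights:
        # Per the comment above: the review right always allows reviewing.
        return True
    return highest_rank(groups) >= GROUP_RANK[review_level]
```

Under these assumptions a PCECP page would be one where both `auto_accept_level` and `review_level` are `"extendedconfirmed"`, while today's pending changes would correspond to auto-accept at `"autoconfirmed"` with reviewing limited to holders of the review right and administrators.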

Neutral (PCECP)

  1. I have made my opposition to all forms of ]
  2. I'm not a fan of the current pending changes, so I couldn't support this. But it also wouldn't affect my editing, so I won't oppose it if it helps others.-- LCU ActivelyDisinterested «@» °∆t° 14:32, 6 November 2024 (UTC)[reply]

Discussion (PCECP)

Someone who is an expert at configuring

WP:VPT for assistance.

Extended content
// enwiki
// InitializeSettings.php
$wgFlaggedRevsOverride = false;
$wgFlaggedRevsProtection = true;
$wgSimpleFlaggedRevsUI = true;
$wgFlaggedRevsHandleIncludes = 0;
$wgFlaggedRevsAutoReview = 3;
$wgFlaggedRevsLowProfile = true;
// CommonSettings.php
$wgAvailableRights[] = 'autoreview';
$wgAvailableRights[] = 'autoreviewrestore';
$wgAvailableRights[] = 'movestable';
$wgAvailableRights[] = 'review';
$wgAvailableRights[] = 'stablesettings';
$wgAvailableRights[] = 'unreviewedpages';
$wgAvailableRights[] = 'validate';
$wgGrantPermissions['editprotected']['movestable'] = true;
// flaggedrevs.php
wfLoadExtension( 'FlaggedRevs' );
$wgFlaggedRevsAutopromote = false;
$wgHooks['MediaWikiServices'][] = static function () {
	global $wgAddGroups, $wgDBname, $wgDefaultUserOptions,
		$wgFlaggedRevsNamespaces, $wgFlaggedRevsRestrictionLevels,
		$wgFlaggedRevsTags, $wgFlaggedRevsTagsRestrictions,
		$wgGroupPermissions, $wgRemoveGroups;

	$wgFlaggedRevsNamespaces[] = 828; // NS_MODULE
	$wgFlaggedRevsTags = [ 'accuracy' => [ 'levels' => 2 ] ];
	$wgFlaggedRevsTagsRestrictions = [
		'accuracy' => [ 'review' => 1, 'autoreview' => 1 ],
	];
	$wgGroupPermissions['autoconfirmed']['movestable'] = true; // T16166
	$wgGroupPermissions['sysop']['stablesettings'] = false; // -aaron 3/20/10
	$allowSysopsAssignEditor = true;

	$wgFlaggedRevsNamespaces = [ NS_MAIN, NS_PROJECT ];
	# We have only one tag with one level
	$wgFlaggedRevsTags = [ 'status' => [ 'levels' => 1 ] ];
	# Restrict autoconfirmed to flagging semi-protected
	$wgFlaggedRevsTagsRestrictions = [
		'status' => [ 'review' => 1, 'autoreview' => 1 ],
	];
	# Restriction levels for auto-review/review rights
	$wgFlaggedRevsRestrictionLevels = [ 'autoconfirmed' ];
	# Group permissions for autoconfirmed
	$wgGroupPermissions['autoconfirmed']['autoreview'] = true;
	# Group permissions for sysops
	$wgGroupPermissions['sysop']['review'] = true;
	$wgGroupPermissions['sysop']['stablesettings'] = true;
	# Use 'reviewer' group
	$wgAddGroups['sysop'][] = 'reviewer';
	$wgRemoveGroups['sysop'][] = 'reviewer';
	# Remove 'editor' and 'autoreview' (T91934) user groups
	unset( $wgGroupPermissions['editor'], $wgGroupPermissions['autoreview'] );

	# Rights for Bureaucrats (b/c)
	if ( isset( $wgGroupPermissions['reviewer'] ) ) {
		if ( !in_array( 'reviewer', $wgAddGroups['bureaucrat'] ?? [] ) ) {
			// promote to full reviewers
			$wgAddGroups['bureaucrat'][] = 'reviewer';
		}
		if ( !in_array( 'reviewer', $wgRemoveGroups['bureaucrat'] ?? [] ) ) {
			// demote from full reviewers
			$wgRemoveGroups['bureaucrat'][] = 'reviewer';
		}
	}
	# Rights for Sysops
	if ( isset( $wgGroupPermissions['editor'] ) && $allowSysopsAssignEditor ) {
		if ( !in_array( 'editor', $wgAddGroups['sysop'] ) ) {
			// promote to basic reviewer (established editors)
			$wgAddGroups['sysop'][] = 'editor';
		}
		if ( !in_array( 'editor', $wgRemoveGroups['sysop'] ) ) {
			// demote from basic reviewer (established editors)
			$wgRemoveGroups['sysop'][] = 'editor';
		}
	}
	if ( isset( $wgGroupPermissions['autoreview'] ) ) {
		if ( !in_array( 'autoreview', $wgAddGroups['sysop'] ) ) {
			// promote to basic auto-reviewer (semi-trusted users)
			$wgAddGroups['sysop'][] = 'autoreview';
		}
		if ( !in_array( 'autoreview', $wgRemoveGroups['sysop'] ) ) {
			// demote from basic auto-reviewer (semi-trusted users)
			$wgRemoveGroups['sysop'][] = 'autoreview';
		}
	}
};

Novem Linguae (talk) 09:41, 6 November 2024 (UTC)[reply]

I basically came here to ask if this is even possible or if it would need WMF devs involvement or whatever.
For those unfamiliar, pending changes is not the same thing as the flagged revisions used on de.wp. PC was developed by the foundation specifically for this project after we asked for it. We also used to have WP:PC2 but nobody really knew what that was supposed to be and how to use it and it was discontinued. Just Step Sideways from this world ..... today 21:21, 6 November 2024 (UTC)[reply]
Is PC2 an indication of implementation being possible? Aaron Liu (talk) 22:27, 6 November 2024 (UTC)[reply]
Depends on what exactly is meant by "implementation". A configuration where edits by non-extendedconfirmed users need review by reviewers would probably be similar to what was removed in gerrit:/r/334511 to implement T156448 (removal of PC2). I don't know whether a configuration where edits by non-extendedconfirmed users can be reviewed by any extendedconfirmed user while normal PC still can only be reviewed by reviewers is possible or not. Anomie 13:32, 7 November 2024 (UTC)[reply]
Looking at the MediaWiki documentation, it is not possible atm. That said, currently the proposal assumes that it is possible and we should work with that (though I would also support allowing all extended-confirmed to review all pending changes). Aaron Liu (talk) 13:56, 7 November 2024 (UTC)[reply]

I think the RfC summary statement is a bit incomplete. My understanding is that the pending changes feature introduces a set of rights which can be assigned to corresponding user groups. I believe all the logic is based on the user rights, so there's no way to designate that one article can be autoreviewed by one user group while another article can be autoreviewed by a different user group. Thus unless the proposal is to replace autoconfirmed pending changes with extended confirmed pending changes, I don't think saying "enabled" in the summary is an adequate description. And if the proposal is to replace autoconfirmed pending changes, I think that should be explicitly stated. isaacl (talk) 22:06, 6 November 2024 (UTC)[reply]

The proposal assumes that coexistence is technically possible. Aaron Liu (talk) 22:28, 6 November 2024 (UTC)[reply]
The proposal did not specify if it assumed co-existence is possible, or enabling it is possible, which could mean replacement. Thus I feel the summary statement (before the timestamp, which is what shows up in the central RfC list) is incomplete. isaacl (talk) 22:31, 6 November 2024 (UTC)[reply]
While on a re-read, It is assumed that it is technically possible to have PCECP does not explicitly imply co-existence, that is how I interpreted it. Anyways, it would be wonderful to hear from @Awesome Aasim about this. Aaron Liu (talk) 22:42, 6 November 2024 (UTC)[reply]
The key question that ought to be clarified is if the proposal is to have both, or to replace the current one with a new version. (That ties back to the question of whether or not the arbitration committee's involvement is required.) Additionally, it would be more accurate not to use a word in the summary that implies the only cost is turning on a switch. isaacl (talk) 22:49, 6 November 2024 (UTC)[reply]
It is assuming that we can have PC1 where only reviewers can approve edits and PCECP where only extended confirmed users can approve edits AND make edits without requiring approval. With the current iteration I don't know if it is technically possible. If it requires an extension rewrite or replacement, that is fine. If something is still unclear, please let me know. Awesome Aasim 23:06, 6 November 2024 (UTC)[reply]
I suggest changing the summary statement to something like, "Should a new pending changes protection level be added to Wikipedia – extended confirmed pending changes (hereby abbreviated as PCECP)?". The subsequent paragraph can provide the further explanation on who would be autoreviewed and who would serve as reviewers with the new proposed level. isaacl (talk) 23:19, 6 November 2024 (UTC)[reply]
Okay, done. I tweaked the wording a little. Awesome Aasim 23:40, 6 November 2024 (UTC)[reply]
I think inclusion of the preemptive-protection part in the background statement is causing confusion. AFAIK preemptive protection and whether we should use PCECP over ECP are separate questions. Aaron Liu (talk) 19:11, 7 November 2024 (UTC)[reply]

Q2: If this proposal passes, should PCECP be applied preemptively to WP:ARBECR topics?

Particularly on low traffic articles as well as all talk pages.

WP:ECP would still remain an option to apply on top of PCECP. Awesome Aasim 19:58, 5 November 2024 (UTC)[reply]

Support (Preemptive PCECP)

Oppose (Preemptive PCECP)

No, we still shouldn't be protecting preemptively. Wait until there's disruption, and then choose between PCXC or regular XC protection (I would strongly favour the former for the reasons I gave above). Cremastra (uc) 20:43, 5 November 2024 (UTC)[reply]

Neutral (preemptive PCECP)

Discussion (preemptive PCECP)

@Jéské Couriano Could you link to said ArbCom discussion? Aaron Liu (talk) 03:51, 6 November 2024 (UTC)[reply]
I'm not saying such a discussion exists, but changes to Arbitration remedies/discretionary sanctions are something they would want to weigh in on. Arbitration policy (which includes ]
That is not my reading of WP:ARBECR. Specifically, On any page where the restriction is not enforced through extended confirmed protection, this restriction may be enforced by...the use of pending changes... (bold added by me for emphasis). But if there is consensus not to use this preemptively so be it. Awesome Aasim 05:13, 6 November 2024 (UTC)[reply]

Q3: If this proposal does not pass, should ECP be applied preemptively to articles under WP:ARBECR topics?

Support (preemptive ECP)

  • Support as a second option, but only to articles. Talk pages can be enforced solely through reverts and short protections so I see little reason why those should be protected. Awesome Aasim 19:58, 5 November 2024 (UTC) Moved to oppose. Awesome Aasim 19:10, 23 November 2024 (UTC)[reply]
  • Support for articles per Aasim. Talk pages still need to be open for edit requests. (Also changing my mind, per above. If anything, we should clarify ARBECR so that the 500-30 limit is only applied in cases where it is needed, not automatically, to resolve the ambiguity. 20:52, 7 November 2024 (UTC)) Chaotic Enby (talk · contribs) 20:20, 5 November 2024 (UTC)[reply]
  • Support per my comment in the previous section. * Pppery * it has begun... 20:52, 5 November 2024 (UTC)[reply]
  • I agree with Chaotic Enby and Pppery above and think all CT articles should be protected. I am generally not a fan of protecting Talk pages, but it's true that many CT Talk pages are cesspools of hate, so I am not sure where I sit on protecting Talk pages. Toadspike [Talk] 20:57, 5 November 2024 (UTC)[reply]
    Under the current wording of
    BITEy
    .
    I am not opposed to changing the wording of ARBECR to forbid reverting solely because an editor is not extended confirmed, which is a silly reason to revert otherwise good edits. However, until ArbCom changes ARBECR, we are stuck with the rules we have. We ought to make these rules clear to editors before they edit, by page protection, instead of after they edit, by reversion. Toadspike [Talk] 10:55, 16 November 2024 (UTC)[reply]
  • Support preemptive ECP without PCECP (for article space only). If we have a strict policy (or ArbCom ruling) that a class of user is forbidden to edit a class of page, there is no downside whatever to implementing that policy by technical means. All it does is stop prohibited edits. The consequences would all be positive, such as removing the need for constant monitoring, reducing IP vandalism to zero, and reducing the need to template new editors who haven't learned the rules yet. What I'd like with regard to the last one, is that a non-EC editor sees an "edit" button on an ECP page but clicking it diverts them to a page that explains EC and how to get it. Zerotalk 05:53, 17 November 2024 (UTC)[reply]

Oppose (preemptive ECP)

Neutral (preemptive ECP)

Discussion (preemptive ECP)

I think this question should be changed to "...articles under WP:ARBECR topics?". Aaron Liu (talk) 20:11, 5 November 2024 (UTC)[reply]

Okay, updated. Look good? Awesome Aasim 20:13, 5 November 2024 (UTC)[reply]

As I discussed in another comment, should this concept gain approval, I feel it is best for the community to work with the arbitration committee to amend its remedy. isaacl (talk) 15:34, 7 November 2024 (UTC)[reply]

And as I discussed in another comment while I think the community could do this, I agree with isaac that it would be best to do it in a way that works with the committee. Best, Barkeep49 (talk) 16:03, 7 November 2024 (UTC)[reply]

Q4: Should there be a Git-like system for submitting and reviewing edits to protected pages?

This behaves a little like pending changes, but with a few different things:

  1. There would be an additional option entitled "allow users to submit edits for review" in the protection window. There could also be a specific user group able to accept such edits.
  2. Instead of the standard "protected page text" informing the user is protected, when this option is enabled, the user is given a message something like "This page is currently protected, so you are currently submitting an edit request. Only when your change is approved will your edit be visible." An edit summary as well as a more detailed explanation into the review can be provided. Same for title blacklisted pages. However, the "permission error" will still show for attempting to rename the page, as well as for cases where a user cannot edit a page for a reason other than protection (like being blocked from editing).
  3. All the changes submitted for review end up in some namespace (like Review:1234567) with the change id. Only users with the ability to edit the page or accept the revision would be able to see these changes. There would also be the ability to discuss each change on the talk page for that change or something similar. This namespace by design will be unprotectable.
  4. Users with the ability to edit the page (or when a higher accept level is selected, users with that accept level) are given the ability to merge these changes in. Administrators can delete changes just like they can delete individual revisions, and these changes can also be suppressed just like individual revisions.
  5. Changes are not directly committed to the edit history, unlike the current pending changes system; only to the page in the Review: namespace.

This would be a major improvement over our edit request system which ONLY allows a user to write what they want changed, and that is often prone to stuff that is not WP:CHANGEXY. If there are merge conflicts preventing a clean merge then the person who submitted the edit or the reviewer will have to manually fix it before it merges cleanly. If this path is chosen we can safely retire pending changes. Awesome Aasim 18:52, 23 November 2024 (UTC)[reply]
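
As a rough illustration of the five points above, here is a minimal Python sketch of the submit-then-merge flow. Everything in it is an assumption made for illustration: the names ProtectedPage, Change, submit and merge are hypothetical, nothing here is MediaWiki code, and conflict detection is simplified to "the page text changed since the edit was submitted" rather than a real three-way merge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Change:
    """A proposed edit parked outside the article history (stand-in for a 'Review:<id>' page)."""
    change_id: int
    author: str
    base_text: str  # page text the author started from
    new_text: str   # page text the author proposes


@dataclass
class ProtectedPage:
    text: str
    can_edit: Set[str]  # users allowed to edit the page and accept changes
    history: List[Tuple[str, str]] = field(default_factory=list)  # (author, text), live revisions only
    queue: Dict[int, Change] = field(default_factory=dict)        # pending changes by id
    next_id: int = 1

    def submit(self, user: str, new_text: str) -> Optional[int]:
        """Save directly if the user may edit; otherwise park the edit for review."""
        if user in self.can_edit:
            self._commit(user, new_text)
            return None
        change = Change(self.next_id, user, self.text, new_text)
        self.queue[change.change_id] = change
        self.next_id += 1
        return change.change_id

    def merge(self, reviewer: str, change_id: int) -> None:
        """Accept a pending change; only now does it enter the page history."""
        if reviewer not in self.can_edit:
            raise PermissionError(f"{reviewer} cannot accept changes on this page")
        change = self.queue[change_id]
        # Simplification (assumption): any intervening edit counts as a conflict.
        if change.base_text != self.text:
            raise ValueError("merge conflict: rebase the change onto the current text")
        self._commit(change.author, change.new_text)
        del self.queue[change_id]

    def _commit(self, author: str, text: str) -> None:
        self.text = text
        self.history.append((author, text))
```

In this sketch a submission from a user outside can_edit leaves both text and history untouched until someone in can_edit calls merge, which corresponds to point 5; a change whose base text is stale raises an error and has to be redone against the current text.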

Survey (Q4)

Discussion (Q4)

If additional proposals come (seems unlikely), I wonder if this might be better split as a "pending changes review" or something similar. Awesome Aasim 18:52, 23 November 2024 (UTC)[reply]

I really think this should be straight-up implemented as whatever first instead of being asked in an RfC. Aaron Liu (talk) 19:32, 23 November 2024 (UTC)[reply]

First, please stop calling this a git-like system. The real essence of version control systems is branching history. Plus one of the key principles for git is to enable developers to keep the branching history as simple as possible, with changes merged cleanly into an integration branch, so proposed changes never show up in the history of the integration branch.

I would prefer keeping the article history clear of any edit requests. There could be a tool that would clone an article (or designated sections) to a user subpage, preserving attribution in the edit summary. The user could make their changes on that page, and then a tool could assist them in creating an edit request. Whoever processes the request will be able to review the diff on the subpage. If the current version of the article has changed significantly, they can ask the requester to rebase the page to the current version and redo their change. I think this approach simplifies both creating and reviewing a proposed change, and helps spread the workload of integrating changes when they pile up. isaacl (talk) 22:44, 23 November 2024 (UTC)[reply]

It won't, if the change is not merged. The point of this is that the edit history remains clear up until the edit is approved. We can do some "squashing" as well as limit edits to be reviewed to the original creator. A commit on GitHub and GitLab does not show up on main until merged. It is already possible to merge two pages' histories right now; this is done after cut-and-paste moves. This just takes it to a different level. Awesome Aasim 22:53, 23 November 2024 (UTC)[reply]
History merge isn't really the same thing, in that you can't interlace changes in the version history, but only have a "clean" merge when the two have disjoint timespans. If multiple versions of the same page are edited simultaneously before being merged, even assuming no conflicts in merging, the current histmerge system will not be able to handle it properly. Chaotic Enby (talk · contribs) 22:58, 23 November 2024 (UTC)[reply]
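
The "disjoint timespans" condition in the comment above can be written as a small check. This is only a sketch under stated assumptions: each page history is reduced to a list of revision timestamps, the function name timespans_disjoint is hypothetical, and it does not describe how MediaWiki itself implements history merges.

```python
from datetime import datetime
from typing import Sequence


def timespans_disjoint(a: Sequence[datetime], b: Sequence[datetime]) -> bool:
    """True if every revision of one page predates every revision of the other."""
    if not a or not b:
        return True  # an empty history cannot interleave with anything
    return max(a) < min(b) or max(b) < min(a)


# Two histories edited one after the other: no interleaving.
old_page = [datetime(2024, 11, 1), datetime(2024, 11, 3)]
new_page = [datetime(2024, 11, 5), datetime(2024, 11, 6)]

# A working copy edited while the article itself kept changing: the spans overlap.
article = [datetime(2024, 11, 1), datetime(2024, 11, 10)]
working_copy = [datetime(2024, 11, 4), datetime(2024, 11, 7)]
```

Here timespans_disjoint(old_page, new_page) holds, while timespans_disjoint(article, working_copy) does not, which is the simultaneous-editing case described above.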
If it doesn't show up in the article history, then it isn't like pending changes at all, so I suggest your summary should be updated accordingly. In which case, under the hood your proposal is similar to mine; I suggest having subpages under the user page would be easier for the user to manage. Squashing shouldn't be done with the history of public branches (commits should remain fixed once they've been made known to everyone) plus rewriting history can be confusing, so I think the change history should be preserved on the working page. If you mean that the submission into the article should be one edit, sure.
My proposal was to layer on tools to assist with creating edit requests, while yours seeks to integrate the system with the edit function when a user is prevented from editing due to page protection. Thus from an implementation perspective, my proposal can be implemented independently of the rest of the MediaWiki code base (and could be done with gadgets), while yours would require changes to the MediaWiki code. Better integration of course offers a more cohesive user experience, but faces greater implementation and integration challenges. I suggest reaching out to the WMF development team to find a contact to discuss your ideas. isaacl (talk) 23:13, 23 November 2024 (UTC)[reply]
I agree that for now we should have JS tools, although that itself has challenges. A modification to MediaWiki core will also have challenges but it might be worth it in the long run, as Core gets regular updates to features, but extensions not always. Awesome Aasim 01:31, 24 November 2024 (UTC)[reply]
Okay, I took a stab at making the experience of making an edit request a bit more new-user friendly: User:Awesome Aasim/editrequestor.js.
I did notice someone else created a similar script but it behaves quite differently. This relies largely on the MediaWiki compare API to build a result. Unfortunately it uses deprecated libraries, etc. and will definitely need rewriting, but I think it is a good first prototype.
If something similar was loaded for every edit request with withJS, I wonder how this will change the views of users who expressed opposition. Awesome Aasim 02:35, 30 November 2024 (UTC)[reply]
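
For readers unfamiliar with the compare API mentioned above, here is a hedged Python sketch of what a request for a diff between a page's current text and a proposed text might look like. It is not taken from the script: the helper name compare_url is hypothetical, and the parameter choice (fromtitle, toslots, totext-main, prop=diff) is an assumption about how such a request could be built. The sketch only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# English Wikipedia's API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def compare_url(title: str, proposed_text: str) -> str:
    """Build an action=compare URL diffing the current page against proposed text."""
    params = {
        "action": "compare",
        "format": "json",
        "fromtitle": title,            # left side: current revision of the page
        "toslots": "main",             # right side: supplied text for the main slot
        "totext-main": proposed_text,
        "prop": "diff",                # ask only for the diff itself
    }
    return API_ENDPOINT + "?" + urlencode(params)


example = compare_url("Main Page", "Hello world")
```

For anything longer than a short snippet the proposed text would normally be sent in a POST body rather than the query string; the URL form is used here only to keep the example small.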
Not sure which users you're thinking of, as no one in this discussion has so far opposed changes to the edit process so it can feed an edit request system without introducing pending changes into the article history. (I can imagine opposition based on potentially swamping the edit request system, and a lack of capacity to handle requests, but I don't think the discussion is there yet.) Maybe you can create a short video to demonstrate how your prototype functions? It should be a good starting point for discussions with the appropriate WMF developers. isaacl (talk) 20:19, 30 November 2024 (UTC)[reply]
The "similar script" I am referring to is User:NguoiDungKhongDinhDanh/FormattedEditRequest. But it works a bit differently, rather than intercepting "submit an edit request" requests, it adds a link to a portlet.
Here is an MP4 file of my prototype. If this can be converted to a compatible format and uploaded to Wikipedia that would be nice. Awesome Aasim 20:44, 30 November 2024 (UTC)[reply]
I wasn't wondering about the other script, but thanks for the info. isaacl (talk) 22:23, 30 November 2024 (UTC)[reply]

General discussion

Since we're assuming that PCECP is possible and the last two questions definitely deal with policy, I feel like maybe this should go to VPP instead, with the header edited to something like "Extended-confirmed pending changes and preemptive protection in contentious topics" to reflect the slightly-larger-than-advertised scope? Aaron Liu (talk) 23:53, 5 November 2024 (UTC)[reply]

I think policy proposals are also okay here, though I see your point. There is definitely overlap, though. This is both a request for a technical change as well as establishing policy/guidelines around that technical change (or lack thereof). Awesome Aasim 00:26, 6 November 2024 (UTC)[reply]

If this proposal is accepted, my assumption is that we'd bring back the

]

I think light blue is a better color for this. But in any case we will probably need a lock with a checkmark and the letter "E" for extended confirmed. Awesome Aasim 22:22, 8 November 2024 (UTC)[reply]
Light blue seems too similar to the sky-blue currently used for ]
I would go for either the EC lock just with the icon replaced with a checkmark or what you said but with the same color and a diagonal line down the middle. Aaron Liu (talk) 20:02, 1 December 2024 (UTC)[reply]

Courtesy ping

Courtesy ping all from the idea lab that participated in helping formulate this RfC: @Toadspike @Jéské Couriano @Aaron Liu @Mach61 @Cremastra @Anomie @SamuelRiv @Isaacl @WhatamIdoing @Ahecht @Bunnypranav. Awesome Aasim 19:58, 5 November 2024 (UTC)[reply]

Protection?

I am actually starting to wonder if "protection" is a bit of a misnomer, because technically pages under pending changes are not really "protected". Yeah the edits are subject to review, but there are no technical measures to prevent a user from editing. It is just like recent changes on many wikis; those hold edits for review until they are approved, but they do not "protect" the entire wiki. Awesome Aasim 23:40, 11 November 2024 (UTC)[reply]

How about “kinder, gentler protection”? To appear in the know, you can use an acronym, such as in “TCPIP is an example of KGP”. — Charles Stewart (talk) 04:57, 28 November 2024 (UTC)[reply]

Move to close

The main proposal is basically deadlocked and has been for six days, and the sub-proposals are clearly failing. Seems like we have a result. Just Step Sideways from this world ..... today 23:09, 22 November 2024 (UTC)[reply]

I was about to withdraw Q2 and Q3 for putting the pen before the pig, but I did realize I added a couple more comments particularly to Q2. I did add a Q4 that might be more actionable and that is about making the experience of submitting edit requests a lot better. I am starting to agree though for Q2 and Q3 everything that has needed to be said has been said so the proposals can be withdrawn.
We do need to consider the experience of the users actually being locked out of this. I understand the opposition to Q3 (and in fact just struck my !vote because of this). But Q2? Look at the disaster that WP:V22RFC3 is. These surveys are barely representative of new users, just of experienced editors. We should absolutely be bringing new editors to the table for these discussions. Awesome Aasim 19:13, 23 November 2024 (UTC)[reply]
Please don't pre-close. 4 of the opposers to the main proposal seem to address only Q2 instead of Q1, and I don't see anyone addressing the argument that it's less restrictive than ECP. It's up to the closer to weigh the consensus. Aaron Liu (talk) 19:30, 23 November 2024 (UTC)[reply]

RfC: Should a blackout be organized in protest of the Wikimedia Foundation's actions?

Proposal to update
WP:GNG

Over at

WP:BAND. Please see that proposal here. I have highlighted the addition to existing policy in green.--3family6 (Talk to me | See what I have done) 13:17, 18 November 2024 (UTC)[reply]

Unless I'm misunderstanding something, then this proposal passing will be the equivalent of replacing criteria 2-11 with "they must meet the GNG"? Per several comments in the discussion at Wikipedia:Village_pump (policy)#Issues with antiquated guideline for WP:NBAND that essentially cause run of the mill non-notable items to be kept I'm not convinced that there is currently a problem that can be solved in this manner. Thryduulf (talk) 16:03, 18 November 2024 (UTC)[reply]
Yes, this is basically saying that to have an article, the subject must meet GNG. There is an example in the article deletion discussion I mentioned above where NBAND was argued as an exception to GNG.--3family6 (Talk to me | See what I have done) 16:07, 18 November 2024 (UTC)[reply]
A single discussion where somebody argues something that does not gain consensus is not evidence of a problem, let alone evidence that the proposed change would solve that problem. Thryduulf (talk) 16:26, 18 November 2024 (UTC)[reply]
I would like to emphasize a key part of WP:N:

A topic is presumed to merit an article if:

  1. It meets either the general notability guideline (GNG) below, or the criteria outlined in a subject-specific notability guideline (SNG); and
  2. It is not excluded under the What Wikipedia is not policy.
This is a feature, not a bug; "or" does not mean "and". That WP:BAND currently circumvents WP:GNG is either trivially true (as creating subject-specific notability guidance outside of the GNG is the whole point of a WP:SNG) or arises from a fundamental misunderstanding of the purpose of the subject-specific notability guidelines. — Red-tailed hawk (nest) 16:50, 19 November 2024 (UTC)[reply]
or arises from a fundamental misunderstanding of the purpose of the subject-specific notability guidelines. That might actually be what is at issue - there seem to be two different understandings of what SNG's are - supporting GNG or an alternative to it.--3family6 (Talk to me | See what I have done) 18:17, 19 November 2024 (UTC)[reply]
Some SNGs take one approach; others take different approaches. WP:SNG was written to allow for the diversity of approaches represented by the current SNGs. Newimpartial (talk) 22:10, 22 November 2024 (UTC)[reply]
I posted this in the other village pump thread, but while I'm generally fine with this proposal, I don't think it's coming from a place of understanding.
Basically, there's an assumption happening that record labels work off some kind of predictable tier system, where the Big 3 labels are home to the most famous artists, indie labels are home to the semi-famous ones, and everyone else is a non-notable bottom feeder. That's not how it works. One of the more notable albums of the year is Cindy Lee's Diamond Jubilee, which was self-released. Meanwhile, there are artists on the Big 3 who I would guess probably don't have significant coverage. This is because music journalism is dying, no one has staff and no one has money, and the range of artists being covered has shrunk dramatically. See this Columbia Journalism Review article for further on that.
So in other words, I don't think criterion 5 in NBAND is good or useful, but for the opposite reasons that this proposal suggests. The problem is not that people's random garage bands will be considered a "label." The problem is there is less and less correlation between being signed to a label and having significant coverage. (Ironically, the "albums" criterion is probably the more stringent one, because labels are less and less likely to put out a full-length album by an artist that isn't already established via singles and streaming tracks.)
I don't know what to do with that. (I honestly think the collapse of journalism and the shrinking scope of what gets reported on is a ticking time bomb for notability criteria across the board, but that's a whole other topic.) The most straightforward solution is to use WP:GNG, but I think it's important to have a correct understanding of exactly what musicians we're talking about here. The bar is way, way, way higher than "run of the mill non-notable items" now. The bar is one or two tiers below Sabrina Carpenter. Gnomingstuff (talk) 21:26, 19 November 2024 (UTC)[reply]
Addendum: One way that this criterion could have value is to serve as a reminder that one Google search is not a sufficient WP:BEFORE check, because artists on notable labels are likely to have received coverage in print. (Another way this proposal is misinformed - removing NBAND #5 will primarily affect older bands, not newer ones.) But alas, people do not do thorough checks even when they're reminded. Gnomingstuff (talk) 21:33, 19 November 2024 (UTC)[reply]
I'd love to make BEFORE specifically include looking where sources are most likely to be found and explicitly state that looking at the first few pages of Google does not constitute a proper check. This always gets shot down in howls of protest at how dare I require people nominating pages for deletion to do more work than they imagine it took to create a three line sub-stub. I don't know how we get past this. Thryduulf (talk) 21:45, 19 November 2024 (UTC)[reply]
I mean, it already does: The minimum search expected is a normal Google search, a Google Books search, a Google News search, and a Google News archive search; Google Scholar is suggested for academic subjects. The problem is that WP:BEFORE is not considered binding so there are no consequences to ignoring it. Gnomingstuff (talk) 21:51, 19 November 2024 (UTC)[reply]
Gnomingstuff, there seems to be a lot of agreement that #5 as it stands does not make much sense for newer bands, but does make sense prior to the rise of streaming services. I'm seeing cut-offs suggested for the mid- to late-2000s.--3family6 (Talk to me | See what I have done) 12:53, 22 November 2024 (UTC)[reply]
Yeah I get that. I don't agree with the reasoning but I basically agree with the conclusion. Gnomingstuff (talk) 17:27, 22 November 2024 (UTC)[reply]

Your proposal operatively eliminates the SNG for bands. And also creates an even tougher GNG requirement than GNG by requiring that GNG compliance be demonstrated. I would like there to be some at least partial demonstration requirement added to GNG, but that's a whole 'nother issue and a secondary one in this case.

It also sort of misses the main point discussed at the linked pump discussion which was eliminating one or two items / "ways in" in the SNG. Sincerely, North8000 (talk) 18:04, 18 November 2024 (UTC)[reply]

In line with this, NBAND can be easily fixed to make sure that the idea that the criteria are a presumption of notability is added. I do not see any language like this though the intent seems to be there. That would quickly resolve one conflict. Mind you, deprecating or time gating criteria that do not make sense in modern music distribution is also a reasonable step though I would not remove them outright for historical purposes. Masem (t) 19:02, 18 November 2024 (UTC)[reply]
This was precisely the intent. Am I allowed to modify proposals if there have been no votes yet?--3family6 (Talk to me | See what I have done) 19:05, 18 November 2024 (UTC)[reply]
I was amazed by how much our guidelines were written with Western popular musicians in mind when I started editing 17 years ago and it seems that nothing has changed since. It is so much easier for such a person to have an article about them than for other types or nationalities of musician. This is so obviously caused by Wikipedia's demographics that I hesitate to say anything further. ]
I wonder what effect imposing GNG would have on that. I've heard from some African editors that much of the real news for music and pop culture is posted on social media (i.e., actually posted on Facebook itself, not some website that's sorta kinda social media-like). So if you take away an objective but non-source-oriented criteria and substitute 'must have the kind of sources that are usual enough in the US and UK but are unusual in Nigeria', will that tip even further towards overrepresenting Western popular musicians?
My impression of the two albums/two films kinds of rules from back in the day is that the advice had more to do with
Cover album
in 2001") but that we'd still be able to provide non-red links in related pages and still not have to duplicate information. Consequently, I think the traditional thinking is closer to how we think of spinning off a list or splitting a long article, than about trying to justify the subject as "worthy" of a full, stand-alone article via extensive sourcing.
I could imagine people opposing this merely for fear of the resulting red links, and of course the idea of going beyond the GNG to require "demonstrating" it will turn off other editors. WhatamIdoing (talk) 19:36, 18 November 2024 (UTC)[reply]
If it is established that reliable third party sources covering African music are going to include posts on social media rather than print or web publishing, then we should work to accommodate that so that we are more inclusive, rather than expect the more traditional forms of media. Masem (t) 20:05, 18 November 2024 (UTC)[reply]
Speaking for myself, I never had issues with using a third party posting via something like Facebook. I've always considered that to be a statement by that third party, they're just using Facebook as the medium. Am I understanding this example correctly?--3family6 (Talk to me | See what I have done) 20:18, 18 November 2024 (UTC)[reply]
A statement by the band('s representatives) on the band's official Facebook page is no more user-generated content and no less reliable than if that same statement by the same people was posted on the band's official website or quoted verbatim in a newspaper. Thryduulf (talk) 20:55, 18 November 2024 (UTC)[reply]
Yes,
Salon posted a story on Facebook rather than on their official site. It's gone through an editorial process, they just are using Facebook as a publishing medium.--3family6 (Talk to me | See what I have done) 21:00, 18 November 2024 (UTC)[reply]
There isn't a wiki-requirement for the type of source that sources used, or even that they have sources. Of course such things still matter regarding actual / real-world reliability of the source. North8000 (talk) 21:36, 18 November 2024 (UTC)[reply]
So keeping in mind that I have never had a Facebook account and have no experience with social media, my impression from these editors was that when they say they get news on Facebook, it's not necessarily the band that's posting (which wouldn't be Wikipedia:Independent sources) or even news articles being shared. Instead, it could be an ordinary comment by someone whom their followers believe is knowledgeable but who is not necessarily "official". For example – and I completely make this example up; the African editors who told me about this dilemma two years ago are welcome to disavow and correct anything I say – imagine a post by a professional DJ: They'll know things about music and bands, and they'll probably know more than a magazine writer assigned to do a piece on pop music in that city/country. They are "reliable" in the sense that people "rely on" them every day of the year. But it's outside the kinds of formal structures that we use to evaluate official sources: no editor, no publisher, no fact-checker, no peer review, etc. WhatamIdoing (talk) 05:33, 20 November 2024 (UTC)[reply]
I'm all for adapting guidelines and policies for geographic and cultural considerations. However, I don't think this would get far, because it's essentially a using self-published sources for BLPs issue, and that's going to be a steep climb to weaken that policy.--3family6 (Talk to me | See what I have done) 12:55, 22 November 2024 (UTC)[reply]
Yeah, I don't see any easy solution here. Even if it's not BLP-related, it relies on already knowing which accounts are the trustworthy ones, and there's no impartial way to evaluate an unfamiliar source. The post could say something like "This village is best known for cloth dyeing" or "The bus service there doesn't run on Sundays", and you'd still have to know whether that source is a good source of information. What if one person posts that the bus runs on Sundays and another person posts that it doesn't? An outsider would have no way of knowing which to trust. WhatamIdoing (talk) 23:28, 22 November 2024 (UTC)[reply]
I always thought the reason "two or more" was specified was that if there was only one the name could be redirected. Since that time there seems to have developed a dislike for stubs. I don't know where that came from (most articles in most traditional encyclopedias only consisted of one or two sentences, if that) but in order to satisfy that dislike maybe criterion 6 should be an encouragement to create a disambiguation page (or maybe a set index page, if you want to be pedantic) for the title.
I think that is correct. If there's only one, then you could:
  • Have an article about the album
  • Redirect the band's name to it
  • Put the little bit of information you have about the band in the album article
but if there are multiple albums, then:
  • You can have an article about each album
  • But which one do you redirect the band's name to?
  • And do you duplicate the information about the band in each of the album articles?
So it seemed easier to have an article. Now, of course, when the median size of an article is 13 sentences and 4 refs, and when we have a non-trivial minority of editors who think even that is pathetic, there is resistance to it. WhatamIdoing (talk) 23:33, 22 November 2024 (UTC)[reply]

Comment I am interpreting this section as an RfCBEFORE, and contributing in that spirit.

Having briefly reviewed the linked discussions, I do not see a problem with NBAND itself that would justify deprecation (rather than revision). And turning NBAND into a predictor of GNG rather than a standalone SNG seems to me essentially akin to deprecation. Fixing specific criteria seems much more appropriate to me, given the issues raised to date.

There are what seem to me to be evident reasons why NBAND operates according to the same logic as NCREATIVE, which is explicitly excluded from WP:NOTINHERITED. These SNGs reflect the reality that creative people produce creative works, and that therefore the people creating those works gain encyclopedic relevance directly from having created them.

In addition, it seems to me that there are practical, navigational reasons (having to do with the affordances of hypertext, Wikipedia's list system, and Wikipedia's category system) to offer more consistent treatment rather than leaving each individual musician, each musical group and each album up to the vagaries of WP:GNG.

There may be problems with specific NBAND criteria and the way they are sometimes used at AfD, but it seems entirely incommensurate to deprecate the whole SNG based on such marginal concerns. Newimpartial (talk) 20:31, 18 November 2024 (UTC)[reply]


IMO the Wikipedia norm for a "just barely made it" band has sourcing that meets a slightly lenient interpretation of GNG, and the decision is influenced by somewhat meeting an SNG criterion, thus being more conducive to artists than for example a for-profit corporation. And the "norm" means that is how Wikipedia as a whole wants it. There are folks out there who are at the extreme deletionist end of the spectrum and they will typically say that the above is not the case and piece together an unusually stringent "letter of the law" demand, even adding some things from essays saying that three sources that 100% meet GNG is the expectation. And so while I really think that the burden should shift to providing some GNG-ish sources (vs. just saying "they are out there" without actually supplying any) I'm loath to shift the balance too much, keeping the folks at the deletionist end of the spectrum in mind.

The pump discussion started with talking about how being signed by a label is no longer as indicative as it used to be and to remove it as being a key to the city of SNG compliance. I think there was support for that.

Sincerely, North8000 (talk) 21:16, 18 November 2024 (UTC)[reply]

I agree with everyone else above that this proposal would gut WP:BAND, which I am not okay with. If you want to remove some criteria of WP:BAND, like #5, which I agree is a little opaque and outdated, fine. But this seems like a sneaky way of demolishing WP:BAND without openly saying so. Toadspike [Talk] 21:36, 18 November 2024 (UTC)[reply]

I like the point North made, that our notability rules are set up to be more conducive to artists than for example a for-profit corporation. I've never thought of our guidelines on artists as particularly lax, but I know that NCORP is purposely stringent and that is the way things should be. Toadspike [Talk] 21:41, 18 November 2024 (UTC)[reply]

RfC: Enable the mergehistory permission for importers

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Should the (mergehistory) permission be enabled for the importer group? Chaotic Enby (talk · contribs) 12:26, 20 November 2024 (UTC)[reply]

Support (mergehistory for importers)

  1. Support. During
    importers. A technical solution to this would be to enable the (mergehistory) permission for both administrators and importers. Chaotic Enby (talk · contribs) 12:26, 20 November 2024 (UTC)[reply]
  2. Yeah, why not; I didn't really see the point back then, I'm not sure, honestly, that I do now, but enough people have said it's useful work that who am I to deny it? And Graham87's obviously both good at it and committed to it. Support this proposal. SerialNumber54129 12:42, 20 November 2024 (UTC)[reply]
  3. Importers can be trusted to do this adjacent and very important work. Aaron Liu (talk) 12:46, 20 November 2024 (UTC)[reply]
  4. I was about to come propose this myself, but you beat me to it. QuicoleJR (talk) 12:57, 20 November 2024 (UTC)[reply]
  5. Support File importers are trusted enough. – robertsky (talk) 13:03, 20 November 2024 (UTC)[reply]
  6. Support; histmerges are often an essential part of importation work, as noted by Chaotic Enby. JJPMaster (she/they) 13:07, 20 November 2024 (UTC)[reply]
  7. (edit conflict) Support. Importers are editors who are highly trusted to undertake a very specialised role and it makes sense that they be given the rights needed to fully do the job properly. Thryduulf (talk) 13:09, 20 November 2024 (UTC)[reply]
  8. Support obviously – thanks, wow, did not expect this and I didn't know this would be feasible. As I said at my RRFA, I have my own issues with this tool (which explain why I didn't use it so much), but access to it is way better than no history-merge access at all. Graham87 (talk) 13:25, 20 November 2024 (UTC)[reply]
  9. Support if technically feasible. I really opposed the RRFA because Graham87 was asking for a role we didn't have. If they can do their importing/merging work without being able to block users, I would support that. (Normally I wouldn't support a one-off solution like this but, given the rareness of this, I think it makes sense here.) Note that I would also favor further unbundling admin powers beyond this nom. - RevelationDirect (talk) 13:35, 20 November 2024 (UTC)[reply]
    Yes, it is feasible. — xaosflux Talk 13:39, 20 November 2024 (UTC)[reply]
    Oops, just asked that question below. Thanks for the prompt reply! RevelationDirect (talk) 13:41, 20 November 2024 (UTC)[reply]
  10. Support - clear benefit, and I don't see any reason not to. Tazerdadog (talk) 13:40, 20 November 2024 (UTC)[reply]
  11. Support sure. This is super niche, but basically: if someone can be trusted to be able to do an xmlimport, this is related and much less dangerous. If we're going to touch it I'm fine also adding it to transwiki importers as well (even though we don't have any currently) for parity. transwiki import is less dangerous, and most of the WP:RFPI items are able to be done that way -- in case any non-admins were looking to work in that area. — xaosflux Talk 13:44, 20 November 2024 (UTC)[reply]
  12. Support If somebody is an importer, they can be trusted with not messing up the databases any further while applying (merge-history). Sohom (talk) 13:58, 20 November 2024 (UTC)[reply]
  13. Support, makes sense to give this group the tools they need to do the job properly. CapitalSasha ~ talk 14:12, 20 November 2024 (UTC)[reply]
  14. Support. It just makes sense to do it. ~~ Jessintime (talk) 14:50, 20 November 2024 (UTC)[reply]
  15. Support. Makes sense if the two are so interlinked. If an editor is trusted with one, they should also be fine to have the other. ---- Patar knight - chat/contributions 15:41, 20 November 2024 (UTC)[reply]
  16. Support. This seems like a bit of an exceptional case, but I do think that it's worthwhile to allow importers to merge histories for practical reasons. And the role is so restricted that I don't have trust issues here. — Red-tailed hawk (nest) 15:53, 20 November 2024 (UTC)[reply]
  17. Support: Makes perfect sense from my perspective. Hey man im josh (talk) 16:06, 20 November 2024 (UTC)[reply]
  18. Support, sensible unbundling. Nobody becomes an importer without scrutiny so this seems fine to me. WindTempos they (talk · contribs) 17:02, 20 November 2024 (UTC)[reply]
  19. Support per xaosflux.—Alalch E. 17:32, 20 November 2024 (UTC)[reply]
  20. Support. Graham's tireless work in this area is the demonstration of why this should be permitted.  — Hex talk 17:41, 20 November 2024 (UTC)[reply]
  21. I got to support Graham's importer request once upon a time. Pleased to support this request as well. Even setting aside the direct impetus, this is a logical bundling of the tools that does not raise the required trust level for this small user group.
  22. Support --Redrose64 🌹 (talk) 21:05, 20 November 2024 (UTC)[reply]
  23. Support See no reason not to. -- Pawnkingthree (talk) 21:18, 20 November 2024 (UTC)[reply]
  24. Support. A logical part of the bundle. Sincerely, Dilettante 21:33, 20 November 2024 (UTC)[reply]
  25. Clearly yes. There's very low risk of collateral damage here and obvious benefits.—S Marshall T/C 23:23, 20 November 2024 (UTC)[reply]
  26. Graham (the only non-admin importer) is trusted enough for this, no reason not to.
  27. Support. Let's at least permit Graham to continue his archaeological work. No one else does this, and it requires the mergehistory perm. Folly Mox (talk) 11:15, 21 November 2024 (UTC)[reply]
  28. Graham has a clear use-case for this so I have no objections. JavaHurricane 13:49, 21 November 2024 (UTC)[reply]
  29. Support I see no problems with this. EggRoll97 (talk) 14:56, 21 November 2024 (UTC)[reply]
  30. Support Seems like this is a necessary change given that importing often requires these merges.
  31. Support and would go for a ]
  32. Support, obviously, this is invaluable work and it would be a clear negative for it to stop being done, which is effectively what would happen otherwise. Gnomingstuff (talk) 01:51, 22 November 2024 (UTC)[reply]
  33. Support My gut doesn't like this (mergehistory feels like a distinct permission from importer), but I do trust Graham87 to use the tool and think the chance of us ever getting any other non-admin importers is negligible, so I guess I support this. * Pppery * it has begun... 01:54, 22 November 2024 (UTC)[reply]
  34. Support seems like a sensible thing to do.
  35. Support Importers are already highly trusted and it is granted by a steward, so this would be a narrow unbundling that would probably satisfy any WMF legal requirements. And I would trust Graham87 with this tool. Abzeronow (talk) 20:58, 22 November 2024 (UTC)[reply]
  36. Support Importers already can fuck with the page history a lot more than someone with history merge rights can. I see no reason to not allow importers to merge history. ThatIPEditor Talk · Contribs 02:04, 23 November 2024 (UTC)[reply]
  37. Support because if this enables work to be performed that much easier for one editor, then chances are increased for other editors to follow suit or pick up the mantle later on down the line. Thanks. Huggums537voted! (sign🖋️|📞talk) 06:50, 25 November 2024 (UTC)[reply]
  38. Support, much has been said above to agree with, an easy support. Randy Kryn (talk) 12:40, 25 November 2024 (UTC)[reply]
  39. Support Importers are trusted enough so I see no problem with this The AP (talk) 14:01, 25 November 2024 (UTC)[reply]
  40. Support. I don't see a reason why not. This seems relevant to the importer group, and I'm surprised this permission isn't already included. – Epicgenius (talk) 14:31, 25 November 2024 (UTC)[reply]

Oppose (mergehistory for importers)

  1. Oppose the current system works just fine. I'm not seeing any compelling reason to carve out an exception for two users. Isaidnoway (talk) 16:27, 20 November 2024 (UTC)[reply]
    Because importing is importing a bunch of revisions into the history of the page. It's quite similar and often needed. Those two users are the only ones who maintain this area critical to Wikipedia, and that's the system, which has persisted due to their being able to merge history; now that Graham's been stripped of history merging, half of his duty and thus a quarter of this system, we need to rectify it or risk destabilizing of the system. Aaron Liu (talk) 16:37, 20 November 2024 (UTC)[reply]
    There is no risk of destabilizing the system, that's hyperbolic nonsense. Isaidnoway (talk) 20:29, 20 November 2024 (UTC)[reply]
    The system is only two people doing this work, and we're otherwise taking away half of what one of them does. I don't see any reason not to do this. Aaron Liu (talk) 20:36, 20 November 2024 (UTC)[reply]
    It no longer "works fine". Your information may be outdated. Folly Mox (talk) 11:17, 21 November 2024 (UTC)[reply]
    My information is not "outdated". Thanks. Isaidnoway (talk) 20:52, 21 November 2024 (UTC)[reply]

Discussion (mergehistory for importers)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

RfC: Log the use of the HistMerge tool at both the merge target and merge source

Currently, there are open phab tickets proposing that the use of the HistMerge tool be logged at the target article in addition to the source article. Several proposals have been made:

  • Option 1a: When using Special:MergeHistory, a null edit should be placed in both the merge target and merge source's page's histories stating that a history merge took place.
    (phab:T341760: Special:MergeHistory should place a null edit in the page's history describing the merge, authored Jul 13 2023)
  • Option 1b: When using Special:MergeHistory, add a log entry for the articles at both the HistMerge target and source that records the existence of a history merge.
    (phab:T118132: Merging pages should add a log entry to the destination page, authored Nov 8 2015)
  • Option 2: Do not log the use of the Special:MergeHistory tool at the merge target, maintaining the current status quo.

Should the use of the HistMerge tool be explicitly logged? If so, should the use be logged via an entry in the page history or should it instead be held in a dedicated log? — Red-tailed hawk (nest) 15:51, 20 November 2024 (UTC)[reply]

Survey: Log the use of the HistMerge tool

  • Option 1a/b. I am in principle in support of adding this logging functionality, since people don't typically have access to the source article title (where the histmerge is currently logged) when viewing an article in the wild. There have been several times I can think of when I've been going diff hunting or browsing page history and where some explicit note of a histmerge having occurred would have been useful. As for whether this is logged directly in the page history (as is done currently with page protection) or if this is merely in a separate log file, I don't have particularly strong feelings, but I do think that adding functionality to log histmerges at the target article would improve clarity in page histories. — Red-tailed hawk (nest) 15:51, 20 November 2024 (UTC)[reply]
  • Option 1a/b. No strong feelings on which way is best (I'll let the experienced histmergers comment on this), but logging a history merge definitely seems like a useful feature. Chaotic Enby (talk · contribs) 16:02, 20 November 2024 (UTC)[reply]
  • Option 1a/b. Chaotic Enby has said exactly what I would have said (but more concisely) had they not said it first. Thryduulf (talk) 16:23, 20 November 2024 (UTC)[reply]
  • 1b would be most important to me but 1a would be nice too. But this is really not the place for this sort of discussion, as noted below. Graham87 (talk) 16:28, 20 November 2024 (UTC)[reply]
  • Option 2 History merging done right should be seamless, leaving the page indistinguishable from if the copy-paste move being repaired had never happened. Adding extra annotations everywhere runs counter to that goal. Prefer 1b to 1a if we have to do one of them, as the extra null edits could easily interfere with the history merge being done in more complicated situations. * Pppery * it has begun... 16:49, 20 November 2024 (UTC)[reply]
    Could you expound on why they should be indistinguishable? I don't see how this could harm any utility. A log action at the target page would not show up in the history anyways, and a null edit would have no effect on comparing revisions. Aaron Liu (talk) 17:29, 20 November 2024 (UTC)[reply]
    Why shouldn't it be indistinguishable? Why is it necessary to go out of our way to say even louder that someone did something wrong and it had to be cleaned up? * Pppery * it has begun... 17:45, 20 November 2024 (UTC)[reply]
    All cleanup actions are logged to all the pages they affect. Aaron Liu (talk) 18:32, 20 November 2024 (UTC)[reply]
  • 2 History merges are already logged, so this survey name is somewhat off the mark. As someone who does this work: I do not think these should be displayed at either location. It would cause a lot of noise in history pages that people probably would not fundamentally understand (2 revisions for "please process this" and "remove tag" and a 3rd revision for the suggested log), and it would be "out of order" in that you will have merged a bunch of revisions but none of those revisions would be nearby the entry in the history page itself. I also find protections noisy in this way as well, and when moves end up causing a need for history merging, you end up with doubled move entries in the merged history, which also is confusing. Adding history merges to that case? No thanks. History merges are more like deletions and undeletions, which already do not add displayed content to the history view. Izno (talk) 16:54, 20 November 2024 (UTC)[reply]
    They presently are logged, but only at the source article. Take for example this entry. When I search for the merge target, I get nothing. It's only when I search the merge source that I'm able to get a result, but there isn't a way to know the merge source.
    If I don't know when or if the histmerge took place, and I don't know what article the history was merged from, I'd have to look through the entirety of the merge log manually to figure that out—and that's suboptimal. — Red-tailed hawk (nest) 17:05, 20 November 2024 (UTC)[reply]
    ... Page moves do the same thing, only log the move source. Yet this is not seen as an issue? :)
    But ignoring that, why is it valuable to know this information? What do you gain? And is what you gain actually valuable to your end objective? For example, let's take your "There have been several times I can think of when I've been going diff hunting or browsing page history and where some explicit note of a histmerge having occurred would have been useful." Are the revisions left behind in the page history by both the person requesting and the person performing the histmerge not enough (see {{histmerge}})? There are history merges done that don't have that request format such as the WikiProject history merge format, but those are almost always ancient revisions, so what are you gaining there? And where they are not ancient revisions, they are trivial kinds of the form "draft x -> page y, I hate that I even had to interact with this history merge it was so trivial (but also these are great because I don't have to spend significant time on them)". Izno (talk) 17:32, 20 November 2024 (UTC)[reply]

    ... Page moves do the same thing, only log the move source. Yet this is not seen as an issue? :)

    I don't think everyone would necessarily agree (see Toadspike's comment below).
    Chaotic Enby (talk · contribs) 17:42, 20 November 2024 (UTC)[reply]
    Page moves do leave a null edit on the page that describes where the page was moved from and was moved to. And it's easy to work backwards from there to figure out the page move history. The same cannot be said of the Special:MergeHistory tool, which doesn't make it easy to re-construct what the heck went on unless we start diving naïvely through the logs. — Red-tailed hawk (nest) 17:50, 20 November 2024 (UTC)[reply]
    It can be *possible* to find the original history merge source page without looking through the merge log, but the method for doing so is very brittle and extremely hacky. Basically, look for redirects to the page using "What links here", and find the redirect whose first edit has an unusual byte difference. This relies on the redirect being stable and not deleted or retargeted. There is also another way that relies on byte difference bugs as described in the above-linked discussion by wbm1058. Both of those are ... particularly awful. Graham87 (talk) 03:48, 21 November 2024 (UTC)[reply]
    In the given example, the history-merge occurred here. Your "log" is the edit summaries. "Created page with '..." is the edit summary left by a normal page creation. But wait, there is page history before the edit that created the page. How did it get there? Hmm, the previous edit summary "Declining submission: v - Submission is improperly sourced (AFCH)" tips you off to look for the same title in draft: namespace. Voila! Anyone looking for help with understanding a particular merge may ask me and I'll probably be able to figure it out for you. – wbm1058 (talk) 05:51, 21 November 2024 (UTC)[reply]
    Here's another example, of a merge within mainspace. The automatic edit summary (created by the MediaWiki software) of this (No difference) diff "Removed redirect to Jordan B. Acker" points you to the page that was merged at that point. Voila. Voila. Voila. – wbm1058 (talk) 13:44, 21 November 2024 (UTC)[reply]
    There are times where those traces aren't left. Aaron Liu (talk) 13:51, 21 November 2024 (UTC)[reply]
    Here's another scenario, this one from WP:WikiProject History Merge. The page history shows an edit adding +5,800 bytes, leaving the page with 5,800 bytes. But the previous edit did not leave a blank page. Some say this is a bug, but it's also a feature. That "bug" is actually your "log" reporting that a hist-merge occurred at that edit. Voila, the log for that page shows a temp delete & undelete setting the page up for a merge. The first item on the log:
    @ 20:14, 16 January 2021 Tbhotch moved page Flag of Yucatán to Flag of the Republic of Yucatán (Correct name)
    clues you in to where to look for the source of the merge. Voila, that single edit which removed −5,633 bytes tells you that previous history was merged off of that page. The log provides the details. – wbm1058 (talk) 16:03, 21 November 2024 (UTC)[reply]
    (phab:T76557: Special:MergeHistory causes incorrect byte change values in history, authored Dec 2 2014) — Preceding unsigned comment added by Wbm1058 (talk · contribs) 18:13, 21 November 2024 (UTC)[reply]
    Again, there are times where the clues are much harder to find, and even in those cases, it'd be much better to have a unified and assured way of finding the source. Aaron Liu (talk) 16:11, 21 November 2024 (UTC)[reply]
    Indeed. This is a prime example of an unintended undocumented feature. Graham87 (talk) 08:50, 22 November 2024 (UTC)[reply]
    Yeah. I don't think that we can permanently rely on that, given that future versions of MediaWiki are not bound in any real way to support that workaround. — Red-tailed hawk (nest) 04:24, 3 December 2024 (UTC)[reply]
  • Support 1b (log only), oppose 1a (null edit). I defer to the experienced histmergers on this, and if they say that adding null edits everywhere would be inconvenient, I believe them. However, I haven't seen any arguments against logging the histmerge at both articles, so I'll support it as a sensible idea. (On a similar note, it bothers me that page moves are only logged at one title, not both.) Toadspike [Talk] 17:10, 20 November 2024 (UTC)[reply]
  • Option 2. The merges are already logged, so there’s no reason to add it to page histories. While it may be useful for habitual editors, it will just confuse readers who are looking for an old revision and occasional editors. Ships & Space(Edits) 18:33, 20 November 2024 (UTC)[reply]
    But only the source page is logged as the "target". IIRC it currently can be a bit hard to find out when and who merged history into a page if you don't know the source page and the mergeperson didn't leave any editing indication that they merged something. Aaron Liu (talk) 18:40, 20 November 2024 (UTC)[reply]
  • 1B. The present situation of the action being only logged at one page is confusing and unhelpful. But so would be injecting null-edits all over the place.  — SMcCandlish ¢ 😼  01:38, 21 November 2024 (UTC)[reply]
  • Option 2. This exercise is dependent on finding a volunteer MediaWiki developer willing to work on this. Good luck with that. Maybe you'll find one a decade from now. – wbm1058 (talk) 05:51, 21 November 2024 (UTC)[reply]
    And, more importantly, someone in the MediaWiki group to review it. I suspect there are many people, possibly including myself, who would code this if they didn't think they were wasting their time shuffling things from one queue to another. * Pppery * it has begun... 06:03, 21 November 2024 (UTC)[reply]
    That link requires a Gerrit login/developer account to view. It was a struggle to get in to mine (I only have one because of an old Toolforge account and I'd basically forgotten about it), but for those who don't want to go through all that, that group has only 82 members (several of whose usernames I recognise) and I imagine they have a lot on their collective plate. There's more information about these groups at Gerrit/Privilege policy on MediaWiki. Graham87 (talk) 15:38, 21 November 2024 (UTC)[reply]
    Sorry, I totally forgot Gerrit behaved in that counterintuitive way and hid public information from logged out users for no reason. The things you miss if Gerrit interactions become something you do pretty much every day. If you want to count the members of the group you also have to follow the chain of included groups - it also includes https://ldap.toolforge.org/group/wmf, https://ldap.toolforge.org/group/ops and the WMDE-MediaWiki group (another login-only link), as well as a few other permission edge cases (almost all of which are redundant because the user is already in the MediaWiki group) * Pppery * it has begun... 18:07, 21 November 2024 (UTC)[reply]
  • Support 1a/b, and I would encourage the closer to disregard any opposition based solely on the chances of someone ever actually implementing it. Compassionate727 (T·C) 12:52, 21 November 2024 (UTC)[reply]
    Fine. This stupid RfC isn't even asking the right questions. Why did I need to delete (an expensive operation) and then restore a page in order to "set up for a history merge"? Should we fix the software so that it doesn't require me to do that? Why did the page-mover resort to cut-paste because there was page history blocking their move, rather than ask an administrator for help? Why doesn't the software just let them move over that junk page history themselves, which would negate the need for a later hist-merge? (Actually in this case the offending user only has made 46 edits, so they don't have page-mover privileges. But they were able to move a page. They just couldn't move it back a day later after they changed their mind.) wbm1058 (talk) 13:44, 21 November 2024 (UTC)[reply]
    Yeah, revision move would be amazing, for a start. Graham87 (talk) 15:38, 21 November 2024 (UTC)[reply]
  • Option 1b – changes to a page's history should be listed in that page's log. There's no need to make a null edit; pagemove null edits are useful because they meaningfully fit into the page's revision history, which isn't the case here. jlwoodwa (talk) 00:55, 22 November 2024 (UTC)[reply]
  • Option 1b sounds best since that's what those in the know seem to agree on, but 1a would probably be OK. Abzeronow (talk) 03:44, 23 November 2024 (UTC)[reply]
  • Option 1b seems like the one with the best transparency to me. Thanks. Huggums537voted! (sign🖋️|📞talk) 06:59, 25 November 2024 (UTC)[reply]

Discussion: Log the use of the HistMerge tool

CheckUser for all new users

All new users (IPs and accounts) should be subject to CheckUser against known socks. This would prevent recidivist socks from returning and save the time and energy of users who have to prove a likely case at SPI. Recidivist socks often get better at covering their "tells" each time making detection increasingly difficult. Users should not have to make the huge effort of establishing an SPI when editing from an IP or creating a new account is so easy. We should not have to endure Wikipedia:Long-term abuse/HarveyCarter, Wikipedia:Sockpuppet investigations/Phạm Văn Rạng/Archive or Wikipedia:Sockpuppet investigations/Orchomen/Archive if CheckUser can prevent them. Mztourist (talk) 04:06, 22 November 2024 (UTC)[reply]

I'm pretty sure that even if we had enough checkuser capacity to routinely run checks on every new user that doing so would be contrary to global policy. Thryduulf (talk) 04:14, 22 November 2024 (UTC)[reply]
Setting aside privacy issues, the fact that the WMF wouldn't let us do it, and a few other things: Checking a single account, without any idea of who you're comparing them to, is not very effective, and the worst LTAs are the ones it would be least effective against. This has been floated several times in the much narrower context of adminship candidates, and rejected each time. It probably belongs on ]
Why can't it be automated? What are the privacy issues and what would WMF concerns be? There has to be a better system than SPI which imposes a huge burden on the filer (and often fails to catch socks) while we just leave the door open for LTAs. Mztourist (talk) 04:39, 22 November 2024 (UTC)[reply]
How would it be automated? We can't just block everyone who even sometimes shares an IP with someone, which is most editors once you factor in mobile editing and institutional WiFi. Even if we had a system that told checkusers about all shared-IP situations and asked them to investigate, what are they investigating for? The vast majority of IP overlaps will be entirely innocent, often people who don't even know each other. There's no way for a checkuser to find any signal in all that noise. So the only way a system like this would work is if checkusers manually identified IP ranges that are being used by LTAs, and then placed blocks on those ranges to restrict them from account creation... Which is what already happens. -- ]
I would assume that IT experts can work out a way to automate CheckUser. If someone edits on a shared IP used by a previous sock that should be flagged and human CheckUsers notified so they can look at the edits and the previous sock edits and warn or block as necessary. Mztourist (talk) 05:46, 22 November 2024 (UTC)[reply]
We already have ]
Addendum: An actually potentially workable innovation would be something like a system that notifies CUs if an IP is autoblocked more than once in a certain time period. That would be a software proposal for Phabricator, though, not an enwiki policy proposal, and would still have privacy implications that would need to be squared with the WMF. -- ]
I believe Tamzin has it about right, but I want to clarify a thing. If you're hypothetically using T-Mobile (and this also applies to many other ISPs and many LTAs) then the odds are very high that you're using an IP address which has never been used before. With T-Mobile, which is not unusually large by any means, you belong to at least one /32 range which contains a number of IP addresses so big that it has 30 digits. These ranges contain a huge number of users. At the other extreme you have some countries with only a handful of IPs, which everyone uses. These IPs also typically contain a huge number of users. TL;DR: if someone is using a single IP on their own then we'll probably just block it, otherwise you're talking about matching a huge number of users. -- zzuuzz (talk) 03:20, 23 November 2024 (UTC)[reply]
As I understand it, if you're hypothetically using T-Mobile, then you're not editing, because someone range-blocked the whole network in pursuit of a vandal(s). See Wikipedia:Advice to T-Mobile IPv6 users. WhatamIdoing (talk) 03:36, 23 November 2024 (UTC)[reply]
T-Mobile USA is a perennial favourite of many of the most despicable LTAs, but that's beside the point. New users with an account can actually edit from T-Mobile. They can also edit from Jio, or Deutsche Telekom, Vodafone, or many other huge networks. -- zzuuzz (talk) 03:50, 23 November 2024 (UTC)[reply]
Would violate the policy ]
It would apply to every new User as a protective measure against sockpuppetry, like a credit check before you get a card/overdraft. ]
What you're suggesting is to just inundate checkusers with thousands of cases. The suggestion (as I understand it) removes burden from SPI filers by adding a disproportional burden on checkusers, who are already an overworked group. If you're suggesting an automated solution, then I believe IP blocks/IP range blocks and autoblock (discussed by Tamzin, above) already cover enough. It's quite hard to weigh up what you're really suggesting because it feels very vague without much detail - it sounds like you're just saying "a new SPI should be opened for every new user and IP, forever" which is not really a workable solution (for instance, 50 accounts were made in the last 15 minutes, which is about one every 18 seconds) BugGhost🦗👻 18:12, 22 November 2024 (UTC)[reply]
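The rate quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 50 accounts in 15 minutes come from the comment, while the daily figure assumes (hypothetically) that the rate stays constant around the clock, which it almost certainly does not.

```python
# Account-creation rate from the comment above: 50 accounts in 15 minutes.
accounts_created = 50
window_minutes = 15

seconds_per_account = window_minutes * 60 / accounts_created

# Hypothetical extrapolation: assumes the same rate holds for a full day.
accounts_per_day = accounts_created * (24 * 60 // window_minutes)

print(f"One new account every {seconds_per_account:.0f} seconds")
print(f"About {accounts_per_day} accounts per day if the rate were constant")
```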
And most of those accounts will make zero, one, or two edits, and then never be used again. Even if we liked this idea, doing it for every single account creation would be a waste of resources. WhatamIdoing (talk) 23:43, 22 November 2024 (UTC)[reply]
No, they should not. voorts (talk/contributions) 17:23, 22 November 2024 (UTC)[reply]
This, very bluntly, ]
Just out of curiosity: If a certain case of IPs spamming at Help Desk is any indication, would a CU be able to stop that in its tracks? 2601AC47 (talk|contribs) Isn't a IP anon 14:29, 23 November 2024 (UTC)[reply]
CUs use their tools to identify socks when technical proof is necessary. The problem you're linking to is caused by one particular ]
@]
LTA MAB is using a peer-to-peer VPN service which is similar to TOR. Blocking peer-to-peer VPN service endpoint IP addresses carries a higher risk of collateral damage because those aren't assigned to the VPN provider but rather a third party ISP who is likely to dynamically reassign the blocked address to a completely innocent party. 216.126.35.235 (talk) 00:22, 27 November 2024 (UTC)[reply]
I slightly oppose this idea. This is not
WP:DUCK as any wiki does. Ahri Boy (talk) 00:14, 25 November 2024 (UTC)[reply]
How do you know this is how Reddit deals with ban and suspension evasion? They use advanced techniques such as device and IP fingerprinting to ban and suspend users in under an hour. 2600:1700:69F1:1410:5D40:53D:B27E:D147 (talk) 23:47, 28 November 2024 (UTC)[reply]
I can see where this is coming from, but we must realise that checkuser is not ]
The question I ask myself is why must we realize that it is not meant for fishing? To catch fish, you need to fish. The no-fishing rule is not fit for purpose, nor is it a rule that other organizations that actively search for ban evasion use. Machines can do the fishing. They only need to show us the fish they caught. Sean.hoyland (talk) 05:24, 27 November 2024 (UTC)[reply]
I think for the same reason we don't want governments to be reading our mail and emails. If we checkuser everybody, then nobody has any privacy. Donald Albury 20:20, 27 November 2024 (UTC)[reply]

I sympathize with Mztourist. The current system is less effective than it needs to be. Ban evading actors make a lot of edits, they are dedicated hard-working folk in contentious topic areas. They can make up nearly 10% of new extendedconfirmed actors some years and the quicker an actor becomes EC the more likely they are to be blocked later for ban evasion. Their presence splits the community into two classes, the sanctionable and the unsanctionable with completely different payoff matrices. This has many consequences in contentious topic areas and significantly impacts the dynamics. The current rules are probably not good rules. Other systems have things like a 'commitment to authenticity' and actively search for ban evasion. It's tempting to burn it all down and start again, but with what? Having said that, the SPI folks do a great job. The average time from being granted extendedconfirmed to being blocked for ban evasion seems to be going down. Sean.hoyland (talk) 18:28, 22 November 2024 (UTC)[reply]

I confess that I am doubtful about that 10% claim. WhatamIdoing (talk) 23:43, 22 November 2024 (UTC)[reply]
WhatamIdoing, me too. I'm doubtful about everything I say because I've noticed that the chance it is slightly to hugely wrong is quite high. The EC numbers are work in progress, but I got distracted. The description "nearly 10% of new extendedconfirmed actors" is a bit misleading, because 'new' doesn't really mean new actors. It means actors that acquired EC for a given year, so newly acquired privileges. They might have registered in previous years. Also, I don't have 100% confidence in the way I count EC grants because there are some edge cases, and I'm ignoring sysops. But anyway, the statement was based on this data of questionable precision. And the statement about a potential relationship between speed of EC acquisition and probability of being blocked is based on this data of questionable precision. And of course, currently undetected socks are not included, and there will be many. Sean.hoyland (talk) 03:39, 23 November 2024 (UTC)[reply]
I'm not interested in clicking through to a Google file. Here's my back-of-the-envelope calculation: We have something like 120K accounts that would qualify for EXTCONF. Most of these are no longer active, and many stopped editing so long ago that they don't actually have the user right.
Wikipedia is almost 24 years old. That makes convenient math: On average, since inception, 5K editors have achieved EXTCONF levels each year.
If the 10% estimate is true, then 500 accounts per year – about 10 per week – are being created by banned editors and going undetected long enough for the accounts to make 500 edits and to work in CTOP areas. Do we even have enough WP:BANNED editors to make it plausible to expect banned editors to bring 500 accounts a year up to EXTCONF levels (plus however many accounts get started but are detected before then)? WhatamIdoing (talk) 03:53, 23 November 2024 (UTC)[reply]
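The back-of-the-envelope estimate above can be reproduced as a short sketch. Every input is a rough figure taken from the comment (120K qualifying accounts, about 24 years, the disputed 10% share), not measured data; the variable names are only illustrative.

```python
# Rough inputs quoted in the comment above; none of these are measured values.
qualifying_accounts = 120_000    # accounts that would qualify for EXTCONF
project_age_years = 24           # "almost 24 years old"
ban_evasion_percent = 10         # the disputed "nearly 10%" estimate
edits_for_extconf = 500          # minimum edits per EXTCONF account

extconf_per_year = qualifying_accounts // project_age_years
sock_extconf_per_year = extconf_per_year * ban_evasion_percent // 100
sock_extconf_per_week = sock_extconf_per_year / 52
min_sock_edits_per_year = sock_extconf_per_year * edits_for_extconf

print(f"{extconf_per_year} accounts reach EXTCONF per year on average")
print(f"{sock_extconf_per_year} of them per year would be ban evaders")
print(f"That is about {sock_extconf_per_week:.1f} per week")
print(f"and at least {min_sock_edits_per_year} edits per year in total")
```

The result is an average over the whole history of the project, so it says nothing about how the yearly numbers have changed over time.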
Suit yourself. I'm not interested in what interests other people or back of the envelope calculations. I'm interested in understanding the state of a system over time using evidence-based approaches by extracting data from the system itself. Let the data speak for itself. It has a lot to tell us. Then it is possible to test hypotheses and make evidence-based decisions. Sean.hoyland (talk) 04:13, 23 November 2024 (UTC)[reply]
@WhatamIdoing, there's a sockmaster in the IPA CTOP who has made more than 100 socks. 500 new XC socks every year doesn't seem that much of a stretch in comparison. -- asilvering (talk) 19:12, 23 November 2024 (UTC)[reply]
More than 100 XC socks? Or more than 100 detected socks, including socks with zero edits?
Making a lot of accounts isn't super unusual, but it's a lot of work to get 100 accounts up to 500+ edits. Making 50,000 edits is a lot, even if it's your full-time job. WhatamIdoing (talk) 01:59, 24 November 2024 (UTC)[reply]
Lots of users get it done in a couple of days, often through vandal fighting tools. It really is not that many when the edits are mostly mindless. nableezy - 00:18, 26 November 2024 (UTC)[reply]
But that's kind of my point: "A couple of days", times 100 accounts, means 200–300 days per year. If you work five days per week and 52 weeks per year, that's 260 work days. This might be possible, but it's a full-time job.
Since the 30-day limit is something that can't be achieved through effort, I wonder if a sudden change to, say, 6 months would produce a five-month reprieve. WhatamIdoing (talk) 02:23, 26 November 2024 (UTC)[reply]
Who says it’s only one at a time? Icewhiz for example has had 4 plus accounts active at a time. nableezy - 02:25, 26 November 2024 (UTC)[reply]
There is some data about ban evasion timelines for some sockmasters in PIA that show how accounts are operated in parallel. Operating multiple accounts concurrently seems to be the norm. Sean.hoyland (talk) 04:31, 26 November 2024 (UTC)[reply]
Imagine that it takes an average of one minute to make a (convincing) edit. That means that 500 edits = 8.33 hours, i.e., more than one full work day.
Imagine, too, that having reached this point, you actually need to spend some time using your newly EXTCONF account. This, too, takes time.
If you operate several accounts at once, that means:
You spend an hour editing from Account1. You spend the next hour editing from Account2. You spend another hour editing from Account3. You spend your fourth hour editing from Account4. Then you take a break for lunch, and come back to edit from Accounts 5 through 8.
At the end of the day, you have brought 8 accounts up to 60 edits (12% of the minimum goal). And maybe one of them got blocked, too, which is lost effort. At this rate, it would take you an entire year of full-time work to get 100 EXTCONF accounts, even though you are operating multiple accounts concurrently. Doing 50 edits per day in 10 accounts is not faster than doing 500 edits in 1 account. It's the same amount of work. WhatamIdoing (talk) 05:13, 29 November 2024 (UTC)[reply]
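The claim that running accounts in parallel does not reduce the total work can be illustrated with a minimal sketch. It assumes the figures from the comment (one minute per edit, 500 edits for EXTCONF, an eight-hour day split evenly across the accounts); the function name is hypothetical, and time spent using the accounts afterwards or lost to blocks is not modelled.

```python
import math

MINUTES_PER_EDIT = 1        # assumed average from the comment
EDITS_FOR_EXTCONF = 500     # edit threshold for EXTCONF
WORK_MINUTES_PER_DAY = 8 * 60

hours_per_account = EDITS_FOR_EXTCONF * MINUTES_PER_EDIT / 60


def work_days_per_account(accounts_in_parallel: int) -> float:
    """Work days spent per account when one day is split evenly across a batch."""
    edits_per_account_per_day = (
        WORK_MINUTES_PER_DAY / MINUTES_PER_EDIT / accounts_in_parallel
    )
    days_until_batch_done = EDITS_FOR_EXTCONF / edits_per_account_per_day
    return days_until_batch_done / accounts_in_parallel


# Progress of each of 8 parallel accounts after one day (60 of 500 edits).
progress_after_one_day = (WORK_MINUTES_PER_DAY / MINUTES_PER_EDIT / 8) / EDITS_FOR_EXTCONF

print(f"{hours_per_account:.2f} hours of editing per account")
print(f"{progress_after_one_day:.0%} of the goal after one day with 8 accounts")
print(f"Days per account, 1 at a time: {work_days_per_account(1):.3f}")
print(f"Days per account, 8 at a time: {work_days_per_account(8):.3f}")
```

Under these assumptions the per-account cost is identical in both cases; parallel operation changes when accounts become ready, not how much editing is needed.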
Sure it’s an effort, though it doesn’t take a minute an edit. But I’m not sure why I need to imagine something that has happened multiple times already. Icewhiz most recently had like 4-5 EC accounts active, and there are probably several more. Yes, there is an effort there. But also yes, it keeps happening. nableezy - 15:00, 29 November 2024 (UTC)[reply]
My point is that "4-5 EC accounts" is not "100". WhatamIdoing (talk) 19:31, 30 November 2024 (UTC)[reply]
It’s 4-5 at a time for a single sock master. Check the Icewhiz SPI for how many that adds up to over time. nableezy - 20:16, 30 November 2024 (UTC)[reply]
Many of our frequent fliers are already adept at warehousing accounts for months or even years, so a bump in the time period probably won't make much of a difference. Additionally, and without going into detail publicly, there are several methods whereby semi- or even fully-automated editing can be used to get to 500 edits with a minimum of effort, or at least well within script-kid territory. Because so many of those are obvious on inspection some will assume that all of them are, but there are a number of rather subtle cases that have come up over the years and it would be foolish to assume that it isn't ongoing. 184.152.68.190 (talk) 17:31, 28 November 2024 (UTC)[reply]

Also, if we divide the space into contentious vs not-contentious, maybe a one size fits all CU policy doesn't make sense. Sean.hoyland (talk) 18:55, 22 November 2024 (UTC)[reply]

Terrible idea. Let's AGF that most new users are here to improve Wikipedia instead of damage it. Some1 (talk) 18:33, 22 November 2024 (UTC)[reply]

Ban evading actors who employ deception via sockpuppetry in the WP:PIA topic area are here to improve Wikipedia, from their perspective, rather than damage it. There is no need to use faith. There are statistics. There is a probability that a 'new user' is employing ban evasion. Sean.hoyland (talk) 18:46, 22 November 2024 (UTC)[reply]
My initial comment wasn't a direct response to yours, but new users and IPs won't be able to edit in the WP:PIA topic area anyway since they need to be extended confirmed. Some1 (talk) 20:08, 22 November 2024 (UTC)[reply]
Let's not hold up the way PIA handles new users and IPs, in which they are allowed to post to talk pages but then have their talk page post removed if it doesn't fall within very specific parameters, as some sort of model. CMD (talk) 02:51, 23 November 2024 (UTC)[reply]

Strongly support automatically checkusering all active users (new and existing) at regular intervals. If it were automated -- e.g., a script runs that compares IPs, user agent, other typical subscriber info -- there would be no privacy violation, because that information doesn't have to be disclosed to any human beings. Only the "hits" can be forwarded to the CU team for follow-up. I'd run that script daily. If the policy forbids it, we should change the policy to allow it. It's mind-boggling that Wikipedia doesn't do this already. It's a basic security precaution. (Also, email-required registration and get rid of IP editing.) Levivich (talk) 02:39, 23 November 2024 (UTC)[reply]

I don't think you've been reading the comments from people who know what they are talking about. There would be hundreds, at least, of hits per day that would require human checking. The policy that prohibits this sort of massive breach of privacy is the Foundation's and so not one that en.wp could change even if it were a good idea (which it isn't). Thryduulf (talk) 03:10, 23 November 2024 (UTC)[reply]
A computer can be programmed to check for similarities or patterns in subscriber info (IP, etc), and in editing activity (time cards, etc), and content of edits and talk page posts (like the existing language similarity tool), with various degrees of certainty in the same way the Cluebot does with ORES when it's reverting vandalism. And the threshold can be set so it only forwards matches of a certain certainty to human CUs for review, so as not to overwhelm the humans. The WMF can make this happen with just $1 million of its $180 million per year (and it wouldn't be violating its own policies if it did so). Enwiki could ask for it, other projects might join too. Levivich (talk) 05:24, 23 November 2024 (UTC)[reply]
"Oh now I see what you mean, Levivich, good point, I guess you know what you're talking about, after all."
"Thanks, Thryduulf!" Levivich (talk) 17:42, 23 November 2024 (UTC)[reply]
I seem to have missed this comment, sorry. However I am very sceptical that sockpuppet detection is meaningfully automatable. From what CUs say it is as much art as science (which is why SPI cases can result in determinations like "possilikely"). This is the sort of thing that is difficult (at best) to automate. Additionally the only way to reliably develop such automation would be for humans to analyse and process a massive amount of data from accounts that both are and are not sockpuppets and classify results as one or the other, and that analysis would be a massive privacy violation on its own. Assuming you have developed this magic computer that can assign a likelihood of any editor being a sock of someone who has edited in the last three months (data older than that is deleted) on a percentage scale, you then have to decide what level is appropriate to send to humans to check. Say for the sake of argument it is 75%, that means roughly one in four people being accused are innocent and are having their privacy impinged unnecessarily - and how many CUs are needed to deal with this caseload? Do we have enough? SPI isn't exactly backlog free and there aren't hordes of people volunteering for the role (although unbreaking RFA might help with this in the medium to long term). The more you reduce the number sent to CUs to investigate, the less benefit there is over the status quo.
In addition to all the above, how similar is "similar" in terms of articles edited, writing style, timecard, etc? How are you avoiding legitimate sockpuppets? Thryduulf (talk) 18:44, 23 November 2024 (UTC)[reply]
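The "one in four" reasoning above can be made concrete with a small sketch. It reads the percentage as the share of forwarded accounts that really are socks, which is the interpretation used in the comment, and the daily volume of 200 forwarded accounts is a purely hypothetical number chosen for illustration.

```python
def expected_outcomes(flagged_per_day: int, share_correct: float) -> tuple[float, float]:
    """Split a day's forwarded accounts into expected socks and expected innocents."""
    expected_socks = flagged_per_day * share_correct
    expected_innocent = flagged_per_day - expected_socks
    return expected_socks, expected_innocent


FLAGGED_PER_DAY = 200  # hypothetical volume, not a figure from the discussion

for share_correct in (0.75, 0.90, 0.99):
    socks, innocent = expected_outcomes(FLAGGED_PER_DAY, share_correct)
    print(f"{share_correct:.0%} correct: {socks:.0f} socks, {innocent:.0f} innocent users checked")
```

The sketch does not show how many socks go unflagged at each level, which is the other half of the trade-off raised in the comment.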
You know this already but for anyone reading this who doesn't: when a CU "checks" somebody, it's not like they send a signal out to that person's computer to go sniffing around. In fact, all the subscriber info (IP address, etc.) is already logged on the WMF's server logs (as with any website). A CU "check" just means a volunteer CU gets to look at a portion of those logs (to look up a particular account's subscriber info). That's the privacy concern: we have rules, rightfully so, about when volunteer CUs (not WMF staff) can read the server logs (or portions of them). Those rules do not apply to WMF staff, like devs and maintenance personnel, nor do they apply to the WMF's own software reading its own logs. Privacy is only an issue when those logs are revealed to volunteer CUs.
So... feeding the logs into software in order to train the software doesn't violate anyone's policy. It's just letting a computer read its own files. Human verification of the training outcomes also doesn't have to violate anyone's privacy -- just don't use volunteer CUs to do it, use WMF staff. Or, anonymize the training data (changing usernames to "Example1", "Example2", etc.). Or use historical data -- which would certainly be part of the training, since the most effective way would be to put known socks into the training data to see if the computer catches them.
Anyway, training the system won't violate anyone's privacy.
As for the hit rate -- 75% would be way, way too low. We'd be looking for definitely over 90% or 95%, and probably more like 99.something percent. Cluebot doesn't get vandalism wrong 1 out of 4 times, neither should CluebotCU. Heck, if CluebotCU can't do better than 75%, it's not worth doing. A more interesting question is whether the 99.something% hit rate would be helpful to CUs, or whether that would only catch the socks that are so obvious you don't even need CU to recognize them. Only testing in the field would tell.
But overall, AI looking for patterns, and checking subscriber info, edit patterns, and the content of edits, would be very helpful in tamping down on socking, because the computer can make far more checks than a human (a computer can look at 1,000 accounts and 100,000 edits no problem, which no human can do), it'll be less biased than humans, and it can do it all without violating anyone's privacy -- in fact, lowering the privacy violations by lowering the false positives, sending only high-probability (90%+, not 75%+) to humans for review. And it can all be done with existing technology, and the WMF has the money to do it. Levivich (talk) 19:38, 23 November 2024 (UTC)[reply]
The more you write the clearer you make it that you don't understand checkuser or the WMF's policies regarding privacy. It's also clear that I'm not going to convince you that this is unworkable so I'll stop trying. Thryduulf (talk) 20:42, 23 November 2024 (UTC)[reply]
Yeah it's weird how repeatedly insulting me hasn't convinced me yet. Levivich (talk) 20:57, 23 November 2024 (UTC)[reply]
If you are unable to distinguish between reasoned disagreement and insults, then it's not at all weird that reasoned disagreement fails to convince you. Thryduulf (talk) 22:44, 23 November 2024 (UTC)[reply]
@Levivich: Whatever existing data set we have has too many biases to be useful for this, and this is going to be prone to false positives. AI needs lots of data to be meaningfully trained. Also, AI here would be learning a function; when the output is not in fact a function of the input, there's nothing for an AI model to target, and this is very much the case here. On Wikidata, where I am a CheckUser, almost all edit summaries are automated even for human edits (just like clicking the rollback button is, or undoing an edit is by default), and it is very hard to meaningfully tell whether someone is a sock or not without highly case-specific analysis. No AI model is better than the data it's trained on.
Also, about the privacy policy: you are completely incorrect when you say "Those rules do not apply to WMF staff, like devs and maintenance personnel, nor do they apply to the WMF's own software reading its own logs". Staff can only access that information on a need to know basis, just like CheckUsers, and data privacy laws like the EU's and California's mean you cannot just do whatever random thing you want with the information you collect from users about them.--Jasper Deng (talk) 21:56, 23 November 2024 (UTC)[reply]
So which part of the wmf:Privacy Policy would prohibit the WMF from developing an AI that looks at server logs to find socks? Do you want me to quote to you the portions that explicitly disclose that the WMF uses personal information to develop tools and improve security? Levivich (talk) 22:02, 23 November 2024 (UTC)[reply]
I mean yeah that would probably be more productive than snarky bickering BugGhost🦗👻 22:05, 23 November 2024 (UTC)[reply]
@Levivich: Did you read the part where I mentioned privacy laws? Also, in this industry no one is allowed unfettered usage of private data even internally; there are internal policies that govern this that are broadly similar to the privacy policy. It's one thing to test a proposed tool on an IP address like Special:Contribs/2001:db8::/32, but it's another to train an AI model on it. Arguably an equally big privacy concern is the usage of new data from new users after the model is trained and brought online. The foundation is already hiding IP addresses by default even for anonymous users soon, and they will not undermine that mission through a tool like this. Ultimately, the Board of Trustees has to assume legal responsibility and liability for such a thing; put yourself in their position and think of whether they'd like the liability of something like this.--Jasper Deng (talk) 22:13, 23 November 2024 (UTC)[reply]
So can you quote a part of the privacy policy, or a part of privacy laws, or anything, that would prohibit feeding server logs into a "Cluebot-CU" to find socking?
Because I can quote the part of the wmf:Privacy Policy that allows it, and it's a lot:

We may use your public contributions, either aggregated with the public contributions of others or individually, to create new features or data-related products for you or to learn more about how the Wikimedia Sites are used ...

Because of how browsers work, we receive some information automatically when you visit the Wikimedia Sites ... This information includes the type of device you are using (possibly including unique device identification numbers, for some beta versions of our mobile applications), the type and version of your browser, your browser's language preference, the type and version of your device's operating system, in some cases the name of your internet service provider or mobile carrier, the website that referred you to the Wikimedia Sites, which pages you request and visit, and the date and time of each request you make to the Wikimedia Sites.

Put simply, we use this information to enhance your experience with Wikimedia Sites. For example, we use this information to administer the sites, provide greater security, and fight vandalism; optimize mobile applications, customize content and set language preferences, test features to see what works, and improve performance; understand how users interact with the Wikimedia Sites, track and study use of various features, gain understanding about the demographics of the different Wikimedia Sites, and analyze trends. ...

We actively collect some types of information with a variety of commonly-used technologies. These generally include tracking pixels, JavaScript, and a variety of "locally stored data" technologies, such as cookies and local storage. ... Depending on which technology we use, locally stored data may include text, Personal Information (like your IP address), and information about your use of the Wikimedia Sites (like your username or the time of your visit). ... We use this information to make your experience with the Wikimedia Sites safer and better, to gain a greater understanding of user preferences and their interaction with the Wikimedia Sites, and to generally improve our services. ...

We and our service providers use your information ... to create new features or data-related products for you or to learn more about how the Wikimedia Sites are used ... To fight spam, identity theft, malware and other kinds of abuse. ... To test features to see what works, understand how users interact with the Wikimedia Sites, track and study use of various features, gain understanding about the demographics of the different Wikimedia Sites and analyze trends. ...

When you visit any Wikimedia Site, we automatically receive the IP address of the device (or your proxy server) you are using to access the Internet, which could be used to infer your geographical location. ... We use this location information to make your experience with the Wikimedia Sites safer and better, to gain a greater understanding of user preferences and their interaction with the Wikimedia Sites, and to generally improve our services. For example, we use this information to provide greater security, optimize mobile applications, and learn how to expand and better support Wikimedia communities. ...

We, or particular users with certain administrative rights as described below, need to use and share your Personal Information if it is reasonably believed to be necessary to enforce or investigate potential violations of our Terms of Use, this Privacy Policy, or any Wikimedia Foundation or user community-based policies. ... We may also disclose your Personal Information if we reasonably believe it necessary to detect, prevent, or otherwise assess and address potential spam, malware, fraud, abuse, unlawful activity, and security or technical concerns. ... To facilitate their work, we give some developers limited access to systems that contain your Personal Information, but only as reasonably necessary for them to develop and contribute to the Wikimedia Sites. ...

Yeah that's a lot. Then there's this whole FAQ that says

It is important for us to be able to make sure everyone plays by the same rules, and sometimes that means we need to investigate and share specific users' information to ensure that they are.

For example, user information may be shared when a CheckUser is investigating abuse on a Project, such as suspected use of malicious "sockpuppets" (duplicate accounts), vandalism, harassment of other users, or disruptive behavior. If a user is found to be violating our Terms of Use or other relevant policy, the user's Personal Information may be released to a service provider, carrier, or other third-party entity, for example, to assist in the targeting of IP blocks or to launch a complaint to the relevant Internet Service Provider.

So using IP addresses, etc., to develop new tools, to test features, to fight violations of the Terms of Use, and disclosing that info to Checkusers... all explicitly permitted by the Privacy Policy. Levivich (talk) 22:22, 23 November 2024 (UTC)[reply]
@
preaching to the choir: only the foundation could even consider assuming this risk. Also, it's clear that you do not have a single idea of how developing something like this works if you think it can be done for $1 million. Something this complex has to be done right and tech salaries and computing resources are expensive.--Jasper Deng (talk) 22:28, 23 November 2024 (UTC)[reply]
What I am suggesting does not involve sharing everyone's data with Checkusers. It's pretty obvious that looking at their own server logs is "necessary to enforce or investigate potential violations of our Terms of Use". Five people is how big the WMF's wmf:Machine Learning team is, @ $200k each, $1m/year covers it. Five people is enough for that team to improve ORES, so another five-person team dedicated to "ORES-CU" seems a reasonable place to start. They could double that, and still have like $180M left over. Levivich (talk) 22:40, 23 November 2024 (UTC)[reply]
@
nature cannot be fooled. I'm finished arguing with you anyways, because this proposal is either way dead on arrival.--Jasper Deng (talk) 23:45, 23 November 2024 (UTC)[reply]
@Jasper Deng, haggling over the math here isn't really important. You could quintuple the figures @Levivich gave and the Foundation would still have millions upon millions of dollars left over. -- asilvering (talk) 23:48, 23 November 2024 (UTC)[reply]
@Asilvering: The point I'm making is Levivich does not understand the complexity behind this kind of thing and thus his arguments are not to be given weight by the closer. Jasper Deng (talk) 23:56, 23 November 2024 (UTC)[reply]
As a statistician/data scientist, @
community notes
feature.
IANAL, so I can't comment on the legal side of this, and I can't comment on whether that money would be better-spent elsewhere since I don't know what the WMF budget looks like. Overall though, the technical implementation wouldn't be a major hurdle. – Closed Limelike Curves (talk) 20:44, 24 November 2024 (UTC)[reply]
Third-party services like Sift.com provide this kind of algorithm-based account fraud protection as an alternative to building and maintaining internally. czar 23:41, 24 November 2024 (UTC)[reply]
Building such a model is only a small part of a real production system. If this system is to operate on all account creations, it needs to be at least as reliable as the existing systems that handle account creations. As you probably know, data scientists developing such a model need to be supported by software engineers and site reliability engineers supporting the actual system. Then you have the problem of new sockers who are not on the list of sockmasters to check against. Non-English-language speakers often would be put at a disadvantage too. It's not as trivial as you make it out to be, thus I stand by my estimate.--Jasper Deng (talk) 06:59, 25 November 2024 (UTC)[reply]
None of you have accounted for Hofstadter's law.
I don't think we need to spend more time speculating about a system that WMF Legal is extremely unlikely to accept. Even if they did, it wouldn't exist until several years from now. Instead, let's try to think of things that we can do ourselves, or with only a very little assistance. Small, lightweight projects with full community control can help us now, and if we prove that ____ works, the WMF might be willing to adopt and expand it later. WhatamIdoing (talk) 23:39, 25 November 2024 (UTC)[reply]
That's a mistake -- doing the same thing Wikipedia has been doing for 20+ years. The mistake is in leaving it to volunteers to catch sockpuppetry, rather than insisting that the WMF devote significant resources to it. And it's a mistake because the one thing we volunteers can't do, that the WMF can do, is comb through the server logs looking for patterns. Levivich (talk) 23:44, 25 November 2024 (UTC)[reply]
Not sure about the "building an ML algorithm to detect sockpuppets would be pretty easy" part, but I admire the optimism. It is certainly the case that it is possible, and people have done it with a surprising level of success a very long time ago in ML terms e.g. https://doi.org/10.1016/j.knosys.2018.03.002. These projects tend to rely on the category graph to distinguish sock and non-sock sets for training, the categorization of accounts as confirmed or suspected socks. However, the category graph is woefully incomplete i.e. there is information in the logs that is not reflected in the graph, so ensuring that all ban evasion accounts are properly categorized as such might help a bit. Sean.hoyland (talk) 03:58, 26 November 2024 (UTC)[reply]
Thankfully, we wouldn't have to build an ML algorithm, we can just use one of the existing ones. Some are even open source. Or WMF could use a third party service like the aforementioned sift.com. Levivich (talk) 16:17, 26 November 2024 (UTC)[reply]
Let me guess: Essentially, you would like their machine-learning team to use Sift's AI-Powered Fraud Protection, which from what I can glance, handles safeguarding subscriptions to defending digital content and in-app purchases and helps businesses reduce friction and stop sophisticated fraud attacks that gut growth, to provide the ability for us to automatically checkuser all active users? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:25, 26 November 2024 (UTC)[reply]
The WMF already has the ability to "automatically checkuser all users" (the verb "checkuser" just means "look at the server logs"), I'm suggesting they use it. And that they use it in a sophisticated way, employing (existing, open source or commercially available) AI/ML technologies, like the same kind we already use to automatically revert vandalism. Contrary to claims here, doing so would not be illegal or even expensive (comparatively, for the WMF). Levivich (talk) 16:40, 26 November 2024 (UTC)[reply]
So, in my attempt to get things set right and steer towards a consensus that is satisfactory, I sincerely follow-up: What lies beyond that in this vast, uncharted sea? And could this mean any more in the next 5 years? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:49, 26 November 2024 (UTC)[reply]
What lies beyond is mw:Extension:SimilarEditors. Levivich (talk) 17:26, 26 November 2024 (UTC)[reply]
So, @2601AC47, I think the answer to your question is "tell the WMF we really, really, really would like more attention to sockpuppetry and IP abuse from the ML team". -- asilvering (talk) 17:31, 26 November 2024 (UTC)[reply]
Which I don't suppose someone can at the next board meeting on December 11? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 18:00, 26 November 2024 (UTC)[reply]
I may also point to this, where they mention development in other areas, such as social media features and machine learning expertise. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:36, 26 November 2024 (UTC)[reply]
e.g. m:Research:Sockpuppet_detection_in_Wikimedia_projects Sean.hoyland (talk) 17:02, 26 November 2024 (UTC)[reply]
And that mentions Socksfinder, still in beta it seems. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 17:10, 26 November 2024 (UTC)[reply]
3 days! When I first posted my comment and some editors responded that I didn't know what I was talking about, it can't be done, it'd violate the privacy policy and privacy laws, WMF Legal would never allow it... I was wondering how long it would take before somebody pointed out that this thing that can't be done has already been done and has been under development for at least 7 years now.
Of course it's already under development, it's pretty obvious that the same Wikipedia that developed ClueBot, one of the world's earlier and more successful examples of ML applications, would try to employ ML to fight multiple-account abuse. I mean, I'm obviously not gonna be the first person to think of this "innovation"!
Anyway, it took 3 days. Thanks, Sean! Levivich (talk) 17:31, 26 November 2024 (UTC)[reply]
Unlike what is being proposed, SimilarEditors only works based on publicly available data (e.g. similarities in editing patterns), and not IP data. To quote the page Sean linked, in the model's current form, we are only considering public data, but most saliently private data such as IP addresses or user-agent information are features currently used by checkusers that could be later (carefully) incorporated into the models.
So, not only does the current model not look at IP data, the research project also acknowledges that actually using such data should only be done in a "careful" way, because of those very same privacy policy issues quoted above.
On the ML side, however, this does prove that it's being worked on, and I'm honestly not surprised at all that the WMF is working on machine learning-based tools to detect sockpuppets. Chaotic Enby (talk · contribs) 17:50, 26 November 2024 (UTC)[reply]
Right. We should ask WMF to do the later (carefully) incorporated into the models part (especially since it's now later). BTW, the SimilarUsers API already pulls IP and other metadata. SimilarExtensions (a tool that uses the API) doesn't release that information to CheckUsers, by design. And that's a good thing, we can't just release all IPs to CheckUsers, it does indeed have to be done carefully. But user metadata can be used. What I'm suggesting is that the WMF should proceed to develop these types of tools (including the careful use of user metadata). Levivich (talk) 17:57, 26 November 2024 (UTC)[reply]
Not really clear that they're pulling IP data from logged-in users. The relevant section reads:

USER_METADATA (203MB): for every user in COEDIT_DATA, this contains basic metadata about them (total number of edits in data, total number of pages edited, user or IP, timestamp range of edits).

This reads like they're collecting the username or IP depending on whether they're a logged-in user or an IP user. Chaotic Enby (talk · contribs) 18:14, 26 November 2024 (UTC)[reply]
In a few years people might look back on these days when we only had to deal with simple devious primates employing deception as the halcyon days. Sean.hoyland (talk) 18:33, 26 November 2024 (UTC)[reply]
I assumed 1 million USD/year was accounting for Hofstadter's law several times over. Otherwise it feels wildly pessimistic. – Closed Limelike Curves (talk) 15:57, 26 November 2024 (UTC)[reply]
IP range 2600:1700:69F1:1410:0:0:0:0/64 blocked by a CU
The following discussion has been closed. Please do not modify it.
Why do you guys hate the WMF so much? If it weren’t for them, you wouldn’t have this website at all. 2600:1700:69F1:1410:5D40:53D:B27E:D147 (talk) 23:51, 28 November 2024 (UTC)[reply]
We don’t. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 01:13, 29 November 2024 (UTC)[reply]
Then why do you guys always whine and complain about how incompetent they are and how much money they make and are actively against their donation drives? 2600:1700:69F1:1410:6DF5:851F:7413:CA3B (talk) 01:29, 29 November 2024 (UTC)[reply]
We don't. Levivich (talk) 02:47, 29 November 2024 (UTC)[reply]
Don’t “we don’t” me again. 2600:1700:69F1:1410:C812:78B7:C08A:5AA5 (talk) 03:11, 29 November 2024 (UTC)[reply]
This may be surprising, but it turns out there's more than one person on Wikipedia, and many of us have different opinions on things. You're probably thinking of @Guy Macon's essay.
I disagree with his argument that the WMF is incompetent, but at the same time, smart thinking happens on the margin. Just because the WMF spent their first $20 million extremely well (on creating Wikipedia) doesn't mean giving them $200 million would make them 10× as good. Nobody here thinks the WMF budget should be cut to $0; there's just some of us who think it needs a haircut.
For me it comes down to, "if you don't donate to the WMF, what does that money go instead"? I'd rather you give that money to some other charity—feeding African children is more important than reskinning Wikipedia—but if you won't, I'd doubt giving it to the WMF is worse than whatever else you were going to spend it on. Whether we should cut back on ads depends on whether this money is coming out of donors' charity budgets or their regular budgets. – Closed Limelike Curves (talk) 03:10, 29 November 2024 (UTC)[reply]
I already struggle enough with prioritizing charities and whether which ones are ethical or not and how I should be spending every single penny I get on charities dealing with PIA and trans issues because those are the most oppressed groups in the world right now. The WMF is not helping people who are actively getting killed and having their rights taken away therefore they are not important. 2600:1700:69F1:1410:C812:78B7:C08A:5AA5 (talk) 03:15, 29 November 2024 (UTC)[reply]
In that case, I'd suggest checking out GiveWell, which has some very good recommendations. That said, this subthread feels wildly off-topic. – Closed Limelike Curves (talk) 03:33, 29 November 2024 (UTC)[reply]
So goes this whole discussion; but to give a slightly longer answer to the IP: We’re not telling them to get lost on a different path, we’re trying (despite everything) to establish relations, consensus and mutual trust. And hopefully long-term progress on key areas of contention. We don’t hate them, or else they’ll dismiss us completely. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 03:44, 29 November 2024 (UTC)[reply]
Any such system would be subject to numerous biases or be easily defeatable. Such an automated anti-abuse system would have to be exclusively a foundation initiative as only they have the resources for such a monumental undertaking. It would need its own team of developers.--Jasper Deng (talk) 18:57, 23 November 2024 (UTC)[reply]

Absolutely no chance that this would pass. WP:SNOW, even though there isn't a flood of opposes. There are two problems:

  1. The existing CheckUser team barely has the bandwidth for the existing SPI load. Doing this on every single new user would be impractical and would enable WP:LTAs by diverting valuable CheckUser bandwidth.
  2. Even if we had enough CheckUsers, this would be a severe privacy violation absolutely prohibited under the Foundation privacy policy.

The vast majority of vandals and other disruptive users don't need CU involvement to deal with. There's very little to be gained from this.--Jasper Deng (talk) 18:36, 23 November 2024 (UTC)[reply]

It is perhaps an interesting conversation to have but I have to agree that it is unworkable, and directly contrary to foundation-level policy which we cannot make a local exemption to. En.wp, I believe, already has the largest CU team of any WMF project, but we would need hundreds more people on that team to handle something like this. In the last round of appointments, the committee approved exactly one checkuser, and that one was a returning former member of the team. And there is the very real risk that if we appointed a whole bunch of new CUs, some of them would abuse the tool. Just Step Sideways from this world ..... today 18:55, 23 November 2024 (UTC)[reply]
And it's worth pointing out that the Committee approving too few volunteers for Checkuser (regardless of whether you think they are or aren't) is not a significant part of this issue. There simply are not tens of people who are putting themselves forward for consideration as CUs. Since 2016 54 applications (an average of per year) have been put forward for consideration by Functionaries (the highest was 9, the lowest was 2). Note this is total applications not applicants (more than one person has applied multiple times), and is not limited to candidates who had a realistic chance of being appointed. Thryduulf (talk) 20:40, 23 November 2024 (UTC)[reply]
The dearth of candidates has for sure been an ongoing thing, it's worth reminding admins that they don't have to wait for the committee to call for candidates, you can put your name forward at any time by emailing the committee. Just Step Sideways from this world ..... today 23:48, 24 November 2024 (UTC)[reply]
Generally, I tend to get the impression from those who have checkuser rights that CU should be done as a last resort, and other, less invasive methods are preferred, and it would seem that indiscriminate use of it would be a bad idea, so I would have some major misgivings about this proposal. And given the ANI case, the less user information that we retain, the better (which is also probably why temporary accounts are a necessary and prudent idea despite other potential drawbacks). Abzeronow (talk) 03:56, 23 November 2024 (UTC)[reply]
Oppose. A lot has already been written on the unsustainable workload for the CU team this would create and the amount of collateral damage; I'll add in the fact that our most notorious sockmasters in areas like PIA already use highly sophisticated methods to evade CU detection, and based on what I've seen at the relevant SPIs most of the blocks in these cases are made with more weight given to the behaviour, and even then only after lengthy deliberations on the matter. These sort of sockmasters seem to have been in the OP's mind when the request was made, and I do not see automated CU being of any more use than current techniques against such dedicated sockmasters. And, as has been mentioned before, most cases of sockpuppetry (such as run-of-the-mill vandals and trolls using throwaway accounts for abuse) don't need CU anyways. JavaHurricane 08:17, 24 November 2024 (UTC)[reply]
These are, unfortunately, fair points about the limits of CU and the many experienced and dedicated ban evading actors in PIA. CU information retention policy is also a complicating factor. Sean.hoyland (talk) 08:28, 24 November 2024 (UTC)[reply]
As I said in my original post, recidivist socks often get better at covering their "tells" each time making behavioural detection increasingly difficult and meaning the entire burden falls on the honest user to convince an Admin to take an SPI case seriously with scarce evidence. After many years I'm tired of defending various pages from sock POV edits and if WMF won't make life easier then increasingly I just won't bother, I'm sure plenty of other users feel the same way. Mztourist (talk) 05:45, 26 November 2024 (UTC)[reply]

SimilarEditors

The development of mw:Extension:SimilarEditors -- the type of tool that could be used to do what Mztourist suggests -- has been "stalled" since 2023 and downgraded to low-priority in 2024, according to its documentation page and related phab tasks (see e.g. phab:T376548, phab:T304633, phab:T291509). Anybody know why? Levivich (talk) 17:43, 26 November 2024 (UTC)[reply]

Honestly, the main function of that sort of thing seems to be compiling data that is already available on XTools and various editor interaction analyzers, and then presenting it nicely and neatly. I think that such a page could be useful as a sanity check, and it might even be worth having that sort of thing as a standalone toolforge app, but I don't really see why the WMF would make that particular extension a high priority. — Red-tailed hawk (nest) 17:58, 26 November 2024 (UTC)[reply]
Well, it doesn't have to be that particular extension, but it seems to me that the entire "idea" has been stalled, unless they're working on another tool that I'm unaware of (very possible). (Or, it could be because of recent changes in domestic and int'l privacy laws that derailed their previous development advances, or it could be because of advancements in ML elsewhere making in-house development no longer practical.)

As to why the WMF would make this sort of problem a high priority, I'd say because the spread of misinformation on Wikipedia by sockpuppets is a big problem. Even without getting into the use of user metadata, just look at recent SPIs I filed, like Wikipedia:Sockpuppet investigations/Icewhiz/Archive#27 August 2024 and Wikipedia:Sockpuppet investigations/Icewhiz/Archive#09 October 2024. That involved no private data at all, but a computer could have done automatically, in seconds, what took me hours to do manually, and those socks could have been uncovered before they made thousands and thousands of edits spreading misinformation. If the computer looked at private data as well as public data, it would be even more effective (and would save CUs time as well). Seems to me to be a worthy expenditure of 0.5% or 1% of the WMF's annual budget. Levivich (talk) 18:09, 26 November 2024 (UTC)[reply]

This looks really interesting. I don't really know how extensions are rolled out to individual wikis - can anyone with knowledge about that summarise if having this tool turned on (for check users/relevant admins) for en.wp is feasible? Do we need a RFC, or is this a "maybe wait several years for a phab ticket" situation? BugGhost🦗👻 18:09, 26 November 2024 (UTC)[reply]
I find it amusing that ~4 separate users above are arguing that automatic identification of sockpuppets is impossible, impractical, and the WMF would never do it—and meanwhile, the WMF is already doing it. – Closed Limelike Curves (talk) 19:29, 27 November 2024 (UTC)[reply]
So, discussion is over? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 19:31, 27 November 2024 (UTC)[reply]
I think what's happening is that people are having two simultaneous discussions – automatic identification of sockpuppets is already being done, but what people say "the WMF would never do" is using private data (e.g. IP addresses) to identify them. Which adds another level of (ethical, if not legal) complications compared to what SimilarEditors is doing (only processing data everyone can access, but in an automated way). Chaotic Enby (talk · contribs) 07:59, 28 November 2024 (UTC)[reply]
"automatic identification of sockpuppets is already being done" is probably an overstatement, but I agree that there may be a potential legal and ethical minefield between the Similarusers service that uses public information available to anyone from the databases after redaction of private information (i.e. coarse-grained sampling of revision timestamps combined with an attempt to quantify page intersection data), and a service that has access to the private information associated with a registered account name. Sean.hoyland (talk) 11:15, 28 November 2024 (UTC)[reply]
The WMF said they're planning on incorporating IP addresses and device info as well! – Closed Limelike Curves (talk) 21:21, 29 November 2024 (UTC)[reply]
Yes, automatic identification of (these) sockpuppets is impossible. There are many reasons for this, but the simplest one is this: These types of tools require hundreds of edits – at minimum – to return any viable data, and the sort of sockmasters who get accounts up to that volume of edits know how to evade detection by tools that analyse public information. The markers would likely indicate people from similar countries – naturally, two Cypriots would be interested in Category:Cyprus and over time similar hour and day overlaps will emerge, but what's to let you know whether these are actual socks when they're evading technical analysis? You're back to square one. There are other tools such as mediawikiwiki:User:Ladsgroup/masz which I consider equally circumstantial; an analysis of myself returns a high likelihood of me being other administrators and arbitrators, while analysing an alleged sock currently at SPI returns the filer as the third most likely sockmaster. This is not commentary on the tools themselves, but rather simply the way things are. DatGuyTalkContribs 17:42, 28 November 2024 (UTC)[reply]
Oh, fun! Too bad it's CU-restricted, I'm quite curious to know what user I'm most stylometrically similar to. -- asilvering (talk) 17:51, 28 November 2024 (UTC)[reply]
That would be LittlePuppers and LEvalyn. DatGuyTalkContribs 03:02, 29 November 2024 (UTC)[reply]
Fascinating! One I've worked with, one I haven't, both AfC reviewers. Not bad. -- asilvering (talk) 06:14, 29 November 2024 (UTC)[reply]
Idk, the half dozen ARBPIA socks I recently reported at SPI were obvious af to me, as are several others I haven't reported yet. That may be because that particular sockfarm is easy to spot by its POV pushing and a few other habits; though I bet in other topic areas it's the same.
WP:ARBECR helps because it forces the socks to make 500 edits minimum before they can start POV pushing, but still we have to let them edit for a while post-XC just to generate enough diffs to support an SPI filing. Software that combines tools like Masz and SimilarEditor, and does other kinds of similar analysis, could significantly reduce the amount of editor time required to identify and report them. Levivich (talk) 18:02, 28 November 2024 (UTC)[reply]
I think it is possible, studies have demonstrated that it is possible, but it is true that having a sufficient number of samples is critical. Samples can be aggregated in some cases. There are several other important factors too. I have tried some techniques, and sometimes they work, or let's say they can sometimes produce results consistent with SPI results, better than random, but with plenty of false positives. It is also true that there are a number of detection countermeasures (that I won't describe) that are already employed by some bad actors that make detection harder. But I think the objective should be modest, to just move a bit in the right direction by detecting more ban evading accounts than are currently detected, or at least to find ways to reduce the size of the search space by providing ban evasion candidates. Taking the human out of the detection loop might take a while. Sean.hoyland (talk) 18:39, 28 November 2024 (UTC)[reply]
If you mean it's never going to be possible to catch some sockpuppets—the best-hidden, cleverest, etc. ones—you're completely correct. But I'm guessing we could cut the amount of time SPI has to spend dramatically with just some basic checks. – Closed Limelike Curves (talk) 02:27, 29 November 2024 (UTC)[reply]
I disagree. Empirically, the vast majority of time spent at SPI is not on finding possible socks, nor is it using the CheckUser tool on them, but rather it's the CU completed cases (of which there are currently 14 and I should probably stop slacking and get onto some) with non-definitive technical results waiting on an administrator to make the final determination on whether they're socks or not. Extension:SimilarUsers would concentrate various information that already exists (EIA, RoySmith's SPI tools) in one place, but I wouldn't say the accessibility of these tools is a cause of SPI backlog. An AI analysis tool to give an accurate magic number for likelihood? I'm anything but a Luddite, but still believe that's wishful thinking. DatGuyTalkContribs 03:02, 29 November 2024 (UTC)[reply]
Something seems better than nothing in this context doesn't it? EIA and the Similarusers service don't provide an estimate of the significance of page intersections. An intersection on a page with few revisions or few unique actors or few pageviews etc. is very different from a page intersection on the Donald Trump page. That kind of information is probably something that could sometimes help, even just to evaluate the importance of intersection evidence presented at SPIs. It seems to me that any kind of assistance could help. And another thing about the number of edits is that too many samples can also present challenges related to noise, with signals getting smeared out, although the type of noise in a user's data can itself be a characteristic signal in some cases it seems. And if there are too few samples, you can generate synthetic samples based on the actual samples and inject them into spaces. Search strategy matters a lot. The space of everyone vs everyone is vast, so good luck finding potential matches in that space without a lot of compute, especially for diffs. But many socks inhabit relatively small subspaces of Wikipedia, at least in the 20%-ish of time (on average in PIA) they edit(war)/POV-push etc. in their topic of interest. So, choosing the candidate search space and search strategy wisely can make the problem much more tractable for a given topic area/subspace. Targeted fishing by picking a potential sock and looking for potential matches (the strategy used by the Similarusers service and CU I guess) is obviously a very different challenge than large-scale industrial fishing for socks in general. Sean.hoyland (talk) 04:08, 29 November 2024 (UTC)[reply]
And to continue the whining about existing tools, EIA and the Similarusers service use a suboptimal strategy in my view. If the objective is page intersection information for a potential sock against a sockmaster, and a ban evasion source has employed n identified actors so far e.g. almost 50 accounts for Icewhiz, the source's revision data should be aggregated for the intersection. This is not difficult to do using the category graph and the logs. Sean.hoyland (talk) 04:25, 29 November 2024 (UTC)[reply]
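To make the two points above concrete (weighting an intersection by how busy the page is, and pooling all known accounts of one ban evasion source before intersecting), here is a minimal Python sketch. Everything in it is hypothetical: the `(user, page)` input format, the function names, and the log-based rarity weight are illustrative assumptions, not how EIA or the Similarusers service actually work.

```python
import math
from collections import defaultdict

def page_weights(revisions):
    """Weight each page by rarity: fewer distinct editors means a higher weight.

    `revisions` is a hypothetical list of (user, page) pairs.
    """
    editors = defaultdict(set)
    for user, page in revisions:
        editors[page].add(user)
    n_users = len({user for user, _ in revisions})
    # A page edited by every account gets weight 0; obscure pages score higher.
    return {page: math.log(n_users / len(users)) for page, users in editors.items()}

def weighted_intersection(revisions, candidate, known_accounts):
    """Score a candidate against the pooled pages of all known accounts of one source."""
    weights = page_weights(revisions)
    candidate_pages = {p for u, p in revisions if u == candidate}
    # Aggregate the source's revision data across every identified account.
    source_pages = {p for u, p in revisions if u in known_accounts}
    shared = candidate_pages & source_pages
    return sum(weights[p] for p in shared), shared
```

With this kind of weighting, an overlap on a page with two editors counts for much more than an overlap on a page that almost everyone in the sample has touched. The score is only a way to rank candidates for human review, not evidence on its own.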
There is so much more that could be done with the software. EIA gives you page overlaps (and isn't 100% accurate at it), but it doesn't tell you:
  • how many times the accounts made the same edits (tag team edit warring)
  • how many times they voted in the same formal discussions (RfC, AfD, RM, etc) and whether they voted the same way or different (vote stacking)
  • how many times they use the same language and whether they use unique phraseology
  • whether they edit at the same times of day
  • whether they edit on the same days
  • whether account creation dates (or start-of-regular-editing dates) line up with when other socks were blocked
  • whether they changed focus after reaching XC and to what extent (useful in any ARBECR area)
  • whether they "gamed" or "rushed" to XC (same)
All of this (and more) would be useful to see in a combined way, like a dashboard. It might make sense to restrict access to such compilations of data to CUs, and the software could also throw in metadata or subscriber info in there, too (or not), and it doesn't have to reduce it all into a single score like ORES, but just having this info compiled in one place would save editors the time of having to compile it manually. If the software auto-swept logs for this info and alerted humans to any "high scores" (however defined, eg "matches across multiple criteria"), it would probably not only reduce editor time but also increase sock discovery. Levivich (talk) 04:53, 29 November 2024 (UTC)[reply]
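As a rough illustration of two of the signals in the list above (editing at the same times of day, and account creation lining up with earlier blocks), here is a small Python sketch. The function names, the 14-day window, and the use of cosine similarity are all assumptions made for the example; none of this describes an existing tool.

```python
import math
from collections import Counter
from datetime import datetime, timedelta

def hour_histogram(timestamps):
    """Count edits per hour of day (0-23) for one account."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(24)]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length count vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def created_soon_after_block(created, block_times, window_days=14):
    """True if the account was created within `window_days` after any earlier block.

    The 14-day default is an arbitrary illustrative choice.
    """
    window = timedelta(days=window_days)
    return any(timedelta(0) <= created - blocked <= window for blocked in block_times)
```

Signals like these are circumstantial: two unrelated editors in the same timezone will have similar hour histograms. A dashboard would presumably show them side by side rather than treat any single one as decisive.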
This is like one of my favorite strategies for meetings. Propose multiple things, many of which are technically challenging, then just walk out of the meeting.
The 'how many times the accounts made the same edits' is probably do-able because you can connect reverted revisions to the revisions that reverted them using json data in the database populated as part of the tagging system, look at the target state reverted to and whether the revision was an exact revert. ...or maybe not without computing diffs, having just looked at an article with a history of edit warring. Sean.hoyland (talk) 07:43, 29 November 2024 (UTC)[reply]
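For the exact-revert case only, a sketch in Python of what "look at the target state reverted to" could mean, assuming each revision in a page history comes with a content hash. The `(rev_id, user, sha1)` tuples and both function names are hypothetical. Partial reverts would not be caught this way, which is the part that needs diffs.

```python
def find_exact_reverts(history):
    """Find revisions that restore an earlier page state exactly.

    `history` is a hypothetical chronological list of (rev_id, user, sha1)
    tuples for one page. Returns (rev_id, user, restored_rev_id) tuples.
    """
    first_seen = {}   # content hash -> first revision with that content
    reverts = []
    prev_sha = None
    for rev_id, user, sha1 in history:
        # Same hash as an earlier revision, but not a null edit on top of it.
        if sha1 in first_seen and sha1 != prev_sha:
            reverts.append((rev_id, user, first_seen[sha1]))
        first_seen.setdefault(sha1, rev_id)
        prev_sha = sha1
    return reverts

def shared_restorations(reverts, user_a, user_b):
    """Count distinct page states that both accounts restored at least once."""
    restored_by = {}
    for _, user, target in reverts:
        restored_by.setdefault(target, set()).add(user)
    return sum(1 for users in restored_by.values() if {user_a, user_b} <= users)
```

Counting how often two accounts restore the same state is one possible proxy for "made the same edits" in a tag-team edit war.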

Requiring registration for editing

Note: This section was split off from "CheckUser for all new users" (permalink) and the "parenthetical comment" referred to below is: (Also, email-required registration and get rid of IP editing.)—03:49, 26 November 2024 (UTC)

@Levivich, about your parenthetical comment on requiring registration:

Part of the eternally unsolvable problem is that new editors are frankly bad at it. I can give examples from my own editing: Create an article citing a personal blog post as the main source? Check. Merge two articles that were actually different subjects? Been there, done that, got the revert. Misunderstand and mangle wikitext? More times than I can count. And that's after I created my account. Like about half of experienced editors, I edited as an IP first, fixing a typo here or reverting some vandalism there.

But if we don't persist through these early problems, we don't get experienced editors. And if we don't get experienced editors, Wikipedia will die.

Requiring registration ("get rid of IP editing") shrinks the number of people who edit. The Portuguese Wikipedia banned IPs only from the mainspace three years ago. Have a look at the trend. After the ban went into effect, they had 10K or 11K registered editors each month. It's since dropped to 8K. The number of contributions has dropped, too. They went from 160K to 210K edits per month down to 140K most months.
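For scale, the relative drops implied by those figures can be worked out directly. This is only arithmetic on the rounded numbers quoted above; the helper name is made up for the example.

```python
def percent_decline(before, after):
    """Relative decline from `before` to `after`, in percent."""
    return (before - after) * 100 / before

# Monthly registered editors: 10K or 11K before, 8K now.
editor_drop = [percent_decline(b, 8_000) for b in (10_000, 11_000)]

# Monthly edits: 160K to 210K before, 140K most months now.
edit_drop = [percent_decline(b, 140_000) for b in (160_000, 210_000)]

print([round(x, 1) for x in editor_drop])
print([round(x, 1) for x in edit_drop])
```

On those rounded figures, that is a drop of roughly a fifth to a quarter in monthly editors, and somewhere between an eighth and a third in monthly edits.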

Some of the experienced editors have said that they like this. No IPs means less impulsive vandalism, and the talk pages are stable if you want to talk to the editor. Fewer newbies means I don't "have to" clean up after so many mistake-makers! Fewer editors, and especially fewer inexperienced editors, is more convenient – in the short term. But I wonder whether they're going to feel the same way a decade from now, when their community keeps shrinking, and they start wondering when they will lose critical mass.

The same thing happens in the real world, by the way. Businesses want to hire someone with experience. They don't want to train the helpless newbie. And then after years of everybody deciding that training entry-level workers is Somebody else's problem, they all look around and say: Where are all the workers that I need? Why didn't someone else train the next generation while I was busy taking the easy path?

In case you're curious, there is a Wikipedia that puts all of the IP and newbie edits under "PC" type restrictions. Nobody can see the edits until they've been approved by an experienced editor. The rate of vandalism visible to ordinary readers is low. Experienced editors love the level of control they have. Have a look at what's happened to the size of their community during the last decade. Is that what you want to see here? If so, we know how to make that happen. The path to that destination even looks broad, easy, and paved with all kinds of good intentions. WhatamIdoing (talk) 04:32, 23 November 2024 (UTC)

Size isn't everything... what happened to their output--the quality of their encyclopedias--after they made those changes? Levivich (talk) 05:24, 23 November 2024 (UTC)
Well, I can tell you objectively that the number of edits declined, but "quality" is in the eye of the beholder. I understand that the latter community has the lowest use of inline citations of any mid-size or larger Wikipedia. What's now yesterday's TFA there wouldn't even be rated B-class here due to whole sections not having any ref tags. In terms of citation density, their FA standard is currently where ours was >15 years ago.
But I think you have missed the point. Even if the quality has gone up according to the measure of your choice, if the number of contributors is steadily trending in the direction of zero, what will the quality be when something close to zero is reached? That community has almost halved in the last decade. How many articles are out of date, or missing, because there simply aren't enough people to write them? A decade from now, with half as many editors again, how much worse will the articles be? We're none of us idiots here. We can see the trend. We know that people die. You have doubtless seen this famous line:

All men are mortal. Socrates is a man. Therefore, Socrates is mortal.

I say:

All Wikipedia editors are mortal. Dead editors do not maintain or improve Wikipedia articles. Therefore, maintaining and improving Wikipedia requires editors who are not dead.

– and, memento mori, we are going to die, my friend. I am going to die. If we want Wikipedia to outlive us, we cannot be so shortsighted as to care only about the quality today, and never the quality the day after we die. WhatamIdoing (talk) 06:13, 23 November 2024 (UTC)
Trends don't last forever. Enwiki's active user count decreased from its peak over a few years, then flattened out for over a decade. The quality increased over that period of time (by any measure). Just because these other projects have shed users doesn't mean they're doomed to have zero users at some point in the future. And I think there are too many variables to know how much any particular change made on a project affects its overall user count, never mind the quality of its output. Levivich (talk) 06:28, 23 November 2024 (UTC)
If the graph to the right accurately reflects the age distribution of Wikipedia users, then a large chunk of the user base will die off within the next decade or two. Not to be dramatic, but I agree that requiring registration to edit, which will discourage readers from editing in the first place, will hasten the project's decline.... Some1 (talk) 14:40, 23 November 2024 (UTC)
😂 Seriously? What do you suppose that chart looked like 20 years ago, and then what happened? Levivich (talk) 14:45, 23 November 2024 (UTC)
There are significantly more barriers to entry than there were 20 years ago, and over that time the age profile has increased (quite significantly iirc). Adding more barriers to entry is not the way to solve the issues caused by barriers to entry. Thryduulf (talk) 15:50, 23 November 2024 (UTC)
"PaperQA2 writes cited, Wikipedia style summaries of scientific topics that are significantly more accurate than existing, human-written Wikipedia articles" - maybe the demographics of the community will change. Sean.hoyland (talk) 16:30, 23 November 2024 (UTC)
That talks about LLM usage in articles, not the users. 2601AC47 (talk|contribs) Isn't a IP anon 16:34, 23 November 2024 (UTC)
Or you could say it's about a user called PaperQA2 that writes Wikipedia articles significantly more accurate than articles written by other users. Sean.hoyland (talk) 16:55, 23 November 2024 (UTC)
No, it is very clearly about a language model. As far as I know, PaperQA2, or WikiCrow (the generative model using PaperQA2 for question answering), has not actually been making any edits on Wikipedia itself. Chaotic Enby (talk · contribs) 16:58, 23 November 2024 (UTC)
That is true. It is not making any edits on Wikipedia itself. There is a barrier. But my point is that in the future that barrier may not be there. There may be users like PaperQA2 writing articles better than other users and the demographics will have changed to include new kinds of users, much younger than us. Sean.hoyland (talk) 17:33, 23 November 2024 (UTC)