Wikipedia:WikiProject Military history/News/November 2023/Op-ed

Source: Wikipedia, the free encyclopedia.




WP:Size considered harmful

By Hawkeye7
The Encyclopædia Britannica. The very model of the modern encyclopaedia?

Should Wikipedia articles be short or comprehensive?

list articles, especially if splitting them would require breaking up a sortable table. This style of organizing articles is somewhat related to news style except that it focuses on topics instead of articles.

This is more helpful to the reader than a very long article that just keeps growing, eventually reaching book length. Summary style keeps the reader from being overwhelmed by too much information up front, by summarizing main points and going into more details on particular points (subtopics) in separate articles. What constitutes "too long" varies by situation, but generally 50 kilobytes of readable prose (8,000 words) is the starting point at which articles may be considered too long. Articles that go above this have a burden of proof that extra text is needed to efficiently cover their topics and that the extra reading time is justified.

The reference to 50 kilobytes of readable prose requires some unpacking. First of all,

emdash that some Wikipedians are so fond of. So we went to Unicode, whereby characters are encoded with variable numbers of bytes. Therefore the number of bytes is never less than the number of characters, and is greater whenever the text strays outside plain ASCII.
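The difference between characters and bytes can be seen in a minimal sketch. It assumes the text is stored as UTF-8, which is one common variable-width Unicode encoding; the helper name `sizes` is made up for illustration and is not part of any Wikipedia tool.

```python
def sizes(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for a string."""
    # len() on a str counts Unicode code points; encoding first counts bytes.
    return len(text), len(text.encode("utf-8"))


# Plain ASCII takes one byte per character; "æ" takes two and a long dash three.
for sample in ["encyclopedia", "encyclopædia", "—"]:
    chars, nbytes = sizes(sample)
    print(f"{sample!r}: {chars} characters, {nbytes} bytes")
```

On this reckoning a size quoted in kilobytes will slightly overstate the character count of an article that uses ligatures, accents or dashes.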

A bigger gotcha lies in the term "readable prose". This means the characters in the main body of the text, excluding material such as footnotes and reference sections ("see also", "external links", bibliography, etc.), diagrams and images, tables and lists, wikilinks and external URLs, and formatting and markup. This is pretty reasonable in my opinion (although universities count footnotes in the thesis size, Goddammit), but for articles with a lot of tables, this doesn't work nearly so well. You can use the

WP:SIZE specifically exempts list articles from its ambit, at least partly because we have no agreed-upon means of measuring their size, which would be a necessary precursor to any size guideline.

Readable prose is much smaller than the markup, which in turn is smaller than the resultant HTML that is actually used to render a page. The actual download size to your computer or smartphone is dependent on the images, which take up far more bandwidth even in 64K thumbnail size. There is indeed a technical limit to markup size, but it is very large: 2 megabytes. That limit has been reached, with the help of templates. The limitations of the template language made many simple templates absurdly large, but the advent of the MediaWiki Scribunto extension means that these can now be written in Lua, with consequent improvements in efficiency. The crucial point is that readable prose size has no bearing on how quickly or slowly a page loads.
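The gap between markup size and readable prose can be illustrated with a rough sketch. This is not how any actual prose-size tool works: the function `readable_prose` and its regular expressions are assumptions made for illustration, and they handle only simple, non-nested wikitext.

```python
import re


def readable_prose(wikitext: str) -> str:
    """Crudely approximate 'readable prose' by stripping common wikitext markup."""
    text = wikitext
    # Drop footnotes: paired <ref>...</ref> first, then self-closing <ref ... />.
    text = re.sub(r"<ref[^>/]*>.*?</ref>", "", text, flags=re.DOTALL)
    text = re.sub(r"<ref[^>]*/>", "", text)
    # Drop tables and simple (non-nested) templates.
    text = re.sub(r"\{\|.*?\|\}", "", text, flags=re.DOTALL)
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # Keep only the visible label of wikilinks: [[target|label]] or [[target]].
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # Drop headings and list items, which are not counted as prose.
    lines = [ln for ln in text.splitlines() if not re.match(r"\s*(==|[*#;:])", ln)]
    text = "\n".join(lines)
    # Remove bold and italic quote markup, then normalise whitespace.
    text = re.sub(r"'{2,}", "", text)
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "== History ==\n"
    "The '''Battle''' was fought in [[France|northern France]]."
    "<ref>Smith 2001, p. 4.</ref>\n"
    "* a list item\n"
    "{{Infobox|name=Battle}}\n"
)
prose = readable_prose(sample)
print(len(sample.encode("utf-8")), "bytes of markup ->",
      len(prose.encode("utf-8")), "bytes of prose")
```

Even in this tiny sample most of the bytes are markup, footnote and list, none of which counts towards the readable prose figure.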

The arbitrary nature of the numbers in WP:SIZE and its restrictions

WP:featured articles.

The root of the problem lies in our conception of what an encyclopaedia should look like. Most encyclopaedias consist of short articles; the Micropædia of the Encyclopædia Britannica has 65,000 articles with fewer than 750 words each. Often overlooked is the accompanying Macropædia with its 699 in-depth articles that range up to 310 pages in length. This arose from the nature of the publication: although the Britannica was both large and expensive, there was a restriction on the size of articles owing to the cost of publication.

body
the latter. Although they can be separated for some purposes, they are together in the same article. They are easy to find with an online search, which often will just present the lead, and it is easier to check that the lead truly reflects what is in the article.

We know a lot about how readers approach the articles that was not known when the guideline was written back in 2004:

  • Most just read the lead and nothing else
  • Many do not read the article sequentially, but jump around looking for very specific information
  • Only a small percentage read the whole article from top to bottom

Therefore, to service the readers' needs, articles need to be comprehensive and detailed, with a well-written lead.

If an article exceeds WP:SIZE limits, we have three techniques for reducing article size:

  1. Material can be deleted outright;
  2. Text can be trimmed to use fewer words to say the same thing;
  3. Sections can be split off into subarticles.

The first technique cannot be used simply to reduce the size of an article. Material should be

undue weight. This requires consensus that the section is indeed undue, and the imperative is to preserve, since Wikipedia is a compendium of information.

The second technique looks more promising. Some good essays have been written on how to do this, including our Wikipedia:WikiProject Military history/Academy/Copy-editing essentials, User:Tony1/Redundancy exercises: removing fluff from your writing and the Wikipedia:Principle of Some Astonishment. Following this advice will improve your article writing style, but it is not a panacea. When it comes to reducing the overall size of an article, one should not expect too much from trimming; experience has shown that perhaps a five percent reduction can be expected at best.

Which brings us to the third technique: splitting off subarticles. This is called summary style. The idea is that sections of long articles should be spun off into their own articles, leaving summaries in their place. A fuller treatment of a major subtopic can have a separate article of its own. This holds out the possibility of substantial savings in the size of the parent article. However, this technique has limitations that need to be carefully considered before embarking on such a course of action.

The first is that a child article must be a complete encyclopaedic article in its own right. That means that it must meet our

point of view fork. (Conversely, splitting off a large section may leave the parent article with undue emphasis issues.) Many editors at Articles for Deletion are unaware of or do not understand the summary style or size guidelines, and in any case, in a conflict between them and notability, will prefer the latter. They may question the need for the child article, potentially leading to a resolution of WP:merging it back into the parent article.

Another limitation is that when a child article is created from a section, that section must be replaced with a summary in the parent article, which must be similar to the lead in the child article. Simply replacing the section with a hatnote is unacceptable. Apart from the illogic of violating one guideline in the pursuit of another, the readers really do not like this. Put simply, for reasons not fully understood, they do not like following links, and will complain on the talk page when forced to do so. This problem is compounded by the fact that child articles often do not appear in searches with common search engines, which may direct the reader to the main article even if a child article is available.

What this means for the editor trying to reduce the size of an article is that spawning a child article will not reduce the article in size by that of the section being split off, because a summary must remain behind. To achieve a reduction, we need to locate a section with more than just a few paragraphs. Not all articles have sections that can easily be split off, so in some cases the parent article may need considerable restructuring in order to create one. The creation of child articles also comes with a maintenance overhead. If a child article changes, the summary in the parent article will need to be changed as well.

WP:SIZE is a guideline, and

weasel words even for a guideline. It cautions that:

There is no need for haste in splitting an article when it starts getting large. Sometimes an article simply needs to be big to give the subject adequate coverage.

In summary, WP:SIZE posits arbitrary size limits on articles, and meeting them may involve considerable work for the article writers and generate conflict with other guidelines while detracting from the quality of the work delivered to the readers.


About The Bugle
First published in 2006, the Bugle is the monthly newsletter of the English Wikipedia's Military history WikiProject.

Discuss this story

Jumping around looking for information

Quoting (from rev. 1184270070):

We know a lot about how readers approach the articles that was not known when the guideline was written back in 2004:

  • Most just read the lead and nothing else
  • Many do not read the article sequentially, but jump around looking for very specific information
  • Only a small percentage read the whole article from top to bottom
Therefore, to service the readers' needs, articles need to be comprehensive and detailed, with a well-written lead.

I agree, and with respect to bullet two, I've always found it very curious that although Wikipedia is on a hypertext platform and takes full advantage of it via

interlanguage links to link to sections of articles on other Wikipedias, it seems we don't really encourage linking to other sections of the same article we are reading, other than in the Table of Contents, although in my view we definitely should, via on-page section links.

MOS:SECTIONLINKS, but it's clear they are talking about sections on other pages.

The one place I've found that does talk about it is a sentence at Help:Link#Section linking (anchors) and the similar wording at Help:Section#Section linking. But somehow, this doesn't seem to be part of the culture; I rarely see it used, and it should be encouraged. Using Template:Section link could help the reader to know that it's an on-page link, due to the section symbol prefix, as off-page wikilinks are generally not implemented that way. I don't know whether you'd want to mention that in this essay or not, but I just find it curious, and unfortunate, that we don't encourage the use of on-page section links a lot more than we do, because I think you're right about bullet 2, and we don't do enough to help the user do that. Mathglot (talk) 00:55, 25 November 2023 (UTC)