Talk:Usage share of web browsers/Archive 5

Source: Wikipedia, the free encyclopedia.

Clickz

There are some stats from ClickZ at http://web.archive.org/web/20090711201800/http://www.clickz.com/stats/stats_toolbox . Smallman12q (talk) 00:28, 1 March 2012 (UTC)

Wikimedia percentages

In this edit I have made what I hope are some improvements to the new, simpler summary tables. It was good to see Wikimedia data back represented, but it appeared to be utterly at odds with the other figures, whereas in the past, Wikimedia usually provided the majority of the median figures - i.e. it was often right in the middle of the spread. I looked into it, and the reason was the separation of mobile and non-mobile data. When the Wikimedia stats page said 29.2% for MSIE, it meant 29.2% out of the total of 87.5% of non-mobile visits. No wonder it wasn't comparable! The simple arithmetic required is perfectly allowed by WP:CALC. I copy-and-pasted the Wikimedia table into a spreadsheet and added a column based on =B2/B$26*100 to produce true percentages of the non-mobile visitor figure (which happened to be in cell B26). This was so easy that I did the same for the mobile table below it, and added these figures too. I found 'Other' figures in both cases by adding up the figures used (after rounding to 1 D.P.) and subtracting the totals in each case from 100. This is all simple, accurate and useful, and hopefully will not present any problem to maintain. As for the Wikimedia section table in the main body of the article, I have already complained about the complexity of this here, and now do not really know what to do with it. --Nigelj (talk) 20:19, 17 February 2012 (UTC)
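
For anyone who wants to check the spreadsheet step without a spreadsheet, it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 29.2 and 87.5 figures are the ones quoted in the comment above, while the other two rows and the function names are hypothetical, made up so that the example runs.

```python
def rescale_to_subtotal(raw_shares, subtotal):
    """Express each raw share as a percentage of `subtotal`, rounded to 1 d.p."""
    return {name: round(value / subtotal * 100, 1)
            for name, value in raw_shares.items()}


def other_share(rescaled):
    """'Other' is 100 minus the sum of the already-rounded figures."""
    return round(100 - sum(rescaled.values()), 1)


# MSIE (29.2) and the non-mobile subtotal (87.5) are quoted in the comment above.
# The other two rows are hypothetical placeholders, not real Wikimedia figures.
raw = {
    "MSIE": 29.2,
    "Browser B (hypothetical)": 21.0,
    "Browser C (hypothetical)": 17.5,
}

rescaled = rescale_to_subtotal(raw, subtotal=87.5)
print(rescaled["MSIE"])       # 33.4
print(other_share(rescaled))  # 22.6
```

The first function is the equivalent of the =B2/B$26*100 column; the second mirrors the "add up the rounded figures and subtract from 100" step.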

Psdie (talk) 15:52, 9 March 2012 (UTC): I think the whole decision to use the Wikimedia stats for the "headline" usage chart is suspect - they serve to heavily under-represent Internet Explorer usage. I smell an anti-IE agenda (popular amongst tech-savvy users, but does no favours when trying to objectively monitor real-world IE market share). Reasons for under-representation:
Who has an anti-IE agenda? That seems like pure nonsense. Ad-based stats like Net Applications do no favours when trying to objectively monitor real-world ad-blocking browser usage share. This is a real reason for under-representation in the non-Wikimedia stats you seem to favor.
  1. By counting based on page views instead of unique users, the Wikimedia stats over-represent page-refresh-intensive users of the Wikimedia sites, i.e., Wikipedia editors. Thus the browsers used by Wikipedia editors will be over-estimated in the Wikimedia stats. I suggest that editors are likely to be more technically savvy than "typical" visitors, so are more likely to have an alternative browser installed - i.e., non-IE (standard browser with the most popular desktop OS, MS Windows).
There is no evidence that IE users are more or less refresh-intensive than any other users. Your suggestions are pure guesswork.
  2. The Wikimedia stats combine desktop and mobile stats. IE has no mobile presence, so its share will be significantly diluted in stats that merge mobile usage (currently ~13%). It's not necessarily unreasonable to present combined mobile/desktop usage as the headline figure, particularly given the rising importance of mobile, but this should be made clearer in the labelling.
Net Applications also combines desktop and mobile stats, so I don't really see your point. This article is about browsers, not operating systems. As mobile browsers are also browsers, they belong in the stats.
Personally I believe an aggregate stat (median wasn't too bad, traffic weighted mean would surely be better) as the headline chart would present a more realistic picture. If that's prevented by WP:CALC, then perhaps omitting a headline figure altogether is the fairest approach - otherwise Wikimedia's stats are being presented as more authoritative and accurate than other sources, which I'd dispute based on #1 above.
I agree. We should weigh in adblock downloads in the stats to get a fairer representation. As Wikimedia's stats are based on more traffic than the other stats, they should be weighted higher than the others. Unfortunately we do not have stats from equally or more trafficked sites like Facebook and Google.
--Psdie (talk) 15:52, 9 March 2012 (UTC)
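
To make the two aggregates mentioned above concrete, here is a minimal Python sketch comparing a plain median with a traffic-weighted mean. All figures are hypothetical (three imaginary sources reporting an IE share, with imaginary traffic volumes), and `traffic_weighted_mean` is an illustrative name, not something used by the article or by any stats vendor.

```python
from statistics import median


def traffic_weighted_mean(shares, traffic):
    """Average the shares, weighting each source by its traffic volume."""
    total = sum(traffic)
    if total <= 0:
        raise ValueError("total traffic must be positive")
    return sum(s * t for s, t in zip(shares, traffic)) / total


# Hypothetical data: IE share (%) reported by three sources, and the
# relative traffic each source measured (arbitrary units).
ie_share = [30.0, 40.0, 50.0]
traffic = [6.0, 3.0, 1.0]

print(median(ie_share))                          # 40.0
print(traffic_weighted_mean(ie_share, traffic))  # 35.0
```

The median ignores how much traffic each source saw, while the weighted mean is pulled towards the busiest source. That is the difference being argued over in this thread.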

Protected

The article has been fully protected for two weeks due to the edit war. A WP:Request for comment is one way to get consensus on what belongs in the article. Since this is now the third time the article has gone under full protection, it may be reasonable to use blocks to deal with any warring that continues after expiry. Protection can be lifted if consensus is reached on talk. EdJohnston (talk) 16:47, 17 March 2012 (UTC)

Is it a rule

Is it a rule to update the world map at the start of each month? Why don't we just update automatically when the leadership in a country changes? Thank you all--88.240.39.174 (talk) 16:16, 6 April 2012 (UTC)


Can we have updated stats again please?

As long as text on the interpretation of the numbers is emphasized, and the difficulty in measuring the stats is treated at a place that draws attention, I see no problem with the issues anyone here talks about. So can we please have a Wikipedia article that summarizes global stats again?

Especially now, when IE8 and IE7 use is dwindling, people will want to know how many people use HTML5-compatible browsers...

Can't we present all perspectives, and emphasize the fact that there are perspectives?

We could for instance cluster the stats based on unique visitors in one category and hits in another...

Pretty please? Cause this is an awesome article...

80.112.133.70 (talk) 08:34, 25 April 2012 (UTC)

Wikimedia (April 2009 to present) - chart

Isn't Android the operating system and not the browser? 193.170.74.203 (talk) 09:17, 25 April 2012 (UTC)

I think that the browser on Android devices is special and unique to Android, so is normally referred to simply as 'the Android browser'.

Wikimedia server logs

Generally not acceptable

I just want to remind everybody that graphics of the Wikimedia server logs, like the one here are not acceptable, for a variety of reasons:

  • They are original research
  • Synthesis
  • reliable sources
  • They are self-referential
  • They limit the re-usability of the charts (in Wikipedia mirrors, for example) because they refer to the Wikipedia servers, and therefore are not "universal" or appropriate when discussing general browser market share

If anywhere, they could be used in the

reliable sources. --SF007 (talk) 23:02, 13 March 2012 (UTC)

The only concern that can be considered at least marginally valid is that of ) 23:16, 13 March 2012 (UTC)
I dare to say it is much more than "marginally", since this is not discussed in any reliable source whatsoever. And while this might technically not violate WP:OR, from my own POV, it certainly violates the "spirit" or "principle" of those policies. It is arguably a self-reference: while it does not mention "Wikipedia", it mentions the "parent", Wikimedia. Why should we present the stats from Wikimedia? Are they representative in any way of market share? Why not just choose the stats from any other random website? Simply because Wikimedia websites are popular? Because Wikimedia runs Wikipedia? The answer to those questions should have already come from reliable sources... sadly, it is hard to justify the inclusion of such information. --SF007 (talk) 00:08, 14 March 2012 (UTC)
Even if the stats were based on accessing this image it wouldn't be self-referencing for a pretty evident reason: it doesn't reference content at all. It is not WP:OR at all, neither in spirit nor in fact: the data is referenced. And we are all probably well aware that squid data is itself a pretty reliable source. At least more reliable than known unreliable sources like all those you left intact in the article. That's it: Wikipedia is the 3rd most visited site itself, so Wikimedia projects altogether are at least that much used (not to mention the fact that Wikimedia Commons' content is used throughout the web). If we are talking about the spirit of core content policies, then Wikimedia stats were the only reliable data in the article, as Wikimedia projects are known to have the widest possible audience in contrast to the rest of the sources, and thus the trustworthiness of these stats is beyond question. The data in question is collected in the most neutral way possible and is verified in the most objective way – automatically; its sources are easily traceable and can be re-examined; the chance that these statistics get purposely misinterpreted in favour of one's commercial interest is negligible... It is the ideal source for the purpose of all the policies you name. — Dmitrij D. Czarkoff (talk) 00:33, 14 March 2012 (UTC)
I don't think you really address the issue raised by WP:OR. No reliable source has taken a critical view on the data and opened up for quoting. Thus this is in violation of the goal of Wikipedia. Put another way, if you consider these data reliable and relevant, what source can you quote that these are reliable numbers? What source can you quote that these are relevant? What source can you quote that these numbers are representative for some population? --Useerup (talk) 15:40, 14 March 2012 (UTC)
The reliable source that produces these numbers is the reference given. These are the stats for over 150 billion web requests in a single month, across over a dozen of the busiest websites on the internet. The figures are worldwide and have been made by web users with every conceivable interest. Have you got any source that says this is not a reliable source? WP:OR - reproducing results published by a reliable source is not OR. WP:SYN - we do not combine these figures with any others, no synthesis of multiple sources takes place. WP:UNDUE - this is a very large sample, and so is significant. WP:SELF - we do not assume that the reader is reading Wikipedia and we don't refer to this or any article on Wikipedia in any special way. Therefore these figures and their refs make perfect sense on any mirror server. Wikimedia is an important part of the web. I see that WP:BRD and it should now stay in the article until this discussion has reached a consensus. --Nigelj (talk) 00:21, 15 March 2012 (UTC)
WP:BRD is not a policy and cannot be invoked as a reason for undoing an edit you disagree with. As for the points:
And if someone thinks that this has not been fulfilled that has to be argued for and/or proven to. Just removing material without proper warning and/or discussion is not allowed.
  • The Wikimedia server logs are WP:PRIMARY. That does not rule out using them, but they should be used with care. They have not been used with care here.
This is a valid point, but applies to all other data used in this article. For example Net Applications use some undisclosed weighting of their data.
  • You state that "These are the stats for over 150 billion web requests in a single month". This number is meaningless unless put into context. You need an RS which says something about how representative this source is, or for which demographic it is representative. You can have 150 trillion web requests; if they are all sampling the same demographic it is not more useful than this number. Sheer volume is meaningless unless put into perspective. By a reliable source, please.
That would be true in the article, but this is a talk page. There are plenty of sources that clarify the things you are asking about, and they would probably be useful in the article. But on the talk page, lack of references cannot be used as an argument.
  • You state that "The figures are worldwide and have been made by web users with every conceivable interest.". Got any RS for that? If so then please put it in the article. If not, your point is moot. Editors don't get to make such assertions.
As above, this is a talk page and not an article. Arguments on the talk page are not "moot" without sources on the talk page.
  • You ask "have you got any source that says this is not a reliable source?". You are seriously misguided as to what Wikipedia is. Neither I nor anyone else needs to provide any source for removing unsourced or improperly sourced material (this being a case of the latter). It is you who need to provide a WP:BURDEN.
Again this is not an article, but the discussion about the quality of an article. Asking for evidence that something is unsourced or improperly sourced goes here.
  • Regarding WP:SYN as far as I can see. That is not the main problem.
  • You state that "WP:UNDUE - this is a very large sample, and so is significant.". No. It is WP:UNDUE again.
Why? It is still the most significant set of statistics referenced in the article. If it is undue, so is everything else. There is not a single reliable source that verifies any of the statistics in the article. Wikipedia stats are the least undue because here we have raw data, which is more than we have from Net Applications. I bet you can't find a single reliable source that validates Net Applications' data.

--Useerup (talk) 08:21, 15 March 2012 (UTC)

In the case of each source, the reliable source itself is the source of the stats. None of the figures is discussed; for none of them is the population, or the relevance to any population, discussed; and all of them are reliable sources on their own.

WP:V also doesn't require that the sources we use be discussed in other sources. Please just don't start another lame war with no proper grounds – this article is already damaged severely enough. – Dmitrij D. Czarkoff (talk) 09:01, 15 March 2012 (UTC)

In support of the reservations about highlighting Wikimedia stats over others (given bias created by its counting by page views, which are skewed by high admin activity), see my comment under Wikimedia_percentages above. If Wikimedia stats were based on uniques, I'd be more open to highlighting them as typical (which they aren't at present). --Psdie (talk) 15:21, 15 March 2012 (UTC)
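
The page-views versus uniques distinction raised here can be illustrated with a small Python sketch. Everything in it is hypothetical: the request log, the visitor names, and the function names are invented for the example, and it makes the simplifying assumption that each visitor uses a single browser.

```python
from collections import Counter


def percentages(counts):
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    return {browser: 100 * n / total for browser, n in counts.items()}


def share_by_page_views(log):
    """Every request counts once, so heavy users weigh more."""
    return percentages(Counter(browser for _visitor, browser in log))


def share_by_unique_visitors(log):
    """Every visitor counts once, however many pages they load.

    Simplification: one browser per visitor (the last one seen wins).
    """
    browser_of = dict(log)
    return percentages(Counter(browser_of.values()))


# Hypothetical log of (visitor, browser) pairs: one editor reloading a
# page eight times, and two readers who each load a single page.
log = [("editor", "Firefox")] * 8 + [("reader1", "IE"), ("reader2", "IE")]

print(share_by_page_views(log))       # Firefox 80.0, IE 20.0
print(share_by_unique_visitors(log))  # Firefox about 33.3, IE about 66.7
```

With this made-up log the same traffic gives IE 20% by page views but about 67% by unique visitors, which is the kind of skew the comment above is worried about. Whether real Wikimedia traffic behaves this way is not something the sketch can show.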

@Useerup, I am very familiar with WP:BURDEN, thank you. It says, "You may remove any material lacking an inline citation", which does not apply here. I won't repeat what Czarkoff just said; it seems obvious to me. Perhaps you should look at WP:V, which is core policy. --Nigelj (talk) 21:41, 15 March 2012 (UTC)
@Psdie, your original point was about the use of a piechart of Wikimedia stats for the "headline" usage chart, was it not? That is something I'd gladly throw into the negotiation pot if everyone was willing to discuss and negotiate rather than delete and edit war. It's interesting that you see these stats as part of a pro/anti Microsoft stance. Did you know that there have been allegations in the past of people being paid specifically by Microsoft to edit Wikipedia?[1] We never find out who may have been paid to come here and add/remove content, but it's always something to be mindful of, within the context of WP:AGF. --Nigelj (talk) 21:41, 15 March 2012 (UTC)
The thrust of this objection (please correct me if I'm wrong) is that the Wikimedia stats are not discussed in other references, and so we are only dependent on a primary source for all of them. Is that correct? In that case, we are also going to have to delete the Statcounter figures, as they are only referenced to statcounter.com, and we don't have any references to other WP:RSs discussing them, their sample size, their methodology, or their reliability. Oh, the same is true for Clicky - totally sourced to getclicky.com. Same for W3Counter. Net Applications seems to call itself Net Market Share these days, and the same is true there. StatOwl.com is the same. It looks like there won't be much left. Which one of you would like to do the deletions? There'll have to be a new explanation written to take their place, as there won't be much left of the article. If these deletions don't go ahead, I'll assume that there was a mistake somewhere in the logic and replace the long-standing Wikimedia stats for our readers' benefit soon. --Nigelj (talk) 23:09, 16 March 2012 (UTC)
Wikipedia probably is not representative of the population due to all us open-source fans. I would vote "no". — Preceding unsigned comment added by 2.80.217.197 (talk • contribs) 04:53, 17 March 2012
Guys, stop edit warring. I've requested that this page be protected for that. Jasper Deng (talk) 04:57, 17 March 2012 (UTC)
So you think that Wikimedia stats are less reliable due to the higher load by users of open source OSs/browsers? Why do you think it is the case at all? Why do you think that StatOwl counting visitors of several Windows-related forums doesn't suffer from similar issues? Do you know what issues the other figures suffer from? — Dmitrij D. Czarkoff (talk) 07:33, 17 March 2012 (UTC)
Use a source which has been reported by reputable mainstream media then. That's a reliable source. What is your problem with that? The Wikimedia server logs may be accurate, but they are raw data and certainly a primary source. As a primary source it is unacceptable that it is given what reliable sources think about the primary source. --Useerup (talk) 10:30, 17 March 2012 (UTC)
@Useerup, you seem to have missed my point above: we have nothing in the article about what any secondary sources think about any of the primary source statistics. They should all go, by your logic. --Nigelj (talk) 20:11, 17 March 2012 (UTC)
Don't try to put words in my mouth, please. Netmarketshare seems to be quoted a lot in the media. Just follow WP:DUE and use that as the lede. Do not give a primary source with multiple potential issues a more prominent position than the sources which are usually quoted by reputable secondary sources. Simple. --Useerup (talk) 21:43, 17 March 2012 (UTC)
Just to be clear, are you arguing against the appearance of a Wikimedia pie chart in the lede, or are you arguing in favour of deleting all Wikimedia tables and removing all Wikimedia statistics from the article? It's important to be clear. --Nigelj (talk) 22:35, 18 March 2012 (UTC)
I am against using Wikimedia as the representative graphic in the lede. I believe that with proper caution (they are based on raw data with possibly skewed demographics) the stats from Wikimedia do have a place. I just don't think they should be given more weight than, say, Netmarketshare. --Useerup (talk) 00:23, 19 March 2012 (UTC)
Oh. It's just that in this edit you removed the Wikimedia statistics from the lede graphics, the summary table, and also removed all the historic stats and even the whole section about them from the body of the article. Perhaps you could make your present position on their legitimate use in the article clearer in the RfC below? --Nigelj (talk) 14:57, 31 March 2012 (UTC)

RFC

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Do Wikimedia's server logs constitute due or undue weight? Jasper Deng (talk) 01:34, 18 March 2012 (UTC)

Consider for example the fact that a given graphic representation of particular data includes data concerning that very graphic representation; is that self-reference? Certainly. The fact that a given argument about argument in general by definition deals with itself and is self-referential, is beyond question; it has been a cliche for a long time. But that does not mean that either of these examples is in itself unacceptable or even undesirable. They may be in any given case, but it is necessary to consult good sense, good conscience, good consequences, and a lot of other goods before we invoke hysterical subjunctives and Cretan liars for every text we disapprove of or disagree with. An alarmingly large number of such arguments in WP are settled by exhaustion or appeal to authority. This is unhealthy. (Now, there is a bit of OR, and make the most of it!) Similar principles apply to all the other holy Wikipillars.
Now, then. Truth and reason above all. I hold no brief for either side in the article under discussion, but I vote for the fair, good-faith, good-sense and constructive use of any representation, even though I have some very snotty views on snappy pie charts. (Edward Tufte had some really good points!) If anyone has a better presentation, bless him and go for it, say I. But if the best he can come up with is lawyering about data that might refer to WP among other subjects, or that unearthing publicly available data or data that can be displayed publicly in an illustration, but does not already appear in other textbooks counts as OR, then go away and explain yourself elsewhere. I have seen nothing in the arguments so far that moves me to forbid the material. JonRichfield (talk) 07:17, 18 March 2012 (UTC)
  • Remove - (via RfC) - This seems very much like a wp:undue as it places Wikimedia with equal billing with statistical sources which may (or at least should) represent much larger portions of the web spread across more than a single website (or a single set of websites). IMHO the article should only include statistics which represent usage of substantial proportions of the web. It's difficult to tell which sources fulfil this definition as the article gives few clues of what certain data sources represent. No information is given on what kinds of information are represented by Clicky, StatOwl.com, OneStat.com, ADTECH, WebSideStory, the GVU WWW user survey or any of those listed after that one.
On a completely tangential line that I felt I should also say:
  • The article seems to be littered with external links.
  • Information on old data sources, like TheCounter.com, is written in the present tense.
  • It's taken for granted that we understand the difference between mobile and desktop browsers. (Are mobile browsers just phones or does it include laptops?) I'm guessing it should be "smart phones/tablets" v. "personal computers". — Blue-Haired Lawyer t 01:21, 31 March 2012 (UTC)
Which statistical sources do represent larger portions of the web spread across more than a single website? Belorn (talk) 07:56, 31 March 2012 (UTC)
  • Remove Including the Wikimedia statistics is the worst kind of data cherry-picking. This article should only use data from highly-regarded Web analytics vendors with a wide reach (i.e., inclusion of many sites versus a single site/family of sites) and publicly-available methodology. This isn't a knock at Wikimedia, or of their data collection methodology, or any such thing; it's just that looking at any single site or family of sites is going to be misleading, at best.
Here's a non-Wikimedia example of what I mean:[ds 1]
Desktop browser share, February 2012 (%)
Browser    World-wide    Ars Technica sites
IE         52.84         12.31
Firefox    20.92         28.81
Chrome     18.90         34.05
Safari      5.24         19.17
Opera       1.71          1.93
Other       0.39          3.73
Now, those numbers might be interesting in the context of how AT readers compare to the rest of the Web, but they're meaningless if you're trying to actually learn something about, oh, the overall usage share of web browsers. Another example of this is the statistics from W3Schools: their numbers only apply to their sites, and so are not representative of the Web as a whole. And consequently, their numbers aren't used as representative data; instead, they're in the external links section. The Wikimedia numbers suffer from the exact same problem.
If we look at where the news media get their data, the field narrows down pretty quickly to two candidates: Net Applications and StatCounter.[ds 2][ds 3][ds 4][ds 5] Wikipedia should simply follow the lead of the reliable sources; no more, and no less. DoriTalkContribs 03:13, 2 April 2012 (UTC)
  1. Condé Nast Publications. Retrieved 1 April 2012.
  2. Dingman, Shane (20 December 2011). "Internet Explorer 8 no longer world's most popular web browser: report". The Globe and Mail. Retrieved 1 April 2012.
  3. Leonhard, Woody (1 November 2011). "Worldwide browser share numbers show Chrome way up". InfoWorld. Retrieved 1 April 2012.
  4. PC Magazine. Retrieved 1 April 2012.
  5. Capriotti, Roger (18 March 2012). "Understanding Browser Usage Share Data". The Windows Team Blog. Retrieved 1 April 2012.
http://www.netmarketshare.com/?source=NASite looks good, and it has a Usage Policy that looks compatible with WP's license. http://statcounter.com/ has a default copyright notice, saying all rights reserved. To use the data here on WP, we need the data to be under a compatible license. So, as a closing question: in your opinion, can/should we use the one source (netmarketshare.com) and remove all other statistics, and if so, would using a single site be compatible with WP:weight? Belorn (talk) 09:03, 2 April 2012 (UTC)
I knew this, but I guess it's worth pointing out: NetMarketShare is WP:RS. WP is fine so long as it's properly attributed. DoriTalkContribs ☽ 19:54, 2 April 2012 (UTC)
Copyright on data points is a tricky matter, and I would be cautious with it. It should be safe to write a summary of StatCounter in our own words, but any direct copy of their data into a table (IE X%, Firefox Y%, Chrome Z%, ...) should, I think, be avoided. In a book/news article, small snippets of text can be cited, but statistics are not usable with just snippets of data. Netmarketshare is thus far better, as we can freely use their data so long as it is attributed. Belorn (talk) 22:16, 2 April 2012 (UTC)
A couple of points:
  1. Unlike W3Schools and all the sites monitored by other sources, Wikimedia monitors a site receiving hits from nearly all human internet users.
  2. Like the rest of the sources, Wikimedia tracks more than one site: the media from Commons is used in multiple locations. Though Wikipedia generates the overwhelming amount of hits, some hits from people who don't use Wikipedia (if there are any) also get recorded in Wikimedia stats. — Dmitrij D. Czarkoff (talk) 21:13, 2 April 2012 (UTC)
What I'm hearing you say isn't what I think you mean to say…
  1. W3Schools monitors the sites they run; Ars Technica monitors the sites they run, and Wikimedia monitors the sites they run. How are these different? In all of these cases, you're getting a self-chosen slice of Web visitors. Browser usage stats are only meaningful when you're looking at data from a wide variety of different sites around the world.
  2. I don't understand what you mean here—are you saying that Wikimedia monitors non-Wikimedia sites?{{cn}} But honestly: Wikimedia monitors the Wikimedia family of sites and only the Wikimedia family of sites. And that is why their data aren't meaningful. DoriTalkContribs ☽ 00:13, 3 April 2012 (UTC)
I think you misinterpret the whole issue:
  1. The diversity of monitored sites is one possible approach to neutralizing stats, though it has its flaws. Using one site (but nearly the most used) is another approach, which also has its drawbacks. The assumption that multiple sources are better is simply false, as e.g. StatOwl is known for a significant share of sites dominated by corporate users who use the browsers imposed on them by corporate policy, thus creating a strong bias. Similar concerns apply to other similar sources.
  2. Wikimedia monitors Wikimedia sites including Commons. Commons' images are linked from many parts of the web (example), so Wikimedia ends up monitoring quite a few sites. — Dmitrij D. Czarkoff (talk) 05:14, 3 April 2012 (UTC)
  • Keep. These are statistics published by Wikimedia about the browser usage of its users. It's no different than using similar statistics if they were published by Google. Charwinger21 (talk) 07:37, 2 April 2012 (UTC)
Google doesn't release their data, but if they did, it would still be useless in this regard. Google's stats—just like Wikimedia's—may be large in number, but they are not representative of the entire Web. DoriTalkContribs 00:13, 3 April 2012 (UTC)
But they would represent a greater and more diverse portion of the web than all of the sources in the article. With Wikimedia omitted, Google stats' population would be even greater than all of these combined, which makes it effectively less prone to specific biases. — Dmitrij D. Czarkoff (talk) 05:19, 3 April 2012 (UTC)
Honestly… just because a vendor has a larger sample size of self-selected people doesn't mean that group is any less self-selecting. Is it possible that Google's IE numbers might be under-representative because MS might be sending people to Siri? And do you really think that Google would do a better job of reporting Chinese browser usage stats than Baidu? Single source is single source is meaningless outside of that particular context. DoriTalkContribs ☽ 23:51, 3 April 2012 (UTC)
Multi-site sources have the same type of self-selection as a single source, just with a different group of people. Customers of Netmarketshare are grouped in the same way users of Google are. Maybe bloggers prefer one type of website statistics tool, web shops a second type, and governments a third. The statistics will always have some form of bias, so the goal should be to primarily use those that have a reputation for openness and correctness. Belorn (talk) 07:17, 4 April 2012 (UTC)
Exactly, all sources have biases; thus using sources with known and easy to describe/understand biases is clearly beneficial over using sources that don't give information on their flaws. — Dmitrij D. Czarkoff (talk) 08:31, 4 April 2012 (UTC)
No, again—that's called secondary sources, not cherry-picked data. DoriTalkContribs ☽ 05:36, 5 April 2012 (UTC)
It's neither
Ghost 05:46, 5 April 2012 (UTC)
In this context sources are the sources of statistics. Please point me to the policy or guideline that says that we can only rely on reliable sources that other reliable sources rely upon, or just stop this. — Dmitrij D. Czarkoff (talk) 07:17, 5 April 2012 (UTC)
You must also consider ) 15:56, 5 April 2012 (UTC)
As I wrote above, all the numbers have the equal WP:OR. As there is no issue with those, we end up with the logical conclusion – unless we have a tool for selecting the appropriate sources, we should report all the sources as having equal weight, unless we have documented proof of the reasons we should exclude a particular source (e.g. as in the case of AT Internet). — Dmitrij D. Czarkoff (talk) 18:36, 5 April 2012 (UTC)
  • Keep - Per my comment here. Being "representative of the entire web" is not the purpose of these statistics, nor is that what the data tries to suggest. It is not original research to include the data, it is verified by a reliable source (although perhaps not independent). - Ghost 05:51, 5 April 2012 (UTC)
  • Weak remove - I am not comfortable with WM stats being cited where they are not reported by any RS. I am not dead set against keeping them here, but they cannot be allowed to take a more prominent position than the sources which have actually been cited by RS. Hence, they should not be quoted in the lede and should not form the basis of a graph where NA or SC could be used. --Useerup (talk) 15:56, 5 April 2012 (UTC)
  • Keep - I believe that the results should be included in the article, BUT, Wikipedia should change its policy on disclaimers so that you can include a disclaimer stating that the statistics are only the results of Wikipedia's site usage, and may or may not be true for everyone using the web. Without that disclaimer, I vote remove. Thepoodlechef (talk) 17:30, 9 April 2012 (UTC)
  • Keep The removal argument seems to be based on an over-zealous reading of certain policies. There is no research being published for the very first time here-- it's produced elsewhere and made available freely. WP:UNDUE, the only source of this type that would not be unduly weighting some segment of the internet would be some record of the entire internet browser usage, which obviously does not exist. If there is a serious concern about undue weight, include more charts, because no single one, generated anywhere, will satisfy. IMO, including Wikimedia browsing data like this displays a certain honesty on the part of Wikipedia, since it is an acknowledgement that the project does not exist in some sort of pure information cyberspace, but rather on the actual web, hosted on actual computers, and being browsed by actual people with actual software. siafu (talk) 04:38, 28 April 2012 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Comments

For those who like netapps and statcounter a few points:

  1. What do you think about stats where you have to pay to be counted?
  2. What do you think about being able to see raw stats versus manipulated numbers?
  3. Where do you think you will find iPad usage on statcounter? Hint: you have to pay extra to see it.

Daniel.Cardenas (talk) 15:07, 2 April 2012 (UTC)

The article makes it crystal clear that the usage statistics are ESTIMATES and change regularly. I'm watching the Wikipedia editors who like to start flame wars and try their best to remove (or carefully reword) any edits which go against their favourite software company. How do you think I found this debate? I'm not stupid; I know why certain articles that mention a giant software company have a few Wikipedia editors fighting tooth and nail to protect the public from reading the true facts. TurboForce (talk) 12:49, 9 April 2012 (UTC)

google analytics ?

google analytics stats anyone ? --Johnny Bin (talk) 06:49, 26 April 2012 (UTC)

How? Daniel.Cardenas (talk) 17:53, 29 April 2012 (UTC)

Figures for Wikimedia pie chart?

Much as I love it, where do the figures for this chart come from? In the diagram we see, for IE, Chrome, Firefox, Safari, Opera, Android and Other respectively, 25.93%, 24.99%, 21.79%, 14.09%, 5.04%, 3.18% and 4.98%. From the source[2] for 'All requests' we see 25.36%, 24.99%, 21.77%, 5.82%, 3.71%, 2.99% and therefore 15.36%. For 'Html pages' we see 26.58%, 20.90%, 20.92%, 4.81%, 2.30%, 2.77% and therefore 21.72. There is no source on the image page, none in the caption, and no hint of what calculations, if any, are being put into this diagram. I would be much happier with an SVG image that anyone could update and edit, displaying the actual figures we can all clearly see in the source. --Nigelj (talk) 17:40, 9 May 2012 (UTC)
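The "and therefore" figures in the comment above are residuals: 100 minus the sum of the six named browsers. A minimal sketch of that check, using only the percentages quoted in the comment; the function name `other_share` is made up for illustration, and `Decimal` is used so the two-decimal figures add up exactly.

```python
from decimal import Decimal

def other_share(named_shares):
    """Return the residual 'Other' share: 100 minus the sum of the named shares."""
    return Decimal("100") - sum(Decimal(s) for s in named_shares)

# Figures quoted in the comment, in the order IE, Chrome, Firefox, Safari, Opera, Android.
all_requests = ["25.36", "24.99", "21.77", "5.82", "3.71", "2.99"]
html_pages = ["26.58", "20.90", "20.92", "4.81", "2.30", "2.77"]

print(other_share(all_requests))  # 15.36
print(other_share(html_pages))    # 21.72
```

Both residuals match the figures given in the comment.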

OK. Now I've taken the trouble to bring all the figures together, I can see what we're being shown:
'All requests' (empty cells are omitted from the rows below)
Browser, then: non-mobile, tablets, other mobile, Total
IE 25.36 0.55 0.02 25.93
Chrome 24.99 24.99
Firefox 21.77 0.02 21.79
Safari 5.82 2.65 5.62 14.09
Opera 3.71 1.33 5.04
Android 0.19 2.99 3.18
Other 4.98
Total 100.00
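The row totals in the table can be re-derived from the listed components. The sketch below is only a check on the arithmetic: it takes the figures exactly as they appear in each row, without assuming which device column a figure belongs to, and the dictionary layout is an illustration rather than anything taken from the chart's source.

```python
from decimal import Decimal

# Components per browser as listed in the table rows (empty cells omitted).
components = {
    "IE": ["25.36", "0.55", "0.02"],
    "Chrome": ["24.99"],
    "Firefox": ["21.77", "0.02"],
    "Safari": ["5.82", "2.65", "5.62"],
    "Opera": ["3.71", "1.33"],
    "Android": ["0.19", "2.99"],
}

# Row totals, then 'Other' as whatever is left of 100%.
totals = {name: sum(Decimal(v) for v in values) for name, values in components.items()}
other = Decimal("100") - sum(totals.values())

for name, total in totals.items():
    print(name, total)
print("Other", other)  # Other 4.98
```

The computed totals (25.93, 24.99, 21.79, 14.09, 5.04, 3.18) and the 4.98 remainder agree with the pie chart figures quoted at the start of this thread.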

The problem was that none of this was obvious - to me anyway. Per WP:V, this should be made clear somewhere. --Nigelj (talk) 18:02, 9 May 2012 (UTC)

I asked the creator on the talk page about this and was told, for example, that the IE figure also includes the tablet and mobile numbers. Not sure what the solution is to the confusion. Perhaps expand this article's table to do the same? What do you think? Thanks! Daniel.Cardenas (talk) 18:05, 9 May 2012 (UTC)
Thanks Daniel. Having gone to the trouble of creating it, I copied the table above onto the graphic's Commons page. I think that covers it. Every figure was, in fact, perfect. --Nigelj (talk) 18:24, 9 May 2012 (UTC)

Google Chrome Now the No. 1 Browser in the World

Chrome is now #1. If someone can please update the article. source. Joseph507357 (talk) 16:05, 21 May 2012 (UTC)

Sample sizes

I just reverted some large scale changes made by

WP:BRD here was that at least some of the new figures that were prominent were clearly grossly in error. Mwarren us's version stated that the Wikimedia stats were based on '1' website, whereas, from the article's own section on the figures, it says, "These server logs cover requests to all the Wikimedia Foundation projects, including Wikipedia, Wikimedia Commons, Wiktionary, Wikibooks, Wikiquote, Wikisource, Wikinews, Wikiversity and others[21]", in every language. It also stated that these figures were based on a 'Pageviews' sample of 15,722. A glance at the cited source shows the sample size to be 15,722,000,000 HTML page squids, where squids are defined by 1:1000-sampled server logs. In other words, the full sample was 15,722,000,000,000 HTML pages served, equivalent to a request count of 128,552,000,000,000. Secondly, the link given regarding arithmetic means appeared to be to a discussion section that closed an RFC. In fact it was to a comment by Useerup (talk · contribs), who, I'm sure, won't mind being described as a participant in the RFC. I can't find the actual RFC at the moment, or remember who formally closed it, but it is clear that the link given was not to the official closing comments. Some other aspects of the series of edits may have been valid, but I did not feel that it was right to leave these errors on display. Please discuss changes you would like to make here, one at a time, so that we can all agree on their value. --Nigelj (talk) 21:51, 21 May 2012 (UTC)
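The scaling step in the comment above (a 1:1000-sampled log multiplied back up to a full count) can be sketched as follows. The figures are taken at face value from the comment and are not re-checked against the cited source; the variable names are illustrative only.

```python
# One log line is kept per 1000 requests, per the comment above.
SAMPLING_RATE = 1000

# Sampled HTML page count as stated in the comment.
sampled_html_pages = 15_722_000_000

# Scale the sample back up to an estimate of all HTML pages served.
estimated_html_pages = sampled_html_pages * SAMPLING_RATE
print(f"{estimated_html_pages:,}")  # 15,722,000,000,000

# Going the other way: the sampled request count implied by the stated total.
stated_total_requests = 128_552_000_000_000
implied_sampled_requests = stated_total_requests // SAMPLING_RATE
print(f"{implied_sampled_requests:,}")  # 128,552,000,000
```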

I don't understand why some people get hung up on the sample sizes. For the stats listed in this article, the sample sizes are large enough to drive the variance of the percentages to a very small value. The reason that the stats vary from source to source is that they are sampling from different populations. -- Schapel (talk) 22:07, 21 May 2012 (UTC)
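To put a rough number on the point about variance: under the simplifying assumption that page views are independent draws from a simple random sample (which real traffic is not), the standard error of a share p measured over n views is sqrt(p(1-p)/n). The sketch below plugs in the 26.58% IE share of 'Html pages' and the 15,722,000,000 sampled pages quoted earlier on this page; the function name is illustrative.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent observations."""
    return math.sqrt(p * (1 - p) / n)

p = 0.2658           # IE share of 'Html pages', as quoted earlier on this page
n = 15_722_000_000   # sampled HTML pages, as quoted earlier on this page

se = standard_error(p, n)
print(f"standard error: {se * 100:.6f} percentage points")
```

With n in the billions, the sampling error comes out far below the second decimal place of the percentage, so differences of several points between providers cannot be explained by sample size alone. That is consistent with the comment's explanation that the providers sample different populations.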

Medians in Usage share of web browsers

Since there is really no consensus above and everyone involved can agree on nothing, I ask for outside comment on whether the medians should be included.Jasper Deng (talk) 00:12, 5 January 2012 (UTC)

Yes, but, notwithstanding my basic objection that this is analysis/research we shouldn't be doing, isn't this true?
  • One of the main arguments for using a median is to reduce the influence of "big outliers" in a large sample.
  • It is, here, being applied to a sample of 5.
  • The sample data for the median is percentages.
  • By definition, percentages are confined to a range of 0-100, somewhat reducing the likelihood of "big outliers".
And, if we are honest, isn't there, anyway, a tiny hint here that we are using median as something that might avoid WP:CALC, because we really, deep down know that we're crossing, or over, the line of doing our own research here? (yes, I read the rest of the page, now).
Apologies if my maths/statistics knowledge isn't fully up to speed, I'm largely basing my supposition on medians and their usefulness from a discussion I had with a real estate agent, explaining that it helped to exclude massively overpriced palaces from local property price averages. Begoontalk 12:43, 8 January 2012 (UTC)
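The effect described above can be illustrated with a sample of five. The numbers below are invented purely for illustration and are not taken from any stats provider: four sources report about 30% and one reports 60%.

```python
from statistics import mean, median

# Hypothetical shares (in percent) for one browser from five sources.
reported = [30.0, 31.0, 32.0, 33.0, 60.0]

print(mean(reported))    # 37.2, pulled upwards by the single high value
print(median(reported))  # 32.0, the middle value, unaffected by how high the outlier is
```

Because percentages cannot exceed 100, the outlier can only pull the mean so far, which is the point made in the last bullet above. With only five inputs, the median is simply the third-ranked source's figure.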
@NigelJ: And yet the medians are plotted in a graph directly encouraging comparison of the medians; omitting the fact that readers should actually re-scale the medians if they want to compare them. Of course, comparing the medians would be wrong since they are created from sources which don't even claim to state the same kind of numbers. Some sources try to extrapolate to global usage shares, other sources report their raw usage shares. Doing any type of summary on such numbers is just flat out wrong. It's apples compared to slivers of orange peel.--Useerup (talk) 15:40, 8 January 2012 (UTC)
"the fact that readers should actually re-scale the medians if they want to compare them" is actually wrong. Each median is a percentage and so is comparable with 100%, and therefore is comparable with other percentages, and medians of percentages. All you cannot do is add them up and expect to see 100%. It is perfectly valid to say, "Based on the most reliable figures Wikipedia has been able to identify, the median usage of A just went above 50%", "Based on the most reliable figures Wikipedia has been able to identify, the median usage of A is now two percentage points greater than the median usage of B", and "The usage shares reported by statistics provider P are usually within 5% of the medians based on all the most reliable figures Wikipedia has been able to identify". --Nigelj (talk) 16:12, 8 January 2012 (UTC)
  • Oppose. I agree with the arguments already made against the median. In our graphs we can choose a single source; I propose StatCounter, which is already used in some. The only valid "pro" of the median is the synthesis, but with so few sources, in my opinion it is useless. Subver (talk) 13:54, 8 January 2012 (UTC)
  • Oppose. A median is a meaningless number when the inputs are not comparable. kop (talk) 06:16, 12 January 2012 (UTC)
    • But the input is perfectly comparable. It only differs in biases — that exact thing median is supposed to fix. — Dmitrij D. Czarkoff (talk) 11:12, 12 January 2012 (UTC)
      • The sources sample different populations and they may very well sample different behavioral patterns (unique users versus page impressions). The populations they sample are of very different sizes. One of the sources tries to extrapolate to global usage shares; others don't. They are not comparable. Yet, in a median (or mean) calculation they are given equal weight, the result (global usage share???) is not clearly defined, and if you compare percentage points you err because they are not scaled to 100%. If the sum of the medians hits 110 (which is possible, although right now it "only" sums to 102%), then comparing percentage points and concluding that browser A has 2 percentage points more usage than browser B would put you off by about 10%. --Useerup (talk) 14:53, 12 January 2012 (UTC)
      • How can you say they're comparable when they're not reproducible, not verifiable, and, pointedly, are computed based on populations that are not randomly selected and which therefore represent nothing but themselves? The meaning of each metric is therefore questionable; and entirely unknown with respect to global browser share, which is what the median is supposed to pertain to. Further, as you note, arguments which pertain to the median also pertain to the mean. Yet nobody is arguing that the mean is meaningful -- it's obvious that the mean is not meaningful because it can't be weighted when sample size is unknown. It should be equally clear that when you take a median you must know what you're taking the median of, and nobody knows how to compare the different surveys' sample populations. kop (talk) 08:15, 15 January 2012 (UTC)
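The arithmetic behind the "sums to 110" scenario discussed above can be sketched with invented figures. The three sources and three browsers below are hypothetical; each source's shares add up to 100, yet the per-browser medians do not. Whether rescaling is appropriate is exactly what is disputed in this thread, so the sketch only shows what each side is computing.

```python
from statistics import median

# Hypothetical shares (in percent); each source sums to 100.
sources = {
    "source_1": {"A": 50.0, "B": 30.0, "C": 20.0},
    "source_2": {"A": 30.0, "B": 50.0, "C": 20.0},
    "source_3": {"A": 45.0, "B": 45.0, "C": 10.0},
}
browsers = ["A", "B", "C"]

# Median of each browser's share across the sources.
medians = {b: median(s[b] for s in sources.values()) for b in browsers}
total = sum(medians.values())
print(medians, total)  # medians are 45, 45 and 20, which sum to 110

# Rescale the medians so that they sum to 100.
rescaled = {b: m * 100 / total for b, m in medians.items()}

raw_gap = medians["A"] - medians["C"]          # gap in raw median points
rescaled_gap = rescaled["A"] - rescaled["C"]   # gap after rescaling
print(raw_gap, round(rescaled_gap, 2))
```

In this example the raw gap between A and C is 25 points, while the rescaled gap is about 22.7, a ratio of 1.1. That is the "about 10%" discrepancy mentioned above. The counter-argument in the thread is that each median is still a valid percentage of 100 in its own right, even though the set of medians need not add up to 100.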

Note: this RFC was supposed to help build consensus. Therefore it's not enough to say whether you support or oppose the median. Please also state your view on how the user agent statistics should be presented, e.g. a table with raw data, a table and a plot (which plot?), a table with a weighted mean line, a table with a median line, just a text saying that such studies are performed, or any other way. Please make sure you not only criticize, but also suggest something. Otherwise your effort will actually turn out to further fuel the dispute. — Dmitrij D. Czarkoff (talk) 14:09, 8 January 2012 (UTC)


Note: Unlike what

WP:CALC requires consensus for a calculation to add it or keep it in).--Useerup (talk) 15:04, 8 January 2012 (UTC)

How many new polls have there been on these medians, here, at Talk:Usage share of operating systems and elsewhere in recent months? --Nigelj (talk) 15:28, 8 January 2012 (UTC)

  • RFC reply It seems to me to be a big non-issue; the median it returns is effectively the same as the statcounter result anyway.
    talk
    ) 23:51, 9 January 2012 (UTC)

Ending the RFC

The conclusion is that there is no consensus on whether the median is an appropriate calculation. According to WP:CALC there must exist consensus for keeping the median; otherwise it must be removed. The median is already removed through other changes and there seems (absence of edits) to be consensus that the changes are appropriate (good work!). I have removed the RFC tag. --Useerup (talk) 19:36, 27 January 2012 (UTC)

Shouldn't the beginning of the Usage share of operating systems page be updated to remove the "A discussion is being conducted..." text, then? 195.23.92.74 (talk) 19:34, 19 March 2012 (UTC)
Removed the averages per the no consensus result of the above RFC discussion. Please continue the discussion here before changing that edit. Thanks! sn‾uǝɹɹɐʍɯ (talk) 03:36, 24 May 2012 (UTC)

Mobile vs desktop

Because of the tag, the article needs to define the concepts of "desktop" and "mobile". For me it is clear: "mobile" includes smartphones and tablets and their corresponding operating systems (Android, iOS, etc.), while "desktop" includes desktops proper and laptops and their corresponding operating systems (Windows, Mac OS, etc.). This is how the sites that record browser share group them. — Preceding unsigned comment added by Palacesblowlittle (talkcontribs) 15:05, 17 July 2012 (UTC)

StatOwl

This site has two serious problems. One: since May 2012 it no longer shows valid stats, so it has to be moved to the older reports section. Two: it only has stats for the USA, so it can't sit together with the global stats; it must be kept apart.

Yes, I don't see any recent data from StatOwl, so I think we should move it to the Older Reports section. I don't know about moving it "apart" otherwise, though, because I don't see any statement or implication that StatOwl's data are representative of global usage share. -- Schapel (talk) 22:30, 1 August 2012 (UTC)

An Animadversion

As a web developer, I certainly root for Chrome and FF over IE. But, as a scientist, I know there's often a big difference between what we want and the reality. The article expresses a bias and with much more confidence than warranted.

On the browser wars, here's a dissenting view Internet Explorer market share surges, as IE 9 wins hearts and mind that gives IE more market share than FF and Chrome in March 2012, an idea that seems to be supported by the page you link at the bottom of the article Browser News > Stats.

I'm just thinking that, even with all the cautions noted in the article, three (or four!) digits of precision is misleading, and in general, the stats should be put forward much more tentatively than they are and contrary positions given some space. JKeck (talk) 15:36, 22 August 2012 (UTC)

I think you've hit upon a basic misunderstanding that bites many people when they discuss usage share. We cannot know the actual "global" usage share. All we can know is for a given set of sites, what is the usage share of each browser. Each stats company can measure to a high degree of precision the usage share of browsers for the set of sites they monitor, although it doesn't make much sense to give more than one or two decimal places in the percentages because the second or third decimal place can change on a daily or weekly basis. Each stats company uses a different set of sites, and none of them use an unbiased sample. Some stats companies use stats primarily from websites in a particular country, and some use stats primarily from larger companies' websites. The best we can do is take each of these data points as a very good educated guess, and average them together to get the wisdom of the crowd, which is a best guess at global usage share. -- Schapel (talk) 18:26, 22 August 2012 (UTC)

iPad is mobile?

the mobile stats break down safari into iPhone and iPod. what about iPad? seems like an important omission, or is it just included in one of those categories? Spot (talk) 01:30, 4 September 2012 (UTC)

I had an email exchange with statcounter roughly 6 months ago and they said they categorize the iPad as a console, so it is in neither the desktop nor the mobile statistics.   :(   Daniel.Cardenas (talk) 03:11, 4 September 2012 (UTC)

Restore logical order to stat providers

Further to removal of NetApps, just noticed DC also reordered the Historical Usage Share section in Nov 10 to place StatCounter at the top for no apparent reason. Previously the providers were listed in order of how long they've been operating - i.e., Net Apps, W3Counter, StatCounter, Wikimedia, Clicky. See long-term contributor Schapel's confirmation of this in the "Restore Net Apps stats" section above.

I propose this order is restored rather than the random order they've been shuffled into. Further, the Summary Tables were also in this age order, now StatCounter is randomly at the top - suggest restore. If there's a decent logic behind the current order, fair enough - let's hear it. — Preceding unsigned comment added by Psdie (talkcontribs) 02:50, 14 September 2012 (UTC)