Talk:Statistics/Archive 4

Source: Wikipedia, the free encyclopedia.

Can you have negative weighting for sensors that measure a positive quantity?

An answer or a referral to an academic expert would be appreciated.

Is it automatically unphysical to have a PCA reconstruction in which some stations are negatively weighted? I would think that it could arise from both degeneracy and anticorrelation with the average (actual physical effects). Of course the summation must be positive, but is it automatically wrong if some of the stations have negative weights?

This is being debated on these blog threads. Unfortunately, the debate has muddled specific examination of the Stieg Antarctic PCA-based reconstruction with general absolute claims that negative weightings are bad, bad, bad.

Could you please adjudicate?

See here:

http://noconsensus.wordpress.com/2009/06/07/antarctic-warming-the-final-straw/#comment-6727

http://noconsensus.wordpress.com/2009/06/09/tired-and-wrong-again/#comment-6726

http://wattsupwiththat.com/2009/06/10/quote-of-the-week-9-negative-thermometers/#more-8362

http://www.climate-skeptic.com/2009/06/forgetting-about-physical-reality.html —Preceding unsigned comment added by 69.250.46.136 (talk) 17:27, 19 June 2009 (UTC)
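
As a rough illustration of the general question only (not of the Stieg reconstruction, whose data and method are not reproduced here), the sketch below builds three synthetic "stations" that all record strictly positive temperatures. The station values, the noise level, and the anticorrelated third station are assumptions made up for this example. It suggests that a principal component can carry loadings of both signs even when every measured quantity is positive, because PCA works on deviations from the mean.

```python
import numpy as np

# Synthetic data: all values are assumptions for illustration only.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(0.0, 1.0, n)  # common regional signal

# Three stations in kelvin (always positive); the third one is
# anticorrelated with the shared signal.
temps = np.column_stack([
    250.0 + shared + rng.normal(0.0, 0.1, n),
    250.0 + shared + rng.normal(0.0, 0.1, n),
    250.0 - 0.8 * shared + rng.normal(0.0, 0.1, n),
])

# PCA via the eigen-decomposition of the sample covariance matrix.
centered = temps - temps.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# Loadings (weights) of the leading principal component.
loadings = eigvecs[:, -1]
print("all measurements positive:", bool((temps > 0).all()))
print("leading PC loadings:", loadings)
```

The overall sign of an eigenvector is arbitrary, so what matters is that the loadings have mixed signs, not which station is negative. Whether mixed signs are physically sensible in a particular reconstruction is a separate question that this toy example does not settle.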

Statistics Calculators Portal

I would like to add a new non-commercial resource to this article. The web page is http://www.solvemymath.com/online_math_calculator/statistics/index.php (statistics calculators portal).

Please verify its integrity and correctness, and if it respects Wikipedia's external resource guidelines, add it to this article as a free resource. —Preceding unsigned comment added by Livius3 (talk • contribs) 15:21, 3 October 2009 (UTC)

Does anyone have anything against this link? —Preceding unsigned comment added by 91.201.193.45 (talk) 02:29, 11 October 2009 (UTC)

Statistics is a branch of mathematics

I added the cited fact that statistics is a branch of mathematics. As references, I used a lot of books, some of them by notable "statisticians", which support this fact. I respect the opinion of some great mathematicians and statisticians that statistics is a science, and I put this sentence after the lead one. The general opinion is that statistics is a branch of mathematics.--Kiril Simeonovski (talk) 00:04, 5 October 2009 (UTC)

I do not think it is true that it is "generally accepted." I think people argue about this a lot. It is not a branch of mathematics like Algebra, Geometry, and Analysis (Calculus). It does use techniques of mathematics. You cited a zillion sources that agree with you but only one that gave the alternate definition. I could put in a zillion for the other, but I don't think it is a matter of who has more citations. I think some people say one and some say the other. I am editing the sentence to read that some consider it one thing and some consider it the other. I'm also deleting some of our sources, because it's also silly to have so many sources for one fact. GumbyProf: "I'm about ideas, but I'm not always about good ideas." (talk) 11:47, 23 October 2009 (UTC)

I agree that statistics applies mathematical techniques. But that is not a key fact that leads to the definition of statistics. You consider statistics a science, but mathematics is by definition not a science, and the link "mathematical science" leads to the article mathematics. Mathematics is considered a study, but what makes statistics a science? On the other hand, the problem is not to define mathematics, but to define what is not mathematics. The sources were cited just to prove that it's part of mathematics. If "branch of mathematics" is disputed, then "mathematical science" is more disputed. The main thing is to consider statistics as part of mathematics, but not as a science, because mathematics is actually not a science. If this is not enough, I'll carry on with "branch of applied mathematics", which for me is the most appropriate choice between "branch of mathematics" and "mathematical science".--Kiril Simeonovski (talk) 23:34, 23 October 2009 (UTC)
There are two distinct issues: (A) is stats part of math, and (B) is/are math/stats science? GumbyProf's point was about issue (A). In my experience the notion that stats is part of math is far from accepted. For example, a typical university maintains separate departments for stats and math, with little interaction between them. My hope is that GumbyProf adjusts the intro to reflect the disagreement, with sources supporting both sides. Mgnbar (talk) 13:56, 24 October 2009 (UTC)
But the problem is that the term "mathematical science" is inappropriate. Statistics is more a part of mathematics than it is a science. This seems to be completely wrong. If you two are mathematicians or statisticians, your experience and the departmentalization at some universities are not decisive for defining the term "statistics". Statistics could not be a science, because of its application in other sciences: in economics there is econometrics, in chemistry there is chemometrics, etc. What are the scientific methods in statistics? Analyzing and presenting data is a technique that in this case applies mathematical methods. The same problem was posed for mathematics, with the question of its scientific methods. If mathematics is considered to be a study, then the definition of mathematical sciences, with statistics as one of them, fails.--Kiril Simeonovski (talk) 20:07, 24 October 2009 (UTC)
Kiril, here is how I understand your argument: "Statistics is not a science; hence it must be a branch of mathematics." Is that fair? My point is that statistics could be neither. I won't bother to flesh out my argument, because my argument and yours don't matter; all that matters is cited sources. (You have already said this, and so have I.) So let's get to sources. A Google search for "what is statistics?" immediately turns up citable definitions of statistics as a branch of mathematics, citable definitions of it as a science, and citable definitions of it as other things. For example, the American Statistical Association's page [1] says that statistics is "the scientific application of mathematical principles to the collection, analysis, and presentation of numerical data" (tiptoeing around the issue). Therefore I still agree with GumbyProf that Wikipedia's Statistics article should acknowledge this disagreement, rather than taking a side. Mgnbar (talk) 22:18, 24 October 2009 (UTC)
My professor in statistics had the same opinion as you. I am not insisting that statistics is a branch, only that it is not a science. Above I wrote that maybe it's better to consider it a "branch of applied mathematics", or something which applies mathematical techniques. I put in the sources not to make statistics a branch of mathematics, but to set aside the disputed "mathematical science" temporarily. I expected this conversation.--Kiril Simeonovski (talk) 13:03, 25 October 2009 (UTC)
P.S. The definition of ASA looks more appropriate.--Kiril Simeonovski (talk) 13:04, 25 October 2009 (UTC)

The first paragraph now has two sentences debating the mathematical status of statistics, which obscure the basic introduction to statistics. This debate needs to be moved below, and the first sentences need to be restored to their simple and useful state. Kiefer.Wolfowitz (talk) 09:29, 2 November 2009 (UTC)

I agree with this suggestion. Let's consider having a section called "Scope" immediately before the "history" section. The book by Agarwal [[2]] starts with a list of quotations, and something similar might be done here. There at least needs to be a mention of the "decision making in the face of uncertainty" interpretation of what statistics is about. Melcombe (talk) 15:13, 2 November 2009 (UTC)
I also agree.--Kiril Simeonovski (talk) 23:46, 4 November 2009 (UTC)

I have made a new lead and put the parts discussed above under "scope": I also moved there some sentences from "history" that were on the same point. Note the dictionary I cited refers to statistics as a science in its own right, and the lead presently says this. Melcombe (talk) 17:18, 5 November 2009 (UTC)

Well-done, Melcombe! Kiefer.Wolfowitz (talk) 22:03, 7 January 2010 (UTC)

Statistics every writer should know

Seems to me, this more clearly introduces this subject to the ignorant and curious than most elementary treatments. Should it be a prominent EL here? Jim.henderson (talk) 14:58, 8 January 2010 (UTC)

Applied statistics

Given that several articles have chosen to use the term "applied statistics" as opposed to just "statistics", and that applied statistics redirects here, it would be good to have a short sentence or two covering this term. Melcombe (talk) 17:26, 27 January 2010 (UTC)

What is the job of a statistician?

Is it that he collects data about populations only? How is probability used in stats? How is it used efficiently in our lives? Could anybody please answer these questions, and also tell me what a person can do after completing a degree in stats? —Preceding unsigned comment added by 59.93.255.142 (talk) 16:49, 4 February 2010 (UTC)

You might find our statistician article helpful, along with some of the external links there. -- Avenue (talk) 21:27, 4 February 2010 (UTC)

The most difficult problems in statistics? —Preceding unsigned comment added by 121.52.147.11 (talk) 07:59, 10 February 2010 (UTC)

Simpson's Paradox

Why is there no section on paradoxes, particularly the most accessible of them all - Simpson's Paradox? There is a wikipedia page on it! Someone should make a simplified version of that page (using a better example, like with batting averages).
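
To make the batting-average suggestion concrete, here is a minimal sketch of Simpson's paradox. The two players and their hit and at-bat counts are hypothetical figures chosen for illustration, not taken from the article or from any cited source. Player B has the higher average in each season, yet Player A has the higher average when the seasons are pooled, because the at-bats are distributed very unevenly between the seasons.

```python
from fractions import Fraction

# Hypothetical (hits, at_bats) per season; illustrative numbers only.
player_a = {"season 1": (12, 48), "season 2": (183, 582)}
player_b = {"season 1": (104, 411), "season 2": (45, 140)}


def season_average(record, season):
    """Batting average for one season as an exact fraction."""
    hits, at_bats = record[season]
    return Fraction(hits, at_bats)


def combined_average(record):
    """Batting average over all seasons pooled together."""
    hits = sum(h for h, _ in record.values())
    at_bats = sum(ab for _, ab in record.values())
    return Fraction(hits, at_bats)


for season in ("season 1", "season 2"):
    a = season_average(player_a, season)
    b = season_average(player_b, season)
    print(f"{season}: A = {float(a):.3f}, B = {float(b):.3f}")

print(f"combined: A = {float(combined_average(player_a)):.3f}, "
      f"B = {float(combined_average(player_b)):.3f}")
```

Exact fractions are used so that the comparisons do not depend on floating-point rounding.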

Why does "statistical analysis" redirect here?

It seems to me like directing http://en.wikipedia.org/wiki/Software_development to http://en.wikipedia.org/wiki/programming Talgalili (talk) 10:10, 19 February 2010 (UTC)

What would you like it to say? There is statistical modeling, but that is fairly useless. "Applied statistics" just redirects back here too, as does "statistical applications". Of course, there is data analysis, which might serve but might have too much non-statistical stuff. Melcombe (talk) 11:07, 19 February 2010 (UTC)
I think it deserves its own article (or should be redirected to others such as those you suggested), even though those articles are not of the same high quality. Would you agree? Talgalili (talk) 14:27, 19 February 2010 (UTC)

Statistical Computing

This section needs to be expanded. The main article on stat computing is highly undeveloped, and the language whose picture appears, gretl, is relatively obscure. R is clearly the preferred open-source statistical computing package these days, and there exists quite a bit of history about which packages have developed over the years and why. Wilkinson, L. (2008), "The future of statistical computing", Technometrics, 50 (4): 418--435 contains a thorough study. Owensmartin (talk) 23:47, 8 March 2010 (UTC)

It would seem better to make statistical software. Melcombe (talk) 10:32, 9 March 2010 (UTC)

what are the 5 qualities of a good statistician?

What are the 5 qualities of a good statistician? —Preceding unsigned comment added by 121.1.29.246 (talk) 11:03, 21 June 2010 (UTC)

What is the correct level of measurement of IQ tests?

I see the main article Levels of measurement suggests that most psychometric scales (presumably including IQ, although this is not stated explicitly) are better deemed to be ordinal scales, even though many psychologists think otherwise. (In my professional research in psychology, I have seen both kinds of statements in the writings of psychologists. See User:WeijiBaikeBianji/IntelligenceCitations for a growing list of citations to the literature on human intelligence and IQ testing that I share with other Wikipedians.) So which is the more accurate view? Are IQ tests best thought of as interval scales, or as ordinal scales? It seems to me that the two related articles ought to agree on that point. -- WeijiBaikeBianji (talk) 23:35, 2 July 2010 (UTC)

If there's genuine disagreement in the professional literature then Wikipedia should state that rather than come down on one side or the other, in line with WP:ASSERT. But as this is only being used as a passing example in Statistics#Levels of measurement, I think it would be best to replace it with a more straightforward, uncontroversial one. I'll have a quick go now. Qwfp (talk) 08:35, 3 July 2010 (UTC)
I've reworded it and replaced IQ with longitude as one example of an interval scale. Qwfp (talk) 08:53, 3 July 2010 (UTC)
"If there's genuine disagreement in the professional literature then Wikipedia should state that rather than come down on one side or the other" is certainly correct as a statement of Wikipedia policy. My question was more motivated by the sense that scholars who are sensitive to the issue of levels of measurement may be in agreement about what the level of measurement is in IQ testing. If so, that information belongs in at least a dozen articles here and elsewhere on Wikipedia related to IQ testing. Perhaps there is a small subset of scholars who form the intersection of the set of statistically astute scholars and the set of psychologically knowledgeable scholars who have formed a consenus on the level of measurement found in IQ testing. I'd like to double-check what they think. Thanks for the edit you just did meanwhile. -- WeijiBaikeBianji (talk) 15:09, 3 July 2010 (UTC)
I have since found (and have added to the Level of measurement article) quite a few references showing that psychologists who consider the issue, including some IQ test developers, agree that IQ tests yield only ordinal rankings, not interval measurements. The only psychologists who say otherwise (claiming that IQ tests are interval scales) do so in passing, never with a reference or detailed discussion of the issue. If IQ scores are ordinal scores, that means IQ statistics are nonparametric statistics, right? (Pardon my statistical-newbie question; I'm trying to check my own understanding of the issues.) I will be making updates of various Wikipedia articles according to what the best sources say, and as I go along, I want to make sure I am correctly understanding the big picture. See the citation list for intelligence articles for what some of the best sources on IQ testing are. -- WeijiBaikeBianji (talk) 15:13, 18 August 2010 (UTC)
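
One way to see why the ordinal-versus-interval question matters for the choice of statistics is sketched below. The scores are made-up numbers, and the square root stands in for an arbitrary order-preserving rescaling; neither comes from the IQ literature. On an ordinal scale any such rescaling is equally legitimate, and the example suggests that a comparison of medians survives it while a comparison of means need not.

```python
from math import sqrt
from statistics import mean, median

# Made-up scores for two groups; illustrative only.
group_a = [1, 1, 10]
group_b = [3, 4, 4]


def rescale(scores):
    """Apply an order-preserving (monotone increasing) transformation."""
    return [sqrt(s) for s in scores]


a_rescaled = rescale(group_a)
b_rescaled = rescale(group_b)

# Means: the ordering of the two groups flips after rescaling.
print("means before:  ", mean(group_a), mean(group_b))
print("means after:   ", mean(a_rescaled), mean(b_rescaled))

# Medians: the ordering of the two groups is unchanged.
print("medians before:", median(group_a), median(group_b))
print("medians after: ", median(a_rescaled), median(b_rescaled))
```

This is only a sketch of the general idea behind preferring rank-based summaries for ordinal data; it does not say which level of measurement IQ scores actually have.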

Random Chance

The article seems to ignore the effects of pure, random chance - as do statistics. By this I mean Stats tell us that there is a 1 in 1000 chance of something happening, but (given the dataset [It will happen, It won't happen]) this is reduced to 1:2. Ignoring all outside influences, either something happens or it doesn't. Similar to the n=1 fallacy, the odds of winning the lottery are so high that if you win it, stats tell us that you probably won't again. This ignores that they have already told us that you probably won't win the first time. It was down to pure random chance that you did and that same pure random chance can happen again. (I have not looked into the data but I'm sure in some, fairly-run, lottery the same numbers have appeared more than once over time - like the n=1 - there is nothing to stop the same numbers coming up twice in a row). Where is WikiResearch? (talk) 11:41, 18 August 2010 (UTC)

The article does mention that statistics is based on probability theory, which is appropriate, as much of statistics is concerned with random chance --- for example, sampling variability. But what specifically would you like to change about the Statistics article? Mgnbar (talk) 13:59, 18 August 2010 (UTC)
Basically, just a mention that statistical results have little bearing on real-life situations. Just because something has a probability of one-in-a-million it doesn't stop it from happening a thousand times in a row. And that statistical results are, largely, subjective. Where is WikiResearch? (talk) 14:10, 18 August 2010 (UTC)
A statement that "statistical results have little bearing on real-life situations" would be highly controversial, to say the least; statistics is applied to all kinds of real-life problems every day. Your "thousand times in a row" example is essentially the law of averages or gambler's fallacy. Probability theory is loaded with such subtle and/or counterintuitive phenomena, such as the Monty Hall problem. I think that they should be addressed in the probability articles, not the Statistics article. The issue of subjectivity is more interesting; it's dealt with a bit at Foundations of statistics, but that's basically a stub, and there's nothing here about it. What should be said here? Mgnbar (talk) 15:40, 18 August 2010 (UTC)
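
On the earlier remark that the same lottery numbers have surely come up more than once somewhere, probability theory itself can put a number on that. The sketch below assumes a 6-from-49 lottery with independent, fair draws and an arbitrary 5,000 draws; both the format and the draw count are assumptions for illustration. It computes the chance that at least two draws share the same winning combination, which is a version of the birthday problem.

```python
from math import comb, exp, log1p


def prob_repeat(num_draws, num_combinations):
    """Probability that at least two of the draws are identical,
    assuming independent draws, each uniform over all combinations."""
    # P(no repeat) = product over i of (1 - i / N); sum logs for stability.
    log_no_repeat = sum(log1p(-i / num_combinations) for i in range(num_draws))
    return 1.0 - exp(log_no_repeat)


combos = comb(49, 6)          # number of possible 6-from-49 tickets
p = prob_repeat(5000, combos)
print(f"{combos} combinations; P(some repeat in 5000 draws) = {p:.2f}")
```

Under these assumptions a repeated combination is more likely than not, even though any single draw matching a given earlier draw remains extremely unlikely.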
All I can think of are the words take with a pinch of salt inscribed in large friendly letters on the cover. I've got a good point to make, I just can't seem to translate from thought to words. It could be that statistical analyses never consider all variables (which would be impossible) but often disregard them when convenient... which is sounding wrong even as I write it.
So long and thanks for all the fish (er, I mean help - see why I posted here first).Where is WikiResearch? (talk) 02:26, 20 August 2010 (UTC)
Maybe you are getting at the idea that a statistical or mathematical model is just a model --- an imperfect representation of reality, made for a particular purpose, that may not serve other purposes? With regard to the "pinch of salt", I will say that statistics gives you an idea of how big of a pinch of salt you should take; that is, statistics helps you manage uncertainty. For example, the p-value in a hypothesis test tells you how likely data at least as extreme as those observed would be if the null hypothesis were true. Anyway, so long. Mgnbar (talk) 16:34, 20 August 2010 (UTC)
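
As a small worked example of the p-value idea, the sketch below runs an exact one-sided binomial test. The scenario (60 heads in 100 tosses, with a null hypothesis of a fair coin) is an assumption made up for illustration. The p-value here is the probability, computed under the null hypothesis, of seeing a result at least as extreme as the one observed.

```python
from math import comb


def binomial_upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical experiment: 100 tosses, 60 heads, null hypothesis p = 0.5.
p_value = binomial_upper_tail(100, 60)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value says the observed data would be unusual if the null hypothesis held; it is not the probability that the null hypothesis is true.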

Data singular or plural or both?

Kendroche and I have just reverted each other over the issue of whether the word "data" is singular or plural. I have read the citation offered by Kendroche; it seems to be a linguistics database. The Data article says that "data" is plural, but points out that it is sometimes used as singular, especially in colloquial English. The two statistics textbooks in front of me are unambiguously in favor of plural. Do others care to weigh in? Mgnbar (talk) 13:26, 19 August 2010 (UTC)

Data is a singular mass noun. Like agenda, it is borrowed from a Latin plural form, but is not commonly treated as plural in English. There are still many authors who treat it as plural, and even style guides that recommend this, but to me this seems dated or overly formal. I don't think these are styles we should be aiming for here. This Guardian blog post puts it well. --Avenue (talk) 01:10, 20 August 2010 (UTC)

I had a quick look in my Oxford English Dictionary and it does list "data" as a plural. I then had a look at the date, 1948. What would be interesting to know (for me, at least) is when did it start being used as a singular? Also found this Usage note on Dictionary.com:

Data is a plural of datum, which is originally a Latin noun meaning “something given.” Today, data is used in English both as a plural noun meaning “facts or pieces of information” ( These data are described more fully elsewhere ) and as a singular mass noun meaning “information”: Not much data is available on flood control in Brazil. It is almost always treated as a plural in scientific and academic writing. In other types of writing it is either singular or plural. The singular datum meaning “a piece of information” is now rare in all types of writing. In surveying and civil engineering, where datum has specialized senses, the plural form is datums.

According to that, as Stats are a scientific piece, the usage should be plural (I still think it sounds wrong and would never use it myself, but at least I get a big slice of pie). Where is WikiResearch? (talk) 01:56, 20 August 2010 (UTC)

I'm glad that we've returned to plural, at least for now. I never knew that the singular usage was as accepted as it is; thanks for educating me. Is some of this due to discrepancies between UK and USA English, in how they treat mass nouns? There's no reason for the Statistics article to favor USA English. Mgnbar (talk) 16:40, 20 August 2010 (UTC)
I don't believe this is entirely a US vs UK distinction, e.g. along the lines of "the team is ... " vs "the team are ...". The Guardian writers I linked to above argue for singular data but are based in the UK, for instance, and this post from someone at Harvard says usage in the US is mixed. Our article also displays mixed usage. While I would be happiest if we settled on the singular, for now I will just share a link to another partisan for singular usage: Data is a singular noun. --Avenue (talk) 14:53, 22 August 2010 (UTC)
Again, thanks for educating me. The author makes many good points, although he doesn't refute all of the other side's evidence and arguments. (For example, he writes, "I can't find any sources which argue unashamedly that `data' is a plural". But such sources are easy for me to find.) There is also the issue of the descriptive view of language, in which linguists describe how people use words without making value judgments as to correctness, vs. the prescriptive view. In scientific disciplines it is common for elders to prescribe language use to novices, in order to maintain a slightly artificial technical language. For example, if a classroom of statistics students started using the term "independent" to refer to any situation in which two quantitative variables had correlation coefficient 0, then the teacher would correct them.
I propose that we stop discussing this issue here, and that anyone who's interested should chime in at Talk:Data, where as you can imagine the discussion is well underway. Mgnbar (talk) 15:36, 22 August 2010 (UTC)
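
The "independent" example above can be checked with a tiny exact calculation. The sketch assumes a toy distribution chosen for illustration: X takes the values -1, 0 and 1 with equal probability, and Y = X². The covariance (and hence the correlation coefficient) is exactly 0, yet Y is completely determined by X, so the two variables are not independent.

```python
from fractions import Fraction

# Toy distribution (an assumption for illustration): X uniform on {-1, 0, 1}.
values = [-1, 0, 1]
prob = Fraction(1, 3)

# Y is a deterministic function of X.
def y_of(x):
    return x * x

e_x = sum(prob * x for x in values)
e_y = sum(prob * y_of(x) for x in values)
e_xy = sum(prob * x * y_of(x) for x in values)
covariance = e_xy - e_x * e_y  # zero covariance implies zero correlation

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = prob                                            # P(X = 0)
p_y0 = sum(prob for x in values if y_of(x) == 0)       # P(Y = 0)
p_joint = sum(prob for x in values if x == 0 and y_of(x) == 0)

print("covariance:", covariance)
print("P(X=0, Y=0):", p_joint, "vs P(X=0)P(Y=0):", p_x0 * p_y0)
```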

It would be rather better if people would say what they actually mean and not go for some quick shorthand. Thus, people could write "dataset" or "data set" (or collection of data) for use in the singular when referring to a whole mass of data, and could use "data points" or "data items" (in the plural) when wanting to emphasize the multiplicity of what is being considered. Melcombe (talk) 10:01, 23 August 2010 (UTC)

Sure, but it is not for Wikipedia to set standards like that. Since the field doesn't have agreement on the use of the term, it is hard for this Encyclopedia to find a single standard. The description as in Data is probably best. It's usually used as plural, but formally (and formerly) singular as a mass noun, and sometimes used where datum would have once been used exclusively. GumbyProf: "I'm about ideas, but I'm not always about good ideas." (talk) 17:22, 30 August 2010 (UTC)

A link added to StatProb

Hello all.

I added a link to StatProb. I was very cautious about adding such a link, but I believe it is a project sponsored by some of the more important institutions in the statistical field, offering a unique knowledge resource (whose content is shared under CC: non-commercial, with attribution, share-alike). I think this link belongs in this article.

I would be interested to know if you support or disapprove of this addition.

p.s: I am in no way affiliated with this project.

Talgalili (talk) 02:41, 1 August 2010 (UTC)

The present contents of this "StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies" don't seem very useful for "topics" in statistics, rather than people ... most articles seem to be biographies ... in fact, I see only one article that is not a biography. Melcombe (talk) 08:43, 2 August 2010 (UTC)
Hello Melcombe. Due to the declared scope of the website, I would trust that to change. Do you suggest removing the link until more general topics will be added? Talgalili (talk) 10:48, 2 August 2010 (UTC)
We could do with other opinions. But this page suggests that there is not a lot of activity at StatProb. However, I have added the link to founders of statistics, because of the historical biographies. Melcombe (talk) 15:25, 10 August 2010 (UTC)
I agree they're low on activity, although two new articles were added after your last message. So the project is still very much alive. Talgalili (talk) 16:45, 25 August 2010 (UTC)

I work for Springer, the company hosting the site. The site went live the first week of August; we included 100 biographical articles from a Springer publication "Statisticians of the Centuries" to get us started. These are not typical of what is to come. All articles have to be approved by an editorial board and once published, all changes have to be approved by the article's author. There are more than 20 articles currently being reviewed by the editorial board and the ten sponsoring societies are just announcing it to their members. John Kimmel, Springer —Preceding unsigned comment added by 64.69.101.138 (talk) 21:50, 4 September 2010 (UTC)

It may yet turn into something independent, but the new articles I have looked at are acknowledged as being "reprinted" from a Springer publication: Lovric, Miodrag (2011), International Encyclopedia of Statistical Science, Heidelberg: Springer Science. For others' info there is an online announcement of StatProb. There are now about 11 "topic" articles. Melcombe (talk) 15:41, 7 September 2010 (UTC)
Is the license compatible with Wikipedia's? i.e. Can articles be moved between the two? 018 (talk) 18:45, 7 September 2010 (UTC)

Discussion on another Talk page (some time ago now) revealed a different source of refereed online Stats articles in Wikipedia-like format at http://www.scholarpedia.org/article/Category:Statistics . Again, not many articles there. Melcombe (talk) 15:11, 25 October 2010 (UTC)

Education Spending

I didn't know where else to go, but I need some statistical information that shows the percentage of spending on education in the United States. I'm doing an argumentative paper on "Should Education Be the Number One Priority in a Nation". I need all sorts of information on education but I'm having a difficult time finding it on the internet. It just displays a lot of useless information. I need info on things like: a nation before and after education, issues or corporations that are getting more funding than spending, the importance of education, the flaws in our educational system, and the impact of education in the future. If I could use some articles on these, it would be a lot of help. Thanks. —Preceding unsigned comment added by 166.214.163.67 (talk) 18:41, 16 March 2011 (UTC)

You could try our humanities reference desk. Please read the instructions, especially the one about homework. --Avenue (talk) 23:58, 16 March 2011 (UTC)

We're not allowed to use any information from wikipedia. —Preceding unsigned comment added by 32.179.25.142 (talk) 15:22, 17 March 2011 (UTC)

in re Restoring article (back) to GA status

I'd like to help improve the article so that it can be restored to GA status. I've made minor copyedits in the body text and lead paragraph so far, but I held back on a couple of ideas pending discussion here. I propose changing the lead sentence to the following, per discussion at (Archive 2) "Math and/or science?":

Statistics is the science of applied mathematics in data collection, organization, and interpretation[1]; the data is often numerical but may take other forms including relationships between entities.

Before we make this change though, perhaps the first paragraph of the Scope section should be rewritten for clarity and readability including the compromise discussed before. Also, I think the second sentence of the lead should be rewritten so it doesn't begin with "It" but I haven't come up with an alternative any better than "Statistics includes the planning..." -- By the way, one of the header tags above mentions "suggestions below for improving the article" (June 2006) but I don't see these in the archives. Anyone know where they are? -PrBeacon (talk) 19:12, 23 September 2010 (UTC)

Let's keep the lead brief. We have a citation that says statistics is a science (in its own right). There are lots of aspects of statistics that do not have a connection to "mathematics". If we need to have something like "the data is often numerical but may take other forms including relationships between entities", then that could well go under "scope", but there is certainly no need to be making fine distinctions in the lead. See WP:LEAD. Melcombe (talk) 09:10, 24 September 2010 (UTC)
The "suggestions below for improving the article" (June 2006) might be the discussion at the end of "Archive 3". Melcombe (talk) 11:27, 5 September 2011 (UTC)

Statistical Proof

Hi...I'm not a statistician (a biologist), but I recently came across statistical proof and most of what was written seemed nonsensical so I did a complete re-write of the page. I was hoping to find a statistically minded editor to swing by to read through my re-write and give some feedback. The page may be of some value to Wikilink in this main stats page. Please leave a msg on my talk page if you are interested. Thanks.Thompsma (talk) 00:01, 11 November 2011 (UTC)

Determining the relevance of statistical data based on how old it is

If I were to give a 200-year-old statistic, most people, probably including statisticians, wouldn't take me seriously. But if I were to give a one-day-old statistic, pretty much everyone would take me seriously (with respect to the statistic's age). I'm not arguing with this. I'm just really surprised this isn't discussed in the article at all, when, in the real world, it is a factor in determining the value of a statistic. A point that should be addressed in discussing this is the problem of setting an age at which a statistic is "too old" to be relevant. While pretty much anyone would agree with my first two points, and with validity, such a cutoff cannot validly be set: if a statistic is older than ('x' amount of time), it is irrelevant and invalid (in terms of its age); if a statistic is (same 'x' amount of time) old or newer, it is relevant and valid (in terms of its age). Another point that goes along with this is the fact that when a statistic is stated, it is presented in the present tense, as if it were happening at the exact moment the person states it, while technically, in reality, it can only be proven to have been happening when the experiments that determined the statistic were conducted. — Preceding unsigned comment added by 108.79.215.148 (talk) 13:38, 21 December 2011 (UTC)

Correlation and causation

Hi, this article implies that observational studies are "causational" and can give information about causation. However, since observational studies only correlate two or more variables and lack intervention, randomization or control groups, this is not possible. Observational studies cannot give evidence of causation and are only viable as a research method when viewing (hence "observational") different areas in order to find new hypotheses on which new research can be based. — Preceding unsigned comment added by 90.228.223.33 (talk) 12:20, 27 March 2012 (UTC)

I agree that the text implies that observational studies can establish causation. I'm not sure, but I think that the text is very old. I'd like to see the text rewritten, with reliable, verifiable sources, to clarify this issue. Unfortunately I don't have the expertise. Maybe you do, 90.228.223.33? Mgnbar (talk) 12:31, 27 March 2012 (UTC)

Hawthorne

Citation: Wickström, G.; Bendix, T. (2000). "Commentary". Scandinavian Journal of Work, Environment & Health. 26 (4): 363. 159.83.196.1 (talk) 20:22, 15 May 2012 (UTC)

Very unclear why this was added. The full title of the article appears to be: 'The "Hawthorne effect" - what did the original Hawthorne studies actually show?'. This might be relevant to some other article, but doesn't seem directly related to article content/intent. Melcombe (talk) 21:06, 15 May 2012 (UTC)
Offered in response to "citation needed". 159.83.196.1 (talk) 19:07, 17 May 2012 (UTC)

Mentioning statisticians in the lead

A statistician is someone who is particularly well versed in the ways of thinking necessary for the successful application of statistical analysis. Such people have often gained this experience through working in any of a wide number of fields. There is also a discipline called mathematical statistics that studies statistics mathematically.

This seems unnecessary. The lead summarizes the topic at hand, not the people who work in its field. I think it's more appropriate to relegate 'statisticians' to the See also section or - perhaps - have some other section, dedicated to describing what's required to work in the field, cover this.

Sowlos (talk) 14:46, 6 September 2012 (UTC)

how is statistics used — Preceding unsigned comment added by 72.27.91.178 (talk) 16:19, 4 December 2012 (UTC)

Restructuring the Statistics article

I noticed that a lot of text has been inserted in the Scope and Overview chapters of this article which doesn't really belong there. Besides, the main text of the Overview wasn't very clear either. On my home page, see Marcocapelle, I made an attempt to rewrite the Scope and Overview, and I moved everything that doesn't really belong in a Scope or Overview to other chapters (see chapters 3.1, 6 and 8 as numbered on my home page). Can you all check if this restructuring makes sense to you? Marcocapelle (talk) 09:17, 17 May 2014 (UTC)

You could help us understand your rewrite by listing the items that you've moved. For example, you seem to have moved misuse of statistics out of Overview. It's hard to say what belongs in Overview, as the entire article is arguably an overview of statistics. So is Overview just an overview of this article? Usually the intro section does that. Maybe we should all discuss what these sections mean. Mgnbar (talk) 10:58, 17 May 2014 (UTC)

Good point! From my perspective, an overview should give a reader insight into what Statistics really is, rather than move in all possible side directions from the start. An overview should not be used as a sort of table of contents, and should even less be used for single remarks that aren't elaborated at a later stage.

  • So I moved the more detailed part of sampling to a section in a new chapter 'Data collection'
  • I moved the part about misuse to an already existing chapter about misuse.
  • I moved the misinterpretation of correlation to a paragraph in the chapter of misuse.
  • I moved the paragraph of applied statistics versus theoretical statistics as a section of the new Trivia chapter.
  • I moved the paragraph of machine learning and data mining as another section of the new Trivia chapter.
  • I moved a paragraph into another section of the new Trivia chapter and named it Statistics in society.

In addition, for the chapters that I otherwise didn't touch, I made the titles a bit clearer (chapters 4, 5 and 7).

Just for your info, I don't think that the Statistics article as a whole is perfect after this restructuring, but at least the start of the article has improved a lot (I think).

Kind regards, Marcocapelle (talk) 11:18, 17 May 2014 (UTC)

Has anyone taken the effort to have a look or to think about the above? Marcocapelle (talk) 19:18, 24 May 2014 (UTC)
People don't seem too upset by your changes. So I suggest that you Be Bold and start making them. Mgnbar (talk) 13:04, 26 May 2014 (UTC)
All right then, thanks for the reaction! Marcocapelle (talk) 19:51, 26 May 2014 (UTC)

Fallacy of Transposed conditional

Is there any Wikipedia article which explains the fallacy of transposed conditional? Lbertolotti (talk) 16:45, 2 September 2014 (UTC)

Prosecutor's fallacy. Qwfp (talk) 17:36, 2 September 2014 (UTC)

Typing Errors

One of the diagrams on this page reads "ovservation", not "observations", which may need attention. 114.78.37.19 (talk) 12:41, 17 November 2014 (UTC)

Resolved

Lbertolotti (talk) 23:49, 1 October 2014 (UTC)

Modern developments in statistics?

I notice both in the statistics article and in the linking box at the bottom, we don't currently link to existing Wikipedia articles that reflect the statistics of difference-in-differences, regression discontinuity design and propensity score matching. Anyway, I'm not very tech-savvy so would be grateful if someone knew how to do this? — Preceding unsigned comment added by 93.162.74.34 (talk) 23:10, 28 December 2014 (UTC)

Big data? A representative example?

I'd like to ask if the inclusion of Big data, as recently introduced to the lead in this edit [3], is representative of the "active research" being done in the statistics community. Is there not, for example, also active research in, say, Bayesian analysis of small data sets? Or, for that matter, other areas of statistics with which I am not familiar? Also, is "big data" an area of statistical research per se, or is it more accurately described as an area where existing statistical methods are being applied in a newish area? So, I just wanted to ask. Isambard Kingdom (talk) 17:56, 19 June 2015 (UTC)

maths

Statics

Yakoobstk (talk) 04:18, 17 August 2016 (UTC)

Statistics vs Data Science

Is statistics a subfield of data science? See discussion in

  1. REDIRECT [[4]]
Statistics | Data Science
Data Analysis (Inference) | Data Mining
Data Organization | Data Management
Data Collection | Data Acquisition
Data Presentation (Exploratory Analysis) | Data Visualization

The first column is just the beginning of the text. Regarding the second column, see:

  1. REDIRECT [[5]]

I "dramatically" suggest merging the two or, at least, giving a meaningful distinction rather than "advances in computing with data". It is almost as if a science becomes something different just because you're using a tool.
I think the name for that is Computational Statistics. I agree that there are many methods used in data science that are not yet taught in many statistics degrees, but come on! Statistics is older and may maintain its name, but Data Science is more descriptive in my view. What do you think? Am I being too extreme? BrennoBarbosa (talk) 09:18, 23 January 2014 (UTC)

I think you should understand what data is and what statistics is. Data is a set of experienced facts intuitively describing the states or status of objectives in applied fields. This is why the term statistics was coined and introduced. Statistics, being a single word, is a simpler term than "data science". Yuanfangdelang (talk) 20:20, 1 September 2016 (UTC)

Proposed merge with Mathematical statistics - first proposal

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


The mathematical statistics article is a bit of a waif. There's a nice couple of sentences about the difference between descriptive and inferential statistics and also about the development of statistical theory. The data analysis section seems out of place there: if that were moved to an appropriate place in statistics, then there wouldn't be much left there. The original author of that article endorsed a redirect some years ago. Illia Connell (talk) 05:29, 25 April 2013 (UTC)

I agree that the current mathematical statistics article has little to offer. But, before we merge, let's ask ourselves one question: If Wikipedia were complete and perfect, would that article exist, separately from statistics? What would it cover? If the answer is "a lot", then maybe we should improve that article, instead of merging it. Mgnbar (talk) 12:58, 25 April 2013 (UTC)
In response to Mgnbar's question, I should say in a complete encyclopedia, Mathematical Statistics and Statistics should have different articles. Statistics includes many "qualitative" sub-fields that I think can be only covered in this article. (talk) 02:01, 30 April 2013 (UTC)
I do believe that mathematical statistics is a separate entry, possibly even a discipline, than statistics (applied), but it should then be part of "probability theory". Limit-theorem (talk) 21:00, 9 June 2013 (UTC)

Merge. Statistics is a branch of mathematics. Having 'mathematical' in the title is redundant. The common terms that describe the difference is 'applied' or 'theoretical'. Science.philosophy.arts (talk) 00:04, 20 September 2013 (UTC)

Statistics uses mathematics fairly heavily, as do physics and engineering. None of these is simply a "branch of mathematics". Applied statistics involves many non-mathematical aspects, and even theoretical statistics goes beyond simply mathematical issues (e.g. the philosophy of inference). --Avenue (talk) 10:29, 27 November 2013 (UTC)
The mathematical statistics article is poor. The current article should be expanded. As far as statistics being a branch of mathematics, it sounds as though you are not a mathematician.

Statistics is a science in my opinion, and it is no more a branch of mathematics than are physics, chemistry and economics; for if its methods fail the test of experience--not the test of logic--they are discarded. - John Tukey

160.36.8.226 (talk) 17:41, 22 November 2013 (UTC)
  • Support - "Mathematical statistics" is completely redundant with "statistics," as I have no clue what "non-mathematical" statistics would be; a statistic is, by definition, just a function on a sample. Seppi333 (talk) 01:38, 27 November 2013 (UTC)
That is so far off base it's not even funny. Do you really need someone to explain how statistics (the field) is not just about statistics (the plural of statistic)? --Avenue (talk) 10:29, 27 November 2013 (UTC)
Avenue, I didn't notice your reply until now. I'm not sure what motivated you to be an asshole and write a rude, asinine response like that. I was (IMO - quite clearly) comparing the "mathematical" vs "non-mathematical" treatment of the field with the mathematical definition of statistical functions by juxtaposing those clauses, not suggesting something like "the field is defined as the exhaustive set of statistical functions" or whatever absurd proposition you're suggesting I'm asserting.
I'm going to copyedit scope within the next few weeks to expand it and fix the abhorrent lack of citations; I'll probably fix the WP:UNDUE problem while I'm at it. Seppi333 (Insert) 12:21, 23 January 2014 (UTC)

Could someone who opposes the merge explain how these topics are different? Seppi333 (talk) 13:05, 27 November 2013 (UTC)

  • Oppose. Statistics theory [6] may be merged with mathematical statistics. Statistics definitely no. Mathematical statistics, based on mathematical models of uncertainty (nowadays basically probability), concerns the study of principles of inference/learning in statistical models (selecting best models, model evaluation, and so on). It is basically a sub field of statistics that deals with decision under uncertainty in a mathematical framework (using mathematical structures). There are other subfields alike, in which there is no clear use of uncertainty though. What is known today as unsupervised learning is the basic example (Clustering, Topological Data Analysis, Association Rule Learning, etc.). It is a field by its means.

Statistics is larger: there is acquisition of data (methodology and ethics in survey), organization of data (database), presentation of data (visualization), etc. Some may argue that such things may be modelled on a mathematical basis. In fact they can, relational algebra/calculus in relational databases is an example, but this is not what is proposed by the mathematical area (a little in the methodology of survey sampling, yes). BrennoBarbosa (talk) 09:56, 23 January 2014 (UTC)

That statistics may be discussed without referring explicitly to math I understand, but if Clustering and Topological Data Analysis can't be considered math then what are they? Computer science? Also, how exactly are methods discarded "if its methods fail the test of experience--not the test of logic"? Linear regression may work well on one dataset but not on another, so how does that stand as a "test of experience"? As far as I know, linear regression may be a poor model when its assumptions are violated, but then again that is the case with any mathematical model, since it's the assumptions that logically guarantee the validity of any theorem. Lbertolotti (talk) 21:32, 28 January 2014 (UTC)

  • What is (or what will be) the difference between probability theory and Mathematical statistics? Mathematical statistics sounds like other similar applied statistics subjects such as actuarial statistics, biostatistics, social statistics, etc., while it is of course Theoretical statistics (unless someone uses statistical methods to model mathematicians). In any case, I support moving the current duplicate (and slightly irrelevant) content on Mathematical statistics into Statistics and Probability theory, and waiting for someone to add content to Mathematical statistics or redirecting it to probability theory. We don't want too much duplicate information. Sda030 (talk) 00:21, 26 February 2014 (UTC)
  • Oppose - Keep them as separate articles. (talk) 16:28, 13 May 2014 (UTC)

Earlier on, Seppi333 asked a fair question: is there anything in statistics that can be considered as "non-mathematical statistics"? The article about mathematical statistics seems to suggest that the non-mathematical part of statistics consists of organizing and planning (of data, of experiments). Though I doubt if many statisticians consider organizing and planning to be part of statistics at all. Therefore support. Marcocapelle (talk) 06:01, 14 May 2014 (UTC)

Wrapping up all opposing arguments:

  • Statistics includes many "qualitative" sub-fields that I think can be only covered in this article.
    • Like what?
  • I do believe that mathematical statistics is a separate entry, possibly even a discipline, than statistics (applied), but it should then be part of "probability theory".
    • Fair enough to make a distinction between application of statistics (as an activity) and statistical theory (as a piece of knowledge). However, everything about statistics on Wikipedia is about statistical theory anyway.
  • Statistics (the field) is not just about statistics (the plural of statistic).
    • I’ve no idea what the author means with regards to whether or not to merge the two articles.
  • Expand mathematical statistics, don't merge them. I agree there's not much there at present, but it's a big subject in its own right, and certainly worthy of a separate article.
    • Its own right, then how?
  • Mathematical statistics (…) is basically a sub field of statistics that deals with decision under uncertainty in a mathematical framework (using mathematical structures). There are other subfields alike, in which there is no clear use of uncertainty though.
    • Fair enough, though that is a distinction between descriptive and inferential statistics which is already explained in the Statistics article.
  • Statistics is larger: there is acquisition of data (methodology and ethics in survey), organization of data (database), presentation of data (visualization) etc...
    • Fair enough, it would be perfect if someone would write articles about all these fields (and some of those articles already exist). Meanwhile we have two articles that are both about statistics in a 'smaller' sense, why shouldn’t we merge them?

Bottom line, I haven't seen any convincing arguments to keep two articles and so the best thing is to merge (or delete what's now on Mathematical Statistics). Marcocapelle (talk) 19:06, 24 May 2014 (UTC)


  • note I have undone the inappropriate close of this discussion. I count four opposes and only two supports, which is either a consensus not to merge or no consensus to do so. Therefore a closure to merge was inappropriate. Further a contentious discussion should only be closed by an uninvolved editor or administrator. An editor who has already participated and !voted on one side or the other should not take it on themselves to do so. Especially not by ignoring the !votes and using their own reasoning to close the discussion.--JohnBlackburnewordsdeeds 20:17, 26 May 2014 (UTC)
@ WP:POV FORK. I don't need consensus to remove that. Feel free to remake a CORRECT page with CITATIONS to that content. Not a page about mathematical statistics with 7 citations that said
"Mathematical statistics is XYZ." (no citation)
"Bob, Greg, Bill, and Rod used XYZ which was the fad in the 1970s." (7 citations)

If you restore this again, we're going to the NPOV noticeboard AND I'm STRICTLY holding you to WP:3RR. Just test me. Seppi333 (Insert  | Maintained) 03:09, 27 May 2014 (UTC)

That article shouldn't even exist until it's large enough to merit its own page, per WP:POV FORK that the page was without bothering to look at it. Seppi333 (Insert  | Maintained) 15:25, 28 May 2014 (UTC)
That's two editors now that have undone your blanking of the article. Stop trying to preempt the outcome of this discussion, wait for it to conclude.--JohnBlackburnewordsdeeds 18:11, 28 May 2014 (UTC)
Per my transclusion, this discussion is now moot. Seppi333 (Insert  | Maintained) 21:32, 28 May 2014 (UTC)
That is not even a reason, and the state you left it in was a complete mess. A normal editor trying to edit that would be presented with incomprehensible (to normal editors) parser code that doesn't belong in an article. Someone using the visual editor would just see a template, not editable text. Two editors have restored it now, myself and Andrew Davidson, so there is a limited consensus in favour of that version. Stop repeatedly removing content against consensus.--JohnBlackburnewordsdeeds 21:50, 28 May 2014 (UTC)
Guess we're going to the noticeboard when I return home. I'd suggest you restore my citations before I do.Seppi333 (Insert  | Maintained) 22:00, 28 May 2014 (UTC)
@JohnBlackburne: I've decided to offer this compromise instead of going straight to the noticeboard, as I care more about addressing the POV fork than the irrelevant content on math stat: if you're ok with both pages as they currently are, I'll concede the coatrack issue on the other page. Seppi333 (Insert  | Maintained) 00:21, 29 May 2014 (UTC)
My problem with it now is the parser code. I've never seen anything like that in an article, so it's not covered by any guideline, but it renders the page uneditable by the majority of editors. Those editing source see the parser code, which only a small minority of editors understand. Other editors will either stay away from editing or make an attempt but easily break it not knowing how it works. The Visual Editor is even worse: it simply isn't editable. So much for the encyclopaedia that anyone can edit. It's worse here; you can't actually edit the text with either editor. It provides an edit link but a very non-standard one which looks like an external link. Just copying the text would be normal and easily understood: there's no storage limit or other reason for transcluding it.--JohnBlackburnewordsdeeds 00:39, 29 May 2014 (UTC)
WP:SELECTIVETRANSCLUSION is the page on template- or article-to-article transcluding. It's been done extensively on Adderall. Seppi333 (Insert  | Maintained) 01:08, 29 May 2014 (UTC)
I just realized the template is unnecessary, so removing it would solve the VE problem. Only the onlyinclude tags are needed for 1 section. Seppi333 (Insert  | Maintained) 01:11, 29 May 2014 (UTC)
It's an improvement but still looks like parser code in source code view, and you end up with the template editor for the first two words in VE. You and I can look at it and immediately recognise what it's doing but most editors will have a much harder time editing it. As for WP:SELECTIVETRANSCLUSION, it does not recommend this as a way to build articles: the examples it gives of transclusion are far simpler and more commonplace.--JohnBlackburnewordsdeeds 01:37, 29 May 2014 (UTC)
It mentions article-article transclusion in WP:SELTRANS#Target document markup. I'll simplify the source code more. Seppi333 (Insert  | Maintained) 02:07, 29 May 2014 (UTC)
It doesn't give that as an example how to use transclusion though. The three examples are #Repetition within a page. Anyway, guidelines only describe common practice, help pages are mostly howtos; neither is meant to be prescriptive. But if markup makes a page uneditable by a large portion of editors then it shouldn't be used if simpler markup would achieve the same. In this case (and at Adderall / Amphetamine) content should just be copied. There's no reason it has to be the same in both articles, so no need to use complex markup to make it so.--JohnBlackburnewordsdeeds 02:43, 29 May 2014 (UTC)
The relevant guideline is MOS:MARKUP: "Keep markup simple / The simplest markup is often the easiest to edit, the most comprehensible, and the most predictable."--JohnBlackburnewordsdeeds 02:50, 29 May 2014 (UTC)
@JohnBlackburne: I don't really care how the page is unforked as long as it's not a fork. If you want to copy/paste the lead into this article, I'm fine with that. Transclusion is just much simpler to maintain. Are we in agreement to just copy the lead then? Seppi333 (Insert  | Maintained) 03:02, 29 May 2014 (UTC)
I went ahead and converted it to plain text - I'm assuming you're in agreement since your comments only pertained to the transclusion as opposed to the text. Let me know if otherwise. Seppi333 (Insert  | Maintained) 04:33, 29 May 2014 (UTC)

I've asked at WP:ANRFC whether this can be reviewed and closed, as it has I think gone on long enough.--JohnBlackburnewordsdeeds 22:07, 28 May 2014 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

My opinion on this topic is that statistics is a methodology for gaining knowledge by collecting experienced facts to form conclusions for the objective fields in which it is applied and developed. It is totally different from mathematics in its purposes, modes of thinking and practices. So-called mathematical statistics may be just an attempt to build a branch of mathematics in the same way that other branches of mathematics are built. So mathematical statistics is not statistics. They differ in purposes, modes of thinking and practices. The relationship between statistics and mathematical statistics, or between statistics and mathematics, is not a "theoretical-applied" relationship, since many statistical methods, as well as the formulas in those methods, are not deduced from or originated in a mathematical theory. They usually originate from simple and novel ideas for dealing with the questions and the characteristics of data collected from practical fields. But mathematical statistics is just a pile of mathematical logic with definitions, properties, proofs, theorems, etc. Only very few people need this purely mathematical material in routine statistical practice. Yuanfangdelang (talk) 20:11, 1 September 2016 (UTC)

If you look at the head of this section you'll see that the discussion has been closed. In principle the same topic could be raised again, but most people are reluctant to do so. If I may make a suggestion, try to find something in an article that you think can be improved; find some reputable sources to back up your improvement, and make the change. If you are spending all your energies on WP:Talk pages then you are wasting your time.
Gravuritas (talk) 23:16, 1 September 2016 (UTC)

New lead section

I've rewritten the lead, as requested; see if you people like it:

Statistics is the study of the collection, analysis, interpretation, presentation and organization of

experiments.[1]
In case
experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.

When analyzing

statistical power of a test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false. Multiple problems have come to be associated with this framework: ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis.

Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (

confounding variable
.

Statistics can be said to have begun in ancient civilization, going back at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily from calculus and probability theory. Statistics continues to be an area of active research, for example on the problem of how to analyze Big data.

— Preceding unsigned comment added by Lbertolotti (talkcontribs) 1 October 2014 (UTC)

References

  1. ^
  2. ^ Lund Research Ltd. "Descriptive and Inferential Statistics". statistics.laerd.com. Retrieved 2014-03-23.
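
The definition of statistical power quoted in the proposed lead (the probability that a test rejects the null hypothesis when the null hypothesis is false) can be checked with a small simulation. This is only a sketch under assumptions that are not in the proposal: a two-sided z-test of H0: mean = 0 with known standard deviation 1, sample size 30, significance level 0.05, and a hypothetical true mean of 0.5. The function name rejection_rate and all parameter values are illustrative.

```python
import random
import statistics


def rejection_rate(true_mean, n=30, alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated samples in which a two-sided z-test rejects H0: mean = 0.

    Assumes normally distributed data with known standard deviation 1.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = statistics.fmean(sample) * n ** 0.5  # test statistic, since sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is false here, so the rejection rate estimates the power of the test.
power = rejection_rate(true_mean=0.5)
# H0 is true here, so the rejection rate estimates the type I error rate.
type_i_rate = rejection_rate(true_mean=0.0)
print(f"estimated power:        {power:.2f}")
print(f"estimated type I error: {type_i_rate:.2f}")
```

Under these assumptions the analytic power is roughly 0.78, and the rejection rate when the null hypothesis is true should sit near the chosen significance level of 0.05. Raising n in the sketch raises the estimated power, which is the sample-size issue the proposed lead mentions.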

Approve I much prefer this version. The first two paragraphs are really good. I do think the other sections should be in separate sections. Mcshuffles (talk) 10:09, 5 April 2017 (UTC)

Note: The original poster already made the proposed changes back in October 2014. This section should probably be archived, since the lead section has changed in several ways since then. - dcljr (talk) 03:12, 23 April 2017 (UTC)