Talk:Biomedical text mining
Molecular Biology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the importance scale.
This article is supported by the Computational Biology task force (assessed as High-importance).
Untitled
Greetings, fellow Wikipedians! I noticed that this stub didn't have a discussion page, even though several people have contributed to the article. I'd love to get to know whoever else is interested in the subject. Even though the references so far are all centered around Hoffman/Valencia et al., I'm surprised nobody has brought up iHOP yet, so I added it as an example in a new section. Since I am currently writing a thesis on the subject of biomedical text mining, I expect to be able to give a much more complete view of the subject, and eventually lift the stub status of this article. My edits so far have only been a warm-up. Ste1n 19:05, 17 April 2006 (UTC)
Examples, please?
Should this article not have one or two specific examples where text mining advanced research, helped with drug discovery, or established the etiology of a disease? Are there any such examples in medicine / health? I doubt it (I have looked, e.g., on PubMed). The use of word clouds has been questioned; text mining produces nice stats and graphs, but does it tell us anything new? BTW, I am not talking about plagiarism detection... Sleuth21 (talk) 07:30, 30 May 2011 (UTC)
External links modified
Hello fellow Wikipedians,
I have just modified one external link on Biomedical text mining. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20060901073846/https://lists.ccs.neu.edu/pipermail/bionlp/ to https://lists.ccs.neu.edu/pipermail/bionlp/
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{source check}}).
This message was posted before February 2018.
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 23:40, 2 November 2016 (UTC)
This edit request by an editor with a conflict of interest was declined. The request was not specific enough. Please see the reply section below for additional information about this request.
Information to be added or removed: In the section "Availability of annotated text data", I would like to add mention of the BioNLP shared tasks following the mention of the Informatics for Integrating Biology and the Bedside (i2b2) challenges. The first paragraph of this section would then be the following:
Large annotated corpora used in the development and training of general-purpose text mining methods (e.g., sets of movie dialogue,[1] product reviews,[2] or Wikipedia article text) are not specific to biomedical language. While they may provide evidence of general text properties such as parts of speech, they rarely contain concepts of interest to biologists or clinicians. Development of new methods to identify features specific to biomedical documents therefore requires assembly of specialized corpora.[3] Resources designed to aid in building new biomedical text mining methods have been developed through the Informatics for Integrating Biology and the Bedside (i2b2) challenges,[4][5][6] BioNLP shared tasks,[7][8][9][10][11][12][13][14] and biomedical informatics researchers.[15][16] Text mining researchers frequently combine these corpora with the controlled vocabularies and ontologies available through the National Library of Medicine's Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH).
Explanation of issue: The BioNLP shared tasks (and the corpora created as part of them) represent important community efforts and resources for the biomedical text mining community. The tasks and resources were created by various members of the community, including my own group. I tried to add this directly, but it was removed as an "Apparent COI cite". However, this represents not only the work of my group, but also the work of others. Apologies if I have done something incorrectly; I do not have a great deal of experience in editing Wikipedia pages.
References supporting change: Supporting references are included in the changes shown above.
Daisylagata (talk) 14:38, 27 August 2019 (UTC)
Reply 27-AUG-2019
- Of the provided sources, 50% of them contain |page= parameters covering 4 or more cited pages of text. It is highly unlikely that the information contained in five sentences results from all 96 pages of this cited text. Thus, the request should specify which particular page the information is contained upon in sources containing multiple cited |pages=.
- The grouping of eight separate references to source only three words suggests WP:TOOMANYREFS.
- The COI editor is invited to redraft their proposal incorporating exact page numbers, and is asked to make use of only the minimum references needed.
Regards, Spintendo 15:16, 27 August 2019 (UTC)
Reply 28-AUG-2019
I have reduced the number of references to four. There have been four BioNLP shared tasks, held in different years. There is now a link to an overview paper, or to the conference proceedings, for each of these tasks. I hope that this is more acceptable. Please see below. Daisylagata (talk) 10:35, 28 August 2019 (UTC)
Large annotated corpora used in the development and training of general-purpose text mining methods (e.g., sets of movie dialogue,[1] product reviews,[2] or Wikipedia article text) are not specific to biomedical language. While they may provide evidence of general text properties such as parts of speech, they rarely contain concepts of interest to biologists or clinicians. Development of new methods to identify features specific to biomedical documents therefore requires assembly of specialized corpora.[3] Resources designed to aid in building new biomedical text mining methods have been developed through the Informatics for Integrating Biology and the Bedside (i2b2) challenges,[4][5][6] BioNLP shared tasks,[7][8][9][10] and biomedical informatics researchers.[11][12] Text mining researchers frequently combine these corpora with the controlled vocabularies and ontologies available through the National Library of Medicine's Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH).
- )
- ISBN 978-1-4503-2409-0.
- PMID 23935077.
- PMID 21685143.
- PMID 23564629.
- PMID 26225918.
- .
- Kim, JD; Pyysalo, S; Nédellec, C; Ananiadou, S; Tsujii, J, eds. (2012). "Selected articles from the BioNLP Shared Task 2011". BMC Bioinformatics. 13.
- Nédellec, C; Bossy, R; Kim, JD; Kim, JJ; Ohta, T; Pyysalo, S; Zweigenbaum, P, eds. (2012). Proceedings of the BioNLP Shared Task 2013 Workshop.
- Nédellec, C; Bossy, R; Kim, JD, eds. (2016). Proceedings of the 4th BioNLP Shared Task Workshop.
- PMID 23355458.
- PMID 22776079.