Talk:Biomedical text mining

Source: Wikipedia, the free encyclopedia.

Untitled

Greetings, fellow Wikipedians! I noticed that this stub didn't have a discussion page, even though several people have contributed to the article. I'd love to get to know whoever else is interested in the subject. Even though the references so far are all centered around Hoffman/Valencia et al., I'm surprised nobody has brought up iHOP yet, so I added it as an example in a new section. Since I am currently writing a thesis on biomedical text mining, I expect to be able to give a much more complete view of the subject and eventually lift the stub status of this article. My edits so far have been only a warm-up. Ste1n 19:05, 17 April 2006 (UTC)

Examples, please?

Should this article not have one or two specific examples where text mining advanced research, helped with drug discovery, or established the etiology of a disease? Are there any such examples in medicine or health? I doubt it (I have looked, e.g., on PubMed). The use of word clouds has been questioned; text mining produces nice stats and graphs, but does it tell us anything new? BTW, I am not talking about plagiarism detection... Sleuth21 (talk) 07:30, 30 May 2011 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Biomedical text mining. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018.

regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

Cheers. InternetArchiveBot (Report bug) 23:40, 2 November 2016 (UTC)

Addition of references for BioNLP Shared Tasks

Information to be added or removed: In the section "Availability of annotated text data", I would like to add a mention of the BioNLP shared tasks following the mention of the Informatics for Integrating Biology and the Bedside (i2b2) challenges. The first paragraph of this section would then be the following:


Large annotated corpora used in the development and training of general purpose text mining methods (e.g., sets of movie dialogue,[1] product reviews,[2] or Wikipedia article text) are not specific for biomedical language. While they may provide evidence of general text properties such as parts of speech, they rarely contain concepts of interest to biologists or clinicians. Development of new methods to identify features specific to biomedical documents therefore requires assembly of specialized corpora.[3] Resources designed to aid in building new biomedical text mining methods have been developed through the Informatics for Integrating Biology and the Bedside (i2b2) challenges[4][5][6], BioNLP shared tasks [7][8][9][10][11][12][13][14] and biomedical informatics researchers.[15][16] Text mining researchers frequently combine these corpora with the controlled vocabularies and ontologies available through the National Library of Medicine's Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH).


Explanation of issue: The BioNLP shared tasks (and the corpora created as part of them) represent important community efforts and resources for the biomedical text mining community. The tasks and resources were created by various members of the community, including my own group. I tried to add this directly, but it was removed as an "Apparent COI cite". However, this represents not only the work of my group but also the work of others. Apologies if I have done something incorrectly; I do not have a great deal of experience in editing Wikipedia pages.

References supporting change: Supporting references included in the changes shown above


Daisylagata (talk) 14:38, 27 August 2019 (UTC)

Reply 27-AUG-2019

  Specification requested  

  1. Of the provided sources, 50% of them contain |page= parameters covering 4 or more cited pages of text. It is highly unlikely that the information contained in five sentences results from all 96 pages of this cited text. Thus, the request should specify which particular page the information is contained upon in sources containing multiple cited |pages=.
  2. The grouping of eight separate references to source only three words suggests WP:TOOMANYREFS.
  3. The COI editor is invited to redraft their proposal incorporating exact page numbers, and is asked to make use of only the minimum references needed.

Regards, Spintendo 15:16, 27 August 2019 (UTC)

Reply 28-AUG-2019

I have reduced the number of references to four. There have been four BioNLP shared tasks, held in different years. There is now a link to an overview paper, or to the conference proceedings, for each of these tasks. I hope that this is more acceptable. Please see below. Daisylagata (talk) 10:35, 28 August 2019 (UTC)

Large annotated corpora used in the development and training of general purpose text mining methods (e.g., sets of movie dialogue,[1] product reviews,[2] or Wikipedia article text) are not specific for biomedical language. While they may provide evidence of general text properties such as parts of speech, they rarely contain concepts of interest to biologists or clinicians. Development of new methods to identify features specific to biomedical documents therefore requires assembly of specialized corpora.[3] Resources designed to aid in building new biomedical text mining methods have been developed through the Informatics for Integrating Biology and the Bedside (i2b2) challenges[4][5][6], BioNLP shared tasks [7][8][9][10] and biomedical informatics researchers.[11][12] Text mining researchers frequently combine these corpora with the controlled vocabularies and ontologies available through the National Library of Medicine's Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH).

  1. ISBN 978-1-932432-95-4.
  2. .
  3. .
  4. .
  5. .
  6. .
  7. .
  8. Kim, JD; Pyysalo, S; Nédellec, C; Ananiadou, S; Tsujii, J, eds. (2012). "Selected articles from the BioNLP Shared Task 2011". BMC Bioinformatics. 13.
  9. Nédellec, C; Bossy, R; Kim, JD; Kim, JJ; Ohta, T; Pyysalo, S; Zweigenbaum, P, eds. (2012). Proceedings of the BioNLP Shared Task 2013 Workshop.
  10. Nédellec, C; Bossy, R; Kim, JD, eds. (2016). Proceedings of the 4th BioNLP Shared Task Workshop.
  11. PMID 23355458.
  12. PMID 22776079.