Anyone who works with geneticists and biomedical researchers already knows that the language of their science is daunting for a non-scientist to learn. This international community has developed dozens of highly specific databases, data-mining software and cooperative, collective digital libraries for its own use. In an approximate sense, one could even imagine the mapping of the human genome as one vast wiki. Clinical care follows the translational research of these investigators.
This month, Nucleic Acids Research published Volume 37 (July 1, 2009), Supplement 2: the Web Server issue, described by Oxford University Press as:
“…the seventh in a series of annual special issues dedicated to web-based software resources for analysis and visualization of molecular biology data. The present issue reports on 112 web servers with a special emphasis on metagenomics, molecular network and pathway analysis, and biological text mining.”
The full text of NAR Supplement 2 is available open access on PubMedCentral for anyone in the world to read.
An article in that special issue attracted my interest, entitled “MedlineRanker: flexible ranking of online literature” and written by a group of computational scientists affiliated with the Computational Biology & Data Mining Group of the Max Delbruck Center for Molecular Medicine (MDC) in Berlin.
The six authors describe their project in this way:
“We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program deduces automatically the most discriminative words in comparison to a random selection. These words are used to score other abstracts, including those from not yet annotated recent publications, which can be then ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time.”
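For readers who think in code, the idea in that quotation can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the MedlineRanker authors' implementation: the function names, the smoothed log-ratio weighting and the toy abstracts are all hypothetical, chosen to show how words that are more common in topic abstracts than in a background set could be used to score and rank new abstracts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its set of distinct alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_weights(topic_abstracts, background_abstracts):
    """Weight each word by how much more often it occurs in topic abstracts.

    Assumption: a smoothed log ratio of document frequencies stands in for
    whatever statistic the real server uses.
    """
    topic_df = Counter(w for a in topic_abstracts for w in tokenize(a))
    back_df = Counter(w for a in background_abstracts for w in tokenize(a))
    n_topic, n_back = len(topic_abstracts), len(background_abstracts)
    weights = {}
    for word in set(topic_df) | set(back_df):
        # Add-one smoothing keeps unseen words from producing log(0).
        p_topic = (topic_df[word] + 1) / (n_topic + 2)
        p_back = (back_df[word] + 1) / (n_back + 2)
        weights[word] = math.log(p_topic / p_back)
    return weights


def rank(candidates, weights):
    """Score each candidate by summing its word weights; best score first."""
    scored = [
        (sum(weights.get(w, 0.0) for w in tokenize(a)), a) for a in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy data, invented purely for illustration.
topic = ["Gene expression in tumor cells", "Tumor suppressor gene mutations"]
background = ["Soil moisture and crop yield", "Crop rotation improves soil health"]
candidates = ["Measuring soil erosion", "A new tumor gene marker"]

weights = word_weights(topic, background)
ranked = rank(candidates, weights)
for score, abstract in ranked:
    print(f"{score:6.2f}  {abstract}")
```

In this toy run the abstract that shares words with the topic set is ranked above the one that shares words with the background set. The real server works on millions of Medline abstracts and will differ in its details.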
Please view the four Supplementary Data files (note: these open as either Word or Excel documents), which describe the search terms used to search PubMed with the MedlineRanker server.
The illustrations in the article look like a cross between a tag cloud and a Wordle picture.
MedlineRanker is free for use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
A list of current research projects from MDC can be viewed at this link.
In January 2009, Supplement 1 – Database Server Issue was published in Nucleic Acids Research, Vol. 37, and it is also available online in the PubMedCentral archive.
The 112 sites listed in the July 2009 NAR supplement will be added to the 1,200 already listed in the Bioinformatics Links Directory, which “... now expands to almost 1400 unique web servers, databases and resources for computational research in the life sciences. All links are freely accessible to the public, and may be browsed by biological category and research task subcategory.”
For more information on text-mining programs written by scientists from around the world, go to the Bioinformatics Links Directory-Literature: Text Mining page.