EBM and Clinical Support Librarians@UCHC

A blog for medical students, faculty and librarians about their use of evidence based medicine, clinical literature, Web 2.0, sources and search strategies

Tag Archives: Text-Mining

News, Scientific Literature, Bioinformatics, Search Technologies: MedlineRanker

Anyone who works with geneticists and biomedical researchers already knows that learning the language of their science is daunting for a non-scientist. This international community has developed dozens of highly specific databases, data-mining software and cooperative, collective digital libraries for its own use. In an approximate sense, one could even imagine the mapping of the human genome as one vast wiki. Clinical care follows the translational research of these investigators.

This month, Nucleic Acids Research published Supplement 2, the Web Server issue, of Volume 37 (July 1, 2009). Oxford University Press describes it as:

“…the seventh in a series of annual special issues dedicated to web-based software resources for analysis and visualization of molecular biology data. The present issue reports on 112 web servers with a special emphasis on metagenomics, molecular network and pathway analysis, and biological text mining”.

The full text of NAR Supplement 2 is available open-access on PubMed Central for anyone in the world to read.


One article in that special issue caught my interest: “MedlineRanker: flexible ranking of biomedical literature”, written by a group of computational scientists affiliated with the Computational Biology & Data Mining Group of the Max Delbrück Center for Molecular Medicine (MDC) in Berlin.

The six authors describe their project in this way:

“We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program deduces automatically the most discriminative words in comparison to a random selection. These words are used to score other abstracts, including those from not yet annotated recent publications, which can be then ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time”.

Source: Link from Nucleic Acids Research – Vol. 37, Suppl. 2: W141-W146
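
For readers who like to see an idea in code, here is a minimal Python sketch of the general approach the authors describe: find words that occur more often in topic abstracts than in a background set, then score new abstracts with those words. It is not the MedlineRanker implementation. The function names, the add-one smoothing and the log-ratio weighting are my own assumptions, and the sample "abstracts" are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_weights(topic_abstracts, background_abstracts):
    """Weight each word by the log ratio of its smoothed document frequency
    in the topic set versus the background set (an assumed weighting)."""
    topic_df = Counter(w for a in topic_abstracts for w in tokenize(a))
    back_df = Counter(w for a in background_abstracts for w in tokenize(a))
    n_topic, n_back = len(topic_abstracts), len(background_abstracts)
    weights = {}
    for word in set(topic_df) | set(back_df):
        # Add-one smoothing keeps the ratio finite for words seen in one set only.
        p_topic = (topic_df[word] + 1) / (n_topic + 2)
        p_back = (back_df[word] + 1) / (n_back + 2)
        weights[word] = math.log(p_topic / p_back)
    return weights


def rank(candidates, weights):
    """Score each candidate by summing the weights of its words and
    return (score, abstract) pairs, highest score first."""
    scored = [(sum(weights.get(w, 0.0) for w in tokenize(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up miniature "abstracts", for illustration only.
topic = ["Text mining of gene names in abstracts", "Gene ranking with text mining"]
background = ["Surgical outcomes in hip replacement", "Dietary fiber and heart disease"]
weights = word_weights(topic, background)
ranked = rank(["A heart surgery trial", "Mining gene literature"], weights)
```

In this toy run, "Mining gene literature" comes out ahead of "A heart surgery trial", because "mining" and "gene" appear only in the topic set. A real system would work from thousands of abstracts and would need to handle common words and word variants.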

Please view the four Supplementary Data files (note: these open as either Word or Excel documents), which describe the terms used to search PubMed with the MedlineRanker server.

The illustrations in the article look like a cross between a tag cloud and a Wordle picture.

MedlineRanker is free for use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.

A list of current research projects from MDC can be viewed at this link.


In January 2009, Supplement 1 – Database Server Issue was published in Nucleic Acids Research, Vol. 37. It is also available online in the PubMed Central archive.


The 112 sites listed in the July 2009 NAR supplement will be added to the 1,200 already listed in the Bioinformatics Links Directory, which “... now expands to almost 1400 unique web servers, databases and resources for computational research in the life sciences. All links are freely accessible to the public, and may be browsed by biological category and research task subcategory”.

For more information on text-mining programs written by scientists from around the world, go to the Bioinformatics Links Directory-Literature: Text Mining page.


Medical News: Lit Inspector

Becoming a blogger requires subscriptions to a variety of alerting services. EurekAlert is a handy free service that sends me a daily recap of news and press releases on medicine and health care from worldwide sources. That is how I learned about the release of LitInspector today, a text-mining software program designed for use by geneticists and clinical researchers. The company's press release, dated Nov 28 2007, is here.

The German bioinformatics company responsible for this product, Genomatix Software GmbH of Munich, describes LitInspector as:

“… a powerful literature and pathway mining system based on all published abstracts and related meta-information (like Medical Subject Heading list or MeSH) from the entire PubMed [database] of the National Library of Medicine. With more than 260,000 gene synonyms, LitInspector uses one of the largest gene synonym tables available, thus securing a substantial amount of relevant information for scientists not knowing all synonyms of their gene of interest. Additional free text input and boolean operators allow for focussed literature searches”.

“MeSH term and keyword classifiers like “disease”, “tissue”, or “pathway” lead to comprehensive results within seconds. More than 1.6 million gene-gene relations are pre-analyzed. Pathway associations and links to KEGG, STKE and BioCarta are provided”.

“At each step, LitInspector optionally branches out into Genomatix' integrated suite of systems biology databases and analysis technology, such as the ElDorado genomic annotation database”.
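
To make the synonym idea in the press release concrete, here is a small Python sketch of how a gene synonym table plus free-text terms might be turned into a boolean search string. This is only an illustration under my own assumptions: the `GENE_SYNONYMS` table, the `build_query` function and the query layout are hypothetical and say nothing about how LitInspector works internally.

```python
# Hypothetical, hand-made synonym table for illustration only;
# the press release describes a table of more than 260,000 synonyms.
GENE_SYNONYMS = {
    "TP53": ["TP53", "p53", "LFS1"],
}


def build_query(gene, extra_terms=(), synonyms=GENE_SYNONYMS):
    """Return a boolean query of the form (syn1 OR syn2 ...) AND term1 AND term2."""
    # Fall back to the gene name itself when no synonyms are known.
    names = synonyms.get(gene, [gene])
    gene_clause = "(" + " OR ".join(f'"{name}"' for name in names) + ")"
    clauses = [gene_clause] + [f'"{term}"' for term in extra_terms]
    return " AND ".join(clauses)


query = build_query("TP53", ["breast cancer"])
print(query)  # ("TP53" OR "p53" OR "LFS1") AND "breast cancer"
```

The point of the expansion is that a searcher who only knows one name for a gene still retrieves papers that use another.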

Anyone associated with an academic institution can register with Genomatix to obtain a free trial evaluation account by going to www.genomatix.de. The company opened a branch in Ann Arbor, Michigan this summer… read a news item on that company here: Genomatix Software Inc. (on the Mlive.com website).