VizMaps: A Bayesian Topic Modeling Based PubMed Search Interface
A common challenge for users of academic databases is making sense of query results for knowledge discovery, a challenge exacerbated by the size and growth of modern databases. PubMed, a central index of biomedical literature, contains over 25 million citations, and a single search can return hundreds of thousands of them. Under these conditions, efficient knowledge discovery requires a different data structure than a chronological list of articles. It requires a way of conveying what the important ideas are, where they are located, and how they are connected: a way for users to see the underlying topical structure of their search results. This paper presents VizMaps, a PubMed search interface that addresses some of these problems. Given search terms, our main backend pipeline extracts relevant words from each citation's title and abstract and clusters them into discovered topics using Bayesian topic models, in particular Latent Dirichlet Allocation (LDA). It then outputs a visual, navigable map of the query results.
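To make the word-extraction and topic-clustering steps concrete, the following is a minimal sketch and not the VizMaps implementation. It assumes a tiny stopword list, a regex tokenizer, and a collapsed Gibbs sampler for LDA inference; the inference method, the function names (tokenize, fit_lda, top_words), the hyperparameter values, and the toy citations are all illustrative assumptions that do not come from the paper.

```python
import random
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
             "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords and short words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS and len(w) > 2]


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts taken from the final sample.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=5):
    """Return up to n highest-count words for each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0]
            for counts in topic_word]


if __name__ == "__main__":
    # Made-up (title, abstract) pairs standing in for PubMed search results.
    citations = [
        ("Insulin signaling in diabetes", "Insulin regulates glucose uptake."),
        ("Glucose metabolism", "Diabetes alters glucose and insulin levels."),
        ("Tumor suppressor genes", "Mutation of genes drives tumor growth."),
        ("Cancer genomics", "Tumor mutation profiles differ across genes."),
    ]
    docs = [tokenize(title + " " + abstract) for title, abstract in citations]
    _, topic_word = fit_lda(docs, num_topics=2)
    for k, words in enumerate(top_words(topic_word)):
        print(f"topic {k}: {', '.join(words)}")
```

The per-topic word lists are the kind of output a visual map could be built on; how VizMaps lays out and links topics is not covered by this sketch. Sampling is seeded for repeatability, but on a corpus this small the discovered topics should not be expected to be stable or meaningful.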