Current projects

Model systems are a cornerstone of biomedical research, used to investigate biological processes, test gene-based disease hypotheses, and develop and test disease treatments. The vast knowledge that we have about model systems can be better utilized if it is semantically aggregated and made queryable by any number of facets, such as phenotypic similarity, network analysis, gene expression and function, and genomics. The Monarch Initiative aims to provide easy-to-use tools to navigate this data landscape, services for other resources, and educational outreach regarding the production of structured data for biomedical discovery. Monarch is funded by NIH (NIH Grant # 1R24OD011883-01) and is a collaboration between members at Oregon Health & Science University, Lawrence Berkeley National Laboratory, University of California, San Diego, University of Pittsburgh, Sanger Institute, Charité - Universitätsmedizin Berlin, and Garvan Institute of Medical Research.
The OHSU Library and the Department of Medical Informatics & Clinical Epidemiology (DMICE) have been awarded two grants through the NIH Big Data to Knowledge (BD2K) initiative, with the goal of improving health research by training students and researchers to better utilize and generate knowledge from large biomedical datasets. One project is to develop skills courses that train students in key issues of big data analysis: problem description, data set discovery, preparation and creation of data sets, and implementation of advanced techniques. The courses will be offered in a hybrid learning environment, partly online and partly through in-person intensive courses. In addition to the skills courses, our group is developing Open Educational Resources: online materials, available here, that are free for educators and students to use. The ultimate goal is to train data scientists and improve knowledge discovery.

OpenRIF, the Open Research Information Framework, is an open source community devoted to representing expertise ecosystems: all the things we do and all the things we contribute. The community works on developing and promoting interoperable and extensible semantic infrastructure, such as the VIVO Integrated Semantic Framework (VIVO-ISF), an ontology for representing people, works, and the relationships between them; federated databases modeled on PARDI, the Portfolio Analysis and Reporting Data Infrastructure, for research impact and evaluation; and eagle-i, which aims to make research resources discoverable via a semantic search interface and represents their relationships to scholarly activities.

Annotating the CRAFT Corpus

Web Taxology project

Past projects


The Clinical and Translational Activity Reporting (CTAR) tool was a collaboration between the Oregon Clinical and Translational Research Institute and the OHSU Library's Ontology Development Group to prototype a tool that would collocate and analyze data about research activities across a disparate set of internal and external databases (e.g. IRB, grants and contracts, PubMed). Leveraging MeSH, other terminologies, and simple Natural Language Processing (NLP) techniques, the CTAR prototype identified research activity topics and trends, and classified them as clinical or translational. The tool was intended to increase OHSU's and the Oregon Clinical and Translational Research Institute's ability to strategically contribute to research outcomes and human health.


The CTSAconnect project aimed to integrate information about research activities, clinical activities, and scientific resources by creating an Integrated Semantic Framework (ontology). This new framework facilitated the production and consumption of Linked Open Data (a Semantic Web method of sharing data) about investigators, physicians, biomedical research resources, services, and clinical activities. The goal was to enable software to consume data from multiple sources and allow the broadest possible representation of researchers' and clinicians' activities and research products. Current research tracking and networking systems rely largely on publications, but clinical encounters, reagents, techniques, specimens, model organisms, etc., are equally valuable for representing expertise. CTSAconnect was funded by Booz Allen Hamilton (Grant #CTSA 10-001: 100928SB23) and was a collaboration between members at OHSU, Stony Brook University, Cornell University, Harvard University, University at Buffalo, and the University of Florida, and leveraged the work of eagle-i, VIVO, and ShareCenter.

Data Training

ODG and the OHSU Library hosted two working sessions to address reproducibility and research communication. The goal was to promote interaction between information scientists and research scientists to support enhanced awareness of data stewardship. One session was a "Data Management Open House", where aspects of the data-research cycle were addressed, including issues surrounding research reproducibility and scholarly communication of findings. An interactive data wrangling session was also held, and a paper describing our outcomes is available here.

eagle-i Network 


eagle-i is a free application that makes it easy to discover biomedical research resources at a growing network of universities; more than 50,000 resources are listed and more are added every week. Resource types include model organisms, reagents, core laboratory services, instrumentation, and biospecimens. eagle-i was funded by Booz Allen Hamilton (Grant # 90177520). The Ontology Development Group is responsible for maintaining the OHSU resource data and underlying ontology, and can assist with data entry for OHSU researchers.

Biospecimen Query

This project explored options for enhancing the search capabilities of an existing biospecimen search application. Text processing tools were used to map anatomy, pathology, and disease concepts from existing terminologies and ontologies to pathology reports that are currently stored as unstructured natural-language text. The concepts identified in the text were also organized in a relational structure to enable taxonomic and parthood-based searches. This was a small exploratory project with the goal of integrating these capabilities into an ongoing effort to expand and integrate OHSU's biospecimen databases. This work was funded by OHSU's Medical Research Foundation.
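
As a rough illustration of what taxonomic and parthood-based searching means, the sketch below matches a query against specimens annotated with more specific concepts. It is not the project's actual implementation: the concept names, the IS_A and PART_OF tables, the specimen records, and the search function are all hypothetical, and a real system would load its relations from established terminologies and ontologies.

```python
# Hypothetical toy relations (child -> parent); invented for illustration only.
IS_A = {
    "adenocarcinoma": "carcinoma",
    "carcinoma": "neoplasm",
}
PART_OF = {
    "left upper lobe": "left lung",
    "left lung": "lung",
    "lung": "respiratory system",
}

# Hypothetical specimen annotations, standing in for concepts
# extracted from pathology report text.
SPECIMENS = {
    "S1": {"pathology": "adenocarcinoma", "anatomy": "left upper lobe"},
    "S2": {"pathology": "carcinoma", "anatomy": "lung"},
    "S3": {"pathology": "neoplasm", "anatomy": "respiratory system"},
}


def ancestors(term, relation):
    """Return the term followed by everything reached by walking the relation upward."""
    chain = []
    while term is not None and term not in chain:
        chain.append(term)
        term = relation.get(term)
    return chain


def search(pathology, anatomy):
    """Find specimens whose pathology is a kind of the queried pathology
    (taxonomic match) and whose site is part of the queried anatomy
    (parthood match)."""
    hits = []
    for specimen_id, annotation in sorted(SPECIMENS.items()):
        kind_match = pathology in ancestors(annotation["pathology"], IS_A)
        part_match = anatomy in ancestors(annotation["anatomy"], PART_OF)
        if kind_match and part_match:
            hits.append(specimen_id)
    return hits


print(search("carcinoma", "lung"))
```

With these toy tables, a query for carcinoma of the lung also finds the specimen annotated as an adenocarcinoma of the left upper lobe, which a plain text match on the report would miss.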