Mondo Disease Ontology Highlights from 2021
By Sabrina Toro and Nicole Vasilevsky
Greetings from the Mondo team! We spent another year working remotely, giving up our hopes of gathering in person. But we persevered, and we have lots of highlights to share with you from the past year.
New team members, new institution, even stronger team
We expanded our curation team: Sabrina Toro joined the Mondo/Monarch Initiative team in June. She brings her expertise in variant annotation/harmonization and biocuration, and as many of you may have encountered, she’s quickly learning the Mondo curation process and has made a ton of progress in these past months.
Joe Flack from Chris Chute’s group at Johns Hopkins University joined the Mondo team as an ontology engineer this fall. Joe likes to work alongside his four cats on our quality control pipelines.
Last year we announced our new official collaboration with Ada Hamosh from the Online Mendelian Inheritance in Man (OMIM) and her team at Johns Hopkins University, as part of the Phenomics First project. This collaboration has led to countless improvements in Mondo, and her expertise has greatly helped us better represent Mendelian diseases and mappings to external ontologies.
This year again, we have received support and contributions from many experts, clinicians, researchers, and others. Your contributions are the key to Mondo’s success, and we are grateful for them.
In June, the Translational and Integrative Sciences Lab (TISLab), directed by Melissa Haendel and including many members of the Mondo and Monarch Initiative team, moved to the University of Colorado, Anschutz Medical Campus (CU). We are excited about joining the brand new Center for Health and Artificial Intelligence (CHAI), where there is great potential for collaboration. This move required a lot of our attention and time, but we still made a lot of progress (and exciting improvements) on the Mondo project.
Mondo Development Highlights
In the past year, we opened 660 tickets and closed 850 tickets in GitHub, and we had 85 distinct contributors. In 2021, we added 568 new terms and obsoleted 403 terms (as of the 2021-12-01 release). Anyone can report a problem or request a new term via our GitHub issue tracker. All of your reported issues have greatly helped improve this resource, and with our expanded team, we hope to make more progress through our issue tracker over the next year.
A new version of Mondo is released every month, at the beginning of the month. Our latest release, on December 1, was our 65th!
Improved obsoletion workflow
We developed an improved process for obsoleting terms in Mondo, thanks to our collaboration with Larry Babb and the ClinGen team. Terms are obsoleted for various reasons: they are duplicates, they have been merged in source ontologies, they are grouping classes that are considered out of scope, or they represent phenotypes rather than diseases. To minimize disruption to curation and annotations that use Mondo terms, we inform users before a term is obsoleted. Our workflow entails 1) adding an ‘obsoletion candidate’ subset tag to the term in the Mondo ontology file, 2) including a link to the GitHub ticket that describes the reason for obsoletion (and serves as a place for community comments), 3) adding the proposed obsoletion date (for example, 2021-01-01), and 4) adding a comment that states the reason and either the terms to consider as replacements or the actual replacement term (in the case of a merge). We now provide an obsoletion report with each release (for example, see the November report here) and wait at least two months before obsoleting any class, so that users have time to update their annotations, comment, or request that the term be kept.
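The four pieces of information in this workflow, together with the two-month waiting period, can be sketched as a small check. This is an illustration only: the `ObsoletionCandidate` class, its field names, the placeholder term ID and ticket URL, and the 60-day approximation of "two months" are all assumptions made for the example, not Mondo's actual tooling or file format.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Assumption: "at least two months" is approximated as 60 days.
MIN_NOTICE = timedelta(days=60)


@dataclass
class ObsoletionCandidate:
    """Hypothetical record of the four workflow items for one term."""
    term_id: str
    has_subset_tag: bool            # 1) 'obsoletion candidate' subset tag
    issue_url: Optional[str]        # 2) link to the GitHub ticket
    proposed_date: Optional[date]   # 3) proposed obsoletion date
    comment: Optional[str]          # 4) reason and replacement term(s)


def missing_items(candidate: ObsoletionCandidate) -> List[str]:
    """Return the names of workflow items that have not been filled in."""
    missing = []
    if not candidate.has_subset_tag:
        missing.append("subset tag")
    if not candidate.issue_url:
        missing.append("issue link")
    if candidate.proposed_date is None:
        missing.append("proposed date")
    if not candidate.comment:
        missing.append("comment")
    return missing


def can_obsolete(candidate: ObsoletionCandidate,
                 announced_on: date, today: date) -> bool:
    """True only if every item is present, the proposed date has been
    reached, and users have had the minimum notice period."""
    if missing_items(candidate):
        return False
    return (today >= candidate.proposed_date
            and today - announced_on >= MIN_NOTICE)


# Placeholder values, not a real term or ticket.
example = ObsoletionCandidate(
    term_id="MONDO:0000000",
    has_subset_tag=True,
    issue_url="https://github.com/monarch-initiative/mondo/issues/<number>",
    proposed_date=date(2022, 1, 1),
    comment="Out-of-scope grouping class; no replacement proposed.",
)
```

Calling `can_obsolete(example, announced_on=date(2021, 11, 1), today=date(2021, 12, 15))` would return `False` under these assumptions, because neither the proposed date nor the notice period has been reached.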
Improved documentation of changes
We now provide more detailed reports on the changes in Mondo compared to the previous release, including changes to labels and definitions, and new and obsoleted terms. These reports are attached to every release; for example, see the December release notes here. This process was developed, again, with the kind help of Larry Babb from ClinGen and will hopefully help our users react to changes more effectively in the future.
Clinically oriented view of Mondo
At the request of medical experts, we reviewed and revised the classification of Mondo to correspond to the organization of the “Harrison’s Principles of Internal Medicine” (Harrison) textbook. We identified Mondo high-level terms to be removed from the high-level classification, new high-level grouping terms to be created, and high-level terms to be excluded from this clinically oriented view (read more here). We hosted a follow-up workshop on November 17th where we further discussed changes to the high-level classifications. We’ll be working on implementing these changes in the upcoming months.
Axiomatization of chromosomal anomalies
One of the benefits of using ontologies in data annotations is the ability to leverage the underlying logical axioms to allow for inferencing and reasoning across data. Monarch team members participate in the INCLUDE Data Coordinating Center, which provides access to data, analysis tools, and resources for the Down syndrome community, and standardizes and harmonizes Down syndrome patient data using Mondo. There was a need to improve the logical axiomatization of the ‘Down syndrome’ class and related chromosomal anomaly terms in Mondo. We used Dead Simple OWL Design Patterns (DOSDP) to consistently and automatically apply equivalence and subclass axioms to the ‘chromosomal anomaly’ branch of the ontology. This work was presented as a poster at the Biocuration Society Virtual Conference (see poster here).
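The idea behind a design pattern is that one template, filled in with different values, yields consistently shaped axioms for every term in a branch. The sketch below illustrates this in plain Python. It does not use the DOSDP tooling, and the template text, the relation name, and the example rows are hypothetical placeholders rather than the pattern actually applied in Mondo.

```python
# Hypothetical template, written in a Manchester-syntax-like style.
# The relation 'has material basis in' is an assumption for illustration.
TEMPLATE = "'chromosomal anomaly' and ('has material basis in' some {variation})"


def equivalence_axiom(term_label: str, variation_label: str) -> str:
    """Fill the shared template for one term and return the axiom as text."""
    definition = TEMPLATE.format(variation=f"'{variation_label}'")
    return f"'{term_label}' EquivalentTo: {definition}"


# Hypothetical filler table: one row per disease term.
rows = [
    ("Down syndrome", "trisomy 21"),
    ("example anomaly A", "example variation A"),
]

# Every term gets an axiom of the same shape; only the fillers differ.
axioms = [equivalence_axiom(term, variation) for term, variation in rows]
```

Because every axiom comes from the same template, changing the template once changes the logical definition for the whole branch, which is what makes the approach consistent and automatable.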
Restructuring of Peroxisomal Biogenesis disorders
The ClinGen Peroxisomal Disorders gene curation expert panel (PD-GCEP) provided expert advice to improve the classification of the ‘peroxisome biogenesis disorder’ (MONDO:0019234) branch in Mondo. Through many working sessions with Shruthi Mohan and her colleagues, we reclassified, renamed, and obsoleted outdated terms to accurately reflect the current understanding of the spectrum of these diseases.
Addition of new Leukemia terms for Open Targets
We expanded our representation of acute myeloid leukemia (AML) terminology in Mondo to address a specific use case for a team at the Children’s Hospital of Philadelphia (CHOP). The team is working to use a new instance of the Wellcome Trust’s Open Targets portal to specifically host pediatric cancer data. These data sources required classification with clinically relevant disease dictionaries in order to provide useful information to users. Open Targets currently uses the Experimental Factor Ontology (EFO) to standardize its data, and EFO in turn relies on Mondo for disease terminology. We worked with Deanne Taylor to add new terms for molecular subtypes of pediatric AML to Mondo, to facilitate search and classification of tumor samples by cancer researchers.
Mondo aims to provide complete coverage of disease terminology across species by integrating terminology from source ontologies and continuously adding new classes upon request. Ada Hamosh and Melissa Haendel participate in a community-based effort, along with ClinGen collaborators including Courtney Thaxton, to assist in defining disease nomenclature. While different communities have different preferences for disease naming conventions, Mondo works to address all of our users’ needs through the use of primary labels and synonyms that suit each community. We defined some guidelines for naming conventions for Mendelian diseases, which are described here: https://mondo.monarchinitiative.org/pages/disease-naming/.
Our funding under the Phenomics First grant aims to bring together our community of users and medical experts to help improve the scientific accuracy of Mondo through a series of workshops. Our workshops were held virtually this year and covered topics including revision of the upper-level classifications and renal diseases and phenotypes. We also participated in the ClinGen virtual retreat. More details about our workshops are here: https://mondo.monarchinitiative.org/pages/workshop/. Please contact us if you’d like to participate in upcoming workshops.
See you next year
Some of our goals for 2022 include 1) ongoing maintenance and addressing issues on our tracker, including some bigger projects like obsoleting grouping classes that are out of scope and more explicitly defining human and non-human diseases, 2) focusing on specific disease branches like kidney diseases and infectious diseases (as a follow-up to our reclassification of infectious diseases, which we presented at ICBO2021), and 3) continuing to improve our QC and automated processes. We intend to hold more workshops, and if there are any topics of particular relevance to you, please do let us know. With 2021 coming to an end, we look forward to what 2022 has to offer.
How can I contribute to Mondo?
GitHub issue tracker: Submitting a GitHub ticket is the best way to inform the Mondo team about issues you’ve encountered, requests for new terms or synonyms, or any other suggestions.
Community calls: There are Mondo curator calls every Thursday at 10 am PT/1 pm ET, and all are welcome to join to listen in or bring up any issues. Contact Nicole Vasilevsky (firstname.lastname@example.org) for an invitation.
Mondo website: https://mondo.monarchinitiative.org/
Mondo GitHub: https://github.com/monarch-initiative/mondo
Mondo Discussion Board: https://github.com/monarch-initiative/mondo/discussions
Mondo users mailing list: email@example.com (subscribe to the mailing list to get updates about releases, obsoletion candidates and pet pictures)
Generic slide deck on the Mondo Disease Ontology: Generic_Mondo_Slides. This is available for reuse with attribution.
Contact: Nicole Vasilevsky, firstname.lastname@example.org or Sabrina Toro, email@example.com
Thank you to Damien Goutte-Gattat for generating the metrics about Mondo.
Mondo is generously supported by an NIH Office of the Director Grant #5R24OD011883, as well as by NIH-UDP: HHSN268201350036C, HHSN268201400093P, NCI/Leidos #15X143 and Phenomics First, NIH-NHGRI: 1 RM1 HG010860-01.