The Mondo ‘Cat’-a-log and Mondo Highlights from 2020

2020 was a remarkable year for many reasons. Some of the brighter spots were the progress we made on Mondo and the wonderful collaborations we built along the way. We would like to share this update and highlight some of our accomplishments for 2020. We would also love to showcase the darling cats (and dogs!) of Mondo.

Many thanks to our developers

The word Mondo means “world” in Italian. The word takes a new meaning for us as it represents a global, community-developed resource for the world, with major contributions from OMIM, Orphanet, NCIt, GARD, MedGen, EFO/OpenTargets at the European Bioinformatics Institute (EBI), ClinGen, and many other computationally integrated sources (see all of the sources here). We are very grateful for everyone’s contributions to truly make Mondo for and by everyone.

Mondo development highlights

As part of the many enhancements we have made, Mondo is now released on a monthly basis: there have been 54 releases to date. In this past year, we have added 795 new terms and obsoleted or merged 320 terms (as of the 2020-12-18 release), and we have processed and closed over 900 tickets in our GitHub repository. We are aware that there is still a lot more to tackle over the next year, so please feel free to ping us if you have an open ticket that you would like us to prioritize. This progress has been a community effort and we thank all of you for your contributions.

New Quality Control Features

There have also been some great improvements to our quality control pipelines, which automate more of the processing of the ontology and reduce manual effort. This means that the curation team will have more time to address your tickets and tackle bigger issues. We have introduced a variety of checks, such as the ones performed by the ROBOT report tool, a system for quality assurance of biomedical ontologies. We have also implemented a new feature to check our Mondo disease terms against formal disease patterns, for example, ensuring that ‘COVID-19’ conforms to the ‘infectious disease by agent’ disease pattern, or that ‘episodic ataxia type 1’ conforms to the ‘specific disease by dysfunctional structure’ pattern. This check allows us to implement more advanced quality control steps. For example, we can ensure that a genetic disease with a molecular basis in the dysfunction of one gene is not a subtype of a disease with a different molecular basis. In addition, conformance to the patterns ensures proper classification of the ontology without having to manually assert parent-child relationships.
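To make the gene-basis check more concrete, here is a minimal Python sketch of the idea. It is not the Mondo pipeline itself, which works on the ontology with tools such as ROBOT and formal disease patterns. The term identifiers, gene names, and helper functions below are all hypothetical and exist only for illustration. The sketch walks up a toy hierarchy and flags any term whose causal gene differs from that of one of its ancestors.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy hierarchy: term -> list of direct parents.
# These identifiers are invented for illustration, not real Mondo IDs.
PARENTS: Dict[str, List[str]] = {
    "EX:genetic_disease": [],
    "EX:disease_by_gene_A": ["EX:genetic_disease"],
    "EX:subtype_1": ["EX:disease_by_gene_A"],
    "EX:subtype_2": ["EX:disease_by_gene_A"],
}

# Hypothetical annotation: term -> the single gene asserted as its molecular basis.
GENE_BASIS: Dict[str, str] = {
    "EX:disease_by_gene_A": "GENE_A",
    "EX:subtype_1": "GENE_A",
    "EX:subtype_2": "GENE_B",  # deliberately inconsistent with its parent
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable by following parent links from `term`."""
    seen: Set[str] = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def find_gene_conflicts(
    parents: Dict[str, List[str]], gene_basis: Dict[str, str]
) -> List[Tuple[str, str]]:
    """List (term, ancestor) pairs whose asserted causal genes differ."""
    conflicts: List[Tuple[str, str]] = []
    for term, gene in gene_basis.items():
        for ancestor in ancestors(term, parents):
            ancestor_gene = gene_basis.get(ancestor)
            # Ancestors without a gene annotation (general groupings) are ignored.
            if ancestor_gene is not None and ancestor_gene != gene:
                conflicts.append((term, ancestor))
    return sorted(conflicts)


if __name__ == "__main__":
    for term, ancestor in find_gene_conflicts(PARENTS, GENE_BASIS):
        print(f"{term} has a different gene basis than its ancestor {ancestor}")
```

In this toy data only `EX:subtype_2` is flagged, because it is placed under a disease attributed to a different gene. A real check would read these relationships from the ontology rather than from hand-written dictionaries.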

Where terms used as synonyms are no longer considered acceptable, we still track them but mark them as deprecated. For example, “mental retardation” is an obsolete synonym of “intellectual disability.” Megan Kane at NCBI reported an issue that brought to our attention the fact that these deprecated synonyms are still appearing in the Ontology Lookup Service; we are currently working on addressing the issue.

Bottom line: the implementation of these new quality control checks will make development more efficient, will give more consistency to the ontology as it continues to evolve, will make it more interoperable, and ultimately more useful for our community. The issues reported by our users are very helpful for identifying quality issues that should be prioritized. We could not be more grateful for your contributions. Thanks!

How is Mondo being used? Spotlight on a couple of use cases

The Clinical Genome Resource (ClinGen) curates genetic variants of clinical relevance for use in precision medicine and annotates diseases using Mondo in their curation workflow. ClinGen’s gene curation involves a process of lumping and splitting to define the most appropriate disease entity. The Lumping and Splitting Working Group developed the initial guidance for defining disease entities for curation in collaboration with members of Mondo, as well as OMIM.

Due to this process, ClinGen curators have made approximately 40 requests over the past year, initiating revisions or requesting new terms for Mondo. These terms include many gene-specific novel diseases which present with a myriad of phenotypes, such as MONDO:0100175 ‘TTN-related myopathy’. We were recently invited to give a talk to their curation team about the development of curation practices for Mondo. The talk can be viewed here.

NCBI provides several resources that manage information related to human disease. Submitters to ClinVar and the NIH Genetic Testing Registry (GTR) are encouraged to use standard terms to describe disorders, and in 2020, NCBI added the infrastructure to allow those submitters to define disorders using Mondo IDs. Integrating Mondo as a source of terminology was particularly useful when the GTR expanded its scope to include tests for infectious disorders such as COVID-19 this fall. MedGen at NCBI is an information portal describing diseases and phenotypes related to medical genetics. It aggregates information provided by submitters to ClinVar and GTR with terminology from UMLS, Medical Genetics Summaries, and other authoritative sources. During 2020, MedGen added reporting of Mondo IDs with links to Monarch. As part of integrating Mondo into NCBI’s databases, staff members contributed to more than 100 issues on the Mondo repository on GitHub.

Mondo and OMIM — our collaboration is official

Our team was recently granted a Center of Excellence in Genomic Science award from NHGRI, which will fund the Phenomics First Resource, aimed at coordinating the community to make the knowledge we have about phenotype and disease more interoperable. We are thrilled to be officially collaborating with Ada Hamosh from Online Mendelian Inheritance in Man (OMIM) and her outstanding team at Johns Hopkins Medical Institutions. Mondo will be completely aligned with OMIM, and we get to tap the deep expertise of Ada and her team in genetic diseases to ensure proper representation of these disease types.

Looking forward

We cannot wait for the time when this pandemic is put behind us and it is safe to travel again. When the time comes, we will plan an in-person workshop, building on the efforts from the successful workshop we held in Boston in the winter of 2018. In the near future, we plan to host a virtual Mondo workshop to bring together collaborators and work through some key issues — stay tuned!

How can I contribute to Mondo?

Submitting a GitHub ticket is the best way to inform the Mondo team about issues you’ve encountered, requests for new terms or synonyms, or any other suggestions.

There are Mondo curator calls every Friday at 9am PT/12pm ET and all are welcome to join. Contact Nicole Vasilevsky (nicole@tislab.org) for an invitation. A big thanks to Paola Roncaglia and Zoe Pendlington from the Experimental Factor Ontology (EFO) for participating in so many calls throughout the year and for all of their contributions.

Cats of Mondo

If you are not on the Mondo users mailing list, you may want to sign up (here) to stay up to date with relevant information about Mondo, including new releases, terms that may be obsoleted, and of course, CATS! The more participants we get, the more cat pictures we will have, and the better Mondo will be. Thanks for sharing your cat pics, Mondo community.

Also dogs!

At Monarch, we especially value the wealth of information we can derive from a wider diversity of organisms, so please enjoy some dog pics as well.

If you happen to be a Mondo contributor with a different organism for a pet, we’d love to meet them too. Please send them our way!

How can I find out more about Mondo?

Mondo website: https://mondo.monarchinitiative.org/
Mondo repository on GitHub: https://github.com/monarch-initiative/mondo
Mondo Discussion Board (New!): https://github.com/monarch-initiative/mondo/discussions
Mondo users mailing list: mondo-users-subscribe@googlegroups.com
Contact: Nicole Vasilevsky, nicole@tislab.org

Acknowledgements:

Thanks to Courtney Thaxton for the term ‘cat-a-log’.

Mondo is generously supported by an NIH Office of the Director Grant #5R24OD011883, as well as by NIH-UDP: HHSN268201350036C, HHSN268201400093P, NCI/Leidos #15X143 and Phenomics First, NIH-NHGRI: 1 RM1 HG010860-01.
