7.6. Discussion

This chapter presents Ontolopy as a usable tool for finding relationships in OBO ontologies, from which useful outputs can be obtained.

7.6.1. Usefulness

Ontolopy fills a need for quickly searching OBO files for relationships between ontology terms, and as the examples (both simple and complex) show, it fulfils this role well: it works quickly and finds the relationships that you would expect to find (when you know what to ask for).

Ontolopy builds on Pandas, an extremely widely used data analysis tool. When Ontolopy doesn’t have a function written to do something (for example defining new relations, or combining mappings), users can fall back on the functionality of Pandas to create what they need from Ontolopy’s outputs. However, because it does not (yet) integrate well with other OBO tools in development, it misses some potential impact.
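As a rough illustration of falling back on Pandas, the sketch below chains two mapping tables with a merge. The tables, column names (`sample`, `term`, `tissue`) and identifiers are hypothetical stand-ins, not Ontolopy's actual output format.

```python
import pandas as pd

# Hypothetical mapping from sample IDs to ontology terms (toy identifiers).
sample_to_term = pd.DataFrame({
    "sample": ["s1", "s2", "s3"],
    "term": ["TOY:001", "TOY:002", "TOY:003"],
})

# Hypothetical mapping from ontology terms to a tissue of interest.
term_to_tissue = pd.DataFrame({
    "term": ["TOY:001", "TOY:002"],
    "tissue": ["heart", "liver"],
})

# A left merge chains the two mappings and keeps samples with no tissue found.
sample_to_tissue = sample_to_term.merge(term_to_tissue, on="term", how="left")

# Samples that could not be mapped show up with a missing tissue value.
unmapped = sample_to_tissue.loc[sample_to_tissue["tissue"].isna(), "sample"].tolist()
```

Using a left merge rather than an inner one keeps the unmapped samples visible, which is usually what you want when checking the coverage of a mapping.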

Ontolopy is only as useful as the ontologies that it can query, so it shares all of their limitations: they are missing some links, because they are constantly being updated as our knowledge increases. At the same time, it is useful because it builds on these resources, which are created by biological curators with a great deal of experience working with academic and medical communities. Ontolopy has already proved useful in at least one respect: it provides a valuable way of feeding back into experimental data and ontologies. By checking for inconsistencies between multiple ways of labelling data, multiple issues in these resources and data sets have been identified, and some of the resulting revisions have been accepted.
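A consistency check of this kind might look roughly like the following. It assumes a hypothetical table in which each sample has one tissue label derived from its name and another derived from its ontology term; the column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample table with two independently derived tissue labels.
samples = pd.DataFrame({
    "sample": ["s1", "s2", "s3"],
    "tissue_from_name": ["heart", "liver", "lung"],
    "tissue_from_term": ["heart", "kidney", "lung"],
})

# Rows where the two labelling routes disagree are candidates for manual review.
disagreements = samples[samples["tissue_from_name"] != samples["tissue_from_term"]]
```

A disagreement does not say which label is wrong: the error could lie in the sample metadata, in the ontology, or in the mapping between them.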

7.6.2. Usability

One key feature of Ontolopy’s usability is that it is well-documented. At the time of writing, it is much better documented than the alternatives for working with OBO files. The documentation is versioned and contains well-worked examples and a descriptive API.

It is also quick and easy to install, lightweight, and has a small number of dependencies (the upside of the lack of integration with other tools).

As we saw in the examples, Ontolopy runs quickly for most uses involving operations on or queries to ontology objects (typically in less than half a second). The time taken depends on the size of the ontology, the number of chosen relations, and how common those relations are within the ontology, so a query that checks a large number of relations can take noticeably longer to run.

The name mapping (Uberon.map_by_name) is the exception: it runs fairly slowly (on the order of seconds). If it were made faster, more interesting text-mining techniques could be integrated into Ontolopy, for example fuzzy text matching to catch typos in sample information files (which are common, because these files are often created by hand).
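The fuzzy matching idea can be sketched with Python's standard library. This is not part of Ontolopy: the `fuzzy_lookup` helper, the list of names and the cutoff value are all assumptions made for illustration.

```python
import difflib

# Hypothetical list of term names to match against.
known_names = ["heart", "liver", "lung", "kidney"]

def fuzzy_lookup(name, choices, cutoff=0.8):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name.lower(), choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None

typo_match = fuzzy_lookup("Hart", known_names)    # a hand-typed misspelling
no_match = fuzzy_lookup("brain", known_names)     # not close to any known name
```

The cutoff trades recall against false matches: a lower value catches more typos but risks mapping a sample to the wrong term.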

7.6.3. Limitations

Ontolopy is a small and lightweight package, so it doesn’t have as much functionality as some larger tools. It also has some limitations due to its reliance on the underlying ontologies.

7.6.3.1. You still need to understand the structure of the ontology

While Ontolopy makes it easy to query biological ontologies in Python, it doesn’t prevent the user from needing to understand the structure of the ontology (what kind of relations it contains and what these mean) to be able to write meaningful queries. Ontolopy will allow you to ask for nonsense relations, e.g. by combining arbitrary relations, which may give misleading results if you only look at what is being mapped to and from (rather than at the path that the mapping represents).
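To make the point about paths concrete, here is a minimal sketch using a toy graph and a plain breadth-first search. It is not Ontolopy's implementation: the `TOY_EDGES` graph, the `find_path` function and the term names are hypothetical.

```python
from collections import deque

# Toy ontology: term -> list of (relation, target) edges. Not real Uberon content.
TOY_EDGES = {
    "left ventricle": [("part_of", "heart")],
    "heart": [("part_of", "body")],
    "body": [("has_part", "liver")],
    "liver": [],
}

def find_path(source, target, relations):
    """Breadth-first search from source to target using only the given relations.

    Returns the path as a list of (relation, term) steps, or None if no path exists.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        term, path = queue.popleft()
        if term == target:
            return path
        for relation, neighbour in TOY_EDGES.get(term, []):
            if relation in relations and neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(relation, neighbour)]))
    return None

# A sensible query: follow only part_of (and is_a) upwards.
sensible = find_path("left ventricle", "heart", {"is_a", "part_of"})

# A nonsense query: mixing part_of with has_part links almost anything to anything.
misleading = find_path("left ventricle", "liver", {"part_of", "has_part"})
```

Looking only at the endpoints, the second query appears to relate the left ventricle to the liver. The path shows that it climbs to the whole body and then descends again, which says nothing specific about the two terms.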

7.6.3.2. “Missing” functionality

There is plenty of functionality that Ontolopy does not yet have but that could be useful, namely:

7.6.3.3. Improving the choice between multiple synonym options

The Uberon.sample_map_by_name function simply looks up the strings provided and, where several terms match, looks for important external references to decide between them. If this information is not provided or doesn’t help us to make the choice, we currently just choose the first term that we found, ignoring information about the synonym and about which of the terms that have the synonym is more specific.
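One possible way to use this ignored information is sketched below. It is not current Ontolopy behaviour. The synonym scopes (EXACT, NARROW, BROAD, RELATED) are those defined by the OBO format, but the preference order, the `choose_term` function, the `depth` field used as a proxy for specificity, and the identifiers are all assumptions.

```python
# Lower rank is preferred. The ordering is an assumption made for illustration.
SCOPE_RANK = {"EXACT": 0, "NARROW": 1, "BROAD": 2, "RELATED": 3}

def choose_term(candidates):
    """Pick one candidate from a list of dicts with 'term', 'scope' and 'depth' keys.

    Prefers the best synonym scope first, then the more specific (deeper) term.
    """
    best = min(candidates, key=lambda c: (SCOPE_RANK[c["scope"]], -c["depth"]))
    return best["term"]

# Hypothetical candidates that all matched the same sample name.
candidates = [
    {"term": "TOY:010", "scope": "RELATED", "depth": 5},
    {"term": "TOY:011", "scope": "EXACT", "depth": 3},
    {"term": "TOY:012", "scope": "EXACT", "depth": 4},
]
chosen = choose_term(candidates)
```

Here the two exact-synonym candidates beat the related one, and the deeper of the two is chosen as the more specific term.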