Here I am, relaxing at home
July 29, 2010 at 8:52 am | Posted in personal | Tags: photos
Description logic never got me a date
July 29, 2010 at 8:48 am | Posted in article | Tags: ontology, description logic
I am no expert in description logic.
I just put that on the advert to get a girlfriend.
However, it is interesting to hear about the limitations of representing biological knowledge in OWL-DL, which, though powerful for implementing machine reasoning, can struggle with particular ways of thinking in biology, such as:
- Similarity and ‘fuzziness’: biology is grounded in the idea of similarity – similarity between molecules, between organisms, between functions. HOW similar two biological features may be is sometimes hard to describe (how similar are dogs and cats? how similar are DNA sequence A and DNA sequence B?). Indeterminacy like this is difficult to capture in description logic.
- Prototypes and exceptions: biologists often regard entities as prototypical, such that a human eye might be considered a prototype of all eyes even though something like an insect's compound eye, although totally different, is still in many senses related. Furthermore, exceptions to rules are quite normal in biology, so an enzyme class which catalyses transcription will always catalyse transcription, unless it is doing something else.
- Complex property restrictions: for example, a transcription factor binds to a promoter and activates gene transcription – the biological process ‘gene transcription’ is a property of the factor + promoter complex.
- Expressive datatypes: capturing in description logic the idea of a numeric range meaning ‘this is lots’, or dimensions translating to ‘this is a big cell’ – again, this type of thinking is rife in biology.
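To make the ‘fuzziness’ point concrete, here is a rough Python sketch (mine, not from the paper) of the gap between a graded similarity score and the yes/no answer a crisp class definition wants. The sequences, the function names and the 0.85 cut-off are all invented for illustration, and simple percent identity stands in for whatever similarity measure a biologist would actually use.

```python
def identity(seq_a, seq_b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def is_similar(seq_a, seq_b, threshold=0.85):
    """Crisp yes/no membership; the threshold is an arbitrary assumption."""
    return identity(seq_a, seq_b) >= threshold


# Invented toy sequences, not real biological data
reference = "ACGTACGTAC"
close_match = "ACGTTCGTAC"   # one mismatch
near_miss = "ACGAACGTTC"     # two mismatches

print(identity(reference, close_match), is_similar(reference, close_match))
print(identity(reference, near_miss), is_similar(reference, near_miss))
```

The two comparisons differ by a single base, yet one lands inside the ‘similar’ class and the other outside it. The graded score is what the biologist cares about, and it is exactly what gets thrown away by the crisp cut-off.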
The authors below give other examples and discuss just how some of these conceptual tools in biology might be captured in OWL-DL.
Stevens, R., Aranguren, E. M., Wolstencroft, K., Sattler, U., Drummond, N., Horridge, M., and Rector, A. (2007). Using OWL to model biological knowledge. International Journal of Human-Computer Studies, 65(7):583-594.
Own harshest critic judges himself to be doing a good job
July 28, 2010 at 2:57 pm | Posted in article
“I am my own harshest critic.”
I doubt this statement.
Bada et al., authors and instigators of the Gene Ontology, write a (not) entirely disinterested analysis of how the implementation of the Gene Ontology has useful lessons for other ontologies.
The Gene Ontology is a marvelous creation. However, I would not go to the Tory party headquarters for impartial political advice on whether I should join the Tory party.
The authors highlight community involvement and simplicity as two very important factors in the success of the Gene Ontology, factors other ontology designers might bear in mind.
How else might we explain the success of the Gene Ontology? The first product in a marketplace has a natural advantage over its competitors, and complex products may have a high startup cost that deters alternatives. The economics make it unlikely the ontology will fail.
Biologists may use the Gene Ontology because it is there and there are no alternatives. They may contribute to the ontology but feel beholden to the curators as to whether the changes they want will be made. A community is formed and involved as a consequence of the technology rather than any sense of ownership.
Established norms may also play a strong role in take-up of the Gene Ontology. Everyone else is using it as a standard, therefore I must use it too.
Bada, M., Stevens, R., Goble, C. A. et al. (2004). A short study on the success of the gene ontology. Web Semantics: Science, Services and Agents on the World Wide Web, 1(2):235-240.
Irrelevant truth in functional genomics
July 28, 2010 at 2:33 pm | Posted in article, web
Always interested to see the issue of ‘relevance’ rearing its noble / ugly / annoying / informative head in the bioinformatics literature, much as relevance has long skulked about the information science domain.
The ontology and description logics literature spends a lot of time avoiding the question of uncertainty which, in my opinion, is fundamental to the practice of science. Scientists often gauge the importance of variation within their empirical framework using all manner of expertise, guesswork and voodoo.
In developing machine learning algorithms designed to take advantage of Gene Ontology annotations, Akand et al. note:
“… any gene is annotated with all of the categories with which it has been associated in the published scientific literature. In any particular experimental setting, however, only a subset of the known annotations of a gene will be relevant.”
Not all annotations are created equal. Although the human p53 gene may be annotated with 90 different Gene Ontology terms, in the context of an experiment designed to investigate the process of double-stranded DNA repair, many of these other annotations may be ignored by the biologist for the sake of simplicity.
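As a toy illustration of that quote (my own sketch, not the method Akand et al. use), relevance can be pictured as intersecting everything known about a gene with the terms the experiment cares about. The gene name, the term labels and the function name are made-up placeholders, not real Gene Ontology data.

```python
# Hypothetical annotation sets; the labels are placeholders, not real GO records
gene_annotations = {
    "geneX": {
        "DNA repair",
        "apoptosis",
        "cell cycle arrest",
        "response to hypoxia",
    },
}


def relevant_annotations(gene, context_terms, annotations):
    """Return the known annotations of a gene that fall within the context."""
    known = annotations.get(gene, set())
    return known & set(context_terms)


# Terms the (imaginary) experiment is about
context = {"DNA repair", "double-strand break repair"}

relevant = relevant_annotations("geneX", context, gene_annotations)
ignored = gene_annotations["geneX"] - relevant

print(sorted(relevant))
print(sorted(ignored))
```

Everything in `ignored` is still true of the gene. It has simply been set aside for this experiment, and a different context would draw the line somewhere else.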
What then if all analysis in functional genomics is a matter of attention, in which the biologist is free to ignore accessory information which, although objectively true, is deemed superfluous to the task?
Akand, E., Bain, M., and Temple, M. (2007). Learning from ontological annotation: an application of formal concept analysis to feature construction in the gene ontology. volume 85, pages 15-23.
For abstract and full paper see here
Does 100% reliability between indexers or annotators exist?
July 19, 2010 at 3:52 pm | Posted in article | Tags: annotation, classification
How do we measure the reliability of coding, annotating, indexing or classification by different coders, annotators, indexers or classifiers?
Lombard et al. reported just how poorly authors in the mass communication research literature described the consistency between the different coders analyzing content in their research.
Coders in this situation are individuals reading, say, a newspaper, and deciding whether that newspaper contains information about a particular topic or subject, say Wayne Rooney’s wedding. Reliability is the measure of matching judgments between different coders. In the classification world, it might be two librarians choosing subject headings for the same book, or in the Gene Ontology world it might be two different annotators choosing index terms for the same biomedical article.
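For anyone who wants to see what ‘matching judgments’ looks like as a number, here is a minimal Python sketch of two common indices: simple percent agreement and Cohen's kappa, which corrects for the agreement two coders would reach by chance. The choice of kappa is my assumption rather than a recommendation from this post, and the ten yes/no judgments are invented.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which the two coders chose the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Observed agreement corrected for agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of each coder's label proportions, summed over labels
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Invented judgments: does each of ten newspaper articles cover the topic?
coder_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
coder_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]

print(percent_agreement(coder_a, coder_b))       # 0.8
print(round(cohens_kappa(coder_a, coder_b), 2))  # 0.6
```

The two coders agree on eight articles out of ten, but since each says ‘yes’ half the time, half of that agreement would be expected by chance alone, which is why kappa comes out lower than the raw figure.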
From failing to report how many coders had coded the sample, to omitting how or whether the coders had been trained, to not even stating exactly how reliability had been calculated, this paper is a striking investigation into the importance of transparency in reporting research methods.
It is of particular interest to me since I am interested in the value biologists place on manually created annotations between Gene Ontology terms and biological entities, like genes and proteins. Since this type of coding / classification reliability work has long shown the impossibility of 100% agreement between different coders, what are the implications of this for bioinformatic tools using ontology annotations?
How much trust can biologists possibly place in even good quality manual annotations if there is always a difference between annotators?
Lombard, M., Snyder-Duch, J., and Bracken, C. C. (2002). Content analysis in mass communication: Assessment and reporting of intercoder reliability. Human Communication Research, 28(4):587-604.
doi:10.1111/j.1468-2958.2002.tb00826.x. Full text available.
CoLIS 7: The Cool and Belkin faceted classifications of information interactions revisited
June 25, 2010 at 10:42 am | Posted in web | Tags: classification, colis7
Isto Huvila (Uppsala, Sweden) presented at CoLIS 7 and detailed his continuing work with the Cool and Belkin faceted classification of information interactions.
(For Cool and Belkin’s original paper, see here)
What is an information interaction? Isto showed us a picture of a child sitting at a (very dated) PC to illustrate the different ways we could consider an information interaction. The potential complexity of any classification system that attempted to capture the details and nuances of such an interaction was evidently problematic, and my sense of Isto’s presentation was that it was this complexity he had been grappling with in his work.
My thoughts are that any successful empirical science is necessarily a simplification of the natural world around us. Physics and chemistry concentrate on a narrow corridor of the physical world. As sciences, they also simplify the contents and classification of this narrow world, creating a model of what reality is, and finding themselves to possess great explanatory power within that narrow world.
The information scientist who attempts to simplify the information world is often subject to criticism of the sort, “Well, human beings and information are much more complicated than your model suggests, and therefore your model is useless.”
However, I would resist this idea (often flung into the ring from the sociologist’s corner) because if we want to at least try and offer explanations of information interactions, or perhaps adopt an empirical methodology to suggest ways we might augment these interactions, a simplification of the information world as instantiated in the Cool and Belkin classification (and Isto’s extension) is absolutely necessary.
And we can only say such a classification is not helpful when it has failed to demonstrate its utility.
I do not scorn the organic chemist because he cannot offer a complete explanation from within the chemistry paradigm of why I just ate a biscuit.
CoLIS 7: Webometrics, emergent or doomed
June 25, 2010 at 10:42 am | Posted in web | Tags: colis7
Mike Thelwall (Wolverhampton, UK) reflected on the current academic status of webometrics. Though we did not conclusively determine if the discipline is in fact doooooomed, it was interesting to see how other disciplines, such as computer science, are playing with webometric-type techniques without paying much attention to the existing literature.
Is this sloppy scholarship, an active rejection of webometrics or something else? I also wondered if webometrics were not actively applied in the private sector without much of this work feeding through into the academic domain, either because there was no point publishing or because of commercial sensitivity.
Mike suggested we take a look at Blogpulse.com to have a go at our own webometric surveys. I searched for ‘gaming AND motion’, to show how bloggers and gamers are responding to the introduction of new motion gaming systems from the major manufacturers like Microsoft and Sony.
The graph below shows two clear peaks, one for Sony’s announcement of its motion controller, and the second, more recent peak, relating to interest generated by the E3 show.
CoLIS 7: Doctoral forum
June 25, 2010 at 10:40 am | Posted in lecture, thesis, work | Tags: colis7
Presented at the CoLIS 7 doctoral forum on Monday and got some very encouraging feedback from the session leaders and the other students.
It was really fascinating to get an insight into how everyone else’s research is coming along, the different approaches they are taking, and the unique problems each of us faces in trying to get anywhere with our research.
Our doctoral group included students working on everything from tagging in archives and online communities based around the Twilight saga to the philosophical idea of ‘information refusal’ to retrieval challenges for Quranic resources. Oh, and I should mention a project looking at social media use in public libraries and bibliometrics in the literature studies domain. I think that was everyone – you know who you are!
My (unused) project presentation
Many thanks to Jutta Haider for her hard work in organising the forum – it was great!
CoLIS 7 has finished – boo
June 25, 2010 at 10:39 am | Posted in web | Tags: colis7
The 7th International Conference on Conceptions of Library and Information Science finished yesterday, and a very merry time (I believe) was had by all.
There was much information science-style hilarity at the ‘Metatheoretical Snowmen’ session, plenty of excellent presentations from the invited speakers, and some interesting discussion generated by the panels.
The theme of the conference was ‘Unity in diversity’ and every contribution went some way to confirming that the range of research interests and approaches in the domain serves to strengthen and enrich information research rather than weaken and dilute the discipline.
I’ll post a few comments on some of the sessions later this week. Do check the CoLIS 7 website for abstracts and updates over the next few months.
Plus, CoLIS 8 has been announced (see http://www.iva.dk/english/colis8/), and is to be hosted by the Royal School of Library and Information Science in Copenhagen from 19-22 August, 2013.
Ontology as concept representation: reality as conjecture
May 18, 2010 at 7:01 pm | Posted in article | Tags: ontology, philosophy of science
Is an ontology a representation of subjective, social knowledge as it stands in a plurality of competing concepts in the wild?
Or is an ontology a representation of reality, thus avoiding the woolly vagaries of ‘concepts’?
“Good ontology and good modeling in support of the natural sciences can, we conclude, be advanced by the cultivation of a discipline that is devoted precisely to the representation of entities as they exist in reality. In the framework of such a discipline [...] we would talk not of concepts as linguistic or computer artefacts but rather of universals, conceived as that in reality to which the general terms used in making scientific assertions correspond.”
The quote above is taken from the article below and I find this attitude to be surprisingly optimistic with regard to how scientists observe reality and distinguish between scientific things and non- or pseudo-scientific things (which I presume Smith would exclude from ontologies in the natural sciences).
The view above presumes we can observe reality from a neutral perspective when in fact (and this is old news), all observations are coloured by the spectacles of theory. If there are no theory-neutral observations, there can be no single representation of reality, but this is not necessarily a problem (I may justify this another time).
Smith also avoids the question of how we decide what ought to be represented in an ontology ‘based on reality’. A scientific conjecture may create a new thing in reality, so how do we decide when a new thing gets added to an ontology?
Scientific knowledge is fallible, and if an ontology is reality representation, then we need criteria for deciding what gets added to an ontology. Does ESP make it? Reality is not a straightforward point-and-see exercise, and if we can use the word ‘concept’ to describe competing conjectures about reality, which can be tested and refuted, then I have no problem with ontology as concept representation.
Smith, B. Beyond concepts: ontology as reality representation, in Proceedings of the third international conference on formal ontology in information systems, pp. 73-84 (IOS Press, 2004).
Full text available here

