Saturday, December 6, 2008

OwlWatcher and Individuals

I've been poking away at a new version of OwlWatcher that uses the University of Manchester OWLAPI as a replacement for Jena. I have had discussions with several people who suggest that Description Logic (DL) based languages are inappropriate for representing processes, such as behavior. Although I think I understand those concerns, I am pressing forward with OWL as the representation language because the issues that led me to choose OWL haven't really been resolved:

  1. General availability and a large user community
  2. Support for individuals

OWL, augmented either with custom reasoners or a rule-based extension such as SWRL, still seems to be the best choice. The alternatives that I have considered are:

  • OBO
  • Common Logic (CL) or similar
  • CYCL

Let's review the alternatives:

OBO - This is the serious competitor, and it may be that the backend (aka EthOntos) will be based on the OBD platform that BBOP is developing. Actually, part of my reason for moving from Jena to OWLAPI is the built-in support for exporting to OBO. I hope they continue to support this - the existing outboard translators seem to support only OBO Format v 1.0, which doesn't support individuals at all.

This brings me to the other problem with OBO - weak support for individuals. Although version 2 of the OBO Format includes a stanza type for individuals, no version of OBOEdit does anything more than roundtrip them at this point. Furthermore, OBD seems to be moving toward a 'T-box in the A-box' approach. This is a fancy way of saying that classes and class-level relations will be the 'individuals' in the OBD representation. Of course this is consistent with the history of OBO; for example, look at how individual-level relations are introduced for the purpose of defining class-level relations.

I don't mean to criticize the OBO approach - it has proven successful for its central use case of annotating publications. Indeed, I expect this approach will, more or less, be taken in the EthOntos backend, where the focus will, as in OBO, be more on the tree-like and lattice-like networks of relations among classes.

However, OwlWatcher is different - the data comes in as observations of individual events, which are used in the construction of a class hierarchy. In fact, the primary operation is more a type of induction or abduction, where classes are proposed based on the observed properties of their observed individuals. Ideally, OwlWatcher will provide tools for assisting the user in adding restriction definitions to classes that were originally erected as undefined primitives.

I expect OwlWatcher will support a standard Description Logic (DL) reasoner, such as Pellet, and one or more special purpose reasoners, in particular one for temporal reasoning at the object level.
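
To make the idea of proposing classes from observed individuals a little more concrete, here is a toy sketch in Python. It is not how OwlWatcher works or will work; the event names, the property strings, and the function propose_classes are all invented for illustration. It simply proposes one candidate class for every property shared by at least two observed events.

```python
from collections import defaultdict

# Hypothetical observations: each observed event (an individual) is paired
# with the properties that were recorded for it. All names are made up.
observations = {
    "event1": {"actor:jay", "action:peck", "target:ground"},
    "event2": {"actor:jay", "action:peck", "target:feeder"},
    "event3": {"actor:jay", "action:hop", "target:ground"},
}

def propose_classes(observations, min_members=2):
    """Propose a candidate class for each property shared by enough individuals."""
    members = defaultdict(set)
    for individual, properties in observations.items():
        for prop in properties:
            members[prop].add(individual)
    # Keep only properties seen on at least min_members individuals.
    return {
        prop: sorted(individuals)
        for prop, individuals in members.items()
        if len(individuals) >= min_members
    }

candidates = propose_classes(observations)
for prop in sorted(candidates):
    print(prop, candidates[prop])
```

A candidate proposed this way is still an undefined primitive; it would be up to the user to decide whether the shared property deserves to become a restriction on the class.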

Common Logic - This ought to be a strong contender. Common Logic (CL) is a full first-order language, so there aren't any of the expressiveness issues that OWL and OBO ontologists have struggled with. The very first ontologies I ever constructed for behavior (before Protege) were done with a web-based tool called Ontolingua, which used KIF as its serialization language. CL is in some ways an extension of KIF, though its native syntax is not Lisp-based, as KIF was. The main problem is the lack of tool support, either for the user (editors, preferably with access to some sort of reasoning support) or developers (backends with support for large data stores and reasoner interfaces). The main web presence for CL seems to be the pages left over from the (successful) ISO standardization effort, which ended in 2007. I hope support improves for this in the future.

CYCL - Cycorp has released two versions of Cyc to the outside world: OpenCyc and Research Cyc. The former is available on SourceForge, whereas the research version requires a special license from Cycorp. CYCL is a very expressive (n-th order) logic-based language, and the OpenCyc package includes an integrated reasoner and a substantial subset of the Cyc commonsense knowledgebase. CYCL's expressiveness and the inclusiveness of the packages also underlie some of the difficulties I have encountered in evaluating them for this project: although Cyc has excellent support for Java, the packages are large and only run on recent versions of Windows and particular Linux distributions. I have been able to install and run OpenCyc on a Windows XP Boot Camp partition on a MacBook Pro, so Mac users wouldn't be categorically excluded. However, the package is large enough and difficult enough to install that I haven't considered using it for the OwlWatcher distribution or the initial version of EthOntos. There may come a time, however, when I will consider trying Cyc in a backend version. Honestly, that will probably have to wait until I have a stable academic position.

Monday, August 18, 2008

Report from ISBE

I spent last week at ISBE (International Society for Behavioral Ecology), which was held at Cornell this year. I've never been to this meeting before; I usually go to Animal Behavior, which returned to Snowbird for the third time this year. I have to acknowledge Anne Clark's role in getting me up there - we had a meeting of the advisory board to Ethosource and heard an update on the Ethosearch project.

Ethosearch is a database of ethograms whose development Anne and Sue Margulis have been overseeing. They demonstrated the web interface for searching (not publicly available yet) and have started asking people to submit ethograms for the database in the near future (I'd guess toward the end of the year). They are also busy writing text definitions and doing some revision of the ABO Core ontology. I won't update the OWL version I have made available with OwlWatcher until they are ready and OwlWatcher knows how to deal with updates to included ontologies. I will keep you updated when Ethosource goes live.

Anne and Sue each brought students who have been entering ethograms from the published literature. Their method is simply to break up published ethograms into pieces and use the ABO Core ontology as a pair of term taxonomies for classifying the pieces from each ethogram. There will be some text search and matching tools in a release at some point.

Besides Anne and Sue and some students they brought along, Cynthia Parr and Ed Scholes also attended. Cynthia, who recently took a position with Encyclopedia of Life, was involved at the Cornell workshops and has acquired substantial expertise in semantic web ontologies. Ed, who has published a couple of papers using a methodology very similar to ontologies to code ethograms of Bird of Paradise courtship, has been the video curator at Cornell's Macaulay Library for the past six months. Getting Ed and Cynthia talking about sharing between EoL and Macaulay may actually have been the most important outcome of the board meeting. Getting to meet Ed, though I didn't have any time to talk comparative methods, was a high point for me as well.

On to the meeting: there were a lot of papers on social learning and animal cognition. There were also lots of spider papers, though not much overlap between the two. The meeting was substantially larger than Animal Behavior, with 1006 registered for the full scientific program. Most of the time there were six concurrent tracks. Among the plenary talks, certainly Nico Michiels's talk on hermaphroditic invertebrates was memorably lurid - many cases of partners trying to manipulate the other into the female role with the manipulator taking the male role, with no reciprocation. The Hamilton lecture, given by Alasdair Houston and John McNamara at the end of the conference program, included a prediction of the return of ethology - certainly something that I and others who focus on the comparative study of behavior would welcome.

I also enjoyed talks that related to my social learning work with the Scrub-jays: Steve Schoech has been continuing his endocrine studies of the jays, and has developed some field methods for assessing personality in fledgling jays. One of them involves a brightly colored ring (though it's bigger than an Aerobie). I also heard about a field experiment by Sarah Benson-Amram which involved placing puzzle boxes in the range of free-living spotted hyenas. Very cool; her data might have something to say about social learning, innovation or both - using a population that has been under long-term study.

I got to discuss Habronattus with Damian Elias, whom I had only briefly met before. I also discussed comparative methods for data sets that include intraspecific variation with Terry Ord, both the method I was involved with (Ives, Midford, Garland 2007) and Felsenstein's recent paper (Felsenstein 2008). He has also run into Ed Scholes's work with some interest, as Terry works on a range of lizard visual displays. I also briefly spoke with someone from Louis Lefebvre's lab about the use of ontologies in the study of animal innovation.

I'll discuss my poster in another post.

Sunday, June 15, 2008

Getting ready for Minneapolis

The evolution meetings in Minneapolis start late this week. For me the meeting will start on Friday, with the Ontology workshop. I'll be giving a 20 minute talk on taxonomy ontologies. The workshop has been booked full for several months now. I am also associated with three posters: one from a comparative methods in R workshop I attended last December, another one from Phenoscape, which gives an update on the TTO, TAO and where we are headed with Phenote and the new workflow, and finally my single author poster on the behavior work. Part of this will be explaining the difference between annotating publications and behavior videos, and part will be an update, not so much on OwlWatcher, but a preliminary discussion of a toy implementation of the EthOntos-Lite alignment module. There is, as of today, running code, but I haven't given it anything more than a couple of trivial examples, which it seems to handle correctly. It would have been nice to have something I could feed OwlWatcher projects into, but that will have to wait - I've still got lots on my plate, despite a productive weekend.

Saturday, June 7, 2008

This summer

Besides the Evolution Meetings (less than two weeks away), I will also be attending the ISBE (International Behavioral Ecology Congress) in Ithaca this August (9-15). I will be presenting an ontology-related poster, and with any luck, I will have a demo-able version of OwlWatcher using the Manchester OWL-API. I have already started building this version. The Manchester OWL-API, although somewhat more experimental than Jena, is more OWL-centric than Jena, which has a larger RDF and semantic web focus. This is not to disparage either Jena or its development team, both of which have been quite helpful in the process of developing the first versions of OwlWatcher.

I'll save the latest news of EO-Lite for another posting.

Monday, May 5, 2008

OwlWatcher 0.036

This isn't a big deal, mostly a bug fix for a problem that apparently cropped up with the OS X application during the release of 0.035. The work I've been doing with Phenoscape has had an influence on how the later tab views in OwlWatcher will work. I expect I'll have at least a mockup of a view for instance graphs for the Evolution meetings (which I will be attending) in Minneapolis.

Sunday, April 6, 2008

OwlWatcher 0.035 released

For what it's worth, I've released a new version of OwlWatcher, my ontology-based behavior scoring tool. By announcing it here, I avoid the possibility of this becoming a Mesquite blog. Not that there would be anything wrong with a Mesquite blog; it just doesn't go with the title, and I do spend time with other projects.

You can find it here.

Friday, February 15, 2008

I write scripts...

No, I haven't been on strike for the past few months; I write small programs in scripting languages. Many people in bioinformatics, even those with no formal programming background, learn enough to write a simple series of commands in some scripting language or another. The most popular language for this sort of coding is almost certainly Perl. Python is another popular language for doing this. I've written some Perl and a few one-liners to play with Python, but those languages won't be the focus of these posts. I intend to provide some enlightenment on another, lesser-known language - the scripting language used by the Mesquite package.

Mesquite, for those who haven't heard of it or read my introductory post, is an application environment for phylogenetics, focusing on doing things with trees rather than inferring them. It does have the ability to serve as a frontend to programs that do tree inference - either directly for MrBayes or indirectly using a bridge to the Cipres libraries, which support PAUP*, RAxML, and GARLI as well as MrBayes. Mesquite will also create trees using a variety of simulation methods, such as pure birth (aka Yule trees), birth-death, and the model that corresponds to the BiSSE method of estimation.
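
For readers who haven't met pure birth trees, here is a rough sketch of the idea in Python (not Mesquite script, and not Mesquite's implementation). Every lineage splits at the same constant rate, so the waiting time to the next split is exponential with rate equal to the birth rate times the number of lineages. The names Node, simulate_yule, to_newick and tip_depths are my own inventions for this example.

```python
import random

class Node:
    """A tree node; length is the branch leading down to this node."""
    def __init__(self):
        self.children = []
        self.length = 0.0
        self.name = None

def simulate_yule(n_tips, birth_rate=1.0, seed=None):
    """Grow a pure birth tree forward in time until it has n_tips lineages."""
    if n_tips < 1:
        raise ValueError("n_tips must be at least 1")
    rng = random.Random(seed)
    root = Node()
    active = [root]
    while True:
        # Waiting time to the next split among the current lineages.
        wait = rng.expovariate(birth_rate * len(active))
        for node in active:
            node.length += wait
        if len(active) == n_tips:
            break  # stop just before the split that would add another tip
        # Pick one lineage at random and replace it with two daughters.
        parent = active.pop(rng.randrange(len(active)))
        parent.children = [Node(), Node()]
        active.extend(parent.children)
    for i, tip in enumerate(active, start=1):
        tip.name = f"t{i}"
    return root

def to_newick(node):
    """Write the subtree below node as a Newick fragment with branch lengths."""
    label = node.name or ""
    if node.children:
        label = "(" + ",".join(to_newick(child) for child in node.children) + ")"
    return f"{label}:{node.length:.4f}"

def tip_depths(node, depth=0.0):
    """Return the root-to-tip distance for every tip below node."""
    depth += node.length
    if not node.children:
        return [depth]
    return [d for child in node.children for d in tip_depths(child, depth)]

tree = simulate_yule(5, seed=1)
print(to_newick(tree) + ";")
```

Because all lineages grow together until the simulation stops, every tip ends up the same distance from the root, which is a handy sanity check on a sketch like this.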

Many, if not most, users treat Mesquite as a GUI application without any knowledge of the scripting that Mesquite supports and uses when they save and then reload a project containing trees and character matrices. There are several ways to use Mesquite scripts: they can be included in a Mesquite block in the nexus file that represents the project, or they can be used to send a series of commands to a window during a Mesquite session. If you use a command line to launch Mesquite, you can also send commands or scripts via the terminal that you launched Mesquite from.

I'll finish this post with a simple "Hello World" style example of some Mesquite script that you can run within the Mesquite GUI.

Start Mesquite and create a new project; call it scriptTest.nex. Accept the defaults in the dialog that comes up for creating the project. If you are running Mesquite 2.0 or later, you should see something like this in your Mesquite window.

Make sure that "Tree Window 1" is selected, and from the Window menu select Scripting:Send Script. This will bring up a dialog where you can type a script command. Here's "Hello World" in Mesquite Script:
message 'Hello World'

So bring up the script window, type in the command and press the OK button.

So where's my message?
After the script window disappears, you are probably wondering where the message went. It's not in the Tree Window - you actually didn't request the window to do anything. The message appeared in the Mesquite Log. The log contains a record of what Mesquite has done during a session. The easiest way to see the log is to go back to the Window menu and select Log from the list. This will expose the log window. If you scroll down to the bottom, you will see your message, followed by the showWindow command you used to bring up the log.

The log also appears in other places. It is written to the file Mesquite_Log, which is created in Mesquite_Support files; a new file is created each time you start Mesquite. If you start Mesquite from a command line, logged information is also written to the terminal you launched from.

That's it for now. Next time, I'll discuss It and other important variables and the structure of scripts. In the meantime, there is a list of "Universal" Mesquite scripting commands available by choosing Scripting from the Help menu and then selecting the 'universal commands' link near the top of the page.