Creating a Phenote Data Adapter

2/23/2010: Please note that this document has not been checked recently for accuracy.

If you've just downloaded phenote from the SourceForge OBO svn repository, the latest source is located in phenote/trunk/src/java. The most important subdirectories under src/java for creating a data adapter are phenote/dataadapter and phenote/datamodel.

A good example data adapter to check out is phenote.dataadapter.phenosyntax.PhenoSyntaxFileAdapter.

Data Adapter Types

File Adapter

The interface that a phenote file data adapter implements is phenote.dataadapter.DataAdapterI (which is going to be renamed FileDataAdapterI), and here is what it looks like:

public interface DataAdapterI {
  public void load();
  public CharacterListI load(File f);
  public void commit(CharacterListI charList);
  public void commit(CharacterListI charList, File f);
  /** Set value to use for loading or writeback, for a file adapter this would be
      the file name */
  public void setAdapterValue(String adapterValue);
  public List<String> getExtensions();
  public String getDescription();
}

The most important methods are load and commit. load()/load(File) will be called by phenote to load up a phenote.datamodel.CharacterListI (see below), and commit(...) will be called to write a CharacterListI back out.
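To make the load/commit round trip concrete, here is a minimal sketch of a file adapter. Everything in it is an assumption made for illustration: TabFileAdapter and its tab-delimited "field=value" format are hypothetical, and CharacterListI is a trimmed stand-in (a character is modeled as a plain map) so that the block compiles without phenote on the classpath. The sketch also lets IOException propagate, which the real DataAdapterI signatures do not declare.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class FileAdapterSketch {

  // Stand-in for phenote.datamodel.CharacterListI: a character is a field->value map.
  interface CharacterListI {
    List<Map<String, String>> getList();
  }

  static class SimpleCharacterList implements CharacterListI {
    private final List<Map<String, String>> chars = new ArrayList<>();
    public List<Map<String, String>> getList() { return chars; }
  }

  // Hypothetical adapter: one character per line, tab-separated "field=value" pairs.
  static class TabFileAdapter {
    public CharacterListI load(File f) throws IOException {
      SimpleCharacterList list = new SimpleCharacterList();
      for (String line : Files.readAllLines(f.toPath())) {
        if (line.trim().isEmpty()) continue; // skip blank lines
        Map<String, String> character = new LinkedHashMap<>();
        for (String pair : line.split("\t")) {
          int eq = pair.indexOf('=');
          if (eq > 0) character.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        list.getList().add(character);
      }
      return list;
    }

    public void commit(CharacterListI charList, File f) throws IOException {
      List<String> lines = new ArrayList<>();
      for (Map<String, String> character : charList.getList()) {
        StringJoiner row = new StringJoiner("\t");
        character.forEach((field, value) -> row.add(field + "=" + value));
        lines.add(row.toString());
      }
      Files.write(f.toPath(), lines);
    }

    public List<String> getExtensions() { return Arrays.asList("tab"); }
    public String getDescription() { return "Tab-delimited field=value (sketch)"; }
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("phenote-sketch", ".tab");
    SimpleCharacterList list = new SimpleCharacterList();
    Map<String, String> c = new LinkedHashMap<>();
    c.put("Entity", "GO:123");
    c.put("Quality", "PATO:345");
    list.getList().add(c);

    TabFileAdapter adapter = new TabFileAdapter();
    adapter.commit(list, f);                      // write out
    System.out.println(adapter.load(f).getList()); // read back in
  }
}
```

A real adapter would build phenote Characters with setValue (described below) instead of maps, and would catch and report I/O errors rather than throwing them.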


Queryable Data Adapter

The other kind of data adapter is the queryable data adapter. Typically this is a data adapter hooked into a database. Here is the phenote.dataadapter.QueryableDataAdapterI interface:

public interface QueryableDataAdapterI {
  /** return true if data adapter can query for the char field */
  public boolean isFieldQueryable(String field);
  /** Throws exception if query fails, and no data to return */
  public CharacterListI query(String field, String query) throws DataAdapterEx;
}

isFieldQueryable returns true for the names of fields that are queryable through this adapter. The paradigm here is that one queries the adapter with a field name and value. For instance you can query the pub field with pub id value MED:1234, and ideally the adapter would return a CharacterList of all the characters (phenotypic statements) for that publication in the database. Another common example is querying by genotype or allele.

The phenote gui actually queries the QueryableDataAdapter, and for every field that is queryable it puts a "Retrieve" button next to the field. A user will then fill in that field (e.g. MED:1234 in the pub field) and hit the retrieve button. The gui will then call the QueryableDataAdapter's query method with the name of the field ("Pub") and the query string ("MED:1234"). The QueryableDataAdapterI should then return a CharacterList using this query, which will then be loaded into phenote.
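The field-name-plus-value paradigm can be sketched with an in-memory store standing in for the database. This is only an illustration under stated assumptions: InMemoryQueryAdapter is hypothetical, DataAdapterEx is a local stand-in for the phenote exception, and the return type is a plain list of field->value maps rather than a CharacterListI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class QueryableAdapterSketch {

  // Stand-in for phenote.dataadapter.DataAdapterEx.
  static class DataAdapterEx extends Exception {
    DataAdapterEx(String msg) { super(msg); }
  }

  // Hypothetical adapter backed by a list instead of a database.
  static class InMemoryQueryAdapter {
    private final Set<String> queryableFields = new HashSet<>(Arrays.asList("Pub", "Genotype"));
    private final List<Map<String, String>> store = new ArrayList<>();

    void add(Map<String, String> character) { store.add(character); }

    // The gui would show a "Retrieve" button only for fields that return true here.
    public boolean isFieldQueryable(String field) {
      return queryableFields.contains(field);
    }

    // Returns every stored character whose field equals the query string;
    // throws if the field is not queryable or nothing matches.
    public List<Map<String, String>> query(String field, String query) throws DataAdapterEx {
      if (!isFieldQueryable(field)) {
        throw new DataAdapterEx("Field is not queryable: " + field);
      }
      List<Map<String, String>> hits = new ArrayList<>();
      for (Map<String, String> character : store) {
        if (query.equals(character.get(field))) hits.add(character);
      }
      if (hits.isEmpty()) {
        throw new DataAdapterEx("No data for " + field + " = " + query);
      }
      return hits;
    }
  }

  public static void main(String[] args) throws DataAdapterEx {
    InMemoryQueryAdapter adapter = new InMemoryQueryAdapter();
    Map<String, String> c = new HashMap<>();
    c.put("Pub", "MED:1234");
    c.put("Entity", "GO:123");
    adapter.add(c);
    System.out.println(adapter.query("Pub", "MED:1234").size() + " character(s) found");
  }
}
```

A database-backed adapter would translate the field and query string into a SQL (or similar) query at the point where this sketch loops over the list.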

(todo: if there is unsaved data, phenote should ask if the user wants to save before loading).

Reading & Writing with a Data Adapter

Reading in data

A CharacterListI is just a list of phenote.datamodel.Characters. A Character is basically a phenotypic statement (relating E, Q, genotype, etc...). So basically what a data adapter needs to produce is a list of characters.

A Character is just a set of tag-value fields, where the tag is the name of the field (Entity, Quality, Genotype...) and the value is the value of the field. The actual names of the fields come from the phenote configuration (link to nicoles doc on phenote config); in other words, the phenote datamodel is prescribed by its configuration file. This means one of three things has to hold: the data adapter can read and write any tag-values that come at it, the data adapter only handles a certain subset of a configuration, or the configuration is in tune with the data adapter. If the set of fields in the configuration is completely different from the fields a data adapter is expecting, then the data adapter won't be able to get/load the data it expects. The upshot is that in addition to making a data adapter you need to make a configuration that fits with it (or make sure it fits with an existing configuration).

To set a field in a character use:

setValue(CharField cf, String valueString), which throws a phenote.datamodel.TermNotFoundException if the valueString is not found in the ontologies associated with the CharField (via configuration). phenote.datamodel.CharField is an object that represents a field in a character.

To get a char field you can call

getCharFieldForName(String fieldName) which throws a phenote.datamodel.CharFieldException if you give it a string that is not from the configuration.

Even better, these two steps are combined into one convenience method:

setValue(String fieldString, String valueString) throws TermNotFoundException, CharFieldException.

For fields with ontologies (with term completion) the valueString has to be the id for the term (not the term name).

So that's basically it for making characters. Some code might look like this:

try {
    Character c = new Character();
    c.setValue("Entity","GO:123");
    c.setValue("Quality","PATO:345");
    c.setValue("Genotype","somegenotypehere");
    ....
}
catch (CharFieldException e) {...} // may want to do this per field - error msg?
catch (TermNotFoundException e) {...} // perhaps per field - error message?

and for CharacterLists just add the characters made above to it:

CharacterList cl = new CharacterList();
cl.add(character1);
cl.add(character2);
...

Writing out data

The CharacterList is passed into the commit method. Iterate through the list of CharacterI's. To get at a Character's field data just call character.getValueString(String fieldString). This throws a CharFieldException if the fieldString doesn't match a field in your configuration. It returns a String which is the value of that field; in the case of fields with ontologies this is a term id (GO:1234). If you would like more info than just the term id from an ontology field you can call getTerm(String fieldName). This returns an org.geneontology.oboedit.datamodel.OBOClass from the obo edit datamodel (I may eventually wrap this in a phenote object - not sure).


You can also query the OntologyManager for all existing character fields with OntologyManager.inst().getCharFieldList(), which returns a List<CharField>. You can then query whether the Character has a value for a char field with character.hasValue(CharField), and can retrieve a phenote.datamodel.CharFieldValue from the character with getValue(CharField). You can then call charFieldValue.getName() to get the free text string or the id of the field. You can also query if it's a term with charFieldValue.isTerm() and if so get its OBOClass with getTerm().

And that's about it. As you can see there are several ways of getting at this data. Here's what some code may look like:

    for (CharacterI ch : characterList.getList()) {
        try {
            String genotype = ch.getValueString("Genotype");
            OBOClass entityTerm = ch.getTerm("Entity");
            OBOClass valueTerm = ch.getTerm("Value");

            // write this data out to data source...

        } catch (CharFieldException ex) { /* ...error processing... */ }
    }

This implies that Genotype, Entity, and Value are all fields in the configuration file; if any of them is not, a CharFieldException will be thrown.
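The other route described above, walking the configured CharFields instead of hardwiring field names, can be sketched as follows. The classes here are simplified stand-ins that only mirror the method names mentioned in the text (hasValue, getValue, getName, isTerm); SketchCharacter, CharFieldWriter, the constructors, the "term:" prefix, and CharField.getName() are assumptions for illustration. In phenote the field list would come from OntologyManager.inst().getCharFieldList().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class CharFieldWalkSketch {

  // Stand-in for phenote.datamodel.CharField (getName() is assumed).
  static class CharField {
    private final String name;
    CharField(String name) { this.name = name; }
    String getName() { return name; }
  }

  // Stand-in for phenote.datamodel.CharFieldValue.
  static class CharFieldValue {
    private final String name;
    private final boolean term;
    CharFieldValue(String name, boolean term) { this.name = name; this.term = term; }
    String getName() { return name; }   // free text, or the term id for ontology fields
    boolean isTerm() { return term; }
  }

  // Stand-in for a phenote Character.
  static class SketchCharacter {
    private final Map<CharField, CharFieldValue> values = new HashMap<>();
    void put(CharField cf, CharFieldValue v) { values.put(cf, v); }
    boolean hasValue(CharField cf) { return values.containsKey(cf); }
    CharFieldValue getValue(CharField cf) { return values.get(cf); }
  }

  static class CharFieldWriter {
    // Writes one tab-delimited row with a column per configured field.
    // Fields without a value become empty columns; no field names are hardwired.
    static String toLine(SketchCharacter ch, List<CharField> fields) {
      StringJoiner row = new StringJoiner("\t");
      for (CharField cf : fields) {
        if (!ch.hasValue(cf)) {
          row.add("");
          continue;
        }
        CharFieldValue v = ch.getValue(cf);
        row.add(v.isTerm() ? "term:" + v.getName() : v.getName());
      }
      return row.toString();
    }
  }

  public static void main(String[] args) {
    CharField entity = new CharField("Entity");
    CharField quality = new CharField("Quality");
    CharField genotype = new CharField("Genotype");

    SketchCharacter ch = new SketchCharacter();
    ch.put(entity, new CharFieldValue("GO:123", true));
    ch.put(genotype, new CharFieldValue("somegenotypehere", false));

    System.out.println(CharFieldWriter.toLine(ch, Arrays.asList(entity, quality, genotype)));
  }
}
```

The advantage of this style is that the adapter keeps working when fields are added to or removed from the configuration, at the cost of knowing less about what each column means.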

ToDo/Changes needed to data adapter interface:

Ok I just noticed that currently only load(file) and commit(charList,file) are being called via the LoadSaveManager - I will fix this pronto.

So load & save from the file menu has been directed to LoadSaveManager, which is hardwired to files (Jim Balhoff's work for the phenoxml, syntax, & nexus adapters - which are all file based - we haven't had non-file yet). This won't work for database adapters and needs a refactoring - I will get on this!

load() should return a CharacterList not void!

##DONE Add method to Character for setting a field with just strings:

setValue(String field, String value)

refactor? should OBOClass be wrapped in a phenote class to detach phenote from obo edit?

refactor note: I'm wondering if the file stuff in DataAdapterI should be refactored - I could imagine an AdapterParam class, or subclasses of DataAdapterI like FileDataAdapterI and DatabaseDataAdapterI, and DataAdapterI would have methods like boolean isFileAdapter(), FileDataAdapter getFileDataAdapter() - need to think about this.

refactor: the phenote datamodel is eventually gonna also get hip to OBO-Edit's instance datamodel - however I think it will be under the covers and the above interface will remain the same.

Constraints

You can add edit time or commit time constraints to phenote (as of 1/31/08 this is a work in progress, but getting more fleshed out).

Edit time constraints will be checked after the user makes an edit (not implemented yet).

Commit time constraints get checked when the user commits - to database or file.

An example might be making sure the ints in a range are proper.

(non null fields should be a constraint that gets configged in the fields themselves - todo)

Before the commit is made the constraints are checked and if there is a problem an error message pops up and the commit is cancelled, or possibly a warning is given and user has option to still commit.

To add a constraint you need to implement phenote.dataadapter.Constraint:

public interface Constraint {
  /** Return true if constraint should be checked at commit time to dataadapter */
  public boolean isCommitConstraint();

  /** do constraint check for commit time - should char list be passed in?
   return ConstraintStatus indication if constraint passed and error msg
   should only be called if isCommitConstraint is true */
  public ConstraintStatus checkCommit();


  /** Return true if constraint should be checked after user edits */
  public boolean isEditConstraint();

  /** do constraint check after user edit - should char list be passed in?
   return ConstraintStatus indication if constraint passed and error msg
   should only be called if isEditConstraint is true */
  public ConstraintStatus checkEdit();
}

If it's a commit constraint (and it's allowed to be both edit & commit) then return true for isCommitConstraint, and put your constraint code into checkCommit(). You can access the datamodel (as you probably will want to) with phenote.dataadapter.CharacterListManager.inst().getCharacterList(), which returns a CharacterListI. checkCommit returns a phenote.dataadapter.ConstraintStatus - here is its constructor:

  public ConstraintStatus(Status status,String message)

So if your constraint fails then construct it as such:

new ConstraintStatus(ConstraintStatus.Status.FAILURE,"The integer you entered for field temperature is out of range, yada yada...");
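Putting the pieces above together, a commit constraint might look roughly like the sketch below. The Constraint interface and the ConstraintStatus constructor follow the text; everything else is an assumption for illustration: TemperatureRangeConstraint and the "Temperature" field are hypothetical, the Status.OK name and the getStatus()/getMessage() accessors are guesses, and the character list is passed in as plain maps instead of being fetched from CharacterListManager.inst().getCharacterList().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConstraintSketch {

  // Stand-in for phenote.dataadapter.ConstraintStatus.
  static class ConstraintStatus {
    enum Status { OK, WARNING, FAILURE } // OK is an assumed name
    private final Status status;
    private final String message;
    ConstraintStatus(Status status, String message) {
      this.status = status;
      this.message = message;
    }
    Status getStatus() { return status; }   // assumed accessor
    String getMessage() { return message; } // assumed accessor
  }

  // Same shape as phenote.dataadapter.Constraint shown above.
  interface Constraint {
    boolean isCommitConstraint();
    ConstraintStatus checkCommit();
    boolean isEditConstraint();
    ConstraintStatus checkEdit();
  }

  // Hypothetical constraint: the Temperature field must be an integer in [min, max].
  static class TemperatureRangeConstraint implements Constraint {
    private final List<Map<String, String>> characters;
    private final int min;
    private final int max;

    TemperatureRangeConstraint(List<Map<String, String>> characters, int min, int max) {
      this.characters = characters;
      this.min = min;
      this.max = max;
    }

    public boolean isCommitConstraint() { return true; }
    public boolean isEditConstraint() { return false; }

    public ConstraintStatus checkCommit() {
      for (Map<String, String> character : characters) {
        String raw = character.get("Temperature");
        if (raw == null) continue; // empty field: nothing to check
        try {
          int t = Integer.parseInt(raw.trim());
          if (t < min || t > max) {
            return new ConstraintStatus(ConstraintStatus.Status.FAILURE,
                "Temperature " + t + " is out of range " + min + ".." + max);
          }
        } catch (NumberFormatException e) {
          return new ConstraintStatus(ConstraintStatus.Status.FAILURE,
              "Temperature is not an integer: " + raw);
        }
      }
      return new ConstraintStatus(ConstraintStatus.Status.OK, "");
    }

    // Never called, since isEditConstraint() returns false.
    public ConstraintStatus checkEdit() {
      return new ConstraintStatus(ConstraintStatus.Status.OK, "");
    }
  }

  public static void main(String[] args) {
    List<Map<String, String>> chars = new ArrayList<>();
    Map<String, String> c = new HashMap<>();
    c.put("Temperature", "150");
    chars.add(c);
    ConstraintStatus s = new TemperatureRangeConstraint(chars, 0, 100).checkCommit();
    System.out.println(s.getStatus() + ": " + s.getMessage());
  }
}
```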

To add your constraint to the ConstraintManager you need only configure it (not implemented yet, coming soon) with
<constraint-list>
    <constraint class="phenote.dataadapter.worm.WormRangeConstraint"/>
    <constraint class="phenote.dataadapter.worm.WormNameConstraint"/>
  ...
</constraint-list>

Until the configuration route is implemented, the alternative is to hardwire the adding of your constraint with ConstraintManager.inst().addConstraint(Constraint). If you add more than one constraint, the messages from all the failing constraints will be appended and displayed together.

ConstraintStatus also has a warning state - this causes a warning to display and lets the user decide whether to go through with the commit. Use ConstraintStatus.Status.WARNING for this. Likewise, the warning messages from all constraints get lumped into one message.

Http data adapter

So there is a way to send data to phenote via http. This is currently how BIRN's smart atlas sends data to phenote.
In hindsight I realize this is in fact a data adapter - seems obvious, right? It's erroneously in the phenote.servlet package (as phenote starts up a servlet to receive the http requests).

phenote.servlet.DataInputServer starts up DataInputServlet to receive http.
phenote.servlet.DataInputServlet receives the http.
doGet() sends the request to the inner class EditRunnable, which is a thread.
EditRunnable takes the request - which is really just a hash - and expects the keys to be field names in phenote. It then sets those fields in phenote with the corresponding values from the request. So it's rather simple, and what you send to it has to align with the phenote configuration.
The request then manifests as a new row in phenote.
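From the sending side, this amounts to issuing a GET whose parameters are phenote field names. Here is a minimal sketch of building such a request URL. The host, port, and path in the example are made up (this document does not say where the servlet listens), and HttpInputSketch/buildUrl are hypothetical names; only the idea that the keys are field names comes from the description above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class HttpInputSketch {

  // Builds baseUrl?Field1=value1&Field2=value2 with keys and values URL-encoded.
  // The keys must match field names in the phenote configuration.
  static String buildUrl(String baseUrl, Map<String, String> fields) {
    StringJoiner query = new StringJoiner("&");
    for (Map.Entry<String, String> e : fields.entrySet()) {
      query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
    }
    return baseUrl + "?" + query;
  }

  private static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException ex) {
      // UTF-8 is always supported by the JVM, so this should not happen.
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("Entity", "GO:123");
    fields.put("Genotype", "some genotype");
    // Hypothetical address; use wherever the data input servlet actually listens.
    System.out.println(buildUrl("http://localhost:8080/phenote", fields));
  }
}
```

Sending the request is then a matter of opening that URL with any HTTP client; each successful request should show up as one new row.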

The data input servlet is currently configured as such:
  <ns:data-input-servlet enable="true"/>

but this should be refactored to just be another dataadapter config, and DataInputServlet should be moved to a dataadapter/http package.