M716report

=M7.16=

These are rough notes to be expanded into the full milestone report.

Background
GoldenGATE is an established tool to assist in the semi-automated mark-up of biodiversity literature.

See Sautter, G., K. Bohm, and D. Agosti. ‘Semi-automated XML Markup of Biosystematic Legacy Literature with the GoldenGATE Editor.’ In: Pac Symp Biocomput, 391-402, 2007.

GoldenGATE was envisaged as a tool to help experienced taxonomists mark up digital literature, recording both document-centric information and data-centric semantic enhancements. The use case was a taxonomist scanning an existing work and then marking it up to cover all relevant topics of interest. This design led to two drawbacks in GoldenGATE as originally implemented. It had:


 * 1) a monolithic structure - while GoldenGATE comprises many individual services, they are not accessible except through the GoldenGATE UI. GoldenGATE was a standalone service, integrated as a whole into the taxonomist's workflow, rather than a set of individual services that could each be integrated into that workflow.
 * 2) a set workflow - modules can only be called in a set order from within the GoldenGATE UI. Thus the taxonomist had to mark up the volume in the way required by GoldenGATE, in accord with the original use case, and not in a way that might better fit their needs.

These two drawbacks hindered the wider adoption of GoldenGATE, and consequently the wider benefit of adding semantic mark-up to taxonomic literature.

This milestone is the first step in ViBRANT towards addressing these drawbacks.

Obtaining GoldenGATE web services
The repackaged GoldenGATE can be downloaded from http://vbrant.ipd.uka.de/RefBank/GgWS.zip.

GoldenGATE is written in Java, so the individual modules have been repackaged as web services in Java servlets. Running the GoldenGATE web services requires Java Runtime Environment 1.5 or higher (Sun/Oracle JRE recommended) and Apache Tomcat 5.5 or higher.

The zip contents need to be extracted into Tomcat's webapps folder, and some local configuration settings applied as documented in the download's README file.
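The extraction step can be scripted. The following is a minimal sketch in Python, not part of the GoldenGATE download: the function name deploy_archive and both paths in the usage note are hypothetical, and the local configuration described in the README still has to be applied by hand afterwards.

```python
import zipfile
from pathlib import Path


def deploy_archive(zip_path, webapps_dir):
    """Extract the archive into the Tomcat webapps folder.

    Returns the sorted top-level names found in the archive, so the caller
    can see what was unpacked.
    """
    webapps = Path(webapps_dir)
    if not webapps.is_dir():
        # Fail early rather than create a folder Tomcat does not know about.
        raise FileNotFoundError("webapps folder not found: %s" % webapps)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(webapps)
        names = archive.namelist()
    return sorted({name.split("/")[0] for name in names})
```

A call might look like deploy_archive("GgWS.zip", "/opt/tomcat/webapps"), where the webapps path depends on the local Tomcat installation.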

The default services available in the downloaded package are:


 * BibRefParserAutoNormalizing.webService
 * DateTaggerNormalizing.webService
 * GeoCoordinateTaggerNormalizing.webService
 * QuantityTaggerNormalizing.webService

All GoldenGATE modules are available as web services.

Calling GoldenGATE web services
A sample script call-GgWS.php is available.

The script shows how to invoke a GoldenGATE web service, and track the invoked web service to completion.

GoldenGATE web services allow progress tracking similar to OAI-PMH. An initial POST invokes the service, which returns a status code. Valid codes are Started, Running and Finished. Started and Running return XML that includes a callback code to be used in a GET to track progress of that particular invocation until the service completes processing. When Finished is returned the output includes the marked up text.
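The invoke-then-poll pattern described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of call-GgWS.php: the POST field name "data", the GET parameter name "callback", and the response shape (a root element carrying status and callback attributes) are all hypothetical, and the real names should be taken from the sample script or the README.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def http_fetch(url, data=None):
    """POST when data is given, otherwise GET; return the response body as text."""
    body = urllib.parse.urlencode(data).encode("utf-8") if data is not None else None
    with urllib.request.urlopen(url, data=body) as response:
        return response.read().decode("utf-8")


def parse_status(xml_text):
    """Return (status, callback_code) from a response.

    Assumed shape: <response status="Running" callback="abc123">...</response>
    """
    root = ET.fromstring(xml_text)
    return root.get("status"), root.get("callback")


def run_service(service_url, text, fetch=http_fetch, poll_seconds=2.0, max_polls=100):
    """Invoke a service and poll until it reports Finished; return the final XML."""
    # The initial POST starts the invocation (field name "data" is an assumption).
    reply = fetch(service_url, {"data": text})
    status, callback = parse_status(reply)
    polls = 0
    while status in ("Started", "Running"):
        if polls >= max_polls:
            raise TimeoutError("service did not finish in time")
        time.sleep(poll_seconds)
        # Each follow-up GET carries the callback code (parameter name is an assumption).
        query = urllib.parse.urlencode({"callback": callback})
        reply = fetch(service_url + "?" + query)
        status, new_callback = parse_status(reply)
        callback = new_callback or callback
        polls += 1
    if status != "Finished":
        raise ValueError("unexpected status: %r" % status)
    return reply
```

The fetch parameter exists so that the polling logic can be exercised with a stand-in function instead of a live Tomcat instance. The max_polls limit is an addition of this sketch so that a stalled invocation does not loop forever.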

This is the output from the sample script:

The script produced the tracking comments Running and Finished, while GoldenGATE produced the XML, bounded by its enclosing tags. The XML includes the <geoCoordinate> element, which allows the marked-up results to be extracted.
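Pulling the <geoCoordinate> elements out of the returned XML could look like the Python sketch below. The sample document is invented for illustration: its wrapper elements, attribute names and values are assumptions, not actual GoldenGATE output.

```python
import xml.etree.ElementTree as ET

# Hypothetical output; the real wrapper element and attribute names may differ.
SAMPLE_OUTPUT = """<result>
  <paragraph>Collected at <geoCoordinate orientation="N" value="47.5">47 30' N</geoCoordinate>,
  <geoCoordinate orientation="E" value="8.25">8 15' E</geoCoordinate>.</paragraph>
</result>"""


def extract_geo_coordinates(xml_text):
    """Return a list of (text, attributes) pairs, one per <geoCoordinate> element."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter.
    return [(el.text or "", dict(el.attrib)) for el in root.iter("geoCoordinate")]


for text, attributes in extract_geo_coordinates(SAMPLE_OUTPUT):
    print(text, attributes)
```

If the real output declares an XML namespace, the tag passed to iter() would need the namespace prefix in ElementTree's {uri}tag form.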