M711report

= M7.11 Review of options for interactive mark up tools within the Scratchpad infrastructure =

Introduction
This wiki page gives an overview of the options available for adding interactive mark up tools to the Scratchpads environment so that taxonomists can mark up literature.

users
Users / requirements - who are our users? And how are they going to carry out the task of interactive mark up of the taxonomic literature?

professional scientist
working process
 * at desk
 * in library
 * not in the field (possibly in a coffee shop, a deck chair in the back garden, or on the train) so offline use would be an advantage, although this has implications for the implementation approaches used and the services that may be available.
 * possibly with original document in front of them (either hard copy or a scanned PDF) and laptop by their side

citizen scientists
This could include secondary and tertiary education, also keen amateurs and retired professionals.

N.B. This could include crowd-sourcing using software from the IMPACT project, the Australian Newspapers Online project or a citizen science version of GoldenGATE.

accessibility
not adding full support

nature of work requires users to have good sight, fine motor control, etc

think about how much accessibility is appropriate

workflow
the professional taxonomist should interact primarily with Scratchpads because that is the tool to support their processing of and sharing of data

mark up of text should not be an onerous additional task because if it is, it won't be done



the mark up process should be enabled within Scratchpad

ideally as much as possible should be automated and the output automatically linked to a document store

the taxonomist must be able to manually revise the mark up to make corrections and to resolve ambiguities highlighted but not resolved by the automatic processes
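
The workflow above (automatic mark up first, manual revision afterwards) could be modelled roughly as follows. This is only a sketch under stated assumptions: the `Annotation` class, its field names and the label strings are hypothetical and are not taken from GoldenGate or Scratchpads.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    """One marked-up span of text (hypothetical data model)."""
    start: int                  # character offset where the span begins
    end: int                    # character offset just past the end of the span
    label: Optional[str]        # e.g. "taxon_name"; None while unresolved
    candidates: List[str] = field(default_factory=list)  # labels the tagger could not choose between
    source: str = "automatic"   # "automatic" or "manual"

    @property
    def ambiguous(self) -> bool:
        # Highlighted by the automatic process but not resolved by it.
        return self.label is None and len(self.candidates) > 1


def needs_review(annotations: List[Annotation]) -> List[Annotation]:
    """Return the annotations a taxonomist still has to resolve by hand."""
    return [a for a in annotations if a.ambiguous]


def resolve(annotation: Annotation, chosen_label: str) -> Annotation:
    """Return a manually revised copy of an annotation."""
    return Annotation(annotation.start, annotation.end, chosen_label, [], "manual")
```

Keeping the `source` of each annotation means a later automatic pass can avoid overwriting corrections a taxonomist has already made.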

dedicated desktop app
generic issues
 * we should support offline usage if possible
 * to run on Windows, Mac & Unix
 * -ve
   * installation & support
   * upgrades
     * to software
     * and environment

GoldenGate

 * easier to reuse GoldenGate
 * -ve
   * Java + AWT/Swing
   * relatively poor visuals compared to modern web designs

Other annotation tools
There are several examples of annotation tools; these are often dedicated to particular tasks or literature, e.g. molecular biology. One of the better known annotation tools is GATE (General Architecture for Text Engineering).
 * e.g. GATE
   * +ve
     * modular design
     * easily extensible with other NLP tools e.g. LingPipe
     * plugins available e.g. OpenCalais
   * -ve
     * requires customisation to make it easier to use
     * generic package, would need customisation for taxonomic markup
 * Other annotation tools
   * give another example if we can find one
   * benefits and drawbacks similar to GATE

widget

 * too tied to a particular OS
 * not consider further
 * because multiple versions required
 * harder to implement than alternatives

Java applet
easier to reuse GoldenGate

clunky visuals
 * not as impressive as alternative RIA options
 * not meet user expectations (personal, anecdotal evidence)

unlikely to encounter old problems
 * Java not installed
 * Java disabled
 * but need to trap error if encountered

memory
 * Java is hungry
 * cope with large documents
 * & disk security issues

jQuery standalone UI
rich visuals

decouple front end and back end
 * we could reuse GoldenGate or any other services e.g. GATE as web services
 * should be more robust & versatile solution than desktop app

to work offline, however...
 * need to develop standalone editor
 * download/upload file to work locally
 * or exploit HTML5's offline support
 * Other web services (e.g. GoldenGate tagging) not available while offline
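
Because web services such as GoldenGate tagging are not available while offline, a standalone editor would need some way of deferring those requests. One possible approach is sketched below; the `OfflineQueue` class and the `send` callable are hypothetical stand-ins, not part of any existing GoldenGate or Scratchpads API.

```python
from typing import Callable, List, Optional


class OfflineQueue:
    """Hold tagging requests locally while the web service is unreachable."""

    def __init__(self, send: Callable[[str], str]):
        # `send` is an assumed function that posts text to a tagging
        # service and raises ConnectionError when the service is unreachable.
        self._send = send
        self.pending: List[str] = []

    def submit(self, text: str) -> Optional[str]:
        """Try the service now; queue the text if we are offline."""
        try:
            return self._send(text)
        except ConnectionError:
            self.pending.append(text)
            return None

    def flush(self) -> List[str]:
        """Retry queued texts, keeping any that still fail."""
        results, remaining = [], []
        for text in self.pending:
            try:
                results.append(self._send(text))
            except ConnectionError:
                remaining.append(text)
        self.pending = remaining
        return results
```

The user can carry on with manual mark up while offline, and `flush()` can be called once a connection is available again.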

can reuse OSS tools/modules

user always using latest version
 * ensure standalone (if developed) checks for updates when invoked
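
A check for updates on start-up could be as simple as comparing version strings. The sketch below assumes plain dotted numeric versions (e.g. "1.10.0"); the function names are illustrative, and how the latest version number is obtained from the server is left open.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as "1.10.0" into (1, 10, 0)."""
    return tuple(int(part) for part in version.split("."))


def update_available(installed: str, latest: str) -> bool:
    """True when the latest published version is newer than the installed one."""
    return parse_version(latest) > parse_version(installed)
```

Comparing tuples of integers rather than raw strings avoids treating "1.9.0" as newer than "1.10.0".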

Drupal module
how embed
 * separate module
   * iframe
   * as jQuery UI

+ves
 * with Scratchpads
   * integrated

-ves
 * upgrade issues
   * tied to Drupal
 * less flexible than jQuery UI
   * tied to Drupal
 * possibly harder to integrate with GG back end
 * not have standalone editor
   * unless develop outside Drupal... which rather defeats the point

generic +ve
OS independent
 * need to ensure browser independent too
   * if possible

easier to adapt to other devices than desktop app would be
 * thinking long term mobile use

mobile computing
for use on mobiles, phones, etc; possibly a citizen scientist at a bus stop could use a lesser device
 * not suitable for serious editing of a document
 * working on text fragments
 * not document level
 * max paragraph level
 * something for the long term
 * let's deliver functioning application first
 * before considering mobile versions

editor IDE plugins

 * eg eclipse
 * This is a nice idea but most taxonomists do not use, and are not comfortable with, IDEs. Therefore to attract the majority of users, this would require heavy customisation.

need base IDE first
 * doesn't seem to be a universally popular one in taxonomy
 * so how many IDEs to develop
 * not just tied to an IDE but to each version of an IDE, potentially.

Scratchpads
integration of GoldenGate via WP5 developed API

Pre-internet Z39.50
Superseded by REST-based SRU (Search/Retrieve via URL) and SOAP-based SRW (Search/Retrieve Web service).

Both:
 * use HTTP in place of the Z39.50 protocol
 * are interrogated using CQL (Contextual Query Language)
 * return results in XML.
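
To illustrate the three points above, an SRU request is an ordinary HTTP GET whose query string carries a CQL query. The sketch below only builds the URL. The base URL is a made-up placeholder, and the `dc.title` index is an assumption, since the indexes available depend on the server being queried.

```python
from urllib.parse import urlencode


def build_sru_url(base_url: str, cql_query: str, maximum_records: int = 10) -> str:
    """Build an SRU searchRetrieve URL carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",   # standard SRU operation name
        "version": "1.2",                # SRU protocol version requested
        "query": cql_query,              # the CQL query itself
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint; a real service would publish its own base URL.
url = build_sru_url("http://sru.example.org/catalogue", 'dc.title = "Apis mellifera"')
```

The server's reply to such a request would be an XML document, which can then be parsed with any standard XML library.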

OAI-PMH
Already in use by Pensoft and CiteBank so this is a good standard to use for citations when we want to talk to other services.

So whatever tool we develop must be capable of being used when we want to talk to these other services.
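
As a rough indication of what talking to an OAI-PMH service involves, the sketch below builds a `ListRecords` request and pulls Dublin Core titles out of a response. The base URL and the sample response are invented for illustration only; they are not real Pensoft or CiteBank endpoints or data.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return the dc:title values found in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//dc:title", NS)]


# Invented sample response, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An illustrative title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

titles = extract_titles(SAMPLE_RESPONSE)
```

A real harvester would also need to follow the `resumptionToken` that OAI-PMH uses to page through large result sets; that is omitted here for brevity.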

Recommendations
For maximum flexibility we prefer to implement web services. This will allow us to exploit the facilities of GoldenGate while providing a front end more in line with users' current expectations. The service will be accessible outside of Scratchpads too, so helping to establish a sustainable service through the potential for a larger user base.

The web service will also allow us more easily to exploit other web resources such as BHL and Plazi for information, as shown in the figure below.



Summary
There are many options available for interactive mark up tools to be added to the Scratchpads architecture. This opportunity arises from the open architecture of Scratchpads, and the Drupal environment they are built on.