WP7UseCases

= Use case 1: User-contributed PDFs =

The following summarises a discussion at the WP6/WP7 Workshop held at the Natural History Museum, London, in June 2011.

How can someone accumulate their own scans of a subject area, as Donat has done for ants, and then share their collection with other people who might want to use it? Most taxonomists I (Dave Roberts) know have extensive photocopy collections of literature relating to their organisms. (Is this a twentieth-century view? Do twenty-first-century taxonomists have extensive collections of PDFs? - David Morse)

Access to OCR is one issue, but for sustainability let's assume we're looking at text-over-graphics PDFs (in other words, PDFs which consist of page images and searchable text, very like the DjVu file format). To share those text-enhanced PDFs with your community you need a repository, for which we proposed Plazi. Then we use the mark-up tools to work on segments of the document, rather than the whole document, which means we need a way of cutting the book-sized work into sections at the PDF level. This cutting up should be done sensibly, probably as a manual or semi-automated process, e.g. at the treatment level.
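
As a rough illustration of the semi-automated cutting step, suppose the taxonomist has marked the page on which each treatment starts. The page ranges for the segments could then be derived as sketched below. The function name, the 1-based page numbering and the assumption that every treatment starts on a new page are all made up for this sketch; the actual splitting of the PDF would need a PDF library and is not shown.

```python
from typing import List, Tuple


def segment_page_ranges(start_pages: List[int], total_pages: int) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) page ranges, one per segment.

    start_pages: 1-based pages on which each segment (e.g. a treatment) begins.
    total_pages: number of pages in the whole document.

    Pages before the first start page (front matter) are not assigned
    to any segment in this simplified sketch.
    """
    # Sort and de-duplicate the manually marked start pages.
    starts = sorted(set(start_pages))
    if not starts:
        # Nothing marked: treat the whole document as a single segment.
        return [(1, total_pages)] if total_pages > 0 else []
    if starts[0] < 1 or starts[-1] > total_pages:
        raise ValueError("start pages must lie within the document")

    ranges = []
    for i, first in enumerate(starts):
        # A segment ends on the page before the next one starts,
        # or on the last page of the document.
        last = starts[i + 1] - 1 if i + 1 < len(starts) else total_pages
        ranges.append((first, last))
    return ranges
```

For a 20-page work with treatments marked as starting on pages 1, 5 and 12, `segment_page_ranges([1, 5, 12], 20)` gives three segments covering pages 1 to 4, 5 to 11 and 12 to 20.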

What might the service look like from the user-end? What are the implications for that at the storage and network levels?

(Contributed by Dave, refined by David.)

= Use case 2: Access to mark-up =



A working taxonomist wishes to apply mark-up to an existing text:
 * 1) The taxonomist logs in to the system, with account creation on first use.
 * 2) In a web form, the taxonomist either:
   * a) enters bibliographic reference data and a URL pointing to a PDF, or attaches a PDF; or
   * b) enters some details about a document and then searches for it to complete the web form, then follows the same process as in a).
 * 3) The reference data is checked for consistency; if it is inconsistent or incomplete, the taxonomist is prompted for a correction.
 * 4) After passing the consistency check, the reference is automatically passed to RefBank.
 * 5) Thereafter, work continues in GgWS in interactive mode. See the M7.16 report for more information on available services.
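
The consistency check in the steps above might look something like the following minimal sketch. The field names (`authors`, `year`, `title`, `pdf_url`, `pdf_attachment`) and the rules themselves are hypothetical, chosen only to illustrate the idea; they do not describe what RefBank or the GoldenGATE web services actually require.

```python
from typing import Dict, List

# Hypothetical set of required fields; the real check may require others.
REQUIRED_FIELDS = ("authors", "year", "title")


def check_reference(ref: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []

    # Completeness: every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not ref.get(field, "").strip():
            problems.append(f"missing {field}")

    # Consistency: a supplied year must be a four-digit number.
    year = ref.get("year", "").strip()
    if year and not (year.isdigit() and len(year) == 4):
        problems.append("year should be a four-digit number")

    # The form accepts either a URL pointing to a PDF or an attached PDF.
    has_url = bool(ref.get("pdf_url", "").strip())
    has_attachment = bool(ref.get("pdf_attachment", "").strip())
    if not (has_url or has_attachment):
        problems.append("provide a PDF URL or attach a PDF")

    return problems
```

A form handler could show the returned messages to the taxonomist as prompts for correction, and pass the reference on only once the list comes back empty.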

Also see Sautter, G., K. Böhm, C. Kühne, and T. Mathäß. ‘ProcessTron: Efficient Semi-automated Markup Generation for Scientific Documents’. In Proceedings of the 10th Annual Joint Conference on Digital Libraries, 21–28. ACM, 2010 (available from http://idaho.ipd.kit.edu/GoldenGATE/ProcessTron.pdf) for a description of the mark-up workflow using GoldenGATE in its previous, monolithic, desktop incarnation.

Because each reference that passes the consistency check is passed to RefBank automatically, the taxonomist also contributes to RefBank when using the GoldenGATE web services.

(Contributed by Guido, refined and documented by Dauvit.)