M711report

M7.11 Review of options for interactive mark up tools within the Scratchpad infrastructure

Introduction

This wiki page reviews the options for adding interactive mark up tools to the Scratchpads environment so that taxonomists can mark up literature.

users

Users / requirements - who are our users? And how are they going to carry out the task of interactive mark up of the taxonomic literature?

professional scientist

working

  • at desk
  • in library
  • not in the field (possibly in a coffee shop, a deck chair in the back garden, or on the train) so offline use would be an advantage, although this has implications for the implementation approaches used and the services that may be available.

process

  • possibly with original document in front of them (either hard copy or a scanned PDF) and laptop by their side

citizen scientists

This could include secondary and tertiary education, also keen amateurs and retired professionals.

N.B. This could include crowd-sourcing using software from the IMPACT project, the Australian Newspapers Online project or a citizen science version of GoldenGATE.

accessibility

not adding full support

nature of work requires users to have good sight, fine motor control, etc

think about how much accessibility is appropriate

workflow

the professional taxonomist should interact primarily with Scratchpads because that is the tool to support their processing of and sharing of data

mark up of text should not be an onerous additional task because if it is, it won't be done

Flowchart showing taxonomist's interaction with the Bibliography of Life

the mark up process should be enabled within Scratchpad

ideally as much as possible should be automated and the output automatically linked to a document store

the taxonomist must be able to manually revise the mark up to make corrections and to resolve ambiguities highlighted but not resolved by the automatic processes
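The split between automatic mark up and manual revision can be sketched as a small data model. This is only an illustration under stated assumptions: the `Annotation` class, its `status` values ("auto", "ambiguous", "confirmed") and the labels used are hypothetical names, not part of GoldenGATE or Scratchpads.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One marked-up span of text (hypothetical structure)."""
    start: int
    end: int
    label: str
    status: str = "auto"  # assumed values: "auto", "ambiguous", "confirmed"


def needs_review(annotations):
    """Return the spans the automatic process flagged but could not resolve."""
    return [a for a in annotations if a.status == "ambiguous"]


def revise(annotation, label):
    """Record the taxonomist's manual decision for one span."""
    annotation.label = label
    annotation.status = "confirmed"
    return annotation


text = "Apis mellifera was described by Linnaeus."
annotations = [
    Annotation(0, 14, "taxonName"),                    # resolved automatically
    Annotation(32, 40, "unknown", status="ambiguous"),  # left for the taxonomist
]

# The editor would show only the unresolved spans to the taxonomist.
for pending in needs_review(annotations):
    revise(pending, "authority")
```

Keeping the status on each annotation would let the editor present only the unresolved spans, so manual revision stays a small task rather than a second full pass over the document.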

environment

dedicated desktop app

generic issues

  • we should support offline usage if possible
  • to run on Windows, Mac & Unix
  • -ve
    • installation & support
    • upgrades
      • to software
      • and environment

GoldenGate

  • easier to reuse GoldenGate
  • -ve
    • Java + AWT/Swing
    • relatively poor visuals compared to modern web designs

Other annotation tools

There are several examples of annotation tools; these are often dedicated to particular tasks or literature, e.g. molecular biology. One of the better known annotation tools is GATE (General Architecture for Text Engineering).

  • e.g. GATE
    • +ve
      • modular design
      • easily extensible with other NLP tools e.g. LingPipe
      • plugins available e.g. OpenCalais
    • -ve
      • requires customisation to make it easier to use
      • generic package, would need customisation for taxonomic markup
  • Other annotation tools
    • give another example if we can find one
    • benefits and drawbacks similar to GATE

widget

  • too tied to a particular OS
  • not consider further
    • because multiple versions required
    • harder to implement than alternatives

web

Java applet

easier to reuse GoldenGate

clunky visuals

  • not as impressive as alternative RIA options
  • not meet user expectations (personal, anecdotal evidence)

unlikely to encounter old problems

  • Java not installed
  • Java disabled
  • but need to trap error if encountered

memory

  • Java is hungry
  • cope with large documents
  • & disk security issues

jQuery standalone UI

decouple front end and back end

  • we could reuse GoldenGate or any other services e.g. GATE as web services
  • should be more robust & versatile solution than desktop app

to work offline however...

  • need to develop standalone editor
    • download/upload file to work locally
  • or exploit HTML5's offline support
  • Other web services (e.g. GoldenGate tagging) not available while offline

rich visuals

can reuse OSS tools/modules

user always using latest version

  • ensure standalone (if developed) checks for updates when invoked

Drupal module

how to embed

  • separate module
  • iframe

+ves

  • as jQuery UI
  • with Scratchpads
    • integrated

-ves

  • upgrade issues
    • tied to Drupal
  • less flexible than jQuery UI
    • tied to Drupal
    • possibly harder to integrate with GG back end
    • not have standalone editor
      • unless develop outside Drupal... which rather defeats the point

generic +ve

OS independent

  • need to ensure browser independent too
    • if possible

easier to adapt to other devices than desktop app would be

  • thinking long term mobile use

mobile computing

for use on mobiles, phones, etc

  • not suitable for serious editing of a document

possibly citizen scientist at a bus stop

  • working on text fragments
  • not document level
  • max paragraph level

could use lesser device

  • something for the long term
  • let's deliver functioning application first
  • before considering mobile versions

editor IDE plugins

  • e.g. Eclipse
  • This is a nice idea, but most taxonomists do not use IDEs and are not comfortable with them. Attracting the majority of users would therefore require heavy customisation.

need base IDE first

  • doesn't seem to be a universally popular one in taxonomy
  • so how many IDEs to develop
  • not just tied to an IDE but to each version of an IDE, potentially.

Scratchpads

integration of GoldenGate via WP5 developed API

Connectivity

Pre-internet Z39.50

Superseded by REST-based SRU (Search/Retrieve via URL) and SOAP-based SRW (Search/Retrieve Web service).

Both:

  • use HTTP in place of the Z39.50 protocol
  • are interrogated using CQL (Contextual Query Language)
  • return results in XML.
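As a rough sketch of what an SRU exchange looks like, the example below builds a searchRetrieve URL carrying a CQL query and reads the XML reply. The base URL `http://example.org/sru`, the helper names and the sample response are assumptions made for illustration; the response is a simplified, hand-written document, not output from a real server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

SRW_NS = "http://www.loc.gov/zing/srw/"     # SRU 1.1/1.2 response namespace
DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements


def build_sru_url(base_url, cql_query, max_records=10):
    """Build an SRU 1.2 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (number of records reported, list of dc:title values)."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext(f"{{{SRW_NS}}}numberOfRecords"))
    titles = [el.text for el in root.iter(f"{{{DC_NS}}}title")]
    return count, titles


# Hand-written, simplified response used in place of a live server.
SAMPLE_RESPONSE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.2</version>
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">A revision of the Formicidae</dc:title>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>"""

# Hypothetical endpoint; a real client would fetch this URL over HTTP.
url = build_sru_url("http://example.org/sru", 'dc.title = "Formicidae"', max_records=5)
count, titles = parse_titles(SAMPLE_RESPONSE)
```

Because the request is an ordinary HTTP GET and the reply is XML, a browser-based front end or a back-end service could issue it equally well.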

OAI-PMH

Already in use by Pensoft and CiteBank so this is a good standard to use for citations when we want to talk to other services.

Whatever tool we develop must therefore be able to exchange citations over OAI-PMH with these services.
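A minimal sketch of harvesting over OAI-PMH follows, assuming a hypothetical endpoint at `http://example.org/oai`. The helper names and the sample page are illustrative only. The sketch shows the ListRecords verb and the resumption token that the protocol uses to page through large result sets.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers on this page, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(f"{{{OAI_NS}}}identifier")]
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    return identifiers, (token or None)


# Hand-written, simplified page used in place of a live repository.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = build_list_records_url("http://example.org/oai")
identifiers, token = parse_page(SAMPLE_PAGE)
# A harvester would keep requesting pages until no token is returned.
next_url = build_list_records_url("http://example.org/oai", resumption_token=token)
```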

Recommendations

For maximum flexibility we prefer to implement web services. This will allow us to exploit the facilities of GoldenGate while providing a front end more in line with users' current expectations. The service will be accessible outside of Scratchpads too, so helping to establish a sustainable service through the potential for a larger user base.

The web service will also allow us more easily to exploit other web resources such as BHL and Plazi for information, as shown in the figure below.

Flowchart showing taxonomist's interaction with the Bibliography of Life

Summary

There are many options available for adding interactive mark up tools to the Scratchpads architecture. This opportunity arises from the open architecture of Scratchpads and of the Drupal environment they are built on.
