Scones and structScones Installation Instructions

From TechWiki


The Scones web service system (subject concepts or named entities) tags a target document with subject concepts and named entities. The tagging itself is performed by GATE, and a GATE XML annotation file is returned to the user.

structScones is a user interface that lets people send text documents to the Scones web service endpoint to tag them, review them, and index them within the structWSF instance.

Scones uses tomcat6, PHP/Java Bridge and GATE.


Installing & Configuring the Scones Web Service Endpoint

  1. Download and install the Scones web service endpoint into your structWSF instance.
  2. Install tomcat6
  3. Install Scones.war (Note: the Scones.war installation file is only available on demand due to its size)
    1. Scones.war is a packaged version of GATE and the PHP/Java Bridge, configured for Scones's purposes. It includes PHP/Java Bridge version 5.5.4.1 and GATE version 5.2.1 build 3581.
    2. To install Scones.war, make sure tomcat6 is running, locate tomcat's webapps folder (/var/lib/tomcat6/webapps/ on a default Ubuntu installation), copy Scones.war into that folder, and wait until tomcat installs and deploys the war file.
    3. Restart tomcat with this command so that it properly handles the new web application:
      1. JAVA_OPTS="-Djava.ext.dirs=/var/lib/tomcat6/webapps/Scones/plugins/gate-5.2.1-build3581-ALL/lib/ext -Dgate.plugins.home=/var/lib/tomcat6/webapps/Scones/plugins/gate-5.2.1-build3581-ALL/plugins" /etc/init.d/tomcat6 restart
    4. Properly set up the 50local.policy file on your server
      1. The most permissive option is to add the following line to that file:
        1. grant codeBase "file:/var/lib/tomcat6/webapps/Scones/WEB-INF/-" { permission java.security.AllPermission; };
      2. Restart tomcat6 by using the command in 3.3.1
  4. Create GATE application
    1. You can use the GATE Developer user interface to create a new GATE application to use with Scones.
      1. Create a new GATE pipeline, and save it into an XGAPP file.
      2. Modify the GATE application XGAPP file generated by the GATE Developer user interface for the Scones setup
        1. Edit the generated XGAPP file and make sure that all the paths to the different files (named entity dictionaries used by the gazetteer, ontologies, etc.) can be resolved on the server where Scones will be running. Also make sure that the paths to the GATE plugin files (in the /webapps/Scones/ folder) are properly defined.
  5. Configure the Scones web service endpoint
    1. Make sure the Scones web service endpoint is properly configured in the config.ini configuration file on the server.
    2. In the php.ini file, you will have to enable (turn "On") the allow_url_include directive.
    3. Make sure the port tomcat6 runs on (8080 by default) is not used by another application (such as Jetty)
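
The deployment, restart and port-check steps above can be collected into a small shell sketch. It assumes the default Ubuntu paths quoted in the steps. The function names are illustrative and not part of Scones, and the port check assumes that the `ss` utility is available on the server.

```shell
#!/bin/sh
# Sketch only: paths are the default Ubuntu locations quoted in the steps above.
TOMCAT_WEBAPPS="/var/lib/tomcat6/webapps"
GATE_DIR="$TOMCAT_WEBAPPS/Scones/plugins/gate-5.2.1-build3581-ALL"

# Build the JAVA_OPTS value used when restarting tomcat6 for Scones.
scones_java_opts() {
  printf '%s %s\n' \
    "-Djava.ext.dirs=$GATE_DIR/lib/ext" \
    "-Dgate.plugins.home=$GATE_DIR/plugins"
}

# Copy the war file ($1) into the webapps folder; tomcat6 then deploys it.
deploy_scones_war() {
  cp "$1" "$TOMCAT_WEBAPPS/"
}

# Restart tomcat6 with the GATE-specific options (needs root privileges).
restart_tomcat_for_scones() {
  JAVA_OPTS="$(scones_java_opts)" /etc/init.d/tomcat6 restart
}

# Read `ss -ltn` output on stdin; succeed if a socket listens on port $1.
port_in_use() {
  awk -v port="$1" 'NR > 1 && $4 ~ (":" port "$") { found = 1 } END { exit !found }'
}
```

For example, `ss -ltn | port_in_use 8080 && echo "port 8080 is already in use"` reports a conflict before tomcat6 is started.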

Configuring structScones

  1. Create a folder where structScones will be able to save the tagged files. This folder *has* to be accessible from the web. For example: /usr/share/drupal/scones.
    1. Make sure that this scones folder is writable by the web server process.
  2. Configure structScones' settings: http://my-domain.com/admin/settings/conStruct/scones
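
Creating the folder and making it writable can be sketched as follows. The `www-data` default is an assumption (it is the usual web server account on Debian and Ubuntu), and `prepare_scones_folder` is a hypothetical helper name; adjust both to your server.

```shell
#!/bin/sh
# Create the folder ($1) where structScones saves tagged files and make it
# writable by the web server account ($2).
# Assumption: "www-data" is the web server user; pass another one if needed.
prepare_scones_folder() {
  dir="$1"
  owner="${2:-www-data}"
  mkdir -p "$dir" &&
    chown "$owner" "$dir" &&
    chmod 755 "$dir"
}
```

With the example path above, this would be called as `prepare_scones_folder /usr/share/drupal/scones`, normally with root privileges because of the ownership change.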

Initialize Scones Web Service

The next step is to initialize the Scones web service endpoint. Each time tomcat6 is restarted, Scones has to be re-initialized as well. The initialization phase consists of creating the GATE threads that will be used (concurrently if needed) to analyze incoming texts.

Other Scones Web Service Tools

Two other tools are available to help manage the Scones web service endpoint.

If you want to re-initialize Scones without restarting tomcat, you can use the destroy.php script in the admin folder to destroy all the previously created sessions: http://my-domain.com/ws/scones/admin/destroy.php

Additionally, if you want to check which sessions are currently in use in the system, you can use the analyzeSessions.php script to get the status of the loaded sessions.

The output looks like:

Sessions ID: #1
Used: FALSE
Number of documents: 0


Sessions ID: #2
Used: FALSE
Number of documents: 0


Sessions ID: #3
Used: FALSE
Number of documents: 0

If Used is TRUE, the session is currently being used by a user, and Number of documents is the number of documents currently being tagged by that user. If Used is FALSE, the session is simply waiting to be used by an incoming request.
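
A report like the one above can be summarized with a short shell function. This is a minimal sketch that assumes the script's output has already been saved or piped as plain text in exactly the format shown; `count_sessions` is a hypothetical helper, not part of Scones.

```shell
#!/bin/sh
# Read an analyzeSessions.php report on stdin and print "<used> <total>".
# Assumption: the report is plain text in the format shown above.
count_sessions() {
  awk '
    /^Sessions ID:/ { total++ }
    /^Used: *TRUE/  { used++ }
    END { printf "%d %d\n", used, total }
  '
}
```

If the first number equals the second, every session is busy and new requests may have to wait.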

Integrating Scones with structOntology

In a structWSF and conStruct setup, all ontologies are managed by the structWSF ontologies endpoints and the conStruct structOntology module.

These two systems can easily be integrated with Scones by properly configuring the XGAPP application you created above. The only thing you have to do is modify your XGAPP application so that it refers to the files managed by structOntology and the structWSF ontologies web service endpoints.

These ontology files are located in the folder set by the ontologies_files_folder option in the [ontologies] section of the data.ini configuration file.

That folder is where all the ontology files manipulated by these systems reside on the server.
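
To find that folder from the command line, the option can be read out of data.ini with a small helper. This is a sketch that assumes a conventional `key = "value"` ini layout; `ini_get` is a hypothetical function, and the location of data.ini on your server has to be supplied by you.

```shell
#!/bin/sh
# Print the value of key $2 in section [$1] of the ini file $3.
# Assumption: one "key = value" pair per line, value optionally double-quoted.
ini_get() {
  awk -v section="[$1]" -v key="$2" '
    { line = $0; gsub(/^[ \t]+|[ \t\r]+$/, "", line) }
    line ~ /^\[.*\]$/ { in_section = (line == section); next }
    in_section && line !~ /^[;#]/ {
      pos = index(line, "=")
      if (pos == 0) next
      k = substr(line, 1, pos - 1)
      v = substr(line, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      gsub(/^"|"$/, "", v)
      if (k == key) { print v; exit }
    }
  ' "$3"
}
```

A call such as `ini_get ontologies ontologies_files_folder /path/to/data.ini` prints the folder whose files the XGAPP application should reference.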

Taking Ontology Changes Into Account

Changes to the ontologies used by Scones are not automatically applied to the running Scones instance. This means that if you change some of these ontologies using the structOntology module, the changes won't be taken into account when you tag a new document using Scones.

To make Scones take these changes into account, you have to perform the following steps:

  1. In structOntology, make sure to Save the ontologies that changed. This action writes the changes into the ontology files.
  2. Destroy the current Scones instance by using this script: http://my-domain.com/ws/scones/admin/destroy.php
  3. Re-create the Scones instance by using this script: http://my-domain.com/ws/scones/admin/init.php

Once you have performed these three steps, the next documents you tag using Scones will take the ontology changes into account.
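
Steps 2 and 3 can be run from the command line. The sketch below assumes that `curl` is installed and that the admin scripts answer plain HTTP GET requests; `my-domain.com` is the placeholder host used above, and the function names are illustrative.

```shell
#!/bin/sh
# Base URL of the Scones admin scripts ("my-domain.com" is a placeholder).
SCONES_ADMIN_URL="http://my-domain.com/ws/scones/admin"

# Call one admin script ($1) and fail on HTTP errors.
scones_admin() {
  curl --fail --silent --show-error "$SCONES_ADMIN_URL/$1"
}

# Destroy the current sessions, then create new ones.
# init.php is only called if destroy.php succeeded.
reinit_scones() {
  scones_admin destroy.php && scones_admin init.php
}
```

Running `reinit_scones` after saving the ontologies covers the destroy and re-create steps in one call.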

Integrating Scones With Named Entities Datasets

Scones not only tags concepts that come from an OWL ontology, but also tags named entities that are defined in named entity dictionaries.

These named entity dictionaries are specially formatted text files that are used by GATE. The best way to create them is to let structScones generate them for you.

The only thing you have to do is create datasets of records in structWSF. Then, in these datasets, tag every record that you consider to be a named entity that should be used to tag documents processed with Scones.

Tagging a record as a named entity is simple: add a triple that uses the sco:namedEntity attribute to the record, like this:

  <record-URI> sco:namedEntity "true" .

All records tagged that way will be added to some Scones named entity dictionaries and will be used for Scones tagging purposes.
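
When many records have to be tagged, the triples can be generated from a list of record URIs. This is a sketch only: `named_entity_triples` is a hypothetical helper, the `sco:` prefix is left undeclared exactly as in the line above, and loading the resulting triples into structWSF is not covered here.

```shell
#!/bin/sh
# Read record URIs on stdin (one per line) and print one
# sco:namedEntity triple per URI, in the form shown above.
named_entity_triples() {
  while IFS= read -r uri; do
    # Skip empty lines in the input list.
    [ -n "$uri" ] || continue
    printf '<%s> sco:namedEntity "true" .\n' "$uri"
  done
}
```

For instance, `named_entity_triples < record-uris.txt` prints one triple for each URI listed in the (hypothetical) file record-uris.txt.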

Taking Named Entities Changes Into Account

Changes to the named entities used to generate the Scones named entity dictionaries are not automatically applied to the running Scones instance. This means that if you change some of the named entities used by Scones, in their own datasets, these changes won't be taken into account when you tag a new document using Scones.

Consider this scenario: you change the preferred labels of some named entities in a few datasets, and you add a new dataset that contains named entities to the system. Here are the steps you have to perform to make these changes available to the next Scones action:

  1. Go to the structScones settings page.
  2. Check the "Recreate the Named Entities dictionaries upon settings save" checkbox and leave the others unchecked.
  3. Click the "Save Configuration" button and wait until the page reloads.
  4. Once the page has reloaded, run the two structWSF Scones administration scripts: destroy.php and init.php. This re-initializes the Scones instance with the modifications you made to the named entity dictionaries.

All the modifications you made to the named entity dictionaries will then be taken into account in subsequent Scones queries.
