[	 ]Biogridlet.class 25-Nov-02 10 kb
[HTML]Biogridlet.html 26-Nov-02 26 kb
[TXT]Biogridlet.java 26-Nov-02 19 kb
[	 ]Biogridlet.prop 26-Nov-02 964 b

Class Biogridlet
public class Biogridlet
extends java.lang.Object

Biogridlet - basic biogrid toolkit component

A basic directory access component for bioinformatics grids.

Simple test of a "Gridlet" for bio data directory access. For each compute node on your test grid, do this:

  1. Install, test, or locate the NCBI BLAST software (not yet available as a gridlet) and set bl to the directory that holds blastall and formatdb, e.g. bl=/path/to/blastall
  2. Download the Biogridlet .class and .prop files and edit the properties in Biogridlet.prop to taste, especially the QUERY selection. Make sure a Java 1.3+ runtime is available.
  3. Find a query biosequence in fasta format to test with. A sample query set can be fetched with:
    java Biogridlet count=100 ldap://bio-mirror.net:3895/srv=srs out=query \
    'query=(lib=genbank)(org=Anopheles gambiae)' 
  4. Use Biogridlet to copy a databank subset to each node and run blast:
    1. node1:
      java Biogridlet start=0 count=1000 | $bl/formatdb -i stdin -p F -o T -n databank1 
      $bl/blastall -p blastn -d databank1 -i query -m 8 -o databank1.out
      
    2. node2:
      java Biogridlet start=1000 count=1000 | $bl/formatdb -i stdin -p F -o T -n databank2  
      $bl/blastall -p blastn -d databank2 -i query -m 8 -o databank2.out
      
    3. node3 .. n
  5. Copy blast results from each node and assemble to full result (yet to do; see NBLAST)
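The per-node commands in step 4 differ only in the start offset and the databank name, so they can be generated rather than typed. The following is a minimal sketch, not part of Biogridlet; the class name GridPlan and its commands method are hypothetical, and it assumes every node takes an equal-sized slice of the query result.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Biogridlet): prints the step 4 commands. */
class GridPlan {

    /** Builds the formatdb and blastall command lines for each node. */
    static List<String> commands(int nodes, int perNode) {
        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < nodes; i++) {
            int start = i * perNode;           // node1 starts at 0, node2 at perNode, ...
            String db = "databank" + (i + 1);  // databank1, databank2, ...
            lines.add("java Biogridlet start=" + start + " count=" + perNode
                    + " | $bl/formatdb -i stdin -p F -o T -n " + db);
            lines.add("$bl/blastall -p blastn -d " + db
                    + " -i query -m 8 -o " + db + ".out");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Three nodes, 1000 databank entries each, as in the example above.
        for (String line : commands(3, 1000)) {
            System.out.println(line);
        }
    }
}
```

Running it with three nodes prints the node1 and node2 commands shown above, followed by the corresponding pair for node3 (start=2000, databank3).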
From a few quick tests, the runtime for this grid example is approximately the time a single computer needs to search the full databank, divided by the number of nodes (and subset databanks) you use.
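Step 5 is marked as yet to do. As one possible approach, and not the NBLAST method, the tabular files written by blastall -m 8 can be concatenated and re-sorted. The sketch below assumes the standard 12-column tab-separated layout of -m 8 output, with the query id in column 1 and the bit score in column 12; the class name BlastMerge is hypothetical. It sorts on bit score rather than E-value because each node computes E-values against its own subset databank, so they are not directly comparable to those of a full-databank search.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not part of Biogridlet): merges "-m 8" result files. */
class BlastMerge {

    /** A hit line has at least 12 tab-separated fields and is not a comment. */
    static boolean isHit(String line) {
        return !line.startsWith("#") && line.split("\t").length >= 12;
    }

    /** Bit score is assumed to be the 12th column of "-m 8" output. */
    static double bitScore(String line) {
        return Double.parseDouble(line.split("\t")[11].trim());
    }

    /** Merges hits from all nodes: grouped by query id, best bit score first. */
    static List<String> merge(List<List<String>> perNode) {
        List<String> all = new ArrayList<String>();
        for (List<String> node : perNode) {
            for (String line : node) {
                if (isHit(line)) {
                    all.add(line);
                }
            }
        }
        Collections.sort(all, new Comparator<String>() {
            public int compare(String a, String b) {
                int byQuery = a.split("\t")[0].compareTo(b.split("\t")[0]);
                if (byQuery != 0) {
                    return byQuery;
                }
                return Double.compare(bitScore(b), bitScore(a)); // descending score
            }
        });
        return all;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java BlastMerge databank1.out databank2.out ...
        List<List<String>> perNode = new ArrayList<List<String>>();
        for (String file : args) {
            perNode.add(Files.readAllLines(Paths.get(file)));
        }
        for (String line : merge(perNode)) {
            System.out.println(line);
        }
    }
}
```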

Gridlet defined

From Jan K. Labanowski: Computational Portals for Chemistry Gridlets and XMLets
I found a new word gridlet in papers by Rajkumar Buyya and Manzur Murshed from Monash University: http://www.csse.monash.edu.au/~rajkumar/. By gridlet they understand the tiny GridApp that contains all information related to jobs and job execution management details such as jobs processing requirements.

Note: this should become part of existing package iubio.grid used by BioGridRunner -- maybe subpackage iubio.grid.gridlet ?

Design



Notes

For now, security and authentication will wait, as will other Globus-type grid components. The design targets a restricted list of applications that can be run.

Author:
d.gilbert, nov 2002, gilbertd@bio.indiana.edu

Field Summary
static java.lang.String ATTRIBUTES
          properties key: which object fields to return = all, others are defined in http://iubio.bio.indiana.edu/biogrid/directories/schema/bioseq.schema
static java.lang.String COUNT
          properties key: number of objects to retrieve
static java.lang.String DEBUG
           
static java.lang.String DN
          url component
static java.lang.String EXTENSIONS
          properties key: ldap extension controls, sizelimit=10, timelimit=1000 being useful
static java.lang.String FILE
          url component
static java.lang.String FORMAT
          properties key: result biosequence format limited choices now: fasta, native (e.g. genbank, embl, swissprot, other biosequence formats)
static java.lang.String HOST
          url component
static java.lang.String LISTDN
          properties key: output control; listdn=false for no name
static java.lang.String LISTKEY
          properties key: output control; listkey=false for no field key
static java.lang.String LISTVAL
          properties key: output control; listval=false for no field value
static java.lang.String OBJECT
          properties key: objectClass to search/retrieve default * gets query summary, objectClasses should be defined in http://iubio.bio.indiana.edu/biogrid/directories/schema/bioseq.schema
static java.lang.String OUTPUT
          properties key: output file, standard output is default
static java.lang.String PATH
          url component
static java.lang.String PORT
          url component
static java.lang.String PROPERTIES
          default properties file
static java.lang.String PROTOCOL
          url component
static java.lang.String QUERY
          properties key: query for databank, data field, etc. to search, ldap query syntax for now
static java.lang.String REF
          url component
static java.lang.String SCOPE
          properties key: search scope, option, sub is default
static java.lang.String START
          properties key: start object number to retrieve from query result
static java.lang.String TITLE
          visible title for url
static java.lang.String URL
          properties key: data directory url; must have
 
Constructor Summary
Biogridlet()
           
Biogridlet(java.util.Properties env)
           
 
Method Summary
(package private)  boolean ldapsearch(java.util.Properties env)
           
(package private)  boolean ldapsearch(java.util.Properties env, java.lang.String ldapurl, java.lang.String scope, java.lang.String filter, java.lang.String[] attr, long sizelimit, int timelimit, boolean derefLinks)
           
(package private)  boolean ldapsearch(java.util.Properties env, java.lang.String ldapurl, java.lang.String scope, java.lang.String filter, java.lang.String[] attr, java.lang.String[] extn)
           
static void main(java.lang.String[] args)
          run with Biogridlet.prop properties and/or command-line options
 java.util.Properties parseUrl(java.lang.String url, java.util.Properties h)
           
 void retrieve()
           
 boolean search(java.lang.String url)
           
 void setProperties(java.util.Properties env)
           
 void usage()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

URL

public static final java.lang.String URL
properties key: data directory url; must have

PROTOCOL

public static final java.lang.String PROTOCOL
url component

HOST

public static final java.lang.String HOST
url component

PORT

public static final java.lang.String PORT
url component

DN

public static final java.lang.String DN
url component

PATH

public static final java.lang.String PATH
url component

FILE

public static final java.lang.String FILE
url component

REF

public static final java.lang.String REF
url component

SCOPE

public static final java.lang.String SCOPE
properties key: search scope, option, sub is default

QUERY

public static final java.lang.String QUERY
properties key: query for databank, data field, etc. to search, ldap query syntax for now
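
Since the query uses LDAP filter syntax, it can help to build filters programmatically so that special characters in values are escaped. The sketch below follows the standard LDAP string form, in which several terms are wrapped as (&(a=1)(b=2)). The sample query on this page simply concatenates two terms; whether Biogridlet or its server adds the AND itself is not stated here, so treat the wrapping as an assumption. The class LdapFilter and its methods are hypothetical.

```java
/** Hypothetical helper (not part of Biogridlet): builds LDAP filter strings. */
class LdapFilter {

    /** Escapes the characters that are reserved inside a filter value. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** One equality term, e.g. (lib=genbank). */
    static String term(String attr, String value) {
        return "(" + attr + "=" + escape(value) + ")";
    }

    /** Joins terms with a logical AND: (&(a=1)(b=2)). */
    static String and(String... terms) {
        StringBuilder sb = new StringBuilder("(&");
        for (String t : terms) {
            sb.append(t);
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        // Prints (&(lib=genbank)(org=Anopheles gambiae))
        System.out.println(and(term("lib", "genbank"),
                               term("org", "Anopheles gambiae")));
    }
}
```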

START

public static final java.lang.String START
properties key: start object number to retrieve from query result

COUNT

public static final java.lang.String COUNT
properties key: number of objects to retrieve

OBJECT

public static final java.lang.String OBJECT
properties key: objectClass to search/retrieve default * gets query summary, objectClasses should be defined in http://iubio.bio.indiana.edu/biogrid/directories/schema/bioseq.schema

FORMAT

public static final java.lang.String FORMAT
properties key: result biosequence format limited choices now: fasta, native (e.g. genbank, embl, swissprot, other biosequence formats)

ATTRIBUTES

public static final java.lang.String ATTRIBUTES
properties key: which object fields to return = all, others are defined in http://iubio.bio.indiana.edu/biogrid/directories/schema/bioseq.schema

EXTENSIONS

public static final java.lang.String EXTENSIONS
properties key: ldap extension controls, sizelimit=10, timelimit=1000 being useful

TITLE

public static final java.lang.String TITLE
visible title for url

LISTDN

public static final java.lang.String LISTDN
properties key: output control; listdn=false for no name

LISTKEY

public static final java.lang.String LISTKEY
properties key: output control; listkey=false for no field key

LISTVAL

public static final java.lang.String LISTVAL
properties key: output control; listval=false for no field value

DEBUG

public static final java.lang.String DEBUG

OUTPUT

public static final java.lang.String OUTPUT
properties key: output file, standard output is default

PROPERTIES

public static final java.lang.String PROPERTIES
default properties file

Constructor Detail

Biogridlet

public Biogridlet()

Biogridlet

public Biogridlet(java.util.Properties env)

Method Detail

main

public static void main(java.lang.String[] args)
run with Biogridlet.prop properties and/or command-line options
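
How the properties file and the command line are combined is not documented on this page. One plausible reading of the sample commands (key=value words plus a bare URL) is that command-line words override file defaults. The sketch below illustrates that idea only; the class ArgsSketch, the method overlay, and the key name "url" are assumptions, not the actual Biogridlet code.

```java
import java.util.Properties;

/** Hypothetical sketch of layering command-line words over default properties. */
class ArgsSketch {

    /** Returns properties whose lookups fall back to the given defaults. */
    static Properties overlay(Properties defaults, String[] args) {
        Properties p = new Properties(defaults);
        for (String arg : args) {
            int eq = arg.indexOf('=');
            int scheme = arg.indexOf("://");
            if (scheme > 0 && (eq < 0 || scheme < eq)) {
                // A bare URL word such as ldap://host:port/srv=srs;
                // the key name "url" is an assumption.
                p.setProperty("url", arg);
            } else if (eq > 0) {
                // key=value; split at the first '=' so that filter values
                // like (lib=genbank) stay intact.
                p.setProperty(arg.substring(0, eq), arg.substring(eq + 1));
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("start", "0");
        Properties p = overlay(defaults, new String[] {
            "count=100",
            "ldap://bio-mirror.net:3895/srv=srs",
            "out=query",
            "query=(lib=genbank)(org=Anopheles gambiae)"
        });
        System.out.println(p.getProperty("url"));
        System.out.println(p.getProperty("query"));
        System.out.println(p.getProperty("start")); // falls back to the default
    }
}
```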

usage

public void usage()

setProperties

public void setProperties(java.util.Properties env)

search

public boolean search(java.lang.String url)

retrieve

public void retrieve()

ldapsearch

boolean ldapsearch(java.util.Properties env)

ldapsearch

boolean ldapsearch(java.util.Properties env,
                   java.lang.String ldapurl,
                   java.lang.String scope,
                   java.lang.String filter,
                   java.lang.String[] attr,
                   java.lang.String[] extn)

ldapsearch

boolean ldapsearch(java.util.Properties env,
                   java.lang.String ldapurl,
                   java.lang.String scope,
                   java.lang.String filter,
                   java.lang.String[] attr,
                   long sizelimit,
                   int timelimit,
                   boolean derefLinks)
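
The parameter types of this overload (long sizelimit, int timelimit, boolean derefLinks) line up with the constructor of javax.naming.directory.SearchControls in JNDI. This page does not say that Biogridlet uses JNDI, so the mapping below is an assumption; the class ControlsSketch is hypothetical. The scope names base, one, and sub are the ones used in LDAP URLs, and JNDI interprets the time limit in milliseconds.

```java
import javax.naming.directory.SearchControls;

/** Hypothetical sketch: mapping ldapsearch-style arguments onto JNDI. */
class ControlsSketch {

    /** Translates an LDAP URL scope name into a JNDI scope constant. */
    static int scopeOf(String scope) {
        if ("base".equalsIgnoreCase(scope)) {
            return SearchControls.OBJECT_SCOPE;
        }
        if ("one".equalsIgnoreCase(scope)) {
            return SearchControls.ONELEVEL_SCOPE;
        }
        return SearchControls.SUBTREE_SCOPE; // "sub" is the documented default
    }

    /** Bundles the search limits into a SearchControls object. */
    static SearchControls controls(String scope, String[] attr,
                                   long sizelimit, int timelimit,
                                   boolean derefLinks) {
        // attr == null asks for all attributes; false = do not return objects.
        return new SearchControls(scopeOf(scope), sizelimit, timelimit,
                                  attr, false, derefLinks);
    }

    public static void main(String[] args) {
        // sizelimit=10, timelimit=1000, the values suggested for EXTENSIONS.
        SearchControls c = controls("sub", null, 10L, 1000, false);
        System.out.println(c.getCountLimit() + " " + c.getTimeLimit());
    }
}
```

The resulting object would then be passed to a DirContext search call together with the filter string; no network connection is made in this sketch.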

parseUrl

public java.util.Properties parseUrl(java.lang.String url,
                                     java.util.Properties h)
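
The method has no description here, but the signature and the "url component" fields (PROTOCOL, HOST, PORT, DN, PATH, FILE, REF) suggest that it splits a directory URL into properties. The sketch below shows that idea with java.net.URI. It is not the actual implementation: the class UrlSketch and the lowercase key names are assumptions, since the values of the real constants are not shown on this page, and the ?attributes?scope?filter part of a full LDAP URL is left unparsed.

```java
import java.net.URI;
import java.util.Properties;

/** Hypothetical sketch of splitting a directory URL into properties. */
class UrlSketch {

    /** Stores URL components in h (or in a new Properties if h is null). */
    static Properties parseUrl(String url, Properties h) {
        Properties p = (h != null) ? h : new Properties();
        URI u = URI.create(url); // throws IllegalArgumentException if malformed
        if (u.getScheme() != null) {
            p.setProperty("protocol", u.getScheme());
        }
        if (u.getHost() != null) {
            p.setProperty("host", u.getHost());
        }
        if (u.getPort() >= 0) {
            p.setProperty("port", String.valueOf(u.getPort()));
        }
        String path = u.getPath();
        if (path != null && path.length() > 1) {
            p.setProperty("path", path);
            // In an LDAP URL the distinguished name follows the first '/'.
            p.setProperty("dn", path.substring(1));
        }
        if (u.getFragment() != null) {
            p.setProperty("ref", u.getFragment());
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = parseUrl("ldap://bio-mirror.net:3895/srv=srs", null);
        System.out.println(p.getProperty("host") + " "
                + p.getProperty("port") + " " + p.getProperty("dn"));
    }
}
```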