Copy BLAST results from each node and assemble them into the full result (yet to do; see NBLAST)
The runtime cost for this grid example, from a few quick tests,
is approximately the time it takes to run on one computer with a
full databank, divided by the number of nodes and subset
databanks you use.
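A rough sketch of that estimate, assuming the databank is split evenly across nodes and ignoring the time to copy data and assemble results (both are assumptions, not something the quick tests measured). The class and method names are hypothetical.

```java
// Rough estimate only: assumes an even databank split across nodes and
// ignores data-copy and result-assembly overhead.
final class GridRuntimeEstimate {
    /** Estimated wall-clock seconds when one full-databank run is spread over nodeCount nodes. */
    static double estimateSeconds(double singleNodeSeconds, int nodeCount) {
        if (nodeCount < 1) {
            throw new IllegalArgumentException("nodeCount must be at least 1");
        }
        return singleNodeSeconds / nodeCount;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: a 600 s single-machine run split over 4 nodes.
        System.out.println(estimateSeconds(600.0, 4)); // prints 150.0
    }
}
```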
I found a new word, gridlet, in papers by Rajkumar Buyya and Manzur
Murshed of Monash University:
http://www.csse.monash.edu.au/~rajkumar/. By gridlet they
mean the tiny GridApp that contains all information related
to jobs and job execution management details such as jobs
Note: this should become part of existing package iubio.grid used by
-- maybe subpackage iubio.grid.gridlet ?
Design for a basic biogrid - partway between SETI@home and Globus grid methods
Data grid components - serve data via simple directory search/retrieval methods
Compute grid components - run applications, fetching data from Data grid components
Runner - client app used by biologist to select data, allocate cpu grid nodes,
run analysis, assemble results
Central/Home (?) - master directory services registry
for registering and identifying data and compute grid components
Biogridlet - Compute-Grid component handler
Biogridrun - basic controller app for running list of available programs (start w/ BLAST?)
Biodirectory - data & software directory services - for Data-Grid servers
-- ldap, http?
Biogridhome - ? central directory for listing & using data & compute grid components
-- ? split into 3 gridlets for cpu-node: node-manager, copy-url, run-app ?
... need simple execute() method to run programs
... need methods for handling Binary objects - gunzip for data
... possible use of java (.class,.jar), perl objects (script)
... should this become a 'screen-saver' type background app run on each compute node
with messaging to Biogridhome to keep node resources up to date?
... need to add different directory access choices: ldap, soap, http-cgi, others
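One possible shape for the execute() and gunzip needs noted above, as a minimal sketch using only the standard Java library. The class name GridletTools and the byte-array signatures are assumptions for illustration, not part of any existing iubio.grid code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; names are illustrative only.
final class GridletTools {
    /** Run an external program, passing its output through, and return its exit code. */
    static int execute(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.inheritIO(); // child stdout/stderr go to this process's streams
        return builder.start().waitFor();
    }

    /** Decompress gzip-compressed bytes (for example a fetched databank chunk). */
    static byte[] gunzip(byte[] gzipped) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compress bytes with gzip; included only so the sketch can be exercised round-trip. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

A real gridlet would probably stream to disk rather than hold a databank in memory, and would check the requested program against the restricted application list before calling execute().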
-- basic directory node server (data + applications)
-- use javaldap now for simplicity
-- ldap-srs interface for SRS data providers.
-- select compute nodes from compute grid list known to Biogridhome
--- need parameters of node: os flavor, cpu, disk, memory available at minimum
-- select application to run (for now from list provided by Biogridhome)
--- will include app binary url (no source compiles?)
-- select data from data grid, using search of data directories known to Biogridhome
-- foreach grid cpu node:
--- package up Biogridlet.class + message w/ app and data parameters
--- send to node & start Biogridlet on node
--- either poll node repeatedly, or have
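The node-selection and polling steps in this list could be sketched as follows. Everything here is hypothetical: the NodeSpec fields, the idea that a node's status can be checked through a simple boolean callback, and the example host names. Packaging and sending Biogridlet.class to the node is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names exist in the current code.
final class NodeSelection {
    /** What a compute node might report about itself to Biogridhome. */
    static final class NodeSpec {
        final String host;
        final String os;
        final int cpus;
        final long diskMb;
        final long memoryMb;

        NodeSpec(String host, String os, int cpus, long diskMb, long memoryMb) {
            this.host = host;
            this.os = os;
            this.cpus = cpus;
            this.diskMb = diskMb;
            this.memoryMb = memoryMb;
        }

        /** True if this node has the same OS flavor and at least the minimum resources. */
        boolean meets(NodeSpec minimum) {
            return os.equalsIgnoreCase(minimum.os)
                && cpus >= minimum.cpus
                && diskMb >= minimum.diskMb
                && memoryMb >= minimum.memoryMb;
        }
    }

    /** Keep only the known nodes that satisfy the minimum requirements. */
    static List<NodeSpec> select(List<NodeSpec> known, NodeSpec minimum) {
        List<NodeSpec> chosen = new ArrayList<>();
        for (NodeSpec node : known) {
            if (node.meets(minimum)) {
                chosen.add(node);
            }
        }
        return chosen;
    }

    /** Poll repeatedly until isDone reports true or maxPolls is used up. */
    static boolean pollUntilDone(BooleanSupplier isDone, int maxPolls, long sleepMillis) {
        for (int i = 0; i < maxPolls; i++) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<NodeSpec> known = Arrays.asList(
            new NodeSpec("node-a.example.org", "linux", 2, 4000, 512),
            new NodeSpec("node-b.example.org", "linux", 1, 500, 128));
        NodeSpec minimum = new NodeSpec("", "linux", 1, 1000, 256);
        // Only node-a has enough disk and memory for this minimum.
        for (NodeSpec node : select(known, minimum)) {
            System.out.println(node.host);
        }
    }
}
```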
-- mainly a central directory server, with dynamic registration and updates of
available data, software and compute grid nodes, their resources and allowed
users and application running choices
-- Biogridrun interacts w/ this, gets referrals to other data, app and cpu directories
-- use existing bio name service,
ldap://bio-mirror.net/ -- add new o=Biogrid ... ou=Bio,o=Grid ??
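A minimal sketch of how a client might query such a directory through the standard JNDI LDAP provider. The class name is hypothetical, and the base DN and filter passed in are the caller's choice; the o=Biogrid entry above is only a proposal, so nothing here assumes it exists on the server.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

// Hypothetical sketch of a directory lookup; not existing Biogridhome code.
final class BiogridDirectoryLookup {
    /** Build the JNDI environment for an anonymous LDAP connection. */
    static Hashtable<String, String> ldapEnvironment(String providerUrl) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, providerUrl);
        return env;
    }

    /** Print the full name of every entry under baseDn that matches filter. */
    static void listEntries(String providerUrl, String baseDn, String filter)
            throws NamingException {
        DirContext ctx = new InitialDirContext(ldapEnvironment(providerUrl));
        try {
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
            NamingEnumeration<SearchResult> results = ctx.search(baseDn, filter, controls);
            while (results.hasMore()) {
                System.out.println(results.next().getNameInNamespace());
            }
        } finally {
            ctx.close();
        }
    }
}
```

A call might look like listEntries("ldap://bio-mirror.net/", "o=Biogrid", "(objectClass=*)"), assuming the server is reachable and that branch has been added.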
For now, security & authentication will wait, as will the other globus-type components of a grid.
Design for restricted list of applications that can be run.
d.gilbert, nov 2002, email@example.com
properties key: which object fields to return
= all, others are defined in
properties key: number of objects to retrieve