GridBlast

  General Information

 


GridBlast consists of the following files:

GridBlast.pl: This is the main script file. It statically schedules the queries across the different nodes, tar-zips and stages the database, executables and query files to each remote node, and spawns the remote jobs using globusrun. The script is multi-threaded: each client node is serviced by a separate thread of execution.

head_node_script.pl: This is the remote node script file. It is spawned on the remote nodes with the globusrun command. It starts a GASS server on the remote node and then connects to the GASS server on the local node. Once the server-to-server connections are made, the script initiates the transfer of the necessary executable, database and query files. When the transfer is complete, it sets up the executables and, depending on whether the remote node has a single processor or several, spawns either repeated runs of blastall (the BLAST executable) or Scatter (the task-farming application for running high-throughput BLAST on a cluster).

The server program for Scatter in turn spawns the client program on the job manager for the remote node. A work-queue scheduler distributes the queries to the processors on each grid node. As each node completes its quota of queries, the results are "tar-zipped" again and copied back to the local node.
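
The work-queue idea described above can be illustrated with a minimal Python sketch. This is not Scatter's implementation, which is not shown here; the function names `run_work_queue` and `run_query` are hypothetical, and the worker simply calls a stand-in function instead of launching blastall. Each worker takes the next pending query as soon as it becomes free, so faster workers naturally process more queries.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_work_queue(queries: List[str], num_workers: int,
                   run_query: Callable[[str], str]) -> Dict[str, str]:
    """Distribute queries to workers through a shared queue (hypothetical helper)."""
    tasks: "queue.Queue[str]" = queue.Queue()
    for q in queries:
        tasks.put(q)

    results: Dict[str, str] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                q = tasks.get_nowait()   # take the next pending query
            except queue.Empty:
                return                   # queue drained: this worker is done
            out = run_query(q)           # stand-in for one BLAST run
            with lock:                   # protect the shared results dict
                results[q] = out

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Assumed query file names, for illustration only.
    done = run_work_queue(["q1.fa", "q2.fa", "q3.fa"], 2, lambda q: q + ".out")
    print(sorted(done.items()))
```

Pulling work from a queue, rather than assigning a fixed share to each processor up front, keeps all processors busy when individual queries take different amounts of time.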

clientfile: This file lists the client nodes that are to be used for the BLAST run. The format is simple: each line contains a node name, followed by the number of processors available on that node and the local scheduler on that node. If the node has no scheduler, enter "none".

For example:

some.server.edu 8 pbs
someother.server.edu 4 sge
yet.another.server 1 none
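
As a rough illustration of this format, the following Python sketch parses clientfile contents into records. GridBlast itself is a Perl script and its own parsing code is not shown here; the names `ClientNode` and `parse_clientfile` are hypothetical, and skipping blank lines is an assumption.

```python
from typing import List, NamedTuple, Optional


class ClientNode(NamedTuple):
    host: str                  # node name (first column)
    procs: int                 # number of processors (second column)
    scheduler: Optional[str]   # local scheduler, or None when the file says "none"


def parse_clientfile(text: str) -> List[ClientNode]:
    """Parse clientfile contents: one 'host procs scheduler' entry per line."""
    nodes = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:         # assumption: blank lines are ignored
            continue
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        host, procs, sched = fields
        scheduler = None if sched.lower() == "none" else sched
        nodes.append(ClientNode(host, int(procs), scheduler))
    return nodes


SAMPLE = """\
some.server.edu 8 pbs
someother.server.edu 4 sge
yet.another.server 1 none
"""

if __name__ == "__main__":
    for node in parse_clientfile(SAMPLE):
        print(node)
```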

scatter.job_orig: This file is used for spawning jobs on a cluster that runs the PBS job scheduler. Currently, PBS is the only scheduler the application supports on clusters. However, future versions will also support LSF and Condor (among others).

head_node_script.rsl_orig: This file is the RSL file for job submission using globusrun.

Node_Specs: This file specifies the parameters needed by the minmax algorithm. It is a simple data file with one row for each machine listed in the clientfile. The columns are formatted as follows:

<# of procs on node> <Size of file transferred> <Comm time/MB> <Exec time for one blast run>

The order of the nodes should be the same as in the clientfile.
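
To show how these four columns could feed a min-max style static schedule, here is a minimal Python sketch. The algorithm GridBlast actually uses is not described here, so everything below is an assumption: the cost model (staging time = file size x communication time per MB, plus one execution time per "round" of queries across the node's processors), the greedy assignment, the units, and the names `NodeSpec`, `parse_node_specs`, `finish_time` and `allocate`. The sample values are invented for illustration.

```python
import math
from typing import List, NamedTuple


class NodeSpec(NamedTuple):
    procs: int          # number of processors on the node
    transfer_mb: float  # size of file transferred (assumed to be in MB)
    comm_per_mb: float  # communication time per MB
    exec_time: float    # execution time for one BLAST run


def parse_node_specs(text: str) -> List[NodeSpec]:
    """Parse Node_Specs contents: four whitespace-separated columns per row."""
    specs = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        procs, size, comm, exe = fields
        specs.append(NodeSpec(int(procs), float(size), float(comm), float(exe)))
    return specs


def finish_time(spec: NodeSpec, queries: int) -> float:
    """Assumed cost model: staging time plus one exec_time per round of queries."""
    if queries == 0:
        return 0.0      # an unused node needs no staging
    rounds = math.ceil(queries / spec.procs)
    return spec.transfer_mb * spec.comm_per_mb + rounds * spec.exec_time


def allocate(specs: List[NodeSpec], total_queries: int) -> List[int]:
    """Greedy heuristic: give each query to the node that would finish earliest."""
    counts = [0] * len(specs)
    for _ in range(total_queries):
        best = min(range(len(specs)),
                   key=lambda i: finish_time(specs[i], counts[i] + 1))
        counts[best] += 1
    return counts


# Invented sample rows, in the same node order as the clientfile example.
SAMPLE_SPECS = """\
8 100 0.5 10
4 100 1.0 12
1 100 2.0 15
"""

if __name__ == "__main__":
    specs = parse_node_specs(SAMPLE_SPECS)
    counts = allocate(specs, 60)
    print(counts, max(finish_time(s, c) for s, c in zip(specs, counts)))
```

Under this model a node with a high staging cost receives no queries at all until the faster nodes are loaded enough that paying that cost lowers the overall finish time.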