GMOD is supported by a specific cooperative agreement from the USDA Agricultural Research Service, and by NIH grants co-funded from the National Human Genome Research Institute and the National Institute of General Medical Sciences.
GenBank HOWTO

This is a quick synopsis of the steps needed to initialize a GBrowse database from a GenBank record. For the purposes of illustration, we will use the RefSeq record for M. bovis, accession NC_002945.
Using the GBrowse in-memory database
1. Convert from GenBank format into GFF format

Download the GenBank record and convert it into GFF format. You can do this easily using the bp_genbank2gff.pl script, which is part of BioPerl:

    bp_genbank2gff.pl -stdout -accession NC_002945 > mbovis.gff

This will download the record for M. bovis (RefSeq NC_002945) and save it to the file mbovis.gff. If you already have the GenBank record available as a file named NC_002945.gb, you can convert it like this:

    bp_genbank2gff.pl -stdout -file NC_002945.gb > mbovis.gff

The newly converted file uses GFF3 format, which combines feature data with sequence/DNA data. This means that you do not need a separate FASTA file for the sequence.
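As a quick sanity check on the converted file, the sketch below counts features by type. It assumes a tab-separated, nine-column GFF layout with the feature type in column 3, and that any embedded sequence starts at a `##FASTA` directive or a `>` header line. The name `gff_feature_counts` is a hypothetical helper, not part of BioPerl.

```shell
#!/bin/sh
# Hypothetical helper: summarise feature types in a GFF file.
# $1 = path to the GFF file
gff_feature_counts() {
    awk -F '\t' '
        /^##FASTA/ || /^>/ { exit }   # stop where embedded sequence starts
        /^#/ || NF < 9     { next }   # skip directives, comments, short lines
        { count[$3]++ }               # column 3 holds the feature type
        END { for (t in count) printf "%s\t%d\n", t, count[t] }
    ' "$1" | LC_ALL=C sort
}

# Usage (assumes mbovis.gff from the conversion step):
# gff_feature_counts mbovis.gff
```

An empty result usually means the conversion produced no feature lines, which is worth investigating before installing the file.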
2. Install the GFF file into the databases directory

Copy this file into your in-memory GFF databases directory, as described in the tutorial. We will assume /usr/local/apache/htdocs/gbrowse/databases.

    mkdir /usr/local/apache/htdocs/gbrowse/databases/mbovis
    chmod o+rwx /usr/local/apache/htdocs/gbrowse/databases/mbovis
    cp mbovis.gff /usr/local/apache/htdocs/gbrowse/databases/mbovis
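The install commands above can be wrapped in a small function so the same steps are reusable for other genomes. This is only a sketch: `install_gff` and its argument order are hypothetical, and the permissions mirror the `chmod o+rwx` used in this HOWTO.

```shell
#!/bin/sh
# Hypothetical helper: install a GFF file into a per-genome subdirectory.
# $1 = GFF file, $2 = databases directory, $3 = data source name
install_gff() {
    gff=$1; base=$2; name=$3
    # Refuse to continue if the GFF file is missing.
    [ -f "$gff" ] || { echo "missing GFF file: $gff" >&2; return 1; }
    mkdir -p "$base/$name" || return 1
    chmod o+rwx "$base/$name" || return 1
    cp "$gff" "$base/$name/"
}

# Usage with the paths assumed in this HOWTO:
# install_gff mbovis.gff /usr/local/apache/htdocs/gbrowse/databases mbovis
```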
3. Set up the configuration file

Use the configuration file 08.genbank.conf as your starting template. This is located in contrib/conf_files:

    cp contrib/conf_files/08.genbank.conf /usr/local/apache/conf/gbrowse.conf/mb.conf
4. Edit the configuration file as appropriate

You will need to change the [GENERAL] section to use the in-memory adaptor and to point to the location of the M. bovis GFF file:

[GENERAL]
description = Mycobacterium Bovis In-Memory
db_adaptor = Bio::DB::GFF
db_args = -adaptor memory
              -dir /usr/local/apache/htdocs/gbrowse/databases/mbovis

You might also want to change the "examples" tag to introduce the accession number for the whole genome, and a few choice gene names and search terms:

examples = NC_002945 Mb1800 galT glucose

That's all there is to it. However, since this is a pretty big chunk of DNA (> 4 Mbp), the in-memory database uses a considerable amount of memory, and performance will be sluggish unless you have a fast machine with plenty of RAM. You might therefore wish to view it using a MySQL, PostgreSQL or Oracle database instead. The following are instructions for doing this.

Using a relational database

We will assume that you are using a MySQL database.
1. Create the database

Create the database using mysqladmin:

    mysqladmin create mbovis

As described in the tutorial, give yourself write permission for the database, and give the web server user (e.g. "nobody") select permission.
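The permissions step might look like the following sketch, which prints the SQL rather than running it so that it can be reviewed first. The helper name `gbrowse_grants_sql` and the loader account `gbrowse_admin` are hypothetical; `nobody` is the web server user assumed above. `GRANT ALL PRIVILEGES` is broader than plain write access and is used here only for simplicity. On MySQL 8.0 and later both accounts must already exist (via `CREATE USER`) before privileges can be granted.

```shell
#!/bin/sh
# Hypothetical helper: print GRANT statements for a GBrowse database.
# $1 = database, $2 = account that loads data, $3 = web server account
gbrowse_grants_sql() {
    cat <<EOF
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'localhost';
GRANT SELECT ON $1.* TO '$3'@'localhost';
EOF
}

# Usage: review the output, then pipe it to the mysql client as an
# administrative user (the root login here is an assumption):
# gbrowse_grants_sql mbovis gbrowse_admin nobody | mysql -u root -p
```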
2. Convert from GenBank format into GFF format and load it into the database

The bp_genbank2gff.pl script can download the accession, convert it into GFF and load the database directly in one smooth step:

    bp_genbank2gff.pl -create -dsn mbovis -accession NC_002945

If you prefer, you can do this in two steps by first creating the GFF file as described for the in-memory adaptor, and then using bp_bulk_load_gff.pl or bp_fast_load_gff.pl. If you are using a PostgreSQL or Oracle database, you must specify the appropriate adaptor to bp_genbank2gff.pl:

    bp_genbank2gff.pl -create -dsn mbovis -adaptor dbi::oracle -accession NC_002945
3. Set up the configuration file

Use the configuration file 08.genbank.conf as your starting template. This is located in contrib/conf_files:

    cp contrib/conf_files/08.genbank.conf /usr/local/apache/conf/gbrowse.conf/mb.conf
4. Edit the configuration file as appropriate

You will need to change the [GENERAL] section to use the appropriate database adaptor:

[GENERAL]
description = Mycobacterium Bovis Database
db_adaptor = Bio::DB::GFF
db_args = -adaptor dbi::mysql
              -dsn dbi:mysql:database=mbovis;host=localhost
              -user nobody
              -passwd ""

You might also want to change the "examples" tag to introduce the accession number for the whole genome, and a few choice gene names and search terms:

examples = NC_002945 Mb1800 galT glucose

That should be it!

NOTE

You can load as many accessions into the database as you like. Each one will appear as a "chromosome" named after the accession number of the entry.
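Loading several accessions could be scripted along these lines. The sketch only prints the commands so they can be inspected before running. It assumes that `-create` should be passed on the first load only, and that later loads without it add to the existing database; verify this against your BioPerl version. `genbank_load_commands` is a hypothetical helper and `ACCESSION_2` is a placeholder, not a real accession.

```shell
#!/bin/sh
# Hypothetical helper: print one bp_genbank2gff.pl command per accession.
# $1 = database name, remaining arguments = accession numbers
genbank_load_commands() {
    db=$1; shift
    create="-create "   # assumption: only the first load initialises the database
    for acc in "$@"; do
        printf 'bp_genbank2gff.pl %s-dsn %s -accession %s\n' "$create" "$db" "$acc"
        create=""
    done
}

# Usage: inspect the output, then pipe it to sh to execute it.
# genbank_load_commands mbovis NC_002945 ACCESSION_2
# genbank_load_commands mbovis NC_002945 ACCESSION_2 | sh
```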
cain@cshl.org