
2.13. Programs in the GCG package (release 7.3)


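Each value is the tabulated score for an amino acid substitution. The pair of amino acids involved is given in one-letter and three-letter codes, followed in parentheses by the type of codon change and then by the minimum number of base substitutions. The codon change types are: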

I = 1st position in codon

II = 2nd position

III = 3rd position

I/II = change in 1st AND 2nd position

I,II = change in 1st OR 2nd position

M = change in all positions (multiple change)

0 = no change

Value=1.4 (final column: minimum number of base substitutions)

F-Y Phe-Tyr (II) 1

R-W Arg-Trp (I,I/III) 1

Value=1.3

L-M Leu-Met (I,I/III) 1

F-W Phe-Trp (II/III) 2

Value=1.2

F-L Phe-Leu (I,III,I/III) 1 (several ways)

Value=1.1

B-D Asx-Asp (0,I,III,I/III) 0

B-N Asx-Asn (0,I,III,I/III) 0

E-Z Glu-Glx (0,I,III,I/III) 0

Q-Z Gln-Glx (0,I,III,I/III) 0

I-V Ile-Val (I,I/III) 1

W-Y Trp-Tyr (II/III) 2

Value=1.0

C-Y Cys-Tyr (II,II/III) 1

D-E Asp-Glu (III) 1

Value=0.9

D-Z Asp-Glx (III,I/III) 1

Value=0.8

I-L Ile-Leu (I,I/III) 1

K-R Lys-Arg (II,M) 1

L-V Leu-Val (I,I/III) 1

Value=0.7

A-G Ala-Gly (II,II/III) 1

B-E Asx-Glu (III,II/III) 1

C-S Cys-Ser (I,II/III,I/III) 1

D-G Asp-Gly (II,II/III) 1

D-N Asp-Asn (I,I/III) 1

D-Q Asp-Gln (I/III) 2

E-Q Glu-Gln (I,I/III) 1

F-I Phe-Ile (I,I/III) 1

H-Q His-Gln (III) 1

Value=0.6

B-G Asx-Gly (I,M) 1

B-Z Asx-Glx (III,I/III) 1

G-S Glu-Ser (I/II,M) 2

I-M Ile-Met (III) 1

M-V Met-Val (I) 1

Conclusion: values above 0.5 correspond to single-base changes, plus multiple substitutions which conserve function.

Value=0.5

A-P Ala-Pro (I,I/III) 1

B-Q Asx-Gln (I/III) 2

E-G Glu-Gly (II,II/III) 1

E-N Glu-Asn (I/III) 2

F-M Phe-Met (I/III) 2

H-N His-Asn (I,I/III) 1

H-R His-Arg (II,II/III,M) 1

H-Z His-Glx (III,I/III) 1

L-W Leu-Trp (II,M) 1

Value=0.0

A-I Ala-Ile (I/II) 2

A-K Ala-Lys (I/II,M) 2

A-M Ala-Met (I/II) 2

D-R Asp-Arg (M) 3

E-R Glu-Arg (I/II) 2

M-Q Met-Gln (I/II,M) 2

M-T Met-Thr (II) 1

N-P Asn-Pro (I/II,M) 2

S-Z Ser-Glx (I,M) 1

.

.

.

.

.

.

Value=-0.8

A-W Ala-Trp (I/II,M) 2

C-L Cys-Leu (I/II,M) 2

F-Q Phe-Gln (M) 3

P-W Pro-Trp (I/II,M) 2

P-Y Pro-Tyr (I/II,M) 2

V-W Val-Trp (M) 3

Z-W Glx-Trp (I/II,M) 2

Value=-1.0

D-F Asp-Phe (I/II) 2

G-W Gly-Trp (I) 1

Value=-1.1

D-W Asp-Trp (M) 3

E-W Glu-Trp (I/II) 2

Value=-1.2

C-W Cys-Trp (III) 1
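
The codon-change notation used in the table can be illustrated with a short sketch. It assumes the standard genetic code and the 20 standard one-letter amino acid codes (the ambiguity codes B and Z are not handled), and the function name `min_codon_change` is hypothetical. It reports only the patterns that achieve the minimum number of base changes, so it will not list every pattern shown in parentheses in the table, and because the table's own derivation is not stated, agreement with every tabulated row is not guaranteed.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
ROMAN = ("I", "II", "III")  # codon positions, as in the legend


def codons_for(aa):
    """Return all codons that encode the one-letter amino acid code aa."""
    return [codon for codon, a in CODON_TABLE.items() if a == aa]


def min_codon_change(aa1, aa2):
    """Return (minimum base changes, sorted patterns achieving that minimum).

    Patterns use the legend's notation: "0" for no change, "I", "II", "III"
    for single positions, "I/II" etc. for two positions, "M" for all three.
    """
    codons1, codons2 = codons_for(aa1), codons_for(aa2)
    if not codons1 or not codons2:
        raise ValueError("only the 20 standard amino acid codes are supported")
    best, patterns = None, set()
    for c1 in codons1:
        for c2 in codons2:
            changed = [ROMAN[i] for i in range(3) if c1[i] != c2[i]]
            n = len(changed)
            if n == 0:
                label = "0"
            elif n == 3:
                label = "M"
            else:
                label = "/".join(changed)
            if best is None or n < best:
                best, patterns = n, {label}
            elif n == best:
                patterns.add(label)
    return best, sorted(patterns)


if __name__ == "__main__":
    for pair in ("FY", "CW", "FW", "DW"):
        print(pair[0] + "-" + pair[1], min_codon_change(pair[0], pair[1]))
```

For example, `min_codon_change("F", "Y")` gives one change at the 2nd position, matching the F-Y row above.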


(Bold indicates that the program runs as a batch job; most have a corresponding interactive version. Use wisely!)

(* An asterisk indicates that an alternative version beginning with E, e.g. EAssemble, is available as a supplementary program.)

Assemble * : joins sequences together

BackTranslate : translates a peptide into a nucleotide sequence

BestFit : displays region of best alignment between two seqs

Chopup : breaks up a file with long records

Circles : a circular plot of RNA secondary structure from FOLD

CodonF * : tabulates codon frequencies

CodonP : plots codon similarity and frequency

CompTable * : creates a symbol comparison table

Compare : compares 2 sequences using word length

Composition : composition and di- and trinucleotide freqs

Compresstext : removes blank lines, spaces etc from files

Consensus * : creates a consensus table from pre-aligned seqs

Correspond * : finds similar patterns of codon choice

Corrupt : random mutation of nucleotide sequence

Count : counts characters, words and lines in file(s)

Crypt * : writes an encrypted (unreadable) file

Dataset : creates a database from any set of GCG sequences

Detab : replaces tabs in file with spaces

Distances : tabulates pair-wise distances within aligned sequences

Diverge * : measures the difference between 2 nucleotide seqs

DollarToUnder: replaces dollar ($) characters with underscores (_)

Domain : probability of a region of specified composition

Domes : draws a linear plot of folded RNA molecule, after FOLD

DotPlot : makes a 'dot plot' with file from COMPARE, STEMLOOP

EasyGCG : provides a menu for GCG programs

Echo : shows decimal value of each key press (ctrl-Y quits)

Enzymes : types info on available enzymes

Examine : counts the number of characters in each line of a file

ExtractPeptide *: creates a peptide sequence from MAP output

FastA : database search for similarity to a query sequence

Fetch : copies GCG data files to your directory

Figure : makes figures & posters of text & graphics

FindPatterns : searches through sequence(s) for short patterns

FingerPrint * : identifies labelled products of RNA fingerprint

FitConsensus : searches for the best examples of a consensus

Fold : optimal secondary structure for RNA molecule

Fonts : draws tables of all the software-generated fonts

Frames : plots the open reading frames in DNA sequences

FromEMBL : EMBL to GCG format

FromGenbank : GenBank to GCG format

FromIG : Intelligenetics to GCG format

FromPIR : PIR (NBRF) to GCG format

FromStaden * : Staden/Sanger to GCG format

Gap : optimal alignment of 2 sequences using gaps

GapShow : displays 2 sequences marking similarities

GelAssemble : multiple sequence editor for assembling contigs

GelDisassemble: breaks up the contigs in a fragment assembly project

GelEnter : adds fragments to a fragment assembly project (FAP)

GelOverlap : compares fragments and identifies overlaps in a FAP

GelStart : locks the gelassembly programs onto a particular FAP

GelMerge : automatically aligns the sequences in a FAP

GelView : summarises the structure of the contigs in a FAP

GenHelp : help information on each program in the GCG package

Genmanual : help information arranged by topic

GetSeq * : used by microcomputers to transfer sequences

GetText : similar to GetSeq

GHelp : same as GenHelp but no page waits

GKSterm : selects device for UNIGKS graphics output

HelicalWheel : plots peptide sequence as a helical wheel

Isoelectric : plots the charge of a peptide as a function of pH

KenTau : correlation using Kendall's Tau

Lineup : screen editor for aligning related sequences

ListFile : prints a file on printer attached to terminal

Map : shows restriction sites and possible translations

MapPlot : displays restriction maps graphically

MapSort : tabulates maps by fragment position and size

MFold : optimal and suboptimal RNA secondary structure

Moment : plots helical hydrophobic moment of a protein seq

Motifs : Sequence motif search using PROSITE

Mountains : plots an RNA secondary structure from FOLD output

Names : displays filenames in GCG data directories

NewGelStart : Initialises the new FAS to work.

OneCase : changes case of all characters in a file

OverPrint : prints each line of a file as often as specified

Overlap : compares set of DNA sequences to another set

PepData : peptide sequences from GenBank sequence files

PepPlot : plots measures of protein secondary structure

PeptideMap : creates a peptide map of a protein sequence

PeptideSort : tabulates data on peptides

PeptideStruct : secondary structure predictions for peptide sequences

PileUp : Multiple sequence alignments

IPileup : Interactive version of Pileup

PlasmidMap : draws circular plots of plasmid constructs

PlotFold : Graphically displays the output of MFold

PlotSimilarity : average similarity of sequences in a multiple alignment

PlotStructure : plots the output file from PeptideStructure

PlotTest : tests graphic functions used by package

Poster : plots text to help you to label plots/posters

Pretty : displays multiple sequence alignments

Profilemake : calculates a table (protein profile) of aligned sequences

ProfileGap : makes optimal alignment between a protein profile and protein sequence

ProfileScan : compares a sequence to known peptide seq motifs

ProfileSegments: makes optimal alignments from ProfileSearch output

ProfileSearch : uses a protein profile as probe to search protein databases

Publish * : arranges sequences for publication

Red : text formatter to print .RED files

Reformat : converts sequences and tables to GCG format

Repeat * : finds repeats in sequences

Replace : makes character string replacements in text files

Reverse * : reverses and/or complements a sequence

Sample : extracts random fragments from a sequence

Segments : finds where one sequence is similar to others

SeqDif : tabulates sequence differences by position

SeqEd : a screen-oriented sequence editor

Seqformat : makes GCG programs accept data in Staden or GCG format

Setkeys : redefines keys for SEQED, LINEUP, GELENTER, GELASSEMBLE

Shift : moves file to left or right given no. of columns

ShowFiles : creates a documented file of file names

Shuffle : randomizes a sequence

Simplify : simplification scheme for peptide comparisons

Spew : spews characters to a terminal micro-computer

Squiggles : plots RNA secondary structure from FOLD output

StatPlot * : plots columns of numbers in parallel

StemLoop : finds inverted repeats in nucleic acid seqs

Stringsearch : searches files for strings of characters

IStringsearch : interactive version of Stringsearch

TFastA : nucleic-acid database search for similarity to query peptide

Terminator * : searches for prokaryotic terminators

TestCode : identifies protein coding regions in DNA

ToFasta : GCG to FASTA/BLAST format

ToFitch : GCG to Fitch format

ToIG : GCG to Intelligenetics format

ToPIR : GCG to PIR (NBRF) format

ToStaden * : GCG to Staden format

Translate * : translates nucleic acid seqs into peptide seqs

TypeData : displays GCG data files on screen

Ugly : separates sequences in a file produced by PRETTY

Window * : frequencies of di- & tri-residue patterns in a sequence

WordSearch : finds where a sequence is similar to others

Zip : runs BESTFIT, with limits on the permitted gaps

2.13.1. Supplementary programs

These programs are not part of the 'official' package. They include local utilities as well as programs maintained by EMBL (also known as EGCG). Detailed descriptions can be found in EGENHELP.

Alltrans : translates a set of aligned nucleotides into protein

Antigenic : identifies potential antigenic regions (Kolaskar)

Basepairplot : plots percentage of di-nucleotides

Bfasta : a version of FASTA that uses the BLOSUM62 matrix

Btfasta : a version of TFASTA that uses the BLOSUM62 matrix

Cpgplot : plots freq of CpG di-nucleotides

Eprogram : several alternative versions of the main programs (asterisked in main list). These usually have command line control added to them.

EGenhelp : help information on each supplementary program

EGenmanual : help information arranged by topic

EGHelp : same as EGenhelp but no page waits

Fastacheck : selects sig. protein alignments in (T)FASTA output

FromFPS : extracts sequences from the Euprom database FPS file

GapAlign : Multiple sequence alignment using GAP

GCGToPhylip : reformats LINEUP data into Phylip format

GClustalV : Multiple sequence alignment of GCG seqs using Clustalv

GelAnalyze : reads Gelstatus output and produces statistics

GelFigure : plots a summary of a contig in a FAP

GelPicture : as Gelview, but also shows a 'Pretty' sequence display

GelStatus : summarises contig quality of the 'project'

Helixturnhelix : potential h-t-h matches in protein sequences

JCAlien : Multiple sequence alignment of GCG seqs using Alien.

Manuals : Creates postscript files of sections of the GCG manual

Mapselect : selects enzymes and writes them to a new enzyme.dat file

Melt : calculates the Tm and percent G+C of a nucleotide seq

Meltplot : plots the melting curve for a nucleotide sequence

Newfeatures : edits the feature table

Palindrome : identifies perfect inverted repeats in nuc seq

Pepallwindow : Pepwindow for multiple sequence alignments

Pepcoil : identifies coiled-coil regions (Lupas, van Dyke & Stock)

Pepnet : plots 2D helical representations of protein

Pepstats : statistical summary of a protein sequence

Pepwheel : plots periodic distribution of amino acids

Pepwindow : plots Kyte-Doolittle protein hydropathy

Plotalign : plots amino acid parameters in a multiple alignment

Prdsm : retrieves DNA sequence segments from the EPD file

PrettyBox : Postscript box-shading of multiple sequence alignments

PrettyPlot : boxed (graphical) display of multiple alignments

Sigcleave : potential protein signal cleavage sites

Statsearch : statistics on Wordsearch database searches

ToPrimer : formats a sequence for use with the non-GCG program PRIMER

Tprofilegap : makes optimal alignment between a protein profile and nucleotide sequence

Tprofilesearch : uses a protein profile as probe to search nucleotide databases

Tprofilesegments: makes optimal alignments from TProfileSearch output

Tsegments : As segments - but compares protein to nucleotide

Twordsearch : As wordsearch - but compares protein to nucleotide
