
9.2. Protein motifs


The PROSITE database contains about 500 protein motifs, identified and documented from the SWISS-PROT protein database. A motif is a short sequence pattern, shared by a group of proteins, that is associated with a particular function or structure. The GCG program MOTIFS searches one or more protein sequences for PROSITE patterns.

$ Motifs

MOTIFs from what protein sequence(s) ? sw:kad1_human

What should I call the output file (* Kad1_Human.Motifs *) ?

Kad1_Human len: 194 .............

The output file shows the motif, the fitted pattern and the documentation for each PROSITE motif that matches. Frequently found patterns, such as post-translational modification sites, are NOT shown unless the /FREQ command line parameter is used.

9.2.1. Retrieving PROSITE documentation

MOTIFS works much like FINDPATTERNS: it reads its patterns from a file (PROSITE.PATTERNS). To see which patterns are available, fetch a copy of the pattern file.

$ FETCH prosite.patterns

Have a look at the file using the editor.

PROSITETOGCG of: D16:[Flat.Prosite]Prosite.Doc and D16:[Flat.Prosite]Prosite.Da*

Release 10.1 (4/93)

Name               Offset  Pattern                            ..  PDoc_Name

11s_Seed_Storage   1       NGx(D,E)2x2C(S,T)                      0284.PDoc
.
.
Adenylate_Kinase   1       (L,I,V,M,F,Y,W)3DG(F,Y)PRx3(N,Q)       0104.PDoc

The 11s_Seed_Storage pattern is the first in the file. Its documentation is in the file named in the last column, 0284.PDoc, which can also be fetched:

$ FETCH 0284.Pdoc

Again, have a look at the file using the editor (or $ Type).
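To get a feel for how the entries in the pattern file are read, the following Python sketch splits one entry line into its fields and translates the pattern into a regular expression. It is an illustration only, not how MOTIFS itself is implemented. It assumes that the four fields are separated by whitespace and handles only the notation visible in the two entries listed above: a bare letter is one residue, x is any residue, (A,B) is a choice of residues, and a trailing number is a repeat count. The function names and the test sequence are hypothetical; the sequence is a made-up string, not a real protein.

```python
import re


def parse_entry(line):
    """Split one entry line into (name, offset, pattern, pdoc_name).

    Assumes exactly four whitespace-separated fields, as in the
    entries shown in this section.
    """
    name, offset, pattern, pdoc_name = line.split()
    return name, int(offset), pattern, pdoc_name


def gcg_pattern_to_regex(pattern):
    """Translate the simple pattern notation into a Python regex string."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "(":
            # (D,E) means "D or E": becomes the character class [DE]
            close = pattern.index(")", i)
            parts.append("[" + pattern[i + 1:close].replace(",", "") + "]")
            i = close + 1
        elif ch == "x":
            # lowercase x means any residue
            parts.append(".")
            i += 1
        elif ch.isdigit():
            # a number repeats the preceding element that many times
            j = i
            while j < len(pattern) and pattern[j].isdigit():
                j += 1
            parts.append("{" + pattern[i:j] + "}")
            i = j
        else:
            # a bare letter is a literal residue
            parts.append(ch)
            i += 1
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (1-based start position, matched text) for every match."""
    regex = re.compile(gcg_pattern_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence.upper())]


# Entry line copied from the listing above
entry = "Adenylate_Kinase 1 (L,I,V,M,F,Y,W)3DG(F,Y)PRx3(N,Q) 0104.PDoc"
name, offset, pattern, pdoc_name = parse_entry(entry)

# Made-up test sequence (hypothetical, not a real protein)
toy_sequence = "AAALIVDGFPRAAANKK"
hits = find_motif(pattern, toy_sequence)
print(pdoc_name, hits)  # prints: 0104.PDoc [(4, 'LIVDGFPRAAAN')]
```

The last field of each entry is the name to give to FETCH when you want the documentation for a motif that was found.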

