The PROSITE database contains about 500 protein motifs, identified and documented from the SWISS-PROT protein database. A motif is a partial sequence, found in a group of proteins, that is associated with a particular function or structure. The GCG program MOTIFS searches one or more sequences for PROSITE patterns.
$ Motifs
MOTIFs from what protein sequence(s) ? sw:kad1_human
What should I call the output file (* Kad1_Human.Motifs *) ?
Kad1_Human len: 194 .............
For each PROSITE motif that matches, the output file shows the motif, the fitted pattern and the documentation. Frequently found patterns, such as post-translational modification sites, are NOT shown unless the /FREQ command line parameter is used.
MOTIFS is very similar to FINDPATTERNS in that it reads its patterns from a file (PROSITE.PATTERNS). To see what patterns are available, retrieve the pattern file.
$ FETCH prosite.patterns
Have a look at the file using the editor.
PROSITETOGCG of: D16:[Flat.Prosite]Prosite.Doc and D16:[Flat.Prosite]Prosite.Da*
Release 10.1 (4/93)
Name Offset Pattern .. PDoc_Name
11s_Seed_Storage 1 NGx(D,E)2x2C(S,T) 0284.PDoc
.
.
Adenylate_Kinase 1 (L,I,V,M,F,Y,W)3DG(F,Y)PRx3(N,Q) 0104.PDoc
The 11s_Seed_Storage pattern is the first in the file. The documentation for that pattern is in the file 0284.PDoc (the name given in the PDoc_Name column), which can also be fetched:
$ FETCH 0284.Pdoc
Again, have a look at the file using the editor (or $ Type).
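The pattern notation in the PROSITE.PATTERNS listing can be read as follows: a letter is a single residue, x is any residue, a parenthesised list such as (D,E) is a choice of residues, and a trailing number repeats the preceding element. The Python sketch below translates that notation into a regular expression. It is not part of GCG; the function name gcg_to_regex and the test sequence are hypothetical, and only the subset of the syntax visible in the two patterns shown above is handled.

```python
import re

# One pattern element: a parenthesised choice such as (D,E), or a single letter.
_ELEMENT = re.compile(r"\(([A-Za-z,]+)\)|([A-Za-z])")
# Optional repeat count following an element, e.g. the 2 in x2.
_COUNT = re.compile(r"\d+")


def gcg_to_regex(pattern):
    """Translate a GCG-style pattern into a Python regular expression.

    Assumption: only the constructs seen in the listing are supported
    (letters, x, parenthesised choices, trailing repeat counts).
    """
    parts = []
    pos = 0
    while pos < len(pattern):
        match = _ELEMENT.match(pattern, pos)
        if match is None:
            raise ValueError("unsupported syntax at position %d: %r"
                             % (pos, pattern[pos:]))
        if match.group(1) is not None:
            # (D,E) becomes the character class [DE]
            element = "[" + match.group(1).replace(",", "").upper() + "]"
        elif match.group(2) in "xX":
            # x stands for any residue
            element = "."
        else:
            element = match.group(2).upper()
        pos = match.end()

        count = _COUNT.match(pattern, pos)
        if count is not None:
            # A trailing number repeats the preceding element
            element += "{" + count.group() + "}"
            pos = count.end()
        parts.append(element)
    return "".join(parts)


print(gcg_to_regex("NGx(D,E)2x2C(S,T)"))
# NG.[DE]{2}.{2}C[ST]

adenylate_kinase = gcg_to_regex("(L,I,V,M,F,Y,W)3DG(F,Y)PRx3(N,Q)")
print(adenylate_kinase)
# [LIVMFYW]{3}DG[FY]PRx... is written as [LIVMFYW]{3}DG[FY]PR.{3}[NQ]

# Made-up test sequence, not a real protein.
hit = re.search(adenylate_kinase, "AAALIVDGFPRAAAN")
print(hit.start() + 1, hit.group())
# 4 LIVDGFPRAAAN
```

The position is printed 1-based, on the assumption that sequence coordinates are counted from 1. The Offset column of the pattern file is not interpreted here.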