From lfk@ATHENA.MIT.EDU Thu Aug 16 18:52:35 1990
Received: from ATHENA.MIT.EDU by silver.ucs.indiana.edu
	(5.61+/9.2jsm) id AA13277; Thu, 16 Aug 90 18:52:29 -0500
Received: from E40-008-8.MIT.EDU by ATHENA.MIT.EDU with SMTP
	id AA02843; Thu, 16 Aug 90 19:48:48 EDT
From: lfk@ATHENA.MIT.EDU
Received: by E40-008-8.MIT.EDU (5.61/4.7) id AA01970; Thu, 16 Aug 90 19:16:04 -0400
Date: Thu, 16 Aug 90 19:16:04 -0400
Message-Id: <9008162316.AA01970@E40-008-8.MIT.EDU>
To: gilbertd@silver.ucs.indiana.edu
Subject: ProSearch Update -- New Version
Status: R



Note: You may have received two previous copies of this message; we
have been having problems with our mailer. I have also fixed some
small problems with the MSDOS batch files, so please use this version.
If you have received this more than once, I apologize for stuffing
your mailboxes.

	Enclosed is a Unix style shar (shell archive) containing the
new version of ProSearch. This version is needed for a bug fix and
makes the software easier to use on non-Unix systems.
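
	For anyone unpacking the shar from a saved copy of this message,
here is a minimal sketch of one way to do it. It is not part of
ProSearch; the function name strip_to_shar and the file names are only
illustrative assumptions. It keeps everything from the first line that
begins with "#! /bin/sh" onward, which is what the archive header asks
for.

```shell
# strip_to_shar: hypothetical helper, not shipped with ProSearch.
# Prints the saved mail message named by $1, starting at the first line
# that begins with "#! /bin/sh", so the mail headers and the cover note
# are dropped before the archive is handed to sh.
strip_to_shar() {
    sed -n '/^#! \/bin\/sh/,$p' "$1"
}

# Typical use (file names are assumptions):
#   strip_to_shar saved_message > prosearch.shar
#   sh prosearch.shar        # "sh prosearch.shar -c" overwrites existing files
```

Unpack in an empty directory if you can; the archive refuses to
overwrite existing files unless it is given -c.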

	If you get ReadSeq (D. Gilbert), ProSearch can handle any
sequence format ReadSeq knows about (many). The VMS command files
require ReadSeq.
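
	The ReadSeq step can be sketched roughly as follows. This
mirrors the commented-out readseq lines in the enclosed "pros" script
(the -f10 format code is copied from there); the function name
prepare_sequence and the READSEQ variable are assumptions made for
illustration, not something the scripts define.

```shell
# Sketch only. READSEQ names the ReadSeq program; override it if yours
# is installed under another name or path (an assumption, see above).
READSEQ=${READSEQ:-readseq}

# prepare_sequence FILE TMPFILE
# Converts FILE with ReadSeq into TMPFILE when ReadSeq can be found, and
# prints the name of the file the awk search should then read.
prepare_sequence() {
    if command -v "$READSEQ" >/dev/null 2>&1; then
        "$READSEQ" -f10 "$1" > "$2" && echo "$2"
    else
        echo "$1"    # no ReadSeq: search the file as given
    fi
}

# Possible use inside the per-file loop of "pros" (paths are assumptions):
#   input=`prepare_sequence "$file" /tmp/pros$$.tmp`
#   ${awk} -f ${prolib}/prosite.awk ${prolib}/prosite.regex "$input" |
#   ${awk} -f ${prolib}/prodoc.awk -
```

Without ReadSeq the sequence file is searched exactly as given, which
is what the scripts do as shipped.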

	Additionally, this release is covered by the GNU General Public
License. This allows anyone to do whatever they want with the code, as
long as it remains free.

	Finally, I am keeping a list of persons using ProSearch to
make this kind of update possible. If you do not wish to be on this
list, send me mail.

	I hope to have three versions of this release available at
the popular FTP sites:

	1) Unix shar (enclosed)
	2) UUencoded Zoo archive (MSDOS, Unix, VMS)
	3) DCL Shar (VMS specific)

If in the future you wish to receive a specific format (other than shar)
let me know.

As usual, please let me know of any bugs or desired features.


Frank Kolakowski 

======================================================================
|lfk@athena.mit.edu                     ||      Lee F. Kolakowski    |
|lfk@eastman2.mit.edu                   ||	M.I.T.		     |
|kolakowski@wccf.mit.edu                ||	Dept of Chemistry    |
|lfk@mbio.med.upenn.edu		        ||	Room 18-506	     |
|lfk@hx.lcs.mit.edu                     ||	77 Massachusetts Ave.|
|AT&T:  1-617-253-1866                  ||	Cambridge, MA 02139  |
|--------------------------------------------------------------------|
|                         #include <woes.h>         		     |
|		           One-Liner Here!                           |
======================================================================

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  COPYING INSTALL.msdos INSTALL.unix INSTALL.vms MANIFEST
#   prodoc.awk pros pros.1 pros.bat pros.com pros.nro prosearc.bat
#   prosearch prosearch.com prosearch.doc prosite.awk prosite.bug
#   prosite.regex
# Wrapped by lfk@e40-008-8 on Wed Aug 15 10:29:35 1990
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'COPYING' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'COPYING'\"
else
echo shar: Extracting \"'COPYING'\" \(12437 characters\)
sed "s/^X//" >'COPYING' <<'END_OF_FILE'
X		    GNU GENERAL PUBLIC LICENSE
X		     Version 1, February 1989
X
X Copyright (C) 1990 Lee F. Kolakowski
X                    
X Everyone is permitted to copy and distribute verbatim copies
X of this license document, but changing it is not allowed.
X
X			    Preamble
X
X  The license agreements of most software companies try to keep users
Xat the mercy of those companies.  By contrast, our General Public
XLicense is intended to guarantee your freedom to share and change free
Xsoftware--to make sure the software is free for all its users.  The
XGeneral Public License applies to the Free Software Foundation's
Xsoftware and to any other program whose authors commit to using it.
XYou can use it for your programs, too.
X
X  When we speak of free software, we are referring to freedom, not
Xprice.  Specifically, the General Public License is designed to make
Xsure that you have the freedom to give away or sell copies of free
Xsoftware, that you receive source code or can get it if you want it,
Xthat you can change the software or use pieces of it in new free
Xprograms; and that you know you can do these things.
X
X  To protect your rights, we need to make restrictions that forbid
Xanyone to deny you these rights or to ask you to surrender the rights.
XThese restrictions translate to certain responsibilities for you if you
Xdistribute copies of the software, or if you modify it.
X
X  For example, if you distribute copies of a such a program, whether
Xgratis or for a fee, you must give the recipients all the rights that
Xyou have.  You must make sure that they, too, receive or can get the
Xsource code.  And you must tell them their rights.
X
X  We protect your rights with two steps: (1) copyright the software, and
X(2) offer you this license which gives you legal permission to copy,
Xdistribute and/or modify the software.
X
X  Also, for each author's protection and ours, we want to make certain
Xthat everyone understands that there is no warranty for this free
Xsoftware.  If the software is modified by someone else and passed on, we
Xwant its recipients to know that what they have is not the original, so
Xthat any problems introduced by others will not reflect on the original
Xauthors' reputations.
X
X  The precise terms and conditions for copying, distribution and
Xmodification follow.
X
X		    GNU GENERAL PUBLIC LICENSE
X   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
X
X  0. This License Agreement applies to any program or other work which
Xcontains a notice placed by the copyright holder saying it may be
Xdistributed under the terms of this General Public License.  The
X"Program", below, refers to any such program or work, and a "work based
Xon the Program" means either the Program or any work containing the
XProgram or a portion of it, either verbatim or with modifications.  Each
Xlicensee is addressed as "you".
X
X  1. You may copy and distribute verbatim copies of the Program's source
Xcode as you receive it, in any medium, provided that you conspicuously and
Xappropriately publish on each copy an appropriate copyright notice and
Xdisclaimer of warranty; keep intact all the notices that refer to this
XGeneral Public License and to the absence of any warranty; and give any
Xother recipients of the Program a copy of this General Public License
Xalong with the Program.  You may charge a fee for the physical act of
Xtransferring a copy.
X
X  2. You may modify your copy or copies of the Program or any portion of
Xit, and copy and distribute such modifications under the terms of Paragraph
X1 above, provided that you also do the following:
X
X    a) cause the modified files to carry prominent notices stating that
X    you changed the files and the date of any change; and
X
X    b) cause the whole of any work that you distribute or publish, that
X    in whole or in part contains the Program or any part thereof, either
X    with or without modifications, to be licensed at no charge to all
X    third parties under the terms of this General Public License (except
X    that you may choose to grant warranty protection to some or all
X    third parties, at your option).
X
X    c) If the modified program normally reads commands interactively when
X    run, you must cause it, when started running for such interactive use
X    in the simplest and most usual way, to print or display an
X    announcement including an appropriate copyright notice and a notice
X    that there is no warranty (or else, saying that you provide a
X    warranty) and that users may redistribute the program under these
X    conditions, and telling the user how to view a copy of this General
X    Public License.
X
X    d) You may charge a fee for the physical act of transferring a
X    copy, and you may at your option offer warranty protection in
X    exchange for a fee.
X
XMere aggregation of another independent work with the Program (or its
Xderivative) on a volume of a storage or distribution medium does not bring
Xthe other work under the scope of these terms.
X
X  3. You may copy and distribute the Program (or a portion or derivative of
Xit, under Paragraph 2) in object code or executable form under the terms of
XParagraphs 1 and 2 above provided that you also do one of the following:
X
X    a) accompany it with the complete corresponding machine-readable
X    source code, which must be distributed under the terms of
X    Paragraphs 1 and 2 above; or,
X
X    b) accompany it with a written offer, valid for at least three
X    years, to give any third party free (except for a nominal charge
X    for the cost of distribution) a complete machine-readable copy of the
X    corresponding source code, to be distributed under the terms of
X    Paragraphs 1 and 2 above; or,
X
X    c) accompany it with the information you received as to where the
X    corresponding source code may be obtained.  (This alternative is
X    allowed only for noncommercial distribution and only if you
X    received the program in object code or executable form alone.)
X
XSource code for a work means the preferred form of the work for making
Xmodifications to it.  For an executable file, complete source code means
Xall the source code for all modules it contains; but, as a special
Xexception, it need not include source code for modules which are standard
Xlibraries that accompany the operating system on which the executable
Xfile runs, or for standard header files or definitions files that
Xaccompany that operating system.
X
X  4. You may not copy, modify, sublicense, distribute or transfer the
XProgram except as expressly provided under this General Public License.
XAny attempt otherwise to copy, modify, sublicense, distribute or transfer
Xthe Program is void, and will automatically terminate your rights to use
Xthe Program under this License.  However, parties who have received
Xcopies, or rights to use copies, from you under this General Public
XLicense will not have their licenses terminated so long as such parties
Xremain in full compliance.
X
X  5. By copying, distributing or modifying the Program (or any work based
Xon the Program) you indicate your acceptance of this license to do so,
Xand all its terms and conditions.
X
X  6. Each time you redistribute the Program (or any work based on the
XProgram), the recipient automatically receives a license from the original
Xlicensor to copy, distribute or modify the Program subject to these
Xterms and conditions.  You may not impose any further restrictions on the
Xrecipients' exercise of the rights granted herein.
X
X  7. The Free Software Foundation may publish revised and/or new versions
Xof the General Public License from time to time.  Such new versions will
Xbe similar in spirit to the present version, but may differ in detail to
Xaddress new problems or concerns.
X
XEach version is given a distinguishing version number.  If the Program
Xspecifies a version number of the license which applies to it and "any
Xlater version", you have the option of following the terms and conditions
Xeither of that version or of any later version published by the Free
XSoftware Foundation.  If the Program does not specify a version number of
Xthe license, you may choose any version ever published by the Free Software
XFoundation.
X
X  8. If you wish to incorporate parts of the Program into other free
Xprograms whose distribution conditions are different, write to the author
Xto ask for permission.  For software which is copyrighted by the Free
XSoftware Foundation, write to the Free Software Foundation; we sometimes
Xmake exceptions for this.  Our decision will be guided by the two goals
Xof preserving the free status of all derivatives of our free software and
Xof promoting the sharing and reuse of software generally.
X
X			    NO WARRANTY
X
X  9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
XFOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
XOTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
XPROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
XOR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
XMERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
XTO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
XPROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
XREPAIR OR CORRECTION.
X
X  10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
XWILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
XREDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
XINCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
XOUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
XTO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
XYOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
XPROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
XPOSSIBILITY OF SUCH DAMAGES.
X
X		     END OF TERMS AND CONDITIONS
X
X	Appendix: How to Apply These Terms to Your New Programs
X
X  If you develop a new program, and you want it to be of the greatest
Xpossible use to humanity, the best way to achieve this is to make it
Xfree software which everyone can redistribute and change under these
Xterms.
X
X  To do so, attach the following notices to the program.  It is safest to
Xattach them to the start of each source file to most effectively convey
Xthe exclusion of warranty; and each file should have at least the
X"copyright" line and a pointer to where the full notice is found.
X
X    <one line to give the program's name and a brief idea of what it does.>
X    Copyright (C) 19yy  <name of author>
X
X    This program is free software; you can redistribute it and/or modify
X    it under the terms of the GNU General Public License as published by
X    the Free Software Foundation; either version 1, or (at your option)
X    any later version.
X
X    This program is distributed in the hope that it will be useful,
X    but WITHOUT ANY WARRANTY; without even the implied warranty of
X    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X    GNU General Public License for more details.
X
X    You should have received a copy of the GNU General Public License
X    along with this program; if not, write to the Free Software
X    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X
XAlso add information on how to contact you by electronic and paper mail.
X
XIf the program is interactive, make it output a short notice like this
Xwhen it starts in an interactive mode:
X
X    Gnomovision version 69, Copyright (C) 19xx name of author
X    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
X    This is free software, and you are welcome to redistribute it
X    under certain conditions; type `show c' for details.
X
XThe hypothetical commands `show w' and `show c' should show the
Xappropriate parts of the General Public License.  Of course, the
Xcommands you use may be called something other than `show w' and `show
Xc'; they could even be mouse-clicks or menu items--whatever suits your
Xprogram.
X
XYou should also get your employer (if you work as a programmer) or your
Xschool, if any, to sign a "copyright disclaimer" for the program, if
Xnecessary.  Here a sample; alter the names:
X
X  Yoyodyne, Inc., hereby disclaims all copyright interest in the
X  program `Gnomovision' (a program to direct compilers to make passes
X  at assemblers) written by James Hacker.
X
X  <signature of Ty Coon>, 1 April 1989
X  Ty Coon, President of Vice
X
XThat's all there is to it!
X
END_OF_FILE
if test 12437 -ne `wc -c <'COPYING'`; then
    echo shar: \"'COPYING'\" unpacked with wrong size!
fi
# end of 'COPYING'
fi
if test -f 'INSTALL.msdos' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'INSTALL.msdos'\"
else
echo shar: Extracting \"'INSTALL.msdos'\" \(981 characters\)
sed "s/^X//" >'INSTALL.msdos' <<'END_OF_FILE'
XINSTALLATION for MSDOS
X
X1) Get the datafile ProSite.doc from NETSERV@EMBL.BITNET
X
X	This file must be mailed to you from the file
X	server.
X
X	Send a Mail message to the above address
X	with a subject line that says:
X
X	Subject: Get Prosite:Prosite.doc
X
X2) If you do not have a working version of Awk, FTP to 
X	WSMR-SIMTEL20.ARMY.MIL
X	(26.2.0.74)
X	Simtel is a DEC-20 so use tenex mode for binaries.
X	mget PD1:<MSDOS.AWK>GAWK*.*
X
X3) Place the two batch files (prosearc.bat and pros.bat) in your path.
X	Edit the line:
X
X	set prolib=\mit\lfk\lib\prosite 
X
X	to reflect where the awk scripts, the regular expression file
X	and the prosite.doc file will be kept. Then replace every
X	instance of 'awk' in the batch files with the path and name
X	of your implementation of the AWK language.
X
X4) If you get Readseq working on your system, uncomment the lines
X	in the scripts to use readseq. Readseq can be obtained from
X	iubio.bio.indiana.edu 129.79.1.101. This code requires an ANSI
X	C compiler.
X
END_OF_FILE
if test 981 -ne `wc -c <'INSTALL.msdos'`; then
    echo shar: \"'INSTALL.msdos'\" unpacked with wrong size!
fi
# end of 'INSTALL.msdos'
fi
if test -f 'INSTALL.unix' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'INSTALL.unix'\"
else
echo shar: Extracting \"'INSTALL.unix'\" \(906 characters\)
sed "s/^X//" >'INSTALL.unix' <<'END_OF_FILE'
XINSTALLATION for UNIX
X
X1) Get the datafile ProSite.doc from NETSERV@EMBL.BITNET
X
X	This file must be mailed to you from the file
X	server.
X
X	Send a Mail message to the above address
X	with a subject line that says:
X
X	Subject: Get Prosite:Prosite.doc
X
X2) If you do not have a working version of Awk, FTP to 
X	PREP.AI.MIT.EDU (18.71.0.38) and
X	get the file pub/gnu/gawk*
X
X
X3) Place the two scripts (prosearch and pros) in your path. Edit
X	the line:
X
X	prolib='/mit/lfk/lib/prosite' 
X
X	to reflect where the awk scripts, the regular expression file
X	and the prosite.doc file will be kept. Then edit the next line:
X
X        awk=gawk
X
X	to reflect the path and name of your implementation of the AWK
X	language.
X
X4) If you get Readseq working on your system, uncomment the lines
X	in the scripts to use readseq. Readseq can be obtained from
X	iubio.bio.indiana.edu 129.79.1.101. This code requires an ANSI
X	C compiler.
X
END_OF_FILE
if test 906 -ne `wc -c <'INSTALL.unix'`; then
    echo shar: \"'INSTALL.unix'\" unpacked with wrong size!
fi
# end of 'INSTALL.unix'
fi
if test -f 'INSTALL.vms' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'INSTALL.vms'\"
else
echo shar: Extracting \"'INSTALL.vms'\" \(1238 characters\)
sed "s/^X//" >'INSTALL.vms' <<'END_OF_FILE'
XINSTALLATION for VMS
X
X1) Get the datafile ProSite.doc from NETSERV@EMBL.BITNET
X
X	This file must be mailed to you from the file
X	server.
X
X	Send a Mail message to the above address
X	with a subject line that says:
X
X	Subject: Get Prosite:Prosite.doc
X
X2) If you do not have a working version of Awk, FTP to 
X	RML2.SRI.COM (128.18.22.20) and
X	get the file getting_gawk. There are instructions for
X	getting the backup save set containing the VMS implementation
X	of gawk.
X
X3) Place the two command files (prosearch.com and pros.com) in a directory.
X	Edit the lines 
X
X	$ prosearch_awk = "GCGMITPROSITE:prosite.awk"
X	$ prodoc_awk = "GCGMITPROSITE:prodoc.awk"
X	$ prosite_doc = "gengenbankdisk:[prosite]prosite.doc"
X	$ prosite_regex ="gengenbankdisk:[prosite]prosite.regex"
X
X	to reflect where the awk scripts, the regular expression file
X	and the prosite.doc file will be kept. Then replace every
X	instance of 'awk' in the command files with the path and name
X	of your implementation of the AWK language.
X
X4) You must get Readseq working on your system. Readseq can be
X	obtained from iubio.bio.indiana.edu 129.79.1.101. This code
X	requires an ANSI C compiler.
X
X5) I'd like to thank Anna Tomecka and Jasper Rees for writing both 
X	VMS command files.
X
END_OF_FILE
if test 1238 -ne `wc -c <'INSTALL.vms'`; then
    echo shar: \"'INSTALL.vms'\" unpacked with wrong size!
fi
# end of 'INSTALL.vms'
fi
if test -f 'MANIFEST' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'MANIFEST'\"
else
echo shar: Extracting \"'MANIFEST'\" \(717 characters\)
sed "s/^X//" >'MANIFEST' <<'END_OF_FILE'
XFile            Description
X====            ============
XCOPYING		Gnu Public License
XINSTALL.msdos	Details for MSDOS
XINSTALL.unix	Details for UNIX
XINSTALL.vms	Details for VMS
XMANIFEST	this file
Xprodoc.awk	awk script for data formatting
Xpros            short output script
Xpros.1          unformatted manual page
Xpros.bat	MSDOS batch file for short output
Xpros.com	VMS command file for short output
Xpros.nro        formatted manual page 
Xprosearc.bat	MSDOS batch file for long output
Xprosearch       long output script
Xprosearch.com	VMS command file for long output
Xprosearch.doc	background and info
Xprosite.awk	awk script for search
Xprosite.bug	Details of a small bug
Xprosite.regex	regular expression data for search
END_OF_FILE
if test 717 -ne `wc -c <'MANIFEST'`; then
    echo shar: \"'MANIFEST'\" unpacked with wrong size!
fi
# end of 'MANIFEST'
fi
if test -f 'prodoc.awk' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prodoc.awk'\"
else
echo shar: Extracting \"'prodoc.awk'\" \(1597 characters\)
sed "s/^X//" >'prodoc.awk' <<'END_OF_FILE'
X# prodoc.awk - release version 1.1
X# Copyright (C) 1990 Lee F. Kolakowski
X#
X# This program is free software; you can redistribute it and/or modify
X# it under the terms of the GNU General Public License as published by
X# the Free Software Foundation; either version 1, or (at your option)
X# any later version.
X#
X# This program is distributed in the hope that it will be useful,
X# but WITHOUT ANY WARRANTY; without even the implied warranty of
X# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X# GNU General Public License for more details.
X#
X# You should have received a copy of the GNU General Public License
X# along with this program; if not, write to the Free Software
X# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X#
X# 
X# Send bugs or improvements to
X# lfk@athena.mit.edu
X#
X# August 13, 1990
X# usage: {ng}awk -f prodoc.awk <output from prosite.awk> <prosite.doc>
X# this provides long output
X#
X# usage: {ng}awk -f prodoc.awk <output from prosite.awk>
X# this provides short output
X#
XBEGIN {
X  printf("\n%s\t%12s\t%-20s\t%s\n", "Access#", "From->To", "Name", "Doc#")
X  printf("%s\t%12s\t%-20s\t%s\n", "_______", "________", "____________________", "_________")
X  n=1
X}
X{
X  if ($0 ~ /^PS/ && NF == 4) {
X    printf("%s\t%12s\t%-20s\t%s\n", $1, $2, $3, $4)
X    regex[NR] = "{"$4
X    regex_num = NR
X  }
X  if ($0 ~ /^{PDOC/ && n < regex_num) {
X    for (i=n; i <= regex_num; i++) {
X      if ( regex[i] != regex_last ) {
X	regex_last = regex[i]
X	if ($0 ~ regex[i]) {
X          n = i
X	  print $0
X	  while ( $0 !~ /{END}/) {
X	    getline
X	    print $0
X	  }
X	}
X      }
X    }
X  }
X}
END_OF_FILE
if test 1597 -ne `wc -c <'prodoc.awk'`; then
    echo shar: \"'prodoc.awk'\" unpacked with wrong size!
fi
# end of 'prodoc.awk'
fi
if test -f 'pros' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'pros'\"
else
echo shar: Extracting \"'pros'\" \(1359 characters\)
sed "s/^X//" >'pros' <<'END_OF_FILE'
X#!/bin/sh
X# pros - release version 1.1
X# Copyright (C) 1990 Lee F. Kolakowski
X#
X# This program is free software; you can redistribute it and/or modify
X# it under the terms of the GNU General Public License as published by
X# the Free Software Foundation; either version 1, or (at your option)
X# any later version.
X#
X# This program is distributed in the hope that it will be useful,
X# but WITHOUT ANY WARRANTY; without even the implied warranty of
X# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X# GNU General Public License for more details.
X#
X# You should have received a copy of the GNU General Public License
X# along with this program; if not, write to the Free Software
X# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X#
X# 
X# Send bugs or improvements to
X# lfk@athena.mit.edu
X#
X# August 13, 1990
X#
X# usage: pros files...
X# produces short output
X#
Xprolib='/mit/lfk/src/prosearch'
Xawk=gawk
Xecho 'Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch'
Xecho 'ProSearch Software -- Release 1.1 -- Copyright: Lee Kolakowski'
Xfor file in $* ; do
X echo "The following patterns are in < $file >:"
X# readseq -f10 $file > /tmp/pros$$.tmp
X# ${awk} -f ${prolib}/prosite.awk  ${prolib}/prosite.regex  /tmp/pros$$.tmp |
X ${awk} -f ${prolib}/prosite.awk  ${prolib}/prosite.regex $file |
X ${awk} -f ${prolib}/prodoc.awk  - 
Xdone
X
END_OF_FILE
if test 1359 -ne `wc -c <'pros'`; then
    echo shar: \"'pros'\" unpacked with wrong size!
fi
chmod +x 'pros'
# end of 'pros'
fi
if test -f 'pros.1' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'pros.1'\"
else
echo shar: Extracting \"'pros.1'\" \(647 characters\)
sed "s/^X//" >'pros.1' <<'END_OF_FILE'
X.TH PROS 1 "July 13, 1990"
X.SH NAME
Xpros \- search protein sequence for Prosite Patterns
X.SH SYNOPSIS
X.B pros
Xfile ...
X.br
X.B prosearch
Xfile ...
X.br
X.SH DESCRIPTION
X.I Pros
Xreads each
X.I file
Xin sequence and searches for regular expression patterns describing
Xsites or structures in the Prosite Database. The output is printed
Xon the standard output. The output is a table of sites. Longer output
Xis generated by prosearch, which also displays the relevant section
Xfrom the Prosite database.
X.SH "SEE ALSO"
Xawk(1), gawk(1)
X.br
Xprosite.regex - the regular expression file
X.br
Xprosite.doc - the Prosite database (available from NETSERV@EMBL.BITNET)
END_OF_FILE
if test 647 -ne `wc -c <'pros.1'`; then
    echo shar: \"'pros.1'\" unpacked with wrong size!
fi
# end of 'pros.1'
fi
if test -f 'pros.bat' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'pros.bat'\"
else
echo shar: Extracting \"'pros.bat'\" \(1509 characters\)
sed "s/^X//" >'pros.bat' <<'END_OF_FILE'
XREM Note: If you uncomment the ReadSeq Lines replace RIGHT_CARET
XREM with the proper redirection character
Xecho off
XREM pros.bat - release version 1.1
XREM Copyright (C) 1990 Lee F. Kolakowski
XREM
XREM This program is free software; you can redistribute it and/or modify
XREM it under the terms of the GNU General Public License as published by
XREM the Free Software Foundation; either version 1, or (at your option)
XREM any later version.
XREM
XREM This program is distributed in the hope that it will be useful,
XREM but WITHOUT ANY WARRANTY; without even the implied warranty of
XREM MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
XREM GNU General Public License for more details.
XREM
XREM You should have received a copy of the GNU General Public License
XREM along with this program; if not, write to the Free Software
XREM Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
XREM
XREM
XREM Send bugs or improvements to
XREM lfk@athena.mit.edu
XREM
XREM August 13, 1990
XREM
XREM usage: pros files...
XREM produces short output
XREM
Xset prolib=\usr\lib\prosite
Xecho Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch
Xecho ProSearch Software -- Release 1.1 -- Copyright: Lee Kolakowski
Xecho The following patterns are in [ %1 ]:
XREM readseq -f10 %1 RIGHT_CARET pros$$.tmp
XREM awk -f %prolib%\prosite.awk  %prolib%\prosite.regex pros$$.tmp RIGHT_CARET pros$$2.tmp
Xawk -f %prolib%\prosite.awk %prolib%\prosite.regex %1 > pros$$2.tmp
Xawk -f %prolib%\prodoc.awk pros$$2.tmp
Xdel pros$$*.tmp
END_OF_FILE
if test 1509 -ne `wc -c <'pros.bat'`; then
    echo shar: \"'pros.bat'\" unpacked with wrong size!
fi
# end of 'pros.bat'
fi
if test -f 'pros.com' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'pros.com'\"
else
echo shar: Extracting \"'pros.com'\" \(3913 characters\)
sed "s/^X//" >'pros.com' <<'END_OF_FILE'
X$! pros.com - release version 1.1
X$! Copyright (C) 1990 Lee F. Kolakowski
X$!
X$! This program is free software; you can redistribute it and/or modify
X$! it under the terms of the GNU General Public License as published by
X$! the Free Software Foundation; either version 1, or (at your option)
X$! any later version.
X$!
X$! This program is distributed in the hope that it will be useful,
X$! but WITHOUT ANY WARRANTY; without even the implied warranty of
X$! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X$! GNU General Public License for more details.
X$!
X$! You should have received a copy of the GNU General Public License
X$! along with this program; if not, write to the Free Software
X$! Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X$!
X$! 
X$! Send bugs or improvements to
X$! lfk@athena.mit.edu
X$!
X$! August 13, 1990
X$!
X$ ver=F$verify(0)
X$ Start:
X$ type sys$input
XProSite Database Version 5.0 Copyright Amos Bairoch
XProSearch Version 1.1 Copyright Lee F. Kolakowski
X$ goto ProMatch
X$!
X$ ProMatch:
X$!==========
X$ prosearch_awk = "GCGMITPROSITE:prosite.awk"
X$ prodoc_awk = "GCGMITPROSITE:prodoc.awk"
X$ prosite_doc = "gengenbankdisk:[prosite]prosite.doc"
X$ prosite_regex ="gengenbankdisk:[prosite]prosite.regex"
X$ type sys$input
X
X	Prosearch scans your protein sequence against the Prosite 
X	database and records the list of matches as positions,
X	pattern names and accession numbers. Your file of protein
X	sequence can be in any format, it will be read into the correct
X	format by this procedure. 
X
X$!
X$ get_doc = 2
X$ count = 1
X$ Get_seq:
X$ file = p'count'
X$ if file.nes."" 
X$   then
X$          if f$search("''file'").nes."" 
X$            then 
X$              write sys$output "	Analyzing sequence ''file'"
X$            else
X$              write sys$output "	file ''file' cannot be found, trying next file"
X$              count = count + 1
X$             goto get_seq
X$           endif
X$   else 
X$         if count.gt.1 then goto leave
X$	  Inquire file "	Name of protein sequence file   "
X$            if file.eqs."" then goto get_seq
X$            if f$search("''file'").eqs."" 
X$	       then 
X$                 write sys$output ""
X$                 write sys$output "	your file ''file' was not found"
X$                 inquire retry "	type 1 to continue, return to quit   "
X$ 	             if retry .eqs. "1" 
X$		       then 
X$                         file=""
X$		          goto get_seq
X$ 		       else
X$			goto leave
X$ 	              endif
X$            endif
X$   endif
X$
X$Setup:
X$ file = "''f$parse(file)'"
X$ def_dir="''f$environment("DEFAULT")'"
X$ staden_file = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".STADENPRO"
X$ temp_file   = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".temp_file"
X$ outfile   = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".prosearch"
X$
X$ write sys$output ""
X$ write sys$output "	Your output will be in ''outfile'"
X$ if get_doc.eqs."1" .or. get_doc.eqs."2" then goto work_out
X$Get_DOC:
X$ write sys$output ""
X$ write sys$output "	Do you want to include documentation in ''outfile'"
X$ inquire get_doc "	Type 1 to include, 2 to exclude, QUIT to quit"
X$ if get_doc.eqs."QUIT" then goto leave
X$ if get_doc.eqs."1" .or. get_doc.eqs."2" then goto work_out
X$ goto Get_doc
X$
X$work_out:
X$ write sys$output ""
X$ write sys$output "	Matching against prosite: please wait...."
X$ write sys$output ""
X$
X$TOSTADEN:
X$ on control_y then goto leave
X$ readseq -f13 'file' -o'staden_file'
X$
X$ Prosearch:
X$ gawk -f 'prosearch_awk' 'prosite_regex' 'staden_file' > 'temp_file'
X$ if Get_doc.eqs."1"
X$     then
X$        gawk -f 'prodoc_awk' 'temp_file' 'prosite_doc' > 'outfile'
X$     else
X$        gawk -f 'prodoc_awk' 'temp_file' > 'outfile'
X$ endif
X$
X$check_count:
X$ count= count+1
X$ goto get_seq
X$Leave:
X$ save_message = f$environment("MESSAGE")
X$ set message/nofacility/noiden/noseverity/notext
X$ dele/nolog/noconf *.stadenpro;*
X$ dele/nolog/noconf *.temp_file;*
X$ set message 'save_message'
X
END_OF_FILE
if test 3913 -ne `wc -c <'pros.com'`; then
    echo shar: \"'pros.com'\" unpacked with wrong size!
fi
# end of 'pros.com'
fi
if test -f 'pros.nro' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'pros.nro'\"
else
echo shar: Extracting \"'pros.nro'\" \(833 characters\)
sed "s/^X//" >'pros.nro' <<'END_OF_FILE'
X
X
X
XPROS(1)             UNIX Programmer's Manual              PROS(1)
X
X
X
XNAME
X     pros - search protein sequence for Prosite Patterns
X
XSYNOPSIS
X     pros file ...
X     prosearch file ...
X
XDESCRIPTION
X     _P_r_o_s reads each _f_i_l_e in sequence and searchs for regular
X     expression patterns described sites or structures in the
X     Prosite Database. The output is displayed on the standard
X     output. The output is a table of sites. Longer output is
X     generated by prosearch, which also displays the relevant
X     section from the Prosite database.
X
XSEE ALSO
X     awk(1), gawk(1)
X     prosite.regex - the regular expression file
X     prosite.doc - the Prosite database (available from
X     NETSERV@EMBL.BITNET)
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
XPrinted 7/13/90           July 13, 1990                         1
X
X
X
END_OF_FILE
echo shar: 8 control characters may be missing from \"'pros.nro'\"
if test 833 -ne `wc -c <'pros.nro'`; then
    echo shar: \"'pros.nro'\" unpacked with wrong size!
fi
# end of 'pros.nro'
fi
if test -f 'prosearc.bat' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosearc.bat'\"
else
echo shar: Extracting \"'prosearc.bat'\" \(1538 characters\)
sed "s/^X//" >'prosearc.bat' <<'END_OF_FILE'
XREM Note: If you uncomment the ReadSeq Lines replace RIGHT_CARET
XREM with the proper redirection character
Xecho off
XREM prosearc.bat - release version 1.1
XREM Copyright (C) 1990 Lee F. Kolakowski
XREM
XREM This program is free software; you can redistribute it and/or modify
XREM it under the terms of the GNU General Public License as published by
XREM the Free Software Foundation; either version 1, or (at your option)
XREM any later version.
XREM
XREM This program is distributed in the hope that it will be useful,
XREM but WITHOUT ANY WARRANTY; without even the implied warranty of
XREM MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
XREM GNU General Public License for more details.
XREM
XREM You should have received a copy of the GNU General Public License
XREM along with this program; if not, write to the Free Software
XREM Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
XREM
XREM
XREM Send bugs or improvements to
XREM lfk@athena.mit.edu
XREM
XREM August 13, 1990
XREM
XREM usage: prosearch files...
XREM produces long output
XREM
Xset prolib=\usr\lib\prosite
Xecho Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch
Xecho ProSearch Software -- Release 1.1 -- Copyright: Lee Kolakowski
Xecho The following patterns are in [ %1 ]:
XREM readseq -f10 %1 RIGHT_CARET pros$$.tmp
XREM awk -f %prolib%\prosite.awk  %prolib%\prosite.regex pros$$.tmp RIGHT_CARET pros$$2.tmp
Xawk -f %prolib%\prosite.awk %prolib%\prosite.regex %1 > pros$$2.tmp
Xawk -f %prolib%\prodoc.awk pros$$2.tmp %prolib%\prosite.doc
Xdel pros$$*.tmp
END_OF_FILE
if test 1538 -ne `wc -c <'prosearc.bat'`; then
    echo shar: \"'prosearc.bat'\" unpacked with wrong size!
fi
# end of 'prosearc.bat'
fi
if test -f 'prosearch' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosearch'\"
else
echo shar: Extracting \"'prosearch'\" \(1387 characters\)
sed "s/^X//" >'prosearch' <<'END_OF_FILE'
X#!/bin/sh
X# prosearch - release version 1.1
X# Copyright (C) 1990 Lee F. Kolakowski
X#
X# This program is free software; you can redistribute it and/or modify
X# it under the terms of the GNU General Public License as published by
X# the Free Software Foundation; either version 1, or (at your option)
X# any later version.
X#
X# This program is distributed in the hope that it will be useful,
X# but WITHOUT ANY WARRANTY; without even the implied warranty of
X# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X# GNU General Public License for more details.
X#
X# You should have received a copy of the GNU General Public License
X# along with this program; if not, write to the Free Software
X# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X#
X# 
X# Send bugs or improvements to
X# lfk@athena.mit.edu
X#
X# August 13, 1990
X#
X# usage: prosearch files...
X# produces long output
X#
Xprolib='/mit/lfk/lib/prosite'
Xawk=gawk
Xecho 'Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch'
Xecho 'ProSearch Software -- Release 1.1 -- Copyright: Lee Kolakowski'
Xfor file in "$@" ; do
X echo "The following patterns are in < $file >:"
X# readseq -f10 $file > /tmp/pros$$.tmp
X# ${awk} -f ${prolib}/prosite.awk  ${prolib}/prosite.regex  /tmp/pros$$.tmp |
X ${awk} -f ${prolib}/prosite.awk  ${prolib}/prosite.regex "$file" |
X ${awk} -f ${prolib}/prodoc.awk  - ${prolib}/prosite.doc
Xdone
X
END_OF_FILE
if test 1387 -ne `wc -c <'prosearch'`; then
    echo shar: \"'prosearch'\" unpacked with wrong size!
fi
chmod +x 'prosearch'
# end of 'prosearch'
fi
if test -f 'prosearch.com' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosearch.com'\"
else
echo shar: Extracting \"'prosearch.com'\" \(3917 characters\)
sed "s/^X//" >'prosearch.com' <<'END_OF_FILE'
X$! prosearch.com - release version 1.1
X$! Copyright (C) 1990 Lee F. Kolakowski
X$!
X$! This program is free software; you can redistribute it and/or modify
X$! it under the terms of the GNU General Public License as published by
X$! the Free Software Foundation; either version 1, or (at your option)
X$! any later version.
X$!
X$! This program is distributed in the hope that it will be useful,
X$! but WITHOUT ANY WARRANTY; without even the implied warranty of
X$! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X$! GNU General Public License for more details.
X$!
X$! You should have received a copy of the GNU General Public License
X$! along with this program; if not, write to the Free Software
X$! Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X$!
X$! 
X$! Send bugs or improvements to
X$! lfk@athena.mit.edu
X$!
X$! August 13, 1990
X$!
X$ ver=F$verify(0)
X$ Start:
X$ type sys$input
XProSite Database Version 5.0 Copyright Amos Bairoch
XProSearch Version 1.1 Copyright Lee F. Kolakowski
X$ goto ProMatch
X$!
X$ ProMatch:
X$!==========
X$ prosearch_awk = "GCGMITPROSITE:prosite.awk"
X$ prodoc_awk = "GCGMITPROSITE:prodoc.awk"
X$ prosite_doc = "gengenbankdisk:[prosite]prosite.doc"
X$ prosite_regex ="gengenbankdisk:[prosite]prosite.regex"
X$ type sys$input
X
X	Prosearch scans your protein sequence against the Prosite 
X	database and records the list of matches as positions,
X	pattern names and accession numbers. Your file of protein
X	sequence can be in any format, it will be read into the correct
X	format by this procedure. 
X
X$!
X$ get_doc = 1
X$ count = 1
X$ Get_seq:
X$ file = p'count'
X$ if file.nes."" 
X$   then
X$          if f$search("''file'").nes."" 
X$            then 
X$              write sys$output "	Analyzing sequence ''file'"
X$            else
X$              write sys$output "	file ''file' cannot be found, trying next file"
X$              count = count + 1
X$             goto get_seq
X$           endif
X$   else 
X$         if count.gt.1 then goto leave
X$	  Inquire file "	Name of protein sequence file   "
X$            if file.eqs."" then goto get_seq
X$            if f$search("''file'").eqs."" 
X$	       then 
X$                 write sys$output ""
X$                 write sys$output "	your file ''file' was not found"
X$                 inquire retry "	type 1 to continue, return to quit   "
X$ 	             if retry .eqs. "1" 
X$		       then 
X$                         file=""
X$		          goto get_seq
X$ 		       else
X$			goto leave
X$ 	              endif
X$            endif
X$   endif
X$
X$Setup:
X$ file = "''f$parse(file)'"
X$ def_dir="''f$environment("DEFAULT")'"
X$ staden_file = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".STADENPRO"
X$ temp_file   = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".temp_file"
X$ outfile   = "''def_dir'"+"''F$parse(file,,,"NAME")'"+".prosearch"
X$
X$ write sys$output ""
X$ write sys$output "	Your output will be in ''outfile'"
X$ if get_doc.eqs."1" .or. get_doc.eqs."2" then goto work_out
X$Get_DOC:
X$ write sys$output ""
X$ write sys$output "	Do you want to include documentation in ''outfile'"
X$ inquire get_doc "	Type 1 to include, 2 to exclude, QUIT to quit"
X$ if get_doc.eqs."QUIT" then goto leave
X$ if get_doc.eqs."1" .or. get_doc.eqs."2" then goto work_out
X$ goto Get_doc
X$
X$work_out:
X$ write sys$output ""
X$ write sys$output "	Matching against prosite: please wait...."
X$ write sys$output ""
X$
X$TOSTADEN:
X$ on control_y then goto leave
X$ readseq -f13 'file' -o'staden_file'
X$
X$ Prosearch:
X$ gawk -f 'prosearch_awk' 'prosite_regex' 'staden_file' > 'temp_file'
X$ if Get_doc.eqs."1"
X$     then
X$        gawk -f 'prodoc_awk' 'temp_file' 'prosite_doc' > 'outfile'
X$     else
X$        gawk -f 'prodoc_awk' 'temp_file' > 'outfile'
X$ endif
X$
X$check_count:
X$ count= count+1
X$ goto get_seq
X$Leave:
X$ save_message = f$environment("MESSAGE")
X$ set message/nofacility/noiden/noseverity/notext
X$ dele/nolog/noconf *.stadenpro;*
X$ dele/nolog/noconf *.temp_file;*
X$ set message 'save_message'
END_OF_FILE
if test 3917 -ne `wc -c <'prosearch.com'`; then
    echo shar: \"'prosearch.com'\" unpacked with wrong size!
fi
# end of 'prosearch.com'
fi
if test -f 'prosearch.doc' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosearch.doc'\"
else
echo shar: Extracting \"'prosearch.doc'\" \(4280 characters\)
sed "s/^X//" >'prosearch.doc' <<'END_OF_FILE'
XINTRODUCTION
X
X	Over the past year or so Amos Bairoch
X(bairoch@cgecmu51.BITNET) has released a number of versions of his
XProsite database. This is a database of patterns which have been
Xassociated with particular enzymatic activities or structures. One
Xexample is the well-known pattern for N-linked glycosylation,
XAsn-Xxx-Ser/Thr.
X
X	Amos has compiled a database that consists of references about
Xeach pattern, validity of the patterns, occurrences, and a host of
Xother details. This database is of general use, and has been used by
XAmos in his PC/Gene Suite of programs for analysis of DNA and Protein
Xsequences.
X
X	I wanted to use this database on a Unix machine and be able to
Xask the question, "Which of these patterns occur in sequence X?"
X
X	This is the second release of Prosearch. It completely
Xsupersedes the first version with one important bug fix, and support
Xfor VMS, MS-DOS, and UNIX. Also, by using ReadSeq, a fine program
Xfrom Don Gilbert <gilbertd@silver.ucs.indiana.edu>, more protein
Xdata formats are accessible.
X
XIMPLEMENTATION
X
X	Most patterns can be expressed as regular expressions. For
Xexample, the pattern '^P', when used with the Unix utility grep,
Xmatches any line in the input that begins with a 'P'.
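
As a rough illustration of the same idea applied to a protein sequence
rather than to lines of text, the sketch below uses the awk built-in
match() with the PS00001 expression from prosite.regex. The sequence
fragment is made up for the example, and the variable names are
illustrative only; they are not part of ProSearch.

```shell
#!/bin/sh
# Made-up sequence fragment, used only for illustration.
seq="AKNGSLWNPTRNCTQ"

# N[^P][ST][^P] is the PS00001 expression listed in prosite.regex.
# match() sets RSTART (1-based start) and RLENGTH (length of the match).
first_hit=$(printf '%s\n' "$seq" | awk '{
    if (match($0, /N[^P][ST][^P]/))
        printf "%d %s\n", RSTART, substr($0, RSTART, RLENGTH)
}')

echo "$first_hit"
```

Only the leftmost match is reported here; finding later or overlapping
sites requires searching again from the position after each hit.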
X
X	I translated all but 1 of the 337 patterns in Prosite to Unix
Xstyle regular expressions and wrote a simple searching program to
Xsearch a protein sequence for their occurrence. The pattern I did not
Xtranslate was PS00003, which is Tyrosine Sulfation. There is
Xno clean pattern for this modification.
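
The sketch below shows one way such a translation could be automated.
It is not the procedure used to build prosite.regex. It assumes the
Prosite pattern notation in which elements are separated by '-', 'x'
means any residue, '{...}' lists excluded residues, and 'x(n)' repeats
an element n times. Variable-length ranges such as x(2,3) are not
handled, and the function name to_regex is hypothetical.

```shell
#!/bin/sh
# to_regex (hypothetical helper): turn one Prosite-style pattern,
# in the notation assumed above, into an awk regular expression.
to_regex() {
    printf '%s\n' "$1" | awk '{
        n = split($0, elem, "-")
        out = ""
        for (i = 1; i <= n; i++) {
            e = elem[i]
            rep = 1
            # A trailing "(n)" is a fixed repeat count.
            if (match(e, /\([0-9]+\)$/)) {
                rep = substr(e, RSTART + 1, RLENGTH - 2) + 0
                e = substr(e, 1, RSTART - 1)
            }
            if (e == "x")
                e = "."                                   # any residue
            else if (substr(e, 1, 1) == "{")
                e = "[^" substr(e, 2, length(e) - 2) "]"  # excluded set
            for (j = 1; j <= rep; j++)
                out = out e
        }
        print out
    }'
}

glyco=$(to_regex 'N-{P}-[ST]-{P}')
ck2=$(to_regex '[ST]-x(2)-[DE]')
echo "$glyco"
echo "$ck2"
```

The two results agree with the PS00001 and PS00006 entries in
prosite.regex.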
X
X	The program is written in the Awk language, and runs on
Xmachines which have either Nawk from AT&T, Gawk from the Free Software
XFoundation, or one of several versions of Awk which run on MSDOS
Xcompatibles. Read the appropriate INSTALL file for details.
X
XINPUT FILES
X
X	Input files are any protein sequence files in an unstructured
Xformat. AWK will accept the input on any number of lines of any length
X(I've tried protein sequences up to 2500 amino acids on one line with
Xno problem). Each ASCII character will be interpreted as an amino
Xacid, and all letters must be capitalized. With 'readseq' any of a
Xnumber of formats can be used.
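
When ReadSeq is not available, a file that contains only residues,
blanks and position numbers can be cleaned up with standard tools.
The following is a minimal sketch under that assumption; the function
name normalize_seq and the sample data are hypothetical, and header
or comment lines are not handled.

```shell
#!/bin/sh
# normalize_seq (hypothetical helper): strip blanks, digits and line
# breaks from the file named in $1 and convert residues to upper case.
normalize_seq() {
    tr -d ' \t\r\n0-9' < "$1" | tr '[:lower:]' '[:upper:]'
}

# Made-up input: lower-case residues in blocks with position numbers.
demo=$(mktemp)
printf ' 1 akngs lwnpt\n11 rnctq\n' > "$demo"

clean=$(normalize_seq "$demo")
rm -f "$demo"
echo "$clean"
```

The result is a single unbroken upper-case string, which is the form
the search expects.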
X
XOUTPUT
X
X	There are two possible forms of output. The "short" form is a
Xtable of accession numbers, positions in the sequence and short names
Xfor patterns. The "long" form is the same except that the relevant
Xsections from the Prosite Database are also printed.
X
XHere is an example of the short output for Bovine Rhodopsin.
X
XProsite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch
XProSearch Software -- Release 0.1beta -- Copyright: Lee Kolakowski
XThe following patterns are in < test.ops >:
X
XAccess#     From->To    Name
X_______     ________    ____
XPS00001         2->6    ASN_GLYCOSYLATION
XPS00001       15->19    ASN_GLYCOSYLATION
XPS00001     200->204    ASN_GLYCOSYLATION
XPS00005       14->17    PKC_PHOSPHO_SITE
XPS00005     229->232    PKC_PHOSPHO_SITE
XPS00005     243->246    PKC_PHOSPHO_SITE
XPS00006       22->26    CK2_PHOSPHO_SITE
XPS00006     193->197    CK2_PHOSPHO_SITE
XPS00006     198->202    CK2_PHOSPHO_SITE
XPS00006     229->233    CK2_PHOSPHO_SITE
XPS00006     338->342    CK2_PHOSPHO_SITE
XPS00007       21->30    TYR_PHOSPHO_SITE
XPS00008       89->95    MYRISTYL
XPS00008     120->126    MYRISTYL
XPS00008     156->162    MYRISTYL
XPS00008     182->188    MYRISTYL
XPS00013     157->168    PROKAR_LIPOPROTEIN
XPS00237       68->85    G_PROTEIN_RECEPTOR
XPS00238     296->314    OPSIN
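
Because the short form is a plain whitespace-separated table, it is
easy to post-process. The sketch below counts hits per pattern name;
it assumes the three-column layout shown above, and the function name
count_hits and the sample file are hypothetical.

```shell
#!/bin/sh
# count_hits (hypothetical helper): tally table rows by pattern name.
# Only rows whose first field looks like an accession number are used,
# so the banner and column headings are skipped.
count_hits() {
    awk '$1 ~ /^PS[0-9]+$/ { hits[$3]++ }
         END { for (name in hits) print name, hits[name] }' "$@" | sort
}

# A few rows copied from the example table above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Access#     From->To    Name
_______     ________    ____
PS00001         2->6    ASN_GLYCOSYLATION
PS00001       15->19    ASN_GLYCOSYLATION
PS00005       14->17    PKC_PHOSPHO_SITE
EOF

summary=$(count_hits "$sample")
rm -f "$sample"
echo "$summary"
```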
X
XUSAGE
X
X	Usage is described in the file pros.1; a printable version is
Xin pros.nro.
X
X
XBUGS
X
X	Please send bug reports or improvements to me.
X
XNOTICES
X
X	This code is covered by the Free Software Foundation's GNU
XGeneral Public License. See the file COPYING for details.
X
X
XFrank Kolakowski 
X
X======================================================================
X|lfk@athena.mit.edu                     ||      Lee F. Kolakowski    |
X|lfk@eastman2.mit.edu                   ||	M.I.T.		     |
X|kolakowski@wccf.mit.edu                ||	Dept of Chemistry    |
X|lfk@mbio.med.upenn.edu		        ||	Room 18-506	     |
X|lfk@hx.lcs.mit.edu                     ||	77 Massachusetts Ave.|
X|AT&T:  1-617-253-1866                  ||	Cambridge, MA 02139  |
X======================================================================
X
X
X
END_OF_FILE
if test 4280 -ne `wc -c <'prosearch.doc'`; then
    echo shar: \"'prosearch.doc'\" unpacked with wrong size!
fi
# end of 'prosearch.doc'
fi
if test -f 'prosite.awk' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosite.awk'\"
else
echo shar: Extracting \"'prosite.awk'\" \(1761 characters\)
sed "s/^X//" >'prosite.awk' <<'END_OF_FILE'
X# prosite.awk - release version 1.1
X# Copyright (C) 1990 Lee F. Kolakowski
X#
X# This program is free software; you can redistribute it and/or modify
X# it under the terms of the GNU General Public License as published by
X# the Free Software Foundation; either version 1, or (at your option)
X# any later version.
X#
X# This program is distributed in the hope that it will be useful,
X# but WITHOUT ANY WARRANTY; without even the implied warranty of
X# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X# GNU General Public License for more details.
X#
X# You should have received a copy of the GNU General Public License
X# along with this program; if not, write to the Free Software
X# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
X#
X# 
X# Send bugs or improvements to
X# lfk@athena.mit.edu
X#
X# August 13, 1990
X#
X# usage: {ng}awk -f prosite.awk prosite.regex filenames...
X# produces unformatted table for prodoc.awk
X#
X{
X  if ( FILENAME ~ /prosite\.reg/ ) {
X    accession[NR] = $1 ;
X    regex[NR] = $2;
X    name[NR] = $3;
X    doc[NR] = $4;
X    regex_num = NR;
X  }
X  else { 
X    if (FILENAME != lastfile) {
X      # Read the whole sequence file into one string, starting fresh
X      # for each new file.
X      input = ""
X      while ( (getline line < FILENAME) > 0 ) {
X        input = input line
X      }
X      close(FILENAME)
X      lastfile = FILENAME
X      n = length(input)
X      for (i = 1; i <= regex_num ; i++ ) {
X        # Restart each search one residue past the start of the
X        # previous match so that overlapping sites are reported.
X        offset = 1
X        new = input
X        while (offset <= n && match(new, regex[i])) {
X          printf("%s\t%d->%d\t%s\t%s\n", accession[i], offset+RSTART-1, offset+RSTART+RLENGTH-1, name[i], doc[i]);
X          offset += RSTART
X          new = substr(input, offset)
X        }
X      }
X    }
X  }
X}
X
X
X
END_OF_FILE
if test 1761 -ne `wc -c <'prosite.awk'`; then
    echo shar: \"'prosite.awk'\" unpacked with wrong size!
fi
# end of 'prosite.awk'
fi
if test -f 'prosite.bug' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosite.bug'\"
else
echo shar: Extracting \"'prosite.bug'\" \(343 characters\)
sed "s/^X//" >'prosite.bug' <<'END_OF_FILE'
XThe following is a note about an error in ProSite.doc. Please correct
Xyour version
X	
X	Date: Fri, 27 Jul 90 22:23 N
X	From: Amos Bairoch <BAIROCH%cmu.unige.ch@mitvma.mit.edu>
X	Subject: Re: PS number ambiguities in ProSite.doc
X	
X	The cross reference for engrailed should be PS00033.
X	Sorry about that and thanks for pointing it out to me.
X	
X	Amos
X	
END_OF_FILE
if test 343 -ne `wc -c <'prosite.bug'`; then
    echo shar: \"'prosite.bug'\" unpacked with wrong size!
fi
# end of 'prosite.bug'
fi
if test -f 'prosite.regex' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'prosite.regex'\"
else
echo shar: Extracting \"'prosite.regex'\" \(19585 characters\)
sed "s/^X//" >'prosite.regex' <<'END_OF_FILE'
XPS00001 N[^P][ST][^P] ASN_GLYCOSYLATION PDOC00001
XPS00002 SG.G GLYCOSAMINOGLYCAN PDOC00002
XPS00004 [RK][RK].[ST] CAMP_PHOSPHO_SITE PDOC00004
XPS00005 [ST].[RK] PKC_PHOSPHO_SITE PDOC00005
XPS00006 [ST]..[DE] CK2_PHOSPHO_SITE PDOC00006
XPS00007 [RK]...[DE]...Y TYR_PHOSPHO_SITE PDOC00007
XPS00007 [RK]...[DE]..Y TYR_PHOSPHO_SITE PDOC00007
XPS00007 [RK]..[DE]...Y TYR_PHOSPHO_SITE PDOC00007
XPS00007 [RK]..[DE]..Y TYR_PHOSPHO_SITE PDOC00007
XPS00008 G[^EDKRHPYFW]..[STAGCN][^P] MYRISTYL PDOC00008
XPS00009 .G[RK][RK] AMIDATION PDOC00009
XPS00010 C.[DN]....[FY].C.C ASX_HYDROXYL PDOC00010
XPS00011 ............E...E.C......[DEN].[LIVMFY].........[FYW] GLU_CARBOXYLATION PDOC00011
XPS00012 [LI]G[LIVMFYA]DS[LI]...[DE] PHOSPHOPANTETHEINE PDOC00012
XPS00013 [^DERK][^DERK][^DERK][^DERK][^DERK][^DERK][^DERK][LIVSTAG][LIVSTAG][AG]C PROKAR_LIPOPROTEIN PDOC00013
XPS00014 [RKH][DEN]EL$ ER_TARGET PDOC00014
XPS00015 [RKTA]KK[RQNTSG]K NUCLEAR PDOC00015
XPS00016 RGD RGD PDOC00016
XPS00017 [AG]....GK[ST] ATP_A PDOC00017
XPS00018 D.[DNS][^ILVFYW][DENSTG][DNQGHKR][^GP][LIVMC][DENQSTAGC]..[DE][LIVMFYW] EF_HAND PDOC00018
XPS00019 Q[RK]KTFT.W.N ACTININ_1 PDOC00019
XPS00020 A.....I.K[LIVM][LIVM]D..D[LIVM] ACTININ_2 PDOC00019
XPS00021 [FY]CRNPD KRINGLE PDOC00020
XPS00022 C.C.....G..C EGF PDOC00021
XPS00023 C..PF.[FYW].......C..........WC....[ND][FYW].....[FYW].[FYW]C FIBRONECTIN_2 PDOC00022
XPS00023 C..PF.[FYW].......C..........WC....[ND][FYW]...[FYW].[FYW]C FIBRONECTIN_2 PDOC00022
XPS00023 C..PF.[FYW].......C........WC....[ND][FYW].....[FYW].[FYW]C FIBRONECTIN_2 PDOC00022
XPS00023 C..PF.[FYW].......C........WC....[ND][FYW]...[FYW].[FYW]C FIBRONECTIN_2 PDOC00022
XPS00024 [LI]...W...[PE]..[LIVMFY][DE]A[AV][LIVMFY] HEMOPEXIN PDOC00023
XPS00024 [LI]...W..[PE]..[LIVMFY][DE]A[AV][LIVMFY] HEMOPEXIN PDOC00023
XPS00025 R..CG[FY]...[ST]...C....C TREFOIL PDOC00024
XPS00026 CG.......C....CCS..G.CG....[FYW]C CHITIN_BINDING PDOC00025
XPS00027 [LIVMF].....[LIVM]....[IV][RKQ].W........[RK] HOMEOBOX PDOC00027
XPS00028 C....C............H.....H ZINC_FINGER_C2H2 PDOC00028
XPS00028 C....C............H...H ZINC_FINGER_C2H2 PDOC00028
XPS00028 C..C............H.....H ZINC_FINGER_C2H2 PDOC00028
XPS00028 C..C............H...H ZINC_FINGER_C2H2 PDOC00028
XPS00029 L......L......L......L LEUCINE_ZIPPER PDOC00029
XPS00030 [RK]G[^EDKRHPCG][AGCI][FY][LIVA].[FY] RNP_1 PDOC00030
XPS00031 C..C.[DE].....H[FY]....C..CK.FF.R STEROID_FINGER PDOC00031
XPS00032 [LIVM][FY]PWM ANTENNAPEDIA PDOC00032
XPS00033 LMAQGLYN ENGRAILED PDOC00033
XPS00034 RPC...........CVS PAIRED_BOX PDOC00034
XPS00035 RRIKLG POU PDOC00035
XPS00036 [RK][RK].[RKS]N..[STA][STA].[RK].R.[RK] FOS_JUN_BASIC PDOC00036
XPS00037 W[ST]..ED..[LIV] MYB_1 PDOC00037
XPS00038 K[LIVMA].[IT]L..[TA]...[LIVMA]..[LIVM] HELIX_LOOP_HELIX PDOC00038
XPS00039 [LIVM][LIVM]DEAD.[LIVM][LIVM] ATP_HELICASE_1 PDOC00039
XPS00040 Y[LIVM]HRIGR ATP_HELICASE_2 PDOC00039
XPS00041 [LIV]..[LIV]....G[IFY].....F...[FY].......P HTH_ARAC_FAMILY PDOC00040
XPS00042 [ST]R.[DE]I...[LIV]G.[ST].ET HTH_CRP_FAMILY PDOC00041
XPS00043 E..[LIVM]...F.VSR..[LIVM]R.A[LIVM] HTH_GNTR_FAMILY PDOC00042
XPS00044 [LIVF]..[STAV][STA].....[STA][PQHR]..[LIVM][STA]..[LIVF]..[LIVF][RKEQ]..[LIVFY] HTH_LYSR_FAMILY PDOC00043
XPS00045 GF..............NP.T HISTONE_LIKE PDOC00044
XPS00046 AGL.FPV HISTONE_H2A PDOC00045
XPS00047 GAKRH HISTONE_H4 PDOC00046
XPS00048 [AV]RYR...[ST].S.S PROTAMINE_P1 PDOC00047
XPS00049 A[LIV][LIV][LIV].........[DN]G....[FY]..N..V[LIV] RIBOSOMAL_L14 PDOC00048
XPS00050 [RK][RK][AM][IVY][IV][RKT]L RIBOSOMAL_L23 PDOC00049
XPS00051 R[FY]N..RR.WRR RIBOSOMAL_L39 PDOC00050
XPS00052 L....[LIVM]......GKK.....I[LIVMF] RIBOSOMAL_RS7 PDOC00051
XPS00053 G..[LIV][LIV][ST]T..G[LIV]M....AR RIBOSOMAL_S8 PDOC00052
XPS00054 [DN]VTP.P.[DN] RIBOSOMAL_S11 PDOC00053
XPS00055 [RK].PNSA.R RIBOSOMAL_S12 PDOC00054
XPS00056 GD.[LIV].[LIV]...RP[LIV]..T RIBOSOMAL_S17 PDOC00055
XPS00057 AIK.AR...[LF]LP RIBOSOMAL_S18 PDOC00056
XPS00058 LGFRGEAL DNA_MISMATCH_REPAIR PDOC00057
XPS00059 GHE..G.....G..V ADH_ZINC PDOC00058
XPS00060 G..H..AH..G.....PHG ADH_IRON PDOC00059
XPS00061 Y[STAGC][STAGC][STAGC]K.[AG][LIVMAG]..[LIVMF] ADH_INSECT_TYPE PDOC00060
XPS00062 G....[LIVM]G[LIVM]SNF ALDOKETO_REDUCTASE_1 PDOC00061
XPS00063 Q.....[LIVM][AP]KS....R...N ALDOKETO_REDUCTASE_2 PDOC00061
XPS00064 [LIVM]G[EQ]HG[DN][ST] L_LDH PDOC00062
XPS00065 LIN..RG.V.D GLC_2_HYDROXYACID_DH PDOC00063
XPS00066 [RKH]......D.MG.N.[LIVM] HMG_COA_REDUCTASE_1 PDOC00064
XPS00067 GF[LIVM].NR[LIVM] 3HCDH PDOC00065
XPS00068 [LIVM]T[TR]LD..R[STA] MDH PDOC00066
XPS00069 DHYLGKE G6P_DEHYDROGENASE PDOC00067
XPS00070 [AG].F...GQ.C.A ALDEHYDE_DEHYDROGEN PDOC00068
XPS00071 ASCTT GAPDH PDOC00069
XPS00072 G..[FYW][LIV][LIV]NG.K.[FYW]ITN ACYL_COA_DH_1 PDOC00070
XPS00073 Q..GG.G[FY]..[DE].P ACYL_COA_DH_2 PDOC00070
XPS00074 [LIV]..GG[STAG]K[STAG]....[DN] GLU_DEHYDROGENASE PDOC00071
XPS00075 [LIF]G....[LIVMF]PW DHFR PDOC00072
XPS00076 GG.C[LIV]..GC[LIV]P PYRIDINE_REDOX PDOC00073
XPS00077 W.HH[LM] COX1 PDOC00074
XPS00078 C[SA]..CG..H COX2 PDOC00075
XPS00079 G.[FYW].[LIVMFYW].[CST]........G[LM]...[LIVMFYW] MULTICOPPER_OXIDASE1 PDOC00076
XPS00080 HCH...H...G[LM] MULTICOPPER_OXIDASE2 PDOC00076
XPS00081 HP[LIV].KL[LIV]..H LIPOXYGENASE PDOC00077
XPS00082 H........Y...P.G...E EXTRADIOL_DIOXYGENAS PDOC00078
XPS00083 G.[LIVM]....G..[LIVM]....[LIVM][DE].......G.[FY] INTRADIOL_DIOXYGENAS PDOC00079
XPS00084 HHM..F.C CU2_MONOOXYGENASE_1 PDOC00080
XPS00085 H.F....HTH..G CU2_MONOOXYGENASE_2 PDOC00080
XPS00086 F[SGN].[GD].[RHP].C[LIVFA][GD] CYTOCHROME_P450 PDOC00081
XPS00087 [RH][GA][IF]H[LIV]H..G SOD_CU_ZN_1 PDOC00082
XPS00088 D.WEH[STA][FY][FY] SOD_MN PDOC00083
XPS00089 G..NS...A.MP RIBORED_SMALL PDOC00084
XPS00090 [LIVM]...[STANQ][ET]C.....GDD NITROGENASE_1 PDOC00085
XPS00091 M.L.PC....Q THYMIDYLATE_SYNTHASE PDOC00086
XPS00092 [LIVMA][LIVMFYA].[DN]PP[FY] N6_MTASE PDOC00087
XPS00093 [LIVM]TSPP[FY] N4_MTASE PDOC00088
XPS00094 [DN].[LIV]..G.PC..[FW]S C5_MTASE_1 PDOC00089
XPS00095 [RKQ]..GN[STA][LIV]...[LIV]...[LIV]...[LIV] C5_MTASE_2 PDOC00089
XPS00096 TTTTHKTL SER_HYDROXYMETHYLTRF PDOC00090
XPS00097 F.[EK].STRT CARBAMOYLTRANSFERASE PDOC00091
XPS00098 C[SAG]S[SAG][ILVMFY][RKQ][SAG][ILVM]......I THIOLASE_1 PDOC00092
XPS00099 [AG][LIVM].[STA].C.G.G.[AG] THIOLASE_2 PDOC00092
XPS00100 HH.VCD CAT PDOC00093
XPS00101 IGAGS[LIVM]V CYSE_LACA_NODL PDOC00094
XPS00102 GT.NMK PHOSPHORYLASE PDOC00095
XPS00103 [LIVMFYWC][LIVM][LIVM][LIVM][DE][DE].[LIVM]..[GC].[STA] PUR_PYR_PR_TRANSFER PDOC00096
XPS00104 C..KT[FYW]P.[FYW][FYW] EPSP_SYNTHASE PDOC00097
XPS00105 S[FY][SA]K...LY ASP_AMINOTRANSFERASE PDOC00098
XPS00106 GR.NLIGEH.DY GALACTOKINASE PDOC00099
XPS00107 [LIV]G.G.[FY][SG].[LIV] PROTEIN_KINASE_ATP PDOC00100
XPS00108 [LIVMFYC].[HY].D[LIVMFY]K..N[LIMVFC][LIMVFC][LIMVFC] PROTEIN_KINASE_ST PDOC00100
XPS00109 [LIVMFYC].[HY].D[LIVMFY][RA]..N[LIMVFC][LIMVFC][LIMVFC] PROTEIN_KINASE_TYR PDOC00100
XPS00110 II.KIEN PYRUVATE_KINASE PDOC00101
XPS00111 WNGP.G.FE PGLYCERATE_KINASE PDOC00102
XPS00112 LTCPSN CREATINE_KINASE PDOC00103
XPS00113 DG[FY]PR.[LIVM].Q ADENYLATE_KINASE PDOC00104
XPS00114 DLHA.QIQGFFD[LIVM]P[LIVM]D PRPP_SYNTHETASE PDOC00105
XPS00115 Y[ST]P[ST]SP[STANK] RNA_POL_II_REPEAT PDOC00106
XPS00116 [YA].DTDS[LIVM] DNA_POLYMERASE_B PDOC00107
XPS00117 G....HPH.Q GAL-P-UDP-TRANSFER PDOC00108
XPS00118 CC..H..C PA2_HIS PDOC00109
XPS00119 [LIVM]C[^LIVMFYWPCST]CD.....C PA2_ASP PDOC00109
XPS00120 [LIV].[LIVFY][LIV]G[HY]S.G LIPASE_SER PDOC00110
XPS00121 Y..YY.C.C COLIPASE PDOC00111
XPS00122 P.....[LIVFA]G.SAG CARBOXYLESTERASE_B PDOC00112
XPS00123 V.DS..[STG]AT ALKALINE_PHOSPHATASE PDOC00113
XPS00124 GKLR[LIV]LYE FBPASE PDOC00114
XPS00125 RGNHE SER_THR_PHOSPHATASE PDOC00115
XPS00126 HD[LIVMFY].H.[AG]..N.[LIVMFY] PDEASE PDOC00116
XPS00127 CK..NTF RNASE_PANCREATIC PDOC00118
XPS00128 C...C..[LF]...[DEN][LI].....C LACTALBUMIN_LYSOZYME PDOC00119
XPS00129 WIDMN SUCRASE PDOC00120
XPS00130 WA..GVLLLN U_DNA_GLYCOSYLASE PDOC00121
XPS00131 GESYAG CARBOXYPEPTIDASE_SER PDOC00122
XPS00132 [LIVMFY]H[SAG].E.[LIVM][STAG]......[LIVMFY] CARBOXYPEPTIDASE_ZN1 PDOC00123
XPS00133 H[SAG]...[LIVM]..[LIVMFYW]P[FYW] CARBOXYPEPTIDASE_ZN2 PDOC00123
XPS00134 [LIVM][ST]A[STAG]HC TRYPSIN_HIS PDOC00124
XPS00135 GDSGG TRYPSIN_SER PDOC00124
XPS00136 [SAIV].[LIVM][LIVM]D[DSTA]G[LIVMFC]...[DNH] SUBTILISIN_ASP PDOC00125
XPS00137 HGT..[STA]G.[LIVMA] SUBTILISIN_HIS PDOC00125
XPS00138 GTS.[SA].P..[STAV][AG] SUBTILISIN_SER PDOC00125
XPS00139 Q...[GE].CW..[STAG] THIOL_PROTEASE PDOC00126
XPS00140 N.CG...[LIVM][LIVM]H UCH PDOC00127
XPS00141 [LIVFA]DTG[STA][STAN] EUK_ASP_PROTEASE PDOC00128
XPS00142 [TAIV]..HE[LIVMFYW][^DEHKRP]H.[LIVMFYWQ] ZINC_PROTEASE PDOC00129
XPS00143 GL.H..EHM IDE_PTR PDOC00130
XPS00144 [STAG]TGGTIA[STAG] ASN_GLN_ASE PDOC00132
XPS00145 MVCHHLD UREASE PDOC00133
XPS00146 F.[LIVMFY].S..K....[AG].[LIVM]L BETA_LACTAMASE_A PDOC00134
XPS00147 LGGDHS ARGINASE_1 PDOC00135
XPS00148 SGNLHG ARGINASE_2 PDOC00135
XPS00149 GKWHLG SULFATASE PDOC00117
XPS00150 SVDYE[LIVM].G[RK] ACYLPHOSPHATASE_1 PDOC00136
XPS00151 GTV.GQ.QGP ACYLPHOSPHATASE_2 PDOC00136
XPS00152 P[SAP][IV][DN]...S.S ATPASE_ALPHA_BETA PDOC00137
XPS00153 IT.E..E...GA.A ATPASE_GAMMA PDOC00138
XPS00154 DKTGT[LIVM]T ATPASE_E1_E2 PDOC00139
XPS00155 GGYSQG CUTINASE PDOC00140
XPS00156 F.D.KF.DI..T OMPDECASE PDOC00141
XPS00157 G.DF.K.DE RUBISCO_LARGE PDOC00142
XPS00158 EG.LLKPN ALDOLASE PDOC00143
XPS00159 LEVTLR ALDOLASE_KDPG_KHG_1 PDOC00144
XPS00160 FK.FPAE ALDOLASE_KDPG_KHG_2 PDOC00144
XPS00161 KKCGHM ISOCITRATE_LYASE PDOC00145
XPS00162 Q.H.HWG CARBONIC_ANHYDRASE PDOC00146
XPS00163 GS..M..K.N FUMARATE_LYASES PDOC00147
XPS00164 DDLTV[STA]NP ENOLASE PDOC00148
XPS00165 K........S[IF]K.RG DEHYDRATASE_SER_THR PDOC00149
XPS00166 G.ALGGG ENOYL_COA_HYDRATASE PDOC00150
XPS00167 G...[LIVM]ELG..[FY][ST]DP[LIVM]A[DE]G TRP_SYNTHASE_ALPHA PDOC00151
XPS00168 L.H.G[STA]HK.N TRP_SYNTHASE_BETA PDOC00152
XPS00169 D.[LIVM][LIVM]VKP D_ALA_DEHYDRATASE PDOC00153
XPS00170 PG...MAN.GP PPIASE PDOC00154
XPS00171 AYEP.W TPI PDOC00155
XPS00172 [LI]EPKP..P XYLOSE_ISOMERASE_1 PDOC00156
XPS00173 FHD.D[LIV].P XYLOSE_ISOMERASE_2 PDOC00156
XPS00174 [FY]DQWGVELGK P_GLUCOSE_ISOMERASE PDOC00157
XPS00175 [LIVM].RHG[EQ]...N PG_MUTASE PDOC00158
XPS00176 E........SK..Y[LIM] TOPOISOMERASE_I_EUK PDOC00159
XPS00177 [LIVM].EGDSA.[STAG] TOPOISOMERASE_II PDOC00160
XPS00178 P..[STAN]..[LIVMFYP][HT][LIVMFYA]G[HNTG][LIVMFYSTA] AA_TRNA_LIGASE_HIGH PDOC00161
XPS00178 P[STAN]..[LIVMFYP][HT][LIVMFYA]G[HNTG][LIVMFYSTA] AA_TRNA_LIGASE_HIGH PDOC00161
XPS00179 [AG].G[LIVMF][DE]R[LIVM].[LMA][LIVMF] AA_TRNA_LIGASE_ATP PDOC00161
XPS00180 [FYW]DGSS GLNA_1 PDOC00162
XPS00181 NG[SA]G.H...S GLNA_ATP PDOC00162
XPS00182 K[LIVM].....[LIVM]D[RK][DN][LI]Y GLNA_ADENYLATION PDOC00162
XPS00183 [FY]HP........[LIV]C[LIV].[LIV][LIV].....P UBIQUITIN_CONJUGAT PDOC00163
XPS00183 [FY]HP.......[LIV]C[LIV].[LIV][LIV].....P UBIQUITIN_CONJUGAT PDOC00163
XPS00184 RFGDPETQ GARS PDOC00164
XPS00185 [RK].[STA]..S.CY[SL] IPNS_1 PDOC00165
XPS00186 [LIVM][LIVM].CG[STA]..[STAG]..T.[DNG] IPNS_2 PDOC00165
XPS00187 P....[LIVMF].[LIVMF].GD..[LIVMF].[LIVMF]...[DE] TPP_ENZYMES PDOC00166
XPS00188 [LIVM].[AV]MKM...[LIVM] BIOTIN PDOC00167
XPS00189 G..[LIVF]...[DEQN].[LIVF]..[LIVF]...K[STAV][STAVQN]..[LIVF] LIPOYL PDOC00168
XPS00190 C[^CPWHF][^CPWR]CH[^CFWY] CYTOCHROME_C PDOC00169
XPS00191 F[LIV]..HPGG CYTOCHROME_B5 PDOC00170
XPS00192 [DEQ]...G[FYW].[LIVM]R..H CYTOCHROME_B_HEME PDOC00171
XPS00193 PEW[FY][LFY][LFY] CYTOCHROME_B_QO PDOC00171
XPS00194 [TA].WC[AG][PH]C THIOREDOXIN PDOC00172
XPS00195 C.[FY]C..[TA][KQ].[LI] GLUTAREDOXIN PDOC00173
XPS00196 Y.[VFY].C..P.H COPPER_BLUE PDOC00174
XPS00196 Y.[VFY].C..PH COPPER_BLUE PDOC00174
XPS00196 Y.[VFY].C.P.H COPPER_BLUE PDOC00174
XPS00196 Y.[VFY].C.PH COPPER_BLUE PDOC00174
XPS00196 Y[VFY].C..P.H COPPER_BLUE PDOC00174
XPS00196 Y[VFY].C..PH COPPER_BLUE PDOC00174
XPS00196 Y[VFY].C.P.H COPPER_BLUE PDOC00174
XPS00196 Y[VFY].C.PH COPPER_BLUE PDOC00174
XPS00197 C..[STA]..C[STA][^P]C 2FE2S_FERREDOXIN PDOC00175
XPS00197 C.[STA]..C[STA][^P]C 2FE2S_FERREDOXIN PDOC00175
XPS00198 C..C..C...C[PEG] 4FE4S_FERREDOXIN PDOC00176
XPS00199 CTHLGCV RIESKE_1 PDOC00177
XPS00200 CPCHGS RIESKE_2 PDOC00177
XPS00201 [FY].[ST].TG.T...A..I FLAVODOXIN PDOC00178
XPS00202 W.CP.C[AG] RUBREDOXIN PDOC00179
XPS00203 C...C.C..C.C..C METALLOTHIONEIN_CL1 PDOC00180
XPS00204 DPH..DF[LI]E FERRITIN PDOC00181
XPS00205 Y.[VA]VA[VA][VA][RK] TRANSFERRIN_1 PDOC00182
XPS00206 Y.GA..CL.[DE] TRANSFERRIN_2 PDOC00182
XPS00207 LLC.[DN].....V.....C..A....H.V..R TRANSFERRIN_3 PDOC00182
XPS00208 [SN]P.L..HA...F PLANT_GLOBIN PDOC00183
XPS00209 Y[FYW].ED[LIVM]..N......H...P HEMOCYANIN_1 PDOC00184
XPS00210 T..RDP.F[FYW] HEMOCYANIN_2 PDOC00184
XPS00211 [LVF]SGG...[RK][LIVMA].[LIVMF][AG] ATP_BIND_TRANSPORT PDOC00185
XPS00212 [FY]......CC.......C[LFY]......[LIVMFYW] ALBUMIN PDOC00186
XPS00213 [DENST]...[LIVFY].G.W[FYWRH].[LIVM] LIPOCALIN PDOC00187
XPS00214 G.[FYW].[LIVM]....N[FY][DE] FABP PDOC00188
XPS00215 P.[DE].[IVA][RK].[LR][LIVMFY] MITOCH_CARRIER PDOC00189
XPS00216 [LIVMST][DE].[LIVMFA]GR[RK].....G SUGAR_TRANSPORT_1 PDOC00190
XPS00216 [LIVMST][DE].[LIVMFA]GR[RK]....G SUGAR_TRANSPORT_1 PDOC00190
XPS00217 [LIVMF].G[LIVMFA]..G........[LY]..[EQ]......[RK] SUGAR_TRANSPORT_2 PDOC00190
XPS00218 A.GG.IGTGL AMINO_ACID_PERMEASE PDOC00191
XPS00219 FGGL[LIVM]RD[LIVM][RK]RRYP ANION_EXCHANGER_1 PDOC00192
XPS00220 FLISLIFIYETF.KL ANION_EXCHANGER_2 PDOC00192
XPS00221 SG.H.NPAVT MIP_NO26_GLPF PDOC00193
XPS00222 GCGCC..C IGF_BINDING PDOC00194
XPS00223 [TG][STV]........[LIVMF]..R...[DEQNH].......[IFY].......[LIVMF]...[LIVMF]...........[LIVMF]..[LIVMF] ANNEXIN PDOC00195
XPS00224 FLAQQES CLATHRIN_LIGHT_CHAIN PDOC00196
XPS00225 [LIVMFYWA].[^DEHKRSTP][FY][DEQHKY]...[FY].G....[LIVMFCST] CRYSTALLIN_BETAGAMMA PDOC00197
XPS00226 I.[TA]Y[RK].[LM]L[DE] IF PDOC00198
XPS00227 [AG]GGTG[SA]G TUBULIN PDOC00199
XPS00228 ^MR[DE]I TUBULIN_B_AUTOREG PDOC00200
XPS00229 GS..N..H.PGGG MAP2_TAU PDOC00201
XPS00230 Y.Y[DE]..[DE][RK] MAP1B_NEURAXIN PDOC00202
XPS00231 CDYNRD F_ACTIN_CAPPING_BETA PDOC00203
XPS00232 [LIV].[LIV].D.ND[NH].P CADHERIN PDOC00205
XPS00233 G......Y.A.E.GY CUTICLE_LARVAL PDOC00206
XPS00234 SLVGIE GAS_VESICLE_A PDOC00207
XPS00235 FL..T...R...A..Q...L..F GAS_VESICLE_C PDOC00208
XPS00236 C.[LIVM].[LIVM]..[FY]P.D...C NEUROTR_ION_CHANNEL PDOC00209
XPS00237 [LMR][RKHQN]...[NT][LIVMFYW][LIVMFYW][LIV].[SNH][LIV]...[DEG][LIVMFYWA] G_PROTEIN_RECEPTOR PDOC00210
XPS00238 K.....[DN]P.[IV]Y......[FY] OPSIN PDOC00211
XPS00239 D[LIV]Y...YYR RECEPTOR_TYR_KIN_II PDOC00212
XPS00240 C...G.P.P...W..C RECEPTOR_TYR_KIN_III PDOC00213
XPS00241 C[FR]........[STVN]C.W RECEPTOR_CYTOKINES PDOC00214
XPS00241 C[FR].......[STVN]C.W RECEPTOR_CYTOKINES PDOC00214
XPS00242 [FYW][RK].GFF.R INTEGRIN_ALPHA PDOC00215
XPS00243 C.[GNQ]...G.C.C..C.C INTEGRIN_BETA PDOC00216
XPS00243 C.[GNQ].G.C.C..C.C INTEGRIN_BETA PDOC00216
XPS00244 N....P.H..[SAG]...........[SAG].H[SAG][SAG] REACTION_CENTER PDOC00217
XPS00245 APH.CH PHYTOCHROME PDOC00218
XPS00246 QECKCHG INT1 PDOC00219
XPS00247 G.L.[STAG]......[DE]C.F.E HBGF_FGF PDOC00220
XPS00248 C........GC[RK]GID..HWNS.C NGF PDOC00221
XPS00249 CV...RC.GCCN PDGF PDOC00222
XPS00250 [LIVM]..P..[FY]....C.G.C TGF_BETA PDOC00223
XPS00251 Y..YSQV.F TNF PDOC00224
XPS00252 [FY]L.......[CY]AW INTERFERON_ALPHABETA PDOC00225
XPS00253 F.S...P..[FY][LI].T INTERLEUKIN_1 PDOC00226
XPS00254 C.........C......GL..[FY]...L INTERLEUKIN_6 PDOC00227
XPS00255 CFLKRLL INTERLEUKIN_7 PDOC00228
XPS00256 [EQ][LV][NT]F[ST]..W AKH PDOC00229
XPS00257 WA.G[SH][LF]M BOMBESIN PDOC00230
XPS00258 C[STAGDN][STAGDN]..TC[LIVMA]...[LFY]...[LFY] CALCITONIN PDOC00231
XPS00258 C[STAGDN][STAGDN].TC[LIVMA]...[LFY]...[LFY] CALCITONIN PDOC00231
XPS00259 Y.[GD][WH]M[DR]F GASTRIN PDOC00232
XPS00259 Y[GD][WH]M[DR]F GASTRIN PDOC00232
XPS00260 [YH][STA][DEQN][AG].[FY]..[DEQNST].............[LIV] GLUCAGON PDOC00233
XPS00261 C[SA]G.C.[ST] GLYCOPROTEIN_HORMONE PDOC00234
XPS00262 CC...C....[LIMF]...C INSULIN PDOC00235
XPS00263 CFG...DRIG..S..GC NATRIURETIC_PEPTIDE PDOC00236
XPS00264 C[IFY][IFY].NCP.G NEUROHYPOPHYS_HORM PDOC00237
XPS00265 N..TR.RY PANCREATIC_HORMONE PDOC00238
XPS00266 C.[ST]..[LIVMFY].[LIVMSTA]P.....[TAV].......[LIVMFY]......[LIVMFY]..[STA]W SOMATOTROPIN_1 PDOC00239
XPS00267 F[IVFY]G[LM]M[G$] TACHYKININ PDOC00240
XPS00268 W..[KN]..K[KE][LI]E[RKN] CECROPIN PDOC00241
XPS00268 W[KN]..K[KE][LI]E[RKN] CECROPIN PDOC00241
XPS00269 C.......G.C.........CC DEFENSIN PDOC00242
XPS00270 C.C....D..C..[FY]C ENDOTHELIN PDOC00243
XPS00271 CC.....R..[FY]..C THIONIN PDOC00244
XPS00272 CP........[LIVYST].CC SNAKE_TOXIN PDOC00245
XPS00272 CP......[LIVYST].CC SNAKE_TOXIN PDOC00245
XPS00273 CC..CC.PAC.GC ENTEROTOXIN_H_STABLE PDOC00246
XPS00274 T..NW..TNT AEROLYSIN PDOC00247
XPS00275 [LIV]....EA.R[FY][RKQ].[LIV] SHIGA_RICIN PDOC00248
XPS00276 T..W.P[LIVMFY][LIVMFY][LIVMFY]..E CHANNEL_COLICIN PDOC00249
XPS00277 YGG[LIV]T....N STAPH_STREP_TOX_1 PDOC00250
XPS00278 K..[LIV]....[LIV]D...R..L.....[LIV]Y STAPH_STREP_TOX_2 PDOC00250
XPS00279 Y......[FY]GTH[FY] MAC_PERFORIN PDOC00251
XPS00280 F...GC......[FY].....C BPTI_KUNITZ PDOC00252
XPS00281 C.[SAD][STA]C..C BOWMAN_BIRK PDOC00253
XPS00282 C.......C......Y...C...C KAZAL PDOC00254
XPS00282 C.......C......Y...C..C KAZAL PDOC00254
XPS00283 [LIVM].D..G..[LIVM].....Y.[LIVM] SOYBEAN_KUNITZ PDOC00255
XPS00284 [LIVMF].[LIVMFA][DNQ][RKHQ][PS]F[LIVMFY][LIVMFY].[LIVMF] SERPIN PDOC00256
XPS00285 [FYW]PE[LIV][LIV]...[STAV]..A POTATO_INHIBITOR PDOC00257
XPS00286 CP.....CK....C...C.C SQUASH_INHIBITOR PDOC00258
XPS00287 Q[LIVT]V[SAG]G..[LIVMFY].[LIVMFY].[LIVMFY] CYSTATIN PDOC00259
XPS00288 C.C.P.HP TIMP PDOC00260
XPS00289 H.C.[ST]W.S PENTRAXIN PDOC00261
XPS00290 [FY].C.[VA].H IG_MHC PDOC00262
XPS00291 GG.WGQ PRION PDOC00263
XPS00292 R..[LIV]..[FYW][LIV]........[LIV].....[FYW]......D[RK] CYCLIN PDOC00264
XPS00293 S[LIVM]SKI[LIVM][RK]C PCNA PDOC00265
XPS00294 C[^DENQ][LIVM].$ FARNESYLATION PDOC00266
XPS00295 N...K.VKKIK ARRESTIN PDOC00267
XPS00296 AA.EE....GGG CHAPERONIN PDOC00268
XPS00297 [IV]DLGTT.S HSP70_1 PDOC00269
XPS00298 NKEIFL HSP90 PDOC00270
XPS00299 VLRLRGG UBIQUITIN PDOC00271
XPS00300 PI.[FY][LIVM]G.G SRP54 PDOC00272
XPS00301 D....E...[GC].T[IV] EFACTOR_GTP PDOC00273
XPS00302 TGKHGH IF4D_HYPUSINE PDOC00274
XPS00303 LD...[DN]....[FY][EQ].[FY] S100_CABP PDOC00275
XPS00304 GSVGGE SASP PDOC00276
XPS00305 NG.[DE][DE]..C[ST] 11S_SEED_STORAGE PDOC00284
XPS00306 CL[LV]A.A[LV]A CASEIN_ALPHA_BETA PDOC00277
XPS00307 H[LIV]GI[DN][LIV].[ST][LIV].S..T LECTIN_LEGUME_BETA PDOC00278
XPS00308 P[EQ][FYW]V.[LIV]G.[ST] LECTIN_LEGUME_ALPHA PDOC00278
XPS00309 WG.E.RE LECTIN_GALACTOSIDE PDOC00279
XPS00310 [STA]C[LIVM][LIVMFYW]A.[LIVMFYW]...[LIVMFYW]...Y LAMP_1 PDOC00280
XPS00311 G.K..HAGY LAMP_2 PDOC00280
XPS00312 II..VMAG GLYCOPHORIN_A PDOC00281
XPS00313 GQD.VK.....K SVP PDOC00282
XPS00314 AGYGST.T ICE_NUCLEATION PDOC00283
XPS00315 SSSSSSSED[DE]G DEHYDRIN PDOC00285
XPS00316 G.C.TGDC.G...C THAUMATIN PDOC00286
XPS00317 C.[^C][DN]..C.....CC 4_DISULFIDE_CORE PDOC00026
XPS00318 [LIVM]G.[LIVM]GG[AG]T HMG_COA_REDUCTASE_2 PDOC00064
XPS00319 GVEFVCCP A4_EXTRA PDOC00204
XPS00320 NGYENPTYK A4_INTRA PDOC00204
XPS00321 ALKFYASVR RECA PDOC00131
XPS00322 KAPRKQL HISTONE_H3 PDOC00287
XPS00323 M[LIV]G[RKHNQ]KLGEF RIBOSOMAL_S19 PDOC00288
XPS00324 KFGG[ST]S ASPARTOKINASE PDOC00289
XPS00325 ^M[DE]AIKKKM TROPOMYOSIN_MUSCLE PDOC00290
XPS00326 LKEAE.RAE TROPOMYOSIN PDOC00290
XPS00327 [FYW]..LD[LIVM].AK..[FYW] BACTERIAL_OPSIN PDOC00291
XPS00328 HRHRGH..[DE][DE][DE][DE][DE][DE][DE] HCP PDOC00292
XPS00329 DLGGGTFD HSP70_2 PDOC00269
XPS00330 D.[LI]....G.D.[LI].GG...D HEMOLYSIN_CALCIUM PDOC00293
XPS00331 F.DD..GTA.V..AGLL MALIC_ENZYMES PDOC00294
XPS00332 [STAG]G[PAG]H[FY][DN]P SOD_CU_ZN_2 PDOC00082
XPS00333 EG[LIVM][LIVM][LIVM]K...[GC] DNA_LIGASE PDOC00295
XPS00334 W..[LI][SAG].....R........[YW]...[LIM] MYB_2 PDOC00037
XPS00335 VSE.Q..H..G PARATHYROID PDOC00296
XPS00336 FELGS[LIVM]SKTF BETA_LACTAMASE_C PDOC00134
XPS00337 P.STFKI BETA_LACTAMASE_D PDOC00134
XPS00338 C[LIVMFY]..D[LIVMFYST].....[LIVMFY]..[LIVMFY]..C SOMATOTROPIN_2 PDOC00239
END_OF_FILE
if test 19585 -ne `wc -c <'prosite.regex'`; then
    echo shar: \"'prosite.regex'\" unpacked with wrong size!
fi
# end of 'prosite.regex'
fi
echo shar: End of shell archive.
exit 0

