From THOMPSON@BOBCAT.CSC.WSU.EDU Thu May 28 10:59:29 1992
Received: from BOBCAT.CSC.WSU.EDU (wsuvms1.csc.wsu.edu) by sunflower.bio.indiana.edu (4.1/9.5jsm)
        id AA05862; Thu, 28 May 92 10:59:22 EST
Date: Thu, 28 May 1992 8:59:14 -0700 (PDT)
From: THOMPSON@BOBCAT.CSC.WSU.EDU (Steve Thompson: VADMS genetics)
Message-Id: <920528085914.20201029@BOBCAT.CSC.WSU.EDU>
Subject: my profiles are now available via anon ftp
To: gribskov@sdsc.edu, gilbertd@sunflower.bio.indiana.edu,
        tolvanen@cc.helsinki.fi, rluethy@ulrec2.unil.ch,
        ODONNELL@ARCB.AFRC.AC.UK, ljm@blackbox.mayo.EDU, dmerberg@genetics.com,
        pongor@genes.icgeb.trieste.it, edelman@gcg.com, 2020273@saphir.ulaval.ca,
        jrees@vax.ox.ac.uk, bio320@cvx12.inet.dkfz-heidelberg.de,
        cmswalters@karl.iit.edu, BIO-SOFT@genbank.bio.net
X-Vmsmail-To: @MAILDIS:PROFILE
Status: R

Greetings NetLanders and "Profilers"

Thank you so much for the encouraging words of support and gentle prodding
to get my profiles available for transfer. I do apologize for the amount of
time it has taken me to get to this point, but you all know the story: busy,
busy! As promised, I have moved them to a location accessible by anonymous
ftp. I have not yet documented the profiles and, therefore, have not yet sent
them off to IUBIO. But in the interest of availability and convenience for
everyone I have placed them in our own VAX cluster's anonymous account.

As for their full documentation, Michael Gribskov suggests that eventually
profiles should include the multiple sequence alignment and database entry
names used to form them, annotated functional sites, search statistics and
literature references. A laudable goal indeed, but an incredible amount of
work! One problem I immediately foresee is that many of my profiles' multiple
sequence alignments are huge because they contain a very large number of
entries (e.g. Phospholipase A2 has 91 unique entries). I could easily append
the two files together, .MSF and .PRF, but do people really want such huge
files?
This would be a good topic of discussion in the newsgroup.

Regardless, my profiles are now available. I consider these "high quality"
profiles although I have not applied any special statistics to them. I have
eliminated all duplicate entries in them, differentially weighted many of the
similar entries, and often trimmed off highly variable termini. Additionally,
all have been backscreened against the databases using ProfileSearch and
"found" only themselves with "high" Z scores, with the obvious exception of
families which are already known to be very similar to one another (such as
the Apicomplexan HSP70like antigen profile). I know, all rather empirical
verbiage, but they do work well. The individual entries' names, weights and
beginnings/endings are all listed in the headers to each profile as per
standard GCG format.

The directory structure and logon information for our anonymous account
follows. The MolBio directory is empty other than my profiles, but we decided
to set it up that way to leave the potential for expansion. Everyone is
welcome to use my profiles; I only ask that you acknowledge me as their author
and inform me if you publish something which uses them. I will be expanding
the list in the coming months, so you may wish to check again in the future.

Again, thanks for all of the support; I hope they can be of some use to you.

Steve Thompson

********************************************************************************
Internet address: bobcat.csc.wsu.edu   134.121.1.1
alias:            wsuvms1.csc.wsu.edu
logon as:         USER ANONYMOUS
password:         your Internet address
********************************************************************************
path: root/molbio/profiles  (however, this is a VMS site not Unix!)
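
For anyone who would rather script the transfer than type it, the logon
details above might translate into something like the following Python
sketch. It is only an illustration under stated assumptions: the helper names
`vms_subdir` and `fetch_profile` are hypothetical, it assumes the server
accepts a VMS-style relative directory specification in a change-directory
request, and it assumes the host is still reachable under the address given
in this announcement.

```python
from ftplib import FTP

# Host name taken from the announcement; it may no longer be reachable.
HOST = "bobcat.csc.wsu.edu"


def vms_subdir(*parts: str) -> str:
    """Build a VMS relative directory spec.

    ("molbio", "profiles") -> "[.MOLBIO.PROFILES]"
    The leading dot makes the spec relative to the login directory.
    """
    if not parts:
        raise ValueError("need at least one directory name")
    return "[." + ".".join(p.upper() for p in parts) + "]"


def fetch_profile(family: str, filename: str, email: str, host: str = HOST) -> str:
    """Download one profile as text from the anonymous account.

    Assumption: the server understands VMS directory specs in CWD.
    ASCII-mode transfer (retrlines) is used because profiles are text files.
    """
    lines = []
    with FTP(host, timeout=30) as ftp:
        # Anonymous logon; the password is your own Internet address.
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(vms_subdir("molbio", "profiles", family))
        ftp.retrlines("RETR " + filename, lines.append)
    return "\n".join(lines)
```

A call such as `fetch_profile("azurin", "AZURIN.PRF", "you@your.site")` would
then return the profile text, provided the assumptions above hold. Nothing is
contacted until `fetch_profile` is actually called.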
DIR [ANONYMOUS.MOLBIO.PROFILES]

APICOMPLEX.DIR    AZURIN.DIR        GRWTHFCTR.DIR     NEUROPEP.DIR
OVALBUMIN.DIR     PHOSPHOLIP.DIR    PRION.DIR         PROTEAINHIB.DIR
RECEPT.DIR        RUBISCO.DIR

and README.TXT (this file)
********************************************************************************
The refined profiles assembled by Steven M. Thompson as of May 1992
(VMS block size & date also listed):

[.APICOMPLEX]   Apicomplexan antigen protein families:
    APICAL.PRF;1            156    6-MAY-1992
    BABR.PRF;1               55    6-MAY-1992
    CSP.PRF;1               117   11-MAY-1992
    HRP.PRF;1                87   11-MAY-1992
    HSP70LIKE.PRF;2          80   27-MAY-1992
    KNOB.PRF;2              116   27-MAY-1992
    OS25.PRF;1               55   11-MAY-1992
    PMMSA.PRF;1              91   27-MAY-1992

[.AZURIN]       Azurin:
    AZURIN.PRF;2             35    4-FEB-1992

[.GRTHFCTR]     Growth Factors and Cytokines:
    IGF2.PRF;1               20   31-OCT-1991

[.NEUROPEP]     Neuropeptides:
    CRF.PRF;1                50   13-FEB-1992
    GRF.PRF;1                28   13-FEB-1992
    SMS.PRF;1                36   13-FEB-1992
    SMS1.PRF;1               33   13-FEB-1992
    SMS2.PRF;1               34   13-FEB-1992
    TRH.PRF;1                69   13-FEB-1992

[.OVALBUMIN]    Ovalbumin:
    OVALBUMIN.PRF;1         191   22-FEB-1991

[.PHOSPHOLIP]   Phospholipases:
    PA2-1.PRF;1              38   11-MAR-1992
    PA2-2.PRF;1              41   11-MAR-1992
    PA2-3.PRF;1              37   11-MAR-1992
    PA2-CAT.PRF;1            30    8-APR-1992
    PA2.PRF;1                48    7-APR-1992

[.PRION]        Scrapie/Prion Protein:
    PRION.PRF;2              67   27-MAY-1992

[.PROTEAINHIB]  Protease Inhibitors:
    KAZAL.PRF;1              18    7-APR-1992

[.RECEPT]       G-Protein Linked Receptors:
    HUMAN.PRF;1             159    3-APR-1992
    NEUROKIN.PRF;1           82    7-FEB-1992
    RHODOPSIN.PRF;1          89   21-MAY-1992

[.RUBISCO]      Ribulose Bisphosphate Carboxylase/Oxygenase:
    ACTIVE_SITE.PRF;3        44   12-NOV-1991
    LSU.PRF;2               132    8-NOV-1991
    PROLSU.PRF;1            127   13-NOV-1991
********************************************************************************
Steven M. Thompson
Consultant in Molecular Genetics and Sequence Analysis
VADMS (Visualization, Analysis & Design in the Molecular Sciences) Laboratory
Washington State University, Pullman, WA 99164-1224, USA
AT&Tnet:  (509) 335-0533 or 335-3179    FAX: (509) 335-0540
BITnet:   THOMPSON@WSUVMS1 or STEVET@WSUVM1
INTERnet: THOMPSON@wsuvms1.csc.wsu.edu
********************************************************************************
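
Each profile entry in the listing in this file gives a file name with its VMS
version number, a size in VMS blocks, and a date. As a rough illustration of
how such an entry could be read mechanically, here is a small Python sketch.
The names `parse_entry` and `ProfileEntry` are hypothetical, and the byte
figure is only an upper bound, on the assumption that one VMS disk block holds
512 bytes and the last block may be partly empty.

```python
import re
from dataclasses import dataclass
from datetime import date

VMS_BLOCK_BYTES = 512  # assumed size of one VMS disk block

MONTHS = {m: i + 1 for i, m in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split())}

# name.ext;version  blocks  dd-MON-yyyy   e.g. "HSP70LIKE.PRF;2 80 27-MAY-1992"
ENTRY = re.compile(
    r"(?P<name>[A-Z0-9_$-]+\.[A-Z0-9_$-]*);(?P<version>\d+)\s+"
    r"(?P<blocks>\d+)\s+"
    r"(?P<day>\d{1,2})-(?P<mon>[A-Z]{3})-(?P<year>\d{4})"
)


@dataclass
class ProfileEntry:
    name: str       # file name without the version, e.g. "AZURIN.PRF"
    version: int    # VMS file version (the number after ';')
    blocks: int     # size in VMS blocks, as listed
    created: date   # date shown in the listing
    max_bytes: int  # upper bound on size: blocks * VMS_BLOCK_BYTES


def parse_entry(line: str) -> ProfileEntry:
    """Parse one listing entry; raise ValueError if it does not match."""
    m = ENTRY.fullmatch(line.strip().upper())
    if m is None or m["mon"] not in MONTHS:
        raise ValueError(f"not a listing entry: {line!r}")
    blocks = int(m["blocks"])
    return ProfileEntry(
        name=m["name"],
        version=int(m["version"]),
        blocks=blocks,
        created=date(int(m["year"]), MONTHS[m["mon"]], int(m["day"])),
        max_bytes=blocks * VMS_BLOCK_BYTES,
    )
```

For example, `parse_entry("PA2.PRF;1 48 7-APR-1992")` yields an entry with 48
blocks, so at most 24576 bytes under the 512-byte assumption.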