From bronze!news.cs.indiana.edu!samsung!psuvax1!psuvm!auvm!NCBI.NLM.NIH.GOV!GHOSH Mon Jun 3 18:48:10 EST 1991
Article: 177 of bit.listserv.info-gcg
Path: bronze!news.cs.indiana.edu!samsung!psuvax1!psuvm!auvm!NCBI.NLM.NIH.GOV!GHOSH
From: ghosh@NCBI.NLM.NIH.GOV (David Ghosh)
Newsgroups: bit.listserv.info-gcg
Subject: TFD release 3.0
Message-ID: <9106032042.AA03829@nexus.nlm.nih.gov>
Date: 3 Jun 91 20:42:02 GMT
Sender: "INFO-GCG: GCG Genetics Software Discussion"
Reply-To: David Ghosh
Lines: 41
Comments: Gated by NETNEWS@AUVM.AUVM.EDU
Comments: To: ken@helix.nih.gov, smith@gcg.com,
 bucher@gnomic.stanford.edu, blattner@vms3.macc.wisc.edu,
 info-gcg@utoronto.BITNET, russo@mbcrr.harvard.edu,
 tsmith@mbcrr.harvard.edu, dxp%trna@lanl.gov,
 yablonsky@mbcl.rutgers.edu, clark@mshri.utoronto.ca,
 victoria@helix.nih.gov, broe@aardvark.ucs.uokor.edu,
 forsdyke@genbank.bio.net, suter2@urz.unibas.ch,
 rainer.fuchs@embl.BITNET, cochran@mitwccf.BITNET,
 jaguar@helix.nih.gov, hf.mps@forsythe.stanford.edu,
 extg002@bbrnsf11.BITNET, cooke%frperp51.bitnet@cunyvm.cuny.edu,
 kamel@mangrove.cis.ufl.edu, chas@stork.mbir.bcm.tmc.edu,
 gartmann%vms.mpiib-freiburg.mpg.dbp.de@relay.cs.net,
 chuck@mizzen.niaid.nih.gov, seqtest@wccf.mit.edu,
 lfk@eastman1.mit.edu, cherry@frodo.mgh.harvard.edu,
 jackl@caos.caos.kun.nl, cbonnard@ulmed.unil.ch,
 frank@statistics-service.scot-a

TFD (Transcription Factors Database) release 3.0 is now ready for
distribution. As with previous releases, the files can be downloaded
from our FTP server "ncbi.nlm.nih.gov" (130.14.20.1/repository/TFD).

If you are receiving this message directly and are not on the TFDINFO
mailing list, please be advised that this is the last direct message
you will receive about updates to TFD, or about new software products
that use TFD datasets, unless you choose to subscribe to TFDINFO. To
subscribe, if you have not done so already, send a request to
"tfdinfo-request@ncbi.nlm.nih.gov".

Starting with this release, README files from previous releases will
be maintained in the "Release-Notes" subdirectory. Information about
structural changes to TFD in the 3.0 README file is restricted to
changes that have occurred between releases 2.1 and 3.0.

The fixed-record-length ASCII text file representation of TFD is
contained in the "tfd.ascii" subdirectory, and the ASN.1
representation in the "tfd.asn1" subdirectory (as with release 2.1).
Sequence analysis datasets from the SITES table are contained in the
"dynamic-data", "gcg-data", "macvector-data", and "sigscan-data"
subdirectories. The new "fasta-data" subdirectory contains the "tfdaa"
amino acid sequence dataset from the TFD 3.0 DOMAINS and POLYPEPTIDES
tables, which can be used by the FASTA and BLAST families of programs.

For users of the SITES dataset from TFD:

The SITES table contains a new field, "N_PROB", which holds the
precomputed intrinsic probability of occurrence for each SITES entry.
The various SITES datasets that are formatted for sequence analysis
software (in the macvector-data, dynamic-data, gcg-data, and
sigscan-data subdirectories) were produced using a cutoff value of
5.00e-04. The SITES sequence analysis datasets therefore contain all
unique six-nucleotide (and longer) SITES sequence entries, but do not
contain five-nucleotide SITES entries, or degenerate high-probability
six- and seven-nucleotide SITES entries. This filtering was performed
to avoid the excessive matches to short sequences that are obtained
with most of the currently available software for SITES analyses.

- David Ghosh, NCBI
  June 3, 1991
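
For readers who want to see why a cutoff of 5.00e-04 separates
five-nucleotide from six-nucleotide entries, here is a minimal sketch
of one plausible way to compute such a probability. It is an
illustration only, not the method used to build TFD. It assumes that
the four bases are equally likely (0.25 each), that a degenerate
IUPAC code matching k bases contributes k/4, and that an entry is kept
when its probability is at or below the cutoff. The function names and
the example site strings are hypothetical and are not taken from the
SITES table.

```python
# Sketch only: the uniform base frequencies and the "keep if <= cutoff"
# rule are assumptions, not documented TFD behaviour.

# Number of bases matched by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

CUTOFF = 5.00e-04  # cutoff value quoted in the announcement


def site_probability(site):
    """Chance that a random position matches `site`, assuming each of
    the four bases occurs with probability 0.25 (an assumption)."""
    p = 1.0
    for base in site.upper():
        p *= IUPAC_DEGENERACY[base] / 4.0
    return p


def passes_cutoff(site, cutoff=CUTOFF):
    """True if the site would be kept under the assumed rule."""
    return site_probability(site) <= cutoff


if __name__ == "__main__":
    # Hypothetical example strings, chosen only to show the arithmetic.
    for site in ("GGGCG", "TATAAA", "TATANA", "TGACGTCA"):
        print(site, "%.2e" % site_probability(site), passes_cutoff(site))
```

Under these assumptions an unambiguous five-nucleotide site has
probability 0.25^5 (about 9.8e-04) and is excluded, while an
unambiguous six-nucleotide site has 0.25^6 (about 2.4e-04) and is
kept. A six-nucleotide site with one "N" has the same probability as a
five-nucleotide site and is also excluded, which is consistent with the
filtering described above.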