From: ORANGE::GOLD::WINS%"JMILLER%VXBIO.SPAN@STAR.STANFORD.EDU" 17-AUG-1990 13:30
To: GILBERTD
Subj: plota.readme file

Return-Path: <@STAR.STANFORD.EDU:JMILLER@VXBIO.SPAN>
Received: from STAR.STANFORD.EDU by gold.ucs.indiana.edu with SMTP ; Fri, 17 Aug 90 13:36:34 EST
Received: from VXBIO by STAR.STANFORD.EDU via MAIL-11 with DECnet; Fri, 17 Aug 90 11:24:25-PST
Date: Fri 17 Aug 90 11:24:24-PST
From: JMILLER%VXBIO.SPAN@STAR.STANFORD.EDU
Subject: plota.readme file
To: "24986::""GilbertD@Gold.Bacs.Indiana.edu"""@VXBIO.SPAN

Notes on the compiled versions of the MacPROT package (Peter Markiewicz, 1990)

MACPROT

MacPROT consists of a set of programs for analyzing protein sequences for secondary structure, chain flexibility, hydropathy, helical wheels, and so on. Each program is supplied in a separate archive file.

These notes on the compiled MacPROT package are adapted from the original instructions for the interpreted MS Basic versions. Changes have been made to reflect the differences between the compiled and interpreted versions. A reference list will be given in a later version of this document. Each compiled version is supplied with the portion of the original manual that covers it, together with the general descriptions in this document. A complete list of the programs (June 1990) is given below.

Differences Between Compiled and Interpreted Versions:

The compiled versions are for 512k Mac and up models. Note that the programs may have problems with unenhanced (64k ROM) 512k Macs. To run (simpler) versions of the programs on a "classic" 512k Mac, you need to get the interpreted versions of the programs from Angela Luttke, as well as the Microsoft Basic interpreter.

The compiled versions (1) are faster, (2) handle several input file types, (3) allow printing of larger output pictures (up to 11"), (4) allow user control of output file type, (5) allow user control of output colors, which will work with color output devices even if the Mac has a black and white screen, and (6) have an improved Mac interface and better behavior under MultiFinder. Compiled versions run about 5-10 times faster than the interpreted versions.

Using 6-point Monaco

The Plot/A programs will use 6-point Monaco for output, if it is available in the System Folder. The font is supplied for installation. Using the smaller font makes the output charts easier to read.

Adding to the output picture

The user may add to the output drawing by clicking on it with the mouse. If a peak on a hydropathy plot is clicked, for example, the program will list the position in the protein that was clicked. This is very useful for finding interesting stretches in the peptide sequence. The numbers are added to the output ONLY until the user saves the output to the Clipboard, or prints the result. After that, the user cannot add to the existing picture. The numbers are drawn in Monaco 9-point font UNLESS you have installed Monaco 6-point in your System Folder. The font is provided for installation with the compiled versions of the MacPROT package.

Input File Formats

The interpreted versions of the programs recognize one specific format, and give file errors if any other sequence file format (including plain sequence) is used. The compiled versions are more flexible, and can handle the following formats: the original MacPROT format, plain sequence (no comments or numbers), Pearson format, EMBL format, GenBank format, and IntelliGenetics format. If the file contains more than one sequence in these formats, only the first sequence is loaded. In theory, files up to 32,000 amino acids could be loaded, but the practical limit is about 1000 amino acids.
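
As an illustration of the "only the first sequence is loaded" rule, here is a minimal Python sketch for two of the formats listed above: plain sequence and Pearson format, where each record begins with a ">" header line. This is NOT the MacPROT code, which is not shown in these notes. The function name first_sequence is hypothetical, and the handling of spaces and digits inside sequence lines is an assumption.

```python
def first_sequence(text):
    """Return the first protein sequence in plain or Pearson-style text.

    Hypothetical helper for illustration only; not taken from MacPROT.
    """
    residues = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_header or residues:
                break  # a second record starts here; keep only the first
            seen_header = True
            continue  # the header line itself carries no residues
        # Assumption: drop spaces, digits, and punctuation; keep letters only.
        residues.append("".join(ch for ch in line.upper() if ch.isalpha()))
    return "".join(residues)
```

For example, a file holding two Pearson records yields only the residues of the first record, and a plain sequence typed with spaces is returned as one uppercase string.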
All programs were tested for function with files of up to 1000 amino acids.

The programs use TEXT or DNA Strider protein files only. If you type in your sequences with a word processor or other program, you will need to save your sequences in TEXT ONLY format. If you don't do this, you won't see your sequence when you press the "load file" button! Consult the manual for your favorite word processor on how to save a sequence as a TEXT ONLY file. You CAN read protein files made with DNA Strider. However, this format is tricky, and if you write comments in Strider, the PLOT/A programs won't be able to tell where the sequence ends and the comments begin! If you must use Strider files, remove all comments.

MacPROT and MacDraw/MacPaint

The programs in the MacPROT package create useful output plots which may be printed, or copied to the Clipboard using File menu commands. Once the output is copied to the Clipboard, you may start either MacDraw or MacPaint (or similar programs), and paste the output graph into a document. You can also paste directly into your word processor, but you will probably want to edit the output chart somewhat before doing this. In MacDraw, the plot is pasted as a true collection of draw objects, rather than a bitmap, which allows you to customize it by color, font sizes, etc. MacDraw output prints at the resolution of your printer: 300 dpi (dots per inch) for a laserprinter, and 72 dpi for an ImageWriter. You can also paste into a paint-type program, which will create a bitmap at 72 dpi, and print at 72 dpi (even with a LaserWriter!).

Note that this copy/paste routine works with ALL Mac programs. That is, if you select "Copy" in any Mac program, load a new program, and select "Paste", the copied result is pasted. This is an excellent way to move data between drawing programs and word processors. For example, after "cleaning up" a plot in MacDraw, you could select it, copy it, start MacWrite or MS Word, and select Paste.
The chart would become part of your word processing document.

Note that very large protein sequences (>500-600 amino acids) may exceed the maximum allowed output picture size in some of the programs. If this happens, an alert warns the user to reduce the size of the sequence plotted. In that case, the best thing is to analyze several portions, copy the results, and paste each segment's results in turn into a single draw or paint document. This works especially well with MultiFinder.

The complete set of programs is listed below. MacPROT programs are listed under the menu 'Programs' in the given order.

MENUPROT (NOT IN THE COMPILED VERSIONS!!!!) gives an overview of the individual programs and their tasks (not under 'Programs', but opened either from the Finder or from within the programs by using 'CANCEL' or 'Exit' in all other programs; see below). It is a reminder of what the programs do and helps in deciding which program needs to be opened for a certain analysis. It is useful for the starting user, who is not yet familiar with what the programs' names stand for.

AA.DATA is for storing and revising a protein sequence. The program puts the sequence in the format which is later used by the analysing programs. It can also be used for editing an already stored sequence. In addition to the plain sequence, an optional comment is stored, and data on amino acid frequency and molecular weights are calculated and written to the same file.

PLOT.A/HYD analyses the hydropathy (hydrophobicity/hydrophilicity) of any sequence stored in AA.DATA file format, according to four published hydropathy scales. The plot can be printed or saved to the Clipboard for pasting into a picture accepting program. After the plot is completed the same sequence can be analysed using different parameters and/or a different hydropathy scale.

PLOT.A/HYD5 analyses the hydropathy of a given sequence.
It calculates and plots the hydropathy for five increasing moving averages (spans) simultaneously. The first span and the step for the increase are chosen by the user. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/SUM provides three hydropathy plots for a single sequence simultaneously, using three different hydropathy scales. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/H3 calculates and plots the hydropathy of up to three sequences simultaneously. The user can choose among four hydropathy scales. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/TMH calculates and plots the hydropathy. In addition, each value for the chosen span is evaluated for its coordinates in a 'hydrophobic moment' plot, and the corresponding helix type assignment is shown. The assignment is also written to a disk file in ASCII format. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/HEL draws a helical wheel of a user selected span of amino acids, plots the hydropathy, and calculates the mean hydropathy using two scales. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/DOT plots two sequences (the same or different ones) against each other in a dot matrix. Two methods are used, which search either for perfect matches (identities) or for similarities according to five different score matrices based on (a) accepted point mutations, (b) genetic and structural similarities, (c) conformational state, and (d) observed substitutions. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/GOR calculates and plots the preference of each residue for four conformations (helix, extended, coil, turn) along the sequence. The sequence is written to a file in coded letters according to the most likely preference. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/GGR similarly calculates and plots the preference for three conformations (helix, extended, coil) along the sequence and writes the evaluation to a disk file. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/STR searches for up to ten user selected stretches of amino acids within a given sequence and plots the respective positions found along the sequence. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/FOT suggests regions likely to be amphipathic. Two disk files are created: (1) one specifying the residue starting each amphipathic segment, and (2) one containing the sequence in coded format. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/POW plots the power spectrum of a user selected stretch. Saving and printing is done as in PLOT.A/HYD.

PLOT.A/KAS determines main chain flexibility and in turn antigenic determinants.

PLOT.A/TSL translates a DNA sequence.

Paths among programs

Each program can be opened from the Finder by double-clicking on the respective icon, or by starting MENUPROT and then choosing the respective program from the menu bar under the 'Programs' menu. From within the programs, the same or a different program can be chosen at any time during the run by pulling down the 'Programs' menu. A 'CANCEL' button in the first window provides the option to leave the program and go to 'MENUPROT'. There are two other ways of leaving a program during program execution, 'Exit' and 'Quit', via the 'File' menu (see sketch above), which also contains the options for saving a plot to the Clipboard or printing it on a dot matrix printer. A "Help" button in the input window briefly summarizes program use.

The 'File' menu

Note that the File menu commands for Open and Save are handled differently in the MacPROT package. The standard Open command for an input file is provided by the "load file" button, and the programs prompt for the name of an output file to Save if necessary.

Save to Clip saves a plot to the Clipboard for pasting into a picture accepting program (graphs created by programs for this manual were saved to FullPaint). This option is available only after the plot is finished.
It is otherwise identical to the "Copy" command found under the Edit menu of most Mac programs. At other times during program execution (loading a file, choosing parameters, analysis) the option is dimmed. Once a plot is saved to the Clipboard, you should quit the program (Finder), or switch to the program you want to paste to (MultiFinder), and select Paste. Pasting may take several seconds, since some of the plots are very complex for the Mac to draw! Rerunning the program immediately with the same or different parameters erases the current plot and creates a new one.

Print Picture prints the plot to any printer. As with 'Save to ...', the option is available only after the plot is finished. The two usual dialogs on paper size and printing options need to be answered before printing starts. Note that a MacPROT program can only draw a single page of output, meaning the printed plot can't be more than 10 inches long. To print a long plot, select the sideways "landscape" print mode in the Page Setup window.

QuickPrint (NOTE: THIS WORKS ONLY WITH AN APPLE IMAGEWRITER!!!!) prints an instant hardcopy of the screen on the dot matrix printer (printer must be on!) in 'Standard' quality. Though similar to the 'Print Picture' option in 'Standard', it is a quick print facility that omits the selection windows for paper format and print quality.

Exit closes all open files (the running program as well as text files created by some programs) and goes to 'MENUPROT'. The option is available at all times during program execution. It is particularly useful if the 'wrong' program was opened by mistake and explanations about the label for the 'right' program are required (or a cup of coffee/tea is needed in between?).

Quit also closes all open files, but goes back to the Finder (done for the day!).

The Options Menu

The compiled versions of MacPROT have an Options menu, where the user may change the colors used in creating the output graph, and the output file type.
The colors currently in use are listed in the input window. The color information is recorded even if you have a black and white Mac. You can print in color to (1) an ImageWriter II with a color ribbon, (2) an HP PaintJet or other color inkjet printer with a Mac interface, or (3) a color laserprinter.

The file type option sets the output file format used in creating text output of the calculations. When the output file is double-clicked (it may be necessary to open and close its volume/folder first), the Mac will use the specified word processor to open it. MacWrite, WriteNow, MS Word, EDIT, and FORMAT are currently supported.

Access to Programs

Compiled versions of these programs will become available in 1990. They may be downloaded from the following databases using electronic mail or FTP over bitnet, internet, or your local net if it has a gateway.

EMBL:           NETSERV@EMBL
                GET MAC_SOFTWARE:filename.ext
IUBIO:          IuBio.Bio.Indiana.Edu (IP name)
                129.79.1.101 (IP address)
U Houston:      genbank-server@bchs.uh.edu (INTERNET/ARPANET)
                uhnix2!genbank-server (usenet won't work for much longer)
                SEND MAC hqx-encoded-file-name
America Online: New service summer 1990, consult your manual for details

Send HELP messages to find out how to download the programs from the various servers listed above.

They are also available on floppies as part of a larger set of public domain programs for sequence analysis on the Macintosh from the following address:

Peter Markiewicz
Dept. Microbiology, MBI, UCLA
Los Angeles, CA 90024

To get them on floppy, you must (1) send 15 800k initialized floppies with blank labels, and (2) include a return envelope, addressed to you, with $2.40 postage (US). If either of the above is lacking, the programs won't be sent. It normally takes me 2-4 months to process requests. Note that you can get the programs MUCH faster by using electronic mail and/or the online databases listed above.
A tutorial on how to use electronic mail and online databases is available as a HyperCard stack from the above address.
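
Appendix (illustration only): the moving-average hydropathy calculation described for PLOT.A/HYD and PLOT.A/HYD5 can be sketched in a few lines of Python. This is not the MacPROT code. These notes do not name the four hydropathy scales the programs offer, so the Kyte-Doolittle scale is used here as an assumption, and the function names hydropathy_profile and hyd5_profiles are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; MacPROT's own four
# scales are not named in these notes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, span=7):
    """Mean hydropathy of every window of `span` residues along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if span < 1 or span > len(values):
        return []  # no complete window fits
    return [sum(values[i:i + span]) / span
            for i in range(len(values) - span + 1)]


def hyd5_profiles(sequence, first_span, step):
    """Five profiles with increasing spans, in the manner of PLOT.A/HYD5."""
    spans = [first_span + k * step for k in range(5)]
    return {span: hydropathy_profile(sequence, span) for span in spans}
```

Each profile has one value per complete window, so a sequence of N residues analysed with span S gives N - S + 1 points. Larger spans smooth the curve, which is why plotting five spans side by side helps in choosing a window size.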