The SILMUT and TABLE programs in this package help you identify regions in a sequence that can be altered by silent mutation to introduce restriction enzyme sites and other sequences. The work is based on the following publications.

1. B. Shankarappa, D.A. Sirko, and G.D. Ehrlich. A General Method for the Identification of Regions Suitable for Site-Directed Silent Mutagenesis. BioTechniques 12, No. (3): 382-384.
2. B. Shankarappa, K. Vijayananda, and G.D. Ehrlich. SILMUT: A Computer Program for the Identification of Regions Suitable for Silent Mutagenesis to Introduce Restriction Enzyme Recognition Sequences. BioTechniques 12, No. (6): 882-884.

If you have any questions, please contact one of us at the following addresses.

1. Dr. Raj Shankarappa
   Research Associate, University of Pittsburgh,
   Department of Pathology, Pittsburgh, PA 15261
   email: bsh@med.pitt.edu
   phone: (412) 648-1986 or 648-9763 or 648-9026
   fax: (412) 648-1916

OR

2. Mr. Vijayananda, vijay@litsun.epfl.ch
   Office: EPFL - LIT, EL - ECUBLENS, CH-1015 LAUSANNE, SWITZERLAND, +41-21-693.47.03
   Residence: K. Vijayananda, 16 Ch. du Martinet, CH-1007 Lausanne, Switzerland, +41-21-24.03.00

We would like to wish you good luck in your research efforts and hope these programs have been helpful to you.

====================================================================

Please read the following if you have questions about using the programs SILMUT and TABLE.

The disk contains the following files.

1. Silmut.c: documented program of silmut.exe
2. Silmut.exe: executable file compiled by a C compiler
3. Table.c: documented program of table.exe
4. Table.exe: executable file compiled by a C compiler
5. Dbase1: file containing single-letter amino acid codes and the corresponding codons.
6. Dbase2: file containing the common 30 recognition sites for restriction enzymes.
7. gpgr.in: input file containing the nucleic acid sequence of the conserved GPGR amino acid motif of the HIV-1 virus.
8. gpgr.out: actual output file generated by silmut.exe for the input in the gpgr.in file.

If you intend to change any files to suit your particular use, we suggest you retain a copy of the above files as a precaution.

The Silmut program identifies the potential for silent mutagenesis to introduce any 6-base sequence, such as a restriction enzyme site. The Table program provides the sets of amino acid motifs obtained by translating any 6-base sequence in three reading frames.

The Silmut and Table programs are written in C and can be used on any IBM-compatible computer. The programs can also be run on UNIX or other systems that support a C compiler.

The Silmut program identifies the silent mutation potential for the introduction of 6-base restriction enzyme sequences by translating the RE recognition sequence in three reading frames to obtain a set of amino acid motifs, and then searching for each motif in the amino acid sequence provided as input. Thirty selected restriction enzymes whose recognition sites contain no degeneracy are provided in the file dbase2. This file can be edited in a DOS text editor to remove restriction enzyme sequences so that they are no longer searched for, or to add new 6-base sequences. The maximum number of restriction enzyme sites that can be included in this file is 100.

The program is invoked by typing SILMUT, after which you can choose to analyze a nucleic acid sequence, analyze an amino acid sequence, or exit the program. The default mode for both input and output is the screen. If you choose to analyze a sequence stored in a file, that file must be in DOS format. The file should contain 1 (for AA) or 2 (for NA) on the first line, the nucleic/amino acid sequence on the second line, and 3 (for exiting) on the third line [for an example, please see the file gpgr.in]. We recommend that you use a DOS editor like Q.EXE, which can handle line lengths of 500 or more characters.
If you use other programs like WordPerfect(TM), which usually handle about 52 characters in a line, you may have to split the sequence into separate overlapping lines. Also make sure you save the file in DOS format.

The proper syntax for evaluating a sequence in a file is as follows.

    silmut -i file1 -o file2

Here the input from file1 is analyzed and the output is directed to file2. Different combinations of input and output from or to the screen can be used by omitting the appropriate parameters following the command silmut. For example,

    silmut -i file1

reads the input from file1 and directs the output to the screen. Similarly,

    silmut -o file2

takes the input from the screen and directs the output to a file called file2.

The Table program provides a table/listing of amino acid motifs obtained by translating a 6-base sequence in three reading frames. Translation in the first reading frame yields a sequence of two amino acids, while translation in the second and third reading frames yields three positions, with degeneracies in the first and third positions. Each of these amino acid motifs is separated in the output for clarity. Because the lines are long, the end of a line might wrap onto the next line. You can correct for this by editing and printing the file in a word processing program after setting wider margins or a smaller font.

The default inputs are the files dbase1 and dbase2, for translation of the recognition sequences of 30 commonly used restriction enzymes. If you want to change the codon specificities, you have to edit the file dbase1. If you want to add or delete any restriction enzyme sites, or any other 6-base sites such as splice sites, you need to edit the file dbase2. This editing can be done in a DOS environment using an editor such as Q.EXE. If you use other software such as WordPerfect, make sure the file is saved as a DOS text file. Also make sure that the spacing and other formatting of the files dbase1 and dbase2 are retained as before.
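The three-frame translation that TABLE is described as performing on a 6-base site can be sketched in C as follows. This is only an illustration and not the code in Table.c: it hard-codes the standard genetic code instead of reading dbase1, writes stop codons as X as in the table output, expects upper-case bases, and the names codon_set and frame_motif are hypothetical. A '?' stands for a flanking base that is free to vary.

```c
#include <assert.h>
#include <string.h>

#define SET_MAX 22  /* 20 amino acids + X (stop) + terminating NUL */

/* Standard genetic code, bases ordered T, C, A, G; X marks a stop codon. */
static const char BASES[] = "TCAG";
static const char CODE[] =
    "FFLLSSSSYYXXCCXW"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG";

/* Index of an upper-case base in BASES, or -1 if it is not T, C, A or G. */
static int base_index(char b)
{
    const char *p = (b != '\0') ? strchr(BASES, b) : NULL;
    return p ? (int)(p - BASES) : -1;
}

/* Collect the distinct amino acids encoded by a codon pattern in which
 * '?' means "any base". Returns 0 on success, -1 on an invalid base. */
int codon_set(const char pat[3], char out[SET_MAX])
{
    int lo[3], hi[3];
    size_t n = 0;

    out[0] = '\0';
    for (int i = 0; i < 3; i++) {
        if (pat[i] == '?') {
            lo[i] = 0;
            hi[i] = 3;
        } else {
            int k = base_index(pat[i]);
            if (k < 0)
                return -1;
            lo[i] = hi[i] = k;
        }
    }
    for (int a = lo[0]; a <= hi[0]; a++)
        for (int b = lo[1]; b <= hi[1]; b++)
            for (int c = lo[2]; c <= hi[2]; c++) {
                char aa = CODE[16 * a + 4 * b + c];
                if (strchr(out, aa) == NULL) {
                    assert(n < SET_MAX - 1);  /* buffer invariant */
                    out[n++] = aa;
                    out[n] = '\0';
                }
            }
    return 0;
}

/* Translate a 6-base site in reading frame 1, 2 or 3. Frame 1 gives two
 * positions; frames 2 and 3 give three, the outer ones being degenerate.
 * Returns the number of positions filled, or -1 on bad input. */
int frame_motif(const char *site, int frame, char out[3][SET_MAX])
{
    char padded[10] = "?????????";
    int ncodons = (frame == 1) ? 2 : 3;

    if (strlen(site) != 6 || frame < 1 || frame > 3)
        return -1;
    out[0][0] = out[1][0] = out[2][0] = '\0';
    memcpy(padded + (frame - 1), site, 6);
    for (int i = 0; i < ncodons; i++)
        if (codon_set(padded + 3 * i, out[i]) != 0)
            return -1;
    return ncodons;
}
```

Under these assumptions, calling frame_motif("AAGCTT", 2, m) on the HindIII site fills m with XQKE, A and FLSYXCW, and frame 3 gives LSXPQRITKVAEG, S and FL.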
The default output of TABLE is the screen. If you choose to direct the output to a file, the proper syntax is

    table output.tab

where output.tab is the file to which you want the output directed.

If you find any bugs in this program, we would be thankful if you would inform us. We intend to improve the program in the near future.

Following is the output from the TABLE program for 30 selected RE sites. Please adjust the margins so that all the data for each entry is on one line. The order of the contents of the following table is: RE; recognition site; 1st and 2nd AA obtained from the first reading frame; 1st, 2nd and 3rd AA obtained from the second reading frame; and 1st, 2nd and 3rd AA obtained from the third reading frame.
(PLEASE SET THE MARGINS WIDE ENOUGH TO ACCOMMODATE THE WRAPPED LINE)

RE                                     Site    Frame 1  Frame 2         Frame 3
HindIII                                AAGCTT  K L      XQKE A FLSYXCW  LSXPQRITKVAEG   S FL
MluI                                   ACGCGT  T R      YHND A FLSYXCW  LSXPQRITKVAEG   R V
SpeI                                   ACTAGT  T S      YHND X FLSYXCW  LSXPQRITKVAEG   L V
BglII                                  AGATCT  R S      XQKE I FLSYXCW  LSXPQRITKVAEG   D L
StuI                                   AGGCCT  R P      XQKE A FLSYXCW  LSXPQRITKVAEG   G L
BanIII/BspXI/ClaI                      ATCGAT  I D      YHND R FLSYXCW  LSXPQRITKVAEG   S IM
Cfr6I/PvuII                            CAGCTG  Q L      SPTA A VADEG    FSYCLPHRITNVADG S CXW
NdeI                                   CATATG  H M      SPTA Y VADEG    FSYCLPHRITNVADG I CXW
NcoI                                   CCATGG  P W      SPTA M VADEG    FSYCLPHRITNVADG H G
Cfr9I/SmaI/XmaI                        CCCGGG  P G      SPTA R VADEG    FSYCLPHRITNVADG P G
SacII                                  CCGCGG  P R      SPTA A VADEG    FSYCLPHRITNVADG R G
PvuI/RshI/XorII                        CGATCG  R S      SPTA I VADEG    FSYCLPHRITNVADG D R
EagI/XmaIII                            CGGCCG  R P      SPTA A VADEG    FSYCLPHRITNVADG G R
BsuMI/PaeR7I/XhoI                      CTCGAG  L E      SPTA R VADEG    FSYCLPHRITNVADG S SR
PstI/SflI                              CTGCAG  L Q      SPTA A VADEG    FSYCLPHRITNVADG C SR
EcoRI/RsrI/Sso47I                      GAATTC  E F      XRG  I LPHQR    LSXWPQRMTKVAEG  N S
SacI/SstI                              GAGCTC  E L      XRG  A LPHQR    LSXWPQRMTKVAEG  S S
EcoRV                                  GATATC  D I      XRG  Y LPHQR    LSXWPQRMTKVAEG  I S
SphI                                   GCATGC  A C      CRSG M LPHQR    LSXWPQRMTKVAEG  H A
NaeI                                   GCCGGC  A G      CRSG R LPHQR    LSXWPQRMTKVAEG  P A
NheI                                   GCTAGC  A S      CRSG X LPHQR    LSXWPQRMTKVAEG  L A
BamFI/BamHI/BamKI/BamNI/BstI/Bst1503I  GGATCC  G S      WRG  I LPHQR    LSXWPQRMTKVAEG  D P
NarI                                   GGCGCC  G A      WRG  R LPHQR    LSXWPQRMTKVAEG  A P
ApaI                                   GGGCCC  G P      WRG  A LPHQR    LSXWPQRMTKVAEG  G P
Asp78I/KpnI                            GGTACC  G T      WRG  Y LPHQR    LSXWPQRMTKVAEG  V P
SalI                                   GTCGAC  V D      CRSG R LPHQR    LSXWPQRMTKVAEG  S T
ApaLI                                  GTGCAC  V H      CRSG A LPHQR    LSXWPQRMTKVAEG  C T
HpaI                                   GTTAAC  V N      CRSG X LPHQR    LSXWPQRMTKVAEG  L T
AccIII/BspMII                          TCCGGA  S G      FLIV R IMTNKSR  FSYCLPHRITNVADG P DE
NruI/Sbo13I                            TCGCGA  S R      FLIV A IMTNKSR  FSYCLPHRITNVADG R DE
XbaI                                   TCTAGA  S R      FLIV X IMTNKSR  FSYCLPHRITNVADG L DE
AtuCI/BclI/BstGI/CpeI                  TGATCA  X S      LMV  I IMTNKSR  FSYCLPHRITNVADG D HQ
BalI                                   TGGCCA  W P      LMV  A IMTNKSR  FSYCLPHRITNVADG G HQ
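
The matching step that SILMUT is described as performing, finding an amino acid motif from the table in an input amino acid sequence, can be sketched in C as follows. This is a minimal illustration and not the code in Silmut.c; the function name find_motif and its arguments are hypothetical. Each motif position is given as a string of allowed residues, as printed in the table (X denotes a stop codon).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if residue aa is one of the letters in set, else 0. */
static int in_set(const char *set, char aa)
{
    return aa != '\0' && strchr(set, aa) != NULL;
}

/* Search an amino acid sequence for the first position, at or after
 * start, where every residue falls in the corresponding motif set.
 * sets[k] lists the residues allowed at motif position k.
 * Returns the 0-based position of the match, or -1 if there is none. */
int find_motif(const char *protein, const char *const sets[],
               size_t nsets, size_t start)
{
    assert(protein != NULL && sets != NULL);

    size_t len = strlen(protein);
    if (nsets == 0 || nsets > len)
        return -1;

    for (size_t i = start; i + nsets <= len; i++) {
        size_t k = 0;
        while (k < nsets && in_set(sets[k], protein[i + k]))
            k++;
        if (k == nsets)
            return (int)i;  /* all motif positions matched */
    }
    return -1;
}
```

For example, with the third-reading-frame entry for Cfr9I/SmaI/XmaI (FSYCLPHRITNVADG, P, G), find_motif("GPGR", sets, 3, 0) returns 0, since G, P and G at positions 0 to 2 fit the motif, whereas the first-reading-frame entry for HindIII (K, L) gives -1. A complete tool would additionally report which bases must change in the nucleic acid sequence; this sketch covers only the amino acid comparison.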