From: MA11@PHOENIX.CAMBRIDGE.AC.UK
To: gilbertd@IUBACS
Date: Sat, 17 Feb 90 09:54:25 GMT

Here is the latest version. In fact it may be the final version, as I don't see it changing very much.

Michael Ashburner


Drosophila Codon Tables

Version 8.2
November 15 1989

Michael Ashburner,
Department of Genetics,
University of Cambridge,
Cambridge, England.

Telephone: 44-(0)223-333969
Electronic mail: ma11@uk.ac.cam.phx

These Tables are supplied with the understanding that they can be freely used for research, although if quoted in any publication a suitable acknowledgement (e.g. Michael Ashburner, personal communication) would be appreciated.

I will automatically post new versions on the BIOSCI Bulletin Board. These will generally be compiled whenever enough new data warrant the work. I am very happy to include new sequences that have not yet made the Sequence Data Banks, if these can be sent to me by electronic mail with sufficient data for the coding sequences to be extracted. If anyone should need the files of coding sequences that have been used to generate these Tables, please send me a message.

Two series of Tables are included, one for "host" genes and one for ORFs carried by transposable elements. For each series you have a codon table, a base composition and the names of the sequences used to compile these. By and large these sequences are taken from the EMBL, GENBANK or DAYHOFF Libraries. However, some have been privately communicated to me. All sequences have been checked to confirm that they translate, but many are incomplete. Hence, for example, the number of sequences is greater than the number of TER codons.
The latest versions of the databanks used are EMBL V20.0 and GENBANK V61.0. The "host" gene coding sequences are from a total of 687.482 kb of sequenced DNA.

//
Table 1A: Base composition of "host" genes:

T=65551  C=93013  Y=0  Pyrimidine=158564
A=79558  G=91689  R=0  Purine=171247
N=9
Nucleotides=329820  Deletions=0  Characters=329820

//
Table 1B: Codons of "host" genes:

TTT 1040  TCT  687  TAT 1061  TGT  633
TTC 2639  TCC 2287  TAC 2262  TGC 1744
TTA  345  TCA  649  TAA  102  TGA   40
TTG 1525  TCG 1986  TAG   57  TGG 1069

CTT  749  CCT  712  CAT 1218  CGT 1083
CTC 1360  CCC 2288  CAC 1966  CGC 1963
CTA  682  CCA 1358  CAA 1451  CGA  752
CTG 4204  CCG 1876  CAG 4362  CGG  751

ATT 1605  ACT  862  AAT 2189  AGT 1070
ATC 2860  ACC 2727  AAC 3067  AGC 2160
ATA  659  ACA  922  AAA 1310  AGA  450
ATG 2715  ACG 1450  AAG 4495  AGG  632

GTT 1123  GCT 1660  GAT 2979  GGT 1986
GTC 1683  GCC 4362  GAC 2692  GGC 3635
GTA  513  GCA 1200  GAA 1772  GGA 2310
GTG 3066  GCG 1515  GAG 4886  GGG  479

Total=109935

//
Table 1C: "Host" gene sequences used for Tables 1A and 1B

The numbers after the names indicate the number of codons (including the N-terminal Met); if this number is bracketed then the coding sequence is incomplete; if the number of codons is followed by a '.' then the terminator is included.

[EMBL/GENBANK Accession numbers]

M26267; 67B gene 1, 239.
X07311; 67B gene 2, 112.
X06542; 67B gene 3, 170.
M14643; alpha-tubulin-1, 452.
M14644; alpha-tubulin-2, 451.
M14645; alpha-tubulin-3, 451.
M14646; alpha-tubulin-4, 463.
M20419; beta-tubulin-1, 448.
M16922; beta-tubulin-2, 447.
M16923; beta-tubulin-3, 455.
X16134; Abdominal-B-M (Abdb-B), 492.
X13168; Abdominal-B-r (Abdb-B-r), 528.
X05893; acetyl cholinesterase, 650.
M17120; achaete, 202.
K00667-K00669; actin 5C, 377.
K00670;K00671; actin 42A, 377.
J01064; actin 79B, 377.
K00674;K00675; actin 87E, 377.
J01065; actin 88F, 377.
Z00030; alcohol dehydrogenase, 257.
Z00030;Kreitman; 3' orf to Adh, 273.
X04569; amylase-1, 495.
X03788-X03791; Antp, 379.
M18432; Aprt, 184.
X12550; asense, 397.
X14476; ATP-ase-alpha-subunit, 1029.
X13107;Y00226; awd, 154.
X07870; bicoid, 495.
X04896; bsg25D, 742.
M20630; bw, 676.
M14131; C1A9 nuclear protein, 162.
M19690-92;M18402; c-abl, 1521.
X05939; c-myb (13E), 698.
X07181; c-raf, 667.
K01960; c-ras1 (85D), 190.
M10759;M10803;M10804; c-ras2 (64B), 196.
X02200; c-ras3 (62B), 183.
M11917; c-src (64B), 553.
M16599; c-src4 (28C), 591.
X05948-X05951; calmodulin, 150.
M18655; cAMP-dependent-protein-kinase-catalytic (Dpck), 354.
M18656; cGMP-dependent-protein-kinase-catalytic (Dg1), [473]
M16534;J03452; casein-hydrolase-alpha-chain, 337.
M16534;J03452; casein-hydrolase-beta-chain, 216.
M21069;M21070; caudal, 473.
M19008-M19017; chaoptin, 1135.
M13219; choline acetyl transferase, [729.]
X02947; chorion gene s15-1, 116.
X02497; chorion gene s18-1, 273.
X02947; chorion gene s19-1, 374.
X05245; chorion gene s36, 287.
X05245; chorion gene s38, 307.
V00200; collagen-like gene fragments, [469]
J02727; collagen-IV, [712.]
X05144; crumbs (EGF-like at 95F), [293]
X07985; cut, 2176.
X01761; cytochrome c gene DC3, 106.
X01760; cytochrome c gene DC4, 109.
J13148; daughterless, 711.
X05136; Deformed, 591.
X06289; Delta, 881.
X04426; dopa decarboxylase, 512.
M23702; dorsal, 678.
J03957; D-cholecystokinin-like (Dsk), 129.
M14978-14982; dunce, 363.
X04521; eip28/29, 256.
X04024; eip40, 394.
X15087; eip74EF, 884.
X15586; eip75B, 1444.
X15657; element-binding-factor-1, 1064.
X06869; elongation factor-1, alpha F1 (48D), 464.
X06870; elongation factor-1, alpha F2, 464.
X15805; elongation factor-2, 845.
M10017; engrailed, 553.
M20571; E(spl), 720.
M15961; esterase-6, 549.
X05138; even-skipped, 377.
M20545; fasciclin I, 653.
J03232; FMRF-amide, 343.
M18281; follicle cell protein @ 3C, 211.
J03177; fork-head, 511.
X14153; fs(1)K10, 464.
M23221; fsh, 2039.
X00854;K01951; fushi tarazu, 414.
M11254; Gapdh-1, 333.
M11255; Gapdh-2, 333.
J02932; Glued, 1320.
M22567;J04083; G-protein beta subunit, 341.
J02527;K02461; glycinamide ribotide transformylase (GART), 1354.
J04567; Gpdh, 352.
J01085; heat shock cognate 70C [exon 1], [68]
K01296;K01297; heat shock cognate 87D [exons 1 & 2], [70]
J02569; heat shock cognate 88E, [104]
X04073; Histone H1, 257.
Dayhoff; Histone H2A, [122]
X07485; Histone H2A variant, 142.
Dayhoff; Histone H2B, [118]
Dayhoff; Histone H3, [122]
Dayhoff; Histone H4, [72]
M21329; HMG-coenzyme A reductase, 917.
Y00843; homoeobox protein H2.0, 411.
V00209; hsp22, 175.
V00210; hsp23, 187.
V00211; hsp26, 209.
V00212; hsp27, 214.
V00213;V00214; hsp70 [87A], [345.]
J01104;J01105; hsp70 [87C], 642.
X03810; hsp82, 718.
Y00274; hunchback, 759.
M13568; Insulin-like receptor protein-1 (Dir-b), [1096.]
M14778; Insulin-like receptor protein-2 (Dir-a), [300]
X05273; invected, 577.
X13331; knirps, 430.
X14153; knirps-related, 648.
X03414; Kruppel, 467.
X04227; l(2)37Cc, 327.
X05991; l(2)37Cs, 246.
X04695; l(2)amd, 511.
X05426; l(2)gl, 1161.
M13014;X12834; labial, 636.
X07278; lamin, 622.
M19525; laminin B1, 1788.
X07802; laminin B2, [1297.]
V00202; larval cuticle protein-1 [44D], 131.
V00203; larval cuticle protein-2 [44D], 127.
V00203; larval cuticle protein-3 [44D], 112.
V00204; larval visceral protein-D [44D], 509.
V00204; larval visceral protein-H [44D], 522.
V00204; larval visceral protein-L [44D], 506.
X12549; lethal-of-scute, 258.
X03872; LSP1-alpha, [70]
X03873; LSP1-beta, [100]
X03874; LSP1-gamma, [105]
X03758; metallothionein-A (MtnA), 41.
M16250; metallothionein-B (MtnB), 44.
Y00795; mp20, 184.
Y00219; mst355a, 265.
Y00831; mst(3)gl-9 sperm protein, 57.
J02788; myosin-heavy chain, 270.
M10125; myosin-alkali-light chain, 156.
M11947; myosin-light-chain-2, 223.
J03251; myospheroid, 847.
X04016; nicotinic acetylcholine receptor (Ard), 522.
X07194; nicotinic acetylcholine receptor, alpha subunit (AcrB), 568.
M20230; ninaC, 1502.
J03138; norpA, 1096.
M11664; Notch, 2704.
K02315; opsin, ninaE, 374.
M12896; opsin, Rh2, 381.
M17718; opsin, Rh3, 384.
M17719;M17730; opsin, Rh4, 379.
X13693; otu, 812.
M14548; paired, 614.
M24285; para, [1821]
M21201; paragonial peptide (PapB), 56.
M25662; pecanex, [1929.]
M15762; pen#9b, 366.
M11969; period, 1128.
Y00402; Phosphoenolpyruvate carboxykinase, 648.
M14548; paired, 614.
X05076;Y00042; protein kinase C, 640.
Y07510; protein phosphatase (pp55A), 315.
M19059; PS2 antigen, 1395.
J02527;K02461; pupal cuticle protein (Gart), 185.
Y00504; ribosomal protein rp21C, 113.
X14247; ribosomal protein rpS31, 115.
X00848; ribosomal protein rp49, 134.
X05016; ribosomal protein rpA1, 114.
X13382; ribosomal protein rpL1, 408.
M21045; ribosomal protein S14A, 152.
M21045; ribosomal protein S14B, 152.
X05709; RNA polymerase II-140, 1124.
M11798; RNA polymerase II-215, [470]
M19537; RNA polymerase II-215, [409.]
Y00308; rosy, 1336.
X04813; rudimentary, 2357.
M17119; scute T4, 346.
X03121; serendipity-alpha, 531.
X03121; serendipity-beta, 352.
X03121; serendipity-delta, 431.
J03158; sevenless, 2555.
X01918; Sgs3, 308.
J01135;J01136; Sgs4, [141]
X04269; Sgs5, 164.
X01918; Sgs7, 75.
X01918; Sgs8, 76.
X07131;Y00847; Shaker, 617.
M19020; single minded, 656.
Y00288; snail, 391.
X04513; snake, 431.
Y00228; su(Hw), 945.
Y00367; superoxide dismutase, 154.
M21159; Tcp1, 558.
M19140; ter, 429.
M19494; tko, 141.
J02682; Toll, 1098.
M17478; tra, 198.
M23633; tra-2, 180.
K03277; tropomyosin I, T-isoform, [198]
M15466; tropomyosin II, 286.
M18635; trp, 265.
X02989; trypsin-like enzyme, alpha-chain, 257.
X14569; twist, 491.
X05723;Y00206; Ubx, 390.
X12945;X12946; vasa, 661.
X01802; vitelline membrane protein (Vm34C.1), [96]
M18280; vitelline membrane protein (Vm26A.2), 142.
X02974; white, 697.
M17230; wingless, 469.
Chia; yellow, 542.
V00248; yolk protein-1, 441.
J01157; yolk protein-2, 460.
M15898; yolk protein-3, 421.
Y00049; zeste, 576.
X07450; zipper, 501.
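
The "host" gene figures in Tables 1A and 1B can be cross-checked against each other. The following is a minimal Python sketch, not part of the original Tables: the counts are copied from Tables 1A and 1B, and the names `parse_codon_table`, `host`, `host_ter`, `host_bases` and `host_gc` are illustrative assumptions. It sums the codon counts, counts the TER codons (TAA, TAG, TGA) and derives the G+C fraction of the coding sequences.

```python
# Counts copied from Table 1B ("host" genes).
HOST_CODONS = """
TTT 1040 TCT  687 TAT 1061 TGT  633
TTC 2639 TCC 2287 TAC 2262 TGC 1744
TTA  345 TCA  649 TAA  102 TGA   40
TTG 1525 TCG 1986 TAG   57 TGG 1069
CTT  749 CCT  712 CAT 1218 CGT 1083
CTC 1360 CCC 2288 CAC 1966 CGC 1963
CTA  682 CCA 1358 CAA 1451 CGA  752
CTG 4204 CCG 1876 CAG 4362 CGG  751
ATT 1605 ACT  862 AAT 2189 AGT 1070
ATC 2860 ACC 2727 AAC 3067 AGC 2160
ATA  659 ACA  922 AAA 1310 AGA  450
ATG 2715 ACG 1450 AAG 4495 AGG  632
GTT 1123 GCT 1660 GAT 2979 GGT 1986
GTC 1683 GCC 4362 GAC 2692 GGC 3635
GTA  513 GCA 1200 GAA 1772 GGA 2310
GTG 3066 GCG 1515 GAG 4886 GGG  479
"""


def parse_codon_table(text):
    """Return {codon: count} from whitespace-separated 'CODON COUNT' pairs."""
    tokens = text.split()
    return {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}


host = parse_codon_table(HOST_CODONS)
host_total = sum(host.values())
host_ter = sum(host[codon] for codon in ("TAA", "TAG", "TGA"))

# Base counts copied from Table 1A.
host_bases = {"T": 65551, "C": 93013, "A": 79558, "G": 91689, "N": 9}
host_gc = (host_bases["G"] + host_bases["C"]) / sum(host_bases.values())

print("codons:", host_total)      # matches Total=109935 in Table 1B
print("TER codons:", host_ter)    # 102 + 57 + 40 = 199
print("G+C fraction: %.3f" % host_gc)
```

The TER count is smaller than the number of sequences listed in Table 1C, as expected given that many of the coding sequences are incomplete.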
//
Table 2A: Codon table TE genes:

TTT  439  TCT  176  TAT  314  TGT  143
TTC  266  TCC  163  TAC  295  TGC  150
TTA  407  TCA  252  TAA   10  TGA    2
TTG  288  TCG  103  TAG    2  TGG  166

CTT  271  CCT  142  CAT  255  CGT  100
CTC  164  CCC  140  CAC  228  CGC   83
CTA  256  CCA  341  CAA  512  CGA  137
CTG  172  CCG   89  CAG  246  CGG   46

ATT  543  ACT  261  AAT  720  AGT  228
ATC  251  ACC  219  AAC  490  AGC  205
ATA  505  ACA  450  AAA 1047  AGA  326
ATG  269  ACG  114  AAG  418  AGG  135

GTT  242  GCT  245  GAT  414  GGT  175
GTC  158  GCC  197  GAC  424  GGC  165
GTA  249  GCA  312  GAA  696  GGA  205
GTG  188  GCG  100  GAG  350  GGG   75

Total=16734

//
Table 2B: Base composition TE genes:

T=12512  C=10084  Y=0  Pyrimidine=22596
A=18309  G=9297  R=0  Purine=27606
N=0
Nucleotides=50202  Deletions=0  Characters=50202

//
Table 2C: TE genes used for Tables 2A and 2B:

[EMBL/GENBANK Accession numbers]

X01472; 17.6 element, 1975
X03431; 297 element, 1944
X07656; 1731 element, 273 & 982
X04132;X03733; 412 element, 128, 104, 455. & 1237
X02599; copia element [Saigo], 1410
M17214; F-element, 123. & 860.
M12927; gypsy, 452., 1036. & 510.
X01748; HB1, 149.
X04705; hobo, 644
M14954; I element, 430. & 1087.
M14653; mariner (mauritiana), 345.
O'Hare; P element, 792.
X02600; virus like particle RNA (VLP H-RNA), 1289 & 146
Savakis; minos element (hydei), 362.
//
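
One quantity that can be derived from a codon table such as Table 2A is the fraction of codons ending in G or C. The Tables themselves do not report this figure; the Python sketch below is only an illustration under that assumption, with the counts copied from Table 2A and the names `parse_codon_table`, `te`, `te_ter` and `te_gc3` chosen for the example. TER codons are included in the denominator.

```python
# Counts copied from Table 2A (ORFs carried by transposable elements).
TE_CODONS = """
TTT  439 TCT  176 TAT  314 TGT  143
TTC  266 TCC  163 TAC  295 TGC  150
TTA  407 TCA  252 TAA   10 TGA    2
TTG  288 TCG  103 TAG    2 TGG  166
CTT  271 CCT  142 CAT  255 CGT  100
CTC  164 CCC  140 CAC  228 CGC   83
CTA  256 CCA  341 CAA  512 CGA  137
CTG  172 CCG   89 CAG  246 CGG   46
ATT  543 ACT  261 AAT  720 AGT  228
ATC  251 ACC  219 AAC  490 AGC  205
ATA  505 ACA  450 AAA 1047 AGA  326
ATG  269 ACG  114 AAG  418 AGG  135
GTT  242 GCT  245 GAT  414 GGT  175
GTC  158 GCC  197 GAC  424 GGC  165
GTA  249 GCA  312 GAA  696 GGA  205
GTG  188 GCG  100 GAG  350 GGG   75
"""


def parse_codon_table(text):
    """Return {codon: count} from whitespace-separated 'CODON COUNT' pairs."""
    tokens = text.split()
    return {tokens[i]: int(tokens[i + 1]) for i in range(0, len(tokens), 2)}


te = parse_codon_table(TE_CODONS)
te_total = sum(te.values())
te_ter = sum(te[codon] for codon in ("TAA", "TAG", "TGA"))

# Fraction of all codons whose third base is G or C.
te_gc3 = sum(n for codon, n in te.items() if codon[2] in "GC") / te_total

print("codons:", te_total)        # matches Total=16734 in Table 2A
print("TER codons:", te_ter)      # 10 + 2 + 2 = 14
print("third-position G+C fraction: %.3f" % te_gc3)
```

The same function can be applied to the counts of Table 1B to compare the "host" genes with the transposable element ORFs.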