Title of Invention

A METHOD PRODUCING EPOTHILONE BY BIOSYNTHESIS

Abstract Nucleic acid molecules are isolated from Sorangium cellulosum that encode polypeptides necessary for the biosynthesis of epothilones. Disclosed are methods for the production of epothilones in recombinant hosts transformed with the genes of the invention. In this manner, epothilones can be produced in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer.
Full Text GENES FOR THE BIOSYNTHESIS OF EPOTHILONES
FIELD OF THE INVENTION
The present invention relates generally to polyketides and genes for their synthesis. In particular, the present invention relates to the isolation and characterization of novel polyketide synthase and nonribosomal peptide synthetase genes from Sorangium cellulosum that are necessary for the biosynthesis of epothilones A and B.
BACKGROUND OF THE INVENTION
Polyketides are compounds synthesized from two-carbon building blocks, the β-carbon of which always carries a keto group, thus the name polyketide. These compounds include many important antibiotics, immunosuppressants, cancer chemotherapeutic agents, and other compounds possessing a broad range of biological properties. The tremendous structural diversity derives from the different lengths of the polyketide chain, the different side-chains introduced (either as part of the two-carbon building blocks or after the polyketide backbone is formed), and the stereochemistry of such groups. The keto groups may also be reduced to hydroxyls or enoyls, or removed altogether. Each round of two-carbon addition is carried out by a complex of enzymes called the polyketide synthase (PKS) in a manner similar to fatty acid biosynthesis.
The biosynthetic genes for an increasing number of polyketides have been isolated and sequenced. For example, see U.S. Patent Nos. 5,639,949, 5,693,774, and 5,716,849, all of which are incorporated herein by reference, which describe genes for the biosynthesis of soraphen. See also Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998) and WO 98/07868, which describe genes for the biosynthesis of rifamycin, and U.S. Patent No. 5,876,991, which describes genes for the biosynthesis of tylactone, all of which are incorporated herein by reference. The encoded proteins generally fall into two types: type I and type II. Type I proteins are polyfunctional, with several catalytic domains carrying out different enzymatic steps covalently linked together (e.g. PKS for erythromycin, soraphen, rifamycin, and avermectin; MacNeil et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C., pp. 245-256 (1993)), whereas type II proteins are monofunctional (Hutchinson et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C., pp. 203-216 (1993)).
For the simpler polyketides such as actinorhodin (produced by Streptomyces coelicolor), the several rounds of two-carbon additions are carried out iteratively on PKS enzymes encoded by one set of PKS genes. In contrast, synthesis of the more complicated compounds such as erythromycin and soraphen involves PKS enzymes that are organized into modules, whereby each module carries out one round of two-carbon addition (for review, see Hopwood et al., in Industrial Microorganisms: Basic and Applied Molecular Genetics, (ed.: Baltz et al.), American Society for Microbiology, Washington D.C., pp. 267-275 (1993)).
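The modular logic described above can be illustrated with a minimal Python sketch. It assumes the standard naming of the reductive domains, ketoreductase (KR), dehydratase (DH), and enoyl reductase (ER); KR and DH are not introduced in the surrounding text and are used here only for illustration. The three-module synthase in the example is hypothetical and is not the epothilone cluster.

```python
# Maps the set of reductive domains present in one PKS module to the
# state of the beta-carbon after that module's round of chain extension.
# KR and DH are standard PKS domain names assumed for this sketch; the
# text names only the outcomes (hydroxyl, enoyl, removed altogether).
REDUCTION_STATES = {
    frozenset(): "keto",
    frozenset({"KR"}): "hydroxyl",
    frozenset({"KR", "DH"}): "enoyl",
    frozenset({"KR", "DH", "ER"}): "fully reduced",
}


def beta_carbon_state(reductive_domains):
    """Return the beta-carbon state produced by one module."""
    key = frozenset(reductive_domains)
    if key not in REDUCTION_STATES:
        raise ValueError(f"unsupported domain combination: {sorted(key)}")
    return REDUCTION_STATES[key]


def run_modular_pks(modules):
    """One entry per module: each module adds one two-carbon unit."""
    return [beta_carbon_state(module) for module in modules]


# Hypothetical three-module synthase, for illustration only.
example_modules = [["KR"], [], ["KR", "DH", "ER"]]
print(run_modular_pks(example_modules))
```

In a modular synthase the length of the list corresponds to the number of extension rounds, whereas an iterative synthase would reuse a single set of domains for every round.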
Complex polyketides and secondary metabolites in general may contain substructures that are derived from amino acids instead of simple carboxylic acids. Incorporation of these building blocks is accomplished by non-ribosomal polypeptide synthetases (NRPSs). NRPSs are multienzymes that are organized in modules. Each module is responsible for the addition (and the additional processing, if required) of one amino acid building block. NRPSs activate amino acids by forming aminoacyl-adenylates, and capture the activated amino acids on thiol groups of phosphopantetheinyl prosthetic groups on peptidyl carrier protein domains. Further, NRPSs modify the amino acids by epimerization, N-methylation, or cyclization if necessary, and catalyse the formation of peptide bonds between the enzyme-bound amino acids. NRPSs are responsible for the biosynthesis of peptide secondary metabolites like cyclosporin, can provide polyketide chain terminator units as in rapamycin, or form mixed systems with PKSs as in yersiniabactin biosynthesis.
Epothilones A and B are 16-membered macrocyclic polyketides with an acylcysteine-derived starter unit that are produced by the bacterium Sorangium cellulosum strain So ce90 (Gerth et al., J. Antibiotics 49: 560-563 (1996), incorporated herein by reference). The structure of epothilone A and B wherein R signifies hydrogen (epothilone A) or methyl (epothilone B) is:


The epothilones have a narrow antifungal spectrum and especially show a high cytotoxicity in animal cell cultures (see Hofle et al., Patent DE 4138042 (1993), incorporated herein by reference). Of significant importance, epothilones mimic the biological effects of taxol, both in vivo and in cultured cells (Bollag et al., Cancer Research 55: 2325-2333 (1995), incorporated herein by reference). Taxol and taxotere, which stabilize cellular microtubules, are cancer chemotherapeutic agents with significant activity against various human solid tumors (Rowinsky et al., J. Natl. Cancer Inst. 83: 1778-1781 (1991)). Competition studies have revealed that epothilones act as competitive inhibitors of taxol binding to microtubules, consistent with the interpretation that they share the same microtubule-binding site and possess a microtubule affinity similar to that of taxol. However, epothilones enjoy a significant advantage over taxol in that epothilones exhibit a much lower drop in potency compared to taxol against a multiple drug-resistant cell line (Bollag et al. (1995)). Furthermore, epothilones are considerably less efficiently exported from the cells by P-glycoprotein than is taxol (Gerth et al. (1996)). In addition, several epothilone analogs have been synthesized that have a superior cytotoxic activity as compared to epothilone A or epothilone B as demonstrated by their enhanced ability to induce the polymerization and stabilization of microtubules (WO 98/25929, incorporated herein by reference).
Despite the promise shown by the epothilones as anticancer agents, problems pertaining to the production of these compounds presently limit their commercial potential. The compounds are too complex for industrial-scale chemical synthesis and so must be produced by fermentation. Techniques for the genetic manipulation of myxobacteria such as Sorangium cellulosum are described in U.S. Patent No. 5,686,295, incorporated herein by reference. However, Sorangium cellulosum is notoriously difficult to ferment and production levels of epothilones are therefore low. Recombinant production of epothilones in heterologous hosts that are more amenable to fermentation could solve current production problems. However, the genes that encode the polypeptides responsible for epothilone biosynthesis have heretofore not been isolated. Furthermore, the strain that produces epothilones, i.e. So ce90, also produces at least one additional polyketide, spirangien, which would be expected to greatly complicate the isolation of the genes particularly responsible for epothilone biosynthesis.
Therefore, in view of the foregoing, one object of the present invention is to isolate the genes that are involved in the synthesis of epothilones, particularly the genes that are involved in the synthesis of epothilones A and B in myxobacteria of the Sorangium/Polyangium group, i.e., Sorangium cellulosum strain So ce90. A further object of the invention is to provide a method for the recombinant production of epothilones for application in anticancer formulations.
SUMMARY OF THE INVENTION
In furtherance of the aforementioned and other objects, the present invention unexpectedly overcomes the difficulties set forth above to provide for the first time a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone. In a preferred embodiment, the nucleotide sequence is isolated from a species belonging to Myxobacteria, most preferably Sorangium cellulosum.
In another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
In a more preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
In yet another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
In an especially preferred embodiment, the present invention provides a nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.

In yet another preferred embodiment, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of an epothilone, wherein said nucleotide sequence comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
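The "consecutive 20 base pair portion identical in sequence" criterion above amounts to asking whether two sequences share at least one identical window of a given length. A minimal Python sketch of that test follows. The function name and the toy sequences are hypothetical and stand in for a candidate sequence and a region of SEQ ID NO:1; the sketch compares the strands as given and does not consider the reverse complement.

```python
def shares_identical_run(candidate, reference, length=20):
    """Return True if candidate and reference share an identical
    consecutive stretch of at least `length` bases (same strand only)."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < length or len(reference) < length:
        return False
    # Collect every window of the requested length from the reference.
    reference_windows = {
        reference[i:i + length]
        for i in range(len(reference) - length + 1)
    }
    # Slide the same-size window along the candidate and look for a hit.
    return any(
        candidate[i:i + length] in reference_windows
        for i in range(len(candidate) - length + 1)
    )


# Toy data, invented for illustration (not taken from SEQ ID NO:1).
toy_reference = "ATGGCGTCCGGTACCGATCGATTTGCAAGCTTGGCC"
toy_candidate = "TTTT" + toy_reference[5:25] + "AAAA"
print(shares_identical_run(toy_candidate, toy_reference, length=20))
```

Passing `length=25`, `30`, and so on covers the longer portions named in the same embodiment; a shared run of a given length always implies a shared run of every shorter length.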
The present invention also provides a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention. Further, the present invention provides a recombinant vector comprising such a chimeric gene, wherein the vector is capable of being stably transformed into a host cell. Still further, the present invention provides a recombinant host cell comprising such a chimeric gene, wherein the host cell is capable of expressing the nucleotide sequence that encodes at least one polypeptide necessary for the biosynthesis of an epothilone. In a preferred embodiment, the recombinant host cell is a bacterium belonging to the order Actinomycetales, and in a more preferred embodiment the recombinant host cell is a strain of Streptomyces. In other embodiments, the recombinant host cell is any other bacterium amenable to fermentation, such as a pseudomonad or E. coli. Even further, the present invention provides a Bac clone comprising a nucleic acid molecule of the invention, preferably Bac clone pEPO15.
In another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes an epothilone synthase domain.
According to one embodiment, the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. According to this embodiment, said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7. According to this embodiment, said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
According to still another embodiment, the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. According to this embodiment, said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. According to this embodiment, said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. According to this embodiment, said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
According to yet another embodiment, the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. According to this embodiment, said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.

According to an additional embodiment, the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6. According to this embodiment, said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 51534-52657 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is nucleotides 51534-52657 of SEQ ID NO:1.
According to another embodiment, the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7. According to this embodiment, said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of nucleotides 61427-62254 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is nucleotides 61427-62254 of SEQ ID NO:1.
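The "consecutive 20, 25, 30, 35, 40, 45, or 50 base pair portion identical in sequence" criterion recited in the domain embodiments can be illustrated with a short sketch. The function name and the toy sequences below are hypothetical and are not taken from SEQ ID NO:1. The sketch assumes that only one strand is compared and that sequences are plain strings of A, C, G, and T; it is one possible reading of the criterion, not a definitive test.

```python
def shares_identical_window(candidate: str, reference: str, k: int = 20) -> bool:
    """Return True if `candidate` contains k consecutive bases that are
    identical to some k-base window of `reference` (same strand only)."""
    candidate = candidate.upper()
    reference = reference.upper()
    if k <= 0 or len(candidate) < k or len(reference) < k:
        return False
    # Collect every k-base window of the reference once, then probe it.
    reference_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(
        candidate[i:i + k] in reference_windows
        for i in range(len(candidate) - k + 1)
    )


# Toy sequences for illustration only (assumption: not real domain sequences).
toy_reference = "ACGT" * 10
toy_candidate = "TTTT" + toy_reference[5:25] + "GGGG"
print(shares_identical_window(toy_candidate, toy_reference, k=20))
```

A larger window length such as 50 is a stricter requirement than the preferred length of 20, since any shared 50-base window necessarily contains shared 20-base windows.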
In still another aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-ribosomal peptide synthetase, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3. According to this embodiment, said non-ribosomal peptide synthetase preferably comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3. Also, according to this embodiment, said nucleotide sequence preferably is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1. According to this embodiment, said nucleotide sequence more preferably comprises a consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair nucleotide portion identical in sequence to a respective consecutive 20, 25, 30, 35, 40, 45, or 50 (preferably 20) base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1. In addition, according to this embodiment, said nucleotide sequence most preferably is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
The present invention further provides an isolated nucleic acid molecule comprising a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:2-23.
In accordance with another aspect, the present invention also provides methods for the recombinant production of polyketides such as epothilones in quantities large enough to enable their purification and use in pharmaceutical formulations such as those for the treatment of cancer. A specific advantage of these production methods is the chirality of the molecules produced; production in transgenic organisms avoids the generation of populations of racemic mixtures, within which some enantiomers may have reduced activity. In particular, the present invention provides a method for heterologous expression of epothilone in a recombinant host, comprising: (a) introducing into a host a chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule of the invention that comprises a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone; and (b) growing the host in conditions that allow biosynthesis of epothilone in the host. The present invention also provides a method for producing epothilone, comprising: (a) expressing epothilone in a recombinant host by the aforementioned method; and (b) extracting epothilone from the recombinant host.
According to still another aspect, the present invention provides an isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
According to one embodiment, the epothilone synthase domain is a β-ketoacyl-synthase (KS) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7. According to this embodiment, said KS domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is an acyltransferase (AT) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7. According to this embodiment, said AT domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
According to still another embodiment, the epothilone synthase domain is an enoyl reductase (ER) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7. According to this embodiment, said ER domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is an acyl carrier protein (ACP) domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7. According to this embodiment, said ACP domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
According to another embodiment, the epothilone synthase domain is a dehydratase (DH) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7. According to this embodiment, said DH domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
According to yet another embodiment, the epothilone synthase domain is a β-ketoreductase (KR) domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7. According to this embodiment, said KR domain preferably comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
According to an additional embodiment, the epothilone synthase domain is a methyltransferase (MT) domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6. According to this embodiment, said MT domain preferably comprises amino acids 2671-3045 of SEQ ID NO:6.

According to another embodiment, the epothilone synthase domain is a thioesterase (TE) domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7. According to this embodiment, said TE domain preferably comprises amino acids 2165-2439 of SEQ ID NO:7.
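The domain boundaries are recited both as amino acid ranges within a protein sequence and as nucleotide ranges within SEQ ID NO:1. The relationship between the two can be sketched as follows. The function name is hypothetical, and the sketch assumes a forward-strand coding sequence without interruptions, 1-based inclusive coordinates, and a gene start taken from the sequence listing description (epoA begins at nucleotide 7610 of SEQ ID NO:1). It is offered as a consistency check only; a recited range that runs to the end of a gene may also include the stop codon and therefore differ slightly from the computed value.

```python
from typing import Tuple


def aa_range_to_nt_range(orf_start: int, aa_first: int, aa_last: int) -> Tuple[int, int]:
    """Map a 1-based inclusive amino acid range to the 1-based inclusive
    nucleotide range that encodes it, given the first nucleotide of the ORF."""
    if aa_first < 1 or aa_last < aa_first:
        raise ValueError("amino acid range must be 1-based and ascending")
    nt_first = orf_start + 3 * (aa_first - 1)   # first base of the first codon
    nt_last = orf_start + 3 * aa_last - 1       # last base of the last codon
    return nt_first, nt_last


EPOA_START = 7610  # start of epoA in SEQ ID NO:1, per the sequence listing description

# ER domain of SEQ ID NO:2 (amino acids 974-1273)
print(aa_range_to_nt_range(EPOA_START, 974, 1273))
# ACP domain of SEQ ID NO:2 (amino acids 1314-1385)
print(aa_range_to_nt_range(EPOA_START, 1314, 1385))
```

Under these assumptions the two calls reproduce the nucleotide ranges recited for the ER and ACP domains of epoA, namely 10529-11428 and 11549-11764 of SEQ ID NO:1.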
Other aspects and advantages of the present invention will become apparent to those skilled in the art from a study of the following description of the invention and non-limiting examples.
DEFINITIONS
In describing the present invention, the following terms will be employed, and are intended to be defined as indicated below.
Associated With / Operatively Linked: Refers to two DNA sequences that are related physically or functionally. For example, a promoter or regulatory DNA sequence is said to be "associated with" a DNA sequence that codes for an RNA or a protein if the two sequences are operatively linked, or situated such that the regulator DNA sequence will affect the expression level of the coding or structural DNA sequence.
Chimeric Gene: A recombinant DNA sequence in which a promoter or regulatory DNA sequence is operatively linked to, or associated with, a DNA sequence that codes for an mRNA or which is expressed as a protein, such that the regulator DNA sequence is able to regulate transcription or expression of the associated DNA sequence. The regulator DNA sequence of the chimeric gene is not normally operatively linked to the associated DNA sequence as found in nature.
Coding DNA Sequence: A DNA sequence that is translated in an organism to produce a protein.
Domain: That part of a polyketide synthase necessary for a given distinct activity. Examples include acyl carrier protein (ACP), β-ketosynthase (KS), acyltransferase (AT), β-ketoreductase (KR), dehydratase (DH), enoylreductase (ER), and thioesterase (TE) domains.
Epothilones: 16-membered macrocyclic polyketides naturally produced by the bacterium Sorangium cellulosum strain So ce90, which mimic the biological effects of taxol. In this application, "epothilone" refers to the class of polyketides that includes epothilone A and epothilone B, as well as analogs thereof such as those described in WO 98/25929.

Epothilone Synthase: A polyketide synthase responsible for the biosynthesis of epothilone.
Gene: A defined region that is located within a genome and that, besides the aforementioned coding DNA sequence, comprises other, primarily regulatory, DNA sequences responsible for the control of the expression, that is to say the transcription and translation, of the coding portion.
Heterologous DNA Sequence: A DNA sequence not naturally associated with a host cell into which it is introduced, including non-naturally occurring multiple copies of a naturally occurring DNA sequence.
Homologous DNA Sequence: A DNA sequence naturally associated with a host cell into which it is introduced.
Homologous Recombination: Reciprocal exchange of DNA fragments between homologous DNA molecules.
Isolated: In the context of the present invention, an isolated nucleic acid molecule or an isolated enzyme is a nucleic acid molecule or enzyme that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid molecule or enzyme may exist in a purified form or may exist in a non-native environment such as, for example, a recombinant host cell.
Module: A genetic element encoding all of the distinct activities required in a single round of polyketide biosynthesis, i.e., one condensation step and all the β-carbonyl processing steps associated therewith. Each module encodes an ACP, a KS, and an AT activity to accomplish the condensation portion of the biosynthesis, and selected post-condensation activities to effect the β-carbonyl processing.
NRPS: A non-ribosomal polypeptide synthetase, which is a complex of enzymatic activities responsible for the incorporation of amino acids into secondary metabolites including, for example, amino acid adenylation, epimerization, N-methylation, cyclization, peptidyl carrier protein, and condensation domains. A functional NRPS is one that catalyzes the incorporation of an amino acid into a secondary metabolite.
NRPS gene: One or more genes encoding NRPSs for producing functional secondary metabolites, e.g., epothilones A and B, when under the direction of one or more compatible control elements.

Nucleic Acid Molecule: A linear segment of single- or double-stranded DNA or RNA that can be isolated from any source. In the context of the present invention, the nucleic acid molecule is preferably a segment of DNA.
ORF: Open Reading Frame.
PKS: A polyketide synthase, which is a complex of enzymatic activities (domains) responsible for the biosynthesis of polyketides including, for example, ketoreductase, dehydratase, acyl carrier protein, enoylreductase, ketoacyl ACP synthase, and acyltransferase. A functional PKS is one that catalyzes the synthesis of a polyketide.
PKS Genes: One or more genes encoding various polypeptides required for producing functional polyketides, e.g., epothilones A and B, when under the direction of one or more compatible control elements.
Substantially Similar: With respect to nucleic acids, a nucleic acid molecule that has at least 60 percent sequence identity with a reference nucleic acid molecule. In a preferred embodiment, a substantially similar DNA sequence is at least 80% identical to a reference DNA sequence; in a more preferred embodiment, a substantially similar DNA sequence is at least 90% identical to a reference DNA sequence; and in a most preferred embodiment, a substantially similar DNA sequence is at least 95% identical to a reference DNA sequence. A substantially similar DNA sequence preferably encodes a protein or peptide having substantially the same activity as the protein or peptide encoded by the reference DNA sequence. A substantially similar nucleotide sequence typically hybridizes to a reference nucleic acid molecule, or fragments thereof, under the following conditions: hybridization at 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 pH 7.0, 1 mM EDTA at 50°C; wash with 2X SSC, 1% SDS, at 50°C. With respect to proteins or peptides, a substantially similar amino acid sequence is an amino acid sequence that is at least 90% identical to the amino acid sequence of a reference protein or peptide and has substantially the same activity as the reference protein or peptide.
Transformation: A process for introducing heterologous nucleic acid into a host cell or organism.
Transformed / Transgenic / Recombinant: Refers to a host organism such as a bacterium into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof. A "non-transformed", "non-transgenic", or "non-recombinant" host refers to a wild-type organism, i.e., a bacterium, which does not contain the heterologous nucleic acid molecule.
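The percent identity thresholds in the definition of "Substantially Similar" can be expressed as a small classification sketch. The function names are hypothetical. The definition does not state how identity is to be computed, so the sketch assumes two sequences that have already been aligned to equal length (with "-" marking gaps) and counts identical positions over the alignment length; any other alignment or scoring convention would give different percentages.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical, non-gap characters.
    Assumes the inputs are pre-aligned and of equal length."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def dna_similarity_tier(identity: float) -> str:
    """Map a DNA percent identity onto the tiers named in the definition."""
    if identity >= 95:
        return "most preferred"
    if identity >= 90:
        return "more preferred"
    if identity >= 80:
        return "preferred"
    if identity >= 60:
        return "substantially similar"
    return "not substantially similar"


def protein_is_substantially_similar(identity: float, same_activity: bool) -> bool:
    """Proteins need at least 90% identity and substantially the same activity."""
    return identity >= 90 and same_activity


# Toy aligned sequences for illustration only.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, dna_similarity_tier(identity))
```

The hybridization conditions given in the definition are an experimental criterion and are not modeled by this sketch.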
Nucleotides are indicated by their bases by the following standard abbreviations: adenine (A), cytosine (C), thymine (T), and guanine (G). Amino acids are likewise indicated by the following standard abbreviations: alanine (Ala; A), arginine (Arg; R), asparagine (Asn; N), aspartic acid (Asp; D), cysteine (Cys; C), glutamine (Gln; Q), glutamic acid (Glu; E), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V). Furthermore, (Xaa; X) represents any amino acid.
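For convenience, the amino acid abbreviations listed above can be held in a lookup table. The names `THREE_TO_ONE` and `to_one_letter` are hypothetical and serve only to illustrate conversion between the three-letter and one-letter notations.

```python
# Three-letter to one-letter amino acid codes, as listed in the text;
# "Xaa" / "X" stands for any amino acid.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes (any letter case) to a one-letter string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


print(to_one_letter(["Met", "Ala", "Xaa", "Lys"]))
```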
DESCRIPTION OF THE SEQUENCES IN THE SEQUENCE LISTING
SEQ ID NO:1 is the nucleotide sequence of a 68750 bp contig containing 22 open reading frames (ORFs), which comprises the epothilone biosynthesis genes.
SEQ ID NO:2 is the protein sequence of a type I polyketide synthase (EPOS A) encoded by epoA (nucleotides 7610-11875 of SEQ ID NO:1).
SEQ ID NO:3 is the protein sequence of a non-ribosomal peptide synthetase (EPOS P) encoded by epoP (nucleotides 11872-16104 of SEQ ID NO:1).
SEQ ID NO:4 is the protein sequence of a type I polyketide synthase (EPOS B) encoded by epoB (nucleotides 16251-21749 of SEQ ID NO:1).
SEQ ID NO:5 is the protein sequence of a type I polyketide synthase (EPOS C) encoded by epoC (nucleotides 21746-43519 of SEQ ID NO:1).
SEQ ID NO:6 is the protein sequence of a type I polyketide synthase (EPOS D) encoded by epoD (nucleotides 43524-54920 of SEQ ID NO:1).
SEQ ID NO:7 is the protein sequence of a type I polyketide synthase (EPOS E) encoded by epoE (nucleotides 54935-62254 of SEQ ID NO:1).
SEQ ID NO:8 is the protein sequence of a cytochrome P450 oxygenase homologue (EPOS F) encoded by epoF (nucleotides 62369-63628 of SEQ ID NO:1).
SEQ ID NO:9 is a partial protein sequence (partial Orf 1) encoded by orf1 (nucleotides 1-1826 of SEQ ID NO:1).
SEQ ID NO:10 is a protein sequence (Orf 2) encoded by orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:11 is a protein sequence (Orf 3) encoded by orf3 (nucleotides 3415-5556 of SEQ ID NO:1).
SEQ ID NO:12 is a protein sequence (Orf 4) encoded by orf4 (nucleotides 5992-5612 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:13 is a protein sequence (Orf 5) encoded by orf5 (nucleotides 6226-6675 of SEQ ID NO:1).
SEQ ID NO:14 is a protein sequence (Orf 6) encoded by orf6 (nucleotides 63779-64333 of SEQ ID NO:1).
SEQ ID NO:15 is a protein sequence (Orf 7) encoded by orf7 (nucleotides 64290-63853 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:16 is a protein sequence (Orf 8) encoded by orf8 (nucleotides 64363-64920 of SEQ ID NO:1).
SEQ ID NO:17 is a protein sequence (Orf 9) encoded by orf9 (nucleotides 64727-64287 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:18 is a protein sequence (Orf 10) encoded by orf10 (nucleotides 65063-65767 of SEQ ID NO:1).
SEQ ID NO:19 is a protein sequence (Orf 11) encoded by orf11 (nucleotides 65874-65008 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:20 is a protein sequence (Orf 12) encoded by orf12 (nucleotides 66338-65871 on the reverse complement strand of SEQ ID NO:1).
SEQ ID NO:21 is a protein sequence (Orf 13) encoded by orf13 (nucleotides 66667-67137 of SEQ ID NO:1).
SEQ ID NO:22 is a protein sequence (Orf 14) encoded by orf14 (nucleotides 67334-68251 of SEQ ID NO:1).
SEQ ID NO:23 is a partial protein sequence (partial Orf 15) encoded by orf15 (nucleotides 68346-68750 of SEQ ID NO:1).
SEQ ID NO:24 is the universal reverse PCR primer sequence.
SEQ ID NO:25 is the universal forward PCR primer sequence.
SEQ ID NO:26 is the NH24 end "B" PCR primer sequence.
SEQ ID NO:27 is the NH2 end "A" PCR primer sequence.
SEQ ID NO:28 is the NH2 end "B" PCR primer sequence.
SEQ ID NO:29 is the pEP015-NH6 end "B" PCR primer sequence.
SEQ ID NO:30 is the pEP015-H2.7 end "A" PCR primer sequence.
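Several open reading frames in the sequence listing are described by descending coordinates "on the reverse complement strand" (for example, nucleotides 3171-1900 of SEQ ID NO:1). The sketch below shows one way such coordinates might be resolved to a coding sequence. The function names and the toy contig are hypothetical. The sketch assumes that both numbers index the strand given in SEQ ID NO:1 with 1-based inclusive positions, and that a descending pair means the coding sequence is the reverse complement of that slice.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_feature(contig: str, first: int, second: int) -> str:
    """Extract a feature given 1-based inclusive coordinates.
    A descending pair (first > second) is read as a reverse-strand feature."""
    low, high = min(first, second), max(first, second)
    if low < 1 or high > len(contig):
        raise ValueError("coordinates fall outside the contig")
    segment = contig[low - 1:high]
    return reverse_complement(segment) if first > second else segment


# Toy contig for illustration only (assumption: not SEQ ID NO:1).
toy_contig = "AAACCCGGGTTT"
print(extract_feature(toy_contig, 4, 6))   # forward-strand feature
print(extract_feature(toy_contig, 6, 1))   # reverse-strand feature
```

With the actual contig loaded as a string, a call such as `extract_feature(contig, 3171, 1900)` would, under the stated assumptions, return the coding strand of orf2.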
DEPOSIT INFORMATION
The following material has been deposited with the Agricultural Research Service, Patent Culture Collection (NRRL), 1815 North University Street, Peoria, Illinois 61604, under the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. All restrictions on the availability of the deposited material will be irrevocably removed upon the granting of a patent.
Deposited Material    Accession Number    Deposit Date
pEPO15                NRRL B-30033        June 11, 1998
pEPO32                NRRL B-30119        April 16, 1999
DETAILED DESCRIPTION OF THE INVENTION
The genes involved in the biosynthesis of epothilones can be isolated using the techniques according to the present invention. The preferable procedure for the isolation of epothilone biosynthesis genes requires the isolation of genomic DNA from an organism identified as producing epothilones A and B, and the transfer of the isolated DNA on a suitable plasmid or vector to a host organism that does not normally produce the polyketide, followed by the identification of transformed host colonies to which the epothilone-producing ability has been conferred. Using a technique such as λ::Tn5 transposon mutagenesis (de Bruijn & Lupski, Gene 27: 131-149 (1984)), the exact region of the transforming epothilone-conferring DNA can be more precisely defined. Alternatively or additionally, the transforming epothilone-conferring DNA can be cleaved into smaller fragments and the smallest that maintains the epothilone-conferring ability further characterized. Whereas the host organism lacking the ability to produce epothilone may be a different species from the organism from which the polyketide derives, a variation of this technique involves the transformation of host DNA into the same host that has had its epothilone-producing ability disrupted by mutagenesis. In this method, an epothilone-producing organism is mutated and non-epothilone-producing mutants are isolated. These are then complemented by genomic DNA isolated from the epothilone-producing parent strain.

A further example of a technique that can be used to isolate genes required for epothilone biosynthesis is the use of transposon mutagenesis to generate mutants of an epothilone-producing organism that, after mutagenesis, fail to produce the polyketide. Thus, the region of the host genome responsible for epothilone production is tagged by the transposon and can be recovered and used as a probe to isolate the native genes from the parent strain. PKS genes that are required for the synthesis of polyketides and that are similar to known PKS genes may be isolated by virtue of their sequence homology to the biosynthetic genes for which the sequence is known, such as those for the biosynthesis of rifamycin or soraphen. Techniques suitable for isolation by homology include standard library screening by DNA hybridization.
Preferred for use as a probe molecule is a DNA fragment that is obtainable from a gene or another DNA sequence that plays a part in the synthesis of a known polyketide. A preferred probe molecule comprises a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen PKS (U.S. Patent No. 5,716,849), and a more preferred probe molecule comprises the β-ketoacyl synthase domains from the first and second modules of the rifamycin PKS (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). These can be used to probe a gene library of an epothilone-producing microorganism to isolate the PKS genes responsible for epothilone biosynthesis.
Despite the well-known difficulties with PKS gene isolation in general and despite the difficulties expected to be encountered with the isolation of epothilone biosynthesis genes in particular, by using the methods described in the instant specification, biosynthetic genes for epothilones A and B can surprisingly be cloned from a microorganism that produces that polyketide. Using the methods of gene manipulation and recombinant production described in this specification, the cloned PKS genes can be modified and expressed in transgenic host organisms.
The isolated epothilone biosynthetic genes can be expressed in heterologous hosts to enable the production of the polyketide with greater efficiency than might be possible from native hosts. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, heterologous genes can be expressed in Streptomyces and other actinomycetes using techniques such as those described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994), both of which are incorporated herein by reference. See also, Rowe et al., Gene 216: 215-223 (1998); Holmes et al., EMBO Journal 12(8): 3183-3191 (1993); and Bibb et al., Gene 38: 215-226 (1985), all of which are incorporated herein by reference.
Alternatively, genes responsible for polyketide biosynthesis, i.e., epothilone biosynthetic genes, can also be expressed in other host organisms such as pseudomonads and E. coli. Techniques for these genetic manipulations are specific for the different available hosts and are known in the art. For example, PKS genes have been successfully expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See, Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985), incorporated herein by reference. In addition, the expression vectors pKK223-3 and pKK223-2 can be used to express heterologous genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter. For the expression of operons encoding multiple ORFs, the simplest procedure is to insert the operon into a vector such as pKK223-3 in transcriptional fusion, allowing the cognate ribosome binding site of the heterologous genes to be used. Techniques for overexpression in gram-positive species such as Bacillus are also known in the art and can be used in the context of this invention (Quax et al., in: Industrial Microorganisms: Basic and Applied Molecular Genetics, Eds. Baltz et al., American Society for Microbiology, Washington (1993)).
Other expression systems that may be used with the epothilone biosynthetic genes of the invention include yeast and baculovirus expression systems. See, for example, "The Expression of Recombinant Proteins in Yeasts," Sudbery, P. E., Curr. Opin. Biotechnol. 7(5): 517-524 (1996); "Methods for Expressing Recombinant Proteins in Yeast," Mackay, et al., Editor(s): Carey, Paul R., Protein Eng. Des. 105-153, Publisher: Academic, San Diego, Calif. (1996); "Expression of heterologous gene products in yeast," Pichuantes, et al., Editor(s): Cleland, J. L., Craik, C. S., Protein Eng. 129-161, Publisher: Wiley-Liss, New York, N. Y. (1996); WO 98/27203; Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998); "Insect Cell Culture: Recent Advances, Bioengineering Challenges And Implications In Protein Production," Palomares, et al., Editor(s): Galindo, Enrique; Ramirez, Octavio T., Adv. Bioprocess Eng. Vol. II, Invited Pap. Int. Symp., 2nd (1998) 25-52, Publisher: Kluwer, Dordrecht, Neth.; "Baculovirus Expression Vectors," Jarvis, Donald L., Editor(s): Miller, Lois K., Baculoviruses 389-431, Publisher: Plenum, New York, N. Y. (1997); "Production Of Heterologous Proteins Using The Baculovirus/Insect Expression System," Griffiths, et al., Methods Mol. Biol. (Totowa, N. J.) 75 (Basic Cell Culture Protocols (2nd Edition)) 427-440 (1997); and "Insect Cell Expression Technology," Luckow, Verne A., Protein Eng. 183-218, Publisher: Wiley-Liss, New York, N. Y. (1996); all of which are incorporated herein by reference.
Another consideration for expression of PKS genes in heterologous hosts is the requirement of enzymes for posttranslational modification of PKS enzymes by phosphopantetheinylation before they can synthesize polyketides. However, the enzymes responsible for this modification of type I PKS enzymes, phosphopantetheinyl (P-pant) transferases, are not normally present in many hosts such as E. coli. This problem can be solved by coexpression of a P-pant transferase with the PKS genes in the heterologous host, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998), incorporated herein by reference.
Therefore, for the purposes of polyketide production, the significant criteria in the choice of host organism are its ease of manipulation, rapidity of growth (i.e., fermentation), possession of the proper molecular machinery for processes such as posttranslational modification, and its lack of susceptibility to the polyketide being overproduced. Most preferred host organisms are actinomycetes such as strains of Streptomyces. Other preferred host organisms are pseudomonads and E. coli. The above-described methods of polyketide production have significant advantages over the technology currently used in the preparation of the compounds. These advantages include the cheaper cost of production, the ability to produce greater quantities of the compounds, and the ability to produce compounds of a preferred biological enantiomer, as opposed to racemic mixtures inevitably generated by organic synthesis. Compounds produced by heterologous hosts can be used in medical (e.g. cancer treatment in the case of epothilones) as well as agricultural applications.

EXPERIMENTAL
The invention will be further described by reference to the following detailed examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley and Sons, Inc. (1994); T. Maniatis, E. F. Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989); and by T.J. Silhavy, M.L. Berman, and L.W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984).
Example 1: Cultivation of an Epothilone-Producing Strain of Sorangium cellulosum
Sorangium cellulosum strain 90 (DSM 6773, Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig) is streaked out and grown (30°C) on an agar plate of SolE medium (0.35% glucose, 0.05% tryptone, 0.15% MgSO4 x 7H2O, 0.05% ammonium sulfate, 0.1% CaCl2, 0.006% K2HPO4, 0.01% sodium dithionite, 0.0008% Fe-EDTA, 1.2% HEPES, 3.5% [vol/vol] supernatant of sterilized stationary S. cellulosum culture), pH adjusted to 7.4. Cells from about 1 square cm are picked and inoculated into 5 ml of G51t liquid medium (0.2% glucose, 0.5% starch, 0.2% tryptone, 0.1% probion S, 0.05% CaCl2 x 2H2O, 0.05% MgSO4 x 7H2O, 1.2% HEPES, pH adjusted to 7.4) and incubated at 30°C with shaking at 225 rpm. After 4 days, the culture is transferred into 50 ml of G51t and incubated as above for 5 days. This culture is used to inoculate 500 ml of G51t and incubated as above for 6 days. The culture is centrifuged for 10 minutes at 4000 rpm and the cell pellet is resuspended in 50 ml of G51t.
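As a worked illustration of the recipe above, the following Python sketch converts the listed percentages for G51t into gram amounts for a chosen batch volume. It assumes that the solid components are given as weight/volume percentages (g per 100 ml), which the text does not state explicitly; the function names `grams_needed` and `batch` are hypothetical.

```python
# Illustrative sketch only; assumes percentages are weight/volume (g per 100 ml).
G51T_PERCENT = {
    "glucose": 0.2,
    "starch": 0.5,
    "tryptone": 0.2,
    "probion S": 0.1,
    "CaCl2 x 2H2O": 0.05,
    "MgSO4 x 7H2O": 0.05,
    "HEPES": 1.2,
}


def grams_needed(percent_wv, volume_ml):
    """Grams of one component for volume_ml of medium at percent_wv (g/100 ml)."""
    return percent_wv * volume_ml / 100.0


def batch(recipe, volume_ml):
    """Gram amounts for every component of a recipe, rounded to 0.1 mg."""
    return {name: round(grams_needed(pct, volume_ml), 4) for name, pct in recipe.items()}


# The 500 ml culture stage of Example 1 is used here as the batch volume.
for name, grams in batch(G51T_PERCENT, 500).items():
    print(f"{name}: {grams} g per 500 ml")
```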
Example 2: Generation of a Bacterial Artificial Chromosome (Bac) Library
To generate a Bac library, S. cellulosum cells cultivated as described in Example 1 above are embedded into agarose blocks, lysed, and the liberated genomic DNA is partially digested by the restriction enzyme HindIII. The digested DNA is separated on an agarose gel by pulsed-field electrophoresis. Large (approximately 90-150 kb) DNA fragments are isolated from the agarose gel and ligated into the vector pBelobac11. pBelobac11 contains a gene encoding chloramphenicol resistance, a multiple cloning site in the lacZ gene providing for blue/white selection on appropriate medium, as well as the genes required for the replication and maintenance of the plasmid at one or two copies per cell. The ligation mixture is used to transform Escherichia coli DH10B electrocompetent cells using standard electroporation techniques. Chloramphenicol-resistant recombinant (white, lacZ mutant) colonies are transferred to a positively charged nylon membrane filter in 384 3X3 grid format. The clones are lysed and the DNA is cross-linked to the filters. The same clones are also preserved as liquid cultures at -80°C.
Example 3: Screening the Bac Library of Sorangium cellulosum 90 for the Presence of Type I Polyketide Synthase-Related Sequences
The Bac library filters are probed by standard Southern hybridization procedures. The DNA probes used encode β-ketoacyl synthase domains from the first and second modules of the rifamycin polyketide synthase (Schupp et al., FEMS Microbiology Letters 159: 201-207 (1998)). The probe DNAs are generated by PCR with primers flanking each ketosynthase domain using the plasmid pNE95 as the template (pNE95 equals cosmid 2 described in Schupp et al. (1998)). 25 ng of PCR-amplified DNA is isolated from a 0.5% agarose gel and labeled with 32P-dCTP using a random primer labeling kit (Gibco-BRL, Bethesda MD, USA) according to the manufacturer's instructions. Hybridization is at 65°C for 36 hours and membranes are washed at high stringency (3 times with 0.1x SSC and 0.5% SDS for 20 min at 65°C). The labeled blot is exposed on a phosphorescent screen and the signals are detected on a PhosphorImager 445SI (screen and 445SI from Molecular Dynamics). This results in strong hybridization of certain Bac clones to the probes. These clones are selected and cultured overnight in 5 ml of Luria broth (LB) at 37°C. Bac DNA from the Bac clones of interest is isolated by a typical miniprep procedure. The cells are resuspended in 200 µl lysozyme solution (50 mM glucose, 10 mM EDTA, 25 mM Tris-HCl, 5 mg/ml lysozyme), lysed in 400 µl lysis solution (0.2 N NaOH and 2% SDS), the proteins are precipitated (3.0 M potassium acetate, adjusted to pH 5.2 with acetic acid), and the Bac DNA is precipitated with isopropanol. The DNA is resuspended in 20 µl of nuclease-free distilled water, restricted with BamHI (New England Biolabs, Inc.) and separated on a 0.7% agarose gel. The gel is blotted by Southern hybridization as described above and probed under conditions described above, with a 1.2 kb SmaI DNA fragment encoding the ketosynthase domain of the fourth module of the soraphen polyketide synthase as the probe (see, U.S. Patent No. 5,716,849). Five different hybridization patterns are observed. One clone representing each of the five patterns is selected and named pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33, respectively.
Example 4: Subcloning of BamHI Fragments from pEPO15, pEPO20, pEPO30, pEPO31, and pEPO33
The DNA of the five selected Bac clones is digested with BamHI and random fragments are subcloned into pBluescript II SK+ (Stratagene) at the BamHI site. Subclones carrying inserts between 2 and 10 kb in size are selected for sequencing of the flanking ends of the inserts and also probed with the 1.2 kb SmaI probe as described above. Subclones that show a high degree of sequence homology to known polyketide synthases and/or strong hybridization to the soraphen ketosynthase domain are used for gene disruption experiments.
Example 5: Preparation of Streptomycin-Resistant Spontaneous Mutants of Sorangium cellulosum strain So ce90
0.1 ml of a three day old culture of Sorangium cellulosum strain So ce90, which is raised in liquid medium G52-H (0.2% yeast extract, 0.2% soyameal defatted, 0.8% potato starch, 0.2% glucose, 0.1% MgSO4 x 7H2O, 0.1% CaCl2 x 2H2O, 0.008% Fe-EDTA, pH adjusted to 7.4 with KOH), is plated out on agar plates with SolE medium supplemented with 100 µg/ml streptomycin. The plates are incubated at 30°C for 2 weeks. The colonies growing on this medium are streptomycin-resistant mutants, which are streaked out and cultivated once more on the same agar medium with streptomycin for purification. One of these streptomycin-resistant mutants is selected and is called BCE28/2.

Example 6: Gene Disruptions in Sorangium cellulosum BCE28/2 Using the Subcloned BamHI Fragments
The BamHI inserts of the subclones generated from the five selected Bac clones as described above are isolated and ligated into the unique BamHI site of plasmid pCIB132 (see, U.S. Patent No. 5,716,849). The pCIB132 derivatives carrying the inserts are transformed into Escherichia coli ED8767 containing the helper plasmid pUZ8 (Hedges and Matthew, Plasmid 2: 269-278 (1979)). The transformants are used as donors in conjugation experiments with Sorangium cellulosum BCE28/2 as recipient. For the conjugation, 5-10 x 10® cells of Sorangium cellulosum BCE28/2 from an early stationary phase culture (reaching about 5x10® cells/ml) grown at 30°C in liquid medium G51b (G51b equals medium G51t with tryptone replaced by peptone) are mixed in a 1:1 cellular ratio with a late-log phase culture (in LB liquid medium) of E. coli ED8767 containing pCIB132 derivatives carrying the subcloned BamHI fragments and the helper plasmid pUZ8. The mixed cells are then centrifuged at 4000 rpm for 10 minutes and resuspended in 0.5 ml G51b medium. This cell suspension is then plated as a drop in the center of a plate with SolE agar containing 50 mg/l kanamycin. The cells obtained after incubation for 24 hours at 30°C are harvested and resuspended in 0.8 ml of G51b medium, and 0.1 to 0.3 ml of this suspension is plated out on a selective SolE solid medium containing phleomycin (30 mg/l), streptomycin (300 mg/l), and kanamycin (50 mg/l). The counterselection of the donor Escherichia coli strain takes place with the aid of streptomycin. The colonies that grow on this selective medium after an incubation time of 8-12 days at a temperature of 30°C are isolated with a plastic loop and streaked out and cultivated on the same agar medium for a second round of selection and purification. The colony-derived cultures that grow on this selective agar medium after 7 days at a temperature of 30°C are transconjugants of Sorangium cellulosum BCE28/2 that have acquired phleomycin resistance by conjugative transfer of the pCIB132 derivatives carrying the subcloned BamHI fragments.
Integration of the pCIB132-derived plasmids into the chromosome of Sorangium cellulosum BCE28/2 by homologous recombination is verified by Southern hybridization. For this experiment, complete DNA from 5-10 transconjugants per transferred BamHI fragment is isolated (from 10 ml cultures grown in medium G52-H for three days) applying the method described by Pospiech and Neumann, Trends Genet. 11: 217 (1995). For the Southern blot, the DNA isolated as described above is cleaved with the restriction enzyme BglII, ClaI, or NotI, and the respective BamHI inserts or pCIB132 are used as 32P-labelled probes.
Example 7: Analysis of the Effect of the Integrated BamHI Fragments on Epothilone Production by Sorangium cellulosum After Gene Disruption
Transconjugant cells grown on about 1 square cm surface of the selective SolE plates of the second round of selection (see Example 6) are transferred by a sterile plastic loop into 10 ml of medium G52-H in a 50 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 3 days, the culture is transferred into 50 ml of medium G52-H in a 200 ml Erlenmeyer flask. After incubation at 30°C and 180 rpm for 4-5 days, 10 ml of this culture is transferred into 50 ml of medium 23B3 (0.2% glucose, 2% potato starch, 1.6% soya meal defatted, 0.0008% Fe-EDTA sodium salt, 0.5% HEPES (4-(2-hydroxyethyl)-piperazine-1-ethane-sulfonic acid), 2% vol/vol polystyrene resin XAD16 (Rohm & Haas), pH adjusted to 7.8 with NaOH) in a 200 ml Erlenmeyer flask.
Quantitative determination of the epothilone produced takes place after incubation of the cultures at 30°C and 180 rpm for 7 days. The complete culture broth is filtered by suction through a 150 µm nylon filter. The resin remaining on the filter is then resuspended in 10 ml isopropanol and extracted by shaking the suspension at 180 rpm for 1 hour. 1 ml is removed from this suspension and centrifuged at 12,000 rpm in an Eppendorf Microfuge. The amount of epothilones A and B therein is determined by means of HPLC and detection at 250 nm with a UV-DAD detector (HPLC with a Waters Symmetry C18 column and a gradient of 0.02% phosphoric acid 60%-0% and acetonitrile 40%-100%).
Transconjugants with three different integrated BamHI fragments subcloned from pEPO15, namely transconjugants with the BamHI fragment of plasmid pEPO15-21, transconjugants with the BamHI fragment of plasmid pEPO15-4-5, and transconjugants with the BamHI fragment of plasmid pEPO15-4-1, are tested in the manner described above. HPLC analysis reveals that all transconjugants no longer produce epothilone A or B. By contrast, epothilones A and B are detectable in a concentration of 2-4 mg/l in transconjugants with integrated BamHI fragments that are derived from pEPO20, pEPO30, pEPO31, pEPO33, and in the parental strain BCE28/2.

Example 8: Nucleotide Sequence Determination of the Cloned Fragments and Construction of Contigs
A. BamHI Insert of Plasmid pEPO15-21
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-21], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-21 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers. The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)). In subsequent rounds of sequencing reactions, custom-synthesized oligonucleotides, designed for the 3' ends of the previously determined sequences, are used to extend and join contigs. Both strands are entirely sequenced, and every nucleotide is sequenced at least two times. The nucleotide sequence is compiled using the program Sequencher version 3.0 (Gene Codes Corporation), and analyzed using the University of Wisconsin Genetics Computer Group programs. The nucleotide sequence of the 2213-bp insert corresponds to nucleotides 20779-22991 of SEQ ID NO:1.
B. BamHI Insert of Plasmid pEPO15-4-1
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-1], and the nucleotide sequence of the 3.9-kb BamHI insert in pEPO15-4-1 is determined as described in (A) above. The nucleotide sequence of the 3909-bp insert corresponds to nucleotides 16876-20784 of SEQ ID NO:1.
C. BamHI Insert of Plasmid pEPO15-4-5
Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-4-5], and the nucleotide sequence of the 2.3-kb BamHI insert in pEPO15-4-5 is determined as described in (A) above. The nucleotide sequence of the 2233-bp insert corresponds to nucleotides 42528-44760 of SEQ ID NO:1.
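The insert lengths given in (A) through (C) follow directly from the stated coordinates on SEQ ID NO:1. The Python sketch below recomputes them, assuming 1-based inclusive coordinates, and also reports how many positions two inserts share; the function names are hypothetical.

```python
# Illustrative sketch only; assumes 1-based, inclusive coordinates on SEQ ID NO:1.
INSERTS = {
    "pEPO15-21": (20779, 22991),
    "pEPO15-4-1": (16876, 20784),
    "pEPO15-4-5": (42528, 44760),
}


def length_bp(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1


def overlap_bp(a, b):
    """Number of positions shared by two inclusive ranges (0 if disjoint)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo + 1)


for name, (start, end) in INSERTS.items():
    print(name, length_bp(start, end), "bp")

print("shared positions:", overlap_bp(INSERTS["pEPO15-4-1"], INSERTS["pEPO15-21"]))
```

The inserts of pEPO15-4-1 and pEPO15-21 share six positions (20779-20784), which is consistent with both inserts ending at the same six-base BamHI recognition site (GGATCC) and each reported sequence including the full site.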

Example 9: Subcloning and Ordering of DNA Fragments from pEPO15 Containing Epothilone Biosynthesis Genes
pEPO15 is digested to completion with the restriction enzyme HindIII and the resulting fragments are subcloned into pBluescript II SK- or pNEB193 (New England Biolabs) that has been cut with HindIII and dephosphorylated with calf intestinal alkaline phosphatase. Six different clones are generated and named pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24 (all based on pNEB193), and pEPO15-H2.7 and pEPO15-H3.0 (both based on pBluescript II SK-).
The BamHI insert of pEPO15-21 is isolated and DIG-labeled (Non-radioactive DNA labeling and detection system, Boehringer Mannheim), and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH24, indicating that pEPO15-21 is contained within pEPO15-NH24.
The BamHI insert of pEPO15-4-1 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH24 and pEPO15-H2.7. Nucleotide sequence data generated from one end each of pEPO15-NH24 and pEPO15-H2.7 are also in complete agreement with the previously determined sequence of the BamHI insert of pEPO15-4-1. These experiments demonstrate that pEPO15-4-1 (which contains one internal HindIII site) overlaps pEPO15-H2.7 and pEPO15-NH24, and that pEPO15-H2.7 and pEPO15-NH24, in this order, are contiguous.
The BamHI insert of pEPO15-4-5 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. A strong hybridization signal is detected for pEPO15-NH2, indicating that pEPO15-4-5 is contained within pEPO15-NH2.
Nucleotide sequence data is generated from both ends of pEPO15-NH2 and from the end of pEPO15-NH24 that does not overlap with pEPO15-4-1. PCR primers NH24 end "B": GTGACTGGCGCCTGGAATCTGCATGAGC (SEQ ID NO:26), NH2 end "A": AGCGGGAGCTTGCTAGACATTCTGTTTC (SEQ ID NO:27), and NH2 end "B": GACGCGCCTCGGGCAGCGCCCCAA (SEQ ID NO:28), pointing towards the HindIII sites, are designed based on these sequences and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair NH24 end "B" and NH2 end "A" with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced. The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH24 and pEPO15-NH2, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH2 and pEPO15-NH24 are, in this order, contiguous.
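As a small aid for working with the three primers named above, the following Python sketch reports the length, GC fraction and reverse complement of each primer sequence. The helper names are hypothetical, and the sketch makes no claim about how the primers were designed or evaluated in the original work.

```python
# Illustrative sketch only; helper names are hypothetical.
PRIMERS = {
    'NH24 end "B"': "GTGACTGGCGCCTGGAATCTGCATGAGC",  # SEQ ID NO:26
    'NH2 end "A"': "AGCGGGAGCTTGCTAGACATTCTGTTTC",   # SEQ ID NO:27
    'NH2 end "B"': "GACGCGCCTCGGGCAGCGCCCCAA",       # SEQ ID NO:28
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence containing only A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt", f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```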
The HindIII insert of pEPO15-H2.7 is isolated and DIG-labeled as above, and used as a probe in a DNA hybridization experiment at high stringency against pEPO15 digested by NotI. A NotI fragment of about 9 kb in size shows strong hybridization, and is further subcloned into pBluescript II SK- that has been digested with NotI and dephosphorylated with calf intestinal alkaline phosphatase, to yield pEPO15-N9-16. The NotI insert of pEPO15-N9-16 is isolated and DIG-labeled as above, and used as a probe in DNA hybridization experiments at high stringency against pEPO15-NH1, pEPO15-NH2, pEPO15-NH6, pEPO15-NH24, pEPO15-H2.7 and pEPO15-H3.0. Strong hybridization signals are detected for pEPO15-NH6, and also for the expected clones pEPO15-H2.7 and pEPO15-NH24. Nucleotide sequence data is generated from both ends of pEPO15-NH6 and from the end of pEPO15-H2.7 that does not overlap with pEPO15-4-1. PCR primers are designed pointing towards the HindIII sites and used in amplification reactions with pEPO15 and, in separate experiments, with Sorangium cellulosum So ce90 genomic DNA as the templates. Specific amplification is found with primer pair pEPO15-NH6 end "B": CACCGAAGCGTCGATCTGGTCCATC (SEQ ID NO:29) and pEPO15-H2.7 end "A": CGGTCAGATCGACGACGGGCTTTCC (SEQ ID NO:30) with both templates. The amplimers are cloned into pBluescript II SK- and completely sequenced. The sequences of the amplimers are identical, and also agree completely with the end sequences of pEPO15-NH6 and pEPO15-H2.7, fused at the HindIII site, establishing that the HindIII fragments of pEPO15-NH6 and pEPO15-H2.7 are, in this order, contiguous.
All of these experiments, taken together, establish a contig of HindIII fragments covering a region of about 55 kb and consisting of the HindIII inserts of pEPO15-NH6, pEPO15-H2.7, pEPO15-NH24, and pEPO15-NH2, in this order. The inserts of the remaining two HindIII subclones, namely pEPO15-NH1 and pEPO15-H3.0, are not found to be parts of this contig.

Example 10: Further Extension of the Subclone Contig Covering the Epothilone Biosynthesis Genes
An approximately 2.2 kb BamHI - HindIII fragment derived from the downstream end of the insert of pEPO15-NH2 and thus representing the downstream end of the subclone contig described in Example 9 is isolated, DIG-labeled, and used in Southern hybridization experiments against pEPO15 and pEPO15-NH2 DNAs digested with several enzymes. The strongly hybridizing bands are always found to be the same in size between the two target DNAs, indicating that the Sorangium cellulosum So ce90 genomic DNA fragment cloned into pEPO15 ends with the HindIII site at the downstream end of pEPO15-NH2.
A cosmid DNA library of Sorangium cellulosum So ce90 is generated, using established procedures, in pScosTriplex-II (Ji, et al., Genomics 31: 185-192 (1996)). Briefly, high-molecular weight genomic DNA of Sorangium cellulosum So ce90 is partially digested with the restriction enzyme Sau3AI to provide fragments with average sizes of about 40 kb, and ligated to BamHI and XbaI digested pScosTriplex-II. The ligation mix is packaged with Gigapack III XL (Stratagene) and used to transfect E. coli XL1 Blue MR cells.
The cosmid library is screened with the approximately 2.2 kb BamHI - HindIII fragment, derived from the downstream end of the insert of pEPO15-NH2, used as a probe in colony hybridization. A strongly hybridizing clone, named pEPO4E7, is selected.
pEPO4E7 DNA is isolated, digested with several restriction endonucleases, and probed in Southern hybridization experiments with the 2.2 kb BamHI - HindIII fragment. A strongly hybridizing NotI fragment of approximately 9 kb in size is selected and subcloned into pBluescript II SK- to yield pEPO4E7-N9-8. Further Southern hybridization experiments reveal that the approximately 9 kb NotI insert of pEPO4E7-N9-8 overlaps pEPO15-NH2 over 6 kb in a NotI - HindIII fragment, while the remaining approximately 3 kb HindIII - NotI fragment would extend the subclone contig described in Example 9. End sequencing reveals, however, that the downstream end of the insert of pEPO4E7-N9-8 contains the BamHI - NotI polylinker of pScosTriplex-II, thereby indicating that the genomic DNA insert of pEPO4E7 ends at a Sau3AI site within the extending HindIII - NotI fragment and that the NotI site is derived from pScosTriplex-II.
An approximately 1.6 kb PstI - SalI fragment derived from the approximately 3 kb extending HindIII - NotI subfragment of pEPO4E7-N9-8, containing only Sorangium cellulosum So ce90-derived sequences free of vector, is used as a probe against the bacterial artificial chromosome library described in Example 2. Besides the previously isolated EPO15, a Bac clone, named EPO32, is found to strongly hybridize to the probe. pEPO32 is isolated, digested with several restriction endonucleases, and hybridized with the approximately 1.6 kb PstI - SalI probe. A HindIII - EcoRV fragment of about 13 kb in size is found to strongly hybridize to the probe, and is subcloned into pBluescript II SK- digested with HindIII and HincII to yield pEPO32-HEV15.
Oligonucleotide primers are designed based on the downstream end sequence of pEPO15-NH2 and on the upstream (HindIII) end sequence derived from pEPO32-HEV15, and used in sequencing reactions with pEPO4E7-N9-8 as the template. The sequences reveal the existence of a small HindIII fragment (EPO4E7-H0.02) of 24 bp, undetectable in standard restriction analysis, separating the HindIII site at the downstream end of pEPO15-NH2 from the HindIII site at the upstream end of pEPO32-HEV15.
Thus, the subclone contig described in Example 9 is extended to include the HindIII fragment EPO4E7-H0.02 and the insert of pEPO32-HEV15, and constitutes the inserts of: pEPO15-NH6, pEPO15-H2.7, pEPO15-NH24, pEPO15-NH2, EPO4E7-H0.02 and pEPO32-HEV15, in this order.
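Why two closely spaced HindIII sites yield a fragment that is easy to miss can be illustrated with a minimal in silico digest. The Python sketch below uses a hypothetical toy sequence, not the actual Sorangium cellulosum sequence, and relies only on the known HindIII recognition sequence AAGCTT with cleavage after the first A on the top strand; the function name `digest` is an assumption made for illustration.

```python
# Illustrative sketch only; the toy sequence is hypothetical, not taken from SEQ ID NO:1.
HINDIII_SITE = "AAGCTT"
HINDIII_CUT_OFFSET = 1  # HindIII cleaves A^AGCTT on the top strand


def digest(seq, site=HINDIII_SITE, cut_offset=HINDIII_CUT_OFFSET):
    """Top-strand fragment lengths after complete digestion of a linear sequence."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Two HindIII sites whose cut points lie 24 nt apart (6 nt of site plus 18 nt of spacer).
toy = "G" * 50 + HINDIII_SITE + "C" * 18 + HINDIII_SITE + "G" * 70

print(digest(toy))
```

In this toy case the middle fragment is only 24 nt long, far smaller than its neighbours, which mirrors how a fragment of that size can escape detection on a standard agarose gel and be found only by sequencing across the junction.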
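The ordering logic of Examples 9 and 10 amounts to chaining pairwise "immediately upstream of" relations into one linear order. The Python sketch below shows one way to do this; the function name `order_contig` is hypothetical, and the pairs are simply the adjacencies stated in the text, listed here in arbitrary order.

```python
# Illustrative sketch only; order_contig is a hypothetical helper.
# Each pair is (upstream fragment, downstream fragment) as established in Examples 9 and 10.
ADJACENT = [
    ("pEPO15-NH24", "pEPO15-NH2"),
    ("pEPO15-NH6", "pEPO15-H2.7"),
    ("pEPO15-NH2", "EPO4E7-H0.02"),
    ("pEPO15-H2.7", "pEPO15-NH24"),
    ("EPO4E7-H0.02", "pEPO32-HEV15"),
]


def order_contig(pairs):
    """Chain (upstream, downstream) pairs into a single linear order."""
    following = dict(pairs)
    downstream = set(following.values())
    starts = [name for name in following if name not in downstream]
    if len(starts) != 1:
        raise ValueError("pairs do not form a single linear chain")
    order = [starts[0]]
    seen = {starts[0]}
    while order[-1] in following:
        nxt = following[order[-1]]
        if nxt in seen:
            raise ValueError("pairs contain a cycle")
        order.append(nxt)
        seen.add(nxt)
    if len(order) != len(following) + 1:
        raise ValueError("pairs do not form a single linear chain")
    return order


print(" -> ".join(order_contig(ADJACENT)))
```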
Example 11: Nucleotide Sequence Determination of the Subclone Contig Covering the Epothilone Biosynthesis Genes
The nucleotide sequence of the subclone contig described in Example 10 is determined as follows.
pEPO15-H2.7. Plasmid DNA is isolated from the strain Escherichia coli DH10B [pEPO15-H2.7], and the nucleotide sequence of the 2.7-kb HindIII insert in pEPO15-H2.7 is determined. Automated DNA sequencing is done on the double-stranded DNA template by the dideoxynucleotide chain termination method, using Applied Biosystems model 377 sequencers. The primers used are the universal reverse primer (5' GGA AAC AGC TAT GAC CAT G 3' (SEQ ID NO:24)) and the universal forward primer (5' GTA AAA CGA CGG CCA GT 3' (SEQ ID NO:25)). In subsequent rounds of sequencing reactions, custom-synthesized oligonucleotides, designed for the 3' ends of the previously determined sequences, are used to extend and join contigs.

pEP015-NH6, pEP015-NH24 and pEP015-NH2. The HindIII inserts of these plasmids are isolated, and subjected to random fragmentation using a Hydroshear apparatus (Genomic Instrumentation Services, Inc.) to yield an average fragment size of 1-2 kb. The fragments are end-repaired using T4 DNA Polymerase and Klenow DNA Polymerase enzymes in the presence of deoxynucleotide triphosphates, and phosphorylated with T4 DNA Kinase in the presence of ribo-ATP. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
pEP032-HEV15. pEP032-HEV15 is digested with HindIII and SspI, the approximately 13.3 kb fragment containing the ~13 kb HindIII - EcoRV insert from So. cellulosum So ce90 and a 0.3 kb HincII - SspI fragment from pBluescript II SK- is isolated, and partially digested with HaeIII to yield fragments with an average size of 1-2 kb. Fragments in the size range of 1.5-2.2 kb are isolated from agarose gels, and ligated into pBluescript II SK- that has been cut with EcoRV and dephosphorylated. Random subclones are sequenced using the universal reverse and the universal forward primers.
The chromatograms are analyzed and assembled into contigs with the Phred, Phrap and Consed programs (Ewing, et al., Genome Res. 8(3): 175-185 (1998); Ewing, et al., Genome Res. 8(3): 186-194 (1998); Gordon, et al., Genome Res. 8(3): 195-202 (1998)). Contig gaps are filled, sequence discrepancies are resolved, and low-quality regions are resequenced using custom-designed oligonucleotide primers for sequencing on either the original subclones or selected clones from the random subclone libraries. Both strands are completely sequenced, and every base pair is covered with an aggregated Phred score of at least 40 (confidence level of 99.99%).
The nucleotide sequence of the 68750 bp contig is shown as SEQ ID NO:1.
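The relationship between the Phred score quoted above and the stated confidence level can be checked with a short calculation. The following Python sketch is illustrative only; it assumes the standard Phred definition Q = -10 log10(P), where P is the estimated probability of a wrong base call, and the function names are hypothetical and not part of the Phred, Phrap or Consed programs.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Estimated probability that a base call is wrong, given Phred score q."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score corresponding to an estimated error probability p."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    q = 40
    accuracy = 1 - phred_to_error_probability(q)
    # A score of 40 corresponds to one expected error in 10,000 bases.
    print(f"Phred {q}: accuracy {accuracy:.2%}")
```

Under this definition a minimum score of 40 corresponds to at most one expected miscall per 10,000 bases, which is the 99.99% figure given in the text.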

Example 12: Nucleotide Sequence Analysis of the Epothilone Biosynthesis Genes
SEQ ID NO:1 is found to contain 22 ORFs as detailed below in Table 1:

epoA (nucleotides 7610-11875 of SEQ ID NO:1) codes for EPOS A (SEQ ID NO:2), a type I polyketide synthase consisting of a single module, and harboring the following domains: β-ketoacyl-synthase (KS) (nucleotides 7643-8920 of SEQ ID NO:1, amino acids 11-437 of SEQ ID NO:2); acyltransferase (AT) (nucleotides 9236-10201 of SEQ ID NO:1, amino acids 543-864 of SEQ ID NO:2); enoyl reductase (ER) (nucleotides 10529-11428 of SEQ ID NO:1, amino acids 974-1273 of SEQ ID NO:2); and acyl carrier protein homologous domain (ACP) (nucleotides 11549-11764 of SEQ ID NO:1, amino acids 1314-1385 of SEQ ID NO:2). Sequence comparisons and motif analysis (Haydock, et al., FEBS Lett. 374: 246-248 (1995); Tang, et al., Gene 216: 255-265 (1998)) reveal that the AT encoded by EPOS A is specific for malonyl-CoA. EPOS A should be involved in the initiation of epothilone biosynthesis by loading onto the multienzyme complex the acetate unit that will eventually form part of the 2-methylthiazole ring (C26 and C20).
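The domain boundaries above are given twice, once as nucleotide positions in SEQ ID NO:1 and once as amino acid positions in the encoded protein. A minimal Python sketch of the conversion between the two is shown below. It assumes a forward-strand ORF whose first nucleotide is the first base of codon 1; the function name is hypothetical, and the sketch is not the procedure used to annotate the sequence.

```python
def nt_range_to_aa_range(orf_start: int, nt_start: int, nt_end: int) -> tuple[int, int]:
    """Map a 1-based nucleotide range inside a forward-strand ORF to the
    1-based positions of the first and last codons it touches."""
    first = (nt_start - orf_start) // 3 + 1
    last = (nt_end - orf_start) // 3 + 1
    return first, last


if __name__ == "__main__":
    EPOA_START = 7610  # first nucleotide of epoA in SEQ ID NO:1, from the text
    # AT and ER domains of EPOS A, nucleotide coordinates from the text
    print("AT:", nt_range_to_aa_range(EPOA_START, 9236, 10201))
    print("ER:", nt_range_to_aa_range(EPOA_START, 10529, 11428))
```

For the AT and ER domains of EPOS A this reproduces the amino acid ranges 543-864 and 974-1273 stated in the paragraph above.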
epoP (nucleotides 11872-16104 of SEQ ID NO:1) codes for EPOS P (SEQ ID NO:3), a non-ribosomal peptide synthetase containing one module. EPOS P harbors the following domains:
• peptide bond formation domain, as delineated by motif K (amino acids 72-81 [FPLTDIQESY] of SEQ ID NO:3, corresponding to nucleotide positions 12085-12114 of SEQ ID NO:1); motif L (amino acids 118-125 [VVARHDML] of SEQ ID NO:3, corresponding to nucleotide positions 12223-12246 of SEQ ID NO:1); motif M (amino acids 199-212 [SIDLINVDLGSLSI] of SEQ ID NO:3, corresponding to nucleotide positions 12466-12507 of SEQ ID NO:1); and motif O (amino acids 353-363 [GDFTSMVLLDI] of SEQ ID NO:3, corresponding to nucleotide positions 12928-12960 of SEQ ID NO:1);
• aminoacyl adenylate formation domain, as delineated by motif A (amino acids 549-565 [LTYEELSRRSRRLGARL] of SEQ ID NO:3, corresponding to nucleotide positions 13516-13566 of SEQ ID NO:1); motif B (amino acids 588-603 [VAVLAVLESGAAYVPI] of SEQ ID NO:3, corresponding to nucleotide positions 13633-13680 of SEQ ID NO:1); motif C (amino acids 669-684 [AYVIYTSGSTGLPKGV] of SEQ ID NO:3, corresponding to nucleotide positions 13876-13923 of SEQ ID NO:1); motif D (amino acids 815-821 [SLGGATE] of SEQ ID NO:3, corresponding to nucleotide positions 14313-14334 of SEQ ID NO:1); motif E (amino acids 868-892 [GQLYIGGVGLALGYWRDEEKTRKSF] of SEQ ID NO:3, corresponding to nucleotide positions 14473-14547 of SEQ ID NO:1); motif F (amino acids 903-912 [YKTGDLGRYL] of SEQ ID NO:3, corresponding to nucleotide positions 14578-14607 of SEQ ID NO:1); motif G (amino acids 918-940 [EFMGREDNQIKLRGYRVELGEIE] of SEQ ID NO:3, corresponding to nucleotide positions 14623-14692 of SEQ ID NO:1); motif H (amino acids 1268-1274 [LPEYMVP] of SEQ ID NO:3, corresponding to nucleotide positions 15673-15693 of SEQ ID NO:1); and motif I (amino acids 1285-1297 [LTSNGKVDRKALR] of SEQ ID NO:3, corresponding to nucleotide positions 15724-15762 of SEQ ID NO:1);
• an unknown domain, inserted between motifs G and H of the aminoacyl adenylate formation domain (amino acids 973-1256 of SEQ ID NO:3, corresponding to nucleotide positions 14788-15639 of SEQ ID NO:1); and
• a peptidyl carrier protein homologous domain (PCP), delineated by motif J (amino acids 1344-1351 [GATSIHIV] of SEQ ID NO:3, corresponding to nucleotide positions 15901-15924 of SEQ ID NO:1).
It is proposed that EPOS P is involved in activating a cysteine by adenylation, binding the activated cysteine as an aminoacyl-S-PCP, forming a peptide bond between the enzyme-bound cysteine and the acetyl-S-ACP supplied by EPOS A, and forming the initial thiazoline ring by intramolecular heterocyclization. The unknown domain of EPOS P displays very weak homologies to NAD(P)H oxidases and reductases from Bacillus species. Thus, this unknown domain and/or the ER domain of EPOS A may be involved in the oxidation of the initial 2-methylthiazoline ring to a 2-methylthiazole.
epoB (nucleotides 16251-21749 of SEQ ID NO:1) codes for EPOS B (SEQ ID NO:4), a type I polyketide synthase consisting of a single module, and harboring the following domains: KS (nucleotides 16269-17546 of SEQ ID NO:1, amino acids 7-432 of SEQ ID NO:4); AT (nucleotides 17865-18827 of SEQ ID NO:1, amino acids 539-859 of SEQ ID NO:4); dehydratase (DH) (nucleotides 18855-19361 of SEQ ID NO:1, amino acids 869-1037 of SEQ ID NO:4); β-ketoreductase (KR) (nucleotides 20565-21302 of SEQ ID NO:1, amino acids 1439-1684 of SEQ ID NO:4); and ACP (nucleotides 21414-21626 of SEQ ID NO:1, amino acids 1722-1792 of SEQ ID NO:4). Sequence comparisons and motif analysis reveal that the AT encoded by EPOS B is specific for methylmalonyl-CoA. EPOS B should be involved in the first polyketide chain extension by catalysing the Claisen-like condensation of the 2-methyl-4-thiazolecarboxyl-S-PCP starter group with the methylmalonyl-S-ACP, and the concomitant reduction of the β-keto group of C17 to an enoyl.
epoC (nucleotides 21746-43519 of SEQ ID NO:1) codes for EPOS C (SEQ ID NO:5), a type I polyketide synthase consisting of 4 modules. The first module harbors a KS (nucleotides 21860-23116 of SEQ ID NO:1, amino acids 39-457 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 23431-24397 of SEQ ID NO:1, amino acids 563-884 of SEQ ID NO:5); a KR (nucleotides 25184-25942 of SEQ ID NO:1, amino acids 1147-1399 of SEQ ID NO:5); and an ACP (nucleotides 26045-26263 of SEQ ID NO:1, amino acids 1434-1506 of SEQ ID NO:5). This module incorporates an acetate extender unit (C14-C13) and reduces the β-keto group at C15 to the hydroxyl group that takes part in the final lactonization of the epothilone macrolactone ring. The second module of EPOS C harbors a KS (nucleotides 26318-27595 of SEQ ID NO:1, amino acids 1524-1950 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 27911-28876 of SEQ ID NO:1, amino acids 2056-2377 of SEQ ID NO:5); a KR (nucleotides 29678-30429 of SEQ ID NO:1, amino acids 2645-2895 of SEQ ID NO:5); and an ACP (nucleotides 30539-30759 of SEQ ID NO:1, amino acids 2932-3005 of SEQ ID NO:5). This module incorporates an acetate extender unit (C12-C11) and reduces the β-keto group at C13 to a hydroxyl group. Thus, the nascent polyketide chain of epothilone corresponds to epothilone A, and the incorporation of the methyl side chain at C12 in epothilone B would require a post-PKS C-methyltransferase activity. The formation of the epoxy ring at C13-C12 would also require a post-PKS oxidation step. The third module of EPOS C harbors a KS (nucleotides 30815-32092 of SEQ ID NO:1, amino acids 3024-3449 of SEQ ID NO:5); a malonyl CoA-specific AT (nucleotides 32408-33373 of SEQ ID NO:1, amino acids 3555-3876 of SEQ ID NO:5); a DH (nucleotides 33401-33889 of SEQ ID NO:1, amino acids 3886-4048 of SEQ ID NO:5); an ER (nucleotides 35042-35902 of SEQ ID NO:1, amino acids 4433-4719 of SEQ ID NO:5); a KR (nucleotides 35930-36667 of SEQ ID NO:1, amino acids 4729-4974 of SEQ ID NO:5); and an ACP (nucleotides 36773-36991 of SEQ ID NO:1, amino acids 5010-5082 of SEQ ID NO:5). This module incorporates an acetate extender unit (C10-C9) and fully reduces the β-keto group at C11.
The fourth module of EPOS C harbors a KS (nucleotides 37052-38320 of SEQ ID NO:1, amino acids 5103-5525 of SEQ ID NO:5); a methylmalonyl CoA-specific AT (nucleotides 38636-39598 of SEQ ID NO:1, amino acids 5631-5951 of SEQ ID NO:5); a DH (nucleotides 39635-40141 of SEQ ID NO:1, amino acids 5964-6132 of SEQ ID NO:5); an ER (nucleotides 41369-42256 of SEQ ID NO:1, amino acids 6542-6837 of SEQ ID NO:5); a KR (nucleotides 42314-43048 of SEQ ID NO:1, amino acids 6857-7101 of SEQ ID NO:5); and an ACP (nucleotides 43163-43378 of SEQ ID NO:1, amino acids 7140-7211 of SEQ ID NO:5). This module incorporates a propionate extender unit (C24 and C8-C7) and fully reduces the β-keto group at C9.
epoD (nucleotides 43524-54920 of SEQ ID NO:1) codes for EPOS D (SEQ ID NO:6), a type I polyketide synthase consisting of 2 modules. The first module harbors a KS (nucleotides 43626-44885 of SEQ ID NO:1, amino acids 35-454 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 45204-46166 of SEQ ID NO:1, amino acids 561-881 of SEQ ID NO:6); a KR (nucleotides 46950-47702 of SEQ ID NO:1, amino acids 1143-1393 of SEQ ID NO:6); and an ACP (nucleotides 47811-48032 of SEQ ID NO:1, amino acids 1430-1503 of SEQ ID NO:6). This module incorporates a propionate extender unit (C23 and C6-C5) and reduces the β-keto group at C7 to a hydroxyl group. The second module harbors a KS (nucleotides 48087-49361 of SEQ ID NO:1, amino acids 1522-1946 of SEQ ID NO:6); a methylmalonyl CoA-specific AT (nucleotides 49680-50642 of SEQ ID NO:1, amino acids 2053-2373 of SEQ ID NO:6); a DH (nucleotides 50670-51176 of SEQ ID NO:1, amino acids 2383-2551 of SEQ ID NO:6); a methyltransferase (MT, nucleotides 51534-52657 of SEQ ID NO:1, amino acids 2671-3045 of SEQ ID NO:6); a KR (nucleotides 53697-54431 of SEQ ID NO:1, amino acids 3392-3636 of SEQ ID NO:6); and an ACP (nucleotides 54540-54758 of SEQ ID NO:1, amino acids 3673-3745 of SEQ ID NO:6). This module incorporates a propionate extender unit (C21 or C22 and C4-C3) and reduces the β-keto group at C5 to a hydroxyl group. This reduction is somewhat unexpected, since epothilones contain a keto group at C5. Discrepancies of this kind between the deduced reductive capabilities of PKS modules and the redox state of the corresponding positions in the final polyketide products have, however, been reported in the literature (see, for example, Schwecke, et al., Proc. Natl. Acad. Sci. USA 92: 7839-7843 (1995) and Schupp, et al., FEMS Microbiology Letters 159: 201-207 (1998)). An important feature of epothilones is the presence of gem-methyl side groups at C4 (C21 and C22). The second module of EPOS D is predicted to incorporate a propionate unit into the growing polyketide chain, providing one methyl side chain at C4. This module also contains a methyltransferase domain integrated into the PKS between the DH and the KR domains, in an arrangement similar to the one seen in the HMWP1 yersiniabactin synthase (Gehring, A.M., DeMoll, E., Fetherston, J.D., Mori, I., Mayhew, G.F., Blattner, F.R., Walsh, C.T., and Perry, R.D.: Iron acquisition in plague: modular logic in enzymatic biogenesis of yersiniabactin by Yersinia pestis. Chem. Biol. 5, 573-586, 1998). This MT domain in EPOS D is proposed to be responsible for the incorporation of the second methyl side group (C21 or C22) at C4.
epoE (nucleotides 54935-62254 of SEQ ID NO:1) codes for EPOS E (SEQ ID NO:7), a type I polyketide synthase consisting of one module, harboring a KS (nucleotides 55028-56284 of SEQ ID NO:1, amino acids 32-450 of SEQ ID NO:7); a malonyl CoA-specific AT (nucleotides 56600-57565 of SEQ ID NO:1, amino acids 556-877 of SEQ ID NO:7); a DH (nucleotides 57593-58087 of SEQ ID NO:1, amino acids 887-1051 of SEQ ID NO:7); a probably nonfunctional ER (nucleotides 59366-60304 of SEQ ID NO:1, amino acids 1478-1790 of SEQ ID NO:7); a KR (nucleotides 60362-61099 of SEQ ID NO:1, amino acids 1810-2055 of SEQ ID NO:7); an ACP (nucleotides 61211-61426 of SEQ ID NO:1, amino acids 2093-2164 of SEQ ID NO:7); and a thioesterase (TE) (nucleotides 61427-62254 of SEQ ID NO:1, amino acids 2165-2439 of SEQ ID NO:7). The ER domain in this module harbors an active site motif with some highly unusual amino acid substitutions that probably render this domain inactive. The module incorporates an acetate extender unit (C2-C1), and reduces the β-keto at C3 to an enoyl group. Epothilones contain a hydroxyl group at C3, so this reduction also appears to be excessive as discussed for the second module of EPOS D. The TE domain of EPOS E takes part in the release and cyclization of the grown polyketide chain via lactonization between the carboxyl group of C1 and the hydroxyl group of C15.
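The reasoning used above to deduce the fate of each β-keto group from the reductive domains present in a module can be summarised as a simple rule. The Python sketch below is a deliberate simplification, assuming the conventional KR, then DH, then ER order of processing; the function name and the data layout are hypothetical. It is applied only to modules for which the text states the outcome, and it does not capture the discrepancies the text itself discusses for the second module of EPOS D.

```python
def predicted_beta_carbon_state(domains, inactive=()):
    """Predict the state of the beta-carbon left by one PKS module from its
    active reductive domains (simplified KR -> DH -> ER rule)."""
    active = set(domains) - set(inactive)
    if "KR" not in active:
        return "keto"            # no ketoreduction: keto group retained
    if "DH" not in active:
        return "hydroxyl"        # KR only
    if "ER" not in active:
        return "enoyl"           # KR + DH
    return "fully reduced"       # KR + DH + ER


# Domain compositions as listed in the text; the ER of EPOS E is treated as inactive.
modules = {
    "EPOS B": (["KS", "AT", "DH", "KR", "ACP"], []),
    "EPOS C module 1": (["KS", "AT", "KR", "ACP"], []),
    "EPOS C module 3": (["KS", "AT", "DH", "ER", "KR", "ACP"], []),
    "EPOS E": (["KS", "AT", "DH", "ER", "KR", "ACP", "TE"], ["ER"]),
}

if __name__ == "__main__":
    for name, (domains, inactive) in modules.items():
        print(f"{name}: {predicted_beta_carbon_state(domains, inactive)}")
```

Treating the ER domain of EPOS E as inactive is what changes its prediction from a fully reduced carbon to an enoyl group, in line with the description of that module.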
Five ORFs are detected upstream of epoA in the sequenced region. The partially sequenced orf1 has no homologues in the sequence databanks. The deduced protein product (Orf 2, SEQ ID NO:10) of orf2 (nucleotides 3171-1900 on the reverse complement strand of SEQ ID NO:1) shows strong similarities to hypothetical ORFs from Mycobacterium and Streptomyces coelicolor, and more distant similarities to carboxypeptidases and DD-peptidases of different bacteria. The deduced protein product of orf3 (nucleotides 3415-5556 of SEQ ID NO:1), Orf 3 (SEQ ID NO:11), shows homologies to Na/H antiporters of different bacteria. Orf 3 might take part in the export of epothilones from the producer strain. orf4 and orf5 have no homologues in the sequence databanks.
Eleven ORFs are found downstream of epoE in the sequenced region. epoF (nucleotides 62369-63628 of SEQ ID NO:1) codes for EPOS F (SEQ ID NO:8), a deduced protein with strong sequence similarities to cytochrome P450 oxygenases. EPOS F may take part in the adjustment of the redox state of the carbons C12, C5, and/or C3. The deduced protein product of orf14 (nucleotides 67334-68251 of SEQ ID NO:1), Orf 14 (SEQ ID NO:22), shows strong similarities to GI:3293544, a hypothetical protein with no proposed function from Streptomyces coelicolor, and also to GI:2654559, the human embryonic lung protein. It is also more distantly related to cation efflux system proteins like GI:2623026 from Methanobacterium thermoautotrophicum, so it might also take part in the export of epothilones from the producing cells. The remaining ORFs (orf6-orf13 and orf15) show no homologies to entries in the sequence databanks.
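The ORF encoding Orf 2 is given with its start coordinate larger than its end coordinate because it lies on the reverse complement strand. A minimal Python sketch of how such a feature could be extracted from a sequence is shown below. The convention that start > end denotes the reverse strand and the function names are assumptions made for illustration; the demonstration uses a short toy sequence rather than SEQ ID NO:1.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of a feature with 1-based inclusive coordinates."""
    return abs(start - end) + 1


def extract_feature(seq: str, start: int, end: int) -> str:
    """Extract a feature; start > end is read as lying on the reverse strand."""
    if start <= end:
        return seq[start - 1:end]
    return reverse_complement(seq[end - 1:start])


if __name__ == "__main__":
    toy = "AAACCCGGGTTT"                 # toy sequence, not SEQ ID NO:1
    print(extract_feature(toy, 4, 6))    # forward-strand feature
    print(extract_feature(toy, 6, 1))    # reverse-strand feature
    # Coordinates 3171-1900 from the text span a whole number of codons.
    print(feature_length(3171, 1900), feature_length(3171, 1900) % 3)
```

With the coordinates stated in the text, the reverse-strand ORF spans 1272 nucleotides, that is, 424 codons including the stop codon.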
Example 13: Recombinant Expression of Epothilone Biosynthesis Genes

Epothilone synthase genes according to the present invention are expressed in heterologous organisms for the purposes of epothilone production at greater quantities than can be accomplished by fermentation of Sorangium cellulosum. A preferable host for heterologous expression is Streptomyces, e.g. Streptomyces coelicolor, which natively produces the polyketide actinorhodin. Techniques for recombinant PKS gene expression in this host are described in McDaniel et al., Science 262: 1546-1550 (1993) and Kao et al., Science 265: 509-512 (1994). See also, Holmes et al., EMBO Journal 12(8): 3183-3191 (1993) and Bibb et al., Gene 38: 215-226 (1985), as well as U.S. Patent Nos. 5,521,077, 5,672,491, and 5,712,146, which are incorporated herein by reference.
According to one method, the heterologous host strain is engineered to contain a chromosomal deletion of the actinorhodin (act) gene cluster. Expression plasmids containing the epothilone synthase genes of the invention are constructed by transferring DNA from a temperature-sensitive donor plasmid to a recipient shuttle vector in E. coli (McDaniel et al. (1993) and Kao et al. (1994)), such that the synthase genes are built up by homologous recombination within the vector. Alternatively, the epothilone synthase gene cluster is introduced into the vector by restriction fragment ligation. Following selection, e.g. as described in Kao et al. (1994), DNA from the vector is introduced into the act-minus Streptomyces coelicolor strain according to protocols set forth in Hopwood et al., Genetic Manipulation of Streptomyces: A Laboratory Manual (John Innes Foundation, Norwich, United Kingdom, 1985), incorporated herein by reference. The recombinant Streptomyces strain is grown on R2YE medium (Hopwood et al. (1985)) and produces epothilones. Alternatively, the epothilone synthase genes according to the present invention are expressed in other host organisms such as pseudomonads, Bacillus, yeast, insect cells and/or E. coli. PKS and NRPS genes are preferably expressed in E. coli using the pT7-7 vector, which uses the T7 promoter. See, Tabor et al., Proc. Natl. Acad. Sci. USA 82: 1074-1078 (1985). In another embodiment, the expression vectors pKK223-3 and pKK223-2 are used to express PKS and NRPS genes in E. coli, either in transcriptional or translational fusion, behind the tac or trc promoter. Expression of PKS and NRPS genes in heterologous hosts, which do not naturally have the phosphopantetheinyl (P-pant) transferases needed for post-translational modification of PKS enzymes, requires the coexpression in the host of a P-pant transferase, as described by Kealey et al., Proc. Natl. Acad. Sci. USA 95: 505-509 (1998).

Example 14: Isolation of Epothilones from Producing Strains
Examples of cultivation, fermentation, and extraction procedures for polyketide isolation, which are useful for extracting epothilones from both native and recombinant hosts according to the present invention, are given in WO 93/10121, incorporated herein by reference, in Example 57 of U.S. Patent No. 5,639,949, in Gerth et al., J. Antibiotics 49: 560-563 (1996), and in Swiss patent application no. 396/98, filed February 19, 1998, and U.S. patent application no. 09/248,910 (which also discloses preferred mutant strains of Sorangium cellulosum), both of which are incorporated herein by reference. The following are procedures that are useful for isolating epothilones from cultured Sorangium cellulosum strains such as So ce90, and may also be used for the isolation of epothilone from recombinant hosts.



Addition of cyclodextrins and cyclodextrin derivatives:
Cyclodextrins (Fluka, Buchs, Switzerland, or Wacker Chemie, Munich, Germany) in different concentrations are sterilised separately and added to the 1B12 medium prior to seeding.
Cultivation: 1 ml of the suspension of Sorangium cellulosum So ce90 from a liquid N2 ampoule is transferred to 10 ml of G52 medium (in a 50 ml Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 25 mm displacement. 5 ml of this culture is added to 45 ml of G52 medium (in a 200 ml Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 25 mm displacement. 50 ml of this culture is then added to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask) and incubated for 3 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Maintenance culture: The culture is overseeded every 3-4 days, by adding 50 ml of culture to 450 ml of G52 medium (in a 2 litre Erlenmeyer flask). All experiments and fermentations are carried out by starting with this maintenance culture.
Tests in a flask:
(i) Preculture in an agitating flask:

Starting with the 500 ml of maintenance culture, 1 x 450 ml of G52 medium is seeded with 50 ml of the maintenance culture and incubated for 4 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
(ii) Main culture in the agitating flask:
40 ml of 1B12 medium plus 5 g/l 4-morpholine-propane-sulfonic acid (= MOPS) powder (in a 200 ml Erlenmeyer flask) are mixed with 5 ml of a 10x concentrated cyclodextrin solution, seeded with 10 ml of preculture and incubated for 5 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Fermentation: Fermentations are carried out on a scale of 10 litres, 100 litres and 500 litres. 20 litre and 100 litre fermentations serve as an intermediate culture step. Whereas the precultures and intermediate cultures are seeded, like the maintenance culture, at 10% (v/v), the main cultures are seeded with 20% (v/v) of the intermediate culture. Important: In contrast to the agitating cultures, the ingredients of the media for the fermentation are calculated on the final culture volume including the inoculum. If, for example, 18 litres of medium + 2 litres of inoculum are combined, then substances for 20 litres are weighed in, but are only mixed with 18 litres.
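The weigh-in rule stated above (ingredients calculated on the final volume, but dissolved in the volume before seeding) can be illustrated with a short Python sketch. The function name is hypothetical, and the recipe used in the demonstration is a made-up placeholder, since the composition of the G52 and 1B12 media is not given in this section.

```python
def fermentation_batch(final_volume_l, inoculum_fraction, recipe_g_per_l):
    """Return (medium volume, inoculum volume, grams to weigh in per ingredient).

    Ingredients are calculated on the final culture volume including the
    inoculum, but are dissolved only in the medium volume.
    """
    inoculum_l = final_volume_l * inoculum_fraction
    medium_l = final_volume_l - inoculum_l
    weigh_in_g = {name: conc * final_volume_l for name, conc in recipe_g_per_l.items()}
    return medium_l, inoculum_l, weigh_in_g


if __name__ == "__main__":
    # Placeholder recipe: "component_x" at 2 g/l is hypothetical, not from the text.
    medium_l, inoculum_l, weigh_in = fermentation_batch(20, 0.10, {"component_x": 2.0})
    print(f"medium {medium_l} l + inoculum {inoculum_l} l, weigh in {weigh_in}")
```

For the 20 litre example in the text this gives 18 litres of medium and 2 litres of inoculum, with each ingredient weighed in for the full 20 litres.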
Preculture in an agitating flask:
Starting with the 500 ml maintenance culture, 4 x 450 ml of G52 medium (in a 2 litre Erlenmeyer flask) are each seeded with 50 ml thereof, and incubated for 4 days at 180 rpm in an agitator at 30°C, 50 mm displacement.
Intermediate culture, 20 litres or 100 litres:
20 litres: 18 litres of G52 medium in a fermenter having a total volume of 30 litres are seeded with 2 litres of the preculture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control.
100 litres: 90 litres of G52 medium in a fermenter having a total volume of 150 litres are seeded with 10 litres of the 20 litre intermediate culture. Cultivation lasts for 3-4 days, and the conditions are: 30°C, 150 rpm, 0.5 litres of air per litre liquid per min, 0.5 bars excess pressure, no pH control.

Main culture, 10 litres, 100 litres or 500 litres:
10 litres: The media substances for 10 litres of 1B12 medium are sterilised in 7 litres of water, then 1 litre of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution is added, and the mixture is seeded with 2 litres of a 20 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 250 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5 (i.e. no control between pH 7.1 and 8.1).
100 litres: The media substances for 100 litres of 1B12 medium are sterilised in 70 litres of water, then 10 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and the mixture is seeded with 20 litres of a 20 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 200 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5. The chain of seeding for a 100 litre fermentation is shown schematically as follows:

500 litres: The media substances for 500 litres of 1B12 medium are sterilised in 350 litres of water, then 50 litres of a sterile 10% 2-(hydroxypropyl)-β-cyclodextrin solution are added, and the mixture is seeded with 100 litres of a 100 litre intermediate culture. The duration of the main culture is 6-7 days, and the conditions are: 30°C, 120 rpm, 0.5 litres of air per litre of liquid per min, 0.5 bars excess pressure, pH control with H2SO4/KOH to pH 7.6 +/- 0.5.
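The three main-culture recipes above follow the same proportions. The Python sketch below recomputes, for each scale, the total volume, the final cyclodextrin concentration and the inoculum fraction from the volumes stated in the text. It assumes that the 10% cyclodextrin solution is 10% (w/v), i.e. 100 g per litre, and that volumes are additive; the function name is hypothetical.

```python
def main_culture(water_l, cd_solution_l, inoculum_l, cd_percent_wv=10.0):
    """Return (total volume in l, cyclodextrin in g/l, inoculum fraction)."""
    total_l = water_l + cd_solution_l + inoculum_l
    cd_grams = cd_solution_l * cd_percent_wv * 10  # 1% (w/v) corresponds to 10 g/l
    return total_l, cd_grams / total_l, inoculum_l / total_l


if __name__ == "__main__":
    # (water, cyclodextrin solution, inoculum) in litres, taken from the text
    for scale in [(7, 1, 2), (70, 10, 20), (350, 50, 100)]:
        total_l, cd_g_per_l, inoc = main_culture(*scale)
        print(f"{total_l} l: {cd_g_per_l} g/l cyclodextrin, {inoc:.0%} inoculum")
```

Under these assumptions every scale ends at 10 g/l (1% w/v) cyclodextrin with a 20% (v/v) inoculum.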
Product analysis:
Preparation of the sample:
50 ml samples are mixed with 2 ml of polystyrene resin Amberlite XAD16 (Rohm + Haas, Frankfurt, Germany) and shaken at 180 rpm for one hour at 30°C. The resin is subsequently filtered using a 150 µm nylon sieve, washed with a little water and then added together with the filter to a 15 ml Nunc tube.
Elution of the product from the resin:
10 ml of isopropanol (>99%) are added to the tube with the filter and the resin. Afterwards, the sealed tube is shaken for 30 minutes at room temperature on a Rota-Mixer (Labinco BV, Netherlands). Then, 2 ml of the liquid are centrifuged off and the supernatant is added using a pipette to HPLC tubes.
B: Effect of the addition of cyclodextrin and cyclodextrin derivatives on the epothilone concentrations attained.
Cyclodextrins are cyclic (α-1,4)-linked oligosaccharides of α-D-glucopyranose with a relatively hydrophobic central cavity and a hydrophilic external surface area.
The following are distinguished in particular (the figures in parenthesis give the number of glucose units per molecule): α-cyclodextrin (6), β-cyclodextrin (7), γ-cyclodextrin (8), δ-cyclodextrin (9), ε-cyclodextrin (10), ζ-cyclodextrin (11), η-cyclodextrin (12), and θ-cyclodextrin (13). Especially preferred are δ-cyclodextrin and in particular α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, or mixtures thereof.

Cyclodextrin derivatives are primarily derivatives of the above-mentioned cyclodextrins, especially of α-cyclodextrin, β-cyclodextrin or γ-cyclodextrin, primarily those in which one or more up to all of the hydroxy groups (3 per glucose radical) are etherified or esterified. Ethers are primarily alkyl ethers, especially lower alkyl, such as methyl or ethyl ether, also propyl or butyl ether; the aryl-hydroxyalkyl ethers, such as phenyl-hydroxy-lower-alkyl, especially phenyl-hydroxyethyl ether; the hydroxyalkyl ethers, in particular hydroxy-lower-alkyl ethers, especially 2-hydroxyethyl, hydroxypropyl such as 2-hydroxypropyl or hydroxybutyl such as 2-hydroxybutyl ether; the carboxyalkyl ethers, in particular carboxy-lower-alkyl ethers, especially carboxymethyl or carboxyethyl ether; derivatised carboxyalkyl ethers, in particular derivatised carboxy-lower-alkyl ether in which the derivatised carboxy is etherified or amidated carboxy (primarily aminocarbonyl, mono- or di-lower-alkyl-aminocarbonyl, morpholino-, piperidino-, pyrrolidino- or piperazino-carbonyl, or alkyloxycarbonyl), in particular lower alkoxycarbonyl-lower-alkyl ether, for example methyloxycarbonylpropyl ether or ethyloxycarbonylpropyl ether; the sulfoalkyl ethers, in particular sulfo-lower-alkyl ethers, especially sulfobutyl ether; cyclodextrins in which one or more OH groups are etherified with
a radical of formula
wherein R' is hydrogen, hydroxy, -O-(alk-O)z-H, -O-(alk(-R)-O-)p-H or -O-(alk(-R)-O-)q-alk-CO-Y; alk in all cases is alkyl, especially lower alkyl; m, n, p, q and z are a whole number from 1 to 12, preferably 1 to 5, in particular 1 to 3; and Y is OR1 or NR2R3, wherein R1, R2 and R3 independently of one another, are hydrogen or lower alkyl, or R2 and R3 combined together with the linking nitrogen signify morpholino, piperidino, pyrrolidino or piperazino;
or branched cyclodextrins, in which etherifications or acetals with other sugar molecules are present, especially glucosyl-, diglucosyl- (G2-β-cyclodextrin), maltosyl- or dimaltosyl-cyclodextrin, or N-acetylglucosaminyl-, glucosaminyl-, N-acetylgalactosaminyl- or galactosaminyl-cyclodextrin.

Esters are primarily alkanoyl esters, in particular lower alkanoyl esters, such as acetyl esters of cyclodextrins.
It is also possible to have cyclodextrins in which two or more different said ether and ester groups are present at the same time.
Mixtures of two or more of the said cyclodextrins and/or cyclodextrin derivatives may also exist.
Preference is given in particular to α-, β- or γ-cyclodextrins or the lower alkyl ethers thereof, such as methyl-β-cyclodextrin or in particular 2,6-di-O-methyl-β-cyclodextrin, or in particular the hydroxy lower alkyl ethers thereof, such as 2-hydroxypropyl-α-, 2-hydroxypropyl-β- or 2-hydroxypropyl-γ-cyclodextrin.
The cyclodextrins or cyclodextrin derivatives are added to the culture medium preferably in a concentration of 0.02 to 10, preferably 0.05 to 5, especially 0.1 to 4, for example 0.1 to 2 percent by weight (w/v).
Cyclodextrins or cyclodextrin derivatives are known or may be produced by known processes (see for example US 3,459,731; US 4,383,992; US 4,535,152; US 4,659,696; EP 0 094 157; EP 0 149 197; EP 0 197 571; EP 0 300 526; EP 0 320 032; EP 0 499 322; EP 0 503 710; EP 0 818 469; WO 90/12035; WO 91/11200; WO 93/19061; WO 95/08993; WO 96/14090; GB 2,189,245; DE 3,118,218; DE 3,317,064 and the references mentioned therein, which also refer to the synthesis of cyclodextrins or cyclodextrin derivatives, or also: T. Loftsson and M.E. Brewster (1996): Pharmaceutical Applications of Cyclodextrins: Drug Solubilization and Stabilisation: Journal of Pharmaceutical Science 85 (10): 1017-1025; R.A. Rajewski and V.J. Stella (1996): Pharmaceutical Applications of Cyclodextrins: In Vivo Drug Delivery: Journal of Pharmaceutical Science 85 (11): 1142-1169).
All the cyclodextrin derivatives tested here are obtainable from the company Fluka, Buchs, CH. The tests are carried out in 200 ml agitating flasks with 50 ml culture volume. As controls, flasks with adsorber resin Amberlite XAD-16 (Rohm & Haas, Frankfurt, Germany) and without any adsorber addition are used. After incubation for 5 days, the following epothilone titres can be determined by HPLC:



Apart from Amberlite (% v/v), all percentages are by weight (% w/v).
A few of the cyclodextrins tested (2,6-di-O-methyl-β-cyclodextrin, methyl-β-cyclodextrin) display no effect or a negative effect on epothilone production at the concentrations used. 1-2% 2-hydroxypropyl-β-cyclodextrin and β-cyclodextrin increase epothilone production in the examples by 6 to 8 times compared with production using no cyclodextrins.

C: 10 litre fermentation with 1% 2-(hydroxypropyl)-β-cyclodextrin:
Fermentation is carried out in a 15 litre glass fermenter. The medium contains 10 g/l of 2-(hydroxypropyl)-β-cyclodextrin from Wacker Chemie, Munich, Germany. The progress of fermentation is illustrated in Table 3. Fermentation is ended after 6 days and working up takes place.





G: Working up of the epothilones: Isolation from a 500 litre main culture:
The volume of harvest from the 500 litre main culture of example 2D is 450 litres and is separated using a Westfalia clarifying separator Type SA-20-06 (rpm = 6500) into the liquid phase (centrifugate + rinsing water = 650 litres) and solid phase (cells = ca. 15 kg). The main part of the epothilones is found in the centrifugate. The centrifuged cell pulp contains
500 ml of methanol, the insoluble portions filtered off using a folded filter, and the solution added to a 10 kg Sephadex LH 20 column (Phamnacia, Uppsala, Sweden) (column diameter 20 cm, filling level ca. 1.2 m). Elution is effected with methanol as eluant. Epothilone A and B is present predominantly in fractions 21-23 (at a fraction size of 1 litre). These fractions are concentrated to dryness in a vacuum on a rotary evaporator (total weight 9.0 g). These Sephadex peak fractions (9.0 g) are thereafter dissolved in 92 ml of acetonitrile:-water:-methylene chloride = 50:40:2, the solution filtered through a folded filter and added to a RP column (equipment Prepbar 200, Merck; 2. 0 kg LiChrospher RP-18 Merck, grain size 12µm, column diameter 10 cm, filling level 42 cm; Merck, Darmstadt, Germany). Elution is effected with acetonitrile:water= 3:7 (flow rate = 500 ml/min.; retention time of epothilone A = ca. 51-59 mins.; retention time of epothilone B = ca. 60-69 mins.). Fractionation is monitored with a UV detector at 250 nm. The fractions are concentrated to dryness under vacuum on a Buchi-Rotavapor rotary evaporator. The weight of the epothilone A peak fraction is 700 mg, and according to HPLC (external standard) it has a content of 75.1%. That of the epothilone B peak fraction is 1980 mg, and the content according to HPLC (external standard) is 86.6%. Finally, the epothilone A fraction (700 mg) is crystallised from 5 mi of ethyl acetate:toluene = 2:3, and yields 170 mg of epothilone A pure crystallisate [content according to HLPC (% of area) = 94.3%]. Crystallisation of the epothilone B fraction (1980 mg) is effected from 18 ml of methanol and yields 1440 mg of epothilone B pure crystallisate [content according to HPLC (% of area) = 99.2%]. m.p. (Epothilone B): e.g. 124-125 °C; 'H-NMR data for Epothilone B:
500 MHz-NMR, solvent: DMSO-d6. Chemical shift δ in ppm relative to TMS. s = singlet; d = doublet; m = multiplet
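The purification figures reported above can be cross-checked with a short calculation. The following is a minimal sketch in Python, not part of the described process: it multiplies each reported weight by its reported HPLC content to estimate the amount of epothilone present, and from that the recovery across the crystallisation step. The function and variable names are hypothetical. Note that the peak-fraction contents are given against an external standard while the crystallisate contents are given as % of area, so the resulting recoveries are rough estimates only.

```python
# Weights (mg) and HPLC contents (fractions) copied from the purification example.
# The dictionary layout and all names are illustrative assumptions.
FRACTIONS = {
    "epothilone A": {"peak_mg": 700.0, "peak_content": 0.751,
                     "crystal_mg": 170.0, "crystal_content": 0.943},
    "epothilone B": {"peak_mg": 1980.0, "peak_content": 0.866,
                     "crystal_mg": 1440.0, "crystal_content": 0.992},
}


def contained_mass_mg(weight_mg, content):
    """Estimated mass of epothilone in a fraction: weight times HPLC content."""
    return weight_mg * content


def crystallisation_recovery(entry):
    """Estimated share of the epothilone in the peak fraction that ends up in the crystallisate."""
    before = contained_mass_mg(entry["peak_mg"], entry["peak_content"])
    after = contained_mass_mg(entry["crystal_mg"], entry["crystal_content"])
    return after / before


for name, entry in FRACTIONS.items():
    before = contained_mass_mg(entry["peak_mg"], entry["peak_content"])
    after = contained_mass_mg(entry["crystal_mg"], entry["crystal_content"])
    print(f"{name}: {before:.1f} mg in peak fraction, "
          f"{after:.1f} mg in crystallisate, "
          f"recovery {crystallisation_recovery(entry):.1%}")
```

On these figures the crystallisation step retains roughly 30% of the epothilone A and roughly 83% of the epothilone B present in the respective peak fractions.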



Pharmaceutical preparations or compositions comprising epothilones are used, for example, in the treatment of cancerous diseases, such as various human solid tumors. Such anticancer formulations comprise, for example, an active amount of an epothilone together with one or more organic or inorganic, liquid or solid, pharmaceutically suitable carrier materials. Such formulations are delivered, for example, enterally, nasally, rectally, orally, or parenterally, particularly intramuscularly or intravenously. The dosage of the active ingredient is dependent upon the weight, age, and physical and pharmacokinetic condition of the patient and is further dependent upon the method of delivery. Because epothilones mimic the biological effects of taxol, epothilones may be substituted for taxol in compositions and methods utilizing taxol in the treatment of cancer. See, for example, U.S. Patent Nos. 5,496,804, 5,565,478, and 5,641,803, all of which are incorporated herein by reference.
For example, for treatments, epothilone B is supplied in individual 2 ml glass vials formulated as 1 mg/1 ml of clear, colorless intravenous concentrate. The substance is formulated in polyethylene glycol 300 (PEG 300) and diluted with 50 or 100 ml of 0.9% Sodium Chloride Injection, USP, to achieve the desired final concentration of the drug for infusion. It is administered as a single 30-minute intravenous infusion every 21 days (three-weekly treatment) for six cycles, or as a single 30-minute intravenous infusion every 7 days (weekly treatment).
Preferably, for weekly treatment, the dose is between about 0.1 and about 6 mg/m2, preferably about 0.1 and about 5 mg/m2, more preferably about 0.1 and about 3 mg/m2, even more preferably 0.1 and 1.7 mg/m2, most preferably about 0.3 and about 1 mg/m2; for three-weekly treatment (treatment every three weeks or every third week), the dose is between about 0.3 and about 18 mg/m2, preferably about 0.3 and about 15 mg/m2, more preferably about 0.3 and about 12 mg/m2, even more preferably about 0.3 and about 7.5 mg/m2, still more preferably about 0.3 and about 5 mg/m2, most preferably about 1.0 and about 3.0 mg/m2. This dose is preferably administered to the human by intravenous (i.v.) administration during 2 to 180 min, preferably 2 to 120 min, more preferably during about 5 to about 30 min, most preferably during about 10 to about 30 min, e.g. during about 30 min.
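The relationship between the dose per body surface area, the 1 mg/ml concentrate, and the diluent volume stated above can be illustrated with simple arithmetic. The following Python sketch is illustrative only and is not a dosing procedure taken from this specification. The function name, the dictionary keys, the example body surface area of 1.8 m2, and the assumption that concentrate and diluent volumes simply add are all hypothetical; the unit of the upper weekly bound ("about 6") is assumed to be mg/m2.

```python
# Constants copied from the text above; everything else is an illustrative assumption.
CONCENTRATE_MG_PER_ML = 1.0          # "1 mg/1 ml" intravenous concentrate
ALLOWED_DILUENT_ML = (50.0, 100.0)   # volumes of 0.9% sodium chloride named in the text

# Broadest stated dose ranges in mg/m2 (unit of the weekly upper bound is assumed).
DOSE_RANGE_MG_PER_M2 = {
    "weekly": (0.1, 6.0),
    "three_weekly": (0.3, 18.0),
}


def infusion_plan(schedule, dose_mg_per_m2, bsa_m2, diluent_ml):
    """Convert a dose in mg/m2 into total mg, concentrate volume and final concentration."""
    low, high = DOSE_RANGE_MG_PER_M2[schedule]
    if not low <= dose_mg_per_m2 <= high:
        raise ValueError(f"{dose_mg_per_m2} mg/m2 is outside the {schedule} range {low}-{high}")
    if diluent_ml not in ALLOWED_DILUENT_ML:
        raise ValueError("diluent volume must be 50 or 100 ml")
    dose_mg = dose_mg_per_m2 * bsa_m2
    concentrate_ml = dose_mg / CONCENTRATE_MG_PER_ML
    # Assumes the two volumes are additive, which is an approximation.
    final_mg_per_ml = dose_mg / (diluent_ml + concentrate_ml)
    return {
        "dose_mg": dose_mg,
        "concentrate_ml": concentrate_ml,
        "final_mg_per_ml": final_mg_per_ml,
    }


# Hypothetical example: 2.5 mg/m2 three-weekly, body surface area 1.8 m2, 100 ml diluent.
print(infusion_plan("three_weekly", 2.5, 1.8, 100.0))
```

In this hypothetical case the total dose is about 4.5 mg, drawn from about 4.5 ml of concentrate, which corresponds to more than two of the 2 ml vials.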
While the present invention has been described with reference to specific embodiments thereof, it will be appreciated that numerous variations, modifications, and embodiments are possible, and accordingly, all such variations, modifications and embodiments are to be regarded as being within the spirit and scope of the present invention.

What is claimed is:
1. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one polypeptide involved in the biosynthesis of epothilone.
2. An isolated nucleic acid molecule according to claim 1, wherein said nucleotide sequence is isolated from a myxobacterium.
3. An isolated nucleic acid molecule according to claim 2, wherein said myxobacterium is Sorangium cellulosum.
4. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 1.

5. A recombinant vector comprising a chimeric gene according to claim 4.
6. A recombinant host cell comprising a chimeric gene according to claim 4.
7. The recombinant host cell of claim 6, which is a bacterium.
8. The recombinant host cell of claim 7, which is an Actinomycete.
9. The recombinant host cell of claim 8, which is Streptomyces.

10. A BAC clone comprising a nucleic acid molecule according to claim 1.
11. The BAC clone of claim 10, which is pEPO15.
12. An isolated nucleic acid molecule according to claim 1, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
13. An isolated nucleic acid molecule according to claim 12, wherein said polypeptide comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, amino acids 11-437 of SEQ ID NO:2, amino acids 543-864 of SEQ ID NO:2, amino acids 974-1273 of SEQ ID NO:2, amino acids 1314-1385 of SEQ ID NO:2, SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, amino acids 1344-1351 of SEQ ID NO:3, SEQ ID NO:4, amino acids 7-432 of SEQ ID NO:4, amino acids 539-859 of SEQ ID NO:4, amino acids 869-1037 of SEQ ID NO:4, amino acids 1439-1684 of SEQ ID NO:4, amino acids 1722-1792 of SEQ ID NO:4, SEQ ID NO:5, amino acids 39-457 of SEQ ID NO:5, amino acids 563-884 of SEQ ID NO:5, amino acids 1147-1399 of SEQ ID NO:5, amino acids 1434-1506 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 3886-4048 of SEQ ID NO:5, amino acids 4433-4719 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, SEQ ID NO:6, amino acids 35-454 of SEQ ID NO:6, amino acids 561-881 of SEQ ID NO:6, amino acids 1143-1393 of SEQ ID NO:6, amino acids 1430-1503 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, amino acids 2383-2551 of SEQ ID NO:6, amino acids 2671-3045 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, SEQ ID NO:7, amino acids 32-450 of SEQ ID NO:7, amino acids 556-877 of SEQ ID NO:7, amino acids 887-1051 of SEQ ID NO:7, amino acids 1478-1790 of SEQ ID NO:7, amino acids 1810-2055 of SEQ ID NO:7, amino acids 2093-2164 of SEQ ID NO:7, amino acids 2165-2439 of SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:11, and SEQ ID NO:22.
14. An isolated nucleic acid molecule according to claim 12, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.

15. A nucleic acid molecule according to claim 12, wherein said nucleotide sequence is selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
16. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 12.
17. A recombinant vector comprising a chimeric gene according to claim 16.
18. A recombinant host cell comprising a chimeric gene according to claim 16.
19. The recombinant host cell of claim 18, which is a bacterium.
20. The recombinant host cell of claim 19, which is an Actinomycete.
21. The recombinant host cell of claim 20, which is Streptomyces.
22. An isolated nucleic acid molecule according to claim 1, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: the complement of nucleotides 1900-3171 of SEQ ID NO:1, nucleotides 3415-5556 of SEQ ID NO:1, nucleotides 7610-11875 of SEQ ID NO:1, nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, nucleotides 15901-15924 of SEQ ID NO:1, nucleotides 16251-21749 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 21746-43519 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 43524-54920 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, nucleotides 51534-52657 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, nucleotides 54935-62254 of SEQ ID NO:1, nucleotides 55028-56284 of SEQ ID NO:1, nucleotides 56600-57565 of SEQ ID NO:1, nucleotides 57593-58087 of SEQ ID NO:1, nucleotides 59366-60304 of SEQ ID NO:1, nucleotides 60362-61099 of SEQ ID NO:1, nucleotides 61211-61426 of SEQ ID NO:1, nucleotides 61427-62254 of SEQ ID NO:1, nucleotides 62369-63628 of SEQ ID NO:1, nucleotides 67334-68251 of SEQ ID NO:1, and nucleotides 1-68750 of SEQ ID NO:1.
23. A chimeric gene comprising a heterologous promoter sequence operatively linked to a nucleic acid molecule according to claim 22.
24. A recombinant vector comprising a chimeric gene according to claim 23.
25. A recombinant host cell comprising a chimeric gene according to claim 23.

26. The recombinant host cell of claim 25, which is a bacterium.
27. The recombinant host cell of claim 26, which is an Actinomycete.
28. The recombinant host cell of claim 27, which is Streptomyces.
29. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes at least one epothilone synthase domain.
30. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a β-ketoacyl-synthase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
31. An isolated nucleic acid molecule according to claim 30, wherein said β-ketoacyl-synthase domain comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
32. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.

33. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
34. An isolated nucleic acid molecule according to claim 30, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 7643-8920 of SEQ ID NO:1, nucleotides 16269-17546 of SEQ ID NO:1, nucleotides 21860-23116 of SEQ ID NO:1, nucleotides 26318-27595 of SEQ ID NO:1, nucleotides 30815-32092 of SEQ ID NO:1, nucleotides 37052-38320 of SEQ ID NO:1, nucleotides 43626-44885 of SEQ ID NO:1, nucleotides 48087-49361 of SEQ ID NO:1, and nucleotides 55028-56284 of SEQ ID NO:1.
35. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an acyltransferase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
36. An isolated nucleic acid molecule according to claim 35, wherein said acyltransferase domain comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.

37. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
38. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
39. An isolated nucleic acid molecule according to claim 35, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 9236-10201 of SEQ ID NO:1, nucleotides 17865-18827 of SEQ ID NO:1, nucleotides 23431-24397 of SEQ ID NO:1, nucleotides 27911-28876 of SEQ ID NO:1, nucleotides 32408-33373 of SEQ ID NO:1, nucleotides 38636-39598 of SEQ ID NO:1, nucleotides 45204-46166 of SEQ ID NO:1, nucleotides 49680-50642 of SEQ ID NO:1, and nucleotides 56600-57565 of SEQ ID NO:1.
40. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an enoyl reductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
41. An isolated nucleic acid molecule according to claim 40, wherein said enoyl reductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
42. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
43. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
44. An isolated nucleic acid molecule according to claim 40, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 10529-11428 of SEQ ID NO:1, nucleotides 35042-35902 of SEQ ID NO:1, nucleotides 41369-42256 of SEQ ID NO:1, and nucleotides 59366-60304 of SEQ ID NO:1.
45. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is an acyl carrier protein domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
46. An isolated nucleic acid molecule according to claim 45, wherein said acyl carrier protein domain comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
47. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
48. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
49. An isolated nucleic acid molecule according to claim 45, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 11549-11764 of SEQ ID NO:1, nucleotides 21414-21626 of SEQ ID NO:1, nucleotides 26045-26263 of SEQ ID NO:1, nucleotides 30539-30759 of SEQ ID NO:1, nucleotides 36773-36991 of SEQ ID NO:1, nucleotides 43163-43378 of SEQ ID NO:1, nucleotides 47811-48032 of SEQ ID NO:1, nucleotides 54540-54758 of SEQ ID NO:1, and nucleotides 61211-61426 of SEQ ID NO:1.
50. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a dehydratase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
51. An isolated nucleic acid molecule according to claim 50, wherein said dehydratase domain comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
52. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
53. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
54. An isolated nucleic acid molecule according to claim 50, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 18855-19361 of SEQ ID NO:1, nucleotides 33401-33889 of SEQ ID NO:1, nucleotides 39635-40141 of SEQ ID NO:1, nucleotides 50670-51176 of SEQ ID NO:1, and nucleotides 57593-58087 of SEQ ID NO:1.
55. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a β-ketoreductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
56. An isolated nucleic acid molecule according to claim 55, wherein said β-ketoreductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
57. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
58. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
59. An isolated nucleic acid molecule according to claim 55, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 20565-21302 of SEQ ID NO:1, nucleotides 25184-25942 of SEQ ID NO:1, nucleotides 29678-30429 of SEQ ID NO:1, nucleotides 35930-36667 of SEQ ID NO:1, nucleotides 42314-43048 of SEQ ID NO:1, nucleotides 46950-47702 of SEQ ID NO:1, nucleotides 53697-54431 of SEQ ID NO:1, and nucleotides 60362-61099 of SEQ ID NO:1.
60. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a methyltransferase domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
61. An isolated nucleic acid molecule according to claim 60, wherein said methyltransferase domain comprises amino acids 2671-3045 of SEQ ID NO:6.
62. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence is substantially similar to nucleotides 51534-52657 of SEQ ID NO:1.
63. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of nucleotides 51534-52657 of SEQ ID NO:1.
64. An isolated nucleic acid molecule according to claim 60, wherein said nucleotide sequence is nucleotides 51534-52657 of SEQ ID NO:1.
65. An isolated nucleic acid molecule according to claim 29, wherein said epothilone synthase domain is a thioesterase domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
66. An isolated nucleic acid molecule according to claim 65, wherein said thioesterase domain comprises amino acids 2165-2439 of SEQ ID NO:7.
67. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence is substantially similar to nucleotides 61427-62254 of SEQ ID NO:1.
68. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of nucleotides 61427-62254 of SEQ ID NO:1.

69. An isolated nucleic acid molecule according to claim 65, wherein said nucleotide sequence is nucleotides 61427-62254 of SEQ ID NO:1.
70. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes a non-ribosomal peptide synthetase, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3.
71. An isolated nucleic acid molecule according to claim 70, wherein said non-ribosomal peptide synthetase comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:3, amino acids 72-81 of SEQ ID NO:3, amino acids 118-125 of SEQ ID NO:3, amino acids 199-212 of SEQ ID NO:3, amino acids 353-363 of SEQ ID NO:3, amino acids 549-565 of SEQ ID NO:3, amino acids 588-603 of SEQ ID NO:3, amino acids 669-684 of SEQ ID NO:3, amino acids 815-821 of SEQ ID NO:3, amino acids 868-892 of SEQ ID NO:3, amino acids 903-912 of SEQ ID NO:3, amino acids 918-940 of SEQ ID NO:3, amino acids 1268-1274 of SEQ ID NO:3, amino acids 1285-1297 of SEQ ID NO:3, amino acids 973-1256 of SEQ ID NO:3, and amino acids 1344-1351 of SEQ ID NO:3.
72. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence is substantially similar to a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
73. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence comprises a consecutive 20 base pair nucleotide portion identical in sequence to a consecutive 20 base pair portion of a nucleotide sequence selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
74. An isolated nucleic acid molecule according to claim 70, wherein said nucleotide sequence is selected from the group consisting of: nucleotides 11872-16104 of SEQ ID NO:1, nucleotides 12085-12114 of SEQ ID NO:1, nucleotides 12223-12246 of SEQ ID NO:1, nucleotides 12466-12507 of SEQ ID NO:1, nucleotides 12928-12960 of SEQ ID NO:1, nucleotides 13516-13566 of SEQ ID NO:1, nucleotides 13633-13680 of SEQ ID NO:1, nucleotides 13876-13923 of SEQ ID NO:1, nucleotides 14313-14334 of SEQ ID NO:1, nucleotides 14473-14547 of SEQ ID NO:1, nucleotides 14578-14607 of SEQ ID NO:1, nucleotides 14623-14692 of SEQ ID NO:1, nucleotides 15673-15693 of SEQ ID NO:1, nucleotides 15724-15762 of SEQ ID NO:1, nucleotides 14788-15639 of SEQ ID NO:1, and nucleotides 15901-15924 of SEQ ID NO:1.
75. A method for heterologous expression of epothilone in a recombinant host, comprising:

(a) introducing a chimeric gene according to claim 4 into a host; and
(b) growing the host in conditions that allow biosynthesis of epothilone in the host.

76. A method for producing epothilone, comprising:
(a) expressing epothilone in a recombinant host by the method of claim 75; and
(b) extracting epothilone from the recombinant host.

77. An isolated polypeptide comprising an amino acid sequence that consists of an epothilone synthase domain.
78. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a β-ketoacyl-synthase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
79. An isolated polypeptide according to claim 78, wherein said β-ketoacyl-synthase domain comprises an amino acid sequence selected from the group consisting of: amino acids 11-437 of SEQ ID NO:2, amino acids 7-432 of SEQ ID NO:4, amino acids 39-457 of SEQ ID NO:5, amino acids 1524-1950 of SEQ ID NO:5, amino acids 3024-3449 of SEQ ID NO:5, amino acids 5103-5525 of SEQ ID NO:5, amino acids 35-454 of SEQ ID NO:6, amino acids 1522-1946 of SEQ ID NO:6, and amino acids 32-450 of SEQ ID NO:7.
80. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an acyltransferase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
81. An isolated polypeptide according to claim 80, wherein said acyltransferase domain comprises an amino acid sequence selected from the group consisting of: amino acids 543-864 of SEQ ID NO:2, amino acids 539-859 of SEQ ID NO:4, amino acids 563-884 of SEQ ID NO:5, amino acids 2056-2377 of SEQ ID NO:5, amino acids 3555-3876 of SEQ ID NO:5, amino acids 5631-5951 of SEQ ID NO:5, amino acids 561-881 of SEQ ID NO:6, amino acids 2053-2373 of SEQ ID NO:6, and amino acids 556-877 of SEQ ID NO:7.
82. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an enoyl reductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
83. An isolated polypeptide according to claim 82, wherein said enoyl reductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 974-1273 of SEQ ID NO:2, amino acids 4433-4719 of SEQ ID NO:5, amino acids 6542-6837 of SEQ ID NO:5, and amino acids 1478-1790 of SEQ ID NO:7.
84. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is an acyl carrier protein domain, wherein said polypeptide comprises an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.
85. An isolated polypeptide according to claim 84, wherein said acyl carrier protein domain comprises an amino acid sequence selected from the group consisting of: amino acids 1314-1385 of SEQ ID NO:2, amino acids 1722-1792 of SEQ ID NO:4, amino acids 1434-1506 of SEQ ID NO:5, amino acids 2932-3005 of SEQ ID NO:5, amino acids 5010-5082 of SEQ ID NO:5, amino acids 7140-7211 of SEQ ID NO:5, amino acids 1430-1503 of SEQ ID NO:6, amino acids 3673-3745 of SEQ ID NO:6, and amino acids 2093-2164 of SEQ ID NO:7.

86. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a dehydratase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
87. An isolated polypeptide according to claim 86, wherein said dehydratase domain comprises an amino acid sequence selected from the group consisting of: amino acids 869-1037 of SEQ ID NO:4, amino acids 3886-4048 of SEQ ID NO:5, amino acids 5964-6132 of SEQ ID NO:5, amino acids 2383-2551 of SEQ ID NO:6, and amino acids 887-1051 of SEQ ID NO:7.
88. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a β-ketoreductase domain comprising an amino acid sequence substantially similar to an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
89. An isolated polypeptide according to claim 88, wherein said β-ketoreductase domain comprises an amino acid sequence selected from the group consisting of: amino acids 1439-1684 of SEQ ID NO:4, amino acids 1147-1399 of SEQ ID NO:5, amino acids 2645-2895 of SEQ ID NO:5, amino acids 4729-4974 of SEQ ID NO:5, amino acids 6857-7101 of SEQ ID NO:5, amino acids 1143-1393 of SEQ ID NO:6, amino acids 3392-3636 of SEQ ID NO:6, and amino acids 1810-2055 of SEQ ID NO:7.
90. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a methyltransferase domain comprising an amino acid sequence substantially similar to amino acids 2671-3045 of SEQ ID NO:6.
91. An isolated polypeptide according to claim 90, wherein said methyltransferase domain comprises amino acids 2671-3045 of SEQ ID NO:6.

92. An isolated polypeptide according to claim 77, wherein said epothilone synthase domain is a thioesterase domain comprising an amino acid sequence substantially similar to amino acids 2165-2439 of SEQ ID NO:7.
93. An isolated polypeptide according to claim 77, wherein said thioesterase domain comprises amino acids 2165-2439 of SEQ ID NO:7.

94. An isolated nucleic acid molecule substantially as hereinbefore described.
95. A recombinant vector substantially as hereinbefore described.
96. A chimeric gene substantially as hereinbefore described.
97. A BAC clone substantially as hereinbefore described.
98. An isolated polypeptide substantially as hereinbefore described.


Patent Number 200717
Indian Patent Application Number IN/PCT/2000/830/CHE
PG Journal Number 8/2007
Publication Date 23-Feb-2007
Grant Date 02-Jun-2006
Date of Filing 15-Dec-2000
Name of Patentee NOVARTIS AG
Applicant Address Schwarzwaldallee 215, CH-4058 Basel
Inventors:
# Inventor's Name Inventor's Address
1 NA NA
PCT International Classification Number C07K14/195
PCT International Application Number PCT/EP1999/004171
PCT International Filing date 1999-06-16
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 09/099,504 1998-06-18 U.S.A.