Title of Invention	A METHOD FOR ISOLATING HIGH MOLECULAR WEIGHT DNA FROM A SAMPLE
Abstract	Methods and compositions are disclosed for isolation and propagation of high molecular weight DNA. Large segments of DNA isolated from samples obtained directly from natural sources such as soil, undersea core samples, fresh and salt water, air, and other sources may be used to examine entire gene clusters and to isolate expression products of previously unknown and uncharacterized or even extinct organisms. Isolation and manipulation of larger DNA segments than has been previously possible is taught herein.

Full Text	Cross Reference to Related Applications

This application claims the benefit of U.S. Provisional Application No. 60/137,065, filed June 2, 1999, and U.S. Provisional Application No. 60/191,601, filed March 23, 2000.

Background of the Invention

Field of the Invention

The present invention relates to novel methods for the isolation and cloning of high molecular weight DNA collected from a variety of natural sources and a DNA library produced therefrom. More particularly, the invention relates to the isolation of DNA from a plurality of species collected from natural sources to produce a library of high molecular weight DNA fragments generated from those organisms, either singly or in recombination with DNA fragments from other species.

Description of the Related Art

A great diversity of microorganisms exists in the natural environment. Since the first discovery of the medicinal benefits (as well as other applications) of various naturally produced chemical products ("natural products") produced by the Earth's biota, interest in the discovery and exploitation of these natural products has increased. For example, the majority of known antimicrobial products are, in fact, secondary metabolites isolated from soil microorganisms. Approximately three-quarters of all known bacterially-derived natural products come from soil actinomycetes, especially streptomycetes (Vining, 1990). Screening for medically useful natural products has primarily yielded antibacterial agents. Bacterial and fungal antibiotics represent a multi-billion dollar market, the third largest pharmaceutical market worldwide. Natural products possessing other biological activities of human benefit have been discovered as well. These include anticoccidial agents, anti-fungal drugs, herbicidal agents, anticancer drugs, insecticidal and nematocidal agents, immunomodulating compounds, and enzyme inhibitors.
In addition, microorganisms produce a variety of lipopeptides, lipoproteins, glycolipids, and lipopolysaccharides with surface-acting properties (Rosenberg, 1986). Among the industrially important enzymes are cellulases, amylases, proteases, and lipases used extensively in textile applications. Other microbial enzymes are important in the biotechnology industry (i.e., restriction enzymes and thermostable enzymes). Representatives of all of these natural product categories are in widespread use, indicating very large commercial markets, with tremendous potential for expansion. The screening of natural products from sources such as terrestrial bacteria, fungi, invertebrates and plants has resulted in the discovery of many important drugs. Tens of thousands of these natural products are biologically active, with at least 100 currently in use as antibiotics, agrochemicals and anti-cancer agents (Franco et al., 1991; Goodfellow et al., 1989; Berdy, 1982; Suffness et al., 1988). Although many important microorganisms exist in nature, the success of screening for new natural products of interest is directly related to the number of unique source organisms that provide the compounds tested in the screening process. Pharmaceutical companies may typically screen compound libraries containing hundreds of thousands of natural and synthetic products. The number of novel compounds contained in these libraries has plateaued with time, however. The shortage of new natural products is primarily due to the inability of practitioners to discover and to analyze novel organisms. For example, it is thought that only a small number (probably 0.1%) of the microorganisms in soils can be isolated and cultured by known methods (Bintrim et al., 1997). In addition, it has been estimated that only about five thousand plant species have been studied for possible medical use, a fraction of the 250,000-3,000,000 estimated plant species (Abelson, 1990).
Millions of species of marine microorganisms have been estimated, yet only a small number have been characterized. Recently, and only by happenstance, a new species of sulfur bacterium was discovered in sediment cores taken over 100 meters below the waters off the coast of Namibia. Its discovery was due only to visual detection (i.e., it is the largest known bacterium; Schulz et al., 1999). Discovery of such elusive microorganisms is important, not only for the progress of science generally, but also for development of novel natural products of medicinal and industrial value. This undiscovered biodiversity represents an as yet untapped resource of novel compounds. Technology is needed that would allow efficient, precise, and systematic detection and analysis of the genetic diversity of nature. Although the Earth's biodiversity is enormous, most of the Earth's species remain unknown and undetected. Discovery of the vast array of unknown species is difficult, especially for microorganisms, which traditionally require culturing specimens to obtain a sufficient sample for analysis (Pace, 1997). Physical, nutritional, even biological (e.g., commensalism or symbiosis) requirements can make laboratory cultures impractical, if not impossible. Even if a potentially valuable natural product is found, further analysis and commercial production of the compound may be prohibited due to the inability to obtain additional samples, or samples in sufficient amounts. Modern molecular biology permits the analysis of an organism's biochemical nature from its genetic constitution. Even with modern analytical tools, which may require only minute samples of the organism's genome for investigation, analysis is difficult due to the inability to obtain a sufficient amount of high quality (i.e., high molecular weight) genomic DNA (gDNA).
There remains a general need for methods of cataloguing, analyzing, and potentially utilizing the genetic diversity of the Earth's unknown biological reserves. Recovery of genomic DNA from a natural sample may be accomplished by the isolation of organisms from a natural sample, culturing the organisms, and extracting gDNA from the culture. This method has a number of disadvantages, however. The process is time-consuming and requires a large amount of practitioner manipulation and person-hours. In addition, the process requires prior knowledge of the organism's physical, chemical, and biological requirements for successful culturing. Alternatively, the practitioner may select arbitrary culture conditions in an effort to culture whatever may grow under the selected conditions. As mentioned earlier, this is a significant limitation, since it is believed that the vast majority of the Earth's biota remains uncharacterized and unknown. Recovery of genomic DNA from a natural sample may also be accomplished by DNA isolation directly from the collected natural sample. In this method, the organisms of the sample are not cultured or otherwise isolated from each other or their natural environment. This direct DNA extraction method lacks robustness in regard to the quality and quantity of the extracted gDNA, however. The quantity of gDNA isolated by conventional direct extraction methods is small, requiring either large amounts of initial sample or repeat sampling. Obtaining large amounts of a natural sample may be impossible or impractical, especially if the sample is taken from remote locales or environmentally sensitive habitats. In addition, laboratory manipulation of large amounts of sample during the DNA extraction protocol is impractical. Repeat sampling suffers from these same limitations. In addition, repeat sampling does not guarantee successful repeated isolation of the same DNA fragments. The quality of gDNA isolated by direct extraction methods is also limited.
Despite years of scientific effort, the extraction of high molecular weight gDNA (i.e., >50 kilobase pairs) directly from natural samples has not been demonstrated. The isolation of large gDNA fragments is necessary, however, for proper forensic or phylogenetic analysis, as well as for the discovery of larger polypeptides or compounds produced from the biological activity of a plurality of polypeptides (e.g., a biochemical pathway). Because of the limitations of current methods of DNA isolation from natural samples, there is an overwhelming need in the art for a sensitive and efficient process for isolating DNA from natural samples. A method is needed that would allow DNA extraction directly from a variety of natural samples, without the need to culture the organisms of the sample prior to DNA extraction. Preferably, this method would allow direct DNA extraction without requiring large amounts of initial sample, or repeat sampling. Ideally, this improved method would be able to isolate high molecular weight DNA from a sample, and in a form and condition whereby the high molecular weight DNA can be retained, sequenced, and expressed for further research and development.

Summary of the Invention

The present invention is directed to a novel method for recovering high molecular weight DNA (hmwDNA) from a natural sample. Preferably the sample will contain a plurality of species. The present invention offers several advantages and novel features over DNA extraction techniques known in the art. The present invention provides the advantage of improved sensitivity, thus not requiring a large amount of initial sample collected from the natural environment. The invention employs equipment and reagents known and used by persons of ordinary skill in the art, and does not require the use of expensive, cumbersome, or otherwise exotic equipment or reagents. The method of the present invention requires less sample manipulation than current techniques known in the art.
The method allows for DNA extraction directly from a natural sample, without the need to culture organisms contained within the sample, or any other pre-treatment of the sample before the process of DNA extraction. This invention, therefore, reduces the time and expense of additional reagents, equipment, incubation time, and practitioner manipulation. The method of the present invention provides a more precise representation of the total genetic diversity contained within a given sample than conventional DNA isolation techniques known in the art. Because DNA extraction may commence immediately after sample acquisition, there is less opportunity for the degradation of some genetic material (e.g., due to organism die-off, DNA hydrolysis, etc.) and the amplification of other genetic material (e.g., organism proliferation). The improved techniques of the present invention require no prior understanding or knowledge of the biological requirements (or even the existence) of the organisms in the sample to be processed. (See, for example, Torsvik et al., 1990 and 1994.) The present invention provides for the extraction of hmwDNA from a natural sample. DNA isolated by the present invention can range in size from 50,000 to 400,000 base pairs (50-400 kbp). As a result, the present invention can produce single DNA fragments equivalent in length to one tenth of a typical bacterial genome. Extraction of DNA fragments of this magnitude from a natural sample is not known in the art. The isolation of high quality DNA is critical for phylogenetic or forensic analyses. hmwDNA is essential for the discovery of larger polypeptides, or polypeptides encoded by gDNA that contain noncoding regions (e.g., introns). 
The isolation of hmwDNA is also necessary for the discovery of polypeptides formed from two or more heterogeneous polypeptide subunits, or other gene clusters and their products; for example, compounds produced as a result of a biochemical pathway requiring two or more polypeptides (either product may be the result of polypeptides encoded by polycistronic DNA fragments; see, for example, Malpartida et al., 1984; Murdock et al., 1993). The method of the present invention provides for the isolation from a natural sample of hmwDNA fragments suitable for incorporation into a genetic vector, transgenic incorporation into a host organism, and subsequent expression of the DNA insert. It is therefore another object of the invention to provide a library of hmwDNA isolated from a natural sample. The library of the present invention provides for the unlimited storage of the DNA inserts and the genetic information contained therein, and eliminates the need or necessity to obtain additional samples from the natural environment. The library of the present invention can be utilized in the analysis and screening of the genetic information, or the expressed polypeptides, for a wide variety of research and development applications (e.g., phylogenetic analyses and drug discovery programs as described earlier). The method of the present invention essentially comprises: preparing an aqueous suspension of a natural sample, gently emulsifying the suspension with an organic solvent, and precipitating the DNA from solution. Preferably, the extraction process comprises additional steps, before the final DNA precipitation, including washing the sample solution with a cationic detergent, and additional organic solvent separations. Most preferably the extraction process culminates with a gradient separation step, to separate the isolated DNA on the basis of molecular weight.
It is critical to the invention that the steps of the process are carried out gently, to minimize or prevent shearing of the sample DNA. This invention provides a method for isolating high molecular weight DNA from a plurality of species in a sample, including environmental samples such as soil samples or other samples of material from nature. The method involves suspending a portion of the sample in an aqueous medium to prepare an aqueous suspension of the sample. Then, an extraction mixture is formed by adding to the aqueous suspension an appropriate organic solvent under suitable conditions to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA. Suitable solvents include phenol, or other organic solvents capable of dissolving undesired materials such as proteins, lipids, etc. Preferably the phenol or other organic solvent is sufficiently warm, usually >50°C in the case of phenol, and is added under suitable conditions (preferably gentle mixing) to dissolve the undesired materials. By "gentle" we mean conditions sufficiently vigorous for dissolution and removal of the undesired materials from the aqueous phase which contains the high molecular weight DNA, yet gentle enough so that at least part, and preferably most, of the high molecular weight DNA remains as such in the aqueous phase. The extraction mixture is then separated into an aqueous phase and an organic phase. The aqueous phase contains the high molecular weight DNA, which may then be precipitated from the aqueous solution using conventional methods and materials, e.g., addition of a cosolvent such as an alcohol, generally ethanol or isopropanol. If desired, a cationic detergent may be added to the separated aqueous phase to form an aqueous mixture. A preferred cationic detergent for this use is cetyltrimethylammonium bromide (CTAB).
The aqueous mixture may then be gently extracted with an organic solvent, which may be the same or different from the solvent used in the previous extraction step. Often the organic solvent used to remove the detergent is chloroform, but phenol or any other suitable solvent may be used. The invention optionally further comprises passing the extracted DNA over a density gradient to separate the hmwDNA from smaller DNA fragments. Prior to putting the DNA on the gradient, the solution containing the DNA must be concentrated, either by reducing the volume of the solution or by precipitating the DNA from solution and resuspending it in a smaller volume. If desired, the practitioner may further remove contaminants from the hmwDNA by adding a proteinase to the DNA in a sufficient amount and under appropriate conditions permitting degradation of proteins. The present invention also provides for a genetic construct comprising hmwDNA isolated from a natural sample incorporated into a cloning vector. Preferably, the DNA insert-vector construct is stable and capable of replication of the DNA insert. Most preferably the genetic construct is capable of expressing a polypeptide encoded by the hmwDNA insert. The present invention further provides for a transgenic host cell comprising the incorporation of hmwDNA isolated from a natural sample into a living host cell. Preferably the hmwDNA is stably incorporated into a cloning vector. Alternatively, the hmwDNA is stably incorporated into a host cell chromosome. Most preferably, the host cell is capable of expressing one or more polypeptide(s) encoded by the hmwDNA insert. 
Accordingly, the present invention provides a method for isolating high molecular weight DNA from a sample, said method comprising the steps consisting essentially of: (a) preparing an aqueous suspension of said sample wherein said sample contains high molecular weight DNA having a size of at least 50 kbp; (b) forming an extraction mixture by combining the aqueous suspension and an organic solvent under nonturbulent rotational mixing to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA; (c) separating the extraction mixture into an aqueous phase and an organic phase under nonturbulent rotational mixing, wherein the aqueous phase contains DNA; (d) precipitating the DNA from the aqueous phase of step (c) under nonturbulent rotational mixing by addition of an alcohol selected from the group consisting of isopropanol and ethanol; (e) resuspending the DNA from step (d) in an aqueous solution and gently mixing the aqueous solution with a cationic detergent to form an aqueous mixture, wherein the DNA remains in solution; (f) gently extracting the aqueous mixture with an organic solvent; (g) separating the extraction mixture of step (f) into an aqueous phase and an organic phase; (h) precipitating the DNA from the aqueous phase of step (g) under nonturbulent rotational mixing by addition of an alcohol selected from the group consisting of isopropanol and ethanol; and (i) recovering high molecular weight DNA from the precipitate of step (h) having a size of at least 50 kbp. With reference to the accompanying drawings, in which Fig. 1 is a photograph of an ethidium bromide (EtdBr)-stained agarose gel, depicting DNA isolated from a natural soil sample. Lane 1 is a nucleic acid ladder, used as a gel reference marker. Lane 2 shows total genomic DNA extracted from a natural soil sample prior to density gradient separation. Lane 3 represents a soil hmwDNA fraction separated by sucrose gradient centrifugation.
Lanes 4 and 5 show restriction digests of soil hmwDNA (prior to vector ligation). Lane 4 is soil hmwDNA cut with EcoRI, and lane 5 is soil hmwDNA cut with HindIII. Fig. 2 is a diagram of pBTP2 used for the construction of the soil library. A. Vector pBTP2 is a modification of pBeloBAC11, containing additional cloning sites and a pUC origin of replication inserted into the polylinker (allowing for high-copy replication of the empty vector to facilitate purification). The pUC sequence is removed by two sequential gel purification steps before ligation with insert DNA. B. Cloning site of pBTP2. Uppercase letters indicate pBeloBAC sequences, lowercase letters indicate pUC19 sequences, and bold sequences identify the pBacTA polylinker. Fig. 3 is a phylogenetic tree construct based on small subunit (16S) rRNA sequence homology of DNA obtained from soil, and constructed using a Phylogenetic Inference algorithm (PHYLIP™, version 3.57, J. Felsenstein, U. of Washington, Seattle). Incorporated into the diagram are known bacterial representatives of divergent taxonomic families, including several from published reports of 16S subunit sequence analysis of DNA obtained directly from soil. Fig. 4 is an ORF map of clone MG1.1. Sequence was determined as described in the methods. ORFs were identified using MapDraw (DNAStar Inc.). Homology search was done using BLAST (Basic Local Alignment Search Tool, http://www.ncbi.nlm.nih.gov/BLAST). Fig. 5 illustrates the one-dimensional proton NMR spectrum of red compound (A) compared to a standard sample of indirubin (B) (Sigma). Both spectra are recorded in d6-DMSO solvent at 27°C. The peaks are assigned and labeled according to the numbering scheme shown in the insert. Additional low-intensity peaks in the test sample are due to sample impurities. Proton chemical shifts are referenced to TMS as standard at 0.0 ppm.
Panel C illustrates the structure, determined by NMR, of a colorless compound isolated from MG1.1 with anti-bacterial activity (2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine).

Detailed Description of the Invention

The present invention provides a novel method for recovering hmwDNA from a natural sample. Because DNA is a polymer of nucleotide bases, DNA molecular weight is directly proportional to polymer size. Therefore, as used herein, the term high molecular weight DNA (hmwDNA) refers to DNA comprising a polymer of nucleotides at least about 50,000 base pairs in length (≥ 50 kbp). Preferably hmwDNA ranges from about 50 kbp to about 400 kbp. More preferably the DNA length is about 80 kbp to about 300 kbp. As used herein, a natural sample refers to any sample taken from the natural environment. The natural environment is meant to encompass the biosphere, or any environment wherein genetic material from an organism may be found. Preferably, the natural sample will contain genetic material from a plurality of species. Natural samples include but are not limited to samples taken from any soil (encompassing all of the soil types and depths), water (encompassing all freshwater aquatic, or marine habitats), or atmospheric environment. Sampling techniques are well known in the art (see for example Colwell, 1979; Fenical and Jenson, 1992; Giovanni et al., 1990; Griffiths et al., 1996; Stahl et al., 1985; Suzuki et al., 1997; Torsvik et al., 1994; Ward, 1990). Because of the complexity and interdependencies of many species-species interactions, a natural sample may also comprise a sample (apparently) taken from a single organism (such as plant or animal samples that may contain genetic material from more than one species due to infestation or symbiosis; see for example Currie et al., 1999).
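The size ranges given in the definition of hmwDNA can be illustrated with a short Python sketch. This is only an illustration: the function name and the category labels are hypothetical, and the text's "about 50 kbp", "about 80 kbp", and similar bounds are treated here as hard cutoffs, which is an assumption made for simplicity.

```python
def classify_fragment(length_bp: int) -> str:
    """Classify a DNA fragment against the size ranges quoted in the text.

    Assumption: the "about" bounds in the text are treated as exact cutoffs.
    """
    kbp = length_bp / 1000
    if kbp < 50:
        return "below hmwDNA threshold"
    if 80 <= kbp <= 300:
        return "hmwDNA, more preferred range (80-300 kbp)"
    if kbp <= 400:
        return "hmwDNA, preferred range (50-400 kbp)"
    return "hmwDNA, above preferred range"


if __name__ == "__main__":
    for size in (20_000, 60_000, 150_000, 350_000, 500_000):
        print(f"{size:>7} bp: {classify_fragment(size)}")
```

A fragment of 150,000 bp, for instance, falls in the more preferred range, while one of 60,000 bp qualifies as hmwDNA but lies outside it.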
The genetic diversity contained within a natural sample will depend upon the species diversity of the sample and may vary depending upon when the sample is taken as well as where it is taken. Circannual, seasonal, and even circadian changes in the biodiversity of a natural habitat will affect the diversity of genetic material of the present invention isolated from a sample. Preferably, a natural sample contains a multitude of species, thus increasing the total genetic diversity contained within the sample. Most preferably, the sample contains a multitude of previously uncharacterized and unknown species. As used herein, species refers to any taxonomic grouping of genetically distinct individuals. Independent of any ongoing taxonomic debate, viruses are included in the definition of species, as used herein. Species, and individuals of a species, need not be living organisms, but need merely possess genetic material in the form of nucleic acid. A DNA library, as used herein, refers to a compilation of genetic constructs, each comprising a DNA fragment stably inserted into a genetic vector. Preferably, the DNA insert-vector construct is capable of replication within a host organism. A DNA expression library refers to a DNA library wherein the DNA fragment is operably inserted into an expression vector, such that the DNA fragment is capable of being transcribed and translated into a polypeptide.
As used herein, the following abbreviations will apply: gDNA (genomic DNA); hmwDNA (high molecular weight DNA); BAC (bacterial artificial chromosome); YAC (yeast artificial chromosome); bp (base pairs); kbp (kilobase pairs); s (seconds); min (minutes); hrs (hours); rpm (revolutions per minute); RT (room temperature); °C (degrees Centigrade); eq (equivalents); M (Molar); mM (millimolar); µM (micromolar); N (Normal); mol (moles); mmol (millimoles); µmol (micromoles); nmol (nanomoles); kg (kilograms); gm (grams); mg (milligrams); µg (micrograms); ng (nanograms); l (liters); ml (milliliters); µl (microliters); vol (volumes); SDS (sodium dodecyl sulfate); EDTA (ethylenediaminetetraacetic acid); TE (Tris-EDTA); and CTAB (cetyltrimethylammonium bromide).

High Molecular Weight DNA Isolation from a Natural Sample

Procedures for isolating DNA from laboratory cell cultures, as well as stably inserting exogenous DNA into genetic vectors and host cells, are well known in the art. See, for example: Ausubel et al., Current Protocols in Molecular Biology (1988) Greene Publish. Assoc. & Wiley Interscience; Old, R.W. & S.B. Primrose, Principles of Gene Manipulation: An Introduction To Genetic Engineering (3d Ed. 1985) Blackwell Scientific Publications, Boston, Studies in Microbiology, V.2: 409 pp.; Sambrook, J., et al., eds., Molecular Cloning: A Laboratory Manual (2d Ed. 1989) Cold Spring Harbor Laboratory Press, NY, Vols. 1-3; and Winnacker, E.L., From Genes To Clones: Introduction To Gene Technology (1987) VCH Publishers, NY (translated by Horst Ibelgaufts), 634 pp. These publications are incorporated herein by reference in their entirety. DNA extraction procedures typically involve cell lysis and digestion with a combination of proteolytic enzymes and non-ionic or anionic detergents, such as SDS. The DNA is isolated from the digest with a phenol/chloroform(/isoamyl alcohol) separation treatment, to remove most of the hydrolyzed products.
The DNA is then precipitated out of solution by the addition of alcohol. The unique process of DNA extraction of the present invention, whereby hmwDNA can be isolated from a natural sample, comprises: preparing an aqueous suspension of the natural sample; gently emulsifying the suspension with an organic solvent; gently separating the aqueous, DNA-containing phase from the organic phase; and precipitating the DNA from the aqueous solution. Preferably the isolation process comprises additional steps of gently resuspending the DNA and mixing the solution with a cationic detergent; re-emulsifying the solution with an organic solvent; separating the DNA-containing aqueous solution from the organic phase; reprecipitating the DNA; resuspending the DNA; and passing the suspension through a gradient, to separate out the hmwDNA. To obtain the successful isolation of hmwDNA from the sample, it is critical that each step of reagent addition, suspension, mixing, and separation be performed gently (with deliberate care), to reduce or to avoid physical shearing of the sample DNA. Preferably, all additions and separations to and from the sample are performed by a gentle means (e.g., pouring, or pipetting with a wide bore pipette tip). All mixing is preferably accomplished by a gentle means, whereby intermixing of solutions is accomplished with a minimal amount of solution turbulence (e.g., rocking, rolling, or rotating the solution mixture). Most preferably mixing comprises rotation of the sample. The separation of suspension and emulsion phases is preferably accomplished by centrifugation at a speed, and for a duration, sufficient to separate the phases. Preferably, centrifugation is performed at least about 4000 rpm for at least about 10 min.
To maximize the genetic diversity ultimately extracted from a natural sample, it is preferable to prepare an aqueous suspension of the sample, and begin emulsion with the organic solvent soon after the sample has been taken from its source. The organic solvent of choice is phenol, and it is preferably warmed above normal room temperature; preferably to a temperature of at least about 35°C, and most preferably at least about 65°C. As noted earlier, emulsion must be done gently, and should be done for a period of time sufficient to allow deproteinization. Preferably emulsion is performed for at least about 30 min. Although it is preferred that the phenol be warmed to a temperature above normal room temperature, the emulsion may be performed at room temperature. Precipitation of DNA from an aqueous solution is best performed by the addition of alcohol to the solution in a sufficient amount, followed by gentle mixing for a time sufficient to allow for complete precipitation of DNA. Preferably, the DNA precipitation step is accomplished by the addition of isopropanol, more preferably 0.7 vol isopropanol, and the solution rotated for at least about 30 min at RT. As additional, preferred optional steps in the process, precipitated DNA (the "DNA pellet" after centrifugation) may be washed one or more times with 70% ethanol. As a further optional step, the practitioner may choose to treat the resuspended DNA, precipitated after the first organic solvent extraction, with a digestive enzyme. Many proteolytic enzymes are known in the art (e.g., Proteinase K). A further modified process may also include sample treatment with an anionic detergent, such as SDS. These treatments are well known in the art, but are not an essential feature of the present invention, and are advantageously omitted to reduce the amount of sample handling. After the critical organic extraction and DNA precipitation, the DNA pellet is preferably resuspended into solution and treated with a cationic detergent.
Cationic detergents have been shown not only to precipitate nucleic acids, but also to treat the biological sources of nucleic acid, lysing cells, and solubilizing contaminating lipids and proteins (Schneider, 1997). Commercially useful detergents are known in the art, and include cetrimonium compounds (such as cetyl pyridinium bromide and cetyltrimethylammonium bromide), and benzalkonium compounds (such as alkylbenzyldimethylammonium chlorides). Preferably the cationic detergent comprises cetyltrimethylammonium bromide (CTAB). The DNA solution is mixed with the cationic detergent for a time sufficient to allow separation of the nucleic acids from contaminating components. Preferably, mixing comprises rotating the mixture for at least about 5 min. More preferably, the mixture is incubated at a temperature of at least about 65°C after an initial 5 min rotation, followed by an additional 5 min rotation. After treatment with a cationic detergent, the mixture is preferably treated to a second emulsion with an organic solvent (preferably an organic solvent different from the organic solvent used for the first critical separation) to separate the nucleic acids from the detergent, other solvent residues, and remaining sample contaminants. The solvent is preferably chloroform, and treatment preferably comprises gentle mixing for at least about 30 min. Final emulsion separation of the DNA-containing aqueous phase from the organic phase preferably comprises centrifugation of the emulsion at least about 10,000 rpm for at least about 10 min. After final DNA precipitation, preferably using 0.7 vol isopropanol and more preferably followed by repeated ethanol washes (as described above) until the supernatant is clear, the DNA pellet may be resuspended into solution and prepared for gradient separation of the DNA on the basis of polymer size.
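The preferred minimum conditions stated above for the extraction steps can be collected into a small lookup table, shown here as a Python sketch. It is offered only as a bookkeeping aid under stated assumptions: the step names, dictionary keys, and function names are hypothetical, the "at least about" values are treated as exact minimums, and the isopropanol helper assumes that "0.7 vol" is measured relative to the volume of the aqueous phase.

```python
# Preferred minimums quoted in the text. Key and step names are hypothetical.
PREFERRED_MINIMUMS = {
    "phenol_emulsion": {"temp_c": 35, "minutes": 30},
    "phase_separation_centrifugation": {"rpm": 4000, "minutes": 10},
    "isopropanol_rotation": {"minutes": 30},
    "ctab_rotation": {"minutes": 5},
    "chloroform_extraction": {"minutes": 30},
    "final_centrifugation": {"rpm": 10000, "minutes": 10},
}

ISOPROPANOL_RATIO = 0.7  # "0.7 vol isopropanol"


def isopropanol_volume_ml(aqueous_volume_ml: float) -> float:
    """Volume of isopropanol for a given aqueous phase volume.

    Assumption: "0.7 vol" means 0.7 times the aqueous phase volume.
    """
    if aqueous_volume_ml <= 0:
        raise ValueError("aqueous volume must be positive")
    return aqueous_volume_ml * ISOPROPANOL_RATIO


def shortfalls(step: str, **actual: float) -> list[str]:
    """List every parameter of a step that is missing or below its minimum."""
    problems = []
    for name, minimum in PREFERRED_MINIMUMS[step].items():
        value = actual.get(name)
        if value is None:
            problems.append(f"{name}: not recorded (minimum {minimum})")
        elif value < minimum:
            problems.append(f"{name}: {value} < {minimum}")
    return problems


if __name__ == "__main__":
    print(isopropanol_volume_ml(10.0))
    print(shortfalls("final_centrifugation", rpm=8000, minutes=10))
```

Note that the text treats warming the phenol as a preference rather than a requirement (the emulsion may also be performed at room temperature), so a reported temperature shortfall for that step indicates a departure from the preferred conditions, not a failed extraction.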
Gradient separation may be accomplished using any of a variety of techniques well known in the art (e.g., electrophoresis, chromatography, density gradient separation). Preferably the gradient separation comprises a density gradient (such as cesium chloride gradients or sucrose gradients known in the art), and most preferably comprises passing the DNA solution through a sucrose gradient. In this preferred embodiment hmwDNA is isolated, and can be removed from the gradient by fractionation or direct extraction. Construction of a High Molecular Weight DNA Library According to the present invention, DNA libraries comprising hmwDNA extracted from natural samples will greatly facilitate analysis of the genetic diversity of the natural environment for both academic and commercial purposes. DNA libraries derived from natural samples provide an invaluable tool for research and development into novel biochemicals useful in a variety of applications (e.g., medical, industrial, commercial) as discussed earlier. It is an object of the present invention to provide a library of hmwDNA fragments isolated from a natural sample. Having isolated hmwDNA fragments as described above, construction of a DNA library is well known in the art (see the references previously cited and incorporated). The extracted DNA fragments are inserted into a cloning vector (including but not limited to expression vectors) of choice. A variety of well known techniques are available to the practitioner for the successful incorporation of a DNA fragment into a vector (including, but not limited to, blunt end ligation, linker ligation, homopolymeric tailing, restriction digestion). Preferably, the DNA insert-vector construct is stable and capable of replication of the DNA insert, to provide for long-term storage and amplification of the isolated genetic information. The cloning vector may possess any of a number of characteristics useful for genetic engineering. 
If the vector of choice is an expression vector (i.e., a genetic construct wherein the DNA fragment is operably incorporated into a vector to allow transcription and translation of the DNA insert), expression regulatory regions (e.g., promoter regions and start codons) may be provided by the vector DNA, by the insert DNA, or separately inserted by the practitioner. It is a requirement of the present invention that the vector is capable of incorporating hmwDNA. A variety of genetic vectors may be used including, but not limited to, plasmids, cosmids, phagemids, modified viruses, shuttle vectors, and artificial chromosomes (e.g., YACs and BACs) (see Figure 2). Vectors may include other features useful in the manipulation and analysis of the DNA insert, including, but not limited to, DNA linkers, restriction nuclease sites, high copy origins of replication, insertion sequences, and indicator and/or selectable markers, the use of which is well known in the art. Vectors used in the present invention may be constructed by the practitioner, using techniques well known in the art, or may be commercially purchased (e.g., from Boehringer Mannheim Corp., Indianapolis, IN; Life Technologies, Inc., Rockville, MD; New England Biolabs, Inc., Beverly, MA; Pharmacia LKB Biotechnology, Inc., Piscataway, NJ; Stratagene, La Jolla, CA). Preferably the vector libraries of the present invention can be stably introduced and maintained in a host organism for the purposes of DNA insert replication, and more preferably, DNA insert expression. The host organism may be any cell type from any living system; these include species from Eubacteria, Archaebacteria, Protista, Plantae, Fungi, and Animalia.
Recombinant host cell systems from each of these taxonomic kingdoms, and a multiplicity of techniques for the incorporation of foreign DNA into different cell types, are well known in the art, including, but not limited to, biolistic transfer, conjugation, electroporation, infection, liposome-mediated transfer, microinjection, protoplast fusion, transfection, and transformation. In one embodiment of the present invention, hmwDNA extracted from a natural sample is incorporated into an expression vector within a host cell capable of expressing one or more polypeptide(s) encoded by the hmwDNA insert. The multiplicity of novel polypeptides, and/or their biochemical products, produced by the expression library can then be screened for a chemical property or activity of interest. Standard protocols exist, and are readily modified, for screening DNA libraries and the products produced therefrom for novel compounds of some desired chemical or biological characteristic. A wide range of selection parameters is well known in the art, including but not limited to various biological selection regimes (e.g., cell or phage proliferation in the presence or absence of a compound, physiological marker systems), physical selection regimes (e.g., various cell sorting regimes), and chemical activity selection regimes (e.g., chromatographic separation).
It will be readily apparent to those of ordinary skill in the art that a wide variety of modifications, adaptations, and applications of the present method of hmwDNA extraction from a natural sample as described for the first time herein, as well as a DNA library generated therefrom, are obvious and may be made without departing from the scope of the invention or the disclosed embodiments thereof. Having now described the present invention in detail, the same is demonstrated by reference to the following examples, which are included herewith for purposes of illustration only, and are not intended to be limiting of the invention in any way.
EXAMPLE 1: Recovery of High Molecular Weight DNA from a Natural Sample
To demonstrate that hmwDNA can be isolated directly from a natural sample without the necessity of culturing, or otherwise pretreating, the organisms contained within the sample, the methods of the present invention were applied to a natural soil sample taken from a local site. A 50 ml soil sample was suspended in a solution of 25 mM Tris pH 8.0, 150 mM NaCl, 25 mM EDTA (Buffer I) to a final volume of 175 ml. After complete suspension of the soil, 50 ml of equilibrated 65°C phenol was added to the soil suspension and emulsified by rotation (10-15 rpm) for 30 min at RT. The emulsion was centrifuged at 4000 rpm for 20 min. The aqueous phase was gently poured into a clean vessel, 0.7 vol isopropanol gently added to the aqueous solution (final volume ~150 ml), and rotated for an additional 30 min at RT. The mixture was centrifuged at 4000 rpm for 20 min and the supernatant discarded. The precipitated DNA "pellet" was further washed once with 70% ethanol and dried. The DNA extract was resuspended in 6 ml of Buffer I plus 600 µl of 5 M NaCl (i.e., gentle rotation for ~10 min), and 6 ml of 65°C 2% CTAB (in 2 M NaCl) was added. The mixture was rotated for 5 min, incubated at 65°C for 10 min, and rotated an additional 5 min.
Chloroform (6 ml) was added to the solution and rotated a further 30 min. The solution was then centrifuged at 10,000 rpm for 10 min. The aqueous phase was transferred into a clean vessel using a wide-bore 1 ml pipette, 0.7 vol isopropanol gently added to the aqueous solution, and rotated for 30 min at RT. The DNA precipitate was allowed to settle to the bottom of the vessel, and the supernatant gently poured off. The DNA precipitate was washed repeatedly with 15 ml aliquots of 70% ethanol until the supernatant was clear. Final removal of ethanol was followed by resuspension of the DNA in 1 ml TE (the DNA precipitate was allowed to self-resuspend for ~1 hr). The DNA solution was centrifuged at 10,000 rpm for 5 min, and any residual particulates removed. The DNA solution was loaded onto a 32 ml sucrose gradient (in a 1" x 3.5" Beckman Ultra-Clear centrifuge tube), comprising 8 ml steps of 20%, 30%, 40%, and 50% sucrose in TE. The gradient was ultracentrifuged at 28,000 rpm for 21 hrs (without braking). The gradient was eluted in 2 drop increments into a 96-well plate. Every other elution fraction was run on a pulsed-field electrophoresis gel. Recovered DNA fragments, including hmwDNA, from the soil sample ranged in size from 50 kbp to 400 kbp (Fig. 1). DNA yield was approximately 1 microgram (µg) per gram of soil; however, the yield can vary depending on soil type (clay vs. sandy) as well as the location and time of sampling. These results provide the first demonstration of hmwDNA efficiently extracted from a natural sample containing DNA without prior amplification of the DNA, or otherwise pre-treating the sample prior to actual DNA extraction.
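The volume arithmetic in Example 1 can be sketched in Python. This is an illustrative calculation only and not part of the disclosed method: the function names are hypothetical, and the 12.6 ml aqueous volume assumes that no volume is lost at the chloroform step.

```python
def isopropanol_ml(aqueous_ml: float, vol_ratio: float = 0.7) -> float:
    """Isopropanol to add for a given aqueous-phase volume (0.7 vol by default)."""
    return round(aqueous_ml * vol_ratio, 2)


def diluted_percent(stock_percent: float, stock_ml: float, sample_ml: float) -> float:
    """Final percent concentration after adding a stock solution to a sample."""
    return stock_percent * stock_ml / (stock_ml + sample_ml)


def step_gradient(total_ml: float, percents: list[int]) -> list[tuple[int, float]]:
    """Split a gradient volume into equal steps, one per sucrose concentration."""
    step_ml = total_ml / len(percents)
    return [(p, step_ml) for p in percents]


# Resuspended extract: 6 ml Buffer I + 0.6 ml 5 M NaCl, then 6 ml of 2% CTAB.
sample_ml = 6.0 + 0.6
ctab_final = diluted_percent(2.0, 6.0, sample_ml)  # final CTAB percentage
iso_ml = isopropanol_ml(sample_ml + 6.0)           # 0.7 vol of the assumed 12.6 ml
steps = step_gradient(32.0, [20, 30, 40, 50])      # four equal sucrose steps

print(f"CTAB final: {ctab_final:.2f}%")
print(f"Isopropanol: {iso_ml} ml")
print(steps)
```

Under these assumptions the CTAB is diluted to just under 1% in the mixture, and the 32 ml gradient divides into the four 8 ml steps stated in the text.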
EXAMPLE 2: Construction of a DNA Expression Library from a Natural Sample
To demonstrate that hmwDNA extracted directly from a natural sample can be inserted stably into an expression vector, whereby the compilation of individual recombinant vectors represents a library of diverse DNA fragments derived from that natural sample, a DNA library was generated using hmwDNA extracted from a soil sample (as described in Example 1 above), inserted into a bacterial artificial chromosome (BAC), and transformed into E. coli. A composite hmwDNA library (~15,000 clones) in pBTP2 (a modified BAC-based vector) was created, comprising 3 DNA sub-libraries. The construction was accomplished by digesting hmwDNA (separated by sucrose gradient centrifugation) with Hind III (see Fig. 1). The digested DNA was size purified by pulsed-field electrophoresis (DNA)(Bio-Rad, Hercules, CA). Three separate size ranges were excised: 50-100 kbp, 100-150 kbp and 150-200 kbp fragments. The fragments were further purified on a second pulsed-field gel to remove residual small molecular weight DNA fragments trapped in the gel matrix, because smaller DNA fragments tend to ligate more efficiently than hmwDNA. The gel slices were dialysed against TE and digested with gelase (Bio-Rad) to dissolve the agarose. Ligation was performed using standard reagents (T4 ligase and ATP) and 25 ng of vector at a 10:1 vector to insert molar ratio. The ligation was drop dialysed with TE and transformed into E. coli (DH10B). Results of the library construct are provided in Table 1. This 15,000 member library was constructed from a sub-sample of the soil-extracted hmwDNA. Extrapolating to the total 500 µg sample obtained from a single 400 g soil sample, total library constructs would approximate 10 clones. These results demonstrate that hmwDNA extracted directly from a natural sample can be stably cloned into an expression vector for storage, amplification, and future analysis of the genetic material incorporated therein.
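The 10:1 vector to insert molar ratio used in Example 2 implies a specific insert mass for each size range. The Python sketch below shows the arithmetic under stated assumptions: the function name is hypothetical, and the vector length is a placeholder because the size of pBTP2 is not given in the text.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   vector_to_insert: float) -> float:
    """Insert mass (ng) that yields the requested vector:insert molar ratio.

    Moles of double-stranded DNA are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) / ratio.
    """
    return vector_ng * (insert_bp / vector_bp) / vector_to_insert


VECTOR_BP = 7_500  # ASSUMED placeholder length; the pBTP2 size is not stated

# Midpoints of the three excised size ranges (50-100, 100-150, 150-200 kbp).
for insert_bp in (75_000, 125_000, 175_000):
    ng = insert_mass_ng(25.0, VECTOR_BP, insert_bp, 10.0)
    print(f"{insert_bp // 1000} kbp insert: {ng:.1f} ng for 25 ng vector at 10:1")
```

Because large inserts carry far more mass per molecule than the vector, even a tenfold molar excess of vector corresponds to an insert mass comparable to or greater than the vector mass.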
EXAMPLE 3: Diversity of a DNA Library Isolated from a Natural Sample
To demonstrate that hmwDNA expression libraries derived from a natural sample possess gDNA from a diversity of species contained within the natural sample, the genetic diversity of a hmwDNA expression library derived from a soil sample (as described in Example 2) was subjected to phylogenetic analysis. A sample of soil hmwDNA was subjected to PCR analysis using primers homologous to sequences found within small subunit (16S) rRNA. The use of rRNA sequences to determine species diversity and to establish phylogenetic relationships is well known in the art (for review see Hugenholtz et al., 1998). The use of degenerate oligonucleotides allows for the amplification of many different bacterial families. Over 200 16S DNA fragments were cloned and sequenced. The results indicate a wide diversity of sequences categorized within known families from around the world. The majority of sequences, however, could not be assigned to identified bacterial families, presumably representing unknown bacterial strains. Figure 3 diagrams a phylogenetic tree based on sequences obtained from soil hmwDNA. Incorporated into the diagram are bacterial representatives of divergent taxonomic families, including several from published reports of 16S subunit sequence analysis of DNA obtained from soil. These results illustrate the genetic diversity of hmwDNA directly obtainable from a natural sample by the method of the present invention, and further demonstrate the capability of the present invention to extract and utilize previously unknown genetic information contained within natural samples.
EXAMPLE 4: Screening of a DNA Expression Library Obtained from a Natural Sample
To demonstrate that hmwDNA expression libraries derived from a natural sample can be screened for some selected physiological characteristic, chemical or physical property, or biological activity, a hmwDNA expression library derived from a soil sample (as described in Example 2) was screened for a variety of activities (see Table 2). These screening assays isolated a variety of clones from the soil hmwDNA library; e.g., clones capable of expressing known compounds such as indirubin and indole. As one example, the screen for antibacterial activity against a sensitive strain of Bacillus subtilis revealed three separate and independent antibacterial clones. One of the clones, which also expresses a brown pigment, was analyzed further (data not shown). An organic molecule, encoded by the soil DNA insert of this clone, has been identified as the source of the antibacterial activity. Sequence analysis of the soil hmwDNA insert fails to link the compound, or its source, to any currently known antibiotic or bacterial strain.
EXAMPLE 5: Genetic and Chemical Analyses of a DNA Expression Library Isolate Obtained from a Natural Sample
To demonstrate that DNA isolates of a DNA expression library derived from a natural sample can be genetically manipulated for detailed analysis of the DNA isolate and the polypeptide(s) it encodes, one of the three antibacterial clones isolated and screened in Example 4 above (clone MG1.1) was subcloned and further analyzed. Clone MG1.1 (insert size of 27 kb), which produces a purple pigment, exhibits antibacterial activity against B. subtilis and S. aureus. Upon confirmation that the genetic information responsible for these phenotypes was plasmid-encoded, the isolate was further analyzed. For further genetic (including sequence information) and biochemical analysis of the MG1.1 isolate, transposon mutagenesis using pTRANS was employed.
pTRANS is a method for characterizing clones expressing heterologous activities by transposon mutagenesis and DNA sequencing using plasmid pTRANS-sacB. Briefly, plasmid pTRANS-sacB contains the Tn7-based transposon TRANS (derived from plasmid pGPS1, New England BioLabs, Beverly, MA), a ColE1 origin of replication and a kanamycin resistance gene. In vitro transposition of TRANS allows for random insertion of the ColE1 ori into a target BAC plasmid, increasing its copy number and thereby facilitating plasmid DNA isolation, sequencing, and (occasionally) expression. Plasmid pTRANS-sacB also encodes the Bacillus subtilis sacB gene in the vector portion of the plasmid, allowing for its counterselection in the presence of 5% sucrose. Transposition reactions were performed following the published protocol for pGPS1, followed by transformation of electrocompetent E. coli strain DH10B with 5 µl of the transposon reaction and selection of transformants on LB plates containing kanamycin (50 µg/ml), chloramphenicol (10 µg/ml), and sucrose (5%). The resulting transformants contained multicopy BAC plasmids with TRANS insertions. Transformants that lost the heterologous activity contain TRANS insertions in soil DNA sequences encoding that activity. For sequencing, plasmid DNA was isolated using the Qiagen Biorobot 9600 (Qiagen, Inc., Valencia, CA) according to the manufacturer's instructions, sequenced using the ABI Big Dye sequencing kit, and run on an ABI 377 DNA sequencer. Bases were assigned using the Unix program Phred, and the data were assembled using Phrap and edited using Consed (University of Washington, Seattle, WA). Two ORFs responsible for pigment production were identified; they encode amino acid sequences similar to monooxygenases (Fig. 4). Monooxygenases are a family of enzymes shown to produce indole-related compounds in other organisms by catalyzing the incorporation of molecular oxygen (Yen et al., 1991; O'Connor et al., 1997).
These two genes were shown to be sufficient for production of both pigments and antibacterial activity, as subcloning and transfer of the two genes to a new host strain resulted in the production of activities identical to those of the original MG1.1 clone. The antimicrobial activity and the pigments associated with MG1.1 were extractable with organic solvents. Thin layer chromatography (TLC) analysis of extracts yielded both pink and blue pigments. Each pigment exhibited weak antibacterial activity. Both pigments had a MW = 262 as determined by mass spectroscopy (MS). Genetic analysis, indicating that indole-monooxygenase genes were responsible for pigment production (Fig. 4), combined with MW determination, suggests that the pigments could be indirubin and indigo blue, structural isomers previously shown to be co-produced in several microorganisms (Hart et al., 1992; Eaton and Chapman, 1995). Samples of indirubin, an antileukemic drug known to inhibit tyrosine kinases (Han, 1994; Hoessel et al., 1999), and indigo blue were obtained (Sigma) as standards. In TLC analysis, the standards comigrated with the unknown pigments purified from MG1.1 (indirubin/pink pigment Rf=0.3, indigo blue/blue pigment Rf=0.58). One-dimensional nuclear magnetic resonance (NMR) analysis of the pink pigment confirmed its identity as indirubin (Fig. 5). TLC analysis of extracts revealed additional, nonpigmented antibacterial molecule(s) that were more potent (B. subtilis was sensitive to M Mg/ml in LB liquid culture). Initial MS and NMR analyses suggest that E. coli clone MG1.1 produces a family of related molecules with antimicrobial activity, encoded by the soil DNA insert. One such member identified by NMR (data not shown) is 2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine (Figure 5C). This molecule has been chemically synthesized (Bocchi and Palla, 1986; Legall et al., 1988; Ishii et al., 1988), but never isolated from a natural source.
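The reported MW of 262 for both pigments is consistent with the shared molecular formula of indigo and indirubin, C16H10N2O2. That formula is standard chemistry and is not stated in the text above, so it is treated here as an assumption. A minimal Python check of the arithmetic, using rounded standard atomic masses and a hypothetical helper name:

```python
# Rounded standard atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_weight(composition: dict[str, int]) -> float:
    """Sum atomic masses weighted by the atom counts in a composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# ASSUMED formula shared by the structural isomers indigo and indirubin.
C16H10N2O2 = {"C": 16, "H": 10, "N": 2, "O": 2}

mw = molecular_weight(C16H10N2O2)
print(f"Calculated MW: {mw:.2f} g/mol")
```

A single formula for both pigments matches the observation that the pink and blue TLC spots share one mass while migrating differently.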
Derivatized indol dimers could also be isolated following the methods described above. A summary of the four antibacterial activities identified in the MG1 library is presented in Table 3.
Table 3. Antibacterial activities detected in soil DNA library MG1
*+, pTRANS insertions into gene(s) encoding antibacterial activity were obtained, leading to identification of the encoding DNA sequence; -, no pTRANS knockouts were obtained. +, pTRANS insertions into BAC plasmid outside of genes encoding antibacterial activity, leading to amplification of plasmid copy number and antibacterial gene expression. Information obtained from DNA sequence analysis and/or cell fractionation/extraction, biochemical, and analytical chemistry procedures.
The above described Examples demonstrate that a hmwDNA library can be generated from a natural sample and screened for selected properties, and that desired clones can be isolated for further growth, manipulation, and analysis. These results provide the first demonstration of the screening of a hmwDNA library derived from a natural sample, the identification of biochemicals, and the isolation of clones containing the genetic information that encodes novel biochemicals.
Cited References
Each of the publications mentioned herein above and below is incorporated by reference.
Abelson, "Medicine from Plants", Science 247:513 (1990).
Berdy et al., "Search and Discovery Methods for Novel Antimicrobials", In Bioactive Metabolites From Micro-Organisms, pp. 3-25, ME Bushell, U Grafe, eds., Elsevier, Amsterdam (1982).
Bintrim et al., "Molecular phylogeny of Archaea from soil", PNAS USA 94:277-282 (1997).
Bocchi and Palla, Tetrahedron 42:5019-5024 (1986).
Colwell, "Human pathogens in the aquatic environment", pp. 337-344 In Colwell and Foster (eds.) Aquatic Microbial Ecology. University of Maryland Sea Grant, College Park, MD (1979).
Currie et al., "Fungus-Growing Ants Use Antibiotic-Producing Bacteria to Control Garden Parasites", Nature 398:701-704 (1999).
Eaton and Chapman, J. Bacteriol. 177:6983-6988 (1995).
Fenical and Jensen, Marine Microorganisms: A New Biomedical Resource. Advances in Marine Biotechnology, vol. I: Pharmaceutical and Bioactive Natural Products, pp. 419-457, D. Attaway, O. Zaborsky eds., Plenum Press, New York (1992).
Franco et al., "Detection of Novel Secondary Metabolites", In Critical Reviews in Biotechnology, vol. 11(3):193-276 (1991).
Giovannoni et al., "Genetic diversity in Sargasso Sea bacterioplankton", Nature 345:60-63 (1990).
Goodfellow et al., In Microbial Products: New Approaches, pp. 343-383, Cambridge University Press (1989).
Griffiths et al., "Broad-Scale approaches to the determination of soil microbial community structure: application of the community DNA hybridization technique", Microbial Ecol. 31:269-280 (1996).
Han, Stem Cells (Dayt) 12:53-63 (1994).
Hart et al., J. Gen. Microbiol. 138:211-216 (1992).
Hoessel et al., Nature Cell Biol. 1:60-67 (1999).
Hugenholtz et al., "Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity", J. Bacteriol. 180:4765-4774 (1998).
Ishii et al., J. Chem. Soc. Perkin Trans. 1:2387-2395 (1988).
Legall et al., Int. J. Pept. Protein Res. 32:279-291 (1988).
Malpartida et al., "Molecular Cloning of the Whole Biosynthetic Pathway of a Streptomyces Antibiotic and its Expression in a Heterologous Host", Nature 309:462-464 (1984).
Murdock et al., "Construction of Metabolic Operons Catalyzing the De Novo Biosynthesis of Indigo in Escherichia coli", Bio/Technology 11:381-385 (1993).
Rosenberg, "Microbial surfactants", Crit. Rev. Biotechnol. 3:109-132 (1986).
O'Connor et al., Appl. Environ. Microbiol. 63:4287-4291 (1997).
Pace, "A molecular view of microbial diversity and the biosphere", Science 276:734-740 (1997).
Schulz et al., "Dense Populations of a Giant Sulfur Bacterium in Namibian Shelf Sediments", Science 284:493-495 (1999).
Schneider, Extraction of Genomic DNA from Blood Using Cationic Detergents, U.S. Pat. No.
5,596,092 (1997).
Stahl et al., "Characterization of a Yellowstone hot spring microbial community by 5S rRNA sequences", Appl. Environ. Microbiol. 49:1379-1384 (1985).
Suffness et al., In Biomedical Importance of Marine Organisms, pp. 151-157, Calif. Acad. of Sciences (1988).
Suzuki et al., "Bacterial diversity among small-subunit rRNA gene clones and cellular isolates from the same seawater sample", Appl. Environ. Microbiol. 63:983-989 (1997).
Torsvik et al., "Comparison of phenotypic diversity and DNA heterogeneity in a population of soil bacteria", Appl. Environ. Micro. 56:782-787 (1990).
Torsvik et al., In Ritz and Giller (eds.) Beyond the Biomass: Compositional and Functional Analysis of Soil Microbial Communities. John Wiley and Sons, Chichester (1994).
Vining, "Functions of secondary metabolites", Annu. Rev. Microbiol. 44:395-427 (1990).
Ward et al., "16S rRNA sequences reveal numerous uncultured microorganisms in a natural community", Nature 345:63-65 (1990).
Yen et al., J. Bacteriol. 173:5315-5327 (1991).
We Claim:
1. A method for isolating high molecular weight DNA from a plurality of species in a sample comprising the steps of: (a) preparing an aqueous suspension of said sample; (b) forming an extraction mixture by adding to the aqueous suspension an appropriate organic solvent under suitable conditions to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA; (c) separating the extraction mixture into an aqueous phase and an organic phase, wherein the aqueous phase contains DNA; and (d) precipitating the DNA from the aqueous solution.
2. The method of claim 1, further comprising the steps of: (a) gently mixing the aqueous solution with a cationic detergent to form an aqueous mixture; and (b) gently extracting the aqueous mixture with an organic solvent to yield an enriched aqueous solution.
3. The method of claim 2, wherein the cationic detergent is cetyltrimethylammonium bromide (CTAB).
4.
The method of claim 1, 2 or 3, further comprising the steps of: (a) concentrating the DNA in the enriched aqueous solution; (b) passing the concentrated DNA through a density gradient to separate DNA molecules from each other by size; and (c) isolating high molecular weight DNA from said gradient.
5. The method of any of claims 1-3, wherein the sample comprises soil.
6. The method of claim 4, wherein the sample comprises soil.
7. The method of claim 1, wherein the organic solvent is phenol at a temperature of >50°C.
8. The method of any of claims 2-4, further comprising the step of adding a proteinase to the DNA in a sufficient amount and under appropriate conditions permitting degradation of proteins.
9. The method of claim 4, wherein said density gradient is a CsCl gradient or a sucrose gradient.
10. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA.
11. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA isolated by the method of claim 1.
12. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA which are operably linked to a transcription control element which regulates their expression in a host cell.
13. The high molecular weight DNA library of claim 10, 11 or 12, wherein said vector is selected from the group consisting of plasmids, cosmids, phagemids, modified viral vectors, and artificial chromosomes.
14. The high molecular weight DNA library of any of claims 10-12, wherein said vector is a shuttle vector.
15. The high molecular weight DNA library of claim 10, wherein said high molecular weight DNA is obtained from a soil sample.
16.
Host cells containing the high molecular weight DNA library of claim 10.
17. Host cells of claim 16 selected from the group consisting of eubacterial cells, fungal cells and animal cells.
18. Host cells of claim 17 wherein said host cells are E. coli, yeast or mammalian cells.
19. A method for making a high molecular weight DNA library containing DNA inserts derived from a plurality of species, comprising inserting high molecular weight DNAs from a plurality of species into a vector.
20. The method of claim 19, wherein said vector is selected from the group consisting of plasmids, cosmids, phagemids, viral vectors, and artificial chromosomes.
21. The method of claim 19, wherein said vector is a shuttle vector.
22. The method of claim 19, wherein said high molecular weight DNA is obtained from a soil sample.
23. The method of claim 19, which further comprises introducing the library into host cells.
24. The method of claim 23, wherein said host cells are selected from the group consisting of eubacterial cells, fungal cells and animal cells.
25. The method of claim 24 wherein said host cells are E. coli, yeast or mammalian cells.
26. A method for identifying a compound with a biological activity of interest produced by cells containing a member of a high molecular weight DNA library, wherein said library contains high molecular weight DNA inserts from a plurality of species, comprising the steps of: (a) selecting an assay that detects the biological activity of interest; (b) screening said cells containing members of the high molecular weight DNA library for the presence of said biological activity; and (c) correlating the biological activity with the presence of the compound.
27.
A method for identifying a member of a high molecular weight DNA library associated with production of a compound with a biological activity of interest, wherein said library contains high molecular weight DNA inserts from a plurality of species, comprising the steps of: (a) selecting an assay that detects the biological activity of interest; (b) screening said cells containing members of the high molecular weight DNA library for the presence of said biological activity; and (c) correlating the biological activity with the presence of the member of the high molecular weight DNA library.
28. A method for producing a compound of interest, comprising the steps of: (a) identifying a member of a high molecular weight DNA library associated with production of a compound with a biological activity of interest, using the method of claim 26; and (b) culturing cells containing and capable of expressing the member of the library so identified under conditions permitting production of the compound.
29. The method of claim 28 wherein said compound of interest is a derivatized indol dimer.
30. The method of claim 29 wherein said indol dimer is 2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine.
31. A method for isolating high molecular weight DNA from a plurality of species in a sample, substantially as hereinabove described and illustrated with reference to the accompanying drawings.

Full Text

Cross Reference to Related Applications
This application claims the benefit of U.S. Provisional Application No. 60/137,065, filed June 2, 1999, and U.S. Provisional Application No. 60/191,601, filed March 23, 2000.
Background of the Invention
Field of the Invention
The present invention relates to novel methods for the isolation and cloning of high molecular weight DNA collected from a variety of natural sources and a DNA library produced therefrom. More particularly, the invention relates to the isolation of DNA from a plurality of species collected from natural sources to produce a library of high molecular weight DNA fragments generated from those organisms, either singly or in recombination with DNA fragments from other species.
Description of the Related Art
A great diversity of microorganisms exists in the natural environment. Since the first discovery of the medicinal benefits (as well as other applications) of various naturally produced chemical products ("natural products") produced by the Earth's biota, interest in the discovery and exploitation of these natural products has increased. For example, the majority of known antimicrobial products are, in fact, secondary metabolites isolated from soil microorganisms. Approximately three-quarters of all known bacterially-derived natural products come from soil actinomycetes, especially streptomycetes (Vining, 1990). Screening for medically useful natural products has primarily yielded antibacterial agents. Bacterial and fungal antibiotics represent a multi-billion dollar market, the third largest pharmaceutical market worldwide.
Natural products possessing other biological activities of human benefit have been discovered as well. These include anticoccidial agents, anti-fungal drugs, herbicidal agents, anticancer drugs, insecticidal and nematocidal agents, immunomodulating compounds, and enzyme inhibitors. In addition, microorganisms produce a variety of lipopeptides, lipoproteins, glycolipids, and lipopolysaccharides with surface-acting properties (Rosenberg, 1986). Among the industrially important enzymes are cellulases, amylases, proteases, and lipases used extensively in textile applications. Other microbial enzymes are important in the biotechnology industry (e.g., restriction enzymes and thermostable enzymes). Representatives of all of these natural product categories are in widespread use, indicating very large commercial markets, with tremendous potential for expansion. The screening of natural products from sources such as terrestrial bacteria, fungi, invertebrates and plants has resulted in the discovery of many important drugs. Tens of thousands of these natural products are biologically active, with at least 100 currently in use as antibiotics, agrochemicals and anti-cancer agents (Franco et al., 1991; Goodfellow et al., 1989; Berdy et al., 1982; Suffness et al., 1988).
Although many important microorganisms exist in nature, the success of screening for new natural products of interest is directly related to the number of unique source organisms that provide the compounds tested in the screening process. Pharmaceutical companies may typically screen compound libraries containing hundreds of thousands of natural and synthetic products. The number of novel compounds contained in these libraries has plateaued with time, however. The shortage of new natural products is primarily due to the inability of practitioners to discover and to analyze novel organisms. For example, it is thought that only a small number (probably 0.1%) of the microorganisms in soils can be isolated and cultured by known methods (Bintrim et al., 1997). In addition, it has been estimated that only about five thousand plant species have been studied for possible medical use, a fraction of the 250,000-3,000,000 estimated plant species (Abelson, 1990). Millions of species of marine microorganisms have been estimated, yet only a small number have been characterized. Recently, and only by happenstance, a new species of sulfur bacterium was discovered in sediment cores taken over 100 meters below the waters off the coast of Namibia. Its discovery was due only to visual detection (i.e., it is the largest known bacterium; Schulz et al., 1999). Discovery of such elusive microorganisms is important, not only for the progress of science generally, but also for development of novel natural products of medicinal and industrial value. This undiscovered biodiversity represents an as yet untapped resource of novel compounds. Technology is needed that would allow efficient, precise, and systematic detection and analysis of the genetic diversity of nature.
Although the Earth's biodiversity is enormous, most of the Earth's species remain unknown and undetected. Discovery of the vast array of unknown species is difficult, especially for microorganisms, which traditionally require culturing specimens to obtain a sufficient sample for analysis (Pace, 1997). Physical, nutritional, even biological (e.g., commensalism, or symbiosis) requirements can make laboratory cultures impractical, if not impossible. Even if a potentially valuable natural product is found, further analysis and commercial production of the compound may be prohibited due to the inability to obtain additional samples, or samples in sufficient amounts.

Modern molecular biology permits the analysis of an organism's biochemical nature from its genetic constitution. Even with modern analytical tools, which may require only minute samples of the organism's genome for investigation, analysis is difficult due to the inability to obtain a sufficient amount of high quality (i.e., high molecular weight) genomic DNA (gDNA). There remains a general need for methods of cataloguing, analyzing, and potentially utilizing, the genetic diversity of the Earth's unknown biological reserves.
Recovery of genomic DNA from a natural sample may be accomplished by the isolation of organisms from a natural sample, culturing the organisms, and extracting gDNA from the culture. This method has a number of disadvantages, however. The process is time consuming and requires a large amount of practitioner manipulation and person hours. In addition, the process requires prior knowledge of the organism's physical, chemical, and biological requirements for successful culturing. Alternatively, the practitioner may select arbitrary culture conditions in an effort to culture whatever may grow under the selected conditions. As mentioned earlier, this is a significant limitation, since it is believed that the vast majority of the Earth's biota remains uncharacterized and unknown.
Recovery of genomic DNA from a natural sample may also be accomplished by DNA isolation directly from the collected natural sample. In this method, the organisms of the sample are not cultured or otherwise isolated from each other or their natural environment. This direct DNA extraction method lacks robustness in regard to the quality and quantity of the extracted gDNA, however: The quantity of gDNA isolated by conventional direct extraction methods is small, requiring either large amounts of initial sample or repeat sampling. Obtaining large amounts of a natural sample may be impossible or impractical, especially if the sample is taken from remote locales or environmentally sensitive habitats. In addition, laboratory manipulation of large amounts of sample during the DNA extraction protocol is impractical. Repeat sampling suffers from these same limitations. In addition, repeat sampling does not guarantee successful repeated isolation of the same DNA fragments.
The quality of gDNA isolated by direct extraction methods is also limited. Despite years of scientific effort, the extraction of high molecular weight gDNA (i.e., >50 kilobase pairs) directly from natural samples has not been demonstrated. The isolation of large gDNA fragments is necessary, however, for proper forensic or phylogenetic analysis, as well as for the discovery of larger polypeptides or compounds produced from the biological activity of a plurality of polypeptides (e.g., a biochemical pathway).
Because of the limitations of current methods of DNA isolation from natural samples, there is an overwhelming need in the art for a sensitive and efficient process for isolating DNA from natural samples. A method is needed that would allow DNA extraction directly from a variety of natural samples, without the need to culture the organisms of the sample prior to DNA extraction. Preferably, this method would allow direct DNA extraction without requiring large amounts of initial sample, or repeat sampling. Ideally, this improved method would be able to isolate high molecular weight DNA from a sample, and in a form and condition whereby the high molecular weight DNA can be retained, sequenced, and expressed for further research and development.
Summary of the Invention
The present invention is directed to a novel method for recovering high molecular weight DNA (hmwDNA) from a natural sample. Preferably the sample will contain a plurality of species. The present invention offers several advantages and novel features over DNA extraction techniques known in the art.
The present invention provides the advantage of improved sensitivity, thus not requiring a large amount of initial sample collected from the natural environment. The invention employs equipment and reagents known and used by persons of ordinary skill in the art, and does not require the use of expensive, cumbersome, or otherwise exotic equipment or reagents.
The method of the present invention requires less sample manipulation than current techniques known in the art. The method allows for DNA extraction directly from a natural sample, without the need to culture organisms contained within the sample, or any other pre-treatment of the sample before the process of DNA extraction. This invention, therefore, reduces the time and expense of additional reagents, equipment, incubation time, and practitioner manipulation.
The method of the present invention provides a more precise representation of the total genetic diversity contained within a given sample than conventional DNA isolation techniques known in the art. Because DNA extraction may commence immediately after sample acquisition, there is less opportunity for the degradation of some genetic material (e.g., due to organism die-off, DNA hydrolysis, etc.) and the amplification of other genetic material (e.g., organism proliferation). The improved techniques of the present invention require no prior understanding or knowledge of the biological requirements (or even the existence) of the organisms in the sample to be processed. (See, for example, Torsvik et al., 1990 and 1994.)
The present invention provides for the extraction of hmwDNA from a natural sample. DNA isolated by the present invention can range in size from 50,000 to 400,000 base pairs (50-400 kbp). As a result, the present invention can produce single DNA fragments equivalent in length to one tenth of a typical bacterial genome. Extraction of DNA fragments of this magnitude from a natural sample is not known in the art. The isolation of high quality DNA is critical for phylogenetic or forensic analyses. hmwDNA is essential for the discovery of larger polypeptides, or polypeptides encoded by gDNA that contain noncoding regions (e.g., introns). The isolation of hmwDNA is also necessary for the discovery of polypeptides formed from two or more heterogeneous polypeptide subunits, or other gene clusters and their products; for example, compounds produced as a result of a biochemical pathway requiring two or more polypeptides (either product may be the result of polypeptides encoded by polycistronic DNA fragments; see, for example, Malpartida et al., 1984; Murdock et al., 1993).
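As a rough check of the one-tenth figure, the following Python sketch compares the stated 50-400 kbp fragment sizes with a genome size. The 4,000 kbp genome is an assumed round number chosen only for illustration; it is not a value stated in this disclosure, and bacterial genome sizes vary widely.

```python
def genome_fraction(fragment_kbp: float, genome_kbp: float) -> float:
    """Return the fraction of a genome spanned by a single DNA fragment."""
    if fragment_kbp <= 0 or genome_kbp <= 0:
        raise ValueError("sizes must be positive")
    return fragment_kbp / genome_kbp


# Assumed illustrative genome size in kbp; not taken from the disclosure.
ASSUMED_GENOME_KBP = 4000.0

# Lower and upper bounds of the fragment size range given in the text.
for size_kbp in (50.0, 400.0):
    fraction = genome_fraction(size_kbp, ASSUMED_GENOME_KBP)
    print(f"{size_kbp:.0f} kbp fragment covers {fraction:.2%} of the assumed genome")
```

Under this assumption, the 400 kbp upper bound corresponds to one tenth of the genome, which matches the comparison made above.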
The method of the present invention provides for the isolation from a natural sample of hmwDNA fragments suitable for incorporation into a genetic vector, transgenic incorporation into a host organism, and subsequent expression of the DNA insert. It is therefore another object of the invention to provide a library of hmwDNA isolated from a natural sample. The library of the present invention provides for the unlimited storage of the DNA inserts and the genetic information contained therein, and eliminates the need or necessity to obtain additional samples from the natural environment. The library of the present invention can be utilized in the analysis and screening of the genetic information, or the expressed polypeptides for a wide variety of research and development applications (e.g., phylogenetic analyses and drug discovery programs as described earlier).
The method of the present invention essentially comprises: preparing an aqueous suspension of a natural sample, gently emulsifying the suspension with an organic solvent, and precipitating the DNA from solution. Preferably, the extraction process comprises additional steps, before the final DNA precipitation, including washing the sample solution with a cationic detergent, and additional organic solvent separations. Most preferably the extraction process culminates with a gradient separation step, to separate the isolated DNA on the basis of molecular weight. It is critical to the invention that the steps of the process are carried out gently, to minimize or prevent shearing of the sample DNA.
This invention provides a method for isolating high molecular weight DNA from a plurality of species in a sample, including environmental samples such as soil samples or other samples of material from nature. The method involves suspending a portion of the sample in an aqueous medium to prepare an aqueous suspension of the sample. Then, an extraction mixture is formed by adding to the aqueous suspension an appropriate organic solvent under suitable conditions to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA. Suitable solvents include phenol, or other organic solvents capable of dissolving undesired materials such as proteins, lipids, etc. Preferably the phenol or other organic solvent is sufficiently warm, usually >50°C in the case of phenol, and is added under suitable conditions (preferably gentle mixing) to dissolve the undesired materials. By "gentle" we mean conditions sufficiently vigorous for dissolution and removal of the undesired materials from the aqueous phase which contains the high molecular weight DNA, yet gentle enough so that at least part, and preferably most, of the high molecular weight DNA remains as such in the aqueous phase. The extraction mixture is then separated into an aqueous phase and an organic phase. The aqueous phase contains the high molecular weight DNA, which may then be precipitated from the aqueous solution using conventional methods and materials, e.g., addition of a cosolvent such as an alcohol, generally ethanol or isopropanol.
If desired, a cationic detergent may be added to the separated aqueous phase to form an aqueous mixture. A preferred cationic detergent for this use is cetyltrimethylammonium bromide (CTAB). The aqueous mixture may then be gently extracted with an organic solvent, which may be the same or different from the solvent used in the previous extraction step. Often the organic solvent used to remove the detergent is chloroform, but phenol or any other suitable solvent may be used.
The invention optionally further comprises passing the extracted DNA over a density gradient to separate the hmwDNA from smaller DNA fragments. Prior to putting the DNA on the gradient, the solution containing the DNA must be concentrated, either by reducing the volume of the solution or by precipitating the DNA from solution and resuspending it in a smaller volume. If desired, the practitioner may further remove contaminants from the hmwDNA by adding a proteinase to the DNA in a sufficient amount and under appropriate conditions permitting degradation of proteins.
The present invention also provides for a genetic construct comprising hmwDNA isolated from a natural sample incorporated into a cloning vector. Preferably, the DNA insert-vector construct is stable and capable of replication of the DNA insert. Most preferably the genetic construct is capable of expressing a polypeptide encoded by the hmwDNA insert.
The present invention further provides for a transgenic host cell comprising the incorporation of hmwDNA isolated from a natural sample into a living host cell. Preferably the hmwDNA is stably incorporated into a cloning vector. Alternatively, the hmwDNA is stably incorporated into a host cell chromosome. Most preferably, the host cell is capable of expressing one or more polypeptide(s) encoded by the hmwDNA insert.

Accordingly, the present invention provides a method for isolating high molecular weight DNA from a sample, said method comprising the steps consisting essentially of: (a) preparing an aqueous suspension of said sample wherein said sample contains high molecular weight DNA having a size of at least 50 kbp; (b) forming an extraction mixture by combining the aqueous suspension and an organic solvent under nonturbulent rotational mixing to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA; (c) separating the extraction mixture into an aqueous phase and an organic phase under nonturbulent rotational mixing, wherein the aqueous phase contains DNA; (d) precipitating the DNA from the aqueous phase of step (c) under nonturbulent rotational mixing by addition of an alcohol selected from the group consisting of isopropanol and ethanol; (e) resuspending the DNA from step (d) in an aqueous solution and gently mixing the aqueous solution with a cationic detergent to form an aqueous mixture, wherein the DNA remains in solution; (f) gently extracting the aqueous mixture with an organic solvent; (g) separating the extraction mixture of step (f) into an aqueous phase and an organic phase; (h) precipitating the DNA from the aqueous phase of step (g) under nonturbulent rotational mixing by addition of an alcohol selected from the group consisting of isopropanol and ethanol; and (i) recovering high molecular weight DNA from the precipitate of step (h) having a size of at least 50 kbp.
With reference to the accompanying drawings:
Fig. 1 is a photograph of an ethidium bromide (EtdBr)-stained agarose gel, depicting DNA isolated from a natural soil sample. Lane 1 is a nucleic acid ladder, used as a gel reference marker. Lane 2 shows total genomic DNA extracted from a natural soil sample prior to density gradient separation. Lane 3 represents a soil hmwDNA fraction separated by sucrose gradient centrifugation. Lanes 4 and 5 show restriction digests of soil hmwDNA (prior to vector ligation). Lane 4 is soil hmwDNA cut with EcoRI, and lane 5 is soil hmwDNA cut with HindIII.
Fig. 2 is a diagram of pBTP2 used for the construction of the soil library. A. Vector pBTP2 is a modification of pBeloBAC11, containing additional cloning sites and a pUC origin of replication inserted into the polylinker (allowing for high-copy replication of the empty vector to facilitate purification). The pUC sequence is removed by two sequential gel purification steps before ligation with insert DNA. B. Cloning site of pBTP2. Uppercase letters indicate pBeloBac sequences, lowercase letters indicate pUC19 sequences, and bold sequences identify the pBacTA polylinker.
Fig. 3 is a phylogenetic tree construct based on small subunit (16S) rRNA sequence homology of DNA obtained from soil, and constructed using a Phylogenetic Inference algorithm (PHYLIP™, version 3.57, J. Felsenstein, U. of Washington, Seattle). Incorporated into the diagram are known bacterial representatives of divergent taxonomic families, including several from published reports of 16S subunit sequence analysis of DNA obtained directly from soil.
Fig. 4 is an ORF map of clone MG1.1. Sequence was determined as described in the methods. ORFs were identified using MapDraw (DNAStar Inc.). Homology search was done using BLAST (Basic Local Alignment Search Tool, http://www.ncbi.nlm.nih.gov/BLAST).
Fig. 5 illustrates the one-dimensional proton NMR spectrum of red compound (A) compared to a standard sample of Indirubin (B) (Sigma). Both spectra are recorded in d6-DMSO solvent at 27°C. The peaks are assigned and labeled according to the numbering scheme shown in the insert. Additional low-intensity peaks in the test sample are due to sample impurities. Proton chemical shifts are referenced to TMS as standard at 0.0 ppm. Panel C illustrates the structure, determined by NMR, of a colorless compound isolated from MG1.1 with anti-bacterial activity (2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine).
Detailed Description of the Invention
The present invention provides a novel method for recovering hmwDNA from a natural sample. Because DNA is a polymer of nucleotide bases, DNA molecular weight is directly proportional to polymer size. Therefore, as used herein, the term high molecular weight DNA (hmwDNA) refers to DNA comprising a polymer of nucleotides at least about 50,000 base pairs in length (≥ 50 kbp). Preferably hmwDNA ranges from about 50 kbp to about 400 kbp. More preferably the DNA length is about 80 kbp to about 300 kbp.
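The size definition above can be sketched in Python as follows. This is an illustration only: the text qualifies each bound with "about", whereas the sketch treats the bounds as hard cutoffs, and the figure of 650 daltons per base pair is a commonly used approximation for double-stranded DNA that is assumed here rather than taken from this disclosure.

```python
# Assumed average mass of one base pair of double-stranded DNA, in daltons.
# This is a common rule-of-thumb value, not a figure from the disclosure.
AVG_DALTONS_PER_BP = 650.0


def classify_fragment(length_bp: int) -> str:
    """Classify a DNA fragment against the size ranges given in the text."""
    kbp = length_bp / 1000
    if kbp < 50:
        return "below hmwDNA threshold"
    if 80 <= kbp <= 300:
        return "hmwDNA (more preferred range)"
    if kbp <= 400:
        return "hmwDNA (preferred range)"
    return "hmwDNA (above preferred range)"


def approx_mass_daltons(length_bp: int) -> float:
    """Molecular weight scales linearly with length, as stated in the text."""
    return length_bp * AVG_DALTONS_PER_BP


for bp in (20_000, 50_000, 150_000, 350_000):
    print(bp, classify_fragment(bp), f"{approx_mass_daltons(bp):.3g} Da")
```

The linear mass function simply restates the proportionality between polymer size and molecular weight; only the proportionality constant is an outside assumption.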
As used herein, a natural sample refers to any sample taken from the natural environment. The natural environment is meant to encompass the biosphere, or any environment wherein genetic material from an organism may be found. Preferably, the natural sample will contain genetic material from a plurality of species. Natural samples include but are not limited to samples taken from any soil (encompassing all of the soil types and depths), water (encompassing all freshwater aquatic, or marine habitats), or atmospheric environment. Sampling techniques are well known in the art (see for example Colwell, 1979; Fenical and Jenson, 1992; Giovanni et al., 1990; Griffiths et al., 1996; Stahl et al., 1985; Suzuki et al., 1997; Torsvik et al., 1994; Ward, 1990). Because of the complexity and interdependencies of many species-species interactions, a natural sample may also comprise a sample (apparently) taken from a single organism (such as plant or animal samples that may contain genetic material from more than one species due to infestation or symbiosis; see for example Currie et al., 1999). The genetic diversity contained within a natural sample will depend upon the species diversity of the sample and may vary depending upon when the sample is taken as well as where it is taken. Circannual, seasonal, and even circadian changes in the biodiversity of a natural habitat will affect the diversity of genetic material of the present invention isolated from a sample.
Preferably, a natural sample contains a multitude of species, thus increasing the total genetic diversity contained within the sample. Most preferably, the sample contains a multitude of previously uncharacterized and unknown species. As used herein, species refers to any taxonomic grouping of genetically distinct individuals. Independent of any ongoing taxonomic debate, viruses are included in the definition of species, as used herein. Species, and individuals of a species, need not be living organisms, but merely possess genetic material in the form of nucleic acid.
A DNA library, as used herein, refers to a compilation of genetic constructs, each comprising a DNA fragment stably inserted into a genetic vector. Preferably, the DNA insert-vector construct is capable of replication within a host organism. A DNA expression library refers to a DNA library wherein the DNA fragment is operably inserted into an expression vector, such that the DNA fragment is capable of being transcribed and translated into a polypeptide.
As used herein, the following abbreviations will apply: gDNA (genomic DNA); hmwDNA (high molecular weight DNA); BAC (bacterial artificial chromosome); YAC (yeast artificial chromosome); bp (base pairs); kbp (kilobase pairs); s (seconds); min (minutes); hrs (hours); rpm (revolutions per minute); RT (room temperature); °C (degrees Centigrade); eq (equivalents); M (Molar); mM (millimolar); µM (micromolar); N (Normal); mol (moles); mmol (millimoles); µmol (micromoles); nmol (nanomoles); kg (kilograms); gm (grams); mg (milligrams); µg (micrograms); ng (nanograms); l (liters); ml (milliliters); µl (microliters); vol (volumes); SDS (sodium dodecyl sulfate); EDTA (ethylenediaminetetraacetic acid); TE (Tris-EDTA); and CTAB (cetyltrimethylammonium bromide).

High Molecular Weight DNA Isolation from a Natural Sample
Procedures for isolating DNA from laboratory cell cultures, as well as stably inserting exogenous DNA into genetic vectors and host cells, are well known in the art. See, for example: Ausubel et al., Current Protocols in Molecular Biology (1988) Greene Publish. Assoc. & Wiley Interscience; Old, R.W. & S.B. Primrose, Principles of Gene Manipulation: An Introduction To Genetic Engineering (3d Ed. 1985) Blackwell Scientific Publications, Boston. Studies in Microbiology; V.2:409 pp.; Sambrook, J., et al., eds., Molecular Cloning: A Laboratory Manual (2d Ed. 1989) Cold Spring Harbor Laboratory Press, NY. Vols. 1-3; and Winnacker, E.L., From Genes To Clones: Introduction To Gene Technology (1987) VCH Publishers, NY (translated by Horst Ibelgaufts). 634 pp. These publications are incorporated herein by reference in their entirety.
DNA extraction procedures typically involve cell lysis and digestion with a combination of proteolytic enzymes and non-ionic or anionic detergents, such as SDS. The DNA is isolated from the digest with a phenol/chloroform(/isoamyl alcohol) separation treatment, to remove most of the hydrolyzed products. The DNA is then precipitated out of solution by the addition of alcohol.
The unique process of DNA extraction of the present invention, whereby hmwDNA can be isolated from a natural sample, comprises: preparing an aqueous suspension of the natural sample; gently emulsifying the suspension with an organic solvent; gently separating the aqueous, DNA-containing phase from the organic phase; and precipitating the DNA from the aqueous solution. Preferably the isolation process comprises additional steps of gently resuspending the DNA and mixing the solution with a cationic detergent; re-emulsifying the solution with an organic solvent; separating the DNA-containing aqueous solution from the organic phase; reprecipitating the DNA; resuspending the DNA; and passing the suspension through a gradient, to separate out the hmwDNA.
To obtain the successful isolation of hmwDNA from the sample, it is critical that each step of reagent addition, suspension, mixing, and separation be performed gently (with deliberate care), to reduce or to avoid physical shearing of the sample DNA. Preferably, all additions and separations to and from the sample are performed by a gentle means (e.g., pouring, or pipetting with a wide bore pipette tip). All mixing is preferably accomplished by a gentle means, whereby intermixing of solutions is accomplished with a minimal amount of solution turbulence (e.g., rocking, rolling, or rotating the solution mixture). Most preferably mixing comprises rotation of the sample.
The separation of suspension and emulsion phases is preferably accomplished by centrifugation at a speed, and for a duration, sufficient to separate the phases. Preferably, centrifugation is performed at a speed of at least about 4000 rpm for at least about 10 min.
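Because the centrifugation speed is given in rpm, the force actually applied depends on the rotor used. The sketch below applies the standard conversion RCF = 1.118 × 10^-5 × r × rpm², with r in centimeters. The 10 cm rotor radius is a hypothetical value chosen for illustration; no rotor or radius is specified in this disclosure.

```python
def rcf_from_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Convert rotor speed (rpm) to relative centrifugal force (x g).

    Uses the standard relation RCF = 1.118e-5 * r_cm * rpm**2.
    """
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius positive")
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


# Hypothetical rotor radius in cm; the disclosure does not name a rotor.
ASSUMED_RADIUS_CM = 10.0

for speed in (4000, 10000):
    print(f"{speed} rpm ~ {rcf_from_rpm(speed, ASSUMED_RADIUS_CM):.0f} x g")
```

A practitioner using a different rotor would substitute its actual radius, since the same rpm yields a different force at a different radius.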

To maximize the genetic diversity ultimately extracted from a natural sample, it is preferable to prepare an aqueous suspension of the sample, and begin emulsion with the organic solvent soon after the sample has been taken from its source. The organic solvent of choice is phenol, and it is preferably warmed above normal room temperature; preferably to a temperature of at least about 35°C, and most preferably at least about 65°C. As noted earlier, emulsion must be done gently, and should be done for a period of time sufficient to allow deproteinization. Preferably emulsion is performed for at least about 30 min. Although it is preferred that the phenol be warmed to a temperature above normal room temperature, the emulsion may be performed at room temperature.
Precipitation of DNA from an aqueous solution is best performed by the addition of alcohol to the solution in a sufficient amount, followed by gentle mixing for a time sufficient to allow for complete precipitation of DNA. Preferably, the DNA precipitation step is accomplished by the addition of isopropanol, more preferably 0.7 vol isopropanol, and the solution rotated for at least about 30 min at RT.
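The "0.7 vol" convention means 0.7 volumes of isopropanol per volume of aqueous DNA solution. A minimal Python sketch of that arithmetic follows; the function names and the 100 ml input are illustrative assumptions, not values from the procedure.

```python
def isopropanol_volume_ml(aqueous_ml: float, ratio: float = 0.7) -> float:
    """Volume of isopropanol to add for a given aqueous volume (0.7 vol default)."""
    if aqueous_ml <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return aqueous_ml * ratio


def total_volume_ml(aqueous_ml: float, ratio: float = 0.7) -> float:
    """Combined volume after the isopropanol addition, ignoring mixing contraction."""
    return aqueous_ml + isopropanol_volume_ml(aqueous_ml, ratio)


# Illustrative input: 100 ml of recovered aqueous phase (assumed value).
aqueous = 100.0
print(f"add {isopropanol_volume_ml(aqueous):.1f} ml isopropanol, "
      f"total {total_volume_ml(aqueous):.1f} ml")
```

The total is a simple sum and so slightly overstates the real volume, since alcohol and water contract somewhat on mixing.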
As additional, preferred optional steps in the process, precipitated DNA (the "DNA pellet" after centrifugation) may be washed one or more times with 70% ethanol. As a further optional step, the practitioner may choose to treat the resuspended DNA, precipitated after the first organic solvent extraction, with a digestive enzyme. Many proteolytic enzymes are known in the art (e.g., Proteinase K). A further modified process may also include sample treatment with an anionic detergent, such as SDS. These treatments are well known in the art, but not an essential feature of the present invention, and are advantageously omitted to reduce the amount of sample handling.
After the critical organic extraction and DNA precipitation, the DNA pellet is preferably resuspended into solution and treated with a cationic detergent. Cationic detergents have been shown not only to precipitate nucleic acids, but also to treat the biological sources of nucleic acid, lysing cells, and solubilizing contaminating lipids and proteins (Schneider, 1997). Commercially useful detergents are known in the art, and include cetrimonium compounds (such as cetyl pyridinium bromide and cetyltrimethylammonium bromide), and benzalkonium compounds (such as alkylbenzyldimethylammonium chlorides). Preferably the cationic detergent comprises cetyltrimethylammonium bromide (CTAB). The DNA solution is mixed with the cationic detergent for a time sufficient to allow separation of the nucleic acids from contaminating components. Preferably, mixing comprises rotating the mixture for at least about 5 min. More preferably, the mixture is incubated at a temperature of at least about 65°C after an initial 5 min rotation, followed by an additional 5 min rotation.

After treatment with a cationic detergent, the mixture is preferably subjected to a second emulsion with an organic solvent (preferably an organic solvent different from the organic solvent used for the first critical separation) to separate the nucleic acids from the detergent, other solvent residues, and remaining sample contaminants. The solvent is preferably chloroform, and treatment preferably comprises gentle mixing for at least about 30 min. Final emulsion separation of the DNA-containing aqueous phase from the organic phase preferably comprises centrifugation of the emulsion at a speed of at least about 10,000 rpm for at least about 10 min.
After final DNA precipitation, preferably using 0.7 vol isopropanol and more preferably followed by repeated ethanol washes (as described above) until the supernatant is clear, the DNA pellet may be resuspended into solution and prepared for gradient separation of the DNA on the basis of polymer size.
Gradient separation may be accomplished using any of a variety of techniques well known in the art (e.g., electrophoresis, chromatography, density gradient separation). Preferably the gradient separation comprises a density gradient (such as cesium chloride gradients or sucrose gradients known in the art), and most preferably comprises passing the DNA solution through a sucrose gradient. In this preferred embodiment hmwDNA is isolated, and can be removed from the gradient by fractionation or direct extraction.
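The preferred minimum conditions described in the preceding paragraphs can be collected into a simple checklist. The Python sketch below is one possible encoding; the `Step` class, the step names, and the `shortfalls` helper are hypothetical constructs introduced for illustration, and the numbers are the "at least about" values from the text treated as hard minimums.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One isolation step with the preferred minimums stated in the text."""
    name: str
    min_minutes: Optional[float] = None
    min_rpm: Optional[float] = None      # centrifuge speed, where applicable
    min_temp_c: Optional[float] = None   # preferred temperature, where stated


# Preferred (not mandatory) minimums, in the order the steps are described.
PREFERRED_STEPS = (
    Step("phenol emulsion (gentle rotation)", min_minutes=30, min_temp_c=35),
    Step("phase separation (centrifugation)", min_minutes=10, min_rpm=4000),
    Step("isopropanol precipitation (rotation)", min_minutes=30),
    Step("CTAB treatment (rotation)", min_minutes=5),
    Step("chloroform emulsion (gentle mixing)", min_minutes=30),
    Step("final phase separation (centrifugation)", min_minutes=10, min_rpm=10000),
)


def shortfalls(step: Step, *, minutes: float,
               rpm: Optional[float] = None,
               temp_c: Optional[float] = None) -> List[str]:
    """List the preferred minimums that a planned run of `step` does not meet."""
    problems = []
    if step.min_minutes is not None and minutes < step.min_minutes:
        problems.append(f"{step.name}: {minutes} min < {step.min_minutes} min")
    if step.min_rpm is not None and (rpm is None or rpm < step.min_rpm):
        problems.append(f"{step.name}: rpm {rpm} < {step.min_rpm}")
    if step.min_temp_c is not None and (temp_c is None or temp_c < step.min_temp_c):
        problems.append(f"{step.name}: temperature {temp_c} < {step.min_temp_c} C")
    return problems


# Example: a final spin at only 4000 rpm falls short of the preferred 10,000 rpm.
print(shortfalls(PREFERRED_STEPS[5], minutes=10, rpm=4000))
```

Because the text also allows the phenol emulsion at room temperature, a reported temperature shortfall for that step signals a departure from the preferred conditions, not an invalid procedure.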
Construction of a High Molecular Weight DNA Library
According to the present invention, DNA libraries comprising hmwDNA extracted from natural samples will greatly facilitate analysis of the genetic diversity of the natural environment for both academic and commercial purposes. DNA libraries derived from natural samples provide an invaluable tool for research and development into novel biochemicals useful in a variety of applications (e.g., medical, industrial, commercial) as discussed earlier. It is an object of the present invention to provide a library of hmwDNA fragments isolated from a natural sample.
Having isolated hmwDNA fragments as described above, construction of a DNA library is well known in the art (see the references previously cited and incorporated). The extracted DNA fragments are inserted into a cloning vector (including but not limited to expression vectors) of choice. A variety of well known techniques are available to the practitioner for the successful incorporation of a DNA fragment into a vector (including, but not limited to, blunt end ligation, linker ligation, homopolymeric tailing, restriction digestion). Preferably, the DNA insert-vector construct is stable and capable of replication of the DNA insert, to provide for long-term storage and amplification of the isolated genetic information. The cloning vector may possess any of a number of characteristics useful for genetic engineering. If the vector of choice is an expression vector (i.e., a genetic construct wherein the DNA fragment is operably incorporated into a vector to allow transcription and translation of the DNA insert), expression regulatory regions (e.g., promoter regions and start codons) may be provided by either the vector DNA, the insert DNA, or separately inserted by the practitioner. It is a requirement of the present invention that the vector is capable of incorporating hmwDNA.
A variety of genetic vectors may be used including, but not limited to: plasmids, cosmids, phagemids, modified viruses, shuttle vectors, and artificial chromosomes (e.g., YACs and BACs) (see Figure 2). Vectors may include other features useful in the manipulation and analysis of the DNA insert, including, but not limited to: DNA linkers, restriction nuclease sites, high copy origins of replication, insertion sequences, and indicator and/or selectable markers, the use of which are well known in the art. Vectors used in the present invention may be constructed by the practitioner, using techniques well known in the art, or may be commercially purchased (e.g., from Boehringer Mannheim Corp., Indianapolis, IN; Life Technologies, Inc., Rockville, MD; New England Biolabs, Inc., Beverly, MA; Pharmacia LKB Biotechnology, Inc., Piscataway, NJ; Stratagene, La Jolla, CA).
Preferably the vector libraries of the present invention can be stably introduced and maintained in a host organism for the purposes of DNA insert replication, and more preferably, DNA insert expression. The host organism may be any cell type from any living system: these include species from; Eubacteria, Archaebacteria, Protista, Plantae, Fungi, and Animalia. Recombinant host cell systems from each of these taxonomic Kingdoms, and a multiplicity of techniques for the incorporation of foreign DNA into different cell types are well known in the art; including, but not limited to; biolistic transfer; conjugation, electroporation, infection, liposome-mediated transfer, microinjection, protoplast fusion, transfection, and transformation.
In one embodiment of the present invention, hmwDNA extraction from a natural sample is incorporated into an expression vector within a host cell capable of expressing one or more polypeptide(s) encoded by the hmwDNA insert. The multiplicity of novel polypeptides, and/or their biochemical products produced by expression library can then be screened for a chemical property or activity of interest. Standard protocols exist, and are obviously modified, for screening DNA libraries and the products produced therefrom for novel compounds of some desired chemical or biological characteristic. A wide range of selection parameters are well known in the art, including but not limited to various biological selection regimes (e.g., cell or phage proliferation in the presence or absence of a compound, physiological marker systems), physical selection regimes (e.g., various cell sorting regimes), and chemical activity selection regimes (e.g., chromatographic separation).

It will be readily apparent to those of ordinary skill in the art that a wide variety of modifications, adaptations, and applications of the present method of hmwDNA extraction from a natural sample as described for the first time herein, as well as a DNA library generated therefrom, are obvious and may be made without departing from the scope of the invention or the disclosed embodiments thereof. Having now described the present invention in detail, the same is demonstrated by reference to the following examples, which are included herewith for purposes of illustration only, and are not intended to be limiting of the invention in any way.
EXAMPLE 1: Recovery of High Molecular Weight DNA from a Natural Sample
To demonstrate that hmwDNA can be isolated directly from a natural sample without the necessity of culturing, or otherwise pretreating, the genetic organisms contained within the sample, the methods of the present invention were applied to a natural soil sample taken from a local site.
A 50 ml soil sample was suspended in a solution of 25mM Tris 8.0, 150mM NaCl, 25mM EDTA (Buffer I) to a final volume of 175 ml. After complete suspension of the soil, 50 ml of equilibrated 65°C phenol was added to the soil suspension and emulsified by rotation (10-15 rpm) for 30 min RT. The emulsion was centrifuged @ 4000 rpm for 20 min. The aqueous phase was gently poured into a clean vessel, 0.7 vol isopropanol gently added to the aqueous solution (final volume ~150 ml), and rotated for an additional 30 min RT. The mixture was centrifuged @ 4000 rpm for 20 min and the supernatant discarded.
The precipitated DNA "pellet" was further washed once with 70% ethanol and dried. The DNA extract was resuspended in 6 ml of Buffer I plus 600 µl of 5M NaCl (i.e., gentle rotation for ~10 min), and 6 ml of 65°C 2% CTAB (in 2M NaCl) was added. The mixture was rotated for 5 min, incubated @ 65°C for 10 min, and rotated an additional 5 min. Chloroform (6 ml) was added to the solution and rotated a further 30 min. The solution was then centrifuged @ 10,000 rpm for 10 min.
The aqueous phase was transferred into a clean vessel using a wide-bore 1 ml pipette, 0.7 vol isopropanol gently added to the aqueous solution, and rotated for 30 min RT. The DNA precipitate was allowed to settle to the bottom of the vessel, and the supernatant gently poured off. The DNA precipitate was washed repeatedly with 15 ml aliquots of 70% ethanol until the supernatant was clear. Final removal of ethanol was followed by resuspension of the DNA in 1 ml TE (the DNA precipitate was allowed to self-resuspend for ~1 hr). The DNA solution was centrifuged @ 10,000 rpm for 5 min, and any residual particulates removed.
The DNA solution was loaded onto a 32 ml sucrose gradient (in a 1" x 3.5" Ultra-Clear Beckman centrifuge tube), comprising 8 ml steps of 20%, 30%, 40%, and 50% sucrose in TE.

The gradient was ultracentrifuged @ 28,000 rpm for 21 hrs (without braking). The gradient was eluted in 2 drop increments into a 96-well plate. Every other elution fraction was run on a pulsed-field electrophoresis gel.
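The spin conditions in this example are stated in rpm, which only fixes the applied force once the rotor radius is known. As a minimal sketch, the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm) converts each speed to a relative centrifugal force. The rotor radii below are hypothetical placeholders, because the text does not name the rotors used.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMED rotor radii in cm. None of these values appears in the text;
# substitute the radius of the rotor actually used.
ASSUMED_RADIUS_CM = {
    "low_speed": 10.0,
    "high_speed": 8.0,
    "ultra": 16.1,
}

# (step description, rpm from the text, key into ASSUMED_RADIUS_CM)
SPINS = [
    ("phenol emulsion, 20 min", 4000, "low_speed"),
    ("CTAB/chloroform extract, 10 min", 10000, "high_speed"),
    ("sucrose gradient, 21 hrs", 28000, "ultra"),
]

for label, rpm, rotor in SPINS:
    radius = ASSUMED_RADIUS_CM[rotor]
    print(f"{label}: {rpm} rpm at r = {radius} cm -> {rcf_from_rpm(rpm, radius):,.0f} x g")
```

Reporting the force alongside the speed makes the protocol transferable between rotors of different geometry.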
Recovered DNA fragments, including hmwDNA, from the soil sample ranged in size from 50 kbp to 400 kbp (Fig. 1). DNA yield was approximately 1 microgram (µg) per gram of soil; however, the yield can vary depending on soil type (clay vs. sandy) as well as the location and time of sampling.
These results provide the first demonstration of hmwDNA efficiently extracted from a natural sample containing DNA without prior amplification of the DNA, or otherwise pre-treating the sample prior to actual DNA extraction.
EXAMPLE 2: Construction of a DNA Expression Library from a Natural Sample
To demonstrate that hmwDNA extracted directly from a natural sample can be inserted stably into an expression vector, whereby the compilation of individual recombinant vectors represents a library of diverse DNA fragments derived from that natural sample, a DNA library was generated using hmwDNA extracted from a soil sample (as described in Example 1 above), inserted into a bacterial artificial chromosome (BAC), and transformed into E. coli. A composite hmwDNA library (~15,000 clones) in pBTP2 (a modified BAC-based vector) was created, comprising 3 DNA sub-libraries. The construction was accomplished by digesting hmwDNA (separated by sucrose gradient centrifugation) with Hind III (see Fig. 1). The digested DNA was size purified by pulsed-field electrophoresis (DNA)(Bio-Rad, Hercules, CA). Three separate size ranges were excised: 50-100 kbp, 100-150 kbp and 150-200 kbp fragments. The fragments were further purified on a second pulsed-field gel to remove residual small molecular weight DNA fragments trapped in the gel matrix, because smaller DNA fragments tend to ligate more efficiently than hmwDNA. The gel slices were dialysed against TE and digested with gelase (Bio-Rad) to dissolve the agarose. Ligation was performed using standard reagents (T4 ligase and ATP) and 25 ng of vector at a 10:1 vector to insert molar ratio. The ligation was drop dialysed against TE and transformed into E. coli (DH10B). Results of the library construct are provided in Table 1.

This 15,000 member library was constructed from a sub-sample of the soil-extracted hmwDNA. Extrapolating to the total 500 µg sample obtained from a single 400 g soil sample, total library constructs would approximate 10 clones.
These results demonstrate that hmwDNA extracted directly from a natural sample can be stably cloned into an expression vector for storage, amplification, and future analysis of the genetic material incorporated therein.
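The ligation in this example is specified by vector mass (25 ng) and a 10:1 vector to insert molar ratio. The sketch below shows how such a ratio translates into an insert mass for a given fragment size. The vector length is a hypothetical placeholder, since the size of pBTP2 is not stated here, and the insert sizes are simply the midpoints of the three excised ranges.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   vector_to_insert_ratio: float) -> float:
    """Insert mass (ng) that gives the requested vector:insert molar ratio.

    Moles of double-stranded DNA are proportional to mass / length, so the
    average mass per base pair cancels out of the calculation.
    """
    vector_molar_units = vector_ng / vector_bp
    insert_molar_units = vector_molar_units / vector_to_insert_ratio
    return insert_molar_units * insert_bp


# ASSUMPTION: illustrative vector length only; not taken from the text.
ASSUMED_VECTOR_BP = 7_500

VECTOR_NG = 25.0           # from the text
VECTOR_TO_INSERT = 10.0    # from the text (10:1 vector to insert)

# Midpoints of the 50-100, 100-150 and 150-200 kbp size ranges.
for insert_bp in (75_000, 125_000, 175_000):
    mass = insert_mass_ng(VECTOR_NG, ASSUMED_VECTOR_BP, insert_bp, VECTOR_TO_INSERT)
    print(f"{insert_bp // 1000} kbp insert: {mass:.1f} ng")
```

Because large inserts carry far more mass per molecule than the vector, even a molar excess of vector corresponds to a comparable or larger mass of insert DNA.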
EXAMPLE 3: Diversity of a DNA library isolated from a Natural Sample
To demonstrate that hmwDNA expression libraries derived from a natural sample possess gDNA from a diversity of species contained within the natural sample, the genetic diversity of a hmwDNA expression library derived from a soil sample (as described in Example 2) was subjected to phylogenetic analysis.
A sample of soil hmwDNA was subjected to PCR analysis using primers homologous to sequences found within small subunit (16S) rRNA. The use of rRNA sequences to determine species diversity and to establish phylogenetic relationships is well known in the art (for review see Hugenholtz, 1998). The use of degenerate oligos allows for the amplification of many different bacterial families. Over 200 16S DNA fragments were cloned and sequenced. The results indicate a wide diversity of sequences categorized within known families from around the world. The majority of sequences, however, could not be assigned to known bacterial families, presumably representing unknown bacterial strains. Figure 3 diagrams a phylogenetic tree based on sequences obtained from soil hmwDNA. Incorporated into the diagram are bacterial representatives of divergent taxonomic families, including several from published reports of 16S subunit sequence analysis of DNA obtained from soil.
These results illustrate the genetic diversity of hmwDNA directly obtainable from a natural sample by the method of the present invention, and further demonstrate the capability of the present invention to extract and utilize previously unknown genetic information contained within natural samples.
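The 16S amplification in this example relies on degenerate oligos, i.e., primers in which some positions stand for more than one base. As a minimal sketch of what that means in practice, the code below expands an oligo written with IUPAC ambiguity codes into every concrete sequence it represents. The example primer is a widely used bacterial 16S forward primer chosen purely for illustration; the primers actually used are not listed in the text.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences in the primer pool."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


# ASSUMPTION: illustrative primer only, not taken from the text.
EXAMPLE_PRIMER = "AGAGTTTGATCMTGGCTCAG"

for seq in expand_degenerate(EXAMPLE_PRIMER):
    print(seq)
print("degeneracy:", degeneracy(EXAMPLE_PRIMER))
```

A higher degeneracy broadens the range of templates that can be amplified, at the cost of a lower effective concentration of each individual primer sequence.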

EXAMPLE 4: Screening of a DNA Expression Library Obtained from a Natural Sample
To demonstrate that hmwDNA expression libraries derived from a natural sample can be screened for some selected physiological characteristic, chemical or physical property, or biological activity, a hmwDNA expression library derived from a soil sample (as described in Example 2) was screened for a variety of activities (see Table 2).

These screening assays isolated a variety of clones from the soil hmwDNA library; e.g., clones capable of expressing known compounds such as indirubin and indole.
As one example, the screen for antibacterial activity against a sensitive strain of Bacillus subtilis revealed three separate and independent antibacterial clones. One of the clones, which also expresses a brown pigment, was analyzed further (data not shown). An organic molecule, encoded by the soil DNA insert of this clone, has been identified as the source of the antibacterial activity. Sequence analysis of the soil hmwDNA insert fails to link the compound, or its source, to any currently known antibiotic or bacterial strain.
EXAMPLE 5: Genetic and Chemical Analyses of a DNA Expression Library Isolate Obtained from a Natural Sample
To demonstrate that DNA isolates of a DNA expression library derived from a natural sample can be genetically manipulated for detailed analysis of the DNA isolate and the polypeptide(s) it encodes, one of the three antibacterial clones isolated and screened in Example 4 above (clone MG1.1) was subcloned and further analyzed. Clone MG1.1 (insert size of 27 kb), which produces a purple pigment, exhibits antibacterial activity against B. subtilis and S. aureus. Upon confirmation that the genetic information responsible for these phenotypes was plasmid-encoded, the isolate was further analyzed.
For further genetic (including sequence information) and biochemical analysis of the MG1.1 isolate, transposon mutagenesis using pTRANS was employed. pTRANS is a method for characterizing clones expressing heterologous activities by transposon mutagenesis and DNA sequencing using plasmid pTRANS-sacB. Briefly, plasmid pTRANS-sacB contains the Tn7-based transposon TRANS (derived from plasmid pGPS1, New England BioLabs, Beverly, MA), a ColE1 origin of replication and a kanamycin resistance gene. In vitro transposition of TRANS allows for random insertion of the ColE1 ori into a target BAC plasmid, increasing its copy number and thereby facilitating plasmid DNA isolation, sequencing, and (occasionally) expression. Plasmid pTRANS-sacB also encodes the Bacillus subtilis sacB gene in the vector portion of the plasmid, allowing for its counterselection in the presence of 5% sucrose.
Transposition reactions were performed following the published protocol for pGPS1, followed by transformation of electrocompetent E. coli strain DH10B with 5 µl of the transposon reaction and selection of transformants on LB plates containing kanamycin (50 µg/ml), chloramphenicol (10 µg/ml), and sucrose (5%). The resulting transformants contained multicopy BAC plasmids with TRANS insertions. Transformants that lost the heterologous activity contain TRANS insertions in soil DNA sequences encoding that activity.
For sequencing, plasmid DNA was isolated using the Qiagen Biorobot 9600 (Qiagen, Inc., Valencia, CA) according to the manufacturer's instructions, sequenced using the ABI Big Dye sequencing kit, and run on an ABI 377 DNA sequencer. Bases were assigned using the Unix program Phred, the data were assembled using Phrap, and edited using Consed (University of Washington, Seattle, WA).
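Phred assigns each called base a quality score Q related to the estimated probability P that the call is wrong by Q = -10 log10(P). The following minimal sketch shows that conversion in both directions, plus the expected number of miscalled bases in a read. The example quality values are invented for illustration and are not data from this work.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Estimated probability that a base call with quality q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for an estimated error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def expected_errors(qualities: list[int]) -> float:
    """Expected number of miscalled bases across a read."""
    return sum(phred_to_error_prob(q) for q in qualities)


# Hypothetical per-base qualities for a short stretch of a read.
example_qualities = [10, 20, 30, 40]

for q in example_qualities:
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
print(f"expected errors: {expected_errors(example_qualities):.4f}")
```

Downstream assembly and editing tools can use these scores to weight conflicting base calls when overlapping reads disagree.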
Two ORFs responsible for pigment production were identified; they encode amino acid sequences similar to monooxygenases (Fig. 4). Monooxygenases are a family of enzymes shown to produce indole-related compounds in other organisms by catalyzing the incorporation of molecular oxygen (Yen et al., 1991; O'Connor et al., 1997). These two genes were shown to be sufficient for production of both pigments and antibacterial activity, as subcloning and transfer of the two genes to a new host strain resulted in the production of activities identical to those of the original MG1.1 clone.
The antimicrobial activity and the pigments associated with MG1.1 were extractable with organic solvents. Thin layer chromatography (TLC) analysis of extracts yielded both pink and blue pigments. Each pigment exhibited weak antibacterial activity. Both pigments had a MW = 262 as determined by mass spectrometry (MS).
Genetic analysis, indicating that indole-monooxygenase genes were responsible for pigment production (Fig. 4), combined with MW determination, suggests that the pigments could be indirubin and indigo blue, structural isomers previously shown to be co-produced in several microorganisms (Hart et al., 1992; Eaton and Chapman, 1995). Samples of indirubin, an antileukemic drug known to inhibit tyrosine kinases (Han, 1994; Hoessel et al., 1999), and indigo blue were obtained (Sigma) as standards. In TLC analysis, the standards comigrated with the unknown pigments purified from MG1.1 (indirubin/pink pigment Rf=0.3, indigo blue/blue pigment Rf=0.58). One dimensional nuclear magnetic resonance (NMR) analysis of the pink pigment confirmed its identity as indirubin (Fig. 5).
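The measured MW of 262 can be checked against the proposed identities with simple arithmetic. The sketch below assumes the standard molecular formula C16H10N2O2, which is shared by the two structural isomers indirubin and indigo; the formula is not stated in the text and is supplied here as an assumption, along with standard average atomic masses.

```python
# Standard average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_weight(composition: dict[str, int]) -> float:
    """Sum of atomic masses for an elemental composition {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# ASSUMPTION: C16H10N2O2, the formula shared by indirubin and indigo.
INDIGOID_COMPOSITION = {"C": 16, "H": 10, "N": 2, "O": 2}

MEASURED_MW = 262  # value reported by MS in the text

calculated = molecular_weight(INDIGOID_COMPOSITION)
print(f"calculated MW: {calculated:.3f}")
print(f"agrees with measured MW: {round(calculated) == MEASURED_MW}")
```

Since isomers share a formula, mass alone cannot distinguish the pink pigment from the blue one, which is why the TLC comigration and NMR data are needed for the assignment.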
TLC analysis of extracts revealed additional, nonpigmented antibacterial molecule(s) that were more potent (B. subtilis was sensitive to M Mg/ml in LB liquid culture). Initial MS and NMR analyses suggest that E. coli clone MG1.1 produces a family of related molecules with antimicrobial activity, encoded by the soil DNA insert. One such member identified by NMR (data not shown) is 2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine (Figure 5C). This molecule has been chemically synthesized (Bocchi and Palla, 1986; Legall et al., 1988; Ishii et al., 1988), but never isolated from a natural source. Derivatized indol dimers could also be isolated following the methods described above.
A summary of the four antibacterial activities identified in the MG1 library is presented in Table 3.
Table 3. Antibacterial activities detected in soil DNA library MG1

*+, pTRANS insertions into gene(s) encoding antibacterial activity were obtained, leading to identification of the encoding DNA sequence; -, no pTRANS knockouts were obtained. +, pTRANS insertions into BAC plasmid outside of genes encoding antibacterial activity, leading to amplification of plasmid copy number and antibacterial gene expression. Information obtained from DNA sequence analysis and/or cell fractionation/extraction, biochemical, and analytical chemistry procedures.
The above described Examples demonstrate that a hmwDNA library can be generated from a natural sample, screened for selected properties; and desired clones isolated for further growth, manipulation, and analysis. These results provide the first demonstration of the screening of a hmwDNA library derived from a natural sample, the identification of biochemicals, and the isolation of clones containing the genetic information, which encode novel biochemicals.

Cited References
Each of the publications mentioned herein above and below is incorporated by reference.
Abelson, "Medicine from Plants", Science 247:513 (1990).
Berdy et al., "Search and Discovery Methods for Novel Antimicrobials", In Bioactive Metabolites From Micro-Organisms, pp. 3-25, ME Bushell, U Grafe, eds., Elsevier, Amsterdam (1982).
Bintrim et al., "Molecular phylogeny of Archaea from soil", PNAS USA 94:277-282 (1997).
Bocchi and Palla, Tetrahedron 42:5019-5024 (1986).
Colwell, "Human pathogens in the aquatic environment", pp. 337-344, In Colwell and Foster (eds.) Aquatic Microbial Ecology. University of Maryland Sea Grant, College Park, MD (1979).
Currie et al., "Fungus-Growing Ants Use Antibiotic-Producing Bacteria to Control Garden Parasites", Nature 398:701-704 (1999).
Eaton and Chapman, J. Bacteriol. 177:6983-6988 (1995).
Fenical and Jenson, "Marine Microorganisms: A New Biomedical Resource", Advances in Marine Biotechnology, vol. I: Pharmaceutical and Bioactive Natural Products, pp. 419-457, D. Attaway, O. Zaborsky, eds., Plenum Press, New York (1992).
Franco et al., "Detection of Novel Secondary Metabolites", In Critical Reviews in Biotechnology, vol. 11(3):193-276 (1991).
Giovannoni et al., "Genetic diversity in Sargasso Sea bacterioplankton", Nature 345:60-63 (1990).
Goodfellow et al., In Microbial Products: New Approaches, pp. 343-383, Cambridge University Press (1989).
Griffiths et al., "Broad-Scale approaches to the determination of soil microbial community structure: application of the community DNA hybridization technique", Microbial Ecol. 31:269-280 (1996).
Han, Stem Cells (Dayt) 12:53-63 (1994).
Hart et al., J. Gen. Microbiol. 138:211-216 (1992).
Hoessel et al., Nature Cell Biol. 1:60-67 (1999).
Hugenholtz et al., "Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity", J. Bacteriol. 180:4765-4774 (1998).
Ishii et al., J. Chem. Soc. Perkin Trans. 1:2387-2395 (1988).
Legall et al., Int. J. Pept. Protein Res. 32:279-291 (1988).
Malpartida et al., "Molecular Cloning of the Whole Biosynthetic Pathway of a Streptomyces Antibiotic and its Expression in a Heterologous Host", Nature 309:462-464 (1984).
Murdock et al., "Construction of Metabolic Operons Catalyzing the De Novo Biosynthesis of Indigo in Escherichia Coli", Bio/Technology 11:381-385 (1993).
Rosenberg, "Microbial surfactants", Crit. Rev. Biotechnol. 3:109-132 (1986).
O'Connor et al., Appl. Environ. Microbiol. 63:4287-4291 (1997).
Pace, "A molecular view of microbial diversity and the biosphere", Science 276:734-740 (1997).
Schulz et al., "Dense Populations of a Giant Sulfur Bacterium in Namibian Shelf Sediments", Science 284:493-495 (1999).
Schneider, Extraction of Genomic DNA from Blood Using Cationic Detergents, U.S. Pat. No. 5,596,092 (1997).
Stahl et al., "Characterization of a Yellowstone hot spring microbial community by 5S rRNA sequences", Appl. Environ. Microbiol. 49:1379-1384 (1985).
Suffness et al., In Biomedical Importance of Marine Organisms, pp. 151-157, Calif. Acad. of Sciences (1988).
Suzuki et al., "Bacterial diversity among small-subunit rRNA gene clones and cellular isolates from the same seawater sample", Appl. Environ. Microbiol. 63:983-989 (1997).
Torsvik et al., "Comparison of phenotypic diversity and DNA heterogeneity in a population of soil bacteria", Appl. Environ. Micro. 56:782-787 (1990).
Torsvik et al., In Ritz and Giller (eds.) Beyond the Biomass: Compositional and Functional Analysis of Soil Microbial Communities. John Wiley and Sons, Chichester (1994).
Vining, "Functions of secondary metabolites", Annu. Rev. Microbiol. 44:395-427 (1990).
Ward et al., "16S rRNA sequences reveal numerous uncultured microorganisms in a natural community", Nature 345:63-65 (1990).
Yen et al., J. Bacteriol. 173:5315-5327 (1991).

We Claim:
1. A method for isolating high molecular weight DNA from a plurality of species in a sample, comprising the steps of:
(a) preparing an aqueous suspension of said sample;
(b) forming an extraction mixture by adding to the aqueous suspension an appropriate organic solvent under suitable conditions to remove undesired materials from the suspension while retaining part or all of the high molecular weight DNA;
(c) separating the extraction mixture into an aqueous phase and an organic phase, wherein the aqueous phase contains DNA; and
(d) precipitating the DNA from the aqueous solution.
2. The method of claim 1, further comprising the steps of:
(a) gently mixing the aqueous solution with a cationic detergent to form an aqueous mixture; and
(b) gently extracting the aqueous mixture with an organic solvent to yield an enriched aqueous solution.

3. The method of claim 2, wherein the cationic detergent is cetyltrimethylammonium bromide (CTAB).
4. The method of claim 1, 2 or 3, further comprising the steps of:

(a) concentrating the DNA in the enriched aqueous solution;
(b) passing the concentrated DNA through a density gradient to separate DNA molecules from each other by size; and
(c) isolating high molecular weight DNA from said gradient.

5. The method of any of claims 1-3, wherein the sample comprises soil.
6. The method of claim 4, wherein the sample comprises soil.
7. The method of claim 1, wherein the organic solvent is phenol at a temperature of >50°C.
8. The method of any of claims 2-4, further comprising the step of adding a proteinase to the DNA in a sufficient amount and under appropriate conditions permitting degradation of proteins.
9. The method of claim 4, wherein said density gradient is a CsCl gradient or a sucrose gradient.
10. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA.
11. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA isolated by the method of claim 1.
12. A high molecular weight DNA library comprising a set of vectors containing DNA inserts derived from a plurality of species, a plurality of which inserts being high molecular weight DNA which are operably linked to a transcription control element which regulates their expression in a host cell.
13. The high molecular weight DNA library of claim 10, 11 or 12, wherein said vector is selected from the group consisting of plasmids, cosmids, phagemids, modified viral vectors, and artificial chromosomes.
14. The high molecular weight DNA library of any of claims 10-12, wherein said vector is a shuttle vector.
15. The high molecular weight DNA library of claim 10, wherein said high molecular weight DNA is obtained from a soil sample.
16. Host cells containing the high molecular weight DNA library of claim 10.
17. Host cells of claim 16 selected from the group consisting of eubacterial cells, fungal cells and animal cells.

18. Host cells of claim 17 wherein said host cells are E. coli, yeast or mammalian cells.
19. A method for making a high molecular weight DNA library containing DNA inserts derived from a plurality of species, comprising inserting high molecular weight DNAs from a plurality of species into a vector.
20. The method of claim 19, wherein said vector is selected from the group consisting of plasmids, cosmids, phagemids, viral vectors, and artificial chromosomes.
21. The method of claim 19, wherein said vector is a shuttle vector.
22. The method of claim 19, wherein said high molecular weight DNA is obtained from a soil sample.
23. The method of claim 19, which further comprises introducing the library into host cells.
24. The method of claim 23, wherein said host cells are selected from the group consisting of eubacterial cells, fungal cells and animal cells.
25. The method of claim 24 wherein said host cells are E. coli, yeast or mammalian cells.
26. A method for identifying a compound with a biological activity of interest produced by cells containing a member of a high molecular weight DNA library, wherein said library contains high molecular weight DNA inserts from a plurality of species, comprising the steps of:

(a) selecting an assay that detects the biological activity of interest;
(b) screening said cells containing members of the high molecular weight DNA library for the presence of said biological activity; and
(c) correlating the biological activity with the presence of the compound.
27. A method for identifying a member of a high molecular weight DNA library associated with production of a compound with a biological activity of interest, wherein said library contains high molecular weight DNA inserts from a plurality of species, comprising the steps of:
(a) selecting an assay that detects the biological activity of interest;

(b) screening said cells containing members of the high molecular weight DNA library for the presence of said biological activity; and
(c) correlating the biological activity with the presence of the member of the high molecular weight DNA library.
28. A method for producing a compound of interest, comprising the steps of:
(a) identifying a member of a high molecular weight DNA library associated with production of a compound with a biological activity of interest, using the method of claim 26; and
(b) culturing cells containing and capable of expressing the member of the library so identified under conditions permitting production of the compound.

29. The method of claim 28 wherein said compound of interest is a derivatized indol dimer.
30. The method of claim 29 wherein said indol dimer is 2-(2,2-bis-(1H-indol-3-yl)-ethyl)-phenylamine.

31. A method for isolating high molecular weight DNA from a plurality of species in a sample, substantially as hereinabove described and illustrated with reference to the accompanying drawings.


Patent Number	213909
Indian Patent Application Number	IN/PCT/2002/5/CHE
PG Journal Number	13/2008
Publication Date	31-Mar-2008
Grant Date	23-Jan-2008
Date of Filing	01-Jan-2002
Name of Patentee	AVENTIS PHARMACEUTICALS, INC
Applicant Address	300 Somerset Corporate Boulevard Bridgewater, New Jersey 08807-2854

Inventors:

#	Inventor's Name	Inventor's Address
1	MCNEIL, Ian	23 Oak Road Milton, MA 02186,
2	LYNCH, Berkley, A	129 Franklin Street, Apartment 224 Cambridge, MA 02139,
3	LOIACONO, Kara, A	44 Elmwood Avenue Salem, NH 03079,
4	TIONG, Choi, Lai	278 W. Squantum Street Quincy, MA 02171,
5	MINOR, Charles, A	10 Gaslight Drive #11 S. Weymouth, MA 02190,
6	OSBURNE, Marcia, S	107 Cedar Street Lexington, MA 02421,
7	GROSSMAN, Trudy, H	18 Hathaway Road Lexington, MA 02420,
8	AUGUST, Paul, R	50 Emerald Dr. Danville, NH 03819,

PCT International Classification Number	C07D 209/1
PCT International Application Number	PCT/US2000/015306
PCT International Filing date	2000-06-01

PCT Conventions:

#	PCT Application Number	Date of Convention	Priority Country
1	60/191,601	2000-03-23	U.S.A.
2	60/137,065	1999-06-02	U.S.A.