Title of Invention  MULTIPLE-POINT STATISTICS (MPS) SIMULATION WITH ENHANCED COMPUTATIONAL EFFICIENCY 

Abstract  An enhanced multiple-point statistics (MPS) simulation is disclosed. A multiple-grid simulation approach is used which has been modified from a conventional MPS approach to decrease the size of the data search template, saving a significant amount of memory and CPU time during the simulation. Features used to decrease the size of the data search template include: (1) using intermediary subgrids in the multiple-grid simulation approach, and (2) selecting a data template that is preferentially constituted by previously simulated nodes. The combination of these features saves a significant amount of memory and CPU time over previous MPS algorithms, yet ensures that large-scale training structures are captured and exported to the simulation exercise. 
Full Text  MULTIPLE-POINT STATISTICS (MPS) SIMULATION WITH ENHANCED COMPUTATIONAL EFFICIENCY CROSS-REFERENCE TO RELATED PATENT APPLICATIONS This application incorporates by reference all of the following copending applications: "Method for Creating Facies Probability Cubes Based Upon Geologic Interpretation," Attorney Docket No. T6359, filed herewith. "Method for Making a Reservoir Facies Model Utilizing a Training Image and a Geologically Interpreted Facies Probability Cube," Attorney Docket No. T6404, filed herewith. BACKGROUND OF THE INVENTION A traditional geostatistical workflow to model hydrocarbon reservoirs consists in modeling facies, then populating each facies with petrophysical properties, typically porosity and permeability, using variogram-based algorithms. Because the variogram is only a two-point measure of spatial variability, variogram-based geostatistics do not allow modeling realistic curvilinear or other geometrically complex facies patterns, such as meandering sand channels, which are critical for connectivity and flow performance assessment. A more recent modeling approach, referred to as multiple-point statistics simulation, or MPS simulation, has been proposed by Guardiano and Srivastava, Multivariate Geostatistics: Beyond Bivariate Moments, in Soares, A., ed., Geostatistics Troia: Kluwer, Dordrecht, V. 1, p. 133-144, (1993). MPS simulation is a reservoir facies modeling technique that uses conceptual geological models as 3D training images to generate geologically realistic reservoir models. The training images provide a conceptual description of the subsurface geological bodies, based on well log interpretation and general experience in reservoir architecture modeling. MPS simulations extract multiple-point patterns from the training image and anchor the patterns to the well data. Numerous other publications have been published regarding MPS and its application. Caers, J. 
and Zhang, T., 2002, Multiple-Point Geostatistics: A Quantitative Vehicle for Integrating Geologic Analogs into Multiple Reservoir Models, in Grammer, G.M. et al., eds., Integration of Outcrop and Modern Analog Data in Reservoir Models: AAPG Memoir. Strebelle, S., 2000, Sequential Simulation Drawing Structures from Training Images: Doctoral Dissertation, Stanford University. Strebelle, S., 2002, Conditional Simulation of Complex Geological Structures Using Multiple-Point Statistics: Mathematical Geology, V. 34, No. 1. Strebelle, S., Payrazyan, K., and Caers, J., 2002, Modeling of a Deepwater Turbidite Reservoir Conditional to Seismic Data Using Multiple-Point Geostatistics, SPE 77425 presented at the 2002 SPE Annual Technical Conference and Exhibition, San Antonio, Sept. 29-Oct. 2. Strebelle, S. and Journel, A., 2001, Reservoir Modeling Using Multiple-Point Statistics: SPE 71324 presented at the 2001 SPE Annual Technical Conference and Exhibition, New Orleans, Sept. 30-Oct. 3. SNESIM (Single Normal Equation Simulation) is an MPS simulation program which is particularly well known to those skilled in the art of facies and reservoir modeling. In particular, SNESIM simulation is described in detail in Strebelle, S., 2000, Sequential Simulation of Complex Geological Structures Using Multiple-Point Statistics, doctoral thesis, Stanford University, and Strebelle, S., 2002, Conditional Simulation of Complex Geological Structures Using Multiple-Point Statistics: Mathematical Geology, V. 34, No. 1. The basic SNESIM code is also available at the website http://pangea.stanford.edu/strebell/research.html. Also included at the website is a PowerPoint presentation snesimtheory.ppt which provides the theory behind SNESIM, and includes various case studies. PowerPoint presentation snesimprogram.ppt provides guidance through the underlying SNESIM code. Again, these publications are well known to facies modelers who employ multiple-point statistics in creating facies and reservoir models. 
These publications of Strebelle are hereby incorporated in their entirety by reference. Experience has shown that the MPS simulation program SNESIM reproduces training patterns reasonably well. However, SNESIM is significantly more CPU-demanding than a comparable variogram-based simulation program, SISIM, also developed at Stanford University. SNESIM requires a very large amount of memory to extract, then store, patterns from 3D multimillion-node training cubes. The MPS simulation program SNESIM described in the Strebelle dissertation (2000, pp. 40-53) is based on the same sequential simulation paradigm as the traditional indicator variogram-based program SISIM. A condensation of this description of SNESIM is contained in Appendix A of this specification. With SISIM the simulation grid nodes are visited one single time along a random path. SISIM is described in Deutsch, C. and Journel, A. (1998) GSLIB: Geostatistical Software Library and User's Guide, second edition, Oxford University Press, New York. Once simulated, a nodal value becomes a hard datum that will condition the simulation of the nodes visited later in the sequence. While in the variogram-based algorithm kriging is performed at each unsampled node to estimate the local conditional distribution from which a simulated value is drawn, in MPS simulation that local conditional distribution is estimated by scanning a training image with a given data template of conditioning data (Guardiano and Srivastava, 1993). The main contribution of the Strebelle dissertation (2000) was to decrease the CPU time of Srivastava's original code by storing ahead of time all required conditional probability distribution functions (cpdfs) in a dynamic data structure called a search tree. For convenience, Appendix B of this specification describes how the search tree of Strebelle (2000) is generated. 
More precisely, denote by W(u) the data search window centered at location u, and by τn the data template constituted by the n vectors {ha, a=1,...,n} defining the n locations u+ha of W(u). Prior to the simulation, the training image is scanned with the data template τn, then the numbers of occurrences of all possible data events associated with τn are stored in the search tree. During the MPS simulation, the local cpdfs are retrieved directly from that search tree. Accordingly, the training image need not be scanned anew for each node simulation. One major limitation of the search tree approach is that the data template τn cannot include too many grid nodes. There are two reasons for such limitation: 1. The amount of memory used to construct the search tree increases exponentially with the size of the data template: for an attribute taking K possible values, e.g. K facies values, the maximum number of possible data events associated with data template τn is K^n. Fortunately that maximum number is rarely reached. 2. The CPU time needed to retrieve cpdfs from the search tree increases dramatically with a large data template τn. At any unsampled node u, only n' (n' ≤ n) of the template locations are informed; denote by dn' the corresponding conditioning data event. The number c(dn) of occurrences of any fully informed data event dn can be read directly from the search tree. The smaller the number n' of conditioning data, the greater the number of possible data events dn that include dn', and the greater the CPU time needed to retrieve all the related numbers c(dn) from the search tree. For an attribute taking K possible values, the number of possible data events dn that include dn' can be as large as K^(n-n'). The data template cannot include too many grid nodes for memory and CPU time considerations. Yet, the data search window should be large enough to capture large-scale structures of the training image. 
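The role of the search tree can be illustrated with a minimal sketch. The dictionary-based counter below is a hypothetical stand-in for the actual dynamic search-tree structure (not the SNESIM code): it records, for each data event found under a small template τn, how often each central facies value occurs, from which a local cpdf follows as a training proportion.

```python
from itertools import product

# Illustrative stand-in for the search tree: map each data event observed
# under the template to the counts of its central facies values.

def build_event_counts(training, template):
    """training: 2D list of facies codes; template: list of (di, dj) offsets."""
    ni, nj = len(training), len(training[0])
    counts = {}  # data event (tuple of facies values) -> {central value: count}
    for i, j in product(range(ni), range(nj)):
        event = []
        for di, dj in template:
            ii, jj = i + di, j + dj
            if not (0 <= ii < ni and 0 <= jj < nj):
                break
            event.append(training[ii][jj])
        else:  # template fully inside the training image: record a replicate
            key = tuple(event)
            central = training[i][j]
            counts.setdefault(key, {})
            counts[key][central] = counts[key].get(central, 0) + 1
    return counts

# Tiny binary (K=2) training image and a 4-node template (the 4 neighbors).
ti = [[0, 0, 1, 1, 1],
      [0, 0, 1, 1, 1],
      [0, 0, 1, 1, 1],
      [0, 0, 1, 1, 1]]
template = [(-1, 0), (1, 0), (0, -1), (0, 1)]
tree = build_event_counts(ti, template)
event = (1, 1, 1, 1)                 # all four neighbors equal 1
c = sum(tree[event].values())        # c(d_n): number of training replicates
p = tree[event].get(1, 0) / c        # training proportion c_k(d_n)/c(d_n)
```

Once built, the counter is queried directly at simulation time, so the training image is scanned only once, which is the point of the search tree.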
One solution to alleviate this conflict is to use a multiple-grid approach, whereby a number G of nested and increasingly finer grids are simulated, see Tran, T., Improving Variogram Reproduction on Dense Simulation Grids, Computers and Geosciences, 20(7):1161-1168 (1994) and Strebelle dissertation (2000, p. 46). The gth (1 ≤ g ≤ G) grid is constituted by every 2^(G-g)-th node of the final simulation grid; because each nested grid is sparse relative to the final grid, a data template of given size spans a proportionally larger area, allowing large-scale structures to be captured with a limited number of template nodes. Still, for an attribute taking K=2 possible values and a data template of n=80 nodes, the maximum number of possible fully informed events that include a conditioning data event constituted by n'=20 data could be as large as K^(n-n') = 2^60 ≈ 1.2*10^18. Fortunately, all these extremely large numbers are capped by the total number of nodes of the training image being scanned. The present invention addresses previous shortcomings in the computational efficiency of the MPS methodology used in SNESIM. SUMMARY OF THE INVENTION Two solutions are proposed hereafter to decrease the size of the data search template: (1) use intermediary subgrids in the multiple-grid simulation approach, and (2) select a data template that is preferentially constituted by previously simulated nodes. The combination of these two solutions saves a significant amount of memory and CPU time, yet ensures that large-scale training structures are captured and exported to the simulation exercise. It is an object of the present invention to use data templates of reduced size in multiple-point statistics simulations using search trees to enhance computational efficiency. It is another object to use nested grids in an MPS simulation wherein the nested grids are generally equally composed of informed and uninformed nodes. It is yet another object to use data templates which are primarily composed of the informed nodes and few uninformed nodes to reduce data template size and enhance computational efficiency. BRIEF DESCRIPTION OF THE DRAWINGS These and other objects, features and advantages of the present invention will become better understood with regard to the following description, pending claims and accompanying drawings where: FIG. 
1 is an overall flowchart of steps taken in an MPS simulation, made in accordance with the present invention, wherein data templates of reduced size are utilized to enhance the computational efficiency of the simulation; FIGS. 2A-C illustrate a prior art multiple-grid simulation sequence for a simulation grid of size 8 x 8 = 64 nodes using 3 nested increasingly finer grids, with previously simulated nodes in black, nodes to be simulated within the current grid in gray, and nodes which are not to be simulated in that step in white; FIGS. 3A-E depict a multiple-grid simulation sequence, made in accordance with the present invention, for a simulation grid of size 8 x 8 = 64 nodes using intermediary subgrids (previously simulated nodes are in black and nodes to be simulated within the current subgrid are in gray); FIGS. 4A-C show a multiple-grid simulation sequence in 3D using intermediary subgrids (previously simulated nodes are in black and nodes to be simulated within the current subgrid are in gray); FIGS. 5A-C illustrate a) a second subgrid associated with a fine 8x8 simulation grid (previously simulated nodes in gray, nodes to be simulated within the current subgrid in white), and a search window contoured in black; b) a prior art data template constituted by all grid nodes contained in the search window; and c) a new data template constituted by previously simulated nodes and the four closest unsimulated nodes to a grid node for which an attribute is to be simulated; FIGS. 6A-D show a) a training image; b) an 80-data template used for an original simulation; c) a simulated realization generated using the original multiple-grid simulation approach; and d) a simulated realization generated using the new multiple-grid simulation approach of the present invention; FIG. 7 illustrates a search tree example including a) a training image; b) a retained data template; and c) a search tree obtained from the training image using the search data template. 
The italic number below each node is the node reference number used in the table displayed in FIG. 8; FIG. 8 shows a table giving the coordinates of the central nodes u of all training replicates scanned to construct the search tree of FIG. 7C; and FIG. 9 is a flowchart of substeps of a simulation step 800 of FIG. 1 in which grid values are simulated from a search tree. DETAILED DESCRIPTION OF THE INVENTION FIG. 1 is a flowchart showing the steps taken in the present invention, which match those steps used in the conventional SNESIM algorithm. Appendix A is a general theoretical explanation of how the SNESIM algorithm works and is a condensation of pp. 40-53 of the Strebelle dissertation. The significant improvements, in the present invention, have been made in steps 500 and 600, as will be described in greater detail below. Comparisons and distinctions will be made between the conventional SNESIM algorithm and the enhanced MPS simulation of the present invention. In the conventional SNESIM algorithm and the present invention, a stratigraphic grid of nodes is created in step 100 which is to be used to model a reservoir of interest. A training image is created in step 200 which reflects a modeler's conceptualization of the stratigraphic patterns and heterogeneities that may be present in the reservoir. In order to capture a broad overview of the reservoir and to limit the number of nodes in computations, an initial coarse grid of nodes is selected in step 300 which corresponds to the stratigraphic grid. Attributes, i.e. facies values, are then simulated in step 400 using well data and the training image. Preferably, these attributes are simulated using the MPS simulation steps to be described in greater detail below. After determining these initial attributes at the coarse grid, the grid is refined in step 500 by adding additional nodes into the grid. This finer grid, or working grid, includes the nodes for which the attributes were previously simulated. 
These nodes are referred to as informed nodes, as attributes have been assigned to them. The informed nodes may include known well data or else be simulated values. The additional nodes added to the working grid for which attributes are not yet known are called uninformed nodes. Step 500 is one of the steps in which an enhancement has been made in this invention to the basic SNESIM algorithm. A data template of nodes is then selected from this refined working grid of nodes in step 600. This step has also been improved in the present invention over the conventional SNESIM implementation. The training image is then scanned in step 700 using the data template of step 600 to create a search tree. The attributes of the uninformed nodes are then sequentially determined in step 800 from the search tree. Details on the creation of the search tree and how attributes for uninformed nodes may be determined can be found in Appendix B. Appendix B is from the Strebelle 2000 dissertation. The working grid is then checked to see whether its fineness matches that of the stratigraphic grid. If yes, then all of the nodes have been assigned attributes. If not, then the fineness of the working grid is enhanced and attributes of additional uninformed nodes are determined by repeating steps 500 through 800 until the working grid matches the stratigraphic grid, with all attributes, such as facies type, having been determined. The conventional SNESIM MPS simulation suffers from being computationally intensive. Two solutions are proposed hereafter to decrease the size of the data search template: (1) use intermediary subgrids in the multiple-grid simulation approach (step 500), and (2) select a data template that is preferentially constituted by previously simulated nodes (step 600). The combination of these two solutions saves a significant amount of memory and CPU time, yet ensures that large-scale training structures are captured and exported to the simulation exercise. 
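The loop over steps 500-800 can be summarized in a toy sketch. Everything below is illustrative only: in particular, the "copy the nearest informed value" draw is a deliberately trivial stand-in for the actual search-tree cpdf sampling of step 800, and the 1D grid stands in for the stratigraphic grid.

```python
import random

# Toy 1D schematic of the multiple-grid loop of steps 500-800 above.

def nested_grids(n, levels):
    """Coarsest-to-finest node lists: every 2**(levels-g)-th node."""
    return [list(range(0, n, 2 ** (levels - g))) for g in range(1, levels + 1)]

def simulate(n, levels, well_data, seed=0):
    rng = random.Random(seed)
    informed = dict(well_data)            # node -> attribute (hard data)
    for grid in nested_grids(n, levels):  # step 500: refine the working grid
        for node in rng.sample(grid, len(grid)):   # step 800: random path
            if node not in informed:
                # Stand-in for the search-tree cpdf draw: copy the value of
                # the nearest informed node (keeps the sketch self-contained).
                nearest = min(informed, key=lambda m: abs(m - node))
                informed[node] = informed[nearest]
    return [informed[i] for i in range(n)]

values = simulate(8, 3, {0: 1, 7: 0})    # two "wells" at the grid ends
```

Each pass refines the grid, and every value simulated on a coarse grid conditions the nodes visited later, which is the sequential simulation paradigm described above.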
SIMULATION OF INTERMEDIARY SUBGRIDS In order to reduce the size of the search data template, intermediary subgrids within each grid of the original multiple-grid approach previously described are used, except for the very first (coarsest) grid. See FIGS. 3A-E. In 2D, two subgrids are considered. The first subgrid associated with the (g+1)th grid is constituted by the nodes of the previous gth grid plus the nodes located at the center of the squares constituted by these nodes, as shown in FIG. 3 for the simulation of a grid of size 8*8=64 nodes. Half of the nodes in that first subgrid are nodes previously simulated within the gth grid. Thus, in order to find at least 20 conditioning data in the neighborhood of each unsampled node, the search template should contain 2*20=40 nodal locations. Note this compares to 80 required nodes when simulating the (g+1)th grid directly as in the conventional SNESIM algorithm. The second subgrid is the (g+1)th grid itself; its number of nodes is twice the number of nodes of the first subgrid, see FIG. 3. Thus, again, half of the nodes of that subgrid are previously simulated nodes, and a data search template with 40 nodal locations is large enough to find at least 20 conditioning data at each unsampled location. In the present invention, it is desirable to have a relatively high informed node to total node ratio for the data template. Note that this ratio in FIGS. 2A-C is 1/4; in FIG. 3 this ratio is higher, i.e. 1/2. For a 3D grid, the ratio for conventional SNESIM is 1/8; in the present invention, the ratio remains 1/2. In the original SNESIM multiple-grid simulation approach, the original sample data, i.e., well data, are relocated to the closest nodes of the current simulation grid; they are reassigned to their original locations when the simulation of the grid is completed. 
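The subgrid bookkeeping above can be checked with a short sketch for the 8x8 example of FIG. 3, using the observation that the first subgrid consists of the nodes whose index sum (at the current spacing) is even; the helper names are hypothetical.

```python
# Node sets of the nested grids and intermediary subgrids for an 8x8 grid.

def grid_nodes(n, s):
    """Nodes of the nested grid with spacing s."""
    return {(x, y) for x in range(0, n, s) for y in range(0, n, s)}

def first_subgrid(n, s):
    """Previous grid (spacing 2s) plus the centers of its squares:
    the spacing-s nodes whose index sum is even."""
    return {(x, y) for x in range(0, n, s) for y in range(0, n, s)
            if (x // s + y // s) % 2 == 0}

n = 8
prev = grid_nodes(n, 4)        # gth grid (spacing 4), already simulated
sub1 = first_subgrid(n, 2)     # first subgrid of the (g+1)th grid
sub2 = grid_nodes(n, 2)        # second subgrid = the (g+1)th grid itself
ratio1 = len(prev) / len(sub1) # half of sub1 was already simulated
ratio2 = len(sub1) / len(sub2) # and half of sub2 likewise
```

Each subgrid doubles the node count of the previous one, so the informed-to-total ratio stays at 1/2, in line with the text.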
In the new multiple-grid simulation approach of the present invention, the same data relocation process is applied to each subgrid. In 3D, the new multiple-grid approach proposed to reduce the size of the search template requires 3 subgrids per nested grid in order that, as in 2D, the number of nodes within each subgrid is twice that within the previous subgrid, see FIGS. 4A-C. DATA TEMPLATE PREFERENTIALLY CONSTITUTED BY PREVIOUSLY SIMULATED NODE LOCATIONS The size of the data template can be further reduced by preferentially retaining node locations that correspond to previously simulated nodes. Consider the simulation of a node of the second subgrid associated with the fine 8*8 grid displayed in FIG. 5A. The data template corresponding to the search window shown in FIG. 5A would be constituted by 40 grid nodes if all grid nodes within that search window were retained, see FIG. 5B. Instead, it is proposed that the data template be constituted by: • the already simulated locations belonging to the previous subgrid. There are 20 of them in the example of FIG. 5C; and • a small number of node locations that do not belong to the previous subgrid, but are close to the central location to be simulated, say the 4 closest nodes. At the beginning of the simulation of the current subgrid, these locations are not informed unless they correspond to original sample data locations. Note that the new proposed template, as displayed in FIG. 5C, still provides a well-balanced coverage of the space within the original search window limits. For an attribute taking K=2 possible values, the number of possible data events associated with the 24-data template of FIG. 5C is 2^24 ≈ 1.7*10^7, versus 2^40 ≈ 1.1*10^12 for the original 40-data template of FIG. 5B. At each unsampled node, at least 20 conditioning data can be found. 
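Assembling such a reduced template can be sketched as follows, under the simplifying assumption that previously simulated locations are those whose offset coordinates sum to an even number (the first-subgrid pattern); the window size and names are illustrative, so the counts differ from the 20+4 example of FIG. 5C.

```python
# Sketch of the reduced-template selection: keep all offsets pointing to
# previously simulated nodes, plus only the n_new closest new nodes.

def reduced_template(window_offsets, previously_simulated, n_new=4):
    old = [h for h in window_offsets if previously_simulated(h)]
    new = [h for h in window_offsets if not previously_simulated(h)]
    new.sort(key=lambda h: h[0] ** 2 + h[1] ** 2)  # squared distance to u
    return old + new[:n_new]

# 7x7 search window around the node to simulate (center excluded).
window = [(dx, dy) for dx in range(-3, 4) for dy in range(-3, 4)
          if (dx, dy) != (0, 0)]
# Illustrative membership test: previous-subgrid offsets have even sums.
template = reduced_template(window, lambda h: (h[0] + h[1]) % 2 == 0)
```

Here 24 of the 48 window offsets point to previously simulated nodes, and only the 4 closest unsimulated offsets are kept, so the template shrinks from 48 to 28 locations while keeping full coverage of the informed pattern.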
The maximum number of 24-data events that can include a specific conditioning 20-data event is 2^(24-20) = 2^4 = 16, to be compared with the maximum number of 40-data events that can include the same conditioning 20-data event: 2^(40-20) = 2^20 ≈ 10^6. MEMORY AND CPU TIME SAVING TIPS Using the proposed intermediary simulation subgrids and a data search template preferentially constituted by previously simulated node locations allows decreasing the size of the data template, thus saving memory demand and CPU time. It is, however, difficult to estimate precisely the amount of that saving, for the following reasons: • A new search tree must be constructed for each subgrid. In 2D the number of search trees required by the new multiple-grid simulation approach is about twice the number of search trees required by the original approach, and about three times this number in 3D. However, the one-time cost of those additional search trees in terms of memory and CPU time is expected to be minor compared to the saving expected from reducing the size of the data template. Recall that the CPU time needed for a one-time construction of search trees is much smaller than the CPU time needed to simulate multiple realizations of a 3D multimillion-node simulation grid. • In the previous analysis, orders of magnitude for the numbers of possible data events were provided. Because the training image displays repetitive patterns and has a limited size N (typically less than a few million nodes in 3D), the number of training data events associated with a data template τn is less than N, hence much less than the total number K^n of possible data events associated with τn. However, for an attribute taking K=2 possible values, using a template constituted by 24 nodal locations (instead of 80) does still lead to a significant decrease in memory demand, as illustrated by the case study presented hereafter. The search tree actually consists of n levels. 
Let τn' = {ha, a=1,...,n'} be the template constituted by the n' data locations of τn closest to its center. At level n' (1 ≤ n' ≤ n), the search tree stores the numbers of occurrences of the training data events associated with the subtemplate τn'. DATA TEMPLATE CONSTRUCTION FROM A SEARCH ELLIPSOID In the original SNESIM algorithm, the data template is provided by the user as a formatted file where the relative coordinates of the template nodes are specified, see the GeoEAS format in the GSLIB software (Deutsch and Journel, 1998, p. 21). An example of such a file is provided in the Strebelle thesis (2000, p. 169). Using the new multiple-grid approach, one template per subgrid would need to be provided by the user. In addition, it may be quite tedious for the user to work out and then enter the template preferentially constituted by previously simulated node locations. For those two reasons, in the new SNESIM version as provided by the present invention, the template is preferably generated automatically using the following parameters provided by the user: • a data search ellipsoid defined by 3 radii and 3 angle parameters, as described in FIG. II.3 of the GSLIB user's guide: radius, radius1, radius2, sang1, sang2, and sang3; • the number nclose of template locations corresponding to previously simulated nodes. The number of close nodes that have not been simulated yet is arbitrarily set to 4, in 2D as well as in 3D, in the preferred embodiment of this invention. Using these parameters, the data search template is constructed as follows: • Compute the distance of each grid node of the search ellipsoid to the ellipsoid center using the anisotropic distance measure associated with that search ellipsoid. • Rank the nodes in increasing distance order, and keep only those nodes that belong to the current simulation subgrid. • Retain the nclose closest nodes that belong to the previous simulation subgrid, i.e. the nclose closest previously simulated nodes, and the four closest nodes that have not been simulated yet. 
For the first (coarsest) simulation grid, since there are no previously simulated nodes, simply retain the nclose closest nodes of that grid. COMPUTATION OF THE NUMBER OF NESTED GRIDS IN THE MULTIPLE-GRID SIMULATION APPROACH In the original SNESIM algorithm, the user must specify the number of nested grids to be used in the multiple-grid approach. In accordance with the present invention that number is computed, based on the data search ellipsoid and the template size provided by the user, using the following recursive subroutine: • Set the number nmult of nested grids to 1. • Using the anisotropic distance measure associated with the search ellipsoid, construct a data template τn = {ha, a=1,...,nclose} such that, at any location u, the nclose locations u+ha correspond to the nclose grid nodes closest to u. • If all the nclose locations u+ha are within the search ellipsoid, increment nmult by 1, rescale the template τn by multiplying all its vectors by 2, and check if all the locations of that rescaled template are still located within the search ellipsoid. If they are, increment nmult again by 1, rescale the data template again, and so on, until at least one (rescaled) template location is out of the search ellipsoid. EXAMPLE: APPLICATION TO THE SIMULATION OF A 2D HORIZONTAL SECTION OF A FLUVIAL RESERVOIR In order to provide an estimate of the memory and CPU time saved when decreasing the size of the data template, the performance of the modified SNESIM algorithm is compared with the original version on the simulation of a horizontal 2D section of a fluvial reservoir. The training image, displayed in FIG. 6A, depicts the geometry of the meandering sand channels expected to be present in the subsurface; the size of that training image is 250*250=62,500 pixels, and the channel proportion is 27.7%. 
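The two computations just described, ranking candidate nodes by anisotropic distance and deriving the number of nested grids, can be sketched as follows. This is a 2D simplification with the rotation angles omitted; the function names are hypothetical and do not reflect the actual SNESIM parameter-file handling.

```python
# Sketch of the ellipsoid-driven template ranking and the nmult computation.

def aniso_dist(h, radius, radius1):
    """Anisotropic distance of offset h to the ellipsoid center;
    values <= 1 lie within the search ellipsoid."""
    return ((h[0] / radius) ** 2 + (h[1] / radius1) ** 2) ** 0.5

def closest_offsets(nclose, radius, radius1, halfwidth=10):
    """The nclose grid offsets closest to the center, anisotropic ranking."""
    offsets = [(dx, dy) for dx in range(-halfwidth, halfwidth + 1)
               for dy in range(-halfwidth, halfwidth + 1) if (dx, dy) != (0, 0)]
    offsets.sort(key=lambda h: aniso_dist(h, radius, radius1))
    return offsets[:nclose]

def number_of_nested_grids(nclose, radius, radius1):
    """Double the template until an offset leaves the search ellipsoid."""
    template = closest_offsets(nclose, radius, radius1)
    nmult, scale = 1, 1
    while all(aniso_dist((h[0] * scale, h[1] * scale), radius, radius1) <= 1
              for h in template):
        nmult += 1
        scale *= 2
    return nmult
```

With the isotropic parameters of the case study below (radius = radius1 = 50, nclose = 20), this sketch also yields 6 nested grids.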
In the parameter file of the original SNESIM, the number nmult of nested increasingly finer grids to be used in the multiple-grid simulation approach was set to 6. The data template τn shown in FIG. 6B was used to construct the 6 search trees; that template consists of 80 grid nodes. The largest search tree obtained with that template was made of 2,800,000 nodes. (Each search tree node corresponds to one data event associated with a subtemplate of τn and for which at least one occurrence can be found in the training image.) One unconditional simulated realization generated by the original SNESIM is displayed in FIG. 6C. Using a 400 MHz Silicon Graphics Octane, the total time required by the simulation was 60.6 cpu seconds, of which constructing then deleting the 6 search trees took 28.8 seconds, and retrieving the local cpdfs from the search trees took 27.8 seconds. For the modified SNESIM version made in accordance with the present invention, the number nclose of template data locations corresponding to previously simulated nodes was set to 20. An isotropic search ellipsoid was considered: radius=radius1=50, radius2=1, sang1=0, sang2=0, and sang3=0, the distance unit corresponding to the node spacing of the final simulation grid. Using those parameter values, the number of nested grids to be used in the new multiple-grid simulation approach was computed to be 6. FIG. 6D shows one unconditional simulated realization generated using the modified SNESIM version of the present invention. The reproduction of the training patterns is similar to the realization generated using the original SNESIM. One realization takes a total of 16.0 seconds, of which: • 11.3 seconds to construct, then delete, the 11 search trees. Recall that one search tree per subgrid, i.e. 2 search trees per simulation grid except for the (first) coarsest grid, are constructed. Thus, although the modified SNESIM calls for more search trees, the CPU time is divided by 2.5. 
In addition, the largest search tree utilizes only 100,000 nodes, versus 2,800,000 nodes for the original SNESIM. • 3.6 seconds to retrieve all the required cpdfs from the search trees, which is eight times less than for the original SNESIM. While in the foregoing specification this invention has been described in relation to certain preferred embodiments thereof, and many details have been set forth for purposes of illustration, it will be apparent to those skilled in the art that the invention is susceptible to alteration and that certain other details described herein can vary considerably without departing from the basic principles of the invention. APPENDIX A MULTIPLE-POINT STATISTICS SIMULATION USING SEARCH TREES - GENERAL THEORY TERMINOLOGY Consider an attribute S taking K possible states {sk, k = 1,...,K}. S can be a categorical variable, or a continuous variable with its interval of variability discretized into K classes by (K-1) threshold values. The following terminology is used: a data event dn of size n centered at a location u to be simulated is constituted by: — a data geometry defined by the n vectors {ha, a = 1,...,n}; — the n data values {s(u + ha), a = 1,...,n}. The central value of that data event is the unknown value to be evaluated; it is denoted s(u). A data template τn comprises only the previous data geometry. A subtemplate of τn is a template constituted by any subset of n' vectors of τn, with n' < n. A cpdf (conditional probability distribution function) associated with τn is a probability distribution of the central value s(u) conditional to a specific data event dn associated with τn. EXTENDED NORMAL EQUATIONS Consider the attribute S taking K possible states {sk, k = 1,...,K}. We wish to evaluate the local probability distribution of variable S(u) conditional to the nearest n hard data S(ua) = ska, a = 1,...,n. 
Denote by A0 the binary (indicator) random variable associated with the occurrence of state sk at location u: A0 = 1 if S(u) = sk, A0 = 0 if not. Similarly, let D be the binary random variable associated with the occurrence of the data event dn constituted by the n conditioning data S(ua) = ska, a = 1,...,n, considered jointly. D can be decomposed into the product of the binary random variables Aa associated with each conditioning datum. This decomposition allows application of the (generalized) indicator kriging formalism, or extended normal equations. Thus the exact solution provided by indicator kriging identifies with the definition of the conditional probability, as given by Bayes' relation (2). SCANNING THE TRAINING IMAGE(S) The exact solution (2) calls for (n+1)-point statistics, much beyond the traditional two-point variogram or covariance model. There is usually no hope to infer such multiple-point statistics from actual (subsurface) data, hence the idea to borrow them by scanning one or more training images under a prior decision of stationarity (export license): • the denominator Prob{S(ua) = ska, a = 1,...,n} is obtained by counting the number c(dn) of replicates of the conditioning data event dn found in the training image. A replicate should have the same geometric configuration and the same data values. • the numerator Prob{S(u) = sk and S(ua) = ska, a = 1,...,n} is obtained by counting the number ck(dn) of replicates, among the c(dn) previous ones, associated with a central value S(u) equal to sk. The required conditional probability is then identified with the training proportion ck(dn)/c(dn) (3). THE SNESIM ALGORITHM The snesim name chosen for the proposed simulation algorithm recalls that it involves only one single normal equation (sne), which leads to the very definition (2) of a conditional probability. SNESIM has been developed to simulate categorical attributes, e.g. geological facies, but could be extended to simulate continuous attributes such as a permeability field using a sequence of nested simulations: • first discretize the continuous attribute values into a finite number K of classes. 
In most reservoir applications, since flow simulation is controlled by the connectivity of extreme permeabilities associated with specific facies (e.g. sands and shales), there is no need for a fine discretization. Four classes of permeability values should be sufficient, for example, two extreme classes corresponding to the lowest and highest deciles and two median classes, each with a marginal probability 40%. • simulate the resulting categorical variable using snesim, then within each category (class of values) simulate the original continuous variable using a traditional two-point algorithm such as the GSLIB sequential Gaussian simulation program sgsim (Deutsch and Journel, 1998, p. 170). The SNESIM algorithm is based on the sequential simulation paradigm whereby each simulated value becomes a hard datum value conditioning the simulation of nodal values visited later in the sequence (Goovaerts, 1997, p. 376). Guardiano and Srivastava (1993) proposed to scan the full training image anew at each unsampled node to infer the local conditional probability distribution of type (3). Such repetitive scanning can be very cpu-demanding, especially when considering a large training image or when generating a large number of realizations each with many nodes. An alternative implementation would consist in tabulating ahead of time all required cpdfs (conditional probability distribution functions). However the large number of possible conditioning data events precludes any brute force tabulation; indeed: • a single template of n data variables, each taking K possible values or classes of values, generates K^n data events; e.g. K = 10 classes and n = 10 lead to K^n = 10^10 possible conditioning data events! • in a sequential simulation mode, since the conditioning data include previously simulated nodes, and the grid nodes are visited along a random path, the geometry of the conditioning data event changes from one node to the other. 
Thus a very large number of data templates (geometric configurations) would have to be retained. The algorithm implementation proposed here is much less cpu-demanding than Guardiano and Srivastava's implementation, without being too memory (RAM) demanding. This new implementation is based on the two following properties: • Property 1: Given a template τ_n of n data variables, the number of cpdfs associated with τ_n that can actually be inferred from the training image is related to the training image dimensions, hence is generally much smaller than the total number K^n of cpdfs associated with τ_n. Consider a template τ_n of n data variables. A cpdf associated with τ_n can be inferred from the training image only if at least one occurrence of the corresponding conditioning data event can be found in the training image. Denote by N_n the size of the eroded training image T_n that can be scanned by τ_n. Since only N_n data events are scanned in T_n, the maximum number of cpdfs associated with τ_n that can actually be inferred from the training image is necessarily less than N_n, hence is usually a reasonable number in contrast to the huge number K^n of all cpdfs associated with τ_n. • Property 2: The probability distribution conditional to a data event d_n' associated with a subtemplate τ_n' of τ_n (n' < n) can be retrieved from the probability distributions conditional to the data events d_n associated with τ_n and for which d_n' is a subset. Let d_n' be a data event associated with a subtemplate τ_n' of τ_n (n' < n). The number c(d_n') of d_n'-replicates is the sum of the numbers c(d_n) over all data events d_n for which d_n' is a subset; similarly for the number c_k(d_n') of d_n'-replicates with a central value S(u) equal to s_k. Knowledge of c_k(d_n') and c(d_n') then allows estimating the probability distribution conditional to d_n' using Relation (3.6). Denote by W(u) the data search neighborhood centered on location u. Consider the data template τ_n constituted by the n vectors {h_a, a = 1,...,n} defined such that the n locations u + h_a, a = 1,...,n correspond to all n grid nodes present within W(u).
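The construction of the data template just described (all grid-node offsets inside W(u), ordered by increasing modulus) can be sketched in a few lines. This is an illustrative Python sketch under the simplifying assumptions of a 2D grid and an isotropic circular neighborhood; the function name and signature are invented here, not taken from the patent.

```python
def build_template(radius):
    """Sketch: data template tau_n = the offsets h_a of all grid nodes
    within a circular search neighborhood W(u) of the given radius,
    ordered per increasing modulus.  An anisotropic variogram distance
    could replace the Euclidean modulus used in the sort key."""
    offsets = [(dx, dy)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)
               if (dx, dy) != (0, 0) and dx * dx + dy * dy <= radius * radius]
    # order per increasing modulus, so the closest data come first
    offsets.sort(key=lambda h: h[0] ** 2 + h[1] ** 2)
    return offsets

tau = build_template(radius=2)
# the four unit-modulus offsets come first, then the diagonal ones, etc.
```

Ordering closest-first matters later: when replicate counts are too low, the furthest-away (last) datum is the one dropped.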
The snesim algorithm proceeds in two steps: 1. First, store in a dynamic data structure (search tree) only those cpdfs associated with τ_n that can actually be inferred from the training image. More precisely, store in a search tree the numbers of occurrences of data events and central values (c_k(d_n)) actually found over the training image, and from which the training proportions (3.6) can be calculated. Section 1 of Appendix A provides a detailed description of that search tree. Because of Property 1, the amount of RAM required by the search tree is not too large if a data template τ_n with a reasonable number n of nodes, say, fewer than 100 nodes, is retained. The construction of that search tree requires scanning the training image one single time prior to the image simulation, hence it is very fast; see Section 2 of Appendix A. 2. Next, perform simulation by visiting each grid node one single time along a random path. At each node u to be simulated, the conditioning data are searched in W(u), hence the local conditioning data event is associated with a subtemplate of τ_n. According to Property 2, the local cpdf can then be retrieved from the probability distributions conditional to the data events d_n associated with τ_n and for which the local data event is a subset; these cpdfs are read directly from the search tree. That fast retrieval of any local cpdf is described in detail in Section 3 of Appendix A. The training image need not be scanned anew at each unsampled node, which renders the snesim algorithm much faster than Guardiano and Srivastava's implementation. A flowchart presents the main steps of the snesim simulation algorithm: 1. Scan the training image(s) to construct the search tree for the data template τ_n = {h_a, a = 1,...,n} corresponding to the data search neighborhood retained. τ_n can be defined such that the n locations u + h_a, a = 1,...,n correspond to the n grid node locations closest to u, but not necessarily.
The n vectors h_a are ordered per increasing modulus: |h_1| ≤ |h_2| ≤ ... ≤ |h_n|. Note that an anisotropic variogram distance could be used for that ordering. 2. Assign the original sample data to the closest grid nodes. Define a random path visiting once and only once all unsampled nodes. 3. At each unsampled location u, retain the conditioning data actually present within the search neighborhood defined by the template τ_n centered at u. Let n' be the number of those conditioning data (n' ≤ n). Retrieve from the search tree the numbers c_k(d_n') of training replicates for which the central value at u is equal to s_k, k = 1,...,K, as shown in Section 3 of Appendix A. Identify the local cpdf to the proportions of type (3.6) corresponding to the data event d_n'. To ensure that these proportions are significant, hence avoid poor inference of the local cpdf, if the total number c(d_n') of training replicates is lower than a minimum threshold c_min, the furthest-away conditioning datum is dropped, reducing the data event to (n'−1) data; the probability distribution conditional to this lesser data event d_{n'−1} is retrieved again from the search tree, and so on. If the number of data drops to n' = 1, and c(d_1) is still lower than c_min, the conditional probability p(u; s_k | (n)) is replaced by the marginal probability p_k. 4. Draw a simulated s-value for node u from the cpdf retrieved from the search tree. That simulated value is then added to the s-data to be used for conditioning the simulation at all subsequent nodes. 5. Move to the next node along the random path and repeat steps 3 and 4. 6. Loop until all grid nodes are simulated. One stochastic image has been generated. Reiterate the entire process from step 2 with a different random path to generate another realization. CONDITIONING TO HARD DATA Two conditions must be met to ensure proper conditioning to hard data: • the hard data must be exactly reproduced at their locations. • as we come close to any hard datum location, the conditional variance should become smaller, shrinking to the nugget variance. More precisely, the variance of the L simulated values {s^(l)(u), l = 1,...,L} at node u should decrease as that node gets closer to a hard datum location u_a.
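The datum-dropping rule of step 3 of the flowchart above can be sketched as follows. This is a minimal Python illustration: the lookup function standing in for the search tree, the function names, and the default value of c_min are all invented here, not prescribed by the patent.

```python
def local_cpdf(conditioning, counts_lookup, marginal, c_min=10):
    """Sketch of flowchart step 3: infer the local cpdf from a list of
    conditioning data values ordered closest-first.  counts_lookup maps a
    tuple of data values to the counts [c_0,...,c_{K-1}] read from the
    search tree, returning None when the corresponding node is missing.
    While the total replicate count is below c_min, the furthest-away
    datum is dropped; with no data left, the marginal p_k is used."""
    data = list(conditioning)
    while data:
        counts = counts_lookup(tuple(data))
        if counts is not None and sum(counts) >= c_min:
            c = sum(counts)
            return [ck / c for ck in counts]
        data.pop()               # drop the furthest-away conditioning datum
    return list(marginal)        # fall back to the marginal probabilities p_k

# toy lookup table standing in for the search tree (counts are invented)
table = {(0, 1): [2, 1], (0,): [30, 12]}
cpdf = local_cpdf([0, 1], table.get, marginal=[0.7, 0.3], c_min=10)
# (0, 1) has only 3 replicates (< c_min), so the second datum is dropped
# and the cpdf conditional to (0,) alone is used: [30/42, 12/42]
```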
Relocating the sample data to the nearest simulation grid node and freezing their values ensures the first condition. As the location u gets closer to the hard datum location u_1, the training probability distribution {p(u; s_k | (n)), k = 1,...,K} gets closer to a single-atom distribution at S(u) = s_k0 if S(u_1) = s_k0. Indeed, per spatial continuity of the training image, the central value s(u) of any data template with u_1 close to u will be increasingly often in the same state as the conditioning datum value s(u_1). The small-scale spatial continuity of the training image is passed to the simulated realizations through the training proportions, hence as the node being simulated gets closer to a hard datum, its conditional variance decreases as it does on the training image. MULTIPLE GRID SIMULATION APPROACH The data search neighborhood defined by the data template τ_n should not be taken too small, otherwise large-scale structures of the training image would not be reproduced. On the other hand, a large search template including too many grid nodes would lead to storing a large number of cpdfs in the search tree, increasing cpu cost and memory demand. One solution to capture large-scale structures while considering a data template τ_n including a reasonably small number of grid nodes is the multiple-grid approach (Gomez-Hernandez, 1991; Tran, 1994). The multiple-grid approach implemented in snesim consists of simulating a number G of nested and increasingly finer grids. The g-th grid (1 ≤ g ≤ G) is constituted by every 2^(g−1)-th node of the final simulation grid (g = 1). The data template τ_n = {h_a, a = 1,...,n} is rescaled proportionally to the spacing of the nodes within the grid to be simulated. Let τ_n^g = {h_a^g, a = 1,...,n} be the resulting data template for the g-th grid: h_a^g = 2^(g−1) · h_a, for all a = 1,...,n. The larger search neighborhoods τ_n^g of the coarser simulation grids allow capturing the large-scale structures of the training image.
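The template rescaling rule h_a^g = 2^(g−1) · h_a can be sketched directly. A minimal Python illustration for a 2D template; the function name is an invention for this sketch.

```python
def rescale_template(template, g):
    """Multiple-grid rescaling of the data template: h_a^g = 2**(g-1) * h_a.
    template is a list of (dx, dy) offset vectors; g = 1 is the finest grid,
    so the template is unchanged there and stretched on coarser grids."""
    f = 2 ** (g - 1)
    return [(f * dx, f * dy) for (dx, dy) in template]

tau = [(1, 0), (0, 1), (-1, 0), (0, -1)]
tau_g3 = rescale_template(tau, g=3)   # offsets stretched by 4
```

The same number n of template nodes thus spans a neighborhood 2^(g−1) times wider on the g-th grid, which is how large-scale training structures are captured without enlarging the search tree.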
One search tree needs to be constructed per nested simulation grid, possibly using a different training image reflecting the heterogeneities specific to that scale. When the simulation of the g-th grid is completed, its simulated values are frozen as data values to be used for conditioning on the next finer simulation grid. Note that the original sample data need not be located at nodes of the grid currently simulated. Proper conditioning requires assigning the original sample data to the closest nodes of the current simulation grid. When the simulation of the grid is completed, these sample data are reassigned to their original locations. The grid nodes from which the sample data are removed are simulated later as nodes of the next finer grid. Relocating the original sample data to the closest nodes of the current simulation grid may, however, affect the local accuracy of the simulated realizations. Hence only the finer grids should be simulated using search trees. For the coarser grids, the local cpdf at each unsampled node could be inferred using Guardiano and Srivastava's original implementation, i.e. by rescanning the full training image at each node, in which case no data relocation is required. Because such rescanning of the full training image is cpu-demanding, only the very coarse grids, say the grids constituted by every 8th (or more) node of the final simulation grid (which represent only 1.5% of the total number of grid nodes), should be simulated without using a search tree. In the multiple-grid approach, a different training image can be used for each simulation grid. The training images can have a size (number of pixels) different from that of the initial simulation grid, but their resolutions (pixel size) must be the same. The construction of a search tree is very fast, but may be quite memory-demanding when considering a very large training image or a template with a large number of nodes.
Hence we allocate memory for any single search tree prior to the simulation of the corresponding simulation grid, then deallocate that memory once the simulation of that grid is completed. APPENDIX B: STORING CPDFS IN A SEARCH TREE This section presents the dynamic data structure (search tree) under which conditional probability distributions inferred from the training image are stored, then retrieved during the simulation performed with snesim. B.1 DATA STRUCTURE USED Consider an attribute S taking K possible states {s_k, k = 1,...,K}. Denote by W(u) the data search neighborhood centered on any unsampled location u, and consider the data template τ_n constituted by the n vectors {h_a, a = 1,...,n} defined such that the n locations u + h_a, a = 1,...,n correspond to all n grid nodes present within W(u). The n vectors h_a are ordered per increasing modulus. An anisotropic variogram distance could have been used for that ordering. The dynamic data structure presented hereafter allows retrieving all conditional probability distributions (cpdfs) existing in a training image, provided that the conditioning data configuration is included in τ_n. This dynamic data structure (search tree) consists of a linked set of nodes, where each node points towards up to K nodes of the next level. As an example, Figure A.1c shows the search tree obtained from the binary training image of size 5*5 = 25 pixels displayed in Figure A.1a, using the data template τ_4 of Figure A.1b. State value 0 corresponds to white pixels, state value 1 to black pixels. Each search tree node corresponds to one data event. Let τ_n' = {h_a, a = 1,...,n'} be the subtemplate constituted by the first n' vectors of τ_n; the n' locations u + h_a, a = 1,...,n' defining the subtemplate τ_n' correspond to the n' locations closest to u in the data search neighborhood W(u). Nodes located at level n' (n' ∈ [0, n]) of the search tree correspond to data events associated with τ_n'.
In particular, the single node of level 0, from which the tree grows, is called 'the root', and corresponds to the data event d_0 = {no conditioning data present in the template}. Only nodes corresponding to data events for which at least one replicate is found in the training image are present in the search tree. Consider the node associated to the data event d_n' = {s(u + h_a) = s_ka, a = 1,...,n'}: • this node contains an array of K integers {c_k(d_n'), k = 1,...,K}, where c_k(d_n') is the number of training replicates of d_n' for which the central value S(u) is equal to s_k. The total number of d_n'-replicates is then c(d_n') = Σ_k c_k(d_n'). • a set of K pointers {P_k(d_n'), k = 1,...,K} is attached to the d_n'-node. P_k(d_n') points towards the node of level n'+1 corresponding to the data event d_n'+1 = {d_n', and s(u + h_n'+1) = s_k}, provided that the d_n'+1-node is present in the search tree, i.e., provided that at least one d_n'+1-replicate is found in the training image. If this d_n'+1-node is missing, P_k(d_n') is a 'null' pointer, which means that it does not point towards any further node. Diagrammatically, at each node, the search tree splits into up to K branches. In the binary search tree of Figure A.1c, the node associated to any data event d_n' contains the numbers of training replicates of d_n' for which the central node is a white pixel (c_0(d_n')) or a black pixel (c_1(d_n')). Two pointers are attached to each d_n'-node: the left pointer P_0(d_n') points towards the node of level n'+1 associated with the data event {d_n', and s(u + h_n'+1) = 0} (if this node is not missing), while the right pointer P_1(d_n') points towards the node of level n'+1 associated with the data event {d_n', and s(u + h_n'+1) = 1} (if not missing). For example, at the root (node 1, associated to the data event d_0 = no conditioning data), c_0(d_0) = 14 and c_1(d_0) = 11, corresponding to the 14 white and 11 black pixels of the training image displayed in Figure A.1a.
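The node contents just described, the K counters plus up to K child pointers, can be sketched as a small Python class. Class and attribute names are invented for illustration; the patent does not prescribe an implementation.

```python
class SearchTreeNode:
    """One search-tree node (illustrative sketch): the K counters
    c_k(d_n') plus up to K child pointers, one per possible state of the
    next template datum.  A state absent from `children` plays the role
    of a 'null' pointer, i.e. no replicate of that longer data event
    exists in the training image."""
    def __init__(self, K):
        self.counts = [0] * K    # c_k(d_n'), k = 0..K-1
        self.children = {}       # state s_k -> node of level n'+1

    def total(self):
        """c(d_n') = sum over k of c_k(d_n')."""
        return sum(self.counts)

# the root of the binary example: d_0 = no conditioning data, with the
# 14 white and 11 black pixels of the 5x5 training image of Fig. A.1a
root = SearchTreeNode(K=2)
root.counts = [14, 11]
```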
P_0(d_0) points towards the node of level 1 associated to the one-point data event {s(u + h_1) = 0} (node 2), while P_1(d_0) points towards the node associated to {s(u + h_1) = 1} (node 3). Denote by T the training image of Figure A.1a and by T_n' the eroded training image scanned by the subtemplate τ_n' constituted by the first n' vectors of τ_n: T_n' = {u ∈ T s.t. u + h_a ∈ T, ∀a = 1,...,n'}. The coordinates of the central nodes u of all {s(u + h_1) = 0}-replicates are given in the table displayed in Figure A.2. Similarly, 8 replicates of {s(u + h_1) = 1} can be found in the eroded training image T_1: 5 with a white central value, 3 with a black central value, hence in the corresponding search tree node 3: c_0({s(u + h_1) = 1}) = 5 and c_1({s(u + h_1) = 1}) = 3. B.2 PRE-SCANNING TO CONSTRUCT THE SEARCH TREE Constructing the search tree for the data template τ_n = {h_a, a = 1,...,n} requires scanning the training image, denoted by T, one single time; it proceeds as follows: 1. Allocate memory for the root (data event d_0 corresponding to n' = 0), and initialize the numbers c_k(d_0) to zero. 2. At each grid node u of the training image, denote by T_n' = {u ∈ T s.t. u + h_a ∈ T, ∀a = 1,...,n'} the eroded training image scanned by the subtemplate τ_n' constituted by the first n' vectors of τ_n. Let n_max be the largest index n' such that u belongs to T_n'. Retain the n_max locations u_a of the subtemplate τ_n: u_a = u + h_a, a = 1,...,n_max. Since the n_max vectors h_a are ordered per increasing modulus, the n_max locations u_a are ordered per increasing distance to u. Denote by d_nmax = {s(u + h_a) = s_ka, a = 1,...,n_max} the n_max-data event centered on u. 3. Starting from the root, consider the node of level 1 corresponding to the data event d_1 = {s(u + h_1) = s_k1} constituted by the first datum of d_nmax. If the d_1-node is missing, allocate memory in the search tree to create it. Move to that node and increment the number c_k(d_1) if the central value at u is s_k. Then consider the node of level 2 corresponding to the data event d_2 = {s(u + h_1) = s_k1, s(u + h_2) = s_k2} constituted by the first two data of d_nmax, and so on.
Loop through the sequence of data s(u_a), a = 1,...,n_max, until the node of level n_max corresponding to d_nmax is reached and the number c_k(d_nmax) is incremented. 4. Move to the next grid node u of the training image T and repeat steps 2 and 3. 5. Loop until all grid nodes u of the training image have been visited. Consider any data event d_n' associated with the subtemplate τ_n' constituted by the first n' vectors of τ_n, n' ∈ [0, n]. Suppose that, after scanning the training image, the node corresponding to d_n' is missing in the search tree. This means that no replicate of d_n' could be found in the training image. In such case, c_k(d_n') = 0, ∀k = 1,...,K. For example, in the search tree of Figure A.1c, the node corresponding to the data event constituted by 4 white pixels (d_4 = {s(u_a) = 0, a = 1,...,4}) is missing since no replicate of that data event is found in the training image: c_0(d_4) = c_1(d_4) = 0. Not storing the null numbers c_k(d_n') corresponding to missing nodes is critical to minimize memory demand. Indeed, denote by N_n' the size (number of pixels) of the eroded training image T_n'. Since only N_n' data events are scanned in T_n', the number of nodes at level n' of the search tree is necessarily less than N_n'. For example, the eroded training image T_4 scanned by the data template τ_4 of Figure A.1b is constituted by the 9 central pixels of the original training image shown in Figure A.1a. Thus the number of nodes at the last level of the resulting search tree is necessarily less than N_4 = 9, although, with a binary variable, a four-data template could generate a total of 2^4 = 16 possible data events. Indeed, only 6 nodes are present at level 4 of the search tree. In practice, the more structured the training image, the lesser the number of nodes in the search tree. B.3 RETRIEVING CPDFS FROM A SEARCH TREE Consider the inference of the probability distribution conditional to any n'-data event d_n' associated with a subtemplate of τ_n.
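The single-pass pre-scan of Section B.2 can be sketched end to end. This Python illustration uses nested dicts as tree nodes; names are invented, and edge handling is simplified (offsets falling outside the image simply truncate the data event at that level, an approximation of the patent's eroded-image treatment).

```python
import numpy as np

def build_search_tree(ti, template, K):
    """One pass over the 2D training image ti (array of states 0..K-1)
    builds a search tree of nested dicts.  Each node is
    {'counts': [c_0..c_{K-1}], 'kids': {state: child_node}}; only data
    events actually found in ti get a node, per Property 1."""
    def new_node():
        return {"counts": [0] * K, "kids": {}}
    root = new_node()
    ny, nx = ti.shape
    for y in range(ny):
        for x in range(nx):
            center = ti[y, x]
            root["counts"][center] += 1        # root = data event d_0
            node = root
            for dy, dx in template:            # offsets ordered closest-first
                yy, xx = y + dy, x + dx
                if not (0 <= yy < ny and 0 <= xx < nx):
                    break                      # u + h_a falls outside T
                state = ti[yy, xx]
                node = node["kids"].setdefault(state, new_node())
                node["counts"][center] += 1    # increment c_k of this event
    return root

ti = np.array([[0, 0, 1],
               [0, 1, 1],
               [0, 0, 0]])
# two-node template: right neighbor (0, 1) then lower neighbor (1, 0)
tree = build_search_tree(ti, [(0, 1), (1, 0)], K=2)
```

On this toy image the root holds [6, 3] (six 0-pixels, three 1-pixels), and only the data events actually present get level-1 and level-2 nodes.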
The data event d_n' may be associated with the subtemplate τ_n' defined by the first n' vectors of τ_n, e.g., d_1 = {s(u + h_1) = s_k1} associated with τ_1 = {h_1}, but this need not be so; e.g., d_1 = {s(u + h_2) = s_k2}. • If d_n' is associated with the particular subtemplate τ_n' defined by the first n' vectors of τ_n, the corresponding d_n'-node can be searched in the search tree as follows. Starting from the root (level 0), move to the node of level 1 corresponding to the data event d_1 = {s(u + h_1) = s_k1} constituted by the first datum value of d_n', provided that the d_1-node is present in the search tree. If the d_1-node is missing, then the d_n'-node is also missing, which means that there is no training replicate of d_n', hence the probability distribution conditional to d_n' cannot be inferred from the training image. If the d_1-node is not missing, move to the node of level 2 corresponding to the data event d_2 = {s(u + h_1) = s_k1, s(u + h_2) = s_k2} constituted by the first 2 data values of d_n', provided that the d_2-node exists, and so on, until the d_n'-node is reached at level n' of the search tree. The probability distribution conditional to d_n' can then be estimated using the numbers c_k(d_n'). • If the data event d_n' is not associated with such a subtemplate τ_n', there is no corresponding d_n'-node in the search tree, whether d_n'-replicates are found in the training image or not. For example, the node associated to the data event d_1 = {s(u_2) = 0} is not present in the search tree of Figure A.1c, although d_1-replicates can be found in the training image of Figure A.1a. For such data events d_n' = {s(u + h_a1) = s_ka1, ..., s(u + h_an') = s_kan'}, consider the smallest particular subtemplate τ_nmin
(constituted by the first n_min vectors of τ_n) including d_n'. The general equality Prob{A} = Σ_b Prob{A and B = b}, where the summation is carried over all states b possibly taken by the event B, can be applied to write the following relation in terms of training data events: c_k(d_n') = Σ c_k(d_nmin), where the summation is carried over all data events d_nmin associated with the subtemplate τ_nmin and for which d_n' is a subset. Since the data events d_nmin involved in the summation are associated to the particular subtemplate τ_nmin, the numbers c_k(d_nmin) can be retrieved directly from the corresponding d_nmin-nodes at level n_min of the search tree. Note that such calculation allows retrieving only those d_n'-replicates found in the eroded training image T_nmin, not in the eroded training image scanned by the d_n' data configuration. This is however a minor limitation when considering large training images. As an example, let us estimate the probability distribution conditional to the data event d_1 = {s(u_2) = 0} using the search tree of Figure A.1c. The numbers c_k(d_2) can be retrieved directly from the search tree; summing them over the two-point data events that contain d_1 gives c_0(d_1) = 6 and c_1(d_1) = 1, and the probability distribution conditional to {s(u_2) = 0} can be estimated by the proportions 6/7 and 1/7. WHAT IS CLAIMED IS: 1.
A method for simulating attributes in a stratigraphic grid, the method comprising: (a) creating a stratigraphic grid of nodes; (b) creating a training image representative of subsurface geological heterogeneity; (c) creating a coarse grid of nodes corresponding to nodes of the stratigraphic grid; (d) simulating attributes at the nodes of the coarse grid utilizing well data to get informed nodes; (e) refining the grid of nodes by adding uninformed nodes to the informed nodes to create a working grid of nodes; (f) selecting a data template of nodes from the working grid and building a search tree using the data template and the training image; (g) simulating attributes for the uninformed nodes of the working grid using the search tree; and (h) repeating steps (e)-(g) until the attributes of the nodes in the stratigraphic grid have been simulated; wherein in at least one of the refining steps (e), the ratio of informed nodes to the total number of nodes is greater than 1/4 for a 2D grid and greater than 1/8 for a 3D grid. 2. The method of claim 1 wherein: the ratio of informed nodes to the total number of nodes is greater than 1/4 in all of the refining steps (e) for a 2D grid and greater than 1/8 for a 3D grid. 3. The method of claim 1 wherein: in at least one of the refining steps (e), the ratio of informed nodes to the total number of nodes is 1/2. 4. The method of claim 1 wherein: the ratio of informed nodes to the total number of nodes is 1/2 in the majority of the refining steps (e). 5. The method of claim 1 wherein: the ratio of informed nodes to the total number of nodes is 1/2 for each of the refining steps (e). 6. The method of claim 1 wherein: the step of selecting the data template of nodes includes creating a search window and selecting informed nodes and only a portion of the uninformed nodes in the search window to create the data template of nodes in at least one of the steps (f).
7. The method of claim 6 wherein: less than 1/2 of the uninformed nodes in the search window are selected to be part of the data template in at least one of the steps (f). 8. The method of claim 1 wherein: the step of selecting the data template of nodes includes creating a search window and selecting informed nodes and uninformed nodes from within the search window, with the majority of the selected nodes being informed nodes. 9. The method of claim 1 wherein: the step of selecting the data template of nodes includes creating a search window and selecting all of the informed nodes and only a portion of the uninformed nodes from within the search window. 10. The method of claim 1 wherein: the search window is an ellipsoid. 11. The method of claim 1 wherein: the attributes are facies. 12. A method for simulating attributes in a stratigraphic grid, the method comprising: (a) creating a stratigraphic grid of nodes; (b) creating a training image representative of subsurface geological heterogeneity; (c) creating a coarse grid of nodes corresponding to nodes of the stratigraphic grid; (d) simulating attributes at the nodes of the coarse grid utilizing well data to get informed nodes; (e) refining the grid of nodes by adding uninformed nodes to the informed nodes; (f) selecting a data template of nodes from the working grid and building a search tree using the data template, the data template being selected by creating a search window and selecting informed nodes and uninformed nodes from within the search window with the majority of the selected nodes being informed nodes; (g) simulating attributes for the uninformed nodes of the grid using the search tree; and (h) repeating steps (e)-(g) until the attributes of all nodes in the stratigraphic grid have been simulated. 13. The method of claim 12 wherein: in at least one of the refining steps (e), the ratio of informed nodes to the total number of nodes is greater than 1/4 for a 2D grid and greater than 1/8 for a 3D grid.
14. The method of claim 12 wherein: less than 1/2 of the uninformed nodes found within the search window are selected to be part of the data template. 15. The method of claim 12 wherein: in the step of selecting the data template of nodes, the majority of the selected nodes are informed nodes. 16. The method of claim 12 wherein: in the step of selecting the data template of nodes, all of the informed nodes and only a portion of the uninformed nodes from within the search window are selected. 17. The method of claim 12 wherein: the search window is an ellipsoid.

Patent Number: 238520
Indian Patent Application Number: 1158/CHENP/2007
PG Journal Number: 8/2010
Publication Date: 19-Feb-2010
Grant Date: 09-Feb-2010
Date of Filing: 20-Mar-2007
Name of Patentee: CHEVRON U.S.A INC
Applicant Address: 6001 BOLLINGER CANYON ROAD, SAN RAMON, CALIFORNIA 94583
Inventors:


PCT International Classification Number: G06G7/48
PCT International Application Number: PCT/US05/29319
PCT International Filing Date: 2005-08-16
PCT Conventions:
