Title of Invention

"METHOD AND APPARATUS FOR CAPTURING AND PROCESSING IMAGE DATA,INCLUDING GENERATING A REFOCUSSED VIRTUAL IMAGE"

Abstract Image data is processed to facilitate focusing and/or optical correction. According to an example embodiment of the present invention, an imaging arrangement collects light data corresponding to light passing through a particular focal plane. The light data is collected using an approach that facilitates the determination of the direction from which various portions of the light incident upon a portion of the focal plane emanate. Using this directional information in connection with the value of the light as detected by photo sensors, an image represented by the light is selectively focused and/or corrected.
Full Text FORM 2 THE PATENT ACT 1970 (39 of 1970) & The Patents Rules, 2003 COMPLETE SPECIFICATION (See section 10 and rule 13)
1. IMAGING ARRANGEMENTS AND METHODS THEREFOR.
2.
(A) THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY.
(B) UNITED STATES OF AMERICA.
(C) 1705 El Camino Real, Palo Alto, CA 94306-1106,
UNITED STATES OF AMERICA.
The following specification particularly describes the invention and the manner in which it is to be performed.

Field of the Invention
The present invention relates generally to imaging applications, and more specifically to processing image data to focus and/or correct the image data.
Background
Imaging applications such as those involving cameras, video cameras, microscopes, telescopes and more have generally been limited in the amount of light that is collected. That is, most imaging devices do not record most of the information about light distribution entering the device. For example, conventional cameras such as digital still cameras and video cameras do not record most of the information about the light distribution entering from the world. In these devices, collected light is often not amenable to manipulation for a variety of approaches, such as for focusing at different depths (distances from the imaging device), correcting for lens aberrations or manipulating an angle of view.
For still-imaging applications, typical imaging devices capturing a particular scene generally focus upon a subject or object in the scene, with other parts of the scene left out of focus. For video-imaging applications, similar problems prevail, with a collection of images used in video applications failing to capture scenes in focus.
Many imaging applications suffer from aberrations with the equipment (lenses) used to collect light. Such aberrations may include, for example, spherical aberration, chromatic aberration, distortion, curvature of the light field, oblique astigmatism and coma. Correction for aberrations has typically involved the use of corrective optics, which tend to add bulk, expense and weight to imaging devices. In some applications benefiting from small-scale optics, such as camera phones and security cameras, the physical limitations associated with the applications make it undesirable to include additional optics.

Difficulties associated with the above have presented challenges to imaging applications, including those involving the acquisition and altering of digital images.
Summary
The present invention is directed to overcoming the above-mentioned challenges and others related to imaging devices and their implementations. The present invention is exemplified in a number of implementations and applications, some of which are summarized below.
According to an example embodiment of the present invention, light is detected with directional information characterizing the detected light. The directional information is used with the detected light to generate a virtual image, corresponding to one or both of a refocused image and a corrected image.
According to another example embodiment of the present invention, two or more subjects at different focal depths in a scene are imaged, with portions of the scene corresponding to each subject focused at different focal planes. Light from the scene is focused upon a physical focal plane and detected, together with information characterizing the direction from which the light arrived at particular locations on the physical focal plane. For at least one subject that is located at a depth of field that is not focused upon the physical focal plane, a virtual focal plane that is different than the physical focal plane is determined. Using the detected light and directional characteristics thereof, portions of the light corresponding to a focused image of the at least one subject upon the virtual focal plane are collected and added to form a virtual focused image of the at least one subject.
According to another example embodiment of the present invention, a scene is digitally imaged. Light from the scene that is passed to different locations on a focal plane is detected, and the angle of incidence of the light detected at the different locations on the focal plane is detected. A depth of field of a portion of the scene from which the detected light came is detected and used together with the determined angle of incidence to digitally re-sort the detected light. Depending upon the application, the re-sorting includes refocusing and/or correcting for lens aberrations.
The above summary of the present invention is not intended to describe each illustrated embodiment or every implementation of the present invention. The figures and detailed description that follow more particularly exemplify these embodiments.

Brief Description of the Drawings
The invention may be more completely understood in consideration of the detailed description of various embodiments of the invention that follows in connection with the accompanying drawings, in which:
FIG. 1 is a light ray capturing and processing arrangement, according to an example embodiment of the present invention;
FIG. 2 is an optical imaging device, according to another example embodiment of the present invention;
FIG. 3 is a process flow diagram for image processing, according to another example embodiment of the present invention;
FIG. 4 is a process flow diagram for generating a preview image, according to another example embodiment of the present invention;
FIG. 5 is a process flow diagram for processing and compressing image data, according to another example embodiment of the present invention;
FIG. 6 is a process flow diagram for image synthesis, according to another example embodiment of the present invention;
FIG. 7 is a process flow diagram for image refocusing, according to another example embodiment of the present invention;
FIG. 8 is a process flow diagram for extending the depth of field in an image, according to another example embodiment of the present invention;
FIG. 9 is a process flow diagram for another approach to extending the depth of field in an image, according to another example embodiment of the present invention;
FIG. 10 illustrates one example approach to separating light rays, according to another example embodiment of the present invention;
FIG. 11 illustrates an approach to the mapping of sensor pixel locations to rays in the L(u,v,s,t) space with respect to data collected, according to another example embodiment of the invention;
FIG. 12 illustrates several images refocused at different depths, according to another example embodiment of the present invention;
FIG. 13A shows an imaging configuration in 2D, according to another example embodiment of the present invention;

FIG. 13B shows a cone of rays from a 3D point summed for a pixel, according to another example embodiment of the present invention;
FIGs. 14A-14C show an approach to computing images with different depths of field, according to another example embodiment of the present invention;
FIG. 15 illustrates an approach to the tracing of rays from a 3D point on a virtual film plane, according to another example embodiment of the present invention;
FIG. 16 shows an approach to finding the value of light, according to another example embodiment of the present invention;
FIG. 17A shows an ideal 512x512 photograph, according to another example embodiment of the present invention;
FIG. 17B shows an image that would be produced with an f/2 bi-convex spherical lens, according to another example embodiment of the present invention;
FIG. 17C shows an image computed using an image correction approach, according to another example embodiment of the present invention;
FIGs. 18A-18C illustrate the tracing of light rays common in color imaging systems, according to another example embodiment of the present invention;
FIGs. 19A-19F illustrate approaches to implementing a mosaic array, according to another example embodiment of the present invention;
FIG. 20 is a process flow diagram of computational approaches to refocusing in a Fourier domain, according to another example embodiment of the present invention;
FIG. 21A shows a triangle filter approach, according to another example embodiment of the present invention;
FIG. 21B shows a Fourier transformation of the triangle filter approach, according to another example embodiment of the present invention;
FIG. 22 is a process flow diagram of an approach to refocusing in a frequency domain, according to another example embodiment of the present invention;
FIG. 23 shows a set of rays which pass through a desired focal point, according to another example embodiment of the present invention;
FIGs. 24A-B show differing views of a portion of a microlens array, according to another example embodiment of the present invention;
FIG. 24C illustrates images that appear on a photosensor, according to another example embodiment of the present invention;

FIG. 25 illustrates an example embodiment of the present invention, in which a virtual image is computed as it would have appeared on virtual film;
FIG. 26 shows an approach to manipulating a virtual lens plane, according to another example embodiment of the present invention;
FIG. 27 illustrates that the virtual film may take any shape, according to another example embodiment of the present invention;
FIG. 28 shows an imaging arrangement, according to another example embodiment of the present invention;
FIG. 29 is a process flow diagram for pre-computing a database of weights associated with each output image pixel and each ray sensor value, according to another example embodiment of the present invention;
FIG. 30 is a process flow diagram for computing an output image using a database of weights, according to another example embodiment of the present invention;
FIGs. 31A-D illustrate various scalar functions which are selectively implemented as a virtual aperture function, according to another example embodiment of the present invention;
FIG. 32 shows a virtual aperture function which varies from pixel to pixel, according to another example embodiment of the present invention;
FIG. 33 is a process flow diagram of a user selecting a region of an output image, editing an image portion and saving the output image, according to another example embodiment of the present invention;
FIG. 34 is a process flow diagram for extending the depth of field in an image, according to another example embodiment of the present invention; and
FIG. 35 is a process flow diagram for computing a refocused image from received light ray sensor data, according to another example embodiment of the present invention.
While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Detailed Description
The present invention is believed to be useful for a variety of different types of devices, and the invention has been found to be particularly suited for electronic imaging devices and applications. While the present invention is not necessarily limited to such applications, various aspects of the invention may be appreciated through a discussion of various examples using this context.
According to an example embodiment of the present invention, a four-dimensional (4D) light field (e.g., the light traveling along each ray in a region such as free space) is detected using an approach involving the determination of the amount and direction of light arriving at a sensor located at a focal plane. The two-dimensional position of light in the focal plane is detected, together with information characterizing the direction from which the light arrived at particular locations in the plane. With this approach, the directional lighting distribution arriving at different locations on the sensor is determined and used to form an image. In various discussions herein, the assembly or assemblies implemented for sensing and/or measuring of a light field are referred to as a "light ray sensor," or a "ray sensor."
In one application, an approach similar to the above is implemented using an imaging system having optics and sensors that sample the space of light rays that are incident on an imaging plane, with computational functionality that renders images from the set of measured rays in different ways. Each of the optics, sensors and computational functionality is implemented using a variety of approaches, in combination or distinctly, depending upon the implementation. For example, a camera having lenses (optics) that focus an image upon a photosensor array (sensors) located at an imaging plane can be used to sample the space of light rays. An output from the photosensor array is used with computational functions (e.g., at a processor internal and/or external to the camera) to render images, such as by computing photographs that are focused at different depths or with different depths of field, and/or computationally correcting lens aberrations to produce higher quality images.
In another example embodiment, optics and sensor components of an imaging system direct rays of light onto sensor elements such that each sensor element senses a set of rays including rays emanating from specific directions. In many applications, this set of rays is a bundle of rays that is localized in both space and direction. For many applications, this bundle of rays will converge to a single geometric ray of light as the optics and sensor resolutions increase. In this regard, various portions of the description herein refer to the values sensed by the sensor elements as "rays of light" or "light rays" or simply "rays," even though in general they may not be limited to geometric rays.
Turning now to the figures, FIG. 28 shows an imaging arrangement 2890, according to another example embodiment of the present invention. The imaging arrangement 2890 includes a main lens 2810, and a light ray sensor that measures the value of light arriving at different locations on the sensor and from different incident directions. In this context, measuring the value of light may be implemented by detecting light arriving at different locations at the sensor, together with characteristics of that light such as intensity to generate a value.
FIG. 1 shows an imaging system 100, according to another example embodiment of the present invention. The imaging system 100 includes an imaging arrangement 190 having a main lens 110, a microlens array 120 and a photosensor array 130. In this case, the microlens array 120 and photosensor array 130 implement a light ray sensor. Although FIG. 1 illustrates a particular main lens 110 (single element) and particular microlens array 120, those skilled in the art will recognize that a variety of lenses and/or microlens arrays (currently available or developed in the future) are selectively implemented with a similar approach by, for example, replacing the shown main lens and/or microlens array.
Rays of light from a single point on a subject 105 in an imaged scene are brought to a single convergence point on the focal plane of the microlens array 120. For instance, when the imaged point on the subject 105 is at a distance from the main lens that is conjugate to the distance from the microlens array to the main lens, the dimension "d" is about equal to the dimension "s" as shown. A microlens 122 at this convergence point separates these rays of light based on the direction of the light, creating a focused image of the aperture of the main lens 110 on the photosensors underneath the microlens.
The photosensor array 130 detects light incident upon it and generates an output that is processed using one or more of a variety of components. In this application, the output light data is passed to sensor data processing circuitry 140, which uses the data together with positional information about each photosensor providing the data in generating an image of the scene (e.g., including subjects 105, 106 and 107). The sensor data processing circuitry 140 is implemented, for example, with a computer or other processing circuit selectively implemented in a common component (e.g., a chip) or in different components. In one implementation, a portion of the sensor data processing circuitry 140 is implemented in the imaging arrangement 190, with another portion implemented in an external computer. Using the detected light (and, e.g., characteristics of the detected light) together with a known direction from which the light arrived at the microlens array (as computed using a known location of each photosensor), the sensor data processing circuitry 140 selectively refocuses and/or corrects light data in forming an image (where refocusing may be correcting). Various approaches to processing detected light data are described in detail below, with and without reference to other figures. These approaches may be selectively implemented with the sensor data processing circuitry 140 consistent with the above.
Different portions of the imaging system 100 are selectively implemented in a common or separate physical arrangement, depending upon the particular application. For example, when implemented with a variety of applications, the microlens array 120 and the photosensor array 130 are combined into a common arrangement 180. In some applications, the microlens array 120 and the photosensor array 130 are coupled together on a common chip or other circuit arrangement. When implemented with a hand-held device such as a camera-like device, the main lens 110, microlens array 120 and photosensor array 130 are selectively combined into a common imaging arrangement 190 integrated with the hand-held device. Furthermore, certain applications involve the implementation of some or all of the sensor data processing circuitry 140 in a common circuit arrangement with the photosensor array 130 (e.g., on a common chip).
In some applications, the imaging arrangement 100 includes a preview arrangement 150 for presenting a preview image to a user capturing images. The preview arrangement is communicatively coupled to receive image data from the photosensor array 130. A preview processor 160 processes the image data to generate a preview image that is displayed on a preview screen 170. In some applications, the preview processor 160 is implemented together with the image sensor 180, on a common chip and/or in a common circuit. In applications where the sensor data processing circuitry 140 is implemented with the photosensor array 130 as discussed above, the preview processor 160 is selectively implemented with the sensor data processing circuitry 140, with some or all of the image data collected by the photosensor array 130 used to generate the preview image.
The preview image may be generated using relatively fewer computational functions and/or less data than that used to generate a final image. For instance, when implemented with a hand-held imaging device such as a camera or cell phone, a preview image that does not effect any focusing or lens correction may be sufficient. In this regard, it may be desirable to implement processing circuitry that is relatively inexpensive and/or small to generate the preview image. In such applications, the preview processor generates the image at a relatively low-computational cost and/or using less data, for example by using the first extended depth of field computational method as described above.
The imaging system 100 is implemented in a variety of manners, depending upon the application. For instance, while the microlens array 120 is shown with several distinguishable microlenses by way of example, the array is generally implemented with a multitude (e.g., thousands or millions) of microlenses. The photosensor array 130 generally includes a relatively finer pitch than the microlens array 120, with several photosensors for each microlens in the microlens array 120. In addition, the microlenses in the microlens array 120 and the photosensors in the photosensor array 130 are generally positioned such that light passing via each microlens to the photosensor array does not overlap light passed via adjacent microlenses.
In various applications, the main lens 110 is translated along its optical axis (as shown in FIG. 1, in a horizontal direction) to focus on a subject of interest at a desired depth "d" as exemplified between the main lens and an example imaging subject 105. By way of example, light rays from a single point on the subject 105 are shown for purposes of this discussion. These light rays are brought to a single convergence point at microlens 122 on the focal plane of the microlens array 120. The microlens 122 separates these rays of light based on direction, creating a focused image of the aperture of the main lens 110 on a set of pixels 132 in the array of pixels underneath the microlens. FIG. 10 illustrates one example approach to separating light rays, such that all rays emanating from a point on a main lens 1010 and arriving anywhere on the surface of the same microlens (e.g., 1022) are directed by that microlens to converge at the same point on a photosensor (e.g., 1023). This approach shown in FIG. 10 may, for example, be implemented in connection with FIG. 1 (i.e., with the main lens 1010 implemented for main lens 110, with microlens array 1020 implemented for microlens array 120, and with photosensor array 1030 implemented for photosensor array 130).
The image that forms under a particular microlens in the microlens array 120 dictates the directional resolution of the system for that location on the imaging plane. In some applications, directional resolution is enhanced by facilitating sharp microlens images, with the microlenses focused on the principal plane of the main lens. In certain applications the microlenses are at least two orders of magnitude smaller than the separation between the microlens array and the main lens 110. In these applications, the main lens 110 is effectively at the microlenses' optical infinity; to focus the microlenses, the photosensor array 130 is located in a plane at the microlenses' focal depth.
The separation "s" between the main lens 110 and the microlens array 120 is selected to achieve a sharp image within the depth of field of the microlenses. In many applications, this separation is accurate to within about $\Delta x_p \cdot (f_m / \Delta x_m)$, where $\Delta x_p$ is the width of a sensor pixel, $f_m$ is the focal depth of the microlenses and $\Delta x_m$ is the width of the microlenses. In one particular application, $\Delta x_p$ is about 9 microns, $f_m$ is about 500 microns and $\Delta x_m$ is about 125 microns, with the separation between the microlens array 120 and the photosensor array 130 being accurate to about 36 microns.
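The tolerance expression above can be checked numerically. The following is a minimal sketch that evaluates the pixel width multiplied by the ratio of microlens focal depth to microlens width, using the values quoted for the particular application; the function name and its parameter names are illustrative assumptions and do not appear in the specification.

```python
def separation_tolerance(pixel_width, microlens_focal_depth, microlens_width):
    """Return pixel_width * (microlens_focal_depth / microlens_width).

    All three arguments must share one length unit (microns here);
    the result is in that same unit. Names are hypothetical.
    """
    if microlens_width <= 0:
        raise ValueError("microlens_width must be positive")
    return pixel_width * (microlens_focal_depth / microlens_width)


# Values quoted in the text: 9 micron pixels, 500 micron focal depth,
# 125 micron microlens width.
tolerance_um = separation_tolerance(9.0, 500.0, 125.0)
print(tolerance_um)  # 36.0
```

The result agrees with the figure of about 36 microns stated in the text.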
The microlens array 120 is implemented using one or more of a variety of microlenses and arrangements thereof. In one example embodiment, a plane of microlenses with potentially spatially varying properties is implemented as the microlens array 120. For example, the microlens array may include lenses that are homogeneous and/or inhomogeneous, square in extent or non-square in extent, regularly distributed or non-regularly distributed, and in a pattern that is repeating or non-repeating, with portions that are optionally masked. The microlenses themselves may be convex, non-convex, or have an arbitrary profile to effect a desired physical direction of light, and may vary in profile from microlens to microlens on the plane. Various distributions and lens profiles are selectively combined. These various embodiments provide sampling patterns that are higher spatially (correspondingly lower angularly) in some regions of the array, and higher angularly (correspondingly lower spatially) in other regions. One use of such data facilitates interpolation to match desired spatial and angular resolution in the 4D space.
FIG. 24A illustrates a view (line of sight perpendicular to the plane) of a portion of a microlens array, according to another example embodiment of the present invention. The microlenses are square-shaped and regularly distributed in an array.
FIG. 24B illustrates a view of a portion of a microlens array, according to another example embodiment of the present invention. The microlens plane distribution is not regular or repeating, and the microlenses are arbitrarily shaped.
FIG. 24C illustrates images that appear on the photosensor in connection with another example embodiment of the present invention, using a distribution such as that shown in FIG. 24A with a convex profile and a main lens having a circular aperture.
In other example embodiments, a regular mosaic of larger and smaller microlenses is used. In one implementation, the resulting photosensor data is interpolated to provide a homogeneous sampling that has the maximum spatial and angular resolutions of a microlens or microlenses in the mosaic.
FIGs. 19A-19F illustrate approaches to implementing a mosaic array such as that described above, in connection with one or more example embodiments of the present invention. FIG. 19A is an overhead view of a plane showing example relative sizes and arrangement of the microlenses. FIG. 19B is an illustration of the shape of the images that form on the photosensor array after projection through each microlens in FIG. 19A. FIG. 19C is a cross-sectional view of the array in FIG. 19A, illustrating that the microlenses have the same f-number and their focal points share a common plane. This requires that the smaller microlenses are positioned closer to the focal plane than the larger ones. This causes the images of the main lens that appear underneath each microlens to be large without overlapping, and all appear in-focus on a photosensor that is placed at the plane containing the focal points.
FIG. 19D shows a cross-sectional view of the microlenses shown in FIGs. 19A and 19C implemented in a full imaging arrangement containing a main lens 1910, a mosaic microlens array 1920 and photosensor arrangement 1930, according to another example embodiment of the present invention. Note that although the diagram has been shown with several microlenses and several pixels per microlens, the actual number of microlenses and pixels is selected using a variety of approaches, such as by determining the resolution requirements of the given application and implementing appropriate numbers of each.
FIG. 19E is a Cartesian ray diagram representing the space of rays starting at u on the main lens 1910 and terminating at s on the microlens array 1920 (the space of rays is shown in 2D for clarity, although the full space of rays is 4D). The set of rays that are summed by each photosensor (labeled A-P) in FIG. 19C is shown on the Cartesian ray diagram in FIG. 19D. In the full 4D space of rays, each photosensor integrates a 4D box of rays. The 4D boxes for photosensors under the larger microlenses have half the width (twice the resolution) in the (u, v) directional axes, and twice the width (half the resolution) in the (x, y) spatial axes compared to photosensors under the smaller microlenses.
In another example embodiment, photosensor values are interpolated to a regular grid so that the resolution in all axes matches the maximum resolution in all axes. FIG. 19F illustrates such an approach, wherein the boxes of rays that represent each photosensor value are split by interpolating nearby box values. In the 2D ray space illustrated, each box is split into two, but in the 4D space, each box is split into four (split into two along each of its two longer sides). In some embodiments, the interpolated values are computed by analyzing the nearby values. In another embodiment, the interpolation is implemented as a weighted sum of the values of the original, unsplit boxes in a neighborhood of the desired value.
In some applications, the weighting is implemented in a manner that depends on a decision function based on the values in a neighborhood. For example, the weighting may interpolate along the axes that are least likely to contain an edge in the 4D function space. The likelihood of an edge near that value can be estimated from the magnitude of the gradient of the function values at those locations, as well as the components of the Laplacian of the function.
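The edge-dependent weighting described above can be illustrated with a simplified sketch. It works on a 2D slice of the ray space rather than the full 4D space, uses a hard decision instead of a graded weighting, and estimates edge likelihood from central differences only (the Laplacian components are omitted). The function name `edge_aware_estimate` and the grid layout are assumptions made for illustration.

```python
def edge_aware_estimate(grid, i, j):
    """Estimate a value at interior location (i, j) of a 2D grid of samples.

    The estimate averages the two neighbours along whichever axis has the
    smaller central-difference gradient, i.e. the axis judged least likely
    to cross an edge. Hypothetical helper, 2D simplification of the 4D case.
    """
    grad_along_rows = abs(grid[i + 1][j] - grid[i - 1][j])
    grad_along_cols = abs(grid[i][j + 1] - grid[i][j - 1])
    if grad_along_rows <= grad_along_cols:
        return 0.5 * (grid[i - 1][j] + grid[i + 1][j])
    return 0.5 * (grid[i][j - 1] + grid[i][j + 1])


# A step edge between columns 1 and 2; values are constant down each column.
samples = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
left_of_edge = edge_aware_estimate(samples, 1, 1)
right_of_edge = edge_aware_estimate(samples, 1, 2)
```

In this example the estimates on either side of the step stay at 0.0 and 1.0, because interpolation runs along the edge rather than across it; a plain average across the columns would have produced 0.5 and softened the edge.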
In another example embodiment, each of the microlenses (e.g., in the array 1920 of FIG. 19D or similar) is tilted inwards so that their optical axes are all centered on the main lens aperture. This approach reduces aberrations in the images that form under microlenses towards the edges of the array.
Referring again to FIG. 1, the aperture sizes of the main lens 110 and of the microlenses in the microlens array 120 (e.g., the effective sizes of the opening in the lenses) are also selected to meet specific applications in which the imaging arrangement 100 is implemented. In many applications, the relative aperture sizes are selected so that collected images are as large as possible without overlapping (i.e., such that light does not undesirably overlap onto an adjacent photosensor). This approach is facilitated by matching the f-numbers (focal ratios; i.e., the ratio of the aperture to the effective focal length of the lens) of the main lens and the microlenses. In this instance, the effective focal length, in terms of the f-number, for the main lens 110 is the ratio of the diameter of the aperture of the main lens to the distance "s" between the main lens 110 and the microlens array 120. In applications in which the principal plane of the main lens 110 is translated relative to the plane at which the microlens array 120 is located, the aperture of the main lens is selectively modified so as to maintain the ratio and thus the size of the images forming under each microlens in the microlens array. In some applications, different main lens aperture shapes such as a square aperture are used to achieve desirable (e.g., efficient) packing of the array of images under the microlens array on the photosensor surface.
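The ratio matching described above can be sketched numerically. The example below assumes thin lenses and a main lens at the microlenses' optical infinity, so that the image of the aperture under a microlens scales by similar triangles. The microlens width and focal depth are the 125 micron and 500 micron values quoted earlier, expressed in millimetres; the 100 mm and 80 mm separations and both function names are hypothetical choices for illustration.

```python
def matched_main_lens_aperture(separation_s, microlens_width, microlens_focal_depth):
    """Aperture diameter whose ratio to separation_s equals the microlens
    ratio of width to focal depth. All lengths in the same unit."""
    return separation_s * microlens_width / microlens_focal_depth


def microlens_image_diameter(aperture_diameter, separation_s, microlens_focal_depth):
    """Diameter of the aperture image under one microlens, by similar
    triangles through the microlens centre (thin-lens assumption)."""
    return microlens_focal_depth * aperture_diameter / separation_s


MICROLENS_WIDTH_MM = 0.125       # 125 microns, from the text
MICROLENS_FOCAL_DEPTH_MM = 0.5   # 500 microns, from the text

# Hypothetical separations: the main lens is translated from 100 mm to 80 mm.
for separation_mm in (100.0, 80.0):
    aperture_mm = matched_main_lens_aperture(
        separation_mm, MICROLENS_WIDTH_MM, MICROLENS_FOCAL_DEPTH_MM)
    image_mm = microlens_image_diameter(
        aperture_mm, separation_mm, MICROLENS_FOCAL_DEPTH_MM)
    print(separation_mm, aperture_mm, image_mm)
```

Under these assumptions the aperture shrinks in proportion to the separation, while the image under each microlens stays equal to the microlens width, which is the condition of being as large as possible without overlapping.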
The following discussion refers to a general application of the imaging arrangement 100 of FIG. 1, in connection with one or more example embodiments of the present invention. Considering a two-plane light field "L" inside the imaging arrangement 100, L(u,v,s,t) denotes the light traveling along a light ray that intersects the main lens 110 at (u, v) and that intersects the plane of the microlens array 120 at (s,t). Assuming ideal microlenses in the microlens array 120 and ideal photosensors (e.g., pixels) on aligned grids in the photosensor array 130, all the light that passes to a photosensor also passes through its square parent microlens in the microlens array 120, and through the photosensor's conjugate square on the main lens 110. These two square regions on the main lens 110 and microlens specify a small four-dimensional box in the light field, and the photosensor measures the total amount of light in the set of rays represented by this box. Correspondingly, each photosensor detects such a four-dimensional box in the light field; the light field detected by the photosensor array 130 thus is a box- filtered, rectilinear sampling of L(u,v,s,t).
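One consequence of the box-filtered sampling described above can be sketched in code. Since each photosensor measures the total light in its 4D box, adding up all photosensor values that share the same (s,t) gives the total light arriving at that microlens, which is what a conventional sensor pixel of microlens size placed at the microlens plane would record. The nested-list layout `light_field[s][t][u][v]` and the function name are assumptions made for illustration.

```python
def image_at_microlens_plane(light_field):
    """Sum a sampled light field over its directional (u, v) samples.

    light_field[s][t] is a 2D list of photosensor values indexed [u][v]
    (hypothetical layout). Returns a 2D list indexed [s][t].
    """
    return [
        [sum(sum(u_row) for u_row in directional) for directional in s_row]
        for s_row in light_field
    ]


# One row of two microlenses, each covering 2 x 2 photosensors.
tiny_light_field = [
    [
        [[1, 2], [3, 4]],
        [[0, 0], [0, 5]],
    ]
]
photo = image_at_microlens_plane(tiny_light_field)
print(photo)  # [[10, 5]]
```

The output has one value per microlens, so its resolution equals the resolution of the microlens array rather than that of the photosensor array.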
FIG. 11 illustrates an approach to the mapping of sensor pixel locations to rays in the L(u,v,s,t) space with respect to data collected, according to another example embodiment of the invention. The approach shown in FIG. 11 and discussed herein may be applicable, for example, to FIG. 1 with each photosensor in the photosensor array 130 corresponding to a sensor pixel. Image 1170 in the bottom right is a downsampling of raw data read from a ray sensor (photosensor) 1130, with the circular image 1150 that forms under one microlens circled. Image 1180 in the bottom left is a close-up representation of a portion of the raw data around the circled microlens image 1150, with one photosensor value 1140 circled within the microlens image. Since this circular image 1150 is an image of the lens aperture, the location of the selected pixel within the disk provides the (u, v) coordinates of the starting position of the illustrated ray 110 on the main lens. The location of the microlens image 1150 within the sensor image 1170 provides the (x, y) coordinates of the ray 1120.
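The pixel-to-ray mapping described for FIG. 11 can be sketched as follows. This is a minimal illustration only, assuming an aligned grid with n x n photosensors under each microlens and no gaps between microlens images; the function name and the parameter n are hypothetical and do not come from the specification. The microlens location is written here as (s, t), following the L(u,v,s,t) notation.

```python
def pixel_to_ray(px, py, n):
    """Map a raw sensor pixel (px, py) to light-field indices (s, t, u, v).

    Assumption (not from the specification): exactly n x n photosensors sit
    under each microlens, on a grid aligned with the microlens array.
    """
    s, u = divmod(px, n)  # microlens column, column within the microlens image
    t, v = divmod(py, n)  # microlens row, row within the microlens image
    return s, t, u, v
```

For instance, with n = 10, the pixel at (23, 7) lies under the microlens in column 2, row 0, at position (3, 7) within that microlens image.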
Although the mapping of sensor elements to rays is discussed with respect to the figures (and other example embodiments), values associated with the various sensor elements are selectively represented by the value of the set of rays that is directed through optics to each particular sensor element. In the context of FIG. 1, each photosensor in the photosensor array can thus be implemented to provide a value that represents a set of light rays directed via the main lens 110 and microlens array 120 to the photosensor. That is, each photosensor generates an output in response to the light incident upon the photosensor, and the position of each photosensor, relative to the microlens array 120, is used to provide directional information about the incident light.
In one example embodiment, the resolution of the microlens array 120 is selected to match a particular application's desired resolution for final images. The resolution of the photosensor array 130 is selected so that each microlens covers as many photosensors as required to match the desired directional resolution of the application, or the finest resolution of photosensors that may be implemented. In this regard, the resolution of the imaging system 100 (and other systems discussed herein) is selectively tailored to particular applications, with considerations such as the type of imaging, cost, complexity and available equipment used to arrive at a particular resolution.
Once image data is captured via optics and sensors (e.g., using imaging arrangement 190 in FIG. 1), a variety of computational functions and arrangements are implemented to selectively process the image data. In one example embodiment of the present invention, different sets of photosensors capture these separated light rays from each microlens and pass information about the captured light rays to a computational component such as a processor. Images of the scene are computed from the set of measured light rays.
In the context of FIG. 1, sensor data processing circuitry 140 is implemented to process the image data and compute images of the scene including subjects 105, 106 and 107. In some applications, a preview arrangement 150 is also implemented to generate a preview image using a preview processor 160, with the preview image displayed on a preview screen 170. The preview processor 160 is selectively implemented with the sensor data processing circuitry 140, with a preview image generated in a manner not inconsistent, for example, with approaches discussed below.
In another embodiment, for each pixel in an image output from a sensor arrangement, the computational component weights and sums a subset of the measured rays of light. In addition, the computational component may analyze and combine a set of images computed in the manner described above, for example, using an image compositing approach. Although the present invention is not necessarily limited to such applications, various aspects of the invention may be appreciated through the discussion of several specific example embodiments of such a computational component.
In connection with various example embodiments, image data processing involves refocusing at least a portion of an image being captured. In some embodiments, an output image is generated in the context of a photograph focused on desired elements of a particular scene. In some embodiments, the computed photograph is focused at a particular desired depth in the world (scene), with misfocus blur increasing away from the desired depth as in a conventional photograph. Different focal depths are selected to focus upon different subjects in the scene.
FIG. 12 illustrates several images 1200-1240 refocused at different depths, computed from a single light field measured in accordance with another example embodiment of the present invention. The approach shown in FIG. 12 may, for example, be implemented using an imaging arrangement such as that shown in FIG. 1.
FIGs. 13A and 13B show a refocusing approach, according to another example embodiment of the present invention. This approach may be implemented, for example, with a computational/processor component of an imaging system, such as the sensor data processing circuitry 140 in FIG. 1. Each output pixel (e.g., 1301) from an imaging arrangement corresponds to a three-dimensional (3D) point (e.g., 1302) on a virtual film plane 1310. This virtual film plane 1310 is located behind a main lens 1330, where the plane 1310 is optically conjugate to the desired focal plane in the world (not shown). That is, the virtual film plane 1310 is located at a position at which a film plane would desirably be located to capture a simple two-dimensional (2D) image (e.g., the position is comparable to the position at which a photographic film would be located with a conventional camera in order to capture a 2D image). By separating light by direction (e.g., using the microlens array 120 of FIG. 1), the light arriving at the virtual film plane 1310 can be selectively computed. In this regard, the value for the output pixel 1301 is computed by summing the cone of light rays 1320 that converge on the corresponding 3D point 1302. Values for these rays are gleaned from the data collected by the ray sensor 1350. FIG. 13A shows an imaging configuration in 2D for visual simplicity. In FIG. 13B, a cone of rays 1330 from the 3D point 1340 is summed for the same pixel 1301, with the chosen world focal depth closer to the main lens.
In some embodiments, the required light ray values do not correspond exactly to the discrete sample locations captured by the ray sensor. In some embodiments, the light ray value is estimated as a weighted sum of selected close sample locations. In some implementations, this weighting approach corresponds to a four-dimensional filter kernel that reconstructs a continuous four-dimensional light field from the discrete sensor samples. In some implementations, this four-dimensional filter is implemented with a four-dimensional tent function corresponding to quadrilinear interpolation of the 16 nearest samples in the four dimensional space.
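The quadrilinear interpolation mentioned above can be sketched as follows. This is an illustrative sketch, not the specification's implementation; it assumes the sampled light field is exposed as a callable L(s, t, u, v) defined at integer grid coordinates, and the function name is hypothetical.

```python
from itertools import product
from math import floor

def quadrilinear(L, s, t, u, v):
    """Estimate a ray value at fractional (s, t, u, v) from the 16 nearest samples.

    L is assumed to be a callable returning the sample stored at integer
    grid coordinates (an assumption made for this sketch).
    """
    coords = (s, t, u, v)
    base = [floor(c) for c in coords]
    frac = [c - b for c, b in zip(coords, base)]
    value = 0.0
    for corner in product((0, 1), repeat=4):  # the 2**4 = 16 surrounding samples
        # 4D tent weight: product of the 1D linear weights along each axis.
        w = 1.0
        for f, bit in zip(frac, corner):
            w *= f if bit else 1.0 - f
        value += w * L(*(b + bit for b, bit in zip(base, corner)))
    return value
```

Because the tent weights sum to one, a light field that varies linearly in all four coordinates is reproduced exactly at fractional positions.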
FIG. 35 is a process flow diagram for computing a refocused image from received light ray sensor data, according to another example embodiment of the present invention. At block 3520 a set of sub-aperture images is extracted from light ray sensor data 3510, where each sub-aperture image consists of a single pixel under each microlens image where the pixels are at the same relative position under its microlens. At block 3530, the set of sub-aperture images is combined to produce the final output image. The sub-aperture images are optionally translated relative to one another and composited to bring a desired plane into focus.
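The two blocks of FIG. 35 can be sketched as follows. This is a simplified illustration under stated assumptions: the raw sensor data is a list of rows with exactly n x n photosensors under each microlens, translations are rounded to whole pixels, image borders are clamped, and the sign and scale of the shift parameter are arbitrary choices. None of these details, nor the function names, come from the specification.

```python
def sub_aperture_images(raw, n):
    """Block 3520 (sketch): one image per relative position (u, v) under a microlens."""
    rows, cols = len(raw) // n, len(raw[0]) // n
    return {
        (u, v): [[raw[t * n + v][s * n + u] for s in range(cols)]
                 for t in range(rows)]
        for u in range(n) for v in range(n)
    }

def refocus(subs, n, shift):
    """Block 3530 (sketch): translate each sub-aperture image and average them.

    Each image is moved by shift * (u - c, v - c) whole pixels, where c is the
    centre of the microlens image; shift = 0 leaves the focal plane unchanged.
    """
    c = (n - 1) / 2
    first = next(iter(subs.values()))
    rows, cols = len(first), len(first[0])
    out = [[0.0] * cols for _ in range(rows)]
    for (u, v), img in subs.items():
        dx, dy = round(shift * (u - c)), round(shift * (v - c))
        for t in range(rows):
            for s in range(cols):
                tt = min(max(t + dy, 0), rows - 1)  # clamp at the image border
                ss = min(max(s + dx, 0), cols - 1)
                out[t][s] += img[tt][ss] / len(subs)
    return out
```

With a zero shift, each output pixel is simply the mean of the photosensor values under the corresponding microlens.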
In another example embodiment, darkening associated with pixels near a border of an output image is mitigated. For instance, with the pixels near the border of an output image, some required rays may not have been captured in the measured light field (they may exceed the spatial or directional bounds of the imaging arrangement, such as the microlens array 120 and photosensor array 130 in FIG. 1). For applications in which this darkening is undesirable, the pixel values are normalized by dividing a value associated with the pixels (e.g., as captured by a photosensor array) by the fraction of rays that are actually found in the measured light field.
As discussed above, a variety of different computational approaches are chosen for different applications. The following discussion addresses various such approaches. In some applications, reference is made to figures, and in other applications, the approaches are discussed generally. In each of these applications, the particular approaches may be implemented using a computational-type component, such as the sensor data processing circuitry 140 shown in FIG. 1.
In another example embodiment, an imaging methodology for each output pixel for a particular imaging system corresponds to a virtual camera model in which a virtual film is rotated or deformed, arbitrarily and/or selectively, and a virtual main lens aperture is correspondingly moved and modified in size as appropriate. By way of example, FIG. 25 illustrates an example embodiment in which a virtual image is computed as it would have appeared on virtual film 2560 if it had been present behind a virtual lens aperture 2520 of arbitrary size on a virtual lens plane 2530 that is permitted to be non-coincident with the physical main lens plane 2510. The value of the pixel corresponding to point 2550 on the virtual film 2560 is computed by summing the rays passing through the virtual aperture 2520 and converging on point 2550, which are found by their intersection point and incident direction on ray sensor 2570.
FIG. 26 shows an approach to manipulating a virtual lens plane, according to another example embodiment of the present invention. A virtual lens plane 2630 and/or virtual film plane 2660 is selectively tilted, relative to a physical main lens or other reference. An image computed using this approach has a resulting world focal plane that is not parallel to the imaging plane.
In another example embodiment, as exemplified in FIG. 27, the virtual film 2560 need not be planar, but may take any shape.
A variety of approaches involve selective implementation of different apertures. In one example embodiment, the virtual aperture on the virtual lens plane is a generally circular hole, and in other example embodiments, the virtual aperture is generally non-circular and/or is implemented with multiple distinct regions of any shape. In these and other embodiments, the notion of "virtual aperture" can be generalized, and in some applications, corresponds to an approach involving the processing of light data to correspond to light that would be received via a selected "virtual" aperture.
In various embodiments, a virtual aperture approach is implemented with a pre-determined but arbitrary function on a virtual lens plane. FIGs. 31A-31D illustrate various scalar functions that are selectively implemented as the virtual aperture function in connection with one or more example embodiments. The various functions include, for example, smoothly varying values (as exemplified in FIG. 31B), implementing a plurality of distinct regions (as exemplified in FIG. 31A), and taking on negative values (as exemplified in FIG. 31D). To compute the value of a point on a virtual film, all rays that start at different points on the virtual lens and converge on the point on the virtual film are weighted by the virtual aperture function and summed. In various other embodiments, the final value is computed with an arbitrary computational function that depends on the ray values. For example, the computational function may not correspond to a weighting by a virtual aperture function, but may contain discontinuous program branches depending on the value of test functions computed on the ray values.
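The weighting by a virtual aperture function can be sketched as follows. This is a minimal illustration, assuming each ray converging on a film point is given as a tuple (u, v, value), with (u, v) its starting position on the virtual lens plane; the two example aperture functions and all names are hypothetical.

```python
import math

def pixel_value(rays, aperture):
    """Weight each ray by the virtual aperture function at (u, v) and sum."""
    return sum(aperture(u, v) * value for u, v, value in rays)

def circular_hole(radius):
    """Aperture that passes rays inside a disk and blocks all others."""
    return lambda u, v: 1.0 if u * u + v * v <= radius * radius else 0.0

def smooth(sigma):
    """Smoothly varying aperture (a Gaussian, chosen only for illustration)."""
    return lambda u, v: math.exp(-(u * u + v * v) / (2 * sigma * sigma))
```

Swapping the aperture argument changes the image formation without touching the captured ray data, which is the sense in which the aperture is "virtual".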
In other example embodiments, as may be implemented in combination with the other example embodiments described herein, a method of computing output pixels is chosen independently. For example, in one example embodiment, parameters including the orientation of a virtual lens plane and the size of the virtual aperture are varied continuously for each output pixel. In another example, as illustrated in FIG. 32, the virtual aperture function used to integrate rays for each output pixel is varied from pixel to pixel. In an output image 3200, pixel 3201 uses virtual aperture function 3210 and pixel 3251 uses virtual aperture function 3250.
In another example embodiment, a virtual aperture function varies from pixel to pixel. In one specific embodiment, the function is chosen to mask out rays from undesired portions of a particular scene, such as an undesired object in the foreground.
In another example embodiment, a human user chooses virtual aperture parameters interactively, with light data processed in accordance with the selections. FIG. 33 is a process flow diagram showing one such example embodiment. In the first block 3310, the process receives data from a light ray sensor. In block 3320, a user selects a region of an output image; in block 3330 a user selects an image formation method; and in block 3340, the user alters parameters for the selected method and visually examines a computed image of the scene at block 3350 (for example on a computer monitor). Block 3360 checks if the user is done editing the image portion, and if not returns to block 3330. Block 3370 checks whether the user is done choosing portions of the image to edit, and if not returns to block 3320. If editing is complete, block 3380 saves the final edited image.
In another example embodiment, an image with extended depth of field is computed by focusing on more than one subject at the same time. In one implementation, the depth of field of the output image is extended by simulating conventional photographic imaging with a stopped-down (reduced size) main lens aperture. For each output pixel, an evaluation is performed using the rays of light that would have converged at the output pixel through an aperture (on the virtual lens plane) that is smaller than the aperture used in ray sensing.
In one implementation involving the example system 100 shown in FIG. 1, the depth of field is extended by extracting a photosensor value under each microlens image, where each photosensor is located at the same relative position within each microlens image. With respect to FIG. 1, extending the depth of field produces an image in which not only the subject 105 is in focus (due to the correlation between the distances "d" and "s") but also objects at a different depth, such as subjects 106 and 107, which may otherwise be blurry due to misfocus. This approach to extending the depth of field, coupled with optional downsampling of the resulting image, is computationally efficient. This approach is selectively implemented in applications where noise generated with the image is tolerable, such as where the image generated is for preview purposes (e.g., for display at the preview screen 170 in FIG. 1). FIG. 4, discussed below, is further directed to approaches to the generation of a preview image.
FIGs. 14A and 14B illustrate an approach to computing images with different depths of field, in connection with one or more example embodiments. FIG. 14A shows an image and close-up computed with refocusing. Note that the face in the close-up is blurry due to shallow depth of field. The middle row of FIG. 14B shows the final image computed with an extended depth of field approach such as that described in the previous paragraph.
FIG. 8 is a process flow diagram for another computational approach to extending the depth of field in an image, according to another example embodiment of the present invention. At block 810, a set of images refocused at all focal depths in a particular scene is computed. At block 820, for each pixel position, the pixel having the highest local contrast is determined from the set of images. At block 830, the pixels having the highest local contrast are assembled into a final virtual image. With this approach, a desirable signal-to-noise ratio (SNR) can be obtained, using a relatively high number of pixels (e.g., relative to selecting a single pixel (photosensor) for each microlens in a microlens array). Referring to FIG. 14C, the example image shown is produced using an approach similar to that described in connection with FIG. 8 and exhibits relatively low image noise.
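Blocks 820 and 830 of FIG. 8 can be sketched as follows. The specification does not define how local contrast is measured; this sketch assumes the absolute value of a 4-neighbour Laplacian with clamped borders, and takes the refocused images as equally sized lists of rows. The function names are hypothetical.

```python
def local_contrast(img, r, c):
    """Assumed contrast measure: absolute 4-neighbour Laplacian at (r, c)."""
    rows, cols = len(img), len(img[0])

    def px(i, j):  # clamp coordinates at the image border
        return img[min(max(i, 0), rows - 1)][min(max(j, 0), cols - 1)]

    return abs(4 * px(r, c) - px(r - 1, c) - px(r + 1, c)
               - px(r, c - 1) - px(r, c + 1))

def extended_depth_of_field(stack):
    """For each pixel, keep the value from the refocused image that has the
    highest local contrast there, and assemble the results into one image."""
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = max(stack, key=lambda img: local_contrast(img, r, c))
            out[r][c] = best[r][c]
    return out
```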
In one alternative embodiment, a minimum set of refocused images to compute is defined as follows, in terms of the distance between a virtual film plane for each refocused image and the principal plane of the main lens via which the light for the image is passed to the virtual film plane. A minimum distance is set at the focal length of the main lens, and the maximum distance set at the conjugate depth for the closest object in the scene. The separation between each virtual film plane is no more than $\Delta x_m f / A$, where $\Delta x_m$ is the width of a microlens, $f$ is the separation between the main lens and the microlens array, and $A$ is the width of the lens aperture.
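The spacing rule above can be sketched as follows. This is one possible reading, assuming evenly spaced planes and consistent length units; the function and parameter names are hypothetical, with dx_m, f and aperture standing for the microlens width, the main lens to microlens array separation, and the lens aperture width.

```python
import math

def film_plane_distances(focal_length, max_distance, dx_m, f, aperture):
    """Return virtual film plane distances from focal_length to max_distance,
    evenly spaced so that neighbours are no more than dx_m * f / aperture apart."""
    max_step = dx_m * f / aperture
    span = max_distance - focal_length
    intervals = max(1, math.ceil(span / max_step))
    step = span / intervals
    return [focal_length + i * step for i in range(intervals + 1)]
```

For example, with a 50 mm focal length, a farthest film plane at 52 mm, 0.125 mm microlenses, a 50 mm separation and a 25 mm aperture, the maximum step is 0.25 mm and nine planes are needed.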
In another example embodiment, refocused images are combined to produce an extended depth of field image at each final pixel to retain the pixel that is best focused in any of the set of refocused images. In another embodiment, pixels to retain are chosen by enhancing the local contrast and coherence with neighboring pixels. For general information regarding imaging, and for specific information regarding approaches to imaging involving enhancing local contrast, reference may be made to Agarwala, A., Dontcheva, M., Agrawala, M., Drucker, S., Colburn, A., Curless, B., Salesin, D., Cohen, M., Interactive Digital Photomontage, in ACM Transactions on Graphics, 23, 3 (2004), 292-300, which is fully incorporated herein by reference.
In another example embodiment of the present invention, an extended depth of field image is computed as follows. For each output image pixel, a refocusing computation is performed at the pixel to focus at different depths. At each depth, a measure of the homogeneity of the rays that converge is computed. The depth that produces the (relative) maximum homogeneity is chosen and kept for that pixel value. With this approach, where an image pixel is in focus, all of its rays originate from the same point in the scene and thus are likely to have similar color and intensity.
Although the measure of homogeneity can be defined in various ways, for many applications, the following measure of homogeneity is used: for each color component of each ray, the squared difference of that color intensity is computed from the corresponding color component of the central ray (the ray that arrives at the pixel at an angle closest to the optical axis of the main lens). All of these squared differences are summed, and the homogeneity is taken to be the reciprocal of the sum.
FIG. 34 is a process flow diagram for extending the depth of field in an image, according to another example embodiment of the present invention. At block 3410, a pixel is selected in a virtual image to be computed. At block 3420, the pixel is refocused at a plurality of focal depths, and the homogeneity of the rays that are combined to refocus at each depth is computed. At block 3430, the refocused pixel value associated with the highest homogeneity of rays that are combined is retained as the final output image pixel value. The process continues at block 3440 until all pixels are processed.
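The homogeneity measure and the per-pixel selection of FIG. 34 can be sketched as follows. This sketch assumes each ray is a tuple of color components, and returns infinity when the rays are identical to the central ray (the specification does not say how a zero sum is handled). The names are hypothetical.

```python
def homogeneity(rays, central):
    """Reciprocal of the summed squared color differences from the central ray."""
    total = sum((a - b) ** 2 for ray in rays for a, b in zip(ray, central))
    # Assumption: identical rays are treated as infinitely homogeneous.
    return float("inf") if total == 0 else 1.0 / total

def best_focus(candidates):
    """candidates holds one (pixel_value, rays, central_ray) entry per depth;
    the pixel value refocused at the most homogeneous depth is retained."""
    value, _, _ = max(candidates, key=lambda c: homogeneity(c[1], c[2]))
    return value
```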
In another example embodiment, the above process is adapted so that the selection of final pixel values takes into account the neighboring pixel values, and the homogeneity of the associated rays that are combined to compute those pixel values.
In other example embodiments of the present invention, the depth of field is extended by focusing each pixel on the depth of the closest object in that direction. FIG. 9 is a process flow diagram for extending the depth of field in an image, according to one such example embodiment. At block 910, a pixel is selected in a final virtual image to be computed. At block 920, the depth of a nearest object is estimated for a light ray (or set of light rays) traced from the selected pixel into a scene through the center of the lens.
At block 930, the selected pixel's value is computed in an image refocused at the estimated depth. If additional pixels are desirably processed at block 940, another pixel is selected at block 910 and the process continues at block 920 for the newly-selected pixel. When no additional pixels are desirably processed at block 940, the computed values for each selected pixel are used to create the final virtual image.
In some embodiments involving the extension of the depth of field, a value at each output pixel is computed by disregarding light rays that originate at depths closer than the depth of a desired object to mitigate or eliminate artifacts such as those generally referred to as "blooming" or "halo" artifacts around the borders of objects closer to the lens. By way of example, FIG. 23 illustrates the set of rays that pass through a desired focal point 2301 in the world on a subject of interest 2310. Some of these rays are occluded from the main lens by object 2320, and these correspond to the light rays 2340 that are detected by ray sensor 2350 but disregarded in computing an image value for point 2301. In some embodiments, rays to be disregarded are detected by a mismatch with the color of the central ray. In some embodiments the resulting pixel value is normalized by dividing by the fraction of rays that are not disregarded. These embodiments may be used in isolation or combination with each other and any other embodiments, including those directed to extending the depth of field.
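Disregarding occluded rays and renormalizing can be sketched as follows. This is an illustration only, assuming single-channel ray intensities, a central ray that is itself among the rays, and a fixed tolerance for what counts as a color mismatch; the tolerance and the names are hypothetical.

```python
def pixel_without_occluders(rays, central, tolerance):
    """Sum the rays whose value matches the central ray within tolerance,
    then divide by the fraction of rays that were kept."""
    kept = [r for r in rays if abs(r - central) <= tolerance]
    fraction = len(kept) / len(rays)
    # Assumes at least one ray is kept, e.g. the central ray itself.
    return sum(kept) / fraction
```

Dividing by the kept fraction restores the brightness that would otherwise be lost by discarding rays, so pixels near occluding objects are not darkened.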
As discussed above, light data is processed in accordance with various example embodiments to focus and/or correct images. A variety of approaches to the latter correction approach are discussed as follows. In some of these embodiments, aberrations are corrected by tracing rays through the actual optical elements of the optics (e.g., lens or lenses) used in capturing the rays, and mapping the traced rays to particular photosensors capturing the light. Light data is rearranged, using known defects exhibited by the optics as well as the known position of the sensors detecting the light.
In one correction-type embodiment, the world of rays that contribute to each pixel as formed through idealized optics is computed for each pixel on a film of a synthesized photograph. In one implementation, these rays are computed by tracing rays from the virtual film location back through the ideal optics into the world. FIG. 15 illustrates an approach to the tracing of rays from a 3D point 1501 on a virtual film plane 1510 through an ideal thin main lens 1520 out into a cone of world rays 1530, in connection with one such example embodiment. In some implementations, the set of desired rays 1525 may not necessarily correspond to direction through a real lens, but may correspond to any set of rays that are to be weighted and summed to produce a desired image value.
FIG. 16 shows an approach to finding the value of light traveling along ideal rays for a particular application, in connection with another example embodiment. These values are computed by tracing desired ideal world rays 1630 through a real main lens 1650, having a single element with spherical interfaces, used to physically direct the real world light rays to the ray sensor 1640 at the time the rays are measured (detected). In this embodiment, the rays that ideally converge to a single 3D point (1601) do not converge, representing a defect of lenses with spherical interfaces called spherical aberration. The ray sensor 1640 provides individual values for each of the aberrated rays (such as 1651), which are used to correct for the spherical aberration.
FIGs. 17A-17C illustrate example results using a computer simulation with an approach to lens correction. The image in FIG. 17A is an ideal 512x512 photograph (as seen through perfect optics). The image in FIG. 17B is an image that would be produced with a real f/2 bi-convex spherical lens, which has loss in contrast and blurring. The image in FIG. 17C is a photograph computed using an approach to image correction described above, using an optics and sensor arrangement facilitating 10 x 10 directional (u, v) resolution at each of 512 x 512 microlenses.
In another example embodiment of the present invention, chromatic aberrations are corrected in a main lens used to capture an image. Chromatic aberration is caused by the divergence of rays of light as they are physically directed through optics because of differences in the physical direction dependent on the wavelength of light. The incoming rays are traced through the actual optics, taking into account the wavelength-dependent refraction of light that occurs in the actual optics. In some applications, each color component of the system is traced separately based on the primary wavelength.
In another example embodiment, each of the red, green and blue components common in color imaging systems is traced separately, as illustrated in FIG. 18A. The green world light rays are computationally traced back into the imaging system to produce green rays 1830, and to determine where they intersect a color ray sensor 1810 and at what direction they intersect the color ray sensor 1810. Similarly, FIG. 18B illustrates computationally tracing the desired blue world light rays 1820, which are refracted to a greater extent than the green light rays. FIG. 18C illustrates computationally tracing the desired red world light rays 1830, which are refracted to a lesser extent than the green light rays. The values for each ray are computed from the values from the ray sensor 1810 using, for example, approaches discussed in connection with other example embodiments described herein. The light field values for each ray are integrated to calculate the corrected image value for each particular film pixel. For some applications, chromatic aberration is ameliorated by refocusing each color channel on the plane at which its wavelengths best come into focus.
The desired light rays may not converge exactly on one of the discrete ray values sampled by the ray sensor. In some embodiments, the value to be used for such rays is computed as a function of the discrete ray values. In some embodiments, this function corresponds to a weighted sum of the value of discrete rays in a neighborhood of the desired light ray. In some implementations, this weighted sum corresponds to a 4D convolution of the discrete sample values with a predetermined convolution kernel function. In other implementations, the weighting may correspond to a quadrilinear interpolation from the 16 nearest neighbors. In still other implementations, the weighting may correspond to a cubic or bicubic interpolation from the 16 nearest neighbors.
It is worth noting that example correction processes have been described in terms of ray-tracing for conceptual simplicity; a variety of other approaches are implemented with correction. In one embodiment, for each desired output pixel, the set of photosensor values that contribute is pre-computed along with their relative weights. As described above, these weights are a property of a number of factors that may include the optics, sensor, desired set of rays to be weighted and summed for each output pixel and desired light field reconstruction filter. These weights are pre-computed, selectively using ray-tracing, and stored. A corrected image is formed by weighting and adding the appropriate sensed light field values for each output pixel.
FIG. 29 and FIG. 30 illustrate other example embodiments for use in connection with the above correction approach. FIG. 29 is a process flow diagram for pre-computing a database of weights associated with ray (light) sensors and an output pixel value associated with each ray sensor. In the first two blocks 2910 and 2920, a data set (e.g., in a database) is received for a desired image formation process consisting (for each output image pixel) of a set of ideal world light rays to be summed to produce an output image pixel value, and a specification for real main lens optics used to physically direct light rays to a light ray sensor. At block 2925, an image pixel is chosen. For the output value of this pixel, the associated set of world rays is computationally traced at block 2930 through a virtual representation of the main lens optics to the ray sensor. This results in a set of weights to be applied to each ray sensor value to compute the output pixel value. These values are stored in an output database in block 2940. Block 2950 checks whether all pixels have been processed, returning to block 2925 if not. If all pixels have been processed, the final block 2960 saves the completed database.
FIG. 30 is a flow diagram for a process that computes an output image using a database of weights that may have been computed by a process as in FIG. 29. In blocks 3010 and 3020, the process receives the database and a set of ray sensor values captured with the main lens optics used in computing the database. At block 3025, a pixel in the output image is selected, so that its final image value may be computed. For the selected pixel, block 3030 uses the database to find the set of ray sensor values that contribute and their weights. In block 3040, each sensor value given in 3020 is weighted and added to a sum for that image pixel value. At block 3050, a check is performed to see whether all image pixels have been processed. If not, the process returns to block 3025, and if so, the output image is saved at block 3060.
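The processes of FIG. 29 and FIG. 30 can be sketched together as follows. The ray tracing itself is not shown: the sketch assumes a caller-supplied function trace(pixel) that returns the contributing sensor indices and weights for one output pixel. That function, the dictionary layout and all names are hypothetical.

```python
def build_database(pixels, trace):
    """FIG. 29 (sketch): store, per output pixel, a list of
    (sensor_index, weight) pairs produced by the assumed trace function."""
    return {pixel: trace(pixel) for pixel in pixels}

def render(database, sensor_values):
    """FIG. 30 (sketch): weight and add the sensed values for each output pixel."""
    return {
        pixel: sum(weight * sensor_values[index] for index, weight in entries)
        for pixel, entries in database.items()
    }
```

Because the database depends only on the optics and the desired image formation process, it can be built once and reused for every set of sensor values captured with the same main lens.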
In a variety of example embodiments, light data is processed in the frequency domain, with certain approaches directed to computational approaches to refocusing that operate in the Fourier domain. FIG. 20 is a flow diagram illustrating one such approach, in connection with another example embodiment. The input to the algorithm is a discrete 4D light field 2010, which we will refer to as L(s,t,u,v), representing the ray starting at (u, v) on a main lens and terminating at (s, t) on a microlens plane (e.g., from the main lens 110 and terminating at a plane of the microlens array 120 of FIG. 1). The first step is to compute the discrete 4D Fourier transform 2020 of the light field. The 4D Fourier transform value at (ks, kt, ku, kv), let us call this value M(ks, kt, ku, kv), is defined by the following equation:
$$M(k_s,k_t,k_u,k_v) = \iiiint L(s,t,u,v)\,\exp\!\left(-2\pi\sqrt{-1}\,(s k_s + t k_t + u k_u + v k_v)\right)\,ds\,dt\,du\,dv, \qquad (1)$$
where the exp function is the exponential function, exp(x) = e^x. In some embodiments the discrete light field is sampled on a rectilinear grid in the 4D space, and the Fourier transform is computed with the Fast Fourier Transform (FFT) algorithm.
The next step, which is executed once for each depth at which we wish to refocus the image, is to extract appropriate 2D slices 2030 of the 4D Fourier transform, and compute the inverse 2D Fourier transforms of the extracted slices, which are photographs focused at different depths 2040. The inverse 2D Fourier transform, g(x, y), for a function G(kx, ky) is defined by the following equation:
$$g(x,y) = \int\!\!\int G(k_x,k_y)\,\exp\!\left(2\pi\sqrt{-1}\,(x k_x + y k_y)\right)\,dk_x\,dk_y \qquad (2)$$
The values on the extracted 2D slice are determined by the depth at which we want to refocus. Considering the conjugate plane (on the image-side of the lens) for the desired world focal plane, when the separation between this conjugate plane and the main lens is D and the separation between the microlens plane and the main lens is F, then the value of the extracted 2D slice at coordinates (kx, ky) is given by
$$G(k_x,k_y) = \frac{1}{F^2}\,M\!\left(k_x(1-D/F),\;k_y(1-D/F),\;k_x D/F,\;k_y D/F\right) \qquad (3)$$
Using various approaches, artifacts that result from discretization, resampling and Fourier transformation are selectively ameliorated. In general signal-processing terms, when we sample a signal it is replicated periodically in the dual domain. When we reconstruct this sampled signal with convolution, it is multiplied in the dual domain by the Fourier transform of the convolution filter. In this regard, the goal is to isolate the original, central replica and eliminate all other replicas. A desirable filter is a 4D sinc function, sinc(s)sinc(t)sinc(u)sinc(v), where sinc(x) = sin(πx) / (πx); however, this function has infinite extent.
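As a minimal sketch of the coordinate mapping stated in equation (3), the following helper returns the 4D frequency-domain point at which M is sampled for a given 2D frequency (kx, ky), together with the 1/F^2 scale factor. The function names are hypothetical, and the sketch covers only the arithmetic of the mapping, not the resampling filter.

```python
def slice_coordinates(kx, ky, D, F):
    """Return (ks, kt, ku, kv), the 4D point sampled by equation (3).

    D: separation between the conjugate plane and the main lens.
    F: separation between the microlens plane and the main lens.
    """
    a = D / F
    return (kx * (1 - a), ky * (1 - a), kx * a, ky * a)

def slice_scale(F):
    """Scale factor 1/F^2 applied to the sampled value in equation (3)."""
    return 1.0 / (F * F)

# Example: a conjugate plane halfway between the main lens and the microlens plane.
point = slice_coordinates(2.0, 4.0, D=1.0, F=2.0)
```

Each refocus depth changes only D, so the same pre-computed 4D transform can be sampled along a different 2D trajectory for each depth.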
In various approaches, finite-extent filters are used with frequency-domain processing; such filters may exhibit defects, which are selectively mitigated. FIGs. 21A and 21B illustrate these defects with respect to a specific 1D filter, with corresponding discussion below directed to mitigation of such defects. FIG. 21A represents a triangle filter approach implemented with linear interpolation in 1D (or as the basis for a 4D quadrilinear filter). FIG. 21B shows the Fourier transform of the triangle filter approach, which is not of unit value within the band-limit (see 2010), and which gradually decays to smaller fractional values as the frequency increases. In addition, the filter is not truly band-limited, containing energy at frequencies outside the desired stop-band (2020).
The first defect described above leads to "rolloff artifacts," which can lead to a darkening of the borders of computed photographs. Decay in the filter's frequency spectrum with increasing frequency means that the spatial light field values, which are modulated by this spectrum, also "roll off" to fractional values towards the edges.
The second defect described above involves aliasing artifacts in computed photographs, which are related to energy at frequencies above the band-limit. The non-zero energy that extends beyond the band-limit means that the periodic replicas are not fully eliminated, leading to two kinds of aliasing. First, the replicas that appear parallel to the slicing plane appear as 2D replicas of the image encroaching on the borders of the final photograph. Second, the replicas positioned perpendicular to this plane are projected and summed onto the image plane, creating ghosting and loss of contrast.
In an example embodiment, rolloff-type defects as described above are corrected by multiplying the input light field by the reciprocal of the filter's inverse Fourier spectrum, to nullify the effect introduced during resampling. In this example embodiment, multiplication is performed prior to taking the 4D Fourier transform in the pre-processing step of the algorithm. While it corrects rolloff error, pre-multiplication may accentuate the energy of the light field near its borders, maximizing the energy that folds back into the desired field of view as aliasing.
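The rolloff and its correction can be illustrated numerically for a 1D triangle (linear interpolation) filter. The sketch below assumes a triangle filter with unit sample spacing, whose transform is sinc squared, and assumes a field of view normalized to the range -0.5 to 0.5; the function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def triangle_rolloff(x):
    """Attenuation applied at normalized position x by the triangle filter."""
    return sinc(x) ** 2

def rolloff_correction(x):
    """Reciprocal weight that nullifies the attenuation at position x."""
    return 1.0 / triangle_rolloff(x)

positions = [-0.5, -0.25, 0.0, 0.25, 0.5]
attenuation = [triangle_rolloff(x) for x in positions]
correction = [rolloff_correction(x) for x in positions]
```

Under these assumptions the attenuation falls from 1 at the center to 4/π² (about 0.41) at the borders, so the correction weight grows to π²/4 (about 2.47) there. This growth is the accentuation of border energy noted above.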
Three methods of suppressing aliasing artifacts (oversampling, superior filtering and zero-padding) are used individually or in combination in various example embodiments described below. Oversampling within the extracted 2D slice increases the replication period in the spatial domain. This means that less energy in the tails of the in-plane replicas will fall within the borders of the final photograph. Increasing the sampling rate in one domain leads to an increase in the field of view in the other domain. Aliasing energy from neighboring replicas falls into these outer regions, which are cropped away to isolate the original, central image of interest.
Another approach to mitigating aliasing is directed to a finite-extent filter that approximates a perfect spectrum (as would be exhibited via use of an ideal filter) as closely as possible. In an example embodiment, a 4D Kaiser-Bessel separable function, kb4(s,t,u,v) = kb(s)kb(t)kb(u)kb(v), is used as the filter, where
$$kb(x) = \frac{1}{W}\,I_0\!\left(P\,\sqrt{1-(2x/W)^2}\right) \qquad (4)$$
In this equation, I0 is the standard zero-order modified Bessel function of the first kind, W is the width of the desired filter, and P is a parameter that depends on W. In this example embodiment, W values are 5, 4.5, 4.0, 3.5, 3.0, 2.5, 2.0 and 1.5, and the P values are, respectively, 7.4302, 6.6291, 5.7567, 4.9107, 4.2054, 3.3800, 2.3934, and 1.9980. For general information regarding aliasing, and for specific information regarding approaches to mitigating aliasing in connection with one or more example embodiments of the present invention, reference may be made to Jackson J. I., Meyer C. H., Nishimura, D. G. and Macovski, A., 1997, Selection of convolution function for Fourier inversion using gridding. IEEE Transactions on Medical Imaging, 10, 3, 473-478, which is fully incorporated herein by reference. In one implementation, widths "W" of less than about 2.5 are implemented to achieve desirable image quality.
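By way of example, equation (4) may be evaluated as sketched below using the W and P pairs listed above. The power-series evaluation of I0, the treatment of the filter as zero outside |x| ≤ W/2, and all function names are assumptions of this sketch and are not stated in the specification.

```python
import math

# (W, P) pairs as listed in the text above.
KB_PARAMETERS = {5.0: 7.4302, 4.5: 6.6291, 4.0: 5.7567, 3.5: 4.9107,
                 3.0: 4.2054, 2.5: 3.3800, 2.0: 2.3934, 1.5: 1.9980}

def bessel_i0(z, terms=30):
    """Zero-order modified Bessel function of the first kind (power series)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (z / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def kb(x, W):
    """1D Kaiser-Bessel filter value per equation (4)."""
    if abs(x) > W / 2.0:
        return 0.0  # assumption: finite extent, zero outside the width W
    P = KB_PARAMETERS[W]
    inside = max(0.0, 1.0 - (2.0 * x / W) ** 2)
    return bessel_i0(P * math.sqrt(inside)) / W

def kb4(s, t, u, v, W):
    """Separable 4D filter kb(s) kb(t) kb(u) kb(v)."""
    return kb(s, W) * kb(t, W) * kb(u, W) * kb(v, W)
```

A call such as kb4(0.0, 0.0, 0.0, 0.0, 2.5) gives the filter weight at the center of the 4D kernel; the weight decreases monotonically toward the edge of the support in each dimension.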
In another example embodiment, aliasing is mitigated by padding a light field with a small border of zero values before pre-multiplication and taking its Fourier transform. This pushes energy slightly further from the borders, and minimizes the amplification of aliasing energy by the pre-multiplication for rolloff correction.
FIG. 22 is a flow diagram showing an approach to refocusing in the frequency domain using various corrections described above, according to another example embodiment of the present invention. At block 2210, a discrete 4D light field is received. In the pre-processing phase, which occurs once per input light field, block 2215 checks if aliasing reduction is desired, and if so executes block 2220, which pads the light field with a small border (e.g., 5% of the width in that dimension) of zero values. At block 2225 a check is performed to determine whether rolloff correction is desired, and if so, the light field is modulated at block 2230 by the reciprocal of the Fourier transform of the resampling filter. In the final block of the pre-processing phase, the 4D Fourier transform of the light field is computed at block 2240.
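The zero-padding of block 2220 may be sketched as follows using NumPy. The sketch assumes that the border is applied on both sides of every dimension and is at least one sample wide; the function name pad_light_field and the array layout are hypothetical.

```python
import numpy as np

def pad_light_field(light_field, fraction=0.05):
    """Surround each dimension of a 4D light field with a border of zeros.

    The border width is `fraction` of the size of that dimension, rounded,
    with a minimum of one sample (an assumption of this sketch).
    """
    widths = []
    for n in light_field.shape:
        border = max(1, int(round(fraction * n)))
        widths.append((border, border))
    return np.pad(light_field, widths, mode="constant", constant_values=0.0)

L = np.ones((20, 20, 8, 8))
padded = pad_light_field(L)
```

Padding leaves the original samples unchanged, so the total energy of the light field is preserved while its nonzero content is moved away from the array borders.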
In the refocusing phase, which occurs once per desired focal depth, the process receives a desired focal depth of refocused image at block 2250, such as through the direction of a user. At block 2260, a check is performed to determine whether aliasing reduction is desired. If not, block 2270 extracts a 2D slice of the Fourier transform of the light field, with a desired 4D resampling filter, where the trajectory of the 2D slice corresponds to the desired focal depth; and block 2275 computes an inverse 2D Fourier transform of the extracted slice and moves to block 2290. If aliasing reduction is desired at block 2260, the process moves to block 2280, at which a 2D slice with desired 4D resampling filter and oversampling (e.g., 2x oversampling in each of the two dimensions) is extracted. At block 2283, the slice's inverse 2D Fourier transform is computed, and the resulting image is cropped to the original size without oversampling at block 2286, after which the process moves to block 2290. At block 2290, a check is performed to determine whether refocusing is complete. If not, another focal depth is chosen at block 2250 and the process proceeds as described above. If refocusing is complete, the process exits at block 2295.
The asymptotic computational complexity of this frequency-domain algorithm is lower than that of refocusing by explicitly summing rays as described for the alternate embodiment above. Assume that the input discrete light field has N samples in each of its four dimensions. Then the computational complexity of the algorithm that explicitly sums rays is O(N^4) for refocusing at each new depth. The computational complexity of the frequency-domain algorithm is O(N^2 log N) for refocusing at each new depth, dominated by the cost of the inverse 2D Fourier transform. However, the pre-processing step costs O(N^4 log N) for each new light field dataset.
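To give a rough sense of scale, the stated growth rates may be compared for a particular N as sketched below. The sketch ignores constant factors and assumes a base-2 logarithm, so the figures indicate orders of magnitude only and are not measured costs.

```python
import math

def ray_sum_cost(n):
    """Per-depth cost of explicitly summing rays, O(N^4)."""
    return n ** 4

def fourier_refocus_cost(n):
    """Per-depth cost of the frequency-domain approach, O(N^2 log N)."""
    return n ** 2 * math.log2(n)

def fourier_preprocess_cost(n):
    """One-time pre-processing cost per light field, O(N^4 log N)."""
    return n ** 4 * math.log2(n)

n = 64
per_depth_ratio = ray_sum_cost(n) / fourier_refocus_cost(n)

# Smallest number of refocus depths at which the one-time pre-processing
# cost is recovered, under the same simplifying assumptions.
break_even_depths = math.floor(
    fourier_preprocess_cost(n) / (ray_sum_cost(n) - fourier_refocus_cost(n))
) + 1
```

Under these assumptions and for N = 64, each refocus in the frequency domain is several hundred times cheaper than explicit summation, and the pre-processing cost is recovered after about seven refocus depths.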
In another example embodiment, the captured light rays are optically filtered. Although not limited to such applications, some examples of such filters are neutral density filters, color filters and polarizing filters. Any filter currently existing or that may be developed in the future may be used to effect a desired filtering of the rays of light. In one implementation, the light rays are optically filtered in groups or individually, so that each group or individual ray is filtered differently. In another implementation, a filtering is applied by the use of a spatially-varying filter attached to a main lens. In one example application, a gradient filter such as a neutral-density gradient filter is used to filter light. In another implementation, spatially varying filters are used in front of one or more of a ray sensor, a microlens array or a photosensor array. Referring to FIG. 1 by way of example, one or more such filters are selectively placed in front of one or more of the main lens 110, microlens array 120 and photosensor array 130.
In another example embodiment of the present invention, a computational component such as a processor is programmed to selectively choose rays to combine in computing output pixels in order to effect a desired net filtering for that pixel value. By way of example, consider embodiments involving an optical neutral-density gradient filter at the main lens; each image of the lens aperture that appears under a microlens is weighted by the filter gradient across its extent. In one implementation, output images are computed by selecting a photosensor under each microlens at the point of the gradient that matches the desired level of neutral-density filtering for that output image pixel. For example, to produce an image in which every pixel is filtered to a large extent, every pixel value is set to the value of the photosensor under the corresponding microlens that is at the extreme end of the gradient corresponding to maximum filtering.
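The selection described above may be sketched as follows for a simplified case in which each microlens image is a 1D row of photosensor values and the filter transmission seen by each photosensor position is known. The gradient table, the data layout and the function name are hypothetical and serve only to illustrate the selection step.

```python
def select_filtered_image(microlens_images, gradient, desired_level):
    """Pick, under every microlens, the photosensor whose filter level
    best matches desired_level.

    microlens_images[m][i]: value of photosensor i under microlens m.
    gradient[i]: filter transmission seen by photosensor position i
                 (hypothetical calibration data).
    """
    index = min(range(len(gradient)),
                key=lambda i: abs(gradient[i] - desired_level))
    return [image[index] for image in microlens_images]

gradient = [1.0, 0.75, 0.5, 0.25]          # transmission across the aperture
images = [[8, 6, 4, 2], [16, 12, 8, 4]]    # two microlens images
strongly_filtered = select_filtered_image(images, gradient, 0.25)
```

Choosing the extreme end of the gradient, as in the last line, corresponds to the example in which every output pixel is filtered to a large extent.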
FIG. 2 is a data-flow diagram showing an approach to processing images in connection with other example embodiments of the present invention. An image sensor arrangement 210 collects image data using microlens/photosensor chip arrangement 212 in a manner similar, for example, to the microlens array 120 and photosensor array 130 shown in FIG. 1 and described above. The image sensor arrangement 210 optionally includes an integrated processing circuit 214 bearing certain processing circuitry to prepare collected image data for transfer.
Sensor data created at the image sensor arrangement 210 is passed to a signal processor 220. The signal processor includes a low-resolution image processor 222 and one or both of a compression processor 224 and a (light) ray-direction processor 226; each of these processors is selectively implemented separately or functionally with a common processor, depending upon the application. Furthermore, each of the processors shown in FIG. 2 is selectively programmed with one or more processing functions described in connection with other figures or elsewhere herein. The signal processor 220 is optionally implemented in a common device or component with the image sensor arrangement 210, such as on a common circuit and/or in a common image device.
The low-resolution image processor 222 uses sensor data received from the image sensor arrangement 210 to generate low-resolution image data, which is sent to a viewfinder display 230. An input device 235, such as a pushbutton on a camera or video camera, sends an image capture request to the signal processor 220 requesting, for example, the capture of a particular image displayed in the viewfinder display 230 and/or to initiate video imaging where so implemented.
In response to the image capture request or as otherwise directed, the signal processor 220 uses the sensor data captured by the image sensor arrangement 210 to generate processed sensor data. In some applications, the compression processor 224 is implemented to generate compressed raw data for transfer to a data storage arrangement 240 (e.g., memory). Such raw data is then selectively processed at the signal processor 220 and/or at an external computer 260 or other processing device, implementing ray-direction processing such as that implemented with the ray-direction processor 226, which is discussed further below.
In certain applications, the ray-direction processor 226 is implemented to process the sensor data received at the signal processor 220 to rearrange the sensor data for use in generating focused and/or corrected image data. The ray-direction processor 226 uses one or both of sensor data received from the image sensor arrangement 210 and raw data sent to the data storage arrangement 240. In these applications, the ray-direction processor 226 uses ray-mapping characteristics of the particular imaging device (e.g., camera, video camera or mobile telephone) in which the image sensor arrangement 210 is implemented to determine a rearrangement of light rays sensed with the microlens/photosensor chip 212. Image data created with the ray-direction processor 226 is sent to the data storage arrangement 240 and/or to a communication link 250 for use in a variety of applications, such as in streaming image data or otherwise sending image data to a remote location.
In some applications, the integrated processing circuit 214 includes some or all of the processing functionality of the signal processor 220 by implementing, for example, a CMOS-type processor or other processor with appropriate functionality. For instance, the low-resolution image processor 222 is selectively included with the integrated processing circuit 214, with the low-resolution image data sent directly to the viewfinder display 230 from the image sensor arrangement 210. Similarly, the compression processor 224, or functionality similar thereto, is selectively implemented with the integrated processing circuit 214.
In some applications, computation of final images may be performed on the integrated processing circuit 214 (e.g. in some digital still cameras that output only final images). In other applications, the image sensor arrangement 210 may simply transmit the raw light ray data, or a compressed version of these data, to an external computational device, such as a desktop computer. Computation of final images from these data is then performed on the external device.
FIG. 3 is a flow diagram for a method for processing image data, according to another example embodiment of the present invention. At block 310, image data is captured at a camera or other imaging device, using a main lens or a lens stack with a microlens/photosensor array such as that shown in FIG. 1. If a preview image is desired at block 320, the preview image is generated at block 330 using, for example, a viewfinder or other type of display. The preview image is displayed, for example, in a viewfinder of a camera or video camera, using a subset of the captured image data.
Raw data from the photosensor array is processed and compressed for use at block 340. Light ray data is extracted from the processed and compressed data at block 350. This extraction involves, for example, detecting a bundle or set of light rays incident upon a particular photosensor in the photosensor array. Ray mapping data is retrieved at block 360 for the imaging arrangement in which the image data is captured. The ray-mapping data and the extracted light ray data are used to synthesize a re-sorted image at block 370. For example, the extraction, mapping and synthesis blocks 350 - 370 are selectively implemented by determining a bundle of light rays for a particular pixel of a scene for which the light rays were collected, and integrating the energy of the light rays to synthesize a value for the particular pixel. In some applications, the ray mapping data is used to trace light rays for each particular pixel through actual lenses used to acquire the image data. For example, by determining an appropriate set of rays to add together in order to focus upon a selected subject at a particular focal depth at block 370, the rays can be re-sorted to arrive at a focused image. Similarly, by determining a proper arrangement of rays to correct for conditions such as lens aberrations in the imaging device, the rays can be re-sorted to generate an image relatively free of characteristics relating to the aberrations or other condition.
A variety of approaches are selectively used to generate preview images for camera-type and other applications. FIG. 4 is a process flow diagram for generating such a preview image, according to another example embodiment of the present invention. The approach shown in FIG. 4 and discussed below may be implemented, for example, in connection with the generation of a preview image at block 330 of FIG. 3.
A preview instruction with raw sensor image data is received at block 410. At block 420, a center pixel is selected from each microlens image in the raw sensor image data. The selected center pixels are collected to form a high depth-of-field image at block 430. At block 440, the high depth-of-field image is downsampled to a resolution amenable for use in a viewfinder display. Referring to FIG. 2 by way of example, such downsampling is selectively performed at one or more of the image sensor arrangement 210 or the signal processor 220. The generated preview image data is sent to the viewfinder display at block 450, and at block 460, the viewfinder displays an image with the preview image data.
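Blocks 420 through 440 may be sketched as follows for raw sensor data held as a 2D list in which every microlens image covers a square block of pixels. The square layout, the use of simple decimation for downsampling and the function name are assumptions of this sketch.

```python
def preview_image(raw, lens_size, step=2):
    """Build a preview from the center pixel of each microlens image.

    raw: 2D list of sensor values.
    lens_size: each microlens image covers lens_size x lens_size pixels.
    step: decimation factor used for downsampling (assumed method).
    """
    c = lens_size // 2
    rows = len(raw) // lens_size
    cols = len(raw[0]) // lens_size
    # Blocks 420/430: collect the center pixel under every microlens.
    centers = [[raw[r * lens_size + c][q * lens_size + c]
                for q in range(cols)]
               for r in range(rows)]
    # Block 440: downsample to a viewfinder-sized image.
    return [row[::step] for row in centers[::step]]

raw = [[10 * i + j for j in range(6)] for i in range(6)]   # 2 x 2 microlenses
full_res_preview = preview_image(raw, lens_size=3, step=1)
```

Because only one photosensor value is read per microlens, the collected image corresponds to a small sub-aperture of the main lens, which is consistent with the high depth of field described above.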
FIG. 5 is a process flow diagram for processing and compressing image data, according to another example embodiment of the present invention. The approach shown in FIG. 5 and discussed below may be implemented, for example, in connection with the processing and compressing of image data at block 340 of FIG. 3. When implemented with an arrangement as shown in FIG. 2, the approach shown in FIG. 5 may, for example, be implemented at one or both of the image sensor arrangement 210 and the signal processor 220.
At block 510, raw image data is received from a sensor array. If coloring is desired at block 520, color filter array values are demosaiced at block 530 to produce color at the sensors. If rectification and alignment are desired at block 540, microlens images are rectified and aligned with the photosensor array at block 550. If interpolation is desired at block 560, pixel values are interpolated at block 570 to an integral number of pixels associated with each microlens. At block 580, the processed raw image data is compressed and presented for synthesis processing (e.g., to form a refocused and/or corrected image).
FIG. 6 is a process flow diagram for image synthesis, according to another example embodiment of the present invention. The approach shown in FIG. 6 and discussed below may be implemented, for example, in connection with the image synthesis approach shown at block 370 of FIG. 3 and discussed further below.
At block 610, raw image data is received from a photosensor array. If refocusing is desired at block 620, image data is refocused at block 630 using, e.g., approaches discussed herein for selectively re-sorting light represented by the raw image data. If image correction is desired at block 640, image data is corrected at block 650. In various applications, image correction at block 650 is carried out before or concurrently with refocusing at block 630 in applications where both refocusing and image correction are desirable. A resultant image is generated at block 660 using processed image data including refocused and corrected data, where applicable.
FIG. 7A is a process flow diagram for image refocusing with a lens arrangement, according to another example embodiment of the present invention. The approach shown in FIG. 7 and discussed below may be implemented, for example, in connection with the refocusing of image data at block 630 in FIG. 6.
At block 710, a virtual focal plane for refocusing an image portion is selected. At block 720, a pixel of a virtual image for the virtual focal plane is selected. If correction (e.g., for lens aberration) is desired at block 730, the value of a virtual light ray (or virtual set of light rays) passing between the selected pixel and each particular lens position is calculated at block 740. In one application, this calculation is facilitated by computing the conjugate light ray that would fall upon the selected pixel and tracing that ray through the path in the lens arrangement.
At block 750, the light ray (or virtual set of light rays) values for each lens position for the particular focal plane are added to determine a total value for the selected pixel. In some applications, the sum added at block 750 is a weighted sum, wherein certain light rays (or sets of light rays) are given greater weight than others. If there are additional pixels for refocusing at block 760, another pixel is selected at block 720 and the process continues until no further pixels are desirably refocused. After the pixels have been refocused, the pixel data is combined at block 770 to generate a refocused virtual image at the virtual focal plane selected in block 710. The refocusing approaches involving some or all of blocks 720, 730, 740 and 750 in FIG. 7 are carried out via more specific functions for a variety of applications.
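The per-pixel summation of blocks 720 through 770 may be sketched as below. The ray lookup is supplied by the caller as a hypothetical function ray_value(pixel, lens_position), standing in for the ray tracing of block 740, and the optional weight function represents the weighted sum mentioned above; none of these names come from the specification.

```python
def refocus_pixels(pixels, lens_positions, ray_value, weight=None):
    """Compute a refocused virtual image as a sum of ray values.

    pixels: iterable of pixel identifiers on the virtual focal plane.
    lens_positions: iterable of sampled positions on the lens.
    ray_value(pixel, lens_pos): hypothetical lookup of the value of the ray
        passing between the pixel and the lens position (block 740).
    weight(lens_pos): optional per-ray weight for a weighted sum.
    """
    image = {}
    for pixel in pixels:                       # block 720: select a pixel
        total = 0.0
        for lens_pos in lens_positions:        # block 750: add ray values
            w = 1.0 if weight is None else weight(lens_pos)
            total += w * ray_value(pixel, lens_pos)
        image[pixel] = total
    return image                               # block 770: combined pixels

# Toy ray lookup used only to exercise the loop structure.
toy = refocus_pixels([0, 1], [-1, 0, 1], lambda p, u: p + u)
```

In a practical arrangement the lookup would map each (pixel, lens position) pair to recorded photosensor data for the chosen virtual focal plane; the loop structure itself is unchanged.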
The sensor data processing circuitry implemented with one or more example embodiments described herein includes one or more microprocessors, Application-Specific Integrated Circuits (ASICs), digital signal processors (DSPs), and/or programmable gate arrays (for example, field-programmable gate arrays (FPGAs)), depending upon the implementation. In this regard, sensor data processing circuitry may be any type or form of circuitry whether now known or later developed. For example, the sensor data processing circuitry may include a single component or a multiplicity of components (microprocessors, ASICs and DSPs), either active and/or passive, which are coupled together to implement, provide and/or perform a desired operation/function/application.
In various applications, the sensor data processing circuitry performs or executes one or more applications, routines, programs and/or data structures that implement particular methods, tasks or operations described and/or illustrated herein. The functionality of the applications, routines or programs is selectively combined or distributed in certain applications. In some applications, the applications, routines or programs are implemented by sensor (or other) data processing circuitry using one or more of a variety of programming languages, whether now known or later developed. Such programming languages include, for example, FORTRAN, C, C++, Java and BASIC, whether compiled or uncompiled code, selectively implemented in connection with one or more aspects of the present invention.
The various embodiments described above are provided by way of illustration only and should not be construed to limit the invention. Based on the above discussion and illustrations, those skilled in the art will readily recognize that various modifications and changes may be made to the present invention without strictly following the exemplary embodiments and applications illustrated and described herein. For instance, such changes may include implementing the various optical imaging applications and devices in different types of applications, increasing or decreasing the number of rays collected per pixel (or other selected image area), or implementing different algorithms and/or equations than the examples described to assemble or otherwise process image data. Other changes may involve using coordinate representations other than or in addition to Cartesian coordinates, such as polar coordinates. Such modifications and changes do not depart from the true spirit and scope of the present invention.
We Claim:
1. A method for digitally imaging a scene, the method comprising:
using a photosensor array to simultaneously detect light from the scene that is passed to different locations on a focal plane;
determining the angle of incidence of the light detected at the different locations on the focal plane; and
using the determined angle of incidence and the determined depth of field to compute an output image in which at least a portion of the image is refocused.
2. The method of claim 1, wherein computing an output image includes refocusing the detected light to form an image having at least a portion thereof focused at a different focal plane.
3. The method of claim 2, wherein refocusing the detected light includes:
determining a set of light rays from the detected light that make up the at least a portion of the image focused at a different focal plane; and
combining the determined set of light rays to form the portion of the output image focused at the different focal plane.
4. The method of claim 1, wherein refocusing the detected light includes correcting for conditions of optics used in detecting the light.
5. The method of claim 1, wherein refocusing the detected light includes correcting for lens aberration.
6. A digital imaging system for synthesizing an image from a scene, the system comprising:
a photosensor array having a plurality of photosensors;
an optics arrangement adapted to direct light from the scene to the photosensor array via a two-dimensional focal plane;
wherein, for a set of light rays passing through a particular portion of the two-dimensional focal plane, a plurality of the photosensors are arranged to detect different ones of the set of light rays that concurrently arrive at the particular portion of the two-dimensional focal plane at different angles of incidence; and
a processor programmed to compute a synthesized refocused image as a function of the light detected at different photosensors and of an angle of incidence of the detected light upon the two-dimensional focal plane as characterized by the location of each photosensor detecting the light.
7. The system of claim 6, wherein the optics arrangement includes a microlens array located at the focal plane, the microlens array including a plurality of microlenses, each microlens adapted to separate rays of light incident upon the microlens and to direct the separated rays to a set of photosensors in the photosensor array.
8. The system of claim 7, wherein the processor determines the angle of incidence of light upon each photosensor using the position of each photosensor relative to the microlens.
9. The system of claim 6, wherein the processor is programmed to determine a depth of field of a subject in the scene from which the detected light emanates and to use the depth of field and the angle of incidence of the detected light to compute a synthesized image focused upon a focal plane that is different from the two-dimensional focal plane.
10. A digital imaging system for synthesizing an image from a set of captured light rays, the system comprising:
a main lens;
a photosensor array for capturing a set of light rays;
a microlens array between the main lens and the photosensor array, the set of light rays being physically directed from the main lens and through the microlens array to the photosensor array;
a data processor to compute a synthesized refocused image via a virtual redirection of the set of light rays captured by the photosensor array.
11. The system of claim 10, wherein the data processor computes an image by selectively combining selected light rays to virtually redirect the set of light rays.
12. The system of claim 10, wherein the data processor computes an image by selectively adding selected light rays to virtually redirect the set of light rays.
13. The system of claim 10, wherein the data processor computes an image by selectively weighting and adding light rays to virtually redirect the set of light rays.
14. The system of claim 10, wherein the data processor computes an image by virtually redirecting the set of light rays based on a spatial distribution of the set of light rays on the photosensor array and based on the physical direction of the set of light rays from the main lens and through the microlens array to the photosensor array.
15. The digital imaging system of claim 10, wherein the data processor virtually redirects the set of light rays to refocus a portion of the image at a focal plane different from a plane in which the microlens array is located.
16. The digital imaging system of claim 10, wherein the data processor virtually redirects the set of light rays to correct for a lens-aberration.
17. The digital imaging system of claim 10, wherein the data processor virtually redirects the set of light rays to extend the depth of field of the synthesized image.
18. The digital imaging system of claim 10, wherein the photosensor array includes, for each microlens in the microlens array, a plurality of photosensors.
19. The digital imaging system of claim 18, wherein the main lens focuses a two-dimensional image of a scene upon the microlens array, and wherein each microlens in the microlens array is adapted to diverge light rays focused thereupon by the main lens and to direct the diverged light rays to the plurality of photosensors for the microlens.
20. The digital imaging system of claim 10, wherein the data processor uses data from the photosensor array to create focused sub-images from different portions of a scene depicted by the captured light rays by resolving different depths of focus for each sub-image.
21. The digital imaging system of claim 20, wherein the data processor synthesizes a final image by combining the focused sub-images.
22. Imaging arrangements and methods therefor as claimed, substantially as herein described with the foregoing description and figures.

Dated this 20th day of April 2007.

Dr. Rajeshkumar H. Acharya Advocate & Patent Agent For and on Behalf of Applicant

ABSTRACT
Image data is processed to facilitate focusing and/or optical correction. According to an example embodiment of the present invention, an imaging arrangement collects light data corresponding to light passing through a particular focal plane. The light data is collected using an approach that facilitates the determination of the direction from which various portions of the light incident upon a portion of the focal plane emanate. Using this directional information in connection with the value of the light as detected by photosensors, an image represented by the light is selectively focused and/or corrected.

Patent Number 259547
Indian Patent Application Number 584/MUMNP/2007
PG Journal Number 12/2014
Publication Date 21-Mar-2014
Grant Date 18-Mar-2014
Date of Filing 23-Apr-2007
Name of Patentee THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
Applicant Address 1705 EL CAMINO REAL, PALO ALTO, CA
Inventors:
# Inventor's Name Inventor's Address
1 NG YI-REN 563 HOPE STREET, MOUNTAIN VIEW, CA 94041
2 HANRAHAN PATRICK M 40 MINOCA ROAD, PORTOLA VALLEY, CA 94028
3 LEVOY MARC S 18 PETER COUTTS CIR., STANFORD, CA 94305
4 HOROWITZ MARK A 1309 SAN MATEO DRIVE, MENLO PARK, CA 94025
PCT International Classification Number G03C7/26
PCT International Application Number PCT/US2005/035189
PCT International Filing date 2005-09-30
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 60/615,179 2004-10-01 U.S.A.
2 60/647,492 2005-01-27 U.S.A.