Understanding Light Field Photography

Traditional cameras, whether digital or analog, capture a scene from a single point of view. Photoshop (and similar programs) enable us to do amazing things with these images, but no amount of post-processing can increase the information encoded by the original device. One obvious way to capture more information is to increase the size of the image sensor, and, indeed, as digital photography has advanced we’ve seen a steady progression to larger and larger sensors for exactly this reason.

In fact, the achievable size of CMOS image sensors now greatly exceeds the needs of almost all practical applications, and as we shall later explain, this “surplus” of sensor pixels is one of the key advances opening the way to widespread implementation of “light field” (or “plenoptic”) imaging. The great promise of this new kind of photography is the ability to capture not just more information about a scene, but more kinds of information. Access to these new “data types” opens up an array of software post-processing options that are impossible with traditional cameras—options like dynamic refocusing, perspective shifting, and even 3D scanning.

How it Works

The light field represents all of the light rays that enter the camera, not just those that are focused on the film or sensor. In a conventional camera, light rays from a scene are focused by the lens onto the film or sensor. In a digital photograph, each pixel has a number value indicating brightness, which in turn represents the sum of all the light rays that the lens focuses on that part of the sensor, whatever direction they arrived from. In a light-field camera, however, the rays that make up each pixel are not summed. They are measured and recorded separately, both in terms of their brightness and the direction from which they arrive.
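The difference between summing rays and recording them separately can be sketched in a few lines of Python. This is only a toy model: the tiny table of brightness values is made up, and the function names are my own, not part of any camera software.

```python
# Toy light field: light_field[x][u] is the brightness of the ray that
# arrives at sensor position x from direction index u (made-up values).
light_field = [
    [0.1, 0.2, 0.3],   # rays arriving at position 0
    [0.4, 0.4, 0.4],   # rays arriving at position 1
    [0.9, 0.0, 0.3],   # rays arriving at position 2
]

def conventional_pixels(field):
    """A conventional sensor keeps only the total brightness per position."""
    return [sum(rays) for rays in field]

def light_field_samples(field):
    """A light-field sensor keeps one (position, direction, brightness)
    record per ray instead of a single total."""
    return [(x, u, b) for x, rays in enumerate(field) for u, b in enumerate(rays)]

pixels = conventional_pixels(light_field)
samples = light_field_samples(light_field)
```

Positions 1 and 2 receive very different sets of rays but end up with the same pixel value once the rays are summed. That lost directional detail is exactly what the light-field samples retain.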

Hardware

The light field is really a mathematical abstraction that has applications in fields ranging from physics to machine vision to computer graphics. Devices that capture some part of the light-field exist in many forms. The Lytro captures the light field by directing the light rays that enter the main lens onto an array of over one hundred thousand micro lenses. The micro lenses in turn direct the light rays onto a 6.5 x 4.5 mm CMOS sensor with the ability to capture eleven million light rays arranged in a 3280 by 3280 grid. Information stored includes color, intensity, direction and distance. Each micro lens uses a roughly 10×10 pixel portion of the CMOS sensor.
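The numbers quoted above are consistent with one another, which a quick calculation shows. This sketch assumes a simple square grid of micro lenses with exactly 10×10 pixels each; the real layout may differ, so treat the results as approximations.

```python
SENSOR_SIDE_PX = 3280        # pixels per side of the grid, from the text
PIXELS_PER_LENS_SIDE = 10    # "roughly 10×10" pixels per micro lens

# Total rays recorded: one per sensor pixel.
total_pixels = SENSOR_SIDE_PX ** 2

# Micro lenses, assuming a square grid (an assumption, not a spec).
lenses_per_side = SENSOR_SIDE_PX // PIXELS_PER_LENS_SIDE
micro_lenses = lenses_per_side ** 2

# Directions sampled behind each micro lens.
directions_per_lens = PIXELS_PER_LENS_SIDE ** 2

print(total_pixels, micro_lenses, directions_per_lens)
```

Under these assumptions the sensor records 10,758,400 rays (the "eleven million"), spread across 107,584 micro lenses (the "over one hundred thousand"), with about 100 directions sampled at each one.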

The Lytro consumer light-field camera has been around for a couple of years, and the feature most people seem to talk about is the ability to fix or change the focal point of a picture after it has been taken. But that isn’t the whole story; in fact, it’s just the tip of the iceberg.

The image above shows the light rays from the right entering the camera and being directed to the CMOS sensor by the micro lens array.

The Software

Capturing the light field is only the first step. The next step is to generate an image that can be viewed. Doing this has been described as “ray tracing in reverse.” To explain what this means I’m going to describe how pinhole cameras work, touch briefly on ray tracing, and finally explain the Lytro rendering process. All three have two things in common. First, there is an observer viewing the scene; second, the scene must somehow be rendered onto a screen.

In a pinhole camera light rays pass through a hole in the front of the camera and appear on the opposite surface as an image that is upside down and reversed. The back wall is essentially a screen. If you were inside the camera with your back to the pinhole lens you would see the projected image in front of you. (The “scene object” is supposed to be a cactus. It can be anything though. Ideally all the diagrams would be from a pseudo 3D perspective but that is beyond my drawing skills.)
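The flipped image falls out of simple geometry: a ray from a scene point travels in a straight line through the pinhole and lands on the opposite side of the back wall. Here is a minimal sketch, assuming the pinhole sits at the origin and using a function name and coordinates of my own choosing.

```python
def pinhole_project(point, box_depth):
    """Project a scene point (x, y, z) through a pinhole at the origin.

    z is the distance of the point in front of the pinhole, and the back
    wall of the camera sits box_depth behind the pinhole. Returns the
    (x, y) position where the ray lands on the back wall.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the pinhole")
    # Similar triangles give the scale; the minus sign is the flip.
    scale = -box_depth / z
    return (x * scale, y * scale)

# Top of the cactus: right of centre, above centre, 10 units away.
top_of_cactus = (0.5, 2.0, 10.0)
image_point = pinhole_project(top_of_cactus, box_depth=0.1)
```

Both coordinates change sign, so a point that is up and to the right in the scene lands down and to the left on the back wall.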

In ray tracing a scene is described mathematically using geometric shapes, textures, and light sources. A point outside the scene is selected that represents the position of the observer, and an image is generated from the perspective of that observer. Ray tracing differs from the pinhole camera example in three ways. First, the scene does not exist in the real world; it exists only as a mathematical description. Second, the scene is in front of the observer rather than behind. Third, because the scene is virtual, the image on the screen has to be computed rather than projected by real light. This is done on a pixel by pixel basis. “View” rays are cast out into the scene for each pixel, and the color of the pixel is calculated based on the objects and light sources each view ray hits while traversing the scene. Casting view rays from the observer, rather than following every ray emitted by the light sources, greatly reduces rendering time because only the rays that actually reach the observer are computed.
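To make the pixel by pixel process concrete, here is a deliberately tiny ray tracer. It is a sketch under heavy assumptions: the scene is a single sphere I made up, the observer sits at the origin looking at a screen one unit away, and there is no lighting or shading, only a hit-or-miss test per pixel.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the sphere, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                      # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2 * a)
    return t if t > 0 else None          # ignore hits behind the observer

def render(width, height):
    """Cast one view ray per pixel and mark the pixels that hit the sphere."""
    observer = (0.0, 0.0, 0.0)
    sphere_center, sphere_radius = (0.0, 0.0, 3.0), 1.0
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Centre of pixel (i, j) on a screen one unit from the observer.
            x = (i + 0.5) / width * 2 - 1
            y = 1 - (j + 0.5) / height * 2
            view_ray = (x, y, 1.0)
            hit = hit_sphere(observer, view_ray, sphere_center, sphere_radius)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows

image = render(9, 9)
print("\n".join(image))
```

A real ray tracer would go on to follow each hit toward the light sources to work out color and shadow, but the structure is the same: one view ray per pixel, starting at the observer.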

In the case of the Lytro we have the stored light rays that describe the scene captured when the picture was taken. In order to project an image onto our screen a focal point needs to be chosen. Given the selected focal point, the Lytro software uses the stored light rays to render the image. This is ray tracing in reverse in that the rays projected onto the screen to generate an image have their origin point within the scene rather than having been projected from the observer and through the screen into the scene.
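I don't know how the Lytro software does this internally, but a common textbook way to refocus a light field is "shift and add": treat the stored rays as a set of views from slightly different positions across the lens, shift each view in proportion to its position, and average. The sketch below uses one-dimensional views and made-up data; the function name and the shift parameter are my own.

```python
def refocus(views, shift):
    """Shift-and-add refocusing on a stack of 1D views.

    views: equal-length lists, one per viewpoint across the lens.
    shift: pixels of displacement per viewpoint step. Choosing it picks
    which depth in the scene ends up sharp.
    """
    width = len(views[0])
    centre = len(views) // 2
    out = []
    for x in range(width):
        total, count = 0.0, 0
        for u, view in enumerate(views):
            src = x + shift * (u - centre)   # where this view saw the point
            if 0 <= src < width:
                total += view[src]
                count += 1
        out.append(total / count if count else 0.0)
    return out

# A bright point that moves one pixel per viewpoint (parallax).
views = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
]
sharp = refocus(views, shift=1)     # shift matches the parallax
blurred = refocus(views, shift=0)   # shift does not match
```

With a shift of 1 the three views line up and the point comes out at full brightness in a single pixel. With a shift of 0 the same light is smeared across three pixels at a third of the brightness each, which is what an out-of-focus point looks like.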

To review: in ray tracing, the scene is rendered by shooting view rays out from the observer and through the screen, while with the Lytro, the scene is rendered by shooting light rays captured in the scene back onto the screen.

The Future

A recent addition to the Lytro Library software is the ability to create 3D images. This ability demonstrates another advantage of capturing the light field. Shifting your viewpoint left or right, up or down within the camera gives you a slightly different perspective on the scene. The data needed to render those shifted views is part of the captured light field and can be used to generate 3D renderings of the captured scene. To view them, a pair of inexpensive anaglyph glasses is all you need, and the images are easy to export as JPGs so that anyone with anaglyph glasses can view them.
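Once you have a left view and a right view, building an anaglyph is simple. The sketch below uses the usual red-cyan convention (red channel from the left view, green and blue from the right). It is an illustration of the general idea rather than a description of what Lytro Library does, and it represents images as plain nested lists of (r, g, b) tuples to stay self-contained.

```python
def make_anaglyph(left, right):
    """Combine two same-sized RGB images into a red-cyan anaglyph.

    Each image is a list of rows, and each row is a list of (r, g, b)
    tuples. The result takes red from the left view and green and blue
    from the right view.
    """
    same_shape = len(left) == len(right) and all(
        len(lrow) == len(rrow) for lrow, rrow in zip(left, right)
    )
    if not same_shape:
        raise ValueError("views must have the same dimensions")
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]

# One-pixel example images with made-up colors.
left_view = [[(200, 10, 10)]]
right_view = [[(20, 150, 250)]]
anaglyph = make_anaglyph(left_view, right_view)
```

Through the glasses, the red filter passes only the left view to one eye and the cyan filter passes only the right view to the other, and your brain fuses the pair into depth.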

One downside of the Lytro camera is the lack of a published API for image manipulation. Lytro uses a proprietary image format, and while a lot of work has been done to reverse engineer it and create software to manipulate images, as I write this there is no comprehensive cross-platform API or software available for working with LFPs (“light field picture” files). The best resource I’ve found is “Lytro Meltdown,” which tends to be Windows-centric but contains a lot of useful information.

Light field photography is still in its infancy. Prices were initially high and the results underwhelming, but things are starting to change on both fronts. The base 8GB Lytro camera has frequently been on sale for $200 to $300 recently, and new capabilities are being added to both the camera firmware and the Lytro Library photo management software on a regular basis.

The thing I’d most like to see is a comprehensive cross platform open source library that can be used to manipulate and manage LFP files. The holy grail would be software capable of taking a raw Lytro file and rendering images.

By Mike Miller
