4D Capture and 6D Display
Formal Metadata
Title: 4D Capture and 6D Display
Part Number: 4
Number of Parts: 13
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/30386 (DOI)
Production Place: Cambridge
Transcript: English (auto-generated)
00:01
Thanks everyone for being here. I would like to start with apologizing on behalf of Ramesh. Professor Ramesh Raskar who wanted to be here but unfortunately his trip to India got extended by a day and he's coming back tomorrow. Hopefully that won't hinder the talk and I'm hoping you can get the same benefit from it.
00:24
So what I'm going to talk about today is slightly different from what people think of in terms of traditional holography and the fundamental difference over here is that most of the stuff going on at our group and what I'm going to present today is going to be based on geometric optics instead of wave optics.
00:43
So we're going to talk a lot about rays and how rays propagate in space. So since it's a little different from the way we think of holograms in general, traditional holograms, I'm going to go over some basic fundamentals just to make sure everyone understands what I'm talking about
01:02
when I talk about rays. So there is this concept of a 5D plenoptic function which basically describes the radiance or the intensity of a ray of light at any point in space in any given direction. So it's a 5D function because you can imagine in free space you have three dimensions for the X, Y, Z, any position in space and then you have two dimensions for the direction
01:24
in which you want to describe the intensity of light. So it becomes a five dimensional function and it's called the plenoptic function. But if you consider the special case where we're talking about free space, you can imagine if I hold a point in space over here and I have a ray going in a certain direction,
01:41
the intensity of the ray of light does not change along that direction if there's no occluder between those two points. And this actually means that one dimension of these five dimensions actually becomes redundant and we're left with just a four dimensional function which is what we call the light field. And there are various ways of representing this 4D function
02:01
but the most common one is the so-called two-plane representation or parameterization, where we take two planes which are separated by some finite distance, and you pick a point on one plane, a point on the other plane, and join these two points. And this defines one ray of light that goes through the space between those two planes.
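As a concrete illustration, here is a minimal sketch of the two-plane parameterization in Python; the function name, plane spacing and example coordinates are illustrative, not from the talk:

```python
import numpy as np

def ray_from_two_planes(u, v, s, t, d=1.0):
    """A light-field sample L(u, v, s, t): the ray through the point
    (u, v) on the first plane and (s, t) on a second plane a distance
    d away. Returns the ray's origin and unit direction."""
    origin = np.array([u, v, 0.0])
    through = np.array([s, t, d])
    direction = through - origin
    return origin, direction / np.linalg.norm(direction)

# Example: the ray joining the plane points (0.2, 0.1) and (0.5, 0.4)
origin, direction = ray_from_two_planes(0.2, 0.1, 0.5, 0.4)
```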
02:22
If you take all the points on both those planes, you can completely describe all rays of light that go between those two planes. So that's the basic idea of what a light field is and why it is four dimensional. So if you look at an object or a scene, you can consider all the
02:42
rays of light that are coming out from that scene, and since this is a 4D light field, this becomes a four dimensional function that you have coming out from the scene. You can also invert this problem and consider the four degrees of freedom you have in the illumination that is incoming on the scene.
03:02
And together this actually gives you an eight dimensional function which we call the reflectance field. And this reflectance field completely, at least in terms of geometric optics, describes how a scene would appear from any given direction under any given illumination conditions. And if you can capture this complete eight dimensional reflectance field,
03:21
you can recreate the object in any way under any illumination condition. And that's the direction we want to go in. Unfortunately we're not quite there yet and what we're going to present to you, what I'm going to talk about in this talk, is only a much smaller subset. And I'm going to talk in particular about a four dimensional capture, so not the complete 8D,
03:42
but just a 4D capture and a 6D display. And again this is all in terms of geometric optics, so you can obviously use holograms to do a lot of these things, but I'm going to talk about some other techniques that we've come up with. So let's just talk about what a camera does in very simple terms
04:03
and what part of this 4D incoming light field it captures. So if you consider a traditional camera, you have a lens and then you have a sensor some distance behind it. And the way it images a point in the scene is that it focuses all the rays coming out of that point onto a single point on the sensor.
04:22
Unfortunately what's happening here is you are losing out on a lot of information that's incoming into the camera. And in particular, if you think of the two-plane parameterization that we spoke about, what's happening is you're losing all the information in one of those planes. In particular, the angular information is completely lost.
04:42
And all you get is the spatial information. And this actually happens because of, I don't have a laser pointer, but if you look at this angle that's been collapsed, so all the rays that are coming in within this angle are actually integrated or summed up.
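To make that loss concrete, here is a minimal sketch, assuming a discretized 4D light field stored as a NumPy array (the array shape and names are illustrative):

```python
import numpy as np

# Hypothetical discretized light field: axes (u, v) index samples across
# the aperture (the angular dimensions), (s, t) index sensor-plane pixels.
L = np.random.rand(9, 9, 64, 64)  # stand-in data, for illustration only

# A conventional camera pixel integrates over the whole aperture, so the
# recorded image collapses the two angular dimensions into one value:
conventional_image = L.sum(axis=(0, 1))   # shape (64, 64); (u, v) is gone

# The individual samples L[u, v, s, t] cannot be reconstructed from
# conventional_image alone -- that is the information a plenoptic
# (light-field) camera is designed to keep.
```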
05:05
And you can never recover what the intensity of any of these individual rays was. That information is completely lost when you use a traditional camera. So people have come up with ways of getting around this problem and it's been known for over a century now this concept of integral photography
05:24
where you essentially put micro lenses instead of a sensor in the plane where these rays are coming together, let these rays propagate through, and have the sensor at some distance behind these micro lenses. And what this gives us is a small image behind each of these micro lenses
05:43
and each of these images gives us this angular information that was otherwise lost in a traditional camera. And so by using this kind of a camera which contains a micro lens array right in front of an imaging sensor, you can recover the complete four-dimensional light field that is entering the camera. So you can essentially capture all the geometric information
06:03
that's coming into the camera in terms of the light. A prototype of this kind of a camera was recently built by some scientists in Stanford and they've actually since then started a startup company based on this. And what they did was they took one of these micro lens arrays
06:23
and put this right in front of the sensor of a standard digital medium format camera. What this gives us is something very interesting. It's similar to some of the stuff you saw in the last talk except since we're not using holograms and we're not using wave optics in this case,
06:41
we don't need coherent light. You don't need lasers. You can work with incoherent daylight and normal light. So one example of what they could do: with just one single snapshot, just by capturing one image and then processing the data they acquired from that one image, they're able to refocus at different planes in that image.
07:02
So for example in this scene you have people standing at different depths and from one capture they are able to recover this image and then this, this, this and this. So I'm going to just run through them again. You can see it's focusing at different planes, at different depths from the camera.
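A minimal sketch of how such refocusing can be computed from a captured light field, using the standard shift-and-add idea; the scaling convention and array layout here are assumptions, not the exact pipeline from the talk:

```python
import numpy as np

def refocus(L, alpha):
    """Shift-and-add refocusing of a discretized light field.

    L has shape (nu, nv, ns, nt): (u, v) index aperture (angular) samples,
    (s, t) index spatial samples. alpha controls the virtual focal plane;
    alpha = 0 reproduces the original focus setting.
    """
    nu, nv, ns, nt = L.shape
    out = np.zeros((ns, nt))
    for iu in range(nu):
        for iv in range(nv):
            # Shift each aperture view in proportion to its offset from the
            # aperture centre, then accumulate all the views.
            du = int(round((iu - nu // 2) * alpha))
            dv = int(round((iv - nv // 2) * alpha))
            out += np.roll(np.roll(L[iu, iv], du, axis=0), dv, axis=1)
    return out / (nu * nv)

# Sweeping alpha moves the virtual focal plane through the scene,
# all from the single captured light field.
```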
07:27
So this was done by just placing a micro lens array right next to the sensor in a traditional camera. What our group has been trying to do is how can we do something similar but without using a micro lens array. And we've come up with a technique of doing this kind of stuff by using just these kind of masks.
07:48
So instead of placing a micro lens in front of the camera, we place one of these masks in front of the sensor and we're able to get very similar results which is what I'm going to get into next. So one of the first projects that we did was this, what we call the coded aperture camera
08:03
in which we ripped open the lens of a camera and we put a very carefully designed mask very similar to this one which is opaque at certain places and transparent at other places into the aperture of the lens. So that's the mask that we used. And you can see the portions that are white are what are transparent and let the light go through.
08:23
And the parts that are black are of course opaque and light doesn't get through. So essentially we are throwing away half the light that's entering the camera. And it turns out by throwing away half the light in a very structured, in a very well thought out manner, you can actually recover much more information out of the photograph that you capture. So in particular what this particular mask lets us do is if the scene is perfectly in focus,
08:48
it actually does not change anything at all.
09:02
So if you have a point source over here, an LED, it appears the same, just half as bright as it was before. However, if the scene is out of focus, with a traditional camera you would get these out-of-focus blurs, the so-called bokeh effect, coming from the finite size of the aperture of the lens.
09:23
But because we placed this mask inside the aperture, instead of the circular bokeh, we get this well structured bokeh which is actually an image of the mask that we put in the lens aperture. And it turns out by converting the point spread function from a disc, a circular disc, to this very well structured mask pattern, we are able to use computational techniques to recover an in focus image from an out of focus image.
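A minimal sketch of the kind of computational step involved: if the out-of-focus point spread function (the scaled image of the coded aperture) is known, a simple Wiener deconvolution can invert the blur. The function below is an illustrative stand-in, not the authors' exact algorithm:

```python
import numpy as np

def wiener_deblur(blurred, psf, snr=1e-2):
    """Recover a sharper image from an out-of-focus photo given the
    point spread function (here, the scaled image of the coded aperture).
    A broadband mask keeps the PSF spectrum free of zeros, which is what
    makes this inversion well conditioned."""
    H = np.fft.fft2(psf, s=blurred.shape)    # PSF spectrum, padded to image size
    B = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + snr)  # regularized inverse filter
    return np.real(np.fft.ifft2(B * W))
```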
09:49
So for example, over here, the camera misfocused. Instead of focusing on the face, it focused on the painting in the background. But because this picture was taken using this mask placed in the aperture of the camera,
10:02
we are actually able to recover an image as if it were focused on the face. So you can see the glint in the eye is preserved. And this is something that would be impossible using a traditional camera if you didn't have this modification in the camera. So what we saw over here was we used this mask in the lens aperture and this gave us more or extra information.
10:27
This is kind of similar to the previous talk where they were looking at using holograms as an optical element. What we are looking at is how can we use these masks as optical elements. So in some sense, it's a simpler problem than the one they're looking at.
10:41
But it turns out this itself gives quite a lot of flexibility in terms of how we photograph. So the thing that we investigated next was what happens when we move the mask to different planes between the lens and the sensor. And in particular, what happens when we place this mask right next to the sensor. So we took this mask, which I have here. You're welcome to come and take a look later.
11:06
And we placed it in front of the sensor of a medium format camera, very similar to the way researchers at Stanford had placed the micro lens array right next to the image sensor. What this gives us is essentially a form of optical heterodyning.
11:25
Without getting into too much detail about the math behind this, which is very elegant, it's a nice frequency-domain analysis of how this works. But it essentially takes angular information that was otherwise lost by a traditional camera
11:40
and puts it into the higher frequency regions of the information that a sensor does capture. And it's sort of two-dimensional heterodyning, where the mask acts as the carrier frequency, just like in RF communication, and the heterodyned signal is actually sensed on the sensor.
12:05
And then in computation we do a de-heterodyning or demodulation to recover the light field. So this is a photo captured by our modified camera where the sensor had this mask placed right next to it.
12:20
And if you zoom in, you can see some structure to the image as it's captured. And that's basically the shadow of the mask as it's falling on the sensor. And I must mention the mask that we used over here was a sum of cosine patterns of various frequencies, of various harmonics.
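A minimal sketch of how such a sum-of-cosines transmittance pattern could be generated; the number of harmonics, pitch and normalization here are illustrative assumptions:

```python
import numpy as np

def cosine_mask(size=512, harmonics=(1, 2, 3, 4), pitch=16):
    """Build a 2D transmittance pattern as a sum of cosines at several
    harmonics, normalized to [0, 1] so it can be printed as a physical
    attenuating mask (transmittance can never exceed 1)."""
    x = np.arange(size)
    xx, yy = np.meshgrid(x, x)
    pattern = np.zeros((size, size))
    for k in harmonics:
        pattern += np.cos(2 * np.pi * k * xx / pitch)
        pattern += np.cos(2 * np.pi * k * yy / pitch)
    pattern -= pattern.min()
    return pattern / pattern.max()
```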
12:41
So, I think I'll skip that. If you look at the Fourier transform of the image that's captured, it has this very special structure, with information at different spatial and angular frequencies.
13:01
And by simply running a demodulation algorithm on this, we are able to recover the complete 4D light field. And what you see over here is just different views of that light field. So by taking just one single photo, we are able to recover what the scene would have looked like if the camera had moved slightly and not just stayed at that one place.
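A minimal sketch of the demodulation idea, assuming the 2D spectrum of the captured photo contains a regular grid of shifted copies of the light field's spectrum (tile size, ordering and scaling are illustrative assumptions, not the exact published algorithm):

```python
import numpy as np

def demodulate_light_field(photo, n_ang=5):
    """Heterodyne light-field recovery, in outline: the FFT of the captured
    photo holds an n_ang x n_ang grid of shifted spectral copies; crop each
    copy, stack the copies into a 4D spectrum, and inverse-transform to get
    the 4D light field."""
    F = np.fft.fftshift(np.fft.fft2(photo))
    H, W = F.shape
    th, tw = H // n_ang, W // n_ang        # size of one spectral tile
    spectrum_4d = np.zeros((n_ang, n_ang, th, tw), dtype=complex)
    for i in range(n_ang):
        for j in range(n_ang):
            spectrum_4d[i, j] = F[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
    light_field = np.fft.ifftn(np.fft.ifftshift(spectrum_4d))
    return np.real(light_field)
```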
13:22
And we can recover all this from just one photo. Here is another result where, on the left, is the image that we captured and, on the top right, we are showing a refocusing result where we are focusing on different planes in the image. And again this is from just one image. And on the bottom right we are moving the viewpoint slightly within the lens aperture itself.
13:44
So you can have slight modifications in the viewpoint of the camera and you can recover what the image would have looked like if the camera had been there. And this is all physically correct. It's not just a Photoshop trick that we are doing. It's what light would have fallen on the camera if it was placed at that location
14:01
or if it was focused at that plane. So that was more from the capture side, how we can capture the 4D light field by simply making a very, very simple modification to a camera and then using some computation. The next part of the talk is how we can actually process, how we can show and display this information to a user.
14:24
And the thing that we've been looking at is six dimensional displays. You might be wondering what the 6D really means, and I alluded to this right at the start of the talk. But if you think of a 3D display that changes as you move your viewpoint left or right,
14:42
that would be a 3D display. If it also changes when you move your viewpoint up and down, that becomes a 4D display. But the other two dimensions actually come from the direction of the illumination on the display. So does the display change when your illumination on the display itself changes? If you move a flashlight around the display, do you see a different thing?
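A minimal way to think about this in code: a 6D display behaves like a lookup table indexed by two viewing coordinates, two lighting coordinates and two pixel coordinates. The array sizes below are illustrative (the prototype described later is 7x7 pixels):

```python
import numpy as np

n_view, n_light, n_pix = 3, 3, 7   # viewing directions, lighting directions, pixels
# Emitted colour as a function of (view_x, view_y, light_x, light_y, pixel_x, pixel_y)
display = np.zeros((n_view, n_view, n_light, n_light, n_pix, n_pix, 3))

def observed_image(display, view_x, view_y, light_x, light_y):
    """The 2D colour image a viewer at (view_x, view_y) sees when the
    light source is at (light_x, light_y): one slice of the 6D table.
    A 4D (illumination-only) display simply drops the two view indices."""
    return display[view_x, view_y, light_x, light_y]
```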
15:01
And the first thing, before I get to the 6D, I want to talk about a 4D display, where it's independent of the viewer position, but depends only on the illumination position. So this video over here shows this display that we constructed and I have some of these over here that you're welcome to come and take a look later.
15:21
And as the flashlight is moved behind the display, you can see the shadows and caustics that are formed by this display actually change. And this is a flat display just like this. There's nothing else in it. It's actually very similar to traditional integral photography. It's just integral photography actually inside out, just turned around.
15:44
Here is another example of the same display where it was placed on a window pane and as the illumination changes during the day, you can see a different view based on where the light is coming in from at that point in the day.
16:06
So as I said, the way this works is very similar to integral photography and we basically have a series of lenticular sheets or microlens arrays and then we place this mask within the lens over here,
16:23
and the mask again encodes this 4D information on a 2D sheet. It's a very similar mask to this one, except this time it's in color, so that it can encode different colors going out in different directions. One of the limitations over here is you can never have any kind of reflectance greater than one. It always has to be less than one.
16:44
So the next step was going to the 6D display that I mentioned earlier, where now we want the display to react not only to changes in the illumination direction, but also to changes in the viewpoint. Unfortunately, this is harder than it appears at first, and harder than the case that we discussed before,
17:01
and we ended up making just a small 7x7 display, so it's a 7x7-pixel display, and each pixel over here that you see is actually about that thick, and I have, I think, an image of that. So that's what each individual pixel looks like. It has a series of microlens arrays and I think two lenses
17:25
and then in between two microlenses over here we have this mask placed that actually encodes all the information. So the optics is again similar to integral imaging. It's just many layers of the same thing.
17:44
And so I'm going to show you a simple, very simple demo of what this looks like when viewed from different directions and what we have, what we encoded in this display was as the illumination direction changes behind the display, you get these different patterns.
18:01
So this is the x and y of the incoming light direction. You get to see this display pattern when light is coming from the corresponding direction and based on the viewing direction, we change the color of the pattern that you see. So as you move to the right, you see a magenta color pattern. As you move to the left, you see a green color pattern.
18:22
And I'm hoping that you can see some of this. The display is there in the middle. As we're moving from the left to the right, the color of the display changes and I think it's going to move closer. And what it's showing here is the camera position and what color you should see and here it's showing the light position behind the display and what pattern you should be seeing.
18:48
So right now each pixel costs about $20 or $30. So you can imagine building a large display is not quite feasible yet, but maybe in another 10 years, who knows. So now it's changing the light illumination direction
19:03
and you can see the pattern is changing corresponding to what the light position is behind the display.
19:31
I think that's what I have for the talk and if there are any questions, I would take them now. A lot of this is not my work. I'm presenting others' research here also, so I would really like to thank them for the slides
19:41
and for doing the actual work. And just a plug for our group: it's a relatively new group and what we are trying to do is build the cameras of the future, or try to dream up designs of cameras that better capture the visual world around us, and not just be limited by a traditional camera that is a lens and a sensor behind it.
20:01
And I'll take any questions if you have. Thank you very much indeed. Okay, well we do have a few minutes for questions. Again, over to you and please use the microphone. Is there somebody here? Yeah, thank you. Yes, Linda Law. With your 4D display with the demonstration that you showed with light coming through the image,
20:25
how large have you been able to make those images and how large do you think you will be able to? This is what we have right now and this is as large as we've made them so far.
20:40
And it really depends on how big a microlens you can get and how much you want to build on that. And it also depends on the printing resolution that you're going to use, because that determines how many quantization levels you're going to get in terms of angle. But it's really very similar to just traditional integral imaging displays
21:05
that change when you move around them. Except here, they stay fixed when you move around them. They change when the illumination changes behind them. So it's not very different from that. Please, in the middle.
21:22
Tom Svetkovich. Do you get accommodation focus in the Z direction? Not with these displays, no. With no integral techniques? Well, you don't necessarily get them. So there's research and I'm aware of people trying in,
21:44
I forget, somewhere in Japan where they try to do that kind of thing. The problem you end up having is you need to have very, very high resolution in the angular dimension. And it's not really easy to do that if you are far away, because the pupil of the eye is so small. It's hard to get that kind of accommodation.
22:01
We have tried some similar experiments in our lab and we can get it to work when it's about that far. We can get some sense of accommodation. But if you're far away, just using microlens arrays or lenticular sheets, it's really hard to get much accommodation in there.
22:22
Any other questions, please? Yes. Hi, I'm Jim. And have you considered using holographic elements to replace these sort of stacks in the 6D display? Yes, we are definitely looking at that.
22:41
And that was a lot of what the previous talk was about, using holographic elements as sort of just optical elements. And that's something we are actively looking at. Thank you very much indeed. Thanks.