
How Can We Use Machine Learning in the Search for Exoplanets?


Formal Metadata

Title
How Can We Use Machine Learning in the Search for Exoplanets?
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Exoplanets are planets beyond our own solar system. Since they do not emit much light and, moreover, are very close to their parent stars, they are difficult to detect directly. When searching for exoplanets, astronomers use telescopes to monitor the brightness of the parent star under investigation: changes in brightness can point to a passing planet that obstructs part of the star’s surface. The recorded signal, however, contains not only the physical signal of the star but also systematic errors caused by the instrument.

As BERNHARD SCHÖLKOPF explains in this video, this noise can be removed by comparing the signal of the star of interest to those of a large number of other stars. Commonalities in their signals might be due to confounding effects of the instrument. Using machine learning, these observations can be used to train a system to predict the errors and correct the light curves.

Who is Bernhard Schölkopf? Bernhard Schölkopf is Director of the Max Planck Institute for Intelligent Systems in Tübingen and head of the Department for Empirical Inference. More info: https://en.wikipedia.org/wiki/Bernhard_Sch%C3%B6lkopf

This LT Publication is divided into the following chapters:
0:00 Question
1:59 Method
4:11 Findings
9:14 Relevance
11:13 Outlook
Transcript: English (auto-generated)
The motivation for this research project was to find exoplanets. I was working together with astronomers. I spent some time at New York University, hosted by an astronomy department. I've always been interested in astronomy, even though my field is machine learning. And one of the most fascinating problems of astronomy is to find exoplanets, which means to find planets that orbit other stars
out there in space in our Milky Way typically. How do we do this? The most prominent method that's being used to find exoplanets is based on light curves. So it's based on monitoring the brightness of a star as accurately as possible. And to do this as accurately as possible,
astronomers have resorted to the help of space telescopes. And the telescope that has led to the largest number of discoveries is the so-called Kepler telescope that was launched by NASA some years ago. It's named after the German astronomer Kepler. And this Kepler telescope stared at one patch of sky, recording very accurate light curves, which are not affected by the effects of the atmosphere,
et cetera, because it's out in space. And this data is shared among all astronomers, and we can now try to analyze it and find exoplanets. And in order to be better than other people at doing this, we have to be better at removing what's called confounders. So confounders are processes or effects
that distort the signal that we're interested in. So we really want to get the signal that's from out in space, but we measure it through a telescope that might have little pointing errors that introduce some noise; maybe the sensor of the telescope introduces noise, et cetera. So we are interested in a signal that's confounded by these,
sometimes also called systematic errors, and we want to remove these systematic errors. And that's an example of a larger class of problems that's applicable to astronomy, but also of independent methodological interest for us as machine learners and causal modelers. Our approach is that we try to think hard
about what kind of information is available in the data. We are interested in one star. This one star is confounded by the effect of noise taking place in the instrument. We don't have direct access to this kind of noise. So therefore we have no direct access to the physical signal out there in space
that we want to reconstruct. However, in addition to the one star that we're interested in at any given point in time, we have 150,000 additional stars. Now, if it's the case that these additional stars are also affected by the same noise sources, then in principle, it might be possible to extract that information from the other stars to correct the light curve of the star
that we're interested in right now. So whatever the stars share might be due to the confounding effects of the instrument. Whatever they don't share is actually the true physical signal out there in the world, because in space these stars are separated by light years and they don't directly interact.
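To make this reasoning a little more concrete (a schematic formulation for illustration only; the notation is mine and not taken from the talk or the underlying paper), one can write each measured light curve as an additive model,

    y_i(t) = s_i(t) + f_i(n(t)),

where s_i(t) is the true astrophysical signal of star i, n(t) is the state of the instrument (the shared confounder), and f_i describes how the systematics enter the measurement of star i. Since the s_i(t) of different stars are independent, any dependence among the measured y_i(t) has to be induced by n(t), and that is the part one tries to estimate and subtract.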
From a practical point of view, if you work in machine learning or pattern recognition and you try to tackle such a problem, what you do is you get together with domain experts at the blackboard and you try to brainstorm what kind of information is there and how could we use it. So in my case, the whole thing started with a short sabbatical that I spent
at New York University in 2013. And this time was so productive that not long after that visit, we got a return visit from our colleagues in New York. And then we sat together in Tübingen in late summer 2013, talking about this problem again; we hadn't solved it yet.
We were sitting at the blackboard thinking about what kind of information is there, drawing diagrams of what affects what, and starting to think about how we could phrase this as a machine learning problem. So how can we phrase this as a problem where we have data, inputs and outputs, where we have observations on which we can train
a system that can learn to predict outputs from inputs and how we can use such a system to correct these light curves. The result of our research is a new method to detect the presence and correct the effect of a confounder. So to correct the effect of something
that is caused by the measurement process, but that's not really what we're interested in. We're interested in a true astrophysical signal out there in space. We're not interested in the noise that our instrument adds when measuring that signal out in space. Now we're using the other stars to estimate the effect of this confounder
because all the stars are actually affected by that confounder. And at the same time, the stars out in space are independent from each other. So we have a set of independent things. They are light years apart. They don't interact directly out there in space. And then we measure something, light curves, which are dependent because they share the same confounder.
And then we can use these other light curves to correct a light curve of interest. And we do this by something that's called regression modeling. So we try to predict the star of interest from the other stars. And then that prediction, we subtract from the star of interest. And it turns out subtraction is exactly the right thing to do
if a certain additivity assumption about how the confounder acts holds true. So we can actually prove theorems about this. We can prove that if the additivity holds true and if sufficient information is present in the other stars about the effect of the noise, then subtracting that regression is the right thing.
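As a rough illustration of this regress-and-subtract step (a minimal sketch only: the choice of a ridge regressor, the regularization value, and the variable names are mine and not taken from the talk or the paper):

```python
import numpy as np
from sklearn.linear_model import Ridge

def correct_light_curve(target, others, alpha=1e3):
    """Predict the target star from the other stars and subtract the prediction.

    target: array of shape (T,)   -- light curve of the star of interest
    others: array of shape (T, K) -- light curves of K other stars, same time stamps
    Under the additive-confounder assumption, the residual recovers the target's
    own astrophysical signal up to an offset.
    """
    target = np.asarray(target, dtype=float)
    others = np.asarray(others, dtype=float)
    model = Ridge(alpha=alpha)            # regularized linear regression
    model.fit(others, target)
    systematics = model.predict(others)   # the part explainable by the other stars
    return target - systematics           # corrected light curve
```

In real data one would of course be careful about which stars to use as predictors and how strongly to regularize; the point here is only the predict-then-subtract structure.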
And it corrects the signal and reconstructs the true signal up to an offset. And there's a nice analogy that we can draw that's maybe easier to understand and works quite in parallel. Suppose you have a set of children that all share the same mother, but one of them looks very different. In England, they call this the milkman's child; I think in America, it's called the mailman's child. And we all know what this refers to. And then the problem is the following. Suppose you have such a set of children. Now, here we have one that looks a little different. And what you can try to do is you can try to explain how that one child looks
in terms of how the other ones look. So they have certain similarities. And these similarities are caused by the fact that they share a mother. Now, in our astronomical application, the mother is the instrument that records the signals. So the instrument makes sure all the measured star signals share some information. We want to remove that effect
and reconstruct the star out in space. Now, in the milkman case, we would take the children and look at one, maybe the one that looks different. We try to explain the appearance of that child in terms of the siblings. If we then explain away this appearance by subtracting what we can explain in terms of the siblings (it's going to be more complicated in this case because the effect is not additive), then, hopefully, if our mathematical assumptions held true, which of course they don't in this case, we would recover how the milkman looks. Now, in practice, things are a little bit more difficult. So in practice, we not only use the other stars
to predict the star of interest in order to correct the light curve corresponding to that star, but in practice, we take into account what we're interested in in that light curve. So what we're interested in are actually transit events. A transit event means the geometrical alignment of star and planet is such that as we look,
the planet passes in front of the star and occludes part of the surface of the star, which leads to a small dip in the light curve, a small decrease in brightness. So that's what we're interested in. And this is a so-called transit event. It just takes a few hours. So for instance, if you were to look at Earth
and the Sun from space, and if you were lucky enough to pick up a transit, it would take half a day. So we're looking for such signals that take a few hours or a day in the light curves. So we want to retain this kind of information in our light curves, but remove everything else. We're just interested in that. And it turns out that if we use not just the present values of the other stars, but also the past and future values of the star that we're actually analyzing, we can do an even better job. And we have to make sure that this past and future is sufficiently separate from our point of interest, so we don't actually explain away the transit itself.
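A sketch of how such a feature vector could be assembled (illustrative only; the window sizes and the helper name are placeholders I chose, not values from the paper):

```python
import numpy as np

def build_features(target, others, t, half_width=48, exclusion=12):
    """Features for predicting target[t].

    Uses the other stars at time t plus past and future samples of the target
    star itself, leaving out an exclusion zone of +/- `exclusion` samples
    around t so that a transit at t cannot be used to explain itself away.
    """
    lags = [d for d in range(-half_width, half_width + 1)
            if abs(d) > exclusion and 0 <= t + d < len(target)]
    own_context = np.array([target[t + d] for d in lags])
    return np.concatenate([others[t], own_context])
```

The exclusion zone would be chosen wider than a transit, so that the prediction can soak up slow instrumental and stellar variability without touching the dip itself.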
But if it's sufficiently separate, we can do a better job, not only at removing the confounding effect of the instrument, but also we can remove the variability that's intrinsic in the star itself. Because it turns out stars are not as constant as we used to think. Almost every star shows some brightness fluctuations anyway.
And if we look for exoplanets, we are not interested in the brightness fluctuations of the star, but we're interested in the fluctuations that we get if an exoplanet occludes part of the star. Our results are relevant foremost in terms of the methods that they propose. So we develop methods and we have a new method
along with some performance guarantees about the conditions under which it works. And this method is applicable in various domains, in bioinformatics, in medicine. But of course we developed it for astronomy, and it's applicable in this field. And we have applied it in another paper. This other paper is looking specifically at new data from the so-called Kepler-2 mission.
Now Kepler-2 means that at some point the Kepler satellite broke. The satellite had four so-called reaction wheels that are used to stabilize the position of the satellite in space, in order to make sure it's looking at exactly the same stars all the time. Only two reaction wheels were left, which means that the satellite cannot stabilize itself anymore. But people had the idea that they could use the remaining fuel and the thrusters (like in a rocket, you have thrusters that are driven by fuel) to try to stabilize the satellite as well as possible. Now that's much worse, it's not as good as before, but in a sense that's good for us, because we deal with exactly this problem of how to remove these kinds of errors. The satellite in the Kepler-2 mission then started looking at some other fields that hadn't been observed before, and we immediately started looking at this data. In this data, using our method combined with some clever method for searching through the light curves (our method just corrects light curves; we still have to search for the dips that come from a transit), we came up with a list of exoplanet candidates.
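For intuition, a very naive version of such a dip search over an already corrected light curve might look like this (a toy sketch of my own; a real search, for example a box-least-squares style method folded over trial periods, is considerably more careful):

```python
import numpy as np

def find_dips(flux, box=12, threshold=4.0):
    """Flag windows whose mean flux drops well below the overall level.

    flux: corrected light curve (systematics already removed)
    box: number of consecutive samples a transit is expected to span
    Returns (start_index, depth) for each window whose dip exceeds
    `threshold` times the expected noise of a box average.
    """
    flux = np.asarray(flux, dtype=float)
    baseline = np.median(flux)
    noise = np.std(flux)
    candidates = []
    for start in range(len(flux) - box + 1):
        depth = baseline - flux[start:start + box].mean()
        if depth > threshold * noise / np.sqrt(box):
            candidates.append((start, depth))
    return candidates
```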
We published this list, it was a list of around 30 candidates, and about half of them were then confirmed relatively soon afterwards using other means. So they are considered true exoplanets by the astronomers. Of course, nobody has visited them, but that's the same for all exoplanets we know. So that's an outcome, and we've been quite happy with this. Where do we go from here? Of course, there are many other applications for this specific method, some in astronomy, some in other fields,
but really we are methods developers, and we're interested in the broader picture of the relationship between statistical observations and causal structure. And it turns out that statistical observations are just a surface phenomenon; underlying them, there always has to be a causal structure that brings about these statistical dependencies
in the first place. This was first understood by the physicist and philosopher Hans Reichenbach, who postulated what's called the common cause principle. We have reason to believe that if we understand this underlying structure, we can build machine learning systems that will generalize better from one task
to the next one. So currently in artificial intelligence, we have amazing successes with machine learning systems that are very good at solving one task. So we can train object recognition systems on labeled data. If we have millions of images of animals, each of them labeled as cat, dog, et cetera, we can train a classifier that will recognize cats and dogs extremely well, maybe more accurately than a human by now. But we're not very good at transferring knowledge. So if someone gives us a new type of animal to recognize, we can only do that well if this person also gives us millions of training examples again. So we're not good at transferring the variability
that we've learned from the animals we've seen to a new class of animals. And we think the reason for this is that the current systems only focus on statistical information and don't try to understand the causal structure underlying these observations. We want to model the causal structure while retaining the attractive property that we do this in a data-driven way: we don't want to sit down and write down differential equations for everything; this is a data-driven approach to handling complexity. So if we can retain this attractive property, but go one level deeper, trying to automatically learn the causal structure
that gives rise to the statistical structure, we have reason to believe that in this case, we will generalize better to new settings. So this is the big challenge for us at the interface of causal modeling and machine learning. How do we automatically learn causal models from data? And how do we exploit such models to generalize across different tasks,
which is something that humans and animals are good at and that artificial intelligence systems so far don't know how to do.