Formation of hypotheses in the field of molecular biology
Formal Metadata
Title |
Subtitle |
Series title | ||
Number of parts | 4 | |
Author | ||
Contributors | ||
License | CC Attribution 3.0 Unported: You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner they specify. | |
Identifiers | 10.5446/51097 (DOI) | |
Publisher | ||
Year of publication | ||
Language |
Content Metadata
Subject area | ||
Genre | ||
Abstract |
00:00
Lecture/Conference
06:18
Lecture/Conference
09:23
Lecture/Conference
12:54
Lecture/Conference
14:50
Lecture/Conference
20:42
Computer animation
26:08
Lecture/Conference
31:34
Lecture/Conference
40:19
Lecture/Conference
49:04
Diagram
58:49
Lecture/Conference
01:08:33
Computer animation, Lecture/Conference
01:13:56
Lecture/Conference
01:19:18
Lecture/Conference
01:25:39
Lecture/Conference
Transcript: English (automatically generated)
00:23
So welcome everybody to this third seminar of the series that we are organizing with Misha Gromov and François Képès, which is called Fundamental Questions and Amazing Logic of Molecular Biology.
00:42
And this series of seminars is about very interesting topics in molecular biology. And all these seminars are linked to a book that we are currently writing with Michel and François here, and it's a book which is about brain stages in molecular biology.
01:06
We are writing, we are planning to write. Planning to write. And today we are pleased to welcome François Képès. So François is a Research Director at the ISSB, the Institute of Systems and Synthetic
01:24
Biology, which he founded inside Genopole. And he is a cell biologist studying and engineering genome architecture using synthetic, systems and molecular approaches. And more recently, he founded his own company, Synovance, if you want to say a word about it.
01:48
Well, thank you. Thank you very much for the introduction. No, I don't want to say anything. I will not read the patent for you. That is pending. Okay, so as you understand from this introduction, the seminar aims at trying to show how ideas
02:11
came about in molecular biology. And I've really tried my best going in this direction. It's not easy, but at least in one case where I've been working since 2001, I can
02:28
at least testify for myself and try to be honest about how the ideas came about. And so that's what I'll try to recount here. And because you are scientists, to avoid frustration in the end, I'll tell you the
02:44
situation now with this idea and project. Even though the main purpose is to indicate how it came about, I'll just make some sort of a final retrospective of where we stand with respect to this project now.
03:04
Okay, so in the first part, I'll really try to explain how and what were the influences of people who brought something to the table that I used and then how I used it and then
03:27
how I tried to handle the issue. Okay, so in this first part, I'd just like to express one element that was pretty obvious
03:43
long ago, but has been increasingly documented. People look at DNA often as an inert piece of information that can be read by interpreting
04:03
the sequence of A's, C's, G's, T's along the DNA strands. This is really, I'd say, the minimal information we can draw from DNA.
04:22
DNA encodes, and this we are well aware of, encodes the sequence of the macromolecules, RNA, protein, and so on. But it also encodes the level of their expression. And it encodes more surprising things, such as how fast should the protein, not the RNA,
04:49
the protein, how fast should the protein be made at the beginning, a little later, later, later, along the sequence of the protein. This is encoded in the DNA as well.
05:02
So here you see that we are already far away from just the sequence of nucleotides or amino acids in the RNA and the protein, the first item here. Yet this is still forgetting a little piece of information, which is that DNA is actually
05:27
chemical and physical. It has a physical chemistry. Why do you want biology or live organisms to not use this feature?
05:41
Well, actually, it's not our choice. I mean, we are the outcome of this choice. Evolution made it. Indeed, DNA is also carrying information at the level of its physical chemical properties, including the folding.
06:02
And it's more on that side that I'll dwell today, which is underlined here. Folding versus expression. So now to the influences. The first influence for the story I'll recount to you is about the local organization of gene transcription.
06:29
Benno Müller-Hill, now retired, but at the time professor at the University of Köln in Germany, and also the writer of the first books and the first inquiries about medicine in the Nazi period.
06:47
He inquired himself and wrote books about that. I'm not sure you know that. Benno Müller-Hill was the first to show this little twist to the story of Jacob and Monod, published in 1961,
07:04
which is: here is the lac operator, that's the DNA here, where the regulator protein, the Lac repressor in that case, will bind
07:20
and influence the level of expression that is the rate of transcription initiation of the downstream gene or operon, the lactose transcription unit. So that's this 1961 story. Müller-Hill brought an important twist to it by showing that even though in the traditional scheme
07:47
there was one binding site for the regulator, there were actually more than, at least two, more than one. And that changes everything, because it allows for what biologists call cooperativity
08:01
or what physicists would call sigmoidal behavior. When you look at the effect versus the dose, you have an S curve instead of the simple one binding site configuration. We can reason with two, even though there may be three, four,
08:22
but the result is the same. We do have cooperativity. And this he showed very, very well. So this cartoon is showing some interpretation of it. So first of all, there is more than one binding site on the DNA for this protein,
08:41
which is repressing the expression of the genes. Second, the protein has two hands to grab DNA. It's divalent. Again, this is essential. Otherwise, how do you have any cooperativity?
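The contrast between a single binding site and a cooperative pair can be sketched numerically. This is a minimal illustration, assuming a Hill function as the dose-response model; the function name `occupancy`, the half-saturation dose of 1.0, and the Hill coefficient of 2 are illustrative assumptions, not values from the talk.

```python
def occupancy(dose, k_half=1.0, hill=1.0):
    """Fraction of operator occupied at a given regulator dose (Hill function).

    hill=1 gives the simple hyperbolic curve of one independent binding site;
    hill>1 gives the S-shaped (sigmoidal) curve associated with cooperativity.
    """
    return dose**hill / (k_half**hill + dose**hill)


doses = [0.1, 0.5, 1.0, 2.0, 10.0]  # arbitrary units, assumed for illustration
for d in doses:
    one_site = occupancy(d, hill=1.0)     # hyperbolic response
    cooperative = occupancy(d, hill=2.0)  # sigmoidal response (assumed coefficient)
    print(f"dose={d:5.1f}  one site={one_site:.3f}  cooperative={cooperative:.3f}")
```

Below the half-saturation dose the cooperative curve stays lower, and above it the curve rises faster, which is what makes the response switch-like.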
09:02
Whatever it is, tetramer, dimer, octamer, what is important is it is divalent. And so on this cartoon, what you see is the protein bound to two operators, that is two binding sites, inducing a loop in between.
09:22
And he showed that if the physiological repression fold is 1000, that is when the protein is bound, the level of expression is 1000 times lower than when it is not bound. If you touch any of the features that make it divalent, you lose this.
09:46
You go from 1000 fold repression to 15 fold repression, 1-5. So you lose a factor of 70 in the control of expression. Why should it be squared? It shouldn't be that big difference.
10:01
Two factors become squares. It's not just, right? Each of them, if you think, if you think about 200, how do you get? Why 200? Well, you multiply the factor by another factor, so you multiply that. Why you have more? No, but it's the same, it's one protein.
10:22
It's really, it's one complex protein only. It's not more, it's not several proteins assembled. No, there are two factors in the same behavior, right? And usually factors, the factor kind of multiply no more than that. Oh, so you mean it should be 15 times 15?
10:41
Yes. Yeah, in terms of probability, but with the 15 fold is really without any effect. There is no loop anymore, right? And perhaps also, should we take into account, you're right on the numerics, that actually there is more than two binding sites in this specific case.
11:01
There are four, but the two others are weak, weaker. That is possible. So anything he could do, like for instance cutting this in the middle,
11:23
making it monovalent, or making the loop here, intervening loop very, very long, but even something more subtle worked, really subtle. So the physiological size of this loop is 93 base pairs, that's here, okay?
11:42
And you see the fold repression, don't look at the scale here, it's arbitrary units, but was 1,000 fold up there. But if you remove or add two nucleotides, two nucleotides, then you are down to 15, that's here. Just look at the points, take two points backwards or two points forward,
12:05
then you are down, okay? So it's extremely sensitive to removing or adding two to eight nucleotides to the... The difference position of the turnover. Yeah, it's a matter of positioning, respective positioning of the binding sites,
12:25
because the double helix is 10, more or less 10 nucleotide pitch. So if you remove 10, it's okay? Yeah, that's what you see here, yeah. If you remove 10, you are back up. 20, 30, you are back up.
12:41
I mean, it's a little bit more than 30. And if you go forward, same thing. So actually it can be used as coil, given this 10, 10 is bigger, 10 can add a little, 10 can really, again, by the way. I mean, this is more binary with several things, we call it this way? You mean is it the only system that is...
13:01
The system is two. Can we have these sequences using this kind of... Yeah, you can, yeah. And I said that there was a third binding site in reality here. It's not shown, but it's at 401, 401 base pair, and same thing, same thing.
13:21
So with... So here, this one I drew with the main binding sites is 93 base pair, but there is yet another operator here at 401 base pairs. And again, that's like 39 turns or something.
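The sensitivity to adding or removing a few nucleotides follows from the helical pitch: changing the spacing rotates one binding site around the helix relative to the other. A minimal sketch, assuming a pitch of exactly 10 base pairs (the talk says "more or less 10"); the function name `phase_shift` is hypothetical.

```python
PITCH_BP = 10.0  # assumed helical pitch in base pairs ("more or less 10" in the talk)


def phase_shift(delta_bp, pitch_bp=PITCH_BP):
    """Rotation of one site relative to the other after changing the spacing.

    Returned in turns: 0.0 means the sites face the same way as before,
    0.5 means they end up on opposite faces of the helix.
    """
    frac = (delta_bp / pitch_bp) % 1.0
    return min(frac, 1.0 - frac)


# Spacing changes relative to the physiological loop, in base pairs.
for delta in (-10, -5, -2, 0, 2, 5, 10, 20, 30):
    print(f"{delta:+4d} bp -> {phase_shift(delta):.2f} turn out of phase")
```

Under this assumption, changes of 10, 20, or 30 base pairs bring the sites back into register, while changes of a few base pairs do not, which matches the pattern described for the repression curve.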
13:45
So that gave me a lot of inspiration. You'll see why in a little while. Besides, some of these... It's difficult to read, Vilar and Leibler, 2003, actually.
14:02
Misha, if you remember, we had organized with Alessandro and Paul Bourgine a meeting in 2002, in Evry. It was the first meeting really on systems biology in Europe, and we invited Müller-Hill and Leibler, and that's how they met.
14:21
And then the paper came out in 2003, the next year. And another thing is Leibler is no longer working on it, but Vilar is still working on this story. So that's one thing.
14:40
Are there other cases like this? Yes. Is one important one, or is it like some other examples? Yes, there are many other examples. Yes, yeah. But it was the first one. In higher eukaryotes?
15:00
In higher eukaryotes, the situation is different. The distances are much longer, and you have this notion of enhancers, long distance enhancers, and so on. It's slightly different. The physical-chemical principle must be the same, but the details are different. No, this idea is going to have a very long platform
15:22
where a particular protein can form the brain like that. So it's actually a mechanism. But your mention about enhancer, it's another case. Is it a motor enhancer? Or what about a repressor? How is it doing? Here, actually, it's a repressor.
15:42
It works both ways. It's about the strength of the control, not about being an activator or a repressor. It's about the strength of the control, how much you can bind and stay. There are different ways to tell the story afterwards at the interpretation level.
16:03
For instance, Müller-Hill would say the story like this. He would say it's a matter of local concentration effect. If you have this loop of 92, 93 base pairs, you have to imagine that, of course, you have an on rate, but you also have an off rate.
16:21
The protein is there. So that's my two feet now. And that's DNA here. Sorry, I'm not sure you see my feet, but anyway. So you have the two feet on the DNA. Now, there is an off rate. At any moment, one of the binding sites may be unbound.
16:40
I lift my foot. So now, how much should I search to find it again? Well, if I'm still bound with the other one, I will search a small volume. If I'm fully unbound, then I have to search the volume of the cell.
17:02
That's how he explains it. Because we know that it goes on and off. So if you have only one foot on one site, then once you get off, you have to explore the whole volume of the cell, 10 to the minus 12 cubic centimeters. But if you are still bound by one foot,
17:24
then you have to explore 10 to the minus 4 times less volume, smaller volume. Or if it's 401, then it's 10 to the minus 14. But if you increase it to 1,000 base pairs, as he did in some of his experiments,
17:41
then it's 10 to the minus 13 cubic centimeters. It's not a big advantage. But this is a big advantage. In the first approximation, the probability of getting both feet at the same time away, off,
18:01
is the square of the probability of getting one off. And so this will not happen. I mean, in practice, it's like it will not happen. It happens to get one foot off, but it doesn't happen to have both feet off, I mean, reasonably, just because it's a square of the probability of the first one.
18:20
So it's very improbable. We are talking about a specific binding. The protein will stay, on average, a minute, an hour. So you see the probabilities are very low. So getting both feet off is really not probable.
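The two quantitative points of this argument can be written out directly. This is a rough sketch using only the volumes quoted in the talk for the 401 and 1,000 base pair loops; the single-site unbinding probability of 0.01 is a made-up value for illustration, and the function names are hypothetical.

```python
CELL_VOLUME_CM3 = 1e-12  # volume searched when fully unbound (figure from the talk)

# Volume searched while one "foot" stays bound, keyed by loop length in base pairs.
# Only these two figures are quoted explicitly in the talk.
TETHERED_VOLUME_CM3 = {401: 1e-14, 1000: 1e-13}


def search_volume_gain(loop_bp):
    """How many times smaller the search volume is when one site stays bound."""
    return CELL_VOLUME_CM3 / TETHERED_VOLUME_CM3[loop_bp]


def both_off_probability(p_one_off):
    """First approximation from the talk: both sites free at once ~ p squared."""
    return p_one_off ** 2


for loop_bp in sorted(TETHERED_VOLUME_CM3):
    print(f"{loop_bp} bp loop: search volume about {search_volume_gain(loop_bp):.0f}x smaller")

p = 0.01  # hypothetical probability that one given site is unbound
print(f"one site off: {p}, both off: {both_off_probability(p):.6f}")
```

The squared term is why full release is rare: with a small single-site probability, the chance of losing both contacts at once is smaller again by the same factor.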
18:44
So the first piece of reasoning that I made at the time was, let's go from intragenic to intergenic. So here, we were talking about the two binding sites of one gene.
19:01
The lactose gene. How about having longer loops and having binding sites? So this is the binding protein, which is repressing or activating the regulator protein. It's divalent, which is the case for almost all these regulators.
19:25
And now, how about bridging two sites that are distant, that are not in the same gene? And how about from intra to interchromosomic?
19:41
Another chromosome, and these proteins which were bound on one chromosome by one site, now binding another chromosome. However, why not? But besides some physical difficulties, when the distances become important,
20:03
to get these two sites at the same place to bind the protein, there are other difficulties. And notably, I ask myself the question, can the cells do this, knowing that we have hundreds of sites for a given regulator to accommodate,
20:26
and knowing that we have hundreds of different regulators, each claiming for their own lives, and importance, and relevance, and actions, activity.
20:43
It was difficult to imagine for me, but I'll come back to that. Another source of inspiration, which was distinct, was the case of the nucleolus in eukaryotes. So you have this nucleus in the cell, which was prominent.
21:04
Morphologists at the end of the 19th century, using the optical microscope, were able to show that there was a little organelle within the nucleus, that they called a nucleolus, that was visible with the optical microscope, no trick.
21:23
Almost no trick. So it turned out, much more recently, and one person who did an important work on that is Danièle Hernandez-Verdun at Jussieu.
21:41
So I mentioned her paper, but many people have worked on the nucleolus. I mean, it was shown that the nucleolus is actually not bound by a membrane or an envelope. It just exists out of its activity, to summarize.
22:01
It's the place where there is a lot of activity of making ribosomes. That is one of the major elements in the cell. So it's massive in terms of what the cell has to achieve, is to make ribosomes. Another way to see the importance of the ribosome is to notice that it's actually the point of self-catalysis,
22:23
because it's made of proteins and RNA, but it's made of proteins, and it's the thing that makes proteins. So it's the point of self-catalysis. So in a rapidly growing bacterial cell or a yeast cell, it's like a quarter of the dry weight.
22:40
It's the ribosomes. It's only one or maybe two or more? Okay, so I'm not yet at the picture. I'm coming to it. So that's just to have you feel how, let's say, how active is this little place
23:01
where ribosomes are made and assembled within the nucleus. Now, these are just images, fluorescent images, of the nucleolus in activity. I don't remember what was labeled there.
23:22
I forgot in the meantime. But you can tell these are images of the nucleolus within cells, within nuclei, okay? So I think it is significant that this has no membrane and envelope of lipid membrane or anything around it,
23:41
but was considered an organelle. It's just made out of its own activity, the activity that it makes, polymerizing RNA, assembling proteins with the RNA. But it's still chemical. It's probably muscle proteins or? Ribosomes, RNA and proteins. We already have made ribosomes and proteins
24:02
of this much ribosomes themselves. It still uses proteins made by ribosomes in the cytoplasm, which normally doesn't happen in the nucleus. Exactly, exactly. No, you're right. So actually, to tell the story more completely,
24:20
so it's the place where the ribosomal RNAs are made. And the messenger RNAs that encode the ribosomal proteins, of course, are exported out of the nucleus, are translated into proteins, and the proteins are imported back to the nucleolus, and there they are assembled with the RNA that was made and so on.
24:42
So it's a huge factory, really. It's very impressive in terms of activity. And so this activity without envelope, membrane envelope, without envelope, actually, which self-assembles as needed for the work to be done,
25:02
was enough to be seen as an organelle by the morphologists of the 19th century. A little bead in the bead of the nucleus. Okay, so just to keep in mind that activity
25:23
of making RNA and so on can be focused in things that are very active, such as the nucleolus. So it's another piece of information, again, that I thought was useful in the sense that at the time it was not so clear
25:41
that there was an equivalent for the messenger RNA. But then some morphological work came. But at the time it was not clear. So it seemed to be the case only for the very active ribosomal RNA assembly. So then what happened is in August 2001,
26:03
I was in Normandy by the beach, and I was on holiday. And I decided, because the weather was nice, that I would do like everybody else. I would go to the beach. So I took my little towel, and I went to the beach with an umbrella.
26:24
And no more, I think, I can't remember. It was the 5th of August, for sure. But I think no more than 10 minutes later, I thought I had something to check.
26:42
I thought that we had a problem. So I had in mind Müller-Hill and the DNA loops and their importance. I had in mind the nucleolus for the ribosomal RNA that I thought perhaps we could use as a model for smaller things that were factories
27:00
for the messenger RNAs of anything. And I had in mind discussions with Vic Norris from the University of Rouen about hyperstructures as well. But I couldn't see, as I showed my two little chromosomes, I couldn't see how the cell could do it.
27:21
I had a little problem with that. Exactly what? Exactly how can these little bacterial cells deal with 300 different factors, transcription factors, regulators, each dealing with between 1 and 600 targets
27:43
without some minimal order in the story. So I thought it shouldn't work. Of course it was a totally qualitative reasoning, but without modeling quantitatively, which no one would have been able to do
28:00
at the time anyway. And even now I think it's very tough. I thought it could not work. And of course I know that bacteria are around and have been around for over 3 billion years. So I thought there must be a principle of organization that we have not looked at so far
28:23
that should help. And the idea I had after these 10 minutes on the beach was that it has to do with loops, but not loops at the size of what I showed, but larger loops that would allow
28:40
for some phasing of any chromosome and would allow to aggregate things that belong together a bit like the nucleolus into transcription factories, hyperstructures, but in the case of transcription,
29:02
so transcription factories. So I had at the time, so I went back to the flat immediately, and on my computer I had some Excel sheets with the data that were published in 2002 about the yeast transcriptional interaction map.
29:25
I had the data in an Excel sheet. So it's simply a chart of, it's a simple chart that says that, it was for yeast in that case, it says that for this transcription factor one,
29:42
this is the list of targets, targets A, B, C, and so on, and for transcription factor two, and we had assembled that and we published it in Nature Genetics in April 2002. That's all. That's what I had on the Excel sheet.
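The chart described here is a plain mapping from each transcription factor to its list of targets, which becomes useful once it is joined with the chromosomal positions mentioned just after. A minimal sketch of that data structure; every factor name, gene name, and coordinate below is invented for illustration and does not come from the published data set.

```python
# Hypothetical interaction chart: transcription factor -> list of target genes.
targets_by_factor = {
    "TF1": ["geneA", "geneB", "geneC"],
    "TF2": ["geneB", "geneD"],
}

# Hypothetical gene start coordinates on one chromosome, in base pairs.
position_bp = {
    "geneA": 12_000,
    "geneB": 48_000,
    "geneC": 305_000,
    "geneD": 91_500,
}


def target_positions(factor):
    """Sorted chromosomal positions of all targets of one transcription factor."""
    return sorted(position_bp[gene] for gene in targets_by_factor[factor])


for factor in targets_by_factor:
    print(factor, target_positions(factor))
```

As in the talk, the chart carries no weights: a target is either listed for a factor or it is not, so nothing here encodes binding affinity.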
30:01
But I also had the sequence, so I knew where I knew... Affinity was already the same, or in the Italian series, the affinity? Sorry, Michel? Affinity of the status. We're giving the affinity as a target. Affinity, forget it. Yes or no? There is no, yeah, exactly. Okay. There is no weight on that, no.
30:22
So position on the chromosomes, we had the positions from the sequence. Is it how we distributed it? In 1996. Yeah, so we, yeah. So that was on an Excel sheet, and... At least experimental. It's experimental. Yeah, it's not... Yes, yes. And this was not experimental?
30:41
Yeah, no, no, it was based on experiments, yes. So these four, at the time, I think, 54 of the 300 transcription factors, and these lists were not complete, but were already long. And so I used my Excel sheet,
31:00
and by midnight, I had the first result, and I have to say then it took months and years to really work out this properly, and I'll say a few words about that later. But, yeah, so this... You found some pattern there, I think. The point is to find a pattern, the logic. Yeah, yeah, find a pattern.
31:21
Exactly, exactly, exactly. And there was a pattern, but my tools were rudimentary, and the data were still not complete, and lots of problems. But it seemed encouraging already by midnight the same day. So here is sort of an implicit lesson
31:42
for the youngest of you in the room, that it's important to get bored sometimes, actually. What I understand, so you expect to have something like nucleoli in terms of the same organization,
32:02
a pattern that would simplify the work of the cell, a pattern on the DNA, on the chromosome sequence, that would simplify the work of the cell in order to organize itself internally for the purpose of doing all these things with 300 transcription factors,
32:22
each repressing one to 600 targets, all at the same time. It would be a close one to us, right? A localized, like in the case. Yes, but how do you do that when you have these many transcription factors and these many targets and a total of 6,000? Well, the best thing to tell us was the pattern you found. Yeah, yeah, yeah. No, I'll tell you the pattern that I found.
32:42
Yeah, sure. No, it's actually simple in a way, but when you get to the details, it's actually complicated. But it can be captured simply. So then I had this view that if the chromosome was organized in such a way
33:00
that genes belonging together, let's call them co-regulated genes or co-functional genes, could be together in space, in the space of the cell, then I would get the best from the world of Benno Müller-Hill and the best of the world of the nucleolus. And here is a second factor
33:24
with targets as well, and there are more, of course. So that was more or less how it boiled down to this view. And in a very simple-minded way, the pattern I was looking for that I imagined for a while
33:42
was of this type. So imagine a solenoid. The DNA is coiled. Imagine you have periodic positionings of the green genes that belong together, of the blue genes that belong together, and then you get some sort of a solenoid
34:03
where if now you compress it, in your mind, if you imagine that it is compressed, then you do have local concentration effects for the blue, local concentration effects for the green, but you don't have crosstalk between the two systems because they are actually far away on the solenoid
34:22
because they have a different phase, right? And this is important as well. You don't want to have too many green guys in the area of the blue binding sites on the DNA because they will bind. As we know, the on rate is the same whatever the factor and the binding site.
34:42
It's the off rate which is very different. So the on rate. So because the on rate is the same whatever the segment of DNA we are talking about, crosstalk is a big problem, and the cell would probably avoid crosstalk between things that should not talk to each other.
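The solenoid picture can be illustrated with a simple circular statistic: fold gene positions onto one assumed period and ask how tightly they cluster. This is only a sketch of the idea, not the analysis used in the published work; the period of 10,000 base pairs, the gene positions, and the function names are all invented for illustration.

```python
import cmath
import math


def phase_coherence(positions_bp, period_bp):
    """Mean resultant length of positions folded onto one period.

    Close to 1.0 when all genes sit at the same phase of the period,
    small when they are spread around it.
    """
    vectors = [cmath.exp(2j * math.pi * (x % period_bp) / period_bp) for x in positions_bp]
    return abs(sum(vectors) / len(vectors))


def mean_phase(positions_bp, period_bp):
    """Average phase of the positions, as a fraction of the period (0 to 1)."""
    total = sum(cmath.exp(2j * math.pi * (x % period_bp) / period_bp) for x in positions_bp)
    return (cmath.phase(total) / (2 * math.pi)) % 1.0


PERIOD = 10_000  # hypothetical solenoid period in base pairs

green = [k * PERIOD + 1_000 for k in range(8)]  # periodic, at phase 0.1
blue = [k * PERIOD + 6_000 for k in range(8)]   # same period, at phase 0.6
scattered = [3_100, 14_700, 22_050, 38_900, 41_300, 57_750, 63_200, 79_600]

for name, genes in (("green", green), ("blue", blue), ("scattered", scattered)):
    print(f"{name:9s} coherence={phase_coherence(genes, PERIOD):.2f} "
          f"phase={mean_phase(genes, PERIOD):.2f}")
```

In this toy layout the green and blue sets are each perfectly coherent but sit half a period apart, which is the sense in which two systems can share a period while staying far from each other on the compressed solenoid.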
35:02
So one way would be to use the phase. And that predicted that if I look at... Phase meaning what is phase of what? Oh, no, phase is... No, don't... Nothing... Yeah, it's just a different place
35:21
on the solenoid, in the phase view. That's all. So... Do you know all of that in terms of how you organize DNA? It's because of chromatin organization like for all... Yes. Yes, yes, yes, yes. Yes, but I... But it's too different. I'm not addressing this issue because I don't know really how to...
35:43
how this could fit with the chromatin organization. There is no contradiction, but I don't know how to... I can't make sure. ...draw that... This what you said is... Yeah. I think it's very fair, right? Yeah, yeah, yeah. So, yes, sure.
36:00
So actually, the current situation, a few... So some words... So this is a very simple principle. If there are... There are some pre-ambulated factors that they cite, which are germ as partial and might be seen close together, right? At the state. And this was first verified. This was verified, experimentally being checked.
36:20
Oh, sure. Sure. So... Yeah, sure. So the situation now, today. So it all started in 2001. The first papers were published in 2003, about that. And currently, there are about 26 papers on this topic, mostly for... No, all from the team,
36:40
plus there are maybe two papers from outside the team about this topic. And the situation. So what do we know in brief? So this was just to present the idea. It's very naive, very simplistic. The patterns are actually quite complicated, but the notion of periodicity
37:02
and proximity holds, even though we have sometimes several periods that overlap. We sometimes have regions that do not fit. We do have transcription factors that agree with the same period. We do have transcription factors that have different periods
37:21
that are not reducible to the first one, and so on. So it's actually a bit more complicated than what this view shows. So what happened is that we had more and more data with new transcriptomics methods at that time, 2000, 2001, 2002,
37:42
with so-called chromatin immunoprecipitation, and so on. So in some sense, after that, I should also say, besides the papers, that a couple of European grants were granted for this work and so on,
38:00
until now. Okay, so that's the current view now. The current view is, do not forget this. Do not forget the physical, chemical aspect of DNA. The cells have not forgotten it. They take advantage of it. So it is a good idea
38:23
to consider these three elements at once, rather than as people do usually, to consider them by pair, but the three of them at once. So what are they? Genome layout simply means
38:41
the respective positions of co-functional genes, genes that belong together, maybe co-regulated, maybe encoding proteins of the same complex or other, co-functional genes. A respective positioning of co-functional genes. This is the expression of those genes.
39:02
And finally, this is the conformation of the chromosome, the physical folding of the chromosome. And the idea is that it becomes important. So what we have to demonstrate, and was in part demonstrated in the meantime, is that the genome layout changes the folding.
39:22
We showed that with a biophysical approach. I'll just show two slides about that later. Then, that the chromosome conformation allows factories to be made for messenger RNAs, in addition to the nucleolus.
39:42
You'll see the data. That this changes the expression is more or less the work of, first is the mass action law, of course, for all the chemists. But in the case of transcription, it was shown by Müller-Hill and others later. And finally, there is another issue.
40:00
How do we see patterns in current genomes, patterns that extend over the whole chromosome? And this cannot be a random effect, right? There must be some selective pressure. And so, there must be some selective pressure to explain why these improvements
40:21
to the control of gene expression lead to such a strong pattern of the genome. On that side, little was done. We did publish in 2008 a paper where we showed that it's a model. It's just an evolutionary model,
40:40
which seems to say, yes, indeed, that effects on that will regularize the patterns of the genome layout. But I mean, it's just a little evolutionary model. Yes? Did you try to do the correlation between the expression level of the transcription pattern
41:01
and this kind of localization? If the expression level is high, there is enough to manage the difference. If it's low, then they should be more close, one by one. Did you try to do sustainable relations?
41:22
Okay, so you have to remember that we do have this kind of relations, but we do not have weights generally. We don't know how strong the binding is and so on. But one element of information that goes in your direction, and which I didn't say, I didn't want to say, because it's complicating the story a bit,
41:41
is that in bacteria, not in yeast, not in other eukaryotes with the nucleus, but in bacteria with no nucleus, there is a mechanistic coupling between the transcription, the translation, and even the insertion of membrane proteins into the membrane. This has been known for a long while.
42:03
So it means that before the messenger RNA is finished, it is already being translated into the beginning of a protein. And before the protein is finished, it's already inserted in the membrane, if it is a membrane protein.
42:21
And so to get to your answer in any direct way, because we don't have a... Before messenger... Yeah. Message is not finished, that the protein is already started to... Yeah, so because of this mechanistic coupling, I asked the question, will the gene, now not the protein,
42:44
but the gene that encodes this transcription factor be positioned in a neutral way or be positioned like its targets, in register with its own targets? And the answer is, in yeast, in eukaryotes,
43:04
it's positioned randomly. In bacteria, the gene encoding the transcription factor is in register with its targets, very clearly. Sometimes it may be because this gene is actually under the influence of itself,
43:21
of the protein, of its product. Some other times, it is not the case, and it's still in register. So think of it in terms of... In kinetic terms, and again, sorry for being qualitative, but this is sometimes how you actually understand things at the beginning. Sorry for being qualitative.
43:42
So this gene encoding TF1 is there. So imagine there is a stress to the cell, for instance, oxidative stress. You have a little thing, you use H2O2 to disinfect it, right? Immediately, those bacteria will be under heavy stress.
44:02
Now, it's a matter of seconds for them to react before they are killed by H2O2. Okay, so imagine that this is the TF, the regulator for the response to oxidative stress.
44:22
The gene is immediately induced. Whatever the reason, I can come back to that, but it has nothing to do with my story. It is induced, okay? So it takes a few seconds to make the messenger RNA, then a few seconds to make the protein, and then here is the protein, and the protein, remember my mechanistic coupling?
44:41
The protein is made close to the gene in bacteria, and this has been shown long ago, 1970, and again in 2010. Protein... The gene encoding is... Close, yeah, close to it because of the mechanistic coupling. The protein is made, the RNA is still attached to the DNA, right? So this has been well shown. Okay, so now the protein is close to the gene.
45:04
Okay, and now it will, if there is nothing special about the organization, it will sample the DNA of the cell. It has been calculated that given the on rate, which is not specific again, it will sample about 1,000 sites
45:22
before it finds one specific one that is a target where it will stay. 1,000 wrong sites with, let's say, one tenth of a second each. That's 100 seconds. It's too late.
45:40
Now if it's made close to its gene, and its gene is close in register, close in space to the targets, we don't know how much it will sample, but less than 1,000. It's already close to the targets. It will sample, but it may find the right targets
46:02
after 10 attempts and not 1,000. I don't know, maybe 10, maybe 100. You save tenfold or you save 100 fold. I don't know, but you save. Qualitative reasoning, sorry. You save something. Okay, you save time. Okay, so that might be a reason for these genes to be in register with their targets,
46:22
even when they are not under their own regulation. Anyway, it's an observation that I made in 2003 that they are in register with their targets, in bacteria, not in eukaryotes. So as you see, it's an indirect response.
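The back-of-the-envelope estimate above can be written out as a short calculation. This is only a sketch of the qualitative argument: the 0.1 s dwell time and the site counts are the round numbers quoted in the talk, and the function name is made up for illustration.

```python
def search_time_seconds(wrong_sites: int, dwell_seconds: float = 0.1) -> float:
    """Time spent sampling non-specific sites before a target is found.

    Assumes every wrong site costs the same dwell time (0.1 s in the talk).
    """
    return wrong_sites * dwell_seconds


def fold_saving(sites_far: int, sites_near: int) -> float:
    """How many times faster the search is when fewer sites are sampled."""
    return sites_far / sites_near


# Numbers quoted in the talk: about 1,000 sites when nothing is organized,
# perhaps 10 to 100 when the gene sits in register with its targets.
for sites in (1000, 100, 10):
    print(sites, "sites sampled:", search_time_seconds(sites), "s")
```

With these numbers the unorganized search takes on the order of 100 seconds, against a response window described as "a matter of seconds", which is the point of the argument.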
46:41
I cannot say more, sorry. So in yeast, we do have this phenomenon with patterns here, but the gene encoding the factor is not in a special position with respect to the targets. It's not.
47:02
Because you are talking about oxidative damage, right? So usually, you have the loops, right? So you have a chance, right? And so where is the damage? The damage can occur more easily.
47:22
Because there's a term. You could argue that the proteins are protecting the DNA as well. Right, right, right. So it's probably there you should check for the repair factors. The DNA repair factor. It should localize where it is,
47:40
because it understands that when there is this term, it's easily, it's broken easily. It can be broken easily. Just kidding. For the oxidative damage, the targets would be many things. One would be something to destroy the H2O2. Another one would be to repair the DNA.
48:02
Another one would be to avoid that the lipids get oxidized. You have lots of things, but they all fit in this scheme, actually. So now briefly, just to give you a little update.
48:22
Okay, so I explained this. This notion that on top of the Jacob and Monod view from 1961, that is a view from the gene. I am a gene. I see the transcription factor come, bind to me, and influence my rate of initiation of transcription.
48:40
There is this Monod view with short loops internal to the gene, to the same gene, and now with the notion that we sometimes have factors, regulators that agree on a certain pattern, they have the same pattern. We think that there may be a view from the cell on top of the others,
49:03
a view where you see from the point of view of the cell that there is an overall organization for all of transcriptional activities in a live cell. And that's the idea that has not been fully proven, but we do know a couple of things.
49:21
So what do we know, actually? We know from many morphologists that transcription of messenger RNA, not ribosomal RNA now, which is in the nucleolus... Actually, you see the nucleolus here. In green is the activity of transcription, and red is just the background. So the nucleolus is here in this human cell nucleus,
49:42
but you do see 10,000 dots. I mean, don't count them. We invited Peter Cook one time some years ago to talk about that. One thousand spots of transcription factories that are for the messenger RNA, not the ribosomal RNA, which are in the nucleolus.
50:02
But you do also see patterns and spots and dots in bacteria, as shown here. Okay, that's one aspect. Another aspect is this Müller-Hill story, which I explained. That is the exquisite sensitivity of the transcriptional control to tiny changes in the loop sizes.
50:26
And third is what I explained. I did just look at transcriptional interactions and given one transcription factor, look at the positions based on the sequence, look at the positions of the targets,
50:42
and then do some simple math. And less simple math, now we elaborated in 2010 a new tool to deal with biological data. Because you'd say, okay, you're telling me that there is some periodicity. Okay, so let me use Fourier analysis or wavelet analysis or something derived from Fourier.
51:05
It typically doesn't work because the biological data are, okay, as we know that. We devised a measure which is based on information theory, which is rather simple, but has the advantage that it gives a bonus
51:24
not only when we have several genes coinciding on the solenoidal phase view, like here, but also when there's a void; an exceptional void or an exceptionally high density of genes will get some bonus in the scoring system.
51:44
And this has been published. Just to illustrate the path that we took and I don't want to get into it. It works very well and much better than Fourier on real biological data. It gives peaks at given periods like 9,510 nucleotides
52:02
and also double, triple, quadruple. So the data can fit in and on linear, right? The Fourier analysis doesn't work. It's not eligible, but still it's pretty realistic, but it's quantitative. Yes. Yeah, so I wanted to skip that, but see here, we took an example
52:21
where there are two periods. Yeah. And for each period, we have several points. And then Fourier simply crashes totally. This is minus noise, this is plus noise. Our method works. Yeah, sure.
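To make the idea of an information-based periodicity score concrete, here is a minimal sketch. It is not the published measure, whose details are not given in the talk. It assumes one simple choice: fold the gene positions modulo a candidate period, histogram the phases, and score the histogram by its Kullback-Leibler divergence from a flat one. Both a crowded bin and an empty bin move the histogram away from flat, so both raise the score. The function name, the bin count and the test positions are all hypothetical.

```python
import math
import random


def phase_score(positions, period, bins=20):
    """KL divergence (in bits) of the phase histogram from a uniform one.

    0 means the phases are spread evenly; log2(bins) means every
    position falls in the same phase bin.
    """
    counts = [0] * bins
    for x in positions:
        phase = (x % period) / period            # fraction of a turn, in [0, 1)
        counts[min(int(phase * bins), bins - 1)] += 1
    n = len(positions)
    score = 0.0
    for c in counts:
        if c:                                    # empty bins add no term
            p = c / n
            score += p * math.log2(p * bins)
    return score


# Synthetic data: 40 genes spaced every 9,510 nucleotides, with a small jitter.
rng = random.Random(0)
positions = [k * 9510 + rng.randint(0, 300) for k in range(40)]

for period in (7000, 9510, 2 * 9510):
    print(period, round(phase_score(positions, period), 2))
```

On this synthetic input the score is highest at the true period and stays high at its double, which matches the remark that the peaks also appear at multiples of the period.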
52:41
So and this 10,000 base pairs is reminding us of the so-called microdomains of the bacterial chromosomes, even though we didn't show that it has anything to do with them, but it's the same size. Again, besides the microdomains, there are macrodomains in bacterial chromosomes
53:03
with the origin of replication, the terminus of replication and areas that do not interact much with the others but do have lots of interactions internally and not much externally. And actually, the patterns that we see fit.
53:21
So for instance, the borders here, which was determined by biochemistry in 2004, that's a paper shown here, are found by the fact that we have this period of 9,510 centering above the origin of replication and taking up half of the chromosome.
53:40
And then we have another period, seven kilobase pairs, which is significant and which has these two boundaries. And actually, so it was fitting the macrodomains except here. So we talked to the authors of the paper and actually, they said, no, since 2004, we changed our evaluation.
54:02
We refined it and actually it's exactly there. So you see that the macrodomains and the microdomains might have to do with this. Here again, it's just for illustration to avoid some frustration. This is proximity on the chromosome. The genes are next to each other.
54:22
And this is periodicity. That's when genes are periodically disposed and it can be interpreted like this. So without going into the details, looking at all enterobacteria, including Escherichia coli, the proximity phenomenon is lost after
54:44
or is no better above about 20 kilobase pairs of distance. That is, you know, you are the neighbor of you and you are the neighbor of her and so on. But at some point, I mean, are you a neighbor of Nazim down there?
55:03
So the answer is here. The answer is, beyond 20 genes in a row, it's not really any more proximal. Quite to the contrary, starting with 20 genes, we have periodicity in the groups of genes that are bigger,
55:21
bigger than 20 genes in the group. And altogether, and that's more important perhaps, so we have size effects, right? More important is if we add up all periodic genes for the Escherichia coli, for instance, we get 500 genes. That's 12% of the protein encoding genes.
55:42
They function in the synthesis of DNA, RNA, proteins. That's one aspect, the functional aspect. Look at the transcriptional interaction map. They are in the core. They are most connected. But very important for the biologists, we looked at all the sequenced genomes of bacteria.
56:05
We pushed these 500 genes. We looked at the homologs, the best homologs in all the 800 sequenced genomes of bacteria. And we found that the orthologs were periodic as well in all bacterial phyla.
56:23
So it's a ubiquitous phenomenon, and I will not show in detail. What phenomenon is the same one? Just the collection of periodic genes in Escherichia coli. If you look at the homologs, they are periodic as well in all other phyla.
56:42
So it's homolog in particular, Yeah, exactly. Orthologs is a way to say in brief homolog, and in principle the real homolog, based on some informatics criteria. Okay, so it's true in all bacteria.
57:01
It's true in one archaebacterium we looked at, out of one. And it's true in one eukaryote out of one, which was yeast. It was the first one actually we tried. That's all we know. We don't know more, so far. Is it the same period or what? Sorry? Is it the same period or the same length? Nope. No.
57:20
Nope. Not the same periods. It's always in the same area, but it's not at all the same period. It's just saying the cooperation. Yeah, in the end what's important is the principle, and now then you can realize it with different periods, which are in some sense arbitrary,
57:41
and if you look at cousin bacteria, you'll find something similar, but if you look further away, it's going to be not similar. It's an arbitrary question. I mean, this question doesn't... Okay, so what time? We had one hour?
58:00
Yes. One hour. So I'll go a little faster. I'll skip this and just say a little word about the biophysics. Okay, so it's a Monte Carlo Metropolis model of a polymer where we have the notion of energy
58:22
for the binding of the protein onto the DNA. The protein can grab the DNA with more than one hand. That's very important. And we also have a certain flexibility of DNA. We took the physiological parameters. We have the notion of flexibility,
58:42
of a persistence length of the polymer. Given this one model, one of the key results we had in 2010 was that if you take a strand of DNA and you put sites for, let's say, red, green, yellow, blue,
59:06
red, green, yellow, blue, red, green, yellow, blue, in a sort of a regular pattern with four equivalent of transcription factors, four regulators, four colors, then you get something which certainly is not a solenoid. This is polymer physics.
59:21
But it's going through these colors one after the other in a regular way. Now, this is not shown here, but if your polymer is long enough, then you get anomalies. It doesn't go circular like that. But with a short DNA, you get something rather solenoid.
59:44
Okay, that's one aspect. Now, if you randomize the positions of these different colors, this is an example, you get two differences, two major differences. One is some genes, some dots.
01:00:00
will not be accommodated. They will not be in groups with the others. So you would predict a weak, weaker transcriptional control. Second, all colors merge in the middle. So you would predict a high crosstalk
01:00:21
between things that perhaps should not talk. Whereas here, without putting it explicitly in the model, we got separated colors, no crosstalk. So that's the key result. And now, I mean, we did a lot more,
01:00:40
including on mammalian DNA. We used other approaches, like it's a Langevin type of model here. I'll skip that, really. It's not so important. So, conclusion so far. First of all, and that's mostly our work, co-functional genes tend to position
01:01:00
periodically in all microorganisms, so far looked at. And second, I alluded to it, I didn't show it, the gene spatial clustering favors periodical positioning in an evolutionary model. Third, if you do have periodical positioning,
01:01:21
it will favor clustering in factories, in space. And this, really, we have been working a lot on. And also, as I just showed, the solenoidal organization of chromosomes. And finally, work from others. Gene spatial clustering optimizes transcription regulation,
01:01:42
Müller-Hill and others. Transcription occurs in focal points. Focal mean what? What do you mean by focal? Oh, just gathering in the same places. You remember the morphological data, yeah? So, that's what I meant.
01:02:02
So, these are the conclusions. This is how we understand the system now. And I'll finish in the last minute by saying that, okay, these are observations and so on, but observations that are precise enough and bear on genes
01:02:22
where we can say this is the name of the gene, this is this gene doing this. Precise enough that now we are envisioning that we can engineer genomes by applying the principles of natural genomes. One of them being this. If you rearrange genes, it will not work.
01:02:43
Okay, yeah, so we have work ongoing on that. We are rearranging. Notably, in answer to your question, I said that the gene encoding the factor in bacteria is in register with its targets. So, we have moved the gene. We are waiting for the result, okay, currently.
01:03:02
We have moved the gene that encodes the factor. We expect to see the effect. And many other experiments are underway, actually. But what do you mean when they did raise the wrong factor? We think, yeah, reasonably. What kind of effect do you expect after you?
01:03:23
Well, we think that the efficiency of translation, that the transcriptional control will be weakened. And this we can measure in principle. And so, bacteria will suffer. They will suffer. They will suffer either just like that or under pressure with some stimulus or stress or.
01:03:44
Maybe you get some similarity. If you want to make, for example, compared to a model, like models have a bacteria machine, you know, again, what? The major problem was to eliminate undesirable interactions. Yes. This is exactly what happens here. Yes. Eliminating undesirable interactions. Yes, we think that it's, yeah.
01:04:00
Otherwise, yeah. How does that dialogue happen? Yeah, we think that it's also important, yes, yes. I agree, I agree. It must be important, as well, to eliminate unwanted, yeah, yeah, yeah. It's perhaps as important, perhaps more important, perhaps more important for you. You know, people tried a lot, you know, the most costly theorem for me cost about 10 billion dollars
01:04:21
through the game, yeah, because you need time. And that's exactly the kind of integration where interactions occur. Yes. And this way. Yes, because there are examples that were known a long time ago and unexplained, like there's a factor in our cells that responds to glucocorticoids, okay?
01:04:44
And it not only responds, but is the same molecule that actually influences the transcription of some genes. And people were very surprised that it does not bind and influence a gene
01:05:00
that has a nice binding sequence in vivo, in the physiological situation, but it does so on one that doesn't have a good binding sequence for the glucocorticoid receptor. These type of things are unexplained unless you accept that the position is,
01:05:21
the binding sequence is one thing, but there is also, it's also relevant to consider the position with respect to other things. And so it seems that it's one of the tools in the toolbox of the cells to avoid undesirable interactions, to play with the respective positions, it seems.
01:05:45
Yes. So we are on the verge. I mean, people have been neglecting this phenomenon.
01:06:03
Now, I didn't have the time to show some data, but we see that it's an important phenomenon, quantitatively very important. And so just to finish, because we can have questions after the break, and I think we are all sweating,
01:06:21
is now that we understand one more thing about how to organize a genome that works, that functions, we are using this information to try and engineer desirable features on chromosomes, on genomes, bacterial genomes
01:06:41
that can be used for different purposes, be useful to produce a drug or the precursor of a fabric or things like this. And this is the reason why we founded this company called Synovance that you mentioned.
01:07:01
To use this principle, apply this principle, but also other principles that were known before that are constraints on the architecture of the genome. But you can't believe that the correlation is what is wrong with promising structure.
01:07:22
So promising is denied, after all, right? So this is denied. So we don't know. I mean, the only case where we touched upon that was in the first paper in 2003, when I worked with yeast; all the other work has been done with bacteria. And for yeast, we don't know how to articulate
01:07:42
these observations. I think it's at the shorter scale than the chromatin story. Not that the nucleosome is smaller, yes. So okay, okay, so it's at the larger scale than the nucleosome, but it is a smaller scale than the long-range organization of chromatin. But these are just words.
01:08:01
I don't know. It's the honest response. In transcription factors, in the promoter regions, there are not many in the questions. We don't have a chromatin in promoter regions. Yes, so that it receives proteins, yeah. Less than in the open reading frame, even less.
01:08:25
Less than in the open reading frame, yes. Yes, okay, so maybe we'll take a break. Okay. I was asking about the possibility of modeling, yeah. That you try to model and make some quantitative,
01:08:41
quantitative estimates. Part we can do by, and basically the part we can do by modeling. True. So you could model from scratch. You could say, okay, give me some information. I'll model from scratch.
01:09:02
Some people have been doing that. They didn't bring much new information. Another option would be to say, give me the average distance between the partners, which we can give with our Monte Carlo model. Really, we give distances, real distances.
01:09:22
We cannot give time because it's Monte Carlo, but we give distances at the best equilibrium level. Give me distances, and I will model in order to give you now the level of activation. So if this is the DNA binding site,
01:09:45
and this is the transcription factor protein, I'll try to tell you how much this will activate or deactivate or repress the rate
01:10:07
of initiation of transcription.
01:10:22
Rate, what do you take? How you measure it, but not measuring it. Ah, great. Do we measure it? No, what it is, how we define it, right? Oh, so. It's a number, you mean. So it's a number of initiations of transcription per minute or per second, per unit time.
01:10:44
How many per second, for instance, how many times will this initiate transcription? That is what is controlled by the transcription factor.
01:11:04
So that, and this would be valuable. Now, on a more analytical approach, so this would be simulations. And before I forget, one of the difficulties is that we do not know the parameters well,
01:11:21
so you could search for the parameter, but then you'd like to know the value, in some cases, to calibrate your parameters, and this is difficult to measure. There are a few data, but not many data on measuring this.
01:11:42
This is a difficulty. Now, for an analytical approach, I would suggest that Hill 1910 could be a starting point.
01:12:01
Meaning, what kind of work you have, so for the, is mathematician here, or who? Yeah, but I don't know how much. I never met him, unfortunately, and I don't know how much, and the paper. What kind of work is that on, which field? So what I can tell you is biochemists use his work in the following way, and this doesn't tell you much
01:12:22
about, and sorry, but I looked into that probably eight or nine years ago, and I don't remember the specifics. What I do remember is how biochemists use this paper. Who? Biochemists.
01:12:42
No, any biochemist. Is, he proposed a way to, I'll, he can, he doesn't want, I see.
01:13:04
Okay. His work is used for the following.
01:13:22
Consider, ah, very nice, thank you. Consider a beaker with liquid, and a semi-permeable membrane here, and consider solutes of two types, one that can cross, and one that cannot cross.
01:13:47
One that cannot cross, say a macromolecule that cannot cross this semi-permeable membrane, and a small salt that can cross. And put the, put them all here,
01:14:04
and then so the macromolecule will not equilibrate, and the small molecule will equilibrate. Imagine you have a way to measure how this distributes. And, oh, sorry, I forgot the most important thing. This small molecule binds, can bind the big molecule.
01:14:24
And this small molecule, so to measure it, what do you do, you make it radioactive, or whatever. Fluorescent, radioactive, something. So the small molecule binds to the big molecule, and you can follow it. And now measure, measure how much small molecule,
01:14:44
small molecule, macromolecule, measure how much small molecule in this, is in this compartment, how much is in this compartment, and then you can use Hill 1910 to find out the two things,
01:15:02
the concentration of the big molecule, but also how many small molecules bind per big molecule. And don't ask me more. Sorry, I forgot fully. At that time, we had a physical, a mathematical physicist in the team, a German guy.
01:15:22
Was pretty good, and I tried to talk him into, trying to restart from Hill 1910, and following up to obtain this type of information. Not this, but another application
01:15:40
towards this result, in an analytical way. And in the end, he shied away, he didn't do that, and then I forgot all the details of what we had been discussing, so don't ask me more. But I can dig, I can dig up and see if I can find again what was the idea.
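For orientation, the dialysis experiment described above can be sketched in a few lines. The speaker says he no longer recalls the specifics, so this is only the textbook reading, not necessarily what he had in mind. It assumes that, at equilibrium, the free small molecule has the same concentration on both sides of the membrane, so the excess on the macromolecule side is the bound fraction. The second function is the Hill equation in its usual modern form. All names and numbers are illustrative.

```python
def bound_per_macromolecule(ligand_inside, ligand_outside, macromolecule_conc):
    """Average number of small molecules bound per macromolecule.

    inside  = compartment holding the macromolecule (free + bound ligand)
    outside = compartment without it (free ligand only)
    All three arguments must use the same concentration unit.
    """
    bound = ligand_inside - ligand_outside
    return bound / macromolecule_conc


def hill_fraction(free_ligand, k_half, n):
    """Hill equation: fraction of binding sites occupied.

    k_half is the free-ligand concentration giving half occupancy,
    n is the Hill coefficient (n > 1 indicates cooperative binding).
    """
    return free_ligand**n / (k_half**n + free_ligand**n)


# Illustrative numbers only: 30 uM ligand inside, 10 uM outside, 5 uM protein.
print(bound_per_macromolecule(30.0, 10.0, 5.0))
print(hill_fraction(10.0, 10.0, 2))
```

Repeating the measurement over a range of ligand concentrations and fitting the occupancy curve is how the binding parameters would be extracted in practice.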
01:16:02
Well, but here, I meant a very simple kind of issue. When you have a kind of random walk, and in the size, when it turns 30, 30, 30, 30 minute, and how much, it depends on distribution or distance from the weights that you start. This is computable, this is a kind of elementary,
01:16:20
a kind of homogeneous space, you can easily do that. Of course, here, it's kind of more complicated, but this first, I would make a simple computation and see how much it helps. In two spaces, this way, but way, you would buy different answer. They might depend on the dimension. But this is the computation we use. And actually, I can think, I know a mathematician who can easily do that, can do it with that.
01:16:42
Okay, we still have to find the right person. But I'm open, I mean, of course. Okay, okay, let me know. Thank you. I have some questions about this periodic genes bacteria.
01:17:06
Are they more essential and more conserved than genes which are not periodic? So I cannot answer directly,
01:17:21
but indirectly, yes, these 500 genes from E. coli are involved in two categories of roles, two roles. One is spatial organization, whatever it means, but it's funny because we start from a hypothesis on the spatial organization,
01:17:41
and we end up with a list of genes that are labeled spatial organization, funny. But most of those genes are involved in macromolecular synthesis, which means synthesis of DNA, RNA, protein, which means they are very central to the life of the organism.
01:18:02
One sub-hypothesis could have been cells had to optimize the expression of genes which produce a lot of product, massive product. For instance, I mentioned ribosomes. Okay, so then it would mean that they have to,
01:18:22
they have to massively produce protein altogether and also massively produce some RNAs, like the ribosomal RNAs. So you'd think these are the massive products of the cell and that's where you need more optimization.
01:18:41
Well, not, not because, it's not the case because we also have DNA synthesis in the story, and DNA synthesis is not massive, unlike RNAs and protein, it's not massive. So the criterion is not,
01:19:01
the criterion is not the fact that it is a massive production. But indeed it's DNA, RNA, protein synthesis is among the top roles of the genes that are periodic. We've come back to the previous question.
01:19:21
You predict, we know, so they know there's something like how many genes are not what is necessary to model cells, but not to emulate. So to evaluate what exactly the, the importance or? How many genes and proteins you should put in the model
01:19:41
to understand the same interactions? It's a difficult question because it would not work if you put in four or five genes because it's a collective phenomenon. It would not work if you reduce the genome
01:20:01
to a small size because it needs several loops and the loops cannot be small because of the lack of flexibility of the polymer DNA. So it's not an easily reducible case. You cannot say I'll work on two genes
01:20:20
and see how it works or even five genes. You need a genome-scale pattern. But how many, again, is it a number? A hundred, is it a thousand, or is it hundreds? In the hundreds of genes. Hundreds of genes, yeah. No, but there may be some kind of approximation.
01:20:42
So it's a huge thing, but you can use for something like having clusters, an operation with these clusters. So it's not maybe modeling them point by point, but somehow kind of average over certain groups. So we're creating such a huge, if you take, what, 1,000? Yeah. Because 1,000 squared.
01:21:01
So it's like one million. Yes. For one future, it's nothing. Then I should play with this biodegradable data. On the lattice? Yes, but surely we would, we usually use a bit of computations. Yeah, I really don't know.
01:21:21
No, what I'm saying is that you don't have to do it point by point. Yeah, you can average over some regions, yeah? Immediately use computation, right? Yes, yes, yes. You have to first invent a class of model and think about how they work and then do it. Actually, I forgot. There's a third approach,
01:21:40
which would be to generalize Vilar and Leibler. Because Vilar and Leibler, Stan Leibler.
01:22:02
In this paper, they deal with a case of two sites, two very close sites. So not only are we saying that the sites can be far away along the 1D sequence of the genome, but we are saying that it's not two. It could be 200, 200 sites.
01:22:23
And this model works for two. Intrinsically, it works for two. It doesn't work for three. But there might be a way to generalize their approach. Again, it's something that we were discussing in 2007, 2008 with this German mathematical physicist,
01:22:44
Timo Horv, and I don't remember the specifics, but it did not look to be an easy thing to do. The way you describe it, there are so many. There may be a simple mathematical argument and immediately we'll show it has tremendous effect. In two issues where it's kind of clear, it's exactly multiply things.
01:23:01
Just multiply them and we'll explain numbers immediately if they become very strong. Because we bring together, so it's not one pair, but many pairs, so it's like multiple. It might be very huge effect. Yes, yes. So many minor effects, major effects. I agree. And therefore, it may be. I agree, but I agree from my intuition. I mean, your intuition is probably of a different nature.
01:23:20
No, no, it's same. I must have numbers here. And furthermore, yeah, we have to find a person who we can deal with this and we'll be motivated to deal with this. I'm ready to talk to a person to dig out these previous ideas as well. And I suppose that it is, yeah, it could be done,
01:23:40
but not me. Yeah. Okay, so should we stop? No, no, it's so kind of tantalizing we have to formalize it and just wrap up the relationship there.
01:24:00
But of course, just to make it interesting, we have to know kind of data, something more information. For example, when you have this size, you have these kind of correlations. Then you may try to use it by clustering analysis with that kind of routine, so you can get lost, right? And of course, can react, for example, what these genes, what these proteins,
01:24:23
what you know about them, kind of experimentally. Of course, you can make, then you make a mathematical model, but then you miss some crucial point there. Sure, sure. You mean in simulations? In simulations. In simulations? No, I mean the biology. We do have some reasonable orders of magnitude
01:24:41
for the energy, the binding energy, for the physics of the polymer DNA. We do have that. This is a very, very, quite a remarkable thing, absolutely, it's a tremendous, it seems to be a huge factor, right? It's not just one, 10, 20, or something. Yeah, because it's not just two, it's like 600 and they can exchange in a dynamic way.
01:25:05
I live on that side, and then I can go back to the same, or go to another one. And in evolutionary, again, can you estimate how the structure being organized, yeah? It's a high-scale structure, so how evolutionary it should work.
01:25:24
Okay, so we had this simple model, which basically said genes belonging to the same list, oh no, sites, sorry, sites that belong to the same list,
01:25:44
will, how was it? Will be subjected to attraction, so it was not very realistic, it was more like a distance to the minus two type of gravitational or Coulombian attraction,
01:26:06
which we thought would accelerate the reaching of an equilibrium, and we had to be fast, why? Because it's an evolutionary model, so evaluating one chromosome for its fitness
01:26:22
should not take more than hundredths of a second, because we have to evaluate many, and we have to go through many evolutionary steps, so we had two time scales, we had the ontogenic time scale, that is evaluation of the fitness of the genome, which was simply, very simply made by saying,
01:26:41
okay, so we have, let's say, 200 sites, and after folding it, we measure the distance, all the pairwise distances between these sites, yes, that's what we did,
01:27:00
all pairwise distances, and we take the average, and that's the coefficient of fitness, actually it's the inverse of the coefficient of fitness. The coefficient involved, right, probably is a major, it's not just in a point mutation, right, it's the result of the sphere problem. Yeah, exactly, no, that's what we were doing, so yeah, I didn't say that,
01:27:21
because that's the short time scale, we evaluate the fitness of a genome, then, after evaluating 100 genomes, we took, so we had the fitness coefficient, which was the inverse of this number, of the average distance, pairwise distance, we take the best, now we have, we go one step further in the evolution,
01:27:43
so it's a different time scale, it's a longer time scale now, we used variation operators, exactly what you said, so these were only of this type, choose randomly two sites,
01:28:01
and a third site where you insert your transpositions, we had only transpositions, so we make, from the winner chromosome, we make 100 or more, 100 copies using random transposition, and then we evaluate them, and so on and so forth, and we keep the best in terms of clustering again,
01:28:25
and then another evolutionary step, again with 100 chromosomes made with this variation operator called transposition, and so on and so forth, and after a couple of hundred thousands of steps, evolutionary steps,
01:28:42
in a few hours with a computer, we got things that were regularized, that is, we did have some proximity, and periodicity patterns appearing from a random initial sequence.
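The two-time-scale loop described above can be sketched as follows. Several parts are stand-ins and not the model from the talk. The folding step there used an attraction between sites of the same list; here the chromosome is simply wound on a fixed helix, which is a much cruder assumption. Keeping the parent in the pool at each step is also an assumption, made so that the score never gets worse. What does follow the description is that fitness is the inverse of the mean pairwise distance between sites, the only variation operator is a random transposition, and 100 copies are made per step. All names and parameters are hypothetical.

```python
import itertools
import math
import random


def mean_pairwise_distance(chrom, beads_per_turn=10, radius=1.0, pitch=0.5):
    """Short time scale: 'fold' the chromosome and average site-to-site distances.

    chrom is a list of 0/1 flags; 1 marks a site of the co-regulated list.
    Folding is a fixed helix (toy assumption). Lower is fitter.
    """
    coords = []
    for idx, is_site in enumerate(chrom):
        if is_site:
            angle = 2 * math.pi * idx / beads_per_turn
            coords.append((radius * math.cos(angle),
                           radius * math.sin(angle),
                           pitch * idx / beads_per_turn))
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def transpose(chrom, rng):
    """Variation operator: cut the segment between two random points
    and reinsert it at a third random point."""
    i, j = sorted(rng.sample(range(len(chrom) + 1), 2))
    segment, rest = chrom[i:j], chrom[:i] + chrom[j:]
    k = rng.randrange(len(rest) + 1)
    return rest[:k] + segment + rest[k:]


def evolve(chrom, steps=30, copies=100, seed=0):
    """Long time scale: at each step make `copies` transposed variants of the
    current winner and keep the one with the smallest mean pairwise distance."""
    rng = random.Random(seed)
    best = list(chrom)
    for _ in range(steps):
        pool = [best] + [transpose(best, rng) for _ in range(copies)]
        best = min(pool, key=mean_pairwise_distance)
    return best


# Random initial chromosome: 8 sites scattered among 52 neutral positions.
start = [1] * 8 + [0] * 52
random.Random(1).shuffle(start)
evolved = evolve(start)
print(mean_pairwise_distance(start), mean_pairwise_distance(evolved))
```

On the helix, two sites are close either when they are neighbors along the sequence or when they sit at the same angle on adjacent turns, so this toy fitness can reward both proximity and periodicity. Whether it reproduces the regular patterns reported in the talk is not something this sketch establishes.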
01:29:02
We can do it with the activities, you're not related to subject with transposon, or some other mechanism involved, biological mechanisms. Transposons are supposed to be random. But there may be some biological attached inside the scheme, that might be something, right? I mean, Misha, you are going too fast. It is possible, but it's really a wild hypothesis.
01:29:21
Of course, absolutely. It's a wild hypothesis. It looks really interesting just to relate this kind of mathematical understanding, and the way you're doing that, but if you relate it to something biological, you'll have less of that. Yeah. We know that when horizontal gene transfer occurs,
01:29:42
normally, it's a contiguous piece which goes to another place, or to another organism. Horizontal gene transfer. However, your hypothesis amounts to saying that it could be a piece of this one,
01:30:00
plus a piece of that one, together. Well, there is very little, I mean, there's no proof of that. There are some hints, perhaps, but no more than hints. And if you ask people in the field, they would say, no, no, no.
01:30:22
It's not possible. You see, what you're saying, and it cannot be positioned, but there is a book by Kuhn, you know, about evolution of genomes. The theorem is completely random, right? Exactly, the structure is not the picture. In this company, what you're saying, you just have to write the whole book, right?
01:30:42
Right, yeah. It's random until we find the regularity. Yeah, we should do a pattern of organization, exactly. And then, I am curious, I can prove again what he wrote, how you can, it wasn't that if you use something, and this is, if you,
01:31:01
I'm just curious: if you do what he does, but now make your assumption, and you get a different answer, can it relate better to the real world? You see? This again comes down to the kind of computation you can imagine. And there are constraints on the organization of the genome,
01:31:20
and the certain type. And so we live in a different space and we live in statistics. So there are two mathematics in one, one inside of the cell, and then in the space of genomes and evolution. His book was written before that. Yes, sure. No, no, the book was written about three or four years ago, or five years ago. So he didn't read the good authors. He just, he told me,
01:31:40
he told me, I'm waiting to see if he refers to the right. I came up with my computer, so I was like. No, this is, there are two kinds of issues in the medical illuminator, as you said. One in the cell, and the other in the space of genomes. True. Yep.