PUBLICATION août 2016 - Towards design in Synthetic Biology

Formal Metadata

PUBLICATION août 2016 - Towards design in Synthetic Biology
Title of Series
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Thanks to the organizers for having this very nice meeting in this very nice setting here close to Paris, also including a little bit of testing of our capabilities that are slightly different from scientific deduction, by putting us in the residence and testing whether we find access or not. Very interesting, but thanks very much anyway. Following up on some of the discussions we had earlier about what you can do and what you cannot do with mathematical models in a biotechnology setting, I'd like to talk about our progress towards design in synthetic biology. I'm particularly insistent on "towards", as you will see. I'm going to cover three different topics that we're treating in our lab. Some of them are examples of what you might want to call design and others are reasonably far away from it, and then maybe in the discussion we can also explore the boundaries of this. Let me start with a slide on some of the real-world applications of synthetic biology, which are obviously growing continuously, but which include, to quote some of the maybe more prominent ones: the production of artemisinin by Sanofi-Aventis, based on a process towards artemisinic acid developed by Jay Keasling; an example from Global Bioenergies, to also quote something that is closer to here, the production of hydrocarbons; the sterile insect technique as developed by Oxitec in the UK; or drug screening applications of biocircuits as commercialized by BioVersys, which happens to be a spin-off of our own department, from the group of Martin Fussenegger. I hope I have captured more or less the different product groups that we're currently thinking about in synthetic biology. In fact, it seems to look quite okay. The Wilson Center is an outfit in Washington that is very much interested in the progress of synthetic biology and its interactions with society, so if you go to the
product database of the Wilson Center, you find a hundred sixty products from synthetic biology, many of which are on or close to the market. That looks actually quite promising. But if you go into the nitty-gritty details of this list, you find things such as 1,3-PDO (propanediol) or polyhydroxyalkanoic acid from Metabolix. These are products that have been around for a very long time, predating what we might call synthetic biology by quite some years. So in fact the number of products on the market that I think can be associated with synthetic biology is substantially smaller than these 116. I found one comment from Jay Keasling very informative, which he made during a conference: he pointed out that the person in charge of subsidies for biofuels in the US might actually be one of the most important people in commercializing synthetic biology, and that without his help it might get tricky for many of the compounds. On a slightly more pessimistic note, I went through the list of vitamins that are on the market, which you can buy in particular from my former employer DSM. Vitamins are something that obviously is made by cells, microbes or other cells, so in a way it's the perfect microbial biotech product: you just need to crank up the production and then harvest, and that's it. But if you look at the list of these 13, I think, molecules, there are only two which are actually made by fermentation: riboflavin and vitamin B12. Vitamin B12 is a clear case: it's not quite impossible to do chemically, but reasonably close to impossible. All the others are not so strange, but nevertheless 11 of them are fundamentally made by chemical means, and even the mixed production schemes that I pointed out here are fundamentally chemical steps plus one or the other enzymatic step to help the synthesis along. So even in a product portfolio
where we could think that this should really be one that is suited for synthetic biology exploitation, we find that most of this is actually done by chemistry. Why? If you think about the typical biotechnology process, it's typically separated into three parts: the upstream processing, which is everything related to engineering the right host; then the actual process; and the downstream processing, which refers to purifying the product. This and this are reasonably similar between biotechnology and chemistry. Please forgive me, I know that's an exaggeration: fermenting bugs is substantially different from running a synthesis on a six cubic meter scale, I appreciate that. But I still would say that for this and this part of the entire process we have, in the discipline of chemical engineering, what I would call a comprehensive design framework. We have clear ideas about kinetics and about the organization of the workflow into unit operations; we have similarity analysis, dimensionless numbers, mass balances. We know how to deal with this and we know how to scale it up. And no, scale-up is not a pure science; yes, scale-up is to a certain extent still an art. But nevertheless we've done it over and over again, and we know how to do it. So the difference has to be here on the left
side. And if this is chemical engineering, then Greg Stephanopoulos has at some point in the past coined the term biological engineering, which I very much like. The question is whether we can come up with something similar to a comprehensive design framework on the level of biological engineering. Yes, we have a couple of techniques that have developed over the last 20 to 30 years: metabolic engineering, systems biotechnology, synthetic biology. But if we pose the question whether these methods are really a comprehensive design framework, I think it's fair to say no, they're not, at least not yet. The reason is of course that, if you compare to chemistry, a chemist has to deal with only one reaction, or maybe two, and this is something that is manageable: the number of unexpected effects that can occur is limited. We're looking at cells, and cells are, as we all know, much more complex than that. Nevertheless, if we want to get more work from chemistry into the hands of biology, this is something we have to deal with. Of course time is on our side: limited raw materials for chemistry, environmental impact, etc. All of these points matter, but in a way biology has to do its homework and get closer to the development times and the development power of chemistry. So I guess it's fair to say that biology has to come closer to being a design science, or that we need to come closer to being able to design in biotechnology. So, a typical design cycle, and we've seen variants of this picture already quite a number of times in this conference: let's assume we have a blueprint of a catalytic system that we can work with at the desktop, at the computer, and which we then translate into a DNA print at your favorite DNA synthesis provider, let's say GeneArt for example. Then you go ahead, take this DNA construct, implement it into your chassis, and hope that it's doing what it's supposed to do. And if not, then you
go through this cycle the number of times that you need. We tried quite a lot to get close to what we would call design in this cycle, and at some point, which I'm going to explain in a couple of minutes, we kind of gave up. We decided that if you want to do real design, and I'll get to that in a minute, we need to step over this particular element here, go directly between the two, and take it from there. Why is that? This is our favorite test system. We're interested in making this compound here, DHAP, which is an abbreviation for dihydroxyacetone phosphate. For the connoisseurs among you: dihydroxyacetone phosphate is an aldol donor in a set of aldol reactions that can be catalyzed by a variety of aldolases in biocatalysis, and the nice thing about it is that if you use this one as a donor you can produce vicinal diols in all four possible diastereomeric configurations. That's what chemists dream about at night, and biocatalysis can easily do this. However, making these molecules is a mess. One easy way to make it, if you look at biochemical maps for example, is to go from glucose down the road of glycolysis, and that is classical glycolysis, not the one that Pablo showed us in the last talk, but standard Embden-Meyerhof-Parnas glycolysis: glucose 6-phosphate, fructose 6-phosphate, fructose bisphosphate, and splitting by the aldolase into dihydroxyacetone phosphate and glyceraldehyde 3-phosphate. On the way from here to here you spend two ATP, which in terms of making chemistry are expensive, so you want to regenerate them. You can recruit the lower part of glycolysis, from glyceraldehyde 3-phosphate via 3-phosphoglycerate down to pyruvate. In the course of this regeneration operation you spend an NAD, which you convert into NADH and which you need to recover, for example by converting pyruvate to lactate. Here you get the NAD regenerated, and of course also the two ATP that you invested. In particular, if you take out the triose phosphate isomerase that usually interconnects
these two, then you have one mole of glucose leading to one mole of dihydroxyacetone phosphate and one mole of lactate, with all cofactors regenerated. That's the idea. Dihydroxyacetone phosphate is phosphorylated, so it's not a good idea to make it in the cell; we do it outside the cell, so we have an in vitro system. Assembling all these enzymes in an in vitro scenario is not necessarily the best idea on a commercial scale, so we used to do this with cell-free extracts: simply grow E. coli to a favorable growth state, harvest them, rip them open and use the cell-free extract without much further ado to run this reaction. This works, but if you want to get it really down to the level of design, then it turns out that these systems are too complex, and I'll get back to this in a second. Let me briefly point out again: it's an in vitro system. We are not working with cells, we are working with cell-free systems, and we're interested in running this ten-enzyme problem. One more piece of information: in the following slides you're not going to see much about dihydroxyacetone phosphate; you're going to see more about glycerol 3-phosphate. That is because dihydroxyacetone phosphate is relatively unstable, and in a real-world reaction it would never be a product in itself; it would be the intermediate substrate that you make in order to make other sugars. So what we decided was to add another reaction, a dehydrogenase that converts dihydroxyacetone phosphate into glycerol 3-phosphate, in reflection of its later real-world usage. If we do this, then the NAD gets regenerated here and not here, and we don't need the lactate dehydrogenase, I should say. OK, so the problems that we
had when we were using the cell-free extract were that all these background reactions, which are kind of competing reactions for our core glycolytic network, were bothering us. ATP got degraded, etc.; all these things were interfering with our desire to actually do design. So at some point we gave up and went back to commercially purified enzymes. Glycolysis is nice because you can buy most of the enzymes, and then you can simply assemble the pathway without all the side reactions that you would otherwise need to take into account. So in
terms of complexity we reduced the system: we took out the element of the very large chemical network, and we took out the element of change in time, for example by evolution or by reacting to the environment, which makes you change your metabolic network. These two elements, which are admittedly super important in modeling cells, we kicked out. Then there was the question: now that we have simplified the system, can we do design? And then the question is what we would want to design. For us this was the identification of the optimal composition, the optimal distribution of a given amount of enzyme activity such that the product is formed with maximum productivity. So when I ask myself what the optimal composition of the system is, I'm asking how many units of enzyme 1, 2, 3, up to 10 I need in order to get my product in the shortest possible time. Of course you could say you can always make it shorter by adding more enzyme. Yes, of course; that's why we constrain the amount of enzyme that we can distribute. So the design question is: given that you have 10 units of enzyme, or 20 units, or 40 units in this particular volume, how would you distribute these 10 or 20 units among the 10 possible candidates? Next question: what's the actual design process? Ideally we would like to produce a mathematical model that allows us to exploit computational optimization procedures rather than using PhD students. If you have ten enzymes and you want to explore, I don't know, five concentration variants for each enzyme, then you have 5 to the power of 10 experiments to carry out, which is quite a lot for the average PhD student. So ideally we would not need to do this: we would have a model, we could ask the model to identify for us the optimal solution to this problem, and we would take this suggestion from the computer, implement it, and there would be a chance that the system actually does what it's supposed to.
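To put numbers on the design question, here is a minimal Python sketch. The grid count uses the figures from the talk (ten enzymes, five concentration levels each). The allocation part is only a toy and is not the model used in the talk: it assumes a chain of irreversible first-order steps, for which the mean transit time is the sum of 1/(k_i * e_i), and the rate constants in `k` are made-up values chosen purely for illustration.

```python
from math import sqrt


def grid_experiments(n_enzymes: int, levels: int) -> int:
    """Number of experiments in a full grid over all enzymes."""
    return levels ** n_enzymes


def transit_time(units, k):
    """Mean transit time through a chain of irreversible first-order steps.

    Toy assumption: step i has rate constant k[i] * units[i].
    """
    return sum(1.0 / (ki * ei) for ki, ei in zip(k, units))


def optimal_split(total_units, k):
    """Allocation that minimises transit_time for a fixed enzyme budget.

    A Lagrange-multiplier argument gives units[i] proportional to 1/sqrt(k[i]).
    """
    weights = [1.0 / sqrt(ki) for ki in k]
    scale = total_units / sum(weights)
    return [w * scale for w in weights]


print(grid_experiments(10, 5))  # 9765625

# Hypothetical activities per unit of enzyme (illustrative values only).
k = [1.0, 4.0, 0.25, 9.0, 1.0, 16.0, 0.5, 2.0, 1.0, 4.0]
budget = 20.0
even = [budget / len(k)] * len(k)
best = optimal_split(budget, k)
print(transit_time(even, k), transit_time(best, k))
```

In this toy case the constrained optimum has a closed form; a nonlinear model with many fitted parameters, as described in the talk, would need a numerical optimizer instead.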
Usually, at that stage, when it comes to making predictions from the model, the game is over, for a variety of reasons, mostly complexity reasons. Network and change in time I already mentioned, and these are things we have eliminated. But then there is still feedback: all these systems are highly nonlinear, so they don't scale very easily, and there's quite some mathematical effort involved to get them right. Another point is data density and data quality. Very often you have only a few data points available to calibrate your model, and this is usually (a) not enough and (b) these points are taken from an experimental scenario where they're actually not helpful. I guess most of you have at some point determined the Vmax and the Km of an enzyme. To put it very bluntly: you cannot determine the Km of an enzyme if you only measure at concentrations more than 10 times higher than the Km. And in many in vivo scenarios this is exactly what we do, because in metabolomics, for example, we give glucose pulses to the system, because glucose is the only thing that can go across the membrane, and then we trace how the glucose pulse changes the metabolic constitution of the cell. But in order to really find out about the kinetic parameters of enzyme eight or ten or twelve in the sequence, we would not need to vary the concentration of glucose; we would need to vary the concentration of the immediate precursor or of the product, and these are things that are very difficult because there's the cell membrane. So data density and data quality are two more highly important points, and this is something that we can deal with. So we decided to go for this particular setup here. This is the real-world setup: here you see a mass spectrometer, an enzyme membrane reactor and a couple of HPLC pumps. More conceptually it looks like this: we have a buffer system and a
pump and injector loop, and the reactor, which is fed from the system here. Because all of this is in vitro, you can feed whatever you want: you can feed glucose or intermediates from your pathway, you can add additional enzymes, intermediates, products, etc. Here is a membrane that retains the enzymes in the reactor but lets intermediates, products and substrate through. These go through the tubing here, are conditioned with mass spectrometry matrix buffer and are then immediately injected into a triple quad MS. Why
immediately injected into a triple quad MS? That's unusual: usually people run this through an HPLC column first to achieve a separation. But we didn't want to do this, because we wanted to work at high data density, meaning lots of samples per time, so here in this case one analysis cycle takes 48 seconds. If you want to do this, then HPLC is no longer an option and you need to go directly into the MS. That brings a couple of problems with it, such as ion suppression and others. You can solve this by doing a lot of calibration work, etc.; I'll skip over this. In the end, what you remain with is a set of data like this. If you put in a step input of glucose, of enzymes, whatever, you produce a perturbation of the system in the reactor, and then with this MS setup you can directly trace the response of the system with high data density and high accuracy. If you see that all these lines are a little blurry, that's not because we don't know how to make good lines; it's because the lines are actually not lines but single dots, one from each measurement. If you have a look here: 18 minutes, with one measurement cycle every eight seconds. So you can imagine that there are a lot of data points, which help us capture the dynamic behavior of the system very nicely. If you want to do
design, you need to be accurate enough. So one of the things that we also did was to fully identify our measurement setup: by running a couple of test functions through the system, we were able to describe accurately the influence of the measurement system on the data that we get. You can see here that even though we put a step function into the system, just by running it through the system, with no reactions, nothing, the step function converts into this particular form here. This is something that you can capture mathematically in order to correct your data once you produce them.
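As an illustration of what "capturing the measurement system mathematically" can look like, here is a minimal Python sketch. The talk does not say which functional form was identified, so this assumes, purely for illustration, that tubing and detector act as a discrete first-order lag with a known time constant `tau`; the function names and numbers are hypothetical.

```python
def first_order_lag(signal, tau, dt):
    """Distort an input signal with a first-order lag (explicit Euler steps).

    Assumption for illustration: the measured trace starts at 0.
    """
    out = [0.0]
    for u in signal[:-1]:
        out.append(out[-1] + dt / tau * (u - out[-1]))
    return out


def correct(measured, tau, dt):
    """Invert the lag: recover the input from consecutive measured points."""
    return [y0 + tau * (y1 - y0) / dt for y0, y1 in zip(measured, measured[1:])]


# A step input, as in the test runs without any reaction.
step = [0.0] * 5 + [1.0] * 20
tau, dt = 3.0, 1.0  # hypothetical time constant and sampling interval

measured = first_order_lag(step, tau, dt)   # smeared-out step
recovered = correct(measured, tau, dt)      # sharp step again
print([round(v, 3) for v in measured[4:10]])
print([round(v, 3) for v in recovered[4:10]])
```

With noisy real data, a direct inversion like this amplifies the noise, so in practice one would more likely include the measurement model in the fit than differentiate the raw trace.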
Now comes the thing about which we are very excited. I told you what people in metabolomics typically do when they look inside cells: the only thing they really can do is throw glucose into the system, which is a step function, a step-up function of glucose, something like what you see here. That's if you work in vivo. If you work in vitro, the range of experimental liberty is much broader. For example, rather than putting these steps in, you can do ramps, pulses, sine curves, which actually transmit quite accurately into the reaction system. What you have here are some iconic representations of what happens to the substrate in the reactor, and you can see that it nicely crosses a number of states that allow you to cover fundamentally the entire working range of the enzyme. So the model that you produce from this is supposed to be very good, at least in the test range, because with these different dynamic perturbation functions you force the system to explore all the different corners of the experimental space, and the model that you get from this should work in the entire experimental space. What we did then was run experiments like these here, 22 of them. You start with something and then you start to perturb the system: add enzymes, add compounds, whatever, and then you get this behavior in time. Each of these abrupt changes means that we did something to the system, added something. And then you go back to the
table and write a mathematical model, in this case one with sixty parameters, and then you use the data that you have to find values for the parameters. So we did 22 experiments, and at the end the model was able to reproduce these various experiments actually quite nicely. Is this model any good? Did we get odd values for the parameters, or did they make sense? What you see here is a list of the kinetic parameters of the various enzymes we're looking at. The stars are the parameter values that we found, and the red lines are boundary values that we extracted from BRENDA, an enzyme database that gives you ideas about Km values, maximum, minimum, etc. You see that in many cases the values that we found for the model were placed quite nicely within this boundary range, which made us confident that it makes sense. There were a couple of cases where this didn't work. Sometimes we were out of the range, but not by much; okay, fine, we have different conditions, different buffers, etc. Sometimes the values were exactly at the boundary, which meant that if we had optimized further they would most probably have crept outside. So we weren't always very good, and sometimes it just didn't make much sense. In these cases, when you do the analysis, you get back to the problem of mathematical identifiability. As I said before, some experiments are simply not suited to identify certain parameters; as in the determination of Michaelis-Menten kinetics, you cannot measure at Vmax all the time. Even though we had this marvelous experimental freedom with our in vitro system, there were apparently some corners that we should have crossed in more detail. But because the model was good enough, we decided to ignore these cases. And then there was obviously the question: now that we've done all the work, can we actually do design with this model? We asked two
questions: one was the optimization of productivity, and one was the optimization of the amount of NAD you have to throw into the system. What you see here are experiments where we distributed 20 or 40 units of enzymes according to the design question I explained earlier. The dashed lines are predictions and the solid lines are the actual data from the experiment, and the message is fundamentally that dashed line and solid line are, for all intents and purposes, very close together, in particular here in the 40-unit case for pyruvate, and also quite okay for glycerol 3-phosphate. You see that the lines actually are close together, meaning that in terms of predicting the optimal productivity of the system we did a good job. We did an even better job for the question of optimizing the NAD concentration: we looked at various concentrations of NAD and looked for the smallest concentration that would still give us good performance. This one represents the experiment, and you see again that it's doing quite fine. From our point of view this was the punch line of this particular story: if you go to enzymes alone, take out the bigger network of the cell, take out evolution, etc., this is something we can manage. Yes, there's nonlinearity in the equations, etc., but this is obviously something that we can deal with, given that the analytics we throw at it are good enough to give us a good model. So, design within a biosystem: yes, if you agree with me that a system of ten enzymes is a system, and we might argue about this. Okay, let's increase the level of sophistication a little bit: go away from in vitro and go into in vivo. Take this part in here, manufacturing the DNA at the fabrication site and implanting it, and then go through one or the other cycle. How can you use design there? If you go through the cycle in an in vivo system, then usually at one or
the other step you're going to have to look into a library approach. Usually it's not good enough to make one design; you're going to vary the design in one way or another in order to explore a big experimental space to find the optimally producing strain. Again, we're interested in metabolic systems, so we're interested in making something. If you're interested, for example, in the problem that I showed earlier, artemisinic acid by Jay Keasling, ten genes: what are the levels at which you need to express these genes to prevent toxicity, to improve productivity, etc.? A classical library problem. The library problem can be addressed in many different ways: promoters, ribosome binding sites; we've seen a couple of examples already earlier. The way we wanted to do it was ribosome binding site engineering, simply because it's easy and straightforward. However, doing this is a little tricky in terms of numbers. If you think about a classical ribosome binding site, let's say the Shine-Dalgarno site, six nucleotides: if you do a fully degenerate 6N library, that's 4,096 members. And if you have two genes and you want to explore the relative levels in this two-gene system, then you need to do on the order of 10 million experiments to match each potential ribosome binding site with every other potential ribosome binding site. That's obviously not feasible.
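The library sizes quoted here follow directly from counting sequences. A minimal Python sketch, with illustrative function names, that reproduces the 4,096 figure and the number of pairings for a two-gene system:

```python
from itertools import product

BASES = "ACGT"


def library_size(n_positions: int) -> int:
    """Number of members in a fully degenerate library of the given length."""
    return len(BASES) ** n_positions


def all_variants(n_positions: int):
    """Lazily enumerate every sequence of a fully degenerate library."""
    return ("".join(p) for p in product(BASES, repeat=n_positions))


single_gene = library_size(6)   # fully degenerate 6N site
two_genes = single_gene ** 2    # every site on gene 1 paired with every site on gene 2

print(single_gene)  # 4096
print(two_genes)    # 16777216
print(next(all_variants(6)))  # AAAAAA
```

The exact count for two genes is about 16.8 million pairings, consistent with the "order of 10 million" quoted in the talk.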
Looking at fully degenerate systems is not a good idea. Thankfully, people such as Howard Salis, at that time in the lab of Chris Voigt, have helped us out of this predicament by producing tools such as the ribosome binding site calculator. It's a ΔG model that takes into account the various energy-contributing elements of the binding process and the translation initiation process, and at the end provides you with a relative value that tells you how much better or worse this ribosome binding site is relative to a reference case. And this model, even if we take into account that this is a logarithmic scale here, does quite well, at least in the calibration scenario. It's not an exact model, but it allows you to separate the junk from the good ones and to classify the good ones at least roughly by order of magnitude. Nevertheless, you
need to use this computational data in a smart way in your library. Take the fully degenerate case first: here you have the relative translation initiation rates, and here you have the frequency with which a certain translation initiation rate level appears in the set of possible ribosome binding sites. You see that non-functional ribosome binding sites are most frequent in the set, and good ones are actually very, very rare. What we would like to have is not this; what we'd like to have is this. Here we have again translation initiation rate and frequency, and please note that the numbers are highly different: here we have six, there you have 1,400. This is a reflection of an ideal scenario where you would have, over the entire range, a few candidates per predicted translation initiation strength, so that you would need to check maybe, in this case, 36 variants, and you could be sure that the entire range of possible expression levels is covered. Why 36? Because this is a sequence in which some positions are covered by explicit denominations, so this has to be a G and this has to be a G, but some of them are partially degenerate. For example, rather than having an A, C, G or T, we have here a B,
which stands for either C, G or T, or a V, which stands for either A, C or G. So this is something that you can send to your GeneArt of choice, and they send you back this particular oligo, which is in fact a mix of a variety of oligos, here in this case 36 different ones. But this is something you can clone in one step, and if you then do the experiment to follow up, what you should get is not exactly this picture, because the ribosome binding site calculator gives you only a prediction, not an accurate number, but nevertheless you should get something very close to this. The computational
problem was then how to find the optimal prediction for such a partially degenerate sequence. The way we do it is that we send the ribosome binding site that we would like to optimize to the ribosome binding site calculator, we get back a full set of translation initiation values, and then we go through this data set and look at every possible combination of degenerate nucleotides, which is computationally a relatively expensive problem. Then we sort through all the possibilities in order to find the optimal partially degenerate oligo, the one that gives us something close to a linear, evenly distributed range of translation initiation rates. In terms of algorithm, this is the heart of the entire thing; I'm going to skip over it. The output is something like this: depending on how many variants you would actually like to test, say 12 or 24, this is what you get out, a specific suggestion for the best oligonucleotide, according to the tools that we have, that gives you this linear distribution of strength across the possible range. I should mention at this point that there are other people who are thinking about this problem, in particular also Howard Salis, who has found a different solution. In his case the distribution is not linear over the entire range but exponential, which we think is very good for some problems, but in our view not equally good for metabolic engineering problems. If you think about it, if you distribute exponentially, that would mean that if you have ten clones, nine of those would be within the first ten percent of the range, and that's not really helpful; in metabolic engineering you want to explore the entire range. So that's why we think linear is good, but nevertheless there are other ways to address this problem. Does our algorithm work? We did a first test: mCherry, GFP, a spacer between the two, and here two promoters, and then we varied these two ribosome
binding sites with what came out of our RedLibs (reduced libraries) algorithm. You can see immediately that if you put a fully degenerate ribosome binding site into these two positions, then you find either green or red but hardly any mixtures. If you look at the rationally reduced library, you find green, red and also mixtures of all different sorts. So this
actually looks quite nice. You can also follow this up a little bit more systematically. Here we have an increasing amount of information in the design of the ribosome binding site part; here we do an in silico analysis and here an in vivo analysis, meaning we picked colonies from the plate and checked for GFP and RFP values. If you do this in silico, you can see that the more information you put into the ribosome binding site, the more you reduce the degeneracy, the better coverage you get in terms of different expression levels, and we find the same for the experimental setup. You discover here a hole in the upper right corner; we tried to reproduce these clones and find out why we don't see them. The fact is simply that you cannot make the cell produce tons of GFP and RFP at the same time, so this is a physiological problem. We tried to explore this a little further: we wanted to make violacein. There are a couple of pharmacological properties associated with this compound, which make it a little bit in vogue at the moment, but to be frank, it is something that you can easily see, so it's a good model compound. The pathway goes from tryptophan along a couple of steps down to this intermediate here, PDVA, from which there is this one reaction that leads further to violacein, catalyzed by an enzyme called VioD. Unfortunately there is also a side reaction that leads to deoxyviolacein, catalyzed by VioC, and you cannot simply knock out this enzyme because you need it once more on the road to violacein, in the last step. So we have here a classical metabolic split situation that you have in metabolism all the time, from glycolysis to the TCA cycle, whatever, all sorts of bifurcations of metabolism, and here simply not a terribly important bifurcation, but one that we can manage very well in terms of analytics. So what we wanted to do was come up with a solution that allows us to make the most violacein. It was
clear that there would not be a hundred percent solution, because you always have VioC here leading to deoxyviolacein, but the question was what the optimal solution is that gives you the most violacein. The task was relatively simple: we wanted to modify the expression levels of VioE, VioC and VioD and then look for the best candidate. We produced reduced libraries for these three ribosome binding sites; this is something you can manage nicely in one PCR experiment and one cloning step, so that's actually quite convenient. We used degenerate libraries with a complexity of 24 each; 24 x 24 x 24 is about 13,800, too much, so we simply drew a sample, something on the order of [unclear] clones, of which 325 actually gave a product. As you can see here immediately, here is the violacein fraction, here is the deoxyviolacein fraction, and this line here in the middle is the parent, and you can see that we get a nice fluctuation in the level of violacein, which we find above and below the parent level. So this seemed to have actually worked. We looked at the top five here and the top five there and found that in the best case we could get up to eighty percent violacein. But the vexing question is of course: well, 325 out of 13,000, if you had gone on, what would have happened? Would you have found one that's even better? So you can either hire another PhD student and keep him busy until the end of his days, or you can take the ones that were good, sequence them and try to make sense of them, and that's what the PhD student,
Markus Jeschek, actually did. So he looked at the predicted strengths. I cannot say this often enough: we do not actually measure RNA levels, we work with the predicted strength, the computational value. Nevertheless, he looked at these values. What you see here, in a slightly odd display, is the predicted strength for the expression level of VioC, VioD and VioE, and if you forget about VioE for a moment, that's the one on the vertical, and look at VioC and VioD, then you most probably see that, in particular for these four examples here, the ratio between VioC and VioD seems to be relatively constant. So these are the good candidates, the five best ones, and even here, where it looks slightly different, it doesn't look outlandishly different. So it seems that there is a corridor of VioC versus VioD expression levels that leads to optimal results. Would it be an idea, then, to make a second library that explores this specific VioC to VioD ratio and, rather than going through the other 13,000 samples that we haven't checked so far, do a second library around what looks to be a good point, again sample only a few clones and then see what happens? That's what we did. We asked the system to produce ribosome binding sites for us that were high for VioD and low for VioC, with an overlap in between, cloned it, and what you see immediately is that if you look at 277 additional clones, the average production of deoxy-, sorry, of violacein is shifted upwards. That's also true if you look in more analytical detail: the best candidate that we found had something on the order of ninety percent violacein relative to deoxyviolacein, so another increase by about ten percentage points. So from our point of view, by this two-step method, by looking through only 500 or 600 clones, we could actually improve the product distribution of violacein versus deoxyviolacein by fifty percent,
from sixty to ninety percent. And 500 or 600 clones, that's something that you can even do in an industrial setting, where frequently the analysis is not easy and you have to do an HPLC or a mass spec or whatever; even under these circumstances 600 clones is doable. Five minutes? Okay, that leaves me with a problem, but nevertheless, okay.
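The counting behind this part of the talk (36 variants from one partially degenerate oligo, 24 x 24 x 24 combinations for three ribosome binding sites) can be checked with a few lines of Python. This is a minimal sketch and not the RedLibs implementation: the IUPAC codes are the standard ones, but the example 8-mer `GGBVRYGG` and the function names are hypothetical, chosen only so that the sequence expands to 36 variants.

```python
from itertools import product

# Standard IUPAC nucleotide codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(degenerate):
    """Return every explicit sequence encoded by a degenerate oligo."""
    return ["".join(p) for p in product(*(IUPAC[s] for s in degenerate))]


def complexity(degenerate):
    """Number of explicit sequences, computed without enumerating them."""
    n = 1
    for symbol in degenerate:
        n *= len(IUPAC[symbol])
    return n


# Hypothetical partially degenerate 8-mer (not a sequence from the talk):
# B and V allow three bases each, R and Y allow two each, so 3 * 3 * 2 * 2.
example = "GGBVRYGG"
print(len(expand(example)))        # 36 explicit oligos in one mixed oligo
print(complexity("N" * 8))         # 65536 for a fully degenerate 8-mer

# Three ribosome binding sites with 24 variants each, as in the talk.
combined = 24 ** 3
print(combined)                    # 13824
print(f"{325 / combined:.1%}")     # share of the library covered by 325 clones
```

The 325 analyzed clones therefore cover only a few percent of the combined library, which is why the follow-up step relies on the predicted strengths rather than on more sampling.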
Let me switch gears for the last five minutes. I talked about the design of systems. We're interested in making molecules, so we're also interested in the design of molecules, and I'll skip the slide because it's going to take too much time. Obviously the optimization problems are slightly different when you think about molecules than when you talk about systems, and we're not really very good force field modelers, so we would need to team up with you in order to get a better idea of what we do. But there are tricks that you can play, and one of the tricks that we had high hopes for was modularity of designs, in particular in the RNA field, where it has been entertained for quite some time by a number of people, among them Christina Smolke, that certain RNAs might be easy to engineer if you just know what to do. And in fact this is a problem for which you have a lot of examples in biotechnology, because, let's face it, many people in biotechnology do screening, and if they do screening, most of the products they are interested in are secreted, which makes it very messy to follow up. You need to extract with ether or whatever it is and then go to the HPLC; it takes time, et cetera. It would be great if we could change this, and our approach to this is using cells as sensors. In order to convince the cells that they are able to sense exactly the molecule you're interested in, you need to manipulate either proteins or RNAs. Our particular problem
was riboflavin. There's no protein for riboflavin, so we went for an RNA, but before I do this, let me briefly explain how we do these screenings. What you see here is a small alginate bead in which you have two microcolonies, in this case both E. coli, one with RFP and one with GFP. These beads are relatively small, 70 to 200 micrometers, which lets the volume fluctuate between picoliters and nanoliters, and we like to call them nanoliter reactors. Obviously, because they're very small, you can get 25 million of these in a 25 milliliter volume, so you can do lots of experiments. What we typically do is we
have these bacteria, and we have single bacteria in these beads, so these are monoclonal beads. Then we put them into medium and they grow into microcolonies, and then we can analyze them, for example by FACS. While they grow, we can also put them in oil so that there's no crosstalk between beads, and at the end you can simply take the beads, open them up, recover the cells and then do with them whatever you want. What you have here is beads that contain riboflavin and beads that do not contain riboflavin, mixed for a long time, in this case 14 hours I think, and you can still see that the populations are separated from each other. So we can prevent
crosstalk. Riboflavin production, that's what we were interested in, so our point was to find a high-throughput screening system in which one cell is able to sense increased levels of riboflavin.
That's the idea. The producer is in our case Bacillus subtilis; this was a project that we did together with DSM, and they are an industrial producer of riboflavin with Bacillus. So here is the Bacillus secreting riboflavin molecules, and here are the sensor cells, in this case E. coli, which hopefully should turn green, GFP expression, if a certain level of riboflavin is reached. If that level is not reached, the E. coli cells remain dark.
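As a quick check on the bead dimensions quoted earlier (70 to 200 micrometers), the volume of a sphere of that size can be computed directly. The sketch below assumes perfectly spherical beads and treats the quoted sizes as diameters; both are assumptions, and the function name is made up for illustration.

```python
import math


def sphere_volume_nl(diameter_um):
    """Volume in nanoliters of a sphere with the given diameter in micrometers."""
    radius_m = diameter_um * 1e-6 / 2
    volume_m3 = 4.0 / 3.0 * math.pi * radius_m ** 3
    # 1 cubic meter = 1e3 liters, 1 liter = 1e9 nanoliters
    return volume_m3 * 1e3 * 1e9


for diameter in (70, 200):
    print(f"{diameter} um: {sphere_volume_nl(diameter):.2f} nL")
```

Under these assumptions the smallest beads hold roughly 0.18 nanoliters (about 180 picoliters) and the largest about 4.2 nanoliters, consistent with the picoliter-to-nanoliter range mentioned in the talk.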
As I said, no protein sensor for riboflavin, so RNA. If you look at the literature and the databases, you find that, for example, E. coli has an RNA element to sense riboflavin, in fact not riboflavin but FMN, but that is only one step away from riboflavin, which was good enough for us. But it turns off: if you have too much riboflavin, then riboflavin binds to this RNA element and hides the ATG of the first gene in the synthesis pathway. If you are into high-throughput screening, then you know that turning a signal off in high-throughput screening is not good; it gives you tons of false positives and negatives, so it's not a good idea. You need to turn on a signal in response. So we had to flip the
sensitivity of this riboswitch, and that's where the modularity idea came in, because there is, for example, the hammerhead ribozyme from Schistosoma mansoni, which has been around in the literature for a long time. So this is not by any means our invention; we just took advantage of a lot of literature that is available in the field, and this suggested that if you start to mess with these two blue areas here and exchange this part, then you should be able to change the sensitivity of the resulting riboswitch from off to on. We tried this, and to our surprise we had to screen, we
had to look at, so you clone, you discard those that are constitutively on, then you go for those that get turned on in the presence of riboflavin, we had four candidates and all four were good. We couldn't believe our luck, but nevertheless the results didn't go away. It turns out that we had immediately found a riboswitch that responds to increasing concentrations of riboflavin in the medium, which was exactly what we were looking for. And this is how we do the assay: the Bacillus cell secretes the riboflavin, the concentrations outside and inside the sensor cell equilibrate because we have a transporter here, or facilitator I should say, riboflavin gets converted to FMN, FMN flips the switch, and if the switch is flipped you get GFP synthesis.
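The selection logic described here (discard clones that are constitutively on, keep those that are turned on in the presence of riboflavin) can be written as a simple classification of fluorescence readings. This is only an illustrative sketch: the thresholds, the clone names and the fluorescence values are hypothetical and are not data from the talk.

```python
def classify(gfp_without, gfp_with, on_level=1000.0, min_fold=3.0):
    """Classify a switch candidate from GFP readings without and with riboflavin.

    on_level and min_fold are hypothetical cut-offs chosen for illustration.
    """
    if gfp_without >= on_level:
        return "constitutively on"   # discarded in the first step
    if gfp_with >= on_level and gfp_with >= min_fold * gfp_without:
        return "on-switch"           # turned on by riboflavin: keep
    return "no response"


# Made-up readings in arbitrary fluorescence units, not measurements.
candidates = {
    "clone_a": (1500.0, 1600.0),
    "clone_b": (200.0, 2400.0),
    "clone_c": (150.0, 180.0),
}

for name, (without, with_riboflavin) in candidates.items():
    print(name, classify(without, with_riboflavin))
```

With these made-up numbers only `clone_b` would be kept, since it is dark without riboflavin and bright with it.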
And this is how it looks. We got a library of strains from DSM that we were to screen through for the best producers. Here you have the beads; it all looks very dirty. Why is that? Because this is oil, and these dirty things are water droplets that are also in the system. Nevertheless, what you see here are the beads, some of them dark, some of them green. After the analysis you can simply remove the oil, because the signal is frozen in the form of GFP, and then you get all these different beads here, some of them green, some of them not. Now you can run them through the FACS; it's a special FACS, it's called the COPAS, but that's only a minor detail. And then you get an enrichment, a
substantial enrichment for high riboflavin producers. This here is the follow-up experiment that was done by DSM, so not by us but by an orthogonal institution, if you want. Number of clones, riboflavin production: red is the library that we started with, so a strong bias towards low producers; green is the library that we delivered back, so a strong enrichment towards high producers. So in fact it actually worked. So if I go to
my summary: real design, yes, for in vitro systems. Information-supported construction: constructing cells as catalysts is an increasingly information-driven process, as I hope I could show you. And exploiting modularity in molecular design: well, there's an intelligent sentence here, but I have to admit that the last time I showed this data, Beatrix Suess and Alfonso Jaramillo were in the audience, and they came to me later and said, you were lucky. So I have to take their word for it, because they're experts and we are not. I can only say we did it once and it worked beautifully, but they told me we were lucky, so I don't want to push the idea of modularity in molecular design beyond its borders. But this is what we found. So let me briefly acknowledge a couple of people, in particular Matthias Bujara and Christoph Hold, who did the in vitro design work; Daniel Gerngross and Markus Jeschek, who worked on the RedLibs algorithm; and then a group of people who are involved in our screening efforts, in particular Steven Schmitt, Marcel Walser, Martin Held, René Pellaux, Andreas Meyer and Katja Becker. Let me also acknowledge funding, in particular from the European Union, and let me thank you for your attention.