
“When a biologist met Python”


Formal Metadata

Title
“When a biologist met Python”
Subtitle
An adventure into the natural sciences using tools like Biopython, Bokeh, Networkx, Ecopy and more!
Title of Series
Number of Parts
118
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Biology and computing are closer than we usually think. For example, many algorithms are inspired by biological patterns and, conversely, researchers need special algorithms to better understand our environment. Thus, there is a strong relationship and dependency. In recent years, biology has been transformed into computational biology. Technological advances help us to predict physical interactions between atoms and DNA, because we are able to integrate information from biology into algorithms. Python has become a popular programming language in the biosciences because its clean syntax makes it easy to read. In addition, there are many modules (toolkits) covering different biological domains, such as metabolomics, structure analysis, phylogenomics, molecular biology and others. Python is currently improving researchers' workflows, helping us to focus on the theoretical or experimental part instead of fighting with old, buggy applications. This talk is aimed at all audiences (with or without a biological background), since we will go together on an amazing adventure into the natural sciences using tools like Biopython, Bokeh, Networkx, Ecopy and much more! Are you brave enough to follow me on this journey?
Transcript: English (auto-generated)
Thank you very much for coming. The first thing I would like to do is share my feelings, because this is my first tech conference and I am really scared. But at the same time, I'm really excited to share with you what happened when a biologist met Python.
This biologist is me. I did a PhD in molecular biology, and after a while I was deciding whether to stay in academia or go to industry. At that time I started to learn Python
because I knew that it was useful for science. At that point I started to discover some modules and packages that were kind of amazing, and I
started to have a kind of relationship with it. Wow, this is amazing. Why didn't I start using it before? Well, I decided to move to industry, where I'm working as a product data manager at an e-commerce platform that sells products for scientists, and
one of my roles was looking for new resources, in this case publications. It was in that moment that I discovered some amazing tools, and I thought: why can't I share this new discovery with the world?
And that is why I am here today. Well, this talk is not a conventional talk. We're going to have a biological adventure,
but with Python. That means that we are going to cover some different biological topics and how Python can play a role in them. As you will see, they are really different topics. My first story is about plants, because plants are amazing.
They are there and they cannot walk, they cannot speak, but even so they can communicate with each other. Imagine the situation. We are in our living room. We have a plant that is living there, really happy, and this plant
perceives, or sees, the light at two different wavelengths. And we think, oh, it's an amazing situation, maybe I can buy a new plant and put it there, because my plant is really happy there.
I can have a new one nearby. So you arrive home, you put the new plant down in your living room, and this second plant also sees the light at the same wavelengths. But this plant also reflects a certain wavelength.
That is to say, one plant can detect the other, and they start to be unhappy because they feel that they are in danger. They need to survive because they are in competition for light. Amazing topic. This was the topic of my thesis, and
imagine that you want to know more about the syndrome that this situation generates, which is called shade avoidance syndrome. This is where Python can help us. We can use this module called Biopython to look for more information. In this case, it's a simple example
where I am searching a database called PMC, as you can see in line 5, and I am also searching for terms. I chose this database because it contains open access publications.
But there are many other databases where you can look for many kinds of information related to different fields of science in general. In the second part of the script, we parse the information.
What we obtain in this case is the title and the URL. But again, we can also get pieces of text and other information. And then we can see that the output has two different publications in open access
journals. Well, sorry, I gave you a spoiler for my next story, but I wanted to say one last thing. Remember, if you have two plants really close together at home, give them some space. They will be happier.
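The search-then-parse script described above could look roughly like the sketch below. It is not the speaker's script: the function names `pmc_url` and `search_pmc` and the e-mail address are made up for illustration, and it assumes that each PMC summary record exposes `Id` and `Title` fields. The Biopython calls used are `Entrez.esearch`, `Entrez.esummary` and `Entrez.read`.

```python
def pmc_url(pmc_id):
    """Build the article URL for a numeric PMC identifier."""
    return f"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC{pmc_id}/"


def search_pmc(term, email, max_results=5):
    """Return (title, url) pairs for PMC articles matching `term`."""
    # Imported here so pmc_url can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks callers to identify themselves

    # Part 1: search the PMC database for the term and collect the IDs.
    handle = Entrez.esearch(db="pmc", term=term, retmax=max_results)
    id_list = Entrez.read(handle)["IdList"]
    handle.close()
    if not id_list:
        return []

    # Part 2: parse the summaries and keep only the title and the URL.
    handle = Entrez.esummary(db="pmc", id=",".join(id_list))
    summaries = Entrez.read(handle)
    handle.close()
    # Assumption: every summary record carries "Id" and "Title" fields.
    return [(doc["Title"], pmc_url(doc["Id"])) for doc in summaries]


# Example call (needs network access and your own e-mail address):
# for title, url in search_pmc("shade avoidance syndrome", "you@example.org"):
#     print(title, url)
```

Changing the `db` argument is how a different NCBI database would be searched.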
My next story is about avocado, because avocado is a fruit from other eras. I don't know if you know that, but this is what the primitive avocado looked like.
The seed was huge. And this was a real problem, because dispersing such a seed is really difficult. But thanks to this animal, well, not this one, its great-great-grandparent,
which was a giant sloth around 4 meters in size that ate avocados, we have avocados nowadays. But you may think, yes, but this is an animal of around 4 meters. Avocados alone were not enough for this animal. It needed to eat avocados
and other fruits, and for that it needed to move around a lot. And that was really useful for avocados, because the fruit was distributed widely and it could survive.
But at some point this animal disappeared, humanity appeared, and we also discovered that avocados were amazing. Nowadays avocados are a trendy food, and I am a bit worried about their prices, because with trendy food, sometimes people
increase the prices. So I analyzed the prices. I wanted to visualize the prices of avocados over a range of years with Bokeh. Bokeh is a module that allows
us to produce interactive plots in a really easy way. Of course this script is a bit summarized. I forgot to tell you that I shortened
the scripts a bit because they are too long and there are a lot of examples, but all the information is available on my GitHub, which I'm going to give you later. Don't worry if you see only some short pieces of the script. Well, as I was telling you, it's a really simple way to do it. We read the data,
and then we choose the characteristics and the style that we want to use in our plot, in this case, dots and lines.
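As a rough illustration of the read-then-style steps just described, here is a minimal sketch. It is not the script from the talk: the rows in `SAMPLE_ROWS` are invented placeholder values rather than real avocado prices, and the function names are hypothetical. Dots show individual prices and lines show the yearly average, with one color per avocado type.

```python
from collections import defaultdict

# Invented placeholder rows (year, type, price); not real market data.
SAMPLE_ROWS = [
    (2015, "organic", 1.5), (2015, "organic", 2.0),
    (2015, "conventional", 1.0), (2015, "conventional", 1.25),
    (2016, "organic", 1.75), (2016, "organic", 2.25),
    (2016, "conventional", 1.0), (2016, "conventional", 1.5),
]


def yearly_average(rows):
    """Map each avocado type to a list of (year, mean price) pairs."""
    prices = defaultdict(list)
    for year, kind, price in rows:
        prices[(kind, year)].append(price)
    averages = defaultdict(list)
    for (kind, year), values in sorted(prices.items()):
        averages[kind].append((year, sum(values) / len(values)))
    return dict(averages)


def plot_prices(rows, path="avocado_prices.html"):
    """Write an interactive plot: dots for prices, lines for averages."""
    # Imported here so yearly_average works without Bokeh installed.
    from bokeh.plotting import figure, output_file, save

    colors = {"organic": "blue", "conventional": "red"}
    plot = figure(title="Avocado prices",
                  x_axis_label="Year", y_axis_label="Price")
    for kind, points in yearly_average(rows).items():
        xs = [year for year, k, _ in rows if k == kind]
        ys = [price for _, k, price in rows if k == kind]
        plot.scatter(xs, ys, color=colors[kind], alpha=0.4, legend_label=kind)
        plot.line([year for year, _ in points], [mean for _, mean in points],
                  color=colors[kind], line_width=2)
    output_file(path)
    save(plot)
```

Calling `plot_prices(SAMPLE_ROWS)` would write an HTML file that can be opened in a browser and explored interactively.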
Well, let's give it a try. The dots are the distribution of the prices, the lines are the average prices, and as for the colors, blue is organic
and red is conventional. As you can see, the prices are different, but the trend over time is similar. Well, that was avocados. My next topic, or my next story, is about viruses.
Because viruses are amazing organisms, but do you know that even today scientists are not sure whether they are alive or not? It's kind of an amazing topic, but sometimes they
produce a lot of illness and a lot of problems. This is the case of the Zika virus. The information that we have nowadays is that it's transmitted by mosquitoes
of the genus Aedes. We don't have vaccines to prevent the illness, and as for the symptoms, if we are infected, we can have fever, rash and pain, and the most serious and dangerous one is microcephaly in newborns.
Here is the distribution of the virus nowadays, and the most important thing when we are talking about viruses or illness is how fast it spreads.
For that, we can use NetworkX. It's a module that helps us to generate and visualize a network. In this case, we are going to analyze cases of the illness in Brazil.
First, we generate a graph with different nodes that represent different cities of Brazil, and we are going to see how the spread of this virus evolved.
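A graph of this kind, with one node per city and the node size and color chosen by case-count range, might be built as in the sketch below. The case counts, range boundaries, colors and function names are all invented for illustration; they are not the data or code from the talk.

```python
# Invented case counts per city; placeholders, not real surveillance data.
CASES = {"Recife": 1200, "Salvador": 450, "Rio de Janeiro": 80, "Manaus": 5}

# Upper bound of each range, with the node size and color used for it.
RANGES = [(10, 100, "yellow"), (100, 300, "orange"),
          (1000, 600, "red"), (float("inf"), 1000, "darkred")]


def node_style(case_count):
    """Return (size, color) for the first range that contains the count."""
    for upper_bound, size, color in RANGES:
        if case_count < upper_bound:
            return size, color


def build_graph(cases, links):
    """Create a graph with one node per city, styled by case count."""
    # Imported here so node_style works without NetworkX installed.
    import networkx as nx

    graph = nx.Graph()
    for city, count in cases.items():
        size, color = node_style(count)
        graph.add_node(city, cases=count, size=size, color=color)
    graph.add_edges_from(links)
    return graph


# Drawing (needs matplotlib), with a hypothetical link between two cities:
# graph = build_graph(CASES, [("Recife", "Salvador")])
# nx.draw(graph, with_labels=True,
#         node_size=[graph.nodes[n]["size"] for n in graph],
#         node_color=[graph.nodes[n]["color"] for n in graph])
```

Rebuilding the graph for each time step with updated counts would give the effect of nodes growing and changing color as the number of cases moves from one range to the next.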
Let's see now. Well, these are the different cities. We have different colors that represent the number of cases. As you can see, it works in ranges: every time
a city goes from one range to another, the ball grows in size and also changes color. I was preparing the talk and thinking, wow, it's amazing how Python can help
biology, but I was also studying and using a bit of machine learning, and I thought, yes, but I can also see biology in Python. And I started to say, yes, I think that biology inspired computing,
and I want to share this point of view with you. One example, and there are many, is evolutionary algorithms. Specifically, this is one of the examples that I love, which is why I chose it: ant colony optimization.
It's based on ants. Ants can go from the nest to the food because they use pheromones, and they can communicate with one another using pheromones. And this is really useful,
because when they have some trouble on the way, a rock or something, they can say to each other, hey, this is the easy way to get there, or this is a shorter way to do it. This is a kind of optimization process,
and it's the basis of this algorithm. But it's not the only one, of course. We also have neural networks. Neural networks are based on our brain, specifically on neurons.
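The pheromone idea behind ant colony optimization, described a moment ago, can be sketched in a few lines. This is a deliberately tiny model with only two routes from nest to food, not a full implementation of the algorithm; the route names, the evaporation rate and the deposit rule (each ant leaves 1 / route length) are assumptions chosen for illustration.

```python
import random


def choose_route(pheromone, rng):
    """Pick a route with probability proportional to its pheromone level."""
    routes = list(pheromone)
    weights = [pheromone[route] for route in routes]
    return rng.choices(routes, weights=weights)[0]


def update_pheromone(pheromone, walked, lengths, evaporation=0.5):
    """Evaporate old pheromone, then let every ant deposit 1 / length."""
    updated = {route: level * (1 - evaporation)
               for route, level in pheromone.items()}
    for route in walked:
        updated[route] += 1 / lengths[route]
    return updated


def simulate(lengths, ants=20, rounds=30, seed=0):
    """Run the colony and return the final pheromone level of each route."""
    rng = random.Random(seed)
    pheromone = {route: 1.0 for route in lengths}
    for _ in range(rounds):
        walked = [choose_route(pheromone, rng) for _ in range(ants)]
        pheromone = update_pheromone(pheromone, walked, lengths)
    return pheromone


# Example: simulate({"short": 2, "long": 4})
```

Because a shorter route receives a larger deposit per ant, it tends to attract more ants in the next round, which is the feedback loop the ants use to share "this is a shorter way".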
Neurons communicate with each other through electrical impulses that go from one to another. Imagine that we have an input. You see something, you receive information, you have an input,
and then this information goes from one neuron to another. And then we can produce an output. In this case, you say, oh, or whatever. It's the same idea that is applied in artificial neural networks,
but we need to keep in mind that it's not exactly the same. But we are going to do an experiment. I think it's the moment to prepare yourselves. I'm going to show you a picture,
and I need you to count how many seconds you need to recognize the object in this picture. Are you ready? Yeah? Okay. Let's go. Three, two, one.
Did you need one second, maybe? Raise your hand if you needed one second. More? Less, less. Scientists have described that our brain needs 0.1 seconds
to recognize an object that you have seen before. That means that if you know this object, you need 0.1 seconds. It's really fast. Our brain is really efficient.
What our brain, or our body, is doing is analyzing this picture. We are analyzing shapes. We are analyzing colors. We are analyzing it in small parts. It's similar to what we do in machine learning.
And here, I'm going to talk a bit about the PyTorch module. I don't have a lot to add after yesterday's talk. I don't know if you were there, but it was kind of amazing. I think that guy explained everything about PyTorch.
But I would like to point out just two things that are important for me. In this case, we are going to load a data set of flowers
because we want to identify flowers. We are going to use a model called ResNet-50 that was pre-trained. This model has a specific characteristic. It's based on pyramidal cells.
That means that these cells do not work only layer by layer. These cells can send information from one layer to another one far away. And this is what this model also does.
In this case, we train, evaluate and test. And here, we have the results. We have an image classification with different plant species. This is amazing and works well, but it needs time and money.
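The "send information to a layer far away" behaviour mentioned above is what ResNet calls a skip (residual) connection: a block's input is added back to its output. The sketch below shows that idea with plain lists, followed by one common way to adapt a pre-trained ResNet-50 from torchvision to a new set of classes. The function names and the choice to freeze the pre-trained layers are assumptions for illustration, not the setup used in the talk.

```python
def residual_block(inputs, transform):
    """Apply `transform`, then add the untouched inputs back (the skip)."""
    return [t + x for t, x in zip(transform(inputs), inputs)]


def build_flower_classifier(num_classes):
    """Load a pre-trained ResNet-50 and replace its final layer."""
    # Imported here so residual_block works without PyTorch installed.
    import torch
    from torchvision import models

    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    for parameter in model.parameters():
        parameter.requires_grad = False  # keep the pre-trained features fixed
    # New output layer sized for the flower classes; only this part trains.
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    return model


# Example: doubling layer plus skip connection
# residual_block([1.0, 2.0], lambda xs: [2 * x for x in xs])
```

Reusing pre-trained weights and training only the last layer is one way to reduce the time and cost of training, though training, evaluation and testing are still needed.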
And sometimes, we don't have this time. This is the case for the next example. It's about snakes and patterns. Imagine that you have a friend who is on holiday.
Your friend sees two different snakes, takes two pictures and sends them to you, saying: hey, I know that you know biology and also Python. Can you help me? Am I in danger, or is it fine? In this situation, what can I do?
And of course, this person sends you a perfect pattern, clear, you know, like real life, real data. All amazing. You can see all the patterns. For that, we can use this module.
With this module, we can analyze the pixels of the images. Or, well, we have another option, because maybe you remember this poem that says:
Red touch yellow, kills a fellow. Red touch black, venom lack. But maybe you have a bad memory, like me, and you don't remember which one kills a fellow. Okay, no. Better to use a script in Python.
It's safer. In this case, we read an image. Then I generate a grayscale version, only to simplify, I take the middle line of pixels, and then I translate it to obtain the colors.
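The middle-line-of-pixels idea can be sketched as follows. This version differs from the script described in the talk: it skips the grayscale step and matches each pixel directly to the nearest of three reference colors. The reference RGB values and all function names are assumptions, and the rhyme is only a rule of thumb.

```python
# Assumed reference colors for the three bands (R, G, B).
REFERENCE = {"red": (200, 30, 30), "yellow": (230, 200, 40),
             "black": (20, 20, 20)}


def nearest_color(rgb):
    """Name the reference color closest to an (R, G, B) pixel."""
    return min(REFERENCE, key=lambda name: sum(
        (a - b) ** 2 for a, b in zip(rgb, REFERENCE[name])))


def band_sequence(pixels):
    """Collapse a row of pixels into the sequence of color bands."""
    bands = []
    for pixel in pixels:
        name = nearest_color(pixel)
        if not bands or bands[-1] != name:
            bands.append(name)
    return bands


def red_touches_yellow(bands):
    """True when a red band sits directly next to a yellow band."""
    return any({a, b} == {"red", "yellow"} for a, b in zip(bands, bands[1:]))


def middle_row(path):
    """Read an image and return the pixels of its middle horizontal line."""
    # Imported here so the functions above work without Pillow installed.
    from PIL import Image

    image = Image.open(path).convert("RGB")
    width, height = image.size
    return [image.getpixel((x, height // 2)) for x in range(width)]


# Example with a hypothetical file name:
# red_touches_yellow(band_sequence(middle_row("snake.jpg")))
```

A real photo would need more care (lighting, background, the snake not lying horizontally), which is why a clean pattern makes the example so much easier.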
And this is what we get. This is the image of what your friend saw in nature, and these are the patterns that you obtain. After that, you can say, hey, you are safe. Go to the right, because the one on the left is venomous.
As for the biological strategy of this snake: it is not a venomous snake. It's a snake that only imitates the colors to be safe from predators.
I'm sorry. Well, my last story is: what does happiness look like? Do you have an idea? Do you have an idea what happiness looks like? No? No? Okay. Well, first of all, a description of happiness:
the overall appreciation of one's life as a whole. Well, it's one definition of happiness, but after that, you say, I want to know which are the happiest countries in the world.
Based on this year's World Happiness Report, these are the top five countries. If you are from one of these countries, please share the secret with all of us, because we need to learn why, and how to get there.
Well, imagine that you want to visualize happiness. We can use this module, which is called RDKit. In a really easy way, we can use SMILES,
which is a text notation for chemical structures. We can transform these strings, and we can visualize happiness. Here, we have the four hormones in humans that are related to happiness.
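Turning SMILES strings into drawings with RDKit might look like the sketch below. The transcript does not name the four molecules shown on the slide, so dopamine and serotonin are used here purely as assumed examples, and the function names are hypothetical.

```python
# Assumed examples; the talk does not list which four molecules were drawn.
SMILES = {
    "dopamine": "NCCc1ccc(O)c(O)c1",
    "serotonin": "NCCc1c[nH]c2ccc(O)cc12",
}


def image_name(molecule):
    """File name used for the drawing of a molecule."""
    return f"{molecule}.png"


def draw_molecules(smiles_by_name):
    """Parse each SMILES string and save a 2D drawing as a PNG file."""
    # Imported here so the table above can be read without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import Draw

    for name, smiles in smiles_by_name.items():
        molecule = Chem.MolFromSmiles(smiles)
        if molecule is None:  # RDKit returns None for an invalid SMILES
            raise ValueError(f"could not parse SMILES for {name}")
        Draw.MolToFile(molecule, image_name(name))


# Example: draw_molecules(SMILES) writes dopamine.png and serotonin.png
```

Any other molecule can be added to the table as long as its SMILES string is known.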
That's all for now. We have a lot of modules and packages related to biology and science. I'll briefly say two words about EcoPy. EcoPy is a module used in ecology,
and with it we can measure diversity. My take-home messages are the following. Python helps scientists, in this specific case biologists, as you have seen.
But biology is also helping computing, or inspiring computing, and Python too. If we work together, if scientists start to collaborate more with tech people,
or if we normalize Python in science, we can increase the diversity in the community. This enriches Python and enriches all of us. Of course, if you have an idea for a module or package, please do it.
Create more tools, even if you think that it's not so important. Really, there are a lot of people who are using these tools anonymously. Well, this is what happened when a biologist met Python.
I will finish by saying only that all the information is available on my GitHub. Thank you very much, and I hope that you enjoyed the talk. Thank you.
You finished your PhD and then you started working with Python? What happened? When I was doing my PhD,
I realized that there were a lot of manual tasks that it makes no sense, in this day and age, to do by hand rather than automatically. When I tried to introduce some changes,
well, you know that changes need time, and academia is quite conservative. When I finished and I was on my own, I decided to explore all these interests that I had,
to do things more automatically and in a more effective way. Thank you.