Assessment of the toxicity of metal oxide nanoparticles by a multi-target perturbation theory machine learning approach
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 10 | |
Author | 0000-0003-3375-8670 (ORCID) | |
Contributors | 0000-0003-3375-8670 (ORCID) | |
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/60648 (DOI) | |
Publisher | 05jdrrw50 (ROR) | |
Release Date | ||
Language | ||
Producer | 05jdrrw50 (ROR) | |
Production Year | 2023 | |
Production Place | Frankfurt am Main |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
| |
Keywords |
Transcript: English(auto-generated)
00:05
Well, thank you. And I'd like to thank also the organizers for giving me the opportunity to present here my outlier way of assessing the toxicity of the metal oxide nanoparticles.
00:24
Well this is mostly what we already know. Nanotechnology is rapidly expanding. There is a large number of products containing nanomaterials, which are already on the market, like batteries, coatings, even clothing, cosmetics and so forth. And nanomaterials,
00:46
they do offer technical and commercial opportunities, but they also pose risks to the environment and raise health and safety concerns for humans and animals. That means that
01:02
we really have to take care about the toxicity of nanomaterials. I went to StatNano and realized that China is the one that has recently been publishing the most nanotechnology articles in ISI. Well, you can see that Germany is even lower
01:28
than Iran. Well, but mainly the general distribution is that Asia has
01:41
the most nanopublications, followed by Europe and then by the United States. But in terms of patents, what we see is completely the opposite. We see that the United States is the one that has the most patents, followed by South Korea and by China.
02:01
Well, so really, we all know after all these sessions that we really need measures to quickly and properly address the environmental fate and potential toxicity of nanomaterials and nanoparticles. And we also know that high-throughput screening and
02:22
high-content analysis are playing a key role in the efficient assessment, but they are unable to cope with all the vast chemical diversity and biological responses of the nanoparticles. So due to this, the high costs, times, and, well, and other
02:49
animals and things like that, what we really need are in silico models to serve as an alternative to ecotoxicity, cytotoxicity, and genotoxicity testing. So our proposed solution is to,
03:09
I borrowed this from a publication, I can't remember, I believe it was in Accounts of Chemical Research a couple of years ago, but this really shows what we really need. We need
03:22
a strong characterization, and then we need the testing, and we have to integrate here to prioritize, really, whether they are a hazard to the ecosystem, that is, ecotoxic, or a hazard to humans, in order to have sustainability of nanotechnology.
03:46
So, integrating in silico models somewhere around here for the analysis of nanomaterial hazard and risk. And until now, although during the last talks I realized that now it's not, it's
04:11
now the classical QSTR models were based on the idea that the toxicity of the nanoparticles correlates
04:20
with properties encoding particular structural features. And they tried to derive a model that relates the desired endpoint, and normally only one type of endpoint, with the nanoparticle descriptors. But the truth that we have been seeing during these days is that
04:44
classical models, well, going that way, do not take into account the role of other factors like the experimental conditions followed, the biological targets and so on. So they do have disadvantages. And you have seen
05:04
a way of introducing that, like for instance using mixtures of descriptors and so forth. And in our case, what we have used is the unified QSTR perturbation model for probing
05:21
the general toxicity of the nanoparticles. So we also try to integrate everything. So basically, instead of following the common practices, which are well described in this landmark paper of Muratov et al., how can we overcome the limitations of ignoring that
05:50
the endpoint response values depend on the type of experimental procedures, or even on the theoretical calculations employed, or even on whether they follow the same protocol but in different
06:01
conditions. So our answer, a long time ago, before nano-QSARs appeared, really, was that we could apply perturbation theory, based on the Box-Jenkins moving
06:21
average approach. So first, I will try to explain to you what the Box-Jenkins moving average approach is. We borrowed that idea from time series analysis. But instead of computing the average over time, what we do is compute the average over the experimental conditions.
06:45
And like that, we kind of try to track the trends under the specific experimental condition. And so we also define new types of descriptors which are not
07:03
mixed descriptors but deviation descriptors, which are computed by subtracting from the original structural descriptors the average, which is based on the experimental conditions. And this average is the average of the
07:24
descriptor values for all nanoparticles that were tested under a particular element of the experimental condition. And this means that to define the particular experimental condition you have to resort, really, to the concept of an
07:44
ontology. So the experimental condition is expressed as an ontology of this form, that is, by organizing and grouping the several elements xi that define the specific experimental condition and also tracking the relationships. And this can be, for example,
08:05
the type of toxicity measure. So, I mean, in this type of approach we also integrate ecotoxicity, cytotoxicity and whatever, different biological targets,
08:20
also algae, fish, cell lines, plants, possible nanoparticle shapes and other types of experimental conditions like the assay times and things like that. And apart from these original descriptors, which are only this difference, recently
08:45
other types of modified forms of the latter formula have been employed, basically corresponding to normalizations with respect to the variation in sample size under each condition or each discrete value. Basically, these ones,
09:05
which are divided by the maximum value of D_i minus the minimum value of D_i and so forth. This is the probability of getting positive points against the specific element of the experimental
09:22
condition. And this one is a probabilistic term that we apply that reflects the reliability of the experimental data that we rely on.
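The deviation descriptors described above can be illustrated with a minimal sketch. It assumes a flat list of records with one descriptor and one experimental-condition element; the field names (`molar_volume`, `assay`), the toy values, and the choice to take the range over the whole data set in the normalized variant are illustrative assumptions, not details given in the talk.

```python
from collections import defaultdict
from statistics import mean


def deviation_descriptors(records, descriptor, condition):
    """Subtract from each descriptor value the average over all records
    that share the same element of the experimental condition."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[condition]].append(rec[descriptor])
    averages = {level: mean(values) for level, values in groups.items()}
    return [rec[descriptor] - averages[rec[condition]] for rec in records]


def range_normalized(deviations, values):
    """Divide each deviation by (max - min) of the original descriptor.
    Taking the range over the whole data set is an assumption."""
    span = max(values) - min(values)
    return [d / span for d in deviations]


# Hypothetical toy data: names, values and assay labels are illustrative only.
records = [
    {"name": "A", "molar_volume": 10.0, "assay": "EC50"},
    {"name": "B", "molar_volume": 14.0, "assay": "EC50"},
    {"name": "C", "molar_volume": 20.0, "assay": "LC50"},
    {"name": "D", "molar_volume": 30.0, "assay": "LC50"},
]

deviations = deviation_descriptors(records, "molar_volume", "assay")
normalized = range_normalized(deviations, [r["molar_volume"] for r in records])
print(deviations)
print(normalized)
```

Each nanoparticle is compared only with the others measured under the same condition element, which is what lets one model cover several assays or targets at once.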
09:44
And nevertheless, I would like to stress that there is not enough evidence to prove that these modified descriptors, especially these latter ones, give rise to better statistical models than the original formula, as far as we know, because there is still a lot to do to
10:05
clarify whether they are better or not. Well, and now comes the perturbation theory. Why perturbation theory? Basically, this came from the following, because normally for a long time QSAR was based on linear machine learning,
10:27
classification or regression modeling, and there are several difficulties in interpreting the intercept of the line that you have, because the intercept is
10:41
related to a particular reference state. And what we tried to do was to apply perturbation theory to have more than one reference state. And the philosophy of PTML modeling is to combine the moving average approach with perturbation theory,
11:02
basically to handle deviations or perturbations pertaining to small variations of the different conditions. To do so, what we do is generate several nanoparticle-nanoparticle pairs,
11:21
and in each pair one of the nanoparticles is used as the reference state, whereas the other is used as the new state to be predicted. And notice that, in one case, this is the reference and this is the new one, and we can interchange their roles in another pair.
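The pairing step can be sketched as follows. This is only an illustration under assumptions: the function name, the labels `NP1` to `NP3`, and the optional random subsampling with a fixed seed are hypothetical. Ordered pairs are used so that the same two nanoparticles can appear with their roles interchanged, and a nanoparticle is never paired with itself.

```python
import random
from itertools import permutations


def generate_pairs(names, n_pairs=None, seed=0):
    """Return ordered (reference, new) pairs, excluding self-pairs.
    If n_pairs is given, draw that many pairs at random."""
    all_pairs = list(permutations(names, 2))
    if n_pairs is None or n_pairs >= len(all_pairs):
        return all_pairs
    return random.Random(seed).sample(all_pairs, n_pairs)


# Hypothetical nanoparticle labels.
names = ["NP1", "NP2", "NP3"]
pairs = generate_pairs(names)
subset = generate_pairs(names, n_pairs=4)
print(pairs)
```

For n nanoparticles this gives n(n - 1) ordered pairs, so a random subset keeps the number of cases manageable for larger data sets.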
11:45
We could even have the same nanoparticle used as reference and perturbation; of course we skip that. And now, by defining this, we are in a position to move forward and find possible deviations between the two states by simply calculating the difference
12:06
over the experimental conditions and also over the coatings of the new and the reference nanoparticles. And as such, after those nanoparticle-nanoparticle pairs are
12:24
generated, of which you could have at most n squared nanoparticle pairs, but normally you have fewer than that. And this is a shortcoming of
12:40
our perturbation theory model, really, because these are normally randomly generated. And then the perturbation model may be defined as this expression shows, that is, the toxicity of the new nanoparticle in the new state can be predicted
13:03
if the toxicity of the reference state is known, as well as these perturbation terms related to the elements of the ontology and to the different coatings. And of course, this function can be established using well-known machine learning techniques like
13:24
linear classification techniques, linear discriminant analysis, multilinear regression, random forests, gradient boosting, well, artificial neural networks and so forth. I'm going to show two case studies that we have done in our group,
13:48
one with artificial neural networks and another one with linear discriminant analysis. And normally these types of QSTR perturbation models are applied to classification problems,
14:01
really. So, in our first case study, our aim was to jointly probe the ecotoxicity and the cytotoxicity of nanoparticles under different experimental conditions, which include, as you can see, a lot of things: different types of toxicity measures,
14:25
cytotoxicity and ecotoxicity, around 53 biological targets, you can have algae, bacteria, fungi, and so on, one out of 11 possible nanoparticle shapes,
14:42
spherical, irregular, polyhedral, and so forth, one out of 8 possible conditions that were used for measuring the nanoparticle size, also different study times employed, ranging from 0.5 to 360 hours, and also different coating species.
15:08
And this was done by collecting, well, at that time there were not any web servers with data already, so we had to collect it from the literature, and we collected around 260
15:26
metal oxides, and we mixed metal oxides with solely metal and silica nanoparticles as well, for building the database. And we used, as I told you before, a classification
15:44
tool based on artificial neural networks. So, we have a binary categorical variable, which means that we had to choose thresholds for determining whether a nanoparticle under one specific experimental condition was toxic or not. So, we had two types of values,
16:07
minus one and plus one, non-toxic and toxic, and we had several conditions which were based on carefully analyzing the gathered literature data
16:21
to establish these values. And as structural descriptors, we use simple ones, periodic table-based descriptors, because these are easily interpretable, this is also the truth, and they can be easily obtained, most of them: molar volume,
16:45
electronegativity; you can access public chemical sources and define the descriptors for each nanoparticle. And also the size of the nanoparticles, which was taken from the experiments. And about the coating species descriptors, we decided to use
17:11
the spectral moments of the bond adjacency matrix, defined several years ago by Ernesto Estrada, I don't know if you know him, but these are, well, these are quite interesting.
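As a minimal sketch of what a spectral moment of the bond adjacency matrix is: the k-th moment is the trace of the k-th power of the matrix E whose entry (i, j) is 1 when bonds i and j share an atom. The version below is unweighted (zero diagonal) and works on a hydrogen-suppressed skeleton given as a list of atom-index pairs; any bond weighting used for the coating descriptors in the talk is not covered, and the function names are illustrative.

```python
def bond_adjacency(bonds):
    """Build the bond (edge) adjacency matrix: two distinct bonds are
    adjacent when they share an atom."""
    n = len(bonds)
    return [
        [1 if i != j and set(bonds[i]) & set(bonds[j]) else 0 for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def spectral_moments(bonds, max_order):
    """Return [tr(E^0), tr(E^1), ..., tr(E^max_order)]."""
    e = bond_adjacency(bonds)
    n = len(e)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # E^0
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, e)
    return moments


# Carbon skeletons as lists of bonds between atom indices.
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]
print(spectral_moments(n_butane, 3))
print(spectral_moments(isobutane, 3))
```

The zeroth moment counts the bonds, and the second moment is twice the number of adjacent bond pairs, which is why the branched skeleton gives larger values than the linear one.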
17:27
Well, these are graph-based descriptors, and from my point of view, quite interesting, and have been shown to give very good results, even to things which are not only 2D,
17:45
well, published. And these were computed using MODESLAB, which is software developed by Estrada. And as such, we first produced the pairs, and we ended up with around
18:15
54,000 cases, which were then randomly split into training and test sets with
18:24
this ratio, 75 to 25. And then we tried different artificial neural network architectures and topologies to find the best one, ranging from linear neural networks to
18:40
radial basis function, multilayer perceptron and probabilistic neural networks. And the best model that we found was a multilayer perceptron, which was found to display a very good predictive performance and to classify correctly, well,
19:03
with an overall accuracy of around 98% in both training and test sets. And we also tried to interpret the results by performing a sensitivity
19:21
analysis of the descriptors used within the artificial neural network, which is also possible. And as such, we could at least, well, end up with some conclusions about the toxicity of the nanoparticles, which is, we found generally that
19:47
increasing the size generally diminished the nanoparticles' toxicity. So we have the hypothesis that it's probably because they aggregate and then they
20:01
do not penetrate through the membranes. This is a hypothesis. We thought it was like that. Regarding the coatings, what we found was that a lower hydrophobicity plus a larger polar surface area in the coatings generally tends to diminish the nanoparticles' toxicity.
20:26
And regarding the experimental conditions, what we found was that the toxicity of the nanoparticles was clearly dependent on the concentration and also on the type of assay that was used, meaning that you really have to take into account the
20:47
type of measure of effect that you are using, really. Otherwise, your results will not make much sense. And the toxicity of the nanoparticles was also clearly time-dependent.
21:03
So the time of the assay is also an experimental condition to be taken into account. We also tried to look at how our model worked on virtual screening
21:23
of eco- or cytotoxicity of nanoparticles, meaning that this is a classification model. So its main aim is to prioritize, to do a first screening of the nanoparticles, really.
21:42
And what we found was, so we used other nanoparticles, an external set of nanoparticles that were not used to derive the model and that were published meanwhile. And, well, we classify them
22:07
as toxic or non-toxic according to the thresholds that we used to derive our model. And then we try to predict them with our model. And, as this is a perturbation theory
22:24
approach, what we did was use this as the new final state and all the nanoparticles that we included for deriving our model as the reference states; we predict this new one
22:43
and this new one and so on and so forth. And so this percentage out of 260 is the consensus prediction over the 260 times that we used the perturbation model.
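The consensus step can be sketched as below. The talk's first case study used an artificial neural network; here a simple linear score stands in for the fitted model purely for illustration, and the coefficients, the single `size` deviation descriptor, and the reference values are all hypothetical. For each reference nanoparticle, the perturbation term is the difference between the deviation descriptors of the new and the reference state, and the consensus is the percentage of references that lead to the correct class.

```python
def predict_new_state(ref_class, perturbation, coefficients, intercept=0.0):
    """Classify the new state (+1 toxic, -1 non-toxic) from the class of the
    reference state plus weighted perturbation terms. The linear form and
    the coefficients are assumptions, not the model from the talk."""
    score = intercept + coefficients["ref"] * ref_class
    score += sum(coefficients[name] * value for name, value in perturbation.items())
    return 1 if score > 0 else -1


def consensus_percentage(new_dev, references, coefficients, true_class):
    """Predict the new nanoparticle once per reference state and return the
    percentage of predictions that match its known class."""
    votes = []
    for ref in references:
        perturbation = {name: new_dev[name] - ref["dev"][name] for name in new_dev}
        votes.append(predict_new_state(ref["cls"], perturbation, coefficients))
    correct = sum(vote == true_class for vote in votes)
    return 100.0 * correct / len(votes)


# Hypothetical reference states with known class and deviation descriptors.
references = [
    {"cls": 1, "dev": {"size": 0.0}},
    {"cls": 1, "dev": {"size": 1.0}},
    {"cls": -1, "dev": {"size": 2.0}},
    {"cls": -1, "dev": {"size": -1.0}},
]
coefficients = {"ref": 1.0, "size": -0.5}  # illustrative values only
new_dev = {"size": -1.0}

percent = consensus_percentage(new_dev, references, coefficients, true_class=1)
print(percent)
```

With a real model, the loop would run over every nanoparticle used to derive it, which is where a count such as 260 predictions per new nanoparticle comes from.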
23:05
And as you can see, well, a random prediction would give values of 50%, but these are all over 50%,
23:22
you have from 70 to even 100% correct predictions. So it works quite well, really. Apart from the ecotoxicity or cytotoxicity of nanoparticles, we also tried to,
23:43
in a second case study, we tried to predict the genotoxicity of metal oxide, only metal oxide, nanoparticles. And there are also different experimental conditions. In this case, these include, well, these were based on this data set,
24:03
which we collected also from the literature, which has 78 nanoparticles evaluated by the in vitro comet assay. And so we had one out of 32
24:21
cell lines from humans, rodents, microbes and so forth, one out of four possible nanoparticle size classes that we defined: there were cases with less than 25, between 25 and 50, between 50 and 100, or not reported. And also we found that there are a lot of differences also
24:44
in the incubation conditions that they use. They may or may not use a repair enzyme, or there was a presence of a repair enzyme like this one. And also, as we found that the results were quite dependent on the way of reaching
25:07
to a conclusion that it was genotoxic, we decided to use a probabilistic factor when we considered that their conclusion was less reliable, if the lowest was too high.
25:27
And with that, if you remember well, we multiplied the deviations by 0.5, or otherwise, if reliable, we simply multiplied by one, we simply stayed with the
25:42
formula. And as a modelling technique in this case we use linear discriminant analysis, and so the response is also a binary categorical variable, and in this
26:01
case we looked at the experimental results and decided: if they said that they were non-genotoxic we put minus one, or if genotoxic, we put plus one. And we also use periodic table-based descriptors. In this case we use molar volume,
26:23
atomic radius, electronegativity and polarizability, and also some other types of quantum mechanical descriptors, computed with a very fast method, a semi-empirical one, PM7, for which we use MOPAC. And so we use heat of formation, total energy, electronic energy,
26:47
HOMO and LUMO energies, dipole moment and also, well, volume. And apart from that we also use 0D and 1D constitutional descriptors using DRAGON. So, as in the former work, we also
27:09
randomly generate nanoparticle-nanoparticle pairs. In this case we produce around 60,000 nanoparticle-nanoparticle pairs, which were also randomly split into training and test sets in this
27:23
ratio. And the best linear discriminant analysis model found was a 10-variable equation, which also displayed a high predictive performance and classified correctly, well, it has an overall accuracy of 97% in both training and test sets.
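The reliability weighting described for this second case study (deviations multiplied by 0.5 when the experimental conclusion was judged less reliable, and by one otherwise) can be sketched as follows. The function names and the boolean flag are illustrative assumptions; how reliability was actually judged for each data point is not specified here.

```python
def reliability_weight(is_reliable):
    """Return the probabilistic factor: 1.0 for reliable data points,
    0.5 for those judged less reliable."""
    return 1.0 if is_reliable else 0.5


def weighted_deviations(deviations, reliable_flags):
    """Scale each deviation descriptor by the reliability of the
    experimental result it comes from."""
    return [reliability_weight(flag) * dev for dev, flag in zip(deviations, reliable_flags)]


# Hypothetical deviation values and reliability flags.
weighted = weighted_deviations([2.0, -4.0, 1.0], [True, False, True])
print(weighted)
```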
27:43
And in this case, as it's a linear model, what we did was to try to interpret the descriptors, which is much easier, by the relevance taken from the standardized coefficients. And well, we also reached some conclusions. Basically, we find that a higher number of
28:07
atoms appears to trigger a higher nanoparticle genotoxicity. We also find that the greater the graph density is, which means the lower
28:22
the molecular complexity, it also appears to induce higher genotoxicity. And also, we find that higher HOMO values, which imply lower polarizability, also appear to induce lower genotoxicity. And regarding the experimental conditions,
28:42
what we found was that the genotoxicity is also clearly dependent on the size of the nanoparticles, and also the outcomes of the assay depend on the incubation conditions. As in the former case study, we also tried to
29:00
use it and to see how it works in virtual screening, because the models, apart from giving reliable results, should be applicable in virtual screening. That should be their main aim, in my opinion.
29:21
And in this case, we even got percentages that were even better than with the former model. So we had, well, between 19 and 100 percent. So we could easily predict the
29:45
genotoxicity of these particles. Well, that's it. I would like to also point out in our future work, what we can do is try to integrate
30:05
even other toxicity data to gather further information to be used for an early recognition of toxicophores in nanomaterials, really. Keep on improving this kind of approach,
30:21
the perturbation theory machine learning approach. For instance, try to use a better way of choosing the nanoparticle-nanoparticle pairs and other things, for sure. Launching publicly available software: we already launched two, well, the first one was
30:44
done, well, three years ago by Pravin Ambure, who came from the lab of Professor Kunal Roy, by the way. And it was programmed in Java, as is usual in Professor
31:01
Kunal Roy's lab. And it's open source software to do classification based on the moving average approach. But it was only based on doing linear discriminant analysis, LDA, or random forest. But then we improved a lot by,
31:28
Amit Halder, who improved it a lot, and it is based on Python, because Python is much easier to program in than Java, I believe. Well, I have that impression, for me,
31:42
at least. And so it's also a Python open source toolkit. And in this case, we keep on improving it. Now it has deep neural networks. It also has, for instance, as we are dealing with the experimental conditions, we use the moving average approach to
32:05
integrate experimental conditions. We also changed the Y-randomization test, which is often done without taking into account the experimental conditions, and we changed it to include them there. And we are
32:27
including other features also. So, it's also an open source toolkit. Hopefully it can be put on the Greek database now. And apart from, well,
32:47
especially because, or launching a web server, well, to have reliable data models and so forth, it's very important really to compare the models and so on. And especially in our cases,
33:03
for more rapidly tackling virtual screening using these models, outweighing the difficulties of handling such complex data, which is characterized by different biological targets, different assay times, whatever. And last but not least, what we are trying
33:24
to establish more recently is an inverse in silico modeling approach to generate novel structures of nanomaterials having non-toxic and still useful properties. And for this we are trying, well, other types of approaches, like multi-objective optimization,
33:45
evolutionary algorithms, and even some fancy things which were never heard here, like ant colony optimization, you know, ant colonies, red horses, no, yes, gray wolves, red wolves optimizations, and so forth, to,
34:05
to, well, try to establish, because this is the, well, the, the, the thing that we would like to do it really. Well, and that is all. Thank you all for your attention. And apart from you,
34:25
of course, I would like to thank the funding, the National Foundation and the European Union, and also my, well, most of them are no longer in my group, I must confess: Alejandro Speck-Planche and Valeria Kleandrova are now in Ecuador,
34:44
Feng Luan went back to China, and Pravin Ambure is working in Spain, but Amit is still there, so we can still keep on doing nano-QSARs, well, so.