
Representation is King: The Journey to Quality Dialog Embeddings


Formal Metadata

Title
Representation is King: The Journey to Quality Dialog Embeddings
Title of Series
Number of Parts
131
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In natural language processing, embeddings are crucial for understanding textual data. In this talk, we’ll explore sentence embeddings and their application in dialog systems. We'll focus on a use case involving the classification of dialogs. We'll demonstrate the necessity of sentence transformers for this problem, specifically utilizing one of the top-performing small-sized sentence transformers. We will show how to fine-tune this model with both labeled and unlabeled dialog data, using the SentenceTransformers Python framework. This talk is practical, packed with easy-to-follow examples, and aimed at building intuition around this topic. While some basic knowledge of Transformers would be beneficial, it is not required. Newcomers are also welcome.
Transcript: English (auto-generated)
Hello, everybody. So one more time, my name is Adam. I am a machine learning engineer at Salted CX, and today I will be taking you on a journey through quality dialogue embeddings.
On our journey, we'll be working out a use case. Sorry, I have to fix the mic. This is not comfortable for me. Okay, well, let's try it like that. So we'll be working out a use case together for a fictional airline company called Python Air. Python Air is really cool and customer-centric, so customers like George can reach out and get help over chat.
In this particular conversation, George introduces himself. He says he'd like to fly from Prague to London. He mentions the travel dates. And at the end of the conversation, the employee of Python Air, Ben, books a ticket for George. Ben can do four other actions apart from booking.
He can also cancel the flight, change the flight, say there is no such flight, or say there is no such action. Now, because Python Air is a data-driven company, they would like to track these actions automatically. Essentially, they would like to build a predictive model that can say which action Ben took, given the conversation.
Today we'll be dealing with a few-shot problem: Python Air provided just ten labeled dialogs for each action, and then they threw in 15,000 dialogs for which we don't know the actions. We'll make use of those as well. And we agreed that we'll be evaluating this predictive model with the F1 score on a holdout
evaluation set. Now, this problem can be translated into the following pipeline. In the beginning, you have the conversation. You pass it to the transformer and you get some representation of this conversation. Essentially, whenever you see square brackets with floating point numbers in my slides, it's a vector.
And these vectors are representations that point to the meaning of the conversation. I'll also be using the word embedding interchangeably; an embedding is also a representation, the same thing. Once we have these embeddings, we'll pass them to the logistic regression, and finally we'll get our predicted outcome.
Now, I'm deliberately choosing a simple classifier such as logistic regression here to illustrate that all the interesting things are happening within the transformer model, which will be providing the quality representation based on which we'll be doing the classification. So everything we'll be doing from now on is tweaking this transformer model to give us quality representations.
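As a minimal sketch of that downstream pipeline, something like the following would do; embed() stands for whichever transformer-based encoder we plug in later, and the variable names are placeholders rather than the actual code from the slides:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score

    # embed(dialog) -> fixed-size vector; the rest of the talk is about making this good
    X_train = np.array([embed(d) for d in labeled_dialogs])
    clf = LogisticRegression(max_iter=1000).fit(X_train, labeled_actions)

    X_eval = np.array([embed(d) for d in eval_dialogs])
    print("F1:", f1_score(eval_actions, clf.predict(X_eval), average="macro"))

The average="macro" choice is just one reasonable option for a multi-class F1 score; the talk does not specify which averaging was used.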
So what does this transformer model look like? Very briefly, I think you're familiar with it, but it's important to go through it. We'll pass a conversation to this transformer. Well, first, sorry, we'll concatenate the conversation, so it's just one plain string. Then we'll tokenize it, I think you're familiar with that, and we pass it through the transformer.
We'll extract token embeddings, so every single token will have its own representation. And finally, because we need just one vector, we need to do some, well, it's called pooling. We will just take the average of all these vectors within our conversation, and then this will be the vector, this 0.4 dot, dot, dot.
That's the vector with which we'll represent our conversation, and this is what we'll be passing further to the logistic regression. To start with today, we'll be using MPNet. I'm not sure if you're familiar with it; if not, it's just a regular transformer encoder. If you're interested in the details, have a look at the reference.
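To make the tokenize, encode and pool steps concrete, here is a rough manual version of them; it assumes the microsoft/mpnet-base checkpoint from Hugging Face, which may not be the exact checkpoint used on the slides:

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
    encoder = AutoModel.from_pretrained("microsoft/mpnet-base")

    conversation = "Hi, I'm George. Hello, how shall I help you? I want to fly to Prague."
    batch = tokenizer(conversation, return_tensors="pt", truncation=True)

    with torch.no_grad():
        token_embeddings = encoder(**batch).last_hidden_state  # shape (1, n_tokens, 768)

    # mean pooling: average the token vectors, ignoring padding via the attention mask
    mask = batch["attention_mask"].unsqueeze(-1).float()        # shape (1, n_tokens, 1)
    dialog_embedding = (token_embeddings * mask).sum(1) / mask.sum(1)  # shape (1, 768)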
Now, in code, I'll be showing these short snippets with the SentenceTransformers framework, by the way. Everything I'm showing is super simple to do, just a few lines of code. Essentially, we define a sentence transformer to consist of two modules: a Transformer module, where we essentially say we'd like MPNet, and a Pooling module, where we say that our dimensionality is 768.
And then we say that we'd like to do mean pooling, so we take the average of all the token embeddings in the conversation to represent it. Finally, on the last line, we call model.encode and pass in our conversation to get our embedding.
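Put together with the SentenceTransformers framework, the snippet looks roughly like this; the exact arguments on the slide may differ slightly:

    from sentence_transformers import SentenceTransformer, models

    # plain MPNet wrapped as a sentence transformer with mean pooling on top
    word_embedding_model = models.Transformer("microsoft/mpnet-base")
    pooling_model = models.Pooling(
        word_embedding_model.get_word_embedding_dimension(),  # 768
        pooling_mode_mean_tokens=True,
    )
    model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

    embedding = model.encode("I want to fly to Prague.")
    print(embedding.shape)  # (768,)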
Here I have an example with the sentence "I want to fly to Prague." If we call that, we get our embedding, our representation, which is a vector, as I said. And if you have a well-behaved space, you can expect similar things to be close together. So close to the sentence "I want to fly to Prague" there should be some other, synonymous sentence:
"I would like to travel to Prague by plane." These are synonymous. Cool. So we have our MPNet. Let's pass all the conversations through it, extract the embeddings, train the logistic regression, and run the evaluation. And we get an F1 score of 59%. On one hand, okay, not bad.
On the other hand, MPNet is a model with 110 million parameters, and this is really not a complicated problem. So, well, something is not working here. What is it? Well, the problem is that MPNet, and transformer encoders in general, are trained to represent these token embeddings, not the final one.
I'm mentioning here the language modeling objective with which the model was trained. You're probably familiar with BERT: BERT was trained with masked language modeling. MPNet was trained with a slightly smarter objective, but still, MPNet was trained to represent these tokens, not the final sentence,
in our case, dialogue embeddings. So to give you a little more intuition, let's work through these examples and compute some cosine similarities. We'll always be calculating cosine similarities with reference to the sentence "I want to fly to Prague." Let's first calculate it with the synonymous sentence.
We see it has a very high cosine similarity. One is the maximum, so it's nearly perfect. So far, so good. Let's take some random but very true sentence. By the way, I'm a fan: Sparta Prague is the best. It has a lower cosine similarity, as we would expect. Good. But if you take a sentence with the opposite meaning to our reference, "I don't want to fly to Prague,"
that has a higher cosine similarity than the synonymous one. So something is not working well. Take this with a grain of salt, it's just one example, but generally, transformer encoders such as MPNet are not suitable for representing sentences. So what is suitable for representing sentences? I don't think I'm going to surprise you when I say sentence transformers.
So what are those mysterious sentence transformers? Sentence transformers are, in fact, just fine-tuned regular transformer encoders that are pre-trained to provide you with the right sentence representation. They don't focus on tokens anymore. They really focus on the last layer, on the sentence embedding, on the embedding that
we are looking for. So let's use them. There are many different ways to train them. I think the most standard pattern is that you have to provide the sentence transformer with a pair or pairs of semantically related data. For instance, here we have a pair: the first one is "I want to fly to Prague," and "I would like to travel to Prague by plane."
And this green arrow represents the label. So we say these things are similar; pretty much, they're the same sentence. But we don't have to pass only similar or identical things. We can also train the transformer on related things, like question and answer pairs, such as: what is the
best club? Sparta Prague is the best, yeah? One is the answer to the other. They are semantically related, so their representations should be similar. Now, how to train a sentence transformer? Well, that really depends. Today we'll be using sentence MPNet; by the way, the official name is all-mpnet-base-v2, but let's just stick to the simple version, sentence MPNet.
Sentence MPNet is one of the best very small sentence transformers, just for the record. And how was it trained? A sentence transformer needs these pairs of semantically similar text. So let's take them and pass them through the original MPNet that we just used a moment ago, and then sentence MPNet was trained with the multiple negatives
symmetric ranking loss. It's quite a complicated name for a relatively simple thing, so what's happening? In order to train the sentence MPNet, we'll take a batch of data, so let's take these two examples that we have. Let's create the embeddings and build a similarity matrix.
So right now we are looking at the similarity matrix, and notice that on the diagonal, so from here and here, those are the pairs that we told the model should be the same, yeah? This green arrow, those are our labels. So the elements on the diagonal we'd like to maximize.
And on the other hand, the elements that are off the diagonal, they're called in-batch negatives, we'd like to minimize them. So for instance here, so the sentence, I want to fly to Prague, and Sparta Prague is the best, they should not be similar, so I would like to minimize these similarities. How to do that? Well, I can take a softmax for every single row, convert these similarities into probabilities,
so right now, for instance, I have a probability of 52% that "I want to fly to Prague" is most similar to "I would like to travel to Prague by plane." I'll take a cross-entropy with the labels being the diagonal, and that becomes my loss. And because this multiple negatives symmetric ranking loss is symmetric, I'll also take
the transpose of the matrix and compute the same thing, and together that becomes our loss.
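To make the mechanics concrete, here is a toy sketch of what that loss computes, with random unit-normalized vectors standing in for the batch embeddings; the real implementation also scales the similarities by a temperature factor before the softmax:

    import torch
    import torch.nn.functional as F

    # toy batch of two (anchor, positive) pairs, already encoded and normalized
    anchors = F.normalize(torch.randn(2, 768), dim=1)
    positives = F.normalize(torch.randn(2, 768), dim=1)

    sim = anchors @ positives.T          # 2x2 cosine-similarity matrix
    labels = torch.arange(sim.size(0))   # the diagonal holds the true pairs

    # softmax over each row plus cross-entropy pushes the diagonal up and the
    # in-batch negatives down; doing the same on the transpose makes it symmetric
    loss = (F.cross_entropy(sim, labels) + F.cross_entropy(sim.T, labels)) / 2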
We don't have to do any of this ourselves, though; someone has done it for us already. Sentence MPNet was pre-trained with one billion training pairs, a huge amount of data, so we can just use it. With sentence transformers, we just plug this model name into our code, and we are good to go. Now let's work out our example one more time. Suddenly you see that these similarities start to make sense. "I want to fly to Prague" and the synonymous sentence, "I would like to travel to Prague by plane," finally have the highest cosine similarity, and then comes the rest.
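If you want to reproduce that comparison, a minimal sketch looks roughly like this; the printed scores will not match the slides exactly:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-mpnet-base-v2")

    reference = "I want to fly to Prague."
    candidates = [
        "I would like to travel to Prague by plane.",  # synonymous
        "Sparta Prague is the best.",                  # unrelated but true
        "I don't want to fly to Prague.",              # opposite meaning
    ]

    ref_emb = model.encode(reference, convert_to_tensor=True)
    cand_embs = model.encode(candidates, convert_to_tensor=True)

    for sentence, score in zip(candidates, util.cos_sim(ref_emb, cand_embs)[0]):
        print(f"{float(score):.2f}  {sentence}")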
So again, this is just for intuition, but generally, sentence transformers such as sentence MPNet are suitable for representing sentences, short texts, and dialogues just like ours. Cool. So let's use the sentence MPNet, extract the embeddings, train the logistic regression, and run the evaluation. What do we get?
Nice. This is good. We got an improvement from 59 to 74 percent in terms of F1 score. This is a good start, but we'd like to do better. Let's train the sentence MPNet further. How can we do that? Well, we can follow the pattern that I just introduced with the multiple negatives symmetric ranking loss. So what do we need to train the model?
We need pairs of semantically similar text. Do we have that? If you look at the dialogue, can you think for yourself which parts of the dialogue are semantically related? How can we create these training pairs? From what parts of the dialogues? Do you see it? If you think about it, the consecutive turns should be related, because one is a reaction
to the other. It's like a question and answer pair. So "hey, I'm George" is related to "hello, how shall I help you," because one follows the other as part of the conversation. And then "hello, how shall I help you" should be related to "I'm going to fly to Prague," da-di-da-di-da, and that should be related to the next one, the next one, the next one. And if you train the model like this with all the conversations, then actually
what we are doing is defining our own similarity function. And I find that cool; I hope you do too. In terms of code, it's again just a few lines. You call model.fit and you provide the data. I'm not showing you how to prepare the data; I think you can handle it yourself.
You say which loss function you'd like to use, you define it, and you're good. By the way, a disclaimer: a month ago there was a new major release of Sentence Transformers, and this is now a deprecated way to train a model. It still works, but I didn't manage to run the new way on my laptop, so just for you to know. It still works.
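For orientation, here is a rough sketch of that older fit-style API, including one hypothetical way to build the consecutive-turn pairs; unlabeled_dialogs is a placeholder for the 15,000 unlabeled dialogs, each as a list of turn strings:

    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, InputExample, losses

    model = SentenceTransformer("all-mpnet-base-v2")

    # every pair of consecutive turns becomes a training pair
    pairs = [
        InputExample(texts=[turns[i], turns[i + 1]])
        for turns in unlabeled_dialogs
        for i in range(len(turns) - 1)
    ]

    loader = DataLoader(pairs, shuffle=True, batch_size=64)
    loss = losses.MultipleNegativesSymmetricRankingLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)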
Cool. So now we've trained our sentence MPNet with our custom data. Again, let's extract the representations, train the logistic regression, and run the evaluation. And what do we see? Voila, we get an F1 score of 92%. So from 74 to 92, and we didn't need a single example of labeled data.
Everything we needed was just these dialogues. That's it. No labeled data needed. Fantastic. But let's come back to our example. Well, if you look into it, we're sort of back where we started. Now the sentence "I want to fly to Prague" is most similar to the sentence with the opposite meaning,
"I don't want to fly to Prague." Well, is it really a problem? Or is it actually expected? If you think about it, these similarities kind of make sense, because when we were training our model, we said that the similar things are the consecutive turns. And it basically never happens in a conversation that George, as the customer,
would say "I want to fly to Prague," and Ben would reply, "yeah, I would like to travel to Prague by plane." This is not happening. Ben would say something like: when do you want to fly? How much do you want to pay? Where do you... et cetera. So it shouldn't really be surprising, and you should be aware of this. In our use case it works fine, but there could be other use cases where this
would not be desired. Cool. So let's get to my favorite part, and that's intuition. Right now I've added the percentage, or probability, with which the model thinks that this is a booking conversation, and this is a real number.
So why? Why 27%? What parts of the conversation are making our pipeline think that this is a booking conversation with 27%? Well, for that, we can use the SHAP library. What SHAP does, in case you're not familiar with it, is hide certain parts of
the input. So for instance, let's hide the word "booked," and then SHAP runs the full pipeline and observes the new probability without this word. And again, these are real numbers: if you hide the word "booked," suddenly our probability that this is a booking conversation drops to 26%, which means the word "booked" positively contributes towards the booking prediction.
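A hedged sketch of wiring SHAP to this pipeline could look like the following; model, clf, and conversation are placeholders for the fine-tuned encoder, the fitted logistic regression, and one concatenated dialog string, and the action names are illustrative:

    import shap

    actions = ["book", "cancel", "change", "no_flight", "no_reservation"]  # placeholder names

    def predict_actions(texts):
        # wrap the whole pipeline: embed the texts, return class probabilities
        embeddings = model.encode(list(texts))
        return clf.predict_proba(embeddings)

    masker = shap.maskers.Text(r"\W+")  # hide word-level chunks of the input
    explainer = shap.Explainer(predict_actions, masker, output_names=actions)
    shap_values = explainer([conversation])
    shap.plots.text(shap_values)        # red/blue word highlighting, as on the slide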
And we can actually let SHAP do its work and work it out for the whole conversation. This is how it looks. First, there are the words in red: those are the words based on which the model thinks that this is a booking conversation. Here we have the word "booked"; it's the brightest, so it has the largest contribution.
In my view, that's good: "ticket is booked," okay, that should contribute towards a booking conversation. But there are also a lot of irrelevant things, at least in my view, like 10-10, be, and, sure I, make, whatever. And then there are the words in blue, based on which the model thinks this is not
a booking conversation. Overall, the model in fact made the right prediction, so the booking action had the highest probability, but it was still very low, and you can imagine that if you made a small change in the text, the model could make a mistake; you could confuse
it and it could predict a different status, sorry, action. So why is this happening? Why is our pipeline mainly using these particular words and not others? Well, we've never told our model what a booking conversation looks like.
So how do we fix it? Well, the first idea you might have is end-to-end training: just train the sentence transformer and the logistic regression together, and you're good to go. Or maybe not, because you have just ten dialogues per action, so you're very likely to overfit; this might not be a good idea.
A better idea would be to go back to the realm of sentence transformers. So what do we need to train a sentence transformer? We need pairs of semantically related or similar text. Do we have that if we use our labeled data? In fact, we do.
If you think about it, you can just create pairs of different conversations that have the same action. You can take the first conversation with booking and the second one and say: these talk about the same thing, they should be similar. And then you can do the same thing across different actions: this is booking conversation number one, and this is a conversation about no flight, whatever,
and these should be dissimilar. Like that, we can create 225 similar pairs and up to 1,000 dissimilar pairs. Much better. Cool. Then we can use contrastive loss, which is a standard loss used in training sentence transformers. We can pass these two conversations into the model.
We can extract the embeddings for each conversation. Then we can calculate the cosine distance, so one minus the cosine similarity. And if you square this result, then essentially this becomes our loss, based on which we can tweak our sentence transformer to understand that these two conversations
are, or rather should be, similar. Cool. In terms of code, it's again one line of change: just swap in this loss, prepare the data, and you're good.
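As a sketch, assuming labeled is a dict mapping each action to its ten dialog strings (a placeholder name, not the speaker's actual variable), the pair construction and the swapped-in loss could look like this:

    from itertools import combinations
    from torch.utils.data import DataLoader
    from sentence_transformers import InputExample, losses

    examples = []
    for action, dialogs in labeled.items():
        # same action -> similar (label 1.0); 5 actions x C(10, 2) = 225 pairs
        examples += [InputExample(texts=[a, b], label=1.0) for a, b in combinations(dialogs, 2)]
    for (_, dialogs_a), (_, dialogs_b) in combinations(labeled.items(), 2):
        # different actions -> dissimilar (label 0.0); up to 1,000 pairs
        examples += [InputExample(texts=[a, b], label=0.0) for a in dialogs_a for b in dialogs_b]

    loader = DataLoader(examples, shuffle=True, batch_size=32)
    model.fit(train_objectives=[(loader, losses.ContrastiveLoss(model))], epochs=1)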
Then, once we have our model trained, we can compare the picture we had before, when the model was trained only on unlabeled data, with the new one. First, notice that our prediction probability rose from 27% to 77%. Our model is much more confident that this is a booking conversation. And secondly, notice that the model is now using only, in my view, relevant parts of the conversation. It's using "the ticket is booked."
That's fine. This thing here, I don't know, could be some pattern, random noise, a mistake, I don't know. But primarily, the picture on the right looks much better than the one on the left. Nice. So let's again do the same routine: plug it into our pipeline, extract the embeddings, train the logistic regression,
run the evaluation, and we see we got another 4% improvement. Our F1 score rose to 96%. So you might say, okay, 4% improvement, that's not much. But in fact, you got much more than that. Because if you look at the prediction probabilities of the previous model, so these are the prediction probabilities of the sentence MPNet that was trained on unlabeled data,
you can see that typically they're around 30%. So the model is really not sure about its own prediction. As I said, if you tweaked the dialogue, if the person said some random stuff, then the model could change its prediction. You don't want that.
You would really have unstable predictions. However, if you plot the same graph for the model that was also trained with labeled data, you see that the model is much more confident: the predictions are around 70% or 80%. And this is also reflected in a better separation of the space.
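A minimal sketch of producing such a two-dimensional view, assuming embeddings is the array returned by model.encode for all conversations and actions holds the matching labels:

    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    points = PCA(n_components=2).fit_transform(embeddings)   # (n_dialogs, 2)
    for action in sorted(set(actions)):
        idx = [i for i, a in enumerate(actions) if a == action]
        plt.scatter(points[idx, 0], points[idx, 1], label=action, s=10)
    plt.legend()
    plt.show()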
So now on these graphs, you are looking at embeddings projected with PCA into two dimensions. On the left, you see the sentence MPNet just trained on unlabeled data. On the right, you see the sentence MPNet trained on unlabeled and then labeled data. And you can see that the one on the right has much better separation across different
classes. Every single point is a conversation, and different colors represent different actions. This picture looks much better than the one here, where you can see how the classes overlap. And if you look closely, and by the way, this is generally good practice: look at your data, view your data. I'm not sure if you can see it, but here there are two points.
These are "no reservation" points in the middle of the "no flight" cluster. What's that? Why is that? This is a good question. Why did we get it? Why does our model think that? In fact, I deliberately left some incorrectly labeled data in the dataset. This was an open dataset that I took online.
Some labels are incorrect, and they are marked. I deliberately left them there, because this is what happens in practice: you cannot really trust your labels. If you remove this incorrectly labeled data just from the validation set, you can keep it in the training set, suddenly your F1 score rises to 99%. And that's something that would make Python Air happy.
Cool. So just one last slide from my side, the takeaway. If you're working on machine learning problems, and it doesn't have to be just this one, my takeaway is: try to come up with the right representation for your data. It will make your life easier. And remember, representation is king.
That's it for me. I'm happy to have a chat, as I finished again so much earlier than I expected. Classic. Anyway. Thank you so much. Thank you so much, Adam, for your presentation. Please, if you have a question, go to the first mic, please.
Hello. Hey. Test, test, test. Great talk. Thank you. With this, yeah, the last improvement just being like 4% in the F1 score, I thought
it might be related to the F1 score, because I noticed that what counts as a good F1 score depends on whether it's a balanced binary problem or not. So, yeah, I think I changed to using the Matthews correlation coefficient, if you know that. It's more normalized. And maybe if you just use that, it shows better how big an improvement this last step is.
Okay. Thanks for that. I don't know it. Maybe I can also invite you to look into the GitHub; there's code. You can look at the precision and recall there, so maybe then we can see what's going on. But I will find you after the lecture and one more time make a note of this measure.
Thanks. Yeah, really, really great talk. Thank you. Thank you. Sorry. I don't know how this works. So thanks for the great talk. I was just curious. So let's say we take an open model, right? It was trained on Wikipedia articles.
And that's very well-written text, no typos, nothing. And now we want to use it for dialogue, for text from people who maybe have typos, who maybe are not really native speakers. And the problem we are facing is that even the tokenizer may not be very adequate for the task. And now we face the issue of retraining the tokenizer, which means that we'll maybe have to retrain the whole model from scratch.
So how would you address this problem? Well, I think the best thing would be to find the guy who spoke about exactly this problem yesterday. There was a talk in the other hall, I forgot the name, and he spoke about exactly the same thing. So I would refer you to that. I have to say that this dataset was AirDialogue.
Maybe you are familiar with it. It's an open-source dialogue dataset. There are no typos, no problems. In our company, we do these things in practice, and yeah, the conversations look completely different. They're messy. It almost never happens that you have this neat structure where the customer says something, then the agent says something.
It's like the customer says five things, and the agent then slowly starts replying while more is coming in; you have hyperlinks, images. This is much more complicated in practice. So I would maybe invite you to ask the other speaker. I think that would be better than me answering this. Okay. Thank you. Hi.
Very cool talk. I was wondering, do you have any idea how the batch size affects the training loss and, consequently, the convergence and performance? Good question. So these are the AirDialogue dialogues. They're very short dialogues, just a few turns.
I don't have it in my head how long exactly, but generally short. In practice, we work with much longer text, and again, this becomes much more problematic. So my suggestion for that would be to use more advanced sentence transformers, just larger sentence transformers that can, first of all, take more context, but in general, they're just smarter.
I forgot the name; I can look it up for you. As you saw, I worked out these graphs: one was with the sentence transformer, and for one I just used a different model. For this problem it doesn't matter, it's not a complicated problem, but if you were using practical data, then that could
be the way to tackle it: just use a better sentence transformer. Okay. Thanks. Hi. Thank you for the talk. So you've shown this sort of detour where you went to sentence transformers and then
fine-tuned them, and you basically went back to having similar cosine similarities as before. So I wonder what happens if you leave out the sentence transformer and just fine-tune the token-based model directly. Did you try that out? Do you get the same quality of results in the end? How would you fine-tune it?
How would I? Well, couldn't you do the same procedure, right? You take your chats again, like you've done with the sentence model, and then train it, but use the token-based model as the basis for your fine-tuning, I guess. Of course. Yeah, sure. Like, nothing stops you from doing it. It's just this, if you go back, like, here, like, maybe here, just, yeah, I don't
know where it was, or, yeah, whatever. Let me find it. Of course you can; you're just picking a better starting point. My question is, have you tried it out? I'm wondering whether the sentence transformer actually improves things, or whether you end up more or less in the same place.
I think, so, note that the sentence transformer, this sentence MPNet, was trained on one billion training pairs. We had 15,000 conversations, from which we created 180,000 pairs. So I haven't tried it out, but I'm quite confident that you would not get the same results, because you would just need much more data. Thank you.
Thank you so much for your questions. Thank you, Adam, for your presentation. Thank you. Thanks.