
Your Name Is Invalid!

Formal Metadata

Title
Your Name Is Invalid!
Title of Series
Number of Parts
275
Author
Contributors
License
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Names of people cannot be invalid. People have names. Most people do. People have first names and last names. Many people do. People have any sorts of names that often don’t fit fixed fields in the forms. These names may contain letters, accented letters, and other characters, that may cause problems to your code depending on the encoding you use. They may look differently in uppercase and lowercase, or may not be case foldable at all. Searching and sorting these names may be tricky too. And if you design an application, web form, and/or database dealing with personal names, you’ll have to take that into account.
Transcript: English(auto-generated)
Welcome to the RC3.
Hello and welcome to the Franconian stage. Sorry for the delay. We had some technical problems. We have to get used to it. I want to present you an interesting talk from Miro.
He will talk about names. If you think about names, names can have a first name, last name, middle name, prefix, suffix, but oops, think about encoding. That could be difficult. Miro will tell us about some of the difficulties he has experienced.
So welcome, Miro. I'm glad to hear your talk. Thank you very much. Hello everyone. My name is Miroslav Šedivý.
No, no, no, that's wrong. No, that's still wrong. Some character is still wrong. No, no, that's still the wrong encoding. No, no, no, no, no. Ah yes, now it works. So my name is Miroslav Šedivý. And what I hear quite often is that my name is unfortunately invalid.
That cannot be true. Miroslav Šedivý. This is how you write it. This is how you pronounce it, in IPA. And if you have the standard keyboard layout and you use the compose key, you can type it as well. What I'm going to speak about today is names in programming.
Python is just an example programming language. If you are using a different one, it very probably applies to the same extent or a little bit differently. I would be interested to hear what your experiences are. I'm going to speak about strings and bytes, about encoding, about normalizing, case folding, sorting, regular expressions.
And about names on the web, in web forms, in databases. How they are formed, how they consist of different parts: prefix, first, middle, last, suffix names. And about the allowed characters. So, how you should program your app so that it works with the names of people correctly and without problems.
In Python 3, now we have strings and bytes that are two different objects or types. Strings consist of characters. So, these are actually the characters that are used to write in some language.
And we have more than one million different code points that are in Unicode. And these strings are only in memory, in the working memory. So, if you read something from a file or over network, it will come as bytes. Because byte is the standard old 256 different possibilities of 8 bits that you have in a byte.
But as soon as you read it in Python, you convert it to a string. Because as we are going to see, one character can consist of several bytes. Because there are many more than 256 different possible characters.
Then in the memory, you work with those strings. And at the end, you convert it, again, into bytes when you want to save it to a disk file or you want to send it over network. So, if your name is Chuck Norris, you don't need any encodings. But now, of course, the string Chuck Norris will be encoded into bytes using the .encode() method.
And the other way around, your bytes, you can decode them back to Chuck Norris. Both look the same. In Python 3, the default encoding is UTF-8. And Chuck Norris looks the same in UTF-8 or as the original characters.
So, there is no change. You see that Chuck Norris is 12 characters or 12 bytes when it is encoded as UTF-8. If you are from Germany and your last name is Müller, then it will be a little bit different. Because in UTF-8, the u with diaeresis, so the umlaut ü, doesn't fit into one single byte.
And you need two bytes. So, there are two bytes that will encode your ü. If your last name is Chinese... of course, this one is not a last name. It means nǐ hǎo, so hello.
The two Chinese characters are encoded as six bytes in UTF-8. So, it works one way and the other way around. And if you know that your bytes are in UTF-8, you can read and decode them and you will get your original Chinese characters. But there are also other encodings apart from UTF-8.
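The round trip between characters and bytes described above can be sketched as follows. This is a minimal illustration, not the speaker's slide code; the helper name `utf8_size` is made up for this example.

```python
def utf8_size(text):
    """Return (number of characters, number of bytes when encoded as UTF-8)."""
    return len(text), len(text.encode("utf-8"))


for name in ["Chuck Norris", "Müller", "你好"]:
    encoded = name.encode("utf-8")       # str -> bytes
    decoded = encoded.decode("utf-8")    # bytes -> str, same text again
    assert decoded == name
    characters, size = utf8_size(name)
    print(f"{name!r}: {characters} characters, {size} bytes, {encoded!r}")
```

ASCII-only text takes one byte per character, the ü takes two bytes, and each of the two Chinese characters takes three.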
UTF-8 is great because it works for Unicode and for most characters that we need. But earlier, earlier, earlier, there was ASCII. And ASCII knows only seven bits and only 128 encoding possibilities. So, there is a limited number of characters that you can encode directly.
In the case of Chuck Norris, it works. But if your name is Müller, then you will need something like the Latin-1 encoding. Latin-1, or ISO 8859-1, is an encoding that works very well for some Western European languages. Like German, French, Spanish, Italian, and other languages.
And it knows or it has the information that these several characters can be encoded into the respective bytes. But the U with umlaut has a place in this Latin-1 encoding table.
But many other characters don't, because you have quite a limited number of possible characters that can be encoded. My last name, Šedivý, which means gray-haired in Czech and Slovak (it comes from Czechoslovakia), cannot be encoded in Latin-1.
In languages like Czech, Slovak, Polish, Hungarian, and other Central European languages, we have used Latin-2, which has some other characters in place of characters that are encoded in Latin-1. So the Š, the S with caron, has a place there, but it cannot be encoded in Latin-1.
You need to encode it in Latin-2. And then, living in Germany, sometimes I get packages by mail from some German companies that do such stuff. And you see that I think they have some problem with my last name, because on the web form my last name is written correctly, Šedivý.
But on the sticker, on the package, the Š is replaced by a question mark. And now, I just wanted to know, why is it like that? Because Š and question mark, they look quite similar, but that's not a problem. How is it possible that they encoded my Š as a question mark?
In Python, if I encode my last name Šedivý into Latin-2, it will work. But if I tell it to encode it as Latin-1, then of course it will not find the S with caron, the Š character, and it will raise an exception.
A UnicodeEncodeError: I don't know the Š, because it is not contained within the Latin-1 character set. Okay, but I receive a package. I don't receive an exception. So probably there is some way they fix it so that it works. And I get my package, although with a wrongly printed name.
And yes, there is the errors parameter in Python that you can use when you encode to Latin-1, so that if there is an error, you replace the character. The default for errors is strict, which raises an exception, as in the first case. But if I tell it to replace in case of an error, the character will be replaced by a question mark.
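The three cases just described (Latin-2 works, Latin-1 raises, Latin-1 with replacement gives a question mark) might look like this. It is a sketch using only the built-in `str.encode`; the variable names are illustrative.

```python
last_name = "Šedivý"

# Latin-2 (ISO 8859-2) has a slot for Š, so this succeeds.
latin2 = last_name.encode("iso-8859-2")

# Latin-1 has no Š; with the default errors="strict" this raises.
try:
    last_name.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    print("Latin-1 failed:", exc.reason)

# errors="replace" swaps every unencodable character for "?".
replaced = last_name.encode("latin-1", errors="replace")
print(replaced.decode("latin-1"))
```

Note that only the Š is lost: the ý does exist in Latin-1, so it survives the conversion.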
Why they chose question mark? No idea. It is not configurable. But there is a small hack that gives you the possibility to replace your missing character by something else. But this is probably how that big company encoded my last name and converted it to Latin-1.
And that's where the question mark comes from. And you can write a short one-liner in Python, for example, like this, where I say if there is an error, use the replaceRandomly function.
And in that case, it will just put a random number. You can write something else, some funny character that will be printed there instead of a missing character. And in this case, I got a 5 instead of S. Well, a 5 looks more like Š than a question mark looks like Š. Well, that's fine. And there are some other big companies in Germany.
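The hack mentioned above can be reproduced with `codecs.register_error`, which lets you register your own error handler under a name of your choice. The sketch below is an assumption about what such a handler could look like, not the code from the slides; the handler name `replace_randomly` is chosen here for illustration.

```python
import codecs
import random


def replace_randomly(error):
    """Error handler: put a random digit in place of each unencodable character."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    unencodable = error.object[error.start:error.end]
    replacement = "".join(random.choice("0123456789") for _ in unencodable)
    # Return the replacement text and the position where encoding resumes.
    return replacement, error.end


codecs.register_error("replace_randomly", replace_randomly)

encoded = "Šedivý".encode("latin-1", errors="replace_randomly")
print(encoded.decode("latin-1"))  # a random digit followed by "edivý"
```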
One, I will not tell the name, but they have beautiful big trains all around the country. On the mail that I get, on the customer card, on the online tickets, they manage to use always a different encoding and always to write my last name differently.
There is another big company that has big airplanes. When I wanted to buy a ticket and I write my last name, they tell me, you can only enter letters in the adult's last name field. My last name consists of letters.
What is a letter? In Python 3, you can call a variable, you can name a variable using any character that looks like a letter. So, for example, I could do something like that. But I cannot name a variable, for example, a question mark or a smiley.
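A quick way to check which strings Python accepts as names is `str.isidentifier()`, shown here next to `str.isalpha()`. This is a small sketch; the sample strings are chosen for illustration.

```python
# A variable name made of non-ASCII letters is valid Python 3 source.
šedivý = "gray-haired"

candidates = ["šedivý", "Š", "?", "😀", "→"]
results = {text: (text.isalpha(), text.isidentifier()) for text in candidates}

for text, (is_alpha, is_identifier) in results.items():
    print(f"{text!r}: isalpha={is_alpha}, isidentifier={is_identifier}")
```

Letters pass both checks; the question mark, the smiley, and the arrow pass neither.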
How does Python know that Š is a letter, and that a question mark or a smiley or some arrow is not a letter? If you import the standard unicodedata library, there are some functions that will give you the possibility to investigate,
to inspect what the characters look like. So, now I have a few characters: ä, the sharp s in lowercase and uppercase (yes, there is an uppercase sharp s), the dot and the question mark, and a smiley. And then I ask for the category and for the name. And what you see in the first column is the character.
Then you see the two-letter codes: Ll, Lu, Zs, Po, So. And this is the category of the character. If it starts with an L, it means it is a letter. And then if there is a u or l, it means it is an uppercase letter or a lowercase letter. Zs, Po, So are other categories (separators, punctuation, symbols), and there are more for numbers, digits, and so on.
And then you see the name. So, for example, Latin small letter a. This is the list of all categories. And actually every character in the Unicode table has a category, belongs to some category. And then there is a possibility to access the information behind that.
So, if it looks like a letter, I can use it as a letter. If it looks like a number, I can probably get the decimal value of this number or of this character, even if the character doesn't look like a number, if it is a Roman numeral or some other alphabetical numeral.
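A tiny inspection script in the spirit of the one described above could look like this. It is a sketch built on the standard `unicodedata` module; the helper name `describe` and the sample characters are assumptions.

```python
import unicodedata


def describe(char):
    """Return (general category, official Unicode name) of one character."""
    return unicodedata.category(char), unicodedata.name(char, "<unnamed>")


for char in "äßẞ .?😀":
    category, name = describe(char)
    print(f"{char!r}  {category}  {name}")

# Characters that do not look like ASCII digits can still carry a value.
arabic_three = unicodedata.decimal("٣")   # ARABIC-INDIC DIGIT THREE
roman_eight = unicodedata.numeric("Ⅷ")    # ROMAN NUMERAL EIGHT
print(arabic_three, roman_eight)
```

One detail worth knowing: `unicodedata.decimal` only works for decimal digits, so for Roman numerals and similar characters `unicodedata.numeric` is the call that returns a value.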
There is the character maps app that will also give you all the information about the character and where you can see also the category and the name and all the characteristics. Case folding. Case folding is the possibility to switch between lowercase and uppercase letters,
which works in Latin alphabet, in Greek alphabet, in Cyrillic alphabet. It doesn't work in Chinese. It works in some alphabets. But in our case, we did it. So, if I have some characters in lowercase, I get uppercase version and vice versa.
There are some exceptions. So, for example, the sharp s: there is an uppercase sharp s, but Python converts the lowercase one to an uppercase SS, which is probably wrong. And the other way around, if I have the uppercase sharp s and I convert it to lowercase, then it will work.
So this is not a symmetrical operation. Conversion works for all characters that can be lowercased or uppercased, but it doesn't always work correctly. We have seen here the case of the sharp s. And there is also one other case that is contained even within ASCII, even within the basic 26 letters of the Latin alphabet.
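The asymmetry with the sharp s can be checked directly with the built-in string methods. The comparison helper `same_ignoring_case` is a hypothetical name used only in this sketch.

```python
upper_of_sharp_s = "ß".upper()            # becomes two letters
lower_of_capital_sharp_s = "ẞ".lower()    # becomes the lowercase sharp s
round_trip = "ß".upper().lower()          # does not come back to "ß"


def same_ignoring_case(left, right):
    """Compare two strings caselessly using str.casefold()."""
    return left.casefold() == right.casefold()


print(upper_of_sharp_s, lower_of_capital_sharp_s, round_trip)
print(same_ignoring_case("STRASSE", "straße"))
```

For caseless comparison, `str.casefold()` is usually a better choice than `lower()` or `upper()`, because it folds ß and ss to the same form.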
And this is something that we don't see as a problem, but we broke it for some other languages. And that is the case of i. You see the difference between lowercase i and uppercase I.
There is one tiny difference, the dot. There is the lowercase i with a dot and the uppercase I without a dot. And there is a language, at least one, or rather a family of languages, that distinguishes between these two i's. It is, for example, Turkish. The i with dot is pronounced [i].
And the i without dot, the dotless ı, is pronounced [ɯ]. And now imagine that you have some Turkish text (not you, but your Turkish colleagues) and they want to convert between uppercase and lowercase: it can go wrong. And sometimes it can be so wrong that the word means something different depending on whether it is with the dot or without.
So, our Turkish colleagues actually have to use a workaround. They have to import ICU, which is the International Components for Unicode library, then import the locale, and then convert it this way.
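To see the problem without installing anything, the sketch below compares Python's locale-independent case conversion with a hand-rolled Turkish mapping. This is a deliberately simplified stand-in written for illustration, not the ICU approach from the talk; the functions `turkish_lower` and `turkish_upper` are hypothetical helpers, and a real application would rather rely on ICU.

```python
def turkish_lower(text):
    """Lowercase with Turkish rules: İ -> i and I -> ı."""
    return text.replace("İ", "i").replace("I", "ı").lower()


def turkish_upper(text):
    """Uppercase with Turkish rules: i -> İ and ı -> I."""
    return text.replace("i", "İ").replace("ı", "I").upper()


# The built-in methods know nothing about Turkish.
print("ISPARTA".lower())           # dotted i, which is wrong in Turkish
print(turkish_lower("ISPARTA"))    # dotless ı
print("istanbul".upper())          # dotless I, which is wrong in Turkish
print(turkish_upper("istanbul"))   # dotted İ
```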
So, that's a little bit more complicated. Normalizing is something that you probably don't usually see. Normalizing here is the decomposition of characters into their parts. For example, we have the German word süß, which means sweet. And I have two words that look the same.
The first one has three characters. The second one is in the normalized NFD form. What does that mean? I take, again, my small tiny script that shows me all the characters within the string. And you see that the first word contains three characters. The second one contains four characters.
And the difference is in this u with umlaut. In the first case, the ü is one character, Latin small letter u with diaeresis. In the second case, there are two characters. The first one is Latin small letter u. And the second one is a combining diaeresis. And one thing you see is that the column in the second word,
in this line with the combining diaeresis, has moved a little bit, by one character. That means that the combining diaeresis doesn't have a width. So, it has zero width. It is just glued to the character before. And then it looks like a u with umlaut.
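The decomposition described above can be reproduced with `unicodedata.normalize`. This is a minimal sketch; the variable names are illustrative.

```python
import unicodedata

composed = "süß"                                      # ü stored as one code point
decomposed = unicodedata.normalize("NFD", composed)   # u + combining diaeresis
recomposed = unicodedata.normalize("NFC", decomposed)

for label, word in [("composed", composed), ("decomposed", decomposed)]:
    print(label, len(word))
    for char in word:
        print("   ", unicodedata.name(char))

decomposed_names = [unicodedata.name(char) for char in decomposed]
```

The two strings look identical on screen but compare as different, so it is worth normalizing to one form (for example NFC) before comparing or storing names.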
And there are plenty of combining characters that allow you to combine. You can actually make a sharp s with dir resist on the top. Or combine most characters with other combining characters. This is an example. This is a Stack Overflow answer to the question,
how to parse HTML with regular expressions. Of course, you can't. But what you see at the end, these are just characters with plenty, plenty of random combining characters just after them. And it looks cool. Alphabetic sorting. There is built-in function in Python sorted that will get a list or a string,
which is actually a list of characters, and will sort them according to some rules. If you have some numbers, great. That's easy. But if you have characters, it will sort them. And what you see in the example is that at the beginning, I have the uppercase A, O, U, then the lowercase a, o, u,
then the uppercase ones with umlaut, then the sharp s, then the lowercase ä, ö, ü, then I have some Central European ones, like Czech and Slovak. And then the uppercase sharp s comes at the end. So actually, the order doesn't look very natural. And this is because all these characters are converted to Unicode code points,
that is, to numbers, their positions in the Unicode table. And then they are sorted according to them. And the uppercase sharp s was added later, so it is at the end. And this is not what you would like to see as an alphabetical list, in a phone list or a list of names or something.
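The code point ordering is easy to verify. In this sketch the list of letters is an illustrative sample, not the one from the slides.

```python
letters = ["ä", "a", "Z", "ß", "ẞ", "A", "z", "Ä", "š"]

by_code_point = sorted(letters)   # plain sorted() compares code points

for char in by_code_point:
    print(f"{char}  U+{ord(char):04X}")
```

All ASCII uppercase letters come first, then ASCII lowercase, then the accented Latin-1 letters, then š, and the uppercase sharp s (U+1E9E) lands at the very end.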
We want to sort now according to the German language, because sorting according to every language may be a little bit different. So we import locale, and we say our locale is German. And we sort these characters using the locale.strxfrm function as the key. And it will look better, because first I have both As,
then I have A with umlaut, and then I have B. So this is how it should look in a German dictionary or in a German phone list or list of names. But if I have a Swedish user, for them, seeing A with umlaut between A and B is not natural.
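A hedged sketch of locale-aware sorting with the standard `locale` module follows. Whether a locale such as `de_DE.UTF-8` or `sv_SE.UTF-8` is available depends on the operating system, so the hypothetical helper `sorted_for_locale` returns `None` when it is not installed, and it restores the previous collation setting afterwards.

```python
import locale


def sorted_for_locale(words, locale_name):
    """Sort words by the collation rules of locale_name, or return None
    if that locale is not installed on this system."""
    previous = locale.setlocale(locale.LC_COLLATE)    # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        # setlocale affects the whole process, so always switch back.
        locale.setlocale(locale.LC_COLLATE, previous)


words = ["Zebra", "Äpfel", "Apfel", "Birne"]
print(sorted(words))                               # code point order
print(sorted_for_locale(words, "de_DE.UTF-8"))     # German rules, if available
print(sorted_for_locale(words, "sv_SE.UTF-8"))     # Swedish rules, if available
```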
Swedes expect the umlaut characters at the end of the alphabet, after Z. So if you have the Swedish alphabet, they should be at the end. In Hungarian, there is the "ch" sound, which is written as cs.
And cs doesn't go between cr and ct. Cs is a special letter between C and D. So this word, which means hot as in chili, comes after all the c words. In Czech and Slovak, we also have the "ch" sound,
but we write it as Č, C with caron. We have Š, which you have seen already in my last name, which is S with caron. We have plenty of characters; the Slovak alphabet has 43 letters with all the possible diacritics. And another thing is, for example, the Ch, which is a letter of its own and which is sorted alphabetically between H and I.
But there are also exceptions. If you have two words glued together, and the first one ends with a C and the second one starts with an H, it is not Ch, it is C and H. And then it is sorted differently. In French, it is even more interesting. They sort some things from the beginning of the word, and some other things from the end of the word.
So usually, when they sort words, they first sort everything according to the ASCII form. And then, if there are, say, these four words that have the same ASCII form, they look at the last syllable. And then you see that the first two words have nothing on the final e, and the other two words have an accent aigu.
So first come the two words where the last syllable is without accent, and then the words where the last syllable has an accent. And then within these pairs, they sort according to the penultimate syllable, so the one before the last, and so on and so on. So that's French.
That's okay. If you have seen their keyboard layout, you understand why they are doing it like that. The problem is that the locale is connected to the process. So it means if you do something like this, like setlocale in your code, then if it is a library or if it is a website with plenty of users, with plenty of threads running,
this will change the locale of the whole process. And this is not what you want, because if you have two users, and one of them wants the list sorted according to German rules and the other according to Swedish rules, then they will just break it for each other. And that's a problem, because the locale is connected to the process. But we have seen already this ICU library
that allows you to use locales as objects, in an object-oriented way, and that allows you to do something in your corner, in your method, where you use all these things, but the whole process is not changed by that.
Another possibility, which is much more lightweight, is pyuca, which, as you see, sorts nicely, but it sorts according to some English rules, because you don't define which locale, which language, it has. So you get better sorting than the default sorting according to Unicode code points,
but this is not optimal for every language. But if you need one general list, you can go with pyuca. Now, regular expressions. If you have a problem, use regular expressions. Now you have two problems. But anyway, let's say that we want to extract
something from this string, München 123. München is the German name for Munich. München 123, and we want to extract the name München. So if I import re, and then I ask for all characters a to z and A to Z,
then it will see the capital M, and then it doesn't see the ü, because it doesn't belong to the range A to Z. There is \w (backslash w), which finds the ü. So it finds the word München, but it also finds all the digits,
and I'm not interested in digits. I just want to see München. So how can I extract München with a regular expression? There is the third-party library regex that works identically to the standard re, but it has some more functionality, and in this case, that's the possibility to ask for \p (backslash p),
which matches a character with a given Unicode property. And the capital L, as you remember, is the category for letters. So if you have L, it's any letter. Lu would be an uppercase letter. Ll would be a lowercase letter. And this is how you can actually use regular expressions
to find words that contain characters beyond ASCII. So, I came here for Python, but I said it would be about names. So that was the programming part. Now let's have a look at the names. I cannot see you, but you can just raise your hand if your name fits into the first name, last name category.
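Going back to the regular expression part for a moment: the München 123 example can be tried with the standard `re` module alone. The talk's solution uses the third-party `regex` library with `\p{L}`; the last pattern below, `[^\W\d_]+`, is a stdlib-only stand-in chosen for this sketch (word characters minus digits and the underscore), not the pattern from the slides.

```python
import re

text = "München 123"

ascii_letters = re.findall(r"[A-Za-z]+", text)    # ü is outside A-Z and a-z
word_chars = re.findall(r"\w+", text)             # \w also matches digits
letters_only = re.findall(r"[^\W\d_]+", text)     # word characters that are letters

print(ascii_letters)
print(word_chars)
print(letters_only)
```

With the `regex` library installed, the pattern `\p{L}+` expresses the same intent more readably, and `\p{Lu}` or `\p{Ll}` narrow it down to uppercase or lowercase letters.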
Mine fits. But maybe there are some people here who have a middle name, or who have a patronymic or matronymic surname, like in Spanish. Juan Guterres... Da, da, da. Salvatira. So these are names that are...
This is not one last name. There are two last names, or two surnames. Maybe there is someone from Hungary or from Eastern Asia who has the last name first and then the first name last. Are there any popes or queens or kings here? Maybe somebody who has only one name.
Or for example, from the Nordic countries, in Iceland, someone is called Sigur and their father is Johan. So this person is called Sigur Johansson. But Johansson is not the last name. Johansson is the patronymic name. So this means that Sigur Johansson, you can call them Sigur,
you can call them Mr. Sigur Johansson, but you don't call them Mr. Johansson. And in the alphabetical list, they are not under J, like Johansson. They are under S, Sigur Johansson. There are names like...
different forms of names. So for example, in Czech and Slovak, the masculine and the feminine forms of names are different. My name is Šedivý. All the female members of my family are called Šedivá. Gray-haired. But this is the feminine grammatical form
of this adjective. And with nouns in Czech and Slovak, if someone is called something like Müller, the women are called Müllerová. An -ová comes at the end. And this is not only for Czech and Slovak names, this is also for other names. So if you read a Czech or Slovak newspaper,
you will see Angela Merkelová. Merkelová is still okay, because Merkel sounds like a last name that would be acceptable also in Czech and Slovak. But there are also some names from Africa or from Asia that are grammatically not compatible with our languages. And they always get the -ová at the end, too.
And of course, if you have some title, like von und zu, or some academic title, which is a part of your name, then it is also more complicated to decide how to write it in a form. Because if a form asks for a first name and last name,
where do you write your title, or your second name, or patronymic, matronymic, or the other parts of your name? So actually, what I suggest is to have one field for the full name, where you write the name as it is, officially, in your passport. And then a field for: how should we call you? So in my case, the full name is Miroslav Šedivý,
and you can call me Miro. And there are some other people who make it really clear on how you should call them and how you should write their names. Yeah, this is what I see sometimes when I write my name on some forms here in Germany. Please enter characters from the European character set only.
What is the European character set? Šedivý is a Central European, Czech or Slovak, name, and it contains only characters from the Central European, or from the European, character set. "Bitte geben Sie einen gültigen vollständigen Namen ein." Please enter a full valid name. I'm sorry, I don't understand what you mean.
My name is valid. A name of a person cannot be invalid. So if you program something that has to do with names, I'm not speaking about GDPR, I'm speaking about common sense, don't assume anything. Don't put random limit on the length of a name. There are short names. There are long names.
There are very long names. There are names that are so long that even if somebody makes a typo on Wikipedia, that person won't notice it. Don't use stop words. If it is a stop word in your language, it is probably a perfectly valid name in another language. As I said, family members don't necessarily have the same family name.
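The suggestion of one full-name field plus a "how should we call you?" field, combined with the advice not to assume anything, could be modeled roughly like this. It is a sketch under stated assumptions: the class `PersonName`, the helpers `clean_name` and `make_person`, and the deliberately permissive rule (anything non-empty and printable is accepted, with no length limit and no stop words) are all illustrative choices, not something prescribed in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    full_name: str        # as the person writes it, e.g. as in the passport
    preferred_name: str   # answer to "how should we call you?"


def clean_name(raw):
    """Return the name without surrounding whitespace, or None if nothing
    usable was entered. No length limit, no character whitelist."""
    name = raw.strip()
    if not name or not name.isprintable():
        return None
    return name


def make_person(full_name, preferred_name=""):
    """Build a PersonName; fall back to the full name if no preferred name."""
    full = clean_name(full_name)
    if full is None:
        raise ValueError("a name is required")
    preferred = clean_name(preferred_name) or full
    return PersonName(full, preferred)


person = make_person("Miroslav Šedivý", "Miro")
print(f"Hello, {person.preferred_name}!")
```

Short names, one-letter names, names with apostrophes, and the name Null all pass through unchanged.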
In my case, that applies to all the names in Czech, Slovak, Polish, and other languages. Then there are different transcriptions from non-Latin alphabets. So, of course, if you have the Russian name Chekhov, every European language writes it differently. The same with Chinese. On the other hand, I went to Russia twice with the same passport, and I got two visas.
And on each visa, there was my whole name in the Latin and in the Cyrillic alphabet. And on those two pages, the Cyrillic transcription of my last name was different. So the Russian officials, too, see my last name,
and they try to transcribe it somehow into the Cyrillic alphabet. So, it works the other way around as well. Men change their family names, too. So, if in your form you ask for the maiden name, it's probably not what you want, because there are plenty of men who change their family name after they get married.
One letter name is probably not an initial. So, the French guy who did quite a lot of beautiful stuff with fractals, the B is not an initial, it's just a B.
So, everything that is printable is probably fine. You have to expect anything in a name. Maybe you have heard about this guy, Christopher Null: "Hello, I am Mr. Null. My name makes me invisible to computers." If your program has problems with that, I'm sorry for that. Someone bought a customized license plate
for their car, and they wanted to have NULL on it. The guy thought, hey, it's great: if I get a speeding ticket, they are not going to be able to attribute the speeding ticket to my license plate, because it says NULL.
And in the end, he received all the speeding tickets in the county that could not be attributed to any license plate, because they were just mapped to NULL. So, he received way too many. If your database has a problem with a guy named Robert
DROP TABLE students, okay, see you in the Q&A. Also, streets and cities. There's something interesting there too, because they are like names. If you are in Germany, do you know what the most common name of a street in Germany is?
Yes, it's Einbahnstraße. No, just joking. Einbahnstraße means one-way street, but many foreigners think that it is the name of a street, and when they park a car, they just write down "in the Einbahnstraße", and they need quite a long time to find their car again. What you can see in many US directories of companies
from Germany is the concept of "Hauptstrabe", because their OCR, probably, or some other programs are not able to identify the sharp s (ß), and they write it as an uppercase B.
The names of places can be very short, like Å, somewhere in Scandinavia, or Y (i grec), somewhere in France. The inhabitants call themselves Ypsiloniens. So if you live in a place like this,
and then you get some security question, like, what is your mother's maiden name, or what is your place of birth, and it says, oh, you have to enter at least six characters. No, don't do that. And there are some places, like Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch, that are a little bit longer, or Chrząszczyżewoszyce, powiat Łękołody.
So you need really much more space, and you don't have to stop after 10, 15, 20, or 30 characters, because the places have really long names. And sometimes the places, they even don't need names, because somewhere in Iceland, you can just draw a map on the envelope,
and it will arrive. There are plenty of things that you have to think about when you are programming something with names, and that can surprise you. There are some pages like this one, Falsehoods Programmers Believe About Names.
I invite you, just read them, and you will see quite a lot of interesting stuff that you have not thought about earlier. Your name is invalid? No, your name is not invalid. Please, as a developer, respect the names of your users, because their names are not invalid.
Don't break the locale. So import ICU if you are in Python. Convert from bytes to strings as soon as possible and from strings to bytes as late as possible. Work with strings the whole time. It's cool. UTF-8 is cool. Python 3 is cool. Be cool, too, and use Python 3 and UTF-8. If you tell the user your name is invalid,
you will land on the Twitter account Your Name Is Valid. Actually, this is also a limit of Twitter, because an account name can have a maximum of 15 characters, so "your name is invalid" wouldn't fit there. So there is a Twitter account, Your Name Is Valid. And be nice. Thank you very much.
Miro, thanks for your interesting talk. Thank you. It was a pleasure to hear what kind of problems you can have with names and encodings
and how you can circumvent it as a programmer, and you really should circumvent it. I have some questions from the audience, and they go specific to non-letter characters.
Which characters? Non-letter. Non-letter characters, like the apostrophe, and the so-called em dash. Yep.
And what do you think about them? Should they be allowed in names? How do you handle that? At least in passports, it looks like they are allowed. If they are a part of a name of a person,
then they are valid. Yes, apostrophes, em dashes, numbers, maybe. All of that should be allowed. So yes, as I said, you have to accept almost anything and deal with it.
OK. We have a question from the audience. Really, thank you for your talk. It was interesting for us all.
Sorry for the problems with the stream we had. If you missed something from the stream, just go to the recording afterwards. There will be a full recording of this session, including questions and answers.
Thanks again, Miro. Thank you very much, and enjoy the conference. Bye.