Introduction to Programming for Business Analytics - Lecture 2: Elementary Data Types

Formal metadata

Title
Introduction to Programming for Business Analytics - Lecture 2: Elementary Data Types
Number of parts
22
License
CC Attribution 4.0 International:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.

Transcript: English (automatically generated)
Hi, welcome everyone to the second lecture of the Introduction to Programming for Business Analytics class. In today's lecture, we're going to talk about some elementary data types. If you remember from the previous lecture, we talked about some numeric data types, or more specifically, values that are called numeric literals.
In this lecture, we're going to expand upon these numeric literals, and specifically we're going to talk about integers and floating-point numbers. We're also going to talk about Boolean values, which are another data type.
We're also going to talk about strings, and finally we're going to talk about vectors. Strings and vectors are very similar to each other in some ways, but they also differ from one another, which we will explore throughout the next slides.
In Julia, every value has a data type. Like I mentioned before, we looked at numeric values and numeric data types. The reason why we talk about these data types is that a value's data type determines how we can use it and how we interact with it whenever we write a program.
In this lecture, we're going to talk about six elementary data types in Julia. In particular, we're going to talk about a data type called Int64. Int stands for integer, and Int64 stores positive and negative whole numbers within a limited range. We're also going to talk about big integers,
a numeric data type called BigInt, which also stores positive and negative whole numbers. However, they can be outside of the range that is available to us with Int64. We're also going to talk about Float64.
Float stands for floating-point number, and Float64 stores positive and negative numbers that may have decimal places. We're also going to look at Boolean values, a data type called Bool, which stores one of two possible logical values: either true or false.
We're also going to look at String, which is a sequence of characters for representing text. And we're going to look at vectors, a data type called Vector, which is a data container that may contain more than one value.
To see an example of each data type, we can use the Julia function called typeof, which has one parameter and returns the type of its argument. We'll start with the first one. A simple example of an Int64 would be to type in just the number 1.
Any whole number would work, positive or negative. In this case, I chose 1. You'll see here that typeof says Int64. We can also take a look at another data type, which in this case is going to be BigInt.
BigInt is usually used for very large whole numbers. In this example, I'm just going to type a bunch of 9s here just to have a very large number. Then when I execute the cell, I get BigInt as the type of that data.
We can also take a look at a floating-point number. One example would be to type in the first few digits of pi. I'll just type in 3.14.
It has a decimal point, and I execute the cell. As you can see, we have Float64. Next, for example, we type in typeof and then, between the parentheses (I have a typo here),
we just type in true. There are a couple of things to notice. The first is that the word true is highlighted in green. That's because it's a keyword. When we execute the cell, we can see that it says it's of type Bool.
We can also have a string value. String values are typically indicated by quotation marks. If I type quotation marks and between them I type, for example, ipba, and then I hit execute, I get String.
Last but not least, we can have a vector, or an array, that contains all of these guys together, all of the ones that we've written so far: the 1, then the one with all the nines, the very large number that we wrote to get the big integer,
then 3.14. Then we have true, comma, and then, between quotation marks, we have ipba for the string. Then, just to make the example comprehensive, we can actually add another vector or another array inside of the one that we are trying to define here.
This is an array or a vector; we'll talk about the distinction between an array and a vector, or rather what the two terms have in common, later on. For now, this guy has all of the things that we've written down so far: the 1 to indicate the integer, the Int64; the 999..., the very large number, to get the big integer;
the 3.14 to get the floating-point number; the Boolean value true; then the string ipba; and then, just to make it comprehensive, we added another empty vector. When you hit execute here, we can see that it says a vector of Any.
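The typeof calls described above can be reproduced with a short sketch like the following. The variable name values_seen and the exact run of nines are my own choices, not taken from the lecture. The literal needs to be long because Julia turns a literal that is too large for Int64 into an Int128 first, and only into a BigInt once it exceeds the Int128 range as well.

```julia
# Values similar to the ones typed in the lecture. The run of nines is an
# assumption: it is long enough to exceed the Int128 range, so Julia
# parses it as a BigInt.
values_seen = [
    1,
    9999999999999999999999999999999999999999,
    3.14,
    true,
    "ipba",
    [],
]

# Print each value together with its data type.
for v in values_seen
    println(repr(v), " has type ", typeof(v))
end

# The container itself: a vector whose elements may be of any type.
println(typeof(values_seen))
```

Because the elements have unrelated types, Julia falls back to the element type Any for the container, which matches the "vector of Any" result mentioned above.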
Again, we'll talk about what these guys mean later on, but for now these are the six types of data that we're going to talk about throughout the remainder of this lecture. We've just seen that we have different ways of representing numbers:
Int64, BigInt, and Float64. A natural question that comes to mind as soon as we make this observation is: why are there different data types for representing numbers? The short answer is that each one of these guys has its pros and cons, and none of them is suitable for every situation.
To elaborate on this: if you know a bit about computer architecture, you know that a computer's memory is typically organized as a sequence of elements. Each one of these elements is called a bit.
A bit can have only two states, zero and one. Since the computer's memory is limited, every piece of information that a computer can process must be represented as a finite sequence of bits. We'll elaborate on this shortly. For now, what you need to know is that there is this finiteness in the amount of information
that can be processed and represented in a computer's memory because of this finite sequence of bits. Therefore, different numeric data types are designed to address different aspects of this limitation, the fact that we can only represent and process a finite number of elements.
Just a few short notes here: these bits, this sequence of zeros and ones, are typically grouped into what are referred to as words. In most computers, one word is equal to 64 bits.
The computer currently being used here is such a 64-bit machine. When you just type in 1, you get Int64. That's because the default word on the computer being used for this lecture has 64 bits. That's the first piece of information.
The second one is that Int64, BigInt, and Float64 are not the only numeric data types in Julia. These are just a couple of things to keep in mind, but I would say these three are perhaps the most important ones that you need to know. How do we represent integers using Int64?
A value of type Int64 uses one 64-bit word per number. Now, different permutations of 0 and 1 in the 64 positions we have at our disposal represent different numbers. In the following table, we're going to take a look at examples of how these different permutations of 0 and 1
in the 64 positions represent different numbers in Int64. Whenever we want to represent the number 0, we're going to have zeros in all of the 64 positions. If we want to represent the number 1, we increment the first position from the right,
and we're going to have 1 there and 0 everywhere else; that's going to be 1. We cannot increment the first position anymore, so we have to increment the second position. If we have 1, 0 at the end, that's going to be 2. Then, if we have 1, 1, that's going to be 3. If we want to represent 4, we're going to have 1, 0, 0, and so on.
We can have as many permutations as we want for the 63 positions on the right, and you'll see that in the first position from the left, we're going to have the number 0. So if the first position from the left is 0, and all of the remaining ones are 1,
we arrive at the number 2 to the power of 63 minus 1. In other words, leaving aside the all-zeros pattern, we have 2 to the power of 63 minus 1 permutations of zeros and ones for the 63 positions on the right,
and all of these represent positive numbers. If we change the first position from the left to 1 instead of 0, we start representing negative numbers, and if all of the remaining ones are 0, then we are looking at the smallest negative number,
which is in this case negative 2 to the power of 63. But we still have 2 to the power of 63 minus 1 further permutations, and for each of these, we get a corresponding negative number as we did for the positive numbers. So the takeaway message is that the first position from the left indicates the sign,
so 0 is positive and 1 is negative, and the different permutations of the remaining positions tell us the corresponding value of the positive or negative number. There is a very useful Julia built-in function called bitstring,
which has one parameter and returns the bit representation of its argument. So, for example, if I ask for the bitstring representation of the number 0 and I run the cell, you'll see that everything here is 0.
Similarly, if we do bitstring of 1 and execute the cell, you'll see that everything here is 0 except for the first position from the right, which is 1. We can also do bitstring of 2 to the power of 63.
And as you can see here, we have a 1 and then 0 everywhere else. Obviously, this is a negative number. So if you run this, you'll see here we have the negative number,
which corresponds to the number that we've just seen in the table. So here is a summary of the different bitstring representations that we've just mentioned, which I've already prepared for you.
So if we run this cell, we can see the different representations. For 0, we've just seen that. For 1, we've also seen that. For 2 and for 3, we've seen those in the table. And then we've also seen 2 to the power of 63 minus 1 and negative 2 to the power of 63.
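The summary cell itself is not part of the transcript, so the following is only a sketch of what such a cell might look like. It uses typemax(Int64) and typemin(Int64), which are my additions, to name the largest and smallest Int64 values directly instead of computing them with powers of 2.

```julia
# Numbers whose bit patterns were discussed in the table.
# typemax(Int64) is 2^63 - 1 and typemin(Int64) is -2^63.
examples = [0, 1, 2, 3, typemax(Int64), typemin(Int64)]

# Print each number next to its 64-character bit representation.
for n in examples
    println(lpad(n, 20), " -> ", bitstring(n))
end
```

The leftmost character of each printed pattern is the sign position described above: it is 0 for every line except the last one.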
A very important thing to keep in mind is the wraparound behavior: when adding to the highest representable Int64 number, which is 2 to the power of 63 minus 1, the computation wraps around. What does that mean? So here we have the numbers.
We're going 0, plus 1, plus 2, and so on until we get to the largest positive integer, the highest representable integer. After this point, we go into the negative range. That was actually the case when we printed the number 2 to the power of 63.
So that's the point where it switched from the positive range to the negative range. We say that the computation wraps around in the sense that it goes from positive to negative.
So if, for example, we print 2 to the power of 63 minus 1, and then in another line we print 2 to the power of 63,
you'll see that the first is the positive number and the second is the negative number. So that's where we are at this point: we reach the break, and then we go to the smallest negative integer representable in Int64.
Similarly, when subtracting from the lowest representable Int64 number, which is negative 2 to the power of 63, the computation wraps around in the other direction. So we do the same thing, but now we say println of negative 2 to the power of 63,
and then in another line, println of negative 2 to the power of 63 minus 1,
and we execute the cell. We see that here we go from the negative to the positive range. So this is the point where we go from here to here because we're subtracting 1. We take the smallest negative number, subtract 1 from it, and arrive at the largest representable positive number.
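The wraparound in both directions can be checked with a small sketch. The names largest and smallest are mine, and typemax and typemin stand in for the powers of 2 typed in the lecture.

```julia
largest = typemax(Int64)    # 2^63 - 1, the highest representable Int64
smallest = typemin(Int64)   # -2^63, the lowest representable Int64

println(largest)
println(largest + 1)        # wraps around into the negative range
println(smallest)
println(smallest - 1)       # wraps around into the positive range
```

Julia does not raise an error here; the arithmetic silently continues at the other end of the range, which is why the behavior is easy to miss.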
Just a small note: wraparound behavior is sometimes also known as overflow. All right, so how do we overcome this limitation of overflow or wraparound behavior? One way is to use the data type BigInt.
A value of type BigInt can represent any whole number, even those outside of the range of Int64, using as many words as necessary. So we don't use just one word, with only 64 positions for different permutations of zeros and ones, but as many words as necessary.
And the number of words is chosen behind the scenes. The only limitation that we have here is the size of the computer's memory, in particular, the memory of the computer that you're using. The way to do this: first, we need to specify that a given
number is going to be of type BigInt. So in this case, I define the number 2 to be of type BigInt. This number has now been declared as a big integer. And then we can simply raise this number to whichever very large power we want to choose.
So we can, for example, raise it to the power of 128. And then, as you can see here, we obtain a very large number. Similarly, we can also do BigInt, where in this case we also have 2,
and let's say, just for the sake of example, we raise it to the power of 63 and then add 1. Let's just see what happens. As you can see here, we still get a positive number, unlike the other case, where we didn't pass the 2 to the BigInt function,
which is what gives us a value of the big-integer type. All right. So using BigInt and Int64, we can represent all the integers, positive and negative, that we want.
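As a rough sketch of the two BigInt examples, next to the plain Int64 computation for contrast (the variable names are illustrative only):

```julia
# Wrapping the base in BigInt makes the whole computation use big integers.
huge = BigInt(2)^128
beyond_int64 = BigInt(2)^63 + 1

# The same expression with a plain Int64 base overflows and wraps around.
wrapped = 2^63 + 1

println(huge)
println(beyond_int64)
println(wrapped)
```

Only the base needs to be a BigInt; the exponent and the added 1 are ordinary Int64 literals and are converted automatically.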
To represent numbers that may have decimal places, we can use Float64. Float64, like Int64, uses one 64-bit word per number. And to facilitate representing a wide range of numbers, Float64 uses
something called scientific notation to approximate very small and very large numbers. Using this approximation, however, makes Float64 prone to some inaccuracy. We will see when this inaccuracy happens and how it can be handled.
But for now, the important thing is that Float64 uses this thing called scientific notation in order to represent a wide range of very large and very small numbers. So, for example, if you have 3.14, this number, as we mentioned earlier, has a decimal point.
And if we ask Julia what the type of this guy, 3.14, is, we can see that it is Float64.
And to see this approximation error, let's add 0.1 to 0.2. Now, just by looking at this, we know that the result should be 0.3, right? That's what you get for adding 0.1 plus 0.2. However, when we ask Julia what this number is, it says that it is
0.3000000 and so on, all the way to the last decimal place, where we have the digit 4. So this is an example of something called a numerical error.
And before we address this numerical error, this prompts one question: if Float64 is an inaccurate data type, is it still useful? And the answer is yes. The precision that we actually get from Float64, which is
correct all the way up to here, except for this last decimal place, is sufficient for most practical applications. As you can see here, in general, you can argue that this last digit is negligible. So to answer the question of whether Float64, if it's inaccurate, is still useful:
the answer is yes, because this precision is sufficient for most practical applications. And to hedge against this, the recommendation is that whenever it's possible to use Int64, use Int64. But if you must use Float64 because you need fractional or decimal places, you need to take this inaccuracy into account.
So how do we take this inaccuracy into account? That's the second question. One way to do this is to use a Julia built-in function called isapprox. What isapprox does is check whether two numbers are approximately the same or not.
So I type, for example, isapprox here, and I ask Julia about the two numbers: 0.3, which is what I expected to get from adding 0.1 to 0.2, and 0.1 plus 0.2.
I just type in the numbers directly, ask whether they're approximately the same, and execute the cell. I think we have a typo here; that should be isapprox. And as you can see here, Julia says that this is true.
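The numerical error and the isapprox check can be put side by side in a few lines. This is only a sketch; the name total is mine.

```julia
total = 0.1 + 0.2

println(total)                 # prints 0.30000000000000004
println(total == 0.3)          # prints false: exact comparison fails
println(isapprox(total, 0.3))  # prints true: approximate comparison succeeds
```

isapprox uses a default relative tolerance that is suitable for errors of this size; stricter or looser tolerances can be requested through its keyword arguments, which are not covered here.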
So that means that 0.1 plus 0.2 is approximately the same as 0.3. There are a couple of things to note about Float64. The first thing is that it has some special values. The number zero is actually represented in two ways.
If I type in 1.0 and multiply that by 0.0, you can see here that I get 0.0. That's just multiplying one by zero. But if I make the one a negative number and multiply it by the same 0.0,
you'll see that Julia says that this is negative 0.0. Now, obviously, this is unfamiliar to us, because negative 0.0 should be the same as 0.0, which is indeed the case in Julia: the two are equal as far as the language is concerned.
However, Julia represents the number zero in two different ways. Plus infinity and negative infinity are represented as Inf and -Inf, respectively. So if you type in, for example, 2.0 raised to the power of 124, or say, we can even make it bigger,
1024, you can see that it says Inf, which basically means plus infinity. And similarly, if we do the same, but now with negative 2.0 raised to the power of 1024,
as you can see here, we get negative infinity. Undefined results, for example, if you divide zero by zero,
are represented as a value called NaN, which stands for not a number. So if you have zero divided by zero, you get NaN, which basically means not a number. And this is usually what happens whenever you have an undefined result.
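The special Float64 values mentioned here can be produced with the same expressions as in the lecture. The sketch below stores them in variables whose names are my own and uses the standard helpers signbit and isnan, which the lecture does not mention, to inspect them.

```julia
pos_zero = 1.0 * 0.0       # ordinary zero
neg_zero = -1.0 * 0.0      # negative zero
pos_inf = 2.0^1024         # too large for Float64, becomes Inf
neg_inf = -2.0^1024        # becomes -Inf
not_a_number = 0 / 0       # undefined result, becomes NaN

println(pos_zero, " ", neg_zero, " ", pos_zero == neg_zero)
println(pos_inf, " ", neg_inf)
println(not_a_number, " ", isnan(not_a_number))
println(signbit(neg_zero)) # true: the sign position is set for -0.0
```

One detail worth knowing is that NaN is not equal to anything, including itself, so isnan is the usual way to test for it.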
So here's a quick overview of the different types of numeric literals, or numeric data types, that we've talked about throughout the lecture: Int64, BigInt, and Float64. With Int64, we are usually dealing with whole numbers, positive or negative, like 1, 2, 3, negative 1, negative 2, negative 3, and so on.
That is also the case for big integers, or BigInt: we also deal with whole numbers. With Float64, however, we are dealing with whole and fractional values. For Int64, the range of representable numbers is between negative 2 to the power of 63 and 2 to the power of 63 minus 1.
For BigInt, we can represent every negative or positive integer, of course, as long as the memory allows.
As for Float64, we can represent a wide range of real numbers, as we mentioned. In terms of precision, Int64 values are precise, and BigInt values are also precise. Float64 values, however, are not, and we can end up with numerical errors.
As far as efficiency in terms of calculation goes, Int64 is efficient, BigInt is not efficient, and Float64 is efficient. A couple of slides ago, we talked about how we need to take the inaccuracy of Float64 into account.
The question is, when does this actually matter? When do we take the inaccuracy of floating-point numbers into account? The answer is: whenever we compare two numbers for equality.
Earlier, we checked whether 0.1 plus 0.2 is the same as 0.3. For that, we used the built-in isapprox function, but we want to look at comparisons in a more general way rather than just using one Julia built-in function. Julia has comparison operators that allow for comparing different values.
If we type in 0.1 plus 0.2 equal equal 0.3 and execute the cell, we can see that it says false.
Obviously, we know that this is the case because of the inaccuracy of floating-point numbers. Another example would be to check the equivalence of the two representations of 0 in Julia. If you have negative 0.0, that's one way of representing the number 0,
and then the other representation is just the same without the minus sign. When we compare these guys, we can see that Julia says that this is true. Comparison operators, like this equal equal sign, are binary operators. A comparison expression has three components. The left-hand side, which is a numeric expression;
in these cases, we had 0.1 plus 0.2 and negative 0.0 on the left side. Then we have a comparison operator, which is the equal equal sign in this case. Then we have the right-hand side, which is also a numeric expression. We have 0.3 here, and then we have 0.0 here.
Comparison expressions evaluate to the Julia data type Bool, either true or false. In the first example, we had false. That followed from the inaccuracy of Float64. Then in the second example, we had true.
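The two comparison expressions can be stored in variables to make their type visible. This is a small sketch with illustrative names.

```julia
# left-hand side   operator   right-hand side
first_check = 0.1 + 0.2 == 0.3
second_check = -0.0 == 0.0

println(first_check, " is of type ", typeof(first_check))
println(second_check, " is of type ", typeof(second_check))
```

Note that the single equal sign assigns the result to the variable, while the double equal sign performs the comparison.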
Recall from the first lecture that every program can be broken down into input, output, maths, conditional execution, and repetition. Boolean expressions actually play a crucial role in the conditional execution and repetition parts of writing a program, which is going to be the topic of the next lecture.
Here's a list of common comparison operators. First, equality, which we've just seen. In math, we typically write that as a single equal sign, but in the Julia syntax, we have equal equal. Don't confuse the equality operator with the assignment operator, which is what you get when you type in just one equal sign.
In the example that we've just seen a few minutes ago, it was checking whether negative zero is equal to zero without the negative sign, which evaluated to true. We also have an inequality operator. In math, we represent that as an equal sign with a stroke through it.
In the Julia syntax, we represent that by typing the exclamation mark followed by the equal sign. Typically, the exclamation mark is a negation operator, and it negates whatever logical expression comes after it. So the exclamation mark followed by the equal sign checks whether negative 0.0 is not equal to 0.0,
and it returns false because the two values are in fact equal. We also have less than, which looks just like the mathematical notation. We have less than or equal to, which is slightly different because we have the less-than symbol,
but the equal sign is not underneath but rather follows the less-than symbol. The same goes for greater than, or strictly greater than, and greater than or equal to. And here are examples of how we use them. They're very intuitive. All right. So now we're going to talk about something called a data container.
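Before moving on to data containers, here is a brief recap of the comparison operators listed above as a sketch. Only the negative-zero comparisons come from the lecture; the other operands are made-up numbers chosen for illustration.

```julia
x = -0.0
y = 0.0

println(x == y)   # equality: true
println(x != y)   # inequality: false, because the two zeros are equal
println(1 < 2)    # less than: true
println(2 <= 2)   # less than or equal to: true
println(3 > 2)    # greater than: true
println(2 >= 3)   # greater than or equal to: false
```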
All of the data types we have introduced so far are primitive, meaning that they cannot be subdivided further into other elements of the same or different data types. A data container is a data type that may contain more than one value, meaning that it can be subdivided further into other elements of the same or different data types.
The most important data containers are strings and vectors, that is, the data types String and Vector. So far, we've looked at numbers like one and two, either in the form of an integer, a big integer, a Float64, and so on.
And if you think about each one of these values, none of them can be subdivided into further parts. But now we're going to look at data containers, data types whose values can be subdivided further. So let's see how that works in practice by looking at the first type,
the first of the two important types that we're going to talk about, which is String. A string is a data container of Char values. Char, which is short for character, is a data type that represents a single glyph, white space, or special control value.
Glyphs can be any letter, digit, or symbol. And to define a Char value in Julia, we use single-quotation delimiters. So we're going to take a look at a couple of examples. I type in single quotation marks and then, between them, any letter.
In this example, I'm going to type in the uppercase A and then execute the cell. You'll see here that it says ASCII/Unicode, followed by a code that we can ignore for now, and then category Lu, which stands for letter, uppercase.
OK. And if I ask Julia what the type of this guy is, Julia says that it's of type Char. The first thing to note here is that this category is letter,
uppercase, because here I'm looking at an English letter and it's an uppercase letter. Obviously, this is not the only kind of letter that we can find. So, for example, we can also have a lowercase a, and you'll see here that it's Ll, meaning letter, lowercase. If we ask Julia for the type of this guy, it's going to say that it's a Char.
OK. We can, in fact, also define digits as characters. So if I want to type a number as part of a string, we can also type it between single quotation marks.
You'll see here it says the category is Nd, which is number, decimal digit. And then if we ask for the type of this guy here, we can see that it's also of type Char. And now we can also type in other kinds of characters.
So, for example, if we have just a white space, you'll see that it says category Zs, which is separator, space. And then if we ask Julia for the type of this guy, it says that it's also a Char.
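The four character examples can be checked together in a small sketch. The predicates isuppercase, islowercase, isdigit, and isspace are standard Julia functions that I am adding here as a programmatic counterpart to the category labels shown in the notebook output; they are not used in the lecture itself.

```julia
# The characters typed in the lecture: uppercase letter, lowercase letter,
# digit, and white space. Single quotes create Char values.
chars = ['A', 'a', '1', ' ']

for c in chars
    println(repr(c), ": type ", typeof(c),
            ", uppercase = ", isuppercase(c),
            ", lowercase = ", islowercase(c),
            ", digit = ", isdigit(c),
            ", space = ", isspace(c))
end
```

Note that '1' is a Char and not a number: it represents the glyph, so arithmetic on it does not behave like arithmetic on the integer 1.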
All right. So not only can we type English letters like A, B, C, and so on, but we can also type Unicode characters. Unicode characters can be entered via tab completion of LaTeX-like abbreviations. Now, if you don't know what LaTeX is, you don't need to worry about that.
You just need to know that you enter Unicode characters by tab completion in a way that's similar to this LaTeX program. I'll give you an example. To get the Greek letter pi, you write pi preceded by a backslash and then press the tab key until the letter shows.
So let's do that. If I type in the backslash and then pi and then press tab, immediately you'll see that we have pi.
Now, we already know that the Greek letter pi is actually a predefined constant in Julia. So if we execute the cell, it immediately gives us the value of pi. But it can also be used as a character. So let's type the pi again
and put it between single quotation marks, and then we execute the cell. We can see that it says Unicode and the category is Ll: letter, lowercase. If we ask what the type of this guy is, which in this case is pi
between single quotation marks, we can see that it's also a Char. Now, a string is actually a sequence of Char values. So the formal definition of a string is that it's a container data type that contains a sequence of Char values.
A string can represent arbitrary pieces of text. So if you have a bunch of characters that are strung together, and that's why it's called a string, we get a String value. The only difference in how we write a string and a Char is that a string is defined between double quotation marks,
unlike Char values, which are defined by single-quotation delimiters. The term string literal is used to refer to the value enclosed between the double-quotation delimiters. So let's take a look at an example. Here we're going to have the two quotation marks, and we're going to type in Introduction to Programming for Business Analytics.
Okay. So this is an example of a string. And to confirm that this is indeed the case, we can ask Julia what the type of this guy is.
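The following sketch ties together the Unicode character and the string example. The variable names and the use of collect, which splits a string into its individual Char values, are my own additions to make the "sequence of Char values" idea visible.

```julia
greek = 'π'   # single quotes: a Char (typed as \pi followed by tab)
course = "Introduction to Programming for Business Analytics"  # double quotes: a String

println(typeof(greek))
println(typeof(course))

# A string is a sequence of Char values; collect makes that explicit.
letters = collect(course)
println(typeof(letters))
println(letters[1])
```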
And as you can see, Julia says that this is indeed a String. All right. Another important thing whenever we talk about strings is something called an escape sequence. The two delimiters, the two quotation marks that are used to define a string, are not part of the string value.
To have them as part of the string, we must escape them. Julia's escape character is the backslash symbol.
OK, so let's take a look at a couple of examples. In the first line here, we have a print statement that says: I'm a string that contains, then backslash and one double quotation mark, as a character, full stop. And then in the second print statement, we have
a string that says: I'm a string that contains, then backslash backslash, as a character. In the first example, what we're trying to do is indicate to Julia that this double quotation mark is not there to close the definition of the string value. Typically we define the string, as you remember, by having two double quotation marks.
So we tell Julia that, hey, this one is not there to close the string definition, but rather to be included in the string. In a way, this backslash acts like an instruction that tells Julia to do something specific, in this case to escape whatever comes right after it.
But what if we also want to escape the backslash itself? In the second line, this is exactly what's being done: we escape the backslash itself.
So in the first line, we escape the quotation delimiter. But in the second line, we escape the very character that allows us to escape other characters. If we execute the cell, we'll see that the first string contains only the quotation mark, because the backslash in front of it is used as that instruction, if you will.
And in the second line, we only have one backslash, because the first backslash is what allows us to escape the second one. So let's take a look at another example. Here we say: in the Julia code of the above example, and then we have three backslashes and two quotation marks.
But if we look at what's being printed, we only have one backslash and one double quotation mark. Now, the last quotation mark here is used to close the definition of the string. So essentially, the other four characters are the only ones that are part of the string that's being defined.
The first backslash escapes the second backslash, and then the third backslash escapes the double quotation delimiter. Therefore, we get a backslash and then a double quotation mark. In the second line, we have four backslashes, and all of them are within the string value that's being defined.
As you can see, the first one escapes the second, and the third one escapes the fourth. So we have only two backslashes. And then in the final line, we just have a normal string where we just say escape sequence.
So the whole sentence reads: in the Julia code of the above example, comma, backslash followed by the double quotation delimiter, and backslash backslash, are examples of what is known as an escape sequence.
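The escaping rules can be tried out with a sketch like the one below. The first two strings follow the wording described in the lecture; the third string and all variable names are my own, and the exact cell contents in the lecture may differ.

```julia
# \" puts a double quotation mark inside the string.
quote_example = "I'm a string that contains \" as a character."

# \\ puts a single backslash inside the string.
backslash_example = "I'm a string that contains \\ as a character."

# Three backslashes and a quotation mark in the code yield two characters:
# one backslash followed by one double quotation mark.
both = "\\\""

println(quote_example)
println(backslash_example)
println(both)
println(length(both))
```

Counting characters with length is a quick way to confirm that each escape sequence contributes exactly one character to the string value.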
All right, so two other escape sequences that are very important are backslash followed by n and backslash followed by t. Backslash followed by n indicates a new line, and backslash followed by t indicates a tab, or tab space.
In this example, the text says that we can also use escape sequences to represent special values such as a line break or the tab white space. OK, and as you can see in the first line here, we split the word represent by ending the first part with e here, followed by a dash to indicate that this word is going to be completed on the following line. And then we have backslash n. Now, as soon as we type in backslash n, the output continues on the next line. So everything after it in this sentence is going to be printed on the next line.
Now, note that we are not using println, but rather the print function. And because we're using print, no line break is added automatically, so everything would essentially be printed on the same line. However, within the string itself, we indicate where the old line ends and where the new line begins.
OK, so here we want to indicate that the previous line ends here after the dash and the new line starts from the s onwards. OK, so with backslash n we can start a new line, and with backslash t we can create a tab white space.
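As a rough sketch of the line break and tab escape sequences, the following uses print rather than println. The exact sentence on the slide is not recoverable from the transcript, so the text below is an assumption that only mimics it.

```julia
# print does not add a line break at the end; the \n inside the string does.
text = "We can use escape sequences to repre-\nsent a line break or a\ttab."
print(text)

# Splitting on the newline character shows that the string holds two lines.
lines = split(text, '\n')
```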
All right, so we can also convert other data types into a string, and we can do this using a function called string. For our purposes, this function takes one argument, which is the value that is being converted into a string.
So in this example, we convert a value of type Int64. By just typing 1, we know that it is by default going to be an Int64. And if we type it between the parentheses, that is, we pass it to the string function, and we execute the cell, we get 1 as a string.
OK, and we can do the same, for example, using string and then, between the parentheses, we type in a BigInt, let's say 2 raised to the power of 128. And now we have a string that holds the value of 2 to the power of 128.
Similarly, we can also convert a float, which in this case is 3.14, to a string value. And the result is also going to be a string. We can also do the same for a Boolean value. So let's say -0.0 == 0.0.
And then if we execute the cell, we have true here as a string. And by the same token, we can also convert from a string. And we do this using a function called parse, because we need to tell Julia which type we want to convert to.
So the parse function has, for our purposes, two arguments. In this case, as the first argument, we're just going to say Int64, assuming that we want to convert a string value into an Int64.
OK, so between the parentheses, we have two arguments. The first one is the type of the data we want to convert into. And then the second argument is the string value that we want to convert from.
So if we execute this, we have 1 as an Int64. Now, obviously, the type and the string have to be compatible. So if we use the same thing here again, but let's say we have the string 1.0, and we try to do that, it says invalid base 10 digit, because the string 1.0 contains a decimal point and so cannot be read as an Int64.
But if we say instead that we want a Float64, then that works. OK, and we can also do the same thing for Boolean values.
Let's say we have the string true, and the type is going to be specified as Bool. And as you can see, indeed, we get a Bool value. OK, now string values can be very useful inside of a print statement.
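The conversions in both directions, with string and with parse, can be sketched as follows. The variable names are assumptions for this example; the values follow the ones mentioned in the lecture.

```julia
# Converting values to strings with string(...)
s_int   = string(1)             # "1"
s_big   = string(big(2)^128)    # the decimal digits of 2^128
s_float = string(3.14)          # "3.14"
s_bool  = string(-0.0 == 0.0)   # "true"

# Converting strings to values with parse(Type, str)
n = parse(Int64, "1")
x = parse(Float64, "1.0")
b = parse(Bool, "true")

# The string "1.0" is not a valid Int64 literal, so parse throws an ArgumentError.
int_parse_failed = try
    parse(Int64, "1.0")
    false
catch err
    err isa ArgumentError
end
```

Wrapping the failing call in try and catch is only done here so that the sketch runs to the end; in a notebook cell the error would simply be displayed.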
So here we're going to take a look at an example where we include a string value in the print statement to display what is being calculated and the result of the calculation on the same line. So let's say I have two variables. One of them is called x, which has the value of 3. And then the other one is y, which has the value of, say, 1.
One thing we can do is say println, and then between quotation marks we define a string which says x + y =. And then we add a delimiter, which in this case is a comma, and after it we write the expression that we're trying to calculate, which is x + y.
OK, so we can see here this is a very clean way to see the result of the calculation as well as what's being calculated in the first place. So x + y = 4; indeed, 3 plus 1 equals 4. So the comma between the string, the first part here, and the algebraic expression x + y, which is what's actually being calculated, is a delimiter. And we can have as many strings and algebraic expressions as we want. So for example, we can take the same line here and separate by another comma, where we say, for example, x - y.
Or maybe we can say and x - y =. And then we separate again by a comma, the delimiter, and we say x - y.
So here we have x + y = 4. Maybe we can add another space here so it reads cleanly. So x + y = 4 and x - y = 2. OK, maybe just to be consistent, we can also remove this space from here.
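A minimal sketch of this pattern, assuming the exact spacing inside the strings (the transcript does not show it):

```julia
x = 3
y = 1

# Strings and expressions are separated by commas and printed one after another.
println("x + y = ", x + y, " and x - y = ", x - y)

# string(...) joins its arguments in the same way, which makes the result easy to inspect.
message = string("x + y = ", x + y, " and x - y = ", x - y)
```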
All right. So now we're on to the last data type we want to talk about, which is vectors. A vector is a data container that may contain more than one value, of any data type. Vectors are also often referred to as arrays, and in many programming languages, what is known in Julia as a vector is called an array.
In Julia, a vector is a special case of the data type Array. We will discuss the Julia data type Array later on in another lecture. A vector is written as a comma-separated list of values inside two square brackets used as delimiters, one for opening and one for closing. So the simplest example of a vector is actually an empty vector, or an empty array, where we just type in one opening and one closing square bracket.
And then we hit execute and we see that we've successfully defined an empty vector. As is the case with all the other data types, you can also assign it to a variable. So here's example_vector, which is going to contain three Int64 values: 1, 2 and 3.
And when we execute the cell, you can see that it says 3-element Vector{Int64}. Three elements because it contains three elements, and Int64 because all of them are of type Int64.
This is not to say that the elements all have to be of the same data type. This is just how we defined this one. And notice that we open the list with the left square bracket, and then we close it with the right square bracket.
And like I mentioned, the elements don't have to be of the same type. So we can have 1, we can have 3.14, we can have example_vector, and then we can have, let's say, a string value that's going to be the fourth element.
And then we can also have an empty array. We execute the cell, and we have a 5-element Vector{Any}. That's the type of the vector, because it can contain values of different data types.
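The vector definitions above can be sketched like this. The names empty_vector and mixed_vector are assumptions for this example, and example_vector is written with an underscore on the assumption that the lecture follows the same naming style as my_vector.

```julia
empty_vector = []                  # an empty vector
example_vector = [1, 2, 3]         # three Int64 values

# Elements of different types: an integer, a float, a vector, a string, an empty vector.
mixed_vector = [1, 3.14, example_vector, "hello", []]

println(typeof(example_vector))
println(typeof(mixed_vector))
```

The element type shown for example_vector assumes a 64-bit Julia installation, where whole-number literals are Int64 by default.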
The first one is 1. The second one is 3.14. The third is example_vector, which we defined before, then a string called hello, and then an empty array. One important piece of syntax whenever we talk about arrays is the dot syntax.
So for every binary operator, take for example the addition operator, there's a corresponding dot operator, which is written as a dot followed by the operator that we are referring to. This dot syntax is automatically defined to perform, or broadcast, the operation element by element on vectors.
So here's an example. We have two vectors, one is called data1 and the other one is called data2. They contain the numbers from 1 to 3 for the first one, and 3 to 1 in reverse order for the second one. One way to, let's say, add a number to all of the elements in data1 or data2 by writing just one line of code is to use this dot syntax. So in the first set of examples that we have here, to all of the elements in data1, we're going to add 2.
We're going to subtract 2. We're going to multiply them by 2. We're going to divide them by 2, and then we're going to raise them to the power of 2. And just in the interest of time, we're going to use the print function to display them all together whenever we run the cell.
So in the first line, we have data1. Well, the algebraic expression is on the right side. The string on the left here is just to display what's going to be executed. So we can just focus on this part here. So we have data1, that's the start of the algebraic expression.
And then a dot to indicate to Julia that we want to broadcast whatever operation we're going to perform to all of the elements in the array. Dot plus 2, meaning that we're going to add 2 to all of the elements in data1. Similarly, we're going to subtract 2 from all of the elements in data1.
And then we're going to multiply them by 2, divide them by 2, and raise them to the power of 2. This is called scalar addition, subtraction, multiplication, division and exponentiation. Scalar, because there's only one number on the right here. OK, so let's run the cell and see what happens.
So data1 is not defined, so we need to run this cell first and then we rerun the other cell again. And as you can see here, we have data1 plus 2. That's just the string, so it has no implication on what's being calculated on the right here. But if we look at what's being calculated, we know data1 contains 1, 2, 3.
And if we add 2 to all of the elements, we get 3, 4 and 5, which is indeed the case here. If we subtract 2 from all of the elements in data1, we get -1, 0 and 1, which is indeed the case here. If we multiply each one of them by 2, we get 2, 4, 6. Indeed, that's the case here.
If we divide each one by 2, we get one half, 1, and three over two or 1.5, which is indeed the case. And then the last one is just to raise all of them to the power of 2. So we have 1, 4 and 9 respectively.
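The scalar operations just walked through can be sketched as follows. The result variable names are assumptions added so the values can be inspected; the lecture cell prints the results directly.

```julia
data1 = [1, 2, 3]

# The dot before each operator broadcasts the scalar 2 over every element.
added      = data1 .+ 2
subtracted = data1 .- 2
multiplied = data1 .* 2
divided    = data1 ./ 2   # division produces floating point numbers
squared    = data1 .^ 2

println("data1 .+ 2 = ", added)
println("data1 ./ 2 = ", divided)
```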
OK, so what we've done now is broadcast the same operation, with one number, to all of the elements in data1. But we can also use this dot operator to perform an operation element-wise on two vectors, meaning that I want to perform the operation on the first element of data1 and the first element of data2.
The same for the second element of data1 and the second element of data2, and so on. And the way to do this is very similar, except now instead of having 2 on the right here, we just have the second vector.
This is called element-wise vector addition, subtraction, multiplication, division and exponentiation, because we're doing the operation for all the elements in the two vectors. Again, we're doing that element by element. So let's run this cell here.
We can see that we have data1 plus data2 equals 4, 4, 4. Recall that data1 is 1, 2, 3 and data2 contains the same numbers, except that they're in reverse order: 3, 2, 1. So when we add 1 plus 3, we get 4.
2 plus 2, we get 4. And then 3 plus 1, we also get 4. And if you do the algebra for the rest, you'll see that indeed the results are as expected. But the idea is that in the first set of examples, we broadcast only one number to everything, that is, we operate on all of the elements in the vector with one number, whereas in the other one, we use the elements of another vector and we do that element-wise. So the 1 with the 3, the 2 with the 2, and the 3 with the 1. We can in fact use this dot syntax with any built-in or user-defined function.
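The element-wise operations on two vectors can be sketched in the same way. As before, the result variable names are assumptions for this example.

```julia
data1 = [1, 2, 3]
data2 = [3, 2, 1]

# With a vector on both sides, the operation pairs up elements by position.
sums        = data1 .+ data2
differences = data1 .- data2
products    = data1 .* data2
quotients   = data1 ./ data2
powers      = data1 .^ data2

println("data1 .+ data2 = ", sums)
```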
And we can do this similarly to how we did it for the scalar and the element-wise operations. So in this example, we're going to find the square root of all of the elements inside this array by just typing sqrt. That's the name of the function. So if you recall, if we want to calculate the square root of any value, we use the built-in function sqrt, and here it gives us the square root of 4.
But now we want to use sqrt with the dot syntax by applying it to all of the elements in this vector: 4, 9 and 16. If you execute this cell, you'll see that it says 3-element Vector{Float64}, with 2.0, 3.0 and 4.0. 2 is actually the square root of 4.
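Broadcasting a function with the dot syntax can be sketched as follows. The function double_plus_one is a hypothetical user-defined function made up for this example; it is not the function from lecture one.

```julia
# Built-in function: the dot goes between the function name and the parenthesis.
roots = sqrt.([4, 9, 16])

# Hypothetical user-defined function, broadcast in exactly the same way.
double_plus_one(x) = 2x + 1
transformed = double_plus_one.([1, 2, 3])

println(roots)
println(transformed)
```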
3 is the square root of 9 and 4 is the square root of 16. And we've done this by just putting the dot between the function name and the opening parenthesis that delimits the arguments of the function. We can also do the same for the debit function, the user-defined function that we defined in the previous lecture, where we calculated the compound interest after t periods. So here we have the function definition. It's a copy-paste from lecture one. And then periods here holds the different periods for which we're trying to calculate the amount that we owe the bank after t years. So if we run the first cell just to define the parameters and then we run the second one, you will see that the debit function has been broadcast to all of the elements in the periods vector.
All right. So as I mentioned at the beginning, both strings and vectors are data containers, and therefore they share several characteristics. Values inside of vectors and strings are stored sequentially, and every value is associated with a position. These positions are identified by numbers referred to as indices, which is the plural of index, and values can be retrieved using their index, i.e. the position number.
To retrieve a value from a string or a vector, we pass the index of the position to the bracket operator. So let's see how that works in an example. Here we have a vector called my_vector, which has six elements. And then we have a string, my_string, that says Hello, World!
Now, if you want to access one of the elements of my_vector or my_string, we can do this by passing the index, the number corresponding to the position of the element we are trying to access, to this bracket operator.
So let's say, for example, you want to retrieve the first element of my_vector, which in this case is going to be 6. You can just type in my_vector, that is, my underscore vector, and then between the brackets we type in the number corresponding to the position.
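Retrieving a value with the bracket operator can be sketched as follows. The contents of my_vector are an assumption inferred from the values mentioned in the lecture (six elements, starting with 6), and the result variable names are made up for this example.

```julia
# Assumed contents: six elements counting down from 6.
my_vector = [6, 5, 4, 3, 2, 1]
my_string = "Hello, World!"

# Positions are counted from 1 in Julia.
first_element  = my_vector[1]
fourth_element = my_vector[4]

# Indexing a string gives back a single character (a Char).
first_char = my_string[1]

println(first_element, " ", fourth_element, " ", first_char)
```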
So if we just type that, we get 6, because it's in the first position. Similarly, we can do the same thing if we want to access the element in the fourth position, which in this case is going to be 3.
And whenever we want to access the last element, obviously we know that here in this case it's 1, but if we don't know what the last element is, we can just type in end, and end gives us that last element directly. So my_vector with end between the brackets is going to correspond to 1.
And the same thing goes for my_string. If we type in my_string and then 1 between the brackets, we know that the string is actually Hello, World! and Hello starts with a capital H, and as you can see here, H is what's being output. Similarly, you can also do my_string with 4, and you get l, because that is the fourth character in the string that we've defined.
We can in fact also do this by performing algebraic calculations. Let's say, for example, minus 10 plus 14, which is 4, and therefore we get the same thing. And we can do this not only with numbers but also with the end that we've used to access the last element. So take end minus 1 on my_string: we know that the last element here is the exclamation mark, we subtract 1, and that's why we get d, okay.
All right, so instead of a single number we can also pass a range, and this is known as slicing, i.e. we take a slice from the vector or the string that we're
trying to access. So let's say, for example, we want to access all of the elements from the second to the fourth position of the vector that we have. We do this by passing a range. Now, a range is defined with a colon, so for example the range 2 colon 4 starts from 2 and goes all the way to 4. That means that we want to start from the second position and go until the fourth position. So I type in my_vector here, open the bracket, type the range, and close the bracket, and if we run this you see 5, 4, 3. These are the elements from position 2 to position 4.
And the same is true for accessing a string. We can do this now, let's say, with the range from 2 to end minus 1. And as you can see here, we've basically omitted the first and the last character of the string, because we've accessed everything from 2 all the way until end minus 1.
We can also obtain the length of a vector or string using a function called length. So if we call length and pass an empty array to it, it tells us that the length of this vector is 0. But if we ask for the length of hello, for example, we know that hello has 5 characters, and therefore the length is going to be 5.
So one of the key differences when it comes to strings versus vectors is that vectors are mutable; however, strings are
immutable. So if we try to change the values contained in a vector, we can do that without any problems, but if we try to do the same thing with a string, we're going to get an error. So let's take a look at that using an example, where we have two words. The first one we're going to call word1, and it's going to be just the word Hello followed by a comma and a space. And then the second word, word2, is going to be World, but instead of spelling the first letter correctly, we're going to spell it intentionally incorrectly, with a V instead of a W, and then we have an exclamation mark at the end.
So that's the definition of the two variables. Suppose now we want to mutate word2 such that we change its first letter to the correct one. Instead of Vorld, we want it to be World. So in word2, we want to access the first element and set it to the character W, right? That's what we intend to do. However, if we do that, we're going to get an error that says no method matching setindex!, and therefore we know that this is not possible because, as we've just mentioned, strings are immutable. This is not necessarily the case for vectors
or for arrays. So say we have a vector called vector_we_mutate, having two elements, 3 and 10. And maybe we can do the reassignment in a separate cell, just to see that the vector has indeed already been defined. So for vector_we_mutate, say we want to set the value of the element in the second position: instead of 10 we want that to be 4. And then we just print vector_we_mutate. You'll see here that Julia does that without any problems, okay?
All right, so although strings are immutable, one thing we can do is use something called string concatenation or string interpolation. By string concatenation we mean that we
can concatenate two strings using the star operator. So here we have the two strings, Hello followed by a comma and a space, and then Vorld with an exclamation mark. We can define a new variable, or we don't have to define a new variable, but we can concatenate these two to each other: we write word1, multiplied by W, multiplied by word2 sliced with the range from 2 all the way to the end. So what we're doing here is that we're only taking the slice of word2 that starts from the second position. In other words, we're getting rid of the V, and then we're concatenating the W to the start of the slice that we took from the second word, and then we also concatenate that to the first word. And when we execute the cell, we can see that we get the correct expression that we're trying to say, which is Hello, World!
Okay, now here we have three variables, called part1, part2 and part3. The first one says We have said, then an escaped quotation mark, Hello, World!, another escaped quotation mark, and then a space. Then part2 is string of 3, and part3 says times in this lecture. If we execute this cell, we can see that we have said Hello, World! three times in this lecture. Okay, and here we've assigned them to a variable, and here we printed them as part of a print statement.
Okay, and in the last cell here, we can see that we can also interpolate into a string
using the dollar operator. So in this example, we have two things to interpolate: one of them is the variable greeting, and the other one is going to be string of 4, where 4 is a numerical value. We have this string already, and what we want to do is interpolate inside of it. So the string reads: Here is the fourth time, colon, dollar sign greeting. And if we execute the cell, we can see that it says Here is the 4th time: Hello, World! The 4 here is coming from actually making a string of the number 4. So we pass the number 4 to the string function to convert it into a string, and then we put it between parentheses after the dollar sign, to indicate to Julia that the expression being interpolated is whatever is between those two parentheses. And then th is part of the actual string which we're interpolating into. And then greeting here is the variable that we defined earlier by taking a slice from word2 and concatenating it with word1.
There is a similar function for working with arrays, called the push! function, which is very similar and
can be used to insert one or more items at the end of a vector. So we have a vector, vector_we_push_into, which contains two elements, 3 and 4. And the way we push an element to the end of this vector is by typing push!, and then the first argument is going to be vector_we_push_into, then a comma, and then the number that we want to push. So if we run the cell, we can see that we have 3, 4, 7, because we pushed 7 into this vector. Okay, not only can we push one number, we can in fact push multiple numbers. And as you can see, if we run this other cell, we have 3, 4, 7, 11, 18, 29, because these are the numbers that we've pushed, and all of them are separated by commas.
Just a quick note: there's a stylistic convention in the Julia language, which is to mark any function that changes, or mutates, at least one of its arguments with an exclamation mark, and this is usually put as the last character in its name. So here we have the function called push!. The last character of its name is an exclamation mark, and that's because push! mutates at least one of its arguments, in this case the vector we push into.
All right, so just to
summarize, we've looked at several, and in particular six, new data types. Int64, which stores positive and negative whole numbers within a limited range. BigInt, which overcomes the limitation of storing a limited range of whole numbers, positive and negative, and in particular can store those outside of the range of Int64. We also looked at Float64, which stores positive and negative numbers that potentially have decimal places. We also looked at Bool type values, which store the two possible logical Boolean values, true and false. We also looked at String, which is a sequence of characters for representing text. And finally we looked at Vector, which is a data container that may contain more than one value.
Looking forward, we're going to see how we can use these different data types in a meaningful context, and we're also going to look at the other parts that comprise the definition of a program, namely the conditional execution and repetition parts of a program. That's all for this lecture, and I'll see you in the next video.
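As a recap of the second half of the lecture, the following sketch collects the end keyword, slicing, length, mutability, concatenation, interpolation and push! in one runnable cell. Variable names follow the ones spoken in the lecture where they can be inferred; the contents of my_vector and the names of the result variables are assumptions.

```julia
my_vector = [6, 5, 4, 3, 2, 1]   # assumed contents, inferred from the lecture
my_string = "Hello, World!"

# end refers to the last position, and index arithmetic is allowed.
last_element = my_vector[end]
fourth_char = my_string[-10 + 14]
second_to_last_char = my_string[end - 1]

# Slicing with a range start:stop
vector_slice = my_vector[2:4]
string_slice = my_string[2:end-1]

# length works on both vectors and strings.
empty_length = length([])
hello_length = length("hello")

# Vectors are mutable.
vector_we_mutate = [3, 10]
vector_we_mutate[2] = 4

# Strings are immutable: assigning to a position raises a MethodError.
word1 = "Hello, "
word2 = "Vorld!"
string_is_immutable = try
    word2[1] = 'W'
    false
catch err
    err isa MethodError
end

# Concatenation with * builds a new string instead of changing word2.
greeting = word1 * "W" * word2[2:end]

# Interpolation with $ and $(...)
sentence = "Here is the $(string(4))th time: $greeting"

# push! appends one or more values and mutates its first argument.
vector_we_push_into = [3, 4]
push!(vector_we_push_into, 7)
push!(vector_we_push_into, 11, 18, 29)

println(sentence)
println(vector_we_push_into)
```

The try and catch around the string assignment is only there so that the sketch runs to completion; in the lecture the error is simply shown in the cell output.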