Introduction to Programming for Business Analytics - Lecture 11: Advanced Data Types and Libraries

Formal Metadata

Title: Introduction to Programming for Business Analytics - Lecture 11: Advanced Data Types and Libraries
Number of Parts: 22
License: CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Transcript: English (auto-generated)
Hi, welcome everyone to the 11th lecture of the Introduction to Programming for Business Analytics class. In today's lecture, we're going to talk about advanced data types and libraries in Python.
In particular, the outline we have for today's lecture is that first, we're going to talk about tuples, dictionaries, classes, and objects, and then we're going to talk about libraries and modules. In particular, we're going to talk about the random module and NumPy, which is a library
for creating arrays and doing matrix operations. We're also going to talk about pandas, which is a library for data frames, and Matplotlib, which is a library for plotting and data visualization. All right, so let's go ahead and get started. First, we're going to talk about advanced data types.
In particular, we're going to start with tuples. Similarly to Julia, tuples in Python have the following properties: syntactically, a tuple is a comma-separated list of values that can be of any type, and the values are commonly enclosed in parentheses.
So we're going to use the same example that we had for the Julia part. First, we're going to define a tuple, and we're going to call it my_tuple, that has the values a, b, c, d, and finally e, so just the first five letters in
the English alphabet. And then we execute this. So now we have successfully defined a tuple. We can check that by asking for the type of my_tuple.
And as you can see here, Python indeed says that this is a tuple. Similarly to Julia, tuples in Python are also indexed by integers. So for example, if I print my_tuple and then, between square brackets,
I type in zero, I access the first element. That's the one difference from Julia, of course, because Python uses zero-based indexing. I can do the same thing with one, which gives me the second element, and with minus one, which gives me the last element. I can also access a range.
So for example, I can do two colon four. And when I execute the cell, as you can see here, the first three outputs are a, b, and e, and for the range we get two elements, which are c and d. Again, similarly to Julia, tuples are immutable.
So if I try to assign a new value, say f, to the element of my_tuple whose index is two, I'm going to get an error saying that a 'tuple' object does not support item assignment.
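As a minimal sketch of the tuple operations described above: the name my_tuple follows the lecture, while the exact contents of the notebook cells are assumed.

```python
# Define a tuple holding the first five letters of the alphabet.
my_tuple = ("a", "b", "c", "d", "e")
print(type(my_tuple))

# Indexing is zero-based; negative indices count from the end.
print(my_tuple[0])    # a
print(my_tuple[1])    # b
print(my_tuple[-1])   # e

# A slice 2:4 includes index 2 and excludes index 4.
print(my_tuple[2:4])  # ('c', 'd')

# Tuples are immutable, so item assignment raises a TypeError.
try:
    my_tuple[2] = "f"
except TypeError as err:
    print("TypeError:", err)
```

The try/except is only there so that the sketch runs to the end; in the lecture the assignment is executed directly and the error is shown in the cell output.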
Finally, we know that tuples store the output of fruitful functions that return multiple values. This was the case in Julia, and it's also the case in Python. So for example, here, we're going to use the function divmod, which gives us the quotient
and the remainder of dividing one number by another. The first return value I'm going to assign to the variable q, which in this case is going to be the quotient, and the second return value I'm going to assign to the variable r, which is going to be the remainder. When I call divmod and give it seven and three, I know that the quotient
of dividing seven by three is going to be two and the remainder is going to be one. So if I print q here, and then in the next line I print r, and I execute the cell, as
you can see here, the quotient is two and the remainder is one. All right, so next we want to talk about dictionaries. So syntactically, dictionaries in Python are defined using curly brackets. This is different from Julia, because in Julia, we use that dict function and then followed
by the two delimiters, which are empty parentheses without passing any argument to them. However, similarly to Julia, we can initialize a dictionary to be empty and add key value pairs to it one by one. So let's first define an empty dictionary, and we're going to also use that same example
we used for the Julia part, where we define a dictionary that translates from English to German, so I'll call it eng2de again, and here I'll define it with empty curly brackets. That's the first difference. And then I'm going to add key-value pairs to it one
by one. So for the first one I type eng2de, and between square brackets I give it the key one, and then I assign to it the value eins.
And just to see the dictionary, we can already print how eng2de looks at this point. And we can do the same again by typing eng2de, and between the square brackets we create the second key, which is two, and then we assign
it the value zwei, so Z-W-E-I. And again, we can print the dictionary at this point. So now let's execute the cell. As you can see in the first output here, we only have the first key-value pair, but
in the second output we have the second key-value pair as well. Now you might have noticed already that the key and the value of each pair in a dictionary
are coupled with a colon rather than the => symbol (an equals sign followed by a greater-than sign) used in Julia, and that, similarly to Julia, key-value pairs are separated by a comma. These are the next two bullet points that I wanted to discuss: we can also
initialize a dictionary with multiple key-value pairs. In particular, differently from Julia, Python uses a colon between every key and the associated value. And similarly to Julia, Python uses a comma between every key-value pair.
So if I were to initialize that eng2de dictionary with all its elements up front, I need to keep in mind two major differences compared to Julia. The first one is how we define the dictionary in the first place, which is using
curly brackets, and the second is that we couple keys and values using a colon. To separate key-value pairs, we just use a comma, similarly to Julia. So let's do that. I'll type in here eng2de equals, followed by the curly brackets.
And then here, I'll create the first key-value pair, which is one, a colon, and eins. Then a comma, and we have two, a colon, and zwei.
And then we have another comma and the third key-value pair, which is three and drei. So here, I have a string for all of them. Now, I can directly execute the cell, but it doesn't show the value, so maybe we
can add a print statement here that displays it. So print eng2de, and as you can see, we have all three key-value pairs. Similarly to Julia, a value in a dictionary is accessed by passing its respective key
between square brackets. So if I say eng2de, and then between square brackets, I type in two, I get zwei, and if we try to access a dictionary using a non-existing key, we're going to get an exception. So if I say eng2de, and I pass in as a key four, because we have not defined that, we're
going to get an exception. All right, so the last thing we wanted to talk about regarding dictionaries is that if you want to get a collection of elements with all the keys or the values of that dictionary, we can use the keys or values method respectively.
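The dictionary operations described so far can be sketched as follows. The name eng2de and the string keys and values follow the lecture; the exact cell layout is assumed.

```python
# Start with an empty dictionary and add key-value pairs one by one.
eng2de = {}
eng2de["one"] = "eins"
print(eng2de)
eng2de["two"] = "zwei"
print(eng2de)

# Alternatively, initialize all pairs up front:
# a colon couples key and value, a comma separates the pairs.
eng2de = {"one": "eins", "two": "zwei", "three": "drei"}
print(eng2de)

# Access a value by passing its key between square brackets.
print(eng2de["two"])  # zwei

# A key that does not exist raises a KeyError.
try:
    eng2de["four"]
except KeyError as err:
    print("KeyError:", err)

# keys() and values() return view objects; list() turns them into lists.
print(list(eng2de.keys()))
print(list(eng2de.values()))
```

Printing eng2de.keys() directly shows a dict_keys view rather than a plain list, which matches the "dictionary keys" output mentioned in the lecture.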
So if I want to get a list that contains all the keys in that dictionary, we can type eng2de.keys, and then open empty parentheses.
And if I want to do the same thing, but instead of keys, I want to get the values, I can do the same thing, but replacing the keys with values. So I can execute the cell, and as you can see here, this is the dictionary keys.
And here we have all the keys in the dictionary that we've defined as eng2de. And similarly, we have all the values in the second output. All right, so we've seen how we can define our own data types in the Julia part. And now we want to do the same thing in Python.
In particular, we're going to talk about classes and objects. A user-defined type in Python is called a class. So unlike Julia, where we refer to a user-defined type as a struct, here we refer to it as a class. So how do we define a class?
A class definition looks like this. We're going to use the same example that we use in the Julia part where we define an object or a data type that captures the time of the day that has hour, minute, and second. So let's break down this definition. So defining a class named myTime creates a class object.
That's what we've just defined in this previous cell here, which I can already execute. And what does this definition consist of? It consists of mainly two things. The first is the header, which indicates that the new class is called myTime. So the keyword class here declares that we want to define a class, the name that follows
specifies the name of that class, and then finally we have the colon, which is just part of the header syntax. The second is this text, which is embedded in the string that we define here.
The text between these triple double quotation marks is called a docstring. It is actually not necessary for defining the class. However, it is very good practice to write such a docstring because it explains
the context or the content of the class we're trying to define. So in principle, we could have defined this class without writing this explanation between triple double quotation marks, although the body would then need some other statement, such as pass. But we
include it to say that this class represents the time of day and that it has the following attributes: hour, minute, and second.
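A minimal sketch of such a class definition is shown here. The spelling MyTime and the exact docstring wording are assumptions; the transcript only gives the spoken name and the three attributes.

```python
class MyTime:
    """Represents the time of day.

    attributes: hour, minute, second
    """


# The docstring is stored on the class and can be inspected.
print(MyTime.__doc__)
```

In this sketch the docstring is the only statement in the class body, which is enough for Python to accept the definition.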
So let's talk about instantiation, which is how we can create an instance of this time object, or this myTime object that we've just defined. So to create an instance of myTime, we call myTime as if it was a function without passing any argument to it.
So it's actually pretty simple. So we just say time equals myTime. And the first one here is the name of the variable. And here we are using the name of the class. Okay, so it's called myTime. And then between parentheses, we don't write anything and we can just execute the cell.
And if I ask Python what the type of time is, it says that this is an object of the class myTime that we've just defined. All right, so clearly this does not give us enough to work with.
We still need to specify the hour, the minute, and the second of that time object. To do this, we can assign values to the instance using dot notation as follows. If I want to specify the hour of that object, I type in the name of the instance that
we've just defined, which is time, then dot hour, and here I can specify the hour. So for example, I set it to be equal to 11. That's the same example we used in the Julia part again. And we can specify the minute and the second in the same way.
Now I can execute the cell. And if I want to access the values from that instance, we can also use the dot notation.
So if I ask Python to print time dot hour, and then in the next line, I say print time dot minute, and then in the third line, I say print time dot second.
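Instantiation and dot notation, as described above, might look like this. The class spelling MyTime and the minute value 59 are assumptions; the hour 11 and the 30 seconds are taken from the lecture.

```python
class MyTime:
    """Represents the time of day.

    attributes: hour, minute, second
    """


# Call the class like a function, without arguments, to create an instance.
time = MyTime()
print(type(time))

# Assign attributes to the instance using dot notation.
time.hour = 11
time.minute = 59   # assumed value, not stated in the transcript
time.second = 30

# Read the attributes back using the same notation.
print(time.hour)
print(time.minute)
print(time.second)
```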
As you can see here, we are able to print the values that we've just assigned to the instance we created called time. Okay, so one of the biggest differences when it comes to classes and structs is that by
default, objects are mutable, and we can change the state of an object by making an assignment to one of its attributes. So in the Julia part, we had to explicitly declare that it was a mutable struct, but here by default, objects in Python are mutable.
So let's take a look at an example, where we're going to print the old time, which is in this case, the one we defined above. Again, it's the same example from the Julia part. So we want to access time dot hour.
Then we have a comma, a colon between quotation marks, another comma, time.minute, a comma, another colon between quotation marks, and then finally time.second. Okay, that is the old time, which we defined above.
And then in the next line, we're immediately going to reassign time.second to 59. And in the line after that, we print the same thing, with the only exception that
we change the label from old time to new time. Okay, so let's execute the cell. As you can see here, the old time has 30 seconds, whereas the new time has 59.
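The mutation step can be sketched as follows, under the same assumptions as before (class spelling MyTime, minute value 59, and the wording of the printed labels are not confirmed by the transcript).

```python
class MyTime:
    """Represents the time of day.

    attributes: hour, minute, second
    """


time = MyTime()
time.hour = 11
time.minute = 59   # assumed value
time.second = 30

print("Old time:", time.hour, ":", time.minute, ":", time.second)

# Objects are mutable by default: reassigning an attribute changes the state.
time.second = 59

print("New time:", time.hour, ":", time.minute, ":", time.second)
```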
So we don't get any errors when we mutate one of the attributes of that object. All right, so another thing we want to talk about regarding classes and objects is the so-called object-oriented features. Python provides these features to make it easier to work with objects.
The first one is the so-called init method, written __init__. The init method, where the word init is just short for initialization, is a special method that allows us to set default values for the parameters of the object, such that
when we instantiate an object, any argument that is passed overrides the corresponding default value, and any missing parameter is replaced with the default value we set in the init method. So I'll create this blueprint, if you will, and set default values for the object
we've defined and previously called my time, maybe I'll modify the name slightly. And I will use this init method to set default values for time. So let's do that together and explain it as we go along. So the first thing I'll write just class. Again, this is just the way we define the object that's sufficient to define the class,
excuse me, not the object. And inside this class, again, we'll just type in or use this doc string to explain what this class is providing. In this case, it represents the time of day.
Maybe in the next line here, we just type in the attributes of this class are going to be hour, minute, and second.
So how do we use this init method? We type in def, as if we're defining a function, and then underscore, underscore, init, underscore, underscore, and then between parentheses we type some parameters.
So you can think of the init method as just a function that we're defining. This function is going to have three parameters, because we want to create three default values for the time of day. The first one is hour, and we immediately set that equal to zero, then a comma.
We do the same thing for minute and for second, so we immediately set those to zero as well. Those are just going to be the default values. You can choose whatever default values you want, but I'm going to use zero in my case here. And then there is this other parameter, which comes first and which we call self.
So what is self? Self represents the instance of the class itself. Every method that we define in this class is always going to be a function of this parameter we call self, plus all the remaining parameters
that we want to use inside the function. So self here refers to an instance of the class. Of course, since we are defining a method, or a function if you will, we have to have a colon at the end.
So that's just the header. Now we want to take these hour, minute, and second parameters and store them as attributes of any time instance that we create using this class. So first we say self.hour equals hour.
Here the attribute is hour, in the same way that we earlier assigned an attribute hour of 11. But now we're assigning it the value of this parameter here. So this one is a parameter and this one is an attribute. Do not confuse the two. The parameter could have had any other name.
But to simplify things, I'll just use the same name here, so hour and hour. And we do the same thing here: self.minute equals minute, where again the first one refers to the attribute and the second one refers to the parameter.
And last but not least, we have self.second, which equals the parameter second. Okay. So this completes the definition of the class Time, which includes the init method that
initializes the object with the default values zero, zero, zero. To see that this is the case, first, if I say time equals Time with empty parentheses, I do not pass any argument to the constructor
that instantiates the class. And in the next line I print the result, where we can use an f-string: between curly brackets we have time.hour, then a colon, then
time.minute between curly brackets, another colon, and time.second between curly brackets. When I print here, as you can see, it says zero, zero, zero, because that's how we set the
default values in the init method. If I change one of the defaults to ten, for example, and I print, you'll see that it says ten, and so on. All right. So as I mentioned, if an argument is passed, it's going to override the default value.
Otherwise the missing parameters are going to be replaced with the default values. So here all of the parameters were missing, and therefore all of them were replaced by the default values. But let's actually try to pass some arguments and see whether Python
replaces only the missing ones. So if I say, for example, time equals Time and I only specify one argument, eleven, and then in the next line I just copy and paste the print statement so I don't have to type it again.
As you can see here, now it says eleven, zero, zero. Mind you, I changed the default back to zero, so it's no longer ten. So if I only specify one argument, Python takes it to be the first parameter, which is hour, and all of the remaining parameters are set
to the default values, which are in this case zero and zero. If I do the same, but instead of passing one argument I pass two, say eleven and fifty-nine, and I print, as you can see here, it says eleven, fifty-nine, and zero.
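The init method and the constructor calls described above can be sketched like this. The class name Time and the docstring wording are assumed from context; the defaults and argument values follow the lecture.

```python
class Time:
    """Represents the time of day.

    attributes: hour, minute, second
    """

    def __init__(self, hour=0, minute=0, second=0):
        # Left side: attribute of the instance. Right side: parameter.
        self.hour = hour
        self.minute = minute
        self.second = second


# No arguments: every parameter falls back to its default.
time = Time()
print(f"{time.hour}:{time.minute}:{time.second}")   # 0:0:0

# One argument: it is taken as the first parameter, hour.
time = Time(11)
print(f"{time.hour}:{time.minute}:{time.second}")   # 11:0:0

# Two arguments: hour and minute; second keeps its default.
time = Time(11, 59)
print(f"{time.hour}:{time.minute}:{time.second}")   # 11:59:0
```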
So it interpreted eleven as the hour and fifty-nine as the minute, and because the second was missing, it used the default value, zero. You can do the same again with three arguments. So if I say eleven, fifty-nine, and then thirty, and I execute the cell, you can
see that it returned all the values that we passed to the constructor. All right, so next we want to talk about the str method, written __str__. Before I explain what the str method is, notice how whenever I wanted to print the time, I had to use this f-string, and between every two attributes I added a colon just
to get this clock-like output. But if I were to just type print time directly, the output is basically
gibberish: just the class name and a memory address. I really cannot understand it. The str method is a special method that returns a string representation of an object, such that when we print an object, Python invokes that method. In other words, I want to create a string representation of this time object, such that
whenever I print it, I get something that looks like a clock: directly the value that corresponds to the hour, then a colon, then the minute, a colon, and the second.
So how do I do that? Well, the way to do that is to use the str method to create a string representation of this time object, such that whenever we print it or want to display the value of time, Python invokes this str method.
But there is one thing that we have to be careful with. If you noticed, we had this 11:0:0, which is not exactly what a clock usually looks like, because we expect to see 00 here and 00 here. So two digits in every position.
To ensure that the time object is printed in the appropriate hour, minute, second format, where every attribute has two digits, we are going to define the string representation using something called a format sequence, which has the syntax %.2d. It prints an integer using at least two digits, including a leading zero
if necessary. Of course, we know that if we put a zero to the left of any number, it does not change its value. It is just there to get the appropriate format: if the attribute has a single digit, a leading zero is added to the left of that number.
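A short illustration of the %.2d format sequence on its own, using Python's percent-style string formatting; the sample values are chosen for illustration only.

```python
# A single-digit integer gets a leading zero.
single = "%.2d" % 7
print(single)     # 07

# A two-digit integer is printed unchanged.
double = "%.2d" % 11
print(double)     # 11

# Several format sequences take their values from a tuple, in order.
clock = "%.2d:%.2d:%.2d" % (11, 0, 0)
print(clock)      # 11:00:00
```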
This is done using format sequences, which have the following syntax. So let's do that together. Here's what I'm going to do first. Maybe I can already reuse all of what we have here.
Notice that we don't actually need to remove this, we can just keep adding as many methods as we want here to this class object, or this class called time. So in addition to this, so this is just what's going to instantiate the object, but now I'm going to define this str method. And again, I'm going to define it in a very similar manner.
So def underscore underscore str underscore underscore, and then between parentheses, I'm always going to have self. For the str method, we actually don't need any parameters, we just need to create a string
representation of that object itself. So here, we're not going to initialize the hours to anything, the minutes to anything, because this is already taken care of by the init method here, but we now just want to create a string representation of this guy. So how do we do that? We simply say return the following, that's going to be the string representation.
So between quotation marks here, we're going to use the format sequence that we referred to above. We have a percent sign, and then we have dot two d. The two here just means that we want to get at least two digits.
This is for the first one, the hour attribute, but clearly we need that for all three of them. And between every two attributes, we have a colon, right? So the first one has to have at least two digits, and the same holds for the second one and for the third one as well.
However, this is still not complete, because we have to tell Python what the values are. The string just takes care of the fact that we need at least two digits in every position, but we still need to specify the values of these attributes.
The way we do that is we type the percent sign after the string, then open parentheses, and say self.hour. What this does is assign the value of this first attribute to the first format sequence we have. Then we have comma self.minute,
comma self.second, so the same thing for the remaining attributes. And this completes the definition of the str method.
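Putting the init and str methods together gives roughly the following class. The class name Time is assumed, and the sample values in the print calls are hypothetical apart from the defaults.

```python
class Time:
    """Represents the time of day.

    attributes: hour, minute, second
    """

    def __init__(self, hour=0, minute=0, second=0):
        self.hour = hour
        self.minute = minute
        self.second = second

    def __str__(self):
        # Each %.2d prints one integer with at least two digits.
        return "%.2d:%.2d:%.2d" % (self.hour, self.minute, self.second)


# print() invokes __str__ on the object.
print(Time())             # 00:00:00
print(Time(9, 5))         # 09:05:00
print(Time(11, 59, 30))   # 11:59:30
```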
All right, so now what happens when I print my time? Here, I've created a couple of examples. In the first one, we instantiate an instance using just the default values, and I directly print the time. Notice that I've just executed the cell, and now that we have this str method, when I print, I not only get that string representation instead of
the gibberish that we had before, but I also get two digits in every position. If I do the same thing for the second example, notice that in the first position I have a single digit and in the second position I have a single digit. If I print this, as you can see here, it added a leading zero to every single-digit attribute.
Finally, we can do the same thing for the last one, where all three of them have double digits. So we don't need any additional leading zeros, and we just get the desired output. All right, so the third and last object-oriented feature we want to talk about is the add method, written __add__.
So the add method is a special method that allows us to specify the behavior of the plus operator on an object, so that when the plus operator is called on two instances of the class, this add method gets invoked.
So say, for example, I have created two instances of this time object: we have start equals Time with 9 and, let's say, 45, and then here we have the duration.
Again, that's the example from the Julia part. We set the duration to be 2 and 22. If I say that I want to print start plus
duration and I execute the cell, I'm going to get an error. This is because Python does not know how to add these two times together, and we need to describe to Python how this can be done. And this is where the add method comes into play.
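The error described above can be reproduced with a class that has only the init and str methods. The class name Time is assumed, and the try/except is added here only so the sketch runs to completion.

```python
class Time:
    """Represents the time of day.

    attributes: hour, minute, second
    """

    def __init__(self, hour=0, minute=0, second=0):
        self.hour = hour
        self.minute = minute
        self.second = second

    def __str__(self):
        return "%.2d:%.2d:%.2d" % (self.hour, self.minute, self.second)


start = Time(9, 45)
duration = Time(2, 22)

# Without an __add__ method, the + operator is not defined for Time objects.
try:
    print(start + duration)
except TypeError as err:
    print("TypeError:", err)
```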
So let's do that together. Again, I'm going to use that same structure that I've used for the last two, which is the one that includes the init method definition as well as the str method definition. And now I'm going to add this add method, okay?
So to define the method, first I'll type def, underscore underscore, add, which is just the name of the method, and again underscore underscore. Then between parentheses, I'm always going to type self, then a comma, and then I give it another parameter, which I'll call other,
for example, which is the object that we're going to add. In the previous example, we added start and duration, so you can think of self as start and other as duration. The plus operator is just a binary operator, so we just need two expressions, one to its
left and one to its right. In this case, we think of self as the one to the left and other as the one to the right. But what does this function look like? Well, first we have to think about how we are going to add two objects of type Time
to each other. So if you remember from the Julia part, we did this trick where we converted everything into seconds for the first time and the second time, we then added them together and then we converted the result back into that time object. So essentially, we're going to do the same thing here.
In other words, first, I'm going to convert this time object here, which has the parameter self, I'm going to convert all of it into seconds, and then I'm going to do the same thing for other, and then I'm going to add the two together, and then the total seconds that I get by adding the two, I'm going to convert that back into a time object, which
I'm going to return. Okay, so let's do that. First, I'll convert the self time object to seconds. To do this, I first want to calculate the minutes. So, to make the distinction, let's say self_minutes equals self.hour multiplied by 60
plus self.minute. So for the instance itself, I multiply the hour attribute by 60 to convert the hours into minutes, and I add self.minute because I still need to add the existing minutes of self, right?
So there are the already existing minutes, and I've converted all the hours into minutes as well. Now I do the same thing for seconds. self_seconds is going to equal all the minutes that we've just calculated, which is self_minutes, multiplied by 60,
plus self.second. I need to do the same thing for other, because I also need to convert everything there into seconds. So I say other_minutes equals other.hour multiplied
by 60 plus other.minute. And in the next line, I say other_seconds equals other_minutes,
which is the one that I've just calculated here, multiplied by 60 to change it into seconds, and then I add the existing seconds, other.second. So far, all these four lines are doing is converting everything in self to seconds
and everything in other to seconds. The next step clearly is to add the two together. So I calculate the total seconds, which is going to equal self_seconds plus other_seconds.
Now that I've calculated the total number of seconds, I just need to convert them back into a time. To do this, first I'm going to initialize a Time object here. Again, I'll call it time, and I create it using the name of the class.
Then I'm going to specify the minute and the second using the divmod function we used earlier, which calculates the quotient and the remainder. So first, when I use divmod and divide the total seconds by 60, I
get the quotient and the remainder. In this case, the quotient is the number of whole minutes in the total number of seconds, which I store in a variable called minutes.
And I can directly assign the remainder to the second. So here, I'm assigning to the second attribute of the time object that I've created the remainder of this division. Then I use that same function again. But this time, I directly assign time.hour and time.minute
to the quotient and the remainder that divmod returns when we divide minutes, which is the one that we've
just calculated, by 60. Finally, we return time. Let's go through it together one more time. First, I convert everything to seconds, for both self and other, and I add the two together. Now I need to convert all of these seconds into a time.
To do this, I instantiate a Time object here. Its second attribute I assign to the remainder when I divide all of these seconds by 60, because I'm now converting everything into minutes. Once I've converted everything into minutes, I'm not sure whether this number of minutes is going
to be less than 60 or not. To take care of that, I use these minutes with the divmod function in the next line, where I divide the minutes by 60.
The remainder can be directly assigned to the minute attribute of the time that we've created here, and the quotient can be directly assigned to the hour attribute. So now we can execute the cell. It executed successfully, and it seems like we don't have any errors.
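The add method as walked through above might look like the following sketch. The variable names self_minutes, self_seconds, other_minutes, other_seconds, and total_seconds are reconstructed from the spoken description and may differ from the notebook.

```python
class Time:
    """Represents the time of day.

    attributes: hour, minute, second
    """

    def __init__(self, hour=0, minute=0, second=0):
        self.hour = hour
        self.minute = minute
        self.second = second

    def __str__(self):
        return "%.2d:%.2d:%.2d" % (self.hour, self.minute, self.second)

    def __add__(self, other):
        # Convert the left operand (self) to seconds.
        self_minutes = self.hour * 60 + self.minute
        self_seconds = self_minutes * 60 + self.second
        # Convert the right operand (other) to seconds.
        other_minutes = other.hour * 60 + other.minute
        other_seconds = other_minutes * 60 + other.second
        total_seconds = self_seconds + other_seconds

        # Convert the total back into a Time object.
        # divmod returns a (quotient, remainder) tuple, unpacked here.
        time = Time()
        minutes, time.second = divmod(total_seconds, 60)
        time.hour, time.minute = divmod(minutes, 60)
        return time


start = Time(9, 45)
duration = Time(2, 22)
print(start + duration)   # 12:07:00
```

Note that this version does not wrap around at 24 hours; the transcript does not mention any such handling.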
And as you can see here, I've already created this example where we have the start at 9:45 and the duration as 2 hours and 22 minutes. The seconds are set to zero here because of the way we defined the init method.
And then when I evaluate start plus duration, indeed, we get a result, in this case 12 hours, 7 minutes, and 0 seconds. All right, so if you look closely at this method here, you'll see that there is a lot
of duplication, in the sense that I'm essentially repeating the same process: I convert everything to seconds here, and I convert everything to seconds there. So there's a lot of repetition. As we know, good practice is to write compact code, and
this can be done by using functions. One way to approach this is to write a function that converts a time object to seconds, that is, to an integer, and another function that converts seconds back into a time object, and to use both of these functions in the add method.
So here's what I'm going to do. I'm going to rewrite this add method. But before I do that, I'm first going to define two more methods. The first one I'm going to call time to int.
This function is going to take any time object and convert it into an integer.
So in this case, this function is just going to be a function of self, because it only depends on the instance that we're trying to convert into an integer. What I'm going to do is exactly the same as I did here, but now I'm just going to embed that into a function. Clearly, I need to be careful with the variable names that I'm using, but it shouldn't be
very difficult to follow. So this is what we did to convert to seconds. We don't need to make the distinction by calling it self here, so that's the first thing we can get rid of. And we need to make sure that we are using minutes here instead of self minutes.
Okay, so that's the first step. We also need a return for that function, and in this case the return value is going to be the seconds, right? Because we want to convert into an int, which in this case is the number of seconds. And in the following step, we're going to define another function that does the second part, or the second step, which is int to time.
And this one is going to be a function of self, as always, and it is also going to be a function of seconds, okay? These are the seconds that we want to convert back into time, okay?
So in this step, I'm essentially going to do what I did here in these last three lines. So let's see how we can do that. So first, I'll instantiate the object.
And then here I have minutes and time.second from divmod, but instead of total seconds, I now have seconds as the parameter, okay? Similarly here, I'll assign the hour attribute and the minute attribute, and I'm just going to use minutes as it is.
But finally, and more importantly, we need to return something. In this case, we need to return the time that we've just created here. So now when we come to this add method, all we need is to say that seconds equals self.timeToInt, with nothing between the parentheses.
So we're going to apply this method to self, in the sense that we want to convert this time into seconds. And the other one that we want to add is going to be the same, so other.timeToInt and
then empty parentheses, okay? So we've converted these two into seconds and added them together, and now we're just going to return self.intToTime, and between the parentheses we have seconds.
Notice how here we don't pass any parameter because here it's only a function of self itself. However, for the second function, the intToTime function is a function of seconds, which is the one that we're converting back into time, and therefore we need to specify here.
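The refactoring described above can be sketched as follows. This is a minimal sketch rather than the lecture's exact cell: the snake_case method names, the default arguments of the init method, and the `__str__` helper are assumptions made for illustration.

```python
class Time:
    """Time of day stored as hour, minute and second."""

    def __init__(self, hour=0, minute=0, second=0):
        # Assumed signature: unspecified parts default to zero.
        self.hour = hour
        self.minute = minute
        self.second = second

    def time_to_int(self):
        """Convert this time into a total number of seconds."""
        minutes = self.hour * 60 + self.minute
        return minutes * 60 + self.second

    def int_to_time(self, seconds):
        """Convert a number of seconds back into a Time object."""
        time = Time()
        # divmod returns (quotient, remainder) in one step.
        minutes, time.second = divmod(seconds, 60)
        time.hour, time.minute = divmod(minutes, 60)
        return time

    def __add__(self, other):
        # Both operands are converted to seconds, added, and converted back.
        seconds = self.time_to_int() + other.time_to_int()
        return self.int_to_time(seconds)

    def __str__(self):
        # Hypothetical helper, only used for printing in this sketch.
        return f"{self.hour:02d}:{self.minute:02d}:{self.second:02d}"


start = Time(9, 45)
duration = Time(2, 22)
total = start + duration
print(total)  # 12:07:00
```

Keeping the two conversions in separate methods means the add method no longer repeats the arithmetic for each operand.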
So now let's execute this class. Maybe we can reevaluate this block of code, and hopefully we get the same result. As you can see, we also get 12:07, which is indeed the desired result. All right, so next we want to talk about libraries and modules.
In the following section, we discuss some essential Python built-in modules, for example the random module, and external libraries, for example NumPy, Pandas, and Matplotlib. There is comprehensive documentation of Python's built-in modules, which can be found
in the official Python documentation for the standard library, as well as a comprehensive list of external packages, which can be found in the Python Package Index, PyPI. All right, so we'll start with random.
So the random library is a built in Python module for generating pseudo random numbers for various distributions. And to start working with random, we need to import it and we do this as follows. So first we type import random as rd.
Now you might be wondering what this rd following import random is. We know that the first part allows us to import the module, but what is this as rd? We typically shorten the imported name to rd for better readability of the code using
the module random. In other words, you can import the module without including this as rd, but including it makes the code more readable, and it is also a very common practice.
So what we're saying here is that we want to import this module under this name. From now on, whenever you say rd, Python takes this as the name of the random module. So we can execute the cell.
So let's see how we can generate random numbers. The most basic form of a random element can be generated using the function random as follows. To generate the most basic version of a random number, we simply type in rd, that's the name we imported the random library as, dot random, that's the name of
the function, and then empty parentheses. We can execute the cell, and as you can see here, we get some floating point number between zero and one. I can do the same thing again, and I keep getting floating point
numbers between zero and one. All right, so if you want to generate a random integer between zero and n, we use the function randrange as follows. Maybe we can first define a variable and assign it the value that
we want as the upper end of the range. So for example, here, I'm just going to use n equals 10. And then we can say rd.randrange, and between parentheses, I will say n plus one. Clearly, we don't have to use a variable here, but I'm just using a variable for completeness.
Now I'm going to execute the cell. As you can see, I get the number nine, and if I execute the cell again, I get three. This is because the randrange function, called with n plus one, gives us a single random integer between zero and n. If I don't have this plus one, it's going to give us a random integer between
zero and n minus one. And so on.
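The two calls described above can be sketched as follows. The variable names x, k and m are chosen here for illustration only.

```python
import random as rd

# A floating point number in the half-open interval [0.0, 1.0).
x = rd.random()

n = 10
# The stop value is exclusive, so n + 1 allows the results 0, 1, ..., n.
k = rd.randrange(n + 1)
# Without the + 1 the largest possible result is n - 1.
m = rd.randrange(n)

print(x, k, m)
```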
More generally speaking, the function randrange returns a randomly selected element from a range that has start, stop, and step as its parameters. For instance, the following example shows how we can use this randrange function to return an even integer between zero and 100.
You can think of this 100 plus one as the n plus one that we've used here. That's basically the stop, right? We've seen how the range function typically defaults the start to zero and the step to one whenever we don't specify them.
But if we specify the start to be zero and the step to be two, we're going to step by two, and this is going to be the stop, right? That's the upper bound. And because we're stepping by two starting from zero, we're always going to get even integers.
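As a short sketch of the start, stop, and step form described above (the variable name even is an assumption):

```python
import random as rd

# start=0, stop=100 + 1 (exclusive), step=2: one of 0, 2, 4, ..., 100.
even = rd.randrange(0, 100 + 1, 2)
print(even)
```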
So if I keep executing the cell, I'm always going to have an even number, as you can see. If you want to generate a random floating point number within a specific range, we use the function uniform as follows. So it's very similar to what we've done with the random and the randrange functions.
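A minimal sketch of a uniform call, using the bounds 2.5 and 10 from the lecture's example (the variable name value is an assumption):

```python
import random as rd

# A floating point number between the two bounds 2.5 and 10.
value = rd.uniform(2.5, 10)
print(value)
```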
In this case, we say rd.uniform, and this is the range from which we want to get the floating point number. Here in this case, I'm using 2.5 and 10. If you keep executing the cell, you're always going to get a number within this range. All right, so all of the random numbers we've looked at so far were single
numbers. But what if we want to create a list of random elements? To generate a random list of elements without replacement, we can use the function sample in the following way. Before we talk about how we use the function, I want to point out that you should pay careful
attention to the words without replacement, which refer to the fact that in the random sample of elements that we're going to get, no two of them are going to be the same. So we're going to have a unique set of elements. You can think of without replacement as if you have some sort of urn that contains
several balls, and every time you draw one of them, you don't put it back. To use this sample function, we simply use the same syntax that we've used in the previous examples. So we say rd.sample and here we specify the range.
So this is the range from which the values are going to be drawn, and here is how many values we want. In this case, you're going to get three random integers between one and nine. It's one to nine because the range here has start as one and stop as 10, and the stop itself is not included.
So if I execute the cell, I'm always going to get three random integers between one and nine. Here in this example, we don't use a range, but rather have a list explicitly,
a list that contains 10, 20, 30, 40, and 50. If I keep executing the cell, I'm always going to get three random elements from the list. Similarly here, we can use it to get not just integers, but also floating point numbers. But before I talk about the next one, I want to show you what happens if we try to use
the sample function to draw more elements than is possible, given that we need all of them to be unique. So if I say 50, for example, or not even 50, say 11, I'm always going to get
an error because I cannot get 11 unique elements that fall within this range. So something has to be changed. And this is what I mean by returning a random set of elements without replacement.
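Sampling without replacement, including the failing case, can be sketched like this. The variable names are assumptions, and the try/except is added here only so that the sketch runs to the end instead of stopping at the error.

```python
import random as rd

# Three distinct integers drawn from 1, 2, ..., 9 (the stop 10 is excluded).
picked = rd.sample(range(1, 10), 3)

# Three distinct elements drawn from an explicit list.
from_list = rd.sample([10, 20, 30, 40, 50], 3)

# Asking for more unique elements than the population holds raises ValueError.
error_raised = False
try:
    rd.sample(range(1, 10), 11)
except ValueError as err:
    error_raised = True
    print("sample failed:", err)

print(picked, from_list)
```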
But to generate a random list of elements with replacement, meaning that the sample we're going to draw could contain multiple copies of the same element, we can use the function choices as follows.
So here we say rd.choices and we specify the range. And here, as you can see, we ask for 10 elements, which is more than the number of integers that fall within this range. If I execute the cell, as you can see here, I have several fours, several twos, and several ones. I can do the same thing again, and I'm always going to get such a result.
And clearly, this can also be used in a context where I specify a list rather than this range function. So here we use these three numbers, 0.5, 1.0, and 1.5, to get 10 random elements from this
list. And as you can see here, we get several duplicates. Moreover, the functions sample and choices do not require the values to be numerical. In the first example here, I define this list that contains strings rather than numbers. And in the second example, I also define another list of strings here.
The first one has the string values red, black, green, blue, white, and yellow, and the second has red, black, and green. Here we use the sample function, and indeed, we get three unique colors. And here we use the choices function and we get 10 random elements from this list.
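Drawing with replacement, and drawing from non-numerical values, might look like the sketch below. The range passed to the first choices call is an assumption, since the transcript does not state it; the lists are the ones named in the lecture.

```python
import random as rd

# Ten draws with replacement; range(1, 5) is an assumed population.
draws = rd.choices(range(1, 5), k=10)

# Ten draws with replacement from an explicit list of floats.
halves = rd.choices([0.5, 1.0, 1.5], k=10)

# Neither function requires numerical values.
colors = ["red", "black", "green", "blue", "white", "yellow"]
three_unique = rd.sample(colors, 3)
ten_with_repeats = rd.choices(["red", "black", "green"], k=10)

print(draws, halves, three_unique, ten_with_repeats)
```

Note that choices takes the number of draws as the keyword argument k, whereas sample accepts it as the second positional argument.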
All right, so we can also generate a random permutation of the elements in a list. To do this, we're going to use a function called shuffle. I'm going to use an example where I create a deck of cards, and then I'm going
to shuffle this deck of cards. You may not be familiar with these string values here, but basically each one of them corresponds to the Unicode representation of one of the suits in a deck of cards. So these are spade, diamond, heart, and club, respectively.
And what I'm going to do is initialize this list, which I'll call deck. Here I'm going to run a for loop that goes through all of the different suits. Of course, we know that every suit has 13 cards. So for every suit, I'm going to get the corresponding cards and append them to this list.
So let's do that. I'll say for s in suits, and for i in range 1 to 14. And I'm going to use the append method; in particular, I'm going to say deck.append.
This is what I want to append to this deck. Maybe I can use an f-string here. I'm going to append this string that has s and i.
Actually, we don't need a comma since we're already using an f-string. Now I can run this loop, and maybe we can also print the deck to see how it looks prior to shuffling. And as you can see here, we have the different types of cards.
Now we can simply apply this shuffle function as follows. So rd.shuffle, with one l. We want to shuffle this deck, and here I'm just going to print the deck. So we have a typo somewhere.
We're missing one f here and we have an extra l here, and indeed, once we execute the cell, as you can see here, we have the deck shuffled.
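The deck example can be sketched as follows. The Unicode escapes are assumed to match the suit symbols used in the lecture cell, and the list name suits is an assumption.

```python
import random as rd

# Unicode symbols for spade, diamond, heart and club.
suits = ["\u2660", "\u2666", "\u2665", "\u2663"]

deck = []
for s in suits:
    for i in range(1, 14):  # 13 cards per suit: 1, 2, ..., 13
        deck.append(f"{s}{i}")

print(deck)       # ordered deck
rd.shuffle(deck)  # shuffles the list in place and returns None
print(deck)       # same 52 cards in random order
```

Because shuffle works in place, there is no need to assign its result to a new variable.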
All right, next we want to talk about NumPy. Consider the following two matrices A and B, and suppose you want to calculate A plus B and A multiplied by B. Before we actually talk about how we can calculate these algebraic operations, I want to talk
about how we can interpret Python's lists as one-dimensional arrays. However, unlike Julia, Python has no built-in support for multi-dimensional arrays and matrix operations. In fact, in Python, applying the binary operators plus and asterisk to lists is possible
in some cases. However, they do not perform linear algebraic operations. And that's why we have to use the NumPy library to perform matrix operations. But
if these binary operators do not perform matrix operations, what do they perform? We have to be careful here because these operators can only be applied in some cases. For example, for the plus operator, this is
only possible when the two operands are lists. And when we apply the plus operator where the two operands are lists, we're going to get a concatenated list, as shown in the following example. In the first line, I'm adding one list to another, where the first one has 1, 2 as
elements and the other has 8, 10 as elements, and in the second line the two lists have 4, 1 and 7, 4 as elements, respectively. If I execute the cell, you'll see that the result is a concatenation of the two lists. So the first line gives 1, 2, 8, 10, and the second line gives 4, 1, 7, 4.
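A small sketch of the concatenation behaviour just described. The split of the second line into [4, 1] and [7, 4] is an assumption based on the rows of the two matrices, and the variable names are chosen here.

```python
# On lists, + concatenates; it does not add element by element.
first = [1, 2] + [8, 10]
second = [4, 1] + [7, 4]

print(first)   # [1, 2, 8, 10]
print(second)  # [4, 1, 7, 4]
```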
And clearly these two lines are completely independent, so do not get them confused. On the other hand, applying the asterisk operator is possible when one of the operands is a list and the other one is a
non-negative integer. The result is a new list that repeats the elements a given number of times, as shown in the following example. So let's see what happens when I multiply a list by a number, and that number has to be a non-negative integer. In the first example here, I multiply this list, which
has one element, 4, by 5. Here I multiply this list that has two elements, 8 and 10, by 3, and so on. So let's execute the cell and see what happens. Well, as you can see, for the first one we have a list that contains 4 five times. That's because we repeated whatever is inside this list five times. And here I have another
list that has 8 and 10 repeated three times, and in the following line I have 4, 1 repeated two times. And here, where I multiply by 0, it just gives an empty list. So the NumPy library provides efficient ways of
creating arrays and manipulating numerical data inside of them, including matrix operations. First we need to install the NumPy library, and we do this using the following command. This command contains import sys, which is a module that we need to import and which allows us to install the NumPy
library. So first we import this module, and then with this module we execute the following command here. Okay, so this is the name of the library we want to install. You can just memorize the remaining
syntax and then just change this name for different libraries that you want to install. So let's execute the cell. As you can see here, it says requirement already satisfied. That's because I already have this package installed. Next, to access the NumPy library and its functions, we need to import it, and we import it as follows. We say import
numpy as np. Again, as you can see here, we have shortened the imported name to np. This is just for better readability of the code using NumPy, and it is a widely adopted convention that I think you should also follow, so that anyone working with your code can easily understand it. All right, once we've
done that, we can go back to our example, which is the matrix addition and multiplication. As you can see, first we need to define the array that contains the matrix A, and here we need to define the array that contains the
matrix B. If you remember, the matrix A has 1, 2, 4, 1 and B has 8, 10, 7, 4, which is exactly what we write here. We do this with a nested list, where each inner list contains a row of the matrix
we're trying to create. So here this is the first row and the second row of A, and the first row and the second row of B. This allows us to define the arrays. Now if you want to calculate A plus B, we simply use the plus operator as we did in the
Julia part, and as you can see here, we have a result that corresponds to A plus B. You can do the calculation yourself and double check that. For matrix multiplication, we use the function called matmul, which just stands for matrix multiplication, and we give it the two
matrices that we're trying to multiply, in this case A and B. If I execute the cell, you can see that we have this result, which you can also double check or compare with the result that you get from the Julia part.
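To contrast the list behaviour with NumPy arrays, here is a minimal sketch using the matrices A and B from the example. It assumes NumPy is installed; the names repeated, total and product are chosen here for illustration.

```python
import numpy as np

# On a plain list, * repeats the elements instead of multiplying them.
repeated = [8, 10] * 3
print(repeated)  # [8, 10, 8, 10, 8, 10]

# Each inner list is one row of the matrix.
A = np.array([[1, 2], [4, 1]])
B = np.array([[8, 10], [7, 4]])

total = A + B              # matrix addition
product = np.matmul(A, B)  # matrix multiplication

print(total)    # rows: [9, 12] and [11, 5]
print(product)  # rows: [22, 18] and [39, 44]
```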
In addition to matrix addition and multiplication, we can also use this numpy module to perform element-wise operations. Similarly to the plus operator, binary operators such as the asterisk, the division, and the double
asterisk perform element-wise operations in numpy. So here, for example, if I have A asterisk B, we know that A and B are matrices, so again I'm going to get another matrix, but this time, instead of doing matrix multiplication in the standard way, we perform the multiplication element-wise. We can also
do the same thing with division, element-wise. This is the same as we did in the Julia part, which you can verify by using the broadcast operator in the Julia part, or you can just do the math yourself. Finally, we
can also use it to do exponentiation, so here every element in the matrix A is raised to the power of the corresponding element in the matrix B. Last but not least, numpy gives us the ability to carry out operations between
an array and a single number. All of the above was between two different matrices defined in the form of arrays, but we can also do
this between an array and a single number. In the first example we have
an operation where we add two to this array, and here we do the same thing but with multiplication, division, and exponentiation. If I execute the cell, you'll see that here 2 is added to 8 and 10, here 8 and 10 are multiplied by 2,
here 8 and 10 are divided by 2, and here 8 and 10 are raised to the power of 2, respectively. All right, so another important Python library, which provides easy-to-use data structures such as data frames, is called pandas. So let's first
install this library, and then we're going to use it to define data frames. To install it, we use the same syntax as we did for installing numpy, except we change the name here to pandas. To import it, we also use the same syntax, but now we replace the name with pandas, and we
shorten it to pd. All right, so I've executed the two cells, and I've installed and imported the library. First we want to see how we can use it to create a data frame. We're going to follow the same setting
as we did in the Julia part, so we can see how to use this module to define a data frame from other Python data structures. We can create a data frame from other Python data structures all at once in the following way: this is what's going to be in my rows, and these are going to be the column names. Okay, so in the
first row I'm going to have 67 and male, in the second row 22 and female, and in the third row 49 and male. Again, this is exactly the same example that we did in the Julia part, which is why I don't explain it further. And here for the column names, we're going to have the first column called
age and the second column called sex. Now we want to define this data frame using these two variables. To do this, we go to this module pd and write dot DataFrame, because we want to define a data frame. In the first argument we pass the rows, and here we
set the columns to be equal to column names. So let's execute the cell, and as you can see, indeed we have a data frame whose output looks very similar to the output we get in the
Julia part. Clearly, we can also start with an empty data frame and add elements to it column by column or row by row. Here in the first section we do this column by column, and then here we do it row by row. To add new elements column by column, first we create the data frame
to be empty in this line here by just saying pd.DataFrame. It's very similar to the Julia syntax; the only difference is that we have to specify that we're using this module called pandas. Once we've defined
that, we first create a column called age, which has those three values, and then another column called sex, which has its three values. Once we execute the cell, as
you can see, indeed we get a similar result. To do this row by row, one thing we have to be careful with is how we initialize the data frame. We could initialize it to be completely empty, but here we directly specify the column names, so in this case this is the step that initializes the data frame. We
could have also used the column names variable we defined up there, but here we just write it explicitly for completeness. Once we've done that, we use the append method, and it's important to note this ignore index argument, which I will address in a second. So we've initialized the data
frame with the column names age and sex, and now we want to add the first row. We append to this data frame, in particular to the age column, this value, which is 67, corresponding to the value in the first row, and to the sex column we add male. We do the same thing for the second and the third row.
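The three ways of building the data frame described above can be sketched as follows, assuming pandas is installed. One deliberate substitution: the DataFrame append method used in the lecture was removed in pandas 2.0, so the row-by-row part of this sketch swaps in label-based assignment with loc, which is not the lecture's method. The variable names are assumptions.

```python
import pandas as pd

rows = [[67, "male"], [22, "female"], [49, "male"]]
column_names = ["age", "sex"]

# 1) All at once, from a list of rows plus the column names.
df_all = pd.DataFrame(rows, columns=column_names)

# 2) Column by column, starting from an empty data frame.
df_cols = pd.DataFrame()
df_cols["age"] = [67, 22, 49]
df_cols["sex"] = ["male", "female", "male"]

# 3) Row by row, starting from a data frame that only has column names.
#    Substitute for the removed append method: assigning to the label
#    len(df_rows) adds a new row at the bottom.
df_rows = pd.DataFrame(columns=column_names)
for age, sex in rows:
    df_rows.loc[len(df_rows)] = [age, sex]

print(df_all)
print(df_cols)
print(df_rows)
```

With this substitution the new rows receive the index labels 0, 1 and 2 directly, so no ignore index argument is involved.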
And as for ignore index, maybe we can see this in two cases. First, let's create this without the ignore index argument, and as you can see, we
get an error. This is because Python is confused as to where to place this row. To ensure that we ignore the index, so that we don't always start from scratch
by adding to the first row but rather append to the bottom of the data frame, we set ignore index to true. Once we've done that, as you can see here, we have the indices 0, 1, and 2, which indeed gives us the desired result. So we can
also create a data frame by converting other data types such as dict and numpy array. We've seen how we can define an array using the numpy library, and we've already seen how we can define a dictionary, so
a data frame can be created from both of these data types. In the first example here we convert a dictionary to a data frame, for which the syntax is very straightforward. We're just defining the dictionary as we did at the beginning of the
lecture, where the first key is age and the corresponding value is the list of elements of that column, and the second key is sex and its value is the list of elements of the corresponding column. To convert this dictionary into a data frame, we just pass it to the
DataFrame constructor. All right, converting a numpy array to a data frame is also very similar. Here we define the numpy array using the same syntax we used above, where every inner list corresponds to a
row, and to define the data frame we simply say pd.DataFrame, and here we specify that the rows are coming from this numpy array and that the names of the columns are age and sex. It's very similar to the
first way we defined a data frame. We can execute the cell, and indeed you can see that we have the same result. We can also create a data frame from permanent storage files such as CSV files. For this
we need a library for working with CSV files. Luckily, Python has a built-in module for working with CSV files called csv, so we do not need to install it. However, we need to import it in any code that uses
it, in the following way: we simply type in import csv, and we're not going to shorten this name because it's already an abbreviation. Once we've imported the csv module, we can read the CSV file using the pandas function called read_csv, and we're going to use it to read the German CSV file
that we used in the Julia example, in the following way. It's actually very straightforward: here is the variable name, and we just say pd, that's the name of the module, dot read_csv. Once we execute the cell, indeed you can see that we have that data frame as well. Conversely, we can
save a data frame as a CSV file using the function to_csv. We're going to create another copy of the German CSV, which is now assigned to the variable df, under a different name called output.csv, just to make sure that we don't overwrite the
existing one. In this case we just say df.to_csv, and the file name is output.csv. This example is again very similar to the one that we did in the Julia part. Once I execute the cell, and if I
go to my home directory and look at the last modified files, which you can sort by latest, you can see that this one was created seconds ago. All right, so accessing elements of a data frame is also similar to
Julia. We can access a data frame via its columns, rows, and cells. Via columns, I simply say df and then between brackets type in the
name of the column we want to access. Maybe I remove the semicolon here so that I don't suppress the output, and as you can see, we have all of the rows and the corresponding elements. We can do the same thing for several columns that are ordered and adjacent, so for example here we have
the age and the sex columns, which correspond to the second and the third columns. Clearly, we can also do it for multiple ordered, non-adjacent columns, so here we have age, job, and credit amount. We
can also access via rows, as we did here. In this case we access a single row, and as you can see, for this we have the range 0 to 1, because we want a single row. If we want multiple rows, we use a larger range, where the difference between the two values is more
than one. If I do this, you can see here we have two rows, and so on. We can also access via cell location, but here we have to be a little bit careful because we have this loc following the df dot that we have at
the beginning. So we need to specify that we are working with this data frame, and then we access using loc, which just stands for location. Here I'm going to go to row 0, which corresponds to the first row, and to this column. If I execute the cell, you will see that we have 67, and if you
go to the first data frame, indeed you see that we have 67. You can do the same thing but getting all the rows, and we can also do this for a subset of rows and a subset of columns. If I execute the
cell, you can see that this gives us the rows with indices one, three, and seven, and the columns corresponding to job and credit amount. Again, similarly to the Julia part, we can mutate the elements of an entire column
as follows. In this case we want to modify the elements of the ID column by creating another list that has the elements from one to 999 plus one, which is the last one here, so that we use one-based indexing rather than zero-based
indexing. To do this, this is the list that we want to create, and this list has the range from one until the length of the data frame, that is, the number of rows in that data frame,
plus one. This is going to give us a list that has the numbers between one and one thousand, and once we execute the cell, you will see that the ID column now starts from one and ends with one thousand. Again, we can rename one or
multiple columns. To do this, we have this variable called old name, which is the old name of the ID column, and we want to change it to uppercase letters. Again, that's the same example as we did in the Julia part, and here the only difference is that we use this dot upper method to
convert this string into uppercase letters. When it comes to renaming a column or multiple columns of a data frame, we use this rename method. So here what we have is the old name and the new name, and when we say in place, it means
that we want to replace the old name with the new name. So we want to make this change directly in place, rather than creating a new copy that has this name ID with everything capitalized. If I execute the cell, indeed you can see that now we have ID written in uppercase letters. So we can also insert
a new column at a particular position as follows. Here I'm going to create a random set of elements using the random module. You don't
need to re-execute the import cell if you already imported it in the session. The way to insert a new column is to use this insert function. In this case I'm inserting this column at position two, and it's
called account manager. That's actually quite similar to the example we used in the Julia part, but notice how here I'm using choices because I want to create a list of random elements with replacement. Once we execute the cell, maybe we can now print the data frame, and as you can see here
we have it in position two, which is the third position because Python uses zero-based indexing, so the third column is account manager. You can also remove rows that contain missing values as follows. Doing this only
requires the dropna method, and again we use in place, which allows us to do this in that same data frame. In other words, we're not going to create a new data frame that doesn't have the rows with missing
values, but rather we're going to do this in the same data frame that we have. Once we execute the cell, you will see that now we have a smaller data frame with 522 rows rather than 1,000. All right, we also talked about removing
rows based on a conditional expression, and we do this using this drop method. It might look a little bit complicated, but if you break it down, you'll see that in the inner argument here we have df age strictly greater than 50. That's just a condition, because in this example we want to drop all the
rows pertaining to customers older than 50. So this is the data frame we want to drop from, and here we say in place equals true, because again we want to do this in the same data frame rather than creating a new copy. If I
execute the cell, as you can see here, the number of rows in this data frame has become even smaller compared to before. Finally, we can sort the rows of a data frame using the sort values method.
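The access and manipulation steps from this part of the lecture can be sketched on a small stand-in table, assuming pandas is installed. The column names, the values, and the manager names below are made up for illustration and are not the lecture's data set; only the sequence of calls follows the description.

```python
import random as rd

import pandas as pd

# Small illustrative stand-in for the credit data (assumed names and values).
df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5, 6, 7],
    "Age": [67, 22, 49, 45, 53, 35, 53, 35],
    "Sex": ["male", "female", "male", "male", "male", "male", "male", "male"],
    "Job": [2, 2, 1, 2, 2, 1, 2, 3],
    "Saving accounts": [None, "little", "little", "little",
                        "little", None, "rich", "little"],
    "Credit amount": [1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948],
    "Duration": [6, 48, 12, 42, 24, 36, 24, 36],
})

# Access by column(s), by row range, and by cell location.
ages = df["Age"]
some_columns = df[["Age", "Job", "Credit amount"]]
first_row = df[0:1]
first_age = df.loc[0, "Age"]
subset = df.loc[[1, 3, 7], ["Job", "Credit amount"]]

# Overwrite a whole column: one-based ids instead of zero-based ones.
df["Id"] = list(range(1, len(df) + 1))

# Rename a column in place, using the uppercase version of the old name.
old_name = "Id"
df.rename(columns={old_name: old_name.upper()}, inplace=True)

# Insert a new column at position 2, which makes it the third column.
managers = rd.choices(["Ann", "Ben", "Cem"], k=len(df))
df.insert(2, "Account manager", managers)

# Drop rows with missing values, then rows of customers older than 50.
df.dropna(inplace=True)
df.drop(df[df["Age"] > 50].index, inplace=True)

# Sort by several columns; with inplace=False a new data frame is returned.
sorted_df = df.sort_values(by=["Age", "Credit amount", "Duration"],
                           inplace=False)
print(sorted_df)
```

The inner expression in the drop call selects the rows that satisfy the condition, and their index labels are what drop removes.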
So here we sort by age, then credit amount, and then the duration. Unlike the previous example, we do not do this in place: we set inplace to False. If I execute the cell, you'll see that we get this order as we want it,
which is ordering the rows first by the age, then by the credit amount, and then by the duration. Next, we also talked about convenient functionalities to
manipulate data in data frames, which is going to be left as an exercise. All right, so the last library I want to talk about is Matplotlib, which is a comprehensive library for creating static, animated, and interactive visualizations in Python. First, let's install this library, and we do this
again using the same syntax, and here we replace the last argument by matplotlib, which is the name of the library we want to install. Again, I already have that installed; for you it might take a few more minutes, but you just have to be patient with it. For the purpose of this lecture, we're going to use a submodule of Matplotlib called pyplot as follows. So we
execute this syntax so that we import this submodule of the Matplotlib library, and we import it using the following syntax, where we assign it to the name plt. As I mentioned earlier, analogously to shortening numpy as np,
pandas as pd, and random as rd, we are shortening this pyplot submodule as plt. So what does the basic syntax of the Matplotlib library look
like? Again, we're using the same example: here we have the latitudes and longitudes that correspond to different locations in the city of Aachen. The basic syntax is to say plt, dot, and the name of the type of plot we want to
create. In this case it's a scatter plot, and here that's just going to be the values that go on the x-axis, followed by the values on the y-axis. So it's very similar to the Julia part, and it's also similar to everything we talked about, in the sense that we have the name of the module, then the dot operator, and then the name of the function. But to ensure that we
can see the plot itself once we've created it, we use plt.show next. As you can see here, we get exactly the same plot as we created in the Julia part. For the bar chart it's exactly the same: just say
plt, and now I just replace this with bar, and here I have the grade and the frequency. If I execute the cell, as you can see here, we have that bar chart. All right, for line graphs we're also going to use the same example from the Julia part, where we have the COVID-19 cases. I'm going to use that CSV and the
pandas module we talked about earlier to import a CSV file as a data frame. Here I'm going to create the x-axis values, which is just a range that goes from one all the way to the number of rows in the data frame plus one. Again, this is just
because Python uses zero-based indexing and a range excludes its end value, so I always have to take care of that. Here I just create the y-axis values, which is going to be the new cases column of the data frame, and here I'm just going to use plot instead of scatter or bar. When I execute the cell, indeed you can see that we get
the same result. For multiple lines it's also very similar, and here I'm using loc, or location, to access the rows between 0 and 364, so the first
year, and here for the second year. Remember the syntax for going all the way to the end of the list, which is to not have anything after the colon here. So this is the data for year one, this is the data for year two, and then we're going to plot x here, and this is the new cases, so
that's for the first year, and then I'm going to mutate that plot again, but now using year two. If you execute the cell, you can see that we get similar results. Now, similarly to what we've talked about, we want to
talk about attributes, which is how to customize the plots. We can add a title to the plot as follows. Again, this is from what we've just done, which is to create the plot. Now, to add the title, instead of writing that inside the function that creates the plot, we have a separate function called title, and we give it the title, which is the same one
that we've used in the Julia part. Now I can execute the cell, and as you can see, we have this rather long title. We can also change the font size as we did previously, but to change the font size of the title specifically, we have
to do that inside this function, so inside the title function that we have here, because we want to change the font size of the title specifically, not the font size in the entire plot. Here we just make it eight, and if we execute the cell, you can see that it indeed got smaller. We
can also change the size of the plot. By default, the width and height of the plot are 6.4 and 4.8 inches. If we want to change it, we can
use the figure function. Before we even create the x values and the y values, or plot the values themselves, we can just say plt.figure up front and specify the figure size to be 10 and 4, so we increase the width and we reduce the height. Again, we have everything from before, and here
I'm not even reducing the size of the font in the title. If I execute the cell again, you'll see that I have a wider and shorter plot. Similarly, we can also add a label to the plot, and doing this is also very
straightforward. Here we have the values on the x-axis and the values on the y-axis; we just store them as we did before. Once I
plot here, because I'm plotting the first line, I'm going to specify all the information corresponding to that first line, in this case the label of that line, so this is going to be the cases in this period. When I want to plot the cases of the second year, I'm also going to directly give it its label. These attributes, because they apply to
specific series or specific lines, have to be specified when we create the line, unlike the title or the figure size, which are specified separately and apply to the entire plot. Once we've done that,
we can also have the title if we want, and to make sure that the legend, or the labels that we've specified, are showing, we have to invoke the legend function. Finally, we just do plt.show as we did before. Now I execute this, and you can see here that we indeed have a label for each line. Clearly, you can also
specify each series' color, and we do this in the same way. Notice again that we specify the color inside the definition of the line itself, so here the first one is going to be yellow, the second one is going to be black, and the third one is going to be red. Once I execute the cell, you'll see that
indeed we have the same result as we did in the Julia part. We can also change the line width, and we have to do this inside the function that creates the line itself. In this case we increase it to three for all three lines.
So if I execute the cell, indeed I get a plot that has wider lines. We can also change the line style, and again we do this inside the function, so here I make the first one solid and the second one dotted.
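A minimal sketch of the customizations described so far, separating plot-level settings (figure size, title, legend) from line-level settings (label, color, line width, line style). The numbers are made-up placeholders rather than the lecture's COVID-19 data, the labels and title text are hypothetical, and the Agg backend line is an assumption added so the sketch also runs without a display.

```python
import matplotlib

# Assumption: a non-interactive backend so the sketch runs without a display.
# Omit this line in a notebook, where plots are shown inline.
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Made-up placeholder counts, not the lecture's data.
days = list(range(1, 8))  # 1..7, since range excludes its end value
year_one = [3, 5, 9, 14, 11, 8, 6]
year_two = [4, 8, 15, 21, 18, 12, 7]

# Plot-level setting: figure size (width, height) in inches, set up front.
fig = plt.figure(figsize=(10, 4))

# Line-level settings go inside the call that creates each line.
# Note that the line style is given as a string.
plt.plot(days, year_one, label="Year 1", color="black", linewidth=3, linestyle="solid")
plt.plot(days, year_two, label="Year 2", color="red", linewidth=3, linestyle="dotted")

# The title has its own function; fontsize here affects the title only.
plt.title("Daily new cases (placeholder data)", fontsize=8)

# legend() must be called for the labels to be displayed.
plt.legend()
plt.show()
```

Keeping label, color, linewidth, and linestyle in the plot call makes it clear which line each setting belongs to, while title and figure size are stated once for the whole plot.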
One important distinction between doing this in Python and Julia, in addition to everything we talked about, is that here we are not using symbols but rather string values. So here this is a string; if you remember, in Julia we used a symbol. Nevertheless, once I execute the cell, as you can see here, we have the
first one as a solid line and the second one as a dotted line. We can also annotate the plot, and we do this as follows. Again, that's the same example from the Julia part, so we have the latitude and the longitude, and now we want
to annotate all the different locations in that map we wanted to create. Here these are just coordinates, and here these are the different names that correspond to the different coordinates. Here I specify the width and the height of the plot, and I create a scatter plot, so nothing has changed so far. We've already seen how we can create a title for the plot, and now I'm
just going to run a for loop that goes through all the different locations, and for every location I'm going to use the annotate function so that I can annotate the corresponding location i, which has these two coordinates. So that is the name, the annotation text that I'm
going to use, and this name is going to go at these coordinates. That's why I use xy: it refers to the fact that I'm giving the x and y coordinates of the annotation position. I execute the cell, and indeed you can see that we have the same result as before. We also talked about
subplot attributes. By default, Matplotlib chooses the best position to place the legend. If you remember, for example, here at one point it chose the legend to be in the top right corner, and here it chose the top left
corner. So by default Matplotlib chooses the best position to place the legend, but if we want to specify the legend location, we can do this as follows. Again, that's the same example, and everything here is very similar, except that here, when we say plt.legend, instead of
leaving the argument empty as we did, let's say, for this example, here we're specifying the location to be equal to upper right. In this case it's going to be placed in the upper right corner, and then here we can just display the plot. If I execute the cell, you can see that now it's displayed in the upper right
corner, although by default Matplotlib chose the upper left corner for this legend. We can also create a single plot that combines multiple subplots, and we do this as follows. Here we initialize the figure that's going to contain these
two subplots, and we have two names for the two subplots. Just remember the same example, where we have on the one hand the number of new cases and on the other hand the number of new deaths each year. This is going to be the one for the new cases, and this is going to be the one for the new deaths. Instead of saying that we want to create a plot, we say
plt.subplots. The one and two here refer to the fact that we want a one-by-two grid of subplots, so we want to have one next to the other, not the first one and then the other below it. We want to have them side by side rather
than on top of each other, and this is the figure size we want. Clearly, we don't need to have that, but for this specific example, for it to be visible, we need to increase the size of the figure. You can play around with this function, and you will see that you get different results for
different values. So how do we use these two subplots that we've initialized here? I haven't created them yet; I've just initialized them as subplots. For the first one I use ax1, that's the name of the one that's going to have the new cases, and I just create a plot as we did before, so
nothing changes here. So xx, that's the values on the x-axis, the values on the y-axis for the first one are going to be this, we're going to have this label, this color, and this is the line style that I'm using. It's the same thing for the second plot. Once I've done that, I'm also going to give it a title, in this case daily number
of COVID-19 cases, and that's going to be in the upper left. So this is just for the number of new cases. I'll do exactly the same thing, but now for new deaths instead of new cases, and now I just put it in the upper right corner. So let's
execute this, and you'll see that this one corresponds to this one here, and this one corresponds to this one here. Finally, we talked about axis attributes. We can label the axes of the subplots as follows. Here we have
the grade and frequency example, and here we want to label the x-axis as grade and the y-axis as number of students. To do this, we just use plt.xlabel, which gives the x-axis a label, and ylabel, which gives the y-axis a
label. If I execute this, you'll see that now I have number of students and grade on the y- and x-axis respectively. We can also determine the level of granularity of each axis by specifying the xticks and yticks as follows. This is again the same example as we did in the Julia part.
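The side-by-side subplots and the axis labels and ticks described above can be sketched together as follows. All data values are made-up placeholders, the titles and labels are hypothetical, and the legend positions and Agg backend line are assumptions chosen for illustration.

```python
import matplotlib

# Assumption: a non-interactive backend so the sketch runs without a display.
# Omit this line in a notebook.
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Made-up placeholder data for illustration only.
days = list(range(1, 6))
new_cases = [10, 14, 9, 20, 17]
new_deaths = [1, 2, 1, 3, 2]
grades = [1, 2, 3, 4, 5]
frequency = [4, 7, 6, 3, 2]

# One row and two columns: the two subplots sit side by side.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(days, new_cases, label="New cases", color="black", linestyle="solid")
ax1.set_title("Daily number of new cases")
ax1.legend(loc="upper left")  # explicit legend location
ax2.plot(days, new_deaths, label="New deaths", color="red", linestyle="dotted")
ax2.set_title("Daily number of new deaths")
ax2.legend(loc="upper right")

# A separate figure: bar chart with axis labels and explicit ticks.
bar_fig = plt.figure(figsize=(6, 4))
plt.bar(grades, frequency)
plt.title("Grade distribution")
plt.xlabel("Grade")
plt.ylabel("Number of students")
plt.xticks(grades)  # one tick per grade
plt.yticks(list(range(1, max(frequency) + 1)))  # 1 up to the maximum frequency
plt.show()
```

With subplots, each axes object (ax1, ax2) carries its own plot, title, and legend, whereas the plt functions in the second part act on the current figure.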
So here this is just the grade and frequency data, and here this is the size of the figure, the type of plot, the title, the x label, and the y label. But now here we want the xticks to be the grades and the yticks to be in the range of one to the maximum frequency. Once I execute the cell,
you'll see that we have that level of granularity. Finally, for saving the plot once everything is done, we can use the savefig function. Here we create the figure, which has these dimensions, and we have it for the grade and frequency, and
once we've specified the ticks, the y label, and the x label (this is a copy-paste of the example from here), we can just use the savefig function and give it this name, which is going to be grades_py. I just made that distinction to keep it separate from the one that we created in the Julia
part. I've already executed the cell, and as you can see here, when I go to this directory, I already have it. All right, so just to summarize: today we've seen how tuples in Python have the same properties as in
Julia. We've seen that Python dictionaries are defined using curly brackets, where every key-value pair is coupled with a colon and different key-value pairs are separated with a comma. A user-defined type in Python is referred to as a class, and defining a class creates a class object. In the
definition, the header includes the keyword class and the class name. For instantiation, if we want to create an instance of a class, we call it as if it's a function without passing any argument to it. You can assign and access values to and from an instance using the dot notation, and we've seen how objects
are mutable by default in Python. Moreover, we talked about how Python provides methods that make it easier to work with objects, in particular the __init__ method, which sets default values for the parameters of the object, the __str__ method, which returns a string representation of an object, and
finally the __add__ method, which specifies the behavior of the plus operator for an object. Then we talked about libraries: random, which is a built-in Python module for generating pseudo-random numbers for various distributions; NumPy, which is a library that provides an efficient way of creating arrays and
manipulating numerical data inside them, including matrix operations; csv, which is a built-in Python module for working with CSV files; pandas, which provides easy-to-use data structures and data analysis tools in Python, including data
frames; and Matplotlib, which is a comprehensive library for creating static, animated, and interactive visualizations in Python. That's it for this course, and I hope you learned something useful.