WEBVTT 1 00:00:02.760 --> 00:00:11.280 William Cheng: Welcome to lecture 17. So kernel 2 is due tonight. If you have code from a previous semester, don't look at it. Don't copy it. 2 00:00:11.910 --> 00:00:21.120 William Cheng: Best to get rid of it. The grading guidelines are the only way we'll grade. After submission, make sure you verify your kernel submission. Make sure you have a good README file and all that stuff. 3 00:00:23.730 --> 00:00:32.430 William Cheng: All right, so after today we're starting kernel 3. For kernel 3, you have two weeks plus two days to finish, so it's a little longer. But again, 4 00:00:33.480 --> 00:00:41.790 William Cheng: you know, kernel 3 is the most difficult assignment. As usual, if you have code from a previous semester, don't look at it, don't copy it, best to get rid of it. 5 00:00:42.330 --> 00:00:45.810 William Cheng: You need to start early if you want to have a chance to finish kernel 3. 6 00:00:46.230 --> 00:00:54.450 William Cheng: Okay, so even though kernel 3 is very difficult, you know, every semester there are plenty of people who finish kernel 3 and get, you know, 100% of the credit. 7 00:00:55.080 --> 00:01:03.360 William Cheng: So you should, you know, make a big push as soon as you can. You should watch the week 9 discussion section video where they give you the 8 00:01:03.840 --> 00:01:15.270 William Cheng: sort of introduction to kernel 3. Yeah. And again, the grading guidelines are how we will grade, and for kernel 3, you know, most of the grading is done in user space programs. 9 00:01:15.780 --> 00:01:22.740 William Cheng: So you want to be able to run user space programs as soon as you can. Okay, so if you have questions, you know, feel free to send them to me. 10 00:01:23.460 --> 00:01:35.160 William Cheng: And also in the grading guidelines, the way that we set the DBG, you know, flag in Config.mk is a little different from kernel 2.
So again, you know, look at the grading guidelines very carefully. 11 00:01:35.790 --> 00:01:45.840 William Cheng: You should do GDB assignment number three, you know, where we just sort of give you a few additional GDB commands to debug user space programs. 12 00:01:46.320 --> 00:01:54.330 William Cheng: Okay. So far, you've been debugging the kernel. As it turns out, it's a little more work to set a breakpoint in a user space program. 13 00:01:55.830 --> 00:02:00.630 William Cheng: If you get stuck, come to live office hours, send me email, or post to the class Google group. 14 00:02:00.840 --> 00:02:07.500 William Cheng: Don't get stuck too long. So what is too long? Right. Some people always say, you know, I've been stuck at a bug for three days. Don't do that, you know, if you 15 00:02:07.770 --> 00:02:14.580 William Cheng: if you're stuck for, like, one night or something like that, you should talk to me as soon as possible. Okay, or send email to me. 16 00:02:16.980 --> 00:02:21.840 William Cheng: I recommend a timeline for kernel 3. So, I guess, instead of giving them in days, 17 00:02:22.680 --> 00:02:33.630 William Cheng: specifying which days you're supposed to do something, I just give you sort of a rough number of days, and I mentioned that there are these five phases, one, two, three, four, five. Phase one is to run with VM=0, 18 00:02:34.230 --> 00:02:40.770 William Cheng: or switch to the actual file system, the System V file system. You have to implement three programs, get rid of all the bugs in your kernel 2, 19 00:02:41.100 --> 00:02:45.120 William Cheng: and then move forward. So you should try to get this done in two days. Okay. So again, this is very, very tight. 20 00:02:45.840 --> 00:02:57.000 William Cheng: Phase two is to get the hello program to run. So the way that you should do it is that inside the init proc, right, you call kernel_execve() and you run the hello program.
So the kernel init process is going to go into user space, 21 00:02:57.450 --> 00:03:04.890 William Cheng: and become the user-space init process. So in that case, all it does is print hello and then it will call exit. You come back inside the kernel and you turn off the machine. 22 00:03:05.280 --> 00:03:10.710 William Cheng: Okay, lots of code needs to be written. So again, you should get this working as soon as possible. 23 00:03:11.340 --> 00:03:17.850 William Cheng: So over here I said that you should get this done in five days. Okay, so there's a lot of code to write just to get one program to work. Yeah. 24 00:03:18.540 --> 00:03:30.900 William Cheng: So once you can get one user space program to work, then you're in reasonable shape to try to get all the other user space programs to work. Okay, please don't attempt to run /sbin/init yet because that's very, very difficult. 25 00:03:32.130 --> 00:03:35.310 William Cheng: Phase three: pass all the section B tests, you know, 26 00:03:36.720 --> 00:03:47.820 William Cheng: you know, run all the user space programs using exactly the same method as you ran hello. So do exactly the same thing, and try to get this done in three days. And then the last one is 27 00:03:48.900 --> 00:03:59.640 William Cheng: called fork-and-wait. That's where you have to implement shadow objects; you also have to implement fork. Okay, so that's going to be a difficult one. So we have three days. 28 00:04:00.060 --> 00:04:06.450 William Cheng: So, towards the end, you're probably going to spend, you know, one day or something like that just to get that last fork-and-wait to work. Okay, well, maybe two days. Yeah. 29 00:04:07.020 --> 00:04:16.260 William Cheng: Phase four. So once you can pass fork-and-wait, you have a working fork and you also have your shadow objects working, right. The next thing to do is to pass /sbin/init.
30 00:04:16.770 --> 00:04:26.790 William Cheng: So /sbin/init will start your user space shell. So that will be the first time you have to deal with malloc, and there will be additional functions that you have to write for memory mapping. 31 00:04:27.330 --> 00:04:33.570 William Cheng: There's the mmap system call you have to implement. So I guess some of the bugs are going to be there. So you've got to get those things done. 32 00:04:34.050 --> 00:04:45.570 William Cheng: So you can do this in two days or three days. If you did it in two days, then you have five days left to get all the section D tests to work; if you did it in three days, then you only have four days to get all the section D tests to work. Yeah. 33 00:04:46.260 --> 00:04:54.870 William Cheng: It's important that you read the kernel 3 FAQ and understand pretty much all the lecture material covered so far. The spec and the Weenix documentation are not enough. Okay. 34 00:04:56.280 --> 00:05:04.650 William Cheng: All right, so we're going to go back to talk about the actual file system. So again, this is past the, you know, kernel assignments, so 35 00:05:05.280 --> 00:05:11.460 William Cheng: you still have to know all this kind of stuff, even though, you know, it's not related to your kernel assignment. 36 00:05:12.030 --> 00:05:14.760 William Cheng: So as we said last time, the reason that you still need to 37 00:05:15.150 --> 00:05:20.640 William Cheng: know this kind of stuff is that, you know, when you go on an interview and the interviewer says, oh, you have taken an operating system class, 38 00:05:20.940 --> 00:05:26.280 William Cheng: let me ask you about file systems, let me ask you about scheduling, let me ask you about virtual machines, all this other stuff. 39 00:05:26.760 --> 00:05:36.840 William Cheng: So you really have to understand these. Yeah, last time we started to talk about the fast file system.
It's an improvement over the System V file system. And so, you know, 40 00:05:38.130 --> 00:05:51.420 William Cheng: we're going to see what the fast file system does by changing the, sort of, the data organization on the disk. So this way it can actually improve the performance. Yeah, so that's what we're going to 41 00:05:52.500 --> 00:05:53.820 William Cheng: we're going to see first. 42 00:05:55.350 --> 00:06:04.140 William Cheng: All right, so there are some tricks that we mentioned last time, right. One is to make the block size a little bigger, right. So in this case, you know, again, the block size is a logical unit. 43 00:06:04.590 --> 00:06:09.570 William Cheng: Okay, it can be set to anything that you want. Right. So, as it turns out, 44 00:06:10.050 --> 00:06:23.970 William Cheng: you know, if you set the block size to be a little bigger, you're going to actually reduce the number of times you have to perform the seek operation. Okay. So for example, let's say in the good old days one block is the same size as 45 00:06:25.110 --> 00:06:33.630 William Cheng: a sector. Okay. So in this case, when you try to perform a sequential read of your file, every time you go to the next block, you need to go to a random place on the disk. 46 00:06:33.990 --> 00:06:39.330 William Cheng: Okay, so why don't we make the block size a little bigger? So this way, we're going to make these four blocks over here into one block. 47 00:06:39.570 --> 00:06:48.600 William Cheng: So it will look like this. Okay, so this way you only have to go to the disk once and then, you know, since this is one large block, you will make it into contiguous sectors. 48 00:06:49.140 --> 00:06:54.060 William Cheng: Okay, so again, a block is a logical unit; we can define it to be bigger blocks.
49 00:06:54.540 --> 00:06:59.640 William Cheng: So in this case, you know, if you use larger blocks, then they will consist of sectors that are right next to each other. 50 00:07:00.030 --> 00:07:08.640 William Cheng: Okay. So this way when you try to read the data, well, the first time, when you try to read the first block over here, it's going to take longer because it's a bigger block, but from this point on, you know, the next three 51 00:07:08.910 --> 00:07:13.350 William Cheng: the next three reads over here, you don't have to go to the disk three times. 52 00:07:13.740 --> 00:07:22.260 William Cheng: Because if you go to the disk three times, chances are you're going to end up at different places on the disk, so you have to pay for seek time, you have to pay for rotational latency, and in that case it can get really expensive. 53 00:07:22.920 --> 00:07:33.090 William Cheng: Okay, so just by making the block size a little bigger, we're going to improve performance. Okay, but if this improves performance, why don't we make the block size as big as possible? 54 00:07:33.840 --> 00:07:46.800 William Cheng: Then why don't we make the block size the size of the disk? Okay. I mean, in the extreme case, right, if the block size is the size of the disk and if you have that much memory, then the first time we bring the data from the disk into memory, we never have to go to the disk again. 55 00:07:48.120 --> 00:07:56.580 William Cheng: Okay, so remember the performance of the file system is inversely proportional to the number of times you go to the disk. So if you only have to go to the disk once, well, then you're going to be super fast. 56 00:07:57.510 --> 00:08:05.220 William Cheng: Okay, but unfortunately, nobody has that much memory because memory is a lot more expensive than disk. So that's why, you know, we have a big disk and a small amount of memory. Yeah. 57 00:08:05.760 --> 00:08:13.260 William Cheng: So again, in the end, there's a trade-off.
Okay, you want to use the right, sort of the right size for the block. Yeah. 58 00:08:14.910 --> 00:08:20.220 William Cheng: All right, so again, in this case, right, we compare the two cases over here. The first case over here, you know, 59 00:08:20.580 --> 00:08:26.880 William Cheng: you know, when you try to retrieve data, it's going to cost you order n, and now you combine four, you know, four original blocks into one large block. 60 00:08:27.330 --> 00:08:32.760 William Cheng: Order n, you know, is equal to order n divided by four, because again, they're all order n, right, but in the end, 61 00:08:33.000 --> 00:08:41.760 William Cheng: you know, the file system performance is inversely proportional to the number of times you go to the disk. So therefore, even though they have the same big-O, there's going to be a big performance difference. Okay. 62 00:08:43.410 --> 00:08:47.970 William Cheng: All right, well, what is wrong with the larger block? Okay, so, you know, 63 00:08:48.420 --> 00:08:59.100 William Cheng: I mean, clearly if you have a really large block, most of the, you know, most of the block is not going to be used. Right. So one problem with the larger block size is that you're going to end up with internal fragmentation. Okay. 64 00:08:59.550 --> 00:09:04.530 William Cheng: So here's a picture comparing small block size and large block size. So let's say that this dashed 65 00:09:04.800 --> 00:09:14.010 William Cheng: part over here is the size of our file. If we're using the smaller blocks, how many small blocks do we need? Right, you can sort of count: one, two, three, four, five. So it kind of looks like this. 66 00:09:14.760 --> 00:09:19.080 William Cheng: So in this case, as it turns out, we need six small blocks over here to cover it. 67 00:09:19.620 --> 00:09:24.270 William Cheng: But in this case, the amount of wasted space, which is the internal fragmentation, is going to be this much internal fragmentation.
68 00:09:25.020 --> 00:09:31.380 William Cheng: Well, if we use the large block size over here, where the block size is equal to six small blocks. So this one is actually, what, six, 69 00:09:31.830 --> 00:09:42.660 William Cheng: seven, over here. So this one, the larger block over here, is equivalent to six of the original blocks over here, but since the file is slightly more than six, we're going to end up using two large blocks. 70 00:09:43.290 --> 00:09:54.240 William Cheng: Okay, so in this case we're going to end up with a large amount of internal fragmentation. So you can imagine that if every file, you know, inside your file system all looks like this, we are wasting 50% of the disk. 71 00:09:55.800 --> 00:09:59.340 William Cheng: Okay, so, of course, not all files look like that. Right. For most files, you know, if the file is large, 72 00:09:59.520 --> 00:10:05.670 William Cheng: in the beginning part the blocks are completely filled; only in the last part over here are we going to have internal fragmentation. Okay. So the reality is not that bad. 73 00:10:05.820 --> 00:10:15.810 William Cheng: But, you know, in the worst case, when every file is exactly like this, I'm going to end up wasting 50% of the disk space. Okay, so that's a really big waste. I mean, today you have terabyte disks, so some people say, oh, if you have a 74 00:10:16.470 --> 00:10:22.680 William Cheng: you have a four-terabyte disk, if you lose two terabytes, is that bad? I mean, it still sounds really bad. Yeah. 75 00:10:24.120 --> 00:10:31.200 William Cheng: So what is the solution over here? Okay, so again, there's no free lunch, you know, we want the block to be as big as possible but 76 00:10:31.650 --> 00:10:37.200 William Cheng: the bigger it is, the more internal fragmentation. And also, you know, you run out of memory really, really quickly.
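The small-block versus large-block comparison above can be checked with a little arithmetic. The following is a minimal sketch, not anything from the course materials: the byte sizes are hypothetical and were chosen only to mirror the picture described in the lecture (one large block equal to six small blocks, and a file slightly larger than six small blocks).

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating point."""
    return -(-a // b)


def internal_fragmentation(file_size, block_size):
    """Bytes allocated to the file but not actually used by it."""
    blocks_needed = ceil_div(file_size, block_size)
    return blocks_needed * block_size - file_size


# Hypothetical sizes (assumptions, not values given in the lecture).
SMALL_BLOCK = 1024                  # assumed small block size in bytes
LARGE_BLOCK = 6 * SMALL_BLOCK       # one large block = six small blocks
FILE_SIZE = 6 * SMALL_BLOCK + 256   # slightly more than six small blocks

small_waste = internal_fragmentation(FILE_SIZE, SMALL_BLOCK)
large_waste = internal_fragmentation(FILE_SIZE, LARGE_BLOCK)
large_allocated = ceil_div(FILE_SIZE, LARGE_BLOCK) * LARGE_BLOCK

print("small blocks waste", small_waste, "bytes")
print("large blocks waste", large_waste, "bytes out of",
      large_allocated, "allocated")
```

With these assumed numbers the file needs seven small blocks but two large blocks, so the large-block layout wastes close to half of what it allocates, which is the "50% of the disk" worst case the lecture describes.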
77 00:10:37.530 --> 00:10:41.220 William Cheng: So one thing that they do in the fast file system is that they actually use two block sizes. 78 00:10:41.520 --> 00:10:51.180 William Cheng: Okay, so for the beginning part of the file, they try to cover it using large blocks; for the last part of the file, they use small blocks, right. So let's say that 79 00:10:51.480 --> 00:10:59.370 William Cheng: this is the actual size of the file. So for the first part, I'm going to use a large block, and then to cover the second part over here, I'm going to use a small block. 80 00:11:00.570 --> 00:11:05.940 William Cheng: Okay, so they call the small block a fragment, right. So, therefore, you know, in order to implement a file, you need a bunch of blocks, 81 00:11:06.120 --> 00:11:13.380 William Cheng: and at the end, you can use a fraction of a block, which is called a fragment. Right. So in this case, we're going to end up with exactly the same internal fragmentation as with the small blocks. 82 00:11:14.460 --> 00:11:20.520 William Cheng: Okay. And also you can actually speed up the disk, because in this case, you know, when we try to retrieve the large block, well, it can be faster. 83 00:11:21.480 --> 00:11:29.460 William Cheng: Okay, so again, it sounds too good to be true because it's the best of both worlds. And we know that there's no free lunch. So in this case, what is going to be the cost? 84 00:11:30.120 --> 00:11:36.210 William Cheng: Okay, so again we said we have exactly the same wasted space and also we can actually speed things up. 85 00:11:36.840 --> 00:11:44.070 William Cheng: So, therefore, you know, we must be giving away something, right. So what we are giving away is the complexity inside the file system. 86 00:11:44.490 --> 00:11:53.400 William Cheng: Okay, so inside, the file system is going to be a little more complicated.
Now, for every file, you need to keep track of, you know, the number of blocks at the beginning and also the number of fragments at the end. Yeah. 87 00:11:54.150 --> 00:11:57.900 William Cheng: So in the fast file system over here, what they would do is that, you know, some of the basic rules are that 88 00:11:58.290 --> 00:12:07.650 William Cheng: they would divide the block into a number of fragments, 1, 2, 4, or 8, and this one is fixed for the entire file system. Right. So when you build the file system you need to determine 89 00:12:07.980 --> 00:12:18.990 William Cheng: which way you want to go; if you make a mistake, well, too bad. Okay, so in this case, again, it's going to be based on a heuristic how you determine what should be the right number of fragments inside a block. 90 00:12:20.070 --> 00:12:26.700 William Cheng: Okay. So once you start doing this, people actually try to be very creative. They try to sort of squeeze out as much disk space as possible. 91 00:12:26.820 --> 00:12:34.530 William Cheng: So in that case, they want to have as little wasted space as possible. So one thing they will do is that if you have two files like this, right, here's a blue file and here's a pink file. 92 00:12:34.920 --> 00:12:44.610 William Cheng: For the blue file, the last block only requires two fragments, while for the pink file over here, the last block requires four fragments. You can actually have the blue 93 00:12:44.880 --> 00:12:52.890 William Cheng: and the pink files share the last block, where the blue takes up the first two fragments, while the pink over here takes up the other, the last four fragments. 94 00:12:53.130 --> 00:12:59.250 William Cheng: Okay, so this way, we're going to waste as little, you know, space as possible. But again, the file system complexity is going to 95 00:13:00.000 --> 00:13:02.100 William Cheng: the file system is going to get a little more complicated.
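The blocks-plus-fragments bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the byte sizes are hypothetical, and a real fast file system applies additional rules that this sketch ignores. The only facts taken from the lecture are that a file uses full blocks followed by trailing fragments, and that a block is divided into 1, 2, 4, or 8 fragments.

```python
VALID_FRAGS_PER_BLOCK = (1, 2, 4, 8)   # the choices named in the lecture


def blocks_and_fragments(file_size, block_size, frags_per_block):
    """Return (full_blocks, tail_fragments, wasted_bytes) for one file.

    Hypothetical helper: full blocks cover the beginning of the file and
    fragments cover whatever is left at the end.
    """
    if frags_per_block not in VALID_FRAGS_PER_BLOCK:
        raise ValueError("frags_per_block must be 1, 2, 4, or 8")
    frag_size = block_size // frags_per_block
    full_blocks = file_size // block_size
    tail_bytes = file_size % block_size
    tail_frags = -(-tail_bytes // frag_size)   # ceiling division
    allocated = full_blocks * block_size + tail_frags * frag_size
    return full_blocks, tail_frags, allocated - file_size


# Assumed example: 8 KB blocks, a file of one block plus 1500 bytes.
print(blocks_and_fragments(9692, 8192, 8))   # fragments of 1 KB
print(blocks_and_fragments(9692, 8192, 1))   # no fragments: whole blocks only
```

With eight fragments per block the tail occupies two fragments and wastes a few hundred bytes; with one fragment per block the tail occupies a whole second block, which is the large-block waste the fragments are meant to avoid.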
96 00:13:02.370 --> 00:13:08.370 William Cheng: Okay. So with this kind of organization, you can really save disk space. But then when I try to grow file A over here, when I start to grow file A, 97 00:13:08.580 --> 00:13:14.790 William Cheng: I'm going to take up additional fragments, so it will look like this, it will take up more fragments, but eventually it will start, you know, running into the 98 00:13:15.480 --> 00:13:26.700 William Cheng: fragments that belong to the pink file. So, at that time, we need to do a more sophisticated thing: we need to break this block into two different blocks, and now the blue over here will use one exclusively and the pink will use the other one 99 00:13:27.270 --> 00:13:39.870 William Cheng: exclusively. But again, the file system is going to get more complicated. So in this case, the file system will run a little slower, but hopefully the performance gain that you get from doing this is going to be worth it. Okay. 100 00:13:40.950 --> 00:13:45.630 William Cheng: So that's one trick that the fast file system plays to improve the performance. Yeah. 101 00:13:46.290 --> 00:13:51.960 William Cheng: The second one we're going to see is how we actually improve the seek time, and the key over here, as we mentioned before, 102 00:13:52.380 --> 00:14:00.990 William Cheng: is that there are two motors on the Rhinopias hard drive. One is the main motor, which is not very accurate. And the second one is called the stepper motor. The stepper motor is very, you know, 103 00:14:01.590 --> 00:14:10.290 William Cheng: the stepper motor is very accurate. It's also much faster because you only have to go to the next track. So in this case the speed difference is going to be 20 to 1. 104 00:14:11.070 --> 00:14:18.120 William Cheng: Okay, so therefore the average seek time, you know, for the main motor is going to be four milliseconds; the one-track seek time is going to be 0.2 milliseconds.
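To put the two seek times just quoted side by side, here is a minimal sketch. The 4 ms and 0.2 ms figures come from the lecture; the number of block accesses is a made-up value, and the model deliberately ignores rotational latency and transfer time.

```python
# Seek times from the lecture, in microseconds to keep the math in integers.
AVG_SEEK_US = 4000        # main motor: 4 ms average seek
ONE_TRACK_SEEK_US = 200   # stepper motor: 0.2 ms one-track seek


def total_seek_us(num_seeks, seek_us):
    """Total time spent seeking, ignoring rotation and data transfer."""
    return num_seeks * seek_us


NUM_BLOCKS = 1000   # hypothetical number of block accesses

random_us = total_seek_us(NUM_BLOCKS, AVG_SEEK_US)
local_us = total_seek_us(NUM_BLOCKS, ONE_TRACK_SEEK_US)
speedup = random_us // local_us

print("random seeks:", random_us, "us")
print("one-track seeks:", local_us, "us")
print("ratio:", speedup, "to 1")
```

The ratio is the same 20 to 1 figure mentioned in the lecture, regardless of how many accesses are assumed.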
105 00:14:18.750 --> 00:14:27.570 William Cheng: So the idea here is that we should use the stepper motor as much as we can because that will improve performance; we should avoid using the main motor because the main motor is going to give you poor performance. 106 00:14:28.080 --> 00:14:32.670 William Cheng: Okay. So the basic idea here is that things that are related, we need to put them close to each other. 107 00:14:33.060 --> 00:14:37.170 William Cheng: So in the fast file system, what they did is that they introduced the idea of a cylinder group. 108 00:14:37.380 --> 00:14:44.370 William Cheng: We're going to take the cylinders and group them together into cylinder groups over here. So this bunch of tracks over here, we're going to group them into a cylinder group. 109 00:14:44.550 --> 00:14:55.620 William Cheng: And when we create files and directories that are related to each other, we're going to put them inside the same cylinder group. Okay, so this way when you open a file and you need to find all the disk blocks, you can stay inside a cylinder group. 110 00:14:56.760 --> 00:15:03.540 William Cheng: Okay. So when I say open a file, what is the file, right? The file has an inode. And then, you know, inside the file there's also this 111 00:15:04.230 --> 00:15:16.620 William Cheng: data region. So we want the inode and the data region to be within the same cylinder group. Okay, so this way, when we try to, you know, perform a seek over here, we're going to end up performing the seek much faster than performing a random seek. Yeah. 112 00:15:18.090 --> 00:15:25.830 William Cheng: All right, so the difference over here is that, you know, the file system is going to be a little more complicated. And also, you know, when you try to allocate stuff, 113 00:15:26.250 --> 00:15:32.370 William Cheng: you know, inside the file system, the algorithm is going to be a little more complicated, right.
So here's the recap of the System V file system. 114 00:15:32.610 --> 00:15:38.880 William Cheng: Block zero is the boot block, block one is the superblock, and then comes the i-list, and then comes the data region. 115 00:15:39.180 --> 00:15:43.740 William Cheng: Okay, so what's wrong with this organization? Right. The inodes are actually sitting in the i-list over here. 116 00:15:43.890 --> 00:15:50.820 William Cheng: At the end of the inode there is this disk map; the disk map is going to point to the data region, and because of the way the free list is allocated, 117 00:15:51.000 --> 00:15:56.580 William Cheng: you know, all the blocks inside, you know, this disk map are going to be scattered all over the disk. 118 00:15:57.000 --> 00:16:05.610 William Cheng: Okay, so therefore, every one of these pointers over here will be pointing at random places on the disk. So this way when you try to retrieve data from a file, well, in this case, you have to perform random seeks. 119 00:16:06.300 --> 00:16:17.910 William Cheng: Okay, so that would cost you four milliseconds every time you try to get to the next block. Okay. So, in the fast file system, what it will do is that it will group a bunch of, you know, cylinders over here together. So 120 00:16:19.050 --> 00:16:21.840 William Cheng: how many cylinders would you group together into a cylinder group? 121 00:16:22.380 --> 00:16:28.230 William Cheng: Right. And since the ratio is 20 to 1, you know, maybe what we can do is group 20 cylinders together into a cylinder group. 122 00:16:28.530 --> 00:16:35.250 William Cheng: So over here, here's cylinder group number one. Maybe this is what, actually, I don't know what the number is. This is very, very old. 123 00:16:35.790 --> 00:16:42.930 William Cheng: So I just made up that number 20, right, because that seems to make sense. So again, you need to sort of figure out what.
Well, what the right number is supposed to be. 124 00:16:43.200 --> 00:16:45.930 William Cheng: And then we have a second cylinder group, we have a third cylinder group. 125 00:16:46.560 --> 00:16:53.610 William Cheng: And what we do is that when we try to create a file, we want the inode and the data blocks over here to all go into the same cylinder group. 126 00:16:54.030 --> 00:16:57.330 William Cheng: Okay, so this way, once you open the file, if you want to get all the data, 127 00:16:57.720 --> 00:17:09.660 William Cheng: you know, the data for that file, well, it will all be sitting inside one cylinder group. So in this case, we don't have to turn on the main motor. We just have to perform, sort of, the local seek over here. So it will be a lot faster. Yeah. 128 00:17:11.010 --> 00:17:13.950 William Cheng: All right, so what is the 129 00:17:16.530 --> 00:17:23.730 William Cheng: you know, what is the downside of this approach? Okay, so now your seek time is going to be faster, right, because you only use the faster motor. 130 00:17:24.570 --> 00:17:30.150 William Cheng: So, you know, again, your file system is going to get more complicated whenever you try to create a file. 131 00:17:30.900 --> 00:17:38.760 William Cheng: Okay, so before, when we create a file, what do we have to do? Right, we get a random inode number and then we'll start allocating blocks from the free list. 132 00:17:39.600 --> 00:17:49.680 William Cheng: Okay, so in that case, you know, when we start doing that we're going to end up with poor performance, right. So now when you allocate a file, you need to determine which cylinder group it should go to. 133 00:17:50.310 --> 00:17:57.150 William Cheng: Okay, so the moment that you have to choose, you know, what kind of algorithm do we have to run in order to decide where to create that file? 134 00:17:57.750 --> 00:18:03.540 William Cheng: Okay.
As it turns out, this can be really complicated, right, because, you know, I mean, some people, EE people 135 00:18:04.230 --> 00:18:12.990 William Cheng: or engineers, you know, they like to think about, you know, optimization problems. So I formulate this as an optimization problem, I solve the optimization problem, and I get the solution, right. That's very easy. 136 00:18:13.470 --> 00:18:22.890 William Cheng: But the optimization problem is kind of expensive to solve, and also, once you start solving optimization problems, your solution tends to be very fragile. 137 00:18:23.400 --> 00:18:29.790 William Cheng: Because after you add, you know, a few files and delete a few files, all of a sudden your solution changes. 138 00:18:30.720 --> 00:18:39.900 William Cheng: Okay, so therefore, wherever you allocated them, now it's no longer optimal, so then what do you do? Are you supposed to move things around so you can achieve the optimal solution, you know, for all your file placement? 139 00:18:40.410 --> 00:18:45.720 William Cheng: That becomes a very, very difficult problem. Okay, so computer science people, they typically like heuristics. So even though 140 00:18:46.020 --> 00:18:51.210 William Cheng: even though they also like optimization, there are also times when using a heuristic might be the right thing to do. 141 00:18:52.050 --> 00:19:01.440 William Cheng: Okay. So what kind of heuristic? Right. So for example, I mean, this is a really, really high-level heuristic. If you see a bunch of .c files, what do you think you need to do? 142 00:19:01.890 --> 00:19:04.980 William Cheng: Well, if you see a bunch of .c files,
well, you know, chances are we're going to 143 00:19:05.580 --> 00:19:08.130 William Cheng: actually compile code, and the .c files are going to be compiled 144 00:19:08.430 --> 00:19:16.500 William Cheng: so that all of them, you know, end up in the same directory. So in that case, you know, maybe we need to group all the .c files inside the same directory together into one cylinder group. 145 00:19:17.040 --> 00:19:23.550 William Cheng: Okay, so if you have a directory hierarchy like this, and you see makefiles all over the place, what you would do is, you know, group all these three 146 00:19:23.850 --> 00:19:33.300 William Cheng: files over here, you know, into a cylinder group. Why would you want to do that? Well, because inside this directory there's a makefile; when you type make, what it will do is go to the directory, 147 00:19:33.870 --> 00:19:40.800 William Cheng: you know, scan all the directory entries over here, find all the .c files, and then compile them all into .o files. 148 00:19:41.760 --> 00:19:51.360 William Cheng: Okay, so it will access the directory quite a bit. It will also, you know, access the .c files, because the compiler will open them, read them, and then generate the .o files. So it seems like it's a good idea to group them all together. 149 00:19:52.140 --> 00:19:58.890 William Cheng: So maybe one of the heuristics can be something like this: I want to group these together, I want to group these together. Well, what about FRC over here, at a different level? 150 00:19:59.490 --> 00:20:09.270 William Cheng: Okay, well, in that case, you know, maybe I need to group these guys together, or maybe I need to put them inside another cylinder group. So again, some heuristic will guide you about where to, you know, 151 00:20:09.960 --> 00:20:15.390 William Cheng: where to put the files. Okay. Is this a perfect solution? Is this the optimal solution? I mean, clearly it's not optimal.
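The "files in the same directory go into the same cylinder group" idea described above could be sketched like this. It is a toy illustration, not the actual fast file system allocator: the class name, the example paths, and the round-robin choice of a group for a directory seen for the first time are all assumptions made for the sketch.

```python
import posixpath


class CylinderGroupPlacer:
    """Toy placement heuristic: one cylinder group per directory."""

    def __init__(self, num_groups):
        self.num_groups = num_groups
        self.dir_to_group = {}   # directory path -> cylinder group index
        self.next_group = 0      # assumption: round-robin for new directories

    def group_for(self, path):
        """Pick the cylinder group for a new file at the given path."""
        directory = posixpath.dirname(path)
        if directory not in self.dir_to_group:
            self.dir_to_group[directory] = self.next_group
            self.next_group = (self.next_group + 1) % self.num_groups
        return self.dir_to_group[directory]


placer = CylinderGroupPlacer(num_groups=3)
for name in ("/proj/a/x.c", "/proj/a/y.c", "/proj/a/Makefile", "/proj/b/z.c"):
    print(name, "-> cylinder group", placer.group_for(name))
```

Everything under the same directory lands in one group, so a build that scans that directory and reads its .c files would stay within one cylinder group; files in a different directory may land elsewhere. As the lecture says, this kind of rule is a heuristic and makes no claim to being optimal.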
152 00:20:15.810 --> 00:20:23.370 William Cheng: But again, you know, the typical solution is to use a heuristic to determine where to place these files. Okay. So, by using cylinder groups, your 153 00:20:23.730 --> 00:20:33.720 William Cheng: your file system is going to slow down a little bit. So inside your file system, things are going to get a little slower because now you have to run these heuristics in order to figure out where to put the files. 154 00:20:36.000 --> 00:20:44.370 William Cheng: All right, so by using the larger blocks and also by using cylinder groups, how much can we improve performance? Okay. 155 00:20:44.760 --> 00:20:55.230 William Cheng: So people actually did this with the Rhinopias hard drive: they put on the fast file system, only turning on these two features. As it turns out, if you do that, you get a factor of 20 improvement. 156 00:20:56.070 --> 00:21:08.100 William Cheng: I'm not saying a 20% improvement, a factor of 20. Okay, so remember at the beginning, we were only using 0.16% of the transfer capacity of the Rhinopias hard drive. So even if you multiply that by 157 00:21:08.400 --> 00:21:13.170 William Cheng: 20, in the end you're only using 3.7% of the maximum possible 158 00:21:13.620 --> 00:21:25.530 William Cheng: disk transfer capacity. Okay. So clearly this is, you know, a big improvement, but in the end, you know, we're still pretty upset that we're only using 3.7% of the maximum transfer capacity of the disk. Yeah. 159 00:21:26.880 --> 00:21:29.850 William Cheng: So there's one thing that we also need to improve: we need to improve 160 00:21:30.120 --> 00:21:37.890 William Cheng: you know, rotational latency.
So remember, the maximum transfer capacity is computed by assuming that the seek time is zero and the rotational latency is also zero. 161 00:21:38.250 --> 00:21:45.480 William Cheng: Okay, so we have tried to reduce the seek time part of it already, and now we also need to minimize rotational latency. 162 00:21:46.020 --> 00:21:51.960 William Cheng: All right, so where does rotational latency come from? Right. So let's say that we're reading sector number one over here. 163 00:21:52.740 --> 00:21:56.730 William Cheng: So let's say that we will transfer data from sector number one over here to 164 00:21:57.270 --> 00:22:08.760 William Cheng: the disk controller. Right. So we wait out the rotational latency for sector one to show up under the disk head, and then we start transferring its data into the buffer inside the disk controller. Well, when we reach the end over here, what do we do? 165 00:22:09.360 --> 00:22:19.500 William Cheng: Well, when we reach the end there, we have finished transferring a sector. And now what happens is that the disk controller is going to DMA the data. So here is the memory on the bus, right, memory is on the bus, and here is the disk controller. 166 00:22:19.800 --> 00:22:25.050 William Cheng: The disk controller will DMA the data from the disk into memory, right, and that's going to take some time. And also, 167 00:22:25.260 --> 00:22:31.380 William Cheng: when that transfer is finished, it will interrupt the CPU. So the CPU is sitting over here, and 168 00:22:31.560 --> 00:22:39.930 William Cheng: the CPU might have interrupts blocked, or maybe interrupts disabled, right? The CPU might have interrupts blocked because it is servicing a more important 169 00:22:40.170 --> 00:22:48.720 William Cheng: interrupt, because the IPL of the disk is not very, very high; it is somewhere in the middle. Okay, so therefore the CPU might be servicing
another interrupt, whatever it is servicing, 170 00:22:49.200 --> 00:23:00.300 William Cheng: an inter-processor interrupt, or the clock interrupt, or something like that. So it might take a while for the CPU to respond. Eventually, when the CPU executes the interrupt service routine for the disk, what does it do, right? 171 00:23:00.720 --> 00:23:11.040 William Cheng: Well, inside the kernel, it starts the next I/O operation. What is the next I/O operation? The next I/O operation is going to be: read sector number two, and then transfer that into memory. 172 00:23:12.030 --> 00:23:20.610 William Cheng: So sector number two is right here. But at this point, where is the disk head? All this time, while we were doing the DMA, while we were interrupting the CPU and waiting 173 00:23:22.080 --> 00:23:28.380 William Cheng: for the interrupt to get delivered, the disk continues to spin. Right. The disk is a mechanical device; it doesn't slow down. 174 00:23:29.220 --> 00:23:32.670 William Cheng: Okay, so therefore, while you're waiting for all these things, the disk continues spinning. 175 00:23:32.910 --> 00:23:39.870 William Cheng: By the time you're ready for the next I/O operation, the disk head is going to be somewhere in sector two, or maybe even worse, 176 00:23:40.080 --> 00:23:47.070 William Cheng: it is going to be somewhere in sector three. So in that case, we need to wait out the rotational latency again. How long do we have to wait? 177 00:23:47.970 --> 00:23:53.460 William Cheng: Okay, we actually have to wait for an entire revolution of the disk before we can get to the beginning of sector two again. 178 00:23:53.670 --> 00:24:04.500 William Cheng: Right, because we can't really start reading sector two partway in. We don't really know what we're supposed to read.
So therefore, we need to wait for an entire revolution until we get to the first bit of sector two, and then we start transferring data again. 179 00:24:04.740 --> 00:24:12.030 William Cheng: Well, when we finish doing that, we interrupt the CPU, and by the time we are ready to transfer sector number three, the disk head is over here again, and we wait, over and over again. 180 00:24:12.510 --> 00:24:18.660 William Cheng: Okay, so let's say you try to transfer n sectors over here, and they are contiguous on the disk. 181 00:24:19.290 --> 00:24:27.180 William Cheng: For the first sector, you have to wait, on average, maybe halfway around the disk. So it's going to be half a revolution of the disk. 182 00:24:27.450 --> 00:24:36.210 William Cheng: For all the other blocks, we're going to wait for an entire revolution of the disk. Okay, so in the end we're going to end up waiting for 183 00:24:36.540 --> 00:24:49.650 William Cheng: one half revolution of the disk plus n minus one times one revolution of the disk. Okay, so this is one of the reasons the System V file system is so slow: if you try to transfer consecutive sectors, it is actually going to be the worst case. 184 00:24:51.000 --> 00:25:03.240 William Cheng: Okay, so in the textbook they have some justification to tell you that, on a typical system, the disk head is going to be a little bit into sector two when you're ready to transfer the next block of data. 185 00:25:04.530 --> 00:25:12.540 William Cheng: So in this case, what is the solution over here? The solution over here is to say, well, why don't we just call this block, this sector over here, sector number two? 186 00:25:14.430 --> 00:25:23.700 William Cheng: Well, that solution is very simple. Right.
There's no reason why we have to call the next sector sector number two. Maybe we'll call this one number two, and then this one will be three, this one will be four, this one will be five, 187 00:25:23.910 --> 00:25:30.870 William Cheng: this one will be six, this one will be seven, this one will be eight. So this way, when we finish with sector one and we go to sector two, how long do we have to wait? 188 00:25:31.320 --> 00:25:33.240 William Cheng: The disk head is actually sitting right here. 189 00:25:33.750 --> 00:25:39.210 William Cheng: If the disk head is sitting right here, then in this case we only have to wait for this much rotational latency. So again, 190 00:25:39.450 --> 00:25:47.070 William Cheng: on average there are going to be 750 sectors per cylinder, I mean per track, so this is going to be a very, very small distance. 191 00:25:47.490 --> 00:25:54.990 William Cheng: Okay, so if we can do this, we end up minimizing the rotational latency. Okay, and that's exactly what the fast file system does. 192 00:25:55.380 --> 00:26:02.100 William Cheng: They call this block interleaving. So in this case, we just renumber the blocks. In this case 193 00:26:02.490 --> 00:26:08.700 William Cheng: the block is bigger already, so there will be multiple sectors in this block over here, but now 194 00:26:10.290 --> 00:26:11.640 William Cheng: we need to leave enough 195 00:26:12.720 --> 00:26:21.900 William Cheng: space so that we can call the next one over here block number two. Okay. So what we want to do is minimize the chance that we actually have to wait for an entire rotation of the disk. 196 00:26:22.620 --> 00:26:24.870 William Cheng: Now, as it turns out, this is really simple. 197 00:26:25.410 --> 00:26:34.860 William Cheng: I mean, the idea is very simple.
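The waiting-time arithmetic and the renumbering idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the lecture or the textbook: the function names are made up, the interleaved case assumes the CPU is always ready before the next logical block arrives under the head, and the 8-slot track is just small enough to print.

```python
def wait_without_interleaving(n):
    """Revolutions spent waiting when reading n contiguous sectors:
    half a revolution for the first one, a full revolution for each of the rest."""
    return 0.5 + (n - 1) * 1.0


def wait_with_interleaving(n, skip, sectors_per_track):
    """Assumed model: the next logical sector is `skip` slots further along,
    so each later wait is only skip/sectors_per_track of a revolution."""
    return 0.5 + (n - 1) * (skip / sectors_per_track)


def interleaved_layout(slots, factor):
    """Return a list where entry i is the logical number stored in physical slot i.
    A factor of 1 means one physical slot is skipped between consecutive logical numbers."""
    layout = [None] * slots
    pos = 0
    for logical in range(slots):
        while layout[pos] is not None:      # slot already taken, slide forward
            pos = (pos + 1) % slots
        layout[pos] = logical
        pos = (pos + factor + 1) % slots    # leave `factor` slots as a gap
    return layout


print(wait_without_interleaving(8))         # revolutions for 8 contiguous sectors
print(wait_with_interleaving(8, 1, 750))    # same read with a one-slot gap, 750 slots per track
print(interleaved_layout(8, 1))             # physical order of logical numbers
```

With a larger interleaving factor the gap grows, which gives a slower CPU more time at the cost of a slightly longer wait per block.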
But in the end, the file system does get, again, more complicated, because now when you try to seek to retrieve 198 00:26:35.250 --> 00:26:44.820 William Cheng: a particular block of data, we need to be aware of the block interleaving that's going on. Okay, and also the block interleaving over here is not always a block interleaving of one. So this is 199 00:26:45.270 --> 00:26:54.210 William Cheng: an interleaving factor of one. All right, so when we go from one block to the next one, we skip one block. We can also have an interleaving factor of two, three, or four. 200 00:26:54.630 --> 00:27:03.570 William Cheng: Okay. So it really depends on how busy the operating system is. So when you first set up your system, you've got to decide on your interleaving factor. 201 00:27:03.840 --> 00:27:08.730 William Cheng: Okay, if for some reason your CPU has 128 levels of interrupt 202 00:27:09.000 --> 00:27:16.110 William Cheng: and the disk is at very, very low priority, in that case you have to service all the other interrupts before you come to the disk. Well, in that case, 203 00:27:16.350 --> 00:27:25.200 William Cheng: if you only interleave by one block, by the time the CPU is ready, maybe the disk head is already in the third block. 204 00:27:25.920 --> 00:27:32.520 William Cheng: Okay, so in that case you need to increase the interleaving factor. So what happens is that when you are about to format your file system, you need to 205 00:27:33.030 --> 00:27:44.250 William Cheng: run a benchmark on your system to figure out what the right interleaving factor is for the kind of machine that you have. Okay, how much interleaving you do depends on the system. 206 00:27:45.750 --> 00:28:03.000 William Cheng: By just using block interleaving,
we can improve the performance by another 15-fold. OK. So again, this is a factor of 15, not 15%. Okay, so before we were at 3.7%; if you multiply by 15 we get to almost 50% of the transfer capacity of the Rhinopias hard drive. Yeah. 207 00:28:04.740 --> 00:28:20.400 William Cheng: All right, so in the end, over here, we end up with 32.4 megabytes per second. So by using a larger block, by using cylinder groups, and by using block interleaving, we can now achieve 32.4 megabytes per second. 208 00:28:21.030 --> 00:28:29.490 William Cheng: The maximum transfer capacity of the Rhinopias hard drive, as we mentioned before, is around 64 or 65 megabytes per second, right, so now we reach about 50% of the maximum transfer speed. 209 00:28:30.300 --> 00:28:41.280 William Cheng: All right, so a quick summary. If you use the System V file system, and again, the capacity over here is the transfer capacity, we're using only 0.16% of the transfer 210 00:28:41.670 --> 00:28:48.780 William Cheng: capacity. With the fast file system without block interleaving, we can reach about 3.7% of the transfer capacity. 211 00:28:49.230 --> 00:28:55.170 William Cheng: With the fast file system with block interleaving and the other two features added, we can reach about 50% of the transfer capacity. 212 00:28:56.040 --> 00:29:05.550 William Cheng: But can we reach 100% of the transfer capacity? Okay, so basically that means the seek time has to be zero and the rotational latency has to be zero. 213 00:29:06.030 --> 00:29:18.690 William Cheng: Okay, so what do we actually do to get to 100% of the transfer capacity? Right. I mean, 50% is pretty good. But again, you might want a refund of half your money, because you only get 50% of the capacity of the disk. Okay. 214 00:29:20.640 --> 00:29:22.770 William Cheng: All right, so 215 00:29:24.180 --> 00:29:32.490 William Cheng: All right.
So again, the key over here is to understand the structure of the disk: at some point you still have to perform a seek, and at some point you have to wait for rotational latency. 216 00:29:33.210 --> 00:29:36.060 William Cheng: So let's do that again. 217 00:29:36.690 --> 00:29:42.000 William Cheng: Let's review what a file looks like. Here are the data blocks of a file. In order for you to read a data block of the file, 218 00:29:42.180 --> 00:29:50.340 William Cheng: you need to go to the inode, right. So the inode is stored over here, and the data blocks over here are scattered all over the place inside the data region of the System V file system. 219 00:29:50.550 --> 00:29:56.160 William Cheng: When we went to the fast file system, we made a major improvement by grouping them into the same cylinder group. 220 00:29:56.370 --> 00:30:03.450 William Cheng: Now the inode and the data blocks over here all go into the same cylinder group. Okay, so again, the file system does get more complicated. 221 00:30:03.690 --> 00:30:14.850 William Cheng: Inside the superblock over here, we need to keep track of where all the cylinder groups are, right, and also inside each cylinder group we need to keep a data structure for the free list of that cylinder group. 222 00:30:15.900 --> 00:30:20.550 William Cheng: Okay, so some of the data structures are going to change from the System V file system to the fast file system. Yeah. 223 00:30:21.720 --> 00:30:27.180 William Cheng: But even when you do this, even when you try to perform the minimal amount of seeking, you're still doing seeks. The seek time is not equal to zero. 224 00:30:27.600 --> 00:30:33.390 William Cheng: So it seems that 100% of the transfer capacity is really unachievable. Well, 225 00:30:33.990 --> 00:30:40.050 William Cheng: all right, so certainly in the 80s people weren't able to do it. But, you know, over time, something happened.
226 00:30:40.290 --> 00:30:45.810 William Cheng: You know, with the, the speed of memory and the speed of this over here. So here's a picture of horizontal axis over here is time. 227 00:30:46.260 --> 00:30:55.050 William Cheng: This is about 1980 something 80 something the fastball since then. I think the fastest isn't was was sort of like in the early 90s over here, but 228 00:30:55.470 --> 00:31:01.950 William Cheng: Over time of harvest at the memory actually getting cheaper and cheaper. We can start getting more and more memory. 229 00:31:02.460 --> 00:31:11.520 William Cheng: Okay, like I said before, you know, if you have the memory. That's the size of the depth that you can read all the data for this into memory and then you're done. Right. So in this case, you're gonna energy perform. Very good. So in this case, 230 00:31:12.060 --> 00:31:26.940 William Cheng: You know, did you did you achieve 100% of this transfer capacity. Okay. So remember that the transfer capacity of that this is only 64 megabytes per second. If you never have to go to the desk, you can actually yo yo performance is going to be much, much better than 64 megabytes per second. 231 00:31:28.170 --> 00:31:39.420 William Cheng: Okay, because you'll be running, running at the speed of the CPUs and this can you read much, much faster. Okay, so as as we start getting more and more memory, we can actually use a different scheme to cash, most of the data from this into memory. 232 00:31:40.440 --> 00:31:45.750 William Cheng: So therefore, when we get here. So this is, you know, stood up, you know, in the late 90s, early 233 00:31:46.710 --> 00:31:51.780 William Cheng: 2000s, we start doing very aggressive caching, because we have a lot of memory. Gavin today on 234 00:31:52.320 --> 00:32:06.390 William Cheng: The system you have, you know, some some people's laughed out of 16 gigabytes of memory. That's a lot of memory. 
Okay, so what people do is take about 20% or 25% of that memory just to build the cache for the file system. 235 00:32:07.470 --> 00:32:20.880 William Cheng: As we mentioned before, in the file system there's a cache, right, so we're going to take away 20 to 25% of your physical memory and dedicate it to be used as a file system cache. Okay. So in this case you're going to end up 236 00:32:22.200 --> 00:32:30.150 William Cheng: performing much better than before. When we talked about the disk before, there's this cache, right: whenever you try to 237 00:32:30.510 --> 00:32:33.180 William Cheng: get data from the disk, you first look in the file system cache. 238 00:32:33.360 --> 00:32:41.310 William Cheng: If the data is there, you don't have to go to the disk. If the data is not there, then you go to the disk and get the data, and after you get the data from the disk you keep a copy inside the file system cache. 239 00:32:41.490 --> 00:32:46.920 William Cheng: From then on, if you need to access that data again, you can go to the file system cache and look for it there first. 240 00:32:47.370 --> 00:32:51.060 William Cheng: OK. So again, this is a cache, so we're going to use the same terminology as for 241 00:32:51.450 --> 00:32:57.120 William Cheng: the translation lookaside buffer, because that's also a cache. We're going to talk about the hit probability and the miss probability. 242 00:32:57.300 --> 00:33:05.490 William Cheng: The hit probability is the probability that you find the data inside the cache, and the miss probability is the probability that you don't find the data in the cache and actually have to 243 00:33:06.090 --> 00:33:12.390 William Cheng: go to the disk and get the data. Okay, so let's say you have a high hit probability, say a 99% hit rate. 244 00:33:13.590 --> 00:33:26.460 William Cheng: Okay.
That means that 99% of the time, when you need data from the disk, it is going to cost you nearly zero by comparison, because the disk is so slow. Okay. Only 1% of the time does it cost you anything, and how much is it going to cost you? Right. So, on average, a disk access is about 10 milliseconds. 245 00:33:27.630 --> 00:33:38.130 William Cheng: Okay. So in this case, 99% of the time you get zero milliseconds, and 1% of the time you pay the 10 milliseconds. So the average is going to be 246 00:33:38.520 --> 00:33:46.050 William Cheng: what you get when you multiply these out: we end up with 0.1 millisecond as your average disk access time. 247 00:33:46.860 --> 00:33:53.790 William Cheng: Okay, so once you start doing this, you can see that you can easily get better than 100% of the transfer capacity of the Rhinopias hard drive. 248 00:33:54.720 --> 00:34:09.540 William Cheng: Okay, so what if you only get a 90% hit rate? Right. So in this case, again, 90% of the time it takes zero milliseconds, and 10% of the time it costs you the 10 milliseconds. So the average disk access time is one millisecond. 249 00:34:10.620 --> 00:34:20.610 William Cheng: OK. So again, depending on how big a file system cache you have, and also what kind of hit rate you can get, you can actually improve the 250 00:34:20.940 --> 00:34:26.730 William Cheng: disk transfer speed on average to exceed the Rhinopias hard drive's transfer capacity. 251 00:34:27.240 --> 00:34:33.480 William Cheng: And so therefore, it becomes really easy to achieve better than 100% of the disk transfer capacity. 252 00:34:33.810 --> 00:34:42.360 William Cheng: Okay. But again, this is only for reading, right, because when we perform a read, we can use the file system cache. What about when you perform a write operation?
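The hit-rate arithmetic above is just a weighted average. A small Python sketch makes it easy to try other hit rates. The function name and the default of 0 ms for a cache hit are assumptions taken from the lecture's simplification that a hit costs nearly nothing compared with a 10 ms disk access.

```python
def average_access_ms(hit_rate, miss_cost_ms=10.0, hit_cost_ms=0.0):
    """Expected time per access: hits pay hit_cost_ms, misses pay miss_cost_ms."""
    miss_rate = 1.0 - hit_rate
    return hit_rate * hit_cost_ms + miss_rate * miss_cost_ms


for rate in (0.99, 0.90, 0.50):
    print(f"hit rate {rate:.2f}: {average_access_ms(rate):.2f} ms on average")
```

Passing a small nonzero `hit_cost_ms` shows how little the result changes as long as memory is orders of magnitude faster than the disk.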
253 00:34:43.170 --> 00:34:50.760 William Cheng: Okay, when you perform a write operation, you can't really just use the cache, right, because the data needs to go onto the disk. So in that case 254 00:34:52.260 --> 00:34:55.740 William Cheng: you will need a different kind of solution for writing. OK. 255 00:34:57.720 --> 00:35:06.750 William Cheng: So people actually gave the file system cache a special name: they call it the buffer cache. So for the buffer cache, again, we're going to use 20% to 25% of physical memory, 256 00:35:06.930 --> 00:35:14.640 William Cheng: dedicated and given to the file system as the buffer cache. So whenever you perform a read operation, first we need to check the buffer cache and see if it has the data. 257 00:35:14.790 --> 00:35:17.730 William Cheng: If it has the data, we return that data without going to the disk. 258 00:35:17.940 --> 00:35:26.850 William Cheng: If it doesn't have the data, we go to the disk to get the data and then keep a copy inside the buffer cache. If it turns out the buffer cache is full, then again 259 00:35:27.780 --> 00:35:33.840 William Cheng: we have to implement some kind of replacement policy to toss something out, and this will create space, right. 260 00:35:34.860 --> 00:35:42.360 William Cheng: Now the write operation over here. Okay, for a write operation, when we write the data into the buffer cache over here, I mean, ideally the data also needs to go onto the disk. 261 00:35:43.140 --> 00:35:49.890 William Cheng: Okay, so therefore, in this case, what can we do? As it turns out, there are two different approaches. One is called write-through, and the other one is called write-back. 262 00:35:50.430 --> 00:35:56.250 William Cheng: Okay, write-through is the one where you write to the disk right away. First you update the data inside the buffer cache, right.
263 00:35:56.400 --> 00:36:04.170 William Cheng: So in this case you modify the data inside the cache and you also start the disk operation, so write doesn't return until the write to the disk has finished. 264 00:36:04.710 --> 00:36:08.700 William Cheng: Okay, so with write-through, the data eventually is going to finish 265 00:36:09.180 --> 00:36:14.880 William Cheng: writing onto the disk, and now the data on the disk is exactly the same as the data inside the buffer cache. And now you return from write. 266 00:36:15.330 --> 00:36:24.720 William Cheng: Okay, so in that case, for writes, even with cylinder groups, with all the tricks that we use in the fast file system, in the end we can only achieve 50% of the maximum transfer capacity. Right. 267 00:36:25.770 --> 00:36:29.640 William Cheng: Now the other way to do it is to use something called write-back. 268 00:36:30.420 --> 00:36:36.720 William Cheng: Well, so the problem with write-through over here is that it is going to be too slow, and we can only achieve 50% of the transfer capacity of the disk. 269 00:36:37.260 --> 00:36:41.280 William Cheng: The other approach is called write-back. Okay, so this one is actually trying to be optimistic. 270 00:36:41.460 --> 00:36:52.350 William Cheng: What we do is that when we perform the write operation over here, we modify the data inside the buffer cache, and then we schedule this block to be written out to the disk at a later time. And now the write returns right away. 271 00:36:53.100 --> 00:37:03.360 William Cheng: Okay, so as soon as the write system call has modified the data inside the buffer cache, it returns, and we schedule all these data blocks over here to be written out to the disk at a later 272 00:37:03.990 --> 00:37:14.130 William Cheng: time. So that's why this operation is known as write-back. Okay, we're going to modify the buffer cache.
And then we're going to write the data back onto the disk at a later time. What if you actually lose power in between? 273 00:37:15.120 --> 00:37:23.520 William Cheng: Okay. So the problem with this approach is what happens if you lose power. So let's say you finished writing the first one and the second one over here onto the disk, and all of a sudden you lose power; you don't have time to write all the other data onto 274 00:37:24.150 --> 00:37:29.610 William Cheng: the disk. Next time, when you reboot the system, the data on the disk will not be consistent. 275 00:37:30.180 --> 00:37:36.930 William Cheng: Right, so again, you're modifying different blocks on the disk. Some of them might belong to inodes, some of them might belong to the data region. 276 00:37:37.620 --> 00:37:46.200 William Cheng: So in this case, the data on the disk is not consistent the next time you boot the system. What if it turns out that you were trying to modify the root directory inode? 277 00:37:47.010 --> 00:37:53.220 William Cheng: Okay, so if you were modifying that, you end up with an inconsistent root inode. So in that case, you might not be able to boot your system. 278 00:37:53.520 --> 00:38:01.140 William Cheng: Okay, or sometimes it's possible that you will lose half the disk when you lose power. It's also possible that you get an operating system crash. Well, you get 279 00:38:02.790 --> 00:38:14.730 William Cheng: a new patch and you install it. As it turns out, the operating system has a bug, and somewhere along the line, when you try to run your code, all of a sudden you get a kernel panic, and half the data has been written out to the disk and the other half is still inside the buffer cache. 280 00:38:15.990 --> 00:38:21.000 William Cheng: Okay, so this was a major problem in the good old days, maybe just 15 years ago.
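To make the difference between the two write policies concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not how a real kernel implements its buffer cache: the "disk" is a plain dictionary, the class and method names are invented for illustration, there is no replacement policy, and `flush` stands in for the write that the kernel would schedule for a later time.

```python
class ToyBufferCache:
    """Toy buffer cache in front of a dict that stands in for the disk."""

    def __init__(self, disk, write_back=False):
        self.disk = disk              # block number -> data, the "disk"
        self.cache = {}               # block number -> cached data
        self.dirty = set()            # blocks modified in cache but not yet on disk
        self.write_back = write_back

    def read(self, block_no):
        if block_no not in self.cache:            # miss: go to the disk, keep a copy
            self.cache[block_no] = self.disk[block_no]
        return self.cache[block_no]

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_back:
            self.dirty.add(block_no)              # return now, write to disk later
        else:
            self.disk[block_no] = data            # write-through: disk updated before returning

    def flush(self):
        """Write every dirty block out; stands in for the delayed disk write."""
        for block_no in sorted(self.dirty):
            self.disk[block_no] = self.cache[block_no]
        self.dirty.clear()


disk = {0: "old inode", 1: "old data"}
cache = ToyBufferCache(disk, write_back=True)
cache.write(0, "new inode")
cache.write(1, "new data")
print(disk)        # still the old contents: a crash here would lose both writes
cache.flush()
print(disk)        # now the disk matches the cache
```

If the machine "crashes" partway through `flush`, the dictionary ends up with a mix of old and new blocks, which is the inconsistency described above.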
281 00:38:21.360 --> 00:38:32.460 William Cheng: Whenever we lost power or something like that, the next time the system booted you would see a thing spinning on the screen that said it was checking the file system. If you were unlucky, you ended up losing half the data on your disk. 282 00:38:33.510 --> 00:38:42.300 William Cheng: Okay, so if we use write-back, in the end you will be able to achieve nearly 100% of the disk transfer capacity. 283 00:38:42.480 --> 00:38:52.110 William Cheng: But the downside over here is that it is very, very risky. So if we want a solution, we have to address this pretty big downside. Okay. Because otherwise, the next time you boot, you're going to end up losing half of your disk. Yeah. 284 00:38:53.940 --> 00:39:01.770 William Cheng: So what is the solution for this one? The solution over here is that you organize the disk as a very, very long log. 285 00:39:02.580 --> 00:39:12.030 William Cheng: Okay, so what's a log? A log is like a journal, right: when you write a journal, you only add stuff at the end, and all the stuff that came before, you never delete or modify. 286 00:39:12.780 --> 00:39:19.980 William Cheng: Okay. So in this case, when you perform long writes, you can actually achieve 100% of the disk transfer capacity. 287 00:39:20.400 --> 00:39:26.190 William Cheng: Then we also need to address the risk factor; I'll talk about that towards the end. Okay, I'll talk about how to actually deal with that. Yeah. 288 00:39:27.000 --> 00:39:38.100 William Cheng: Right. So the idea here is that we're going to have a different organization of the disk. Here is our disk, right: block number zero is our boot block.
Here's our superblock over here, and the rest of the file system is just the log. 289 00:39:38.880 --> 00:39:46.410 William Cheng: Okay, the property of the log is that it's append-only. So if I want to add stuff to my file system, I add stuff at the end over here. 290 00:39:46.650 --> 00:39:51.270 William Cheng: And also, anything that I have written out to the file system over here is never deleted, and I can never update it. 291 00:39:51.570 --> 00:39:59.580 William Cheng: Okay, there's no way for me to come to the log over here and change something written in the past, because if I do that while I'm using write-back, I'm going to end up 292 00:40:00.360 --> 00:40:10.650 William Cheng: in big trouble again. Okay, so the trick over here is that I'm going to use write-back, and we're going to change the organization of the file system so that the entire file system now becomes one very, very long log. 293 00:40:10.980 --> 00:40:17.820 William Cheng: Okay, so the view of the file system is that you still have files all over the place, but the actual file system over here is going to 294 00:40:19.620 --> 00:40:23.970 William Cheng: be a very long log. So the question is how do we actually implement this. Okay. 295 00:40:24.960 --> 00:40:34.980 William Cheng: So, for example, when we create a file, the new inode and the data over here go at the end of the log, and then we create another file over here, and so on. When you try to modify a file, what do I do? 296 00:40:35.250 --> 00:40:41.760 William Cheng: Okay, if I want to modify this file over here, again, I cannot delete anything. I need to add the new inode and the new data at the end over here. 297 00:40:42.720 --> 00:40:48.630 William Cheng: So in this case, how do I invalidate the old data over here? So, I guess.
So you're going to require some kind of trick to say that 298 00:40:48.930 --> 00:40:58.380 William Cheng: this file has been moved from here to here, and here is the actual place for that particular file. Okay, so the log-structured file system has to address all these problems. 299 00:41:00.540 --> 00:41:07.290 William Cheng: So, all right, if you have a system like this that never deletes and never updates, what is the main problem of this particular file system? 300 00:41:08.100 --> 00:41:15.660 William Cheng: Okay, once I use up the entire disk, the entire file system becomes a read-only file system. Okay, so this is really not a very useful file system. 301 00:41:16.170 --> 00:41:23.490 William Cheng: Okay, so here we're going to talk about an implementation known as the log-structured file system. Okay. And this was implemented by 302 00:41:23.940 --> 00:41:31.410 William Cheng: people at UC Berkeley, researchers who were playing around with the idea of how to achieve 100% of the disk write capacity 303 00:41:31.830 --> 00:41:37.530 William Cheng: by building a specialized file system, and they came up with a log-structured file system, and they call it the Sprite file system. 304 00:41:38.070 --> 00:41:50.370 William Cheng: Okay, this is not a useful file system, because when you reach the end of the disk, your disk becomes a read-only disk. So therefore it has very, very limited use. We're going to briefly talk about how to actually build a system like that.
305 00:41:51.930 --> 00:42:00.930 William Cheng: So again, the basic idea here is that when you create a new file, you add the data blocks over here, and then you can figure out what block numbers they have. 306 00:42:01.110 --> 00:42:10.470 William Cheng: And then the inode over here: inside the inode there's the disk map, right, and the orange arrows over here show you where the disk map is pointing to. So maybe this one only requires two data blocks. 307 00:42:10.710 --> 00:42:15.360 William Cheng: The first block is right here, the second block is right here. Right. And again, if you're using the fast file system, 308 00:42:15.690 --> 00:42:24.060 William Cheng: the last part over here has 13 pointers and all that kind of stuff. Right. So for the log-structured file system, you can imagine we're going to use the same kind of inode as the fast file system or the System V file system. 309 00:42:24.450 --> 00:42:27.810 William Cheng: Okay, so here's one file, and maybe we append another file and 310 00:42:28.200 --> 00:42:39.090 William Cheng: then another file. So here is the file system with two different files: file A is here, file B is here. Yeah. And then there's another data structure that keeps track of all the files inside the file system. 311 00:42:39.480 --> 00:42:43.740 William Cheng: Okay, that data structure in the Sprite file system is known as the inode map. 312 00:42:44.340 --> 00:42:49.650 William Cheng: OK. So again, don't confuse the inode map with the disk map, right: the disk map is the last part of the inode. 313 00:42:49.860 --> 00:42:57.150 William Cheng: The inode map is a data structure that stores where all the inodes are, right, and again, what are inodes? The inodes are the files. So over here, here's an inode, here's an inode. 314 00:42:57.330 --> 00:43:02.370 William Cheng: The inodes are scattered all over the disk, so we need another data structure to keep track of where they are.
315 00:43:02.670 --> 00:43:13.440 William Cheng: Okay, so let's say that here's the inode map. It knows where file A is and where file B is. So again, A over here is going to be a block number that tells you exactly where this inode is, and B over here will be a block number that tells you exactly where it is. 316 00:43:14.340 --> 00:43:21.810 William Cheng: All right, so let's take a look at an example to see what happens when you try to change one of the files. So let's say that we want to make file A a little bigger. 317 00:43:22.680 --> 00:43:28.710 William Cheng: Okay. So again, if you look at this picture over here, file A is right here. How do you make it a little bigger? Right, the first pointer points right here. The second pointer points right here. 318 00:43:28.920 --> 00:43:36.120 William Cheng: We need to make the second block a little bigger, but we can never modify what's inside the log over here. So we're going to actually create a larger block over here. 319 00:43:36.420 --> 00:43:45.480 William Cheng: For this block over here, we want to change the pointer to point over here, but we can't really change the pointer, because this inode is inside the log over here. So, in the end, we need to 320 00:43:45.750 --> 00:43:51.900 William Cheng: move this inode all the way to the end of the log over here. So this guy, the first pointer over here, points right here. 321 00:43:52.860 --> 00:43:58.200 William Cheng: The first pointer over here points right here, and the second pointer will point to the new block. Okay. So in this case, when we're done, 322 00:43:58.590 --> 00:44:04.920 William Cheng: this block has been moved over there. So, although I've drawn an X over there, there's no way for you to delete anything, so that thing is still there. 323 00:44:05.160 --> 00:44:09.750 William Cheng: Right. We're just going to change the pointer, and then the inode is also going to move to the new location.
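The update just described (append a new block, then append a new copy of the inode, never touching what is already in the log) can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not Sprite's actual code: the log is a Python list, a "block number" is a list index, and the names `Inode`, `inode_map`, `create_file`, and `replace_block` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Inode:
    name: str
    disk_map: tuple  # log addresses of the file's data blocks (hypothetical layout)

log = []        # append-only: existing entries are never modified or deleted
inode_map = {}  # file name -> log address of that file's current inode

def append(entry):
    """Append one entry to the end of the log and return its address."""
    log.append(entry)
    return len(log) - 1

def create_file(name, blocks):
    """Append the data blocks, then the inode whose disk map points at them."""
    addrs = tuple(append(b) for b in blocks)
    inode_map[name] = append(Inode(name, addrs))

def replace_block(name, index, new_block):
    """'Modify' a block by appending a new block and a new copy of the inode."""
    old_inode = log[inode_map[name]]
    new_addr = append(new_block)
    disk_map = old_inode.disk_map[:index] + (new_addr,) + old_inode.disk_map[index + 1:]
    inode_map[name] = append(Inode(name, disk_map))

create_file("A", [b"a0", b"a1"])
create_file("B", [b"b0"])
replace_block("A", 1, b"a1-bigger")

# The old second block and the old inode are still in the log; nothing points to them.
print(log[inode_map["A"]].disk_map)
```

In this sketch the first pointer of the new inode still refers to the original first block, and only the second pointer refers to the newly appended block.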
324 00:44:10.050 --> 00:44:16.770 William Cheng: Yeah, so we're going to end up with a data structure that looks like this. This is file A over here. The first block pointer points to the original block over here, because it hasn't been changed. 325 00:44:17.010 --> 00:44:20.340 William Cheng: And the second pointer over here points to the new one over here, which is a little bigger than the previous one. 326 00:44:21.180 --> 00:44:30.870 William Cheng: And also, inside the inode map, this is where the new file A is, and now we cannot find the older version of it, right? Because the older version of it is right here, and now there's nothing that points to it. 327 00:44:32.700 --> 00:44:40.920 William Cheng: So in this case, the inode map has changed, right? So one thing we could do is write the entire inode map at the end of the log-structured file system. 328 00:44:41.100 --> 00:44:48.270 William Cheng: But the inode map is actually pretty big, because it knows where all the inodes are. Okay, so it's really not feasible to write the entire inode map, 329 00:44:49.590 --> 00:45:02.250 William Cheng: appended to the end of the log. So what the Sprite file system people do is that they divide the inode map, chopping it down into little pieces, and they will find the piece that contains file A, and that will be the only piece that goes onto the end of the log. 330 00:45:03.120 --> 00:45:09.540 William Cheng: Okay, so therefore you will do something like this, right? This is going to be the inode map piece that contains file A. As it turns out, in this example, it also contains 331 00:45:10.650 --> 00:45:20.190 William Cheng: B. So therefore this one over here is going to be updated, and it will be appended at the end of the log. Okay. So this inode map piece used to be somewhere inside the log over here, right? So now, since 332 00:45:20.520 --> 00:45:24.570 William Cheng: it has been modified, we need to append a new inode map
piece at the end of the log. 333 00:45:25.350 --> 00:45:32.370 William Cheng: Okay, but what about the inode map? Now the data structure, the inode map, is the one that keeps track of all the inode map pieces, right? 334 00:45:32.520 --> 00:45:38.460 William Cheng: Now one of them moved over here. So therefore the inode map also changes. Okay, do you need to append the inode map at the end over here? So the Sprite, 335 00:45:38.880 --> 00:45:50.130 William Cheng: the Berkeley people, decided not to do that. Okay, so what they do is that they actually allocate a part of the disk to be a data structure that keeps track of the inode map, and that's not part of the log. 336 00:45:50.610 --> 00:45:56.340 William Cheng: Okay, so therefore, when you modify the inode map, you don't have to append that inode map data structure at the end of the log. 337 00:45:57.300 --> 00:46:08.760 William Cheng: Okay. You know, if you think about recursion, right, this is how the recursion stops, because in the end you have to have a way to stop the recursion. The way to stop the recursion over here is to take something out of the log. So therefore, eventually the recursion will stop. Yeah. 338 00:46:10.410 --> 00:46:19.050 William Cheng: So the idea here is that they have this data structure called a checkpoint file. The checkpoint file contains information about where all the inode map pieces are. So now, 339 00:46:19.290 --> 00:46:28.080 William Cheng: here are some of the inode map pieces, here are some of the inode map pieces, and here are some of the inode map pieces. So by going to the checkpoint file, you have the entire file system hierarchy. 340 00:46:31.350 --> 00:46:42.630 William Cheng: So there's one more thing over here we need to address, right? The data that you write onto the disk over here, you need to write onto the disk in one long write, because the purpose of the Sprite
341 00:46:43.410 --> 00:46:48.360 William Cheng: file system is that you're trying to achieve 100% of the write transfer capacity. Right? How do you do that? 342 00:46:48.750 --> 00:46:58.050 William Cheng: Okay, so if you want to do that, you've got to be able to cut the rotational latency down to zero, and also the seek time down to zero. So what you can do is that you can actually wait for, you know, 343 00:46:58.830 --> 00:47:06.930 William Cheng: the trick here is that you're going to use write-back. You're going to wait for a lot of data that needs to be written out to the disk, and then when you write to the disk, you perform one long write. 344 00:47:07.230 --> 00:47:12.900 William Cheng: Okay, then this write basically never stops. It will take a very long time to write all that data into the same cylinder group. 345 00:47:13.230 --> 00:47:23.790 William Cheng: Right. So in that case, you will minimize the seek time and minimize the rotational latency, right? So this way, your transfer capacity for writes can be nearly 100% of the disk transfer capacity. 346 00:47:24.540 --> 00:47:33.120 William Cheng: Okay, so since that's a very, very long write, right, what if your file system crashes? What if you lose power? Okay. So in that case, since it's a very, very long write, we have to worry about it. 347 00:47:33.750 --> 00:47:45.630 William Cheng: Okay, so the way that the Sprite file system does it is that you're going to use write-back: modify the buffer cache all you want, and then when you gather enough data, perform one long write by appending all the data 348 00:47:46.770 --> 00:47:48.930 William Cheng: to the log. Okay, so in the middle, if you get 349 00:47:49.950 --> 00:47:54.600 William Cheng: if you get a crash, then only half of the data over here went onto the log, and you're going to end up with a problem.
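The earlier claim that one long write can approach 100% of the disk transfer capacity can be checked with rough arithmetic. The sketch below is only an illustration: the seek time, rotational latency, and transfer rate are made-up round numbers, not figures from the lecture, and `write_efficiency` is a hypothetical helper.

```python
def write_efficiency(bytes_written, seek_s, rotation_s, transfer_rate):
    """Fraction of the total write time that is spent actually transferring data."""
    transfer_s = bytes_written / transfer_rate
    return transfer_s / (seek_s + rotation_s + transfer_s)

# Assumed (hypothetical) disk parameters: 8 ms seek, 4 ms rotational latency, 100 MB/s.
SEEK_S = 0.008
ROTATION_S = 0.004
TRANSFER_RATE = 100e6  # bytes per second

# One 4 KB block per seek versus one long 64 MB write after a single seek.
small = write_efficiency(4096, SEEK_S, ROTATION_S, TRANSFER_RATE)
large = write_efficiency(64 * 1024 * 1024, SEEK_S, ROTATION_S, TRANSFER_RATE)

print(f"4 KB per seek:  {small:.1%} of transfer capacity")
print(f"64 MB per seek: {large:.1%} of transfer capacity")
```

Under these assumptions the small write spends almost all of its time positioning the head, while the long write pays the positioning cost once and spends nearly all of its time transferring data.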
350 00:47:54.960 --> 00:47:59.610 William Cheng: Okay, so the solution is to borrow an idea from the computer graphics people. 351 00:48:00.060 --> 00:48:07.500 William Cheng: So those of you who have taken a computer graphics class know that when they try to paint something onto the screen, they use something called double buffering. 352 00:48:08.310 --> 00:48:11.790 William Cheng: The idea of double buffering over here is that this is your screen, right? 353 00:48:12.000 --> 00:48:15.420 William Cheng: Now we have two buffers. One of them over here is going to get displayed on the screen. 354 00:48:15.600 --> 00:48:21.930 William Cheng: What we're going to do is paint the next scene by using the second buffer. So we fill the second buffer with data over here. 355 00:48:22.140 --> 00:48:27.030 William Cheng: As soon as we finish filling the second buffer over here, we're going to switch the pointer to point to the second 356 00:48:27.900 --> 00:48:35.850 William Cheng: buffer, and have the data come out of the second buffer. Okay, so this way, when we look at the animation, we're not going to see any flickering. 357 00:48:36.420 --> 00:48:44.340 William Cheng: Okay, so this is called double buffering. And then as soon as we switch to the new one, we can use the original buffer over here to paint the new scene. 358 00:48:44.640 --> 00:48:49.560 William Cheng: When we're done over here, again we switch the buffers. So this way we're not going to get any flicker. 359 00:48:50.100 --> 00:48:58.980 William Cheng: Okay, so we can do the same thing with the checkpoint file. We can actually have two checkpoint files over here, checkpoint file A and checkpoint file B. Inside the 360 00:48:59.730 --> 00:49:09.210 William Cheng: superblock over here, we're going to remember that the current file system is now in checkpoint file A.
Right, so the superblock over here says that right now we're using checkpoint file A. And again, checkpoint files A and B are not part of the log. 361 00:49:09.750 --> 00:49:13.290 William Cheng: Okay, so maybe they're at the beginning part of the disk over here. 362 00:49:13.710 --> 00:49:19.500 William Cheng: So this guy, checkpoint file A over here, will point to the current version of the file system. So now let's say the end of the log is right here. 363 00:49:19.980 --> 00:49:32.430 William Cheng: Okay, we wait for a long time, wait for enough data that needs to be written out to the disk, and then right before we're ready to write to the disk, what we're going to do is make a copy of checkpoint file A over here into checkpoint file B over here. Okay. And then what we do is that 364 00:49:33.960 --> 00:49:39.510 William Cheng: we're going to perform one long write to write all the data over here to the log, and that can actually take us, you know, 365 00:49:40.140 --> 00:49:48.690 William Cheng: easily several seconds, because you need to write maybe multiple cylinders of data to the disk. Okay. So the problem is that if you get a crash at this point, what will happen? 366 00:49:49.140 --> 00:50:00.360 William Cheng: Okay, if you get a crash over here, next time when we reboot the file system, we go to the superblock, and the superblock says your checkpoint file is checkpoint file A. So when you go to checkpoint file A over here, all you see is the data up to this point. 367 00:50:00.930 --> 00:50:09.240 William Cheng: Okay, all the data that you have just written over here will be completely lost. But the good news is that you'll be in a consistent state, the state of the file system before you did all these writes. 368 00:50:10.740 --> 00:50:14.640 William Cheng: Okay, so even when you get a crash, you're going to lose the last few seconds of your work, but at least you'll 369 00:50:15.180 --> 00:50:22.920 William Cheng: be consistent.
Now, if it turns out you don't get a crash when you perform this long write, right, multiple cylinders onto the disk over here, 370 00:50:23.100 --> 00:50:27.690 William Cheng: in the end you finish writing all the inode map pieces over here, appending them at the end of the log. 371 00:50:27.900 --> 00:50:32.490 William Cheng: And then at that point you're going to modify the checkpoint file to point to all the new data over here 372 00:50:32.700 --> 00:50:38.100 William Cheng: at the end of the log. And then you're going to go to the superblock to say that now I'm using checkpoint file B. 373 00:50:38.310 --> 00:50:45.690 William Cheng: And from this point on, if you get a crash, next time when you reboot the file system, you will know that the end of the file system is right here. And this way, again, all the data 374 00:50:45.990 --> 00:50:51.600 William Cheng: that you have written out to the disk is going to make the file system consistent again. And now the file system will be consistent. 375 00:50:52.170 --> 00:50:59.130 William Cheng: Okay, so this way, if you write to your disk like this, and if you use two checkpoint files, borrowing the idea of double buffering from computer graphics, 376 00:50:59.340 --> 00:51:06.150 William Cheng: no matter when you crash, the worst thing that can happen is that you're going to lose the last few seconds of your work, and your file system will always be in a consistent state. 377 00:51:07.170 --> 00:51:07.560 William Cheng: Then 378 00:51:08.850 --> 00:51:16.530 William Cheng: All right. So in summary, please remember that for the log-structured file system, the purpose of the log-structured file system is just to demonstrate 379 00:51:16.770 --> 00:51:23.190 William Cheng: that we can achieve nearly 100% of the disk transfer capacity when we're performing writes to the disk.
Okay, that's the main reason. 380 00:51:23.940 --> 00:51:30.180 William Cheng: The main reason is not to provide crash resiliency. As it turns out, since the write is going to be very, very long, it's going to take a very long time, 381 00:51:30.390 --> 00:51:36.090 William Cheng: they have to address the crash problem, and that's how they end up with the two-checkpoint-file solution. 382 00:51:36.570 --> 00:51:48.900 William Cheng: Yeah, so that is the log-structured file system: it has good performance for writes, and that's the main purpose. It also shows you how you can recover from crashes using checkpoint files. So this way, no matter when you crash, in the worst case you're going to 383 00:51:49.380 --> 00:51:57.930 William Cheng: lose a few seconds of your work. Now, the main disadvantage over here is that it wastes a lot of space. And also, once you finish writing to the 384 00:51:59.220 --> 00:52:08.910 William Cheng: once you finish writing the entire log, the entire disk becomes a read-only disk. Okay. So clearly this is not a practical file system. So why do we talk about it? 385 00:52:09.360 --> 00:52:19.650 William Cheng: Okay, because as it turns out, later on people actually used these ideas to build file systems. So later on we'll talk about what kind of file systems use these ideas. Okay. 386 00:52:20.370 --> 00:52:25.740 William Cheng: All right, this is a good breaking point. So I will see you guys in a little bit. Okay.
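The two-checkpoint-file scheme summarized above can be sketched as a small simulation. This is a rough illustration under stated assumptions, not the Sprite implementation: the disk is modeled in memory, a checkpoint is a dict holding the end of the log and the addresses of the inode map pieces, and `Disk`, `flush`, `recover`, and the `crash_after` parameter are all hypothetical names.

```python
import copy

class Crash(Exception):
    """Simulated power loss in the middle of the long write."""

class Disk:
    def __init__(self):
        self.log = []  # append-only log
        # Two checkpoint files kept outside the log (assumed layout).
        self.checkpoints = {"A": {"log_end": 0, "imap_pieces": {}},
                            "B": {"log_end": 0, "imap_pieces": {}}}
        self.superblock = "A"  # names the checkpoint file currently in use

def flush(disk, blocks, imap_pieces, crash_after=None):
    """Perform one long write, then switch checkpoint files, as in double buffering."""
    current = disk.superblock
    spare = "B" if current == "A" else "A"
    # Right before writing, copy the current checkpoint file into the spare one.
    disk.checkpoints[spare] = copy.deepcopy(disk.checkpoints[current])
    # Anything past the checkpointed end is debris from an interrupted write.
    del disk.log[disk.checkpoints[current]["log_end"]:]
    new_addrs = {}
    writes = [(None, b) for b in blocks] + list(imap_pieces.items())
    for n, (piece_id, payload) in enumerate(writes):
        if n == crash_after:
            raise Crash("power lost in the middle of the long write")
        disk.log.append(payload)
        if piece_id is not None:
            new_addrs[piece_id] = len(disk.log) - 1
    # Only after the whole write is done: update the spare, then flip the superblock.
    disk.checkpoints[spare]["imap_pieces"].update(new_addrs)
    disk.checkpoints[spare]["log_end"] = len(disk.log)
    disk.superblock = spare

def recover(disk):
    """After a reboot, trust only what the superblock's checkpoint file describes."""
    cp = disk.checkpoints[disk.superblock]
    return disk.log[:cp["log_end"]], cp

disk = Disk()
flush(disk, ["a0", "a1", "inode A"], {0: "imap piece 0 v1"})
try:
    flush(disk, ["a1-bigger", "inode A v2"], {0: "imap piece 0 v2"}, crash_after=2)
except Crash:
    pass
valid, cp = recover(disk)
print(disk.superblock, valid, cp["imap_pieces"])
```

In this sketch the interrupted second write is simply ignored on recovery: the superblock still names the checkpoint file from the first write, so the file system is seen in its earlier consistent state and only the most recent work is lost.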