WEBVTT 1 00:00:03.570 --> 00:00:06.390 William Cheng: Welcome to lecture 24. 2 00:00:09.059 --> 00:00:15.599 William Cheng: So, kernel 2 is due this Friday. If you have code from a previous semester, don't look at it and don't copy it. It's best to get rid of it. 3 00:00:16.170 --> 00:00:22.200 William Cheng: Grading guidelines: you need to pass all the tests in vfstest.c [unclear]. 4 00:00:22.680 --> 00:00:32.400 William Cheng: Just like kernel 1, all the printouts must be correct. You also need to use the right README file and deal with all the question marks. After the submission, 5 00:00:33.150 --> 00:00:40.620 William Cheng: verify your final submission, and again go through the grading guidelines line by line to make sure you don't lose any silly points. Yeah. 6 00:00:44.280 --> 00:00:51.840 William Cheng: Right, so I think some people have started to work on kernel 3 already, which is great. So the recommended timeline for kernel 3 7 00:00:52.470 --> 00:01:00.540 William Cheng: is this. Right now, technically, we're still in kernel 2, so kernel 3 will start after this Friday. So the idea here is that 8 00:01:01.230 --> 00:01:06.330 William Cheng: the first thing you should do is finish pframe.c. There are three functions you have to write in there. 9 00:01:06.750 --> 00:01:12.000 William Cheng: But what's more important here is that you set VM equal to 0 and S5FS equal to 1. 10 00:01:12.450 --> 00:01:20.160 William Cheng: Okay, VM equal to 0 means that you are not running user space code. As soon as you set VM equal to 1, you're supposed to be running user space code. 11 00:01:20.580 --> 00:01:24.780 William Cheng: So right now, you're not ready for that yet. You've got to make sure that you have solid page frame code. 
12 00:01:25.290 --> 00:01:35.220 William Cheng: S5FS equal to 1 means that, instead of using ramfs as in kernel 2, we're going to switch to using the System 5 file system. This way you can get rid of all your kernel 2 bugs. Okay, so 13 00:01:35.760 --> 00:01:42.960 William Cheng: the idea here is that by the end of this weekend, you need to make sure those three functions in pframe.c work and that you pass vfstest. 14 00:01:43.530 --> 00:01:54.630 William Cheng: Okay, so passing vfstest means that your code is solid: anytime you need to go get a page from the disk, you're able to do that. It's very important, because if you cannot reliably go to the disk and get a page frame, 15 00:01:55.440 --> 00:02:04.980 William Cheng: there's really no point in running a user space program. Okay, so any kind of error that you have to deal with in vfstest, you need to deal with it with S5FS equal to 1. Yeah. 16 00:02:05.250 --> 00:02:12.990 William Cheng: All right, after you finish that, next week you should get the hello program to run. The hello program is actually very difficult to get running; you're probably going to spend one week to get it to work. 17 00:02:13.530 --> 00:02:23.250 William Cheng: So next week, set VM equal to 1, and by the end of next weekend, get the address space working and run hello directly from the init process using kernel_execve. Okay. 18 00:02:24.420 --> 00:02:29.280 William Cheng: So in the discussion section, I actually mentioned this before. The basic idea here is that in your kernel, 19 00:02:29.670 --> 00:02:39.000 William Cheng: the process that's running is the init process inside the kernel. So the idea here is that that process needs to go into user space to execute one program. Which program does it execute? 20 00:02:39.210 --> 00:02:47.670 William Cheng: It needs to execute the hello program. So when you go into the user space program, who is executing the code of the hello world program? 
It is going to be the init process, right? The init process, the 21 00:02:47.940 --> 00:02:55.590 William Cheng: process with process ID number 1. I mean, it doesn't really matter what it is. Your init process goes into user space and runs the hello program. The hello program is only three lines long. 22 00:02:56.430 --> 00:03:07.830 William Cheng: It calls open, then it calls write to write the hello world message to the screen, and then it returns and calls exit. You come down into the kernel, you call do_exit, and you kill it off. 23 00:03:08.670 --> 00:03:17.580 William Cheng: And since that's process ID number 1, once it goes through do_exit, the init process is dead. That will wake up the kernel process, and the kernel process will turn off the machine. 24 00:03:18.810 --> 00:03:25.410 William Cheng: All right, so this is not easy, and it's a lot of work. So again, if you can finish that in one week, that will be pretty good. Yeah. 25 00:03:25.800 --> 00:03:36.720 William Cheng: And again, go back to my last discussion section, where I told you which functions you have to implement. My function list might not be complete. There might be something I forgot. If I forgot something, let me know. 26 00:03:36.990 --> 00:03:39.930 William Cheng: Okay, so next time I can fix up the slide. Yeah. 27 00:03:40.440 --> 00:03:53.460 William Cheng: And then the week after will be the second week of kernel 3. You need to get all the basic user space programs to run directly, using exactly the same method. So again, the kernel init process goes into user space, 28 00:03:53.880 --> 00:03:59.280 William Cheng: runs one program, and then you come back down into the kernel and you can turn off the machine. 29 00:03:59.850 --> 00:04:03.690 William Cheng: Okay, so for these programs, you need to look at the grading guidelines. 
30 00:04:03.930 --> 00:04:12.030 William Cheng: In section B of the grading guidelines, there's a bunch of basic tests you have to run. The first one is hello, followed by a bunch of others, and you need to get all of them to run. 31 00:04:12.930 --> 00:04:17.010 William Cheng: Towards the end there is one, I guess the last one, called fork-and-wait. 32 00:04:17.580 --> 00:04:23.040 William Cheng: That's the first program that you saw in this class, right? The parent forks off a child and waits for it. 33 00:04:23.460 --> 00:04:31.080 William Cheng: It's a little different, because I want to make sure that you get copy-on-write done correctly. So, therefore, that will be the program that you have to run. 34 00:04:31.770 --> 00:04:41.850 William Cheng: So maybe you should try to get it done by Friday of the second week. Again, that will be week 14, on Friday, to finish, and then during the weekend you should get /sbin/init to work. 35 00:04:42.240 --> 00:04:50.910 William Cheng: With /sbin/init, what will happen is that the kernel init process is going to go into user space and become the user space init process. 36 00:04:51.390 --> 00:04:53.400 William Cheng: The user space init process is called /sbin/init. 37 00:04:54.120 --> 00:05:01.590 William Cheng: Okay, so the code there is actually pretty simple. All it does is fork off three child processes, one for each of the TTY devices. 38 00:05:01.950 --> 00:05:07.230 William Cheng: So if you only have one TTY device, you only have one child. If you have three TTY devices, it will fork off 39 00:05:08.010 --> 00:05:10.980 William Cheng: three child processes, and each one of them will run 40 00:05:11.490 --> 00:05:22.200 William Cheng: /bin/sh, right? So /bin/sh is the user space shell program. 
And again, it looks just like the kernel shell: it gives you a prompt and you can type Unix commands. 41 00:05:22.860 --> 00:05:32.460 William Cheng: So again, all that code has been written for you. So by the end of the weekend of week two, you should get /sbin/init to run, and you 42 00:05:32.970 --> 00:05:39.450 William Cheng: should be able to run the user space shell. Once you have the user space shell, you should run all the programs in section 43 00:05:39.960 --> 00:05:46.020 William Cheng: B of the grading guidelines by typing those commands manually, and you should be able to pass all those tests. 44 00:05:46.620 --> 00:05:54.540 William Cheng: And then in week three of kernel 3, you need to pass all the remaining tests in the grading guidelines. So again, those are the fork bomb, 45 00:05:54.810 --> 00:06:02.730 William Cheng: the memory test, eatmem, the stress test, and you also have to get vfstest to run as a user space program. 46 00:06:03.300 --> 00:06:09.300 William Cheng: Okay, so that's the last week. Again, there might be a lot of bugs to fix, and there might be some additional new functions you have to implement. 47 00:06:09.930 --> 00:06:19.110 William Cheng: So again, this is a very, very tight schedule, and I strongly urge you to follow my recommendation. This way you can make sure that you turn it in on time. 48 00:06:19.710 --> 00:06:25.050 William Cheng: Okay, so every semester, lots of people get it finished even before the end of the second week. 49 00:06:26.040 --> 00:06:33.810 William Cheng: So this is totally doable. If you have a team, you need to coordinate, you need to plan things out, and you need to help each other out. 50 00:06:34.560 --> 00:06:39.960 William Cheng: All right, and anytime you have any kind of problem, don't wait too long. 
Some people 51 00:06:40.680 --> 00:06:50.160 William Cheng: get stuck for two days. That's too long. If you're stuck for one day or one evening, that's already too long. Send a message to the class Google group, and please remember, 52 00:06:50.730 --> 00:06:57.450 William Cheng: if you post to the class Google group and you mention me, I don't have to respond to you, because you posted to the class Google group. 53 00:06:58.110 --> 00:07:01.620 William Cheng: Okay, I only have to respond to you if you send me a private email. 54 00:07:02.310 --> 00:07:08.400 William Cheng: All right, so if you're too confused by people saying different kinds of things in the class Google group, maybe you should consider sending me a private email. 55 00:07:09.150 --> 00:07:14.700 William Cheng: A private email is an email that's sent only to me and nobody else. Then I guarantee that I will 56 00:07:14.970 --> 00:07:22.860 William Cheng: reply to your email within 24 hours. Of course, some people don't want to wait 24 hours. Then you should ask your classmates, because sometimes you ask me a question and I say I can't 57 00:07:23.580 --> 00:07:26.220 William Cheng: tell you what code to write. So in the end, you have to do the same thing. 58 00:07:27.060 --> 00:07:37.470 William Cheng: All right, so again, think about what kind of question you want to ask. Anything that's a little bit conceptual, I'll be very happy to answer. If you ask me what code you should write, I won't be able to help you. 59 00:07:38.340 --> 00:07:45.900 William Cheng: All right. It's also important that you read the kernel 3 FAQ. Before you do all of this, scan the kernel 3 FAQ to understand what's going on, 60 00:07:46.110 --> 00:07:54.690 William Cheng: even though the entries might not make any sense yet. Hopefully, later on, they will make more sense. Yeah. 
And also you need to understand pretty much all the lecture material covered so far in class. 61 00:07:55.470 --> 00:08:04.290 William Cheng: Okay, maybe not every lecture, but lots of the stuff that we covered is getting you ready to do the final kernel assignment. So again, you need to put all these things together. 62 00:08:04.920 --> 00:08:09.870 William Cheng: Remind yourself what has been covered in class, and hopefully it will make sense. Yeah. 63 00:08:12.180 --> 00:08:22.680 William Cheng: All right. Another important reminder here is that if you use this particular code inside your switch function for kernel 1 and 2, that's perfectly fine. But this code will not work for kernel 3, 64 00:08:23.070 --> 00:08:28.380 William Cheng: because for kernel 3, when you go into user space for the first time, interrupts will be disabled. 65 00:08:29.040 --> 00:08:38.550 William Cheng: Okay, so in kernel 1 and 2, interrupts are never disabled like that. But when you go into user space, what it will do is disable interrupts. 66 00:08:39.390 --> 00:08:48.480 William Cheng: Okay. And then when you go into the switch function with interrupts disabled, what's going to happen is that if you want to wait for the disk, the disk interrupt will never happen. 67 00:08:49.110 --> 00:08:56.790 William Cheng: Okay, so therefore, what you're required to do is that, right here, you have to halt the CPU. As I went over before, you have to halt the CPU here, and 68 00:08:57.450 --> 00:09:09.270 William Cheng: the way to halt the CPU is that you enable interrupts and halt the CPU. Again, look through the kernel source and find out how to do that. Yeah. And then, as I mentioned before, if you just do this. 
There's a race condition in there. 69 00:09:10.500 --> 00:09:16.530 William Cheng: All right, so once you add this, you'll be able to go into user space, but still, once in a while when you run your program, you know, 70 00:09:16.890 --> 00:09:23.130 William Cheng: so what's important over here is that you enable interrupts. Once interrupts are enabled, if you go into the user space program, 71 00:09:23.460 --> 00:09:30.120 William Cheng: then you will get interrupts, stuff like that. So you do that, but again, once in a while when you're running tests, all of a sudden everything is going to be frozen. 72 00:09:30.480 --> 00:09:39.180 William Cheng: Okay. And if you press any key and everything gets going again, then you know that it's because you have a race condition in this loop right here, and you have to fix that race condition. 73 00:09:40.500 --> 00:09:49.500 William Cheng: All right, so the reason there's a race condition is that the code doesn't wait for an asynchronous event correctly. So what is the asynchronous event here? 74 00:09:49.950 --> 00:09:56.460 William Cheng: It's the interrupt, right? Because the interrupt is generated by hardware, and the hardware is totally asynchronous with respect to your CPU. 75 00:09:57.090 --> 00:10:04.770 William Cheng: Okay, so if you want to wait for an interrupt, you have to wait the right way. You need to disable interrupts, and if the event hasn't occurred, you need to 76 00:10:04.980 --> 00:10:16.920 William Cheng: enable interrupts and wait in one atomic operation. Some of that is done here already, but there's something missing. Okay, so feel free to discuss it in class. I won't be able to tell you what code to write, yeah. 77 00:10:18.390 --> 00:10:22.170 William Cheng: All right, so last time we finished chapter 7. 
You have everything you need to finish 78 00:10:22.920 --> 00:10:28.110 William Cheng: kernel 3. So now we're going to go back. Again, if you look at the lecture web page, 79 00:10:28.620 --> 00:10:36.780 William Cheng: for chapter 6 we finished the first part, and for the rest of it we said we were going to talk about it after chapter 7. So now that we've finished chapter 7, 80 00:10:37.200 --> 00:10:40.320 William Cheng: we're going to talk about that. We're going to go back to chapter 6, 81 00:10:40.650 --> 00:10:52.170 William Cheng: and we're going to first take a look at the performance problem. We talked about the introduction of the System 5 file system. Then we talked about the disk architecture, and now I'm going to talk about what the problem is with the System 5 file system. 82 00:10:53.370 --> 00:10:59.910 William Cheng: So, a quick review of the disk architecture: the smallest addressable unit on the disk is called a sector. 83 00:11:00.180 --> 00:11:06.000 William Cheng: In order for you to address a sector, the disk address is a little convoluted. You need to specify 84 00:11:06.600 --> 00:11:10.890 William Cheng: which head, or which surface, because typically the disk has multiple surfaces and multiple heads. 85 00:11:11.250 --> 00:11:22.620 William Cheng: And then you need to specify which track number you want to use. We call it a cylinder because the disk heads all move at the same time. So once you decide to sit at a particular 86 00:11:23.220 --> 00:11:30.900 William Cheng: track, you actually have access to the seven tracks on the other surfaces as well. So you have all these eight 87 00:11:31.410 --> 00:11:39.720 William Cheng: tracks. 
Together they are known as a cylinder. So you can specify a cylinder number or a track number; they are synonymous. And finally, within a 88 00:11:40.200 --> 00:11:47.670 William Cheng: cylinder or track, as we mentioned before, there are on average 750 different sectors, so you also need to give a sector number. 89 00:11:49.080 --> 00:11:54.900 William Cheng: And also, disk access is slow, because in order for you to get to that sector, you have to perform a seek. 90 00:11:55.200 --> 00:12:00.840 William Cheng: Performing a seek means moving the disk head to the right place. 91 00:12:01.080 --> 00:12:07.350 William Cheng: As I mentioned before, you have to speed up, shoot yourself across the disk, step on the brake, and then you have to turn off the motor, 92 00:12:07.500 --> 00:12:14.010 William Cheng: turn on the second motor, jump around until you eventually find out where you are, and make sure you go to the right 93 00:12:14.220 --> 00:12:19.890 William Cheng: track. That's called the seek time. So seek time is kind of random. You can't really predict it. 94 00:12:20.370 --> 00:12:27.210 William Cheng: The second component over here is called rotational latency. You need to wait for the disk to rotate so that the sector 95 00:12:27.390 --> 00:12:33.600 William Cheng: appears under the disk head. The disk head can only move radially, in and out; it cannot move in a sort of 96 00:12:34.170 --> 00:12:39.750 William Cheng: circular motion. Therefore, you have to wait for the disk to spin until your sector gets right underneath the disk head. 97 00:12:40.170 --> 00:12:48.600 William Cheng: So that one is also random. You can't really predict when that's going to be. 
It depends on exactly where the disk head is when you seek to that particular cylinder; that's how long you have to wait. 98 00:12:49.050 --> 00:12:57.480 William Cheng: Finally, the data transfer time is the time to transfer a sector of data from the sector on the disk to the buffer inside the disk controller. 99 00:12:57.900 --> 00:13:02.970 William Cheng: Okay. So the last part, the data transfer time, is typically considered fixed. 100 00:13:03.450 --> 00:13:07.530 William Cheng: Okay, so depending on whether you're on an outer cylinder or an inner cylinder, 101 00:13:07.740 --> 00:13:16.620 William Cheng: there might be a little bit of variation. But again, of these three components, the first and second are very difficult to predict, and the last one is typically constant. Yeah. 102 00:13:17.280 --> 00:13:23.700 William Cheng: So when you go buy a disk from, you know, Amazon or Fry's, it will list all these kinds of parameters. 103 00:13:24.450 --> 00:13:31.770 William Cheng: From these parameters you can calculate the capacity of the disk. You can also calculate what the maximum transfer time is for the disk, what the maximum 104 00:13:32.250 --> 00:13:40.710 William Cheng: throughput of the disk is. Well, if you assume that the seek time is equal to zero and the rotational latency is equal to zero, sorry, that will give you the minimum transfer time, 105 00:13:41.220 --> 00:13:50.250 William Cheng: right, because if all the other components are zero, that's the best you can do. So in that case it's the minimum transfer time, and one over the minimum transfer time is going to be the maximum throughput of the disk. 106 00:13:51.330 --> 00:14:02.460 William Cheng: So whenever you go buy a disk, you know, turn it over. 
Look at all these numbers. It will tell you the maximum transfer speed, and that one is measured assuming that the seek time is equal to zero and the rotational latency is equal to zero. 107 00:14:03.120 --> 00:14:10.500 William Cheng: Okay. So typically, that number is unachievable, because if you access a lot of stuff, you have to seek and you end up with rotational latency. Yeah. 108 00:14:12.120 --> 00:14:21.300 William Cheng: All right, so for the Rhinopias hard drive, if you calculate the maximum transfer speed, it's going to be close to 64 megabytes per second. The capital B stands for bytes. 109 00:14:21.570 --> 00:14:30.750 William Cheng: So the Rhinopias' maximum transfer speed is 64 megabytes per second. One piece of terminology that we'll use is that this is also called the capacity of the disk. 110 00:14:31.050 --> 00:14:40.440 William Cheng: Okay, so when we talk about disk capacity, there are two things. One is the storage capacity. The other one is the transfer capacity. Transfer capacity is the maximum number of bytes per second, 111 00:14:41.100 --> 00:14:50.700 William Cheng: or megabytes per second, that it can transfer. So for today's class, when I talk about disk capacity, I'm not talking about the size of the disk. I'm talking about the transfer speed of the disk. 112 00:14:51.120 --> 00:15:05.700 William Cheng: Okay, so the disk capacity over here is 64 megabytes per second. If you put the System 5 file system on the Rhinopias hard drive and you measure the actual performance of the file system, you will get about 102.4 kilobytes per second. 
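The two figures just quoted can be divided directly to see how little of the disk's transfer capacity the file system achieves. The following is a minimal sketch, assuming decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes); the constant and function names are made up for illustration and are not from the lecture.

```python
# Assumption: decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes).
MAX_TRANSFER_BYTES_PER_SEC = 64 * 1_000_000   # quoted transfer capacity: 64 MB/s
MEASURED_BYTES_PER_SEC = 102.4 * 1_000        # quoted measured throughput: 102.4 KB/s


def utilization_percent(measured, maximum):
    """Return measured throughput as a percentage of the maximum."""
    return 100.0 * measured / maximum


percent = utilization_percent(MEASURED_BYTES_PER_SEC, MAX_TRANSFER_BYTES_PER_SEC)
print(f"{percent:.2f}% of the maximum transfer speed")
```

With binary units instead (1 KB = 1,024 bytes, 1 MB = 1,048,576 bytes) the ratio comes out at about 0.156%, which rounds to the same figure.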
113 00:15:06.180 --> 00:15:13.380 William Cheng: Okay, so if you divide this number by the maximum transfer capacity, the number you're going to get is 0.16% of the maximum. 114 00:15:14.160 --> 00:15:26.280 William Cheng: Okay, not 16% of the maximum; it's 0.16% of the maximum. The Rhinopias hard drive, probably made by IBM, is a huge disk. It's very high performance and very expensive. 115 00:15:26.790 --> 00:15:31.380 William Cheng: Okay, as it turns out, if you put the System 5 file system on it, your performance is horrible: less than 1%. 116 00:15:32.610 --> 00:15:39.540 William Cheng: Okay, it's less than even 0.2%. So, therefore, how do you actually improve the performance of your file system? 117 00:15:39.870 --> 00:15:46.860 William Cheng: As it turns out, the hardware doesn't really matter that much. If you put your file system on the disk in a better way, 118 00:15:47.430 --> 00:15:57.000 William Cheng: your performance is going to be much better. So today we're going to see how to speed up the performance of your file system, or you can think of it as how to speed up the performance of your storage system. 119 00:15:57.990 --> 00:16:02.730 William Cheng: Okay, so we're going to see some of the tricks that we have to use for improving performance. 120 00:16:03.330 --> 00:16:07.530 William Cheng: You can do this in hardware. One thing you can do is that, when you transfer data from the disk, 121 00:16:07.980 --> 00:16:16.980 William Cheng: instead of transferring one sector of data, why don't you transfer multiple sectors of data? This way, whenever you try to go to the disk, maybe the data is already inside the disk controller. 122 00:16:17.790 --> 00:16:26.160 William Cheng: Okay. 
So in this case, all you have to do is transfer the data from the disk controller into memory. You don't have to wait for a seek, for rotational latency, or for data transfer time. 123 00:16:26.460 --> 00:16:31.410 William Cheng: All you have to do is DMA the data from the controller's buffer into memory. 124 00:16:32.130 --> 00:16:39.270 William Cheng: Okay, so that would certainly work. As it turns out, it doesn't really help that much, because if you transfer a bunch of sectors of data from the disk into memory, 125 00:16:39.570 --> 00:16:48.060 William Cheng: it turns out that most likely the sectors that you get are not very useful, because of the way that the file system is organized. 126 00:16:48.630 --> 00:16:55.620 William Cheng: Remember that the file system is organized into disk blocks. So maybe one block is right here and the next block is sitting at a different place on the disk. 127 00:16:56.490 --> 00:17:03.150 William Cheng: Okay. So remember, in the System 5 file system, the way we allocate a new disk block is that we go to the free list and just grab the next one. 128 00:17:03.360 --> 00:17:12.750 William Cheng: Is the next one going to be sitting right next to the original one? No, the next disk block may be very, very far away. So, therefore, if you go to a disk block and just read ahead, you're going to read the wrong data. 129 00:17:13.380 --> 00:17:17.760 William Cheng: Okay, so even though this might help a little bit, typically it doesn't help very much. 130 00:17:18.090 --> 00:17:25.140 William Cheng: Okay. So in the end, as it turns out, it's the software design that makes the most difference. Remember, when we talk about the file system, 131 00:17:25.350 --> 00:17:33.780 William Cheng: the file system is about how to lay out data on the disk. 
So if you lay out data on the disk in a clever way, you can actually improve the performance by quite a bit. Right. 132 00:17:34.530 --> 00:17:44.190 William Cheng: All right, so we're going to see how this can be done. The next evolution of the System 5 file system is known as the Fast File System. 133 00:17:44.580 --> 00:17:52.890 William Cheng: The Fast File System is a Unix file system. It has a better on-disk organization, and we're going to see what it is. They organize the data in a somewhat different way. 134 00:17:53.580 --> 00:17:58.620 William Cheng: They also support some other features that we'll talk about a little later, for example, longer component names in directories. 135 00:17:59.370 --> 00:18:06.420 William Cheng: Most of you saw the file system code, right? So for the directory entries over here, every directory entry is how many, 136 00:18:07.110 --> 00:18:07.860 William Cheng: how many bytes is that? 137 00:18:08.280 --> 00:18:12.270 William Cheng: Right. So remember, it's 32 bytes long. There are 28 bytes for the component name 138 00:18:12.450 --> 00:18:23.940 William Cheng: and four bytes for the inode number. Of the 28 bytes for the component name, one has to be reserved for backslash zero, because what they store is a string. So in that case, the component name can only be 27 characters long. 139 00:18:24.570 --> 00:18:28.800 William Cheng: Okay, it's 27 characters long. Is that actually long enough? 140 00:18:29.190 --> 00:18:38.550 William Cheng: Okay, it's long enough for most people, but some people, when they create a file name, want it to be as long as possible. So in that case, sometimes it's not long enough; it is desirable to allow a file name that's as long as possible. 141 00:18:39.000 --> 00:18:45.300 William Cheng: Okay, so the question is, what do you have to do? 
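The fixed-size directory entry described above (32 bytes: a 28-byte name field that must include the terminating zero byte, plus a 4-byte inode number) can be sketched with Python's `struct` module. This is only an illustration: the field order, the little-endian byte order, and the helper names `pack_dirent` and `unpack_dirent` are assumptions, not something taken from the lecture or the kernel source.

```python
import struct

DIRENT_SIZE = 32          # total size of one directory entry, in bytes
NAME_FIELD_SIZE = 28      # bytes reserved for the component name, including the NUL

# Assumed layout: 4-byte little-endian inode number, then the 28-byte name field.
DIRENT_FORMAT = "<I28s"


def pack_dirent(inode_number, name):
    """Build one 32-byte directory entry (hypothetical helper)."""
    encoded = name.encode("ascii")
    # One byte of the name field is reserved for the terminating NUL,
    # so at most 27 characters fit.
    if len(encoded) > NAME_FIELD_SIZE - 1:
        raise ValueError("component name longer than 27 characters")
    # struct pads a short "28s" value with NUL bytes up to 28 bytes.
    return struct.pack(DIRENT_FORMAT, inode_number, encoded)


def unpack_dirent(raw):
    """Recover (inode number, name) from a 32-byte entry (hypothetical helper)."""
    inode_number, name_field = struct.unpack(DIRENT_FORMAT, raw)
    name = name_field.split(b"\0", 1)[0].decode("ascii")
    return inode_number, name


entry = pack_dirent(7, "hello.txt")
print(len(entry), unpack_dirent(entry))
```

A 27-character name still fits in the entry, while a 28-character name is rejected because no room would be left for the terminating zero byte.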
Okay, so if you make every directory entry four kilobytes, well, then that's 142 00:18:46.950 --> 00:18:53.220 William Cheng: a little too much. So what should be the right design, so that you can have any component name size? We're going to talk about that a little later, 143 00:18:53.670 --> 00:19:07.110 William Cheng: maybe in the next lecture. Now, the Fast File System also retains the disk map of the System 5 file system. Remember, the disk map is right inside the inode: the last 13 pointers are called the disk map, and that allows you to address up to 16 gigabytes of 144 00:19:08.820 --> 00:19:18.990 William Cheng: data on the disk. So the largest file you can have is 16 gigabytes, and the performance is pretty good. We saw that, with O(1) performance, you can go to any disk block. 145 00:19:19.800 --> 00:19:31.410 William Cheng: Okay, so the System 5 file system's disk map design is actually pretty good. I mean, it's not the best, but again, it's pretty good. So the Fast File System 146 00:19:32.010 --> 00:19:42.720 William Cheng: decided to keep the same disk map. I think they still limited the file size to two gigabytes because nobody really wanted files bigger than two gigabytes. This is the 1980s, 1990s. 147 00:19:44.190 --> 00:19:53.520 William Cheng: All right, so one thing the Fast File System does, compared to the System 5 file system, is make the block size bigger. Remember, the block size is a logical block size; you can 148 00:19:54.150 --> 00:20:00.480 William Cheng: specify the block size to be anything you want. This way, when you go to the disk, you will retrieve one 149 00:20:01.020 --> 00:20:10.950 William Cheng: disk block of data. So this way. 
What you can do is put one disk block of data in 150 00:20:11.700 --> 00:20:13.530 William Cheng: sectors that are right next to each other. 151 00:20:14.130 --> 00:20:20.280 William Cheng: Okay, so instead of having a very, very small block size... For example, the old block size is half a kilobyte. 152 00:20:20.580 --> 00:20:26.040 William Cheng: Okay, so half a kilobyte, maybe that's equal to the sector size on the disk; every one of them is half a kilobyte. 153 00:20:26.340 --> 00:20:32.940 William Cheng: Okay. So in this case, the next disk block might be at a random place on the disk, because of the way the free list is managed. 154 00:20:33.510 --> 00:20:35.130 William Cheng: So in this case, it will take you a long time. 155 00:20:35.760 --> 00:20:47.310 William Cheng: But what if you make your block size four times bigger? So the block size is equal to two kilobytes. That will be the block size, and when you allocate on the disk, you use four consecutive 156 00:20:47.970 --> 00:20:53.220 William Cheng: sector numbers. Okay, so this way, when you try to retrieve this data, 157 00:20:54.060 --> 00:21:02.010 William Cheng: how much do you have to pay? You have to pay for one seek time and one rotational latency, and then you need to pay for four data transfer times. 158 00:21:02.700 --> 00:21:12.450 William Cheng: Okay, if you go back to the previous approach, where these four disk blocks are scattered all over the disk, if you want to retrieve all four disk blocks, that is, to retrieve two kilobytes 
159 00:21:12.780 --> 00:21:18.090 William Cheng: out of your file, you have to actually pay for four seek times, four rotational latencies, and four data transfer times. 160 00:21:18.750 --> 00:21:22.800 William Cheng: Now, so by simply making the disk block bigger, you're actually going to perform better. 161 00:21:23.700 --> 00:21:28.650 William Cheng: All right, so how big should we make this disk block? Should we make it four gigabytes? 162 00:21:29.130 --> 00:21:34.860 William Cheng: Once again, if you make the disk block really, really big, you know, that'll be great, right, because all you have to do is to pay for, 163 00:21:35.130 --> 00:21:40.920 William Cheng: you know, one seek time, one rotational latency, and then you actually retrieve all the data from the disk into memory. So what's going to be the problem? 164 00:21:41.670 --> 00:21:48.750 William Cheng: What if your disk block over here is too big, right? I mean, four gigabytes, that's ridiculous, right. So say you make this one two megabytes over here. 165 00:21:49.320 --> 00:21:58.410 William Cheng: Okay, two megabytes is a really large disk block, right. You perform one seek, one rotational latency, and you're going to retrieve many, many consecutive sectors from the disk, and this will be super fast. 166 00:21:59.010 --> 00:22:06.120 William Cheng: Okay, but what about internal fragmentation, right? If it turns out your file is only one byte long, if it's only one byte long and you allocate a disk block that's 167 00:22:06.420 --> 00:22:10.410 William Cheng: two megabytes, you're going to end up wasting two megabytes minus one byte. 168 00:22:11.310 --> 00:22:18.330 William Cheng: Okay, in the good old days, disks were actually very, very expensive. So if you're wasting your disk blocks, if you're wasting your storage space like this, you know, people are not going to, 169 00:22:19.080 --> 00:22:19.980 William Cheng: they're not going to be happy.
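To put rough numbers on the block-size trade-off described above, here is a minimal Python sketch. The timing constants are assumed values chosen only for illustration, and the function names are hypothetical; only the shape of the comparison (one seek and one rotational latency versus four of each, and the wasted tail of the last block) comes from the lecture.

```python
# Assumed, illustrative timings in milliseconds (not measured values).
SEEK_MS = 4.0        # one seek
ROTATION_MS = 3.0    # one rotational latency
TRANSFER_MS = 0.01   # transfer time for one sector


def scattered_read_ms(n_sectors):
    """Sectors scattered over the disk: pay seek + rotation + transfer each time."""
    return n_sectors * (SEEK_MS + ROTATION_MS + TRANSFER_MS)


def contiguous_read_ms(n_sectors):
    """Sectors laid out consecutively: one seek, one rotation, n transfers."""
    return SEEK_MS + ROTATION_MS + n_sectors * TRANSFER_MS


def internal_fragmentation(file_bytes, block_bytes):
    """Bytes wasted in the last, partially filled block of a file."""
    blocks = -(-file_bytes // block_bytes)  # ceiling division
    return blocks * block_bytes - file_bytes


if __name__ == "__main__":
    print("4 scattered sectors :", scattered_read_ms(4), "ms")
    print("4 contiguous sectors:", contiguous_read_ms(4), "ms")
    # A one-byte file in a two-megabyte block wastes two megabytes minus one byte.
    print("wasted bytes:", internal_fragmentation(1, 2 * 1024 * 1024))
```

With any positive seek and rotation cost, the contiguous layout wins for reads, while the fragmentation function shows why simply making the block huge is not free.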
170 00:22:20.790 --> 00:22:31.410 William Cheng: Right. So it's actually very difficult to figure out exactly what's, you know, the right disk block size, right. If it's too small, then your speed is going to be low. If it's too big, you're wasting too much of your storage capacity. 171 00:22:31.980 --> 00:22:40.590 William Cheng: Okay, so one thing about the fast file system, and I guess this over here sort of illustrates it to you, you know, is that if you have a small, you know, small block size, 172 00:22:41.040 --> 00:22:49.200 William Cheng: then in that case the internal fragmentation is going to be limited to the size of a small block. If you use a large block size, well, in that case, you know, most of your storage could be wasted. 173 00:22:49.500 --> 00:22:59.790 William Cheng: Okay, so if every one of your files looks like this, okay, every file is exactly, you know, one block size plus a little bit, well then 50% of your disk is going to be wasted. 174 00:23:00.870 --> 00:23:07.350 William Cheng: Okay, so that would be horrible, right. You know, if you spend a lot of money and it turns out you only use 50% of the storage capacity, 175 00:23:07.950 --> 00:23:17.250 William Cheng: somebody's going to be very, very upset. Okay, so of course, you know, most files are not like that. If you have a large file, the beginning part over here can use big blocks; only the last part is going to have the internal fragmentation. 176 00:23:17.760 --> 00:23:29.520 William Cheng: So the reality is that it's not that bad. But if this one is two megabytes, and if you have to waste two megabytes of storage space on a disk that's only 128 megabytes, well, in that case, that would be too much. 177 00:23:30.450 --> 00:23:32.670 William Cheng: Okay, so therefore, what's going to be the solution, right? 178 00:23:33.450 --> 00:23:44.730 William Cheng: Now, so the fast file system.
Their solution is to use two block sizes. Okay, at the beginning part of your file you're going to use large blocks, because this way you can improve performance; in the last block over here, we're going to actually use a smaller block. 179 00:23:45.300 --> 00:23:49.500 William Cheng: Okay, so in this case our file system is going to be more complicated, right, because there's no free lunch. 180 00:23:49.680 --> 00:23:56.430 William Cheng: If you want the best of both worlds, yeah, you've got to pay somewhere. So in this case, what we're going to pay is the complexity inside the file system, 181 00:23:56.610 --> 00:24:03.510 William Cheng: because now the file system has to maintain two different block sizes, right. One of them is the big block, used for the beginning of the file. 182 00:24:03.720 --> 00:24:09.690 William Cheng: And then when you get to the last part of the file, you will allocate small blocks over here, just enough so that you can cover the actual data. 183 00:24:10.830 --> 00:24:19.800 William Cheng: Okay, so towards the end of the file, you know, they call these fragments. Okay, so a fragment over here is typically going to be a fraction of the disk block. 184 00:24:20.100 --> 00:24:26.580 William Cheng: Okay, so the large block over here, you're going to divide it into fragments. So in this case, you're going to allocate just enough fragments to cover the actual data. 185 00:24:26.970 --> 00:24:35.700 William Cheng: Okay, if you start growing this file, you're going to start using more and more fragments. When it grows beyond the large block, you're going to allocate another large block over here and start allocating fragments again. 186 00:24:36.270 --> 00:24:42.870 William Cheng: Okay, so this way, you know, you're going to have the best of both worlds. You're going to have, you know, good performance, and also you have a small amount of overhead.
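The large-block-plus-fragments idea above can be sketched in a few lines of Python. This is only an illustration under assumed parameters: the 4096-byte block, the four fragments per block, and the name `ffs_layout` are all hypothetical and are not the fast file system's actual code or defaults.

```python
def ffs_layout(file_bytes, block_bytes=4096, frags_per_block=4):
    """Return (full_blocks, tail_fragments, wasted_bytes) for a file.

    Full-size blocks cover the beginning of the file; the tail is covered by
    just enough fragments. If the tail would need every fragment of a block,
    it is counted as one more full block instead.
    """
    frag_bytes = block_bytes // frags_per_block
    full_blocks, tail = divmod(file_bytes, block_bytes)
    tail_frags = -(-tail // frag_bytes)  # ceiling division
    if tail_frags == frags_per_block:
        full_blocks += 1
        tail_frags = 0
    allocated = full_blocks * block_bytes + tail_frags * frag_bytes
    return full_blocks, tail_frags, allocated - file_bytes


if __name__ == "__main__":
    # One block plus one byte: with fragments, only part of a block is wasted.
    print(ffs_layout(4097, frags_per_block=4))
    # Same file with no fragments (one "fragment" per block): almost a whole
    # block is wasted.
    print(ffs_layout(4097, frags_per_block=1))
```

Comparing the two calls shows the trade described in the lecture: more fragments per block means less wasted space at the end of each file, at the cost of more bookkeeping.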
187 00:24:43.650 --> 00:24:48.120 William Cheng: Right. So again, in this case, what are you paying for, because there's no free lunch, right? 188 00:24:48.330 --> 00:24:52.470 William Cheng: Your file system is going to get a lot more complicated, because now you have to manage two block sizes. 189 00:24:52.710 --> 00:25:01.740 William Cheng: The data structures inside your file system are going to get more complicated, because now you have to remember not just how many data blocks there are, but also how many fragments are in the last one. 190 00:25:02.250 --> 00:25:07.170 William Cheng: Okay, so again, you're going to end up with more pointers inside your file system. So it's going to cost you a little. 191 00:25:08.010 --> 00:25:14.700 William Cheng: All right, so, you know, if you start with a disk block, how many fragments do you divide it into? So again, you 192 00:25:15.210 --> 00:25:25.140 William Cheng: know, there's really no good way, right. So in the fast file system, they allow the system administrator to decide how you want to go. You can divide it into eight fragments, four fragments, two fragments, or just use the 193 00:25:25.680 --> 00:25:35.280 William Cheng: big one if you want to have, you know, more internal fragmentation. Right. So this way the system administrator can actually choose. So when you format the hard drive, you make that decision. What if the decision is bad? 194 00:25:35.550 --> 00:25:38.520 William Cheng: Well, then you have to take all the data out, reformat the drive, and put it back in. 195 00:25:39.090 --> 00:25:48.510 William Cheng: Okay, so again, you know, this is a little more complicated, and also you need to make a decision at the beginning. Right.
So the way you make that decision is to use some kind of a heuristic to tell you what is the best way to split up the 196 00:25:49.110 --> 00:25:52.110 William Cheng: the, you know, the disk block into different fragments. 197 00:25:52.950 --> 00:26:00.450 William Cheng: All right, in the good old days, disks were very expensive. So what people actually tried to do is try to squeeze as much, you know, storage capacity out of them as possible. 198 00:26:00.720 --> 00:26:09.120 William Cheng: So say you have two files, right. One of them is the blue file over here; it goes from one disk block to two disk blocks, and then it takes up two different fragments, while the pink data over here 199 00:26:09.720 --> 00:26:18.900 William Cheng: takes up one disk block plus, you know, four more fragments. You can actually combine the end of the blue file and the end of the pink file into one disk block. 200 00:26:19.260 --> 00:26:24.810 William Cheng: Right, if storage capacity is so important to you that you don't want to waste any storage space, well, then what you would do is pack them together. 201 00:26:25.380 --> 00:26:30.300 William Cheng: So again, this requires a more sophisticated file system design in order for you to accommodate that. 202 00:26:30.540 --> 00:26:35.460 William Cheng: And what if you want to grow this file? If you want to grow the blue file, you can grow by two fragments over here. 203 00:26:35.580 --> 00:26:40.830 William Cheng: When you try to grow more, one day you need to actually split up this disk block into two disk blocks and then manage them separately. 204 00:26:40.980 --> 00:26:47.910 William Cheng: So again, you're going to end up modifying the file system a little more. Every time you modify the file system, you might end up having to write to the disk, so you're going to be slower. 205 00:26:48.570 --> 00:26:55.530 William Cheng: Okay. So again, you can do all these kinds of stuff.
But remember that there's no free lunch, and you have to do all this extra work. Yeah. 206 00:26:57.240 --> 00:27:06.720 William Cheng: All right, so that's one of the things that the fast file system does. The other thing that the fast file system does over here is try to improve the seek time. Okay, over here it says minimizing the seek time. 207 00:27:07.050 --> 00:27:13.290 William Cheng: I mean, we don't like to use the words "optimize the seek time", right. So, you know, there are people who say, let's optimize it, let's run an optimization algorithm. 208 00:27:14.100 --> 00:27:19.590 William Cheng: So typically we don't try to do that; we try to do something, you know, more heuristic. Right. So when I say we, I mean the computer science people. 209 00:27:20.010 --> 00:27:27.450 William Cheng: What happens if you use optimization techniques? You can actually solve an optimization problem. So one problem is that the optimization problem will take you too long to solve. 210 00:27:27.870 --> 00:27:36.180 William Cheng: And also, at any point in time, the optimal solution is always different. So you have to keep changing your policy to move data from one place to the other, and that will also take you too long. 211 00:27:36.780 --> 00:27:45.540 William Cheng: Okay, so computer science people, typically what they will do is try to find a reasonable heuristic, and they will stick to the heuristic, even though they know that it is not optimal. 212 00:27:46.140 --> 00:27:55.110 William Cheng: Okay, so as long as you have a good enough heuristic, it turns out that's typically more beneficial. Okay, you don't want to optimize things all the time. Yeah. 213 00:27:56.070 --> 00:28:05.790 William Cheng: All right, so what people noticed is that, you know, for the Rhinopias hard drive, right, there are these two numbers. One is the average seek time. The average seek time is four milliseconds.
And the one-track seek time 214 00:28:06.150 --> 00:28:13.050 William Cheng: is 0.2 milliseconds. So the one-track seek time is 20 times faster than the average seek time. 215 00:28:13.770 --> 00:28:24.420 William Cheng: Okay, so where's the difference coming from, right? The average seek time involves the main motor. The main motor is not very accurate. The one-track seek time only involves the stepper motor, which can actually step, you know, one track at a time. 216 00:28:24.810 --> 00:28:32.640 William Cheng: So therefore, this time is actually much faster. Also, the average seek time involves using the main motor and then using the stepper motor 217 00:28:33.000 --> 00:28:36.450 William Cheng: to bounce around to make sure that you go to the right place. 218 00:28:37.350 --> 00:28:45.630 William Cheng: Okay, so knowing this, it would be nice if you could have all your data sitting right next to each other. So this way you don't have to turn on the main motor. 219 00:28:46.140 --> 00:28:56.100 William Cheng: Every time you turn on the main motor, you're going to be in trouble, right. So the obvious solution over here is that when you try to retrieve data for a file, you want to make sure that you only turn on the stepper motor. 220 00:28:57.120 --> 00:29:04.290 William Cheng: Okay, so what do you have to do to actually do that? Well, you need to have a different file system organization. Okay. 221 00:29:05.790 --> 00:29:12.480 William Cheng: All right, so what the fast file system does is use something called a cylinder group. Okay, so when you try to create a file, 222 00:29:12.630 --> 00:29:21.090 William Cheng: you want to make sure that all the data, you know, for that file is created right next to each other. Right. So what they do is organize the entire disk into cylinder groups.
So here, 223 00:29:21.690 --> 00:29:26.670 William Cheng: how big is the cylinder group? Well, I mean, since the random seek is 20 times faster than, 224 00:29:27.240 --> 00:29:33.810 William Cheng: I mean, the random seek is 20 times slower than the one-track seek time. So maybe what I will do is group 20 tracks together to form a cylinder group. 225 00:29:34.230 --> 00:29:46.380 William Cheng: Okay, so this way, you know, if I access one of the tracks over here, in order for me to get to all the other data that belongs to the same file, all I have to do is activate the stepper motor and then seek, you know, seek 226 00:29:47.190 --> 00:29:49.950 William Cheng: one track at a time, and that will go to the right place. 227 00:29:50.790 --> 00:29:54.720 William Cheng: Okay. So in this case, you know, what happens is I have to design the file system so that all the, 228 00:29:54.990 --> 00:30:06.120 William Cheng: all the information about a file goes inside one cylinder group, right. So what do we mean by all the information about the file? Remember a file consists of the inode and also the data in the data region, you know, wherever 229 00:30:06.840 --> 00:30:13.980 William Cheng: the disk map points to. Okay, so this way, the inode and the data all need to fall within the same cylinder group. 230 00:30:14.910 --> 00:30:22.740 William Cheng: Okay, so this way, you know, once you start accessing a file, when you access any part of the file, well, it will all be in the same cylinder group. 231 00:30:23.700 --> 00:30:27.510 William Cheng: Yeah. All right, so the organization of the disk is going to be 232 00:30:27.810 --> 00:30:34.620 William Cheng: a little different, right. In S5FS, what is the organization of the disk, right? We have the boot block, followed by the superblock.
233 00:30:34.860 --> 00:30:39.150 William Cheng: And then all the inodes together in the i-list, followed by the data region. 234 00:30:39.330 --> 00:30:48.450 William Cheng: So in that case, when we create a file, we're going to end up using one inode inside the i-list, and then the disk map over here is going to point all over the place inside the data region, like this. 235 00:30:49.020 --> 00:30:59.370 William Cheng: Okay, so this is why accessing data in that file is going to take so long, because every time we try to access any of these data blocks, it's going to cost us one seek, one rotational latency, and one data transfer time. 236 00:31:00.090 --> 00:31:07.530 William Cheng: Okay, if we divide the disk into cylinder groups, right, over here, again, we still have the boot block, followed by the superblock, and then we have the cylinder groups. 237 00:31:08.280 --> 00:31:15.930 William Cheng: Right. So when you start to create a file, the inode is going to come from one cylinder group, and all the data blocks over here need to go into the same cylinder group. 238 00:31:16.080 --> 00:31:22.350 William Cheng: Okay, so therefore the disk map over here at the end of the inode will only point to data within the same cylinder group. 239 00:31:22.500 --> 00:31:29.790 William Cheng: So once you open that file and you start trying to retrieve all the other data for this particular file, you never have to activate the main motor. 240 00:31:30.720 --> 00:31:36.900 William Cheng: Right, all you have to do is activate the stepper motor and then you can actually get all the other data. Okay. If you need to open a different file, again, you know, 241 00:31:37.650 --> 00:31:45.600 William Cheng: this one is right here. So again, all the pointers over here will point into the same cylinder group.
So, therefore, you know, all the seeks over here will be 20 times faster on the average. 242 00:31:47.070 --> 00:31:49.230 William Cheng: All right, so in this case, what are we paying for, right, again? 243 00:31:49.830 --> 00:31:58.680 William Cheng: There's no free lunch, right, if we do things this way. Now when you try to create a file, you have to ask, where should I create this file? Right, before, when I tried to create a file, you know, where do I create the file? 244 00:31:59.160 --> 00:32:04.470 William Cheng: Well, it doesn't matter. To create a file, I can pick any inode over here, you know, as long as it's free, I'm going to create the inode, 245 00:32:05.130 --> 00:32:11.970 William Cheng: and then I can actually pick data blocks out of any part of the data region. So in that case, I have, of course, poor performance, 246 00:32:12.570 --> 00:32:16.770 William Cheng: you know, but it's very simple. Okay, now my file system is going to be much more complicated. 247 00:32:16.980 --> 00:32:25.470 William Cheng: When I try to create a new file, I ask, should this new file be in cylinder group number one, or number two, or, you know, well, which cylinder group should it belong to? 248 00:32:26.790 --> 00:32:33.600 William Cheng: Okay, so that will actually become a very difficult question to answer. Okay. So again, some people say we'll run an optimization algorithm to try to figure out where to go. So, you know, again, that, 249 00:32:33.900 --> 00:32:40.710 William Cheng: that will become very, very brittle, because every time, you know, you create more files or you delete files, you're going to end up shuffling the files around. 250 00:32:41.550 --> 00:32:49.830 William Cheng: Okay. And also, how do you keep track of the cylinder groups? Well, again, your superblock over here needs to keep track of all your cylinder groups.
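The seek-time argument for cylinder groups can be checked with a small Python sketch. The 4 ms average seek, the 0.2 ms one-track seek, and the 20-track group come from the lecture; the cost model itself (stepping one track at a time inside a group, a flat average seek otherwise) and all function names are assumptions made for illustration.

```python
AVG_SEEK_MS = 4.0         # average (random) seek, uses the main motor
ONE_TRACK_SEEK_MS = 0.2   # one-track seek, stepper motor only
TRACKS_PER_GROUP = 20     # 4.0 / 0.2 = 20 tracks per cylinder group


def seek_ms(from_track, to_track):
    """Assumed cost model: short moves step track by track, long moves pay
    one average seek."""
    distance = abs(to_track - from_track)
    if distance < TRACKS_PER_GROUP:
        return distance * ONE_TRACK_SEEK_MS
    return AVG_SEEK_MS


def total_seek_ms(tracks):
    """Total seek time for visiting the given tracks in order."""
    return sum(seek_ms(a, b) for a, b in zip(tracks, tracks[1:]))


if __name__ == "__main__":
    clustered = [100, 101, 103, 100]   # blocks kept inside one cylinder group
    scattered = [100, 5000, 40, 9000]  # blocks all over the data region
    print("clustered:", total_seek_ms(clustered), "ms")
    print("scattered:", total_seek_ms(scattered), "ms")
```

Under this model the worst case inside a group stays below one average seek, which is the intuition behind grouping about 20 tracks together.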
251 00:32:50.970 --> 00:32:58.410 William Cheng: Okay, right, so what people have to do is develop some kind of a heuristic to say, when you create a new file, 252 00:32:58.620 --> 00:33:08.250 William Cheng: which cylinder group it goes to. Right, again, computer science people don't like to run optimization algorithms unless, you know, they can make sure that it's efficient and also that it doesn't cause too much trouble, you know. 253 00:33:09.480 --> 00:33:12.960 William Cheng: So otherwise they will try to use a heuristic, right. So, for example, 254 00:33:13.470 --> 00:33:21.000 William Cheng: I mean, I don't think this is the kind of heuristic they actually use, but here's a little basic example. If you have a file system, you know, 255 00:33:21.960 --> 00:33:25.770 William Cheng: hierarchy that looks like this, how would you put the files into cylinder groups? 256 00:33:26.550 --> 00:33:32.010 William Cheng: Okay, so if you look at this picture, you say, oh, I have C files over here. So what am I going to do with a C file? I'm going to compile it, 257 00:33:32.340 --> 00:33:39.000 William Cheng: right. And to compile it, there's going to be a makefile. So where's the makefile, right? The makefile might be inside this directory. So maybe for a.c, 258 00:33:39.240 --> 00:33:48.180 William Cheng: b.c, there's a makefile inside the same directory. When you type make, what it will do is create temporary files. So maybe all the temporary files get created inside the same directory. 259 00:33:48.510 --> 00:33:53.670 William Cheng: Right. So in that case, what I will do is group all these things together, right, because when you type make, what's going to happen, right?
260 00:33:53.850 --> 00:34:04.110 William Cheng: It's going to scan the entire directory, find all the C files, and then find the rules in the makefile to compile them into .o files. So what it will do is actually use the directory entries many, many times 261 00:34:04.800 --> 00:34:10.620 William Cheng: over. So maybe it's a good idea to put the directory inode and the file inodes over here all into the same cylinder group. 262 00:34:11.460 --> 00:34:19.560 William Cheng: Okay. Similarly, you know, for this directory over here, there's going to be a makefile. So again, I can put everything over here inside one cylinder group. But then what about this file over here? 263 00:34:20.130 --> 00:34:26.490 William Cheng: Right. I mean, some people have, you know, sophisticated makefiles. They have the C files scattered all over the place, and inside every subdirectory there's a makefile. 264 00:34:26.640 --> 00:34:32.370 William Cheng: So in this case, you know, this file, where does it go? Should it go to a separate cylinder group? Should it go to this cylinder group? Should it go to that cylinder group? 265 00:34:32.550 --> 00:34:38.970 William Cheng: So again, inside the file system, you can't really have that much knowledge, right. So therefore, in the end, you have to use some heuristic to figure out where it goes. Okay. 266 00:34:39.210 --> 00:34:48.480 William Cheng: So here's an example. Maybe you will put them into cylinder groups this way. It doesn't mean that this is the best way, right. So maybe you do it some other way. Or maybe you'll try to put everything inside one cylinder group, and that will be the best. 267 00:34:49.350 --> 00:35:01.020 William Cheng: Okay, but again, sometimes, you know, you might not have enough space inside that cylinder group.
So again, you know, when you implement a file system, you're going to implement some kind of a heuristic so that you can actually figure out, when you create a file, which cylinder group it goes into. 268 00:35:02.130 --> 00:35:10.530 William Cheng: All right, so if you implement those two approaches, one is to use a larger block size and one is to implement cylinder groups, how do you do? 269 00:35:10.980 --> 00:35:20.490 William Cheng: Okay. As it turns out, when they actually implement this on a Rhinopias hard drive, when they implement the fast file system, if you use these two techniques, you're going to end up with a factor of 20 improvement. 270 00:35:21.240 --> 00:35:28.290 William Cheng: Not 20%, a factor of 20. Okay, but before, the performance was so poor, right; before, it was 0.16% of 271 00:35:29.700 --> 00:35:47.460 William Cheng: the disk capacity. Now you multiply by 20, and you still only get 3.7% of the maximum transfer capacity of the disk. Okay, so it's still horrible, right, but you are 20 times faster already. Okay, so using these two techniques buys you a lot, right. So now we have minimized the seek time. 272 00:35:48.630 --> 00:35:53.940 William Cheng: All right, so the next thing is we need to improve the other one, the rotational latency. Yeah. 273 00:35:55.680 --> 00:36:01.590 William Cheng: All right, so how do you minimize rotational latency, right? So let's say that, you know, the data that you need over here is actually in sector six over here, 274 00:36:02.100 --> 00:36:07.230 William Cheng: you know, and our disk head is right here. Right. So you perform a seek. When you finish the seek, now you've got, 275 00:36:07.470 --> 00:36:16.680 William Cheng: you've got rotational latency. If our disk head is right here, how long do you have to wait, right? You have to wait for the disk to actually spin this way. I mean, you know, for me it's just hard to make it spin, 276 00:36:16.920 --> 00:36:25.770 William Cheng: you know, on the slide.
So what I will do is actually move the disk head, right, move the disk head over here. So as the disk spins this way, the disk head over here is getting closer and closer to this particular sector. 277 00:36:25.980 --> 00:36:30.960 William Cheng: Once it gets to the beginning of this particular sector, now I've finished my rotational latency. 278 00:36:32.010 --> 00:36:40.290 William Cheng: Okay, so now I start transferring data one bit at a time over here, transferring into the disk buffer. When the operation is done, I'm going to DMA the data into memory and then wake up the, you know, 279 00:36:40.470 --> 00:36:46.440 William Cheng: kernel thread with an interrupt, like I said, right. Okay, so this amount of time that you need to wait over here, that's the rotational latency. 280 00:36:47.610 --> 00:36:51.210 William Cheng: Right. So what's wrong with rotational latency inside S5FS? 281 00:36:51.540 --> 00:36:59.160 William Cheng: Okay, so to show the problem with the rotational latency, I'm going to draw a slightly different picture over here. Okay, so let's say that you are transferring 282 00:36:59.520 --> 00:37:04.890 William Cheng: data from sector number one. So your disk head is right here, right. So you wait for your rotational latency, and the disk head is right here. 283 00:37:05.040 --> 00:37:10.710 William Cheng: And now you're going to copy all this data into the disk controller buffer. So here's the disk controller over here; there's a buffer sitting right here. 284 00:37:10.860 --> 00:37:16.980 William Cheng: Yeah, so the data over here is going to be converted one bit at a time, converted from magnetic data 285 00:37:17.160 --> 00:37:23.460 William Cheng: into an electrical signal. It's going to go over the disk arm and then go into the disk controller buffer one bit at a time.
286 00:37:23.670 --> 00:37:31.830 William Cheng: Okay, so while it does this one bit at a time, the disk is going to continue to spin. As the disk spins, I'm going to draw the picture by moving the disk head all the way to the end over here. 287 00:37:32.160 --> 00:37:36.360 William Cheng: When it finishes transferring the last bit of the data into the disk controller buffer, what's going to happen, right? 288 00:37:36.960 --> 00:37:41.220 William Cheng: So at this time, what happens is the disk controller says, oh, I have a sector full of data. 289 00:37:41.580 --> 00:37:51.690 William Cheng: Okay, so now I need to DMA the data into memory. So where's my memory? My memory is sitting right here, right. This is the memory that you have, you know, the gigabytes of memory on your laptop. 290 00:37:52.050 --> 00:37:57.810 William Cheng: So what it will do is start a DMA operation, right, to transfer data into memory over here. It's going to take some time. 291 00:37:58.740 --> 00:38:08.880 William Cheng: Okay, when this is finished, what it will do is interrupt the CPU. The CPU is sitting right over here. So what it will do is pull the interrupt line on the bus to interrupt the CPU. Does the CPU respond right away? 292 00:38:09.270 --> 00:38:19.500 William Cheng: Well, no. The CPU, you know, might be servicing a higher-level interrupt, so this interrupt can be blocked, or maybe, you know, interrupts are disabled because the CPU is running code inside the hardware abstraction 293 00:38:19.920 --> 00:38:28.350 William Cheng: layer. So for any reason, you know, the CPU might not service the interrupt right away, right. We mentioned that the disk interrupt,
it's kind of a medium-priority interrupt, right. 294 00:38:28.710 --> 00:38:37.470 William Cheng: Over here, some interrupts are more important, the keyboard interrupt is less important, and the disk is kind of medium priority. So therefore it's going to take some time before the CPU actually services the interrupt. 295 00:38:38.130 --> 00:38:42.900 William Cheng: Okay. All this time, what's happening on the disk? Well, the disk continues to spin, right; the disk doesn't slow down. 296 00:38:43.140 --> 00:38:48.660 William Cheng: Okay. So therefore, what happens over here is that the disk continues to spin, and then eventually, you know, the CPU gets 297 00:38:48.840 --> 00:38:56.340 William Cheng: inside the interrupt service routine. What are the two things that happen? Number one, you unblock the kernel thread. Number two, you start the next I/O operation. 298 00:38:56.970 --> 00:38:58.410 William Cheng: What is the next I/O operation? 299 00:38:59.010 --> 00:39:02.880 William Cheng: The next I/O operation is to transfer the data in sector number two into memory. 300 00:39:03.120 --> 00:39:10.890 William Cheng: Okay, so the operating system tells the disk controller, now go to sector number two and transfer that into memory. So when the disk controller over here starts doing that, where is the disk head? 301 00:39:11.880 --> 00:39:18.240 William Cheng: Okay, so remember the disk continues to spin. By the time you start the next operation, the disk head is actually right here already. 302 00:39:18.660 --> 00:39:26.220 William Cheng: The disk head might even be at the beginning of sector three over here. So now, how long does it have to wait? It has to wait for an entire revolution of the disk. 303 00:39:28.410 --> 00:39:32.910 William Cheng: Okay, so an entire revolution of the disk is on the order of milliseconds.
304 00:39:34.140 --> 00:39:39.750 William Cheng: Okay, so in this case, remember, you know, the disk is spinning at, you know, 10,000 revolutions per minute. 305 00:39:40.170 --> 00:39:44.730 William Cheng: If you divide that down, you know, waiting for an entire revolution is going to be really, really long. 306 00:39:45.300 --> 00:39:56.250 William Cheng: Right. The textbook actually does some analysis, basically showing you that when you finish the first sector, on the average, you need to, you know, DMA the data into memory, you need to interrupt the CPU, 307 00:39:56.580 --> 00:40:01.080 William Cheng: and then wait for the interrupt service routine to get executed. By the time you start issuing the next command, 308 00:40:01.560 --> 00:40:09.390 William Cheng: you know, the disk head will be in the middle of the second sector. Remember there are on the average 750 sectors over here in, 309 00:40:09.750 --> 00:40:19.560 William Cheng: inside this track over here. They are densely packed next to each other. So therefore, yeah, when you're ready to issue the next, you know, data transfer command to the disk, 310 00:40:20.010 --> 00:40:21.570 William Cheng: you know, your head is in the next sector. 311 00:40:22.230 --> 00:40:29.490 William Cheng: Okay, so this is why S5FS is so slow, right, because when you try to access the data, you know, when you try to transfer data from 312 00:40:29.760 --> 00:40:37.920 William Cheng: the first sector over here, on the average, you need to wait for half a revolution of the disk; for all the other sectors, you need to wait for an entire revolution of the disk. 313 00:40:39.000 --> 00:40:41.250 William Cheng: That's terrible, right. So what is the solution over here?
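To see how long "an entire revolution" is, the arithmetic above can be written out in Python. The 10,000 RPM figure is from the lecture; the function names are hypothetical, and the wait model is a simplification (transfer time is ignored, and each sector after the first is charged one full revolution, as described above).

```python
RPM = 10_000  # rotation speed mentioned in the lecture


def revolution_ms(rpm):
    """Time for one full revolution of the disk, in milliseconds."""
    return 60_000 / rpm


def rotational_wait_ms(n_sectors, rpm=RPM):
    """Rotational wait for n consecutively numbered sectors without
    interleaving: half a revolution on average for the first sector,
    then a full revolution for every sector after it."""
    rev = revolution_ms(rpm)
    return rev / 2 + (n_sectors - 1) * rev


if __name__ == "__main__":
    print("one revolution:", revolution_ms(RPM), "ms")
    print("wait for 4 sectors:", rotational_wait_ms(4), "ms")
```

At 10,000 RPM one revolution is 6 ms, so even a handful of sectors costs tens of milliseconds of pure waiting under this model.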
314 00:40:42.330 --> 00:40:46.980 William Cheng: Well, the solution over here is actually very simple, right? Don't call this one sector number two. Why don't you call this particular one number two? 315 00:40:47.610 --> 00:40:53.940 William Cheng: Okay, if you call this one sector number one over here, this one sector number two, then this one is three, this one is four, this one is five, six, seven, eight. 316 00:40:54.360 --> 00:41:03.030 William Cheng: Well, in that case, if the disk head is in the middle of the next sector over here, all you have to do is wait for a small amount of rotational latency and then you get to sector two. 317 00:41:04.020 --> 00:41:16.380 William Cheng: Okay. So this is called block interleaving. Again, this is called block interleaving. I just need to renumber my block numbers. Okay, I call these 1, 2, 3, 4, 5, 6, 7, 8 over here. So this way, you know, 318 00:41:16.740 --> 00:41:23.220 William Cheng: you know, I can minimize my rotational latency, right? Because if the disk head is right here, all I have to do is wait for the disk to spin that amount. 319 00:41:23.730 --> 00:41:32.550 William Cheng: Okay, what if it turns out that my system is really, really busy? So by the time I start transferring sector number two over here, my disk head is over here. 320 00:41:33.030 --> 00:41:41.910 William Cheng: Well, in that case, again, all you have to do is renumber all these disk blocks. This one will be one, this one will be two, this one will be three, this one will be four. So again, you can play this trick. 321 00:41:42.270 --> 00:41:51.120 William Cheng: So what happens is that when you start formatting your hard drive, you need to have an estimate of, you know, what kind of configuration you have in the system. I mentioned before, 322 00:41:51.750 --> 00:41:56.130 William Cheng: you know, on the modern Intel CPU, they have 128 interrupt levels.
323 00:41:56.550 --> 00:42:04.200 William Cheng: Okay, so in that case, where does the disk sit, right? If the disk interrupt is at a very, very low level, it is possible that you have to wait for a long time before you start the next operation. 324 00:42:04.470 --> 00:42:08.970 William Cheng: Right. So in that case, okay, you need to sort of benchmark your system, you know, trying to find out what's the right configuration. 325 00:42:09.090 --> 00:42:18.120 William Cheng: So this way you can actually figure out what interleaving factor to plug in: you're interleaving every other block, every two blocks, every three blocks. So this way you have the best performance. 326 00:42:19.320 --> 00:42:21.000 William Cheng: Alright, so again, this is a little bit tricky. 327 00:42:21.270 --> 00:42:30.870 William Cheng: But the FFS people, you know, eventually sort of figured out what kind of algorithm to run in order to figure that out. So again, the actual algorithm is more for an advanced operating systems class, anyway. 328 00:42:31.320 --> 00:42:40.320 William Cheng: So anyway, over here we just say that in the fast file system, they use block interleaving, and it turns out it works out really well. Okay. So how well does it work out? 329 00:42:41.220 --> 00:42:52.800 William Cheng: Well, as it turns out, if you do just that, it will improve the performance of your file system 15-fold. Okay, a factor of 15 improvement. Okay, so the total improvement in your performance 330 00:42:53.490 --> 00:43:03.870 William Cheng: right now is this: before, we got to 3.7% of the disk transfer capacity. If you multiply that by 15, you're going to get to almost 50% of the disk transfer capacity.
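The renumbering trick and the interleaving factor described above can be sketched as a small layout generator. This is an illustrative sketch only, not the formatting algorithm used by any real file system; `interleaved_layout` and its `skip` parameter (the number of physical slots left between consecutive logical blocks) are names assumed for the example.

```python
from typing import List


def interleaved_layout(n_sectors: int, skip: int) -> List[int]:
    """Return the logical block number stored in each physical slot of a track.

    layout[slot] is the logical block (1-based) placed in that physical slot.
    skip=0 gives the plain 1, 2, 3, ... ordering; skip=1 interleaves every
    other block; skip=2 leaves two slots between consecutive blocks.
    """
    layout = [0] * n_sectors  # 0 marks a slot that is still free
    slot = 0
    for logical in range(1, n_sectors + 1):
        # If the target slot is already taken, slide forward to the next free one.
        while layout[slot] != 0:
            slot = (slot + 1) % n_sectors
        layout[slot] = logical
        # Leave `skip` slots for the disk to rotate past before the next block.
        slot = (slot + skip + 1) % n_sectors
    return layout


if __name__ == "__main__":
    for skip in (0, 1, 2):
        print(skip, interleaved_layout(8, skip))
```

With eight sectors, `skip=1` yields `[1, 5, 2, 6, 3, 7, 4, 8]`: after block 1 is transferred, block 2 arrives under the head one sector later instead of a full revolution later. Picking the right `skip` is the benchmarking question raised in the lecture.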
331 00:43:04.320 --> 00:43:12.690 William Cheng: Okay. So if you use all these three techniques, you know, a larger block size, and then you do the cylinder group, and then you do block interleaving, 332 00:43:13.260 --> 00:43:22.260 William Cheng: the total file system performance is going to be, you know, around 32 megabytes per second. So that's around 50% of the disk transfer capacity of the Rhinopias drive. 333 00:43:23.340 --> 00:43:24.540 William Cheng: Okay. Is 50% good? 334 00:43:25.890 --> 00:43:31.920 William Cheng: I mean, some people say, that's terrible, right? Because I spent so much money on a hard drive, and all I get is about 50% of the disk transfer capacity. So, yeah, 335 00:43:32.370 --> 00:43:35.460 William Cheng: it's not good enough. Yeah, but it's pretty good compared to what you had to start with. 336 00:43:36.120 --> 00:43:42.960 William Cheng: Okay. So again, let's review. In the beginning, we only reached 0.16% of the disk transfer capacity by using the S5FS file system. 337 00:43:43.470 --> 00:43:56.520 William Cheng: With the fast file system without block interleaving, you know, it's about 20 times faster; you can reach 3.7% of the disk transfer capacity. If you also turn on block interleaving, you're going to get to, you know, 50% of the transfer capacity. 338 00:43:57.270 --> 00:44:04.950 William Cheng: Okay, can we reach 100% of the disk transfer capacity? Well, the answer of course is that we can; we have to use other tricks, and we're going to see that a little bit later. Yeah. 339 00:44:07.260 --> 00:44:19.680 William Cheng: All right. Um, what else do we have to look at over here? Okay, so how do we get to 100% of the disk transfer capacity? Right. So again, you know, the data is scattered all over the place over here. So how do we actually make it faster? Yeah. 340 00:44:21.930 --> 00:44:31.290 William Cheng: So let's sort of review what we have so far, right? In the beginning, we have the S5FS file system.
And, you know, what does the disk look like? Like this. Right. You know, we have the boot block, the 341 00:44:31.710 --> 00:44:34.350 William Cheng: superblock, the inodes over here, the data region. 342 00:44:34.740 --> 00:44:44.370 William Cheng: They are scattered all over the place. That's really, really slow. For the fast file system, we make the block bigger over here, right? So in order for you to store a file, you need fewer blocks. Okay. So over here, in the picture I drew here, 343 00:44:44.580 --> 00:44:55.620 William Cheng: the block is bigger. Also the inode and all these blocks, they are stored in the same cylinder group. So this way you minimize the seek time, because all you need to turn on is a stepper motor. Okay. 344 00:44:56.310 --> 00:45:03.930 William Cheng: So the question over here: how can you actually get to, you know, 100% of the disk transfer capacity? Well, as it turns out, it is very, very difficult. 345 00:45:04.560 --> 00:45:12.630 William Cheng: Okay, so what happened over here is that, you know, over time, as it turns out, memory actually gets, you know, cheaper and cheaper 346 00:45:13.020 --> 00:45:19.530 William Cheng: relative to the cost of the storage system. Okay. So this picture is really not drawn to scale. 347 00:45:19.800 --> 00:45:26.130 William Cheng: In the beginning, in the 1980s, over here we have S5FS, and in the 1990s we have the fast file system. 348 00:45:26.460 --> 00:45:32.850 William Cheng: So, at that time, you know, things were very, very expensive. If you wanted to get, you know, 64 megabytes of memory, it was going to cost you a lot of money. 349 00:45:33.570 --> 00:45:38.070 William Cheng: Okay, so in the good old days, you know, we couldn't really have too much memory. I mean, today we have eight gigabytes of memory, 16 gigs 350 00:45:38.250 --> 00:45:50.100 William Cheng: of memory.
It was unimaginable in the good old days, right? In the 1980s, 1990s, that was going to cost you a lot of money. Okay. So over time, memory became cheaper and cheaper and cheaper. So if memory becomes really, really cheap, what would you do? 351 00:45:51.210 --> 00:45:57.630 William Cheng: What if memory becomes really cheap? What happens is that at the time when you boot your system, why don't you read the entire disk into memory? 352 00:45:59.430 --> 00:46:05.550 William Cheng: Okay, so at the time when you boot your operating system, you read the entire disk into your memory. So let's say that your memory is so, so cheap, 353 00:46:05.730 --> 00:46:17.670 William Cheng: you can store any amount of data you want. If you read all the data from the disk into memory, well, now you will never have to go to the disk again. Okay. So, therefore, in this case, what's going to be the capacity of the disk? What is going to be the disk transfer capacity? 354 00:46:18.840 --> 00:46:25.140 William Cheng: Well, it depends on how many times you go to the disk, right? If you go to the disk 1000 times, every time the cost is going to be zero, right? Because at 355 00:46:25.500 --> 00:46:32.010 William Cheng: the beginning over here, you spend all your time getting all the data into memory. From this point on, anytime you go to the disk, the cost is zero. 356 00:46:32.310 --> 00:46:38.220 William Cheng: Right. So in this case, you can average all you want; you can end up with the overall disk transfer capacity actually going to infinity. 357 00:46:39.300 --> 00:46:48.000 William Cheng: Okay, so if you run your system for a while and you compute the average number of times you go to the disk versus the total access time, well, the disk transfer capacity will actually go to infinity. 358 00:46:49.110 --> 00:46:55.890 William Cheng: Okay, but in reality, of course, you know, if you think about it today, right.
How much storage capacity do you have? Your storage capacity is going to be one terabyte, 359 00:46:56.160 --> 00:47:03.000 William Cheng: and your memory, you know, can be 16 gigabytes. Well, so again, you can't really go to, you know, the entire disk and then copy all of it into memory. 360 00:47:04.020 --> 00:47:13.800 William Cheng: Okay, so therefore we can't actually copy all of it. But what if we can go to the disk and copy all the data that you need into memory? Okay, every time you turn on your laptop, you do something, and you turn off your, 361 00:47:15.180 --> 00:47:22.080 William Cheng: close your laptop, how much storage capacity do you actually use? Okay, is the storage capacity that you use actually less than the physical 362 00:47:22.470 --> 00:47:28.890 William Cheng: amount of memory that you actually have? Okay. If the answer is yes, if we can be smart about it, 363 00:47:29.100 --> 00:47:36.720 William Cheng: then any data that we need to access, we bring it in from the disk into memory, and we're just going to keep it in memory and never actually, you know, put it back onto the disk. 364 00:47:37.410 --> 00:47:43.020 William Cheng: Okay, so this way, if the amount of data that you access is less than 16 gigabytes, and you have 16 gigabytes of memory, 365 00:47:43.260 --> 00:47:52.440 William Cheng: all you need to do is bring it in once from the disk into memory, and from this point on, you can go to the disk as many times as you want and the cost is going to be zero, because the data is already in memory. 366 00:47:53.190 --> 00:47:56.430 William Cheng: Well, we mentioned the term before. This is called a file system cache. 367 00:47:56.700 --> 00:48:09.690 William Cheng: Right. So if you have a file system cache.
You can cache all the data that you need. In this case, you never have to go to the disk again, and now, you know, the disk transfer capacity can actually be infinity, and clearly infinity is more than 100% of the disk transfer capacity. 368 00:48:10.500 --> 00:48:20.850 William Cheng: Because, again, the infinity comes from the average, right? So the average can actually play tricks on you. The disk is actually not that fast, right? But, you know, on the average, the disk transfer capacity can be much, much higher than 369 00:48:21.330 --> 00:48:31.260 William Cheng: you know, 64 megabytes per second. Okay. All right. So over time, over here, you know, in 2010 or 2020 over here, we're going to be in this region. 370 00:48:31.770 --> 00:48:40.710 William Cheng: We can actually have aggressive caching. Right. So all the data from the disk, we're going to cache all of it inside the file system cache. So now once you bring it in, you don't have to go to the disk ever again. 371 00:48:41.310 --> 00:48:47.190 William Cheng: Yeah, so the solution over here is refer S That's over here, sort of, sort of search. An example here, right. 372 00:48:47.880 --> 00:48:55.500 William Cheng: So, you know, once again, we're going to use caching. Well, whenever you need to access data on the disk, first you look inside the file system cache. 373 00:48:55.680 --> 00:49:01.980 William Cheng: If the data is inside the file system cache, then you use that. Right. In this case, you don't have to go to the disk. It's as if the disk access time is equal to zero. 374 00:49:02.490 --> 00:49:11.190 William Cheng: Okay, if the data is not in the cache, in this case, you have to go to the disk. When you go to the disk, you know, on the average, it's going to take you on the order of, you know, 375 00:49:11.730 --> 00:49:21.000 William Cheng: So one millisecond over here. Okay, so, so you don't have you one of those.
And I guess we mentioned before it's between one and 10 milliseconds. You're going to go to the disk, you know, copy the data, you know, 376 00:49:21.480 --> 00:49:27.090 William Cheng: it's going to cost you, you know, one to 10 milliseconds over here, and then you cache it inside your file system cache. 377 00:49:27.690 --> 00:49:30.750 William Cheng: Okay, so next time when you access that data, it will be inside the file system cache. 378 00:49:31.650 --> 00:49:34.230 William Cheng: And so in this case, what's going to be, you know, the performance of the file system? 379 00:49:34.680 --> 00:49:42.870 William Cheng: Well, again, it depends on your hit rate. Right. If your hit rate is 90% over here, that means that 90% of the time your cost is going to be zero. 380 00:49:43.170 --> 00:49:49.080 William Cheng: Okay, and 10% of the time, what is going to be the cost? Well, your cost is going to be, you know, one disk access. 381 00:49:49.890 --> 00:50:00.210 William Cheng: Okay, so therefore, on the average, it's going to be 0.9 times zero plus 0.1 times, so 10% of, your disk access time. 382 00:50:00.960 --> 00:50:09.600 William Cheng: Okay, so your disk transfer rate over here is 50%, it's 32 megabytes per second, and on the average it's going to look like it's running 10 times faster. 383 00:50:10.500 --> 00:50:25.290 William Cheng: Okay, if your hit rate is 99% over here, that means that for every 100 disk accesses over here, 99 of the time the cost is zero and one time it's going to cost you one disk access, so in the end, your file system will look like it's 100 times faster. 384 00:50:26.550 --> 00:50:35.340 William Cheng: Okay. So, therefore, you know, if you have a bigger file system cache, you can achieve a higher hit rate. Then it doesn't really matter.
You know, how slow your disk is. 385 00:50:35.730 --> 00:50:46.950 William Cheng: So in this case you're going to be 100 times faster, you know, than the original disk. You don't even have to play the trick of using a cylinder group and all that kind of stuff. So as long as you use a big file system cache, your performance is going to be good. 386 00:50:47.970 --> 00:50:55.890 William Cheng: Okay, so this new era is known as using a buffer cache. So when we use the term buffer cache, we're talking about the cache inside the file system. 387 00:50:56.490 --> 00:51:03.720 William Cheng: OK. So, again, the thing is, we're going to have a gigantic cache. From what I heard, some systems actually use one quarter of your memory, 388 00:51:04.650 --> 00:51:16.770 William Cheng: devoted to the buffer cache. So if you have 16 gigabytes of physical memory, four gigabytes of that is going to be devoted to the file system cache. Some systems actually use one fifth of the physical memory. So again, if you have 389 00:51:17.880 --> 00:51:19.140 William Cheng: 16 gigabytes of memory, 390 00:51:19.740 --> 00:51:26.970 William Cheng: you know, one fifth of it is going to be 3.2 gigabytes devoted to the file system cache. Yeah. So this way, when you perform a read operation over here, 391 00:51:27.150 --> 00:51:32.220 William Cheng: you trap inside the kernel, you check whether the data is inside the buffer cache or not. If it's not, you're going to pay 392 00:51:32.490 --> 00:51:37.530 William Cheng: for one disk access. Your kernel thread is going to fall asleep and wait for the data to get transferred into the buffer cache. 393 00:51:37.740 --> 00:51:43.380 William Cheng: Once you finish doing that, you keep a copy inside the buffer cache. And then you wake up the kernel thread and you return from the read. 394 00:51:43.620 --> 00:51:53.730 William Cheng: Next time, when you read, hopefully, the data is going to be inside the buffer cache, right.
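The read path and the hit-rate arithmetic from this part of the lecture can be put together in a short sketch. Everything here is a simplification made for illustration: the "disk" is a plain dictionary, the cache never evicts anything, a hit is assumed to cost zero (as in the lecture), and the names `BufferCache`, `average_access_ms`, and `speedup` are invented for the example rather than taken from any kernel.

```python
class BufferCache:
    """Toy read-only buffer cache in front of a dictionary standing in for a disk."""

    def __init__(self, disk):
        self.disk = disk        # block number -> data (stand-in for the real disk)
        self.blocks = {}        # cached copies, never evicted in this sketch
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:      # hit: no disk access needed
            self.hits += 1
            return self.blocks[block_no]
        self.misses += 1                 # miss: pay for one disk access
        data = self.disk[block_no]
        self.blocks[block_no] = data     # keep a copy for next time
        return data


def average_access_ms(hit_rate: float, disk_access_ms: float) -> float:
    """Average cost per access when a hit costs 0 and a miss costs one disk access."""
    return hit_rate * 0.0 + (1.0 - hit_rate) * disk_access_ms


def speedup(hit_rate: float) -> float:
    """How much faster the file system appears compared with no cache at all."""
    return 1.0 / (1.0 - hit_rate)


if __name__ == "__main__":
    cache = BufferCache({7: b"block seven"})
    cache.read(7)   # miss, goes to the "disk"
    cache.read(7)   # hit, served from the cache
    print(cache.hits, cache.misses)
    print(round(speedup(0.90)), round(speedup(0.99)))
```

A 90% hit rate leaves one access in ten paying the disk cost, so the average cost drops to a tenth of a disk access; at 99% it drops to a hundredth, which is the 10-times and 100-times figure given in the lecture.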
So again, if you have a high hit rate, then you're going to perform really well. So that would be good for the read operation. And what about write? 395 00:51:54.750 --> 00:52:00.060 William Cheng: Can we improve on the write operation? What if the data over here, the disk block, is already inside the buffer cache? 396 00:52:00.450 --> 00:52:06.360 William Cheng: Okay. So in that case, what you have to do is you have to actually modify it. Okay. When you modify it, do you need to write it back to the disk? 397 00:52:07.020 --> 00:52:16.260 William Cheng: Okay, so we talked about, you know, the tricks that we used before to make the fast file system fast. Those tricks are good for reading ahead. 398 00:52:16.710 --> 00:52:24.660 William Cheng: Okay, they're actually not very good for writing, because you can't write ahead, right? You can't really predict what you're going to write. Okay. So therefore, when it comes to writing data to the disk, 399 00:52:25.140 --> 00:52:33.960 William Cheng: in that case, you actually have to make it out to the disk, right? Because the semantics of write is that when you make a write system call, when write returns, you assume that the data has gone to the disk. 400 00:52:35.220 --> 00:52:43.380 William Cheng: Okay, so therefore, in that case, even though you can improve the performance of the read, you can't really, you know, improve the performance of the write, 401 00:52:44.700 --> 00:52:54.420 William Cheng: the write operation on the disk. Now, typically this is done using two different approaches. One is called write-through, the other one is called write-back. So when you are using write-through, 402 00:52:55.200 --> 00:53:01.590 William Cheng: then when you write the data into the buffer cache, you also need to write to the disk. You actually have to wait for the write to complete.
403 00:53:01.800 --> 00:53:11.760 William Cheng: Because of this, again, your kernel thread is going to fall asleep. When you finish writing data to the disk over here, now the data over here is going to be exactly the same as the data on the disk. So therefore, now this one is going to become clean. 404 00:53:12.150 --> 00:53:21.150 William Cheng: Because remember, a disk block can either be clean or dirty. If the block is dirty, that means that it's not the same as the disk. So, therefore, in this case, we need to clean it by writing the data out to the disk. 405 00:53:21.510 --> 00:53:29.880 William Cheng: Okay. So in this case, every time you try to write data to the buffer cache over here, if we're using write-through, then the performance of your file system is going to be really, really bad. 406 00:53:30.750 --> 00:53:36.810 William Cheng: Okay. I don't know if you actually noticed that whenever you are, you know, booting Ubuntu 16.04 inside VirtualBox, 407 00:53:37.140 --> 00:53:39.510 William Cheng: you should see a message on the screen that says using write-through. 408 00:53:40.020 --> 00:53:46.710 William Cheng: Okay. At the beginning part of your system, when you try to boot your operating system, when you boot the file system, they are actually using write-through. 409 00:53:47.100 --> 00:53:50.010 William Cheng: At some point they're going to switch to a different mechanism, using write-back. 410 00:53:50.520 --> 00:53:54.420 William Cheng: Yeah. So again, the problem with write-through is that it's going to be really, really slow. 411 00:53:54.780 --> 00:54:03.810 William Cheng: But the advantage of write-through is that the buffer cache will always have exactly the same thing as the data on the disk. So this way when you return from write, you know the data has actually gone to the disk. 412 00:54:04.770 --> 00:54:14.220 William Cheng: Okay.
So if you're not using write-through, that means that the data doesn't go to the disk. So therefore you're taking a risk. And some people actually decided, well, it's okay to actually take a little bit of risk. 413 00:54:14.880 --> 00:54:20.280 William Cheng: Right, so if you want to take a little bit of risk, you say, well, you know, what am I sacrificing for? 414 00:54:20.610 --> 00:54:25.890 William Cheng: OK, so what you're sacrificing for over here is going to be the performance of your file system. If you're willing to 415 00:54:26.160 --> 00:54:31.860 William Cheng: take a little bit of risk in order to get good performance in the file system, then you can actually use write-back. 416 00:54:32.580 --> 00:54:37.410 William Cheng: Okay, so the write-back thing over here is that when I try to perform a write operation to the disk, 417 00:54:37.650 --> 00:54:48.120 William Cheng: I'm going to modify the buffer cache and I'm only going to mark this block over here as dirty, and then I will return from the write system call as if the data has already gone to the disk. 418 00:54:48.990 --> 00:54:57.210 William Cheng: Okay, so some people will say, well, that's terrible, right? Because clearly this is wrong, because, you know, the semantics of write is that when you return from the write system call, you have modified the data on the disk already, 419 00:54:57.480 --> 00:55:06.510 William Cheng: whereas over here we're sort of cheating. We're being a little optimistic, to say that we're not going to crash, we're not going to lose power, so therefore it's going to be okay, because eventually all this data will get out to the disk. 420 00:55:06.960 --> 00:55:12.030 William Cheng: Okay, so when does this data actually get written out to the disk? Well, so therefore you're going to have a 421 00:55:12.360 --> 00:55:21.720 William Cheng: special thread inside the kernel.
So what it will do is it will scan the buffer cache, look for all these disk blocks that are dirty, and what it will do is that it will make them busy and then, again, you know, 422 00:55:22.080 --> 00:55:28.530 William Cheng: unmap them from the address space and write them out to the disk one at a time, and when they're finished, you really need to see whether, you know, 423 00:55:28.980 --> 00:55:34.560 William Cheng: some process is waiting for them. If some process is waiting for them, in that case, they will actually, 424 00:55:35.040 --> 00:55:40.110 William Cheng: you know, map them back into the address space, you know, so that they can use it. So again, this is, 425 00:55:40.350 --> 00:55:49.680 William Cheng: this is a little different from our pageout daemon. This one is the file system cache. So when you try to write data to the disk, you want to make sure no process can access it. So that's why you make them busy 426 00:55:49.860 --> 00:55:53.760 William Cheng: and you remove them from the address space. As soon as you finish writing, when you're 427 00:55:54.690 --> 00:56:00.600 William Cheng: writing that data back to the disk, you don't return them back to the free list, right? Because this is the file system cache; you still need to keep them inside the cache. 428 00:56:01.080 --> 00:56:05.250 William Cheng: Okay, so therefore, again, you're going to map them back to the address space of all these processes so they can use it. 429 00:56:06.180 --> 00:56:13.920 William Cheng: Yeah. Alright, so that is called write-through, and this is called write-back, right? So you modify your buffer cache over here but you mark the blocks dirty. 430 00:56:14.430 --> 00:56:23.280 William Cheng: You know, so it is going to take some time for the data to go to the disk. Right. So how long does it take for the data to go to the disk? Maybe it's going to take five seconds.
431 00:56:23.520 --> 00:56:34.620 William Cheng: Or 10 seconds. So if you're unlucky, if you lose your power or you have a crash or something like that, then the data inside the buffer cache over here will be different from the file system. 432 00:56:35.520 --> 00:56:41.910 William Cheng: Okay, so in that case, you know, the data block that you modified in the buffer cache over here, what if that's the data block for the root inode? 433 00:56:42.810 --> 00:56:48.210 William Cheng: Okay, the data block over here is the root inode. Well, then next time when you boot the file system, right, so let's say you lose power there, 434 00:56:48.390 --> 00:56:58.800 William Cheng: the data in the file system on the disk over here will be inconsistent. Next time when you boot the file system over here, you're going to end up losing half of your disk. Okay. Or maybe, if you have really bad luck, you're going to lose your entire disk. 435 00:56:59.910 --> 00:57:04.650 William Cheng: Okay, so you guys are pretty lucky today. You know, I mean, in the good old days over here, this happened all the time. And also, every time, 436 00:57:05.130 --> 00:57:08.700 William Cheng: you know, so what happened is that every time when you lose power, 437 00:57:09.000 --> 00:57:14.760 William Cheng: next time when you boot the system, you know, you're going to cross your fingers and hope that you don't lose the hard disk. 438 00:57:15.030 --> 00:57:18.780 William Cheng: Okay. I'm going to sort of talk about how to actually address this issue. So this issue is 439 00:57:19.470 --> 00:57:25.350 William Cheng: with respect to, you know, the file system resiliency issue. We want to make the file system crash proof. 440 00:57:25.950 --> 00:57:33.780 William Cheng: Okay. But now we're talking about file system performance. So certainly, if you use write-back.
We're going to assume that you never have a crash. So therefore write-back is actually a pretty good way to go. 441 00:57:34.710 --> 00:57:43.260 William Cheng: Okay, so later on I'm going to sort of deal with the nasty issue of what if you get a crash. So in that case write-back, actually, you know, doesn't really work very well. Yeah. 442 00:57:43.950 --> 00:57:51.390 William Cheng: Alright. So we're going to assume that you never have a crash at this time. Later on we're going to address some of the crash issues, right? So you're taking a little bit of risk. 443 00:57:51.750 --> 00:57:56.190 William Cheng: But as long as you don't get a crash, this actually works out perfectly okay. 444 00:57:57.330 --> 00:57:58.650 William Cheng: All right, so 445 00:57:59.880 --> 00:58:08.520 William Cheng: you know, so in this case, we need to continue to address another file system issue: when you try to write data, you know, onto the disk over here, 446 00:58:08.970 --> 00:58:15.660 William Cheng: you know, the disk update task over here, what it would do is that it will go to the buffer cache, look at all the dirty blocks over here, and write them to 447 00:58:16.140 --> 00:58:23.310 William Cheng: the disk one at a time. So in that case it's going to take a long time, right? Because all these data blocks over here, they're going to be scattered all over the disk. 448 00:58:23.790 --> 00:58:32.430 William Cheng: Okay. So in this case, you know, the disk update task over here, what it will do is it will try to be clever, you know, find all these disk blocks; if they belong to the same cylinder group, it will try to write them onto the disk. 449 00:58:32.790 --> 00:58:37.050 William Cheng: But in the end, the performance is not going to be as good as 50% of the disk transfer capacity.
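The difference between the two write policies, and the job of the disk update task, can be sketched in a few lines. This is a toy model under stated assumptions: the disk is a dictionary, blocks are never evicted, and the update task simply writes dirty blocks in sorted block-number order as a crude stand-in for grouping them by cylinder group. The class and method names are hypothetical and do not correspond to any particular kernel's API.

```python
class WriteCache:
    """Toy buffer cache supporting write-through and write-back."""

    def __init__(self, disk, write_through: bool):
        self.disk = disk                  # block number -> data (stand-in for the disk)
        self.cache = {}                   # cached blocks
        self.dirty = set()                # blocks whose cached copy differs from the disk
        self.write_through = write_through

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_through:
            # Write-through: the caller waits for the disk, so the block stays clean.
            self.disk[block_no] = data
        else:
            # Write-back: only mark the block dirty and return right away.
            self.dirty.add(block_no)

    def flush(self):
        """Disk update task: write every dirty block out and mark it clean.

        Sorting by block number is an assumption of this sketch, standing in
        for writing neighbouring blocks (same cylinder group) together.
        """
        written = sorted(self.dirty)
        for block_no in written:
            self.disk[block_no] = self.cache[block_no]
        self.dirty.clear()
        return written


if __name__ == "__main__":
    disk = {}
    wb = WriteCache(disk, write_through=False)
    wb.write(42, b"new data")
    print(42 in disk)      # False: a crash here would lose the update
    wb.flush()
    print(disk[42])        # b'new data'
```

The window between `write` returning and `flush` running is exactly the risk discussed in the lecture: with write-back, a power loss in that window leaves the disk without the update, whereas write-through never has such a window but pays a disk access on every write.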
450 00:58:37.770 --> 00:58:49.830 William Cheng: Okay, so in order for us to address the issue, we ask: well, you know, for our disk update task over here, is there a way to get the disk write performance to be as close as possible to 64 megabytes per second? 451 00:58:51.360 --> 00:59:00.390 William Cheng: All right, is there a file system organization out there that will actually be able to do so? So we know that for reading there is no problem, right? If we do all these tricks, we use the buffer cache, 452 00:59:00.600 --> 00:59:08.550 William Cheng: we can clearly beat 64 megabytes per second on the average for the disk transfer capacity. Okay, but what about writing data to the disk? Can we achieve near 64 megabytes 453 00:59:09.210 --> 00:59:14.250 William Cheng: per second? The answer is yes, but now is a good time to break, and next time we're going to sort of talk about what kind of file system 454 00:59:14.670 --> 00:59:23.490 William Cheng: will achieve close to 100% of the disk write capacity. Okay, then we're going to talk about the crash resiliency issue and how to make sure that we address this problem. Yeah.