WEBVTT 1 00:00:01.350 --> 00:00:09.420 William Cheng: This is the second part of lecture 14. I went back and changed one of the slides over here for the hashed page table. 2 00:00:09.990 --> 00:00:17.400 William Cheng: For those people who are not familiar with a hash table, this is just a reminder of the data structure called the collision resolution chain. 3 00:00:17.850 --> 00:00:25.500 William Cheng: The collision resolution chain is a list of key-value pairs for keys that hash to the same bucket. Right. So in this example, 4 00:00:25.980 --> 00:00:29.100 William Cheng: you know, for this bucket, here is the collision resolution chain. 5 00:00:29.370 --> 00:00:35.970 William Cheng: The key over here is the tag, because the virtual page number is the one that you feed to the hash function. So, therefore, that's the key. 6 00:00:36.180 --> 00:00:39.390 William Cheng: The value is the stuff that you're looking for, so that will be the page table entry. 7 00:00:39.600 --> 00:00:46.770 William Cheng: And then we build a linked list, right, so the link over here is going to be the next pointer that links everything together. That's how we end up with this particular data structure. Yeah. 8 00:00:47.130 --> 00:00:53.310 William Cheng: Alright. So again, if you're not familiar with a hash table, you should look at a data structures textbook and see how they work. 9 00:00:54.930 --> 00:01:03.570 William Cheng: All right, so we finished solving the space problem. We reduced the initial four megabytes of page table 10 00:01:03.990 --> 00:01:09.300 William Cheng: down to very, very little memory. But in this case, because of the space-time tradeoff, 11 00:01:09.960 --> 00:01:18.630 William Cheng: the performance is really terrible. 
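For anyone who wants to see the collision resolution chain from the start of this segment in code, here is a minimal Python sketch. It is only an illustration: the class names, the bucket count of 8, and the modulo hash are assumptions made for the example, and the page table entry is reduced to a bare physical page number.

```python
from dataclasses import dataclass
from typing import Optional

NUM_BUCKETS = 8  # hypothetical size, chosen only to force a collision below


@dataclass
class ChainNode:
    tag: int                            # key: the virtual page number
    pte: int                            # value: page table entry (here just a physical page number)
    next: Optional["ChainNode"] = None  # next pointer linking the chain together


class HashedPageTable:
    def __init__(self):
        self.buckets = [None] * NUM_BUCKETS

    def _hash(self, vpn):
        return vpn % NUM_BUCKETS        # stand-in hash function (assumption)

    def insert(self, vpn, pte):
        b = self._hash(vpn)
        # New node goes at the head of that bucket's collision resolution chain.
        self.buckets[b] = ChainNode(vpn, pte, self.buckets[b])

    def lookup(self, vpn):
        node = self.buckets[self._hash(vpn)]
        while node is not None:         # walk the chain, comparing tags
            if node.tag == vpn:
                return node.pte
            node = node.next
        return None                     # no entry for this virtual page number


table = HashedPageTable()
table.insert(3, 0x1A)
table.insert(11, 0x2B)                  # 11 % 8 == 3, so this lands in the same bucket as 3
```

Every node visited while walking a chain is another memory access, which is the time side of the space-time tradeoff mentioned above.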
So now, as promised, we're going to talk about the solution in hardware that tries to solve the performance problem by doing the 12 00:01:19.440 --> 00:01:25.050 William Cheng: translation in only one shot. Okay. The solution has a fancy name; it's known as the translation lookaside buffer. 13 00:01:25.590 --> 00:01:34.530 William Cheng: The abbreviation for translation lookaside buffer is TLB. So the TLB is simply a hardware cache. 14 00:01:34.980 --> 00:01:42.030 William Cheng: But in this case, it's a specialized hardware cache, because all that it caches are page table entries. 15 00:01:42.930 --> 00:01:50.610 William Cheng: Okay, so it's a specialized cache. It's not a data cache. It's not an instruction cache. I guess those people who are in hardware sort of know a bunch of different caches. 16 00:01:50.910 --> 00:02:02.100 William Cheng: For this one, the only things that it caches are page table entries. Right. So again, if you think about the page table, here's the CPU, here's your bus, right, and the page table is sitting in physical memory across the bus. 17 00:02:02.610 --> 00:02:06.810 William Cheng: So whenever you need to perform address translation, you need a page table entry. 18 00:02:07.110 --> 00:02:17.040 William Cheng: Okay, the way we talked about it before is that when you need a page table entry, you need to go across the bus to read the page table entry from the page table, and then 19 00:02:17.520 --> 00:02:25.740 William Cheng: you read that into your MMU, and then the MMU checks the validity bit and checks the protection bits, and then if everything checks out, it will use the physical page number. 20 00:02:26.100 --> 00:02:31.950 William Cheng: Yeah. 
So now we're going to cache the page table entries inside the translation lookaside buffer, inside the MMU. 21 00:02:32.340 --> 00:02:43.950 William Cheng: Okay, so over here is the CPU. There's the inner core of the CPU, and there is the translation lookaside buffer. The translation lookaside buffer is so big that it usually takes up the entire area for the MMU. 22 00:02:44.520 --> 00:02:52.380 William Cheng: Okay, so when you look at the MMU these days, all you see is the translation lookaside buffer, even though in the MMU they still need to implement the address translation logic. 23 00:02:52.800 --> 00:02:58.470 William Cheng: Right. They need to drive the bus, they need to do the comparisons, and they need to do all the checks and stuff like that. 24 00:02:58.710 --> 00:03:02.520 William Cheng: But the translation lookaside buffer is going to take up all the space on the chip. 25 00:03:02.850 --> 00:03:06.450 William Cheng: So therefore it will look like the entire MMU is just a translation lookaside buffer. 26 00:03:06.690 --> 00:03:12.480 William Cheng: Yeah. So this is a hardware cache, which means that whenever you need to perform address translation, where you need a page table entry, 27 00:03:12.660 --> 00:03:21.060 William Cheng: the first thing that you should check is whether that page table entry is cached inside the translation lookaside buffer. Okay, if it is, this is known as a cache hit. 28 00:03:21.450 --> 00:03:25.230 William Cheng: So using our terminology, it's going to be a cache hit. So in this case, 29 00:03:25.770 --> 00:03:32.940 William Cheng: since this one is a translation lookaside buffer, we're going to call it a translation lookaside buffer hit, or TLB hit, if you can find the page table entry inside the translation lookaside buffer. 
30 00:03:33.300 --> 00:03:40.230 William Cheng: Okay, if you can find it, you can use it right away without going to the bus. As I mentioned last time, right, if you don't go to the bus, then 31 00:03:40.500 --> 00:03:50.580 William Cheng: all the operations that are done inside the CPU, and also inside the MMU, are only going to cost you one nanosecond, while every time you need to go across the bus, it costs you 100 nanoseconds. 32 00:03:51.180 --> 00:03:55.740 William Cheng: Okay, so by not going to the bus over here, your performance is going to look like it's 100 times faster. 33 00:03:56.400 --> 00:04:00.720 William Cheng: So that's why this is going to be very, very useful when we perform a lookup for translation. 34 00:04:00.960 --> 00:04:08.640 William Cheng: So suppose we need a page table entry. First we look inside the TLB to see if it's there. If it's there, we'll just use it, and we just saved ourselves 99 nanoseconds. 35 00:04:08.940 --> 00:04:15.630 William Cheng: Okay, if it's not there, well, in that case you have to go across the bus. So when you go across the bus over here, you just start a bus operation again. 36 00:04:15.960 --> 00:04:18.630 William Cheng: You put the physical address of the page table entry onto the bus, 37 00:04:18.840 --> 00:04:29.100 William Cheng: and you're going to read this entry across the bus into the MMU. So when it goes into the MMU, the first thing that you need to do is make a copy and keep that copy inside the translation lookaside buffer. 38 00:04:29.580 --> 00:04:38.280 William Cheng: Okay, so next time, when you need it again, you'll find it quickly. So therefore, we're going to cache it inside the translation lookaside buffer. So in this case, since the first time you look into the translation 39 00:04:38.610 --> 00:04:41.850 William Cheng: lookaside buffer, it's not there. 
This is known as a TLB miss. 40 00:04:42.300 --> 00:04:51.120 William Cheng: Okay, so when you get a TLB miss, what you need to do is go across the bus, read the page table entry, cache it inside the translation lookaside buffer, and then you use it. 41 00:04:51.570 --> 00:04:58.590 William Cheng: Okay, so hopefully for your next memory reference, right. So typically when we run a program, right, we saw our program over here. 42 00:04:58.770 --> 00:05:07.140 William Cheng: Typically we are accessing memory locations one after another. So if everything is within the same four-kilobyte page, well, in that case, you're going to end up getting a lot of TLB hits. 43 00:05:07.380 --> 00:05:12.360 William Cheng: And then once in a while, when you go beyond this page, when you go to the next page, when you go across a page boundary, 44 00:05:12.540 --> 00:05:16.710 William Cheng: the next time maybe you're going to get a TLB miss, and then again it's going to cost you 100 45 00:05:16.920 --> 00:05:28.050 William Cheng: nanoseconds to go across the bus. And once the data is brought in, again, you're going to continue to execute code over here using the same page, so therefore you're going to end up getting translation lookaside buffer hits 46 00:05:28.290 --> 00:05:33.960 William Cheng: for many, many instructions. As we saw before, in chapter three, how big is the typical instruction? 47 00:05:34.530 --> 00:05:45.540 William Cheng: Okay, so a typical instruction is, like, five bytes long, and one page over here is four kilobytes. So in this case you can actually get about 800 hits before you get a miss. 48 00:05:47.250 --> 00:05:49.710 William Cheng: Okay, so, I mean, of course, this only works with, 49 00:05:49.980 --> 00:05:52.380 William Cheng: you know, this works really, really well with the text segment. 
50 00:05:52.560 --> 00:06:00.300 William Cheng: When you're executing code sequentially, you're going through one page of addresses over here. When you first come to the page, you're going to get a translation lookaside buffer miss, 51 00:06:00.450 --> 00:06:06.570 William Cheng: and you're going to go across the bus and spend 100 nanoseconds to copy the data into the translation lookaside buffer. 52 00:06:06.750 --> 00:06:18.330 William Cheng: And then for the next, say, 799 instructions that you're going to execute, you can execute all these instructions sequentially, since on the average every instruction is five bytes. So in that case, you don't need to go across the bus until you go to the next page. 53 00:06:19.710 --> 00:06:29.250 William Cheng: So only one out of 800 times over here are you going to get a miss. So in this case, the miss ratio is going to be very, very small, or the hit ratio for this particular case is going to be really high. 54 00:06:29.430 --> 00:06:37.770 William Cheng: So the hit ratio over here, right, is going to be 799 over 800, right, so this number is going to be greater than 99 point, you know, 55 00:06:39.000 --> 00:06:41.400 William Cheng: well, it's definitely greater than 99%. 56 00:06:42.630 --> 00:06:50.610 William Cheng: Okay. So in this case, on the average, it's going to look like 99% of the time you're going to get a translation lookaside buffer hit, so it's only going to cost you, 57 00:06:50.970 --> 00:07:03.180 William Cheng: you know, 99% of the time it's only going to cost you one nanosecond, right, and then only 1% of the time it's going to cost you 100 nanoseconds. So, on the average, what is the average translation 58 00:07:03.600 --> 00:07:20.730 William Cheng: lookaside buffer access time? 
Right, it's going to be 99% times one nanosecond plus 1% times 100 nanoseconds. That's going to be about 200, or 199, nanoseconds for every 100 translation lookaside buffer accesses, I mean, for every 100 59 00:07:21.360 --> 00:07:29.820 William Cheng: page table entry lookups. Okay. So, therefore, on the average, every time you need to go get a page table entry, it's going to cost you on the average two nanoseconds. 60 00:07:31.980 --> 00:07:37.920 William Cheng: Okay, so this is why now we don't have to worry about, you know, 61 00:07:38.370 --> 00:07:42.060 William Cheng: we don't have to worry that when we try to get the page table entries it's going to cost us, 62 00:07:42.360 --> 00:07:55.290 William Cheng: you know, 600% overhead. With the 600% overhead example that we talked about before, every access over here, on the average, is now only going to cost you two nanoseconds. So if you have to look up six entries in the translation, 63 00:07:56.550 --> 00:08:08.430 William Cheng: if you need to look up six page table entries, since they're all cached in the translation lookaside buffer over here, it's only going to cost you about 12 nanoseconds overall. And it's much cheaper than going over the bus. 64 00:08:09.090 --> 00:08:20.100 William Cheng: OK. So again, if we go back to the original example over here, whenever we need to go to memory it's going to cost you 100 nanoseconds, right. So now if you use the Intel multi-level page table, you're going to end up accessing page table 65 00:08:20.550 --> 00:08:28.500 William Cheng: entries two times, but both times, on the average, it's going to cost you two nanoseconds. Okay. And the last time, when you need to go to memory, well, in this case it will cost you 66 00:08:28.890 --> 00:08:45.390 William Cheng: 100 nanoseconds. So in this case, overall, it's going to cost you on the average 104 nanoseconds instead of 100 nanoseconds. 
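The arithmetic above can be checked with a short Python sketch. The 1 ns, 100 ns, 5-byte, and 4 KB figures come from the lecture; the function names are made up for this example, and, as in the lecture, a miss is charged a flat 100 ns.

```python
PAGE_SIZE = 4096        # bytes per page
AVG_INSTR_BYTES = 5     # typical instruction length used in the lecture
HIT_NS = 1              # cost of a lookup that stays inside the CPU/MMU
BUS_NS = 100            # cost of one trip across the bus


def sequential_hit_ratio(page_size=PAGE_SIZE, instr_bytes=AVG_INSTR_BYTES):
    """One miss on entering a page, then hits for the rest of its instructions."""
    instrs_per_page = page_size // instr_bytes   # about 800
    return (instrs_per_page - 1) / instrs_per_page


def average_lookup_ns(hit_ratio, hit_ns=HIT_NS, miss_ns=BUS_NS):
    """Average cost of fetching one page table entry through the TLB."""
    return hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns


def two_level_access_ns(hit_ratio):
    """Two page table entry lookups (two-level table) plus the real memory access."""
    return 2 * average_lookup_ns(hit_ratio) + BUS_NS


ratio = sequential_hit_ratio()            # a bit above 0.99
per_lookup = average_lookup_ns(0.99)      # roughly 2 ns
with_tlb = two_level_access_ns(0.99)      # roughly 104 ns
without_tlb = 3 * BUS_NS                  # 300 ns when every lookup crosses the bus
```

With a 99% hit ratio the two-level lookup comes out at about 104 ns per memory access, against 300 ns if both page table entries had to be fetched across the bus every time.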
OK, so now, using the translation lookaside buffer, your overall performance only gets degraded by 4% instead of 200%. 67 00:08:46.530 --> 00:08:59.790 William Cheng: Okay. So, therefore, this is a really good way to go. So that's why, even today, when we are using the Intel CPU, the Intel CPU uses the translation lookaside buffer. So this way, even when you use the multi-level page table, 68 00:09:00.210 --> 00:09:02.880 William Cheng: using a multi-level page table is not going to cost you too much. 69 00:09:03.120 --> 00:09:12.870 William Cheng: Okay, because the hit rate is going to be very high. So again, the example that I used is that the hit ratio for the text segment is going to be really high, and for the stack segment it's actually pretty good as well. 70 00:09:13.050 --> 00:09:17.100 William Cheng: Right, but for the data segment and the heap segment, the performance is not going to be as good. 71 00:09:17.400 --> 00:09:27.930 William Cheng: Okay, but most of the time, or a lot of the time, when you run your program you're accessing the stack and you're accessing the text segment. So luckily, those are the places where the translation lookaside buffer is going to perform really well. 72 00:09:28.650 --> 00:09:37.560 William Cheng: Yeah, right. So I just want to point out the difference between translation lookaside buffer misses and page faults. Okay, both of them are kind of like some 73 00:09:38.430 --> 00:09:47.400 William Cheng: sort of a cache miss. But the penalty for a translation lookaside buffer miss is simply one memory access, so it's going to cost you 100 nanoseconds. But what about a page fault? 74 00:09:47.790 --> 00:09:56.130 William Cheng: Okay, suppose you try to perform an address translation and it turns out that the page table entry that you find has V equal to zero. 
That is going to cause a page fault. 75 00:09:56.280 --> 00:10:02.580 William Cheng: When you cause a page fault, you're going to trap into the kernel, and maybe you have to wait for the disk. So in that case, a page fault is very, very expensive. 76 00:10:02.850 --> 00:10:13.710 William Cheng: Okay, a page fault can cost you on the order of 10 milliseconds if you have to go to the disk. Okay. While a translation lookaside buffer miss over here is not a big deal. Okay, it's only going to cost you 100 nanoseconds. 77 00:10:14.940 --> 00:10:17.940 William Cheng: Alright, so again, there's a major difference between these two kinds of faults. 78 00:10:18.330 --> 00:10:26.880 William Cheng: But again, if the translation lookaside buffer performs really poorly, if you have a hit rate of like 10% or something like that, well, in that case, all this overhead is going to cost you a lot. 79 00:10:27.330 --> 00:10:37.770 William Cheng: Okay, so in the end, for the performance of the translation lookaside buffer, even though the miss penalty is very, very small, the overall effect on the processor speed is going to be very, very significant. 80 00:10:38.430 --> 00:10:48.090 William Cheng: Okay, so therefore it's important to use a large translation lookaside buffer, as big as possible. So that's why we end up with the MMU 81 00:10:48.690 --> 00:10:54.270 William Cheng: where the only thing inside the MMU is going to be the translation lookaside buffer, because it's going to take up all the space. 82 00:10:56.130 --> 00:10:57.720 William Cheng: All right, so here's sort of a picture 83 00:10:58.650 --> 00:11:08.280 William Cheng: of the translation lookaside buffer. So inside the CPU. 
Over here we have the inner core of the CPU, and then there's the translation lookaside buffer that will cache page table entries, and here, 84 00:11:08.550 --> 00:11:15.210 William Cheng: on the right side over here, is physical memory. We have all the page tables: one for the blue process, one for the pink process, one for the operating system. 85 00:11:15.600 --> 00:11:24.450 William Cheng: So when we are running the blue process inside the inner core of the CPU, whenever we make reference to page table entries, they will be cached inside the translation lookaside buffer. Okay, so in 86 00:11:24.660 --> 00:11:29.580 William Cheng: this example, this entry is cached right here, this entry is cached right here, and this entry is cached right here. 87 00:11:29.910 --> 00:11:34.440 William Cheng: Okay. So, therefore, when you get a translation lookaside buffer hit over here, well, then, in this case, 88 00:11:34.890 --> 00:11:45.990 William Cheng: your performance is going to be really good. And if it turns out you get a miss, since there's plenty of room inside the translation lookaside buffer, you're going to bring in another page table entry over here and cache it inside the translation lookaside buffer. Yeah. 89 00:11:47.070 --> 00:11:53.940 William Cheng: And when you switch to, you know, so when you switch to the pink process over here, what do you need to do? 90 00:11:54.420 --> 00:11:59.670 William Cheng: Well, when you switch to the pink process over here, in this case, all the data inside the translation lookaside buffer over here 91 00:11:59.970 --> 00:12:08.130 William Cheng: will be incorrect, and then you have to get rid of all that. Okay, even when you're running the blue process, what if you come over here and modify the page table entries over here? 92 00:12:08.520 --> 00:12:12.360 William Cheng: Okay. So in this example, suppose you're running the blue process over here. 
So remember, 93 00:12:12.630 --> 00:12:21.180 William Cheng: when we get a page fault, we're going to go into the kernel, the kernel is going to fix a bunch of stuff, and then the kernel is going to fix the page table. And again, the page table is a kernel data structure. 94 00:12:21.420 --> 00:12:28.740 William Cheng: Okay, so it is possible that your kernel modifies the page table entries over here. What if this page table entry is cached inside the translation lookaside buffer? 95 00:12:29.760 --> 00:12:36.990 William Cheng: Okay, so if you keep the cached entries inside the translation lookaside buffer, next time when you perform address translation, you're going to get the wrong, 96 00:12:37.290 --> 00:12:47.280 William Cheng: you know, so over here, let's say you update the physical page number over here so that it actually points to a different place inside physical memory. So if you still use this cached page 97 00:12:47.670 --> 00:12:50.430 William Cheng: table entry, then in this case you will be accessing the wrong memory. 98 00:12:51.240 --> 00:12:59.790 William Cheng: Okay, so therefore it's very, very important that when you modify one of the page table entries over here, you invalidate the corresponding entry inside the translation lookaside buffer. 99 00:13:00.360 --> 00:13:07.980 William Cheng: Okay, so typically, inside the CPU, there's going to be a specialized machine instruction that allows you to invalidate this particular entry over here. 100 00:13:08.670 --> 00:13:14.640 William Cheng: So the terminology that I'm going to use is that for this page table entry, we're going to flush the corresponding 101 00:13:14.910 --> 00:13:20.880 William Cheng: translation lookaside buffer entry. Or we can say that it invalidates the corresponding translation lookaside buffer entry. 102 00:13:21.360 --> 00:13:28.830 William Cheng: Okay. 
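To make the stale-entry problem concrete, here is a toy Python model. It is an illustrative sketch, not how hardware or any real kernel does it: a dict stands in for the page table in physical memory, a second dict stands in for the translation lookaside buffer, and the name TinyTLB and its methods are hypothetical.

```python
class TinyTLB:
    """Toy model: caches virtual page number -> physical page number."""

    def __init__(self, page_table):
        self.page_table = page_table   # stands in for the page table in physical memory
        self.cache = {}                # stands in for the entries held inside the TLB

    def translate(self, vpn):
        if vpn not in self.cache:                    # TLB miss: go across the bus
            self.cache[vpn] = self.page_table[vpn]   # keep a copy for next time
        return self.cache[vpn]                       # TLB hit path

    def invalidate(self, vpn):
        """Flush the single entry that corresponds to one page table entry."""
        self.cache.pop(vpn, None)


page_table = {7: 100}          # virtual page 7 maps to physical page 100
tlb = TinyTLB(page_table)

first = tlb.translate(7)       # miss, then cached
page_table[7] = 200            # the kernel changes the physical page number
stale = tlb.translate(7)       # still answers from the old cached copy
tlb.invalidate(7)              # flush the corresponding TLB entry
fresh = tlb.translate(7)       # miss again, re-read from the page table
```

Here `stale` is still the old physical page number, which is exactly the wrong-memory access described above, and `fresh` picks up the updated entry once the cached copy has been invalidated.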
So in this case, one of the entries over here will be invalidated, right, so you're actually going to delete it from the translation lookaside buffer. 103 00:13:29.100 --> 00:13:31.890 William Cheng: Okay, so in this case, next time when you perform address translation 104 00:13:32.100 --> 00:13:40.050 William Cheng: and you try to see if you have this cached entry, the translation lookaside buffer is going to say, I don't have this entry. So, therefore, you have to go across the bus, 105 00:13:40.260 --> 00:13:48.660 William Cheng: get this page table entry, and then copy that into the translation lookaside buffer. So now what's inside the translation lookaside buffer is going to be exactly the same as what is in physical memory. 106 00:13:49.920 --> 00:13:58.410 William Cheng: All right. So this flushing, or invalidating, of the corresponding translation lookaside buffer entry is a very, very important operation. So when you are implementing your kernel 3, you've got to watch out. 107 00:13:58.650 --> 00:14:06.780 William Cheng: Because whenever you modify a page table entry over here, since we're using the Intel CPU, we have to invalidate the corresponding translation lookaside buffer 108 00:14:07.320 --> 00:14:12.420 William Cheng: entry over here, using a specialized machine instruction for doing that. 109 00:14:13.020 --> 00:14:22.440 William Cheng: Okay, so, of course, you don't have to write the code in assembly language. You need to find the right function to call, and this way, it will flush or invalidate the translation lookaside buffer entry. Yeah. 110 00:14:24.000 --> 00:14:29.070 William Cheng: So what if you switch to a different process? So the color of the CPU is going to turn pink. 111 00:14:29.310 --> 00:14:37.560 William Cheng: That means that all these entries over here, they're all wrong. Right. 
Because if you start using them for address translation, you're going to end up pointing to physical pages that belong to the blue process. 112 00:14:37.800 --> 00:14:40.920 William Cheng: And then, in that case, you will violate protection 113 00:14:41.370 --> 00:14:49.860 William Cheng: from one process address space to another. So, therefore, what you have to do is invalidate the entire translation lookaside buffer using one machine instruction. 114 00:14:50.490 --> 00:15:03.480 William Cheng: Okay, so for x86 CPUs over here, this can be achieved by setting the CR3 register. Remember, there's a special CPU register: the CR3 register is the one that points to the physical address for the base of the page table. 115 00:15:03.990 --> 00:15:06.300 William Cheng: Okay, so therefore when you need to switch to a different process, 116 00:15:06.480 --> 00:15:16.890 William Cheng: obviously, you have to change the CR3 register to point to a different page table. Well, in that case, automatically, inside the hardware, it will invalidate all the entries inside the translation lookaside buffer. Right. And that's exactly what you want. 117 00:15:17.550 --> 00:15:23.970 William Cheng: Okay, so go inside your weenix kernel. You can actually, again, look for the cr3 string inside 118 00:15:24.750 --> 00:15:33.600 William Cheng: your pristine kernel source code and find out where this happens. Okay, so one of the places where that happens will be when you try to flush the entire translation lookaside buffer. 119 00:15:33.840 --> 00:15:44.790 William Cheng: Okay. Over here we have two different kinds of flushes going on. One is to flush the entire translation lookaside buffer. 
That's the one where you change the CR3 register value, and the other one is when you flush 120 00:15:45.540 --> 00:15:52.170 William Cheng: a single entry inside the translation lookaside buffer. Okay, so when you're doing kernel 3, you've got to decide which way you want to go, right. 121 00:15:54.510 --> 00:16:03.840 William Cheng: Okay, so let's take a look at the implementation of the translation lookaside buffer. So in this case, we're implementing a hardware cache. So for those people who have taken an architecture class, you already know how to do that. 122 00:16:04.080 --> 00:16:11.220 William Cheng: Okay, so this is for the computer science people or other majors. So in order for you to implement that buffer, again, this is just, 123 00:16:12.540 --> 00:16:18.900 William Cheng: basically, it's going to be a simple lookup, you know, sort of a simple lookup data structure. Given, you know, 124 00:16:19.620 --> 00:16:33.030 William Cheng: so, okay, I'm going to sort of implement that using a sort of hash function. So, given a key, you want to be able to locate a page table entry. Okay, so it's kind of like the hash table over here, right. 125 00:16:34.050 --> 00:16:44.790 William Cheng: Except that the hashed page table is implemented in software. This one is actually implemented in hardware, inside the CPU and inside the MMU. 126 00:16:45.390 --> 00:16:53.310 William Cheng: Okay, so what we're going to do is take the virtual page number over here and feed it to a hash function. So what happens is that if you feed it to a hash function, the hash function is going to be too slow. 127 00:16:54.150 --> 00:16:58.320 William Cheng: Okay. A hash function typically is going to take a long time for you to compute a hash value. 
128 00:16:58.470 --> 00:17:07.560 William Cheng: So one of the dumbest ways to implement a hash function is that we're going to take the virtual page number over here, and we're only going to use the least significant bits inside the virtual page number as an array index. 129 00:17:07.740 --> 00:17:13.620 William Cheng: And this way we can actually use that to perform the lookup function. Okay, so this is known as a direct-mapped cache. 130 00:17:14.370 --> 00:17:27.120 William Cheng: And so in this example, we're going to take the 20 bits of the virtual page number over here and chop it into two parts. The first 14 bits over here are going to be the tag value, and the next six bits are going to be used as an array index that will index into 131 00:17:27.780 --> 00:17:33.840 William Cheng: this array over here. OK. So again, this is computer hardware, so we're going to use the hardware terminology. 132 00:17:34.020 --> 00:17:45.810 William Cheng: Each one of these entries over here is known as a cache line. So over here there are 64 cache lines, right, because two to the six is equal to 64. So here is key equal to zero, key equal to one, all the way to 63. 133 00:17:46.620 --> 00:17:55.590 William Cheng: So when you try to perform this lookup function, again, the way we should think about this is that this is a hash function. We're going to use the key, the least significant six bits over here, to give us 134 00:17:56.340 --> 00:18:05.160 William Cheng: one of the entries over here. And then what we need to do is compare the tag stored there against the rest of the bits, the tag, inside the virtual address. If they're equal, then this is the page table entry that we need. 135 00:18:06.000 --> 00:18:12.120 William Cheng: OK. So again, the algorithm over here looks exactly the same as what we used for the hashed page table. 136 00:18:12.450 --> 00:18:22.290 William Cheng: Okay. The only difference over here is that, you know, inside this bucket. 
So again, if you're a software person, you call this a bucket. If you're a hardware person, you call this a cache line. They're exactly the same thing. 137 00:18:22.680 --> 00:18:31.170 William Cheng: So in this particular implementation, if you're using direct mapping, then in this case the length of the collision resolution chain inside this bucket is exactly one. 138 00:18:32.130 --> 00:18:43.740 William Cheng: Okay, so by using direct mapping, that means that the length of the collision resolution chain is exactly one. So, therefore, you don't have to walk down a linked list, 139 00:18:43.950 --> 00:18:45.450 William Cheng: because the length of the list is only one. 140 00:18:46.080 --> 00:18:52.560 William Cheng: Okay. So in this case you take the tag over here and compare it against this one. If they're equal, that means that you get a translation lookaside buffer hit, 141 00:18:52.740 --> 00:18:56.340 William Cheng: and this is the page table entry that you want, so therefore it's going to cost you one nanosecond 142 00:18:56.580 --> 00:19:03.840 William Cheng: and you don't have to go out to the bus. And if it turns out that this tag is not equal to this one, then what do you have to do? Right, you have to go across the bus and spend 100 nanoseconds. Okay. 143 00:19:04.050 --> 00:19:08.340 William Cheng: And then what you're going to do is read in a new page table entry, and you're going to wipe out, so, 144 00:19:08.550 --> 00:19:19.860 William Cheng: let me clean this up over here, right. What you do over here is wipe out the existing page table entry and replace it with the new one. And now you copy the leading 14 bits of the virtual address over here and store that as the tag. 145 00:19:21.180 --> 00:19:27.060 William Cheng: Okay, so from this point on, now if you continue to execute the next machine instruction over here, like the example that we used over here. 
146 00:19:27.240 --> 00:19:35.190 William Cheng: Again, in this case the tag and the key will stay the same. So, therefore, this key will give you this entry over here, you compare the tags, and they will be exactly the same, because you just brought it in. 147 00:19:36.030 --> 00:19:48.780 William Cheng: Okay, so therefore, in this case, the key and the tag over here will give you exactly what you're looking for. So you will continue to use the page table entry over here for the next 799 machine instructions, and this corresponds to the text segment. 148 00:19:50.130 --> 00:19:58.620 William Cheng: Okay, so again, this direct-mapped cache is the simplest one. So again, over here we mentioned the TLB hit and the TLB miss. Yeah. 149 00:19:59.880 --> 00:20:06.090 William Cheng: Another way to do it is that, you know, so in this case, the length of the collision resolution chain is always equal to one. 150 00:20:06.450 --> 00:20:12.960 William Cheng: As it turns out, this kind of translation lookaside buffer doesn't really perform very well. 151 00:20:13.620 --> 00:20:17.340 William Cheng: Actually, the hit ratio typically is not going to be very, very high if your translation lookaside buffer is 152 00:20:17.580 --> 00:20:26.070 William Cheng: like this. So to improve the performance over here, basically we're going to use a longer collision resolution chain, right. So in this case, 153 00:20:26.310 --> 00:20:29.430 William Cheng: this example is known as a two-way set associative cache. 154 00:20:29.700 --> 00:20:40.560 William Cheng: So in this case the cache line over here, again, is going to have a collision resolution chain. The length of the collision resolution chain is always two, and that's why this is called a two-way set associative cache. 
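The direct-mapped lookup described earlier (20-bit virtual page number, 14-bit tag, 6-bit index, 64 cache lines) can be sketched in Python as follows. This is a software model for illustration only: the class name, the hit and miss counters, and the dict standing in for the page table across the bus are all assumptions of the example.

```python
INDEX_BITS = 6
NUM_LINES = 1 << INDEX_BITS            # 2**6 = 64 cache lines


def split_vpn(vpn):
    """Split a 20-bit virtual page number into (14-bit tag, 6-bit index)."""
    return vpn >> INDEX_BITS, vpn & (NUM_LINES - 1)


class DirectMappedTLB:
    def __init__(self, page_table):
        self.page_table = page_table       # stands in for the table in physical memory
        self.lines = [None] * NUM_LINES    # each cache line holds (tag, pte) or None
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn):
        tag, index = split_vpn(vpn)
        line = self.lines[index]
        if line is not None and line[0] == tag:
            self.hits += 1                 # TLB hit: no bus access needed
            return line[1]
        self.misses += 1                   # TLB miss: read the entry across the bus
        pte = self.page_table[vpn]
        self.lines[index] = (tag, pte)     # wipe out the old entry, store the new tag
        return pte


# Virtual pages 0x05 and 0x45 share index 5 but have different tags (0 and 1).
page_table = {0x05: 50, 0x45: 51}
tlb = DirectMappedTLB(page_table)
for vpn in (0x05, 0x05, 0x05, 0x45, 0x05):
    tlb.lookup(vpn)
```

In this run the three back-to-back references to page 0x05 give one miss and two hits, but the reference to 0x45 evicts it, so the final reference to 0x05 misses again. Two pages that land on the same cache line keep kicking each other out, which is one reason a chain of length one performs poorly.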
155 00:20:41.070 --> 00:20:46.500 William Cheng: But there's also, you know, a four-way set-associative cache, where the length of the collision resolution chain is going to be 156 00:20:46.740 --> 00:20:58.170 William Cheng: equal to four. And then there's the eight-way set-associative cache, which is used by the Intel CPU. So in this case, every bucket over here is going to have a collision resolution chain of length eight. 157 00:20:58.620 --> 00:21:05.040 William Cheng: Okay. So in this case, how do you perform the lookup? So again, you take your virtual address, you take the virtual page number and divide it into two parts. 158 00:21:05.250 --> 00:21:16.230 William Cheng: You use the key over here to give you a cache line, and now the cache line is going to have eight entries in it. So what you do is compare the tag against the tags in all these entries over here simultaneously. 159 00:21:17.130 --> 00:21:23.430 William Cheng: Okay, so how can you do this? Well, you do this in hardware. So therefore, you're actually able to compare the tag 160 00:21:23.640 --> 00:21:33.720 William Cheng: against all these eight tags over here simultaneously. Okay, the picture that we show over here is a two-way set-associative cache, so therefore you will compare the tag against two tags over here simultaneously. 161 00:21:34.170 --> 00:21:39.210 William Cheng: If both of them are a miss, well, then you go across the bus and you're going to bring in a new page table entry, 162 00:21:39.810 --> 00:21:49.020 William Cheng: take the page table entry into your translation lookaside buffer. And now you have a problem. Okay. When you bring the new page table entry in over here, you have to decide which one to get rid of. 163 00:21:49.710 --> 00:22:00.690 William Cheng: Right, because you can only store two. So if the two of them don't match, when you read in a new one you have to kick one of them out.
So in order for you to decide which one to kick out, you need to use your cache replacement policy. 164 00:22:01.260 --> 00:22:09.900 William Cheng: This is known as a replacement policy. You pick which one of them over here to get rid of, and now you need to replace it with the new one, and you also need to copy the tag over here to be the tag of 165 00:22:10.050 --> 00:22:16.530 William Cheng: that entry. So next time, hopefully, when you try to perform address translation, you will get a hit for this page table entry. 166 00:22:17.370 --> 00:22:26.940 William Cheng: Okay. So the higher the amount of set associativity, the better the performance. So therefore, Intel uses an eight-way set-associative cache, again, to try to increase the hit 167 00:22:27.270 --> 00:22:32.550 William Cheng: probability for the translation lookaside buffer. So as it turns out, Intel does really, really well 168 00:22:33.720 --> 00:22:35.310 William Cheng: with the hit ratio. Yeah. 169 00:22:36.750 --> 00:22:43.440 William Cheng: Alright, so again, the amount of set associativity is the same thing as the bucket size, if you look at this as a hash table data structure. 170 00:22:43.650 --> 00:22:51.720 William Cheng: Okay. But again, since this is done in hardware, you can compare the tag against all the tags inside the same cache line simultaneously. 171 00:22:52.080 --> 00:23:02.100 William Cheng: So for those of you who have done hardware, you know that building comparators is actually quite expensive. So if you want to compare a lot of stuff simultaneously, it's going to eat up a lot of the real estate inside your, 172 00:23:03.120 --> 00:23:11.430 William Cheng: inside your CPU chip. Okay, so that's why we end up with a translation lookaside buffer that takes up all that space inside the CPU or inside the MMU. Yeah.
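[Editor's sketch] The lookup described above (the key picks a cache line, the tag is compared against every entry in that line, and a replacement policy picks a victim on a miss) can be modeled in a few lines of Python. This is only a toy software model under stated assumptions, not how any real MMU is built: the class name SetAssociativeTLB, the use of the low bits of the virtual page number as the key, and the least-recently-used replacement policy are all assumptions made for illustration, since the lecture does not name a specific policy. The 1 ns hit cost and 100 ns miss cost are the figures quoted in the lecture.

```python
from collections import OrderedDict

HIT_NS = 1      # cost of a TLB hit, as quoted in the lecture
MISS_NS = 100   # cost of going across the bus on a miss, as quoted in the lecture


class SetAssociativeTLB:
    """Toy TLB: num_sets cache lines, each holding up to `ways` tag -> PTE pairs."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per cache line; its ordering doubles as LRU order
        # (LRU is an assumption, the lecture only says "replacement policy").
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def translate(self, vpn, page_table):
        """Return (page table entry, cost in ns) for a virtual page number."""
        key = vpn % self.num_sets     # low bits select the cache line (bucket)
        tag = vpn // self.num_sets    # remaining leading bits are the tag
        line = self.sets[key]
        if tag in line:               # hardware compares all tags in the line at once
            line.move_to_end(tag)     # mark as most recently used
            return line[tag], HIT_NS
        pte = page_table[vpn]         # miss: read the entry from the page table
        if len(line) >= self.ways:    # line is full, so something must be kicked out
            line.popitem(last=False)  # evict the least recently used entry
        line[tag] = pte               # store the new entry together with its tag
        return pte, MISS_NS


# Pages 5 and 69 map to the same cache line when there are 64 lines.
page_table = {5: "frame A", 69: "frame B"}
direct_mapped = SetAssociativeTLB(num_sets=64, ways=1)
two_way = SetAssociativeTLB(num_sets=64, ways=2)
for name, tlb in (("direct-mapped", direct_mapped), ("two-way", two_way)):
    costs = [tlb.translate(vpn, page_table)[1] for vpn in (5, 69, 5, 69)]
    print(name, costs)
```

With ways=1 this behaves like the direct-mapped case, where the two pages keep evicting each other. With ways=2 both entries fit in the same line, so the repeated lookups hit.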
173 00:23:13.080 --> 00:23:24.630 William Cheng: At the other extreme, we can actually have only one cache line. So this is known as a fully associative cache. Okay. So in this case, there's only one cache line, so you don't need to use the key anymore. 174 00:23:24.840 --> 00:23:29.220 William Cheng: So what you do is that you compare the tag against all the tags over here in parallel. 175 00:23:30.180 --> 00:23:40.080 William Cheng: Okay. So in this case, how many entries are there in this cache? Well, maybe there are 64 entries over here. Maybe there are 128, maybe there are 256. So again, what you do is that you take the tag over here and 176 00:23:40.290 --> 00:23:48.660 William Cheng: compare it against all the tags over here in parallel. If all of them are a miss, well, in that case, again, you need to go across the bus to read the page table 177 00:23:48.870 --> 00:23:59.820 William Cheng: entry over here from memory. And now again you have to run your replacement policy to determine which one to get rid of, in order for you to replace it. And then you store the corresponding value and copy the tag value over here into that entry. 178 00:24:00.270 --> 00:24:06.630 William Cheng: Okay, if it turns out that you get a hit, then in that case you don't need to go across the bus, and it's only going to cost you one nanosecond. 179 00:24:06.930 --> 00:24:17.070 William Cheng: But in this case it's going to be really expensive, because in order for you to compare 256 entries over here in parallel, it's going to cost you a lot of hardware. Okay, so there are actually some CPUs, or some MMUs, 180 00:24:17.550 --> 00:24:24.630 William Cheng: out there where what they do is run something called microcode. So again, in this case what it does is sequentially go through this list over here.
181 00:24:25.080 --> 00:24:34.890 William Cheng: So again, on the average, you're going to traverse half the list to find a match. In the worst case, you go over the whole list over here, and it's going to cost you, you know, 232 nanoseconds, and then you're going to go across the bus. 182 00:24:35.370 --> 00:24:48.060 William Cheng: Sorry, 256 nanoseconds over here. And then if it turns out it's a miss, you have to go across the bus and spend another 100 nanoseconds. Okay. So yeah, we're not going to get into too much of the microcode. But if you're a hardware person, you know what I'm talking about. Yeah. 183 00:24:49.140 --> 00:24:59.580 William Cheng: Anyway, so there are three different approaches over here. Typically, in the modern CPU, they use a set-associative cache, and the amount of set associativity is going to be pretty high in order for you to get a 184 00:25:00.120 --> 00:25:04.500 William Cheng: good hit rate inside your translation lookaside buffer. Yeah. 185 00:25:07.020 --> 00:25:15.660 William Cheng: Alright, so now we're going to briefly take a look at a problem with multiple CPUs. So again, every time we talk about multiple CPUs, things get a little more messy. 186 00:25:16.080 --> 00:25:23.940 William Cheng: So now we're going to see what problem multiple CPUs bring for the translation lookaside buffer. Okay. So if you look at this, we have two CPUs over here. 187 00:25:24.330 --> 00:25:35.550 William Cheng: Every CPU has its own inner core of the CPU, and it also has its own MMU, right. So therefore the translation lookaside buffer is inside the CPU. So if you have two CPUs, you're going to have two different translation lookaside buffers. 188 00:25:35.940 --> 00:25:41.790 William Cheng: Okay. So I mentioned before, when you update the page table entries over here inside the kernel data structure over here,
189 00:25:41.970 --> 00:25:51.420 William Cheng: you need to invalidate the translation lookaside buffer. But now, if you modify the page table entries over here, you need to modify all the translation lookaside buffers over here. You have to invalidate all of them 190 00:25:51.690 --> 00:25:54.690 William Cheng: for the corresponding entries over here. Okay. So the question is, how do you do that? 191 00:25:56.160 --> 00:26:02.730 William Cheng: So yeah, the first question that you have to answer is, who is modifying the page table entries over here? 192 00:26:03.210 --> 00:26:06.180 William Cheng: Okay, so we're going to assume that all the CPUs over here are running, 193 00:26:06.870 --> 00:26:14.850 William Cheng: they're running threads from the same address space, from the same process, right. So that's why they end up modifying the same page table. So, you know, 194 00:26:15.150 --> 00:26:24.120 William Cheng: one thread inside one of the CPUs has to be the one modifying the page table entries over here. So let's say it's going to be the one inside CPU one. 195 00:26:24.510 --> 00:26:30.360 William Cheng: Okay, so if the thread is inside CPU one over here, it will be the kernel that's modifying the page table entries over here. 196 00:26:30.600 --> 00:26:41.220 William Cheng: So in this case you are executing code in your own CPU. You can invalidate the entries over here inside your own translation lookaside buffer. So that's easy, right? You execute a machine instruction to do that. 197 00:26:41.430 --> 00:26:44.580 William Cheng: How do you invalidate the page table entries over here in the second CPU? 198 00:26:45.360 --> 00:26:55.230 William Cheng: Okay, why do you have to do that? Right, because the second CPU over here shares your address space. So if you modify the page table entry over here, well, then in that case, for the second CPU.
The address translation will be incorrect. 199 00:26:56.190 --> 00:27:06.270 William Cheng: Okay, so therefore, you are sitting in the first CPU. How do you invalidate the translation lookaside buffer inside the second CPU? So remember that if you're running on one CPU, there's no way you can tell the second CPU, 200 00:27:06.570 --> 00:27:10.110 William Cheng: there's no way you can tell the second CPU what to do. 201 00:27:11.370 --> 00:27:15.570 William Cheng: Okay, because they're all separate hardware over here, you cannot tell the second CPU what to do. 202 00:27:16.590 --> 00:27:25.650 William Cheng: So therefore, what you should do over here is run this algorithm, known as a distributed algorithm, the translation lookaside buffer shootdown algorithm. 203 00:27:26.160 --> 00:27:39.870 William Cheng: Okay, so this is known as the TLB shootdown algorithm. So what we need to do is ask the second CPU nicely and say, hey, could you invalidate the page table entry over here for us? Okay. But again, there's no way for you to talk to another CPU. So what do you have to do? 204 00:27:40.920 --> 00:27:47.850 William Cheng: Alright, so the solution over here is that you need to run a distributed algorithm. It is a distributed algorithm because it really involves multiple CPUs over here. 205 00:27:48.030 --> 00:27:54.690 William Cheng: Okay, so they need to run the same distributed algorithm, and they need to work with each other in order for you to invalidate the page table entries over here. 206 00:27:55.140 --> 00:28:04.770 William Cheng: Okay, so, since you're in CPU number one, how do you affect the second CPU? There's no way for you to tell the second CPU what to do. But there's one thing that you can do: you can interrupt the second CPU.
207 00:28:05.790 --> 00:28:21.630 William Cheng: Okay, we saw before that for Intel, right, there's IPL number 31, which is IPL high. IPL number 30 is losing power, right. IPL number 29 is called the inter-processor interrupt, and that's the interrupt we're going to use. 208 00:28:22.080 --> 00:28:30.270 William Cheng: Okay, so in this case, CPU one over here is going to modify the page table entry over here, but you are not allowed to modify it yet, because if you modify it right now, well, then, in this case, 209 00:28:30.510 --> 00:28:34.230 William Cheng: CPU two is going to perform the incorrect address translation. 210 00:28:34.710 --> 00:28:42.750 William Cheng: Okay, so therefore you are not allowed to modify the page table entry over here yet. You need to run the distributed algorithm so you know exactly when to modify the page table entry. 211 00:28:43.290 --> 00:28:51.840 William Cheng: Okay, so here we need to get the timing right. So the basic idea over here is, how do we make sure that, you know, 212 00:28:52.860 --> 00:29:01.950 William Cheng: so what we need to do is that before we modify the page table entries over here, we want to make sure that CPU number two over here is not using the address space. 213 00:29:02.310 --> 00:29:05.790 William Cheng: So therefore, it's not doing address translation. So how can we do that? 214 00:29:06.600 --> 00:29:11.250 William Cheng: Okay, so one thing that you can do is that if CPU number two goes into the interrupt context, 215 00:29:11.580 --> 00:29:22.380 William Cheng: once it goes into the interrupt context, it's no longer inside the thread context. If it's not inside the thread context, it will not be using the page table anymore. It will not be using the address space. 216 00:29:23.670 --> 00:29:30.570 William Cheng: Okay, so what you have to do is that you have to interrupt the second CPU.
Wait for it to go into the interrupt context, and then you start running your distributed protocol. 217 00:29:31.050 --> 00:29:37.350 William Cheng: Okay, so let's take a look at the distributed protocol over here, right. So in this case, this one over here is known as the shooter. 218 00:29:38.670 --> 00:29:49.200 William Cheng: So this will be the shooter, and all the other CPUs over here, so we can actually have multiple CPUs, they are all running different threads of the same process. So in this case, you have to shoot all of them down. 219 00:29:50.280 --> 00:30:03.210 William Cheng: Okay, so one of them is the shooter over here, and the other ones are all called shootees. They're the ones being shot down. Okay, so we're going to see the code that's running inside the shooter. We're also going to see the code that's running inside all the other shootees. 220 00:30:04.710 --> 00:30:08.700 William Cheng: The first code over here is going to be the shooter code, right. So this is the shooter code over here. 221 00:30:09.120 --> 00:30:19.320 William Cheng: So we're going to call the shooter over here CPU j, okay. Again, this one is j over here. All the other CPUs are going to be i1, i2, i3. So they're all called i. 222 00:30:19.950 --> 00:30:31.500 William Cheng: So the shooter over here is equal to j. For all processors that are sharing the same address space with j, what we're going to do is use interrupt number 30, that's called the inter-processor interrupt. 223 00:30:32.190 --> 00:30:39.180 William Cheng: So in this case, we interrupt i over here by executing this function. So in this case, again, the hardware over here will interrupt all the other CPUs. 224 00:30:39.660 --> 00:30:46.110 William Cheng: So all the other CPUs will need to go into the interrupt context. Right.
So then we need to wait, because all the other CPUs might have interrupts disabled. 225 00:30:46.920 --> 00:30:54.120 William Cheng: Okay, or maybe they are out there serving a higher-level interrupt. I mean, some CPUs have 128 interrupt levels over here. 226 00:30:54.330 --> 00:31:02.010 William Cheng: So again, we need to make sure that the other CPUs have gone into the interrupt context, so therefore we have to keep waiting until they get into the interrupt context. 227 00:31:02.250 --> 00:31:08.130 William Cheng: Okay. So therefore, the first thing that we do is interrupt all the other CPUs that are running threads of the same process. 228 00:31:08.370 --> 00:31:14.340 William Cheng: And then, for all the processors that are sharing the same address space, we need to wait for them to go into the interrupt service routine. 229 00:31:14.760 --> 00:31:20.610 William Cheng: So the way we do this is that we look at a global variable. So there's an array with an entry for every one of these CPUs over here. 230 00:31:21.150 --> 00:31:27.720 William Cheng: The index over here is going to be the CPU index. If noted[i] is equal to zero, it means that CPU i hasn't gone into the interrupt service routine. 231 00:31:28.170 --> 00:31:37.710 William Cheng: So in this case, what we're going to do is spin the CPU over here and keep executing this instruction until it gets into the interrupt context. 232 00:31:38.250 --> 00:31:42.540 William Cheng: Okay, so hopefully all the other CPUs will go into the interrupt context pretty quickly, 233 00:31:43.410 --> 00:31:47.040 William Cheng: you know, so in this case we don't have to wait too long. So once you 234 00:31:47.460 --> 00:31:51.780 William Cheng: finish waiting for one of the CPUs to go into the interrupt context, you wait for the next one, next one, next one.
235 00:31:51.930 --> 00:31:59.220 William Cheng: So eventually, when you're done with all these instructions over here, you know that all the other CPUs that are sharing the same address space have all gone into the interrupt, 236 00:31:59.730 --> 00:32:10.620 William Cheng: they have all gone into the interrupt context, and now it becomes safe for you to modify the page table entries, right, because you know that none of the other CPUs are actually using, 237 00:32:11.550 --> 00:32:17.190 William Cheng: using the page table right now. So, therefore, this is the time when we modify the page table. 238 00:32:17.790 --> 00:32:22.470 William Cheng: With the page table, we can modify as many entries as we want. Modify one of them, two of them, you know, 239 00:32:22.710 --> 00:32:30.240 William Cheng: 100 of them, whatever you need. And when we're done over here, what we can do is update or flush the translation lookaside buffer for our own CPU. 240 00:32:30.510 --> 00:32:34.590 William Cheng: It depends. If you only modify one page table entry, in that case you would just invalidate it. 241 00:32:34.980 --> 00:32:46.020 William Cheng: If you modify 100 of them, well, maybe sometimes it's easier for you to flush the entire translation lookaside buffer. So again, your kernel can decide whether you want to invalidate only one entry or invalidate the entire translation lookaside buffer. 242 00:32:47.070 --> 00:32:56.340 William Cheng: Okay. So when you're finished doing this, what you need to do is tell all the other CPUs, now I'm done, you can actually go invalidate your translation lookaside buffer, 243 00:32:56.640 --> 00:32:58.950 William Cheng: and then you can go back to whatever you were doing before. 244 00:32:59.220 --> 00:33:08.010 William Cheng: Okay, so in this case we set a global variable over here, done[me]. And again, me over here is equal to j.
So that will be the shooter index. So over here it says I'm done, and I'm setting it to one. 245 00:33:08.520 --> 00:33:12.570 William Cheng: Okay, so this way all the other CPUs can go back into the thread context. 246 00:33:13.530 --> 00:33:16.230 William Cheng: Alright, so next is going to be the shootee code, right. So what's going to be the shootee code? 247 00:33:16.620 --> 00:33:25.830 William Cheng: The shootee code over here is going to be an interrupt service routine. So when they get interrupted, right, so interrupts are enabled and there's no higher-level interrupt that's blocking it, so therefore 248 00:33:26.130 --> 00:33:35.100 William Cheng: they're servicing the inter-processor interrupt. So this is going to be the code for the inter-processor interrupt. So what do they do? They will know who the shooter is. So in this case it's shooter j over here. 249 00:33:35.670 --> 00:33:38.340 William Cheng: So what it will do is set noted[i] equal to one, which says that 250 00:33:38.610 --> 00:33:46.470 William Cheng: now I'm inside the interrupt service routine, so now you can go wait for somebody else. Okay. So again, this one is set to one right here, and this one is used by the shooter right here. 251 00:33:46.890 --> 00:33:56.220 William Cheng: Okay. And then what it will do is wait for the shooter to be done. So it will execute this code: while done[j], j is going to be the shooter, is equal to zero over here. So 252 00:33:56.700 --> 00:34:02.850 William Cheng: before the shooter gets right here, the other CPUs over here will go into a busy wait and keep waiting for the shooter to be done. 253 00:34:03.150 --> 00:34:11.310 William Cheng: Okay. So in this case, eventually, when the shooter is done, it will get out of this while loop, and then what it will do is flush the entire translation lookaside buffer. 254 00:34:11.820 --> 00:34:18.570 William Cheng: Okay, why doesn't.
Why can't it just invalidate one of the translation lookaside buffer entries? Right, because it doesn't know what CPU one just did. 255 00:34:19.620 --> 00:34:26.340 William Cheng: Okay, CPU one over here maybe modified one page table entry. If it only modified one, it doesn't really tell CPU two which one it modified. 256 00:34:26.520 --> 00:34:35.910 William Cheng: What if it modified 100 of them? So CPU two has no idea how many page table entries have been modified. So in this case, the only safe thing to do is to flush the entire translation lookaside buffer. 257 00:34:36.480 --> 00:34:44.910 William Cheng: Okay, so when CPU two goes back into the thread context, it will start getting a few translation lookaside buffer misses, and eventually it's going to start getting hits. 258 00:34:46.200 --> 00:34:52.920 William Cheng: Okay, so this is one of the reasons why, when you have four CPUs, you can never run four times faster. 259 00:34:53.070 --> 00:34:59.430 William Cheng: Because there's going to be time when you're sitting in these interrupt service routines, and in this case, when you interrupt the other CPUs, 260 00:35:00.000 --> 00:35:04.560 William Cheng: they're going to flush the entire translation lookaside buffer. So one of the CPUs is going to go pretty fast, 261 00:35:04.680 --> 00:35:12.540 William Cheng: while all the other CPUs, because they're experiencing translation lookaside buffer misses, will be running slow for a while, and then eventually they're going to run four times as fast again. 262 00:35:13.860 --> 00:35:16.560 William Cheng: Okay, so when you have four CPUs, you should never expect 263 00:35:16.770 --> 00:35:27.150 William Cheng: the speedup to be exactly four times. It's usually going to be a little less than that. Okay.
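[Editor's sketch] The shooter and shootee steps described above can be simulated with ordinary threads. This is a rough model under stated assumptions, not kernel code: each CPU is a Python thread, the inter-processor interrupt is stood in for by a threading.Event, and each TLB is a plain dictionary. The array names noted and done follow the names used in the lecture; everything else (NUM_CPUS, SHOOTER, ipi, tlbs, run_shootdown, the page numbers) is made up for illustration.

```python
import threading
import time

NUM_CPUS = 4
SHOOTER = 0                                        # CPU j in the lecture

page_table = {7: 100}                              # shared: virtual page -> frame
tlbs = [dict(page_table) for _ in range(NUM_CPUS)] # one private TLB per CPU
noted = [0] * NUM_CPUS                             # noted[i] = 1: CPU i is in its handler
done = [0] * NUM_CPUS                              # done[j] = 1: shooter j has finished
# Stand-in for the inter-processor interrupt line of each CPU (an assumption).
ipi = [threading.Event() for _ in range(NUM_CPUS)]


def shooter(j, vpn, new_frame):
    others = [i for i in range(NUM_CPUS) if i != j]
    for i in others:
        ipi[i].set()                               # interrupt every other CPU
    for i in others:
        while noted[i] == 0:                       # spin until CPU i enters its handler
            time.sleep(0)
    page_table[vpn] = new_frame                    # now safe: nobody else is translating
    tlbs[j].pop(vpn, None)                         # invalidate only our own stale entry
    done[j] = 1                                    # let the shootees continue


def shootee(i, j):
    ipi[i].wait()                                  # "receive" the interrupt from CPU j
    noted[i] = 1                                   # tell the shooter we are in the handler
    while done[j] == 0:                            # busy-wait until the shooter is done
        time.sleep(0)
    tlbs[i].clear()                                # flush the whole TLB: we do not know
                                                   # which entries the shooter changed


def run_shootdown():
    threads = [threading.Thread(target=shootee, args=(i, SHOOTER))
               for i in range(NUM_CPUS) if i != SHOOTER]
    threads.append(threading.Thread(target=shooter, args=(SHOOTER, 7, 200)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run_shootdown()
print(page_table, tlbs)
```

After the run, the page table holds the new mapping and no TLB still caches the old one. The shootee TLBs are completely empty, which is the source of the temporary slowdown mentioned above: those CPUs take misses until their TLBs warm up again.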
So one of the reasons is that you have to run the translation lookaside buffer shootdown algorithm. Okay. 264 00:35:28.650 --> 00:35:36.750 William Cheng: All right, so the next slide over here talks about what's called a caching hierarchy, or the storage hierarchy. 265 00:35:37.140 --> 00:35:44.790 William Cheng: One of the views is that, you know, so we've talked about the translation lookaside buffer. We also talked about some caches over here. Those of you who know hardware, 266 00:35:45.300 --> 00:35:53.910 William Cheng: at least you know that there is also a data cache, there's an instruction cache, there's this kind of cache, that kind of cache. Some people actually think of the disk as a cache. 267 00:35:54.660 --> 00:35:59.160 William Cheng: Now, why would you want to think of the disk as a cache? Right, because you can think about something like this: 268 00:35:59.280 --> 00:36:08.100 William Cheng: your data is actually sitting in the cloud somewhere. Okay, so what you do is that when you need something, you bring the data from the cloud and put it on your hard drive, as a cache for your, 269 00:36:08.580 --> 00:36:11.610 William Cheng: for your cloud. And then once you 270 00:36:12.330 --> 00:36:19.950 William Cheng: put things on your hard drive, in order for you to access your hard drive, you're going to bring data from the hard drive into memory. So therefore, the memory is actually a cache for the hard drive. 271 00:36:20.160 --> 00:36:28.350 William Cheng: And now, with memory over here, some of the entries over here are used as a page table, so they are cached inside the translation lookaside buffer. Some of them are actually 272 00:36:28.860 --> 00:36:33.930 William Cheng: cached inside the on-chip CPU cache.
And there are also off-chip caches, all kinds of caches over here. 273 00:36:34.440 --> 00:36:42.810 William Cheng: Okay. So therefore, there is a caching hierarchy. Near the top over here is close to the CPU. Near the bottom over here is far away from the CPU. So typically, as you 274 00:36:43.050 --> 00:36:54.870 William Cheng: get closer to the CPU, the cache gets faster and faster and the capacity of the cache gets smaller and smaller. As you go away from the CPU, the cache gets bigger and bigger, and also the speed of the cache gets slower and slower. 275 00:36:55.230 --> 00:37:02.280 William Cheng: Okay, so here are the typical numbers over here. So again, these are the numbers from, you know, 2012. So today, the numbers may, 276 00:37:03.090 --> 00:37:07.620 William Cheng: well, things can go faster, but the relative numbers are still very, very similar. 277 00:37:08.400 --> 00:37:12.600 William Cheng: Okay. So please take these numbers with a grain of salt, but know that the relative numbers are still pretty, 278 00:37:13.200 --> 00:37:19.470 William Cheng: pretty much the same. So, therefore, in a way, this table is still valid. Alright, so let's take a look at the top over here. 279 00:37:19.800 --> 00:37:28.140 William Cheng: As I mentioned over here, inside the CPU, we're going to assume that every instruction that you execute is going to cost one nanosecond. So accessing the translation lookaside buffer is going to be pretty fast, 280 00:37:28.380 --> 00:37:36.390 William Cheng: because the performance is crucial. So therefore, the hardware people can pour a lot of resources over here to make sure that it runs really fast. Okay. So in this case, 281 00:37:36.630 --> 00:37:40.530 William Cheng: the access time is going to be the same as executing an instruction. It's going to be on the order of one nanosecond. 282 00:37:40.770 --> 00:37:46.260 William Cheng: And the size over here is pretty small, on the order of 64 kilobytes.
OK. So again, this is inside the CPU. 283 00:37:46.500 --> 00:37:55.140 William Cheng: Then the next one over here, I don't know if you've noticed that when you buy a CPU from Amazon or from Fry's, they will tell you the amount of L1 cache you have and the amount of L2 284 00:37:55.530 --> 00:38:02.580 William Cheng: cache that you have. So as it turns out, the L1 cache and the L2 cache are typically inside the CPU. So in this case, we're skipping the L1 cache over here. 285 00:38:03.000 --> 00:38:08.280 William Cheng: For the L2 cache over here, it's inside the CPU, so therefore the access time is still going to be pretty fast. 286 00:38:08.700 --> 00:38:17.130 William Cheng: But it's much bigger, so it's going to be a little slower. The access time is on the order of four nanoseconds, and the size over here is going to be 256 kilobytes. 287 00:38:17.790 --> 00:38:26.550 William Cheng: Okay. So in that case, again, when you try to access data in memory, you try to look it up inside the L2 cache. If it's in the L2 cache, again, you don't have to go to main memory. Right. 288 00:38:27.150 --> 00:38:37.950 William Cheng: So these things are inside the CPU, and then there's something called an L3 cache. The L3 cache is in the more sophisticated systems where you have multiple CPUs. So let's say we have multiple CPUs, 289 00:38:39.360 --> 00:38:55.170 William Cheng: CPUs over here. Each one of them will have an L3 cache that's outside of the CPU chip. So over here, these are the L3 caches. They are outside of the CPU, and they sit on the same side of the bus as the CPU. Go across the bus over here, that's your, 290 00:38:56.310 --> 00:39:01.350 William Cheng: that's your RAM. Again, the RAM is the one that's, you know, four gigabytes, eight gigabytes, 16 gigabytes over here. That's your RAM. 291 00:39:01.560 --> 00:39:07.560 William Cheng: And then, you know, on the board.
Yeah, on the motherboard, you'll have the CPU chip, and you're also going to have these L3 caches. 292 00:39:07.770 --> 00:39:18.690 William Cheng: Okay. The L3 caches are built out of memory chips that are much faster than the RAM. So over here, for the L3 cache, the access time is 10 times faster than the memory over here. 293 00:39:19.170 --> 00:39:28.290 William Cheng: So in this case it's going to be 10 nanoseconds, and the capacity is going to be much smaller, on the order of two megabytes. Okay, so each one of them over here is going to be two megabytes. So again, 294 00:39:28.650 --> 00:39:33.210 William Cheng: we don't really talk about this. If you're taking a computer hardware architecture class, they will sort of talk about, you know, 295 00:39:33.660 --> 00:39:42.450 William Cheng: they'll talk about this. And again, there are some hardware issues that need to be addressed when you have multiple L3 caches. So yeah, I'm going to skip all that. 296 00:39:43.230 --> 00:39:48.930 William Cheng: The next level in the cache hierarchy is going to be your random access memory, or your RAM, or the physical memory. 297 00:39:49.410 --> 00:39:57.210 William Cheng: So it's on the order of 10 gigabytes, right, because typically you have, you know, four gigabytes, 16 gigabytes, eight gigabytes. So on the average, 10 gigabytes. 298 00:39:57.420 --> 00:40:04.620 William Cheng: The access time over here is going to be 100 times slower than the translation lookaside buffer, on the order of 100 nanoseconds. 299 00:40:05.190 --> 00:40:13.590 William Cheng: So this is outside of the CPU. So again, in this case, the RAM is special. We don't call it a device. 300 00:40:14.100 --> 00:40:19.050 William Cheng: It's sort of a special piece of hardware that's hanging off the bus, and the access is actually pretty quick. 301 00:40:19.440 --> 00:40:27.330 William Cheng: Okay.
The rest of them over here are all devices, so they all go through a device driver. So the next device over here, 302 00:40:27.570 --> 00:40:36.390 William Cheng: you know, in the storage hierarchy, is going to be your solid-state drive, right. So this is going to be a USB stick or your solid-state drive. 303 00:40:36.630 --> 00:40:45.630 William Cheng: So in this case, your access time is going to be 1000 times slower than the RAM. Why is it 1000 times slower? Because you have to go through a device driver. 304 00:40:46.350 --> 00:40:51.330 William Cheng: Okay, so the RAM, you can access it directly, right, by starting a bus cycle to access it. But in order 305 00:40:51.600 --> 00:40:58.740 William Cheng: to use the SSD, you have to go through a device driver. So that's going to be 1000 times slower, so on the order of 100 microseconds. 306 00:40:59.190 --> 00:41:09.240 William Cheng: So in this case, the size is going to be bigger. The size is going to be 10 times bigger than the RAM, on the order of 100 gigabytes. Again, today you're going to have more storage over here for your SSD. 307 00:41:09.900 --> 00:41:13.110 William Cheng: The next device over here is called remote RAM. So what is remote RAM? 308 00:41:13.590 --> 00:41:22.920 William Cheng: Okay, you have a machine right next to you. Maybe the machine right next to you is not doing anything. So the machine right next to you can actually allow you to borrow RAM from the other machine. 309 00:41:23.340 --> 00:41:33.930 William Cheng: Okay, so in order for you to get to the remote RAM, you have to go through the network device over here. So again, you have to go through a device driver. So in this case, the access time over here is going to be on the order of 100 310 00:41:34.470 --> 00:41:36.960 William Cheng: microseconds. So it's the same speed as the SSD.
311 00:41:37.740 --> 00:41:43.740 William Cheng: But typically your network devices are actually really fast, but going through the device driver, they're going to be really slow. 312 00:41:43.920 --> 00:41:50.940 William Cheng: Okay, so therefore in the end you're going to end up with the same performance. And the other machine, you know, I don't know how much memory it's going to loan to you. 313 00:41:51.390 --> 00:41:57.270 William Cheng: But you can actually use all the machines on your local area network. If they're all willing to help you out, they can actually give some of their memory to you. 314 00:41:57.360 --> 00:42:08.310 William Cheng: So in the end, you know, you're actually going to end up with on the order of 100 gigabytes of memory. So, yeah, they sort of need to be on the same local area network. If you don't know what that is, it's okay. Right. If you've taken a networking class, you know what that is. 315 00:42:08.760 --> 00:42:17.910 William Cheng: So say they're on the same network. So in this case, they will loan their memory to you. Okay. They need to be cooperating with you in order for you to be able to do that. But 316 00:42:18.360 --> 00:42:26.640 William Cheng: in most cases, when you go to the computer room and you want to use the RAM on another machine, the machine says, are you kidding me? I'm also doing something important. I'm not going to loan you any of my memory. 317 00:42:27.150 --> 00:42:30.240 William Cheng: Okay. So again, this only works if you have a special setup. 318 00:42:30.750 --> 00:42:38.760 William Cheng: The next device over here is going to be the disk, and we saw it before, right, your hard drive. We talked about how slow it is. The access time on average is going to be 1 to 10 milliseconds.
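The levels described up to this point can be tabulated to make the ratios easier to check. This is only a rough sketch using the order-of-magnitude figures quoted in the lecture. The TLB and L3 times are derived from the stated ratios to RAM rather than quoted directly, the disk entry uses the lower bound of the quoted range, and the names `HIERARCHY` and `slowdown` are made up for illustration.

```python
# Order-of-magnitude access times, in nanoseconds, as described in the lecture.
# Assumption: the TLB and L3 values are derived from the stated ratios
# ("RAM is 100 times slower than the TLB", "L3 is 10 times faster than RAM").
NS = 1
US = 1_000 * NS
MS = 1_000 * US

HIERARCHY = [
    ("TLB", 1 * NS),
    ("L3 cache", 10 * NS),
    ("RAM", 100 * NS),
    ("SSD", 100 * US),         # 1,000 times slower than RAM (device driver)
    ("remote RAM", 100 * US),  # same order as the SSD (network device driver)
    ("disk", 1 * MS),          # lecture quotes 1 to 10 ms; lower bound used
]


def slowdown(slower: str, faster: str) -> float:
    """Return how many times slower one level is than another."""
    times = dict(HIERARCHY)
    return times[slower] / times[faster]


if __name__ == "__main__":
    for name, t in HIERARCHY:
        print(f"{name:>10}: {t:>12,} ns  ({slowdown(name, 'RAM'):g}x RAM)")
```

Each step away from the CPU costs one or more orders of magnitude in access time, which is the point of treating the whole thing as a caching hierarchy.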
319 00:42:39.180 --> 00:42:50.220 William Cheng: So in this case, the storage capacity is going to be a little bigger, right; it's going to get to one terabyte over here, and today you can actually buy, you know, an eight-terabyte disk or something like that. So again, today the capacity is usually bigger. 320 00:42:51.090 --> 00:42:58.680 William Cheng: Yeah, so again, everything over here is going to be a device. In this case, it's a mechanical device; that's why the access is even slower. You can also go to a remote disk. 321 00:42:59.040 --> 00:43:04.440 William Cheng: Okay, because there are other machines on your local area network that allow you to actually have storage out there. We also, 322 00:43:04.890 --> 00:43:06.180 William Cheng: I guess today, there are also, 323 00:43:06.510 --> 00:43:18.180 William Cheng: you know, network attached devices, right? You can actually go to Amazon and buy a network attached disk that you can attach to your network. So in that case, that will be a remote disk. The access time over here is going to be even slower, on the order of 100 milliseconds. 324 00:43:18.510 --> 00:43:21.150 William Cheng: But in this case, you can actually have a lot of extra storage, you know, 325 00:43:22.110 --> 00:43:31.140 William Cheng: inside your network attached storage. What about after that? Right, after that over here, it could be, you know, inside the cloud. You have to go through your network device driver and connect to the Internet. 326 00:43:31.620 --> 00:43:41.670 William Cheng: The access time depends on how far away it is, and the capacity over here, I mean, you know, could be infinity, right? So again, it's always finite, but it's going to get very, very big. 327 00:43:42.570 --> 00:43:51.510 William Cheng: Okay, so one view over here is that, you know, your data is actually stored in the cloud. And then what we need to do is this.
When we need to access it, we need to bring it closer, to 328 00:43:52.260 --> 00:43:54.720 William Cheng: get it as close to the CPU as 329 00:43:55.440 --> 00:44:02.940 William Cheng: possible. And as you get closer to the CPU, you know, it gets faster and faster, and the storage capacity over here for the cache is going to get smaller and smaller. 330 00:44:03.270 --> 00:44:16.560 William Cheng: Okay, so you can sort of think about this entire hierarchy as a caching hierarchy or as your storage hierarchy. So later, when we talk about a cache hierarchy or storage hierarchy, again, remember this particular slide. Yeah. 331 00:44:18.690 --> 00:44:29.160 William Cheng: All right, so the next thing we're going to briefly talk about is the 64-bit issues over here. So what about when you have a 64-bit CPU? In this case, your virtual address is going to be 64 bits long. 332 00:44:29.970 --> 00:44:35.280 William Cheng: You know, so yeah, so if you have a multi-level, you know, multi-level page table, well, you know, 333 00:44:35.880 --> 00:44:40.620 William Cheng: again, instead of chopping it into three parts, we can chop it into as many parts as we want. Okay. 334 00:44:41.040 --> 00:44:50.370 William Cheng: So as it turns out, it's kind of funny. One of the competitors of Intel is AMD, right? So the AMD format is actually the most popular format. This is known as the x86-64 335 00:44:50.640 --> 00:45:05.520 William Cheng: you know, format. So when you download the Ubuntu 16.04 ISO file, remember, there are two different files you can download. One is called i386, right, that's the one for the 32-bit format. And the other one is called x86-64; that's for the 64-bit format. 336 00:45:06.720 --> 00:45:16.080 William Cheng: Okay, so in that case you're using this particular format over here. So again, with a 64-bit address space, 2 to the 64 is astronomical.
A very large number. Okay, you will never need an 337 00:45:16.500 --> 00:45:25.410 William Cheng: address space that big. So what they do is that AMD just said the first 16 bits over here are going to be unused. So in this case, the address is actually 48 bits long. 338 00:45:26.040 --> 00:45:32.670 William Cheng: Okay, so we're going to take the 48 bits, and we're going to chop them into five parts over here, and then again, it's the same idea as the multi-level page table. 339 00:45:32.820 --> 00:45:37.110 William Cheng: The first part over here is for the first-level page table; the second part over here, the second-level page table. 340 00:45:37.320 --> 00:45:43.230 William Cheng: I mean, the terminology is going to get a little weird over here. The first one over here is called the page map table. 341 00:45:43.410 --> 00:45:52.590 William Cheng: And the second one is known as the page directory pointer table, and for the third one over here, again, we're going to go back to the 32-bit terminology: this one will be known as the page directory table, and the next one over here, 342 00:45:52.950 --> 00:45:56.310 William Cheng: the page table. And then the last one over here will be the actual physical page. 343 00:45:56.940 --> 00:46:04.470 William Cheng: So again, the idea here is exactly the same. When you perform address translation, you're going to use the first, you know, so many bits over here as an array index. 344 00:46:04.710 --> 00:46:10.620 William Cheng: That will give you, again, a page map table entry that looks just like the page table entries over here. 345 00:46:10.770 --> 00:46:18.570 William Cheng: You check the validity, check the access bits, and then it will give you a physical page number; that will give you the base address for the next-level page table, and then you perform address 346 00:46:18.960 --> 00:46:21.390 William Cheng: translation exactly the same way as before.
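The five-part split of the 48-bit address can be sketched in a few lines. The lecture does not give the field widths; the 9 + 9 + 9 + 9 + 12 layout below is the standard x86-64 layout with 4 KB pages and is an addition here. The function name and the dictionary keys are made up for illustration, and the upper 16 bits are simply masked off to match the description of them as unused.

```python
# Assumption: standard x86-64 layout with 4 KB pages, i.e. four 9-bit table
# indices followed by a 12-bit page offset (9 + 9 + 9 + 9 + 12 = 48 bits).
OFFSET_BITS = 12  # 4 KB page
INDEX_BITS = 9    # 512 entries per table

# Table names follow the lecture's terminology, top level first.
LEVELS = ("page_map", "page_dir_pointer", "page_directory", "page_table")


def split_virtual_address(va: int) -> dict:
    """Chop a virtual address into four table indices and a page offset."""
    va &= (1 << 48) - 1  # the upper 16 bits are treated as unused
    parts = {"offset": va & ((1 << OFFSET_BITS) - 1)}
    shift = OFFSET_BITS
    # Walk from the lowest-level table index up to the top-level index.
    for name in reversed(LEVELS):
        parts[name] = (va >> shift) & ((1 << INDEX_BITS) - 1)
        shift += INDEX_BITS
    return parts


if __name__ == "__main__":
    va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
    print(split_virtual_address(va))
```

Each index selects one entry in its table, and that entry supplies the physical page number of the next table, which is the same walk as in the 32-bit case with more levels.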
347 00:46:22.080 --> 00:46:31.590 William Cheng: Okay, so you can see that if you don't have a good-performing translation lookaside buffer, the overhead over here is going to be really scary. It's going to be one, two, three, four over here; it's going to be 348 00:46:31.890 --> 00:46:38.190 William Cheng: 400% overhead. So you're going to run five times slower. But hopefully, if you have a good-performing translation lookaside buffer, 349 00:46:38.310 --> 00:46:50.820 William Cheng: every one of these page table entries, they're all sitting inside the translation lookaside buffer. So therefore it's only going to cost you four nanoseconds over here for you to perform address translation, and eventually you're going to go across the bus and spend 100 nanoseconds. 350 00:46:52.050 --> 00:46:57.840 William Cheng: Okay. So again, this is the importance of, you know, the translation lookaside buffer, especially if you have a 64-bit CPU. 351 00:46:58.710 --> 00:47:05.370 William Cheng: Now, AMD actually has two formats, and the second one over here only tries to do four-level address translation over here. 352 00:47:05.520 --> 00:47:14.010 William Cheng: They combine the last parts over here into a giant page. So this page is a giant page. So in this case, you can actually choose to have a page size of four kilobytes 353 00:47:14.220 --> 00:47:25.830 William Cheng: or two megabytes; you can have a two-megabyte page over here. Okay. So in this case, you know, when you have a two-megabyte page, every time you need to go to the disk, you need to copy two megabytes, so again, on average it can take you a little 354 00:47:26.850 --> 00:47:30.060 William Cheng: longer, but in this case the address translation over here will be a little faster. 355 00:47:30.870 --> 00:47:42.270 William Cheng: Okay.
But again, if you're using a good translation lookaside buffer, the advantage that you gain from, you know, from the address translation is not going to be very much. So again, you know, people can decide which way they want to go. Yeah. 356 00:47:43.410 --> 00:47:47.100 William Cheng: All right, what about Intel? So this one is in the x over here. 357 00:47:47.760 --> 00:47:54.750 William Cheng: You know, Intel's architecture is called IA-64, so I think IA stands for, well, the Itanium architecture or the Intel architecture. 358 00:47:54.960 --> 00:48:06.570 William Cheng: Itanium is the 64-bit architecture. I think around 2012, the last company that used this architecture was HP, and HP said, forget it, we're not going to do this, we're going to go with the AMD format. 359 00:48:06.870 --> 00:48:08.340 William Cheng: So therefore, nobody uses it anymore. 360 00:48:08.700 --> 00:48:17.070 William Cheng: So we're not going to talk about it. But the basic idea over here is that if you look at the structure, it looks almost like the linear page table over here, divided into two or three parts. 361 00:48:17.310 --> 00:48:24.840 William Cheng: The first part over here is a small number of bits, and that will give you the space; it'll give you the space register. So in this case over here, 362 00:48:25.020 --> 00:48:32.820 William Cheng: they will use, you know, three or four bits over here. So there are 16 different spaces, instead of having only four spaces. Okay. 363 00:48:33.240 --> 00:48:44.490 William Cheng: But again, since, you know, it didn't really work out very well for Digital Equipment Corporation, eventually Intel sort of suffered the same fate, because the performance of IA-64 is really not very good. So everybody went with the AMD format. 364 00:48:45.270 --> 00:48:53.400 William Cheng: So therefore, we're going to skip all that also. Yeah.
And for the linear-based page table, again, you don't have to worry about the address translation, because nobody does that anymore. Okay. 365 00:48:55.140 --> 00:48:58.320 William Cheng: So the last part over here in chapter seven, the overview of the hardware, 366 00:48:58.530 --> 00:49:07.500 William Cheng: talks about virtualization, but we haven't really talked about what a virtual machine is. Yeah. So right now what we're going to do is, I'm going to skip everything over here. And now we're going to go into the second part of chapter seven. 367 00:49:07.710 --> 00:49:15.780 William Cheng: So we're done with the hardware for now. Okay. And when we finish virtual machines, we're going to come back to talk about, you know, what the virtualization, 368 00:49:16.560 --> 00:49:24.360 William Cheng: you know, what the virtualization issues are over here for virtual memory, you know, for virtual machines. 369 00:49:24.750 --> 00:49:31.140 William Cheng: Okay, I'm using the word virtual too many times. Again, it's going to go crazy. So yeah, we're going to come back to talk about this. And now we're going to go into 370 00:49:31.410 --> 00:49:43.980 William Cheng: the second part of chapter seven and look at the operating system support to use all this hardware. All right. This is actually a good time to break. So in part three, we're going to talk about the second part of chapter seven.