WEBVTT

1
00:00:01.350 --> 00:00:09.420
William Cheng: This is the second part of lecture 14. I went back and changed one of the slides over here for the hashed page table.

2
00:00:09.990 --> 00:00:17.400
William Cheng: For those people who are not familiar with a hash table, this is just a reminder of the data structure called the collision resolution chain.

3
00:00:17.850 --> 00:00:25.500
William Cheng: The collision resolution chain is a list of key-value pairs for keys that hash to the same bucket. Right. So in this example,

4
00:00:25.980 --> 00:00:29.100
William Cheng: You know, for this bucket, here is the collision resolution chain.

5
00:00:29.370 --> 00:00:35.970
William Cheng: The key over here is the tag, because the virtual page number is the one that you feed to the hash function. So, therefore, that's the key.

6
00:00:36.180 --> 00:00:39.390
William Cheng: The value is the stuff that you're looking for. So that would be the page table entry.

7
00:00:39.600 --> 00:00:46.770
William Cheng: And then we build a linked list, right? So the link over here is going to be the next pointer that links everything together. That's how we end up with this particular data structure. Yeah.
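
NOTE
Editor's sketch of the hashed page table described above. It is only an illustration under stated assumptions: the class name HashedPageTable, the bucket count of 8, and the modulo hash are hypothetical, and a Python list stands in for the linked list with next pointers.
```python
# Sketch only: HashedPageTable, num_buckets and the modulo hash are hypothetical.
class HashedPageTable:
    def __init__(self, num_buckets=8):
        # Each bucket holds a collision resolution chain of (tag, pte) pairs.
        self.buckets = [[] for _ in range(num_buckets)]
    def _bucket(self, vpn):
        # Stand-in hash function: virtual page number modulo the bucket count.
        return self.buckets[vpn % len(self.buckets)]
    def insert(self, vpn, pte):
        chain = self._bucket(vpn)
        for i, (tag, _) in enumerate(chain):
            if tag == vpn:
                chain[i] = (vpn, pte)  # replace an existing mapping
                return
        chain.append((vpn, pte))
    def lookup(self, vpn):
        # Walk the chain, comparing each stored tag against the key.
        for tag, pte in self._bucket(vpn):
            if tag == vpn:
                return pte
        return None  # no mapping found in this bucket
table = HashedPageTable()
table.insert(0x12345, "pte-A")
table.insert(0x1234D, "pte-B")  # hashes to the same bucket as 0x12345
```
Both keys land in bucket 5, so the second lookup has to walk past the first pair in the chain before its tag matches.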

8
00:00:47.130 --> 00:00:53.310
William Cheng: Alright. So again, if you're not familiar with a hash table, you should look at a data structures textbook and see how they work.

9
00:00:54.930 --> 00:01:03.570
William Cheng: All right, so we finished solving the space problem. We reduced the initial four megabytes of page table

10
00:01:03.990 --> 00:01:09.300
William Cheng: Down to very, very little memory. But in this case, because of the space-time tradeoff,

11
00:01:09.960 --> 00:01:18.630
William Cheng: The performance is really terrible. So now, as promised, we're going to talk about the solution in hardware that tries to solve all the performance problems with the

12
00:01:19.440 --> 00:01:25.050
William Cheng: Translation in one shot. Okay. The solution has a fancy name: it is known as the translation lookaside buffer.

13
00:01:25.590 --> 00:01:34.530
William Cheng: But all it is, so translation lookaside buffer, the abbreviation is TLB. So the TLB is simply a hardware cache.

14
00:01:34.980 --> 00:01:42.030
William Cheng: But in this case, it's a specialized hardware cache, because all it caches are page table entries.

15
00:01:42.930 --> 00:01:50.610
William Cheng: Okay, so, yeah, it's a specialized cache. It's not a data cache. It's not an instruction cache. I guess those people doing hardware, they sort of know a bunch of different caches.

16
00:01:50.910 --> 00:02:02.100
William Cheng: This one, the only thing it caches is page table entries. Right. So again, if you think about the page table: here's the CPU, here's your bus, right? The page table is sitting in physical memory across the bus.

17
00:02:02.610 --> 00:02:06.810
William Cheng: So what you would do is that whenever you need to perform address translation, you need a page table entry.

18
00:02:07.110 --> 00:02:17.040
William Cheng: Okay, the way we talked about it before is that when you need a page table entry, you need to go across the bus to read a page table entry from the page table, and then you

19
00:02:17.520 --> 00:02:25.740
William Cheng: Read that into your MMU, and then your MMU checks the validity bit and checks the protection bits, and then if everything checks out, it will use the physical page number.

20
00:02:26.100 --> 00:02:31.950
William Cheng: Right. So now we're going to cache the page table entry inside the translation lookaside buffer inside the MMU.

21
00:02:32.340 --> 00:02:43.950
William Cheng: Okay, so over here inside the CPU, there's an inner core of the CPU and there is a translation lookaside buffer. The translation lookaside buffer is so big that it usually takes up the entire area for the MMU.

22
00:02:44.520 --> 00:02:52.380
William Cheng: Okay, so when you look at the MMU these days, all you see is a translation lookaside buffer, even though you know that in the MMU, they still need to perform the address translation logic.

23
00:02:52.800 --> 00:02:58.470
William Cheng: Right. They need to drive the bus. They need to, you know, do the comparisons and do all the checks and stuff like that.

24
00:02:58.710 --> 00:03:02.520
William Cheng: But the translation lookaside buffer is going to take up all the space on the chip.

25
00:03:02.850 --> 00:03:06.450
William Cheng: So therefore it will look like the entire MMU is just a translation lookaside buffer.

26
00:03:06.690 --> 00:03:12.480
William Cheng: Right. So this is a hardware cache, which means that whenever you need to perform address translation, when you need a page table entry,

27
00:03:12.660 --> 00:03:21.060
William Cheng: The first thing that you should check is whether that page table entry is cached inside the translation lookaside buffer. Okay, if it is, this is known as a cache hit.

28
00:03:21.450 --> 00:03:25.230
William Cheng: So using our terminology, it's going to be a cache hit. So in this case,

29
00:03:25.770 --> 00:03:32.940
William Cheng: This one is a translation lookaside buffer, so we're going to call it a translation lookaside buffer hit if you can find the page table entry inside the translation lookaside buffer.

30
00:03:33.300 --> 00:03:40.230
William Cheng: Okay, if you can find it, you can use it right away without going to the bus. As I mentioned last time, right, if you don't go to the bus, then

31
00:03:40.500 --> 00:03:50.580
William Cheng: You know, all the operations that are done inside the CPU and also inside the MMU are only going to cost you one nanosecond, while every time you need to go across the bus, it's going to cost you 100 nanoseconds.

32
00:03:51.180 --> 00:03:55.740
William Cheng: Okay, so by not going to the bus over here, your performance is going to look like it's 100 times faster.

33
00:03:56.400 --> 00:04:00.720
William Cheng: So that's why this is going to be very, very useful when we perform a lookup for address translation.

34
00:04:00.960 --> 00:04:08.640
William Cheng: So for a page table entry, we first look inside the TLB to see if it is there. If it is there, we'll just use it, and we just saved ourselves 99 nanoseconds.

35
00:04:08.940 --> 00:04:15.630
William Cheng: Okay, if it's not there, well, in that case you have to go across the bus. So when you go across the bus over here, you just start a bus operation over here again.

36
00:04:15.960 --> 00:04:18.630
William Cheng: You know, put the physical address of the page table entry on the bus.

37
00:04:18.840 --> 00:04:29.100
William Cheng: And you're going to read this entry across the bus into the MMU. So when it goes into the MMU, the first thing that you need to do is make a copy and keep a copy inside the translation lookaside buffer.

38
00:04:29.580 --> 00:04:38.280
William Cheng: Okay, so next time when you need it again, you'll find it quickly. So therefore, we're going to cache that inside the translation lookaside buffer. So in this case, you know, since the first time when you look into the translation

39
00:04:38.610 --> 00:04:41.850
William Cheng: lookaside buffer it's not there, this is known as a TLB miss.

40
00:04:42.300 --> 00:04:51.120
William Cheng: Okay, so when you get a TLB miss, what you need to do is go across the bus, read the page table entry, and then you cache it inside the translation lookaside buffer, and then you use it.
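
NOTE
Editor's sketch of the hit and miss paths just described, assuming the lecture's costs of 1 ns for work inside the CPU/MMU and 100 ns for a bus access. The function name translate and the use of Python dicts for the TLB and the page table are hypothetical stand-ins, not how the hardware is built.
```python
TLB_NS, BUS_NS = 1, 100  # costs quoted in the lecture
def translate(vpn, tlb, page_table):
    """Return (pte, cost_ns); tlb is a dict standing in for the hardware cache."""
    if vpn in tlb:
        return tlb[vpn], TLB_NS      # TLB hit: no bus access needed
    pte = page_table[vpn]            # TLB miss: read the entry across the bus
    tlb[vpn] = pte                   # keep a copy for next time
    return pte, BUS_NS
page_table = {7: "frame 42"}
tlb = {}
first = translate(7, tlb, page_table)    # miss, then the entry is cached
second = translate(7, tlb, page_table)   # hit on the cached copy
```
The first call pays the bus cost and fills the TLB; the second call for the same page finds the cached copy.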

41
00:04:51.570 --> 00:04:58.590
William Cheng: Okay, so hopefully your next memory reference will be a hit. Right. So typically when you run your program, right, we saw our program over here.

42
00:04:58.770 --> 00:05:07.140
William Cheng: Typically we are accessing memory locations one after another. So if everything is within the same four kilobytes, well, in that case, you're going to end up getting a lot of TLB hits.

43
00:05:07.380 --> 00:05:12.360
William Cheng: And then once in a while, when you go beyond this page, when you go to the next page, when you go across a page boundary,

44
00:05:12.540 --> 00:05:16.710
William Cheng: The next time maybe you're going to get a TLB miss, and then again it's going to cost you 100

45
00:05:16.920 --> 00:05:28.050
William Cheng: Nanoseconds to go across the bus. And once the data is brought in, again, you're going to continue to execute code over here using the same page, so therefore you're going to end up getting translation lookaside buffer hits

46
00:05:28.290 --> 00:05:33.960
William Cheng: For many, many instructions. We saw before, in chapter three, how big is the typical instruction?

47
00:05:34.530 --> 00:05:45.540
William Cheng: Okay, so a typical instruction is like, you know, five bytes long, and one page over here is four kilobytes. So in this case you can actually get about 800 hits before you get a miss.

48
00:05:47.250 --> 00:05:49.710
William Cheng: Okay, so, I mean, of course, this only works with,

49
00:05:49.980 --> 00:05:52.380
William Cheng: You know, this works really, really well with the text segment.

50
00:05:52.560 --> 00:06:00.300
William Cheng: When you're executing code sequentially, you're going through one page of addresses over here. When you come to the first page, you're going to get a translation lookaside buffer miss,

51
00:06:00.450 --> 00:06:06.570
William Cheng: And you're going to go across the bus and spend 100 nanoseconds to copy the data into the translation lookaside buffer.

52
00:06:06.750 --> 00:06:18.330
William Cheng: And then for the next, say, 799 instructions, you can execute all these instructions sequentially. On the average, every instruction is five bytes. So in that case, you don't need to go across the bus until you go to the next page.

53
00:06:19.710 --> 00:06:29.250
William Cheng: So only one out of 800 times over here are you going to get a miss. So in this case, the miss ratio is going to be very, very small, or the hit ratio for this particular case is going to be really high.

54
00:06:29.430 --> 00:06:37.770
William Cheng: So the hit ratio over here, right, is going to be 799 over 800, right? So this number is going to be greater than 99 point, you know,

55
00:06:39.000 --> 00:06:41.400
William Cheng: Whatever that is. Definitely greater than 99%.
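
NOTE
Editor's arithmetic check of the hit ratio estimate, assuming (as the lecture does) 5-byte instructions, 4 KB pages, strictly sequential execution, and one page table entry lookup per instruction fetch. The variable names are hypothetical.
```python
PAGE_BYTES = 4096
INSTR_BYTES = 5                                   # typical instruction size from the lecture
instrs_per_page = PAGE_BYTES // INSTR_BYTES       # 819, which the lecture rounds to 800
hit_ratio_lecture = 799 / 800                     # one miss followed by 799 hits
hit_ratio_exact = (instrs_per_page - 1) / instrs_per_page
```
Either way the hit ratio for sequential code comes out a little above 99.8%, consistent with the lecture's "definitely greater than 99%".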

56
00:06:42.630 --> 00:06:50.610
William Cheng: Okay. So in this case, on the average, it's going to look like, you know, 99% of the time you're going to get a translation lookaside buffer hit. It's only going to cost you,

57
00:06:50.970 --> 00:07:03.180
William Cheng: You know, 99% of the time it's only going to cost you one nanosecond, right? And then only 1% of the time is it going to cost you 100 nanoseconds. So, on the average, what is the average, you know, translation

58
00:07:03.600 --> 00:07:20.730
William Cheng: Lookaside buffer access time? Right, it's going to be 99% times one nanosecond plus 1% times 100 nanoseconds. It's going to be about 200, or 199, nanoseconds for every 100 translation lookaside buffer, say, for every 100

59
00:07:21.360 --> 00:07:29.820
William Cheng: page table entry lookups. Okay. So, therefore, on the average, every time you need to get a page table entry, it's going to cost you on the average two nanoseconds.
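
NOTE
Editor's sketch of the averaging step above, assuming the lecture's 1 ns hit cost and 100 ns miss cost. The function name average_lookup_ns is hypothetical; the 10% case is included only for contrast with a poorly performing TLB.
```python
def average_lookup_ns(hit_ratio, hit_ns=1.0, miss_ns=100.0):
    # Weighted average of the hit cost and the miss cost.
    return hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns
avg_good = average_lookup_ns(0.99)   # about 1.99 ns, rounded to 2 ns in the lecture
avg_poor = average_lookup_ns(0.10)   # about 90.1 ns when only 10% of lookups hit
```
With a 99% hit ratio the average is 0.99 x 1 + 0.01 x 100 = 1.99 ns, which is where the two-nanosecond figure comes from.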

60
00:07:31.980 --> 00:07:37.920
William Cheng: Okay, so this is why, you know, now we don't have to worry about, you know,

61
00:07:38.370 --> 00:07:42.060
William Cheng: We don't have to worry that when we try to get the page table entry it's going to cost us,

62
00:07:42.360 --> 00:07:55.290
William Cheng: You know, 600% overhead. Now, with the 600% overhead example that we talked about before, every access over here on the average is only going to cost you two nanoseconds. So if you have to look up six entries,

63
00:07:56.550 --> 00:08:08.430
William Cheng: If you need to look up six page table entries, well, since they're all cached in the translation lookaside buffer over here, it's only going to cost you, you know, 12 nanoseconds overall. And it's much cheaper than going across the bus.

64
00:08:09.090 --> 00:08:20.100
William Cheng: OK. So again, if we go back to the original example over here, whenever we need to go to memory it's going to cost you 100 nanoseconds, right? So now if you use the Intel multi-level page table, you're going to end up accessing page table

65
00:08:20.550 --> 00:08:28.500
William Cheng: Entries two times, but both times, on the average, it's going to cost you two nanoseconds. Okay. And the last time you need to go to memory, well, in this case it will cost you,

66
00:08:28.890 --> 00:08:45.390
William Cheng: You know, 100 nanoseconds. So in this case, overall, on the average it's going to cost you 104 nanoseconds instead of 100 nanoseconds. OK, so now using the translation lookaside buffer, your overall performance only gets degraded by 4% instead of 200%.
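
NOTE
Editor's worked version of the 104 ns figure, assuming a two-level page table (two page table entry lookups per translation), the 2 ns average lookup cost derived earlier, and 100 ns per memory access. The constant names are hypothetical.
```python
MEM_NS = 100         # one access across the bus
PTE_LOOKUP_NS = 2    # average page table entry lookup cost with the TLB
LEVELS = 2           # two page table entry lookups per translation
with_tlb = LEVELS * PTE_LOOKUP_NS + MEM_NS                 # 104 ns
without_tlb = LEVELS * MEM_NS + MEM_NS                     # 300 ns
overhead_with_tlb = (with_tlb - MEM_NS) / MEM_NS           # 0.04, i.e. 4%
overhead_without_tlb = (without_tlb - MEM_NS) / MEM_NS     # 2.0, i.e. 200%
```
Without the TLB, each of the two page table lookups is itself a 100 ns memory access, which is where the 200% overhead comes from.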

67
00:08:46.530 --> 00:08:59.790
William Cheng: Okay. So, therefore, this is a really good way to go. So that's why, you know, even today, when we are using the Intel CPU, the Intel CPU uses the translation lookaside buffer. So this way, even when you use a multi-level, you know, page table,

68
00:09:00.210 --> 00:09:02.880
William Cheng: Using a multi-level page table is not going to cost you too much.

69
00:09:03.120 --> 00:09:12.870
William Cheng: Okay, because the hit rate is going to be very high. So again, in the example that I used, the hit ratio for the text segment is going to be really high. For the stack segment, well, the hit ratio is actually pretty good.

70
00:09:13.050 --> 00:09:17.100
William Cheng: Right. But for the data segment and the heap segment, you know, the performance is going to be not as good.

71
00:09:17.400 --> 00:09:27.930
William Cheng: Okay, but most of the time, you know, or a lot of the time, when you run your program you're accessing the stack and you're accessing the text segment. So luckily, you know, those are the places where the translation lookaside buffer is going to perform really well.

72
00:09:28.650 --> 00:09:37.560
William Cheng: Yeah, right. So I just want to point out the difference between translation lookaside buffer misses and page faults. Okay, both of them are kind of like some

73
00:09:38.430 --> 00:09:47.400
William Cheng: Sort of a cache miss, but the penalty for a translation lookaside buffer miss is simply one memory access. It's going to cost you 100 nanoseconds. But what about a page fault?

74
00:09:47.790 --> 00:09:56.130
William Cheng: Okay, when you try to perform address translation, and if it turns out the page table entry that you find has V equal to zero, well, it's going to cause a page fault.

75
00:09:56.280 --> 00:10:02.580
William Cheng: When you cause a page fault, you're going to trap into the kernel. Maybe you have to wait for the disk. So in that case, a page fault is very, very expensive.

76
00:10:02.850 --> 00:10:13.710
William Cheng: Okay, a page fault can cost you on the order of 10 milliseconds if you have to go to the disk. Okay. Well, a translation lookaside buffer miss over here is not a big deal. Okay, it's only going to cost you 100 nanoseconds.

77
00:10:14.940 --> 00:10:17.940
William Cheng: Alright, so again, there's a major difference between these two kinds of faults.

78
00:10:18.330 --> 00:10:26.880
William Cheng: But again, if the translation lookaside buffer performs really poorly, if you have a hit rate of like 10% or something like that, well, in that case, you know, all this overhead is going to cost you a lot.

79
00:10:27.330 --> 00:10:37.770
William Cheng: Okay, so in the end, for the performance of the translation lookaside buffer, even though the miss penalty is very, very small, you know, the overall effect on the processor speed is going to be very, very significant.

80
00:10:38.430 --> 00:10:48.090
William Cheng: Okay, so therefore it's important to, you know, use a large translation lookaside buffer, as big as possible. So that's why we're going to end up with the MMU

81
00:10:48.690 --> 00:10:54.270
William Cheng: Where the only thing you see in the MMU is going to be the translation lookaside buffer, because it's going to take up all the space.

82
00:10:56.130 --> 00:10:57.720
William Cheng: All right, so here's sort of a picture

83
00:10:58.650 --> 00:11:08.280
William Cheng: Of how to use the translation lookaside buffer. So inside the CPU over here, we have the inner core of the CPU and then we have the translation lookaside buffer that will cache page table entries. And here,

84
00:11:08.550 --> 00:11:15.210
William Cheng: You know, on the right side over here, inside physical memory, we have all the page tables: one for the blue process, one for the pink process, one for the operating system.

85
00:11:15.600 --> 00:11:24.450
William Cheng: So when we are running the blue process inside the inner core of the CPU, whenever we make reference to page table entries, they will be cached inside the translation lookaside buffer. Okay, so in this

86
00:11:24.660 --> 00:11:29.580
William Cheng: Example, this entry is cached right here, this entry is cached right here, and this entry is cached right here.

87
00:11:29.910 --> 00:11:34.440
William Cheng: Okay. So, therefore, you know, when you get a translation lookaside buffer hit over here, well, then, in this case,

88
00:11:34.890 --> 00:11:45.990
William Cheng: You know, your performance is going to be really good. And if it turns out you get a miss, and there's plenty of room inside the translation lookaside buffer, you're going to bring in another, you know, page table entry over here and cache it inside the translation lookaside buffer.

89
00:11:47.070 --> 00:11:53.940
William Cheng: And when you switch to, you know, so when you switch to the pink process over here, what do you need to do?

90
00:11:54.420 --> 00:11:59.670
William Cheng: Well, when you switch to the pink process over here, in this case, all the data inside the translation lookaside buffer over here

91
00:11:59.970 --> 00:12:08.130
William Cheng: Will be incorrect, and then you have to get rid of all that. Okay, even, you know, when you're running the blue process, what if you come over here and modify the page table entries over here?

92
00:12:08.520 --> 00:12:12.360
William Cheng: Okay. So in this example, if you're modifying the page table entries over here. So remember,

93
00:12:12.630 --> 00:12:21.180
William Cheng: When we get a page fault, we go into the kernel. The kernel is going to fix a bunch of stuff, and then the kernel is going to fix the page table. And again, the page table is a kernel data structure.

94
00:12:21.420 --> 00:12:28.740
William Cheng: Okay, so it is possible that your kernel will modify the page table entries over here. What if this page table entry is cached inside the translation lookaside buffer?

95
00:12:29.760 --> 00:12:36.990
William Cheng: Okay, so if you keep the cached entry inside the translation lookaside buffer, next time when you perform address translation, you're going to get the wrong,

96
00:12:37.290 --> 00:12:47.280
William Cheng: You know, so over here, let's say you update the physical page number over here, so that it actually points to a different place inside physical memory. So if you still use this cached

97
00:12:47.670 --> 00:12:50.430
William Cheng: Page table entry, then in this case you will be accessing the wrong memory.

98
00:12:51.240 --> 00:12:59.790
William Cheng: Okay, so therefore it's very, very important that when you try to modify one of the page table entries over here, you need to invalidate the corresponding entry inside the translation lookaside buffer.

99
00:13:00.360 --> 00:13:07.980
William Cheng: Okay, so typically, inside the CPU, there's going to be a specialized machine instruction that allows you to invalidate this particular entry over here.

100
00:13:08.670 --> 00:13:14.640
William Cheng: So the terminology over here that we're going to use is that for this page table entry, we're going to flush the corresponding

101
00:13:14.910 --> 00:13:20.880
William Cheng: Translation lookaside buffer entry. Or we can call it invalidating the corresponding translation lookaside buffer entry.

102
00:13:21.360 --> 00:13:28.830
William Cheng: Okay. So in this case, one of the entries over here will be invalid, right? So it's as if we actually delete it from the translation lookaside buffer.

103
00:13:29.100 --> 00:13:31.890
William Cheng: Okay, so in this case, next time when you perform address translation,

104
00:13:32.100 --> 00:13:40.050
William Cheng: And you try to see if you have this cached entry, the translation lookaside buffer is going to say, I don't have this entry. So, therefore, you have to go across the bus,

105
00:13:40.260 --> 00:13:48.660
William Cheng: Get this page table entry, and then copy it into the translation lookaside buffer. So now what's inside the translation lookaside buffer is going to be exactly the same as what is in physical memory.

106
00:13:49.920 --> 00:13:58.410
William Cheng: All right. So this flushing, you know, or invalidating, of the corresponding translation lookaside buffer entry is a very, very important operation. So when you are implementing your kernel 3, you've got to watch out.

107
00:13:58.650 --> 00:14:06.780
William Cheng: Because whenever you try to modify a page table entry over here, since we're using the Intel CPU, we have to invalidate the corresponding translation lookaside buffer,

108
00:14:07.320 --> 00:14:12.420
William Cheng: You know, the translation lookaside buffer entry over here, using a specialized machine instruction for doing that.

109
00:14:13.020 --> 00:14:22.440
William Cheng: Okay, so, of course, you know, in weenix, you don't have to write the assembly language. You need to find the right function to call, and this way, it will flush or invalidate the translation lookaside buffer entry. Yeah.

110
00:14:24.000 --> 00:14:29.070
William Cheng: So what if you switch to a different process? So the color of the CPU is going to turn pink.

111
00:14:29.310 --> 00:14:37.560
William Cheng: That means that all these entries over here, they're all wrong. Right. Because if you start using them for address translation, you're going to end up pointing to physical pages that belong to the blue process.

112
00:14:37.800 --> 00:14:40.920
William Cheng: And then, in that case, you will violate protection,

113
00:14:41.370 --> 00:14:49.860
William Cheng: You know, from one process's address space to another. So, therefore, what you have to do is invalidate the entire translation lookaside buffer using one machine instruction.

114
00:14:50.490 --> 00:15:03.480
William Cheng: Okay, so for x86 CPUs over here, this can be achieved by setting the CR3 register. Remember, there's a special CPU register: the CR3 register is the one that points to the physical address of the base of the page table.

115
00:15:03.990 --> 00:15:06.300
William Cheng: Okay, so therefore when you need to switch to a different process,

116
00:15:06.480 --> 00:15:16.890
William Cheng: Obviously, you have to change the CR3 register to point to a different page table. Well, in that case, automatically, inside the hardware, it will invalidate all the entries in the translation lookaside buffer. And that's exactly what you want.
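
NOTE
Editor's toy model of the two kinds of flush described above: invalidating one entry after the kernel edits a page table entry, and flushing everything when CR3 is reloaded on a process switch. ToyTLB and ToyCPU are hypothetical names; this is a simulation, not kernel code, and it ignores hardware details such as global pages. On x86 the single-entry instruction is invlpg.
```python
class ToyTLB:
    def __init__(self):
        self.entries = {}            # vpn mapped to the cached page table entry
    def invalidate(self, vpn):
        # Flush a single entry, as after the kernel edits one page table entry.
        self.entries.pop(vpn, None)
    def flush_all(self):
        # Flush everything, as when the page table base changes.
        self.entries.clear()
class ToyCPU:
    def __init__(self):
        self.cr3 = None
        self.tlb = ToyTLB()
    def set_cr3(self, page_table_base):
        # Models the behavior described in the lecture: loading CR3 flushes the TLB.
        self.cr3 = page_table_base
        self.tlb.flush_all()
cpu = ToyCPU()
cpu.set_cr3("blue page table")
cpu.tlb.entries.update({1: "blue PTE 1", 2: "blue PTE 2"})
cpu.tlb.invalidate(1)                # kernel changed the entry for page 1
after_single = dict(cpu.tlb.entries)
cpu.set_cr3("pink page table")       # context switch to the pink process
after_switch = dict(cpu.tlb.entries)
```
After the single invalidation only the entry for page 2 remains; after the switch to the pink process the TLB is empty, so no blue translation can be reused.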

117
00:15:17.550 --> 00:15:23.970
William Cheng: Okay, so go inside your weenix kernel. You can actually, again, look for the cr3, you know, string inside your

118
00:15:24.750 --> 00:15:33.600
William Cheng: Pristine kernel source code and find out where this happens. Okay, so one of the places where that happens will be when you try to flush the entire translation lookaside buffer.

119
00:15:33.840 --> 00:15:44.790
William Cheng: Okay. Over here we have two different kinds of flushes. One is to flush the entire translation lookaside buffer, where you change the CR3 register value, and the other one is when you flush

120
00:15:45.540 --> 00:15:52.170
William Cheng: A single entry inside the translation lookaside buffer. Okay, so when you're doing kernel 3, you've got to decide which way you want to go, right?

121
00:15:54.510 --> 00:16:03.840
William Cheng: Okay, so let's take a look at the implementation of the translation lookaside buffer. So in this case, we're implementing a hardware cache. So for those people who have taken an architecture class, you already know how to do that.

122
00:16:04.080 --> 00:16:11.220
William Cheng: Okay, so for the computer science people or other majors, in order for you to implement the translation lookaside buffer, again, this is just,

123
00:16:12.540 --> 00:16:18.900
William Cheng: Basically, it's going to be a simple lookup, you know, sort of a simple lookup data structure. Given, you know,

124
00:16:19.620 --> 00:16:33.030
William Cheng: So, okay, we're going to sort of implement that using a sort of hash function. So, given a key, you want to be able to locate a page table entry. Okay, so it's kind of like the hash table over here, right?

125
00:16:34.050 --> 00:16:44.790
William Cheng: Except that the hashed page table is implemented in software. This one is actually implemented in hardware, inside the CPU and inside the MMU.

126
00:16:45.390 --> 00:16:53.310
William Cheng: Okay, so what we're going to do is take the virtual page number over here and feed it to a hash function. So what happens is that if you feed it to a hash function, the hash function is going to be too slow.

127
00:16:54.150 --> 00:16:58.320
William Cheng: Okay. A hash function typically is going to take quite some time for you to compute the hash value.

128
00:16:58.470 --> 00:17:07.560
William Cheng: So one of the dumbest ways to implement a hash function is that we're going to take the virtual page number over here, and we're only going to use the least significant bits inside the virtual page number as an array index.

129
00:17:07.740 --> 00:17:13.620
William Cheng: And this way we can actually use that to perform the lookup function. Okay, so this is known as the direct mapping cache.

130
00:17:14.370 --> 00:17:27.120
William Cheng: And so in this example, we're going to take the 20 bits of the virtual page number over here and chop it into two parts. The first 14 bits over here are going to be the tag value, and the next six bits are going to be used as an array index that will index into

131
00:17:27.780 --> 00:17:33.840
William Cheng: This data structure over here. OK. So again, this is, you know, computer hardware, so we're going to use the hardware terminology.

132
00:17:34.020 --> 00:17:45.810
William Cheng: Each one of these entries over here is known as a cache line. So over here there are 64 cache lines, right, because two to the six is equal to 64. So here is key equal to zero, key equal to one, all the way to 63.

133
00:17:46.620 --> 00:17:55.590
William Cheng: So when you try to perform this lookup function, again, you should think about this as a hash function. We're going to use the key, the least significant six bits over here, to give us

134
00:17:56.340 --> 00:18:05.160
William Cheng: An entry over here. And then what we need to do is compare the tag stored in the entry against the tag over here inside the virtual address. If they're equal, then this is the page table entry that we need.

135
00:18:06.000 --> 00:18:12.120
William Cheng: OK. So again, the algorithm over here looks exactly the same as what we're using in the hashed page table.

136
00:18:12.450 --> 00:18:22.290
William Cheng: Okay. The only difference over here is that, you know, inside this bucket, so again, if you're a software person, you call this a bucket. If you're a hardware person, you call this a cache line. They're exactly the same thing.

137
00:18:22.680 --> 00:18:31.170
William Cheng: So in this particular implementation, if you're using direct mapping, then in this case the length of the collision resolution chain inside this bucket is exactly one.

138
00:18:32.130 --> 00:18:43.740
William Cheng: Okay, so by using direct mapping, that means that the length of the collision resolution chain is exactly one. So, therefore, you don't have to walk down a linked list,

139
00:18:43.950 --> 00:18:45.450
William Cheng: Because the length of the list is only one.

140
00:18:46.080 --> 00:18:52.560
William Cheng: Okay. So in this case you will take the tag over here and compare it against this one. If they're equal, that means that you get a translation lookaside buffer hit.

141
00:18:52.740 --> 00:18:56.340
William Cheng: And this is the page table entry that you want, so therefore it's gonna cost you one nanosecond.

142
00:18:56.580 --> 00:19:03.840
William Cheng: And you don't have to go out to the bus. And if it turns out that this tag is not equal to this one, then what do you have to do? Right, you have to go across the bus and spend 100 nanoseconds. Okay.

143
00:19:04.050 --> 00:19:08.340
William Cheng: And then what you're going to do is read in a new page table entry, and you're going to wipe out, so let me

144
00:19:08.550 --> 00:19:19.860
William Cheng: Clean this up over here, right. What you do is wipe out the existing page table entry over here and replace it with the new one. And now you copy the leading 14 bits of the virtual address over here and store them as the tag.

145
00:19:21.180 --> 00:19:27.060
William Cheng: Okay, from this point on, if you continue to execute the next machine instruction over here, like the example that we used over here,

146
00:19:27.240 --> 00:19:35.190
William Cheng: Again, in this case the tag and the key will stay the same. So, therefore, this key will give you this entry over here. You compare the tags, and they will be exactly the same, because you just brought it in.

147
00:19:36.030 --> 00:19:48.780
William Cheng: Okay, so therefore, in this case, the key and the tag over here will give you exactly what you're looking for. So you will continue to use the page table entry over here for the next 799 machine instructions, and this corresponds to the text segment.
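
NOTE
Editor's sketch of the direct mapping lookup walked through above, assuming the lecture's split of a 20-bit virtual page number into a 14-bit tag and a 6-bit index over 64 cache lines. The names split_vpn and DirectMappedTLB are hypothetical, and software arrays stand in for the hardware.
```python
INDEX_BITS = 6                        # 2**6 = 64 cache lines
NUM_LINES = 1 << INDEX_BITS
def split_vpn(vpn):
    # Low 6 bits select the cache line; the upper 14 bits are the tag.
    return vpn >> INDEX_BITS, vpn & (NUM_LINES - 1)
class DirectMappedTLB:
    def __init__(self):
        self.lines = [None] * NUM_LINES      # each line holds one (tag, pte) or None
    def lookup(self, vpn):
        tag, index = split_vpn(vpn)
        line = self.lines[index]
        if line is not None and line[0] == tag:
            return line[1]                   # TLB hit
        return None                          # TLB miss
    def fill(self, vpn, pte):
        tag, index = split_vpn(vpn)
        self.lines[index] = (tag, pte)       # overwrite whatever was there
tlb = DirectMappedTLB()
tlb.fill(0x00041, "pte-A")
tlb.fill(0x00081, "pte-B")   # same index (1), different tag: replaces pte-A
```
Because each line holds exactly one entry, two pages with the same low six bits keep evicting each other.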

148
00:19:50.130 --> 00:19:58.620
William Cheng: Okay, so again, this direct mapping cache is the simplest one. So again, over here, we mentioned the TLB hit and the TLB miss.

149
00:19:59.880 --> 00:20:06.090
William Cheng: Another way to do it, you know, so in this case, you know, the length of the collision resolution chain is always equal to one.

150
00:20:06.450 --> 00:20:12.960
William Cheng: As it turns out, this kind of translation lookaside buffer doesn't really perform very well.

151
00:20:13.620 --> 00:20:17.340
William Cheng: Actually, the miss ratio typically is going to be very, very high in your translation lookaside buffer.

152
00:20:17.580 --> 00:20:26.070
William Cheng: Okay, so to improve the performance over here, we're basically going to use a longer collision resolution chain, right? So in this case,

153
00:20:26.310 --> 00:20:29.430
William Cheng: This example is known as a two-way set associative cache.

154
00:20:29.700 --> 00:20:40.560
William Cheng: All right, so in this case, every cache line over here, again, is going to have a collision resolution chain. The length of the collision resolution chain is always two, and that's why this is called a two-way set associative cache.

155
00:20:41.070 --> 00:20:46.500
William Cheng: But there's also, you know, the four-way set associative cache, where the collision resolution chain is going to be,

156
00:20:46.740 --> 00:20:58.170
William Cheng: the length is going to be equal to four. And then there's the eight-way set associative cache, which is used by the Intel CPU. So in this case, every bucket over here is going to have a collision resolution chain of length eight.

157
00:20:58.620 --> 00:21:05.040
William Cheng: Okay. So in this case, how do you perform the lookup? So again, you take your virtual address, you take the virtual page number, and divide it into two parts.

158
00:21:05.250 --> 00:21:16.230
William Cheng: You use the key over here to give you a cache line, and now the cache line is going to have eight entries in it. So what you do is that you compare the tag against the tags in all these entries over here simultaneously.

159
00:21:17.130 --> 00:21:23.430
William Cheng: Okay, so how can you do this? The reason is that you do this in hardware. So therefore, you're actually able to compare the tag

160
00:21:23.640 --> 00:21:33.720
William Cheng: against all these eight tags over here simultaneously. Okay, the picture that we show over here is a two-way set associative cache, so therefore you will compare the tag against two tags over here simultaneously.

161
00:21:34.170 --> 00:21:39.210
William Cheng: If both of them are a miss, well, then you go across the bus and you're going to bring in a new page table entry,

162
00:21:39.810 --> 00:21:49.020
William Cheng: you bring the page table entry into your translation lookaside buffer. And now you have a problem. Okay? When you bring the new page table entry in over here, then in this case you have to decide which one to get rid of.

163
00:21:49.710 --> 00:22:00.690
William Cheng: Right, because you can only store two, right? So if the two of them don't match and you read in a new one, you need to kick one of them out. So in order for you to decide which one to kick out, you need to use your cache replacement policy.

164
00:22:01.260 --> 00:22:09.900
William Cheng: Okay, so this is known as the replacement policy: you pick out which one of them over here to get rid of. And now you need to replace that with the new one, and you also need to copy the tag over here into that entry.

165
00:22:10.050 --> 00:22:16.530
William Cheng: Okay, so next time, hopefully, when you try to perform address translation, you will get a hit for this page table entry.
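
NOTE
A rough sketch of the set associative lookup and replacement step described above. The lecture does not say which replacement policy is used, so least recently used (LRU) is an assumption here, as are the class name, the default sizes, and the dictionary standing in for the page table. Real hardware compares all ways in parallel; the dictionary membership test only models the outcome.
```python
from collections import OrderedDict
class SetAssociativeTLB:
    def __init__(self, num_sets=32, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        # Each set maps tag to page table entry, ordered oldest-used first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
    def lookup(self, vpn, page_table):
        """Return (page_table_entry, hit) for a virtual page number."""
        key = vpn % self.num_sets           # selects the cache line (the set)
        tag = vpn // self.num_sets          # compared against every way
        entries = self.sets[key]
        if tag in entries:                  # hardware checks all ways at once
            entries.move_to_end(tag)        # mark as most recently used
            return entries[tag], True
        if len(entries) == self.ways:       # set is full: pick a victim
            entries.popitem(last=False)     # LRU victim (assumed policy)
        entries[tag] = page_table[vpn]      # store the new entry with its tag
        return entries[tag], False
```
With ways=2, two pages that map to the same set can both stay resident, which is exactly the case the direct-mapped design cannot handle.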

166
00:22:17.370 --> 00:22:26.940
William Cheng: Okay. So the higher the amount of set associativity, the better the performance. So therefore, Intel uses an eight-way set associative cache. So in this case, again, they try to increase the hit,

167
00:22:27.270 --> 00:22:32.550
William Cheng: the hit probability, you know, for the translation lookaside buffer. So as it turns out, Intel does really, really well

168
00:22:33.720 --> 00:22:35.310
William Cheng: with the hit ratio. Yeah.
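
NOTE
To see why the hit ratio matters so much, here is a small worked calculation. It assumes the lecture's round numbers (1 ns for a hit, 100 ns to go across the bus on a miss) and a simple weighted average; the function name and the decision to charge a miss a flat 100 ns are illustrative assumptions.
```python
def average_translation_ns(hit_ratio, hit_ns=1.0, miss_ns=100.0):
    """Expected cost of one address translation for a given TLB hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ns + (1.0 - hit_ratio) * miss_ns
```
Under these assumptions a 99% hit ratio averages roughly 2 ns per translation, while a 90% hit ratio averages roughly 11 ns.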

169
00:22:36.750 --> 00:22:43.440
William Cheng: All right, so again, the amount of set associativity is the same thing as the bucket size, if you look at this as a hash table data structure.

170
00:22:43.650 --> 00:22:51.720
William Cheng: Okay. But again, this is done in hardware, so you can compare the tag against all the tags inside the same cache line simultaneously.

171
00:22:52.080 --> 00:23:02.100
William Cheng: So for those of you who have done hardware, you know that building a comparator is actually quite expensive. So if you want to compare a lot of stuff simultaneously, it's going to eat up a lot of the real estate inside your,

172
00:23:03.120 --> 00:23:11.430
William Cheng: inside your CPU chip. Okay, so that's why we end up with a translation lookaside buffer that takes up a lot of space inside the CPU or inside the MMU. Yeah.

173
00:23:13.080 --> 00:23:24.630
William Cheng: At the extreme, we can actually have only one cache line. So in this case, this is known as a fully associative cache. Okay. So in this case, there's only one cache line, so you don't need to use the key anymore.

174
00:23:24.840 --> 00:23:29.220
William Cheng: So what you will do is that you will compare the tag against all the tags over here in parallel.

175
00:23:30.180 --> 00:23:40.080
William Cheng: Okay. So in this case, you know, in this cache, how many entries are there? Well, maybe there are 64 entries over here, maybe there are 128, maybe there are 256. So again, what you do is that you take the tag over here

176
00:23:40.290 --> 00:23:48.660
William Cheng: and compare it against all the tags over here in parallel. If all of them miss, well, in that case, again, you need to go across the bus and read the page table

177
00:23:48.870 --> 00:23:59.820
William Cheng: entry over here from memory. And now, again, you have to run your replacement policy to determine which one to get rid of, in order for you to replace it with the corresponding value and copy the tag value over here into that entry.

178
00:24:00.270 --> 00:24:06.630
William Cheng: Okay, if it turns out that you get a hit, well, in that case you don't need to go across the bus, and it's only going to cost you one nanosecond.

179
00:24:06.930 --> 00:24:17.070
William Cheng: But in this case it's going to be really expensive, because in order for you to compare 256 entries over here in parallel, it's going to cost you a lot of hardware. Okay, so there are actually some CPUs, or some MMUs,

180
00:24:17.550 --> 00:24:24.630
William Cheng: out there where what they do is run something called microcode. So again, in this case what it does is that it sequentially goes through this list over here.

181
00:24:25.080 --> 00:24:34.890
William Cheng: So again, on the average, you're going to traverse half the list and find a match. In the worst case, you go over the entire list over here, and it's going to cost you, you know, 232 nanoseconds, and then you're going to go across the bus.

182
00:24:35.370 --> 00:24:48.060
William Cheng: Sorry, 256 nanoseconds over here. And then if it turns out it's a miss, you have to go across the bus and spend another 100 nanoseconds. Okay. So yeah, we're not going to get into too much of the microcode. But if you're a hardware person, you know what I'm talking about. Yeah.
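
NOTE
A small sketch of the cost of the microcode approach, where the fully associative entries are checked one at a time instead of in parallel. It assumes 1 ns per comparison and 100 ns to go across the bus, in line with the lecture's round numbers; the function names and the uniform-position assumption behind the average are illustrative.
```python
def sequential_scan_ns(tags, wanted, compare_ns=1, miss_ns=100):
    """Cost of scanning a fully associative TLB one entry at a time."""
    for position, tag in enumerate(tags, start=1):
        if tag == wanted:
            return position * compare_ns        # hit after `position` compares
    return len(tags) * compare_ns + miss_ns     # checked everything, then the bus
def average_hit_ns(num_entries, compare_ns=1):
    """Average hit cost when the match is equally likely at any position."""
    return (num_entries + 1) / 2 * compare_ns
```
For 256 entries this gives an average hit of about half the list, a worst-case hit of 256 ns, and a miss of 256 ns plus the trip across the bus.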

183
00:24:49.140 --> 00:24:59.580
William Cheng: Anyway, so there are three different approaches over here. Typically what we see in the modern CPU is that they use a set associative cache, and the amount of set associativity is going to be pretty high in order for you to get a,

184
00:25:00.120 --> 00:25:04.500
William Cheng: you know, for you to get a good hit rate inside your translation lookaside buffer. Yeah.

185
00:25:07.020 --> 00:25:15.660
William Cheng: All right, so now we're going to briefly take a look at a problem with multiple CPUs. So again, every time we talk about multiple CPUs, things get a little more messy.

186
00:25:16.080 --> 00:25:23.940
William Cheng: So now we're going to sort of see, with multiple CPUs, what problem it brings for the translation lookaside buffer. Okay. So if you look at this, you know, we have two CPUs over here.

187
00:25:24.330 --> 00:25:35.550
William Cheng: Every CPU has its own core, and each also has its own MMU, right? So therefore the translation lookaside buffer is inside the CPU. So if you have two CPUs, you're going to have two different translation lookaside buffers.

188
00:25:35.940 --> 00:25:41.790
William Cheng: Okay. So I mentioned before, when you update the page table entries over here inside the kernel data structure over here,

189
00:25:41.970 --> 00:25:51.420
William Cheng: you need to invalidate the translation lookaside buffer. But now if you modify the page table entries over here, you need to modify all the translation lookaside buffers over here. You have to invalidate all of them

190
00:25:51.690 --> 00:25:54.690
William Cheng: for the corresponding entries over here. Okay. So the question is, how do you do that?

191
00:25:56.160 --> 00:26:02.730
William Cheng: So yeah, the first question that you have to answer is, who is modifying the page table entries over here?

192
00:26:03.210 --> 00:26:06.180
William Cheng: Okay, so we're going to assume that all the CPUs over here are running,

193
00:26:06.870 --> 00:26:14.850
William Cheng: they're running threads from the same address space, from the same process. So that's why they end up, you know, modifying the same page table. Okay, so, you know,

194
00:26:15.150 --> 00:26:24.120
William Cheng: in order to modify the page table over here, one thread inside one CPU has to be modifying the page table entry over here. So in this case, it's going to be the one inside CPU one.

195
00:26:24.510 --> 00:26:30.360
William Cheng: Okay, so if the thread is running inside CPU one over here, that will be the kernel thread that's modifying the page table entries over here.

196
00:26:30.600 --> 00:26:41.220
William Cheng: So in this case, you are executing code in your own CPU, so you can invalidate the entries over here inside your translation lookaside buffer. So that's easy, right? You execute a machine instruction to do that.

197
00:26:41.430 --> 00:26:44.580
William Cheng: How do you invalidate the page table entries over here in the second CPU?

198
00:26:45.360 --> 00:26:55.230
William Cheng: Okay, why do you have to do that? Right, because the second CPU over here shares your address space. So if you modify the page table entry over here, well, then in that case, for the second CPU, the address translation will be incorrect.

199
00:26:56.190 --> 00:27:06.270
William Cheng: Okay, so therefore, you are sitting in the first CPU. How do you invalidate the translation lookaside buffer inside the second CPU? So remember that, you know, if you're running inside one CPU, there's no way you can tell the second,

200
00:27:06.570 --> 00:27:10.110
William Cheng: there's no way you can tell the second CPU what to do.

201
00:27:11.370 --> 00:27:15.570
William Cheng: Okay, because you know they're all separate hardware over here, you cannot tell the second CPU what to do.

202
00:27:16.590 --> 00:27:25.650
William Cheng: So therefore, what you should do over here is that you're going to run this algorithm, known as a distributed algorithm: the translation lookaside buffer shootdown algorithm.

203
00:27:26.160 --> 00:27:39.870
William Cheng: Okay, so this is known as the TLB shootdown algorithm. So what we need to do is that we need to ask the second CPU nicely to say, hey, could you invalidate the page table entry over here for us? Okay. But again, there's no way for you to talk to another CPU. So what do you have to do?

204
00:27:40.920 --> 00:27:47.850
William Cheng: All right, so the solution over here is that you need to run a distributed algorithm. It is a distributed algorithm because it really involves multiple CPUs over here.

205
00:27:48.030 --> 00:27:54.690
William Cheng: Okay, so they need to run the same distributed algorithm; they need to work with each other in order for you to invalidate the page table entries over here.

206
00:27:55.140 --> 00:28:04.770
William Cheng: Okay, so, since you're in CPU number one, how do you affect the second CPU? There's no way for you to tell the second CPU what to do. But there's one thing that you can do, which is that you can interrupt the second CPU.

207
00:28:05.790 --> 00:28:21.630
William Cheng: Okay, we saw before, for Intel, right, there's IPL number 31, which is IPL high; IPL number 30 is losing power, right; IPL number 29 is called the interprocessor interrupt, and that's the interrupt we're going to use.

208
00:28:22.080 --> 00:28:30.270
William Cheng: Okay, so in this case, CPU one over here is going to modify the page table entry over here, but you are not allowed to modify it yet, because if you modify it right now, well, then in this case,

209
00:28:30.510 --> 00:28:34.230
William Cheng: CPU two is going to perform the incorrect, you know, address translation.

210
00:28:34.710 --> 00:28:42.750
William Cheng: Okay, so therefore you are not allowed to modify the page table entry over here yet; you need to run the distributed algorithm so you know exactly when to modify the page table entry.

211
00:28:43.290 --> 00:28:51.840
William Cheng: Okay, so here we need to get the timing right. So the basic idea over here is,

212
00:28:52.860 --> 00:29:01.950
William Cheng: So what we need to do is that before we modify the page table entries over here, we want to make sure that CPU number two over here is not using the address space.

213
00:29:02.310 --> 00:29:05.790
William Cheng: So therefore, it's not doing the address translation. So how can we do that?

214
00:29:06.600 --> 00:29:11.250
William Cheng: Okay, so one thing that you can do is that, if CPU number two goes into the interrupt context,

215
00:29:11.580 --> 00:29:22.380
William Cheng: once it goes into the interrupt context, it's no longer inside a thread context. If it's not inside a thread context, it will not be using the page table anymore. It will not be using the address space.

216
00:29:23.670 --> 00:29:30.570
William Cheng: Okay, so what you have to do is that you have to interrupt the second CPU, wait for it to go into the interrupt context, and then you start running your distributed protocol.

217
00:29:31.050 --> 00:29:37.350
William Cheng: Okay, so let's take a look at the distributed protocol over here, right? So this CPU over here is known as the shooter.

218
00:29:38.670 --> 00:29:49.200
William Cheng: Okay, so this will be the shooter, and then all the other CPUs over here, since we can actually have multiple CPUs, are all running different threads of the same process. So in this case, you have to shoot all of them down.

219
00:29:50.280 --> 00:30:03.210
William Cheng: Okay, so one of them is the shooter over here. And then the other ones are all called shootees; they're the ones that are being shot down. Okay, so we're going to see the code that's running inside the shooter. We're also going to see the code running inside all the shootees.

220
00:30:04.710 --> 00:30:08.700
William Cheng: First, the code over here is going to be the shooter code, right? So here's the shooter code over here.

221
00:30:09.120 --> 00:30:19.320
William Cheng: So we're going to call the shooter over here CPU j. Again, this one is j over here. All the other CPUs are going to be i1, i2, i3, so they're all called i.

222
00:30:19.950 --> 00:30:31.500
William Cheng: So the shooter over here is equal to j. For all processors i sharing the same address space with j, what we're going to do is we're going to use interrupt number 30, that's called the interprocessor interrupt.

223
00:30:32.190 --> 00:30:39.180
William Cheng: Okay. So in this case we interrupt i over here by executing this function. So in this case, again, the hardware over here will interrupt all the other CPUs.

224
00:30:39.660 --> 00:30:46.110
William Cheng: So all the other CPUs will need to go into the interrupt context, right? So then we need to wait, because all the other CPUs might have interrupts disabled.

225
00:30:46.920 --> 00:30:54.120
William Cheng: Okay, or maybe they are, you know, servicing a higher-level interrupt. I mean, some CPUs have 128 interrupt levels over here.

226
00:30:54.330 --> 00:31:02.010
William Cheng: So again, we need to make sure that the other CPUs go into the interrupt context, so therefore we have to keep waiting until they get into the interrupt context.

227
00:31:02.250 --> 00:31:08.130
William Cheng: Okay. So therefore, the first thing that we do is that we need to interrupt all the other CPUs that are running threads of the same process.

228
00:31:08.370 --> 00:31:14.340
William Cheng: And then, for all the processors sharing the same address space, we need to wait for them to go into the interrupt service routine.

229
00:31:14.760 --> 00:31:20.610
William Cheng: Okay, so the way we do this is that we look at some global variable. So there's an array with an entry for every one of these CPUs over here.

230
00:31:21.150 --> 00:31:27.720
William Cheng: The index over here is going to be the CPU index. If noted[i] is equal to zero, it means that CPU i hasn't gone into the interrupt service routine.

231
00:31:28.170 --> 00:31:37.710
William Cheng: All right, so in this case, what we're going to do is we're going to spin the CPU over here and keep executing this instruction until they get into the interrupt context.

232
00:31:38.250 --> 00:31:42.540
William Cheng: Okay, so hopefully all the other CPUs will go into the interrupt context pretty quickly.

233
00:31:43.410 --> 00:31:47.040
William Cheng: So in this case we don't have to wait too long. So once you, you know,

234
00:31:47.460 --> 00:31:51.780
William Cheng: finish waiting for one of the CPUs to go into the interrupt context, you go wait for the next one, the next one, the next one.

235
00:31:51.930 --> 00:31:59.220
William Cheng: So eventually, when you're done with all these instructions over here, then you know that all the other CPUs that are sharing the same address space have all gone into the interrupt,

236
00:31:59.730 --> 00:32:10.620
William Cheng: they have all gone into the interrupt context, and now it becomes safe for you to modify the page table entry, right? Because you know that no other CPU right now is actually using,

237
00:32:11.550 --> 00:32:17.190
William Cheng: using the page table that you're about to modify. So, therefore, at this time we're going to modify the page,

238
00:32:17.790 --> 00:32:22.470
William Cheng: the page table. We can modify as many entries as we want: modify one of them, two of them, you know,

239
00:32:22.710 --> 00:32:30.240
William Cheng: 100 of them, whatever you need. And when we're done over here, what we can do is that we can update or flush the translation lookaside buffer for our own CPU.

240
00:32:30.510 --> 00:32:34.590
William Cheng: It depends: if you only modify one page table entry, in that case you would just invalidate it.

241
00:32:34.980 --> 00:32:46.020
William Cheng: If you modify 100 of them, well, maybe sometimes it's easier for you to flush the entire translation lookaside buffer. So again, your kernel can decide whether you want to invalidate only one entry or invalidate the entire translation lookaside buffer.

242
00:32:47.070 --> 00:32:56.340
William Cheng: Okay. So when you're finished doing this, what you need to do is that you need to tell all the other CPUs, now I'm done, you can actually go ahead and invalidate your translation lookaside buffer,

243
00:32:56.640 --> 00:32:58.950
William Cheng: and then you can go back to whatever you were doing before.

244
00:32:59.220 --> 00:33:08.010
William Cheng: Okay, so in this case we set a global array over here, known as done. And again, me over here is equal to j, so that will be the shooter's index. So over here it says I'm done: I'm setting done[me] to one.

245
00:33:08.520 --> 00:33:12.570
William Cheng: Okay, so this way all the other CPUs can go back into the thread context.

246
00:33:13.530 --> 00:33:16.230
William Cheng: All right, so next is going to be the shootee code, right? So what's the shootee code going to be?

247
00:33:16.620 --> 00:33:25.830
William Cheng: The shootee code over here is going to be an interrupt service routine. So when they get interrupted, right, if interrupts are enabled and there's no higher-level interrupt that's blocking it, so therefore,

248
00:33:26.130 --> 00:33:35.100
William Cheng: you know, it services the interprocessor interrupt. So this is going to be the code for the interprocessor interrupt. So what it does is that it will know who the shooter is. So in this case, the shooter is j over here.

249
00:33:35.670 --> 00:33:38.340
William Cheng: So what it will do is that it will set noted[i] equal to one, which says that

250
00:33:38.610 --> 00:33:46.470
William Cheng: now I'm inside the interrupt service routine, so now you can go wait for somebody else. Okay. So again, this one is set to one right here, and this one is used by the shooter right here.

251
00:33:46.890 --> 00:33:56.220
William Cheng: Okay. And then what it will do is that it's going to wait for the shooter to be done. So it will execute this code: while done[j], where j is going to be the shooter, is equal to zero over here. So,

252
00:33:56.700 --> 00:34:02.850
William Cheng: before the shooter gets right here, the other CPUs over here will go into a busy wait and keep waiting for the shooter to be done.

253
00:34:03.150 --> 00:34:11.310
William Cheng: Okay. So in this case, eventually, when the shooter is done, what it will do is that it will get out of this loop, and then it will flush the entire translation lookaside buffer.

254
00:34:11.820 --> 00:34:18.570
William Cheng: Okay, why can't it just invalidate one of the translation lookaside buffer entries? Right, because it doesn't know what CPU one just did.

255
00:34:19.620 --> 00:34:26.340
William Cheng: Okay, CPU one over here maybe modified one page table entry. If it only modified one, it doesn't really tell CPU two which one it modified.

256
00:34:26.520 --> 00:34:35.910
William Cheng: What if it modified 100 of them? So CPU two has no idea how many page table entries have been modified. So in this case, the only safe thing to do is to flush the entire translation lookaside buffer.

257
00:34:36.480 --> 00:34:44.910
William Cheng: Okay, so this way, when CPU two goes back into the thread context, it will start getting a few translation lookaside buffer misses, and eventually it's going to start getting hits.
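
NOTE
The shooter and shootee steps described above can be sketched as a small simulation, with threads standing in for CPUs and an Event standing in for the interprocessor interrupt. This is an illustration under those assumptions, not kernel code. The arrays noted and done follow the names the lecture appears to use; simulate_shootdown, the toy page table, and the per-CPU dictionaries used as TLBs are hypothetical.
```python
import threading
import time
def simulate_shootdown(num_cpus=4, shooter=0, vpn=0x10, new_frame=0x222):
    """Run one TLB shootdown with threads standing in for CPUs."""
    page_table = {vpn: 0x111}
    tlbs = [dict(page_table) for _ in range(num_cpus)]   # private TLB per CPU
    noted = [0] * num_cpus      # noted[i] == 1: CPU i is in its interrupt handler
    done = [0] * num_cpus       # done[j] == 1: shooter j has finished its update
    ipi = [threading.Event() for _ in range(num_cpus)]   # stand-in for the interrupt
    others = [i for i in range(num_cpus) if i != shooter]
    def shooter_code(me):
        for i in others:
            ipi[i].set()                 # interrupt every CPU sharing the address space
        for i in others:
            while noted[i] == 0:         # spin until CPU i is in interrupt context
                time.sleep(0)
        page_table[vpn] = new_frame      # safe now: no other CPU is translating
        tlbs[me].pop(vpn, None)          # invalidate only the entry we changed
        done[me] = 1                     # release the shootees
    def shootee_code(i, j):
        ipi[i].wait()                    # stand-in for taking the interrupt from j
        noted[i] = 1                     # tell the shooter we are in the handler
        while done[j] == 0:              # busy-wait until the shooter is done
            time.sleep(0)
        tlbs[i].clear()                  # flush everything: we don't know what changed
    threads = [threading.Thread(target=shootee_code, args=(i, shooter)) for i in others]
    threads.append(threading.Thread(target=shooter_code, args=(shooter,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return page_table, tlbs
```
After a run, the page table holds the new frame, the shooter has dropped only the entry it changed, and every shootee has an empty TLB, so its next translations miss until the entries are reloaded.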

258
00:34:46.200 --> 00:34:52.920
William Cheng: Okay, so this is one of the reasons why, when you have four CPUs, you can never run four times faster.

259
00:34:53.070 --> 00:34:59.430
William Cheng: Because there are going to be times when you're sitting inside the interrupt context, and in this case, when you, you know, interrupt the other CPUs,

260
00:35:00.000 --> 00:35:04.560
William Cheng: you're going to flush their entire translation lookaside buffers. So one of the CPUs is going to go pretty fast,

261
00:35:04.680 --> 00:35:12.540
William Cheng: while all the other CPUs, because they're experiencing translation lookaside buffer misses, will be running slow for a while, and then eventually they're going to run four times as fast again.

262
00:35:13.860 --> 00:35:16.560
William Cheng: Okay, so when you have four CPUs, you should never expect,

263
00:35:16.770 --> 00:35:27.150
William Cheng: you know, the speedup to be exactly four times. Usually it's going to be a little less than that. Okay. So one of the reasons is that you have to run the translation lookaside buffer, you know, shootdown algorithm. Okay.

264
00:35:28.650 --> 00:35:36.750
William Cheng: All right, so I guess the next slide over here sort of talks about what's called a caching hierarchy, or the storage hierarchy.

265
00:35:37.140 --> 00:35:44.790
William Cheng: So, you know, we've talked about the translation lookaside buffer. We also talked about some caches over here. Those of you who know about

266
00:35:45.300 --> 00:35:53.910
William Cheng: these things, you know that there's also a data cache, there's an instruction cache, there's this kind of cache, that kind of cache. Some people actually think of the disk as a cache.

267
00:35:54.660 --> 00:35:59.160
William Cheng: Now, why would you think of the disk as a cache? Right, because you can think about something like this:

268
00:35:59.280 --> 00:36:08.100
William Cheng: you know, your actual data is sitting in the cloud somewhere. Okay, so what you do is that when you need something, you bring the data from the cloud and put it on your hard drive, so the hard drive is a cache for your,

269
00:36:08.580 --> 00:36:11.610
William Cheng: for your cloud. And then once you put,

270
00:36:12.330 --> 00:36:19.950
William Cheng: put things on your hard drive, in order for you to access your hard drive, you're going to bring in data from the hard drive into memory. So therefore, the memory is actually a cache for the hard drive.

271
00:36:20.160 --> 00:36:28.350
William Cheng: And now, for the memory over here, some of the entries over here are used as page tables, so they are cached inside the translation lookaside buffer. Some of them are actually

272
00:36:28.860 --> 00:36:33.930
William Cheng: cached inside the on-CPU cache. And there are also off-chip caches; there are all kinds of caches over here.

273
00:36:34.440 --> 00:36:42.810
William Cheng: Okay. So therefore, there is a caching hierarchy. Near the top over here is close to the CPU; near the bottom over here is far away from the CPU. So typically, as you

274
00:36:43.050 --> 00:36:54.870
William Cheng: get closer to the CPU, the cache gets faster and faster and the capacity of the cache gets smaller and smaller. As you go away from the CPU, the cache gets bigger and bigger, and also the speed of the cache gets slower and slower.

275
00:36:55.230 --> 00:37:02.280
William Cheng: Okay, so here are the typical numbers over here. So again, these are the numbers from, you know, 2012. So today, the numbers

276
00:37:03.090 --> 00:37:07.620
William Cheng: can be faster, but the relative numbers are still very, very similar.

277
00:37:08.400 --> 00:37:12.600
William Cheng: Okay, so please take these numbers with a grain of salt, but know that the relative numbers are still pretty,

278
00:37:13.200 --> 00:37:19.470
William Cheng: pretty much the same. So, therefore, in a way, this table is still valid. All right, so let's take a look at the top over here.

279
00:37:19.800 --> 00:37:28.140
William Cheng: As I mentioned over here, you know, inside the CPU, we're going to assume that every instruction that you execute is going to cost one nanosecond. So to access the translation lookaside buffer is going to be pretty fast,

280
00:37:28.380 --> 00:37:36.390
William Cheng: because the performance is crucial. So therefore, the hardware people put a lot of resources over here to make sure that it actually runs really fast. Okay. So in this case,

281
00:37:36.630 --> 00:37:40.530
William Cheng: the access time is going to be the same as executing an instruction; it's going to be on the order of one nanosecond.

282
00:37:40.770 --> 00:37:46.260
William Cheng: And the size over here is pretty small, on the order of 64 kilobytes. Okay. So again, this is inside the CPU.

283
00:37:46.500 --> 00:37:55.140
William Cheng: Then the next one over here: I don't know if you've noticed, when you buy a CPU from Amazon or from Fry's, they will tell you the amount of L1 cache that you have, the amount of L2

284
00:37:55.530 --> 00:38:02.580
William Cheng: cache that you have. So as it turns out, the L1 cache and the L2 cache are typically inside the CPU. So in this case, I'm going to skip the L1 cache over here.

285
00:38:03.000 --> 00:38:08.280
William Cheng: For the L2 cache over here, it's inside the CPU, so therefore the access time is still going to be pretty fast.

286
00:38:08.700 --> 00:38:17.130
William Cheng: But then it's much bigger, so it's going to be a little slower: the access time is on the order of four nanoseconds, and the size over here is going to be 256 kilobytes.

287
00:38:17.790 --> 00:38:26.550
William Cheng: Okay. So in that case, again, when you try to access data in memory, you try to look for it inside the L2 cache. If it's in the L2 cache, again, you don't have to go to the main memory, right?

288
00:38:27.150 --> 00:38:37.950
William Cheng: So these things are inside the CPU. And then there's something called an L3 cache. The L3 cache is sort of for the more sophisticated system where you have multiple CPUs. So you may have several CPUs over here. So let's say we have multiple CPUs.

289
00:38:39.360 --> 00:38:55.170
William Cheng: CPUs over here. Each one of them will have an L3 cache that's outside of the CPU chip. Okay, so over here, these are the L3 caches. So these are outside of the CPU, and they sit on the same side of the bus as the CPU. If you go across the bus over here, that's your,

290
00:38:56.310 --> 00:39:01.350
William Cheng: that's your RAM. Again, the RAM is the one where you have 4 gigabytes, 8 gigabytes, 16 gigabytes over here; that's your RAM.

291
00:39:01.560 --> 00:39:07.560
William Cheng: And then, you know, on the motherboard you'll have the CPU chip, and you're also going to have these L3 caches.

292
00:39:07.770 --> 00:39:18.690
William Cheng: Okay. The L3 caches are built out of memory chips that are much faster than the RAM. So over here, for the L3 cache, the access time is 10 times faster than the memory over here.

293
00:39:19.170 --> 00:39:28.290
William Cheng: So in this case it's going to be 10 nanoseconds. Okay, the capacity is going to be much smaller, on the order of two megabytes. Okay, so each one of them over here is going to be two megabytes. Okay, again,

294
00:39:28.650 --> 00:39:33.210
William Cheng: we don't really talk about this; if you're taking a computer architecture class, they will sort of talk about, you know,

295
00:39:33.660 --> 00:39:42.450
William Cheng: they'll talk about this. And again, there are some hardware issues that need to be addressed when you have multiple L3 caches. So yeah, I'm going to skip all that.

296
00:39:43.230 --> 00:39:48.930
William Cheng: But the next level in the cache hierarchy is going to be your random access memory, your RAM, or the physical memory.

297
00:39:49.410 --> 00:39:57.210
William Cheng: So that's on the order of 10 gigabytes, right? Because you typically have, you know, 4 gigabytes, 16 gigabytes, 8 gigabytes. So on average, 10 gigabytes.

298
00:39:57.420 --> 00:40:04.620
William Cheng: The access time over here is going to be 100 times slower than the translation lookaside buffer, on the order of 100 nanoseconds.

299
00:40:05.190 --> 00:40:13.590
William Cheng: So this is outside of the CPU. So again, in this case, the RAM is special. You know, we don't call it a device.

300
00:40:14.100 --> 00:40:19.050
William Cheng: So it's sort of a special piece of hardware that's hanging off the bus, and the access is actually pretty quick.

301
00:40:19.440 --> 00:40:27.330
William Cheng: Okay. The rest of them over here are all devices, so they are accessed through a device driver. Okay. So the next device over here,

302
00:40:27.570 --> 00:40:36.390
William Cheng: you know, down the storage hierarchy, is going to be your solid-state drive, right? So this is going to be a USB stick or your solid-state drive.

303
00:40:36.630 --> 00:40:45.630
William Cheng: So in this case, your access time is going to be 1,000 times slower than the RAM. Okay, so why is it 1,000 times slower? Because you have to go through a device driver.

304
00:40:46.350 --> 00:40:51.330
William Cheng: Okay, so the RAM you can access directly, right, by, you know, starting a bus cycle, and that's it. But

305
00:40:51.600 --> 00:40:58.740
William Cheng: to use the SSD, you have to go through a device driver. So that's going to be 1,000 times slower, on the order of 100 microseconds.

306
00:40:59.190 --> 00:41:09.240
William Cheng: So in this case, the size is going to be bigger. The size is going to be 10 times bigger than the RAM, on the order of 100 gigabytes. Again, today you're going to have more capacity over here for your SSD. Okay.

307
00:41:09.900 --> 00:41:13.110
William Cheng: The next device over here is called remote RAM. So what is remote RAM?

308
00:41:13.590 --> 00:41:22.920
William Cheng: Okay, you have a machine right next to you. Maybe the machine right next to you is not doing anything. So the machine right next to you can actually allow you to borrow RAM from it, you know, from the other machine.

309
00:41:23.340 --> 00:41:33.930
William Cheng: Okay, so in order for you to get to the remote RAM, you have to go through the network device over here. So again, you have to go through a device driver. So in this case, the access time over here is going to be on the order of 100

310
00:41:34.470 --> 00:41:36.960
William Cheng: microseconds. So it's the same speed as SSD.

311
00:41:37.740 --> 00:41:43.740
William Cheng: Okay, typically your network devices are actually really fast, but going through the device driver, they're going to be really slow.

312
00:41:43.920 --> 00:41:50.940
William Cheng: Okay, so therefore, in the end, you're going to end up with the same performance. And the other machine, you know, I don't know how much memory it's going to loan to you.

313
00:41:51.390 --> 00:41:57.270
William Cheng: But you can actually use all the machines on your local area network. If they're all willing to help you out, they can actually give some of their memory to you.

314
00:41:57.360 --> 00:42:08.310
William Cheng: So in the end, you know, you're actually going to end up with on the order of 100 gigabytes of memory. So, yeah, you sort of need to be on the same local area network. If you don't know what that is, it's okay, right? If you take a networking class, you'll know what that is.

315
00:42:08.760 --> 00:42:17.910
William Cheng: So say they're on the same network. In this case, they will loan their memory to you, they will loan their memory to us. Okay. They need to be cooperating with you in order for you to be able to do that. But

316
00:42:18.360 --> 00:42:26.640
William Cheng: in most cases, when you go to the computer room and you want to use the RAM on another machine, the machine says, are you kidding me? I'm also doing something important. I'm not going to loan you any of my memory.

317
00:42:27.150 --> 00:42:30.240
William Cheng: Okay. So again, this only works if you have a special setup. Okay.

318
00:42:30.750 --> 00:42:38.760
William Cheng: The next device over here is going to be the disk. We saw it before, right? You know, your hard drive. We talked about how slow it is. The access time on average is going to be 1 to 10 milliseconds.

319
00:42:39.180 --> 00:42:50.220
William Cheng: So in this case, the storage capacity is going to be a little bigger, right? It's going to get to 1 terabyte over here, and today you can actually buy, you know, 8 terabytes of disk or something like that. So again, today the capacity is usually bigger.

320
00:42:51.090 --> 00:42:58.680
William Cheng: Yeah, so in this case, again, everything over here is going to be a device. In this case, it's a mechanical device; that's why the access is slower. You can also go to a remote disk.

321
00:42:59.040 --> 00:43:04.440
William Cheng: Okay, because there are other machines on your local area network that allow you to actually have storage out there. We also,

322
00:43:04.890 --> 00:43:06.180
William Cheng: I guess today, there are also,

323
00:43:06.510 --> 00:43:18.180
William Cheng: you know, network-attached devices, right? You can actually go to Amazon and buy a network-attached disk that you can attach to your network. So in that case, that will be a remote disk. Okay, the access time over here is going to be even slower, on the order of 100 milliseconds.

324
00:43:18.510 --> 00:43:21.150
William Cheng: But in this case, you can actually have a lot of extra storage, you know,

325
00:43:22.110 --> 00:43:31.140
William Cheng: inside your network-attached storage. Okay, what about after that? After that, over here, it could be, you know, inside the cloud. You have to go through your network device driver to connect to the internet.

326
00:43:31.620 --> 00:43:41.670
William Cheng: The access time depends on how far away it is, and the capacity over here, I mean, you know, could be infinity, right? So again, it's always finite, but it's going to get very, very big.

327
00:43:42.570 --> 00:43:51.510
William Cheng: Okay, so one view over here is that, you know, your data is actually stored in the cloud. And then what we need to do is, when we need to access it, we need to bring it closer, to

328
00:43:52.260 --> 00:43:54.720
William Cheng: get it in as close to the CPU, as far

329
00:43:55.440 --> 00:44:02.940
William Cheng: as possible. And as you get closer to the CPU, you know, it gets faster and faster, and the storage capacity over here for the cache is going to get smaller and smaller.

330
00:44:03.270 --> 00:44:16.560
William Cheng: Okay, so you can sort of think about this entire hierarchy as a caching hierarchy or as your storage hierarchy. Okay. So whenever we talk about a cache hierarchy or storage hierarchy, again, remember this particular slide. Yeah.
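
NOTE
A minimal sketch of the storage hierarchy described above, using only the order-of-magnitude figures quoted in the lecture. Assumptions: the 1 ns TLB time is implied by "RAM is 100 times slower than the TLB" rather than stated here, the disk entry uses the upper end of the quoted 1 to 10 ms range, and the names HIERARCHY and slowdown are hypothetical.
```python
# Access times in nanoseconds and capacities in bytes (None where the lecture gives no figure).
MB, GB, TB = 2**20, 2**30, 2**40
HIERARCHY = [
    ("TLB",         1,           None),      # implied by the RAM comparison
    ("L3 cache",    10,          2 * MB),
    ("RAM",         100,         10 * GB),
    ("SSD",         100_000,     100 * GB),  # 100 microseconds
    ("remote RAM",  100_000,     100 * GB),
    ("disk",        10_000_000,  1 * TB),    # upper end of 1 to 10 ms
    ("remote disk", 100_000_000, None),      # 100 ms
]
def slowdown(slow: str, fast: str) -> float:
    """Return how many times slower level `slow` is than level `fast`."""
    times = {name: ns for name, ns, _ in HIERARCHY}
    return times[slow] / times[fast]
if __name__ == "__main__":
    for name, ns, cap in HIERARCHY:
        cap_text = "not stated" if cap is None else f"{cap / GB:g} GB"
        print(f"{name:12s} {ns:>12,d} ns  {cap_text}")
```
Calling slowdown("SSD", "RAM") reproduces the "1,000 times slower" figure, and each step down the list trades speed for capacity.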

331
00:44:18.690 --> 00:44:29.160
William Cheng: All right, so the next thing we're going to sort of briefly talk about is the 64-bit issues over here. So what about when you have a 64-bit CPU? Right, in this case, your virtual address is going to be 64 bits long.

332
00:44:29.970 --> 00:44:35.280
William Cheng: You know, so yeah, so if you have a multilevel, you know, multilevel page table, you know,

333
00:44:35.880 --> 00:44:40.620
William Cheng: again, instead of chopping it into three parts, we can chop it into as many parts as we want. Okay.

334
00:44:41.040 --> 00:44:50.370
William Cheng: So as it turns out, it's kind of funny. One of the competitors of Intel is AMD, right? So the AMD format is actually the most popular format. This is known as the x86-64,

335
00:44:50.640 --> 00:45:05.520
William Cheng: you know, format. So when you download the Ubuntu 16.04 ISO file, remember, there are two different files you can download. One is called i386, right? That one's for the 32-bit format. And the other one is called x86-64; that's for the 64-bit format.

336
00:45:06.720 --> 00:45:16.080
William Cheng: Okay, so in that case you're using this particular format over here. So again, with a 64-bit address space, 2 to the 64 is an astronomically large number. Okay, you will never need an

337
00:45:16.500 --> 00:45:25.410
William Cheng: address space that big. So what they do is that, for AMD, they just said the first 16 bits over here are going to be unused. So in this case, the address actually is 48 bits long.

338
00:45:26.040 --> 00:45:32.670
William Cheng: Okay, so we're going to take the 48 bits. We are going to chop them into five parts over here, and then again, it's the same idea as the multilevel page table.

339
00:45:32.820 --> 00:45:37.110
William Cheng: The first part over here is for the first-level page table; the second part over here is for the second-level page table.

340
00:45:37.320 --> 00:45:43.230
William Cheng: I mean, the terminology is going to get a little weird over here. The first one over here is called the page map table.

341
00:45:43.410 --> 00:45:52.590
William Cheng: And the second one is known as the page directory pointer table. And the third one over here, again, we're going to go back to the 32-bit terminology: this one will be known as the page directory table, and the next one over here is the

342
00:45:52.950 --> 00:45:56.310
William Cheng: page table. And then the last one over here will be the actual physical page.

343
00:45:56.940 --> 00:46:04.470
William Cheng: So again, the idea here is exactly the same. When you perform address translation, you're going to use the first, you know, so many bits over here as an array index.

344
00:46:04.710 --> 00:46:10.620
William Cheng: That will give you, again, a page map table entry that looks just like the page table entries over here.

345
00:46:10.770 --> 00:46:18.570
William Cheng: You check the validity bit, check the access bits, and then it will give you a physical page number. That will give you the base address over here for the next-level page table, and then you perform address

346
00:46:18.960 --> 00:46:21.390
William Cheng: translation exactly the same way as before.
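
NOTE
A minimal sketch of chopping a 48-bit virtual address into the five parts described above. Assumption not stated in the lecture: the field widths follow the standard x86-64 layout for 4-kilobyte pages (four 9-bit table indices and a 12-bit offset). The function name and the short field names are hypothetical.
```python
OFFSET_BITS = 12   # 4 KB page, so a 12-bit offset within the physical page
INDEX_BITS = 9     # each table has 2**9 = 512 entries
# Table names as used in the lecture, from the first level down to the last.
LEVELS = ("page_map", "page_dir_ptr", "page_dir", "page_table")
def split_virtual_address(vaddr: int) -> dict:
    """Split a virtual address into four table indices plus a page offset."""
    vaddr &= (1 << 48) - 1                      # the upper 16 bits are treated as unused
    fields = {"offset": vaddr & ((1 << OFFSET_BITS) - 1)}
    shift = OFFSET_BITS
    for name in reversed(LEVELS):               # the last-level index sits just above the offset
        fields[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift += INDEX_BITS
    return fields
if __name__ == "__main__":
    print(split_virtual_address(0x00007F1234567ABC))
```
Each index selects one entry in its table, and that entry's physical page number gives the base address of the next-level table, as the lecture describes.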

347
00:46:22.080 --> 00:46:31.590
William Cheng: Okay, so you can see that if we don't have a good-performing translation lookaside buffer, the overhead over here is going to be really scary. It's going to be one, two, three, four over here; it's going to be

348
00:46:31.890 --> 00:46:38.190
William Cheng: 400% overhead. So you're going to run five times slower. But hopefully, if you have a good-performing translation lookaside buffer,

349
00:46:38.310 --> 00:46:50.820
William Cheng: every one of these page table entries, they are all inside the translation lookaside buffer. So therefore it's only going to cost you four nanoseconds over here for you to perform address translation, and eventually you're going to go to the bus and spend 100 nanoseconds.

350
00:46:52.050 --> 00:46:57.840
William Cheng: Okay. So again, this is the importance of, you know, the translation lookaside buffer, especially if you have a 64-bit CPU.
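
NOTE
A minimal sketch of the overhead arithmetic above, following the lecture's accounting: four table lookups per translation, 100 ns per RAM access, and 1 ns per TLB lookup. The hit_rate parameter and both function names are hypothetical additions used only to show how the average moves between the two cases.
```python
RAM_NS = 100   # one memory access over the bus
TLB_NS = 1     # one translation lookaside buffer lookup
LEVELS = 4     # page table levels consulted per translation
def access_time_ns(tlb_hit: bool) -> int:
    """Cost of one memory reference, including address translation."""
    if tlb_hit:
        return LEVELS * TLB_NS + RAM_NS   # 4 ns of translation plus 100 ns for the data
    return LEVELS * RAM_NS + RAM_NS       # four table reads plus the data read
def average_ns(hit_rate: float) -> float:
    """Average cost for an assumed TLB hit rate between 0 and 1."""
    return hit_rate * access_time_ns(True) + (1 - hit_rate) * access_time_ns(False)
if __name__ == "__main__":
    print(access_time_ns(False) / RAM_NS)   # slowdown factor with no TLB help
    print(average_ns(0.99))
```
Without the TLB, a reference costs 500 ns instead of 100 ns, which is the 400% overhead; with every entry in the TLB, it costs 104 ns.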

351
00:46:58.710 --> 00:47:05.370
William Cheng: Now, for AMD, there are actually two formats, and the second one over here only tries to do four-level address translation over here.

352
00:47:05.520 --> 00:47:14.010
William Cheng: They combine the last parts over here into a giant page. So this page is a giant page. So in this case, you can actually choose to have a page size of 4 kilobytes

353
00:47:14.220 --> 00:47:25.830
William Cheng: or 2 megabytes; you can have a 2-megabyte page over here. Okay. So in this case, you know, when you have a 2-megabyte page, every time you need to go to the disk, you need to copy 2 megabytes. So again, on average it's going to take you a little

354
00:47:26.850 --> 00:47:30.060
William Cheng: longer, but in this case the address translation over here will be a little faster.

355
00:47:30.870 --> 00:47:42.270
William Cheng: Okay. But again, if you're using a good translation lookaside buffer, the advantage that you gain from, you know, the address translation is not going to be very much. So again, you know, people can decide which way they want to go. Yeah.
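
NOTE
A minimal sketch of why the 2-megabyte page format needs one fewer table lookup. Assumptions not stated in the lecture: a 48-bit virtual address with 9 index bits per table level, as in the standard x86-64 layout. The function name layout_for_page_size is hypothetical.
```python
def layout_for_page_size(page_size: int, va_bits: int = 48, bits_per_level: int = 9):
    """Return (offset_bits, table_levels) for a power-of-two page size."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_size.bit_length() - 1          # log2 of the page size
    levels, remainder = divmod(va_bits - offset_bits, bits_per_level)
    if remainder:
        raise ValueError("page size does not fit this layout")
    return offset_bits, levels
if __name__ == "__main__":
    print(layout_for_page_size(4 * 1024))         # 4 KB pages
    print(layout_for_page_size(2 * 1024 * 1024))  # 2 MB pages
```
With 4-kilobyte pages the address splits into four indices plus a 12-bit offset (five parts); with 2-megabyte pages the last index is absorbed into a 21-bit offset, leaving three indices plus the offset (four parts).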

356
00:47:43.410 --> 00:47:47.100
William Cheng: All right, what about Intel? So this one is Intel's format over here.

357
00:47:47.760 --> 00:47:54.750
William Cheng: You know, Intel's architecture is called IA-64, also known as the Itanium architecture.

358
00:47:54.960 --> 00:48:06.570
William Cheng: Itanium is the 64-bit architecture. I think around 2012, the last company that used this architecture was HP, and HP said, forget it, we're not going to do this; we're going to go with the AMD format.

359
00:48:06.870 --> 00:48:08.340
William Cheng: So therefore, nobody uses it anymore.

360
00:48:08.700 --> 00:48:17.070
William Cheng: So we're not going to talk about it. But the basic idea over here is that if you look at the structure, it looks almost like the linear page table over here, where it's divided into two or three parts.

361
00:48:17.310 --> 00:48:24.840
William Cheng: The first part over here is a small number of bits, and that will give you the space; it'll give you the space register. Okay, so in this case over here,

362
00:48:25.020 --> 00:48:32.820
William Cheng: they will use, you know, three or four bits over here. So there are 16 different spaces, instead of having only four spaces. Okay.

363
00:48:33.240 --> 00:48:44.490
William Cheng: But again, since, you know, it didn't really work out very well for Digital Equipment Corp., eventually Intel sort of suffered the same fate, because the performance of IA-64 is really not very good. So everybody went with the AMD format.

364
00:48:45.270 --> 00:48:53.400
William Cheng: So therefore, we're going to skip all that also. Yeah. And for the linear-based page table, again, you don't have to worry about the address translation, because nobody does that anymore. Okay.

365
00:48:55.140 --> 00:48:58.320
William Cheng: So the last part over here in chapter seven, for the overview of the hardware,

366
00:48:58.530 --> 00:49:07.500
William Cheng: is to talk about virtualization, but we haven't really talked about what a virtual machine is. Yeah. So right now, what we're going to do is, I'm going to skip everything over here, and now we're going to go into the second part of chapter seven.

367
00:49:07.710 --> 00:49:15.780
William Cheng: All right, so we're done with hardware for now. Okay. And when we finish virtual machines, we're going to come back and talk about, you know, virtualization,

368
00:49:16.560 --> 00:49:24.360
William Cheng: you know, what the virtualization issues are over here for virtual memory, you know, for virtual machines.

369
00:49:24.750 --> 00:49:31.140
William Cheng: Okay, I'm using the word virtual too many times. Again, it's going to get crazy. So yeah, we're going to come back and talk about this. And now we're going to go into

370
00:49:31.410 --> 00:49:43.980
William Cheng: the second part of chapter seven and look at the operating system support to use all this hardware. Okay. All right. This is actually a good time to break. So in part three, we're going to talk about the second part of chapter seven.