WEBVTT 1 00:00:01.829 --> 00:00:10.440 William Cheng: This is part three of lecture 17. We just finished talking about soft updates, so the next approach we're going to look at is transactions. 2 00:00:10.860 --> 00:00:16.710 William Cheng: And let's see why transactions give you better consistency than soft updates. Okay. 3 00:00:17.279 --> 00:00:28.620 William Cheng: The transaction is a concept from database systems, so we're going to look at a classical example of what a transaction is. The classical example is to transfer $100 4 00:00:29.250 --> 00:00:34.080 William Cheng: from one account to another account. Okay, so let's say that we have account number one over here 5 00:00:34.680 --> 00:00:48.360 William Cheng: that has $900, and account number two over here has $200, and you want to transfer $100 from A1 to A2, maybe because the second one is a checking account and you want to write a check for $300. 6 00:00:48.930 --> 00:00:56.790 William Cheng: There's really no primitive operation such as transferring money from one place to the other. And again, this is very similar to 7 00:00:57.450 --> 00:01:06.000 William Cheng: moving a file from one directory to another. In order to do that, you separate it into two operations, an unlink and a link. Okay, so over here, 8 00:01:06.390 --> 00:01:12.810 William Cheng: the way that we achieve this, moving money from one account to the other, is that we decrement the first account by $100 9 00:01:12.990 --> 00:01:26.190 William Cheng: and then increment the second account by $100. So we decrement account number one by $100 and increment account number two by $100, but we want to make sure that this is done in an atomic fashion. 
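The all-or-nothing transfer described above can be sketched in a few lines of Python. This is only an in-memory illustration, not how a database or file system does it: the function name `transfer` and the account names `A1` and `A2` are assumptions made for the example, and nothing here survives a crash.

```python
def transfer(accounts, src, dst, amount):
    """Return a new balances dict with `amount` moved from src to dst.

    All or nothing: on failure an exception is raised and the caller's
    dict is left untouched. In-memory illustration only.
    """
    staged = dict(accounts)      # work on a private copy
    staged[src] -= amount        # decrement the first account
    if staged[src] < 0:
        raise ValueError("insufficient funds")
    staged[dst] += amount        # increment the second account
    return staged                # caller installs the new state in one step


accounts = {"A1": 900, "A2": 200}
new_accounts = transfer(accounts, "A1", "A2", 100)
print(new_accounts)                                          # {'A1': 800, 'A2': 300}
print(sum(new_accounts.values()) == sum(accounts.values()))  # True
```

The point of the sketch is the invariant: the total across both accounts is the same before and after, and a failed transfer leaves the old balances exactly as they were.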
10 00:01:27.150 --> 00:01:35.340 William Cheng: Okay, because, again, once we perform the first operation, what if you have a system crash? Well, if you have a system crash, then 11 00:01:35.730 --> 00:01:43.530 William Cheng: account number one over here will go to $800 and account number two will still have $200, so in this case $100 sort of disappears. 12 00:01:44.160 --> 00:01:53.130 William Cheng: So what we want is to make sure that these two operations are done as one atomic operation. As we mentioned before, there's really no way to do it directly. You cannot tell the system not to crash. 13 00:01:53.910 --> 00:01:59.880 William Cheng: So one thing that you can do, and this is what the database people do, is put these two 14 00:02:01.080 --> 00:02:10.650 William Cheng: operations inside a transaction. Okay, so what's a transaction? A transaction is basically a contract that you make with a database system that either all of them are done, or none of them are done. 15 00:02:11.040 --> 00:02:23.730 William Cheng: Okay, so if none of them are done, you go back to the previous state. In that case, if you write a check, your check will bounce. But again, that's not as bad as $100 disappearing from your account. 16 00:02:25.020 --> 00:02:31.020 William Cheng: Okay. So again, it's not a perfect system. When you put them inside a transaction, that means either none of them is done 17 00:02:31.380 --> 00:02:39.780 William Cheng: or all of them are done. What's important over here is that at the end, the total of these two accounts is equal to the same number. 18 00:02:40.410 --> 00:02:44.640 William Cheng: Okay. 
So in this case, no money disappears and no extra money shows up. 19 00:02:45.060 --> 00:02:52.890 William Cheng: Because if extra money shows up, then the bank will lose money, and the bank certainly doesn't want that. So in the picture, we're going to put these two 20 00:02:54.120 --> 00:02:59.820 William Cheng: operations inside a transaction. As drawn here, this is an atomic operation. 21 00:03:00.450 --> 00:03:06.510 William Cheng: So again, what is a transaction? We're going to talk about what the properties of a transaction are, and then we're going to talk about how to implement a transaction. 22 00:03:07.230 --> 00:03:16.080 William Cheng: A transaction is something that satisfies the ACID property, and ACID is an acronym. A stands for, of course, atomic, so it's all or nothing. 23 00:03:16.770 --> 00:03:24.120 William Cheng: Either all of them are done, or none of them. How do we know whether all of them are done or none of them are done? There's going to be a time when we commit the transaction. 24 00:03:24.300 --> 00:03:30.150 William Cheng: That's known as the commitment time. At some point over here, we're going to commit the transaction. 25 00:03:30.420 --> 00:03:39.060 William Cheng: If the system crashes before the commitment time, none of them will be done. If the system crashes after the commitment time, all of them will be done. 26 00:03:39.990 --> 00:03:47.250 William Cheng: So the concept of a commitment time is very, very important. The commitment time is implementation dependent, and later on we're going to see how to actually implement 27 00:03:48.240 --> 00:03:53.550 William Cheng: this transaction commitment. 
So we know exactly when it's going to be all and when it's going to be nothing. 28 00:03:54.000 --> 00:04:02.190 William Cheng: The C stands for consistent. What it means is that when you run these transactions, you take the system from one consistent state to another consistent state. 29 00:04:02.790 --> 00:04:09.240 William Cheng: So if all you are doing is transactions, then your system will always be in a consistent state no matter when you crash. 30 00:04:09.720 --> 00:04:13.650 William Cheng: You can crash at a bad time or at a good time. Next time when you recover, you're always going to be in 31 00:04:14.010 --> 00:04:23.760 William Cheng: a consistent state, and that's why using transactions makes your system more resistant to crashes, because the system is always guaranteed to be in a consistent state. 32 00:04:24.930 --> 00:04:34.410 William Cheng: The I stands for isolated. It means that a transaction has no effect on other transactions until it is committed. So you can actually have 33 00:04:35.250 --> 00:04:41.940 William Cheng: many transactions happening inside your banking system, and they cannot interfere with each other. The basic rule that they 34 00:04:42.210 --> 00:04:46.710 William Cheng: follow is that there's no effect on other transactions until the transaction is committed. 35 00:04:46.920 --> 00:04:54.990 William Cheng: So when you're in the middle of a transaction and other transactions want to use the same information for account one or account two, they have to wait. They have to wait until the 36 00:04:55.530 --> 00:04:59.280 William Cheng: transaction is committed. Once it is committed, they can start using the new value. 
37 00:05:00.000 --> 00:05:05.880 William Cheng: Okay, so again, all these transactions are isolated; they're protected from each other until something is committed. 38 00:05:06.480 --> 00:05:07.770 William Cheng: The last one over here is durable. 39 00:05:08.100 --> 00:05:14.730 William Cheng: This one says that a transaction is persistent. You just signed a contract over here, so where do you store those contracts? Say you just store the contract in memory. 40 00:05:14.910 --> 00:05:21.930 William Cheng: Well, if you store the contract in memory and you have a system crash, you're going to lose it. So the requirement over here is that the transaction has to be made persistent, 41 00:05:22.080 --> 00:05:29.610 William Cheng: so that in case we have a system crash, you can actually go back and see what kind of transaction contracts I have 42 00:05:30.150 --> 00:05:34.470 William Cheng: on my side. In this case, I have to honor all these transactions. 43 00:05:35.280 --> 00:05:41.130 William Cheng: So this is what the ACID property stands for. Pretty soon, we're going to see how it's implemented. 44 00:05:41.670 --> 00:05:48.180 William Cheng: So once you start running transactions, and in this case we're now doing that on the file system, 45 00:05:48.870 --> 00:05:55.530 William Cheng: once you are running transactions to modify the file system, then the only way to modify the file system is to run transactions. 46 00:05:56.040 --> 00:06:06.300 William Cheng: Okay, even if you want to modify only one small part of the file system and it doesn't really affect anybody, you're not allowed to do that directly. You have to put the operation inside a transaction. 47 00:06:06.840 --> 00:06:13.320 William Cheng: Okay, because now it's inside a transaction. 
Then we have the ACID property, and therefore we're guaranteed that the system will always be in a consistent state. 48 00:06:14.610 --> 00:06:22.230 William Cheng: All right. So in a way, this is one of the reasons it will slow things down, because once you start running transactions, no matter what you do, to change even the tiniest 49 00:06:22.440 --> 00:06:30.030 William Cheng: amount, to touch the tiniest part of the file system, you have to run a transaction. So running transactions actually means that whenever you want to modify the file system, you run a transaction. 50 00:06:30.210 --> 00:06:36.210 William Cheng: If you just want to read it, that's okay. But if you want to modify it, you have to run transactions. Okay, very important to understand that. 51 00:06:37.650 --> 00:06:47.340 William Cheng: All right, so let's take a look at how to implement transactions. There are two major approaches. One is called journaling. A journal sounds like the log-structured file system. 52 00:06:47.910 --> 00:06:53.490 William Cheng: The idea here is that we're going to follow some of the ideas in the log-structured file system. What we're going to do is, 53 00:06:53.760 --> 00:07:01.620 William Cheng: before we modify the actual file system, we write information into the journal to say what we are about to do. 54 00:07:02.280 --> 00:07:07.350 William Cheng: It's almost like, let's say you're a forgetful person and you want to go to the supermarket to buy something. 55 00:07:07.620 --> 00:07:13.590 William Cheng: How do you make sure that you don't forget? Well, you make a shopping list. So the shopping list is a journal. It tells you what you have to buy 56 00:07:13.740 --> 00:07:18.480 William Cheng: from the supermarket. 
So in case you forget, you can always check your shopping list and compare it against your 57 00:07:18.810 --> 00:07:24.120 William Cheng: grocery bag, and you know what's been bought and what's missing, and you continue to buy all the stuff 58 00:07:24.930 --> 00:07:32.730 William Cheng: that's required. So in the end, you're guaranteed to get everything that's on the shopping list. Okay, so that's what the journal is. We're going to write different kinds of information into it. 59 00:07:33.300 --> 00:07:37.230 William Cheng: There are two different approaches. One is called undo journaling and the other one is called redo journaling. 60 00:07:37.440 --> 00:07:44.520 William Cheng: If you're using undo journaling, what you do is write the before-images of the disk blocks into the journal. 61 00:07:45.120 --> 00:07:53.040 William Cheng: So before you modify the file system, you look at the file system and say, here is a block that we're going to replace; I'm going to remember what the previous value was. 62 00:07:53.940 --> 00:08:03.090 William Cheng: This way, in case there's a system crash, I can actually go back to the previous state of the file system. And of course, by definition, the previous state of the file system is a consistent state. 63 00:08:04.080 --> 00:08:10.920 William Cheng: So if I store the before-images of the disk blocks inside the journal, that's called undo journaling. 64 00:08:11.340 --> 00:08:15.300 William Cheng: The other one is called redo journaling. In this case, I'm going to store the after-images of the disk blocks 65 00:08:15.600 --> 00:08:18.210 William Cheng: inside the journal before I write them to the disk. 66 00:08:18.510 --> 00:08:29.610 William Cheng: So in this case, what I would do is I will go to the buffer cache. 
I'm going to look at it and say, this is going to be my new data. Before I write the new data onto the disk, I need to write the after-images into the journal. 67 00:08:30.510 --> 00:08:36.660 William Cheng: Okay, so this is redo journaling. Typically, inside databases, you're going to see both kinds of journaling, undo journaling and redo journaling. 68 00:08:37.200 --> 00:08:45.240 William Cheng: Inside the file system, typically what we see is redo journaling. So for the purpose of this class, we're going to focus on redo journaling, and pretty soon we're going to see how it's done. 69 00:08:45.930 --> 00:08:54.000 William Cheng: There's also another approach called shadow paging. After we finish with journaling, we're going to take a look at shadow paging to see how that works. 70 00:08:56.220 --> 00:09:03.450 William Cheng: All right, a journal is a separate part of the disk. If you look at the disk over here, what's on the disk? There is the actual file system. 71 00:09:04.080 --> 00:09:12.390 William Cheng: So again, what's the actual file system? It's the file system hierarchy and also the free blocks. That's the actual file system. And we also know that there's swap space on the disk. 72 00:09:12.720 --> 00:09:20.160 William Cheng: Okay, so now if you are doing journaling, you also have a journal as part of your disk. There's going to be a dedicated space on the disk 73 00:09:20.460 --> 00:09:29.130 William Cheng: to hold your journal. Okay, so here's a way to think about it. 
It's kind of like a log-structured file system, because this journal is going to be append-only and never modified. 74 00:09:29.670 --> 00:09:36.810 William Cheng: But again, we mentioned before that the log-structured file system is not very useful, because in the end, when you write to the end of the disk, 75 00:09:37.260 --> 00:09:45.330 William Cheng: the file system becomes a read-only file system. So the journal cannot be like that. We learned our lesson from that, and we're going to use a journal in a slightly different way. 76 00:09:45.600 --> 00:09:50.970 William Cheng: But basically the idea of the journal is that it's append-only and it's never modified. There's one additional 77 00:09:51.540 --> 00:09:55.770 William Cheng: operation: you can clear the journal and reset it back to an empty journal. 78 00:09:56.280 --> 00:10:04.830 William Cheng: So that's the extra operation. We're going to see when you are allowed to perform that operation, because if you perform it at the wrong time, then your system cannot be recovered. 79 00:10:05.490 --> 00:10:15.420 William Cheng: All right, so the journal is a separate part of the disk. One of the most important things about journaling is that you can add journaling to any existing file system. 80 00:10:15.900 --> 00:10:23.340 William Cheng: If you have an actual file system that's not crash resilient, all you have to do is add journaling to it, and now it becomes a crash-resilient file system. 81 00:10:24.060 --> 00:10:32.430 William Cheng: That's why today journaling is very, very popular. All the general-purpose file systems are using journaling to provide crash resiliency. 82 00:10:33.090 --> 00:10:40.200 William Cheng: Also, the journal is append-only, like a log. Over here, the journal sort of goes this way. 
Right. So anytime you want to add something to the journal, you append, append, append. 83 00:10:40.620 --> 00:10:44.940 William Cheng: For redo journaling, you append what you're going to write to the main part of the disk, 84 00:10:45.900 --> 00:10:53.430 William Cheng: to the actual file system. So before I modify the actual file system one block at a time, I need to write the after-images into the journal. 85 00:10:53.850 --> 00:11:01.470 William Cheng: For example, here's my journal. I'm drawing it horizontally now. Again, it's append-only. So whenever I try to, 86 00:11:02.340 --> 00:11:06.990 William Cheng: well, we have the example from before where we have three disk blocks and we try to modify X, Y, and Z. 87 00:11:07.650 --> 00:11:12.990 William Cheng: What we're going to do is write the after-images, the new images, of the three blocks into the journal, 88 00:11:13.410 --> 00:11:21.630 William Cheng: called X, Y, and Z. We write X to the journal, Y to the journal, and Z to the journal before we modify the actual file system. Another way 89 00:11:21.930 --> 00:11:34.560 William Cheng: to think about it is that before we release them to the disk update task, we have to modify the journal first. Once we have modified the journal by appending the after-images of X, Y, and Z, we write a commit record. 90 00:11:35.460 --> 00:11:42.600 William Cheng: Over here I use this funny block to show the commit record. The commit record is one disk block in size, 91 00:11:42.870 --> 00:11:48.990 William Cheng: and the disk will guarantee that the commit record is either written to the disk or it's not written there. So it's all or nothing. 
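The redo-journaling order just described (after-images first, then one commit record, and only then release the blocks) can be sketched as follows. This is a minimal sketch under stated assumptions: a Python list stands in for the on-disk journal, and the names `COMMIT` and `log_transaction` are hypothetical. A real journal lives on disk, and the commit record must be verified as written before anything is released.

```python
# Hypothetical record layout: (kind, block_id, data)
COMMIT = ("commit", None, None)


def log_transaction(journal, after_images):
    """Append the after-images, then one commit record, to the journal.

    Returns the block ids that may now be released for write-back to the
    actual file system. Nothing is released before the commit record.
    """
    for block_id, data in after_images.items():
        journal.append(("data", block_id, data))   # append only, never modify
    journal.append(COMMIT)                         # this is the commitment time
    return list(after_images)


journal = []
released = log_transaction(journal, {"X": b"x-new", "Y": b"y-new", "Z": b"z-new"})
print(released)             # ['X', 'Y', 'Z']
print(journal[-1] == COMMIT)  # True
```

The design choice to notice is the ordering: the list of releasable blocks only exists after the commit record has been appended.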
92 00:11:49.620 --> 00:11:55.710 William Cheng: Okay, so when I write a commit record onto the disk, I can actually ask the disk to verify that the commit record is on the disk. 93 00:11:56.250 --> 00:12:05.760 William Cheng: The disk will say either it's on there or it's not on there. There's nothing in between. There's no "90% is on there." If only 90% of the commit record is on the disk, that means it's not on the disk. 94 00:12:06.510 --> 00:12:16.140 William Cheng: The reason is that the disk employs a powerful error-checking code 95 00:12:16.560 --> 00:12:23.970 William Cheng: to make sure that an entire disk block is either written to the disk or is considered not on the disk. So again, it's all or nothing, and it's easily verifiable. 96 00:12:24.360 --> 00:12:32.340 William Cheng: Once I have written the commit record to the journal and verified that the commit record has been written to the journal, that's my commitment time. 97 00:12:33.090 --> 00:12:43.080 William Cheng: If I can verify that the commit record has been written to the disk, then this transaction is committed. What does that mean? It means X, Y, and Z will eventually get modified inside the actual file system. 98 00:12:44.340 --> 00:12:53.490 William Cheng: I don't know when it's going to happen. What happens is that once I write X, Y, and Z to the journal, I write the commit record, and I verify the commit record, I release X, Y, and Z to the disk update task. 99 00:12:54.030 --> 00:12:59.130 William Cheng: The disk update task can update the actual file system at any time it wants. 100 00:12:59.550 --> 00:13:05.700 William Cheng: It could be in the next millisecond, the next second, or the next 10 seconds. It doesn't really matter. Okay. 
Because as long as I'm 101 00:13:06.450 --> 00:13:14.850 William Cheng: running transactions, I'm guaranteed that, since I've committed already, the system will always go into the new state. 102 00:13:15.240 --> 00:13:22.200 William Cheng: Of course, you want this to happen as quickly as possible, because there might be other transactions waiting for this transaction to commit. 103 00:13:22.590 --> 00:13:33.120 William Cheng: So you want things to be done as soon as possible. But again, there's no hurry. You can use a write-back mechanism to write blocks X, Y, and Z to the actual file system. 104 00:13:34.260 --> 00:13:41.100 William Cheng: So again, over here it says that when it comes to updating the disk, you write to the journal first and then you write the commit record, 105 00:13:41.250 --> 00:13:51.210 William Cheng: and then you write to the file system asynchronously, meaning that you release them to the disk update task so it can write them to the disk at any time, but only after the commit record has been written to the journal. 106 00:13:52.440 --> 00:13:55.980 William Cheng: Okay, so that's how you run transactions. So what if you have a crash? 107 00:13:56.580 --> 00:14:06.690 William Cheng: You could get a crash at any time. Take the scenario here: we write X, Y, and Z, followed by a commit record, and then we release X, Y, and Z to be written out to the actual file system. Then 108 00:14:06.990 --> 00:14:11.370 William Cheng: the next ones are A, B, C, and D over here, and then we write a commit record again. 109 00:14:11.550 --> 00:14:21.600 William Cheng: Then we release A, B, C, and D to be written out to the file system. At this point we don't know if X, Y, Z, A, B, C, and D have gone to the file system. Then we write E and F, and now we get a system crash. 
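For the scenario above, a small sketch can show which after-images count as committed when the journal is scanned after the crash. The representation is an assumption made for illustration: each journal entry is just a block name, and the string `"COMMIT"` stands in for a commit record. The function name `split_committed` is hypothetical.

```python
def split_committed(journal):
    """Scan the journal in order.

    Returns (committed, uncommitted): blocks followed by a commit record,
    and trailing blocks that never got one.
    """
    committed, pending = [], []
    for record in journal:
        if record == "COMMIT":
            committed.extend(pending)   # everything since the last commit counts
            pending = []
        else:
            pending.append(record)
    return committed, pending


# X, Y, Z + commit; A, B, C, D + commit; E, F; then the crash.
journal = ["X", "Y", "Z", "COMMIT", "A", "B", "C", "D", "COMMIT", "E", "F"]
committed, uncommitted = split_committed(journal)
print(committed)     # ['X', 'Y', 'Z', 'A', 'B', 'C', 'D']
print(uncommitted)   # ['E', 'F']
```

E and F have no commit record after them, so under the all-or-nothing rule they are treated as if they never happened.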
110 00:14:23.280 --> 00:14:27.210 William Cheng: Okay, so we got a system crash. What should we do? Let's take a look at this example here. 111 00:14:29.700 --> 00:14:30.270 William Cheng: All right, so 112 00:14:31.380 --> 00:14:39.330 William Cheng: let me go back to this example over here. Maybe I shouldn't have cleaned it out. All right, so when I reboot 113 00:14:39.870 --> 00:14:47.130 William Cheng: the system, because I lost power or had an operating system crash, next time when I reboot the system, 114 00:14:47.370 --> 00:14:55.740 William Cheng: the first thing I need to do is determine whether the file system can be used or not. Okay, so the file system, sorry, is this guy over here. 115 00:14:56.460 --> 00:15:08.250 William Cheng: Let me erase this. Okay, so there's the actual file system over here, and X, Y, and Z are part of the actual file system. What I need to do is determine whether the actual file system is in a consistent state or not. 116 00:15:09.300 --> 00:15:14.100 William Cheng: So what should I do? I look at the journal to see if the journal is empty. 117 00:15:14.250 --> 00:15:18.360 William Cheng: If the journal is empty, then I know I can use the actual file system. If the journal is not empty, 118 00:15:18.480 --> 00:15:25.770 William Cheng: then I have to make sure that all the data that's supposed to go onto the disk has gone to the disk. So what I would do is scan the journal over here 119 00:15:25.890 --> 00:15:32.700 William Cheng: to look for transactions. One transaction is X, Y, and Z followed by the commit record, then A, B, C, and D followed by the commit record, and then E and F. 120 00:15:33.540 --> 00:15:41.010 William Cheng: So in this case, I scan the journal and find X, Y, and Z. 
And I find the commit record, which means that X, Y, and Z need to go onto the disk. 121 00:15:41.520 --> 00:15:44.670 William Cheng: I don't really know whether X, Y, and Z have gone to the disk yet, 122 00:15:44.850 --> 00:15:53.490 William Cheng: because once I release them to the disk update task, I have no idea when they're going to be written. So what I do is start all over again and copy X, Y, and Z over here to the disk again. 123 00:15:53.670 --> 00:15:59.370 William Cheng: It can be done asynchronously; I don't know when it's going to be done. Then I continue to read the journal to look for the next commit record. 124 00:15:59.550 --> 00:16:10.140 William Cheng: When I find the next commit record, I also release A, B, C, and D to the disk update task and say, hey, write these after-images into the actual file system. What if I get a crash right now again? 125 00:16:11.040 --> 00:16:21.210 William Cheng: Then I do the same thing over again. Next time when I reboot, I check whether the journal is empty. The journal is not empty, so I find the first transaction and release X, Y, and Z 126 00:16:21.390 --> 00:16:29.520 William Cheng: to the disk update task, and then I find the second commit record over here and release A, B, and C. Let's say I haven't even released D and I get another crash. 127 00:16:30.390 --> 00:16:35.310 William Cheng: It doesn't really matter how many times I crash. I keep doing this over and over again. So after the next crash over here, again 128 00:16:35.460 --> 00:16:45.330 William Cheng: I find a commit record and release X, Y, and Z to the actual file system, find another commit record and release A, B, C, and D, and then I won't be able to find any more commit records. Therefore, I know that I'm done. 129 00:16:46.080 --> 00:16:51.990 William Cheng: Okay, so now. 
Now all I have to do is wait. I need to wait for the disk update task to finish updating the disk. 130 00:16:52.650 --> 00:17:05.100 William Cheng: At this time, I'm still in the middle of the boot, but now I'm in the recovery process because I need to make sure that the file system goes to a consistent state. Eventually, when the disk update task has finished writing all this data out 131 00:17:06.720 --> 00:17:12.000 William Cheng: to the actual file system, and I don't get a crash, is the file system in a consistent state now? 132 00:17:13.560 --> 00:17:19.290 William Cheng: Well, the answer is yes, because I've written all this data out to the disk, and by the definition of a transaction, 133 00:17:19.590 --> 00:17:24.030 William Cheng: I'm at the end of all these transactions. Therefore, the file system has to be in a consistent state. 134 00:17:24.570 --> 00:17:30.300 William Cheng: Okay, so once the file system is in a consistent state, the next thing I do is delete the entire journal. 135 00:17:31.260 --> 00:17:44.340 William Cheng: This way, next time when I reboot the system, I will see that the journal is empty, and therefore I know my file system is in a consistent state. All right, so this is how you perform recovery. 136 00:17:45.780 --> 00:17:49.770 William Cheng: Okay, so let's take a look at our original example over here. We have three, 137 00:17:50.220 --> 00:17:59.640 William Cheng: three disk blocks X, Y, and Z. There are some dependencies, and now we don't care about the dependencies anymore. We know that X, Y, and Z are the dirty blocks inside our buffer cache. 138 00:17:59.880 --> 00:18:07.050 William Cheng: Also, we know that we modify them together when we perform a file system operation. 
So what we're going to do is run a transaction. 139 00:18:07.230 --> 00:18:15.570 William Cheng: We take the after-images of X, Y, and Z, first write them to the journal, and then write a commit record. Once we finish writing the commit record, we can release them 140 00:18:16.020 --> 00:18:20.190 William Cheng: to be written out to the disk. Over here I show you that you can write them in any order you want. 141 00:18:20.790 --> 00:18:30.150 William Cheng: So over here it's Y, Z, and X. We release them for writing only after we verify that the commit record has gone to the journal. Then we can release them 142 00:18:30.660 --> 00:18:31.830 William Cheng: and they can be written out to the disk. 143 00:18:32.820 --> 00:18:38.790 William Cheng: The way this particular operation is done is that inside the file system, 144 00:18:38.970 --> 00:18:47.700 William Cheng: if you're implementing journaling, then periodically it will go into the buffer cache over here and find all the transactions. So what is a transaction? 145 00:18:48.030 --> 00:19:00.600 William Cheng: A transaction is a data structure. So again, your file system will get a little more complicated. Whenever you try to modify the actual file system, you need to start a transaction and put all these after-images into the transaction, and then 146 00:19:02.340 --> 00:19:09.090 William Cheng: when you're done with that, you write another record, which is the end of the transaction. 147 00:19:09.690 --> 00:19:15.810 William Cheng: So a transaction looks like this. There are three operations, X, Y, and Z. Here's the beginning of the transaction and the end of 148 00:19:16.320 --> 00:19:20.940 William Cheng: the transaction. So you write the end of the transaction over here. 
Then you give this data structure over here 149 00:19:21.420 --> 00:19:30.270 William Cheng: to the transaction manager inside the file system, and periodically the transaction manager takes all the transactions and starts writing them into the journal. 150 00:19:31.050 --> 00:19:43.290 William Cheng: So again, the way it writes to the journal is: write all these after-images, write a commit record, release them to the disk update task, and then write after-images, write a commit record. It keeps doing this until all the transactions over here have gone to the journal. 151 00:19:44.520 --> 00:19:51.870 William Cheng: Then it waits for the disk update task to finish updating all the blocks. When that's done, it will be able to delete the journal. 152 00:19:53.730 --> 00:19:55.260 William Cheng: All right, so the recovery. 153 00:19:55.680 --> 00:20:04.170 William Cheng: I mentioned it already. What you do in recovery is this: you get a system crash at some point; you cannot predict when it's going to happen. You can lose power, or you can have an operating system crash. 154 00:20:04.440 --> 00:20:10.320 William Cheng: When you are in the recovery of your file system, the first thing you need to do is check the journal to see if the journal is empty. 155 00:20:10.560 --> 00:20:14.790 William Cheng: If the journal is empty, your recovery is done, and you know the file system is in a consistent state. 156 00:20:15.030 --> 00:20:20.760 William Cheng: Otherwise you perform recovery with the operation I just mentioned. Again, you wait for the recovery to be done. 157 00:20:21.120 --> 00:20:30.180 William Cheng: If you get another crash, you repeat this operation over and over again. If you don't get a crash, you reach the end of your recovery. 
Then you can clear the journal, because now you know the file system is in a consistent state. Yeah. 158 00:20:30.960 --> 00:20:42.480 William Cheng: Alright, so after you are completely done with the recovery, the state of the file system is what it was at the end of the last committed transaction inside the journal. So therefore, by the ACID property, 159 00:20:42.930 --> 00:20:54.720 William Cheng: your file system is in a consistent state. Okay, so this is one of the reasons why today, you know, when you're using a file system, you'll find that the system can crash, you can lose power, and when you reboot the system, the file system will be in a consistent state. 160 00:20:55.890 --> 00:21:06.450 William Cheng: Okay. So, therefore, you know, this is a really, really powerful idea, and the way this is done is by, you know, borrowing the idea from database systems, by running transactions inside your file system. Okay. 161 00:21:08.160 --> 00:21:18.330 William Cheng: All right, so let's go back to our, you know, original example. Here we modify x, y, and z, right? So now the only thing that matters is whether you get a system crash before the commit record is written to the disk 162 00:21:18.510 --> 00:21:27.780 William Cheng: or after the commit record has been written out to the disk. Okay, so these are the only two different times that you actually worry about, whether it crashed before or after, right? So if you crash before this, 163 00:21:28.290 --> 00:21:36.990 William Cheng: okay, so if you crash before the commit record is written to the disk, you're guaranteed that, you know, x, y, and z have not been written out to the actual file system. Why is that? 164 00:21:38.280 --> 00:21:47.250 William Cheng: Right, because, you know, we are not allowed to release these disk blocks over here to the disk update task until we verify the commit record is written out to the disk. So suppose there's no commit record.
165 00:21:47.430 --> 00:21:56.400 William Cheng: Then we know that none of x, y, and z have gone to the disk. So this way, when we reboot the system, what we're going to see is that the data on the disk will look like this. 166 00:21:57.390 --> 00:22:03.060 William Cheng: Okay, what if you get a crash afterwards? Right. So let's say that we crash immediately after the commit record is written. 167 00:22:03.270 --> 00:22:12.870 William Cheng: Even though x, y, and z have not been written out to the disk yet, we know that if we perform the recovery operation over and over again, sooner or later x, y, and z will be written out to the disk. 168 00:22:13.200 --> 00:22:22.440 William Cheng: Okay. So therefore, we know that in the end, when the recovery is done, the file system is guaranteed to go into the new state, and the new state over here, again, is going to be a consistent state. 169 00:22:24.030 --> 00:22:30.030 William Cheng: All right. Similarly, for the other example over here, again, we still have x depends on y, which depends on z, and we do exactly the same thing. 170 00:22:30.270 --> 00:22:41.190 William Cheng: And similarly, there's the one where we move a file from one place to the other. In that case, we have two records that are changed, x prime and y prime, so we're going to put x prime and y prime into our transaction. So again, it's either all or nothing. 171 00:22:43.980 --> 00:22:56.430 William Cheng: All right, so, let's say you just downloaded Ubuntu 16.04; it's a 1.5 gigabyte file. Okay. So, in that case, how big does the journal have to be in order for you to write the after images of that file onto the disk? 172 00:22:56.910 --> 00:23:02.820 William Cheng: Okay, so again the journal is only a part of the disk.
Over here is the journal, right? If you want to write everything 173 00:23:03.060 --> 00:23:10.080 William Cheng: that you want to write to the actual file system into the journal first, that means that the journal has to be as big as the largest file that you ever download. 174 00:23:10.560 --> 00:23:14.790 William Cheng: Okay, or even bigger, because you might download two files or three files like that, right? So there are 175 00:23:15.090 --> 00:23:21.600 William Cheng: actually two different options. One is to journal everything. So if you are journaling everything, that means that nothing can be lost, right, it's either all or nothing. 176 00:23:22.050 --> 00:23:26.820 William Cheng: Okay, and the other way is to journal metadata only. Remember what the metadata is? 177 00:23:27.330 --> 00:23:34.680 William Cheng: Okay, when we talked about the inode, right, in the S5 file system, the inode is part of the metadata, and also 178 00:23:34.950 --> 00:23:42.390 William Cheng: the disk map inside the inode points to the indirect block, the double indirect block, the triple indirect block; they are all metadata. 179 00:23:43.140 --> 00:23:50.700 William Cheng: Okay, so in that case we can actually journal all the metadata, because even for a really, really big file the metadata is actually pretty small, and it will fit inside the journal. 180 00:23:51.060 --> 00:23:58.110 William Cheng: Okay. So in this case, we will not journal the actual data for a file. Okay. So this means that, you know, 181 00:23:58.920 --> 00:24:02.940 William Cheng: when you try to save a file, it is possible that your file will be corrupted, 182 00:24:03.240 --> 00:24:11.550 William Cheng: will actually get corrupted. So, for example, you're using Microsoft Word, and then, you know, right at the time when you hit the Save button, you have a power outage.
183 00:24:12.510 --> 00:24:22.380 William Cheng: Then it is possible that next time when you boot up your file system, as it turns out, we are going to recover all the metadata, but we will not be able to recover the actual data for the file. 184 00:24:22.980 --> 00:24:31.260 William Cheng: Okay, lots of systems do this. Why do they do that? It's because they don't want to journal everything; that way the journal would have to be really, really big, and it would take up a lot of disk space. 185 00:24:31.590 --> 00:24:41.370 William Cheng: Okay, so in order for them to save space, they will journal the metadata only. So, in this case, whenever you have a crash, the file system hierarchy is still, 186 00:24:41.880 --> 00:24:49.200 William Cheng: the file system data structure is still going to be intact. Okay. The only thing that might not be intact is the content of a regular file. 187 00:24:49.800 --> 00:24:57.480 William Cheng: Okay, so all the other files, device files, directories, they are all going to be kept intact, so they're actually quite resilient, 188 00:24:58.020 --> 00:25:08.910 William Cheng: except for regular files. Okay, so, you know, again, there's a comment right here that says, in general, it's extremely costly if you want to make sure that data is never lost. 189 00:25:09.690 --> 00:25:17.040 William Cheng: Okay, if I want to make sure that, you know, not a single file is ever lost. Again, I mentioned before, right, there are some banking systems that are not allowed to lose any data; there are laws. 190 00:25:17.190 --> 00:25:21.870 William Cheng: So in that case, they're willing to spend a lot of money.
They make sure to buy all these big power supplies so that the system is never down. 191 00:25:22.020 --> 00:25:29.940 William Cheng: And also they make sure that they are journaling everything, because, you know, if you're willing to spend a lot of money, you will buy these huge disks and you will have a huge journal. So in this case, you would never lose any data. 192 00:25:30.630 --> 00:25:33.960 William Cheng: Okay, so this can be done, but typically in a general-purpose operating system, 193 00:25:34.590 --> 00:25:40.830 William Cheng: you know, we will actually allow you to lose some of the file data, but we will not allow you to corrupt the file system data structure. 194 00:25:41.610 --> 00:25:50.220 William Cheng: Okay, so again, the directory content has to be intact, even though, for a regular file, the data, you know, can actually be corrupted. Yeah. 195 00:25:50.880 --> 00:25:57.060 William Cheng: So why is that? You know, if you think about it, right, when you are 196 00:25:57.360 --> 00:26:04.950 William Cheng: clicking the Save button on your Microsoft Word document, right, you've been modifying something, you spent the last, you know, I don't know, five minutes or something like that, and then you press Save. 197 00:26:05.760 --> 00:26:17.040 William Cheng: Okay, so at the time when you press Save, you lose power. Okay. Do you know, when you clicked on the Save button, whether it was immediately after you lost power or immediately before you lost power? 198 00:26:18.090 --> 00:26:28.980 William Cheng: Okay, if you clicked on Save right after you lost power, you know, if the file system doesn't remember what modifications you made, well, that's understandable, right, because you pressed the button right after you lost power. So, of course, there's no guarantee. 199 00:26:29.460 --> 00:26:40.140 William Cheng: Right.
But if you press the button, you know, one microsecond, or one nanosecond, right before you lose power, and you demand that the file system must keep all your data, that seems to be a little unrealistic. 200 00:26:40.710 --> 00:26:46.050 William Cheng: Okay, because it is possible that you actually, you know, pressed the button right after you lost power, but there's no way for you to know. 201 00:26:47.580 --> 00:26:55.860 William Cheng: Okay, so in that case, again, it's very, very expensive to make sure you don't lose any of your modifications. So therefore, it is possible that when you try to save something you're going to end up losing, 202 00:26:56.460 --> 00:27:06.930 William Cheng: you know, maybe the last few minutes of your work. Okay. So yeah, even though you're using a system that's quite resilient, you still should back up your files because, you know, once in a while, 203 00:27:07.890 --> 00:27:10.440 William Cheng: you're still going to end up losing some file content. 204 00:27:11.250 --> 00:27:24.360 William Cheng: That is, unless you are building a banking system where you're willing to spend a lot of money; in that case, you can make sure that data is never lost. For a regular system, you always need to be ready for some data to be lost, right, because otherwise, again, it's going to be very expensive. 205 00:27:27.300 --> 00:27:30.150 William Cheng: One thing I want to briefly mention is the Linux implementation. Linux 206 00:27:33.480 --> 00:27:43.830 William Cheng: ext2 is an FFS clone. It's exactly the same implementation as the fast file system, okay? ext3 is simply ext2 plus journaling. 207 00:27:44.400 --> 00:27:51.210 William Cheng: Okay, so that's why in ext2 there's a /lost+found directory; starting with ext3 and ext4, they don't have a lost+found directory anymore. 208 00:27:51.630 --> 00:28:00.210 William Cheng: Okay.
So they basically implement transactions, just like what we're doing over here. They also include one extra operation, known as the checkpoint operation. 209 00:28:00.690 --> 00:28:08.400 William Cheng: Okay, so why is that? It's because, again, inside your disk over here, the journal is going to keep growing and growing and growing. 210 00:28:08.880 --> 00:28:11.940 William Cheng: So once in a while, you have to clear your journal, right? So what you can do is that once in a while, 211 00:28:12.090 --> 00:28:21.240 William Cheng: inside the Linux operating system, they will have a thread inside your kernel that tries to write all the data, you know, for the disk update task, out to the disk, and then 212 00:28:21.510 --> 00:28:29.940 William Cheng: what happens is that it's going to make sure all the data went onto the disk, and once they verify that all the data inside the journal has gone onto the disk, what you can do is actually clear the journal. 213 00:28:30.690 --> 00:28:41.880 William Cheng: Okay. They call this operation, unfortunately, a checkpointing operation. Right. Why do I say unfortunately? Because inside the log-structured file system we also have something called the checkpoint file, which has nothing to do with checkpointing. 214 00:28:42.990 --> 00:28:53.310 William Cheng: Okay, so again, what's important about checkpointing is that, for Linux, after you perform the checkpoint operation, you can clear the journal. So this way, the journal doesn't have to be very big. Because otherwise, pretty soon 215 00:28:53.580 --> 00:28:58.470 William Cheng: you can actually use up the entire journal and it will be useless. So you've got to be able to clear the journal once in a while. 216 00:29:00.330 --> 00:29:10.410 William Cheng: Alright, so again, some people confuse journaling with log-structured file systems. Right.
So again, they are completely different things, even though some of the things they use are similar ideas. So over here, here are the important differences. 217 00:29:10.710 --> 00:29:17.400 William Cheng: For the log-structured file system, the purpose of the log-structured file system is to show that good write performance can be achieved. 218 00:29:17.820 --> 00:29:25.530 William Cheng: Okay, that's the main purpose of the log-structured file system, and it's an experimental research file system; people built it just to see if that can be done. Okay. 219 00:29:26.040 --> 00:29:35.730 William Cheng: They have coarse-grained recovery using the checkpoint file, right, they use double buffering for the checkpoint file. Again, you know, there's a name conflict there. And also, the log-structured file system is a file system. 220 00:29:36.510 --> 00:29:49.440 William Cheng: Okay, journaling is a technique that can be used to provide crash resiliency, and its only purpose is to provide crash resiliency. It's not to, you know, demonstrate that you can achieve 100% of the write capacity of your hard drive. 221 00:29:49.860 --> 00:29:56.610 William Cheng: Journaling can be added to any existing file system. Okay. And finally, journaling uses checkpointing to perform write-backs. 222 00:29:57.120 --> 00:30:07.110 William Cheng: The purpose of that is to be able to clear the journal. If it turns out that in the middle of checkpointing you crash, well, in that case, you go to the recovery phase again, and then in the end, 223 00:30:07.950 --> 00:30:14.400 William Cheng: once you perform the recovery phase, you're going to make sure that the file system goes into a consistent state, and then you can also clear the journal. 224 00:30:15.540 --> 00:30:18.840 William Cheng: Alright. So again, these two things are completely different. Okay.
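The journaling rules from this part of the lecture (write the after images, then the commit record, and only then release the blocks to the disk; on recovery, replay committed transactions and ignore the rest; clear the journal afterward) can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class `JournaledStore`, its method names, and the `demo` function are hypothetical, and a Python dictionary and list stand in for the actual file system and the on-disk journal.

```python
class JournaledStore:
    """Toy model (hypothetical): `disk` is the actual file system, `journal` is the log."""

    def __init__(self):
        self.disk = {}      # block name -> contents
        self.journal = []   # append-only list of journal records

    def log_after_images(self, txid, after_images):
        # Step 1: the after images go to the journal, not to the disk.
        for block, data in after_images.items():
            self.journal.append(("after", txid, block, data))

    def log_commit(self, txid):
        # Step 2: the commit record marks the transaction as "all".
        self.journal.append(("commit", txid))

    def _committed(self):
        return {rec[1] for rec in self.journal if rec[0] == "commit"}

    def _replay(self, txids):
        # Copy after images of the given transactions to the disk, in journal order.
        for rec in self.journal:
            if rec[0] == "after" and rec[1] in txids:
                self.disk[rec[2]] = rec[3]

    def write_back(self, txid):
        # Step 3: blocks may be released only once the commit record is in the journal.
        if txid not in self._committed():
            raise RuntimeError("commit record not in the journal yet")
        self._replay({txid})

    def recover(self):
        # After a crash: redo committed transactions, drop uncommitted ones,
        # then clear the journal. Replaying is safe to repeat after another crash.
        self._replay(self._committed())
        self.journal.clear()


def demo():
    fs = JournaledStore()
    fs.disk.update({"x": "old x", "y": "old y", "z": "old z"})

    # Crash before the commit record: recovery leaves the old state.
    fs.log_after_images(1, {"x": "new x", "y": "new y", "z": "new z"})
    fs.recover()
    before = dict(fs.disk)

    # Crash right after the commit record: recovery produces the new state.
    fs.log_after_images(2, {"x": "new x", "y": "new y", "z": "new z"})
    fs.log_commit(2)
    fs.recover()
    after = dict(fs.disk)
    return before, after
```

In this model, `demo()` returns the old contents for the crash-before-commit case and the new contents for the crash-after-commit case, which is the all-or-nothing behavior described above. A checkpoint would correspond to calling `write_back` for every committed transaction and then clearing the journal.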
225 00:30:20.700 --> 00:30:30.270 William Cheng: So the last approach that's based on transactions is called shadow paging. The textbook says this one is refreshingly simple. We're going to see how refreshing it is. 226 00:30:30.720 --> 00:30:40.260 William Cheng: It's based on the copy-on-write idea. But in this case, we're going to use it on the file system. Copy-on-write, we used it to implement virtual memory, right? But now we're going to use it in the file system. Okay. 227 00:30:40.890 --> 00:30:45.780 William Cheng: Some examples that use shadow paging: one is the WAFL file system, that's done by Network Appliance. 228 00:30:46.200 --> 00:30:55.380 William Cheng: Network Appliance is very famous for network-attached storage devices, right, you can actually buy a hard drive from, you know, Amazon or Fry's, plug it into your network, and it becomes a network drive. 229 00:30:55.800 --> 00:31:06.540 William Cheng: Okay, so that's who they are. The other example is Sun Microsystems' ZFS. My understanding was that, you know, for Mac OS X, you know, 230 00:31:07.680 --> 00:31:11.430 William Cheng: with Darwin, before they had their final public release, 231 00:31:12.450 --> 00:31:20.130 William Cheng: at some point they were actually using ZFS, which uses shadow paging. Okay. But as it turns out, in the end they decided to go with HFS+, 232 00:31:20.670 --> 00:31:27.570 William Cheng: you know, which doesn't use shadow paging; it actually uses transactions, and that was the final file system that they went with. Yeah. 233 00:31:28.290 --> 00:31:29.790 William Cheng: Right, so, so, next. 234 00:31:30.150 --> 00:31:38.070 William Cheng: What is shadow paging? So again, if you look at all the disk blocks in your file system hierarchy, you can arrange them into a tree structure, right? Here's the root inode, 235 00:31:38.190 --> 00:31:44.970 William Cheng: and let's look inside.
Inside the inode you have the disk map, and the disk map points to all these data blocks. Inside the data blocks, 236 00:31:45.450 --> 00:31:53.970 William Cheng: you have the directory entries. Inside the directory entries, they contain inode numbers, and the inode numbers point to other inodes, and then in the end you can sort of draw this entire hierarchy. 237 00:31:54.870 --> 00:31:59.790 William Cheng: Okay, so what we're going to do is, you know, we're using the shadow paging technique over here; we're going to 238 00:32:00.390 --> 00:32:07.080 William Cheng: take a look at all these disk blocks that form a hierarchy, and we're going to consider all these disk blocks read-only and apply copy-on-write to them. 239 00:32:07.980 --> 00:32:17.340 William Cheng: Okay. So in this case, for example, if you want to modify this disk block, what do you have to do? You have to apply copy-on-write, so you're going to make a copy of it, right? So I'm going to ask, you know, the actual file system 240 00:32:17.730 --> 00:32:23.610 William Cheng: for a new disk block. I'm going to copy the data over here, right, now this one will be the new one, and I'm going to modify the copy. 241 00:32:24.120 --> 00:32:29.640 William Cheng: Okay, because the original data over here is considered read-only. So, therefore, I need to copy. All right. So when I finish doing that it will look like this. 242 00:32:30.420 --> 00:32:37.290 William Cheng: Alright. But in this case, if I start from the root of my directory hierarchy, I would never find this disk block, because over here this pointer is pointing to the wrong place. 243 00:32:38.100 --> 00:32:43.410 William Cheng: So therefore I need to change this pointer to point to the blue block over here. So, again, I need to change the pointer.
244 00:32:43.590 --> 00:32:46.560 William Cheng: To do that, I again need to apply copy-on-write; I need to ask 245 00:32:46.740 --> 00:32:54.990 William Cheng: the file system over here for a new block and copy this disk block over here. When you copy the pointers, they will point to the same place. So therefore, the pointer on the left will point right here, 246 00:32:55.170 --> 00:32:59.610 William Cheng: and the pointer on the right will point right here. I'm going to change the first pointer over here to point to the blue block. 247 00:33:00.300 --> 00:33:03.120 William Cheng: Okay, so I apply copy-on-write, and therefore it will look like this. 248 00:33:03.960 --> 00:33:12.930 William Cheng: But again, none of the blue blocks can be reached from the root of the file system hierarchy. So in this case I need to modify this pointer over here. So, again, I ask the actual file system for a new disk block, 249 00:33:13.110 --> 00:33:18.750 William Cheng: make a copy of it, so they will both point right here, and then I change this pointer to point right here, so it will look like this. 250 00:33:18.930 --> 00:33:24.420 William Cheng: So again, all the blue ones are not reachable, so therefore I'm going to make a copy of this one over here. 251 00:33:24.630 --> 00:33:32.220 William Cheng: And then when I copy the pointers, they point to the same place, so all three pointers point right here. The last pointer I need to change to point to the new one. So it will look like this. 252 00:33:32.610 --> 00:33:39.270 William Cheng: So again, I'm not done, right? The root over here, I need to perform copy-on-write. I'm going to make a copy of the root. The left pointer points here, the right pointer points right here. 253 00:33:39.480 --> 00:33:44.520 William Cheng: I then change the right pointer to point here, and then it will look like this. What if I get a crash right now? 254 00:33:45.270 --> 00:33:54.600 William Cheng: If I get a crash right now, the root of the file system is still here. So look at all the blue blocks over here.
These blue blocks are basically like an innocuous inconsistency; I will not be able to find them. 255 00:33:55.230 --> 00:33:59.610 William Cheng: Right. So in this case, I go back to the previous state of the file system, so it will still be a consistent state. 256 00:34:00.060 --> 00:34:09.600 William Cheng: Okay, so in that case how do I actually move to the new state of the file system? All I have to do is go to the superblock and tell the superblock that here is the root of the file system hierarchy. 257 00:34:10.260 --> 00:34:12.900 William Cheng: Okay, so before, inside the superblock over here, this one is the root of the 258 00:34:13.920 --> 00:34:23.640 William Cheng: hierarchy, and now I just need to modify the information to say that this new one is the root of the hierarchy. Yeah. As soon as I do that, that's like the commit time. 259 00:34:24.270 --> 00:34:29.160 William Cheng: When I modify the superblock to say that this one is going to be the new file system hierarchy, that's my 260 00:34:29.580 --> 00:34:35.280 William Cheng: commit time. If I get a crash before that, I go to the yellow file system over here. If I 261 00:34:35.700 --> 00:34:41.610 William Cheng: get a crash after I've made the commitment, well, in this case, next time when I reboot, I go to the superblock, and the superblock will say 262 00:34:41.820 --> 00:34:50.460 William Cheng: this is the current root of the file system hierarchy. So in this case, I'm going to find all my changes, and all these changes will take the file system into a consistent state.
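The copy-on-write walk from the modified block up to the root, followed by the single superblock update that acts as the commit, can be sketched as follows. This is a minimal in-memory illustration, not how WAFL or ZFS actually implement it: the `Block` and `Superblock` classes, the `cow_update` function, and the child names "a", "b", and "x" are all hypothetical.

```python
class Block:
    """A disk block treated as read-only once written: data plus child pointers."""

    def __init__(self, data=None, children=None):
        self.data = data
        self.children = dict(children or {})


def cow_update(node, path, new_data):
    """Return a new root in which the block reached by `path` holds `new_data`.

    Every block along the path is copied (the "blue" blocks); blocks off the
    path are shared with the old tree, and the old tree is never modified.
    """
    copy = Block(node.data, node.children)   # copies the pointers, not the children
    if not path:
        copy.data = new_data
    else:
        name = path[0]
        copy.children[name] = cow_update(node.children[name], path[1:], new_data)
    return copy


class Superblock:
    """Holds the one pointer that says where the root of the hierarchy is."""

    def __init__(self, root):
        self.root = root


# Old ("yellow") hierarchy: root -> a -> x, and root -> b.
leaf = Block("old")
other = Block("untouched")
sb = Superblock(Block(children={"a": Block(children={"x": leaf}), "b": other}))
old_root = sb.root

# Build the new ("blue") path. A crash here is harmless: sb.root is still
# old_root, so none of the new blocks can be reached.
new_root = cow_update(sb.root, ["a", "x"], "new")

# Commit time: one update of the root location in the superblock.
sb.root = new_root
```

After the last line, a lookup starting from `sb.root` finds the new data, while `old_root` still describes the complete previous state. Block "b" is shared by both trees because it was not on the modified path.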
263 00:34:52.320 --> 00:35:01.260 William Cheng: Okay, so this is the idea of shadow paging, right? The entire idea over here is that you want to use copy-on-write, but you use it on disk blocks instead of, you know, on pages in virtual memory. 264 00:35:02.760 --> 00:35:08.640 William Cheng: Alright, so here, when the root location is written onto the disk, it's like a commit record. So, therefore, that will give you the commit time. Yeah. 265 00:35:10.500 --> 00:35:19.020 William Cheng: Alright, so we are done with crash recovery and crash resiliency. So the next thing we're going to look at is how to implement directories. Okay. 266 00:35:19.380 --> 00:35:25.650 William Cheng: So we're going to look at directories, and also we're going to sort of briefly look at name-space management. We're only going to look at the beginning part of this. 267 00:35:26.400 --> 00:35:34.350 William Cheng: So, here are some of the desirable properties of a directory, you know. Number one is that there should be no restriction on the component name; a component name should be as long as you want. 268 00:35:35.430 --> 00:35:44.430 William Cheng: Okay, so what we've done in the S5 file system, and even the RAM file system, is to have a fixed-size record. That's really not a good way to go, because if you want to use a really long file name, 269 00:35:44.730 --> 00:35:59.910 William Cheng: you know, you're going to get a "name too long" error message, and then you're not allowed to create that component. The other requirement over here is that, you know, the file system has to be fast. We mentioned before that, you know, a directory file is basically, 270 00:36:01.170 --> 00:36:14.100 William Cheng: essentially, like a file. So if you try to look up something inside a directory file, it can take a very long time. Okay, so for example, in a Linux system, think about every time you type a command.
Say you type warmup1. 271 00:36:15.360 --> 00:36:22.500 William Cheng: Okay, what about warmup1? If you try to type that into your terminal, it will say warmup1: command not found. Okay, so how, 272 00:36:23.280 --> 00:36:32.670 William Cheng: how does it know that this command is not found, right, even though warmup1 is in the current working directory? So that's why I always run it with ./ in front. In this case, it will find warmup1 in the current directory. 273 00:36:33.030 --> 00:36:40.290 William Cheng: Okay, but if we don't have the ./, what it will do is look into the directory /usr/bin, u-s-r slash bin. 274 00:36:41.400 --> 00:36:48.510 William Cheng: Yeah, if you try to do an ls of /usr/bin, Linux is actually going to ask you, are you sure, because there are too many files in this directory. 275 00:36:49.320 --> 00:36:54.990 William Cheng: Okay, so you can actually imagine, if I try to run this program over here, when I type warmup1 and it tries to find this file, 276 00:36:55.140 --> 00:36:59.880 William Cheng: what it will need to do is go to /usr/bin, and /usr/bin is a very, very large directory file. 277 00:37:00.060 --> 00:37:08.760 William Cheng: It needs to go through every directory entry, trying to look for warmup1, and in the end it will fail, because none of the entries over here is equal to warmup1. 278 00:37:09.480 --> 00:37:16.470 William Cheng: Okay, so in that case I'm opening a really large file, and I need to go to the disk many times, so in that case it will be really slow. So we need the file system to be fast 279 00:37:16.650 --> 00:37:20.880 William Cheng: in the case when we try to look up a program to execute. So in that case, it needs to be much faster. 280 00:37:21.420 --> 00:37:27.750 William Cheng: Okay, so again, this is a linear list. The typical solution is that you build a tree or a hash table.
So we're going to take a look at that in the next lecture. Yeah. 281 00:37:28.470 --> 00:37:30.150 William Cheng: All right. And also, you know, 282 00:37:30.480 --> 00:37:38.760 William Cheng: we want to be space efficient, right? If we want to allow, you know, component names of any length, one thing that we can do is make every directory entry really, really big. 283 00:37:38.940 --> 00:37:45.030 William Cheng: Well, in that case, when you create a short file name, we're going to end up wasting a lot of space. So we also need to be space efficient. 284 00:37:45.600 --> 00:37:47.460 William Cheng: We're also going to see how that can be done. 285 00:37:48.270 --> 00:37:54.900 William Cheng: So first, a reminder over here: in the S5 file system, right, again, a directory file is an array of directory entries. 286 00:37:55.110 --> 00:38:02.010 William Cheng: Right, here is the first directory entry over here. Every directory entry is 32 bytes long. The 28 bytes over here are for the component name. 287 00:38:02.250 --> 00:38:08.820 William Cheng: It will be a C string that's backslash-zero terminated. Right. The second one over here is going to be four bytes long; that will be the inode number. 288 00:38:09.660 --> 00:38:17.580 William Cheng: Okay, so even though in this example the component names we're creating are all very, very short, each one of them is going to take up 28 bytes, so we're being a little wasteful in 289 00:38:18.780 --> 00:38:22.350 William Cheng: using storage space for the directory file. 290 00:38:23.220 --> 00:38:31.890 William Cheng: Okay. So imagine if somebody says, oh, you know, every component name needs to be 1024 bytes long.
So in that case we would end up wasting a lot of space, so, therefore, we're not allowed to do that. 291 00:38:32.310 --> 00:38:38.970 William Cheng: So this is the data structure for both the RAM file system and also for the S5 file system. So it's not very, very flexible. 292 00:38:39.600 --> 00:38:57.810 William Cheng: So, I guess, again, I'm following the schedule of, you know, summer 2019, and this will be the end of today's lecture. So next time we're going to see how to make the directory entry as long as you want. Okay. Alright, see you next lecture.
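The fixed-size directory entry described in this lecture (a 28-byte, backslash-zero terminated component name followed by a 4-byte inode number, 32 bytes in total) and the linear lookup over a directory file can be sketched like this. The names `DIRENT`, `pack_entry`, and `lookup`, the little-endian byte order, and the sample inode numbers are assumptions made for illustration; only the 28 + 4 byte layout comes from the lecture.

```python
import struct

NAME_LEN = 28
# 28-byte name followed by a 4-byte unsigned inode number; "<" (little-endian,
# no padding) is an assumption made so that the entry is exactly 32 bytes.
DIRENT = struct.Struct("<28sI")


def pack_entry(name, inode):
    """Build one 32-byte directory entry; short names are padded with '\\0'."""
    raw = name.encode("ascii")
    if len(raw) >= NAME_LEN:          # must leave room for the terminating '\0'
        raise ValueError("name too long")
    return DIRENT.pack(raw, inode)


def lookup(directory, name):
    """Linear scan of a directory file; returns the inode number or None."""
    target = name.encode("ascii")
    for offset in range(0, len(directory), DIRENT.size):
        raw, inode = DIRENT.unpack_from(directory, offset)
        if raw.split(b"\0", 1)[0] == target:
            return inode
    return None


# A small directory file: three entries, 96 bytes, mostly padding.
directory = pack_entry(".", 1) + pack_entry("..", 1) + pack_entry("warmup1", 9)
```

Here `lookup(directory, "warmup1")` walks the entries one by one until it finds a match, and a failed lookup has to scan the whole file, which is why a very large directory is slow to search. The one-character name "." still occupies a full 32-byte entry, and a name of 28 or more characters is rejected, which mirrors the "name too long" error mentioned above.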