To: cs551@merlot.usc.edu
Subject: Re: search and get
Date: Wed, 12 Nov 2008 14:26:51 -0800
From: Bill Cheng

Someone wrote:

  > scene:
  >
  > nodes A,B,C
  >
  > node A gives store command to store file X, which also gets flooded
  > and stored (storeprob +ve) at B and C (and obviously A) so now we have
  > 3 copies of same file (same name, sha1 and nonce)
  > node A does a search for X and gets back 3 results from A,B and C
  > (different fileID's)
  > node A does a get for file X stored at node B
  >
  > now,
  > according to spec - If the user at node A attempts to retrieve file X
  > and file X was successfully retrieved, node A must serve file X (i.e.,
  > respond properly to future search messages). File X should be stored
  > in the permanent area and not stored in its cache
  >
  > here, node A already has the file X in its permanent storage, then
  > should it save that file again or check if it already has the file.
  > would this involve opening all files to compare name,sha1 and nonce?

FileName, SHA1, and Nonce together uniquely identify a file.  So, if
all three match, a 2nd copy of the file should not be saved.

You do not need to open all files.  Once you have received a file, you
can use either the filename index (or the sha1 index) to find all file
numbers having the same filename (or sha1 value).  For every one of
these files, you should open the corresponding metadata file and see
if it has the same nonce and sha1 value (or filename).

  > further, in the " additional notes " in the spec it says
  > When a node performs a get and the file happens to be in the mini
  > filesystem of this node, the following should happen. If the file is
  > in the permanent space, a copy of the file should be placed in the
  > node's current working directory.

I think this paragraph was not phrased clearly.  Sorry!  I've just
rewritten it.  Hopefully it's clearer now.  Please see:

    http://merlot.usc.edu/cs551-f08/projects/final.html#nospace

  > continuing above example, a get only follows a search/get and in
  > search we only display different fileID. so if I found 3 copies of
  > file X and 'luckily' do a get for the file on my own node, only then
  > will the above happen. in all other cases i will have multiple copies
  > of same file on permanent storage. If i do 50 gets for X stored on B i
  > will end up with 51 copies of X on node A.

If an action would cause an identical file (as defined above) to be
saved, you should not save another copy of this file.  So 50 gets for
X stored on B should still leave exactly one copy of X on node A.

For the copy you save in the current working directory, if there is
already a file with the same filename (as in the metadata), you should
probably first prompt the user to see if it's okay to overwrite the
file.
--
Bill Cheng  //  bill.cheng@usc.edu
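
A minimal sketch of the duplicate check described above, in case it helps. This is not the required implementation. The `Metadata` record, the dictionary-based filename index, and the helper names `find_duplicate` and `store_if_new` are all hypothetical; in a real node the metadata would be read from the metadata files on disk, and the spec does not prescribe this layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Hypothetical in-memory view of one metadata file."""
    filename: str
    sha1: str    # hex digest of the file contents
    nonce: str   # hex nonce assigned when the file was first stored


def find_duplicate(filename_index, load_metadata, incoming):
    """Return the file number of an identical stored file, or None.

    filename_index: dict mapping filename -> list of file numbers
    load_metadata:  callable taking a file number, returning Metadata
    incoming:       Metadata of the file that was just received
    """
    # Only files sharing the filename need their metadata opened.
    for file_number in filename_index.get(incoming.filename, []):
        stored = load_metadata(file_number)
        if stored.sha1 == incoming.sha1 and stored.nonce == incoming.nonce:
            return file_number
    return None


def store_if_new(filename_index, metadata_store, incoming):
    """Store `incoming` unless an identical file exists; return its number.

    metadata_store is a dict (file number -> Metadata) standing in for
    the metadata files on disk; this is an assumption of the sketch.
    """
    existing = find_duplicate(filename_index, metadata_store.__getitem__, incoming)
    if existing is not None:
        return existing  # identical file: do not save a 2nd copy
    file_number = max(metadata_store, default=0) + 1
    metadata_store[file_number] = incoming
    filename_index.setdefault(incoming.filename, []).append(file_number)
    return file_number
```

With this shape, repeating a get for the same file (same filename, sha1, and nonce) keeps returning the existing file number, while a file with the same name and contents but a different nonce is treated as a different file and gets its own entry.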