Return-Path: william@bourbon.usc.edu Delivery-Date: Thu Nov 20 18:57:17 2008 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on merlot.usc.edu X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.3 Received: from bourbon.usc.edu (bourbon.usc.edu [128.125.9.75]) by merlot.usc.edu (8.14.1/8.14.1) with ESMTP id mAL2vHPu016403 for ; Thu, 20 Nov 2008 18:57:17 -0800 Received: from bourbon.usc.edu (localhost.localdomain [127.0.0.1]) by bourbon.usc.edu (8.14.2/8.14.1) with ESMTP id mAL2shng025975 for ; Thu, 20 Nov 2008 18:54:43 -0800 Message-Id: <200811210254.mAL2shng025975@bourbon.usc.edu> To: cs551@merlot.usc.edu Subject: Re: CS551_Final2_index files Date: Thu, 20 Nov 2008 18:54:43 -0800 From: Bill Cheng Someone wrote: > Why did we add this requirement? [BC: Added 11/19/2008] The > "name_index" and "sha1_index" must be sorted. > > If we arent using the sorting because there is no efficiency > requirement (i.e. they are sorted, but we search linearly) then what is the benefit? Well, I wouldn't say that we have *no* efficiency requirement. They are index files, so they need to resemble some sort of an index. The requirement of using Binary Search Tree has already been relaxed. The *minimum* requirement of having a sorted list is reasonable. I don't think one can get more minimum than this and still call it an index. With a sorted list, the expected time to find an item, I think is 1/3 of the length of the list. -- Bill Cheng // bill.cheng@usc.edu