Return-Path: william@bourbon.usc.edu
Delivery-Date: Wed Apr 18 07:42:47 2007
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on merlot.usc.edu
X-Spam-Level: 
X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,
	NO_REAL_NAME autolearn=ham version=3.1.3
Received: from bourbon.usc.edu (bourbon.usc.edu [128.125.9.75])
	by merlot.usc.edu (8.13.5/8.13.5) with ESMTP id l3IEglup030726
	for <cs551@merlot.usc.edu>; Wed, 18 Apr 2007 07:42:47 -0700
Received: from bourbon.usc.edu (localhost.localdomain [127.0.0.1])
	by bourbon.usc.edu (8.13.5/8.13.5) with ESMTP id l3IEgl5x013420
	for <cs551@merlot>; Wed, 18 Apr 2007 07:42:47 -0700
Message-Id: <200704181442.l3IEgl5x013420@bourbon.usc.edu>
To: cs551@merlot.usc.edu
Subject: Re: Bitvectors and keywords 
Date: Wed, 18 Apr 2007 07:42:47 -0700
From: william@bourbon.usc.edu

Someone wrote:

  > If I keep a multimap structure for storing the bitvectors created
  > from the keywords in the metadata file, so that every bitvector
  > maps to a set of keywords and the file containing them, then I
  > really don't feel any need to keep the keyword index
  > structures to quicken the search operation, because then we would
  > be incorporating redundancy in the logic and code.
  > (END to END argument... ;) :)

Please remember that one of the main "escape clauses" of the
end-to-end argument is efficiency.  And the main point of
having an index structure is to introduce redundancy to speed
up certain operations.
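
To make the redundancy point concrete, here is a minimal sketch
(not the required implementation) that keeps a per-file bit-vector
and a separate keyword index side by side.  The names Store,
keywordsToBits, and kBits, the 128-bit width, and the use of
std::hash are all assumptions made for illustration; please use
whatever the spec actually requires.

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical bit-vector width; the real spec defines its own size.
constexpr std::size_t kBits = 128;
using BitVec = std::bitset<kBits>;

// Hypothetical hashing: std::hash stands in for the spec's hash function.
BitVec keywordsToBits(const std::vector<std::string>& keywords) {
    BitVec bv;
    for (const auto& kw : keywords)
        bv.set(std::hash<std::string>{}(kw) % kBits);
    return bv;
}

struct Store {
    std::vector<BitVec> bitvecs;                        // one per file, indexed by file id
    std::map<std::string, std::set<int>> keywordIndex;  // keyword -> file ids (redundant on purpose)

    // Register a file's keywords in both structures and return its id.
    int addFile(const std::vector<std::string>& keywords) {
        int id = static_cast<int>(bitvecs.size());
        bitvecs.push_back(keywordsToBits(keywords));
        for (const auto& kw : keywords)
            keywordIndex[kw].insert(id);
        return id;
    }

    // Bit-vector test: never misses a real match, but different keywords
    // can hash to the same bit, so it may return false positives.
    std::vector<int> candidatesByBits(const std::string& kw) const {
        BitVec q = keywordsToBits({kw});
        std::vector<int> out;
        for (std::size_t i = 0; i < bitvecs.size(); ++i)
            if ((bitvecs[i] & q) == q)
                out.push_back(static_cast<int>(i));
        return out;
    }

    // Keyword index lookup: exact answer, no scan over all files.
    std::set<int> exactByKeyword(const std::string& kw) const {
        std::set<int> result;
        auto it = keywordIndex.find(kw);
        if (it != keywordIndex.end())
            result = it->second;
        return result;
    }
};
```

Both structures hold the same information in different forms.
That duplication is exactly the redundancy being traded for
lookup speed.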

From your description, even though it's an improvement over
traditional data structures, it doesn't scale as well as the
bit-vector approach.

  > Do I still require an extra keyword index then?

Yes.  Please see the grading guidelines on "minus points".
--
Bill Cheng // bill.cheng@usc.edu <URL:http://merlot.usc.edu/william/usc/>
