Return-Path: william@bourbon.usc.edu
Delivery-Date: Wed Nov 19 22:15:56 2008
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on merlot.usc.edu
X-Spam-Level:
X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham
    version=3.2.3
Received: from bourbon.usc.edu (bourbon.usc.edu [128.125.9.75])
    by merlot.usc.edu (8.14.1/8.14.1) with ESMTP id mAK6FuLT005661
    for ; Wed, 19 Nov 2008 22:15:56 -0800
Received: from bourbon.usc.edu (localhost.localdomain [127.0.0.1])
    by bourbon.usc.edu (8.14.2/8.14.1) with ESMTP id mAK6D8D8015729
    for ; Wed, 19 Nov 2008 22:13:08 -0800
Message-Id: <200811200613.mAK6D8D8015729@bourbon.usc.edu>
To: cs551@merlot.usc.edu
Subject: Re: Search & Store
Date: Wed, 19 Nov 2008 22:13:08 -0800
From: Bill Cheng

Someone wrote:

  > I have a few questions regarding the search/store operations:
  >
  > 1) If we type the same store command thrice, would we have multiple
  > instances of those bitvectors in our index structures, or do we have
  > to determine if we have already stored that file earlier and discard
  > it ?

Please see my message with timestamp "Sun 16 Nov 20:25".

  > 2) In keyword based searches, it has been mentioned that the keyword
  > searches are an "AND" based search. But in the grading guidelines,
  > consider the following cases:
  >
  >     1. store chess.jpg 1 categories="audio mp3" artist="Blondie"
  >     2. search keywords=mp3 ......... should get a response
  >
  > So how is it returning a hit here when it is an "AND" based
  > search ? Should it return a hit only if all the keywords i.e.
  > categories audio mp3 artist Blondie are entered ?

The "AND" based search applies when you enter multiple keywords
*in your search command*.  So, if you enter:

    search keywords="mp3 blondie"

you should get a hit.  But if you enter:

    search keywords="mp3 song"

you should not get a hit.

  > 3) Assuming that we get a 'hit' in the second case for 'mp3' , it will
  > return all the files that has the keyword 'mp3' right ?

For a single keyword search, yes.
For a two-keyword search, it should return all files that have *both*
keywords.

-- Bill Cheng // bill.cheng@usc.edu
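
The "AND" rule described in this reply can be sketched as follows. This is
only an illustration, not the project's reference implementation. It assumes
that a file's keywords are the words in its metadata values (so chess.jpg
carries "audio", "mp3" and "Blondie"), that matching ignores case (as the
"mp3 blondie" example suggests), and that a plain set stands in for the
bitvector index. The names `matches`, `search` and `index` are hypothetical.

```python
def matches(file_keywords, query):
    """Return True only if every keyword in the query is one of the file's keywords."""
    wanted = {word.lower() for word in query.split()}
    have = {word.lower() for word in file_keywords}
    # An empty query matches nothing; otherwise all query words must be present.
    return bool(wanted) and wanted <= have


def search(index, query):
    """Return the names of all files whose keywords satisfy the AND rule."""
    return sorted(name for name, kws in index.items() if matches(kws, query))


# Hypothetical index built from:
#   store chess.jpg 1 categories="audio mp3" artist="Blondie"
index = {"chess.jpg": ["audio", "mp3", "Blondie"]}

print(search(index, "mp3"))          # one keyword: file has it
print(search(index, "mp3 blondie"))  # both keywords present
print(search(index, "mp3 song"))     # "song" is missing, so no hit
```

The AND applies to the words typed in the search command, not to the full
keyword list stored with the file, which is why a one-word search can still
hit a file that carries several keywords.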