To: cs551@merlot.usc.edu
Subject: Re: CS551_Final2_index files
Date: Thu, 20 Nov 2008 08:09:01 -0800
From: Bill Cheng

Someone wrote:

  > *So name_index should be sorted alphabetically. How will sha1_index be
  > sorted ? Will it be in the decreasing order of ascii values of SHA1* ?

It's up to you.  You need to be a little more careful if you
store SHA1 in binary.  If you use a comma or a tab as a
delimiter between fields, the binary SHA1 value may itself
contain that byte.  So, when you read the sha1_index file, you
should read 20 bytes first and only then start looking for the
delimiter (if you use one).  Or you can just keep everything in
binary.

  > *Can u give an example.*

If you convert the 20-byte SHA1 to all-lowercase ASCII hex
(40 characters long, NUL-terminated), you can just use strcmp():

    /* strcmp() is declared in <string.h> */
    int result = strcmp(str1, str2);
    if (result == 0) {
        /* str1 is the same as str2 */
    } else if (result > 0) {
        /* str1 is "bigger" than str2 */
    } else {
        /* str1 is "smaller" than str2 */
    }

If you are keeping the SHA1 value in binary, you can use memcmp()
with a length of 20 in much the same way.
--
Bill Cheng  //  bill.cheng@usc.edu
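For the binary case, here is a minimal sketch in C of the two points above: take exactly 20 bytes before looking for the delimiter, and compare raw digests with memcmp(). The record layout (20 digest bytes, one tab, then a decimal file number) and the names sha1_entry, sha1_cmp, entry_cmp and parse_entry are assumptions made for illustration; they are not part of the project spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SHA1_LEN 20  /* a raw SHA1 digest is always 20 bytes */

/* Hypothetical index record: raw digest plus a file number. */
struct sha1_entry {
    unsigned char sha1[SHA1_LEN];
    long file_num;
};

/* Three-way compare of two raw digests; same sign convention as strcmp(). */
int sha1_cmp(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, SHA1_LEN);
}

/* qsort() comparator that orders entries by raw digest bytes. */
int entry_cmp(const void *pa, const void *pb)
{
    const struct sha1_entry *a = pa;
    const struct sha1_entry *b = pb;
    return sha1_cmp(a->sha1, b->sha1);
}

/*
 * Parse one record held in buf[0..len): 20 digest bytes, a tab, then
 * decimal digits. The digest is copied by length, never scanned, so a
 * tab byte inside it cannot be mistaken for the delimiter.
 * Returns 0 on success, -1 if the record is malformed.
 */
int parse_entry(const unsigned char *buf, size_t len, struct sha1_entry *out)
{
    assert(buf != NULL && out != NULL);

    /* Need the digest, the tab, and at least one digit. */
    if (len < SHA1_LEN + 2 || buf[SHA1_LEN] != '\t')
        return -1;

    memcpy(out->sha1, buf, SHA1_LEN);

    long n = 0;
    for (size_t i = SHA1_LEN + 1; i < len; i++) {
        if (buf[i] < '0' || buf[i] > '9')
            return -1;
        n = n * 10 + (buf[i] - '0');
    }
    out->file_num = n;
    return 0;
}
```

An array of such entries could then be sorted with qsort(entries, count, sizeof entries[0], entry_cmp). memcmp() compares bytes as unsigned char, so the resulting order matches what strcmp() would give on the lowercase hex form of the same digests.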