Date: Sun, 12 Oct 2008 20:47:30 -0700
From: Leslie Cheung
Subject: broken pipes problems
To: cs551@merlot.usc.edu

Hi class,

I know a few of you are running into broken pipe problems. Here is a note I wrote a while ago, and I hope it is useful to all of you. If what I suggest below does not fix your problem, it's best to schedule an appointment with Bill or me so we can look at your code.
It's really hard to fix a bug like this without going into your code.

--Leslie

---------------------------------------------------------

Hi all,

I debugged a couple of SIGPIPE problems, and I hope this information is useful.

I assume that if we are unable to connect to the other node (e.g., it has not been started), we sleep for a while and try again later. One problem I saw is that when you try "reusing" a socket whose earlier "connect" failed, a later "connect" on it can complete without an error, but whenever you send something using that reused socket, you get SIGPIPE. Why? I have no idea either, but this is how things work.

Instead of saying...

-------------------------------------
int sockfd = socket(...);
while (!done){
    if (connect(sockfd, ...) < 0){
        //cannot connect
    } else {
        //connect does not give you an error, so it comes here
        //but if you try to send something, it gives you SIGPIPE
        write(sockfd, ...); //this line gives you SIGPIPE
    }
    sleep(10);
}
-------------------------------------

You should do

-------------------------------------
while (!done){
    int sockfd = socket(...);
    if (connect(sockfd, ...) < 0){
        //cannot connect
        //you should now close the socket
        close(sockfd);
    } else {
        //connected
        //now if you send something, it should work
        write(sockfd, ...); //this should work
    }
    sleep(10);
}
-------------------------------------

In other words, you should create a new socket "inside the loop".

Again, this is just one possible scenario that may give you SIGPIPE. Yours might be caused by something else.
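
For anyone who wants something they can compile, here is a minimal sketch of the "new socket inside the loop" idea. It is only an illustration, not the required way to do it: the function name connect_with_retry and its parameters (max_attempts, delay_sec) are made up for this example, and it assumes IPv4 TCP and uses getaddrinfo() to fill in the address. For what it's worth, POSIX says the state of a socket is unspecified after connect() fails and recommends closing it and creating a new one before reconnecting, which matches the behavior described above.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original note): try to connect to
 * host:port up to max_attempts times, sleeping delay_sec seconds between
 * failed attempts. Returns a connected descriptor, or -1 on failure.
 */
int connect_with_retry(const char *host, const char *port,
                       int max_attempts, unsigned int delay_sec)
{
    assert(host != NULL && port != NULL && max_attempts > 0);

    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* assumption: IPv4 only */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    int connected_fd = -1;
    for (int attempt = 0; attempt < max_attempts && connected_fd < 0; attempt++) {
        /* Create a brand-new socket for every attempt. */
        int sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (sockfd < 0) {
            perror("socket");
            break;
        }

        if (connect(sockfd, res->ai_addr, res->ai_addrlen) == 0) {
            connected_fd = sockfd;   /* connected; the caller may write() now */
        } else {
            perror("connect");
            close(sockfd);           /* never reuse a socket whose connect failed */
            if (attempt + 1 < max_attempts)
                sleep(delay_sec);
        }
    }

    freeaddrinfo(res);
    return connected_fd;
}
```

A caller would do something like connect_with_retry("127.0.0.1", "12345", 5, 10), check that the result is not -1, and then write() to the returned descriptor and close() it when finished. The host and port here are placeholders.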