Comments on genomics: Finding difference between 2 files
Blog: Sucheta Tripathy PI @ Computational Genomics Group at IICB, Kolkata (http://www.blogger.com/profile/17433426304045795341)
Feed updated 2023-12-26T03:34:55.950-08:00; 3 comments

Comment posted 2009-07-20T20:30:51.980-07:00:
Hi Wilma,

I am really impressed with the VADLO search engine. This is completely new information for me. Please keep posting new information.

Best,
Sucheta

Posted by Sucheta Tripathy (https://www.blogger.com/profile/17433426304045795341)

Comment posted 2009-07-17T05:28:04.038-07:00:
Thanks for your comments! I am wondering about its performance: how long does it take to run on large files?

Posted by Sucheta Tripathy (https://www.blogger.com/profile/17433426304045795341)

Comment posted 2009-07-16T21:14:56.260-07:00:
Good to see your activity here.
Just to add to your script repo, here is a script that calculates the union, intersection and difference between two files. It is not mine, but I have been using it for some time.

----------

#! /bin/sh
# Union, intersection or minus of two sets, where each set is a file
# and an element of the set is a line in the file. WH 1994 Dec, 2001 Jan

NAME=`basename "$0"`

# usage is called below but its definition is not shown in the post;
# this is a minimal assumed version.
usage () {
    echo >&2 "Usage: $NAME file1 u|i|- file2"
    exit 1
}

if [ $# != 3 ] ; then echo >&2 Need three arguments ; usage ; fi
FILE1="$1"
FILE2="$3"
if [ ! -r "$FILE1" ] ; then
    echo >&2 "Error: File $FILE1 does not exist or is not readable" ; usage
fi
if [ ! -r "$FILE2" ] ; then
    echo >&2 "Error: File $FILE2 does not exist or is not readable" ; usage
fi
case "$2" in
    u|i|-) OP="$2" ;;
    *) echo >&2 Error: Bad option ; usage ;;
esac

UNION=$NAME.u.$$
DIFF=$NAME.d.$$

# Remove the temporary files on normal exit and on HUP, INT or TERM
trap '/bin/rm "$UNION" "$DIFF" 2>/dev/null' 0
trap '/bin/rm "$UNION" "$DIFF" 2>/dev/null; exit 1' 1 2 15

## generate $FILE1 UNION $FILE2
## uniq: remove duplicate = UNION
cat "$FILE1" "$FILE2" | sort | uniq > "$UNION"

if [ "$OP" = u ] ; then cat "$UNION" ; exit 0; fi

## generate $FILE1 diff $FILE2 as $FILE2 symdiff ( $FILE1 union $FILE2 )
## uniq -u: nonrepeated lines = symdiff
cat "$FILE2" "$UNION" | sort | uniq -u > "$DIFF"

if [ "$OP" = - ] ; then cat "$DIFF" ; exit 0; fi

## generate intersection: $FILE1 symdiff ( $FILE1 diff $FILE2 )
## sort -u first, so lines repeated within $FILE1 are not dropped by uniq -u
sort -u "$FILE1" | cat - "$DIFF" | sort | uniq -u

Posted by DNA (https://www.blogger.com/profile/10170931489462687349)
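
For comparison, here is a minimal sketch of the same three operations written as shell functions that need no temporary files. It is not part of the script above: the function names `set_union`, `set_intersect` and `set_minus` are made up for illustration, and it assumes each argument is a readable text file with one set element per line.

```shell
#!/bin/sh
# Sketch only: set_union, set_intersect and set_minus are illustrative names,
# not part of the script above. Each argument is a readable text file.

# Lines that occur in either file, each printed once.
set_union() {
    sort -u "$1" "$2"
}

# Lines that occur in both files: after de-duplicating each file,
# a line present in both shows up exactly twice, which uniq -d reports.
set_intersect() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# Lines in the first file but not the second: the second file is fed in
# twice, so only lines unique to the first file end up with a count of one.
set_minus() {
    { sort -u "$1"; sort -u "$2"; sort -u "$2"; } | sort | uniq -u
}
```

With two hypothetical files `a.txt` and `b.txt`, `set_minus a.txt b.txt` would print the lines of `a.txt` that are absent from `b.txt`. Every function sorts its input, so for large files the running time is dominated by the sorts.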