mydiff - INI style diff
Well, I needed to compare two 300MB directories at work yesterday. Unfortunately, 'regular' diff just wasn't cutting it. A file would be declared different even when the only change was an INI-style section that had moved. Example:
File 1:
    [a]
    Setting1=a
    Setting2=b
    [b]
    Setting3=c
    Setting4=d

File 2:
    [b]
    Setting3=c
    Setting4=d
    [a]
    Setting1=a
    Setting2=b
Obviously, these two files are EFFECTIVELY the same, but diff will show the entire [a] section as only in file 1, then [b] as common, then file 2 alone as having… the exact same [a] section. So I whipped up a Perl script to tell me that those two files are the same. This script may have problems and might not do what you want (it was quick and dirty) but it may help others (and me later, which is what this blog is more for anyway)… Looking at it this morning I can see a handful of places to easily condense it, but oh well… and if you care, these were Quartus project files and associated files (CSF, PSF, etc).

Note: It fails when there is a <, > or | in the text file, because those are the characters it uses to split diff's side-by-side output. But it usually dumps so little that you can eyeball it and decide if it is OK.
    #!/usr/bin/perl -w
    use Data::Dumper;

    my $textdump;
    my %lhash;
    my %rhash;
    my $debug = 0;
    my $file = $ARGV[0];

    # Some filenames have () in them that we need to escape:
    $file =~ s/\(/\\(/g;
    $file =~ s/\)/\\)/g;

    open (INPUT, "diff -iEbwBrsty --suppress-common-lines Projects/$file Folder\\ for\\ Experimenting/Projects/$file|");
    while (<INPUT>) {
        if ($_ =~ /Files .*differ$/) {
            #Binary files
            print "Binary file comparison - they differ.\n";
            exit;
        }
        if ($_ =~ /Files .*identical$/) {
            print "No diff!\n";
            exit;
        }
        my $a = 0;
        # For some reason chomp was giving me problems (cygwin, win2k)
        s/\n//g;
        s/\r//g;
        $_ =~ /^(.*)([<>\|])(.*)$/;
        my $left = $1;
        my $dir = $2;
        my $right = $3;
        $left =~ /^\s*(.*?)\s*$/;
        $left = $1;
        $right =~ /^\s*(.*?)\s*$/;
        $right = $1;
        # print "1: '$left'\n2: '$dir'\n3: '$right'\n";
        # OK, now we have all we wanted...
        if ($dir eq '<') {
            $lhash{$left}++;
            $a++;
        };
        if ($dir eq '>') {
            $rhash{$right}++;
            $a++;
        }
        if ($dir eq '|') {
            $lhash{$left}++;
            $rhash{$right}++;
            $a++;
        }
        print "Missed this: $left $dir $right\n" unless $a;
    } # while
    close(INPUT);

    foreach (sort keys %lhash) {
        if (not exists $rhash{$_}) {
            # No Match...
            print "Only in left: '$_'\n";
        } else {
            if ($lhash{$_} != $rhash{$_}) {
                print "Left count not equal to Right, $_\n";
            }
        }
    }
    foreach (sort keys %rhash) {
        if (not exists $lhash{$_}) {
            # No Match...
            print "Only in right: '$_'\n";
        } else {
            if ($lhash{$_} != $rhash{$_}) {
                print "Left count not equal to Right, $_\n";
            }
        }
    }
    print Dumper(\%rhash) if $debug;
    print Dumper(\%lhash) if $debug;
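
One possible way around the <, > and | limitation is to skip diff's side-by-side output and read the INI text directly. The following is only a sketch of that idea, not a replacement for the script above: the names parse_ini and ini_diff are made up for illustration, and it does not reproduce diff's -i or -w behaviour (it only trims leading and trailing whitespace and skips blank lines). Unlike the script above, it tracks which section each line belongs to, so a setting that moves to a different section is reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn INI-style text into
#   { section_name => { "line text" => occurrence_count } }
# Lines that appear before any [section] header are stored under ''.
sub parse_ini {
    my ($text) = @_;
    my %sections;
    my $current = '';
    for my $line (split /\r?\n/, $text) {
        $line =~ s/^\s+|\s+$//g;      # trim both ends (also drops a stray \r)
        next if $line eq '';          # ignore blank lines
        if ($line =~ /^\[(.*)\]$/) {  # section header such as [a]
            $current = $1;
            $sections{$current} ||= {};
            next;
        }
        $sections{$current}{$line}++;
    }
    return \%sections;
}

# Hypothetical helper: compare two parsed files regardless of section or
# line order. Returns a list of messages; an empty list means "same".
sub ini_diff {
    my ($left, $right) = @_;
    my @report;
    my %all_sections = map { ($_ => 1) } keys %$left, keys %$right;
    for my $sec (sort keys %all_sections) {
        if (!exists $left->{$sec})  { push @report, "Section [$sec] only in right"; next; }
        if (!exists $right->{$sec}) { push @report, "Section [$sec] only in left";  next; }
        my ($l, $r) = ($left->{$sec}, $right->{$sec});
        my %all_lines = map { ($_ => 1) } keys %$l, keys %$r;
        for my $line (sort keys %all_lines) {
            my $lc = $l->{$line} || 0;
            my $rc = $r->{$line} || 0;
            push @report, "[$sec] '$line': left=$lc right=$rc" if $lc != $rc;
        }
    }
    return @report;
}

# The two example files from the top of this post.
my $file1 = "[a]\nSetting1=a\nSetting2=b\n[b]\nSetting3=c\nSetting4=d\n";
my $file2 = "[b]\nSetting3=c\nSetting4=d\n[a]\nSetting1=a\nSetting2=b\n";

my @diffs = ini_diff(parse_ini($file1), parse_ini($file2));
print(@diffs ? join("\n", @diffs) . "\n" : "No diff!\n");
```

For the two example files this reports no differences, since both sections hold the same lines in each file. Real project files would need to be read from disk first, and binary files would still need separate handling.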