My main point when posting here was to let people know about the problem,
and hope that the whole Excite code gets checked and rewritten if needed.
Sorry for not being more clear about that.
Here are a few answers to comments I received:
> See the post that was made to bugtraq about CGI security recently - it's
I wish I had had the time to read that. All those posts are sitting in a
temp mailbox, and I do intend to read them (real soon now :-D)
(From Dan Jacobowitz)
> > $qcommand = "$queryprog -C $configfile $timeout -q \"$query\" -num $max_docs_to_return $syntax_flag";
> > $qcommand = &convert_file_names($qcommand);
> > + #print "Command: $qcommand<BR>\n";
> > ## print $qcommand;
> > if (open(QUERY, "$qcommand |")) {
> > ## Accumulate the results.
>
> Just checking; was that line solely for debugging purposes or did it
> belong there?
It was for debugging, and lets you see how your cleartext query gets
slightly modified before being passed to the shell.
> As an aside, piping things to shell like that is BAD. Put it into an
> array and use system ("$queryprog", @args);
It's indeed the right way of fixing it. However, having played with the
Excite code, I wouldn't be surprised if the problem is also elsewhere, so I
didn't focus on making perfect fix there knowing that I could leave big
holes in other places.
(From Greg Bacon)
> That's a pretty inefficient way of doing it.
>
> $query =~ s/[`\n\$^|\\~&;*]+//g;
Yep. I first added a line, and a second, etc... (worried more about plugging
the whole to some extent than making an official patch for EWS :-D)
> FWIW, this gets rid of all shell metacharacters:
>
> $query =~ s/[\$&*(){}\[\]'\";\\|?<>~`\n]+//g;
or
(From Oliver Fromme)
> on ftp.cert.org somewhere. Never catch special characters; instead, make
> up a list of _GOOD_ characters to allow and either delete or replace with
> _ all other characters:
(several people pointed out similar things to me)
The first one is not acceptable as we are talking about a search engine and
people may want to use quotes in their queries (as stated in my original
mail). Among many other characters, they should be able to use []{}()<> (I
did actually take the above list, and only kept the characters that I
thought would pose a threat).
As for the second piece of advise, it is obviously by far the most secure,
but I didn't want to restrict characters that people would potentially want
to query on since a fulltext search engine is supposed to accept all kinds
of queries.
The right fix is to correct Excite so that it does accept everything, and
uses the perl equivalent of a C exec* call (as pointed out by Dan above)
Marc