[SUMMARY] Need grep-like utility that traverses directories

foster@bial1.ucsd.edu
Wed, 11 Mar 1998 18:02:07 -0800

Well, I'm kinda embarrassed, because the moment I saw the first answer I
thought, "Duh!" Pretty obvious. Thanks for all the answers!

Here are the suggestions:

1) find <target_dir> -type f -print | xargs grep -l <regular_expression>

2) find <dir> -exec grep -l <pattern-to-match> /dev/null {} \;

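Specifying /dev/null as an extra file to search ensures that grep
prints the name of the matching file even when it is given only one
real file. (With -l the name is printed regardless; /dev/null matters
once you drop -l to see the matching lines.)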

3) grep '^foo$' `find . -name "*" -print`

4) Attached is a "UNIX Hot Tip" sent to me. I could not get any of
the commands, or the script, to work. I include it here in case
others have more luck!

5) Someone suggested "rgrep" they had downloaded some time ago.
No pointers were provided.

6) Attached also is an explanation given by Michael Maciolek,
about methods 1 and 2.

7) One person responded with one word: "perl".

8) One person sent a script "rgrep" they had written. Works pretty
well, but includes unnecessary output. Sent by:
David Thorburn-Gundlach <david@bae.uga.edu>

9) Pointer to "tgrep" utility (haven't had time to try it yet):

http://www.sun.com/workshop/threads/apps.html

10) Pointer to "glimpse" utility (haven't had time to try it yet):

http://glimpse.cs.arizona.edu/index.html

Most suggested method 2. See credits below for methods 1 and 3.
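
To see what the /dev/null trick in method 2 buys you, here is a small
sketch (the scratch directory and file name are made up for illustration).
Without -l, grep omits the file name when it is given a single file;
adding /dev/null as a second file brings the name back:

```shell
# Hypothetical scratch file, just for illustration.
demo=$(mktemp -d)
target="$demo/one.txt"
printf 'foo\n' > "$target"

# One file on the command line: grep prints only the matching line.
single=$(grep foo "$target")

# Two files (one of them /dev/null): grep prefixes the match with the file name.
paired=$(grep foo /dev/null "$target")

echo "$single"
echo "$paired"
rm -rf "$demo"
```

The first echo shows the bare line, the second shows it as name:line.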

I timed these alternatives under the same conditions, and here are the results:

1 : 10 sec
2 : 92 sec
3 : 6 sec

Here is a variation:

Add "-type f" to the find command: this restricts the search to regular files and skips directories.
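
For anyone who wants to try methods 1 through 3 side by side, here is a
rough sketch on a throwaway tree. The directory layout, file names, and
pattern are all made up, and "-type f" is used in every method as
suggested above. It assumes mktemp is available and that the temporary
path contains no blanks (method 3 relies on the shell splitting find's
output into words):

```shell
# Scratch tree; all names here are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
printf 'foo\n' > "$demo/a/b/hit.txt"
printf 'bar\n' > "$demo/a/miss.txt"

# Method 1: find feeds file names to xargs, which batches them for grep.
m1=$(find "$demo" -type f -print | xargs grep -l '^foo$')

# Method 2: find starts one grep per file.
m2=$(find "$demo" -type f -exec grep -l '^foo$' /dev/null {} \;)

# Method 3: the shell pastes find's output onto a single grep command line.
m3=$(grep -l '^foo$' `find "$demo" -type f -print`)

echo "$m1"
echo "$m2"
echo "$m3"
rm -rf "$demo"
```

All three should print the same single path. Method 3 can fail with an
"argument list too long" error on a big tree, since everything has to fit
on one command line.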

==========================================================================
I include the following to remind everyone how INCREDIBLE this list is!
==========================================================================

Joel jlee@thomas.com
Will Lavery <willl@ilx.com>
Nate Itkin <Nate-Itkin@ptdcs2.intel.com> *** Suggested method (1) ***
gibian@stars1.hanscom.af.mil (Marc S. Gibian)
Daniel Lorenzini <lorenzd@gcm.com>
sanjiv@aryabhat.cs.umsl.edu (Sanjiv K. Bhatia)
Theo Wawers <wawers@sat1link.sat1.de>
"Robert T. Clift" <rclift@nswc.navy.mil>
AMH <ahoerter@netcom.com>
David Dhunjishaw <dave@colltech.com>
HCDEB@mead.com
Charlie Mengler <charliem@anchorchips.com>
"Michael J. Garcia" <mjgarcia@auspex.com>
jason.axley@attws.com
John Vaughan <jvaughan@onlinemagic.net> *** Suggested method (3) ***
Craig Glover <cglover@dbintellect.com>
"Michael R. Zika" <zika@glacier.llnl.gov>
Grant Lowe <grant@doctord.com>
"Jeffrey K. Pado" <jkp@cdicad.com>
Gene Rackow <rackow@mcs.anl.gov>
"Floyd, Randall D." <FLOYDR@nebeng.otis.com> *** Suggested method (1) ***
emarch@pinole1.com (D. Ellen March) *** Suggested method (1) ***
Matthew Stier <Matthew.Stier@tddny.fujitsu.com> *** Suggested method (1) ***
Sean Ward <sdward@uswest.com>
Andrew M Townsend <ATOWNSEND@DOLETA.GOV>
Jeff Graham <demit@best.com>
Seth Rothenberg <SROTHENB@montefiore.org>
Peter Polasek <pete@cobra.brass.com>
MARC.NEWMAN@chase.com
Sanjaya Srivastava <Sanjaya.Srivastava@Eng.Sun.COM>
jipping@smaug.cs.hope.edu (Mike Jipping)
Ron Kelley <rkelly@InfoAve.Net>
Ning Zhang <btinz@ui.uis.doleta.gov>
Chris Tubutis <tubutis.chris@tci.com>
poffen@San-Jose.ate.slb.com (Russ Poffenberger) *** Suggested method (1) ***
bbyoung@amoco.com (Brad Young ) *** Suggested method (1) ***
Gerald Combs - Unicom Communications <gerald@unicom.net>
Kevin.Sheehan@uniq.com.au (Kevin Sheehan {Consulting Poster Child})
Anthony.Worrall@reading.ac.uk (Anthony Worrall)
Daniel Dunn <dunn@sled.gsfc.nasa.gov>
Benjamin Cline <benji@hnt.com>
Mark Bergman <bergman@phri.nyu.edu>
Michael Maciolek <mikem@centerline.com>
Brian Sherwood <sherwood@alux4.micro.lucent.com>
Derek Flynn <flynn+public@cs.uchicago.edu> *** Method 1 ****
Mike Myers <mmyers@willamette.edu> *** ALL methods! ****
Peter Bestel <peter.bestel@uniq.com.au> *** Method 1 ***
"Avrami, Louis" <L.Avrami@dialogic.com>
"Darryl V. Pace" <dpace@tacticsus.com>
Jay Lessert <jayl@latticesemi.com>
farrare@rayleigh.tt.aftac.gov (Edward Turner Farrar (SAIC))
"D. Stewart McLeod" <stewart.mcleod@boeing.com>
Bryan Hodgson <bryanh@astea.com>
"Roger D McGraw, Jr., E.I.T." <roger@ptac.com>
Richard Bajusz <richb@osg.saic.com>
gary carr@aotmail2.atdiv.lanl.gov
"William L. Hamlin" <whamlin@connetsys.com>
Austin Hastings <hastinga@tarim.dialogic.com>
Jeff Putsch <putsch@unitrode.com>
David Thorburn-Gundlach <david@bae.uga.edu>
Harvey Wamboldt <harvey@iotek.ns.ca>
jbking@sass1633.csua35.sandia.gov
Joel Lee <jlee@thomas.com>
"Paquette, Trevor" <TrevorPaquette@mcc.net>
"Pitcher, Glenn" <gpitcher@comstream.com>
Sweth Chandramouli <sweth@astaroth.nit.gwu.edu>
Chris Phillips <chris@scooter.Canada.Sun.COM>
Susan Feng <sfeng@crg-gw.Stanford.EDU>

On Tue, 10 Mar 1998 foster@bial1.ucsd.edu wrote:

>
>
> Managers:
>
> I am looking for a grep-like utility that traverses directories, to do
> a text search in all files in a given directory structure. Does anyone
> know of or have pointers to such a thing?
>
> Thanks! I will of course summarize.
>
> Dave
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> David S. Foster Univ. of California, San Diego
> Programmer/Analyst Brain Image Analysis Laboratory
> foster@bial1.ucsd.edu Department of Psychiatry
> (619) 622-5892 8950 Via La Jolla Drive, Suite 2240
> La Jolla, CA 92037
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>

----- End Included Message -----

----- Attachment: grep_tip -----

Hi Dave.

I finally get to return the help you've given me in the past. Here you go:

UNIX GURU UNIVERSE
UNIX HOT TIP

Unix Tip #379- January 14, 1998

http://www.ugu.com/sui/ugu/show?tip.today
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

RECURSIVE GREP
Ever wish you could grep recursively through the files in a directory
and its subdirectories? Here is a nasty one-liner:

find . -type f -exec grep -l "foo" {} \; -exec grep -n "foo" {} \; -exec echo " " \;

Or another version that was submitted is:

find . -type f -print | xargs grep foo

Here is the command in script form.
This is probably easier in the long run....
=============== CUT HERE =======================================

# Script : gref
# Shell : any
# Author : Thom Vaught, David Miller
# Date : 8/16/90
#
# Recursively greps down a directory tree.
# If no path is specified, default is working directory.
#
# NOTE: Some shells require the variables in the "if"
# statements be quoted.

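if [ "$#" -eq 1 ]
then
    dir=.
elif [ "$#" -eq 2 ]
then
    dir=$2
else
    echo "Usage: `basename $0` pattern [path]" >&2
    exit 1
fi

# For each regular file: print its name if it matches, then the
# numbered matching lines, then a blank separator line.
find "$dir" -type f -exec grep -l "$1" {} \; \
    -exec grep -n "$1" {} \; \
    -exec echo " " \;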
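
For what it's worth, the same idea also fits in a shell function. This is
only a sketch and not the UGU script: the function body is an assumption
of mine, and it borrows the /dev/null trick so that a single "grep -n"
pass prints file:line:text for every match.

```shell
# gref: hypothetical function form of the script above.
# Usage: gref pattern [path]   (path defaults to the working directory)
gref() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "Usage: gref pattern [path]" >&2
        return 1
    fi
    # /dev/null forces grep to print the file name even though
    # find hands it one file at a time.
    find "${2:-.}" -type f -exec grep -n "$1" /dev/null {} \;
}
```

Calling it as "gref foo /some/dir" lists each match as name:line:text;
calling it with no arguments prints the usage line and returns 1.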

==========================================================================
DISCLAIMER: All UNIX HOT TIPS ARE OWNED BY THE UNIX GURU UNIVERSE AND ARE
NOT TO BE SOLD, PRINTED OR USED WITHOUT THE WRITTEN CONSENT OF THE UNIX
GURU UNIVERSE. ALL TIPS ARE "USE AT YOUR OWN RISK". UGU ADVISES THAT
ALL TIPS BE TESTED IN A NON-PRODUCTION DEVELOPMENT ENVIRONMENT FIRST.

Unix Guru Universe - http://www.ugu.com - tips@ugu.com - Copyright 1997
==========================================================================

----- Attachment: explanation -----

WARNING: some people might suggest this:

find <dir> -type f -exec grep <pattern> {} /dev/null \;

The less knowledgeable of those people might not even suggest using the
'-type f' flag or the /dev/null. Be wary of using 'find' in this manner.
Although it will work, it's not very efficient, and for large directory
trees, it will take a lot longer than it has to. A MUCH better solution
is this:

find <dir> -type f -print | xargs grep <pattern>

The key to much better performance is the 'xargs' command. Here's how
it all works together:

'find' traverses the directory hierarchy starting at <dir> and prints
the names of all regular files (-type f) to the pipe.

'xargs' reads the list of files from the pipe, keeps reading until it
has (by default) 470 characters' worth of names; then it calls:

grep <pattern> <file1> <file2> ... <file-n>

'xargs' continues reading filenames from standard input and calling the
'grep' command until there are no more names to read.
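
One caveat worth a sketch: xargs splits its input on blanks and newlines,
so a file name containing a space reaches grep in pieces. Assuming your
find supports the '+' terminator for -exec (it is in POSIX, but check
your man page), find can do the batching itself and pass each name
intact. The scratch directory and file name below are made up:

```shell
# Hypothetical scratch file whose name contains a space.
demo=$(mktemp -d)
printf 'foo\n' > "$demo/with space.txt"

# Plain xargs breaks the name into two arguments, so grep finds neither
# piece and prints nothing on standard output.
plain=$(find "$demo" -type f -print | xargs grep -l foo 2>/dev/null)

# -exec ... {} + batches names much like xargs, but keeps each name whole.
batched=$(find "$demo" -type f -exec grep -l foo /dev/null {} +)

echo "plain:   $plain"
echo "batched: $batched"
rm -rf "$demo"
```

The first result comes back empty; the second names the file.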

It's much better to use 'xargs' because it reduces the number of times
a new instance of 'grep' has to be invoked; if there are 1000 files in
a particular directory tree, the simple 'find' command (the one I warned
you against) will call 'grep' once for every file it finds. This is grossly
inefficient; grep accepts a list of filenames on its command line, so
why limit it to looking at one file at a time?

In comparison, the implementation with xargs might call each instance of
'grep' with 10 to 15 filenames (maybe more, maybe less; it depends on the
length of the average file pathname, and /usr averages 32 characters per
file). So the 'xargs' implementation might only have to spawn 70 to 100
greps, one for every 10 to 15 filenames.

With the 'xargs' implementation, you do 900 fewer invocations of 'grep'.
In the long run, that'll save you a significant amount of time.
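
The arithmetic above is easy to check with a sketch. Here 'echo' stands
in for grep (each invocation prints exactly one line, so counting lines
counts invocations), and -n caps the number of names per call in place of
the size limit. The file names are made up and need not exist:

```shell
# Emit 1000 hypothetical file names, one per line.
names() { awk 'BEGIN { for (i = 1; i <= 1000; i++) print "file" i }'; }

# One invocation per name, as with "find ... -exec grep ... {} \;".
per_file=$(names | xargs -n 1 echo | wc -l | tr -d ' ')

# Batches of 10 and of 15 names, as in the xargs estimate.
batch10=$(names | xargs -n 10 echo | wc -l | tr -d ' ')
batch15=$(names | xargs -n 15 echo | wc -l | tr -d ' ')

echo "$per_file $batch10 $batch15"   # prints: 1000 100 67
```

So batching 10 to 15 names per call cuts 1000 invocations down to
somewhere between 67 and 100.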

regards,

Michael

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David S. Foster Univ. of California, San Diego
Programmer/Analyst Brain Image Analysis Laboratory
foster@bial1.ucsd.edu Department of Psychiatry
(619) 622-5892 8950 Via La Jolla Drive, Suite 2240
La Jolla, CA 92037
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
