codesearch: a fantastic tool for searching a large codebase

I've submitted a port, textproc/codesearch (homepage: codesearch), and it is now in the ports tree. It is a fantastic tool for searching a large but mostly static codebase, such as /usr/src.

Example usage:
* cindex /usr/src: Adds /usr/src (recursively) to the index. This only needs to be done once per path, and it takes a little while. After adding the paths you are interested in, add something like this to your crontab so that the index is updated automatically:
0 5 * * * /usr/local/bin/cindex > /dev/null 2>&1 || echo 'Error updating index!'
* csearch <regex>: Searches all indexed files, typically in less time than it takes to type the command.

Output of csearch -help:
Code:
usage: csearch [-c] [-f fileregexp] [-h] [-i] [-l] [-n] regexp

Csearch behaves like grep over all indexed files, searching for regexp,
an RE2 (nearly PCRE) regular expression.

The -c, -h, -i, -l, and -n flags are as in grep, although note that as per Go's
flag parsing convention, they cannot be combined: the option pair -i -n
cannot be abbreviated to -in.

The -f flag restricts the search to files whose names match the RE2 regular
expression fileregexp.

Csearch relies on the existence of an up-to-date index created ahead of time.
To build or rebuild the index that csearch uses, run:

        cindex path...

where path... is a list of directories or individual files to be included in the index.
If no index exists, this command creates one.  If an index already exists, cindex
overwrites it.  Run cindex -help for more.

Csearch uses the index stored in $CSEARCHINDEX or, if that variable is unset or
empty, $HOME/.csearchindex.
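
As the help text says, the index location comes from $CSEARCHINDEX, falling back to $HOME/.csearchindex. That should make it possible to keep a separate index per project. A minimal sketch in POSIX sh: the function names csearch_index_path and with_index, and the example index file name, are my own and not part of codesearch.

```shell
# Print the index file that would be used, following the rule in the help
# text: $CSEARCHINDEX if set and non-empty, else $HOME/.csearchindex.
csearch_index_path() {
    printf '%s\n' "${CSEARCHINDEX:-$HOME/.csearchindex}"
}

# Run a command with CSEARCHINDEX pointing at a specific index file.
# (Hypothetical helper; the command must be an external program
# such as cindex or csearch, since it is started through env.)
with_index() {
    index_file=$1
    shift
    env CSEARCHINDEX="$index_file" "$@"
}

# Example usage (the index file name is an assumption):
#   with_index "$HOME/.csearchindex-src" cindex /usr/src
#   with_index "$HOME/.csearchindex-src" csearch if_ix
```

Remember that a cron job running plain cindex only refreshes the default index, so a per-project index would need its own crontab entry with CSEARCHINDEX set.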

And the output of cindex -help:
Code:
usage: cindex [-list] [-reset] [path...]

Cindex prepares the trigram index for use by csearch.  The index is the
file named by $CSEARCHINDEX, or else $HOME/.csearchindex.

The simplest invocation is

        cindex path...

which adds the file or directory tree named by each path to the index.
For example:

        cindex $HOME/src /usr/include

or, equivalently:

        cindex $HOME/src
        cindex /usr/include

If cindex is invoked with no paths, it reindexes the paths that have
already been added, in case the files have changed.  Thus, 'cindex' by
itself is a useful command to run in a nightly cron job.

The -list flag causes cindex to list the paths it has indexed and exit.

By default cindex adds the named paths to the index but preserves
information about other paths that might already be indexed
(the ones printed by cindex -list).  The -reset flag causes cindex to
delete the existing index before indexing the new paths.
With no path arguments, cindex -reset removes the index.
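
Going by the -reset and -list descriptions above, starting over with a fresh set of paths could look roughly like this. It is only a sketch: rebuild_index is a hypothetical helper name, and it refuses to run without paths because cindex -reset with no arguments removes the index.

```shell
# Throw away the existing index, index only the given paths,
# then list what is indexed so the result can be checked.
rebuild_index() {
    if [ "$#" -eq 0 ]; then
        # Guard: 'cindex -reset' with no paths would just delete the index.
        echo "usage: rebuild_index path..." >&2
        return 2
    fi
    cindex -reset "$@" && cindex -list
}

# Example usage:
#   rebuild_index /usr/src /usr/include
```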

Example search with timing:
Code:
$ time csearch -f /usr/src if_ix
/usr/src/ObsoleteFiles.inc:OLD_FILES+=usr/share/man/man4/if_ixgb.4.gz
/usr/src/share/man/man4/Makefile:MLINKS+=ixgbe.4 if_ix.4
/usr/src/share/man/man4/Makefile:MLINKS+=ixgbe.4 if_ixgbe.4
/usr/src/share/man/man4/Makefile:MLINKS+=ixl.4 if_ixl.4
/usr/src/share/man/man4/ixgbe.4:if_ixgbe_load="YES"
/usr/src/share/man/man4/ixl.4:if_ixl_load="YES"
/usr/src/sys/conf/files:dev/ixgbe/if_ix.c               optional ix inet \
/usr/src/sys/conf/files:dev/ixgbe/if_ixv.c              optional ixv inet \
/usr/src/sys/conf/files.amd64:dev/ixl/if_ixl.c          optional        ixl pci \
/usr/src/sys/dev/ixl/ixl_iw.c: * if_ixl internal API
/usr/src/sys/modules/iavf/Makefile:SYMLINKS=    ${KMOD}.ko ${KMODDIR}/if_ixlv.ko
/usr/src/sys/modules/ix/Makefile:KMOD    = if_ix
/usr/src/sys/modules/ix/Makefile:SRCS    += if_ix.c if_bypass.c if_fdir.c if_sriov.c ix_txrx.c ixgbe_osdep.c
/usr/src/sys/modules/ixl/Makefile:KMOD    = if_ixl
/usr/src/sys/modules/ixl/Makefile:SRCS    += if_ixl.c ixl_pf_main.c ixl_pf_qmgr.c ixl_txrx.c ixl_pf_i2c.c i40e_osdep.c
/usr/src/sys/modules/ixv/Makefile:KMOD    = if_ixv
/usr/src/sys/modules/ixv/Makefile:SRCS    += if_ixv.c if_fdir.c ix_txrx.c ixgbe_osdep.c

real    0m0.020s
user    0m0.013s
sys     0m0.006s

Yes, 20 ms. A recursive grep (with the files already warm in ARC) takes four and a half seconds.
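
Since -f takes a regular expression rather than a plain path, the same search can be narrowed further, for example to C sources and headers. A small sketch: csearch_files is a hypothetical wrapper name, and the flags are the ones listed in the csearch help output. They are passed as separate arguments because -i -n cannot be combined into -in.

```shell
# Search only files whose path matches file_regex, ignoring case and
# printing line numbers. Each flag is its own argument (-i -n, not -in).
csearch_files() {
    file_regex=$1
    pattern=$2
    csearch -f "$file_regex" -i -n "$pattern"
}

# Example usage: only .c and .h files
#   csearch_files '\.[ch]$' if_ix
```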
 