Re: [ILUG] find regex question

From: Brian Foster (blf at domain utvinternet.ie)
Date: Wed 19 Jun 2002 - 05:16:25 IST


  | Date: Tue, 18 Jun 2002 17:14:07 +0100
  | From: Padraig Brady <padraig at domain antefacto.com>
  |[ ... ]
  | yuk. Why put regex in find?
  | How about piping output of find to grep -E '[^/]*/[-0-9A-Z]{36}\.'

 because it's a powerful selection criterion?
 e.g., whilst you can pipe to `grep' to implement `And' conditions
 (e.g., is a directory And whose name matches the RE), it is much
 harder to do `Or' (e.g., is a directory Or whose name matches).
   having said that, `-exec sh -c "echo -E '{}' | grep -q ..." \;'
   could be used to do `Or' albeit at the cost of opaqueness (plus
   the usual newline-in-filename bugbear, see below) ....
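
 for comparison, a minimal sketch of the same `Or' done inside `find'
 itself. the scratch tree, the file names and the `.*\.txt' pattern
 are made up for illustration (a stand-in for the RE in this thread);
 `-regex' is an extension (GNU and BSD `find'), not POSIX:

```shell
# Hypothetical scratch tree, purely for illustration.
demo=$(mktemp -d)
mkdir "$demo/subdir"
touch "$demo/notes.txt" "$demo/subdir/data.bin"

# Select entries that are directories Or whose path ends in `.txt'.
# -regex matches against the whole path, not just the basename.
or_matches=$(find "$demo" \( -type d -o -regex '.*\.txt' \) -print | sort)
printf '%s\n' "$or_matches"

rm -rf "$demo"
```

 here the two directories and `notes.txt' are selected and `data.bin'
 is not, with no per-file subshell involved.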

 because it closes a security hole?
 files whose names contain a newline (e.g.) make writing a 100%
 reliable pipe-to-`grep' very difficult, perhaps impossible!
   whilst GNU `find' does have `-print0', I am unaware of any
   corresponding _input_ option for any `grep'. hence, AFAIK,
   this hole cannot be closed when using pipe-to-`grep' ....
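
 a small sketch of the failure mode, assuming a made-up file whose
 name contains a newline. it only illustrates the problem described
 above; the names and the `.txt' pattern are hypothetical:

```shell
# Hypothetical scratch directory holding one file with a newline in its name.
demo=$(mktemp -d)
touch "$demo/first
second.txt"

# Pipe-to-grep sees that one name as two lines; only the second half
# matches, so what comes out is a fragment rather than a usable path.
piped=$(find "$demo" -type f -print | grep '\.txt$')
[ -e "$piped" ] || echo "grep output is not a real path: $piped"

# Matching inside find keeps the name whole. -print0 terminates each
# name with a NUL byte, so counting NUL bytes counts matched files.
in_find=$(find "$demo" -type f -name '*.txt' -print0 | tr -cd '\0' | wc -c | tr -d ' ')
echo "files matched inside find: $in_find"

rm -rf "$demo"
```

 the pipeline reports a name that does not exist, whereas the test
 done inside `find' sees exactly one file.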

 because, continuing the above line-of-reasoning, other `find'
 options are also pointless, as they can also be done by piping
 the output of `find ... -print' to an appropriate command or
 (relatively simple) shell script.
 e.g., `-type' is pointless because it can be done by test(1):

     find DIR -type f -print

 is the same as (using GNU echo(1), and any Bourne-ish shell):

     find DIR -print | while read f; do [ -f "$f" ] && echo -E "$f"; done

 albeit that screws up on files named (e.g.) `-n' (plus the
 usual bugbear, files whose names contain a newline).
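
 a somewhat more careful variant of that loop, as a sketch only: it
 assumes a POSIX shell, uses `IFS= read -r' so that backslashes and
 surrounding blanks in names survive, and uses printf(1) instead of
 echo(1). the file names are made up, and names containing a newline
 are still mishandled:

```shell
# Hypothetical scratch tree with a few awkward (but newline-free) names.
demo=$(mktemp -d)
touch "$demo/plain" "$demo/  padded  " "$demo/back\\slash"
mkdir "$demo/dir"

# Selection done inside find.
via_find=$(find "$demo" -type f -print | sort)

# The same selection done by a filter loop using test(1).
via_loop=$(find "$demo" -print |
    while IFS= read -r f; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done | sort)

[ "$via_find" = "$via_loop" ] && echo "same selection"

rm -rf "$demo"
```

 one remaining difference: `[ -f ... ]' follows symbolic links whereas
 `-type f' (without `-L') does not, so the two can disagree on
 symlinks to regular files.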

 the real issue here is that `find' is neither a filter nor a sink,
 but a source (of data). active data sources tend to be complex
 (IMHO) since they, loosely speaking, create information.

 this is not a trivial thing to do. consider: the simplest possible
 `find' is one which only does `-print'. everything else is done
 by generally simple filters. e.g., `grep' to filter on filenames,
 a mythical `perms' to filter on permissions, a mythical `times' to
 filter on a/m/ctimes, and so on .... then, taking this reductionist
 plan to completion, a `-print'-only-`find' is itself unnecessary,
 as recursive directory listings can be done by other commands ....
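
 to make the reductionist plan concrete, here is a sketch of what such
 a pipeline might look like. the `perms' function below is entirely
 hypothetical (no such standard tool exists); it just wraps test(1),
 and the file names are made up:

```shell
# Hypothetical `perms' filter: read paths on stdin, print the ones
# that pass the given test(1) operator (e.g. -x, -r, -w).
perms() {
    while IFS= read -r path; do
        if test "$1" "$path"; then
            printf '%s\n' "$path"
        fi
    done
}

demo=$(mktemp -d)
touch "$demo/script.sh" "$demo/readme.txt"
chmod +x "$demo/script.sh"

# A `-print'-only find as the source, with everything else as filters:
# grep selects on the name, perms selects on the permissions.
executables=$(find "$demo" -print | grep '\.sh$' | perms -x)
printf '%s\n' "$executables"

rm -rf "$demo"
```

 each stage can only narrow what the previous one passed on, so this
 composes `And' conditions naturally but not `Or', and every stage is
 still line-oriented.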

 yet the `Or' and security problems (at least) would still exist.
 nothing has been solved, albeit a grand collection of new and
 perhaps useful tools/filters has been created.

cheers!
        -blf-

--
 Innovative, very experienced, Unix and      | Brian Foster    Dublin, Ireland
 Chorus (embedded RTOS) kernel internals     | e-mail: blf at domain utvinternet.ie
 expert looking for a new position ...       | mobile: (+353 or 0)86 854 9268
  For a résumé, contact me, or see my website  http://www.blf.utvinternet.ie
    Stop E$$o (ExxonMobil):  «Whatever you do, don't buy Esso --- they
     don't give a damn about global warming.»    http://www.stopesso.com
     Supported by Greenpeace, Friends of the Earth, and numerous others...

