LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Gawk query

Brian Foster blf at utvinternet.ie
Fri Aug 24 00:44:29 IST 2007


  | From: Brendan Halpin <brendan.halpin at ul.ie>
  | Date: Thu, 23 Aug 2007 20:31:03 +0100
  | 
  | Brian Foster <blf at utvinternet.ie> writes:
  | >  after a bit of head-scratching, the easiest approach
  | >  seems to be a bit of pre-processing; that is, make
  | >  the two types of spaces unique.
  | 
  | Frankly, the easiest approach is to bite the bullet and go for a
  | regexp approach.  [ ... ]

 it's some of this and some of that:  unless yer an RE guru,
 the size/number of REs involved can be daunting and difficult
 to debug, and (I suspect) yer left wondering what odd cases
 were missed (i.e., it could be difficult for an RE non-guru to
 confidently grok when the RE “fails”).  OTOH, an RE can be
 the easiest and quickest (in most senses) approach ....

 w.r.t. my speculation that it ought to be possible to generalise
 the OP's case, here's a possible general solution (this may
 only work with GNU sed(1), as \+ and \n in a replacement are
 not in POSIX sed;  yer kiloage could vary!):

     # each input line consists of zero or more FIELDs.
     # each FIELD is printed on a separate output line.
     # a FIELD is [BTXT] or "QTXT" or STXT where:
     #  - BTXT does not contain ] but may contain [, ", and space anywhere.
     #  - QTXT does not contain " but may contain [, ], and space anywhere.
     #  - STXT does not contain space, [, or ", but may contain ] anywhere.
     # spaces at the beginning and end of an input line are discarded.
     # spaces not in BTXT or QTXT separate FIELDs (and are discarded).
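     # an unterminated [BTXT (no ]) or "QTXT (no closing ") may cause chaos.
     sed -e ':again
             # strip leading spaces; stop once the line is used up
             s/^ \+//
             /^$/d
             # [BTXT]: drop the brackets, mark the end of the FIELD with \n
             /^\[/{
                 s/^\[\([^]]*\)\]/\1\n/
                 bprint
             }
             # "QTXT": drop the quotes, mark the end of the FIELD with \n
             /^"/{
                 s/^"\([^"]*\)"/\1\n/
                 bprint
             }
             # STXT: everything up to the next space, [, or "
             s/^\([^[" ]*\)/\1\n/
             :print
             # print the FIELD, remove it, and loop for the next one
             P
             s/^.*\n//
             tagain'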
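
 On the GNU-only question, here is a hedged sketch of the same script using only constructs that POSIX sed should accept: "  *" instead of " \+", and a backslash followed by a real newline instead of \n in the replacement (which is why the closing "/" of each s command sits in column 0). The wrapper name split_fields and the sample line are illustrative assumptions, not part of the original post, and the sketch has only been reasoned through, not tried on every sed.

```shell
# Hypothetical wrapper: reads lines on stdin, prints one FIELD per line.
split_fields() {
    sed -e ':again
        s/^  *//
        /^$/d
        /^\[/{
            s/^\[\([^]]*\)]/\1\
/
            bprint
        }
        /^"/{
            s/^"\([^"]*\)"/\1\
/
            bprint
        }
        s/^\([^[" ]*\)/\1\
/
        :print
        P
        s/^.*\n//
        tagain'
}

# Worked example (the sample input is made up for illustration):
printf '%s\n' '  foo [bar baz] "qux [x]" end]  ' | split_fields
# prints:
#   foo
#   bar baz
#   qux [x]
#   end]
```

 As in the original, an unterminated [BTXT or "QTXT is not handled gracefully.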

  other solutions are possible.
  both [BTXT and "QTXT malformed FIELDs should be handled better.  ;-\ 
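
  One possible way to handle the malformed FIELDs better is to reject them before they reach sed. The sketch below is an assumption rather than part of the original solution: a single extended RE that describes a well-formed line under the FIELD rules above, with grep(1) listing (with line numbers) whatever does not match. The function name malformed_lines is hypothetical.

```shell
# Hypothetical pre-check: print "N:line" for every input line containing
# an unterminated [BTXT or "QTXT FIELD; print nothing if all lines are fine.
# A well-formed line is: zero or more FIELDs, each optionally preceded by
# spaces, where a FIELD is [BTXT] or "QTXT" or STXT, then optional spaces.
malformed_lines() {
    grep -vnE '^( *(\[[^]]*]|"[^"]*"|[^[" ]+))* *$'
}

# Illustrative input: the first line is well-formed, the other two are not.
printf '%s\n' 'ok [a b] "c d"' 'bad [a b' 'bad "c d' | malformed_lines
# prints:
#   2:bad [a b
#   3:bad "c d
```

  The three alternatives in the RE start with different characters ([, ", or anything else that is not a space), mirroring the three branches of the sed script, so a line that passes this check should never reach the “chaos” cases.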

cheers!
	-blf-


