LINUX.IE, website of the Irish Linux Users' Group
[ILUG] Slightly Off Topic : Concurrent Access to a Text File in a bash script - can I enforce thread safety

Brian Foster blf at utvinternet.ie
Sat Jul 14 00:05:50 IST 2007


  | Date: Fri, 13 Jul 2007 15:45:42 +0100
  | From: "Oisin Kim" <oisinkim at gmail.com>
  |[ ... ]
  | "Forgetting O_DIRECT for a moment, O_APPEND writes on NFS don't work
  | in anycase when multiple clients are writing to a file [ ... ]
  | 
  | Basically, its not safe, as in my case, we will always be using NFS mounts.
  |[ ... ]
  | > On 7/13/07, Pádraig Brady <P at draigbrady.com> wrote:
  | >[ ... ]
  | > > Well when you open a file with O_APPEND set (as the shell
  | > > does when you `>> file`), on each write, the file offset
  | > > private to each process is automatically set to the current
  | > > file size.  All you have to worry about is that cksum does
  | > > not write a partial line the whole way to the kernel
  | > > before it scheduled.  [ ... ]

 there is another simple approach which doesn't depend
 on the O_APPEND semantics:  use an intermediate named
 pipe (FIFO).  POSIX guarantees concurrent write(2)s
 to a pipe (named and anonymous) are atomic (with some
 fairly minor caveats).  so the trick is to have a
 simple reader daemon that appends what it reads from
 a well-known named pipe to yer logfile (and, AFAIK,
 the logfile can be on an NFS mount).  messages to be
 logged are simply written with one write(2) system
 call to that named pipe.

 the caveats (that I can now recall):

  + like O_APPEND, each message must be written to the
    named pipe as exactly one write(2) system call.

  + there is a limit to the size of write (to the pipe);
    larger writes are legal, but not atomic.  the limit
    is PIPE_BUF, which POSIX requires to be at least 512
    bytes and which is 4096 bytes (4 KiB) on Linux
    (larger than any reasonable line).

  + all writers must use the same named pipe.  hence,
    AFAIK, all writers must be on the same machine,
    and the named pipe must be local.
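
 the size caveat can be checked before writing.  what follows
 is only a minimal sketch: it assumes getconf(1) is available,
 and the names `fifo_dir', `FIFO_DIR' and `message' are
 hypothetical, chosen purely for illustration.

```shell
#!/bin/bash
# Directory that holds (or will hold) the FIFO; "." is an assumption.
fifo_dir=${FIFO_DIR:-.}

# Ask the system for the atomic pipe write limit on that filesystem.
pipe_buf=$(getconf PIPE_BUF "$fifo_dir") || exit 1

message='an example log line'	# hypothetical message

# Count bytes, including the trailing newline the writer will add.
len=$(printf '%s\n' "$message" | wc -c)

if [ "$len" -le "$pipe_buf" ]; then
	echo "ok: $len bytes fits within PIPE_BUF ($pipe_buf)"
else
	echo "too long: $len bytes exceeds PIPE_BUF ($pipe_buf)" >&2
fi
```

 a message that fails this check can still be written, but it
 may then be interleaved with output from other writers.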

 below is a pair of illustrative bash(1) scripts:

  + `rd' is the daemon, to be run in the background:

       rd &

     each FIFO should have at most one reader daemon.

  + `wr' is the writer:

       wr ["message"]...

     multiple writers may run concurrently.

  + to terminate the daemon, kill it by job number:

       kill %<jobno>

     where <jobno> was printed by bash when you started
     it (or use the `jobs' command to get the <jobno>).

 I haven't exhaustively tested the scripts, but I've used
 this basic trick multiple times in the past.  (in fact,
 I thought I had a C program someplace which wrote each
 line read from stdin using exactly one write(2), but I
 can't find it ATM.  ;-\  )
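
 in the absence of that C program, a rough bash(1) stand-in
 might look like the sketch below.  the function name
 `relay_lines' is made up here, and the sketch assumes (it is
 not guaranteed!) that bash's builtin printf hands a short,
 single-line message to the kernel with one write(2).

```shell
#!/bin/bash
# relay_lines - copy stdin to stdout one line at a time, issuing
# one builtin printf per line.  ASSUMPTION: each such printf of a
# line shorter than PIPE_BUF results in a single write(2).
relay_lines() {
	local line
	# The `|| [ -n "$line" ]' keeps a final line lacking a newline.
	while IFS= read -r line || [ -n "$line" ]; do
		printf '%s\n' "$line"
	done
}
```

 it would be used as a filter in front of the named pipe, e.g.
 `some_command | relay_lines >"$FIFO"', where `some_command' is
 whatever produces the lines to be logged.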

=====(cut here and below)=====(`rd' reader daemon)=====
#!/bin/bash
#
# rd	- Reader daemon
#
# Usage: rd &
#        ... concurrent `wr message' ...
#        kill %<jobno>
#
# Read messages from $FIFO (default: "named_pipe"), appending
# each to $LOG (default: "log_file").  POSIX requires pipe
# writes to be atomic; hence, provided the writer (`wr') calls
# write(2) exactly once (for each message), concurrent messages
# are effectively serialised.

	# Create named pipe $FIFO if it does not already exist.
	# We don't do anything special if it does exist, but
	# since it is nominally removed when this daemon exits,
	# an existing FIFO implies there already is a reader.
[ -p "${FIFO:=named_pipe}" ]  ||  mkfifo -- "$FIFO"  ||  exit

	# In the background, append everything written to $FIFO
	# to $LOG.  A trick is to ignore SIGTERM to allow a simple
	# `kill %<jobno>' to clean up nicely (remove $FIFO), as
	# a trivial flag (no $FIFO iff. no daemon (not robust!)).
( trap '' TERM; cat -- "$FIFO" >>"${LOG:-log_file}"; rm -f -- "$FIFO" ) &

	# Keep $FIFO open so cat(1) does not get a premature EOF.
	# To terminate, kill(1) the sleep(1); an easy way to do
	# this is `kill %<jobno>' where <jobno> is the bash(1)
	# job number of this reader daemon script, which should
	# be run in the background (`rd &').
while
	date --iso-8601=seconds			# timestamp every 2m,
	sleep $(( 2 * 60 ))  &&  kill -0 $!	# provided cat exists
do
	: Nothing
done >"$FIFO"
=====(cut here and above)=====(`rd' reader daemon)=====

=====(cut here and below)=====(`wr' logging writer)=====
#!/bin/bash
# wr	- Writer
#
# Usage: wr [message]...
#
# Write <message> into $FIFO (default: "named_pipe") to be read
# and serialised by `rd'.  Multiple `wr' may write concurrently;
# each <message> is atomic (provided it does not exceed the pipe
# atomic write size (PIPE_BUF bytes)).
#
# NOTE: Assumes *WRONGLY(?)* `echo' calls write(2) exactly once!

	# If $FIFO does not exist, Do Not Pass Go.
[ -p "${FIFO:=named_pipe}" ]  || {
	echo -E "$0: error: Missing FIFO (no reader?): $FIFO"  >&2
	exit 1
}

	# Stuff each message down the $FIFO (there is a race here
	# should the `rd' daemon be killed after the above check,
	# but we ignore this glitch).  For purposes of illustration,
	# each argument is a separate message.
	#
	# A message *must* be written with exactly one write(2).
	# Here, we use `echo' to show how simple this is, but
	# `echo' (probably) does *not* do exactly one write;
	# i.e., this is (probably) not precisely correct!  ;-(
for message in "$@"; do
	echo "$message" >"$FIFO"	# WRONG(? see above)!
done
=====(cut here and above)=====(`wr' logging writer)=====
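
 the serialising behaviour can be exercised with a small
 self-contained test, sketched below.  it does not use `rd' or
 `wr' themselves: it makes a throwaway FIFO and log file in a
 temporary directory (all names here are hypothetical), starts
 one cat(1) reader, and launches 20 concurrent writers, each
 sending one short line.  like `wr', it assumes the builtin
 used to write (printf here) issues a single write(2) per line.

```shell
#!/bin/bash
# Hypothetical stress test: many concurrent writers, one reader, one FIFO.
dir=$(mktemp -d) || exit 1
fifo=$dir/named_pipe
log=$dir/log_file
writers=20
mkfifo -- "$fifo" || exit 1

# Reader: append everything that arrives on the FIFO to the log.
cat -- "$fifo" >>"$log" &
reader=$!

# Hold a write end open so the reader sees no EOF between writers.
exec 3>"$fifo"

# Each writer sends one short line (far below PIPE_BUF).
pids=()
for (( i = 1; i <= writers; i++ )); do
	printf 'writer %02d: the quick brown fox\n' "$i" >"$fifo" &
	pids+=("$!")
done
wait "${pids[@]}"

# Close our write end; the reader then gets EOF and exits.
exec 3>&-
wait "$reader"

# Every line should have arrived whole, in some order.
total=$(wc -l <"$log")
intact=$(grep -c -E '^writer [0-9]{2}: the quick brown fox$' "$log")
echo "lines logged: $total, intact: $intact (writers: $writers)"
rm -rf -- "$dir"
```

 if both counts equal the number of writers, no line was split
 or interleaved.  a passing run does not prove atomicity, of
 course; it only fails to find a counterexample.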

cheers!
	-blf-
-- 
▶ ▶  I AM CURRENTLY LOOKING FOR A JOB!  ◀ ◀ | Brian Foster
Experienced (>25 yrs) software engineer:    |        Montpellier, FRANCE
 • Unix, Linux, embedded, design-for-test;  | Stop E$$o (ExxonMobile)!
 • Software/hardware co-design, debugging;  |     http:/www.stopesso.com
 • Kernels, drivers, filesystems, &tc;    Résumé (CV) & contact details:
 • IDL, automated testing, process, &tc.   http://www.blf.utvinternet.ie


