LINUX.IE, website of the Irish Linux Users' Group

[ILUG] benefits of raw i/o


Paul Jakma paul at clubi.ie
Wed Jun 14 01:27:16 IST 2000


AIUI == as i understand it.. yes. and it applies to this email too. 

(in fact all my mails should really have big AIUI, IIRC, IMO,
etc.. disclaimers around them..) :)

On Tue, 13 Jun 2000, David Murphy wrote:

> If by 'serial disk i/o' you mean 'sequential disk i/o', then yes,

yes i do. but what's the substantive difference between serial and
sequential anyway?

> it is, and will be on any OS that uses disks. As I said
> yesterday, moving the disk heads is one of the slowest operations
> on a system - sequential reads need fewer, shorter seeks than
> random reads.
> 

indeed. bear the above in mind and re-read what you say below about
filesystems. :)
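
To put a rough number on the seek argument, here is a toy model (not a benchmark). It assumes a hypothetical disk of 10,000 blocks and treats head movement as proportional to the distance between block numbers, which ignores rotational latency, track layout and drive caches.

```python
import random


def total_head_travel(block_order):
    """Sum of the distances the head moves between consecutive reads."""
    return sum(abs(b - a) for a, b in zip(block_order, block_order[1:]))


# Hypothetical disk: 10,000 blocks numbered in on-disk order.
blocks = list(range(10_000))

sequential = blocks                      # read in on-disk order
shuffled = blocks[:]
random.Random(0).shuffle(shuffled)       # same blocks, random order

seq_travel = total_head_travel(sequential)   # 9999: one block per step
rand_travel = total_head_travel(shuffled)

print("sequential:", seq_travel)
print("random:    ", rand_travel)
```

Both runs read exactly the same blocks; only the order differs, and the random order moves the head several orders of magnitude further in this model.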

> It's not the buffering, it's the filesystem -

uhmm.. filesystem would have an effect, obviously. But that can't be
it. The killer /must/ be block buffered I/O - if it wasn't then
surely the solution would be for (eg) Oracle to just use block
devices directly? eg tell it to use /dev/hdd - so that it would still
be using block buffered I/O but without the FS overhead.

but then why was raw I/O invented? could only be because the true
overhead is in the OS block buffering...

incidentally, one way of optimising block I/O for large db
performance is to get the OS to do minimal buffering for that
fs. Eg donald becker had a patch where you could tell the kernel to
only use 50% of the buffer cache for a particular fs.

> as you'll recall, with ufs, and I presume ext2fs, once a file has
> more than X direct blocks, the filesystem starts allocating
> indirect blocks, double indirect blocks, etc. etc. - the upshot
> is the bigger the file, the more pointers you have to follow
> around the disk.

but that's not really a huge overhead /imo/. anyway, chances are you
already have the indirect blocks buffered.
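
A small sketch of the pointer-chasing being discussed. The figures are assumptions for illustration, not taken from the thread: 12 direct blocks (the "X" above), 4 KiB blocks, and 1024 pointers per indirect block, roughly the classic ufs/ext2 layout.

```python
BLOCK_SIZE = 4096        # assumed block size in bytes
N_DIRECT = 12            # assumed number of direct block pointers
PTRS_PER_BLOCK = 1024    # assumed: 4 KiB block / 4-byte pointers


def indirection_levels(block_index, n_direct=N_DIRECT, ptrs=PTRS_PER_BLOCK):
    """Return how many indirect blocks must be read to locate a logical
    block of a file: 0 = direct, 1 = single, 2 = double, 3 = triple."""
    if block_index < n_direct:
        return 0
    block_index -= n_direct
    span = ptrs                      # blocks reachable at the current level
    for level in (1, 2, 3):
        if block_index < span:
            return level
        block_index -= span
        span *= ptrs
    raise ValueError("block index beyond triple indirection")


def levels_for_offset(offset):
    """Same lookup, starting from a byte offset within the file."""
    return indirection_levels(offset // BLOCK_SIZE)


print(levels_for_offset(0))               # 0: first block is direct
print(levels_for_offset(48 * 1024))       # 1: first block past the direct range
print(levels_for_offset(4 * 2**30 - 1))   # 2: last byte of a 4GB file
```

Under these assumptions the last block of a 4GB file costs two extra block reads, and only if those indirect blocks are not already in the buffer cache.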

> With an extent-based filesystem,

<unsure>aren't extents just a way to maintain groups of related
blocks, to try keep these blocks in a relatively sequential order on
disk?</unsure> extents are just another layer of indirection, because
you will still have blocks, fragments, {double,triple} indirect
blocks to dereference...

> the
> typical commercial example being Veritas File System [VxFS], you
> could have a 4GB file, with the FS allocating it as just one 4GB
> extent. Applications can influence the way VxFS allocates files,
> hence you can approach the control you have with a raw disk,
> while avoiding the inconvenience of raw partitions.
> 

urmm... even extent-based/higher-tech FS's such as SGI XFS, DU AdvFS (and
i think VxFS too) have a raw I/O interface.

also, the application control thing: that's probably an IOCTL/open
flag to tell the fs /NOT/ to buffer that device/file.

> This is why you should ask questions if someone tells you they're
> running Oracle on UFS.

or maybe they can't afford VxFS? :)

> If they're running Oracle over NFS, the
> question would be "Have you had your head examined recently?".
> 

:0

> The buffering issue is essentially double-caching eating all your RAM

still can't be the full story. if it was then the answer would be
weakly buffered block i/o - and no-one would want the following:

bash-2.03# ls -l /dev/dsk/dks0d1s0 /dev/rdsk/dks0d1s0 
brw-------    2 root     sys      128, 16 Feb 24 01:54 /dev/dsk/dks0d1s0
crw-------    2 root     sys      128, 16 Feb 24 01:54 /dev/rdsk/dks0d1s0
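
The two nodes in the listing share major/minor 128, 16 and differ only in the leading b (block, buffered) versus c (character, the raw interface on this system). A minimal sketch of making the same distinction programmatically, using Python's standard os and stat modules; the helper names describe_device and describe_path are made up for this example.

```python
import os
import stat


def describe_device(mode, rdev):
    """Classify a device node from its st_mode and st_rdev fields."""
    if stat.S_ISBLK(mode):
        kind = "block"
    elif stat.S_ISCHR(mode):
        kind = "character"
    else:
        kind = "not a device"
    return kind, os.major(rdev), os.minor(rdev)


def describe_path(path):
    """Same classification for a node on disk, e.g. a /dev/dsk or /dev/rdsk entry."""
    st = os.stat(path)
    return describe_device(st.st_mode, st.st_rdev)


# Rebuild the two entries from the listing without needing the real nodes.
dev = os.makedev(128, 16)
print(describe_device(stat.S_IFBLK | 0o600, dev))
print(describe_device(stat.S_IFCHR | 0o600, dev))
```

Note that a character device is not automatically a raw disk; the pairing of /dev/dsk with /dev/rdsk is a convention of the system shown in the listing.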

> - if Oracle is caching the data in its SGA, and your OS is caching
> that same data in its VM system, you may find you don't have much RAM
> left for other things, like, say, the OS 8) VxFS has a potentially
> useful feature, where it can decide if a given read should be buffered
> or not, just be sure you've tuned the threshold - see:
> http://www.sun.com/blueprints/0400/ram-vxfs.pdf
> 

that's the kind of hackery that raw I/O avoids. Sticking loads of
clever little algorithms into your FS to determine whether or not to
buffer a /given/ read and if so, by how much, becomes pointless
beyond a certain point.
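
For concreteness, a toy version of the "buffer this read or not" decision. This is not how VxFS implements it; the class name, the fixed size threshold and the LRU policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class ThresholdCache:
    """Toy read path: small reads go through an LRU cache,
    reads at or above the threshold bypass it entirely."""

    def __init__(self, backing, threshold, capacity):
        self.backing = backing        # bytes object standing in for the disk
        self.threshold = threshold    # reads of >= this many bytes are not cached
        self.capacity = capacity      # maximum number of cached reads
        self.cache = OrderedDict()
        self.disk_reads = 0

    def _read_disk(self, offset, length):
        self.disk_reads += 1
        return bytes(self.backing[offset:offset + length])

    def read(self, offset, length):
        if length >= self.threshold:
            return self._read_disk(offset, length)   # direct, never cached
        key = (offset, length)
        if key in self.cache:
            self.cache.move_to_end(key)              # mark as recently used
            return self.cache[key]
        data = self._read_disk(offset, length)
        self.cache[key] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)           # evict least recently used
        return data


disk = bytes(range(256)) * 64                        # 16 KiB of fake disk
fs = ThresholdCache(disk, threshold=4096, capacity=8)
fs.read(0, 512); fs.read(0, 512)                     # second read is a cache hit
fs.read(0, 8192); fs.read(0, 8192)                   # both go to "disk"
print(fs.disk_reads)                                 # 3
```

Even this toy has two knobs (threshold and capacity) that only make sense if you know the application's access pattern, which is the objection being made here.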

Or do you want your FS to have an intimate knowledge of how oracle
works? Perhaps with a 100MB kernel table full of statistics on how
different observed Oracles access the disk?

That's silly, and that's what raw I/O is about - facing up to the
fact that bloating the kernel with lots of "second-guess
userspace" stuff is bad cause it can't come close to guessing right
most of the time. 

Instead throw that crap back to userspace to the code that knows best
- the app itself - by using raw I/O.

> > Also: RAID systems are optimised for long sequential seeks, which
> > helps...
> 
> Again, this is more the physics of disk drives than RAID systems per
> se.

*low cough* uhmmm.. yes i knew that... of course... *low cough* 

:)

-- 
Paul Jakma	paul at clubi.ie
PGP5 key: http://www.clubi.ie/jakma/publickey.txt
-------------------------------------------
Fortune:
It seems intuitively obvious to me, which means that it might be wrong.
		-- Chris Torek





