LINUX.IE, website of the Irish Linux Users' Group

[ILUG] RAID, huge filesystems and data mining.

Olivier Tharan olive at oban.frmug.org
Wed Jul 14 09:33:25 IST 2004


* Ronan Cunniffe <rcunniff at stp.dias.ie> (20040713 14:59):
>    Prompted by the "why RAID" discussion, I want to see what
> ILUGgers think of the following data-mining challenge, and my current
> sorta idea for solving it.  It's not *my* problem, but it's an interesting
> one.
> 
>    Large (1-2TB, scaling soon x10 or thereabouts) proprietary
> (multi-owner) data corpus, made up of many (thousands at least) of
> separate datasets.
> 
>    You are holding this data, and mediating access to it for an arbitrary
> number of dataminers.  Each user has a very definite set of access
> permissions, and it's not a regular pattern (i.e. there's no easy way of
> splitting the problem).
>    A data-mining run is going to involve 0.1 to 0.5 TB.
> 
>    This is (AFAIK) going to run on Red Hat 9, or possibly Fedora or
> something more recent.

I don't know whether you want to build your solution on
directly attached storage, or whether it has to be
open-source only.

A NetApp filer could do what you want. It serves files over NFS
(or CIFS, HTTP, etc.), but it could also be part of a SAN over
Fibre Channel (I am not sure about this part). When you run out
of space, you throw more disks in your qtree[1] and there you are.

The "views" you are talking about could translate into
"snapshots": at one given time, what your users see is a frozen
filesystem where the data does not change, whereas underneath the
snapshot you can add more data or do some cleaning.
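
To make the snapshot idea concrete, here is a toy sketch in
Python. It is not how a NetApp (or any real filesystem) implements
snapshots; it only models the behaviour described above. The
DatasetStore class and the dataset names are made up for the
illustration.

```python
from types import MappingProxyType


class DatasetStore:
    """Toy in-memory stand-in for a filesystem that supports snapshots."""

    def __init__(self):
        self._live = {}  # dataset name -> contents (the "live" filesystem)

    def write(self, name, data):
        self._live[name] = data

    def delete(self, name):
        self._live.pop(name, None)

    def snapshot(self):
        # Copy only the name -> contents table; the contents themselves
        # are shared. The read-only proxy stops users from modifying it.
        return MappingProxyType(dict(self._live))


store = DatasetStore()
store.write("survey-a", b"v1")

view = store.snapshot()           # what the dataminers see

store.write("survey-b", b"new")   # add more data underneath
store.delete("survey-a")          # or do some cleaning

print(sorted(view))               # the frozen view is unchanged
print(sorted(store._live))        # the live store has moved on
```

The users keep working against `view` while the administrator
changes the live store; a new snapshot is taken when the changes
should become visible.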

I don't know if there is a filesystem on Linux that supports
snapshots; at least UFS2 on FreeBSD-CURRENT now does.

On the downside, the permissions are tied to your system's users
and groups, so for the access permissions mentioned earlier, you
would have to rely on Unix groups and basic permissions (no ACLs).
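
As a rough illustration of what "Unix groups and basic
permissions" gives you, the sketch below reproduces the classic
owner/group/other read check in Python. The function name and the
uid/gid numbers are invented for the example, and mapping one
dataset to one Unix group is my assumption, not something from
the original problem statement. The root user and ACLs are
deliberately left out.

```python
import stat


def can_read(st_mode, st_uid, st_gid, uid, gids):
    """Approximate the basic Unix read check (no ACLs, no root override).

    Exactly one permission class applies: owner, else group, else other.
    """
    if uid == st_uid:
        return bool(st_mode & stat.S_IRUSR)
    if st_gid in gids:
        return bool(st_mode & stat.S_IRGRP)
    return bool(st_mode & stat.S_IROTH)


# Hypothetical dataset directory: owner 1000, group 2000, mode rwxr-x---
mode, owner, group = 0o750, 1000, 2000

print(can_read(mode, owner, group, uid=1001, gids={2000}))  # group member
print(can_read(mode, owner, group, uid=1002, gids={3000}))  # outsider
```

With one group per dataset, each dataminer's access set becomes
the list of groups the account belongs to, which is workable but
gets clumsy when the pattern is as irregular as described.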

Do not think of NetApps only as very expensive file storage
cabinets, because you also get a reliable solution for the price.

[1] "quota tree", roughly equivalent to a mount point. You put
one or several qtrees on a RAID subsystem in a NetApp; that RAID
subsystem is, by the way, some kind of RAID4++.

-- 
olive


