NAS for archival info

Hi,
I am setting up an expandable NAS for a museum's data storage. (It might also be used for some website hosting and FTP, but that's beside the point.) It needs to be bulletproof. They are storing digital copies and database systems of very important documents (one-offs). They are also computer "challenged", so it needs to have onsite backups in case they accidentally delete files. I was leaning towards FreeNAS. I will be getting a 24-drive 4U case (NORCO RPC-4224 4U) and using a RAID card that can handle all the drives (Adaptec RAID 52445 2258800-R). I would start out with 8 hard drives (WD1002FAEX 1TB 7200 RPM 64MB), setting them up as 2 sets of 4 with RAID6, and then use ZFS. I would have one RAID set be a backup of the other, not mirrored: just have it make a clone of the drive at 5pm every day. Then I will have an off-site backup. I am hoping to use a cloud-based service, but I think I will start with discs or something. (Any ideas for this would be great.) Like I stated before, it will need to be expandable in the future, so that all they need to do is add hard drives. I would gladly take recommendations.
Thanks,
Rick
 
ZFS will be expandable: you can add almost any drive you want to the pool and grow it.
Depending on the size of your data, the backup can become quite time-consuming, so I suggest using an incremental program (at least rsync, but backup-specific tools like Bacula could be better). One doubt I have is whether it is worth setting up RAID6 and using ZFS at the same time. In theory ZFS can do very good RAID without hardware, but I tend to trust the hardware more (it's my opinion, of course). Also note that running a database on top of RAID5/6 may not be the best choice, and using the same configuration for database and documents may not be optimal, but of course it depends on your workload.
Finally, make sure the controller has enough bandwidth to drive all the drives, or your daily backup could very easily become a bottleneck.
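
As a rough illustration of the incremental approach suggested above, here is a minimal sketch using rsync. The two paths are made-up placeholders, and the RUN variable is only a convention of this sketch: by default it prints the command instead of running it, so nothing is copied until you clear it.

```shell
#!/bin/sh
# Sketch only: SRC and DST are hypothetical paths, not taken from the thread.
SRC="${SRC:-/mnt/tank/museum/}"
DST="${DST:-/mnt/backup/museum/}"

# RUN defaults to "echo", so the command is printed instead of executed.
# Invoke the script with RUN="" to really run rsync.
RUN="${RUN-echo}"

incremental_backup() {
    # -a        keep permissions, ownership, timestamps and symlinks
    # --delete  remove files from the copy that no longer exist in the source
    # Only files that changed since the last run are transferred.
    $RUN rsync -a --delete "$1" "$2"
}

incremental_backup "$SRC" "$DST"
```

Note that --delete mirrors deletions as well, so a copy made this way does not by itself protect against files removed by accident.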
 
And please make sure there is a way to do movable backups. Reading about "very important documents (one-offs)" in combination with a museum, and then not catering for the disaster case, makes me somewhat uneasy. Maybe you will not build that in from the start (though I think you should), but do not make decisions that may block it later.
 
If you want to archive, you should think about WORM (Write Once Read Many) media. This kind of media is usually easier to move around and store in a safe or similar place. I would really treat this as an additional issue.
 
fluca1978 said:
the backup can become quite time-consuming, so I suggest using an incremental program

Can I back up one hard drive set to another in the same system?

fluca1978 said:
One doubt I have is whether it is worth setting up RAID6 and using ZFS at the same time.

Well, I liked the "snapshot" tool and info checking on the drive. So even if it shows up to the system as 2 drives, I can take advantage of what ZFS has to offer.

fluca1978 said:
Also note that running a database on top of RAID5/6 may not be the best choice, and using the same configuration for database and documents may not be optimal, but of course it depends on your workload.

We use a POS software called FileMaker Pro, so it really is just like any other document stored on a computer. It is not a "true" database.

Crivens said:
And please make sure there is a way to do movable backups. Reading about "very important documents (one-offs)" in combination with a museum, and then not catering for the disaster case, makes me somewhat uneasy.

And that's why I got WD Blacks: 5-year warranty. I also have two arrays that store the info, so it would take up to 4 hard drive failures to render the storage useless. Then that is where the off-site backup comes into effect. Does anyone know if FreeNAS allows for cloud backup?

Thanks for the help/advice
Rick
 
Since you have full control of the system, you can decide what/when/how to back up.
ZFS is a great file system, and you can use it, but my doubt is whether it is worth setting up raidz or a mirror on a system that has hardware RAID. I believe that using ZFS as a stripe will give you all the power of ZFS without the extra cost of software RAID on top of the hardware one.
I see your hard disks are SATA; I strongly suggest using SAS if you want full speed. In my experience SAS can sustain 2x or more the write speed of SATA. Of course, again, it depends on how much data you are writing and how much you are reading back.
As far as I know, FreeNAS does not have any facility for cloud storage, apart from the ability to mount any kind of "standard" remote file system.
 
Let's see if I got this right.
I should use ZFS in software to make the two arrays, and only use the hardware to let the system see the drives.
I could then read the array in another computer if I needed to. Btw, where is my redundancy made? Is it made by ZFS? Would I still have the two-drive-failure feature? Also, what is a snapshot? Is it a backup? Cool. I think this build will give me a lot of options. I am using SATA because there will be a maximum of 6 people writing to it at a time. At 6 Gb/s I think transfers of the little files will happen quickly. The bigger files like videos will take longer, but I don't care. They need this on the cheap.
 
ZFS provides the RAID6-style protection (actually raidz2), which tolerates 2 physical disk failures, as you want.
A snapshot is a file system feature that lets you see your file system as it was at many points in time, without needing the space for complete separate copies of all your data. Best to have a dig around the internet for a fuller explanation; you could start here:

http://en.wikipedia.org/wiki/ZFS#Snapshots_and_clones

It is a good solution for recovering from accidental deletions etc., but it's not a backup in the normal sense, since any fatal issue with your file system (ZFS pool) will render all your snapshots unusable too. So you still need actual backups on a different disk or tape or whatever...
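
To make the "recover an accidental deletion" case concrete, here is a minimal sketch. ZFS exposes each snapshot read-only under a hidden .zfs/snapshot directory at the dataset's mountpoint; the dataset name tank/museum, the snapshot name before-cleanup and the file name catalog.fp7 are all invented for the example.

```shell
#!/bin/sh
# Sketch only: dataset name, mountpoint and file name are hypothetical.

# Build the read-only path where ZFS exposes a file as it was in a snapshot.
# $1 = dataset mountpoint, $2 = snapshot name, $3 = path relative to mountpoint
snapshot_path() {
    printf '%s/.zfs/snapshot/%s/%s\n' "$1" "$2" "$3"
}

# Typical sequence (shown as comments because it needs a real pool):
#   zfs snapshot tank/museum@before-cleanup     # take the snapshot
#   zfs list -t snapshot                        # list existing snapshots
#   cp "$(snapshot_path /tank/museum before-cleanup catalog.fp7)" /tank/museum/
snapshot_path /tank/museum before-cleanup catalog.fp7
```

Copying the file back out of the snapshot directory restores just that file and leaves everything else in the live file system untouched.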

ta Andy.
 
richcj10 said:
And that's why I got WD Blacks: 5-year warranty. I also have two arrays that store the info, so it would take up to 4 hard drive failures to render the storage useless. Then that is where the off-site backup comes into effect. Does anyone know if FreeNAS allows for cloud backup?

Thanks for the help/advice
Rick

When the fire department drives home after the lightning strike, the warranty will not give you back the data. The machine does not even need to burn; having the disks curl up from a massive overvoltage will do just as nicely.

But you are on the right track with ZFS; there is no need to use a separate RAID system. That would, in some cases, be counterproductive, as it might hide drive failures from ZFS and thus keep you from noticing them. Every disk that has failed on me in recent years was reported as damaged by ZFS before SMART even got a hint that something was wrong.

That, I would not want to miss ever again :)
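
For anyone who wants that early warning delivered automatically, a minimal sketch follows. It relies on `zpool status -x` printing "all pools are healthy" when nothing is wrong; the mail command and subject line in the comment are assumptions for illustration only. A periodic `zpool scrub` is what makes ZFS read and verify every block, so errors tend to surface there first.

```shell
#!/bin/sh
# Sketch only: meant to be fed the output of `zpool status -x`.

# Succeeds (exit 0) when the text on stdin reports that every pool is healthy.
pools_healthy() {
    grep -q 'all pools are healthy'
}

# On a real system this could run from cron (mail setup is an assumption):
#   zpool status -x | pools_healthy || zpool status | mail -s "NAS alert" root
# Demonstration with canned text instead of a live pool:
if echo 'all pools are healthy' | pools_healthy; then
    echo 'OK'
else
    echo 'PROBLEM'
fi
```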
 
First, how many drives do I need with ZFS to get the two-failure safety? I think I will then have a smaller array to hold snapshots. Which backup media do you recommend? (DVD, drive, etc.)
Also, if I have ZFS, can I make the drives a software RAID (so it shows up as one)?
Thanks for the help!
 
Snapshots are stored in the same storage pool as the rest, so you do not need any extra drives for that.

ZFS is not only a file system but a volume manager as well. When you use a more traditional OS, you are exposed to the different layers. With ZFS, there is no need. You can add storage space to the file system and watch the free space show up at once, no need to reformat or to "grow" a file system.
When you use the search function and look for zfs performance in the forum you will find several threads explaining what you can do with 8 disks.

As for a backup, that is a tough one. I use a separate ZFS pool which I can detach from the server, and I keep the disks locked away somewhere else. If you go for a cheap backup solution, separate disks provide the best GB per $. An external eSATA enclosure would do the trick quite well. You can, when time permits, use more than one enclosure and rotate them so you have at least one backup off-site at all times. This can save you some money/time as long as the archive is not filled with all the really important data.
 
Don't try to back up the data inside the same case. Back up to a separate machine, preferably off-site. And, if it's *really* important, back it up to multiple locations on different types of media (disk, DVD, tape, etc). And be sure to do test restores from the media to make sure it's still usable.

I'd recommend ditching the hardware RAID controller completely. Return it. Get some 8-port SATA controllers instead. Something like the SuperMicro AOC-USAS-L8i or AOC-USAS2-L8i works beautifully with FreeBSD and ZFS. That way, each disk appears directly in FreeBSD, without any RAID nonsense in the way.
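
One simple way to do such a test restore is to restore into a scratch directory and compare it against the original. The sketch below assumes both trees are reachable as directories; the directory and file names are invented for the demonstration.

```shell
#!/bin/sh
# Sketch only: directory names are hypothetical.

# Succeeds when the restored tree is identical to the original.
# $1 = original directory, $2 = directory restored from backup media
verify_restore() {
    diff -r "$1" "$2" >/dev/null 2>&1
}

# Self-contained demonstration using temporary directories.
work=$(mktemp -d "${TMPDIR:-/tmp}/restoretest.XXXXXX")
mkdir "$work/original" "$work/restored"
echo 'scan of document 001' > "$work/original/doc001.txt"
cp "$work/original/doc001.txt" "$work/restored/"

if verify_restore "$work/original" "$work/restored"; then
    echo 'restore matches'
else
    echo 'restore differs'
fi
rm -rf "$work"
```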

Then use ZFS to create the "arrays" (called vdevs in ZFS). If you care more about data integrity than performance, create raidz3 vdevs (can handle 3 dead drives before losing data). With 24 drives, you could create 3x raidz3 vdevs like so:
# zpool create mypoolname raidz3 da0 da1 da2 da3 da4 da5 da6 da7
# zpool add mypoolname raidz3 da8 da9 da10 da11 da12 da13 da14 da15
# zpool add mypoolname raidz3 da16 da17 da18 da19 da20 da21 da22 da23

You can even do it in stages. Start with 8 drives and a single raidz3 vdev. In a couple of months, add another 8 drives to the system and add another raidz3 vdev. A couple months down the road, add another 8 drives and add another raidz3 vdev. All the storage is available in the same pool.
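
As a back-of-the-envelope check on what each stage buys you, here is a small sketch of the capacity arithmetic. It is only an estimate: it ignores ZFS metadata overhead and the TB versus TiB difference, and it assumes the 1 TB drives mentioned at the start of the thread.

```shell
#!/bin/sh
# Rough estimate only: ignores ZFS metadata overhead and TB-vs-TiB differences.

# $1 = drives per vdev, $2 = parity drives (3 for raidz3),
# $3 = drive size in TB, $4 = number of vdevs in the pool
usable_tb() {
    echo $(( ($1 - $2) * $3 * $4 ))
}

usable_tb 8 3 1 1   # one 8-disk raidz3 vdev of 1 TB drives -> 5
usable_tb 8 3 1 3   # all three vdevs populated             -> 15
```

So each 8-disk raidz3 vdev contributes roughly 5 TB of usable space, and the full 24-bay case roughly 15 TB.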

Every night (or every hour, or whatever schedule you want) you create snapshots of the pool (or individual filesystems). These are your "onsite" backups. If a file is deleted, you go into the snapshots and recover it.

Then you create an identical system and host it somewhere else. And use "zfs send" and "zfs recv" to replicate the data from the one system to the other. That gives you the off-site backup.
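
A minimal sketch of what that replication could look like is below. The dataset, snapshot and host names are all invented, and the function only prints the pipeline so it can be reviewed before anything is sent.

```shell
#!/bin/sh
# Sketch only: dataset, snapshot and host names are hypothetical.

# Print the pipeline that sends the changes between two snapshots to a remote box.
# $1 = local dataset, $2 = previous snapshot, $3 = new snapshot,
# $4 = ssh destination, $5 = dataset on the remote system
replication_cmd() {
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs recv -F %s\n' \
        "$1" "$2" "$1" "$3" "$4" "$5"
}

# Review the printed command first; pipe it to sh once it looks right.
replication_cmd tank/museum monday tuesday backup@offsite.example.org backup/museum
```

The very first transfer has no earlier snapshot to diff against, so it would be a full `zfs send` of a single snapshot without `-i`; every later run only ships the blocks that changed.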

And, then you use the off-site backup box to create the archival media (CD, DVD, Blu-Ray, tape, external harddrive, USB stick, etc).
 
phoenix said:
Don't try to backup the data inside the same case. Backup to a separate machine, preferably off-site.

I totally agree. Backing up data onto the same machine can be faster, but it will not pay off if the machine is burned, damaged or even stolen (it can happen: I had a server stolen by a fired employee because the company did not take my advice to lock the server room!).

phoenix said:
I'd recommend ditching the hardware RAID controller completely. Return it. Get some 8-port SATA controllers instead. Something like the SuperMicro AOC-USAS-L8i or AOC-USAS2-L8i works beautifully with FreeBSD and ZFS.

While it is true that ZFS is a very stable filesystem, I trust raid controllers too (if they are not cheap components).
 
fluca1978 said:
While it is true that ZFS is a very stable filesystem, I trust raid controllers too (if they are not cheap components).

Even an expensive RAID controller can serve you the bad block of a mirror 50% of the time, something that leads to interesting heisenbugs and which ZFS will detect and correct if possible. While a good RAID card has its benefits, which I agree on, using it for ZFS is a borderline case which I would avoid.
 
ZFS has been built with the idea that the hardware can fail, including what they call silent data corruption. I agree that running ZFS on top of a RAID card duplicates work, and that choosing one or the other is a better idea. It depends on how much you trust your hardware or your software. If you need to run cheap components, ZFS is definitely the choice; if you can run expensive components, I would be in doubt about what to choose. I ran an HP P400 controller driving 12 disks for 3+ years, and luckily I haven't had a single fault. At the same time, I have been running ZFS FreeBSD installations for 2+ years with no fault either.
So I cannot say without any doubt that ZFS is better than hardware or vice versa. For new installations I tend to prefer ZFS over expensive hardware.
This is my experience and opinion.
 
ZFS snapshots can protect your users against accidentally deleted files. You don't need to set up a mirror for that. This is apart from your (offsite) backups of course. You could create a script that creates snapshots every day, and deletes snapshots older than the wanted retention time. If for instance a week's retention is enough, you could do something like this:

Code:
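#!/bin/sh

# The snapshot name is the current weekday (e.g. Monday), so each name is
# reused, and the old snapshot replaced, after seven days.
snapshotname=$(/bin/date "+%A")

# Loop over every ZFS filesystem in all pools.
for dataset in $(/sbin/zfs list -H -o name -t filesystem)
do
        # Remove last week's snapshot of the same name. On the very first
        # run there is none yet, so this prints a harmless error.
        /sbin/zfs destroy "$dataset@$snapshotname"
        /sbin/zfs snapshot "$dataset@$snapshotname"
done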

This will create snapshots for each of your ZFS filesystems, like tank@Monday, tank@Tuesday and so on, keeping data safe against deletion/alteration for a week.
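
If snapshots are taken more often than once a day, weekday names collide, so a timestamped name plus a pruning step is one alternative. The sketch below is only an illustration: the "auto-" prefix and the keep count are arbitrary choices, and the pruning function works on a plain list of names so it can be tried without a pool.

```shell
#!/bin/sh
# Sketch only: the "auto-" prefix and the keep count are hypothetical choices.

# Timestamped name of the form auto-YYYYMMDD-HHMM, unique per minute.
snapshot_name() {
    date '+auto-%Y%m%d-%H%M'
}

# Read snapshot names (oldest first) on stdin and print the ones to destroy,
# keeping the newest $1 entries.
snapshots_to_prune() {
    awk -v keep="$1" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }'
}

# On a real system the list could come from:
#   zfs list -H -t snapshot -o name -s creation
# Demonstration with canned names, keeping the newest two:
printf '%s\n' tank@auto-1 tank@auto-2 tank@auto-3 tank@auto-4 | snapshots_to_prune 2
```

Each name printed by the pruning step would then be passed to `zfs destroy`, after checking the list by eye the first few times.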
 
Thanks all for your feedback.

1) I like RAID cards because in a pinch they would allow me to run multiple RAIDs if I needed to. They are also optimized for speed and for the bottleneck that will occur at the hard drives.

2) I decided to go with a raid card that will also allow me to do JBOD. This will allow me to have great flexibility in my build. (http://www.newegg.com/Product/Product.aspx?Item=N82E16816151099)

I think I will use a portable HD array to store backups in once a week. I like the snapshots and will take them every 30 min. I like the raidz3 too. I have not purchased a UPS for this system yet, because they were going to put a backup generator in the building. From the sounds of things, that will not be happening anytime soon. Any recommended UPS that will work with FreeNAS?
 