Fault tolerance/100% uptime. How?

A friend who is involved in a very small, non-IT-related startup asked me if I knew whether it was possible to create a file server with fault tolerance. I said I was in no way an expert, just a happy FreeBSD user, but I'd try to find out.

So the question is, can servers be arranged to 'fail-over' or whatever you call it, so that work using them isn't interrupted, and data isn't lost?

Thanks.
 
michaelrmgreen said:
A friend who is involved in a very small, non-IT-related startup asked me if I knew whether it was possible to create a file server with fault tolerance. I said I was in no way an expert, just a happy FreeBSD user, but I'd try to find out.
One never builds a server with fault tolerance. You build a fault-tolerant service.

So the question is, can servers be arranged to 'fail-over' or whatever you call it, so that work using them isn't interrupted, and data isn't lost?
Yes. Different techniques exist for different services.

For file services have a look at HAST: http://wiki.freebsd.org/HAST
 
The lady at the startup emailed me last night to say 'Thanks, but it looks a bit too complicated', and 'I'll just get something very reliable, are Dell good?'. I laughed out loud. I said I'd ask here. I also asked about her capacity requirements and budget and, of course, she didn't know, but she said there were five people working there at present and, as far as she knew, none had very large files or needed special software.

So: a low-end file server from someone with a reputation for reliability, OR a parts spec for a build-it-yourself solution with a view to reliability? I'm thinking fanless, SSD and so on.
 
I'll probably be cursed for it but I think your friend will be better off with Windows (using DFS).
 
I know I am going to pay dearly for this but is she cute? That's the crux of the matter.

If yes,

then go for a DIY FreeBSD install;

else go for a Windows install.

Seriously, you cannot direct a computer-illiterate person to a Microsoft Server solution; the cost and effort of servicing and maintaining it are simply unbearable.

She would be better off if you set up the IT for some shares in the company.
 
Lol. Naughty. No, she's not cute. However, the problem is genuinely solved, at least for now: she emailed me last night to say they were going with some kind of Cisco NAS solution.

I'm sure that will be OK in the short term. Long term? Who knows.
 
michaelrmgreen said:
Lol. Naughty. No, she's not cute. However, the problem is genuinely solved, at least for now: she emailed me last night to say they were going with some kind of Cisco NAS solution.

I'm sure that will be OK in the short term. Long term? Who knows.

I set up the IT of a biotech startup two years ago for shares in the company.

They needed to go the high performance computing road; you know clouded CPUs + GPUs + whatever.

They did not have money to pay for a full-time CTO.

Maybe one of these days, somebody will snatch up that company for billions :)
 
michaelrmgreen said:
going with some kind of Cisco NAS solution.

Whether that is a sufficient solution will depend on how the company develops. I strongly doubt that it will be sufficient for 10-20 years, but then no solution would be. In any case, if they are a startup they are likely to make mistakes in setting up their IT infrastructure. Who here can honestly say that they wouldn't do things differently if they got to do them again today? Saving money by going down the SMB-NAS route is a sensible choice.

Just be sure to remind her that she should take care of the backup system. She will probably forget this, speaking from my own experience. _NOTHING_ is 100% reliable; everything fails. Well, gravity is fairly reliable: it will always make you fall down.
 
Going back to that point though, nothing can be 100% reliable. Everything will fail eventually, because it is built by imperfect humans. All we can do is add redundancy, so that individual failures are hidden and the system appears faultless.

Look at Google: they report some ludicrous uptime statistics, I think something like five or six nines. They can only do this because they are able to spread faults across a large number of systems. A single box such as a NAS will never give you that kind of reliability. It only comes through duplicating and distributing resources.
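To put rough numbers on "nines" and on duplication, here is a minimal Python sketch. It assumes that copies fail independently of each other, which is optimistic in practice (shared power, shared building, shared admin mistakes), and the function names are my own, purely for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines.

    Example: 3 nines means 99.9% availability, i.e. 0.1% downtime.
    """
    unavailability = 10 ** (-nines)
    return unavailability * MINUTES_PER_YEAR


def parallel_availability(single: float, copies: int) -> float:
    """Availability of a service that stays up while ANY one copy is up.

    Assumes every copy has the same availability and that failures are
    independent, so the service is down only when all copies are down.
    """
    return 1 - (1 - single) ** copies


if __name__ == "__main__":
    for n in range(2, 7):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
    # Illustrative input only: a 99% available box, duplicated.
    for copies in (1, 2, 3):
        print(f"{copies} copies: {parallel_availability(0.99, copies):.6f}")
```

The point is not the exact figures but the shape: each extra independent copy multiplies the remaining downtime by the failure probability of one box, which is why distribution gets you numbers that no single machine can.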

100% uptime is also a crazy target. You have to understand that each element between the user and your server/box is a link in a chain, and any link can break. All you can really do is mitigate the risk of failure, and sometimes even that may be out of your control (environment, external factors, etc.).
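The chain idea can be sketched in a few lines of Python. This is only an illustration: the component names and availability figures below are made up, and the calculation assumes the links fail independently.

```python
import math


def series_availability(links: list[float]) -> float:
    """Availability of a chain where EVERY link must be up.

    With independent failures, the chain is up only when all links are
    up, so the individual availabilities multiply.
    """
    return math.prod(links)


if __name__ == "__main__":
    # Hypothetical figures, not measurements of any real product.
    chain = {
        "mains power": 0.999,
        "network switch": 0.999,
        "file server hardware": 0.995,
        "disks": 0.998,
        "internet uplink": 0.99,
    }
    total = series_availability(list(chain.values()))
    weakest = min(chain.values())
    print(f"whole chain: {total:.4f}, weakest link alone: {weakest:.4f}")
```

The chain as a whole is always less available than its weakest link, so buying one very reliable box does little if the power or the network feeding it is the part that fails.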
 
Well, it's 100% uptime with a margin of error of ±10%, you understand. Or, to put it another way, guaranteed 100% uptime with a confidence of 95.4%.
 