Hello
I now have six 3.5" HDDs and two 2.5" portable HDDs, and my system runs on an SSD.
One of them is 1TB, some are 2TB, and most of the rest are 4TB in capacity.
Some of the HDDs are getting old (3-4 years). They are still fine now, but we know they could fail at any time without any warning signs.
I've been hearing about RAID, NAS, and things like that for many years, but they sound really difficult to me.
However, I think it's time for me to learn something before it's too late.
Where should I get started? What should I do to keep my data safe in case of hardware failure?
In fact, because of cost concerns, I don't really need to back up all the data; I can risk losing some data that is less important.
I've already sorted the files into two main categories:
1. Important Files - (that I must back up)
2. Not Important Files - (that are no problem to lose)
Only 30% of all the files (in terms of total size) fall into the category of Important Files.
What is the strategy now? Please suggest a general idea of what I should learn to get started.
Thanks
-
Well, for the important data, you want at *minimum* three separate copies of it: your usual copy, a local backup, and an offsite backup. The offsite backup must be physically far away from your usual location. The usual recommendation is at least 200 miles away, so that the data will still be safe even in the worst case (say, your town/city is absolutely destroyed). Either buy one or more HDDs and physically ship them somewhere, or find a cloud storage provider you like that offers cheap storage (Amazon Glacier is an example, and I know Google and perhaps a few other companies offer similar services). Just be sure to encrypt your offsite backups and keep your encryption keys/passwords/etc. safe. If you don't have offsite backups for your important data, you don't really see it as important.
For all the data (important and not), you'll also need some sort of local backup. This can be as simple as an external HDD or as complex as a NAS server at home. You can either build your own NAS server or buy a pre-made one, but neither route will be cheap, depending on the amount of data you're backing up. A pre-built NAS will be dead-simple to use (QNAP, Netgear, Drobo, etc. make such devices) but can be more expensive than a DIY solution. The disadvantage of the DIY NAS route is that you have to build it yourself and you'll spend time researching parts, etc. On the far simpler side of things, you can just use one or more external drives as a local backup and set up automatic backups from your computer. In any case, whether you want to encrypt your local backups is up to you.
------------------------------------------------------
As for RAID, let me preface this with the following: **RAID IS NOT A BACKUP**. Never rely on just a RAID for your backups. RAID is not meant to be a stand-alone backup solution; it is only meant to protect against hardware failure of one or more HDDs (depending on which RAID level you're using). If more drives fail while the RAID is rebuilding than that level can tolerate, you will lose all the data on that RAID. RAIDs are only useful for adding redundancy to local storage (except for RAID 0, which I'll explain below).
That said, RAID is actually rather simple to set up. Most, if not all, OSes have built-in software RAID which is good enough for normal usage. It's not complicated to set up a software RAID, and there are plenty of excellent guides on Google for whatever pre-built NAS or operating system you are using. When you're installing the OS on your DIY NAS or configuring your pre-built NAS, you can usually choose between a few options. A general guideline for the most common RAIDs:
- RAID 0: This is not actually a "true" RAID imo, since it doesn't offer redundancy but instead offers a performance improvement over a single disk. RAID 0 simply takes your data and splits it between two or more disks. If one of these drives fails, you lose all of your data, period. Don't use this if you care about any of your data.
- RAID 1: This mode makes an exact copy of the data on one drive and stores that copy on one or more other drives (you need at least two drives in a RAID 1 setup). The amount of data you can store in RAID 1 is the size of the lowest-capacity drive in the array. You can lose all but one of the drives and still have your data. The major disadvantage is that this doesn't use your total storage capacity efficiently; for example, if I had four 1TB drives in RAID 1, I could only use up to 1TB of storage (but three of the four drives can fail and the data is still safe).
- RAID 5: This mode uses parity that is distributed across all the disks in the array in order to provide redundancy. You need at least three drives in order to use this mode. You can have up to one drive fail in this RAID mode; if two or more drives fail, you've lost your data. Note that this mode isn't really recommended for large (2TB or larger) drives: a rebuild has to read every remaining drive in full, and with higher-capacity drives there is a non-trivial chance that a second drive fails or hits a read error before the rebuild finishes. Total usable storage is the capacity of the smallest drive times (the number of drives minus 1); for example, four 4TB drives will result in 12TB capacity.
- RAID 6: Similar to RAID 5, but requires at least four drives and uses double parity instead of single parity. Up to two drives can fail; if more than two drives fail, you lose your data. Total usable storage is the capacity of the smallest drive times (the number of drives minus 2), so four 4TB drives will provide 8TB capacity. Use this if you're going to use drives larger than 2TB each.
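To make the capacity arithmetic in the list above concrete, here is a minimal Python sketch. The function name `usable_capacity_tb` is made up for illustration, and the RAID 0 rule (smallest drive times the number of drives) is the usual striping rule rather than something stated above. It assumes every drive is treated as if it were the size of the smallest one in the array.

```python
def usable_capacity_tb(level, drive_sizes_tb):
    """Return usable capacity in TB for RAID 0, 1, 5 or 6 (illustrative helper)."""
    n = len(drive_sizes_tb)
    if n == 0:
        raise ValueError("need at least one drive size")
    smallest = min(drive_sizes_tb)

    # (minimum number of drives, how many drives' worth of space is usable)
    rules = {
        0: (2, n),      # striping only, no redundancy (assumption: standard rule)
        1: (2, 1),      # every drive holds the same copy
        5: (3, n - 1),  # one drive's worth of space goes to parity
        6: (4, n - 2),  # two drives' worth of space goes to parity
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")

    min_drives, usable_drives = rules[level]
    if n < min_drives:
        raise ValueError(f"RAID {level} needs at least {min_drives} drives")
    return smallest * usable_drives
```

For example, `usable_capacity_tb(5, [4, 4, 4, 4])` gives 12 and `usable_capacity_tb(6, [4, 4, 4, 4])` gives 8, matching the figures in the list.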
VERY IMPORTANT NOTE: If you are considering using a RAID, either in a pre-built system or a DIY system, you **cannot use normal hard drives**. You need to use drives which are specifically designed for RAID use, as normal desktop drives have firmware that is not suited to RAID use. What I mean by this is that part of how a RAID works is that the controller (hardware or software) polls the drives for errors and expects a response in a short amount of time; normal hard drives can take too long to respond, and thus the RAID controller will mark the drive as failed (which is equivalent to an actual hard drive failure in the RAID). To use Western Digital as an example, you cannot use your normal Blue or Green drives for RAID; you need to use either Red or Red Pro drives for this purpose. NAS-rated drives will be somewhat more expensive than normal desktop drives, but you need them if you want to use RAID.
For more information, Google searches can turn up far more detailed explanations. I think the Wikipedia article provides more detail while still being an easy read: https://en.wikipedia.org/wiki/Standard_RAID_levels
======================
Here's my personal setup in case you're curious:
Onsite: I have a DIY server (part of its function is as a NAS) which contains three 3TB WD Red drives in software RAID 5, giving a total of 6TB usable capacity. Yes, I said don't use large drives for RAID 5, but I keep several copies of data and the really important stuff is safe elsewhere. I have a laptop and desktop that make regular backups to this NAS over the network (using the built-in Windows 10 backup tools). I also have an external hard drive attached to this server; I have a script on the server that runs every two weeks to make encrypted backups of the data on the RAID array onto the external drive. I also have email notifications set up on the server which will alert me to any hard drive failure in the RAID array as soon as it happens, which allows me to shut down the server quickly and replace the dead drive with a spare WD Red so that the RAID array can rebuild the data and recover. The server is plugged into a UPS (uninterruptible power supply) instead of a wall socket; this allows the server to be shut down gracefully if my apartment loses power, as the battery backup will keep powering it for some time after the electricity goes out.
Offsite: I have an external hard drive on which I store another set of encrypted backups. I have family who live a little over 200 miles away from me, so I either ship this drive to them or take it with me whenever I visit. This drive is kept in a bank lock box that my family owns, so that it is safe even if something were to happen to my family. I also keep copies of some files on various cloud providers; for example, I keep my code on a git cloud (Bitbucket), some files on various cloud storage providers (Dropbox, OneDrive), and my email is provided via GMail.
========================
Neither the suggestions I've given you nor my personal setup are cheap. No matter what tactic you use (either DIYing your backups or using cloud providers), you will be spending some money on your backup solution if you care about it at all. For your important data, using the 30% figure and the drive sizes you provided, that backup solution alone will be 6.5TB total storage that you need to purchase.
Last edited: Mar 17, 2017 -
the real question is ... how much of that data is related to your avatar?
-
Firmware and software RAID designs tend to be a lot more flexible on this matter. Using consumer drives may decrease performance, but they generally don't timeout and panic.
For typical personal usage, a potentially more relevant criterion is vibration resistance, which again is not inherently related to RAID, just physical proximity.
Last edited: Mar 19, 2017 -
https://forums.freenas.org/index.php?threads/checking-for-tler-erc-etc-support-on-a-drive.27126/ (includes WD and Seagate whitepapers explaining this in much greater detail) -
On this Arch system I'm using it defaults to 30 seconds. -
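Assuming the timeout being discussed here is the Linux kernel's per-device command timeout, it can be read from sysfs at `/sys/block/<device>/device/timeout` (value in seconds). Below is a small Python sketch that lists it for each disk. The helper name `read_device_timeouts` is made up for illustration, and on systems or devices that don't expose this file it simply returns an empty result.

```python
from pathlib import Path


def read_device_timeouts(sys_block="/sys/block"):
    """Return {device name: timeout in seconds} for block devices that expose one."""
    root = Path(sys_block)
    timeouts = {}
    if not root.is_dir():
        # Not a Linux system, or sysfs is not mounted where we expect it
        return timeouts
    for dev in sorted(root.iterdir()):
        timeout_file = dev / "device" / "timeout"
        if timeout_file.is_file():
            timeouts[dev.name] = int(timeout_file.read_text().strip())
    return timeouts


if __name__ == "__main__":
    for name, seconds in read_device_timeouts().items():
        print(f"{name}: {seconds} s")
```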
No worries
-
OK, then I'll give up on RAID and NAS if they are not a backup solution.
-
Because RAID turned out to be more difficult than I thought once I studied it a bit.
In addition, other than backup I don't need anything like network access, sharing files with other devices, or cloud services.
In that case, having several simple portable hard drives is a better and more economical solution, isn't it?
However, I am wondering how I can synchronize the files between several copies on several drives.
I will be adding, creating, and deleting files and folders. I don't want to copy and paste the whole 4TB or 8TB HDD every time I need to synchronize; that will take forever and I will go crazy.
Is there any way to synchronize files and folders between different drives that checks for files that are already the same and skips them instead of overwriting, and even deletes duplicate files? -
Linux has rsync, which does exactly what you're asking.
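rsync itself is a command-line tool, so see its documentation for the real options. Purely to illustrate the idea of copying only what changed, here is a rough Python sketch of a one-way mirror. It is not rsync and not a replacement for it: the function name `mirror` and the `delete_extra` switch are hypothetical, and it decides whether a file changed by comparing size and modification time only.

```python
import os
import shutil
from pathlib import Path


def mirror(src, dst, delete_extra=False):
    """One-way sync: copy new or changed files from src to dst.

    Returns (copied, deleted) as lists of paths relative to the folder root.
    """
    src, dst = Path(src), Path(dst)
    copied, deleted = [], []

    for root, _dirs, files in os.walk(src):
        rel_root = Path(root).relative_to(src)
        (dst / rel_root).mkdir(parents=True, exist_ok=True)
        for name in files:
            source_file = Path(root) / name
            target_file = dst / rel_root / name
            if _needs_copy(source_file, target_file):
                # copy2 also copies the modification time, so the next run
                # sees the file as unchanged and skips it
                shutil.copy2(source_file, target_file)
                copied.append(str(rel_root / name))

    if delete_extra:
        # Walk bottom-up so a folder's contents are handled before the folder
        for root, dirs, files in os.walk(dst, topdown=False):
            rel_root = Path(root).relative_to(dst)
            for name in files:
                if not (src / rel_root / name).exists():
                    (Path(root) / name).unlink()
                    deleted.append(str(rel_root / name))
            for name in dirs:
                if not (src / rel_root / name).exists():
                    (Path(root) / name).rmdir()

    return copied, deleted


def _needs_copy(source_file, target_file):
    """True if the target is missing or differs in size or modification time."""
    if not target_file.exists():
        return True
    s, t = source_file.stat(), target_file.stat()
    return s.st_size != t.st_size or int(s.st_mtime) != int(t.st_mtime)
```

Calling `mirror("D:/Important", "E:/Backup")` (both paths are placeholders) a second time right after the first copies nothing, which is the behaviour that avoids re-copying a whole 4TB drive. Leave `delete_extra` off until you trust the result, since it removes files from the backup that no longer exist in the source.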
Using external hard drives would be cheaper, though you still need multiple hard drives. You can't just buy one big one, keep it at home/backpack/etc and call it a backup. As my original post said, for important data you want at least three copies of it, one of which is offsite.
You'll spend less money with just a bunch of external drives, but you're still spending a fair bit to back up what you posted about. -
Is it another operating system like Windows?
1. I know Linux has a lot of versions; which one is the best and the most stable against viruses and attacks?
2. Rsync? Does that mean it can update (add and delete) files automatically between different drives? That's very attractive to me.
3. Last, can I install Linux and Windows together on the same PC? -
It's immune to Windows viruses (virii?), though no OS is 100% safe from attack.
https://en.wikipedia.org/wiki/Rsync
Yes, that's called dual-booting. -
@Jarhead
Do you think using cheap consumer drives in a RAID for a personal NAS with device timeout set to more than a few minutes is a valid strategy if one is on a very tight budget? On a dedicated NAS with modern hardware you have at least a few G's of RAM space for I/O buffering. Will the lag from the consumer drive kill anything before the buffer runs out?
RAID can be applied to both the computer you're actually working on (such as your laptop) and the NAS device that's responsible for the backup.
There is an rsync wrapper for Windows called DeltaCopy: http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp -
-
@Jarhead
You're definitely right about NAS drives being not much more expensive. I was thinking mostly about OP, who has at least 8 drives already lying around.
Anyway, there's hardly any need for OP to run RAID given the use case.
@kenny1999
Rsync or similar software can copy your files to the backup and overwrite older versions if present. Do you think there is any chance you would need version history? If you frequently modify some of the files you need to back up, keeping only the latest version is dangerous.
Last edited: Mar 21, 2017 -
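One simple way to keep some version history is to save each backup copy under a dated name and prune the oldest ones. The Python sketch below is only an illustration of that idea, not how rsync or any particular backup tool does it; the function name `backup_with_history`, the `label` argument, and the default of three kept versions are all made up.

```python
import shutil
from pathlib import Path


def backup_with_history(src_file, backup_dir, label, keep=3):
    """Copy src_file to backup_dir as '<name>.<label>' and keep the newest `keep` copies.

    `label` should sort chronologically as text, e.g. '2017-03-21'.
    Returns the names of the versions that remain, oldest first.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    src_file, backup_dir = Path(src_file), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    shutil.copy2(src_file, backup_dir / f"{src_file.name}.{label}")

    # All stored versions of this file, sorted by their date label
    versions = sorted(backup_dir.glob(f"{src_file.name}.*"))
    for old in versions[:-keep]:
        old.unlink()  # drop everything older than the newest `keep` versions
    return [v.name for v in versions[-keep:]]
```

With this, an accidental bad edit that gets backed up today does not destroy yesterday's good copy, at the cost of storing several copies of frequently changed files.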
To add my two cents.
RAID is not a backup. It is only to keep a system's data live in the event of a drive failure. A RAID setup should be backed up to a separate set of backup media like an external hard drive, another server, or the cloud. A NAS is a great backup solution for your home PC's data, just make sure the NAS drives are also backed up regularly. I would strongly suggest investing in NAS drives over conventional drives. They are lower power and more reliable and designed to be running 24/7.
After experiencing data loss in the past, I'm super paranoid about my data retention and backup now. All users should be. Depends on how important your data is to you. -
I haven't tested it, but there is OwnCloud, which is similar to a nas/dropbox.
-
I've tried it once, and it's really nice software. But keeping it secure requires that you jump through a few hoops, and to make it perform well you need to tweak it somewhat (by default it uses SQLite, whereas you'd want something like MySQL instead). -
Support.2@XOTIC PC Company Representative
Call me cheap, but I just use my tiny amount of free cloud storage for anything I want to sync across systems that aren't on the same network, and then multiple individual drives for important files, I don't bother with RAID or NAS at all at the moment.
-
I don't think that's cheap at all. Hell, that's what I did back in college for the past ~5 years.
Personally I have a home server just because I like building and tinkering with computers. Your system works just as well as a backup: you have your original copy, a local backup, and (for important stuff I would think) the cloud backup/sync. -
Support.2@XOTIC PC Company Representative
If it ain't broke...
Not to say I haven't considered a home server, I just never seem to get around to building one. -
Probably similar to car enthusiasts vs regular drivers. -
Support.2@XOTIC PC Company Representative
-
I probably don't need one either, but living a life with only the bare minimal needs met is a boring life
-
Even an old computer or laptop is fine to use as a backup. Depends on what you want to back up. -
what is going to be the strategy to handle my files??
Discussion in 'Desktop Hardware' started by kenny1999, Mar 17, 2017.