The Notebook Review forums were hosted by TechTarget, which shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    what is going to be the strategy to handle my files??

    Discussion in 'Desktop Hardware' started by kenny1999, Mar 17, 2017.

  1. kenny1999

    kenny1999 Notebook Evangelist

    Reputations:
    26
    Messages:
    359
    Likes Received:
    28
    Trophy Points:
    41
    Hello

    So now I have six 3.5" HDDs and two portable 2.5" HDDs, and my system is running on an SSD.
    One of them is 1TB, some of them are 2TB, and most of the rest are 4TB in capacity.



    Some HDDs are getting old (3-4 years). They are still fine now, but we know they could fail at any time without any signs.

    I've been hearing about RAID, NAS, things like that for many years, but they sound really difficult to me.

    However, I think it's time for me to learn something, before everything is too late.


    How should I get started? What should I do to keep my data safe in case of hardware failure?

    In fact, because of cost concerns, I don't really need to back up all the data; I can risk losing some data that is less important.

    Now I've already sorted the files into two main categories

    1. Important Files - (that I must back up)
    2. Not Important Files - (that are no problem to lose)

    Only 30% of all the files (in terms of total size) fall into the category of Important Files.

    What is the strategy now? Please suggest a general idea of what I should learn to get started

    Thanks
     
  2. Jarhead

    Jarhead 恋の♡アカサタナ

    Reputations:
    5,036
    Messages:
    12,168
    Likes Received:
    3,134
    Trophy Points:
    681
    Well, for the important data, you want at *minimum* three separate copies of it: your usual copy, a local backup, and an offsite backup. The offsite backup must be literally (far) away from your usual location; the usual recommendation is at least 200 miles away, so that even in the worst case possible (say, your town/city is absolutely destroyed), the data will still be safe. Either buy one or more HDDs and physically ship them out somewhere, or find a cloud storage provider you like that offers cheap storage (Amazon Glacier is an example, and I know Google and perhaps a few other companies offer similar services). Just be sure to encrypt your offsite backups and keep your encryption keys/passwords/etc. safe. If you don't have offsite backups for your important data, you don't really see it as important.

    For all the data (important and not), you'll also need some sort of local backup. This can be as simple as an external HDD or as complex as a NAS server at home. You can either build your own NAS server or buy a pre-made one, but neither route will be cheap, depending on the amount of data you're backing up. A pre-built NAS will be dead-simple to use (QNAP, Netgear, Drobo, etc. make such devices) but can be more expensive than a DIY solution. The disadvantage of the DIY NAS route is that you have to build it yourself and spend time researching parts, etc. On the far simpler side of things, you can just have one or more external drives as a local backup and set up automatic backups from your computer. In any case, whether you want to encrypt your local backups is up to you.

    ------------------------------------------------------

    As for RAID, let me preface this with the following: **RAID IS NOT A BACKUP**. Never rely on just a RAID for your backups. RAID is not meant to be a stand-alone backup solution; it is only meant to protect against hardware failure of one or more HDDs (depending on which RAID level you're using). If more drives fail than the array can tolerate while it is trying to recover, you will lose all the data on that RAID. RAIDs are only useful for adding redundancy to local storage (except for RAID 0, which I'll explain below).

    That said, RAID is actually rather simple to set up. Most, if not all, OSes have built-in software RAID which is good enough for normal usage. It's not complicated to set up a software RAID, and there are plenty of excellent guides on Google for whatever pre-built NAS or operating system you are using. When you're installing the OS on your DIY NAS or configuring your pre-built NAS, you can usually choose between a few options. A general guideline for the most common RAIDs:

    • RAID 0: This is not actually a "true" RAID imo, since it doesn't offer redundancy but instead offers a performance improvement over a single disk. RAID 0 simply takes your data and splits it between two or more disks. If one of these drives fails, you lose all of your data, period. Don't use this if you care about any of your data.
    • RAID 1: This mode makes an exact copy of the data on one drive and stores that copy on one or more other drives (you need at least two drives in a RAID 1 setup). The amount of data you can store in RAID 1 is the size of your lowest-capacity drive in the array. You can lose all but one of the drives and still have your data. The major disadvantage is that this doesn't use your total storage capacity efficiently; for example, if I had four 1TB drives in RAID 1, I could only use up to 1TB of storage (but three of the four drives can fail and the data is still safe).
    • RAID 5: This mode uses parity that is distributed across all the disks in the array in order to provide redundancy. You need at least three drives to use this mode. One drive can fail in this RAID mode; if two or more drives fail, you've lost your data. Note that this mode isn't really recommended for large (2TB or larger) drives, since higher-capacity drives have a non-trivial chance of failing during a rebuild: if one drive fails, there is a real risk that another will fail while the array is trying to recover. Total usable storage is the capacity of the smallest drive times the number of drives minus one (for example, four 4TB drives give 12TB of capacity).
    • RAID 6: Similar to RAID 5, but requires at least four drives and uses double parity instead of single parity. Up to two drives can fail; if more than two drives fail, you lose your data. Total usable storage is the capacity of the smallest drive times the number of drives minus two (so four 4TB drives give 8TB of capacity). Use this if you're going to use drives larger than 2TB each.
    There are other RAID modes (2, 3, 4, etc.), and you can also combine RAID modes (an example would be RAID 10, which is a RAID 0 array of two RAID 1 arrays), but these aren't usually used in practice.
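
    The capacity rules in the list above can be sketched in a few lines of Python. This is only an illustration, not part of any RAID tool: the function name raid_usable_tb is made up for this example, the RAID 0 line assumes classic striping where every member contributes only as much as the smallest drive (the list above gives no RAID 0 formula), and real arrays lose a little extra space to metadata.

```python
# Minimum number of drives for each RAID level, as described in the list above.
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4}


def raid_usable_tb(level, drive_sizes_tb):
    """Return the approximate usable capacity (in TB) of a RAID array.

    drive_sizes_tb is a list with one entry per drive. Every formula is
    limited by the smallest drive in the array.
    """
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    count = len(drive_sizes_tb)
    if count < MIN_DRIVES[level]:
        raise ValueError(
            f"RAID {level} needs at least {MIN_DRIVES[level]} drives, got {count}"
        )
    smallest = min(drive_sizes_tb)
    if level == 0:
        # Assumption: plain striping, no redundancy, all drives contribute.
        return smallest * count
    if level == 1:
        # Every drive holds the same copy, so capacity is one drive's worth.
        return smallest
    if level == 5:
        # One drive's worth of space goes to parity.
        return smallest * (count - 1)
    # RAID 6: two drives' worth of space go to parity.
    return smallest * (count - 2)
```

    For example, raid_usable_tb(5, [4, 4, 4, 4]) returns 12 and raid_usable_tb(6, [4, 4, 4, 4]) returns 8, matching the figures in the list.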

    VERY IMPORTANT NOTE: If you are considering using a RAID, either in a pre-built system or a DIY system, you **cannot use normal hard drives**. You need to use drives which are specifically designed for RAID use, as normal desktop drives have firmware which does not have adequate performance for RAIDs. What I mean by this is that part of how a RAID works is that the controller (hardware or software) polls the drives for errors and expects a response in a short amount of time; normal hard drives take too long to respond, and thus the RAID controller will mark the drive as failed (which is equivalent to an actual hard drive failure in the RAID). To use Western Digital as an example, you cannot use your normal Blue or Green drives for RAID; you need to use either Red or Red Pro drives for this purpose. NAS-rated drives will be somewhat more expensive than normal desktop drives, but you need them if you want to use RAID.

    For more information, Google searches can turn up far more detailed explanations. I think the Wikipedia article provides more detail while still being an easy read: https://en.wikipedia.org/wiki/Standard_RAID_levels

    ======================

    Here's my personal setup in case you're curious:

    Onsite: I have a DIY server (part of its function is as a NAS) which contains three 3TB WD Red drives in software RAID 5, giving a total of 6TB usable capacity. Yes, I said don't use large drives for RAID 5, but I keep several copies of data and the really important stuff is safe elsewhere. I have a laptop and desktop that make regular backups to this NAS over the network (using the built-in Windows 10 backup tools). I also have an external hard drive attached to this server; I have a script on the server that runs every two weeks to make encrypted backups of the data on the RAID array onto the external drive. I also have email notifications set up on the server which will alert me to any hard drive failure in the RAID array; this lets me know about a failure as soon as it happens, which allows me to shut down the server quickly and replace the dead drive with a spare WD Red so that the RAID array can rebuild the data and recover. The server is plugged into a UPS (uninterruptible power supply) instead of a wall socket; this allows the server to be shut down gracefully if my apartment loses power, as the battery backup will keep powering it for some time after the electricity goes out.

    Offsite: I have an external hard drive which I use to store another set of encrypted backups. I have family who live a little over 200 miles away from me, so I either ship this drive to them or take it with me whenever I visit. This drive is kept in a bank lock box that my family owns, so that it is safe even if something were to happen to my family. I also keep copies of some files on various cloud providers; for example, I keep my code on a git cloud (Bitbucket), some files on various cloud storage providers (Dropbox, OneDrive), and my email is provided via Gmail.

    ========================

    Neither the suggestions I've given you nor my personal setup are cheap. No matter what tactic you use (either DIYing your backups or using cloud providers), you will be spending some money on your backup solution if you care about it at all. For your important data, using the 30% figure you provide and the drive sizes you provide, that backup solution alone will be 6.5TB total storage that you need to purchase.
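
    The sizing arithmetic can be sketched as below. The thread does not list exact drive counts, so the drive mix in the usage note is purely hypothetical and is not meant to reproduce the 6.5TB estimate; the function name and the extra_copies parameter are likewise made up for this illustration.

```python
def backup_storage_tb(drive_sizes_tb, important_fraction=0.30, extra_copies=1):
    """Estimate how many TB of backup storage the important files need.

    drive_sizes_tb     -- capacity of each existing drive, in TB
    important_fraction -- share of the data worth backing up (30% in the thread)
    extra_copies       -- number of backup copies besides the original
    """
    if not 0 <= important_fraction <= 1:
        raise ValueError("important_fraction must be between 0 and 1")
    if extra_copies < 0:
        raise ValueError("extra_copies cannot be negative")
    # Assumption: the drives are treated as full, which gives an upper bound.
    important_tb = sum(drive_sizes_tb) * important_fraction
    return important_tb * extra_copies
```

    With a hypothetical mix such as backup_storage_tb([1, 2, 2, 2, 4, 4, 4, 4]) the estimate is roughly 6.9TB for one backup copy; passing extra_copies=2 (a local copy plus an offsite copy) doubles it.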
     
    Last edited: Mar 17, 2017
  3. houstoned

    houstoned Yoga Pants Connoisseur.

    Reputations:
    2,852
    Messages:
    2,224
    Likes Received:
    388
    Trophy Points:
    101
    the real question is ... how much of that data is related to your avatar?
     
  4. Mr.Koala

    Mr.Koala Notebook Virtuoso

    Reputations:
    568
    Messages:
    2,307
    Likes Received:
    566
    Trophy Points:
    131
    This depends on how you RAID them. If you're using RAID cards made for server workloads, what you said is often true, but it has more to do with the simple fact that you're pairing hardware not made for each other and less to do with the inner workings of RAID. Even if you just connect one drive to a card or, in some cases, turn off all RAID features and expose the physical drives independently, similar issues can still arise.

    Firmware and software RAID designs tend to be a lot more flexible on this matter. Using consumer drives may decrease performance, but they generally don't time out and panic.



    For typical personal usage, a potentially more relevant criterion is vibration resistance, which again is not inherently related to RAID, just to physical proximity.
     
    Last edited: Mar 19, 2017
  5. Jarhead

    Jarhead 恋の♡アカサタナ

    The problem in that case isn't whether you use software or hardware RAID with your drives; it's the drive firmware itself. The timing for TLER/ERC is far too long in consumer-class drives, which makes the (software/hardware) RAID think there's an issue with the drive when there isn't. NAS drives have much shorter TLER/ERC times which actually work in RAID.

    https://forums.freenas.org/index.php?threads/checking-for-tler-erc-etc-support-on-a-drive.27126/ (includes WD and Seagate whitepapers explaining this in much greater detail)
     
  6. Mr.Koala

    Mr.Koala Notebook Virtuoso

    I was under the impression that at least with Linux/mdadm the default was to wait for the drive to come up for a practically unlimited amount of time. Apparently I was wrong. Sorry.

    On this Arch system I'm using it defaults to 30 seconds.
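
    On Linux the per-device command timeout is normally exposed through sysfs, so the 30-second figure can be checked with a short script. A minimal sketch, assuming the value in question is the one stored in /sys/block/<device>/device/timeout (the helper name read_device_timeouts and its sys_block parameter are made up for this example):

```python
from pathlib import Path


def read_device_timeouts(sys_block="/sys/block"):
    """Return {device_name: timeout_in_seconds} for block devices.

    Devices that do not expose a device/timeout file (loop devices, for
    instance) are skipped. sys_block is a parameter only so that the
    function can be pointed at a test directory.
    """
    timeouts = {}
    root = Path(sys_block)
    if not root.is_dir():
        # Not a Linux system, or sysfs is not mounted at this path.
        return timeouts
    for device in sorted(root.iterdir()):
        timeout_file = device / "device" / "timeout"
        try:
            timeouts[device.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a plain integer: skip it.
            continue
    return timeouts


if __name__ == "__main__":
    for name, seconds in read_device_timeouts().items():
        print(f"{name}: {seconds} s")
```

    Reading the value needs no special privileges; changing it generally requires root and does not persist across reboots unless it is set again at boot.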
     
  7. Jarhead

    Jarhead 恋の♡アカサタナ

    No worries :)
     
  8. kenny1999

    kenny1999 Notebook Evangelist

    OK, then I'll give up on RAID and NAS if they are not a backup solution.
     
  9. Jarhead

    Jarhead 恋の♡アカサタナ

    I think you misunderstood what I said in my post. I said that RAID *alone* isn't a backup. Likewise, a single external hard drive isn't a backup. You need to have several copies of your data in order to have a proper backup. A NAS and/or RAID and/or external hard drive can be part of that backup.
     
  10. kenny1999

    kenny1999 Notebook Evangelist

    Because RAID is more difficult than I thought, after I studied it a bit.

    In addition, other than backup I don't need anything like network access, sharing files with other devices, or cloud service.

    In that case, having a lot more simple portable hard drives is a better and more economical solution, isn't it?

    However, I am wondering how I can synchronize the files between several copies on several drives.

    Because I will add, create, and delete files and folders. I don't want to just copy and paste the whole 4TB or 8TB HDD every time I need to synchronize; that will take forever and I will go crazy.

    Is there any way to synchronize files and folders between different drives that checks for identical files without overwriting them, and even deletes repeated files?
     
  11. Jarhead

    Jarhead 恋の♡アカサタナ

    Linux has rsync, which does exactly what you're asking.
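
    To give a feel for what a tool like rsync does, here is a rough one-way mirror in Python. It is a sketch only and not how rsync is implemented: it compares size and modification time to decide what to copy, the names mirror and needs_copy are invented for this example, and the delete option should be tried on test folders first.

```python
import shutil
from pathlib import Path


def needs_copy(source, target, mtime_tolerance=2.0):
    """Return True if target is missing or looks different from source."""
    if not target.exists():
        return True
    src_stat, dst_stat = source.stat(), target.stat()
    if src_stat.st_size != dst_stat.st_size:
        return True
    # Assumption: a 2-second tolerance, because FAT-style filesystems on
    # external drives store timestamps with coarse resolution.
    return abs(src_stat.st_mtime - dst_stat.st_mtime) > mtime_tolerance


def mirror(src, dst, delete=False):
    """Copy new or changed files from src to dst; unchanged files are skipped.

    With delete=True, anything in dst that no longer exists in src is removed.
    Returns (copied, removed) as lists of relative paths.
    """
    src, dst = Path(src), Path(dst)
    copied, removed = [], []
    for path in src.rglob("*"):
        rel = path.relative_to(src)
        target = dst / rel
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        elif needs_copy(path, target):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves the timestamp
            copied.append(str(rel))
    if delete:
        # Reverse order visits files before the folders that contain them.
        for path in sorted(dst.rglob("*"), reverse=True):
            rel = path.relative_to(dst)
            if (src / rel).exists():
                continue
            if path.is_dir() and not path.is_symlink():
                path.rmdir()
            else:
                path.unlink()
            removed.append(str(rel))
    return copied, removed
```

    Running mirror("D:/Important", "E:/Backup") a second time (the paths are placeholders) copies nothing unless files were added or changed, which is the behavior that avoids re-copying a whole 4TB drive.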

    Using external hard drives would be cheaper, though you still need multiple hard drives. You can't just buy one big one, keep it at home/backpack/etc and call it a backup. As my original post said, for important data you want at least three copies of it, one of which is offsite.

    You'll spend less money with just a bunch of external drives, but you're still spending a fair bit to back up what you posted about.
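
    As for the repeated files mentioned in the question: finding them is a separate job from synchronizing, and one common approach is to group files by a content hash. A minimal sketch under that assumption (the function names are made up for this example, and it only reports duplicates; deciding what to delete is left to a human):

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of identical files."""
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        for path in same_size:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

    Checking sizes first keeps the run short on large drives, since only files that share a size with another file are actually read and hashed.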
     
  12. kenny1999

    kenny1999 Notebook Evangelist

    Linux?
    Is it another operating system like Windows?

    1. I know Linux has a lot of versions. Which one is the best and the most stable against viruses and attacks?

    2. Rsync? Does it mean it can update (add and delete) files at the same time automatically, between different drives? It's very attractive to me.

    3. Last, can I install Linux and Windows together at the same time on the same PC?
     
  13. Jarhead

    Jarhead 恋の♡アカサタナ

    Yes, there are various "flavors" of Linux though. I'd recommend Ubuntu or Mint if you're going to use it.

    It's immune to Windows viruses (virii?), though no OS is 100% safe from attack.

    https://en.wikipedia.org/wiki/Rsync

    Yes, that's called dual-booting.
     
  14. Mr.Koala

    Mr.Koala Notebook Virtuoso

    @Jarhead
    Do you think using cheap consumer drives in a RAID for a personal NAS with device timeout set to more than a few minutes is a valid strategy if one is on a very tight budget? On a dedicated NAS with modern hardware you have at least a few G's of RAM space for I/O buffering. Will the lag from the consumer drive kill anything before the buffer runs out?
    NAS can act as a (local) backup solution. RAID is not a backup solution on its own, but does provide more reliability.

    RAID can be applied to both the computer you're actually working on (such as your laptop) and the NAS device that's responsible for the backup.
    NAS devices typically run Linux for multiple reasons. However, for simple personal usage you can run whatever common operating system you like, Windows included.

    There is an rsync wrapper for Windows called DeltaCopy: http://www.aboutmyip.com/AboutMyXApp/DeltaCopy.jsp
     
  15. Jarhead

    Jarhead 恋の♡アカサタナ

    Well, I would if it were feasible, but you'd somehow have to change the HDD firmware and/or your RAID software/hardware. At any rate, NAS drives aren't that much more expensive than regular desktop drives: WD Red 3TB NAS drives are ~$110 ( https://www.newegg.com/Product/Prod...36344&cm_re=wd_red_3tb-_-22-236-344-_-Product) whereas WD Blue 3TB desktop drives are only $10 cheaper ( https://www.newegg.com/Product/Prod...5012&cm_re=wd_blue_3tb-_-22-235-012-_-Product), for example.
     
  16. Mr.Koala

    Mr.Koala Notebook Virtuoso

    @Jarhead
    You're definitely right about NAS drives not being much more expensive. I was thinking mostly about OP, who has at least 8 drives already lying around.

    Anyway, there's hardly any need for OP to run RAID given the use case.

    @kenny1999
    Rsync or similar software can copy your files to the backup and overwrite older versions if present. Do you think there is any chance you would need version history? If you frequently modify some of the files you need to back up, keeping only the latest version is dangerous.
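
    One simple way to keep version history is to store each backup of a file under a numbered name and prune the oldest ones. A minimal sketch, assuming a made-up naming scheme of name.v0001, name.v0002, and so on (the function names and the keep parameter are hypothetical, not features of rsync):

```python
import shutil
from pathlib import Path


def existing_versions(source_name, backup_dir):
    """Return a sorted list of (number, path) for stored versions of a file."""
    prefix = source_name + ".v"
    versions = []
    for path in backup_dir.iterdir():
        if not path.name.startswith(prefix):
            continue
        suffix = path.name[len(prefix):]
        if suffix.isdigit():
            versions.append((int(suffix), path))
    return sorted(versions)


def backup_with_history(source, backup_dir, keep=5):
    """Copy source into backup_dir as a new numbered version.

    Only the newest `keep` versions are retained; older ones are deleted.
    Returns the path of the version that was just written.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source, backup_dir = Path(source), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    versions = existing_versions(source.name, backup_dir)
    next_number = versions[-1][0] + 1 if versions else 1
    target = backup_dir / f"{source.name}.v{next_number:04d}"
    shutil.copy2(source, target)
    versions.append((next_number, target))

    # Drop everything except the newest `keep` versions.
    for _, old_path in versions[:-keep]:
        old_path.unlink()
    return target
```

    With keep=5, an accidental bad edit can still be undone from any of the previous four backups, at the cost of storing up to five copies of each frequently changed file.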
     
    Last edited: Mar 21, 2017
  17. HTWingNut

    HTWingNut Potato

    Reputations:
    21,580
    Messages:
    35,370
    Likes Received:
    9,877
    Trophy Points:
    931
    To add my two cents.

    RAID is not a backup. It is only there to keep a system's data live in the event of a drive failure. A RAID setup should be backed up to a separate set of backup media like an external hard drive, another server, or the cloud. A NAS is a great backup solution for your home PC's data; just make sure the NAS drives are also backed up regularly. I would strongly suggest investing in NAS drives over conventional drives. They are lower power, more reliable, and designed to run 24/7.

    After experiencing data loss in the past, I'm super paranoid about my data retention and backup now. All users should be. Depends on how important your data is to you.
     
  18. Primes

    Primes Notebook Deity

    Reputations:
    919
    Messages:
    1,736
    Likes Received:
    718
    Trophy Points:
    131
  19. Jarhead

    Jarhead 恋の♡アカサタナ

    ownCloud requires that you have a server to deploy it on and/or that you pay a host to host it (whereas Dropbox only has you optionally installing a desktop app, and you don't need to worry about buying hardware).

    I've tried it once, and it's really nice software. But keeping it secure requires that you jump through a few hoops, and to make it perform well you need to tweak it somewhat (by default it uses SQLite, whereas you'd want something like MySQL instead).
     
  20. Support.2@XOTIC PC

    Support.2@XOTIC PC Company Representative

    Reputations:
    486
    Messages:
    3,148
    Likes Received:
    3,490
    Trophy Points:
    331
    Call me cheap, but I just use my tiny amount of free cloud storage for anything I want to sync across systems that aren't on the same network, and then multiple individual drives for important files. I don't bother with RAID or NAS at all at the moment.
     
  21. Jarhead

    Jarhead 恋の♡アカサタナ

    I don't think that's cheap at all. Hell, that's what I did back in college for the past ~5 years.

    Personally I have a home server just because I like building and tinkering with computers. Your system works just as well as a backup: you have your original copy, a local backup, and (for important stuff I would think) the cloud backup/sync.
     
  22. Support.2@XOTIC PC

    Support.2@XOTIC PC Company Representative

    If it ain't broke...

    Not to say I haven't considered a home server, I just never seem to get around to building one.
     
  23. Jarhead

    Jarhead 恋の♡アカサタナ

    IMO it's a fun exercise, but I guess mostly for those into that sort of thing.

    Probably similar to car enthusiasts vs regular drivers.
     
  24. Support.2@XOTIC PC

    Support.2@XOTIC PC Company Representative

    True, I probably don't need one anymore anyway.
     
  25. Jarhead

    Jarhead 恋の♡アカサタナ

    I probably don't need one either, but living a life with only the bare minimum needs met is a boring life ;)
     
  26. HTWingNut

    HTWingNut Potato

    That's fine too. You don't need anything complicated. I used to just store my most important files on an external hard drive, and then I would take the drive over to my parents whenever I went to visit (usually once every 4-6 weeks), and swap it out for the other one I stored there. At least my most important photos and documents were somewhat secure because at least most of it was away from my home in case it blew up or whatever.

    Even an old computer or laptop is fine to use as a backup. Depends on what you want to back up.
     
  27. Support.2@XOTIC PC

    Support.2@XOTIC PC Company Representative

    A good idea. I've got an ancient ASUS that's still running that I've thought of dropping a huge drive into and using as a poor man's media server / backup.
     
  28. HTWingNut

    HTWingNut Potato

    Yep. That's all you need. Shove a few 2TB drives in there, and call it a day. All depends on your needs.
     
  29. Support.2@XOTIC PC

    Support.2@XOTIC PC Company Representative

    That's the plan. Got a Barracuda that I'm going to install this weekend if I have time.