For a while now I have had an Ubuntu server where I self-host, for my household, Syncthing (automatic backup of the most important files on our devices), Baïkal, Magic Mirror and a few other things via Docker.

I was looking at what I have now (leftovers of a computer of mine: an AMD 2600 with 16 GB of RAM, a 1660 Super and a 512GB Western Digital Blue SSD). Storage-wise, at the time I decided to get several fairly cheap SSDs to have enough initial space (I made a logical volume out of 3 Crucial MX500 1TB drives, 3TB in total). At the time I thought I wanted to avoid regular HDDs at all costs (I knew people who had issues with them), but in hindsight I had never worked with NAS drives, so my fear of HDDs under such low usage was somewhat uncalled for.

So now I am trying to understand what I can change in this setup so I can expand later if needed, while also having a bit more space already (for the personal stuff I have around 1.5TB of data) and adding a bit more resilience in case something happens. Another goal is to build a 3-2-1 backup kind of solution (starting with the setup at home, with an external disk I already have, and later a remote backup location). Also, I will probably decommission the SSDs for now, since I want to avoid having a logical volume spanning them (something happens to one drive, and poof, all the data goes away). So my questions regarding this are:

  • For hdd’s to be used as long term storage, what is usually the rule of thumb? Are there any recommendations on what drives are usually better for this?
  • Considering this is going to store personal documents and photos, is RAID a must in your opinion? And if so, which configuration?
  • And in case RAID would be required, is ubuntu server good enough for this? or using something such as unraid is a must?
  • I was thinking of probably trying to sell the 1660 super while it has some market value. However, I was never able to have the server completely headless. Is there a way to make this happen with a msi tomahawk b450? Or is only possible with an APU (such as 5600g)?

Thanks in advance

PS: If you guys find any glaring issues with my setup and know a tip or two, please share them so I can also understand better this selfhosted landscape :)

  • CrackedLinuxISO@lemmy.dbzer0.com
    5 days ago

    However, I was never able to have the server completely headless.

    Depending on what you mean by “completely headless” it may or may not be possible.

    Simplest solution: When you’re installing OS and setting up the system, you have a GPU and monitor for local access. Once you’ve configured ssh access, you no longer need the GPU or monitor. You could get by with a cheap “Just display something” graphics card and keep it permanently installed, only plugging in the monitor when something is not working right. This is what I used to do.

    Downside: If you ever need to perform an OS reinstall, debug boot issues, or change BIOS settings, you will need to reconnect the monitor.

    Medium tech solution: Install a cheap graphics card, and then connect your server with something like PiKVM or BliKVM. They can plug into your GPU and motherboard and provide a web interface to control your server physically. Everything from controlling physical power buttons to emulating a USB storage device is possible. You’ll be able to boot from cold start, install OS, and change BIOS settings without ever needing a physical monitor. This is what I do now.

    Downsides: Additional cost to buy the KVM hardware, plus now you have to remember to keep your KVM software updated. Anyone who controls the KVM has equivalent physical access to the server, so keep it secure and off the public internet.

    • ZeDoTelhado@lemmy.worldOP
      5 days ago

      Never heard of pikvm, it actually looks like a very interesting solution.

      From the previous point, what I mean by headless is basically to go to the server, yank the GPU, press the power button and it just boots.

      I’ve tried several times, but bios straight up doesn’t let me go on. I’ve seen in a couple of places some mobos simply refuse to boot without a GPU.

      I can see if I can get a decent price for the GPU. If not, I guess it’s doing its job as is. It just feels like a waste to have this GPU used only as video output for a server.

      • CrackedLinuxISO@lemmy.dbzer0.com
        5 days ago

        Ah, I see what you mean. Yeah, no way around that without a GPU or a processor with integrated graphics.

        You should be able to get a used workstation GPU for $20-40 on eBay. Something from Dell, or a basic nvidia quadro would do the trick. If you could sell the 1660 super for more than that, could be worth the effort.

        Alternatively, the 1660 Super would do the trick nicely if you ever needed to transcode video streams, like from running Jellyfin or Plex.

  • IsoKiero@sopuli.xyz
    6 days ago

    My personal opinions, not facts:

    For hdd’s to be used as long term storage, what is usually the rule of thumb? Are there any recommendations on what drives are usually better for this?

    Anything with a long history, like HGST or WD (Red series preferably). Backblaze, among others, publishes data on the longevity of drives, so look at what they report. On eBay (and elsewhere) there are refurbished drives available which look pretty promising, but I have no personal experience with those.
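For reading that kind of longevity data: as far as I understand it, Backblaze's reports express it as an annualized failure rate (AFR), i.e. failures divided by drive-years. A minimal sketch of that arithmetic follows; the function name and the numbers in the example are made up for illustration and are not taken from any report.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the annualized failure rate in percent.

    drive_days is the total number of days all drives of a model were in
    service (e.g. 1000 drives running for 365 days = 365000 drive days).
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


if __name__ == "__main__":
    # Hypothetical numbers: 12 failures across 365000 drive days.
    print(f"AFR: {annualized_failure_rate(12, 365_000):.2f} %")
```

A model with few drive days gives a noisy AFR, so a low number on a small sample says less than the same number on a large one.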

    Considering this is going to store personal documents and photos, is RAID a must in your opinion? And if so, which configuration?

    Depends heavily on your backup scheme, amount of data and available bandwidth (among other things). RAID protects you against a single drive being a point of failure for your storage. Without RAID, you need to replace the drive and pull data back from backups, and while that’s happening you don’t have access to the things stored on the failed disk. With RAID you can keep using the environment without interruption while waiting a day or two for a replacement. If you have a fast connection which can download your backups in less than 24 hours it might be worth skipping RAID and saving the money, but if it takes a week or two to pull the data back, then the additional cost of RAID might be worth it. Also, if you change a lot of data during the day, it’s possible that a drive fails before the backup has finished, in which case some data is potentially lost.
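To make the bandwidth part concrete, here is a rough back-of-the-envelope sketch. The 1.5TB figure comes from the original post; the link speeds and the 80% efficiency factor are assumptions for illustration, and `restore_hours` is just a name made up for this example.

```python
def restore_hours(data_tb: float, link_mbit_s: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to download data_tb terabytes of backups.

    efficiency is an assumed fraction of the nominal link speed that is
    actually achieved (protocol overhead, provider throttling, etc.).
    """
    if data_tb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    total_bits = data_tb * 1e12 * 8                    # decimal terabytes to bits
    usable_bits_per_s = link_mbit_s * 1e6 * efficiency
    return total_bits / usable_bits_per_s / 3600


if __name__ == "__main__":
    for speed in (50, 100, 1000):  # assumed downlink speeds in Mbit/s
        print(f"{speed:>5} Mbit/s: {restore_hours(1.5, speed):6.1f} h")
```

If the result lands well under a day, skipping RAID and relying on backups is easier to justify; if it lands in the range of days or weeks, the redundancy starts to pay for itself.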

    As for which level of RAID you should use, it’s a balancing act. Personally I like to run things with RAID5 or 6 even though I have a pretty decent uplink. Also, you need to consider what the acceptable downtime is for your services. If you can’t access your photos for 48 hours it’s not the end of the world, but if your home automation is offline it can at least increase your electric bill somewhat and maybe cause some inconvenience, depending on how your setup is built.

    And in case RAID would be required, is ubuntu server good enough for this? or using something such as unraid is a must?

    Ubuntu Server is good enough. You can use either software RAID or LVM for a traditional RAID setup, or opt for a more modern approach like ZFS.

    I was thinking of probably trying to sell the 1660 super while it has some market value. However, I was never able to have the server completely headless. Is there a way to make this happen with a msi tomahawk b450? Or is only possible with an APU (such as 5600g)?

    No idea. My server has onboard graphics, but I haven’t used it for years. It’s a nice option to have in case something goes really wrong, though. You can still sell your 1660 and replace it with the cheapest GPU you can find on eBay or wherever; as long as you’re comfortable with the console, you can fix things with anything that can output plain text. If your motherboard has separate remote management (generally not available on consumer-grade boards) that might be enough to skip the GPU entirely, but personally I would not run that kind of setup, even if remote management/console was available.

    If you guys find any glaring issues with my setup

    I don’t know about actual issues, but I have spinning hard drives a lot older than my kids which still run just fine. Spinning rust is pretty robust (at least at sub-4TB capacities), so unless you really need the speed, traditional hard drives still have their place. Sure, a ton more spinning drives have failed on me than SSDs, but I have working hard drives older than SSDs as a technology (at least in the sense of what we have now), so claiming that SSDs are more robust is (at least in my experience) just a misunderstood statistic.

    • ZeDoTelhado@lemmy.worldOP
      6 days ago

      Thanks for your insights. Yes, you are for sure correct. There was a time when friends of mine lost everything because of spinning drives. But then again, none of those drives were NAS grade (and it was also a time when having 128GB was an absolute luxury).

      As for RAID, I was asking since it is something I hear a lot of people doing. In my situation, my plan is to always have an external SSD with me, plus a future remote location as a last-ditch way to save the data if really needed. So maybe it is OK for me to skip it. (And if I don’t have access to my photos for a week, no one dies.)
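As a sanity check on that plan against the usual reading of 3-2-1 (three copies, two different kinds of media, one copy off-site), here is a tiny sketch. The class, the function and the copy names are all hypothetical, made up only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # free-form label for the copy
    medium: str    # kind of storage the copy lives on
    offsite: bool  # True if it is kept away from home


def meets_3_2_1(copies: list[BackupCopy]) -> bool:
    """Check for 3 copies, on at least 2 kinds of media, with at least 1 off-site."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    at_home = [
        BackupCopy("server", "internal hdd", offsite=False),
        BackupCopy("external disk", "external ssd", offsite=False),
    ]
    print(meets_3_2_1(at_home))  # home-only stage of the plan
    print(meets_3_2_1(at_home + [BackupCopy("remote", "remote storage", offsite=True)]))
```

The home-only stage falls short on both the copy count and the off-site copy; adding the remote location is what completes it.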

      • atzanteol@sh.itjust.works
        5 days ago

        SSDs fail too. All storage is temporary…

        Setting up a simple software raid is so easy it’s almost a no-brainer for the benefit imho. There’s nothing like seeing that a drive has problems, failing it from the raid, ordering a replacement, and just swapping it out and moving on. What would otherwise be hours of data copying, fixing things that broke, and discovery of what wasn’t backed up is now 10 minutes of swapping a disk.

        • ZeDoTelhado@lemmy.worldOP
          5 days ago

          This is something I still don’t fully understand, because RAID has so many bizarre terms and configurations that for the uninitiated it is really hard to understand unless you take the time to dive into it.

          So my question is: when you talk about software RAID, which configuration do you mean? And how many drives are needed for such a configuration? Thanks in advance

          • atzanteol@sh.itjust.works
            5 days ago

            Yeah - that’s fair. I may have oversimplified a tad… The concepts behind RAID, the theory, implementations, etc. are pretty complicated. And there are many tools that do “raid-like-things” with many options about raid types… So the landscape has a lot of options.

            But once you’ve made a choice the actual “setting it up” is usually pretty simple, and there’s no real on-going support or management you need to do beyond just basic health monitoring which you’d want to do even without a RAID (e.g. smartd). Any Linux system can create and use a RAID - you don’t need anything special like Unraid. My old early-to-mid-2010’s Debian box manages a RAID with NFS just fine.
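For the "basic health monitoring" part: besides SMART checks, Linux md reports array state in /proc/mdstat, where a status like `[4/3] [UUU_]` means four devices are expected but only three are active. Here is a minimal sketch of checking that from Python. The sample text is made up for illustration (not from a real machine), and the function name is hypothetical.

```python
import re

# Illustrative text in the /proc/mdstat format (made up, not from a real machine).
# "[4/3] [UUU_]" means 4 devices expected, 3 active, last slot missing.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdc1[2] sdb1[1] sda1[0]
      11720658432 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]

unused devices: <none>
"""

ARRAY_RE = re.compile(r"^(md\w+)\s*:")                      # e.g. "md0 : active ..."
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")    # e.g. "[4/3] [UUU_]"


def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return the names of md arrays that have fewer active devices than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None  # status line found, wait for the next array header
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real box you would read the text from /proc/mdstat instead of the sample and send yourself a mail or notification when the list is not empty.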

            If you decide you want a RAID you first decide which “level” you want before talking about any specific implementations. This drives all of your future decisions including which software you use. This basically focuses on 2 questions - how much budget do you have and what is your fault tolerance?

            e.g. I have a RAID5 because I’m cheap and wanted biggest bang-for-the-buck with some failure resiliency. RAID5 lets me lose one drive and recover, and I get the storage space of N-1 drives (1 drive is redundant). Minimum size for a RAID5 is 3 drives. Wikipedia lists the standard RAID levels which are “basically” standardized even though implementations vary.

            I could have gone with RAID6 (minimum 4 disks) which can suffer a 2 drive outage. I have off-site backups so I’ve decided that the low-probability of a 2 drive failure means this option isn’t necessary for me. If I’m that unlucky I’ll restore from BackBlaze. In 10+ years of managing my own fileserver I’ve never had more than 1 drive fail at a time. I’ve definitely had drives fail though (replaced one 2 weeks ago - was basically a non-issue to fix).

            Some folks are paranoid and go with RAID1 and friends (RAID1, RAID10, etc.) which involves basically full duplication of drives. Very safe, very expensive for the same amount of usable storage. But RAID1 can work with a minimum of 2 drives. It just mirrors them so you get half the storage.
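To put numbers on those levels, here is a small sketch of usable space and guaranteed fault tolerance for equally sized drives. It assumes the textbook definitions (RAID10 as striped two-way mirrors with an even drive count); real implementations can be more flexible, and `raid_summary` is just a name made up for this example.

```python
def raid_summary(level: str, drives: int, drive_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, drive failures guaranteed to be survivable)."""
    if level == "raid1":
        if drives < 2:
            raise ValueError("RAID1 needs at least 2 drives")
        return drive_tb, drives - 1            # every drive holds a full copy
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) * drive_tb, 1      # one drive's worth of parity
    if level == "raid6":
        if drives < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return (drives - 2) * drive_tb, 2      # two drives' worth of parity
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("this sketch assumes an even number of drives, at least 4")
        return (drives // 2) * drive_tb, 1     # more can survive if they hit different mirrors
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level, count in (("raid1", 2), ("raid5", 3), ("raid5", 4), ("raid6", 4), ("raid10", 4)):
        usable, tolerated = raid_summary(level, count, 4.0)  # assumed 4TB drives
        print(f"{level:>6} x{count}: {usable:4.1f} TB usable, survives {tolerated} failure(s)")
```

With four 4TB drives, for example, RAID5 gives 12TB usable and RAID6 or RAID10 give 8TB, which is the budget versus fault tolerance trade-off described above.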

            Next the question becomes - what RAID software to use? Here there are lots of options, and this is where things can get confusing. Many people have become oddly tribal about it as well. There’s the traditional Linux “md” RAID, which I use, that operates underneath the filesystems. It basically takes my 4 disks and creates a new block device (/dev/md0) where I create my filesystems. It’s “just a disk” as far as the OS is concerned, so you can put anything you want on it - I do LVM + ext4, but you could put btrfs, zfs, etc. on it.

            These days the trend is to let the filesystems handle your disk pooling rather than a separate layer. BTRFS can create a RAID (though it cautions against RAID5), and so can ZFS. These filesystems basically build the functionality I get from md and LVM into the filesystem itself.

            But there are also tools like Unraid that will provide a nice GUI and handle the details for you. I don’t know much about it though.

            • ZeDoTelhado@lemmy.worldOP
              5 days ago

              Thanks for the reply. The breakdown is very good and I can actually see a lot of reasoning in your situation that I would also share (I do not have vast amounts of money to throw at this, and tolerating one drive failure while the remaining drives keep the boat afloat sounds about right).

              As for how to do the software RAID, I had seen md mentioned somewhere before but honestly forgot about it, since people tend to talk about Unraid a lot. From my perspective, I would probably go as simple as possible, although I will be studying how md actually works.

              Great reply :) learned a lot

              • atzanteol@sh.itjust.works
                5 days ago

                Sure thing - one thing I’ll often do for stuff like this is spin up a VM. You can throw 4x1GiB virtual drives in it and play around with creating and managing a raid using whatever you like. You can try out md, ZFS, and BTRFS without any risk - even unraid.

                Another variable to consider as well - different RAID systems have different flexibility for reshaping the RAID. For example - if you wanted to add a disk later, or swap out old drives for new ones to increase space. It’s yet another rabbit hole to go down, but something to keep in mind. When we start talking about tens of terabytes of data, you start to run out of places to temporarily put it all if you need to recreate your RAID to change its layout. :-)

  • non_burglar@lemmy.world
    5 days ago

    Your focus shouldn’t be on what technologies to use, because you can’t know what will help until you know what you’re trying to do.

    Define your use case and the problems you can see now, and the technologies to address them will become more apparent.

    • ZeDoTelhado@lemmy.worldOP
      5 days ago

      You are for sure right. I did find gaps in my current solution, which are:

      • I have several external disks that hold the only copy of their data (some of them quite old).
      • If I aggregate all of those in one spot, I will for sure need more space.
      • Right now the SSDs are grouped with LVM into a single 3TB logical volume (at the time this was OK since I was testing it out for a while). However, if one disk fails I have a problem on my hands.
      • I looked into SSD prices and my eyes started to water at how expensive it would be (hence the realization regarding disk types; I didn’t mention this before since my post was getting WAY too long).
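To put a rough number on the LVM point in the list above: with a volume spanning several drives and no redundancy, any single drive failure damages the volume, so the risk grows with the drive count. A minimal sketch, assuming independent failures and a purely hypothetical 2% yearly failure chance per drive (not a measured figure for any real model):

```python
def p_volume_loss(p_drive_fail: float, drives: int) -> float:
    """Probability that at least one of `drives` independent drives fails.

    For a spanned volume without redundancy, that is also the probability
    of losing (at least part of) the volume.
    """
    if not 0 <= p_drive_fail <= 1 or drives < 1:
        raise ValueError("invalid input")
    return 1 - (1 - p_drive_fail) ** drives


if __name__ == "__main__":
    p = 0.02  # hypothetical yearly failure probability of a single drive
    for n in (1, 3):
        print(f"{n} drive(s): {p_volume_loss(p, n) * 100:.2f} % chance of loss per year")
```

With three drives the risk is nearly three times that of a single drive, which is the opposite of what a RAID with redundancy would give.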

      Since I get this now, I am trying to understand better the landscape of solutions that can potentially fit.

      • non_burglar@lemmy.world
        5 days ago

        Sounds reasonable, and I’m sure you’re on your way to solving this.

        In my experience thinking hard about my storage needs, I’ve found that as long as I can get decent performance and a bit of redundancy, a solid and tested backup plan can fill in the rest in terms of data safety and integrity.