This is an unpopular opinion, and I get why – people crave a scapegoat. CrowdStrike undeniably pushed a faulty update demanding a low-level fix (booting into recovery). However, this incident lays bare the fragility of corporate IT, particularly for companies entrusted with vast amounts of sensitive personal information.

Robust disaster recovery plans, including automated processes to remotely reboot and remediate thousands of machines, aren’t revolutionary. They’re basic hygiene, especially when considering the potential consequences of a breach. Yet, this incident highlights a systemic failure across many organizations. While CrowdStrike erred, the real culprit is a culture of shortcuts and misplaced priorities within corporate IT.

Too often, companies throw millions at vendor contracts, lured by flashy promises and neglecting the due diligence necessary to ensure those solutions truly fit their needs. This is exacerbated by a corporate culture where CEOs, vice presidents, and managers are often more easily swayed by vendor kickbacks, gifts, and lavish trips than by investing in innovative ideas with measurable outcomes.

This misguided approach not only results in bloated IT budgets but also leaves companies vulnerable to precisely the kind of disruptions caused by the CrowdStrike incident. When decision-makers prioritize personal gain over the long-term health and security of their IT infrastructure, it’s ultimately the customers and their data that suffer.

  • LrdThndr@lemmy.world
    link
    fedilink
    English
    arrow-up
    1
    ·
    edit-2
    2 months ago

    A decade ago I worked for a regional chain of gyms with locations in 4 states.

    I was in TN. When a system would go down in SC or NC, we originally had three options:

    1. (The most common) have them put it in a box and ship it to me.
    2. I go there and fix it (rare)
    3. I walk them through fixing it over the phone (fuck my life)

    I got sick of this. So I researched options and found an open source software solution called FOG. I ran a server in our office and had little optiplex 160s running a software client that I shipped to each club. Then each machine at each club was configured to PXE boot from the fog client.

    The server contained images of every machine we commonly used. I could tell FOG which locations used which models, and it would keep the images cached on the client machines.

    If everything was okay, it would chain the boot to the os on the machine. But I could flag a machine for reimage and at next boot, the machine would check in with the local FOG client via PXE and get a complete reimage from premade images on the fog server.

    The corporate office was physically connected to one of the clubs, so I trialed the software at our adjacent club, and when it worked great, I rolled it out company wide. It was a massive success.

    So yes, I could completely reimage a computer from hundreds of miles away by clicking a few checkboxes on my computer. Since it ran in PXE, the condition of the os didn’t matter at all. It never loaded the os when it was flagged for reimage. It would even join the computer to the domain and set up that locations printers and everything. All I had to tell the low-tech gymbro sales guy on the phone to do was reboot it.

    This was free software. It saved us thousands in shipping fees alone. And brought our time to fix down from days to minutes.

    There ARE options out there.

    • magikmw@lemm.ee
      link
      fedilink
      English
      arrow-up
      0
      ·
      edit-2
      2 months ago

      This works great for stationary pcs and local servers, does nothing for public internet connected laptops in hands of users.

      The only fix here is staggered and tested updates, and apparently this update bypassed even deffered update settings that crowdstrike themselves put into their software.

      The only winning move here was to not use crowdstrike.

      • John Richard@lemmy.worldOP
        link
        fedilink
        English
        arrow-up
        0
        ·
        2 months ago

        Almost all computers can be set to PXE boot, but work laptops usually even have more advanced remote management capabilities. You ask the employee to reboot the laptop and presto!

        • magikmw@lemm.ee
          link
          fedilink
          English
          arrow-up
          1
          ·
          2 months ago

          I wonder how you’re supposed to get PXE boot to work securely over the internet. And how that helps when affected disk is still encrypted and needs unusual intervention to fix, including admin access to system files.

          I’ve been doing this for a while, and I like creative solutions, so I wonder about those issues a lot. Not much comes to my mind besides let’s recall all the laptops and do it one by one.