• communism@lemmy.ml
    2 months ago

    OP isn’t trying to parse HTML though… they are trying to detect opening XML tags, which seems quite achievable with regex.

    • winterayars@sh.itjust.works
      2 months ago

      It’s still actually pretty sketchy, depending on exactly what you want to do. Strict regex still won’t be able to correctly match what an HTML parser considers the opening tag, though fancier regex will. If you’re just looking for the tags in the HTML document treated as flat text, it’s doable, though. (Mostly.)
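
To make the "flat document" case concrete, here is a minimal Python sketch. The pattern and the helper name `opening_tags` are my own assumptions for illustration, not something OP posted. It handles quoted attribute values that contain `>`, but it has no idea about comments, CDATA, or `<script>` bodies, which is roughly where the "(Mostly.)" comes in.

```python
import re

# Hypothetical pattern (an assumption, not from the thread): '<', a tag name,
# zero or more attributes with optional quoted/unquoted values, optional '/', '>'.
OPENING_TAG = re.compile(
    r"""<([A-Za-z_][\w:.-]*)                 # tag name (captured)
        (?:\s+[^\s"'<>/=]+                   # attribute name
           (?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'<>=`]+))?   # optional value
        )*
        \s*/?>                               # optional self-closing slash
    """,
    re.VERBOSE,
)

def opening_tags(text):
    """Return the names of everything that looks like an opening tag."""
    return [m.group(1) for m in OPENING_TAG.finditer(text)]

sample = '<a href="x>y"><b/></a><!-- <c> --><br>'
print(opening_tags(sample))  # ['a', 'b', 'c', 'br']
```

Note the `c` in the output: the regex happily reports a tag that sits inside a comment, which a real parser would never treat as an element. Closing tags like `</a>` are skipped because the name has to start right after `<`.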