• diffuselight@lemmy.world · 1 year ago (edited)

    I think at this point we are arguing belief.

    I actually work with this stuff daily, and there are a number of 30B models that exceed ChatGPT on specific tasks such as coding or content generation, especially when enhanced with a LoRA.
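
    If you haven't seen how that works, here is a minimal sketch of attaching a LoRA adapter to a base model with transformers + peft; the model and adapter names below are placeholders, not specific recommendations:

    ```python
    # Minimal sketch: applying a LoRA adapter on top of a base causal LM.
    # Both model IDs below are hypothetical placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "some-org/some-30b-base"       # placeholder base model
    adapter_id = "some-org/some-task-lora"   # placeholder LoRA adapter

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(model, adapter_id)  # merges in the low-rank deltas at inference time

    inputs = tokenizer("Write a short story about", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=100)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```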

    airoboros-33b1gpt4-1.4.SuperHOT-8k, for example, comfortably outputs more than 10 tokens/s on a 3090 and beats GPT-3.5 at writing stories, probably because it's uncensored. It also has an 8k context window instead of 4k.
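
    Roughly what that looks like through llama-cpp-python, assuming a CUDA-enabled build and a local copy of the model weights (the file path is a placeholder); SuperHOT variants reach 8k context via RoPE scaling, hence the scale factor:

    ```python
    # Minimal sketch: running an 8k-context model on a single GPU.
    # The model path is a placeholder for a locally downloaded file.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./airoboros-33b-superhot-8k.bin",  # placeholder path
        n_ctx=8192,           # SuperHOT's extended 8k context window
        n_gpu_layers=-1,      # offload all layers to the GPU (e.g. a 3090)
        rope_freq_scale=0.5,  # RoPE scaling that SuperHOT models expect for 8k
    )

    out = llm("Write the opening paragraph of a story:", max_tokens=200)
    print(out["choices"][0]["text"])
    ```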

    Several recent LLaMA 2-based models exceed ChatGPT on coding and classification tasks and are approaching GPT-4 territory. Google Bard has already been clobbered into a pulp.

    The speed of advances is stunning.

    M-series Macs can run large LLMs via llama.cpp because of their unified memory architecture; in fact, a recent MacBook Air with 64GB can comfortably run most models just fine. Even notebook AMD GPUs with shared memory have started running generative AI in the last week.
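
    For the Mac side, a sketch under the assumption of a Metal-enabled llama-cpp-python build (the path is again a placeholder); the same offload flag routes through Metal, and the GPU draws from the shared unified memory pool rather than a fixed VRAM budget:

    ```python
    # Minimal sketch: the same llama.cpp path on Apple Silicon.
    # With a Metal-enabled build, offloaded layers run on the GPU,
    # which shares unified memory with the CPU.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./some-large-model.bin",  # placeholder local file
        n_ctx=4096,
        n_gpu_layers=-1,  # on M-series Macs this offload goes through Metal
    )
    print(llm("Hello", max_tokens=32)["choices"][0]["text"])
    ```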

    • hottari@lemmy.ml · 1 year ago

      “recent MacBook Air with 64GB”

      How much does this cost?

      You will answer any and every question but this.

      My points still stand.