• samus7070@programming.dev
    11 months ago

    The only reason I can think of is more on-device AI. LLMs like the ones behind ChatGPT are extremely greedy when it comes to RAM. There are optimizations, such as quantization, that squeeze them into a smaller memory footprint at the expense of accuracy and capability. Even some of the best phones out there today are barely capable of running a stripped-down generative AI model. When they do, the output is nowhere near as good as what the uncompressed model produces on a server.
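
    To put rough numbers on the RAM point, here is a back-of-the-envelope sketch. The 7-billion-parameter size is a hypothetical example, not a figure for any particular model, and the estimate covers the weights only (the real footprint is higher once the context cache and runtime overhead are included).

```python
def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for the model weights only, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen purely for illustration.
PARAMS = 7_000_000_000

for bits in (16, 8, 4):
    # Fewer bits per weight means less RAM, but coarser weights.
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS, bits):.1f} GB")
```

    Under those assumptions, going from 16-bit to 4-bit weights cuts the footprint to a quarter, which is the kind of trade that lets a model fit on a phone at all, at some cost in output quality.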