As always with Zitron, grab a beverage before settling in.

  • Powderhorn@beehaw.org (OP) · 2 days ago

    I don’t think it’s going to come down to these absurd datacentres. We’re only a few years off from platform-agnostic local inference at mass-market prices. Could I get a 5090? Yes. Legally? No.

    • Feyd@programming.dev · 1 day ago

      “We’re only a few years off from platform-agnostic local inference at mass-market prices.”

      What makes you confident in that? What will change?

      • Powderhorn@beehaw.org (OP) · 1 day ago

        There are already large local models. It’s a question of having the hardware, which has historically gotten more powerful with each generation. I don’t think it’s going to be phones for quite some time, but on desktop, absolutely.

        • Feyd@programming.dev · 1 day ago

          For business use, laptops without powerful graphics cards have been the norm for quite some time. Do you see businesses switching back to desktops to accommodate the power draw of local models? I think it’s pretty optimistic to expect laptops to be that powerful in the next 5 years. The advance in chip capability has slowed dramatically, and to fit into laptops the chips would also need to be far more power-efficient.

          • jarfil@beehaw.org · 20 hours ago (edited)

            Keywords: NPU, unified RAM

            Apple is doing it, AMD is doing it, phones are doing it.

            GPUs with dedicated VRAM are an inefficient way of doing inference. They’ve been great for research into what kind of NPU works best, but for LLMs that question has largely been answered. The current step is mass production.

            5 years sounds realistic, unless WW3.
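
            As a rough back-of-the-envelope for why unified RAM matters here: a model’s weight footprint is roughly parameter count × bytes per weight, plus KV-cache overhead. The figures in the sketch below are illustrative assumptions, not benchmarks.

            ```python
            # Back-of-the-envelope memory sizing for local LLM inference.
            # All figures are illustrative assumptions, not measurements.

            def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
                """Approximate size of the model weights alone, in GB."""
                bytes_per_weight = bits_per_weight / 8
                return params_billions * 1e9 * bytes_per_weight / 1e9

            for params, bits in [(7, 16), (7, 4), (13, 4), (70, 4)]:
                gb = weight_footprint_gb(params, bits)
                print(f"{params}B model @ {bits}-bit quantization: ~{gb:.1f} GB of weights")

            # A 7B model at 4-bit works out to ~3.5 GB -- small enough for the unified
            # RAM on current laptops and phones, which is the point about NPUs + shared memory.
            ```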

          • Powderhorn@beehaw.org (OP) · 1 day ago

            For the security tradeoff of sensitive data not heading to the cloud for processing? Not all businesses, but many would definitely see value in it. We’re also discussing this as though the options are binary … models could also be hosted on company servers that employees VPN into.
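
            As a minimal sketch of that non-binary option, assuming the company hosts a model behind the VPN on a server that speaks the OpenAI-compatible chat API (the kind exposed by vLLM, llama.cpp’s server, or Ollama). The hostname and model name are placeholders, not a specific product.

            ```python
            # Hypothetical sketch: query a model hosted on an internal company server,
            # reachable only over the VPN, via an OpenAI-compatible chat endpoint.
            # "llm.internal.example" and "local-7b-instruct" are placeholders.
            import requests

            resp = requests.post(
                "http://llm.internal.example:8000/v1/chat/completions",
                json={
                    "model": "local-7b-instruct",
                    "messages": [
                        {"role": "user", "content": "Summarize this week's incident reports."}
                    ],
                },
                timeout=60,
            )
            resp.raise_for_status()
            print(resp.json()["choices"][0]["message"]["content"])
            ```

            Sensitive data stays on the corporate network; from the client’s side, the only difference from a cloud API is the base URL.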

    • HakFoo@lemmy.sdf.org · 2 days ago

      I have to think that most people won’t want to do local training.

      It’s like Gentoo Linux. Yeah, you can compile everything with the exact optimal set of options for your kit, but that’s hugely inefficient when most use cases would be served by two or three pre-built options.

      If you’re just running pre-made models, plenty of them will run on a 6900XT or whatever.
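
      For the “just running pre-made models” case, the sketch below is roughly what that looks like with llama-cpp-python and a pre-quantized GGUF file. The model path is a placeholder; GPU offload works the same way whether the backend is CUDA, ROCm (for that 6900 XT), or Metal.

      ```python
      # Minimal sketch of running a pre-made, pre-quantized model locally with
      # llama-cpp-python. The GGUF path is a placeholder for whatever model you
      # downloaded -- inference only, no training.
      from llama_cpp import Llama

      llm = Llama(
          model_path="models/7b-instruct-q4_k_m.gguf",  # placeholder path
          n_gpu_layers=-1,  # offload as many layers as the GPU / unified memory holds
          n_ctx=4096,       # context window
      )

      out = llm.create_chat_completion(
          messages=[{"role": "user", "content": "Give me three weeknight dinner ideas."}],
          max_tokens=200,
      )
      print(out["choices"][0]["message"]["content"])
      ```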

      • Powderhorn@beehaw.org (OP) · 2 days ago

        I don’t expect anyone other than … I don’t even know what the current term is … geeks? batshit billionaires? to be doing training.

        I’m very much of the belief that our next big leap in LLMs is local processing. Once my interactions stay on my device, I’ll jump in.