• 12 Posts
  • 330 Comments
Joined 1 year ago
Cake day: June 14th, 2023



  • Would be interesting to know why some people downvoted this comment, if they think there’s some reason not to play The Finals on Linux. I’ve only done the tutorial so far, but the gameplay seems somewhat similar to Apex. It’s also free-to-play and uses EAC, so there are currently no issues with anti-cheat. It might not look like an indie game, but it feels like a decent alternative to Apex.


  • “The biggest rate cut in over ten years” - sounds like SVT is trying to sensationalize the news by focusing on the irrelevant parts… The actual news is that the interest rate is starting to approach where it sat before 2022, but comparing the size of this cut to other cuts over an arbitrary time period is complete nonsense. The only reason there can even be a big cut now is that there were record-sized increases two years ago. Of course there were no other big cuts during the years when the rate sat constant at around 2%… Despite the “record cut”, the actual interest rate, which is what matters to borrowers, will still be higher than in 2015-2022.



  • I bought a Razer Basilisk 3 because it was the only mouse where I could reach both thumb buttons with the fingertip-ish grip I use. It wasn’t fully supported by Linux software at first, but worst case I could program it on Windows, which I had on a dual boot at the time. Now that I can use it with Polychromatic and OpenRazer, it actually works better on Linux. On Windows the Razer software won’t let me save individual LED colours to the mouse, and it needs to be running all the time to keep them applied…



  • We just had Windows Update brick itself on our work machines due to a faulty update. The fix required updating them manually while connected to the office network, making them unusable for 2-3 hours. Another issue we’ve had is that Windows appears to monopolize virtualization HW acceleration for some memory integrity protection, which made our VMs slow and laggy. Fixing it required a combination of shell commands, settings changes, and IT support remotely changing some permission, and the issue also comes back after certain updates.

    Though I’ve also had quite a lot of Windows problems at home, back when I was still using it regularly. I’m not saying Linux usage has been problem-free, but there I can at least fix things. Windows has a tendency to give unusable error messages and make troubleshooting difficult, and even when you figure out what’s wrong, you’re at the mercy of Microsoft as to whether you’re allowed to change things on your own computer, due to the operating system’s proprietary nature.



  • The article is written in a somewhat confusing way, but you’ll most likely want to turn off Nvidia’s automatic VRAM swapping if you’re on Windows, so it doesn’t happen by accident. Partial offloading with llama.cpp is much faster AFAIK if you want to split the model between GPU and CPU, and it’s easier to find how many layers you can offload, since the model simply fails to load when you set the layer count too high.

    Also, if you want to experiment with partial offloading, maybe a 12B around Q4 would be more interesting than the same 7B model at higher precision? I haven’t checked whether anything new has come out in the last couple of months, but Mistral Nemo is fairly good IMO, though you might need to limit context to 4k or so.
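    The trial-and-error described above (raise the layer count until the model fails to load) can also be roughed out on paper first. A minimal sketch of that arithmetic, where the model size, layer count, VRAM, and overhead figures are all illustrative assumptions and not numbers from the comment:

    ```python
    # Back-of-envelope estimate of how many transformer layers fit in VRAM
    # when partially offloading a quantized model. All inputs are rough
    # assumptions; context/KV-cache overhead is folded into overhead_gb.

    def layers_that_fit(model_size_gb, n_layers, vram_gb, overhead_gb=1.5):
        """Estimate how many layers can be offloaded to the GPU."""
        per_layer_gb = model_size_gb / n_layers          # assume uniform layer size
        usable_gb = max(vram_gb - overhead_gb, 0.0)      # reserve some VRAM
        return min(n_layers, int(usable_gb / per_layer_gb))

    # e.g. a hypothetical ~7 GB 12B Q4 quant with 40 layers on an 8 GB card
    print(layers_that_fit(7.0, 40, 8.0))  # → 37
    ```

    In practice you’d then pass a number near that estimate as the GPU-layer count and lower it if loading still fails.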




  • Mixtral in particular runs great with partial offloading; I used a Q4_K_M quant with only 12GB of VRAM.

    To answer your original question, I think it depends on the model and use case. Complex logic such as programming seems to suffer the most from quantization, while RP/chat can take much heavier quantization while staying coherent. I think most people find that quantization around 4-5 bpw gives the best value, and you get real diminishing returns above 6 bpw, so I know few who think it’s worth using 8 bpw.

    Personally I always use as large a model as I can. With Q2 quantization the 70B models I’ve used occasionally give bad results, but they often feel smarter than a 35B at Q4. Though it’s of course difficult to compare models from completely different families, e.g. command-r vs llama, and there are not that many options in the 30B range. I’d take a 35B Q4 over a 12B Q8 any day though, and a 12B Q4 over a 7B Q8, etc. In the end I think you’ll have to test for yourself and see which model and quant combination gives the best results at an inference speed you consider usable.
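    The size-vs-precision trade-off above comes down to simple arithmetic: weight memory is roughly parameter count times bits per weight. A sketch of that math (the bpw values are illustrative approximations, not exact figures for any specific quant, and KV-cache memory is ignored):

    ```python
    # Rough weight-memory math behind the "bigger model at lower bpw"
    # trade-off. Pure arithmetic; bpw values are approximate assumptions.

    def model_size_gb(n_params_billion, bits_per_weight):
        """Approximate weight memory in GB: params * bpw / 8 bits per byte."""
        return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

    # A 70B model at ~2.6 bpw vs a 35B model at ~4.5 bpw:
    print(round(model_size_gb(70, 2.6), 1))  # → 22.8
    print(round(model_size_gb(35, 4.5), 1))  # → 19.7
    ```

    That is, a heavily quantized 70B can land in the same memory ballpark as a 35B Q4, which is why the comparison between the two is even possible on the same hardware.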







  • No need to even bring Russia into it; Hungary is an authoritarian state with barely any democracy left, and it shouldn’t be driving any issue concerning other EU countries’ politics. I find it frightening enough that the idea of being able to have a private conversation with family and friends is seen as something bad, and even portrayed as something dangerous. If those in power think all online conversations must be monitored, are they also against people having private conversations IRL?