What are your thoughts on #privacy and #itsecurity regarding the #LocalLLMs you use? They seem to be an alternative to ChatGPT, MS Copilot etc., which are basically creepy privacy black boxes. How can you be sure that local LLMs A) don’t “phone home”, B) don’t build a profile on you, and C) keep their analysis restricted to the scope of your machine? As far as I can see, #ollama and #lmstudio do not provide privacy statements.
Take a look at https://nano-gpt.com/; they have all the models available and respect your privacy.
“respect your privacy” is a vague buzzword phrase, and for a post about local LLMs, linking a client that calls APIs which log user data is unhelpful
Thanks.
I feel it would be constructive if people who downvoted the OP (I am not one of them) told them why. That way the OP can learn what this community expects, and people who stumble across downvoted comments can clearly see why and learn from it.
Before doing that, I would very carefully describe the problem I want to solve and the other possible solutions. There are (relatively uncommon) situations where LLMs make sense, but many people are buying the snake oil when they don’t need it. Wouldn’t want to be played for a fool.
D) what is AMD support like, or are the Python fanboys still focusing on Nvidia exclusively?
It is slow. The syntax & community idioms suck. The package ecosystem is a giant mess: constant dependency breakage, repeated supply-chain attacks, and quality all over the place, with many packages shipping failing tests or non-reproducible builds. Much of that is probably a consequence of so many places insisting it’s the first language you should learn. When it comes to running Python software on my machine, it is always the buggiest, breaks the most when new releases ship, & uses more resources than anything else.
When I used to program in it, I thought Python was so versatile that it was the 2nd-best language at everything. I learned more languages & decided it was 3rd-best… then 4th… then realized it isn’t good at anything. The only thing it has going for it is all the effort put into the big C libraries powering the math, AI, etc. packages.
that’s an oversimplification.
python is slow because it’s meant as glue; all the important parts of the ml libraries are written in other languages.
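to put a rough number on the glue point, here’s a quick sketch; timings are machine-dependent, but the gap is typically one to two orders of magnitude:

```python
import time

import numpy as np

n = 10_000_000
xs = list(range(n))
arr = np.arange(n, dtype=np.int64)

t0 = time.perf_counter()
total_py = sum(x * x for x in xs)   # every multiply goes through the interpreter
t1 = time.perf_counter()
total_np = int((arr * arr).sum())   # the same loop, executed in compiled C
t2 = time.perf_counter()

assert total_py == total_np
print(f"pure python: {t1 - t0:.2f}s  numpy: {t2 - t1:.3f}s")
```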
all the dependency stuff is due to running outside of a managed environment, which has been the norm for 10 years now. yes, venv/bin/activate is clunky, but it solves the problem.
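and if the shell ritual is the objection, the stdlib can do the same thing programmatically; a minimal sketch, POSIX paths assumed:

```python
import subprocess
import venv

# Create an isolated environment; packages installed into it can't break
# (or be broken by) the system Python or other projects.
venv.create(".venv", with_pip=True)

# Install by invoking the venv's own pip, not the global one.
# POSIX path shown; on Windows it's .venv\Scripts\pip.exe.
subprocess.run([".venv/bin/pip", "install", "requests"], check=True)
```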
also, what supply-chain attacks?
lua is probably a better first language though.
I’m running gpt4all on AMD. Had to figure out which packages to install, which took a while, but since then it runs just fine.
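For what it’s worth, once the packages were sorted the Python side was the easy part. A minimal sketch of the gpt4all bindings; the model file and device string here are just examples, check their docs for what fits your hardware:

```python
from gpt4all import GPT4All

# Model file and device string are assumptions, not a recommendation;
# device="amd" asks for the GPU backend on AMD, omit it to run on CPU.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", device="amd")

with model.chat_session():
    print(model.generate("Why is the sky blue?", max_tokens=256))
```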
Good to know. Is there a particular guide that you followed to get it running on AMD?
have you looked at backyard ai?
Since you ask, here are my thoughts https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence with numerous examples. To clarify your points:
- rely on open-source repositories where the code is auditable, hopefully audited, and test offline
- see previous point
- LLMs don’t “analyze” anything, they just spit out human-looking text
To clarify the first point, as the other 2 follow from it: such projects would instantly lose credibility if they were to sneak in telemetry. Some FLOSS projects tried that in the past and it always led to uproar, reverts, and often forks of the exact same codebase but without the telemetry.
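For point A specifically you don’t even have to take that on trust: pull a model, disconnect the machine from the network, and check that inference still works. A minimal sketch against ollama’s default local API (the model name is an assumption, use whatever you’ve pulled):

```python
import json
import urllib.request

# With the machine disconnected from the internet (pull the model first),
# a successful reply is strong evidence inference is happening locally.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",  # ollama's default local endpoint
    data=json.dumps({
        "model": "llama3",                  # any model you've already pulled
        "prompt": "Say hello.",
        "stream": False,
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["response"])
```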
“They just spit out human-looking text” is so incredibly regressive and asinine.
But it’s accurate? That doesn’t mean human-looking text can’t be helpful to some, but it also helps keep us grounded in the reality of the tech.