This subject is kind of niche, but hey… it’s new content of some kind, at least! Also, just to be up front: these projects may have reached the point of usefulness (in some cases), but they’re definitely not production ready.


ggml-sys-bleedingedge

GGML is the machine learning library that makes llama.cpp work. If you’re interested in LLMs, you’ve probably already heard of llama.cpp by now. If not, this one is probably irrelevant to you!

ggml-sys-bleedingedge is a set of low-level bindings to GGML that are automatically regenerated periodically. Theoretically it also supports things like CUDA, OpenCL and Metal via feature flags, but this is not really tested.
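For example, enabling one of those backends might look something like this in your Cargo.toml. Note that the feature name here is an assumption for illustration, so check the crate’s documentation for the exact flags it actually exposes:

```toml
[dependencies]
# The "cuda" feature name below is a guess for illustration; consult the
# crate's docs for the real feature flag names.
ggml-sys-bleedingedge = { version = "*", features = ["cuda"] }
```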

Repo: https://github.com/KerfuffleV2/ggml-sys-bleedingedge

Crate: https://crates.io/crates/ggml-sys-bleedingedge


llm-samplers

You may or may not already know this: when you evaluate an LLM, you don’t get a specific answer back. LLMs have a list of tokens they understand, referred to as their “vocabulary”. For LLaMA models, this is about 32,000 tokens. So once you’re done evaluating the LLM, you get a list of ~32,000 f32s out of it: one raw score (a “logit”) per token in the vocabulary, which can be converted into a probability for each token.

The naive approach of just picking the most probable token (“greedy sampling”) actually doesn’t work that well, so there are various approaches to filtering, sorting and selecting tokens to produce better results.
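To illustrate the idea, here’s a toy sketch of two common strategies in plain Rust. The function names are mine for illustration, not llm-samplers’ actual API:

```rust
// Toy sketch of token selection from raw logits. Not llm-samplers' API,
// just an illustration of the concepts.

/// "Greedy sampling": just pick the index of the highest-scoring token.
fn greedy(logits: &[f32]) -> usize {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap()
}

/// "Top-k" filtering: keep only the k highest-scoring tokens. A real
/// sampler would then renormalize and draw randomly from this reduced set.
fn top_k(logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = logits.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    indexed.truncate(k);
    indexed
}

fn main() {
    // Imagine a tiny 4-token vocabulary instead of ~32,000.
    let logits = [0.1f32, 2.5, 0.3, 1.9];
    println!("greedy picks token {}", greedy(&logits)); // token 1
    println!("top-2 survivors: {:?}", top_k(&logits, 2)); // [(1, 2.5), (3, 1.9)]
}
```

Real samplers chain several of these steps together (temperature, top-p, repetition penalties, and so on), which is what llm-samplers provides.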

Repo: https://github.com/KerfuffleV2/llm-samplers

Crate: https://crates.io/crates/llm-samplers


rusty-ggml

Higher-level bindings built on the ggml-sys-bleedingedge crate. Not too much to say about this one: if you want to use GGML in Rust, there aren’t that many options, and using low-level bindings directly isn’t all that pleasant.

I’m actually using this one in the next project, but it’s very, very alpha.

Repo: https://github.com/KerfuffleV2/rusty-ggml

Crate: https://crates.io/crates/rusty-ggml


smolrsrwkv

If you’re interested in LLMs, most (maybe all) of the models you know about, like LLaMA, ChatGPT, etc., are based on the Transformer architecture. RWKV is a different approach to building large language models: https://github.com/BlinkDL/RWKV-LM

This project started out “smol” as an attempt to teach myself about LLMs, but I’ve gradually added features and backends. It’s mostly useful as a learning aid and as an example of using some of my other projects. In addition to being able to run inference using ndarray (pretty slow), it now supports GGML as a backend, and I’m in the process of adding llm-samplers support.

Repo: https://github.com/KerfuffleV2/smolrsrwkv


repugnant-pickle

Last (and possibly least) is repugnant-pickle. As far as I know, it is the only Rust crate available that lets you deal with PyTorch files (which are basically zipped-up Python pickles). smolrsrwkv also uses this one to allow loading PyTorch RWKV models directly without having to convert them first.

If that’s not enough of a description: Pickle is the default Python data serialization format. It was designed by crazy people, though: it is extremely difficult to interoperate with unless you actually are Python, because it’s basically a little stack-based virtual machine that can call into arbitrary Python classes. Existing Rust crates don’t fully support it.
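To give a flavor of that “little stack-based virtual machine” idea, here’s a toy interpreter for a handful of pickle protocol-0 opcodes in plain Rust. This is just a minimal sketch of the concept, not repugnant-pickle’s actual implementation:

```rust
// Toy interpreter for a few pickle protocol-0 opcodes, to illustrate that
// pickle is a stack machine. Not repugnant-pickle's real implementation.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Mark,           // sentinel pushed by the MARK opcode
    Int(i64),
    List(Vec<Value>),
}

fn run_pickle(data: &[u8]) -> Option<Value> {
    let mut stack: Vec<Value> = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let op = data[i];
        i += 1;
        match op {
            b'(' => stack.push(Value::Mark), // MARK
            b'l' => {
                // LIST: pop items down to the mark, build a list from them.
                let mut items = Vec::new();
                while let Some(v) = stack.pop() {
                    if v == Value::Mark {
                        break;
                    }
                    items.push(v);
                }
                items.reverse();
                stack.push(Value::List(items));
            }
            b'I' => {
                // INT: newline-terminated decimal integer.
                let end = data[i..].iter().position(|&b| b == b'\n')? + i;
                let n: i64 = std::str::from_utf8(&data[i..end]).ok()?.parse().ok()?;
                stack.push(Value::Int(n));
                i = end + 1;
            }
            b'a' => {
                // APPEND: pop a value, append it to the list now on top.
                let v = stack.pop()?;
                if let Some(Value::List(l)) = stack.last_mut() {
                    l.push(v);
                }
            }
            b'p' => {
                // PUT (memoization): skip the index; this toy ignores the memo.
                let end = data[i..].iter().position(|&b| b == b'\n')? + i;
                i = end + 1;
            }
            b'.' => return stack.pop(), // STOP: the result is the top of stack
            _ => return None, // anything else is unsupported in this toy
        }
    }
    None
}

fn main() {
    // Bytes as produced by Python's `pickle.dumps([1, 2], protocol=0)`.
    let data = b"(lp0\nI1\naI2\na.";
    println!("{:?}", run_pickle(data)); // Some(List([Int(1), Int(2)]))
}
```

The real format is far hairier than this (dozens of opcodes across multiple protocol versions, plus the ability to instantiate Python classes), which is why a best-effort scraper is a pragmatic design.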

repugnant-pickle takes the approach of best-effort scraping of pickled data rather than trying to be 100% correct, so it can deal with weird pickle constructs that other crates throw their hands up at.

Repo: https://github.com/KerfuffleV2/repugnant-pickle

Crate: TBD
