Seems like the model you mentioned is more like a fine-tuned Llama?
Specifically, these are fine-tuned versions of Qwen and Llama, trained on a dataset of 800k samples generated by DeepSeek R1.
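For anyone wanting to try one, they load like any other Hugging Face causal LM; a minimal sketch, assuming the `transformers` library (plus `accelerate` for `device_map`) and using the public `deepseek-ai/DeepSeek-R1-Distill-Llama-8B` checkpoint as an example:

```python
# Minimal sketch: a DeepSeek-R1 distill loads like any other causal LM.
# Assumes `transformers` + `accelerate` are installed and a GPU is available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Why is the sky blue?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens (the reasoning trace + answer).
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```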
Yeah, it’s distilled from DeepSeek and abliterated. The non-abliterated ones give you much the same responses as DeepSeek R1.
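For the curious, abliteration roughly means estimating a "refusal direction" in the model's residual stream (from activation differences on refusing vs. non-refusing prompts) and projecting it out of the weights. A toy numpy sketch of just the projection step, with a random direction purely for illustration:

```python
import numpy as np

# Toy sketch of the core abliteration step: orthogonalize a weight
# matrix against a single "refusal direction" r, so the layer can no
# longer write along r. In practice r is estimated from activations;
# here it is random, purely for illustration.
rng = np.random.default_rng(0)
d_model = 64

refusal_dir = rng.normal(size=d_model)
refusal_dir /= np.linalg.norm(refusal_dir)  # unit vector r

W = rng.normal(size=(d_model, d_model))     # stand-in for a projection weight

# W' = W - r r^T W removes the component of W's output that lies along r.
W_abliterated = W - np.outer(refusal_dir, refusal_dir) @ W

# Sanity check: outputs of the modified layer have ~zero component along r.
x = rng.normal(size=d_model)
print(np.dot(refusal_dir, W_abliterated @ x))  # ≈ 0
```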