26 points

I tried it on a 6900 XT recently and generation time was well under half a second.

Results are not as good as with SDXL but for the time it needs it’s very impressive.

18 points

The author can’t type very quickly

14 points

I’d guess that the ‘realtime’ is a quote from StabilityAI and of course they’re running that stuff on an A100. A couple of seconds is still interactive rate as generally speaking you want to think about the changes you’re making to your conditioning.

Haven’t tried it yet, but if individual steps of XL Turbo take roughly as much time as LCM steps then… well, it’s four to eight times faster. As quality generally isn’t production-ready, we’re talking about rough prompt prototyping, testing out an animation pipeline, that kind of thing. The caveat is that increasing the step count often leads to markedly different results (a complete change of composition, not just details), so the information you gain from those preview-quality images is limited.

Oh, “production ready quality”: image quality being roughly on par with 4-step LCM means that it’s nowhere near production grade. For the final render you still want to give the model more steps. OTOH I’ve found that some LCM-based merges do in 30 steps what other models need 80 steps for, so improvements are always welcome. But I’m also worried about these distilled models being less flexible, pruning only slightly trodden paths that you actually might want the model to take.

EDIT: Addendum: I’m not seeing anything about using this stuff as a LoRA. The nice thing about LCM is that you can take any model you have on your disk and turn it pretty much instantly into a model that can generate fast previews. Also, VAE decoding can already be slower than generation with LCM, so, yeah. I guess having something in between the full VAE and TAESD would be nice. TAESD is fast but quite limited when it comes to details, so much so that you might not even be able to see what kind of texture SD generated. Oh, and it also tends to get colours wrong; at least in my experience it tends to be oversaturated.

4 points

Well, it is technically as fast as you can type if you’re running a better GPU. The 3060 is pretty mid-tier at this point.

5 points

Low end card.

I’ll get crucified for saying that because people will interpret that as an attack on their PC or something daft like that. It’s not.

It’s Ampere, a GPU architecture from 3.5 years ago. And even then, here’s what the desktop stack was like:

  1. 3090 Ti (GA102)

  2. 3090 (GA102)

  3. 3080 Ti (GA102)

  4. 3080 12GB (GA102)

  5. 3080 (GA102)

  6. 3070 Ti (GA102/GA104)

  7. 3070 (GA104)

  8. 3060 Ti (GA104/GA103)

  9. 3060 (GA106/GA104)

  10. 3050 (GA106/GA107)

It was almost at the bottom of Nvidia’s stack 3 years ago. It was a low end card then (because, you know, it was at the bottom end of what they were offering). It’s an even more low end card now.

People are always fooled by Nvidia’s marketing into thinking they’re getting a mid range card, when in reality Nvidia’s giving people the scraps and pretending they’re giving you a great deal. People need to demand more from these companies.

Nvidia takes a low end card, slaps a $400 price tag on it, calls it mid range, and people lap it up every time.

3 points

The pricing makes it a mid range card, because the budget end is just gone these days.

2 points

I know it’s low-end when compared to the newer generations but if we call a 3060 low-end then what do we call people with older GPUs like a 1070?

1 point

I’m on a 3060 and with 4x upscaling it takes about a second and a half.

45 points

XL Turbotastic Mega Ginormous, etc. Hate naming schemes like this. Why not just make it v2.0 or the Pro version instead? Why use multiple words that make it sound bigger and better? Marketing BS that just sounds dumb.

48 points

Not sure why you have a problem with it, the naming here makes a lot of sense if you know the context.

Stable Diffusion --> The original SD with versions like 1.5, 2.0, 2.1 etc

Stable Diffusion XL --> A version of SD with much bigger training data and support for much larger resolutions (hence, XL)

Stable Diffusion XL Turbo --> A version of SDXL that is much faster (hence, Turbo)

They have different names because they’re actually different things, it’s not exactly a v1.0 --> v2.0 scenario

8 points

Thanks for the context. That does make it much less redundant.

3 points

Naming schemes that aren’t clear are absolute garbage.

What if you’re new to it, and there are 6 different recent versions of something all named with a description instead of version number? Is Jumbo newer than Mega?

Fuck it, I’m ranting about this because it still upsets me.

I wanted to buy a 3DS to play Shovel Knight and Binding of Isaac. Reading up on them, BoI would only play on a New 3DS XL. Cool.

Went to the store and bought a new 3DS XL only to find out I got the wrong one. What I wanted was a NEW 3DS XL, and what I got was a 3DS XL that was new. There is a difference, and it took me 4 days to notice, and I was working out of town for the next month. So I can’t return it. FUN!

So screw naming new versions of things with names instead of numbers. But somehow, Microsoft screwed that one up.

KISS: Keep it simple, stupid.

5 points

Sure, 3DS names are dumb, but this is definitely not the case here. Using version numbers instead of different names for different things causes insane confusion and having to over-explain what it is.

See: DLSS

DLSS 2 is just DLSS 1 but better. DLSS 3 is frame generation that isn’t compatible with most hardware. DLSS 3.5 is similar to DLSS 2 but includes enhanced raytracing denoising.

It’s a nightmare. Making a version 2, 3, 4 etc of something also makes it sound like there’s no reason to use the old version, whereas a lot of people are still using the regular stable diffusion over stable diffusion XL.

Imagine if the discussion was “Hey don’t use Stable Diffusion 3 since you need a lot of VRAM, you should be using Stable Diffusion 1.5 or Stable Diffusion 2.1, but also it’s worth getting a new GPU for Stable Diffusion 4 cuz it’s very fast but has lower quality than version 3”

1 point

Yeah, but the next version has an even bigger training set, so what then? XXL? And what about the one after that? Turbo was already used, so now we call it Nitro? This is not the “New Kids” movies, you know…

33 points

> Why not just make it v2.0 or the Pro version instead?

“Pro version” is equally cringe.

2 points

Yeah, I get that. It would just have made more sense given that it’s widely used. Though I’ve been told why the name is so weird, and it makes some sense now.

0 points

Here are my suggestions:

Stable Diffusion Free

Stable Diffusion Paid with Limitations

Stable Diffusion Paid Unlimited

10 points

I agree with you in general, but for Stable Diffusion, “2.0/2.1” was not an incremental direct improvement on “1.5” but was trained and behaves differently. XL is not a simple upgrade from 2.0, and since they say this Turbo model doesn’t produce as detailed images it would be more confusing to have SDXL 2.0 that is worse but faster than base SDXL, and then presumably when there’s a more direct improvement to SDXL have that be called SDXL 3.0 (but really it’s version 2) etc.

It’s less like Windows 95->Windows 98 and more like DOS->Windows NT.

That’s not to say it all couldn’t have been better named. Personally, instead of ‘XL’ I’d rather they start including the base resolution and something to reference whether it uses a refiner model etc.

(Note: I use Stable Diffusion but am not involved with the AI/ML community and don’t fully understand the tech – I’m not trying to claim expert knowledge; this is just my interpretation.)

3 points

AFAIU SDXL is actually an, erm, genetic descendant of SD1.5, with its architecture expanded, weights transferred from 1.5, and then trained on bigger inputs (512x512 in the end is awfully small). SD2.0 is a completely new model, trained from scratch, and as far as I’m aware no one’s actually using it. Also, no one is using the SDXL refiner: if you go to civitai it’s all models with detailer capabilities baked in. What you do see is workflows that generate an image, add some noise at the very end and repeat the last couple of steps. Using the base SDXL refiner on the output of other SDXL models is sometimes downright comical, because it sometimes has no idea what it’s looking at and then produces exquisite surface texture details of the wrong material. Say, a silk keyboard, because it doesn’t realise that it’s supposed to be ABS and, well, black silk exists.

2 points

Yeah I got some good replies to my comment explaining it. Makes more sense now.

2 points

I’m just glad we’re moving away from purposely misspelled product SEO hacks.

0 points

I heard they were all child murderers! 😱

20 points

This isn’t free BTW folks

3 points

I haven’t messed with any AI imaging stuff yet. Any free recommendations to just have some fun?

1 point

Bing and OpenAI still have free stuff. Bing’s is actually really good.

8 points

Great, even more online noise that I can look forward to.

6 points

And the resulting faces still all have lazy eyes, asymmetric features, and significantly uncanny issues.

15 points

Humans have asymmetric features. No one is symmetrical

3 points

These features are abnormally asymmetric to the point of being off-putting. General symmetry of features is a significant part of what attracts people to one another, and it is why facial droops from things like Bell’s palsy or strokes can often be psychologically difficult for the patient who experiences them.

General symmetry, not exact symmetry.

2 points

Anecdote: I think Denzel Washington is supposed to have one of the most symmetrical faces.

2 points

You can easily get incredibly canny stuff.


Technology

!technology@lemmy.world
