A new accessibility feature is coming to Firefox: an “AI-powered” alt-text generator.


"Starting in Firefox 130, we will automatically generate an alt text and let the user validate it. So every time an image is added, we get an array of pixels we pass to the ML engine and a few seconds after, we get a string corresponding to a description of this image (see the code).

Our alt text generator is far from perfect, but we want to take an iterative approach and improve it in the open.

We are currently working on improving the image-to-text datasets and model with what we’ve described in this blog post…"
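The flow described in the quote (pixels in, a description string out, then user validation) can be sketched roughly as below. This is only an illustration under assumptions: `suggest_alt_text`, `validate`, `AltTextSuggestion`, and `fake_captioner` are hypothetical names, not Firefox’s actual code, and the placeholder captioner stands in for the real ML engine.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical types: not Firefox's real API, just the shape of the flow.
Pixels = Sequence[Sequence[Tuple[int, int, int]]]  # rows of RGB pixels
Captioner = Callable[[Pixels], str]                # stand-in for the ML engine


@dataclass
class AltTextSuggestion:
    text: str
    accepted: bool = False  # stays False until the user has validated it


def suggest_alt_text(pixels: Pixels, captioner: Captioner) -> AltTextSuggestion:
    """Pass the pixel array to the captioner and wrap the result as a draft."""
    caption = captioner(pixels).strip()
    return AltTextSuggestion(text=caption)


def validate(suggestion: AltTextSuggestion, user_edit: Optional[str]) -> AltTextSuggestion:
    """The user accepts the draft as-is (None) or replaces it with their own text."""
    final_text = suggestion.text if user_edit is None else user_edit.strip()
    return AltTextSuggestion(text=final_text, accepted=True)


def fake_captioner(pixels: Pixels) -> str:
    # Placeholder model (assumption): reports only the image size.
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    return f"An image, {width} by {height} pixels "
```

The point of keeping `validate` as a separate step is that nothing is applied until a person has confirmed or rewritten the draft, which matches the “let the user validate it” part of the announcement.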

115 points

Overall, I see nothing wrong with this. It encourages users to support alt-text more, which we should be doing for our disabled friends anyway. I really like the confirmation before applying.

40 points

On the one hand, having an AI-generated alt-text on the client side would be much better than not having any alt-text at all. On the other hand, the pessimist in me thinks that if it becomes widely available, website makers will feel less of a need to add proper alt-text to their content.

26 points

A more optimistic way of looking at it is that this tool makes people more interested in alt-text in general, meaning more tools are developed to make use of it, meaning more web devs bother with it in the first place (either using this tool or manually).

7 points

If they feel less need to add proper alt-text because people’s browsers are doing a better job anyway, I don’t see why that’s a problem. The end result is better alt text.

9 points

I don’t think they’re likely to do a better job than humans any time soon. We can hope that it won’t be extremely misleading too often.

4 points

True, but if it genuinely works really well then does it really matter? Seems like the change would be a net positive.

1 point

Sounds like Proton and Linux gaming.

27 points

The biggest problem with AI alt text is that it lacks the ability to determine and add in context, which is particularly important in social media image descriptions. But people adding useless alt text isn’t exactly a new thing either. If people treat this as a starting place for adding an alt text description and not a “click it and I don’t have to think about it” solution I’m massively in support of it.

11 points

They just need to gamify it. Have a “Verified Accurate Alt-Text Submissions” leaderboard or something.

5 points

I’d expect it wouldn’t be too hard to expand the context fed into the AI from just the pixels to include adjacent text as well. Multimodal AIs can accept both kinds of input. Might as well start with the basics though.

25 points

I like this approach of having a model locally and running it locally. I’ve been using the Firefox website translator and it’s great. Handy, and it doesn’t send my data to Google. That I know of, ha.

1 point

The only issue with Firefox’s translator currently is the time it takes to load at first, or the fact that you have to download each model first. It’s not some monumental task, but it does have more friction than Google’s “automatically send the site you are browsing to our server”.

22 points

Neat. I just hope it can be disabled to save power.

6 points

Power management is going to be a huge emerging issue with the deployment of transformer model inference to the edge.

I foresee some backpedaling from this idea that “one model can do everything”. LLMs have their place, but sometimes a good old LSTM or CNN is a better choice.

15 points

Babe, another pointless AI just dropped

40 points

This is actually one of the few cases where it makes sense. It’s for alt-text for people who browse with TTS.

17 points

Yeah, this is actually a pretty great application for AI. It’s local, privacy-preserving and genuinely useful for an underserved demographic.

One of the most wholesome and actually useful applications for LLMs/CLIP that I’ve seen.

30 points

“I don’t need Alt text so it must be useless”

27 points

It’s not pointless; it’s amazing for accessibility, especially in PDFs.

2 points

Well, I do agree it’ll be useful for people who need it, but for most people it’s pretty pointless. I hope they at least don’t enable it by default, like Windoze Sticky Keys, because AI uses a lot of system resources for little benefit, especially self-hosted AI.

12 points

Beehaw is a safe space; we shouldn’t vilify the experiences and needs of people who need alt-text. This could be game-changing for people who need it.

20 points

It’s for blind people. It lets them know what is in images using a screen reader. Just because it doesn’t apply to you doesn’t mean it’s useless.

12 points

Think AI is pointless when it doesn’t apply to you?

2 points

If you had a visual disability you would certainly think otherwise.

1 point

Tell me you don’t add alt text to your posts without telling me :p


Technology

!technology@beehaw.org

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

This community’s icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
