if you could pick a standard format for a purpose what would it be and why?

e.g. flac for lossless audio because…

(yes you can add new categories)

summary:

  1. photos .jxl
  2. open domain image data .exr
  3. videos .av1
  4. lossless audio .flac
  5. lossy audio .opus
  6. subtitles srt/ass
  7. fonts .otf
  8. container mkv (doesn't contain .jxl)
  9. plain text utf-8 (many also say markup but disagree on the implementation)
  10. documents .odt
  11. archive files (this one is causing a bloodbath so i picked randomly) .tar.zst
  12. configuration files toml
  13. typesetting typst
  14. interchange format .ora
  15. models .gltf / .glb
  16. daw session files .dawproject
  17. otdr measurement results .xml
132 points

Just going to leave this xkcd comic here.

Yes, you already know what it is.

26 points

One could say it is the standard comic for these kinds of discussions.

5 points

There are too many of these comics, I’ll make one to be the true comic response and unite all the different competing standards

5 points

🪛

2 points

how did i know it was standards

now, proliferate

113 points

Open Document Standard (.odt) for all documents. In all public institutions (it’s already a NATO standard for documents).

Because the Microsoft Word formats (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit a .docx file because it breaks the layout so easily. And some older .doc files don't even open correctly in Microsoft Word anymore.

Actually, IMHO, there should be a better alternative to .odt as well: something more declarative/scripted, like LaTeX, but still WYSIWYG. LaTeX (and XeTeX, for my use cases) is too messy for me to work with, especially when a package is Byzantine. And it can be non-reproducible if I share/reuse the same document somewhere else.

Something has to be made with document files.

21 points

Markdown, asciidoc, restructuredtext are kinda like simple alternatives to LaTeX

17 points

It is unbelievable that we do not have a standard document format.

14 points

What’s messed up is that, technically, we do. Originally, OpenDocument was the ISO standard document format. But then, baffling everyone, Microsoft got ISO to ratify .docx as a second ISO standard. So now we have two competing document standards, the second of which is simply worse.

1 point

That’s awful, we should design something that covers both use cases!

There are now 3 competing standards.

15 points

I was too young to use it in any serious context, but I kinda dig how WordPerfect did formatting: the formatting codes are hidden by default, but you can reveal and manipulate them as needed.

It might already be a thing, but I imagine a LaTeX-based standard for document formatting would do well with a WYSIWYG editor that hides the complexity by default but keeps it available for those who need to manipulate it.

10 points

There are programs (LyX, TeXmacs) that implement WYSIWYG for LaTeX; TeXmacs is exceptionally good. I don’t know about the standards, though.

Another problem with LaTeX and most other document formats is that they are so bloated and depend on so many other pieces that it is hardly possible to embed the tool into a larger application. That’s a bit of criticism of the UNIX design philosophy as well. And LaTeX code is especially hard to make portable.

There used to be a similar situation with PDFs: it was really hard to display a PDF embedded in an application. Finally, Firefox’s pdf.js came along and solved that issue.

The only embeddable and easy-to-implement standard that describes a ‘document’ is HTML, for now (with JavaScript for scripting). Only it’s not aware of page layout. If only there were an extension standard that could turn an HTML page into a document…

4 points

I was actually thinking of something like markdown or HTML forming the base of that standard. But it’s almost impossible (is it?) to do page layout with either of them.

But yeah! What I was thinking when I mentioned a LaTeX-based standard is to have a base set of “modules” (for a lack of a better term) that everyone should have and that would guarantee interoperability. That it’s possible to create a document with the exact layout one wants with just the base standard functionality. That things won’t be broken when opening up a document in a different editor.

There could be additional modules to facilitate things, but nothing like the 90’s proprietary IE tags. The way I’m imagining this is that the additional modules would work on the base modules, making things slightly easier but that they ultimately depend on the base functionality.

IDK, it’s really an idea that probably won’t work upon further investigation, but I just really like the idea of an open standard for documents based on LaTeX (kinda like how HTML has been for web pages), where you could work on it as a text file (with all the tags) if needed.

3 points

Finally, Firefox pdf.js came in and solved that issue.

Which uses a bloated and convoluted scripting language specialized in manipulating HTML.

11 points

Bro, trying to set padding in MS Word when you know… YOU KNOOOOW… it can convert to HTML. It drives me up the wall.

And don’t get me started on excel.

Kill em all, I say.

89 points
*

zip or 7z for compressed archives. I hate that, for some reason, rar has become the de facto standard for piracy. It’s just so bad.

The other day I saw a tar.gz containing a multipart-rar which contained an iso which contained a compressed bin file with an exe to decompress it. Soooo unnecessary.

Edit: And the decompressed game of course has all of its compressed assets in renamed zip files.

51 points

A .tarducken, if you will.

10 points

Ziptarar?

35 points

It was originally rar because it’s so easy to separate into multiple files. Now you can do that in other formats, but the legacy has stuck.

11 points

Not just that. RAR also has recovery records.

18 points

.tar.zstd all the way IMO. I’ve almost entirely switched to archiving with zstd, it’s a fantastic format.

5 points

why not gzip?

18 points

Gzip is slower and produces larger files (a worse compression ratio). Zstandard, on the other hand, is dramatically faster than any of the existing standards in terms of compression speed; that is its killer feature. It also provides a somewhat better compression ratio than gzip [citation needed].

5 points
*

gzip is very slow compared to zstd for similar levels of compression.

The zstd algorithm is a project by the same author as lz4. lz4 was designed for decompression speed; zstd was designed to balance resource utilization, speed, and compression ratio, and it does a fantastic job of it.
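A quick way to see this for yourself: a benchmark sketch with a made-up file name (assumes gzip and zstd are installed; both tools keep the input file here):

```shell
# Generate a compressible test file (hypothetical name).
seq 1 200000 > data.txt

time gzip -9 -k data.txt    # highest gzip level -> data.txt.gz  (-k keeps the input)
time zstd -19 -q data.txt   # high zstd level    -> data.txt.zst (input kept by default)

ls -l data.txt.gz data.txt.zst   # compare sizes and the timings above yourself
```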

3 points

The only annoying thing is that the extension for zstd compression is .zst (no d). Tar does not recognize a .zstd extension; only .zst is automatically recognized and decompressed. Come on!
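For illustration, GNU tar’s `-a` (`--auto-compress`) flag is what does that extension sniffing when creating archives, and only `.zst` is in its table (names below are placeholders; GNU tar with zstd support is assumed):

```shell
mkdir -p somedir && echo hi > somedir/f.txt

# -a picks zstd because the output name ends in .zst
tar -caf archive.tar.zst somedir

zstd -t archive.tar.zst      # integrity check confirms it really is zstd data
tar -tf archive.tar.zst      # listing auto-detects the compression on read
```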

2 points
*

If we’re being entirely honest just about everything in the zstd ecosystem needs some basic UX love. Working with .tar.zst files in any GUI is an exercise in frustration as well.

I think they recently implemented support for chunked decoding so reading files inside a zstd archive (like, say, seeking to read inside tar files) should start to improve sooner or later but some of the niceties we expect from compressed archives aren’t entirely there yet.

Fantastic compression though!

4 points

.tar.xz masterrace

2 points

This comment didn’t age well.

89 points
*

This is the kind of thing I think about all the time, so I have a few.

  • Archive files: .tar.zst
    • Produces better compression ratios than the DEFLATE compression algorithm (used by .zip and gzip/.gz) and does so faster.
    • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.
    • .tar.xz is also very good and seems more popular (probably since it was released 6 years earlier, in 2009), but, when tuned to its maximum compression level, .tar.zst can achieve a compression ratio pretty close to LZMA (used by .tar.xz and .7z) and do it faster[1].

      zstd and xz trade blows in their compression ratio. Recompressing all packages to zstd with our options yields a total ~0.8% increase in package size on all of our packages combined, but the decompression time for all packages saw a ~1300% speedup.

  • Image files: JPEG XL/.jxl
    • “Why JPEG XL”
    • Free and open format.
    • Can handle lossy images, lossless images, images with transparency, images with layers, and animated images, giving it the potential of being a universal image format.
    • Much better quality and compression efficiency than current lossy and lossless image formats (.jpeg, .png, .gif).
    • Produces much smaller files for lossless images than AVIF[2]
    • Supports much larger resolutions than AVIF’s 9-megapixel limit (important for lossless images).
    • Supports up to 24-bit color depth, much more than AVIF’s 12-bit color depth limit (which, to be fair, is probably good enough).
  • Videos (Codec): AV1
    • Free and open format.
    • Much more efficient than x264 (used by .mp4) and VP9[3].
  • Documents: OpenDocument / ODF / .odt

    it’s already a NATO standard for documents Because the Microsoft Word ones (.doc, .docx) are unusable outside the Microsoft Office ecosystem. I feel outraged every time I need to edit .docx file because it breaks the layout easily. And some older .doc files cannot even work with Microsoft Word.


  1. https://archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/ ↩︎

  2. https://tonisagrista.com/blog/2023/jpegxl-vs-avif/ ↩︎

  3. https://engineering.fb.com/2018/04/10/video-engineering/av1-beats-x264-and-libvpx-vp9-in-practical-use-case/ ↩︎

14 points
*
Deleted by creator
6 points
*

.tar is pretty bad as it lacks an index, making it impossible to quickly seek around in the file.

.tar.pixz/.tpxz has an index, uses LZMA, and permits parallel compression/decompression (increasingly important on modern processors).

https://github.com/vasi/pixz

It’s packaged in Debian, and I assume other Linux distros.

Only downside is that GNU tar doesn’t have a single-letter shortcut for pixz the way it has “z” for gzip, “j” for bzip2, or “J” for xz (LZMA); you have to use the more verbose “-Ipixz”.

Also, while I don’t recommend it, IIRC gzip has a limited range over which the effects of compression can propagate, so even though it wasn’t designed for random access, there is software that leverages this to hack in random access as well. I don’t recall whether anyone has rigged it up with tar and indexing, but I suppose if someone were determined to use gzip, they could go that route.

10 points
  • By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.

wait so does it do all of those things?

23 points

So there’s a tool called tar that creates an archive (a .tar file). Then there’s a tool called zstd that can be used to compress files, including .tar files, which then become .tar.zst files. And then you can encrypt your .tar.zst file using a tool called gpg, which leaves you with an encrypted, compressed .tar.zst.gpg archive.

Now, most people aren’t doing everything in the terminal, so the process for most people would be pretty much the same as creating a ZIP archive.
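The steps above can be sketched as shell commands (file and directory names here are made up for illustration; tar, zstd, and gpg are assumed to be installed, and the gpg step is commented out since it would prompt for a passphrase):

```shell
mkdir -p project && echo "hello" > project/notes.txt

tar -cf project.tar project        # 1. archive with tar
zstd -q project.tar                # 2. compress -> project.tar.zst (input is kept)
# gpg --symmetric project.tar.zst  # 3. optional: encrypt -> project.tar.zst.gpg

# And to undo it again:
zstd -dq project.tar.zst -o restored.tar
tar -xOf restored.tar project/notes.txt    # prints "hello"
```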

9 points

By separating the jobs of archiving (.tar), compressing (.zst), and (if you so choose) encrypting (.gpg), .tar.zst follows the Unix philosophy of “Make each program do one thing well.”.

The problem here being that GnuPG does nothing really well.

Videos (Codec): AV1

  • Much more efficient than x264 (used by .mp4) and VP9[3].

AV1 is also much younger than H.264 (AV1 is a specification; x264 is an implementation), and only recently have software encoders become somewhat viable; a more apt comparison would have been AV1 to HEVC, though the latter is also somewhat old nowadays, if still a competitive codec. Unfortunately there currently aren’t many options to use AV1 in a very meaningful way: you can encode your own media with it, but that’s about it; you can stream to YouTube, but YouTube will re-encode to another codec.

6 points

The problem here being that GnuPG does nothing really well.

Could you elaborate? I’ve never had any issues with gpg before and curious what people are having issues with.

Unfortunately currently there aren’t many options to use AV1 in a very meaningful way; you can encode your own media with it, but that’s about it; you can stream to YouTube, but YouTube will recode to another codec.

AV1 has almost full browser support (IIRC), and companies like YouTube, Netflix, and Meta have started moving from VP9 to AV1 (since AV1 is VP9’s successor). But you’re right, it’s still working on adoption; this is more my dream world than a prediction of future standardization.

4 points

Could you elaborate? I’ve never had any issues with gpg before and curious what people are having issues with.

This article and the blog post linked within it summarize it very well.

5 points

.odt is simply a better standard than .docx.

No surprise, since OOXML is barely even a standard.

3 points

I get a better compression ratio with xz than with zstd, both at their highest levels, when building an Ubuntu squashfs.

Zstd is way faster though

3 points

is av1 lossy

18 points

AV1 can do lossy video as well as lossless video.

2 points

wait im confused, whats the difference between .tar.zst and .tar.xz

9 points

Different ways of compressing the initial .tar archive.

-19 points
*
Deleted by creator
2 points

Damn didn’t realize that JXL was such a big deal. That whole JPEG recompression actually seems pretty damn cool as well. There was some noise about GNOME starting to make use of JXL in their ecosystem too…

47 points

Ogg Opus for all lossy audio compression (mp3 needs to die)

7z or tar.zst for general purpose compression (zip and rar need to die)

23 points

The existence of zip, and especially rar files, actually hurts me. It’s slow, it’s insecure, and the compression is from the Jurassic era. We can do better.

-6 points

@dinckelman @Supermariofan67 I think you mean unsecure. It doesn’t feel unsure of itself. 😁

14 points

in·se·cure (ĭn′sĭ-kyo͝or′) adj.

  1. Inadequately guarded or protected; unsafe: A shortage of military police made the air base insecure.

https://www.thefreedictionary.com/insecure

Unsecure

a. 1. Insecure.

https://www.thefreedictionary.com/Unsecure

6 points

why do zip and rar need to die

13 points

Zip has a terrible compression ratio compared to modern formats; it’s also a mess of different, partially incompatible implementations by different software, and it doesn’t enforce UTF-8 (or any standard, for that matter) for filenames, leading to garbled names when extracting old files. Its encryption is vulnerable to a known-plaintext attack, and its key-derivation function is very easy to brute-force.

Rar is proprietary. That alone is reason enough not to use it. It’s also very slow.

7 points

Again, I’m not the original poster, but zip isn’t as dense as 7zip, and I honestly haven’t seen rar used much.

Also, if I remember correctly, the audio codecs and compression formats the other poster listed are open source. But I could be mistaken. I know at least 7zip is, and I believe Opus (or something like that) is too.

3 points

Most mods on Nexus are in rar or zip. Also most game cracks; or as iso, which is even worse.

4 points

why does mp3 need to die

15 points

It’s a 30-year-old format, and a huge amount of research and innovation in lossy audio compression has occurred since then. Opus can achieve better quality at roughly 40% of the bitrate. Also, the format was, much like zip, a mess of partially broken implementations in the early days (although now everyone uses LAME, so it’s not as big of a deal). Its container/stream format is very messy too. And it has no native tag format, so it needs ID3 tags, which don’t enforce any standardized text encoding.

6 points

Not the original poster, but there are newer audio codecs that are more efficient at storing data than mp3, I believe. And there are also lossless standards, compared to mp3’s lossy compression.

3 points

What’s wrong with mp3

23 points

Big file size for rather bad audio quality.

4 points

I’ve yet to meet someone who can genuinely pass the 320kbps vs. lossless blind-test on anything but very high-end equipment.

3 points

How about tar.gz? How does gzip compare to zstd?

5 points

Gzip is both slower and worse at compression, at every level.

2 points

it’s worth noting that AAC is actually pretty good in a lot of cases too

7 points

However, it is very patent encumbered and therefore wouldn’t make for a good standard.

2 points

AAC-LC and HE-AAC are both free now. HE-AACv2 and xHE-AAC aren’t, but those have more limited use.

1 point

How about xz compared to zstd?

2 points

At both algorithms’ highest levels, xz seems to be on average a few percent better at compression ratio, but zstd is a bit faster at compression and much much faster at decompression. So if your goal is to compress as much as possible without regard to speed at all, xz -9 is better, but if you want compression that is almost as good but faster, zstd --long -19 is the way to go

At the lower compression presets, zstd is both faster and compresses better
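Spelled out as commands, with a placeholder file name (both tools assumed installed; the flags are the ones mentioned above):

```shell
seq 1 100000 > sample.dat

xz -9 -k sample.dat             # maximum-ratio xz   -> sample.dat.xz (-k keeps the input)
zstd --long -19 -q sample.dat   # high-level zstd    -> sample.dat.zst

ls -l sample.dat.xz sample.dat.zst   # sizes are usually within a few percent of each other
```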

-2 points
*
Removed by mod
1 point

How are you going to recreate the MP3 audio artifacts that give a lot of music its originality when encoding to Opus?

Oh, a gramophone user.

Joke aside, I find Ogg Opus often sounds better than the original, probably something to do with its psychoacoustic optimizations.

0 points
*
Removed by mod
