

“and I'm guessing, given the username, that you can speak German haha” Yup, that's right. :D
I'll stick with English anyway, so it can be followed in the English thread.
Yeah, you're right, I wasn't particularly on-topic there. :D I tried to address your underlying assumptions as well as the actual file format question, and it kinda derailed from there.
Sooo, file format… I think you're restricting yourself too much if you only consider the formats that ship with binutils. You also have two conflicting goals here: compression (make the most of your storage) vs. resilience (a format that stays readable in the long term). Someone here recommended lzip, which is definitely a good answer for compression ratio. The Wikipedia article I linked has a table comparing compressed archive formats, so that might be a good starting point for finding resilient ones. Look out for formats with at least an integrity check and ideally a recovery record, as these seem to matter more than compression ratio. Once you've settled on a format, run some tests to find the compression algorithm that works best on your material, and measure throughput/time while you're at it so you can find a reasonable compromise between compression and speed. If you're so inclined, read a few format specs to narrow down the candidates.
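To make the “run some tests” part concrete, here's a minimal sketch of how such a comparison could look, using Python's stdlib codecs as stand-ins (lzip has no stdlib binding; xz/LZMA is its closest relative here, or you'd shell out to the CLI). `sample.tar` is a placeholder for an uncompressed archive of your own material:

```python
# Rough benchmark sketch: compare compression ratio and speed of a few
# stdlib codecs on your own sample material.
import bz2
import gzip
import lzma
import time
from pathlib import Path

SAMPLE = Path("sample.tar")  # placeholder: an uncompressed archive of your data

codecs = {
    "gzip":  lambda data: gzip.compress(data, compresslevel=9),
    "bzip2": lambda data: bz2.compress(data, compresslevel=9),
    "xz":    lambda data: lzma.compress(data, preset=9),
}

data = SAMPLE.read_bytes()
for name, compress in codecs.items():
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    ratio = len(out) / len(data)
    mb_s = len(data) / (1024 * 1024) / elapsed
    print(f"{name:>5}: ratio {ratio:.3f}, {elapsed:.1f} s, {mb_s:.1f} MiB/s")
```

Results depend heavily on your material (text compresses very differently from photos or video), which is exactly why testing on your own data beats trusting generic benchmarks.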
You’re generally looking for formats that:
- are in widespread use
- are specified/standardized publicly
- have low complexity
- don’t have features like DRM/Encryption/anti-copy
- are self-documenting
- are robust
- don’t have external dependencies (e.g. on other file formats)
- are free of any restrictive licensing/patents
- can be validated (see the sketch after this list)
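On the “can be validated” point: a clean end-to-end decompression exercises the format's built-in integrity checks (CRC32 for gzip, CRC64/SHA-256 for xz), so a successful pass is a basic “still intact” signal. A minimal sketch, assuming an xz file with a hypothetical name:

```python
# Validation sketch: read and discard the decompressed stream; the codec
# verifies the embedded checksums as it goes and raises on corruption.
import lzma

def is_intact(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the .xz file decompresses without a checksum error."""
    try:
        with lzma.open(path, "rb") as f:
            while f.read(chunk_size):  # read and discard in chunks
                pass
        return True
    except (lzma.LZMAError, OSError):
        return False

print(is_intact("backup.tar.xz"))  # hypothetical file name
```

That's essentially what `xz -t` or `lzip -t` does from the command line; for long-term storage you'd typically run such a check periodically across the whole collection.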
You might want to read up on the technical details of how a real archive handles these challenges at https://slubarchiv.slub-dresden.de/technische-standards-fuer-die-ablieferung-von-digitalen-dokumenten and the specification PDFs linked there (all in German).
Oh yeah, there really was, thank you. :)