in terms of balance between speed and compression ratio
>447b conveys pepe frog
whatever you're already using
https://web.archive.org/web/20210725124915/http://www.piedpiper.com/
some hypothetical vector based compression algorithm
Sloot digital coding system but ~~*they*~~ don't want you to use it
Depends on the data being compressed.
If it's important enough to archive long term, I tend to compress the data multiple ways and just pick the winner since it's not always going to be a consistent choice. It varies depending on what you're encoding.
This one wins about 80% of the time for the datasets I tend to archive.
guys I need a real answer. I need to compress terabytes of csv files and be able to uncompress on the fly to serve my users
Then use gzip like everyone else you fricking moron
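And "uncompress on the fly" with gzip is genuinely trivial; here's a minimal Python sketch (the function names and sample data are made up, stdlib only):

```python
# Sketch: compress a CSV once at archive time, then stream-decompress
# it line by line on demand without holding the whole file in memory.
import gzip
import io

def compress_csv(raw: bytes) -> bytes:
    """Gzip the raw CSV bytes (done once, when archiving)."""
    return gzip.compress(raw, compresslevel=6)

def stream_rows(compressed: bytes):
    """Decompress on the fly, yielding one CSV line at a time."""
    with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")

csv_data = b"id,name\n1,foo\n2,bar\n"
blob = compress_csv(csv_data)
rows = list(stream_rows(blob))
```

Swap `io.BytesIO` for a real file handle and you get the same streaming behavior over terabytes.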
File system level compression as offered by systems such as ZFS, NTFS, et al.
just benchmark multiple algorithms and pick the winner
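A benchmark harness is a few lines; a sketch using only stdlib codecs (add zstd/lz4 bindings the same way if you have them installed, and feed it a sample of YOUR actual data, not this placeholder):

```python
# Sketch: run each codec over a sample and record (ratio, seconds),
# then pick whichever wins for your use case.
import bz2
import gzip
import lzma
import time

def benchmark(data: bytes) -> dict:
    codecs = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "lzma": lzma.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        # ratio < 1.0 means the codec actually shrank the data
        results[name] = (len(out) / len(data), elapsed)
    return results

sample = b"timestamp,level,message\n" * 4096  # placeholder data
for name, (ratio, secs) in benchmark(sample).items():
    print(f"{name}: ratio={ratio:.4f} time={secs:.3f}s")
```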
zstd is pretty balanced if you turn a blind eye to its extreme memory usage
lz4 is the fastest
thanks, I think I'll try zstd
You need filesystem level compression, otherwise you can't uncompress on the fly.
If you are on Linux and you can live with it being read-only, you can use squashfs.
No, you don't
DEFLATE
xor with itself
Brotli
https://lz4.org/
7zip/lzma/zstandard/xz are all OBSOLETE and DEPRECATED
what's the point of making a "pixel art" of 4x4 with a resolution of 666x666, you can't divide 666 by 4, and now there's 2 big pixels with a pixel more than the others
>less than 25% the resolution
>larger file size
>like 5% the resolution or some pathetic shit
>barely smaller file size
Just what is OPs secret?
not sure. i couldn't match it with GIMP, which makes me disgusted with GIMP
idk, but not much
*strips further 45B*
how
https://github.com/fhanau/Efficient-Compression-Tool
>so many posts
>still no good answer
Here's the Pareto frontier of compression algorithms: https://insanity.industries/post/pareto-optimal-compression/
TL;DR is that depending on your desired tradeoff of speed and compression ratio, you should use lz4 (super fast but poor compression), zstd (moderately fast and pretty good), or lzma (slow but great compression).
>open ended question with no answer
>hurr why no answer
why don't you figure it out if you're so intelligent?
Archtard shilling zstd, classic.
No, there are way more variables to consider.
Archtard presents a "Pareto Frontier" that makes zstd look better than it actually is. No, most of the time you don't want the "worst of both worlds" - you either want fast decompression times or high compression ratios, depending on the application.
For hardware bound IO, decompression speed is king.
For network data transfers and archival, compression ratio is king.
For something you write-once and use a billion times, again compression ratio.
Furthermore Archtard also makes bzip2 look bad by forgetting another important variable: the type of data you're working with.
Yes, with high entropy synthetic data, bzip does quite poorly.
For natural language text data, bzip2 is very very good.
Read the article you moron. He tested on various types of data including text, and he gives many options depending on which tradeoff of speed and ratio you want. bzip2 is simply not very good: you can achieve better speed at the same compression ratio, or the same ratio with less time, if you use other tools.
>A simple text file, being a dump of the Kernel log via the dmesg command, representing textual data
No. That is not a good representation for the real world use-case of bzip.
You're moronically grasping at straws. Kernel logs are text and pretty low entropy too. zstd is simply a newer and better algorithm, if you disagree then post your own benchmarks.
You seem to think it's an "either or" choice - it's not.
You should always test multiple algorithms on your data.
I have seen bzip beat lzma before in compression ratio.
Generally speaking, yes, bzip does worse, but my point is that you must always *test* the algorithm on the data you're working with.
newer != better, lzma is ancient, but it is superior to zstd on compression ratio, generally.
But again, you sometimes get data where lzma is trash, and bzip wins.
zstd is garbage that solves no real problems by being in the middle, depending on application lzma is superior for compression ratios, or lzo/lz4 for decompression speeds.
jack of all trades master of none, is how I would describe zstd.
But hey, yes I'll use it if there is a particular dataset it outperforms on.
I think you're extremely biased against zstd. It beats bzip2 hands down in pretty much every case - better compression ratio at the same time, or shorter time taken at the same ratio. It's actually bzip2 that's the "worst of both worlds" algorithm, as you say, and there's no point in using it anymore.
The real algorithms you should use are lzo/lz4 for speed, lzma for ratio, and zstd for something in the middle. You say it's useless, but I found it to be the optimal choice for many things, like streaming data to a HDD or over Gigabit Ethernet as fast as possible.
>like streaming data to a HDD or over Gigabit Ethernet as fast as possible.
On local hardware IO, lzo and lz4 will do better than zstd.
>It's actually bzip2 that's the "worst of both worlds" algorithm, as you say, and there's no point in using it anymore.
You're completely missing the point.
On certain data, I've seen both bzip and gzip beat lzma on compression ratio.
Your stupid article doesn't have error bars anywhere.
There is no algorithm that's inherently better than the others. It's always a case-by-case basis.
>On local hardware IO, lzo and lz4 will do better than zstd.
You really don't get it and need it spelled out? In both use cases I mentioned you want the algorithm that has the best compression ratio possible while still letting you output 100-150 MB/s. It's the perfect use case for an algorithm with medium speed and medium ratio.
>There is no algorithm that's inherently better than the others. It's always a case-by-case basis.
Any more obvious truisms and goalpost moving to share with us, anon? Obviously it differs case by case, but in the vast majority of cases zstd beats bzip2, so you're moronic for shilling the latter.
what are the numbers on this graph? because idk about you but for me around 500-1000MB/s decompression speed is good enough for most cases
op asked for best balance, zstd being excellent as a middle ground option is what i would call balanced
it's not the fastest nor does it have the highest ratio, but when i want something compressed and it doesn't need to be ultra fast nor ultra small, zstd is what i pick. the balanced option
Consider transposing your csv file, helps a lot for most datasets.
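The idea being that a transpose puts each column's values next to each other, so similar byte patterns land in the same compression window. Whether it actually helps depends on the dataset, so measure both ways. A minimal sketch (sample data is made up):

```python
# Sketch: transpose a rectangular CSV so columns become rows,
# grouping similar values contiguously for the compressor.
import csv
import io

def transpose_csv(text: str) -> str:
    rows = list(csv.reader(io.StringIO(text)))
    cols = list(zip(*rows))  # assumes every row has the same width
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(cols)
    return out.getvalue()

original = "id,price,currency\n1,9.99,USD\n2,4.50,USD\n3,7.25,USD\n"
flipped = transpose_csv(original)
# low-entropy columns like "USD,USD,USD" now sit on one line
```

Compress both `original` and `flipped` with your codec of choice and keep whichever comes out smaller.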
Do you only care about decompression speed? Or do you also care about compression speed?
GPT-4o
Maid-LZW
Optimized.
>in terms of balance between speed and compression ratio
LZ + statistical compression (Huffman or Arithmetic coding)
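That combo is exactly what DEFLATE is (LZ77 matching followed by Huffman coding), and the stdlib exposes it directly via zlib:

```python
# DEFLATE = LZ77 + Huffman coding, available in the stdlib as zlib.
import zlib

text = b"the quick brown fox " * 100  # highly repetitive, LZ-friendly
packed = zlib.compress(text, level=9)
restored = zlib.decompress(packed)
```

The LZ stage factors out the repeats and the Huffman stage then entropy-codes the remaining symbols.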
Depends what you are compressing. Here's the best for text on EVERYONE'S favorite website.
https://github.com/qntm/base2048
zstd
Zstd, lz4 are good modern ones