What is the best compression algorithm?

in terms of balance between speed and compression ratio

  1. 1 week ago
    Anonymous

    >447b conveys pepe frog
    whatever you're already using

  2. 1 week ago
    Anonymous

    https://web.archive.org/web/20210725124915/http://www.piedpiper.com/

  3. 1 week ago
    Anonymous

    some hypothetical vector based compression algorithm

  4. 1 week ago
    Anonymous

    Sloot digital coding system but ~~*they*~~ don't want you to use it

    • 1 week ago
      Anonymous

      Depends on the data being compressed.
      If it's important enough to archive long term, I tend to compress the data multiple ways and just pick the winner since it's not always going to be a consistent choice. It varies depending on what you're encoding.
      This one wins about 80% of the time for the datasets I tend to archive.
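
      A minimal sketch of that "compress several ways, keep the winner" approach, using only Python's standard-library codecs (zlib, bz2, lzma); the file path is a placeholder:

      ```python
      import bz2, lzma, zlib

      def pick_winner(data: bytes) -> tuple[str, bytes]:
          """Compress with several stdlib codecs and keep the smallest output."""
          candidates = {
              "zlib-9": zlib.compress(data, 9),
              "bz2-9": bz2.compress(data, 9),
              "lzma-9": lzma.compress(data, preset=9),
          }
          name = min(candidates, key=lambda k: len(candidates[k]))
          return name, candidates[name]

      with open("archive.bin", "rb") as f:  # placeholder path
          winner, blob = pick_winner(f.read())
      print(winner, len(blob))
      ```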

  5. 1 week ago
    Anonymous

    guys I need a real answer. I need to compress terabytes of CSV files and be able to decompress on the fly to serve my users

    • 1 week ago
      Anonymous

      Then use gzip like everyone else you fricking moron
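
      For the "serve on the fly" part: gzip decompresses as a stream, so you never have to inflate a whole file at once. A rough stdlib-only sketch (the filename and the serving side are placeholders):

      ```python
      import gzip

      def serve_rows(path: str):
          """Yield CSV rows straight out of a gzip file, decompressing on the fly."""
          with gzip.open(path, "rt", encoding="utf-8") as f:  # text-mode stream
              for line in f:
                  yield line.rstrip("\n").split(",")

      for row in serve_rows("data.csv.gz"):  # placeholder file
          pass  # hand each row to whatever serves your users
      ```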

    • 1 week ago
      Anonymous

      File system level compression as offered by systems such as ZFS, NTFS, et al.

    • 1 week ago
      Anonymous

      just benchmark multiple algorithms and pick the winner
      zstd is pretty balanced if you turn a blind eye to its extreme memory usage
      lz4 is the fastest
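
      If you want to see where each codec lands on the speed/ratio curve for *your* data before committing, something like this works with just the standard library (zstd and lz4 need third-party bindings, so they're left out here); the sample path is a placeholder:

      ```python
      import bz2, lzma, time, zlib

      def benchmark(data: bytes) -> None:
          """Print compression ratio and throughput for a few stdlib codecs."""
          codecs = {
              "zlib-1": lambda d: zlib.compress(d, 1),
              "zlib-9": lambda d: zlib.compress(d, 9),
              "bz2-9": lambda d: bz2.compress(d, 9),
              "lzma-6": lambda d: lzma.compress(d, preset=6),
          }
          for name, fn in codecs.items():
              t0 = time.perf_counter()
              out = fn(data)
              secs = time.perf_counter() - t0
              print(f"{name:7s} ratio={len(data)/len(out):5.2f} "
                    f"speed={len(data)/secs/1e6:7.1f} MB/s")

      with open("sample.csv", "rb") as f:  # placeholder: a slice of your real data
          benchmark(f.read())
      ```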

      • 1 week ago
        Anonymous

        >Here's the Pareto frontier of compression algorithms: https://insanity.industries/post/pareto-optimal-compression/

        thanks, I think I'll try zstd

    • 1 week ago
      Anonymous

      You need filesystem level compression, otherwise you can't uncompress on the fly.

      If you are on linux and can live with it being read-only, you can use squashfs.

      • 1 week ago
        Anonymous

        No, you don't

  6. 1 week ago
    Anonymous

    DEFLATE

  7. 1 week ago
    Anonymous

    xor with itself

  8. 1 week ago
    Anonymous

    Brotli

  9. 1 week ago
    Anonymous

    https://lz4.org/

    7zip/lzma/zstandard/xz are all OBSOLETE and DEPRECATED

  10. 1 week ago
    Anonymous

    what's the point of making 4x4 "pixel art" at a resolution of 666x666? you can't divide 666 by 4, so now there are 2 big pixels with one more pixel than the others

  11. 1 week ago
    Anonymous
    • 1 week ago
      Anonymous
      • 1 week ago
        Anonymous
      • 1 week ago
        Anonymous

        >less than 25% the resolution
        >larger file size

        >like 5% the resolution or some pathetic shit
        >barely smaller file size

        • 1 week ago
          Anonymous

          Just what is OP's secret?

          • 1 week ago
            Anonymous

            not sure. i couldn't match it with GIMP. which makes me disgusted with GIMP

          • 1 week ago
            Anonymous

            idk, but not much
            *strips further 45B*

          • 1 week ago
            Anonymous

            how

          • 1 week ago
            Anonymous

            https://github.com/fhanau/Efficient-Compression-Tool

    • 1 week ago
      Anonymous
  12. 1 week ago
    Anonymous

    >so many posts
    >still no good answer
    Here's the Pareto frontier of compression algorithms: https://insanity.industries/post/pareto-optimal-compression/
    TL;DR is that depending on your desired tradeoff of speed and compression ratio, you should use lz4 (super fast but poor compression), zstd (moderately fast and pretty good), or lzma (slow but great compression).

    • 1 week ago
      Anonymous

      >open ended question with no answer
      >hurr why no answer
      why don't you figure it out if you're so intelligent? Black person.

    • 1 week ago
      Anonymous

      Archtard shilling zstd, classic.
      No, there are way more variables to consider.

      Archtard presents a "Pareto Frontier" that makes zstd look better than it actually is. No, most of the time you don't want the "worst of both worlds" - you either want fast decompression times or high compression ratios, depending on the application.
      For hardware bound IO, decompression speed is king.
      For network data transfers and archival, compression ratio is king.
      For something you write-once and use a billion times, again compression ratio.

      Furthermore, Archtard makes bzip2 look bad by forgetting another important variable: the type of data you're working with.
      Yes, on high-entropy synthetic data, bzip2 does quite poorly.
      On natural-language text, bzip2 is very, very good.

      • 1 week ago
        Anonymous

        Read the article, you moron. He tested various types of data, including text, and he gives multiple options depending on which tradeoff of speed and ratio you want. bzip2 is simply not very good: other tools achieve better speed at the same compression ratio, or the same ratio in less time.

        • 1 week ago
          Anonymous

          >A simple text file, being a dump of the Kernel log via the dmesg command, representing textual data
          No. That is not a good representation for the real world use-case of bzip.

          • 1 week ago
            Anonymous

            You're moronically grasping at straws. Kernel logs are text, and pretty low-entropy too. zstd is simply a newer and better algorithm; if you disagree, post your own benchmarks.

          • 1 week ago
            Anonymous

            You seem to think it's an "either-or" choice - it's not.
            You should always test multiple algorithms on your data.
            I have seen bzip2 beat lzma on compression ratio before.
            Generally speaking, yes, bzip2 does worse, but my point is that you must always *test* the algorithm on the data you're working with.
            newer != better: lzma is ancient, but it's generally superior to zstd on compression ratio.
            But again, you sometimes get data where lzma is trash and bzip2 wins.
            zstd is garbage that solves no real problem by sitting in the middle: depending on the application, lzma is superior for compression ratio, or lzo/lz4 for decompression speed.
            Jack of all trades, master of none is how I would describe zstd.
            But hey, I'll still use it if there's a particular dataset it outperforms on.

          • 1 week ago
            Anonymous

            I think you're extremely biased against zstd. It beats bzip2 hands down in pretty much every case: better compression ratio in the same time, or less time at the same ratio. It's actually bzip2 that's the "worst of both worlds" algorithm, as you put it, and there's no point in using it anymore.
            The real algorithms you should use are lzo/lz4 for speed, lzma for ratio, and zstd for something in the middle. You say it's useless, but I've found it to be the optimal choice for many things, like streaming data to an HDD or over Gigabit Ethernet as fast as possible.

          • 1 week ago
            Anonymous

            >like streaming data to an HDD or over Gigabit Ethernet as fast as possible.
            On local hardware IO, lzo and lz4 will do better than zstd.
            >It's actually bzip2 that's the "worst of both worlds" algorithm, as you say, and there's no point in using it anymore.
            You're completely missing the point.
            On certain data, I've seen both bzip and gzip beat lzma on compression ratio.
            Your stupid article doesn't have error bars anywhere.
            There is no algorithm that's inherently better than the others. It's always a case-by-case basis.

          • 1 week ago
            Anonymous

            >On local hardware IO, lzo and lz4 will do better than zstd.
            You really don't get it and need it spelled out? In both use cases I mentioned you want the algorithm that has the best compression ratio possible while still letting you output 100-150 MB/s. It's the perfect use case for an algorithm with medium speed and medium ratio.
            >There is no algorithm that's inherently better than the others. It's always a case-by-case basis.
            Any more obvious truisms and goalpost moving to share with us, anon? Obviously it differs case by case, but in the vast majority of cases zstd beats bzip2, so you're moronic for shilling the latter.
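
            The selection logic being described here - take the best ratio that still sustains the pipe's throughput - is easy to mechanize. A sketch with stdlib codecs standing in (swap in zstd levels if you have the bindings); the 100 MB/s target mirrors the numbers above:

            ```python
            import bz2, lzma, time, zlib

            def best_ratio_at_speed(data: bytes, target_mbps: float = 100.0) -> str:
                """Pick the codec with the best ratio that still hits target throughput."""
                codecs = {
                    "zlib-1": lambda d: zlib.compress(d, 1),
                    "zlib-9": lambda d: zlib.compress(d, 9),
                    "lzma-0": lambda d: lzma.compress(d, preset=0),
                    "bz2-9": lambda d: bz2.compress(d, 9),
                }
                best, best_ratio = "store", 1.0
                for name, fn in codecs.items():
                    t0 = time.perf_counter()
                    out = fn(data)
                    mbps = len(data) / (time.perf_counter() - t0) / 1e6
                    # discard anything too slow to keep the pipe saturated
                    if mbps >= target_mbps and len(data) / len(out) > best_ratio:
                        best, best_ratio = name, len(data) / len(out)
                return best
            ```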

      • 1 week ago
        Anonymous

        what are the numbers on this graph? because idk about you, but for me around 500-1000 MB/s decompression speed is good enough for most cases

      • 1 week ago
        Anonymous

        op asked for best balance, zstd being excellent as a middle ground option is what i would call balanced
        it's not the fastest nor does it have the highest ratio, but when i want something compressed and it doesn't need to be ultra fast nor ultra small, zstd is what i pick. the balanced option

  13. 1 week ago
    Anonymous

    Consider transposing your CSV file; it helps a lot for most datasets (sketch below).
    Do you only care about decompression speed? Or do you also care about compression speed?
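
    The reasoning: transposing puts each column's values next to each other, so the compressor sees long runs of similar bytes instead of interleaved fields. A minimal stdlib sketch (filenames are placeholders; assumes a rectangular CSV that fits in memory):

    ```python
    import csv, zlib

    def transpose_csv(src: str, dst: str) -> None:
        """Rewrite a CSV so each output row holds one original column."""
        with open(src, newline="") as f:
            rows = list(csv.reader(f))
        with open(dst, "w", newline="") as f:
            csv.writer(f).writerows(zip(*rows))  # columns become rows

    transpose_csv("data.csv", "data_t.csv")  # placeholder files
    for path in ("data.csv", "data_t.csv"):
        raw = open(path, "rb").read()
        print(path, len(zlib.compress(raw, 9)))  # compare compressed sizes
    ```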

  14. 1 week ago
    Anonymous

    GPT-4o

  15. 1 week ago
    Anonymous

    Maid-LZW

  16. 1 week ago
    Anonymous

    Optimized.

    • 1 week ago
      Anonymous

      Incest between cousins in the first generation has an extremely low chance of causing any health issues for the children. I don't know why mutts are so brainwashed by the media.

  17. 1 week ago
    Anonymous

    >in terms of balance between speed and compression ratio
    LZ + statistical compression (Huffman or Arithmetic coding)
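
    That combination already has a name: DEFLATE is LZ77 followed by Huffman coding, so the stdlib zlib module is a ready-made instance of exactly this recipe:

    ```python
    import zlib

    text = b"the quick brown fox jumps over the lazy dog. " * 100  # repetitive input
    packed = zlib.compress(text, 9)  # LZ77 match-finding + Huffman coding (DEFLATE)
    assert zlib.decompress(packed) == text
    print(f"{len(text)} -> {len(packed)} bytes")
    ```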

  18. 1 week ago
    Anonymous

    Depends on what you're compressing. Here's the best for text on EVERYONE'S favorite website.

    https://github.com/qntm/base2048

  19. 1 week ago
    Anonymous
  20. 1 week ago
    Anonymous

    zstd

  21. 1 week ago
    Anonymous

    Zstd and lz4 are good modern ones.
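
    If you end up using those two from Python, the usual bindings are the third-party zstandard and lz4 packages (both pip installs, not stdlib); the input file below is a placeholder:

    ```python
    import lz4.frame          # pip install lz4       (third-party)
    import zstandard as zstd  # pip install zstandard (third-party)

    data = open("data.csv", "rb").read()  # placeholder file

    z = zstd.ZstdCompressor(level=3).compress(data)  # the balanced pick
    l = lz4.frame.compress(data)                     # the fast pick
    assert zstd.ZstdDecompressor().decompress(z) == data
    assert lz4.frame.decompress(l) == data
    print(f"zstd: {len(z)} bytes, lz4: {len(l)} bytes")
    ```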
