Talk:Grammar-based code

Latest comment: 7 years ago by 2602:306:CD5B:FD30:61C0:3B3E:902:22A7

I will update the article on grammar-based coding from time to time -- Da-ke.

The article needs to be updated with references on structured grammar-based codes. Amanbhatia 05:34, 11 August 2008 (UTC)


It would be good to spell out why SLGs (straight-line grammars) are interesting for compression: they can be decoded very fast. Constructing the grammar may take a relatively long time, but decoding is just a depth-first traversal of a DAG and runs in time linear in the size of the output. This makes grammar-based compression attractive for data that is encoded once but downloaded and/or decompressed many times. (GLZA is the champ for compressing all of Wikipedia well and decompressing it in a big hurry.) -- Preceding unsigned comment added by 2602:306:CD5B:FD30:61C0:3B3E:902:22A7 (talk) 02:00, 3 December 2017 (UTC)
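
To make the decoding point concrete, here is a minimal sketch of expanding a straight-line grammar with an explicit-stack depth-first traversal. The grammar, the dict-of-lists representation, and the function name `expand` are illustrative assumptions, not taken from the article or from any particular compressor.

```python
def expand(grammar, start):
    """Expand a straight-line grammar into the string it derives.

    `grammar` maps each nonterminal to the list of symbols on its
    right-hand side; any symbol that is not a key is a terminal.
    (Hypothetical representation, chosen for illustration.)
    """
    out = []
    stack = [start]
    while stack:
        sym = stack.pop()
        if sym in grammar:
            # Push the right-hand side reversed so that the leftmost
            # symbol is popped, and therefore emitted, first.
            stack.extend(reversed(grammar[sym]))
        else:
            out.append(sym)
    return "".join(out)


# Toy grammar: S -> A B A, A -> a b, B -> A c A
grammar = {
    "S": ["A", "B", "A"],
    "A": ["a", "b"],
    "B": ["A", "c", "A"],
}

print(expand(grammar, "S"))  # ababcabab
```

Each terminal is emitted once and each rule use costs one pop plus a push of its right-hand side, so as long as every rule has at least two symbols the total work is proportional to the length of the output.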

The problem of constructing the "smallest grammar" isn't just intractable; for understanding data compression with entropy coding, it's the wrong problem. The compression-optimal SLG is generally not irreducible, because some patterns repeat infrequently but are made of very frequently repeating constituents whose entropy codes are therefore short. In such cases, it's worth "spelling out" each repeat in terms of (the short codes for) its constituents, rather than paying the cost of encoding the repeating pattern as a separate rule. See Conrad and Wilson (GLZA) on this.
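
A rough numerical sketch of this trade-off, under a deliberately simplified cost model: each symbol costs -log2(p) bits, a rule costs its right-hand side once plus one rule-symbol code per use, and adding the rule does not change any other symbol's probability. The function names and the probabilities are made up for illustration; this is not GLZA's actual cost function.

```python
import math


def bits(p):
    """Ideal entropy-code length, in bits, for a symbol of probability p."""
    return -math.log2(p)


def cost_spelled_out(constituent_probs, repeats):
    """Bits to write the pattern out in full at every occurrence."""
    return repeats * sum(bits(p) for p in constituent_probs)


def cost_with_rule(constituent_probs, repeats, rule_prob):
    """Bits to define the rule once, then emit its symbol at every occurrence."""
    definition = sum(bits(p) for p in constituent_probs)
    return definition + repeats * bits(rule_prob)


# Hypothetical numbers: three very common constituents (2 bits each),
# a pattern that occurs only twice, and a rare rule symbol (10 bits).
constituents = [0.25, 0.25, 0.25]
print(cost_spelled_out(constituents, repeats=2))                   # 12.0
print(cost_with_rule(constituents, repeats=2, rule_prob=1 / 1024))  # 26.0
```

Counting symbols, the rule makes the grammar smaller (3 + 2 = 5 symbols instead of 6), yet under this model it more than doubles the number of bits, which is the sense in which the smallest grammar need not be the best one for compression.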