Re-Store Software
Re-Store is a phrase browsing system based on the Re-Pair compression algorithm. It consists of several components and is described in:
- R. Wan. Browsing and Searching Compressed Documents. PhD thesis, University of Melbourne, Australia, December 2003. (Supervised by Alistair Moffat). [Abstract] [Thesis] [University of Melbourne library]
- A. Moffat and R. Wan. Re-Store: A system for compressing, browsing, and searching large documents. In Proc. 8th International Symposium on String Processing and Information Retrieval, pages 162-174. IEEE Computer Society, November 2001. (Invited talk.) [Abstract].
The Re-Pair compression algorithm is described in N. J. Larsson's PhD thesis and in:
- N. J. Larsson and A. Moffat. Offline Dictionary-Based Compression. Proc. IEEE, 88(11):1722-1732, November 2000.
The Re-Merge compression algorithm was most recently described in:
- R. Wan and A. Moffat. Block merging for off-line compression. Journal of the American Society for Information Science and Technology, 58(1):3-14, 2007.
An entropy coder is required to encode the modifiers or the sequence (one of the outputs from Re-Pair). The one used in the original Re-Pair paper and the Re-Merge paper was the Minimum-Redundancy (Huffman) coder by Andrew Turpin and Alistair Moffat.
The software below is released under the GNU General Public License. As a courtesy, I would appreciate it if you let me know about any bugs you find or any comments or suggestions you have. Thank you!
- Re-Pair compression algorithm (and its decompressor, Des-Pair).
Version 1.0.1, April 2, 2007
[Download]
- Pre-Pair (pre-processor for word-based Re-Pair).
Version 1.0.1, April 2, 2007
[Download]
- Re-Merge (Offline block-merging for Re-Pair).
Version 1.0, April 2, 2007
[Download]