Abstract Binary Block Order Rouen Transform
Jacqueline W. Daykin (1, 2), Richard Groult (3, 4), Yannick Guesnet (4), Thierry Lecroq (4), Arnaud Lefebvre (4), Martine Léonard (4), Élise Prieur-Gaston (4)

(1) Department of Computer Science, Aberystwyth University (Mauritius Branch Campus), Quartier Militaire, Mauritius
(2) Department of Computer Science, Royal Holloway, University of London, UK
(3) Modélisation, Information et Systèmes (MIS), Université de Picardie Jules Verne, Amiens, France
(4) Normandie Univ., UNIROUEN, UNIHAVRE, INSA Rouen, LITIS, 76000 Rouen, France

Abstract

We introduce bijective Burrows-Wheeler type transforms for binary strings [1]. These twin transforms originated in problem-solving sessions of the Rouen 2012 StringMasters workshop, hence the name Rouen Transform. The original method by Burrows and Wheeler [2] is based on lexicographic order for general alphabets, and the transform is defined to be the last column of the ordered BWT matrix. The new approach applies binary block order, B-order, which yields not one but twin transforms: one based on Lyndon words, the other on a repetition of Lyndon words. These binary B-BWT transforms are constructed here for B-words, structures analogous to Lyndon words. A key computation in the transforms is the application of a linear-time suffix-sorting technique, such as [3], to sort the cyclic rotations of a binary input string into their B-order. Moreover, as for the original lexicographic transform, we show that the B-BWT inverses can also be computed in linear time, using straightforward combinatorial arguments. Preliminary experimental results suggest that it may be worthwhile in practice to implement the Rouen Transform as a preprocessing step for compression.

An obvious direction for future research is to devise a fully bijective linear transform for binary block order over arbitrary inputs. If the given string is not a B-word, then it should be factored into these patterned words.
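
For intuition about this factoring step, the lexicographic analogue is the factorization of a string into Lyndon words, which Duval's classical algorithm computes in linear time. The sketch below is that standard lexicographic algorithm only; it is not the B-word factorization of [1], whose definition is not given in this abstract. The function name lyndon_factorization is an illustrative choice.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of
    Lyndon words under ordinary lexicographic order.

    Illustrative lexicographic analogue only; this is NOT the
    B-order factorization into B-words.
    """
    n = len(s)
    i = 0
    factors = []
    while i < n:
        # j scans ahead; k trails inside the current candidate period.
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # candidate extends to a longer Lyndon word
            else:
                k += 1     # still repeating the current period
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        period = j - k
        while i <= k:
            factors.append(s[i:i + period])
            i += period
    return factors


if __name__ == "__main__":
    for word in ("0010011", "0101", "10"):
        print(word, lyndon_factorization(word))
```

A string is a Lyndon word exactly when this factorization consists of a single factor, which is the situation in which the classical bijective constructions apply directly.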
It also remains to be seen whether pattern matching can be performed efficiently with transforms of this kind, as is the case with the usual Burrows-Wheeler transform.

References

[1] J. W. Daykin, R. Groult, Y. Guesnet, T. Lecroq, A. Lefebvre, M. Léonard, and É. Prieur-Gaston. Binary block order Rouen transform. Theoret. Comput. Sci., 2016. DOI: 10.1016/j.tcs.2016.05.028.
[2] M. Burrows and D. J. Wheeler. A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, 1994.
[3] P. Ko and S. Aluru. Space efficient linear time construction of suffix arrays. In Proc. 14th Annual Symposium on Combinatorial Pattern Matching (CPM), pages 200–210, 2003.