|
1 |
| -SDSL - Succinct Data Structure Library |
2 |
| -========= |
3 |
| - |
4 |
| -[Build Status](https://travis-ci.org/simongog/sdsl-lite)
5 |
| - |
6 |
| -What is it? |
7 |
| ------------ |
8 |
| - |
9 |
| -The Succinct Data Structure Library (SDSL) is a powerful and flexible C++11 |
10 |
| -library implementing succinct data structures. In total, the library contains |
11 |
| -the highlights of 40 [research publications][SDSLLIT]. Succinct data structures |
12 |
| -can represent an object (such as a bitvector or a tree) in space close to the |
13 |
| -information-theoretic lower bound of the object while supporting operations |
14 |
| -of the original object efficiently. The theoretical time complexities of an
15 |
| -operation performed on the classical data structure and on the equivalent
16 |
| -succinct data structure are (most of the time) identical.
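
To make "space close to the information-theoretic lower bound" concrete, here is a small worked bound for the bitvector case mentioned above. It is a standard counting argument, not something taken from the library's documentation; the symbols n (length), m (number of ones) and B(n, m) are notation introduced only for this sketch.

```latex
% There are \binom{n}{m} distinct bitvectors of length n with exactly m ones,
% so any encoding that can distinguish all of them needs at least
\[
  \mathcal{B}(n, m) = \left\lceil \log_2 \binom{n}{m} \right\rceil \text{ bits.}
\]
% Worked instance: n = 8, m = 2 gives \binom{8}{2} = 28, hence
\[
  \left\lceil \log_2 28 \right\rceil = \lceil 4.807\ldots \rceil = 5 \text{ bits,}
\]
% compared with 8 bits for the uncompressed bitvector.
```

A succinct representation aims for this bound plus lower-order terms while still answering queries on the bitvector efficiently.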
17 |
| - |
18 |
| -Why SDSL? |
19 |
| --------- |
20 |
| - |
21 |
| -Succinct data structures have very attractive theoretical properties. However, |
22 |
| -in practice implementing succinct data structures is non-trivial as they are |
23 |
| -often composed of complex operations on bitvectors. The SDSL provides
24 |
| -high-quality, open-source implementations of many succinct data structures
25 |
| -proposed in the literature.
26 |
| - |
27 |
| -Specifically, the aim of the library is to provide basic and complex succinct |
28 |
| -data structures that are:
29 |
| - |
30 |
| - * Easy and intuitive to use (like the [STL][STL], which provides classical data structures), |
31 |
| - * Faithful to the original theoretical results, |
32 |
| - * Capable of handling large inputs (yes, we support 64-bit), |
33 |
| - * Efficient to construct for all implemented succinct data structures,
34 |
| - while at the same time offering good run-time performance.
35 |
| - |
36 |
| -<a href="http://simongog.github.io/assets/data/space-vis.html" > |
37 |
| -<img align="right" src="extras/resources/space-vis.png?raw=true" /> |
38 |
| -</a> |
39 |
| - |
40 |
| -In addition, we provide functionality that helps you use succinct
40 |
| -data structures to their full potential.
42 |
| - |
43 |
| - * Each data structure can easily be serialized and loaded to/from disk. |
44 |
| - * We provide functionality which helps you analyze the storage requirements of any |
45 |
| - SDSL-based data structure (see right).
46 |
| - * We support features such as hugepages and tracking the memory usage of each |
47 |
| - SDSL data structure. |
48 |
| - * Complex structures can be configured by template parameters and therefore |
49 |
| - easily be composed. There exists one simple method which constructs |
50 |
| - all complex structures. |
51 |
| - * We maintain an extensive collection of examples which help you use the different |
52 |
| - features provided by the library. |
53 |
| - * All data structures are tested for correctness using a unit-testing framework. |
54 |
| - * We provide a large collection of supporting documentation consisting of examples, |
55 |
| - a [cheat sheet][SDSLCS], and [tutorial slides and walk-through][TUT].
56 |
| - |
57 |
| -The library contains many succinct data structures from the following categories: |
58 |
| - |
59 |
| - * Bitvectors supporting Rank and Select |
60 |
| - * Integer Vectors |
61 |
| - * Wavelet Trees |
62 |
| - * Compressed Suffix Arrays (CSA) |
63 |
| - * Balanced Parentheses Representations |
64 |
| - * Longest Common Prefix (LCP) Arrays |
65 |
| - * Compressed Suffix Trees (CST) |
66 |
| - * Range Minimum/Maximum Query (RMQ) Structures |
67 |
| - |
68 |
| -For a complete overview including theoretical bounds see the |
69 |
| -[cheat sheet][SDSLCS] or the |
70 |
| -[wiki](https://github.com/simongog/sdsl-lite/wiki/List-of-Implemented-Data-Structures). |
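
As an illustration of the first category, the sketch below shows what rank and select queries on a bitvector compute. It uses only the standard library and a linear scan, so it is not how SDSL implements them; the function names `rank1` and `select1` are hypothetical, and the conventions (rank over the prefix `b[0..i-1]`, select counting from 1) are assumptions chosen for this example. Check the cheat sheet for the library's own interface and conventions.

```cpp
#include <cstddef>
#include <vector>

// Number of 1-bits in the prefix b[0..i-1]. Linear scan, for clarity only.
std::size_t rank1(const std::vector<bool>& b, std::size_t i) {
    std::size_t ones = 0;
    for (std::size_t k = 0; k < i && k < b.size(); ++k) {
        if (b[k]) ++ones;
    }
    return ones;
}

// Position of the j-th 1-bit, with j counted from 1.
// Returns b.size() if the bitvector contains fewer than j ones.
std::size_t select1(const std::vector<bool>& b, std::size_t j) {
    std::size_t seen = 0;
    for (std::size_t k = 0; k < b.size(); ++k) {
        if (b[k] && ++seen == j) return k;
    }
    return b.size();
}
```

For the bitvector `10110010`, `rank1(b, 4)` counts the ones among the first four bits and `select1(b, 3)` reports where the third one sits. A succinct rank/select structure answers the same queries without scanning, using a small amount of extra space on top of the bitvector.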
71 |
| - |
72 |
| -Documentation |
73 |
| -------------- |
74 |
| - |
75 |
| -We provide an extensive set of documentation describing all data structures |
76 |
| -and features provided by the library. Specifically, we provide:
77 |
| - |
78 |
| -* A [cheat sheet][SDSLCS] which succinctly |
79 |
| -describes the usage of the library. |
80 |
| -* A Doxygen-generated [API reference][DOXYGENDOCS] which lists all types and functions
81 |
| -of the library. |
82 |
| -* A set of [example](examples/) programs demonstrating how different features |
83 |
| -of the library are used. |
84 |
| -* A tutorial [presentation][TUT] with the [example code](tutorial/) used in the
85 |
| -slides, demonstrating all features of the library in a step-by-step walk-through.
86 |
| -* [Unit Tests](test/) which contain small code snippets used to test each |
87 |
| -library feature. |
88 |
| - |
89 |
| -Requirements |
90 |
| ------------- |
91 |
| - |
92 |
| -The SDSL library requires: |
93 |
| - |
94 |
| -* A modern, C++11 ready compiler such as `g++` version 4.9 or higher or `clang` version 3.2 or higher. |
95 |
| -* The [cmake][cmake] build system. |
96 |
| -* A 64-bit operating system. Mac OS X and Linux are currently supported.
97 |
| -* For increased performance, the processor of the system should support the fast bit operations available in `SSE4.2`.
98 |
| - |
99 |
| -Installation |
100 |
| ------------- |
101 |
| - |
102 |
| -To download and install the library use the following commands. |
103 |
| - |
104 |
| -```sh |
105 |
| -git clone https://github.com/simongog/sdsl-lite.git |
106 |
| -cd sdsl-lite |
107 |
| -./install.sh |
108 |
| -``` |
109 |
| - |
110 |
| -This installs the sdsl library into the `include` and `lib` directories in your |
111 |
| -home directory. A different location prefix can be specified as a parameter of |
112 |
| -the `install.sh` script: |
113 |
| - |
114 |
| -```sh |
115 |
| -./install.sh /usr/local/
116 |
| -``` |
117 |
| - |
118 |
| -To remove the library from your system use the provided uninstall script: |
119 |
| - |
120 |
| -```sh |
121 |
| -./uninstall.sh |
122 |
| -``` |
123 |
| - |
124 |
| -Getting Started |
125 |
| ------------- |
126 |
| - |
127 |
| -To get you started with the library you can start by compiling the following |
128 |
| -sample program which constructs a compressed suffix array (an FM-index) over the
129 |
| -text `mississippi!`, counts the number of occurrences of pattern `si` and |
130 |
| -stores the data structure and a space usage visualization to the
131 |
| -files `fm_index-file.sdsl` and `fm_index-file.sdsl.html`: |
132 |
| - |
133 |
| -```cpp |
134 |
| -#include <sdsl/suffix_arrays.hpp> |
135 |
| -#include <fstream> |
136 |
| - |
137 |
| -using namespace sdsl; |
138 |
| - |
139 |
| -int main() { |
140 |
| - csa_wt<> fm_index; |
141 |
| - construct_im(fm_index, "mississippi!", 1); |
142 |
| - std::cout << "'si' occurs " << count(fm_index,"si") << " times.\n"; |
143 |
| - store_to_file(fm_index,"fm_index-file.sdsl"); |
144 |
| - std::ofstream out("fm_index-file.sdsl.html"); |
145 |
| - write_structure<HTML_FORMAT>(fm_index,out); |
146 |
| -} |
147 |
| -``` |
148 |
| - |
149 |
| -To compile the program using `g++` run: |
150 |
| - |
151 |
| -```sh |
152 |
| -g++ -std=c++11 -O3 -DNDEBUG -I ~/include -L ~/lib program.cpp -o program -lsdsl -ldivsufsort -ldivsufsort64 |
153 |
| -``` |
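
To sanity-check the number the sample program prints, the same count can be computed by brute force with the standard library alone. This is only a cross-check sketch: `count_occurrences` is a hypothetical helper, not part of SDSL, and it scans the text directly instead of using an index.

```cpp
#include <cstddef>
#include <string>

// Counts all (possibly overlapping) occurrences of `pattern` in `text`
// by repeatedly searching from one position past the previous match.
std::size_t count_occurrences(const std::string& text, const std::string& pattern) {
    if (pattern.empty()) return 0;  // avoid matching at every position
    std::size_t hits = 0;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1)) {
        ++hits;
    }
    return hits;
}
```

Calling `count_occurrences("mississippi!", "si")` finds the matches starting at offsets 3 and 6, so the sample program above should report two occurrences of `si`.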
154 |
| - |
155 |
| -Next we suggest you look at the comprehensive [tutorial][TUT] which describes |
156 |
| -all major features of the library or look at some of the provided [examples](examples). |
157 |
| - |
158 |
| -Test |
159 |
| ----- |
160 |
| - |
161 |
| -Implementing succinct data structures can be tricky. To ensure that all data |
162 |
| -structures behave as expected, we created a large collection of unit tests |
163 |
| -which can be used to check the correctness of the library on your computer. |
164 |
| -The [test](./test) directory contains test code. We use the [googletest][GTEST]
165 |
| -framework and [make][MAKE] to run the tests. See the README file in the |
166 |
| -directory for details. |
167 |
| - |
168 |
| -To simply run all unit tests after installing the library, type:
169 |
| - |
170 |
| -```sh |
171 |
| -cd sdsl-lite/build |
172 |
| -make test-sdsl |
173 |
| -``` |
174 |
| - |
175 |
| -Note: Running the tests requires several sample files to be downloaded from the web |
176 |
| -and can take up to 2 hours on slow machines. |
177 |
| - |
178 |
| - |
179 |
| -Benchmarks |
180 |
| ----------- |
181 |
| - |
182 |
| -To ensure the library runs efficiently on your system we suggest you run our |
183 |
| -[benchmark suite](benchmark). The benchmark suite recreates a |
184 |
| -popular [experimental study](http://arxiv.org/abs/0712.3360) which you can |
185 |
| -directly compare to the results of your benchmark run. |
186 |
| - |
187 |
| -Bug Reporting |
188 |
| ------------- |
189 |
| - |
190 |
| -While we use an extensive set of unit tests and test coverage tools you might |
191 |
| -still find bugs in the library. We encourage you to report any problems with |
192 |
| -the library via the [GitHub issue tracking system](https://github.com/simongog/sdsl-lite/issues)
193 |
| -of the project. |
194 |
| - |
195 |
| -The Latest Version |
196 |
| ------------------- |
197 |
| - |
198 |
| -The latest version can be found on the SDSL GitHub project page: https://github.com/simongog/sdsl-lite
199 |
| - |
200 |
| -If you are running experiments in an academic setting, we suggest you use the
201 |
| -most recent [released](https://github.com/simongog/sdsl-lite/releases) version |
202 |
| -of the library. This allows others to reproduce your experiments exactly. |
203 |
| - |
204 |
| -Licensing |
205 |
| ---------- |
206 |
| - |
207 |
| -The SDSL library is free software provided under the GNU General Public License |
208 |
| -(GPLv3). For more information see the [COPYING file][CF] in the library |
209 |
| -directory. |
210 |
| - |
211 |
| -We distribute this library freely to foster the use and development of advanced |
212 |
| -data structures. If you use the library in an academic setting, please cite the
213 |
| -following paper: |
214 |
| - |
215 |
| - @inproceedings{gbmp2014sea, |
216 |
| - title = {From Theory to Practice: Plug and Play with Succinct Data Structures}, |
217 |
| - author = {Gog, Simon and Beller, Timo and Moffat, Alistair and Petri, Matthias}, |
218 |
| - booktitle = {13th International Symposium on Experimental Algorithms, (SEA 2014)}, |
219 |
| - year = {2014}, |
220 |
| - pages = {326-337}, |
221 |
| - ee = {http://dx.doi.org/10.1007/978-3-319-07959-2_28} |
222 |
| - } |
223 |
| - |
224 |
| -A preliminary version is available [here on arXiv][SEAPAPER].
225 |
| - |
226 |
| -## External Resources used in SDSL |
227 |
| - |
228 |
| -We have included the code of two excellent suffix array |
229 |
| -construction algorithms. |
230 |
| - |
231 |
| -* Yuta Mori's incredibly fast suffix sorting algorithm [libdivsufsort][DIVSUF]
232 |
| - for byte alphabets.
233 |
| -* An adapted version of [Jesper Larsson's][JESL] [implementation][QSUFIMPL] of |
234 |
| - suffix array sorting on integer-alphabets (description of [Larsson and Sadakane][LS]). |
235 |
| - |
236 |
| -Additionally, we use the [googletest][GTEST] framework to provide unit tests. |
237 |
| -Our visualizations are implemented using the [d3js][d3js]-library. |
238 |
| - |
239 |
| -Authors |
240 |
| --------- |
241 |
| - |
242 |
| -The main contributors to the library are: |
243 |
| - |
244 |
| -* [Johannes Bader](https://github.com/olydis)
245 |
| -* [Timo Beller](https://github.com/tb38) |
246 |
| -* [Simon Gog](https://github.com/simongog) (Creator) |
247 |
| -* [Matthias Petri](https://github.com/mpetri) |
248 |
| - |
249 |
| -This project is also supported by code contributions |
250 |
| -from other researchers. E.g. Juha Kärkkäinen, |
251 |
| -[Dominik Kempa](https://github.com/dkempa), |
252 |
| -and Simon Puglisi contributed a compressed bitvector |
253 |
| -implementation ([hyb_vector][HB]). |
254 |
| -This project further profited from excellent input of our students |
255 |
| -Markus Brenner, Alexander Diehm, Christian Ocker, and Maike Zwerger. Stefan |
256 |
| -Arnold helped us with tricky template questions. We are also grateful to |
257 |
| -[Diego Caro](https://github.com/diegocaro), |
258 |
| -[Travis Gagie](https://github.com/TravisGagie), |
259 |
| -Kalle Karhu, |
260 |
| -[Bruce Kuo](https://github.com/bruce3557), |
261 |
| -Jan Kurrus, |
262 |
| -[Shanika Kuruppu](https://github.com/skuruppu), |
263 |
| -Jouni Siren, |
264 |
| -and [Julio Vizcaino](https://github.com/garviz) |
265 |
| -for bug reports. |
266 |
| - |
267 |
| -Contribute |
268 |
| ----------- |
269 |
| - |
270 |
| -Are you working on a new or improved implementation of a succinct data structure? |
271 |
| -We encourage you to contribute your implementation to the SDSL library to make |
272 |
| -your work accessible to the community within the existing library framework. |
273 |
| -Feel free to contact any of the authors or create an issue on the |
274 |
| -[issue tracking system](https://github.com/simongog/sdsl-lite/issues). |
275 |
| - |
276 |
| - |
277 |
| -[STL]: http://www.sgi.com/tech/stl/ "Standard Template Library" |
278 |
| -[pz]: http://pizzachili.di.unipi.it/ "Pizza&Chili"
279 |
| -[d3js]: http://d3js.org "D3JS library" |
280 |
| -[cmake]: http://www.cmake.org/ "CMake tool" |
281 |
| -[MAKE]: http://www.gnu.org/software/make/ "GNU Make" |
282 |
| -[gcc]: http://gcc.gnu.org/ "GNU Compiler Collection" |
283 |
| -[DIVSUF]: https://github.com/y-256/libdivsufsort/ "libdivsufsort" |
284 |
| -[LS]: http://www.sciencedirect.com/science/article/pii/S0304397507005257 "Larsson & Sadakane Algorithm"
285 |
| -[GTEST]: https://code.google.com/p/googletest/ "Google C++ Testing Framework" |
286 |
| -[SDSLCS]: http://simongog.github.io/assets/data/sdsl-cheatsheet.pdf "SDSL Cheat Sheet" |
287 |
| -[SDSLLIT]: https://github.com/simongog/sdsl-lite/wiki/Literature "Succinct Data Structure Literature" |
288 |
| -[TUT]: http://simongog.github.io/assets/data/sdsl-slides/tutorial "Tutorial" |
289 |
| -[QSUFIMPL]: http://www.larsson.dogma.net/qsufsort.c "Original Qsufsort Implementation" |
290 |
| -[JESL]: http://www.itu.dk/people/jesl/ "Homepage of Jesper Larsson" |
291 |
| -[CF]: https://github.com/simongog/sdsl-lite/blob/master/COPYING "Licence" |
292 |
| -[SEAPAPER]: http://arxiv.org/pdf/1311.1249v1.pdf "SDSL paper" |
293 |
| -[HB]: https://github.com/simongog/sdsl-lite/blob/hybrid_bitvector/include/sdsl/hybrid_vector.hpp "Hybrid bitvector"
294 |
| -[DOXYGENDOCS]: http://algo2.iti.kit.edu/gog/docs/html/index.html "API Reference" |
| 1 | +Development repository for SDSL Version 3 |