A Finite State Transducer String-to-String Map in Rust
Several years ago, while working on a project, I needed a compact, fast-loading string map that I could search with regular expressions. Rust already has a great crate for searching string keys with regexes, called fst. However, fst maps only support integer (u64) values. To build a string-to-string map on top of fst, I packed the value strings into a pointer table stored alongside the fst data, so that the integer held in the fst can point at a string. The fst data can be memory mapped, so loading time is almost negligible.
I have decided to open-source that code in case it is useful to others. It will be available in the fst_stringstring crate and on GitHub.
Building a String Table
The central feature of this code is the memory-mapped string table. Creating and using a string table looks like this:
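The following is not the fst_stringstring API. It is a minimal, std-only sketch of one way such a table could be laid out, under assumptions of my own: a little-endian u64 count, then count + 1 offsets, then the raw string bytes. The names build_string_table and StringTable are hypothetical. The reader only borrows a byte slice, so the same code would work whether the bytes come from a Vec or from a memory-mapped file.

```rust
/// Serialize strings into one buffer (assumed layout, all integers little-endian u64):
/// [count][offset_0 .. offset_count][string bytes]
/// offset_i is where string i starts inside the byte region; offset_count is its total length.
pub fn build_string_table(strings: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(strings.len() as u64).to_le_bytes());
    let mut offset = 0u64;
    for s in strings {
        out.extend_from_slice(&offset.to_le_bytes());
        offset += s.len() as u64;
    }
    // Trailing offset so the end of the last string can be found the same way as the others.
    out.extend_from_slice(&offset.to_le_bytes());
    for s in strings {
        out.extend_from_slice(s.as_bytes());
    }
    out
}

/// Read-only view over a serialized table. It copies nothing: lookups return
/// slices of the underlying bytes.
pub struct StringTable<'a> {
    data: &'a [u8],
    count: usize,
}

impl<'a> StringTable<'a> {
    /// Returns None if the buffer is too short to hold the header and offsets.
    pub fn new(data: &'a [u8]) -> Option<Self> {
        let count = read_u64(data, 0)? as usize;
        let header = count.checked_add(1)?.checked_mul(8)?.checked_add(8)?;
        if data.len() < header {
            return None;
        }
        Some(StringTable { data, count })
    }

    pub fn len(&self) -> usize {
        self.count
    }

    /// Look up string `index`; None if out of range, truncated, or not valid UTF-8.
    pub fn get(&self, index: usize) -> Option<&'a str> {
        if index >= self.count {
            return None;
        }
        let base = 8 + (self.count + 1) * 8; // start of the string bytes
        let start = read_u64(self.data, 8 + index * 8)? as usize;
        let end = read_u64(self.data, 8 + (index + 1) * 8)? as usize;
        let bytes = self
            .data
            .get(base.checked_add(start)?..base.checked_add(end)?)?;
        std::str::from_utf8(bytes).ok()
    }
}

/// Read a little-endian u64 at `pos`, or None if the buffer is too short.
fn read_u64(data: &[u8], pos: usize) -> Option<u64> {
    let bytes = data.get(pos..pos.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}
```

In this sketch, a string is identified by its position in the table, which is exactly the kind of integer an fst map can store as a value. Validation happens lazily on each lookup, so opening the table only reads the count.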
Integrating String Key-Values with FST
The next step for a string-to-string FST map is to memory map both the string table and the fst data. To build an FST file, the keys must be inserted in ascending (lexicographic) order. In addition, memory mapping a file is considered unsafe in Rust, so an unsafe block is required.
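The ordering requirement can be illustrated without the fst crate. The sketch below is std-only and uses hypothetical names (Built, build_sorted, lookup): it sorts the pairs by key, assigns each value an index, and uses a sorted vector with binary search as a stand-in for the fst. In a real build, the (key, index) pairs would be passed in this same order to fst's MapBuilder::insert, and the values would go into the string table. Memory mapping is left out so that the example needs no unsafe code.

```rust
use std::collections::BTreeMap;

/// Result of the build step: keys in ascending order paired with the index
/// of their value, plus the values in that same order.
pub struct Built {
    pub key_index: Vec<(String, u64)>,
    pub values: Vec<String>,
}

/// Sort the input by key and assign each value an index.
/// A BTreeMap iterates &str keys in lexicographic byte order, which is the
/// order an fst builder expects. For duplicate keys, the last value wins.
pub fn build_sorted(pairs: &[(&str, &str)]) -> Built {
    let sorted: BTreeMap<&str, &str> = pairs.iter().copied().collect();
    let mut key_index = Vec::with_capacity(sorted.len());
    let mut values = Vec::with_capacity(sorted.len());
    for (i, (k, v)) in sorted.into_iter().enumerate() {
        key_index.push((k.to_string(), i as u64));
        values.push(v.to_string());
    }
    Built { key_index, values }
}

/// Stand-in for an fst lookup: find the key by binary search, then follow
/// the stored index into the value list.
pub fn lookup<'a>(built: &'a Built, key: &str) -> Option<&'a str> {
    let pos = built
        .key_index
        .binary_search_by(|(k, _)| k.as_str().cmp(key))
        .ok()?;
    let idx = built.key_index[pos].1 as usize;
    built.values.get(idx).map(|s| s.as_str())
}
```

Sorting up front through a BTreeMap keeps the caller free to supply pairs in any order, at the cost of holding all keys in memory during the build.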