Finite State Transducers String-to-String Map in Rust
Several years ago working on a project I found a requirement for a really fast loading and compact string map that I could search over with regexes. Rust already has a great data structure for searching strings with regexes, called fst. However, fst only supports integers as map keyed values. To create a string-to-string map from fst I packed the strings into a pointer table along with the fst data itself. The fst data can be memmapped into use, so loading time is almost negligible.
I have decided to open-source that code in case that it would be useful for others. I will put all that code in the fst_stringstring crate and on Github.
Building a String Table
The biggest feature of this code is the memory mapped string table. Creating and using a string table looks like this:
The next step for a string-string FST map is to memory map both the string table and the fst data. To build a FST file, the string inserts must happen in ascending order. Aside from that the mmap itself is deemed unsafe behavior, so an unsafe block is required.
tests/map_builder.rs
usefst_stringstring::builder::StringBuilder;usefst_stringstring::strings::StringMap;usefst_stringstring::map::StringStringMap;usestd::fs::{File,remove_file};usestd::path::Path;usefst::Map;usememmap::Mmap;#[test]fnmbuilder1()->std::io::Result<()>{ifPath::new("testy.fst").exists(){remove_file("testy.fst")?;}ifPath::new("testy.fmm").exists(){remove_file("testy.fmm")?;}letmut builder =StringBuilder::new("testy.fmm")?;let wtr =std::io::BufWriter::new(File::create("testy.fst")?);letmut fst_builder =fst::MapBuilder::new(wtr).expect("Unable to create fst::MapBuilder");let i1 = builder.insert("abcd")?;
fst_builder.insert("a", i1 asu64).expect("fst insert");let i2 = builder.insert("efgh")?;
fst_builder.insert("b", i2 asu64).expect("fst insert");
builder.finish();
fst_builder.finish().expect("Unable to finish build of fst::MapBuilder");let mmap =unsafe{Mmap::map(&File::open("testy.fst")?).expect("unable to mmap file")};let fmap =Map::new(mmap).expect("Could not create fst::Map from file");let smap =StringMap::new("testy.fmm")?;let map =StringStringMap::new(fmap, smap);assert_eq!( map.get("a"),Some("abcd".to_string()));assert_eq!( map.get("b"),Some("efgh".to_string()));assert_eq!( map.get("c"),None);ifPath::new("testy.fst").exists(){remove_file("testy.fst")?;}ifPath::new("testy.fmm").exists(){remove_file("testy.fmm")?;}Ok(())}