An Attempt at Generative Programming in Rust

What is generative programming?

According to Wikipedia, the goal of generative programming is "to manufacture software components in an automated way". I find this definition sufficient, but to illustrate the point I would like to use the example of Natural Language Processing. It is very simple to spit out vocabulary words according to coarse classifications like noun/verb/adjective. However, writing meaningful text with proper grammar is not so simple. One example of generative programming would be to write the same text in multiple languages with proper grammar. Say it once, hear it everywhere.

What does Generative Programming have to do with Natural Language Processing?

In this article I aim to show a simple implementation of generative programming that generates English text with appropriate grammar considerations. This may seem impossible or it may seem simplistic, depending on your previous experience with Natural Language Processing. Either way, my goal is not to actually generate good text, but rather to illustrate the generative capabilities of Rust. For this example we will generate sentences of the form [subject] [verb] [direct object] [adverb].

use bitflags::bitflags;

bitflags! {
   pub struct WordUsage: u32 {
      const SINGULAR                   = 0b00000000_00000000_00000000_00000001;
      const PLURAL                     = 0b00000000_00000000_00000000_00000010;
      // 2 Pluralities

      const LINKING                    = 0b00000000_00000000_00000000_00000100;
      const INTRANSITIVE               = 0b00000000_00000000_00000000_00001000;
      const MONOTRANSITIVE             = 0b00000000_00000000_00000000_00010000;
      const DITRANSITIVE               = 0b00000000_00000000_00000000_00100000;
      // 4 Transitivities

      const PRESENT_SIMPLE             = 0b00000000_00000000_00000001_00000000;
      const PRESENT_CONTINUOUS         = 0b00000000_00000000_00000010_00000000;
      const PRESENT_PERFECT            = 0b00000000_00000000_00000100_00000000;
      const PRESENT_PERFECT_CONTINUOUS = 0b00000000_00000000_00001000_00000000;
      const PAST_SIMPLE                = 0b00000000_00000000_00010000_00000000;
      const PAST_CONTINUOUS            = 0b00000000_00000000_00100000_00000000;
      const PAST_PERFECT               = 0b00000000_00000000_01000000_00000000;
      const PAST_PERFECT_CONTINUOUS    = 0b00000000_00000000_10000000_00000000;
      const FUTURE_SIMPLE              = 0b00000000_00000001_00000000_00000000;
      const FUTURE_PERFECT             = 0b00000000_00000010_00000000_00000000;
      const FUTURE_CONTINUOUS          = 0b00000000_00000100_00000000_00000000;
      const FUTURE_PERFECT_CONTINUOUS  = 0b00000000_00001000_00000000_00000000;
      // 12 Verb Tenses

      const NUMERAL                    = 0b00000000_00100000_00000000_00000000;
      const ARTICLE                    = 0b00000000_01000000_00000000_00000000;
      const DETERMINER                 = 0b00000000_10000000_00000000_00000000;
      const NOUN                       = 0b00000001_00000000_00000000_00000000;
      const VERB                       = 0b00000010_00000000_00000000_00000000;
      const ADJECTIVE                  = 0b00000100_00000000_00000000_00000000;
      const ADVERB                     = 0b00001000_00000000_00000000_00000000;
      const PRONOUN                    = 0b00010000_00000000_00000000_00000000;
      const PREPOSITION                = 0b00100000_00000000_00000000_00000000;
      const CONJUNCTION                = 0b01000000_00000000_00000000_00000000;
      const INTERJECTION               = 0b10000000_00000000_00000000_00000000;
      // 11 Parts of Speech
   }
}
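
Since these are plain bit flags, a single word can carry several properties at once. As a quick, illustrative check (this snippet is my own usage sketch, not part of the module itself), a plural noun is simply the union of two flags:

fn demo_flags() {
   // a word may be tagged with several properties at once
   let usage = WordUsage::NOUN | WordUsage::PLURAL;
   assert!(usage.contains(WordUsage::NOUN));
   assert!(usage.contains(WordUsage::PLURAL));
   assert!(!usage.contains(WordUsage::VERB));
}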

Representing Dictionary Words in a Space-Efficient Manner

It is important to note that words and n-grams may be ambiguous, sometimes very ambiguous. To capture this information in our vocabulary we build a bitfield of all the properties that a word can have, and we store these bitfields in a simple tree data structure. There are more compact representations, like finite state machines or neural networks, that we may use later. For now we will just use a Character Trie that holds a WordUsage bitfield at each word boundary.

use crate::en::grammar::WordUsage;
use radix_trie::Trie;

pub struct Dictionary {
   diction: Trie<String, WordUsage>
}
impl Dictionary {
   pub fn new() -> Dictionary {
      Dictionary {
         diction: Trie::new()
      }
   }
   pub fn insert(&mut self, key: String, usage: WordUsage) {
      if let Some(record) = self.diction.get_mut(&key) {
         *record |= usage;
      } else {
         self.diction.insert(key, usage);
      }
   }
}
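
To give a feel for how this dictionary is meant to be populated, here is a small sketch that seeds it by hand; seed_dictionary is a hypothetical helper of mine, and the word list is purely illustrative:

fn seed_dictionary() -> Dictionary {
   let mut dict = Dictionary::new();
   // inserting the same surface form twice simply ORs the usages together
   dict.insert("dogs".to_string(), WordUsage::NOUN | WordUsage::PLURAL);
   dict.insert("cats".to_string(), WordUsage::NOUN | WordUsage::PLURAL);
   dict.insert("chase".to_string(), WordUsage::VERB | WordUsage::MONOTRANSITIVE | WordUsage::PRESENT_SIMPLE);
   dict.insert("relentlessly".to_string(), WordUsage::ADVERB);
   dict
}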

Representing Language Grammar in a Space-Efficient Manner

For reading and writing the language we will use a simple graph representation of the language as a whole. EBNF is good for context-free languages with small lookaheads; however, it can quickly become unwieldy when a larger amount of ambiguity is present in the language. Surprisingly, from past experience I have found that a CFG is usable to represent natural language in many cases. This is the grammar nazi approach to language. When Context-Free Grammars break, they usually break down by accepting everything and rejecting nothing. It is always possible to paint in broad strokes and lose specificity. This is the short reasoning behind this approach, and along with it comes the whole wagonload of problems that come with statistical language processing.

use crate::en::grammar::WordUsage;

pub struct GrammarVertex {
   pub is_terminal: bool,
   pub word: WordUsage,
}
impl GrammarVertex {
   pub fn new(word: WordUsage, is_terminal: bool) -> GrammarVertex {
      GrammarVertex {
         is_terminal,
         word,
      }
   }
}

pub struct RegularLanguage {
   pub nodes: Vec<GrammarVertex>,
   pub adjacency_list: Vec<(usize,usize)>,
}
impl RegularLanguage {
   pub fn new(is_terminal: bool) -> RegularLanguage {
      RegularLanguage {
         nodes: vec![
            // start node: the entry point of the graph, with no word attached
            GrammarVertex::new(WordUsage::empty(), is_terminal)
         ],
         adjacency_list: Vec::new(),
      }
   }
   pub fn put(&mut self, n: GrammarVertex) -> usize {
      let i = self.nodes.len();
      self.nodes.push(n);
      i
   }
   // v: add a vertex with the given word requirement and return its index
   pub fn v(&mut self, word: WordUsage, is_terminal: bool) -> usize {
      self.put(GrammarVertex::new(word, is_terminal))
   }
   // e: add a directed edge from vertex a to vertex b
   pub fn e(&mut self, a: usize, b: usize) {
      self.adjacency_list.push((a, b));
   }
}

pub struct RigidLanguage;
impl RigidLanguage {
   pub fn new() -> RegularLanguage {
      let mut l                = RegularLanguage::new(false);
      let rigid_subject        = l.v(WordUsage::NOUN, false);
      let rigid_verb           = l.v(WordUsage::VERB, false);
      let rigid_direct_object  = l.v(WordUsage::NOUN, false);
      let rigid_adverb         = l.v(WordUsage::ADVERB, true);
      // connect the entry node (index 0) created by RegularLanguage::new to the chain
      l.e(0, rigid_subject);
      l.e(rigid_subject, rigid_verb);
      l.e(rigid_verb, rigid_direct_object);
      l.e(rigid_direct_object, rigid_adverb);
      l
   }
}
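
With a vocabulary and the rigid grammar in hand, a sentence can be produced by walking the graph from the start node and, at every vertex, choosing some word whose recorded usage covers the flags that vertex requires. The sketch below is a minimal illustration under one assumption: the lexicon slice stands in for the dictionary contents, since the Dictionary above does not yet expose an iterator over its entries.

pub fn generate(l: &RegularLanguage, lexicon: &[(String, WordUsage)]) -> Vec<String> {
   let mut sentence = Vec::new();
   let mut current = 0; // begin at the entry node created by RegularLanguage::new
   loop {
      let vertex = &l.nodes[current];
      // emit a word only for vertices that actually require one
      if !vertex.word.is_empty() {
         // the word's usage must include every flag this vertex requires
         let found = lexicon.iter()
            .find(|(_, u)| (u.bits() & vertex.word.bits()) == vertex.word.bits());
         if let Some((w, _)) = found {
            sentence.push(w.clone());
         }
      }
      if vertex.is_terminal { break; }
      // follow the first outgoing edge leaving the current vertex, if any
      let next = l.adjacency_list.iter()
         .find(|(a, _)| *a == current)
         .map(|(_, b)| *b);
      match next {
         Some(n) => current = n,
         None => break,
      }
   }
   sentence
}

With the handful of entries used to seed the dictionary earlier, this produces sentences such as "dogs chase dogs relentlessly": grammatical, if not very imaginative.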

Generating Feedforward Neural Networks from Grammars and Bitfields

This is all well and good, but where is the generative programming? Well, so far we have hard-coded the representation of word usage and some very quaint grammar rules. It is infeasible for us to manually hard-code every word usage, grammar rule, and vocabulary item into this machine. It is time for some Machine Learning! To learn new vocabulary and grammar we will implement a feedforward neural network, and most of its code will be generated for us from what we already have. Don't believe me? Just read the code:

use crate::en::grammar::WordUsage;

pub struct WordUsageTensor {
   pub tensor: [f64; 32]
}

impl From<WordUsage> for WordUsageTensor {
   fn from(wu: WordUsage) -> Self {
      let bits = wu.bits();
      WordUsageTensor {
         tensor: [
            if (bits & 0b00000000_00000000_00000000_00000001)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_00000010)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_00000100)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_00001000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_00010000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_00100000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_01000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000000_10000000)>0 { 1.0 } else { 0.0 },

            if (bits & 0b00000000_00000000_00000001_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000010_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00000100_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00001000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00010000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_00100000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_01000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000000_10000000_00000000)>0 { 1.0 } else { 0.0 },

            if (bits & 0b00000000_00000001_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000010_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00000100_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00001000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00010000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_00100000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_01000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000000_10000000_00000000_00000000)>0 { 1.0 } else { 0.0 },

            if (bits & 0b00000001_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000010_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00000100_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00001000_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00010000_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b00100000_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b01000000_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
            if (bits & 0b10000000_00000000_00000000_00000000)>0 { 1.0 } else { 0.0 },
         ]
      }
   }
}
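
The conversion is then trivial at the call site. As a quick, illustrative check of my own (the indices follow directly from the bit layout above):

fn demo_tensor() {
   let usage = WordUsage::NOUN | WordUsage::SINGULAR;
   let t = WordUsageTensor::from(usage);
   assert_eq!(t.tensor[0], 1.0);   // SINGULAR occupies bit 0
   assert_eq!(t.tensor[24], 1.0);  // NOUN occupies bit 24
   assert_eq!(t.tensor[25], 0.0);  // VERB (bit 25) was not set
}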

How a Feedforward Neural Network Learns

Neural Networks are a mathematical abstraction of something much simpler: good and bad stimuli, i.e. pain and pleasure. For our use case it is straightforward to say that it is good when we recognize a word, and it is also good when we have an appropriate grammar rule defined. It is bad when we have no appropriate rule for a word. It is bad when we have too much ambiguity. For a Neural Network, this is all we need to define in order to learn from a dataset.
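
To make the good/bad stimulus idea concrete, the usual way to encode it for a feedforward network is a loss function: the network's output is compared against the WordUsageTensor we expected, and the distance between the two is the "pain" to be minimized during training. Below is a minimal sketch assuming a squared-error measure; the actual network and training loop are left for the next post.

// Squared-error "pain" between a predicted usage tensor and the expected one.
// A perfect prediction scores 0.0; the worse the prediction, the larger the loss.
fn loss(predicted: &WordUsageTensor, expected: &WordUsageTensor) -> f64 {
   predicted.tensor.iter()
      .zip(expected.tensor.iter())
      .map(|(p, e)| (p - e) * (p - e))
      .sum()
}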

This adventure continues in a new post: Continuing Natural Language Processing in Rust