RubyGems - sequitur - Versions diffs - 0.1.01 → 0.1.02 - Mend

sequitur 0.1.01 → 0.1.02

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

checksums.yaml +8 -8
data/CHANGELOG.md +5 -1
data/README.md +84 -2
data/lib/sequitur/constants.rb +1 -1
data/lib/sequitur/formatter/base_formatter.rb +9 -7
data/lib/sequitur/grammar_visitor.rb +2 -1
data/spec/sequitur/digram_spec.rb +0 -1
data/spec/sequitur/dynamic_grammar_spec.rb +1 -1
data/spec/sequitur/formatter/base_text_spec.rb +5 -5
data/spec/sequitur/formatter/debug_spec.rb +4 -4
data/spec/sequitur/grammar_visitor_spec.rb +1 -1
data/spec/sequitur/production_ref_spec.rb +2 -2
metadata +7 -3

checksums.yaml CHANGED

@@ -1,15 +1,15 @@
 ---
 !binary "U0hBMQ==":
   metadata.gz: !binary |-
-    NTQ5ODkzODJlYjNmZDBiODNiZTdiZTE4ZDFlNTljYWFhMTg5MzExYw==
+    YTdkNzZiNTc1NjBkM2M0MDlhZDI1M2MyNTFhODJhZGI1MjFlYWI2MQ==
   data.tar.gz: !binary |-
-    YzZkNWE1MTdhZTBiMWZmOTI1ZDhhMDJkM2QxYTU3ZDAxZDExZjk5MQ==
+    MmEzZGRlNTI2M2U3ZmQwYmY3MTA1MmE0MDkzMGQ0ZjBmZDJlYTRmMQ==
 !binary "U0hBNTEy":
   metadata.gz: !binary |-
-    YWVmZTE0YWQ4MmEyMTI4YTFjYmU2MTZkYTZhOGUwZTM4YTZmMDQ2OTliMDky
-    NjVmMjdkYmVkMGY3MGEzMDBlM2RjMzNlODMzZDQ3NGY4NzRmNzNkNjNlOTI2
-    ZGFlMTkxMTNmZjA0MTI2MjM0MzhkMjM4MjUwMzg0ZDk5NDU0YTU=
+    MjUzMmU0YTQ4MzQ2NmVmMWU2YWQzMTkwZDNiZjM3MjgyOTFlMmRmZDJmMmJi
+    NmM5YjMxMjA0YzM5OGFiOGRiYjBmYTc2M2YyN2NiNjJiMGRlYjJkMmMxMThk
+    ZGU1MDlhYTBkZDc3YTEwMDAwNmQ0YTZlOTQyZGM5YmFmNTRjNmM=
   data.tar.gz: !binary |-
-    MTNhOGQyOWNlNTQ4Y2RiOTcyYjU3ZDM3MGZmZGEzYjY2NjFkM2RiZmI3ZTZl
-    ZmRiM2JiNzU0MGVkMDUzYTVjM2M4MmM3MWQwYTgyNmQzYjA0MTdhNWI4Mjg2
-    ODAwYTc4M2JjZGQyOTUzOGM4MjA5MzZhYThlMmM5MWNjYjJhZmQ=
+    NGU2NjY2Yzc2ZmQ4NDFlN2E4MGVlYTUwMDg4NjgwYzBiYjk0ZjM5NGY4MTg4
+    NTI5NTExOTQzMWY1YzhiNWM4ZjM1OWQ5YjM1MjViZWVlYWRlMWU5NjcyNDNk
+    MzAwMzZhM2NlZGE1M2MzYTYyOGZmODkyMWE4YjA0NTE3MTk4NjA=
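
The `!binary` keys and values in this hunk are Base64 text. As a minimal sketch (variable names are illustrative, and the strings are copied from the hunk above), Ruby's core `String#unpack1` can decode them to show which digest algorithms the two sections refer to:

```ruby
# Base64 strings copied from the checksums.yaml hunk above.
keys = ['U0hBMQ==', 'U0hBNTEy']
new_metadata_sha1 = 'YTdkNzZiNTc1NjBkM2M0MDlhZDI1M2MyNTFhODJhZGI1MjFlYWI2MQ=='

# String#unpack1('m') decodes Base64 using core Ruby only.
decoded_keys = keys.map { |key| key.unpack1('m') }
decoded_digest = new_metadata_sha1.unpack1('m')

puts decoded_keys.inspect   # => ["SHA1", "SHA512"]
puts decoded_digest.length  # => 40
```

The 40-character result is consistent with a hex-encoded SHA1 digest, so the hunk only records that the digests of `metadata.gz` and `data.tar.gz` changed between the two releases.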

data/CHANGELOG.md CHANGED

@@ -1,6 +1,10 @@
+### 0.1.02 / 2014-09-18
+* [CHANGE] File `README.md`: expanded introductory text.
+* [CHANGE] File `sequitur.gemspec` : expanded gem description in the specification.
 ### 0.1.01 / 2014-09-17
 * [NEW] Added new `BaseFormatter` superclass. Sample formatters are inheriting from this one.
-* [CHANGE] File `README.me`: added a brief intro to the Sequitur algorithm, expanded the Ruby examples
+* [CHANGE] File `README.md`: added a brief intro to the Sequitur algorithm, expanded the Ruby examples
 * [CHANGE] Private method `BaseText#prod_name` production name doesn't contain an underscore.
 * [CHANGE] Formatter class `BaseText` now inherits from `BaseFormatter`
 * [CHANGE] Formatter class `Debug` now inherits from `BaseFormatter`

data/README.md CHANGED

@@ -15,7 +15,7 @@ The following are good entry points to learn about the algorithm:
 ### The theory in a nutshell ###
 Given a sequence of input tokens (say, characters), the Sequitur algorithm
-will represent that input sequence as a set of rules. As the algorithm detects
+will represent that input sequence as a set of rules. As the algorithm detects
 automatically repeated token patterns, the resulting rule set can encode repetitions in the input
 in a very compact way.
 Of interest is the fact that the algorithm runs in time linear in the length of the input sequence.
@@ -46,7 +46,7 @@ P3 : P2 d.
 ```
 Translated in plain English:
-- Rule (start) tells that the input consists of the sequence of P_1 P_2 P_3 patterns followed by the letter e.
+- Rule (start) tells that the input consists of the sequence of P1 P2 P3 patterns followed by the letter e.
 - Rule (P1) represents the sequence 'ab'.
 - Rule (P2) represents the pattern encoded by P1 (thus 'ab') then 'c'.
 In other words, it represents the string 'abc'.
@@ -78,6 +78,7 @@ The following Ruby snippet show how to apply Sequitur on the input string from t
 The demo illustrates how easy it is to run the algorithm on a string. However, the next question is how
 can you make good use of the algorithm's result.
+**Printing the resulting rules**
 The very first natural step is to be able to print out the (grammar) rules.
 Here's how:
@@ -106,6 +107,87 @@ Here's how:
     # P3 : P2 d.
 ```
+## Understanding the algorithm's results
+The Sequitur algorithm generates a -simplified- context-free grammar, therefore we dedicate this section
+to the terminology about context-free grammars. As the Internet provides tons of information can be found
+on the subject, we limit ourselves to the minimal terminology of interest when using the sequitur gem.
+First of all, what is a **grammar**? To simplify the matter, one can see a grammar as a set of
+grammar rules. These rules are called production rules or more briefly **productions**.
+In a context-free grammar, productions have the form:
+````
+P : body.
+```
+Where:
+- The colon ':' character separates the head (= left-hand side) and the body (right-hand side, *rhs* in short)
+of the rule.
+- The left-hand side consists just of one symbol, P. P is a categorized as a *nonterminal symbol* and for our purposes
+a nonterminal symbol can be seen as the "name" of the production. By contrast, a terminal symbol is just one element
+from the input sequence (symbols as defined in formal grammar theory shouldn't be confused with Ruby's `Symbol` class).
+- the body is a sequence -possibly empty- of *symbols* (terminal or nonterminal).
+Basically, a production rule tells that P is equivalent to the sequence of symbols found in the
+right-hand side of the production. A nonterminal symbol that appears in the rhs of a production can be
+seen as a reference to the production with same name.
+## The Sequitur API
+Recall the above example: a single call to the `Sequitur#build_from` factory method
+suffices to construct a grammar object.
+```ruby
+    require 'sequitur'
+    input_sequence =  'ababcabcdabcde'
+    grammar = Sequitur.build_from(input_sequence)
+```
+The return value `grammar` is a `Sequitur::SequiturGrammar` instance.
+Unsurprisingly, the `Sequitur::SequiturGrammar` class defines an accessor method called 'productions'
+that returns the productions of the grammar as an array of `Sequitur::Production` objects.
+```ruby
+	# Count the number of productions in the grammar
+	puts grammar.productions.size # => 4
+	# Retrieve all productions of the grammar
+	all_prods = grammar.productions
+	# Retrieve the start production
+	start_prod = grammar.production[0]
+```
+Once we have a grip on a production, it is easy to access its right-hand side through the `Production#rhs` method.
+It returns an array of symbols.
+```ruby
+	# ...Continuing the same example
+	# Retrieve the right-hand side of the production
+	prod_body = start_prod.rhs	# Return an Array object
+```
+The RHS of a production is a sequence (i.e. Array) of symbols.
+How are the grammar symbols implemented?
+-Terminal symbols are directly originating from the input sequence. They are inserted "as is" in the
+RHS. For instance, if the input sequence consists of integer values (i.e. Finum instances), then they
+will be inserted in the RHS of productions.
+-Non-terminal symbols are implemented as `Sequitur::ProductionRef` objects.
+A ProductionRef is reference to a Production object. The latter one can be accessed through the `ProductionRef#production` method.
+### Installation ###
+The sequitur gem installation is fairly standard.
+If your project has a `Gemfile` file, add `sequitur` to it. Otherwise, install the gem like this:
+```bash
+$[sudo] gem install sequitur
+```
 ### TODO: Add more documentation ###
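
The new README section describes productions as a head plus a body of terminal and nonterminal symbols, where a nonterminal acts as a reference to another production. A minimal sketch of that idea, without the gem: the rules are held in a plain Hash, nonterminals are modelled as Ruby Symbols (the gem itself uses `Sequitur::ProductionRef` objects), and the `start` body is an assumption chosen so that the expansion reproduces the README's input string. `RULES` and `expand` are hypothetical names.

```ruby
# Hypothetical stand-in for the grammar: a Hash from rule name to rule body.
# Nonterminals are Ruby Symbols here; terminals are one-character Strings.
RULES = {
  start: [:P1, :P2, :P3, :P3, 'e'], # assumed start rule, not copied from the README
  P1: ['a', 'b'],
  P2: [:P1, 'c'],
  P3: [:P2, 'd']
}.freeze

# Recursively replace each nonterminal by the expansion of the rule it names.
def expand(rules, name)
  rules.fetch(name).map do |symbol|
    symbol.is_a?(Symbol) ? expand(rules, symbol) : symbol
  end.join
end

puts expand(RULES, :start) # => ababcabcdabcde
```

Expanding `start` recovers the original input, which illustrates why a nonterminal in a right-hand side can be read as a reference to the production with the same name.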

data/lib/sequitur/constants.rb CHANGED

@@ -3,7 +3,7 @@
 module Sequitur # Module used as a namespace
   # The version number of the gem.
-  Version = '0.1.01'
+  Version = '0.1.02'
   # Brief description of the gem.
   Description = 'Ruby implementation of the Sequitur algorithm'

data/lib/sequitur/formatter/base_formatter.rb CHANGED

@@ -17,17 +17,19 @@ module Sequitur
       # Given a grammar or a grammar visitor, perform the visit
       # and render the visit events in the output stream.
       def render(aGrmOrVisitor)
-        aVisitor = if aGrmOrVisitor.kind_of?(GrammarVisitor)
-          aGrmOrVisitor
+        if aGrmOrVisitor.kind_of?(GrammarVisitor)
+          a_visitor = aGrmOrVisitor
         else
-          aGrmOrVisitor.visitor
+          a_visitor = aGrmOrVisitor.visitor
         end
-        aVisitor.subscribe(self)
-        aVisitor.start()
-        aVisitor.unsubscribe(self)
+        a_visitor.subscribe(self)
+        a_visitor.start
+        a_visitor.unsubscribe(self)
       end
     end # class
   end # module
-end # module
+end # module
+# End of file

data/lib/sequitur/grammar_visitor.rb CHANGED

@@ -22,7 +22,7 @@ class GrammarVisitor
   end
   def unsubscribe(aSubscriber)
-    subscribers.delete_if { |entry| entry == aSubscriber}
+    subscribers.delete_if { |entry| entry == aSubscriber }
   end
   # The signal to start the visit.
@@ -66,6 +66,7 @@ class GrammarVisitor
   end
   private
   def broadcast(msg, *args)
     subscribers.each do |a_subscriber|
       next unless a_subscriber.respond_to?(msg)
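
The `render`, `unsubscribe` and `broadcast` fragments in the two hunks above follow a subscribe, visit, unsubscribe protocol. The following is a self-contained sketch of that protocol, not the gem's code: `MiniVisitor`, `MiniFormatter` and the event names are hypothetical, and dispatching with `send` inside `broadcast` is an assumption, since the hunk only shows the `respond_to?` guard.

```ruby
# Hypothetical miniature of the visitor/subscriber protocol.
class MiniVisitor
  attr_reader :subscribers

  def initialize(items)
    @items = items
    @subscribers = []
  end

  def subscribe(a_subscriber)
    subscribers << a_subscriber
  end

  def unsubscribe(a_subscriber)
    subscribers.delete_if { |entry| entry == a_subscriber }
  end

  # Walk the items and notify the subscribers of each event.
  def start
    broadcast(:before_visit)
    @items.each { |item| broadcast(:visit_item, item) }
    broadcast(:after_visit)
  end

  private

  # Subscribers that do not implement an event are skipped.
  def broadcast(msg, *args)
    subscribers.each do |a_subscriber|
      next unless a_subscriber.respond_to?(msg)
      a_subscriber.send(msg, *args) # assumed dispatch mechanism
    end
  end
end

# A formatter that reacts to a single event and ignores the others.
class MiniFormatter
  attr_reader :seen

  def initialize
    @seen = []
  end

  # Same three steps as the render method in the hunk above.
  def render(a_visitor)
    a_visitor.subscribe(self)
    a_visitor.start
    a_visitor.unsubscribe(self)
  end

  def visit_item(item)
    seen << item
  end
end

visitor = MiniVisitor.new(%w[a b c])
formatter = MiniFormatter.new
formatter.render(visitor)
puts formatter.seen.inspect     # => ["a", "b", "c"]
puts visitor.subscribers.empty? # => true
```

Because the formatter unsubscribes at the end of `render`, the same visitor can be handed to another formatter afterwards without the first one receiving further events.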

data/spec/sequitur/digram_spec.rb CHANGED

@@ -46,4 +46,3 @@ end # describe
 end # module
 # End of file

data/spec/sequitur/dynamic_grammar_spec.rb CHANGED

@@ -117,7 +117,7 @@ describe DynamicGrammar do
       a_visitor.subscribe(fake_formatter)
       expect(fake_formatter).to receive(:before_grammar).with(subject).ordered
-      expect(fake_formatter).to receive(:before_production).with(subject.root).ordered
+      expect(fake_formatter).to receive(:before_production).with(subject.root)
       expect(fake_formatter).to receive(:before_rhs).with([]).ordered
       expect(fake_formatter).to receive(:after_rhs).with([]).ordered
       expect(fake_formatter).to receive(:after_production).with(subject.root)

data/spec/sequitur/formatter/base_text_spec.rb CHANGED

@@ -41,7 +41,7 @@ describe BaseText do
       expect { BaseText.new(StringIO.new('', 'w')) }.not_to raise_error
     end
-    it "should know its output destination" do
+    it 'should know its output destination' do
       instance = BaseText.new(destination)
       expect(instance.output).to eq(destination)
     end
@@ -54,7 +54,7 @@ describe BaseText do
       instance = BaseText.new(destination)
       a_visitor = empty_grammar.visitor
       instance.render(a_visitor)
-      expectations =<<-SNIPPET
+      expectations = <<-SNIPPET
 start :.
 SNIPPET
       expect(destination.string).to eq(expectations)
@@ -64,7 +64,7 @@ SNIPPET
       instance = BaseText.new(destination)
       a_visitor = sample_grammar.visitor  # Use visitor explicitly
       instance.render(a_visitor)
-      expectations =<<-SNIPPET
+      expectations = <<-SNIPPET
 start :.
 P1 : a.
 P2 : b.
@@ -77,7 +77,7 @@ SNIPPET
     it 'should support visit events without an explicit visitor' do
       instance = BaseText.new(destination)
       instance.render(sample_grammar)
-      expectations =<<-SNIPPET
+      expectations = <<-SNIPPET
 start :.
 P1 : a.
 P2 : b.
@@ -92,4 +92,4 @@ end # describe
 end # module
 end # module
-# End of file
+# End of file

data/spec/sequitur/formatter/debug_spec.rb CHANGED

@@ -41,7 +41,7 @@ describe Debug do
       expect { Debug.new(StringIO.new('', 'w')) }.not_to raise_error
     end
-    it "should know its output destination" do
+    it 'should know its output destination' do
       instance = Debug.new(destination)
       expect(instance.output).to eq(destination)
     end
@@ -54,7 +54,7 @@ describe Debug do
       instance = Debug.new(destination)
       a_visitor = empty_grammar.visitor
       instance.render(a_visitor)
-      expectations =<<-SNIPPET
+      expectations = <<-SNIPPET
 before_grammar
   before_production
     before_rhs
@@ -69,7 +69,7 @@ SNIPPET
       instance = Debug.new(destination)
       a_visitor = sample_grammar.visitor
       instance.render(a_visitor)
-      expectations =<<-SNIPPET
+      expectations = <<-SNIPPET
 before_grammar
   before_production
     before_rhs
@@ -111,4 +111,4 @@ end # describe
 end # module
 end # module
-# End of file
+# End of file

data/spec/sequitur/grammar_visitor_spec.rb CHANGED

@@ -95,4 +95,4 @@ end # describe
 end # module
-# End of file
+# End of file

data/spec/sequitur/production_ref_spec.rb CHANGED

@@ -72,8 +72,8 @@ describe ProductionRef do
     it 'should complain when binding to something else than production' do
       subject.bind_to(target)
-      msg = "Illegal production type String"
-      expect {subject.bind_to('WRONG') }.to raise_error(StandardError, msg)
+      msg = 'Illegal production type String'
+      expect { subject.bind_to('WRONG') }.to raise_error(StandardError, msg)
     end
     it 'should compare to other production (reference)' do

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: sequitur
 version: !ruby/object:Gem::Version
-  version: 0.1.01
+  version: 0.1.02
 platform: ruby
 authors:
 - Dimitri Geshef
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-09-17 00:00:00.000000000 Z
+date: 2014-09-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
@@ -66,7 +66,11 @@ dependencies:
     - - ! '>='
       - !ruby/object:Gem::Version
         version: 2.0.0
-description: Ruby implementation of the Sequitur algorithm.
+description: ! "Ruby implementation of the Sequitur algorithm. This algorithm automatically
+  \nfinds repetitions and hierarchical structures in a given sequence of input \ntokens.
+  It encodes the input into a context-free grammar. \nThe Sequitur algorithm can be
+  used to \na) compress a sequence of items,\nb) discover patterns in an sequence,
+  \nc) generate grammar rules that can represent a given input.\n"
 email: famished.tiger@yahoo.com
 executables: []
 extensions: []