RubyGems - markov_twitter - Versions diffs - 0.0.1 - Mend

markov_twitter 0.0.1

Files changed (12)

checksums.yaml +7 -0
data/README.md +230 -0
data/bin/console +25 -0
data/bin/markov_twitter +8 -0
data/lib/markov_twitter/authenticator.rb +16 -0
data/lib/markov_twitter/markov_builder/node.rb +192 -0
data/lib/markov_twitter/markov_builder.rb +236 -0
data/lib/markov_twitter/test_helper_methods.rb +175 -0
data/lib/markov_twitter/tweet_reader.rb +20 -0
data/lib/markov_twitter.rb +41 -0
data/lib/version.rb +4 -0
metadata +167 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 4cf59089020057ea5529984594a60fe7397d58f3
+  data.tar.gz: 2b9396221a1a511e0f7a2b324cadcc52595569b0
+SHA512:
+  metadata.gz: 5346cb8d500dd6cc45f58bcbb1ff0025803c0f9559c30e0dec4f5c42f1afcc5ff8a009c501706df5010f7504b185d8fac4c5b53486b1873d5dd588bae848b2a3
+  data.tar.gz: c29df5a5f48de8089e193a2f884730880b4343fa3bd4682138dcbf9fbf824aa39898d00b18966129bbf406c14746f1d293a59eef282b4cdf37a96da968190b9c

data/README.md ADDED

@@ -0,0 +1,230 @@
+# markov_twitter
+## setup: _installation_
+Either:
+```sh
+gem install markov_twitter
+```
+or add it to a Gemfile:
+```rb
+gem "markov_twitter"
+```
+After doing this, require it as usual:
+```rb
+require "markov_twitter"
+```
+## setup: _twitter integration_
+The source code of the gem (available on GitHub [here](http://github.com/maxpleaner/markov_twitter)) includes a `.env.example` file which lists two environment variables. Both of them need to be changed to the values provided by Twitter. To get these credentials, create an application on the Twitter developer console. Then create a file identical to `.env.example` but named `.env` in the root of your project, and add the credentials there. Finally, add the [dotenv](https://github.com/bkeepers/dotenv) gem, require it, and call `Dotenv.load` before reading the credentials.
+The two environment variables that are needed are `TWITTER_API_KEY` and `TWITTER_SECRET_KEY`. They can alternatively be set on a per-invocation basis using the [env](https://ss64.com/bash/env.html) command in bash, e.g.:
+```sh
+env TWITTER_API_KEY=foo TWITTER_SECRET_KEY=bar ruby script.rb
+```
+Note that neither the callback URL nor any of the OAuth settings on the Twitter dev console is needed. This gem requires only [application-only authentication](https://developer.twitter.com/en/docs/basics/authentication/overview/application-only).
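As a minimal sketch of the credential setup described above, the snippet below reads both variables and fails early with a readable message if either one is missing. The helper name `fetch_twitter_credentials` and the constant `REQUIRED_VARS` are hypothetical and are not part of the gem; the environment is passed in as a parameter so the helper can be tried without real credentials.

```ruby
# Hypothetical helper (not part of markov_twitter): reads the two
# credentials from an ENV-like object and raises if either is missing.
REQUIRED_VARS = %w[TWITTER_API_KEY TWITTER_SECRET_KEY].freeze

def fetch_twitter_credentials(env = ENV)
  # Treat unset and empty values the same way.
  missing = REQUIRED_VARS.reject { |name| env[name] && !env[name].empty? }
  unless missing.empty?
    raise KeyError, "missing environment variables: #{missing.join(', ')}"
  end
  { api_key: env["TWITTER_API_KEY"], secret_key: env["TWITTER_SECRET_KEY"] }
end
```

The returned hash uses the same keyword names that `MarkovTwitter::Authenticator.new` accepts, so under that assumption it could be splatted into the constructor with `**`.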
+## usage: _TweetReader_
+First, initialize a [MarkovTwitter::Authenticator](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/Authenticator):
+```rb
+authenticator = MarkovTwitter::Authenticator.new(
+  api_key: ENV.fetch("TWITTER_API_KEY"),
+  secret_key: ENV.fetch("TWITTER_SECRET_KEY")
+)
+```
+Then initialize [MarkovTwitter::TweetReader](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/TweetReader):
+```rb
+tweet_reader = MarkovTwitter::TweetReader.new(
+  client: authenticator.client
+)
+```
+Lastly, fetch some tweets for an arbitrary username. Note that the [get_tweets](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/TweetReader:get_tweets) method returns only the 20 most recent tweets. This gem doesn't have a way to fetch more tweets than that.
+```rb
+tweets = tweet_reader.get_tweets(username: "@accidental575")
+puts tweets.map(&:text).first # the newest
+# => "Jets fan who stands for /\nnational anthem sits on /\nAmerican flag /\n#accidentalhaiku by @Deadspin \nhttps://t.co/INsLlMB31G"
+```
+## usage: _MarkovBuilder_
+[MarkovTwitter::MarkovBuilder](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder) takes the list of tweet strings in its initializer:
+```rb
+chain = MarkovTwitter::MarkovBuilder.new(
+  phrases: tweets.map(&:text)
+)
+```
+It internally stores the words in a [#nodes](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:nodes) dict where keys are strings and values are [Node](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node) instances. A Node is created from each whitespace-separated entity. Punctuation is treated like any other non-whitespace character.
+The linkages between words are automatically created ([Node#linkages](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:linkages)) and it's possible to evaluate the chain right away, producing a randomly generated sentence. There are three built-in methods to evaluate the chain, but more can be constructed using lower-level methods. There are two ways these methods differ:
+1. Do they build the result by walking along the :next or :prev nodes (forward or backward)?
+2. How do they pick the first node, and how do they choose a node when there are no more linkages along the given direction (:prev or :next)?
+Here are those three methods:
+1. [evaluate](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:evaluate)
+    - traverses rightward along :next
+    - when starting or stuck, picks any random word
+    ```rb
+    5.times.map  { chain.evaluate length: 10 }
+    # => [
+    # "by @FlayrahNews https://t.co/LbxzPQ5Zqv back. / together with dung! / American",
+    # "thought/ #accidentalhaiku by @news_24_365 https://t.co/kkfz5S3Kut pumpkin / Wes Anderson's Isle",
+    # "has been in a lot about / #accidentalhaiku by @UrbanLion_",
+    # "them, my boyfriend used my friends. Or as / #accidentalhaiku",
+    # "25 years... / feeling it today. / to write /"
+    # ]
+    ```
+2. [evaluate_favoring_end](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:evaluate_favoring_end)
+    - traverses leftward along :prev
+    - when starting or stuck, picks a word that was at the end of one of the original phrases.
+    - reverses the result before returning
+    ```rb
+    5.times.map  { chain.evaluate_favoring_end length: 10 }
+    # => [
+    # "revolution / to improve care, / #accidentalhaiku by @Deadspin https://t.co/INsLlMB31G",
+    # "to save the songs you thought/ #accidentalhaiku by @Mary_Mulan https://t.co/ixw2EQamHq",
+    # "adventure / together with dung! / #accidentalhaiku by @Deadspin https://t.co/INsLlMB31G",
+    # "harder / for / creativity? / #accidentalhaiku by @AlbertBrooks https://t.co/DzXbGeYh0Z",
+    # "/ Asking for 25 years... / #accidentalhaiku by @StratfordON https://t.co/k81u693AbV"
+    # ]
+    ```
+3. [evaluate_favoring_start](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:evaluate_favoring_start)
+    - traverses rightward along :next
+    - when starting or stuck, picks a word that was at the start of one of the original phrases.
+    ```rb
+    5.times.map { chain.evaluate_favoring_start length: 10 }
+    # => [
+    # "RT if you listened to / to get lost /",
+    # "Jets fan who stands for / #accidentalhaiku by @theloniousdev https://t.co/6Rb5F8XySy   # ",
+    # "The first trailer for / and never come back.    # /",
+    # "Zooey Deschanel / and never come back. / house in   # ",
+    # "Oh my friends. Or as / #accidentalhaiku by @timkaine https://t.co/4pgknpmom5   # "
+    # ]
+    ```
+Note that it is possible to manually change the lists of start nodes and end nodes using [MarkovBuilder#start_nodes](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:start_nodes) and [MarkovBuilder#end_nodes](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:end_nodes).
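The start-or-stuck behaviour described in this section can be sketched without the gem. The example below is a standalone illustration, not the gem's implementation: plain hashes stand in for Node objects, and the names `LINKS`, `pick_next` and `walk` are hypothetical. The `finder` lambda plays the role of a node finder, here accepting only the word "the", which is roughly how `evaluate_favoring_start` behaves if "the" were the only start word.

```ruby
# Standalone sketch of a forward walk. Keys are words, values map
# next-word => probability.
LINKS = {
  "the"   => { "rain" => 0.5, "songs" => 0.5 },
  "rain"  => { "/" => 1.0 },
  "songs" => { "/" => 1.0 },
  "/"     => {} # dead end: forces a restart
}.freeze

# Picks a next word by walking the cumulative probabilities.
def pick_next(options, rng)
  roll = rng.rand
  offset = 0.0
  options.each do |word, probability|
    offset += probability
    return word if roll < offset
  end
  nil # no linkages in this direction: the caller restarts
end

# Collects `length` words, restarting from a word accepted by `finder`
# whenever there is no current word (at the start, or when stuck).
def walk(links, length:, finder:, rng: Random.new)
  current = nil
  Array.new(length) do
    current ||= links.keys.shuffle(random: rng).find(&finder)
    word = current
    current = pick_next(links.fetch(word), rng)
    word
  end
end

puts walk(LINKS, length: 6, finder: ->(word) { word == "the" }).join(" ")
```

With this data every restart lands on "the", so the printed phrase always has "the" in the first and fourth positions, while the second and fifth words vary between runs.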
+## advanced usage: _custom evaluator_
+The three previously mentioned methods all use [_evaluate](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:_evaluate) under the hood. This method supports any combination of the following keyword args (all except root_node and probability_bounds are required).
+- **length**
+   number of nodes in the result
+- **direction**
+  :next or :prev
+- **root_node**
+  the node to use at the beginning
+- **probability_bounds**
+  _Array<Int1,Int2>_ where _0 <= Int1 <= Int2 <= 100_
+  This is essentially used to "stack the dice", so to speak. Internally, smaller probabilities are checked first. So if A has 50% likelihood and B/C/D/E/F each have 10% likelihood, then picking one of B/C/D/E/F can be guaranteed by using [0,50] as probability_bounds. This 'stacked' probability is applied any time the program chooses a :next or :prev option.
+- **node_finder**
+  A lambda which gets run when the evaluator is starting or stuck. It gets passed random nodes one by one. The first node for which the lambda returns a truthy value is used.
+Note that [_evaluate](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:_evaluate) returns nodes and so the values must be manually fetched and joined. Here's an example of providing a custom node_finder lambda so that all phrases in the result start with "the":
+```rb
+5.times.map do
+  nodes = chain._evaluate(
+    direction: :next,
+    length: 10,
+    node_finder: -> (node) {
+      node.value.downcase == "the"
+    }
+  )
+  nodes.map(&:value).join " "
+end
+# => [
+# "the rain / #accidentalhaiku by @theloniousdev https://t.co/6Rb5F8XySy The first trailer",
+# "The first trailer for / #accidentalhaiku by @shiku___ https://t.co/ZutjdsopAo the",
+# "the songs you thought/ #accidentalhaiku by @Mary_Mulan https://t.co/ixw2EQamHq The first",
+# "The first trailer for / #accidentalhaiku by @UrbanLion_ https://t.co/bvM6eeXGj5 The",
+# "the rain / and start / I THOUGHT MY BOYFRIEND"
+# ]
+```
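The probability_bounds example from this section (A at 50%, B through F at 10% each, bounds of [0,50]) can be checked with a small standalone sketch. It follows the same idea as the description above (sort by ascending probability, then walk the cumulative ranges) but it is not the gem's code; `LINKAGES`, `pick_for_roll` and `pick_within_bounds` are hypothetical names, and the sketch assumes the lower bound is strictly less than the upper bound.

```ruby
# Standalone sketch of bounded selection; a plain hash stands in for
# one direction of a node's linkages.
LINKAGES = {
  "A" => 0.5, "B" => 0.1, "C" => 0.1, "D" => 0.1, "E" => 0.1, "F" => 0.1
}.freeze

# Returns the key whose cumulative range contains `roll` (0.0..1.0),
# checking smaller probabilities first.
def pick_for_roll(linkages, roll)
  offset = 0.0
  sorted = linkages.sort_by { |_key, probability| probability }
  match = sorted.find do |_key, probability|
    hit = roll.between?(offset, offset + probability)
    offset += probability
    hit
  end
  match&.first
end

# Draws an integer roll inside the bounds, then scales it to 0.0..1.0.
# Assumes low < high.
def pick_within_bounds(linkages, bounds = [0, 100], rng = Random.new)
  low, high = bounds
  pick_for_roll(linkages, (rng.rand(high - low) + low) * 0.01)
end

p pick_within_bounds(LINKAGES, [0, 50])
```

Because the five 10% options occupy the cumulative range from 0 to 0.5 and every roll drawn from [0,50] is below 0.5, "A" is never selected with these bounds.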
+## advanced usage: _linkage manipulation_
+There are manipulations available at the [Node](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node) level (accessible through the [MarkovBuilder#nodes](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder:nodes) dict). Keep in mind that there is only a single Node for each unique string. There can be many references to it from other nodes' linkages, but since there is still only a single object, each unique string only has a single set of :next and :prev linkages emanating from it.
+Although the core linkage data is accessible in [Node#linkages](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:linkages) and [Node#total_num_inputs](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:total_num_inputs), they should not be manipulated directly via these references. Rather, use one of the following methods, which automatically keep the :next and :prev probabilities mirrored and ensure that the probabilities sum to 1. That is to say, if I add _node1_ as the :next linkage of _node2_, then _node1_ will have its :prev probabilities balanced and _node2_ will have its :next probabilities balanced.
+1. [#add_next_linkage(child_node)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:add_next_linkage)
+  adds a linkage in the :next direction or increases its likelihood
+2. [#add_prev_linkage(parent_node)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:add_prev_linkage)
+  adds a linkage in the :prev direction or increases its likelihood
+3. [#remove_next_linkage(child_node)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:remove_next_linkage)
+  removes a linkage in the :next direction or decreases its likelihood
+4. [#remove_prev_linkage(parent_node)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:remove_prev_linkage)
+  removes a linkage in the :prev direction or decreases its likelihood
+5. [#add_linkage!(direction, other_node, probability)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:add_linkage!)
+  Force-sets the probability of a linkage. Adjusts the other probabilities so they still sum to 1.
+6. [#remove_linkage!(direction, other_node)](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:remove_linkage!)
+  Completely removes a linkage as an option. Adjusts other probabilities so they still sum to 1.
+All of these methods can be safely run many times. Note that `remove_next_linkage` and `remove_prev_linkage` do _not_ completely remove the node from the list of options. They just decrement its probability by an amount determined by [Node#total_num_inputs](http://rubydoc.info/gems/markov_twitter/MarkovTwitter/MarkovBuilder/Node:total_num_inputs).
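To make the "probabilities sum to 1" guarantee concrete, here is a standalone sketch of count-based rebalancing when a linkage is added. It uses the same arithmetic idea as an incremental average (scale the old values by (n-1)/n, then add 1/n to the new observation) and is not the gem's own class; `LinkageTable` is a hypothetical name, with a plain hash and counter standing in for one direction of `Node#linkages` and `Node#total_num_inputs`.

```ruby
# Standalone sketch of rebalancing after each added observation.
class LinkageTable
  attr_reader :probabilities, :total

  def initialize
    @probabilities = Hash.new(0.0)
    @total = 0
  end

  # Records one more observation of `key` and rescales the existing
  # probabilities so that everything keeps summing to 1.
  def add(key)
    @total += 1
    unit = 1.0 / @total
    @probabilities.each_key { |k| @probabilities[k] *= (@total - 1) * unit }
    @probabilities[key] += unit
    self
  end
end

table = LinkageTable.new
%w[b c b].each { |word| table.add(word) }
p table.probabilities
```

After the three additions, "b" has been seen twice and "c" once, so the probabilities come out at roughly two thirds and one third (up to floating-point rounding).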
+## development: code organization
+The gem boilerplate was scaffolded using a gem I made, [gemmyrb](http://github.com/maxpleaner/gemmyrb).
+Test scripts are in the [spec/](http://github.com/maxpleaner/markov_twitter/tree/master/spec) folder, although some helper methods are written into the application code at [lib/markov_twitter/test_helper_methods.rb](http://github.com/maxpleaner/markov_twitter/tree/master/lib/markov_twitter/test_helper_methods.rb).
+The application code is in [lib/](http://github.com/maxpleaner/markov_twitter/tree/lib).
+Documentation is built with [yard](https://github.com/lsegal/yard) into [doc/](http://github.com/maxpleaner/markov_twitter/tree/master/doc) - it's viewable [on rubydoc](http://rubydoc.info/gems/markov_twitter).
+## development: tests
+To run the tests, install markov_twitter with the development dependencies:
+```sh
+gem install markov_twitter --development
+```
+Then run `rspec` in the root of the repo.
+There are 40 test cases at time of writing.
+By default, Webmock will prevent any real HTTP calls for the twitter-related tests, but this can be disabled (and real Twitter data used) by running the test suite with an environment variable:
+```sh
+env DISABLE_WEBMOCK=true rspec
+```
+## development: building docs
+Docs are built with `yard` from the command line. Documentation coverage is 100% at time of writing. If the build reports that something is undocumented, run `yard --list-undoc` to find out where it is.
+## development: todos
+Things which would be interesting to add:
+- dictionary-based search and replace
+- part-of-speech-based search and replace

data/bin/console ADDED

@@ -0,0 +1,25 @@
+#!/usr/bin/env ruby
+require 'pry'
+require 'dotenv'
+require 'markov_twitter'
+Dotenv.load
+authenticator = MarkovTwitter::Authenticator.new(
+  api_key: ENV.fetch("TWITTER_API_KEY"),
+  secret_key: ENV.fetch("TWITTER_SECRET_KEY")
+)
+tweet_reader = MarkovTwitter::TweetReader.new(
+  client: authenticator.client
+)
+tweets = tweet_reader.get_tweets(username: "@accidental575")
+chain = MarkovTwitter::MarkovBuilder.new(
+  phrases: tweets.map(&:text)
+)
+Pry.start

data/bin/markov_twitter ADDED

@@ -0,0 +1,8 @@
+#!/usr/bin/env ruby
+require 'markov_twitter'
+class MarkovTwitter::CLI < Thor
+  # Nothing here yet
+end
+MarkovTwitter::CLI.start ARGV

data/lib/markov_twitter/authenticator.rb ADDED

@@ -0,0 +1,16 @@
+# Wrapper for the twitter gem's client.
+class MarkovTwitter::Authenticator
+  # @return [Twitter::REST::Client]
+  attr_reader :client
+  # @param api_key [String] should be stored in ENV var.
+  # @param secret_key [String] should be stored in ENV var.
+  def initialize(api_key:, secret_key:)
+    @client = Twitter::REST::Client.new do |config|
+      config.consumer_key        = api_key
+      config.consumer_secret     = secret_key
+    end
+  end
+end

data/lib/markov_twitter/markov_builder/node.rb ADDED

@@ -0,0 +1,192 @@
+class MarkovTwitter::MarkovBuilder
+  # Represents a single node in a Markov chain.
+  class Node
+    # @return [String] a single token, such as a word.
+    attr_reader :value
+    # @return [Hash<Symbol, Hash<String, Float>>]
+    #   the :next and :prev linkages.
+    #   - Outer hash is keyed by the direction (:next, :prev).
+    #   - Inner hash represents possible traversals -
+    #     also keyed by string value, its values are probabilities
+    #     representing the likelihood of choosing that route.
+    attr_accessor :linkages
+    # @return [Hash<Symbol, Integer>]
+    #   the total number of inputs added in each direction.
+    #   - also used to re-calculate probabilities.
+    attr_accessor :total_num_inputs
+    # @return [Hash<String,Node>]
+    #   a reference to the attr of the parent MarkovBuilder
+    attr_reader :nodes
+    # @param value [String] for example, a word.
+    # @param nodes [Hash<String,Node>].
+    def initialize(value:, nodes:)
+      @value = value
+      @linkages = { next: Hash.new(0), prev: Hash.new(0) }
+      @total_num_inputs = { next: 0, prev: 0 }
+      @nodes = nodes
+    end
+    # Adds a single node to the linkages in the given direction and updates probabilities.
+    # Also updates the opposite direction,
+    # e.g. :prev will be updated if something is added to :next.
+    # @param direction [Symbol] either :next or :prev.
+    # @param other_node [Node]
+    # @param mirror_change [Boolean]
+    #   whether to update the opposite direction, defaults to true.
+    # @return [void]
+    def add_and_adjust_probabilities(direction, other_node, mirror_change=true)
+      @nodes[other_node.value] ||= other_node
+      total_num_inputs[direction] += 1
+      unit = get_probability_unit(direction)
+      probability_multiplier = (total_num_inputs[direction] - 1) * unit
+      linkages[direction].each_key do |node_key|
+        linkages[direction][node_key] *= probability_multiplier
+      end
+      linkages[direction][other_node.value] += unit
+      # Add a linkage in the opposite direction to keep :next and :prev in sync
+      update_opposite_direction(direction, other_node, __method__) if mirror_change
+    end
+    # Determines the weight of a single insertion by looking up the total
+    # number of insertions in that direction.
+    # @param direction [Symbol] :next or :prev
+    # @return [Float] between 0 and 1.
+    def get_probability_unit(direction)
+      unless total_num_inputs[direction] > 0
+        raise ArgumentError, "no inputs were added in #{direction}"
+      end
+      1.0 / total_num_inputs[direction]
+    end
+    # Removes a single node from the linkages in the given direction and updates probabilities.
+    # Safe to run if the other_node is not actually present in the linkages.
+    # Mirrors the change in the opposite direction, to keep :prev and :next in sync.
+    # @param direction [Symbol] either :next or :prev
+    # @param other_node [Node] the node to be removed.
+    # @param mirror_change [Boolean]
+    #   whether to update the opposite direction, defaults to true
+    # @return [void]
+    def remove_and_adjust_probabilities(direction, other_node, mirror_change=true)
+      return unless linkages[direction].has_key? other_node.value
+      unit = get_probability_unit(direction)
+      if linkages[direction][other_node.value] - unit <= 0
+        delete_linkage!(direction, other_node)
+      else
+        linkages[direction][other_node.value] -= unit
+        num_per_direction = total_num_inputs[direction]
+        linkages[direction].each_key do |node_key|
+          linkages[direction][node_key] *= (
+            num_per_direction / (num_per_direction - 1.0)
+          )
+        end
+        total_num_inputs[direction] -= 1
+      end
+      # Remove the linkage in the opposite direction to keep :next and :prev in sync
+      update_opposite_direction(direction, other_node, __method__) if mirror_change
+    end
+    # Force-removes a linkage, re-adjusting other probabilities
+    # but potentially breaking their proportionality.
+    # Can be safely run for non-existing nodes.
+    # Adjusts the linkages in the opposite direction accordingly.
+    # @param direction [Symbol]
+    # @param other_node [Node]
+    # @param mirror_change [Boolean]
+    #   whether to update the opposite direction, defaults to true
+    # @return [void]
+    def delete_linkage!(direction, other_node, mirror_change=true)
+      return unless linkages[direction].has_key? other_node.value
+      probability = linkages[direction][other_node.value]
+      # delete the linkage
+      linkages[direction].delete other_node.value
+      # distribute the probability evenly among the other options.
+      amt_to_add = probability / linkages[direction].keys.length
+      linkages[direction].each_key do |key|
+        linkages[direction][key] += amt_to_add
+      end
+      # decrement the total count
+      total_num_inputs[direction] -= 1
+      # Remove the linkage in the opposite direction to keep :next and :prev in sync
+      update_opposite_direction(direction, other_node, __method__) if mirror_change
+    end
+    # Force-adds a linkage at a specific probability.
+    # Readjusts other probabilities but may break their proportionality.
+    # Updates the opposite direction to keep :next and :prev in sync
+    # @param direction [Symbol]
+    # @param other_node [Node]
+    # @param probability [Float] between 0 and 1.
+    # @param mirror_change [Boolean] whether to update the opposite direction.
+    # @return [void]
+    def add_linkage!(direction, other_node, probability, mirror_change=true)
+      raise ArgumentError, "invalid probability" unless probability.between?(0,1)
+      # first remove any existing node there and distribute the probability.
+      delete_linkage!(direction, other_node)
+      # Re-adjust each probability to account for the added value
+      linkages[direction].each_key do |key|
+        linkages[direction][key] *= (1 - probability)
+        # remove the linkage if its probability is zero
+        if linkages[direction][key].zero?
+          delete_linkage!(direction, @nodes[key])
+        end
+      end
+      # Add the new value and set its probability
+      linkages[direction][other_node.value] = probability
+      # increment the total count
+      total_num_inputs[direction] += 1
+      # Add a linkage in the opposite direction to keep :next and :prev in sync
+      if mirror_change
+        update_opposite_direction(direction, other_node, __method__, probability)
+      end
+    end
+    # Calls given method_name on other_node, passing the opposite direction
+    # to the one given as an argument. This keeps :next and :prev in sync.
+    # @param direction [Symbol]
+    #   something should have already been added/removed here
+    # @param other_node [Node] the node which was added/removed
+    # @param method_name [Symbol] the method to invoke on other_node
+    # @return [void]
+    def update_opposite_direction(direction, other_node, method_name, *other_args)
+      other_direction = (%i{next prev} - [direction])[0]
+      other_node.send(method_name, other_direction, self, *other_args, false)
+    end
+    # Adds another node to the :next linkages, updating probabilities.
+    # @param child_node [Node] to be added.
+    # @return [void]
+    def add_next_linkage(child_node, mirror_change=true)
+      add_and_adjust_probabilities(:next, child_node, mirror_change)
+    end
+    # Adds another node to the :prev linkages, updating probabilities.
+    # @param parent_node [Node] to be added.
+    # @return [void]
+    def add_prev_linkage(parent_node, mirror_change=true)
+      add_and_adjust_probabilities(:prev, parent_node, mirror_change)
+    end
+    # Removes a node from the :next linkages, updating probabilities.
+    # @param child_node [Node] to be removed.
+    # @return [void]
+    def remove_next_linkage(child_node, mirror_change=true)
+      remove_and_adjust_probabilities(:next, child_node, mirror_change)
+    end
+    # Removes a node from the :prev linkages, updating probabilities.
+    # @param parent_node [Node] to be removed.
+    # @return [void]
+    def remove_prev_linkage(parent_node, mirror_change=true)
+      remove_and_adjust_probabilities(:prev, parent_node, mirror_change)
+    end
+  end
+end
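The removal logic in node.rb above has two paths: decrement-and-rescale, or drop the key entirely when its last observation goes away. The following standalone sketch illustrates both under simplifying assumptions. It works on a copy of a plain hash, ignores the mirroring to the opposite direction, and `remove_observation` is a hypothetical name rather than a method of the gem.

```ruby
# Standalone sketch of removing one observation. `probabilities` and
# `total` stand in for one direction of a node's linkages and input count.
# Returns the updated copy and the new total.
def remove_observation(probabilities, total, key)
  return [probabilities, total] unless probabilities.key?(key)

  updated = probabilities.dup
  unit = 1.0 / total
  if updated[key] - unit <= 0
    # Last observation of this key: drop it and share its probability
    # evenly among the remaining keys.
    freed = updated.delete(key)
    share = updated.empty? ? 0.0 : freed / updated.size
    updated.each_key { |k| updated[k] += share }
  else
    # Take one unit away, then rescale everything to sum to 1 again.
    updated[key] -= unit
    scale = total / (total - 1.0)
    updated.each_key { |k| updated[k] *= scale }
  end
  [updated, total - 1]
end

probs, total = remove_observation({ "b" => 2.0 / 3, "c" => 1.0 / 3 }, 3, "b")
p probs, total
```

Starting from "b" seen twice and "c" seen once, removing one "b" leaves both at roughly one half with a total of 2, which matches what two single observations would have produced.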

data/lib/markov_twitter/markov_builder.rb ADDED

@@ -0,0 +1,236 @@
+# Builds a Markov chain from phrases passed as input.
+# A "phrase" is defined here as a tweet.
+class MarkovTwitter::MarkovBuilder
+  # Regex used to split the phrase into tokens.
+  # It splits on any number of whitespace characters in sequence.
+  # Sequences of punctuation characters are treated like any other word.
+  SeparatorCharacterRegex = /\s+/
+  # The base dictionary for nodes.
+  # There is only a single copy of each node created,
+  # although they are referenced in Node#linkages as well.
+  # @return [Hash<String, Node>]
+  attr_reader :nodes
+  # The values of the nodes that were found at the start of phrases
+  # @return [Set<String>]
+  attr_reader :start_nodes
+  # The values of the nodes that were found at the end of phrases
+  # @return [Set<String>]
+  attr_reader :end_nodes
+  # lambdas which can be used during evaluation to find the first node,
+  # or the next node when "stuck" (meaning there is no :next/:prev node).
+  # @return [Hash<Symbol, Proc>]
+  #   each lambda takes a Node and should return true if it is suitable.
+  def node_finders
+    @node_finders ||= {
+      random:      -> (node) { true },
+      favor_start: -> (node) { start_nodes.include? node.value },
+      favor_end:   -> (node) { end_nodes.include? node.value },
+    }
+  end
+  # Splits a phrase into tokens.
+  # @param phrase [String]
+  # @return [Array<String>]
+  def self.split_phrase(phrase)
+    phrase.split(SeparatorCharacterRegex)
+  end
+  # @param phrases [Array<String>] e.g. sentences or tweets.
+  # processes the phrases to populate @nodes.
+  def initialize(phrases: [])
+    @nodes = {}
+    @start_nodes = Set.new
+    @end_nodes = Set.new
+    phrases.each &method(:process_phrase)
+  end
+  # Splits a phrase into tokens, adds them to @nodes, and creates linkages.
+  # @param phrase [String] e.g. a sentence or tweet.
+  # @return [void]
+  def process_phrase(phrase)
+    node_vals = self.class.split_phrase(phrase)
+    last_node = nil
+    node_vals.length.times do |i|
+      nodes = node_vals[i..(i+1)].compact.map do |node_val|
+        construct_node(node_val)
+      end
+      @start_nodes.add(nodes[0].value) if i == 0
+      last_node = nodes.last
+      add_nodes(*nodes)
+    end
+    @end_nodes.add last_node.value if last_node
+  end
+  # Adds a sequence of two tokens to @nodes and creates linkages.
+  # If node2 is nil, it won't be added and linkages won't be created.
+  # @param node1 [Node]
+  # @param node2 [Node]
+  # @return [void]
+  def add_nodes(node1, node2=nil)
+    unless node1.is_a?(Node)
+      raise ArgumentError, "first arg passed to add_nodes is not a Node"
+    end
+    @nodes[node1.value] ||= node1
+    if node2
+      @nodes[node2.value] ||= node2
+      add_linkages(*@nodes.values_at(*[node1,node2].map(&:value)))
+    end
+  end
+  # Builds a single node which contains a reference to @nodes.
+  # Note that this does not do the inverse (it doesn't add the node to @nodes)
+  # @param value [String]
+  # @return [Node]
+  def construct_node(value)
+    Node.new(value: value, nodes: @nodes)
+  end
+  # Adds bidirectional linkages between two nodes.
+  # the Node class re-calculates the probabilities internally
+  # and mirrors the change on :prev.
+  # @param node1 [Node] the parent.
+  # @param node2 [Node] the child.
+  # @return [void]
+  def add_linkages(node1, node2)
+    node1.add_next_linkage(node2, true)
+  end
+  # The default evaluation method to produce a run case.
+  # Goes in forward direction with random nodes as start points.
+  # See also #evaluate_favoring_start and #evaluate_favoring_end.
+  # See #_evaluate for paramspec.
+  # The passed node_finder lambda picks a totally random new node.
+  # @return [String] the result of #_evaluate joined by whitespace.
+  def evaluate(length:, probability_bounds: [0,100], root_node: nil)
+    _evaluate(
+      length: length,
+      probability_bounds: probability_bounds,
+      root_node: root_node,
+      direction: :next,
+      node_finder: node_finders[:random]
+    ).map(&:value).join(" ")
+  end
+  # See #_evaluate for paramspec.
+  # The passed node_finder lambda picks a node whose value is in @start_nodes.
+  # An error is raised if no nodes match this condition.
+  # @return [String] the result of #_evaluate joined by whitespace.
+  def evaluate_favoring_start(length:, probability_bounds: [0,100], root_node: nil)
+    node_finder = node_finders[:favor_start]
+    has_possible_start_node = nodes.values.any? &node_finder
+    unless has_possible_start_node
+      raise ArgumentError, "@start_nodes is empty; can't evaluate favoring start"
+    end
+    _evaluate(
+      length: length,
+      probability_bounds: probability_bounds,
+      root_node: root_node,
+      direction: :next,
+      node_finder: node_finder
+    ).map(&:value).join(" ")
+  end
+  # See #_evaluate for paramspec.
+  # The passed node_finder lambda picks a node whose value is in @end_nodes.
+  # An error is raised if no nodes match this condition.
+  # @return [String] the result of #_evaluate reversed and joined by whitespace.
+  def evaluate_favoring_end(length:, probability_bounds: [0,100], root_node: nil)
+    node_finder = node_finders[:favor_end]
+    has_possible_end_node = nodes.values.any? &node_finder
+    unless has_possible_end_node
+      raise ArgumentError, "@end_nodes is empty; can't evaluate favoring end"
+    end
+    _evaluate(
+      length: length,
+      probability_bounds: probability_bounds,
+      root_node: root_node,
+      direction: :prev,
+      node_finder: node_finder
+    ).map(&:value).reverse.join(" ")
+  end
+  # An "evaluation" of the markov chain. e.g. a run case.
+  # Passes random values through the probability sequences.
+  # @param length [Integer] the number of tokens in the result.
+  # @param probability_bounds [Array<Integer, Integer>]
+  #   optional, can limit the probability to a range where
+  #   0 <= min <= result <= max <= 100.
+  # @param node_finder [Lambda<Node>]
+  #   during iteration, if the current node has no linkages in <direction>,
+  #   a new node is selected from the nodes dict. The first randomly-picked
+  #   node which this lambda returns a truthy value for is selected.
+  # @return [Array<Node>] the result tokens in order.
+  def _evaluate(
+    length:,
+    probability_bounds: [0,100],
+    root_node: nil,
+    direction:,
+    node_finder:
+  )
+    length.times.reduce([]) do |result_nodes|
+      root_node ||= get_new_start_point(node_finder)
+      result_nodes.push root_node
+      root_node = pick_linkage(
+        root_node.linkages[direction],
+        probability_bounds,
+      )
+      result_nodes
+    end
+  end
+  # Gets a random node as a potential start point.
+  # @param node_finder [lambda<Node>]
+  #   any returned node will return a truthy value from this.
+  # @return [Node] or nil if one couldn't be found.
+  def get_new_start_point(node_finder)
+    nodes.values.shuffle.find(&node_finder)
+  end
+  # Validates the given probability bounds.
+  # @param bounds [Array<Integer, Integer>]
+  # @raise [ArgumentError] if the bounds are invalid
+  # @return [void]
+  def check_probability_bounds(bounds)
+    bounds1, bounds2 = bounds
+    bounds_diff = bounds2 - bounds1
+    if (
+      (bounds_diff < 0) || (bounds_diff > 100) ||
+      (bounds1 < 0) || (bounds2 > 100)
+    )
+      raise ArgumentError, "wasn't given 0 <= bounds1 <= bounds2 <= 100"
+    end
+  end
+  # Given "linkages" which includes all possible node traversals in
+  # a predetermined direction, pick one based on their probabilities.
+  # @param linkages [Hash<String, Float>] key=token, val=probability
+  # @param probability_bounds [Array<Integer,Integer>]
+  #   Optional, can limit the probability to a range where
+  #   0 <= min <= result <= max <= 100.
+  #   This gets divided by 100 before being compared to the linkage values.
+  #
+  # @return [Node] or nil if one couldn't be found.
+  def pick_linkage(linkages, probability_bounds=[0,100])
+    check_probability_bounds(probability_bounds)
+    bounds1, bounds2 = probability_bounds
+    # pick a random number between the bounds.
+    random_num = (rand(bounds2 - bounds1) + bounds1) * 0.01
+    # offset is the accumulation of probabilities seen during iteration.
+    offset = 0
+    # sort to lowest first
+    sorted = linkages.sort_by { |name, prob| prob }
+    # find the first linkage value that satisfies offset < N(rand) < val.
+    new_key = sorted.find do |(key, probability)|
+      # increment the offset each time.
+      random_num.between?(offset, probability + offset).tap do
+        offset += probability
+      end
+    end
+    nodes[new_key&.first]
+  end
+end
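
The cumulative-offset selection used by `pick_linkage` above can be sketched in isolation. This is a minimal illustration, not the gem's code: `pick_token` and its injectable `roll` argument are hypothetical names, and the sketch returns the token itself rather than looking up a node.

```ruby
# Hypothetical standalone helper, not part of markov_twitter.
# linkages: Hash mapping token => probability (values expected to sum to ~1.0).
# roll: a number in [0, 1); injectable so the example is deterministic.
def pick_token(linkages, roll = rand)
  offset = 0.0
  # Visit the least likely tokens first, as the original method does.
  linkages.sort_by { |_token, prob| prob }.each do |token, prob|
    # Each token owns the interval [offset, offset + prob).
    return token if roll < offset + prob
    offset += prob
  end
  nil # roll fell beyond the accumulated probabilities
end

linkages = { "cat" => 0.5, "hat" => 0.3, "bat" => 0.2 }
pick_token(linkages, 0.1) # => "bat"
pick_token(linkages, 0.4) # => "hat"
pick_token(linkages, 0.9) # => "cat"
```

Passing the random number in as an argument is a design choice made only for this sketch, so that the intervals can be checked by hand.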

data/lib/markov_twitter/test_helper_methods.rb ADDED

@@ -0,0 +1,175 @@
+# Methods which are included into test cases via refinement,
+# so they don't interfere with application code
+# and don't require a namespace.
+module MarkovTwitter::TestHelperMethods
+  # alias
+  Authenticator = MarkovTwitter::Authenticator
+  # Builds an authenticator instance with valid credentials.
+  # Will raise an error unless the expected ENV vars are defined.
+  # @return [Authenticator]
+  def build_valid_authenticator
+    Authenticator.new(
+      api_key: ENV.fetch("TWITTER_API_KEY"),
+      secret_key: ENV.fetch("TWITTER_SECRET_KEY")
+    )
+  end
+  # Builds an authenticator instance with invalid credentials.
+  # Should raise errors on subsequent operations.
+  # @return [Authenticator]
+  def build_invalid_authenticator
+    Authenticator.new(api_key: "", secret_key: "")
+  end
+  # This is a twitter user I've created that has a fixed set of tweets.
+  # It's there to make sure that fetching tweets works correctly.
+  # @return [String]
+  def get_sample_username
+    "max_p_sample"
+  end
+  # The expected tweets of the sample user, which I posted manually on Twitter.
+  # @return [Array<String>]
+  def get_sample_user_tweets
+    [
+      "don't ever change",
+      "A long-term goal of mine is to create a water-based horror game. I've done some work on building this in Unity already.",
+      "Many amazing looking animals can be kept in reasonably simple environments, but some require elaborate setups .",
+      "I enjoy creating terrariums but it's a lot of work to keep everything balanced so that all the critters survive .",
+      "Although I haven't had a cat myself, I have had aquariums, terrariums, and rodents at different points .",
+      "i have personally never owned a pet cat, and I'm a bit allergic, but I still enjoy their company .",
+      "carnivorous by nature, cats hunt many other wild animals such as gophers and mice. As a result, some people would prefer less outdoor cats .",
+      "you have now unsubscribed to cat facts. respond with UNSUBSCRIBE to unsubscribe .",
+      "egyption hairless cats are less allergenic than most other cats. they don't have hair and are probably less oily .",
+      "the cat in the hat ate and sat. it got fat and couldn't catch a rat ."
+    ]
+  end
+  # Converts strings into a specific structure that is used internally by
+  # the twitter gem. Used for stubbing with Webmock.
+  # @param strings [Array<String>]
+  # @return [Array<Hash>] where each hash has :id and :text keys.
+  def to_tweet_objects(strings)
+    strings.map { |string| { id: 0, text: string } }
+  end
+  # Sample tweets for which the content does not matter.
+  # These are only used to test the pagination of Twitter results.
+  # @return [Array<String>]
+  def get_stubbed_many_tweets_user_tweets
+    20.times.map &:to_s
+  end
+  # This user should raise an error when the twitter gem looks them up.
+  # @return [String]
+  def get_invalid_username
+    "3u9r4j8fjniecn875jdpwqk32mdiy4584vuniwcoekpd932"
+  end
+  # A twitter user which has many tweets.
+  # Used to test pagination of search results.
+  # @return [String]
+  def get_many_tweets_username
+    "SFist"
+  end
+  # Makes twitter's oauth request succeed.
+  # Returns without doing anything if DisableWebmock=true in ENV.
+  # @return [void]
+  def stub_twitter_token_request_with_valid_credentials
+    return if ENV["DisableWebmock"] == "true"
+    stub_request(
+      :post, "https://api.twitter.com/oauth2/token"
+    ).to_return(status: 200, body: "", headers: {})
+  end
+  # Makes twitter's oauth request fail.
+  # Returns without doing anything if DisableWebmock=true in ENV.
+  # @return [void]
+  def stub_twitter_token_request_with_invalid_credentials
+    return if ENV["DisableWebmock"] == "true"
+    stub_request(
+      :post, "https://api.twitter.com/oauth2/token"
+    ).to_return(status: 403, body: "", headers: {})
+  end
+  # Makes the twitter user lookup request succeed.
+  # Returns without doing anything if DisableWebmock=true in ENV.
+  # @param username [String]
+  # @return [void]
+  def stub_twitter_user_lookup_request_with_valid_username(username)
+    return if ENV["DisableWebmock"] == "true"
+    stub_request( :get,
+      "https://api.twitter.com/1.1/users/show.json?screen_name=#{username}"
+    ).to_return(status: 200, body: {id: 0}.to_json, headers: {})
+  end
+  # Makes the twitter user lookup request fail.
+  # Returns without doing anything if DisableWebmock=true in ENV.
+  # @param username [String]
+  # @return [void]
+  def stub_twitter_user_lookup_request_with_invalid_username(username)
+    return if ENV["DisableWebmock"] == "true"
+    stub_request( :get,
+      "https://api.twitter.com/1.1/users/show.json?screen_name=#{username}"
+    ).to_return(status: 404, body: {id: 0}.to_json, headers: {})
+  end
+  # Makes the twitter user timeline request succeed.
+  # Returns without doing anything if DisableWebmock=true in ENV.
+  # @param tweets_to_return [Array<Hash>]
+  #   - the hashes must have keys "id" and "text".
+  # @return [void]
+  def stub_twitter_user_timeline_request(tweets_to_return)
+    return if ENV["DisableWebmock"] == "true"
+    stub_request(
+      :get,
+      "https://api.twitter.com/1.1/statuses/user_timeline.json?user_id=0"
+    ).to_return(status: 200, body: tweets_to_return.to_json, headers: {})
+  end
+  # Validates that the linkages on a node are as expected.
+  # @param node [MarkovBuilder::Node]
+  # @param _next [Hash<String,Float>] mapping key to probability.
+  # @param prev [Hash<String,Float>] mapping key to probability.
+  # @param total_num_inputs [Hash<Symbol,Integer>] keyed by direction.
+  def validate_linkages(node, _next: nil, prev: nil, total_num_inputs: nil)
+    precision = 0.00001
+    if _next
+      node.linkages[:next].each do |name, linkage|
+        expect(linkage).to be_within(precision).of(_next[name])
+      end
+    end
+    if prev
+      node.linkages[:prev].each do |name, linkage|
+        expect(linkage).to be_within(precision).of(prev[name])
+      end
+    end
+    if total_num_inputs
+      expect(node.total_num_inputs).to eq(total_num_inputs)
+    end
+  rescue
+    binding.pry
+  end
+  # A sample phrase used to test manipulations on the markov chain.
+  # @return [String]
+  def get_sample_phrase_1
+    "the cat in the hat"
+  end
+  # A sample phrase used to test manipulations on the markov chain.
+  # @return [String]
+  def get_sample_phrase_2
+    "the bat in the flat"
+  end
+  # This module can be added with `using MarkovTwitter::TestHelperMethods`.
+  # It can also be added globally with `Object.include MarkovTwitter::TestHelperMethods`.
+  refine Object do
+    include MarkovTwitter::TestHelperMethods
+  end
+end
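
The refinement mechanism this file relies on (`refine` inside a module, activated with `using`) can be shown with a smaller example. This is a sketch only: `GreetingHelpers` and `shout` are hypothetical names, and unlike the gem it defines the method directly inside the `refine` block instead of including a module there.

```ruby
# Hypothetical module, not part of markov_twitter.
module GreetingHelpers
  # The refined method exists only where `using GreetingHelpers` is active.
  refine String do
    def shout
      upcase + "!"
    end
  end
end

# Activate the refinement for the rest of this file.
using GreetingHelpers

SHOUTED = "hello".shout # => "HELLO!"
```

Files that never call `using GreetingHelpers` do not see `String#shout`, which is the property the comment at the top of `test_helper_methods.rb` describes as not interfering with application code.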

data/lib/markov_twitter/tweet_reader.rb ADDED

@@ -0,0 +1,20 @@
+# Fetches the latest tweets.
+class MarkovTwitter::TweetReader
+  # @return [Twitter::REST::Client]
+  attr_reader :client
+  # @param client [Twitter::REST::Client]
+  def initialize(client:)
+    @client = client
+  end
+  # @param username [String] must exist on Twitter or this will raise an error
+  # @return [Array<Hash>]
+  #   - the hashes will have :text and :id keys
+  def get_tweets(username:)
+    user = client.user(username)
+    client.user_timeline(user)
+  end
+end
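
Because `TweetReader` only calls `user` and `user_timeline` on its client, it can be exercised without network access by passing a stand-in object. The sketch below copies the class from the diff so that it runs on its own; `FakeClient` is a hypothetical test double, not part of this gem or of the twitter gem, and the hash-shaped tweets follow the `:id`/`:text` keys described in the comments above.

```ruby
# Minimal copy of the class from the diff so the sketch runs standalone.
class MarkovTwitter; end

class MarkovTwitter::TweetReader
  attr_reader :client

  def initialize(client:)
    @client = client
  end

  def get_tweets(username:)
    user = client.user(username)
    client.user_timeline(user)
  end
end

# Hypothetical stand-in that answers the two calls TweetReader makes.
class FakeClient
  def initialize(tweets_by_user)
    @tweets_by_user = tweets_by_user
  end

  # Raises KeyError for unknown users, loosely mimicking a failed lookup.
  def user(username)
    @tweets_by_user.fetch(username)
    username
  end

  def user_timeline(user)
    @tweets_by_user.fetch(user)
  end
end

reader = MarkovTwitter::TweetReader.new(
  client: FakeClient.new("max_p_sample" => [{ id: 0, text: "don't ever change" }])
)
reader.get_tweets(username: "max_p_sample")
```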

data/lib/markov_twitter.rb ADDED

@@ -0,0 +1,41 @@
+###############################################################################
+#                                                                             #
+#                      /\/\   __ _ _ __| | _______   __                       #
+#                     /    \ / _` | '__| |/ / _ \ \ / /                       #
+#                    / /\/\ \ (_| | |  |   < (_) \ V /                        #
+#                    \/    \/\__,_|_|  |_|\_\___/ \_/                         #
+#                     _____           _ _   _                                 #
+#                    /__   \__      _(_) |_| |_ ___ _ __                      #
+#                      / /\/\ \ /\ / / | __| __/ _ \ '__                      #
+#                     / /    \ V  V /| | |_| ||  __/ |                        #
+#                     \/      \_/\_/ |_|\__|\__\___|_|                        #
+#                                                                             #
+###############################################################################
+# =============================================================================
+# Dependencies
+# =============================================================================
+# Using a gem to interact with twitter saves a lot of work.
+require 'twitter'
+# Extensions to Ruby core language.
+require 'active_support/all'
+# =============================================================================
+# Top level namespace.
+# =============================================================================
+class MarkovTwitter; end
+# =============================================================================
+# Individual components.
+# =============================================================================
+require "markov_twitter/tweet_reader"
+require "markov_twitter/authenticator"
+require "markov_twitter/markov_builder"
+require "markov_twitter/markov_builder/node"
+require "markov_twitter/test_helper_methods"

data/lib/version.rb ADDED

@@ -0,0 +1,4 @@
+class MarkovTwitter
+  # The version of the gem.
+  VERSION = '0.0.1'
+end

metadata ADDED

@@ -0,0 +1,167 @@
+--- !ruby/object:Gem::Specification
+name: markov_twitter
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+platform: ruby
+authors:
+- max pleaner
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2017-10-12 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: thor
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: twitter
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: activesupport
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry-byebug
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: dotenv
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: webmock
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: yard
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+description: ''
+email: maxpleaner@gmail.com
+executables:
+- console
+- markov_twitter
+extensions: []
+extra_rdoc_files: []
+files:
+- README.md
+- bin/console
+- bin/markov_twitter
+- lib/markov_twitter.rb
+- lib/markov_twitter/authenticator.rb
+- lib/markov_twitter/markov_builder.rb
+- lib/markov_twitter/markov_builder/node.rb
+- lib/markov_twitter/test_helper_methods.rb
+- lib/markov_twitter/tweet_reader.rb
+- lib/version.rb
+homepage: http://github.com/maxpleaner/markov_twitter
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - "~>"
+    - !ruby/object:Gem::Version
+      version: '2.3'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: 2.6.13
+requirements: []
+rubyforge_project:
+rubygems_version: 2.6.13
+signing_key:
+specification_version: 4
+summary: markov chains from twitter posts
+test_files: []
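
The `required_ruby_version` entry above uses the pessimistic operator `~>`. As a small sketch of what `~> 2.3` admits, the constraint can be evaluated with RubyGems' own classes; `RUBY_CONSTRAINT` and `ruby_version_allowed?` are hypothetical names introduced for this illustration.

```ruby
require "rubygems"

# The constraint string is taken from the metadata above.
RUBY_CONSTRAINT = Gem::Requirement.new("~> 2.3")

# Hypothetical helper: true if the given Ruby version satisfies the constraint.
def ruby_version_allowed?(version_string)
  RUBY_CONSTRAINT.satisfied_by?(Gem::Version.new(version_string))
end

ruby_version_allowed?("2.3.0") # => true
ruby_version_allowed?("2.7.8") # => true
ruby_version_allowed?("3.0.0") # => false
```

In other words, the constraint is equivalent to `>= 2.3, < 3.0`, so this version of the gem declares itself incompatible with Ruby 3.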