rbbt 1.1.0 → 1.1.1
- data/README.rdoc +6 -6
- data/bin/rbbt_config +11 -11
- metadata +1 -1
    
        data/README.rdoc
    CHANGED
    
    | @@ -79,8 +79,8 @@ Install the gem normally <tt>gem install rbbt</tt>. The gem includes a configura | |
| 79 79 |  | 
| 80 80 | 
             
            === Using rbbt to translate identifiers
         | 
| 81 81 |  | 
| 82 | 
            -
            1. Do <tt>rbbt_config  | 
| 83 | 
            -
            3. Now you may do <tt>rbbt_config  | 
| 82 | 
            +
            1. Do <tt>rbbt_config prepare identifiers</tt> to deploy the configuration files and download entrez data; this needs to be done just once.
         | 
| 83 | 
            +
            3. Now you may do <tt>rbbt_config install organisms</tt> to process all the organisms, or <tt>rbbt_config install organisms -o sgd</tt> to process only yeast (sgd).
         | 
| 84 84 | 
             
            4. You may now use a script like this to translate gene identifiers from yeast fed from the standard input
         | 
| 85 85 | 
             
              require 'rbbt/sources/organism'
         | 
| 86 86 |  | 
| @@ -92,12 +92,12 @@ Install the gem normally <tt>gem install rbbt</tt>. The gem includes a configura | |
| 92 92 |  | 
| 93 93 | 
             
            First prepare the organisms as you did in the previous section. Next, if you want to use the default NER module:
         | 
| 94 94 |  | 
| 95 | 
            -
            1. Install the Biocreative data used to train the model and compile the CRF++ plugin, <tt>rbbt_config  | 
| 96 | 
            -
            2. Build the module for a particular organism <tt>rbbt_config  | 
| 95 | 
            +
            1. Install the Biocreative data used to train the model and compile the CRF++ plugin, <tt>rbbt_config prepare rner</tt>. You may need at this point to install ParseTree and ruby2ruby
         | 
| 96 | 
            +
            2. Build the module for a particular organism <tt>rbbt_config install ner -o sgd</tt>. You need to have the gems ParseTree and ruby2ruby for this to work. This process can take a long time.
         | 
| 97 97 |  | 
| 98 98 | 
             
            Or, if you want to use Abner or Banner:
         | 
| 99 99 |  | 
| 100 | 
            -
            1. Download and install the packages <tt>rbbt_config  | 
| 100 | 
            +
            1. Download and install the packages <tt>rbbt_config prepare java_ner</tt> 
         | 
| 101 101 |  | 
| 102 102 | 
             
            You may now, for example, find mentions to genes in articles from a PubMed query using this script
         | 
| 103 103 |  | 
| @@ -123,7 +123,7 @@ You may now, for example, find mentions to genes in articles from a PubMed query | |
| 123 123 |  | 
| 124 124 | 
             
            This is the complete list of gem requirements: <tt>ParseTree ruby2ruby simpleconsole rjb rsruby stemmer rand rake progress-monitor</tt>. Some of these gems do not work with ruby 1.9 at the time, or may be a bit more complicated to install; for that reason *they are not reported as dependencies and are only required when they are about to be used*. Note that some of these gems are in the gemcutter repository, so you may need to install the <tt>gemcutter</tt> gem and do <tt>gem tumble</tt>
         | 
| 125 125 |  | 
| 126 | 
            -
            Some of the API requires to have some data processed using rbbt_config. This command is used to install third party software, download data from the internet, or build models. The command <tt>rbbt_config  | 
| 126 | 
            +
            Some of the API requires some data to have been processed using rbbt_config. This command is used to install third party software, download data from the internet, or build models. The command <tt>rbbt_config prepare all</tt> will install and process everything; this will take a long time, especially building the NER models. So you might want to start with the basic install and include more things as they are needed.
         | 
| 127 127 |  | 
| 128 128 |  | 
| 129 129 | 
             
            = Note on Patches/Pull Requests
         | 
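The identifier-translation steps in the README hunk above run <tt>rbbt_config prepare identifiers</tt> once and then <tt>rbbt_config install organisms</tt>, optionally restricted with <tt>-o</tt>. A minimal sketch of that ordering follows. The helper name `identifier_setup_commands` is hypothetical and not part of rbbt; it only builds the command strings quoted in the README and does not run them.

```ruby
# Hypothetical helper (not part of rbbt): returns the rbbt_config commands
# from the README steps, in the order they are meant to be run.
def identifier_setup_commands(organism = nil)
  install = 'rbbt_config install organisms'
  # -o restricts processing to one organism, e.g. sgd for yeast
  install += " -o #{organism}" if organism
  ['rbbt_config prepare identifiers', install]
end

identifier_setup_commands('sgd').each { |cmd| puts cmd }
```

Calling the helper without an argument yields the variant that processes all organisms.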
    
        data/bin/rbbt_config
    CHANGED
    
    | @@ -15,13 +15,13 @@ end | |
| 15 15 | 
             
            TASKS= %w(organisms ner norm classifier biocreative entrez go wordlists polysearch abner banner crf++)
         | 
| 16 16 |  | 
| 17 17 | 
             
            $USAGE =<<EOT
         | 
| 18 | 
            -
            #{__FILE__} <action> [<subaction>] [-- | 
| 18 | 
            +
            #{__FILE__} <action> [<subaction>] [--update] [--organism <org>]
         | 
| 19 19 |  | 
| 20 20 | 
             
              actions:
         | 
| 21 21 |  | 
| 22 22 | 
             
                * configure:   Set paths for data, cache, and tmp directories
         | 
| 23 23 |  | 
| 24 | 
            -
                *  | 
| 24 | 
            +
                * prepare:
         | 
| 25 25 |  | 
| 26 26 | 
             
                  Basic subactions:
         | 
| 27 27 |  | 
| @@ -50,12 +50,12 @@ $USAGE =<<EOT | |
| 50 50 | 
             
                  * classifier:   organisms, wordlists, classifier, go
         | 
| 51 51 | 
             
                  * all:          #{TASKS.join(", ")}
         | 
| 52 52 |  | 
| 53 | 
            -
                *  | 
| 53 | 
            +
                * install: 
         | 
| 54 54 | 
             
                  * organisms:      Gather organisms data
         | 
| 55 55 | 
             
                  * ner:            Build Named Entity Recognition Models. Mention Normalization needs no training.
         | 
| 56 56 | 
             
                  * classification: Build Function/Process Classifiers
         | 
| 57 57 |  | 
| 58 | 
            -
                  -- | 
| 58 | 
            +
                  --update:         Rebuild models or reprocess organism data even if present. You may want to purge the cache 
         | 
| 59 59 | 
             
                                    to be up to date with the data in the internet.
         | 
| 60 60 |  | 
| 61 61 | 
             
                  --organism:       Gather data only for that particular organism. The organism must be specified by the 
         | 
| @@ -74,7 +74,7 @@ EOT | |
| 74 74 |  | 
| 75 75 | 
             
            class Controller < SimpleConsole::Controller
         | 
| 76 76 |  | 
| 77 | 
            -
              params :bool => {: | 
| 77 | 
            +
              params :bool => {:u => :update},
         | 
| 78 78 | 
             
                     :string => {:o => :organism}
         | 
| 79 79 |  | 
| 80 80 | 
             
              def organisms
         | 
| @@ -89,7 +89,7 @@ class Controller < SimpleConsole::Controller | |
| 89 89 | 
             
                render :action => :usage
         | 
| 90 90 | 
             
              end
         | 
| 91 91 |  | 
| 92 | 
            -
              def  | 
| 92 | 
            +
              def install
         | 
| 93 93 | 
             
                raise "Run #{__FILE__} configure first to configure rbbt" if $noconfig
         | 
| 94 94 |  | 
| 95 95 | 
             
                case params[:id]
         | 
| @@ -103,12 +103,12 @@ class Controller < SimpleConsole::Controller | |
| 103 103 | 
             
                  redirect_to :action => :help, :id => :update
         | 
| 104 104 | 
             
                end
         | 
| 105 105 |  | 
| 106 | 
            -
                $force = true if params[: | 
| 106 | 
            +
                $force = true if params[:update]
         | 
| 107 107 | 
             
                $org = params[:organism] if params[:organism]
         | 
| 108 108 |  | 
| 109 109 | 
             
              end
         | 
| 110 110 |  | 
| 111 | 
            -
              def  | 
| 111 | 
            +
              def prepare
         | 
| 112 112 | 
             
                raise "Run #{__FILE__} configure first to configure rbbt" if $noconfig
         | 
| 113 113 | 
             
                case params[:id]
         | 
| 114 114 | 
             
                when "identifiers"
         | 
| @@ -146,7 +146,7 @@ class Controller < SimpleConsole::Controller | |
| 146 146 | 
             
                  @tasks = [params[:id]]
         | 
| 147 147 | 
             
                end
         | 
| 148 148 |  | 
| 149 | 
            -
                $force = true if params[: | 
| 149 | 
            +
                $force = true if params[:update]
         | 
| 150 150 | 
             
                $org = params[:organism] if params[:organism]
         | 
| 151 151 |  | 
| 152 152 | 
             
              end
         | 
| @@ -175,7 +175,7 @@ class View < SimpleConsole::View | |
| 175 175 | 
             
              end
         | 
| 176 176 |  | 
| 177 177 |  | 
| 178 | 
            -
              def  | 
| 178 | 
            +
              def prepare
         | 
| 179 179 | 
             
                load File.join(Rbbt.rootdir, 'tasks/install.rake')
         | 
| 180 180 |  | 
| 181 181 | 
             
                @tasks.each{|t|
         | 
| @@ -184,7 +184,7 @@ class View < SimpleConsole::View | |
| 184 184 | 
             
                }
         | 
| 185 185 | 
             
              end
         | 
| 186 186 |  | 
| 187 | 
            -
              def  | 
| 187 | 
            +
              def install
         | 
| 188 188 |  | 
| 189 189 | 
             
                puts "Changing directory to #{@location}"
         | 
| 190 190 | 
             
                chdir @location
         |
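The hunks above declare a boolean `--update` flag (`:u => :update`) and a string `--organism` flag (`:o => :organism`) through SimpleConsole. As a rough illustration of the same command-line shape, here is a sketch that uses Ruby's standard `OptionParser` instead of SimpleConsole. The method name `parse_rbbt_args` and the option descriptions are assumptions made for this example and are not taken from the gem.

```ruby
require 'optparse'

# Hypothetical helper (not part of rbbt): parses an argument list shaped like
#   <action> [<subaction>] [--update] [--organism <org>]
# using the standard library rather than SimpleConsole.
def parse_rbbt_args(argv)
  options = { :update => false, :organism => nil }

  parser = OptionParser.new do |opts|
    # Boolean flag, analogous to :bool => {:u => :update}
    opts.on('-u', '--update', 'Rebuild or reprocess even if present') do
      options[:update] = true
    end
    # String flag, analogous to :string => {:o => :organism}
    opts.on('-o', '--organism ORG', 'Restrict to one organism') do |org|
      options[:organism] = org
    end
  end

  # parse (without !) leaves argv untouched and returns the non-option arguments
  action, subaction = parser.parse(argv)
  options.merge(:action => action, :subaction => subaction)
end

result = parse_rbbt_args(%w(install organisms -o sgd --update))
puts result.inspect
```

In the controller shown in the diff, the parsed values end up in `params[:update]` and `params[:organism]`; the hash returned here plays the same role for the sketch.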