rwikibot 1.0.0

Files changed (4)
  1. data/README +3 -0
  2. data/config.yaml +5 -0
  3. data/lib/rwikibot.rb +470 -0
  4. metadata +48 -0
data/README ADDED
@@ -0,0 +1,3 @@
+ This is RWikiBot 1.0.
+
+ This is a test README file.
data/config.yaml ADDED
@@ -0,0 +1,5 @@
+ :source:use: 'default'
+ :source:default:
+   username: "RWikiBot"
+   password: "rwikibot"
+   api_path: "http://localhost:8888/wiki/api.php"
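For reference, `rwikibot.rb` resolves this file in two steps: it fetches a `use` key to pick a profile name, then fetches that profile's hash. A minimal sketch of that lookup, assuming the keys parse as plain strings (the `:source:` prefixes above make the exact key names slightly ambiguous):

```ruby
require 'yaml'

# A config equivalent to the one the bot reads; the top-level 'use' key
# names which profile block to load (keys here are illustrative).
raw = <<YAML
use: 'default'
default:
  username: "RWikiBot"
  password: "rwikibot"
  api_path: "http://localhost:8888/wiki/api.php"
YAML

config_file = YAML.load(raw)
use_config  = config_file.fetch('use')       # the profile name, "default"
config      = config_file.fetch(use_config)  # the selected profile hash

puts config.fetch('username')
```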
data/lib/rwikibot.rb ADDED
@@ -0,0 +1,470 @@
+ # RWikiBot 0.1
+ #
+ # This is a framework upon which to create MediaWiki bots. It provides a set of methods to access MediaWiki's API and return information in
+ # various forms, depending on the type of information returned. By abstracting these methods into a Bot object, cleaner script code can be
+ # written later. Furthermore, it facilitates updating the API without breaking old bots. Last, but not least, it's good to abstract.
+ #
+ # Configuration options are kept in config.yaml and processed on bot creation. I have worked really hard to make sure this file doesn't need editing,
+ # so please don't make changes here, and email me if you find something that I missed.
+ #
+ # Author:: Edwin Sidney Roger (mailto:eddieroger@gmail.com)
+ # Copyright:: Copyright (c) 2007 Edwin Sidney Roger
+ # License:: GNU/GPL 2.0
+ require 'net/http'
+ require 'uri'
+ require 'yaml'
+ require 'cgi'
+ require 'logger'
+
+
+ # This is the main bot object. The goal is to represent every API method in some form here, and then write separate, cleaner scripts in individual bot files utilizing this framework. Basically, this is an include at best.
+ class RWikiBot
+
+   attr_accessor :http, :config, :botname
+
+   # New bots hope for three attributes, but require none. The first is simply the name of the bot for logging purposes. The second is the debug level constant, and the third is the logfile.
+   #
+   # Example: bot = RWikiBot.new("My Neat Bot", Logger::DEBUG, "./rwikibot.log")
+   def initialize (name = "Unnamed Bot", debuglevel = Logger::INFO, log_location = "./rwikibot.log")
+
+     @botname = name
+
+     # We log. Logger in the house.
+     @wikibotlogger = Logger.new(log_location)
+
+     # INFO level is nice. I like it as production default because I like log files. Feel free to change. It will change once config is loaded.
+     @wikibotlogger.level = debuglevel
+     @wikibotlogger.info "New RWikiBot created. My name is #{@botname}"
+
+     # Load the configuration!! (stayed file-based, because I liked it too much.)
+     @wikibotlogger.debug "Loading configuration..."
+     config_file = YAML.load_file('config.yaml')
+     use_config = config_file.fetch('use') # tricky, tricky
+     @wikibotlogger.debug "Config loaded. Using settings: #{use_config}"
+
+     # Now, the hash that will hold the config options
+     @config = config_file.fetch(use_config)
+     @config['logged_in'] = FALSE
+     @config['uri'] = URI.parse(@config.fetch('api_path'))
+
+     ## change - make an HTTP object here for use later, now that we have config
+     @http = Net::HTTP.new(@config.fetch('uri').host, @config.fetch('uri').port)
+     @wikibotlogger.debug("INIT - Created HTTP object. Result is: #{@http}")
+
+     @config['cookie'] = nil
+   end
+
+   # Login
+   #
+   # This is the method that will allow the bot to log in to the wiki. It's not always necessary, but bots need to log in to save changes or retrieve watchlists.
+   #
+   # No variables are accepted, and the username/password are stored in config.yaml.
+   def login
+
+     @wikibotlogger.debug("LOGIN - Preparing login information...")
+     post_me = {'lgname' => @config.fetch('username'), 'lgpassword' => @config.fetch('password')}
+
+     @wikibotlogger.debug("LOGIN - Asking make_request to perform login...")
+     login_result = make_request('login', post_me)
+     @wikibotlogger.debug("LOGIN - We should have a result of type login now.")
+
+     @wikibotlogger.debug("LOGIN - login_result received. Result is: #{login_result.fetch('result')}")
+
+     # Now we need to change some @config stuff, specifically that we're logged in, and the variables of that.
+     # This will also change make_request, but I'll comment there.
+     if login_result.fetch('result') == "Success"
+       # All lg variables are directly from the API and stored in config that way
+       @wikibotlogger.info("LOGIN - Login successful. Wiki user is: #{login_result.fetch('lgusername')}")
+       @config['logged_in'] = TRUE
+       @config['lgusername'] = login_result.fetch('lgusername')
+       @config['lguserid'] = login_result.fetch('lguserid')
+       @config['lgtoken'] = login_result.fetch('lgtoken')
+       @wikibotlogger.debug("LOGIN - MediaWiki API variables stored in @config")
+       return TRUE
+     else
+       @wikibotlogger.error("LOGIN - Login error. Wiki API said: #{login_result.fetch('result')}")
+       return FALSE
+     end
+
+   end
+
+   # Watchlist
+   #
+   # This method will get the watchlist for the bot's MediaWiki username. This is really only useful if you want the bot to watch a specific list of pages, and would require the bot maintainer to log in to the wiki as the bot to set the watchlist.
+   #
+   # INPUT:: Options is a hash of API-allowed fields that will be passed.
+   #
+   # OUTPUT:: Returns an array of hashes.
+   def watchlist (options=nil)
+     # Get the bot's watchlist
+     @wikibotlogger.debug("WATCHLIST - Preparing request information...")
+     post_me = {'list' => 'watchlist'}
+
+     if options != nil
+       options.each do |key, value|
+         post_me[key] = value
+       end
+     end
+
+     # Make the request
+     @wikibotlogger.debug "WATCHLIST - Asking make_request to get watchlist..."
+     watchlist_result = make_request('query', post_me)
+     @wikibotlogger.debug "WATCHLIST - We should have a result of type query now."
+
+     # Process into a Hash for return
+     @wikibotlogger.debug "WATCHLIST - Processing result..."
+     return watchlist_result.fetch('watchlist')
+
+   end
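The option-merging loop in watchlist (and repeated in most methods below) is just a hash merge of caller-supplied API parameters over the defaults. A standalone sketch, with made-up option keys, showing what `post_me` looks like afterward:

```ruby
# Base query parameters, as watchlist builds them
post_me = {'list' => 'watchlist'}

# Caller-supplied API options (keys here are illustrative)
options = {'wllimit' => '100', 'wlprop' => 'ids|title'}

# The same merge loop the bot uses: options override/extend the defaults
if options != nil
  options.each do |key, value|
    post_me[key] = value
  end
end

puts post_me.inspect
```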
+   # Query
+   #
+   # This method will return wiki-wide recent changes, almost as if looking at the Special page Recent Changes. But, in this format, a bot can handle it. Also, we're using the API. And bots can't read.
+   #
+   # INPUT:: A hash of API-allowed keys and values. Default is same as API default.
+   #
+   # OUTPUT:: An array of hashes.
+   def recent_changes (options=nil)
+
+     # This will allow any given bot to get recent changes. Then act on it. But that's another method.
+     # TODO - Persistent timestamp storage
+
+     @wikibotlogger.debug("RECENT CHANGES - Preparing request information...")
+     post_me = {"list" => "recentchanges"}
+     if options != nil
+       options.each do |key, value|
+         post_me[key] = value
+       end
+     end
+
+     # Make the request
+     @wikibotlogger.debug "RECENT CHANGES - Asking make_request to get recentchanges..."
+     recentchanges_result = make_request('query', post_me)
+     @wikibotlogger.debug "RECENT CHANGES - We should have a result of type query now."
+
+     # Figure out what to do now. Process, I'd think.
+     @wikibotlogger.debug "RECENT CHANGES - Processing result..."
+
+     return recentchanges_result.fetch('recentchanges')
+
+   end
+   # List
+   #
+   # This will return a list of the most recent log events. Useful for bots that want to validate log events, or even just a notify bot that checks for events and sends them off.
+   #
+   # INPUT:: A hash of API-allowed keys and values. Default is same as API default.
+   #
+   # OUTPUT:: An array of hashes containing log events.
+   def log_events (options = nil)
+
+     @wikibotlogger.debug "LOG EVENTS - Preparing request information..."
+
+     # Prepare the request
+     post_me = {"list" => "logevents"}
+
+     if options != nil
+       @wikibotlogger.debug("LOG EVENTS - Additional options added by requestor. Adding to post_me...")
+       options.each_pair do |key, value|
+         post_me[key] = value
+         @wikibotlogger.debug "LOG EVENTS - Added #{post_me[key]}"
+       end
+       @wikibotlogger.debug("LOG EVENTS - No more additional options. Moving on...")
+     end
+
+     # Make the request!
+     @wikibotlogger.debug "LOG EVENTS - Asking make_request to get logevents"
+     logevents_result = make_request('query', post_me)
+     @wikibotlogger.debug "LOG EVENTS - We should have a result of type logevents now."
+
+     # Process results
+     @wikibotlogger.debug "LOG EVENTS - Processing result..."
+
+     return logevents_result.fetch('logevents')
+
+   end
+
+   # Query
+   #
+   # This will return any redirects from an article title so that you know where it ends up. Useful to check for redirects, but mostly here for completeness of the framework.
+   #
+   # INPUT:: A string of pipe-delimited titles ('Apple|Baseball|Car port'), and an optional hash of API-acceptable values.
+   # OUTPUT:: An array of redirects.
+   def redirects (title, options = nil)
+
+     # Prepare the request
+     @wikibotlogger.debug "REDIRECTS - Preparing request information..."
+     post_me = {'titles' => title, 'redirects' => '', 'prop' => 'info'}
+
+     if options != nil
+       @wikibotlogger.debug("REDIRECTS - Additional options added by requestor. Adding to post_me...")
+       options.each_pair do |key, value|
+         post_me[key] = value
+         @wikibotlogger.debug "REDIRECTS - Added #{post_me[key]}"
+       end
+       @wikibotlogger.debug("REDIRECTS - No more additional options. Moving on...")
+     end
+
+     # Make the request
+     @wikibotlogger.debug "REDIRECTS - Asking make_request to find redirects..."
+     redirects_result = make_request('query', post_me)
+     @wikibotlogger.debug "REDIRECTS - We should have a result now..."
+
+     @wikibotlogger.debug "REDIRECTS - Processing result..."
+
+     return redirects_result.fetch('pages')
+
+   end
+
+
+   # Query
+   #
+   # This little ditty returns a normalized version of the title passed to it. It is super useful because it will normalize an otherwise poorly entered title, but most importantly it will let us know if an article exists or not by whether it is able to normalize.
+   #
+   # INPUT:: Titles, either singular or pipe-delimited.
+   # OUTPUT:: An array of normalized hashes.
+   def normalize (title)
+
+     # Prepare the request
+     @wikibotlogger.debug "NORMALIZE - Preparing request information..."
+     post_me = {'titles' => title}
+
+     # Make the request
+     @wikibotlogger.debug "NORMALIZE - Asking make_request to normalize titles..."
+     normalized_result = make_request('query', post_me)
+     @wikibotlogger.debug "NORMALIZE - We should have a result now..."
+
+     @wikibotlogger.debug "NORMALIZE - Processing result..."
+
+     return normalized_result.fetch('pages')
+
+   end
+
+   # List
+   #
+   # This will return a list of all pages in a given namespace. It returns a list of pages with the normalized title and page ID, suitable for use elsewhere. Accepts all parameters from the API in Hash form.
+   # Default is namespace => 0, which is just plain pages. Nothing 'special'.
+   # Also note that if the username the bot uses is not of type Bot in the wiki, you will be limited to 50 articles. Also log in, or you get an error.
+   #
+   # INPUT:: A hash of API-allowed keys and values. Default is same as API default.
+   #
+   # OUTPUT:: An array of hashes with information about the pages.
+   def all_pages (options = nil)
+
+     # This will get all pages. Limits vary based on user rights of the bot. Set to bot.
+     @wikibotlogger.debug "ALL PAGES - Preparing request information..."
+     post_me = {'list' => 'allpages', 'apnamespace' => '0', 'aplimit' => '5000'}
+
+     # Handle any additional options before making the request
+     if options != nil
+       @wikibotlogger.debug("ALL PAGES - Additional options added by requestor. Adding to post_me...")
+       options.each_pair do |key, value|
+         post_me[key] = value
+         @wikibotlogger.debug "ALL PAGES - Added #{post_me[key]}"
+       end
+       @wikibotlogger.debug("ALL PAGES - No more additional options. Moving on...")
+     end
+
+     # Make the request
+     @wikibotlogger.debug "ALL PAGES - Asking make_request to get all pages..."
+     allpages_result = make_request('query', post_me)
+     @wikibotlogger.debug "ALL PAGES - We should have a result now..."
+
+     return allpages_result.fetch('allpages')
+
+   end
+
+   # Prop = Info
+   #
+   # I decided to split this up since I wanted to normalize the bot framework as much as possible, or in other words, make it as easy to use as possible. I think the sacrifice of more methods is worth having more English-looking code. It's the Ruby way.
+   # Info will return information about the page, from namespace to normalized title, last touched, etc.
+   #
+   # INPUT:: This method only takes titles, but will accept a pipe-delimited string. Ex: "Apple|Baseball|Horse|Main Page"
+   #
+   # OUTPUT:: An array of hashes.
+   def info (titles)
+
+     # Basic query info
+     @wikibotlogger.debug "INFO - Preparing the query..."
+     post_me = {"prop" => "info", 'titles' => titles}
+
+     # Make the request
+     @wikibotlogger.debug "INFO - Asking make_request to get info"
+     info_result = make_request('query', post_me)
+     @wikibotlogger.debug "INFO - We should have a result set now..."
+
+     # Result processing
+     @wikibotlogger.debug "INFO - Preparing results..."
+
+     return info_result.fetch('pages')
+
+   end
+
+   # Prop - Revisions
+   #
+   # This is the main way of accessing content and page-specific information from the wiki. It has multiple uses as described in the API. It's also considerably more complex than the other methods. Enjoy it.
+   # A final note - I'd really be familiar with this method in the API, since I've spent a lot of time trying to figure it out myself.
+   #
+   # Please be sure to add the RVPROP key at least; otherwise you'll just get the basic information of revid, oldid and pageid. Boring.
+   #
+   # INPUT:: A string of article titles (pipe-delimited), and a hash of API-allowed keys and values. Default is same as API default.
+   #
+   # OUTPUT:: An array of hashes.
+   def revisions(titles, options = nil)
+
+     # Prepare the request! Notify the logger!
+     @wikibotlogger.debug "REVISIONS - Preparing the request..."
+     post_me = {'prop' => 'revisions', 'titles' => titles}
+
+     # Handle any additional options
+     if options != nil
+       @wikibotlogger.debug("REVISIONS - Additional options added by requestor. Adding to post_me...")
+       options.each_pair do |key, value|
+         post_me[key] = value
+         @wikibotlogger.debug "REVISIONS - Added #{post_me[key]}"
+       end
+       @wikibotlogger.debug("REVISIONS - No more additional options. Moving on...")
+     end
+
+     # Make the request. Because we care.
+     @wikibotlogger.debug "REVISIONS - Asking make_request to get revisions for article(s) #{titles}"
+     revisions_result = make_request('query', post_me)
+     @wikibotlogger.debug "REVISIONS - We should have a result now..."
+
+     # Process the results
+     @wikibotlogger.debug "REVISIONS - Preparing results..."
+
+     return revisions_result.fetch('pages')
+
+   end
+
+   # Meta
+   #
+   # This is the only meta method. It will return site information. I chose not to allow it to specify, and it will only return all known properties.
+   # api.php?action=query&meta=siteinfo&siprop=general|namespaces
+   #
+   # INPUT:: siprop is either 'general' or 'namespaces'.
+   #
+   # OUTPUT:: A hash of values about site information.
+   def site_info (siprop = 'general')
+
+     @wikibotlogger.debug "SITE INFO - Preparing request information..."
+
+     # Prepare the request
+     post_me = {"meta" => "siteinfo", "siprop" => siprop}
+
+     # Make the request!
+     @wikibotlogger.debug "SITE INFO - Asking make_request to get site info"
+     siteinfo_result = make_request('query', post_me)
+     @wikibotlogger.debug "SITE INFO - We should have a result of type site info now."
+
+     # Process results
+     @wikibotlogger.debug "SITE INFO - Processing result..."
+
+     if siprop == 'general'
+       return siteinfo_result.fetch('general')
+     else
+       return siteinfo_result.fetch('namespaces')
+     end
+
+   end
+
+   # The point of this method is to iterate through an array of hashes, which most of the other methods return, and remove multiple instances of the same wiki page. We're more often than not only concerned with the most recent revision, so we'll delete old ones.
+   #
+   # Hashes don't respond to the Array.uniq method. So this is the same-ish.
+   # INPUT:: An array of hashes.
+   # OUTPUT:: An array of hashes that are unique.
+   def make_unique(array)
+
+     test_array = array
+     count = 0
+
+     # First, let's make one big loop to go through each item in the array.
+     array.reverse.each do |current_item|
+
+       # Now, let's loop double time.
+       test_array.each do |test_item|
+
+         # Some comparisons...
+         if (current_item.fetch('title') == test_item.fetch('title') && current_item.fetch('revid') > test_item.fetch('revid'))
+
+           # At this point, current is the same article as test, and current is newer. Delete test.
+           array.delete(test_item)
+           count += 1
+
+         end
+       end
+     end
+
+     puts "Deleted #{count} items."
+
+     return array
+   end
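The effect of make_unique can be seen without a wiki: given two revisions of the same title, only the highest revid survives. A self-contained sketch of the same idea, written as a non-mutating `group_by`/`max_by` pass rather than the gem's delete-in-place loop (sample data is made up):

```ruby
# Keep only the newest revid per title - same result as make_unique,
# but grouping instead of deleting while iterating
def keep_newest(array)
  array.group_by { |page| page['title'] }
       .map { |_title, revs| revs.max_by { |page| page['revid'] } }
end

pages = [
  {'title' => 'Apple', 'revid' => 10},
  {'title' => 'Apple', 'revid' => 12},
  {'title' => 'Car',   'revid' => 7}
]

puts keep_newest(pages).inspect
```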
+
+
+   private
+
+   # Make Request is the method that actually handles making the request to the API. Since the API is somewhat standardized, this method is able to accept the action and a hash of variables, and it handles all the fun things MediaWiki likes to be weird about, like cookies and limits and actions. It's very solid, but I didn't want it public because it also does some post-processing, and that's not very OO.
+   def make_request (action, post_this)
+
+     # Housekeeping. We need to add format and action to the request hash
+     post_this['format'] = 'yaml'
+     post_this['action'] = action
+
+     # change - preparing a POST string instead of a hash.
+     post_string = ''
+     post_this.each_pair do |key, value|
+       post_string << "#{key}=#{value}&"
+     end
+     @wikibotlogger.info("MAKE REQUEST - Post string is: #{post_string}")
+     @wikibotlogger.debug "MAKE REQUEST - Sending request to: #{@config.fetch('uri')}"
+
+     # Send the actual request
+     @wikibotlogger.debug "MAKE REQUEST - Sending request..."
+     resp = @http.post(@config.fetch('uri').path, post_string, {'User-agent' => 'RWikiBot/0.1', 'Cookie' => bake(@config.fetch('cookie'))})
+     @wikibotlogger.info "MAKE REQUEST - Response: "
+     @wikibotlogger.info resp.body
+     @wikibotlogger.debug "MAKE REQUEST - End Response"
+     result = YAML.load(resp.body)
+
+     # Process response
+     @wikibotlogger.debug "MAKE REQUEST - Response received. Processing..."
+     return_result = result.fetch(action)
+
+     # A small check to make sure we don't need to save cookie data. Because usually, we don't.
+     if @config.fetch('logged_in') == FALSE
+       @config['cookie'] = resp.header['set-cookie']
+       @wikibotlogger.debug "MAKE REQUEST - Received login cookie. Cookie is: #{resp.header['set-cookie']}"
+     end
+
+     # Return the response
+     @wikibotlogger.debug "MAKE REQUEST - Response cleaned. Returning result."
+     return return_result
+
+   end
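The POST body make_request assembles is plain `key=value&` concatenation, with no URL escaping of values. A standalone sketch of that assembly, with an escaped variant alongside for comparison (the parameter values are illustrative; the escaped form is not what the gem does):

```ruby
require 'cgi'

post_this = {'format' => 'yaml', 'action' => 'query', 'titles' => 'Main Page'}

# As make_request builds it: raw concatenation, trailing '&', no escaping -
# spaces or '&' inside values would pass through verbatim
post_string = ''
post_this.each_pair do |key, value|
  post_string << "#{key}=#{value}&"
end
puts post_string

# Safer variant: escape each value before joining
escaped = post_this.map { |k, v| "#{k}=#{CGI.escape(v)}" }.join('&')
puts escaped
```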
+
+   # For some odd reason, MediaWiki sends back three cookies the first time you establish a session. For some even odder reason, Net::HTTP doesn't treat the set-cookie headers individually the way that normal browsers do. So, I have this processing method to handle cookie logic - when to send, when to set, etc. And it bakes cookies. Get it? Bake.
+   def bake(cookie)
+
+     # If there's no cookie yet (first request), send an empty header
+     if cookie == nil
+       return ""
+     end
+
+     # Make some dough. Let CGI parse set-cookie, since there are multiple set-cookies and this is hard.
+     dough = CGI::Cookie::parse(cookie)
+
+     # Take the dough and get my three elements out:
+     # wikidb_wiki_UserID wikidb_wiki_UserName wikidb_wiki_Token
+     return "#{dough.fetch('wikidb_wiki_UserID')}; #{dough.fetch('wikidb_wiki_UserName')}; #{dough.fetch('wikidb_wiki_Token')}"
+
+     # No shit, this took me almost a week to figure out
+   end
+
+ end
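bake's parsing step can be seen in isolation by feeding `CGI::Cookie.parse` a sample cookie header. The `wikidb_wiki_*` names mirror the ones the method expects, but the values here are made up:

```ruby
require 'cgi'

# A stand-in for the cookie data MediaWiki sends back (values are fake)
raw = 'wikidb_wiki_UserID=42; wikidb_wiki_UserName=RWikiBot; wikidb_wiki_Token=abc123'

# parse returns a hash of cookie name => CGI::Cookie; each cookie holds
# its values as an array, so .first gives the single value
dough = CGI::Cookie.parse(raw)

puts dough.keys.sort.inspect
puts dough['wikidb_wiki_UserName'].first
```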
metadata ADDED
@@ -0,0 +1,48 @@
+ --- !ruby/object:Gem::Specification
+ rubygems_version: 0.8.11
+ specification_version: 1
+ name: rwikibot
+ version: !ruby/object:Gem::Version
+   version: 1.0.0
+ date: 2007-02-25 00:00:00 -06:00
+ summary: A framework for creating MediaWiki bots.
+ require_paths:
+ - lib
+ email: eddieroger @nospam@ gmail.com
+ homepage:
+ rubyforge_project:
+ description:
+ autorequire: rwikibot
+ default_executable:
+ bindir: bin
+ has_rdoc: "true"
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
+   requirements:
+   - - ">"
+     - !ruby/object:Gem::Version
+       version: 0.0.0
+   version:
+ platform: ruby
+ signing_key:
+ cert_chain:
+ authors:
+ - Eddie Roger
+ files:
+ - lib/rwikibot.rb
+ - README
+ - config.yaml
+ test_files: []
+
+ rdoc_options:
+ - --inline-source
+ extra_rdoc_files:
+ - README
+ - config.yaml
+ executables: []
+
+ extensions: []
+
+ requirements: []
+
+ dependencies: []