hashbang 1.0.0.beta2 → 1.0.0

data/README.md CHANGED
@@ -3,17 +3,37 @@
3
3
  Hashbang is a tiny Rack proxy serving HTML dumps for your RICH web-applications according to
4
4
  [Google AJAX Crawling conventions](http://code.google.com/web/ajaxcrawling/). Make your Rails AJAX applications indexable in no time.
5
5
 
6
- Using Rails generators Hashbang will setup small inner Rack application which will handle all magic requests containing `_escaped_fragment_` parameter. These requests will cause a subrequest to a real AJAX URL using virtual browser. This hidden browser will wait for some condition you define using well-known [Watir](http://watirwebdriver.com/) API. And then return an HTML dump.
6
+ Using Rails generators, Hashbang will set up a small inner Rack application which will handle all magic requests containing the `_escaped_fragment_` parameter. These requests will cause a subrequest to a real AJAX URL using a virtual browser.
7
7
 
8
- Let's say for example you've got a request to `test.com/?_escaped_fragment_=/my_hidden_page`.
8
+ Let's say, for example, you've got a request to `test.com/?_escaped_fragment_=/my_hidden_page`. Hashbang will convert this URL to `test.com/#!/my_hidden_page` and open it in the virtual browser. The virtual browser will load this page and wait for the `Sunscraper.finish` JavaScript call. As soon as it is called, Hashbang will respond with an HTML dump.
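
The URL conversion described above can be sketched in plain Ruby. This is only an illustration of the mapping from `_escaped_fragment_` to `#!`; the `hashbang_url` helper is a hypothetical name and is not taken from Hashbang's own code, which may do this differently.

```ruby
require 'uri'

# Hypothetical helper (not part of Hashbang's API): turns a crawler URL
# carrying `_escaped_fragment_` back into the `#!` URL a browser would open.
def hashbang_url(escaped_url)
  uri = URI.parse(escaped_url)
  params = URI.decode_www_form(uri.query.to_s)
  fragment = params.assoc('_escaped_fragment_')
  # Nothing to convert when the magic parameter is absent.
  return escaped_url if fragment.nil?

  # Keep any other query parameters, drop the magic one.
  rest = params.reject { |key, _| key == '_escaped_fragment_' }
  uri.query = rest.empty? ? nil : URI.encode_www_form(rest)
  uri.fragment = "!#{fragment.last}"
  uri.to_s
end

puts hashbang_url('http://test.com/?_escaped_fragment_=/my_hidden_page')
# prints http://test.com/#!/my_hidden_page
```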
9
9
 
10
- Hashbang will convert this URL to `test.com/#!/my_hidden_page` and open it in a virtual browser.
10
+ Hashbang uses [Sunscraper](http://github.com/roundlake/sunscraper) and therefore you will need Qt to use it.
11
11
 
12
- Virtual browser will call your lambda with `browser` object as parameter. With help of this lambda you can setup the wait behavior. Here is a great introduction to [Watir wait API](http://watirwebdriver.com/waiting/). Note that your lambda will act as a block to `Watir::Wait.until`.
12
+ ## Environments are specific
13
+
14
+ If you are not using Rails you should skip this paragraph. In the development environment, this gem will catch all the requests containing `_escaped_fragment_` directly from Rails using middleware and therefore it will just work **(see P.S. below)**. Go to `http://localhost:3000?_escaped_fragment_=test` to make Hashbang load and dump `http://localhost:3000/#!/test` for you.
15
+
16
+ For security and performance reasons, on production servers you are supposed to boot this Rack app separately and manually forward all the magic requests to the standalone instance.
17
+
18
+ Imagine you are running a Hashbang Rack instance on port `33222`. In that case you should proxy all the requests containing `_escaped_fragment_=` to `localhost:33222/?url=…` where `…` is a full request URI. **Don't forget to escape the url parameter, so the resulting request looks like this:** `localhost:33222/?url=http%3A%2F%2Fwww.dvnts.ru%2F%3F_encoded_fragment_%3D`.
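
As a rough sketch of the escaping step, the target URL for the standalone instance could be built like this in Ruby. The `standalone_url` helper and its `host`/`port` defaults are assumptions made for illustration (the port mirrors the `33222` example above); in a real setup the forwarding is usually done by the front-end web server rather than by Ruby code.

```ruby
require 'uri'

# Hypothetical helper (not part of Hashbang): builds the address of the
# standalone instance with the full request URI escaped into `url`.
def standalone_url(request_uri, host: 'localhost', port: 33222)
  "#{host}:#{port}/?url=#{URI.encode_www_form_component(request_uri)}"
end

puts standalone_url('http://www.dvnts.ru/?_escaped_fragment_=')
# prints localhost:33222/?url=http%3A%2F%2Fwww.dvnts.ru%2F%3F_escaped_fragment_%3D
```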
19
+
20
+ You are supposed to limit the number of concurrent connections and restrict direct connections to Hashbang instances. We'll describe a typical production nginx/passenger setup later in this README.
21
+
22
+ **P.S.**
23
+
24
+ Since in most cases a basic development setup uses just one Rails instance, all the requests to magic URLs will lead to a deadlock! To solve this problem we've included the `rake hashbang:rails` command which will run your Rails project inside a [Unicorn](http://unicorn.bogomips.org/) with 2 instances.
25
+
26
+ You can also simulate production mode by running `rake hashbang:standalone`.
13
27
 
14
28
  ## Installation
15
29
 
16
- Start from your Gemfile:
30
+ Ensure you have the Qt dependencies for Sunscraper (read [Sunscraper description](http://github.com/roundlake/sunscraper/) for more info).
31
+
32
+ To install it on Mac with [Homebrew](http://mxcl.github.com/homebrew/) run `brew install qt`.
33
+
34
+ To install it on Debian run `apt-get install qt4-dev-tools --no-install-recommends`.
35
+
36
+ Add gem to your Gemfile:
17
37
 
18
38
  ```
19
39
  gem 'hashbang'
@@ -25,14 +45,33 @@ And follow with basic generator:
25
45
  rails g hashbang
26
46
  ```
27
47
 
28
- This generator will create an inline Rack application at `hashbang/` dir. To set lambda you want to use to check if your page is ready refer to `hashbang/config.rb`.
48
+ This generator will create an inline Rack application in the `hashbang/` directory. You can proceed with the required configuration in `hashbang/config.rb`.
29
49
 
30
- ## Environments are specific
50
+ ## Configuration
51
+
52
+ ```ruby
53
+ Hashbang::Config.map do |c|
54
+ c.url = /^.*$/
55
+ c.timeout = 5000
56
+ end
57
+ ```
58
+
59
+ #### Url
31
60
 
32
- While working at development environment, this gem will catch all the requests directly from rails using middleware and therefore it will just work (see P.S. below :). However due to security and performance reasons, at the production servers you are supposed to boot this Rack app separately and manually forward all magic requests to it. We'll describe typical production nginx/passenger setup later in this README.
61
+ `url` limits Hashbang crawling to the described set of URLs. The limit only works in standalone mode.
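
To get a feel for what a given pattern would let through, it can be tried against sample URLs before it goes into `hashbang/config.rb`. The sketch below uses plain Ruby regular expressions only; the `example.com` pattern is a hypothetical value, and how exactly Hashbang applies `c.url` internally is an assumption (a simple regexp match).

```ruby
# The default from the generated config: matches any URL.
match_all = /^.*$/

# Hypothetical stricter pattern: only URLs under example.com.
only_example = %r{\Ahttps?://example\.com/}

samples = [
  'http://example.com/#!/my_hidden_page',
  'http://other.org/#!/my_hidden_page'
]

samples.each do |url|
  puts "#{url}: default=#{match_all.match?(url)} strict=#{only_example.match?(url)}"
end
```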
33
62
 
34
- P.S. Since basic development setup will use just one Rails instance in most cases all the requests to magic urls will lead to Deadlock! To solve this problem we've included the `rake hashbang:rails` command which will run your Rails project inside a [Unicorn](http://unicorn.bogomips.org/) with 2 instances.
63
+ #### Timeout
64
+
65
+ Hashbang will give the virtual browser the specified timeout in milliseconds to grab data. Keep it as low as possible. The timeout only works in standalone mode.
66
+
67
+ ## Crawling marker
68
+
69
+ To help Sunscraper (the virtual browser of Hashbang) understand what should be considered a loaded page, add a JavaScript `Sunscraper.finish()` call when all AJAX is done and your DOM is ready. Note that for regular client requests the `Sunscraper` variable will be undefined and therefore you should check if it's available. This is how it should basically look:
70
+
71
+ ```javascript
72
+ if (typeof Sunscraper !== "undefined") { Sunscraper.finish() }
73
+ ```
35
74
 
36
75
  ## Memory consumption
37
76
 
38
- This part of hashbang is currently in progress. We still bundle Watir chromedev for proof-of-concept reasons. The full version will come with Qt WebKit bindings.
77
+ Hashbang will keep one instance of Sunscraper per Hashbang instance. Sunscraper bundles plain QtWebKit and therefore keeps the memory consumption of virtual browsers as low as possible. However, it can still be noticeable, so you should only increase the allowed concurrency if your resource gets indexed often.
@@ -1,6 +1,6 @@
1
1
  Gem::Specification.new do |s|
2
2
  s.name = "hashbang"
3
- s.version = "1.0.0.beta2"
3
+ s.version = "1.0.0"
4
4
  s.platform = Gem::Platform::RUBY
5
5
  s.summary = "Magic support of Google/Bing/... AJAX search indexing for your Rails apps"
6
6
  s.email = "boris@roundlake.ru"
@@ -13,5 +13,5 @@ Gem::Specification.new do |s|
13
13
 
14
14
  s.add_dependency 'headless'
15
15
  s.add_dependency 'sunscraper', '~> 1.1.0.beta3'
16
- s.add_dependency 'unicorn'
16
+ s.add_development_dependency 'unicorn'
17
17
  end
@@ -1,4 +1,4 @@
1
1
  Hashbang::Config.map do |c|
2
- c.url = /^$/
2
+ c.url = /^.*$/
3
3
  c.timeout = 5000
4
4
  end
metadata CHANGED
@@ -1,8 +1,8 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: hashbang
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0.beta2
5
- prerelease: 6
4
+ version: 1.0.0
5
+ prerelease:
6
6
  platform: ruby
7
7
  authors:
8
8
  - Boris Staal
@@ -13,7 +13,7 @@ date: 2012-03-10 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: headless
16
- requirement: &70344278924840 !ruby/object:Gem::Requirement
16
+ requirement: &70247960918140 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
21
21
  version: '0'
22
22
  type: :runtime
23
23
  prerelease: false
24
- version_requirements: *70344278924840
24
+ version_requirements: *70247960918140
25
25
  - !ruby/object:Gem::Dependency
26
26
  name: sunscraper
27
- requirement: &70344278922420 !ruby/object:Gem::Requirement
27
+ requirement: &70247960915780 !ruby/object:Gem::Requirement
28
28
  none: false
29
29
  requirements:
30
30
  - - ~>
@@ -32,18 +32,18 @@ dependencies:
32
32
  version: 1.1.0.beta3
33
33
  type: :runtime
34
34
  prerelease: false
35
- version_requirements: *70344278922420
35
+ version_requirements: *70247960915780
36
36
  - !ruby/object:Gem::Dependency
37
37
  name: unicorn
38
- requirement: &70344278920940 !ruby/object:Gem::Requirement
38
+ requirement: &70247960913160 !ruby/object:Gem::Requirement
39
39
  none: false
40
40
  requirements:
41
41
  - - ! '>='
42
42
  - !ruby/object:Gem::Version
43
43
  version: '0'
44
- type: :runtime
44
+ type: :development
45
45
  prerelease: false
46
- version_requirements: *70344278920940
46
+ version_requirements: *70247960913160
47
47
  description: Hashbang is a tiny Rack proxy serving HTML dumps for your RICH web-applications
48
48
  according to Google AJAX Crawling conventions. Make your Rails AJAX applications
49
49
  indexable in no time.
@@ -84,9 +84,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
84
84
  required_rubygems_version: !ruby/object:Gem::Requirement
85
85
  none: false
86
86
  requirements:
87
- - - ! '>'
87
+ - - ! '>='
88
88
  - !ruby/object:Gem::Version
89
- version: 1.3.1
89
+ version: '0'
90
90
  requirements: []
91
91
  rubyforge_project:
92
92
  rubygems_version: 1.8.15