ironfan 4.3.4 → 4.4.0
- data/CHANGELOG.md +7 -0
- data/ELB.md +121 -0
- data/Gemfile +1 -0
- data/Rakefile +4 -0
- data/VERSION +1 -1
- data/ironfan.gemspec +48 -3
- data/lib/chef/knife/cluster_launch.rb +5 -0
- data/lib/chef/knife/cluster_proxy.rb +3 -3
- data/lib/chef/knife/cluster_sync.rb +4 -0
- data/lib/chef/knife/ironfan_knife_common.rb +17 -6
- data/lib/chef/knife/ironfan_script.rb +29 -11
- data/lib/ironfan.rb +2 -2
- data/lib/ironfan/broker/computer.rb +8 -3
- data/lib/ironfan/dsl/ec2.rb +133 -2
- data/lib/ironfan/headers.rb +4 -0
- data/lib/ironfan/provider.rb +48 -3
- data/lib/ironfan/provider/ec2.rb +23 -8
- data/lib/ironfan/provider/ec2/elastic_load_balancer.rb +239 -0
- data/lib/ironfan/provider/ec2/iam_server_certificate.rb +101 -0
- data/lib/ironfan/provider/ec2/machine.rb +8 -0
- data/lib/ironfan/provider/ec2/security_group.rb +3 -5
- data/lib/ironfan/requirements.rb +2 -0
- data/notes/Home.md +45 -0
- data/notes/INSTALL-cloud_setup.md +103 -0
- data/notes/INSTALL.md +134 -0
- data/notes/Ironfan-Roadmap.md +70 -0
- data/notes/advanced-superpowers.md +16 -0
- data/notes/aws_servers.jpg +0 -0
- data/notes/aws_user_key.png +0 -0
- data/notes/cookbook-versioning.md +11 -0
- data/notes/core_concepts.md +200 -0
- data/notes/declaring_volumes.md +3 -0
- data/notes/design_notes-aspect_oriented_devops.md +36 -0
- data/notes/design_notes-ci_testing.md +169 -0
- data/notes/design_notes-cookbook_event_ordering.md +249 -0
- data/notes/design_notes-meta_discovery.md +59 -0
- data/notes/ec2-pricing_and_capacity.md +69 -0
- data/notes/ec2-pricing_and_capacity.numbers +0 -0
- data/notes/homebase-layout.txt +102 -0
- data/notes/knife-cluster-commands.md +18 -0
- data/notes/named-cloud-objects.md +11 -0
- data/notes/opscode_org_key.png +0 -0
- data/notes/opscode_user_key.png +0 -0
- data/notes/philosophy.md +13 -0
- data/notes/rake_tasks.md +24 -0
- data/notes/renamed-recipes.txt +142 -0
- data/notes/silverware.md +85 -0
- data/notes/style_guide.md +300 -0
- data/notes/tips_and_troubleshooting.md +92 -0
- data/notes/version-3_2.md +273 -0
- data/notes/walkthrough-hadoop.md +168 -0
- data/notes/walkthrough-web.md +166 -0
- data/spec/fixtures/ec2/elb/snakeoil.crt +35 -0
- data/spec/fixtures/ec2/elb/snakeoil.key +51 -0
- data/spec/integration/minimal-chef-repo/chefignore +41 -0
- data/spec/integration/minimal-chef-repo/environments/_default.json +12 -0
- data/spec/integration/minimal-chef-repo/knife/credentials/knife-org.rb +19 -0
- data/spec/integration/minimal-chef-repo/knife/credentials/knife-user-ironfantester.rb +9 -0
- data/spec/integration/minimal-chef-repo/knife/knife.rb +66 -0
- data/spec/integration/minimal-chef-repo/roles/systemwide.rb +10 -0
- data/spec/integration/spec/elb_build_spec.rb +95 -0
- data/spec/integration/spec_helper.rb +16 -0
- data/spec/integration/spec_helper/launch_cluster.rb +55 -0
- data/spec/ironfan/ec2/elb_spec.rb +95 -0
- data/spec/ironfan/ec2/security_group_spec.rb +0 -6
- metadata +60 -3
data/notes/silverware.md
ADDED
@@ -0,0 +1,85 @@
|
|
1
|
+
# Silverware Chef Cookbook
|
2
|
+
|
3
|
+
## Overview
|
4
|
+
|
5
|
+
Cookbooks repeatably express these and other aspects:
|
6
|
+
|
7
|
+
* "I launch these daemons: ..."
|
8
|
+
* "I have a collection of logs at '/var/log/lol'"
|
9
|
+
* "I have a dashboard at 'http://....:...'"
|
10
|
+
* ... and much more.
|
11
|
+
|
12
|
+
Wouldn't it be nice if announcing a log directory caused...
|
13
|
+
|
14
|
+
- my log rotation system to start rotating my logs?
|
15
|
+
- a 'disk free space' gauge to be added to the monitoring dashboard for that service?
|
16
|
+
- Flume (or whatever) to begin picking up my logs and archiving them to a predictable location?
|
17
|
+
- in the case of standard apache logs, a listener to start counting the rate of requests, 200s, 404s and so forth?
|
18
|
+
Similarly, announcing ports should mean
|
19
|
+
- the firewall and security groups configure themselves correspondingly
|
20
|
+
- the monitor system starts regularly pinging the port for uptime and latency
|
21
|
+
- and pings the interfaces that it should *not* appear on to ensure the firewall is in place?
|
22
|
+
|
23
|
+
Ironfan makes those aspects standardized and predictable, and provides integration and discovery hooks. The key is to make integration *inevitable*: No more forgetting to rotate or monitor a service, or having a config change over here screw up a dependent system over there.
|
24
|
+
________________________________________________________________________
|
25
|
+
|
26
|
+
Attributes are scoped by *cookbook* and then by *component*.
|
27
|
+
|
28
|
+
* If I declare `announce(:redis)`, it will look in `node[:redis]`.
|
29
|
+
* If I declare `announce(:hadoop, :namenode)`, it will look in `node[:hadoop]` for cookbook-wide concerns and `node[:hadoop][:namenode]` for component-specific concerns.
|
30
|
+
* The cookbook scope is always named for its cookbook. Its attributes live in `node[:cookbook_name]`. If everything in the cookbook shares a concern, it sits at cookbook level. So the Hadoop log directory (shared by all its components) is at `(scratch_root)/hadoop/log`.
|
31
|
+
* If there is only one component, it can be implicitly named for its cookbook. In this case, it is omitted: the component attributes live in `node[:cookbook_name]` (which is the same as the component name).
|
32
|
+
* If there are multiple components, they will live in `node[:cookbook_name][:component_name]` (eg `[:hadoop][:namenode]` or `[:flume][:master]`).
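
As a rough illustration of the scoping rules above, the following sketch resolves the cookbook-wide and component-specific attributes from a plain Ruby hash standing in for the chef node. The helper name `scoped_attributes` and the sample values are hypothetical; they are not part of silverware.

```ruby
# Hypothetical helper (not silverware's API): resolve the attribute scopes
# for a cookbook and an optional component from a node-like hash.
def scoped_attributes(node, cookbook, component = nil)
  cookbook_attrs = node.fetch(cookbook, {})
  # With a single, implicitly named component, both scopes are node[:cookbook].
  component_attrs = component ? cookbook_attrs.fetch(component, {}) : cookbook_attrs
  { cookbook: cookbook_attrs, component: component_attrs }
end

# A plain hash standing in for the chef node; the values are made up.
node = {
  redis:  { port: 6379 },
  hadoop: { log_dir: '/mnt/hadoop/log', namenode: { handler_count: 40 } }
}

redis  = scoped_attributes(node, :redis)
hadoop = scoped_attributes(node, :hadoop, :namenode)
```

Here `redis[:cookbook]` and `redis[:component]` are the same hash, while `hadoop[:component]` only holds the namenode's own settings.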
|
33
|
+
|
34
|
+
### Discovery
|
35
|
+
|
36
|
+
Allow nodes to discover the location for a given service at runtime, adapting when new services register.
|
37
|
+
|
38
|
+
#### Operations:
|
39
|
+
|
40
|
+
* register for a service. A timestamp records the last registry.
|
41
|
+
* discover all chef nodes that have registered for the given service.
|
42
|
+
* discover the most recent chef node for that service.
|
43
|
+
* get the 'public_ip' for a service -- the address that nodes in the larger world should use
|
44
|
+
* get the 'private_ip' for a service -- the address that nodes on the local subnet / private cloud should use
|
45
|
+
|
46
|
+
#### Implementation
|
47
|
+
|
48
|
+
Nodes register a service by calling `announce(<service>[,<component>])`, which adds a hash to `node[:announces][<service>][<component>]` containing 'timestamp' (the time of registry) and any other metadata passed in. Nodes discover services by calling `discover(<service>[,<component>[,<realm>]])`, where realm is the scope of the discovery (the current cluster, by default).
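
To make the register/discover cycle concrete, here is a toy in-memory model of it. This is only a sketch: `ToyRegistry`, its method signatures, the `at:` keyword and the node names are assumptions made for illustration, and the real implementation stores announcements in chef node attributes rather than in a Ruby object.

```ruby
# In-memory stand-in for the set of chef nodes (an assumption for
# illustration; this is not silverware's implementation).
class ToyRegistry
  def initialize
    @nodes = {} # node name => { realm: ..., announces: { service => { component => info } } }
  end

  # Record that `node_name` provides service/component, with a timestamp.
  # When no component is given, the component is named for the service.
  def announce(node_name, realm, service, component = service, metadata: {}, at: Time.now.utc)
    node = (@nodes[node_name] ||= { realm: realm, announces: {} })
    (node[:announces][service] ||= {})[component] = metadata.merge(timestamp: at)
  end

  # Names of all nodes in `realm` that announced service/component,
  # ordered from oldest to newest announcement.
  def discover_all(realm, service, component = service)
    matches = @nodes.select do |_name, node|
      node[:realm] == realm && node[:announces].dig(service, component)
    end
    matches.sort_by { |_name, node| node[:announces][service][component][:timestamp] }
           .map(&:first)
  end

  # The most recently announced node for the service, or nil if there is none.
  def discover(realm, service, component = service)
    discover_all(realm, service, component).last
  end
end

registry = ToyRegistry.new
registry.announce('gibbon-master-0', :gibbon, :hadoop, :namenode, at: Time.utc(2012, 1, 1))
registry.announce('gibbon-master-1', :gibbon, :hadoop, :namenode, at: Time.utc(2012, 1, 2))
registry.announce('bonobo-master-0', :bonobo, :hadoop, :namenode, at: Time.utc(2012, 1, 3))

newest_namenode = registry.discover(:gibbon, :hadoop, :namenode)
```

Because the realm here is the `gibbon` cluster, the later announcement from `bonobo-master-0` is ignored and `newest_namenode` is `'gibbon-master-1'`.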
|
49
|
+
|
50
|
+
## Recipes
|
51
|
+
|
52
|
+
* `default` - Base configuration for silverware
|
53
|
+
|
54
|
+
## Integration
|
55
|
+
|
56
|
+
Supports platforms: Debian and Ubuntu
|
57
|
+
|
58
|
+
|
59
|
+
|
60
|
+
## Attributes
|
61
|
+
|
62
|
+
* `[:silverware][:conf_dir]` - (default: "/etc/silverware")
|
63
|
+
* `[:silverware][:log_dir]` - (default: "/var/log/silverware")
|
64
|
+
* `[:silverware][:home_dir]` - (default: "/etc/silverware")
|
65
|
+
* `[:silverware][:user]` - (default: "root")
|
66
|
+
* `[:users][:root][:primary_group]` - (default: "root")
|
67
|
+
|
68
|
+
## License and Author
|
69
|
+
|
70
|
+
Author:: Philip (flip) Kromer - Infochimps, Inc (<coders@infochimps.com>)
|
71
|
+
Copyright:: 2011, Philip (flip) Kromer - Infochimps, Inc
|
72
|
+
|
73
|
+
Licensed under the Apache License, Version 2.0 (the "License");
|
74
|
+
you may not use this file except in compliance with the License.
|
75
|
+
You may obtain a copy of the License at
|
76
|
+
|
77
|
+
http://www.apache.org/licenses/LICENSE-2.0
|
78
|
+
|
79
|
+
Unless required by applicable law or agreed to in writing, software
|
80
|
+
distributed under the License is distributed on an "AS IS" BASIS,
|
81
|
+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
82
|
+
See the License for the specific language governing permissions and
|
83
|
+
limitations under the License.
|
84
|
+
|
85
|
+
> readme generated by [ironfan](http://github.com/infochimps-labs/ironfan)'s cookbook_munger
|
data/notes/style_guide.md
ADDED
@@ -0,0 +1,300 @@
|
|
1
|
+
# Ironfan + Chef Style Guide
|
2
|
+
|
3
|
+
------------------------------------------------------------------------
|
4
|
+
|
5
|
+
### System+Component define Names
|
6
|
+
|
7
|
+
Name things uniformly for their system and component. For the ganglia master,
|
8
|
+
|
9
|
+
* attributes: `node[:ganglia][:master]`
|
10
|
+
* recipe: `ganglia::master`
|
11
|
+
* role: `ganglia_master`
|
12
|
+
* directories: `ganglia/master` (if specific to component), `ganglia` (if not).
|
13
|
+
- for example: `/var/log/ganglia/master`
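
The naming scheme above is regular enough to derive mechanically. The sketch below does so for the ganglia master; the helper `conventional_names` is hypothetical, and falling back to `system::default` when no component is given is an assumption rather than something this guide states.

```ruby
# Hypothetical helper: derive the conventional names for a system+component.
def conventional_names(system, component = nil)
  parts = [system, component].compact.map(&:to_s)
  {
    attributes: 'node' + parts.map { |part| "[:#{part}]" }.join,
    # Assumption: with no component, point at the cookbook's default recipe.
    recipe:     component ? parts.join('::') : "#{system}::default",
    role:       parts.join('_'),
    log_dir:    File.join('/var/log', *parts)
  }
end

names = conventional_names(:ganglia, :master)
```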
|
14
|
+
|
15
|
+
### Component names
|
16
|
+
|
17
|
+
* `agent.rb`
|
18
|
+
* `worker.rb`
|
19
|
+
* `datanode.rb`
|
20
|
+
* `webnode.rb`
|
21
|
+
|
22
|
+
|
23
|
+
### Recipes
|
24
|
+
|
25
|
+
Recipes partition these things:
|
26
|
+
|
27
|
+
* shared functionality between components
|
28
|
+
* proper event order
|
29
|
+
* optional or platform-specific functionality
|
30
|
+
|
31
|
+
* Within the foo cookbook, name your recipes like this:
|
32
|
+
- `default.rb` -- information shared by anyone using foo, including support packages, users and directories.
|
33
|
+
- `user.rb` -- define daemon users. Called 'user' even if there is more than one. It's OK to move this into `default.rb`.
|
34
|
+
- `install_from_X.rb` -- install packages (`install_from_package`), versioned tarballs (`install_from_release`). It's OK to move this into `default.rb`.
|
35
|
+
- `deploy.rb` -- use this when doing sha-versioned deploys.
|
36
|
+
- `plugins.rb` -- install additional plugins or support code. If you have separate plugins, name them `git_plugin`, `rspec_plugin`, etc.
|
37
|
+
- `server.rb` -- define the foo server process. Similarly, `agent`, `worker`, etc -- see component naming above.
|
38
|
+
- `client.rb` -- install libraries to *use* the foo service.
|
39
|
+
- `config_files.rb` -- discover other components, write final configuration to disk
|
40
|
+
- `finalize.rb` -- final cleanup
|
41
|
+
|
42
|
+
* Do not repeat the cookbook name in a recipe title: `ganglia::master`, not `ganglia::ganglia_master`.
|
43
|
+
* Use only `[a-z0-9_]` for cookbook and component names. Do not use capital letters or hyphens.
|
44
|
+
* Keep names short and descriptive (preferably 15 characters or less, or it jacks with the Chef webui).
|
45
|
+
|
46
|
+
* Always include a `default.rb` recipe, even if it is blank.
|
47
|
+
* *DO NOT* use the default recipe to install daemons or do anything interesting at all, even if that's currently the only thing the cookbook does. I want to be able to refer to the attributes in the apache cookbook without launching the apache service. Think of it like a C header file.
|
48
|
+
|
49
|
+
A `client` is also passive -- it lets me *use* the system without requiring that I run it. This means the client recipe should *never* launch a process (`chef_client` and `nfs_client` components are allowed exceptions).
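
The naming rules earlier in this section (allowed characters, length, not repeating the cookbook name) lend themselves to a mechanical check. The following is a minimal sketch of such a check; `check_recipe_name` and its messages are hypothetical and not an existing ironfan or knife command.

```ruby
# Hypothetical lint check for the recipe naming rules in this section.
NAME_PATTERN    = /\A[a-z0-9_]+\z/
MAX_NAME_LENGTH = 15 # the guide states this as a preference, not a hard limit

def check_recipe_name(cookbook, recipe)
  problems = []
  [cookbook, recipe].each do |name|
    problems << "#{name}: use only [a-z0-9_]" unless name.match?(NAME_PATTERN)
    problems << "#{name}: longer than #{MAX_NAME_LENGTH} characters" if name.length > MAX_NAME_LENGTH
  end
  problems << "#{recipe}: do not repeat the cookbook name" if recipe.start_with?("#{cookbook}_")
  problems
end

good = check_recipe_name('ganglia', 'master')
bad  = check_recipe_name('ganglia', 'ganglia_master')
```

`good` comes back empty, while `bad` holds a single complaint about the repeated cookbook name.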
|
50
|
+
|
51
|
+
### Cookbook Dependencies
|
52
|
+
|
53
|
+
* Dependencies should be announced in metadata.rb, of course.
|
54
|
+
* Explicitly `include_recipe` for system resources -- `runit`, `java`, `silverware`, `thrift` and `apt`.
|
55
|
+
- never
|
56
|
+
* *DO NOT* use `include_recipe` unless putting it in the role would be utterly un-interesting. You *want* the run to break unless it's explicitly included in the role.
|
57
|
+
- *yes*: `java`, `ruby`, `announces`, etc.
|
58
|
+
- *no*: `zookeeper::client`, `nfs::server`, or anything that will start a daemon
|
59
|
+
Remember: ordinary cookbooks describe systems, roles and integration cookbooks coordinate them.
|
60
|
+
* `include_recipe` statements should only appear in recipes that are entry points. Recipes that are not meant to be called directly should assume their dependencies have been met.
|
61
|
+
* If a recipe is meant to be the primary entrypoint, it *should* include default, and it should do so explicitly: `include_recipe 'foo::default'` (not just 'foo').
|
62
|
+
|
63
|
+
Crisply separate cookbook-wide concerns from component concerns.
|
64
|
+
|
65
|
+
Separate system configuration from multi-system integration. Cookbooks should provide hooks that are neighborly but not exhibitionist, and otherwise mind their own business.
|
66
|
+
|
67
|
+
### Templates
|
68
|
+
|
69
|
+
*DO NOT* refer to attributes directly on the node (`node[:foo]`). This prevents people from using those templates outside the cookbook. Instead:
|
70
|
+
|
71
|
+
```ruby
|
72
|
+
# in recipe
|
73
|
+
template 'fooconf.yml' do
|
74
|
+
variables :foo => node[:foo]
|
75
|
+
end
|
76
|
+
|
77
|
+
# in template
|
78
|
+
@foo[:log_dir]
|
79
|
+
```
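
To see why this keeps a template portable, the sketch below renders a template that reads only the variables it was handed, using Ruby's standard `erb` library outside of Chef. The `TemplateContext` class is a hypothetical stand-in for the way Chef exposes `variables` to a template as instance variables, and the attribute values are made up.

```ruby
require 'erb'

# Stand-in for the node attributes (hypothetical values).
node = { foo: { log_dir: '/var/log/foo' } }

# Minimal rendering context: each variable becomes an instance variable,
# which is how a template sees what the recipe passed via `variables`.
class TemplateContext
  def initialize(vars)
    vars.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(source)
    ERB.new(source).result(binding)
  end
end

template_source = "log_dir: <%= @foo[:log_dir] %>\n"
rendered = TemplateContext.new(foo: node[:foo]).render(template_source)
```

The template never mentions `node`, so another cookbook can reuse it by passing its own `:foo` hash.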
|
80
|
+
|
81
|
+
### Attributes
|
82
|
+
|
83
|
+
* Scope concerns by *cookbook* or *cookbook and component*. `node[:hadoop]` holds cookbook-wide concerns, `node[:hadoop][:namenode]` holds component-specific concerns.
|
84
|
+
* Attributes shared by all components sit at cookbook level, and are always named for the cookbook: `node[:hadoop][:log_dir]` (since it is shared by all its components).
|
85
|
+
* Component-specific attributes sit at component level (`node[:cookbook_name][:component_name]`): eg `node[:hadoop][:namenode][:service_state]`. Do not use a prefix (NO: `node[:hadoop][:namenode_handler_count]`)
|
86
|
+
|
87
|
+
* Refer to node attributes by symbol, never by method:
|
88
|
+
- `node[:ganglia][:log_dir]`, not `node.ganglia.log_dir` or `node['ganglia']['log_dir']`
|
89
|
+
|
90
|
+
#### Attribute Files
|
91
|
+
|
92
|
+
* The main attribute file should be named `attributes/default.rb`. Do not name the file after the cookbook, or anything else.
|
93
|
+
* If there are a sizeable number of tunable attributes (hadoop, cassandra), place them in `attributes/tuneables.rb`.
|
94
|
+
|
95
|
+
## Name Attributes for their aspects
|
96
|
+
|
97
|
+
Attributes should be named for their aspect: `port`, `log`, etc. Use generic names if there is only one attribute for an aspect, prefixed names if there are many:
|
98
|
+
- For a component that only opens one port: `node[:foo][:server][:port]`
|
99
|
+
- More than one port, use a prefix: `node[:foo][:server][:dash_port]` and `node[:foo][:server][:rpc_port]`.
|
100
|
+
|
101
|
+
Sometimes the conventions below are inappropriate. All we ask is in those cases that you *not* use the special magic name. For example, don't use `:port` and give it a comma-separated string; name it something else, like `:port_list`.
|
102
|
+
|
103
|
+
Here are specific conventions:
|
104
|
+
|
105
|
+
### File and Dir Aspects
|
106
|
+
|
107
|
+
A *file* is the full directory and basename for a file. A *dir* is a directory whose contents correspond to a single concern. A *prefix* is not intended to be used directly -- it will be decorated with suffixes to form dirs and files. A *basename* is only the leaf part of a file reference. Don't use the terms 'path' or 'filename'.
|
108
|
+
|
109
|
+
Ignore the temptation to make a one-true-home-for-my-system, or to fight the package maintainer's choices. (FIXME: Rewrite to encourage OS-correct naming schemas.)
|
110
|
+
- a sandbox holding dir, pid, log, ...
|
111
|
+
|
112
|
+
#### Application
|
113
|
+
|
114
|
+
* **prefix**: A container with directories bin, lib, share, src, to use according to convention
|
115
|
+
- default: `/usr/local`.
|
116
|
+
* **home_dir**: Logical location for the cookbook's system code.
|
117
|
+
- default: typically, leave it up to the package maintainer. Otherwise, `:prefix/share/:cookbook` should be a symlink to the `install_dir` (see below).
|
118
|
+
- instead of: `xx_home` / `dir` alone / `install_dir`
|
119
|
+
* **install_dir**: The cookbook's system code, in case the home dir is a pointer to potential alternates.
|
120
|
+
- default: `:prefix/share/:cookbook-:version` (if you don't need the directory after the cookbook runs, use `:prefix/src/:cookbook-:version` instead, eg `/usr/local/src/tokyo_tyrant-xx.xx`)
|
121
|
+
- Make `home_dir` a symlink to this directory (eg home_dir `/usr/local/share/elasticsearch` links to install_dir `/usr/local/share/elasticsearch-0.17.8`).
|
122
|
+
* **src_dir**: holds the compressed tarball, its expanded contents, and the compiled files when installing from source. Use this when you will run `make install` or equivalent and use the files elsewhere.
|
123
|
+
- default: `:prefix/src/:system_name-:version`, eg `/usr/local/src/pig-0.9.tar.gz`
|
124
|
+
- do not: expand the tarball to `:prefix/src/(whatever)` if it will actually be used from there; instead, use the `install_dir` convention described above. (As a guideline, I should be able to blow away `/usr/local/src` and everything still works).
|
125
|
+
* **deploy_dir**: deployed code that follows the capistrano convention. See more about deploy variables below.
|
126
|
+
- the `:deploy_dir/shared` directory holds common files
|
127
|
+
- releases are checked out to `:deploy_dir/releases/{sha}`
|
128
|
+
- the operational release is a symlink to the right release: `:deploy_dir/current -> :deploy_dir/releases/xxx`.
|
129
|
+
- do not: use this when you mean `home_dir`.
|
130
|
+
|
131
|
+
* **scratch_roots**, **persistent_roots**: an array of directories spread across volumes, with expectations on persistence
|
132
|
+
- `scratch_root`s have no guarantee of persistence -- for example, stop/start'ing a machine on EC2 destroys the contents of its local (ephemeral) drives. `persistent_root`s have the *best available* promise of persistence: if permanent (eg EBS) volumes are available, they will exclusively populate the `persistent_root`s; but if not, the ephemeral drives are used instead.
|
133
|
+
- these attributes are provided by the `mountable_volume` meta-cookbook and its appropriate integration recipe. Ordinary cookbooks should always trust the integration cookbook's choices (or visit the integration cookbook to correct them).
|
134
|
+
- each element in `persistent_roots` is by contract on a separate volume, and similarly each of the `scratch_roots` is on a separate volume. A volume *may* be in both scratch and persistent (for example, there may be only one volume!).
|
135
|
+
- the singular forms **scratch_root** and **persistent_root** are provided for your convenience and always correspond to `scratch_roots.first` and `persistent_roots.first`. This means the first named volume is picked on the heaviest -- if you don't like that, choose explicitly (but not randomly, or you won't be idempotent).
|
136
|
+
|
137
|
+
|
138
|
+
* **log_file**, **log_dir**, **xx_log_file**, **xx_log_dir**:
|
139
|
+
- default:
|
140
|
+
- if the log files will always be trivial in size, put them in `/var/log/:cookbook.log` or `/var/log/:cookbook/(whatever)`.
|
141
|
+
- if it's a runit-managed service, leave them in `/etc/sv/:cookbook-:component/log/main/current`, and make a symlink from `/var/log/:cookbook-:component` to `/etc/sv/:cookbook-:component/log/main/`.
|
142
|
+
- If the log files are non-trivial in size, set log dir `/:scratch_root/:cookbook/log/`, and symlink `/var/log/:cookbook/` to it.
|
143
|
+
- If the log files should be persisted, place them in `/:persistent_root/:cookbook/log`, and symlink `/var/log/:cookbook/` to it.
|
144
|
+
- in all cases, the directory is named `.../log`, not `.../logs`. Never put things in `/tmp`.
|
145
|
+
- Use the physical location for the `log_dir` attribute, not the /var/log symlink.
|
146
|
+
* **tmp_dir**:
|
147
|
+
- default: `/:scratch_root/:cookbook/tmp/`
|
148
|
+
- Do not put a symlink or directory in `/tmp` -- something else blows it away, the app recreates it as a physical directory, `/tmp` overflows, pagers go off, sadness spreads throughout the land.
|
149
|
+
* **conf_dir**:
|
150
|
+
- default: `/etc/:cookbook`
|
151
|
+
* **bin_dir**:
|
152
|
+
- default: `/:home_dir/bin`
|
153
|
+
* **pid_file**, **pid_dir**:
|
154
|
+
- default: pid_file: `/var/run/:cookbook.pid` or `/var/run/:cookbook/:component.pid`; pid_dir: `/var/run/:cookbook/`
|
155
|
+
- instead of: `job_dir`, `job_file`, `pidfile`, `run_dir`.
|
156
|
+
* **cache_dir**:
|
157
|
+
- default: `/var/cache/:cookbook`.
|
158
|
+
|
159
|
+
* **data_dir**:
|
160
|
+
- default: `:persistent_root/:cookbook/:component/data`
|
161
|
+
- instead of: `datadir`, `dbfile`, `dbdir`
|
162
|
+
* **journal_dir**: high-speed local storage for commitlogs and so forth. Can be deleted, though you may rather it wasn't.
|
163
|
+
- default: `:scratch_root/:cookbook/:component/scratch`
|
164
|
+
- instead of: `commitlog_dir`
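
The defaults listed above can be collected into one place. The sketch below does that for a subset of the directory aspects; the helper `default_dirs`, the sample cookbook and the volume locations are hypothetical, and `log_dir` follows the "non-trivial in size" case, which is only one of the options described.

```ruby
# Hypothetical helper: default locations for some of a cookbook's
# directories, following the conventions listed above.
def default_dirs(cookbook, component, scratch_root:, persistent_root:)
  {
    conf_dir:    File.join('/etc', cookbook),
    pid_dir:     File.join('/var/run', cookbook),
    cache_dir:   File.join('/var/cache', cookbook),
    # Assumes non-trivial logs that do not need to be persisted.
    log_dir:     File.join(scratch_root, cookbook, 'log'),
    tmp_dir:     File.join(scratch_root, cookbook, 'tmp'),
    data_dir:    File.join(persistent_root, cookbook, component, 'data'),
    journal_dir: File.join(scratch_root, cookbook, component, 'scratch')
  }
end

# The cookbook, component and roots here are made-up example values.
dirs = default_dirs('cassandra', 'server', scratch_root: '/mnt', persistent_root: '/data/ebs1')
```

In a real cookbook the two roots would come from the volume integration recipe rather than being passed in by hand.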
|
165
|
+
|
166
|
+
### Daemon Aspects
|
167
|
+
|
168
|
+
* **daemon_name**: daemon's actual service name, if it differs from the component. For example, the `hadoop-namenode` component's daemon is `hadoop-0.20-namenode` as installed by apt.
|
169
|
+
* **daemon_states**: an array of the verbs acceptable to the Chef `service` resource: `:enable`, `:start`, etc.
|
170
|
+
* **num_xx_processes**, **num_xx_threads** the number of separate top-level processes (distinct PIDs) or internal threads to run
|
171
|
+
- instead of `num_workers`, `num_servers`, `worker_processes`, `foo_threads`.
|
172
|
+
* **log_level**
|
173
|
+
- application-specific; often takes values info, debug, warn
|
174
|
+
- instead of `verbose`, `verbosity`, `loglevel`
|
175
|
+
* **user**, **group**, **uid**, **gid** -- `user` is the user name. The `user` and `group` should be strings, while the `uid` and `gid` should be integers.
|
176
|
+
- instead of username, group_name, using uid for user name or vice versa.
|
177
|
+
- if there are multiple users, use a prefix: `launcher_user` and `observer_user`.
|
178
|
+
|
179
|
+
### Install / Deploy Aspects
|
180
|
+
|
181
|
+
* **release_url**: URL for the release.
|
182
|
+
- instead of: install_url, package_url, being careless about partial vs whole URLs
|
183
|
+
* **release_file**: Where to put the release.
|
184
|
+
- default: `:prefix/src/system_name-version.ext`, eg `/usr/local/src/elasticsearch-0.17.8.tar.bz2`.
|
185
|
+
- do not use `/tmp` -- let me decide when to blow it away (and make it easy to be idempotent).
|
186
|
+
- do not use a non-versioned URL or file name.
|
187
|
+
* **release_file_sha** or **release_file_md5** fingerprint
|
188
|
+
- instead of: `whatever_checksum`, `whatever_fingerprint`
|
189
|
+
* **version**: if it's a simply-versioned resource that uses the `major.minor.patch-cruft` convention. Do not use unless this is true, and do not use the source control revision ID.
|
190
|
+
|
191
|
+
* **plugins**: array of system-specific plugins
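
As a small worked example of the `release_file` convention, this sketch builds the versioned file location from its parts. The helper name `release_file` and its argument list are hypothetical; only the resulting layout comes from the convention above.

```ruby
# Hypothetical helper: build the versioned release_file location,
# ie :prefix/src/system_name-version.ext
def release_file(prefix, system_name, version, ext)
  # The guide asks for versioned file names, so refuse an empty version.
  raise ArgumentError, 'release files must be versioned' if version.to_s.empty?
  File.join(prefix, 'src', "#{system_name}-#{version}.#{ext}")
end

path = release_file('/usr/local', 'elasticsearch', '0.17.8', 'tar.bz2')
```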
|
192
|
+
|
193
|
+
use `deploy_{}` for anything that would be true whatever SCM you're using; use `git_{}` (and so forth) where specific to that repo.
|
194
|
+
|
195
|
+
* **deploy_env** production / staging / etc
|
196
|
+
* **deploy_strategy**
|
197
|
+
* **deploy_user** user to run as
|
198
|
+
* **deploy_dir**: Only use `deploy_dir` if you are following the capistrano convention: see above.
|
199
|
+
|
200
|
+
* **git_repo**: url for the repo, eg `git@github.com:infochimps-labs/ironfan.git` or `http://github.com/infochimps-labs/ironfan.git`
|
201
|
+
- instead of: `deploy_repo`, `git_url`
|
202
|
+
* **git_revision**: SHA or branch
|
203
|
+
- instead of: `deploy_revision`
|
204
|
+
|
205
|
+
* **apt/(repo_name)** Options for adding a cookbook's apt repo.
|
206
|
+
- Note that this is filed under *apt*, not the cookbook.
|
207
|
+
- Use the best name for the repo, which is not necessarily the cookbook's name: eg `apt/cloudera/{...}`, which is shared by hadoop, flume, pig, and so on.
|
208
|
+
- `apt/{repo_name}/url` -- eg `http://archive.cloudera.com/debian`
|
209
|
+
- `apt/{repo_name}/key` -- GPG key
|
210
|
+
- `apt/{repo_name}/force_distro` -- forces the distro (eg, you are on natty but the apt repo only has maverick)
|
211
|
+
|
212
|
+
### Ports
|
213
|
+
|
214
|
+
* **xx_port**:
|
215
|
+
- *do not* use 'port' on its own.
|
216
|
+
- examples: `thrift_port`, `webui_port`, `zookeeper_port`, `carbon_port` and `whisper_port`.
|
217
|
+
- xx_port: `default[:foo][:server][:port] = 5000`
|
218
|
+
- xx_ports, if an array: `default[:foo][:server][:ports] = [5000, 5001, 5002]`
|
219
|
+
|
220
|
+
* **addr**, **xx_addr**
|
221
|
+
- if all ports bind to the same interface, use `addr`. Otherwise, do *not* use `addr`, and use a unique `foo_addr` for each `foo_port`.
|
222
|
+
- instead of: `hostname`, `binding`, `address`
|
223
|
+
|
224
|
+
* Want some way to announce my port is http or https.
|
225
|
+
* Need to distinguish client ports from service ports. You should be using cluster service discovery anyway though.
|
226
|
+
|
227
|
+
### Application Integration
|
228
|
+
|
229
|
+
* **jmx_port**
|
230
|
+
|
231
|
+
### Tunables
|
232
|
+
|
233
|
+
* **xx_heap_max**, **xx_heap_min**, **java_heap_eden**
|
234
|
+
* **java_home**
|
235
|
+
* AVOID batch declaration of options (e.g. **java_opts**) if possible: assemble it in your recipe from intelligible attribute names.
|
236
|
+
|
237
|
+
### Nitpicks
|
238
|
+
|
239
|
+
* Always put file modes in quote marks: `mode "0664"` not `mode 0664`.
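
One way to see what is at stake: in Ruby a bare integer mode is only right if the leading zero is present, because that zero is what makes the literal octal. The snippet below is plain Ruby and shows the difference. That Chef reads a quoted mode such as `"0664"` as octal is my understanding of its behaviour and should be checked against the Chef documentation for your version.

```ruby
octal_literal = 0664        # leading zero: Ruby parses this as octal
decimal_typo  = 664         # no leading zero: plain decimal
quoted        = '0664'.oct  # a quoted mode string, read as octal

intended_bits = format('%o', octal_literal) # permission bits as octal digits
typo_bits     = format('%o', decimal_typo)  # what the typo actually means
```

`intended_bits` is `"664"` but `typo_bits` is `"1230"`, so dropping the zero silently changes the permissions; a quoted string avoids the question.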
|
240
|
+
|
241
|
+
## Announcing Aspects
|
242
|
+
|
243
|
+
If your app does any of the following,
|
244
|
+
|
245
|
+
* **services** -- Any interesting long-running process.
|
246
|
+
* **ports** -- Any reserved open application port
|
247
|
+
- *http*: HTTP application port
|
248
|
+
- *https*: HTTPS application port
|
249
|
+
- *internal*: port is on private IP, should *not* be visible through public IP
|
250
|
+
- *external*: port *is* available through public IP
|
251
|
+
* metric_ports:
|
252
|
+
- **jmx_ports** -- JMX diagnostic port (announced by many Java apps)
|
253
|
+
* **dashboards** -- Web interface to look inside a system; typically internal-facing only, and probably not performance-monitored by default.
|
254
|
+
* **logs** -- um, logs. You can also announce the logs' flavor: `:apache`, `log4j`, etc.
|
255
|
+
* **scheduleds** -- regularly-occurring events that leave a trace
|
256
|
+
* **exports** -- jars or libs that other programs may wish to incorporate
|
257
|
+
* **consumes** -- placed there by any call to `discover`.
|
258
|
+
|
259
|
+
## Clusters
|
260
|
+
|
261
|
+
* Describe physical configuration:
|
262
|
+
- machine size, number of instances per facet, etc
|
263
|
+
- external assets (elastic IP, ebs volumes)
|
264
|
+
* Describe high-level assembly of systems via roles: `hadoop_namenode`, `nfs_client`, `ganglia_agent`, etc.
|
265
|
+
* Describe important modifications, such as `ironfan::system_internals`, mounts ebs volumes, etc
|
266
|
+
* Describe override attributes:
|
267
|
+
- `heap size`, rvm versions, etc.
|
268
|
+
|
269
|
+
* roles and recipes
|
270
|
+
- remove `cluster_role` and `facet_role` if empty
|
271
|
+
- are not in `run_list`, but populated by the `role` and `recipe` directives
|
272
|
+
* remove big_package unless it's a dev machine (sandbox, etc)
|
273
|
+
|
274
|
+
## Roles
|
275
|
+
|
276
|
+
Roles define the high-level assembly of recipes into systems
|
277
|
+
|
278
|
+
* override attributes go into the cluster.
|
279
|
+
currently, those files are typically empty and are badly cluttering the roles/ directory.
|
280
|
+
the cluster and facet override attributes should be together, not scattered in different files.
|
281
|
+
roles shouldn't assemble systems. The contents of the infochimps_chef/roles/plato_truth.rb file belong in a facet.
|
282
|
+
|
283
|
+
* Deprecated:
|
284
|
+
- Cluster and facet roles (`roles/gibbon_cluster.rb`, `roles/gibbon_namenode.rb`, etc) go away
|
285
|
+
- Roles should be service-oriented: `hadoop_master` considered harmful, you should explicitly enumerate the services
|
286
|
+
|
287
|
+
|
288
|
+
### Facets should be (nearly) identical
|
289
|
+
|
290
|
+
Within a facet, keep your servers almost entirely identical. For example, servers in a MySQL facet would use their index to set shard order and to claim the right attached volumes. However, it would be a mistake to have one server within a facet be a master process and the rest be worker processes -- just define different facets for each.
|
291
|
+
|
292
|
+
### Pedantic Distinctions:
|
293
|
+
|
294
|
+
Separate the following terms:
|
295
|
+
|
296
|
+
* A *machine* is a concrete thing that runs your code -- it might be a VM or raw metal, but it has CPUs and fans and a finite lifetime. It has a unique name tied to its physical presence -- something like 'i-123abcd' or 'rack 4 server 7'.
|
297
|
+
* A *chef node* is the code object that, together with the chef-client process, configures a machine. In ironfan, the chef node is strictly slave to the server description and the measured attributes of the machine.
|
298
|
+
* A *server description* gives the high-level specification the machine should achieve. This includes the roles, recipes and attributes given to the chef node; the physical characteristics of the machine ('8 cores, 7GB ram, AWS cloud'); and its relation to the rest of the system (george cluster, webnode facet, index 3).
|
299
|
+
|
300
|
+
In particular, we try to be careful to always call a Chef node a 'chef node' (never just 'node'). Try processing graph nodes in a flume node feeding a node.js decorator on a cloud node defined by a chef node. No(de) way.
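
The "relation to the rest of the system" part of a server description (cluster, facet, index) can be modelled as a small value object. This is a sketch only: `ServerName` is hypothetical, and the `cluster-facet-index` name format is an assumption inferred from examples such as 'george cluster, webnode facet, index 3'.

```ruby
# Hypothetical value object for a server's place in the system.
ServerName = Struct.new(:cluster, :facet, :index) do
  # Assumes names of the form cluster-facet-index, eg 'george-webnode-3'.
  def self.parse(full_name)
    match = /\A(?<cluster>[a-z0-9_]+)-(?<facet>[a-z0-9_]+)-(?<index>\d+)\z/.match(full_name)
    raise ArgumentError, "not a cluster-facet-index name: #{full_name}" unless match
    # Base 10 is given explicitly so an index like '08' is not read as octal.
    new(match[:cluster], match[:facet], Integer(match[:index], 10))
  end

  def to_s
    [cluster, facet, index].join('-')
  end
end

server = ServerName.parse('george-webnode-3')
```

Keeping the index as an integer makes it usable for the per-server differences a facet does allow, such as shard order.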
|
data/notes/tips_and_troubleshooting.md
ADDED
@@ -0,0 +1,92 @@
|
|
1
|
+
## Tips and Notes
|
2
|
+
|
3
|
+
### Gems
|
4
|
+
|
5
|
+
knife cluster ssh bonobo-worker-2 'sudo gem update --system'
|
6
|
+
knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/gems/1.9.2-p290/specifications/* ; do sudo sed -i.bak "s!000000000Z!!" $foo ; done'
|
7
|
+
knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/site_ruby/*/rubygems/deprecate.rb ; do sudo sed -i.bak "s!@skip ||= false!true!" $foo ; done'
|
8
|
+
|
9
|
+
|
10
|
+
### EC2 Notes Instance attributes: `disable_api_termination` and `delete_on_termination`
|
11
|
+
|
12
|
+
To set `delete_on_termination` to 'true' after the fact, run the following (modify the instance and volume to suit):
|
13
|
+
|
14
|
+
```
|
15
|
+
ec2-modify-instance-attribute -v i-0704be6c --block-device-mapping /dev/sda1=vol-XX8d2c80::true
|
16
|
+
```
|
17
|
+
|
18
|
+
If you set `disable_api_termination` to true, in order to terminate the node run
|
19
|
+
```
|
20
|
+
ec2-modify-instance-attribute -v i-0704be6c --disable-api-termination false
|
21
|
+
```
|
22
|
+
|
23
|
+
To view whether an attached volume is deleted when the machine is terminated:
|
24
|
+
|
25
|
+
```
|
26
|
+
# show volumes that will be deleted
|
27
|
+
ec2-describe-volumes --filter "attachment.delete-on-termination=true"
|
28
|
+
```
|
29
|
+
|
30
|
+
You can't (as far as I know) alter the delete-on-termination flag of a running volume. Crazy, huh?
|
31
|
+
|
32
|
+
### EC2: See your userdata
|
33
|
+
|
34
|
+
curl http://169.254.169.254/latest/user-data
|
35
|
+
|
36
|
+
### EBS Volumes for a persistent HDFS
|
37
|
+
|
38
|
+
* Make one volume and format for XFS:
|
39
|
+
`$ sudo mkfs.xfs -f /dev/sdh1`
|
40
|
+
* options "defaults,nouuid,noatime" give good results. The 'nouuid' part
|
41
|
+
prevents errors when mounting multiple volumes from the same snapshot.
|
42
|
+
* poke a file onto the drive :
|
43
|
+
datename=`date +%Y%m%d`
|
44
|
+
sudo bash -c "(echo $datename ; df /data/ebs1 ) > /data/ebs1/xfs-created-at-$datename.txt"
|
45
|
+
|
46
|
+
|
47
|
+
If you want to grow the drive:
|
48
|
+
* take a snapshot.
|
49
|
+
* make a new volume from it
|
50
|
+
* mount that, and run `sudo xfs_growfs`. You *should* have the volume mounted, and should stop anything that would be working the volume hard.
|
51
|
+
|
52
|
+
### Hadoop: On-the-fly backup of your namenode metadata
|
53
|
+
|
54
|
+
bkupdir=/ebs2/hadoop-nn-backup/`date +"%Y%m%d"`
|
55
|
+
|
56
|
+
for srcdir in /ebs*/hadoop/hdfs/ /home/hadoop/gibbon/hdfs/ ; do
|
57
|
+
destdir=$bkupdir/$srcdir ; echo $destdir ;
|
58
|
+
sudo mkdir -p $destdir ;
|
59
|
+
done
|
60
|
+
|
61
|
+
|
62
|
+
### NFS: Halp I am using an NFS-mounted /home and now I can't log in as ubuntu
|
63
|
+
|
64
|
+
Say you set up an NFS server 'core-homebase-0' (in the 'core' cluster) to host and serve out the `/home` directory, and a machine 'awesome-webserver-0' (in the 'awesome' cluster) that is an NFS client.
|
65
|
+
|
66
|
+
In each case, when the machine was born EC2 created a `/home/ubuntu/.ssh/authorized_keys` file listing only the single approved machine keypair -- 'core' for the core cluster, 'awesome' for the awesome cluster.
|
67
|
+
|
68
|
+
When chef client runs, however, it mounts the NFS share at /home. This then masks the actual /home directory -- nothing that's on the base directory tree shows up. Which means that after chef runs, the /home/ubuntu/.ssh/authorized_keys file on awesome-webserver-0 is the one for the *'core'* cluster, not the *'awesome'* cluster.
|
69
|
+
|
70
|
+
The solution is to use the cookbook ironfan provides -- it moves the 'ubuntu' user's home directory to an alternative path not masked by the NFS.
|
71
|
+
|
72
|
+
|
73
|
+
### NFS: Problems starting NFS server on ubuntu maverick
|
74
|
+
|
75
|
+
For problems starting NFS server on ubuntu maverick systems, read, understand and then run /tmp/fix_nfs_on_maverick_amis.sh -- See [this thread for more](http://fossplanet.com/f10/[ec2ubuntu]-not-starting-nfs-kernel-daemon-no-support-current-kernel-90948/)
|
76
|
+
|
77
|
+
|
78
|
+
### Git deploys: My git deploy recipe has gone limp
|
79
|
+
|
80
|
+
Suppose you are using the `git` resource to deploy a recipe (`george` for sake of example). If `/var/chef/cache/revision_deploys/var/www/george` exists then *nothing* will get deployed, even if `/var/www/george/{release_sha}` is empty or screwy. If git deploy is acting up in any way, nuke that cache from orbit -- it's the only way to be sure.
|
81
|
+
|
82
|
+
$ sudo rm -rf /var/www/george/{release_sha} /var/chef/cache/revision_deploys/var/www/george
|
83
|
+
|
84
|
+
### Runit services : 'fail: XXX: unable to change to service directory: file does not exist'
|
85
|
+
|
86
|
+
Your service is probably installed but removed from runit's purview; check the `/etc/service` symlink. All of the following should be true:
|
87
|
+
|
88
|
+
* directory `/etc/sv/foo`, containing file `run` and dirs `log` and `supervise`
|
89
|
+
* `/etc/init.d/foo` is symlinked to `/usr/bin/sv`
|
90
|
+
* `/etc/service/foo` is symlinked to `/etc/sv/foo`
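
The three conditions above can be checked with a short script. The following is a sketch under assumptions: `runit_layout_problems` and `links_to?` are hypothetical helpers, the `root:` parameter exists only so the check can be pointed at a scratch directory, and the link targets are compared as literal strings, so a relative symlink would be reported as a problem.

```ruby
# True if `link` is a symlink whose stored target is exactly `target`.
def links_to?(link, target)
  File.symlink?(link) && File.readlink(link) == target
end

# Hypothetical check: list what is wrong with a runit service's layout.
# An empty result means all of the conditions above hold.
def runit_layout_problems(service, root: '/')
  sv_dir = File.join(root, 'etc/sv', service)
  problems = []
  problems << "#{sv_dir} is not a directory" unless File.directory?(sv_dir)
  problems << "#{sv_dir}/run is missing" unless File.file?(File.join(sv_dir, 'run'))
  %w[log supervise].each do |dir|
    problems << "#{sv_dir}/#{dir} is missing" unless File.directory?(File.join(sv_dir, dir))
  end
  {
    File.join(root, 'etc/init.d', service)  => '/usr/bin/sv',
    File.join(root, 'etc/service', service) => "/etc/sv/#{service}"
  }.each do |link, target|
    problems << "#{link} is not a symlink to #{target}" unless links_to?(link, target)
  end
  problems
end
```

On a machine with the problem, `puts runit_layout_problems('foo')` would print one line per missing piece.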
|
91
|
+
|
92
|
+
|