ironfan 3.1.7 → 3.2.2

Files changed (63)
  1. data/CHANGELOG.md +11 -0
  2. data/Gemfile +15 -12
  3. data/Rakefile +1 -1
  4. data/VERSION +1 -1
  5. data/config/ubuntu10.04-ironfan.erb +10 -0
  6. data/config/ubuntu11.10-ironfan.erb +10 -0
  7. data/ironfan.gemspec +29 -54
  8. data/lib/chef/knife/bootstrap/centos6.2-ironfan.erb +10 -0
  9. data/lib/chef/knife/bootstrap/ubuntu10.04-ironfan.erb +10 -0
  10. data/lib/chef/knife/bootstrap/ubuntu11.10-ironfan.erb +10 -0
  11. data/lib/chef/knife/cluster_kick.rb +7 -2
  12. data/lib/chef/knife/cluster_launch.rb +3 -0
  13. data/lib/chef/knife/cluster_ssh.rb +3 -3
  14. data/lib/chef/knife/ironfan_knife_common.rb +21 -0
  15. data/lib/chef/knife/ironfan_script.rb +2 -0
  16. data/lib/ironfan/chef_layer.rb +9 -9
  17. data/lib/ironfan/cloud.rb +232 -360
  18. data/lib/ironfan/cluster.rb +3 -3
  19. data/lib/ironfan/compute.rb +26 -40
  20. data/lib/ironfan/deprecated.rb +45 -10
  21. data/lib/ironfan/discovery.rb +1 -1
  22. data/lib/ironfan/dsl_builder.rb +99 -0
  23. data/lib/ironfan/facet.rb +2 -3
  24. data/lib/ironfan/fog_layer.rb +14 -10
  25. data/lib/ironfan/private_key.rb +1 -1
  26. data/lib/ironfan/security_group.rb +46 -44
  27. data/lib/ironfan/server.rb +26 -52
  28. data/lib/ironfan/server_slice.rb +13 -19
  29. data/lib/ironfan/volume.rb +47 -59
  30. data/lib/ironfan.rb +5 -4
  31. metadata +116 -122
  32. data/lib/ironfan/dsl_object.rb +0 -124
  33. data/notes/Backup of ec2-pricing_and_capacity.numbers +0 -0
  34. data/notes/Home.md +0 -45
  35. data/notes/INSTALL-cloud_setup.md +0 -103
  36. data/notes/INSTALL.md +0 -134
  37. data/notes/Ironfan-Roadmap.md +0 -70
  38. data/notes/advanced-superpowers.md +0 -16
  39. data/notes/aws_servers.jpg +0 -0
  40. data/notes/aws_user_key.png +0 -0
  41. data/notes/cookbook-versioning.md +0 -11
  42. data/notes/core_concepts.md +0 -200
  43. data/notes/declaring_volumes.md +0 -3
  44. data/notes/design_notes-aspect_oriented_devops.md +0 -36
  45. data/notes/design_notes-ci_testing.md +0 -169
  46. data/notes/design_notes-cookbook_event_ordering.md +0 -249
  47. data/notes/design_notes-meta_discovery.md +0 -59
  48. data/notes/ec2-pricing_and_capacity.md +0 -69
  49. data/notes/ec2-pricing_and_capacity.numbers +0 -0
  50. data/notes/homebase-layout.txt +0 -102
  51. data/notes/knife-cluster-commands.md +0 -18
  52. data/notes/named-cloud-objects.md +0 -11
  53. data/notes/opscode_org_key.png +0 -0
  54. data/notes/opscode_user_key.png +0 -0
  55. data/notes/philosophy.md +0 -13
  56. data/notes/rake_tasks.md +0 -24
  57. data/notes/renamed-recipes.txt +0 -142
  58. data/notes/silverware.md +0 -85
  59. data/notes/style_guide.md +0 -300
  60. data/notes/tips_and_troubleshooting.md +0 -92
  61. data/notes/version-3_2.md +0 -273
  62. data/notes/walkthrough-hadoop.md +0 -168
  63. data/notes/walkthrough-web.md +0 -166
data/notes/named-cloud-objects.md DELETED
@@ -1,11 +0,0 @@
1
- # Named Cloud Objects
2
-
3
- To add a new machine image, place this snippet:
4
-
5
- Chef::Config[:ec2_image_info] ||= {}
6
- Chef::Config[:ec2_image_info].merge!({
7
- # ... lines like this:
8
- # %w[ us-west-1 64-bit ebs natty ] => { :image_id => 'ami-4d580408' },
9
- })
10
-
11
- in your knife.rb or wherever. Ironfan will notice that it exists and add to it, rather than clobbering it.
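The "add to it, rather than clobbering it" behaviour in the deleted note above comes from the `||=` followed by `merge!` idiom. The following is a minimal sketch of that idiom only, not Ironfan's actual code: a plain Hash stands in for `Chef::Config` so it runs without Chef, and the second entry (its key and image id) is a made-up placeholder.

```ruby
# Plain Hash standing in for Chef::Config (assumption, so this runs without Chef).
config = {}

# First contributor, e.g. your knife.rb: create the table if absent, then add to it.
config[:ec2_image_info] ||= {}
config[:ec2_image_info].merge!({
  %w[us-west-1 64-bit ebs natty] => { :image_id => 'ami-4d580408' },
})

# Second contributor (hypothetical key, placeholder image id). Because `||=` does
# not reassign an existing Hash, the first entry is still there afterwards.
config[:ec2_image_info] ||= {}
config[:ec2_image_info].merge!({
  %w[us-east-1 64-bit ebs example] => { :image_id => 'ami-00000000' },
})

puts config[:ec2_image_info].keys.length
```

Note that `merge!` overwrites the value of a key that both contributors define, so the idiom protects the table as a whole, not individual entries.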
Binary file
Binary file
data/notes/philosophy.md DELETED
@@ -1,13 +0,0 @@
1
- ## Philosophy
2
-
3
- Some general principles of how we use Chef.
4
-
5
- * *Chef server is never the repository of truth* -- it only mirrors the truth. A file is tangible and immediate to access.
6
- * Specifically, we want truth to live in the git repo, and be enforced by the Chef server. This means that everything is versioned, documented and exchangeable. *There is no truth but git, and Chef is its messenger*.
7
- * *Systems, services and significant modifications to a cluster should be obvious from the `clusters` file*. I don't want to have to bounce around nine different files to find out which thing installed a redis:server. The existence of anything that opens a port should be obvious when I look at the cluster file.
8
- * *Roles define systems, clusters assemble systems into a machine*.
9
- - For example, a resque worker queue has a redis, a webserver and some config files -- your cluster should invoke a @whatever_queue@ role, and the @whatever_queue@ role should include recipes for the component services.
10
- - the existence of anything that opens a port _or_ runs as a service should be obvious when I look at the roles file.
11
- * *include_recipe considered harmful* Do NOT use include_recipe for anything that a) provides a service, b) launches a daemon or c) is interesting in any way. (so: @include_recipe java@ yes; @include_recipe iptables@ no.) You should note the dependency in the metadata.rb. This seems weird, but the breaking behavior is purposeful: it makes you explicitly state all dependencies.
12
- * It's nice when *machines are in full control of their destiny*. Their initial setup (elastic IP, attaching a drive) is often best enforced externally. However, machines should be able to independently assert things like load balancer registration, which may change at any point in their lifetime.
13
- * It's even nicer, though, to have *full idempotency from the command line*: I can at any time push truth from the git repo to the Chef server and know that it will take hold.
data/notes/rake_tasks.md DELETED
@@ -1,24 +0,0 @@
1
-
2
- Rake Tasks
3
- ==========
4
-
5
- The homebase contains a `Rakefile` that includes tasks that are installed with the Chef libraries. To view the tasks available within the homebase, each with a brief description, run `rake -T`.
6
-
7
- Besides your `~/.chef/knife.rb` file, the Rakefile loads `config/rake.rb`, which sets:
8
-
9
- * Constants used in the `ssl_cert` task for creating the certificates.
10
- * Constants that set the directory locations used in various tasks.
11
-
12
- If you use the `ssl_cert` task, change the values in the `config/rake.rb` file appropriately. These values were also used in the `new_cookbook` task, but that task is replaced by the `knife cookbook create` command which can be configured below.
13
-
14
- The default task (`default`) is run when executing `rake` with no arguments. It will call the task `test_cookbooks`.
15
-
16
- The following standard Chef tasks are typically accomplished using the rake file:
17
-
18
- * `bundle_cookbook[cookbook]` - Creates cookbook tarballs in the `pkgs/` dir.
19
- * `install` - Calls `update`, `roles` and `upload_cookbooks` Rake tasks.
20
- * `ssl_cert` - Create self-signed SSL certificates in `certificates/` dir.
21
- * `update` - Update the homebase from source control server, understands git and svn.
22
- * `roles` - iterates over the roles and uploads with `knife role from file`.
23
-
24
- Most other tasks use knife: run a bare `knife cluster`, `knife cookbook` (etc) to find out more.
data/notes/renamed-recipes.txt DELETED
@@ -1,142 +0,0 @@
1
- cassandra :: default |
2
- cassandra :: add_apt_repo | new
3
- cassandra :: install_from_git |
4
- cassandra :: install_from_package |
5
- cassandra :: install_from_release |
6
- cassandra :: config_from_data_bag | autoconf
7
- cassandra :: client |
8
- cassandra :: server |
9
- cassandra :: authentication | not include_recipe'd -- added to role
10
- cassandra :: bintools |
11
- cassandra :: ec2snitch |
12
- cassandra :: jna_support |
13
- cassandra :: mx4j |
14
- cassandra :: iptables |
15
- cassandra :: ruby_client |
16
- cassandra :: config_files | new
17
-
18
- elasticsearch :: default |
19
- elasticsearch :: install_from_git |
20
- elasticsearch :: install_from_release |
21
- elasticsearch :: plugins | install_plugins
22
- elasticsearch :: server |
23
- elasticsearch :: client |
24
- elasticsearch :: load_balancer |
25
- elasticsearch :: config_files | config
26
-
27
- flume :: default |
28
- flume :: master |
29
- flume :: agent | node
30
- flume :: plugin-hbase_sink | hbase_sink_plugin
31
- flume :: plugin-jruby | jruby_plugin
32
- flume :: test_flow |
33
- flume :: test_s3_source |
34
- flume :: config_files | config
35
-
36
- ganglia :: agent |
37
- ganglia :: default |
38
- ganglia :: server |
39
- ganglia :: config_files | new
40
-
41
- graphite :: default |
42
- graphite :: carbon |
43
- graphite :: ganglia |
44
- graphite :: dashboard | web
45
- graphite :: whisper |
46
-
47
- hadoop_cluster :: default |
48
- hadoop_cluster :: add_cloudera_repo |
49
- hadoop_cluster :: datanode |
50
- hadoop_cluster :: doc |
51
- hadoop_cluster :: hdfs_fuse |
52
- hadoop_cluster :: jobtracker |
53
- hadoop_cluster :: namenode |
54
- hadoop_cluster :: secondarynn |
55
- hadoop_cluster :: tasktracker |
56
- hadoop_cluster :: wait_on_hdfs_safemode |
57
- hadoop_cluster :: fake_topology |
58
- hadoop_cluster :: minidash |
59
- hadoop_cluster :: config_files | cluster_conf
60
-
61
- hbase :: default |
62
- hbase :: master |
63
- hbase :: minidash |
64
- hbase :: regionserver |
65
- hbase :: stargate |
66
- hbase :: thrift |
67
- hbase :: backup_tables |
68
- hbase :: config_files | config
69
-
70
- jenkins :: default |
71
- jenkins :: server |
72
- jenkins :: user_key |
73
- jenkins :: node_ssh |
74
- jenkins :: osx_worker |
75
- jenkins :: build_from_github |
76
- jenkins :: build_ruby_rspec |
77
- jenkins :: auth_github_oauth |
78
- jenkins :: plugins |
79
- #
80
- jenkins :: add_apt_repo |
81
- jenkins :: iptables |
82
- jenkins :: node_jnlp |
83
- jenkins :: node_windows |
84
- jenkins :: proxy_apache2 |
85
- jenkins :: proxy_nginx |
86
-
87
- minidash :: default |
88
- minidash :: server |
89
-
90
- mongodb :: default |
91
- mongodb :: apt | add_apt_repo
92
- mongodb :: install_from_release | source
93
- mongodb :: backup |
94
- mongodb :: config_server | fixme
95
- mongodb :: mongos | fixme
96
- mongodb :: server |
97
-
98
- nfs :: client |
99
- nfs :: default |
100
- nfs :: server |
101
-
102
- redis :: default |
103
- redis :: install_from_package |
104
- redis :: install_from_release |
105
- redis :: client |
106
- redis :: server |
107
-
108
- resque :: default |
109
- resque :: dedicated_redis |
110
- resque :: dashboard |
111
-
112
- route53 :: default |
113
- route53 :: set_hostname | ec2
114
-
115
- statsd :: default |
116
- statsd :: server |
117
-
118
- volumes :: default |
119
- volumes :: build_raid |
120
- volumes :: format |
121
- volumes :: mount |
122
- volumes :: resize |
123
- volumes_ebs :: default |
124
- volumes_ebs :: attach_ebs |
125
-
126
- zabbix :: agent |
127
- zabbix :: agent_prebuild |
128
- zabbix :: agent_source |
129
- zabbix :: database |
130
- zabbix :: database_mysql |
131
- zabbix :: default |
132
- zabbix :: firewall |
133
- zabbix :: server |
134
- zabbix :: server_source |
135
- zabbix :: web |
136
- zabbix :: web_apache |
137
- zabbix :: web_nginx |
138
-
139
- zookeeper :: default |
140
- zookeeper :: client |
141
- zookeeper :: server |
142
- zookeeper :: config_files |
data/notes/silverware.md DELETED
@@ -1,85 +0,0 @@
1
- # Silverware Chef Cookbook
2
-
3
- ## Overview
4
-
5
- Cookbooks repeatably express these and other aspects:
6
-
7
- * "I launch these daemons: ..."
8
- * "I have a collection of logs at '/var/log/lol'"
9
- * "I have a dashboard at 'http://....:...'"
10
- * ... and much more.
11
-
12
- Wouldn't it be nice if announcing a log directory caused...
13
-
14
- - my log rotation system to start rotating my logs?
15
- - a 'disk free space' gauge to be added to the monitoring dashboard for that service?
16
- - Flume (or whatever) began picking up my logs and archiving them to a predictable location?
17
- - in the case of standard apache logs, a listener to start counting the rate of requests, 200s, 404s and so forth?
18
18
- Similarly, announcing ports should mean
19
- - the firewall and security groups configure themselves correspondingly
20
- - the monitor system starts regularly pinging the port for uptime and latency
21
- - and pings the interfaces that it should *not* appear on to ensure the firewall is in place?
22
-
23
- Ironfan makes those aspects standardized and predictable, and provides integration and discovery hooks. The key is to make integration *inevitable*: No more forgetting to rotate or monitor a service, or having a config change over here screw up a dependent system over there.
24
- ________________________________________________________________________
25
-
26
- Attributes are scoped by *cookbook* and then by *component*.
27
-
28
- * If I declare `announce(:redis)`, it will look in `node[:redis]`.
29
- * If I declare `announce(:hadoop, :namenode)`, it will look in `node[:hadoop]` for cookbook-wide concerns and `node[:hadoop][:namenode]` for component-specific concerns.
30
- * The cookbook scope is always named for its cookbook. Its attributes live in `node[:cookbook_name]`. If everything in the cookbook shares a concern, it sits at cookbook level. So the Hadoop log directory (shared by all its components) is at `(scratch_root)/hadoop/log`.
31
- * If there is only one component, it can be implicitly named for its cookbook. In this case, it is omitted: the component attributes live in `node[:cookbook_name]` (which is the same as the component name).
32
- * If there are multiple components, they will live in `node[:cookbook_name][:component_name]` (eg `[:hadoop][:namenode]` or `[:flume][:master]`).
33
-
34
- ### Discovery
35
-
36
- Allow nodes to discover the location for a given service at runtime, adapting when new services register.
37
-
38
- #### Operations:
39
-
40
- * register for a service. A timestamp records the last registry.
41
- * discover all chef nodes that have registered for the given service.
42
- * discover the most recent chef node for that service.
43
- * get the 'public_ip' for a service -- the address that nodes in the larger world should use
44
- * get the 'private_ip' for a service -- the address that nodes on the local subnet / private cloud should use
45
-
46
- #### Implementation
47
-
48
- Nodes register a service by calling `announce(<service>[,<component>])`, which adds a hash to node[:announces][<service>][<component>], containing 'timestamp' (the time of registry) and other metadata passed in. Nodes discover services by calling `discover(<service>[,<component>[,<realm>]])`, where realm is the scope of the discovery (the current cluster, by default).
49
-
50
- ## Recipes
51
-
52
- * `default` - Base configuration for silverware
53
-
54
- ## Integration
55
-
56
- Supports platforms: Debian and Ubuntu
57
-
58
-
59
-
60
- ## Attributes
61
-
62
- * `[:silverware][:conf_dir]` - (default: "/etc/silverware")
63
- * `[:silverware][:log_dir]` - (default: "/var/log/silverware")
64
- * `[:silverware][:home_dir]` - (default: "/etc/silverware")
65
- * `[:silverware][:user]` - (default: "root")
66
- * `[:users][:root][:primary_group]` - (default: "root")
67
-
68
- ## License and Author
69
-
70
- Author:: Philip (flip) Kromer - Infochimps, Inc (<coders@infochimps.com>)
71
- Copyright:: 2011, Philip (flip) Kromer - Infochimps, Inc
72
-
73
- Licensed under the Apache License, Version 2.0 (the "License");
74
- you may not use this file except in compliance with the License.
75
- You may obtain a copy of the License at
76
-
77
- http://www.apache.org/licenses/LICENSE-2.0
78
-
79
- Unless required by applicable law or agreed to in writing, software
80
- distributed under the License is distributed on an "AS IS" BASIS,
81
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
82
- See the License for the specific language governing permissions and
83
- limitations under the License.
84
-
85
- > readme generated by [ironfan](http://github.com/infochimps-labs/ironfan)'s cookbook_munger
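The Discovery section of the deleted Silverware notes describes `announce` storing a timestamped hash under `node[:announces][<service>][<component>]` and `discover` returning the most recent registrant. Here is a minimal plain-Ruby sketch of that bookkeeping, not the cookbook's implementation. It assumes nodes are plain Hashes, that an Array stands in for the Chef server's node index, and that the component defaults to the service name; the `now` parameter and the `discover_all` helper are hypothetical, and realm scoping is left out.

```ruby
# Register a service on a node: store the metadata plus a timestamp under
# node[:announces][service][component].
def announce(node, service, component = service, metadata = {}, now = Time.now.utc)
  node[:announces] ||= {}
  node[:announces][service] ||= {}
  node[:announces][service][component] = metadata.merge(:timestamp => now)
end

# All nodes that have registered for the service, oldest registration first.
def discover_all(nodes, service, component = service)
  nodes.select { |n| n.dig(:announces, service, component) }
       .sort_by { |n| n[:announces][service][component][:timestamp] }
end

# The most recently registered node for the service, or nil if there is none.
def discover(nodes, service, component = service)
  discover_all(nodes, service, component).last
end

# Usage with made-up node names.
web1 = { :name => 'web1' }
web2 = { :name => 'web2' }
announce(web1, :redis, :redis, { :port => 6379 }, Time.at(100).utc)
announce(web2, :redis, :redis, { :port => 6379 }, Time.at(200).utc)
puts discover([web1, web2], :redis)[:name]
```

Passing the timestamp in explicitly keeps the ordering deterministic in the example; a real registry would rely on the time of the Chef run.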
data/notes/style_guide.md DELETED
@@ -1,300 +0,0 @@
1
- # Ironfan + Chef Style Guide
2
-
3
- ------------------------------------------------------------------------
4
-
5
- ### System+Component define Names
6
-
7
- Name things uniformly for their system and component. For the ganglia master,
8
-
9
- * attributes: `node[:ganglia][:master]`
10
- * recipe: `ganglia::master`
11
- * role: `ganglia_master`
12
- * directories: `ganglia/master` (if specific to component), `ganglia` (if not).
13
- - for example: `/var/log/ganglia/master`
14
-
15
- ### Component names
16
-
17
- * `agent.rb`
18
- * `worker.rb`
19
- * `datanode.rb`
20
- * `webnode.rb`
21
-
22
-
23
- ### Recipes
24
-
25
- Recipes partition these things:
26
-
27
- * shared functionality between components
28
- * proper event order
29
- * optional or platform-specific functionality
30
-
31
- * Within the foo cookbook, name your recipes like this:
32
- - `default.rb` -- information shared by anyone using foo, including support packages, users and directories.
33
- - `user.rb` -- define daemon users. Called 'user' even if there is more than one. It's OK to move this into the default cookbook.
34
- - `install_from_X.rb` -- install packages (`install_from_package`), versioned tarballs (`install_from_release`). It's OK to move this into `default.rb`.
35
- - `deploy.rb` -- use this when doing sha-versioned deploys.
36
- - `plugins.rb` -- install additional plugins or support code. If you have separate plugins, name them `git_plugin`, `rspec_plugin`, etc.
37
- - `server.rb` -- define the foo server process. Similarly, `agent`, `worker`, etc -- see component naming above.
38
- - `client.rb` -- install libraries to *use* the foo service.
39
- - `config_files.rb` -- discover other components, write final configuration to disk
40
- - `finalize.rb` -- final cleanup
41
-
42
- * Do not repeat the cookbook name in a recipe title: `ganglia::master`, not `ganglia::ganglia_master`.
43
- * Use only `[a-z0-9_]` for cookbook and component names. Do not use capital letters or hyphens.
44
- * Keep names short and descriptive (preferably 15 characters or less, or it jacks with the Chef webui).
45
-
46
- * Always include a `default.rb` recipe, even if it is blank.
47
- * *DO NOT* use the default cookbook to install daemons or do anything interesting at all, even if that's currently the only thing the recipe does. I want to be able to refer to the attributes in the apache cookbook without launching the apache service. Think of it like a C header file.
48
-
49
- A `client` is also passive -- it lets me *use* the system without requiring that I run it. This means the client recipe should *never* launch a process (the `chef_client` and `nfs_client` components are allowed exceptions).
50
-
51
- ### Cookbook Dependencies
52
-
53
- * Dependencies should be announced in metadata.rb, of course.
54
- * Explicitly `include_recipe` for system resources -- `runit`, `java`, `silverware`, `thrift` and `apt`.
55
- - never
56
- * *DO NOT* use `include_recipe` unless putting it in the role would be utterly un-interesting. You *want* the run to break unless it's explicitly included in the role.
57
- - *yes*: `java`, `ruby`, `announces`, etc.
58
- - *no*: `zookeeper::client`, `nfs::server`, or anything that will start a daemon
59
59
- Remember: ordinary cookbooks describe systems; roles and integration cookbooks coordinate them.
60
- * `include_recipe` statements should only appear in recipes that are entry points. Recipes that are not meant to be called directly should assume their dependencies have been met.
61
- * If a recipe is meant to be the primary entrypoint, it *should* include default, and it should do so explicitly: `include_recipe 'foo::default'` (not just 'foo').
62
-
63
- Crisply separate cookbook-wide concerns from component concerns.
64
-
65
- Separate system configuration from multi-system integration. Cookbooks should provide hooks that are neighborly but not exhibitionist, and otherwise mind their own business.
66
-
67
- ### Templates
68
-
69
- *DO NOT* refer to attributes directly on the node (`node[:foo]`). This prevents people from using those templates outside the cookbook. Instead:
70
-
71
- ```ruby
72
- # in recipe
73
- template 'fooconf.yml' do
74
- variables :foo => node[:foo]
75
- end
76
-
77
- # in template
78
- @foo[:log_dir]
79
- ```
80
-
81
- ### Attributes
82
-
83
- * Scope concerns by *cookbook* or *cookbook and component*. `node[:hadoop]` holds cookbook-wide concerns, `node[:hadoop][:namenode]` holds component-specific concerns.
84
- * Attributes shared by all components sit at cookbook level, and are always named for the cookbook: `node[:hadoop][:log_dir]` (since it is shared by all its components).
85
- * Component-specific attributes sit at component level (`node[:cookbook_name][:component_name]`): eg `node[:hadoop][:namenode][:service_state]`. Do not use a prefix (NO: `node[:hadoop][:namenode_handler_count]`)
86
-
87
- * Refer to node attributes by symbol, never by method:
88
- - `node[:ganglia][:log_dir]`, not `node.ganglia.log_dir` or `node['ganglia']['log_dir']`
89
-
90
- #### Attribute Files
91
-
92
- * The main attribute file should be named `attributes/default.rb`. Do not name the file after the cookbook, or anything else.
93
- * If there are a sizeable number of tunable attributes (hadoop, cassandra), place them in `attributes/tuneables.rb`.
94
-
95
- ## Name Attributes for their aspects
96
-
97
- Attributes should be named for their aspect: `port`, `log`, etc. Use generic names if there is only one attribute for an aspect, prefixed names if there are many:
98
- - For a component that only opens one port: `node[:foo][:server][:port]`
99
- - More than one port, use a prefix: `node[:foo][:server][:dash_port]` and `node[:foo][:server][:rpc_port]`.
100
-
101
- Sometimes the conventions below are inappropriate. All we ask is in those cases that you *not* use the special magic name. For example, don't use `:port` and give it a comma-separated string; name it something else, like `:port_list`.
102
-
103
- Here are specific conventions:
104
-
105
- ### File and Dir Aspects
106
-
107
- A *file* is the full directory and basename for a file. A *dir* is a directory whose contents correspond to a single concern. A *prefix* is not intended to be used directly -- it will be decorated with suffixes to form dirs and files. A *basename* is only the leaf part of a file reference. Don't use the terms 'path' or 'filename'.
108
-
109
- Ignore the temptation to make a one-true-home-for-my-system, or to fight the package maintainer's choices. (FIXME: Rewrite to encourage OS-correct naming schemas.)
110
- - a sandbox holding dir, pid, log, ...
111
-
112
- #### Application
113
-
114
- * **prefix**: A container with directories bin, lib, share, src, to use according to convention
115
- - default: `/usr/local`.
116
- * **home_dir**: Logical location for the cookbook's system code.
117
- - default: typically, leave it up to the package maintainer. Otherwise, `:prefix/share/:cookbook` should be a symlink to the `install_dir` (see below).
118
- - instead of: `xx_home` / `dir` alone / `install_dir`
119
- * **install_dir**: The cookbook's system code, in case the home dir is a pointer to potential alternates.
120
- - default: `:prefix/share/:cookbook-:version` (if you don't need the directory after the cookbook runs, use `:prefix/src/:cookbook-:version` instead, eg `/usr/local/src/tokyo_tyrant-xx.xx`)
121
- - Make `home_dir` a symlink to this directory (eg home_dir `/usr/local/share/elasticsearch` links to install_dir `/usr/local/share/elasticsearch-0.17.8`).
122
- * **src_dir**: holds the compressed tarball, its expanded contents, and the compiled files when installing from source. Use this when you will run `make install` or equivalent and use the files elsewhere.
123
- - default: `:prefix/src/:system_name-:version`, eg `/usr/local/src/pig-0.9.tar.gz`
124
- - do not: expand the tarball to `:prefix/src/(whatever)` if it will actually be used from there; instead, use the `install_dir` convention described above. (As a guideline, I should be able to blow away `/usr/local/src` and everything still works).
125
- * **deploy_dir**: deployed code that follows the capistrano convention. See more about deploy variables below.
126
- - the `:deploy_dir/shared` directory holds common files
127
- - releases are checked out to `:deploy_dir/releases/{sha}`
128
- - the operational release is a symlink to the right release: `:deploy_dir/current -> :deploy_dir/releases/xxx`.
129
- - do not: use this when you mean `home_dir`.
130
-
131
- * **scratch_roots**, **persistent_roots**: an array of directories spread across volumes, with expectations on persistence
132
- - `scratch_root`s have no guarantee of persistence -- for example, stop/start'ing a machine on EC2 destroys the contents of its local (ephemeral) drives. `persistent_root`s have the *best available* promise of persistence: if permanent (eg EBS) volumes are available, they will exclusively populate the `persistent_root`s; but if not, the ephemeral drives are used instead.
133
- - these attributes are provided by the `mountable_volume` meta-cookbook and its appropriate integration recipe. Ordinary cookbooks should always trust the integration cookbook's choices (or visit the integration cookbook to correct them).
134
- - each element in `persistent_roots` is by contract on a separate volume, and similarly each of the `scratch_roots` is on a separate volume. A volume *may* be in both scratch and persistent (for example, there may be only one volume!).
135
- - the singular forms **scratch_root** and **persistent_root** are provided for your convenience and always correspond to `scratch_roots.first` and `persistent_roots.first`. This means the first-named volume is used the most heavily -- if you don't like that, choose explicitly (but not randomly, or you won't be idempotent).
136
-
137
-
138
- * **log_file**, **log_dir**, **xx_log_file**, **xx_log_dir**:
139
- - default:
140
- - if the log files will always be trivial in size, put them in `/var/log/:cookbook.log` or `/var/log/:cookbook/(whatever)`.
141
- - if it's a runit-managed service, leave them in `/etc/sv/:cookbook-:component/log/main/current`, and make a symlink from `/var/log/:cookbook-:component` to `/etc/sv/:cookbook-:component/log/main/`.
142
- - If the log files are non-trivial in size, set log dir `/:scratch_root/:cookbook/log/`, and symlink `/var/log/:cookbook/` to it.
143
- - If the log files should be persisted, place them in `/:persistent_root/:cookbook/log`, and symlink `/var/log/:cookbook/` to it.
144
- - in all cases, the directory is named `.../log`, not `.../logs`. Never put things in `/tmp`.
145
- - Use the physical location for the `log_dir` attribute, not the /var/log symlink.
146
- * **tmp_dir**:
147
- - default: `/:scratch_root/:cookbook/tmp/`
148
- - Do not put a symlink or directory in `/tmp` -- something else blows it away, the app recreates it as a physical directory, `/tmp` overflows, pagers go off, sadness spreads throughout the land.
149
- * **conf_dir**:
150
- - default: `/etc/:cookbook`
151
- * **bin_dir**:
152
- - default: `/:home_dir/bin`
153
- * **pid_file**, **pid_dir**:
154
- - default: pid_file: `/var/run/:cookbook.pid` or `/var/run/:cookbook/:component.pid`; pid_dir: `/var/run/:cookbook/`
155
- - instead of: `job_dir`, `job_file`, `pidfile`, `run_dir`.
156
- * **cache_dir**:
157
- - default: `/var/cache/:cookbook`.
158
-
159
- * **data_dir**:
160
- - default: `:persistent_root/:cookbook/:component/data`
161
- - instead of: `datadir`, `dbfile`, `dbdir`
162
- * **journal_dir**: high-speed local storage for commitlogs and so forth. Can be deleted, though you may rather it wasn't.
163
- - default: `:scratch_root/:cookbook/:component/scratch`
164
- - instead of: `commitlog_dir`
165
-
166
- ### Daemon Aspects
167
-
168
- * **daemon_name**: daemon's actual service name, if it differs from the component. For example, the `hadoop-namenode` component's daemon is `hadoop-0.20-namenode` as installed by apt.
169
- * **daemon_states**: an array of the verbs acceptable to the Chef `service` resource: `:enable`, `:start`, etc.
170
- * **num_xx_processes**, **num_xx_threads** the number of separate top-level processes (distinct PIDs) or internal threads to run
171
- - instead of `num_workers`, `num_servers`, `worker_processes`, `foo_threads`.
172
- * **log_level**
173
- - application-specific; often takes values info, debug, warn
174
- - instead of `verbose`, `verbosity`, `loglevel`
175
- * **user**, **group**, **uid**, **gid** -- `user` is the user name. The `user` and `group` should be strings; the `uid` and `gid` should be integers.
176
- - instead of username, group_name, using uid for user name or vice versa.
177
- - if there are multiple users, use a prefix: `launcher_user` and `observer_user`.
178
-
179
- ### Install / Deploy Aspects
180
-
181
- * **release_url**: URL for the release.
182
- - instead of: install_url, package_url, being careless about partial vs whole URLs
183
- * **release_file**: Where to put the release.
184
- - default: `:prefix/src/system_name-version.ext`, eg `/usr/local/src/elasticsearch-0.17.8.tar.bz2`.
185
- - do not use `/tmp` -- let me decide when to blow it away (and make it easy to be idempotent).
186
- - do not use a non-versioned URL or file name.
187
- * **release_file_sha** or **release_file_md5** fingerprint
188
- - instead of: `whatever_checksum`, `whatever_fingerprint`
189
- * **version**: if it's a simply-versioned resource that uses the `major.minor.patch-cruft` convention. Do not use unless this is true, and do not use the source control revision ID.
190
-
191
- * **plugins**: array of system-specific plugins
192
-
193
- use `deploy_{}` for anything that would be true whatever SCM you're using; use `git_{}` (and so forth) where specific to that repo.
194
-
195
- * **deploy_env** production / staging / etc
196
- * **deploy_strategy**
197
- * **deploy_user** user to run as
198
- * **deploy_dir**: Only use `deploy_dir` if you are following the capistrano convention: see above.
199
-
200
- * **git_repo**: url for the repo, eg `git@github.com:infochimps-labs/ironfan.git` or `http://github.com/infochimps-labs/ironfan.git`
201
- - instead of: `deploy_repo`, `git_url`
202
- * **git_revision**: SHA or branch
203
- - instead of: `deploy_revision`
204
-
205
- * **apt/(repo_name)** Options for adding a cookbook's apt repo.
206
- - Note that this is filed under *apt*, not the cookbook.
207
- - Use the best name for the repo, which is not necessarily the cookbook's name: eg `apt/cloudera/{...}`, which is shared by hadoop, flume, pig, and so on.
208
- - `apt/{repo_name}/url` -- eg `http://archive.cloudera.com/debian`
209
- - `apt/{repo_name}/key` -- GPG key
210
- - `apt/{repo_name}/force_distro` -- forces the distro (eg, you are on natty but the apt repo only has maverick)
211
-
212
- ### Ports
213
-
214
- * **xx_port**:
215
- - *do not* use 'port' on its own.
216
- - examples: `thrift_port`, `webui_port`, `zookeeper_port`, `carbon_port` and `whisper_port`.
217
- - xx_port: `default[:foo][:server][:port] = 5000`
218
- - xx_ports, if an array: `default[:foo][:server][:ports] = [5000, 5001, 5002]`
219
-
220
- * **addr**, **xx_addr**
221
- - if all ports bind to the same interface, use `addr`. Otherwise, do *not* use `addr`, and use a unique `foo_addr` for each `foo_port`.
222
- - instead of: `hostname`, `binding`, `address`
223
-
224
- * Want some way to announce my port is http or https.
225
- * Need to distinguish client ports from service ports. You should be using cluster service discovery anyway though.
226
-
227
- ### Application Integration
228
-
229
- * **jmx_port**
230
-
231
- ### Tunables
232
-
233
- * **xx_heap_max**, **xx_heap_min**, **java_heap_eden**
234
- * **java_home**
235
- * AVOID batch declaration of options (e.g. **java_opts**) if possible: assemble it in your recipe from intelligible attribute names.
236
-
237
- ### Nitpicks
238
-
239
- * Always put file modes in quote marks: `mode "0664"` not `mode 0664`.
240
-
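The notes do not say why modes should be quoted. One likely reason (an assumption here) is that an unquoted mode is a Ruby integer literal, and dropping the leading zero silently turns it into a decimal number. The `as_octal` helper below is hypothetical and only makes the difference visible.

```ruby
# Hypothetical helper: show the octal digits an integer would be read as
# when used as a file mode.
def as_octal(value)
  value.to_s(8)
end

quoted   = "0664".to_i(8) # string parsed explicitly as octal
literal  = 0664           # leading zero: Ruby reads this literal as octal
mistyped = 664            # no leading zero: Ruby reads this as decimal

puts as_octal(quoted)   # 664
puts as_octal(literal)  # 664
puts as_octal(mistyped) # 1230, not the intended mode
```

A quoted string keeps the intent explicit and avoids depending on a single easily lost character.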
241
- ## Announcing Aspects
242
-
243
- If your app has any of the following aspects, announce them:
244
-
245
- * **services** -- Any interesting long-running process.
246
- * **ports** -- Any reserved open application port
247
- - *http*: HTTP application port
248
- - *https*: HTTPS application port
249
- - *internal*: port is on private IP, should *not* be visible through public IP
250
- - *external*: port *is* available through public IP
251
- * metric_ports:
252
- - **jmx_ports** -- JMX diagnostic port (announced by many Java apps)
253
- * **dashboards** -- Web interface to look inside a system; typically internal-facing only, and probably not performance-monitored by default.
254
- * **logs** -- um, logs. You can also announce the logs' flavor: `:apache`, `log4j`, etc.
255
- * **scheduleds** -- regularly-occurring events that leave a trace
256
- * **exports** -- jars or libs that other programs may wish to incorporate
257
- * **consumes** -- placed there by any call to `discover`.
258
-
259
- ## Clusters
260
-
261
- * Describe physical configuration:
262
- - machine size, number of instances per facet, etc
263
- - external assets (elastic IP, ebs volumes)
264
- * Describe high-level assembly of systems via roles: `hadoop_namenode`, `nfs_client`, `ganglia_agent`, etc.
265
- * Describe important modifications, such as `ironfan::system_internals`, mounting ebs volumes, etc.
266
- * Describe override attributes:
267
- - `heap size`, rvm versions, etc.
268
-
269
- * roles and recipes
270
- - remove `cluster_role` and `facet_role` if empty
271
- - are not in `run_list`, but populated by the `role` and `recipe` directives
272
- * remove big_package unless it's a dev machine (sandbox, etc)
273
-
274
- ## Roles
275
-
276
- Roles define the high-level assembly of recipes into systems.
277
-
278
- * override attributes go into the cluster.
279
- currently, those files are typically empty and are badly cluttering the roles/ directory.
280
- the cluster and facet override attributes should be together, not scattered in different files.
281
- roles shouldn't assemble systems. The contents of the infochimps_chef/roles/plato_truth.rb file belong in a facet.
282
-
283
- * Deprecated:
284
- - Cluster and facet roles (`roles/gibbon_cluster.rb`, `roles/gibbon_namenode.rb`, etc) go away
285
- - Roles should be service-oriented: `hadoop_master` is considered harmful; explicitly enumerate the services instead.
286
-
287
-
288
- ### Facets should be (nearly) identical
289
-
290
- Within a facet, keep your servers almost entirely identical. For example, servers in a MySQL facet would use their index to set shard order and to claim the right attached volumes. However, it would be a mistake to have one server within a facet be a master process and the rest be worker processes -- just define different facets for each.
291
-
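To illustrate the idea, the sketch below derives everything that differs between servers in a facet from the index alone. It is not Ironfan code: the `server_settings` helper, the `cluster-facet-index` name format, and the volume identifiers are all assumptions.

```ruby
# Hypothetical helper: every server in the facet runs the same code;
# only the index decides its shard position and which volume it claims.
def server_settings(cluster, facet, index, volumes)
  {
    name:        "#{cluster}-#{facet}-#{index}", # assumed naming format
    shard_order: index,
    volume:      volumes.fetch(index),           # raises if no volume exists for this index
  }
end

# Made-up volume identifiers, one per server in the facet.
mysql_volumes = ["vol-aaaa", "vol-bbbb", "vol-cccc"]

3.times do |index|
  puts server_settings("george", "mysql", index, mysql_volumes).inspect
end
```

A master/worker split does not fit this shape, since it would need per-server branching rather than a value computed from the index, which is why separate facets are the better fit.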
292
- ### Pedantic Distinctions:
293
-
294
- Separate the following terms:
295
-
296
- * A *machine* is a concrete thing that runs your code -- it might be a VM or raw metal, but it has CPUs and fans and a finite lifetime. It has a unique name tied to its physical presence -- something like 'i-123abcd' or 'rack 4 server 7'.
297
- * A *chef node* is the code object that, together with the chef-client process, configures a machine. In ironfan, the chef node is strictly slave to the server description and the measured attributes of the machine.
298
- * A *server description* gives the high-level specification the machine should achieve. This includes the roles, recipes and attributes given to the chef node; the physical characteristics of the machine ('8 cores, 7GB ram, AWS cloud'); and its relation to the rest of the system (george cluster, webnode facet, index 3).
299
-
300
- In particular, we try to be careful to always call a Chef node a 'chef node' (never just 'node'). Try processing graph nodes in a flume node feeding a node.js decorator on a cloud node defined by a chef node. No(de) way.