RubyGems - ironfan - Versions diffs - 5.0.11 → 6.0.0 - Mend

ironfan 5.0.11 → 6.0.0

Files changed (121)

data/.gitignore +4 -0
data/.gitmodules +3 -0
data/Gemfile +8 -26
data/Gemfile.lock +38 -41
data/NOTES-REALM.md +172 -0
data/Rakefile +19 -77
data/config/ubuntu12.04-ironfan.erb +7 -0
data/ironfan.gemspec +28 -225
data/lib/chef/cluster_knife.rb +26 -0
data/lib/chef/knife/bootstrap/ubuntu12.04-ironfan.erb +7 -0
data/lib/chef/knife/cluster_bootstrap.rb +1 -3
data/lib/chef/knife/cluster_diff.rb +2 -8
data/lib/chef/knife/cluster_kick.rb +1 -3
data/lib/chef/knife/cluster_kill.rb +1 -2
data/lib/chef/knife/cluster_launch.rb +17 -34
data/lib/chef/knife/cluster_list.rb +6 -5
data/lib/chef/knife/cluster_proxy.rb +1 -3
data/lib/chef/knife/cluster_pry.rb +1 -2
data/lib/chef/knife/cluster_show.rb +6 -7
data/lib/chef/knife/cluster_ssh.rb +10 -8
data/lib/chef/knife/cluster_start.rb +1 -2
data/lib/chef/knife/cluster_stop.rb +1 -2
data/lib/chef/knife/cluster_sync.rb +2 -3
data/lib/chef/knife/ironfan_knife_common.rb +58 -18
data/lib/chef/knife/ironfan_script.rb +0 -3
data/lib/ironfan/broker/computer.rb +14 -11
data/lib/ironfan/broker.rb +17 -12
data/lib/ironfan/cookbook_requirements.rb +155 -0
data/lib/ironfan/dsl/cloud.rb +2 -0
data/lib/ironfan/dsl/cluster.rb +25 -15
data/lib/ironfan/dsl/component.rb +12 -15
data/lib/ironfan/dsl/compute.rb +10 -8
data/lib/ironfan/dsl/ec2.rb +2 -26
data/lib/ironfan/dsl/facet.rb +16 -14
data/lib/ironfan/dsl/openstack.rb +147 -0
data/lib/ironfan/dsl/realm.rb +23 -16
data/lib/ironfan/dsl/security_group.rb +29 -0
data/lib/ironfan/dsl/server.rb +14 -5
data/lib/ironfan/dsl/static.rb +63 -0
data/lib/ironfan/dsl/vsphere.rb +1 -0
data/lib/ironfan/dsl.rb +1 -134
data/lib/ironfan/headers.rb +19 -0
data/lib/ironfan/provider/chef/node.rb +3 -2
data/lib/ironfan/provider/ec2/machine.rb +10 -14
data/lib/ironfan/provider/ec2/security_group.rb +58 -43
data/lib/ironfan/provider/openstack/elastic_ip.rb +96 -0
data/lib/ironfan/provider/openstack/keypair.rb +78 -0
data/lib/ironfan/provider/openstack/machine.rb +371 -0
data/lib/ironfan/provider/openstack/security_group.rb +224 -0
data/lib/ironfan/provider/openstack.rb +69 -0
data/lib/ironfan/provider/static/machine.rb +192 -0
data/lib/ironfan/provider/static.rb +23 -0
data/lib/ironfan/provider.rb +58 -1
data/lib/ironfan/requirements.rb +17 -1
data/lib/ironfan/version.rb +3 -0
data/lib/ironfan.rb +107 -172
data/spec/chef/cluster_bootstrap_spec.rb +2 -7
data/spec/chef/cluster_launch_spec.rb +1 -2
data/spec/fixtures/realms/samurai.rb +26 -0
data/spec/integration/minimal-chef-repo/clusters/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/config/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/.gitignore +1 -0
data/spec/integration/minimal-chef-repo/knife/credentials/certificates/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/client_keys/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/data_bag_keys/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/ec2_certs/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/ec2_keys/.gitkeep +0 -0
data/spec/integration/minimal-chef-repo/knife/credentials/ironfantest-validator.pem +27 -0
data/spec/integration/minimal-chef-repo/knife/credentials/ironfantester.pem +27 -0
data/spec/integration/minimal-chef-repo/tasks/.gitkeep +0 -0
data/spec/ironfan/cluster_spec.rb +1 -2
data/spec/ironfan/diff_spec.rb +0 -2
data/spec/ironfan/dsl_spec.rb +6 -3
data/spec/ironfan/ec2/cloud_provider_spec.rb +17 -18
data/spec/ironfan/ec2/elb_spec.rb +44 -41
data/spec/ironfan/ec2/security_group_spec.rb +45 -47
data/spec/ironfan/manifest_spec.rb +0 -1
data/spec/ironfan/plugin_spec.rb +55 -40
data/spec/ironfan/realm_spec.rb +42 -30
data/spec/spec_helper.rb +17 -31
data/spec/{spec_helper → support}/dummy_chef.rb +0 -0
data/spec/{spec_helper → support}/dummy_diff_drawer.rb +0 -0
metadata +78 -155
data/.rspec +0 -2
data/.yardopts +0 -19
data/VERSION +0 -2
data/chefignore +0 -41
data/notes/Future-development-proposals.md +0 -266
data/notes/Home.md +0 -55
data/notes/INSTALL-cloud_setup.md +0 -103
data/notes/INSTALL.md +0 -134
data/notes/Ironfan-Roadmap.md +0 -70
data/notes/Upgrading-to-v4.md +0 -66
data/notes/advanced-superpowers.md +0 -16
data/notes/aws_servers.jpg +0 -0
data/notes/aws_user_key.png +0 -0
data/notes/cookbook-versioning.md +0 -11
data/notes/core_concepts.md +0 -200
data/notes/declaring_volumes.md +0 -3
data/notes/design_notes-aspect_oriented_devops.md +0 -36
data/notes/design_notes-ci_testing.md +0 -169
data/notes/design_notes-cookbook_event_ordering.md +0 -249
data/notes/design_notes-meta_discovery.md +0 -59
data/notes/ec2-pricing_and_capacity.md +0 -75
data/notes/ec2-pricing_and_capacity.numbers +0 -0
data/notes/homebase-layout.txt +0 -102
data/notes/knife-cluster-commands.md +0 -21
data/notes/named-cloud-objects.md +0 -11
data/notes/opscode_org_key.png +0 -0
data/notes/opscode_user_key.png +0 -0
data/notes/philosophy.md +0 -13
data/notes/rake_tasks.md +0 -24
data/notes/renamed-recipes.txt +0 -142
data/notes/silverware.md +0 -85
data/notes/style_guide.md +0 -300
data/notes/tips_and_troubleshooting.md +0 -92
data/notes/walkthrough-hadoop.md +0 -168
data/notes/walkthrough-web.md +0 -166
data/spec/fixtures/gunbai.rb +0 -24
data/spec/test_config.rb +0 -20
data/tasks/chef_config.rake +0 -38

data/notes/renamed-recipes.txt DELETED

@@ -1,142 +0,0 @@
-cassandra      	:: default                 	|
-cassandra      	:: add_apt_repo            	| new
-cassandra      	:: install_from_git        	|
-cassandra      	:: install_from_package    	|
-cassandra      	:: install_from_release    	|
-cassandra      	:: config_from_data_bag       	| autoconf
-cassandra      	:: client                  	|
-cassandra      	:: server                  	|
-cassandra      	:: authentication          	| not include_recipe'd -- added to role
-cassandra      	:: bintools                	|
-cassandra      	:: ec2snitch               	|
-cassandra      	:: jna_support             	|
-cassandra      	:: mx4j                    	|
-cassandra      	:: iptables                	|
-cassandra      	:: ruby_client             	|
-cassandra      	:: config_files            	| new
-elasticsearch  	:: default                 	|
-elasticsearch  	:: install_from_git        	|
-elasticsearch  	:: install_from_release    	|
-elasticsearch  	:: plugins                 	| install_plugins
-elasticsearch  	:: server                  	|
-elasticsearch  	:: client                  	|
-elasticsearch  	:: load_balancer           	|
-elasticsearch  	:: config_files               	| config
-flume          	:: default                 	|
-flume          	:: master                  	|
-flume          	:: agent                   	| node
-flume          	:: plugin-hbase_sink       	| hbase_sink_plugin
-flume          	:: plugin-jruby            	| jruby_plugin
-flume          	:: test_flow               	|
-flume          	:: test_s3_source          	|
-flume          	:: config_files              	| config
-ganglia        	:: agent                   	|
-ganglia        	:: default                 	|
-ganglia        	:: server                  	|
-ganglia        	:: config_files            	| new
-graphite       	:: default                 	|
-graphite       	:: carbon                  	|
-graphite       	:: ganglia                 	|
-graphite       	:: dashboard                  	| web
-graphite       	:: whisper                 	|
-hadoop_cluster 	:: default                 	|
-hadoop_cluster 	:: add_cloudera_repo       	|
-hadoop_cluster 	:: datanode                	|
-hadoop_cluster 	:: doc                     	|
-hadoop_cluster 	:: hdfs_fuse               	|
-hadoop_cluster 	:: jobtracker              	|
-hadoop_cluster 	:: namenode                	|
-hadoop_cluster 	:: secondarynn             	|
-hadoop_cluster 	:: tasktracker             	|
-hadoop_cluster 	:: wait_on_hdfs_safemode     	|
-hadoop_cluster 	:: fake_topology           	|
-hadoop_cluster 	:: minidash                	|
-hadoop_cluster 	:: config_files            	| cluster_conf
-hbase          	:: default                 	|
-hbase          	:: master                  	|
-hbase          	:: minidash                	|
-hbase          	:: regionserver            	|
-hbase          	:: stargate                	|
-hbase          	:: thrift                  	|
-hbase          	:: backup_tables           	|
-hbase          	:: config_files              	| config
-jenkins        	:: default                 	|
-jenkins        	:: server                  	|
-jenkins        	:: user_key                	|
-jenkins        	:: node_ssh                	|
-jenkins        	:: osx_worker              	|
-jenkins        	:: build_from_github       	|
-jenkins        	:: build_ruby_rspec        	|
-jenkins        	:: auth_github_oauth       	|
-jenkins        	:: plugins                 	|
-#
-jenkins        	:: add_apt_repo            	|
-jenkins        	:: iptables                	|
-jenkins        	:: node_jnlp               	|
-jenkins        	:: node_windows            	|
-jenkins        	:: proxy_apache2           	|
-jenkins        	:: proxy_nginx             	|
-minidash       	:: default                 	|
-minidash       	:: server                  	|
-mongodb        	:: default                 	|
-mongodb        	:: apt                     	| add_apt_repo
-mongodb        	:: install_from_release       	| source
-mongodb        	:: backup                  	|
-mongodb        	:: config_server           	| fixme
-mongodb        	:: mongos                  	| fixme
-mongodb        	:: server                  	|
-nfs            	:: client                  	|
-nfs            	:: default                 	|
-nfs            	:: server                  	|
-redis          	:: default                 	|
-redis          	:: install_from_package    	|
-redis          	:: install_from_release    	|
-redis          	:: client                  	|
-redis          	:: server                  	|
-resque         	:: default                 	|
-resque         	:: dedicated_redis         	|
-resque         	:: dashboard               	|
-route53        	:: default                 	|
-route53        	:: set_hostname                	| ec2
-statsd         	:: default                 	|
-statsd         	:: server                  	|
-volumes        	:: default                 	|
-volumes        	:: build_raid              	|
-volumes        	:: format                  	|
-volumes        	:: mount                   	|
-volumes        	:: resize                  	|
-volumes_ebs    	:: default                 	|
-volumes_ebs    	:: attach_ebs              	|
-zabbix         	:: agent                   	|
-zabbix         	:: agent_prebuild          	|
-zabbix         	:: agent_source            	|
-zabbix         	:: database                	|
-zabbix         	:: database_mysql          	|
-zabbix         	:: default                 	|
-zabbix         	:: firewall                	|
-zabbix         	:: server                  	|
-zabbix         	:: server_source           	|
-zabbix         	:: web                     	|
-zabbix         	:: web_apache              	|
-zabbix         	:: web_nginx               	|
-zookeeper      	:: default                 	|
-zookeeper      	:: client                  	|
-zookeeper      	:: server                  	|
-zookeeper      	:: config_files            	|
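
Reviewer note: each row of the deleted table has the shape `cookbook :: recipe | annotation`, where the last column appears to hold either a former recipe name or a remark such as `new` or `fixme`. That reading of the third column is an assumption, and the helper below (`parse_recipe_line`, `Entry`) is hypothetical. A minimal Ruby sketch for turning such rows into records:

```ruby
# Hypothetical record for one row of renamed-recipes.txt.
Entry = Struct.new(:cookbook, :recipe, :note)

# Parse one diff line such as "-flume  :: agent  | node".
# Returns nil for blank lines, comment lines ("#"), and rows without "::".
def parse_recipe_line(line)
  text = line.sub(/\A-/, '').strip          # drop the diff's leading "-"
  return nil if text.empty? || text.start_with?('#')

  left, note = text.split('|', 2)           # annotation column is optional
  cookbook, recipe = left.split('::', 2).map(&:strip)
  return nil if recipe.nil?

  Entry.new(cookbook, recipe, note.to_s.strip)
end

entry = parse_recipe_line("-flume          \t:: agent                   \t| node")
puts "#{entry.cookbook}::#{entry.recipe} (annotation: #{entry.note})"
```

Rows with an empty annotation come back with `note` set to an empty string, so callers can tell "no remark" apart from an unparseable line.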

data/notes/silverware.md DELETED

@@ -1,85 +0,0 @@
-# Silverware Chef Cookbook
-## Overview
-Cookbooks repeatably express these and other aspects:
-* "I launch these daemons: ..."
-* "I have a collection of logs at '/var/log/lol'"
-* "I have a dashboard at 'http://....:...'"
-* ... and much more.
-Wouldn't it be nice if announcing a log directory caused...
-  - my log rotation system to start rotating my logs?
-  - a 'disk free space' gauge to be added to the monitoring dashboard for that service?
-  - Flume (or whatever) began picking up my logs and archiving them to a predictable location?
-  - in the case of standard apache logs, a listener to start counting the rate of requests, 200s, 404s and so forth?
-Similarly, announcing ports should mean
-  - the firewall and security groups configure themselves correspondingly
-  - the monitor system starts regularly pinging the port for uptime and latency
-  - and pings the interfaces that it should *not* appear on to ensure the firewall is in place?
-Ironfan makes those aspects standardized and predictable, and provides integration and discovery hooks. The key is to make integration *inevitable*: No more forgetting to rotate or monitor a service, or having a config change over here screw up a dependent system over there.
-________________________________________________________________________
-Attributes are scoped by *cookbook* and then by *component*.
-* If I declare `announce(:redis)`, it will look in `node[:redis]`.
-* If I declare `announce(:hadoop, :namenode)`, it will look in `node[:hadoop]` for cookbook-wide concerns and `node[:hadoop][:namenode]` for component-specific concerns.
-* The cookbook scope is always named for its cookbook. Its attributes live in `node[:cookbook_name]`. If everything in the cookbook shares a concern, it sits at cookbook level. So the Hadoop log directory (shared by all its components) is at `(scratch_root)/hadoop/log`.
-* If there is only one component, it can be implicitly named for its cookbook. In this case, it is omitted: the component attributes live in `node[:cookbook_name]` (which is the same as the component name).
-* If there are multiple components, they will live in `node[:cookbook_name][:component_name]` (eg `[:hadoop][:namenode]` or `[:flume][:master]`).
-### Discovery
-Allow nodes to discover the location for a given service at runtime, adapting when new services register.
-#### Operations:
-* register for a service. A timestamp records the last registry.
-* discover all chef nodes that have registered for the given service.
-* discover the most recent chef node for that service.
-* get the 'public_ip' for a service -- the address that nodes in the larger world should use
-* get the 'private_ip' for a service -- the address that nodes on the local subnet / private cloud should use
-#### Implementation
-Nodes register a service by calling `announce(<service>[,<component>])`, which adds a hash to `node[:announces][<service>][<component>]`, containing 'timestamp' (the time of registry) and other metadata passed in. Nodes discover services by calling `discover(<service>[,<component>[,<realm>]])`, where realm is the scope of the discovery (the current cluster, by default).
-## Recipes
-* `default`                  - Base configuration for silverware
-## Integration
-Supports platforms: Debian and Ubuntu
-## Attributes
-* `[:silverware][:conf_dir]`            -  (default: "/etc/silverware")
-* `[:silverware][:log_dir]`             -  (default: "/var/log/silverware")
-* `[:silverware][:home_dir]`            -  (default: "/etc/silverware")
-* `[:silverware][:user]`                -  (default: "root")
-* `[:users][:root][:primary_group]`     -  (default: "root")
-## License and Author
-Author::                Philip (flip) Kromer - Infochimps, Inc (<coders@infochimps.com>)
-Copyright::             2011, Philip (flip) Kromer - Infochimps, Inc
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-    http://www.apache.org/licenses/LICENSE-2.0
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-> readme generated by [ironfan](http://github.com/infochimps-labs/ironfan)'s cookbook_munger
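
Reviewer note: the deleted Discovery section describes `announce` storing a timestamped hash per service and component, and `discover` returning the most recent registrant within a realm. The sketch below models that behaviour with a plain in-memory hash. The `Registry` class, its method signatures, and the node and realm names are hypothetical illustrations, not Silverware's actual implementation, which works on Chef node data.

```ruby
# Hypothetical in-memory stand-in for Chef node data (not Silverware's real code).
class Registry
  def initialize
    # node name => { realm: ..., announces: { service => { component => info } } }
    @nodes = {}
  end

  # Record that a node provides service/component, stamped with the registry time.
  # A missing component is named for the service, as with single-component cookbooks.
  def announce(node_name, realm, service, component = nil, metadata: {}, at: Time.now)
    node = (@nodes[node_name] ||= { realm: realm, announces: {} })
    components = (node[:announces][service] ||= {})
    components[component || service] = metadata.merge(timestamp: at)
  end

  # Names of all nodes in the realm that announced the service, oldest first.
  def discover_all(service, component = nil, realm:)
    key = component || service
    matches = @nodes.select do |_name, node|
      node[:realm] == realm && node[:announces].dig(service, key)
    end
    matches.sort_by { |_name, node| node[:announces][service][key][:timestamp] }
           .map(&:first)
  end

  # Name of the most recently announced node, or nil if none registered.
  def discover(service, component = nil, realm:)
    discover_all(service, component, realm: realm).last
  end
end

registry = Registry.new
start = Time.at(0)
registry.announce('nn-0', 'gibbon', :hadoop, :namenode, at: start)
registry.announce('nn-1', 'gibbon', :hadoop, :namenode, at: start + 60)
puts registry.discover(:hadoop, :namenode, realm: 'gibbon')
```

Passing the timestamp in explicitly (`at:`) keeps the example deterministic; real registrations would use the current time.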

data/notes/style_guide.md DELETED

@@ -1,300 +0,0 @@
-# Ironfan + Chef Style Guide
-------------------------------------------------------------------------
-### System+Component define Names
-Name things uniformly for their system and component. For the ganglia master,
-* attributes:  `node[:ganglia][:master]`
-* recipe:      `ganglia::master`
-* role:        `ganglia_master`
-* directories: `ganglia/master` (if specific to component), `ganglia` (if not).
-  - for example: `/var/log/ganglia/master`
-### Component names
-* `agent.rb`
-* `worker.rb`
-* `datanode.rb`
-* `webnode.rb`
-### Recipes
-Recipes partition these things:
-* shared functionality between components
-* proper event order
-* optional or platform-specific functionality
-* Within the foo cookbook, name your recipes like this:
-  - `default.rb`      -- information shared by anyone using foo, including support packages, users and directories.
-  - `user.rb`         -- define daemon users. Called 'user' even if there is more than one. It's OK to move this into the default recipe.
-  - `install_from_X.rb` -- install packages (`install_from_package`), versioned tarballs (`install_from_release`). It's OK to move this into `default.rb`.
-  - `deploy.rb`       -- use this when doing sha-versioned deploys.
-  - `plugins.rb`      -- install additional plugins or support code. If you have separate plugins, name them `git_plugin`, `rspec_plugin`, etc.
-  - `server.rb`       -- define the foo server process. Similarly, `agent`, `worker`, etc -- see component naming above.
-  - `client.rb`       -- install libraries to *use* the foo service.
-  - `config_files.rb` -- discover other components, write final configuration to disk
-  - `finalize.rb`     -- final cleanup
-* Do not repeat the cookbook name in a recipe title: `ganglia::master`, not `ganglia::ganglia_master`.
-* Use only `[a-z0-9_]` for cookbook and component names. Do not use capital letters or hyphens.
-* Keep names short and descriptive (preferably 15 characters or less, or it jacks with the Chef webui).
-* Always include a `default.rb` recipe, even if it is blank.
-* *DO NOT* use the default recipe to install daemons or do anything interesting at all, even if that's currently the only thing the cookbook does. I want to be able to refer to the attributes in the apache cookbook without launching the apache service. Think of it like a C header file.
-A `client` is also passive -- it lets me *use* the system without requiring that I run it. This means the client recipe should *never* launch a process (`chef_client` and `nfs_client` components are allowed exceptions).
-### Cookbook Dependencies
-* Dependencies should be announced in metadata.rb, of course.
-* Explicitly `include_recipe` for system resources -- `runit`, `java`, `silverware`, `thrift` and `apt`.
-  - never
-* *DO NOT* use `include_recipe` unless putting it in the role would be utterly un-interesting. You *want* the run to break unless it's explicitly included in the role.
-  - *yes*: `java`, `ruby`, `announces`, etc.
-  - *no*:  `zookeeper::client`, `nfs::server`, or anything that will start a daemon
-  Remember: ordinary cookbooks describe systems, roles and integration cookbooks coordinate them.
-* `include_recipe` statements should only appear in recipes that are entry points. Recipes that are not meant to be called directly should assume their dependencies have been met.
-* If a recipe is meant to be the primary entrypoint, it *should* include default, and it should do so explicitly: `include_recipe 'foo::default'` (not just 'foo').
-Crisply separate cookbook-wide concerns from component concerns.
-Separate system configuration from multi-system integration. Cookbooks should provide hooks that are neighborly but not exhibitionist, and otherwise mind their own business.
-### Templates
-*DO NOT* refer to attributes directly on the node (`node[:foo]`). This prevents people from using those templates outside the cookbook. Instead:
-```ruby
-    # in recipe
-    template 'fooconf.yml' do
-      variables :foo => node[:foo]
-    end
-    # in template
-    @node[:log_dir]
-```
-### Attributes
-* Scope concerns by *cookbook* or *cookbook and component*. `node[:hadoop]` holds cookbook-wide concerns, `node[:hadoop][:namenode]` holds component-specific concerns.
-* Attributes shared by all components sit at cookbook level, and are always named for the cookbook: `node[:hadoop][:log_dir]` (since it is shared by all its components).
-* Component-specific attributes sit at component level (`node[:cookbook_name][:component_name]`): eg `node[:hadoop][:namenode][:service_state]`. Do not use a prefix (NO: `node[:hadoop][:namenode_handler_count]`)
-* Refer to node attributes by symbol, never by method:
-  - `node[:ganglia][:log_dir]`, not `node.ganglia.log_dir` or `node['ganglia']['log_dir']`
-#### Attribute Files
-* The main attribute file should be named `attributes/default.rb`. Do not name the file after the cookbook, or anything else.
-* If there are a sizeable number of tunable attributes (hadoop, cassandra), place them in `attributes/tuneables.rb`.
-## Name Attributes for their aspects
-Attributes should be named for their aspect: `port`, `log`, etc. Use generic names if there is only one attribute for an aspect, prefixed names if there are many:
-  - For a component that only opens one port: `node[:foo][:server][:port]`
-  - More than one port, use a prefix: `node[:foo][:server][:dash_port]` and `node[:foo][:server][:rpc_port]`.
-Sometimes the conventions below are inappropriate. All we ask is in those cases that you *not* use the special magic name. For example, don't use `:port` and give it a comma-separated string; name it something else, like `:port_list`.
-Here are specific conventions:
-### File and Dir Aspects
-A *file* is the full directory and basename for a file. A *dir* is a directory whose contents correspond to a single concern. A *prefix* is not intended to be used directly -- it will be decorated with suffixes to form dirs and files. A *basename* is only the leaf part of a file reference. Don't use the terms 'path' or 'filename'.
-Ignore the temptation to make a one-true-home-for-my-system, or to fight the package maintainer's choices. (FIXME: Rewrite to encourage OS-correct naming schemas.)
-- a sandbox holding dir, pid, log, ...
-#### Application
-* **prefix**: A container with directories bin, lib, share, src, to use according to convention
-  - default: `/usr/local`.
-* **home_dir**: Logical location for the cookbook's system code.
-  - default: typically, leave it up to the package maintainer. Otherwise, `:prefix/share/:cookbook` should be a symlink to the `install_dir` (see below).
-  - instead of:         `xx_home` / `dir` alone / `install_dir`
-* **install_dir**: The cookbook's system code, in case the home dir is a pointer to potential alternates.
-  - default: `:prefix/share/:cookbook-:version` (if you don't need the directory after the cookbook runs, use `:prefix/src/:cookbook-:version` instead, eg `/usr/local/src/tokyo_tyrant-xx.xx`)
-  - Make `home_dir` a symlink to this directory (eg home_dir `/usr/local/share/elasticsearch` links to install_dir `/usr/local/share/elasticsearch-0.17.8`).
-* **src_dir**: holds the compressed tarball, its expanded contents, and the compiled files when installing from source. Use this when you will run `make install` or equivalent and use the files elsewhere.
-  - default:            `:prefix/src/:system_name-:version`, eg `/usr/local/src/pig-0.9.tar.gz`
-  - do not:             expand the tarball to `:prefix/src/(whatever)` if it will actually be used from there; instead, use the `install_dir` convention described above. (As a guideline, I should be able to blow away `/usr/local/src` and everything still works).
-* **deploy_dir**: deployed code that follows the capistrano convention. See more about deploy variables below.
-  - the `:deploy_dir/shared` directory holds common files
-  - releases are checked out to `:deploy_dir/releases/{sha}`
-  - the operational release is a symlink to the right release: `:deploy_dir/current -> :deploy_dir/releases/xxx`.
-  - do not:             use this when you mean `home_dir`.
-* **scratch_roots**, **persistent_roots**: an array of directories spread across volumes, with expectations on persistence
-  - `scratch_root`s have no guarantee of persistence -- for example, stop/start'ing a machine on EC2 destroys the contents of its local (ephemeral) drives. `persistent_root`s have the *best available* promise of persistence: if permanent (eg EBS) volumes are available, they will exclusively populate the `persistent_root`s; but if not, the ephemeral drives are used instead.
-  - these attributes are provided by the `mountable_volume` meta-cookbook and its appropriate integration recipe. Ordinary cookbooks should always trust the integration cookbook's choices (or visit the integration cookbook to correct them).
-  - each element in `persistent_roots` is by contract on a separate volume, and similarly each of the `scratch_roots` is on a separate volume. A volume *may* be in both scratch and persistent (for example, there may be only one volume!).
-  - the singular forms **scratch_root** and **persistent_root** are provided for your convenience and always correspond to `scratch_roots.first` and `persistent_roots.first`. This means the first-named volume gets picked on the heaviest -- if you don't like that, choose explicitly (but not randomly, or you won't be idempotent).
-* **log_file**, **log_dir**, **xx_log_file**, **xx_log_dir**:
-  - default:
-    - if the log files will always be trivial in size, put them in `/var/log/:cookbook.log` or `/var/log/:cookbook/(whatever)`.
-    - if it's a runit-managed service, leave them in `/etc/sv/:cookbook-:component/log/main/current`, and make a symlink from `/var/log/:cookbook-:component` to `/etc/sv/:cookbook-:component/log/main/`.
-    - If the log files are non-trivial in size, set log dir `/:scratch_root/:cookbook/log/`, and symlink `/var/log/:cookbook/` to it.
-    - If the log files should be persisted, place them in `/:persistent_root/:cookbook/log`, and symlink `/var/log/:cookbook/` to it.
-    - in all cases, the directory is named `.../log`, not `.../logs`. Never put things in `/tmp`.
-    - Use the physical location for the `log_dir` attribute, not the /var/log symlink.
-* **tmp_dir**:
-  - default:            `/:scratch_root/:cookbook/tmp/`
-  - Do not put a symlink or directory in `/tmp` -- something else blows it away, the app recreates it as a physical directory, `/tmp` overflows, pagers go off, sadness spreads throughout the land.
-* **conf_dir**:
-  - default:            `/etc/:cookbook`
-* **bin_dir**:
-  - default:            `/:home_dir/bin`
-* **pid_file**, **pid_dir**:
-  - default:            pid_file: `/var/run/:cookbook.pid` or `/var/run/:cookbook/:component.pid`; pid_dir: `/var/run/:cookbook/`
-  - instead of:         `job_dir`, `job_file`, `pidfile`, `run_dir`.
-* **cache_dir**:
-  - default:            `/var/cache/:cookbook`.
-* **data_dir**:
-  - default:            `:persistent_root/:cookbook/:component/data`
-  - instead of:         `datadir`, `dbfile`, `dbdir`
-* **journal_dir**: high-speed local storage for commitlogs and so forth. Can be deleted, though you may rather it wasn't.
-  - default:            `:scratch_root/:cookbook/:component/scratch`
-  - instead of:         `commitlog_dir`
-### Daemon Aspects
-* **daemon_name**:      daemon's actual service name, if it differs from the component. For example, the `hadoop-namenode` component's daemon is `hadoop-0.20-namenode` as installed by apt.
-* **daemon_states**:    an array of the verbs acceptable to the Chef `service` resource: `:enable`, `:start`, etc.
-* **num_xx_processes**, **num_xx_threads** the number of separate top-level processes (distinct PIDs) or internal threads to run
-  - instead of          `num_workers`, `num_servers`, `worker_processes`, `foo_threads`.
-* **log_level**
-  - application-specific; often takes values info, debug, warn
-  - instead of          `verbose`, `verbosity`, `loglevel`
-* **user**, **group**, **uid**, **gid** -- `user` is the user name. The `user` and `group` should be strings, whereas the `uid` and `gid` should be integers.
-  - instead of          username, group_name, using uid for user name or vice versa.
-  - if there are multiple users, use a prefix: `launcher_user` and `observer_user`.
-### Install / Deploy Aspects
-* **release_url**:      URL for the release.
-  - instead of:         install_url, package_url, being careless about partial vs whole URLs
-* **release_file**:     Where to put the release.
-  - default:            `:prefix/src/system_name-version.ext`, eg `/usr/local/src/elasticsearch-0.17.8.tar.bz2`.
-  - do not use `/tmp` -- let me decide when to blow it away (and make it easy to be idempotent).
-  - do not use a non-versioned URL or file name.
-* **release_file_sha** or **release_file_md5** fingerprint
-  - instead of:         `whatever_checksum`, `whatever_fingerprint`
-* **version**:          if it's a simply-versioned resource that uses the `major.minor.patch-cruft` convention. Do not use unless this is true, and do not use the source control revision ID.
-* **plugins**:          array of system-specific plugins
-use `deploy_{}` for anything that would be true whatever SCM you're using; use `git_{}` (and so forth) where specific to that repo.
-* **deploy_env**        production / staging / etc
-* **deploy_strategy**
-* **deploy_user**       user to run as
-* **deploy_dir**:       Only use `deploy_dir` if you are following the capistrano convention: see above.
-* **git_repo**:  url for the repo, eg `git@github.com:infochimps-labs/ironfan.git` or `http://github.com/infochimps-labs/ironfan.git`
-  - instead of:         `deploy_repo`, `git_url`
-* **git_revision**:  SHA or branch
-  - instead of:         `deploy_revision`
-* **apt/(repo_name)**   Options for adding a cookbook's apt repo.
-  - Note that this is filed under *apt*, not the cookbook.
-  - Use the best name for the repo, which is not necessarily the cookbook's name: eg `apt/cloudera/{...}`, which is shared by hadoop, flume, pig, and so on.
-  - `apt/{repo_name}/url` -- eg `http://archive.cloudera.com/debian`
-  - `apt/{repo_name}/key` -- GPG key
-  - `apt/{repo_name}/force_distro` -- forces the distro (eg, you are on natty but the apt repo only has maverick)
-### Ports
-* **xx_port**:
-  - *do not* use 'port' on its own.
-  - examples: `thrift_port`, `webui_port`, `zookeeper_port`, `carbon_port` and `whisper_port`.
-  - xx_port: `default[:foo][:server][:port] =  5000`
-  - xx_ports, if an array: `default[:foo][:server][:ports] = [5000, 5001, 5002]`
-* **addr**, **xx_addr**
-  - if all ports bind to the same interface, use `addr`. Otherwise, do *not* use `addr`, and use a unique `foo_addr` for each `foo_port`.
-  - instead of:         `hostname`, `binding`, `address`
-* Want some way to announce my port is http or https.
-* Need to distinguish client ports from service ports. You should be using cluster service discovery anyway though.
-### Application Integration
-* **jmx_port**
-### Tunables
-* **xx_heap_max**, **xx_heap_min**, **java_heap_eden**
-* **java_home**
-* AVOID batch declaration of options (e.g. **java_opts**) if possible: assemble it in your recipe from intelligible attribute names.
-### Nitpicks
-* Always put file modes in quote marks: `mode "0664"` not `mode 0664`.
-## Announcing Aspects
-If your app does any of the following,
-* **services**    -- Any interesting long-running process.
-* **ports**       -- Any reserved open application port
-  - *http*:          HTTP application port
-  - *https*:         HTTPS application port
-  - *internal*:      port is on private IP, should *not* be visible through public IP
-  - *external*:      port *is* available through public IP
-* metric_ports:
-  - **jmx_ports** -- JMX diagnostic port (announced by many Java apps)
-* **dashboards**  -- Web interface to look inside a system; typically internal-facing only, and probably not performance-monitored by default.
-* **logs**        -- um, logs. You can also announce the logs' flavor: `:apache`, `log4j`, etc.
-* **scheduleds**  -- regularly-occurring events that leave a trace
-* **exports**     -- jars or libs that other programs may wish to incorporate
-* **consumes**    -- placed there by any call to `discover`.
-## Clusters
-* Describe physical configuration:
-  - machine size, number of instances per facet, etc
-  - external assets (elastic IP, ebs volumes)
-* Describe high-level assembly of systems via roles: `hadoop_namenode`, `nfs_client`, `ganglia_agent`, etc.
-* Describe important modifications, such as `ironfan::system_internals`, mounting ebs volumes, etc
-* Describe override attributes:
-  - `heap size`, rvm versions, etc.
-* roles and recipes
-  - remove `cluster_role` and `facet_role` if empty
-  - are not in `run_list`, but populated by the `role` and `recipe` directives
-* remove big_package unless it's a dev machine (sandbox, etc)
-## Roles
-Roles define the high-level assembly of recipes into systems
-* override attributes go into the cluster.
-Currently, those files are typically empty and are badly cluttering the roles/ directory.
-The cluster and facet override attributes should be together, not scattered in different files.
-Roles shouldn't assemble systems. The contents of the infochimps_chef/roles/plato_truth.rb file belong in a facet.
-* Deprecated:
-  - Cluster and facet roles (`roles/gibbon_cluster.rb`, `roles/gibbon_namenode.rb`, etc) go away
-  - Roles should be service-oriented: `hadoop_master` considered harmful, you should explicitly enumerate the services
-### Facets should be (nearly) identical
-Within a facet, keep your servers almost entirely identical. For example, servers in a MySQL facet would use their index to set shard order and to claim the right attached volumes. However, it would be a mistake to have one server within a facet be a master process and the rest be worker processes -- just define different facets for each.
-### Pedantic Distinctions:
-Separate the following terms:
-* A *machine* is a concrete thing that runs your code -- it might be a VM or raw metal, but it has CPUs and fans and a finite lifetime. It has a unique name tied to its physical presence -- something like 'i-123abcd' or 'rack 4 server 7'.
-* A *chef node* is the code object that, together with the chef-client process, configures a machine. In ironfan, the chef node is strictly slave to the server description and the measured attributes of the machine.
-* A *server description* gives the high-level specification the machine should achieve. This includes the roles, recipes and attributes given to the chef node; the physical characteristics of the machine ('8 cores, 7GB ram, AWS cloud'); and its relation to the rest of the system (george cluster, webnode facet, index 3).
-In particular, we try to be careful to always call a Chef node a 'chef node' (never just 'node'). Try processing graph nodes in a flume node feeding a node.js decorator on a cloud node defined by a chef node. No(de) way.

data/notes/tips_and_troubleshooting.md DELETED

@@ -1,92 +0,0 @@
-## Tips and Notes
-### Gems
-   knife cluster ssh bonobo-worker-2 'sudo gem update --system'
-   knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/gems/1.9.2-p290/specifications/*  ; do sudo sed -i.bak "s!000000000Z!!"          $foo ; done'
-   knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/site_ruby/*/rubygems/deprecate.rb ; do sudo sed -i.bak "s!@skip ||= false!true!" $foo ; done'
-### EC2 Notes Instance attributes: `disable_api_termination` and `delete_on_termination`
-To set `delete_on_termination` to 'true' after the fact, run the following (modify the instance and volume to suit):
-```
-    ec2-modify-instance-attribute -v i-0704be6c --block-device-mapping /dev/sda1=vol-XX8d2c80::true
-```
-If you set `disable_api_termination` to true, in order to terminate the node run
-```
-    ec2-modify-instance-attribute -v i-0704be6c --disable-api-termination false
-```
-To view whether an attached volume is deleted when the machine is terminated:
-```
-    # show volumes that will be deleted
-    ec2-describe-volumes --filter "attachment.delete-on-termination=true"
-```
-You can't (as far as I know) alter the delete-on-termination flag of a running volume. Crazy, huh?
-### EC2: See your userdata
-curl http://169.254.169.254/latest/user-data
-### EBS Volumes for a persistent HDFS
-* Make one volume and format for XFS:
-    `$ sudo mkfs.xfs -f /dev/sdh1`
-* options "defaults,nouuid,noatime" give good results. The 'nouuid' part
-  prevents errors when mounting multiple volumes from the same snapshot.
-* poke a file onto the drive :
-  datename=`date +%Y%m%d`
-  sudo bash -c "(echo $datename ; df /data/ebs1 ) > /data/ebs1/xfs-created-at-$datename.txt"
-If you want to grow the drive:
-* take a snapshot.
-* make a new volume from it
-* mount that, and run `sudo xfs_growfs`. You *should* have the volume mounted, and should stop anything that would be working the volume hard.
-### Hadoop: On-the-fly backup of your namenode metadata
-bkupdir=/ebs2/hadoop-nn-backup/`date +"%Y%m%d"`
-for srcdir in /ebs*/hadoop/hdfs/ /home/hadoop/gibbon/hdfs/  ; do
-  destdir=$bkupdir/$srcdir ; echo $destdir ;
-  sudo mkdir -p $destdir ;
-done
-### NFS: Halp I am using an NFS-mounted /home and now I can't log in as ubuntu
-Say you set up an NFS server 'core-homebase-0' (in the 'core' cluster) to host and serve out `/home` directory; and a machine 'awesome-webserver-0' (in the 'awesome' cluster), that is an NFS client.
-In each case, when the machine was born EC2 created a `/home/ubuntu/.ssh/authorized_keys` file listing only the single approved machine keypair -- 'core' for the core cluster, 'awesome' for the awesome cluster.
-When chef client runs, however, it mounts the NFS share at /home. This then masks the actual /home directory -- nothing that's on the base directory tree shows up. Which means that after chef runs, the /home/ubuntu/.ssh/authorized_keys file on awesome-webserver-0 is the one for the *'core'* cluster, not the *'awesome'* cluster.
-The solution is to use the cookbook ironfan provides -- it moves the 'ubuntu' user's home directory to an alternative path not masked by the NFS.
-### NFS: Problems starting NFS server on ubuntu maverick
-For problems starting NFS server on ubuntu maverick systems, read, understand and then run /tmp/fix_nfs_on_maverick_amis.sh -- See "this thread for more":http://fossplanet.com/f10/[ec2ubuntu]-not-starting-nfs-kernel-daemon-no-support-current-kernel-90948/
-### Git deploys: My git deploy recipe has gone limp
-Suppose you are using the `git` resource to deploy a recipe (`george` for sake of example). If `/var/chef/cache/revision_deploys/var/www/george` exists then *nothing* will get deployed, even if `/var/www/george/{release_sha}` is empty or screwy.  If git deploy is acting up in any way, nuke that cache from orbit -- it's the only way to be sure.
- $ sudo rm -rf /var/www/george/{release_sha} /var/chef/cache/revision_deploys/var/www/george
-### Runit services : 'fail: XXX: unable to change to service directory: file does not exist'
-Your service is probably installed but removed from runit's purview; check the `/etc/service` symlink. All of the following should be true:
-* directory `/etc/sv/foo`, containing file `run` and dirs `log` and `supervise`
-* `/etc/init.d/foo`  is symlinked to `/usr/bin/sv`
-* `/etc/service/foo` is symlinked to `/etc/sv/foo`