kubernetes 0.0.1 → 0.0.2
- checksums.yaml +4 -4
- data/Makefile +53 -0
- data/docs/access.md +252 -0
- data/docs/architecture.dia +0 -0
- data/docs/architecture.svg +523 -0
- data/docs/client-libraries.md +11 -0
- data/docs/flaky-tests.md +52 -0
- data/docs/identifiers.md +90 -0
- data/docs/networking.md +108 -0
- data/docs/ovs-networking.md +14 -0
- data/docs/ovs-networking.png +0 -0
- data/docs/pods.md +23 -0
- data/docs/releasing.dot +113 -0
- data/docs/releasing.md +152 -0
- data/docs/releasing.png +0 -0
- data/docs/resources.md +218 -0
- data/docs/roadmap.md +65 -0
- data/docs/salt.md +85 -0
- data/docs/security.md +26 -0
- metadata +22 -4
data/docs/releasing.png
ADDED
Binary file
data/docs/resources.md
ADDED
@@ -0,0 +1,218 @@
# The Kubernetes resource model

To do good pod placement, Kubernetes needs to know how big pods are, as well as the sizes of the nodes onto which they are being placed. The definition of "how big" is given by the Kubernetes resource model - the subject of this document.

The resource model aims to be:
* simple, for common cases;
* extensible, to accommodate future growth;
* regular, with few special cases; and
* precise, to avoid misunderstandings and promote pod portability.

## The resource model

A Kubernetes _resource_ is something that can be requested by, allocated to, or consumed by a pod or container. Examples include memory (RAM), CPU, disk-time, and network bandwidth.

Once resources on a node have been allocated to one pod, they should not be allocated to another until that pod is removed or exits. This means that Kubernetes schedulers should ensure that the sum of the resources allocated (requested and granted) to its pods never exceeds the usable capacity of the node. Testing whether a pod will fit on a node is called _feasibility checking_. Note that the resource model currently prohibits over-committing resources; we will want to relax that restriction later.

### Resource types

All resources have a _type_ that is identified by their _typename_ (a string, e.g., "memory"). Several resource types are predefined by Kubernetes (a full list is below), although only two will be supported at first: CPU and memory. Users and system administrators can define their own resource types if they wish (e.g., Hadoop slots).

A fully-qualified resource typename is constructed from a DNS-style _subdomain_ with at least one dot, a slash `/`, and a path comprised of one or more segments separated by slashes.
* The subdomain must conform to [RFC 1035 section 2.3.1 'subdomain' syntax](http://tools.ietf.org/html/rfc1035) (e.g., `kubernetes.io`, `myveryown.org`).
* The path must conform to [RFC 3986 URI `path-rootless` syntax](http://tools.ietf.org/html/rfc3986#section-3.3) (e.g., `memory`, `shinyNewResource/v2`), save that it must not use dot-segments (`.` and `..`).
* As a shorthand, any resource typename that does not start with a subdomain and a slash will automatically be prefixed with the built-in Kubernetes _namespace_, `kubernetes.io/resources/`, in order to fully qualify it (see the sketch below). This namespace is reserved for code in the open source Kubernetes repository; as a result, all user typenames MUST be fully qualified, and cannot be created in this namespace.
* Typenames are treated as literal strings, and are neither escaped nor case-converted. This means that case is significant (unlike in RFC 1035 subdomains), and paths should avoid characters that need percent-encoding.

The recommended best practice is to use a lowercase subdomain and Go-like ASCII camelCase for path components. Some example typenames include `memory` (which will be fully-qualified as `kubernetes.io/resources/memory`), and `myveryown.org/shinyNewResource/v2`.

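The shorthand-qualification rule above is easy to mechanize. Here is a minimal Go sketch of it; the function name `QualifyTypename` is an illustrative assumption, not part of any actual Kubernetes API:

```
package resources

import "strings"

// builtinNamespace is the prefix reserved for Kubernetes-defined types.
const builtinNamespace = "kubernetes.io/resources/"

// QualifyTypename applies the shorthand rule: a typename that does not
// begin with a DNS-style subdomain (something containing a dot) followed
// by a slash is placed in the built-in namespace; anything else is
// treated as already fully qualified.
func QualifyTypename(name string) string {
	if slash := strings.Index(name, "/"); slash > 0 &&
		strings.Contains(name[:slash], ".") {
		return name // e.g., "myveryown.org/shinyNewResource/v2"
	}
	return builtinNamespace + name // e.g., "memory", "cpu"
}
```

For example, `QualifyTypename("memory")` yields `kubernetes.io/resources/memory`, while `myveryown.org/shinyNewResource/v2` passes through unchanged.
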
For future reference, note that some resources, such as CPU and network bandwidth, are _compressible_, which means that their usage can potentially be throttled in a relatively benign manner. All other resources are _incompressible_, which means that any attempt to throttle them is likely to cause grief. This distinction will be important if a Kubernetes implementation supports over-committing of resources.

### Resource quantities

Initially, all Kubernetes resource types are _quantitative_, and have an associated _unit_ for quantities of the associated resource (e.g., bytes for memory, bytes per second for bandwidth, instances for software licences). The units will always be a resource type's natural base units (e.g., bytes, not MB), to avoid confusion between binary and decimal multipliers and the underlying unit multiplier (e.g., is memory measured in MiB, MB, or GB?).

Resource quantities can be added and subtracted: for example, a node has a fixed quantity of each resource type that can be allocated to pods/containers; once such an allocation has been made, the allocated resources cannot be made available to other pods/containers without over-committing the resources.

To make life easier for people, quantities can be represented externally as unadorned integers, or as fixed-point integers with one of these SI suffixes (E, P, T, G, M, K, m) or their power-of-two equivalents (Ei, Pi, Ti, Gi, Mi, Ki). For example, the following represent roughly the same value: 128974848, "129e6", "129M", "123Mi". Small quantities can be represented directly as decimals (e.g., 0.3), or using milli-units (e.g., "300m").
* "Externally" means in user interfaces, reports, graphs, and in JSON or YAML resource specifications that might be generated or read by people.
* Case is significant: "m" and "M" are not the same, so "k" is not a valid SI suffix. There are no power-of-two equivalents for SI suffixes that represent multipliers less than 1.
* These conventions only apply to resource quantities, not arbitrary values.

Internally (i.e., everywhere else), Kubernetes will represent resource quantities as integers so it can avoid problems with rounding errors, and will not use strings to represent numeric values. To achieve this, quantities that naturally have fractional parts (e.g., CPU seconds/second) will be scaled to integral numbers of milli-units (e.g., milli-CPUs) as soon as they are read in. Internal APIs, data structures, and protobufs will use these scaled integer units. Raw measurement data such as usage may still need to be tracked and calculated using floating point values, but internally they should be rescaled to avoid some values being in milli-units and some not.
* Note that reading in a resource quantity and writing it out again may change the way its values are represented, and truncate precision (e.g., 1.0001 may become 1.000), so comparison and difference operations (e.g., by an updater) must be done on the internal representations.
* Avoiding milli-units in external representations has advantages for the people who will use Kubernetes, but runs the risk of developers forgetting to rescale or accidentally using floating-point representations. That seems like the right choice. We will try to reduce the risk by providing libraries that automatically do the quantization for JSON/YAML inputs.

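To make the two conventions concrete (suffixed strings externally, scaled integers internally), here is a minimal Go sketch that parses an external quantity into milli-units. The function name, suffix table, and rounding behavior are illustrative assumptions, not the actual Kubernetes parsing library:

```
package resources

import (
	"fmt"
	"strconv"
	"strings"
)

// Scale factors per suffix, expressed in milli-units so that every
// parsed quantity becomes an integer (e.g., "300m" -> 300).
var suffixMilli = map[string]float64{
	"m": 1, "": 1e3,
	"K": 1e6, "M": 1e9, "G": 1e12, "T": 1e15, "P": 1e18, "E": 1e21,
	"Ki": 1024 * 1e3,
	"Mi": 1024 * 1024 * 1e3,
	"Gi": 1024 * 1024 * 1024 * 1e3,
	"Ti": 1024 * 1024 * 1024 * 1024 * 1e3,
	"Pi": 1024 * 1024 * 1024 * 1024 * 1024 * 1e3,
	"Ei": 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1e3,
}

// ParseQuantityMilli converts an external representation such as "129M",
// "123Mi", "300m", or "2.5" into internal milli-units. Overflow checks
// and negative quantities are omitted for brevity.
func ParseQuantityMilli(s string) (int64, error) {
	num := strings.TrimRight(s, "EPTGMKim")
	mult, ok := suffixMilli[s[len(num):]]
	if !ok {
		return 0, fmt.Errorf("unrecognized suffix in %q", s)
	}
	v, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, fmt.Errorf("malformed quantity %q: %v", s, err)
	}
	return int64(v*mult + 0.5), nil // round to the nearest milli-unit
}
```

Note that every result is in milli-units, so `"2.5"` (2.5 base units) becomes 2500, matching the rule that fractional quantities are scaled to integers as soon as they are read in.
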
### Resource specifications

A _resource specification_ can be used to describe resource requests, resource allocations, and/or resource usage for a container or pod, and capacity for a node. For example (although it would be unusual to see all of these fields simultaneously):
```
resources: [
  request:  [ cpu: 2.5, memory: "40Mi" ],
  limit:    [ cpu: 4.0, memory: "99Mi" ],
  capacity: [ cpu: 12,  memory: "128Gi" ],
  maxusage: [ cpu: 3.8, memory: "80Mi" ],
]
```

Where:
* _request_: the amount of resources being requested, or that were requested and have been allocated. Scheduler algorithms will use these quantities to test feasibility (whether a pod will fit onto a node; a sketch appears after the notes below). If a container (or pod) tries to use more resources than its _request_, any associated SLOs are voided - e.g., the program it is running may be throttled (compressible resource types), or the attempt may be denied. If _request_ is omitted for a container, it defaults to _limit_ if that is explicitly specified, otherwise to an implementation-defined value; this will always be 0 for a user-defined resource type. If _request_ is omitted for a pod, it defaults to the sum of the (explicit or implicit) _request_ values for the containers it encloses.

* _limit_ [optional]: an upper bound or cap on the maximum amount of resources that will be made available to a container or pod; if a container or pod uses more resources than its _limit_, it may be terminated. The _limit_ defaults to "unbounded"; in practice, this probably means the capacity of an enclosing container, pod, or node, but may result in non-deterministic behavior, especially for memory.

* _capacity_: the total allocatable resources of a node. Initially, the resources at a given scope will bound the resources of the sum of inner scopes. This may be loosened in the future to permit overcommitment.

* _maxusage_: the largest observed resource usage. (See the Appendix for richer data structures.)

Notes:

* It is an error to specify the same resource type more than once in each list.

* It is an error for the _request_ or _limit_ values for a pod to be less than the sum of the (explicit or defaulted) values for the containers it encloses. (We may relax this later.)

* If multiple pods are running on the same node and attempting to use more resources than they have requested, the result is implementation-defined. For example: unallocated or unused resources might be spread equally across claimants, or the assignment might be weighted by the size of the original request, or as a function of limits, or priority, or the phase of the moon, perhaps modulated by the direction of the tide. Thus, although it's not mandatory to provide a _request_, it's probably a good idea. (Note that the _request_ could be filled in by an automated system that is observing actual usage and/or historical data.)

* Internally, the Kubernetes master can decide the defaulting behavior, and the kubelet implementation may expect an absolute specification. For example, if the master decided that "the default is unbounded" it would pass 2^64 to the kubelet.

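The feasibility check described under _request_ reduces to simple integer arithmetic over these specifications. The following Go sketch shows the idea; the `ResourceList` type and `Fits` helper are assumptions for illustration, not actual scheduler code:

```
package scheduler

// ResourceList maps a fully-qualified typename to a quantity in internal
// scaled integer units (e.g., milli-KCUs for cpu, bytes for memory).
type ResourceList map[string]int64

// Fits reports whether a pod's (already-defaulted) request can be placed
// on a node: for every requested type, the quantity already allocated
// plus the new request must not exceed the node's capacity, since the
// model currently prohibits over-committing resources.
func Fits(request, allocated, capacity ResourceList) bool {
	for typename, want := range request {
		if allocated[typename]+want > capacity[typename] {
			return false // infeasible on this node
		}
	}
	return true
}
```
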
## Kubernetes-defined resource types

The following resource types are predefined ("reserved") by Kubernetes in the `kubernetes.io/resources` namespace, and so cannot be used for user-defined resources. Note that the syntax of all resource types in the resource spec is deliberately similar, but some resource types (e.g., CPU) may receive significantly more support than simply tracking quantities in the schedulers and/or the Kubelet.

### Processor cycles
* Name: `cpu` (or `kubernetes.io/resources/cpu`)
* Units: Kubernetes Compute Unit seconds/second (i.e., CPU cores normalized to a canonical "Kubernetes CPU")
* Internal representation: milli-KCUs
* Compressible? yes
* Qualities: [this is a placeholder for the kind of thing that may be supported in the future]
  * [future] `schedulingLatency`: as per lmctfy
  * [future] `cpuConversionFactor`: property of a node: the speed of a CPU core on the node's processor divided by the speed of the canonical Kubernetes CPU (a floating point value; default = 1.0).

To reduce performance portability problems for pods, and to avoid worst-case provisioning behavior, the units of CPU will be normalized to a canonical "Kubernetes Compute Unit" (KCU, pronounced ˈko͝oko͞o), which will roughly be equivalent to a single CPU hyperthreaded core for some recent x86 processor. The normalization may be implementation-defined, although some reasonable defaults will be provided in the open-source Kubernetes code.

Note that requesting 2 KCU won't guarantee that precisely 2 physical cores will be allocated - control of aspects like this will be handled by resource _qualities_ (a future feature).

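As a sketch of how the future `cpuConversionFactor` quality might be applied, the hypothetical helper below converts a node's physical core count into milli-KCUs (assuming capacities are kept in internal milli-units):

```
package node

// MilliKCUCapacity normalizes a node's physical core count to milli-KCUs.
// cpuConversionFactor is the node core speed divided by the speed of the
// canonical Kubernetes CPU (default 1.0), so 16 cores at a factor of
// 1.25 would advertise 20000 milli-KCUs.
func MilliKCUCapacity(physicalCores int, cpuConversionFactor float64) int64 {
	return int64(float64(physicalCores)*cpuConversionFactor*1000 + 0.5)
}
```
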
### Memory
* Name: `memory` (or `kubernetes.io/resources/memory`)
* Units: bytes
* Compressible? no (at least initially)

The precise meaning of "memory" is implementation dependent, but the basic idea is to rely on the underlying `memcg` mechanisms, support, and definitions.

Note that most people will want to use power-of-two suffixes (Mi, Gi) for memory quantities rather than decimal ones: "64MiB" rather than "64MB".

## Resource metadata

A resource type may have an associated read-only ResourceType structure that contains metadata about the type. For example:
```
resourceTypes: [
  "kubernetes.io/resources/memory": [
    isCompressible: false, ...
  ]
  "kubernetes.io/resources/cpu": [
    isCompressible: true, internalScaleExponent: 3, ...
  ]
  "kubernetes.io/resources/diskSpace": [ ... ]
]
```

Kubernetes will provide ResourceType metadata for its predefined types. If no resource metadata can be found for a resource type, Kubernetes will assume that it is a quantified, incompressible resource that is not specified in milli-units, and has no default value.

The defined properties are as follows:

| field name | type | contents |
| ---------- | ---- | -------- |
| name | string, required | the typename, as a fully-qualified string (e.g., `kubernetes.io/resources/cpu`) |
| internalScaleExponent | int, default=0 | external values are multiplied by 10 to this power for internal storage (e.g., 3 for milli-units) |
| units | string, required | format: `unit* [per unit+]` (e.g., `second`, `byte per second`). An empty unit field means "dimensionless". |
| isCompressible | bool, default=false | true if the resource type is compressible |
| defaultRequest | string, default=none | in the same format as a user-supplied value |
| _[future]_ quantization | number, default=1 | smallest granularity of allocation: requests may be rounded up to a multiple of this unit; implementation-defined unit (e.g., the page size for RAM). |

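Rendered as a Go structure, the table above might look like the following sketch; the struct layout and the defaulting constructor are assumptions for illustration, not the actual API definition:

```
package resources

// ResourceType mirrors the metadata table above; field names follow the
// documented property names, with the closest Go types.
type ResourceType struct {
	Name                  string  // fully-qualified typename (required)
	InternalScaleExponent int     // default 0; 3 means milli-units internally
	Units                 string  // e.g., "second", "byte per second"; "" = dimensionless
	IsCompressible        bool    // default false
	DefaultRequest        string  // same format as a user-supplied value; "" = none
	Quantization          float64 // [future] smallest allocation granularity (default 1)
}

// DefaultResourceType returns the metadata assumed for a type with no
// registered ResourceType: quantified, incompressible, not stored in
// milli-units, and with no default value.
func DefaultResourceType(name string) ResourceType {
	return ResourceType{Name: name, Quantization: 1}
}
```
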
# Appendix: future extensions

The following are planned future extensions to the resource model, included here to encourage comments.

## Extended usage data

Singleton values for observed and predicted future usage will rapidly prove inadequate, so we will support the following structure for extended usage information:

```
resources: [
  usage:     [ cpu: <CPU-info>, memory: <memory-info> ],
  predicted: [ cpu: <CPU-info>, memory: <memory-info> ],
]
```

where a `<CPU-info>` or `<memory-info>` structure looks like this:
```
{
  mean: <value>    # arithmetic mean
  max: <value>     # maximum value
  min: <value>     # minimum value
  count: <value>   # number of data points
  percentiles: [   # map from %iles to values
    "10": <10th-percentile-value>,
    "50": <median-value>,
    "99": <99th-percentile-value>,
    "99.9": <99.9th-percentile-value>,
    ...
  ]
}
```

All parts of this structure are optional, although we strongly encourage including quantities for the 50, 90, 95, 99, 99.5, and 99.9 percentiles, for both `usage` and `predicted` data. _[In practice, it will be important to include additional info such as the length of the time window over which the averages are calculated, the confidence level, and information-quality metrics such as the number of dropped or discarded data points.]_

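For reference, here is a Go sketch of the per-resource statistics record described above; the struct name and the string-keyed percentile map are illustrative assumptions:

```
package resources

// UsageStats is the per-resource statistics record sketched above. All
// fields are optional externally; zero values stand in for "absent".
type UsageStats struct {
	Mean  int64 // arithmetic mean, in internal scaled units
	Max   int64 // maximum observed value
	Min   int64 // minimum observed value
	Count int64 // number of data points
	// Percentiles maps a percentile label ("50", "99.9", ...) to the
	// value observed at that percentile.
	Percentiles map[string]int64
}
```
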
## Future resource types

### _[future] Network bandwidth_
* Name: "networkBandwidth" (or `kubernetes.io/resources/networkBandwidth`)
* Units: bytes per second
* Compressible? yes

### _[future] Network operations_
* Name: "networkIOPS" (or `kubernetes.io/resources/networkOperations`)
* Units: operations (messages) per second
* Compressible? yes

### _[future] Storage space_
* Name: "storageSpace" (or `kubernetes.io/resources/storageSpace`)
* Units: bytes
* Compressible? no

The amount of secondary storage space available to a container. The main target is local disk drives and SSDs, although this could also be used to qualify remotely-mounted volumes. Specifying whether a resource is a raw disk, an SSD, a disk array, or a file system fronting any of these, is left for future work.

### _[future] Storage time_
* Name: storageTime (or `kubernetes.io/resources/storageTime`)
* Units: seconds per second of disk time
* Internal representation: milli-units
* Compressible? yes

This is the amount of time a container spends accessing disk, including actuator and transfer time. A standard disk drive provides 1.0 diskTime seconds per second.

### _[future] Storage operations_
* Name: "storageIOPS" (or `kubernetes.io/resources/storageIOPS`)
* Units: operations per second
* Compressible? yes

## Named, individual resources

This is primarily important for things like disks, flash and network cards, where there can be multiple, separate resource suppliers, and the partition of the request across them may matter. (Note that the unadorned `storageSpace` resource type doesn't imply a particular disk.) Such resources will be identified by extending the resource typename with the instance identifier. For example:

```
resources: [
  request: [
    cpu: 2.3, memory: "4Gi",
    "storageSpace/hda": "0.5Ti", "storageTime/hda": 0.3,
    "storageSpace/ssd1": "0.1Ti", "storageTime/ssd1": 0.9,
  ],
]
```

Note that this does make it hard to parse typenames (e.g., is "foo.com/a/b" a type named "a/b" or a type named "a" with a subdivision of "b"?). Comments welcome.
data/docs/roadmap.md
ADDED
@@ -0,0 +1,65 @@
# Kubernetes Roadmap

Updated August 28, 2014

This document is intended to capture the set of features, docs, and patterns that we feel are required to call Kubernetes “feature complete” for a 1.0 release candidate. This list does not emphasize the bug fixes and stabilization that will be required to take it all the way to production ready. This is a living document, and is certainly open for discussion.

## APIs
1. ~~Versioned APIs: Manage APIs for master components and kubelets with explicit versions, version-specific conversion routines, and component-to-component version checking.~~ **Done**
2. Component-centric APIs: Clarify which types belong in each component’s API and which ones are truly common.
   1. Clarify the role of etcd in the cluster.
3. Idempotency: Whenever possible, APIs must be idempotent.
4. Container restart policy: Policy for each pod or container stating whether and when it should be restarted upon termination.
5. Life cycle events/hooks and notifications: Notify containers about what is happening to them.
6. Re-think the network parts of the API: Find resolution on the multiple issues around networking.
   1. ~~Utility of HostPorts in ip-per-pod~~ **Done**
   2. Services/Links/Portals/Ambassadors
7. Durable volumes: Provide a model for data that survives some kinds of outages.
8. Auth[nz] and ACLs: Have a plan for how the API and system will express:
   1. Identity & authentication
   2. Authorization & access control
   3. Cluster subdivision, accounting, & isolation

## Factoring and pluggability
1. ~~Pluggable scheduling: Cleanly separate the scheduler from the apiserver.~~ **Done**
2. Pluggable naming and discovery: Call-outs or hooks to enable external naming systems.
3. Pluggable volumes: Allow new kinds of data sources as volumes.
4. Replication controller: Make replication controller a standalone entity in the master stack.
5. Pod templates: Proposal to make pod templates a first-class API object, rather than an artifact of the replica controller.

## Cluster features
1. ~~Minion death: Cleanly handle the loss of a minion.~~ **Done**
2. Configure DNS: Provide DNS service for k8s running pods, containers and services. Auto-populate it with the things we know.
3. Resource requirements and scheduling: Use knowledge of resources available and resources required to do better scheduling.
4. ~~True IP-per-pod: Get rid of last remnants of shared port spaces for pods.~~ **Done**
5. IP-per-service: Proposal to make services cleaner.
6. Basic deployment tools: This includes tools for higher-level deployment configs.
7. Standard mechanisms for deploying k8s on k8s with a clear strategy for reusing the infrastructure for self-host.

## Node features
1. Container termination reasons: Capture and report exit codes and other termination reasons.
2. Garbage collect old container images: Clean up old docker images that consume local disk. Maybe a TTL on images.
3. Container logs: Expose stdout/stderr from containers without users having to SSH into minions. Needs a rotation policy to avoid disks getting filled.
4. Container performance information: Capture and report performance data for each container.
5. Host log management: Make sure we don't kill nodes with full disks.

## Global features
1. Input validation: Stop bad input as early as possible.
2. Error propagation: Report problems reliably and consistently.
3. Consistent patterns of usage of IDs and names throughout the system.
4. Binary release: Repeatable process to produce binaries for release.

## Patterns, policies, and specifications
1. Deprecation policy: Declare the project’s intentions with regards to expiring and removing features and interfaces.
2. Compatibility policy: Declare the project’s intentions with regards to saved state and live upgrades of components.
3. Naming/discovery: Demonstrate techniques for common patterns:
   1. Master-elected services
   2. DB replicas
   3. Sharded services
   4. Worker pools
4. Health-checking: Specification for how it works and best practices.
5. Logging: Demonstrate setting up log collection.
6. ~~Monitoring: Demonstrate setting up cluster monitoring.~~ **Done**
7. Rolling updates: Demo and best practices for live application upgrades.
   1. Have a plan for how higher-level deployment / update concepts should / should not fit into Kubernetes.
8. Minion requirements: Document the requirements and integrations between kubelet and minion machine environments.
data/docs/salt.md
ADDED
@@ -0,0 +1,85 @@
# Using Salt to configure Kubernetes

The Kubernetes cluster can be configured using Salt.

The Salt scripts are shared across multiple hosting providers. Depending on where you host your Kubernetes cluster, you may be using different operating systems and different networking configurations, so it's important to understand this background before making a modification, to ensure your changes do not break Kubernetes for other hosting environments.

## Salt cluster setup

The **salt-master** service runs on the kubernetes-master node.

The **salt-minion** service runs on the kubernetes-master node and each kubernetes-minion node in the cluster.

Each salt-minion service is configured to interact with the **salt-master** service hosted on the kubernetes-master via the **master.conf** file.

```
[root@kubernetes-master] $ cat /etc/salt/minion.d/master.conf
master: kubernetes-master
```

Each salt-minion contacts the salt-master and, depending upon the machine information presented, the salt-master provisions the machine as either a kubernetes-master or a kubernetes-minion with all the required capabilities needed to run Kubernetes.

If you are running the Vagrant based environment, the **salt-api** service is running on the kubernetes-master. It is configured to enable the vagrant user to introspect the salt cluster in order to find out about machines in the Vagrant environment via a REST API.

## Salt security

Security is not enabled on the salt-master, and the salt-master is configured to auto-accept incoming requests from minions. This security configuration is not recommended for production environments.

```
[root@kubernetes-master] $ cat /etc/salt/master.d/auto-accept.conf
open_mode: True
auto_accept: True
```

## Salt minion configuration

Each minion in the salt cluster has an associated configuration that instructs the salt-master how to provision the required resources on the machine.

An example file is presented below using the Vagrant based environment.

```
[root@kubernetes-master] $ cat /etc/salt/minion.d/grains.conf
grains:
  master_ip: $MASTER_IP
  etcd_servers: $MASTER_IP
  cloud_provider: vagrant
  roles:
    - kubernetes-master
```

Each hosting environment has a slightly different grains.conf file that is used to build conditional logic where required in the Salt files.

The following enumerates the set of defined key/value pairs that are supported today. If you add new ones, please make sure to update this list.

Key | Value
------------- | -------------
cbr-cidr | (Optional) The minion IP address range used for the docker container bridge.
cloud | (Optional) Which IaaS platform is used to host kubernetes: *gce*, *azure*
cloud_provider | (Optional) The cloud_provider used by apiserver: *gce*, *azure*, *vagrant*
etcd_servers | (Required) Comma-delimited list of IP addresses the apiserver and kubelet use to reach etcd
hostnamef | (Optional) The full host name of the machine, i.e. hostname -f
master_ip | (Optional) The IP address that the apiserver will bind against
node_ip | (Optional) The IP address to use to address this node
minion_ip | (Optional) Mapped to the kubelet hostname_override; K8S TODO - change this name
network_mode | (Optional) Networking model to use among nodes: *openvswitch*
roles | (Required) 1. **kubernetes-master** means this machine is the master in the kubernetes cluster. 2. **kubernetes-pool** means this machine is a kubernetes-minion. Depending on the role, the Salt scripts will provision different resources on the machine.

These keys may be leveraged by the Salt sls files to branch behavior.

In addition, a cluster may be running a Debian-based operating system or a Red Hat-based operating system (CentOS, Fedora, RHEL, etc.). As a result, it's sometimes important to distinguish behavior based on operating system using if branches like the following.

```
{% if grains['os_family'] == 'RedHat' %}
// something specific to a RedHat environment (CentOS, Fedora, RHEL) where you may use yum, systemd, etc.
{% else %}
// something specific to a Debian environment (apt-get, initd)
{% endif %}
```

## Best Practices

1. When configuring default arguments for processes, it's best to avoid the use of EnvironmentFiles (systemd in Red Hat environments) or init.d files (Debian distributions) to hold default values that should be common across operating system environments. This helps keep our Salt template files easy to understand for editors who may not be familiar with the particulars of each distribution.

## Future enhancements (Networking)

Per-pod IP configuration is provider specific, so when making networking changes, it's important to sandbox these as all providers may not use the same mechanisms (iptables, openvswitch, etc.)

We should define a grains.conf key that captures more specifically what network configuration environment is being used to avoid future confusion across providers.
data/docs/security.md
ADDED
@@ -0,0 +1,26 @@
# Security in Kubernetes

General design principles and guidelines related to security of containers, APIs, and infrastructure in Kubernetes.

## Objectives

1. Ensure a clear isolation between the container and the underlying host it runs on
2. Limit the ability of the container to negatively impact the infrastructure or other containers
3. [Principle of Least Privilege](http://en.wikipedia.org/wiki/Principle_of_least_privilege) - ensure components are only authorized to perform the actions they need, and limit the scope of a compromise by limiting the capabilities of individual components
4. Reduce the number of systems that have to be hardened and secured by defining clear boundaries between components

## Design Points

### Isolate the data store from the minions and supporting infrastructure

Access to the central data store (etcd) in Kubernetes allows an attacker to run arbitrary containers on hosts, to gain access to any protected information stored in either volumes or in pods (such as access tokens or shared secrets provided as environment variables), to intercept and redirect traffic from running services by inserting middlemen, or to simply delete the entire history of the cluster.

As a general principle, access to the central data store should be restricted to the components that need full control over the system and which can apply appropriate authorization and authentication of change requests. In the future, etcd may offer granular access control, but that granularity will require an administrator to understand the schema of the data to properly apply security. An administrator must be able to properly secure Kubernetes at a policy level, rather than at an implementation level, and schema changes over time should not risk unintended security leaks.

Both the Kubelet and Kube Proxy need information related to their specific roles - for the Kubelet, the set of pods it should be running, and for the Proxy, the set of services and endpoints to load balance. The Kubelet also needs to provide information about running pods and historical termination data. The access pattern for both Kubelet and Proxy to load their configuration is an efficient "wait for changes" request over HTTP. It should be possible to limit the Kubelet and Proxy to only access the information they need to perform their roles and no more.

The controller manager for Replication Controllers and other future controllers act on behalf of a user via delegation to perform automated maintenance on Kubernetes resources. Their ability to access or modify resource state should be strictly limited to their intended duties, and they should be prevented from accessing information not pertinent to their role. For example, a replication controller needs only to create a copy of a known pod configuration, to determine the running state of an existing pod, or to delete an existing pod that it created - it does not need to know the contents or current state of a pod, nor have access to any data in the pod's attached volumes.

The Kubernetes pod scheduler is responsible for reading data from the pod to fit it onto a minion in the cluster. At a minimum, it needs access to view the ID of a pod (to craft the binding), its current state, any resource information necessary to identify placement, and other data relevant to concerns like anti-affinity, zone or region preference, or custom logic. It does not need the ability to modify pods or see other resources, only to create bindings. It should not need the ability to delete bindings unless the scheduler takes control of relocating components on failed hosts (which could be implemented by a separate component that can delete bindings but not create them). The scheduler may need read access to user or project-container information to determine preferential location (underspecified at this time).
metadata
CHANGED
```
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: kubernetes
 version: !ruby/object:Gem::Version
-  version: 0.0.1
+  version: 0.0.2
 platform: ruby
 authors:
 - Riccardo Carlesso
@@ -10,14 +10,32 @@ bindir: bin
 cert_chain: []
 date: 2010-09-21 00:00:00.000000000 Z
 dependencies: []
-description: Currently a placeholder. In the future, a Gem to get up to speed with
-  Kubernetes
+description: ! 'Currently a placeholder. In the future, a Gem to get up to speed with
+  Kubernetes. For more info: https://github.com/GoogleCloudPlatform/kubernetes/'
 email: palladiusbonton@gmail.com
 executables: []
 extensions: []
 extra_rdoc_files: []
 files:
+- Makefile
 - README.md
+- docs/access.md
+- docs/architecture.dia
+- docs/architecture.svg
+- docs/client-libraries.md
+- docs/flaky-tests.md
+- docs/identifiers.md
+- docs/networking.md
+- docs/ovs-networking.md
+- docs/ovs-networking.png
+- docs/pods.md
+- docs/releasing.dot
+- docs/releasing.md
+- docs/releasing.png
+- docs/resources.md
+- docs/roadmap.md
+- docs/salt.md
+- docs/security.md
 homepage: http://rubygems.org/gems/kubernetes
 licenses:
 - Apache License
@@ -41,5 +59,5 @@ rubyforge_project:
 rubygems_version: 2.3.0
 signing_key:
 specification_version: 4
-summary: Kubernetes
+summary: To install Kubernetes in rubyish environments
 test_files: []
```