kubernetes 0.0.1 → 0.0.2

@@ -0,0 +1,11 @@
## kubernetes API client libraries

### Supported
* [Go](https://github.com/GoogleCloudPlatform/kubernetes/tree/master/pkg/client)

### User Contributed
*Note: Libraries provided by outside parties are supported by their authors, not the core Kubernetes team*

* [Java](https://github.com/nirmal070125/KubernetesAPIJavaClient)
* [Ruby](https://github.com/Ch00k/kuber)

@@ -0,0 +1,52 @@
# Hunting flaky tests in Kubernetes

Sometimes unit tests are flaky. This means that, due (usually) to race conditions, they will occasionally fail, even though most of the time they pass.

We have a goal of 99.9% flake-free tests. This means that there is only one flake in one thousand runs of a test.

Running a test 1000 times on your own machine can be tedious and time-consuming. Fortunately, there is a better way to achieve this using Kubernetes.

_Note: these instructions are mildly hacky for now; as we get run-once semantics and logging, they will get better._

There is a testing image ```brendanburns/flake``` on Docker Hub. We will use this image to test our fix.

Create a replication controller with the following config:

```yaml
id: flakeController
desiredState:
  replicas: 24
  replicaSelector:
    name: flake
  podTemplate:
    desiredState:
      manifest:
        version: v1beta1
        id: ""
        volumes: []
        containers:
          - name: flake
            image: brendanburns/flake
            env:
              - name: TEST_PACKAGE
                value: pkg/tools
              - name: REPO_SPEC
                value: https://github.com/GoogleCloudPlatform/kubernetes
        restartpolicy: {}
    labels:
      name: flake
labels:
  name: flake
```

```./cluster/kubecfg.sh -c controller.yaml create replicaControllers```

This will spin up 24 instances of the test. They will run to completion and then exit; the kubelet will restart them, and eventually you will have sufficient runs for your purposes. At that point you can stop the replication controller:

```sh
./cluster/kubecfg.sh stop flakeController
./cluster/kubecfg.sh rm flakeController
```

Now examine the machines with ```docker ps -a``` and look for tasks that exited with non-zero exit codes (ignore those that exited -1, since that's what happens when you stop the replication controller).
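
For example, a quick way to surface the interesting exits on each node might look like the following sketch (the exact `STATUS` wording varies by Docker version):

```sh
# Show containers that exited with a code other than 0 or -1
# (stopping the replication controller is what produces the -1 exits).
docker ps -a | grep 'Exited' | grep -v 'Exited (0)' | grep -v 'Exited (-1)'
```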

Happy flake hunting!

@@ -0,0 +1,90 @@
# Identifiers and Names in Kubernetes

A summary of the goals and recommendations for identifiers in Kubernetes, as described in [GitHub issue #199](https://github.com/GoogleCloudPlatform/kubernetes/issues/199).

## Definitions

UID
: A non-empty, opaque, system-generated value guaranteed to be unique in time and space; intended to distinguish between historical occurrences of similar entities.

Name
: A non-empty string guaranteed to be unique within a given scope at a particular time; used in resource URLs; provided by clients at creation time and encouraged to be human friendly; intended to facilitate creation idempotence and space-uniqueness of singleton objects, distinguish distinct entities, and reference particular entities across operations.

[rfc1035](http://www.ietf.org/rfc/rfc1035.txt)/[rfc1123](http://www.ietf.org/rfc/rfc1123.txt) label (DNS_LABEL)
: An alphanumeric (a-z, A-Z, and 0-9) string, with a maximum length of 63 characters, with the '-' character allowed anywhere except the first or last character, suitable for use as a hostname or segment in a domain name.

[rfc1035](http://www.ietf.org/rfc/rfc1035.txt)/[rfc1123](http://www.ietf.org/rfc/rfc1123.txt) subdomain (DNS_SUBDOMAIN)
: One or more rfc1035/rfc1123 labels separated by '.', with a maximum length of 253 characters.

[rfc4122](http://www.ietf.org/rfc/rfc4122.txt) universally unique identifier (UUID)
: A 128-bit generated value that is extremely unlikely to collide across time and space and requires no central coordination.

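As an illustration of these formats (not part of the proposal itself), a DNS_LABEL can be checked with a simple regular expression and a UUID can be generated locally without any central coordination; the commands below are a sketch assuming a Linux host:

```sh
# Validate a candidate DNS_LABEL: alphanumeric, at most 63 characters,
# '-' allowed anywhere except the first or last character.
echo "backend-x4eb1" | grep -E '^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$'

# Generate an rfc4122 UUID without central coordination.
cat /proc/sys/kernel/random/uuid
```
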
## Objectives for names and UIDs

1. Uniquely identify (via a UID) an object across space and time

2. Uniquely name (via a name) an object across space

3. Provide human-friendly names in API operations and/or configuration files

4. Allow idempotent creation of API resources (#148) and enforcement of space-uniqueness of singleton objects

5. Allow DNS names to be automatically generated for some objects

## General design

1. When an object is created via an API, a Name string (a DNS_SUBDOMAIN) must be specified. Name must be non-empty and unique within the apiserver. This enables idempotent and space-unique creation operations. Parts of the system (e.g. the replication controller) may join strings (e.g. a base name and a random suffix) to create a unique Name (see the sketch after this list). For situations where generating a name is impractical, some or all objects may support a param to auto-generate a name. Generating random names will defeat idempotency.
   * Examples: "guestbook.user", "backend-x4eb1"

2. When an object is created via an API, a Namespace string (a DNS_SUBDOMAIN? format TBD via #1114) may be specified. Depending on the API receiver, namespaces might be validated (e.g. the apiserver might ensure that the namespace actually exists). If a namespace is not specified, one will be assigned by the API receiver. This assignment policy might vary across API receivers (e.g. the apiserver might have a default, the kubelet might generate something semi-random).
   * Example: "api.k8s.example.com"

3. Upon acceptance of an object via an API, the object is assigned a UID (a UUID). UID must be non-empty and unique across space and time.
   * Example: "01234567-89ab-cdef-0123-456789abcdef"

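A minimal sketch of the base-name-plus-random-suffix idea from item 1; the suffix length and alphabet are illustrative, not prescribed by this design:

```sh
# Join a base name and a random 5-character suffix to form a unique Name,
# e.g. "backend-x4eb1".
base="backend"
suffix=$(tr -dc 'a-z0-9' < /dev/urandom | head -c 5)
echo "${base}-${suffix}"
```
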
## Case study: Scheduling a pod

Pods can be placed onto a particular node in a number of ways. This case study demonstrates how the above design can be applied to satisfy the objectives.

### A pod scheduled by a user through the apiserver

1. A user submits a pod with Namespace="" and Name="guestbook" to the apiserver.

2. The apiserver validates the input.
   1. A default Namespace is assigned.
   2. The pod name must be space-unique within the Namespace.
   3. Each container within the pod has a name which must be space-unique within the pod.

3. The pod is accepted.
   1. A new UID is assigned.

4. The pod is bound to a node.
   1. The kubelet on the node is passed the pod's UID, Namespace, and Name.

5. Kubelet validates the input.

6. Kubelet runs the pod.
   1. Each container is started up with enough metadata to distinguish the pod from whence it came.
   2. Each attempt to run a container is assigned a UID (a string) that is unique across time.
      * This may correspond to Docker's container ID.

### A pod placed by a config file on the node

1. A config file is stored on the node, containing a pod with UID="", Namespace="", and Name="cadvisor".

2. Kubelet validates the input.
   1. Since UID is not provided, kubelet generates one.
   2. Since Namespace is not provided, kubelet generates one.
      1. The generated namespace should be deterministic and cluster-unique for the source, such as a hash of the hostname and file path (see the sketch after this list).
         * E.g. Namespace="file-f4231812554558a718a01ca942782d81"

3. Kubelet runs the pod.
   1. Each container is started up with enough metadata to distinguish the pod from whence it came.
   2. Each attempt to run a container is assigned a UID (a string) that is unique across time.
      1. This may correspond to Docker's container ID.
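
A minimal sketch of the deterministic namespace generation described in step 2 above; the manifest path and the choice of MD5 are assumptions for illustration, not requirements of this design:

```sh
# Derive a deterministic, cluster-unique namespace from hostname + file path.
manifest="/etc/kubernetes/manifests/cadvisor.yaml"   # hypothetical path
printf 'file-%s\n' "$(printf '%s' "$(hostname)${manifest}" | md5sum | cut -d' ' -f1)"
# -> e.g. Namespace="file-f4231812554558a718a01ca942782d81"
```
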
@@ -0,0 +1,108 @@
# Networking

## Model and motivation

Kubernetes deviates from the default Docker networking model. The goal is for each [pod](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/pods.md) to have an IP in a flat shared networking namespace that has full communication with other physical computers and containers across the network. IP-per-pod creates a clean, backward-compatible model where pods can be treated much like VMs or physical hosts from the perspectives of port allocation, networking, naming, service discovery, load balancing, application configuration, and migration.

On the other hand, dynamic port allocation requires supporting both static ports (e.g., for externally accessible services) and dynamically allocated ports; requires partitioning centrally allocated and locally acquired dynamic ports; complicates scheduling (since ports are a scarce resource); is inconvenient for users; complicates application configuration; is plagued by port conflicts, reuse, and exhaustion; requires non-standard approaches to naming (e.g., etcd rather than DNS); requires proxies and/or redirection for programs using standard naming/addressing mechanisms (e.g., web browsers); requires watching and cache invalidation for address/port changes for instances in addition to watching group membership changes; and obstructs container/pod migration (e.g., using CRIU). NAT introduces additional complexity by fragmenting the addressing space, which breaks self-registration mechanisms, among other problems.

With the IP-per-pod model, all user containers within a pod behave as if they are on the same host with regard to networking. They can all reach each other's ports on localhost. Ports which are published to the host interface are done so in the normal Docker way. All containers in all pods can talk to all other containers in all other pods by their 10-dot addresses.

In addition to avoiding the aforementioned problems with dynamic port allocation, this approach reduces friction for applications moving from the world of uncontainerized apps on physical or virtual hosts to containers within pods. People running application stacks together on the same host have already figured out how to make ports not conflict (e.g., by configuring them through environment variables) and have arranged for clients to find them.

The approach does reduce isolation between containers within a pod -- ports could conflict, and there can be no private ports across containers within a pod -- but applications requiring their own port spaces can just run as separate pods, and processes requiring private communication can run within the same container. Besides, the premise of pods is that containers within a pod share some resources (volumes, cpu, ram, etc.) and therefore expect and tolerate reduced isolation. Additionally, the user can control which containers belong to the same pod whereas, in general, they don't control which pods land together on a host.

When any container calls SIOCGIFADDR, it sees the IP that any peer container would see them coming from -- each pod has its own IP address that other pods can know. By making IP addresses and ports the same within and outside the containers and pods, we create a NAT-less, flat address space. "ip addr show" should work as expected. This would enable all existing naming/discovery mechanisms to work out of the box, including self-registration mechanisms and applications that distribute IP addresses. (We should test that with etcd and perhaps one other option, such as Eureka (used by Acme Air) or Consul.) We should be optimizing for inter-pod network communication. Within a pod, containers are more likely to use communication through volumes (e.g., tmpfs) or IPC.

This is different from the standard Docker model. In that model, each container gets an IP in the 172-dot space and would only see that 172-dot address from SIOCGIFADDR. If these containers connect to another container, the peer would see the connection coming from a different IP than the container itself knows. In short, you can never self-register anything from a container, because a container cannot be reached on its private IP.

An alternative we considered was an additional layer of addressing: pod-centric IP per container. Each container would have its own local IP address, visible only within that pod. This would perhaps make it easier for containerized applications to move from physical/virtual hosts to pods, but would be more complex to implement (e.g., requiring a bridge per pod, split-horizon/VP DNS) and to reason about, due to the additional layer of address translation, and would break self-registration and IP distribution mechanisms.

## Current implementation

For the Google Compute Engine cluster configuration scripts, [advanced routing](https://developers.google.com/compute/docs/networking#routing) is set up so that each VM has an extra 256 IP addresses that get routed to it. This is in addition to the 'main' IP address assigned to the VM that is NAT-ed for Internet access. The networking bridge (called `cbr0` to differentiate it from `docker0`) is set up outside of Docker proper and only does NAT for egress network traffic that isn't aimed at the virtual network.

Ports mapped in from the 'main IP' (and hence the internet if the right firewall rules are set up) are proxied in user mode by Docker. In the future, this should be done with `iptables` by either the Kubelet or Docker: [Issue #15](https://github.com/GoogleCloudPlatform/kubernetes/issues/15).

We start Docker with:

    DOCKER_OPTS="--bridge cbr0 --iptables=false"

We set up this bridge on each node with SaltStack, in [container_bridge.py](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/cluster/saltbase/salt/_states/container_bridge.py):

    cbr0:
      container_bridge.ensure:
        - cidr: {{ grains['cbr-cidr'] }}
        ...
    grains:
      roles:
        - kubernetes-pool
      cbr-cidr: $MINION_IP_RANGE

We make these addresses routable in GCE:

    gcutil addroute ${MINION_NAMES[$i]} ${MINION_IP_RANGES[$i]} \
      --norespect_terminal_width \
      --project ${PROJECT} \
      --network ${NETWORK} \
      --next_hop_instance ${ZONE}/instances/${MINION_NAMES[$i]} &

The minion IP ranges are /24s in the 10-dot space.

GCE itself does not know anything about these IPs, though, so they are not externally routable; containers that need to communicate with the outside world must use host networking. An external IP that is set up to forward to the VM will only forward to the VM's primary IP (which is assigned to no pod). So we use docker's `-p` flag to map published ports to the main interface. This has the side effect of disallowing two pods from exposing the same port. (More discussion on this in [Issue #390](https://github.com/GoogleCloudPlatform/kubernetes/issues/390).)

We create a container to use for the pod network namespace -- a single loopback device and a single veth device. All the user's containers get their network namespaces from this pod networking container.

Docker allocates IP addresses from a bridge we create on each node, using its “container” networking mode.

1. Create a normal (in the networking sense) container which uses a minimal image and runs a command that blocks forever. This is not a user-defined container, and gets a special well-known name.
   - creates a new network namespace (netns) and loopback device
   - creates a new pair of veth devices and binds them to the netns
   - auto-assigns an IP from docker’s IP range

2. Create the user containers and specify the name of the network container as their “net” argument. Docker finds the PID of the command running in the network container and attaches to the netns of that PID.

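A rough sketch of those two steps expressed as plain Docker commands; the container name and application image are illustrative, not the well-known names Kubernetes actually uses:

```sh
# 1. A minimal, non-user container that blocks forever owns the pod's netns.
docker run -d --name pod-net-example busybox sleep 86400

# 2. User containers join that namespace via Docker's "container" networking mode.
docker run -d --net=container:pod-net-example my-app-image
```
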
### Other networking implementation examples

With the primary aim of providing an IP-per-pod model, other implementations exist to serve the same purpose outside of GCE.
- [OpenVSwitch with GRE/VxLAN](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/ovs-networking.md)
- [Flannel](https://github.com/coreos/flannel#flannel)

## Challenges and future work

### Docker API

Right now, `docker inspect` doesn't show the networking configuration of the containers, since they derive it from another container. That information should be exposed somehow.

### External IP assignment

We want to be able to assign IP addresses externally from Docker ([Docker issue #6743](https://github.com/dotcloud/docker/issues/6743)) so that we don't need to statically allocate fixed-size IP ranges to each node, so that IP addresses can be made stable across network container restarts ([Docker issue #2801](https://github.com/dotcloud/docker/issues/2801)), and to facilitate pod migration. Right now, if the network container dies, all the user containers must be stopped and restarted because the netns of the network container will change on restart, and any subsequent user container restart will join that new netns, thereby not being able to see its peers. Additionally, a change in IP address would encounter DNS caching/TTL problems. External IP assignment would also simplify DNS support (see below).

### Naming, discovery, and load balancing

In addition to enabling self-registration with 3rd-party discovery mechanisms, we'd like to set up DDNS automatically ([Issue #146](https://github.com/GoogleCloudPlatform/kubernetes/issues/146)). `hostname`, `$HOSTNAME`, etc. should return a name for the pod ([Issue #298](https://github.com/GoogleCloudPlatform/kubernetes/issues/298)), and `gethostbyname` should be able to resolve names of other pods. We probably need to set up a DNS resolver to do the latter ([Docker issue #2267](https://github.com/dotcloud/docker/issues/2267)), so that we don't need to keep /etc/hosts files up to date dynamically.

Service endpoints are currently found through [Docker-links-compatible](https://docs.docker.com/userguide/dockerlinks/) environment variables specifying ports opened by the service proxy. We don't actually use [the Docker ambassador pattern](https://docs.docker.com/articles/ambassador_pattern_linking/) to link containers because we don't require applications to identify all clients at configuration time. Regardless, we're considering moving away from the current approach to one more akin to our approach for individual pods: allocate an IP address per service and automatically register the service in DDNS -- L3 load balancing, essentially. Using a flat service namespace doesn't scale, and environment variables don't permit dynamic updates, which complicates service deployment by imposing implicit ordering constraints.

We'd also like to accommodate other load-balancing solutions (e.g., HAProxy), non-load-balanced services ([Issue #260](https://github.com/GoogleCloudPlatform/kubernetes/issues/260)), and other types of groups (worker pools, etc.). Providing the ability to Watch a label selector applied to pod addresses would enable efficient monitoring of group membership, which could be directly consumed or synced with a discovery mechanism. Event hooks ([Issue #140](https://github.com/GoogleCloudPlatform/kubernetes/issues/140)) for join/leave events would probably make this even easier.

### External routability

We want traffic between containers to use the pod IP addresses across nodes. Say we have Node A with a container IP space of 10.244.1.0/24 and Node B with a container IP space of 10.244.2.0/24. And we have Container A1 at 10.244.1.1 and Container B1 at 10.244.2.1. We want Container A1 to talk to Container B1 directly with no NAT. B1 should see the "source" in the IP packets of 10.244.1.1 -- not the "primary" host IP for Node A. That means that we want to turn off NAT for traffic between containers (and also between VMs and containers).

We'd also like to make pods directly routable from the external internet. However, we can't yet support the extra container IPs that we've provisioned talking to the internet directly. So, we don't map external IPs to the container IPs. Instead, we solve that problem by having traffic that isn't to the internal network (! 10.0.0.0/8) get NATed through the primary host IP address so that it can get 1:1 NATed by the GCE networking when talking to the internet. Similarly, incoming traffic from the internet has to get NATed/proxied through the host IP.
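
A hedged sketch of the kind of node-level rule this implies; the interface name is illustrative, and the real configuration lives in the cluster setup scripts:

```sh
# Masquerade only traffic leaving the 10.0.0.0/8 virtual network, so that
# pod-to-pod and pod-to-VM traffic keeps its 10-dot source address.
iptables -t nat -A POSTROUTING ! -d 10.0.0.0/8 -o eth0 -j MASQUERADE
```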

So we end up with 3 cases:

1. Container -> Container or Container <-> VM. These should use 10. addresses directly and there should be no NAT.

2. Container -> Internet. These have to get mapped to the primary host IP so that GCE knows how to egress that traffic. There are actually 2 layers of NAT here: Container IP -> Internal Host IP -> External Host IP. The first level happens in the guest with iptables and the second happens as part of GCE networking. The first one (Container IP -> internal host IP) does dynamic port allocation while the second maps ports 1:1.

3. Internet -> Container. This also has to go through the primary host IP and also has 2 levels of NAT, ideally. However, the path currently is a proxy with (External Host IP -> Internal Host IP -> Docker) -> (Docker -> Container IP). Once [issue #15](https://github.com/GoogleCloudPlatform/kubernetes/issues/15) is closed, it should be External Host IP -> Internal Host IP -> Container IP. But to get that second arrow we have to set up the port forwarding iptables rules per mapped port.

Another approach could be to create a new host interface alias for each pod, if we had a way to route an external IP to it. This would eliminate the scheduling constraints resulting from using the host's IP address.

### IPv6

IPv6 would be a nice option, also, but we can't depend on it yet. Docker support is in progress: [Docker issue #2974](https://github.com/dotcloud/docker/issues/2974), [Docker issue #6923](https://github.com/dotcloud/docker/issues/6923), [Docker issue #6975](https://github.com/dotcloud/docker/issues/6975). Additionally, direct IPv6 assignment to instances doesn't appear to be supported by major cloud providers (e.g., AWS EC2, GCE) yet. We'd happily take pull requests from people running Kubernetes on bare metal, though. :-)

@@ -0,0 +1,14 @@
# Kubernetes OpenVSwitch GRE/VxLAN networking

This document describes how OpenVSwitch is used to set up networking between pods across minions. The tunnel type could be GRE or VxLAN. VxLAN is preferable when large-scale isolation needs to be performed within the network.

![ovs-networking](./ovs-networking.png "OVS Networking")

The Vagrant setup in Kubernetes does the following:

The docker bridge is replaced with a brctl-generated Linux bridge (kbr0) with a 256-address subnet. Basically, each node gets a 10.244.x.0/24 subnet and docker is configured to use that bridge instead of the default docker0 bridge.

Also, an OVS bridge is created (obr0) and added as a port to the kbr0 bridge. All OVS bridges across all nodes are linked with GRE tunnels, so each node has an outgoing GRE tunnel to all other nodes. It does not really need to be a complete mesh, but the meshier the better. STP (spanning tree) mode is enabled in the bridges to prevent loops.

Routing rules enable any 10.244.0.0/16 target to become reachable via the OVS bridge connected with the tunnels.
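
A condensed sketch of what that wiring amounts to on a single node; the bridge names match the description above, while the GRE peer address and port name are illustrative rather than taken from the Vagrant scripts:

```sh
# Create the OVS bridge and attach it to the Linux bridge used by docker.
ovs-vsctl add-br obr0
brctl addif kbr0 obr0

# Add a GRE tunnel port toward one peer node (repeated for each peer).
ovs-vsctl add-port obr0 gre0 -- set interface gre0 type=gre options:remote_ip=10.245.1.3

# Route the whole pod range toward the bridge.
ip route add 10.244.0.0/16 dev kbr0
```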
data/docs/pods.md ADDED
@@ -0,0 +1,23 @@
# Pods

A _pod_ (as in a pod of whales or pea pod) is a relatively tightly coupled group of containers that are scheduled onto the same host. It models an application-specific "virtual host" in a containerized environment. Pods serve as units of scheduling, deployment, and horizontal scaling/replication, and share fate.

Why doesn't Kubernetes just support an affinity mechanism for co-scheduling containers instead? While pods have a number of benefits (e.g., simplifying the scheduler), the primary motivation is resource sharing.

In addition to defining the containers that run in the pod, the pod specifies a set of shared storage volumes. Pods facilitate data sharing and IPC among their constituents. In the future, they may share CPU and/or memory ([LPC2013](http://www.linuxplumbersconf.org/2013/ocw//system/presentations/1239/original/lmctfy%20(1).pdf)).

The containers in the pod also all use the same network namespace/IP (and port space). The goal is for each pod to have an IP address in a flat shared networking namespace that has full communication with other physical computers and containers across the network. [More details on networking](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/networking.md).

While pods can be used to host vertically integrated application stacks, their primary motivation is to support co-located, co-managed helper programs, such as:
- content management systems, file and data loaders, local cache managers, etc.
- log and checkpoint backup, compression, rotation, snapshotting, etc.
- data change watchers, log tailers, logging and monitoring adapters, event publishers, etc.
- proxies, bridges, and adapters
- controllers, managers, configurators, and updaters

Individual pods are not intended to run multiple instances of the same application, in general.

Why not just run multiple programs in a single Docker container?

1. Transparency. Making the containers within the pod visible to the infrastructure enables the infrastructure to provide services to those containers, such as process management and resource monitoring. This facilitates a number of conveniences for users.
2. Decoupling software dependencies. The individual containers may be rebuilt and redeployed independently. Kubernetes may even support live updates of individual containers someday.

@@ -0,0 +1,113 @@
// Build it with:
// $ dot -Tsvg releasing.dot >releasing.svg

digraph tagged_release {
    size = "5,5"
    // Arrows go up.
    rankdir = BT
    subgraph left {
        // Group the left nodes together.
        ci012abc -> pr101 -> ci345cde -> pr102
        style = invis
    }
    subgraph right {
        // Group the right nodes together.
        version_commit -> dev_commit
        style = invis
    }
    { // Align the version commit and the info about it.
        rank = same
        // Align them with pr101
        pr101
        version_commit
        // release_info shows the change in the commit.
        release_info
    }
    { // Align the dev commit and the info about it.
        rank = same
        // Align them with 345cde
        ci345cde
        dev_commit
        dev_info
    }
    // Join the nodes from subgraph left.
    pr99 -> ci012abc
    pr102 -> pr100
    // Do the version node.
    pr99 -> version_commit
    dev_commit -> pr100
    tag -> version_commit
    pr99 [
        label = "Merge PR #99"
        shape = box
        fillcolor = "#ccccff"
        style = "filled"
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    ci012abc [
        label = "012abc"
        shape = circle
        fillcolor = "#ffffcc"
        style = "filled"
        fontname = "Consolas, Liberation Mono, Menlo, Courier, monospace"
    ];
    pr101 [
        label = "Merge PR #101"
        shape = box
        fillcolor = "#ccccff"
        style = "filled"
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    ci345cde [
        label = "345cde"
        shape = circle
        fillcolor = "#ffffcc"
        style = "filled"
        fontname = "Consolas, Liberation Mono, Menlo, Courier, monospace"
    ];
    pr102 [
        label = "Merge PR #102"
        shape = box
        fillcolor = "#ccccff"
        style = "filled"
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    version_commit [
        label = "678fed"
        shape = circle
        fillcolor = "#ccffcc"
        style = "filled"
        fontname = "Consolas, Liberation Mono, Menlo, Courier, monospace"
    ];
    dev_commit [
        label = "456dcb"
        shape = circle
        fillcolor = "#ffffcc"
        style = "filled"
        fontname = "Consolas, Liberation Mono, Menlo, Courier, monospace"
    ];
    pr100 [
        label = "Merge PR #100"
        shape = box
        fillcolor = "#ccccff"
        style = "filled"
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    release_info [
        label = "pkg/version/base.go:\ngitVersion = \"v0.5\";"
        shape = none
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    dev_info [
        label = "pkg/version/base.go:\ngitVersion = \"v0.5-dev\";"
        shape = none
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
    tag [
        label = "$ git tag -a v0.5"
        fillcolor = "#ffcccc"
        style = "filled"
        fontname = "Helvetica Neue, Helvetica, Segoe UI, Arial, freesans, sans-serif"
    ];
}

data/docs/releasing.md ADDED
@@ -0,0 +1,152 @@
# Releasing Kubernetes

This document explains how to create a Kubernetes release (as in version) and how the version information gets embedded into the built binaries.

## Origin of the Sources

Kubernetes may be built from either a git tree (using `hack/build-go.sh`) or from a tarball (using either `hack/build-go.sh` or `go install`) or directly by the Go native build system (using `go get`).

When building from git, we want to be able to insert specific information about the build tree at build time. In particular, we want to use the output of `git describe` to generate the version of Kubernetes and the status of the build tree (add a `-dirty` suffix if the tree was modified.)

When building from a tarball or using the Go build system, we will not have access to the information about the git tree, but we still want to be able to tell whether this build corresponds to an exact release (e.g. v0.3) or is between releases (e.g. at some point in development between v0.3 and v0.4).

## Version Number Format

In order to account for these use cases, there are some specific formats that may end up representing the Kubernetes version. Here are a few examples:

- **v0.5**: This is official version 0.5 and this version will only be used when building from a clean git tree at the v0.5 git tag, or from a tree extracted from the tarball corresponding to that specific release.
- **v0.5-15-g0123abcd4567**: This is the `git describe` output and it indicates that we are 15 commits past the v0.5 release and that the SHA1 of the commit where the binaries were built was `0123abcd4567`. It is only possible to have this level of detail in the version information when building from git, not when building from a tarball.
- **v0.5-15-g0123abcd4567-dirty** or **v0.5-dirty**: The extra `-dirty` suffix means that the tree had local modifications or untracked files at the time of the build, so there's no guarantee that the source code matches exactly the state of the tree at the `0123abcd4567` commit or at the `v0.5` git tag (respectively).
- **v0.5-dev**: This means we are building from a tarball or using `go get` or, if we have a git tree, we are using `go install` directly, so it is not possible to inject the git version into the build information. Additionally, this is not an official release, so the `-dev` suffix indicates that the version we are building is after `v0.5` but before `v0.6`. (There is actually an exception where a commit with `v0.5-dev` is not present on `v0.6`; see later for details.)

## Injecting Version into Binaries

In order to cover the different build cases, we start by providing information that can be used when using only Go build tools or when we do not have the git version information available.

To be able to provide a meaningful version in those cases, we set the contents of variables in a Go source file that will be used when no overrides are present.

We are using `pkg/version/base.go` as the source of versioning in absence of information from git. Here is a sample of that file's contents:

```
var (
    gitVersion string = "v0.4-dev" // version from git, output of $(git describe)
    gitCommit  string = ""         // sha1 from git, output of $(git rev-parse HEAD)
)
```

This means a build with `go install` or `go get` or a build from a tarball will yield binaries that will identify themselves as `v0.4-dev` and will not be able to provide you with a SHA1.

To add the extra versioning information when building from git, the `hack/build-go.sh` script will gather that information (using `git describe` and `git rev-parse`) and then create a `-ldflags` string to pass to `go install` and tell the Go linker to override the contents of those variables at build time. It can, for instance, tell it to override `gitVersion` and set it to `v0.4-13-g4567bcdef6789-dirty` and set `gitCommit` to `4567bcdef6789...` which is the complete SHA1 of the (dirty) tree used at build time.

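A simplified sketch of that override; the shell variable names are illustrative, and the `-X name=value` form shown is that of newer Go toolchains (older ones separate the name and value with a space):

```sh
# Gather git information and override the variables in pkg/version/base.go.
version=$(git describe --dirty)
commit=$(git rev-parse HEAD)
pkg=github.com/GoogleCloudPlatform/kubernetes/pkg/version
go install -ldflags "-X ${pkg}.gitVersion=${version} -X ${pkg}.gitCommit=${commit}" ./...
```
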
## Handling Official Versions

Handling official versions from git is easy: as long as there is an annotated git tag pointing to a specific version, `git describe` will return that tag exactly, which matches the idea of an official version (e.g. `v0.5`).

Handling it on tarballs is a bit harder since the exact version string must be present in `pkg/version/base.go` for it to get embedded into the binaries. But simply creating a commit with `v0.5` on its own would mean that the commits coming after it would also get the `v0.5` version when built from a tarball or `go get`, while in fact they do not match `v0.5` (the one that was tagged) exactly.

To handle that case, creating a new release should involve creating two adjacent commits where the first of them will set the version to `v0.5` and the second will set it to `v0.5-dev`. In that case, even in the presence of merges, there will be a single commit where the exact `v0.5` version will be used and all others around it will either have `v0.4-dev` or `v0.5-dev`.

The diagram below illustrates it.

![Diagram of git commits involved in the release](releasing.png)

After working on `v0.4-dev` and merging PR 99 we decide it is time to release `v0.5`. So we start a new branch, create one commit to update `pkg/version/base.go` to include `gitVersion = "v0.5"` and `git commit` it.

We test it and make sure everything is working as expected.

Before sending a PR for it, we create a second commit on that same branch, updating `pkg/version/base.go` to include `gitVersion = "v0.5-dev"`. That will ensure that further builds (from tarball or `go install`) on that tree will always include the `-dev` suffix and will not have a `v0.5` version (since they do not match the official `v0.5` exactly.)

We then send PR 100 with both commits in it.

Once the PR is accepted, we can use `git tag -a` to create an annotated tag *pointing to the one commit* that has `v0.5` in `pkg/version/base.go` and push it to GitHub. (Unfortunately GitHub tags/releases are not annotated tags, so this needs to be done from a git client and pushed to GitHub using SSH.)

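Putting the procedure together, a sketch of the commands involved; the branch name is illustrative, and `678fed` stands in for the SHA1 of the commit that carries `v0.5`:

```sh
git checkout -b prepare-v0.5
# edit pkg/version/base.go: gitVersion = "v0.5"
git commit -am "Kubernetes version v0.5"
# edit pkg/version/base.go: gitVersion = "v0.5-dev"
git commit -am "Kubernetes version v0.5-dev"
# ...send the PR; once it is merged, tag the v0.5 commit (not the -dev one):
git tag -a -m "Kubernetes v0.5" v0.5 678fed
git push origin v0.5
```
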
## Parallel Commits

While we are working on releasing `v0.5`, other development takes place and other PRs get merged. For instance, in the example above, PRs 101 and 102 get merged to the master branch before the versioning PR gets merged.

This is not a problem; it is only slightly inaccurate, in that checking out the tree at commit `012abc`, at commit `345cde`, or at the merge commits of PR 101 or 102 will yield a version of `v0.4-dev` *but* those commits are not present in `v0.5`.

In that sense, there is a small window in which commits will get a `v0.4-dev` or `v0.4-N-gXXX` label: they are indeed later than `v0.4`, but they are not really before `v0.5`, in that `v0.5` does not contain those commits.

Unfortunately, there is not much we can do about it. On the other hand, other projects seem to live with that and it does not really become a large problem.

As an example, Docker commit a327d9b91edf has a `v1.1.1-N-gXXX` label but it is not present in Docker `v1.2.0`:

```
$ git describe a327d9b91edf
v1.1.1-822-ga327d9b91edf

$ git log --oneline v1.2.0..a327d9b91edf
a327d9b91edf Fix data space reporting from Kb/Mb to KB/MB

(Non-empty output here means the commit is not present on v1.2.0.)
```
