conlink 2.0.3 → 2.1.0

@@ -13,5 +13,12 @@ jobs:
13
13
  - name: Checkout
14
14
  uses: actions/checkout@v3
15
15
 
16
- - name: stub step
17
- run: "echo stub stub"
16
+ - name: npm install
17
+ run: npm install
18
+
19
+ - name: compose build of conlink
20
+ run: docker compose -f examples/test1-compose.yaml build
21
+
22
+ - name: "./run-tests.sh"
23
+ timeout-minutes: 5
24
+ run: time ./run-tests.sh
package/Dockerfile CHANGED
@@ -24,11 +24,11 @@ FROM node:16-slim as run
24
24
  RUN apt-get -y update
25
25
  # Runtime deps and utilities
26
26
  RUN apt-get -y install libpcap-dev tcpdump iproute2 iputils-ping curl \
27
- iptables bridge-utils \
27
+ iptables bridge-utils ethtool \
28
28
  openvswitch-switch openvswitch-testcontroller
29
29
 
30
30
  COPY --from=build /app/ /app/
31
- ADD link-add.sh link-del.sh /app/
31
+ ADD link-add.sh link-del.sh link-mirred.sh /app/
32
32
  ADD schema.yaml /app/build/
33
33
 
34
34
  ENV PATH /app:$PATH
package/README.md CHANGED
@@ -13,6 +13,8 @@ General:
13
13
  Other:
14
14
  * For Open vSwitch (OVS) bridging, the `openvswitch` kernel module
15
15
  must be loaded on the host system (where docker engine is running).
16
+ * For patch connections (`bridge: patch`), the kernel must support
17
+ tc qdisc mirred filtering via the `act_mirred` kernel module.
16
18
  * For podman usage (e.g. second part of `test3`), podman is required.
17
19
  * For remote connections/links (e.g. `test5`), the `geneve` (and/or
18
20
  `vxlan`) kernel module must be loaded on the host system (where
@@ -44,17 +46,28 @@ will also be required for the conlink container. In particular, if the
44
46
  container uses systemd, then it will likely use `SYS_NICE` and
45
47
  `NET_BROADCAST` and conlink will likewise need those capabilities.
46
48
 
47
- ### Bridging: Open vSwtich/OVS or Linux bridge
48
-
49
- Conlink creates bridges/switches and connects veth container links to
50
- those bridges (specified by `bridge:` in the link specification).
51
- By default, conlink will attempt to create Open vSwitch/OVS bridges
52
- for these connections, however, if the kernel does not provide support
53
- (`openvswitch` kernel module loaded), then conlink will fallback to
54
- using standard Linux bridges. The fallback behavior can be changed by
55
- setting the `--bridge-mode` option to either "ovs" or "linux". If the
56
- bridge mode is set to "ovs" then conlink will fail to start if the
57
- `openvswitch` kernel module is not detected.
49
+ ### Bridging: Open vSwitch/OVS, Linux bridge, and patch
50
+
51
+ Conlink connects container veth links together via a bridge or via a
52
+ direct patch. All veth type links must have a `bridge` property that
53
+ defines which links will be connected together (i.e. the same
54
+ broadcast domain). The default bridge mode is defined by the
55
+ `--default-bridge-mode` parameter and defaults to "auto". If a bridge
56
+ is set to mode "auto" then conlink will check if the kernel has the
57
+ `openvswitch` kernel module loaded and if so it will create an Open
58
+ vSwitch/OVS bridge/switch for that bridge, otherwise it will create a
59
+ regular Linux bridge (e.g. brctl). If any bridges are explicitly
60
+ defined with an "ovs" mode and the kernel does not have support then
61
+ conlink will stop/error on startup.
62
+
63
+ The "patch" mode will connect two links together using tc qdisc
64
+ ingress filters. This type of connection is equivalent to a patch panel
65
+ ("bump-in-the-wire") connection and all traffic will be passed between
66
+ the two links unchanged (unlike Linux and OVS bridges which typically
67
+ block certain bridge control broadcast traffic). The primary downside
68
+ of "patch" connections is that they are limited to two links whereas "ovs"
69
+ and "linux" bridge modes can support many links connected into the
70
+ same bridge (broadcast domain).
58
71
 
59
72
  ## Network Configuration Syntax
60
73
 
@@ -122,6 +135,21 @@ than the MTU of the parent (outer-dev) device.
122
135
  For the `netem` property, refer to the `netem` man page. The `OPTIONS`
123
136
  grammar defines the valid strings for the `netem` property.
124
137
 
138
+ ### Bridges
139
+
140
+ The bridge settings currently only support the "mode" setting. If
141
+ the mode is not specified in this section or the section is omitted
142
+ entirely, then bridges specified in the links configuration will
143
+ default to the value of the `--default-bridge-mode` parameter (which
144
+ itself defaults to "auto").
145
+
146
+ The following table describes the bridge properties:
147
+
148
+ | property | format | description |
149
+ |-----------|---------|--------------------------------|
150
+ | bridge | string | conlink bridge / domain name |
151
+ | mode | string | auto, ovs, linux, or patch |
152
+
125
153
  ### Tunnels
126
154
 
127
155
  Tunnels links/interfaces will be created and attached to the specified
@@ -476,6 +504,19 @@ Note: to connect to the vlan node (NODE2_HOST_ADDRESS) you will need
476
504
  to configure your physical switch/router with routing/connectivity to
477
505
  VLAN 5 on the same physical link to your host.
478
506
 
507
+ ### test9: bridge modes
508
+
509
+ This example demonstrates the supported bridge modes.
510
+
511
+ Start the test9 compose configuration using different bridge modes and
512
+ validate connectivity using ping:
513
+
514
+ ```
515
+ export BRIDGE_MODE="linux" # "ovs", "patch", "auto"
516
+ docker-compose -f examples/test9-compose.yaml up --build --force-recreate
517
+ docker-compose -f examples/test9-compose.yaml exec node ping 10.0.1.2
518
+ ```
519
+
479
520
  ## GraphViz network configuration rendering
480
521
 
481
522
  You can use d3 and GraphViz to create a visual graph rendering of
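Review note on the bridge-mode text in the README hunks above: the "auto" resolution it describes can be sketched in shell as follows. This is a minimal sketch, not conlink's implementation. `resolve_bridge_mode` and its `kmod_ovs` flag are hypothetical names, and the sketch assumes only the behavior stated in the README: "auto" becomes "ovs" when the `openvswitch` module is loaded and "linux" otherwise, while an explicit "ovs" without the module is an error.

```shell
# Hypothetical helper (not part of conlink): resolve a requested bridge mode.
# $1 = requested mode (auto, ovs, linux, patch); defaults to "auto"
# $2 = "1" if the openvswitch kernel module is loaded, anything else if not
resolve_bridge_mode() {
    local mode="${1:-auto}" kmod_ovs="${2:-0}"
    case "${mode}" in
        auto)
            # "auto" prefers OVS and falls back to a Linux bridge
            if [ "${kmod_ovs}" = "1" ]; then echo "ovs"; else echo "linux"; fi
            ;;
        ovs)
            # An explicit "ovs" request cannot fall back
            if [ "${kmod_ovs}" = "1" ]; then
                echo "ovs"
            else
                echo >&2 "mode is 'ovs', but no 'openvswitch' kernel module loaded"
                return 1
            fi
            ;;
        linux|patch)
            # These modes do not depend on the openvswitch module
            echo "${mode}"
            ;;
        *)
            echo >&2 "unknown bridge mode '${mode}'"
            return 1
            ;;
    esac
}

resolve_bridge_mode auto 0    # prints "linux"
```

Keeping the resolution in one pure function makes the fallback rule easy to check without touching the kernel or Open vSwitch.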
@@ -1,5 +1,10 @@
1
1
  version: "2.4"
2
2
 
3
+ services:
4
+ r0:
5
+ volumes:
6
+ - ./scripts:/scripts
7
+
3
8
  x-network:
4
9
  commands:
5
10
  - {service: r0, command: "python3 -m http.server 80"}
@@ -0,0 +1,32 @@
1
+ # This file demonstrates using different bridge modes and
2
+ # usage of the --default-bridge-mode parameter.
3
+
4
+ version: "2.4"
5
+
6
+ services:
7
+ network:
8
+ build: {context: ../}
9
+ image: conlink
10
+ pid: host
11
+ network_mode: none
12
+ cap_add: [SYS_ADMIN, NET_ADMIN, SYS_NICE, NET_BROADCAST, IPC_LOCK]
13
+ security_opt: [ 'apparmor:unconfined' ] # needed on Ubuntu 18.04
14
+ volumes:
15
+ - /var/run/docker.sock:/var/run/docker.sock
16
+ - /var/lib/docker:/var/lib/docker
17
+ - ./:/test
18
+ environment:
19
+ - BRIDGE_MODE
20
+ command: /app/build/conlink.js --default-bridge-mode linux --compose-file /test/test9-compose.yaml
21
+
22
+ node:
23
+ image: alpine
24
+ network_mode: none
25
+ scale: 2
26
+ command: sleep Infinity
27
+
28
+ x-network:
29
+ links:
30
+ - {bridge: s1, service: node, ip: 10.0.1.1/24}
31
+ bridges:
32
+ - {bridge: s1, mode: "${BRIDGE_MODE:-auto}"}
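Review note on `mode: "${BRIDGE_MODE:-auto}"` in the file above: the `${VAR:-default}` form follows the same rule as shell parameter expansion, where the default applies when the variable is unset or empty. The sketch below illustrates that rule only. `effective_bridge_mode` is a hypothetical helper, and which component performs the substitution for the `x-network` section is not shown in this diff.

```shell
# Hypothetical helper: show what "${BRIDGE_MODE:-auto}" expands to
# in the current environment.
effective_bridge_mode() {
    echo "${BRIDGE_MODE:-auto}"
}

# Each case runs in a subshell so the assignments do not leak.
( unset BRIDGE_MODE; effective_bridge_mode )    # prints "auto"
( BRIDGE_MODE="";    effective_bridge_mode )    # prints "auto"
( BRIDGE_MODE=patch; effective_bridge_mode )    # prints "patch"
```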
package/link-mirred.sh ADDED
@@ -0,0 +1,114 @@
1
+ #!/bin/bash
2
+
3
+ # Copyright (c) 2024, Equinix, Inc
4
+ # Licensed under MPL 2.0
5
+
6
+ set -e
7
+
8
+ usage () {
9
+ echo >&2 "${0} [OPTIONS] INTF0 INTF1"
10
+ echo >&2 ""
11
+ echo >&2 "Create traffic mirror/redirect between INTF0 and INTF1."
12
+ echo >&2 ""
13
+ echo >&2 "Positional arguments:"
14
+ echo >&2 " INTF0 is the first interface name"
15
+ echo >&2 " INTF1 is the second interface name"
16
+ echo >&2 ""
17
+ echo >&2 "INTF0 must exist, but if INTF1 is missing, then exit with 0."
18
+ echo >&2 "Each interface will be checked for correct ingress/mirred config"
19
+ echo >&2 "and configured if the configuration is missing."
20
+ echo >&2 "These two aspects make this script idempotent. It can be called"
21
+ echo >&2 "whenever either interface appears and when the second appears,"
22
+ echo >&2 "the mirror/redirect action will be setup fully/bidirectionally."
23
+ echo >&2 ""
24
+ echo >&2 "OPTIONS:"
25
+ echo >&2 " --verbose - Verbose output (set -x)"
26
+ exit 2
27
+ }
28
+
29
+ info() { echo "link-mirred [${LOG_ID}] ${*}"; }
30
+ warn() { >&2 echo "link-mirred [${LOG_ID}] ${*}"; }
31
+ die() { warn "ERROR: ${*}"; exit 1; }
32
+
33
+ # Idempotently add ingress qdisc to an interface
34
+ add_ingress() {
35
+ local IF=$1 res=
36
+
37
+ res=$(tc qdisc show dev ${IF} 2>&1)
38
+ case "${res}" in
39
+ *"qdisc ingress ffff:"*)
40
+ info "${IF} already has ingress qdisc"
41
+ ;;
42
+ ""|*"qdisc noqueue"*)
43
+ info "Adding ingress qdisc to ${IF}"
44
+ tc qdisc add dev "${IF}" ingress \
45
+ || die "Could not add ingress qdisc to ${IF}"
46
+ ;;
47
+ *)
48
+ die "${IF} has invalid ingress qdisc or could not be queried"
49
+ ;;
50
+ esac
51
+ }
52
+
53
+ # Idempotently add mirred filter redirect rule to an interface
54
+ add_mirred() {
55
+ local IF0=$1 IF1=$2 res=
56
+
57
+ res=$(tc filter show dev ${IF0} parent ffff: 2>&1)
58
+ case "${res}" in
59
+ *"action order 1: mirred (Egress Redirect to device ${IF1}"*)
60
+ info "${IF0} already has filter redirect action"
61
+ ;;
62
+ "")
63
+ info "Adding filter redirect action from ${IF0} to ${IF1}"
64
+ tc filter add dev ${IF0} parent ffff: protocol all u32 match u8 0 0 action \
65
+ mirred egress redirect dev ${IF1} \
66
+ || die "Could not add filter redirect action from ${IF0} to ${IF1}"
67
+ ;;
68
+ *)
69
+ die "${IF0} has invalid filter redirect action or could not be queried"
70
+ ;;
71
+ esac
72
+ }
73
+
74
+ # Parse arguments
75
+ VERBOSE=${VERBOSE:-}
76
+ positional=
77
+ while [ "${*}" ]; do
78
+ param=$1; OPTARG=$2
79
+ case ${param} in
80
+ --verbose) VERBOSE=1 ;;
81
+ -h|--help) usage ;;
82
+ *) positional="${positional} $1" ;;
83
+ esac
84
+ shift
85
+ done
86
+ set -- ${positional}
87
+ IF0=$1 IF1=$2
88
+
89
+ [ "${VERBOSE}" ] && set -x || true
90
+
91
+ # Sanity check arguments
92
+ [ "${IF0}" -a "${IF1}" ] || usage
93
+
94
+ LOG_ID="mirred ${IF0}:${IF1}"
95
+
96
+ # Sanity checks
97
+ if ! ip link show ${IF0} >/dev/null; then
98
+ die "${IF0} does not exist"
99
+ fi
100
+ if ! ip link show ${IF1} >/dev/null; then
101
+ info "${IF1} missing, exiting"
102
+ exit 0
103
+ fi
104
+
105
+ ### Do the work
106
+
107
+ info "Creating filter redirection action between ${IF0} and ${IF1}"
108
+
109
+ add_ingress ${IF0}
110
+ add_ingress ${IF1}
111
+ add_mirred ${IF0} ${IF1}
112
+ add_mirred ${IF1} ${IF0}
113
+
114
+ info "Created filter redirection action between ${IF0} and ${IF1}"
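Review note on the idempotency approach in `link-mirred.sh`: `add_ingress` decides what to do by pattern-matching the text returned by `tc qdisc show`. That decision can be isolated into a pure function and checked without root or `tc`. The sketch below is an illustration under that assumption. `classify_qdisc` is a hypothetical name, and the sample strings are illustrative rather than captured output.

```shell
# Hypothetical helper: classify the text that `tc qdisc show dev IF` returned,
# using the same three patterns as add_ingress in link-mirred.sh.
classify_qdisc() {
    case "${1}" in
        *"qdisc ingress ffff:"*) echo "present" ;;  # nothing to do
        ""|*"qdisc noqueue"*)    echo "missing" ;;  # safe to add the ingress qdisc
        *)                       echo "invalid" ;;  # unexpected state or query error
    esac
}

classify_qdisc "qdisc noqueue 0: root refcnt 2"    # prints "missing"
```

The ingress pattern is tested first, so output that lists both a `noqueue` root qdisc and an ingress qdisc is still reported as "present".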
package/mdc CHANGED
@@ -17,6 +17,7 @@ LS=$(which ls)
17
17
  RESOLVE_DEPS="${RESOLVE_DEPS-./node_modules/@lonocloud/resolve-deps/resolve-deps.py}"
18
18
  DOCKER_COMPOSE="${DOCKER_COMPOSE:-docker-compose}"
19
19
 
20
+ [ -f "${RESOLVE_DEPS}" ] || die "Missing ${RESOLVE_DEPS}. Perhaps 'npm install'?"
20
21
  MODE_SPEC="${1}"; shift
21
22
  if [ "${RESOLVE_DEPS}" ]; then
22
23
  MODES="$(${RESOLVE_DEPS} "${MODES_DIR}" ${MODE_SPEC})"
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "conlink",
3
- "version": "2.0.3",
3
+ "version": "2.1.0",
4
4
  "description": "conlink - Declarative Low-Level Networking for Containers",
5
5
  "repository": "https://github.com/LonoCloud/conlink",
6
6
  "license": "SEE LICENSE IN LICENSE",
package/run-tests.sh ADDED
@@ -0,0 +1,191 @@
1
+ #!/usr/bin/env bash
2
+
3
+ export VERBOSE=${VERBOSE:-}
4
+ export COMPOSE_PROJECT_NAME=${COMPOSE_PROJECT_NAME:-conlink-test}
5
+ declare TEST_NUM=0
6
+ declare -A RESULTS
7
+ declare PASS=0
8
+ declare FAIL=0
9
+
10
+ die() { echo >&2 "${*}"; exit 1; }
11
+ vecho() { [ "${VERBOSE}" ] && echo "${*}" || true; }
12
+ dc() { ${DOCKER_COMPOSE} "${@}"; }
13
+ mdc() { ./mdc "${@}" || die "mdc invocation failed"; }
14
+
15
+ # Determine compose command
16
+ for dc in "docker compose" "docker-compose"; do
17
+ ${dc} version 2>/dev/null >&2 && DOCKER_COMPOSE="${dc}" && break
18
+ done
19
+ [ "${DOCKER_COMPOSE}" ] || die "No compose command found"
20
+ echo >&2 "Using compose command '${DOCKER_COMPOSE}'"
21
+
22
+ dc_init() {
23
+ local cont="${1}" idx="${2}"
24
+ dc down --remove-orphans -t1
25
+ dc up -d --force-recreate "${@}"
26
+ while ! dc logs network | grep "All links connected"; do
27
+ vecho "waiting for conlink startup"
28
+ sleep 1
29
+ done
30
+ }
31
+
32
+ dc_wait() {
33
+ local tries="${1}" cont="${2}" try=1 svc= idx= result=
34
+ case "${cont}" in
35
+ *_[0-9]|*_[0-9][0-9]) svc="${cont%_*}" idx="${cont##*_}" ;;
36
+ *) svc="${cont}" idx=1 ;;
37
+ esac
38
+ shift; shift
39
+
40
+ #echo "target: ${1}, service: ${svc}, index: ${idx}"
41
+ while true; do
42
+ result=0
43
+ if [ "${VERBOSE}" ]; then
44
+ vecho "Running: dc exec -T --index ${idx} ${svc} sh -c ${*}"
45
+ dc exec -T --index ${idx} ${svc} sh -c "${*}" || result=$?
46
+ else
47
+ dc exec -T --index ${idx} ${svc} sh -c "${*}" > /dev/null || result=$?
48
+ fi
49
+ [ "${result}" -eq 0 -o "${try}" -ge "${tries}" ] && break
50
+ echo " command failed (${result}), sleeping 2s before retry (${try}/${tries})"
51
+ sleep 2
52
+ try=$(( try + 1 ))
53
+ done
54
+ return ${result}
55
+ }
56
+
57
+ dc_test() {
58
+ name="${TEST_NUM} ${GROUP}: ${*}"
59
+ TEST_NUM=$(( TEST_NUM + 1 ))
60
+ vecho " > Running test: ${name}"
61
+ dc_wait 1 "${@}"
62
+ RESULTS["${name}"]=$?
63
+ if [ "${RESULTS["${name}"]}" = 0 ]; then
64
+ PASS=$(( PASS + 1 ))
65
+ vecho " > PASS (0 for ${*})"
66
+ else
67
+ FAIL=$(( FAIL + 1 ))
68
+ echo " > FAIL (${RESULTS[${name}]} for ${*})"
69
+ fi
70
+ }
71
+
72
+
73
+ echo -e "\n\n>>> test1: combined config"
74
+ GROUP=test1
75
+ echo "COMPOSE_FILE=examples/test1-compose.yaml" > .env
76
+ dc_init || die "test1 startup failed"
77
+
78
+ echo " >> Ping nodes from other nodes"
79
+ dc_test h1 ping -c1 10.0.0.100
80
+ dc_test h2 ping -c1 192.168.1.100
81
+ dc_test h3 ping -c1 172.16.0.100
82
+
83
+ echo -e "\n\n>>> test2: separate config and scaling"
84
+ GROUP=test2
85
+ echo "COMPOSE_FILE=examples/test2-compose.yaml" > .env
86
+ dc_init || die "test2 startup failed"
87
+
88
+ echo " >> Cross-node ping and ping the 'internet'"
89
+ dc_test node_1 ping -c1 10.0.1.2
90
+ dc_test node_2 ping -c1 10.0.1.1
91
+ dc_test node_1 ping -c1 8.8.8.8
92
+ dc_test node_2 ping -c1 8.8.8.8
93
+
94
+ echo " >> Scale the nodes from 2 to 5"
95
+ dc up -d --scale node=5
96
+ dc_wait 10 node_5 'ip addr | grep "10\.0\.1\.5"' || die "test2 scale-up failed"
97
+ echo " >> Ping the fifth node from the second"
98
+ dc_test node_2 ping -c1 10.0.1.5
99
+
100
+
101
+ echo -e "\n\n>>> test4: multiple compose / mdc"
102
+ GROUP=test4
103
+ export MODES_DIR=./examples/test4-multiple/modes
104
+
105
+ mdc node1
106
+ dc_init; dc_wait 10 r0_1 'ip addr | grep "10\.1\.0\.100"' \
107
+ || die "test4 node1 startup failed"
108
+ echo " >> Ping the r0 router host from node1"
109
+ dc_test node1_1 ping -c1 10.0.0.100
110
+
111
+ mdc node1,nodes2
112
+ dc_init; dc_wait 10 node2_2 'ip addr | grep "10\.2\.0\.2"' \
113
+ || die "test4 node1,nodes2 startup failed"
114
+ echo " >> From both node2 replicas, ping node1 across the r0 router"
115
+ dc_test node2_1 ping -c1 10.1.0.1
116
+ dc_test node2_2 ping -c1 10.1.0.1
117
+ echo " >> From node1, ping both node2 replicas across the r0 router"
118
+ dc_test node1 ping -c1 10.2.0.1
119
+ dc_test node1 ping -c1 10.2.0.2
120
+
121
+ mdc all
122
+ dc_init; dc exec -T r0 /scripts/wait.sh -t 10.0.0.100:80 \
123
+ || die "test4 all startup failed"
124
+ echo " >> From node2, download from the web server in r0"
125
+ dc_test node2_1 wget -O- 10.0.0.100
126
+ dc_test node2_2 wget -O- 10.0.0.100
127
+
128
+
129
+ echo -e "\n\n>>> test7: MAC, MTU, and NetEm settings"
130
+ GROUP=test7
131
+ echo "COMPOSE_FILE=examples/test7-compose.yaml" > .env
132
+
133
+ dc_init; dc_wait 10 node_1 'ip addr | grep "10\.0\.1\.1"' \
134
+ || die "test7 startup failed"
135
+ echo " >> Ensure MAC and MTU are set correctly"
136
+ dc_test node_1 'ip link show eth0 | grep "ether 00:0a:0b:0c:0d:01"'
137
+ dc_test node_2 'ip link show eth0 | grep "ether 00:0a:0b:0c:0d:02"'
138
+ dc_test node_1 'ip link show eth0 | grep "mtu 4111"'
139
+ dc_test node_2 'ip link show eth0 | grep "mtu 4111"'
140
+ echo " >> Check for round-trip ping delay of 80ms"
141
+ dc_test node_1 'ping -c2 10.0.1.2 | tail -n1 | grep "max = 80\."'
142
+
143
+
144
+ echo -e "\n\n>>> test9: bridge modes and variable templating"
145
+ echo "COMPOSE_FILE=examples/test9-compose.yaml" > .env
146
+
147
+ echo -e "\n\n >> test9: bridge mode: auto"
148
+ GROUP=test9-auto
149
+ export BRIDGE_MODE=auto
150
+ dc_init; dc_wait 10 node_1 'ip addr | grep "10\.0\.1\.1"' \
151
+ || die "test9 (auto) startup failed"
152
+ echo " >> Check for round-trip ping connectivity (BRIDGE_MODE=auto)"
153
+ dc_test node_1 'ping -c2 10.0.1.2'
154
+
155
+ echo -e "\n\n >> test9: bridge mode: linux"
156
+ GROUP=test9-linux
157
+ export BRIDGE_MODE=linux
158
+ dc_init; dc_wait 10 node_1 'ip addr | grep "10\.0\.1\.1"' \
159
+ || die "test9 (linux) startup failed"
160
+ echo " >> Check for round-trip ping connectivity (BRIDGE_MODE=linux)"
161
+ dc_test node_1 'ping -c2 10.0.1.2'
162
+
163
+ echo -e "\n\n >> test9: bridge mode: patch"
164
+ GROUP=test9-patch
165
+ export BRIDGE_MODE=patch
166
+ dc_init; dc_wait 10 node_1 'ip addr | grep "10\.0\.1\.1"' \
167
+ || die "test9 (patch) startup failed"
168
+ echo " >> Ensure ingress filter rules exist (BRIDGE_MODE=patch)"
169
+ dc_test network 'tc filter show dev node_1-eth0 parent ffff: | grep "action order 1: mirred"'
170
+ echo " >> Check for round-trip ping connectivity (BRIDGE_MODE=patch)"
171
+ dc_test node_1 'ping -c2 10.0.1.2'
172
+
173
+
174
+ echo -e "\n\n>>> Cleaning up"
175
+ dc down -t1 --remove-orphans
176
+ rm -f .env
177
+
178
+ if [ "${VERBOSE}" ]; then
179
+ for t in "${!RESULTS[@]}"; do
180
+ echo "RESULT: '${t}' -> ${RESULTS[${t}]}"
181
+ done
182
+ fi
183
+
184
+ if [ "${FAIL}" = 0 ]; then
185
+ echo -e "\n\n>>> ALL ${PASS} TESTS PASSED"
186
+ exit 0
187
+ else
188
+ echo -e "\n\n>>> ${FAIL} TESTS FAILED, ${PASS} TESTS PASSED"
189
+ exit 1
190
+ fi
191
+
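Review note on the retry loop in `dc_wait`: the same pattern can be written as a standalone helper that wraps any command. This is a sketch modeled on the loop above, not a drop-in replacement. `retry` and its `delay` argument are hypothetical, and the delay is a parameter here (the script hard-codes 2 seconds) so the loop can be exercised quickly.

```shell
# Hypothetical helper modeled on dc_wait: run a command up to TRIES times,
# sleeping DELAY seconds between attempts, and return the last exit code.
# Usage: retry TRIES DELAY COMMAND [ARGS...]
retry() {
    local tries="${1}" delay="${2}" try=1 result=0
    shift; shift
    while true; do
        result=0
        "${@}" || result=$?
        # Stop on success or once the attempt budget is used up
        if [ "${result}" -eq 0 ] || [ "${try}" -ge "${tries}" ]; then
            break
        fi
        echo >&2 "  command failed (${result}), sleeping ${delay}s before retry (${try}/${tries})"
        sleep "${delay}"
        try=$(( try + 1 ))
    done
    return "${result}"
}

retry 3 0 true && echo "ok"    # prints "ok"
```

Returning the final exit code rather than a fixed value lets the caller record the real failure, as `dc_test` does with `RESULTS`.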
package/schema.yaml CHANGED
@@ -46,6 +46,13 @@ properties:
46
46
  mode: {type: string}
47
47
  vlanid: {type: number}
48
48
 
49
+ bridges:
50
+ type: array
51
+ items:
52
+ type: object
53
+ properties:
54
+ mode: {type: string, enum: [auto, ovs, linux, patch]}
55
+
49
56
  tunnels:
50
57
  type: array
51
58
  items:
@@ -27,7 +27,7 @@ General Options:
27
27
  -v, --verbose Show verbose output (stderr)
28
28
  [env: VERBOSE]
29
29
  --show-config Print loaded network config JSON and exit
30
- --bridge-mode BRIDGE-MODE Bridge mode (ovs, linux, or auto)
30
+ --default-bridge-mode BRIDGE-MODE Default bridge mode (ovs, linux, patch, or auto)
31
31
  to use for bridge/switch connections
32
32
  [default: auto] [env: CONLINK_BRIDGE_MODE]
33
33
  --network-file NETWORK-FILE... Network config file
@@ -58,10 +58,12 @@ General Options:
58
58
  (def LINK-ADD-OPTS [:ip :mac :route :mtu :nat :netem :mode :vlanid :remote :vni])
59
59
  (def INTF-MAX-LEN 15)
60
60
 
61
- (def ctx (atom {:error #(apply Eprintln "ERROR:" %&)
62
- :warn #(apply Eprintln "WARNING:" %&)
63
- :log Eprintln
64
- :info list}))
61
+ (def ctx (atom {:error #(apply Eprintln "ERROR:" %&)
62
+ :warn #(apply Eprintln "WARNING:" %&)
63
+ :log Eprintln
64
+ :info #(identity nil)
65
+ :kmod-ovs? false
66
+ :kmod-mirred? false}))
65
67
 
66
68
  ;; Simple utility functions
67
69
  (defn json-str [obj]
@@ -94,12 +96,13 @@ General Options:
94
96
  net-cfg))
95
97
 
96
98
  (defn enrich-link
97
- "Add default values to a link:
99
+ "Resolve bridge name to full bridge map.
100
+ Add default values to a link:
98
101
  - type: veth
99
102
  - dev: eth0
100
103
  - mtu: 9000 (for non *vlan type)
101
104
  - base: :conlink for veth type, :host for *vlan types, :local otherwise"
102
- [{:as link :keys [type base bridge ip vlanid]}]
105
+ [{:as link :keys [type base bridge ip vlanid]} bridges]
103
106
  (let [type (keyword (or type "veth"))
104
107
  base-default (cond (= :veth type) :conlink
105
108
  (VLAN-TYPES type) :host
@@ -110,18 +113,57 @@ General Options:
110
113
  {:type type
111
114
  :dev (get link :dev "eth0")
112
115
  :base base}
116
+ (when bridge
117
+ {:bridge (get bridges bridge)})
113
118
  (when (not (VLAN-TYPES type))
114
119
  {:mtu (get link :mtu 9000)}))]
115
120
  link))
116
121
 
122
+ (defn enrich-bridge
123
+ "If bridge mode is :auto then return :ovs if the 'openvswitch' kernel module
124
+ is loaded otherwise fall back to :linux. Exit with an error if mode is :ovs
125
+ and the 'openvswitch' kernel module is not loaded. Only warn if mode is
126
+ :patch and the 'act_mirred' kernel module is not loaded."
127
+ [{:as bridge-opts :keys [bridge mode]}]
128
+ (let [{:keys [warn default-bridge-mode kmod-ovs? kmod-mirred?]} @ctx
129
+ mode (keyword (or mode default-bridge-mode))
130
+ _ (when (and (= :ovs mode) (not kmod-ovs?))
131
+ (fatal 1 (str "bridge " bridge " mode is 'ovs', "
132
+ "but no 'openvswitch' kernel module loaded")))
133
+ _ (when (and (= :patch mode) (not kmod-mirred?))
134
+ (warn (str "bridge " bridge " mode is 'patch', "
135
+ "but no 'act_mirred' kernel module loaded, "
136
+ "assuming it will load when needed.")))
137
+ _ (when (and (= :auto mode) (not kmod-ovs?))
138
+ (warn (str "bridge " bridge " mode is 'auto', "
139
+ "but no 'openvswitch' kernel module loaded, "
140
+ "so falling back to 'linux'")))
141
+ mode (if (= :auto mode)
142
+ (if kmod-ovs? :ovs :linux)
143
+ mode)]
144
+ (assoc bridge-opts :mode mode)))
145
+
117
146
  (defn enrich-network-config
118
- "Validate and update each link (enrich-link) and add
119
- :containers and :services maps with restructured link and command
120
- configuration to provide a more efficient structure for looking up
121
- configuration later."
122
- [{:as cfg :keys [links commands]}]
123
- (let [links (vec (map enrich-link links))
124
- cfg (merge cfg {:links links :containers {} :services {}})
147
+ "Validate and update each bridge (enrich-bridge) and link (enrich-link) and
148
+ add :bridges, :containers, and :services maps with restructured bridge, link,
149
+ and command configuration to provide a more efficient structure for looking
150
+ up configuration later."
151
+ [{:as cfg :keys [links commands bridges]}]
152
+ (let [bridge-map (reduce (fn [acc b] (assoc acc (:bridge b) b))
153
+ {} bridges)
154
+ ;; Add bridges specified in links only
155
+ all-bridges (reduce (fn [bs b]
156
+ (assoc bs b (get bs b {:bridge b})))
157
+ bridge-map
158
+ (keep :bridge links))
159
+ ;; Enrich each bridge
160
+ bridges (reduce (fn [bs [k v]] (assoc bs k (enrich-bridge v)))
161
+ {} all-bridges)
162
+ links (mapv #(enrich-link % bridges) links)
163
+ cfg (merge cfg {:links links
164
+ :bridges bridges
165
+ :containers {}
166
+ :services {}})
125
167
  rfn (fn [kind cfg {:as x :keys [container service]}]
126
168
  (cond-> cfg
127
169
  container (update-in [:containers container kind] conjv x)
@@ -153,15 +195,17 @@ General Options:
153
195
  "\nUser config:\n" (indent-pprint-str data " "))
154
196
  "\nValidation errors:\n" msg))))))
155
197
 
198
+
199
+ ;;; Runtime state related
200
+
156
201
  (defn gen-network-state
157
202
  "Generate network state/context from network configuration. Adds
158
203
  empty :devices map and :bridges map containing nil status for
159
204
  each bridge mentioned in the network config :links and :tunnels."
160
- [{:keys [links tunnels]}]
161
- (reduce (fn [state bridge]
162
- (assoc-in state [:bridges bridge :status] nil))
163
- {:devices {} :bridges {}}
164
- (keep :bridge (concat links tunnels))))
205
+ [{:keys [links tunnels bridges]}]
206
+ {:devices {}
207
+ :bridges (into {} (for [[k v] bridges]
208
+ [k (merge v {:status nil :links #{}})]))})
165
209
 
166
210
  (defn link-outer-dev
167
211
  "outer-dev format:
@@ -298,81 +342,127 @@ General Options:
298
342
  res (run cmd {:quiet true})]
299
343
  (and (= 0 (:code res)) (= kmod (trim (:stdout res))))))
300
344
 
301
- ;;; Link and bridge commands
345
+ ;;; Bridge commands
302
346
 
303
347
  (defn check-no-bridge
304
348
  "Check that no bridge named 'bridge' is currently configured.
305
- Bridge type is dependent on bridge-mode (:ovs or :linux). Exit with
349
+ Bridge type is dependent on mode (:ovs or :linux). Exit with
306
350
  error if the bridge already exists."
307
- [bridge]
308
- (P/let [{:keys [info bridge-mode]} @ctx
351
+ [{:keys [bridge mode]}]
352
+ (P/let [{:keys [info]} @ctx
309
353
  cmd (get {:ovs (str "ovs-vsctl list-ifaces " bridge)
310
- :linux (str "ip link show type bridge " bridge)}
311
- bridge-mode)
312
- res (run cmd {:quiet true})]
313
- (if (= 0 (:code res))
314
- ;; TODO: maybe mark as :exists and use without cleanup
315
- (fatal 1 (str "Bridge " bridge " already exists"))
316
- (if (re-seq #"(does not exist|no bridge named)" (:stderr res))
317
- true
318
- (fatal 1 (str "Unable to run '" cmd "': " (:stderr res)))))))
354
+ :linux (str "ip link show type bridge " bridge)
355
+ :patch nil}
356
+ mode)]
357
+ (if (not cmd)
358
+ true
359
+ (P/let [res (run cmd {:quiet true})]
360
+ (if (= 0 (:code res))
361
+ ;; TODO: maybe mark as :exists and use without cleanup
362
+ (fatal 1 (str "Bridge " bridge " already exists"))
363
+ (if (re-seq #"(does not exist|no bridge named)" (:stderr res))
364
+ true
365
+ (fatal 1 (str "Unable to run '" cmd "': " (:stderr res)))))))))
319
366
 
320
367
 
321
368
  (defn bridge-create
322
369
  "Create a bridge named 'bridge'.
323
- Bridge type is dependent on bridge-mode (:ovs or :linux)."
324
- [bridge]
325
- (P/let [{:keys [info error bridge-mode]} @ctx
326
- _ (info "Creating bridge/switch" bridge)
370
+ Bridge type is dependent on mode (:ovs or :linux)."
371
+ [{:keys [bridge mode]}]
372
+ (P/let [{:keys [info error]} @ctx
327
373
  cmd (get {:ovs (str "ovs-vsctl add-br " bridge)
328
- :linux (str "ip link add " bridge " up type bridge")}
329
- bridge-mode)
330
- res (run cmd)]
331
- (if (not= 0 (:code res))
332
- (error (str "Unable to create bridge/switch " bridge))
333
- (swap! ctx assoc-in [:network-state :bridges bridge :status] :created))
334
- res))
374
+ :linux (str "ip link add " bridge " up type bridge")
375
+ :patch nil}
376
+ mode)]
377
+ (if (not cmd)
378
+ (info (str "Ignoring bridge/switch " bridge " for mode " mode))
379
+ (P/let [_ (info "Creating bridge/switch" bridge)
380
+ res (run cmd)]
381
+ (if (not= 0 (:code res))
382
+ (error (str "Unable to create bridge/switch " bridge))
383
+ (swap! ctx assoc-in [:network-state :bridges bridge :status] :created))
384
+ true))))
335
385
 
336
386
  (defn bridge-del
337
387
  "Delete the bridge named 'bridge'.
338
- Bridge type is dependent on bridge-mode (:ovs or :linux)."
339
- [bridge]
340
- (P/let [{:keys [info error bridge-mode]} @ctx
341
- _ (info "Deleting bridge/switch" bridge)
388
+ Bridge type is dependent on mode (:ovs or :linux)."
389
+ [{:keys [bridge mode]}]
390
+ (P/let [{:keys [info error]} @ctx
342
391
  cmd (get {:ovs (str "ovs-vsctl del-br " bridge)
343
- :linux (str "ip link del " bridge)} bridge-mode)
344
- res (run cmd)]
345
- (if (not= 0 (:code res))
346
- (error (str "Unable to delete bridge " bridge))
347
- (swap! ctx assoc-in [:network-state :bridges bridge :status] nil))
348
- res))
392
+ :linux (str "ip link del " bridge)
393
+ :patch nil} mode)]
394
+ (if (not cmd)
395
+ (info (str "Ignoring bridge/switch " bridge " for mode " mode))
396
+ (P/let [_ (info "Deleting bridge/switch" bridge)
397
+ res (run cmd)]
398
+ (if (not= 0 (:code res))
399
+ (error (str "Unable to delete bridge " bridge))
400
+ (swap! ctx assoc-in [:network-state :bridges bridge :status] nil))
401
+ true))))
349
402
 
350
403
  (defn bridge-add-link
351
404
  "Add the link/interface 'dev' to the bridge 'bridge'.
352
- Bridge type is dependent on bridge-mode (:ovs or :linux)."
353
- [bridge dev]
354
- (P/let [{:keys [error bridge-mode]} @ctx
405
+ Bridge type is dependent on mode (:ovs or :linux)."
406
+ [{:keys [bridge mode]} dev]
407
+ (P/let [{:keys [error]} @ctx
355
408
  cmd (get {:ovs (str "ovs-vsctl add-port " bridge " " dev)
356
409
  :linux (str "ip link set dev " dev " master " bridge)}
357
- bridge-mode)
410
+ mode)
358
411
  res (run cmd)]
359
- (when (not= 0 (:code res))
360
- (error (str "Unable to add link " dev " into " bridge)))
361
- res))
412
+ (if (= 0 (:code res))
413
+ (swap! ctx update-in [:network-state :bridges bridge :links] conj dev)
414
+ (error (str "Unable to add link " dev " into " bridge)))))
362
415
 
363
416
  (defn bridge-drop-link
364
417
  "Remove the link/interface 'dev' from the bridge 'bridge'.
365
- Bridge type is dependent on bridge-mode (:ovs or :linux)."
366
- [bridge dev]
367
- (P/let [{:keys [error bridge-mode]} @ctx
418
+ Bridge type is dependent on mode (:ovs or :linux)."
419
+ [{:keys [bridge mode]} dev]
420
+ (P/let [{:keys [error]} @ctx
368
421
  cmd (get {:ovs (str "ovs-vsctl del-port " bridge " " dev)
369
422
  :linux (str "ip link set dev " dev " nomaster")}
370
- bridge-mode)
423
+ mode)
371
424
  res (run cmd)]
372
- (when (not= 0 (:code res))
373
- (error (str "Unable to drop link " dev " from " bridge)))
374
- res))
425
+ (if (= 0 (:code res))
426
+ (swap! ctx update-in [:network-state :bridges bridge :links] disj dev)
427
+ (error (str "Unable to drop link " dev " from " bridge)))))
428
+
429
+ (defn patch-add-link
430
+ "Setup patch between 'dev' and its peer link using tc qdisc mirred
431
+ filter action. Peer links are tracked in pseudo-bridge 'bridge'."
432
+ [{:keys [bridge mode]} dev]
433
+ (let [{:keys [info error]} @ctx
434
+ links-path [:network-state :bridges bridge :links]
435
+ links (get-in @ctx links-path)
436
+ peers (disj links dev)]
437
+ (condp = (count peers)
438
+ 0
439
+ (P/do
440
+ (info (str "Registering first peer link "
441
+ dev " in :patch 'bridge' " bridge))
442
+ (swap! ctx update-in links-path conj dev))
443
+
444
+ 1
445
+ (P/let [cmd (str "link-mirred.sh " dev " " (first peers))
446
+ res (run cmd)]
447
+ (if (= 0 (:code res))
448
+ (swap! ctx update-in links-path conj dev)
449
+ (error (str "Failed to setup tc filter action for "
450
+ dev " in :patch 'bridge' " bridge))))
451
+
452
+ (error "Cannot add third peer link "
453
+ dev " to :patch 'bridge' " bridge))))
375
454
 
455
+ (defn patch-drop-link
456
+ "Remove tracking of 'dev' from pseudo-bridge 'bridge'."
457
+ [{:keys [bridge mode]} dev]
458
+ (let [{:keys [info error]} @ctx
459
+ links-path [:network-state :bridges bridge :links]]
460
+ (info (str "Removing peer link "
461
+ dev " from :patch 'bridge' " bridge))
462
+ ;; State is in the links, no extra cleanup
463
+ (swap! ctx update-in links-path disj dev)))
464
+
465
+ ;;; Link commands
376
466
 
377
467
  (defn link-add
378
468
  "Create a link/interface defined by 'link' in a container by calling
@@ -519,7 +609,10 @@ General Options:
519
609
  (P/do
520
610
  (swap! ctx assoc-in status-path :creating)
521
611
  (link-add link)
522
- (when bridge (bridge-add-link bridge outer-dev))
612
+ (when bridge
613
+ (if (= :patch (:mode bridge))
614
+ (patch-add-link bridge outer-dev)
615
+ (bridge-add-link bridge outer-dev)))
523
616
  (swap! ctx assoc-in status-path :created)))
524
617
 
525
618
  "die"
@@ -527,7 +620,10 @@ General Options:
527
620
  (error (str "Link " dev-id " does not exist"))
528
621
  (P/do
529
622
  (swap! ctx assoc-in status-path :deleting)
530
- (when bridge (bridge-drop-link bridge outer-dev))
623
+ (when bridge
624
+ (if (= :patch (:mode bridge))
625
+ (patch-drop-link bridge outer-dev)
626
+ (bridge-drop-link bridge outer-dev)))
531
627
  (link-del link)
532
628
  (swap! ctx assoc-in status-path nil))))))
533
629
 
@@ -635,7 +731,7 @@ General Options:
  (when (seq bridges)
  (P/do
  (log (str "Removing bridges: " (S/join ", " (keys bridges))))
- (P/all (map bridge-del (keys bridges)))))
+ (P/all (map bridge-del (vals bridges)))))
  (js/process.exit 127))))
  
  
@@ -650,40 +746,6 @@ General Options:
  (when (empty? config-schema)
  (fatal 2 "Could not find config-schema" orig-config-schema)))
  
- (defn startup-checks
- "Check startup state and return map of :bridge-mode, :docker, and
- :podman. If bridge-mode is :auto then return :ovs if the
- 'openvswitch' kernel module is loaded otherwise fall back to :linux.
- Exit with an error if bridge-mode is :ovs and the 'openvswitch'
- kernel module is not loaded or if neither a docker or podman
- connection could be established."
- [{:keys [bridge-mode docker-socket podman-socket]}]
- (P/let
- [{:keys [info warn]} @ctx
- ovs? (kmod-loaded? "openvswitch")
- bridge-mode (condp = [bridge-mode ovs?]
- [:auto true]
- :ovs
-
- [:auto false]
- (do
- (warn (str "bridge-mode is 'auto' but no 'openvswitch' "
- "kernel module loaded, so using 'linux'"))
- :linux)
-
- [:ovs false]
- (fatal 1 (str "bridge-mode is 'ovs', but no 'openvswitch' "
- "kernel module loaded"))
-
- bridge-mode)
- docker (docker-client docker-socket)
- podman (docker-client podman-socket)]
- (when (and (not docker) (not podman))
- (fatal 1 "Failed to start either docker or podman client/listener"))
- {:bridge-mode bridge-mode
- :docker docker
- :podman podman}))
-
  (defn server
  "Process:
  - parse/validate command line options
@@ -692,7 +754,7 @@ General Options:
  - determine our own container ID and compose properties (if any)
  - generate runtime network state and other process context/state
  - install exit/cleanup handlers
- - start/init openvswitch daemons/config (if :ovs bridge-mode)
+ - start/init openvswitch daemons/config (if any bridges use :ovs mode)
  - check that any defined bridges do not already exist
  - create any bridges defined in network config links
  - start listening/handling docker/podman container events
@@ -704,7 +766,7 @@ General Options:
  {:keys [log info]} (swap! ctx merge (when verbose {:info Eprintln}))
  opts (merge
  opts
- {:bridge-mode (keyword (:bridge-mode opts))
+ {:default-bridge-mode (keyword (:default-bridge-mode opts))
  :orig-config-schema (:config-schema opts)
  :config-schema (resolve-path (:config-schema opts) SCHEMA-PATHS)
  :network-file (mapcat #(S/split % #":") (:network-file opts))
@@ -716,6 +778,11 @@ General Options:
  env (js->clj (js/Object.assign #js {} js/process.env))
  self-pid js/process.pid
  schema (load-config (:config-schema opts))
+ kmod-ovs? (kmod-loaded? "openvswitch")
+ kmod-mirred? (kmod-loaded? "act_mirred")
+ _ (swap! ctx merge {:default-bridge-mode (:default-bridge-mode opts)
+ :kmod-ovs? kmod-ovs?
+ :kmod-mirred? kmod-mirred?})
  network-config (P/-> (load-configs compose-file network-file)
  (interpolate-walk env)
  (check-schema schema verbose)
@@ -724,7 +791,11 @@ General Options:
  (println (js/JSON.stringify (->js network-config)))
  (js/process.exit 0))
  
- {:keys [bridge-mode docker podman]} (startup-checks opts)
+ docker (docker-client (:docker-socket opts))
+ podman (docker-client (:podman-socket opts))
+ _ (when (and (not docker) (not podman))
+ (fatal 1 "Failed to start either docker or podman client/listener"))
+
  self-cid (get-container-id)
  self-container-obj (when self-cid
  (get-container (or docker podman) self-cid))
@@ -733,8 +804,7 @@ General Options:
  {:project compose-project}
  (get-compose-labels self-container))
  network-state (gen-network-state network-config)
- ctx-data {:bridge-mode bridge-mode
- :network-config network-config
+ ctx-data {:network-config network-config
  :network-state network-state
  :compose-opts compose-opts
  :docker docker
@@ -749,7 +819,6 @@ General Options:
  (js/process.on "SIGTERM" #(exit-handler % "signal"))
  (js/process.on "uncaughtException" #(exit-handler %1 %2))
  
- (log "Bridge mode:" (name bridge-mode))
  (log (str "Using schema at '" (:config-schema opts) "'"))
  (info (str "Starting network config\n"
  (indent-pprint-str network-config " ")))
@@ -764,15 +833,16 @@ General Options:
  (when self-cid
  (rename-docker-eth0))
  
- (when (= :ovs bridge-mode)
+ (when (some #(= :ovs (:mode %)) (-> network-config :bridges vals))
  (start-ovs))
  
  ;; Check that bridges/switches do not already exist
- (P/all (for [bridge (-> network-state :bridges keys)]
+ (P/all (for [bridge (vals (:bridges network-state))]
  (check-no-bridge bridge)))
+
  ;; Create bridges/switch configs
  ;; TODO: should be done on-demand
- (P/all (for [bridge (-> network-state :bridges keys)]
+ (P/all (for [bridge (vals (:bridges network-state))]
  (bridge-create bridge)))
  
  ;; Create tunnels configs