conlink 2.0.3 → 2.2.0

@@ -13,5 +13,12 @@ jobs:
13
13
  - name: Checkout
14
14
  uses: actions/checkout@v3
15
15
 
16
- - name: stub step
17
- run: "echo stub stub"
16
+ - name: npm install
17
+ run: npm install
18
+
19
+ - name: compose build of conlink
20
+ run: docker compose -f examples/test1-compose.yaml build
21
+
22
+ - name: "./run-tests.sh"
23
+ timeout-minutes: 5
24
+ run: time ./run-tests.sh
package/Dockerfile CHANGED
@@ -24,11 +24,11 @@ FROM node:16-slim as run
24
24
  RUN apt-get -y update
25
25
  # Runtime deps and utilities
26
26
  RUN apt-get -y install libpcap-dev tcpdump iproute2 iputils-ping curl \
27
- iptables bridge-utils \
27
+ iptables bridge-utils ethtool \
28
28
  openvswitch-switch openvswitch-testcontroller
29
29
 
30
30
  COPY --from=build /app/ /app/
31
- ADD link-add.sh link-del.sh /app/
31
+ ADD link-add.sh link-del.sh link-mirred.sh link-forward.sh /app/
32
32
  ADD schema.yaml /app/build/
33
33
 
34
34
  ENV PATH /app:$PATH
package/README.md CHANGED
@@ -13,6 +13,8 @@ General:
13
13
  Other:
14
14
  * For Open vSwitch (OVS) bridging, the `openvswitch` kernel module
15
15
  must be loaded on the host system (where docker engine is running).
16
+ * For patch connections (`bridge: patch`), the kernel must support
17
+ tc qdisc mirred filtering via the `act_mirred` kernel module.
16
18
  * For podman usage (e.g. second part of `test3`), podman is required.
17
19
  * For remote connections/links (e.g. `test5`), the `geneve` (and/or
18
20
  `vxlan`) kernel module must be loaded on the host system (where
@@ -44,17 +46,28 @@ will also be required for the conlink container. In particular, if the
44
46
  container uses systemd, then it will likely use `SYS_NICE` and
45
47
  `NET_BROADCAST` and conlink will likewise need those capabilities.
46
48
 
47
- ### Bridging: Open vSwtich/OVS or Linux bridge
48
-
49
- Conlink creates bridges/switches and connects veth container links to
50
- those bridges (specified by `bridge:` in the link specification).
51
- By default, conlink will attempt to create Open vSwitch/OVS bridges
52
- for these connections, however, if the kernel does not provide support
53
- (`openvswitch` kernel module loaded), then conlink will fallback to
54
- using standard Linux bridges. The fallback behavior can be changed by
55
- setting the `--bridge-mode` option to either "ovs" or "linux". If the
56
- bridge mode is set to "ovs" then conlink will fail to start if the
57
- `openvswitch` kernel module is not detected.
49
+ ### Bridging: Open vSwitch/OVS, Linux bridge, and patch
50
+
51
+ Conlink connects container veth links together via a bridge or via a
52
+ direct patch. All veth type links must have a `bridge` property that
53
+ defines which links will be connected together (i.e. the same
54
+ broadcast domain). The default bridge mode is defined by the
55
+ `--default-bridge-mode` parameter and defaults to "auto". If a bridge
56
+ is set to mode "auto" then conlink will check if the kernel has the
57
+ `openvswitch` kernel module loaded and if so it will create an Open
58
+ vSwitch/OVS bridge/switch for that bridge, otherwise it will create a
59
+ regular Linux bridge (e.g. brctl). If any bridges are explicitly
60
+ defined with an "ovs" mode and the kernel does not have support then
61
+ conlink will stop/error on startup.
62
+
63
+ The "patch" mode will connect two links together using tc qdisc
64
+ ingress filters. This type of connection is equivalent to a patch panel
65
+ ("bump-in-the-wire") connection and all traffic will be passed between
66
+ the two links unchanged (unlike Linux and OVS bridges which typically
67
+ block certain bridge control broadcast traffic). The primary downside
68
+ of "patch" connections is that they are limited to two links whereas "ovs"
69
+ and "linux" bridge modes can support many links connected into the
70
+ same bridge (broadcast domain).
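
The mode selection described in the paragraphs above can be sketched in bash. This is only an illustration of the stated rules, not conlink's implementation: `resolve_bridge_mode` is a hypothetical helper, and whether the `openvswitch` module is loaded is passed in as an argument rather than detected.

```shell
#!/bin/bash
# Hypothetical helper (not part of conlink): resolve the effective mode
# for one bridge from its requested mode and OVS availability.
#   $1 - requested mode: auto, ovs, linux, or patch
#   $2 - "yes" if the openvswitch kernel module is loaded, else "no"
resolve_bridge_mode() {
    local mode=$1 ovs_loaded=$2
    case "${mode}" in
        auto)
            # "auto" prefers OVS and falls back to a Linux bridge
            if [ "${ovs_loaded}" = "yes" ]; then
                echo "ovs"
            else
                echo "linux"
            fi
            ;;
        ovs)
            # An explicit "ovs" request is an error without kernel support
            if [ "${ovs_loaded}" = "yes" ]; then
                echo "ovs"
            else
                echo >&2 "ERROR: mode 'ovs' needs the openvswitch kernel module"
                return 1
            fi
            ;;
        linux|patch)
            # These modes do not depend on OVS
            echo "${mode}"
            ;;
        *)
            echo >&2 "ERROR: unknown bridge mode '${mode}'"
            return 1
            ;;
    esac
}

resolve_bridge_mode auto yes
resolve_bridge_mode auto no
resolve_bridge_mode patch no
```

Only an explicit "ovs" request fails when the module is missing; "auto" falls back to "linux" instead.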
58
71
 
59
72
  ## Network Configuration Syntax
60
73
 
@@ -79,28 +92,31 @@ interfaces in the host.
79
92
 
80
93
  The following table describes the link properties:
81
94
 
82
- | property | link types | format | default | description |
83
- |-----------|------------|------------|---------|--------------------------|
84
- | type | * | string 1 | veth | link/interface type |
85
- | service | * | string | 2 | compose service |
86
- | container | * | string | | container name |
87
- | bridge | veth | string | | conlink bridge / domain |
88
- | outer-dev | not dummy | string[15] | | conlink/host intf name |
89
- | dev | * | string[15] | eth0 | container intf name |
90
- | ip | * | CIDR | | IP CIDR (index offset) |
91
- | mac | 3 | MAC | | MAC addr (index offset) |
92
- | mtu | * | number 4 | 9000 | intf MTU |
93
- | route | * | string | | ip route add args |
94
- | nat | * | IP | | DNAT/SNAT to IP |
95
- | netem | * | string | | tc qdisc NetEm options |
96
- | mode | 5 | string | | virt intf mode |
97
- | vlanid | vlan | number | | VLAN ID |
95
+ | property | link types | format | default | description |
96
+ |-----------|------------|----------------|---------|--------------------------|
97
+ | type | * | string 1 | veth | link/interface type |
98
+ | service | * | string | 2 | compose service |
99
+ | container | * | string | | container name |
100
+ | bridge | veth | string | | conlink bridge / domain |
101
+ | outer-dev | not dummy | string[15] | | conlink/host intf name |
102
+ | dev | * | string[15] | eth0 | container intf name |
103
+ | ip | * | CIDR | | IP CIDR 7 |
104
+ | mac | 3 | MAC | | MAC addr 7 |
105
+ | mtu | * | number 4 | 65535 | intf MTU |
106
+ | route | * | string | | ip route add args |
107
+ | nat | * | IP | | DNAT/SNAT to IP |
108
+ | netem | * | string | | tc qdisc NetEm options |
109
+ | mode | 5 | string | | virt intf mode |
110
+ | vlanid | vlan | number | | VLAN ID |
111
+ | forward | veth | string array 6 | | forward conlink ports 7 |
98
112
 
99
113
  - 1 - veth, dummy, vlan, ipvlan, macvlan, ipvtap, macvtap
100
114
  - 2 - defaults to outer compose service
101
115
  - 3 - not ipvlan/ipvtap
102
116
  - 4 - max MTU of parent device for \*vlan, \*vtap types
103
117
  - 5 - macvlan, macvtap, ipvlan, ipvtap
118
+ - 6 - string syntax: `conlink_port:container_port/proto`
119
+ - 7 - offset by scale/replica index
104
120
 
105
121
  Each link has a 'type' key that defaults to "veth" and each link
106
122
  definition must also have either a `service` key or a `container` key.
@@ -122,6 +138,33 @@ than the MTU of the parent (outer-dev) device.
122
138
  For the `netem` property, refer to the `netem` man page. The `OPTIONS`
123
139
  grammar defines the valid strings for the `netem` property.
124
140
 
141
+ The `forward` property is an array of strings that defines ports to
142
+ forward from the conlink container into the container over this link.
143
+ Traffic arriving on the conlink container's docker interface of type
144
+ `proto` and destined for port `conlink_port` is forwarded over this
145
+ link to the container IP and port `container_port` (`ip` is required).
146
+ The initial port (`conlink_port`) is offset by the service
147
+ replica/scale number (minus 1). So if the first replica has port 80
148
+ forwarded then the second replica will have port 81 forwarded.
149
+ For publicly publishing a port, the conlink container needs to be on
150
+ a docker network and the `conlink_port` should match the target port
151
+ of a docker published port (for the conlink container).
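
The replica offset described above can be worked through with a small bash sketch. `forward_for_replica` is a hypothetical helper written for this illustration, not conlink's code; it assumes the `conlink_port:container_port/proto` string syntax and a replica index that starts at 1.

```shell
#!/bin/bash
# Hypothetical helper (not conlink's code): compute the forward string
# that applies to one replica of a scaled service.
#   $1 - forward spec, e.g. "80:80/tcp"
#   $2 - replica/scale index, starting at 1
forward_for_replica() {
    local spec=$1 index=$2
    local conlink_port container_port proto
    # Split on ":" and "/" into the three fields
    IFS=':/' read -r conlink_port container_port proto <<< "${spec}"
    # Only the conlink port is offset by (index - 1)
    echo "$(( conlink_port + index - 1 )):${container_port}/${proto}"
}

forward_for_replica "80:80/tcp" 1   # first replica
forward_for_replica "80:80/tcp" 2   # second replica
```

The container port stays fixed because every replica listens on the same internal port; only the port on the conlink side has to be unique per replica.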
152
+
153
+ ### Bridges
154
+
155
+ The bridge settings currently only support the "mode" setting. If
156
+ the mode is not specified in this section or the section is omitted
157
+ entirely, then bridges specified in the links configuration will
158
+ default to the value of the `--default-bridge-mode` parameter (which
159
+ itself defaults to "auto").
160
+
161
+ The following table describes the bridge properties:
162
+
163
+ | property | format | description |
164
+ |-----------|---------|--------------------------------|
165
+ | bridge | string | conlink bridge / domain name |
166
+ | mode | string | auto, ovs, linux, or patch |
167
+
125
168
  ### Tunnels
126
169
 
127
170
  Tunnels links/interfaces will be created and attached to the specified
@@ -421,8 +464,10 @@ Start the test7 compose configuration:
421
464
  docker-compose -f examples/test7-compose.yaml up --build --force-recreate
422
465
  ```
423
466
 
424
- Show the links in both node containers to see that the MAC addresses
425
- are `00:0a:0b:0c:0d:0*` and the MTUs are set to `4111`.
467
+ Show the links in both node containers to see that on the eth0
468
+ interfaces the MAC addresses are `00:0a:0b:0c:0d:0*` and the MTUs are
469
+ set to `4111`. The eth1 interfaces should have the default MTU of
470
+ `5111` that is set on the command line.
426
471
 
427
472
  ```
428
473
  docker-compose -f examples/test7-compose.yaml exec --index 1 node ip link
@@ -476,6 +521,67 @@ Note: to connect to the vlan node (NODE2_HOST_ADDRESS) you will need
476
521
  to configure your physical switch/router with routing/connectivity to
477
522
  VLAN 5 on the same physical link to your host.
478
523
 
524
+ ### test9: bridge modes
525
+
526
+ This example demonstrates the supported bridge modes.
527
+
528
+ Start the test9 compose configuration using different bridge modes and
529
+ validate connectivity using ping:
530
+
531
+ ```
532
+ export BRIDGE_MODE="linux" # "ovs", "patch", "auto"
533
+ docker-compose -f examples/test9-compose.yaml up --build --force-recreate
534
+ docker-compose -f examples/test9-compose.yaml exec node ping 10.0.1.2
535
+ ```
536
+
537
+ ### test10: port forwarding
538
+
539
+ This example demonstrates port forwarding from the conlink container
540
+ to two containers running simple web servers.
541
+
542
+ Start the test10 compose configuration:
543
+
544
+ ```
545
+ docker-compose -f examples/test10-compose.yaml up --build --force-recreate
546
+ ```
547
+
548
+ Ports 3080 and 8080 are both published on the host by the conlink
549
+ container using standard Docker port mapping. The internal mapping of
550
+ those ports (1080 and 1180 respectively) are both forwarded to
551
+ port 80 in the node1 container using conlink's port forwarding
552
+ mechanism. The two paths look like this:
553
+
554
+ ```
555
+ host:3080 --> 1080 (in conlink) --> node1:80
556
+ host:8080 --> 1180 (in conlink) --> node1:80
557
+ ```
558
+
559
+ Use curl on the host to query both of these paths to node1:
560
+
561
+ ```
562
+ curl 0.0.0.0:3080
563
+ curl 0.0.0.0:8080
564
+ ```
565
+
566
+ Ports 80 and 81 are published on the host by the conlink container
567
+ using standard Docker port mapping. Then conlink forwards from ports
568
+ 80 and 81 to the first and second replica (respectively) of node2,
569
+ each of which listens internally on port 80. The two paths look like
570
+ this:
571
+
572
+ ```
573
+ host:80 -> 80 (in conlink) -> node2_1:80
574
+ host:81 -> 81 (in conlink) -> node2_2:80
575
+ ```
576
+
577
+ Use curl on the host to query both replicas of node2:
578
+
579
+ ```
580
+ curl 0.0.0.0:80
581
+ curl 0.0.0.0:81
582
+ ```
583
+
584
+
479
585
  ## GraphViz network configuration rendering
480
586
 
481
587
  You can use d3 and GraphViz to create a visual graph rendering of
@@ -0,0 +1,38 @@
1
+ version: "2.4"
2
+
3
+ services:
4
+ node1:
5
+ image: python:3-alpine
6
+ network_mode: none
7
+ command: "python3 -m http.server -d /var 80"
8
+ x-network:
9
+ links:
10
+ - {bridge: s2, ip: "10.0.1.1/24", route: "default",
11
+ forward: ["1080:80/tcp", "1180:80/tcp"]}
12
+
13
+ node2:
14
+ image: python:3-alpine
15
+ network_mode: none
16
+ scale: 2
17
+ command: "python3 -m http.server -d /usr 80"
18
+ x-network:
19
+ links:
20
+ - {bridge: s1, ip: "10.0.2.1/24", route: "default",
21
+ forward: ["80:80/tcp"]}
22
+
23
+ network:
24
+ build: {context: ../}
25
+ image: conlink
26
+ pid: host
27
+ cap_add: [SYS_ADMIN, NET_ADMIN, SYS_NICE, NET_BROADCAST, IPC_LOCK]
28
+ security_opt: [ 'apparmor:unconfined' ] # needed on Ubuntu 18.04
29
+ volumes:
30
+ - /var/run/docker.sock:/var/run/docker.sock
31
+ - /var/lib/docker:/var/lib/docker
32
+ - ../:/test
33
+ ports:
34
+ - "3080:1080/tcp"
35
+ - "8080:1180/tcp"
36
+ - "80:80/tcp"
37
+ - "81:81/tcp"
38
+ command: /app/build/conlink.js --compose-file /test/examples/test10-compose.yaml
@@ -1,5 +1,10 @@
1
1
  version: "2.4"
2
2
 
3
+ services:
4
+ r0:
5
+ volumes:
6
+ - ./scripts:/scripts
7
+
3
8
  x-network:
4
9
  commands:
5
10
  - {service: r0, command: "python3 -m http.server 80"}
@@ -15,7 +15,7 @@ services:
15
15
  - /var/run/docker.sock:/var/run/docker.sock
16
16
  - /var/lib/docker:/var/lib/docker
17
17
  - ./:/test
18
- command: /app/build/conlink.js --compose-file /test/test7-compose.yaml
18
+ command: /app/build/conlink.js --default-mtu 5111 --compose-file /test/test7-compose.yaml
19
19
 
20
20
  node:
21
21
  image: alpine
@@ -29,3 +29,6 @@ services:
29
29
  mac: 00:0a:0b:0c:0d:01
30
30
  mtu: 4111
31
31
  netem: "delay 40ms rate 10mbit"
32
+ - bridge: s2
33
+ ip: 100.0.1.1/16
34
+ dev: eth1
@@ -0,0 +1,32 @@
1
+ # This file demonstrates using different bridge modes and
2
+ # usage of the --default-bridge-mode parameter.
3
+
4
+ version: "2.4"
5
+
6
+ services:
7
+ network:
8
+ build: {context: ../}
9
+ image: conlink
10
+ pid: host
11
+ network_mode: none
12
+ cap_add: [SYS_ADMIN, NET_ADMIN, SYS_NICE, NET_BROADCAST, IPC_LOCK]
13
+ security_opt: [ 'apparmor:unconfined' ] # needed on Ubuntu 18.04
14
+ volumes:
15
+ - /var/run/docker.sock:/var/run/docker.sock
16
+ - /var/lib/docker:/var/lib/docker
17
+ - ./:/test
18
+ environment:
19
+ - BRIDGE_MODE
20
+ command: /app/build/conlink.js --default-bridge-mode linux --compose-file /test/test9-compose.yaml
21
+
22
+ node:
23
+ image: alpine
24
+ network_mode: none
25
+ scale: 2
26
+ command: sleep Infinity
27
+
28
+ x-network:
29
+ links:
30
+ - {bridge: s1, service: node, ip: 10.0.1.1/24}
31
+ bridges:
32
+ - {bridge: s1, mode: "${BRIDGE_MODE:-auto}"}
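
The `${BRIDGE_MODE:-auto}` expression in the `bridges` entry uses the same default-value form as the shell, so its behavior can be tried out in bash. The sketch below is illustrative only; `effective_mode` is a hypothetical function and nothing here invokes Compose itself.

```shell
#!/bin/bash
# Illustrative only: the ":-" form substitutes the default when the
# variable is unset or empty.
effective_mode() {
    echo "${BRIDGE_MODE:-auto}"
}

unset BRIDGE_MODE
effective_mode            # unset: falls back to the default

BRIDGE_MODE="patch"
effective_mode            # set: the exported value is used

BRIDGE_MODE=""
effective_mode            # empty: also falls back to the default
```

An empty `BRIDGE_MODE` behaves like an unset one, so the bridge falls back to "auto" in both cases.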
package/link-add.sh CHANGED
@@ -45,6 +45,8 @@ usage () {
45
45
  echo >&2 ""
46
46
  echo >&2 " --netem NETEM - tc qdisc netem OPTIONS (man 8 netem)"
47
47
  echo >&2 " --nat TARGET - Stateless NAT traffic to/from TARGET"
48
+ echo >&2 " (in primary/PID0 netns)"
49
+ echo >&2 ""
48
50
  exit 2
49
51
  }
50
52
 
@@ -0,0 +1,45 @@
1
+ #!/bin/bash
2
+
3
+ # Copyright (c) 2024, Equinix, Inc
4
+ # Licensed under MPL 2.0
5
+
6
+ set -e
7
+
8
+ usage () {
9
+ echo >&2 "${0} [OPTIONS] <add|del> INTF_A INTF_B PORT_A:IP:PORT_B/PROTO"
10
+ echo >&2 ""
11
+ echo >&2 "Match traffic on INTF_A that has destination port PORT_A and"
12
+ echo >&2 "protocol PROTO (tcp or udp). Forward/DNAT traffic to IP:PORT_B "
13
+ echo >&2 "via INTF_B."
14
+ exit 2
15
+ }
16
+
17
+ info() { echo "link-forward [${LOG_ID}] ${*}"; }
18
+
19
+ IPTABLES() {
20
+ case "${action}" in
21
+ add) iptables -C "${@}" 2>/dev/null || iptables -I "${@}" ;;
22
+ del) iptables -D "${@}" 2>/dev/null || true;;
23
+ esac
24
+ }
25
+
26
+ action=$1; shift || usage
27
+ intf_a=$1; shift || usage
28
+ intf_b=$1; shift || usage
29
+ spec=$1; shift || usage
30
+ read port_a ip port_b proto <<< "${spec//[:\/]/ }"
31
+
32
+ [ "${action}" -a "${intf_a}" -a "${intf_b}" ] || usage
33
+ [ "${port_a}" -a "${ip}" -a "${port_b}" -a "${proto}" ] || usage
34
+
35
+ LOG_ID="${spec}"
36
+
37
+ info "${action^} forwarding ${intf_a} -> ${intf_b}"
38
+
39
+ IPTABLES PREROUTING -t nat -i ${intf_a} -p ${proto} --dport ${port_a} -j DNAT --to-destination ${ip}:${port_b}
40
+ IPTABLES POSTROUTING -t nat -o ${intf_b} -j MASQUERADE
41
+
42
+ case "${action}" in
43
+ add) ip route replace ${ip} dev ${intf_b} ;;
44
+ del) ip route delete ${ip} dev ${intf_b} || true;;
45
+ esac
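
The spec parsing in the script above relies on replacing every ":" and "/" with a space before calling `read`. A standalone sketch of that technique follows; `parse_forward_spec` is a hypothetical wrapper and the sample spec is an assumed value, not one taken from a real run.

```shell
#!/bin/bash
# Illustrative only: split a PORT_A:IP:PORT_B/PROTO spec by replacing
# every ":" and "/" with a space, then reading the four fields.
parse_forward_spec() {
    local spec=$1 port_a ip port_b proto
    read -r port_a ip port_b proto <<< "${spec//[:\/]/ }"
    # All four fields are required; fail like a usage error otherwise
    [ "${port_a}" ] && [ "${ip}" ] && [ "${port_b}" ] && [ "${proto}" ] \
        || return 2
    echo "port_a=${port_a} ip=${ip} port_b=${port_b} proto=${proto}"
}

# Assumed example spec (IPv4 address, tcp)
parse_forward_spec "1080:10.0.1.1:80/tcp"
```

Because ":" is used as a field separator, this approach only works for IPv4 targets; an IPv6 address would be split into several fields.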
package/link-mirred.sh ADDED
@@ -0,0 +1,114 @@
1
+ #!/bin/bash
2
+
3
+ # Copyright (c) 2024, Equinix, Inc
4
+ # Licensed under MPL 2.0
5
+
6
+ set -e
7
+
8
+ usage () {
9
+ echo >&2 "${0} [OPTIONS] INTF0 INTF1"
10
+ echo >&2 ""
11
+ echo >&2 "Create traffic mirror/redirect between INTF0 and INTF1."
12
+ echo >&2 ""
13
+ echo >&2 "Positional arguments:"
14
+ echo >&2 " INTF0 is the first interface name"
15
+ echo >&2 " INTF1 is the second interface name"
16
+ echo >&2 ""
17
+ echo >&2 "INTF0 must exist, but if INTF1 is missing, then exit with 0."
18
+ echo >&2 "Each interface will be checked for correct ingress/mirred config"
19
+ echo >&2 "and configured if the configuration is missing."
20
+ echo >&2 "These two aspects make this script idempotent. It can be called"
21
+ echo >&2 "whenever either interface appears and when the second appears,"
22
+ echo >&2 "the mirror/redirect action will be setup fully/bidirectionally."
23
+ echo >&2 ""
24
+ echo >&2 "OPTIONS:"
25
+ echo >&2 " --verbose - Verbose output (set -x)"
26
+ exit 2
27
+ }
28
+
29
+ info() { echo "link-mirred [${LOG_ID}] ${*}"; }
30
+ warn() { >&2 echo "link-mirred [${LOG_ID}] ${*}"; }
31
+ die() { warn "ERROR: ${*}"; exit 1; }
32
+
33
+ # Idempotently add ingress qdisc to an interface
34
+ add_ingress() {
35
+ local IF=$1 res=
36
+
37
+ res=$(tc qdisc show dev ${IF} 2>&1)
38
+ case "${res}" in
39
+ *"qdisc ingress ffff:"*)
40
+ info "${IF} already has ingress qdisc"
41
+ ;;
42
+ ""|*"qdisc noqueue"*)
43
+ info "Adding ingress qdisc to ${IF}"
44
+ tc qdisc add dev "${IF}" ingress \
45
+ || die "Could not add ingress qdisc to ${IF}"
46
+ ;;
47
+ *)
48
+ die "${IF} has invalid ingress qdisc or could not be queried"
49
+ ;;
50
+ esac
51
+ }
52
+
53
+ # Idempotently add mirred filter redirect rule to an interface
54
+ add_mirred() {
55
+ local IF0=$1 IF1=$2 res=
56
+
57
+ res=$(tc filter show dev ${IF0} parent ffff: 2>&1)
58
+ case "${res}" in
59
+ *"action order 1: mirred (Egress Redirect to device ${IF1}"*)
60
+ info "${IF0} already has filter redirect action"
61
+ ;;
62
+ "")
63
+ info "Adding filter redirect action from ${IF0} to ${IF1}"
64
+ tc filter add dev ${IF0} parent ffff: protocol all u32 match u8 0 0 action \
65
+ mirred egress redirect dev ${IF1} \
66
+ || die "Could not add filter redirect action from ${IF0} to ${IF1}"
67
+ ;;
68
+ *)
69
+ die "${IF0} has invalid filter redirect action or could not be queried"
70
+ ;;
71
+ esac
72
+ }
73
+
74
+ # Parse arguments
75
+ VERBOSE=${VERBOSE:-}
76
+ positional=
77
+ while [ "${*}" ]; do
78
+ param=$1; OPTARG=$2
79
+ case ${param} in
80
+ --verbose) VERBOSE=1 ;;
81
+ -h|--help) usage ;;
82
+ *) positional="${positional} $1" ;;
83
+ esac
84
+ shift
85
+ done
86
+ set -- ${positional}
87
+ IF0=$1 IF1=$2
88
+
89
+ [ "${VERBOSE}" ] && set -x || true
90
+
91
+ # Sanity check arguments
92
+ [ "${IF0}" -a "${IF1}" ] || usage
93
+
94
+ LOG_ID="mirred ${IF0}:${IF1}"
95
+
96
+ # Sanity checks
97
+ if ! ip link show ${IF0} >/dev/null; then
98
+ die "${IF0} does not exist"
99
+ fi
100
+ if ! ip link show ${IF1} >/dev/null; then
101
+ info "${IF1} missing, exiting"
102
+ exit 0
103
+ fi
104
+
105
+ ### Do the work
106
+
107
+ info "Creating filter redirection action between ${IF0} and ${IF1}"
108
+
109
+ add_ingress ${IF0}
110
+ add_ingress ${IF1}
111
+ add_mirred ${IF0} ${IF1}
112
+ add_mirred ${IF1} ${IF0}
113
+
114
+ info "Created filter redirection action between ${IF0} and ${IF1}"
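
The idempotency of `add_ingress` comes from classifying the current `tc qdisc show` output before acting. The sketch below isolates that classification so it can be run without `tc`; `classify_qdisc` is a hypothetical function and the sample strings are assumed examples of tc output rather than captures from a real system.

```shell
#!/bin/bash
# Illustrative only: classify text in the style of `tc qdisc show dev IF`
# using the same case patterns as add_ingress.
classify_qdisc() {
    case "$1" in
        *"qdisc ingress ffff:"*) echo "present" ;;   # nothing to do
        ""|*"qdisc noqueue"*)    echo "missing" ;;   # safe to add ingress
        *)                       echo "invalid" ;;   # unexpected state
    esac
}

# Assumed sample outputs
classify_qdisc "qdisc noqueue 0: root refcnt 2"
classify_qdisc "qdisc noqueue 0: root refcnt 2
qdisc ingress ffff: parent ffff:fff1"
classify_qdisc "qdisc fq_codel 0: root refcnt 2"
```

The ingress pattern is checked first, so an interface that shows both a noqueue root qdisc and an ingress qdisc is treated as already configured. Any other root qdisc falls through to the error branch.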
package/mdc CHANGED
@@ -17,6 +17,7 @@ LS=$(which ls)
17
17
  RESOLVE_DEPS="${RESOLVE_DEPS-./node_modules/@lonocloud/resolve-deps/resolve-deps.py}"
18
18
  DOCKER_COMPOSE="${DOCKER_COMPOSE:-docker-compose}"
19
19
 
20
+ [ -f "${RESOLVE_DEPS}" ] || die "Missing ${RESOLVE_DEPS}. Perhaps 'npm install'?"
20
21
  MODE_SPEC="${1}"; shift
21
22
  if [ "${RESOLVE_DEPS}" ]; then
22
23
  MODES="$(${RESOLVE_DEPS} "${MODES_DIR}" ${MODE_SPEC})"
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "conlink",
3
- "version": "2.0.3",
3
+ "version": "2.2.0",
4
4
  "description": "conlink - Declarative Low-Level Networking for Containers",
5
5
  "repository": "https://github.com/LonoCloud/conlink",
6
6
  "license": "SEE LICENSE IN LICENSE",