conlink 2.1.0 → 2.4.0

@@ -11,7 +11,7 @@ jobs:
11
11
  runs-on: ubuntu-latest
12
12
  steps:
13
13
  - name: Checkout
14
- uses: actions/checkout@v3
14
+ uses: actions/checkout@v4
15
15
 
16
16
  - name: npm install
17
17
  run: npm install
@@ -19,6 +19,6 @@ jobs:
19
19
  - name: compose build of conlink
20
20
  run: docker compose -f examples/test1-compose.yaml build
21
21
 
22
- - name: "./run-tests.sh"
22
+ - name: "dctest test/test*.yaml"
23
23
  timeout-minutes: 5
24
- run: time ./run-tests.sh
24
+ run: time node_modules/.bin/dctest --verbose-commands conlink-test $(ls -v test/test*.yaml)
package/Dockerfile CHANGED
@@ -24,11 +24,11 @@ FROM node:16-slim as run
24
24
  RUN apt-get -y update
25
25
  # Runtime deps and utilities
26
26
  RUN apt-get -y install libpcap-dev tcpdump iproute2 iputils-ping curl \
27
- iptables bridge-utils ethtool \
27
+ iptables bridge-utils ethtool jq \
28
28
  openvswitch-switch openvswitch-testcontroller
29
29
 
30
30
  COPY --from=build /app/ /app/
31
- ADD link-add.sh link-del.sh link-mirred.sh /app/
31
+ ADD link-add.sh link-del.sh link-mirred.sh link-forward.sh /app/
32
32
  ADD schema.yaml /app/build/
33
33
 
34
34
  ENV PATH /app:$PATH
package/README.md CHANGED
@@ -1,9 +1,46 @@
1
1
  # conlink: Declarative Low-Level Networking for Containers
2
2
 
3
-
4
3
  Create (layer 2 and layer 3) networking between containers using
5
4
  a declarative configuration.
6
5
 
6
+ Conlink also includes scripts that make docker compose a much more
7
+ powerful development and testing environment (refer to
8
+ [Compose scripts](#compose-scripts-mdc-waitsh-and-copysh) for
9
+ details):
10
+
11
+ * [mdc](#mdc): modular management of multiple compose configurations
12
+ * [wait.sh](#waitsh): wait for network and file conditions before continuing
13
+ * [copy.sh](#copysh): recursively copy files with variable templating
14
+
15
+ ## Why conlink?
16
+
17
+ There are a number of limitations of docker-compose networking that
18
+ conlink addresses:
19
+
20
+ * Operates at layer 3 of the network stack (i.e. IP).
21
+ * Has a fixed model of container interface naming: first interface is
22
+ `eth0`, second is `eth1`, etc.
23
+ * For containers with multiple interfaces, the mapping between docker
24
+ compose networks and container interface is not controllable
25
+ (https://github.com/docker/compose/issues/4645#issuecomment-701351876,
26
+ https://github.com/docker/compose/issues/8561#issuecomment-1872510968)
27
+ * If a container uses the scale property, then IPs cannot be
28
+ assigned and user assigned MAC addresses will be the same for every
29
+ instance of that service.
30
+ * Docker bridge networking interferes with switch and bridge protocol
31
+ traffic (BPDU, STP, LLDP, etc). Conlink supports the "patch" link
32
+ mode that allows this type of traffic to pass correctly.
33
+
34
+ Conlink has the following features:
35
+
36
+ - Declarative network configuration (links, bridges, patches, etc)
37
+ - Event driven (container restarts and scale changes)
38
+ - Low-level control of network interfaces/links: MTU, routes, port
39
+ forwarding, netem properties, etc
40
+ - Automatic IP and MAC address incrementing for scaled containers
41
+ - Central network container for easy monitoring and debug
42
+ - Composable configuration from multiple sources/locations
43
+
7
44
  ## Prerequisites
8
45
 
9
46
  General:
@@ -74,12 +111,12 @@ same bridge (broadcast domain).
74
111
  Network configuration can either be loaded directly from configuration
75
112
  files using the `--network-config` option or it can be loaded from
76
113
  `x-network` properties contained in docker-compose files using the
77
- `--compose-file`. Multiple of each option may be specified and all the
78
- network configuration will be merged into a final network
79
- configuration.
114
+ `--compose-file` option. Multiple of each option may be specified and
115
+ all the network configuration will be merged into a final network
116
+ configuration. Both options also support colon separated lists.
80
117
 
81
- The network configuration can have three top level keys: `links`,
82
- `tunnels`, and `commands`.
118
+ The network configuration can have four top level keys: `links`,
119
+ `bridges`, `tunnels`, and `commands`.
83
120
 
84
121
  ### Links
85
122
 
@@ -92,28 +129,32 @@ interfaces in the host.
92
129
 
93
130
  The following table describes the link properties:
94
131
 
95
- | property | link types | format | default | description |
96
- |-----------|------------|------------|---------|--------------------------|
97
- | type | * | string 1 | veth | link/interface type |
98
- | service | * | string | 2 | compose service |
99
- | container | * | string | | container name |
100
- | bridge | veth | string | | conlink bridge / domain |
101
- | outer-dev | not dummy | string[15] | | conlink/host intf name |
102
- | dev | * | string[15] | eth0 | container intf name |
103
- | ip | * | CIDR | | IP CIDR (index offset) |
104
- | mac | 3 | MAC | | MAC addr (index offset) |
105
- | mtu | * | number 4 | 9000 | intf MTU |
106
- | route | * | string | | ip route add args |
107
- | nat | * | IP | | DNAT/SNAT to IP |
108
- | netem | * | string | | tc qdisc NetEm options |
109
- | mode | 5 | string | | virt intf mode |
110
- | vlanid | vlan | number | | VLAN ID |
132
+ | property | link types | format | default | description |
133
+ |-----------|------------|----------------|---------|--------------------------|
134
+ | type | * | string 1 | veth | link/interface type |
135
+ | service | * | string | 2 | compose service |
136
+ | container | * | string | | container name |
137
+ | bridge | veth | string | | conlink bridge / domain |
138
+ | outer-dev | not dummy | string[15] | | conlink/host intf name |
139
+ | dev | * | string[15] | eth0 | container intf name |
140
+ | ip | * | CIDR | | IP CIDR 7 |
141
+ | mac | 3 | MAC | | MAC addr 7 |
142
+ | mtu | * | number 4 | 65535 | intf MTU |
143
+ | route | * | strings 8 | | ip route add args |
144
+ | nat | * | IP | | DNAT/SNAT to IP |
145
+ | netem | * | strings 8 | | tc qdisc NetEm options |
146
+ | mode | 5 | string | | virt intf mode |
147
+ | vlanid | vlan | number | | VLAN ID |
148
+ | forward | veth | strings 6 8 | | forward conlink ports 7 |
111
149
 
112
150
  - 1 - veth, dummy, vlan, ipvlan, macvlan, ipvtap, macvtap
113
151
  - 2 - defaults to outer compose service
114
152
  - 3 - not ipvlan/ipvtap
115
153
  - 4 - max MTU of parent device for \*vlan, \*vtap types
116
154
  - 5 - macvlan, macvtap, ipvlan, ipvtap
155
+ - 6 - string syntax: `conlink_port:container_port/proto`
156
+ - 7 - offset by scale/replica index
157
+ - 8 - either a single string or an array of strings
117
158
 
118
159
  Each link has a 'type' key that defaults to "veth" and each link
119
160
  definition must also have either a `service` key or a `container` key.
@@ -135,6 +176,18 @@ than the MTU of the parent (outer-dev) device.
135
176
  For the `netem` property, refer to the `netem` man page. The `OPTIONS`
136
177
  grammar defines the valid strings for the `netem` property.
137
178
 
179
+ The `forward` property is a string or an array of strings that defines ports to
180
+ forward from the conlink container into the container over this link.
181
+ Traffic arriving on the conlink container's docker interface of type
182
+ `proto` and destined for port `conlink_port` is forwarded over this
183
+ link to the container IP and port `container_port` (`ip` is required).
184
+ The initial port (`conlink_port`) is offset by the service
185
+ replica/scale number (minus 1). So if the first replica has port 80
186
+ forwarded then the second replica will have port 81 forwarded.
187
+ For publicly publishing a port, the conlink container needs to be on
188
+ a docker network and the `conlink_port` should match the target port
189
+ of a docker published port (for the conlink container).
190
+
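The replica offset can be illustrated with a small shell sketch. This is not conlink's own implementation: the helper name `forward_port_for_replica` is hypothetical, and the sketch only shows how the `conlink_port:container_port/proto` syntax and the "replica number minus 1" rule combine.

```shell
# Hypothetical helper (not part of conlink): given a forward spec of the form
# "conlink_port:container_port/proto" and a replica number, print the spec
# with the conlink-side port offset by (replica - 1).
forward_port_for_replica() {
    spec=$1
    replica=$2
    conlink_port=${spec%%:*}        # text before the first ':'
    rest=${spec#*:}                 # "container_port/proto"
    container_port=${rest%%/*}      # text before the '/'
    proto=${rest#*/}                # text after the '/'
    echo "$((conlink_port + replica - 1)):${container_port}/${proto}"
}

forward_port_for_replica "80:80/tcp" 1   # prints 80:80/tcp
forward_port_for_replica "80:80/tcp" 2   # prints 81:80/tcp
```

With `scale: 2` and `forward: "80:80/tcp"`, the first replica is reached through conlink port 80 and the second through port 81, which matches the description above.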
138
191
  ### Bridges
139
192
 
140
193
  The bridge settings currently only support the "mode" setting. If
@@ -449,8 +502,10 @@ Start the test7 compose configuration:
449
502
  docker-compose -f examples/test7-compose.yaml up --build --force-recreate
450
503
  ```
451
504
 
452
- Show the links in both node containers to see that the MAC addresses
453
- are `00:0a:0b:0c:0d:0*` and the MTUs are set to `4111`.
505
+ Show the links in both node containers to see that on the eth0
506
+ interfaces the MAC addresses are `00:0a:0b:0c:0d:0*` and the MTUs are
507
+ set to `4111`. The eth1 interfaces should have the command line set
508
+ default MTU of `5111`.
454
509
 
455
510
  ```
456
511
  docker-compose -f examples/test7-compose.yaml exec --index 1 node ip link
@@ -514,7 +569,205 @@ validate connectivity using ping:
514
569
  ```
515
570
  export BRIDGE_MODE="linux" # "ovs", "patch", "auto"
516
571
  docker-compose -f examples/test9-compose.yaml up --build --force-recreate
517
- docker-compose -f examples/test9-compose.yaml exec node ping 10.0.1.2
572
+ docker-compose -f examples/test9-compose.yaml exec node1 ping 10.0.1.2
573
+ ```
574
+
575
+ ### test10: port forwarding and routing
576
+
577
+ This example demonstrates port forwarding from the conlink container
578
+ to two containers running simple web servers. It also demonstrates the
579
+ use of a router container and multiple route rules in the other
580
+ containers.
581
+
582
+ Start the test10 compose configuration:
583
+
584
+ ```
585
+ docker-compose -f examples/test10-compose.yaml up --build --force-recreate
586
+ ```
587
+
588
+ Ports 3080 and 8080 are both published on the host by the conlink
589
+ container using standard Docker port mapping. The internal mappings of
589
+ those ports (1080 and 1180 respectively) are both forwarded to
591
+ port 80 in the node1 container using conlink's port forwarding
592
+ mechanism. The two paths look like this:
593
+
594
+ ```
595
+ host:3080 --> 1080 (in conlink) --> node1:80
596
+ host:8080 --> 1180 (in conlink) --> node1:80
597
+ ```
598
+
599
+ Use curl on the host to query both of these paths to node1:
600
+
601
+ ```
602
+ curl 0.0.0.0:3080
603
+ curl 0.0.0.0:8080
604
+ ```
605
+
606
+ Ports 80 and 81 are published on the host by the conlink container
607
+ using standard Docker port mapping. Then conlink forwards from ports
608
+ 80 and 81 to the first and second replica (respectively) of node2,
609
+ each of which listens internally on port 80. The two paths look like
610
+ this:
611
+
612
+ ```
613
+ host:80 -> 80 (in conlink) -> node2_1:80
614
+ host:81 -> 81 (in conlink) -> node2_2:80
615
+ ```
616
+
617
+ Use curl on the host to query both replicas of node2:
618
+
619
+ ```
620
+ curl 0.0.0.0:80
621
+ curl 0.0.0.0:81
622
+ ```
623
+
624
+ Start two tcpdump processes in the conlink container to watch
625
+ routed ICMP traffic and then ping between containers across the router
626
+ container:
627
+
628
+ ```
629
+ docker compose -f examples/test10-compose.yaml exec network tcpdump -nli router_1-es1 icmp
630
+ docker compose -f examples/test10-compose.yaml exec network tcpdump -nli router_1-es2 icmp
631
+ ```
632
+
633
+ ```
634
+ docker-compose -f examples/test10-compose.yaml exec node1 ping 10.2.0.1
635
+ docker-compose -f examples/test10-compose.yaml exec node1 ping 10.2.0.2
636
+ docker-compose -f examples/test10-compose.yaml exec node2 ping 10.1.0.1
637
+
638
+ ```
639
+
640
+ ## Compose scripts: mdc, wait.sh, and copy.sh
641
+
642
+ ### mdc
643
+
644
+ The `mdc` command adds flexibility and power to the builtin overlay
645
+ capability of docker compose. Docker compose can specify multiple
646
+ compose files that will be combined into a single configuration.
647
+ Compose files that are specified later will overlay or override
648
+ earlier compose files. For example, if compose files A and B are
649
+ loaded by docker compose, then the `image` property of a service in
650
+ file B will take precedence over (or override) the `image` property for the
651
+ same service in file A. Some properties such as `volumes` and
652
+ `environment` will have the sub-properties merged or appended to.
653
+
654
+ There are several ways that `mdc` adds to the composition capabilities
655
+ of docker compose:
656
+ 1. **Mode/module dependency resolution**. The modes or modules that
657
+ are combined by `mdc` are defined as directories that contain
658
+ mode/module specific content. A `deps` file in a mode/module
659
+ directory is used to specify dependencies on other modes/modules.
660
+ The syntax and resolution algorithm is defined by the
661
+ [resolve-deps](https://github.com/Viasat/resolve-deps) project.
662
+ 2. **Environment variable file combining/overlaying**. Each `.env`
663
+ file that appears in a mode/module directory will be appended into
664
+ a single `.env` file at the top-level where the `mdc` command is
665
+ invoked. Later environment variables will override earlier ones
666
+ with the same name. Variable interpolation and some shell-style
667
+ variable expansion can be used to combine/append environment
668
+ variables. For example if FOO and BAR are defined in an earlier
669
+ mode/module, then BAZ could be defined like this:
670
+ `BAZ="${FOO:-${BAR}-SUFF}"` which will set BAZ to FOO if FOO is set,
671
+ otherwise, it will set BAZ to BAR with a "-SUFF" suffix.
672
+ 3. **Directory hierarchy combining/overlaying**. If the mode/module
673
+ directory has subdirectories that themselves contain a "files/"
674
+ sub-directory, then the mode subdirectories will be recursively
675
+ copied into the top-level ".files/" directory. For example,
676
+ suppose the following files exist under the modes "foo" and
677
+ "bar" (with a dependency of "bar" on "foo"):
678
+ `foo/svc1/files/etc/conf1`, `foo/svc2/files/etc/conf2`, and
679
+ `bar/svc1/files/etc/conf1`. When `mdc` is run this will result in
680
+ the following two files: `.files/svc1/etc/conf1` and
681
+ `.files/svc2/etc/conf2`. The content of `conf1` will come from the
682
+ "bar" mode because it is resolved second. The use of the `copy.sh`
683
+ script (described below) simplifies recursive file copying and also
684
+ provides variable templating of copied files.
685
+ 4. **Set environment variables based on the selected modes/modules**.
686
+ When `mdc` is run it will set the following special environment
687
+ variables in the top-level `.env` file:
688
+ * `COMPOSE_FILE`: A colon separated and dependency ordered list of
689
+ compose file paths from each resolved mode/module directory.
690
+ * `COMPOSE_DIR`: The directory where the top-level `.env` is
691
+ created.
692
+ * `COMPOSE_PROFILES`: A comma separated list of each resolved
693
+ mode/module with a `MODE_` prefix on the name. These are docker
694
+ compose profiles that can be used to enable services in one
695
+ mode/module compose file when a different mode/module is
696
+ selected/resolved by `mdc`. For example, if a compose file in
697
+ "bar" has a service that should only be enabled when the "foo"
698
+ mode/module is also requested/resolved, then the service can be
699
+ tagged with the `MODE_foo` profile.
700
+ * `MDC_MODE_DIRS`: A comma separated list of mode/module
701
+ directories. This can be used by other external tools that have
702
+ specific mode/module behavior.
703
+
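The fallback expansion used in the `BAZ` example under item 2 can be tried in a plain shell. This sketch assumes that the `.env` interpolation behaves like shell parameter expansion for this case; the function name `baz_value` is only for illustration.

```shell
# Illustrative only: evaluate the same expansion as the BAZ example,
# using ordinary shell parameter expansion.
baz_value() {
    echo "${FOO:-${BAR}-SUFF}"
}

BAR=bar
unset FOO
baz_value        # prints bar-SUFF (FOO is unset, so BAR plus the suffix is used)

FOO=foo
baz_value        # prints foo (FOO is set, so it takes precedence)
```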
704
+ Conlink network configuration can be specified in `x-network`
705
+ properties within compose files. This can be a problem with the
706
+ builtin overlay functionality of docker compose because `x-` prefixed
707
+ properties are simply overridden as a whole without any special merging
708
+ behavior. To work around this limitation, conlink has the ability to
709
+ directly merge `x-network` configuration from multiple compose files
710
+ by passing the `COMPOSE_FILE` variable to the conlink `--compose-file`
711
+ parameter (which supports a colon separated list of compose files).
712
+
713
+ ### wait.sh
714
+
715
+ The dynamic event driven nature of conlink means that interfaces may
716
+ appear after the container service code starts running (unlike plain
717
+ docker container networking). For this reason, the `wait.sh` script is
718
+ provided to simplify waiting for interfaces to appear (and other
719
+ network conditions). Here is a compose file snippet that will wait for
720
+ `eth0` to appear and for `eni1` to both appear and have an IP address
721
+ assigned before running the startup command (after the `--`):
722
+
723
+ ```
724
+ services:
725
+ svc1:
726
+ volumes:
727
+ - ./conlink/scripts:/scripts:ro
728
+ command: /scripts/wait.sh -i eth0 -I eni1 -- /start-cmd.sh arg1 arg2
729
+ ```
730
+
731
+ In addition to waiting for interfaces and address assignment,
732
+ `wait.sh` can also wait for a file to appear (`-f FILE`), a remote TCP
733
+ port to become accessible (`-t HOST:PORT`), or run a command until it
734
+ completes successfully (`-c COMMAND`).
735
+
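The general idea of waiting on a condition can be sketched as a retry loop. This is a hypothetical illustration in the spirit of `-c COMMAND`, not the actual `wait.sh` code; the `wait_for` name, the attempt limit, and the one second delay are all assumptions.

```shell
# Hypothetical sketch (not the real wait.sh): run a command repeatedly until
# it succeeds, giving up after a fixed number of attempts.
wait_for() {
    tries=$1
    shift
    while ! "$@" 2>/dev/null; do
        tries=$((tries - 1))
        [ "${tries}" -gt 0 ] || return 1   # out of attempts
        sleep 1                            # assumed delay between attempts
    done
}

# Example: wait up to 30 attempts for eth0 to appear, then start the service.
# wait_for 30 test -e /sys/class/net/eth0 && /start-cmd.sh arg1 arg2
```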
736
+
737
+ ### copy.sh
738
+
739
+ One of the features of the `mdc` command is to collect directory
740
+ hierarchies from mode/module directories into a single `.files/`
741
+ directory at the top-level. The intended use of the merged directory
742
+ hierarchy is to be merged into file-systems of running containers.
743
+ However, simple volume mounts will replace entire directory
744
+ hierarchies (and hide all prior files under the mount point). The
745
+ `copy.sh` script is provided for easily merging/overlaying one
746
+ directory hierarchy onto another one. In addition, the `-T` option
747
+ will also replace special `{{VAR}}` tokens in the files being copied
748
+ with the value of the matching environment variable.
749
+
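The effect of `{{VAR}}` templating can be approximated with `sed`. This is a minimal sketch, not the actual `copy.sh` implementation; the `render_template` helper is hypothetical and handles a single variable whose value contains no `|`, `&`, or backslash characters.

```shell
# Hypothetical sketch (not the real copy.sh): replace every {{NAME}} token
# read from stdin with the given value.
render_template() {
    name=$1
    value=$2
    # '|' is used as the sed delimiter, so the value must not contain it.
    sed "s|{{${name}}}|${value}|g"
}

FOO=123
echo 'listen_port={{FOO}}' | render_template FOO "${FOO}"   # prints listen_port=123
```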
750
+ Here is a compose file snippet that shows the use of `copy.sh` to
751
+ recursively copy/overlay the directory tree in `./.files/svc2` onto
752
+ the container root file-system. In addition, due to the use of the
753
+ `-T` option, the script will replace any occurrence of the string
754
+ `{{FOO}}` with the value of the `FOO` environment variable within any
755
+ of the files that are copied:
756
+
757
+ ```
758
+ services:
759
+ svc2:
760
+ environment:
761
+ - FOO=123
762
+ volumes:
763
+ - ./.files/svc2:/files:ro
764
+ command: /scripts/copy.sh -T /files / -- /start-cmd.sh arg1 arg2
765
+ ```
766
+
767
+ Note that instances of `copy.sh` and `wait.sh` can be easily chained
768
+ together like this:
769
+ ```
770
+ /scripts/copy.sh -T /files / -- /scripts/wait.sh -i eth0 -- cmd args
518
771
  ```
519
772
 
520
773
  ## GraphViz network configuration rendering
@@ -0,0 +1,49 @@
1
+ version: "2.4"
2
+
3
+ services:
4
+ node1:
5
+ image: python:3-alpine
6
+ network_mode: none
7
+ command: "python3 -m http.server -d /var 80"
8
+ x-network:
9
+ links:
10
+ - {bridge: s1, ip: "10.1.0.1/24",
11
+ route: ["10.0.0.0/8 via 10.1.0.100"],
12
+ forward: ["1080:80/tcp", "1180:80/tcp"]}
13
+
14
+ node2:
15
+ image: python:3-alpine
16
+ network_mode: none
17
+ scale: 2
18
+ command: "python3 -m http.server -d /usr 80"
19
+ x-network:
20
+ links:
21
+ - {bridge: s2, ip: "10.2.0.1/24",
22
+ route: "10.0.0.0/8 via 10.2.0.100",
23
+ forward: "80:80/tcp"}
24
+
25
+ router:
26
+ image: python:3-alpine
27
+ network_mode: none
28
+ command: sleep Infinity
29
+ x-network:
30
+ links:
31
+ - {bridge: s1, ip: "10.1.0.100/24", dev: es1}
32
+ - {bridge: s2, ip: "10.2.0.100/24", dev: es2}
33
+
34
+ network:
35
+ build: {context: ../}
36
+ image: conlink
37
+ pid: host
38
+ cap_add: [SYS_ADMIN, NET_ADMIN, SYS_NICE, NET_BROADCAST, IPC_LOCK]
39
+ security_opt: [ 'apparmor:unconfined' ] # needed on Ubuntu 18.04
40
+ volumes:
41
+ - /var/run/docker.sock:/var/run/docker.sock
42
+ - /var/lib/docker:/var/lib/docker
43
+ - ../:/test
44
+ ports:
45
+ - "3080:1080/tcp"
46
+ - "8080:1180/tcp"
47
+ - "80:80/tcp"
48
+ - "81:81/tcp"
49
+ command: /app/build/conlink.js --compose-file /test/examples/test10-compose.yaml
@@ -15,7 +15,7 @@ services:
15
15
  - /var/run/docker.sock:/var/run/docker.sock
16
16
  - /var/lib/docker:/var/lib/docker
17
17
  - ./:/test
18
- command: /app/build/conlink.js --compose-file /test/test7-compose.yaml
18
+ command: /app/build/conlink.js --default-mtu 5111 --compose-file /test/test7-compose.yaml
19
19
 
20
20
  node:
21
21
  image: alpine
@@ -28,4 +28,12 @@ services:
28
28
  ip: 10.0.1.1/16
29
29
  mac: 00:0a:0b:0c:0d:01
30
30
  mtu: 4111
31
- netem: "delay 40ms rate 10mbit"
31
+ netem: "rate 10mbit delay 40ms"
32
+ - bridge: s2
33
+ ip: 100.0.1.1/16
34
+ dev: eth1
35
+
36
+ x-network:
37
+ links:
38
+ # The delay setting is overridden by the one in the service
39
+ - {service: node, bridge: s1, netem: "delay 200ms"}
@@ -19,14 +19,19 @@ services:
19
19
  - BRIDGE_MODE
20
20
  command: /app/build/conlink.js --default-bridge-mode linux --compose-file /test/test9-compose.yaml
21
21
 
22
- node:
22
+ node1:
23
+ image: alpine
24
+ network_mode: none
25
+ command: sleep Infinity
26
+
27
+ node2:
23
28
  image: alpine
24
29
  network_mode: none
25
- scale: 2
26
30
  command: sleep Infinity
27
31
 
28
32
  x-network:
29
33
  links:
30
- - {bridge: s1, service: node, ip: 10.0.1.1/24}
34
+ - {bridge: s1, service: node1, ip: 10.0.1.1/24}
35
+ - {bridge: s1, service: node2, ip: 10.0.1.2/24}
31
36
  bridges:
32
37
  - {bridge: s1, mode: "${BRIDGE_MODE:-auto}"}
package/link-add.sh CHANGED
@@ -32,9 +32,8 @@ usage () {
32
32
  echo >&2 " --mac MAC0 - MAC address for INTF0"
33
33
  echo >&2 " --mac0 MAC0 - MAC address for INTF0"
34
34
  echo >&2 " --mac1 MAC1 - MAC address for INTF1"
35
- echo >&2 " --route 'ROUTE' - route to add to INTF0"
36
- echo >&2 " --route|--route0 'ROUTE' - route to add to INTF0"
37
- echo >&2 " --route1 'ROUTE' - route to add to INTF1"
35
+ echo >&2 " --route|--route0 'ROUTE' - route to add to INTF0 (can repeat)"
36
+ echo >&2 " --route1 'ROUTE' - route to add to INTF1 (can repeat)"
38
37
  echo >&2 " --mtu MTU - MTU for both interfaces"
39
38
  echo >&2 ""
40
39
  echo >&2 " --mode MODE - Mode settings for *vlan TYPEs"
@@ -43,8 +42,10 @@ usage () {
43
42
  echo >&2 " --remote REMOTE - Remote address for geneve/vxlan types"
44
43
  echo >&2 " --vni VNI - Virtual Network Identifier for geneve/vxlan types"
45
44
  echo >&2 ""
46
- echo >&2 " --netem NETEM - tc qdisc netem OPTIONS (man 8 netem)"
45
+ echo >&2 " --netem NETEM - tc qdisc netem OPTIONS (man 8 netem) (can repeat)"
47
46
  echo >&2 " --nat TARGET - Stateless NAT traffic to/from TARGET"
47
+ echo >&2 " (in primary/PID0 netns)"
48
+ echo >&2 ""
48
49
  exit 2
49
50
  }
50
51
 
@@ -52,17 +53,21 @@ info() { echo "link-add [${LOG_ID}] ${*}"; }
52
53
  warn() { >&2 echo "link-add [${LOG_ID}] ${*}"; }
53
54
  die() { warn "ERROR: ${*}"; exit 1; }
54
55
 
55
- # Set MAC, IP, ROUTE, MTU, and up state for interface in netns
56
+ # Set MAC, IP, ROUTES, MTU, and up state for interface in netns
56
57
  setup_if() {
57
- local IF=$1 NS=$2 MAC=$3 IP=$4 ROUTE=$5 MTU=$6
58
+ local IF=$1 NS=$2 MAC=$3 IP=$4 MTU=$5 ROUTES=$6 routes=
59
+ echo >&2 "ROUTES: ${ROUTES}"
60
+ while read rt; do
61
+ [ "${rt}" ] && routes="${routes}\nroute add ${rt} dev ${IF}"
62
+ done < <(echo -e "${ROUTES}")
58
63
 
59
- info "Setting ${IP:+IP ${IP}, }${MAC:+MAC ${MAC}, }${MTU:+MTU ${MTU}, }${ROUTE:+ROUTE '${ROUTE}', }up state"
64
+ info "Setting ${IP:+IP ${IP}, }${MAC:+MAC ${MAC}, }${MTU:+MTU ${MTU}, }${ROUTES:+ROUTES '${ROUTES//$'\n'/,}', }up state"
60
65
  ip -netns ${NS} --force -b - <<EOF
61
66
  ${IP:+addr add ${IP} dev ${IF}}
62
67
  ${MAC:+link set dev ${IF} address ${MAC}}
63
68
  ${MTU:+link set dev ${IF} mtu ${MTU}}
64
69
  link set dev ${IF} up
65
- ${ROUTE:+route add ${ROUTE} dev ${IF}}
70
+ $(echo -e "${routes}")
66
71
  EOF
67
72
  }
68
73
 
@@ -76,7 +81,7 @@ IPTABLES() {
76
81
  # Parse arguments
77
82
  VERBOSE=${VERBOSE:-}
78
83
  PID1=${PID1:-<SELF>} IF1=${IF1:-eth0}
79
- IP0= IP1= MAC0= MAC1= ROUTE0= ROUTE1= MTU=
84
+ IP0= IP1= MAC0= MAC1= ROUTES0= ROUTES1= MTU=
80
85
  MODE= VLANID= REMOTE= VNI= NETEM= NAT=
81
86
  positional=
82
87
  while [ "${*}" ]; do
@@ -89,8 +94,8 @@ while [ "${*}" ]; do
89
94
  --ip1) IP1="${OPTARG}"; shift ;;
90
95
  --mac|--mac0) MAC0="${OPTARG}"; shift ;;
91
96
  --mac1) MAC1="${OPTARG}"; shift ;;
92
- --route|--route0) ROUTE0="${OPTARG}"; shift ;;
93
- --route1) ROUTE1="${OPTARG}"; shift ;;
97
+ --route|--route0) ROUTES0="${ROUTES0}\n${OPTARG}"; shift ;;
98
+ --route1) ROUTES1="${ROUTES1}\n${OPTARG}"; shift ;;
94
99
  --mtu) MTU="${OPTARG}"; shift ;;
95
100
 
96
101
  --mode) MODE="${OPTARG}"; shift ;;
@@ -99,13 +104,15 @@ while [ "${*}" ]; do
99
104
  --remote) REMOTE="${OPTARG}"; shift ;;
100
105
  --vni) VNI="${OPTARG}"; shift ;;
101
106
 
102
- --netem) NETEM="${OPTARG}"; shift ;;
107
+ --netem) NETEM="${NETEM} ${OPTARG}"; shift ;;
103
108
  --nat) NAT="${OPTARG}"; shift ;;
104
109
  -h|--help) usage ;;
105
110
  *) positional="${positional} $1" ;;
106
111
  esac
107
112
  shift
108
113
  done
114
+ ROUTES0="${ROUTES0#\\n}"
115
+ ROUTES1="${ROUTES1#\\n}"
109
116
  set -- ${positional}
110
117
  TYPE=$1 PID0=$2 IF0=$3
111
118
 
@@ -179,9 +186,9 @@ geneve|vxlan)
179
186
  ;;
180
187
  esac
181
188
 
182
- setup_if ${IF0} ${NS0} "${MAC0}" "${IP0}" "${ROUTE0}" "${MTU}"
189
+ setup_if ${IF0} ${NS0} "${MAC0}" "${IP0}" "${MTU}" "${ROUTES0}"
183
190
  [ "${TYPE}" = "veth" ] && \
184
- setup_if ${IF1} ${NS1} "${MAC1}" "${IP1}" "${ROUTE1}" "${MTU}"
191
+ setup_if ${IF1} ${NS1} "${MAC1}" "${IP1}" "${MTU}" "${ROUTES1}"
185
192
 
186
193
  if [ "${NETEM}" ]; then
187
194
  info "Setting tc qdisc netem: ${NETEM}"
@@ -0,0 +1,46 @@
1
+ #!/bin/bash
2
+
3
+ # Copyright (c) 2024, Equinix, Inc
4
+ # Licensed under MPL 2.0
5
+
6
+ set -e
7
+
8
+ usage () {
9
+ echo >&2 "${0} [OPTIONS] <add|del> INTF_A INTF_B PORT_A:IP:PORT_B/PROTO"
10
+ echo >&2 ""
11
+ echo >&2 "Match traffic on INTF_A that has destination port PORT_A and"
12
+ echo >&2 "protocol PROTO (tcp or udp). Forward/DNAT traffic to IP:PORT_B "
13
+ echo >&2 "via INTF_B."
14
+ exit 2
15
+ }
16
+
17
+ info() { echo "link-forward [${LOG_ID}] ${*}"; }
18
+
19
+ IPTABLES() {
20
+ case "${action}" in
21
+ add) iptables -C "${@}" 2>/dev/null || iptables -I "${@}" ;;
22
+ del) iptables -D "${@}" 2>/dev/null || true;;
23
+ esac
24
+ }
25
+
26
+ action=$1; shift || usage
27
+ intf_a=$1; shift || usage
28
+ intf_b=$1; shift || usage
29
+ spec=$1; shift || usage
30
+ read port_a ip port_b proto <<< "${spec//[:\/]/ }"
31
+
32
+ [ "${action}" -a "${intf_a}" -a "${intf_b}" ] || usage
33
+ [ "${port_a}" -a "${ip}" -a "${port_b}" -a "${proto}" ] || usage
34
+
35
+ LOG_ID="${spec}"
36
+
37
+ info "${action^} forwarding ${intf_a} -> ${intf_b}"
38
+
39
+ IPTABLES PREROUTING -t nat -i ${intf_a} -p ${proto} --dport ${port_a} -j DNAT --to-destination ${ip}:${port_b}
40
+ IPTABLES PREROUTING -t nat -i ${intf_a} -p ${proto} --dport ${port_a} -j MARK --set-mark 1
41
+ IPTABLES POSTROUTING -t nat -o ${intf_b} -m mark --mark 1 -j MASQUERADE
42
+
43
+ case "${action}" in
44
+ add) ip route replace ${ip} dev ${intf_b} ;;
45
+ del) ip route delete ${ip} dev ${intf_b} || true;;
46
+ esac