Binding Sockets to Multiple Network Namespaces in Go Without Thread Overhead

The Namespace Binding Challenge

When proxying traffic across multiple network namespaces, a common misconception is that each namespace requires a dedicated OS thread. This implies calling setns(2) on a new thread for every namespace, resulting in a 1:1 ratio between namespaces and threads. This approach consumes excessive system resources and complicates thread management. Offloading this to a separate C process introduces further overhead in terms of inter-process communication and protocol parsing.

The key realization, however, is that the network namespace is only relevant at the moment the socket is created. Once the file descriptor is instantiated within the target namespace, the thread can safely revert to the default namespace. This allows a single process to monitor ports across numerous namespaces without spawning a thread for each one.

Go Implementation Strategy

To ensure a socket is born inside a specific namespace, the Go runtime must be prevented from migrating the goroutine to a different OS thread mid-operation. The procedure for establishing a TCP listener is:

  • Acquire and store a handle to the root (default) network namespace.
  • Acquire a global lock and pin the goroutine to its current OS thread using runtime.LockOSThread() to prevent scheduling disruptions.
  • Retrieve the target namespace handle and switch to it via setns.
  • Invoke net.Listen to instantiate the socket.
  • Restore the root namespace context.
  • Release the thread lock and the global mutex.

TCP Listener Example

Using the github.com/vishvananda/netns package, we can listen on port 8080 across the root namespace, ns-alpha, and ns-beta. Prepare the namespaces with:

ip netns add ns-alpha
ip netns add ns-beta

The Go implementation demonstrates how to switch contexts safely:

package main

import (
	"net"
	"runtime"
	"sync"

	"github.com/pkg/errors"
	"github.com/sirupsen/logrus"
	"github.com/vishvananda/netns"
)

var (
	rootNsHandle netns.NsHandle
	nsLock       sync.Mutex
)

func initRootNamespace() {
	handle, err := netns.Get()
	if err != nil {
		panic(err)
	}
	rootNsHandle = handle
}

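// bindSocketInNamespace opens a listener inside the named namespace targetNs
// (as created by `ip netns add`). An empty targetNs means the current namespace.
func bindSocketInNamespace(targetNs, proto, addr string) (net.Listener, error) {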
	if targetNs == "" {
		return net.Listen(proto, addr)
	}

	var switched bool

	nsLock.Lock()
	runtime.LockOSThread()

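	defer func() {
		defer nsLock.Unlock()
		if switched {
			if err := netns.Set(rootNsHandle); err != nil {
				// The thread is still in the target namespace. Keep it locked so no
				// other goroutine runs on it; the runtime terminates a locked thread
				// when its goroutine exits.
				logrus.WithError(err).Warn("Failed to revert to root namespace")
				return
			}
		}
		runtime.UnlockOSThread()
	}()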

	nsHandle, err := netns.GetFromName(targetNs)
	if err != nil {
		return nil, errors.Wrap(err, "failed to get namespace handle")
	}
	defer nsHandle.Close()

	if err = netns.Set(nsHandle); err != nil {
		return nil, errors.Wrap(err, "failed to switch namespace")
	}
	switched = true

	return net.Listen(proto, addr)
}

func handleConnections(listener net.Listener) {
	for {
		conn, err := listener.Accept()
		if err != nil {
			logrus.WithError(err).Error("Accept error")
			return
		}
		logrus.WithFields(logrus.Fields{"local": conn.LocalAddr(), "remote": conn.RemoteAddr()}).Info("Connection established")
		conn.Write([]byte("ack"))
		conn.Close()
	}
}

func main() {
	initRootNamespace()

	targets := []string{"", "ns-alpha", "ns-beta"}
	var wg sync.WaitGroup

	for _, ns := range targets {
		wg.Add(1)
		go func(namespace string) {
			defer wg.Done()
			listener, err := bindSocketInNamespace(namespace, "tcp", ":8080")
			if err != nil {
				panic(err)
			}
			logrus.WithFields(logrus.Fields{"netns": namespace, "addr": listener.Addr()}).Info("Listening")
			handleConnections(listener)
		}(ns)
	}

	wg.Wait()
}

Handling UDP and SCTP

UDP sockets follow the same pattern without causing additional thread spawns: like TCP sockets, they are created by Go's net package in non-blocking mode and registered with the runtime's network poller, so a blocked read parks the goroutine rather than an OS thread. SCTP, however, presents a distinct challenge. Libraries like github.com/ishidawataru/sctp provide basic file descriptor wrappers where the Accept() method operates as a blocking syscall. Each goroutine blocked in such a call occupies an OS thread, and the Go runtime starts additional threads to keep other goroutines running, so one accept loop per listener defeats our thread-conservation goal.
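Before turning to the SCTP remedy, the UDP case can be sketched as follows. This is a minimal illustration, not the author's code: the namespace switch is injected as two hypothetical hooks, enter and leave, so the sketch compiles with the standard library alone. With the netns package they would presumably wrap netns.Set(nsHandle) and netns.Set(rootNsHandle). The names listenPacketIn and nsMu are likewise assumptions.

```go
package main

import (
	"fmt"
	"net"
	"runtime"
	"sync"
)

// nsMu serializes namespace switches, mirroring nsLock in the TCP example.
var nsMu sync.Mutex

// listenPacketIn opens a packet socket while the calling thread sits in the
// target namespace. enter and leave are hypothetical hooks that switch the
// thread into the target namespace and back to the root namespace.
func listenPacketIn(enter, leave func() error, network, addr string) (net.PacketConn, error) {
	nsMu.Lock()
	defer nsMu.Unlock()

	// Pin the goroutine so the switch and the socket creation share a thread.
	runtime.LockOSThread()
	if err := enter(); err != nil {
		runtime.UnlockOSThread()
		return nil, fmt.Errorf("enter namespace: %w", err)
	}

	conn, listenErr := net.ListenPacket(network, addr)

	if err := leave(); err != nil {
		// The thread could not be restored. It stays locked so that no other
		// goroutine is scheduled onto it.
		if conn != nil {
			conn.Close()
		}
		return nil, fmt.Errorf("leave namespace: %w", err)
	}
	runtime.UnlockOSThread()
	return conn, listenErr
}

func main() {
	// No-op hooks stand in for the real switch and stay in the current namespace.
	noop := func() error { return nil }
	conn, err := listenPacketIn(noop, noop, "udp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("listening on", conn.LocalAddr())
}
```

The returned net.PacketConn is served by the runtime's network poller like any other UDP socket, so reading from it in a goroutine does not pin a thread.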

The remedy involves bypassing the library's blocking accept loop and managing the file descriptor manually:

  • Configure the SCTP socket to non-blocking mode.
  • Implement a custom epoll-based event loop. Avoid select and poll: their per-call cost grows with the number of watched file descriptors, so performance degrades severely at high descriptor counts.

To extract the file descriptor and apply non-blocking settings during creation:

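import (
	"syscall"

	"github.com/ishidawataru/sctp"
)

// nonBlockingSctpListener pairs the library listener with its raw descriptor
// so the descriptor can later be registered with epoll.
type nonBlockingSctpListener struct {
	*sctp.SCTPListener
	rawFd int
}

func createNonBlockingSctpListener(network, addr string) (*nonBlockingSctpListener, error) {
	// parseSctpAddress is a helper not shown in this article; it is assumed
	// to return a *sctp.SCTPAddr.
	parsedAddr, err := parseSctpAddress(addr)
	if err != nil {
		return nil, err
	}

	capturedFd := -1 // 0 is a valid descriptor, so -1 marks "not captured"
	cfg := sctp.SocketConfig{
		InitMsg: sctp.InitMsg{NumOstreams: sctp.SCTP_MAX_STREAM},
		Control: func(_, _ string, rawConn syscall.RawConn) error {
			var setErr error
			if err := rawConn.Control(func(fd uintptr) {
				// Record the failure so it reaches the caller instead of being
				// swallowed. The descriptor is not closed here; the library is
				// expected to clean up the socket it created.
				if setErr = syscall.SetNonblock(int(fd), true); setErr == nil {
					capturedFd = int(fd)
				}
			}); err != nil {
				return err
			}
			return setErr
		},
	}

	listener, err := cfg.Listen(network, parsedAddr)
	if err != nil {
		return nil, err
	}

	return &nonBlockingSctpListener{SCTPListener: listener, rawFd: capturedFd}, nil
}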

Production Metrics

In a production deployment on a 4-core machine, this approach yields significant resource savings. The process manages over 1200 file descriptors (encompassing TCP, UDP, and SCTPv6 sockets across various namespaces), yet operates with merely 14 OS threads. This confirms the viability of sharing threads across multiple namespace contexts rather than dedicating a thread per namespace.
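The article does not say how these figures were collected. One way to observe comparable numbers from inside a Linux process is to read procfs, as in the sketch below; osThreadCount and openFDCount are hypothetical helper names.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// osThreadCount returns the "Threads:" field of /proc/self/status, which is
// the number of OS threads in the current process (Linux only).
func osThreadCount() (int, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return 0, err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "Threads:") {
			return strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, "Threads:")))
		}
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, errors.New("Threads field not found in /proc/self/status")
}

// openFDCount counts the entries in /proc/self/fd. The descriptor used to
// read the directory itself is included in the result.
func openFDCount() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	threads, err := osThreadCount()
	if err != nil {
		panic(err)
	}
	fds, err := openFDCount()
	if err != nil {
		panic(err)
	}
	fmt.Printf("threads=%d fds=%d\n", threads, fds)
}
```

Logging both values periodically makes it easy to confirm that the thread count stays flat as listeners are added in more namespaces.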

Tags: Go netns setns Epoll SCTP

Posted on Thu, 11 Jun 2026 17:53:23 +0000 by sivarts