Age | Commit message | Author | Files | Lines
2026-04-14 | wgengine: replace reflect.DeepEqual with typed Equal for maybeReconfigInputs (#19365) | Fernando Serboncini | 2 | -4/+151
reflect.DeepEqual is expensive and allocates heavily. Replace it with a field-by-field comparison that does zero allocations. Adds tests and benchmarks for the new Equal method. Fixes #19363 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
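The following is a minimal sketch of the field-by-field approach described above, not the code from the commit. The `reconfigInputs` struct and its fields are hypothetical stand-ins; the real maybeReconfigInputs type is not shown in this log.

```go
package main

import (
	"fmt"
	"net/netip"
	"reflect"
	"slices"
)

// reconfigInputs is a hypothetical struct used only for illustration;
// the real maybeReconfigInputs type has different fields.
type reconfigInputs struct {
	ListenPort uint16
	Name       string
	Addrs      []netip.Prefix
}

// Equal reports whether a and b hold the same values. It compares each
// field directly instead of walking the struct with reflection.
func (a *reconfigInputs) Equal(b *reconfigInputs) bool {
	if a == nil || b == nil {
		return a == b
	}
	return a.ListenPort == b.ListenPort &&
		a.Name == b.Name &&
		slices.Equal(a.Addrs, b.Addrs)
}

func main() {
	mk := func() *reconfigInputs {
		return &reconfigInputs{
			ListenPort: 41641,
			Name:       "node-a",
			Addrs:      []netip.Prefix{netip.MustParsePrefix("100.64.0.1/32")},
		}
	}
	x, y := mk(), mk()
	fmt.Println(x.Equal(y), reflect.DeepEqual(x, y))

	y.ListenPort = 0
	fmt.Println(x.Equal(y), reflect.DeepEqual(x, y))
}
```

One design point worth checking when making such a swap: `slices.Equal` treats a nil slice and an empty slice as equal, whereas `reflect.DeepEqual` does not, so the two comparisons can disagree on that edge case.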
2026-04-14 | util/linuxfw: fix nil deref in nftables chain check | Brad Fitzpatrick | 2 | -2/+43
Fix a panic in getOrCreateChain when the kernel lacks nftables support (CONFIG_NF_TABLES). When the nftables netlink connection fails, chain objects returned by getChainFromTable can have nil Hooknum and Priority fields. Dereferencing these caused tailscaled to SIGSEGV during router configuration, which manifested as tailscaled silently crashing ~13 seconds after "tailscale up" on arm64 gokrazy (whose kernel.arm64 build doesn't include nftables). Updates #13038 Change-Id: I14433616da5ed57895cad37038921fb4f79c3534 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 | tstest/integration: use linkat to hardlink test binaries on Linux | Brad Fitzpatrick | 4 | -5/+104
Use linkat via /proc/self/fd with AT_SYMLINK_FOLLOW to create a hardlink of the test binary instead of copying it. This avoids copying ~50MB+ binaries into each test's temp directory, making test setup faster and reducing disk I/O. The simpler os.Link(b.Path, ret.Path) can't be used here because the source binary lives in the first test's TempDir, which may be cleaned up before later tests call CopyTo. The open FD keeps the inode alive after the path is deleted, but os.Link needs a valid path. (See also b9f468240f which tried os.Link but is racy for this reason.) The /proc/self/fd approach works without elevated privileges, unlike AT_EMPTY_PATH which requires CAP_DAC_READ_SEARCH. If the linkat fails for any reason (e.g. cross-filesystem temp dirs), it falls back to the existing full-copy path. Fixes #19397 Change-Id: I4b1f97f7e63a9ae9e09dce36dfbdd1f6cff92320 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 | tstest: fix kernel version parsing for Debian-style version strings | Avery Pennarun | 2 | -4/+46
The kernel version parser used strings.Cut with "-" to handle versions like "5.4.0-76-generic", but Debian uses "+" in versions like "6.12.41+deb13-amd64". Use strings.IndexAny to find the first "-" or "+" and truncate there. Fixes TestKernelVersion on Debian systems. Fixes #19395 Change-Id: I70e5f95682d54baf908e51f9f4b51c130b00aaaa Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
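A small sketch of the truncation step described above, assuming a hypothetical helper name `trimKernelSuffix`; the real parser in tstest may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// trimKernelSuffix is a hypothetical helper: it cuts the version string
// at the first "-" or "+", whichever comes first, and returns the input
// unchanged when neither is present.
func trimKernelSuffix(v string) string {
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		return v[:i]
	}
	return v
}

func main() {
	for _, v := range []string{
		"5.4.0-76-generic",    // Ubuntu-style
		"6.12.41+deb13-amd64", // Debian-style
		"6.8.0",               // no suffix
	} {
		fmt.Printf("%q -> %q\n", v, trimKernelSuffix(v))
	}
}
```

With `strings.Cut(v, "-")` alone, the Debian string would be cut at the later "-" and keep "+deb13" in the numeric part, which is the failure the commit describes.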
2026-04-14 | wgengine/magicsock: deflake TestTwoDevicePing compare-metrics-stats | Brad Fitzpatrick | 1 | -72/+107
The compare-metrics-stats subtest reset two independent counting systems (physical connection counters and expvar.Int user metrics) non-atomically. Background WireGuard keepalives arriving between the resets could increment one system but not the other, causing off-by-one packet/byte mismatches in either direction. Replace the reset-then-compare pattern with snapshot-and-delta: snapshot both systems before pings, snapshot again after, and compare the deltas. This eliminates the non-atomic reset window entirely. As a belt-and-suspenders safety net, tolerate a difference of exactly one packet (and corresponding bytes) from a stray keepalive that could still arrive in the narrow window between the two snapshots. flakestress passes with ~5900 runs (~2800 without -race, ~3100 with -race) but it also passed previously too. This is an annoying one to repro. Fixes #11762 Change-Id: I3447ad67e71c8146e85eed38b7a665033ef9e284 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
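The snapshot-and-delta pattern can be sketched as follows. This is an illustration under assumptions, not the test's code: the `counters` and `snapshot` types and the `deltasAgree` helper are hypothetical, and a single packet counter stands in for the packet and byte counts the real test compares.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counters models two independently updated counting systems.
// Both field names are hypothetical.
type counters struct {
	physical atomic.Int64 // packets seen at the connection layer
	metric   atomic.Int64 // packets recorded by the user-facing metric
}

// snapshot is a point-in-time copy of both counters.
type snapshot struct {
	physical, metric int64
}

func (c *counters) snap() snapshot {
	return snapshot{physical: c.physical.Load(), metric: c.metric.Load()}
}

// deltasAgree compares how much each system grew between two snapshots
// and allows the growth to differ by at most tolerance.
func deltasAgree(before, after snapshot, tolerance int64) bool {
	diff := (after.physical - before.physical) - (after.metric - before.metric)
	if diff < 0 {
		diff = -diff
	}
	return diff <= tolerance
}

func main() {
	var c counters
	c.physical.Add(7) // earlier traffic; never reset
	c.metric.Add(7)

	before := c.snap()
	c.physical.Add(3) // the pings under test
	c.metric.Add(3)
	c.physical.Add(1) // a stray keepalive seen by only one system so far
	after := c.snap()

	fmt.Println(deltasAgree(before, after, 1)) // prints true
}
```

Because nothing is ever reset, traffic that arrived before the first snapshot cancels out of both deltas, which is what removes the non-atomic reset window.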
2026-04-14 | net/dns: fix TestDNSTrampleRecovery failure under flakestress | Brad Fitzpatrick | 4 | -31/+36
The test had two problems:
1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher, but the test filesystem uses a temp directory prefix. The watcher was watching the real /etc/, never seeing the test's file writes.
2. The test's watchFile used gonotify.NewDirWatcher which creates goroutines that block on real inotify syscalls. These don't work inside synctest's fake-time bubble. The test only passed standalone by accident: gonotify walks /etc/ on startup producing fake events that happened to trigger trample detection at the right time.
Fix the path issue by adding ActualPath to the wholeFileFS interface, which translates logical paths (like "/etc/resolv.conf") to real filesystem paths (respecting any test prefix). Use it in runFileWatcher so the inotify watch targets the correct directory. Replace gonotify in the test with a one-shot timer that synctest can advance through fake time, reliably triggering the trample check.
Fixes #19400
Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 | control/controlclient: improve filter on netmap updates (#19308) | Claus Lensbøl | 2 | -41/+240
The previous filters allowed a handful of subtle issues, such as updating the last-seen date when the key or online status had not changed, and having online keys unconditionally trigger an engine update. These have been fixed, alongside making no-change updates from TSMP a no-op for the engine so we don't have to reconfigure. A bunch of additional testing has been added as well. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-13 | go.mod: upgrade go-git to v5.17.1 | Patrick O'Doherty | 5 | -9/+9
Partially resolve govulncheck warnings in OSS and corp. Updates #cleanup Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-04-13 | derp/derpserver: increase minimum token bucket size | Jordan Whited | 2 | -24/+36
And cap WaitN calls to prevent token bucket errors. Frame length is inclusive of DERP key for FrameSendPacket frames. Updates tailscale/corp#40171 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 | tstest/integration: clear SSH_CLIENT env to prevent false positive detection | Avery Pennarun | 1 | -0/+3
When running integration tests over SSH (e.g., in remote development environments), the SSH_CLIENT environment variable is set. This causes isSSHOverTailscale() to incorrectly detect an SSH session and change behavior. Clear SSH_CLIENT in the test node environment to prevent these false positives. Fixes #19393 Change-Id: I1411abf0be9704cce37051476efb04d59beed386 Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 | all: fix six tests that failed with -count=2 | Brad Fitzpatrick | 9 | -15/+70
Avery found a bunch of tests that fail with -count=2. Updates tailscale/corp#40176 (tracks making our CI detect them) Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 | .gitignore: explicitly include tool/go.exe | James Tucker | 1 | -1/+4
Updates #19255 Signed-off-by: James Tucker <james@tailscale.com>
2026-04-13 | cmd/derper: fix TestLookupMetric to pass when run alone | Brad Fitzpatrick | 1 | -6/+28
TestLookupMetric was added in e8d140654 (2023-08-17) without initializing the dnsCache and dnsCacheBytes globals. When run in isolation, handleBootstrapDNS writes a nil body (from the uninitialized dnsCacheBytes), causing getBootstrapDNS to fail decoding an empty response with EOF. Add a setDNSCache test helper that stores the dnsEntryMap, marshals dnsCacheBytes, and registers a t.Cleanup to nil both out, so tests that forget to call it will hit the dnsCache-nil fatal in getBootstrapDNS rather than silently depending on prior test state. Also add AssertNotParallel and a dnsCache-nil fatal check to getBootstrapDNS, the central helper all bootstrap DNS tests flow through, to prevent future tests from running in parallel (they all mutate package-level DNS caches and metrics) and to give a clear error if a test forgets to initialize the DNS caches. Fixes #19388 Change-Id: I8ad454ec6026c71f13ecfa14d25925df5478b908 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 | tstest/integration/nat, tstest/natlab/vnet: fix natlab test flake | Brad Fitzpatrick | 3 | -8/+138
The natlab-integrationtest CI job frequently flakes by exhausting its 3m go test timeout. The root cause is that the QEMU VMs run under pure software emulation (TCG) with no KVM. Under TCG, the guest kernel's timer calibration busy-loops are at the mercy of host CPU scheduling. When two VMs boot simultaneously on a 2-core CI runner, one VM's calibration gets starved and produces wrong results, leaving the kernel with broken timers that prevent it from ever completing boot, even after the other VM finishes and frees up CPU. Additionally, the microvm machine type doesn't provide HPET hardware, but the kernel command line specified clocksource=hpet. And the VM image build (make natlab) ran inside the test itself, consuming most of the 3m timeout budget before the actual test started.
Fix by:
- Enabling KVM when /dev/kvm is available, so timer calibration uses real hardware timers unaffected by host CPU scheduling.
- Adding a CI step to set /dev/kvm permissions on the GitHub Actions runner (ubuntu-latest provides KVM but needs a udev rule).
- Pre-building the VM image in a separate CI step so it doesn't cut into the go test -timeout budget.
- Replacing the hardcoded 60s context timeout with one derived from t.Deadline(), so the test uses the full -timeout budget.
- Adding VM boot progress detection (AwaitFirstPacket) and QMP diagnostics, so boot failures produce clear errors instead of opaque "context deadline exceeded" messages.
With KVM enabled, the test passes reliably even on a single CPU core with 3 parallel workers, a scenario that was 100% broken under TCG.
Fixes #18906
Change-Id: I4c87631a9c9678d185b9f30cb05c0f7bfa9f5c62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | tstest: add AssertNotParallel helper | Brad Fitzpatrick | 1 | -0/+14
For tests to loudly declare (and panic on violation) when they're doing something that's not safe in a parallel test. Fixes #19385 Change-Id: If79693b0c235c146871a05ed74fa9ea75bb500f9 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | wgengine/netstack: fix data race on in-flight connection test globals | Brad Fitzpatrick | 2 | -9/+13
The maxInFlightConnectionAttemptsForTest and maxInFlightConnectionAttemptsPerClientForTest globals were plain ints read by background gVisor TCP handler goroutines (via wrapTCPProtocolHandler) and written by tstest.Replace cleanup in TestTCPForwardLimits_PerClient. When a gVisor goroutine outlived the test cleanup window, the race detector caught the unsynchronized access. The race-prone code was introduced in c5abbcd4b4d8 (2024-02-26, "wgengine/netstack: add a per-client limit for in-flight TCP forwards") which added both the plain int globals and the TestTCPForwardLimits_PerClient test that writes them via tstest.Replace. It is not obvious why this has only recently started being detected as a data race; likely some combination of gVisor version bumps, Go toolchain scheduler changes, and additional TCP-injecting subtests (e.g. 03461ea7f, 2026-01-30) increased goroutine churn enough to hit the window. Change both globals to atomic.Int32 and replace tstest.Replace (which does non-atomic *target = old on cleanup) with explicit Store/Cleanup pairs. Fixes #19118 Change-Id: Id26ba6fbfb2e4ade319976db80af8e16c7c8778e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
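A rough sketch of the atomic Store/Cleanup pairing described above. The names `maxInFlightForTest`, `setForTest`, and `effectiveLimit` are hypothetical and chosen for illustration; the real globals and handler live in wgengine/netstack.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// maxInFlightForTest is a hypothetical test-only override that background
// goroutines may read at any time. Zero means "no override".
var maxInFlightForTest atomic.Int32

// setForTest atomically installs v and returns a function that atomically
// restores the previous value. In a real test this would be used as
// t.Cleanup(setForTest(2)).
func setForTest(v int32) (restore func()) {
	old := maxInFlightForTest.Swap(v)
	return func() { maxInFlightForTest.Store(old) }
}

// effectiveLimit is what a background handler goroutine would call.
// The atomic Load is safe even if it races with a test's cleanup.
func effectiveLimit(def int32) int32 {
	if v := maxInFlightForTest.Load(); v > 0 {
		return v
	}
	return def
}

func main() {
	restore := setForTest(2)
	fmt.Println(effectiveLimit(16)) // override in effect
	restore()
	fmt.Println(effectiveLimit(16)) // back to the default
}
```

The point of the change is that both the write during setup and the write during cleanup go through the atomic, so a handler goroutine that outlives the test no longer performs an unsynchronized read.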
2026-04-13 | cmd/containerboot: mark TestContainerBoot as flaky | Brad Fitzpatrick | 1 | -0/+2
Updates #19380 Change-Id: Ib1be53836e37224265d10abd0c2213644ea54d64 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | version: show tailscale/go toolchain git hash in version output | Brad Fitzpatrick | 4 | -13/+69
When built with the Tailscale Go toolchain, include the toolchain's git revision in the version output. The non-JSON output shows the first 10 hex digits:
    go version: go1.26.2 (tailscale/go dfe2a5fd8e)
The JSON output includes the full hash as "tailscaleGoGitHash", or omits the field when not using tsgo. The toolchain rev is read via a separate sync.OnceValue rather than piggybacking on getEmbeddedInfo, because that function discards all data when VCS fields are absent (e.g. in test binaries), while the tailscale.toolchain.rev setting is still present. Also add a CI-only test verifying tailscaleToolchainRev is non-empty when built with the tailscale_go build tag. Fixes #19374 Change-Id: Ied0b16d7aead5471d8c614c30cba8b0dcf80c691 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | ipn/ipnlocal: mark TestStateMachineSeamless as flaky | Brad Fitzpatrick | 1 | -0/+2
Updates #19377 Change-Id: I7dbf5b954effbfa821339e79d02d8a6e46d2862a Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | types/netmap,tailcfg: update documentation for Services cap | Adriano Sela Aviles | 2 | -4/+8
Updates tailscale/corp#40052 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 | ssh/tailssh: speed up SSH integration tests | Brad Fitzpatrick | 4 | -118/+104
Parallelize the SSH integration tests across OS targets and reduce per-container overhead:
- CI: use GitHub Actions matrix strategy to run all 4 OS containers (ubuntu:focal, ubuntu:jammy, ubuntu:noble, alpine:latest) in parallel instead of sequentially (~4x wall-clock improvement)
- Makefile: run docker builds in parallel for local dev too
- Dockerfile: consolidate ~20 separate RUN commands into 5 (one per test phase), eliminating Docker layer overhead. Combine test binary invocations where no state mutation is needed between them. Fix a bug where TestDoDropPrivileges was silently not being run (was passed as a second positional arg to -test.run instead of using regex alternation).
- TestMain: replace tail -F + 2s sleep with synchronous log read, eliminating 2s overhead per test binary invocation. Set debugTest once in TestMain instead of redundantly in each test function.
- session.read(): close channel on EOF so non-shell tests return immediately instead of waiting for the 1s silence timeout.
Updates #19244
Change-Id: I2cc8588964fbce0dd7b654fb94e7ff33440b8584
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | licenses: update license notices | License Updater | 3 | -6/+6
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-04-13 | cmd/derper: mark rate-config flag as experimental and unstable | Jordan Whited | 1 | -1/+1
Updates tailscale/corp#38509 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-13 | ipn/localapi,client/local: add services over localapi | Adriano Sela Aviles | 3 | -0/+28
Updates tailscale/corp#40052 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-13 | ssh/tailssh: gofmt | Brad Fitzpatrick | 1 | -2/+2
I'm not sure how this file got into the repo without gofmt. Maybe gofmt rules changed in some Go release? Updates #cleanup Change-Id: Ia8bd46e29f116f7fbfca11be80c8ef48699cd9f2 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | tailscaleroot: add test that tsgo rev is in Go build cache keys | Brad Fitzpatrick | 1 | -0/+57
Verify that GODEBUG=gocachehash=1 output from ./tool/go includes the git revision from go.toolchain.rev, ensuring that bumping the Tailscale Go fork (without a Go version number change) properly invalidates the build cache. The test only runs in CI or when the current Go binary is the Tailscale toolchain (GOROOT contains /.cache/tsgo/), so open source contributors using stock Go aren't forced to download tsgo. Fixes tailscale/corp#36589 Change-Id: Ia98d3a3aa8c7fa67f9a0293066fa02a1997dcb95 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-13 | tailcfg,types/netmap: add (visible) Services to SelfNode Caps (#19335) | Adriano Sela Aviles | 2 | -0/+54
Updates #40052 Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-04-11 | tstest/tailmac: add headless mode for automated VM testing | Brad Fitzpatrick | 5 | -10/+36
Add a --headless flag to the Host.app Run subcommand for running macOS VMs without a GUI, enabling use from test frameworks.
Key changes:
- HostCli.swift: When --headless is set, run the VM via VMController + RunLoop.main.run() instead of NSApplicationMain. Using the RunLoop (not dispatchMain) is required because VZ framework callbacks depend on RunLoop sources.
- VMController.swift: Add headless parameter to createVirtualMachine that configures a single socket-based NIC (no NAT NIC). This matches the NIC configuration used when creating/saving VMs, so saved state restoration works correctly. A NIC count mismatch causes VZ to silently fail to execute guest code.
- TailMacConfigHelper.swift: Clean up socket network device logging.
- Config.swift: Move VM storage from ~/VM.bundle to ~/.cache/tailscale/vmtest/macos/.
- TailMac.swift: Fix dispatchMain→RunLoop.main.run() in the create command (same VZ RunLoop requirement).
Updates #13038
Change-Id: Iea51c043aa92e8fc6257139b9f0e2e7677072fa2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 | gokrazy: add arm64 natlab appliance image support | Brad Fitzpatrick | 7 | -3/+27
Add natlabapp.arm64 config and gokrazydeps.go for building a gokrazy natlab appliance image targeting arm64 (Apple Silicon). This is the arm64 counterpart to the existing natlabapp (amd64) used by vmtest. The arm64 image uses github.com/gokrazy/kernel.arm64 and is built with "make natlab-arm64" in the gokrazy directory. Updates #13038 Change-Id: I0e1f8e5840083a5de5954f2cf46e3babec129d96 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 | .github, tool/listpkgs: automatically find tests which use tstest.RequireRoot | Brad Fitzpatrick | 5 | -11/+82
Updates tailscale/corp#40007 Change-Id: I677d3d9e276cb6633a14ac07e4b58ea08e52fac4 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 | cmd/derper,derp: add --rate-config file with SIGHUP reload (#19314) | Mike O'Driscoll | 3 | -52/+412
Add a --rate-config flag pointing to a JSON file for per-client receive rate limits (bytes/sec and burst bytes). The config is reloaded on SIGHUP, updating all existing client connections live. The --per-client-rate-limit and --per-client-rate-burst flags are removed in favor of the config file. In derpserver, rate limiting uses an atomic.Pointer[xrate.Limiter] per client: nil when unlimited or mesh (zero overhead), non-nil when rate-limited. Document that clientSet.activeClient Store operations require Server.mu. Updates tailscale/corp#38509 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
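The per-client pointer swap described above might look roughly like this. It is a sketch under assumptions: the `limiter` struct is a hypothetical stand-in for the xrate.Limiter the commit mentions, and `client` and `applyConfig` are illustrative names rather than derpserver's actual API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// limiter is a hypothetical stand-in for a token-bucket rate limiter.
type limiter struct {
	bytesPerSec int
	burst       int
}

// client holds its limiter behind an atomic pointer so a config reload
// can replace it while the read loop is running.
type client struct {
	lim atomic.Pointer[limiter] // nil means unlimited
}

// applyConfig installs a new limiter, or clears it for mesh peers and
// for a non-positive rate (treated here as "unlimited", an assumption).
func (c *client) applyConfig(bytesPerSec, burst int, mesh bool) {
	if mesh || bytesPerSec <= 0 {
		c.lim.Store(nil)
		return
	}
	c.lim.Store(&limiter{bytesPerSec: bytesPerSec, burst: burst})
}

// limited is the cheap check a hot path would do: one atomic load and a
// nil comparison when no limit is configured.
func (c *client) limited() bool { return c.lim.Load() != nil }

func main() {
	clients := []*client{{}, {}}

	// Simulate a reload that applies a 1 MiB/s limit to every client.
	for _, c := range clients {
		c.applyConfig(1<<20, 2<<20, false)
	}
	fmt.Println(clients[0].limited())

	// A later reload removes the limit for one client.
	clients[1].applyConfig(0, 0, false)
	fmt.Println(clients[1].limited())
}
```

Storing a fresh pointer on reload, rather than mutating a shared limiter in place, lets existing connections pick up the new settings on their next load without taking a lock.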
2026-04-10 | wgengine/router/osrouter: fix privileged tests missing fake netfilter runner | Amal Bansode | 1 | -0/+4
These test failures were never caught by CI because the package in question was missing from our privileged tests list. tailscale/corp#40007 covers improving our process around this. Fixes #19316 Signed-off-by: Amal Bansode <amal@tailscale.com>
2026-04-10 | tstest: add RequireRoot helper | Brad Fitzpatrick | 4 | -18/+15
Start using a common helper for tests to declare that they require root. This is step 1. A later step will then make this helper track which tests were skipped so a subsequent pass will run these tests as root. Updates tailscale/corp#40007 Change-Id: I4979e1def0fa3691d38c83f48c89aaa443e7f62e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-10 | tka: Revert "improve logging for Compact and Commit operations" | Alex Chan | 2 | -13/+0
This reverts commit b25920dfc07452833895ad00b42db7e581b3cec8. The `log.Printf` messages are causing panics in corp, in particular: > panic: please use tailscale.com/logger.Logf instead of the log package Fixing the TKA code to plumb through a logger properly is going to be a hassle, so for now remove these logs to unblock merges to corp. Updates tailscale/corp#39455 Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-10 | tka: keep the CompactionDefaults alongside the other limits | Alex Chan | 3 | -7/+19
Updates #cleanup Change-Id: Ib5e481d5a9c7ec7ac3e6b3913909ab1bf21d7a4d Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-09 | ipn/ipnlocal: add netmap mutations to the ipn bus (#19120) | Jonathan Nobels | 4 | -13/+242
ipn/local: add netmap mutations to the ipn bus
updates tailscale/tailscale#1909
This adds a new NotifyWatchOpt that allows watchers to receive PeerChange events (derived from node mutations) on the IPN bus in lieu of a complete netmap. We'll continue to send the full netmap for any map response that includes it, but for mutations, sending PeerChange events gives the client the option to manage its own models more selectively and cuts way down on JSON serialization overhead. On chatty tailnets, this will vastly reduce the amount of chatter on the bus. This change should be backwards compatible; it is purely additive. Clients that subscribe to NotifyNetmap will get the full netmap for every delta. New clients can omit that and instead opt into NotifyPeerChanges. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-04-09 | cmd/k8s-operator: set PreferDualStack on ProxyGroup egress services (#19194) | Fernando Serboncini | 2 | -3/+5
On dual-stack clusters defaulting to IPv6, the ProxyGroup egress service only got an IPv6 address, which causes request failures. Individual egress proxies already set PreferDualStack correctly. Fixes: #18768 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-04-09 | ssh/tailssh: fix default PATH for Debian | Andrew Dunham | 1 | -1/+1
Validated against a modern Debian install, fixes a typo. Updates #cleanup Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I7b26012f54dbd2f0f9fea98722e8edc2fe97645a
2026-04-09 | tstest/natlab: add TestSubnetRouterFreeBSD with FreeBSD cloud image support | Brad Fitzpatrick | 6 | -39/+163
As a warm-up to making natlab support multiple operating systems, start with an easy one (in that it's also Unixy and open source like Linux) and add FreeBSD 15.0 as a VM OS option for the vmtest integration test framework, and add TestSubnetRouterFreeBSD which tests subnet routing through a FreeBSD VM (Gokrazy → FreeBSD → Gokrazy).
Key changes:
- Add FreeBSD150 OSImage using the official FreeBSD 15.0 BASIC-CLOUDINIT cloud image (xz-compressed qcow2)
- Add GOOS()/IsFreeBSD() methods to OSImage for cross-compilation and OS-specific behavior
- Handle xz-compressed image downloads in ensureImage
- Refactor compileBinaries into compileBinariesForOS to support multiple GOOS targets (linux, freebsd), with binaries registered at <goos>/<name> paths on the file server VIP
- Add FreeBSD-specific cloud-init (nuageinit) user-data generation: string-form runcmd (nuageinit doesn't support YAML arrays), fetch(1) instead of curl, FreeBSD sysctl names for IP forwarding, mkdir /usr/local/bin, PATH setup for tta
- Skip network-config in cidata ISO for FreeBSD (DHCP via rc.conf)
Updates tailscale/tailscale#13038
Change-Id: Ibeb4f7d02659d5cd8e3a7c3a66ee7b1a92a0110d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-09 | cmd/k8s-operator: migrate to tailscale-client-go-v2 (#19010) | David Bond | 33 | -933/+909
This commit modifies the kubernetes operator to use the `tailscale-client-go-v2` package instead of the internal tailscale client it was previously using. This now gives us the ability to expand out custom resources and features as they become available via the API module. The tailnet reconciler has also been modified to manage clients as tailnets are created and removed, providing each subsequent reconciler with a single `ClientProvider` that obtains a tailscale client for the respective tailnet by name, or the operator's default when presented with a blank string. Fixes: https://github.com/tailscale/corp/issues/38418 Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-09 | tka: improve logging for Compact and Commit operations | Alex Chan | 2 | -0/+13
Log whenever we:
* Commit an AUM which was previously soft-deleted (which we don't expect to happen in practice, and may indicate an issue with our sync code)
* Purge AUMs during a Compact operation.
* Successfully commit AUMs as part of a bootstrap or sync operation.
All three logs mention `tka` for ease of discoverability.
Updates tailscale/corp#39455
Change-Id: I2b07bb0ef075877f40ec34b80bb668be59e1cdc3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-08 | vmtest: add VM-based integration test framework | Brad Fitzpatrick | 12 | -11/+1382
Add tstest/natlab/vmtest, a high-level framework for running multi-VM integration tests with mixed OS types (gokrazy + Ubuntu/Debian cloud images) connected via natlab's vnet virtual network.
The vmtest package provides:
- Env type that orchestrates vnet, QEMU processes, and agent connections
- OS image support (Gokrazy, Ubuntu2404, Debian12) with download/cache
- QEMU launch per OS type (microvm for gokrazy, q35+KVM for cloud)
- Cloud-init seed ISO generation with network-config for multi-NIC
- Cross-compilation of test binaries for cloud VMs
- Debug SSH NIC on cloud VMs for interactive debugging
- Test helpers: ApproveRoutes, HTTPGet, TailscalePing, DumpStatus, WaitForPeerRoute, SSHExec
TTA enhancements (cmd/tta):
- Parameterize /up (accept-routes, advertise-routes, snat-subnet-routes)
- Add /set, /start-webserver, /http-get endpoints
- /http-get uses local.Client.UserDial for Tailscale-routed requests
- Fix /ping for non-gokrazy systems
TestSubnetRouter exercises a 3-VM subnet router scenario:
    client (gokrazy) → subnet-router (Ubuntu, dual-NIC) → backend (gokrazy)
Verifies HTTP access to the backend webserver through the Tailscale subnet route. Passes in ~30 seconds.
Updates tailscale/tailscale#13038
Change-Id: I165b64af241d37f5f5870e796a52502fc56146fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 | tsweb: add TS_DEBUG_TRUSTED_CIDRS envknob to debug (#19283) | Jason O'Donnell | 2 | -0/+129
Add a new envknob that allows connections from trusted CIDR ranges to access debug endpoints without Tailscale authentication. This is useful for in-cluster scrapers like Prometheus that are not on a tailnet, do not have static IP addresses and cannot use debug keys. Fixes #19282 Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
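A hedged sketch of what a trusted-CIDR check can look like using the standard net/netip package. The comma-separated value format and the helper names `parseTrustedCIDRs` and `allowDebug` are assumptions made for illustration; the log does not state how tsweb parses TS_DEBUG_TRUSTED_CIDRS.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseTrustedCIDRs parses a comma-separated list of CIDR prefixes.
// The comma-separated format is an assumption, not taken from tsweb.
func parseTrustedCIDRs(s string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, f := range strings.Split(s, ",") {
		f = strings.TrimSpace(f)
		if f == "" {
			continue
		}
		p, err := netip.ParsePrefix(f)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", f, err)
		}
		out = append(out, p.Masked())
	}
	return out, nil
}

// allowDebug reports whether remoteIP falls inside any trusted prefix.
// Unparseable addresses are rejected.
func allowDebug(remoteIP string, trusted []netip.Prefix) bool {
	addr, err := netip.ParseAddr(remoteIP)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	for _, p := range trusted {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	trusted, err := parseTrustedCIDRs("10.0.0.0/8, fd00::/8")
	if err != nil {
		panic(err)
	}
	for _, ip := range []string{"10.1.2.3", "192.168.1.1", "fd00::1"} {
		fmt.Println(ip, allowDebug(ip, trusted))
	}
}
```

Matching on ranges rather than single addresses is what makes this usable for scrapers such as in-cluster Prometheus pods, whose IPs change but stay within a known pod CIDR.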
2026-04-08 | misc: add install-git-hooks.go and git hook for Change-Id tracking | Brad Fitzpatrick | 5 | -3/+408
Add misc/install-git-hooks.go and misc/git_hook/ to the OSS repo, adapted from the corp repo. The primary motivation is Change-Id generation in commit messages, which provides a persistent identifier for a change across cherry-picks between branches. The installer uses "git rev-parse --git-common-dir" instead of go-git to find the hooks directory, avoiding a new direct dependency while still supporting worktrees.
Hooks included:
- commit-msg: adds Change-Id trailer
- pre-commit: blocks NOCOMMIT / DO NOT SUBMIT markers
- pre-push: blocks local-directory replace directives in go.mod
- post-checkout: warns when the hook binary is outdated
Also update docs/commit-messages.md to reflect that Change-Id is no longer optional in the OSS repo.
Updates tailscale/corp#39860
Change-Id: I09066b889118840c0ec6995cc03a9cf464740ffa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 | tool/goexe: refactor to use windows_sys | Nathan Perry | 5 | -249/+79
Updates #19255 Signed-off-by: Nathan Perry <nathan@tailscale.com> Change-Id: Idf69f23b5a61417d5fa3638a276d64856a6a6964
2026-04-08 | tool: replace go.cmd with a 19KB Rust go.exe wrapper | Brad Fitzpatrick | 11 | -107/+757
go.cmd used cmd.exe to invoke PowerShell, which mangled arguments: cmd.exe treats ^ as an escape character (so -run "^$" became -run "$", running all tests instead of none) and = signs also caused issues in the PowerShell→cmd.exe argument passing layer. Replace it with a tiny no_std Rust binary (19KB, 32-bit x86 for universal Windows compat: x86/x64/ARM64) that directly invokes the Tailscale Go toolchain via CreateProcessW. The raw command line from GetCommandLineW is passed through to CreateProcessW with only argv[0] replaced, so arguments are never parsed or re-escaped. The binary also handles first-run toolchain download natively using curl.exe and tar.exe (both ship with Windows 10+), so PowerShell is no longer required for normal operation. The PowerShell fallback is only used for the rare TS_USE_GOCROSS=1 path. PowerShell prefers go.exe over go.cmd when resolving ./tool/go, so this is a drop-in replacement. With go.exe in place, the CI can use the natural -bench=. -benchtime=1x -run="^$" flags directly. Also removes tool/go-win.ps1 which is now unused. Updates #19255 Change-Id: I80da23285b74796e7694b89cff29a9fa0eaa6281 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 | tstest/natlab/vnet: add multi-NIC node support, DHCP fixes, and VIPs | Brad Fitzpatrick | 4 | -27/+314
Multi-NIC support:
- Add nodeNIC type and node.extraNICs for secondary network interfaces
- Add netForMAC/macForNet to route packets to the correct network by MAC
- Update initFromConfig to allocate a MAC + LAN IP per network
- Fix handleEthernetFrameFromVM, ServeUnixConn to use netForMAC
- Fix MACOfIP, writeEth, WriteUDPPacketNoNAT, gVisor write path, and createARPResponse to use macForNet (return the MAC actually on that network, not the node's primary MAC)
- Fix createDHCPResponse for multi-NIC (correct client IP and subnet)
- Add nodeNICMac for secondary NIC MAC generation
- Add Node accessors: NumNICs, NICMac, Networks, LanIP
DHCP fixes:
- Include LeaseTime, SubnetMask, Router, DNS in DHCP Offer (not just Ack). systemd-networkd requires these to accept an Offer.
- Fix DHCP response source IP: use gateway IP instead of echoing the request's destination (which was 255.255.255.255 for discovers)
New VIPs:
- cloud-init.tailscale: serves per-node cloud-init meta-data, user-data, and network-config for VMs booting with nocloud datasource
- files.tailscale: serves binary files (tta, tailscale, tailscaled) registered via RegisterFile for cloud VM provisioning
- Add ControlServer() accessor for test control server
This is necessary for a three-VM natlab subnet router integration test, coming later.
Updates #13038
Change-Id: I59f9f356bae9b5509c117265237983972dfdd5af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 | tstest/integration/testcontrol: notify peers when subnet routes change | Brad Fitzpatrick | 1 | -0/+7
When SetSubnetRoutes is called, also send updatePeerChanged to all other connected nodes so they re-fetch their MapResponse and learn about the updated AllowedIPs. Without this, peers never see new subnet routes until they happen to reconnect to the control server. Discovered while working on a three-VM natlab subnet router integration test, coming later. Updates #13038 Change-Id: I20e7a2fda994a8ab0e7a24240e6eae536f4f5f15 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-08 | control/controlclient: avoid calls to ms.netmap() (#19281) | Claus Lensbøl | 2 | -18/+13
Instead of generating the full netmap, just fetch the peers out of the existing peers map. The extra usage was introduced with netmap caching, but there is no need to call the netmap to get this information; the existing peermap can be used instead. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-08 | wgengine/netstack: allow UDP listeners to receive traffic on Service VIP addresses (#18972) | Tom Meadows | 2 | -0/+216
Fixes UDP listeners on VIP Service addresses not receiving inbound traffic.
- Modified shouldProcessInbound to check for registered UDP transport endpoints when processing packets to service VIPs
- Uses FindTransportEndpoint to determine if a UDP listener exists for the destination VIP/port
- Supports both IPv4 and IPv6
The aim was to mirror the existing TCP logic, providing feature parity for UDP-based services on VIP Services.
Fixes #18971
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>