tailscale - The easiest, most secure way to use WireGuard and 2FA

Age	Commit message	Author	Files	Lines
2024-12-06	Finish up the fix, automated test [tomhjp/consistent-state-test]	Tom Proctor	3	-176/+162
	Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-05	local interactive test code	Tom Proctor	1	-13/+124
	Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-05	cmd/containerboot: wait for consistent state on shutdown	Tom Proctor	3	-0/+117
	tailscaled's ipn package writes a collection of keys to state after authenticating to control, but one at a time. If containerboot happens to send a SIGTERM signal to tailscaled in the middle of writing those keys, it may shut down with an inconsistent state Secret and never recover. While we can't durably fix this with our current single-use auth keys (no atomic operation to auth + write state), we can reduce the window for this race condition by checking for partial state before sending SIGTERM to tailscaled. Best effort only. Updates #14080 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-05	cmd/k8s-operator: don't error for transient failures (#14073)	Tom Proctor	8	-17/+84
	Every so often, the ProxyGroup and other controllers lose an optimistic locking race with other controllers that update the objects they create. Stop treating this as an error event, and instead just log an info level log line for it. Fixes #14072 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-04	cmd/tailscale,net/netcheck: add debug feature to force preferred DERP	James Tucker	7	-1/+140
	This provides an interface for a user to force a preferred DERP outcome for all future netchecks that will take precedence unless the forced region is unreachable. The option does not persist and will be lost when the daemon restarts. Updates tailscale/corp#18997 Updates tailscale/corp#24755 Signed-off-by: James Tucker <james@tailscale.com>
2024-12-04	net/tstun: remove tailscaled_outbound_dropped_packets_total reason=acl metric for now	Brad Fitzpatrick	2	-4/+5
	Updates #14280 Change-Id: Idff102b3d7650fc9dfbe0c340168806bdf542d76 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-12-04	cmd/{containerboot,k8s-operator},kube/kubetypes: kube Ingress L7 proxies only advertise HTTPS endpoint when ready (#14171)	Irbe Krumina	12	-128/+443
	cmd/containerboot,kube/kubetypes,cmd/k8s-operator: detect if Ingress is created in a tailnet that has no HTTPS. This attempts to make Kubernetes Operator L7 Ingress setup failures more explicit: - the Ingress resource now only advertises the HTTPS endpoint via status.ingress.loadBalancer.hostname when/if the proxy has successfully loaded serve config - the proxy attempts to catch cases where HTTPS is disabled for the tailnet and logs a warning. Updates tailscale/tailscale#12079 Updates tailscale/tailscale#10407 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-04	cmd/k8s-operator: fix a bunch of status equality checks (#14270)	Irbe Krumina	8	-15/+15
	Updates tailscale/tailscale#14269 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-03	cmd/k8s-operator/deploy/chart: allow reading OAuth creds from a CSI driver's volume and annotating operator's Service account (#14264)	Oliver Rahner	3	-4/+30
	cmd/k8s-operator/deploy/chart: allow reading OAuth creds from a CSI driver's volume and annotating operator's Service account Updates #14264 Signed-off-by: Oliver Rahner <o.rahner@dke-data.com>
2024-12-03	cmd/k8s-operator: avoid port collision with metrics endpoint (#14185)	Tom Proctor	1	-7/+7
	When the operator enables metrics on a proxy, it uses the port 9001, and in the near future it will start using 9002 for the debug endpoint as well. Make sure we don't choose ports from a range that includes 9001 so that we never clash. Setting TS_SOCKS5_SERVER, TS_HEALTHCHECK_ADDR_PORT, TS_OUTBOUND_HTTP_PROXY_LISTEN, and PORT could also open arbitrary ports, so we will need to document that users should not choose ports from the 10000-11000 range for those settings. Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-03	cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor (#14248)	Irbe Krumina	21	-22/+877
	* cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor Adds a new spec.metrics.serviceMonitor field to ProxyClass. If that's set to true (and metrics are enabled), the operator will create a Prometheus ServiceMonitor for each proxy to which the ProxyClass applies. Additionally, create a metrics Service for each proxy that has metrics enabled. Updates tailscale/tailscale#11292 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-03	cmd/k8s-operator,docs/k8s: run tun mode proxies in privileged containers (#14262)	Irbe Krumina	9	-41/+36
	We were previously relying on unintended behaviour by runc where all containers were by default given read/write/mknod permissions for tun devices. This behaviour was removed in https://github.com/opencontainers/runc/pull/3468 and released in runc 1.2. The containerd container runtime, used by Docker and the majority of Kubernetes distributions, bumped runc to 1.2 in 1.7.24 https://github.com/containerd/containerd/releases/tag/v1.7.24 thus breaking our reference tun mode Tailscale Kubernetes manifests and Kubernetes operator proxies. This PR changes all the Kubernetes container configs that run Tailscale in tun mode to privileged. This should not be a breaking change because all these containers would run in a Pod that already has a privileged init container. Updates tailscale/tailscale#14256 Updates tailscale/tailscale#10814 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-02	IPN: Update ServeConfig to accept configuration for Services.	KevinLiang10	4	-2/+144
	This commit updates ServeConfig to allow configuration to Services (VIPServices for now) via Serve. The scope of this commit is only adding the Services field to ServeConfig. The field doesn't actually allow packet flowing yet. The purpose of this commit is to unblock other work on k8s end. Updates #22953 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2024-12-02	net/netcheck: clean up ICMP probe AddrPort lookup	Brad Fitzpatrick	2	-29/+36
	Fixes #14200 Change-Id: Ib086814cf63dda5de021403fe1db4fb2a798eaae Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-12-02	cmd/containerboot: serve health on local endpoint (#14246)	Tom Proctor	7	-66/+251
	* cmd/containerboot: serve health on local endpoint We introduced stable (user) metrics in #14035, and `TS_LOCAL_ADDR_PORT` with it. Rather than requiring users to specify a new addr/port combination for each new local endpoint they want the container to serve, this combines the health check endpoint onto the local addr/port used by metrics if `TS_ENABLE_HEALTH_CHECK` is used instead of `TS_HEALTHCHECK_ADDR_PORT`. `TS_LOCAL_ADDR_PORT` now defaults to binding to all interfaces on 9002 so that it works more seamlessly and with less configuration in environments other than Kubernetes, where the operator always overrides the default anyway. In particular, listening on localhost would not be accessible from outside the container, and many scripted container environments do not know the IP address of the container before it's started. Listening on all interfaces allows users to just set one env var (`TS_ENABLE_METRICS` or `TS_ENABLE_HEALTH_CHECK`) to get a fully functioning local endpoint they can query from outside the container. Updates #14035, #12898 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-02	cmd/checkmetrics: add command for checking metrics against kb	Brad Fitzpatrick	2	-0/+142
	This commit adds a command to validate that all the metrics that are registered in the client are also present in a path or URL. It is intended to be run from the KB against the latest version of tailscale. Updates tailscale/corp#24066 Updates tailscale/corp#22075 Co-Authored-By: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-29	cmd/k8s-operator: always set stateful filtering to false (#14216)	Irbe Krumina	3	-22/+11
	Updates tailscale/tailscale#12108 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-29	Makefile,./build_docker.sh: update kube operator image build target name (#14251)	Irbe Krumina	2	-2/+2
	Updates tailscale/corp#24540 Updates tailscale/tailscale#12914 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-29	cmd/k8s-operator: fix port name change bug for egress ProxyGroup proxies (#14247)	Irbe Krumina	3	-24/+77
	Ensure that the ExternalName Service port names are always synced to the ClusterIP Service, to fix a bug where if users created a Service with a single unnamed port and later changed to 1+ named ports, the operator attempted to apply an invalid multi-port Service with an unnamed port. Also fixes a small internal issue where not-yet Service status conditions were lost on a spec update. Updates tailscale/tailscale#10102 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-28	tsnet: remove flaky test marker from metrics	Kristoffer Dalby	1	-4/+4
	Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-28	wgengine/magicsock: packet/bytes metrics should not count disco	Kristoffer Dalby	1	-3/+3
	Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-28	tsnet: validate sent data in metrics test	Kristoffer Dalby	1	-7/+13
	Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-28	tsnet: split bytes and routes metrics tests	Kristoffer Dalby	1	-61/+123
	Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-28	tsnet: send less data in metrics integration test	Kristoffer Dalby	1	-8/+6
	This commit reduces the amount of data sent in the metrics data integration test from 10MB to 1MB. On various machines 10MB was quite flaky, while 1MB has not failed once in 10000 runs. Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-28	health: move health metrics test to health_test	Kristoffer Dalby	3	-33/+50
	Updates #13420 Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2024-11-27	logtail: avoid bytes.Buffer allocation (#11858)	Joe Tsai	1	-2/+10
	Re-use a pre-allocated bytes.Buffer struct and shallow copy the result of bytes.NewBuffer into it to avoid allocating the struct. Note that we're only reusing the bytes.Buffer struct itself and not the underlying []byte temporarily stored within it. Updates #cleanup Updates tailscale/corp#18514 Updates golang/go#67004 Signed-off-by: Joe Tsai <joetsai@digital-static.net>
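The struct-reuse idea described in this commit can be sketched roughly as follows. This is a minimal illustration, not the logtail code itself: the `uploader` type and its `reader` method are hypothetical names introduced only for the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// uploader is a hypothetical stand-in for a long-lived object that
// repeatedly needs a *bytes.Buffer view over caller-provided bytes.
type uploader struct {
	buf bytes.Buffer // allocated once together with the uploader, then reused
}

// reader returns a *bytes.Buffer over b. The value produced by
// bytes.NewBuffer is shallow-copied into the existing field, so the
// returned pointer refers to the reused struct rather than a new one.
// Only the struct is reused; the slice b still belongs to the caller.
func (u *uploader) reader(b []byte) *bytes.Buffer {
	u.buf = *bytes.NewBuffer(b)
	return &u.buf
}

func main() {
	var u uploader
	fmt.Println(u.reader([]byte("first")).String())
	fmt.Println(u.reader([]byte("second")).String())
}
```

The trade-off is that each call to `reader` invalidates the view returned by the previous call, so this only suits callers that finish with one buffer before requesting the next.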
2024-11-27	ipn/localapi: count localapi requests to metric endpoints	Anton Tolchanov	1	-1/+5
	Updates tailscale/corp#22075 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-11-26	control/controlhttp: set *health.Tracker in tests	Andrew Dunham	1	-0/+3
	Observed during another PR: https://github.com/tailscale/tailscale/actions/runs/12040045880/job/33569141807 Updates #cleanup Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I9e0f49a35485fa2e097892737e5e3c95bf775a90
2024-11-26	cmd/tailscale/cli: fix format string	Nick Khyl	1	-2/+2
	Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-26	ipn/ipnlocal: only check CanUseExitNode if we are attempting to use one (#14230)	Mario Minardi	1	-1/+6
	In https://github.com/tailscale/tailscale/pull/13726 we added logic to `checkExitNodePrefsLocked` to error out on platforms where using an exit node is unsupported in order to give users more obvious feedback than having this silently fail downstream. The above change neglected to properly check whether the device in question was actually trying to use an exit node when doing the check and was incorrectly returning an error on any calls to `checkExitNodePrefsLocked` on platforms where using an exit node is not supported as a result. This change remedies this by adding a check to see whether the device is attempting to use an exit node before doing the `CanUseExitNode` check. Updates https://github.com/tailscale/corp/issues/24835 Signed-off-by: Mario Minardi <mario@tailscale.com>
2024-11-25	net/netmon: improve panic reporting from #14202	James Tucker	1	-2/+5
	I was hoping we'd catch an example input quickly, but the reporter had rebooted their machine and it is no longer exhibiting the behavior. As such this code may be sticking around quite a bit longer and we might encounter other errors, so include the panic in the log entry. Updates #14201 Updates #14202 Updates golang/go#70528 Signed-off-by: James Tucker <james@tailscale.com>
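A general pattern for turning a parser panic into a loggable error that carries both the panic value and the offending input might look like the sketch below. `parseSafely` and the `brittle` parser are hypothetical and are not the netmon code.

```go
package main

import "fmt"

// parseSafely calls parse and, if it panics, converts the panic into an
// error that includes the panic value and a hex dump of the input.
func parseSafely(buf []byte, parse func([]byte) (int, error)) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("parse panicked: %v; input (%d bytes): %x", r, len(buf), buf)
		}
	}()
	return parse(buf)
}

func main() {
	// brittle is a hypothetical parser that indexes past short inputs.
	brittle := func(b []byte) (int, error) { return int(b[3]), nil }
	_, err := parseSafely([]byte{0x01, 0x02}, brittle)
	fmt.Println(err)
}
```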
2024-11-25	docs/windows/policy: update ADMX policy definitions to reflect the syspolicy settings	Nick Khyl	2	-51/+91
	We add a policy definition for the AllowedSuggestedExitNodes syspolicy setting, allowing admins to configure a list of exit node IDs to be used as a pool for automatic suggested exit node selection. We update definitions for policy settings configurable on both a per-user and per-machine basis, such as UI customizations, to specify class="Both". Lastly, we update the help text for existing policy definitions to include a link to the KB article as the last line instead of in the first paragraph. Updates #12687 Updates tailscale/corp#19681 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-23	cmd/containerboot: preserve headers of metrics endpoints responses (#14204)	Irbe Krumina	1	-1/+1
	Updates tailscale/tailscale#11292 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-22	net/netmon: catch ParseRIB panic to gather buffer data	James Tucker	1	-1/+9
	Updates #14201 Updates golang/go#70528 Signed-off-by: James Tucker <james@tailscale.com>
2024-11-22	ipn/ipnlocal: rebuild allowed suggested exit nodes when syspolicy changes	Nick Khyl	1	-5/+38
	In this PR, we update LocalBackend to rebuild the set of allowed suggested exit nodes whenever the AllowedSuggestedExitNodes syspolicy setting changes. Additionally, we request a new suggested exit node when this occurs, enabling its use if the ExitNodeID syspolicy setting is set to auto:any. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	control/controlclient: use the most recent syspolicy.MachineCertificateSubject value	Nick Khyl	1	-11/+2
	This PR removes the sync.Once wrapper around retrieving the MachineCertificateSubject policy setting value, ensuring the most recent version is always used if it changes after the service starts. Although this policy setting is used by a very limited number of customers, recent support escalations have highlighted issues caused by outdated or incorrect policy values being applied. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
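The staleness that a `sync.Once` wrapper introduces can be seen in a small sketch. The `policyStore` and `cachedReader` types here are hypothetical and only imitate a policy source and a once-cached reader; they are not the syspolicy API.

```go
package main

import (
	"fmt"
	"sync"
)

// policyStore is a hypothetical in-memory policy source.
type policyStore struct {
	mu      sync.Mutex
	subject string
}

func (s *policyStore) set(v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.subject = v
}

func (s *policyStore) get() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.subject
}

// cachedReader imitates the old pattern: the first value read is kept
// for the lifetime of the process.
type cachedReader struct {
	once sync.Once
	val  string
}

func (c *cachedReader) subject(s *policyStore) string {
	c.once.Do(func() { c.val = s.get() })
	return c.val
}

func main() {
	store := &policyStore{subject: "CN=old"}
	var cached cachedReader
	fmt.Println(cached.subject(store), store.get())
	store.set("CN=new")
	// The cached reader still reports the old value; a direct read does not.
	fmt.Println(cached.subject(store), store.get())
}
```

Reading the setting at each use costs a lookup per call but always reflects the current policy, which is the behaviour the commit opts for.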
2024-11-22	ipn/ipnlocal: update ipn.Prefs when there's a change in syspolicy settings	Nick Khyl	2	-26/+199
	In this PR, we update ipnlocal.NewLocalBackend to subscribe to policy change notifications and reapply syspolicy settings to the current profile's ipn.Prefs whenever a change occurs. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	ipn/ipnlocal: move syspolicy handling from setExitNodeID to applySysPolicy	Nick Khyl	2	-45/+56
	This moves code that handles ExitNodeID/ExitNodeIP syspolicy settings from (*LocalBackend).setExitNodeID to applySysPolicy. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	cmd/tailscaled: log SCM interactions if the policy setting is enabled at the time of interaction	Nick Khyl	1	-5/+4
	This updates the syspolicy.LogSCMInteractions check to run at the time of an interaction, just before logging a message, instead of during service startup. This ensures the most recent policy setting is used if it has changed since the service started. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	cmd/tailscaled: flush DNS if FlushDNSOnSessionUnlock is true upon receiving a session change notification	Nick Khyl	1	-11/+10
	In this PR, we move the syspolicy.FlushDNSOnSessionUnlock check from service startup to when a session change notification is received. This ensures that the most recent policy setting value is used if it has changed since the service started. We also plan to handle session change notifications for unrelated reasons and need to decouple notification subscriptions from DNS anyway. Updates #12687 Updates tailscale/corp#18342 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	util/syspolicy/rsop: reduce policyReloadMinDelay and policyReloadMaxDelay when in tests	Nick Khyl	3	-9/+15
	These delays determine how soon syspolicy change callbacks are invoked after a policy setting is updated in a policy source. For tests, we shorten these delays to minimize unnecessary wait times. This adjustment only affects tests that subscribe to policy change notifications and modify policy settings after they have already been set. Initial policy settings are always available immediately without delay. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-11-22	ipn/{ipnlocal,localapi}, wgengine/netstack: call (*LocalBackend).Shutdown when tests that create them complete	Nick Khyl	4	-0/+8
	We have several places where LocalBackend instances are created for testing, but they are rarely shut down when the tests that created them exit. In this PR, we update newTestLocalBackend and similar functions to use testing.TB.Cleanup(lb.Shutdown) to ensure LocalBackend instances are properly shut down during test cleanup. Updates #12687 Signed-off-by: Nick Khyl <nickk@tailscale.com>
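The cleanup-registration pattern named in this commit can be sketched as follows. To keep the example runnable outside `go test`, it declares a one-method `cleanuper` interface (which `*testing.T` and `*testing.B` satisfy) and a `fakeTB`; the `backend` type and all helper names are hypothetical.

```go
package main

import "fmt"

// cleanuper is the single testing.TB method this sketch needs.
type cleanuper interface {
	Cleanup(func())
}

// backend is a hypothetical stand-in for a LocalBackend.
type backend struct{ shutdown bool }

func (b *backend) Shutdown() { b.shutdown = true }

// newTestBackend registers Shutdown as a cleanup at construction time,
// so callers cannot forget to shut the backend down.
func newTestBackend(tb cleanuper) *backend {
	b := &backend{}
	tb.Cleanup(b.Shutdown)
	return b
}

// fakeTB records cleanups so the sketch can run without the testing package.
type fakeTB struct{ cleanups []func() }

func (f *fakeTB) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }

// finish runs cleanups last-registered-first, as the testing package does.
func (f *fakeTB) finish() {
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

func main() {
	tb := &fakeTB{}
	b := newTestBackend(tb)
	fmt.Println("shut down before test end:", b.shutdown)
	tb.finish()
	fmt.Println("shut down after test end:", b.shutdown)
}
```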
2024-11-22	cmd/{containerboot,k8s-operator},k8s-operator: new options to expose user metrics (#14035)	Tom Proctor	14	-34/+472
	containerboot: Adds 3 new environment variables for containerboot, `TS_LOCAL_ADDR_PORT` (default `"${POD_IP}:9002"`), `TS_METRICS_ENABLED` (default `false`), and `TS_DEBUG_ADDR_PORT` (default `""`), to configure metrics and debug endpoints. In a follow-up PR, the health check endpoint will be updated to use the `TS_LOCAL_ADDR_PORT` if `TS_HEALTHCHECK_ADDR_PORT` hasn't been set. Users previously only had access to internal debug metrics (which are unstable and not recommended) via passing the `--debug` flag to tailscaled, but can now set `TS_METRICS_ENABLED=true` to expose the stable metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the addr/port specified by `TS_LOCAL_ADDR_PORT`. Users can also now configure a debug endpoint more directly via the `TS_DEBUG_ADDR_PORT` environment variable. This is not recommended for production use, but exposes an internal set of debug metrics and pprof endpoints. operator: The `ProxyClass` CRD's `.spec.metrics.enable` field now enables serving the stable user metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the same "metrics" container port that debug metrics were previously served on. To smooth the transition for anyone relying on the way the operator previously consumed this field, we also _temporarily_ serve tailscaled's internal debug metrics on the same `/debug/metrics` path as before, until 1.82.0 when debug metrics will be turned off by default even if `.spec.metrics.enable` is set. At that point, anyone who wishes to continue using the internal debug metrics (not recommended) will need to set the new `ProxyClass` field `.spec.statefulSet.pod.tailscaleContainer.debug.enable`. Users who wish to opt out of the transitional behaviour, where enabling `.spec.metrics.enable` also enables debug metrics, can set `.spec.statefulSet.pod.tailscaleContainer.debug.enable` to false (recommended).
Separately but related, the operator will no longer specify a host port for the "metrics" container port definition. This caused scheduling conflicts when k8s needs to schedule more than one proxy per node, and was not necessary for allowing the pod's port to be exposed to prometheus scrapers. Updates #11292 --------- Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-11-22	cmd/k8s-operator/deploy: ensure that operator can write kube state Events (#14177)	Irbe Krumina	2	-0/+16
	A small follow-up to #14112; ensures that the operator itself can emit Events for its kube state store changes. Updates tailscale/tailscale#14080 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-21	cli: present risk warning when setting up app connector on macOS (#14181)	Andrea Gottardo	3	-3/+23

2024-11-21	net/tsaddr: include test input in test failure output	Brad Fitzpatrick	1	-2/+2
	https://go.dev/wiki/CodeReviewComments#useful-test-failures (Previously it was using subtests with names including the input, but once those went away, there was no context left) Updates #14169 Change-Id: Ib217028183a3d001fe4aee58f2edb746b7b3aa88 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-20	cmd/tailscale/cli: create netmon in debug ts2021	Andrew Dunham	2	-0/+9
	Otherwise we'll see a panic if we hit the dnsfallback code and try to call NewDialer with a nil NetMon. Updates #14161 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I81c6e72376599b341cb58c37134c2a948b97cf5f
2024-11-20	util/fastuuid: delete unused package	Brad Fitzpatrick	2	-128/+0
	Its sole user was deleted in 02cafbe1cadfc. And it has no public users: https://pkg.go.dev/tailscale.com/util/fastuuid?tab=importedby And nothing in other Tailscale repos that I can find. Updates tailscale/corp#24721 Change-Id: I8755770a255a91c6c99f596e6d10c303b3ddf213 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-20	tsweb: change RequestID format to have a date in it	Brad Fitzpatrick	5	-13/+35
	So we can locate them in logs more easily. Updates tailscale/corp#24721 Change-Id: Ia766c75608050dde7edc99835979a6e9bb328df2 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-11-20	net/tsaddr: extract IsTailscaleIPv4 from IsTailscaleIP (#14169)	James Scott	2	-2/+76
	Extracts tsaddr.IsTailscaleIPv4 out of tsaddr.IsTailscaleIP. This will allow for checking valid Tailscale assigned IPv4 addresses without checking IPv6 addresses. Updates #14168 Updates tailscale/corp#24620 Signed-off-by: James Scott <jim@tailscale.com>