summaryrefslogtreecommitdiffhomepage
path: root/k8s-operator
AgeCommit message (Collapse)AuthorFilesLines
2026-01-23all: remove AUTHORS file and references to itWill Norris36-36/+36
This file was never truly necessary and has never actually been used in the history of Tailscale's open source releases. A Brief History of AUTHORS files --- The AUTHORS file was a pattern developed at Google, originally for Chromium, then adopted by Go and a bunch of other projects. The problem was that Chromium originally had a copyright line only recognizing Google as the copyright holder. Because Google (and most open source projects) do not require copyright assignemnt for contributions, each contributor maintains their copyright. Some large corporate contributors then tried to add their own name to the copyright line in the LICENSE file or in file headers. This quickly becomes unwieldy, and puts a tremendous burden on anyone building on top of Chromium, since the license requires that they keep all copyright lines intact. The compromise was to create an AUTHORS file that would list all of the copyright holders. The LICENSE file and source file headers would then include that list by reference, listing the copyright holder as "The Chromium Authors". This also become cumbersome to simply keep the file up to date with a high rate of new contributors. Plus it's not always obvious who the copyright holder is. Sometimes it is the individual making the contribution, but many times it may be their employer. There is no way for the proejct maintainer to know. Eventually, Google changed their policy to no longer recommend trying to keep the AUTHORS file up to date proactively, and instead to only add to it when requested: https://opensource.google/docs/releasing/authors. They are also clear that: > Adding contributors to the AUTHORS file is entirely within the > project's discretion and has no implications for copyright ownership. It was primarily added to appease a small number of large contributors that insisted that they be recognized as copyright holders (which was entirely their right to do). But it's not truly necessary, and not even the most accurate way of identifying contributors and/or copyright holders. In practice, we've never added anyone to our AUTHORS file. It only lists Tailscale, so it's not really serving any purpose. It also causes confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header in other open source repos which don't actually have an AUTHORS file, so it's ambiguous what that means. Instead, we just acknowledge that the contributors to Tailscale (whoever they are) are copyright holders for their individual contributions. We also have the benefit of using the DCO (developercertificate.org) which provides some additional certification of their right to make the contribution. The source file changes were purely mechanical with: git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g' Updates #cleanup Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d Signed-off-by: Will Norris <will@tailscale.com>
2026-01-21cmd/k8s-operator,k8s-operator: Allow the use of multiple tailnets (#18344)David Bond13-0/+1181
This commit contains the implementation of multi-tailnet support within the Kubernetes Operator Each of our custom resources now expose the `spec.tailnet` field. This field is a string that must match the name of an existing `Tailnet` resource. A `Tailnet` resource looks like this: ```yaml apiVersion: tailscale.com/v1alpha1 kind: Tailnet metadata: name: example # This is the name that must be referenced by other resources spec: credentials: secretName: example-oauth ``` Each `Tailnet` references a `Secret` resource that contains a set of oauth credentials. This secret must be created in the same namespace as the operator: ```yaml apiVersion: v1 kind: Secret metadata: name: example-oauth # This is the name that's referenced by the Tailnet resource. namespace: tailscale stringData: client_id: "client-id" client_secret: "client-secret" ``` When created, the operator performs a basic check that the oauth client has access to all required scopes. This is done using read actions on devices, keys & services. While this doesn't capture a missing "write" permission, it catches completely missing permissions. Once this check passes, the `Tailnet` moves into a ready state and can be referenced. Attempting to use a `Tailnet` in a non-ready state will stall the deployment of `Connector`s, `ProxyGroup`s and `Recorder`s until the `Tailnet` becomes ready. The `spec.tailnet` field informs the operator that a `Connector`, `ProxyGroup`, or `Recorder` must be given an auth key generated using the specified oauth client. For backwards compatibility, the set of credentials the operator is configured with are considered the default. That is, where `spec.tailnet` is not set, the resource will be deployed in the same tailnet as the operator. Updates https://github.com/tailscale/corp/issues/34561
2026-01-19k8s-operator,kube: remove enableSessionRecording from Kubernetes Cap Map ↵Tom Meadows1-10/+4
(#18452) * k8s-operator,kube: removing enableSessionRecordings option. It seems like it is going to create a confusing user experience and it's going to be a very niche use case, so we have decided to defer this for now. Updates tailscale/corp#35796 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> * k8s-operator: adding metric for env var deprecation Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk> --------- Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-01-16k8s-operator,kube: allowing k8s api request events to be enabled via grants ↵Tom Meadows4-49/+107
(#18393) Updates #35796 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-11-20cmd/k8s-operator: add multi replica support for recorders (#17864)David Bond3-1/+14
This commit adds the `spec.replicas` field to the `Recorder` custom resource that allows for a highly available deployment of `tsrecorder` within a kubernetes cluster. Many changes were required here as the code hard-coded the assumption of a single replica. This has required a few loops, similar to what we do for the `Connector` resource to create auth and state secrets. It was also required to add a check to remove dangling state and auth secrets should the recorder be scaled down. Updates: https://github.com/tailscale/tailscale/issues/17965 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-11-18all: rename variables with lowercase-l/uppercase-IAlex Chan2-6/+6
See http://go/no-ell Signed-off-by: Alex Chan <alexc@tailscale.com> Updates #cleanup Change-Id: I8c976b51ce7a60f06315048b1920516129cc1d5d
2025-10-17cmd/k8s-operator: allow pod tolerations on nameservers (#17260)David Bond3-0/+54
This commit modifies the `DNSConfig` custom resource to allow specifying [tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) on the nameserver pods. This will allow users to dictate where their nameserver pods are located within their clusters. Fixes: https://github.com/tailscale/tailscale/issues/17092 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-10-16k8s-operator/api-proxy: put kube api server events behind environment ↵David Bond2-0/+10
variable (#17550) This commit modifies the k8s-operator's api proxy implementation to only enable forwarding of api requests to tsrecorder when an environment variable is set. This new environment variable is named `TS_EXPERIMENTAL_KUBE_API_EVENTS`. Updates https://github.com/tailscale/corp/issues/32448 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-10-08k8s-operator/sessionrecording: gives the connection to the recorder from the ↵Tom Meadows2-3/+12
hijacker a dedicated context (#17403) The hijacker on k8s-proxy's reverse proxy is used to stream recordings to tsrecorder as they pass through the proxy to the kubernetes api server. The connection to the recorder was using the client's (e.g., kubectl) context, rather than a dedicated one. This was causing the recording stream to get cut off in scenarios where the client cancelled the context before streaming could be completed. By using a dedicated context, we can continue streaming even if the client cancels the context (for example if the client request completes). Fixes #17404 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-10-08cmd/tsrecorder: adds sending api level logging to tsrecorder (#16960)Tom Meadows2-8/+683
Updates #17141 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-10-01all: use Go 1.20's errors.Join instead of our multierr packageBrad Fitzpatrick2-6/+4
Updates #7123 Change-Id: Ie9be6814831f661ad5636afcd51d063a0d7a907d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-09-30cmd/k8s-operator: add DNS policy and config support to ProxyClass (#16887)Raj Singh3-0/+23
DNS configuration support to ProxyClass, allowing users to customize DNS resolution for Tailscale proxy pods. Fixes #16886 Signed-off-by: Raj Singh <raj@tailscale.com>
2025-09-29cmd/k8s-operator: add replica support to nameserver (#17246)David Bond3-0/+10
This commit modifies the `DNSConfig` custom resource to allow specifying a replica count when deploying a nameserver. This allows deploying nameservers in a HA configuration. Updates https://github.com/tailscale/corp/issues/32589 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-09-25k8s-operator: add IPv6 support for DNS records (#16691)Raj Singh3-2/+5
This change adds full IPv6 support to the Kubernetes operator's DNS functionality, enabling dual-stack and IPv6-only cluster support. Fixes #16633 Signed-off-by: Raj Singh <raj@tailscale.com>
2025-09-02cmd/k8s-operator: allow specifying replicas for connectors (#16721)David Bond3-4/+94
This commit adds a `replicas` field to the `Connector` custom resource that allows users to specify the number of desired replicas deployed for their connectors. This allows users to deploy exit nodes, subnet routers and app connectors in a highly available fashion. Fixes #14020 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-08-22cmd/k8s-proxy,k8s-operator: fix serve config for userspace mode (#16919)Tom Proctor1-7/+23
The serve code leaves it up to the system's DNS resolver and netstack to figure out how to reach the proxy destination. Combined with k8s-proxy running in userspace mode, this means we can't rely on MagicDNS being available or tailnet IPs being routable. I'd like to implement that as a feature for serve in userspace mode, but for now the safer fix to get kube-apiserver ProxyGroups consistently working in all environments is to switch to using localhost as the proxy target instead. This has a small knock-on in the code that does WhoIs lookups, which now needs to check the X-Forwarded-For header that serve populates to get the correct tailnet IP to look up, because the request's remote address will be loopback. Fixes #16920 Change-Id: I869ddcaf93102da50e66071bb00114cc1acc1288 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-31cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName` (#16685)Lee Briggs2-0/+6
* cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName` Fixes #16682 Signed-off-by: Lee Briggs <lee@leebriggs.co.uk> * Update k8s-operator/apis/v1alpha1/types_proxyclass.go Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com> * run make kube-generate-all Change-Id: I5f8f16694fdc181b048217b9f05ec2ee2aa04def Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com> --------- Signed-off-by: Lee Briggs <lee@leebriggs.co.uk> Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com> Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-28k8s-operator: fix test flake (#16680)Tom Proctor1-13/+23
This occasionally panics waiting on a nil ctx, but was missed in the previous PR because it's quite a rare flake as it needs to progress to a specific point in the parser. Updates #16678 Change-Id: Ifd36dfc915b153aede36b8ee39eff83750031f95 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-28k8s-operator: handle multiple WebSocket frames per read (#16678)Tom Proctor5-52/+84
When kubectl starts an interactive attach session, it sends 2 resize messages in quick succession. It seems that particularly in HTTP mode, we often receive both of these WebSocket frames from the underlying connection in a single read. However, our parser currently assumes 0-1 frames per read, and leaves the second frame in the read buffer until the next read from the underlying connection. It doesn't take long after that before we end up failing to skip a control message as we normally should, and then we parse a control message as though it will have a stream ID (part of the Kubernetes protocol) and error out. Instead, we should keep parsing frames from the read buffer for as long as we're able to parse complete frames, so this commit refactors the messages parsing logic into a loop based on the contents of the read buffer being non-empty. k/k staging/src/k8s.io/kubectl/pkg/cmd/attach/attach.go for full details of the resize messages. There are at least a couple more multiple-frame read edge cases we should handle, but this commit is very conservatively fixing a single observed issue to make it a low-risk candidate for cherry picking. Updates #13358 Change-Id: Iafb91ad1cbeed9c5231a1525d4563164fc1f002f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-28k8s-operator: adding session type to cast header (#16660)Tom Meadows1-3/+4
Updates #16490 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-07-22cmd/{k8s-operator,k8s-proxy},kube: use consistent type for auth mode config ↵Tom Proctor2-5/+5
(#16626) Updates k8s-proxy's config so its auth mode config matches that we set in kube-apiserver ProxyGroups for consistency. Updates #13358 Change-Id: I95e29cec6ded2dc7c6d2d03f968a25c822bc0e01 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-21cmd/k8s-operator: Allow specifying cluster ips for nameservers (#16477)David Bond3-2/+48
This commit modifies the kubernetes operator's `DNSConfig` resource with the addition of a new field at `nameserver.service.clusterIP`. This field allows users to specify a static in-cluster IP address of the nameserver when deployed. Fixes #14305 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-07-21all-kube: create Tailscale Service for HA kube-apiserver ProxyGroup (#16572)Tom Proctor5-38/+119
Adds a new reconciler for ProxyGroups of type kube-apiserver that will provision a Tailscale Service for each replica to advertise. Adds two new condition types to the ProxyGroup, TailscaleServiceValid and TailscaleServiceConfigured, to post updates on the state of that reconciler in a way that's consistent with the service-pg reconciler. The created Tailscale Service name is configurable via a new ProxyGroup field spec.kubeAPISserver.ServiceName, which expects a string of the form "svc:<dns-label>". Lots of supporting changes were needed to implement this in a way that's consistent with other operator workflows, including: * Pulled containerboot's ensureServicesUnadvertised and certManager into kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to aid Service cert sharing between replicas and graceful Service shutdown. * For certManager, add an initial wait to the cert loop to wait until the domain appears in the devices's netmap to avoid a guaranteed error on the first issue attempt when it's quick to start. * Made several methods in ingress-for-pg.go and svc-for-pg.go into functions to share with the new reconciler * Added a Resource struct to the owner refs stored in Tailscale Service annotations to be able to distinguish between Ingress- and ProxyGroup- based Services that need cleaning up in the Tailscale API. * Added a ListVIPServices method to the internal tailscale client to aid cleaning up orphaned Services * Support for reading config from a kube Secret, and partial support for config reloading, to prevent us having to force Pod restarts when config changes. * Fixed up the zap logger so it's possible to set debug log level. Updates #13358 Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-14k8s-operator,sessionrecording: fixing race condition between resize (#16454)Tom Meadows9-239/+345
messages and cast headers when recording `kubectl attach` sessions Updates #16490 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-07-09cmd/k8s-operator: don't require generation for Available condition (#16497)Tom Proctor1-6/+7
The observed generation was set to always 0 in #16429, but this had the knock-on effect of other controllers considering ProxyGroups never ready because the observed generation is never up to date in proxyGroupCondition. Make sure the ProxyGroupAvailable function does not requires the observed generation to be up to date, and add testing coverage to catch regressions. Updates #16327 Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-09cmd/{k8s-operator,k8s-proxy}: add kube-apiserver ProxyGroup type (#16266)Tom Proctor6-158/+178
Adds a new k8s-proxy command to convert operator's in-process proxy to a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy reads in a new config file written by the operator, modelled on tailscaled's conffile but with some modifications to ensure multiple versions of the config can co-exist within a file. This should make it much easier to support reading that config file from a Kube Secret with a stable file name. To avoid needing to give the operator ClusterRole{,Binding} permissions, the helm chart now optionally deploys a new static ServiceAccount for the API Server proxy to use if in auth mode. Proxies deployed by kube-apiserver ProxyGroups currently work the same as the operator's in-process proxy. They do not yet leverage Tailscale Services for presenting a single HA DNS name. Updates #13358 Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-07cmd/k8s-operator: always set ProxyGroup status conditions (#16429)Tom Proctor2-2/+6
Refactors setting status into its own top-level function to make it easier to ensure we _always_ set the status if it's changed on every reconcile. Previously, it was possible to have stale status if some earlier part of the provision logic failed. Updates #16327 Change-Id: Idab0cfc15ae426cf6914a82f0d37a5cc7845236b Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-06-27cmd/{containerboot,k8s-operator}: use state Secret for checking device auth ↵Tom Proctor2-6/+15
(#16328) Previously, the operator checked the ProxyGroup status fields for information on how many of the proxies had successfully authed. Use their state Secrets instead as a more reliable source of truth. containerboot has written device_fqdn and device_ips keys to the state Secret since inception, and pod_uid since 1.78.0, so there's no need to use the API for that data. Read it from the state Secret for consistency. However, to ensure we don't read data from a previous run of containerboot, make sure we reset containerboot's state keys on startup. One other knock-on effect of that is ProxyGroups can briefly be marked not Ready while a Pod is restarting. Introduce a new ProxyGroupAvailable condition to more accurately reflect when downstream controllers can implement flows that rely on a ProxyGroup having at least 1 proxy Pod running. Fixes #16327 Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-06-27cmd/k8s-operator, k8s-operator: support Static Endpoints on ProxyGroups (#16115)Tom Meadows4-0/+272
updates: #14674 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-05-19cmd/k8s-operator,kube/kubetypes,k8s-operator/apis: reconcile L3 HA Services ↵Tom Meadows1-0/+3
(#15961) This reconciler allows users to make applications highly available at L3 by leveraging Tailscale Virtual Services. Many Kubernetes Service's (irrespective of the cluster they reside in) can be mapped to a Tailscale Virtual Service, allowing access to these Services at L3. Updates #15895 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-05-19{cmd,}/k8s-operator: support IRSA for Recorder resources (#15913)Tom Proctor3-0/+71
Adds Recorder fields to configure the name and annotations of the ServiceAccount created for and used by its associated StatefulSet. This allows the created Pod to authenticate with AWS without requiring a Secret with static credentials, using AWS' IAM Roles for Service Accounts feature, documented here: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html Fixes #15875 Change-Id: Ib0e15c0dbc357efa4be260e9ae5077bacdcb264f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-05-06cmd/k8s-operator,k8s-operator/api-proxy: move k8s proxy code to library (#15857)Tom Proctor4-0/+657
The defaultEnv and defaultBool functions are copied over temporarily to minimise diff. This lays the ground work for having both the operator and the new k8s-proxy binary implement the API proxy Updates #13358 Change-Id: Ieacc79af64df2f13b27a18135517bb31c80a5a02 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-04-15k8s-operator: add age column to all custom resources (#15663)Satyam Soni5-0/+5
This change introduces an Age column in the output for all custom resources to enhance visibility into their lifecycle status. Fixes #15499 Signed-off-by: satyampsoni <satyampsoni@gmail.com>
2025-04-08net/{netx,memnet},all: add netx.DialFunc, move memnet Network implBrad Fitzpatrick2-3/+4
This adds netx.DialFunc, unifying a type we have a bazillion other places, giving it now a nice short name that's clickable in editors, etc. That highlighted that my earlier move (03b47a55c7956) of stuff from nettest into netx moved too much: it also dragged along the memnet impl, meaning all users of netx.DialFunc who just wanted netx for the type definition were instead also pulling in all of memnet. So move the memnet implementation netx.Network into memnet, a package we already had. Then use netx.DialFunc in a bunch of places. I'm sure I missed some. And plenty remain in other repos, to be updated later. Updates tailscale/corp#27636 Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-28cmd/k8s-operator,k8s-operator: enable HA Ingress again. (#15453)Irbe Krumina2-2/+2
Re-enable HA Ingress again that was disabled for 1.82 release. This reverts commit fea74a60d529bcccbc8ded74644256bb6f6c7727. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-03-26cmd/k8s-operator,k8s-operator: disable HA Ingress before stable release (#15433)v1.83.0-preIrbe Krumina2-2/+2
Temporarily make sure that the HA Ingress reconciler does not run, as we do not want to release this to stable just yet. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-03-21cmd/k8s-operator,k8s-operator: allow optionally using LE staging endpoint ↵Irbe Krumina2-0/+16
for Ingress (#15360) cmd/k8s-operator,k8s-operator: allow using LE staging endpoint for Ingress Allow to optionally use LetsEncrypt staging endpoint to issue certs for Ingress/HA Ingress, so that it is easier to experiment with initial Ingress setup without hiting rate limits. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-02-04cmd/k8s-operator: reinstate HA Ingress reconciler (#14887)Irbe Krumina2-2/+2
This change: - reinstates the HA Ingress controller that was disabled for 1.80 release - fixes the API calls to manage VIPServices as the API was changed - triggers the HA Ingress reconciler on ProxyGroup changes Updates tailscale/tailscale#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-30cmd/k8s-operator: temporarily disable HA Ingress controller (#14833)Irbe Krumina2-2/+2
The HA Ingress functionality is not actually doing anything valuable yet, so don't run the controller in 1.80 release yet. Updates tailscale/tailscale#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies ↵Irbe Krumina1-10/+0
don't terminate while cluster traffic is still routed to them (#14436) cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies This change is part of work towards minimizing downtime during update rollouts of egress ProxyGroup replicas. This change: - updates the containerboot health check logic to return Pod IP in headers, if set - always runs the health check for egress PG proxies - updates ClusterIP Services created for PG egress endpoints to include the health check endpoint - implements preshutdown endpoint in proxies. The preshutdown endpoint logic waits till, for all currently configured egress services, the ClusterIP Service health check endpoint is no longer returned by the shutting-down Pod (by looking at the new Pod IP header). - ensures that kubelet is configured to call the preshutdown endpoint This reduces the possibility that, as replicas are terminated during an update, a replica gets terminated to which cluster traffic is still being routed via the ClusterIP Service because kube proxy has not yet updated routig rules. This is not a perfect check as in practice, it only checks that the kube proxy on the node on which the proxy runs has updated rules. However, overall this might be good enough. The preshutdown logic is disabled if users have configured a custom health check port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in future setting of that env var for operator proxies might be disallowed (as users shouldn't need to configure this for a Pod directly). This is backwards compatible with earlier proxy versions. Updates tailscale/tailscale#14326 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-09cmd/k8s-operator,k8s-operator: allow users to set custom labels for the ↵Irbe Krumina3-7/+93
optional ServiceMonitor (#14475) * cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor Updates tailscale/tailscale#14381 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-08cmd/k8s-operator,k8s-operator: support ingress ProxyGroup type (#14548)Irbe Krumina2-6/+11
Currently this does not yet do anything apart from creating the ProxyGroup resources like StatefulSet. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-20cmd/k8s-operator,k8s-operator: include top-level CRD descriptions (#14435)Tom Proctor3-0/+28
When reading https://doc.crds.dev/github.com/tailscale/tailscale/tailscale.com/ProxyGroup/v1alpha1@v1.78.3 I noticed there is no top-level description for ProxyGroup and Recorder. Add one to give some high-level direction. Updates #cleanup Change-Id: I3666c5445be272ea5a1d4d02b6d5ad4c23afb09f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-11cmd/k8s-operator,k8s-operator: operator integration tests (#12792)Tom Proctor1-0/+11
This is the start of an integration/e2e test suite for the tailscale operator. It currently only tests two major features, ingress proxy and API server proxy, but we intend to expand it to cover more features over time. It also only supports manual runs for now. We intend to integrate it into CI checks in a separate update when we have planned how to securely provide CI with the secrets required for connecting to a test tailnet. Updates #12622 Change-Id: I31e464bb49719348b62a563790f2bc2ba165a11b Co-authored-by: Irbe Krumina <irbe@tailscale.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-03cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor (#14248)Irbe Krumina4-2/+63
* cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor Adds a new spec.metrics.serviceMonitor field to ProxyClass. If that's set to true (and metrics are enabled), the operator will create a Prometheus ServiceMonitor for each proxy to which the ProxyClass applies. Additionally, create a metrics Service for each proxy that has metrics enabled. Updates tailscale/tailscale#11292 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-03cmd/k8s-operator,docs/k8s: run tun mode proxies in privileged containers ↵Irbe Krumina2-6/+7
(#14262) We were previously relying on unintended behaviour by runc where all containers where by default given read/write/mknod permissions for tun devices. This behaviour was removed in https://github.com/opencontainers/runc/pull/3468 and released in runc 1.2. Containerd container runtime, used by Docker and majority of Kubernetes distributions bumped runc to 1.2 in 1.7.24 https://github.com/containerd/containerd/releases/tag/v1.7.24 thus breaking our reference tun mode Tailscale Kubernetes manifests and Kubernetes operator proxies. This PR changes the all Kubernetes container configs that run Tailscale in tun mode to privileged. This should not be a breaking change because all these containers would run in a Pod that already has a privileged init container. Updates tailscale/tailscale#14256 Updates tailscale/tailscale#10814 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-22cmd/{containerboot,k8s-operator},k8s-operator: new options to expose user ↵Tom Proctor3-2/+64
metrics (#14035) containerboot: Adds 3 new environment variables for containerboot, `TS_LOCAL_ADDR_PORT` (default `"${POD_IP}:9002"`), `TS_METRICS_ENABLED` (default `false`), and `TS_DEBUG_ADDR_PORT` (default `""`), to configure metrics and debug endpoints. In a follow-up PR, the health check endpoint will be updated to use the `TS_LOCAL_ADDR_PORT` if `TS_HEALTHCHECK_ADDR_PORT` hasn't been set. Users previously only had access to internal debug metrics (which are unstable and not recommended) via passing the `--debug` flag to tailscaled, but can now set `TS_METRICS_ENABLED=true` to expose the stable metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the addr/port specified by `TS_LOCAL_ADDR_PORT`. Users can also now configure a debug endpoint more directly via the `TS_DEBUG_ADDR_PORT` environment variable. This is not recommended for production use, but exposes an internal set of debug metrics and pprof endpoints. operator: The `ProxyClass` CRD's `.spec.metrics.enable` field now enables serving the stable user metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the same "metrics" container port that debug metrics were previously served on. To smooth the transition for anyone relying on the way the operator previously consumed this field, we also _temporarily_ serve tailscaled's internal debug metrics on the same `/debug/metrics` path as before, until 1.82.0 when debug metrics will be turned off by default even if `.spec.metrics.enable` is set. At that point, anyone who wishes to continue using the internal debug metrics (not recommended) will need to set the new `ProxyClass` field `.spec.statefulSet.pod.tailscaleContainer.debug.enable`. Users who wish to opt out of the transitional behaviour, where enabling `.spec.metrics.enable` also enables debug metrics, can set `.spec.statefulSet.pod.tailscaleContainer.debug.enable` to false (recommended). Separately but related, the operator will no longer specify a host port for the "metrics" container port definition. This caused scheduling conflicts when k8s needs to schedule more than one proxy per node, and was not necessary for allowing the pod's port to be exposed to prometheus scrapers. Updates #11292 --------- Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-11-18sessionrecording: implement v2 recording endpoint support (#14105)Andrew Lytvynov2-3/+3
The v2 endpoint supports HTTP/2 bidirectional streaming and acks for received bytes. This is used to detect when a recorder disappears to more quickly terminate the session. Updates https://github.com/tailscale/corp/issues/24023 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-11-12cmd/{k8s-operator,containerboot},k8s-operator: remove support for proxies ↵Irbe Krumina1-3/+0
below capver 95. (#13986) Updates tailscale/tailscale#13984 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-11cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app ↵Irbe Krumina3-8/+86
connector via Connector spec (#13950) * cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec Updates tailscale/tailscale#11113 Signed-off-by: Irbe Krumina <irbe@tailscale.com>