path: root/k8s-operator
Age  Commit message  Author  Files  Lines
2025-09-02cmd/k8s-operator: allow specifying replicas for connectors (#16721)David Bond3-4/+94
This commit adds a `replicas` field to the `Connector` custom resource that allows users to specify the number of desired replicas deployed for their connectors. This allows users to deploy exit nodes, subnet routers and app connectors in a highly available fashion. Fixes #14020 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-08-22cmd/k8s-proxy,k8s-operator: fix serve config for userspace mode (#16919)Tom Proctor1-7/+23
The serve code leaves it up to the system's DNS resolver and netstack to figure out how to reach the proxy destination. Combined with k8s-proxy running in userspace mode, this means we can't rely on MagicDNS being available or tailnet IPs being routable. I'd like to implement that as a feature for serve in userspace mode, but for now the safer fix to get kube-apiserver ProxyGroups consistently working in all environments is to switch to using localhost as the proxy target instead. This has a small knock-on in the code that does WhoIs lookups, which now needs to check the X-Forwarded-For header that serve populates to get the correct tailnet IP to look up, because the request's remote address will be loopback. Fixes #16920 Change-Id: I869ddcaf93102da50e66071bb00114cc1acc1288 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-31cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName` (#16685)Lee Briggs2-0/+6
* cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName` Fixes #16682 Signed-off-by: Lee Briggs <lee@leebriggs.co.uk> * Update k8s-operator/apis/v1alpha1/types_proxyclass.go Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com> Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com> * run make kube-generate-all Change-Id: I5f8f16694fdc181b048217b9f05ec2ee2aa04def Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com> --------- Signed-off-by: Lee Briggs <lee@leebriggs.co.uk> Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com> Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-28k8s-operator: fix test flake (#16680)Tom Proctor1-13/+23
This occasionally panics waiting on a nil ctx, but was missed in the previous PR because it's quite a rare flake as it needs to progress to a specific point in the parser. Updates #16678 Change-Id: Ifd36dfc915b153aede36b8ee39eff83750031f95 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-28k8s-operator: handle multiple WebSocket frames per read (#16678)Tom Proctor5-52/+84
When kubectl starts an interactive attach session, it sends 2 resize messages in quick succession. It seems that particularly in HTTP mode, we often receive both of these WebSocket frames from the underlying connection in a single read. However, our parser currently assumes 0-1 frames per read, and leaves the second frame in the read buffer until the next read from the underlying connection. It doesn't take long after that before we end up failing to skip a control message as we normally should, and then we parse a control message as though it will have a stream ID (part of the Kubernetes protocol) and error out. Instead, we should keep parsing frames from the read buffer for as long as we're able to parse complete frames, so this commit refactors the message parsing logic into a loop based on the contents of the read buffer being non-empty. See k/k staging/src/k8s.io/kubectl/pkg/cmd/attach/attach.go for full details of the resize messages. There are at least a couple more multiple-frame read edge cases we should handle, but this commit is very conservatively fixing a single observed issue to make it a low-risk candidate for cherry picking. Updates #13358 Change-Id: Iafb91ad1cbeed9c5231a1525d4563164fc1f002f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
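The "keep parsing while complete frames remain" loop from #16678 can be illustrated with a small, self-contained parser for the RFC 6455 base frame header. This is a sketch and not the operator's parser: the `frame`, `parseFrame`, and `parseAll` names are hypothetical, and fragmentation, control frames, and the Kubernetes stream ID byte are left to the caller.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frame is a simplified view of one WebSocket frame (RFC 6455, section 5.2).
type frame struct {
	opcode  byte
	payload []byte
}

// parseFrame parses one complete frame from the start of buf. It returns the
// frame and the number of bytes consumed, or ok=false if buf only holds a
// partial frame and more data must be read first.
func parseFrame(buf []byte) (f frame, n int, ok bool) {
	if len(buf) < 2 {
		return f, 0, false
	}
	f.opcode = buf[0] & 0x0f
	masked := buf[1]&0x80 != 0
	length := uint64(buf[1] & 0x7f)
	off := 2
	switch length {
	case 126: // 16-bit extended payload length
		if len(buf) < off+2 {
			return f, 0, false
		}
		length = uint64(binary.BigEndian.Uint16(buf[off:]))
		off += 2
	case 127: // 64-bit extended payload length
		if len(buf) < off+8 {
			return f, 0, false
		}
		length = binary.BigEndian.Uint64(buf[off:])
		off += 8
	}
	var key [4]byte
	if masked {
		if len(buf) < off+4 {
			return f, 0, false
		}
		copy(key[:], buf[off:off+4])
		off += 4
	}
	if uint64(len(buf)-off) < length {
		return f, 0, false // payload not fully buffered yet
	}
	f.payload = make([]byte, length)
	for i := range f.payload {
		f.payload[i] = buf[off+i]
		if masked {
			f.payload[i] ^= key[i%4]
		}
	}
	return f, off + int(length), true
}

// parseAll consumes every complete frame in buf and returns the unparsed
// remainder, which should be kept and prepended to the next read.
func parseAll(buf []byte) (frames []frame, rest []byte) {
	for len(buf) > 0 {
		f, n, ok := parseFrame(buf)
		if !ok {
			break
		}
		frames = append(frames, f)
		buf = buf[n:]
	}
	return frames, buf
}

func main() {
	// Two unmasked binary frames delivered in one read, followed by the
	// first byte of a third frame.
	read := []byte{0x82, 0x02, 4, 'a', 0x82, 0x02, 4, 'b', 0x82}
	frames, rest := parseAll(read)
	fmt.Println(len(frames), len(rest)) // prints "2 1"
}
```

Returning the remainder rather than discarding it is what lets a frame that is split across reads be completed later.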
2025-07-28k8s-operator: adding session type to cast header (#16660)Tom Meadows1-3/+4
Updates #16490 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-07-22cmd/{k8s-operator,k8s-proxy},kube: use consistent type for auth mode config (#16626)Tom Proctor2-5/+5
Updates k8s-proxy's config so its auth mode config matches the one we set in kube-apiserver ProxyGroups, for consistency. Updates #13358 Change-Id: I95e29cec6ded2dc7c6d2d03f968a25c822bc0e01 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-21cmd/k8s-operator: Allow specifying cluster ips for nameservers (#16477)David Bond3-2/+48
This commit modifies the kubernetes operator's `DNSConfig` resource with the addition of a new field at `nameserver.service.clusterIP`. This field allows users to specify a static in-cluster IP address of the nameserver when deployed. Fixes #14305 Signed-off-by: David Bond <davidsbond93@gmail.com>
2025-07-21all-kube: create Tailscale Service for HA kube-apiserver ProxyGroup (#16572)Tom Proctor5-38/+119
Adds a new reconciler for ProxyGroups of type kube-apiserver that will provision a Tailscale Service for each replica to advertise. Adds two new condition types to the ProxyGroup, TailscaleServiceValid and TailscaleServiceConfigured, to post updates on the state of that reconciler in a way that's consistent with the service-pg reconciler. The created Tailscale Service name is configurable via a new ProxyGroup field spec.kubeAPISserver.ServiceName, which expects a string of the form "svc:<dns-label>". Lots of supporting changes were needed to implement this in a way that's consistent with other operator workflows, including: * Pulled containerboot's ensureServicesUnadvertised and certManager into kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to aid Service cert sharing between replicas and graceful Service shutdown. * For certManager, add an initial wait to the cert loop to wait until the domain appears in the device's netmap to avoid a guaranteed error on the first issue attempt when it's quick to start. * Made several methods in ingress-for-pg.go and svc-for-pg.go into functions to share with the new reconciler. * Added a Resource struct to the owner refs stored in Tailscale Service annotations to be able to distinguish between Ingress- and ProxyGroup-based Services that need cleaning up in the Tailscale API. * Added a ListVIPServices method to the internal tailscale client to aid cleaning up orphaned Services. * Support for reading config from a kube Secret, and partial support for config reloading, to prevent us having to force Pod restarts when config changes. * Fixed up the zap logger so it's possible to set debug log level. Updates #13358 Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
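The "svc:<dns-label>" form expected for the Tailscale Service name in #16572 could be checked along these lines. This is a minimal sketch: `validateServiceName` is a hypothetical function, and the label pattern assumes standard RFC 1123 DNS label rules (lowercase alphanumerics and hyphens, at most 63 characters), which may differ from what the operator actually enforces.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dnsLabel matches an RFC 1123 style label: 1-63 characters, lowercase
// alphanumerics or '-', starting and ending with an alphanumeric.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$`)

// validateServiceName reports whether name has the form "svc:<dns-label>".
// Hypothetical helper for illustration only.
func validateServiceName(name string) error {
	label, ok := strings.CutPrefix(name, "svc:")
	if !ok {
		return fmt.Errorf("%q must start with %q", name, "svc:")
	}
	if !dnsLabel.MatchString(label) {
		return fmt.Errorf("%q is not a valid DNS label", label)
	}
	return nil
}

func main() {
	for _, name := range []string{"svc:kube-apiserver", "kube-apiserver", "svc:-bad"} {
		fmt.Printf("%s: %v\n", name, validateServiceName(name))
	}
}
```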
2025-07-14k8s-operator,sessionrecording: fixing race condition between resize (#16454)Tom Meadows9-239/+345
messages and cast headers when recording `kubectl attach` sessions Updates #16490 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-07-09cmd/k8s-operator: don't require generation for Available condition (#16497)Tom Proctor1-6/+7
The observed generation was set to always 0 in #16429, but this had the knock-on effect of other controllers considering ProxyGroups never ready because the observed generation is never up to date in proxyGroupCondition. Make sure the ProxyGroupAvailable function does not require the observed generation to be up to date, and add testing coverage to catch regressions. Updates #16327 Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-09cmd/{k8s-operator,k8s-proxy}: add kube-apiserver ProxyGroup type (#16266)Tom Proctor6-158/+178
Adds a new k8s-proxy command to convert operator's in-process proxy to a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy reads in a new config file written by the operator, modelled on tailscaled's conffile but with some modifications to ensure multiple versions of the config can co-exist within a file. This should make it much easier to support reading that config file from a Kube Secret with a stable file name. To avoid needing to give the operator ClusterRole{,Binding} permissions, the helm chart now optionally deploys a new static ServiceAccount for the API Server proxy to use if in auth mode. Proxies deployed by kube-apiserver ProxyGroups currently work the same as the operator's in-process proxy. They do not yet leverage Tailscale Services for presenting a single HA DNS name. Updates #13358 Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-07-07cmd/k8s-operator: always set ProxyGroup status conditions (#16429)Tom Proctor2-2/+6
Refactors setting status into its own top-level function to make it easier to ensure we _always_ set the status if it's changed on every reconcile. Previously, it was possible to have stale status if some earlier part of the provision logic failed. Updates #16327 Change-Id: Idab0cfc15ae426cf6914a82f0d37a5cc7845236b Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-06-27cmd/{containerboot,k8s-operator}: use state Secret for checking device auth (#16328)Tom Proctor2-6/+15
Previously, the operator checked the ProxyGroup status fields for information on how many of the proxies had successfully authed. Use their state Secrets instead as a more reliable source of truth. containerboot has written device_fqdn and device_ips keys to the state Secret since inception, and pod_uid since 1.78.0, so there's no need to use the API for that data. Read it from the state Secret for consistency. However, to ensure we don't read data from a previous run of containerboot, make sure we reset containerboot's state keys on startup. One other knock-on effect of that is ProxyGroups can briefly be marked not Ready while a Pod is restarting. Introduce a new ProxyGroupAvailable condition to more accurately reflect when downstream controllers can implement flows that rely on a ProxyGroup having at least 1 proxy Pod running. Fixes #16327 Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
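Reading device details from a state Secret, as #16328 describes, might look like the sketch below. The key names `device_fqdn`, `device_ips`, and `pod_uid` come from the commit message; everything else is an assumption for illustration, including the `deviceInfo` type, the helper name, and in particular the encoding of `device_ips` as a JSON array of strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deviceInfo holds the fields read from a proxy's state Secret.
// Hypothetical type for illustration.
type deviceInfo struct {
	FQDN   string
	IPs    []string
	PodUID string
}

// deviceFromSecret extracts device info from Secret data. It returns
// authed=false when the keys are absent, for example because containerboot
// reset them on startup and the device has not authenticated yet.
// Assumption: device_ips is stored as a JSON array of strings.
func deviceFromSecret(data map[string][]byte) (info deviceInfo, authed bool, err error) {
	fqdn := string(data["device_fqdn"])
	rawIPs := data["device_ips"]
	if fqdn == "" || len(rawIPs) == 0 {
		return deviceInfo{}, false, nil
	}
	var ips []string
	if err := json.Unmarshal(rawIPs, &ips); err != nil {
		return deviceInfo{}, false, fmt.Errorf("parsing device_ips: %w", err)
	}
	return deviceInfo{FQDN: fqdn, IPs: ips, PodUID: string(data["pod_uid"])}, true, nil
}

func main() {
	secret := map[string][]byte{
		"device_fqdn": []byte("proxy-0.example.ts.net"),
		"device_ips":  []byte(`["100.64.0.5","fd7a:115c:a1e0::5"]`),
		"pod_uid":     []byte("1234-abcd"),
	}
	info, authed, err := deviceFromSecret(secret)
	fmt.Println(info.FQDN, info.IPs, authed, err)
}
```

A controller could count the Secrets for which `authed` is true to decide whether at least one proxy is available.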
2025-06-27cmd/k8s-operator, k8s-operator: support Static Endpoints on ProxyGroups (#16115)Tom Meadows4-0/+272
updates: #14674 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-05-19cmd/k8s-operator,kube/kubetypes,k8s-operator/apis: reconcile L3 HA Services (#15961)Tom Meadows1-0/+3
This reconciler allows users to make applications highly available at L3 by leveraging Tailscale Virtual Services. Many Kubernetes Services (irrespective of the cluster they reside in) can be mapped to a Tailscale Virtual Service, allowing access to these Services at L3. Updates #15895 Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2025-05-19{cmd,}/k8s-operator: support IRSA for Recorder resources (#15913)Tom Proctor3-0/+71
Adds Recorder fields to configure the name and annotations of the ServiceAccount created for and used by its associated StatefulSet. This allows the created Pod to authenticate with AWS without requiring a Secret with static credentials, using AWS' IAM Roles for Service Accounts feature, documented here: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html Fixes #15875 Change-Id: Ib0e15c0dbc357efa4be260e9ae5077bacdcb264f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-05-06cmd/k8s-operator,k8s-operator/api-proxy: move k8s proxy code to library (#15857)Tom Proctor4-0/+657
The defaultEnv and defaultBool functions are copied over temporarily to minimise diff. This lays the ground work for having both the operator and the new k8s-proxy binary implement the API proxy Updates #13358 Change-Id: Ieacc79af64df2f13b27a18135517bb31c80a5a02 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2025-04-15k8s-operator: add age column to all custom resources (#15663)Satyam Soni5-0/+5
This change introduces an Age column in the output for all custom resources to enhance visibility into their lifecycle status. Fixes #15499 Signed-off-by: satyampsoni <satyampsoni@gmail.com>
2025-04-08net/{netx,memnet},all: add netx.DialFunc, move memnet Network implBrad Fitzpatrick2-3/+4
This adds netx.DialFunc, unifying a type we have a bazillion other places, giving it now a nice short name that's clickable in editors, etc. That highlighted that my earlier move (03b47a55c7956) of stuff from nettest into netx moved too much: it also dragged along the memnet impl, meaning all users of netx.DialFunc who just wanted netx for the type definition were instead also pulling in all of memnet. So move the memnet implementation netx.Network into memnet, a package we already had. Then use netx.DialFunc in a bunch of places. I'm sure I missed some. And plenty remain in other repos, to be updated later. Updates tailscale/corp#27636 Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-28cmd/k8s-operator,k8s-operator: enable HA Ingress again. (#15453)Irbe Krumina2-2/+2
Re-enable HA Ingress, which was disabled for the 1.82 release. This reverts commit fea74a60d529bcccbc8ded74644256bb6f6c7727. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-03-26cmd/k8s-operator,k8s-operator: disable HA Ingress before stable release (#15433) [v1.83.0-pre] Irbe Krumina2-2/+2
Temporarily make sure that the HA Ingress reconciler does not run, as we do not want to release this to stable just yet. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-03-21cmd/k8s-operator,k8s-operator: allow optionally using LE staging endpoint for Ingress (#15360)Irbe Krumina2-0/+16
Allow optionally using the LetsEncrypt staging endpoint to issue certs for Ingress/HA Ingress, so that it is easier to experiment with initial Ingress setup without hitting rate limits. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-02-04cmd/k8s-operator: reinstate HA Ingress reconciler (#14887)Irbe Krumina2-2/+2
This change: - reinstates the HA Ingress controller that was disabled for 1.80 release - fixes the API calls to manage VIPServices as the API was changed - triggers the HA Ingress reconciler on ProxyGroup changes Updates tailscale/tailscale#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-30cmd/k8s-operator: temporarily disable HA Ingress controller (#14833)Irbe Krumina2-2/+2
The HA Ingress functionality is not actually doing anything valuable yet, so don't run the controller in 1.80 release yet. Updates tailscale/tailscale#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies don't terminate while cluster traffic is still routed to them (#14436)Irbe Krumina1-10/+0
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies This change is part of work towards minimizing downtime during update rollouts of egress ProxyGroup replicas. This change: - updates the containerboot health check logic to return Pod IP in headers, if set - always runs the health check for egress PG proxies - updates ClusterIP Services created for PG egress endpoints to include the health check endpoint - implements preshutdown endpoint in proxies. The preshutdown endpoint logic waits until, for all currently configured egress services, the ClusterIP Service health check endpoint is no longer returned by the shutting-down Pod (by looking at the new Pod IP header). - ensures that kubelet is configured to call the preshutdown endpoint This reduces the possibility that, as replicas are terminated during an update, a replica gets terminated to which cluster traffic is still being routed via the ClusterIP Service because kube proxy has not yet updated routing rules. This is not a perfect check as in practice, it only checks that the kube proxy on the node on which the proxy runs has updated rules. However, overall this might be good enough. The preshutdown logic is disabled if users have configured a custom health check port via TS_LOCAL_ADDR_PORT env var. This change throws a warning if so, and in future setting of that env var for operator proxies might be disallowed (as users shouldn't need to configure this for a Pod directly). This is backwards compatible with earlier proxy versions. Updates tailscale/tailscale#14326 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-09cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor (#14475)Irbe Krumina3-7/+93
Updates tailscale/tailscale#14381 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-08cmd/k8s-operator,k8s-operator: support ingress ProxyGroup type (#14548)Irbe Krumina2-6/+11
Currently this does not yet do anything apart from creating the ProxyGroup resources like StatefulSet. Updates tailscale/corp#24795 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-20cmd/k8s-operator,k8s-operator: include top-level CRD descriptions (#14435)Tom Proctor3-0/+28
When reading https://doc.crds.dev/github.com/tailscale/tailscale/tailscale.com/ProxyGroup/v1alpha1@v1.78.3 I noticed there is no top-level description for ProxyGroup and Recorder. Add one to give some high-level direction. Updates #cleanup Change-Id: I3666c5445be272ea5a1d4d02b6d5ad4c23afb09f Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-11cmd/k8s-operator,k8s-operator: operator integration tests (#12792)Tom Proctor1-0/+11
This is the start of an integration/e2e test suite for the tailscale operator. It currently only tests two major features, ingress proxy and API server proxy, but we intend to expand it to cover more features over time. It also only supports manual runs for now. We intend to integrate it into CI checks in a separate update when we have planned how to securely provide CI with the secrets required for connecting to a test tailnet. Updates #12622 Change-Id: I31e464bb49719348b62a563790f2bc2ba165a11b Co-authored-by: Irbe Krumina <irbe@tailscale.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-03cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor (#14248)Irbe Krumina4-2/+63
Adds a new spec.metrics.serviceMonitor field to ProxyClass. If that's set to true (and metrics are enabled), the operator will create a Prometheus ServiceMonitor for each proxy to which the ProxyClass applies. Additionally, create a metrics Service for each proxy that has metrics enabled. Updates tailscale/tailscale#11292 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-12-03cmd/k8s-operator,docs/k8s: run tun mode proxies in privileged containers (#14262)Irbe Krumina2-6/+7
We were previously relying on unintended behaviour by runc where all containers were by default given read/write/mknod permissions for tun devices. This behaviour was removed in https://github.com/opencontainers/runc/pull/3468 and released in runc 1.2. The containerd container runtime, used by Docker and the majority of Kubernetes distributions, bumped runc to 1.2 in 1.7.24 https://github.com/containerd/containerd/releases/tag/v1.7.24 thus breaking our reference tun mode Tailscale Kubernetes manifests and Kubernetes operator proxies. This PR changes all the Kubernetes container configs that run Tailscale in tun mode to privileged. This should not be a breaking change because all these containers would run in a Pod that already has a privileged init container. Updates tailscale/tailscale#14256 Updates tailscale/tailscale#10814 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-22cmd/{containerboot,k8s-operator},k8s-operator: new options to expose user metrics (#14035)Tom Proctor3-2/+64
containerboot: Adds 3 new environment variables for containerboot, `TS_LOCAL_ADDR_PORT` (default `"${POD_IP}:9002"`), `TS_METRICS_ENABLED` (default `false`), and `TS_DEBUG_ADDR_PORT` (default `""`), to configure metrics and debug endpoints. In a follow-up PR, the health check endpoint will be updated to use the `TS_LOCAL_ADDR_PORT` if `TS_HEALTHCHECK_ADDR_PORT` hasn't been set. Users previously only had access to internal debug metrics (which are unstable and not recommended) via passing the `--debug` flag to tailscaled, but can now set `TS_METRICS_ENABLED=true` to expose the stable metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the addr/port specified by `TS_LOCAL_ADDR_PORT`. Users can also now configure a debug endpoint more directly via the `TS_DEBUG_ADDR_PORT` environment variable. This is not recommended for production use, but exposes an internal set of debug metrics and pprof endpoints. operator: The `ProxyClass` CRD's `.spec.metrics.enable` field now enables serving the stable user metrics documented at https://tailscale.com/kb/1482/client-metrics at `/metrics` on the same "metrics" container port that debug metrics were previously served on. To smooth the transition for anyone relying on the way the operator previously consumed this field, we also _temporarily_ serve tailscaled's internal debug metrics on the same `/debug/metrics` path as before, until 1.82.0 when debug metrics will be turned off by default even if `.spec.metrics.enable` is set. At that point, anyone who wishes to continue using the internal debug metrics (not recommended) will need to set the new `ProxyClass` field `.spec.statefulSet.pod.tailscaleContainer.debug.enable`. Users who wish to opt out of the transitional behaviour, where enabling `.spec.metrics.enable` also enables debug metrics, can set `.spec.statefulSet.pod.tailscaleContainer.debug.enable` to false (recommended).
Separately but related, the operator will no longer specify a host port for the "metrics" container port definition. This caused scheduling conflicts when k8s needs to schedule more than one proxy per node, and was not necessary for allowing the pod's port to be exposed to prometheus scrapers. Updates #11292 --------- Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com> Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
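The three containerboot environment variables introduced in #14035 and their documented defaults could be resolved as in this sketch. The variable names and defaults are taken from the commit message; the `endpointConfig` type and `loadEndpointConfig` function are hypothetical and do not reflect containerboot's real settings code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// endpointConfig groups the metrics and debug endpoint settings.
// Hypothetical type for illustration.
type endpointConfig struct {
	LocalAddrPort  string // TS_LOCAL_ADDR_PORT, default "${POD_IP}:9002"
	MetricsEnabled bool   // TS_METRICS_ENABLED, default false
	DebugAddrPort  string // TS_DEBUG_ADDR_PORT, default "" (disabled)
}

// loadEndpointConfig reads the settings through getenv so that callers can
// pass os.Getenv in production and a map-backed lookup in tests.
func loadEndpointConfig(getenv func(string) string) (endpointConfig, error) {
	cfg := endpointConfig{
		LocalAddrPort: getenv("TS_LOCAL_ADDR_PORT"),
		DebugAddrPort: getenv("TS_DEBUG_ADDR_PORT"),
	}
	if cfg.LocalAddrPort == "" {
		cfg.LocalAddrPort = getenv("POD_IP") + ":9002"
	}
	if v := getenv("TS_METRICS_ENABLED"); v != "" {
		enabled, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("invalid TS_METRICS_ENABLED %q: %w", v, err)
		}
		cfg.MetricsEnabled = enabled
	}
	return cfg, nil
}

func main() {
	cfg, err := loadEndpointConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```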
2024-11-18sessionrecording: implement v2 recording endpoint support (#14105)Andrew Lytvynov2-3/+3
The v2 endpoint supports HTTP/2 bidirectional streaming and acks for received bytes. This is used to detect when a recorder disappears to more quickly terminate the session. Updates https://github.com/tailscale/corp/issues/24023 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2024-11-12cmd/{k8s-operator,containerboot},k8s-operator: remove support for proxies below capver 95. (#13986)Irbe Krumina1-3/+0
Updates tailscale/tailscale#13984 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-11cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec (#13950)Irbe Krumina3-8/+86
Updates tailscale/tailscale#11113 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-30cmd/k8s-operator,k8s-operator: add topology spread constraints to ProxyClass (#13959)Irbe Krumina3-0/+12
Now that we have HA for egress proxies, it makes sense to support topology spread constraints that allow users to define more complex topologies for how proxy Pods are deployed in relation to other Pods, across regions, etc. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-10k8s-operator/apis: revert ProxyGroup readiness cond name change (#13770)Irbe Krumina1-1/+1
No need to prefix this with 'Tailscale' for tailscale.com custom resource types. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-09cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services for ProxyGroup (#13746)Irbe Krumina1-8/+13
Set a readiness condition on ExternalName Services that define a tailnet target to route cluster traffic to via a ProxyGroup's proxies. The condition is set to true if at least one proxy is currently set up to route. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-08cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720)Tom Proctor4-5/+6
The default ProxyClass can be set via helm chart or env var, and applies to all proxies that do not otherwise have an explicit ProxyClass set. This ensures proxies created by the new ProxyGroup CRD are consistent with the behaviour of existing proxies. Nearby but unrelated changes: * Fix up double error logs (controller runtime logs returned errors) * Fix a couple of variable names Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-07cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)Irbe Krumina2-4/+2
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-07cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)Tom Proctor5-7/+15
Implements the controller for the new ProxyGroup CRD, designed for running proxies in a high availability configuration. Each proxy gets its own config and state Secret, and its own tailscale node ID. We are currently mounting all of the config secrets into the container, but will stop mounting them and instead read them directly from the kube API once #13578 is implemented. Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-04cmd/{k8s-operator,containerboot},k8s-operator,kube: reconcile ExternalName Services for ProxyGroup (#13635)Irbe Krumina5-19/+62
Adds a new reconciler that reconciles ExternalName Services that define a tailnet target that should be exposed to cluster workloads on a ProxyGroup's proxies. The reconciler ensures that for each such service, the config mounted to the proxies is updated with the tailnet target definition and that an EndpointSlice and ClusterIP Service are created for the service. Adds a second reconciler that ensures that as proxy Pods become ready to route traffic to a tailnet target, the EndpointSlice for the target is updated with the Pods' endpoints. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-27cmd/k8s-operator,k8s-operator: add ProxyGroup CRD (#13591)Tom Proctor4-7/+381
The ProxyGroup CRD specifies a set of N pods which will each be a tailnet device, and will have M different ingress or egress services mapped onto them. It is the mechanism for specifying how highly available proxies need to be. This commit only adds the definition, no controller loop, and so it is not currently functional. This commit also splits out TailnetDevice and RecorderTailnetDevice into separate structs because the URL field is specific to recorders, but we want a more generic struct for use in the ProxyGroup status field. Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-25cmd/k8s-operator, k8s-operator: fix outdated kb links (#13585)Cameron Stokes3-8/+8
updates #13583 Signed-off-by: Cameron Stokes <cameron@tailscale.com>
2024-09-11cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller (#13299)Tom Proctor7-39/+845
Deploys tsrecorder images to the operator's cluster. S3 storage is configured via environment variables from a k8s Secret. Currently only supports a single tsrecorder replica, but I've tried to take early steps towards supporting multiple replicas by e.g. having a separate secret for auth and state storage. Example CR:
```yaml
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
  name: rec
spec:
  enableUI: true
```
Updates #13298 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-09-07sessionrecording,ssh/tailssh,k8s-operator: log connected recorder address (#13382)Irbe Krumina2-7/+17
Updates tailscale/corp#19821 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-09-03cmd/k8s-operator,k8s-operator/sessionrecording: ensure recording header contains terminal size for terminal sessions (#12965)Irbe Krumina6-90/+253
cmd/k8s-operator,k8s-operator/sessionrecording: ensure CastHeader contains terminal size For tsrecorder to be able to play session recordings, the recording's CastHeader must have '.Width' and '.Height' fields set to non-zero. Kubectl (or whoever is the client that initiates the 'kubectl exec' session recording) sends the terminal dimensions in a resize message that the API server proxy can intercept; however, that races with the first server message that we need to record. This PR ensures we wait for the terminal dimensions to be processed from the first resize message before any other data is sent, so that for all sessions with a terminal attached, the header of the session recording contains the terminal dimensions and the recording can be played by tsrecorder. Updates tailscale/tailscale#19821 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-08-14cmd/k8s-operator,k8s-operator/sessionrecording: support recording kubectl exec sessions over WebSockets (#12947)Irbe Krumina11-60/+1225
cmd/k8s-operator,k8s-operator/sessionrecording: support recording WebSocket sessions Kubernetes currently supports two streaming protocols, SPDY and WebSockets. WebSockets are replacing SPDY, see https://github.com/kubernetes/enhancements/issues/4006. We previously only supported SPDY, erroring out if the session was not SPDY and relying on kube's built-in SPDY fallback. This PR: - adds support for parsing contents of 'kubectl exec' sessions streamed over WebSockets - adds logic to distinguish 'kubectl exec' requests for SPDY/WebSockets sessions and call the relevant handler Updates tailscale/corp#19821 Signed-off-by: Irbe Krumina <irbe@tailscale.com> Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-07-29cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality (#12945)Irbe Krumina10-0/+1805
Refactor SSH session recording functionality (mostly the bits related to Kubernetes API server proxy 'kubectl exec' session recording): - move the session recording bits used by both Tailscale SSH and the Kubernetes API server proxy into a shared sessionrecording package, to avoid having the operator import ssh/tailssh - move the Kubernetes API server proxy session recording functionality into a k8s-operator/sessionrecording package, add some abstractions in preparation for adding support for a second streaming protocol (WebSockets) Updates tailscale/corp#19821 Signed-off-by: Irbe Krumina <irbe@tailscale.com>