path: root/net/netcheck
Age | Commit message | Author | Files | Lines
2025-06-13 | net/netcheck: preserve live home DERP through packet loss | James Tucker | 2 | -14/+58
During a short period of packet loss, a TCP connection to the home DERP may be maintained. If no other regions emerge as winners, such as when all regions but one are avoided/disallowed as candidates, ensure that the current home region, if still active, is not dropped as the preferred region until it has failed two keepalives. Relatedly, apply Avoid and NoMeasureNoHome to ICMP and HTTP checks as intended. Updates tailscale/corp#12894 Updates tailscale/corp#29491 Signed-off-by: James Tucker <james@tailscale.com>
2025-06-12 | feature/relayserver,net/{netcheck,udprelay}: implement addr discovery (#16253) | Jordan Whited | 1 | -3/+16
The relay server now fetches IPs from local interfaces and external-perspective IP:port pairs via netcheck (STUN). Updates tailscale/corp#27502 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-04-02 | all: use network less when running in v86 emulator | Brad Fitzpatrick | 1 | -1/+15
Updates #5794 Change-Id: I1d8b005a1696835c9062545f87b7bab643cfc44d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-04-02 | net/netcheck: avoid ICMP unimplemented log spam on Plan 9 | Brad Fitzpatrick | 1 | -0/+4
Updates #5794 Change-Id: Ia6b2429d57b79770e4c278f011504f726136db5b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-31 | net/netcheck: use NoMeasureNoHome in another spot | Brad Fitzpatrick | 1 | -1/+4
It only affected js/wasm and tamago. Updates tailscale/corp#24697 Change-Id: I8fd29323ed9b663fe3fd8d4a86f26ff584a3e134 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-03-07 | tailcfg: add DERPRegion.NoMeasureNoHome, deprecate+document Avoid [cap 115] | Brad Fitzpatrick | 2 | -4/+8
Fixes tailscale/corp#24697 Change-Id: Ib81994b5ded3dc87a1eef079eb268906a2acb3f8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-02-14 | net/netcheck: remove unnecessary custom map clone function | James Tucker | 1 | -14/+3
Updates #8419 Updates #cleanup Signed-off-by: James Tucker <james@tailscale.com>
2024-12-13 | net/netcheck: adjust HTTPS latency check to connection time and avoid data race | James Tucker | 1 | -7/+17
The go-httpstat package has a data race when used with connections that perform happy-eyeballs connection setup, as we do in the DERP client. There is a long-stale PR upstream to address this; however, revisiting the purpose of this code suggests we don't really need httpstat here. The code populates a latency table that may be compared against STUN latency, which is a lightweight RTT check. Switching the reported timing here to simply the HTTP request RTT avoids the problematic package. Fixes tailscale/corp#25095 Signed-off-by: James Tucker <james@tailscale.com>
2024-12-05 | net/netcheck: preserve STUN port defaulting to 3478 (#14289) | Irbe Krumina | 1 | -0/+3
Updates tailscale/tailscale#14287 Signed-off-by: Irbe Krumina <irbe@tailscale.com>
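A minimal sketch of port defaulting of this kind, assuming a convention where a configured value of 0 means "use 3478" and a negative value means "STUN disabled". The helper name stunPort and that convention are assumptions for illustration; only the 3478 default comes from the commit subject.

```go
package main

import "fmt"

// defaultSTUNPort is the default named in the commit subject.
const defaultSTUNPort = 3478

// stunPort is a hypothetical helper. It assumes 0 means "unset, use the
// default" and that negative or out-of-range values mean "do not probe".
func stunPort(configured int) (port uint16, ok bool) {
	switch {
	case configured == 0:
		return defaultSTUNPort, true
	case configured < 0 || configured > 65535:
		return 0, false
	default:
		return uint16(configured), true
	}
}

func main() {
	for _, c := range []int{0, 3479, -1} {
		port, ok := stunPort(c)
		fmt.Println(c, port, ok)
	}
}
```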
2024-12-04 | cmd/tailscale,net/netcheck: add debug feature to force preferred DERP | James Tucker | 2 | -1/+83
This provides an interface for a user to force a preferred DERP outcome for all future netchecks that will take precedence unless the forced region is unreachable. The option does not persist and will be lost when the daemon restarts. Updates tailscale/corp#18997 Updates tailscale/corp#24755 Signed-off-by: James Tucker <james@tailscale.com>
2024-12-02 | net/netcheck: clean up ICMP probe AddrPort lookup | Brad Fitzpatrick | 2 | -29/+36
Fixes #14200 Change-Id: Ib086814cf63dda5de021403fe1db4fb2a798eaae Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-10-31 | net/netcheck: add addReportHistoryAndSetPreferredDERP() test case (#13989) | Jordan Whited | 1 | -0/+9
Add an explicit case for exercising preferred DERP hysteresis around the branch that compares latencies on a percentage basis. Updates #cleanup Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-30 | net/netcheck: ensure prior preferred DERP is always in netchecks | James Tucker | 2 | -17/+93
In an environment with unstable latency, such as upstream bufferbloat, there are cases where a full netcheck could drop the prior preferred DERP (likely home DERP) from future netcheck probe plans. This will then likely result in a home DERP having a missing sample on the next incremental netcheck, ultimately resulting in a home DERP move. This change does not fix our overall response to highly unstable latency, but it is an incremental improvement to prevent single spurious samples during a full netcheck from alone triggering a flapping condition, as now the prior changes to include historical latency will still provide the desired resistance, and the home DERP should not move unless latency is consistently worse over a 5 minute period. Note that there is a nomenclature and semantics issue remaining in the difference between a report preferred DERP and a home DERP. A report preferred DERP is aspirational: it is what will be picked as a home DERP if a home DERP connection needs to be established. A node's home DERP may be different from a recent preferred DERP, in which case a lot of netcheck logic is fallible. In future enhancements much of the DERP move logic should move to consider the home DERP, rather than the recent report preferred DERP. Updates #8603 Updates #13969 Signed-off-by: James Tucker <james@tailscale.com>
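The core of the change, keeping the prior preferred region in the next probe plan even when it falls out of the fastest set, can be sketched roughly like this. regionsToProbe and its parameters are hypothetical; region IDs are assumed to be positive integers with 0 meaning "none", and the real probe plan is more involved.

```go
package main

import "fmt"

// regionsToProbe is a hypothetical helper: it takes region IDs sorted by
// last-known latency (fastest first), keeps the fastest n, and then makes
// sure the prior preferred region is present even if it was not among them.
func regionsToProbe(sortedByLatency []int, prior, n int) []int {
	var out []int
	havePrior := false
	for i, r := range sortedByLatency {
		if i >= n {
			break
		}
		out = append(out, r)
		if r == prior {
			havePrior = true
		}
	}
	if prior != 0 && !havePrior {
		out = append(out, prior)
	}
	return out
}

func main() {
	// Region 1 was preferred, but one noisy sample pushed it to last place.
	fmt.Println(regionsToProbe([]int{2, 3, 4, 1}, 1, 3))
}
```

With the prior region always probed, the next incremental netcheck has a fresh sample for it, so a single bad measurement does not by itself look like the region disappearing.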
2024-10-22 | net/netcheck: add a Now field to the netcheck Report | Andrew Dunham | 2 | -7/+23
This allows us to print the time that a netcheck was run, which is useful in debugging. Updates #10972 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
2024-10-18 | net/netcheck: remove arbitrary deadlines from GetReport() tests (#13832) | Jordan Whited | 1 | -2/+29
GetReport() may have side effects when the caller enforces a deadline that is shorter than ReportTimeout. Updates #13783 Updates #13394 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-10 | net/netcheck: fix netcheck cli-triggered nil pointer deref (#13782) | Jordan Whited | 1 | -1/+1
Updates #13780 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-10-08 | net/netcheck: don't panic if a region has no Nodes | Andrew Dunham | 1 | -0/+4
Updates #13728 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I1e8319d6b2da013ae48f15113b30c9333e69cc0b
2024-09-17 | net/netcheck,wgengine/magicsock: plumb OnlyTCP443 controlknob through netcheck (#13491) | Jordan Whited | 1 | -14/+21
Updates tailscale/corp#17879 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-13 | wgengine/magicsock: remove redundant deadline from netcheck report call (#13395) | Jordan Whited | 2 | -4/+25
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline was causing GetReport() to never fall back to HTTPS/ICMP measurements as it was shorter than netcheck.stunProbeTimeout, leaving no time for fallbacks. Updates #13394 Updates #6187 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-09-04 | all: use new Go 1.23 slices.Sorted more | Brad Fitzpatrick | 1 | -8/+2
Updates #12912 Change-Id: If1294e5bc7b5d3cf0067535ae10db75e8b988d8b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-07-26 | health: introduce captive-portal-detected Warnable (#12707) | Andrea Gottardo | 2 | -126/+3
Updates tailscale/tailscale#1634 This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting. ipn/ipnlocal: fix captive portal loop shutdown Change-Id: I7cafdbce68463a16260091bcec1741501a070c95 net/captivedetection: fix mutex misuse ipn/ipnlocal: ensure that we don't fail to start the timer Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2024-06-06 | net/netcheck: fix probeProto.String result for IPv6 probes | Brad Fitzpatrick | 1 | -1/+1
This bug was introduced in e6b84f215 (May 2020) but was only used in tests when stringifying probeProto values on failure so it wasn't noticed for a long time. But then it was moved into non-test code in 8450a18aa (Jun 2024) and I didn't notice during the code movement that it was wrong. It's still only used in failure paths in logs, but having wrong/ambiguous debugging information isn't the best. Whoops. Updates tailscale/corp#20654 Change-Id: I296c727ed1c292a04db7b46ecc05c07fc1abc774 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-06-06 | net/netcheck: flesh out some logging in error paths | Brad Fitzpatrick | 2 | -15/+18
Updates tailscale/corp#20654 Change-Id: Ie190f956b864985668f79b5b986438bbe07ce905
2024-06-05 | all: use math/rand/v2 more | Maisem Ali | 1 | -2/+2
Updates #11058 Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-05-24 | net/netcheck: apply some polish suggested from #12161 | James Tucker | 1 | -6/+5
Apply some post-submit code review suggestions. Updates #12161 Updates tailscale/corp#19106 Signed-off-by: James Tucker <james@tailscale.com>
2024-05-21 | net/netcheck: remove hairpin probes | James Tucker | 2 | -254/+17
Palo Alto reported interpreting hairpin probes as LAND attacks, and the firewalls may be responding to this by prematurely shutting down NAT sessions that are otherwise in use. We don't currently make use of the outcome of the hairpin probes, and they contribute to other user confusion with e.g. the AirPort Extreme hairpin session workaround. In response, we decided to remove the whole probe feature. Updates #188 Updates tailscale/corp#19106 Updates tailscale/corp#19116 Signed-off-by: James Tucker <james@tailscale.com>
2024-05-17 | net/netcheck,wgengine/magicsock: add potential workaround for Palo Alto DIPP misbehavior | James Tucker | 2 | -18/+98
Palo Alto firewalls typically have a hard NAT, but also have a mode called Persistent DIPP that is supposed to provide consistent port mapping suitable for STUN resolution of public ports. Persistent DIPP works initially on most Palo Alto firewalls, but some models/software versions have a bug which this works around. The bug symptom presents as follows:
- STUN sessions resolve a consistent public IP:port to start with.
- Much later, netchecks report the same IP:port for a subset of sessions, most often the user's active DERP and/or the port related to sustained traffic.
- The broader set of DERPs in a full netcheck will now consistently observe a new IP:port.
- After this point of observation, new inbound connections will only succeed to the new IP:port observed, and existing/old sessions will only work to the old binding.
In this patch we now advertise the lowest latency global endpoint discovered as we always have, but in addition any global endpoints that are observed more than once in a single netcheck report. This should provide viable endpoints for potential connection establishment across a NAT with this behavior.
Updates tailscale/corp#19106 Signed-off-by: James Tucker <james@tailscale.com>
2024-05-07 | net/netcheck: do not add derps if IPv4/IPv6 is set to "none" | Maisem Ali | 1 | -4/+4
It was documented as such but seems to have been dropped in a refactor, restore the behavior. This brings down the time it takes to run a single integration test by 2s which adds up quite a bit. Updates tailscale/corp#19786 Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-05-06 | all: make more tests pass/skip in airplane mode | Brad Fitzpatrick | 2 | -7/+9
Updates tailscale/corp#19786 Change-Id: Iedc6730fe91c627b556bff5325bdbaf7bf79d8e6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-03 | net/netcheck: don't spam on ICMP socket permission denied errors | Brad Fitzpatrick | 1 | -6/+14
While debugging a failing test in airplane mode on macOS, I noticed netcheck logspam about ICMP socket creation permission denied errors. Apparently macOS just can't do those, or at least not in airplane mode. Not worth spamming about. Updates #cleanup Change-Id: I302620cfd3c8eabb25202d7eef040c01bd8a843c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-03 | derp/derphttp, net/netcheck: plumb netmon.Monitor to derp netcheck client | Brad Fitzpatrick | 1 | -1/+1
Fixes #11981 Change-Id: I0e15a09f93aefb3cfddbc12d463c1c08b83e09fd Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-28 | net/{interfaces,netmon}, all: merge net/interfaces package into net/netmon | Brad Fitzpatrick | 1 | -1/+1
In prep for most of the package funcs in net/interfaces to become methods in a long-lived netmon.Monitor that can cache things. (Many of the funcs are very heavy to call regularly, whereas the long-lived netmon.Monitor can subscribe to things from the OS and remember answers to questions it's asked regularly later) Updates tailscale/corp#10910 Updates tailscale/corp#18960 Updates #7967 Updates #3299 Change-Id: Ie4e8dedb70136af2d611b990b865a822cd1797e5 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-28 | net/netmon, add: add netmon.State type alias of interfaces.State | Brad Fitzpatrick | 2 | -5/+3
... in prep for merging the net/interfaces package into net/netmon. This is a no-op change that updates a bunch of the API signatures ahead of a future change to actually move things (and remove the type alias) Updates tailscale/corp#10910 Updates tailscale/corp#18960 Updates #7967 Updates #3299 Change-Id: I477613388f09389214db0d77ccf24a65bff2199c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-27 | net/netns, net/dns/resolver, etc: make netmon required in most places | Brad Fitzpatrick | 3 | -11/+6
The goal is to move more network state accessors to netmon.Monitor where they can be cheaper/cached. But first (this change and others) we need to make sure the one netmon.Monitor is plumbed everywhere. Some notable bits:
* tsdial.NewDialer is added, taking a now-required netmon
* because a tsdial.Dialer always has a netmon, anything taking both a Dialer and a NetMon is now redundant; take only the Dialer and get the NetMon from that if/when needed.
* netmon.NewStatic is added, primarily for tests
Updates tailscale/corp#10910 Updates tailscale/corp#18960 Updates #7967 Updates #3299 Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-26 | net/netcheck, wgengine/magicsock: make netmon.Monitor required | Brad Fitzpatrick | 2 | -33/+33
This has been a TODO for ages. Time to do it. The goal is to move more network state accessors to netmon.Monitor where they can be cheaper/cached. Updates tailscale/corp#10910 Updates tailscale/corp#18960 Updates #7967 Updates #3299 Change-Id: I60fc6508cd2d8d079260bda371fc08b6318bcaf1 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-05 | net/netcheck,wgengine/magicsock: align DERP frame receive time heuristics | James Tucker | 2 | -5/+6
The netcheck package and the magicsock package coordinate via the health package, but both sides have time-based heuristics through indirect dependencies. These were misaligned, so the implemented heuristic aimed at reducing DERP moves while there is active traffic was non-operational about 3/5ths of the time. It is presently problematic to set up a good test for this integration, so instead I added comment breadcrumbs along with the initial fix. Updates #8603 Signed-off-by: James Tucker <james@tailscale.com>
2024-02-07 | util/cmpx: delete now that we're using Go 1.22 | Brad Fitzpatrick | 1 | -3/+3
Updates #11058 Change-Id: I09dea8e86f03ec148b715efca339eab8b1f0f644 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-12-13 | net/netcheck: use DERP frames as a signal for home region liveness | Andrew Dunham | 2 | -12/+100
This uses the fact that we've received a frame from a given DERP region within a certain time as a signal that the region is still present (and thus can still be a node's PreferredDERP / home region) even if we don't get a STUN response from that region during a netcheck. This should help avoid DERP flaps that occur due to losing STUN probes while still having a valid and active TCP connection to the DERP server. RELNOTE=Reduce home DERP flapping when there's still an active connection Updates #8603 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: If7da6312581e1d434d5c0811697319c621e187a0
2023-12-13 | net/netcheck: only run HTTP netcheck for tamago clients | Andrea Barisani | 1 | -1/+1
Signed-off-by: Andrea Barisani <andrea@inversepath.com>
2023-08-11 | net/netcheck,wgengine/magicsock: reduce coupling between netcheck and magicsock | James Tucker | 3 | -121/+142
Netcheck no longer performs I/O itself, instead it makes requests via SendPacket and expects users to route reply traffic to ReceiveSTUNPacket. Netcheck gains a Standalone function that stands up sockets and goroutines to implement I/O when used in a standalone fashion. Magicsock now unconditionally routes STUN traffic to the netcheck.Client that it hosts, and plumbs the send packet sink. The CLI is updated to make use of the Standalone mode. Fixes #8723 Signed-off-by: James Tucker <james@tailscale.com>
2023-08-01 | ipnlocal, net/*: deprecate interfaces.GetState, use netmon more for it | Brad Fitzpatrick | 1 | -0/+2
Updates #cleanup Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-07-18 | net/netcheck: ignore PreferredDERP changes that are small | Andrew Dunham | 2 | -5/+42
If the absolute value of the difference between the current PreferredDERP's latency and the best latency is <= 10ms, don't change it and instead prefer the previous value. This is in addition to the existing hysteresis that tries to remain on the previous DERP region if the relative improvement is small, but handles nodes that have low latency to >1 DERP region better. Updates #8603 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I1e34c94178f8c9a68a69921c5bc0227337514c70
2023-07-13 | net/netcheck, tailcfg: add DERPHomeParams and use it | Andrew Dunham | 2 | -12/+81
This allows providing additional information to the client about how to select a home DERP region, such as preferring a given DERP region over all others. Updates #8603 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I7c4a270f31d8585112fab5408799ffba5b75266f
2023-06-07 | all: use cmpx.Or where it made sense | Brad Fitzpatrick | 1 | -8/+5
I left a few out where writing it explicitly was better for various reasons. Updates #8296 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-05-10 | net/netcheck: reenable TestBasic on Windows | James Tucker | 1 | -4/+0
This test was either fixed by intermediate changes or was mis-flagged as failing during #7876 triage. Updates #7876 Signed-off-by: James Tucker <jftucker@gmail.com>
2023-04-26 | net/ping,netcheck: add v6 pinging capabilities to pinger (#7971) | Charlotte Brandhorst-Satzkorn | 1 | -4/+1
This change adds a v6conn to the pinger to enable sending pings to v6 addrs. Updates #7826 Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
2023-04-22 | net/netcheck: fix crash when IPv6 kinda but not really works | Brad Fitzpatrick | 1 | -0/+11
Looks like on some systems there's an IPv6 address, but then opening an IPv6 UDP socket fails later. Probably some firewall. Tolerate it better and don't crash. To repro: change the "udp6" to something like "udp7" (something that'll fail) and run "go run ./cmd/tailscale netcheck" on a machine with active IPv6. It used to crash and now it doesn't. Fixes #7949 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-04-20 | all: avoid repeated default interface lookups | Mihai Parparita | 1 | -9/+23
On some platforms (notably macOS and iOS) we look up the default interface to bind outgoing connections to. This is both duplicated work and results in logspam when the default interface is not available (i.e. when a phone has no connectivity, we log an error and thus cause more things that we will try to upload and fail). Fixed by passing around a netmon.Monitor to more places, so that we can use its cached interface state. Fixes #7850 Updates #7621 Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-17 | net/netcheck: reenable TestNodeAddrResolve on Windows | James Tucker | 1 | -3/+28
Updates #7876 Co-authored-by: Andrew Dunham <andrew@du.nham.ca> Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Signed-off-by: James Tucker <james@tailscale.com> Change-Id: Idb2e6cc2edf6ca123b751d6c8f8729b0cba86023
2023-04-15 | wgengine/magicsock, types/nettype, etc: finish ReadFromUDPAddrPort netip migration | Brad Fitzpatrick | 1 | -9/+4
So we're staying within the netip.Addr/AddrPort consistently and avoiding allocs/conversions to the legacy net addr types. Updates #5162 Change-Id: I59feba60d3de39f773e68292d759766bac98c917 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>