path: root/prober
Age  Commit message  Author  Files  Lines
2025-09-07  prober: include current probe results in run-probe text response  Anton Tolchanov  2  -8/+10
It was a bit confusing that provided history did not include the current probe results. Updates tailscale/corp#20583 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2025-08-19  prober: update runall handler to be generic (#16895)  Mike O'Driscoll  2  -2/+72
Update the runall handler to be more generic, with an exclude param that excludes multiple probes as defined by the requester. Updates tailscale/corp#27370 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-08-16  cmd/derpprobe,prober: add run all probes handler (#16875)  Mike O'Driscoll  2  -1/+181
Add a Run all probes handler that executes all probes except those that are continuous or the derpmap probe. This is leveraged by other tooling to confirm DERP stability after a deploy. Updates tailscale/corp#27370 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-06-16  prober: speed up TestCRL ~450x by baking in some test keys  Brad Fitzpatrick  1  -12/+53
Fixes #16290 Updates tailscale/corp#28679 Change-Id: Ic90129b686779d0ed1cb40acf187cfcbdd39eb83 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-06-13  prober: record DERP dropped packets as they occur  James Tucker  1  -0/+20
Record dropped packets as soon as they time out, rather than after tx record queues spill over. This will more accurately capture small amounts of packet loss in a timely fashion. Updates tailscale/corp#24522 Signed-off-by: James Tucker <james@tailscale.com>
2025-06-10  cmd/{derp,derpprobe},prober,derp: add mesh support to derpprobe (#15414)  Mike O'Driscoll  1  -17/+25
Add mesh key support to derpprobe for probing derpers with verify set to true. Move MeshKey checking to central point for code reuse. Fix a bad error fmt msg. Fixes tailscale/corp#27294 Fixes tailscale/corp#25756 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-05-20  prober: update header check test (#15993)  Mike O'Driscoll  1  -10/+29
Use of the httptest client doesn't render header ordering as expected. Use http.DefaultClient for the test to ensure that the header ordering test is valid. Updates tailscale/corp#27370 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-05-16  prober: correct content-type response (#15989)  Mike O'Driscoll  2  -1/+4
Content-type was responding as text/plain for probes accepting application/json. Set the content type header before setting the response code to correct this. Updates tailscale/corp#27370 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
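The ordering matters because Go's net/http takes a snapshot of the response headers when WriteHeader is called, so a header set afterwards never reaches the client. The following is a minimal sketch of that behavior, not the prober's actual handler: writeJSON, writeJSONTooLate, and the other helpers are hypothetical names, and httptest.ResponseRecorder stands in for a real connection.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSON (hypothetical helper) sets Content-Type before the status code,
// so the header is part of the snapshot taken by WriteHeader.
func writeJSON(w http.ResponseWriter, code int, body string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	fmt.Fprint(w, body)
}

// writeJSONTooLate (hypothetical helper) reproduces the ordering bug:
// the header is set after WriteHeader and is therefore not sent.
func writeJSONTooLate(w http.ResponseWriter, code int, body string) {
	w.WriteHeader(code)
	w.Header().Set("Content-Type", "application/json") // too late, ignored
	fmt.Fprint(w, body)
}

// contentTypeOf runs h against a recorder and returns the Content-Type
// that was captured in the response.
func contentTypeOf(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Result().Header.Get("Content-Type")
}

// goodContentType exercises the correct ordering.
func goodContentType() string {
	return contentTypeOf(func(w http.ResponseWriter, _ *http.Request) {
		writeJSON(w, http.StatusOK, `{"ok":true}`)
	})
}

// lateContentType exercises the buggy ordering.
func lateContentType() string {
	return contentTypeOf(func(w http.ResponseWriter, _ *http.Request) {
		writeJSONTooLate(w, http.StatusOK, `{"ok":true}`)
	})
}

func main() {
	fmt.Printf("set before WriteHeader: %q\n", goodContentType())
	fmt.Printf("set after WriteHeader:  %q\n", lateContentType())
}
```

With the recorder, the late header is simply missing. A real net/http server would likely fill the gap by sniffing the body, which for a small JSON payload typically yields a text/plain content type, consistent with the symptom described in the commit message.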
2025-05-13  prober: fix test logic (#15952)  Mike O'Driscoll  1  -2/+2
Catch failing tests that have no expected error string. Updates #15912 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-05-12  prober: update cert check for prober (#15919)  Mike O'Driscoll  2  -82/+137
OCSP has been removed from the LE certs. Use CRL verification instead. If a cert provides a CRL, check its revocation status; if no CRL is provided and the cert is otherwise valid, pass the check. Fixes #15912 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com> Co-authored-by: Simon Law <sfllaw@tailscale.com>
2025-03-25  prober: add address family label for udp metrics (#15413)  Mike O'Driscoll  2  -1/+17
Add a label which differentiates the address family for STUN checks. Also initialize the derpprobe_attempts_total and derpprobe_seconds_total metrics by adding 0 for the alternate fail/ok case. Updates tailscale/corp#27249 Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2025-02-05  all: use new LocalAPI client package location  Brad Fitzpatrick  1  -2/+2
It was moved in f57fa3cbc30e. Updates tailscale/corp#22748 Change-Id: I19f965e6bded1d4c919310aa5b864f2de0cd6220 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-01-30  prober: support multiple probes running concurrently  Anton Tolchanov  2  -15/+105
Some probes might need to run for longer than their scheduling interval, so this change relaxes the 1-at-a-time restriction, allowing us to configure probe concurrency and timeout separately. The default values remain the same (concurrency of 1; timeout of 80% of interval). Updates tailscale/corp#25479 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
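The defaults described above (concurrency of 1, timeout of 80% of the interval) can be sketched as follows. This is an illustration under assumptions, not the prober package's API: probeConfig, withDefaults, and limiter are hypothetical names, and a buffered channel is one common way to cap how many runs of a probe overlap.

```go
package main

import (
	"fmt"
	"time"
)

// probeConfig is a hypothetical stand-in for a probe's scheduling options.
type probeConfig struct {
	Interval    time.Duration // how often the probe is scheduled
	Timeout     time.Duration // zero means "use the default"
	Concurrency int           // zero means "use the default"
}

// withDefaults fills in the defaults named in the commit message:
// a timeout of 80% of the interval and a concurrency of 1.
func (c probeConfig) withDefaults() probeConfig {
	if c.Timeout <= 0 {
		c.Timeout = c.Interval * 8 / 10
	}
	if c.Concurrency <= 0 {
		c.Concurrency = 1
	}
	return c
}

// limiter caps the number of concurrent runs using a buffered channel.
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }

// tryAcquire reports whether a run slot was free and, if so, takes it.
func (l limiter) tryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by tryAcquire.
func (l limiter) release() { <-l }

func main() {
	cfg := probeConfig{Interval: 10 * time.Second}.withDefaults()
	fmt.Println("timeout:", cfg.Timeout, "concurrency:", cfg.Concurrency)

	lim := newLimiter(cfg.Concurrency)
	fmt.Println("first run admitted:", lim.tryAcquire())
	fmt.Println("overlapping run admitted:", lim.tryAcquire())
	lim.release()
}
```

Keeping the timeout and the concurrency limit as separate settings is what lets a long-running probe use a timeout larger than its interval without blocking the next scheduled run.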
2025-01-21  prober: fix nil pointer access in tcp-in-tcp probes  Percy Wegmann  1  -0/+2
If unable to accept a connection from the bandwidth probe listener, return from the goroutine immediately since the accepted connection will be nil. Updates tailscale/corp#25958 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-15  prober: remove DERP pub key copying overheads in qd and non-tun measures (#14659)  Jordan Whited  1  -6/+10
Updates tailscale/corp#25883 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-01-15  prober: remove per-packet DERP pub key copying overheads (#14658)  Jordan Whited  1  -2/+4
Updates tailscale/corp#25883 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-01-13  prober: record total bytes transferred in DERP bandwidth probes  Percy Wegmann  1  -8/+14
This will enable Prometheus queries to look at the bandwidth over time windows, for example 'increase(derp_bw_bytes_total[1h]) / increase(derp_bw_transfer_time_seconds_total[1h])'. Fixes commit a51672cafd8b6c4e87915a55bda1491eb7cbee84. Updates tailscale/corp#25503 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-10  prober: support filtering regions by region ID in addition to code  Percy Wegmann  2  -19/+19
Updates tailscale/corp#25758 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-09  prober: record total bytes transferred in DERP bandwidth probes  Percy Wegmann  1  -0/+1
This will enable Prometheus queries to look at the bandwidth over time windows, for example 'increase(derp_bw_bytes_total[1h]) / increase(derp_bw_transfer_time_seconds_total[1h])'. Updates tailscale/corp#25503 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2025-01-08  prober: clone histogram buckets before handing to Prometheus for derp_qd_probe_delays_seconds  Percy Wegmann  1  -1/+2
Updates tailscale/corp#25697 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-12-20  prober: make histogram buckets cumulative  Percy Wegmann  2  -2/+1
Histogram buckets should include counts for all values under the bucket ceiling, not just those between the ceiling and the next lower ceiling. See https://prometheus.io/docs/tutorials/understanding_metric_types/#histogram Updates tailscale/corp#24522 Signed-off-by: Percy Wegmann <percy@tailscale.com>
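The difference between per-bucket and cumulative counts can be illustrated with a short sketch. The cumulative function and the sample bucket bounds are hypothetical and are not taken from the prober code; the sketch assumes the counts are ordered by ascending bucket ceiling.

```go
package main

import "fmt"

// cumulative converts per-bucket counts, ordered by ascending bucket
// ceiling, into cumulative counts: each output entry is the number of
// observations at or below that bucket's ceiling.
func cumulative(perBucket []uint64) []uint64 {
	out := make([]uint64, len(perBucket))
	var running uint64
	for i, c := range perBucket {
		running += c
		out[i] = running
	}
	return out
}

func main() {
	ceilings := []float64{0.005, 0.01, 0.05} // illustrative bounds, in seconds
	counts := []uint64{3, 2, 5}              // observations falling in each band
	for i, c := range cumulative(counts) {
		fmt.Printf("le=%g count=%d\n", ceilings[i], c)
	}
}
```

For the sample input, the band counts 3, 2, 5 become the cumulative counts 3, 5, 10, which is the shape Prometheus expects for histogram buckets.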
2024-12-19  cmd/derpprobe,prober: add ability to perform continuous queuing delay measurements against DERP servers  Percy Wegmann  7  -40/+411
This new type of probe sends DERP packets sized similarly to CallMeMaybe packets at a rate of 10 packets per second. It records the round-trip times in a Prometheus histogram. It also keeps track of how many packets are dropped. Packets that fail to arrive within 5 seconds are considered dropped. Updates tailscale/corp#24522 Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-12-16  prober: fix WithBandwidthProbing behavior with optional tunAddress  Brad Fitzpatrick  1  -6/+8
1ed9bd76d682299376f404521cf1958a7f9bea7a meant to make tunAddress be optional. Updates tailscale/corp#24635 Change-Id: Idc4a8540b294e480df5bd291967024c04df751c0 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-12-13  prober: perform DERP bandwidth probes over TUN device to mimic real client  Percy Wegmann  4  -15/+397
Updates tailscale/corp#24635 Co-authored-by: Mario Minardi <mario@tailscale.com> Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-12-10  prober,derp/derphttp: make dev-mode DERP probes work without TLS (#14347)  Mario Minardi  1  -12/+16
Make dev-mode DERP probes work without TLS. Properly dial port `3340` when not using HTTPS when dialing nodes in `derphttp_client`. Skip verifying TLS state in `newConn` if we are not running a prober. Updates tailscale/corp#24635 Signed-off-by: Percy Wegmann <percy@tailscale.com> Co-authored-by: Percy Wegmann <percy@tailscale.com>
2024-11-15  cmd/derpprobe,prober: add ability to restrict derpprobe to a single region  Percy Wegmann  2  -2/+52
Updates #24522 Co-authored-by: Mario Minardi <mario@tailscale.com> Signed-off-by: Percy Wegmann <percy@tailscale.com>
2024-08-08  prober: make status page more clear  Anton Tolchanov  3  -7/+8
Updates tailscale/corp#20583 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06  prober: support JSON response in RunHandler  Anton Tolchanov  2  -2/+119
Updates tailscale/corp#20583 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06  prober: add a status page handler  Anton Tolchanov  2  -0/+256
This change adds an HTTP handler with a table showing a list of all probes, their status, and a button that allows triggering a specific probe. Updates tailscale/corp#20583 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-08-06  prober: add an HTTP endpoint for triggering a probe  Anton Tolchanov  2  -40/+311
- Keep track of the last 10 probe results and successful probe latencies;
- Add an HTTP handler that triggers a given probe by name and returns its result as a plaintext HTML page, showing recent probe results as a baseline.
Updates tailscale/corp#20583 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
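Keeping only the most recent probe results is a bounded-history problem. The sketch below shows one simple way to do it in Go; the history type, its record method, and the maxHistory constant are hypothetical and do not reflect how the prober package stores results.

```go
package main

import "fmt"

// maxHistory is the number of recent results to keep; the value 10
// matches the commit message, the name is an assumption.
const maxHistory = 10

// history holds the most recent probe outcomes, oldest first.
type history struct {
	results []bool
}

// record appends an outcome and drops the oldest entries once more
// than maxHistory results have been stored.
func (h *history) record(ok bool) {
	h.results = append(h.results, ok)
	if len(h.results) > maxHistory {
		h.results = h.results[len(h.results)-maxHistory:]
	}
}

func main() {
	var h history
	// Two failures followed by ten successes: the failures age out.
	h.record(false)
	h.record(false)
	for i := 0; i < maxHistory; i++ {
		h.record(true)
	}
	fmt.Println(len(h.results), h.results)
}
```

A handler that triggers a probe on demand could then print this slice next to the fresh result to give the reader a baseline.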
2024-07-09  prober: propagate DERPMap request creation errors  Anton Tolchanov  1  -1/+1
Updates tailscale/corp#8497 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-06-06  cmd/derpprobe: support 'local' derpmap to get derp map via LocalAPI  Brad Fitzpatrick  1  -22/+37
To make it easier for people to monitor their custom DERP fleet. Updates tailscale/corp#20654 Change-Id: Id8af22936a6d893cc7b6186d298ab794a2672524 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-15  prober: plumb a now-required netmon to derphttp  Brad Fitzpatrick  1  -1/+2
Updates #11896 Change-Id: Ie2f9cd024d85b51087d297aa36c14a9b8a2b8129 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-27  net/netns, net/dns/resolver, etc: make netmon required in most places  Brad Fitzpatrick  1  -1/+2
The goal is to move more network state accessors to netmon.Monitor where they can be cheaper/cached. But first (this change and others) we need to make sure the one netmon.Monitor is plumbed everywhere. Some notable bits:
* tsdial.NewDialer is added, taking a now-required netmon
* because a tsdial.Dialer always has a netmon, anything taking both a Dialer and a NetMon is now redundant; take only the Dialer and get the NetMon from that if/when needed.
* netmon.NewStatic is added, primarily for tests
Updates tailscale/corp#10910 Updates tailscale/corp#18960 Updates #7967 Updates #3299 Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-16  all: use Go 1.22 range-over-int  Brad Fitzpatrick  2  -2/+2
Updates #11058 Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-08  prober: export probe class and metrics from bandwidth prober  Anton Tolchanov  10  -116/+215
- Wrap each prober function into a probe class that allows associating metric labels and custom metrics with a given probe;
- Make sure all existing probe classes set a `class` metric label;
- Move bandwidth probe size from being a metric label to a separate gauge metric; this will make it possible to use it to calculate average used bandwidth using a PromQL query;
- Also export transfer time for the bandwidth prober (more accurate than the total probe time, since it excludes connection establishment time).
Updates tailscale/corp#17912 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-04-08  prober: remove unused notification code  Anton Tolchanov  1  -139/+0
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-04-04  prober: support creating multiple probes in ForEachAddr  Andrew Dunham  3  -22/+26
So that we can e.g. check TLS on multiple ports for a given IP. Updates tailscale/corp#16367 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I81d840a4c88138de1cbb2032b917741c009470e6
2024-04-04  prober: add helper function to check all IPs for a DNS hostname  Andrew Dunham  3  -0/+339
This allows us to check all IP addresses (and address families) for a given DNS hostname while dynamically discovering new IPs and removing old ones as they're no longer valid. Also add a testable example that demonstrates how to use it. Alternative to #11610 Updates tailscale/corp#16367 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I6d6f39bafc30e6dfcf6708185d09faee2a374599
2024-03-13  prober: add a DERP bandwidth probe  Anton Tolchanov  2  -92/+366
Updates tailscale/corp#17912 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-03-13  prober: remove unused derp prober latency measurements  Anton Tolchanov  1  -33/+24
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-03-13  prober: export probe counters and cumulative latency  Anton Tolchanov  1  -1/+18
Updates #cleanup Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2024-02-19  prober: add TLS probe constructor to split dial addr from cert name  Brad Fitzpatrick  2  -11/+20
So we can probe load balancers by their unique DNS name but without asking for that cert name. Updates tailscale/corp#13050 Change-Id: Ie4c0a2f951328df64281ed1602b4e624e3c8cf2e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-12-13  prober: log HTTP response body on failure  Anton Tolchanov  1  -1/+6
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2023-11-07  Add support for custom DERP port in TLS prober  Thomas Kosiewski  1  -1/+7
Updates #10146 Signed-off-by: Thomas Kosiewski <thoma471@googlemail.com>
2023-06-21  prober: fix data race when altering derpmap (#8397)  valscale  1  -2/+3
Move the clearing of STUNOnly flag to the updateMap() function. Fixes #8395 Signed-off-by: Val <valerie@tailscale.com>
2023-06-20  prober: allow monitoring of nodes marked as STUN only in default derpmap (#8391)  valscale  1  -0/+2
prober uses NewRegionClient() to connect to a derper using a faked up single-node region, but NewRegionClient() fails to connect if there is no non-STUN only client in the region. Set the STUN only flag to false before we call NewRegionClient() so we can monitor nodes marked as STUN only in the default derpmap. Updates #11492 Signed-off-by: Val <valerie@tailscale.com>
2023-04-20  all: avoid repeated default interface lookups  Mihai Parparita  1  -1/+1
On some platforms (notably macOS and iOS) we look up the default interface to bind outgoing connections to. This is both duplicated work and results in logspam when the default interface is not available (i.e. when a phone has no connectivity, we log an error and thus cause more things that we will try to upload and fail). Fixed by passing around a netmon.Monitor to more places, so that we can use its cached interface state. Fixes #7850 Updates #7621 Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-17  various: add golangci-lint, fix issues (#7905)  Andrew Dunham  1  -1/+1
This adds an initial and intentionally minimal configuration for golang-ci, fixes the issues reported, and adds a GitHub Action to check new pull requests against this linter configuration. Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I8f38fbc315836a19a094d0d3e986758b9313f163
2023-04-11  prober: migrate to Prometheus metric library  Anton Tolchanov  2  -198/+121
This provides an example of using native Prometheus metrics with tsweb. Prober library seems to be the only user of PrometheusVar, so I am removing support for it in tsweb. Updates https://github.com/tailscale/corp/issues/10205 Signed-off-by: Anton Tolchanov <anton@tailscale.com>