[dns-operations] Filtering policy: false positive rate
Vladimír Čunát
vladimir.cunat+ietf at nic.cz
Thu Feb 8 09:14:52 UTC 2024
On 06/02/2024 17.06, Peter Thomassen wrote:
> Then, how to define a false positive rate?
>
> Look at all blocked queries, and do a post-hoc investigation?
>
> How about popularity -- should one factor in that blocking *.ddns.net
> is more severe than blocking *.blank.page? I.e., is it a ratio of
> blocked/total queries, or blocked/total names?
Yes, primarily post-hoc I expect - I mean, if we could easily recognize
false positives in advance, we'd do that during the blocking, right?
I'd do this statistically. Take a sample from the blocked names. You
could weight the names with whatever you like when choosing among them,
e.g. the mentioned popularity by unique IPs querying them. Then
evaluate the sample in some more reliable way, probably by a human. You could
mix in a sample from non-blocked names, too, as a control (see [1]).
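
A minimal sketch of the sampling step described above, under stated assumptions rather than as a definitive method. The names `sample_blocked_names` and `estimate_fp_rate`, the `weights` mapping (e.g. unique client IPs per blocked name), and the `is_false_positive` callable (standing in for the human verdict) are all hypothetical. Names are drawn with replacement and with probability proportional to their weight, so the plain fraction of false positives in the sample estimates the weight-weighted false-positive rate.

```python
import random


def sample_blocked_names(weights, k, seed=None):
    """Draw k blocked names with replacement, proportionally to weight.

    `weights` maps each blocked name to a non-negative weight, for example
    the number of unique client IPs that queried it (an assumption here).
    """
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


def estimate_fp_rate(sample, is_false_positive):
    """Return the fraction of sampled names judged to be false positives.

    `is_false_positive` stands in for the manual review of each name.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    verdicts = [bool(is_false_positive(name)) for name in sample]
    return sum(verdicts) / len(verdicts)


if __name__ == "__main__":
    # Hypothetical data: blocked name -> unique querying IPs.
    weights = {"a.example": 900, "b.example": 90, "c.example": 10}
    sample = sample_blocked_names(weights, k=200, seed=1)
    # Pretend a reviewer decided that only b.example was wrongly blocked.
    print(estimate_fp_rate(sample, lambda name: name == "b.example"))
```

The same two functions could be run on a sample of non-blocked names to obtain a control estimate for comparison.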
I think it's not difficult to design these measurements in a way that
you get an OK ratio of complexity (and human work) vs. precision of the
false-positive estimate. Actually I suspect that it's probably *not*
worth letting user reports influence which names get sampled for evaluation,
as it's probably very hard to get statistically correct-ish
numbers out of that.
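
To illustrate the trade-off between human work and precision, here is a hedged sketch that computes a Wilson score interval for the false-positive rate from the number of reviewed names. It assumes the reviewed names were drawn independently (as in weighted sampling with replacement) and that each verdict is a simple yes/no. The function name `wilson_interval` and the default `z=1.96` (roughly 95% confidence) are choices made for this example, not something taken from the thread.

```python
import math


def wilson_interval(false_positives, n, z=1.96):
    """Return (low, high) bounds of the Wilson score interval.

    `false_positives` is the count of reviewed names judged to be false
    positives, `n` is the total number of reviewed names, and `z` is the
    normal quantile for the desired confidence level.
    """
    if n <= 0 or not 0 <= false_positives <= n:
        raise ValueError("need 0 <= false_positives <= n and n > 0")
    p = false_positives / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Same observed rate (5%), different amounts of human review.
    for n in (100, 400, 1600):
        low, high = wilson_interval(round(0.05 * n), n)
        print(n, round(low, 4), round(high, 4))
```

Reviewing four times as many names roughly halves the width of the interval, which gives a way to decide how much manual evaluation a target precision would cost.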
[1] https://en.wikipedia.org/wiki/Scientific_control#Controlled_experiments
(reposted from a correct e-mail address; Peter will probably get a
duplicate)
--Vladimir | knot-resolver.cz