[dns-operations] Filtering policy: false positive rate

Thu Feb 8 09:14:52 UTC 2024

On 06/02/2024 17.06, Peter Thomassen wrote:
> Then, how to define a false positive rate?
>
> Look at all blocked queries, and do a post-hoc investigation?
>
> How about popularity -- should one factor in that blocking *.ddns.net 
> is more severe than blocking *.blank.page? I.e., is it a ratio of 
> blocked/total queries, or blocked/total names? 

Yes, primarily post-hoc, I expect.  If we could easily recognize 
false positives in advance, we'd do that during the blocking, right?  
I'd do this statistically: take a sample from the blocked names.  You 
could weight the names however you like when choosing among them, 
e.g. by the mentioned popularity, measured as unique IPs querying them.  
Then evaluate the sample in some more reliable way, probably by a human.  
You could mix in a sample from non-blocked names, too, as a control 
(see [1]).
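
A minimal sketch of that sampling in Python, not a definitive 
implementation.  The names, the popularity counts and the set of human 
verdicts are all made up for illustration.  Names are drawn with 
replacement, with probability proportional to their weight, so the plain 
fraction of false positives in the sample estimates the 
popularity-weighted rate; with equal weights it would estimate the 
per-name rate instead.

```python
import random

def sample_names(weights, k, seed=None):
    """Draw k names with replacement, each with probability proportional to its weight."""
    rng = random.Random(seed)
    names = sorted(weights)  # fixed order so that a given seed repeats the same sample
    return rng.choices(names, weights=[weights[n] for n in names], k=k)

def false_positive_rate(sample, is_false_positive):
    """Fraction of sampled names that the reviewer judged to be wrongly blocked."""
    if not sample:
        raise ValueError("empty sample")
    return sum(1 for name in sample if is_false_positive(name)) / len(sample)

# Hypothetical data: blocked name -> number of unique client IPs that queried it.
blocked = {"tracker.example": 500, "homelab.example": 40, "malware.example": 5}
# Hypothetical human verdicts, as they might be recorded after manual review.
wrongly_blocked = {"homelab.example"}

sample = sample_names(blocked, k=200, seed=1)
rate = false_positive_rate(sample, lambda name: name in wrongly_blocked)
print(f"estimated popularity-weighted false-positive rate: {rate:.3f}")
```

A name can be drawn more than once, so the human only needs to review 
the distinct names in the sample.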

I don't think it's difficult to design these measurements so that you 
get an acceptable trade-off between complexity (and human work) and the 
precision of the false-positive estimate.  Actually, I suspect that it's 
probably *not* worth letting user reports influence which names get 
evaluated, as it's probably very hard to get statistically sound 
numbers out of that.
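
To give a feel for the work-versus-precision trade-off, here is a rough 
Python sketch that puts a confidence interval around a false-positive 
estimate.  It uses the Wilson score interval, which is my choice for 
illustration and assumes the evaluated names were drawn independently at 
random; the counts in the loop are hypothetical.

```python
import math

def wilson_interval(false_positives, sample_size, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% confidence."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = false_positives / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(p * (1 - p) / sample_size
                         + z * z / (4 * sample_size ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical review outcomes: the same 5% observed rate at growing sample sizes.
for n in (100, 400, 1600):
    low, high = wilson_interval(n // 20, n)
    print(f"n={n}: interval width {high - low:.3f}")
```

The interval width shrinks roughly with the square root of the sample 
size, so halving the uncertainty costs about four times the human 
review work.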

[1] https://en.wikipedia.org/wiki/Scientific_control#Controlled_experiments

(reposted from a correct e-mail address; Peter will probably get a 
duplicate)

--Vladimir | knot-resolver.cz