[dns-operations] DNSViz Service Restoration

Matthew Pounsett matt at dns-oarc.net
Thu Mar 12 12:41:28 UTC 2020



> On Mar 12, 2020, at 07:04, Jim Popovitch via dns-operations <dns-operations at dns-oarc.net> wrote:
> 
> 
> From: Jim Popovitch <jimpop at domainmail.org>
> Subject: Re: [dns-operations] DNSViz Service Restoration
> Date: March 12, 2020 at 07:04:23 EDT
> To: dns-operations at lists.dns-oarc.net
> 
> 
> On March 12, 2020 5:04:23 AM UTC, Casey Deccio <casey at deccio.net> wrote:
>> 
>> Thanks for the perspective.  I believe there is value in being able to answer the question: "what did foo.example.net look like at time X?"
> 
> Sounds great.  I think the most important feature of dnsvis was the ability to link to a report to show a recent problem to others.  People haven't had that capability, in over a year, because someone else saw greater value in being able to show very very very old data.

While the snark may have sounded witty in your head, the decision-making was actually a lot less obvious than that.

Had we known it was going to be a year of hacking at a broken database, of course we’d have taken this route in the first place.  But when we first found that some corruption had been introduced, it wasn’t obvious that it would take very long to fix.  At every decision point along the way, it appeared as if we were no more than a month from having a functioning historical database.

At the OARC workshop in October, we thought we were hours away from announcing that it was back up and running with all of its historical data, but the import script running at that time was interrupted by the DB running up against its transaction limit, and we had to start a vacuum of the db.  That ran for another six weeks before failing on a full disk.

About six months in we started to consider the possibility of resetting the database and merging old data later, but that’s a much more complicated procedure.  It involves both fixing the corruption that broke the import in the first place AND massaging that data on import to avoid collisions with newly created rows that have unique constraints on them, all on top of the increased time it would take to do such an import while the service is active.  There’s also the risk that certain tests could never be imported as-is, because a new test’s reference name (the unique 6 characters in a specific test’s URL) could collide with an old test’s name, causing any stored URLs out there to show the wrong test data.
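
As a rough illustration of the collision problem described above, here is a minimal Python sketch. It is not taken from the DNSViz code base: the function name, the (name, payload) tuple layout, and the in-memory set of live names are all hypothetical stand-ins for whatever the real schema and import script use. It only shows the idea of separating old tests that can be imported under their original reference name from those whose name has already been taken by a newer test.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record shape: (reference_name, payload).
TestRow = Tuple[str, str]


def partition_old_tests(
    old_tests: Iterable[TestRow], live_names: Set[str]
) -> Tuple[List[TestRow], List[TestRow]]:
    """Split old rows into those safe to import and those that collide.

    A row collides when its reference name is already used by a row in
    the live database, or by an earlier row in the same import batch.
    """
    taken = set(live_names)  # copy so the caller's set is not modified
    importable: List[TestRow] = []
    colliding: List[TestRow] = []
    for name, payload in old_tests:
        if name in taken:
            # Importing this row as-is would make an existing URL
            # resolve to the wrong test, so set it aside.
            colliding.append((name, payload))
        else:
            importable.append((name, payload))
            taken.add(name)
    return importable, colliding


if __name__ == "__main__":
    old = [("abc123", "old test A"), ("xyz789", "old test B")]
    live = {"xyz789"}
    ok, clash = partition_old_tests(old, live)
    print("importable:", ok)
    print("colliding:", clash)
```

The rows in the colliding list are the ones that could never be imported as-is; this sketch deliberately does not decide what to do with them.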

And Casey isn’t the only one who looks at—or links to—old tests; there are web sites out there with links to old tests used as a historical record or as case studies of the ways DNS can be broken, so it still seems useful to get those tests back online somehow.

Matt Pounsett
DNS-OARC Systems Engineering


