<div dir="ltr"><div>Hi everyone. It's about time for an update anyway, so with thanks to Viktor for bringing this up, and with my OARC contractor hat on...</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 28 Jun 2019 at 04:07, Viktor Dukhovni &lt;<a href="mailto:ietf-dane@dukhovni.org">ietf-dane@dukhovni.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">$subject. It has been degraded for quite some time now. I am no<br>
longer optimistic about a return to full functionality.<br></blockquote><div><br></div><div>TLDR:<br></div><div>The issue with DNSViz's database is as much a question of time as technology. I'm expecting it to be back up in about a week, give or take a long weekend, but I have been sidelined enough times in the last couple of months that I'm obviously hesitant to make firm promises.</div><div><br></div><div>Tell Me Everything, I Want to Know (I'm gonna make TMEIWTN the new TLDR): </div><div>I think I've given this general rundown here before, but here it is again with a bit more detail, and an update on recent events.</div><div><br></div><div>When Verisign was transferring the DNSViz service to OARC, we copied the database (across country) to one of our large file stores for safekeeping. Since the servers were going to spend the better part of a week on trucks, we were justifiably worried about the risk of data corruption or outright hardware failure in transit. The server arrived with some RAID errors. We had also discussed prior to shipping that the server's drives were configured with maximum performance in mind (at the expense of available storage), but the database was getting dangerously close to filling up the volume. We're not able to do a forklift upgrade of all of the drives this year, so we were already considering reconfiguring the RAID to some middle ground to gain back some space at the expense of some performance. This seemed preferable to throwing out older data. The site visit where I received and installed the DNSViz hardware was the same visit in which I rebuilt OARC's entire physical plant, and I unfortunately ran out of time to do the rebuild of the database server while on-site. 
That meant that the server would need to be rebuilt remotely, but we didn't expect that to be a problem.</div><div><br></div><div>Anyone who does systems operations will be familiar with the issues surrounding remote management of any servers produced in the last 15 years, and prior to about 2 years ago: the Java-based remote consoles they depend on are no longer compatible with modern web browsers, which have all deprecated the security barn door that was the plugin architecture that Java-based browser apps used. Newer systems have an HTML5 alternative available, but most of OARC's Dell hardware is from this dark period, and we (like everyone else) have developed workarounds involving VMs running old versions of Linux with old versions of Firefox and some third-party plugins. Unfortunately, the HP hardware that runs DNSViz is also from the dark period, but resistant to those workarounds. We burned a fair bit of time trying to make them work, though.</div><div><br></div><div>HP seems to only officially support the .NET interface to their older systems now, which means running Windows somewhere. Perhaps unsurprisingly, it appears that Windows 10 doesn't play nice with KVM's bridged networking. We also burned a lot of time trying to make that work. </div><div><br></div><div>In an average operation, all of this would not result in many weeks going by, but you must also remember that OARC is a small shop with one pair of hands per function. And in my case, it's only 75% of a pair of hands. So, instead of having 200 or 400 hours of sysadmin time available per week, OARC has 30. OARC has always done a lot of things with the limited resources it has; in the time since we received the DNSViz hardware this pair of hands has been involved in: rebuilding the physical plant, participating in IETF104, migrating to an entirely new user portal, running a DITL collection, and organizing and running an OARC workshop in Bangkok. 
And that doesn't get into the (literally) fifty other little services that OARC runs (some public, some only for members, some internal) that need attention or the less visible systems and network issues that need to be managed... things like patching this month's remote TCP exploit.</div><div><br></div><div><div>We did briefly try running the database from the fileserver where the backup is stored. We knew it would be a bit short on memory and expected some performance issues, but quickly ran into a conflict between the database and the memory requirements of the OS trying to operate the filesystem itself, and started to get CRC errors from the filesystem. That was our only option for alternate hardware to run the database on, and we will not be bringing it back up there because of the risk of corruption not only to the DNSViz database but also to this year's DITL collection, which is on that same filesystem (it's the only one with enough space to hold them at the moment).</div><div><br></div></div><div>Where things currently stand is that as of yesterday, with the help of remote hands shuffling things around, I have some external hardware hooked up to the database server, which gives me access to its console, and a USB key with an OS installer in place. Today I'm beginning work on actually rebuilding the RAID and reinstalling the server. Once that's done, we'll have to wait several days to a week for the data to be copied back to the server, at which point we'll be able to rebuild indexes, do testing, etc., and bring it back into service.</div><div><br></div><div>If everything goes perfectly (and we all know how often that happens) that probably means we could bring it back online next Thursday. That is also the first day of a US long weekend though, so even if we're ready then we may wait until the following week. </div><div><br></div><div>We understand how important DNSViz has become to many people. 
I use it myself on a very regular basis, both in my function as OARC's systems engineer and in my other work, and I did so in past lives as well. It's been an important and useful tool to me since Casey first introduced it. We're not taking the lack of historical data lightly, but we have to balance our use of resources carefully. In the final calculation DNSViz mostly works as-is (only features related to the historical data are missing), and it has cost a lot of time (which is also money) to get this server running again, and other things have still had to get done. </div><div><br></div><div>If you want to help OARC have more resources to spread around, please consider becoming a member. We've been investigating other ways to support OARC's services, but at the moment the lion's share comes from annual membership dues. OARC also accepts donations at &lt;<a href="https://www.dns-oarc.net/donate">https://www.dns-oarc.net/donate</a>&gt;. OARC is a 501(c)(3) not-for-profit in the US, so you will receive a tax receipt that will be useful at least to US individuals and corporations.</div><div><br></div><div>And that brings to a close this very long explanation for the continued lack of historical data in the DNSViz interface. If you made it this far, thanks for reading. Also, if you made it this far, I am quite jealous of your free time. </div></div></div>