discuss - Re: [opennic-discuss] Policy proposal for removal of non-responding T2 servers

  • From: Zach Gibbens <infocop411 AT>
  • To: discuss AT
  • Subject: Re: [opennic-discuss] Policy proposal for removal of non-responding T2 servers
  • Date: Wed, 10 Oct 2012 06:45:41 -0400

Personally, I think that if it's a config issue, one passing test should
be enough. If it's a connection issue, I can see waiting a little in
case it's a fluke, but I'd think something shorter, like 6 hours, would
work, or if an operator emails dns-operators about its health and it
passes four tests.

While I agree that connection issues are worse in some ways, config
errors are easier to fix, and connection errors might be outside the
operator's control. That is why, despite the operator's email, I'm still
recommending passing four tests, to make sure it can stay up for an

As for how long until it moves offline, I think a week for both.
It's pulled into a temp outage on failure as is. I can see connection
issues potentially taking a day or two to fix (some ISPs just aren't
in a hurry to fix their network). A week isn't common; it'd take a
flood, earthquake, or tornado to knock it off longer than a week (and
we've had each of those affect someone in the past year).
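
To make the numbers above concrete, here is a rough sketch of how they
might look as rules. This is not Jeff's actual testing code: every name
here (ServerState, should_mark_offline, should_mark_online) is
hypothetical, and reading the "operator email plus four tests" path as a
shortcut for connection issues is my interpretation of the suggestion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the suggestions in this message; all are assumptions.
OFFLINE_AFTER = timedelta(days=7)         # "a week for both"
CONNECTION_RECOVERY = timedelta(hours=6)  # "something shorter like 6 hours"
TESTS_AFTER_OPERATOR_EMAIL = 4            # "passes four tests"


@dataclass
class ServerState:
    """Hypothetical record of one Tier-2 server's recent test history."""
    failure_kind: str                        # "config" or "connection"
    failing_since: datetime                  # when the current failure began
    passing_since: Optional[datetime] = None # start of the current passing streak
    consecutive_passes: int = 0              # passing tests in that streak
    operator_emailed: bool = False           # operator reported the fix


def should_mark_offline(state: ServerState, now: datetime) -> bool:
    """A server still failing after a week is moved to offline status."""
    still_failing = state.passing_since is None
    return still_failing and now - state.failing_since >= OFFLINE_AFTER


def should_mark_online(state: ServerState, now: datetime) -> bool:
    """Decide whether an offline server may be re-enabled."""
    if state.passing_since is None or state.consecutive_passes < 1:
        return False
    if state.failure_kind == "config":
        # Config issue: one passing test is enough.
        return True
    if state.operator_emailed and state.consecutive_passes >= TESTS_AFTER_OPERATOR_EMAIL:
        # Connection issue, operator emailed: four passing tests.
        return True
    # Connection issue otherwise: wait out a short window in case of a fluke.
    return now - state.passing_since >= CONNECTION_RECOVERY
```

The thresholds are kept as module-level constants so whatever values the
vote settles on could be swapped in without touching the logic.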

On Tue, Oct 9, 2012 at 1:20 AM, Jeff Taylor <shdwdrgn AT> wrote:
> Regarding the previous discussion about automating the removal of dead
> or failing Tier-2 servers...
> First off, a big thanks to Brian for getting the administrative tools
> created so we can better manage the status of these servers! We now
> have the tools in place to mark servers as offline or deleted, and
> handle each case appropriately. Please note that if your server is
> marked offline and you are able to repair it, you can contact Brian or
> myself to re-enable your server on the wiki.
> I am currently testing some new code which will automatically move
> failing servers to an offline status (and remove them from the zone
> file). Servers that are marked offline will continue to be tested for
> functionality, and could potentially be automatically changed to an
> online status when they resume service. In looking through this thread,
> it does not appear we ever really established a policy that I could put
> into the code, so I would like to take a quick vote to see what everyone
> thinks would be best...
> Policies for marking servers as offline:
> 1) Testing fails more than (7, 14, 28) days
> 2) Connection fails more than (2, 3, 7, 14) days
> Policies for marking an offline server as functional again:
> 3) Passes all tests for at least (1, 2, 7) days
> My thoughts on this are that connection failures are more serious than
> testing failures, and should be given stricter criteria. Also note
> that I *can* resolve the test times in hours rather than days, but at
> the moment it seems best to work on a day-by-day basis to give admins
> time to fix problems with their systems. Please let me know what values
> you think are best for the three questions above, and I'll tally up the
> results in a couple of days and start implementing the new automation.
