- From: Jamyn Shanley <jshanley AT gmail.com>
- To: discuss AT lists.opennicproject.org
- Subject: Re: [opennic-discuss] Policy proposal for removal of non-responding T2 servers
- Date: Wed, 10 Oct 2012 13:01:18 -0500
Given how critical DNS is to both the end-user experience and general net functionality, I don't understand why non-responsive servers aren't removed from the zonefiles within 15 minutes of a problem. There's no reason why they couldn't be put back in rotation within an hour or two of being 100% functional again, but I gotta say if my local ISP had a policy that allowed them 7 days to get one of their DNS servers fixed (and also left the problematic server listed on their website/documentation) I'd be ... disappointed in their professionalism.
On Wed, Oct 10, 2012 at 11:52 AM, Jeff Taylor <shdwdrgn AT sourpuss.net> wrote:
While finishing up the code, I decided that what makes the most sense is
to take a server offline based on a value of <days>, but then to bring
it back online again based on a value of <hours>. The offline status is
really just an extension of the temp-outage status, but this step gets a
server removed from the public listings. I certainly don't want this
status to be viewed as a 'punishment' to the admins involved, rather it
should be considered a notice to the users that there is an extended
problem occurring.
It is interesting that between both replies so far, you have both
suggested the opposite extremes for bringing a server back into the
pool. My feeling on this is that since the code will automate the
process, we can keep the time fairly short, however the server was
marked offline for a reason, so we want to make sure it is running
smoothly for a long enough period that we can be sure it is stable
again. For this reason, I think 48 hours would be a reasonable period.
We should probably get some more opinions on this matter.
We all seem to be in agreement that 7 days is a good length of time to
wait for issues to be resolved before marking a server offline, so I'll
stick with that value while moving forward.
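
A minimal sketch of the offline/online rule described above, assuming a server goes offline after 7 days of continuous failures and returns after 48 hours of continuous passes. The names (ServerState, record_check, OFFLINE_AFTER, ONLINE_AFTER) are hypothetical and chosen for illustration; this is not the actual OpenNIC testing code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed thresholds, taken from the values proposed in this thread.
OFFLINE_AFTER = timedelta(days=7)    # failing this long -> marked offline
ONLINE_AFTER = timedelta(hours=48)   # passing this long -> back in the pool


@dataclass
class ServerState:
    online: bool = True
    failing_since: Optional[datetime] = None  # start of current failure streak
    passing_since: Optional[datetime] = None  # start of current passing streak


def record_check(state: ServerState, passed: bool, now: datetime) -> ServerState:
    """Update one server's state after a single test run."""
    if passed:
        # A pass ends any failure streak and starts or continues a pass streak.
        state.failing_since = None
        if state.passing_since is None:
            state.passing_since = now
        if not state.online and now - state.passing_since >= ONLINE_AFTER:
            state.online = True
    else:
        # A failure ends any pass streak and starts or continues a failure streak.
        state.passing_since = None
        if state.failing_since is None:
            state.failing_since = now
        if state.online and now - state.failing_since >= OFFLINE_AFTER:
            state.online = False
    return state
```

Using two separate thresholds means a single good test does not put a long-broken server straight back into the pool, and a single bad test does not remove a healthy one.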
On 10/08/2012 11:20 PM, Jeff Taylor wrote:
> Regarding the previous discussion about automating the removal of dead
> or failing Tier-2 servers...
>
> First off, a big thanks to Brian for getting the administrative tools
> created so we can better manage the status of these servers! We now
> have the tools in place to mark servers as offline or deleted, and
> handle each case appropriately. Please note that if your server is
> marked offline and you are able to repair it, you can contact Brian or
> myself to re-enable your server on the wiki.
>
> I am currently testing some new code which will automatically move
> failing servers to an offline status (and remove them from the zone
> file). Servers that are marked offline will continue to be tested for
> functionality, and could potentially be automatically changed to an
> online status when they resume service. In looking through this thread,
> it does not appear we ever really established a policy that I could put
> into the code, so I would like to take a quick vote to see what everyone
> thinks would be best...
>
> Policies for marking servers as offline:
> 1) Testing fails more than (7, 14, 28) days
> 2) Connection fails more than (2, 3, 7, 14) days
>
> Policies for marking an offline server as functional again:
> 3) Passes all tests for at least (1, 2, 7) days
>
> My thoughts on this are that connection failures are more serious than
> testing failures, and should be given stricter criteria. Also note
> that I *can* resolve the test times in hours rather than days, but at
> the moment it seems best to work on a day-by-day basis to give admins
> time to fix problems with their systems. Please let me know what values
> you think are best for the three questions above, and I'll tally up the
> results in a couple of days and start implementing the new automation.
>
>
> --------
> You are a member of the OpenNIC Discuss list.
> You may unsubscribe by emailing discuss-unsubscribe AT lists.opennicproject.org