Network Working Group Ralph Droms
INTERNET DRAFT Bucknell University
Kim Kinnear
Mark Stapp
Cisco Systems
Bernie Volz
IPWorks
Steve Gonczi
Network Engines
Greg Rabil
Mike Dooley
Arun Kapur
Lucent Technologies
July 2000
Expires January 2001
DHCP Failover Protocol
<draft-ietf-dhc-failover-07.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Droms, et. al. Expires January 2001 [Page 1]
Internet Draft DHCP Failover Protocol July 2000
Copyright Notice
Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract
DHCP [RFC 2131] allows for multiple servers to be operating on a
single network. Some sites are interested in running multiple
servers in such a way as to provide redundancy in case of server
failure. In order for this to work reliably, the cooperating primary
and secondary servers must maintain a consistent database of the
lease information. This implies that servers will need to coordinate
any and all lease activity so that this information is synchronized
in case of failover.
This document defines a protocol to provide such synchronization
between two servers. One server is designated the "primary" server,
the other is the "secondary" server. This document also describes a
way to integrate the failover protocol with the DHCP load balancing
approach.
This document is a substantial reorganization as well as a technical
and editorial revision of draft-ietf-dhc-failover-05.txt.
Table of Contents
1. Introduction
2. Terminology
2.1. Requirements terminology
2.2. DHCP and failover terminology
3. Background and External Requirements
3.1. Key aspects of the DHCP protocol
3.2. BOOTP relay agent implementation
3.3. What does it mean if a server can't communicate with its partner?
3.4. Challenging scenarios for a Failover protocol
3.5. Using TCP to detect partner server failure
4. Design Goals
4.1. Design goals for this protocol
4.2. Limitations of this protocol
5. Protocol Overview
5.1. Messages and States
5.2. Fundamental guarantees
5.3. Load balancing
5.4. Operating in NORMAL state
5.5. Operating in COMMUNICATIONS-INTERRUPTED state
5.6. Operating in PARTNER-DOWN state
5.7. Operating in RECOVER state
5.8. Operating in STARTUP state
5.9. Time synchronization between servers
5.10. IP address binding-status
5.11. DNS dynamic update considerations
5.12. Reservations and failover
5.13. Dynamic BOOTP and failover
5.14. Guidelines for selecting MCLT
6. Common Message Format
6.1. Message header format
6.2. Common option format
6.3. Batching multiple binding update transactions in one BNDUPD message
7. Protocol Messages
7.1. BNDUPD message [3]
7.2. BNDACK message [4]
7.3. UPDREQ message [9]
7.4. UPDREQALL message [7]
7.5. UPDDONE message [8]
7.6. POOLREQ message [1]
7.7. POOLRESP message [2]
7.8. CONNECT message [5]
7.9. CONNECTACK message [6]
7.10. STATE message [10]
7.11. CONTACT message [11]
7.12. DISCONNECT message [12]
8. Connection Management
8.1. Connection granularity
8.2. Creating the TCP connection
8.3. Using the TCP connection for determining communications status
8.4. Using the TCP connection for binding data
8.5. Using the TCP connection for control messages
8.6. Losing the TCP connection
9. Failover Endpoint States
9.1. Server Initialization
9.2. Server State Transitions
9.3. STARTUP state
9.4. PARTNER-DOWN state
9.5. RECOVER state
9.6. NORMAL state
9.7. COMMUNICATIONS-INTERRUPTED State
9.8. POTENTIAL-CONFLICT state
9.9. RESOLUTION-INTERRUPTED state
9.10. RECOVER-DONE state
9.11. PAUSED state
9.12. SHUTDOWN state
10. Safe Period
11. Security
11.1. Simple shared secret
11.2. TLS
12. Failover Options
12.1. addresses-transferred
12.2. assigned-IP-address
12.3. binding-status
12.4. client-identifier
12.5. client-hardware-address
12.6. client-last-transaction-time
12.7. client-reply-options
12.8. client-request-options
12.9. DDNS
12.10. delayed-service-parameter
12.11. hash-bucket-assignment
12.12. lease-expiration-time
12.13. max-unacked-bndupd
12.14. MCLT
12.15. message
12.16. message-digest
12.17. potential-expiration-time
12.18. receive-timer
12.19. protocol-version
12.20. reject-reason
12.21. sending-server-IP-address
12.22. server-flags
12.23. server-state
12.24. start-time-of-state
12.25. TLS-reply
12.26. TLS-request
12.27. vendor-class-identifier
12.28. vendor-specific-options
13. IANA Considerations
14. Acknowledgments
15. References
16. Author's information
17. Full Copyright Statement
1. Introduction
DHCP [RFC 2131] allows for multiple servers to be operating on a single
network. Some sites are interested in running multiple servers
in such a way as to provide redundancy in case of server failure,
since the DHCP subsystem is in many cases a critical part of the
network infrastructure.
This document defines a protocol to provide synchronization between
two servers in order that each can take over for the other should
either one fail or become unreachable.
One server is designated the "primary" server, the other is the
"secondary" server, and most DHCP client requests are sent to each
server (see Section 3.1.1 for details).
In order to provide a high availability DHCP service, these
cooperating primary and secondary servers must maintain a consistent
database of lease information. This implies that servers will need
to coordinate all lease activity so that this information is syn-
chronized in case failover is required. The protocol messages and
processing techniques required to maintain a consistent database are
specified in the protocol described here.
The failover protocol also contains a way to integrate the DHCP load-
balancing algorithm described in [LOADB] with the failover protocol.
2. Terminology
This section discusses both the generic requirements terminology com-
mon to many IETF protocol specifications as well as specialized DHCP
and failover protocol specific terminology.
2.1. Requirements terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119].
2.2. DHCP and failover terminology
This document uses the following terms:
o "binding"
A binding is a collection of configuration parameters, includ-
ing at least an IP address, associated with or "bound to" a
DHCP client. Bindings are managed by DHCP servers.
o "binding database"
The collection of bindings managed by a primary and secondary.
o "binding update transaction"
A binding update transaction refers to the set of information
(contained in options) necessary to perform a binding update
for a single IP address. It will be composed of the
assigned-IP-address option and the binding-status option, along
with other options as appropriate.
o "binding-status"
The binding-status is the status of an IP address with respect
to its association with a client. There are specific binding-
status values defined for use by the failover protocol, e.g.,
ACTIVE, FREE, RELEASED, ABANDONED, etc. These are designed to
map more or less directly onto the binding-status values used
internally in most DHCP server implementations. The term
binding-status refers to the concept also sometimes known as
"lease state" or "IP address state", but in this document the
term "state" is reserved for the failover state of a failover
endpoint, and binding-status is always used to refer to the
state associated with an IP address or lease.
o "DHCP client" or "client"
A DHCP client is an Internet host using DHCP to obtain confi-
guration parameters such as a network address. The term
"client" used within this document always means a DHCP client,
and never one of the two failover servers.
o "DHCP server" or "server"
A DHCP server is an Internet host that returns configuration
parameters to DHCP clients.
o "DDNS"
An abbreviation for "Dynamic DNS", which refers to the capabil-
ity to update a DNS server's name (actually resource record)
database using an on-the-wire protocol defined in [RFC 2136].
o "DNS"
An abbreviation for "Domain Name System", a scheme where a cen-
tral name repository is used to map names to IP addresses and IP
addresses to names.
o "failover endpoint"
The failover protocol allows for there to be a unique failover
endpoint per partner per role (where role is primary or secon-
dary). This failover endpoint can take actions and hold unique
states. There are thus a maximum of two failover endpoints per
server per partner (one for each partner as a primary and one
for that same partner as a secondary.)
o "FQDN"
An FQDN is a "fully qualified domain name". A fully qualified
domain name generally is a host name with at least one zone
name; for example, "www.dhcp.org" is a fully qualified domain
name.
o "lazy update"
Lazy update refers to the requirement placed on a server imple-
menting a failover protocol to update its failover partner when-
ever the binding database changes. A failover protocol which
didn't support lazy update would require the failover partner
update to be complete before a DHCP server could respond to a
DHCP client request with a DHCPACK. A failover protocol which
does support lazy update places no such restriction on the
update of the failover partner server, and so a server can allo-
cate an IP address or extend a lease on an IP address and then
update its failover partner as time permits. A failover proto-
col which supports lazy update not only removes the requirement
to update the failover partner prior to responding to a DHCP
client with a DHCPACK, but also allows gathering up batches of
updates from one failover server to its partner.
o "MCLT"
The MCLT refers to maximum client lead time. This time is con-
figured on the primary server and transmitted from the primary
to the secondary server in the CONNECT message. It is the max-
imum amount of time that one server can extend a lease for a
client's binding beyond the time known by the partner server.
See section 5.2.1 for details.
o "partner"
A "partner", for the purposes of this document, refers to a
failover server, typically the other failover server. In many
(if not most) cases, the failover protocol is symmetric with
respect to the primary or secondary nature of the servers, and
so it is often appropriate to discuss "updating the partner
server", since it could be a primary server updating a secondary
server or a secondary server updating a primary server.
o "Primary server" or "Primary"
A DHCP server configured to provide primary service to a set of
DHCP clients for a particular set of subnet address pools.
o "RR"
"RR" is an abbreviation for "resource record". All records in
the DNS are resource records. The resource records of most
relevance to this document are the "A" resource record, which
maps a DNS name to a particular IP address, the "PTR" resource
record, which allows a "reverse map", from the IP address back
to a DNS name, and the "KEY" resource record, which is used in
ways defined in [DDNS] to tag a DNS name with the identity of
the DHCP client with which it is associated.
o "Secondary server" or "Secondary"
A DHCP server configured to act as backup to a primary server
for a particular set of subnet address pools.
o "stable storage"
Every DHCP server is assumed to have some form of what is called
"stable storage". Stable storage is used to hold information
concerning IP address bindings (among other things) so that this
information is not lost in the event of a server failure which
requires restart of the server.
o "state"
In this document, the term "state" refers exclusively to the
state of a failover endpoint, for example: NORMAL,
COMMUNICATIONS-INTERRUPTED, PARTNER-DOWN. It is not used to
refer to any attributes of an IP address or a binding of an IP
address. See "binding-status".
o "subnet address pool"
A subnet address pool is the set of IP addresses which is asso-
ciated with a particular network number and subnet mask. In the
simple case, there is a single network number and subnet mask
and a set of IP addresses. In the more complex case (sometimes
called "secondary subnets", sometimes "superscopes"), several
(apparently unrelated) network number and subnet mask combina-
tions with their associated IP addresses may all be configured
together into one subnet address pool.
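As an illustration only, the distinction drawn above between
"binding-status" (a property of an IP address) and "state" (a property
of a failover endpoint) can be kept explicit in code by using two
separate types. The sketch below is not part of the protocol. The class
and field names are hypothetical, the binding-status list contains only
the values named above, and the state names are those of the
subsections of section 9.

```python
from dataclasses import dataclass
from enum import Enum


class BindingStatus(Enum):
    """Status of an IP address with respect to a client (partial list)."""
    ACTIVE = "ACTIVE"
    FREE = "FREE"
    RELEASED = "RELEASED"
    ABANDONED = "ABANDONED"
    # The text says "etc.", so further values exist that are not shown here.


class EndpointState(Enum):
    """Failover state of a failover endpoint, never of an IP address."""
    STARTUP = "STARTUP"
    PARTNER_DOWN = "PARTNER-DOWN"
    RECOVER = "RECOVER"
    NORMAL = "NORMAL"
    COMMUNICATIONS_INTERRUPTED = "COMMUNICATIONS-INTERRUPTED"
    POTENTIAL_CONFLICT = "POTENTIAL-CONFLICT"
    RESOLUTION_INTERRUPTED = "RESOLUTION-INTERRUPTED"
    RECOVER_DONE = "RECOVER-DONE"
    PAUSED = "PAUSED"
    SHUTDOWN = "SHUTDOWN"


@dataclass
class Binding:
    """Hypothetical record for one IP address bound to a client."""
    ip_address: str
    client_id: str
    status: BindingStatus  # a binding-status, never an EndpointState


@dataclass
class FailoverEndpoint:
    """Hypothetical record for one endpoint: one per partner per role."""
    partner: str
    role: str  # "primary" or "secondary"
    state: EndpointState
```

Keeping the two enumerations separate means a type checker can flag
code that confuses the status of a lease with the state of an endpoint.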
3. Background and External Requirements
This section highlights key aspects of the DHCP protocol on which the
failover protocol depends. It also discusses the requirements that
the failover protocol places on other aspects of the network infras-
tructure, and some general issues surrounding server failure detec-
tion. Some failure scenarios that provide particular challenges to a
failover protocol are discussed. Finally, the challenges inherent in
using a TCP connection as a means to detect failure of a partner
server are elaborated.
3.1. Key aspects of the DHCP protocol
The failover protocol is designed to augment the DHCP protocol as
described in RFC 2131 [RFC 2131]. There are several key aspects of
the DHCP protocol which are required by the failover protocol in
order to successfully meet its design goals.
3.1.1. Broadcast behavior
There are two aspects of the broadcast behavior of the DHCP protocol
which are key to making the failover protocol operate successfully.
The first is simply that the DHCP protocol requires a DHCP client to
broadcast all DHCPDISCOVER and DHCPREQUEST/INIT-REBOOT messages.
Because of this requirement, a DHCP client that was communicating with
one server will automatically be able to communicate with another
server if one is available.
The second aspect of broadcast behavior is similar to the first, but
involves the distinction between a DHCPREQUEST/RENEW and
DHCPREQUEST/REBINDING. A DHCPREQUEST/RENEW is the message that a
DHCP client uses to extend its lease. It is unicast to the DHCP
server from which it acquired the lease. However, the DHCP protocol
(in a farsighted move) was explicitly designed so that in the event
that a DHCP client cannot contact the server from which it received a
lease on an IP address using a DHCPREQUEST/RENEW, the client is
required to broadcast its renewal using a DHCPREQUEST/REBINDING to
any available DHCP server. Since all DHCP clients were required to
implement this algorithm, the failover protocol can have a different
server from the one that initially granted a lease be the server to
renew a lease. Thus, one server can take over for another with no
interruption in the service as experienced by the DHCP client or its
associated applications software.
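The client behavior that this section relies on can be sketched as
follows. This is a simplified illustration, not a client
implementation. It assumes the default renewal (T1) and rebinding (T2)
times of RFC 2131, which are 0.5 and 0.875 of the lease duration; the
Lease type and the next_request function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BROADCAST = "255.255.255.255"


@dataclass
class Lease:
    server: str      # address of the server that granted the lease
    start: float     # time at which the lease was granted, in seconds
    duration: float  # lease length, in seconds


def next_request(lease: Lease, now: float) -> Optional[Tuple[str, str]]:
    """Return (message, destination) for the client's next request, if any."""
    elapsed = now - lease.start
    t1 = 0.5 * lease.duration    # RFC 2131 default renewal time
    t2 = 0.875 * lease.duration  # RFC 2131 default rebinding time
    if elapsed < t1:
        return None  # nothing to do yet
    if elapsed < t2:
        # RENEW is unicast to the server that granted the lease.
        return ("DHCPREQUEST/RENEW", lease.server)
    if elapsed < lease.duration:
        # REBINDING is broadcast, so any available server may answer.
        return ("DHCPREQUEST/REBINDING", BROADCAST)
    # The lease has expired; the client must start over.
    return ("DHCPDISCOVER", BROADCAST)
```

Because the REBINDING request is broadcast, a failover partner that
never saw the original exchange still has the opportunity to extend
the lease.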
3.1.2. Client responsibility
In the DHCP protocol the DHCP clients are entrusted with a consider-
able responsibility. In particular, after they are granted a lease
on an IP address, they are enjoined to only use that IP address while
their lease is valid. Every DHCP client is expected to stop using an
IP address if the expiration time on the lease has passed and if it
cannot get an extension on the lease for that IP address from some
DHCP server. Thus, the correct behavior of every DHCP client in this
regard is required to ensure the integrity of the DHCP service. On
the other hand, incorrect behavior by a client in this area will tend
to adversely affect at most one other DHCP client.
Furthermore, any DHCP client which sends in a DHCPREQUEST/RENEW or
DHCPREQUEST/REBINDING to a DHCP server (either unicast for a RENEW or
broadcast for a REBINDING) MUST still have time to run on the lease
for that IP address. The DHCP server sends the DHCPACK back unicast
to the IP address from which the RENEW or REBINDING originated.
Given the existing responsibility placed on the client to only use an
IP address when the lease is valid, and to only send in a RENEW or
REBINDING if the lease is valid, the failover protocol relies on DHCP
clients to perform responsibly and will, in the absence of conflict-
ing information, believe a DHCP client that is attempting to RENEW or
REBIND a lease on an IP address is the legitimate owner of that IP
address.
If clients do not follow these rules, it is possible for an address
to be in use by more than one client. For a single server, this hap-
pens because the server has leased the expired address to another
client and the original client is also attempting to use the address.
The server would NAK the renewal request. This is made slightly worse
in the failover protocol if the two servers are unable to communicate
with each other and one server leases an available address to a new
client while the other server receives a renewal from a different
client. In this case, both servers lease the same address to dif-
ferent clients for the MCLT time.
One troublesome issue is that of the DHCP client responsibility when
sending in DHCPREQUEST/INIT-REBOOT requests. While the original DHCP
RFC was written to require a DHCP client to have time left to run on
the lease for an IP address if the client is sending an INIT-REBOOT
request, it was sufficiently unclear that some client vendors didn't
realize this until recently. Since the INIT-REBOOT request was sent
with the IP address in the dhcp-requested-address option and not in
the ciaddr (for perfectly good reasons), the similarity to the RENEW
and REBINDING case was lost on many people.
At present, the failover protocol does not assume that a client send-
ing in an INIT-REBOOT request necessarily has a valid lease on the IP
address appearing in the dhcp-requested-address option in the INIT-
REBOOT request.
The implications of this are as follows: Assume that there is a DHCP
client that gets a lease from one server while that server is unable
to communicate with its failover partner. Then, assume that after
that client reboots it is able only to communicate with the other
failover server. If the failover servers have not been able to com-
municate with each other during this process, then the DHCP client
will get a new IP address instead of being able to continue to use
its existing IP address. This will affect no applications on the DHCP
client, since it is rebooting. However, it will use up an additional
IP address in this marginal case.
3.1.3. Stable storage update before DHCPACK
The DHCP protocol allocates resources, and in order to operate
correctly it requires that a DHCP server update some form of stable
storage prior to sending a DHCPACK to a DHCP client in order to grant
that client a lease on an IP address.
One of the goals of the failover protocol is that it not add significant
additional time to this already time-consuming requirement to
update stable storage prior to a DHCPACK. In particular, adding a
requirement to communicate with another server prior to sending a
DHCPACK would greatly simplify the failover protocol, but it would
unacceptably limit the potential scalability of any DHCP server which
employed the failover protocol.
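The ordering constraint described in this section, together with the
"lazy update" idea from section 2.2, can be sketched as follows. This
is a minimal illustration under stated assumptions: a dictionary stands
in for stable storage, the partner link is a caller-supplied function,
and all class and method names are hypothetical.

```python
from collections import deque


class LazyUpdatingServer:
    """Commits to stable storage before the DHCPACK; updates the partner later."""

    def __init__(self):
        self.stable_storage = {}        # stand-in for a durable lease database
        self.pending_updates = deque()  # binding updates not yet sent to the partner
        self.log = []                   # order of actions, for illustration

    def handle_request(self, client_id, ip_address, expiry):
        # 1. Record the binding in stable storage first.
        self.stable_storage[ip_address] = (client_id, expiry)
        self.log.append("stable-storage")
        # 2. Answer the client without waiting for the partner.
        self.log.append("DHCPACK")
        # 3. Queue the binding update for later delivery.
        self.pending_updates.append((ip_address, client_id, expiry))
        return "DHCPACK"

    def flush_to_partner(self, send):
        """Send all queued updates as one batch; return how many were sent."""
        batch = list(self.pending_updates)
        if batch:
            send(batch)
            self.pending_updates.clear()
            self.log.append("partner-update")
        return len(batch)
```

For example, two calls to handle_request followed by one call to
flush_to_partner deliver both binding updates in a single batch, after
both clients have already been answered.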
3.2. BOOTP relay agent implementation
Many DHCP clients are not resident on the same network segment as a
DHCP server. In order to support this form of network architecture,
most contemporary routers implement something known as a BOOTP Relay
Agent. This capability inside of a router listens for all broadcasts
at the DHCP port, port 67, and will relay any broadcasts that it
receives on to a DHCP server. The IP address of the DHCP server must
have been previously configured into the router. As part of the
relay process, the relay agent will place the address of the inter-
face on which it received the broadcast into the giaddr field of the
DHCP packet.
Since the failover protocol requires two DHCP servers to receive any
broadcast DHCP messages, in order to work with DHCP clients which are
not local to the DHCP server, the BOOTP relay agent on the router
closest to the DHCP client must be configured to point at more than
one DHCP server.
Most BOOTP relay agent implementations allow this duplication of
packets.
If this is not possible, an administrator might be able to configure
the relay agent with a subnet broadcast address, but in this case the
primary and secondary DHCP servers in a failover pair must both
reside on the same subnet.
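The relay behavior described in this section can be sketched as
follows. This is an illustration only, not a relay implementation. The
packet type and the relay function are hypothetical, and the rule that
giaddr is filled in only when it is still zero is taken from the BOOTP
relay agent specification rather than from this document.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class DhcpPacket:
    op: str                  # e.g. "DHCPDISCOVER"
    giaddr: str = "0.0.0.0"  # relay agent address, zero when unset


def relay(packet: DhcpPacket, interface_addr: str,
          servers: List[str]) -> List[Tuple[str, DhcpPacket]]:
    """Return one (server, packet) pair per configured DHCP server."""
    if packet.giaddr == "0.0.0.0":
        # Record the interface on which the broadcast was received.
        packet = replace(packet, giaddr=interface_addr)
    # Duplicate the packet so that both failover servers receive it.
    return [(server, packet) for server in servers]
```

Calling relay with two server addresses yields two copies of the same
packet, which is the duplication that the failover protocol depends on.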
3.3. What does it mean if a server can't communicate with its partner?
In any protocol designed to allow one server to take over some
responsibilities from a partner server in the event of "failure" of
that partner server, there is an inherent difficulty in determining
when that partner server has failed.
In fact, it is fundamentally impossible for one server to distinguish
a network communications failure from the outright failure of the
server with which it is trying to communicate. In the case where each
server is handing out resources (in this case IP addresses) to a
client community, mistaking an inability to communicate with a
partner server for failure of that partner server could easily cause
both servers to be handing out the same IP addresses to different
clients.
One way that this is sometimes handled is for there to be more than
two servers. In the case of an odd number of servers, the servers
that can still communicate with a majority of other servers will consider
themselves operational, and any server which can't communicate
with a majority of other servers must immediately cease operations.
While this technique works in some domains, having the only server with
which a DHCP client can communicate voluntarily shut itself down
seems like something worth avoiding.
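The majority rule mentioned above can be sketched as a one-line check.
This is an illustration of the general technique, which the failover
protocol does not use. The function name is hypothetical, and the
sketch assumes that a server counts itself as part of the group when
testing for a majority.

```python
def has_majority(reachable_peers: int, total_servers: int) -> bool:
    """True if this server plus the peers it can reach form a strict majority."""
    return (reachable_peers + 1) * 2 > total_servers
```

Under this rule, an isolated server in a group of three must stop
(has_majority(0, 3) is False). With only two servers, a partition
leaves each server seeing one of two, so neither would have a majority
and both would stop.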
The failover protocol will operate correctly while both servers are
unable to communicate, whether they are both running or not. At some
point there may be resource contention, and if one of the servers is
actually down, then the operator can inform the operational server
and the operational server will be able to use all of the failed
server's resources.
The protocol also allows detection of an orderly shutdown of a parti-
cipating server.
3.4. Challenging scenarios for a Failover protocol
There exist two failure scenarios which provide particular challenges
to the correctness guarantees of a failover protocol.
3.4.1. Primary Server crash before "lazy" update:
In the case where the primary server sends a DHCPACK to a client for
a newly allocated IP address and then crashes prior to sending the
corresponding update to the secondary server, the secondary server
will have no record of the IP address allocation. When the secondary
server takes over, it may well try to allocate that IP address to a
different client. In the case where the first client to receive the
IP address is not on the net at the time (yet while there was still
time to run on its lease), an ICMP echo (i.e., ping) will not prevent
the secondary server from allocating that IP address to a different
client.
The failover protocol deals with this situation by having the primary
and secondary servers allocate addresses for new clients from dis-
joint address pools. See section 5.4 for details.
A more likely (in that DHCPRENEWs are presumably more common than
DHCPDISCOVERs) and more subtle version of this problem is where the
primary server crashes after extending a client's lease time, and
before updating the secondary with a new time using a lazy update.
After the secondary takes over, if the client is not connected to the
network the secondary will believe the client's lease has expired
when, in fact, it has not. In this case as well, the IP address
might be reallocated to a different client while the first client is
still using it.
This scenario is handled by the failover protocol through control of
the lease time and the use of the maximum client lead time (MCLT).
See section 5.2.1 for details.
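The definition of the MCLT in section 2.2 can be read as a cap on the
expiration time that a server may give to a client. The sketch below
illustrates that reading only; section 5.2.1 contains the actual rules.
The function and parameter names are hypothetical, and times are plain
numbers of seconds.

```python
def grantable_expiry(desired_expiry: float,
                     partner_known_expiry: float,
                     mclt: float) -> float:
    """Cap a lease so it ends at most MCLT beyond what the partner knows."""
    return min(desired_expiry, partner_known_expiry + mclt)
```

With an MCLT of 3600 seconds and a partner that knows of an expiration
at time 4000, a request for expiration at 5000 can be granted in full,
while a request for 86400 is capped at 7600. If the granting server
then crashes before its lazy update is sent, the client's lease exceeds
the partner's record by no more than the MCLT.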
3.4.2. Network partition where DHCP servers can't communicate but each
can talk to clients:
Several conditions are required for this situation to occur. First,
due to a network failure, the primary and secondary servers cannot
communicate. As well, some of the DHCP clients must be able to com-
municate with the primary server, and some of the clients must now
only be able to communicate with the secondary server. When this
condition occurs, both primary and secondary servers could attempt to
allocate IP addresses for new clients from the same pool of available
addresses. At some point, then, two clients will end up being allo-
cated the same IP address. This will cause problems when the network
failure that created this situation is corrected.
The failover protocol deals with this situation by having the primary
and secondary servers allocate addresses for new clients from dis-
joint address pools. See section 5.4 for details.
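The idea of disjoint address pools can be sketched as follows. This is
an illustration of the principle only. The even split used here is an
assumption made for the example, not the division that the protocol
specifies, and the function name is hypothetical.

```python
import ipaddress
from typing import Set, Tuple

Pool = Set[ipaddress.IPv4Address]


def split_pool(first: str, last: str) -> Tuple[Pool, Pool]:
    """Divide an inclusive address range into two disjoint sets."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    addrs = [ipaddress.IPv4Address(n) for n in range(start, end + 1)]
    mid = (len(addrs) + 1) // 2
    # Assumption for illustration: the primary takes the first half.
    return set(addrs[:mid]), set(addrs[mid:])
```

Because the two sets never overlap, two servers that cannot communicate
can each allocate to new clients from their own set without handing out
the same address twice.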
3.5. Using TCP to detect partner server failure
There are several characteristics of TCP that are important to the
functioning of the failover protocol, which uses one TCP connection
both for bulk data transfer and to assess communications
integrity with the other server. Reliable and ordered message
delivery are chief among these important characteristics.
It would be nice to use the capabilities built in to TCP to allow it
to determine if communications integrity exists to the failover
partner, but this strategy contains some problems which require
analysis. There exist three fundamental cases for an open TCP con-
nection that must be examined.
1. When no data is being sent then no messages are traveling
across the TCP connection.
2. When data is queued to be sent, and the receiver has not
blocked the sending of additional data, then messages are
flowing across the TCP connection containing the application
data.
3. When data is queued to be sent, and the receiver has blocked
the transmission of additional data, then persist messages are
flowing from the receiver to the sender to ensure that the
sender doesn't miss the receiver opening the window for
further transmissions.
The first case can be turned into the second case by sending
application-level keep-alive messages periodically when there is no
other data queued to be sent. Note that TCP keep-alive messages might be
used as well, but they present additional problems.
Thus, we can ensure that the TCP connection has messages flowing
periodically across the connection fairly easily. The question
remains as to what TCP will do if the other end of the connection
fails to respond (either because of network partition or because the
receiving server crashes). TCP will attempt to retransmit a message
with an exponential backoff, and will eventually time out that
retransmission. However, the length of that timeout cannot, in
general, be set on a per-connection basis, and is frequently as long
as nine minutes, though in some cases it may be as short as two
minutes.
On some systems it can be set system-wide, while on other systems it
cannot be changed at all.
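To see how an exponential backoff can add up to a timeout of this
magnitude, the following sketch sums the waits of a retransmission
timer that doubles up to a cap. The initial timeout, the cap, and the
number of attempts are illustrative assumptions, not values taken from
this draft or from any particular TCP implementation.

```python
def total_retransmit_time(initial_rto, max_rto, retries):
    """Sum the waits of a doubling retransmission timer capped at max_rto."""
    total = 0.0
    rto = initial_rto
    for _ in range(retries):
        total += rto
        rto = min(rto * 2, max_rto)
    return total

# Assumed values: 1 s initial timeout, 64 s cap, 13 attempts.
seconds = total_retransmit_time(initial_rto=1.0, max_rto=64.0, retries=13)
minutes = seconds / 60
```

With these assumed values the waits are 1, 2, 4, 8, 16, and 32
seconds followed by seven waits of 64 seconds, for a total of 511
seconds, or roughly eight and a half minutes.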
A value for this timeout that would be appropriate for the failover
protocol, say less than 1 minute, could have unpleasant side-effects
on other applications running on the same server, assuming that it
could be changed at all on the host operating system.
Nine minutes is a long time for the DHCP service to be unavailable to
any new clients that were being served by the server which has
crashed, when there is another server running that could respond to
them as soon as it determines that its partner is not operational.
The conclusion drawn from this analysis is that TCP provides very
useful support for the failover protocol in the areas of reliable and
ordered message delivery, but cannot by itself be relied upon to
detect partner server failure in a fashion acceptable to the needs of
the failover protocol. Additional failover protocol capabilities
have been created to support timely detection of partner server
failure. See section 8.3 for details on this mechanism.
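The general idea of detecting failure at the application level, rather
than waiting for TCP to give up, can be sketched as a simple silence
timer. The PartnerMonitor class and the max_silence parameter are
hypothetical names chosen for this example; the draft's actual
mechanism and its parameters are those defined in section 8.3.

```python
class PartnerMonitor:
    """Tracks when the partner was last heard from.

    max_silence is a hypothetical limit in seconds.
    """

    def __init__(self, max_silence, now=0.0):
        self.max_silence = max_silence
        self.last_heard = now

    def note_received(self, now):
        # Any message from the partner shows the path is still working.
        self.last_heard = now

    def partner_presumed_down(self, now):
        # True once the partner has been silent longer than allowed.
        return (now - self.last_heard) > self.max_silence

monitor = PartnerMonitor(max_silence=60.0)
monitor.note_received(now=30.0)
```

Because the limit is held by the application, it can be set per
connection without touching any system-wide TCP setting.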
4. Design Goals
This section lists the design goals and the limitations of the fail-
over protocol.
4.1. Design goals for this protocol
The following is a list of goals that are met by this protocol. They
are listed in priority order.
1. Implementations of this protocol must work with existing DHCP
client implementations based on the DHCP protocol [1].
2. Implementations of the protocol must work with existing BOOTP
relay agent implementations.
3. The protocol must provide failover redundancy between servers
that are not located on the same subnet.
4. Provide for continued service to DHCP clients through an
automated mechanism in the event of failure of the primary
server.
5. Avoid binding an IP address to a client while that binding is
currently valid for another client. In other words, do not
allocate the same IP address to two clients.
6. Minimize any need for manual administrative intervention.
7. Introduce no additional delays in server response time as a
result of the network communications required to implement the
failover protocol, i.e., don't require communications with the
partner between the receipt of a DHCPREQUEST and the
corresponding DHCPACK.
8. Share IP address ranges between primary and secondary servers;
i.e., impose no requirement that the pool of available
addresses be manually or permanently divided between servers.
9. Continue to meet the goals and objectives of this protocol in
the event of server failure or network partition.
10. Provide graceful reintegration of full protocol service after
server failure or network partition.
11. Allow for one computer to act as a secondary server for
multiple primary servers. The protocol must allow failover
primary and secondary configuration choices to be made at a
granularity smaller than "all of the subnets served by a single
server", though individual implementations may choose not to
allow such flexibility.
12. Ensure that an existing client can keep its existing IP
address binding if it can communicate with either the primary
or secondary DHCP server implementing this protocol, not just
the server that originally offered it the binding.
13. Ensure that a new client can get an IP address from some
server. Ensure that in the face of partition, where servers
continue to run but cannot communicate with each other, the
above goals and requirements may be met. In addition, when
the partition condition is removed, allow graceful automatic
re-integration without requiring human intervention.
14. If either primary or secondary server loses all of the infor-
mation that it has stored in stable storage, ensure that it be
able to refresh its stable storage from the other server.
15. Support load balancing between the primary and secondary
servers, and allow configuration of the percentage of the
client population served by each with a moderately fine granu-
larity.
4.2. Limitations of this protocol
The following are explicit limitations of this protocol.
1. This protocol provides only one level of redundancy through a
single secondary server for each primary server.
2. A subset of the address pool is reserved for secondary server
use. In order to handle the failure case where both servers
are able to communicate with DHCP clients, but unable to com-
municate with each other, a subset of the IP address pool must
be set aside as a private address pool for the secondary
server. The secondary can use these to service newly arrived
DHCP clients during such a period. The required size of this
private pool is based only on the arrival rate of new DHCP
clients and the length of expected downtime, and is not influ-
enced in any way by the total number of DHCP clients supported
by the server pair.
The failover protocol can be used in a mode where both the
primary and secondary servers can share the load between them
when both are operating. In this load balancing mode, the
addresses allocated by the primary server to the secondary
server are not unused, but are used instead to service the
portion of the client base to which the secondary server is
required to respond. See section 5.3 for more information on
load balancing.
3. The primary and secondary servers do not respond to client
requests at all while recovering from a failure that could
have resulted in duplicate IP assignments (that is, while
synchronizing in POTENTIAL-CONFLICT state).
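Limitation 2 above states that the secondary's private pool is sized
from the arrival rate of new clients and the expected downtime alone.
A worked example of that arithmetic follows; the function name and
the figures used are assumptions chosen for illustration.

```python
import math

def private_pool_size(new_clients_per_hour, expected_downtime_hours):
    """Addresses the secondary needs to serve new clients during an outage."""
    return math.ceil(new_clients_per_hour * expected_downtime_hours)

# Assumed figures: 20 new clients per hour, 6 hours of expected downtime.
needed = private_pool_size(20, 6)
```

Under these assumptions 120 addresses are needed. The total number of
clients supported by the server pair does not appear in the
calculation at all.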
5. Protocol Overview
This section discusses the failover protocol at a relatively high
level. In the event that a description in this section conflicts (or
appears to conflict, due to the overview nature of this section) with
information in later sections of this draft, the information in the
later sections should be considered authoritative.
5.1. Messages and States
This protocol is centered around the message exchange used by one
server to update the other server of binding database changes result-
ing from DHCP client activity:
o Communication of binding database changes
The binding update (BNDUPD) message is used to send the binding
database changes to the partner server, and the partner server
responds with a binding acknowledgement (BNDACK) message when it
has successfully committed those changes to its own stable
storage.
All of the other messages involve ancillary issues:
o Management of available IP addresses
The pool request (POOLREQ) is used by the secondary server to
request an allocation of IP addresses from the primary server.
The pool response (POOLRESP) is used by the primary server to
inform the secondary server how many IP addresses were allocated
to the secondary server as the result of the pool request.
o Synchronization of the binding databases between the servers
after they've been out of communications
The update request (UPDREQ) message is used by one server to
request that its partner send it all binding database informa-
tion that it has not already seen. The update request all
(UPDREQALL) message is used by one server to request that all
binding database information be sent in order to recover from a
total loss of its binding database by the requesting server.
The update done (UPDDONE) message is used by the responding
server to indicate that all requested updates have been sent by
the responding server and acked by the requesting server.
o Connection establishment
The connect (CONNECT) message is used by the primary server to
establish a high level connection with the other server, and to
transmit several important configuration data items between the
servers. The connect acknowledgement message (CONNECTACK) is
used by the secondary server to respond to a CONNECT message
from the primary server. The disconnect (DISCONNECT) message is
used by either server when closing a connection.
o Server synchronization
The state change (STATE) message is used by either server to
inform the other server of a change of failover state.
o Connection integrity management
The contact (CONTACT) message is used by either server to ensure
that the other server continues to see the connection as opera-
tional. It MUST be transmitted periodically over every esta-
blished connection if other message traffic is not flowing, and
it MAY be sent at any time.
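The messages introduced in this section, and which server originates
each, can be summarized in a small lookup table. The sketch below is
only a reading aid: the Sender enumeration and the may_send helper are
hypothetical, "one server" in the descriptions above is read as
"either server", and no wire-format type codes are shown because none
are given in this overview.

```python
from enum import Enum

class Sender(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    EITHER = "either"

# Message name -> who sends it, as described in this section.
MESSAGES = {
    "BNDUPD": Sender.EITHER,
    "BNDACK": Sender.EITHER,
    "POOLREQ": Sender.SECONDARY,
    "POOLRESP": Sender.PRIMARY,
    "UPDREQ": Sender.EITHER,
    "UPDREQALL": Sender.EITHER,
    "UPDDONE": Sender.EITHER,
    "CONNECT": Sender.PRIMARY,
    "CONNECTACK": Sender.SECONDARY,
    "DISCONNECT": Sender.EITHER,
    "STATE": Sender.EITHER,
    "CONTACT": Sender.EITHER,
}

def may_send(role, message):
    """Return True if a server in the given role may originate message."""
    allowed = MESSAGES[message]
    return allowed is Sender.EITHER or allowed is role
```

For example, may_send(Sender.SECONDARY, "POOLREQ") is true, while
may_send(Sender.PRIMARY, "POOLREQ") is false.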
5.1.1. Failover endpoints
The proper operation of the failover protocol requires more than the
transmission of messages between one server and the other. Each end-
point might seem to be a single DHCP server, but in fact there are
many situations where additional flexibility in configuration is use-
ful.
For instance, there might be several servers which are each primary
for a distinct set of address pools, and one server which is secon-
dary for all of those address pools. The situation with the pri-
maries is straightforward, but the secondary will need to maintain a
separate failover state, partner state, and communications up/down
status for each of the separate primary servers for which it is act-
ing as a secondary.
The failover protocol calls for there to be a unique failover end-
point per partner per role (where role is primary or secondary).
This failover endpoint can take actions and hold unique states.
There are thus a maximum of two failover endpoints per partner (one
for the partner as a primary and one for that same partner as a
secondary).
Thus, in the case where there are two primary servers A and B each
backed up by a single common secondary server C, there is one fail-
over endpoint on each of A and B, and two different failover end-
points on C. The two different failover endpoints on C each have
unique states and independent TCP connections.
This document frequently describes the behavior of the protocol in
terms of primary and secondary servers, not primary and secondary
failover endpoints. However, it is important to remember that every
'server' described in this document is in reality a failover endpoint
that resides in a particular process, and that many failover end-
points may reside in the same process.
It is not the case that there is a unique failover endpoint for each
subnet address pool that participates in a failover relationship. On
one server, there is one failover endpoint per partner per role,
regardless of how many subnet address pools are managed by that com-
bination of partner and role. Conversely, on a particular server,
any given subnet address pool will be associated with exactly one
failover endpoint.
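The relationships described above (one endpoint per partner per role,
and exactly one endpoint per subnet address pool) can be sketched as
two dictionaries. All class, field, and method names here are
hypothetical, the "unknown" state is a placeholder rather than a
protocol state, and the pool names are arbitrary examples.

```python
from dataclasses import dataclass, field

@dataclass
class FailoverEndpoint:
    partner: str                  # identifies the partner server
    partner_role: str             # partner acts as "primary" or "secondary"
    state: str = "unknown"        # placeholder, not a protocol state
    pools: set = field(default_factory=set)

class Server:
    def __init__(self):
        self.endpoints = {}       # (partner, partner_role) -> endpoint
        self.pool_owner = {}      # pool -> (partner, partner_role)

    def attach_pool(self, pool, partner, partner_role):
        # A pool may be associated with exactly one failover endpoint.
        if pool in self.pool_owner:
            raise ValueError(f"{pool} already belongs to an endpoint")
        key = (partner, partner_role)
        ep = self.endpoints.setdefault(key, FailoverEndpoint(partner, partner_role))
        ep.pools.add(pool)
        self.pool_owner[pool] = key
        return ep

# Server C is secondary for primaries A and B.
c = Server()
c.attach_pool("10.1.0.0/24", partner="A", partner_role="primary")
c.attach_pool("10.1.1.0/24", partner="A", partner_role="primary")
c.attach_pool("10.2.0.0/24", partner="B", partner_role="primary")
```

Server C ends up with two endpoints even though it manages three
pools, matching the A, B, and C example given earlier in this section.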
When a connection is received from the partner, the unique failover
endpoint to which the message is directed is determined solely by the
IP address of the partner and the port to which the connection is
directed by the partner. See section 8.2.
5.2. Fundamental guarantees
There are several fundamental restrictions this protocol places on
what one server can do in the absence of knowledge of the other server.
Operating within these restrictions allows certain guarantees to be
made to the partner server, and these are key to the correct opera-
tion of the protocol.
5.2.1. Control of lease time
The key problem with lazy update is that when a serve