The first thing I'd like to note, which is probably common sense for many
of us, is that this attack is absolutely effective when combined with a
denial of service attack. If I can prevent the authority servers for a
domain from answering a query, I've eliminated the race I'd otherwise have
to win to get my forged responses back.
Because of this, as Mr. Vixie may have indirectly communicated to us,
randomizing the DNS ID by itself isn't a real solution to the problem. Too
much depends on factors entirely external to the protocol - what if I call
the phone company and have the T1s feeding a zone's authority servers put
into intrusive testing? As long as I can force the authority servers not
to answer, I can brute force the entire 16-bit ID space effectively.
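To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes the real ID is uniformly random and that every forged response carries a different guess; the packet rate and time windows are hypothetical figures picked only for illustration.

```python
ID_SPACE = 2 ** 16  # the DNS header ID field is 16 bits wide


def spoof_success_probability(forged_responses: int, id_space: int = ID_SPACE) -> float:
    """Chance that one of N distinct guessed IDs matches the outstanding query.

    Assumes the real ID is uniformly random and every forged response
    carries a different ID.
    """
    if forged_responses < 0:
        raise ValueError("forged_responses must be non-negative")
    return min(forged_responses, id_space) / id_space


def guesses_before_deadline(packets_per_second: float, window_seconds: float) -> int:
    """How many forged responses fit into the time the attacker has."""
    return int(packets_per_second * window_seconds)


if __name__ == "__main__":
    rate = 20_000  # hypothetical forged packets per second
    racing = guesses_before_deadline(rate, 0.05)    # hypothetical 50 ms round trip
    silenced = guesses_before_deadline(rate, 10.0)  # hypothetical 10 s with no real answer
    print(spoof_success_probability(racing))
    print(spoof_success_probability(silenced))
```

The point is only that the attacker's odds scale with the time available: once the real answer never arrives, the window is as long as the resolver keeps waiting.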
Tim Newsham mentioned to me that it's probably trivial to keep trying this
over and over, too... even if NXDOMAINs are cached, I can always flush the
entire cache with garbage records and try again. Even if that attack
doesn't work, I can force the nameserver to reboot - again, your security
now depends on not being vulnerable to denial of service attacks, which is
unacceptable.
It thus seems to me that denial of service attacks are the most important
issue to address with the current DNS protocol. If the issue can't be
resolved, the guessable-ID problem is unfixable.
The thought occurred to me that the major technical issue involved here is
that there isn't enough space to include a random "cookie" value to ensure
an authentic response. I spent a bit of time thinking of other places that
random data could be stuffed, and returned intact in a response packet by
a completely unwitting nameserver.
The first thing that jumped into my head was extra queries. If we can put
multiple DNS queries in a single packet, we can effectively expand the
random ID space to make a prediction attack infeasible by using the DNS
label in the second query as a cookie - a response that doesn't include an
NXDOMAIN answer for the second query isn't authentic.
Unfortunately, this doesn't work. BIND checks the query count in the
header and ensures that it is one, returning a format error if that isn't
the case. This fails the "unwitting nameserver" test, so it's thrown out.
Another idea is to use random source ports to launch the query - a
response not directed to our source port is invalid, effectively expanding
the ID space by another 16 bits. Unfortunately, this annihilates firewall
configurations, so it's probably not an effective answer either. The
possibility of using IP options to include a 64-bit cookie is probably out
too, both because of filters and because BIND acknowledges IP options by
stamping them out of the packet.
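For what it's worth, the source-port idea can be sketched as follows. This is an illustration under stated assumptions, not how any existing resolver does it: the names (`new_query_key`, `response_matches`, `combined_space`) are made up, and skipping ports below 1024 is my own assumption.

```python
import secrets

ID_SPACE = 2 ** 16                 # 16-bit DNS query ID
FIRST_PORT = 1024                  # assumption: skip the privileged port range
PORT_SPACE = 2 ** 16 - FIRST_PORT  # remaining candidate source ports


def new_query_key() -> tuple[int, int]:
    """Pick a random (query ID, source port) pair for one outgoing query."""
    query_id = secrets.randbelow(ID_SPACE)
    source_port = FIRST_PORT + secrets.randbelow(PORT_SPACE)
    return query_id, source_port


def response_matches(key: tuple[int, int], resp_id: int, resp_dst_port: int) -> bool:
    """A response is only considered if both the ID and the port it was sent to match."""
    return key == (resp_id, resp_dst_port)


def combined_space() -> int:
    """Number of (ID, port) combinations a blind attacker has to cover."""
    return ID_SPACE * PORT_SPACE
```

A forger now has to hit both values at once, which is where the "another 16 bits" (slightly less here, since low ports are excluded) comes from.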
I haven't come up with a satisfactory answer to this question yet (but would
love to hear one). However, the first point I brought up is usable to a
limited extent.
Given that denial of service makes guessing randomized IDs feasible, we
greatly increase the complexity of the attack (and possibly eliminate it
almost entirely) if we can ensure that we're at least capable of receiving
DNS responses from the legitimate authority servers for the domain we're
querying.
We can do this easily.
Follow the "real" query that we send to the authority server with an
entirely made up query for arbitrary random information. Send both queries
to every nameserver we query. If the nameserver is alive and receiving our
queries, it will respond with an NXDOMAIN for the second query, and we'll
know no denial of service is involved - meaning the 16-bit ID now needs to
be guessed within the round-trip time between the target nameserver and
the legitimate authority servers.
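A rough sketch of the two-query scheme described above, under some assumptions: the function names are hypothetical, the made-up name is a random hex label under the zone being queried, and only the DNS header of the reply is inspected. Sending the packets and handling timeouts is left out.

```python
import secrets
import struct

NXDOMAIN = 3  # RCODE 3 ("Name Error") in the DNS header


def build_query(qname: str, qtype: int = 1) -> tuple[int, bytes]:
    """Return (query ID, wire-format packet) for a single-question query."""
    qid = secrets.randbelow(2 ** 16)
    # Header: ID, flags (0 = standard query), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", qid, 0, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return qid, header + question


def make_canary_name(zone: str) -> str:
    """Made-up name that should not exist in the zone (hypothetical helper)."""
    return secrets.token_hex(16) + "." + zone.rstrip(".")


def parse_header(packet: bytes) -> tuple[int, int]:
    """Extract (ID, RCODE) from the first four bytes of a DNS message."""
    qid, flags = struct.unpack("!HH", packet[:4])
    return qid, flags & 0x000F


def transaction_is_trustworthy(canary_id: int, canary_response) -> bool:
    """Trust the real answer only if the canary came back as NXDOMAIN.

    canary_response is the raw reply to the made-up query, or None on timeout.
    """
    if canary_response is None or len(canary_response) < 12:
        return False
    resp_id, rcode = parse_header(canary_response)
    return resp_id == canary_id and rcode == NXDOMAIN
```

If the canary times out, the whole transaction is thrown away rather than accepting whatever answer showed up for the real query. A zone with a wildcard record would not return NXDOMAIN for the canary, so this sketch would need adjusting there.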
Decoupled from denial of service, the ID guessing attack is significantly
harder to perform. Every denial of service attack that I can come up with
will block both the "cookie" query and the real query, invalidating the
transaction entirely. Selective denial of service only seems doable if we
can see the contents of the packets being sent - in which case no amount
of random information will save us.
This defense will significantly increase DNS bandwidth requirements, but
it's obviously possible to enable it selectively for certain zones (using
a BIND directive in named.boot). Also, Mark Hittinger brought up the fact
that BIND should be able to slip into "secure" mode when it perceives an
attack - hundreds of invalid query IDs suddenly arriving - and invalidate
the original query.
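The "secure mode" idea might look something like the sketch below. This is not BIND code; the class name, the return values, and the default threshold of 100 bad IDs are all assumptions chosen for illustration.

```python
class OutstandingQuery:
    """Tracks one pending query and gives up if it looks like it is being guessed at."""

    def __init__(self, expected_id: int, max_bad_ids: int = 100):
        self.expected_id = expected_id
        self.max_bad_ids = max_bad_ids  # hypothetical threshold
        self.bad_ids_seen = 0
        self.aborted = False

    def on_response(self, resp_id: int) -> str:
        """Return "accept", "ignore", or "abort" for an incoming response ID."""
        if self.aborted:
            # Once invalidated, even a correct ID is no longer trusted.
            return "abort"
        if resp_id == self.expected_id:
            return "accept"
        self.bad_ids_seen += 1
        if self.bad_ids_seen >= self.max_bad_ids:
            self.aborted = True
            return "abort"
        return "ignore"
```

An aborted query would then be re-issued with a fresh ID, so the guesses spent on the old one are wasted.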
JE's caching attack is scary, primarily because it can be performed
transparently using forged addresses almost 100% successfully. While this
issue is supposedly resolved in 4.9.5-P1, I'm not confident that similar
issues won't come up. Can't we solve this problem entirely by simply not
caching information in the resource record section of the packet?
There shouldn't be a significant long-term bandwidth increase associated
with this, as we can still USE the resource records from the packet - just
not cache them - thus maintaining the advantage of not needing to do a
separate "A" record lookup for MX and CNAME records.
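As a sketch of "use but don't cache": the code below assumes the records in question are the extra ones that ride along with an answer (for example the "A" record accompanying an MX), and uses a made-up `Record` type and `process_response` function rather than anything from BIND.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    rtype: str
    data: str


def process_response(cache: dict, answers: list, additional: list) -> dict:
    """Cache only the answers; hand back the extra records for one-time use."""
    for rr in answers:
        cache.setdefault((rr.name, rr.rtype), []).append(rr.data)

    # Extra records are usable for the current lookup only and are
    # discarded by the caller afterwards; they never enter the cache.
    scratch = {}
    for rr in additional:
        scratch.setdefault((rr.name, rr.rtype), []).append(rr.data)
    return scratch
```

The current lookup still gets its address without a second query, but a forged extra record can't linger in the cache and poison later lookups.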
I'd really like to see a BIND option to allow this behavior to be enabled
or disabled selectively. While I'm fairly certain to have a patch for my
own servers shortly, an officially distributed patch would go a long way
towards easing my mind.
----------------
Thomas Ptacek at EnterAct, L.L.C., Chicago, IL [tqbf@enteract.com]
----------------
"If you're so special, why aren't you dead?"