Re: an detailed explaination why land attack works?

Don Lewis (Don.Lewis@TSC.TDK.COM)
Thu, 04 Dec 1997 03:53:12 -0800

On Dec 3, 1:20pm, Bill Paul wrote:
} Subject: Re: an detailed explaination why land attack works?
} Of all the gin joints in all the towns in all the world, Feiyi Wang had
} to walk into mine and say:
}
} > Hi, there
} >
} > Can anyone give a detailed explanation about why land attack works on
} > some TCP/IP stack (say BSD-derived)? Which loop is trapped in by this
} > "self-connect" request? What's the state transition internally? I can't
} > figure it out.

} - Because the source and destination addresses and ports in the forged
} SYN packet were the same, the SYN,ACK is reflected back to tcp_input().
} The TCP sequence numbers in the packet are the same as those that
} were sent out before, which is where the problem starts: tcp_input()
} expects that the sequence numbers will relate to the ACK segment that
} it wants as part of the three way handshake. Instead, it gets a segment
} with the same sequence numbers as before, which places the segment
} outside the window. The fact that the ACK bit is set is probably what
} keeps tcp_input() from throwing the segment away right at the start.

Nope, the ACK bit is not required. The flag checking doesn't happen
until after the segment is trimmed to fit the window, and the trimming
code is what decides to send the ACK (or a SYN-ACK in this case, since
we're in the SYN-RECEIVED state).

} This is part of the bug: in the SYN_RECEIVED state, you're only supposed
} to get an ACK, not a SYN,ACK.

That is normally true, but not always so. You should expect to see a
SYN-ACK in the SYN-RECEIVED state if you are doing a simultaneous open.
This will only happen if the previous state was SYN-SENT.

} Eventually we arrive at the code detailed on page 958 of _TCP/IP
} Illustrated Vol. 2_. This is near line 1030 in our version of
} tcp_input.c. According to the commentary on page 959, "The entire
} segment lies outside the window and is not a window probe, so the
} segment is discarded and an ACK is sent. This ACK will contain the
} expected sequence number."
}
} This code does a 'goto dropafterack' which causes an ACK to be sent.
} In effect, tcp_input() is saying: "No, that's not the segment I want;
} try again." The connection meanwhile stays in the SYN_RECEIVED state.

I believe this is where the bug lies. I don't think an ACK should be sent
in this case if we're in the SYN-RECEIVED state, the incoming segment
has the ACK bit set, and its acknowledgment number fails the SYN-RECEIVED
ACK test, which means that the ACK could not be for our SYN.

} The kernel is now trapped in a loop: tcp_input() will again send an ACK
} with the same sequence numbers, which will be reflected right back to
} tcp_input(), which will cause tcp_input() to send another ACK with the
} same sequence numbers, which will be reflected right back to tcp_input(),
} which will cause tcp_input() to send another ACK with the same sequence
} numbers, which will be reflected right back to tcp_input(), which will
} cause tcp_input() to send another ACK with the same sequence numbers,
} which will be reflected right back to tcp_input(), which will cause
} tcp_input() to send another ACK with the same sequence numbers, and so
} on, and so on, and so on, etc...

Yup.
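
To make the loop concrete, here is a toy model in C. It is only a
sketch, not the BSD code: struct toy_tcb, in_window() and land_rounds()
are made-up names, the window size is arbitrary, and the real
tcp_input() does a good deal more (trimming, options, timers). It
assumes that every segment we send is reflected straight back to us,
and it counts how many reflected segments are processed before
something breaks the cycle. When with_fix is set, it applies the
SYN-RECEIVED ACK test and stops instead of sending another ACK.

```c
#include <stdint.h>
#include <stdbool.h>

/* Wraparound-safe sequence comparisons, in the style of BSD's tcp_seq.h. */
#define SEQ_LT(a, b)  ((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)
#define SEQ_GT(a, b)  ((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)
#define SEQ_GEQ(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

/* Hypothetical, heavily simplified control block for SYN-RECEIVED. */
struct toy_tcb {
    uint32_t rcv_nxt;   /* next sequence number expected (IRS + 1) */
    uint32_t rcv_wnd;   /* receive window size (arbitrary here) */
    uint32_t snd_una;   /* oldest unacknowledged sequence number (ISS) */
    uint32_t snd_max;   /* highest sequence number sent (ISS + 1) */
};

/* True if seq lies inside [rcv_nxt, rcv_nxt + rcv_wnd). */
static bool in_window(const struct toy_tcb *tp, uint32_t seq)
{
    return SEQ_GEQ(seq, tp->rcv_nxt) &&
           SEQ_LT(seq, tp->rcv_nxt + tp->rcv_wnd);
}

/*
 * A forged SYN with sequence number forged_seq and identical source
 * and destination arrives at a listener that picks iss as its own
 * initial sequence number. Returns the number of reflected segments
 * processed before the loop stops, capped at max_rounds.
 */
static int land_rounds(uint32_t forged_seq, uint32_t iss,
                       bool with_fix, int max_rounds)
{
    struct toy_tcb tcb = {
        .rcv_nxt = forged_seq + 1,
        .rcv_wnd = 8192,
        .snd_una = iss,
        .snd_max = iss + 1
    };
    uint32_t seq = iss;           /* our SYN-ACK: seq = ISS ...     */
    uint32_t ack = tcb.rcv_nxt;   /* ... and ack = IRS + 1          */
    int rounds = 0;

    while (rounds < max_rounds) {
        rounds++;                 /* the segment comes right back   */
        if (in_window(&tcb, seq))
            break;                /* acceptable: not the loop case  */
        if (with_fix && (SEQ_GT(tcb.snd_una, ack) ||
                         SEQ_GT(ack, tcb.snd_max)))
            break;                /* ACK can't be for our SYN: RST  */
        /* dropafterack: send an ACK describing our unchanged state. */
        seq = tcb.snd_max;
        ack = tcb.rcv_nxt;
    }
    return rounds;
}
```

With a forged sequence number of 1000 and an ISS of 500000,
land_rounds(1000, 500000, false, 1000) runs until it hits the cap,
while the same call with with_fix set stops after the first reflected
segment.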

} All this takes place with some interrupts masked, which means the OS
} is spinning out of control with no way to stop it short of a shotgun
} blast to the head. With BSD you can still manually break into the kernel
} debugger and regain partial control of the system and at the very least
} force a panic rather than pushing the big red button.

From the symptom reports it sounds like some implementations aren't
locked up quite so tightly and allow the TCP timers to run. If that
happens, then the connection will time out after a while and the machine
will recover.

} A very quick (but not entirely correct) way to short-circuit the loop
} is to test tp->t_state and tiflags before unconditionally branching to
} dropafterack; if we are in the TCPS_SYN_RECEIVED and tiflags has something
} other than the TH_ACK bit set, then we should not jump directly to
} dropafterack. This traps the case where the SYN,ACK is received while
} in the SYN_RECEIVED case (which is bogus) and stops the loop condition.

This is not correct. If our SYN-ACK is lost, then our client will
resend its initial SYN. If we're doing T/TCP and our SYN-ACK is lost,
then I think we could see SYN-FIN. If we're doing a simultaneous open,
then we'll see a SYN-ACK.

} This is not the completely right thing to do though because there's
} another case where it won't help. Say for a moment that an attacker has
} a way to predict the initial sequence number chosen by the 'victim
} machine' and uses that in his forged 'loopback SYN' segment. If this
} happens, the connection will get all the way to the ESTABLISHED state,
} and the process listen()ing on the socket will end up talking to itself.
} I tested this on my home machine by hacking tcp_input() to choose a
} predictable sequence number and then modifying the 'loopback SYN'
} program to use the same ISS that the kernel would. This time, the
} reflected SYN,ACK that comes back to tcp_input() appears to be inside
} the expected window, so it avoids the branch to dropafterack. Instead,
} the connection becomes ESTABLISHED and the server process becomes
} schizophrenic and talks to itself.

Yup, though it should be possible to avoid this by following the advice
of RFC 1122 and keeping track of the state prior to switching to
SYN-RECEIVED. If the previous state was LISTEN, then we should never
see a SYN-ACK, and if the previous state was SYN-SENT, then we can expect
to see either SYN or SYN-ACK.
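
A minimal sketch of that bookkeeping in C, under stated assumptions:
enum syn_rcvd_origin and segment_plausible() are hypothetical names
that do not exist in the BSD sources, and the sketch only judges
SYN-ACK segments. Everything else is waved through to the normal
processing.

```c
#include <stdint.h>
#include <stdbool.h>

/* TCP header flag bits, same values as in <netinet/tcp.h>. */
#define TH_SYN 0x02
#define TH_ACK 0x10

/* Hypothetical record of how the connection got to SYN-RECEIVED. */
enum syn_rcvd_origin {
    FROM_LISTEN,     /* passive open */
    FROM_SYN_SENT    /* active open that met a bare SYN */
};

/*
 * Returns false for a segment that should never show up in
 * SYN-RECEIVED given how we got there: a SYN-ACK is only plausible
 * after a simultaneous open, i.e. when the previous state was
 * SYN-SENT.
 */
static bool segment_plausible(enum syn_rcvd_origin origin, uint8_t flags)
{
    bool is_syn_ack = (flags & (TH_SYN | TH_ACK)) == (TH_SYN | TH_ACK);

    if (!is_syn_ack)
        return true;     /* this sketch only judges SYN-ACK segments */
    return origin == FROM_SYN_SENT;
}
```

A caller would send an RST and drop the connection when this returns
false. A retransmitted bare SYN still gets through, which matters
because the client will resend its SYN if our SYN-ACK is lost.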

BTW, it's possible to also have phun with two listening sockets, either
on the same machine or two different machines, though this bug is probably
easier to exploit with two machines. You can chew up some bandwidth by
sending SYNs to the two sockets using each other's source addresses.
If you can guess the sequence numbers, then you can connect two listening
sockets to each other.

} Whether or not this problem affects a given system depends (obviously)
} on the TCP input processing. Some systems that are BSD-derived had made
} changes that affect SYN processing (like deflecting SYN floods for
} instance) which also happen to provide protection from this attack.
} Non-BSD systems like Solaris 2.x may perform more stringent checks,
} or do the state processing and window checking in a different order
} which allows them to spot the bogus segment before it does any damage.

There have been some changes made to the FreeBSD stack for SYN flood
protection, but they are of no use against this problem. There was
one change that was made and then backed out that seems to have protected
against this, but it had the problem that if you sent a forged SYN to
a listening socket, the socket ignored the RST segments that were sent
by the real owner of the source address in response to the SYN-ACKs
that were sent in response to the SYN.

Judging by the wide variety of vulnerable systems, I'd say this problem
is quite widespread in spite of the hardening that various vendors have
done.

Here's a message I sent to freebsd-hackers a day or so ago in hopes of
soliciting some comments from the experts.

On Dec 2, 12:17am, Don Lewis wrote:
) Subject: fixes for "LAND" and various other TCP bugs
) While studying the "LAND" bug I stumbled across a couple of other
) things that look like bugs in the FreeBSD TCP code.
)
) The most serious is that the introduction of the segment trimming
) fix from Stevens looks like it introduced a bug in RST processing
) that allows reset segments with sequence numbers spanning half the
) sequence number space to be accepted as valid and processed, which
) is contrary to RFC 793. I think the original code didn't really get
) this totally right, either.
)
) The second is that reset segments terminate connections in the
) TIME-WAIT state. While this is what RFC 793 says to do, it is
) discouraged by RFC 1337.
)
) The attached patch contains three partially redundant fixes (two
) new ones plus the present one relocated) for the "LAND" DoS attack
) and a related problem (ACK wars caused by sending spoofed SYNs to
) two listening sockets) as well as fixes for the above problems.
)
) I did not include the following change suggested by RFC 1337 in
) section 2.1, but the final part of my "LAND" fix is really a special
) case of this:
)
) H2. The new connection may be de-synchronized, with the two ends
) in permanent disagreement on the state. Following the spec
) of RFC-793, this desynchronization results in an infinite ACK
) loop. (It might be reasonable to change this aspect of RFC-
) 793 and kill the connection instead.)
)
) I'm wondering if it might be wise to actually make this change, since
) it should provide at least partial protection against TCP splicing attacks.
)
) Something else that is missing is a fix for the potential to connect
) two listening sockets to each other by sending them each forged SYNs
) with carefully chosen sequence numbers. This should be easy to fix
) if RFC 1122 is followed:
) 4.2.2.11 Recovery from Old Duplicate SYN: RFC-793 Section 3.4,
) page 33
)
) Note that a TCP implementation MUST keep track of whether a
) connection has reached SYN_RCVD state as the result of a
) passive OPEN or an active OPEN.
) If this is the case, we would only expect to see a SYN-ACK in the
) SYN-RECEIVED state if the previous state was SYN-SENT. If the
) previous state was not SYN-SENT, then an RST should be sent and
) the connection dropped.
)
) This patch changes the location of the currently implemented "LAND" bug
) fix, which should allow legitimate self-connects to work. My testing
) with my 2.1-stable machine indicates that it is not vulnerable to the
) problem even without this protective measure.
)
) I *think* this is the correct fix, but I'm interested in any comments
) from the TCP experts, especially with regards to T/TCP.

[ patch deleted ]

I found one slight problem with that patch. Here's what I'm currently
using on a couple of machines. It's relative to 2.2-stable, but it
applies cleanly to 2.1-stable if you delete the first section.

--- tcp_input.c.2_2 Mon Dec 1 16:49:21 1997
+++ tcp_input.c Wed Dec 3 02:21:45 1997
@@ -318,19 +318,6 @@
#endif /* TUBA_INCLUDE */

/*
- * Reject attempted self-connects. XXX This actually masks
- * a bug elsewhere, since self-connect should work.
- * However, a urrently-active DoS attack in the Internet
- * sends a phony self-connect request which causes an infinite
- * loop.
- */
- if (ti->ti_src.s_addr == ti->ti_dst.s_addr
- && ti->ti_sport == ti->ti_dport) {
- tcpstat.tcps_badsyn++;
- goto drop;
- }
-
- /*
* Check that TCP offset makes sense,
* pull out TCP options and adjust length. XXX
*/
@@ -654,6 +641,24 @@
if (m->m_flags & (M_BCAST|M_MCAST) ||
IN_MULTICAST(ntohl(ti->ti_dst.s_addr)))
goto drop;
+
+ /*
+ * Reject attempted self-connects.
+ *
+ * Doing the test here should prevent the "LAND" DoS
+ * attack without affecting legitimate self-connects
+ * which will occur in the SYN-SENT state.
+ *
+ * In the dropafterack code below we'll also fix the real
+ * bug in the SYN-RECEIVED state that causes the infinite
+ * loop since it can also be used to generate ACK storms.
+ */
+ if (ti->ti_src.s_addr == ti->ti_dst.s_addr
+ && ti->ti_sport == ti->ti_dport) {
+ tcpstat.tcps_badsyn++;
+ goto drop;
+ }
+
am = m_get(M_DONTWAIT, MT_SONAME); /* XXX */
if (am == NULL)
goto drop;
@@ -962,17 +967,99 @@

/*
* States other than LISTEN or SYN_SENT.
- * First check timestamp, if present.
+ * First check the RST flag and sequence number since reset segments
+ * are exempt from the timestamp and connection count tests. This
+ * fixes a bug introduced by the Stevens, vol. 2, p. 960 bugfix
+ * below which allowed reset segments in half the sequence space
+ * to fall through and be processed (which gives forged reset
+ * segments with a random sequence number a 50 percent chance of
+ * killing a connection).
+ * Then check timestamp, if present.
* Then check the connection count, if present.
* Then check that at least some bytes of segment are within
* receive window. If segment begins before rcv_nxt,
* drop leading data (and SYN); if nothing left, just ack.
*
+ *
+ * If the RST bit is set, check the sequence number to see
+ * if this is a valid reset segment.
+ * RFC 793 page 37:
+ * In all states except SYN-SENT, all reset (RST) segments
+ * are validated by checking their SEQ-fields. A reset is
+ * valid if its sequence number is in the window.
+ * Note: this does not take into account delayed ACKs, so
+ * we should test against last_ack_sent instead of rcv_nxt.
+ * Also, it does not make sense to allow reset segments with
+ * sequence numbers greater than last_ack_sent to be processed
+ * since these sequence numbers are just the acknowledgement
+ * numbers in our outgoing packets being echoed back at us,
+ * and these acknowledgement numbers are monotonically
+ * increasing.
+ * If we have multiple segments in flight, the initial reset
+ * segment sequence numbers will be to the left of last_ack_sent,
+ * but they will eventually catch up.
+ * In any case, it never made sense to trim reset segments to
+ * fit the receive window since RFC 1122 says:
+ * 4.2.2.12 RST Segment: RFC-793 Section 3.4
+ *
+ * A TCP SHOULD allow a received RST segment to include data.
+ *
+ * DISCUSSION
+ * It has been suggested that a RST segment could contain
+ * ASCII text that encoded and explained the cause of the
+ * RST. No standard has yet been established for such
+ * data.
+ *
+ * If the reset segment passes the sequence number test examine
+ * the state:
+ * SYN_RECEIVED STATE:
+ * If passive open, return to LISTEN state.
+ * If active open, inform user that connection was refused.
+ * ESTABLISHED, FIN_WAIT_1, FIN_WAIT2, CLOSE_WAIT STATES:
+ * Inform user that connection was reset, and close tcb.
+ * CLOSING, LAST_ACK STATES
+ * Close the tcb.
+ * TIME_WAIT state:
+ * Drop the segment - see Stevens, vol. 2, p. 964 and
+ * RFC 1337.
+ */
+ if (tiflags&TH_RST) {
+ if (tp->last_ack_sent == ti->ti_seq) {
+ switch (tp->t_state) {
+
+ case TCPS_SYN_RECEIVED:
+ so->so_error = ECONNREFUSED;
+ goto close;
+
+ case TCPS_ESTABLISHED:
+ case TCPS_FIN_WAIT_1:
+ case TCPS_FIN_WAIT_2:
+ case TCPS_CLOSE_WAIT:
+ so->so_error = ECONNRESET;
+ close:
+ tp->t_state = TCPS_CLOSED;
+ tcpstat.tcps_drops++;
+ tp = tcp_close(tp);
+ break;
+
+ case TCPS_CLOSING:
+ case TCPS_LAST_ACK:
+ tp = tcp_close(tp);
+ break;
+
+ case TCPS_TIME_WAIT:
+ break;
+ }
+ }
+ goto drop;
+ }
+
+ /*
* RFC 1323 PAWS: If we have a timestamp reply on this segment
* and it's less than ts_recent, drop it.
*/
- if ((to.to_flag & TOF_TS) != 0 && (tiflags & TH_RST) == 0 &&
- tp->ts_recent && TSTMP_LT(to.to_tsval, tp->ts_recent)) {
+ if ((to.to_flag & TOF_TS) != 0 && tp->ts_recent &&
+ TSTMP_LT(to.to_tsval, tp->ts_recent)) {

/* Check to see if ts_recent is over 24 days old. */
if ((int)(tcp_now - tp->ts_recent_age) > TCP_PAWS_IDLE) {
@@ -1003,10 +1090,19 @@
* RST segments do not have to comply with this.
*/
if ((tp->t_flags & (TF_REQ_CC|TF_RCVD_CC)) == (TF_REQ_CC|TF_RCVD_CC) &&
- ((to.to_flag & TOF_CC) == 0 || tp->cc_recv != to.to_cc) &&
- (tiflags & TH_RST) == 0)
+ ((to.to_flag & TOF_CC) == 0 || tp->cc_recv != to.to_cc))
goto dropafterack;

+ /*
+ * In the SYN-RECEIVED state, validate that the packet belongs to
+ * this connection before trimming the data to fit the receive
+ * window. Check the sequence number versus IRS since we know
+ * the sequence numbers haven't wrapped. This is a partial fix
+ * for the "LAND" DoS attack.
+ */
+ if (tp->t_state == TCPS_SYN_RECEIVED && SEQ_LT(ti->ti_seq, tp->irs))
+ goto dropwithreset;
+
todrop = tp->rcv_nxt - ti->ti_seq;
if (todrop > 0) {
if (tiflags & TH_SYN) {
@@ -1118,40 +1214,6 @@
}

/*
- * If the RST bit is set examine the state:
- * SYN_RECEIVED STATE:
- * If passive open, return to LISTEN state.
- * If active open, inform user that connection was refused.
- * ESTABLISHED, FIN_WAIT_1, FIN_WAIT2, CLOSE_WAIT STATES:
- * Inform user that connection was reset, and close tcb.
- * CLOSING, LAST_ACK, TIME_WAIT STATES
- * Close the tcb.
- */
- if (tiflags&TH_RST) switch (tp->t_state) {
-
- case TCPS_SYN_RECEIVED:
- so->so_error = ECONNREFUSED;
- goto close;
-
- case TCPS_ESTABLISHED:
- case TCPS_FIN_WAIT_1:
- case TCPS_FIN_WAIT_2:
- case TCPS_CLOSE_WAIT:
- so->so_error = ECONNRESET;
- close:
- tp->t_state = TCPS_CLOSED;
- tcpstat.tcps_drops++;
- tp = tcp_close(tp);
- goto drop;
-
- case TCPS_CLOSING:
- case TCPS_LAST_ACK:
- case TCPS_TIME_WAIT:
- tp = tcp_close(tp);
- goto drop;
- }
-
- /*
* If a SYN is in the window, then this is an
* error and we send an RST and drop the connection.
*/
@@ -1660,9 +1722,22 @@
/*
* Generate an ACK dropping incoming segment if it occupies
* sequence space, where the ACK reflects our state.
- */
- if (tiflags & TH_RST)
- goto drop;
+ *
+ * We can now skip the test for the RST flag since all
+ * paths to this code happen after packets containing
+ * RST have been dropped.
+ *
+ * In the SYN-RECEIVED state, don't send an ACK unless the
+ * segment we received passes the SYN-RECEIVED ACK test.
+ * If it fails send a RST. This breaks the loop in the
+ * "LAND" DoS attack, and also prevents an ACK storm
+ * between two listening ports that have been sent forged
+ * SYN segments, each with the source address of the other.
+ */
+ if (tp->t_state == TCPS_SYN_RECEIVED && (tiflags & TH_ACK) &&
+ (SEQ_GT(tp->snd_una, ti->ti_ack) ||
+ SEQ_GT(ti->ti_ack, tp->snd_max)) )
+ goto dropwithreset;
#ifdef TCPDEBUG
if (so->so_options & SO_DEBUG)
tcp_trace(TA_DROP, ostate, tp, &tcp_saveti, 0);
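
To illustrate the RST part of the patch, here is a rough model in C of
the two acceptance rules. It is a sketch under assumptions, not the
kernel code: both function names are made up, and the "before" rule
only approximates the behavior described above, where a reset that
starts to the left of the window is trimmed as duplicate data and then
processed anyway.

```c
#include <stdint.h>
#include <stdbool.h>

/* Wraparound-safe sequence comparison, in the style of BSD's tcp_seq.h. */
#define SEQ_LT(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)

/*
 * Approximate pre-patch rule: anything that does not start beyond the
 * right edge of the window gets through, which covers roughly half of
 * the sequence number space.
 */
static bool rst_accepted_before_patch(uint32_t rcv_nxt, uint32_t rcv_wnd,
                                      uint32_t seq)
{
    return SEQ_LT(seq, rcv_nxt + rcv_wnd);
}

/* Patched rule: the reset must match the last ACK we sent exactly. */
static bool rst_accepted_after_patch(uint32_t last_ack_sent, uint32_t seq)
{
    return seq == last_ack_sent;
}
```

A forged reset whose sequence number is a billion behind rcv_nxt passes
the first test and fails the second. Only a reset that echoes
last_ack_sent passes the second.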

--- Truck