Re: strcpy versus strncpy

Mark Whitis (whitis@dbd.com)
Wed, 04 Mar 1998 16:48:03 -0500

On Tue, 3 Mar 1998, Chris L. Mason wrote:
> What's wrong with using the following (I got the idea from some of
> Stevens' code)?
>
> char *sstrcpy(char *dst, size_t n, const char *src) {
> if (strlen(src) > (n - 1)) {
> errno = ENOSPC;
> return NULL;
> }
>
> strcpy(dst, src);
> dst[n - 1] = '\0';
>
> return dst;
> }

Note: This is a long message which discusses strncpy() at length
and includes a new, improved, lemon-scented version.

As you point out, it is not multithread safe and there is no
good reason this has to be the case. But there is a more
serious problem which will manifest itself if you are not
extremely carefull to always check the errno and never
use the result if errno is set.

This, and at least one other code example posted here have
a serious flaw: in the event of a buffer overflow, they
fail to terminate the string (at zero length).
If the calling function ever forgets to check for an error
before using the result (or uses the result even though
it checked for an error), you will get a surprise: either
you will get an old value or you will get a garbage value
which may result in buffer overflows or segment faults elsewhere.
Even if you do check the return code, it is easy for situations
to creep in where the value is still used for some purposes later.

Almost all string functions should poke a zero into the first
byte of the return string as long as two conditions are met: the
pointer is not null and the string length is greater than zero.
I sometimes do this at the very beginning along with copy
routines that always add a zero after each byte copied; this
insures that the string will always be valid even if the
function ends or if the value is looked at from another thread
(although you should never rely on the value being valid
unless you have taken precautions to insure against
simultaneous access).

Also, many string functions are much friendlier if they treat
a NULL pointer the same as a null string for source values.
In some cases, this friendly behavior might let something slip
through but I think it is best to save the programmer from unneccessary
busy work so they have time to solve the more serious problems.
For example:
safe_strncpy(tmpdir,getenv("TEMP"),sizeof(tmpdir));
In those few cases where a null string is undesirable, you should
check for it explictly since it could result from other causes anyway.

A good string copy should:
- check for zero length on destination string
options:
assert()
return exception
do nothing
- check for null pointer on destination string
options:
assert()
return exception
do nothing
- check for null pointer on source string
options:
assert() (rarely desireable)
return exception (rarely desireable)
return zero length string
- check for overflow
options:
assert()
return exception and truncated string, zero terminated string
return exception and zero length string
return exception and unmodified string (Bad!)
return truncated, zero terminated string
return zero length string
overflow (Very bad)

Filling options:
- Always zero fill to end of buffer
slow with large buffers
prevents information leaks
tends to encourage immediate detection of overflow bugs
- Never fill beyond what is needed
may be faster
strncpy(dest,src,INT_MAX) == strcpy(dest,src)
- Moving zero terminate
- poke zero into last byte of buffer
this is the easiest workaround without a wrapper or replacement
function but generally the least desireable.

Usually, I prefer any string function to treat NULL and "" identically
for a source string; this tends to behave well in most real world
situations and saves writing a lot of extra code; #ifdef PEDANTIC,
you can return an exception or abort using assert() here.

I really dislike the use of dynamicly allocated strings for
mission critical applications. They have generally caused much
grief, although usually much more so when written by someone else.
Memory leaks and fragmentation problems have no place in
code which needs to be robust and secure. In preforked servers,
if there is any dynamic memory code I normally kill off each
subprocess after some finite number of uses to protect against
leaks. I also monitor the process size for leaks. It
will typically grow for some number of iterations and then
roughly stabilize (approach but not necessarilly reach an asymptote)
if there are no leaks; the healthy growth, as opposed
to the cancerous growth of memory leaks, can come from various sources.
Buffers may not all be allocated at the completion of initialization
or even the first few iterations since some buffers are not
allocated until they are first used and they may not all
be used until all code paths have been traversed. Functions
which use realloc() to double the size of a buffer as needed
(generally much better than alloc and free behavior) will take
a while to reach their final sizes.

The use of strncpy() itself is usually a sign of a bug unless
it is immediately followed by buf[sizeof(buf)-1]=0. In trivial
programs, I usually do this manually. In more serious programs
I use a wrapper or replacement function.

When using truncating functions, avoid situations where
the string is truncated to different lengths in different
places. Usually, truncating to a sufficiently conservative
length on input will suffice (should be less than used
by your library functions). I.E. if you truncate all
input strings to 256 bytes or less and all your library
functions are required to handle at least 1024 bytes then
you should be okay provided you do not combine more than
four strings (including prefixes), cummulative. Be particularly
wary of truncation before any deny known bad types of tests
(which your system should not rely on anyway).
Inconsistent truncation may result in the test failing to detect
unwanted input.

Don't forget to reduce length if you started at other than
the beginning of the buffer (i.e. for some kind of concatenation
operation). Otherwise, copy routines which poke a zero into
the end of the string, zero fill, or long source strings will
result in a little surprise.

In the example below, I give various compile time options.
gracefully. This function has been modified quite a bit
since it was extensively tested.

There is something to be said for having a fourth argument "options",
since it could be handy to choose your options based on context.

Given the number of options, I do suggest looking at the output
of the C preprocessor with your choice of options for a sanity check.

This function is generally suitable for replacing existing calls
to strncpy() with safe_strncpy() by global search and replace.
However, in a very small percentage of cases, you may find
existing code which depends on the documented ideosyncracies
of strncpy(). Usually, this is when strncpy() is called
with a value which will deliberately prevent the trailing null
from being written (either to write to a non null terminated
string or to insert into the middle of a string).

If you wish to incorporate the following function in GPLed code,
send me email.

#include <assert.h>
#include <errno.h>
/* safe_srncpy() is Copyright 1998 by Mark Whitis <whitis@dbd.com> */
/* All Rights reserved. */
/* use it or distribute it (including for profit) at your own risk */
/* but please don't GPL it. Retain this notice. Modified versions */
/* should be clearly labeled as such. */
char *safe_strncpy(char *dst, char *src, int size)
{
char *savedst;

savedst = dst;

/* Make sure destination buffer exists and has positive size */
#ifdef SAFE_STRNCPY_DESTINATION_REQUIRED
assert(dst);
assert(size>0);
#else
if(!dst) {
errno = EINVAL;
return(dst);
}
if(size<1) {
errno = EINVAL;
return(dst);
}
#endif

/* Do something sensible if we receive a NULL pointer */
#ifdef SAFE_STRNCPY_ABORT_NOSRC
assert(src);
#else
/* treat NULL and "", identically */
if(!src) {
dst[0]=0;
#ifdef SAFE_STRNCPY_ERROR_NOSRC
errno = EINVAL;
#endif
return(dst);
}
#endif

size--; /* leave room for trailing zero */
while(*src && ( size-- > 0 ) ) {
#ifdef SAFE_STRNCPY_MOVING_ZERO_TERM
*(dst+1) = 0;
#endif
*dst++ = *src++;
}
*dst = 0;

#ifdef SAFE_STRNCPY_ZERO_FILL
while( size-- > 0 ) {
*dst++ = 0;
}
#endif

/* If we didn't copy the whole string, report error */
if(*src) {
errno=ENOSPC;
#ifdef SAFE_STRNCPY_ALL_OR_NOTHING
/* return zero length string instead of truncated string */
/* on truncation error */
savedst = 0;
#endif
}
#ifdef SAFE_STRNCPY_TRUNC_ABORT
assert(!*src);
#endif

return(dst);
}

---------------------------------------------------------------------------
--- Mark Whitis <whitis@dbd.com> WWW: http://www.dbd.com/~whitis/ ---
--- 428-B Moseley Drive; Charlottesville, VA 22903 ---
---------------------------------------------------------------------------