Original Question:
We have various machines (Sparc 4, 5, 20, 1000, 2000, Ultra 2) all running
Solaris 2.5.1. Once in a while a filesystem will fill to capacity, usually from
some runaway process that those lovely developers like to run. After we find the
culprit and delete the massive offending file (not to mention beating the
offending user liberally about the head and shoulders), the filesystem will not
reset itself. When I type 'df -lk', the filesystem (/home) still shows 100%
used, but 'du -sk /home' shows that it is definitely not full. The only way I
have found to reset the filesystem is to umount and then mount it again.
Unfortunately, /home holds the users' home directories, and the only way to
umount it is to run 'fuser -cuk /home', which kills all the processes using
the /home partition. Then I can umount/mount /home. Is there any way to reset
the UFS so that I don't have to kick everyone off?
Answer:
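Everyone who responded had the same answer: just deleting the file isn't
enough. As long as a process still has the file open, its disk blocks stay
allocated, so the process holding it open must also be killed (or must close
the file) before the space is freed. That is why df, which counts allocated
blocks, still reports 100% while du, which only adds up files that still have
a name in the directory tree, does not. The best program I have found for
tracking down such a process is lsof
(vic.cc.purdue.edu/pub/tools/unix/lsof/lsof_3.81_W.tar.gz). I use lsof all the
time and forgot to mention it in my post. There are at least three scenarios: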
1) You know the name of the file that is growing huge. This is the easiest one.
Just run 'lsof <path-to-huge-file>', e.g. 'lsof /tmp/hugefile'. This will show
you the process and the PID that has the file open. You can then kill the
process and delete the file.
2) You DON'T know the name of the file but DO know the name of the user. Run
'lsof -u<username|userid>'. This will show you the user's processes and the
open files associated with them. This can be a lot of information to wade
through, but hopefully you can pick out the offending process. If not, you can
kill that user's sessions, then run 'lsof -u<user>' again to see if anything is
left.
3) You don't know squat about the culprit. All you know is which filesystem is
full. This was my case a couple of times. There are no large files and none of
the users will fess up to anything (typical). You're basically left with
running 'lsof <filesystem>', e.g. 'lsof /home'. Alas, if there are 50 people
logged in you can see the immediate problem: how the heck do you know which
process is causing it? That's when I resort to 'fuser -cuk /home', then
umount/mount /home. If anybody has a better way please let us know.
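
The behaviour behind the answer (a deleted file keeps its space while
something still has it open) can be reproduced safely in a scratch directory.
The sketch below is only an illustration, not something the responders sent:
the scratch directory, the file name hugefile and the use of file descriptor 3
as a stand-in for the runaway process are all assumptions made for the demo,
and it assumes a system that provides mktemp.

```shell
#!/bin/sh
# Demo: a deleted file stays allocated while something still holds it open.
# Everything here (scratch dir, file name, fd 3) is hypothetical demo scaffolding.

scratch=$(mktemp -d)                  # private scratch directory
printf 'still here\n' > "$scratch/hugefile"

exec 3< "$scratch/hugefile"           # hold the file open, like the runaway process
rm "$scratch/hugefile"                # delete the name, as the admin would

# The name is gone, so du and ls no longer see the file...
if [ -e "$scratch/hugefile" ]; then name_exists=yes; else name_exists=no; fi

# ...but the data is still reachable through the open descriptor,
# so the filesystem cannot release its blocks yet.
content=$(cat <&3)

exec 3<&-                             # closing the last descriptor frees the space
rmdir "$scratch"

echo "name exists after rm: $name_exists"
echo "data read via open descriptor: $content"
```

Closing descriptor 3 plays the part of killing the process: only once the last
open reference is gone can the filesystem give the blocks back.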
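
For scenario 3, one possible way to narrow down a long listing is to sort it by
size, so that the biggest open files come out on top. This is a sketch under
assumptions, not a method the responders suggested: largest_open is a
hypothetical helper name, and it assumes the size sits in the seventh
whitespace-separated column (headed SIZE/OFF in common lsof layouts). That
column can hold an offset instead of a size, its position can differ between
lsof versions, and the count is thrown off if an earlier column is blank.

```shell
#!/bin/sh
# largest_open: hypothetical helper, not part of lsof.
# Reads an lsof-style listing on stdin and prints the N entries with the
# largest value in column 7 (assumed to be SIZE/OFF), biggest first.
largest_open() {
    sed 1d |                # drop the header line
    sort -k7,7nr |          # numeric sort on column 7, descending
    head -n "${1:-10}"      # keep the top N (default 10)
}

# Intended use on a real system (assumes lsof is installed):
#   lsof /home | largest_open 5

# Canned sample so the sketch runs anywhere; the rows are made up.
sample='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
csh 101 alice cwd VDIR 32,6 512 2 /home
sim 202 bob 3w VREG 32,6 1900000000 7731 /home
vi 303 carol 4u VREG 32,6 20480 915 /home/carol/notes'

top=$(printf '%s\n' "$sample" | largest_open 1)
echo "$top"
```

A very large entry whose name no longer matches any file you can find on disk
is a likely suspect, and killing just that PID may avoid the fuser/umount
route. Later lsof releases also have a +L1 option that lists only open files
whose link count is below 1; check the man page of the installed version
before relying on it.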
Thanks to all the responders:
"Bruce Rossiter" <AROSSITE@us.oracle.com>
Tom Mornini <tmornini@infomania.com>
Ric Anderson <ric@rtd.com>
Rich Kulawiec <rsk@itw.com>
Glenn Satchell <Glenn.Satchell@Uniq.com.au>
White Gary <Gary.White@ramstein.af.mil>
Casper Dik <casper@holland.Sun.COM>
rtrzaska@uk.mdis.com (Ray Trzaska)
Douglas Weigold <dweigold@hokie.cary.mci.net>
"Iskander, Tim" <ISKANDER@infimed.com>
Per Boussard <per@era-t.ericsson.se>
"K.Ravi" <RAVKRISH.IN.ORACLE.COM.ofcmail@in.oracle.com>
seanw@amgen.com (Sean Ward)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Rick von Richter | Phone: 619-552-6222
Systems/Network Admin | Fax: 619-552-6221
Maintenance Warehouse | Email: rickv@mwh.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Science is true. Don't be misled by facts.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~