SUMMARY: Strange RPC Problem

poore (poore@rdesparc.vistachrome.com)
Thu, 06 Mar 1997 09:01:25 -0500

Dear Sunners,

Here is the original question I posed:

"During the boot process, after the first batch of RPC services loaded
(kerbd, keyserv, etc.) the system would hang. I was able to determine
that the problem was occurring during execution of the /etc/rc2.d/S71rpc
script, which is pretty much stock out of the box. None of the
configuration files associated with this have been modified (as far as
I know) and all of the scripts and programs called by this startup
script are in place and unmodified. Prior to this I had verified that
the network connection was good and that there wasn't any odd behavior
in the network.

After booting without this script, I manually started rpcbind and then
executed the other startup scripts for those services which rely on rpc
(nis, nfs, etc.). It all worked just fine. I then put the rpc
run-script back in its proper location, and rebooted the system. It
booted normally and appears to be functioning normally at this time."

=-=-=-=-=-

Thanks to:

birger@Vest.Sdata.No
ric@rtd.com

for the following suggestions:

=-=-=-=-=-

"Strange events with RPC are sometimes related to blank lines in NIS maps.
At least this was a common problem with Solaris 1.x.
Check your rpc and services maps for blank lines.

Birger"

=-=-=-=-=-

This was an interesting suggestion. I checked all the NIS maps and found no
blank lines...
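
For anyone who wants to repeat the blank-line check, here is a minimal sketch in Python. It assumes the map source is available as plain text (for example the /etc/rpc or /etc/services file the map is built from, or saved ypcat output). The function name and the sample entries are illustrative only and are not taken from my system.

```python
def blank_line_numbers(text):
    """Return the 1-based numbers of lines that are empty or whitespace-only."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip() == ""
    ]


# Hypothetical sample in the style of an rpc map source file.
sample = (
    "rpcbind 100000 portmap sunrpc\n"
    "\n"
    "rstatd 100001 rstat\n"
    "   \n"
    "nfs 100003 nfsprog\n"
)

print(blank_line_numbers(sample))
```

To check a real file, read it with open(path).read() and pass the result to blank_line_numbers(); an empty list means no blank lines were found.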

=-=-=-=-=-

Generally, wacky network-related boot hangs are caused by
something external (like a name service) being offline. The
best insurance is to make sure nsswitch.conf has a hosts line
that reads:
hosts: files dns (or "files nis" or whatever)
so /etc/hosts gets searched ahead of other sources. Then
make sure all the entries for the local machine are present
in /etc/hosts. At that point, you should be able to boot
with the network disconnected (at least up to the point
where you try to NFS mount disks).

Cheers,
Ric (<ric@rtd.com> "Ric Anderson", using RTD's public internet access)

=-=-=-=-=-

I checked the local hosts file and the entries were appropriate. Also, the
nsswitch.conf file was pointing to files first.
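
The nsswitch.conf check can also be scripted. The following is a rough sketch in Python, assuming the usual "database: source ..." line format with optional bracketed action criteria such as [NOTFOUND=return]. The helper names and the sample text are made up for illustration.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources on the hosts line, in order (empty if none)."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove bracketed action criteria, e.g. [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def files_first(nsswitch_text):
    """True if the hosts line exists and lists 'files' as its first source."""
    sources = hosts_sources(nsswitch_text)
    return bool(sources) and sources[0] == "files"


# Hypothetical nsswitch.conf fragments.
good = "passwd: files nis\nhosts: files dns  # local file first\n"
bad = "hosts: nis [NOTFOUND=return] files\n"

print(hosts_sources(good), files_first(good))
print(hosts_sources(bad), files_first(bad))
```

This only confirms the search order. It does not verify that /etc/hosts actually contains the entries for the local machine, which still has to be checked separately.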

The only explanation I can think of at this point is that there must have
been some strange network problem with this particular host's
connection, since the NIS server was providing service to all the other
clients on the network. I'll do some network testing this weekend...

Thanks for your help,

David Poore
poore@homes.com