Tuesday, July 29, 2014

[nim] bootp request retry attempt

The following checks are to be made from the master's side. 

Before running any check it is always good to run a reset/deallocate operation so we know that we're starting fresh. 

# nim -Fo reset client_name
# nim -o deallocate -a subclass=all client_name 

  • The first check from your master's side would be to verify that your master/client definitions and network definitions are correct. 

Start off with full hostname resolution. For these examples my master's hostname will be 'shadoe' and my client's hostname will be 'kintaro'. You will use your own appropriate hostnames. 

# host shadoe
# host shadoe.austin.ibm.com
# host 9.3.58.215

Your output should look similar to the following for all three commands. Make sure all three commands produce exactly the same output (verbatim):

shadoe is 9.3.58.215,  Aliases:   shadoe.austin.ibm.com

# host kintaro
# host kintaro.austin.ibm.com
# host 9.3.58.216

kintaro is 9.3.58.216,  Aliases:   kintaro.austin.ibm.com
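
The verbatim comparison above can be scripted rather than checked by eye. This is only a minimal sketch: `all_same` is a hypothetical helper name (not part of AIX or NIM) that reads lines on standard input and succeeds only when every line is identical.

```shell
# all_same: read lines on stdin; succeed only if there is at least one
# line and every line is identical to the first one.
# (Hypothetical helper, not an AIX or NIM command.)
all_same() {
    awk 'NR == 1 { first = $0 }
         $0 != first { bad = 1 }
         END { if (NR == 0 || bad) exit 1 }'
}
```

On the master it could be used as `for q in shadoe shadoe.austin.ibm.com 9.3.58.215; do host "$q"; done | all_same && echo consistent`, substituting your own hostnames and address.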

If there is any discrepancy or unexpected output, fix it before proceeding. If hostname resolution looks good, the next step is to make sure NIM has the correct information stored in its database as well. 

# lsnim -l master

This command typically produces a lot of output. What we're really interested in is the master's network definition, which is held in the “if1” attribute. You may have more adapters defined within NIM, but we'll keep it simple for the purpose of this example. To get only the network information, run the command below. *note : type the literal word 'master' in this case, not the hostname of your NIM master. 

# lsnim -l master |grep if1
 if1                 = master_net shadoebso.austin.ibm.com 00145EB7F3F5

The master's network object name is “master_net”. Also verify that the master's hostname is correct in the output of that command. 
Next we'll look at the master's network definition : 

# lsnim -l master_net
master_net:
   class      = networks
   type       = ent
   Nstate     = ready for use
   prev_state = information is missing from this object's definition
   net_addr   = 9.3.58.0
   snm        = 255.255.255.0
   routing1   = default 9.3.58.1

Given the IP addresses I know to be correct this network definition looks right. Next you'll check your client. 
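
"Looks right" here means that the master's and client's IP addresses both fall inside the network described by net_addr and snm. A minimal sketch of that arithmetic follows; `ip_to_int` and `in_network` are hypothetical helper names, and the sketch assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# ip_to_int: convert a dotted-quad IPv4 address to an integer.
# (Hypothetical helper; assumes octets have no leading zeros.)
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1              # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_network IP NET_ADDR SNM: succeed if IP lies inside NET_ADDR/SNM.
in_network() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

With the values from this example, `in_network 9.3.58.215 9.3.58.0 255.255.255.0` and `in_network 9.3.58.216 9.3.58.0 255.255.255.0` should both succeed.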

# lsnim -l client_name

kintaro :
   class           = machines
   type            = standalone
   connect         = shell
   platform        = chrp
   netboot_kernel  = mp
   if1             = master_net kintaro 0
   cable_type1     = tp
   Cstate          = ready for a NIM operation
   prev_state      = ready for a NIM operation
   Mstate          = not running
   cpuid           = 00012D4AD200
   Cstate_result   = reset

We can see from the output that this client is also defined on “master_net”, which I know is correct. If this client were defined on any other NIM network, that could be the source of my bootp failure (or of a different NIM boot failure). 
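
To compare the two definitions without reading the full listings, the network name can be pulled out of the if1 line. This is a sketch only: `if1_network` is a hypothetical helper that reads `lsnim -l` output on standard input and assumes the `if1 = network hostname mac` layout shown above.

```shell
# if1_network: print the NIM network name from the if1 attribute line.
# Reads "lsnim -l <object>" output on stdin.
# (Hypothetical helper; assumes the layout shown in this post.)
if1_network() {
    awk '$1 == "if1" { print $3; exit }'
}
```

For example, `[ "$(lsnim -l master | if1_network)" = "$(lsnim -l kintaro | if1_network)" ] && echo "same NIM network"`.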

  • Check the bootp service and /etc/bootptab file.

To verify bootp is running execute the following : 

# lssrc -t bootps
Service       Command                  Description              Status
 bootps       /usr/sbin/bootpd         bootpd /etc/bootptab     active

Make sure the command is listed as “/usr/sbin/bootpd” and the status is active. If the command is wrong or the status is “inoperative”, check your /etc/inetd.conf file. 
Make sure the command string is correct for the bootps line : 

bootps  dgram   udp     wait    root    /usr/sbin/bootpd       bootpd /etc/bootptab

Many times network admins will comment out bootp for security reasons. If this is the case simply uncomment the 'bootps' line. While you're in this file also make sure the 'tftp' line is uncommented.
Run a refresh of inetd from command line to restart these services. 
*note : make sure you clear this with your network admin first. They may not allow you to do this without their knowledge. 

# refresh -s inetd
Your bootp and tftp should both show active now.
# lssrc -t bootps
Service       Command                  Description              Status
 bootps       /usr/sbin/bootpd         bootpd /etc/bootptab     active

# lssrc -s tftpd
Subsystem         Group            PID          Status
 tftpd            tcpip            3735960      active
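
The /etc/inetd.conf check described above can also be done with a small filter. This is a minimal sketch: `commented_services` is a hypothetical helper that reads the file on standard input and prints bootps and/or tftp if their lines are commented out.

```shell
# commented_services: list which of bootps/tftp are commented out.
# Reads inetd.conf-style text on stdin and prints one service per line.
# (Hypothetical helper, not an AIX command.)
commented_services() {
    awk '/^[ \t]*#[ \t]*(bootps|tftp)[ \t]/ {
             sub(/^[ \t]*#[ \t]*/, "")   # strip the leading comment marker
             print $1
         }'
}
```

Running `commented_services < /etc/inetd.conf` on the master should print nothing once both lines are uncommented.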

Next you'll want to set up for your install again. Your /etc/bootptab file may be populated with incorrect information about your master, client, or both. 
Most commonly you'll set up a NIM install via the 'smit' tool. 

# smitty nim_bosinst

Once you've set up your NIM install, cat your /etc/bootptab file. The last entry in the file should be for your target client, and there should be only one entry for any one client. 

# cat /etc/bootptab
kintaro:bf=/tftpboot/kintaro:ip=9.3.58.216:ht=ethernet:sa=9.3.58.215:sm=255.255.255.0:

All of these addresses look valid. If you are still receiving bootp errors you can put bootp into debug. This will tell us whether or not the client's bootp request is making it to the master, whether or not the master is sending a reply, and whether or not the master recognizes the client it's trying to reply to. 
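
The two bootptab checks above (a single entry per client, and sensible ip, sa and sm values) can be sketched as follows. `bootptab_count` and `bootptab_field` are hypothetical helpers that read the file on standard input; they assume the colon-separated `tag=value` layout shown in the entry above.

```shell
# bootptab_count CLIENT: print how many entries exist for CLIENT.
# (Hypothetical helper; reads bootptab text on stdin.)
bootptab_count() {
    awk -F: -v c="$1" '$1 == c { n++ } END { print n + 0 }'
}

# bootptab_field CLIENT TAG: print the value of TAG (e.g. ip, sa, sm, bf)
# from every entry belonging to CLIENT.
bootptab_field() {
    awk -F: -v c="$1" -v t="$2" '
        $1 == c {
            for (i = 2; i <= NF; i++)
                if (index($i, t "=") == 1)
                    print substr($i, length(t) + 2)
        }'
}
```

For example, `bootptab_count kintaro < /etc/bootptab` should print 1, and `bootptab_field kintaro sa < /etc/bootptab` should print the master's address.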


  • Putting bootp into debug

This process is most helpful in a NIM environment where the master and client are on different networks. Network admins often block bootp requests at the routers. Running this test will show whether that is a likely cause. 

Putting bootp into debug : 
The following commands will be executed on the NIM master. 

# nim -Fo reset client_name
# nim -o deallocate -a subclass=all client_name
# stopsrc -t bootps
# ps -ef |grep bootp 
Make sure there are no processes still running. If so, kill them off (as long as they don't have a parent id of 1).
# vi /etc/inetd.conf 
Comment out the bootps line with a # . 
Now that bootp has stopped, bring up another window on your master. Running bootp in debug mode ties up the window it runs in. 

# bootpd -d -d -d -d -s

This will display any bootp requests coming into the master. The master may also detect other bootp requests going across the network; those can be ignored. Go ahead and set up your installation operation again from smit. 

# smitty nim_bosinst

Once that is done, initiate the bootp request from the client side. If you do not see a request arriving from the client, and you're certain the IP addressing is correct, then your router or firewall is most likely the source of your bootp failure. Contact your network admins to have them resolve that problem. 
If you do see a bootp request coming in but the client still does not receive a reply, contact the support center for further diagnostics. 
Once you are finished testing bootp, press ctrl-c to exit the bootp debug session, make sure no bootpd process is still running, remove the comment from the bootps line in /etc/inetd.conf, and refresh inetd. 
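
The bootps line in /etc/inetd.conf is commented out before the debug session and uncommented afterwards, and both edits can be done without vi. This is a minimal sketch: `comment_bootps` and `uncomment_bootps` are hypothetical filters that read the file on standard input and write the changed text to standard output. Writing the result back to /etc/inetd.conf (after taking a backup) is left to you.

```shell
# comment_bootps: prefix an active bootps line with '#'.
# (Hypothetical filter; reads stdin, writes stdout, edits nothing in place.)
comment_bootps() {
    sed 's/^bootps[[:space:]]/#&/'
}

# uncomment_bootps: remove the leading '#' from a commented bootps line.
uncomment_bootps() {
    sed 's/^#[[:space:]]*\(bootps[[:space:]]\)/\1/'
}
```

For example, `uncomment_bootps < /etc/inetd.conf > /tmp/inetd.conf.new` produces a candidate file that can be reviewed before it is copied back and followed by `refresh -s inetd`.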
