Vic/Solaris 10
From Summerseas
Solaris 10 Notes
Release History
- Solaris 10 01/06 (Update 1)
- Solaris 10 06/06 (Update 2) (ZFS Included with this release.)
- Solaris 10 11/06 (Update 3)
- Solaris 10 08/07 (Update 4)
- Solaris 10 05/08 (Update 5)
- Solaris 10 10/08 (Update 6)
- Solaris 10 05/09 (Update 7)
MPxIO Configuration
- First of all, MPxIO must be enabled. On SPARC machines this is simple:
- stmsboot -e
- On x86 it is simpler still: MPxIO is enabled by default.
- Depending on your storage you may need to either turn on ALUA at the target side or add an entry to /kernel/drv/scsi_vhci.conf
- Here are a couple of sample entries for configuring symmetrically accessed storage; one for Solaris 10 and another for the latest iteration of OpenSolaris.
- This is for Solaris 10
device-type-scsi-options-list = "VENDOR PRODUCT", "symmetric-option";
symmetric-option = 0x1000000;
- This is for OpenSolaris.
scsi-vhci-failover-override = "VENDOR PRODUCT", "f_sym";
Fibre Channel Drivers
Leadville
- Solaris provides a native fibre channel stack along with a multipathing layer for Solaris 10 and some prior releases. The components are included with Solaris 10 and available via packages for the supported prior releases. The fibre channel stack is referred to as the Leadville stack and includes the Emulex (emlxs) and QLogic (qlc) HBA drivers along with the transport and protocol modules fcp, fctl, and fp. The Leadville stack features dynamic discovery, which eliminates the need for persistent bindings and reboots when adding new LUNs to the Solaris host.
Fibre Channel Troubleshooting Commands
- These commands might be useful if you're trying to resolve discovery issues.
- echo "::fcptrace" | mdb -k >/tmp/fcptrace
- echo "::fptrace" | mdb -k >/tmp/fptrace
Testing With a Running Kernel
- With Solaris 10 the ability to tune sd_retry_count was removed, but I believe there is work in progress to add it back in the form of an sd-config-list entry. If you're not familiar with sd-config-list, it is a pair of entries in either sd.conf or ssd.conf. The sd-config-list entry identifies a specific type of LUN by VID/PID and names a second entry that contains the tunables for that VID/PID combination. It is somewhat complex in that the second entry has a hex bit mask identifying which tunables you are setting. Each tunable is represented by its position in the bit map. For example, max_throttle might be bit 0, so if you provide max_throttle you must set that bit in the mask. This may be a very efficient way to convey information to the sd/ssd kernel module, but I have to wonder how something so complex ever got through whatever code and usability review process exists at Sun.
- Anyway, if you absolutely have to change sd_retry_count in Solaris 10, this procedure might be helpful. I asked this question on the OpenSolaris storage::discuss forum and Larry Liu kindly provided assistance. Thanks Larry!
- 1. The host must be booted with the kernel debugger. For sparc "boot kadb".
- Note - if the host isn't booted with the debugger this procedure will cause the host to crash.
- 2. Use mdb to modify the sd_lun data structure.
- Example from my sparc system:
[root@sol10host7a--->]mdb -K
kmdb: target stopped at:
kaif_enter: ta %icc, %g0 + 0x7d
[0]> *ssd_state::walk softstate
30003150080
3000320a580
3000320b000
30003145500
[0]> 30003150080::print -at struct sd_lun un_retry_count
30003150104 uint_t un_retry_count = 0x3
[0]> 30003150104/W 9
0x30003150104: 0x3 = 0x9
[0]> 30003150080::print -at struct sd_lun un_retry_count
30003150104 uint_t un_retry_count = 0x9
[0]> :c
[root@sol10host7a--->]
- Check the results...
- echo "*ssd_state::walk softstate |::print -t struct sd_lun un_retry_count" | mdb -k |more
- Note - See /usr/include/sys/scsi/targets/sddef.h for more info on the sd_lun struct.
- Update, Perl script and no kadb requirement
- With a little trial and error I've automated and streamlined the tuning via a perl script. I really love perl...
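The positional bit mask idea behind sd-config-list, described at the start of this section, can be sketched with a little shell arithmetic. This is only an illustration: the function name bits_to_mask is made up, and the bit positions used are hypothetical (the notes only say max_throttle might be bit 0). It assumes a POSIX-style shell such as ksh or bash rather than the legacy /bin/sh.

```shell
# Build a hex bit mask from a list of bit positions.
# The positions passed in are placeholders, not the real sd/ssd
# assignments; check sddef.h or the driver docs for actual values.
bits_to_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
}

# Hypothetical example: tunables at positions 0 and 3 are supplied.
bits_to_mask 0 3    # prints 0x9
```

The point is simply that every tunable you supply needs its own bit set, so the mask and the value list have to be kept in step by hand.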
Emulex Patch
- If you are using an Emulex HBA with the Leadville emlxs driver on Solaris 10 Update 3 or a prior release, you will need to install the latest kernel patch and Leadville driver patch. Otherwise you may see PLOGI errors in /var/adm/messages during path failure/recovery, which may lead to an outage.
- A quick check of the fcp driver version should tell you whether you are vulnerable.
modinfo |grep fcp
23 127e588 19040 292 1 fcp (SunFC FCP v20070116-1.118)
- Look for a driver with version date 20070116 or newer. If you have an older driver you will need to install the latest kernel patch and patch 119130-33. (Note - the x86 patch will have a different patch number, and this patch may be superseded or obsoleted in the future.)
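The version check above could be scripted roughly as follows. This is a sketch, not a supported tool: the function names are invented, and it assumes the modinfo line always carries the date in the "FCP vYYYYMMDD-" form shown in the sample output.

```shell
# Extract the 8-digit build date from a modinfo line such as:
#   23 127e588 19040 292 1 fcp (SunFC FCP v20070116-1.118)
fcp_version_date() {
    printf '%s\n' "$1" | sed -n 's/.*FCP v\([0-9]\{8\}\)-.*/\1/p'
}

# Succeed (exit 0) when the date is 20070116 or newer.
fcp_is_patched() {
    drv_date=$(fcp_version_date "$1")
    [ -n "$drv_date" ] && [ "$drv_date" -ge 20070116 ]
}

# Demo against the sample line from these notes.
line='23 127e588 19040 292 1 fcp (SunFC FCP v20070116-1.118)'
if fcp_is_patched "$line"; then
    echo "fcp driver date $(fcp_version_date "$line"): OK"
else
    echo "fcp driver is older than 20070116 or unrecognised: patch needed"
fi
```

On a live host the argument would presumably come from something like "$(modinfo | grep fcp)"; a line with no recognisable date is treated as unpatched.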
Emulex LPFC
- Emulex provides a fibre channel driver for their LightPulse HBAs called lpfc. The lpfc driver package includes the HBA driver along with transport and protocol functionality. When installed on Solaris this driver works alongside the Leadville stack but does not interact with it. On Solaris 10 systems the emlxdrv utility must be used to unbind the Leadville driver before the lpfc driver can be used. The utility is available from Emulex. In addition to the lpfc package, a management application called HBAnyware is available from Emulex, which includes both GUI and command line tools for managing Emulex HBAs.
Solaris 10 Services / SMF
- The SMF Repository is a more organized service model compared with the legacy run control script system used prior to Solaris 10. Services are defined by XML files and plug into the SMF framework in a standard way. The definition files are in standard places as are the executable files and logs.
- OpenSolaris SMF FAQ
- Primary commands
- svcs - list services
- svcadm - manage services
- svccfg - configure services
- Common command usage
- List all services "svcs -a"
- List failed services "svcs -xv"
- Start a service "svcadm enable apache2"
- Stop a service "svcadm disable apache2"
- List service properties "svccfg -s apache2 listprop"
- List specific property "svccfg -s apache2 listprop start/exec"
- Change a property
- svccfg -s apache2 setprop start/exec = astring: "/lib/svc/method/http-apache2_custom start"
- Once the services are configured as desired the following command will create a generic.xml file which can be used to configure other systems following jumpstart or zone creation.
- svccfg extract > generic.xml
- Now move generic.xml to /var/svc/profile/generic.xml on the target system or zone prior to first boot. This would typically be done by a jumpstart finish script or a zone creation script, or even manually.
- Common directories
- /var/svc/log
- /lib/svc
Sun x86 Hardware Notes FAQ
How do I assign a static IP for my x2200 Net MGT port?
- set /SP/AgentInfo DhcpConfigured=disable
- set /SP/AgentInfo NetMask=255.255.255.0
- set /SP/AgentInfo Gateway=129.144.82.254
- set /SP/AgentInfo IpAddress=129.144.82.26
Why don't I get any console output from my x2200 Net MGT port?
- See Sun doc 819-6601-14
- Or just follow these steps...
1. Edit the /boot/solaris/bootenv.rc file to read:
setprop console 'ttyb'
setprop ttyb-mode 115200,8,n,1,-
2. Edit the /boot/grub/menu.lst file to read:
kernel /platform/i86pc/multiboot -B console=ttyb
3. Edit the /kernel/drv/asy.conf file and add the following:
name="asy" parent="isa" reg=1, 0x2f8 interrupts=3;
4. Edit the /var/svc/manifest/system/console-login.xml file to read:
<propval name='label' type='astring' value='115200'/>
5. Reboot the system using the following command:
reboot -- -r
Can I power cycle an x86 system using the SP?
- yes
- x4200...Log on to the SP and enter "reset /SYS"
- x2200...Log on to the SP and cd /SP/SystemInfo/CtrlInfo then enter set PowerCtrl=reset
How do I connect to the SP?
- There are actually 3 ways.
- 1. Use a terminal server to access the Ser MGT port.
- 2. Use an SSH or Telnet client to access the Net MGT port
- 3. Use a web browser to access the Net MGT port.
- Option 3 is probably the best option if you have good bandwidth.
How do I log on to the SP once I'm connected?
- The default username/password is root/changeme. The root password should be changed and additional administrator-level users should be created during initial SP setup.
Ok, how do I set the root password?
- Log in to the SP with an administrator account and issue these commands...
- 1. cd /SP/users
- 2. set root password (You'll be prompted for the password. Supply an 8+ character password)
How do I create users?
- x4200...cd /SP/users or on x2200...cd /SP/User
- Then enter "create username". You'll be prompted for a password.
- If desired then cd username and enter "set role=administrator"
How do I access the system console from the SP?
- x2200...start /SP/AgentInfo/console
- x4200...start /SP/console
How do I add a GRUB entry to boot in NON-SunCluster mode?
- Edit /boot/grub/menu.lst and add an entry like the following...
title Solaris 10 11/06 s10x_u3wos_10 X86 No Cluster
root (hd0,0,a)
kernel /platform/i86pc/multiboot -x
module /platform/i86pc/boot_archive
- Note - the -x switch can be added dynamically using the existing grub entry at boot. The menu item just adds a bit of convenience.
Solaris Zones
- Simple perl script to create several containers
- Example configuration
[root@sunx4200-shu02--->]zonecfg -z sunx4200-shu02-zone1
sunx4200-shu02-zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:sunx4200-shu02-zone1> create
zonecfg:sunx4200-shu02-zone1> set zonepath=/zones/sunx4200-shu02-zone1
zonecfg:sunx4200-shu02-zone1> set autoboot=true
zonecfg:sunx4200-shu02-zone1> add net
zonecfg:sunx4200-shu02-zone1:net> set address=10.60.181.229
zonecfg:sunx4200-shu02-zone1:net> set physical=e1000g1
zonecfg:sunx4200-shu02-zone1:net> end
zonecfg:sunx4200-shu02-zone1> verify
zonecfg:sunx4200-shu02-zone1> exit
[root@sunx4200-shu02--->]zoneadm -z sunx4200-shu02-zone1 install
- Commands
- zlogin -C zonename (Connect to the zone console)
- zlogin zonename (Login to a zone)
- zoneadm -z zonename boot|halt (Start/Stop a zone)
- zoneadm list -cv (List all the zones and their states)
- zonecfg -z sunx4200-shu02-zone2 -f /u/engle/zones/template (Configure a zone from a command file)
- Example Command Files
- Zone Creation
create
set zonepath=/zones/sunx4200-shu02-zone2
set autoboot=true
add net
set address=10.60.181.230
set physical=e1000g1
end
commit
- Adding a dataset
add dataset
set name=apache/docs
end
commit
- Now the zone is configured. Next, install it, boot it, and connect to the console.
- zoneadm -z sunx4200-shu02-zone2 install
- zoneadm -z zonename boot
- zlogin -C zonename
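For several zones, the "Zone Creation" command file can be generated rather than edited by hand. The following is a minimal sketch, assuming every zone follows the same template; make_zone_cmds, the zone numbers, the two IP addresses, and the output file names are hypothetical and only illustrate the pattern.

```shell
# Print a zonecfg command file for one zone, following the
# "Zone Creation" template. Arguments: zone name, IP address, NIC.
make_zone_cmds() {
    zone=$1
    ip=$2
    nic=$3
    cat <<EOF
create
set zonepath=/zones/$zone
set autoboot=true
add net
set address=$ip
set physical=$nic
end
commit
EOF
}

# Hypothetical batch: two more zones on made-up consecutive addresses.
n=3
for ip in 10.60.181.231 10.60.181.232; do
    cmdfile=${TMPDIR:-/tmp}/zone$n.cmd
    make_zone_cmds "sunx4200-shu02-zone$n" "$ip" e1000g1 > "$cmdfile"
    echo "wrote $cmdfile"
    # On the Solaris host, each file would then be fed to zonecfg:
    #   zonecfg -z "sunx4200-shu02-zone$n" -f "$cmdfile"
    n=$((n + 1))
done
```

The zonecfg call is left commented out so the sketch can be tried anywhere; install and boot then proceed as shown above.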
OS Notes
IP Change
- sys-unconfig or update these files...
- /etc/defaultrouter
- /etc/netmasks
- /etc/hosts
- /etc/hostname.interface (only if the host name changed or if this file incorrectly contained an IP)
- /etc/nodename
- /etc/inet/ipnodes
- /etc/dumpadm.conf (only if the host name changed. modify DUMPADM_SAVDIR and create the corresponding directory if it does not exist)
- Use "reboot" command to reboot after change. (init 6 might hang the box)
Multipath management and status commands
- mpathadm
- mpathadm list lu
- mpathadm show lu </dev/rdsk/...>
- luxadm
- This command was originally provided by Sun for management of Sun Enterprise Network Arrays, SENA, like the old photon fibre channel arrays. Primarily it was used in conjunction with disk replacements and hardware maintenance. Functionality has grown to include mpxio path management and leadville controlled HBA management. The command is deprecated in favor of more recent management commands like fcinfo and mpathadm.
- luxadm probe
- luxadm display [ /dev/rdsk/... | WWPN ]
pgrep, pfiles, pargs and more
- These Solaris commands provide very useful process-specific details. Some or all of the info provided by these commands could be derived via other means, so the "p" commands are likely not used as much as they should be by Solaris system administrators.
- pgrep - pgrep is useful for finding running processes matching a string. For example, the command "pgrep -u engle -x telnet" would return the PIDs of any processes named telnet with UID=engle. Include the "-n" switch to get only the newest telnet process or the "-o" switch to get only the oldest telnet process. The "-x" switch forces an exact match for the string "telnet". Without "-x", any process with "telnet" as part of its name would match. Note that only the PIDs are returned, so no awk processing is required if you're after only the PIDs.
- pfiles - Displays details about files that the process currently has open.
- pargs - This command will show you the environment and argument variables for a running process. The "-e" switch displays the ENV variables and the "-a" switch shows the arguments. The "-l" switch provides the command line form.
- pstop - Stop (suspend) a process
- prun - Resume a stopped process
- pkill - Signal (by default, kill) processes, using the same matching options as pgrep
Sun Hardware Notes
x4200 Console
- Connecting to the console and managing this system is a little odd.
- This procedure worked for me:
- 1. Attach the Ser. Mgmt port to a terminal server then connect to the port.
- 2. Apply power to the system and observe the ILOM Linux boot messages.
- 3. Log in to the ILOM. Default user/pass is root/changeme.
- 4. cd to /SP/console
- 5. Enter "start" and accept the prompt.
- 6. Use a ballpoint pen to press the power button on the front of the system to turn power on for the host. Note - if power was already on, turn it off and then back on.
- 7. Select ttya console from the grub menu.
ZFS
- ZFS was first introduced in Solaris 10 Update 2.
- Features include the following:
- Storage pool based volume management
- Transaction based filesystem
- Checksums on every block
- WAFL-like architecture
- Snapshots and clones
- NFS and iSCSI integration
ZFS Links
Jumpstart
- Most installs should be via jumpstart to ensure consistency.
- Jumpstart related documentation and HowTo's
Limiting access to a test environment via tcp wrappers
- To ensure long-running tests are not influenced by unwanted user logons, it may be a good idea to restrict access to a test environment from time to time. I have used the following procedure to restrict access to cluster nodes to only the subnet they belong to and my OpenSolaris workstation.
- TCP Wrapper Configuration
- 1. svccfg -s inetd setprop defaults/tcp_wrappers=true
- 2. svcadm restart inetd
- 3. /etc/hosts.deny (Deny everyone)
- ALL: ALL
- 4. /etc/hosts.allow (But allow the 240 net and my workstation.)
- ALL: 10.60.240.
- ALL: 10.61.17.198
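To make the effect of those two hosts.allow patterns concrete, here is a small shell model of the matching. It is only an approximation written for illustration (would_allow is a made-up helper, not part of tcp wrappers): a pattern ending in "." matches any address that begins with it, a full address matches only that host, and anything not matched in hosts.allow falls through to the "ALL: ALL" line in hosts.deny.

```shell
# Mimic the two hosts.allow patterns from the steps above:
#   "10.60.240."    trailing dot -> any address with that prefix
#   "10.61.17.198"  full address -> that host only
# Everything else is denied by "ALL: ALL" in hosts.deny.
would_allow() {
    case "$1" in
        10.60.240.*)   return 0 ;;
        10.61.17.198)  return 0 ;;
        *)             return 1 ;;
    esac
}

# The test addresses are examples chosen for this sketch.
for ip in 10.60.240.17 10.61.17.198 10.61.17.199; do
    if would_allow "$ip"; then
        echo "$ip allowed"
    else
        echo "$ip denied"
    fi
done
```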
Creating a system image with flarcreate
- A flar image is handy for installing a system via jumpstart. Typically the image will have additional packages installed and will be patched as needed so that minimal work is required to put the newly installed host into operation following the jumpstart.
- flarcreate -n image.flar -c -S -R / my_image_filename.flar
Checking current lun queue depth
- echo "*sd_state::walk softstate |::print -t struct sd_lun un_throttle" | mdb -k |more
- echo "*ssd_state::walk softstate |::print -t struct sd_lun un_throttle" | mdb -k |more
- Note - See /usr/include/sys/scsi/targets/sddef.h for more info.
Work-arounds
Volfs fails to start
- Patch 125075-01 is now available to address this problem. The patch is the correct way to resolve the issue; the workaround and explanation are for informational purposes only.
- Here is the work-around...
- In the service init script, /lib/svc/method/svc-volfs, change the following block of code in the "start" function
- if [ ! -f /dev/volctl ]; then
- devfsadm -i vol
- fi
- if [ ! -f /dev/volctl ]; then
- To this
- if [ ! -c /dev/volctl ]; then
- devfsadm -i vol
- fi
- if [ ! -c /dev/volctl ]; then
- The file, /dev/volctl, is a link to a character device, not a regular file, so the [ ! -f /dev/volctl ] test always returns true and devfsadm is executed every time.
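The difference between the two tests is easy to see on any system by using /dev/null as a stand-in, since it is also a character device (and, on Solaris, also reached through a link). The helper name looks_missing is made up for this sketch.

```shell
# Return 0 ("looks missing") when the given test flag fails on the path.
# $1 is the test flag (-f or -c), $2 is the path to check.
looks_missing() {
    if test "$1" "$2"; then return 1; else return 0; fi
}

# /dev/null stands in for /dev/volctl, which exists only where volfs does.
dev=/dev/null

if looks_missing -f "$dev"; then
    echo "-f: $dev is not a regular file, so devfsadm would run every time"
fi

if looks_missing -c "$dev"; then
    echo "-c: character device missing, devfsadm would run"
else
    echo "-c: $dev is a character device, devfsadm is skipped"
fi
```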
Tips and Other Useful Info
Create an ascii file from a man page
- The following example creates the zonecfg man page in ascii
- man zonecfg | col -x -b > zonecfg.txt
How to get a list of all net interfaces
dladm show-link
Getting a Trace from Solaris Crashdump Cores
- Run mdb against the cores. (where X is a number that you will see in the core names.)
- mdb -k unix.X vmcore.X
- Get back trace with the $C command within mdb
- > $C
Locating Files Using pkgchk
- pkgchk is pretty useful for finding files installed via pkgadd.
[root@sunt2000-shu01--->]pkgchk -l -P ldm |grep "Pathname:.*bin"
Pathname: /opt/SUNWldm/bin
Pathname: /opt/SUNWldm/bin/ldm
Pathname: /opt/SUNWldm/bin/ldmd
Pathname: /opt/SUNWldm/bin/ldmd_start
- Alternatively you may wish to simply grep the install/contents file...
[root@sunt2000-shu01--->]grep "bin/ldm" /var/sadm/install/contents
/opt/SUNWldm/bin/ldm f none 0755 root sys 294848 9777 1187889619 SUNWldm
/opt/SUNWldm/bin/ldmd f none 0755 root sys 6886040 6034 1187889619 SUNWldm
/opt/SUNWldm/bin/ldmd_start f none 0755 root sys 1046 14812 1187889619 SUNWldm
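If only the owning package is wanted, the contents-file lookup can be narrowed with awk. This is a sketch under one assumption taken from the sample lines: that the package name is the last field of the entry. Entries shared by several packages would need more care, and owning_pkg is a made-up helper name.

```shell
# Print the package that owns an exact path, given a contents-style file.
# $1 is the path to look up, $2 is the contents file to search.
owning_pkg() {
    awk '$1 == p { print $NF }' p="$1" "$2"
}

# Self-contained demo against a copy of two of the sample lines.
sample=${TMPDIR:-/tmp}/contents.sample.$$
cat > "$sample" <<'EOF'
/opt/SUNWldm/bin/ldm f none 0755 root sys 294848 9777 1187889619 SUNWldm
/opt/SUNWldm/bin/ldmd f none 0755 root sys 6886040 6034 1187889619 SUNWldm
EOF
owning_pkg /opt/SUNWldm/bin/ldmd "$sample"    # prints SUNWldm
rm -f "$sample"
```

Against a real system the second argument would be /var/sadm/install/contents. Unlike the grep above, the comparison is an exact match on the path rather than a substring match.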
Which Device did the OS Boot From?
- prtconf -vp | grep bootpath
Can't boot, even to single user...
- boot -m milestone=none
