Posted by: Master Will | October 18, 2014

How to change Charset in ORACLE

At server side:
You can change database character set by doing the following steps.

In this case the SID name is fujudb. (Check it in TNSNAMES.ORA.)

Log in to the Oracle database server with SYSDBA privilege:

SQLPLUS> connect sys/syspassword@fujudb as sysdba;

SQLPLUS> shutdown immediate;
SQLPLUS> startup mount;

SQLPLUS> alter system enable restricted session;
SQLPLUS> alter system set job_queue_processes=0;
SQLPLUS> alter system set aq_tm_processes=0;
SQLPLUS> alter database open;
SQLPLUS> alter database fujudb character set TH8TISASCII;
OR
SQLPLUS> connect sys/syspassword@fujudb as sysdba;
SQLPLUS> update props$ set value$ = 'TH8TISASCII' where name = 'NLS_CHARACTERSET';
To check the database charset:
SQLPLUS> SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_CHARACTERSET = TH8TISASCII
NLS_NCHAR_CHARACTERSET = AL16UTF16
At client side:
Edit Registry
Regedit > HKEY_LOCAL_MACHINE > Software > Oracle > Home0(KEY_OraDb10g_home1) > NLS_LANG
Change NLS_LANG => THAI_AMERICA.TH8TISASCII
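
To recap the server-side procedure in one place, here is a minimal sketch that runs the whole sequence in a single SQL*Plus session. It assumes ORACLE_SID=fujudb and local "/ as sysdba" OS authentication, and that the new charset is a superset of the old one (ALTER DATABASE CHARACTER SET refuses anything else). Test on a non-production instance first.

#!/bin/sh
# Sketch only: change the database character set in one SQL*Plus session.
# Assumes ORACLE_SID=fujudb and local "/ as sysdba" authentication.
export ORACLE_SID=fujudb
sqlplus -S / as sysdba <<'EOF'
shutdown immediate;
startup mount;
alter system enable restricted session;
alter system set job_queue_processes=0;
alter system set aq_tm_processes=0;
alter database open;
alter database character set TH8TISASCII;
shutdown immediate;
startup;
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
EOF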
Posted by: Master Will | September 15, 2014

How To Set Up vsftpd on CentOS 6

#yum install vsftpd
#vi /etc/vsftpd/vsftpd.conf
Change to:
anonymous_enable=NO
local_enable=YES
chroot_local_user=YES

Save and Exit
#service vsftpd restart
#chkconfig vsftpd on

Note: If iptables is enabled, load the FTP connection-tracking module:
#modprobe -i ip_conntrack_ftp

To load the module automatically at boot time:
#vi /etc/sysconfig/iptables-config 
IPTABLES_MODULES="nf_conntrack_ftp"
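
If your iptables policy drops new connections, you also need to open the FTP control port so clients can reach the server at all; the conntrack helper then admits the data connections. A minimal sketch, assuming the stock CentOS 6 iptables service and a default INPUT chain:

# Open port 21 and accept related/established traffic (FTP data channels).
iptables -I INPUT -p tcp --dport 21 -j ACCEPT
iptables -I INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
service iptables save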
 
 
Posted by: Master Will | September 11, 2014

Installing a VPS Node with Flashcache (SolusVM and OpenVZ)

The following guide will help you install a VPS Node with flashcache using SolusVM for OpenVZ.
We will be using the net install ISO for this guide. I will also use nano rather than vi; you can choose any text editor you wish. Everything in the guide is performed as root.

Recommended System Requirements:

E3 or better CPU
32GB of RAM or more
4x 1TB SATA drives on a RAID card with SATA 3 ports, in a RAID 10 array
1x 120GB SSD on a SATA 3 port on your motherboard

Configure your RAID card for a RAID 10 array. Please see your manufacturer's website for additional information on that.

Either burn or mount the CentOS 6 x86_64 netinstall ISO as you would normally to install the OS.

NOTE: Centos 6 Net install URL: http://mirror.centos.org/centos/6/os/x86_64/

This is the standard partitioning scheme I use when creating VPS nodes, and it works quite well.

Partitions:
swap  RAM + 2GB
/boot 200MB
/     20-30GB
/vz   rest of the disk

When installing CentOS I would recommend choosing the base install over minimal. It will save you time later when you need the little things.

Once you have the OS installed run the following.

yum update -y
reboot -n

Wait for the system to download any updates it can and restart. This is to make sure everything is up to date.

Change the SSH Port:

nano /etc/ssh/sshd_config
Uncomment #Port 22 and replace 22 with any port you wish. (Do not choose a port that SolusVM or any other service you install might need, as that will cause a conflict.)
Press CTRL + X and save the changes
service sshd restart
Your SSH connection will now drop and you will need to reconnect using the new port you specified in sshd_config.
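
Before closing the old session, it is worth confirming sshd really is listening on the new port and that iptables will let it through. A quick sketch, using 2222 as a stand-in for whatever port you picked:

netstat -tlnp | grep sshd                        # should show the new port
iptables -I INPUT -p tcp --dport 2222 -j ACCEPT  # 2222 is a placeholder
service iptables save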

Turning off SELinux

nano /etc/selinux/config
Change the SELINUX directive to disabled.
CTRL + X and save the changes.
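
The same change can be made non-interactively; setenforce 0 applies it right away so you do not have to wait for the reboot. A small sketch:

sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0   # effective immediately; the config file covers future boots
getenforce     # should print Permissive now, Disabled after a reboot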

Installing SolusVM

wget http://soluslabs.com/installers/solusvm/install
chmod 755 install
./install

Follow the steps to set up your first node as a master or an OpenVZ slave.
I recommend choosing the automatic location so you get the best possible download speed.
After this has completed, make note of the output. It contains important information such as your username/password, URLs and ports, and your SolusVM ID and password. Copy and paste it somewhere safe as you will need it later.

Once complete, reboot the server again.

reboot -n

Once logged back in, make sure the OpenVZ kernel is loaded:

uname -r

You should receive output like the following:

2.6.32-042stab075.10

Install Git and dependencies:

yum -y install git dkms gcc make yum-utils kernel-devel

Download and install your kernel devel:

cd /tmp
wget http://download.openvz.org/kernel/branches/rhel6-2.6.32/042stab075.10/vzkernel-devel-2.6.32-042stab075.10.x86_64.rpm (Note: Check the URL and replace the file information to match your uname -r output!)
yum -y localinstall vzkernel-devel-2.6.32-042stab075.10.x86_64.rpm

Git Clone Flashcache:

git clone https://github.com/facebook/flashcache.git

Install and Configure Flashcache (Note: Please remember to change the versions to match your uname -r output!)

cd flashcache/
make KERNEL_TREE=/usr/src/kernels/2.6.32-042stab075.10
make install KERNEL_TREE=/usr/src/kernels/2.6.32-042stab075.10
modprobe flashcache
Make sure it's running: dmesg | tail

Making /vz flashcached:
umount /vz
Find your UUID for /vz: grep "/vz" /etc/fstab
flashcache_create -p back vz_cached /dev/sdb /dev/disk/by-uuid/replace-with-your-uuid
Comment out vz in fstab: nano /etc/fstab
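
A worked example of that sequence, with a made-up device and UUID; take the real values from the fstab line you just grepped, or from blkid:

umount /vz
blkid /dev/sda4    # hypothetical /vz partition; prints its UUID
flashcache_create -p back vz_cached /dev/sdb \
    /dev/disk/by-uuid/3e6be9de-8139-11d1-9106-a43f08d823a6   # made-up UUID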

Configuring Flashcache

Copy the config file: cp /tmp/flashcache/utils/flashcache /etc/init.d
Change the permissions: chmod 755 /etc/init.d/flashcache
nano /etc/init.d/flashcache

Change the following information:

SSD_DISK=/dev/sdb
BACKEND_DISK=/dev/disk/by-uuid/replace-with-your-uuid
CACHEDEV_NAME=vz_cached
MOUNTPOINT=/vz
FLASHCACHE_NAME=vz_cached
CTRL + X and save it.

Turn on Flashcache at boot: chkconfig flashcache on
Reboot the Server: reboot -n

Make sure everything looks good and run df -h.

Do a DD test to make sure flashcache is working.

First do a DD test in the root directory:
cd /
dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync; unlink test

Note the output

Now do a DD test in the /vz directory.
cd /vz
dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync; unlink test

Note awesomeness.
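
Beyond dd, the device-mapper layer will report flashcache's own counters (reads, writes, cache hits). A quick way to peek, assuming the cache device is named vz_cached as above:

dmsetup status vz_cached   # hit/miss counters and dirty block stats
dmsetup table vz_cached    # geometry and mode of the cache mapping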

The base system with SolusVM, OpenVZ and Flashcache is now set up!

If you have any issues or questions please post below and I will address them as best as possible.

Thank you,

 

Credit: http://www.allvpsinfo.com/discussion/6/installing-a-vps-node-with-flashcache-solusvm-and-openvz/p1

Posted by: Master Will | August 25, 2014

Extend or Resize Hard Disk Size via LVM on CentOS

1. First, create a new partition.

[root@server ~]# fdisk /dev/sda

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n <-- ENTER
Command action
e extended
p primary partition (1-4)
p <-- ENTER
Partition number (1-4): 3 <-- ENTER
First cylinder (1-652, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-652, default 652):
Using default value 652

Command (m for help): p

Disk /dev/sdb: 5368 MB, 5368709120 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 652 5237158+ 83 Linux

Command (m for help): t <-- ENTER
Partition number (1-4): 3 <-- ENTER
Hex code (type L to list codes): 8e <-- ENTER
Changed system type of partition 3 to 8e (Linux LVM)

Command (m for help): w <-- ENTER
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

 

2. Reboot the server once for the new partition table to take effect.
[root@server ~]# shutdown -r now

 

3. Create a new physical volume on the new partition (pvcreate).

--------------- pvdisplay ---------------
[root@server ~]# pvdisplay
--- Physical volume ---
PV Name /dev/sda2
VG Name vg_server
PV Size 14.51 GiB / not usable 2.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 3714
Free PE 0
Allocated PE 3714
PV UUID mnFLbX-Zps0-alJR-SZQn-8Rqo-c1rr-2fTWXO
--------------- pvcreate ---------------
[root@server ~]# pvcreate /dev/sda3
Physical volume "/dev/sda3" successfully created

[root@server ~]# pvdisplay
--- Physical volume ---
PV Name /dev/sda2
VG Name vg_server
PV Size 14.51 GiB / not usable 2.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 3714
Free PE 0
Allocated PE 3714
PV UUID mnFLbX-Zps0-alJR-SZQn-8Rqo-c1rr-2fTWXO

"/dev/sda3" is a new physical volume of "24.99 GiB"
--- NEW Physical volume ---
PV Name /dev/sda3
VG Name
PV Size 24.99 GiB
Allocatable NO
PE Size 0
Total PE 0
Free PE 0
Allocated PE 0
PV UUID P0sEeq-zgI6-cGHI-3wwA-ysDZ-wy4W-VuFouO

 

 

4. Extend the volume group with the new PV (vgextend).
--------------- vgdisplay ---------------
[root@server ~]# vgdisplay
--- Volume group ---
VG Name vg_server
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 6
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 3
Max PV 0
Cur PV 1
Act PV 1
VG Size 14.51 GiB
PE Size 4.00 MiB
Total PE 3714
Alloc PE / Size 3714 / 14.51 GiB
Free PE / Size 0 / 0
VG UUID cnv1x2-NIWJ-fsIf-HMPs-WVld-YXJ0-jcKd3V

--------------- vgextend ---------------
[root@server ~]# vgextend /dev/vg_server /dev/sda3
Volume group "vg_server" successfully extended

[root@server ~]# vgdisplay
--- Volume group ---
VG Name vg_server
System ID
Format lvm2
Metadata Areas 2
Metadata Sequence No 7
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 3
Open LV 3
Max PV 0
Cur PV 2
Act PV 2
VG Size 39.50 GiB
PE Size 4.00 MiB
Total PE 10112
Alloc PE / Size 3714 / 14.51 GiB
Free PE / Size 6398 / 24.99 GiB
VG UUID cnv1x2-NIWJ-fsIf-HMPs-WVld-YXJ0-jcKd3V

 

5. Extend the logical volume with the lvextend command.

--------------- lvextend ---------------
[root@server ~]# lvextend -l +100%FREE /dev/vg_server/lv_home
Extending logical volume lv_home to 30.70 GiB
Logical volume lv_home successfully resized

 

6. Finally, resize the file system to use the newly allocated space (resize2fs).

--------------- resize2fs ---------------
[root@server ~]# resize2fs /dev/vg_server/lv_home
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/vg_server/lv_home is mounted on /home; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 2
Performing an on-line resize of /dev/vg_server/lv_home to 8046592 (4k) blocks.
The filesystem on /dev/vg_server/lv_home is now 8046592 blocks long.
--------------- reboot once to take effect ---------------
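
For reference, once the new partition exists and the server has been rebooted, the whole procedure boils down to four commands. A condensed sketch using the same names as above (/dev/sda3, vg_server, lv_home):

pvcreate /dev/sda3
vgextend vg_server /dev/sda3
lvextend -l +100%FREE /dev/vg_server/lv_home
resize2fs /dev/vg_server/lv_home
df -h /home    # confirm the new size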

Usually, an svn cleanup fixes most issues with TortoiseSVN. However, I ran into an issue that caused me some grief.
The specific error I was seeing:
Previous operation has not finished; run 'cleanup' if it was interrupted

Solution:

Somehow, svn is stuck on the previous operation. We need to remove this operation from its 'work queue'.
The data is stored in the wc.db SQLite database in the offending folder.

1. Install SQLite (the 32-bit binary for Windows) from http://www.sqlite.org/download.html

2. sqlite3 .svn/wc.db "select * from work_queue"

The SELECT should show you your offending folder/file as part of the work queue. What you need to do is delete this item from the work queue.

3. sqlite3 .svn/wc.db "delete from work_queue"

That's it. Now you can run cleanup again and it should work. Or you can proceed directly to the task you were doing before being prompted to run cleanup (adding a new file, etc.).

Also, svn.exe (a command line tool) is part of the TortoiseSVN installer, but it is unchecked by default for some reason. Just run the installer again, choose 'Modify' and select the 'command line tools'.
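
Put together, the fix is three commands. A cautious sketch; the backup copy is my own habit rather than part of the procedure:

cp .svn/wc.db .svn/wc.db.bak                      # optional safety copy
sqlite3 .svn/wc.db "select * from work_queue;"    # inspect the stuck items
sqlite3 .svn/wc.db "delete from work_queue;"      # clear them
svn cleanup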

Posted by: Master Will | March 5, 2014

Disabling the RLimitMEM directive in httpd.conf

This restriction is set by the tool in WHM >> Main >> Service Configuration >> Apache Configuration >> Memory Usage Restrictions.

To remove it:

cd /var/cpanel/templates/apache2
vi main.default

Comment out these lines:

#RLimitMEM [% main.rlimitmem.item.softrlimitmem %] [% main.rlimitmem.item.maxrlimitmem %]
#RLimitCPU [% main.rlimitcpu.item.softrlimitcpu %] [% main.rlimitcpu.item.maxrlimitcpu %]

/scripts/rebuildhttpdconf
/etc/init.d/httpd restart
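
After the rebuild you can confirm the directives are really gone. A quick check, assuming the standard cPanel Apache config path:

grep -iE 'RLimitMEM|RLimitCPU' /usr/local/apache/conf/httpd.conf || echo "directives removed"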
Posted by: Master Will | February 5, 2014

reinstall vzctl

vzctl --version
cat /etc/issue
yum remove vzctl*
cd /tmp

yum -y install parted
wget ftp://ftp.icm.edu.pl/vol/rzm1/linux-openvz/utils/ploop/1.4/ploop-lib-1.4-1.x86_64.rpm
wget ftp://ftp.icm.edu.pl/vol/rzm1/linux-openvz/utils/ploop/1.4/ploop-1.4-1.x86_64.rpm
rpm -ivh ploop*

wget http://download.openvz.org/utils/vzctl/3.3/vzctl-3.3-1.x86_64.rpm
wget http://download.openvz.org/utils/vzctl/3.3/vzctl-lib-3.3-1.x86_64.rpm
rpm -ivh vzctl-3.3-1.x86_64.rpm vzctl-lib-3.3-1.x86_64.rpm

wget http://files.soluslabs.com/solusvm/installer/v2/ve-vswap-solus.conf-sample -O /etc/vz/conf/ve-vswap-solus.conf-sample
vi /etc/vz/vz.conf

Change the line to:

CONFIGFILE="vswap-solus"

/etc/init.d/vz restart
vzctl --version
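
A short sanity check after the restart; vzlist -a should list your containers again without errors:

vzctl --version    # confirm the reinstalled version is active
vzlist -a          # all containers and their states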

Posted by: Master Will | January 20, 2014

How to Rebuild LSI RAID

The LSI card does not automatically rebuild the mirror onto the newly replaced drive. The drive is put into the UNCONFIGURED BAD state and requires manual intervention to initiate a rebuild. In this setup there is no way to initiate an array rebuild from the running OS (or do any array maintenance, for that matter), so a reboot into the LSI BIOS is necessary.

[Screenshot: megaraid-01-drive-replaced]

Even though the drive has been physically replaced, the BIOS shows that there is a "PD Missing" on backplane 252, slot 1. By switching to the physical view and selecting the drive that's shown as "Unconfigured Bad", the drive can be changed to "Unconfigured Good" by marking the radio button and clicking Go.

[Screenshot: megaraid-02-go-to-physical-view]

[Screenshot: megaraid-03-make-drive-unconfigured-good]

Now that the drive is in a "good" state, it can be added into the array by marking the radio button beside Replace Missing PD and hitting Go.

[Screenshot: megaraid-04-replace-missing-physical-disk-pd2]

After that, choose to Rebuild Drive and away you go.
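
On a host where you can run MegaCLI from the OS (the BIOS route above is for setups where you cannot), the same recovery is scriptable; see the "replace" method in the lsi.sh script in the following post. A hedged sketch for backplane 252, slot 1; the -Array0 -row1 values are examples and must match your own array layout:

MegaCli64 -PDMakeGood -PhysDrv[252:1] -a0                       # Unconfigured(bad) -> good
MegaCli64 -CfgForeign -Clear -a0                                # drop stale raid headers
MegaCli64 -PdReplaceMissing -PhysDrv[252:1] -Array0 -row1 -a0   # array/row are examples
MegaCli64 -PDRbld -Start -PhysDrv[252:1] -a0                    # kick off the rebuild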

 

Credit: http://www.packetmischief.ca/2011/03/31/monitoring-direct-attached-storage-under-esxi/

Posted by: Master Will | January 20, 2014

MegaCLI Scripts and Commands

MegaCLI is the command line interface (CLI) binary used to communicate with the full LSI family of RAID controllers found in Supermicro, DELL (PERC), ESXi and Intel servers. The program is a text-based command line interface and consists of a single static binary file. We are not fans of graphical interfaces (GUIs) and appreciate the control a command line program gives over a GUI solution. Using some simple shell scripting we can find out the health of the RAID, email ourselves about problems and work with failed drives.

There are many MegaCLI command pages which simply rehash the same commands over and over and we wanted to offer something more. For our examples we are using Ubuntu Linux and FreeBSD with the MegaCli64 binary. All of these same scripts and commands work for the 32bit and 64bit binaries.

Installing the MegaCLI binary

In order to communicate with the LSI card you will need the MegaCLI or MegaCLI64 (64-bit) program. The install should be quite easy, but LSI makes us jump through a few hoops. This is what we found:

  • Go to the LSI Downloads page: LSI Downloads
  • Search by the keyword "megacli"
  • Click on “Management Software and Tools”
  • Download the MegaCLI zip file. You will see the same file is for DOS, Windows, Linux and FreeBSD.
  • Unzip the file
  • In the Linux directory there is an RPM. If you are using Red Hat you can install it directly. For Ubuntu, go to the next step.
  • For Ubuntu run "rpm2cpio MegaCli-*.rpm | cpio -idmv" to expand the directory structure. You may need to "apt-get install rpm2cpio".
  • For FreeBSD unzip the file in the FreeBSD directory.

On our Ubuntu Linux 64-bit and FreeBSD 64-bit servers we simply copied MegaCli64 (64-bit) to /usr/local/sbin/. You can put the binary anywhere you want, but we chose /usr/local/sbin/ because it is in root's path. Make sure to secure the binary: make the owner root and chmod the binary to 700 (chown root /usr/local/sbin/MegaCli64; chmod 700 /usr/local/sbin/MegaCli64). The install is now done. We would like to see LSI make an Ubuntu PPA or FreeBSD ports entry sometime in the future, but this setup was not too bad.
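
Condensed, the Ubuntu install looks roughly like this; the path inside the RPM payload is what we saw, but it may vary between MegaCLI releases:

rpm2cpio MegaCli-*.rpm | cpio -idmv
cp opt/MegaRAID/MegaCli/MegaCli64 /usr/local/sbin/   # payload path may differ
chown root /usr/local/sbin/MegaCli64
chmod 700 /usr/local/sbin/MegaCli64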

The lsi.sh MegaCLI interface script

Once you have MegaCLI installed, the following is a script to help in getting information from the raid card. The shell script does nothing more than execute the commands you normally use on the CLI. The script can show the status of the raid and drives. You can identify any drive slot by using the blinking light on the chassis. The script can help you identify drives which are starting to error out or slow down the raid so you can replace drives early. We have also included a "setdefaults" method to set up a new raid card to the specs we use for our 400+ raids. Finally, use the "checkNemail" method to check the raid status and mail you a list of drives showing which one is reporting the problem.

You are welcome to copy and paste the following script. We call the script "lsi.sh", but you can use any name you wish. Just make sure to set the full path to the MegaCli binary in the script and to make the script executable. We tried to comment every method, so take a look at the script before using it.

 

#!/bin/bash
#
# Calomel.org 
#     https://calomel.org/megacli_lsi_commands.html
#     LSI MegaRaid CLI 
#     lsi.sh @ Version 0.05
#
# description: MegaCLI script to configure and monitor LSI raid cards.

# Full path to the MegaRaid CLI binary
MegaCli="/usr/local/sbin/MegaCli64"

# The identifying number of the enclosure. Default for our systems is "8". Use
# "MegaCli64 -PDlist -a0 | grep "Enclosure Device"" to see what your number
# is and set this variable.
ENCLOSURE="8"

if [ $# -eq 0 ]
   then
    echo ""
    echo "            OBPG  .:.  lsi.sh $arg1 $arg2"
    echo "-----------------------------------------------------"
    echo "status        = Status of Virtual drives (volumes)"
    echo "drives        = Status of hard drives"
    echo "ident \$slot   = Blink light on drive (need slot number)"
    echo "good \$slot    = Simply makes the slot \"Unconfigured(good)\" (need slot number)"
    echo "replace \$slot = Replace \"Unconfigured(bad)\" drive (need slot number)"
    echo "progress      = Status of drive rebuild"
    echo "errors        = Show drive errors which are non-zero"
    echo "bat           = Battery health and capacity"
    echo "batrelearn    = Force BBU re-learn cycle"
    echo "logs          = Print card logs"
    echo "checkNemail   = Check volume(s) and send email on raid errors"
    echo "allinfo       = Print out all settings and information about the card"
    echo "settime       = Set the raid card's time to the current system time"
    echo "setdefaults   = Set preferred default settings for new raid setup"
    echo ""
   exit
 fi

# General status of all RAID virtual disks or volumes and if PATROL disk check
# is running.
if [ $1 = "status" ]
   then
      $MegaCli -LDInfo -Lall -aALL -NoLog
      echo "###############################################"
      $MegaCli -AdpPR -Info -aALL -NoLog
      echo "###############################################"
      $MegaCli -LDCC -ShowProg -LALL -aALL -NoLog
   exit
fi

# Shows the state of all drives and if they are online, unconfigured or missing.
if [ $1 = "drives" ]
   then
      $MegaCli -PDlist -aALL -NoLog | egrep 'Slot|state' | awk '/Slot/{if (x)print x;x="";}{x=(!x)?$0:x" -"$0;}END{print x;}' | sed 's/Firmware state://g'
   exit
fi

# Use to blink the light on the slot in question. Hit enter again to turn the blinking light off.
if [ $1 = "ident" ]
   then
      $MegaCli  -PdLocate -start -physdrv[$ENCLOSURE:$2] -a0 -NoLog
      logger "`hostname` - identifying enclosure $ENCLOSURE, drive $2 "
      read -p "Press [Enter] key to turn off light..."
      $MegaCli  -PdLocate -stop -physdrv[$ENCLOSURE:$2] -a0 -NoLog
   exit
fi

# When a new drive is inserted it might have old RAID headers on it. This
# method simply removes old RAID configs from the drive in the slot and make
# the drive "good." Basically, Unconfigured(bad) to Unconfigured(good). We use
# this method on our FreeBSD ZFS machines before the drive is added back into
# the zfs pool.
if [ $1 = "good" ]
   then
      # set Unconfigured(bad) to Unconfigured(good)
      $MegaCli -PDMakeGood -PhysDrv[$ENCLOSURE:$2] -a0 -NoLog
      # clear 'Foreign' flag or invalid raid header on replacement drive
      $MegaCli -CfgForeign -Clear -aALL -NoLog
   exit
fi

# Use to diagnose bad drives. When no errors are shown only the slot numbers
# will print out. If a drive(s) has an error you will see the number of errors
# under the slot number. At this point you can decided to replace the flaky
# drive. Bad drives might not fail right away and will slow down your raid with
# read/write retries or corrupt data. 
if [ $1 = "errors" ]
   then
      echo "Slot Number: 0"; $MegaCli -PDlist -aALL -NoLog | egrep -i 'error|fail|slot' | egrep -v '0'
   exit
fi

# status of the battery and the amount of charge. Without a working Battery
# Backup Unit (BBU) most of the LSI read/write caching will be disabled
# automatically. You want caching for speed so make sure the battery is ok.
if [ $1 = "bat" ]
   then
      $MegaCli -AdpBbuCmd -aAll -NoLog
   exit
fi

# Force a Battery Backup Unit (BBU) re-learn cycle. This will discharge the
# lithium BBU unit and recharge it. This check might take a few hours and you
# will want to always run this in off hours. LSI suggests a battery relearn
# monthly or so. We actually run it every three(3) months by way of a cron job.
# Understand if your "Current Cache Policy" is set to "No Write Cache if Bad
# BBU" then write-cache will be disabled during this check. This means writes
# to the raid will be VERY slow at about 1/10th normal speed. NOTE: if the
# battery is new (new bats should charge for a few hours before they register)
# or if the BBU comes up and says it has no charge try powering off the machine
# and restart it. This will force the LSI card to re-evaluate the BBU. Silly
# but it works.
if [ $1 = "batrelearn" ]
   then
      $MegaCli -AdpBbuCmd -BbuLearn -aALL -NoLog
   exit
fi

# Use to replace a drive. You need the slot number and may want to use the
# "drives" method to show which drive in a slot is "Unconfigured(bad)". Once
# the new drive is in the slot and spun up this method will bring the drive
# online, clear any foreign raid headers from the replacement drive and set the
# drive as a hot spare. We will also tell the card to start rebuilding if it
# does not start automatically. The raid should start rebuilding right away
# either way. NOTE: if you pass a slot number which is already part of the raid
# by mistake the LSI raid card is smart enough to just error out and _NOT_
# destroy the raid drive, thankfully.
if [ $1 = "replace" ]
   then
      logger "`hostname` - REPLACE enclosure $ENCLOSURE, drive $2 "
      # set Unconfigured(bad) to Unconfigured(good)
      $MegaCli -PDMakeGood -PhysDrv[$ENCLOSURE:$2] -a0 -NoLog
      # clear 'Foreign' flag or invalid raid header on replacement drive
      $MegaCli -CfgForeign -Clear -aALL -NoLog
      # set drive as hot spare
      $MegaCli -PDHSP -Set -PhysDrv [$ENCLOSURE:$2] -a0 -NoLog
      # show rebuild progress on replacement drive just to make sure it starts
      $MegaCli -PDRbld -ShowProg -PhysDrv [$ENCLOSURE:$2] -a0 -NoLog
   exit
fi

# Print all the logs from the LSI raid card. You can grep on the output.
if [ $1 = "logs" ]
   then
      $MegaCli -FwTermLog -Dsply -aALL -NoLog
   exit
fi

# Use to query the RAID card and find the drive which is rebuilding. The script
# will then query the rebuilding drive to see what percentage it is rebuilt and
# how much time it has taken so far. You can then guess-ti-mate the
# completion time.
if [ $1 = "progress" ]
   then
      DRIVE=`$MegaCli -PDlist -aALL -NoLog | egrep 'Slot|state' | awk '/Slot/{if (x)print x;x="";}{x=(!x)?$0:x" -"$0;}END{print x;}' | sed 's/Firmware state://g' | egrep build | awk '{print $3}'`
      $MegaCli -PDRbld -ShowProg -PhysDrv [$ENCLOSURE:$DRIVE] -a0 -NoLog
   exit
fi

# Use to check the status of the raid. If the raid is degraded or faulty the
# script will send email to the address in the $EMAIL variable. We normally add
# this method to a cron job to be run every few hours so we are notified of any
# issues.
if [ $1 = "checkNemail" ]
   then
      EMAIL="raidadmin@localhost"

      # Check if raid is in good condition
      STATUS=`$MegaCli -LDInfo -Lall -aALL -NoLog | egrep -i 'fail|degrad|error'`

      # On bad raid status send email with basic drive information
      if [ "$STATUS" ]; then
         $MegaCli -PDlist -aALL -NoLog | egrep 'Slot|state' | awk '/Slot/{if (x)print x;x="";}{x=(!x)?$0:x" -"$0;}END{print x;}' | sed 's/Firmware state://g' | mail -s `hostname`' - RAID Notification' $EMAIL
      fi
fi

# Use to print all information about the LSI raid card. Check default options,
# firmware version (FW Package Build), battery back-up unit presence, installed
# cache memory and the capabilities of the adapter. Pipe to grep to find the
# term you need.
if [ $1 = "allinfo" ]
   then
      $MegaCli -AdpAllInfo -aAll -NoLog
   exit
fi

# Update the LSI card's time with the current operating system time. You may
# want to setup a cron job to call this method once a day or whenever you
# think the raid card's time might drift too much. 
if [ $1 = "settime" ]
   then
      $MegaCli -AdpGetTime -aALL -NoLog
      $MegaCli -AdpSetTime `date +%Y%m%d` `date +%H:%M:%S` -aALL -NoLog
      $MegaCli -AdpGetTime -aALL -NoLog
   exit
fi

# These are the defaults we like to use on the hundreds of raids we manage. You
# will want to go through each option here and make sure you want to use them
# too. These options are for speed optimization, build rate tweaks and PATROL
# options. When setting up a new machine we simply execute the "setdefaults"
# method and the raid is configured. You can use this on live raids too.
if [ $1 = "setdefaults" ]
   then
      # Read Cache enabled specifies that all reads are buffered in cache memory. 
       $MegaCli -LDSetProp -Cached -LAll -aAll -NoLog
      # Adaptive Read-Ahead if the controller receives several requests to sequential sectors
       $MegaCli -LDSetProp ADRA -LALL -aALL -NoLog
      # Hard Disk cache policy enabled allowing the drive to use internal caching too
       $MegaCli -LDSetProp EnDskCache -LAll -aAll -NoLog
      # Write-Back cache enabled
       $MegaCli -LDSetProp WB -LALL -aALL -NoLog
      # Continue booting with data stuck in cache. Set Boot with Pinned Cache Enabled.
       $MegaCli -AdpSetProp -BootWithPinnedCache -1 -aALL -NoLog
      # PATROL run every 672 hours or monthly (RAID6 77TB @60% rebuild takes 21 hours)
       $MegaCli -AdpPR -SetDelay 672 -aALL -NoLog
      # Check Consistency every 672 hours or monthly
       $MegaCli -AdpCcSched -SetDelay 672 -aALL -NoLog
      # Enable autobuild when a new Unconfigured(good) drive is inserted or set to hot spare
       $MegaCli -AdpAutoRbld -Enbl -a0 -NoLog
      # RAID rebuild rate to 60% (build quick before another failure)
       $MegaCli -AdpSetProp \{RebuildRate -60\} -aALL -NoLog
      # RAID check consistency rate to 60% (fast parity checks)
       $MegaCli -AdpSetProp \{CCRate -60\} -aALL -NoLog
      # Enable Native Command Queue (NCQ) on all drives
       $MegaCli -AdpSetProp NCQEnbl -aAll -NoLog
      # Sound alarm disabled (server room is too loud anyways)
       $MegaCli -AdpSetProp AlarmDsbl -aALL -NoLog
      # Use write-back cache mode even if BBU is bad. Make sure your machine is on UPS too.
       $MegaCli -LDSetProp CachedBadBBU -LAll -aAll -NoLog
      # Disable auto learn BBU check which can severely affect raid speeds
       OUTBBU=$(mktemp /tmp/output.XXXXXXXXXX)
       echo "autoLearnMode=1" > $OUTBBU
       $MegaCli -AdpBbuCmd -SetBbuProperties -f $OUTBBU -a0 -NoLog
       rm -rf $OUTBBU
   exit
fi

### EOF ###

How do I use the lsi.sh script ?

First, execute the script without any arguments. The script will print out the "help" statement showing all of the available commands and a very short description of each function. Inside the script we have also put detailed comments.

For example, let's look at the status of the RAID volumes, or what LSI calls virtual drives. Run the script with the "status" argument. This will simply print the details of the raid drives and whether PATROL or Check Consistency is running. In our example we have two(2) RAID6 volumes of 18.1TB each. The first array is "Partially Degraded" and the second is "Optimal", which means it is healthy.

calomel@lsi:~# ./lsi.sh status

Why is the first volume degraded?

The first virtual disk lost a drive, which was already replaced and is now rebuilding. We can look at the status of all the drives using the lsi.sh script and the "drives" argument. You can see slot number 9 is the drive which is rebuilding.

calomel@lsi:~# ./lsi.sh drives

When will the rebuild be finished?

The card will only tell us how far along the rebuild is and how long the process has been running. Using the "progress" script argument we see the rebuild is 32% done and has taken 169 minutes so far. Since the rebuild is close enough to 33% done, we simply multiply the time taken (169 minutes) by 3 to estimate a total time of 507 minutes, or 8.45 hours, if the load on the raid stays the same through completion.

calomel@lsi:~# ./lsi.sh progress

How does the lsi.sh script check errors and send out email?

The "checkNemail" argument will check the status of the volumes, also called virtual drives, and if the string degraded or error is found it will send out email. Make sure to set the $EMAIL variable in the script to your email address. The output of the email shows slot 9 rebuilding. The first virtual drive in this example contains slots 0 through 11. If the physical drive were bad, on the other hand, we would see slot 9 as Unconfigured(bad), Unconfigured(good) or even Missing.

We prefer to run the script with "checkNemail" in a cron job. This way when the raid has an issue we get notified. The following cron job will run the script every two(2) hours. As long as the raid is degraded you will get email. We see this function as a reminder to check on the raid if it is not finished rebuilding by morning.

SHELL=/bin/bash

PATH=/bin:/sbin:/usr/bin:/usr/sbin
#
#minute (0-59)
#|   hour (0-23)
#|   |    day of the month (1-31)
#|   |    |   month of the year (1-12 or Jan-Dec)
#|   |    |   |   day of the week (0-6 with 0=Sun or Sun-Sat)
#|   |    |   |   |   commands
#|   |    |   |   |   |
# raid status, check and report 
00   */2  *   *   *   /root/lsi.sh checkNemail

What do the "Cache Policy" values mean?

Cache policies are how the raid card uses its on-board RAM to collect data before writing out to disk, or to read data before the system asks for it. Write cache is used when we have a lot of data to write and it is faster to write data sequentially to disk instead of writing small chunks. Read cache is used when the system has asked for some data and the raid card keeps the data in cache in case the system asks for the same data again. It is always faster to read and write to cache than to access spinning disks. Understand that you should only use caching if you have good UPS power to the system. If the system loses power and does not flush the cache, it is possible to lose data. No one wants that. Let's look at each cache policy LSI raid cards use.

  • WriteBack uses the card's cache to collect enough data to make a series of long sequential writes out to disk. This is the fastest write method.
  • WriteThrough tells the card to write all data directly to disk without cache. This method is quite slow, about 1/10 the speed of WriteBack, but is safer as no data can be lost from cache when the machine's power fails.
  • ReadAdaptive uses an algorithm: when the OS asks for a run of sequential data blocks, the card reads a few more sequential blocks because the OS _might_ ask for those too. This method can lead to good speed increases.
  • ReadAheadNone tells the raid card to only read the data off the raid disk if it was actually asked for. No more, no less.
  • Cached allows the general use of the card's cache for any data which is read or written. Very efficient if the same data is accessed over and over again.
  • Direct is straight access to the disk without ever storing data in the cache. This can be slow as any I/O has to touch the disk platters.
  • Write Cache OK if Bad BBU tells the card to use write caching even if the Battery Backup Unit (BBU) is bad, disabled or missing. This is a good setting if your raid card's BBU charger is bad, if you do not want to or cannot replace the BBU, or if you do not want WriteThrough enabled during a BBU relearn test.
  • No Write Cache if Bad BBU: if the BBU is not available for any reason then WriteBack is disabled and WriteThrough is turned on. This option is safer for your data, but the raid card will switch to WriteThrough during a battery relearn cycle.
  • Disk Cache Policy: Enabled uses the hard drive's own cache. For example, as data is written out to the drives this option lets the drives themselves cache data internally before writing it to their platters.
  • Disk Cache Policy: Disabled does not allow the drive to use any of its own internal cache.
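
To see which of these policies a volume is currently using, and to flip them, MegaCLI has matching get/set properties. A few hedged examples:

MegaCli64 -LDGetProp -Cache -LAll -aAll     # current write/read/IO cache policy
MegaCli64 -LDGetProp -DskCache -LAll -aAll  # drive-level cache setting
MegaCli64 -LDSetProp WT -LAll -aAll         # switch to WriteThrough
MegaCli64 -LDSetProp NORA -LAll -aAll       # ReadAheadNone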

 

Credit: https://calomel.org/megacli_lsi_commands.html

Install and start the IPMI services:

#yum -y install OpenIPMI OpenIPMI-tools OpenIPMI-libs OpenIPMI-devel

#/sbin/chkconfig ipmi on

#/sbin/service ipmi start

Configure the BMC for Remote Usage:

1) There are two ways to configure the BMC. You can configure it through the boot-time menu (Ctrl-E), where you can set the management password and IP address information. Or, you can configure it with ipmitool from the OS. Replace my sample IP address, gateway, and netmask with your own:

/usr/bin/ipmitool -I open lan set 1 ipaddr 192.168.40.88
/usr/bin/ipmitool -I open lan set 1 defgw ipaddr 192.168.40.1
/usr/bin/ipmitool -I open lan set 1 netmask 255.255.255.0
/usr/bin/ipmitool -I open lan set 1 access on

2) Secure the BMC, so unauthorized people can't power cycle your machines. To do this you want to change the default SNMP community, the "null" user password, and the root user password. First, set the SNMP community, either to a random string or something you know:

/usr/bin/ipmitool -I open lan set 1 snmp YOURSNMPCOMMUNITY

Then set the null user password to something random. Replace CRAPRANDOMSTRING with something random and secure:

/usr/bin/ipmitool -I open lan set 1 password CRAPRANDOMSTRING

Last, set the root user password to something you know:

/usr/bin/ipmitool -I open user set password 2 REMEMBERTHIS

Double-check your settings with:

/usr/bin/ipmitool -I open lan print 1

Trying it:

1) You can set an environment variable, IPMI_PASSWORD, with the password you used above. That will save some typing:

export IPMI_PASSWORD="REMEMBERTHIS"

If you use this, substitute the "-a" in the following commands with "-E".
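
For example, with IPMI_PASSWORD exported, the power status query becomes:

export IPMI_PASSWORD="REMEMBERTHIS"
/usr/bin/ipmitool -I lan -U root -H 192.168.40.88 -E chassis power status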

2) From another machine issue the following command, obviously replacing the IP with the target BMC’s IP:

/usr/bin/ipmitool -I lan -U root -H 192.168.40.88 -a chassis power status

You should get something like:

Chassis Power is on

If you get anything else, or nothing, double-check to make sure the BMC is set right, you entered the right password, and the IP it has is reachable from the machine you're on. You can double-check your work via the Ctrl-E boot menu, too.

Beyond that, get familiar with:

/usr/bin/ipmitool -I lan -U root -H 192.168.40.88 -a chassis power off

/usr/bin/ipmitool -I lan -U root -H 192.168.40.88 -a chassis power cycle

/usr/bin/ipmitool -I lan -U root -H 192.168.40.88 -a sel list

For me, a "chassis power off" command kills the box. "SEL" is the system event log.

You can issue all of these commands locally, too:

/usr/bin/ipmitool sel list

 

Note: To restart IPMI type:

ipmitool mc reset cold

 

Hopefully this helps a little. If you find any errors in this please leave me a comment or send me an email. Thanks!

Credit: http://lonesysadmin.net/2007/06/21/how-to-configure-ipmi-on-a-dell-poweredge-running-red-hat-enterprise-linux/
