Thursday, 22 May 2014

Openstack: OpenvSwitch Kernel panic

When starting openvswitch-switch or quantum-plugin-openvswitch-agent with enable_tunneling = True, the network node kernel panics.

Kernel version: Linux 3.5.0-41-generic
OpenvSwitch version: 1.4.6

Solution

Upgrade OpenvSwitch to version 1.10.2 from the Havana repositories.


add-apt-repository -y cloud-archive:havana && apt-get update && apt-get install openvswitch-switch
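You can verify that the upgrade took effect by checking the reported version afterwards:

ovs-vsctl --version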


Tuesday, 1 April 2014

Enabling Virtio drivers in the kernel for running Android-x86 on Openstack

Guest operating systems running on virtualised systems need to cooperate with the underlying hypervisor when using virtualised resources. Virtio is a set of standards for disk and network virtualisation, and the virtio drivers must be present in any instance that runs on Openstack. The default Android-x86 kernel does not come with these modules enabled. You also have to edit the Android-x86 source code so that the installer detects virtualised block devices; otherwise you will see a screen stuck at "Detecting Android-x86... (continuous dots :s)".

This is what you will see in Android-x86 debug mode:


If you do not want to compile the source and set it up yourself, I have already created the image for you. Download it from here.

1. First, to compile the OS you have to initialise the build environment. Follow the instructions here.

2. Download and install the repo client using the instructions here.

3. Download the Android-x86 source from here.

4. Alter the kernel defconfig files "kernel/arch/x86/configs/android-x86_defconfig"
and "kernel/arch/x86/configs/android-x86_64_defconfig" by adding the following lines.

CONFIG_VIRT_DRIVERS=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_MMIO=m
CONFIG_VIRTIO_BALLOON=m
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_NET=m
CONFIG_VIRTIO_RING=m
CONFIG_VIRTIO_CONSOLE=m
CONFIG_HW_RANDOM_VIRTIO=m

5. To be able to detect virtual block devices, alter the "bootable/newinstaller/initrd/init" file so that the installer's device glob also matches virtio disks (/dev/vd*).
On line 124, change

for device in ${ROOT:-/dev/sr* /dev/[hs]d[a-z]* /dev/mmcblk*}; do

to

for device in ${ROOT:-/dev/sr* /dev/[hsv]d[a-z]* /dev/mmcblk*}; do

6. Insert the following lines into device/generic/x86/init.sh:

#Force dhcp on eth0 interface.
netcfg eth0 dhcp
#Start SSH daemon at startup.
start-ssh 

7. Add the following packages to PRODUCT_PACKAGES in device/generic/x86/packages.mk.

ssh-keygen
sshd_config
start-ssh

8. Then compile.
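For reference, the usual Android-x86 build sequence looks something like this (treat it as a sketch; the exact lunch target name varies between Android-x86 releases):

. build/envsetup.sh
lunch android_x86-eng
make -j8 iso_img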

9. Install the Android-x86 OS onto a disk image (vdi, vmdk, qcow2).
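One way to do this (the image name and sizes here are just my examples) is to create an empty qcow2 image and boot the installer ISO against it with KVM, attaching the disk as virtio so you test the detection fix at the same time:

qemu-img create -f qcow2 android.qcow2 8G
kvm -m 2048 -cdrom android_x86.iso -drive file=android.qcow2,if=virtio -boot d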

10. To fetch the nova keypair at boot, create a fetchsshkeys script in /data/local/ with the following lines.

#!/system/bin/sh

# Fetch public key using HTTP
cd /data
wget http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key 
cat /data/openssh-key > /data/ssh/authorized_keys

chmod 0600 /data/ssh/authorized_keys
restorecon /data/ssh/authorized_keys

rm /data/openssh-key 

11. Make the script executable.

chmod 755 /data/local/fetchsshkeys

12. Call the script from the /etc/init.sh file.

#Fetch ssh keys from openstack metadata service.
/data/local/fetchsshkeys

13. Upload the disk image to Glance.
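For example, with a qcow2 image from step 9 (the image name is just a placeholder):

glance image-create --name android-x86 --disk-format qcow2 --container-format bare --is-public True --file android.qcow2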

It should now detect /dev/vda1.


14. Create a virtual machine using the uploaded image. Your VM should get a private IP. (Make sure to specify a keypair.)
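For example (the flavour and keypair names are placeholders):

nova boot --image android-x86 --flavor m1.small --key-name mykey android-vm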

15. You should now be able to ssh into the Android-x86 VM.

16. To get the GUI, add "nomodeset" to the kernel command line in GRUB.

Thanks to everyone who helped me with this issue on the Android-x86 mailing list.

Wednesday, 12 March 2014

Openstack Grizzly slow API and timeouts

I came across this issue a while ago on our Openstack testbed: as the number of API calls increased, the delay on each call increased. I then discovered that Openstack does not clean up expired tokens automatically when you are using a SQL backend.

https://bugs.launchpad.net/ubuntu/+source/keystone/+bug/1032633

To delete the expired tokens, I created a script that can be automated with a cron job.

You can find the script here on my github. Download the script and change the password to your MySQL password.
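At its core the script boils down to one SQL statement against the Keystone database (a sketch; the database name, table name and credentials depend on your deployment):

#!/bin/bash
# Delete Keystone tokens that have already expired (assumes a MySQL
# backend with database "keystone" and token table "token").
mysql -u keystone -pYOURPASSWORD keystone -e "DELETE FROM token WHERE expires < UTC_TIMESTAMP();"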

Then create a cron job to periodically execute this script.

0 0 * * * /path/to/keystone_flush >> /path/to/log/keystone_flush.log 2>&1

The above cron job will run once every day at midnight. You may change the timing of the job as you desire. (Note that the redirection order matters: putting 2>&1 after the >> sends stderr to the log as well.)

However, this problem is solved in the Openstack Havana release.

https://blueprints.launchpad.net/keystone/+spec/keystone-manage-token-flush
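From Havana onwards you can simply cron the built-in command instead of a hand-rolled script:

keystone-manage token_flush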



Wednesday, 29 January 2014

Openstack Folsom can't ping instance/VM

The most annoying bug that I came across when setting up Openstack Folsom was that, after creating a virtual machine, it could not reach the metadata service and I could not reach the virtual machine.

After days of debugging, I found out that I was not the only one having this issue. It had finally been reported as a known bug to the openstack developers.

https://bugs.launchpad.net/neutron/+bug/1091605

The problem seems to be that the OpenvSwitch devices are not brought back up properly after a reboot. Hence, before I upgraded from Folsom to Grizzly, I created the following script and had it run at boot-up on the network node.

#!/bin/sh

#
# This script works around the following bug.
# It has to run after every reboot.
# Bug: https://bugs.launchpad.net/quantum/+bug/1091605
#
# Chathura S. Magurawalage

(
# Make sure only root can run our script.
if [ "$(id -u)" != "0" ]; then
    echo "This script must be run as root"
    exit 1
fi

BRIDGE=${BRIDGE:-br0}

while getopts b:hv option
do
    case "${option}" in
        b) BRIDGE=${OPTARG};;
        v) set -x;;
        h) cat <<EOF
Usage: $0 [-b bridge_name]

Add -v for verbose mode, -h to display this message.
EOF
           exit 0;;
        \?) echo "Use -h for help"
            exit 1;;
    esac
done

# Recreate the integration and external bridges from scratch,
# then re-attach the physical interface to br-ex.
ovs-vsctl del-br br-int
ovs-vsctl del-br br-ex
ovs-vsctl add-br br-int
ovs-vsctl add-br br-ex
ovs-vsctl add-port br-ex $BRIDGE
ip link set br-ex up

# Restart the Quantum agents so they pick up the fresh bridges.
service quantum-plugin-openvswitch-agent restart
service quantum-dhcp-agent restart
service quantum-l3-agent restart

) 2>&1 | tee "$0.log"
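One simple way to run it at boot (my choice; any init hook will do) is to call it from /etc/rc.local before the exit line, assuming you saved the script as /root/fix-ovs-bridges.sh:

# in /etc/rc.local
/root/fix-ovs-bridges.sh
exit 0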
   

Ubuntu Kernel panic when creating Openstack instance

Currently I am using Openstack Folsom, nova version 2013.1.4, installed on Ubuntu 12.04. I upgraded my compute and network nodes' kernel to "Linux 3.2.0-58-generic". Since then, when I try to create an instance on the compute nodes or when I restart the network node, the system goes into a kernel panic.

The only fix I found for this issue is to downgrade the kernel from "Linux 3.2.0-58-generic" to "Linux 3.2.0-55-generic" on all compute nodes and the network node, to keep things consistent. The controller node does not seem to be affected by this kernel bug.

Also make sure to install the headers for the corresponding kernel if you are installing a new kernel. Or, if you already have the "Linux 3.2.0-55-generic" kernel installed, just set it as the default kernel in the GRUB records.

First, find out the index of the kernel in the GRUB records.

grep menuentry /boot/grub/grub.cfg

Then the first answer to the following question should help when editing the GRUB config.

http://askubuntu.com/questions/216398/set-older-kernel-as-default-grub-entry
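For example, to pin the older kernel you can set GRUB_DEFAULT in /etc/default/grub to the exact menu entry title from the grep output above (including any submenu prefix such as "Previous Linux versions>") and regenerate the config:

sudo sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="Previous Linux versions>Ubuntu, with Linux 3.2.0-55-generic"/' /etc/default/grub
sudo update-grub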

Openstack Grizzly Error "FATAL: Module openvswitch_mod not found" when starting openvswitch or at startup

I came across this error on Openstack Grizzly installed on Ubuntu 12.04 LTS when I upgraded or downgraded the Linux kernel. So it is very likely that you will get this error after running a "dist-upgrade" on your system.

The cause of this problem seems to be that when a different kernel is booted, the openvswitch module has not been built against it.

Error:
FATAL: Module openvswitch_mod not found.
 * Inserting openvswitch module
 * not removing bridge module because bridges exist (virbr0)
invoke-rc.d: initscript openvswitch-switch, action "load-kmod" failed.

Solution:

Copy the following script to a file, then save it.

#!/bin/bash

# Install the headers for the running kernel so DKMS can build the module.
sudo apt-get install linux-headers-$(uname -r)

# Reinstalling the packages makes the DKMS module rebuild against the new kernel.
sudo apt-get remove openvswitch-switch openvswitch-datapath-dkms quantum-plugin-openvswitch-agent
sudo apt-get install openvswitch-switch openvswitch-datapath-dkms quantum-plugin-openvswitch-agent

# Refresh module dependencies and load the freshly built module.
sudo depmod -a
sudo modprobe openvswitch


Then make the file executable.

Go to the file's directory, then execute the following command in a terminal:

sudo chmod +x filename

Finally execute the file.

./filename 
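Afterwards you can check that the module is actually loaded:

lsmod | grep openvswitch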

Openstack Folsom Error "FATAL: Module openvswitch_mod not found" when starting openvswitch or at startup

I came across this error on Openstack Folsom installed on Ubuntu 12.04 LTS when I upgraded or downgraded the Linux kernel. So it is very likely that you will get this error after running a "dist-upgrade" on your system.

The cause of this problem seems to be that when a different kernel is booted, the openvswitch module has not been built against it.

Error:

FATAL: Module openvswitch_mod not found.
 * Inserting openvswitch module
 * not removing bridge module because bridges exist (virbr0)
invoke-rc.d: initscript openvswitch-switch, action "load-kmod" failed.

Solution:

Copy the following script to a file, then save it.

#!/bin/bash

# Install the headers for the running kernel so DKMS can build the module.
sudo apt-get install linux-headers-$(uname -r)

# Remove the Open vSwitch and Quantum packages completely.
sudo apt-get remove quantum-plugin-openvswitch openvswitch-switch quantum-plugin-openvswitch-agent openvswitch-datapath-dkms openvswitch-common quantum-common python-quantum

# Reinstall them; the DKMS package rebuilds the module against the new headers.
sudo apt-get install openvswitch-switch
sudo mkdir -p /etc/quantum/
sudo apt-get install quantum-plugin-openvswitch-agent quantum-dhcp-agent quantum-l3-agent quantum-server

# Refresh module dependencies and load the freshly built module.
sudo depmod -a
sudo modprobe openvswitch

Then make the file executable.

Go to the file's directory, then execute the following command in a terminal:

sudo chmod +x filename

Finally execute the file.

./filename