Thursday, 2 April 2015

Mini-HOWTO:
OpenStack & Thin Provisioning on RHEL/CentOS 7


This time it's just a mini how-to with a proposed fix.

If you are an OpenStack user, you may have run into this issue: when you create a thin-provisioned volume from an image, it takes much longer than expected, and the resulting volume is far from "thin" even though it is marked as thin provisioned. Of course, you could try to reclaim the unused space manually, but that means extra steps and extra time.


So, if you want to save time (and storage capacity) and you look at the list of OpenStack (Linux) processes, qemu-img looks like a very good suspect. You may well be right: very probably your block storage needs special commands for proper thin provisioning and does not receive them from the qemu-img utility.

OpenStack (its Cinder component) uses qemu-img to transfer data between images and volumes (e.g. over iSCSI). For SCSI protocols, the missing commands may be UNMAP or WRITE SAME (with the UNMAP bit set).
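To see why these commands matter, here is a minimal conceptual sketch in Python. It is not qemu-img's actual implementation: it only models a copy routine that turns runs of zeros into "unmap" requests instead of writing them out. The function name plan_copy and the 4 KiB chunk size are assumptions made for this illustration.

```python
def plan_copy(image, chunk_size=4096):
    """Split an image into (operation, offset, length) steps.

    All-zero chunks become "unmap" steps (the storage can leave them
    unallocated); everything else becomes a "write" step. Adjacent steps
    of the same kind are merged into one.
    """
    zero_chunk = bytes(chunk_size)
    plan = []
    for offset in range(0, len(image), chunk_size):
        chunk = image[offset:offset + chunk_size]
        # The last chunk may be shorter than chunk_size.
        is_zero = chunk == zero_chunk[:len(chunk)]
        op = "unmap" if is_zero else "write"
        if plan and plan[-1][0] == op:
            prev_op, prev_offset, prev_length = plan[-1]
            plan[-1] = (prev_op, prev_offset, prev_length + len(chunk))
        else:
            plan.append((op, offset, len(chunk)))
    return plan


if __name__ == "__main__":
    # 4 KiB of data, 8 KiB of zeros, then 100 bytes of data.
    sample = b"A" * 4096 + bytes(8192) + b"B" * 100
    for step in plan_copy(sample):
        print(step)
```

A tool that cannot issue the "unmap" steps has to write the zeros instead, which is how a thin volume ends up fully allocated.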

QEMU has supported these commands since version 1.5, but qemu-img (its convert feature) uses them "by default" only since version 2.0. Later QEMU versions bring further improvements in this area.

But RHEL/CentOS 7 (and 7.1 too) ships qemu-img at version 1.5.3, which can obviously cause the issue. In its latest products, e.g. OpenStack Platform 6.0.1 (the latest one), Red Hat provides qemu-img 2.1.2 (the new package name is qemu-img-rhev). For those who need proper thin provisioning just for testing purposes and do not have access to RHEL OSP 6.0.1, I have prepared a qemu-img 2.1.3 package. The rpm is based on the sources from Fedora 21 updates.

I would recommend installing the package on an OpenStack node that does not run QEMU as a hypervisor. For example, the installation may look like this:
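A quick way to tell which side of the 2.0 boundary a node is on is to compare the version from the qemu-img banner line. The Python sketch below is only an illustration: the helper names are hypothetical, the sample banner strings are assumed examples of the usual "qemu-img version X.Y.Z, ..." format, and the 2.0 threshold is taken from the text above.

```python
import re

# From the text above: convert uses the thin-provisioning commands
# "by default" only since qemu-img 2.0.
MIN_VERSION = (2, 0)


def parse_qemu_img_version(banner):
    """Extract (major, minor[, patch]) from a qemu-img banner line."""
    match = re.search(r"version\s+(\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        raise ValueError("no version found in: %r" % banner)
    return tuple(int(part) for part in match.groups() if part is not None)


def is_new_enough(banner):
    """Return True if the banner reports version 2.0 or later."""
    return parse_qemu_img_version(banner) >= MIN_VERSION


if __name__ == "__main__":
    # Assumed sample banners, not captured from a real system.
    for line in ("qemu-img version 1.5.3, Copyright (c) 2004-2008 Fabrice Bellard",
                 "qemu-img version 2.1.3, Copyright (c) 2004-2008 Fabrice Bellard"):
        print(parse_qemu_img_version(line), is_new_enough(line))
```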


Finally, you should be able to create new volumes from images, and the new volumes are actually thin provisioned from the moment they are created. So, have a nice provisioning! ;-)

As usual, any substantive comments, questions, requests or errata (related to my post) are very welcome.


Friday, 20 March 2015

Storage for the cloud:
OpenStack & HP StoreVirtual 
(as an example)


One of the most important components of every cloud is its storage. In this post I would like to show you how to set up block storage on the OpenStack platform. As an example of a storage backend I will take one of the SDS (Software Defined Storage) solutions: HP StoreVirtual.

The SDS market is emerging and growing, and different vendors use the name in more or less different ways (e.g. in a more or less "hardware-agnostic" way), but SDS is not the topic of this post, although I will probably come back to it in one of the next ones. Just to mention for now, other SDS vendors include IBM (Spectrum Storage), NetApp (Data ONTAP Edge), Nexenta (NexentaStor), StarWind (Virtual SAN) and a few more. EMC (besides VMware's Virtual SAN) is also becoming more "software defined" (through ViPR and ScaleIO). If we look at the world of open source software, Ceph is a very good project (now supported by Red Hat), and the software is often used as OpenStack storage. However, it does not (yet) directly support non-Ceph (non-RBD) protocols like iSCSI or FC(oE).

But let's go back to our example. The OpenStack block storage component (named Cinder) has built-in support for most industry-leading storage platforms. Even if there is no built-in volume driver for our storage, OpenStack is by now so widely recognized that our vendor very probably provides one.

Regarding EMC storage (ViPR, ScaleIO, but also VMAX, VNX and others), the company is a member of the OpenStack Foundation. Some of its drivers are part of the OpenStack source distribution; others are provided by EMC separately. If you are an EMC customer, this post may be useful for you.

The OpenStack platform supports FC switches (zoning) as well. Brocade and Cisco drivers are included in the main distribution (and therefore in Red Hat's, etc.).

However, in our case (HP LeftHand/StoreVirtual storage), the only supported protocol is iSCSI. For volume management, the OpenStack driver supports the CLIQ (SSH) and REST interfaces of HP LeftHand OS.

Before we can run the driver in REST mode (it requires LeftHand OS version 11.5 or higher), we may first need to install pip (the Python package installer). For the Enterprise Linux distributions (RHEL, CentOS), it is available from the EPEL repositories. The rpm package name is python-pip:

yum install -y python-pip

The next step is to install the HP LeftHand/StoreVirtual REST client (at the time of writing, the most recent version is 1.0.4). The client is published on PyPI as hplefthandclient. For example:

pip install hplefthandclient


The OpenStack Block Storage component reads its configuration from the /etc/cinder/cinder.conf file. To work properly, the driver needs parameters like the ones below:

# LeftHand WS API Server URL
hplefthand_api_url=https://192.168.100.5:8081/lhos

# LeftHand Super user username
hplefthand_username=tomekw

# LeftHand Super user password
hplefthand_password=oiafr944icnl93

# LeftHand cluster to use for volume creation
hplefthand_clustername=Klaster1

# LeftHand iSCSI driver
volume_driver=cinder.volume.drivers.san.hp.hp_lefthand_iscsi.HPLeftHandISCSIDriver

## OPTIONAL SETTINGS

# Should CHAP authentication be used (default=false)
hplefthand_iscsi_chap_enabled=false

# Enable HTTP debugging to LeftHand (default=false)
hplefthand_debug=false


If we are going to use several volume backends within the same Cinder instance, we should enable multiple-backend support and separate the backend configurations. cinder.conf has special parameters and sections for this. For example:

enabled_backends=lvmdriver-1,hpdriver-1

[lvmdriver-1]
volume_backend_name=LVM_iSCSI
volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
# Name for the VG that will contain exported volumes (string value)
volume_group=cinder-volumes
# If >0, create LVs with multiple mirrors. Note that this
# requires lvm_mirrors + 2 PVs with available space (integer value)
#lvm_mirrors=0
# Type of LVM volumes to deploy; (default or thin) (string
# value)
#lvm_type=default

[hpdriver-1]
volume_backend_name=HPSTORAGE_iSCSI
# LeftHand iSCSI driver
volume_driver=cinder.volume.drivers.san.hp.hp_lefthand_iscsi.HPLeftHandISCSIDriver
# LeftHand WS API Server URL
hplefthand_api_url=https://192.168.100.5:8081/lhos
# LeftHand Super user username
hplefthand_username=tomekw
# LeftHand Super user password
hplefthand_password=oiafr944icnl93
# LeftHand cluster to use for volume creation
hplefthand_clustername=Klaster1
## OPTIONAL SETTINGS
# Should CHAP authentication be used (default=false)
hplefthand_iscsi_chap_enabled=false
# Enable HTTP debugging to LeftHand (default=false)
hplefthand_debug=false
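As a sanity check of such a multi-backend file, the following Python sketch lists every enabled backend section together with its volume_backend_name. It is only an illustration and not part of Cinder: the function name is hypothetical, the sample is a shortened copy of the configuration above, and it assumes that enabled_backends lives in the [DEFAULT] section.

```python
import configparser

# Shortened copy of the example above; the [DEFAULT] header is an assumption.
SAMPLE_CONF = """\
[DEFAULT]
enabled_backends=lvmdriver-1,hpdriver-1

[lvmdriver-1]
volume_backend_name=LVM_iSCSI
volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver

[hpdriver-1]
volume_backend_name=HPSTORAGE_iSCSI
volume_driver=cinder.volume.drivers.san.hp.hp_lefthand_iscsi.HPLeftHandISCSIDriver
"""


def backend_names(conf_text):
    """Map each enabled backend section to its volume_backend_name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    enabled = parser.get("DEFAULT", "enabled_backends", fallback="")
    result = {}
    for section in (name.strip() for name in enabled.split(",")):
        if not section:
            continue
        if not parser.has_section(section):
            raise KeyError("enabled backend has no section: " + section)
        result[section] = parser.get(section, "volume_backend_name", fallback=None)
    return result


if __name__ == "__main__":
    for section, name in backend_names(SAMPLE_CONF).items():
        print(section, "->", name)
```

A backend that is listed in enabled_backends but has no matching section raises an error here, which catches a common typo before the service is restarted.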

With multiple backends, the Cinder scheduler decides which backend a volume is created on, or, if the backends have different volume_backend_name values, we may choose a backend ourselves. If the same volume_backend_name is used by two or more backends, the capacity filter scheduler chooses the most suitable one.

From the Cinder documentation:

The filter scheduler:

1. Filters the available back ends. By default, AvailabilityZoneFilter, CapacityFilter and CapabilitiesFilter are enabled.

2. Weights the previously filtered back ends. By default, the CapacityWeigher option is enabled. When this option is enabled, the filter scheduler assigns the highest weight to back ends with the most available capacity.

The scheduler uses filters and weights to pick the best back end to handle the request. The scheduler uses volume types to explicitly create volumes on specific back ends.

In short, we can use the filters (which are parametrized per backend) also as a part of our storage management policies.
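The filter-then-weigh idea quoted above can be modelled in a few lines of Python. This is a toy model and not Cinder's scheduler code: the Backend class, its fields and the host names are assumptions, the availability zone filter is left out, and the weighing step simply prefers the backend with the most free capacity.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    host: str                  # hypothetical identifier of the backend
    volume_backend_name: str
    free_gb: float


def pick_backend(backends, size_gb, wanted_name=None):
    """Filter the backends, then return the one with the most free space."""
    # Capacity-style filter: the backend must have room for the volume.
    candidates = [b for b in backends if b.free_gb >= size_gb]
    # Capabilities-style filter: honour the backend name a volume type asks for.
    if wanted_name is not None:
        candidates = [b for b in candidates
                      if b.volume_backend_name == wanted_name]
    if not candidates:
        return None
    # Capacity-style weigher: most available capacity wins.
    return max(candidates, key=lambda b: b.free_gb)


if __name__ == "__main__":
    pool = [
        Backend("node1@lvmdriver-1", "LVM_iSCSI", 40.0),
        Backend("node1@hpdriver-1", "HPSTORAGE_iSCSI", 900.0),
    ]
    print(pick_backend(pool, 100))
    print(pick_backend(pool, 10, wanted_name="LVM_iSCSI"))
```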

Once the volume drivers are configured, we have to restart the cinder-volume service, and then we can link the new backends to volume types:
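Linking a backend to a volume type is normally done with the cinder command-line client (type-create followed by type-key). The Python sketch below does not run anything: it only builds and prints those commands for a given mapping. The volume type names "lvm" and "hpstorage" are hypothetical; the backend names come from the configuration example above.

```python
import shlex


def type_commands(type_to_backend):
    """Build the cinder CLI calls that tie volume types to backend names."""
    commands = []
    for type_name, backend_name in type_to_backend.items():
        commands.append(["cinder", "type-create", type_name])
        commands.append(["cinder", "type-key", type_name, "set",
                         "volume_backend_name=" + backend_name])
    return commands


if __name__ == "__main__":
    # Hypothetical volume type names mapped to the backend names used above.
    mapping = {"lvm": "LVM_iSCSI", "hpstorage": "HPSTORAGE_iSCSI"}
    for argv in type_commands(mapping):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the commands first makes it easy to review them before running them with admin credentials.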


Now we can create a new virtual machine that requires more disk space than was available before:


So, the new disk volume is created...


and it is placed on our new backend (the new array).
Of course, we can see the new volume in the HP StoreVirtual Centralized Management Console:


In our case iSCSI is the protocol in use:



That's all for this post. I hope you have found it useful. As usual, any substantive comments, questions, requests or errata are very welcome.
