[Experience Sharing] Notes on Learning Migration: from OpenStack to QEMU

Posted on 2015-10-11
I've recently been digging into migration in OpenStack, and along the way I also read through how migration is implemented in QEMU. Here I'd like to share what I've learned.
The OpenStack release I studied is Grizzly (G); the soon-to-arrive Havana (H) release shouldn't change much on the migration front (at least not as much as the jump from Folsom to Grizzly did).
Migration in OpenStack comes in two flavors: migration and live migration. Plain migration can be triggered from the dashboard's admin panel; live migration hasn't been added to the dashboard yet, so it takes a command line (nova live-migration instance_ID).
The two differ greatly in both functionality and implementation. Plain migration is orchestrated almost entirely by OpenStack itself, whereas live migration relies mostly on libvirt and QEMU.
Let's look at the migration code first:


nova/compute/manager.py

    @exception.wrap_exception(notifier=notifier, publisher_id=publisher_id())
    @reverts_task_state
    @wrap_instance_event
    @wrap_instance_fault
    def resize_instance(self, context, instance, image,
                        reservations=None, migration=None, migration_id=None,
                        instance_type=None):
        """Starts the migration of a running instance to another host."""
        if not migration:
            migration = self.conductor_api.migration_get(context, migration_id)
        with self._error_out_instance_on_exception(context, instance['uuid'],
                                                   reservations):
            if not instance_type:
                instance_type = self.conductor_api.instance_type_get(
                        context, migration['new_instance_type_id'])

            network_info = self._get_instance_nw_info(context, instance)

            # mark the migration and the instance as in progress
            migration = self.conductor_api.migration_update(
                    context, migration, 'migrating')
            instance = self._instance_update(
                    context, instance['uuid'],
                    task_state=task_states.RESIZE_MIGRATING,
                    expected_task_state=task_states.RESIZE_PREP)

            self._notify_about_instance_usage(
                context, instance, "resize.start", network_info=network_info)

            block_device_info = self._get_instance_volume_block_device_info(
                    context, instance)

            # power the instance off and copy its disks to the destination
            disk_info = self.driver.migrate_disk_and_power_off(
                    context, instance, migration['dest_host'],
                    instance_type, network_info,
                    block_device_info)

            self._terminate_volume_connections(context, instance)

            self.conductor_api.network_migrate_instance_start(
                    context, instance, migration)

            migration = self.conductor_api.migration_update(
                    context, migration, 'post-migrating')
            instance = self._instance_update(
                    context, instance['uuid'],
                    host=migration['dest_compute'],
                    node=migration['dest_node'],
                    task_state=task_states.RESIZE_MIGRATED,
                    expected_task_state=task_states.RESIZE_MIGRATING)

            # hand off to the destination host to finish the resize there
            self.compute_rpcapi.finish_resize(
                    context, instance, migration, image, disk_info,
                    migration['dest_compute'], reservations)

            self._notify_about_instance_usage(context, instance, "resize.end",
                                              network_info=network_info)



It's quite clear that OpenStack implements migration through the resize command, so the migration flow is naturally very close to a resize: shut the instance down, move it to another host (and possibly onto a new flavor), set up the network, update the records, and finally start the instance back up.
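Consistent with that, the two operations sit side by side in the CLI (a quick sketch; the instance ID and flavor ID below are placeholders):

    nova migrate <instance_ID>                       # cold migration: same flavor, new host
    nova resize <instance_ID> <flavor_ID>            # resize: same code path, new flavor
    nova live-migration <instance_ID> [<dest_host>]  # live migration, for comparison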
Next, let's see how OpenStack carries out the migration through its libvirt driver.
nova/virt/libvirt/driver.py

    def migrate_disk_and_power_off(self, context, instance, dest,
                                   instance_type, network_info,
                                   block_device_info=None):
        LOG.debug(_("Starting migrate_disk_and_power_off"),
                  instance=instance)
        disk_info_text = self.get_instance_disk_info(
                instance['name'], block_device_info=block_device_info)
        disk_info = jsonutils.loads(disk_info_text)

        # copy disks to destination
        # rename instance dir to +_resize at first for using
        # shared storage for instance dir (eg. NFS).
        inst_base = libvirt_utils.get_instance_path(instance)
        inst_base_resize = inst_base + "_resize"

        shared_storage = self._is_storage_shared_with(dest, inst_base)

        # try to create the directory on the remote compute node
        # if this fails we pass the exception up the stack so we can catch
        # failures here earlier
        if not shared_storage:
            utils.execute('ssh', dest, 'mkdir', '-p', inst_base)

        self.power_off(instance)

        block_device_mapping = driver.block_device_info_get_mapping(
                block_device_info)
        for vol in block_device_mapping:
            connection_info = vol['connection_info']
            disk_dev = vol['mount_device'].rpartition("/")[2]
            self.volume_driver_method('disconnect_volume',
                                      connection_info,
                                      disk_dev)

        try:
            utils.execute('mv', inst_base, inst_base_resize)
            # if we are migrating the instance with shared storage then
            # create the directory.  If it is a remote node the directory
            # has already been created
            if shared_storage:
                dest = None
                utils.execute('mkdir', '-p', inst_base)
            for info in disk_info:
                # assume inst_base == dirname(info['path'])
                img_path = info['path']
                fname = os.path.basename(img_path)
                from_path = os.path.join(inst_base_resize, fname)
                if info['type'] == 'qcow2' and info['backing_file']:
                    tmp_path = from_path + "_rbase"
                    # merge backing file
                    utils.execute('qemu-img', 'convert', '-f', 'qcow2',
                                  '-O', 'qcow2', from_path, tmp_path)

                    if shared_storage:
                        utils.execute('mv', tmp_path, img_path)
                    else:
                        libvirt_utils.copy_image(tmp_path, img_path, host=dest)
                        utils.execute('rm', '-f', tmp_path)
                else:  # raw or qcow2 with no backing file
                    libvirt_utils.copy_image(from_path, img_path, host=dest)
        except Exception:
            with excutils.save_and_reraise_exception():
                self._cleanup_remote_migration(dest, inst_base,
                                               inst_base_resize,
                                               shared_storage)

        return disk_info_text





As you can see, a migration simply powers off the instance, establishes an ssh connection, and copies the image from the source host to the destination. Whether the storage is shared only determines whether the image is truly "copied" or merely renamed in place.
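To make the per-disk copy step concrete, here is a minimal standalone sketch of that decision (not nova's code: the function, its arguments, and the use of scp in place of libvirt_utils.copy_image are all assumptions for illustration):

    import os
    import subprocess

    def copy_disk_to_dest(from_path, img_path, dest,
                          has_backing_file, shared_storage):
        """Sketch of the per-disk logic in migrate_disk_and_power_off."""
        if has_backing_file:
            # flatten the qcow2 image so the destination host does not
            # need the backing file that only exists on the source
            tmp_path = from_path + "_rbase"
            subprocess.check_call(['qemu-img', 'convert', '-f', 'qcow2',
                                   '-O', 'qcow2', from_path, tmp_path])
            src = tmp_path
        else:
            src = from_path

        if shared_storage:
            # both hosts see the same filesystem (e.g. NFS),
            # so a rename is all the "copy" there is
            os.rename(src, img_path)
        else:
            # otherwise push the bytes over ssh to the destination
            subprocess.check_call(['scp', src, '%s:%s' % (dest, img_path)])
            if src != from_path:
                os.remove(src)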
So on the OpenStack side, plain migration is straightforward; live migration is considerably more involved.

One big change with live migration is that the Grizzly release moved the operation from nova-compute into nova-conductor. Enough talk, on to the code.


nova/compute/api.py

    @check_instance_state(vm_state=[vm_states.ACTIVE])
    def live_migrate(self, context, instance, block_migration,
                     disk_over_commit, host_name):
        """Migrate a server lively to a new host."""
        LOG.debug(_("Going to try to live migrate instance to %s"),
                  host_name or "another host", instance=instance)

        instance = self.update(context, instance,
                               task_state=task_states.MIGRATING,
                               expected_task_state=None)

        self.compute_task_api.migrate_server(context, instance,
                scheduler_hint={'host': host_name},
                live=True, rebuild=False, flavor=None,
                block_migration=block_migration,
                disk_over_commit=disk_over_commit)





The compute_task_api here is the conductor, so let's follow the call into the conductor.



nova/conductor/manager.py

    def migrate_server(self, context, instance, scheduler_hint, live, rebuild,
                       flavor, block_migration, disk_over_commit):
        # only live migration (no rebuild, no resize) is handled here
        if not live or rebuild or (flavor != None):
            raise NotImplementedError()

        destination = scheduler_hint.get("host")
        self.scheduler_rpcapi.live_migration(context, block_migration,
                disk_over_commit, instance, destination)






Fine, the conductor kicks the ball over to the scheduler. For space reasons, here is just a glimpse of what the scheduler does:
        try:
            self._schedule_live_migration(context, instance, dest,
                                          block_migration, disk_over_commit)
        except (exception.NoValidHost,
                exception.ComputeServiceUnavailable,
                exception.InvalidHypervisorType,
                exception.UnableToMigrateToSelf,
                exception.DestinationHypervisorTooOld,
                exception.InvalidLocalStorage,
                exception.InvalidSharedStorage,
                exception.MigrationPreCheckError) as ex:
            request_spec = {'instance_properties': {
                'uuid': instance['uuid'], },
            }



The scheduler does little more than check the destination, the storage, and so on, then kicks the ball right back to the conductor.
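Reading the exception list above backwards gives a good idea of what those checks are; conceptually they amount to something like this (a sketch only, not nova's actual code; the helper functions are invented):

    def pre_check_live_migration(src, dest, instance):
        """Conceptual summary of the scheduler's live-migration pre-checks."""
        if dest == src:
            raise exception.UnableToMigrateToSelf()
        if not service_is_up(dest):                        # invented helper
            raise exception.ComputeServiceUnavailable()
        if hypervisor_type(dest) != hypervisor_type(src):  # invented helper
            raise exception.InvalidHypervisorType()
        if hypervisor_version(dest) < hypervisor_version(src):
            raise exception.DestinationHypervisorTooOld()
        # ...plus storage checks raising InvalidLocalStorage /
        # InvalidSharedStorage, and driver pre-checks raising
        # MigrationPreCheckError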
nova/conductor/tasks/live_migrate.py

class LiveMigrationTask(object):
    def __init__(self, context, instance, destination,
                 block_migration, disk_over_commit,
                 select_hosts_callback):
        self.context = context
        self.instance = instance
        self.destination = destination
        self.block_migration = block_migration
        self.disk_over_commit = disk_over_commit
        self.select_hosts_callback = select_hosts_callback
        self.source = instance['host']
        self.migrate_data = None
        self.compute_rpcapi = compute_rpcapi.ComputeAPI()
        self.servicegroup_api = servicegroup.API()
        self.image_service = glance.get_default_image_service()

    def execute(self):
        self._check_instance_is_running()
        self._check_host_is_up(self.source)

        if not self.destination:
            self.destination = self._find_destination()
        else:
            self._check_requested_destination()

        #TODO(johngarbutt) need to move complexity out of compute manager
        return self.compute_rpcapi.live_migration(self.context,
                host=self.source,
                instance=self.instance,
                dest=self.destination,
                block_migration=self.block_migration,
                migrate_data=self.migrate_data)





At last some real work gets done: the conductor tells nova-compute on the source host, over the message queue, to start the live migration. Let's migrate! As usual, libvirt does the heavy lifting.
nova/virt/libvirt/driver.py

    def live_migration(self, context, instance, dest,
                       post_method, recover_method, block_migration=False,
                       migrate_data=None):
        """Spawning live_migration operation for distributing high-load.

        :params context: security context
        :params instance:
            nova.db.sqlalchemy.models.Instance object
            instance object that is migrated.
        :params dest: destination host
        :params post_method:
            post operation method.
            expected nova.compute.manager.post_live_migration.
        :params recover_method:
            recovery method when any exception occurs.
            expected nova.compute.manager.recover_live_migration.
        :params block_migration: if true, do block migration.
        :params migrate_data: implementation specific params
        """
        greenthread.spawn(self._live_migration, context, instance, dest,
                          post_method, recover_method, block_migration,
                          migrate_data)

    def _live_migration(self, context, instance, dest, post_method,
                        recover_method, block_migration=False,
                        migrate_data=None):
        """Do live migration.

        :params context: security context
        :params instance:
            nova.db.sqlalchemy.models.Instance object
            instance object that is migrated.
        :params dest: destination host
        :params post_method:
            post operation method.
            expected nova.compute.manager.post_live_migration.
        :params recover_method:
            recovery method when any exception occurs.
            expected nova.compute.manager.recover_live_migration.
        :params migrate_data: implementation specific params
        """
        # Do live migration.
        try:
            if block_migration:
                flaglist = CONF.block_migration_flag.split(',')
            else:
                flaglist = CONF.live_migration_flag.split(',')
            flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
            logical_sum = reduce(lambda x, y: x | y, flagvals)

            dom = self._lookup_by_name(instance["name"])
            dom.migrateToURI(CONF.live_migration_uri % dest,
                             logical_sum,
                             None,
                             CONF.live_migration_bandwidth)

        except Exception as e:
            with excutils.save_and_reraise_exception():
                LOG.error(_("Live Migration failure: %s"), e,
                          instance=instance)
                recover_method(context, instance, dest, block_migration)

        # Waiting for completion of live_migration.
        timer = loopingcall.FixedIntervalLoopingCall(f=None)

Next comes the call down into libvirt and, beneath it, QEMU: the dom.migrateToURI() call.

If you have played with something like virt-manager, the parameters of this call will look familiar. In fact, a live migration in OpenStack is not fundamentally different from doing a migration in virt-manager.
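Stripped of nova's plumbing, the whole thing boils down to roughly this (a minimal sketch with the libvirt Python bindings; the URIs, domain name, and flag choice are illustrative assumptions rather than nova's exact defaults):

    import libvirt

    # connect to the source hypervisor and find the guest
    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')  # hypothetical domain name

    # OR the flag bits together, just as nova's reduce() does above
    flags = (libvirt.VIR_MIGRATE_LIVE |
             libvirt.VIR_MIGRATE_PEER2PEER |
             libvirt.VIR_MIGRATE_UNDEFINE_SOURCE)

    # push the guest to the destination; bandwidth 0 means unlimited
    dom.migrateToURI('qemu+tcp://dest-host/system', flags, None, 0)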
To sum up so far: OpenStack sits on top of a library like libvirt that wraps the hypervisor's API, since besides KVM it also has to support virtualization back ends such as Xen and ESX. And compared with virtualization on a single machine, OpenStack, as the management layer of a cloud platform, naturally has to handle updating the instance's records in the database (the conductor's job), setting up the network, choosing a destination host, and the rest of the workflow. Whether the storage is shared is not something OpenStack pays much attention to: libvirt, or more precisely QEMU, can sort that out entirely by itself; OpenStack just has to say the word.
It is also worth mentioning that OpenStack does not enable true live migration by default, on the grounds that a live migration may never finish when the dirty page rate in memory exceeds the network bandwidth (a detailed explanation will follow in the QEMU part). So unless you configure it, OpenStack quietly substitutes a paused migration (live migration feels a bit like a hidden level you have to unlock).
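Unlocking it means adding VIR_MIGRATE_LIVE to the libvirt flags in nova.conf. A hedged example (to my recollection the Grizzly-era default omits VIR_MIGRATE_LIVE; check your own deployment):

    # /etc/nova/nova.conf
    live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE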
Overall I think OpenStack does a decent job with live migration, and I haven't hit any problems in practice either. After all, OpenStack is only a management layer: how well the live migration itself goes (how long the downtime is, how long the total migration takes, how much performance drops, and so on) is not really its concern.

That about wraps up the OpenStack side of migration; next I'll talk about how it's implemented inside QEMU. (To be continued)
