OpenStack Nova internals of instance launching
Original article: http://www.laurentluce.com/posts/openstack-nova-internals-of-instance-launching/
January 30, 2011  This article describes the internals of launching an instance in OpenStack Nova.
Overview
  Launching a new instance involves multiple components inside OpenStack Nova:

  • API server: handles requests from the user and relays them to the cloud controller.
  • Cloud controller: handles the communication between the compute nodes, the networking controllers, the API server and the scheduler.
  • Scheduler: selects a host to run a command.
  • Compute worker: manages computing instances: launch/terminate instance, attach/detach volumes…
  • Network controller: manages networking resources: allocate fixed IP addresses, configure VLANs…
  Note: There are more components in Nova, like the authentication manager, the object store and the volume controller, but we are not going to study them here since this article focuses on instance launching.
  The flow of launching an instance goes like this: the API server receives a run_instances command from the user and relays the message to the cloud controller (1). Authentication is performed to make sure the user has the required permissions. The cloud controller sends the message to the scheduler (2). The scheduler casts the message to a random host and asks it to start a new instance (3). The compute worker on that host grabs the message (4). The compute worker needs a fixed IP to launch a new instance, so it sends a message to the network controller (5, 6, 7, 8). The compute worker then continues with spawning the new instance. We will look at all of these steps in detail next.
(Figure: message flow (1-8) between the API server, cloud controller, scheduler, compute worker and network controller)
API
  You can use the OpenStack API or EC2 API to launch a new instance. We are going to use the EC2 API. We add a new key pair and we use it to launch an instance of type m1.tiny.
cd /tmp/
euca-add-keypair test > test.pem
euca-run-instances -k test -t m1.tiny ami-tiny

  run_instances() in api/ec2/cloud.py is called, which results in the compute API create() in compute/api.py being called.
def run_instances(self, context, **kwargs):
  ...
  instances = self.compute_api.create(context,
            instance_type=instance_types.get_by_type(
                kwargs.get('instance_type', None)),
            image_id=kwargs['image_id'],
            ...

  Compute API create() does the following:

  • Check if the maximum number of instances of this type has been reached.
  • Create a security group if it doesn’t exist.
  • Generate MAC addresses and hostnames for the new instances (a small sketch of MAC generation follows this list).
  • Send a message to the scheduler to run the instances.
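  To make the MAC address item concrete, here is a minimal, self-contained sketch of how such an address can be generated. It mimics the 02:16:3e:xx:xx:xx addresses that show up in the libvirt XML later in this article, but it is only an illustration, not Nova's actual helper.

import random

def generate_mac():
    # Locally administered MAC with the 02:16:3e prefix seen in the libvirt
    # XML later in this article (illustrative sketch, not Nova's code).
    return '02:16:3e:%02x:%02x:%02x' % (random.randint(0x00, 0xff),
                                        random.randint(0x00, 0xff),
                                        random.randint(0x00, 0xff))

print(generate_mac())  # e.g. 02:16:3e:17:35:39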
Cast
  Let’s pause for a minute and look at how the message is sent to the scheduler. This type of message delivery in OpenStack is defined as an RPC cast. RabbitMQ is used here for delivery. The publisher (the API server) sends the message to a topic exchange (the scheduler topic). A consumer (the scheduler worker) retrieves the message from the queue. No response is expected because this is a cast and not a call. We will see calls later.
(Figure: RPC cast, where the publisher sends the message to a topic exchange and a consumer retrieves it from the queue)
  Here is the code casting that message:
LOG.debug(_("Casting to scheduler for %(pid)s/%(uid)s's"
            " instance %(instance_id)s") % locals())
rpc.cast(context,
         FLAGS.scheduler_topic,
         {"method": "run_instance",
          "args": {"topic": FLAGS.compute_topic,
                   "instance_id": instance_id,
                   "availability_zone": availability_zone}})

  You can see that the scheduler topic is used, and the message arguments indicate what we want the scheduler to use for its own cast. In this case, we want the scheduler to send the message using the compute topic.
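  For readers who have not used AMQP topic exchanges, the following self-contained sketch shows roughly what such a cast amounts to. It uses the kombu library and assumes a RabbitMQ broker on localhost with a 'nova' topic exchange; Nova's own RPC layer of that era (built on carrot) differs in its details, so treat this purely as an illustration.

from kombu import Connection, Exchange, Producer

with Connection('amqp://guest:guest@localhost//') as conn:
    channel = conn.channel()
    # Topic exchange named 'nova'; the routing key plays the role of the topic.
    exchange = Exchange('nova', type='topic', durable=False)
    producer = Producer(channel, exchange=exchange)
    producer.publish({'method': 'run_instance',
                      'args': {'topic': 'compute', 'instance_id': 1}},
                     routing_key='scheduler',  # the scheduler topic
                     serializer='json')
# No reply is read back: that is what makes this a cast rather than a call.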
Scheduler
  The scheduler receives the message and sends the run_instance message to a random host. The chance scheduler is used here. There are more scheduler types, like the zone scheduler (pick a random host which is up in a specific availability zone) or the simple scheduler (pick the least loaded host). Now that a host has been selected, the following code is executed to send the message to a compute worker on that host.
rpc.cast(context,
         db.queue_get_for(context, topic, host),
         {"method": method,
          "args": kwargs})
LOG.debug(_("Casting to %(topic)s %(host)s for %(method)s") % locals())
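  To make the "pick a random host" behaviour concrete, here is a small self-contained sketch of a chance scheduler. hosts_up() is a hypothetical stand-in for Nova's check of which compute services are alive; this is not the actual ChanceScheduler code.

import random

class ChanceScheduler(object):
    """Pick any suitable host at random (illustrative sketch)."""

    def __init__(self, hosts_up):
        # hosts_up: callable returning the live hosts for a topic (hypothetical)
        self.hosts_up = hosts_up

    def schedule(self, topic):
        hosts = self.hosts_up(topic)
        if not hosts:
            raise RuntimeError('No hosts found for topic %s' % topic)
        return random.choice(hosts)

scheduler = ChanceScheduler(lambda topic: ['compute-1', 'compute-2', 'compute-3'])
print(scheduler.schedule('compute'))  # one of the three hosts, chosen at random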
Compute
  The compute worker receives the message and the following method in compute/manager.py is called:
def run_instance(self, context, instance_id, **_kwargs):
  """Launch a new instance with specified options."""
  ...

  run_instance() does the following:

  • Check if the instance is already running.
  • Allocate a fixed IP address.
  • Setup a VLAN and a bridge if not already setup.
  • Spawn the instance using the virtualization driver.
Call to network controller
  An RPC call is used to allocate a fixed IP. An RPC call is different from an RPC cast because it uses a topic.host exchange, meaning that a specific host is targeted, and because a response is expected.
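  The snippet below is a toy, in-process illustration of that difference: a cast just drops a message on the topic queue, while a call also sets up a one-shot reply queue and blocks until the worker answers. It uses Python's queue and threading modules and is not Nova's RPC code; the "allocate_fixed_ip" method name is only there to mirror the scenario described above.

import queue
import threading
import uuid

topic_queue = queue.Queue()   # stands in for the AMQP queue behind the topic
reply_queues = {}             # msg_id -> one-shot reply queue

def cast(msg):
    topic_queue.put(msg)      # fire and forget: no response expected

def call(msg):
    msg_id = str(uuid.uuid4())
    reply_queues[msg_id] = queue.Queue()
    topic_queue.put(dict(msg, _msg_id=msg_id))
    return reply_queues[msg_id].get()   # block until the worker replies

def network_worker():
    while True:
        msg = topic_queue.get()
        if msg['method'] == 'allocate_fixed_ip':
            result = '10.0.0.3'          # pretend we allocated an address
            if '_msg_id' in msg:         # it was a call: send the result back
                reply_queues[msg['_msg_id']].put(result)

threading.Thread(target=network_worker, daemon=True).start()
print(call({'method': 'allocate_fixed_ip', 'args': {'instance_id': 1}}))  # 10.0.0.3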
(Figure: RPC call through a topic.host exchange, with a response sent back to the caller)
Spawn instance
  Next is the instance spawning process performed by the virtualization driver. libvirt is used in our case. The code we are going to look at is located in virt/libvirt_conn.py.
  The first thing that needs to be done is the creation of the libvirt XML used to launch the instance. The to_xml() method is used to retrieve the XML content. Following is the XML for our instance.
<domain type='qemu'>
    <name>instance-00000001</name>
    <memory>524288</memory>
    <os>
        <type>hvm</type>
        <kernel>/opt/novascript/trunk/nova/..//instances/instance-00000001/kernel</kernel>
        <cmdline>root=/dev/vda console=ttyS0</cmdline>
        <initrd>/opt/novascript/trunk/nova/..//instances/instance-00000001/ramdisk</initrd>
    </os>
    <features>
        <acpi/>
    </features>
    <vcpu>1</vcpu>
    <devices>
        <disk type='file'>
            <driver type='qcow2'/>
            <source file='/opt/novascript/trunk/nova/..//instances/instance-00000001/disk'/>
            <target dev='vda' bus='virtio'/>
        </disk>
        <interface type='bridge'>
            <source bridge='br100'/>
            <mac address='02:16:3e:17:35:39'/>
            <!--  <model type='virtio'/>  CANT RUN virtio network right now -->
            <filterref filter="nova-instance-instance-00000001">
                <parameter name="IP" value="10.0.0.3" />
                <parameter name="DHCPSERVER" value="10.0.0.1" />
                <parameter name="RASERVER" value="fe80::1031:39ff:fe04:58f5/64" />
                <parameter name="PROJNET" value="10.0.0.0" />
                <parameter name="PROJMASK" value="255.255.255.224" />
                <parameter name="PROJNETV6" value="fd00::" />
                <parameter name="PROJMASKV6" value="64" />
            </filterref>
        </interface>

        <!-- The order is significant here.  File must be defined first -->
        <serial type="file">
            <source path='/opt/novascript/trunk/nova/..//instances/instance-00000001/console.log'/>
            <target port='1'/>
        </serial>

        <console type='pty' tty='/dev/pts/2'>
            <source path='/dev/pts/2'/>
            <target port='0'/>
        </console>

        <serial type='pty'>
            <source path='/dev/pts/2'/>
            <target port='0'/>
        </serial>

    </devices>
</domain>

  The hypervisor used is qemu. The memory allocated for the guest is 524288 KiB (512 MB), since libvirt expresses <memory> in KiB. The guest OS will boot from a kernel and initrd stored on the host OS.
  The number of virtual CPUs allocated for the guest OS is 1. ACPI is enabled for power management.
  Multiple devices are defined:

  • The disk image is a file on the host OS accessed through the qcow2 driver. qcow2 is QEMU’s copy-on-write disk image format.
  • The network interface is a bridge visible to the guest. We define network filtering parameters like IP which means this interface will always use 10.0.0.3 as the source IP address.
  • Device logfile. All data sent to the character device is written to console.log.
  • Pseudo TTY: virsh console can be used to connect to the serial port locally.
  Next is the preparation of the network filtering. The firewall driver used by default is the iptables driver. The rules are defined in apply_ruleset() in the IptablesFirewallDriver class; a rough sketch of that approach follows the rule walkthrough below. Let’s take a look at the firewall chains and rules for this instance.
*filter
...
:nova-ipv4-fallback - [0:0]
:nova-local - [0:0]
:nova-inst-1 - [0:0]
:nova-sg-1 - [0:0]
-A nova-ipv4-fallback -j DROP
-A FORWARD -j nova-local
-A nova-local -d 10.0.0.3 -j nova-inst-1
-A nova-inst-1 -m state --state INVALID -j DROP
-A nova-inst-1 -m state --state ESTABLISHED,RELATED -j ACCEPT
-A nova-inst-1 -j nova-sg-1
-A nova-inst-1 -s 10.1.3.254 -p udp --sport 67 --dport 68
-A nova-inst-1 -j nova-ipv4-fallback
-A nova-sg-1 -p tcp -s 10.0.0.0/27 -m multiport --dports 1:65535 -j ACCEPT
-A nova-sg-1 -p udp -s 10.0.0.0/27 -m multiport --dports 1:65535 -j ACCEPT
-A nova-sg-1 -p icmp -s 10.0.0.0/27 -m icmp --icmp-type 1/65535 -j ACCEPT
COMMIT

  First you have the chains: nova-local, nova-inst-1, nova-sg-1, nova-ipv4-fallback, and then the rules.
  Let’s look at the different chains and rules:
  Packets routed through the virtual network are handled by the chain nova-local.
-A FORWARD -j nova-local

  If the destination is 10.0.0.3 then it is for our instance, so we jump to the chain nova-inst-1.

-A nova-local -d 10.0.0.3 -j nova-inst-1

  If the packet could not be identified, drop it.

-A nova-inst-1 -m state --state INVALID -j DROP

  If the packet is associated with an established connection or is starting a new connection but associated with an existing connection, accept it.

-A nova-inst-1 -m state --state ESTABLISHED,RELATED -j ACCEPT

  Allow DHCP responses.

-A nova-inst-1 -s 10.0.0.254 -p udp --sport 67 --dport 68

  Jump to the security group chain to check the packet against its rules.

-A nova-inst-1 -j nova-sg-1

  Security group chain. Accept all TCP packets from 10.0.0.0/27 and ports 1 to 65535.

-A nova-sg-1 -p tcp -s 10.0.0.0/27 -m multiport --dports 1:65535 -j ACCEPT

  Accept all UDP packets from 10.0.0.0/27 and ports 1 to 65535.

-A nova-sg-1 -p udp -s 10.0.0.0/27 -m multiport --dports 1:65535 -j ACCEPT

  Accept all ICMP packets from 10.0.0.0/27 and ports 1 to 65535.

-A nova-sg-1 -p icmp -s 10.0.0.0/27 -m icmp --icmp-type 1/65535 -j ACCEPT

  Jump to the fallback chain.

-A nova-inst-1 -j nova-ipv4-fallback

  This is the fallback chain’s rule where we drop the packet.

-A nova-ipv4-fallback -j DROP

  Here is an example of a packet for a new TCP connection to 10.0.0.3:
(Figure: path of a new TCP connection to 10.0.0.3 through the chains above)
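  Before moving on to image creation, here is a rough sketch of how a ruleset like this can be applied in one shot with iptables-restore. It is only meant to illustrate the approach behind apply_ruleset(); the real IptablesFirewallDriver rewrites the output of iptables-save rather than hard-coding rules, and the chain and rule values below are simply copied from the dump above.

import subprocess

def apply_ruleset(chains, rules):
    # Build a complete *filter section and feed it to iptables-restore so the
    # whole update is applied atomically. --noflush keeps unrelated rules.
    lines = ['*filter']
    lines += [':%s - [0:0]' % chain for chain in chains]
    lines += rules
    lines += ['COMMIT', '']
    subprocess.run(['sudo', 'iptables-restore', '--noflush'],
                   input='\n'.join(lines).encode(), check=True)

apply_ruleset(
    ['nova-local', 'nova-inst-1', 'nova-sg-1', 'nova-ipv4-fallback'],
    ['-A FORWARD -j nova-local',
     '-A nova-local -d 10.0.0.3 -j nova-inst-1',
     '-A nova-inst-1 -m state --state INVALID -j DROP',
     '-A nova-inst-1 -m state --state ESTABLISHED,RELATED -j ACCEPT',
     '-A nova-inst-1 -j nova-sg-1',
     '-A nova-sg-1 -p tcp -s 10.0.0.0/27 -m multiport --dports 1:65535 -j ACCEPT',
     '-A nova-inst-1 -j nova-ipv4-fallback',
     '-A nova-ipv4-fallback -j DROP'])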
  Following the firewall rules preparation is the image creation. This happens in _create_image().
def _create_image(self, inst, libvirt_xml, suffix='', disk_images=None):
  ...

  In this method, libvirt.xml is created based on the XML we generated above.
  A copy of the ramdisk, initrd and disk images is made for the hypervisor to use.
  If the flat network manager is used then a network configuration is injected into the guest OS image. We are using the VLAN manager in this example.
  The instance’s SSH key is injected into the image. Let’s look at this part in more detail. The disk inject_data() method is called.
disk.inject_data(basepath('disk'), key, net,
                 partition=target_partition,
                 nbd=FLAGS.use_cow_images)

  basepath('disk') is where the instance’s disk image is located on the host OS. key is the SSH key string. net is not set in our case because we don’t inject a networking configuration. partition is None because we are using a kernel image; otherwise we could use a partitioned disk image. Let’s look inside inject_data().
  The first thing that happens here is linking the image to a device. This happens in _link_device().
device = _allocate_device()
utils.execute('sudo qemu-nbd -c %s %s' % (device, image))
# NOTE(vish): this forks into another process, so give it a chance
#             to set up before continuing
for i in xrange(10):
    if os.path.exists("/sys/block/%s/pid" % os.path.basename(device)):
        return device
    time.sleep(1)
raise exception.Error(_('nbd device %s did not show up') % device)

  _allocate_device() returns the next available nbd device: /dev/nbdX where X is between 0 and 15. qemu-nbd is a QEMU disk network block device server. Once this is done, we get the device, let’s say /dev/nbd0.
  We disable the filesystem check for this device. mapped_device here is “/dev/nbd0”.
out, err = utils.execute('sudo tune2fs -c 0 -i 0 %s' % mapped_device)
sshdir = os.path.join(fs, 'root', '.ssh')
utils.execute('sudo mkdir -p %s' % sshdir)  # existing dir doesn't matter
utils.execute('sudo chown root %s' % sshdir)
utils.execute('sudo chmod 700 %s' % sshdir)
keyfile = os.path.join(sshdir, 'authorized_keys')
utils.execute('sudo tee -a %s' % keyfile, '\n' + key.strip() + '\n')

  In the code above, fs is the temporary directory.
  Finally, we unmount the filesystem and unlink the device. This concludes the image creation and setup.
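  Expressed with plain commands, that cleanup looks roughly like the sketch below; the mount point and nbd device names are assumptions matching the earlier steps, not values taken from the source.

import subprocess

mount_point = '/tmp/nova-inject'   # hypothetical temporary mount directory
device = '/dev/nbd0'               # the device linked earlier with qemu-nbd -c

subprocess.run(['sudo', 'umount', mount_point], check=True)     # unmount the filesystem
subprocess.run(['sudo', 'qemu-nbd', '-d', device], check=True)  # disconnect the nbd device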
  The next step in the virtualization driver’s spawn() method is the instance launch itself, using the libvirt createXML() binding. Following that, the firewall rules are applied.
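  Stripped of the driver plumbing, "launch via createXML()" boils down to something like the sketch below, using the libvirt Python bindings. The instance.xml file is assumed to contain the domain XML shown earlier; this illustrates the binding, not the driver's exact code.

import libvirt

with open('instance.xml') as f:    # assumed to hold the domain XML shown above
    xml = f.read()

conn = libvirt.open('qemu:///system')
dom = conn.createXML(xml, 0)       # define and boot a transient domain
print(dom.name())                  # instance-00000001
conn.close()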
  That’s it for now. I hope you enjoyed this article. Please write a comment if you have any feedback. If you need help with a project written in Python or with building a new web service, I am available as a freelancer: LinkedIn profile. Follow me on Twitter @laurentluce.
posted by Laurent Luce
