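1. Background
In OpenStack, glance provides the image service. It can also turn a running virtual machine into a snapshot, and both images and snapshots are stored in glance. The glance backend supports several storage types, including the local filesystem, HTTP, GlusterFS, Ceph, and Swift.
By default, glance stores images on the local filesystem under /var/lib/glance/images. As images accumulate over time, they consume more and more space on the root filesystem, so the glance storage path should be planned ahead of time: for example, set aside a dedicated volume for images, or store them in a distributed storage system such as Ceph or Swift. In my environment, glance was not planned for when the system first went live, and the default path /var/lib/glance/images was used. When space later ran out, the path was changed, and that change triggered the "incident" described here.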
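As a rough illustration of the capacity planning mentioned above, the following sketch reports how full the filesystem holding the image directory is. The function names and the 80% threshold are assumptions made for this example, not part of glance; on a real controller the path would be the configured image directory.

```python
import shutil
import tempfile


def store_usage_percent(path):
    """Return the used percentage of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo filesystems report a zero size
    return usage.used / usage.total * 100


def needs_attention(path, threshold_percent=80.0):
    """True when the filesystem holding `path` is at or above the threshold.

    The 80% default is an arbitrary value chosen for this sketch.
    """
    return store_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # The system temp directory stands in for /var/lib/glance/images here.
    demo_dir = tempfile.gettempdir()
    print(f"{demo_dir}: {store_usage_percent(demo_dir):.1f}% used")
```

Running a check like this periodically against the image directory would give an early warning before the root filesystem fills up.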
2. The scene of the incident
# Get the image ID
[root@controller ~]# glance image-list
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| ID | Name | Disk Format | Container Format | Size | Status |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| 37aaedc7-6fe6-4fc8-b110-408d166b8e51 | cirrors | qcow2 | bare | 13200896 | active |
# Get the network ID
[root@controller ~]# neutron net-list
+--------------------------------------+---------------+-------------------------------------------------------+
| id | name | subnets |
+--------------------------------------+---------------+-------------------------------------------------------+
| 99c68a93-336a-4605-aa78-343d41ca1206 | vmTest | 79cb82a1-eac1-4311-8e6d-badcabd22e44 192.168.100.0/24 |
+--------------------------------------+---------------+-------------------------------------------------------+
# Get the flavor ID
[root@controller ~]# nova flavor-list
+--------------------------------------+------------------+-----------+------+-----------+------+-------+-------------+-----------+
| ID | Name | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+--------------------------------------+------------------+-----------+------+-----------+------+-------+-------------+-----------+
| 1 | m1.large | 8192 | 100 | 10 | | 4 | 1.0 | True |
2. Create the instance
[root@controller ~]# nova boot --flavor m1.large --image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 --nic net-id=99c68a93-336a-4605-aa78-343d41ca1206 glance_image_error_test
+--------------------------------------+------------------------------------------------+
| Property | Value |
+--------------------------------------+------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | - |
| OS-EXT-SRV-ATTR:hypervisor_hostname | - |
| OS-EXT-SRV-ATTR:instance_name | instance-000001ff |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | - |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| adminPass | X39vzn4RKwrL |
| config_drive | |
| created | 2016-01-27T11:14:46Z |
| flavor | m1.large (1) |
| hostId | |
| id | b143fd7d-b1b7-49b4-ba20-7968777460bc |
| image | cirrors (37aaedc7-6fe6-4fc8-b110-408d166b8e51) |
| key_name | - |
| metadata | {} |
| name | glance_image_error_test |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | BUILD |
| tenant_id | 842ab3268a2c47e6a4b0d8774de805ae |
| updated | 2016-01-27T11:14:46Z |
| user_id | bc5e46fc4204497185ae3ca6f8b7affb |
+--------------------------------------+------------------------------------------------+
3. Instance creation fails
[root@controller ~]# nova list |grep b143fd7d-b1b7-49b4-ba20-7968777460bc
| b143fd7d-b1b7-49b4-ba20-7968777460bc | glance_image_error_test | ERROR | - | NOSTATE | | ChuangYiYuan_10_16_2_21 |
3. Getting to the root cause
1. Check the glance logs, including glance-api and glance-registry
[root@controller ~]# tail -n 2 /var/log/glance/api.log
2016-01-27 19:15:22.917 2664 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): controller
2016-01-27 19:15:22.948 2664 INFO glance.wsgi.server [89d3f8c3-9d66-4d75-b88c-eafe746f9a6b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] 10.16.2.8 - - [27/Jan/2016 19:15:22] "HEAD /v1/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51 HTTP/1.1" 200 856 0.031628
[root@controller ~]# tail -n 2 /var/log/glance/registry.log
2016-01-27 19:15:22.946 2763 INFO glance.registry.api.v1.images [cca31ae2-f412-4605-a5db-0cc0a507955b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] Successfully retrieved image 37aaedc7-6fe6-4fc8-b110-408d166b8e51
2016-01-27 19:15:22.946 2763 INFO glance.wsgi.server [cca31ae2-f412-4605-a5db-0cc0a507955b bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae - - -] 127.0.0.1 - - [27/Jan/2016 19:15:22] "GET /images/37aaedc7-6fe6-4fc8-b110-408d166b8e51 HTTP/1.1" 200 847 0.017350
# !! No anomalies found !!
2. Check the nova logs, including the nova-api, nova-scheduler, nova-conductor, and nova-compute node logs
2016-01-09 17:42:09.653 2872 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 9.578928 sec
2016-01-09 17:47:25.755 2872 WARNING nova.openstack.common.loopingcall [-] task run outlasted interval by 5.842983 sec
2016-01-27 19:14:49.762 2872 ERROR nova.scheduler.filter_scheduler [req-46235a89-6ed4-47e5-ac06-85f6dedc8985 bc5e46fc4204497185ae3ca6f8b7affb 842ab3268a2c47e6a4b0d8774de805ae]
[instance: b143fd7d-b1b7-49b4-ba20-7968777460bc] Error from last host: ChuangYiYuan_10_16_2_22 (node ChuangYiYuan_10_16_2_22): [u'Traceback (most recent call last):\n', u' Fil
e "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1328, in _build_instance\n set_access_ip=set_access_ip)\n', u' File "/usr/lib/python2.6/site-packages/nov
a/compute/manager.py", line 393, in decorated_function\n return function(self, context, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.
py", line 1740, in _spawn\n LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n', u' File "/usr/lib/python2.6/site-packages/nova/openstack/common/excutils.p
y", line 68, in __exit__\n six.reraise(self.type_, self.value, self.tb)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1737, in _spawn\n bl
ock_device_info)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 2287, in spawn\n admin_pass=admin_password)\n', u' File "/usr/lib/python2
.6/site-packages/nova/virt/libvirt/driver.py", line 2656, in _create_image\n project_id=instance[\'project_id\'])\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/li
bvirt/imagebackend.py", line 192, in cache\n *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/imagebackend.py", line 383, in create_image\n
prepare_template(target=base, max_size=size, *args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 249, in inner\n ret
urn f(*args, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/imagebackend.py", line 182, in fetch_func_sync\n fetch_func(target=target, *args, **k
wargs)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/utils.py", line 653, in fetch_image\n max_size=max_size)\n', u' File "/usr/lib/python2.6/site-packag
es/nova/virt/images.py", line 78, in fetch_to_raw\n max_size=max_size)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/images.py", line 72, in fetch\n image_serv
ice.download(context, image_id, dst_path=path)\n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 331, in download\n _reraise_translated_image_except
ion(image_id)\n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 329, in download\n image_chunks = self._client.call(context, 1, \'data\', image_id)\
n', u' File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 209, in call\n return getattr(client.images, method)(*args, **kwargs)\n', u' File "/usr/lib/pytho
n2.6/site-packages/glanceclient/v1/images.py", line 127, in data\n % urllib.quote(str(image_id)))\n', u' File "/usr/lib/python2.6/site-packages/glanceclient/common/http.py"
, line 289, in raw_request\n return self._http_request(url, method, **kwargs)\n', u' File "/usr/lib/python2.6/site-packages/glanceclient/common/http.py", line 249, in _http
_request\n raise exc.from_response(resp, body_str)\n', u'ImageNotFound: Image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 could not be found.\n']
# Both the nova-scheduler and nova-compute logs contain the error "ImageNotFound: Image 37aaedc7-6fe6-4fc8-b110-408d166b8e51 could not be found"!
3. Check the status of the glance services
[root@controller ~]# /etc/init.d/openstack-glance-api status
openstack-glance-api (pid 2222) is running...
[root@controller ~]# /etc/init.d/openstack-glance-registry status
openstack-glance-registry (pid 2694) is running...
# Status is normal
[root@controller ~]# glance image-list
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| ID | Name | Disk Format | Container Format | Size | Status |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
| 37aaedc7-6fe6-4fc8-b110-408d166b8e51 | cirrors | qcow2 | bare | 13200896 | active |
+--------------------------------------+---------------+-------------+------------------+-------------+--------+
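# glance works normally, and uploading a test image also works. So what is the cause?
4. Catching the culprit
During operations, glance's default path had been changed from /var/lib/glance/images to /data1/glance, and the images under /var/lib/glance/images had all been moved (mv) to /data1/glance. Although the data itself had been moved, each image's location metadata was still firmly recorded in glance's image_locations table, as the query shows:
mysql> select * from glance.image_locations where image_id='37aaedc7-6fe6-4fc8-b110-408d166b8e51'\G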
*************************** 1. row ***************************
id: 37
image_id: 37aaedc7-6fe6-4fc8-b110-408d166b8e51
value: file:///var/lib/glance/images/37aaedc7-6fe6-4fc8-b110-408d166b8e51 # the culprit
created_at: 2015-12-21 06:10:24
updated_at: 2015-12-21 06:10:24
deleted_at: NULL
deleted: 0
meta_data: {}
status: active
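1 row in set (0.00 sec)
The truth: the images in the original directory /var/lib/glance/images had all been moved to /data1/glance, but the database still recorded the old path. That leads to the following problem: when nova starts an instance, it first looks for the image in the instance image cache on the compute node (by default /var/lib/nova/instances/_base). If the image is not cached, nova sends a REST API request to glance to download the image locally. glance then looks for the image at the location stored in image_locations, and since no file exists at that path any more, the download fails.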
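The mismatch described above can be detected before any instance fails to boot. The sketch below lists image_locations rows whose file:// path no longer exists on disk. It is only an illustration: an in-memory sqlite3 table with a subset of the columns stands in for the real MySQL glance database, and the helper names and the /nonexistent-demo-dir path are made up for the example.

```python
import os
import sqlite3
from urllib.parse import urlparse


def find_stale_file_locations(conn):
    """Return (image_id, path) for live file:// locations missing on disk."""
    rows = conn.execute(
        "SELECT image_id, value FROM image_locations "
        "WHERE deleted = 0 ORDER BY id"
    ).fetchall()
    stale = []
    for image_id, value in rows:
        parsed = urlparse(value)
        if parsed.scheme != "file":
            continue  # other backends (http, swift, ...) are not checked here
        if not os.path.exists(parsed.path):
            stale.append((image_id, parsed.path))
    return stale


def build_demo_db():
    """Create a stand-in table holding one row with a missing path."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_locations ("
        "id INTEGER PRIMARY KEY, image_id TEXT, value TEXT, deleted INTEGER)"
    )
    image_id = "37aaedc7-6fe6-4fc8-b110-408d166b8e51"
    conn.execute(
        "INSERT INTO image_locations (image_id, value, deleted) "
        "VALUES (?, ?, 0)",
        (image_id, "file:///nonexistent-demo-dir/" + image_id),
    )
    return conn


if __name__ == "__main__":
    for image_id, path in find_stale_file_locations(build_demo_db()):
        print(f"image {image_id}: no file at {path}")
```

Against a real deployment the same SELECT would run through a MySQL client library on the controller, where the image files are actually visible.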
Fix: update glance's metadata
mysql> update glance.image_locations set value='file:///data1/glance/37aaedc7-6fe6-4fc8-b110-408d166b8e51' where image_id='37aaedc7-6fe6-4fc8-b110-408d166b8e51';
Query OK, 1 row affected (0.05 sec)
Rows matched: 1 Changed: 1 Warnings: 0
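The UPDATE above fixes a single image. When many images were moved, the same prefix rewrite could be applied to every live row, roughly as sketched here. This is a hedged illustration only: sqlite3 in memory stands in for the MySQL glance database, the function names are invented for the example, and sqlite3 uses `?` placeholders where MySQL drivers typically use `%s`. Back up the table before attempting anything like this on a real database.

```python
import sqlite3


def relocate_file_url(value, old_dir, new_dir):
    """Rewrite a file:// location from old_dir to new_dir; leave others alone."""
    old_prefix = "file://" + old_dir.rstrip("/") + "/"
    new_prefix = "file://" + new_dir.rstrip("/") + "/"
    if value.startswith(old_prefix):
        return new_prefix + value[len(old_prefix):]
    return value


def relocate_all(conn, old_dir, new_dir):
    """Update every live row stored under old_dir; return the rows changed."""
    rows = conn.execute(
        "SELECT id, value FROM image_locations WHERE deleted = 0"
    ).fetchall()
    changed = 0
    for row_id, value in rows:
        new_value = relocate_file_url(value, old_dir, new_dir)
        if new_value != value:
            conn.execute(
                "UPDATE image_locations SET value = ? WHERE id = ?",
                (new_value, row_id),
            )
            changed += 1
    conn.commit()
    return changed


def build_demo_db():
    """Create a stand-in table with one row still pointing at the old path."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image_locations ("
        "id INTEGER PRIMARY KEY, image_id TEXT, value TEXT, deleted INTEGER)"
    )
    image_id = "37aaedc7-6fe6-4fc8-b110-408d166b8e51"
    conn.execute(
        "INSERT INTO image_locations (image_id, value, deleted) "
        "VALUES (?, ?, 0)",
        (image_id, "file:///var/lib/glance/images/" + image_id),
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(relocate_all(demo, "/var/lib/glance/images", "/data1/glance"))
    print(demo.execute("SELECT value FROM image_locations").fetchone()[0])
```

Rows already marked deleted are left untouched, since their files are not expected to exist in either directory.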
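# Recreated the virtual machine; problem solved!
5. Further exploration
In glance, two tables matter most: images and image_locations. images stores the information about each image, while image_locations records the URL of the location where each image is stored.
mysql> select * from glance.images limit 2\G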
*************************** 1. row ***************************
id: 0267dcbf-9f72-4ce8-9976-7106e38ee948
name: cirror1
size: 6899532
status: deleted
is_public: 1
created_at: 2015-12-02 01:45:13
updated_at: 2015-12-02 01:46:41
deleted_at: 2015-12-02 01:46:41
deleted: 1
disk_format: qcow2
container_format: bare
checksum: 7c607794659403b970a5d0a00fb2c311
owner: 842ab3268a2c47e6a4b0d8774de805ae
min_disk: 0
min_ram: 0
protected: 0
virtual_size: NULL
*************************** 2. row ***************************
id: 2437cede-d03a-4680-b704-6d27c4d7198e
name: test1
size: 0
status: deleted
is_public: 0
created_at: 2015-12-21 09:02:41
updated_at: 2015-12-21 09:06:02
deleted_at: 2015-12-21 09:06:02
deleted: 1
disk_format: qcow2
container_format: bare
checksum: d41d8cd98f00b204e9800998ecf8427e
owner: 842ab3268a2c47e6a4b0d8774de805ae
min_disk: 0
min_ram: 0
protected: 0
virtual_size: NULL
2 rows in set (0.00 sec)
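# These rows record the information captured when each image was created. Remember what the deleted field is for? Both rows above have deleted set to 1: deleting an image only marks its row as deleted rather than removing it.
2. The image_locations table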
mysql> select * from image_locations;
+----+--------------------------------------+--------------------------------------------------------------------+---------------------+---------------------+---------------------+---------+-----------+--------+
| id | image_id | value | created_at | updated_at | deleted_at | deleted | meta_data | status |
+----+--------------------------------------+--------------------------------------------------------------------+---------------------+---------------------+---------------------+---------+-----------+--------+
| 1 | 437d860f-1c9f-4bb2-a3ca-8ec062441909 | file:///var/lib/glance/images/437d860f-1c9f-4bb2-a3ca-8ec062441909 | 2015-06-24 10:40:39 | 2015-12-01 11:52:20 | 2015-12-01 11:52:20 | 1 | {} | active |
| 2 | 5ce414b0-660a-46e1-ad0a-b842b2afc0b7 | file:///var/lib/glance/images/5ce414b0-660a-46e1-ad0a-b842b2afc0b7 | 2015-06-25 02:49:33 | 2015-06-25 02:49:33 | NULL
6. Appendix
Structure of the images table:
mysql> desc glance.images;
+------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| id | varchar(36) | NO | PRI | NULL | |
| name | varchar(255) | YES | | NULL | |
| size | bigint(20) | YES | | NULL | |
| status | varchar(30) | NO | | NULL | |
| is_public | tinyint(1) | NO | MUL | NULL | |
| created_at | datetime | NO | | NULL | |
| updated_at | datetime | YES | | NULL | |
| deleted_at | datetime | YES | | NULL | |
| deleted | tinyint(1) | NO | MUL | NULL | |
| disk_format | varchar(20) | YES | | NULL | |
| container_format | varchar(20) | YES | | NULL | |
| checksum | varchar(32) | YES | MUL | NULL | |
| owner | varchar(255) | YES | MUL | NULL | |
| min_disk | int(11) | NO | | NULL | |
| min_ram | int(11) | NO | | NULL | |
| protected | tinyint(1) | YES | | NULL | |
| virtual_size | bigint(20) | YES | | NULL | |
+------------------+--------------+------+-----+---------+-------+
17 rows in set (0.00 sec)
2. Structure of the image_locations table
mysql> desc image_locations;
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| image_id | varchar(36) | NO | MUL | NULL | |
| value | text | NO | | NULL | |
| created_at | datetime | NO | | NULL | |
| updated_at | datetime | YES | | NULL | |
| deleted_at | datetime | YES | | NULL | |
| deleted | tinyint(1) | NO | MUL | NULL | |
| meta_data | text | YES | | NULL | |
| status | varchar(30) | NO | | active | |
+------------+-------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)
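Both tables carry a deleted column and are linked through image_locations.image_id, so the images that are still in use, together with where they are stored, can be listed with a join. The sketch below assumes that relationship and uses in-memory sqlite3 tables with only a subset of the columns; the sample rows (including the location for the deleted image) are illustrative data, and the helper names are invented for the example.

```python
import sqlite3


def live_image_locations(conn):
    """Return (id, name, value) for images and locations not marked deleted."""
    return conn.execute(
        "SELECT i.id, i.name, l.value "
        "FROM images AS i "
        "JOIN image_locations AS l ON l.image_id = i.id "
        "WHERE i.deleted = 0 AND l.deleted = 0 "
        "ORDER BY i.name"
    ).fetchall()


def build_demo_db():
    """Create stand-in tables with one live image and one deleted image."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        "id TEXT PRIMARY KEY, name TEXT, status TEXT, deleted INTEGER)"
    )
    conn.execute(
        "CREATE TABLE image_locations ("
        "id INTEGER PRIMARY KEY, image_id TEXT, value TEXT, deleted INTEGER)"
    )
    live = "37aaedc7-6fe6-4fc8-b110-408d166b8e51"
    gone = "0267dcbf-9f72-4ce8-9976-7106e38ee948"
    conn.executemany(
        "INSERT INTO images VALUES (?, ?, ?, ?)",
        [(live, "cirrors", "active", 0), (gone, "cirror1", "deleted", 1)],
    )
    conn.executemany(
        "INSERT INTO image_locations (image_id, value, deleted) "
        "VALUES (?, ?, ?)",
        [
            (live, "file:///data1/glance/" + live, 0),
            (gone, "file:///var/lib/glance/images/" + gone, 1),
        ],
    )
    return conn


if __name__ == "__main__":
    for image_id, name, value in live_image_locations(build_demo_db()):
        print(image_id, name, value)
```

Filtering on deleted in both tables keeps soft-deleted images and their old locations out of the result.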