丹调生活 posted on 2015-10-13 12:39:55

CloudStack troubleshooting updates

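  I use CloudStack regularly at work. Here are some of the faults I have run into and how I tracked them down; I hope they help.
  1. Adding a host fails
  Symptom 1: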
    # tail -f /var/log/cloudstack/management/management-server.log
    2014-02-28 11:05:32,172 DEBUG (catalina-exec-22:null) Timeout, to wait for the host connecting to mgt svr, assuming it is failed
    2014-02-28 11:05:32,205 WARN (catalina-exec-22:null) Unable to find the server resources at http://192.168.150.250
    2014-02-28 11:05:32,220 INFO (catalina-exec-22:null) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
    2014-02-28 11:05:32,220 WARN (catalina-exec-22:null) Exception:
    com.cloud.exception.DiscoveryException: Unable to add the host
        at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:798)
        at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:590)
        at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:143)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
        at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
        at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
        at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
        at com.cloud.api.ApiServlet.doPost(ApiServlet.java:71)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
        at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
    2014-02-28 11:05:32,222 INFO (catalina-exec-22:null) Unable to add the host
    2014-02-28 11:05:32,224 DEBUG (catalina-exec-22:null) ===END=== 192.168.151.234 -- POST command=addHost&response=json&sessionkey=GEI3EIOONoV5RG9Mcs4xcdx31oc%3D
  Symptom 2:
    # /etc/init.d/cloudstack-agent status    ## check the cloudstack-agent service status on the KVM host
    cloudstack-agent dead but subsys locked
  Symptom 3:
    # cat /var/log/cloudstack/agent/agent.log    ## look for errors in agent.log on the KVM host
    ERROR (main:null) Unable to start agent: NO HVM support on this machine, please make sure: 1. VT/SVM is supported by your CPU, or is enabled in BIOS. 2. kvm modules are loaded (kvm, kvm_amd|kvm_intel)
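  Solution:
  1. Install the virtualization package groups (they are required):
    # yum -y groupinstall 'Virtualization' 'Virtualization Client' 'Virtualization Platform' 'Virtualization Tools'
  2. Confirm that the kvm modules are loaded correctly:
    # lsmod | grep kvm
    kvm_intel 52570 0
    kvm 314739 1 kvm_intel
  If nothing is printed, load the kvm module for your CPU:
    # modprobe kvm_intel   ## Intel platform
    # modprobe kvm_amd     ## AMD platform
  3. Add the host again.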
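  Choosing between kvm_intel and kvm_amd can be scripted by looking at the CPU flags. The following is only a minimal sketch: the function name detect_kvm_module is made up for illustration, and it assumes the usual Linux convention that /proc/cpuinfo lists the vmx flag for Intel VT-x and svm for AMD-V.

```shell
#!/bin/sh
# Print the KVM module that matches the CPU flags in a cpuinfo-style file.
# Assumption: "vmx" means Intel VT-x, "svm" means AMD-V.
# Returns 1 (and prints nothing) if neither flag is present.
detect_kvm_module() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw vmx "$cpuinfo"; then
        echo kvm_intel
    elif grep -qw svm "$cpuinfo"; then
        echo kvm_amd
    else
        return 1
    fi
}

# Example use (as root on the KVM host):
#   if module=$(detect_kvm_module); then
#       modprobe "$module"
#   else
#       echo "no vmx/svm flag found; check the BIOS virtualization setting" >&2
#   fi
```

  If the function finds neither flag, loading a module will not help; that case corresponds to point 1 of the agent.log message (VT/SVM unsupported or disabled in BIOS).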
  
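  Bonus:
  Errors while adding a host come in all shapes and sizes, and the Java error output is, well... Here is a small trick:
  When adding a host fails and the log gives no clear reason, you can run the host-setup command by hand on the agent. The exact command can be taken from the management server log: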
    # cat /var/log/cloudstack/management/management-server.log | grep cloudstack-setup-agent
    2014-03-13 09:56:17,758 DEBUG (catalina-exec-11:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
    2014-03-13 09:56:52,775 DEBUG (catalina-exec-11:null) cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
    2014-03-13 11:12:22,455 DEBUG (catalina-exec-12:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
    2014-03-13 11:12:57,267 DEBUG (catalina-exec-12:null) cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
  In my example above, that gives the following command, which I then run on the agent:
    # cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
    Starting to configure your system:
    Configure Cgroup ...
    Configure SElinux ...
    Configure Network ...
    Configure Libvirt ...
    Configure Firewall ...
    Configure Nfs ...
    Configure cloudAgent ...
    CloudStack Agent setup is done!
    #
  If anything fails during this run, it is easy to tell which step the problem is in.
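  Rather than copying the command out of the grep output by hand, the most recent one can be pulled from the log directly. This is only a sketch: the function name last_setup_agent_cmd is my own, and it assumes the management log writes the command after the text "Executing cmd: " as in the excerpt above.

```shell
#!/bin/sh
# Print the most recent cloudstack-setup-agent command recorded in the
# management server log. Assumes lines of the form
#   <timestamp> DEBUG (...) Executing cmd: cloudstack-setup-agent ...
last_setup_agent_cmd() {
    log="${1:-/var/log/cloudstack/management/management-server.log}"
    grep 'Executing cmd: cloudstack-setup-agent' "$log" \
        | tail -n 1 \
        | sed 's/^.*Executing cmd: //'
}

# Example use on the management server:
#   last_setup_agent_cmd
# then paste the printed command into a root shell on the agent host.
```

  The function prints nothing if no host has been added since the log was last rotated; in that case check the older log files.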
  The options of the cloudstack-setup-agent command used above are listed below; adjust them to your own environment:
    # cloudstack-setup-agent -h
    Usage: cloudstack-setup-agent

    Options:
      -h, --help            show this help message and exit
      -a                    auto mode
      -m MGT, --host=MGT    Management server hostname or IP-Address
      -z ZONE, --zone=ZONE  zone id
      -p POD, --pod=POD     pod id
      -c CLUSTER, --cluster=CLUSTER
                            cluster id
      -g GUID, --guid=GUID  guid
      --pubNic=PUBNIC       Public traffic interface
      --prvNic=PRVNIC       Private traffic interface
      --guestNic=GUESTNIC   Guest traffic interface
  The actual values for these options can be read from /etc/cloudstack/agent/agent.properties on the agent host:
    # cat /etc/cloudstack/agent/agent.properties
    #Storage
    #Thu Mar 13 11:23:48 CST 2014
    guest.network.device=cloud0
    workers=5
    private.network.device=cloud0
    port=8250
    resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
    pod=3
    zone=3
    guid=0d21492f-9565-329d-9a26-0c85f6d39d12
    public.network.device=cloud0
    cluster=3
    local.storage.uuid=ac70655b-f452-4d14-a1a1-2a5eebc4bb01
    domr.scripts.dir=scripts/network/domr/kvm
    LibvirtComputingResource.id=0
    host=192.168.153.28
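  If the management log no longer contains the command, it can be rebuilt from this file. A rough sketch follows; the helper names prop and build_setup_agent_cmd are hypothetical, and the mapping from property keys to options (host to -m, public.network.device to --pubNic, and so on) is my reading of the example values above, not something the CloudStack documentation is quoted on here.

```shell
#!/bin/sh
# Read the value of one key from a simple key=value properties file.
# Dots in the key are escaped so they match literally.
prop() {
    key=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed -n "s/^${key}=//p" "$2" | tail -n 1
}

# Print (do not run) a cloudstack-setup-agent command built from
# agent.properties. The key-to-option mapping is an assumption based on
# the sample file and sample command in this post.
build_setup_agent_cmd() {
    f="${1:-/etc/cloudstack/agent/agent.properties}"
    echo "cloudstack-setup-agent -m $(prop host "$f")" \
         "-z $(prop zone "$f") -p $(prop pod "$f") -c $(prop cluster "$f")" \
         "-g $(prop guid "$f") -a" \
         "--pubNic=$(prop public.network.device "$f")" \
         "--prvNic=$(prop private.network.device "$f")" \
         "--guestNic=$(prop guest.network.device "$f")"
}

# Example use on the agent host:
#   build_setup_agent_cmd
```

  The function only prints the command, so it can be reviewed and edited before it is run on the agent.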
  More to come...
  This article comes from the “systems” blog; please keep this attribution: http://systems.blog.iyunv.com/2500547/1375332