Ansible troubleshooting
Problem: the management node cannot manage the managed host 172.16.1.8.

# ansible 172.16.1.8 -m command -a "w"
172.16.1.8 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ",
    "unreachable": true
}

Step 1: Investigate. Running the connection with full verbosity shows that reading the header from the managed host fails.

# ansible 172.16.1.8 -m ping -vvvv
Using /etc/ansible/ansible.cfg as config file
Loading callback plugin minimal of type stdout, v2.0 from /usr/lib/python2.6/site-packages/ansible/plugins/callback/__init__.pyc
META: ran handlers
Using module file /usr/lib/python2.6/site-packages/ansible/modules/system/ping.py
<172.16.1.8> ESTABLISH SSH CONNECTION FOR USER: None
<172.16.1.8> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/923ebeb605 172.16.1.8 '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<172.16.1.8> (255, '', 'OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_request_forwards: requesting forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 22508\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 12\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Control master terminated unexpectedly\r\n')
172.16.1.8 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_request_forwards: requesting forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 22508\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 12\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Control master terminated unexpectedly\r\n",
    "unreachable": true
}

Roughly, this error means that the SSH client failed to read the packet header while connecting ("mux_client_read_packet: read header failed: Broken pipe").

Step 2: Check the sshd processes on the host 172.16.1.8.

# ps -ef|grep ssh
root     21204     1  0 15:08 ?        00:00:00 sshd: root@pts/1
root     21272     1  0 15:14 ?        00:00:00 sshd: root@notty
root     21818     1  0 15:43 ?        00:00:00 /usr/sbin/sshd
root     21845 21206  0 15:46 pts/1    00:00:00 grep ssh
# kill 21272
# kill 21272
-bash: kill: (21272) - No such process

Explanation: the process

root     21272     1  0 15:14 ?        00:00:00 sshd: root@notty

had hung and was blocking connection requests. Kill this process (the second kill reporting "No such process" confirms it is gone), then go back to the management node and test the connection again.

Step 3: Check connectivity from the management node.

# ansible 172.16.1.8 -m ping
172.16.1.8 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Note: "pong" means the connection is working.

Step 4: Run a command test from the management node.

# ansible oldboy -m command -a "w"
172.16.1.8 | SUCCESS | rc=0 >>
 15:47:04 up  7:28,  3 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     -                31Aug17  8:45   0.00s  0.00s -bash
root     pts/0    m01              15:47    0.00s  0.11s  0.00s /bin/sh -c /usr
root     pts/1    10.0.0.253       31Aug17 23.00s  0.06s  0.06s -bash

172.16.1.31 | SUCCESS | rc=0 >>
 15:47:05 up 3 days,  4:14,  2 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.0.0.253       08:08   15:37   0.02s  0.02s -bash
root     pts/2    m01              15:47    1.00s  0.09s  0.00s /bin/sh -c /usr

172.16.1.41 | SUCCESS | rc=0 >>
 15:47:05 up 2 days, 22:58,  3 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     tty1     -                09:21    6:24m  0.00s  0.00s -bash
root     pts/0    10.0.0.253       09:23   10:11   0.02s  0.02s -bash
root     pts/1    m01              15:47    1.00s  0.18s  0.00s /bin/sh -c /usr

Note: the management node now reports all hosts normally, so the problem is solved.
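
The manual search in Step 2 can be sketched as a small shell helper. This is only an illustration, not part of the original fix: the function name find_notty_sshd_pids is made up here, and it assumes `ps -ef` output in which the PID is the second column and a session without a terminal shows up as "sshd: <user>@notty".

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): read `ps -ef` style
# lines on stdin and print the PID (column 2) of every sshd session
# process without a terminal, i.e. whose command is "sshd: <user>@notty".
find_notty_sshd_pids() {
    awk '/sshd: [^ ]+@notty/ { print $2 }'
}

# Possible use on the managed host (review the list before killing anything):
#   ps -ef | find_notty_sshd_pids
#   for pid in $(ps -ef | find_notty_sshd_pids); do kill "$pid"; done
```

Keep in mind that every non-interactive SSH session (a running Ansible task, scp, and so on) also appears as @notty, so killing all matches would interrupt those sessions too. Checking the start time of each process first, as was done by eye above, is the safer route.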