2016-04-14 12:01:55,025 - org.jboss.netty.channel.DefaultChannelPipeline -59679839 [New I/O worker #29] WARN null - [DUBBO] An exception was thrown by a user handler while handling an exception event ([id: 0x5f980c11, /xxx:55386 => /xxx:6666] EXCEPTION: com.alibaba.dubbo.remoting.ExecutionException:>
com.alibaba.dubbo.remoting.ExecutionException:> at com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.caught(AllChannelHandler.java:67)
at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
at com.alibaba.dubbo.remoting.transport.AbstractPeer.caught(AbstractPeer.java:127)
at com.alibaba.dubbo.remoting.transport.netty.NettyHandler.exceptionCaught(NettyHandler.java:112)
at com.alibaba.dubbo.remoting.transport.netty.NettyCodecAdapter$InternalDecoder.exceptionCaught(NettyCodecAdapter.java:165)
at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at com.alibaba.dubbo.remoting.transport.netty.NettyCodecAdapter$InternalDecoder.messageReceived(NettyCodecAdapter.java:148)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
at com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.caught(AllChannelHandler.java:65)
... 19 more
查看output日志,发现其中有这么一句。
SEVERE: The web application [/xxx] appears to have started a thread named [DubboResponseTimeoutScanTimer] but has failed to stop it. This is very likely to create a memory leak.
根据日志提示貌似有内存泄露,以前确实还没有碰到过这个错误,一片迷茫。重新启动后,先用命令jstat -gc xxx 1000 30查看java 进程的gc情况,发现在30秒的世界内minor gc了n次,随怀疑年轻代内存配置少了,查看个区域内存的配置参数如下:
-Xms10g -Xmx10g -XX:PermSize=1g -XX:MaxPermSize=2g -Xshare:off -Xmn1024m
按照年轻代为堆内存为百分之三的原则修改为-Xmn4g,重新启动观察之后mimor gc的频率确实有所下降,测试大约过了3小时候之后又反馈tomcat down掉了,继续分析启动参数配置的时候发现了这么一句-XX:-+DisableExplicitGC,显示的禁止了System.gc(),但是使用了java.nio的大量框架中使用System.gc()来执行gc期通过full gc来强迫已经无用的DirectByteBuffer对象释放掉它们关联的native memory,如果禁用会导致OOM,随即怀疑是否是这个参数引发的问题,在启动参数中去掉它。
为了防止再次出现异常的时候能更加详细的分析堆内存的使用情况,在启动参数中添加了-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/logs/java/,当tomcat down的时候让输出堆内存文件,一边也启动jvisualvm工具来实时的监控内存各个线程的使用情况。
数据库连接池
继续使用压测工具来压测,在压测的过程中发现名为com.mchange.v2.resourcepool.ssync.ThreadPoolAsynchronousRunner$PoolThred-#xxx的线程不断的增长,并且后台tomcat报错如下:
2016-04-13 16:55:15,175 - com.alibaba.dubbo.common.threadpool.support.AbortPolicyWithReport -83649035 [New I/O worker #27] WARN - [DUBBO] Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-xxx:6666, Pool>
2016-04-13 16:55:15,176 - com.alibaba.dubbo.common.threadpool.support.AbortPolicyWithReport -83649036 [New I/O worker #27] WARN - [DUBBO] Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-xxx:6666, Pool>
2016-04-13 16:55:15,177 - org.jboss.netty.channel.DefaultChannelPipeline -83649037 [New I/O worker #27] WARN - [DUBBO] An exception was thrown by a user handler while handling an exception event ([id: 0x2f345d45, /192.168.57.20:36475 => /xxx:6666] EXCEPTION: com.alibaba.dubbo.remoting.ExecutionException:>
com.alibaba.dubbo.remoting.ExecutionException:> at com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.caught(AllChannelHandler.java:67)
at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
at com.alibaba.dubbo.remoting.transport.AbstractChannelHandlerDelegate.caught(AbstractChannelHandlerDelegate.java:44)
at com.alibaba.dubbo.remoting.transport.AbstractPeer.caught(AbstractPeer.java:127)
at com.alibaba.dubbo.remoting.transport.netty.NettyHandler.exceptionCaught(NettyHandler.java:112)
at com.alibaba.dubbo.remoting.transport.netty.NettyCodecAdapter$InternalDecoder.exceptionCaught(NettyCodecAdapter.java:165)
at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at com.alibaba.dubbo.remoting.transport.netty.NettyCodecAdapter$InternalDecoder.messageReceived(NettyCodecAdapter.java:148)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Thread pool is EXHAUSTED! Thread Name: DubboServerHandler-xxx:6666, Pool> at com.alibaba.dubbo.common.threadpool.support.AbortPolicyWithReport.rejectedExecution(AbortPolicyWithReport.java:53)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at com.alibaba.dubbo.remoting.transport.dispatcher.all.AllChannelHandler.caught(AllChannelHandler.java:65)
... 19 more
根据这些信息随怀疑数据库连接池有问题,为了更好的监控连接池的使用,因此前期使用c3p0也会出现的一些问题,所以我们决定将数据库连接池替换成druid,已经在别的项目中使用测试过,因此非常快速的更换投产。投产后继续用压测工具来测试,根据druid的后台监控页面发现(项目地址/druid/index.html),每次前端掉用一次数据库连接就加一次,执行完成之后数据库连接并没有释放。如下图红色区域,我们将数据库连接池调整成1000,不一会就占满了。
{:.center}
一个脚本引发的血案
根据这些信息判断出,数据库执行sql后肯定没有释放数据库连接,导致数据库连接池用满后,后续的线程无法执行,检查代码之后发现果然有问题,请看下方代码,我们最先使用的是SqlSessionFactory,如果使用SqlSessionFactory,在执行完sql后必须要执行session.close()来关闭连接,才会被连接池重新回收。
public>br/>@Resource
private SqlSessionFactory coreSqlSessionFactory;
protected SqlSession getSession() {
return coreSqlSessionFactory.openSession();
}
}