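1. WebCache

A WebCache (web cache) is a caching technology that temporarily stores web documents such as HTML pages, images, and other static resources. (This is not absolute: dynamic pages can be cached too, but once stored locally they are served as static files.) Caching reduces bandwidth consumption and the load on backend servers. A WebCache is usually also a reverse proxy: it answers user requests from its cache when it can, and when no local copy exists it proxies the request to a backend host. (These are my own study notes. I only started working with Varnish yesterday, so please bear with any mistakes.)
WebCaches come in forward and reverse varieties. Forward WebCaches are less commonly used, so this article focuses on reverse WebCaches.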
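Why WebCaches work:
Program access patterns exhibit locality, which comes in two kinds: temporal locality and spatial locality.
(1) Temporal locality: data that was accessed recently is likely to be accessed again soon. Within any given period, most user requests go to a small set of hot data (data that is accessed frequently). For example, when a news site publishes a major story, that story is requested over and over.
(2) Spatial locality: data located near recently accessed data is likely to be accessed as well.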
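WebCache freshness checking: data can change, so cached content has to be checked for freshness.
Expiration dates:
A site's content can change before the cache lifetime assigned to it has elapsed. This happens all the time on e-commerce sites, so cached data has to be checked for freshness. HTTP offers two mechanisms:
HTTP/1.0: Expires
For example: Expires: Sat, 20 May 2017 07:49:55 GMT. Until that time arrives, the cache server does not go back to the backend server. The drawback is that this is an absolute time, so it is only reliable when the clocks of the cache and the origin agree, which cannot be assumed across machines and regions.
HTTP/1.1: Cache-Control: max-age
For example: Cache-Control: max-age=600. This was introduced to fix the weakness of the HTTP/1.0 freshness mechanism: it expresses the cache lifetime as a relative time in seconds.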
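The difference between the two mechanisms can be sketched in a few lines of Python. This is only an illustration, not how any particular cache implements it: the function names `freshness_lifetime` and `is_fresh` and the plain-dict header representation are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, received_at):
    """Return the freshness lifetime in seconds, or None if none is declared.

    Cache-Control: max-age (relative) takes precedence over Expires (absolute).
    """
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return float(value)
    expires = headers.get("Expires")
    if expires:
        # Expires is an absolute date, so the result depends on the local clock.
        return (parsedate_to_datetime(expires) - received_at).total_seconds()
    return None


def is_fresh(headers, received_at, now):
    """A stored response is fresh while its age is below its freshness lifetime."""
    lifetime = freshness_lifetime(headers, received_at)
    if lifetime is None:
        return False
    return (now - received_at).total_seconds() < lifetime


received = datetime(2017, 5, 20, 7, 39, 55, tzinfo=timezone.utc)
by_expires = {"Expires": "Sat, 20 May 2017 07:49:55 GMT"}
by_max_age = {"Cache-Control": "max-age=600"}
print(freshness_lifetime(by_expires, received))  # 600.0
print(freshness_lifetime(by_max_age, received))  # 600.0
print(is_fresh(by_max_age, received, received + timedelta(seconds=601)))  # False
```

If `received_at` came from a clock that runs ten minutes fast, the Expires case would report a lifetime of 0 seconds, while the max-age case would be unaffected.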
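Cache validation:
If the original content has not changed, the server responds with headers only (no body) and status 304 (Not Modified).
If the original content has changed, the server responds normally with status 200.
If the original content no longer exists, the server responds with 404, and the cache object should be deleted from the cache.
Conditional request headers:
If-Modified-Since: validates by timestamp. If the resource on the backend server has not been modified since the given time, the cached copy continues to be used; otherwise the backend returns the new content.
If-None-Match: validates by ETag. If the ETag held by the cache still matches the backend's current ETag, the data is unchanged and the cached copy continues to be used; if it does not match, the backend responds with the new data.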
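The validation rules above can be sketched from the backend's point of view as follows. This is a simplified illustration: the `revalidate` function and the `resource` dictionary layout are hypothetical, and weak versus strong ETag comparison is ignored.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def revalidate(request_headers, resource):
    """Return the status code for a conditional GET.

    `resource` is None when the content no longer exists, otherwise a dict
    with the current "etag" and "last_modified" (an aware datetime).
    """
    if resource is None:
        return 404  # the cache should drop its stored object
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When present, If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or resource["etag"] in candidates:
            return 304
        return 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        if resource["last_modified"] <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200


page = {
    "etag": '"v2"',
    "last_modified": datetime(2017, 5, 20, 7, 0, 0, tzinfo=timezone.utc),
}
print(revalidate({"If-None-Match": '"v2"'}, page))  # 304
print(revalidate({"If-None-Match": '"v1"'}, page))  # 200
print(revalidate({"If-Modified-Since": "Sat, 20 May 2017 07:00:00 GMT"}, page))  # 304
```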
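WebCache cache-control directives:
    Cache-Control = "Cache-Control" ":" 1#cache-directive
    cache-directive = cache-request-directive
        | cache-response-directive
    cache-request-directive =                    // cache directives in request messages
          "no-cache"                             // do not answer from the cache as-is; fetch from the web server now
        | "no-store"                             // do not store; the message may contain sensitive user information (for example, it must not end up on backups)
        | "max-age" "=" delta-seconds            // accept only objects whose Age is below max-age and that have not expired
        | "max-stale" [ "=" delta-seconds ]      // accept expired objects, as long as they have been expired for less than max-stale
        | "min-fresh" "=" delta-seconds          // accept objects whose freshness lifetime exceeds their current Age plus min-fresh
        | "only-if-cached"                       // return a copy only if the cache already holds one
        | cache-extension
    cache-response-directive =
          "public"                               // the cached content may be used to answer any user
        | "private" [ "=" <"> 1#field-name <"> ] // the cached content may only be used to answer the user who originally requested it
        | "no-cache" [ "=" <"> 1#field-name <"> ] // may be stored, but must be validated with the web server before being returned to a client
        | "no-store"                             // must not be stored on the cache server; may contain sensitive user information
        | "no-transform"                         // the cache must not transform the content
        | "max-age" "=" delta-seconds            // freshness lifetime of the object in this response
        | "s-maxage" "=" delta-seconds           // freshness lifetime of the object in this response for shared caches; takes precedence over max-age there
        | cache-extension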
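To make the grammar concrete, here is a minimal Python sketch of how a shared cache might read a response's Cache-Control header. The helper names are hypothetical, and the parser is deliberately naive: it splits on commas, so it does not handle quoted field-name lists such as `private="a, b"`.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(directives):
    """A shared cache must not store responses marked no-store or private."""
    return "no-store" not in directives and "private" not in directives


def shared_cache_ttl(directives):
    """For a shared cache, s-maxage wins over max-age; None if neither is set."""
    for name in ("s-maxage", "max-age"):
        arg = directives.get(name)
        if arg is not None and arg.isdigit():
            return int(arg)
    return None


parsed = parse_cache_control("public, max-age=600, s-maxage=60")
print(parsed)                    # {'public': None, 'max-age': '600', 's-maxage': '60'}
print(shared_cache_ttl(parsed))  # 60
```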
Common WebCache software:
Name | Operating system | Forward mode | Reverse mode | License
Untangle | Linux | Yes | Yes | Proprietary
ApplianSys CACHEbox | Linux | Yes | Yes | Proprietary
aiScaler Dynamic Cache Control | Linux | Yes | Yes | Proprietary
Nginx | Linux, BSD variants, OS X, Solaris, AIX, HP-UX, other *nix flavors | No | Yes | 2-clause BSD-like
Varnish | Linux, Unix | No (possible with a VMOD) | Yes | BSD
Traffic Server | Linux, Unix | Yes | Yes | Apache License 2.0
Squid | Linux, Unix, Windows | Yes | Yes | GNU General Public License
Blue Coat ProxySG | SGOS | Yes | Yes | Proprietary
WinGate | Windows | Yes | Yes | Proprietary / Free for 3 users
Microsoft Forefront Threat Management Gateway | Windows | Yes | Yes | Proprietary
Polipo | Windows, OS X, Linux, OpenWrt, FreeBSD | No | Yes | MIT License
Apache HTTP Server | Windows, OS X, Linux, Unix, FreeBSD, Solaris, Novell NetWare, OS/2, TPF, OpenVMS and eComStation | No | Yes | Apache License 2.0

2. Varnish architecture
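(1) Management process
CLI interface: the command-line interface. The web interface is currently a paid feature and telnet sends everything as plain text, so the CLI interface is the practical choice.
The management process compiles VCL and applies new configurations, monitors Varnish, initializes Varnish, and provides the CLI.
(2) Child/cache process
The child/cache process runs several kinds of threads:
Acceptor: accepts new connection requests;
Worker: processes and responds to user requests;
Expiry: removes expired cache objects from the cache.
(3) Log
Shared memory log: logs are kept in shared memory, typically about 90MB in size, split into two parts: the first holds counters and the second holds data about client requests.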
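Storage backends supported by Varnish:
malloc[,size]: cache in memory
    VARNISH_STORAGE="malloc,64M"
file[,path[,size[,granularity]]]: cache in a file
    VARNISH_STORAGE="file,${VARNISH_STORAGE_FILE},${VARNISH_STORAGE_SIZE}"
persistent,path,size: with the first two, the cache is lost on restart; persistent keeps the cache across restarts, but it is still under development.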
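Varnish's state engine:
The caching policy configured in VCL takes effect in the state engines.
The figure for this section (not preserved here) shows the cache-control rules in Varnish. Every request enters vcl_recv and is then dispatched to a state engine; different VCL rules send it to different state engines, so VCL rules let us control how user requests are handled. Some common scenarios:
On a cache miss: vcl_recv -> vcl_hash -> vcl_miss -> vcl_fetch -> vcl_deliver
On a cache hit: vcl_recv -> vcl_hash -> vcl_hit -> vcl_deliver
Piping directly to the backend server: vcl_recv -> vcl_pipe
Handing over to the backend server through vcl_pass: vcl_recv -> vcl_pass -> vcl_fetch -> vcl_deliver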
3. Installing and configuring Varnish

Environment:
varnish_server | varnish3.0 | 172.18.4.70 | CentOS7
backend_server | httpd+php | 172.18.4.71 | CentOS7
Official Varnish yum repository: https://repo.varnish-cache.org/
Install:
    # yum install varnish gcc -y
Configure the startup environment:
    # vim /etc/sysconfig/varnish
    # Maximum number of open files
    NFILES=131072
    # Amount of locked memory
    MEMLOCK=82000
    # 1 = load the VCL file again when the varnish service is reloaded
    RELOAD_VCL=1
    # Path of the VCL file read by default
    VARNISH_VCL_CONF=/etc/varnish/default.vcl
    # Port Varnish listens on
    VARNISH_LISTEN_PORT=80
    # Address the management interface listens on
    VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
    # Port the management interface listens on
    VARNISH_ADMIN_LISTEN_PORT=6082
    # Secret file for the management interface
    VARNISH_SECRET_FILE=/etc/varnish/secret
    # Minimum number of threads, started when the varnish process starts
    VARNISH_MIN_THREADS=50
    # Maximum number of threads; the total (thread pools x max threads) should generally stay below 5000
    VARNISH_MAX_THREADS=1000
    # Thread timeout
    VARNISH_THREAD_TIMEOUT=120
    # Cache file; Varnish stores the whole cache in this single file
    VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin
    # Cache size
    VARNISH_STORAGE_SIZE=64M
    # Storage type; here, in memory
    VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
Start the service:
    # systemctl start varnish
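4. VCL configuration

(1) Common variables
1. Available in any state engine:
now, .host, .port
2. Available while handling the request:
client.ip, server.hostname, server.ip, server.port
req.request: the request method
req.url: the requested URL
req.proto: the HTTP protocol version
req.backend: the backend host that will serve this request
req.backend.healthy: health state of the backend host
req.http.HEADER: the named header of the request message
req.can_gzip: whether the client accepts gzip-compressed responses
req.restarts: number of times this request has been restarted
3. Available before Varnish sends the request to the backend host:
bereq.request: the request method
bereq.url: the request URL
bereq.proto: the request protocol
bereq.http.HEADER: the named request header
bereq.connect_timeout: how long to wait for a connection to the backend
4. Available after the backend's response reaches this host (Varnish) and before it is placed in the cache:
beresp.do_stream: stream the response
beresp.do_gzip: compress the object before storing it in the cache
beresp.do_gunzip: decompress the object before storing it in the cache
beresp.http.HEADER: the named response header
beresp.proto: the protocol
beresp.status: the response status code
beresp.response: the reason phrase of the response
beresp.ttl: the object's remaining time to live, in seconds
beresp.backend.name: name of the backend this response came from
beresp.backend.ip: IP address of the backend host
beresp.backend.port: port of the backend host
beresp.storage: forces the object to be saved to a particular storage backend
5. Available once the object has been stored in the cache:
obj.proto: the protocol
obj.status: the status code
obj.response: the reason phrase
obj.ttl: the time to live
obj.hits: the hit count
obj.http.HEADER: the named HTTP header
6. Available when deciding what to hash for the cache key:
req.hash: the hash key for the request (in Varnish 3 VCL the key is built with hash_data())
7. Available while preparing the response for the client:
resp.proto: the protocol
resp.status: the status code
resp.response: the reason phrase
resp.http.HEADER: the named HTTP header
(2) State engines in which each variable is available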
5. Common examples

To make the examples easier to follow, first a word about Varnish's command-line tool, varnishadm.
Run varnishadm directly from the shell:
    # varnishadm
    200
    -----------------------------
    Varnish Cache CLI 1.0
    -----------------------------
    Linux,2.6.32-573.el6.x86_64,x86_64,-sfile,-smalloc,-hcritbit
    varnish-3.0.6 revision 1899836
    Type 'help' for command list.
    Type 'quit' to close CLI session.
    varnish>
You will see the prompt shown above; use the help command to get help.
    varnish> help
    200
    help [command]
    ping [timestamp]
    auth response
    quit
    banner
    status
    start
    stop
    vcl.load <configname> <filename>
    vcl.inline <configname> <quoted_VCLstring>
    vcl.use <configname>
    vcl.discard <configname>
    vcl.list
    vcl.show <configname>
    param.show [-l] [<param>]
    param.set <param> <value>
    panic.show
    panic.clear
    storage.list
    backend.list
    backend.set_health matcher state
    ban.url <regexp>
    ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
    ban.list
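Frequently used subcommands:
vcl.list: lists the loaded configurations and their state
vcl.load: loads (compiles) a new configuration
vcl.show: shows the contents of a configuration
ping: checks that Varnish is responding
vcl.use: switches to a new configuration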
(1) Example 1: add an HTTP header that tells the client whether the response was served from the cache
Set the backend host:
    # vim /etc/varnish/default.vcl
    backend default {
        .host = "172.18.4.71";
        .port = "80";
    }
Add the VCL statements:
    # vim /etc/varnish/default.vcl
    sub vcl_deliver {
        # obj.hits counts how many times this object has been served from the cache
        if (obj.hits > 0) {
            set resp.http.X-Cache = "HIT";
        } else {
            set resp.http.X-Cache = "MISS";
        }
    }
    sub vcl_hit {
        return (deliver);
    }
Reload the configuration with the varnishadm command-line tool:
    varnish> vcl.list
    200
    active 0 boot
    varnish> vcl.load test1 /etc/varnish/default.vcl
    200
    VCL compiled.
    varnish> vcl.use test1
    200
    varnish> vcl.list
    200
    available 0 boot
    active 0 test1
Access the site and test.
(2) Example 2: set an HTTP header so the backend host knows the real client IP address
Edit the configuration file:
    # vim /etc/varnish/default.vcl
    sub vcl_recv {
        # Only on the first pass, so a restarted request does not append the address twice
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For =
                    req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
    }
Reload the configuration:
    varnish> vcl.load test2 /etc/varnish/default.vcl
    200
    VCL compiled.
    varnish> vcl.use test2
    200
    varnish> vcl.list
    200
    available 0 boot
    active 0 test2
Edit the configuration file of the backend web server:
    # vim /etc/httpd/conf/httpd.conf
    LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    # systemctl reload httpd
Access the site and check the httpd log:
    172.18.250.172 - - [23/May/2016:22:40:07 +0800] "GET / HTTP/1.1" 200 24 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36"
The logged address is that of the real client, not that of the Varnish server (172.18.4.70).
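The header handling in this example's vcl_recv can be mirrored in a few lines of Python, which may help when reasoning about what the backend will see behind more than one proxy. The function name is hypothetical, and unlike VCL this sketch treats header names as case-sensitive.

```python
def append_forwarded_for(headers, client_ip):
    """Return a copy of `headers` with client_ip appended to X-Forwarded-For."""
    updated = dict(headers)
    existing = updated.get("X-Forwarded-For")
    # Append to an existing chain, otherwise start a new one.
    updated["X-Forwarded-For"] = f"{existing}, {client_ip}" if existing else client_ip
    return updated


direct = append_forwarded_for({}, "172.18.250.172")
chained = append_forwarded_for({"X-Forwarded-For": "10.0.0.9"}, "172.18.250.172")
print(direct["X-Forwarded-For"])   # 172.18.250.172
print(chained["X-Forwarded-For"])  # 10.0.0.9, 172.18.250.172
```

With the LogFormat shown above, httpd logs the whole header value, so a request that already passed through another proxy would appear with both addresses.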
(3) Example 3: remove an object from the cache
Edit the configuration file:
    # Clients allowed to send PURGE requests
    acl purgers {
        "127.0.0.1";
        "172.18.0.0"/16;
    }
    sub vcl_recv {
        if (req.request == "PURGE") {
            if (!client.ip ~ purgers) {
                error 405 "Method not allowed";
            }
            return (lookup);
        }
    }
    sub vcl_hit {
        if (req.request == "PURGE") {
            purge;
            error 200 "Purged";
        }
    }
    sub vcl_miss {
        if (req.request == "PURGE") {
            purge;
            error 404 "Not in cache";
        }
    }
    sub vcl_pass {
        if (req.request == "PURGE") {
            error 502 "PURGE on a passed object";
        }
    }
Reload the configuration:
    varnish> vcl.load test3 /etc/varnish/default.vcl
    200
    VCL compiled.
    varnish> vcl.use test3
    200
Access the site and test.
When issuing the HTTP request, the client only has to use the PURGE method on the URL in question:
    # curl -I -X PURGE http://varniship/path/to/someurl
If you also want to keep the logic of the default vcl_recv, write it like this:
    sub vcl_recv {
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For =
                    req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        if (req.request == "PURGE" ) {
            if (!client.ip ~ purgers) {
                error 405 "Method not allowed.";
            }
        }
        if (req.request != "GET" &&
            req.request != "HEAD" &&
            req.request != "PUT" &&
            req.request != "POST" &&
            req.request != "TRACE" &&
            req.request != "OPTIONS" &&
            req.request != "DELETE" &&
            req.request != "PURGE" ) {
            /* Non-RFC2616 or CONNECT which is weird. */
            return (pipe);
        }
        if (req.request != "GET" && req.request != "HEAD" && req.request != "PURGE") {
            /* We only deal with GET and HEAD by default */
            return (pass);
        }
        if (req.http.Authorization || req.http.Cookie) {
            /* Not cacheable by default */
            return (pass);
        }
        return (lookup);
    }
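(4) Example 4: restrict which source addresses may access a resource
Edit the configuration file:
    # vim /etc/varnish/default.vcl
    acl admingroup {
        "127.0.0.1";
        "172.18.0.0"/16;
    }
    sub vcl_recv {
        # URLs containing "login" are only served to clients in admingroup
        if (req.url ~ "login") {
            if (!client.ip ~ admingroup) {
                error 404 "no permission access";
            }
            return (lookup);
        }
    }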
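The decision this rule makes can be sketched in Python with the standard `ipaddress` module. The names `ADMIN_GROUP`, `in_acl` and `check_access` are hypothetical; the sketch only mirrors the logic of the example (a substring test standing in for the `req.url ~ "login"` match) and says nothing about how Varnish evaluates ACLs internally.

```python
import ipaddress

# Same entries as the admingroup ACL in the example.
ADMIN_GROUP = [
    ipaddress.ip_network("127.0.0.1/32"),
    ipaddress.ip_network("172.18.0.0/16"),
]


def in_acl(client_ip, acl):
    """True if client_ip falls inside any network of the ACL."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in acl)


def check_access(url, client_ip, acl=ADMIN_GROUP):
    """Return 404 for login URLs requested from outside the ACL, else 200."""
    if "login" in url and not in_acl(client_ip, acl):
        return 404
    return 200


print(check_access("/admin/login.php", "172.18.250.172"))  # 200
print(check_access("/admin/login.php", "10.0.0.5"))        # 404
print(check_access("/index.html", "10.0.0.5"))             # 200
```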
Reload the configuration.
To demonstrate the effect, I edit the configuration so that my own client is denied:
    # vim /etc/varnish/default.vcl
    acl admingroup {
        "127.0.0.1";
        # "172.18.0.0"/16;    (this line is commented out)
    }
Reload the configuration and test:
    varnish> vcl.load test5 /etc/varnish/default.vcl
    200
    VCL compiled.
    varnish> vcl.use test5
    200
(5) Example 5: configuring multiple backend hosts
Add the configuration:
    backend web1 {
        .host = "172.18.4.71";
        .port = "80";
    }
    director webservers random {
        .retries = 5;
        {
            # A backend defined elsewhere, referenced by name
            .backend = web1;
            .weight = 2;
        }
        {
            # A backend defined inline
            .backend = {
                .host = "172.18.4.72";
                .port = "80";
            }
            .weight = 3;
        }
    }
    sub vcl_recv {
        set req.backend = webservers;
        return (lookup);
    }
Two scheduling algorithms are commonly used with multiple backends: random and round-robin.
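The difference between the two algorithms can be illustrated with a small Python simulation. This is a sketch of the general techniques (weighted random selection and plain rotation), not Varnish's actual implementation; the name "web2" for the inline backend is made up for the example, and the weights are the ones from the director above.

```python
import itertools
import random

# (name, weight) pairs; "web2" stands in for the inline backend.
BACKENDS = [("web1", 2), ("web2", 3)]


def random_director(backends, rng):
    """Pick a backend with probability proportional to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]


def round_robin_director(backends):
    """Cycle through the backends in order, ignoring weights."""
    return itertools.cycle(name for name, _ in backends)


rng = random.Random(0)  # fixed seed so the run is repeatable
picks = [random_director(BACKENDS, rng) for _ in range(10000)]
share_web2 = picks.count("web2") / len(picks)  # expected to be close to 3 / (2 + 3)

rotation = round_robin_director(BACKENDS)
first_four = [next(rotation) for _ in range(4)]
print(first_four)  # ['web1', 'web2', 'web1', 'web2']
```

With weights 2 and 3, roughly 60% of requests that miss the cache should land on the second backend under the random algorithm, while round-robin alternates strictly.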
Reload the configuration:
    varnish> vcl.load test6 /etc/varnish/default.vcl
    200
    VCL compiled.
    varnish> vcl.use test6
    200
Access the site and test:
    varnish> backend.list
    200
    Backend name                   Refs   Admin      Probe
    default(172.18.4.71,,80)       4      probe      Healthy (no probe)
    web1(172.18.4.71,,80)          1      probe      Healthy (no probe)
    webservers[1](172.18.4.72,,80) 1      probe      Healthy (no probe)
Because this host is a cache server, the load-balancing effect is hard to observe from the client side, so no test results are shown here.
(6) Example 6: backend health checks
Edit the configuration file:
    backend web1 {
        .host = "172.18.4.71";
        .probe = {
            .url = "/";
            .interval = 5s;
            .timeout = 1s;
            .window = 5;
            .threshold = 3;
        }
    }
    backend web2 {
        .host = "172.18.4.72";
        .probe = {
            .url = "/";
            .interval = 5s;
            .timeout = 1s;
            .window = 5;
            .threshold = 3;
        }
    }
    sub vcl_recv {
        # Send PHP requests to web1 and everything else to web2
        if (req.url ~ "\.php$") {
            set req.backend = web1;
        } else {
            set req.backend = web2;
        }
        return (pass);
    }
Reload the configuration:
    varnish> vcl.load test8 default.vcl
    200
    VCL compiled.
    varnish> vcl.use test8
    200
Check the state of the backends:
    varnish> backend.list
    200
    Backend name                   Refs   Admin      Probe
    web1(172.18.4.71,,80)          1      probe      Healthy 8/8
    web2(172.18.4.72,,80)          1      probe      Healthy 8/8
Stop httpd on one of the backends and check again:
    # systemctl stop httpd
    varnish> backend.list
    200
    Backend name                   Refs   Admin      Probe
    web1(172.18.4.71,,80)          1      probe      Sick 1/8
    web2(172.18.4.72,,80)          1      probe      Healthy 8/8
    varnish> backend.list
    200
    Backend name                   Refs   Admin      Probe
    web1(172.18.4.71,,80)          1      probe      Sick 0/8
    web2(172.18.4.72,,80)          1      probe      Healthy 8/8
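The x/y figure in the Probe column is a count of successful probes within a sliding window. A minimal Python sketch of that idea follows, using the .window and .threshold values from the example configuration; the class name `ProbeWindow` is hypothetical and the sketch ignores details such as the probe's initial state.

```python
from collections import deque


class ProbeWindow:
    """Keep the last `window` probe results; healthy if at least `threshold` succeeded."""

    def __init__(self, window, threshold):
        self.results = deque(maxlen=window)  # old results fall off automatically
        self.threshold = threshold

    def record(self, ok):
        self.results.append(bool(ok))

    @property
    def good(self):
        return sum(self.results)

    @property
    def healthy(self):
        return self.good >= self.threshold


probe = ProbeWindow(window=5, threshold=3)
for _ in range(5):
    probe.record(True)        # backend answers every probe
state_up = (probe.good, probe.healthy)      # (5, True)

probe.record(False)
probe.record(False)
state_degraded = (probe.good, probe.healthy)  # (3, True): still at the threshold

probe.record(False)
state_down = (probe.good, probe.healthy)      # (2, False): marked sick
```

This also shows why a stopped backend is not reported as sick immediately: it takes several failed probes, one per .interval, before the count drops below the threshold.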
(7) Production example
    backend shopweb {
        .host = "172.18.4.1";
        .port = "80";
    }
    # Clients allowed to purge or ban
    acl purge {
        "localhost";
        "127.0.0.1";
        "10.1.0.0"/16;
        "192.168.0.0"/16;
    }
    sub vcl_hash {
        # Cache key is the URL only (the Host header is not part of the key)
        hash_data(req.url);
        return (hash);
    }
    sub vcl_recv {
        set req.backend = shopweb;
        # set req.grace = 4h;
        if (req.request == "PURGE") {
            if (!client.ip ~ purge) {
                error 405 "Not allowed.";
            }
            return(lookup);
        }
        if (req.request == "REPURGE") {
            if (!client.ip ~ purge) {
                error 405 "Not allowed.";
            }
            ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url);
            error 200 "Ban OK";
        }
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
            }
            else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        if (req.request != "GET" &&
            req.request != "HEAD" &&
            req.request != "PUT" &&
            req.request != "POST" &&
            req.request != "TRACE" &&
            req.request != "OPTIONS" &&
            req.request != "DELETE") {
            /* Non-RFC2616 or CONNECT which is weird. */
            return (pipe);
        }
        if (req.request != "GET" && req.request != "HEAD") {
            /* We only deal with GET and HEAD by default */
            return (pass);
        }
        if (req.http.Authorization) {
            /* Not cacheable by default */
            return (pass);
        }
        # Health-check and front pages bypass the cache entirely
        if ( req.url == "/Heartbeat.html" ) {
            return (pipe);
        }
        if ( req.url == "/" ) {
            return (pipe);
        }
        if ( req.url == "/index.jsp" ) {
            return (pipe);
        }
        # Requests carrying these cookies are never served from the cache
        if (req.http.Cookie ~ "dper=") {
            return (pass);
        }
        if (req.http.Cookie ~ "sqltrace=") {
            return (pass);
        }
        if (req.http.Cookie ~ "errortrace=") {
            return (pass);
        }
        # if ( req.request == "GET" && req.url ~ "req.url ~ "^/shop/[0-9]+$" ) {
        if ( req.url ~ "^/shop/[0-9]+$" || req.url ~ "^/shop/[0-9]?.*" ) {
            return (lookup);
        }
        if ( req.url ~ "^/shop/(\d{1,})/editmember" || req.url ~ "^/shop/(\d{1,})/map" || req.url ~ "^/shop/(\d+)/dish-([^/]+)" ) {
            return (lookup);
        }
        return (pass);
        # return (lookup);
    }
    sub vcl_pipe {
        return (pipe);
    }
    sub vcl_pass {
        return (pass);
    }
    sub vcl_hit {
        if (req.request == "PURGE") {
            purge;
            error 200 "Purged.";
        }
        return (deliver);
    }
    sub vcl_miss {
        if (req.request == "PURGE") {
            error 404 "Not in cache.";
        }
        # if (object needs ESI processing) {
        #     unset bereq.http.accept-encoding;
        # }
        return (fetch);
    }
    sub vcl_fetch {
        set beresp.ttl = 3600s;
        # Note: this stores the TTL value in the header, not an HTTP date
        set beresp.http.expires = beresp.ttl;
        #set beresp.grace = 4h;
        # if (object needs ESI processing) {
        #     set beresp.do_esi = true;
        #     set beresp.do_gzip = true;
        # }
        if ( req.url ~ "^/shop/[0-9]+$" || req.url ~ "^/shop/[0-9]?.*" ) {
            set beresp.ttl = 4h;
        }
        if ( req.url ~ "^/shop/(\d{1,})/editmember" || req.url ~ "^/shop/(\d{1,})/map" || req.url ~ "^/shop/(\d+)/dish-([^/]+)" ) {
            set beresp.ttl = 24h;
        }
        # Only 200 responses are cached
        if (beresp.status != 200){
            return (hit_for_pass);
        }
        return (deliver);
    }
    sub vcl_deliver {
        if (obj.hits > 0){
            set resp.http.X-Cache = "HIT";
        }
        else {
            set resp.http.X-Cache = "MISS";
        }
        set resp.http.X-Powered-By = "Cache on " + server.ip;
        set resp.http.X-Age = resp.http.Age;
        return (deliver);
    }
    sub vcl_error {
        set obj.http.Content-Type = "text/html; charset=utf-8";
        set obj.http.Retry-After = "5";
        synthetic {""} + obj.status + " " + obj.response + {""};
        return (deliver);
    }
    sub vcl_init {
        return (ok);
    }
    sub vcl_fini {
        return (ok);
    }