Web Caching and Varnish Basics

2322312 · Posted 2016-11-21 08:39:21

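Web Page Cache:

1. Introduction

★ Concept:
A cache is a buffer for data exchange. When a piece of hardware needs to read data, it first looks for it in the cache; if the data is there it is used directly, otherwise it is fetched from main memory. Because a cache is much faster than main memory, its job is to help the hardware run faster.

★ Program execution exhibits locality:
Temporal locality: data that was just accessed is likely to be accessed again within a short time;
Spatial locality: when a piece of data is accessed, the data around it is likely to be accessed as well;

Caching is worthwhile precisely because of locality.

★ How a cache stores data
A cache stores entries as key-value pairs, and what it mainly holds is hot data.
key: the access path (URL), stored after being hashed;
value: the web content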
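
The key-value layout described above can be sketched in a few lines. This is only an illustration: the choice of SHA-256, the decision to include the host in the key, and the names `cache_key` and `store` are assumptions made for the example, not something the article specifies.

```python
import hashlib

def cache_key(host: str, url: str) -> str:
    """Hypothetical helper: derive a fixed-length cache key from host + URL."""
    # Lower-case the host so "Example.com" and "example.com" share one entry.
    raw = f"{host.lower()}{url}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

# The cache itself is a plain mapping: hashed key -> web content.
store = {}
store[cache_key("example.com", "/index.html")] = b"<h1>Hello</h1>"
```

Hashing gives every key the same length regardless of how long the URL is, which keeps lookups cheap.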

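★ Hot spot: the data that exhibits locality;
Cached entries do not stay valid forever:
Cache space exhausted: entries are evicted with LRU (least recently used);
Expired: the entry is purged from the cache.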
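
Both clean-up rules (LRU eviction when space runs out, and removal on expiry) fit into one small class. The following is a minimal sketch; the class name `LRUCache`, the per-entry `ttl` argument and the injectable `clock` are assumptions chosen for the example and do not describe how Varnish implements them.

```python
import time
from collections import OrderedDict

class LRUCache:
    """Toy cache with a capacity limit (LRU eviction) and per-entry expiry."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock              # injectable so tests can fake time
        self._items = OrderedDict()     # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None                 # miss
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]        # expired: purge and report a miss
            return None
        self._items.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value, ttl):
        self._items[key] = (value, self.clock() + ttl)
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the least recently used
```

Reading an entry moves it to the "recent" end of the ordered dictionary, so the entry at the front is always the one that has gone unused the longest.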

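★ Cache hit ratio: hit/(hit+miss), i.e. the number of hits divided by the total number of requests
Range: from 0 to 1, usually expressed as a percentage;

Page hit ratio: measured by the number of pages served from the cache
Byte hit ratio: measured by the size (in bytes) of the pages served from the cache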
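
The two ratios can differ considerably for the same traffic. A small worked sketch, in which the function name `hit_ratios` and the sample data are made up for illustration:

```python
def hit_ratios(requests):
    """requests: iterable of (is_hit, size_in_bytes) pairs.

    Returns (page_hit_ratio, byte_hit_ratio).
    """
    requests = list(requests)
    total = len(requests)
    total_bytes = sum(size for _, size in requests)
    hits = sum(1 for hit, _ in requests if hit)
    hit_bytes = sum(size for hit, size in requests if hit)
    page_ratio = hits / total if total else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return page_ratio, byte_ratio

# Three small pages served from cache, one large page fetched from the backend.
sample = [(True, 100), (True, 100), (True, 100), (False, 700)]
page_ratio, byte_ratio = hit_ratios(sample)
```

Here 3 of 4 requests are hits (page hit ratio 75%), but only 300 of 1000 bytes came from the cache (byte hit ratio 30%), because the single miss was the largest object.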

☉ Whether to cache:
Private data: private, may only be stored in a private cache (for example, the browser);
Public data: public, may be stored in a public or a private cache;

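2. Cache types and the related control headers

★ Cache types:
Server-side cache: e.g. Nginx, Apache
Server-side caches are further divided into proxy server caches and reverse proxy caches (also called gateway caches, e.g. an Nginx reverse proxy or Squid). The widely used CDN is also a kind of server-side cache. All of them aim to let user requests take a "shortcut", and they mostly cache static resources such as images and files.
Client-side cache: e.g. the web browser
A client-side cache usually means the browser cache, whose purpose is to speed up access to static resources. Consider today's large websites: a single page can easily trigger one or two hundred requests and daily page views reach hundreds of millions. Without caching, the user experience would degrade sharply, and both server load and network bandwidth would be put under severe strain.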

★ Cache-related Header Fields
Expires: the expiry time, e.g. Expires: Thu, 22 Oct 2026 06:34:30 GMT
Cache-Control
Etag
Last-Modified
If-Modified-Since
If-None-Match
Vary
Age

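3. How cache validity is determined (controlling caches through HTTP headers)

★ Expiration policy: expiry time
HTTP/1.0
  Expires
HTTP/1.1
  Cache-Control: max-age=
  Cache-Control: s-maxage= (shared caches only)

Explanation:
Expires is a response header field. When the web server answers an HTTP request, it tells the browser that until the expiry time the browser may take the data straight from its own cache without sending another request. Expires dates from HTTP/1.0; current browsers use HTTP/1.1 by default, where Cache-Control: max-age takes precedence whenever both are present, so Expires matters little in practice. One drawback of Expires is that the expiry time is an absolute time taken from the server's clock. If the client's clock differs a lot from the server's (for example because the clocks are not synchronized or a time zone is misconfigured), the error is large. For this reason HTTP/1.1 introduced Cache-Control: max-age=<seconds> as the replacement.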
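
The clock-skew problem can be made concrete with a short sketch. The function names and the ten-minute skew are assumptions chosen for the example; the `Expires` value is the one quoted earlier in this article.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def is_fresh_by_expires(expires_header, client_now):
    """Absolute expiry: compares the server's timestamp with the client's clock."""
    return client_now < parsedate_to_datetime(expires_header)

def is_fresh_by_max_age(max_age_seconds, age_seconds):
    """Relative expiry: only the elapsed time since the response matters."""
    return age_seconds < max_age_seconds

expires = "Thu, 22 Oct 2026 06:34:30 GMT"
true_now = datetime(2026, 10, 22, 6, 40, 0, tzinfo=timezone.utc)
slow_client_clock = true_now - timedelta(minutes=10)   # client is 10 minutes behind

stale_really = not is_fresh_by_expires(expires, true_now)               # correctly expired
fooled_by_skew = is_fresh_by_expires(expires, slow_client_clock)        # wrongly "fresh"
```

With `max-age` the cache only needs to measure how many seconds have passed since it stored the response, so a wrong wall-clock setting on the client does not change the outcome.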

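★ Cache-Control policy and conditional requests (the key part)
Last-Modified/If-Modified-Since
Etag/If-None-Match

☉ Cache-Control directives include:
public:
the response may be stored by any cache;
private:
the content may only be stored in a private cache;
no-cache:
the response may be stored, but it must be revalidated with the origin server before every reuse; despite the name, this directive does not mean "do not cache", which is an easy misreading;
no-store:
prevents sensitive information from being released inadvertently; when it is sent, neither the request nor the response may be stored by any cache;
max-age:
in a response, the freshness lifetime (how long the response may be cached), in seconds; in a request, it indicates that the client accepts a response whose age is no greater than the given number of seconds;
must-revalidate:
once the cached content has become stale, the cache must revalidate it with the origin server/proxy before using it;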
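
To see how a cache might act on these directives, here is a minimal parsing sketch. The helper names `parse_cache_control` and `may_store` are hypothetical, and the parser is deliberately simplified: it does not handle quoted values that themselves contain commas.

```python
def parse_cache_control(header):
    """Split a Cache-Control header into {directive: value-or-True}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directives without "=" (public, no-store, ...) are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives

def may_store(directives, shared_cache):
    """Decide whether a cache is allowed to keep the response at all."""
    if "no-store" in directives:
        return False                      # nobody may store it
    if shared_cache and "private" in directives:
        return False                      # only the browser may store it
    return True                           # note: no-cache still allows storing
```

For example, `may_store(parse_cache_control("private, max-age=60"), shared_cache=True)` is false for a proxy such as Varnish, while the same response may be kept by a browser.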

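☉ Last-Modified/If-Modified-Since:
Last-Modified/If-Modified-Since is used together with Cache-Control.
Last-Modified: response header
If-Modified-Since: request header

Explanation:
Last-Modified is a response header and If-Modified-Since is a request header. Last-Modified tells the client when the web component was last modified. The next time the client requests that component, it sends the last-modified time from the previous response as the value of If-Modified-Since. The server uses this value to decide whether the component needs to be sent again; if not, it simply returns a 304 status code and the client reads the component straight from its cache.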
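
The server side of this exchange can be sketched as follows. The function `respond` and its return shape are assumptions for the example; a real server would also send the body and other headers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a resource last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime.
    """
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        if last_modified <= parsedate_to_datetime(if_modified_since):
            return 304, headers        # not modified: client reuses its copy
    return 200, headers                # first visit, or the resource changed

mtime = datetime(2016, 11, 19, 8, 57, 15, tzinfo=timezone.utc)
first_status, first_headers = respond(mtime)
second_status, _ = respond(mtime, first_headers["Last-Modified"])
```

The first call yields 200 with a `Last-Modified` header; replaying that value as `If-Modified-Since` yields 304 as long as the resource has not changed.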

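☉ Etag/If-None-Match:
Etag/If-None-Match is also used together with Cache-Control.
Explanation:
ETag is a response header and If-None-Match is a request header. The main weakness of Last-Modified / If-Modified-Since is that it is only accurate to the second: if a resource is modified several times within one second, Last-Modified / If-Modified-Since cannot reflect that. ETag / If-None-Match, by contrast, does not use time as the criterion but a fingerprint string. ETag tells the client the fingerprint of the web component. The next time the client requests that component, it sends the fingerprint from the previous response as the value of If-None-Match. The server uses this value to decide whether the component needs to be sent again; if not, it simply returns a 304 status code and the client reads the component straight from its cache.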
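
A fingerprint-based check might look like the sketch below. Deriving the ETag from a truncated SHA-256 of the body is an assumption made for illustration (servers are free to build ETags differently), and weak validators (`W/"..."`) are not handled.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Hypothetical ETag: a quoted, truncated content hash."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond_etag(body, if_none_match=None):
    """Return (status, etag, body_to_send)."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, etag, b""      # fingerprint matches: send no body
    return 200, etag, body

v1 = b"<h1>version 1</h1>"
v2 = b"<h1>version 2</h1>"   # edited within the same second as v1
status, etag, body = respond_etag(v1)
```

Because the fingerprint depends on the content rather than on a timestamp, the second edit is detected even though both versions share the same last-modified second.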


================================================================================

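Open-source solution: Varnish

1. Varnish introduction and program architecture

★ Solutions:
squid:
varnish:

★ Official Varnish site:
http://www.varnish-cache.org/
Editions:
Community
Enterprise

★ Program architecture:
Manager process
Cacher process, which contains several types of threads:
accept, worker, expiry, ...
shared memory log:
statistics: counters;
log area: log records;
   varnishlog, varnishncsa, varnishstat...
Configuration interface: VCL
Varnish Configuration Language,
   vcl compiler --> C compiler --> shared object

Figure:
  program architecture diagram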

2. Installing varnish-4.0.3 and its program environment

★ Installation
yum install varnish -y

★ Configuration files
/etc/varnish/varnish.params:
configures how the varnish service process works, e.g. the listening address and port and the storage backend;
/etc/varnish/default.vcl:
configures the working properties of the Child/Cache threads;

★ Main program:
/usr/sbin/varnishd

★ CLI interface:
/usr/bin/varnishadm

★ Tools for working with the Shared Memory Log:
/usr/bin/varnishhist
/usr/bin/varnishlog
/usr/bin/varnishncsa
/usr/bin/varnishstat
/usr/bin/varnishtop

★ Test tool:
/usr/bin/varnishtest

★ VCL configuration reload program:
/usr/sbin/varnish_reload_vcl

★ Systemd Unit Files:
☉ Starts the varnish service
/usr/lib/systemd/system/varnish.service

☉ Services that persist the logs
/usr/lib/systemd/system/varnishlog.service
/usr/lib/systemd/system/varnishncsa.service

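3. Varnish storage backends

★ malloc[,size]
In-memory storage; [,size] defines the amount of space. All cached objects are lost on restart.

★ file[,path[,size[,granularity]]]
File storage, opaque (a black box). All cached objects are lost on restart.

★ persistent,path,size
File storage, opaque (a black box). Cached objects survive a restart. Experimental.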

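4. varnishd options:

★ Program options: set in /etc/varnish/varnish.params
-a address[:port][,address[:port][...]]，//in this package the default is port 6081 on all local addresses;
-T address[:port]，                   //default is port 6082 on 127.0.0.1;
-s [name=]type[,options]，             //defines the storage backend;
-u user
-g group
-f config：  //the VCL configuration file;
-F：       //run in the foreground; used for debugging;

...
★ Runtime parameters: set in /etc/varnish/varnish.params, DAEMON_OPTS
-p param=value：    //sets a runtime parameter and its value; may be repeated;
-r param[,param...]: //makes the listed parameters read-only;

Note:
Varnish is a caching web proxy. As a proxy it acts on behalf of the origin web server, so user requests reach Varnish first, and one would therefore expect it to listen on port 80. It does not: to avoid a conflict with port 80 on the local machine, Varnish listens on port 6081 by default. The reason is that, although Varnish is a caching proxy, it is usually not exposed to users directly. A front-end nginx or haproxy reverse-proxies user requests to Varnish, which manages the cache. If the object is not in the local cache, Varnish fetches it from the origin server; if it is, Varnish returns it directly to the front-end reverse proxy.
Port 6082 is the interface used by the command-line tools to manage and control Varnish; for security it listens only on the local address 127.0.0.1.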

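Demo:

Using Varnish as a caching proxy for a backend web service

Environment:

Prepare two CentOS 7 virtual machines: one acts as both the Varnish caching proxy and the nginx reverse proxy, the other as the backend origin server.

-------------------------------------------------------------------------------

1. Suppose memory is used as the storage backend here: edit the configuration file /etc/varnish/varnish.params accordingly.

(Note: an in-memory cache becomes fragmented over time, which makes caching much less efficient. In production, file storage should be used instead, but the disk holding the cache file must have very good I/O performance, so SSDs are normally used.)

2. Edit the nginx configuration file /etc/nginx/conf.d/default.conf so that all requests are reverse-proxied to the local Varnish cache server.

3. Start the nginx and Varnish services, then check that nginx is listening on port 80 and Varnish on ports 6081 and 6082: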

[iyunv@centos7~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[iyunv@centos7~]# nginx
[iyunv@centos7~]# systemctl restart varnish
[iyunv@centos7~]# ss -tnl
State    Recv-Q Send-Q    Local Address:Port                   Peer Address:Port
LISTEN    0    128                   *:6081                               *:*
LISTEN    0    10             127.0.0.1:6082                               *:*
LISTEN    0    25                      *:514                               *:*
LISTEN    0    128                   *:80                                  *:*
LISTEN    0    128                   *:22                                  *:*
LISTEN    0    128             127.0.0.1:631                               *:*
LISTEN    0    100             127.0.0.1:25                                  *:*
LISTEN    0    128             127.0.0.1:6010                               *:*
LISTEN    0    128                   :::6081                               :::*
LISTEN    0    25                   :::514                               :::*
LISTEN    0    128                   :::22                               :::*
LISTEN    0    128                   ::1:631                               :::*
LISTEN    0    100                   ::1:25                               :::*
LISTEN    0    128                   ::1:6010                               :::*

  Visiting the site in a browser now shows a backend fetch failure reported by the Varnish cache server, because no backend server has been defined yet.

4. Specify the backend server's address and port in the Varnish configuration file, then reload Varnish:
[iyunv@centos7 ~]# cd /etc/varnish/
[iyunv@centos7 varnish]# ls
default.vcl  secret  varnish.params
[iyunv@centos7 varnish]# vim default.vcl
backend default {
    .host = "10.1.252.161";
    .port = "80";
}

[iyunv@centos7 ~]# varnish_reload_vcl    # reload the VCL configuration file
Loading vcl from /etc/varnish/default.vcl
Current running config name is boot
Using new config name reload_2016-11-19T08:57:15
VCL compiled.
VCL 'reload_2016-11-19T08:57:15' now active
available    0 boot
active       0 reload_2016-11-19T08:57:15

Done

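Provide a test page on the backend host and start the httpd service:

[Note: in a real production environment Varnish should have two network interfaces: one with a public address, connected to the reverse proxy (e.g. nginx) to receive user requests, and one with a private address, connected to the backend origin servers on the internal network.]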

[iyunv@centos7 ~]# echo "<h1>Backend Server 1</h1>" > /var/www/html/index.html
[iyunv@centos7 ~]# cat /var/www/html/index.html
<h1>Backend Server 1</h1>

[iyunv@centos7 ~]# systemctl start httpd.service

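Refresh the page in the browser again: it now shows Backend Server 1. Refresh several more times and check the httpd access log on the backend server. Only one request was answered by the backend; all the others were answered by the Varnish cache server.

This completes the setup of Varnish as a caching proxy in front of the backend origin server.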
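
Besides reading the backend's access log, a hit can usually be recognised from the response headers. In Varnish 4 the `X-Varnish` header carries two transaction IDs when the object came from the cache and only one otherwise. The sketch below relies on that behaviour; the function name is hypothetical, and it assumes the front-end proxy passes the header through unchanged.

```python
def looks_like_cache_hit(headers):
    """Guess from response headers whether Varnish served the object from cache."""
    # Header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    ids = normalized.get("x-varnish", "").split()
    # Two IDs: this request's ID plus the ID of the request that filled the cache.
    return len(ids) == 2

hit = looks_like_cache_hit({"X-Varnish": "32770 32768", "Age": "12"})
miss = looks_like_cache_hit({"X-Varnish": "32768", "Age": "0"})
```

The header dictionary could be built from any HTTP client, for instance from `urllib.request.urlopen(url).headers.items()`.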
--------------------------------------------------------------------------------------------------------------------------------------

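5. Reloading the VCL configuration file and the varnishadm command:

★ Reloading the VCL configuration file:
~ ]# varnish_reload_vcl

★ The varnishadm command:
-S /etc/varnish/secret -T [ADDRESS:]PORT
-S names the shared secret file; -T gives the address and port of the Varnish server to connect to
Commands:

⊙ Configuration files:
vcl.list
vcl.load: load and compile a configuration;
vcl.use: activate a configuration;
vcl.discard: delete a configuration;
vcl.show [-v] <configname>: show the details of the given configuration;

⊙ Runtime parameters:
param.show -l: list all parameters;
param.show <PARAM>
param.set <PARAM> <VALUE>

⊙ Cache storage:
storage.list

⊙ Backend servers:
backend.list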
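
These commands can also be run non-interactively by appending them to the `varnishadm` command line, which makes them easy to script. A minimal sketch, assuming the secret file and management address used in this article; the helper names are made up for the example.

```python
import shlex
import subprocess

def varnishadm_argv(command,
                    secret="/etc/varnish/secret",
                    address="127.0.0.1:6082"):
    """Build the argument list for one varnishadm command."""
    return ["varnishadm", "-S", secret, "-T", address, *shlex.split(command)]

def run_varnishadm(command, **kwargs):
    """Run the command and return its output (needs a running Varnish)."""
    result = subprocess.run(varnishadm_argv(command, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_varnishadm("vcl.list")` would return the same listing that the interactive session prints.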

Demo:
1. Connect to the Varnish server with the varnishadm command to manage and control Varnish

# Connect to the local Varnish server; the secret file and the address/port must be given
[iyunv@centos7 ~]# varnishadm -S /etc/varnish/secret -T :6082
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-327.el7.x86_64,x86_64,-smalloc,-smalloc,-hcritbit
varnish-4.0.3 revision b8c4a34

Type 'help' for command list.
Type 'quit' to close CLI session.

help
200  # status code returned after running help
help [<command>]
ping [<timestamp>]
auth <response>
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname>
vcl.discard <configname>
vcl.list
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show [-v] <configname>
backend.list [<backend_expression>]
backend.set_health <backend_expression> <state>
ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
ban.list

ping # check whether the service is OK
200
PONG 1479518941 1.0
banner # show the banner
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-327.el7.x86_64,x86_64,-smalloc,-smalloc,-hcritbit
varnish-4.0.3 revision b8c4a34

Type 'help' for command list.
Type 'quit' to close CLI session.

status       # show the current server state
200
Child in state running
vcl.list          # list the VCL configurations that compiled successfully
200
available    0 boot
active       0 reload_2016-11-19T08:57:15

vcl.use boot       # switch back to the previous VCL version
200
VCL 'boot' now active

vcl.list
200
active       0 boot
available    0 reload_2016-11-19T08:57:15  # changed from active to available
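
When scripting around `vcl.list`, the listing is simple enough to parse by hand. The sketch below assumes the three-column Varnish 4.0 layout shown in this session (status, busy count, name); other versions print additional columns, which this parser does not handle.

```python
def parse_vcl_list(output):
    """Turn vcl.list output into a list of {status, busy, name} dictionaries."""
    configs = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue                  # skips blank lines and the "200" status line
        configs.append({"status": fields[0],
                        "busy": int(fields[1]),
                        "name": fields[2]})
    return configs

def active_vcl(output):
    """Return the name of the active configuration, or None."""
    for cfg in parse_vcl_list(output):
        if cfg["status"] == "active":
            return cfg["name"]
    return None

listing = """200
active       0 boot
available    0 reload_2016-11-19T08:57:15
"""
current = active_vcl(listing)
```

A rollback script could compare `current` before and after `vcl.use` to confirm that the switch took effect.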

param.show thread_pool_max # show the value of the given parameter
200
thread_pool_max
      Value is: 500 [threads]
      Default is: 5000
      Minimum is: 5

      The maximum number of worker threads in each pool.

      Do not set this higher than you have to, since excess worker
      threads soak up RAM and CPU and generally just get in the way
      of getting work done.

      Minimum is 10 threads.

      NB: This parameter may take quite some time to take (full)
      effect.

param.set thread_pool_max 1024 # change the value of the given parameter
200

param.show thread_pool_max
200
thread_pool_max
      Value is: 1024 [threads]
      Default is: 5000
      Minimum is: 5

      The maximum number of worker threads in each pool.

      Do not set this higher than you have to, since excess worker
      threads soak up RAM and CPU and generally just get in the way
      of getting work done.

      Minimum is 10 threads.

      NB: This parameter may take quite some time to take (full)
      effect.

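VCL:

--- a domain-specific configuration language;

★ vcl
VCL is Varnish's domain-specific configuration language. It is built on state engines: the states are related to one another but isolated from each other, and the engines for different states differ in what they may do.
state engine:
VCL has several state engines. Each engine uses return(x) to leave the current state and name the next engine to enter;
  For example: vcl_recv --> return(hash) --> vcl_hash

★ Request flow:
Request processing flow: requests are either cacheable or not cacheable. For a cacheable request, Varnish checks whether it is a cache hit. On a hit, the response comes from the local cache; on a miss, Varnish fetches the response from the backend host. If the response is public data it can be cached, so a copy is stored in the cache and the response is then returned to the client. If it is private data it cannot be cached and is returned to the client directly.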
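
The decision logic in the paragraph above can be written out as a short function. This is a conceptual sketch only: `handle_request`, the dictionary used as the cache, and the `(body, is_public)` return shape of the backend callable are all assumptions for the example, not Varnish internals.

```python
def handle_request(url, cache, fetch_from_backend, cacheable=True):
    """Return (body, outcome) where outcome is "pass", "hit" or "miss"."""
    if not cacheable:
        body, _ = fetch_from_backend(url)
        return body, "pass"            # never looked up, never stored
    if url in cache:
        return cache[url], "hit"       # answered from the local cache
    body, is_public = fetch_from_backend(url)
    if is_public:
        cache[url] = body              # public data: keep a copy for next time
    return body, "miss"                # private data is returned but not stored

backend_calls = []

def fake_backend(url):
    """Stand-in for the origin server; /account is treated as private data."""
    backend_calls.append(url)
    return f"content of {url}".encode(), url != "/account"

cache = {}
first = handle_request("/index.html", cache, fake_backend)
second = handle_request("/index.html", cache, fake_backend)
private = handle_request("/account", cache, fake_backend)
```

The second request for `/index.html` is a hit and never reaches the backend, while `/account` is fetched every time because it is not stored.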

Figure:
  VCL request flow diagram

☉ Data flow:
vcl_recv --> vcl_hash -->
1) vcl_hit --> vcl_deliver
2) vcl_hit --> vcl_pass --> vcl_backend_fetch
3) vcl_miss --> vcl_pass
4) vcl_miss --> vcl_backend_fetch
5) vcl_purge --> vcl_synth
6) vcl_pipe --> done
vcl_backend_fetch --> vcl_backend_response
vcl_backend_fetch --> vcl_backend_error