Analyzing IIS Logs with C#

Posted by xunlei on 2015-8-16 03:57:11

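I recently had to analyze IIS logs again to work out how often each search engine crawls the site per day, so I spent the last couple of days trying different approaches: importing the logs into a database, Log Parser, PowerShell, and so on. In the end I switched to reading the IIS logs directly in C#, which performed best and was also the easiest to customize to my needs.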

Reading a 100 MB log file takes a little over ten seconds. Below is an example that reads an IIS log and counts the visits from each crawler:

   // Baidu crawler user-agent marker:  Baiduspider
   // Google crawler user-agent marker: Googlebot
   // Sogou crawler user-agent marker:  Sogou+web+spider
   // Soso crawler user-agent marker:   Sosospider
   private void button1_Click(object sender, EventArgs e)
   {
         int Baidubot = 0, Googlebot = 0, Sogoubot = 0, Sosobot = 0;
         // Path to the log file
         string url = textBox1.Text.Trim();
         // FileShare.ReadWrite allows opening the file while another process still has it open for writing
         FileStream fs = new FileStream(url, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
         #region Read the file line by line and count the hits of each crawler
         using (StreamReader sr = new StreamReader(fs, System.Text.Encoding.Default))
         {
            string line;
            // ReadLine returns null only at end of file, so a blank line does not end the loop early
            while ((line = sr.ReadLine()) != null)
            {
               if (line.Contains("Baiduspider"))
               {
                     ++Baidubot;
               }
               else if (line.Contains("Googlebot"))
               {
                     ++Googlebot;
               }
               else if (line.Contains("Sogou+web+spider"))
               {
                     ++Sogoubot;
               }
               else if (line.Contains("Sosospider"))
               {
                     ++Sosobot;
               }
            }
         }
         #endregion
         label2.Text = "Search engine visits:\r\n\r\n";
         label2.Text += "Baidu: " + Baidubot + "\r\n\r\n";
         label2.Text += "Google: " + Googlebot + "\r\n\r\n";
         label2.Text += "Sogou: " + Sogoubot + "\r\n\r\n";
         label2.Text += "Soso: " + Sosobot + "\r\n\r\n";
   }
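
The example above produces one total per crawler for the whole file. Since the stated goal is the daily crawl frequency, here is a rough sketch of how the same loop could group hits by date. It is not from the original post. It assumes the log is in the W3C extended format, where a `#Fields:` line names the space-separated columns, and that both the `date` and `cs(User-Agent)` columns are enabled. The class name `CrawlerLogStats`, the method `CountPerDay`, and the `date|marker` key format are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original post.
public static class CrawlerLogStats
{
    // User-agent markers taken from the comments in the example above.
    private static readonly string[] Markers =
        { "Baiduspider", "Googlebot", "Sogou+web+spider", "Sosospider" };

    // Returns hit counts keyed by "date|marker", e.g. "2015-08-15|Googlebot".
    public static SortedDictionary<string, int> CountPerDay(IEnumerable<string> lines)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        int dateIndex = -1, agentIndex = -1;

        foreach (string line in lines)
        {
            if (string.IsNullOrEmpty(line)) continue;

            if (line[0] == '#')
            {
                // Assumption: a "#Fields: date time ... cs(User-Agent) ..." line names the columns.
                if (line.StartsWith("#Fields:", StringComparison.Ordinal))
                {
                    string[] names = line.Substring("#Fields:".Length)
                        .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                    dateIndex = Array.IndexOf(names, "date");
                    agentIndex = Array.IndexOf(names, "cs(User-Agent)");
                }
                continue; // other directive lines carry no request data
            }

            // Skip data lines until the column layout is known.
            if (dateIndex < 0 || agentIndex < 0) continue;

            string[] fields = line.Split(' ');
            if (fields.Length <= Math.Max(dateIndex, agentIndex)) continue;

            foreach (string marker in Markers)
            {
                if (fields[agentIndex].Contains(marker))
                {
                    string key = fields[dateIndex] + "|" + marker;
                    int current;
                    counts.TryGetValue(key, out current); // current stays 0 for a new key
                    counts[key] = current + 1;
                    break; // count each request for at most one crawler
                }
            }
        }
        return counts;
    }
}
```

To use it from the button handler, the lines read by `sr.ReadLine()` could be collected or yielded into `CountPerDay`, and each key/value pair written to the label. Matching only against the user-agent column, rather than the whole line, avoids counting a request whose URL or referrer happens to contain a crawler name.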
