xiu12 发表于 2018-10-9 07:54:08

mysql 分页

  Efficient Pagination Using MySQL
  Surat Singh Bhati (surat@yahoo-inc.com)
  Rick James (rjames@yahoo-inc.com)
  Yahoo Inc
  Percona Performance Conference 2009
  - 2 -
  Outline
  1. Overview
  – Common pagination UI pattern
  – Sample table and typical solution using OFFSET
  – Techniques to avoid large OFFSET
  – Performance comparison
  – Concerns
  - 3 -
  Common Patterns
  - 4 -
  Basics
  First step toward having efficient pagination over large data set
  – Use index to filter rows (resolve WHERE)
  – Use same index to return rows in sorted order (resolve ORDER)
  Step zero
  – http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
  – http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html
  – http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html
  - 5 -
  Using Index
  KEY a_b_c (a, b, c)
  ORDER may get resolved using Index
  – ORDER BY a
  – ORDER BY a,b
  – ORDER BY a, b, c
  – ORDER BY a DESC, b DESC, c DESC
  WHERE and ORDER both resolved using index:
  – WHERE a = const ORDER BY b, c
  – WHERE a = const AND b = const ORDER BY c
  – WHERE a = const ORDER BY b, c
  – WHERE a = const AND b > const ORDER BY b, c
  ORDER will not get resolved uisng index (file sort)
  – ORDER BY a ASC, b DESC, c DESC /* mixed sort direction */
  – WHERE g = const ORDER BY b, c /* a prefix is missing */
  – WHERE a = const ORDER BY c /* b is missing */
  – WHERE a = const ORDER BY a, d /* d is not part of index */
  - 6 -
  Sample Schema
  CREATE TABLE `message` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `user_id` int(11) NOT NULL,
  `content` text COLLATE utf8_unicode_ci NOT NULL,
  `create_time` int(11) NOT NULL,
  `thumbs_up` int(11) NOT NULL DEFAULT '0', /* Vote Count */
  PRIMARY KEY (`id`),
  KEY `thumbs_up_key` (`thumbs_up`,`id`)
  ) ENGINE=InnoDB
  mysql> show table status like 'message' \G
  Engine: InnoDB
  Version: 10
  Row_format: Compact
  Rows: 50000040 /* 50 Million */
  Avg_row_length: 565
  Data_length: 28273803264 /* 26 GB */
  Index_length: 789577728 /* 753 MB */
  Data_free: 6291456
  Create_time: 2009-04-20 13:30:45
  Two use case:
  Paginate by time, recent message one page one
  Paginate by thumps_up, largest value on page one
  - 7 -
  Typical Query
  1. Get the total records
  SELECT count(*) FROM message
  2. Get current page
  SELECT * FROM message

  ORDER BY>  http://domain.com/message?page=1

  ORDER BY>  http://domain.com/message?page=2

  ORDER BY>  http://domain.com/message?page=3

  ORDER BY>
  Note:>  –
  - 8 -
  Explain
  mysql> explain SELECT * FROM message

  ORDER BY>  LIMIT 10000, 20\G
  ***************** 1. row **************
  id: 1
  select_type: SIMPLE
  table: message
  type: index
  possible_keys: NULL
  key: PRIMARY
  key_len: 4
  ref: NULL
  rows: 10020
  Extra:
  1 row in set (0.00 sec)
  – it can read rows using index scan and execution will stop as soon as it finds
  required rows.
  – LIMIT 10000, 20 means it has to read 10020 and throw away 10000 rows, then
  return next 20 rows.
  - 9 -
  Performance Implications
  – Larger OFFSET is going to increase active data set, MySQL has to bring data
  in memory that is never returned to caller.
  – Performance issue is more visible when your have database that can't fit in
  main memory.
  – Small percentage of request with large OFFSET would be able to hit disk I/O
  Disk I/O bottleneck
  – In order to display “21 to 40 of 1000,000” , some one has to count 1000,000
  rows.
  - 10 -
  Simple Solution
  – Do not display total records, does user really care?
  – Do not let user go to deep pages, redirect him
  http://en.wikipedia.org/wiki/Internet_addiction_disorder after certain number of
  pages
  - 11 -
  Avoid Count(*)
  1. Never display total messages, let user see more message by clicking
  'next'
  2. Do not count on every request, cache it, display stale count, user do not
  care about 324533 v/s 324633
  3. Display 41 to 80 of Thousands
  4. Use pre calculated count, increment/decrement value as insert/delete
  happens.
  - 12 -
  Solution to avoid offset
  1. Change User Interface
  – No direct jumps to Nth page
  2. LIMIT N is fine, Do not use LIMIT M,N
  – Provide extra clue about from where to start given page
  – Find the desired records using more restricted WHERE using given clue and
  ORDER BY and LIMIT N without OFFSET)
  - 13 -
  Find the clue
  150
  111
  102 Page One
  101
  100
  98
  97
  96 Page Two
  95
  94
  93
  92
  91 Page Three
  90
  89
  Next
  Prev
  Prev
  Next
  Next
  - 14 -
  Solution using clue
  Next Page:
  http://domain.com/forum?page=2&last_seen=100&dir=next

  WHERE>
  ORDER BY>  Prev Page:
  http://domain.com/forum?page=1&last_seen=98&dir=prev

  WHERE>
  ORDER BY>  Reverse given 10 rows before sending to user
  - 15 -
  Explain
  mysql> explain
  SELECT * FROM message

  WHERE>
  ORDER BY>  *************************** 1. row ***************************
  id: 1
  select_type: SIMPLE
  table: message
  type: range
  possible_keys: PRIMARY
  key: PRIMARY
  key_len: 4
  ref: NULL
  Rows: 25000020 /* ignore this */
  Extra: Using where
  1 row in set (0.00 sec)
  - 16 -
  What about order by non unique values?
  We can't do:
  WHERE thumbs_up < 98
  ORDER BY thumbs_up DESC /* It will return few seen rows */
  Can we say this:
  WHERE thumbs_up   - 18 -
  Solution
  First Page

  SELECT thumbs_up,>  FROM message

  ORDER BY thumbs_up DESC,>  LIMIT $page_size
  +-----------+----+

  | thumbs_up |>  +-----------+----+
  | 99 | 14 |
  | 99 | 2 |
  | 98 | 18 |
  | 98 | 15 |
  | 98 | 13 |
  +-----------+----+
  Next Page

  SELECT thumbs_up,>  FROM message
  WHERE thumbs_up   LIMIT $page_size
  +-----------+----+

  | thumbs_up |>  +-----------+----+
  | 98 | 10 |
  | 98 | 6 |
  | 97 | 17 |
  - 19 -
  Make it better..
  Query:
  SELECT * FROM message
  WHERE thumbs_up   LIMIT 20
  Can be written as:
  SELECT m2.* FROM message m1, message m2
  WHERE m1.id = m2.id
  AND m1.thumbs_up
页: [1]
查看完整版本: mysql 分页