设为首页 收藏本站
查看: 704|回复: 0

[经验分享] Skype Plans for PostgreSQL to Scale to 1 Billion Users

[复制链接]

尚未签到

发表于 2016-11-20 11:21:39 | 显示全部楼层 |阅读模式
Skype uses PostgreSQL as their backend database. PostgreSQL doesn't get enough run in the database world so I was excited to see how PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to use a traditional stored procedure interface for accessing data and on top of that layer proxy servers which hash SQL requests to a set of database servers that actually carry out queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users.

·Skype's goal is an architecture that can handle 1 billion plus users. This level of scale isn't practically solvable with one really big computer, so our masked superhero horizontal scaling comes to the rescue.
·Hardware is dual or quad Opterons with SCSI RAID.
·Followed common database progression: Start with one DB. Add new databases partitioned by functionality. Replicate read-mostly data for better read access. Then horizontally partition data across multiple nodes..
·In a first for this blog anyway, Skype uses a traditional database architecture where all database access is encapsulated in stored procedures. This allows them to make behind the scenes performance tweaks without impacting frontend servers. And it fits in cleanly with their partitioning strategy using PL/Proxy.
·PL/Proxy is used to scale the OLTP portion of their system by creating a horizontally partitioned cluster:

- Database queries are routed by a proxy across a set of database servers. The proxy creates partitions based on a field value, typically a primary key.
- For example, you could partition users across a cluster by hashing based on user name. Each user is slotted into a shard based on the hash.
- Remote database calls are executed using a new PostgreSQL database language called plproxy. An example from Kristo Kaiv's blog:

First, code to insert a user in a database:CREATE OR REPLACE FUNCTION insert_user(i_username text) RETURNS text AS $$BEGIN    PERFORM 1 FROM users WHERE username = i_username;    IF NOT FOUND THEN        INSERT INTO users (username) VALUES (i_username);        RETURN 'user created';    ELSE        RETURN 'user already exists';    END IF;END;$$ LANGUAGE plpgsql SECURITY DEFINER;Heres the proxy code to distribute the user insert to the correct partition:queries=#CREATE OR REPLACE FUNCTION insert_user(i_username text) RETURNS TEXT AS $$    CLUSTER 'queries'; RUN ON hashtext(i_username);$$ LANGUAGE plproxy;Your SQL query looks normal:SELECT insert_user("username");
- The result of a query is exactly that same as if was executed on the remote database.
- Currently they can route 1000-2000 requests/sec on Dual Opteron servers to a 16 parition cluster.
·They like PL/Proxy approach for OLTP because:
- PL/Proxy servers form a scalable and uniform "DB-bus." Proxies are robust because in a redundant configuration if one fails you can just connect to another. And if the proxy tier becomes slow you can add more proxies and load balance between them.
- More partitions can be added to improve performance.
- Only data on a failed partition is unavailable during a failover. All other partitions operate normally.
·PgBouncer is used as a connection pooler for PostgreSQL. PL/Proxy "somewhat wastes connections as it opens connection to each partition from each backend process" so the pooler helps reduce the number of connections.
·Hot-standby servers are created using WAL (Write Ahead Log) shipping. It doesn't appear that these servers can be used for read-only operations.
·More sophisticated organizations often uses an OLTP database system to handle high performance transaction needs and then create seperate systems for more non-transactional needs. For example, an OLAP (Online analytical processing) system is often used for handling complicated analysis and reporting problems. These differ in schema, indexing, etc from the OLTP system. Skype also uses seperate systems for the presentation layer of web applications, sending email, and prining invoices. This requires data be moved from the OLTP to the other systems.
- Initially Slony1 was used to move data to the other systems, but "as the complexity and loads grew Slony1 started to cause us greater and greater pains."
- To solve this problem Skype developed their on lighter weight queueing and replication toolkit called SkyTools.

The proxy approach is interesting and is an architecture we haven't seen previously. Its power comes from the make another level of indirection school of problem solving, which has advantages:
·Applications are independent of the structure of the database servers. That's encapsulated in the proxy servers.
·Applications do not need to change in response to partition, mapping, or other changes.
·Load balancing, failover, and read/write splitting are invisible to applications.

The downsides are:
·Reduced performance. Another hop is added and queries must be parsed to perform all the transparent magic.
·Inability to perform joins and other database operations across partitions.
·Added administration complexity of dealing with proxy configuration and HA for the proxy servers.

It's easy to see how the advantages can outweigh the disadvantages. Without changing your application you can slip in a proxy layer and get a lot of very cool features for what seems like a low cost. If you are a MySQL user and this approach interests you then take a look at MySQL Proxy, which accomplishes something similar in a different sort of way.

Related Articles
·An Unorthodox Approach to Database Design : The Coming of the Shard
·PostgreSQProducts - Scaling infinitely with PL/Proxy
·PL/Proxy
·Heroku also uses PostgreSQL.
·MySQL Proxy
·PostgreSQL cluster: partitioning with plproxy (part I) by Kristo Kaiv'.
·PostgreSQL cluster: partitioning with plproxy (part II) by Kristo Kaiv'.
·PostgreSQL at Skype.
·Skytools database scripting framework & PgQ by Kristo Kaiv'.
·PostgreSQL High Availability.

原文摘自: http://highscalability.com/skype-plans-postgresql-scale-1-billion-users

运维网声明 1、欢迎大家加入本站运维交流群:群②:261659950 群⑤:202807635 群⑦870801961 群⑧679858003
2、本站所有主题由该帖子作者发表,该帖子作者与运维网享有帖子相关版权
3、所有作品的著作权均归原作者享有,请您和我们一样尊重他人的著作权等合法权益。如果您对作品感到满意,请购买正版
4、禁止制作、复制、发布和传播具有反动、淫秽、色情、暴力、凶杀等内容的信息,一经发现立即删除。若您因此触犯法律,一切后果自负,我们对此不承担任何责任
5、所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其内容的准确性、可靠性、正当性、安全性、合法性等负责,亦不承担任何法律责任
6、所有作品仅供您个人学习、研究或欣赏,不得用于商业或者其他用途,否则,一切后果均由您自己承担,我们对此不承担任何法律责任
7、如涉及侵犯版权等问题,请您及时通知我们,我们将立即采取措施予以解决
8、联系人Email:admin@iyunv.com 网址:www.yunweiku.com

所有资源均系网友上传或者通过网络收集,我们仅提供一个展示、介绍、观摩学习的平台,我们不对其承担任何法律责任,如涉及侵犯版权等问题,请您及时通知我们,我们将立即处理,联系人Email:kefu@iyunv.com,QQ:1061981298 本贴地址:https://www.yunweiku.com/thread-302866-1-1.html 上篇帖子: MoSQL:简化MongoDB与PostgreSQL之间的同步 下篇帖子: postgresql集群方案hot standby 安装和测试
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

扫码加入运维网微信交流群X

扫码加入运维网微信交流群

扫描二维码加入运维网微信交流群,最新一手资源尽在官方微信交流群!快快加入我们吧...

扫描微信二维码查看详情

客服E-mail:kefu@iyunv.com 客服QQ:1061981298


QQ群⑦:运维网交流群⑦ QQ群⑧:运维网交流群⑧ k8s群:运维网kubernetes交流群


提醒:禁止发布任何违反国家法律、法规的言论与图片等内容;本站内容均来自个人观点与网络等信息,非本站认同之观点.


本站大部分资源是网友从网上搜集分享而来,其版权均归原作者及其网站所有,我们尊重他人的合法权益,如有内容侵犯您的合法权益,请及时与我们联系进行核实删除!



合作伙伴: 青云cloud

快速回复 返回顶部 返回列表