
[Experience sharing] MongoDB Redis HBase


Posted on 2016-12-19 07:47:48
from :

A classic comparison, worth keeping a copy of here.

In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis, Riak, Membase, Neo4j and HBase:
CouchDB (V1.1.1)

    Written in: Erlang
    Main point: DB consistency, ease of use
    License: Apache
    Protocol: HTTP/REST
    Bi-directional (!) replication,
    continuous or ad-hoc,
    with conflict detection,
    thus, master-master replication. (!)
    MVCC - write operations do not block reads
    Previous versions of documents are available
    Crash-only (reliable) design
    Needs compacting from time to time
    Views: embedded map/reduce
    Formatting views: lists & shows
    Server-side document validation possible
    Authentication possible
    Real-time updates via _changes (!)
    Attachment handling
    thus, CouchApps (standalone js apps)
    jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
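
The MVCC and conflict-detection points above can be illustrated with a small sketch. This is not CouchDB's API or implementation: `TinyDocStore`, `ConflictError` and the revision format (a generation number plus an MD5 digest) are hypothetical names and choices made for illustration only. The idea shown is that a writer must present the revision it last read, and a stale revision is rejected instead of blocking anyone.

```python
import hashlib
import json


class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""


class TinyDocStore:
    """Toy in-memory store that mimics revision-checked updates (hypothetical)."""

    def __init__(self):
        self._docs = {}  # doc_id -> list of (rev, body), oldest first

    @staticmethod
    def _make_rev(generation, body):
        # Assumed revision format for this sketch: "<generation>-<digest>".
        payload = json.dumps(body, sort_keys=True).encode()
        return f"{generation}-{hashlib.md5(payload).hexdigest()}"

    def put(self, doc_id, body, rev=None):
        history = self._docs.get(doc_id, [])
        current_rev = history[-1][0] if history else None
        if rev != current_rev:
            # The writer did not see the latest revision: reject, never block.
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        new_rev = self._make_rev(len(history) + 1, body)
        self._docs[doc_id] = history + [(new_rev, dict(body))]
        return new_rev

    def get(self, doc_id, rev=None):
        history = self._docs[doc_id]
        if rev is None:
            return history[-1]  # latest (rev, body)
        for stored_rev, body in history:
            if stored_rev == rev:
                return stored_rev, body  # an earlier version, still readable
        raise KeyError(rev)


store = TinyDocStore()
first = store.put("contact:1", {"name": "Ada"})
second = store.put("contact:1", {"name": "Ada L."}, rev=first)
```

A second writer that still holds `first` and calls `store.put("contact:1", ..., rev=first)` gets a `ConflictError` and has to re-read and merge. In this toy, old revisions stay around forever; the list above notes that a real database needs compacting from time to time.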
Redis (V2.4)

    Written in: C
    Main point: Blazing fast
    License: BSD
    Protocol: Telnet-like
    Disk-backed in-memory database,
    Currently without disk-swap (VM and Diskstore were abandoned)
    Master-slave replication
    Simple values or hash tables by keys,
    but complex operations like ZREVRANGEBYSCORE.
    INCR & co (good for rate limiting or statistics)
    Has sets (also union/diff/inter)
    Has lists (also a queue; blocking pop)
    Has hashes (objects of multiple fields)
    Sorted sets (high score table, good for range queries)
    Redis has transactions (!)
    Values can be set to expire (as in a cache)
    Pub/Sub lets one implement messaging (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.
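
As a sketch of why INCR plus key expiry is handy for rate limiting: the snippet below uses `FakeRedis`, a hypothetical in-memory stand-in that imitates only the INCR and EXPIRE behaviour, so it runs without a server. `allow_request`, the `rate:` key prefix and the fixed-window policy are illustrative assumptions, not something the original list prescribes.

```python
import time


class FakeRedis:
    """In-memory stand-in imitating only INCR, EXPIRE and key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}   # key -> integer counter
        self._expires = {}  # key -> absolute expiry time

    def _purge_if_expired(self, key):
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key):
        # Like INCR: a missing key counts as 0 before incrementing.
        self._purge_if_expired(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        # Like EXPIRE: only existing keys get a time to live.
        if key in self._values:
            self._expires[key] = self._clock() + seconds


def allow_request(store, client_id, limit, window_seconds):
    """Fixed-window limiter: at most `limit` calls per window per client."""
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window starts the countdown.
        store.expire(key, window_seconds)
    return count <= limit


now = [0.0]  # controllable clock so the example is deterministic
store = FakeRedis(clock=lambda: now[0])
decisions = [allow_request(store, "alice", 3, 10) for _ in range(4)]
```

Here `decisions` ends up as three allowed calls followed by one rejected call; once the clock passes the ten-second window the counter key disappears and the client is allowed again.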
MongoDB

    Written in: C++
    Main point: Retains some friendly properties of SQL. (Query, index)
    License: AGPL (Drivers: Apache)
    Protocol: Custom, binary (BSON)
    Master/slave replication (auto failover with replica sets)
    Sharding built-in
    Queries are JavaScript expressions
    Run arbitrary JavaScript functions server-side
    Better update-in-place than CouchDB
    Uses memory mapped files for data storage
    Performance over features
    Journaling (with --journal) is best turned on
    On 32-bit systems, limited to ~2.5 GB
    An empty database takes up 192 MB
    GridFS to store big data + metadata (not actually an FS)
    Has geospatial indexing

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: Most things that you would do with MySQL or PostgreSQL, in cases where having predefined columns really holds you back.
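
To make "dynamic queries" concrete, here is a toy matcher for ad hoc query documents over plain Python dicts. It is not MongoDB's query engine or driver API: `matches` and `find` are hypothetical helpers, only a handful of comparison operators are covered, and in this sketch a document that lacks a queried field never matches.

```python
import operator

# Small subset of comparison operators, named after MongoDB's.
_OPS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$ne": operator.ne,
    "$in": lambda value, choices: value in choices,
}


def matches(doc, query):
    """Return True if `doc` satisfies every condition in `query`."""
    for field, condition in query.items():
        if field not in doc:
            return False  # simplification made by this sketch
        value = doc[field]
        if isinstance(condition, dict):
            # {"age": {"$gt": 30, "$lt": 40}}: all operators must hold.
            if not all(_OPS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # {"city": "London"}: plain equality.
            return False
    return True


def find(collection, query):
    """Filter a list of dicts with an ad hoc query document."""
    return [doc for doc in collection if matches(doc, query)]


people = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Linus", "age": 28, "city": "Helsinki"},
    {"name": "Grace", "age": 45},  # no "city": schemaless documents may differ
]
over_thirty = find(people, {"age": {"$gt": 30}})
```

The point of the sketch is that the query is just data built at run time, with no predefined columns and no view or map/reduce function to write up front.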
Riak (V1.0)

    Written in: Erlang & C, some JavaScript
    Main point: Fault tolerance
    License: Apache
    Protocol: HTTP/REST or custom binary
    Tunable trade-offs for distribution and replication (N, R, W)
    Pre- and post-commit hooks in JavaScript or Erlang, for validation and security.
    Map/reduce in JavaScript or Erlang
    Links & link walking: use it as a graph database
    Secondary indices: search in metadata
    Large object support (Luwak)
    Comes in "open source" and "enterprise" editions
    Full-text search, indexing, querying with Riak Search server (beta)
    In the process of migrating the storing backend from "Bitcask" to Google's "LevelDB"
    Masterless multi-site replication and SNMP monitoring are commercially licensed

Best used: If you want something Cassandra-like (Dynamo-like), but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt. Could be used as a well-update-able web server.
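
The N, R, W trade-off mentioned above comes down to simple arithmetic, sketched here under a strict-quorum assumption (mechanisms such as hinted handoff are ignored). `quorums_overlap` and `tolerated_failures` are hypothetical helper names, not part of any Riak client.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum shares a replica with every write quorum.

    n: replicas per key, r: replicas that must answer a read,
    w: replicas that must acknowledge a write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    # If r + w > n, any r replicas and any w replicas must intersect,
    # so a read always touches at least one replica with the latest write.
    return r + w > n


def tolerated_failures(n, r, w):
    """Replicas that may be down while reads and writes both still succeed."""
    return n - max(r, w)


balanced = quorums_overlap(3, 2, 2)   # overlapping quorums
fast = quorums_overlap(3, 1, 1)       # faster, but reads may be stale
```

With N=3, choosing R=2 and W=2 gives overlapping quorums and survives one replica being down; R=1 and W=1 lowers latency and survives two, at the cost of possibly stale reads.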
Membase

    Written in: Erlang & C
    Main point: Memcache compatible, but with persistence and clustering
    License: Apache 2.0
    Protocol: memcached plus extensions
    Very fast (200k+/sec) access of data by key
    Persistence to disk
    All nodes are identical (master-master replication)
    Provides memcached-style in-memory caching buckets, too
    Write de-duplication to reduce IO
    Very nice cluster-management web GUI
    Software upgrades without taking the DB offline
    Connection proxy for connection pooling and multiplexing (Moxi)

Best used: Any application where low-latency data access, high concurrency support and high availability are requirements.

For example: Low-latency use-cases like ad targeting or highly-concurrent web apps like online gaming (e.g. Zynga).
Neo4j (V1.5M02)

    Written in: Java
    Main point: Graph database - connected data
    License: GPL, some features AGPL/commercial
    Protocol: HTTP/REST (or embedding in Java)
    Standalone, or embeddable into Java applications
    Full ACID conformity (including durable data)
    Both nodes and relationships can have metadata
    Integrated pattern-matching-based query language ("Cypher")
    Also the "Gremlin" graph traversal language can be used
    Indexing of nodes and relationships
    Nice self-contained web admin
    Advanced path-finding with multiple algorithms
    Indexing of keys and relationships
    Optimized for reads
    Has transactions (in the Java API)
    Scriptable in Groovy
    Online backup, advanced monitoring and High Availability are AGPL/commercial licensed

Best used: For graph-style, rich or complex, interconnected data. Neo4j is quite different from the others in this sense.

For example: Social relations, public transport links, road maps, network topologies.
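
As a rough idea of what path-finding over connected data means, here is a plain breadth-first search on an adjacency dict. It does not use Neo4j, Cypher or Gremlin; `shortest_path` and the `friends` data are made up for illustration, and a graph database would run this kind of traversal inside the engine.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue  # already reached by a path at least as short
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the back-pointers from goal to start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Directed "knows" relationships between people (hypothetical data).
friends = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": ["eve"],
}
route = shortest_path(friends, "ann", "eve")
```

In a relational store each hop would typically be another self-join, which is why the list above singles out graph-style data as a different kind of problem.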
Cassandra

    Written in: Java
    Main point: Best of BigTable and Dynamo
    License: Apache
    Protocol: Custom, binary (Thrift)
    Tunable trade-offs for distribution and replication (N, R, W)
    Querying by column, range of keys
    BigTable-like features: columns, column families
    Has secondary indices
    Writes are much faster than reads (!)
    Map/reduce possible with Apache Hadoop
    I admit to being a bit biased against it, because of its bloat and complexity, which is partly due to Java (configuration, seeing exceptions, etc.)

Best used: When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.
HBase

(With the help of ghshephard)

    Written in: Java
    Main point: Billions of rows X millions of columns
    License: Apache
    Protocol: HTTP/REST (also Thrift)
    Modeled after BigTable
    Map/reduce with Hadoop
    Query predicate push down via server side scan and get filters
    Optimizations for real time queries
    A high performance Thrift gateway
    HTTP supports XML, Protobuf, and binary
    Cascading, Hive, and Pig source and sink modules
    JRuby-based (JIRB) shell
    No single point of failure
    Rolling restart for configuration changes and minor upgrades
    Random access performance is like MySQL

Best used: If you're in love with BigTable. And when you need random, realtime read/write access to your Big Data.

For example: Facebook Messaging Database (more general example coming soon)
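
The BigTable-style model and the scan filters listed above can be pictured with a toy in-memory table: rows are visited in row-key order, each row holds a sparse set of `family:qualifier` columns, and a scan walks a key range while applying a predicate next to the data. `scan`, the sample table and the predicate are hypothetical; this is not the HBase client API.

```python
def scan(table, start_row, stop_row, predicate=None):
    """Yield (row_key, columns) for start_row <= key < stop_row, in key order.

    `predicate` plays the role of a server-side filter: rows it rejects
    are never handed back to the caller.
    """
    for key in sorted(table):
        if key < start_row:
            continue
        if key >= stop_row:
            break  # keys are sorted, so nothing further can match
        columns = table[key]
        if predicate is None or predicate(columns):
            yield key, columns


# Sparse rows: each row stores only the columns it actually has.
table = {
    "user#001": {"info:name": "Ada", "msg:count": 12},
    "user#002": {"info:name": "Bob"},
    "user#010": {"info:name": "Cy", "msg:count": 3},
}

in_range = [key for key, _ in scan(table, "user#001", "user#010")]
busy = [
    key
    for key, _ in scan(
        table, "user#000", "user#999",
        predicate=lambda cols: cols.get("msg:count", 0) > 5,
    )
]
```

Because rows are ordered by key, designing the row key (here a zero-padded user id) decides which range scans are cheap.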

Of course, all systems have many more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is very fast, so things are bound to change. I'll do my best to keep this list updated.

-- Kristof
