
[Experience Sharing] Apache Cassandra Learning Step by Step (3): Samples ABC


Posted on 2017-1-11 10:27:21
  ====16 Feb 2012, by Bright Zheng (IT进行时)====
4. Samples ABC
We try to learn it step by step, to understand the concepts and the Java API usage by means of:
1. Concept Introduction
2. CLI
3. Java Sample Code
4.1. Get a Single Column by a Key
4.1.1. Sample Code
public QueryResult<HColumn<String, String>> execute() {
    // 'keyspace' is defined outside this method
    ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keyspace);
    columnQuery.setColumnFamily("Npanxx");  // column family to read from
    columnQuery.setKey("512204");           // row key
    columnQuery.setName("city");            // the single column to fetch
    QueryResult<HColumn<String, String>> result = columnQuery.execute();
    return result;
}
4.1.2. Sample Code Run by Maven
C:\projects_learning\learning-cassandra-tutorial>mvn -e exec:java -Dexec.args="get" -Dexec.mainClass="com.datastax.tutorial.TutorialRunner"
The output is:
[INFO] --- exec-maven-plugin:1.1.2-Beta1:java (default-cli) @ cassandra-tutorial ---
HColumn(city=Austin)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
4.1.3. CLI
[default@Tutorial] get Npanxx['512204']['city'];
=> (column=city, value=Austin, timestamp=1329234388328000)
Elapsed time: 16 msec(s).
4.2. Get Multiple Columns by a Key
4.2.1. Sample Code
public QueryResult<ColumnSlice<Long, String>> execute() {
    SliceQuery<String, Long, String> sliceQuery =
        HFactory.createSliceQuery(keyspace, stringSerializer, longSerializer, stringSerializer);
    sliceQuery.setColumnFamily("StateCity");
    sliceQuery.setKey("TX Austin");

    // way 1: set multiple column names
    sliceQuery.setColumnNames(202L, 203L, 204L);

    // way 2: use setRange
    // change 'reversed' to true to get the columns in reverse order
    //sliceQuery.setRange(202L, 204L, false, 5);

    QueryResult<ColumnSlice<Long, String>> result = sliceQuery.execute();
    return result;
}
4.2.2. Sample Code Run by Maven
C:\projects_learning\learning-cassandra-tutorial>mvn -e exec:java -Dexec.args="get_slice_sc" -Dexec.mainClass="com.datastax.tutorial.TutorialRunner"
The output is:
[INFO] --- exec-maven-plugin:1.1.2-Beta1:java (default-cli) @ cassandra-tutorial ---
ColumnSlice([HColumn(202=30.27x097.74), HColumn(203=30.27x097.74), HColumn(204=30.32x097.73)])
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
4.2.3. CLI (TODO)
TODO: According to the CLI syntax, it seems Cassandra cannot fetch multiple named columns with a single 'get' command?
4.3. Get Multiple Rows by a Set of Keys
4.3.1. Sample Code
public QueryResult<Rows<String, String, String>> execute() {
    MultigetSliceQuery<String, String, String> multigetSlicesQuery =
        HFactory.createMultigetSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
    multigetSlicesQuery.setColumnFamily("Npanxx");
    multigetSlicesQuery.setColumnNames("city", "state", "lat", "lng");
    multigetSlicesQuery.setKeys("512202", "512203", "512205", "512206");
    QueryResult<Rows<String, String, String>> results = multigetSlicesQuery.execute();
    return results;
}
4.3.2. Sample Code Run by Maven
C:\projects_learning\learning-cassandra-tutorial>mvn -e exec:java -Dexec.args="multiget_slice" -Dexec.mainClass="com.datastax.tutorial.TutorialRunner"
The output is:
[INFO] --- exec-maven-plugin:1.2:java (default-cli) @ cassandra-tutorial ---
Rows({
512205=Row(512205,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73), HColumn(state=TX)])),
512206=Row(512206,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73), HColumn(state=TX)])),
512203=Row(512203,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.27), HColumn(lng=097.74), HColumn(state=TX)])),
512202=Row(512202,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.27), HColumn(lng=097.74), HColumn(state=TX)]))})
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
4.3.3. CLI (TODO)
TODO: N/A?
4.4. Get Slices from a Range of Rows by Key
4.4.1. Sample Code
GetRangeSlicesForStateCity.java
public QueryResult<OrderedRows<String, String, String>> execute() {
    RangeSlicesQuery<String, String, String> rangeSlicesQuery =
        HFactory.createRangeSlicesQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
    rangeSlicesQuery.setColumnFamily("Npanxx");
    rangeSlicesQuery.setColumnNames("city", "state", "lat", "lng");
    // start key and end key of the range
    rangeSlicesQuery.setKeys("512202", "512205");
    // return at most 5 rows
    rangeSlicesQuery.setRowCount(5);
    QueryResult<OrderedRows<String, String, String>> results = rangeSlicesQuery.execute();
    return results;
}
Important Note: The result is actually NOT meaningful as a key range. You might expect the 4 rows 512202 to 512205, but that is not what comes back, because with RandomPartitioner the rows are ordered by the MD5 hash of the key, not by the key itself. The partitioner can be configured in conf/cassandra.yaml, but changing it is not recommended. The actual result is shown under "Sample Code Run by Maven".
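To make the ordering issue concrete, here is a minimal sketch that sorts the sample keys by an MD5-derived token instead of by the key string. It assumes the token is the absolute value of the MD5 digest read as a big integer, which is how RandomPartitioner is usually described. TokenOrderSketch and its methods are made-up names, and the code does not talk to Cassandra.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of Hector or Cassandra.
class TokenOrderSketch {

    // Assumption: token = |MD5(key bytes)| interpreted as a signed big integer.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    // Returns a copy of the keys sorted by token rather than by key text.
    static List<String> sortByToken(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparing(TokenOrderSketch::token));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("512202", "512203", "512204", "512205", "512206");
        for (String key : sortByToken(keys)) {
            System.out.println(key + " -> " + token(key));
        }
    }
}
```

Under this assumption, a "range" from one key to another walks the token order, so keys that are numerically adjacent can land far apart or outside the range.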
4.4.2. Sample Code Run by Maven
C:\projects_learning\learning-cassandra-tutorial>mvn -e exec:java -Dexec.args="get_range_slices" -Dexec.mainClass="com.datastax.tutorial.TutorialRunner"
The output is:
[INFO] --- exec-maven-plugin:1.2:java (default-cli) @ cassandra-tutorial ---
Rows({
512202=Row(512202,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.27), HColumn(lng=097.74), HColumn(state=TX)])),
512206=Row(512206,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73), HColumn(state=TX)])),
512205=Row(512205,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73), HColumn(state=TX)]))
})
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
4.4.3. CLI (TODO)
TODO: N/A
4.5. Get Slices from a Range of Rows by Columns
4.5.1. Sample Code
GetSliceForAreaCodeCity.java
public QueryResult<ColumnSlice<String, String>> execute() {
    SliceQuery<String, String, String> sliceQuery =
        HFactory.createSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
    sliceQuery.setColumnFamily("AreaCode");
    sliceQuery.setKey("512");
    // change the 'reversed' argument to true to get the columns in descending order
    // (for a reversed slice, start must be the larger bound)
    // gets at most 5 columns "between" Austin and Austin__204 according to the comparator
    sliceQuery.setRange("Austin", "Austin__204", false, 5);

    QueryResult<ColumnSlice<String, String>> result = sliceQuery.execute();

    return result;
}
4.5.2. Sample Code Run by Maven
C:\projects_learning\learning-cassandra-tutorial>mvn -e exec:java -Dexec.args="get_slice_acc" -Dexec.mainClass="com.datastax.tutorial.TutorialRunner"
The output is:
[INFO] --- exec-maven-plugin:1.2:java (default-cli) @ cassandra-tutorial ---
ColumnSlice([
HColumn(Austin__202=30.27x097.74),
HColumn(Austin__203=30.27x097.74),
HColumn(Austin__204=30.32x097.73)
])
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
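The slice above behaves much like a bounded scan over a sorted map of column names. The sketch below mimics the four arguments of setRange (start, finish, reversed, count) with a TreeMap. It assumes the row for key '512' also holds Austin__205 and Austin__206 (those two columns and their values are made up for illustration) and that column names compare as plain strings. ColumnRangeSketch is an illustrative name, not Hector API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one wide row; not Hector API.
class ColumnRangeSketch {

    // Returns up to 'count' column names between start and finish (both inclusive).
    // For a reversed scan, start is expected to be the larger bound.
    static List<String> slice(NavigableMap<String, String> row,
                              String start, String finish, boolean reversed, int count) {
        NavigableMap<String, String> range = reversed
                ? row.subMap(finish, true, start, true).descendingMap()
                : row.subMap(start, true, finish, true);
        List<String> names = new ArrayList<>();
        for (String name : range.keySet()) {
            if (names.size() == count) {
                break;
            }
            names.add(name);
        }
        return names;
    }

    // Sample row; Austin__205 and Austin__206 are assumed, not taken from the output above.
    static NavigableMap<String, String> sampleRow() {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("Austin__202", "30.27x097.74");
        row.put("Austin__203", "30.27x097.74");
        row.put("Austin__204", "30.32x097.73");
        row.put("Austin__205", "30.32x097.73");
        row.put("Austin__206", "30.32x097.73");
        return row;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = sampleRow();
        // forward: everything from "Austin" up to and including "Austin__204"
        System.out.println(slice(row, "Austin", "Austin__204", false, 5));
        // reversed: bounds swapped, at most 2 columns, largest name first
        System.out.println(slice(row, "Austin__204", "Austin", true, 2));
    }
}
```

Because "Austin" is a prefix of every "Austin__NNN" name, it sorts before all of them, which is why it works as an open lower bound.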
4.5.3. CLI
N/A
4.6. Get Slices from Indexed Columns
4.6.1. Sample Code
GetIndexedSlicesForCityState.java
public QueryResult<OrderedRows<String, String, String>> execute() {
    IndexedSlicesQuery<String, String, String> indexedSlicesQuery =
        HFactory.createIndexedSlicesQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
    indexedSlicesQuery.setColumnFamily("Npanxx");
    indexedSlicesQuery.setColumnNames("city", "lat", "lng");
    indexedSlicesQuery.addEqualsExpression("state", "TX");
    indexedSlicesQuery.addEqualsExpression("city", "Austin");
    indexedSlicesQuery.addGteExpression("lat", "30.30");
    QueryResult<OrderedRows<String, String, String>> result = indexedSlicesQuery.execute();

    return result;
}
4.6.2. Sample Code Run by Maven
The output is:
[INFO] --- exec-maven-plugin:1.2:java (default-cli) @ cassandra-tutorial ---
Rows({
512204=Row(512204,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73)])),
512206=Row(512206,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73)])),
512205=Row(512205,ColumnSlice([HColumn(city=Austin), HColumn(lat=30.32), HColumn(lng=097.73)]))
})
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
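Conceptually, the indexed query keeps the rows that satisfy every expression. The sketch below applies the same three expressions to an in-memory copy of the sample rows seen in the outputs so far. It assumes the values compare as plain strings, so the test on lat is lexicographic and matches numeric order only when the strings have the same shape. IndexedFilterSketch is an illustrative name, not Hector API, and the result order here is simply insertion order, not the order Cassandra returns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory filter for illustration; not how Cassandra evaluates indexes.
class IndexedFilterSketch {

    static Map<String, String> row(String city, String state, String lat, String lng) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("city", city);
        columns.put("state", state);
        columns.put("lat", lat);
        columns.put("lng", lng);
        return columns;
    }

    // Sample data copied from the outputs shown in the earlier sections.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("512202", row("Austin", "TX", "30.27", "097.74"));
        rows.put("512203", row("Austin", "TX", "30.27", "097.74"));
        rows.put("512204", row("Austin", "TX", "30.32", "097.73"));
        rows.put("512205", row("Austin", "TX", "30.32", "097.73"));
        rows.put("512206", row("Austin", "TX", "30.32", "097.73"));
        return rows;
    }

    // Keeps the keys whose row has the given state and city and a lat >= minLat (string comparison).
    static List<String> matchingKeys(Map<String, Map<String, String>> rows,
                                     String state, String city, String minLat) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : rows.entrySet()) {
            Map<String, String> columns = entry.getValue();
            String lat = columns.get("lat");
            boolean matches = state.equals(columns.get("state"))
                    && city.equals(columns.get("city"))
                    && lat != null
                    && lat.compareTo(minLat) >= 0;
            if (matches) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(matchingKeys(sampleRows(), "TX", "Austin", "30.30"));
        // Caveat of string comparison: "9.5" sorts after "30.30"
        System.out.println("9.5".compareTo("30.30") > 0);
    }
}
```

Storing numbers as fixed-width strings, as the sample data does with values like 097.73, is one way to keep lexicographic and numeric order aligned.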
4.6.3. CLI
[default@Tutorial] get npanxx where state='TX' and city='Austin' and lat>'30.30';
-------------------
RowKey: 512204
=> (column=city, value=Austin, timestamp=1329299521508000)
=> (column=lat, value=30.32, timestamp=1329299521540000)
=> (column=lng, value=097.73, timestamp=1329299521555000)
=> (column=state, value=TX, timestamp=1329299521524000)
-------------------
RowKey: 512206
=> (column=city, value=Austin, timestamp=1329299521618000)
=> (column=lat, value=30.32, timestamp=1329299521633000)
=> (column=lng, value=097.73, timestamp=1329299522491000)
=> (column=state, value=TX, timestamp=1329299521618000)
-------------------
RowKey: 512205
=> (column=city, value=Austin, timestamp=1329299521555000)
=> (column=lat, value=30.32, timestamp=1329299521586000)
=> (column=lng, value=097.73, timestamp=1329299521602000)
=> (column=state, value=TX, timestamp=1329299521571000)

3 Rows Returned.
Elapsed time: 16 msec(s).
 
4.7. Insertion
4.7.1. Sample Code
InsertRowsForColumnFamilies.java
public QueryResult<?> execute() {
    Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);

    // StateCity uses Long column names, so the name and value serializers are passed explicitly
    mutator.addInsertion("CA Burlingame", "StateCity", HFactory.createColumn(650L, "37.57x122.34", longSerializer, stringSerializer));
    mutator.addInsertion("650", "AreaCode", HFactory.createStringColumn("Burlingame__650", "37.57x122.34"));
    mutator.addInsertion("650222", "Npanxx", HFactory.createStringColumn("lat", "37.57"));
    mutator.addInsertion("650222", "Npanxx", HFactory.createStringColumn("lng", "122.34"));
    mutator.addInsertion("650222", "Npanxx", HFactory.createStringColumn("city", "Burlingame"));
    mutator.addInsertion("650222", "Npanxx", HFactory.createStringColumn("state", "CA"));

    // all queued insertions are sent together
    MutationResult mr = mutator.execute();
    return null;
}
4.7.2. Sample Code Run by Maven
Omitted
4.7.3. CLI
[default@Tutorial] set StateCity['CA Burlingame']['650']='37.57x122.34';
[default@Tutorial] set AreaCode['650']['Burlingame__650']='37.57x122.34';
[default@Tutorial] set Npanxx['650222']['lat']='37.57';

4.8. Deletion
4.8.1. Sample Code
InsertRowsForColumnFamilies.java
public QueryResult<?> execute() {
    Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);

    // Mutator.addDeletion(String key, String cf, String columnName, Serializer<String> nameSerializer)
    // columnName as null means to delete the whole row.
    mutator.addDeletion("CA Burlingame", "StateCity", null, stringSerializer);
    mutator.addDeletion("650", "AreaCode", null, stringSerializer);
    mutator.addDeletion("650222", "Npanxx", null, stringSerializer);
    // adding a non-existent key like the following will cause the insertion of a tombstone
    // mutator.addDeletion("652", "AreaCode", null, stringSerializer);
    MutationResult mr = mutator.execute();
    return null;
}
4.8.2. Sample Code Run by Maven
Omitted…
4.8.3. CLI
[default@Tutorial] del StateCity['CA Burlingame'];
[default@Tutorial] del AreaCode['650'];
[default@Tutorial] del Npanxx['650222'];
Important Note: Whether you use Java code or the CLI, a deletion still leaves the deleted row key in place, marked as a tombstone (hehe, a gravestone, a really good name), and the 'list' command still returns it, like this:
[default@Tutorial] list StateCity;
Using default limit of 100
-------------------
RowKey: CA Burlingame
-------------------
RowKey: TX Austin
=> (column=202, value=30.27x097.74, timestamp=1329297768323000)
=> (column=203, value=30.27x097.74, timestamp=1329297768338000)
=> (column=204, value=30.32x097.73, timestamp=1329297768354000)
=> (column=205, value=30.32x097.73, timestamp=1329297768370000)
=> (column=206, value=30.32x097.73, timestamp=1329297768385000)

2 Rows Returned.
Elapsed time: 16 msec(s).
As you can see, two rows are returned, even though the row 'CA Burlingame' has been deleted.
Even worse, deleting a non-existent key causes the so-called 'insertion of a tombstone', which adds one more row key to the Column Family!!!
Fortunately, the 'get' command no longer retrieves the deleted row:
[default@Tutorial] get StateCity['CA Burlingame'];
Returned 0 results.
Elapsed time: 0 msec(s).
 
Go deeper? Please read on.
When will Cassandra remove these tombstones? As far as I know, there are two ways:
1. Wait until gc_grace_seconds has elapsed (not verified yet)
gc_grace_seconds is set per CF and can be updated without a restart.
How to get gc_grace_seconds? Simply use the CLI:
[default@Tutorial] show schema;

create column family StateCity
  with column_type = 'Standard'
  and comparator = 'LongType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000   // 10 days, OMG
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'ConcurrentLinkedHashCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';
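For a sense of scale, gc_grace is expressed in seconds. The small sketch below converts the value from the schema above into days and computes the earliest moment a tombstone would become eligible for removal. The deletion time is a made-up example and GcGraceSketch is an illustrative name.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only.
class GcGraceSketch {

    // gc_grace value from the StateCity schema, in seconds.
    static final long GC_GRACE_SECONDS = 864000L;

    // Converts a gc_grace value in seconds to whole days.
    static long gcGraceDays(long seconds) {
        return Duration.ofSeconds(seconds).toDays();
    }

    // Earliest instant at which a tombstone written at 'deletedAt' is old enough to be dropped.
    static Instant purgeableAfter(Instant deletedAt, long gcGraceSeconds) {
        return deletedAt.plusSeconds(gcGraceSeconds);
    }

    public static void main(String[] args) {
        Instant deletedAt = Instant.parse("2012-02-15T00:00:00Z"); // hypothetical deletion time
        System.out.println(gcGraceDays(GC_GRACE_SECONDS) + " days");
        System.out.println(purgeableAfter(deletedAt, GC_GRACE_SECONDS));
    }
}
```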

 
2. The compaction event (under investigation, but no luck yet)
Compaction is triggered automatically.
But how to trigger compaction manually? Use nodetool:
C:\java\apache-cassandra-1.0.7\bin>nodetool -h localhost flush Tutorial
Starting NodeTool
C:\java\apache-cassandra-1.0.7\bin>nodetool -h localhost compact Tutorial
Starting NodeTool
Then we can see some log messages in the Cassandra console.
But as I found, the tombstones are still there. (WHY???)
C:\java\apache-cassandra-1.0.7\bin>sstable2json ..\runtime\data\Tutorial\StateCity-hc-9-Data.db
{
"4341204275726c696e67616d65":[["650","37.57x122.34",1329316454906000]],
"54582041757374696e":[["202","30.27x097.74",1329297768323000],["203","30.27x097.74",1329297768338000],["204","30.32x097.73",1329297768354000],["205","30.32x097.73",1329297768370000],["206","30.32x097.73",1329297768385000]],
"616263": []
}
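The row keys in the sstable2json output are hex-encoded bytes. A small sketch, assuming the keys are UTF-8 text as the key_validation_class in the schema suggests, decodes them back into readable strings. HexKeySketch is an illustrative helper, not a Cassandra tool.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not shipped with Cassandra.
class HexKeySketch {

    // Decodes a hex string (two characters per byte) into UTF-8 text.
    static String decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] keys = {"4341204275726c696e67616d65", "54582041757374696e", "616263"};
        for (String key : keys) {
            System.out.println(key + " -> " + decode(key));
        }
    }
}
```

The three keys decode to CA Burlingame, TX Austin and abc, the same row keys the CLI 'list' command reports.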
And they still appear in the list command. (Argh, a ghost that refuses to leave? Big why???)
[default@Tutorial] list statecity;
Using default limit of 100
-------------------
RowKey: CA Burlingame
-------------------
RowKey: TX Austin
=> (column=202, value=30.27x097.74, timestamp=1329297768323000)
=> (column=203, value=30.27x097.74, timestamp=1329297768338000)
=> (column=204, value=30.32x097.73, timestamp=1329297768354000)
=> (column=205, value=30.32x097.73, timestamp=1329297768370000)
=> (column=206, value=30.32x097.73, timestamp=1329297768385000)
-------------------
RowKey: abc
 
3 Rows Returned.
Elapsed time: 31 msec(s).
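One plausible reading of the behaviour above: a delete writes a tombstone, a read by key hides it, a range scan such as 'list' can still report the bare key, and compaction only drops a tombstone once it is older than gc_grace. The toy model below illustrates these rules under that assumption. It is a deliberately simplified sketch with hypothetical names, not how Cassandra is implemented, and it has not been checked against the cluster used in this post.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a column family with tombstones; illustration only.
class TombstoneSketch {

    // key -> value, or null when the row is only a tombstone
    private final Map<String, String> values = new TreeMap<>();
    // key -> deletion time in seconds, present only for tombstoned rows
    private final Map<String, Long> deletedAt = new TreeMap<>();

    void put(String key, String value) {
        values.put(key, value);
        deletedAt.remove(key);
    }

    // A delete records a tombstone instead of removing data, even for a key that never existed.
    void delete(String key, long nowSeconds) {
        values.put(key, null);
        deletedAt.put(key, nowSeconds);
    }

    // Read by key: a tombstoned row looks empty.
    String get(String key) {
        return values.get(key);
    }

    // Range scan: tombstoned keys still show up, without data.
    List<String> listKeys() {
        return new ArrayList<>(values.keySet());
    }

    // Compaction drops only tombstones that are at least gcGraceSeconds old.
    void compact(long nowSeconds, long gcGraceSeconds) {
        Iterator<Map.Entry<String, Long>> it = deletedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowSeconds - entry.getValue() >= gcGraceSeconds) {
                values.remove(entry.getKey());
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        TombstoneSketch cf = new TombstoneSketch();
        cf.put("TX Austin", "30.27x097.74");
        cf.put("CA Burlingame", "37.57x122.34");
        cf.delete("CA Burlingame", 0L);
        cf.delete("abc", 0L);            // never inserted, still leaves a tombstone
        cf.compact(60L, 864000L);        // one minute later: too early to purge
        System.out.println(cf.listKeys());
        cf.compact(864000L, 864000L);    // gc_grace has elapsed
        System.out.println(cf.listKeys());
    }
}
```

If this reading is right, running nodetool compact shortly after a delete would not be expected to remove the empty row keys, because the 10 days of gc_grace have not passed yet.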
 
 
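A few grumbles here:
1. Perhaps because I have not dug deep enough yet, the CLI feels rather weak; it suits initial schema modeling (DDL) and simple data analysis.
2. The tombstone cleanup question has not been fully verified yet. I am parking it as an open case for now and will add to or correct this post once I have the answer.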
 
