Cassandra data modelling: fewer than 1000 records to fit in one row
We have an entity that is uniquely identified by a generated UUID. We need to support a find-by-name query, and we need to support sorting by name.
We know that there will be no more than 1000 entities of this type, which can fit in one row. Is it a viable idea to hardcode the partition key, use name as a clustering key, and add id as a second clustering key to satisfy uniqueness? Let's say we need a school entity. Here is an example:
create table school (
    constant text,
    name text,
    id uuid,
    description text,
    location text,
    primary key ((constant), name, id)
);
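Our initial query will be "give me all schools"; filtering by exact name will also happen. Our reasoning is to place all schools in a single row for fast access, to have name as a clustering column for filtering, and to have id as a clustering column to guarantee uniqueness. We can use constant = 'school' as a known hardcoded value to access this row.
What I like about this solution is that all values are in one row, so we get fast reads. We can also solve sorting easily with the clustering column. What I do not like is the hardcoded value for constant, which seems odd. We could use name as the partition key instead, but then we would have 1000 records spread across a couple of partitions, finding all schools without a name would be slower, and the result would not be sorted.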
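The two access patterns described above could look roughly like this in CQL. This is a sketch against the school table from the question; the school name is a made-up value for illustration.

```cql
-- All schools, returned in clustering order (sorted by name, then id).
SELECT name, id, description, location
FROM school
WHERE constant = 'school';

-- Schools with an exact name. 'Springfield Elementary' is a hypothetical value.
SELECT name, id, description, location
FROM school
WHERE constant = 'school'
  AND name = 'Springfield Elementary';
```

Both queries restrict the partition key and, in the second case, the first clustering column, so neither needs ALLOW FILTERING or a secondary index.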
Question 1
Is this a viable solution, and are there any problems we do not see? I did not see any example of Cassandra data modelling with a hardcoded primary key, which is the reason we are doubting this solution.
Question 2
Name is an editable field, so it may be changed (someone can make a typo, or a school can change its name). What is the best way to achieve this? Delete and then insert inside a batch (an LWT can be applied to the same row with a conditional clause)?
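The delete-then-insert idea from question 2 could be sketched as follows. Because name is part of the primary key it cannot be updated in place, so the old row is removed and a new one written. The UUID, names, description and location below are placeholder values, not data from the question.

```cql
BEGIN BATCH
  -- Remove the row stored under the old (misspelled) name.
  DELETE FROM school
  WHERE constant = 'school'
    AND name = 'Sprngfield Elementary'
    AND id = 123e4567-e89b-12d3-a456-426614174000;

  -- Re-create the row under the corrected name, keeping the same id.
  -- The non-key columns must be written again, since the delete removes them.
  INSERT INTO school (constant, name, id, description, location)
  VALUES ('school', 'Springfield Elementary',
          123e4567-e89b-12d3-a456-426614174000,
          'placeholder description', 'placeholder location');
APPLY BATCH;
```

Both statements target the same partition (constant = 'school'), so the batch is applied as a single mutation to that partition rather than being coordinated across several partitions.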
Yes, this is a good approach for such a small dataset. Just because Cassandra can partition large datasets across multiple nodes does not mean that you need to use that ability for every table. By using a constant for the partition key, you are telling Cassandra that you want the data stored on one node, where you can access it quickly and in sorted order. Relational databases act on data in a single node all the time, so this is not such an unusual thing to do.
For safety you will probably want to use a replication factor higher than 1, so that there are at least 2 copies of the single partition. That way you will not lose access to the data if the one node where it was stored went down.
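Replication is configured per keyspace, so the advice above might translate into something like the following. The keyspace name school_ks and the factor of 3 are assumptions for illustration, and SimpleStrategy is used only to keep the sketch short; multi-datacenter clusters normally use NetworkTopologyStrategy.

```cql
-- Hypothetical keyspace holding the school table, with 3 replicas of every partition.
CREATE KEYSPACE IF NOT EXISTS school_ks
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```

With three replicas, the single 'school' partition stays readable at consistency level ONE or QUORUM even if one of the nodes holding it is down.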
This approach could cause problems if you expect to have a lot of clients (i.e. thousands of clients) reading and writing to this table, since it could become a hot spot. With only 1000 records you can probably keep all the rows cached in memory by setting the table to cache all keys and rows.
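The caching suggestion could be applied with a table option along these lines. This is a sketch that assumes a Cassandra version using the map-style caching option (2.1 or later).

```cql
-- Cache all partition keys and all rows of each partition for this table.
ALTER TABLE school
WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
```

Note that the row cache is disabled at the node level by default, so this table option only takes effect once a non-zero row cache size is configured in cassandra.yaml.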
You probably won't find a lot of examples where this is done, because people move to Cassandra for its support of large datasets, where they want the scalability that comes from using multiple partitions. So the examples are geared towards that.