Schema Management in Cassandra
Starting with Cassandra 0.7, schema management is straightforward: it behaves much like centralized schema management, but with no single point of failure (SPoF). Typical schema operations are loading the initial schema, changing an existing schema (adding column families (CFs) or modifying the attributes of existing ones), and dropping schema elements such as CFs and keyspaces.
There are three ways to perform these operations:
Load schema from cassandra.yaml using schematool or the JMX console: This option loads the schema only once. Running it a second time in a cluster has no effect, so it is best suited to loading the initial schema.
schematool import
OR
JConsole: MBeans -> org.apache.cassandra.db -> StorageService -> Operations -> loadSchemaFromYAML
Create/Modify schema using Thrift APIs: This option is highly flexible and suits applications that need to create or drop keyspaces and column families on the fly. You cannot modify existing ColumnFamilies using the APIs. Refer to Cassandra Wiki - API for details of the available APIs. The following APIs are available:
describe_keyspace
describe_keyspaces
system_add_column_family
system_drop_column_family
system_add_keyspace
system_drop_keyspace
Create/Modify schema using cassandra-cli: This is the most flexible option available. It allows practically everything that options #1 and #2 allow together. To list the supported commands, enter "help;" in cassandra-cli. For details of a specific command, type "help <command>;", for example "help create keyspace;". The following operations are supported:
Describe keyspace
Show list of keyspaces
Add a new keyspace with the specified attribute(s) and value(s)
Update a keyspace with the specified attribute(s) and value(s)
Create a new column family with the specified attribute(s) and value(s)
Update a column family with the specified attribute(s) and value(s)
Delete a keyspace
Delete a column family
Under the hood
The Cassandra Wiki - Schema Updates page describes these operations in detail. The following is a high-level summary:
Cassandra uses the Schema and Migrations column families in the system keyspace to maintain the schema and the changes to the schema, respectively.
Schema changes made on one node are propagated to the other nodes in the cluster.
The Migrations CF tracks individual changes to the schema. The Schema CF contains a reference to the latest version in use.
Some manual cleanup may be needed if a node crashes while schema changes are being applied to the cluster.
To avoid concurrency issues, always push schema changes through one node.
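The summary above can be pictured with a small toy model. This is only a sketch in Python, not Cassandra's implementation: the `Node` class and its method names are hypothetical, the migration log stands in for the Migrations CF, and the `version` property stands in for the Schema CF's reference to the latest version. It assumes each change gets a time-based UUID as its version id, as the version ids in the logs later in this post suggest.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a node: an ordered migration log."""
    name: str
    migrations: list = field(default_factory=list)  # [(version, description), ...]

    @property
    def version(self):
        # Plays the role of the Schema CF's reference to the latest version.
        return self.migrations[-1][0] if self.migrations else None

    def apply_local_change(self, description):
        # A schema change made on this node gets a fresh time-based version id.
        new_version = uuid.uuid1()
        self.migrations.append((new_version, description))
        return new_version

    def updates_since(self, version):
        # Return every migration recorded after `version` (all of them if None).
        if version is None:
            return list(self.migrations)
        known = [v for v, _ in self.migrations]
        if version not in known:
            # The peer changed its schema independently: histories have diverged.
            raise ValueError("unknown version %s: schemas have diverged" % version)
        return self.migrations[known.index(version) + 1:]

    def catch_up_from(self, peer):
        # The lagging node asks the peer for everything after its own version.
        missing = peer.updates_since(self.version)
        self.migrations.extend(missing)
        return len(missing)


# All changes go through node "a"; "b" and "c" only pull updates.
a, b, c = Node("a"), Node("b"), Node("c")
a.apply_local_change("add keyspace KeyspaceMigration")
a.apply_local_change("drop keyspace KeyspaceMigration")
for follower in (b, c):
    follower.catch_up_from(a)
print(a.version == b.version == c.version)
```

In this model, a node that made its own change while another node did the same cannot be caught up by simply replaying the missing entries, which is one way to see why pushing all schema changes through a single node avoids trouble.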
Examples
Dropping a Keyspace
Connect to cassandra-cli on a node and run the drop keyspace command.
[root@rwc-sb6240-1 bin]# ./cassandra-cli
Welcome to cassandra CLI.
Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown] connect 20.17.221.19/9160;
Connected to: "NarenCluster072" on 20.17.221.19/9160
[default@unknown] drop keyspace KeyspaceMigration;
5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
[default@unknown] exit;
[root@rwc-sb6240-1 bin]#
With DEBUG logging enabled, the log on that node shows the following events:
DEBUG [pool-1-thread-151] 2011-03-09 11:21:03,334 CassandraServer.java (line 759) drop_keyspace
...
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,146 CompactionManager.java (line 109) Checking to see if compaction of Schema would be useful
DEBUG [MigrationStage:1] 2011-03-09 11:21:04,146 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,147 CompactionManager.java (line 109) Checking to see if compaction of Migrations would be useful
DEBUG [ReadStage:14] 2011-03-09 11:21:04,150 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:15] 2011-03-09 11:21:04,151 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
...
DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 628) My version is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 659) Schemas are in agreement.
DEBUG [MigrationStage:1] 2011-03-09 11:21:03,343 Table.java (line 397) applying mutation of row 35666261336631662d346138322d313165302d623865652d663930663861336635653166
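The long row key in the "applying mutation of row" entry above is not random: read as hex-encoded ASCII, it decodes to the same schema version id that the CLI printed. The following Python sketch shows the decoding, and also reads the timestamp embedded in the id, on the assumption that the id is a standard time-based (version 1) UUID. The function names are made up for this illustration.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Row key copied from the "applying mutation of row" log entry.
ROW_KEY_HEX = "35666261336631662d346138322d313165302d623865652d663930663861336635653166"


def decode_row_key(hex_key):
    # The logged key is a hex dump of the key bytes; here the bytes are ASCII text.
    return bytes.fromhex(hex_key).decode("ascii")


def uuid1_timestamp(version_id):
    # Version 1 UUIDs count 100 ns ticks since 1582-10-15 (the Gregorian epoch).
    parsed = uuid.UUID(version_id)
    if parsed.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    gregorian_epoch = datetime(1582, 10, 15, tzinfo=timezone.utc)
    return gregorian_epoch + timedelta(microseconds=parsed.time // 10)


version_id = decode_row_key(ROW_KEY_HEX)
print(version_id)                   # the schema version id as text
print(uuid1_timestamp(version_id))  # creation time of the id, in UTC
```

The decoded timestamp is in UTC, so it falls on the same date as the log entries but may differ from the logged local time by the node's time zone offset.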
On the other nodes, the log entries look like this:
DEBUG [ReadStage:9] 2011-03-09 11:12:19,250 MigrationManager.java (line 82) My data definitions are old. Asking for updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:9] 2011-03-09 11:12:19,253 MigrationManager.java (line 106) Announcing my schema is d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [MigrationStage:1] 2011-03-09 11:12:19,273 SchemaCheckVerbHandler.java (line 36) Received schema check request.
...
DEBUG [MigrationStage:1] 2011-03-09 11:12:20,681 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
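One way to confirm from the logs that a change has reached every node is to compare the last schema version each node announced. The Python sketch below does this for log text shaped like the excerpts above. It assumes the "Announcing my schema is" wording shown in these 0.7 logs, and the node labels and helper names are hypothetical.

```python
import re

# Matches the version id in "Announcing my schema is <uuid>" log entries.
ANNOUNCE = re.compile(r"Announcing my schema is ([0-9a-f-]{36})")


def last_announced_version(log_text):
    # The most recent announcement in the log is the node's current version.
    matches = ANNOUNCE.findall(log_text)
    return matches[-1] if matches else None


def in_agreement(logs_by_node):
    # True only if every node announced a version and all versions are equal.
    versions = {node: last_announced_version(text)
                for node, text in logs_by_node.items()}
    distinct = set(versions.values())
    return (len(distinct) == 1 and None not in distinct), versions


SAMPLE_LOGS = {
    "node-with-cli": (
        "DEBUG [MigrationStage:1] 2011-03-09 11:21:04,146 MigrationManager.java "
        "(line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f\n"
    ),
    "other-node": (
        "DEBUG [ReadStage:9] 2011-03-09 11:12:19,253 MigrationManager.java "
        "(line 106) Announcing my schema is d052796e-4a80-11e0-b8ee-f90f8a3f5e1f\n"
        "DEBUG [MigrationStage:1] 2011-03-09 11:12:20,681 MigrationManager.java "
        "(line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f\n"
    ),
}

agreed, versions = in_agreement(SAMPLE_LOGS)
print(agreed, versions)
```

Here the other node first announces the old version and then the new one, so only its last announcement is compared.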