1. Keyspaces:
- Definition: In Cassandra, a keyspace is the outermost container for data. It is similar to a schema in a relational database.
- Purpose: A keyspace holds one or more tables and defines the replication strategy and settings for the data it contains.
- Replication Factor: Determines the number of replicas (copies) of the data across the nodes in the cluster.
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
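SimpleStrategy ignores datacenter and rack placement, so it is generally suited to single-datacenter or test clusters. As a hedged sketch, a multi-datacenter keyspace might use NetworkTopologyStrategy instead. The keyspace name mykeyspace_multi and the datacenter names dc1 and dc2 are assumptions for illustration; the datacenter names must match what your cluster's snitch reports.

```cql
-- Hypothetical keyspace: 3 replicas in dc1 and 2 replicas in dc2.
CREATE KEYSPACE IF NOT EXISTS mykeyspace_multi
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2};

-- Inspect the replication settings of every keyspace (Cassandra 3.0 and later).
SELECT keyspace_name, replication FROM system_schema.keyspaces;

-- Raise the replica count in dc2 on the hypothetical keyspace.
ALTER KEYSPACE mykeyspace_multi
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};
```

After changing replication settings on a keyspace that already holds data, a repair is typically needed so that the new replicas receive the existing rows.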
2. Tables:
- Definition: Tables in Cassandra are where data is stored. Each table is associated with a specific keyspace.
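- Flexible Schema: CQL tables declare their columns up front, but columns can be added or dropped at any time with ALTER TABLE, and a row stores only the columns that were actually written to it.
- Primary Key: Composed of one or more columns, the primary key uniquely identifies each row in the table. Its first component is the partition key, which determines the nodes that store the row; any remaining components are clustering columns, which order rows within a partition.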
- Columns: Besides the primary key, tables can have other columns, including static columns and collections.
CREATE TABLE mykeyspace.mytable ( id UUID PRIMARY KEY, name TEXT, age INT );
- Inserting Data:
INSERT INTO mykeyspace.mytable (id, name, age) VALUES (uuid(), 'John Doe', 25);
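To illustrate the primary key, static column, and collection points above, here is a minimal sketch. The table user_events, its columns, and the UUID value are hypothetical names chosen for this example; mykeyspace.mytable is assumed to exist as created earlier.

```cql
-- user_id is the partition key; event_time is a clustering column that
-- orders rows inside each partition (newest first here).
CREATE TABLE IF NOT EXISTS mykeyspace.user_events (
  user_id    UUID,
  event_time TIMESTAMP,
  user_name  TEXT STATIC,   -- one value shared by all rows of a partition
  tags       SET<TEXT>,     -- collection column
  payload    TEXT,
  PRIMARY KEY ((user_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

INSERT INTO mykeyspace.user_events (user_id, event_time, user_name, tags, payload)
VALUES (123e4567-e89b-12d3-a456-426614174000, toTimestamp(now()),
        'John Doe', {'login', 'web'}, 'signed in');

-- All events for one user, served from a single partition.
SELECT event_time, payload, tags FROM mykeyspace.user_events
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Columns can be added to an existing table later.
ALTER TABLE mykeyspace.mytable ADD email TEXT;
```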
3. Nodes:
- Definition: Nodes are individual instances of Cassandra running in a cluster. Each node is responsible for storing a portion of the data.
- Peer-to-Peer Architecture: Cassandra follows a peer-to-peer architecture where all nodes in the cluster are equal and communicate with each other.
- Data Distribution: Data is distributed across nodes by a partitioner, which hashes each row's partition key to a token. Each node is responsible for one or more ranges of tokens.
- Replication: Replicas of data are stored on multiple nodes for fault tolerance.
SELECT * FROM system.peers;
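As a sketch of how data distribution can be observed from cqlsh, the system tables expose each node's datacenter, rack, and token ownership, and the token() function shows the token that a partition key hashes to. This assumes the mykeyspace.mytable table from the earlier example exists.

```cql
-- The node you are connected to: its partitioner and the tokens it owns.
SELECT cluster_name, data_center, rack, partitioner, tokens FROM system.local;

-- The other nodes this node knows about.
SELECT peer, data_center, rack, host_id FROM system.peers;

-- The token each row's partition key maps to; the token determines which
-- nodes hold the row's replicas.
SELECT id, token(id) FROM mykeyspace.mytable LIMIT 5;
```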
Additional Considerations:
- Consistency Levels:
- Cassandra offers different consistency levels for read and write operations, allowing you to balance consistency against availability. Examples include ONE, QUORUM, and LOCAL_QUORUM.
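-- In cqlsh, CONSISTENCY is a separate shell command, not a clause of the query.
-- The ? is a bind marker for prepared statements; substitute a UUID when typing this by hand.
CONSISTENCY QUORUM;
SELECT * FROM mykeyspace.mytable WHERE id = ?;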
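A quorum is a majority of replicas, floor(RF / 2) + 1, so with a replication factor of 3 it is 2 nodes. When the number of replicas written plus the number of replicas read exceeds RF, every read overlaps at least one replica that holds the latest write. The sketch below assumes a cqlsh session against the mykeyspace.mytable table and uses a made-up UUID.

```cql
-- Show the current consistency level of the cqlsh session.
CONSISTENCY;

-- RF = 3: write to 2 replicas, then read from 2 replicas (2 + 2 > 3).
CONSISTENCY QUORUM;
INSERT INTO mykeyspace.mytable (id, name, age)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'John Doe', 25);
SELECT name, age FROM mykeyspace.mytable
WHERE id = 123e4567-e89b-12d3-a456-426614174000;

-- Reading at ONE is faster and more available, but after a QUORUM write
-- (2 + 1 is not greater than 3) the read may reach the one replica that
-- has not yet received the write.
CONSISTENCY ONE;
SELECT name, age FROM mykeyspace.mytable
WHERE id = 123e4567-e89b-12d3-a456-426614174000;
```

Client drivers usually set the consistency level per statement or per session through their own API rather than through this shell command.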
- Tuning and Maintenance:
- Regular maintenance tasks include compaction, repair, and nodetool operations for monitoring and managing the cluster.
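Some maintenance behaviour is configured per table in CQL. As a hedged sketch, the statements below change the compaction strategy and set gc_grace_seconds, the period after which tombstones become eligible for removal; repairs are normally expected to complete on every node within that period so that deleted data does not reappear. The values shown are illustrative, not recommendations.

```cql
-- Switch the table to leveled compaction.
ALTER TABLE mykeyspace.mytable
  WITH compaction = {'class': 'LeveledCompactionStrategy'};

-- 864000 seconds (10 days) is the default tombstone grace period.
ALTER TABLE mykeyspace.mytable WITH gc_grace_seconds = 864000;

-- Check the current settings.
SELECT table_name, compaction, gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable';
```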
- CQL (Cassandra Query Language):
- CQL is the query language used to interact with Cassandra. It is similar to SQL but has its own syntax and features tailored for Cassandra.
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
USE mykeyspace;
CREATE TABLE mytable ( id UUID PRIMARY KEY, name TEXT, age INT );
INSERT INTO mytable (id, name, age) VALUES (uuid(), 'Jane Doe', 30);
SELECT * FROM mytable WHERE id = ?;
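Some CQL features have no direct SQL counterpart, and some SQL habits do not carry over. The sketch below assumes the mykeyspace.mytable table from above and uses a made-up UUID.

```cql
-- Expire the row automatically after one day (86400 seconds).
INSERT INTO mykeyspace.mytable (id, name, age)
VALUES (uuid(), 'Temp User', 40) USING TTL 86400;

-- Lightweight transaction: apply the update only if the row already exists.
UPDATE mykeyspace.mytable SET age = 31
WHERE id = 123e4567-e89b-12d3-a456-426614174000 IF EXISTS;

-- Filtering on a non-key column requires ALLOW FILTERING (or an index) and
-- may scan many partitions, so it is best avoided on large tables.
SELECT * FROM mykeyspace.mytable WHERE age = 30 ALLOW FILTERING;
```

CQL also has no JOIN, so tables are usually designed around the queries they need to serve.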
Practice Exercise:
- Create a keyspace named myblog with a replication factor of 2.
- Design a table named posts within the myblog keyspace to store blog posts. Include columns for post_id (UUID), title (TEXT), content (TEXT), and author (TEXT).
- Insert a few sample blog posts into the posts table.
- Query the posts table to retrieve the blog posts.
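One possible solution is sketched below. It assumes a single-datacenter cluster (hence SimpleStrategy) and uses post_id alone as the primary key; the sample post contents are invented for illustration.

```cql
-- Step 1: keyspace with two replicas of each row.
CREATE KEYSPACE IF NOT EXISTS myblog
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

-- Step 2: table for blog posts.
CREATE TABLE IF NOT EXISTS myblog.posts (
  post_id UUID PRIMARY KEY,
  title   TEXT,
  content TEXT,
  author  TEXT
);

-- Step 3: sample rows; uuid() generates a random identifier for each post.
INSERT INTO myblog.posts (post_id, title, content, author)
VALUES (uuid(), 'Hello Cassandra', 'My first post.', 'Jane Doe');
INSERT INTO myblog.posts (post_id, title, content, author)
VALUES (uuid(), 'Keyspaces and Tables', 'Notes on data modeling.', 'John Doe');

-- Step 4: read the posts back.
SELECT post_id, title, author FROM myblog.posts;
```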