Cassandra - An Overview
Cassandra was originally developed at Facebook by Avinash Lakshman and Prashant Malik
and was released as open source in 2008. It is now maintained by the Apache Software
Foundation and is written in Java. It is an open source distributed database
management system designed to handle large amounts of data across many commodity
servers, providing high availability with no single point of failure. One benchmark
study of NoSQL systems reported that "Cassandra achieves the highest throughput for
the maximum number of nodes in all experiments" although "this comes at the price of
high write and read latencies."
Features
• Decentralized
Every node in the cluster has the same role. There
is no single point of failure.
• Supports replication and multi-data-center replication
Replication strategies are configurable. Cassandra is designed as a distributed
system, for deploying large numbers of nodes across multiple data centers.
• Scalability
Read and write throughput both increase linearly as new machines are added, with no
downtime or interruption to applications.
• Fault-tolerant
Data is automatically replicated to multiple nodes for fault tolerance. Replication
across multiple data centers is supported. Failed nodes can be replaced with no downtime.
• Tunable consistency
Writes and reads offer a tunable level of consistency, all the way from "writes never
fail" to "block for all replicas to be readable", with the quorum level in the middle.
• MapReduce support
Cassandra has Hadoop integration, with MapReduce support. Apache Pig and Apache Hive
are also supported.
• Query language
Cassandra introduces CQL (Cassandra Query Language), a SQL-like alternative to the
traditional RPC interface. Language drivers are available for Java (JDBC),
Python (DBAPI2), Node.js (Helenus) and Go (gocql).
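
The replication and tunable consistency features above can be made concrete with a
small sketch. It is only an illustration, not part of any Cassandra driver: the helper
names (create_keyspace_cql, quorum, overlaps), the keyspace name "demo" and the data
center names "dc1" and "dc2" are all assumptions. The first helper renders a CQL
CREATE KEYSPACE statement with per-data-center replication factors; the others show
how many replicas must respond at each consistency level.

```python
# Illustrative sketch only. All helper names, the keyspace name and the
# data center names are hypothetical and not part of any driver API.

def create_keyspace_cql(keyspace, replicas_per_dc):
    """Render a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replication factor.
    """
    options = {"class": "NetworkTopologyStrategy", **replicas_per_dc}
    rendered = ", ".join(f"'{key}': {value!r}" for key, value in options.items())
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{rendered}}};"


def quorum(replication_factor):
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def overlaps(write_replicas, read_replicas, replication_factor):
    """True when every read set shares at least one replica with every write set."""
    return write_replicas + read_replicas > replication_factor


# Assumed layout: three replicas in one data center, two in another.
replicas = {"dc1": 3, "dc2": 2}
statement = create_keyspace_cql("demo", replicas)
total_rf = sum(replicas.values())

# Replicas that must acknowledge a request at each consistency level.
levels = {"ONE": 1, "QUORUM": quorum(total_rf), "ALL": total_rf}

print(statement)
for name, needed in levels.items():
    print(f"{name}: {needed} of {total_rf} replicas")
```

Reading and writing at QUORUM gives overlapping replica sets (3 + 3 > 5), so a read
sees the latest acknowledged write. Using ONE on both sides (1 + 1 <= 5) trades that
guarantee for lower latency and higher availability.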
Advantages
• It offers robust support for clusters spanning multiple data centers.
• It delivers high read and write throughput that scales with the number of nodes.
Disadvantage
Cassandra does not support joins or subqueries.
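
Because joins are not available, applications usually either store the data
denormalized or combine the results of separate queries themselves. A minimal sketch
of the second approach follows. The rows are made-up stand-ins for the results of two
independent CQL queries, and the table names, column names and the function
join_orders_to_users are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, standing in for the results of two separate CQL queries
# such as "SELECT id, name FROM users" and "SELECT user_id, item FROM orders".
users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 2, "item": "lamp"},
]


def join_orders_to_users(users, orders):
    """Application-side hash join: index one side, then scan the other."""
    orders_by_user = defaultdict(list)
    for order in orders:
        orders_by_user[order["user_id"]].append(order["item"])
    return [
        {"name": user["name"], "items": orders_by_user.get(user["id"], [])}
        for user in users
    ]


joined = join_orders_to_users(users, orders)
print(joined)
```

The cost of this approach is extra round trips and memory in the application, which
is why data models for Cassandra are commonly designed around the queries they serve.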