
Friday, April 1, 2011

Cassandra 0.7.x - Understanding the output of nodetool cfhistograms


Command - Usage and Output
Cassandra provides the nodetool cfhistograms command to print statistical histograms for a given column family. The usage is as follows:
./nodetool -h <host> -p <port> cfhistograms <keyspace> <column family>

The output of the command has the following 6 columns:
Offset, SSTables, Write Latency, Read Latency, Row Size, Column Count

Interpreting the output
- Offset: This represents the series of values to which the counts in the other 5 columns correspond. It corresponds to the X-axis values in the histograms. The unit depends on the column being read.
- SSTables: This represents the number of SSTables accessed per read. For example, if a read operation involved accessing 3 SSTables, you will find a positive count against Offset 3. The values are recent, i.e. they cover the time elapsed between two calls.
- Write Latency: This shows the distribution of the number of operations across the range of Offset values, which here represent latency in microseconds. For example, if 100 operations each took about 5 microseconds, you will find a count of 100 against Offset 5.
- Read Latency: This is similar to write latency. The values are recent, i.e. they cover the time elapsed between two calls.
- Row Size: This shows the distribution of rows across the range of Offset values, which here represent size in bytes. For example, if you have 100 rows of 2000 bytes each, you will find a positive count against the offset bucket that covers 2000.
- Column Count: This is similar to row size. The Offset values represent the column count.

Some additional details
Typically in a histogram the values are plotted over discrete intervals. Similarly, Cassandra defines buckets. The number of buckets is 1 more than the number of bucket offsets; the last bucket holds the values greater than the last offset. The values you see in the Offset column of the output are the bucket offsets. The bucket offsets start at 1 and grow by a factor of 1.2 each time (rounding and removing duplicates). They go from 1 to around 36M by default (creating 90+1 buckets), which gives timing resolution from microseconds to 36 seconds, with less precision as the numbers get larger (see the EstimatedHistogram class).
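
The growth rule described above can be sketched in a few lines of Python. This is only an illustration of the "start at 1, multiply by 1.2, round, skip duplicates" rule, not the EstimatedHistogram code itself. The function names are made up for this post, and placing a value in the first bucket whose offset is greater than or equal to it is my assumption.

```python
import bisect
import math


def bucket_offsets(size=90, factor=1.2):
    """Build offsets that start at 1 and grow by `factor`, skipping duplicates."""
    offsets = [1]
    while len(offsets) < size:
        last = offsets[-1]
        nxt = math.floor(last * factor + 0.5)  # round to the nearest integer
        if nxt == last:  # rounding produced a duplicate, so step by one instead
            nxt = last + 1
        offsets.append(nxt)
    return offsets


def bucket_index(offsets, value):
    """Index of the first offset >= value (assumed placement rule).

    A result equal to len(offsets) means the extra overflow bucket.
    """
    return bisect.bisect_left(offsets, value)


offsets = bucket_offsets()
print(offsets[:12])  # [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17]
print(offsets[bucket_index(offsets, 2000)])  # offset of the bucket covering 2000
```

The first offsets are consecutive integers because multiplying a small number by 1.2 rounds back to the same value, which is where the "removing duplicates" step comes in.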





Friday, March 11, 2011

Schema Management in Cassandra 0.7

Schema Management in Cassandra

Starting with Cassandra 0.7, schema management in Cassandra is very easy. It is as good as centralized schema management, but with no single point of failure (SPoF). Typical schema operations are loading the schema initially, making changes to the existing schema (adding CFs and/or modifying attributes of existing CFs), and dropping schema elements like CFs and Keyspaces.

There are 3 ways these operations can be performed:

1. Load schema from cassandra.yaml using schematool or JMX Console: This option can be used to load the schema only once. Running it a second time in a cluster won't have any effect, so it is good for loading the initial schema.

schematool import
OR
JConsole: MBeans -> org.apache.cassandra.db -> StorageService -> Operations -> loadSchemaFromYAML

2. Create/Modify schema using Thrift APIs: This provides high flexibility and is good for applications that wish to create/drop Keyspaces and ColumnFamilies on the fly. You cannot modify existing ColumnFamilies using the APIs. Refer to Cassandra Wiki - API for details of the APIs available. The following APIs are available:
- describe_keyspace
- describe_keyspaces
- system_add_column_family
- system_drop_column_family
- system_add_keyspace
- system_drop_keyspace

3. Create/Modify schema using cassandra-cli: This is the most flexible option available. It allows practically everything that options #1 and #2 allow collectively. You can see the commands by entering "help;" on cassandra-cli. For details of a specific command type "help <command>;", for example "help create keyspace;". The following commands are supported:
- Describe keyspace
- Show list of keyspaces
- Add a new keyspace with the specified attribute(s) and value(s)
- Update a keyspace with the specified attribute(s) and value(s)
- Create a new column family with the specified attribute(s) and value(s)
- Update a column family with the specified attribute(s) and value(s)
- Delete a keyspace
- Delete a column family

Under the hood

The Cassandra Wiki - Schema Updates describes the operations in good detail. Following is the high-level summary:

- Cassandra uses the Schema and Migrations ColumnFamilies in the system keyspace for maintaining the schema and the changes to the schema respectively.
- Schema changes done on one node are propagated to the other nodes in the cluster.
- The Migrations CF tracks individual changes to the schema. The Schema CF contains a reference to the latest version in use.
- Some manual cleanup may be needed if a node crashes while schema changes are being applied to the cluster.
- To avoid concurrency issues, always push schema changes through one node.

Examples

Dropping a Keyspace

Connect to cassandra-cli on a node and run drop keyspace command.

[root@rwc-sb6240-1 bin]# ./cassandra-cli
Welcome to cassandra CLI.

Type 'help;' or '?' for help. Type 'quit;' or 'exit;' to quit.
[default@unknown] connect 20.17.221.19/9160;
Connected to: "NarenCluster072" on 20.17.221.19/9160
[default@unknown] drop keyspace KeyspaceMigration;
5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
[default@unknown] exit;
[root@rwc-sb6240-1 bin]#


The logs on the node will show the following events (DEBUG mode):

DEBUG [pool-1-thread-151] 2011-03-09 11:21:03,334 CassandraServer.java (line 759) drop_keyspace

...
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,146 CompactionManager.java (line 109) Checking to see if compaction of Schema would be useful
DEBUG [MigrationStage:1] 2011-03-09 11:21:04,146 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
DEBUG [CompactionExecutor:1] 2011-03-09 11:21:04,147 CompactionManager.java (line 109) Checking to see if compaction of Migrations would be useful
DEBUG [ReadStage:14] 2011-03-09 11:21:04,150 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:15] 2011-03-09 11:21:04,151 MigrationManager.java (line 87) Their data definitions are old. Sending updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
...
DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 628) My version is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
DEBUG [pool-1-thread-151] 2011-03-09 11:21:05,629 StorageProxy.java (line 659) Schemas are in agreement.
DEBUG [MigrationStage:1] 2011-03-09 11:21:03,343 Table.java (line 397) applying mutation of row 35666261336631662d346138322d313165302d623865652d663930663861336635653166
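
A side note on the "applying mutation of row" entry above: the long row key looks like the hex encoding of the new schema version string. The following small Python check decodes it. The variable names are mine; the two values are copied from the log and CLI output above.

```python
# Row key as printed by Table.java in the DEBUG log above.
row_key_hex = (
    "35666261336631662d346138322d313165302d"
    "623865652d663930663861336635653166"
)
# Schema version returned by the drop keyspace command.
schema_version = "5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f"

# Interpret each pair of hex digits as one ASCII character.
decoded = bytes.fromhex(row_key_hex).decode("ascii")
print(decoded)
print(decoded == schema_version)
```

If the two match, the mutation being applied is keyed by the version that the node announces a moment later.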


On the other nodes the log entries will look like this:

DEBUG [ReadStage:9] 2011-03-09 11:12:19,250 MigrationManager.java (line 82) My data definitions are old. Asking for updates since d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [ReadStage:9] 2011-03-09 11:12:19,253 MigrationManager.java (line 106) Announcing my schema is d052796e-4a80-11e0-b8ee-f90f8a3f5e1f
DEBUG [MigrationStage:1] 2011-03-09 11:12:19,273 SchemaCheckVerbHandler.java (line 36) Received schema check request.
...
DEBUG [MigrationStage:1] 2011-03-09 11:12:20,681 MigrationManager.java (line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f
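
To compare nodes quickly, the last "Announcing my schema is" entry from each node's log can be extracted and compared. Below is a rough Python sketch of that idea. The helper name and the regular expression are my own, and the sample lines are taken from the logs shown above. In practice you would read the lines from each node's log file.

```python
import re

# Matches the version UUID in "Announcing my schema is <uuid>" log entries.
ANNOUNCE = re.compile(r"Announcing my schema is ([0-9a-f-]{36})")


def last_announced_version(log_lines):
    """Return the most recently announced schema version, or None if absent."""
    version = None
    for line in log_lines:
        match = ANNOUNCE.search(line)
        if match:
            version = match.group(1)  # later entries overwrite earlier ones
    return version


node_a = [
    "DEBUG [MigrationStage:1] 2011-03-09 11:21:04,146 MigrationManager.java "
    "(line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f",
]
node_b = [
    "DEBUG [ReadStage:9] 2011-03-09 11:12:19,253 MigrationManager.java "
    "(line 106) Announcing my schema is d052796e-4a80-11e0-b8ee-f90f8a3f5e1f",
    "DEBUG [MigrationStage:1] 2011-03-09 11:12:20,681 MigrationManager.java "
    "(line 106) Announcing my schema is 5fba3f1f-4a82-11e0-b8ee-f90f8a3f5e1f",
]

versions = {last_announced_version(node_a), last_announced_version(node_b)}
print("in agreement" if len(versions) == 1 else "versions differ")
```

Here the second node first announces the old version and then, after receiving the updates, the new one, so both nodes end up on the same version.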

Thursday, December 17, 2009

Residential Gateway - Part 2

Since I am not very busy nowadays, you may see multiple posts from me in a single day :). In the last part I talked about the Residential Gateway in general. In this post I will talk about the WAN-side interface, i.e. DSL.

DSL stands for Digital Subscriber Line. It is the technology used to transmit digital content over a phone line (the very same line that is connected to your landline phone). Some of you must be wondering: how is that possible? Will I be able to use my phone and the Internet simultaneously?

The technology has the answer. The phone line that we have today is underutilized. It is used to carry only voice traffic, which is transmitted over the frequency band from 300 Hz to 3400 Hz, whereas the cable is capable of carrying signals at much higher frequencies. DSL technology makes use of the unused frequency bands to send and receive data.

What about simultaneous use? It is possible using a splitter/microfilter. A splitter is a small piece of hardware that is usually supplied by the broadband service provider. The phone line is connected to the splitter. There are two output ports on a splitter. One port connects to the DSL modem whereas the other port connects to the phone. The splitter splits the signals based on frequency. Signals with lower frequency (voice) go to the phone, whereas signals with higher frequency go to the DSL modem.

There are a number of variations/standards of DSL technology. They primarily differ in two parameters, viz. the speed and the distance they support. Of course there are core technology differences as well. Note that the signal quality deteriorates with distance and it is not possible to install repeaters for data signals. Hence, distance plays an important role.

The most popular DSL standards are ADSL and VDSL.

ADSL: Asymmetric Digital Subscriber Line
As the name suggests, the download and upload speeds are different. In most home networks people download more than they upload, so this asymmetry results in a very good user experience. The band from 25.875 kHz to 138 kHz is used for upstream communication, while 138 kHz to 1104 kHz is used for downstream communication.
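
To make the band plan concrete, here is a small Python sketch that maps a frequency to the band it falls in, using only the numbers quoted in this post (voice from 300 Hz to 3400 Hz, ADSL upstream from 25.875 kHz to 138 kHz, ADSL downstream from 138 kHz to 1104 kHz). The function name and the handling of the exact boundary values are my assumptions.

```python
# (name, lower edge in Hz, upper edge in Hz), as quoted in the post.
BANDS_HZ = [
    ("voice", 300.0, 3_400.0),
    ("ADSL upstream", 25_875.0, 138_000.0),
    ("ADSL downstream", 138_000.0, 1_104_000.0),
]


def classify(freq_hz):
    """Return the name of the band containing freq_hz.

    Each band is treated as [low, high); this boundary rule is an assumption.
    """
    for name, low, high in BANDS_HZ:
        if low <= freq_hz < high:
            return name
    return "outside the listed bands"


for f in (1_000, 10_000, 100_000, 500_000):
    print(f, "Hz ->", classify(f))
```

Conceptually a splitter makes the same kind of decision in hardware: the voice band goes to the phone and everything above it goes to the modem.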

ADSL supports a download speed of up to 12 Mbps and an upload speed of up to 1.5 Mbps. ADSL2+ extends the capability of ADSL by doubling the downstream bits, which is done by extending the downstream frequency band from 1.1 MHz to 2.2 MHz. As a result ADSL2+ supports a download speed of up to 24 Mbps. ADSL works for a maximum distance of up to 5000 meters from the exchange. The closer you are to the exchange, the better the signal quality and speed.

Following diagram shows the ADSL2+ router in the broadband network:
ADSL2+ Router in Network

VDSL: Very High Bitrate DSL
It is similar to ADSL2+ but uses a higher frequency band, in the order of 30 MHz. As a result it provides very high download and upload speeds, approximately 100 Mbps. The indicated maximum speeds are achievable for a distance of up to 300 meters from the exchange.

The distance of 300 meters from the exchange is not practical in most cases. Hence, an Optical Network Unit (ONU) is used to provide service even from a larger distance. The broadband service provider lays optical cables from the exchange to the locality (typically a building or a group of buildings), where they are connected to the ONU. The ONU then connects to the routers at home over the phone line (DSL). Optical cables are capable of carrying data at speeds in the order of Gbps.

Due to its high speed, VDSL is ideal for IPTV and HDTV services. Since it also supports symmetric upload and download speeds, it is suitable for video conferencing.

Following diagram shows the VDSL network diagram:
VDSL Network

Refer to following link for more technical differences between ADSL and VDSL:
http://www.pulsewan.com/data101/adsl_vdsl_basics.htm

Tuesday, December 15, 2009

Residential Gateway - Part 1

I worked for 2 years at a company that manufactures Residential Gateways. I primarily worked on the GUI and configuration customization of these gateways. At times the work involved debugging functional issues that required understanding of the underlying protocols and standards. This gave me the opportunity to learn about various networking standards.

I referred to Wikipedia and RFCs for most of the things I learned. I am going to share my learning through a series of blog posts. I don't intend to capture the internal details like multiplexing techniques or packet/frame formats, as those can be obtained from the standards documents. The information here will provide a conceptual understanding and some useful facts about the technologies.

In this first installment let's understand what a Residential Gateway is.

Residential Gateway is quite a popular term in the USA but not a well-known term in India. In India people mostly refer to it as a Modem because:
- It is just a Modem
- If it is not just a Modem, then people either don't know about its other features or don't use them.

The primary function of Residential Gateway is to enable broadband Internet connection for home users. It is a combination of modem and router. In addition, it provides other features like:
- Firewall
- NAT
- DHCP
- DNS
- VoIP

Following diagram shows the complete broadband ecosystem. The CPE in the diagram is the residential gateway:
Broadband Network

Following diagrams show what a typical home network looks like:
Home Network 1
Home Network 2
Following diagram shows the common ports available on a residential gateway and sample devices that can be connected:
Residential Gateway Ports and Devices

If you look at the diagrams available at the above links you will notice that the most common interfaces on the home or LAN side are:
- USB
- Ethernet (RJ-45)
- Wireless (Wifi or 802.11x)
- Coax (For TV or STB)

On the WAN side the most common interfaces are HPNA (DSL) and Cable (this is the same as the one on which you get Cable TV service). Out of these the DSL interface is the most commonly used, at least in India. The primary reason for the popularity of DSL is that the basic infrastructure, i.e. the phone line, is already present. The user just needs to buy the modem-cum-router and is then ready to set up a home network.

I will talk about DSL and related technologies in the next part. Stay tuned...


Sunday, September 6, 2009

E71 MAC Address

The E71 supports Wifi. Hence, I use Wifi to access the internet from my E71 at home and at the office. It is fast and saves money as well, as I have an unlimited data plan.

To keep my Wifi access point safe and avoid misuse I have enabled MAC address filtering. I forgot about that, and when I tried to access the net for the first time from my mobile I got the error "Web: Gateway no reply". I initially thought it was due to some problem with the site or my ISP gateway. Later I realized that it was because of the MAC address filtering rule that I had added to my access point.

The question is how to find the MAC address of your E71. The answer is to key in the following sequence on the home screen:



*#62209526#
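
A handy way to remember this sequence: on a standard phone keypad the digits appear to spell MAC0WLAN, i.e. *#MAC0WLAN#. The mnemonic is my own reading of the digits. A tiny Python sketch to check it (the names are made up for illustration):

```python
# Letters printed on each key of a standard phone keypad.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
# Reverse lookup: letter -> digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def digits_for(text):
    """Translate letters to keypad digits; other characters pass through."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in text.upper())


print("*#" + digits_for("MAC0WLAN") + "#")  # *#62209526#
```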
I would encourage everyone to enable MAC address filtering to safeguard your home network and keep hackers at bay.

