Memcached is a free and open source, high-performance, distributed memory object caching system. It is generic in nature, but intended for speeding up dynamic web applications by alleviating database load.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, page rendering or simply anything you like to store temporarily.
Memcached is simply a key/value store. You can think of it as a standalone distributed hash table or dictionary. Memcached doesn’t know what your data looks like; all it does is store key-value pairs with an expiration time, using an LRU (least recently used) algorithm to decide which cached items to keep.
What makes Memcached cool is that it scales. How does it do this? Through the hashing algorithms that its clients implement. There are two phases of hashing: one happens at the client and the other at the server. Therefore every memcached client implements some hashing algorithm in order to benefit from memcached’s distributed nature. I will come back to this in a bit.
Although memcached acts as a distributed hash table (key-value store), the servers are disconnected from each other and are unaware of each other. There is no communication between memcached servers: no synchronization and no broadcasting. This makes it easy to scale out. If you are running low on resources on a memcached server, you can add another one, and keep adding more cache servers as you need them. Note that if you don’t add cache servers, cached items will start to be dropped as the cache becomes full. As I mentioned, the least recently used algorithm decides which items to drop, so the items that have gone longest without being accessed go first. This is called eviction.
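To make eviction a bit more concrete, here is a minimal sketch of a fixed-size LRU cache in Python. It is only a toy: the class name `LRUCache` and the `capacity` parameter are my own, and memcached's real memory management is more involved than this. It just shows the idea that reading or writing a key marks it as recently used, and that the key untouched for the longest time is dropped when the cache is full.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # order runs from least to most recently used

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # reading a key marks it as recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)  # writing a key also marks it as recently used
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now more recently used than "b"
cache.set("c", 3)      # the cache is full, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```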
In computer science, one of the most important notions is algorithmic complexity, that is, how the cost of an operation such as searching for an item or sorting a collection grows with the size of the data. Memcached takes this into account and implements an O(1), constant-time key-value store. This means that storing an item in the cache and retrieving an item from it take constant time regardless of how many items are cached, which is very fast. This is achieved with a hash table and a good hash function that keeps collisions rare.
On another note, Memcached storage of cached items is not traversable/iterable. You cannot traverse the whole cache.
Memcached is awesome! But it is not for every architecture. You might want to look elsewhere if:
- You have objects larger than 1 MB (the default item size limit). Memcached is not for large media or streaming huge blobs.
- You have keys larger than 250 characters. Memcached doesn’t support keys longer than that.
- You want persistence or a database. You might consider MemcacheDB, which provides persistence for Memcached.
- You’re running in an insecure environment. Out of the box, Memcached does not authenticate or authorize clients.
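If you are not sure whether your items fit, a rough pre-check like the one below can help. This is a sketch, not part of any client library: the function name `fits_in_memcached` is hypothetical, I assume the key limit is counted in bytes after UTF-8 encoding, and I assume the value has already been serialized to bytes. The real item limit also covers the key and some overhead, so treat the 1 MB figure as an upper bound.

```python
MAX_KEY_BYTES = 250            # key length limit from the list above
MAX_VALUE_BYTES = 1024 * 1024  # default 1 MB item size limit from the list above


def fits_in_memcached(key, value):
    """Return True if the key and the serialized value are within the limits."""
    key_bytes = key.encode("utf-8")
    if len(key_bytes) == 0 or len(key_bytes) > MAX_KEY_BYTES:
        return False
    # Keys in the text protocol may not contain whitespace or control characters.
    if any(b <= 32 or b == 127 for b in key_bytes):
        return False
    return len(value) <= MAX_VALUE_BYTES


print(fits_in_memcached("user:42", b"hello"))    # True
print(fits_in_memcached("k" * 251, b"hello"))    # False
print(fits_in_memcached("has space", b"hello"))  # False
```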
As I mentioned, Memcached uses two-stage hashing. It behaves like a giant hash table of key-value pairs. Give it a key, and set or get some arbitrary data. That is really all it does.
When doing a memcached lookup, the client first hashes the key against the whole list of servers, which you need to configure in your clients. Once that first hash has selected a server, the client sends its request, and the chosen server does an internal hash lookup on the key to find the actual item data. Because the first hash is deterministic, the client knows which cache server to query when the item is requested again.
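The first, client-side stage can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the server addresses are made up, and I use CRC32 with a simple modulo, while real clients differ in the hash function and strategy they use.

```python
import zlib

# Hypothetical server list; every client must be configured with the same one.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]


def pick_server(key, servers):
    """Stage one: hash the key on the client and map it to one server."""
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


first = pick_server("user:42", SERVERS)
second = pick_server("user:42", SERVERS)
print(first == second)  # True
```

The same key always maps to the same server as long as the server list stays the same. The second stage, the hash lookup inside the chosen server, is done by memcached itself.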
You need to understand that memcached is not redundant. There is no notion of replication or communication between cache servers; they are unaware of each other. Their only purpose is to store an item and give it back. If one of your cache servers fails, you lose all the data on that server. You will also have to remove the failed server from the server list in your clients’ configuration.
Moreover, when one of your cache servers fails and you add another one, or you remove a server from your clients’ list, you hit a big problem: most of your cached items become unreachable. This is because of the two-stage hashing, in which clients map keys to servers based on the server list. Change the list and, with simple modulo-style hashing, most keys map to a different server. So your cache is effectively invalidated, and you will see a spike in load on your database. To avoid this, you can start a new node and assign it the IP address of the dead node, so the key-to-server mapping stays the same. Another way to solve the problem is to use consistent hashing, which remaps only a small share of the keys when the server list changes.
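Here is a small sketch that compares the two approaches when one server out of four disappears. Everything in it is an assumption for illustration: the server names, the `HashRing` class, SHA-256 as the hash and 100 points per server on the ring. Real clients ship their own consistent hashing implementations, so check what yours does.

```python
import bisect
import hashlib


def stable_hash(text):
    """Map a string to a large integer that is the same on every run."""
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hashing ring."""

    def __init__(self, servers, points_per_server=100):
        # Place each server at many points on the ring to even out the load.
        self.ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self.positions = [position for position, _ in self.ring]

    def pick_server(self, key):
        # Walk clockwise to the first server point at or after the key's hash.
        index = bisect.bisect_left(self.positions, stable_hash(key))
        return self.ring[index % len(self.ring)][1]


def pick_by_modulo(key, servers):
    """Naive alternative: hash modulo the number of servers."""
    return servers[stable_hash(key) % len(servers)]


servers = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical names
survivors = servers[:-1]  # "cache-d" has failed and was removed from the list
keys = [f"key-{i}" for i in range(10000)]

old_ring, new_ring = HashRing(servers), HashRing(survivors)
moved_ring = sum(old_ring.pick_server(k) != new_ring.pick_server(k) for k in keys)
moved_modulo = sum(pick_by_modulo(k, servers) != pick_by_modulo(k, survivors) for k in keys)
print(f"ring: {moved_ring / len(keys):.0%} moved, modulo: {moved_modulo / len(keys):.0%} moved")
```

With the ring, only the keys that lived on the removed server move to another one. With modulo hashing, a large majority of the keys end up on a different server, which is the spike described above.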
Memcached operations are atomic: every individual command sent to memcached is executed atomically.
Memcached only mimics organization of data via namespaces, and it only stores objects. Microsoft AppFabric Cache, on the other hand, provides the notion of regions or sections, which lets you keep items of the same type in independent caches. Moreover, with AppFabric Cache you can store strongly typed items instead of plain objects. I think these features are great.
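One common way to mimic namespaces, and the one I assume here, is simply to prefix every key. The helper name `namespaced_key` and the `:` separator are my own choices, and a plain dict stands in for the cache.

```python
def namespaced_key(namespace, key):
    """Build a cache key that is scoped to a logical namespace."""
    return f"{namespace}:{key}"


cache = {}  # a plain dict stands in for the cache here
cache[namespaced_key("users", "42")] = {"name": "Ada"}
cache[namespaced_key("orders", "42")] = {"total": 99}
print(sorted(cache))  # ['orders:42', 'users:42']
```

Both items use the id 42, but they do not collide because the prefix keeps them apart.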
Memcached is fast. It uses highly efficient, non-blocking networking libraries to stay fast even under heavy load. In other words, in circumstances where your database might be falling over, memcached won’t be. That is precisely what memcached was designed to do: take the load off your database, which for the majority of popular web applications is the biggest performance bottleneck and risk to scalability.
Memcached is simple and easy to deploy. It does not require a lot of technical knowledge to use or use effectively – it just does what it is supposed to.
Compressing large values is a great way to get more out of your memory and network communication/bandwidth. Compression can save a lot of memory for some values, and also potentially reduce latency as smaller values are quicker to fetch over the network.
Most clients support enabling or disabling compression by threshold of item size, and some on a per-item basis. Smaller items won’t necessarily benefit as much from having their data reduced, and would simply waste CPU.
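A threshold-based scheme like this can be sketched with Python's standard `zlib` module. The threshold value, the flag and the function names `pack` and `unpack` are assumptions of mine for illustration; your client will have its own settings and its own way of remembering whether a value was compressed.

```python
import zlib

COMPRESSION_THRESHOLD = 1024  # bytes; hypothetical cut-off, tune for your data
FLAG_COMPRESSED = 1           # hypothetical flag stored next to the value


def pack(value):
    """Compress values above the threshold, and only if that makes them smaller."""
    if len(value) >= COMPRESSION_THRESHOLD:
        compressed = zlib.compress(value)
        if len(compressed) < len(value):
            return compressed, FLAG_COMPRESSED
    return value, 0  # small or incompressible values are stored as-is


def unpack(data, flags):
    """Reverse pack() using the flag that was stored with the value."""
    return zlib.decompress(data) if flags & FLAG_COMPRESSED else data


small = b"hello"
large = b"memcached " * 500
for original in (small, large):
    data, flags = pack(original)
    assert unpack(data, flags) == original
    print(len(original), "->", len(data), "bytes")
```

The small value is stored untouched, so no CPU is spent on it, while the large repetitive value shrinks considerably.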
The main operations for working with Memcached are as follows:
- Store an item in the cache. You can pass a datetime or a timespan as the expiration of the object being set.
- Remove an item from the cache.
- Increment or decrement the value of a given key.
- Get an object from the cache, using TryGet or Get.
- Retrieve multiple items from the cache at once. Some clients support this.
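To show how these operations behave, here is a tiny in-memory stand-in written in Python. It is not a real memcached client and it talks to no server; the class `ToyCache` and its method names are my own, loosely modeled on what clients typically offer. I assume expiration is given in seconds, that incrementing a missing key does nothing, and that counters never drop below zero.

```python
import time


class ToyCache:
    """In-memory stand-in for a memcached client; not a real client API."""

    def __init__(self):
        self.items = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, expire_seconds=None):
        expires_at = time.monotonic() + expire_seconds if expire_seconds else None
        self.items[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self.items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self.items[key]  # expired items are dropped lazily on read
            return default
        return value

    def get_many(self, keys):
        # Only keys that are present and not expired appear in the result.
        found = {}
        for key in keys:
            value = self.get(key)
            if value is not None:
                found[key] = value
        return found

    def delete(self, key):
        return self.items.pop(key, None) is not None

    def _add(self, key, delta):
        current = self.get(key)
        if current is None:
            return None  # missing counters are not created
        new_value = max(0, current + delta)  # counters never drop below zero
        self.items[key] = (new_value, self.items[key][1])
        return new_value

    def incr(self, key, delta=1):
        return self._add(key, delta)

    def decr(self, key, delta=1):
        return self._add(key, -delta)


cache = ToyCache()
cache.set("greeting", "hello", expire_seconds=60)
cache.set("hits", 10)
print(cache.get("greeting"))                 # hello
print(cache.incr("hits", 5))                 # 15
print(cache.decr("hits", 100))               # 0
print(cache.get_many(["greeting", "nope"]))  # {'greeting': 'hello'}
print(cache.delete("greeting"))              # True
print(cache.get("greeting"))                 # None
```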
On another note, you can read about Microsoft AppFabric Cache.
