Tuesday, July 3, 2007

Caching: Some "Good to Know" Concepts

A few terms related to caching, and some basic questions one should know and ask before designing a cache:


What is a State?


The data in the system, together with the status of that data at a given time, is collectively called a state. Caching is a framework for state management in the system. It is important to be aware of the characteristics of the state before designing a caching framework for the application.



What is the lifetime of a State?


It is the period for which the state of the system is valid in the cache. Common lifetimes of a state are:


1.       Permanent State: Persistent data that is valid throughout the course of the application. For example, metadata read from the database or file system.

2.       Process State (Atomic Transaction): Valid only for the lifetime of that transaction.

3.       Process State (Long Running Transaction): Valid for all the messages involved in that transaction.

4.       Session State: Data which is valid only for the current user session.

5.       Message State: Information that is passed between two communicating services. Valid from the time one service sends the message until the second service has received and processed it.



What is the scope of a State?


The physical scope of a state is the set of physical locations from which the cached state can be accessed. The physical scope can be Organization, Farm, Machine, Process or Application Domain.


There is also the logical scope of a state: the logical locations from which the cached state can be accessed. The logical scope can be an application, a business process, a role or a user.
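One way to picture logical scope is as part of the cache key. The sketch below is only an illustration under that assumption: `LogicalScope`, `ScopedCache` and the `owner` argument are hypothetical names, not part of any particular caching framework.

```python
from enum import Enum


class LogicalScope(Enum):
    """The logical scopes listed above (names are illustrative)."""
    APPLICATION = "application"
    BUSINESS_PROCESS = "business_process"
    ROLE = "role"
    USER = "user"


class ScopedCache:
    """In-memory cache whose entries are visible only within one logical scope."""

    def __init__(self):
        self._data = {}  # (scope, owner, key) -> value

    def put(self, scope, owner, key, value):
        # 'owner' identifies the concrete application, process, role or user.
        self._data[(scope, owner, key)] = value

    def get(self, scope, owner, key):
        # Returns None when the entry does not exist in this scope.
        return self._data.get((scope, owner, key))


cache = ScopedCache()
cache.put(LogicalScope.USER, "alice", "theme", "dark")
alice_theme = cache.get(LogicalScope.USER, "alice", "theme")
bob_theme = cache.get(LogicalScope.USER, "bob", "theme")  # not visible to another user
```

Keying by scope keeps one user's or role's entries from leaking into another's, which also bears on the security question raised later in the post.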



What is State Staleness?


State staleness is defined as the difference between the source (master) state from which the cached copy was created and the cached copy itself. The more the source has changed since the copy was taken, the staler the cache.

Depending on how the state is going to be used in the system, state staleness may or may not be an issue. For example, it will not be an issue if the data cached by a web application is the page style, but it will be a problem if the cached data is master data which can be updated over time.


What is State Staleness Tolerance?


State Staleness Tolerance is the degree to which state staleness affects the application. An application can either have “No tolerance”, i.e. no state staleness is accepted, or it can have “Some tolerance”, i.e. some level of state staleness is acceptable. The time window for which state staleness is acceptable is an important factor while designing the cache.
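The tolerance window can be sketched as a simple age check. This is a minimal illustration, assuming that the age of an entry is an acceptable proxy for its staleness; `TolerantCache` and `tolerance_seconds` are hypothetical names. A tolerance of 0 models “No tolerance”: every read is treated as a miss, so the caller must go back to the source.

```python
import time


class TolerantCache:
    """Serves a cached value only while its age is inside the tolerance window."""

    def __init__(self, tolerance_seconds, clock=time.monotonic):
        self.tolerance_seconds = tolerance_seconds
        self._clock = clock    # injectable clock, so the behaviour can be tested
        self._entries = {}     # key -> (value, time at which it was cached)

    def put(self, key, value):
        self._entries[key] = (value, self._clock())

    def get(self, key):
        """Return the cached value, or None if it is missing or too stale."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, cached_at = entry
        age = self._clock() - cached_at
        if age >= self.tolerance_seconds:
            # Outside the tolerance window: drop the entry and report a miss.
            del self._entries[key]
            return None
        return value
```

Data such as page styles could be given a long window, while master data that changes over time would get a short one or none at all.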



Where can data be cached?


The two obvious choices for where data should be cached are:


1.       In Memory: When the application accesses the data frequently. This reduces the number of disk operations. An in-memory cache can also be used when the same set of data is processed frequently.

2.       On Disk: When the data handled is very large and the application providing the data may not be available every time a data request is made.
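The two placements can be compared with a small sketch. It assumes Python's standard `shelve` module as the on-disk store purely for illustration; `MemoryCache` and `DiskCache` are hypothetical names, and a real application might use a very different store.

```python
import os
import shelve
import tempfile


class MemoryCache:
    """Fast, but its contents are lost when the process exits."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DiskCache:
    """Slower, but survives restarts and is not limited by available memory."""

    def __init__(self, path):
        self._path = path

    def put(self, key, value):
        # shelve keys must be strings; values must be picklable.
        with shelve.open(self._path) as db:
            db[key] = value

    def get(self, key):
        with shelve.open(self._path) as db:
            return db.get(key)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "cache")
    DiskCache(path).put("report", [1, 2, 3])
    # A second instance stands in for a later run of the application.
    reloaded = DiskCache(path).get("report")
```

Because the disk cache outlives the process, it can keep serving data while the providing application is unavailable.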



What are the different ways of loading data into the cache?


1.       Reactive Loading: The cache is loaded when the application requests the information.

2.       Proactive Loading: The cache is loaded when the application starts up, so that when the first request is made, the data is already in the cache.
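The difference between the two loading styles can be sketched as follows. The `loader` callable stands in for whatever fetches data from the real source, and the class names are hypothetical.

```python
class ReactiveCache:
    """Loads a value the first time it is requested, then reuses it."""

    def __init__(self, loader):
        self._loader = loader
        self._data = {}

    def get(self, key):
        if key not in self._data:
            # Cache miss: the first request pays the cost of loading.
            self._data[key] = self._loader(key)
        return self._data[key]


class ProactiveCache:
    """Loads a known set of keys up front, before any request arrives."""

    def __init__(self, loader, keys):
        # Start-up pays the cost of loading; requests do not.
        self._data = {key: loader(key) for key in keys}

    def get(self, key):
        # Returns None for keys that were not preloaded.
        return self._data.get(key)
```

Reactive loading keeps start-up fast and caches only what is actually used, while proactive loading makes the first request fast but requires knowing in advance which data will be needed.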



Some high-level questions one must try to answer before going ahead with the cache design:

1.       What problem is caching going to solve anyway :) ? Will it reduce the number of disk operations, the amount of data processing, or the amount of data transferred between various processes?

2.       What is the lifetime of a state in my application?

3.       What should be the physical and logical scope of the state?

4.       What will be the state staleness tolerance of the cache?

5.       How should the cache be loaded and flushed?

6.       What will be the expiration policies? These will depend on the state staleness tolerance and the criteria that affect it.

7.       What might be the security risks in having a cache?


Note: This is not an exhaustive list of questions, but it is enough to at least get started in the right direction with the design of the cache :)



Thanks & Regards,

Arun Manglick

SMTS || Microsoft Technology Practice || Bridgestone - Tyre Link || Persistent Systems || 3023-6258


