Application level sector cache

Looks like I am at it again. This time I'm writing an application-level sector cache. Here's what I've got:

I have a USB thumb drive whose raw device is opened for read/write. I have asked the kernel for an exclusive lock on the device, and the no-cache flag is specified. The cache table holds 256 sectors, so for 512-byte sectors the data size of the cache is 128 KB. The search is linear: it starts at the top of the cache (most recently referenced sectors) and ends at the bottom. On a hit, that entry is brought up to the top while the entries between its original position and the top are pushed down one spot. On a miss that is a write, with the table full, the entry with the lowest sequence number is removed from the list, unless it's towards the top end, in which case the bottom entry is excised from the table instead.
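Roughly, the table lookup works like this (a simplified sketch with made-up names; the real code also tracks sequence numbers and handles the write/eviction path):

Code:
#include <string.h>

#define CACHE_ENTRIES 256

struct cache_entry
  {
    unsigned long long sect;   /* sector number            */
    unsigned char *data;       /* one sector worth of data */
  };

/* entries[0] is the most recently referenced sector */
static struct cache_entry entries[CACHE_ENTRIES];
static int entry_count = 0;

/* Linear search from the top; on a hit, rotate the entry up to slot 0 */
static int table_lookup(unsigned long long sect, unsigned char *out, int sector_size)
  {
    int i;

    for (i = 0; i < entry_count; i++)
      {
        if (entries[i].sect == sect)
          {
            struct cache_entry hit = entries[i];

            memcpy(out, hit.data, sector_size);
            /* push the entries above the hit down one slot */
            memmove(&entries[1], &entries[0], i * sizeof(entries[0]));
            entries[0] = hit;
            return 1;   /* hit */
          }
      }
    return 0;           /* miss */
  }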

What do you guys think about this? Is there a better way?
 
First question would be, is this "for the exercise" or do you really need it?

In case you really need it, I would suggest using mmap() on the device and letting the kernel handle all the caching for you, in cooperation with all other processes and their memory demands. You could end up caching a lot more than your fixed maximum size.
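A minimal sketch of what I mean, with the device path and mapping length as placeholders and error handling kept to the bare minimum:

Code:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
  {
    const char *dev = "/dev/da0";        /* placeholder device     */
    size_t len = 1024UL * 1024 * 1024;   /* placeholder: map 1 GB  */
    int fd = open(dev, O_RDWR);

    if (fd == -1) { perror("open"); return 1; }

    unsigned char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    /* "Sector" n is now simply base + n * sector_size; the kernel
       pages it in on demand and keeps it cached as memory allows. */

    munmap(base, len);
    close(fd);
    return 0;
  }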

But to the technical part:
The algorithm you use seems sound, but I would suggest a different approach to find the entry in question. Using an array is possible, but pushing all the entries down one is expensive. Using a (doubly) linked list is faster, as you only have to unchain one entry and put it at the head of the list.

Keeping an auxiliary hash map to quickly find an entry would also speed things up a lot.
Each entry would then have a node providing the LRU chain, and I would suggest a second set of links chaining together all entries that hash to the same value, so that once you have found the start of the hash chain you can traverse it to find the correct entry.
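Roughly like this (just a sketch; the bucket count and hash function are arbitrary here, and the hash chain is singly linked to keep it short):

Code:
#include <stddef.h>

#define NBUCKETS 64   /* arbitrary for this sketch */

struct cache_node
  {
    unsigned long long sect;
    unsigned char *data;
    /* LRU chain: doubly linked, head = most recently used */
    struct cache_node *lru_next, *lru_prev;
    /* hash chain: all entries whose sector hashes to the same bucket */
    struct cache_node *hash_next;
  };

static struct cache_node *buckets[NBUCKETS];

static size_t bucket_of(unsigned long long sect)
  {
    return (size_t)(sect % NBUCKETS);
  }

/* Find an entry without touching the LRU chain */
static struct cache_node *hash_find(unsigned long long sect)
  {
    struct cache_node *n;

    for (n = buckets[bucket_of(sect)]; n != NULL; n = n->hash_next)
        if (n->sect == sect)
            return n;
    return NULL;
  }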

But it is better to first get things going and later get them going faster, so you may start with the easy approach of an array and later change the data structures to make it faster if you need it. Just keep an eye on the interfaces so you do not nail things down when you do not need to do so.
 
Crivens said:
First question would be, is this "for the exercise" or do you really need it?

This is a server application, so keeping things that were read from disk in an application-level cache does speed things up.

Crivens said:
In case you really need it, I would suggest using mmap() on the device and letting the kernel handle all the caching for you, in cooperation with all other processes and their memory demands. You could end up caching a lot more than your fixed maximum size.

The cache is sector oriented. One sector per entry. The supported physical sector sizes range from 512 bytes to 8192 bytes. The device code does check to see which one is used. So at 256 sectors, the max amount of memory used is 2MB.

Crivens said:
But to the technical part:
The algorithm you use seems sound, but I would suggest a different approach to find the entry in question. Using an array is possible, but pushing all the entries down one is expensive. Using a (doubly) linked list is faster, as you only have to unchain one entry and put it at the head of the list.

The linked list idea is a good one as I can search from both ends of the list and check two entries at a time with one loop iteration.

Crivens said:
Keeping an auxiliary hash map to quickly find an entry would also speed things up a lot. Each entry would then have a node providing the LRU chain, and I would suggest a second set of links chaining together all entries that hash to the same value, so that once you have found the start of the hash chain you can traverse it to find the correct entry.

I have played with hashing before. I used each entry as the head pointer to a singly linked list to resolve conflicts. What conflict resolution do you suggest?

Crivens said:
But it is better to first get things going and later get them going faster, so you may start with the easy approach of an array and later change the data structures to make it faster if you need it. Just keep an eye on the interfaces so you do not nail things down when you do not need to do so.

The interface to the cache subsystem wouldn't change. I have functions declared like this:
Code:
typedef unsigned long long int uint64;
typedef unsigned char uint8;

int cache_read(uint64 sector, uint8 *data, int *status);
int cache_write(uint64 sector, uint8 *data);
etc...

The return value for the function is either 0 or an error code. Status is a return value that indicates if there was a hit or miss.
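A caller ends up doing something like this (the buffer size and sector number here are made up):

Code:
uint8 buf[8192];   /* large enough for the biggest supported sector size */
int status = 0;
int err;

err = cache_read(1234ULL, buf, &status);
if (err != 0)
  {
    /* handle the error code */
  }
else if (status != 0)
  {
    /* hit: buf now holds the sector data */
  }
else
  {
    /* miss: fetch the sector from the device instead */
  }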
 
Maelstorm said:
The linked list idea is a good one as I can search from both ends of the list and check two entries at a time with one loop iteration.
How would you do that?
You traverse the list, which can have a length not known in advance.
The list allows you to remove an entry from any place and put it at the head for the LRU bookkeeping in O(1) time, whereas doing the same with an array would cost you O(n).

Maelstorm said:
I have played with hashing before. I used each entry as the head pointer to a singly linked list to resolve conflicts. What conflict resolution do you suggest?
I would suggest that you let the OS handle the caching, as it knows best how much memory it can spare for this. That would not only give you a lot more than 2 MB, it would also simplify your code by not re-inventing the wheel.
Maelstorm said:
The interface to the cache subsystem wouldn't change. I have functions declared like this:
Code:
typedef unsigned long long int uint64;
typedef unsigned char uint8;

int cache_read(uint64 sector, uint8 *data, int *status);
int cache_write(uint64 sector, uint8 *data);
etc...

The return value for the function is either 0 or an error code. Status is a return value that indicates if there was a hit or miss.
May I point you towards int64_t and its companions? There is no need to typedef your own types here.
 
Crivens said:
How would you do that?
You traverse the list, which can have a length not known in advance.
The list allows you to remove an entry from any place and put it at the head for the LRU bookkeeping in O(1) time, whereas doing the same with an array would cost you O(n).

Actually, the size of the list *IS* known because the code keeps count of what gets added and removed. Once the maximum number of entries is reached, that's when the look-expire-write mechanism takes over. Here's some code for it.

Code:
struct cache_record
  {
    uint64 sect;
    uint8 *data;
    uint64 seq;
    struct cache_record *next, *prev;
  };

uint64 sequence = 0;
uint64 record_count = 0;
struct cache_record *head = NULL, *tail = NULL;

void fscache_pmove2head(struct cache_record *rec);  /* forward declaration; defined below */


int fscache_psearch(uint64 sect, uint8 *data, int *status)
  {
    struct cache_record *currf, *currb;
    uint64 i, terminal_count;
    
    if (record_count == 0)
      {
        *status = 0;
	return(0);
      }
    pthread_mutex_lock(&cache_lock);
    if (record_count == 1)
        {
	  if (head->sect == sect)
	      {
		memcpy(data, head->data, geom.sects);
		head->seq++;
		*status = 1;
		pthread_mutex_unlock(&cache_lock);
		return(0);
	      }
	    else
	      {
	        *status = 0;
		pthread_mutex_unlock(&cache_lock);
		return(0);
	      }
	}
      else
        {
	  currf = head;
	  currb = tail;
	  terminal_count = (record_count >> 1) + 1;
	  for(i = 0; i < terminal_count; i++)
	    {
	      if (currf != NULL)
	        {
	          if (currf->sect == sect)
	            {
		      memcpy(data, currf->data, geom.sects);
		      currf->seq++;
		      fscache_pmove2head(currf);
		      *status = 1;
 		      pthread_mutex_unlock(&cache_lock);
		      return(0);
		    }
	          currf = currf->next;
		}
	      if (currb != NULL)
	        {
	          if (currb->sect == sect)
	            {
		      memcpy(data, currb->data, geom.sects);
		      currb->seq++;
		      fscache_pmove2head(currb);
		      *status = 1;
		      pthread_mutex_unlock(&cache_lock);
		      return(0);
		    }
	          currb = currb->prev;
		}
	      if (currf == NULL && currb == NULL)
	        {
		  *status = 0;
		  pthread_mutex_unlock(&cache_lock);
		  return(0);
		}
	    }
	}
    *status = 0;
    pthread_mutex_unlock(&cache_lock);
    return(0);
  }

/* Move entry to head of list */
void fscache_pmove2head(struct cache_record *rec)
  {
    /* 4 conditions we need to check */
    if (rec->next == NULL && rec->prev == NULL) return;  /* Only entry (or not linked in): nothing to do */
    if (rec->next != NULL && rec->prev == NULL) return;  /* Already at the head: nothing to do */
    if (rec->next == NULL) /* End of list */
      {
        rec->prev->next = NULL;
        tail = rec->prev;
        rec->prev = NULL;
        rec->next = head;
        head->prev = rec;  /* old head now has a predecessor */
        head = rec;
        return;
      }
    /* Somewhere in the middle */
    rec->prev->next = rec->next;
    rec->next->prev = rec->prev;
    rec->prev = NULL;
    rec->next = head;
    head->prev = rec;  /* old head now has a predecessor */
    head = rec;
    return;
  }


Crivens said:
May I point you towards int64_t and its companions? There is no need to typedef your own types here.

There's a lot of code that uses those typedefs. It would have to be changed everywhere.
 
Typedef your own types to the standard ones from <stdint.h>. There's absolutely no need to re-invent the wheel anymore with your own types, which are bound to be wrong anyway if you aim at good portability.
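For example:

Code:
#include <stdint.h>

typedef uint64_t uint64;   /* existing code keeps compiling unchanged */
typedef uint8_t  uint8;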
 