mmap, multiprocess, performance

I wrote a utility that maps a file into memory and processes it. Several copies of the utility can run at the same time, but usually fewer copies than the number of processors. Even so, the performance of my utility is low, and the CPU usage is low too! I feel that most of the time the processors are blocked. What is wrong in my source?

Here is an outline of my utility.

Code:
#include <errno.h>
#include <fcntl.h>     /* open(), O_RDONLY */
#include <stdio.h>     /* printf() */
#include <string>      /* std::string */
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>    /* close() */

int mm_file = -1;        /* -1 means "not open"; 0 is a valid descriptor */
void* mm_memory = MAP_FAILED;
size_t mm_size = 0;

void process_data(void* data);   /* defined elsewhere */

void* map_file(const std::string& filepath)
{
  mm_file = open(filepath.c_str(), O_RDONLY);
  if (mm_file < 0) 
  {
    printf("open file error()\n");
    return NULL;
  }

  struct stat stat_buf;
  if (fstat(mm_file, &stat_buf) < 0) 
  {
    printf("fstat() error %d\n", errno);
    close(mm_file);      /* don't leak the descriptor on error */
    mm_file = -1;
    return NULL;
  }
  mm_size = stat_buf.st_size;

  mm_memory = mmap(0, mm_size, PROT_READ, MAP_PRIVATE, mm_file, 0);
  if (mm_memory == MAP_FAILED) 
  {
    printf("mmap() error %d\n", errno);
    close(mm_file);      /* don't leak the descriptor on error */
    mm_file = -1;
    return NULL;
  }

  return mm_memory;
}

void unmap_file()
{
  if (mm_memory != MAP_FAILED) 
    munmap(mm_memory, mm_size);
  if (mm_file >= 0)      /* 0 is a valid descriptor; the old "if (mm_file)" check also let -1 through */
    close(mm_file);
}

int main(int argc, char* argv[])
{
  void* data = map_file("/usr/home/...");
  if (data == NULL)
    return 1;
  process_data(data);
  unmap_file();
  return 0;
}
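
A minimal sketch of one mitigation worth trying, assuming the stalls come from demand paging: madvise() can hint the kernel that the mapping will be scanned sequentially so it can read ahead. advise_mapping() here is a hypothetical helper, not part of the utility above.

Code:
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>

/* Sketch: hint the kernel that the mapping will be scanned sequentially
   so it can read ahead, and ask it to start paging the file in now.
   Whether this helps depends on why the processes are stalling. */
void advise_mapping(void* addr, size_t len)
{
  if (madvise(addr, len, MADV_SEQUENTIAL) < 0)
    printf("madvise(MADV_SEQUENTIAL) error %d\n", errno);
  if (madvise(addr, len, MADV_WILLNEED) < 0)
    printf("madvise(MADV_WILLNEED) error %d\n", errno);
}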
 
I'm no programmer but I can't see anything wrong here. Perhaps the problem is with the process_data function?
 
SirDice said:
I'm no programmer but I can't see anything wrong here. Perhaps the problem is with the process_data function?

No, process_data is my function. It contains mathematical calculations using the data pointer; there are no system calls in it.

For example, the function processes the data in 0.2 s, which is good. The server has eight processors, so I hoped it could handle 8 * 5 = 40 queries per second, or maybe 30. But the real value is 5, and CPU usage is 12%.
 
Can you describe things like the size and nature of the file being mapped, and which file system you are using? Try setting up a small mfs slice, put the file there, and see if things improve.
 
expl said:
Can you describe things like the size and nature of the file being mapped, and which file system you are using? Try setting up a small mfs slice, put the file there, and see if things improve.

The file system is UFS. I also used tmpfs to create a file system in memory; the results were similar.

The file is approximately 350 MB and will grow. It contains binary data in a special format for quick searching. The utility searches this file with a brute-force full scan. The search is very specific and cannot be implemented in SQL (something like searching for similar chess problems).

There are two utilities. The first reads data from the database, processes it, and writes it into the file. The web script runs the second utility to do the quick search.
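
Purely as a hypothetical illustration of such a brute-force full scan over the mapped bytes (the real record layout and match criteria are not shown in the thread; kRecordSize and matches() are made up):

Code:
#include <cstddef>
#include <cstdint>

// Hypothetical sketch only: assumes fixed-size records packed back to back
// in the mapped file.
const size_t kRecordSize = 64;                 // made-up record size

static bool matches(const uint8_t* record)
{
  return record[0] == 0xFF;                    // placeholder predicate
}

size_t scan_all(const void* data, size_t size)
{
  const uint8_t* p = static_cast<const uint8_t*>(data);
  size_t hits = 0;
  for (size_t off = 0; off + kRecordSize <= size; off += kRecordSize)
  {
    // The first touch of each page can fault and wait on disk I/O here,
    // which shows up as low CPU usage across all running copies.
    if (matches(p + off))
      ++hits;
  }
  return hits;
}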
 
350 MB is a significant amount of data for a full scan, even when reading from memory. I would expect I/O lag that has little to do with the power of the CPU or the number of cores. That's why database indexing was invented, but I know nothing about the structure of your file.
 
I also wanted to add that integrated, file-based database engines work best with only a few parallel clients; otherwise I/O performance goes down steeply. That's where client<->daemon databases come in, with algorithms that boost the performance of many parallel clients if enough memory is available.
 
Can you load the whole file into memory and use multiple threads to access and process the data? If you don't want to use threads, maybe load it into shared memory and use multiple processes?
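
A minimal sketch of the threaded variant, assuming the data splits into independent slices; process_slice() is a stand-in for the real computation and the file path is a placeholder:

Code:
#include <cstddef>
#include <fstream>
#include <iterator>
#include <thread>
#include <vector>

// Sketch: read the whole file into one buffer, then let N threads each
// process an independent slice of it.
static void process_slice(const char* begin, const char* end)
{
  (void)begin; (void)end;   // stand-in for the real per-record computation
}

int main()
{
  std::ifstream in("/usr/home/...", std::ios::binary);   // placeholder path
  std::vector<char> buf((std::istreambuf_iterator<char>(in)),
                        std::istreambuf_iterator<char>());

  unsigned n = std::thread::hardware_concurrency();
  if (n == 0)
    n = 1;                                                // fallback if unknown
  const size_t chunk = buf.size() / n;
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < n; ++i)
  {
    const char* begin = buf.data() + i * chunk;
    const char* end = (i + 1 == n) ? buf.data() + buf.size()
                                   : begin + chunk;
    workers.emplace_back(process_slice, begin, end);
  }
  for (std::thread& t : workers)
    t.join();
  return 0;
}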
 