C Simple filecopy in C with progress, that leaves CPU surviving...

Hello,

top shows xterm and Xorg at 100% CPU, because of the '\r'.
I tried to call printf only once per percent, but I ran into trouble with float, double and rounding. Any idea would be very welcome: after many long attempts I still have not found a solution. Help would be great.

Thank you !!!

C:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>     // PATH_MAX
#include <dirent.h>
#include <ctype.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

//#include <math.h>
//#include <time.h>

long totalfilesize = 0;

///////////////////////////////
///////////////////////////////
///////////////////////////////
/////////////////////////////////
/////////////////////////////////
/////////////////////////////////
// Returns 0 if the path does not exist, 1 for a readable file, 2 for a directory.
int fexist(char *a_option)
{
  char dir1[PATH_MAX];
  struct stat st_buf;
  int fileordir = 0 ;

  // strncpy does not terminate the string if the source is too long
  strncpy( dir1 , a_option,  PATH_MAX - 1 );
  dir1[PATH_MAX - 1] = '\0';

  if ( stat ( dir1 , &st_buf) != 0 ) {
    return 0;   // st_buf is not valid when stat fails
  }

  // this is compatible to check if a file exists
  FILE *fp2check = fopen( dir1  ,"r");
  if( fp2check ) {
    // exists
    fileordir = 1;
    fclose(fp2check);
  }

  if (S_ISDIR (st_buf.st_mode)) {
    fileordir = 2;
  }
  return fileordir;
}






void fseek_filesize(const char *filename)
{
    FILE *fp = NULL;
    long off;

    fp = fopen(filename, "r");
    if (fp == NULL)
    {
        printf("failed to fopen %s\n", filename);
        exit(EXIT_FAILURE);
    }

    if (fseek(fp, 0, SEEK_END) == -1)
    {
        printf("failed to fseek %s\n", filename);
        exit(EXIT_FAILURE);
    }

    off = ftell(fp);
    if (off == (long)-1)
    {
        printf("failed to ftell %s\n", filename);
        exit(EXIT_FAILURE);
    }

    printf("%ld\n", off);
    totalfilesize = off;

    if (fclose(fp) != 0)
    {
        printf("failed to fclose %s\n", filename);
        exit(EXIT_FAILURE);
    }
}



// Round dval to n decimal places by formatting it and parsing it back.
double f_round(double dval, int n)
{
    char l_fmtp[32], l_buf[64];
    snprintf (l_fmtp, sizeof l_fmtp, "%%.%df", n);
    snprintf (l_buf, sizeof l_buf, l_fmtp, dval);
    return strtod(l_buf, NULL);
}


//////////////////
void ncpdisplay( char *filetarget,  char *filesource )
{
  // fread
  char            buffer[1];
  size_t          n;
  FILE *fp;
  FILE *fp2;
  long copypos = 0;

  if ( fexist( filesource ) == 1 )
  {
        fp = fopen( filesource, "rb");
        fp2 = fopen( filetarget, "wb");
        if ( fp == NULL || fp2 == NULL )
        {
             printf("failed to open source or target\n");
             exit(EXIT_FAILURE);
        }

        // fread returns 0 at end of file, so nothing is written after the last byte
        while( ( n = fread(  buffer, sizeof(char), 1 , fp) ) == 1 ) {
           fwrite( buffer, sizeof(char), 1,  fp2);

           copypos++;

           //////////////////////////////////////////////////////////////
           //////////////// this area will destroy the CPU !!!! <----
           ///  [[[[
           double z = (double)copypos / (double)totalfilesize*100;
           //if ( (int)f_round( z, 0 ) % 2 == 0 )
           {
              printf("Copying byte: [%ld/%ld] [%lf %%] [%lf]", copypos, totalfilesize, z, f_round(z, 0) );
              printf("\r");
           }
           ///  ]]]]
           //////////////////////////////////////////////////////////////

           usleep( 1 );
        }
        fclose(fp2);
        fclose(fp);
      }
      printf("\n");
}







int main(int argc, char *argv[])
{
    int i ;
    if (argc < 2)
    {
        printf("%s <file1> <file2>...\n", argv[0]);
        exit(0);
    }

    if (argc == 3)
    {
       fseek_filesize(argv[1]);
       printf( "Source: %s\n", argv[1]);
       printf( "Target: %s\n", argv[2]);
       ncpdisplay( argv[2], argv[1] );
    }

    return 0;
}
 
You know, usleep(3) is the abbreviation for microseconds sleep. One microsecond does not leave much time to the CPU for doing things in other processes. I would give it at least a millisecond, i.e. 1000 µs.
 
Also, reading and writing one byte at a time is - well - Linus threw one of his fits about something like this. Let's call it a beginner's simplification.
 
I could make about 100 comments about the source code. Don't have time for that.

For your performance problems: copying 1 byte at a time is insane, as Crivens already said. Using FILE is also pointless and wastes performance; you should use open/read/write instead of fopen/fread/fwrite. Then use a buffer large enough that prefetching in the file system underneath becomes efficient and the number of performance-killing user/kernel transitions is minimized. I think my personal copy program (which also takes a SHA-512 checksum while copying) still uses 1 MiB blocks, but it was written a decade ago. Today I would use 8 or 16 MiB blocks.
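The open/read/write approach described above could look roughly like this. It is a minimal sketch, not the poster's actual copy program: the names `copy_fd`, `copy_file` and `COPY_BUF_SIZE` are made up for illustration, and the 1 MiB buffer is only an assumed starting point that should be tuned for the system at hand.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BUF_SIZE (1024 * 1024)   /* 1 MiB; an assumption, tune as needed */

/* Copy everything readable from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error. */
static long long copy_fd(int in_fd, int out_fd)
{
    char *buf = malloc(COPY_BUF_SIZE);
    long long total = 0;
    ssize_t nread;

    if (buf == NULL)
        return -1;

    while ((nread = read(in_fd, buf, COPY_BUF_SIZE)) != 0) {
        if (nread < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            free(buf);
            return -1;
        }
        /* write() may accept fewer bytes than requested, so loop */
        ssize_t off = 0;
        while (off < nread) {
            ssize_t nw = write(out_fd, buf + off, (size_t)(nread - off));
            if (nw < 0) {
                if (errno == EINTR)
                    continue;
                free(buf);
                return -1;
            }
            off += nw;
        }
        total += nread;
    }
    free(buf);
    return total;
}

/* Hypothetical convenience wrapper: copy the file at src to dst. */
static long long copy_file(const char *src, const char *dst)
{
    int in_fd = open(src, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }
    long long n = copy_fd(in_fd, out_fd);
    if (close(out_fd) != 0)             /* close can report a late write error */
        n = -1;
    close(in_fd);
    return n;
}
```

With a buffer this size, a 1 GiB file needs about a thousand read/write pairs instead of a billion single-byte calls.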

And then, you really don't need floating point arithmetic to calculate percentages. It can easily be done with integer arithmetic, which runs much faster. And you really don't need to calculate the percentage for every byte. I would think about it this way: When do you need to show progress reports with percentages? When the human user is likely to be impatient. What is the limit where humans become impatient? A few seconds. You could run a timer and show a progress report, for example, every 1 or 5 seconds if the copy hasn't finished yet, or you could estimate the throughput of the copy (with single disks and normal file systems it's probably 50 MB/s) and give a progress report every so many MB.
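As an illustration of the integer-only idea, here is a small sketch that prints a progress line only when the whole-number percentage changes, so at most about a hundred lines are written per copy. `percent_done`, `struct progress` and `progress_update` are hypothetical names, and reporting per percent step is just one of the options mentioned above; a timer-based report would work as well.

```c
#include <stdio.h>

/* Whole-number percentage of done/total, clamped to 0..100.
 * An empty file (total <= 0) counts as complete.
 * Note: done * 100 would overflow only for sizes in the tens of petabytes. */
static int percent_done(long long done, long long total)
{
    if (total <= 0 || done >= total)
        return 100;
    if (done < 0)
        return 0;
    return (int)(done * 100 / total);
}

/* Remembers the last percentage shown; start with last_percent = -1. */
struct progress {
    int last_percent;
};

/* Print a progress line only if the percentage changed since the last call.
 * Returns 1 if a line was printed, 0 otherwise. */
static int progress_update(struct progress *p, long long done, long long total)
{
    int pct = percent_done(done, total);

    if (pct == p->last_percent)
        return 0;
    p->last_percent = pct;
    fprintf(stderr, "\rCopying: %lld/%lld bytes [%d%%]", done, total, pct);
    fflush(stderr);
    return 1;
}
```

Calling `progress_update` once per copied block (not once per byte) keeps the terminal, and therefore xterm and Xorg, almost idle.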

Another interesting question: Your trick of opening the file and then stat'ing it is silly. The easy way to find out whether a file system object exists, whether it is a file or directory (or something entirely different, like a device or socket), and what its size is, is to do a single stat() call. As a matter of fact, what I would do is first open() the file (handle failures there), then call fstat() on the open file descriptor. Why? Because otherwise you will get inconsistent results if the file is being deleted or renamed while your program is running. You really want to access the file exactly once, to get atomicity.
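A sketch of the open-then-fstat sequence, assuming the program only wants to copy regular files. The helper name `open_regular` is invented for this example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path read-only and report its size through *size.
 * Returns the file descriptor, or -1 if the path cannot be opened
 * or is not a regular file (directory, device, socket, ...). */
static int open_regular(const char *path, off_t *size)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;                      /* does not exist or not readable */

    /* fstat describes the object we actually opened, even if the
     * name is renamed or deleted in the meantime */
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    *size = st.st_size;
    return fd;
}
```

This replaces both the existence check and the fseek/ftell size lookup with one open() and one fstat() on the same descriptor.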

While we're talking about atomicity: What happens if the source file size changes while you're copying it? It could get longer or shorter, and that is very hard to prevent on a Unix machine. Since I don't know your use case and requirements for this program, I don't know what the correct answer is; my personal copy program (which is used as part of a backup system) will copy as much as it finds in a single pass, and report a warning if the file size changed while reading.

Advanced homework assignment: Efficiently copy sparse files.

P.S.: What I forgot to mention: For good measure, you should cache the mtime of the file when you start the copy, and recheck it after the copy is done (all while holding the open() handle to make sure the file identity doesn't change under you; remember, a file name is not a unique and persistent identifier for a file). If the time changes while you are copying (or if the size changes), then someone has likely modified the file, and your copy is most likely not usable, because it may contain a mix of old and new content. What to do in this case is up to you, because it depends on the requirements for the copy. Unfortunately, in Unix it is virtually impossible to use mandatory file locks universally (unless you go to file-system specific implementations), so full atomicity of file access can't be done. If you want that, you need to get a professional-strength production OS, like Windows (!!!), MVS or VMS. If you are willing to be file-system specific, the easiest technique for consistency is to use snapshots.
 