C Using posix_fallocate from userland.

Hello, if someone needs to use posix_fallocate(2) from userland, here is a simple C program:

C:
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <unistd.h>
#include <err.h>
#include <string.h>

extern char *__progname;

static void
usage(void)
{
        (void)fprintf(stderr, "Usage: %s -s SIZE -f FILENAME\n", __progname);
        exit(1);
}

int
main(int argc, char *argv[])
{
        int fd = -1, risultato, ch;
        off_t sflag = 0;

        while ((ch = getopt(argc, argv, "s:f:")) != -1) {
                switch (ch) {
                case 's':
                        sflag = (off_t)strtoll(optarg, NULL, 10);
                        break;
                case 'f':
                        /* Create with sensible permissions up front,
                         * instead of open(..., 0) plus a later fchmod(). */
                        if ((fd = open(optarg, O_CREAT|O_RDWR, 0666)) == -1)
                                err(1, "%s", optarg);
                        break;
                default:
                        usage();
                }
        }
        argc -= optind;
        argv += optind;

        /* Both options are mandatory. */
        if (fd == -1 || sflag <= 0)
                usage();

        /* posix_fallocate() returns the error number directly;
         * it does not set errno, so pass the return value to strerror(). */
        risultato = posix_fallocate(fd, 0, sflag);
        if (risultato != 0) {
                fprintf(stderr, "posix_fallocate: %s\n", strerror(risultato));
                close(fd);
                return 1;
        }
        close(fd);
        return 0;
}
 
To be honest, I didn't know the truncate command existed in the shell. It already does this :) so I'll use it instead.
Many thanks.
 
The difference between ftruncate() and posix_fallocate() is that the latter guarantees that there is enough space to actually write the data. Think about sparse files to understand that. A call to ftruncate() that increases the file size only sets the file size to the desired value. It does not have to allocate disk space for the blocks that were "jumped over". In contrast, a call to posix_fallocate() makes sure that every byte between 0 and the indicated size has disk blocks allocated to it: after a call to posix_fallocate(), no write() call to that range can fail with ENOSPC.

I think in some file system implementations, posix_fallocate() also has a side effect: Since it is typically used before writing a lot of data sequentially, it allows the file system to pre-allocate space in large units, ideally linearly on disk, to optimize write performance, and future read performance. Typically, applications that use posix_fallocate() are interested in high-speed sequential IO over the whole file. In contrast, ftruncate() is often used to create sparse files, so file systems can't optimize the file for speed in that case.
 
I don't see anything in the description of ftruncate that precludes that guarantee, and it is also defined to write zeros when extending a file.

Obviously there's a problem, or else posix_fallocate wouldn't be needed, but from the definition of ftruncate, incorrectly coping with low capacity or sparse files would seem to be a bug rather than a feature.
 
No, ftruncate is not defined to write zeros. It says "as if by writing bytes with the value zero", which can be satisfied by making a sparse file: the only POSIX-portable way to determine the content of a file is to read it, and a sparse file reads back as zeroes.

Now your first sentence is definitely correct: nothing precludes ftruncate from actually writing zeroes, except that this would be inefficient in the case of intentionally sparse files. And unfortunately sparse files do exist in the real world (I don't like that, but it is a fact). My favorite example is Microsoft Office applications from a decade or two ago: they used to seek to a position of roughly 2^64 - 100 bytes in the file, and read and write a few bytes there (those were used to implement locking of files against concurrent access, probably in some Paxos-like fashion). That meant that all .doc and .xls files were highly sparse. If a file system implemented this by actually writing zeroes to that position, nearly all computers in the world would run out of disk space (very few machines in the world can store a file that is 2^64 bytes in size).
 
One thing you can do with huge sparse files is fool around with email scanners. Create a set of sparse files, tar them up (tar knows how to handle this) and mail the archive to someone. The scanner will try to extract and scan the files in the attachment, and 'thingy go boom': either it runs out of space if the untar is naive, or it spends a long time scanning terabytes of zeros for signatures.

Another example of sparse files is core dumps, by the way.
 