Solved Question about queue(3) STAILQ implementation

tembun · Apr 20, 2026

I'm exploring queue(3) framework - both its public API and internal implementation in sys/queue.h and I'm a little bit confused about one thing in structure that is used for {S}TAILQ head:

C:

#define STAILQ_HEAD(name, type)                                \
struct name {                                                  \
    struct type *stqh_first; /* first element */               \
    struct type **stqh_last; /* addr of last next element */   \
}

I can't comprehend why we store pointer to the first element (struct type *) in stqh_first, but we store 'address of last next element' (struct type **) in stqh_last? Honestly, not only can't I understand what 'last next element' means, but also why can't we just store a pointer to the last element (struct type *) in stqh_last, as we do for stqh_first?

The man page though says:

Code:

A singly-linked tail queue is headed by a structure defined by the STAILQ_HEAD macro.
This structure contains a pair of pointers, one to the first element in the tail queue
and the other to the last element in the tail queue.

but the actual implementaion differs, as we can see. It seems that the header file sys/queue.h doesn't explain this choice either.

Does anyone know (or have any guesses) why this structure is implemented in this particular way? What does this approach allow?

bakul · Apr 20, 2026

stqh_last points to the next *field* of the last entry. This is a slight optimization. The macros are hard to read but you can define a sample tail queue and operations on them using these macros, then look at the output of cc -E to make sense of them.

tembun · Apr 20, 2026

bakul said:
stqh_last points to the next *field* of the last entry.

bakul said:
This is a slight optimization.

I guess I start to sort of understand it, but it still is not very clear to me, where the optimization is... Could you please elaborate a bit?

bakul said:
The macros are hard to read but you can define a sample tail queue and operations on them using these macros, then look at the output of cc -E to make sense of them.

Yeah, I already did that, that's why I actually ended up having this question - this is how I saw that head has different types for first and last elements. Preprocessor's output is still a little bit tangled though

bakul · Apr 20, 2026

The best way to understand this is by drawing a picture of such a queue. The head has two ptrs, one to head and one to the next field of the last entry. Each entry has a next pointer *after* 3 integer fields, for example. Now draw pointer connections for three cases: an empty queue, a queue with N entries, adding a new entry at the end.

I think you can use indent (or manual formatting) to clean up the output.

The optimization is *head->tail->next = new_entry; head->tail = &new_entry; vs *head->lastnext = new_entry; head->lastnext = &new_entry->next. The idea is that you will a) walk the queue more often than b) adding new entries, so better to move adding the offset for next to the b) rather than a).

In addition, these macros are particularly useful when an entry has to be on more than one queue. For example implementing an NxM sparse matrix. Here each entry will contain x index, y index, value at matrix[x,y] and *two* next ptrs, one for each direction. And you will have one head struct each for rows or columns that have at least one entry.

Again, draw these pictures and it will become clearer. I hope.

Emrion · Apr 20, 2026

Once, I stumble on this question: why a pointer of pointer and not a pointer for the next element? I talk about double chained lists.

I read that to write the base functions of such a list needs less code and will be faster. So, I coded myself to verify.

Simple pointers:

Code:

struct ml {
    int data;
    struct ml* next;
    struct ml* prev;
};
struct head { struct ml* hnext; };

void inshead(struct head* h, struct ml* l) /* 5 lines */
void insbefore(struct head* h, struct ml* lb, struct ml* el) /* 6 lines */
void insafter(struct ml* la, struct ml* el) /* 5 lines */
void lremove(struct head* h, struct ml* el) /* 5 lines */
void lswap(struct head* h, struct ml* el1, struct ml* el2) /* 25 lines */

Total 46 lines.

With a pointer of pointer:

Code:

struct ml {
    int data;
    struct ml* next;
    struct ml** prev;
};
struct head { struct ml* hnext; };

void inshead(struct head* h, struct ml* el) /* 5 lines */
void insbefore(struct ml* lb, struct ml* el) /* 4 lines */
void insafter(struct ml* la, struct ml* el) /* 5 lines */
void lremove(struct ml* el) /* 3 lines */
void lswap(struct ml* el1, struct ml* el2) /* 21 lines */

Total 38 lines.

When it's a kernel code than runs all the times, it makes a difference.
Note also, you need less often to pass the head element to the functions.
I can post the code also, but it's somewhat long.

tembun · Apr 20, 2026

bakul, Emrion, guys, thank you so much for such valuable information! That's really interesting, I never even thought that just using a pointer can make a considerable difference.

Solved Question about queue(3) STAILQ implementation

tembun

bakul

tembun

bakul

Emrion

tembun