Hi, I'm about to design a file format for a scripting language of mine. And this also involves storing versions of the scripts in memory to be iterated over and interpreted by a virtual machine.
My question is, for performance, say I wanted to store a tree structure (for example an expression like "a = b * c + d * e"), should I represent this in memory as a kind of "deep" tree where the "+" has a pointer to some "*" object nodes, and the "=" has a pointer to the "+" node... or should I make everything "flat" so that the entire expression is contiguous in memory (as in one block allocated with malloc)?
It would seem that if the representation was flat, I would need to provide some kind of "simulated pointers" anyway - offsets so that each operator would know where its operands were.
I know modern CPUs try to predict what data you will access next... but do they prefer "flat" data or "lumpy" data? And are there other factors to consider?
Thanks.
My question is, for performance, say I wanted to store a tree structure (for example an expression like "a = b * c + d * e"), should I represent this in memory as a kind of "deep" tree where the "+" has a pointer to some "*" object nodes, and the "=" has a pointer to the "+" node... or should I make everything "flat" so that the entire expression is contiguous in memory (as in one block allocated with malloc)?
It would seem that if the representation was flat, I would need to provide some kind of "simulated pointers" anyway - offsets so that each operator would know where its operands were.
I know modern CPUs try to predict what data you will access next... but do they prefer "flat" data or "lumpy" data? And are there other factors to consider?
Thanks.