Hi everyone,
As most of us know, the Unix philosophy is a fascinating idea: small, simple, well-made programs work together to accomplish one large task. As far as I know, though, one problem with it is that every program deals with plain-text input and output, for example cat a.file | grep something. Maybe in the 70s there were no good structured data formats such as XML, JSON, YAML and others, I don't know.
I was thinking: what if the OS (say FreeBSD) added one more standard file descriptor alongside STDIN, STDOUT, and STDERR? Let's call it STDSTRUCT, for structured data. The kernel would open this file descriptor (FD) by default for every new process, just like STDIN, STDOUT, and STDERR, so a process could read and write structured data through it. Let's assume we only deal with the JSON format. Process A can print the following to STDOUT:
Code:
1 Edward the Elder United Kingdom House of Wessex 899-925
2 Athelstan United Kingdom House of Wessex 925-940
3 Edmund United Kingdom House of Wessex 940-946
4 Edred United Kingdom House of Wessex 946-955
5 Edwy United Kingdom House of Wessex 955-959
That is fine, but process A could also write the same data to STDSTRUCT, this time in JSON format:
JSON:
[
  {
    "ID": 1,
    "Name": "Edward the Elder",
    "Country": "United Kingdom",
    "House": "House of Wessex",
    "Reign": "899-925"
  },
  {
    "ID": 2,
    "Name": "Athelstan",
    "Country": "United Kingdom",
    "House": "House of Wessex",
    "Reign": "925-940"
  },
  {
    "ID": 3,
    "Name": "Edmund",
    "Country": "United Kingdom",
    "House": "House of Wessex",
    "Reign": "940-946"
  },
  {
    "ID": 4,
    "Name": "Edred",
    "Country": "United Kingdom",
    "House": "House of Wessex",
    "Reign": "946-955"
  },
  {
    "ID": 5,
    "Name": "Edwy",
    "Country": "United Kingdom",
    "House": "House of Wessex",
    "Reign": "955-959"
  }
]
This JSON is not printed to STDOUT; it is only written to the STDSTRUCT FD (let's say fd=3). Then, in a pipeline A | B, process B can handle both streams, with STDSTRUCT being optional. Process B doesn't have to deal with the STDSTRUCT input and can simply discard it, or it can process the JSON (or any other structured data) if it wants to. Wouldn't it be great to make this kind of change at the kernel level, so that it becomes easy for developers to adopt structured data processing? I believe this could already be done with shm_open, but that is extra work for program developers. As for which format to use, maybe it needs a standard, such as having the first 64 bytes (or whatever) written to STDSTRUCT indicate the format of the rest of the stream.
Please let me know your opinions and ideas about dealing with structured data the Unix way.
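To make the idea a bit more concrete, here is a rough user-space sketch in Python of what both sides could look like. It is only an illustration under my own assumptions, not a real kernel or shell feature: STDSTRUCT_FD = 3, the 64-byte NUL-padded header, and all the helper names (make_header, parse_header, format_text, write_structured, read_structured) are made up for this example. The writer skips the structured stream entirely when fd 3 is not open, which matches the "optional" part of the proposal.

```python
import json
import os

STDSTRUCT_FD = 3   # assumption: the descriptor number suggested in the post
HEADER_SIZE = 64   # assumption: fixed-size header naming the stream format


def make_header(fmt):
    """Encode a format name such as "json" as a NUL-padded 64-byte header."""
    raw = fmt.encode("ascii")
    if len(raw) > HEADER_SIZE:
        raise ValueError("format name does not fit in the header")
    return raw.ljust(HEADER_SIZE, b"\0")


def parse_header(header):
    """Recover the format name from a header produced by make_header."""
    return header.rstrip(b"\0").decode("ascii")


def format_text(records):
    """Plain-text rendering, one record per line, as it would go to STDOUT."""
    return "".join(
        f'{r["ID"]} {r["Name"]} {r["Country"]} {r["House"]} {r["Reign"]}\n'
        for r in records
    )


def fd_is_open(fd):
    """Return True if fd refers to an open descriptor in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def write_structured(records, fd=STDSTRUCT_FD):
    """Write header + JSON to fd; do nothing if the descriptor is not open.

    Note: an open fd 3 is only assumed to be meant for structured data.
    """
    if not fd_is_open(fd):
        return False
    data = make_header("json") + json.dumps(records).encode("utf-8")
    view = memoryview(data)
    while view:  # os.write may accept only part of the data at once
        written = os.write(fd, view)
        view = view[written:]
    return True


def read_structured(fd=STDSTRUCT_FD):
    """Read header + body from fd until EOF and decode the JSON payload."""
    with os.fdopen(fd, "rb", closefd=False) as stream:
        fmt = parse_header(stream.read(HEADER_SIZE))
        body = stream.read()
    if fmt != "json":
        raise ValueError(f"unsupported structured format: {fmt!r}")
    return json.loads(body)
```

Process A would call print(format_text(records), end="") for STDOUT and then write_structured(records); process B would call read_structured() only if it cares about the structured copy. Today a shell can already open the extra descriptor with a redirection such as 3>kings.struct, but a plain A | B only connects STDOUT to STDIN, so wiring fd 3 from A to B automatically is exactly the part that would need support from the shell or the kernel.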
Thank you.