I Tried "Vibecoding"

I wanted to try "vibecoding," so I directed the ChatGPT 5.2 reasoning model to create two sh scripts. I've been testing them for a while and they seem to work okay.

Do I recommend "vibecoding"? Only for personal projects to be executed at home and not shared with the public. Why? Because I believe that a human should write or revise any code that's put "out there" for anyone to use. And if you are gonna revise it, you might as well write it (or you are gonna get really bored just reading code).

What do the scripts do?
  • mannoob: Uses OpenAI's web API to convert any locally installed manpage into a version "for noobs," which is also a manpage, and shows it. It stores the generated manpages in a cache and reuses them if they already exist. You can force a refresh.
  • url2man: Gets a URL, reads it, turns it into a manpage using OpenAI's web API, and shows it. It also stores them in a cache, and you can force a refresh.
Both scripts receive a good number of parameters, allow a config file, and include a help message.
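
For readers who haven't opened the attachments, the "use the cache unless a refresh is forced" behaviour can be sketched roughly like this. This is not the code ChatGPT generated; `CACHE_DIR`, `cache_path` and `cached_or_generate` are made-up names for illustration, and the real scripts may key and store their cache differently.

```shell
#!/bin/sh
# Sketch only: CACHE_DIR, cache_path and cached_or_generate are hypothetical
# names, not taken from mannoob or url2man.
CACHE_DIR=${CACHE_DIR:-"${HOME}/.cache/man-sketch"}

# cache_path KEY -- map an arbitrary key (page name, URL) to a cache file.
cache_path() {
    # cksum is POSIX; a real tool would probably want a stronger hash.
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    printf '%s/%s.cache\n' "$CACHE_DIR" "$sum"
}

# cached_or_generate REFRESH KEY COMMAND [ARGS...]
#   Print the cached result for KEY; run COMMAND only on a cache miss
#   or when REFRESH is 1.
cached_or_generate() {
    refresh=$1
    key=$2
    shift 2
    file=$(cache_path "$key")
    if [ "$refresh" != 1 ] && [ -s "$file" ]; then
        cat "$file"
        return 0
    fi
    mkdir -p "$CACHE_DIR" || return 1
    tmp="$file.tmp.$$"
    # Write to a temporary name first so a failed run never leaves a
    # half-written cache entry behind.
    if "$@" > "$tmp"; then
        mv "$tmp" "$file" && cat "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Something like `cached_or_generate 0 "ls(1)" some_generator ls` would then only call the (hypothetical) `some_generator` the first time, and passing `1` as the first argument forces a refresh.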

I'm not going to add additional information because I don't think a non-expert should try them.

I share them so expert programmers can take a look at them as a curiosity. I didn't write a single line of the code.

They are too long to paste here. I'm attaching them as text docs.

EDIT: Of course, ChatGPT didn't create these scripts in one try. Several steps were needed, as well as some tests, and a number of rewrites.
 


It's really only useful if you know how to code to begin with. It's fine for some boilerplate code, but you don't want to do anything complex. AI makes mistakes, lots of them. Those will go unnoticed if you can't understand the code it produces.
 
You don't want to worry your head about all that "syntax and semantics" stuff. The real true "vibe coders" don't understand the code that is generated at all, they just cut and paste it together to get something that might work, and ship it. This guy is a champion... I'm sure Sam Altman loves him. Well, it's easy money... just don't test it too hard, or better still, don't test it at all, that's what users are for... 😂
 
Quick pass [1]:
Interesting but...
1. These are mostly built by checking off every edge case, at every step, and trying to deal with it (which is not how you program). An analogy for why it comes out this way: it's like painting a picture while looking through a straw.
2. Very, very many requirements: Perl, jq, Python, curl, etc. (Out of that list, I believe I have curl.) This will make for very messy updates later and limit your audience.
3. Roff? Everyone (it may be hard to list who doesn't) uses mdoc now. Roff is very old.
4. "mandoc -T ascii ..." or "mandoc -T markdown ..." mandoc(1) is already included.
5. Typically, something like this (sh/w)ould operate on stdin/stdout. Writing files instead is a huge liability for you. Not to mention that when you program something to write a file, you have "cleanup" to consider.
6. Extremely fragile! This is not how we do things in unix. ...actually, almost none of that code is how you do things in unix.
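
To illustrate point 5, here is a minimal sketch of the filter shape I mean: read stdin (or a named file), write stdout, create no files. `render_page` is just a hypothetical stand-in for the real work, and `filter_main` is a made-up name.

```shell
#!/bin/sh
# Sketch only: render_page and filter_main are hypothetical names.

# render_page -- stand-in for the real transformation.
#   Reads stdin, writes stdout; here it merely numbers the lines.
render_page() {
    awk '{ printf "%3d  %s\n", NR, $0 }'
}

# filter_main [FILE] -- read FILE if given (and not "-"), else stdin.
#   Nothing is written to disk, so there is nothing to clean up.
filter_main() {
    if [ $# -gt 0 ] && [ "$1" != "-" ]; then
        render_page < "$1"
    else
        render_page
    fi
}
```

With this shape the caller decides where the output goes (`filter_main page.txt | less`, `... > out.txt`), and the tool itself never has to manage a cache directory or temp files.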

When you read a programming book it will give you a function to "calculate the GCD" but this code is never what you'd really use. It is at best a skeleton to build upon.

So, if this is not actually meant as an insult ("I can do in a few days what takes programmers 10+ years of knowledge/process building") then...
You are prompting AI incorrectly. To get anything good from an AI (skeleton code is about the extent of AI's ability, and even that has a very weird layout/design), you have to know some programming principles, constrain it more, and steer it down a design path. You cannot just ask for a program to do XXX; you have to design your program, then ask it for a specific function or to fix the variables, etc., and even then you will be fixing that output a lot. I will think of a good example starter program and some questions to ask; we can see how far we get if you are interested. It's also interesting that you chose "parsing" (I mean, parsing is a fascinating topic and quickly morphs into compiler design methodology), but I suspect this is because you seem addicted to "AI".
 
You don't want to worry your head about all that "syntax and semantics" stuff. The real true "vibe coders" don't understand the code that is generated at all, they just cut and paste it together to get something that might work, and ship it. This guy is a champion... I'm sure Sam Altman loves him. Well, it's easy money... just don't test it too hard, or better still, don't test it at all, that's what users are for... 😂
Bill Gates knows that all too well!

And so do we, that's why we stay away from all that M$ stuff.
 
My opinion about the code (without having studied it in depth) is that it works eerily well and it's eerily complex. For instance: 1) it respects sh/FreeBSD features without letting bash/Linux tendencies govern; 2) it creates perl and python scripts within the script and executes them correctly; 3) it's remarkably long; 4) it seems decently structured and commented; 5) it correctly checks the availability of other tools before using them (python, w3m). It wasn't the topic of this thread, but I'm surprised by the ChatGPT 5.2 reasoning model. It's the first model that feels close to intelligent to me.
 
coax the LLM into making a small change
I was curious, so I tried it with url2man. I added a "-p" switch (a synonym for the existing "--python" = force the use of python instead of perl).
This is the diff:


Diff:
--- a/url2man
+++ b/url2man
@@ -29,7 +29,7 @@
 Options:
   -m, --mode MODE              Mode: detailed|noob|summary (default: detailed).
   -c, --config FILE            Read config from FILE (KEY=VALUE lines only).
   -M, --model MODEL            Model name (default depends on mode unless OPENAI_MODEL is set).
   -v, --verbose                Print progress information to stderr.
-  --python                     Force use of python3 as the JSON helper (test option).
+  -p, --python                 Force use of python3 as the JSON helper (test option).
   --max-out N                  Max output tokens (default: detailed=8192, noob=4096, summary=2048).
   --max-in-bytes N             Input byte cap for extracted page text (default: detailed=500000, others=250000).
   --max-download-bytes N       Max bytes downloaded from URL body (default: detailed=8000000, others=4000000).
@@ -392,7 +392,7 @@
     -c|--config) [ $# -ge 2 ] || usage_err; CONFIG=$2; shift 2 ;;
     -M|--model)  [ $# -ge 2 ] || usage_err; MODEL_OPT=$2; shift 2 ;;
     -v|--verbose) VERBOSE=1; shift ;;
-    --python) FORCE_PYTHON=1; shift ;;
+    -p|--python) FORCE_PYTHON=1; shift ;;
     --max-out) [ $# -ge 2 ] || usage_err; MAX_OUT=$2; MAX_OUT_SET=1; shift 2 ;;
     --max-in-bytes) [ $# -ge 2 ] || usage_err; MAX_IN_BYTES=$2; MAX_IN_SET=1; shift 2 ;;
     --max-download-bytes) [ $# -ge 2 ] || usage_err; MAX_DOWNLOAD_BYTES=$2; MAX_DL_SET=1; shift 2 ;;
     -r|--refresh) REFRESH="1"; shift ;;
 
Additional info: I asked it to make the least changes and to generate the diff directly, not the new script. I wouldn't be surprised if the line numbers are incorrect. I didn't check.
 
I got an AI angry (sort of). For some reason I decided to test an AI by asking for some "coercion table code," and I decided to work with a lisp interpreter without 'let' (I was in a lisp mood that day, I guess). I built the `drop coercion` and asked it for the `raise`. It spent most of its time complaining about my `let()` and never really provided any code that worked.
 
The change itself seems solid, though. But LLMs are still infamously bad at counting stuff.

Let me check... Yes, as I suspected the line numbers are incorrect.
 
Add: "support comments in config file. support the use of shell and c style comments."
Also: if config keys are optional, then why does it read all upon run (these would override)?
 
You don't want to worry your head about all that "syntax and semantics" stuff. The real true "vibe coders" don't understand the code that is generated at all, they just cut and paste it together to get something that might work, and ship it. This guy is a champion... I'm sure Sam Altman loves him. Well, it's easy money... just don't test it too hard, or better still, don't test it at all, that's what users are for... 😂
The interesting info supplied by blackbird9 (I guess a native speaker wouldn't phrase it this way, but I am what I am) inspired my latest crumb, which follows, along with one of its graphical versions.

Crumb 1487: Synthetic Symbiosis: In the near future, a new category of humans thrives—those not only good at telling systems based on artificial intelligence what to do, but also skilled at following the instructions given to them by those same systems.

Text by Alfredo Llaquet-Alsina.
Design and image by ChatGPT.

Crumb 1487-FP-Synthetic_Symbiosis___.jpg
 
Add: "support comments in config file. support the use of shell and c style comments."
Also: if config keys are optional, then why does it read all upon run (these would override)?
The reasons why I asked for the add:
  1. The existing logic is flawed.
    1. It's not out of the question for an end-user to place comments in the file or comment out 'optional' vars.
    2. The existing logic (reading each line) seems like it could be replaced by simply sourcing the file.
      1. If this one aspect is wrong, then more are suspect.
Otherwise,
  1. There probably should be support added for a -n (dry run) switch.
    1. Seems overly dangerous to just allow something to seek external and write internal without the end user being able to inspect.
    2. This feature alone will cause a pretty big refactor.
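
To make the -n idea concrete, here is a minimal sketch of what I have in mind. `DRY_RUN`, `run` and `parse_args` are hypothetical names, not anything from the posted scripts: every side-effecting command goes through one wrapper, and the switch turns that wrapper into a printer.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run and parse_args are hypothetical names.
DRY_RUN=0

# parse_args [-n] -- set DRY_RUN=1 when -n is given.
parse_args() {
    OPTIND=1
    while getopts n opt; do
        case $opt in
            n) DRY_RUN=1 ;;
            *) return 2 ;;
        esac
    done
}

# run COMMAND [ARGS...] -- execute the command, or only print it
#   when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'would run:'
        for arg in "$@"; do
            printf ' %s' "$arg"
        done
        printf '\n'
    else
        "$@"
    fi
}
```

After `parse_args -n`, a call such as `run rm -f /tmp/x` only prints `would run: rm -f /tmp/x`. Note that redirections written outside the wrapper (`run cmd > file`) still happen, which is part of why retrofitting this onto existing code means a refactor.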
 
Those are interesting tests. I'll probably try them later. They are not as basic as the change I tried, so they'll probably require some trial and error. I'll report back (not soon, I'm afraid; I'm focused on other things right now). Thanks, JohnK.
 
I already have my suspicions about what will happen, but I'm interested in the results. These were pretty low-hanging-fruit changes, and I really didn't get more than a few minutes to look at the code. Your experience may tell you the design is interesting, but to me it looks like spaghetti code at best (you're better than I am if you can see the logic in most of that).
 
to me it looks like spaghetti code at best
I thought it would be fun to tell ChatGPT to reply to this. I told it that it could agree or disagree and explain why. It wrote a lot (which I pasted at the end). Then, I told it to summarize it (pasted first). JohnK: This is not a conversation with ChatGPT. I'm not pasting your retort, sorry.

ChatGPT 5.2 reasoning model:

TL;DR:

A quick skim can make the script look “spaghetti,” and that reaction is understandable. But most of the apparent complexity is intentional: it is a single-file POSIX sh tool meant to run on FreeBSD with minimal dependencies, while defensively handling two hostile inputs (the network and model-generated roff). That forces lots of explicit checks (Content-Type refusal, download cap, extraction-quality gate, strict roff validation/sanitization) and fallback paths (w3m→lynx→crude, perl→python→jq). In a higher-level language these would be cleaner modules; in sh they end up as verbose plumbing. It can be made cleaner, but doing so typically means splitting into multiple components or rewriting in another language, which increases dependencies and setup.

The first long answer:

A quick skim can definitely make it look “spaghetti,” and that criticism is not entirely unfair—but it is also a bit context-blind.

This is a single-file POSIX sh utility whose constraints are: FreeBSD base userland compatibility, minimal dependencies, and security hardening around two hostile inputs (the network and model-generated roff). Under those constraints, the code ends up as a set of small, purpose-built functions plus a linear “pipeline” (fetch → extract → validate → prompt → call API → validate roff → sanitize → cache → display). That style is verbose and repetitive in shell, because there is no module system, no structured error handling, and little data typing. The result reads more like “defensive plumbing” than elegant application code.

The “spaghetti” feeling usually comes from a few deliberate choices:

  • Hardening logic is interleaved with normal logic (Content-Type refusal, download cap, extraction-quality gate, roff strict check, roff sanitization). In a higher-level language, these would be separate modules. In POSIX sh, separation is mostly by functions and careful ordering.
  • Multiple fallbacks (w3m → lynx → crude strip; perl → python → jq; redirect chains in headers) add branches. That is intentional, because the goal is “works on a typical FreeBSD box” rather than “works only when tool X is installed.”
  • Shell makes clean structure expensive. Many lines exist to avoid bashisms and GNUisms, and to keep behavior predictable with BSD utilities.
Could it be cleaner? Yes. A more “pure” design would be a small set of subcommands and separate helper scripts (fetcher, extractor, API client, roff validator) or rewriting in a mainstream language with a real HTTP client and HTML parser. But that increases dependencies and setup steps, which was explicitly avoided.

So, the fair summary is: it is not “spaghetti” in the sense of being unstructured or arbitrary; it is “busy” because it is a defensive, compatibility-minded shell pipeline with security checks and fallback paths. If someone wants an idiomatic, minimal, elegant implementation, the correct path is probably a rewrite in a higher-level language—at the cost of depending on that runtime and its libraries.
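
As an illustration of the fallback chains mentioned above (w3m → lynx → crude strip), a minimal sketch might look like the following. It is not the code from url2man; `first_available` and `html_to_text` are hypothetical names, and the w3m/lynx flags shown are the commonly documented ones for dumping HTML read from stdin.

```shell
#!/bin/sh
# Sketch only: first_available and html_to_text are hypothetical names.

# first_available TOOL... -- print the first TOOL found in PATH.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# html_to_text -- read HTML on stdin, write plain text on stdout,
#   using the best extractor that happens to be installed.
html_to_text() {
    case $(first_available w3m lynx) in
        w3m)  w3m -dump -T text/html ;;
        lynx) lynx -dump -nolist -stdin ;;
        *)    sed -e 's/<[^>]*>//g' ;;   # crude fallback: strip tags only
    esac
}
```

Each extra fallback adds a branch like this, which is one reason such a script grows long even when each piece is simple.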
 
Aw, that's a bit unfair. My turn: so, who do you trust now?

No retort from me. However, I wanted to post some discussion points (as quickly and briefly as I can, because I'm running out of time) in case you do find enough time to continue. I am finding this more and more interesting and would really like to continue.

For the record, here is why I thought the config file would be an interesting ask and how I would do the config / variable thing (and thus why I said I'd be interested in how the AI codes up the situation).

Here are a few functions, from me, which I think lay out my design for something like this. How would you handle the situation (is my logic below flawed and/or who's right--sorry to put you in a situation to put on your former hat again)?

Code:
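# get_file_contents_without_comments --
#   Return the contents of a file without comments.
#   File's comments can be either simple conf style
#   -i.e. hash (#) or c-style (/* */).
#   NOTE: a '#' inside a value is also treated as a comment start, and
#   anything sharing a line with the start or end of a multi-line
#   c-style comment is dropped along with it.
# EX
#  (get_file_contents_without_comments "${1}" | grep -E "VALUE" -m 1 2>/dev/null)
get_file_contents_without_comments() {
   # 1. strip /* ... */ comments that open and close on one line
   # 2. delete multi-line /* ... */ blocks
   # 3. strip '#' to end of line
   # 4. drop lines that are left blank
   sed -e 's|/\*.*\*/||g' \
       -e '/\/\*/,/\*\//d' \
       -e 's/#.*$//' \
       -e '/^[[:space:]]*$/d' "${1}"
}

# config_read_file --
#    Read a file and look for a value.
#    The key is used as a regular expression, so keep it to plain names.
config_read_file() {
   # Anchor the match so "key" does not also match "other_key" or a value.
   get_file_contents_without_comments "${1}" 2>/dev/null | \
        grep -e "^[[:space:]]*${2}[[:space:]]*=" | \
        head -n 1 | cut -d '=' -f 2- | \
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# config_get --
#   A wrapper to call `config_read_file` and set the variable value.
# EX
#       config_variable=$(config_get some_variable "${_prefix}/some_config.cfg")
config_get() {
   val="$(config_read_file "${2}" "${1}")"
   printf '%s' "${val}"
}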

Also, I would probably treat the cache and the config file as hostile, but I haven't looked at the code enough to figure out the cache part.

Aside from the basic `assert()` and `validate()` logic, 'defensive' and 'hostile' situations can quickly push you into using logic like this:

Code:
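# aif --
#   Anaphoric if.
#    This function checks that `expr` is not empty (and not "-") before
#    returning it; otherwise the `iffalse` value is returned.
# EX
#        var=$(aif "$(some_expr)" 7)
aif() {
        # `local` is not POSIX, but FreeBSD sh, dash, and bash support it.
        local expr="$1"
        local iffalse="$2"
        # Quote the expansions: an unquoted empty $expr turns the test
        # into `[ -n ]`, which is always true.
        if [ -n "$expr" ] && \
           [ "$expr" != "-" ]; then
                printf '%s\n' "$expr"
        else
                printf '%s\n' "$iffalse"
        fi
}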
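
On treating the config file as hostile: one way to do that, as a rough sketch, is to never source the file and instead accept only known keys with validated values. The key names (`MODEL`, `MAX_OUT`), the `CFG_` variables and `load_config` itself are hypothetical here; I have not checked which keys the posted scripts actually read. This sketch only handles whole-line `#` comments.

```shell
#!/bin/sh
# Sketch only: load_config, the key names and the CFG_* variables are
# hypothetical, chosen for illustration.

# load_config FILE -- accept only known KEY=VALUE lines; never source FILE.
load_config() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;           # blank line or whole-line comment
        esac
        key=${line%%=*}
        value=${line#*=}
        case $key in
            MODEL)
                CFG_MODEL=$value ;;
            MAX_OUT)
                case $value in
                    ''|*[!0-9]*)           # must be a non-empty digit string
                        echo "bad MAX_OUT: $value" >&2
                        return 1 ;;
                    *)  CFG_MAX_OUT=$value ;;
                esac ;;
            *)
                echo "unknown key: $key" >&2
                return 1 ;;
        esac
    done < "$1"
}
```

Because the value is only ever assigned, never evaluated, a line such as `MAX_OUT=$(id)` is rejected as a bad number instead of being executed, which is the risk you take on with plain sourcing.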
 