C Programming Project Testing - getting better at

I have been testing my project manually, but I'd like to expand my knowledge a bit and implement "better" testing for my project(s).

The easiest scenario (where I should probably start, to get my feet wet) is my `md2mdoc` program, which takes a markdown file and writes `mdoc(7)`.

I have a few C macros (see below) which I use for unit testing, as in `mu_assert(5 == 2)`, but this situation is a bit different: here I could use a simple shell script to call my program and then just `diff(1)` the files--and my unit-testing macros are not currently geared up to support that.

My initial plan was to build several files for different testing (parsing) aspects, like:

01_bold.md:
Code:
*bold text* other text

01_bold.mdoc (a file that lists what my program should output):
Code:
.Sy bold text
other text

Then I can just use a shell script to loop through the .md files and diff the .out files against the .mdoc files to locate problems.
pseudocode:
Code:
for (all *.md in ./test)
   call md2mdoc *.md -o ...
   call diff *.mdoc *.out

But my question is:
1. Should I attempt to do this in C?
1a. This would allow me to build a testing target in my makefile.

2. Does testing need to produce something like a "test output file" for input/digestion by other testing frameworks in order to be "proper"?
2a. Is such test output consumed by automation in some way, or kept for another reason?

3. I know there are some built-in headers/libs for testing, but I suspect these are not present on all systems.
3a. Should I research these frameworks (or are they mostly OS-dependent, so that I should just implement an independent method)?
3b. I am assuming this relates to #2a, but I am guessing (I would be interested in learning more about this aspect if it does).

4. How would you test this -- is my simple shell-script method "not professional" enough?

Thank you for any tips and suggestions.

REF: minimal unit testing macros
C:
#include <stdio.h>    /* printf(), asprintf() */
#include <stdlib.h>   /* free() */
#include <string.h>
#include <sys/time.h>

extern int tests_run;

#define mu_assert(X) if(!(X)){ errorlocation(__FILE__, __LINE__); } return "PASS";
#define errorlocation(zFile, iLine)                                            \
  do {                                                                         \
    char *message;                                                             \
    asprintf(&message, "*ERROR* : %s [line: %d]", zFile, iLine);               \
    return message;                                                            \
  } while (0)
#define mu_run_test(label, errmsg, test)                                       \
  do {                                                                         \
    char *message;                                                             \
    char *ret = test();                                                        \
    ++tests_run;                                                               \
    asprintf(&message, "[%d] %s : %s", tests_run, ret, label);                 \
    if(memcmp("*ERROR*", ret, 7) == 0)                                         \
       asprintf(&message, "[%d] %s : %s - \"%s\"", tests_run, ret, label, errmsg);\
    printf("%s\n", message);                                                   \
    free(message);                                                             \
  } while (0)

REF: project page
 
All I can provide from my experience as an electronics development engineer is pretty general:
Testing is a very broad field.
Defining, specifying, and creating tests and test environments is not just a job of its own, but a large and important - crucial - part of development, and it all depends on many things, like what you are testing and for what.
1. Should I attempt to do this in C?
Yes, and no. Either way.
Always create more than just one test vector.

Especially when you produce something to be released into the wild - not only for some commercial product, and furthermore for certain markets (medical, automotive, military, aerospace, public transport, or energy are just some I can think of offhand) - look for regulations (not just licenses or patents). You not only need to watch out for standards, security, and above all safety (intrinsic safety: ensure it always falls automatically into a safe state when something fails), but there are also often quality regulations your customer has. In many engineering fields there are also predefined tests that need to be passed.

Anyway, like any developer, you're in danger of testing only the positive - testing that what should work does work.
Of course, it's your baby and you've spent lots of effort on it. Naturally you want it to work; that's the whole point of doing it at all. But apart from that, you know all the details. Both will give you blind spots. More or less unconsciously, you'll automatically want to protect it from having its weaknesses found. But that's exactly the whole point of testing: bust flaws and weaknesses.
So,
1. Don't test only the positive. Test the negative. Bang on it! Try to break it.
2. Do tests from different angles ("attack vectors").
3. Don't be the only one testing it. Let others have a look at it. The closer you are to release, the less the others should be involved in the development's details. Best would be some target customer who knows nothing about it. Don't explain it to them in a way that protects your baby from being harmed. It should be put through the mill: stupid, naive, rough, hard, and destructive. Be passive, but watch closely. Be prepared for knock-backs.
Most fear that test, because it's the hardest: it always reveals a lot of "Shit, I'd never have thought of that kind of crap!" and "No! Are you stupid? You were never supposed to do that with it!"
That's exactly why those are the best tests that can be done: get it out of its warm and cozy laboratory nest box into the hard, cold world and confront it practically with tough reality.
(Fridays are good for that. You can go to the pub afterwards, take the weekend to get some distance and reflect, and start Monday calmly fixing the flaws instead of producing more flaws - or even doing something rash, like trashing the whole thing.)


I hope this gets the thread going: not only am I personally very interested in what will be posted in this on-topic thread, but above all the whole topic of 'testing' needs testing, which means the more experiences and opinions are collected, and the more people are involved, the better.
 
Last night I added a shell script which tests and produces output something like:
Code:
PASS: 01_bold
PASS: 02_italic
PASS: 03_codeblock
PASS: 04_header
PASS: 05_subheader
...

So, for example, my program obviously parses a simple case of *bold* and **commonmark bold** (and it will even naturally handle a mangled version like **bad-commonmark bold*), but that's about as far as I can logically parse the syntax -i.e., of course it will not handle a totally mangled (non-encapsulated) string like **bold or *bold. But just to clarify: I don't build a test case to affirm that I cannot handle "*bold", do I? -i.e., testing is to confirm what my program can/should handle; in this case, a mangled bold syntax like "*bold" is not a 'negative' test, whereas a mangled CommonMark bold like "**bad-commonmark bold*" would be.

TAP is about the most independent method I can find, and there do seem to be a few "consumers", but I'm not entirely sure I understand their use just yet. NOTE: I haven't had the time to read up on TAP at the moment. From a quick glance, my current mu_assert macros can probably be made to conform, but there also seem to be a few other C solutions as well.

However, I'm still trying to grasp whether this is an effort to offer the same functionality as CMake's testing (CTest), or just a "peace of mind" kind of thing (-i.e., why does a 'consumer' exist?).

 
I feel like I could be stepping close to a beehive, but the do...while loop is in reference to this (not `goto`).

Aside from that, `break` vs `goto` is beyond the scope of this thread (isn't it? -- honest question).
 
On the topic of build-time testing:
1. I am using make.
2. I am developing on a Mac.

Because of #1 and #2, I also use a file called "GNUmakefile" in my project directory (not tracked in the repo) to run extra things. I typically do this for `ctags(1)` and anything else I want done, so I have now added a run of my test script upon build (make), as I would with CMake (or other build systems).
 
Ok, maybe a bit more ontopic:

There is a memory leak in this line sequence in case of an error message:

C:
asprintf(&message, "[%d] %s : %s", tests_run, ret, label);
if(memcmp("*ERROR*", ret, 7) == 0)
   asprintf(&message, "[%d] %s : %s - \"%s\"", tests_run, ret, label, errmsg);

The first asprintf allocates memory
The second asprintf allocates new memory and overwrites message
The first allocation is now lost: memory leak
 
I've seen that construct before. In the Adobe InDesign SDK. So you can use break instead of goto.

That should be prohibited.
I feel like I could be stepping close to a beehive, but the do...while loop is in reference to this (not `goto`).

Aside from that, `break` vs `goto` is beyond the scope of this thread (isn't it? -- honest question).
I'm sorry; I have looked into (researched) how `goto` fits into my macros and why it should be prohibited, and I just cannot figure it out. Can I get a hint, a link, or even a small explanation, please?


Ok, maybe a bit more ontopic:

There is a memory leak in this line sequence in case of an error message:

C:
asprintf(&message, "[%d] %s : %s", tests_run, ret, label);
if(memcmp("*ERROR*", ret, 7) == 0)
   asprintf(&message, "[%d] %s : %s - \"%s\"", tests_run, ret, label, errmsg);

The first asprintf allocates memory
The second asprintf allocates new memory and overwrites message
The first allocation is now lost: memory leak
Ah, yes. I did know about that memory issue, and yes, I do have to fix it at some point. My thought was that I needed to do a bit of refactoring and fix the return instead. But, thank you.
 
I meant that it should be forbidden to misuse a do/while loop just so you can break out of it because you don’t want to think about the program flow.
That’s the same as using “DISTINCT” in SQL to filter out duplicates because you don’t feel like thinking about the query.

Sorry, I really didn't mean to stray from the topic, but I only read "getting better at" and thought it was about improving software.
 
I meant that it should be forbidden to misuse a do/while loop just so you can break out of it because you don’t want to think about the program flow.
That’s the same as using “DISTINCT” in SQL to filter out duplicates because you don’t feel like thinking about the query.
Fair enough.
But I would say that extends to pretty much every construct that can affect program flow.
I can't tell you how many times I've code-reviewed if statements nested 8 deep instead of a simpler "if not" construct.
 
This is a macro. There isn't any break-out or flow; the use of a do while is about a semicolon. ...I apologize, but this is either way off topic or way above my head.
 
This is a macro. There isn't any break-out or flow; the use of a do while is about a semicolon. ...I apologize, but this is either way off topic or way above my head.
No worries. It's just a bit of C esoterica on the proper way to do things. The macros are fine, except for the memory leak, as eternal_noob pointed out. The fix for that is adding a free() after the if (memcmp()) and before the following asprintf(). You'll need to add {} as well.
 
Ah thank you. I've been searching how to use `goto` in a macro and I've been going a bit crazy trying to figure out what the discussion is about.
 
This is one of the most powerful active threads, because it demonstrates:
  • You ask for testing,
  • some others, completely "naked" (innocent concerning the concrete topic of your program), join and dive into your swimming pool,
  • and quickly find a bug you didn't see for some time, maybe because these 3 lines of code seem so simple that you'd never suspect a bug there -- and eventually they do what they should, so why should there be a bug?
  • Conclusion: it's much better to develop attitudes and procedures that detect bugs as early as possible and are easy to maintain, than to find the bugs later by a systematic, automated approach that needs a lot of diligence and sweat.
Code review is more powerful than testing?
Communication, Feedback, Courage...

There is a plethora of test frameworks available. I'd look at what similar projects use, simply because they may already have filtered the frameworks; in principle a test framework should be kind of universal, but for sure one will fit your domain better while another is weaker.

EDIT P.S. I do not want to deter you from systematic, automated testing. But do a simple weighting of cost/value, and prioritize what is "superior value for cheap money".
 
I have updated my macro(s) a bit; no more leaking. That aspect will be good to have if I need it in my next project.

Sorry, I really didn't mean to stray from the topic, but I only read "getting better at" and thought it was about improving software.
Fair enough. And, as far as I can tell, this is not about 'control flow' (`goto` vs `do/while`) at all (please see the link I supplied; this is where I learned about "why" it is supposed to be used--and quite a bit of knowledge besides). However, I am taking mer's word for it and trusting my use of the macros for testing if need be, but I'm most certainly all ears if I'm doing it wrong. I just don't seem to understand your point, is all (but I will try).

Here is a sample of how I use those macros (I've since fixed my macros, but the ones above will work if you really want to "test" vs "just see"):
C:
int tests_run = 0;

int foo = 7;
int bar = 4;

static char * test_foo() {
    mu_assert(foo == 7);
    return NULL;
}

static char * test_bar() {
    mu_assert(bar == 5);
    return NULL;
}

static char * all_tests() {
    mu_run_test("test_foo", "error, foo != 7", test_foo);
    mu_run_test("test_bar", "error, bar != 5", test_bar);

    return NULL;
}

int main(int argc, char **argv) {
    char *result = all_tests();
    if (result != NULL) {
        printf("%s\n", result);
        free(result);
    } else {
        printf("ALL TESTS PASSED\n");
    }
    printf("Tests run: %d\n", tests_run);
    return 0;
}
But as I said in the first post, this, I believe, is just a "unit testing" framework. In my current situation I'm testing the output from a parser, so I'm using `diff(1)` and a shell script.

I still haven't been able to find time and read up on TAP and a TAP Consumer because I'm curious as to what/why.
 
EDIT P.S. I do not want to deter you from systematic, automated testing. But do a simple weighting of cost/value, and prioritize what is "superior value for cheap money".
Interesting! What's that?
There is a plethora of test frameworks available. I'd look what similar projects use, just because they might have already filtered the frameworks, i.e. in principle, a test framework should be kind of universal, but for sure one fits better to your domain while another is weaker.
Will do.

This is one of the most powerful active threads, because it demonstrates: ...
Thank you for the kind words. I've all but given up on 'review-based testing' because I can't seem to get anyone interested in my projects, or even to click a link. :p But, thank you.
 
Recap of my current testing framework:
I have assembled 15 test cases which I run with a shell script (it calls my program and runs diff(1) against a known-correct file). Here is my current list of tests.

Testing runs on the "make" command, and this seems the most useful to me because when I make a code change I can see right away if the update broke something. I am not saving this to a file (it just goes to stdout). Other than using testing as a simple check, I don't yet understand the other specific objectives behind it (I have yet to read up more on the subject -- but as Maturin said, this seems to be a broad topic).

A "FAIL" will produce the output from diff(1) but also continue on to the remaining tests; I decided not to stop the loop upon a failed test.

Code:
---[ TESTING ]---------------------------------------------------
PASS : 01_bold
PASS : 02_italic
PASS : 03_name
PASS : 04_header
PASS : 05_section
PASS : 06_subsection
PASS : 07_reference
PASS : 08_dashlist
PASS : 09_enumlist
PASS : 10_funcargs
PASS : 11_optargs
PASS : 12_codeblock
PASS : 13_inlineliteral
PASS : 14_modifiers
PASS : 15_sectionref
 
I got to thinking about the 'control flow goto/break' issue again, and I started to think maybe the mix-up in this misunderstanding happened at a different juncture than I originally thought.

Code:
while(1) {
  /* run always */
}

// vs

while(0) {
  /* run never */
}
because:
while(1) = while(true)
while(0) = while(false)

That is to say, a do...while(0) will run exactly once, not forever; therefore there is no 'control flow goto/break' misuse issue, and the do...while(0) construct remains an elegant method of writing a multi-statement macro.
 
On the topic of using a C based test:
I've been reading up on TAP, and even TAP-Y and TAP-J, and I think I've determined it is nothing more than a protocol (a standard for an output stream) meant for "consumption/digestion" by other things, like YAML/JSON tooling (for what purpose, I don't know). But at this point I have to make the executive decision that this is a time-suck, and I'm abandoning my efforts to learn TAP.

The constraints these protocols place on output seem dumb to me (they will lead to bottlenecks and thus limit their reach or mass adoption on validity/usefulness alone). I have just now established my own syntax elements, and I think mine offers more flexibility (-i.e., at this point I have to assume I possess the ability to write my own consumer--of my own syntax--that conforms to another if the need ever arises).

My syntax can be as complicated as:
Code:
[1] PASS : test_foo -- a description #design @john &03.27.26 ^2 %DONE
[2] FAIL : test_md2mdoc.c [line 61] : test_bar -- a description #input %DONE - "*ERROR* bar != 5"
[3] PASS : test_baz -- a description #arch @john ^5 %TODO
1..3

where:
Code:
#tag        -- A Tag
@user       -- A user
&01.01.90   -- A date
^3          -- Priority/Cost (higher == more)
%status     -- A status

And I think this affords the possibility of giving all the information necessary for the who/what/why/how of many given instances, while still being readable in "thin" or "thick" uses.
 
Mjölnir said:
EDIT P.S. I do not want to deter you from systematic, automated testing. But do a simple weighting of cost/value, and prioritize what is "superior value for cheap money".
Interesting! What's that?
I tried to express my experience that finding & fixing bugs with code review and "communication & feedback" (like what you did here, but there are plenty of other ways) is "cheap" in the sense that it requires little effort and often gives quick, good results, whereas finding a good automated test framework, setting it up and maintaining it, and writing the tests, is a magnitude more work, while the outcome is often less than that of code review or simply talking to others (communication & feedback).

Maybe you can find another forum that better matches your domain (programming, maybe even specialized in text processing, since IIUC your program is not specific to FreeBSD) or a mailing list on such topics. Mailing lists are often a very productive way for programmers to advance. Until then, stay here (or use two or more forums) and keep describing what you're doing and the problems you encounter, because chances are you'll get help like you did from your first post.
 
Oh I get it. Got it. Thanks.

Yeah, the "md2mdoc" program is not really "tied only to FreeBSD" (it can be used anywhere, really), but I was only using it as an example because it's a simple program (-i.e., what the program does plays only a small part in what I'm actually trying to learn--testing). However, I do understand (and you may well be right), and I've sort of started to use this thread as a diary of sorts now. ...I'll stop. Thanks for the clarification and suggestion(s), though.
 
[...] I've sort of started to use this thread as diary of sorts now. ...I'll stop. [...]
You'll see which kinds of posts get you feedback and which do not. When you encounter a problem, please do not hesitate to post it here because, as I wrote above, chances are good that you will get help quickly.
 
This might not be what you had in mind, but I'm mentioning it in case you find it useful. A great way to achieve coverage and depth is Property-Based Testing (PBT). With PBT, you auto-generate inputs, drive the unit under test with them, and check that some properties you came up with beforehand hold between input and output.

Coming up with meaningful properties is not easy at first, but one gets a feel for it quickly. For example, in your case an interesting property would be:

P1: "For any plain text substring present in the markdown input, the same substring must be present in the mdoc output"

Then you generate such inputs. But for a piece of generated markdown, how do you get a plain-text substring (so it doesn't accidentally cut into a half-open link markup or such)? Maybe it is easier to generate a list of plain substrings, then generate ways to surround them with various markups (or not), and concatenate. It is then trivial to select a substring (or even the full string) of the generated plain texts.

This illustrates that coming up with meaningful properties, and actually generating conforming input, can be challenging in itself. But it is worth it, since the counterexamples you get often either
A) highlight a bug, or
B) highlight something lacking in the specification, so you can refine the test (for example, maybe mdoc escapes a bit differently than markdown - such differences would stick out during testing, and you would need to adapt the test).

I have practical experience with PBT in some other languages, but I hear you can get started really easily with Python's hypothesis. I think using PBT in C would be too cumbersome, but there is a blog with pointers if you really want to. Still, get a taste of it in Python first.

Some great resources about the origins of this and ideas on coming up with properties:
 