Solved Some ordinary characters terminate input stream

I discovered a peculiar problem with std::cin string input in a clang-compiled C++ program on FreeBSD.

Sample code illustrating the issue follows.
C++:
#include <iostream>
#include <string>

int main()
{
    double num{};
    std::string id;

    // Keep reading number+string pairs until extraction fails.
    while (std::cin >> num >> id)
        std::cout << "You entered: " << num << " '" << id << "'\n";
    std::cout << "We got past the loop.\n";
}

This reads double+string input and prints it in a loop.
std::cin shouldn't need a space between the values read;
'10A' should correctly assign a floating-point 10 to num and the string "A" to id.

The issue I am encountering is more specific.
For non-space-separated input, if the string (id) part begins with any of the following letters, the input loop ends as if the stream had been terminated (like Ctrl+D).
The characters I found to be affected (case-insensitive!): A B C D E F I N P X

This means that the following sample inputs:
> 1A
> -90.00ievel
will just print "We got past the loop." and exit, while, say,
> 7G
> 0.0Zarzuela
will print "You entered: 7 'G'" and "You entered: 0 'Zarzuela'" and continue reading input in the loop, as expected.

This is clang-specific (gcc on FreeBSD produces a correct program), and FreeBSD-specific (clang++ on Fedora GNU/Linux produces a correct program).
A curious bug.
 
Your code basically does:
C++:
std::cin >> num >> id;
while (std::cin) { // <- calls std::cin.operator bool()
    std::cout << "You entered: " << num << " '" << id << "'\n";
    std::cin >> num >> id;
}
So your while loop tests operator bool() const, which, in the implementation I tried, returns !fail() (reference: fail(), operator bool()). Probably operator>> reading a double sets failbit when it is unable to parse the input text as a floating-point number.

Have you tried setting flags first?
 
(...) your while loop tests operator bool() const, which, in the implementation I tried, returns !fail() (reference: fail(), operator bool()). Probably operator>> reading a double sets failbit when it is unable to parse the input text as a floating-point number.
As it turns out, this is specific to libc++ and arises from the way the LLVM developers interpret the C++ standard. GNU's libstdc++ implements it differently, so simple cases like the one in my example are handled "as expected".
Because Fedora's clang uses libstdc++ by default, it compiles the code above so that those simple cases work according to my expectation. I gather that more complex input behaves 'erratically' in both implementations, but std::cin isn't meant for those cases anyway.

Some details:

I will mark this as solved, the solution being to use libstdc++ or to process input more robustly.