I have a problem using the 32-bit PCRE (Perl Compatible Regular Expressions) library: regular expression metacharacters do not seem to work. The library has to be built with 32-bit support; libpcre32 was built and can be linked with -lpcre32. The version is 8.35. I'm not sure whether big-endian support was enabled. The machine is an i586 PC (little-endian) running FreeBSD 10, which has been updated more than once. The PCRE in use was built directly from source, without any OS-specific modifications.
I have written a C program that calls pcre32_compile2, pcre32_study and pcre32_exec (and pcre32_pattern_to_host_byte_order). The program and the function calls seem to work properly: for example, the line "Wheel is round" matches the regular expression "Wheel". However, it does not match "Whe.l" or "Wh[a-z]el". Literal matching works, but metacharacters do not.
Additional information: with the PCRE_UTF32 option flag, I get an error saying that the string is not valid UTF-32. UTF-32 should be almost the same as four-byte UCS (UCS-4); it differs only in its range. Could the bytes within each 4-byte character somehow be in the wrong order?
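To make the byte-order suspicion concrete, here is a minimal sketch of the validity rule that Unicode defines for UTF-32: a code unit must be at most 0x10FFFF and must not be a surrogate (0xD800 to 0xDFFF). The function names `is_valid_utf32_unit` and `swap_bytes32` are hypothetical helpers written for this post, not part of PCRE, and it is only an assumption that PCRE's own check behaves the same way.

```c
#include <stdint.h>

/* Hypothetical helper (not a PCRE function): returns 1 if the code unit
   is a valid UTF-32 value, i.e. at most 0x10FFFF and not a surrogate. */
int is_valid_utf32_unit(uint32_t u)
{
    return u <= 0x10FFFFu && !(u >= 0xD800u && u <= 0xDFFFu);
}

/* Hypothetical helper: reverse the four bytes of a 32-bit code unit,
   which is what reading big-endian data on a little-endian host does. */
uint32_t swap_bytes32(uint32_t u)
{
    return (u >> 24)
         | ((u >> 8) & 0x0000FF00u)
         | ((u << 8) & 0x00FF0000u)
         | (u << 24);
}
```

With these, 'W' (0x57) is valid, but its byte-swapped form 0x57000000 is far above 0x10FFFF, so a string stored in the wrong byte order would fail such a check on its very first character.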
I put the string into an array of four-byte characters (I think this is 4-byte UCS) like this: I read 8-bit bytes from a stream (a file or standard input) and put the first byte into the first item of the array, the second byte into the second item, and so on. One character is always four bytes. This seems to work, since it matches "Wheel". I do not remember what happened when I tried writing a BOM as the first character of the regular expression pattern. Is it possible, or is it always the case, that PCRE expects big-endian pattern text in 32-bit mode? My machine is little-endian.
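The widening step described above can be sketched as follows. This is an illustration rather than my actual program: `widen_to_u32` is a hypothetical name, the input is assumed to be plain ASCII or Latin-1 (one byte per character), and the result is assumed to be what the 32-bit library wants, namely one character per 32-bit code unit in the machine's own byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a PCRE function): widen n 8-bit characters
   into n 32-bit code units, one character per unit. Assigning the value
   instead of copying raw bytes lets the compiler store each unit in the
   host's byte order, so no manual byte swapping is involved.
   Returns the number of code units written. */
size_t widen_to_u32(const unsigned char *src, size_t n, uint32_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint32_t)src[i];   /* e.g. '.' becomes 0x0000002E */
    return n;
}
```

Widened this way, the '.' in "Whe.l" is the single code unit 0x2E. The returned count is in code units, not bytes, which matters when a length has to be passed along with the array.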
How should I encode the pattern text to fix the missing-metacharacters problem? Is this a bug (perhaps a known one in the implementation that I am not aware of), or a new one? Should the regular expression pattern be, for example, in 4-byte big-endian format, or is PCRE fully functional only in 8-bit mode (ASCII, 8-bit and possibly UTF-8)? How do I switch on little-endian support?
Has anyone used pcre32 and could give a link to a working example?