Solved Matching a new line with Regex

Hi,

I think I need help with regex, guys.

Some text as an example:
12
Word: The time is not what you think it is
no it is not.

I am trying to match "12", then a newline, then "Word:".

This is what I've tried:
grep -E "^[\s]{1}[0-9]{1,2}[\n][A-Z]{1}[a-z]+:{1}"

It works on https://regex101.com using PCRE, but it does not work in my shell script (/bin/sh), or even directly on the command line.

On the Internet I found the following link, where it does not even seem to be referenced:

Am I searching for something that is not implemented in ERE?
If that is the case, how can I replace the newline '\n' with something equivalent that works with grep? What I have found so far was not successful (e.g. .\. or [.\..]). Maybe the logic I use is just not right.

Thank you.

EDIT: Grammar.
 
Yeah, grep is designed for searching within lines. You can convert the newlines to something else, then run the grep, but that will give a messy output. Are you after output, or just confirmation if the string exists or not?
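
To see this concretely, here is a minimal sketch using the sample text from the first post (the function names `sample` and `count_matches` are made up for illustration). grep tests one line at a time with the newline stripped, so no pattern ever gets to see a newline. Separately, inside a POSIX bracket expression the backslash is an ordinary character, so `[\n]` matches a backslash or the letter `n`, not a newline.

```shell
#!/bin/sh
# Sample input from the first post.
sample() {
    printf '12\nWord: The time is not what you think it is\nno it is not.\n'
}

# Print how many lines match the given ERE.
# grep exits non-zero when nothing matches, hence the "|| true".
count_matches() {
    sample | grep -c -E "$1" || true
}

# Each half of the target matches on its own line...
count_matches '^[0-9]{1,2}$'               # prints 1
count_matches '^[A-Z][a-z]+:'              # prints 1

# ...but nothing can span the line break, whatever stands in for the newline.
count_matches '[0-9]{1,2}.[A-Z][a-z]+:'    # prints 0

# [\n] is a bracket expression containing a backslash and the letter n.
printf 'n\n' | grep -c -E '^[\n]$'         # prints 1
```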

\035 is generally a safe character to use, but to be thorough we first delete it from the input, then convert \n to \035, then grep on that:

if tr -d $'\035' < filename | tr '\n' $'\035' | grep -q $'12\035Word'
then
echo match
fi
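
Since the original script runs under /bin/sh, here is a sketch of the same idea without the `$'...'` quoting, which not every sh implements. It assumes that tr understands octal escapes itself and uses printf to build the separator byte; the file path and the function name `has_pair` are placeholders, not anything from the thread.

```shell
#!/bin/sh
# Placeholder file holding the sample text from the first post.
sample_file=${TMPDIR:-/tmp}/newline_demo.txt
printf '12\nWord: The time is not what you think it is\nno it is not.\n' > "$sample_file"

# Succeeds if "12" at the end of one line is directly followed by "Word:" on the next.
has_pair() {
    sep=$(printf '\035')    # the byte that stands in for a newline
    # Drop stray \035 bytes, turn newlines into \035, then look for the pair.
    tr -d '\035' < "$1" | tr '\n' '\035' | grep -q "12${sep}Word:"
}

if has_pair "$sample_file"; then
    echo match
fi
```

Like the original, this only answers yes or no; it also matches a longer number ending in 12, so a stricter pattern may be needed.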

If you actually want output, a small awk command could do that, but I'd first want to know what output you'd expect to see.
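
For reference, a sketch of what such an awk command might look like. The wanted output was never specified in the thread, so this assumes it is the number line followed by the matching line; `print_pairs` is a hypothetical name.

```shell
#!/bin/sh
# Print each line of one or two digits together with the line after it,
# but only when that next line starts with a capitalised word and a colon.
print_pairs() {
    awk '
        prev_is_number && /^[A-Z][a-z]+:/ { print prev; print }
        { prev_is_number = ($0 ~ /^[0-9][0-9]?$/); prev = $0 }
    ' "$@"
}

printf '12\nWord: The time is not what you think it is\nno it is not.\n' | print_pairs
```

awk remembers the previous line in a variable, which is the usual way around the one-line-at-a-time limitation.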
 
You can convert the newlines to something else, then run the grep, but that will give a messy output. Are you after output, or just confirmation if the string exists or not?
Sorry for not giving much information; it was on purpose, because I knew somebody would solve the problem outright and I preferred to solve it "almost" by myself.
Yeah, grep is designed for searching within lines.
Yes, it was a mistake to think otherwise. Thank you for the reminder; I was so obsessed with the regex stuff that I didn't think it through.
if tr -d $'\035' < filename | tr '\n' $'\035' | grep -q $'12\035Word'
then
echo match
fi
I don't know why I keep forgetting 'tr', which is truly handy. It helped me in this case; in fact the solution was so simple compared to what I was planning to do that it seems funny when I think of it now.
Again, thanks for the reminder.


grep -A1 ^12$ /tmp/testfile.txt|grep -B1 ^Word
That's smart!
It looks like you've got a bag full of shell tricks, just a feeling I get from reading your comments. Nice.
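
For anyone trying the quoted pipeline, here is a small sketch of it run against the sample text from the first post. `-A` and `-B` are extensions found in GNU and BSD grep rather than POSIX options, and the file path and the name `find_pair` are only placeholders.

```shell
#!/bin/sh
# Placeholder path for the sample text.
testfile=${TMPDIR:-/tmp}/testfile.txt
printf '12\nWord: The time is not what you think it is\nno it is not.\n' > "$testfile"

# First grep: every line that is exactly "12", plus the line after it.
# Second grep: of those, every line starting with "Word", plus the line before it.
find_pair() {
    grep -A1 '^12$' "$1" | grep -B1 '^Word'
}

find_pair "$testfile"
```

On the sample this prints the `12` line and the `Word:` line, and nothing when the line after `12` does not start with `Word`.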


Reading your comments made me look at the problem differently; it seems that was all I needed to solve the issue.
Long story short, I had a long file filled with multiple lines of unformatted text. What I tried to do at first was reformat the page by applying style directly, and that was the mistake.
So instead, I removed the unnecessary spaces with 'tr' to end up with one long line, and only then could I format the text with 'sed' "almost" as I wanted.
The result is not pretty, but it's correct and readable, and that is what matters to me.
When the spaghetti code is ready to be shown I will post it in the right section, if I think it's worth it.

Thank you, covacat and jamie.
:)
 
There are grep variants that allow you to match across lines, e.g. agrep.

perl can be used to selectively remove newlines. For example the original post could probably use a
Code:
perl -p -e 's/^(\d+)\n/$1 /;'

after which you can use grep.
 
Never heard of 'agrep'; I will look into it later, thank you.
Perl seems to be a powerful tool. I should learn a little bit of it one day, but my todo list is already long enough for now; we will see in the future.
 
You could also use grep(1)'s -z option (aka --null-data) to consider a zero-byte as line terminator instead of a newline:
Bash:
#!/bin/sh

grep -z "[0-9][0-9]*
[A-Z]*[a-z][a-z]*" "$@"

This way you match any number of one or more digits, followed by a newline, followed by a word (note, however, that this is not POSIX compliant).
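
A related sketch, assuming GNU grep built with PCRE support (`-P` and `-o` are not POSIX either): combining `-z` with `-P` lets `\n` in the pattern stand for a newline, which is close to what worked on regex101. GNU grep's documentation says that newlines within a pattern separate one pattern from the next, so it may be worth checking that the two-line pattern above is not being read as two alternatives. `match_pair` is a made-up name.

```shell
#!/bin/sh
# -z: treat the input as NUL-terminated records, so a text file becomes one record.
# -P: Perl-compatible regex, where \n means a newline.
# -o: print only the matched text; with -z each match ends in a NUL,
#     which tr turns back into a newline for display.
match_pair() {
    grep -Pzo '[0-9]{1,2}\n[A-Z][a-z]+:' "$@" | tr '\0' '\n'
}

printf '12\nWord: The time is not what you think it is\nno it is not.\n' | match_pair
```

Without `-o`, a match prints the whole record, which for an ordinary text file is the entire file.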
 
There are grep variants that allow you to match across lines, e.g. agrep.
Haha, its name is funny: "approximate grep"!!
This tool is 35 years old. I honestly thought it was a new tool or something like that, but not at all.

You could also use grep(1)'s -z option (aka --null-data) to consider a zero-byte as line terminator instead of a newline:
Bash:
#!/bin/sh

grep -z "[0-9][0-9]*
[A-Z]*[a-z][a-z]*" "$@"

This way you match any number of one or more digits, followed by a newline, followed by a word (note, however, that this is not POSIX compliant).

Seems cool, really nice. Although it's not POSIX, it can be useful. Thank you.
 