Solved Find and replace lines in text file using sed

Suppose I have a file (workingFile.txt) that contains the following:

Code:
>spONEsometextgoes here
foo
bar
>spTWOsome more text is here
alpha
bravo

and I have another file (inputFile.txt) that contains:
Code:
ONE
TWO

I would like to replace, in its entirety, every line in workingFile.txt that contains a term from inputFile.txt with that term prefixed by ">", resulting in:

Code:
>ONE
foo
bar
>TWO
alpha
bravo

My actual use case is more complex than this but hopefully this simplified example conveys my issue. My current script looks something like:

Bash:
#!/usr/local/bin/bash
inputfile="inputFile.txt"
workingfile="workingFile.txt"
while IFS= read -r line
    do
        sed "/$line/c\>$line" $workingfile
    done < "$inputfile"

When I run it, I keep getting the following error:

Code:
sed: 1: "/ONE/c\>ONE": extra characters after \ at the end of c command
sed: 1: "/TWO/c\>TWO": extra characters after \ at the end of c command

What's the proper syntax? I've read that BSD sed may require a newline after c\ but I couldn't get that to work either.

Thanks in advance!
 
Ok, I made some progress by making some changes and fixing an issue but am still stuck. My current Bash script now reads as

Code:
#!/usr/local/bin/bash
inputfile="inputFile.txt"
workingfile="workingFile.txt"
while IFS= read -r line
    do
        sed -i "" "/$line/c\\
>$line" $workingfile
    done < "$inputfile"

but when I run it, the updated file looks like:

Code:
>ONEfoo
bar
>TWOalpha
bravo

How do I get that newline back in?

Thanks again.
 
I made one more update by inserting a literal newline which seems to work now but good grief this script is ugly. Any way to clean it up?

Code:
#!/usr/local/bin/bash
inputfile="inputFile.txt"
workingfile="workingFile.txt"
while IFS= read -r line
    do
        sed -i "" "/$line/c\\
>$line
" $workingfile
    done < "$inputfile"


>ONE
foo
bar
>TWO
alpha
bravo
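
One possible cleanup, sketched under a few assumptions: rather than running sed once per term, generate a small sed script with one `c` command per term and apply it in a single pass with `-f`. The names `replace.sed` and `workingFile.new` are hypothetical, the sample data is recreated in a scratch directory so the snippet runs on its own, and the terms are assumed to contain no slashes or regex metacharacters.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Recreate the sample files from the question.
printf '%s\n' '>spONEsometextgoes here' foo bar \
    '>spTWOsome more text is here' alpha bravo > workingFile.txt
printf '%s\n' ONE TWO > inputFile.txt

# Emit one "c" command per term: "/TERM/c\" followed by ">TERM" on its own line.
# replace.sed is a hypothetical file name chosen for this sketch.
while IFS= read -r term; do
    printf '/%s/c\\\n>%s\n' "$term" "$term"
done < inputFile.txt > replace.sed

# Apply all commands in one pass. The result goes to a new file rather than
# through -i, whose argument syntax differs between BSD and GNU sed.
sed -f replace.sed workingFile.txt > workingFile.new
out=$(cat workingFile.new)
printf '%s\n' "$out"
```

Writing the script to a file keeps the backslash and newline that `c` needs out of the shell quoting, which is where most of the ugliness in the loop above comes from.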

Thanks
 
Code:
test file
this is a test file ONE
time only is what I do
and TWO is not equal to one

#command
sed -i "" -e 's|ONE|BYE|g; s|TWO|Hello|g' testfile

#results
$ cat testfile
test file
this is a test file BYE
time only is what I do
and Hello is not equal to one

I'm new to BSD, but I have read that this is not possible in BSD sed because it does not recognize \n.
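
For what it is worth, a newline can still be produced in the replacement part of an `s` command without relying on `\n`: POSIX specifies that a backslash followed by a literal newline inserts one. A minimal sketch, using a made-up input line:

```shell
#!/bin/sh
# In the replacement of an s command, a backslash followed by a literal
# newline inserts a newline. The input line here is invented for illustration.
out=$(printf 'alpha ONE bravo\n' | sed 's/ ONE /\
/')
printf '%s\n' "$out"
```

The same backslash-newline form is what the `c`, `a` and `i` commands expect before their text.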
 
Bash:
PATS=$(awk -v ORS='|' // inputFile.txt)
sed -E -i '' "s/.*(${PATS%|}).*/>\1/" workingFile.txt

For the first line, I suggest the following alternative (more efficient, and maybe a little easier to understand):
Bash:
PATS=$(tr '\n' '|' < inputFile.txt)
Of course, both variants will fail if any of the words in inputFile.txt contains a pipe symbol “|” (or any other character that's special in regular expressions, for that matter). If that may happen, you need to precede all special characters in each word with a backslash before the words are joined with “|”; escaping the finished PATS variable would also escape the separators.
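
A rough sketch of that escaping step, assuming the terms are plain strings to be matched literally: each term is escaped with sed first, and only then are the terms joined with `paste`. The sample term `a|b.c` and the scratch directory are made up for illustration, and the result is printed rather than written back with `-i`.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1

# Sample data: the second term contains characters special in regexes.
printf '%s\n' '>spONEsometextgoes here' foo \
    '>sp a|b.c more text' bar > workingFile.txt
printf '%s\n' 'ONE' 'a|b.c' > inputFile.txt

# Backslash-escape ERE metacharacters (and the "/" used as delimiter below)
# in every term, then join the escaped terms with "|".
PATS=$(sed 's,[][\.|$(){}?+*^/],\\&,g' inputFile.txt | paste -sd '|' -)

# Replace each matching line with ">" followed by the term that matched.
out=$(sed -E "s/.*(${PATS}).*/>\1/" workingFile.txt)
printf '%s\n' "$out"
```

Because `paste` joins the lines without a trailing separator, the `${PATS%|}` trimming is not needed in this variant.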
 
I just like to introduce people to awk. ;)
Well, in that case …
Code:
#!/bin/sh -

inputfile="inputFile.txt"
workingfile="workingFile.txt"

export inputfile
awk '
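        BEGIN {
                # Load the search terms, one per line, into the term array.
                inputfile = ENVIRON["inputfile"]
                # Compare with > 0 so that a read error (-1) ends the loop.
                for (i = 0; (getline t < inputfile) > 0; i++)
                        term[i] = t
        }

        {
                # Replace the whole line with ">" plus a term found in it.
                for (i in term)
                        if (index($0, term[i])) {
                                print ">" term[i]
                                next
                        }
                print
        }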
' "$workingfile"
 
The important thing about awk is to learn a little bit of it, and have a general idea of what it can do (fundamentally: anything) and what it is good at (in general, things where different input lines need different treatment). For the kind of thing it's good at, awk is super efficient (quick to code and debug), easy to use, and gets the job done. I think you're doing the world a service by introducing people to awk.

Sed is more problematic. For simple things (ideally just string substitution), it's great. But with its complex commands, it is also exceedingly powerful. But programming in it is weird, and most people don't "get it" and the resulting programs tend to look like gibberish. While sed is necessary, overusing it is dangerous.

The other tools that are really useful are join, tr, and cut.
 
Perhaps not the best use case. :D
Yes, indeed.
To be honest, I would solve this particular problem with a few lines of Python, especially because this seems to be just a small part of a larger project. As soon as a task exceeds a certain size and/or complexity, it's almost always a bad idea to try to solve it with a shell script.
 
Thank you everyone for your feedback, help and suggestions! Sorry it's taken so long for me to respond back. I needed to cleanse some data files to be used as input for another program and thought I could use this as an opportunity to learn some new Unix command line tools. sed(1) was the first thing I came across but hadn't thought to use awk(1) as I was under the (mistaken) impression that it is better suited for manipulating delimited files.
 