Removing ^M Characters from Text Files, Revisited
Several people read my original article on removing ^M characters from text files created on Windows machines and felt the need to comment on how their way is better than mine. So, in the interest of full disclosure, let’s discuss some alternative methods.
Using ViM
ViM can run commands in batch across a list of files. One of its options is “ff” (short for “fileformat”); setting it to “unix” tells ViM to write the file with Unix line endings. ViM detects the line endings when it reads a file and converts them when it writes. The “update” in the command below saves each file before ViM moves on to the next one, which matters when you pass more than one file:
vim +"argdo set ff=unix | update" +qa file
This method can also be used within ViM by issuing “:set ff=unix” while editing a file and then saving it.
Using Vi
Using Vi you can do a standard search/replace command like so:
:%s/^V^M//g
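Type CTRL-V followed by CTRL-M rather than the literal caret-and-letter combinations. CTRL-V itself won’t display; it tells Vi to insert the next keystroke literally, so only ^M appears on the command line.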
Unix col Command
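The col command filters reverse line feeds from its input, and in the process it drops carriage returns. Be aware that by default it also converts runs of spaces to tabs (the -x option keeps them as spaces), so it can change more than just the line endings. We could use it as follows: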
cat infile | col -b > outfile
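To see what col actually does to a file, here is a minimal sketch. The scratch directory, the sample file, and the variable names are made up for illustration, and the -x option is my addition to the command above. The whole check is skipped if col isn't installed.

```shell
# Hypothetical scratch directory and sample file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'alpha  beta\r\ngamma\r\n' > "$workdir/infile"

if command -v col >/dev/null 2>&1; then
    # Same idea as the command above, minus the extra cat; -x keeps spaces as spaces.
    col -bx < "$workdir/infile" > "$workdir/outfile"

    # Count the lines that still contain a carriage return.
    cr=$(printf '\r')
    col_remaining=$(grep -c "$cr" "$workdir/outfile" || true)
    echo "lines still containing ^M: $col_remaining"
else
    echo "col is not installed; skipping" >&2
fi
```

Redirecting the file into col does the same job as piping it through cat, with one less process.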
dos2unix Command
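If the dos2unix command is available on your system, it’s a quick way to perform the operation. It also has a counterpart, unix2dos, which reverses the process and makes your text file appropriate for Windows users who use (ugh) Notepad. The syntax varies between implementations. On Solaris, you name the input and output files directly:
dos2unix infile outfile
The dos2unix commonly shipped with Linux distributions treats that form as a request to convert both files in place; to write the result to a separate file there, use the -n option:
dos2unix -n infile outfile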
dos2unix also works with standard input and output streams, so piping and redirection will work just fine.
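The stream form might look like the sketch below. The scratch directory, sample file, and variable names are hypothetical, and the guard around the command reflects the "if it's available" caveat above rather than anything dos2unix requires.

```shell
# Hypothetical scratch directory and sample file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$workdir/infile"

if command -v dos2unix >/dev/null 2>&1; then
    # With no file arguments, dos2unix reads standard input and writes standard output.
    dos2unix < "$workdir/infile" > "$workdir/outfile"

    # Count the lines that still contain a carriage return.
    cr=$(printf '\r')
    dos_remaining=$(grep -c "$cr" "$workdir/outfile" || true)
    echo "lines still containing ^M: $dos_remaining"
else
    echo "dos2unix is not installed; skipping" >&2
fi
```

Because the input file is only read, never rewritten, this form leaves the original untouched.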
sed
The venerable sed can do everything short of making your coffee in the morning (I’m no sed expert, so it may even be able to do that and I just don’t know how). In fact, sed is so powerful and flexible that I’m not going to discuss any specific method here. I’ll let you discover those yourself.
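As one possible starting point for that exploration (a sketch, not the definitive sed answer), the snippet below strips a single trailing carriage return from each line. The file names and variables are made up for illustration. It keeps a literal carriage return in a shell variable because not every sed understands the \r escape.

```shell
# Hypothetical scratch directory and sample file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'first line\r\nsecond line\r\n' > "$workdir/sample_dos.txt"

# Hold a literal carriage return; command substitution only strips trailing newlines.
cr=$(printf '\r')

# Delete one carriage return at the end of each line.
sed "s/${cr}\$//" "$workdir/sample_dos.txt" > "$workdir/sample_unix.txt"

# Count the lines that still contain a carriage return.
sed_remaining=$(grep -c "$cr" "$workdir/sample_unix.txt" || true)
echo "lines still containing ^M: $sed_remaining"
```

Anchoring the match to the end of the line means a carriage return that appears mid-line is left alone, unlike the Vi substitution earlier, which removes every one it finds.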
Hopefully you can see that most of these techniques can be used not just for removing ^M characters, but also for a host of other search/replace operations. If I’ve left out your favorite method, please comment; I’d love to hear about it.