Martkos IT Ltd


Linux Tools

Increasingly, as part of performance testing, I find myself creating very large data files. As more servers move over to Linux, I've found the easiest way of manipulating data (especially at volume) is on the server directly.

Linux has some tremendously powerful commands installed by default. To name just a handful: grep, sed, awk, cut, paste and strings.

Grep
Grep in its basic operation returns the lines from a file that match a pattern. This is incredibly powerful in itself on large files that would be unwieldy to open in Windows. It has many more advanced features too, such as returning all the lines that don't match.
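
For example, assuming a log file called results.log and a search term of ERROR (both just placeholders), the following would pull out only the matching lines, or everything except them:

    # show only lines containing ERROR
    grep "ERROR" results.log

    # -v inverts the match, returning every line that does not contain ERROR
    grep -v "ERROR" results.log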

Sed
Sed is the king of matching patterns and replacing values. Imagine a huge file where you needed to replace millions of occurrences of a value such as a server IP address. You would simply supply the original and replacement values, and within seconds sed would replace them all.
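
As a sketch, using made-up file names and IP addresses:

    # replace every occurrence of the old IP with the new one, writing the result to a new file
    sed 's/10\.0\.0\.1/192.168.1.50/g' data.csv > data_updated.csv

    # or edit the file in place (GNU sed)
    sed -i 's/10\.0\.0\.1/192.168.1.50/g' data.csv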

Awk
Awk, much like sed, can be used to manipulate values or, like grep, simply print them out. Awk is much more powerful though, and can be used to build up more complex, multi-step processing.
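
A minimal sketch, assuming a comma-separated file called data.csv:

    # print just the second column of each line
    awk -F',' '{print $2}' data.csv

    # sum the values in the third column and print the total at the end
    awk -F',' '{total += $3} END {print total}' data.csv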

Cut
Cut can quickly extract columns of data or a fixed number of characters. It can be very powerful on its own, but I find I mostly use it to further parse data fed into it from other commands.
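
For example (the file names and delimiters here are just placeholders):

    # extract the first and third comma-separated columns
    cut -d',' -f1,3 data.csv

    # take the first eight characters of each line
    cut -c1-8 data.csv

    # parse the output of another command, keeping only the first space-separated field
    grep "ERROR" results.log | cut -d' ' -f1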

Paste
Paste can be used to combine columns of data from multiple files into a single file.
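
For instance, assuming two files with one value per line:

    # join ids.txt and names.txt side by side, separated by a comma
    paste -d',' ids.txt names.txt > combined.csv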

Strings
Increasingly we need to manipulate binary data, and there are many hex tools to help here. Strings, however, gives an easy way to extract the ASCII text from within binary files.
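
For example, against a hypothetical binary capture file:

    # pull the printable ASCII text out of a binary file
    strings capture.bin

    # combine with grep to search the extracted text
    strings capture.bin | grep "hostname"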

Next time you need to manipulate large data files, consider using Linux. Ports of these commands are also available to download and use on Windows.

I hope this gives you another tool for your performance testing toolbox.