Advanced cool Linux commands

Here I save odd combinations of commands that have helped me in my daily work.

Get all unique dates in a file where the dates appear in the second column (but not on all rows)

awk -F " " '{print $2}' <filename> | egrep "^[0-9]{4}" | sort | uniq

With awk I select the second column ($2) in the file. Columns are separated by spaces (" "); this is also awk's default, so the -F option could be left out.
egrep (the same as grep -E) selects all rows that start with 4 digits, as in "2011-07-07"
The sort command sorts all rows, which is needed because uniq only removes duplicates that are next to each other
The uniq command removes all duplicate rows
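To try the pipeline without a real log file, here is a small sketch on made-up sample data. The file contents and the variable names (log, dates) are my own assumptions for illustration. It uses grep -E instead of egrep and sort -u instead of sort | uniq, which should behave the same way here.

```shell
# Sample log (invented for illustration): only some rows have a date in column 2.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO 2011-07-07 started
WARN disk nearly full
INFO 2011-07-08 stopped
INFO 2011-07-07 restarted
EOF

# Pick column 2, keep only values starting with four digits, drop duplicates.
dates=$(awk '{print $2}' "$log" | grep -E '^[0-9]{4}' | sort -u)
echo "$dates"

rm -f "$log"
```

The row without a date ("WARN disk nearly full") has "disk" in column 2, so the grep step filters it out.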

Get all unique rows that match regexp “<xml-tag>.*</xml-tag>”

 egrep "<xml-tag>.*</xml-tag>" /path/to/file | sort | uniq 

With egrep I get all rows that match the regular expression
The sort command sorts all rows returned from egrep, which is needed for the uniq command
The uniq command removes all duplicate rows

Count the rows per date (date in the second column) that contain a search string in a gzipped text file

zgrep search_string filename.gz | awk -F " " '{print $2}' | sort | uniq -c   

This will give you a list of dates, each with the number of rows containing search_string on that date, like this:

    909 2011-07-01
   1608 2011-07-02
   1604 2011-07-03
   2775 2011-07-04
   2765 2011-07-05
   1757 2011-07-06
   3716 2011-07-07
   2785 2011-07-08
   1711 2011-07-09
   1655 2011-07-10

With zgrep we grep in a gzipped file without unzipping it first
With awk we select the second column (in this case a YYYY-MM-DD formatted date) on the row
The sort is only needed if the dates do not come in order
The uniq -c collapses the rows to one row per unique date and prefixes each with its number of occurrences
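The counting pipeline can be tried on a throwaway gzipped file. This is only a sketch: the log lines, the search string ERROR and the names tmpdir and counts are assumptions made up for the example. The last awk step is an addition of mine that strips the padding uniq -c puts in front of the count.

```shell
# Build a small gzipped sample log (contents invented for illustration).
tmpdir=$(mktemp -d)
printf '%s\n' \
    'ERROR 2011-07-02 timeout' \
    'INFO 2011-07-01 ok' \
    'ERROR 2011-07-01 timeout' \
    'ERROR 2011-07-02 refused' \
    | gzip > "$tmpdir/sample.log.gz"

# Count the ERROR rows per date; the dates are not in order, so sort is needed.
counts=$(zgrep ERROR "$tmpdir/sample.log.gz" \
    | awk '{print $2}' | sort | uniq -c | awk '{print $1, $2}')
echo "$counts"

rm -rf "$tmpdir"
```

Appending | sort -rn to the pipeline would list the busiest dates first.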

Sum up integer values in a specific column in a file

awk -F " " '{tot+=$1} END {print tot}' /path/to/the/numbers

Here the values to sum up are in the first column ($1) of the file. The -F " " option sets the column separator to a single space, which awk treats as any run of spaces or tabs. This is also the default, so the option can be left out.
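A small sketch of the summing step on inline sample numbers (the input values and the variable names total and empty are made up for illustration). Printing tot + 0 instead of tot is a tweak of mine so that empty input gives 0 rather than a blank line.

```shell
# Sum the first column of three sample rows.
total=$(printf '10 a\n20 b\n12 c\n' | awk '{tot += $1} END {print tot + 0}')
echo "$total"

# With no input rows, tot is never set; tot + 0 still prints a number.
empty=$(printf '' | awk '{tot += $1} END {print tot + 0}')
echo "$empty"
```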

Get min/max integer from a file with integers (one per row)

awk -F " " 'value=="" || $1 < value {value=$1} END {print value}' /path/to/file

This will give you the min value of the numbers in the first column in the file. The -F " " option sets the column separator to a single space, which awk treats as any run of spaces or tabs (this is also the default). To look for the max value, just change the < to >.
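If both values are wanted, they can be collected in one pass. This is a sketch under my own assumptions: the sample numbers and the names min, max and result are invented, and adding 0 to $1 is just a way to make sure awk compares numbers rather than strings.

```shell
# Track the smallest and largest value of column 1 in a single pass.
result=$(printf '7\n-3\n42\n0\n' | awk '
    NR == 1      { min = max = $1 + 0 }   # start from the first row
    $1 + 0 < min { min = $1 + 0 }
    $1 + 0 > max { max = $1 + 0 }
    END          { print min, max }')
echo "$result"
```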

Create pretty-printed copies of XML one-liners using xmllint

for f in * ; do xmllint.exe --format "$f" --output "prettyprint/${f%}.df" ; done

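This will run every file in the current directory through xmllint with the --format option and write the results as new files in a folder called prettyprint. The folder must already exist, since xmllint does not create it. Each copy is named after the original file with .df appended.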
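A slightly more defensive variant could look like the sketch below. The function name prettyprint_all is hypothetical (it is not part of xmllint), and I assume that xmllint is on the PATH under that name and that the copies should keep their original file names. It creates the output folder first and skips anything that is not a regular file, such as the prettyprint folder itself.

```shell
# Hypothetical helper: pretty-print every regular file in directory $1
# into $1/prettyprint, keeping the original file names.
prettyprint_all() {
    src=$1
    command -v xmllint >/dev/null 2>&1 || { echo "xmllint not found" >&2; return 1; }
    mkdir -p "$src/prettyprint" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue    # skips the prettyprint folder itself
        xmllint --format --output "$src/prettyprint/$(basename "$f")" "$f"
    done
}
```

Calling prettyprint_all . handles the current directory in the same way as the one-liner above.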
