I have been meaning to note down my unix checklist of commands (For Osx) which are very handy for basic operations on data. I will modify this post as and when I remember or come across something that fits here. These *nix commands are specifically tested for Mac OS.

Uniques

uniq - This is the unix unique function which can be primarily used to remove duplicates from a file amongst other things. The file has to be pre sorted for uniq to work
Consider file test which contains the following

$ cat test
aa
bb
bb
cc
cc
cc

Remove duplicates

$uniq test
aa
bb
cc

Count occurences of each item

$ uniq -c test
1 aa
2 bb
3 cc

Print only duplicate items in file

$ uniq -d test
bb
cc

Print only unique lines

$ uniq -u test
aa

Consider test now contains

$cat test
aa
bb
cc
AA
cC

Remove duplicate case insensitive.
This file is not sorted though. So it has to be sorted first before uniq. -i flag is for case in sensitive

$ sort test | uniq -i
AA
bb
cC

Case conversion

Convert all upper case in fileA to lower case and output as fileB

$ tr '[:upper:]' '[:lower:]' < fileA.txt > fileB.txt

File comparision

Compare two files and keep strings present in fileA but not in fileB

$ comm -23 fileA fileB

Compare two files and keep strings present in fileB but not in fileA

$ comm -13 fileA fileB

Compare two files and keep only strings which are present in both files

$ comm -3 fileA fileB

Sed

Primary purpose of sed is string replacement or pattern replacement.
Consider the following file as input

$ cat file.txt
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
  1. Replacing or substituting string
    $ sed 's/unix/linux/' file.txt
    linux is great os. unix is opensource. unix is free os.
    learn operating system.
    linuxlinux which one you choose.
    
    By default, the sed command replaces the first occurrence of the pattern in each line and it won't replace the second, third...occurrence in the line. Here the "s" specifies the substitution operation. The "/" are delimiters. The "unix" is the search pattern and the "linux" is the replacement string. If you miss a delimiter then the expression errors out as below
    $ sed 's/unix/linux' file.txt 
    sed: 1: "s/unix/linux": unterminated substitute in regular expression
    
    2 Replacing the nth occurrence of a pattern in a line. Use the /1, /2 etc flags to replace the first, second occurrence of a pattern in a line. The below command replaces the second occurrence of the word "unix" with "linux" in a line.
    $ sed 's/unix/linux/2' file.txt
    unix is great os. linux is opensource. unix is free os.
    learn operating system.
    unixlinux which one you choose.
    
    Here is the first occurence which is the default option
    $ sed 's/unix/linux/1' file.txt
    linux is great os. unix is opensource. unix is free os.
    learn operating system.
    linuxlinux which one you choose.
    
    And the third occurence
    $ sed 's/unix/linux/3' file.txt
    unix is great os. unix is opensource. linux is free os.
    learn operating system.
    unixlinux which one you choose.
    
    To replace all the occurence use 'g' (global replacement)
    $ sed 's/unix/linux/g' file.txt
    linux is great os. linux is opensource. linux is free os.
    learn operating system.
    linuxlinux which one you choose.
    
    To make the search case insensitive sed on mac does not have a flag but you can use plain regex to achieve it. For example modify the file.txt to below
    $ vi file.txt
    unix is great os. Unix is opensource. unix is free os.
    learn operating system.
    Unixlinux which one you choose.
    sed 's/[Uu]nix/linux/g' file.txt
    linux is great os. linux is opensource. linux is free os.
    learn operating system.
    linuxlinux which one you choose.
    
    How to find a string in all the files contained in a directory. You could use grep or find.
    grep -lr searchStr mydir
    grep --recursive --ignore-case --files-with-matches “searchStr" mydir
    find mydir -type f | xargs grep -l searchStr
    
    To find/replace multiple strings use the -e flag.
    sed -e 's/unix/linux/g' -e 's/Unix/Linux/g' file.txt
    linux is great os. Linux is opensource. linux is free os.
    learn operating system.
    Linuxlinux which one you choose.
    
    To replace a string that begins with a pattern use the regex for it alongwith sed
    sed 's/^learn/learn to use/g' file.txt
    unix is great os. Unix is opensource. unix is free os.
    learn to use operating system.
    Unixlinux which one you choose
    
    To remove whitespace characters at end of the line
    sed 's/[<spc><tab>]|/|/g' file.txt
    
    Unix command to know if your file has whitespace or tab characters
    vi file.txt
    :set list
    
    Unix command to remove BOM (Byte Order Mark) characters from your file Open the file in binary mode using -b flag to verify if you have BOM. And then remove them
    vi -b file.txt 
    :set nobomb
    :wq
    
    Use the -i flag to overwrite the existing file and create a backup of the original file. For example to remove all white spaces in a file.
    sed 's/ //g' file.txt
    cat file.txt
    unixisgreatos.Unixisopensource.unixisfreeos.
    learnoperatingsystem.
    Unixlinuxwhichoneyouchoose
    
    This will create a backup file called file.txt.bak with the original file contents and overwrite file.txt with no spaces To remove only the trailing spaces in a line use *$. The * character means "any number of the previous character" and $ refers to end of line.
    sed -i .bak 's/ *$//g' file.txt
    
    Verify the trailing whitespaces are removed by :set list
    vi file.txt
    :set list
    unix is great os. Unix is opensource. unix is free os.$
    
    learn operating system.$
    
    Unixlinux which one you choose.$
    
    To replace a blank line with something else. You can match a blank line by specifying an end-of-line immediately after a beginning-of-line, i.e. with ^$
    vi file.txt
    unix is great os. Unix is opensource. unix is free os.
    learn operating system.
    Unixlinux which one you choose.
    sed 's/^$/this used to be a blank line/' file.txt
    unix is great os. Unix is opensource. unix is free os.
    this used to be a blank line
    learn operating system.
    Unixlinux which one you choose.
    
    To remove tabs at the end of a line. Ex: Add a tab to the end of first line, so :set list will show ^I
    vi file.txt 
    unix is great os. Unix is opensource. unix is free os.^I$
    
    learn operating system.$
    
    Unixlinux which one you choose.$
    
    To create a tab in your sed command. use ctrl + v and then ctrl + i
    sed -i.bak 's/ *$//' file.txt
    vi file.txt
    :set list
    unix is great os. Unix is opensource. unix is free os.$
    
    learn operating system.$
    
    Unixlinux which one you choose.$
    
    Consider file test which contains the following
    $ cat test
    (firstname).aa
    (firstname).bb
    (firstname).bb
    (firstname).cc
    (firstname).CC
    (lastname).hh
    (lastname).jj
    (lastname).ll
    
    To extract the content after firstname
    sed -En 's/.firstname).([A-Za-z]+).*/\1/p' test
    aa
    bb
    bb
    cc
    CC
    

Search Strings

Total occurences of searchStr in current directory

grep -ro searchStr . | wc -l | xargs echo "Total matches :"

Total number of files where searchStr occurs in current directory

grep -lor searchStr . | wc -l | xargs echo "Total matches :"

To get an exact word match use the -w flag.

grep -lwr searchStr mydir

Recursively replace string original with replacement in all files under OSx directory mydir recursively(Excludes hidden files and folders)

find mydir \( ! -regex '.*/\..*' \) -type f -exec sed -i '' 's/original/replacement/g' {} \;

OR

find mydir \( ! -regex '.*/\..*' \) -type f -exec sed -i '' 's/original/replacement/g' {} +

The regex excludes all hidden files and folders which is particularly important if you want to avoid messing up your .DS_Store or .git files unknowningly.
if you use zsh then the following would also work

sed -i -- 's/original/replacement/g' **/*(D*)

This isnt exlcuding hidden files though. The */(D*) is basically zsh way of saying recursively go through all sub directories and all files.