Wednesday 24 February 2010

text extraction and reformat indexing

Input file:

cat filelist_input | head -10
1 ./ab-138.done.done
2 ./ab-69.done.done
3 ./ab-137.done.done
4 ./ab-109.done.done

Expected output:

1 ab-138.done.done
2 ab-69.done.done
3 ab-137.done.done
4 ab-109.done.done

Solution:

cat filelist_input | head -10 | perl -nle 's{^(\d+)\s+\./}{$1 } and print'

Note: Perl stays fast even when the input runs to lakhs (hundreds of thousands) of lines.
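
For comparison, the same reformatting can probably be done with sed. This is only a sketch of an alternative, not the method used in this post: the temporary file and the `input` and `result` variable names are made up for illustration, and it assumes a sed that accepts `-E` (GNU and BSD sed both do).

```shell
# Recreate the four sample lines from the post in a temporary file.
input=$(mktemp)
printf '%s\n' \
  '1 ./ab-138.done.done' \
  '2 ./ab-69.done.done' \
  '3 ./ab-137.done.done' \
  '4 ./ab-109.done.done' > "$input"

# Keep the leading index and drop the "./" that follows it.
# -E enables extended regular expressions; "#" is the delimiter,
# so the "/" in the pattern needs no escaping.
result=$(head -10 "$input" | sed -E 's#^([0-9]+)[[:space:]]+\./#\1 #')

printf '%s\n' "$result"
rm -f "$input"
```

One difference to keep in mind: sed prints lines that do not match unchanged, whereas the perl one-liner with "and print" emits only the lines where the substitution succeeded.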
