I have a log file listing all my backups. A column with the value YES (Preserved) means the backup won't be deleted by the retention policy. There can be one or more rows with Preserved = YES for a specific VM name.
My input is:

= FULL == 20210105 == 2100 == ASR-FULL-20210105-2100 == YES
= FULL == 20210202 == 2100 == ASR-FULL-20210202-2100 == YES
= FULL == 20210302 == 2100 == ASR-FULL-20210302-2100 == YES
= FULL == 20210406 == 2100 == ASR-FULL-20210406-2100 == YES
= FULL == 20210105 == 2146 == DNS10_7-FULL-20210105-2146 == YES
= FULL == 20210202 == 2153 == DNS10_7-FULL-20210202-2153 == YES
= FULL == 20210302 == 2148 == DNS10_7-FULL-20210302-2148 == YES
= FULL == 20210406 == 2122 == DNS10_7-FULL-20210406-2122 == YES
= FULL == 20210105 == 2105 == execnet.0-FULL-20210105-2105 == YES
= FULL == 20210202 == 2106 == execnet.0-FULL-20210202-2106 == YES
= FULL == 20210302 == 2106 == execnet.0-FULL-20210302-2106 == YES
= FULL == 20210406 == 2105 == execnet.0-FULL-20210406-2105 == YES
= FULL == 20210106 == 0200 == Prtgadmin.0-FULL-20210106-0200 == YES
= FULL == 20210105 == 2216 == sandbox.0-FULL-20210105-2216 == YES
= FULL == 20210202 == 2227 == sandbox.0-FULL-20210202-2227 == YES
= FULL == 20210406 == 2152 == sandbox.0-FULL-20210406-2152 == YES
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
= FULL == 20210105 == 2259 == wwws.0-FULL-20210105-2259 == YES
= FULL == 20210202 == 2314 == wwws.0-FULL-20210202-2314 == YES
= FULL == 20210105 == 2259 == webhost.0-FULL-20210105-2259 == YES

My desired output is to print the n-1 oldest matches (top n-1):

ASR-FULL-20210105-2100
ASR-FULL-20210202-2100
ASR-FULL-20210302-2100
DNS10_7-FULL-20210105-2146
DNS10_7-FULL-20210202-2153
DNS10_7-FULL-20210302-2148
execnet.0-FULL-20210105-2105
execnet.0-FULL-20210202-2106
execnet.0-FULL-20210302-2106
sandbox.0-FULL-20210105-2216
sandbox.0-FULL-20210202-2227
wwwp.0-FULL-20210105-2236
wwws.0-FULL-20210105-2259

So far I can get the result below by running the following awk commands, but it shows the most recent matches instead. Ideally I'd also like to do this with a single awk command. The year filter is not that important.

# cat bkp_list.log| grep -E '*2021.*YES'| awk -F[==-] 'cnt[$8]++{if (cnt[$8]>1) print prev=$0;next}' |awk -F[==] '{print $8}'
ASR-FULL-20210202-2100
ASR-FULL-20210302-2100
ASR-FULL-20210406-2100
DNS10_7-FULL-20210202-2153
DNS10_7-FULL-20210302-2148
DNS10_7-FULL-20210406-2122
execnet.0-FULL-20210202-2106
execnet.0-FULL-20210302-2106
execnet.0-FULL-20210406-2105
sandbox.0-FULL-20210202-2227
sandbox.0-FULL-20210406-2152
wwwp.0-FULL-20210202-2249
wwws.0-FULL-20210202-2314

Thank you
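
A possible direction, offered as a sketch rather than a definitive answer: the pattern cnt[$8]++ only becomes true from the second row of each VM onward, and the action prints the current row, so rows 2..n come out instead of rows 1..n-1. One way around that is to remember the previous name per VM and print it whenever a newer row for the same VM shows up, which leaves the newest row of each VM unprinted. The sketch below assumes that rows for each VM already appear oldest first (as in the sample), that fields are separated by " == ", and that backup names follow the <vm>-FULL-<date>-<time> pattern. The temporary file and the shortened sample rows are only there to make the example self-contained; the year filter is dropped since it was described as unimportant.

```shell
#!/bin/sh
# Shortened, illustrative subset of the log, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
= FULL == 20210105 == 2100 == ASR-FULL-20210105-2100 == YES
= FULL == 20210202 == 2100 == ASR-FULL-20210202-2100 == YES
= FULL == 20210302 == 2100 == ASR-FULL-20210302-2100 == YES
= FULL == 20210106 == 0200 == Prtgadmin.0-FULL-20210106-0200 == YES
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
EOF

# With " == " as the separator, $4 is the backup name and $5 the Preserved flag.
result=$(awk -F' == ' '
  $5 ~ /^YES/ {
      vm = $4
      sub(/-FULL-.*/, "", vm)          # strip the suffix to get the VM name (assumed naming scheme)
      if (vm in prev) print prev[vm]   # a newer row exists, so the previous one is not the newest
      prev[vm] = $4                    # remember this row until a newer one arrives
  }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

For the sample rows this prints the two older ASR backups and the older wwwp backup, and nothing for Prtgadmin.0, which has a single preserved row. To run it against the real log, the awk program can be pointed at bkp_list.log instead of the temporary file. If the rows were not already in chronological order per VM, sorting the file on the name column first would be needed for this approach to hold.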
https://stackoverflow.com/questions/67218835/how-to-print-the-first-occurence-of-a-column-matching-more-than-once-with-awk April 23, 2021 at 02:21AM