Tuesday, March 9, 2021

Check multiple TSV files and drop all shared rows from each TSV in Python

I have three TSV files.

file 1:

1   Alice   24
10  Bill    23
4   Ellen   24
9   Mike    30

file 2:

6   Julie   76
2   Bob     42
7   Tom     54
5   Frank   30
1   Alice   24

file 3:

3   Dave    68
8   Jerry   34
1   Alice   24
5   Frank   30
2   Bob     42

OUTPUT: My desired output is to drop, from each of those TSV files, every row whose first and second column values also appear in another of the files, and to keep the other rows as they are.

file 1:

10  Bill    23
9   Mike    30
4   Ellen   24

file 2:

6   Julie   76
7   Tom     54

file 3:

3   Dave    68
8   Jerry   34

My TSV files have no header row. I have tried the following code so far.

with open('file2.tsv') as check_file:
    check_set = set([row.split('\t')[0].strip().upper() for row in check_file])

with open('file1.tsv', 'r') as in_file, open('file3.tsv', 'w') as out_file:
    for line in in_file:
        if line.split('\t')[0].strip().upper() not in check_set:
            out_file.write(line)

But I did not get my three desired output files with this code. Any help would be appreciated. Thanks in advance.
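
The attempt above compares only the first column, checks file1 against file2 alone, and writes its result over file3.tsv, so it cannot produce three outputs. One possible approach is sketched below, assuming that "the same" means the first two columns match (ignoring case and surrounding whitespace, as the attempt does) and that a row should be dropped whenever that pair occurs in more than one file. The names `row_key` and `drop_shared_rows` and the `_filtered` output suffix are hypothetical choices made for this illustration; they are not from the question.

```python
import csv
from collections import Counter
from pathlib import Path


def row_key(row):
    """Key a row by its first two columns, stripped and upper-cased."""
    return tuple(cell.strip().upper() for cell in row[:2])


def drop_shared_rows(paths, suffix="_filtered"):
    """Write a filtered copy of each file without rows whose key occurs in another file.

    The suffix and the idea of writing new files (rather than overwriting
    the inputs) are assumptions made for this sketch.
    """
    paths = [Path(p) for p in paths]

    # Read every headerless TSV once, skipping blank lines.
    tables = {}
    for path in paths:
        with path.open(newline="") as handle:
            tables[path] = [row for row in csv.reader(handle, delimiter="\t") if row]

    # Count in how many distinct files each key appears; using a set per
    # file means duplicates inside a single file are counted only once.
    file_counts = Counter()
    for rows in tables.values():
        file_counts.update({row_key(row) for row in rows})

    written = []
    for path, rows in tables.items():
        out_path = path.with_name(f"{path.stem}{suffix}{path.suffix}")
        with out_path.open("w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            # Keep only rows whose key is unique to this file.
            writer.writerows(row for row in rows if file_counts[row_key(row)] == 1)
        written.append(out_path)
    return written
```

Calling `drop_shared_rows(["file1.tsv", "file2.tsv", "file3.tsv"])` would then write file1_filtered.tsv, file2_filtered.tsv, and file3_filtered.tsv next to the inputs. Surviving rows keep their original order within each file, so the row order may differ from the hand-written expected output in the question.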

https://stackoverflow.com/questions/66544388/check-multiple-tsv-file-and-drop-all-the-same-rows-from-each-tsv-in-python March 09, 2021 at 05:41PM
