2021年3月31日星期三

Delete and combine rows in CSV file for special condition on columns

I have the following content in a csv file:

ID Contact_id Tags_id
"id114" "" "Tags_id3"
"id12" "" "Tags_id1"
"" "" "Tags_id3"
"id3353" "contact_id8764" "Tags_id5"
"id355" "contact_id16" "Tags_id6"
"" "" "Tags_id7"
"" "" "Tags_id3"
"" "contact_id564" "Tags_id2"
"" "" "Tags_id12"
"id12076" "contact_id137" "Tags_id7"
"" "" "Tags_id3"
"" "" "Tags_id5"
"" "" "Tags_id1"
... ... ...

Plain Text for testing:

ID,Contact_id,Tags_id  "id114","","Tags_id3"  "id12","","Tags_id1"  "","","Tags_id3"  "id3353","contact_id8764","Tags_id5"  "id355","contact_id16","Tags_id6"  "","","Tags_id7"  "","","Tags_id3"  "","contact_id564","Tags_id2"  "","","Tags_id12"  "id12076","contact_id137","Tags_id7"  "","","Tags_id3"  "","","Tags_id5"  "","","Tags_id1"  

Expected result:

Contact_id Tags_id
"contact_id8764" "Tags_id5"
"contact_id16" "Tags_id6,Tags_id7,Tags_id3,Tags_id2,Tags_id12"
"contact_id564" "Tags_id6,Tags_id7,Tags_id3,Tags_id2,Tags_id12"
"contact_id137" "Tags_id7,Tags_id3,Tags_id5,Tags_id1"
... ...

Expected result in Plain Text:

Contact_id,Tags_id  "contact_id8764","Tags_id5"  "contact_id16","Tags_id6,Tags_id7,Tags_id3,Tags_id2,Tags_id12"  "contact_id564","Tags_id6,Tags_id7,Tags_id3,Tags_id2,Tags_id12"  "contact_id137","Tags_id7,Tags_id3,Tags_id5,Tags_id1"  
  • First remove all rows with ID and no Contact_id (like the row with id114).
  • Second remove all rows below ID and no Contact_id (like the row under id12) till the next ID (id3353).
  • Third if ID and Contact_id are available than collect the Tags below till next ID in the row whith the Contact_id. Add the same collection of Tags to all Contact_ids under the ID (Contact_id16 and Contact_id564 have the same Tags which belonging to the id355.
  • Fourth remove the ID column

The first two steps are the most important.

Hopefully you did understand me, if not I will try to rewrite it. It's a very difficult task.

I tried it with vim macro but without any success. I know awk is probably better for this task but I am still learning it and was also unable to do it. Maybe there is another way how to solve this task. I hope someone can help.

https://stackoverflow.com/questions/66895290/delete-and-combine-rows-in-csv-file-for-special-condition-on-columns April 01, 2021 at 05:04AM

没有评论:

发表评论