Dataframe Example
index | fileName | startline | endline |
---|---|---|---|
0 | 293104.java | 30 | 40 |
1 | 288951.java | 183 | 247 |
2 | 2378709.java | 98 | 117 |
Goal
I want to open and read the contents of the file in fileName, and extract the lines in the range created by the values in the startline and endline columns.
I then want to store that in a new column called snippet.
Example of snippet creation logic
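    def snippetMaker(fileName, startLine, endLine):
        # Read the whole file, closing it once the contents are in memory
        with open(fileName, 'r') as f:
            contents = f.read()
        # Keep only the lines in the half-open range [startLine, endLine)
        snippet = contents.split('\n')[startLine:endLine]
        # Flatten the list's string form: drop brackets, turn commas into spaces
        cleanSnippet = str(snippet).replace('[','').replace(']','').replace(',',' ')
        return cleanSnippet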
Current approach
I have seen map() used with functions like the one above (provided the function accepts iterable arguments and returns a list), with the result then assigned to a dataframe column as shown below.
    df['snippet'] = snippetMaker(df['fileName'], df['startline'], df['endline'])
I am having trouble reconfiguring the above snippetMaker function to work in such a way.
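One way to read the map() idea is that the built-in map() accepts several iterables and calls the function once per aligned group of elements, so the function itself can stay scalar. The following is a minimal sketch under that reading, not a definitive solution. The names `read_snippet` and `make_snippets` are hypothetical, and joining the selected lines with a newline is an illustrative choice that differs from the string cleanup in snippetMaker.

```python
def read_snippet(file_name, start_line, end_line):
    """Return lines [start_line, end_line) of a file as one string (0-based)."""
    with open(file_name, "r") as f:
        lines = f.read().split("\n")
    return "\n".join(lines[start_line:end_line])


def make_snippets(file_names, start_lines, end_lines):
    """Call read_snippet once per row; inputs are equal-length iterables."""
    # map() with several iterables walks them in lockstep, like zip()
    return list(map(read_snippet, file_names, start_lines, end_lines))
```

Assuming the column names from the table, the result could then be assigned with `df['snippet'] = make_snippets(df['fileName'], df['startline'], df['endline'])`, since a pandas Series can be iterated over like any other sequence.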
Other details
I do not want to use iterrows(), since the dataframe contains over 8 million rows.
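With that many rows, the cost is likely dominated by opening and reading files rather than by the loop itself. If many rows point at the same file (an assumption, the question does not say), caching each file's lines avoids re-reading it. A rough sketch using functools.lru_cache and zip() over the columns follows; `load_lines` and `cached_snippets` are hypothetical names and the cache size is arbitrary.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)  # arbitrary cache size, tune to available memory
def load_lines(file_name):
    """Read a file once and return its lines as a tuple."""
    with open(file_name, "r") as f:
        return tuple(f.read().split("\n"))


def cached_snippets(file_names, start_lines, end_lines):
    """Build one snippet string per row without iterrows()."""
    return [
        "\n".join(load_lines(name)[start:end])
        for name, start, end in zip(file_names, start_lines, end_lines)
    ]
```

Assuming the column names from the table, this could be used as `df['snippet'] = cached_snippets(df['fileName'], df['startline'], df['endline'])`. If there are more distinct files than the cache holds, sorting the dataframe by fileName first should keep rows for the same file adjacent and improve the hit rate.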
https://stackoverflow.com/questions/67248859/how-to-open-a-file-whose-name-is-stored-in-a-pandas-cell-manipulate-the-content April 25, 2021 at 09:06AM