How to Do Some Things: How to extract the tree structure from NLTK tree without labels?

I'm using benepar to parse sentences in French. I would like to get a tree-structured syntax representation that is divided at NP and PP boundaries, without any extra labels.

For example:

Original Sentence:

A man with a red helmet on a small moped on a dirt road .

Desired output:

( ( ( A man ) ( with ( a red helmet ) ) ) ( on ( ( a small moped ) ( on ( a dirt road ) ) ) ) . )

Parsed output:

(NP (NP (DT A) (NN man)) (PP (IN with) (NP (NP (DT a) (JJ red) (NN helmet)) (PP (IN on) (NP (DT a) (JJ small) (VBN moped))) (PP (IN on) (NP (DT a) (NN dirt) (NN road))))) (. .))

(SENT (NP (DET Un) (NC homme) (PP (P avec) (NP (DET un) (NC casque) (AP (ADJ rouge))) (PP (P sur) (NP (DET une) (ADJ petite) (NC mobylette)))) (PP (P sur) (NP (DET un) (NC+ (NC chemin) (P de) (NC terre))))) (PONCT .))

The code I wrote to produce the parsed output:

import spacy
from benepar.spacy_plugin import BeneparComponent

nlp = spacy.load('en')
nlp.add_pipe(BeneparComponent('benepar_en'))
doc = nlp('A man with a red helmet on a small moped on a dirt road .')

sent = list(doc.sents)[0]
print(sent._.parse_string)
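
One possible direction, shown only as a minimal sketch: work directly on the bracketed string that `sent._.parse_string` returns, drop every label, and collapse nodes that have a single child (such as `(DT A)`) to the bare word. The helper names `to_nested`, `render`, and `strip_labels` are hypothetical and are not part of benepar or NLTK. The sketch assumes every opening parenthesis is followed by a label and that words contain no parentheses. It keeps the attachments chosen by the parser, so its bracketing can differ from the hand-written desired output above.

```python
import re

def to_nested(parse_string):
    """Turn a bracketed parse string into nested lists of words (labels dropped)."""
    # Tokens are "(", ")", or a run of characters without spaces or parentheses.
    tokens = re.findall(r"\(|\)|[^\s()]+", parse_string)
    pos = 0

    def parse():
        nonlocal pos
        pos += 2  # skip "(" and the label that follows it (assumed always present)
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # skip ")"
        # A node with one child, e.g. (DT A), collapses to that child.
        return children[0] if len(children) == 1 else children

    return parse()

def render(node):
    """Format nested lists as a space-separated, label-free bracketing."""
    if isinstance(node, str):
        return node
    return "( " + " ".join(render(child) for child in node) + " )"

def strip_labels(parse_string):
    return render(to_nested(parse_string))

parsed = ("(NP (NP (DT A) (NN man)) (PP (IN with) (NP (NP (DT a) (JJ red) "
          "(NN helmet)) (PP (IN on) (NP (DT a) (JJ small) (VBN moped))) "
          "(PP (IN on) (NP (DT a) (NN dirt) (NN road))))) (. .))")
print(strip_labels(parsed))
# ( ( A man ) ( with ( ( a red helmet ) ( on ( a small moped ) ) ( on ( a dirt road ) ) ) ) . )
```

Getting exactly the desired grouping would additionally require deciding where each PP attaches, which this sketch does not attempt.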

https://stackoverflow.com/questions/66095091/how-to-extract-the-tree-structure-from-nltk-tree-without-labels February 08, 2021 at 10:07AM

Sunday, February 7, 2021