I have an HTML table like the one below:
<table class="pull-right table table-bordered table-hover table-striped" id="cost-comparison-table"> <tr> <td> ABC <td>USD 17000 <tr> <td> DEF <td>USD 4000 <tr> <td> GHI <td>USD 5000 <tr> <td> JKL <td>USD 18000 <tr> <td> MNO <td>USD 19000 <tr> <td> PQR <td>USD 10500 </td> </td> </tr> </td> </td> </tr> </td> </td> </tr> </td> </td> </tr> </td> </td> </tr> </td> </td> </tr> </table>
Is there any way to scrape HTML formatted this way? This is actually a minified version of the HTML. Note that in HTML5, closing tags for elements such as li, tr, and td are optional, and void elements such as br and img never take a closing tag.
I need to create a dictionary from the table contents. My code so far:
tds = [row.findAll('td') for row in soup.findAll('tr')]
results = { td[0].string: td[1].string for td in tds }
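One possible way around the missing closing tags, sketched here with the standard library's html.parser rather than BeautifulSoup, is to treat every new tr or td start tag as implicitly closing the previous row or cell. The names CellCollector and table_to_dict are made up for this illustration, and the sample string is a shortened copy of the table above. BeautifulSoup may also cope if it is given a parser that applies the HTML5 implied-end-tag rules, such as html5lib, but that is not verified here.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect table cells, letting each <tr>/<td> start tag end the previous one."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell strings per row
        self.in_cell = False  # True while incoming text belongs to the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])      # a new row implicitly ends the previous row
            self.in_cell = False
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")  # a new cell implicitly ends the previous cell
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self.in_cell = False      # text after an explicit close is not cell text

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data


def table_to_dict(html):
    """Map the first cell of each row to the second cell, with whitespace stripped."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return {row[0].strip(): row[1].strip() for row in parser.rows if len(row) >= 2}


# Shortened copy of the table from the question (assumed input).
sample = (
    '<table id="cost-comparison-table"> <tr> <td> ABC <td>USD 17000 '
    "<tr> <td> DEF <td>USD 4000 </td> </td> </tr> </td> </td> </tr> </table>"
)
results = table_to_dict(sample)
print(results)  # {'ABC': 'USD 17000', 'DEF': 'USD 4000'}
```

Because the collector only reacts to start tags, the stray closing tags piled up at the end of the minified markup do not affect the result.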
https://stackoverflow.com/questions/67213358/how-can-we-scrape-minified-html-with-beautifulsoup-and-python April 22, 2021 at 08:31PM