Sunday, December 20, 2020

Web scraping from multiple pages with for loop

I have created a web scraping tool for collecting data on listed houses.

I have a problem when it comes to changing pages. I made a for loop that goes from 1 to some number.

The problem is this: on this website the number of the last page can change all the time. Now it is 70, but tomorrow it can be 68 or 72. If I set the range to, for example, (1, 74), it will print the last page many times, because if you go over the maximum, the site always loads the last page.

URL: https://www.etuovi.com/myytavat-asunnot/oulu?haku=M1582971026&sivu=1000 <---- if you put a page number above the real number of pages (70), it automatically opens the last page (70), once for every extra iteration of the range.

So how can I make this loop stop when it reaches the maximum page number?

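    import requests
    from bs4 import BeautifulSoup as soup

    # my_url is defined elsewhere; presumably it ends with "&sivu="
    for sivu in range(1, 100):
        req = requests.get(my_url + str(sivu))
        page_soup = soup(req.text, "html.parser")
        containers = page_soup.findAll("div", {"class": "ListPage__cardContainer__39dKQ"})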
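One possible way to stop, sketched under assumptions: since the site serves the last page again for any page number past the maximum, the loop could compare each page's cards with the previous page's and break when they are identical. The names `scrape_until_repeat`, `fetch_cards`, `max_pages` and `fake_fetch` are hypothetical and only for illustration; `fake_fetch` imitates a three-page site so the sketch runs without network access.

```python
def scrape_until_repeat(fetch_cards, max_pages=1000):
    """Collect cards page by page; stop when a page repeats the previous one.

    fetch_cards is a hypothetical callable that takes a page number and
    returns a list of comparable items (for example, strings).
    """
    all_cards = []
    previous = None
    for sivu in range(1, max_pages + 1):
        cards = fetch_cards(sivu)
        # Past the last page the site serves the last page again, so an
        # identical (or empty) result is taken to mean there is nothing new.
        if not cards or cards == previous:
            break
        all_cards.extend(cards)
        previous = cards
    return all_cards


# Stand-in for the real site: three pages, and any higher page number
# returns the last page again, as described in the question.
FAKE_PAGES = {1: ["house A", "house B"], 2: ["house C", "house D"], 3: ["house E"]}


def fake_fetch(sivu):
    return FAKE_PAGES.get(sivu, FAKE_PAGES[3])


if __name__ == "__main__":
    print(scrape_until_repeat(fake_fetch))
```

For the real site, `fetch_cards` could wrap the `requests.get` and `findAll` calls from the loop above and return something like `[str(c) for c in containers]`, so that pages are compared as plain strings. This assumes two requests for the same page return the same cards; if the page contains content that changes on every request, the comparison would need to use a more stable field, such as each listing's link.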

Thanks

https://stackoverflow.com/questions/65385007/web-scraping-from-multiple-pages-with-for-loop December 21, 2020 at 05:32AM
