I want to scroll down this page, https://www.newsnow.com/us/World?type=ln&d=1609455600, by repeatedly clicking the "view more headlines" button so that I can scrape headlines from previous days. However, after a number of loop iterations (a number of clicks on "view more headlines"), the page in the driver reloads automatically and returns to its initial position. This is the code:
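    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.common.exceptions import WebDriverException
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    url = 'https://www.newsnow.com/us/World?type=ln&d=1609455600'
    options = Options()
    options.add_argument('--no-sandbox')
    options.add_argument('--ignore-certificate-errors')
    driver = webdriver.Chrome(executable_path=r"C:/chromedriver.exe", options=options)
    driver.get(url)
    time.sleep(10)  # give the page time to finish its initial load
    # driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
    for i in range(3000):
        try:
            # wait for the "view more headlines" button, bring it into view, click it
            elem = WebDriverWait(driver, 20).until(
                EC.element_to_be_clickable((By.CLASS_NAME, 'btn--primary__label')))
            driver.execute_script("arguments[0].scrollIntoView();", elem)
            elem.click()
            print(f'click {i} done')
            time.sleep(5)
        except WebDriverException:
            # timeout waiting for the button, or the click itself failed
            print('end of the scrolling down')
            break
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # ...
    # working with the soup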
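One workaround I am considering, shown here only as a rough sketch, is to collect the headlines after every click instead of parsing the page once at the end, so that an automatic reload does not throw away what was already loaded. The helper name `merge_headlines` and the sample batches below are hypothetical; in the real loop each batch would come from parsing `driver.page_source` after a click.

```python
def merge_headlines(seen, batch):
    """Append to `seen` every headline in `batch` not already present, keeping order."""
    known = set(seen)
    for headline in batch:
        if headline not in known:
            seen.append(headline)
            known.add(headline)
    return seen


# Hypothetical batches, standing in for what is parsed after successive clicks.
collected = []
merge_headlines(collected, ["headline A", "headline B"])
merge_headlines(collected, ["headline A", "headline B", "headline C"])  # more loaded
merge_headlines(collected, ["headline A"])  # page reloaded and reset to the top
print(collected)
```

This does not stop the reload itself; it only keeps the headlines gathered before it happened.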
https://stackoverflow.com/questions/65532282/infinite-scroll-down-using-selenium-alwais-fail-because-of-automatic-reload-of-t January 02, 2021 at 02:02AM