I've been trying to scrape the title of the book on the landing page, along with the titles of the books in the "Customers who viewed" carousel on the same webpage. To get the titles of all the books, you have to keep clicking the right arrow button, as shown in the image above.
I've tried with:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

links = [
    "https://www.amazon.com/Keto-Meal-Prep-Cookbook-Beginners/dp/1673455980/",
    "https://www.amazon.com/Keto-Diet-Cookbook-Beginners-Recipes/dp/1792145454/"
]

def fetch_content(link):
    driver.get(link)
    title = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,'h1#title > span#productTitle'))).text
    page_count = wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]//span[@class="a-carousel-page-max"]'))).text
    title_list = []

    for i in range(int(page_count)+1):
        wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]/following-sibling::*[contains(@class,"a-carousel-row")]//a[contains(@class,"a-carousel-goto-nextpage")]'))).click()
        for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"li.a-carousel-card > a.a-link-normal > div[data-rows]"))):
            title_list.append(item.text)

    return title,title_list

if __name__ == '__main__':
    with webdriver.Chrome() as driver:
        wait = WebDriverWait(driver,15)
        for link in links:
            print(fetch_content(link))
When I execute the above script (and scroll down a bit manually while it is running), it grabs the first two titles from the "Customers who viewed" container and then throws a stale element reference error pointing at title_list.append(item.text).
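A stale element reference usually means the page re-rendered part of the DOM between locating an element and reading it. That seems plausible here, since the carousel swaps its cards after each click, though it is not confirmed for this page. A minimal sketch of one common workaround, under that assumption, is to locate the elements again and retry whenever one goes stale. The helper name `collect_texts` and its `find_elements` parameter are hypothetical, and the fallback exception class is only a stand-in so the sketch runs without Selenium installed.

```python
import time

try:
    # Real exception class when Selenium is installed.
    from selenium.common.exceptions import StaleElementReferenceException
except ImportError:
    # Stand-in (assumption) so the sketch can run without Selenium.
    class StaleElementReferenceException(Exception):
        pass


def collect_texts(find_elements, attempts=3, delay=0.5):
    """Return the text of every located element, retrying on staleness.

    find_elements is a hypothetical zero-argument callable that locates
    the elements from scratch each time it is called, so a retry never
    reuses references that the page has already discarded.
    """
    for attempt in range(attempts):
        try:
            # Read all texts in one pass right after locating the elements.
            return [element.text for element in find_elements()]
        except StaleElementReferenceException:
            if attempt == attempts - 1:
                raise  # Give up after the last attempt.
            time.sleep(delay)  # Let the carousel finish re-rendering.
```

In the script above, the inner loop could then become something like `title_list.extend(collect_texts(lambda: driver.find_elements(By.CSS_SELECTOR, "li.a-carousel-card > a.a-link-normal > div[data-rows]")))`. Because each carousel page may still show cards that were already collected, the resulting list may need de-duplicating.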
https://stackoverflow.com/questions/67324375/unable-to-scrape-the-title-of-the-main-book-along-with-the-books-viewed-by-custo (April 30, 2021 at 04:06 AM)
How can I scrape the title of the main book along with the books viewed by customers from a webpage?