Hello.
My first post here.
Totally a noob, almost zero experience in coding, but I wanted to try, with the help of ChatGPT, to make a script that filters a specific category on a clothing site by measurement, so I can find something that suits me directly without visiting each item. But I can't make it spit out any results. The script looks alright to me, installing all the required packages went fine with Thonny's help, and the code runs without errors, but it doesn't spit any links back. Totally blank. Any help would be amazing. The code is in Python:
```python
import requests
from bs4 import BeautifulSoup

shoulder_min = 75
shoulder_max = 77

url = "https://eur.shein.com/"

# Use requests to get the HTML content of the website
response = requests.get(url)

# Use BeautifulSoup to parse the HTML content
soup = BeautifulSoup(response.content, 'html.parser')

# Find all products on the website
products = soup.find_all('div', class_='c-goodsitem__ratiowrap')

# Create an empty list to store the links of products that meet the filter
filtered_links = []

# Iterate over each product and check if it meets the shoulder size filter
for product in products:
    # Find the shoulder size of the product
    shoulder = product.find('span', class_='c-goodsitem__size--shoulder').text
    shoulder = int(shoulder.split(' ')[0])
    # Check if the shoulder size meets the filter
    if shoulder >= shoulder_min and shoulder <= shoulder_max:
        # If the product meets the filter, add its link to the filtered_links list
        link = url + product.find('a', class_='c-goodsitem__goodsimg')['href']
        filtered_links.append(link)

# Write the links of filtered products to a text file
with open("filtered_links.txt", "w") as f:
    f.write("Links of filtered products:\n")
    for link in filtered_links:
        f.write(link + "\n")
```
Top comments (2)
I can't offer any assistance on the script, but if you wrap your code with a backtick (or 3 of them before and after for a long block), then your text will format as code.
From what you’re saying, it sounds like your script isn’t finding any results because the website probably loads the products with JavaScript after the page initially loads.
When you use requests.get(), you only get the basic HTML that the server sends, but not the stuff that appears later from scripts running in the browser. So the product info you want might not actually be in the HTML you’re grabbing.
Also, those class names you’re targeting might have changed or might be different than you expect. Sometimes sites rename or structure their HTML in tricky ways, so your script can’t find the elements.
What I’d suggest is to open the site in your browser, right-click on a product, and choose “Inspect” to see the exact HTML structure and classes once the page fully loads. You might find that the data is actually loaded through some behind-the-scenes API. If that’s the case, you can try to find that API and get the data directly from there — usually in JSON format, which is easier to work with.
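If you do find such an API in the Network tab, the filtering itself becomes a matter of walking the parsed JSON. Everything in this sketch is hypothetical, the `goods` list, the `shoulder_cm` and `url` keys, and the endpoint mentioned in the comment, since the real payload shape will only be visible once you inspect the actual request:

```python
def filter_by_shoulder(payload, lo, hi):
    """Return product URLs whose shoulder measurement falls within [lo, hi].

    `payload` is assumed to look like the (hypothetical) JSON an internal
    product API might return; adjust the key names to whatever you actually
    see in the browser's Network tab.
    """
    links = []
    for item in payload.get("goods", []):
        shoulder = item.get("shoulder_cm")
        if shoulder is not None and lo <= shoulder <= hi:
            links.append(item["url"])
    return links

# In the real script you would fetch the payload with something like
#   payload = requests.get("<the API URL you found>").json()
# Here a canned sample stands in, so the logic can be checked offline:
sample = {
    "goods": [
        {"url": "https://eur.shein.com/item-a", "shoulder_cm": 76},
        {"url": "https://eur.shein.com/item-b", "shoulder_cm": 80},
        {"url": "https://eur.shein.com/item-c"},  # no measurement listed
    ],
}
print(filter_by_shoulder(sample, 75, 77))
# → ['https://eur.shein.com/item-a']
```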
If the site is heavily reliant on JavaScript, a simple requests call won’t cut it. You might need to use tools like Selenium or Playwright that can open a browser for you, run the scripts, and then grab the fully loaded page.
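Whichever tool ends up rendering the page, it helps to keep the parsing step in its own function so you can test it on a static snippet. This sketch reuses the selectors from your script (which may well not match the site's real markup) and a made-up HTML fragment standing in for what a browser-automation tool would hand back as the fully loaded page; it also skips items where a tag is missing instead of crashing:

```python
from bs4 import BeautifulSoup

def extract_links(html, base_url, lo, hi):
    """Apply the shoulder filter from the original script to rendered HTML."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for product in soup.find_all("div", class_="c-goodsitem__ratiowrap"):
        size_tag = product.find("span", class_="c-goodsitem__size--shoulder")
        link_tag = product.find("a", class_="c-goodsitem__goodsimg")
        if size_tag is None or link_tag is None:
            continue  # markup changed or measurement missing: skip the item
        shoulder = int(size_tag.text.split()[0])
        if lo <= shoulder <= hi:
            links.append(base_url + link_tag["href"])
    return links

# Made-up fragment mimicking the structure the original script expects:
sample_html = """
<div class="c-goodsitem__ratiowrap">
  <span class="c-goodsitem__size--shoulder">76 cm</span>
  <a class="c-goodsitem__goodsimg" href="item-a.html"></a>
</div>
<div class="c-goodsitem__ratiowrap">
  <span class="c-goodsitem__size--shoulder">80 cm</span>
  <a class="c-goodsitem__goodsimg" href="item-b.html"></a>
</div>
"""
print(extract_links(sample_html, "https://eur.shein.com/", 75, 77))
# → ['https://eur.shein.com/item-a.html']
```

If this function returns links for your test fragment but nothing for the live page, that confirms the selectors (or the static HTML itself) are the problem rather than your filtering logic.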
Don’t worry if it feels tricky; this kind of web scraping can be tough when sites load content dynamically. Keep experimenting and feel free to ask for help! I think I can be of great help, as I have worked on a lot of such projects at the web app development company I work in. You’re doing great just by trying this out.