
I want to scrape data from a website that requires a login; the page holding the data can only be reached after signing in.

Is there any way to scrape the data after logging in using Scrapy, or to simulate the login?

Note: I do have the login credentials with me.

Pratyush Behera
  • 1
    I haven't tried it before, but this [article](https://python.gotrained.com/scrapy-formrequest-logging-in/) looks promising. Also, consider using the `selenium` module in Python. It's pretty easy to use. – Anwarvic Dec 21 '18 at 14:32
  • 1
    Yes, it is totally possible, but of course you'll need a deeper understanding of how requests work and how to reverse engineer the site you are trying to log in to. This is a per-site thing; there is no one spider that gets past every login. – eLRuLL Dec 21 '18 at 15:25
  • 1
    If you post what you have now, maybe someone can help you go further from that point. – Serban Gorcea Dec 21 '18 at 15:34
  • 1
    Does this answer your question? [Using Scrapy with authenticated (logged in) user session](https://stackoverflow.com/questions/5850755/using-scrapy-with-authenticated-logged-in-user-session) – Vishvajeet Ramanuj Jul 27 '20 at 12:37

1 Answer


Short answer: Yes, you can scrape data after login. Check `formdata` in Scrapy's `FormRequest`, this answer on making a POST request using Scrapy, and the Scrapy documentation.

Long answer: Login pages are just forms. You can fill in the required fields and post that data. Log in manually with the Chrome developer tools open (Ctrl + Shift + I) and watch the Network tab for the request that is sent when you press the submit/login button. You can then inspect that POST request (its URL and form fields, including any hidden ones) and duplicate it in your scraper. The links above explain how to post data and how requests and responses work in Scrapy.

Shubham