I'm trying to write code that collects resumes from the "indeed.com" website. To download resumes from "indeed.com" you have to be logged in to your account. My problem is that after I post the login data I get response [200], which indicates a successful POST, but the login still fails.

Here is my code:

import requests
from bs4 import BeautifulSoup

# Fetch the login page; the tokens the form needs are embedded in its text.
page = requests.get('https://secure.indeed.com/account/login')
soup = BeautifulSoup(page.content, 'html.parser')
row_text = soup.text

# Slice each value out from between its key and the key that follows it.
surftok = str(row_text[row_text.find('"surftok":') + 11:row_text.find('","tmpl":')])
formtok = str(row_text[row_text.find('"tk":') + 6:row_text.find('","variation":')])
logintok = str(row_text[row_text.find('"loginTk":') + 11:row_text.find('","debugBarLink":')])
cfb = int(str(row_text[row_text.find('"cfb":') + 6:row_text.find(',"pvr":')]))
pvr = int(str(row_text[row_text.find('"pvr":') + 6:row_text.find(',"obo":')]))
hl = str(row_text[row_text.find('"hl":') + 6:row_text.find('","co":')])

# Form fields sent with the login request.
data = {
    'action': 'login',
    '__email': 'myEmail',
    '__password': 'myPassword',
    'remember': '1',
    'hl': hl,
    'cfb': cfb,
    'pvr': pvr,
    'form_tk': formtok,
    'surftok': surftok,
    'login_tk': logintok
}

# Post the form, then check whether the email appears in the returned page.
response = requests.post("https://secure.indeed.com/", data=data)
print(response)
print('myEmail' in response.text)

It shows response [200], but when I search the response page for my email to confirm that the login succeeded, I don't find it. The login seems to fail for a reason I can't identify.
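The slicing in the code above depends on the exact order of the keys in the page text, so a small layout change can silently produce wrong tokens. A minimal sketch of a sturdier lookup follows, assuming the values appear as `"key":"value"` or `"key":123` pairs as the slicing implies. The helper name `extract_value` and the sample string are made up for illustration and are not taken from the real page.

```python
import re


def extract_value(page_text, key):
    """Return the raw value that follows `"key":` in page_text, or None if absent."""
    # Match either a quoted string ("key":"value") or a bare integer ("key":123).
    pattern = r'"%s":\s*(?:"([^"]*)"|(-?\d+))' % re.escape(key)
    match = re.search(pattern, page_text)
    if match is None:
        return None
    # Group 1 holds a quoted value, group 2 a bare integer (still as a string).
    return match.group(1) if match.group(1) is not None else match.group(2)


# Hypothetical sample that only resembles the settings the question slices.
sample = '{"surftok":"abc123","tmpl":"x","cfb":2,"pvr":0,"hl":"en"}'
print(extract_value(sample, "surftok"))
print(extract_value(sample, "cfb"))
print(extract_value(sample, "missing"))
```

A `None` result makes a missing token visible immediately instead of posting a garbage slice.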

  • Take a look here: [https://stackoverflow.com/questions/11892729/how-to-log-in-to-a-website-using-pythons-requests-module](https://stackoverflow.com/questions/11892729/how-to-log-in-to-a-website-using-pythons-requests-module) – teller.py3 Sep 15 '18 at 18:48
  • Possible duplicate of [How to use cookies in Python Requests](https://stackoverflow.com/questions/31554771/how-to-use-cookies-in-python-requests) – ivan_pozdeev Sep 16 '18 at 01:24
  • Thanks for your time, but it doesn't help. There is nothing new there: they fill the 'payload' object with data and then post it. My problem is that after posting the data I get response [200], which indicates a successful POST, but the login still fails. – Raof Mohamed Sep 16 '18 at 07:28
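For reference, the point behind the two linked questions is that a separate `requests.get` and `requests.post` do not share cookies, whereas a `requests.Session` does. A minimal sketch of that pattern follows. The function name `login_with_session` and the `build_form_data` callback are hypothetical, and `session` is assumed to behave like `requests.Session`. Whether this is enough for this particular site is not established here.

```python
def login_with_session(session, login_page_url, post_url, build_form_data):
    """GET the login page, then POST the form through the same session.

    `session` is expected to behave like requests.Session, so cookies set by
    the GET are sent back automatically with the POST.
    `build_form_data` is a hypothetical callback that turns the login page
    text into the dict of form fields (tokens, email, password).
    """
    page = session.get(login_page_url)        # also stores the initial cookies
    form_data = build_form_data(page.text)    # tokens must come from this same page
    response = session.post(post_url, data=form_data)
    response.raise_for_status()               # raises on 4xx/5xx; a 200 alone proves little
    return response


# Typical use (needs the third-party `requests` package and network access):
#   import requests
#   with requests.Session() as session:
#       response = login_with_session(
#           session,
#           'https://secure.indeed.com/account/login',
#           'https://secure.indeed.com/',
#           my_form_builder,  # hypothetical function returning the `data` dict
#       )
#       print(session.cookies.get_dict())
```

Printing the session cookies after the POST is a more direct check than searching the response text for the email address.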

2 Answers

Send headers with your POST request as well; copy them from the request headers your browser sends.

headers = {'user-agent': 'Chrome'}
response = requests.post("https://secure.indeed.com/", headers=headers, data=data)
  • Thanks for trying. I did, but it did not help. Someone told me that the website I'm trying to scrape uses JavaScript redirection, which Python requests does not support. It seems that I have to use Selenium. – Raof Mohamed Sep 24 '18 at 03:09
  • @RaofMohamed Hi, I am trying to do the same thing you did: scrape CVs from Indeed. Did you resolve your problem? If so, could you please post your answer? If possible, could you share a bit of the code showing how you log in to Indeed and redirect to another URL? I am trying to do it without logging in, but the URL is redirected from 'indeed.com' to 'resumes.indeed.com', the keywords are not picked up, and I don't get any results. – Pmsheth Oct 22 '18 at 11:44

Some websites use JavaScript redirection, and "indeed.com" is one of them. Unfortunately, Python requests does not execute JavaScript, so it cannot follow such redirects. In such situations, Selenium can be used instead.
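A minimal sketch of the Selenium route, under stated assumptions: the field names `__email` and `__password` are taken from the POST data in the question and may not match the real form, the function name `login_with_browser` is made up, and a changed URL is treated as a sign that the redirect happened, which does not prove the login succeeded. `driver` is expected to be a Selenium WebDriver such as `webdriver.Chrome()`.

```python
import time


def login_with_browser(driver, login_url, email, password, timeout=15.0):
    """Fill in the login form in a real browser and wait for a redirect.

    Returns True if the browser leaves `login_url` within `timeout` seconds.
    """
    driver.get(login_url)
    # Selenium accepts the locator strategy as a plain string;
    # "name" is the value of By.NAME.
    driver.find_element("name", "__email").send_keys(email)
    password_box = driver.find_element("name", "__password")
    password_box.send_keys(password)
    password_box.submit()  # submits the form that contains this field

    # The browser runs the page's JavaScript, so just poll until the URL changes.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if driver.current_url != login_url:
            return True
        time.sleep(0.5)
    return False


# Typical use (needs the third-party `selenium` package and a browser driver):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   ok = login_with_browser(driver, 'https://secure.indeed.com/account/login',
#                           'myEmail', 'myPassword')
#   driver.quit()
```

Once logged in, the resume pages can be read from `driver.page_source` in the same browser session.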