import requests
from bs4 import BeautifulSoup
import pymysql
from pymysql.cursors import DictCursor
if __name__ == '__main__':
    url = "https://www.yooc.me/group/42238/exam/81902/detail"
    anwerurl = "https://www.yooc.me/group/42238/exam/81563/answer/save"
    header = {
        # Host must name the site being requested
        'Host': 'www.yooc.me',
        "X-CSRFToken": "y3kv3hZKZaTpwqJSCjVmIwYC2hmEpmA5",
        'Referer': 'https://www.yooc.me/group/42238/exam/81563/detail',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
    }
    cookies = {"csrftoken": "y3kv3hZKZaTpwqJSCjVmIwYC2hmEpmA5",
               "UM_distinctid": "16d3de3827291-038504f1865a1-133f6b55-1fa400-16d3de38273303",
               "CNZZDATA1254048558": "612160265-1568698218-%7C1568698218",
               #"sessionid": "b951c20d99d92f0a3c6bc27b53a38aff",
               "sessionid": "61d5e6ef27fb542845facd8a8369a376",
               "Hm_lvt_435408cf352a14d68ef6861b9d51158c": "1568701383,1568702018",
               "Hm_lpvt_435408cf352a14d68ef6861b9d51158c": "1568702762"}
    # Fetch the exam detail page and parse it
    res = requests.get(url, headers=header, cookies=cookies)
    html_text = res.text
    s = BeautifulSoup(html_text, 'html.parser')
    anwer_header = '''Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Connection: keep-alive
Content-Length: 286
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Cookie: csrftoken=y3kv3hZKZaTpwqJSCjVmIwYC2hmEpmA5; UM_distinctid=16d3de3827291-038504f1865a1-133f6b55-1fa400-16d3de38273303; CNZZDATA1254048558=612160265-1568698218-%7C1568698218; sessionid=b951c20d99d92f0a3c6bc27b53a38aff; Hm_lvt_435408cf352a14d68ef6861b9d51158c=1568701383,1568702018; Hm_lpvt_435408cf352a14d68ef6861b9d51158c=1568705230
Host: www.yooc.me
Origin: https://www.yooc.me
Referer: https://www.yooc.me/group/42238/exam/81563/detail
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36
X-CSRFToken: y3kv3hZKZaTpwqJSCjVmIwYC2hmEpmA5
X-Requested-With: XMLHttpRequest'''
    data = {
        "answers": [
            {
                "14354830": {"1": ["0"]}
            }
        ]
        # "answers": '''
        # [{"14354767": {"1": ["0"]}}]"
        # '''
    }
    conn = pymysql.connect(host='127.0.0.1', port=3306, user='root', passwd='123456', db='thinking', charset='utf8')
    cur = conn.cursor()  # returns rows as tuples by default
    for i in s.find("div",{"class":"exam-detial-container"}).find_all("div","question-board"):
        # Get the question id
        question_id = i.get("id").split("-")[1]
        question_name = i.find("p", {"class": "q-cnt"}).find("p", {"class": "q-cnt"}).text
        # Get the letter of the correct answer
        anwer_select = i.find("div", {"class": "the-ans"}).p.text.split(":")[1]
        # if anwer == "A":
        #     anwer = "0"
        # if anwer == "B":
        #     anwer = "1"
        # if anwer == "C":
        #     anwer = "2"
        # if anwer == "D":
        #     anwer = "3"
        # question_name = question_name.replace("("," ")
        # question_name = question_name.replace(")", " ")
        # question_name = question_name.replace("(", " ")
        # question_name = question_name.replace(")", " ")
        # Look up the option text that matches the answer letter
        anwer = ""
        for selct in i.find("ul", {"class": "single-ans"}).find_all("li"):
            if selct.text.split(". ")[0] == anwer_select:
                # Split once only, so option text containing ". " is kept whole
                anwer = selct.text.split(". ", 1)[1]
                print(anwer)
                break
        # Check whether the question is already stored
        sql = "select * from an where question = %s"
        para = (question_name,)
        counts = cur.execute(sql, para)
        if counts >= 1:
            continue
        # # for j in i.find_all("label"):
        # #     answer.append(j.text)
        print("{0},{1},{2}".format(question_id, question_name, anwer))
        sql = "insert into an values(%s,%s,%s)"
        para = (question_id, question_name, anwer)
        cur.execute(sql, para)
        conn.commit()  # commit to the database
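The script's inner loop, which maps the answer letter to the text of the matching option, can be pulled out into a small helper and checked without any network or database access. The following is a minimal sketch, not part of the original script: the name `pick_option_text` and the assumption that every option string looks like `"A. some text"` are illustrative only.

```python
def pick_option_text(answer_letter, option_lines):
    """Return the text of the option whose label equals answer_letter.

    Assumes (hypothetically) that each entry of option_lines looks like
    "A. some text". Returns an empty string when no label matches.
    """
    wanted = answer_letter.strip()
    for line in option_lines:
        # partition splits on the first ". " only, so text that itself
        # contains ". " stays intact; sep is empty if ". " is missing.
        label, sep, text = line.strip().partition(". ")
        if sep and label == wanted:
            return text
    return ""
```

For example, `pick_option_text("B", ["A. Paris", "B. St. Petersburg"])` yields `"St. Petersburg"`, and an unknown letter yields an empty string instead of raising an `IndexError`.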