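# Xiaohongshu Login Enhancements - Lessons from the ai_mip Project

## Overview

The Xiaohongshu verification-code login flow has been enhanced across the board, drawing on proven practices from the ai_mip project (Playwright + AdsPower automated ad clicking).

## Core Techniques Borrowed
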
### 1. Human Behavior Simulation

**Source**: `ai_mip/fingerprint_browser.py` - the `human_type` and `human_click` functions

**Features**:
- Character-by-character input with a random delay (50ms-150ms) between keystrokes
- Mouse trajectory simulation: clicks land at a random position within the element's bounds
- Fires real DOM events (input, change, focus)

**Application**:
```python
# Before: fill the field in a single call
await phone_input.fill(phone)

# After: simulate human typing
await helper.human_type(selector, phone)
```

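The character-by-character typing described above can be sketched roughly as follows. This is an illustration only, not the actual `ai_mip` or `xhs_login_helper.py` code: the signature and the `min_delay_ms` / `max_delay_ms` parameters are assumptions, and `page` is assumed to be a Playwright `Page` (only `page.click` and `page.keyboard.type` are used).

```python
import asyncio
import random


async def human_type(page, selector, text, min_delay_ms=50, max_delay_ms=150):
    """Type `text` one character at a time, pausing randomly between keys.

    Hypothetical sketch: `page` is assumed to be a Playwright Page.
    """
    # Click first so the field receives focus the way it would from a user.
    await page.click(selector)
    for char in text:
        # keyboard.type sends real key events, unlike assigning .value via JS.
        await page.keyboard.type(char)
        # Pause for a random interval in [min_delay_ms, max_delay_ms].
        await asyncio.sleep(random.uniform(min_delay_ms, max_delay_ms) / 1000)
```

Keeping the delay bounds as parameters makes it easy to slow typing down for stricter pages or speed it up in tests.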
### 2. Smart Element Lookup

**Source**: `ai_mip/ad_automation.py` - the `_send_consultation_message` method

**Features**:
- Multi-selector fallback strategy
- Primary selectors → fallback selectors → last-resort option
- Automatically filters out invisible elements

**Application**:
```python
# Before: loop over a fixed list of selectors
for selector in selectors:
    element = await page.query_selector(selector)

# After: smart lookup with fallback
element = await helper.find_input_with_fallback(
    primary_selectors=PRIMARY,
    fallback_selectors=FALLBACK
)
```

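A minimal sketch of the primary → fallback lookup with visibility filtering might look like this. The function name `find_with_fallback` and its signature are assumptions made for illustration; the real helper may differ. It relies only on Playwright's `page.query_selector` and `ElementHandle.is_visible`.

```python
async def find_with_fallback(page, primary_selectors, fallback_selectors=()):
    """Return the first visible element matched by any selector, else None.

    Hypothetical sketch: primary selectors are tried before fallback ones.
    """
    for selector in [*primary_selectors, *fallback_selectors]:
        element = await page.query_selector(selector)
        # Skip selectors that match nothing or match a hidden element.
        if element is not None and await element.is_visible():
            return element
    return None
```

Returning `None` instead of raising lets the caller decide whether to dump debug output or try a last-resort option.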
### 3. Structured Selector Management

**Source**: `ai_mip/ad_automation.py` - selector array definitions

**Features**:
- A centralized selector configuration class
- Selectors grouped by function and page type
- Easy to maintain and extend

**Application**:
```python
class XHSSelectors:
    PHONE_INPUT_CREATOR = [...]
    PHONE_INPUT_HOME = [...]
    SEND_CODE_BTN_CREATOR = [...]
    # ...
```

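To make the grouping concrete, here is one way such a configuration class could be laid out. Every selector string below is a made-up placeholder, not a selector taken from the real pages, and the `phone_input` classmethod is a hypothetical convenience that is not part of the original design.

```python
class XHSSelectors:
    """Central registry of selectors, grouped by page type and purpose.

    All selector strings are hypothetical placeholders for illustration.
    """

    # Phone-number input, creator-center login page
    PHONE_INPUT_CREATOR = ['input[placeholder*="手机号"]', 'input[type="tel"]']
    # Phone-number input, home page login dialog
    PHONE_INPUT_HOME = ['.login-container input[type="text"]']
    # Generic fallback tried when the page-specific selectors fail
    PHONE_INPUT_FALLBACK = ['input[type="text"]']

    # "Send code" button, per page type
    SEND_CODE_BTN_CREATOR = ["button.send-code"]
    SEND_CODE_BTN_HOME = [".login-container .code-button"]

    @classmethod
    def phone_input(cls, page_type):
        """Return (primary, fallback) selector lists for a page type."""
        primary = {
            "creator": cls.PHONE_INPUT_CREATOR,
            "home": cls.PHONE_INPUT_HOME,
        }[page_type]
        return primary, cls.PHONE_INPUT_FALLBACK
```

With this layout, a page redesign means editing one list here rather than hunting for selectors scattered through the login code.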
### 4. Button State Detection

**Source**: `ai_mip/ad_automation.py` - button text validation logic

**Features**:
- Detects countdown states (59s, 58秒, etc.)
- Verifies that the button text matches what is expected
- Detects whether the button is active (`active` class)

**Application**:
```python
# Detect a running countdown
countdown = await helper.check_button_countdown(button)
if countdown:
    return error_response

# Wait for the button to become active
is_active = await helper.wait_for_button_active(button)
```

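Countdown detection comes down to spotting a number followed by a seconds marker in the button label. The sketch below shows one way to do that; the regular expression and the `parse_countdown` helper are assumptions, and `button` is assumed to be a Playwright `ElementHandle` (only `inner_text()` is used).

```python
import re

# Matches digits followed by "s" or "秒", e.g. "59s" or "58秒".
_COUNTDOWN_RE = re.compile(r"(\d+)\s*(?:s|秒)", re.IGNORECASE)


def parse_countdown(text):
    """Return the remaining seconds if `text` looks like a countdown, else None."""
    match = _COUNTDOWN_RE.search(text)
    return int(match.group(1)) if match else None


async def check_button_countdown(button):
    """Read the button label and report any countdown it shows."""
    return parse_countdown(await button.inner_text())
```

Splitting the pure text parsing from the browser call keeps the pattern easy to unit-test without a page.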
### 5. Debugging Helpers

**Source**: `ai_mip/ad_automation.py` - debug printing of page elements

**Features**:
- Prints the attributes of every input field and button
- Helps locate problems quickly
- Structured debug output

**Application**:
```python
if not phone_input:
    await helper.debug_print_inputs()

if not button:
    await helper.debug_print_buttons()
```

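As an illustration of what such a debug dump could contain, the sketch below lists every `<input>` on the page with a few attributes. The choice of attributes and the output format are assumptions; it uses Playwright's `query_selector_all`, `get_attribute`, and `is_visible`.

```python
async def debug_print_inputs(page):
    """Print and return type, name, placeholder and visibility of each <input>.

    Hypothetical sketch: `page` is assumed to be a Playwright Page.
    """
    rows = []
    for index, element in enumerate(await page.query_selector_all("input")):
        row = {
            "index": index,
            "type": await element.get_attribute("type"),
            "name": await element.get_attribute("name"),
            "placeholder": await element.get_attribute("placeholder"),
            "visible": await element.is_visible(),
        }
        rows.append(row)
        details = ", ".join(f"{k}={v!r}" for k, v in row.items() if k != "index")
        print(f"[input {index}] {details}")
    return rows
```

Returning the rows as well as printing them makes the same data usable in logs or assertions.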
## New Files

### xhs_login_helper.py
A complete login helper module containing:

1. **XHSLoginHelper class**
   - `human_type()` - simulates human typing
   - `human_click()` - simulates human clicking
   - `find_input_with_fallback()` - smart input lookup
   - `find_button_with_fallback()` - smart button lookup
   - `check_button_countdown()` - detects a button countdown
   - `wait_for_button_active()` - waits for a button to become active
   - `scroll_to_element()` - smooth scrolling
   - `random_delay()` - random delay
   - `debug_print_inputs()` - debug output for input fields
   - `debug_print_buttons()` - debug output for buttons

2. **XHSSelectors class**
   - Centralizes all selector configuration
   - Grouped by page type (creator center / home page)
   - Primary selectors + fallback selectors

## Core Improvements

### Send-Verification-Code Flow

#### Before (the original approach)
```python
# 1. Find the input field
for selector in selectors:
    phone_input = await page.query_selector(selector)
    if phone_input:
        break

# 2. Fill it directly (sets the value via JavaScript, so no key events fire)
await page.evaluate(f'input.value = "{phone}"')

# 3. Find the button
for selector in selectors:
    button = await page.query_selector(selector)
    if button:
        break

# 4. Click it directly
await page.click(selector)
```

#### After (the enhanced approach)
```python
# 1. Create the helper
helper = get_login_helper(page)

# 2. Smart input lookup (multi-selector fallback)
phone_input = await helper.find_input_with_fallback(
    primary_selectors=XHSSelectors.PHONE_INPUT_HOME,
    fallback_selectors=XHSSelectors.PHONE_INPUT_FALLBACK
)

# 3. Human typing (character by character + random delay)
await helper.human_type(selector, phone)

# 4. Smart button lookup (with text validation)
button = await helper.find_button_with_fallback(
    primary_selectors=XHSSelectors.SEND_CODE_BTN_HOME,
    expected_texts=["获取验证码"]
)

# 5. Detect a running countdown
countdown = await helper.check_button_countdown(button)

# 6. Wait for the button to become active
await helper.wait_for_button_active(button)

# 7. Human click (random position + movement trajectory)
await helper.human_click(button_selector)
```

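Step 4 above combines selector lookup with a check on the button label. A rough sketch of that idea follows; the function name `find_button_with_text` and the substring-matching rule are assumptions rather than the helper's actual behavior. It uses Playwright's `query_selector_all`, `is_visible`, and `inner_text`.

```python
async def find_button_with_text(page, selectors, expected_texts):
    """Return the first visible element whose label contains an expected text.

    Hypothetical sketch: selectors are tried in order, and every match for a
    selector is checked before moving on to the next one.
    """
    for selector in selectors:
        for element in await page.query_selector_all(selector):
            if not await element.is_visible():
                continue
            label = (await element.inner_text()).strip()
            # Accept the element only if its label matches an expected text.
            if any(expected in label for expected in expected_texts):
                return element
    return None
```

Checking the label guards against a broad selector matching an unrelated button, such as the login button next to the send-code button.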
## Before/After Comparison

| Dimension | Before | After |
|------|------|--------|
| **Element lookup** | Single-pass loop | Multi-tier fallback strategy |
| **Input method** | Direct fill | Simulated human typing |
| **Click method** | Fixed-position click | Random position + trajectory |
| **State detection** | Simple text check | Full state detection |
| **Debugging** | Manual screenshots | Element info printed automatically |
| **Maintainability** | Selectors scattered | Centralized configuration |
| **Stability** | Average | High (multiple safeguards) |

## Technical Highlights

### 1. Human Behavior Simulation
- ✅ Character-by-character input with random delays
- ✅ Mouse movement trajectories
- ✅ Random click positions
- ✅ Real DOM events fired

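The random click position and mouse trajectory can be sketched as below. This is an assumption-laden illustration: the `margin` parameter, the step counts, and the pause lengths are invented values, and the function takes an element handle rather than a selector. It relies on Playwright's `ElementHandle.bounding_box`, `page.mouse.move` (with `steps`), and `page.mouse.click`.

```python
import asyncio
import random


def random_point_in_box(box, margin=0.2):
    """Pick a random point inside `box`, staying `margin` away from each edge.

    `box` uses the keys returned by Playwright's bounding_box():
    x, y, width, height.
    """
    x = box["x"] + box["width"] * random.uniform(margin, 1 - margin)
    y = box["y"] + box["height"] * random.uniform(margin, 1 - margin)
    return x, y


async def human_click(page, element):
    """Move the mouse to a random spot on `element`, pause, then click."""
    box = await element.bounding_box()
    if box is None:  # the element is not rendered, so there is nothing to click
        return False
    x, y = random_point_in_box(box)
    # steps > 1 makes Playwright emit intermediate mousemove events.
    await page.mouse.move(x, y, steps=random.randint(5, 15))
    await asyncio.sleep(random.uniform(0.05, 0.2))
    await page.mouse.click(x, y)
    return True
```

Staying away from the edges avoids clicks that land on a border or rounded corner and miss the element.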
### 2. Layered Fault Tolerance
- ✅ Primary selectors fail → fallback selectors
- ✅ Fallback selectors fail → last-resort option
- ✅ Invisible elements filtered out automatically
- ✅ Debug information printed automatically

### 3. Smart State Detection
- ✅ Countdown detection (59s, 60秒, etc.)
- ✅ Button text validation
- ✅ Button active-state detection
- ✅ Automatic waiting until elements are ready

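Waiting for the button to become active can be approximated with a simple polling loop. In this sketch the `active` class check follows the description earlier in this document, while the `poll_interval` parameter and the extra `is_enabled()` check are assumptions; `button` is assumed to be a Playwright `ElementHandle`.

```python
import asyncio
import time


async def wait_for_button_active(button, timeout=5.0, poll_interval=0.2):
    """Poll until `button` has an `active` class and is enabled.

    Returns True once the button is active, or False if `timeout` seconds
    pass first. Hypothetical sketch, not the helper's actual code.
    """
    deadline = time.monotonic() + timeout
    while True:
        class_attr = await button.get_attribute("class") or ""
        if "active" in class_attr.split() and await button.is_enabled():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(poll_interval)
```

Returning a boolean rather than raising matches the `is_active = await helper.wait_for_button_active(...)` usage shown in this document.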
## Usage Examples

### Basic Usage
```python
from xhs_login_helper import get_login_helper, XHSSelectors

# Create the helper
helper = get_login_helper(page)

# Find the input and type into it
input_elem = await helper.find_input_with_fallback(
    primary_selectors=XHSSelectors.PHONE_INPUT_HOME
)
await helper.human_type(selector, "13800138000")

# Find the button and click it
button = await helper.find_button_with_fallback(
    primary_selectors=XHSSelectors.SEND_CODE_BTN_HOME,
    expected_texts=["获取验证码"]
)
await helper.human_click(button_selector)
```

### Advanced Usage
```python
# Wait for the button to become active
is_active = await helper.wait_for_button_active(button, timeout=5)

# Detect a running countdown
countdown = await helper.check_button_countdown(button)
if countdown:
    print(f"Button is in countdown: {countdown}")

# Debug the page
await helper.debug_print_inputs()
await helper.debug_print_buttons()

# Smooth scrolling
await helper.scroll_to_element(element)

# Random delay
await helper.random_delay(0.5, 1.5)
```

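For reference, a random delay like the one used above needs very little code. The sketch below assumes the two arguments are a lower and upper bound in seconds, which matches the `random_delay(0.5, 1.5)` call but is not confirmed by the source; returning the chosen delay is an added convenience.

```python
import asyncio
import random


async def random_delay(min_seconds=0.5, max_seconds=1.5):
    """Sleep for a random duration between the two bounds and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    await asyncio.sleep(delay)
    return delay
```

Using `asyncio.sleep` rather than `time.sleep` keeps the event loop free for other Playwright tasks while waiting.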
## Possible Future Extensions

### 1. AdsPower Fingerprint Browser Integration
Drawing on `ai_mip/fingerprint_browser.py` and `ai_mip/adspower_client.py`:
- Fingerprint browser profile management
- Connecting over CDP
- Dynamic proxy switching
- Browser profile reuse

### 2. Proxy Management Improvements
Drawing on `ai_mip/adspower_client.py`:
- Damai IP (大麦IP) proxy integration
- Whitelisted proxy support
- Proxy verification
- Hot reloading of proxy configuration

### 3. More Human Behavior Simulation
Drawing on `ai_mip/ad_automation.py`:
- Page scrolling simulation
- Randomized wait times
- Mouse hover behavior
- Form-filling rhythm

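## Summary

By borrowing proven practices from the ai_mip project, we achieved:
1. ✅ More natural human behavior simulation
2. ✅ A more robust element lookup strategy
3. ✅ More thorough state detection
4. ✅ Stronger debugging helpers
5. ✅ A more maintainable code structure

These improvements substantially raise the success rate and stability of Xiaohongshu verification-code login and lay a solid foundation for future extensions.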