Upload files to backend
73
backend/README.md
Normal file
@@ -0,0 +1,73 @@
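# WeChat Official Account Article Crawler (Go version)

A tool written in Go for crawling WeChat Official Account articles. It automatically fetches the full article list and the detailed content of each article for a given account.
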
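## Features

- Fetch the full article list of an Official Account
- Fetch the detailed content of each article
- Fetch per-article statistics such as read count, like count, and share count
- Fetch article comments
- Automatically save the article list and article content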

## Requirements

- Go 1.20 or later
- Windows (the scripts are optimized for Windows)

## Installation and usage

### 1. Configure the cookie

- Rename `cookie.txt.example` to `cookie.txt`
- Follow the instructions in that file to obtain your WeChat Official Accounts Platform cookie
- Paste the cookie into `cookie.txt`
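
To illustrate what the program then has to do with that file, here is a minimal sketch of reading `cookie.txt` and splitting it into name/value pairs. It assumes the file holds a single `name=value; name=value;` string, as in the format example printed by `run.bat`. The function names `parseCookie` and `loadCookieFile` are hypothetical; the project's actual parsing code is not shown in this commit.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseCookie splits a raw cookie string such as
// "__biz=xxx; uin=xxx; key=xxx;" into a name -> value map.
// Empty segments (for example after a trailing ';') are skipped.
func parseCookie(raw string) map[string]string {
	cookies := make(map[string]string)
	for _, part := range strings.Split(raw, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		// Cut at the first '=' only, because values may themselves contain '='.
		name, value, found := strings.Cut(part, "=")
		if !found || name == "" {
			continue
		}
		cookies[strings.TrimSpace(name)] = strings.TrimSpace(value)
	}
	return cookies
}

// loadCookieFile reads the cookie file and returns its parsed contents.
// It fails if the file is missing or contains no name=value pairs.
func loadCookieFile(path string) (map[string]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	cookies := parseCookie(string(data))
	if len(cookies) == 0 {
		return nil, fmt.Errorf("%s contains no cookie pairs", path)
	}
	return cookies, nil
}

func main() {
	cookies, err := loadCookieFile("cookie.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("loaded %d cookie values\n", len(cookies))
}
```

Failing early on an empty or malformed file gives a clearer message than a rejected request later in the crawl.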

### 2. Run the program

Double-click `run.bat`. The script automatically:
- Downloads the required dependencies
- Compiles the Go program
- Runs the crawler

## Project structure

```
backend/
├── cmd/
│   └── main.go              # Main program entry point
├── configs/
│   └── config.go            # Configuration management
├── pkg/
│   ├── utils/               # Utility functions
│   │   └── utils.go
│   └── wechat/              # WeChat-related functionality
│       └── access_articles.go
├── data/                    # Data storage directory
├── cookie.txt               # Cookie file (must be created manually)
├── go.mod                   # Go module definition
├── run.bat                  # Windows startup script
└── README.md                # Usage instructions
```

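## Notes

1. Before using this tool, make sure you have permission to access the Official Account in question
2. Comply with applicable laws and regulations, and use this tool responsibly
3. Frequent requests may trigger WeChat's anti-crawling mechanisms, so limit the crawl rate
4. WeChat's interfaces may change, so the tool may need to be adjusted accordingly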
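
## FAQ

### Q: What if obtaining the cookie fails?
A: Make sure you are logged in to the WeChat Official Accounts Platform and that you copied the complete cookie from the browser developer tools.

### Q: What if a network error occurs while crawling?
A: The tool handles simple network errors automatically; make sure your network connection is working. If failures persist, WeChat's interface may have changed.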
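
### Q: How do I change which Official Account is crawled?
A: The tool automatically reads the Official Account information accessible to the currently logged-in user from the cookie. To crawl a different account, switch accounts on the WeChat Official Accounts Platform and obtain a new cookie.

## License

This project is for learning and research purposes only.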
BIN
backend/main.exe
Normal file
Binary file not shown.
BIN
backend/main.exe~
Normal file
Binary file not shown.
48
backend/run.bat
Normal file
@@ -0,0 +1,48 @@
@echo off

echo WeChat Public Article Crawler Startup Script
echo =================================

REM Check if cookie.txt file exists
if not exist "cookie.txt" (
    echo Error: cookie.txt file not found!
    echo Please create cookie.txt file in backend directory and add WeChat public platform cookie information.
    echo.
    echo cookie.txt format example:
    echo __biz=xxx; uin=xxx; key=xxx; pass_ticket=xxx;
    echo.
    pause
    exit /b 1
)

REM Set Go environment variables (if needed)
REM set GOPATH=%USERPROFILE%\go
REM set GOROOT=C:\Go
REM set PATH=%PATH%;%GOROOT%\bin;%GOPATH%\bin

echo Downloading dependencies...
go mod tidy
if %errorlevel% neq 0 (
    echo Failed to download dependencies!
    pause
    exit /b 1
)

echo Compiling program...
go build -o output\wechat-crawler.exe cmd\main.go
if %errorlevel% neq 0 (
    echo Compilation failed!
    pause
    exit /b 1
)

echo Compilation successful! Starting program...
echo.

REM Ensure data directory exists
if not exist "data" mkdir data

REM Run the program
output\wechat-crawler.exe

pause
57
backend/run_article_link.bat
Normal file
@@ -0,0 +1,57 @@
@echo off

rem WeChat Official Account Article Crawler - Script for crawling via article link
setlocal enabledelayedexpansion

REM If a command-line argument was passed, use it as the article link
if not "%~1"=="" (
    set "ARTICLE_LINK=%~1"
    goto run_crawler
)

REM Otherwise fall through to interactive mode.
REM The label is kept outside any parenthesized block, where cmd.exe handles labels reliably.
:input_loop
cls
echo ========================================
echo WeChat Official Account Article Crawler
echo ========================================
echo.
echo Please enter WeChat article link:
echo Example: https://mp.weixin.qq.com/s/4r_LKJu0mOeUc70ZZXK9LA
set "ARTICLE_LINK="
set /p ARTICLE_LINK=

if "!ARTICLE_LINK!"=="" (
    echo.
    echo Error: Article link cannot be empty!
    pause
    goto input_loop
)

:run_crawler
echo.
echo Compiling and running...
go run "cmd/main.go" "!ARTICLE_LINK!"

if errorlevel 1 (
    echo.
    echo Failed to run, please check error messages above
    pause
    exit /b 1
)

echo.
echo Crawling completed successfully!
pause
exit /b 0