init commit

DATA_SCHEMA.md (new file, 286 lines)
# Quantitative Trading Data Schema (Proposed)

This document defines a data model that serves both local research (SQLite) and production deployment (PostgreSQL). Goals: reproducibility, extensibility, and easy backtesting and real-time monitoring, covering market data, fundamentals, news sentiment, signals, orders, executions, positions, and risk metrics.

## Design Principles

- Unified time: all timestamps are UTC; columns are named `*_utc` or `as_of_date` (UTC date).
- Stable primary keys: prefer composite natural business keys (e.g. `(symbol_id, ts_utc)`); use surrogate keys (auto-increment or UUID) for cross-system alignment.
- Numeric conventions:
  - Prices/amounts use `NUMERIC(18,6)` (PostgreSQL) or `REAL`/`DECIMAL` (SQLite);
  - Percent changes and other ratios are stored as decimals: e.g. 4.02% is stored as `0.0402`.
- Idempotent writes: all ingestion/computation tables support upsert via a unique key plus `ON CONFLICT DO UPDATE` (PostgreSQL) or `INSERT OR REPLACE` (SQLite).
- Partitioning and retention: partition high-frequency tables by time (PostgreSQL) and define retention policies (e.g. keep ticks for 30 days, 1-minute bars for 180 days, and daily bars indefinitely).
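The idempotent-write principle above can be sketched with SQLite's `ON CONFLICT` clause (available in SQLite ≥ 3.24, which ships with modern Python). This is a minimal sketch, not project code; the sample row values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bars_1d (
        symbol_id INTEGER NOT NULL,
        as_of_date TEXT NOT NULL,
        close REAL NOT NULL,
        source TEXT,
        PRIMARY KEY (symbol_id, as_of_date)
    )
""")

row = (1, "2025-11-25", 276.5, "eastmoney")
# Running the same upsert twice leaves exactly one row: the unique key
# (symbol_id, as_of_date) makes the write idempotent.
for _ in range(2):
    conn.execute(
        """
        INSERT INTO bars_1d (symbol_id, as_of_date, close, source)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(symbol_id, as_of_date) DO UPDATE SET
            close = excluded.close,
            source = excluded.source
        """,
        row,
    )

count = conn.execute("SELECT COUNT(*) FROM bars_1d").fetchone()[0]
print(count)  # 1
```

Re-running an ingestion job therefore overwrites rather than duplicates rows, which is exactly the property backtests need.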
## Tables and Fields

### 1) `symbols` (instrument master)

- Purpose: unified metadata for stocks/ETFs.
- Primary key: `id` (auto-increment/UUID); unique constraint: `(symbol, exchange)`.
- Fields:
  - `id` PK
  - `symbol` ticker (e.g. `AAPL`)
  - `name`
  - `exchange` (e.g. `NASDAQ`)
  - `currency` (e.g. `USD`)
  - `tick_size`, `lot_size`
  - `sector`, `industry`
  - `is_active` boolean
  - `first_seen_utc`, `last_seen_utc`

### 2) `calendars` (trading calendar)

- Primary key: `(exchange, date)`.
- Fields: `is_trading_day`, `open_time_utc`, `close_time_utc`, `notes`.

### 3) `bars_1m` (1-minute bars)

- Primary key: `(symbol_id, ts_utc)` (`ts_utc` is the start of the minute).
- Indexes: `(ts_utc)`, `(symbol_id, ts_utc DESC)`.
- Fields: `open`, `high`, `low`, `close`, `volume`, `vwap`, `trades_count`, `source`.

### 4) `bars_1d` (daily bars)

- Primary key: `(symbol_id, as_of_date)`.
- Fields: `open`, `high`, `low`, `close`, `adj_close`, `volume`, `dividend`, `split_ratio`, `source`.

### 5) `ticks` (tick/Level-1 snapshots, optional)

- Primary key: `id` (auto-increment/UUID); suggested unique key: `(symbol_id, ts_utc, source, seq)`.
- Fields: `price`, `size`, `bid`, `ask`, `bid_size`, `ask_size`, `condition`, `seq`, `source`.

### 6) `corporate_actions`

- Primary key: `(symbol_id, ex_date, type)`.
- Fields: `type` (`split`/`dividend`/...), `amount`, `ratio`, `currency`, `notes`.

### 7) `fundamentals_snapshot`

- Primary key: `(symbol_id, as_of_date)`.
- Example fields: `market_cap`, `pe_ttm`, `ps_ttm`, `pb`, `eps_ttm`, `revenue_ttm`, `shares_outstanding`, `updated_at_utc`.

### 8) `news` and `news_symbols` (news and its link table)

- `news` primary key: `id` (UUID/auto-increment); fields: `published_at_utc`, `source`, `title`, `url`, `summary`, `sentiment_score`, `topics`.
- `news_symbols` primary key: `(news_id, symbol_id)`.

### 9) `signals` (strategy signals)

- Primary key: `id` (UUID/auto-increment); suggested unique key: `(symbol_id, generated_at_utc, model_name, version)`.
- Fields:
  - `symbol_id`, `generated_at_utc`
  - `signal_type` (e.g. `momentum`/`reversal`)
  - `direction` (`BUY`/`SELL`/`HOLD`)
  - `score` (0-1 or a z-score)
  - `horizon` (e.g. `1d`/`1h`)
  - `params_json` (strategy parameters as JSON)
  - `model_name`, `version`
  - `expires_at_utc` (expiry time, nullable)

### 10) `orders`

- Primary key: `id`; suggested unique key: `broker_order_id` (when connected to a live broker).
- Fields: `signal_id`, `symbol_id`, `side`, `order_type`, `qty`, `price`, `time_in_force`, `status`, `created_at_utc`, `updated_at_utc`, `broker_order_id`.

### 11) `executions` (fills/acknowledgements)

- Primary key: `id`; indexes: `(order_id)`, `(exec_time_utc)`.
- Fields: `order_id`, `exec_time_utc`, `price`, `qty`, `fee`, `liquidity` (`maker`/`taker`).

### 12) `positions` (position snapshots)

- Primary key: `(portfolio_id, symbol_id)`, or add `as_of_date` for an end-of-day table.
- Fields: `qty`, `avg_cost`, `unrealized_pnl`, `realized_pnl`, `last_updated_utc`.

### 13) `portfolios` / `portfolio_nav_daily` (portfolios and NAV)

- `portfolios`: `id` PK, `name`, `base_currency`, `created_at_utc`.
- `portfolio_nav_daily` primary key: `(portfolio_id, as_of_date)`; fields: `cash`, `equity_value`, `nav`, `daily_return`, `gross_exposure`, `net_exposure`.

### 14) `risk_metrics_daily` (risk metrics)

- Primary key: `(portfolio_id, as_of_date)`; fields: `var_95`, `beta`, `sharpe`, `max_drawdown`, `volatility_20d`, etc.

### 15) `etl_runs` (job run metadata)

- Primary key: `run_id`; fields: `task_name`, `started_at_utc`, `finished_at_utc`, `status`, `rows_affected`, `checksum`.
## Example PostgreSQL DDL (core tables)

```sql
-- 1) instruments
CREATE TABLE symbols (
    id BIGSERIAL PRIMARY KEY,
    symbol TEXT NOT NULL,
    name TEXT,
    exchange TEXT NOT NULL,
    currency TEXT DEFAULT 'USD',
    tick_size NUMERIC(18,6),
    lot_size NUMERIC(18,6),
    sector TEXT,
    industry TEXT,
    is_active BOOLEAN DEFAULT TRUE,
    first_seen_utc TIMESTAMPTZ,
    last_seen_utc TIMESTAMPTZ,
    UNIQUE(symbol, exchange)
);

-- 2) 1-minute bars
CREATE TABLE bars_1m (
    symbol_id BIGINT NOT NULL REFERENCES symbols(id),
    ts_utc TIMESTAMPTZ NOT NULL,
    open NUMERIC(18,6) NOT NULL,
    high NUMERIC(18,6) NOT NULL,
    low NUMERIC(18,6) NOT NULL,
    close NUMERIC(18,6) NOT NULL,
    volume BIGINT,
    vwap NUMERIC(18,6),
    trades_count INTEGER,
    source TEXT,
    PRIMARY KEY(symbol_id, ts_utc)
);
CREATE INDEX ON bars_1m (ts_utc);
CREATE INDEX ON bars_1m (symbol_id, ts_utc DESC);

-- 3) daily bars
CREATE TABLE bars_1d (
    symbol_id BIGINT NOT NULL REFERENCES symbols(id),
    as_of_date DATE NOT NULL,
    open NUMERIC(18,6) NOT NULL,
    high NUMERIC(18,6) NOT NULL,
    low NUMERIC(18,6) NOT NULL,
    close NUMERIC(18,6) NOT NULL,
    adj_close NUMERIC(18,6),
    volume BIGINT,
    dividend NUMERIC(18,6),
    split_ratio NUMERIC(18,6),
    source TEXT,
    PRIMARY KEY(symbol_id, as_of_date)
);

-- 4) signals (store ratios such as percent change as decimals)
CREATE TABLE signals (
    id BIGSERIAL PRIMARY KEY,
    symbol_id BIGINT NOT NULL REFERENCES symbols(id),
    generated_at_utc TIMESTAMPTZ NOT NULL,
    signal_type TEXT NOT NULL,
    direction TEXT NOT NULL CHECK (direction IN ('BUY','SELL','HOLD')),
    score NUMERIC(18,6),
    horizon TEXT,
    params_json JSONB,
    model_name TEXT,
    version TEXT,
    expires_at_utc TIMESTAMPTZ,
    UNIQUE(symbol_id, generated_at_utc, model_name, version)
);
CREATE INDEX ON signals (symbol_id, generated_at_utc DESC);

-- 5) orders / executions
CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    signal_id BIGINT REFERENCES signals(id),
    symbol_id BIGINT NOT NULL REFERENCES symbols(id),
    side TEXT NOT NULL CHECK (side IN ('BUY','SELL')),
    order_type TEXT NOT NULL CHECK (order_type IN ('MKT','LMT')),
    qty NUMERIC(18,6) NOT NULL,
    price NUMERIC(18,6),
    time_in_force TEXT,
    status TEXT NOT NULL,
    broker_order_id TEXT,
    created_at_utc TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at_utc TIMESTAMPTZ
);
CREATE UNIQUE INDEX IF NOT EXISTS orders_broker_unique ON orders(broker_order_id) WHERE broker_order_id IS NOT NULL;

CREATE TABLE executions (
    id BIGSERIAL PRIMARY KEY,
    order_id BIGINT NOT NULL REFERENCES orders(id),
    exec_time_utc TIMESTAMPTZ NOT NULL,
    price NUMERIC(18,6) NOT NULL,
    qty NUMERIC(18,6) NOT NULL,
    fee NUMERIC(18,6),
    liquidity TEXT
);
CREATE INDEX ON executions(order_id);
CREATE INDEX ON executions(exec_time_utc);
```
## Example SQLite DDL (simplified)

```sql
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT NOT NULL,
    name TEXT,
    exchange TEXT NOT NULL,
    currency TEXT,
    tick_size REAL,
    lot_size REAL,
    sector TEXT,
    industry TEXT,
    is_active INTEGER DEFAULT 1,
    first_seen_utc TEXT,
    last_seen_utc TEXT,
    UNIQUE(symbol, exchange)
);

CREATE TABLE bars_1m (
    symbol_id INTEGER NOT NULL,
    ts_utc TEXT NOT NULL,
    open REAL NOT NULL,
    high REAL NOT NULL,
    low REAL NOT NULL,
    close REAL NOT NULL,
    volume INTEGER,
    vwap REAL,
    trades_count INTEGER,
    source TEXT,
    PRIMARY KEY(symbol_id, ts_utc)
);

CREATE TABLE signals (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol_id INTEGER NOT NULL,
    generated_at_utc TEXT NOT NULL,
    signal_type TEXT NOT NULL,
    direction TEXT NOT NULL,
    score REAL,
    horizon TEXT,
    params_json TEXT,
    model_name TEXT,
    version TEXT,
    expires_at_utc TEXT,
    UNIQUE(symbol_id, generated_at_utc, model_name, version)
);
```
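DDL like the above can be applied in one call with Python's built-in `sqlite3` driver. A minimal sketch (only a trimmed-down `symbols` table is shown, and the inserted row is made up):

```python
import sqlite3

DDL = """
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT NOT NULL,
    name TEXT,
    exchange TEXT NOT NULL,
    UNIQUE(symbol, exchange)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)  # runs every statement in the script

conn.execute(
    "INSERT INTO symbols (symbol, name, exchange) VALUES (?, ?, ?)",
    ("AAPL", "Apple", "NASDAQ"),
)
row = conn.execute("SELECT id, symbol FROM symbols").fetchone()
print(row)  # (1, 'AAPL')
```

`executescript` is convenient for schema setup because it accepts multiple `;`-separated statements, unlike `execute`.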
## Typical Indexes and Queries

- Latest signal per symbol (PostgreSQL `DISTINCT ON`):

```sql
SELECT DISTINCT ON (s.symbol_id)
       s.*
FROM signals s
ORDER BY s.symbol_id, s.generated_at_utc DESC;
```

- Join 1-minute bars to signals (bars within 30 minutes after each signal):

```sql
SELECT b.*
FROM signals s
JOIN bars_1m b
  ON b.symbol_id = s.symbol_id
 AND b.ts_utc BETWEEN s.generated_at_utc AND s.generated_at_utc + INTERVAL '30 minutes'
WHERE s.generated_at_utc >= NOW() - INTERVAL '1 day';
```

- Daily returns (using `close` only):

```sql
SELECT symbol_id,
       as_of_date,
       close / LAG(close) OVER (PARTITION BY symbol_id ORDER BY as_of_date) - 1 AS daily_return
FROM bars_1d;
```
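The daily-return query above also runs on SQLite, since window functions such as `LAG` are supported in SQLite ≥ 3.25. A minimal sketch with two made-up closing prices:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bars_1d ("
    " symbol_id INTEGER, as_of_date TEXT, close REAL,"
    " PRIMARY KEY (symbol_id, as_of_date))"
)
conn.executemany(
    "INSERT INTO bars_1d VALUES (?, ?, ?)",
    [(1, "2025-11-24", 100.0), (1, "2025-11-25", 104.0)],
)
rows = conn.execute("""
    SELECT symbol_id, as_of_date,
           close / LAG(close) OVER (PARTITION BY symbol_id ORDER BY as_of_date) - 1
    FROM bars_1d
    ORDER BY symbol_id, as_of_date
""").fetchall()
# The first day has no prior close, so LAG yields NULL and the return is None.
print(rows[0][2])            # None
print(round(rows[1][2], 6))  # 0.04
```

Note the returned ratio is already in the document's decimal convention (0.04, not 4%).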
## Retention and Partitioning Suggestions (PostgreSQL)

- `ticks`: partition by day; retain 30-90 days.
- `bars_1m`: partition by month; retain 180-365 days.
- `bars_1d`, `signals`, `orders`, `executions`: retain indefinitely.
## Integration Notes for This Project

- Numeric convention: ratio fields such as `change_ratio` are stored as decimals (the code has been fixed accordingly); do not write `%` values.
- Storage rollout:
  - Initially: a single SQLite file, convenient for development and backtesting;
  - Later: PostgreSQL with partitioning and materialized views for derived metrics (e.g. aggregated minute bars).
- Idempotent ETL: ingestion jobs upsert on `(symbol_id, ts_utc)` or `(symbol_id, as_of_date)` to avoid duplicate rows that would bias backtests.
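The decimal convention above implies converting percent strings from data sources at write time. A tiny sketch (the helper name `to_decimal_ratio` is hypothetical, and it assumes the source reports values like `"4.02%"`):

```python
def to_decimal_ratio(text: str) -> float:
    """Convert a percent string such as '4.02%' or '-3.73%' to a decimal ratio.

    Hypothetical helper illustrating the storage convention: 4.02% -> 0.0402.
    """
    return float(text.strip().rstrip("%")) / 100.0

print(round(to_decimal_ratio("4.02%"), 6))   # 0.0402
print(round(to_decimal_ratio("-3.73%"), 6))  # -0.0373
```

Storing the decimal form keeps downstream math (compounding, thresholds) free of ad-hoc `/100` conversions.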
## Future Extensions

- `features_*`: wide factor/feature tables, split by frequency (minute/daily).
- `models_registry`: model registration and version tracking.
- `backtest_runs`: backtest jobs and metric results (returns, drawdown, chi-square and other statistical tests, etc.).

---

If desired, I can generate the SQLite database directly from the DDL above and have `monitor.py` upsert the Top N 1-minute bars and signals into it after each polling round, for later backtesting and visualization.
README.md (new file, 75 lines)
# AI Stock Trading Assistant (simulated quant trading system)

This is a simple Python-based quantitative-trading simulator. It automatically fetches real-time US-stock quotes, analyzes them against preset strategies, and simulates trade execution (logging only).

## Features

* **Data fetching**: integrates the Eastmoney API and Futubull web parsing to obtain real-time US-stock prices and percent changes.
* **Automatic monitoring**: `monitor.py` scans the market in a loop and monitors stock movements in real time.
* **Strategy analysis**: a built-in trend-following strategy (percent-change thresholds), extensible with LLM-based analysis.
* **Simulated trading**: generates trade signals and records them to a CSV log; no real money is involved.

## Requirements

* Python 3.6+
* Dependencies:
  * `requests`
  * `beautifulsoup4`

## Installation

1. Clone or download this project.
2. Install the required Python libraries:

```bash
pip install requests beautifulsoup4
```

## Usage

### 1. Start the automatic monitor

Run `monitor.py` to start the fully automatic monitoring loop. The system periodically fetches data, analyzes it, and records trade signals.

```bash
# Default (monitor the Top 100 stocks, 60-second interval)
python monitor.py

# Custom interval (e.g. 30 seconds)
python monitor.py --interval 30

# Monitor more stocks (e.g. Top 200)
python monitor.py --limit 200

# Full-market monitoring (slower; fetches all US stocks)
python monitor.py --all
```
```bash
python monitor.py --limit 10 --interval 5 --premarket --premarket-limit 3
```

After startup, trade records are written to `trade_log.csv` in real time.
### 2. Use the data-fetching tool on its own

`futu.py` can also run as a standalone tool to fetch data and save it as CSV.

```bash
# Fetch the top 50 US stocks by market cap
python futu.py --top50

# Fetch the top 100 and save to a file
python futu.py --top50 --limit 100 --output stocks.csv

# Use the Eastmoney source only (faster)
python futu.py --top50 --eastmoney-only
```

```bash
python premarket_watch.py --limit 10 --force
```
## Project Structure

* `monitor.py`: **main program**. Orchestrates the fetching, analysis, and trading modules and runs the monitoring loop.
* `futu.py`: **data layer**. Contains `EastMoneyAPI` and `FutuStockParser`, which fetch stock data from the network.
* `market_analyzer.py`: **strategy layer**. Takes quote data and generates buy/sell signals per strategy rules (e.g. percent change > 5%).
* `trader.py`: **execution layer**. Takes signals, simulates order placement, and writes the results to the log.
* `trade_log.csv`: **log file**. Records the history of all simulated trades.

## Disclaimer

This project is for learning and research only. All "trades" in the system are simulated; no real funds are involved. Investing carries risk; trade with caution.
ROADMAP.md (new file, 66 lines)
# System Upgrade Roadmap (Roadmap to AI Quant System)

Per the architecture diagram you provided, the current system implements only basic "price monitoring" and "simulated order placement". Reaching the full **AI quantitative trading system** shown in the diagram (industry analysis, LLM-driven decisions, full-market monitoring, and more) requires four upgrade phases:

## Phase 1: Broader and Deeper Data (Data Layer)

`futu.py` currently fetches only price data; the system in the diagram needs many more dimensions.

- [ ] **Full NASDAQ coverage (3,886 stocks)**
  - **Status quo**: only Top N or simple paginated fetching.
  - **Action**: improve `StockDataIntegrator` and maintain a complete NASDAQ constituent list (symbol list) so monitoring has no blind spots.
- [ ] **Pre/post-market data**
  - **Status quo**: some parsing logic exists in the code but is not fully enabled.
  - **Action**: ensure real-time quotes are also available outside US regular trading hours (16:00-21:30 Beijing time).
- [ ] **Unstructured data (news/research reports)**
  - **Status quo**: **missing**.
  - **Action**: build a new scraper module `news_scraper.py`:
    - **Sina Finance / Eastmoney**: per-stock news flashes.
    - **Xueqiu**: community discussion heat.
    - **Sell-side research**: rating changes (upgrade/downgrade) and price targets from Goldman Sachs, Citi, and other institutions.

## Phase 2: Adding the "Brain": LLMs and Cloud Analysis (Intelligence Layer)

This is the core of the diagram's "in-house LLM" and "cloud analysis" components; the current `market_analyzer.py` logic is far too simple.

- [ ] **Integrate an LLM**
  - **Status quo**: only a hard-coded `if change > 5%` rule.
  - **Action**: rework `market_analyzer.py` to call OpenAI (GPT-4), Claude, or a locally deployed DeepSeek/Llama model.
  - **Use cases**:
    - **Sentiment analysis**: feed in news headlines and let the model judge bullish vs. bearish.
    - **Earnings interpretation**: feed in earnings summaries and let the model analyze revenue growth and guidance.
- [ ] **Industry and trend analysis**
  - **Status quo**: individual stocks only.
  - **Action**: add a "sector analysis" module that computes aggregate moves for semiconductors, tech, pharma, and other sectors.
## Phase 3: Architecture Upgrade (Architecture Layer)

Monitoring 3,886 stocks at once is beyond the current single-threaded loop.

- [ ] **High-concurrency async architecture**
  - **Status quo**: synchronous polling (one stock or one page per fetch), high latency.
  - **Action**: rebuild `monitor.py` on Python's `asyncio` and `aiohttp` for highly concurrent fetching, keeping data latency for 3,000+ stocks within seconds.
- [ ] **Database storage**
  - **Status quo**: CSV files.
  - **Action**: introduce **SQLite** or **PostgreSQL** to store historical quotes, news, and AI analysis records for trend analysis and backtesting.
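The concurrent-fetch pattern can be sketched with stdlib `asyncio` alone (the `fetch_quote` stub below is a stand-in for a real `aiohttp` request; symbols and prices are made up):

```python
import asyncio

async def fetch_quote(symbol: str) -> dict:
    # Stand-in for an aiohttp request; real code would await an HTTP call.
    await asyncio.sleep(0.01)
    return {"symbol": symbol, "price": 100.0}

async def fetch_all(symbols: list) -> list:
    # gather() runs all fetches concurrently, so total wall time is roughly
    # one request's latency instead of len(symbols) * latency, and results
    # come back in input order.
    return await asyncio.gather(*(fetch_quote(s) for s in symbols))

quotes = asyncio.run(fetch_all(["AAPL", "MSFT", "NVDA"]))
print([q["symbol"] for q in quotes])  # ['AAPL', 'MSFT', 'NVDA']
```

With a real session (`aiohttp.ClientSession`), a semaphore would usually cap in-flight requests to stay within data-source rate limits.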
## Phase 4: Live Trading Integration (Execution Layer)

The "buy" and "sell" steps in the diagram require a real broker connection.

- [ ] **Broker API integration**
  - **Status quo**: `trader.py` only writes logs.
  - **Action**: integrate a broker API (e.g. **Futu OpenD**, **Tiger Open API**, or the **Interactive Brokers API**).
  - **Features**: real order placement, cancellation, account/cash queries, and position synchronization.

---

## Summary: Suggested Next Steps

Start with **Phase 2**, since it is the defining feature of "AI finance":

1. **Get an LLM API key** (e.g. DeepSeek, OpenAI).
2. **Modify `market_analyzer.py`**:
   * Stop relying on percent change alone.
   * Add a function `analyze_news_sentiment(news_text)` that asks the model whether a news item is bullish.
3. **Create `news_spider.py`** and try scraping a few financial news items as model input.
Binary files added (not shown):

- `__pycache__/data_writer.cpython-310.pyc`, `__pycache__/data_writer.cpython-312.pyc`
- `__pycache__/futu.cpython-310.pyc`, `__pycache__/futu.cpython-312.pyc`
- `__pycache__/logging_setup.cpython-310.pyc`, `__pycache__/logging_setup.cpython-312.pyc`
- `__pycache__/market_analyzer.cpython-310.pyc`, `__pycache__/market_analyzer.cpython-312.pyc`
- `__pycache__/signal_filter.cpython-310.pyc`, `__pycache__/signal_filter.cpython-312.pyc`
- `__pycache__/trader.cpython-310.pyc`, `__pycache__/trader.cpython-312.pyc`
- `__pycache__/utils_id.cpython-310.pyc`, `__pycache__/utils_id.cpython-312.pyc`
- `__pycache__/utils_time.cpython-310.pyc`, `__pycache__/utils_time.cpython-312.pyc`
data/bars_1m.csv (new file, 1701 lines; diff suppressed because it is too large)

data/etl_runs.csv (new file, 128 lines)
run_ts_utc,loop,fetched_count,signal_count,duration_seconds,errors
2025-11-25T10:44:54Z,1,2,2,0.010,0
2025-11-25T10:44:54Z,2,2,0,0.010,0
2025-11-25T10:47:11Z,1,50,24,2.498,0
2025-11-25T10:47:41Z,1,50,24,2.918,0
2025-11-25T11:07:19Z,1,10,6,2.836,0
2025-11-25T11:07:26Z,2,10,0,1.871,0
2025-11-25T11:07:33Z,3,10,0,2.204,0
2025-11-25T11:07:41Z,4,10,0,2.860,0
2025-11-25T11:07:48Z,5,10,0,1.902,0
2025-11-25T11:07:55Z,6,10,0,2.056,0
2025-11-25T11:08:03Z,7,10,0,2.957,0
2025-11-25T11:11:49Z,1,10,6,3.377,0
2025-11-25T11:11:56Z,2,10,0,2.035,0
2025-11-25T11:12:04Z,3,10,0,2.529,0
2025-11-25T11:12:11Z,4,10,0,2.598,0
2025-11-25T11:12:19Z,5,10,0,2.985,0
2025-11-25T11:12:26Z,6,10,0,1.725,0
2025-11-25T11:12:33Z,7,10,0,1.935,0
2025-11-25T11:12:41Z,8,10,0,2.609,0
2025-11-25T11:12:48Z,9,10,0,2.033,0
2025-11-25T11:12:55Z,10,10,0,2.231,0
2025-11-25T11:13:04Z,11,10,0,4.059,0
2025-11-25T11:13:11Z,12,10,0,1.893,0
2025-11-25T11:13:18Z,13,10,0,2.463,0
2025-11-25T11:13:25Z,14,10,0,2.150,0
2025-11-25T11:13:32Z,15,10,0,1.869,0
2025-11-25T11:13:39Z,16,10,0,2.121,0
2025-11-25T11:13:47Z,17,10,0,2.567,0
2025-11-25T11:13:54Z,18,10,0,2.531,0
2025-11-25T11:14:01Z,19,10,0,1.997,0
2025-11-25T11:14:09Z,20,10,0,2.550,0
2025-11-25T11:14:16Z,21,10,0,2.101,0
2025-11-25T11:14:24Z,22,10,0,2.745,0
2025-11-25T11:14:31Z,23,10,0,2.085,0
2025-11-25T11:14:38Z,24,10,0,2.005,0
2025-11-25T11:14:45Z,25,10,0,2.006,0
2025-11-25T11:14:52Z,26,10,0,2.334,0
2025-11-25T11:14:59Z,27,10,0,2.102,0
2025-11-25T11:15:06Z,28,10,0,1.812,0
2025-11-25T11:15:16Z,29,10,0,4.983,0
2025-11-25T11:15:23Z,30,10,0,1.977,0
2025-11-25T11:15:31Z,31,10,0,2.548,0
2025-11-25T11:15:38Z,32,10,0,2.241,0
2025-11-25T11:15:46Z,33,10,0,2.619,0
2025-11-25T11:15:53Z,34,10,0,2.174,0
2025-11-25T11:16:00Z,35,10,0,2.690,0
2025-11-25T11:16:08Z,36,10,0,2.169,0
2025-11-25T11:16:15Z,37,10,0,2.560,0
2025-11-25T11:16:22Z,38,10,0,2.309,0
2025-11-25T11:16:29Z,39,10,0,1.976,0
2025-11-25T11:16:41Z,1,10,6,2.499,0
2025-11-25T11:16:48Z,2,10,0,2.566,0
2025-11-25T11:16:56Z,3,10,0,2.491,0
2025-11-25T11:17:03Z,4,10,0,1.879,0
2025-11-25T11:17:10Z,5,10,0,1.998,0
2025-11-25T11:17:17Z,6,10,0,2.378,0
2025-11-25T11:17:24Z,7,10,0,2.000,0
2025-11-25T11:17:30Z,1,10,7,3.058,0
2025-11-25T11:17:38Z,2,10,0,2.890,0
2025-11-25T11:17:45Z,3,10,0,1.814,0
2025-11-25T11:17:52Z,4,10,0,1.980,0
2025-11-25T11:17:59Z,5,10,0,2.803,0
2025-11-25T11:18:07Z,6,10,0,2.337,0
2025-11-25T11:18:14Z,7,10,0,2.154,0
2025-11-25T11:18:21Z,8,10,0,1.863,0
2025-11-25T11:18:28Z,9,10,0,2.543,0
2025-11-25T11:18:36Z,10,10,0,2.155,0
2025-11-25T11:18:42Z,11,10,0,1.899,0
2025-11-25T11:18:49Z,12,10,0,2.033,0
2025-11-25T11:18:56Z,13,10,0,2.058,0
2025-11-25T11:19:04Z,14,10,0,2.660,0
2025-11-25T11:19:12Z,15,10,0,2.445,0
2025-11-25T11:19:19Z,16,10,0,1.929,0
2025-11-25T11:19:27Z,17,10,0,3.209,0
2025-11-25T11:19:34Z,18,10,0,1.945,0
2025-11-25T11:19:41Z,19,10,0,2.034,0
2025-11-25T11:19:48Z,20,10,0,2.368,0
2025-11-25T11:19:56Z,21,10,0,2.757,0
2025-11-25T11:20:03Z,22,10,0,2.141,0
2025-11-25T11:20:10Z,23,10,0,2.113,0
2025-11-25T11:20:17Z,24,10,0,1.743,0
2025-11-25T11:20:24Z,25,10,0,1.941,0
2025-11-25T11:20:31Z,26,10,0,2.210,0
2025-11-25T11:20:38Z,27,10,0,2.442,0
2025-11-25T11:20:51Z,1,10,6,2.674,0
2025-11-25T11:21:00Z,2,10,0,3.996,0
2025-11-25T11:21:07Z,3,10,0,2.291,0
2025-11-25T11:21:14Z,4,10,0,1.935,0
2025-11-25T11:21:21Z,5,10,0,2.171,0
2025-11-25T11:21:28Z,6,10,0,2.033,0
2025-11-25T11:21:36Z,7,10,0,2.871,0
2025-11-25T11:21:43Z,8,10,0,2.115,0
2025-11-25T11:21:51Z,9,10,0,2.700,0
2025-11-25T11:21:58Z,10,10,0,2.256,0
2025-11-25T11:22:05Z,11,10,0,2.244,0
2025-11-25T11:22:12Z,12,10,0,2.217,0
2025-11-25T11:22:20Z,13,10,0,2.171,0
2025-11-25T11:22:27Z,14,10,0,2.455,0
2025-11-25T11:22:34Z,15,10,0,2.108,0
2025-11-25T11:22:41Z,16,10,0,2.232,0
2025-11-25T11:22:49Z,17,10,0,2.415,0
2025-11-25T11:22:56Z,18,10,0,1.945,0
2025-11-25T11:23:04Z,19,10,0,2.833,0
2025-11-25T11:23:11Z,20,10,0,2.123,0
2025-11-25T11:23:18Z,21,10,0,2.334,0
2025-11-25T11:23:25Z,22,10,0,2.197,0
2025-11-25T11:23:32Z,23,10,0,1.917,0
2025-11-25T11:23:39Z,24,10,0,1.900,0
2025-11-25T11:23:46Z,25,10,0,1.790,0
2025-11-25T11:23:53Z,26,10,0,2.130,0
2025-11-25T11:24:00Z,27,10,0,1.574,0
2025-11-25T11:24:07Z,28,10,0,2.039,0
2025-11-25T11:24:14Z,29,10,0,1.984,0
2025-11-25T11:24:21Z,30,10,0,2.204,0
2025-11-25T11:24:28Z,31,10,0,1.807,0
2025-11-25T11:24:34Z,32,10,0,1.666,0
2025-11-25T11:24:41Z,33,10,0,2.082,0
2025-11-25T11:24:49Z,34,10,0,2.510,0
2025-11-25T11:24:56Z,35,10,0,1.854,0
2025-11-25T11:25:03Z,36,10,0,2.345,0
2025-11-25T11:25:10Z,37,10,0,2.083,0
2025-11-25T11:25:17Z,38,10,0,2.034,0
2025-11-25T11:26:04Z,1,5,4,1.952,0
2025-11-25T11:26:09Z,2,5,0,1.348,0
2025-11-25T11:26:13Z,3,5,0,1.307,0
2025-11-25T11:26:18Z,4,5,0,1.619,0
2025-11-25T11:26:22Z,5,5,0,1.532,0
data/features_1m.csv (new file, 1330 lines; diff suppressed because it is too large)

data/premarket_bars.csv (new file, 6211 lines; diff suppressed because it is too large)

data/premarket_signals.csv (new file, 2409 lines; diff suppressed because it is too large)

data/signals.csv (new file, 6 lines)
4144122546532766634-2025-11-25T11:20:50Z,4144122546532766634,GOOGL,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.31%),模型建议买入""}",rule_threshold,v1,
5053198607245987051-2025-11-25T11:20:50Z,5053198607245987051,GOOG,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.28%),模型建议买入""}",rule_threshold,v1,
2194510714435639870-2025-11-25T11:20:50Z,2194510714435639870,MSFT,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (40.00%),模型建议买入""}",rule_threshold,v1,
6313120924332843851-2025-11-25T11:20:50Z,6313120924332843851,AVGO,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (11.10%),模型建议买入""}",rule_threshold,v1,
1236673530676310677-2025-11-25T11:20:50Z,1236673530676310677,TSLA,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.82%),模型建议买入""}",rule_threshold,v1,
352413926823531646-2025-11-25T11:20:50Z,352413926823531646,NVDA,2025-11-25T11:20:50Z,momentum,SELL,,intraday,"{""reason"": ""盘前跌幅 -3.73% 预警""}",rule_threshold,v1,
data/symbols.csv (new file, 51 lines)
id,symbol,name,exchange,currency,tick_size,lot_size,sector,industry,is_active,first_seen_utc,last_seen_utc
7513257165860044271,AAPL,苹果,US,USD,,,,,1,2025-11-25T10:23:21Z,2025-11-25T20:44:14Z
2194510714435639870,MSFT,微软,US,USD,,,,,1,2025-11-25T10:23:21Z,2025-11-25T20:44:14Z
352413926823531646,NVDA,英伟达,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
4144122546532766634,GOOGL,谷歌-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
5053198607245987051,GOOG,谷歌-C,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
4623497169759077023,AMZN,亚马逊,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
6313120924332843851,AVGO,博通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
5102328569974216853,META,Meta Platforms Inc-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
876017357040540663,TSM,台积电,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
1236673530676310677,TSLA,特斯拉,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
1331973792832307407,BRK_A,伯克希尔哈撒韦-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
650274069060505642,BRK_B,伯克希尔哈撒韦-B,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8803988925120688787,LLY,礼来,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8552427102055794475,WMT,沃尔玛,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2514921600363169427,JPM,摩根大通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2430622602946461838,V,维萨,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8672445585537641080,ORCL,甲骨文,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
5071644475567648584,JNJ,强生,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
7149994435636318535,XOM,埃克森美孚,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
7970304264787595103,MA,万事达,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2534395427583276124,NFLX,奈飞,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5575034379848759650,ABBV,艾伯维,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5412047369712123198,COST,开市客,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8785367685666264461,PLTR,Palantir Technologies Inc-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5647711723656244203,BABA,阿里巴巴,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6990590429937873043,ASML,阿斯麦,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8292219544464776917,BAC,美国银行,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
1633824147157402598,AMD,超威半导体,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2555012754890357878,PG,宝洁,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
4150716678491375742,HD,家得宝,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2367064624252044795,KO,可口可乐,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6646738033285807079,GE,GE航空航天,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8407679579858137856,CVX,雪佛龙,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7292100549554102522,CSCO,思科,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8960439643380076728,UNH,联合健康,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7065807657318692902,DGP,二倍做多黄金ETN-DB,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
3257268770337445296,IBM,IBM国际商业机器,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
579602174984910496,AZN,阿斯利康(ADR),US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
4391221166925450354,SAP,思爱普,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7856183197500173397,WFC,富国银行,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6761987357449376195,CAT,卡特彼勒,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6181095751687308617,TM,丰田汽车(ADR),US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2535412531591654533,MS,摩根士丹利,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6823324911471553900,MU,美光科技,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2560492629011940105,MRK,默沙东,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6574988392422038406,AXP,美国运通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
3063309002746328407,NVS,诺华制药,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8036752892131638733,GS,高盛,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
1961403181409823259,HSBC,汇丰控股,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
387219116134337577,PM,菲利普莫里斯国际,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
data_writer.py (new file, 492 lines)
|
||||
# -*- coding: utf-8 -*-
|
||||
"""
|
||||
CSV 数据落地模块(基于 DATA_SCHEMA.md 的简化实现)
|
||||
- symbols.csv
|
||||
- bars_1m.csv
|
||||
- signals.csv
|
||||
|
||||
说明:
|
||||
- 不做真正的 Upsert(CSV 不擅长),通过读取现有行建立内存索引,避免重复写入关键键。
|
||||
- 比率字段(如涨跌幅)采用小数存储,例如 4.02% 存 0.0402。
|
||||
"""
|
||||
import csv
|
||||
import os
|
||||
from datetime import datetime, timezone
|
||||
from typing import Iterable, Dict, Any, List, Tuple
|
||||
from utils_id import stable_symbol_id
|
||||
|
||||
DATA_DIR = os.path.join(os.path.dirname(__file__), "data")
|
||||
SYMBOLS_CSV = os.path.join(DATA_DIR, "symbols.csv")
|
||||
BARS_1M_CSV = os.path.join(DATA_DIR, "bars_1m.csv")
|
||||
SIGNALS_CSV = os.path.join(DATA_DIR, "signals.csv")
|
||||
FEATURES_1M_CSV = os.path.join(DATA_DIR, "features_1m.csv")
|
||||
ETL_RUNS_CSV = os.path.join(DATA_DIR, "etl_runs.csv")
|
||||
PREMARKET_BARS_CSV = os.path.join(DATA_DIR, "premarket_bars.csv")
|
||||
PREMARKET_SIGNALS_CSV = os.path.join(DATA_DIR, "premarket_signals.csv")
|
||||
|
||||
# 确保目录存在
|
||||
os.makedirs(DATA_DIR, exist_ok=True)
|
||||
|
||||
def _utc_now_iso() -> str:
|
||||
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _floor_minute(dt: datetime) -> datetime:
    return dt.replace(second=0, microsecond=0, tzinfo=timezone.utc)


# ---------- symbols.csv ----------

_SYMBOLS_HEADER = [
    "id", "symbol", "name", "exchange", "currency",
    "tick_size", "lot_size", "sector", "industry",
    "is_active", "first_seen_utc", "last_seen_utc"
]


def write_symbols(stocks: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Write basic stock metadata to symbols.csv and return a symbol -> symbol_id mapping.

    stocks: each dict must contain the keys: symbol, name, exchange, currency.
    """
    existing: Dict[Tuple[str, str], Dict[str, str]] = {}
    if os.path.exists(SYMBOLS_CSV):
        with open(SYMBOLS_CSV, "r", encoding="utf-8-sig") as f:
            reader = csv.DictReader(f)
            for row in reader:
                existing[(row["symbol"], row["exchange"])] = row

    now = _utc_now_iso()
    # Create new entries, or refresh last_seen_utc on existing ones.
    for s in stocks:
        symbol = s.get("symbol")
        if not symbol:
            continue
        name = s.get("name")
        exchange = (s.get("exchange") or "US").upper()
        currency = (s.get("currency") or "USD").upper()
        key = (symbol, exchange)
        if key not in existing:
            sid = stable_symbol_id(symbol, exchange)
            existing[key] = {
                "id": str(sid),
                "symbol": symbol,
                "name": name or "",
                "exchange": exchange,
                "currency": currency,
                "tick_size": "",
                "lot_size": "",
                "sector": "",
                "industry": "",
                "is_active": "1",
                "first_seen_utc": now,
                "last_seen_utc": now,
            }
        else:
            existing[key]["last_seen_utc"] = now

    # Rewrite the whole file.
    with open(SYMBOLS_CSV, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=_SYMBOLS_HEADER)
        writer.writeheader()
        for row in existing.values():
            writer.writerow(row)

    # Return the symbol -> id mapping.
    return {k[0]: int(v["id"]) for k, v in existing.items() if k[0] == v["symbol"]}

# ---------- bars_1m.csv ----------

_BARS_1M_HEADER = [
    "symbol_id", "symbol", "ts_utc", "open", "high", "low", "close",
    "volume", "vwap", "trades_count", "source", "session"
]


def _upgrade_bars_file_if_needed():
    """If a historical bars_1m.csv lacks the session column, rewrite it once, filling session='regular'."""
    if not os.path.exists(BARS_1M_CSV):
        return
    try:
        with open(BARS_1M_CSV, 'r', encoding='utf-8-sig') as f:
            reader = csv.reader(f)
            rows = list(reader)
        if not rows:
            return
        header = rows[0]
        if 'session' in header:
            return  # already upgraded
        # Map old column names to their indices.
        idx_map = {col: i for i, col in enumerate(header)}

        def _cell(row, col):
            # Defensive lookup: a missing column or short row yields ''.
            i = idx_map.get(col)
            return row[i] if i is not None and i < len(row) else ''

        # Build the new file contents: old columns plus session='regular'.
        new_rows = [_BARS_1M_HEADER]
        for r in rows[1:]:
            if not r:
                continue
            new_rows.append([
                _cell(r, 'symbol_id'),
                _cell(r, 'symbol'),
                _cell(r, 'ts_utc'),
                _cell(r, 'open'),
                _cell(r, 'high'),
                _cell(r, 'low'),
                _cell(r, 'close'),
                _cell(r, 'volume'),
                _cell(r, 'vwap'),
                _cell(r, 'trades_count'),
                _cell(r, 'source'),
                'regular'
            ])
        # Write the upgraded file back.
        with open(BARS_1M_CSV, 'w', newline='', encoding='utf-8-sig') as f:
            writer = csv.writer(f)
            writer.writerows(new_rows)
    except Exception as e:
        print(f"⚠️ bars_1m.csv upgrade failed: {e}")

def append_bars_1m(stocks: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = "eastmoney") -> List[Dict[str, Any]]:
    """Approximate the current snapshot as a 1-minute bar and append it to bars_1m.csv.

    Since only a snapshot is available, open/high/low/close all use current_price,
    and volume/vwap/trades_count are left empty.
    """
    now = _floor_minute(datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    rows: List[Dict[str, Any]] = []
    _upgrade_bars_file_if_needed()
    for s in stocks:
        symbol = s.get("symbol")
        price = s.get("eastmoney_price") or s.get("current_price")
        if price is None:
            continue
        sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
        rows.append({
            "symbol_id": sid,
            "symbol": symbol,
            "ts_utc": now,
            "open": price,
            "high": price,
            "low": price,
            "close": price,
            "volume": "",
            "vwap": "",
            "trades_count": "",
            "source": source,
            "session": "regular",
        })
    # Append to the CSV, writing the header only for a fresh file.
    file_exists = os.path.exists(BARS_1M_CSV)
    with open(BARS_1M_CSV, "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=_BARS_1M_HEADER)
        if not file_exists:
            writer.writeheader()
        for r in rows:
            writer.writerow(r)
    return rows

def append_bars_session(stocks: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = "futu", session: str = "pre") -> List[Dict[str, Any]]:
    """Write snapshots for a specific trading session (e.g. premarket/after-hours);
    they coexist with regular bars and are distinguished by the session column."""
    _upgrade_bars_file_if_needed()
    now = _floor_minute(datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    rows: List[Dict[str, Any]] = []
    for s in stocks:
        symbol = s.get("symbol")
        price = s.get("premarket_price") or s.get("after_hours_price") or s.get("futu_before_open_price")
        if price in (None, ""):
            continue
        try:
            price_f = float(price)
        except Exception:
            continue
        sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
        rows.append({
            "symbol_id": sid,
            "symbol": symbol,
            "ts_utc": now,
            "open": price_f,
            "high": price_f,
            "low": price_f,
            "close": price_f,
            "volume": "",
            "vwap": "",
            "trades_count": "",
            "source": source,
            "session": session,
        })
    file_exists = os.path.exists(BARS_1M_CSV)
    with open(BARS_1M_CSV, "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=_BARS_1M_HEADER)
        if not file_exists:
            writer.writeheader()
        for r in rows:
            writer.writerow(r)
    return rows

# ---------- premarket-specific snapshots and signals ----------

_PREMARKET_BARS_HEADER = [
    'symbol_id', 'symbol', 'name', 'ts_utc', 'ts_et', 'price', 'change', 'change_ratio', 'volume', 'source', 'session', 'raw_file'
]

_PREMARKET_SIGNALS_HEADER = [
    'id', 'symbol_id', 'symbol', 'generated_at_utc', 'generated_at_et', 'signal_type', 'direction', 'score', 'reason', 'params_json', 'model_name', 'version', 'expires_at_utc'
]


def append_premarket_bars(rows: List[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = 'futu') -> None:
    """Append premarket fetch rows to premarket_bars.csv.

    rows: must contain symbol, name, premarket_price, premarket_change,
    premarket_change_ratio (a raw percent or decimal string), ts (ET string HH:MM).
    """
    if not rows:
        return
    file_exists = os.path.exists(PREMARKET_BARS_CSV)
    now_utc = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    # ET timestamp string (for easier manual inspection).
    try:
        from zoneinfo import ZoneInfo
        ts_et_full = datetime.now(ZoneInfo('America/New_York')).strftime('%Y-%m-%dT%H:%M:%S')
    except Exception:
        ts_et_full = ''
    with open(PREMARKET_BARS_CSV, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=_PREMARKET_BARS_HEADER)
        if not file_exists:
            writer.writeheader()
        for r in rows:
            symbol = r.get('symbol')
            if not symbol:
                continue
            price = r.get('premarket_price')
            if price in (None, '', '-'):
                continue
            try:
                price_f = float(price)
            except Exception:
                continue
            # The raw ratio may be "3.21%" / "-3.21%" / "0.0321" / "".
            ratio_raw = r.get('premarket_change_ratio')
            ratio_val = 0.0
            if ratio_raw not in (None, ''):
                txt = str(ratio_raw).strip()
                try:
                    if txt.endswith('%'):
                        ratio_val = float(txt.replace('%', '')) / 100.0
                    else:
                        # Accept both decimal form (0.0321) and bare percent values (3.21).
                        num = float(txt)
                        # Heuristic: |x| > 1 is assumed to be a percent value.
                        ratio_val = num / 100.0 if abs(num) > 1 else num
                except Exception:
                    ratio_val = 0.0
            sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
            writer.writerow({
                'symbol_id': sid,
                'symbol': symbol,
                'name': r.get('name', ''),
                'ts_utc': now_utc,
                'ts_et': ts_et_full,
                'price': price_f,
                'change': r.get('premarket_change', ''),
                'change_ratio': ratio_val,
                'volume': '',
                'source': source,
                'session': 'pre',
                'raw_file': '',
            })

def append_premarket_signals(signals: List[Dict[str, Any]], symbol_id_map: Dict[str, int]) -> None:
    """Write premarket signals to premarket_signals.csv.

    signals: must contain symbol, direction (BUY/SELL), reason; params is optional.
    """
    if not signals:
        return
    file_exists = os.path.exists(PREMARKET_SIGNALS_CSV)
    model_name, version = _def_model
    now_utc = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    try:
        from zoneinfo import ZoneInfo
        now_et = datetime.now(ZoneInfo('America/New_York')).strftime('%Y-%m-%dT%H:%M:%S')
    except Exception:
        now_et = ''
    # Simple dedup: the same symbol+direction within the current UTC second is written only once.
    seen = set()
    if file_exists:
        with open(PREMARKET_SIGNALS_CSV, 'r', encoding='utf-8-sig') as f:
            reader = csv.DictReader(f)
            for row in reader:
                seen.add((row['symbol'], row['direction'], row['generated_at_utc']))
    with open(PREMARKET_SIGNALS_CSV, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=_PREMARKET_SIGNALS_HEADER)
        if not file_exists:
            writer.writeheader()
        for sig in signals:
            symbol = sig.get('symbol')
            direction = sig.get('direction')
            if not symbol or not direction:
                continue
            key = (symbol, direction, now_utc)
            if key in seen:
                continue
            sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
            params_obj = sig.get('params') or {}
            writer.writerow({
                'id': f'{sid}-{now_utc}',
                'symbol_id': sid,
                'symbol': symbol,
                'generated_at_utc': now_utc,
                'generated_at_et': now_et,
                'signal_type': sig.get('signal_type', 'premarket_alert'),
                'direction': direction,
                'score': sig.get('score', ''),
                'reason': sig.get('reason', ''),
                'params_json': json.dumps(params_obj, ensure_ascii=False),
                'model_name': model_name,
                'version': version,
                'expires_at_utc': '',
            })

# ---------- signals.csv ----------

_SIGNALS_HEADER = [
    "id", "symbol_id", "symbol", "generated_at_utc",
    "signal_type", "direction", "score", "horizon",
    "params_json", "model_name", "version", "expires_at_utc"
]

_def_model = ("rule_threshold", "v1")

import json


def append_signals(signals: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int]) -> None:
    """Write strategy signals to signals.csv, using (symbol, timestamp, direction) for approximate dedup.

    Input signals should contain: symbol, type (BUY/SELL); reason/score are optional.
    """
    file_exists = os.path.exists(SIGNALS_CSV)
    seen_keys = set()
    if file_exists:
        with open(SIGNALS_CSV, "r", encoding="utf-8-sig") as f:
            reader = csv.DictReader(f)
            for row in reader:
                seen_keys.add((row["symbol"], row["generated_at_utc"], row.get("direction")))

    model_name, version = _def_model

    with open(SIGNALS_CSV, "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=_SIGNALS_HEADER)
        if not file_exists:
            writer.writeheader()
        for sig in signals:
            symbol = sig.get("symbol")
            direction = sig.get("type") or sig.get("direction")
            gen_at = sig.get('generated_at_utc') or _utc_now_iso()
            key = (symbol, gen_at, direction)
            if key in seen_keys:
                continue
            sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
            writer.writerow({
                "id": f"{sid}-{gen_at}",
                "symbol_id": sid,
                "symbol": symbol,
                "generated_at_utc": gen_at,
                "signal_type": "momentum",
                "direction": direction,
                "score": sig.get("confidence", ""),
                "horizon": "intraday",
                "params_json": json.dumps({"reason": sig.get("reason", "")}, ensure_ascii=False),
                "model_name": model_name,
                "version": version,
                "expires_at_utc": "",
            })

# ---------- features_1m.csv ----------

_FEATURES_1M_HEADER = [
    'symbol_id', 'symbol', 'ts_utc', 'price', 'return_1m', 'ma_5', 'ma_15', 'vol_15'
]


def _load_existing_prices() -> Dict[str, List[Tuple[str, float]]]:
    data: Dict[str, List[Tuple[str, float]]] = {}
    if not os.path.exists(BARS_1M_CSV):
        return data
    with open(BARS_1M_CSV, 'r', encoding='utf-8-sig') as f:
        reader = csv.DictReader(f)
        for row in reader:
            symbol = row['symbol']
            ts = row['ts_utc']
            try:
                price = float(row['close'])
            except Exception:
                continue
            data.setdefault(symbol, []).append((ts, price))
    # Ensure chronological order (CSV appends should already be ordered, but be defensive).
    for sym in data:
        data[sym].sort(key=lambda x: x[0])
    return data


def append_features_1m(new_bar_rows: List[Dict[str, Any]]) -> None:
    if not new_bar_rows:
        return
    price_history = _load_existing_prices()
    feature_rows: List[Dict[str, Any]] = []
    # Compute features for the newly appended rows.
    for r in new_bar_rows:
        symbol = r['symbol']
        sid = r['symbol_id']
        ts = r['ts_utc']
        try:
            price = float(r['close'])
        except Exception:
            continue
        series = price_history.get(symbol, [])
        # Defensive: the series should already contain the current row (the new bar
        # was appended before this call); if not, add it before computing.
        if not series or series[-1][0] != ts:
            series.append((ts, price))
        idx = len(series) - 1
        # return_1m: simple return versus the previous bar.
        ret_1m = 0.0
        if idx >= 1:
            prev_price = series[idx - 1][1]
            if prev_price != 0:
                ret_1m = (price / prev_price) - 1
        # ma_5: mean over the last 5 bars.
        window5 = [p for _, p in series[max(0, idx - 4):idx + 1]]
        ma_5 = sum(window5) / len(window5) if window5 else price
        # ma_15: mean over the last 15 bars.
        window15 = [p for _, p in series[max(0, idx - 14):idx + 1]]
        ma_15 = sum(window15) / len(window15) if window15 else price
        # vol_15: sample standard deviation over the last 15 bars.
        vol_15 = 0.0
        if len(window15) > 1:
            avg15 = ma_15
            var = sum((p - avg15) ** 2 for p in window15) / (len(window15) - 1)
            vol_15 = var ** 0.5
        feature_rows.append({
            'symbol_id': sid,
            'symbol': symbol,
            'ts_utc': ts,
            'price': price,
            'return_1m': ret_1m,
            'ma_5': ma_5,
            'ma_15': ma_15,
            'vol_15': vol_15,
        })
    file_exists = os.path.exists(FEATURES_1M_CSV)
    with open(FEATURES_1M_CSV, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=_FEATURES_1M_HEADER)
        if not file_exists:
            writer.writeheader()
        for fr in feature_rows:
            writer.writerow(fr)

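The windowed statistics above (`return_1m`, `ma_5`, `ma_15`, `vol_15`) can be checked in isolation. The sketch below re-implements the same window arithmetic over a plain price list; `rolling_features` is an illustrative helper name, not part of the module's API.

```python
from math import sqrt

def rolling_features(prices, idx):
    """Compute (return_1m, ma_5, ma_15, vol_15) at position idx,
    mirroring the windowed logic in append_features_1m."""
    price = prices[idx]
    # Simple return versus the previous element, guarding against division by zero.
    ret_1m = (price / prices[idx - 1] - 1) if idx >= 1 and prices[idx - 1] != 0 else 0.0
    # Trailing windows of up to 5 and 15 elements, inclusive of idx.
    w5 = prices[max(0, idx - 4):idx + 1]
    w15 = prices[max(0, idx - 14):idx + 1]
    ma_5 = sum(w5) / len(w5)
    ma_15 = sum(w15) / len(w15)
    vol_15 = 0.0
    if len(w15) > 1:
        # Sample variance (n - 1 in the denominator), as in the module.
        var = sum((p - ma_15) ** 2 for p in w15) / (len(w15) - 1)
        vol_15 = sqrt(var)
    return ret_1m, ma_5, ma_15, vol_15

ret, ma5, ma15, vol = rolling_features([100.0, 101.0, 102.0, 103.0, 104.0], 4)
```

With fewer than 15 prices the 15-bar window simply shrinks, so early bars get statistics over whatever history exists.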
# ---------- etl_runs.csv ----------

_ETL_RUNS_HEADER = [
    'run_ts_utc', 'loop', 'fetched_count', 'signal_count', 'duration_seconds', 'errors'
]


def append_etl_run(loop: int, fetched: int, signals: int, duration: float, errors: int = 0) -> None:
    file_exists = os.path.exists(ETL_RUNS_CSV)
    with open(ETL_RUNS_CSV, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=_ETL_RUNS_HEADER)
        if not file_exists:
            writer.writeheader()
        writer.writerow({
            'run_ts_utc': _utc_now_iso(),
            'loop': loop,
            'fetched_count': fetched,
            'signal_count': signals,
            'duration_seconds': f'{duration:.3f}',
            'errors': errors,
        })
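All of the writers above share one idiom: open in append mode and emit the header only when the file does not yet exist. A minimal standalone sketch of that pattern (the trimmed header and `append_row` name are illustrative, not the module's API):

```python
import csv
import os
import tempfile

HEADER = ['run_ts_utc', 'loop', 'fetched_count']  # trimmed example header

def append_row(path, row):
    """Append one dict row, writing the header only for a fresh file."""
    file_exists = os.path.exists(path)
    # newline='' lets the csv module control line endings; utf-8-sig adds a BOM
    # so spreadsheet tools detect the encoding.
    with open(path, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=HEADER)
        if not file_exists:
            writer.writeheader()
        writer.writerow(row)

path = os.path.join(tempfile.mkdtemp(), 'etl_runs_demo.csv')
append_row(path, {'run_ts_utc': '2024-01-01T00:00:00Z', 'loop': 1, 'fetched_count': 5})
append_row(path, {'run_ts_utc': '2024-01-01T00:01:00Z', 'loop': 2, 'fetched_count': 7})

with open(path, encoding='utf-8-sig') as f:
    lines = f.read().splitlines()
```

Note the existence check happens before opening, so a concurrent writer could still duplicate the header; for this single-process collector that trade-off is acceptable.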
21
logging_setup.py
Normal file
@@ -0,0 +1,21 @@
# -*- coding: utf-8 -*-
"""
Simple logging initialization module.

Usage:
    from logging_setup import init_logging
    init_logging()
"""
import logging
import os


def init_logging(level: str = None):
    lvl = (level or os.getenv("LOG_LEVEL", "INFO")).upper()
    lvl_value = getattr(logging, lvl, logging.INFO)
    logging.basicConfig(
        level=lvl_value,
        format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    # Reduce default log noise from third-party libraries.
    logging.getLogger("urllib3").setLevel(max(logging.WARNING, lvl_value))
    logging.getLogger("requests").setLevel(max(logging.WARNING, lvl_value))
74
market_analyzer.py
Normal file
@@ -0,0 +1,74 @@
# -*- coding: utf-8 -*-
"""
Market analysis module (simulates cloud-side / LLM analysis).
"""
import random


class MarketAnalyzer:
    def __init__(self):
        print("🧠 Initializing market analysis model...")
        # A model could be loaded here, or a cloud API connected.

    def analyze(self, stock_data_list):
        """
        Analyze stock data and generate trading signals.

        Args:
            stock_data_list: list of stock data dicts

        Returns:
            list: dicts containing trading signals
        """
        signals = []

        print(f"🧠 Analyzing data for {len(stock_data_list)} stocks...")

        for stock in stock_data_list:
            # Simple example strategy (simulated):
            # a gain above 5% produces a BUY signal,
            # a loss above 5% produces a SELL signal.
            try:
                # Compatibility: accepts a decimal number (0.0402 means 4.02%)
                # or a string ("4.02%").
                raw_ratio = stock.get('eastmoney_change_ratio', 0.0)
                ratio = 0.0
                if isinstance(raw_ratio, (int, float)):
                    # Assumed to be in decimal form.
                    ratio = float(raw_ratio)
                    # Defensive normalization in case a percent value like 4.02 was passed.
                    if abs(ratio) > 1:
                        ratio = ratio / 100.0
                elif isinstance(raw_ratio, str):
                    s = raw_ratio.strip().replace('%', '')
                    if s not in ('', '-'):
                        v = float(s)
                        # Convert from a percent value to decimal form.
                        ratio = v / 100.0

                symbol = stock.get('symbol')
                name = stock.get('name')
                price = stock.get('eastmoney_price')

                # Thresholds are in decimal form: ±5%.
                if ratio > 0.05:
                    signals.append({
                        'type': 'BUY',
                        'symbol': symbol,
                        'name': name,
                        'price': price,
                        'reason': f'Significant gain ({ratio:.2%}), model suggests buying',
                        'confidence': 0.85
                    })
                elif ratio < -0.05:
                    signals.append({
                        'type': 'SELL',
                        'symbol': symbol,
                        'name': name,
                        'price': price,
                        'reason': f'Significant loss ({ratio:.2%}), model suggests selling',
                        'confidence': 0.92
                    })
            except Exception:
                continue

        return signals
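The ratio handling in `analyze` accepts both decimal values and percent strings, normalizing everything to the repository convention that 4.02% is stored as `0.0402`. That branch logic can be sketched as a standalone helper; `normalize_change_ratio` is an illustrative name, not the module's API.

```python
def normalize_change_ratio(raw):
    """Normalize a change ratio to decimal form: 0.0402 means 4.02%.
    Accepts 0.0402, 4.02 (a bare percent value), or strings like "4.02%"."""
    if isinstance(raw, (int, float)):
        ratio = float(raw)
        # Defensive heuristic: |x| > 1 is treated as a percent value such as 4.02.
        return ratio / 100.0 if abs(ratio) > 1 else ratio
    if isinstance(raw, str):
        s = raw.strip().replace('%', '')
        if s in ('', '-'):
            return 0.0
        # Strings are always percent values, e.g. "4.02%" or "-3.21".
        return float(s) / 100.0
    return 0.0
```

The heuristic is ambiguous for genuine decimal moves larger than 100%, which is why the schema asks upstream code to store decimals in the first place.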
228
monitor.py
Normal file
@@ -0,0 +1,228 @@
# -*- coding: utf-8 -*-
"""
Quant trading monitor entry point.

Features:
1. Periodically fetch US stock data (full market or Top N)
2. Call the analysis module to analyze it
3. Call the trading module to execute signals
"""
import time
import argparse
from logging_setup import init_logging
from futu import StockDataIntegrator, EastMoneyAPI
from market_analyzer import MarketAnalyzer
from trader import Trader
from datetime import datetime
from data_writer import write_symbols, append_bars_1m, append_bars_session, append_signals, append_features_1m, append_etl_run
from zoneinfo import ZoneInfo
from utils_time import now_et, fmt_et, fmt_et_hm
from signal_filter import SignalCooldownFilter


def main():
    # Initialize simple logging (existing print calls are kept as-is).
    init_logging()
    parser = argparse.ArgumentParser(description='AI quant trading monitor')
    parser.add_argument('--interval', type=int, default=60, help='Monitoring interval in seconds')
    parser.add_argument('--limit', type=int, default=100, help='Number of stocks to monitor per scan')
    parser.add_argument('--all', action='store_true', help='Monitor all stocks (slower)')
    parser.add_argument('--premarket', action='store_true', help='During the premarket window, fetch Futu premarket prices and write them with session=pre')
    parser.add_argument('--premarket-limit', type=int, default=30, help='Maximum number of stocks to fetch premarket (Futu pages are scraped one by one)')
    parser.add_argument('--session-override', choices=['pre', 'regular', 'post'], help='Manually override the current trading session (for testing)')

    args = parser.parse_args()

    print("🚀 Starting the AI quant trading monitor...")
    print(f"⏱️ Monitoring interval: {args.interval} s")

    # Initialize modules.
    integrator = StockDataIntegrator()
    eastmoney_api = EastMoneyAPI()  # used for fast list fetches
    analyzer = MarketAnalyzer()
    trader = Trader()
    cooldown_filter = SignalCooldownFilter(cooldown_minutes=30)

    loop_count = 0

    try:
        while True:
            loop_start = now_et()
            loop_count += 1
            print(f"\n🔄 Scan #{loop_count} starting - {fmt_et_hm()} ET")

            # 1. Fetch data. For monitoring efficiency the fast EastMoney
            # list endpoint is used for regular quotes.
            stock_data = []

            if args.all:
                print("📡 Fetching full-market data...")
                # Query the total count first, then page through everything.
                # Note: get_us_stocks in futu.py already supports pagination,
                # so fetching the full market may take a while.
                _, total = eastmoney_api.get_us_stocks(page_size=1)
                print(f"📊 Total stocks in market: {total}")

                # Page through all data.
                page_size = 100
                limit = total
                total_pages = (limit + page_size - 1) // page_size

                all_raw_stocks = []
                for page in range(1, total_pages + 1):
                    stocks, _ = eastmoney_api.get_us_stocks(page_size=page_size, page_index=page)
                    if stocks:
                        all_raw_stocks.extend(stocks)
                    # Optional small delay between pages:
                    # time.sleep(0.1)

                # Parse the raw rows.
                for item in all_raw_stocks:
                    parsed = eastmoney_api.parse_stock_data(item)
                    if parsed:
                        stock_data.append({
                            'symbol': parsed['symbol'],
                            'name': parsed['name'],
                            'eastmoney_price': parsed['current_price'],
                            'eastmoney_change_ratio': parsed['change_ratio']
                        })

            else:
                print(f"📡 Fetching Top {args.limit} trending stocks...")
                # Fetch the Top N.
                raw_stocks, _ = eastmoney_api.get_us_stocks(page_size=args.limit)
                for item in raw_stocks:
                    parsed = eastmoney_api.parse_stock_data(item)
                    if parsed:
                        stock_data.append({
                            'symbol': parsed['symbol'],
                            'name': parsed['name'],
                            'eastmoney_price': parsed['current_price'],
                            'eastmoney_change_ratio': parsed['change_ratio']
                        })

            print(f"✅ Got {len(stock_data)} valid regular-session quotes (ET {fmt_et_hm()})")

            # 1.1 Premarket supplement (only inside the premarket window with the
            # flag set; scrapes Futu pages for the first N stocks).
            def _get_us_market_session(et_now: datetime) -> str:
                """Classify the trading session from US Eastern time: pre (4:00-9:30),
                regular (9:30-16:00), post (16:00-20:00), off otherwise.
                Weekends are always off. DST is handled by the system tz data."""
                if et_now.weekday() >= 5:  # Saturday=5, Sunday=6
                    return 'off'
                minutes = et_now.hour * 60 + et_now.minute
                if 4 * 60 <= minutes < 9 * 60 + 30:
                    return 'pre'
                if 9 * 60 + 30 <= minutes < 16 * 60:
                    return 'regular'
                if 16 * 60 <= minutes < 20 * 60:
                    return 'post'
                return 'off'

            def _current_session() -> str:
                if args.session_override:
                    return args.session_override
                et_now = datetime.now(ZoneInfo('America/New_York'))
                return _get_us_market_session(et_now)

            pre_rows = []
            session = _current_session()
            if args.premarket and session == 'pre':
                pre_candidates = stock_data[: args.premarket_limit]
                print(f"🌙 Inside the premarket window, fetching Futu premarket data for {len(pre_candidates)} stocks... (ET {fmt_et_hm()})")
                for i, item in enumerate(pre_candidates, 1):
                    symbol = item['symbol']
                    futu_detail = integrator.get_futu_stock_details(symbol)
                    if futu_detail and futu_detail.get('before_open_price'):
                        # Normalize the premarket change ratio (may contain a '%').
                        ratio_raw = futu_detail.get('before_open_change_ratio') or ''
                        ratio_val = 0.0
                        try:
                            ratio_clean = str(ratio_raw).replace('%', '').strip()
                            if ratio_clean:
                                ratio_f = float(ratio_clean)
                                # Convert to decimal form.
                                ratio_val = ratio_f / 100.0
                        except Exception:
                            ratio_val = 0.0
                        item.update({
                            'premarket_price': futu_detail.get('before_open_price'),
                            'premarket_change': futu_detail.get('before_open_change'),
                            'premarket_change_ratio': ratio_val,
                            'futu_before_open_price': futu_detail.get('before_open_price'),  # fallback key for append_bars_session
                        })
                        pre_rows.append(item)
                    if i % 10 == 0:
                        print(f"🌙 Premarket fetch progress {i}/{len(pre_candidates)} (ET {fmt_et_hm()})")
                print(f"🌙 Futu premarket fetch succeeded for {len(pre_rows)} stocks (ET {fmt_et_hm()})")
            else:
                if args.premarket:
                    print(f"🌙 Current session is {session}; premarket fetch skipped (ET {fmt_et_hm()})")

            # 2.1 Persist symbols and 1-minute bars to CSV (under data/).
            symbol_id_map = {}  # fallback so the signal write below still works if persistence fails
            try:
                symbol_id_map = write_symbols([
                    {
                        'symbol': s['symbol'],
                        'name': s['name'],
                        'exchange': 'US',
                        'currency': 'USD',
                    }
                    for s in stock_data
                ])
                new_rows = append_bars_1m(stock_data, symbol_id_map, source='eastmoney')
                append_features_1m(new_rows)
                # Write premarket bars (no features, to avoid mixing with regular-session data).
                if pre_rows:
                    append_bars_session(pre_rows, symbol_id_map, source='futu', session='pre')
            except Exception as e:
                print(f"⚠️ Data persistence failed: {e}")

            # 3. Analyze the data.
            raw_signals = analyzer.analyze(stock_data)
            # Optionally generate separate premarket alerts from premarket moves
            # (example thresholds: +3% / -3%).
            premarket_signals = []
            if pre_rows:
                for r in pre_rows:
                    ratio = r.get('premarket_change_ratio') or 0.0
                    sym = r['symbol']
                    name = r.get('name', '')
                    price = r.get('premarket_price') or r.get('eastmoney_price')
                    if ratio >= 0.03:
                        premarket_signals.append({'symbol': sym, 'name': name, 'price': price, 'type': 'BUY', 'reason': f'Premarket gain {ratio:.2%} alert'})
                    elif ratio <= -0.03:
                        premarket_signals.append({'symbol': sym, 'name': name, 'price': price, 'type': 'SELL', 'reason': f'Premarket loss {ratio:.2%} alert'})
                if premarket_signals:
                    print(f"🌙 {len(premarket_signals)} premarket alert signals (ET {fmt_et_hm()})")
                    raw_signals.extend(premarket_signals)
            signals = cooldown_filter.filter(raw_signals)
            # 3.1 Write signals.csv.
            try:
                if signals:
                    append_signals(signals, symbol_id_map)
            except Exception as e:
                print(f"⚠️ Writing signals failed: {e}")

            # 4. Execute trades.
            if signals:
                trader.execute_signals(signals)
            else:
                print("💤 No trading signals right now")

            # Record the ETL run, then wait for the next scan.
            try:
                duration = (now_et() - loop_start).total_seconds()
                append_etl_run(loop_count, len(stock_data), len(signals), duration, errors=0)
            except Exception as e:
                print(f"⚠️ Recording ETL stats failed: {e}")

            print(f"⏳ Waiting {args.interval} s...")
            time.sleep(args.interval)

    except KeyboardInterrupt:
        print("\n🛑 Monitor stopped")


if __name__ == "__main__":
    main()
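The session-window logic in `_get_us_market_session` is pure arithmetic on Eastern wall-clock time, so it can be exercised without any network or timezone setup by passing naive datetimes. A standalone sketch of the same classification (same thresholds as the monitor):

```python
from datetime import datetime

def us_market_session(et_now: datetime) -> str:
    """Classify US Eastern time into pre/regular/post/off:
    pre 4:00-9:30, regular 9:30-16:00, post 16:00-20:00, weekends off."""
    if et_now.weekday() >= 5:  # Saturday=5, Sunday=6
        return 'off'
    minutes = et_now.hour * 60 + et_now.minute
    if 4 * 60 <= minutes < 9 * 60 + 30:
        return 'pre'
    if 9 * 60 + 30 <= minutes < 16 * 60:
        return 'regular'
    if 16 * 60 <= minutes < 20 * 60:
        return 'post'
    return 'off'
```

In production the caller is expected to supply `datetime.now(ZoneInfo('America/New_York'))`, so DST shifts come from the system tz database rather than this function.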
261
premarket_watch.py
Normal file
@@ -0,0 +1,261 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
"""盘前监控脚本
|
||||
功能:
|
||||
1. 周期性抓取指定数量美股的盘前价格 (富途 before_open_stock_info)
|
||||
2. 在终端实时打印表格 (可 --interval 控制刷新秒数)
|
||||
3. 支持 --limit 获取排行榜前N只 或 --symbols 手动指定列表
|
||||
4. 自动判断美东时段是否为盘前 (4:00-9:30 ET), 非盘前可提示或使用 --force 继续
|
||||
|
||||
依赖已有模块: futu.py 中的 EastMoneyAPI / StockDataIntegrator
|
||||
|
||||
示例:
|
||||
python premarket_watch.py --limit 15 --interval 30
|
||||
python premarket_watch.py --symbols NVDA,AAPL,TSLA --interval 20
|
||||
python premarket_watch.py --limit 10 --once # 单次输出
|
||||
python premarket_watch.py --limit 10 --force # 忽略时段检查
|
||||
|
||||
注意: 逐个请求富途页面存在速率限制风险, 建议 limit 不要太大; 脚本仅演示用途。
|
||||
"""
|
||||
import argparse
|
||||
import time
|
||||
from datetime import datetime
|
||||
from concurrent.futures import ThreadPoolExecutor, as_completed
|
||||
from zoneinfo import ZoneInfo
|
||||
from typing import List, Dict
|
||||
from futu import EastMoneyAPI, StockDataIntegrator
|
||||
from utils_time import now_et, fmt_et_hm, fmt_et
|
||||
from data_writer import write_symbols, append_premarket_bars, append_premarket_signals
|
||||
|
||||
|
||||
def parse_args():
|
||||
parser = argparse.ArgumentParser(description='盘前实时监控脚本')
|
||||
group = parser.add_mutually_exclusive_group(required=False)
|
||||
group.add_argument('--limit', type=int, default=10, help='获取东方财富排行前N只 (默认10)')
|
||||
group.add_argument('--symbols', type=str, help='逗号分隔的股票代码列表, 覆盖 limit')
|
||||
parser.add_argument('--interval', type=int, default=10, help='刷新间隔秒, 默认60')
|
||||
parser.add_argument('--once', action='store_true', help='只执行一次抓取并退出')
|
||||
parser.add_argument('--force', action='store_true', help='忽略盘前时段判断强制抓取')
|
||||
parser.add_argument('--sleep', type=float, default=0.0, help='顺序模式下的延时(已多线程可忽略)')
|
||||
parser.add_argument('--max-workers', type=int, default=0, help='线程最大数量(0=自动=股票数,建议限制避免过度)')
|
||||
parser.add_argument('--no-color', action='store_true', help='关闭ANSI颜色输出')
|
||||
parser.add_argument('--save', action='store_true', help='保存盘前快照和信号到 data/premarket_*.csv')
|
||||
return parser.parse_args()
|
||||
|
||||
|
||||
def get_et_session(now_et: datetime) -> str:
|
||||
if now_et.weekday() >= 5:
|
||||
return 'off'
|
||||
m = now_et.hour * 60 + now_et.minute
|
||||
if 4*60 <= m < 9*60 + 30:
|
||||
return 'pre'
|
||||
if 9*60 + 30 <= m < 16*60:
|
||||
return 'regular'
|
||||
if 16*60 <= m < 20*60:
|
||||
return 'post'
|
||||
return 'off'
|
||||
|
||||
|
||||
def fetch_symbol_list(limit: int, api: EastMoneyAPI) -> List[Dict]:
|
||||
raw, _ = api.get_us_stocks(page_size=limit)
|
||||
parsed: List[Dict] = []
|
||||
for item in raw:
|
||||
data = api.parse_stock_data(item)
|
||||
if data:
|
||||
parsed.append({'symbol': data['symbol'], 'name': data['name']})
|
||||
return parsed
|
||||
|
||||
|
||||
def parse_symbols_arg(symbols_str: str) -> List[Dict]:
|
||||
result = []
|
||||
for s in symbols_str.split(','):
|
||||
sym = s.strip().upper()
|
||||
if sym:
|
||||
result.append({'symbol': sym, 'name': ''})
|
||||
return result
|
||||
|
||||
|
||||
def safe_ratio_to_float(ratio_raw) -> float:
|
||||
if ratio_raw in (None, ''):
|
||||
return 0.0
|
||||
try:
|
||||
txt = str(ratio_raw).replace('%', '').strip()
|
||||
if not txt:
|
||||
return 0.0
|
||||
return float(txt) / 100.0
|
||||
except Exception:
|
||||
return 0.0
|
||||
|
||||
|
||||
def colorize(s: str, positive: bool, no_color: bool) -> str:
|
||||
if no_color:
|
||||
return s
|
||||
if positive:
|
||||
return f"\x1b[32m{s}\x1b[0m" # 绿色
|
||||
return f"\x1b[31m{s}\x1b[0m" # 红色
|
||||
|
||||
|
||||
def format_table(rows: List[Dict], no_color: bool) -> str:
|
||||
headers = ['Symbol', 'Name', 'Premarket Price', 'Change', 'Change %', 'Updated']
|
||||
col_widths = [len(h) for h in headers]
|
||||
for r in rows:
|
||||
col_widths[0] = max(col_widths[0], len(r.get('symbol','')))
|
||||
col_widths[1] = max(col_widths[1], len(r.get('name','')))
|
||||
col_widths[2] = max(col_widths[2], len(r.get('premarket_price','')))
|
||||
col_widths[3] = max(col_widths[3], len(r.get('premarket_change','')))
|
||||
col_widths[4] = max(col_widths[4], len(r.get('premarket_change_ratio_fmt','')))
|
||||
col_widths[5] = max(col_widths[5], len(r.get('ts','')))
|
||||
def pad(text, width):
|
||||
return str(text).ljust(width)
|
||||
line_sep = '─' * (sum(col_widths) + len(col_widths)*3 - 1)
|
||||
header_line = ' '.join(pad(h, col_widths[i]) for i, h in enumerate(headers))
|
||||
body_lines = []
|
||||
for r in rows:
|
||||
pos = safe_ratio_to_float(r.get('premarket_change_ratio')) >= 0
|
||||
ratio_fmt = r.get('premarket_change_ratio_fmt','')
|
||||
ratio_fmt = colorize(ratio_fmt, pos, no_color)
|
||||
change = r.get('premarket_change','')
|
||||
change = colorize(change, pos, no_color)
|
||||
body_lines.append(' '.join([
|
||||
pad(r.get('symbol',''), col_widths[0]),
|
||||
pad(r.get('name',''), col_widths[1]),
|
||||
pad(r.get('premarket_price',''), col_widths[2]),
|
||||
pad(change, col_widths[3]),
|
||||
pad(ratio_fmt, col_widths[4]),
|
||||
pad(r.get('ts',''), col_widths[5]),
|
||||
]))
|
||||
return f"{header_line}\n{line_sep}\n" + '\n'.join(body_lines)
|
||||
|
||||
|
||||
def main():
|
||||
args = parse_args()
|
||||
api = EastMoneyAPI()
|
||||
integrator = StockDataIntegrator()
|
||||
|
||||
if args.symbols:
|
||||
symbols = parse_symbols_arg(args.symbols)
|
||||
print(f"📋 使用自定义股票列表: {[s['symbol'] for s in symbols]}")
|
||||
else:
|
||||
symbols = fetch_symbol_list(args.limit, api)
|
||||
print(f"📋 获取排行前 {args.limit} 只股票: {[s['symbol'] for s in symbols]}")
|
||||
|
||||
if not symbols:
|
||||
print("❌ 无有效股票列表, 退出")
|
||||
return
|
||||
|
||||
if not args.force:
|
||||
now_et = datetime.now(ZoneInfo('America/New_York'))
|
||||
session = get_et_session(now_et)
|
||||
if session != 'pre':
|
||||
print(f"⚠️ 当前美东时段为 {session} (ET {now_et.strftime('%H:%M')}), 非盘前, 使用 --force 可强制抓取")
|
||||
return
|
||||
|
||||
def _fetch_one(info: Dict) -> Dict:
|
||||
sym = info['symbol']
|
||||
name = info['name']
|
||||
futu = integrator.get_futu_stock_details(sym)
|
||||
if futu and futu.get('before_open_price'):
|
||||
ratio_raw = futu.get('before_open_change_ratio')
|
||||
ratio_val = safe_ratio_to_float(ratio_raw)
|
||||
return {
|
||||
'symbol': sym,
|
||||
'name': name,
|
||||
'premarket_price': futu.get('before_open_price',''),
|
||||
'premarket_change': futu.get('before_open_change',''),
|
||||
'premarket_change_ratio': ratio_raw,
|
||||
'premarket_change_ratio_fmt': f"{ratio_val*100:.2f}%" if ratio_raw else '',
|
||||
'ts': fmt_et_hm(),
|
||||
}
|
||||
return {
|
||||
'symbol': sym,
|
||||
'name': name,
|
||||
'premarket_price': '-',
|
||||
'premarket_change': '-',
|
||||
'premarket_change_ratio': '',
|
||||
'premarket_change_ratio_fmt': '',
|
||||
'ts': fmt_et_hm(),
|
||||
}
|
||||
|
||||
    def run_once():
        # 动态线程数: 若 max-workers=0 用股票数, 并以 128 作上限保护
        worker_target = args.max_workers if args.max_workers > 0 else len(symbols)
        max_cap = 128  # 安全软限制, 避免过多线程导致资源问题
        workers = min(worker_target, max_cap)
        if workers < len(symbols):
            print(f"⚠️ 线程数限制为 {workers} (股票 {len(symbols)}), 使用 --max-workers 调整或提高上限")
        start = time.time()
        rows: List[Dict] = []
        # 多线程并发抓取
        with ThreadPoolExecutor(max_workers=workers) as executor:
            future_map = {executor.submit(_fetch_one, info): info['symbol'] for info in symbols}
            for fut in as_completed(future_map):
                try:
                    rows.append(fut.result())
                except Exception as e:
                    sym = future_map[fut]
                    rows.append({
                        'symbol': sym,
                        'name': '',
                        'premarket_price': 'ERR',
                        'premarket_change': '-',
                        'premarket_change_ratio': '',
                        'premarket_change_ratio_fmt': '',
                        'ts': fmt_et_hm(),
                    })
                    print(f"⚠️ {sym} 抓取异常: {e}")
        # 保持原列表顺序(预先构建索引映射, 避免在 sort 中反复做线性查找)
        order = {s['symbol']: i for i, s in enumerate(symbols)}
        rows.sort(key=lambda r: order[r['symbol']])
        elapsed = time.time() - start
        print(f"🕒 ET {fmt_et()} | 刷新间隔 {args.interval}s | 总计 {len(rows)}")
        print(f"⏱️ 本轮耗时 {elapsed:.2f}s, 线程 {workers}")
        print(format_table(rows, args.no_color))

        if args.save:
            # 建立 symbol 基础信息用于写入 symbols.csv(缺 name 也允许)
            symbol_base = [{'symbol': r['symbol'], 'name': r.get('name', ''), 'exchange': 'US', 'currency': 'USD'} for r in rows]
            symbol_id_map = write_symbols(symbol_base)
            append_premarket_bars(rows, symbol_id_map, source='futu')
            # 生成盘前阈值信号(±3%)
            signals = []
            for r in rows:
                raw_ratio = r.get('premarket_change_ratio')
                val = safe_ratio_to_float(raw_ratio)
                if val is None:
                    continue  # 比例缺失或解析失败的行不产生信号
                if val >= 0.03:
                    signals.append({
                        'symbol': r['symbol'],
                        'direction': 'BUY',
                        'reason': f"盘前涨幅 {val*100:.2f}% 触发阈值",
                        'params': {'premarket_price': r.get('premarket_price'), 'premarket_change_ratio': val},
                    })
                elif val <= -0.03:
                    signals.append({
                        'symbol': r['symbol'],
                        'direction': 'SELL',
                        'reason': f"盘前跌幅 {val*100:.2f}% 触发阈值",
                        'params': {'premarket_price': r.get('premarket_price'), 'premarket_change_ratio': val},
                    })
            append_premarket_signals(signals, symbol_id_map)
            if signals:
                print(f"💡 已保存盘前信号 {len(signals)} 条 -> data/premarket_signals.csv")
            print("🗂️ 已保存盘前快照 -> data/premarket_bars.csv")

    if args.once:
        run_once()
        return

    while True:
        try:
            run_once()
            if args.interval <= 0:
                break
            time.sleep(args.interval)
        except KeyboardInterrupt:
            print("\n🛑 已停止盘前监控")
            break
        except Exception as e:
            print(f"⚠️ 本轮捕获异常: {e}")
            # 与正常路径一致: interval<=0 时不再循环, 避免异常下的忙等/负值 sleep
            if args.interval <= 0:
                break
            time.sleep(args.interval)


if __name__ == '__main__':
    main()
2
requirements.txt
Normal file
@@ -0,0 +1,2 @@
requests>=2.28.0
beautifulsoup4>=4.12.0
30
signal_filter.py
Normal file
@@ -0,0 +1,30 @@
# -*- coding: utf-8 -*-
"""信号冷却过滤模块
- 避免同一标的在冷却期内重复产生同方向信号
- 过滤后附加 generated_at_utc 字段(UTC ISO)
"""
from datetime import datetime, timezone, timedelta
from typing import List, Dict, Any


class SignalCooldownFilter:
    def __init__(self, cooldown_minutes: int = 30):
        self.cooldown = timedelta(minutes=cooldown_minutes)
        # key: (symbol, direction) -> last datetime
        self.last_time: Dict[tuple, datetime] = {}

    def filter(self, signals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        now = datetime.now(timezone.utc)
        accepted = []
        for s in signals:
            symbol = s.get('symbol')
            direction = s.get('type') or s.get('direction')
            key = (symbol, direction)
            lt = self.last_time.get(key)
            if lt is not None and now - lt < self.cooldown:
                continue
            # Accept
            self.last_time[key] = now
            s['direction'] = direction  # normalize
            s['generated_at_utc'] = now.strftime('%Y-%m-%dT%H:%M:%SZ')
            accepted.append(s)
        return accepted
1846
trade_log.csv
Normal file
File diff suppressed because it is too large
62
trader.py
Normal file
@@ -0,0 +1,62 @@
# -*- coding: utf-8 -*-
"""
交易执行模块 (模拟股票购买/抛售)
"""
import time
import csv
from utils_time import fmt_et


class Trader:
    def __init__(self):
        self.log_file = "trade_log.csv"
        self._init_log_file()

    def _init_log_file(self):
        try:
            with open(self.log_file, 'a', newline='', encoding='utf-8-sig'):
                pass  # 确保文件存在
        except Exception as e:
            print(f"❌ 初始化交易日志失败: {e}")

    def execute_signals(self, signals):
        """
        执行交易信号

        Args:
            signals: 交易信号列表
        """
        if not signals:
            return

        print(f"⚡ 收到 {len(signals)} 个交易信号,准备执行...")

        for signal in signals:
            self._execute_single_trade(signal)

    def _execute_single_trade(self, signal):
        """执行单笔交易"""
        action = signal.get('type', '')
        symbol = signal.get('symbol', '')
        name = signal.get('name', '')
        price = signal.get('price', '')
        reason = signal.get('reason', '')

        # 模拟交易延迟
        time.sleep(0.1)

        timestamp = fmt_et()
        log_entry = f"[{timestamp}] {action} {symbol} ({name}) @ ${price} | 原因: {reason}"

        print(f"💸 交易执行: {log_entry}")

        # 记录到文件
        self._log_trade(timestamp, action, symbol, name, price, reason)

    def _log_trade(self, timestamp, action, symbol, name, price, reason):
        try:
            with open(self.log_file, 'a', newline='', encoding='utf-8-sig') as f:
                writer = csv.writer(f)
                writer.writerow([timestamp, action, symbol, name, price, reason])
        except Exception as e:
            print(f"❌ 写入交易日志失败: {e}")
13
utils_id.py
Normal file
@@ -0,0 +1,13 @@
# -*- coding: utf-8 -*-
import hashlib


def stable_symbol_id(symbol: str, exchange: str = "US") -> int:
    """Generate a stable positive 64-bit int ID from symbol+exchange.

    Collisions are extremely unlikely for our scale.
    """
    base = f"{exchange}:{symbol}".upper().encode("utf-8")
    h = hashlib.sha1(base).digest()
    # take first 8 bytes as unsigned 64-bit integer
    val = int.from_bytes(h[:8], byteorder="big", signed=False)
    # constrain to 63-bit to avoid CSV tools issues with signedness
    return val & ((1 << 63) - 1)
26
utils_time.py
Normal file
@@ -0,0 +1,26 @@
from datetime import datetime
from zoneinfo import ZoneInfo

ET_TZ = ZoneInfo('America/New_York')
UTC_TZ = ZoneInfo('UTC')


def now_et():
    return datetime.now(ET_TZ)


def now_utc():
    return datetime.now(UTC_TZ)


def fmt_et(dt: datetime | None = None, with_date: bool = True) -> str:
    if dt is None:
        dt = now_et()
    return dt.strftime('%Y-%m-%d %H:%M:%S' if with_date else '%H:%M:%S')


def fmt_et_hm(dt: datetime | None = None) -> str:
    if dt is None:
        dt = now_et()
    return dt.strftime('%H:%M:%S')


def fmt_utc(dt: datetime | None = None) -> str:
    if dt is None:
        dt = now_utc()
    return dt.strftime('%Y-%m-%d %H:%M:%S')
335
盘前操作.md
Normal file
@@ -0,0 +1,335 @@
## 盘前数据量化流程

环境激活(Windows):

```bash
C:\Users\86188\miniconda3\Scripts\activate
```

1. **数据清洗与特征工程**
   - 读取 `premarket_bars.csv`,筛选 session=pre 的数据。
   - 计算盘前涨跌幅(change_ratio)、与前收盘价对比(pre_return_vs_prev_close)、流动性 proxy(如 pre_volume)。
   - 生成 `premarket_features.csv`,为后续量化模型和大模型推理提供输入。

2. **信号生成与策略设计**
   - 规则法:如盘前涨幅 >3% 生成 BUY 信号,<-3% 生成 SELL 信号。
   - 多因子法:结合盘前特征、历史表现、异动分布等,设计量化打分模型。
   - 大模型法:将盘前特征、历史数据、市场新闻等输入 LLM,生成多维度信号与解读。
   - 信号写入 `premarket_signals.csv`,记录来源、置信度、推理摘要。

3. **回测与绩效评估**
   - 用盘前信号与历史行情进行回测,评估策略收益、风险、胜率。
   - 对比规则法、多因子法与大模型法的表现,优化信号生成逻辑。
   - 结果归档于回测报告,可用大模型自动生成策略总结。

4. **自动化交易与风控**
   - 盘前信号可自动推送至交易系统,支持模拟盘与实盘。
   - 结合大模型生成的风险提示,动态调整仓位与风控参数。
   - 失败样本与异常信号自动归档,便于后续诊断与模型迭代。

5. **大模型协同分析**
   - 盘前数据、信号、回测结果可作为 prompt,自动生成策略文档、异动解读、风险提示。
   - 支持多轮问答与因子解释,提升量化工程师与大模型协作效率。

6. **监控与持续优化**
   - 盘前数据与信号归档,定期分析成功率、异常分布、策略表现。
   - 结合大模型自动诊断与修复建议,持续优化量化流程。
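上述第 2 步的规则法(±3% 阈值)逻辑很简单,可用如下最小示意表达(与 premarket_watch.py 中的阈值判断一致;`change_ratio` 为小数形式):

```python
# 规则法信号的最小示意: 盘前涨跌幅(小数形式)触发 ±3% 阈值
def rule_signal(change_ratio):
    """change_ratio 为小数形式(0.0402 表示 +4.02%); 返回 'BUY'/'SELL'/None"""
    if change_ratio is None:
        return None  # 比例缺失/解析失败时不产生信号
    if change_ratio >= 0.03:
        return 'BUY'
    if change_ratio <= -0.03:
        return 'SELL'
    return None

assert rule_signal(0.0402) == 'BUY'
assert rule_signal(-0.038) == 'SELL'
assert rule_signal(0.01) is None
```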
---

# 盘前操作说明

**下一步建议:结合大模型与量化工程最佳实践**

1. **数据质量与多源融合**
   - 富途/东方财富/Yahoo 多源融合,自动回退与异常检测。
   - 失败样本自动归档,便于大模型后续异常分析与数据增强。

2. **盘前特征工程与大模型输入**
   - 盘前特征扩展:如 pre_return_vs_prev_close、流动性 proxy、spread、异动分布等。
   - 直接生成 `premarket_features.csv`,为大模型训练/推理提供结构化输入。

3. **信号生成与大模型辅助决策**
   - 传统规则(±3%)与大模型(如 LLM/LLM+因子融合)并行生成信号,支持模型版本号与推理参数落地。
   - 盘前信号可通过 prompt/embedding 送入 LLM,生成更丰富的“解读”与“风险提示”。

4. **冷却与去重治理**
   - 复用 signal_filter.py,支持大模型信号冷却窗口与多因子去重。
   - 信号写入时记录模型来源、置信度、推理摘要。

5. **自动化回测与监控**
   - 盘前数据与信号自动归档,定期触发回测脚本,评估大模型与传统规则的表现。
   - ETL_RUNS/health 文件记录成功率、耗时、异常分布,便于大模型诊断。

6. **大模型集成与推理链路**
   - 盘前数据可直接作为 LLM 输入(如“请分析今日盘前异动并生成交易建议”),支持 prompt 工程与多轮推理。
   - 结合历史数据,自动生成 prompt,支持多模型对比(如 GPT-4/Claude/自研模型)。

7. **告警与智能解释**
   - 盘前信号异常/异动自动推送至 Slack/邮件,并由大模型生成“解读”与“操作建议”。
   - 失败样本自动归档,定期由大模型分析原因并给出修复建议。

8. **数据库与高性能存储**
   - 逐步迁移 CSV → SQLite/PostgreSQL,支持高频查询与大模型批量推理。
   - 盘前数据表结构可直接映射为大模型训练/推理数据集。

9. **可扩展 prompt 工程**
   - 设计 prompt 模板,自动填充盘前特征、信号、历史表现,提升大模型推理效果。
   - 支持“多轮问答”与“因子解释”,便于策略迭代。

10. **量化工程师与大模型协作流程**
    - 盘前数据自动归档,量化工程师可随时调用大模型分析盘前异动、生成策略建议。
    - 结合大模型自动生成的“策略文档”,实现人机协同决策。

**推荐大模型应用场景**
- 盘前异动解读与自动生成交易建议
- 盘前信号置信度评估与风险提示
- 失败样本自动诊断与修复建议
- prompt 工程与多轮推理链路设计
- 量化策略文档自动生成与归档

---

- 保持时间字段可跨时区比对:以 UTC 为主存、同时记录 ET(美东)用于展示
- 生成可控的预警信号并记录信号来源与冷却策略
---

**总体架构**
- 抓取层:`premarket_watch.py`(实时/交互)、`monitor.py`(批量/生产)负责触发抓取
- 解析层:`futu.py` 中 `FutuStockParser.parse_javascript_data` / `parse_price_data`,并增加健壮性与回退(见下节)
- 持久化层:`data_writer.py` 将快照写入 `bars_1m.csv`(新增 `session` 字段),并支持 `append_bars_session` 写 `session=pre`
- 信号层:`market_analyzer.py` / `signal_filter.py` 负责信号生成与冷却规则
- 监控/告警:日志 + ETL 统计(`etl_runs.csv`)+ 失败 HTML dump
---

**抓取策略(要点)**
- 优先抓取来源:富途(`futu`)中 `before_open_stock_info`;若富途失败,再使用东方财富 / Yahoo Finance 回退
- 抓取并发:`premarket_watch.py` 支持 `--max-workers`,建议初期将并发数限制在 4-8,避免被风控
- 重试与降级:每个 symbol 最多 2 次重试(指数退避 0.5s -> 1s);失败时保存 HTML:`data/failed_{symbol}_{ts}.html`
- 验证:抓到的 HTML/JSON 做基本校验(长度、是否包含 `__INITIAL_STATE__`、是否包含价格正则),否则视为失败
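“最多 2 次重试、0.5s -> 1s 指数退避”的约定可以这样包装(`fetch` 为假设的抓取函数,实际对应 futu.py 中的抓取入口;失败样本落盘由上层负责):

```python
import time

def fetch_with_retry(fetch, symbol, retries=2, base_delay=0.5):
    """按上文约定重试: 首次失败退避 0.5s, 再次失败退避 1s, 共最多 2 次重试"""
    for attempt in range(retries + 1):
        try:
            return fetch(symbol)
        except Exception:
            if attempt == retries:
                raise  # 重试用尽, 交由上层记录失败样本 / 触发告警
            time.sleep(base_delay * (2 ** attempt))
```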
---

**时间与时区约定**
- 存储(CSV / DB)均以 UTC 为主(字段名以 `_utc` 结尾),便于跨时区一致性回测
- 对外展示与终端打印使用 ET(美东,`America/New_York`),代码中使用 `utils_time.py` 的 `fmt_et()` / `fmt_et_hm()`
- 在每条记录中同时保留 `ts_utc` 与 `ts_et`(后者可选),或只保留 `ts_utc` 并在查询/展示层动态格式化为 ET
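“UTC 主存、ET 展示”的转换用标准库 zoneinfo 即可完成,示意如下:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# 存储层写入 UTC 时间, 展示层再转换为美东时间
ts_utc = datetime(2025, 1, 2, 13, 30, tzinfo=timezone.utc)
ts_et = ts_utc.astimezone(ZoneInfo('America/New_York'))
assert ts_et.strftime('%H:%M') == '08:30'  # 1 月为冬令时(UTC-5), 13:30 UTC 即盘前 08:30 ET
```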
---

**文件/表 设计(CSV 优先,后续可迁移到 PostgreSQL)**

- 文件命名(data/ 目录)
  - `premarket_bars.csv`(盘前快照)
  - `premarket_signals.csv`(盘前生成的信号/预警)
  - `premarket_features.csv`(若需盘前特征)
  - `failed_html/` 存放抓取失败的 HTML,便于人工排查

- `premarket_bars.csv` 列(CSV)
  - symbol_id (int)
  - symbol (text)
  - ts_utc (ISO UTC)
  - ts_et (ISO ET) -- 可选,便于人工查看
  - price (float)
  - change (float)
  - change_ratio (float) -- 小数表示,例如 -0.038 表示 -3.8%
  - volume (int/empty)
  - source (text) -- 'futu' / 'eastmoney' / 'yahoo'
  - session (text) -- 'pre' / 'regular' / 'post'
  - raw_file (text) -- 若保存了原始 HTML/JSON 的文件名

- `premarket_signals.csv` 列
  - id (text) -- 如 symbolid-生成时间
  - symbol_id, symbol
  - generated_at_utc
  - signal_type ('premarket_alert')
  - direction ('BUY'/'SELL')
  - score (float)
  - reason (text)
  - params_json (text) -- 包含触发字段(例如 pre_price, pre_change_ratio)
  - model_name, version
  - expires_at_utc
- PostgreSQL 示例 DDL(简化)

  ```sql
  CREATE TABLE premarket_bars (
      id BIGSERIAL PRIMARY KEY,
      symbol TEXT NOT NULL,
      symbol_id BIGINT,
      ts_utc TIMESTAMPTZ NOT NULL,
      price NUMERIC,
      change NUMERIC,
      change_ratio NUMERIC,
      source TEXT,
      session TEXT,
      raw_file TEXT
  );

  CREATE INDEX idx_premarket_bars_symbol_ts ON premarket_bars(symbol, ts_utc DESC);

  CREATE TABLE premarket_signals (
      id TEXT PRIMARY KEY,
      symbol TEXT,
      symbol_id BIGINT,
      generated_at_utc TIMESTAMPTZ,
      direction TEXT,
      score NUMERIC,
      reason TEXT,
      params JSONB
  );
  ```
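CSV → 数据库过渡期,幂等写入(同一 `(symbol, ts_utc)` 重复写不产生重复行)在 SQLite 侧可以这样做(最小示意,用内存库演示,字段取上表子集;PG 侧对应 `ON CONFLICT DO UPDATE`):

```python
import sqlite3

# 用 SQLite 内存库演示 premarket_bars 的幂等写入(字段为上表子集)
con = sqlite3.connect(':memory:')
con.execute("""
    CREATE TABLE premarket_bars (
        symbol       TEXT NOT NULL,
        ts_utc       TEXT NOT NULL,
        price        REAL,
        change_ratio REAL,
        source       TEXT,
        session      TEXT,
        PRIMARY KEY (symbol, ts_utc)
    )
""")
row = ('AAPL', '2025-01-02T13:30:00Z', 103.0, 0.03, 'futu', 'pre')
sql = "INSERT OR REPLACE INTO premarket_bars VALUES (?, ?, ?, ?, ?, ?)"
con.execute(sql, row)
con.execute(sql, row)  # 重复写入同一 (symbol, ts_utc) 只保留一行
assert con.execute("SELECT COUNT(*) FROM premarket_bars").fetchone()[0] == 1
```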
---

**ETL 流程建议(每轮)**
1. Fetch: 按配置的 symbol 列表并发抓取 `futu` 页面/JS 数据
2. Validate: 校验数据字段完整性(price 非空、change_ratio 可解析)
3. Persist raw: 抓到的原始 HTML/JSON(仅失败或配置为保存)写 `failed_html/` 或 `raw/`
4. Normalize: 将涨跌幅转换为小数、将价格转浮点
5. Persist bar: 写 `premarket_bars.csv` 或入库 `premarket_bars`
6. Feature/Signal: 基于规则或模型生成预警信号,写 `premarket_signals.csv`
7. Stats/ETL: 写一条 `etl_runs.csv`(fetched_count, signal_count, duration)
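其中第 4 步(Normalize)把页面上 “+3.85%” 这类文本统一成小数形式,最小示意(解析失败返回 None,正好配合第 2 步的“change_ratio 可解析”校验):

```python
def normalize_ratio(raw):
    """'+3.85%' -> 0.0385; 无法解析时返回 None"""
    try:
        return float(str(raw).strip().rstrip('%')) / 100
    except ValueError:
        return None

assert abs(normalize_ratio('+3.85%') - 0.0385) < 1e-12
assert normalize_ratio('N/A') is None
```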
---

**推荐盘前特征(可在 `premarket_features.csv` 存储)**
- pre_return_vs_prev_close = (pre_price / prev_close) - 1
- pre_vs_open = (pre_price / open_price) - 1
- liquidity_proxy: pre_volume(若可获得)或估计成交强度
- spread_estimate: 若能获取买卖价则计算
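前两个比率特征的直接实现(纯标量示意;批量计算可在 pandas 中按同样公式对整列操作):

```python
# 上表前两个比率特征: 盘前价相对前收盘 / 相对开盘价的收益率
def premarket_return_features(pre_price, prev_close, open_price):
    return {
        'pre_return_vs_prev_close': pre_price / prev_close - 1,
        'pre_vs_open': pre_price / open_price - 1,
    }

feats = premarket_return_features(pre_price=103.0, prev_close=100.0, open_price=101.0)
# pre_return_vs_prev_close = 0.03, 即盘前较前收盘上涨 3%
```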
---

**信号治理与安全策略**
- 冷却窗口:相同 (symbol, direction) 最小冷却 30 分钟(`signal_filter.py` 已实现)
- 过度并发保护:对富途页面调用施加 `--max-workers` 限制,建议生产值 4~8
- 失败与告警:当连续 N 次(例如 5 次)抓取某个 symbol 失败,发出报警并暂停该 symbol 的抓取
- 可选阈值:盘前涨幅 > +3% 发出 BUY 预警,< -3% 发 SELL 预警(可配置)
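冷却窗口的行为可用一个与 `signal_filter.py` 的 `SignalCooldownFilter` 同逻辑的简化版验证(为保持自包含并便于演示,这里内联实现并允许注入当前时间;仓库内请直接复用 `signal_filter.py`):

```python
from datetime import datetime, timedelta, timezone

class MiniCooldown:
    """与 SignalCooldownFilter 同逻辑的简化版: 相同 (symbol, direction) 冷却窗口"""
    def __init__(self, cooldown_minutes=30):
        self.cooldown = timedelta(minutes=cooldown_minutes)
        self.last_time = {}

    def accept(self, symbol, direction, now=None):
        now = now or datetime.now(timezone.utc)
        key = (symbol, direction)
        last = self.last_time.get(key)
        if last is not None and now - last < self.cooldown:
            return False  # 冷却期内, 丢弃同方向重复信号
        self.last_time[key] = now
        return True

f = MiniCooldown()
t0 = datetime(2025, 1, 2, 9, 0, tzinfo=timezone.utc)
assert f.accept('AAPL', 'BUY', t0)
assert not f.accept('AAPL', 'BUY', t0 + timedelta(minutes=10))  # 30 分钟内被过滤
assert f.accept('AAPL', 'SELL', t0 + timedelta(minutes=10))     # 方向不同不受影响
assert f.accept('AAPL', 'BUY', t0 + timedelta(minutes=31))      # 冷却期结束
```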
---

**监控与告警**
- ETL 日志(`etl_runs.csv`)用于监控采集稳定性(fetched_count 与 error rate)
- 将 `failed_html/` 的数量作为健康指标;若短时间内增多,说明被风控/结构变化
- 可集成邮件/Slack 通知:当出现大盘前信号或连续抓取失败时通知运维/策略人员
---

**存储/归档与保留策略**
- 快照保存期:`premarket_bars.csv` 按天轮换或周期归档;建议线上保留 90 天的高频数据,长期数据归入冷存(S3)
- raw HTML:仅保存失败样本,或每 N 次保存一次示例,避免占满磁盘
---

**工具链与代码位置**
- 抓取/解析:`futu.py`(`FutuStockParser.parse_javascript_data` / `parse_price_data`)
- 实时监控:`premarket_watch.py`(已支持多线程、ET 时间显示、失败回存)
- 持久化:`data_writer.py`(新增 `session` 字段与 `append_bars_session`)
- 时间工具:`utils_time.py`(ET/UTC 格式化)
---

**示例命令**
- 单次 10 只并发抓取并显示(用于检查):

  ```bash
  python premarket_watch.py --limit 10 --once --force --max-workers 8
  ```

- 持续运行(每 30s 刷新):

  ```bash
  python premarket_watch.py --limit 20 --interval 30 --max-workers 6
  ```

- 保存盘前快照和信号(写入 `data/premarket_bars.csv` / `data/premarket_signals.csv`):

  ```bash
  python premarket_watch.py --limit 25 --interval 60 --save --max-workers 6 --force
  ```

运行后可在 `data/` 目录看到:
- `premarket_bars.csv` 新增行(session=pre, change_ratio 为小数)
- `premarket_signals.csv` BUY/SELL 阈值信号(±3%)
- `symbols.csv` 自动补充缺失的 symbol 基础信息
---

### 盘前数据清洗与特征工程详细操作

1. **读取与筛选盘前数据**
   - 使用 pandas 或 csv 库读取 `data/premarket_bars.csv`。
   - 仅保留 `session=pre` 的行。
   - 示例代码(pandas):

     ```python
     import pandas as pd

     df = pd.read_csv('data/premarket_bars.csv')
     # copy() 使 pre_df 成为独立副本, 后续新增特征列时不会触发 SettingWithCopyWarning
     pre_df = df[df['session'] == 'pre'].copy()
     ```

2. **计算盘前特征**
   - 盘前涨跌幅:直接使用 `change_ratio` 列。
   - 与前收盘价对比(pre_return_vs_prev_close):需关联前一天收盘价(可从历史 bars 或 eastmoney/yahoo 数据获取),公式:

     ```python
     # 假设 pre_df 已合并 prev_close 列
     pre_df['pre_return_vs_prev_close'] = pre_df['price'] / pre_df['prev_close'] - 1
     ```

   - 流动性 proxy(如 pre_volume):如有 volume 字段直接用,否则可用成交额/市值等近似。

3. **生成特征文件**
   - 选取需要的特征列,如 symbol, ts_utc, price, change_ratio, pre_return_vs_prev_close, volume。
   - 保存为 `data/premarket_features.csv`。
   - 示例代码:

     ```python
     feature_cols = ['symbol', 'ts_utc', 'price', 'change_ratio', 'pre_return_vs_prev_close', 'volume']
     pre_df[feature_cols].to_csv('data/premarket_features.csv', index=False)
     ```
4. **数据源补充说明**
   - 若 `prev_close` 或 `volume` 缺失,可用 `eastmoney` 或 `yahoo` 的历史行情接口补齐。
   - 推荐先用 pandas 合并历史收盘价,再批量计算特征。

5. **自动化脚本建议**
   - 可将上述流程封装为 `etl_premarket_features.py`,每日盘前自动运行。
   - 支持异常处理与日志输出,便于后续大模型分析。
---

### 前收盘价获取方法

1. **数据来源**
   - 东方财富(EastMoneyAPI):在 `futu.py` 的 `parse_stock_data` 方法中,已解析 `prev_close` 字段(f18),可用于美股主流标的。
   - 富途:部分页面可解析前收盘价,但稳定性略低,建议优先用东方财富。
   - Yahoo Finance:如需补充,可用 yfinance 或 requests 获取历史收盘价。

2. **自动补齐流程**
   - 在盘前特征工程脚本中,先读取 `premarket_bars.csv`;如无 prev_close 字段,则批量调用东方财富 API,按 symbol 补齐。
   - 示例代码(pandas):

     ```python
     import pandas as pd
     from futu import EastMoneyAPI

     df = pd.read_csv('data/premarket_bars.csv')
     api = EastMoneyAPI()

     # 一次性拉取行情并缓存 symbol -> prev_close 映射,
     # 避免对每个 symbol 重复请求(page_size 视接口单页上限调整)
     stocks, _ = api.get_us_stocks(page_size=200)
     prev_close_map = {}
     for item in stocks:
         data = api.parse_stock_data(item)
         if data:
             prev_close_map[data['symbol']] = data['prev_close']

     df['prev_close'] = df['symbol'].map(prev_close_map)
     ```

   - 若需高效批量补齐,可如上例提前缓存 symbol→prev_close 映射,避免逐条线性查找。
3. **补充说明**
   - 若已在 `premarket_bars.csv` 生成时写入 prev_close 字段,则无需后处理。
   - 若需用 Yahoo Finance,可用 yfinance 库:

     ```python
     import yfinance as yf

     def get_prev_close_yahoo(symbol):
         ticker = yf.Ticker(symbol)
         hist = ticker.history(period='2d')
         if len(hist) >= 2:
             return hist['Close'].iloc[-2]
         return None
     ```

   - 推荐在 ETL/特征工程脚本中自动补齐,保证后续量化分析一致性。
---

文档作者: AI 量化工程师(为当前代码库改造)