init commit

This commit is contained in:
徐微
2025-12-08 15:30:19 +08:00
commit 09193a2288
39 changed files with 16688 additions and 0 deletions

286
DATA_SCHEMA.md Normal file

@@ -0,0 +1,286 @@
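# Quantitative Trading Data Table Design (Proposed)
This document proposes a data model that serves both local research (SQLite) and production deployment (PostgreSQL). The goals are reproducibility, extensibility, and convenient backtesting and live monitoring, with support for market data, fundamentals, news sentiment, signals, orders, executions, positions, and risk metrics.
## Design principles
- Unified time: all timestamps are in UTC; columns are named `*_utc` or `as_of_date` (a UTC date).
- Stable primary keys: prefer composite primary keys built from natural business keys (e.g. `(symbol_id, ts_utc)`); use a surrogate key (auto-increment or UUID) for cross-system alignment.
- Numeric conventions:
  - Prices and amounts use `NUMERIC(18,6)` (PG) or `REAL`/`DECIMAL` (SQLite).
  - Percentage changes and ratios are stored in decimal form: 4.02% is stored as `0.0402`.
- Idempotent writes: every ingestion and computed table supports upsert through a unique key plus `ON CONFLICT DO UPDATE` (PG) or `INSERT OR REPLACE` (SQLite).
- Partitioning and retention: partition high-frequency tables by time (PG) and define a retention policy (e.g. ticks kept for 30 days, 1-minute bars for 180 days, daily bars kept long term).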
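The time and ratio conventions above can be illustrated with a short Python sketch. The helper names `pct_to_ratio` and `to_utc_iso` are hypothetical and do not come from the project; using `Decimal` for an exact conversion is an assumption about how callers may want to handle percentages.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def pct_to_ratio(pct_text: str) -> Decimal:
    """Convert a percentage string such as '4.02%' to its decimal form."""
    cleaned = pct_text.strip().rstrip("%")
    return Decimal(cleaned) / Decimal(100)


def to_utc_iso(dt: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC ISO-8601 string."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone before converting")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical inputs: a quote change and a Beijing-time (UTC+8) timestamp.
beijing = timezone(timedelta(hours=8))
ratio = pct_to_ratio("4.02%")
stamp = to_utc_iso(datetime(2025, 11, 25, 19, 20, 50, tzinfo=beijing))
```

Rejecting naive datetimes is a deliberate choice in this sketch: silently assuming a timezone is a common source of off-by-hours errors in market data.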
## Tables and fields
### 1) `symbols` (instrument master table)
- Purpose: central metadata for stocks and ETFs.
- Primary key: `id` (PK, auto-increment or UUID); unique constraint: `(symbol, exchange)`.
- Fields:
  - `id`: PK
  - `symbol`: ticker (e.g. `AAPL`)
  - `name`: name
  - `exchange`: exchange (e.g. `NASDAQ`)
  - `currency`: currency (e.g. `USD`)
  - `tick_size`, `lot_size`
  - `sector`, `industry`
  - `is_active`: boolean
  - `first_seen_utc`, `last_seen_utc`
### 2) `calendars` (trading calendar)
- Primary key: `(exchange, date)`.
- Fields: `is_trading_day`, `open_time_utc`, `close_time_utc`, `notes`.
### 3) `bars_1m` (1-minute bars)
- Primary key: `(symbol_id, ts_utc)` (`ts_utc` is the start of the minute).
- Indexes: `(ts_utc)`, `(symbol_id, ts_utc DESC)`.
- Fields: `open`, `high`, `low`, `close`, `volume`, `vwap`, `trades_count`, `source`.
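As a rough illustration of how `bars_1m` rows relate to raw ticks, the sketch below aggregates `(ts_utc, price, size)` tuples into 1-minute bars keyed by the minute start. The function names and the tick tuple layout are assumptions made for this example, not the project's actual ingestion code.

```python
from datetime import datetime, timezone


def floor_minute(ts: datetime) -> datetime:
    """Return the start of the minute that contains ts, in UTC."""
    return ts.astimezone(timezone.utc).replace(second=0, microsecond=0)


def build_bars_1m(ticks):
    """Aggregate time-sorted (ts_utc, price, size) ticks into 1-minute bars."""
    bars = {}
    for ts, price, size in ticks:
        key = floor_minute(ts)
        bar = bars.get(key)
        if bar is None:
            bars[key] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": size, "trades_count": 1,
                         "notional": price * size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen in the minute
            bar["volume"] += size
            bar["trades_count"] += 1
            bar["notional"] += price * size
    for bar in bars.values():
        # vwap = sum(price * size) / sum(size); left as None for zero volume
        notional = bar.pop("notional")
        bar["vwap"] = notional / bar["volume"] if bar["volume"] else None
    return bars


def _t(minute, second):
    return datetime(2025, 11, 25, 11, minute, second, tzinfo=timezone.utc)


# Made-up ticks: three in minute 11:20 and one in 11:21.
ticks = [(_t(20, 5), 100.0, 10), (_t(20, 30), 102.0, 30),
         (_t(20, 59), 101.0, 10), (_t(21, 1), 103.0, 5)]
bars = build_bars_1m(ticks)
```

Keying each bar by the minute start matches the `ts_utc` convention stated for the primary key, so the result can be upserted on `(symbol_id, ts_utc)` directly.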
### 4) `bars_1d` (daily bars)
- Primary key: `(symbol_id, as_of_date)`.
- Fields: `open`, `high`, `low`, `close`, `adj_close`, `volume`, `dividend`, `split_ratio`, `source`.
### 5) `ticks` (tick-by-tick / Level-1 snapshots, optional)
- Primary key: `id` (auto-increment or UUID); suggested unique key: `(symbol_id, ts_utc, source, seq)`.
- Fields: `price`, `size`, `bid`, `ask`, `bid_size`, `ask_size`, `condition`, `seq`, `source`.
### 6) `corporate_actions` (corporate actions)
- Primary key: `(symbol_id, ex_date, type)`.
- Fields: `type` (`split`/`dividend`/...), `amount`, `ratio`, `currency`, `notes`.
### 7) `fundamentals_snapshot` (fundamentals snapshot)
- Primary key: `(symbol_id, as_of_date)`.
- Example fields: `market_cap`, `pe_ttm`, `ps_ttm`, `pb`, `eps_ttm`, `revenue_ttm`, `shares_outstanding`, `updated_at_utc`.
### 8) `news` and `news_symbols` (news and link table)
- `news` primary key: `id` (UUID or auto-increment); fields: `published_at_utc`, `source`, `title`, `url`, `summary`, `sentiment_score`, `topics`.
- `news_symbols` primary key: `(news_id, symbol_id)`.
### 9) `signals` (strategy signals)
- Primary key: `id` (UUID or auto-increment); suggested unique key: `(symbol_id, generated_at_utc, model_name, version)`.
- Fields:
  - `symbol_id`, `generated_at_utc`
  - `signal_type` (e.g. `momentum`/`reversal`)
  - `direction` (`BUY`/`SELL`/`HOLD`)
  - `score` (0 to 1, or a z-score)
  - `horizon` (e.g. `1d`/`1h`)
  - `params_json` (strategy parameters as JSON)
  - `model_name`, `version`
  - `expires_at_utc` (expiry time, nullable)
### 10) `orders` (orders)
- Primary key: `id`; suggested unique key: `broker_order_id` (when connected to live trading).
- Fields: `signal_id`, `symbol_id`, `side`, `order_type`, `qty`, `price`, `time_in_force`, `status`, `created_at_utc`, `updated_at_utc`, `broker_order_id`.
### 11) `executions` (fills / execution reports)
- Primary key: `id`; indexes: `(order_id)`, `(exec_time_utc)`.
- Fields: `order_id`, `exec_time_utc`, `price`, `qty`, `fee`, `liquidity` (`maker`/`taker`).
### 12) `positions` (position snapshots)
- Primary key: `(portfolio_id, symbol_id)`, or add `as_of_date` for an end-of-day table.
- Fields: `qty`, `avg_cost`, `unrealized_pnl`, `realized_pnl`, `last_updated_utc`.
### 13) `portfolios` / `portfolio_nav_daily` (portfolios and NAV)
- `portfolios`: `id` (PK), `name`, `base_currency`, `created_at_utc`.
- `portfolio_nav_daily` primary key: `(portfolio_id, as_of_date)`; fields: `cash`, `equity_value`, `nav`, `daily_return`, `gross_exposure`, `net_exposure`.
### 14) `risk_metrics_daily` (risk metrics)
- Primary key: `(portfolio_id, as_of_date)`; fields: `var_95`, `beta`, `sharpe`, `max_drawdown`, `volatility_20d`, and so on.
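To make one of these metrics concrete, here is a minimal sketch of how `max_drawdown` could be derived from a daily NAV series. The document does not specify a sign convention; this example assumes a non-positive decimal ratio (e.g. `-0.25` for a 25% decline), consistent with the decimal convention used for other ratios. The NAV values are invented.

```python
def max_drawdown(nav):
    """Largest peak-to-trough decline of a NAV series, as a ratio <= 0."""
    peak = nav[0]
    worst = 0.0
    for value in nav:
        peak = max(peak, value)                 # running high-water mark
        worst = min(worst, value / peak - 1.0)  # decline from that peak
    return worst


# Invented NAV path: peak at 120, trough at 80 before recovering.
nav = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
mdd = max_drawdown(nav)
```

A single pass with a running peak is enough here, so the metric can be refreshed incrementally as each new `portfolio_nav_daily` row arrives.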
### 15) `etl_runs` (task run metadata)
- Primary key: `run_id`; fields: `task_name`, `started_at_utc`, `finished_at_utc`, `status`, `rows_affected`, `checksum`.
## PostgreSQL example DDL (core tables)
```sql
-- 1) Symbols
CREATE TABLE symbols (
id BIGSERIAL PRIMARY KEY,
symbol TEXT NOT NULL,
name TEXT,
exchange TEXT NOT NULL,
currency TEXT DEFAULT 'USD',
tick_size NUMERIC(18,6),
lot_size NUMERIC(18,6),
sector TEXT,
industry TEXT,
is_active BOOLEAN DEFAULT TRUE,
first_seen_utc TIMESTAMPTZ,
last_seen_utc TIMESTAMPTZ,
UNIQUE(symbol, exchange)
);
-- 2) 1-minute bars
CREATE TABLE bars_1m (
symbol_id BIGINT NOT NULL REFERENCES symbols(id),
ts_utc TIMESTAMPTZ NOT NULL,
open NUMERIC(18,6) NOT NULL,
high NUMERIC(18,6) NOT NULL,
low NUMERIC(18,6) NOT NULL,
close NUMERIC(18,6) NOT NULL,
volume BIGINT,
vwap NUMERIC(18,6),
trades_count INTEGER,
source TEXT,
PRIMARY KEY(symbol_id, ts_utc)
);
CREATE INDEX ON bars_1m (ts_utc);
CREATE INDEX ON bars_1m (symbol_id, ts_utc DESC);
-- 3) Daily bars
CREATE TABLE bars_1d (
symbol_id BIGINT NOT NULL REFERENCES symbols(id),
as_of_date DATE NOT NULL,
open NUMERIC(18,6) NOT NULL,
high NUMERIC(18,6) NOT NULL,
low NUMERIC(18,6) NOT NULL,
close NUMERIC(18,6) NOT NULL,
adj_close NUMERIC(18,6),
volume BIGINT,
dividend NUMERIC(18,6),
split_ratio NUMERIC(18,6),
source TEXT,
PRIMARY KEY(symbol_id, as_of_date)
);
-- 4) Signals (store percentage changes and similar metrics as decimals)
CREATE TABLE signals (
id BIGSERIAL PRIMARY KEY,
symbol_id BIGINT NOT NULL REFERENCES symbols(id),
generated_at_utc TIMESTAMPTZ NOT NULL,
signal_type TEXT NOT NULL,
direction TEXT NOT NULL CHECK (direction IN ('BUY','SELL','HOLD')),
score NUMERIC(18,6),
horizon TEXT,
params_json JSONB,
model_name TEXT,
version TEXT,
expires_at_utc TIMESTAMPTZ,
UNIQUE(symbol_id, generated_at_utc, model_name, version)
);
CREATE INDEX ON signals (symbol_id, generated_at_utc DESC);
-- 5) Orders and executions
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
signal_id BIGINT REFERENCES signals(id),
symbol_id BIGINT NOT NULL REFERENCES symbols(id),
side TEXT NOT NULL CHECK (side IN ('BUY','SELL')),
order_type TEXT NOT NULL CHECK (order_type IN ('MKT','LMT')),
qty NUMERIC(18,6) NOT NULL,
price NUMERIC(18,6),
time_in_force TEXT,
status TEXT NOT NULL,
broker_order_id TEXT,
created_at_utc TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at_utc TIMESTAMPTZ
);
CREATE UNIQUE INDEX IF NOT EXISTS orders_broker_unique ON orders(broker_order_id) WHERE broker_order_id IS NOT NULL;
CREATE TABLE executions (
id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL REFERENCES orders(id),
exec_time_utc TIMESTAMPTZ NOT NULL,
price NUMERIC(18,6) NOT NULL,
qty NUMERIC(18,6) NOT NULL,
fee NUMERIC(18,6),
liquidity TEXT
);
CREATE INDEX ON executions(order_id);
CREATE INDEX ON executions(exec_time_utc);
```
## SQLite example DDL (simplified)
```sql
CREATE TABLE symbols (
id INTEGER PRIMARY KEY AUTOINCREMENT,
symbol TEXT NOT NULL,
name TEXT,
exchange TEXT NOT NULL,
currency TEXT,
tick_size REAL,
lot_size REAL,
sector TEXT,
industry TEXT,
is_active INTEGER DEFAULT 1,
first_seen_utc TEXT,
last_seen_utc TEXT,
UNIQUE(symbol, exchange)
);
CREATE TABLE bars_1m (
symbol_id INTEGER NOT NULL,
ts_utc TEXT NOT NULL,
open REAL NOT NULL,
high REAL NOT NULL,
low REAL NOT NULL,
close REAL NOT NULL,
volume INTEGER,
vwap REAL,
trades_count INTEGER,
source TEXT,
PRIMARY KEY(symbol_id, ts_utc)
);
CREATE TABLE signals (
id INTEGER PRIMARY KEY AUTOINCREMENT,
symbol_id INTEGER NOT NULL,
generated_at_utc TEXT NOT NULL,
signal_type TEXT NOT NULL,
direction TEXT NOT NULL,
score REAL,
horizon TEXT,
params_json TEXT,
model_name TEXT,
version TEXT,
expires_at_utc TEXT,
UNIQUE(symbol_id, generated_at_utc, model_name, version)
);
```
## Typical indexes and queries
- Latest signal per symbol (uses PostgreSQL's `DISTINCT ON`):
```sql
SELECT DISTINCT ON (s.symbol_id)
s.*
FROM signals s
ORDER BY s.symbol_id, s.generated_at_utc DESC;
```
- Join 1-minute bars to signals (the 30 minutes following each signal):
```sql
SELECT b.*
FROM signals s
JOIN bars_1m b
ON b.symbol_id = s.symbol_id
AND b.ts_utc BETWEEN s.generated_at_utc AND s.generated_at_utc + INTERVAL '30 minutes'
WHERE s.generated_at_utc >= NOW() - INTERVAL '1 day';
```
- Daily return (computed from the raw `close`):
```sql
SELECT symbol_id,
as_of_date,
close / LAG(close) OVER (PARTITION BY symbol_id ORDER BY as_of_date) - 1 AS daily_return
FROM bars_1d;
```
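The latest-signal query above relies on `DISTINCT ON`, which is PostgreSQL-specific. For the local SQLite setup, one portable alternative is a correlated subquery, sketched below with Python's built-in `sqlite3` module. The trimmed table definition and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE signals (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        symbol_id INTEGER NOT NULL,
        generated_at_utc TEXT NOT NULL,
        direction TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO signals (symbol_id, generated_at_utc, direction) VALUES (?, ?, ?)",
    [
        (1, "2025-11-25T11:20:50Z", "BUY"),
        (1, "2025-11-25T11:25:00Z", "SELL"),
        (2, "2025-11-25T11:21:00Z", "HOLD"),
    ],
)

# ISO-8601 UTC strings in one fixed format sort chronologically,
# so MAX() on the TEXT column picks the most recent signal.
latest = conn.execute(
    """
    SELECT s.symbol_id, s.generated_at_utc, s.direction
    FROM signals s
    WHERE s.generated_at_utc = (
        SELECT MAX(s2.generated_at_utc)
        FROM signals s2
        WHERE s2.symbol_id = s.symbol_id
    )
    ORDER BY s.symbol_id
    """
).fetchall()
```

Unlike `DISTINCT ON`, this form returns every row that ties for the latest timestamp, so a symbol with two signals generated in the same second would appear twice.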
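## Retention and partitioning recommendations (PG)
- `ticks`: daily partitions, retained for 30 to 90 days.
- `bars_1m`: monthly partitions, retained for 180 to 365 days.
- `bars_1d`, `signals`, `orders`, `executions`: retained long term.
## Integration notes for the current project
- Numeric convention: ratios such as `change_ratio` are always stored as decimals (the code has been fixed accordingly); do not include `%` when writing.
- Rollout strategy:
  - Initially: a single SQLite file, convenient for development and backtesting;
  - Later: PostgreSQL with partitioning and materialized views for metrics (e.g. aggregated minute bars).
- Idempotent ETL: ingestion tasks upsert on `(symbol_id, ts_utc)` or `(symbol_id, as_of_date)` so that duplicate rows do not bias backtests.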
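A minimal sketch of such an idempotent write against SQLite, using the `INSERT OR REPLACE` form named in the design principles. The `upsert_bar` helper and the sample values are hypothetical, and the table is a trimmed version of `bars_1m`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bars_1m (
        symbol_id INTEGER NOT NULL,
        ts_utc TEXT NOT NULL,
        open REAL NOT NULL,
        high REAL NOT NULL,
        low REAL NOT NULL,
        close REAL NOT NULL,
        volume INTEGER,
        PRIMARY KEY (symbol_id, ts_utc)
    )
    """
)


def upsert_bar(conn, bar):
    """Write one bar; a second write with the same key replaces the first."""
    conn.execute(
        "INSERT OR REPLACE INTO bars_1m "
        "(symbol_id, ts_utc, open, high, low, close, volume) "
        "VALUES (:symbol_id, :ts_utc, :open, :high, :low, :close, :volume)",
        bar,
    )


bar = {"symbol_id": 1, "ts_utc": "2025-11-25T11:20:00Z",
       "open": 100.0, "high": 101.0, "low": 99.5, "close": 100.5, "volume": 1200}
upsert_bar(conn, bar)
# The same minute is fetched again with an updated close and volume.
upsert_bar(conn, dict(bar, close=100.8, volume=1500))

rows = conn.execute("SELECT close, volume FROM bars_1m").fetchall()
```

Note that `INSERT OR REPLACE` deletes the conflicting row and inserts a new one, so any column omitted from the statement falls back to its default; callers should always supply the full row.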
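## Future extensions
- `features_*`: wide factor/feature tables (split by frequency: minute/daily).
- `models_registry`: model registration and version tracking.
- `backtest_runs`: backtest jobs and result metrics (returns, drawdown, chi-square tests, etc.).
---
If needed, I can generate a SQLite database directly from the DDL above and have `monitor.py` write the Top N 1-minute bars and signals to it after each fetch round (via upsert), for later backtesting and visualization.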

75
README.md Normal file

@@ -0,0 +1,75 @@
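# AI Stock Trading Assistant (simulated quantitative trading system)
This is a simple Python-based quantitative trading simulator. It automatically fetches real-time US stock quotes, analyzes them against preset strategies, and simulates trade execution (by writing log records).
## Features
* **Data fetching**: integrates the East Money (东方财富) API and Futu (富途牛牛) web page parsing to obtain real-time US stock prices and percentage changes.
* **Automatic monitoring**: `monitor.py` scans the market in a loop and tracks stock movements in real time.
* **Strategy analysis**: ships with a simple trend-following strategy (percentage-change thresholds) that can be extended with LLM-based analysis.
* **Simulated trading**: generates trade signals and records them in a CSV log; no real money is involved.
## Requirements
* Python 3.6+
* Dependencies:
  * `requests`
  * `beautifulsoup4`
## Installation
1. Clone or download this project.
2. Install the required Python libraries: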
```bash
pip install requests beautifulsoup4
```
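## Usage
### 1. Start the automatic monitoring system
Run `monitor.py` to start the fully automatic monitoring loop. The system periodically fetches data, analyzes it, and records trade signals.
```bash
# Default start (monitor the Top 100 stocks at a 60-second interval)
python monitor.py
# Custom monitoring interval (e.g. 30 seconds)
python monitor.py --interval 30
# Monitor more stocks (e.g. Top 200)
python monitor.py --limit 200
# Full monitoring (slower; fetches data for all US stocks)
python monitor.py --all
# With the pre-market options
python monitor.py --limit 10 --interval 5 --premarket --premarket-limit 3
```
Once running, trade records are written to `trade_log.csv` in real time.
### 2. Use the data fetching tool on its own
`futu.py` can run as a standalone tool that fetches data and saves it as CSV.
```bash
# Fetch the top 50 US stocks by market capitalization
python futu.py --top50
# Fetch the top 100 and save them to a file
python futu.py --top50 --limit 100 --output stocks.csv
# Use only the East Money data source (faster)
python futu.py --top50 --eastmoney-only
# Pre-market watcher script
python premarket_watch.py --limit 10 --force
```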
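## Project structure
* `monitor.py`: **Main program**. Schedules the data fetching, analysis, and trading modules and runs the monitoring loop.
* `futu.py`: **Data layer**. Contains `EastMoneyAPI` and `FutuStockParser`, which fetch stock data from the network.
* `market_analyzer.py`: **Strategy layer**. Takes quote data and generates buy/sell signals according to the strategy (e.g. percentage change > 5%).
* `trader.py`: **Execution layer**. Takes signals, simulates order placement, and writes the results to the log.
* `trade_log.csv`: **Log file**. Records the history of all simulated trades.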
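The threshold rule mentioned for the strategy layer could look roughly like the sketch below. This is not the code in `market_analyzer.py`: the function name, the symmetric SELL rule, and the sample quotes (including the placeholder ticker `XYZ`) are assumptions made for illustration. Ratios are in decimal form, so 5% is `0.05`.

```python
from typing import Optional


def threshold_signal(symbol: str, change_ratio: float,
                     threshold: float = 0.05) -> Optional[dict]:
    """Return a BUY/SELL signal when the move exceeds the threshold, else None."""
    if change_ratio > threshold:
        direction = "BUY"
    elif change_ratio < -threshold:
        direction = "SELL"  # assumed mirror of the BUY rule
    else:
        return None
    return {"symbol": symbol, "direction": direction, "change_ratio": change_ratio}


# Hypothetical quotes: symbol -> change ratio for the session.
quotes = {"GOOGL": 0.0631, "AAPL": 0.0120, "XYZ": -0.0710}
signals = [s for s in (threshold_signal(sym, chg) for sym, chg in quotes.items()) if s]
```

Returning `None` for moves inside the band keeps the caller simple: only actionable signals reach the execution layer.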
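## Disclaimer
This project is for learning and research only. All "trades" in the system are simulated and involve no real money. Investing carries risk; enter the market with caution.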

66
ROADMAP.md Normal file

@@ -0,0 +1,66 @@
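# System Upgrade Roadmap (Roadmap to AI Quant System)
Based on the architecture diagram you provided, the current system implements only the most basic "price monitoring" and "simulated order" features. Reaching the **full AI quantitative trading system** shown in the diagram (industry analysis, LLM-driven decisions, full-market monitoring, and so on) requires the following four stages of upgrades:
## Stage 1: Broader and deeper data (Data Layer)
`futu.py` currently fetches only price data; the system in the diagram needs more dimensions of information.
- [ ] **Full NASDAQ coverage (3,886 stocks)**
  - **Current state**: only Top N or simple paginated fetching is supported.
  - **Action**: improve `StockDataIntegrator` and maintain a complete NASDAQ constituent list (symbol list) so that monitoring has no blind spots.
- [ ] **Pre-market / after-hours data (Pre/Post-Market)**
  - **Current state**: some parsing logic exists in the code but is not fully enabled.
  - **Action**: make sure real-time quotes are also available outside regular US trading hours (16:00 - 21:30 Beijing time).
- [ ] **Unstructured data scraping (news / research reports)**
  - **Current state**: **missing**.
  - **Action**: build a new scraper module, `news_scraper.py`.
    - **Sina Finance / East Money**: scrape per-stock news flashes.
    - **Xueqiu**: scrape community discussion volume.
    - **Investment bank research**: scrape rating changes (upgrade/downgrade) and price targets from Goldman Sachs, Citi, and other institutions.
## Stage 2: Adding a "brain": LLMs and cloud analysis (Intelligence Layer)
This is the core of the "in-house LLM" and "cloud analysis" parts of the diagram; the logic in `market_analyzer.py` is currently too simple.
- [ ] **Integrate an LLM (large language model)**
  - **Current state**: only the hard-coded rule `if gain > 5%` is used.
  - **Action**: rework `market_analyzer.py` to call OpenAI (GPT-4), Claude, or a locally deployed DeepSeek/Llama model.
  - **Use cases**:
    - **Sentiment analysis**: feed in news headlines and have the AI judge whether they are bullish or bearish.
    - **Earnings interpretation**: feed in an earnings summary and have the AI analyze revenue growth and guidance.
- [ ] **Industry and trend analysis**
  - **Current state**: only individual stocks are tracked.
  - **Action**: add a "sector analysis" module that computes the overall percentage change of sectors such as semiconductors, technology, and pharmaceuticals, providing the "industry analysis" feature.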
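As a rough idea of what the sector analysis step might compute, the sketch below averages change ratios per sector. Equal weighting is an assumption (a market-cap-weighted average is an equally plausible choice), and the function name, sector labels, and numbers are invented for the example.

```python
from collections import defaultdict


def sector_changes(quotes):
    """Equal-weighted average change ratio per sector.

    quotes: iterable of (symbol, sector, change_ratio) tuples.
    """
    buckets = defaultdict(list)
    for _symbol, sector, change_ratio in quotes:
        buckets[sector].append(change_ratio)
    return {sector: sum(vals) / len(vals) for sector, vals in buckets.items()}


# Invented sample quotes, with ratios in decimal form.
quotes = [
    ("NVDA", "Semiconductors", -0.04),
    ("AMD", "Semiconductors", 0.02),
    ("LLY", "Pharmaceuticals", 0.01),
]
by_sector = sector_changes(quotes)
```

This presumes the `sector` column of `symbols` is populated; in the committed `data/symbols.csv` that column is still empty, so a sector mapping would have to be sourced first.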
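## Stage 3: System architecture upgrade (Architecture Layer)
Monitoring 3,886 stocks at once is beyond what the current single-threaded loop can handle.
- [ ] **High-concurrency asynchronous architecture**
  - **Current state**: synchronous polling (one stock or one page per request), with high latency.
  - **Action**: refactor `monitor.py` with Python's `asyncio` and `aiohttp` for highly concurrent fetching, keeping data latency for 3,000+ stocks within seconds.
- [ ] **Database storage**
  - **Current state**: CSV files.
  - **Action**: introduce a **SQLite** or **PostgreSQL** database to store historical quotes, news data, and AI analysis records for "trend analysis" and backtesting.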
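The concurrent fetching pattern proposed for this stage can be sketched with the standard library alone. Here `fetch_quote` is a stand-in that only sleeps; a real version would issue an HTTP request (for example with `aiohttp`), and the function names, symbol names, and concurrency limit are all assumptions.

```python
import asyncio


async def fetch_quote(symbol: str) -> dict:
    """Stand-in for a network call; a real version might use aiohttp."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"symbol": symbol, "price": 100.0}


async def fetch_all(symbols, max_concurrency: int = 50):
    """Fetch all symbols concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(symbol):
        async with semaphore:
            return await fetch_quote(symbol)

    # gather() returns results in the same order as the input symbols.
    return await asyncio.gather(*(bounded(s) for s in symbols))


symbols = [f"SYM{i}" for i in range(200)]
quotes = asyncio.run(fetch_all(symbols, max_concurrency=50))
```

The semaphore caps the number of simultaneous requests, which matters when scraping public endpoints that may throttle or block aggressive clients.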
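## Stage 4: Live trading integration (Execution Layer)
The "buy stock" and "sell stock" boxes in the diagram require a connection to a real broker.
- [ ] **Broker API integration**
  - **Current state**: `trader.py` only prints logs.
  - **Action**: connect to a broker API (such as **Futu OpenD**, **Tiger Brokers Open API**, or the **Interactive Brokers API**).
  - **Features**: real order placement (Place Order), cancellation, balance queries, and position synchronization.
---
## Summary: concrete next steps
I suggest starting with **Stage 2**, since it is the most defining feature of "AI finance":
1. **Apply for an LLM API key** (e.g. DeepSeek, OpenAI).
2. **Modify `market_analyzer.py`**:
   * Stop looking only at percentage changes.
   * Add a function `analyze_news_sentiment(news_text)` that lets the AI judge whether a news item is bullish.
3. **Create a `news_spider.py`** and try scraping a few financial news items as input for the AI.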

Binary file not shown.


1701
data/bars_1m.csv Normal file

File diff suppressed because it is too large Load Diff

128
data/etl_runs.csv Normal file

@@ -0,0 +1,128 @@
run_ts_utc,loop,fetched_count,signal_count,duration_seconds,errors
2025-11-25T10:44:54Z,1,2,2,0.010,0
2025-11-25T10:44:54Z,2,2,0,0.010,0
2025-11-25T10:47:11Z,1,50,24,2.498,0
2025-11-25T10:47:41Z,1,50,24,2.918,0
2025-11-25T11:07:19Z,1,10,6,2.836,0
2025-11-25T11:07:26Z,2,10,0,1.871,0
2025-11-25T11:07:33Z,3,10,0,2.204,0
2025-11-25T11:07:41Z,4,10,0,2.860,0
2025-11-25T11:07:48Z,5,10,0,1.902,0
2025-11-25T11:07:55Z,6,10,0,2.056,0
2025-11-25T11:08:03Z,7,10,0,2.957,0
2025-11-25T11:11:49Z,1,10,6,3.377,0
2025-11-25T11:11:56Z,2,10,0,2.035,0
2025-11-25T11:12:04Z,3,10,0,2.529,0
2025-11-25T11:12:11Z,4,10,0,2.598,0
2025-11-25T11:12:19Z,5,10,0,2.985,0
2025-11-25T11:12:26Z,6,10,0,1.725,0
2025-11-25T11:12:33Z,7,10,0,1.935,0
2025-11-25T11:12:41Z,8,10,0,2.609,0
2025-11-25T11:12:48Z,9,10,0,2.033,0
2025-11-25T11:12:55Z,10,10,0,2.231,0
2025-11-25T11:13:04Z,11,10,0,4.059,0
2025-11-25T11:13:11Z,12,10,0,1.893,0
2025-11-25T11:13:18Z,13,10,0,2.463,0
2025-11-25T11:13:25Z,14,10,0,2.150,0
2025-11-25T11:13:32Z,15,10,0,1.869,0
2025-11-25T11:13:39Z,16,10,0,2.121,0
2025-11-25T11:13:47Z,17,10,0,2.567,0
2025-11-25T11:13:54Z,18,10,0,2.531,0
2025-11-25T11:14:01Z,19,10,0,1.997,0
2025-11-25T11:14:09Z,20,10,0,2.550,0
2025-11-25T11:14:16Z,21,10,0,2.101,0
2025-11-25T11:14:24Z,22,10,0,2.745,0
2025-11-25T11:14:31Z,23,10,0,2.085,0
2025-11-25T11:14:38Z,24,10,0,2.005,0
2025-11-25T11:14:45Z,25,10,0,2.006,0
2025-11-25T11:14:52Z,26,10,0,2.334,0
2025-11-25T11:14:59Z,27,10,0,2.102,0
2025-11-25T11:15:06Z,28,10,0,1.812,0
2025-11-25T11:15:16Z,29,10,0,4.983,0
2025-11-25T11:15:23Z,30,10,0,1.977,0
2025-11-25T11:15:31Z,31,10,0,2.548,0
2025-11-25T11:15:38Z,32,10,0,2.241,0
2025-11-25T11:15:46Z,33,10,0,2.619,0
2025-11-25T11:15:53Z,34,10,0,2.174,0
2025-11-25T11:16:00Z,35,10,0,2.690,0
2025-11-25T11:16:08Z,36,10,0,2.169,0
2025-11-25T11:16:15Z,37,10,0,2.560,0
2025-11-25T11:16:22Z,38,10,0,2.309,0
2025-11-25T11:16:29Z,39,10,0,1.976,0
2025-11-25T11:16:41Z,1,10,6,2.499,0
2025-11-25T11:16:48Z,2,10,0,2.566,0
2025-11-25T11:16:56Z,3,10,0,2.491,0
2025-11-25T11:17:03Z,4,10,0,1.879,0
2025-11-25T11:17:10Z,5,10,0,1.998,0
2025-11-25T11:17:17Z,6,10,0,2.378,0
2025-11-25T11:17:24Z,7,10,0,2.000,0
2025-11-25T11:17:30Z,1,10,7,3.058,0
2025-11-25T11:17:38Z,2,10,0,2.890,0
2025-11-25T11:17:45Z,3,10,0,1.814,0
2025-11-25T11:17:52Z,4,10,0,1.980,0
2025-11-25T11:17:59Z,5,10,0,2.803,0
2025-11-25T11:18:07Z,6,10,0,2.337,0
2025-11-25T11:18:14Z,7,10,0,2.154,0
2025-11-25T11:18:21Z,8,10,0,1.863,0
2025-11-25T11:18:28Z,9,10,0,2.543,0
2025-11-25T11:18:36Z,10,10,0,2.155,0
2025-11-25T11:18:42Z,11,10,0,1.899,0
2025-11-25T11:18:49Z,12,10,0,2.033,0
2025-11-25T11:18:56Z,13,10,0,2.058,0
2025-11-25T11:19:04Z,14,10,0,2.660,0
2025-11-25T11:19:12Z,15,10,0,2.445,0
2025-11-25T11:19:19Z,16,10,0,1.929,0
2025-11-25T11:19:27Z,17,10,0,3.209,0
2025-11-25T11:19:34Z,18,10,0,1.945,0
2025-11-25T11:19:41Z,19,10,0,2.034,0
2025-11-25T11:19:48Z,20,10,0,2.368,0
2025-11-25T11:19:56Z,21,10,0,2.757,0
2025-11-25T11:20:03Z,22,10,0,2.141,0
2025-11-25T11:20:10Z,23,10,0,2.113,0
2025-11-25T11:20:17Z,24,10,0,1.743,0
2025-11-25T11:20:24Z,25,10,0,1.941,0
2025-11-25T11:20:31Z,26,10,0,2.210,0
2025-11-25T11:20:38Z,27,10,0,2.442,0
2025-11-25T11:20:51Z,1,10,6,2.674,0
2025-11-25T11:21:00Z,2,10,0,3.996,0
2025-11-25T11:21:07Z,3,10,0,2.291,0
2025-11-25T11:21:14Z,4,10,0,1.935,0
2025-11-25T11:21:21Z,5,10,0,2.171,0
2025-11-25T11:21:28Z,6,10,0,2.033,0
2025-11-25T11:21:36Z,7,10,0,2.871,0
2025-11-25T11:21:43Z,8,10,0,2.115,0
2025-11-25T11:21:51Z,9,10,0,2.700,0
2025-11-25T11:21:58Z,10,10,0,2.256,0
2025-11-25T11:22:05Z,11,10,0,2.244,0
2025-11-25T11:22:12Z,12,10,0,2.217,0
2025-11-25T11:22:20Z,13,10,0,2.171,0
2025-11-25T11:22:27Z,14,10,0,2.455,0
2025-11-25T11:22:34Z,15,10,0,2.108,0
2025-11-25T11:22:41Z,16,10,0,2.232,0
2025-11-25T11:22:49Z,17,10,0,2.415,0
2025-11-25T11:22:56Z,18,10,0,1.945,0
2025-11-25T11:23:04Z,19,10,0,2.833,0
2025-11-25T11:23:11Z,20,10,0,2.123,0
2025-11-25T11:23:18Z,21,10,0,2.334,0
2025-11-25T11:23:25Z,22,10,0,2.197,0
2025-11-25T11:23:32Z,23,10,0,1.917,0
2025-11-25T11:23:39Z,24,10,0,1.900,0
2025-11-25T11:23:46Z,25,10,0,1.790,0
2025-11-25T11:23:53Z,26,10,0,2.130,0
2025-11-25T11:24:00Z,27,10,0,1.574,0
2025-11-25T11:24:07Z,28,10,0,2.039,0
2025-11-25T11:24:14Z,29,10,0,1.984,0
2025-11-25T11:24:21Z,30,10,0,2.204,0
2025-11-25T11:24:28Z,31,10,0,1.807,0
2025-11-25T11:24:34Z,32,10,0,1.666,0
2025-11-25T11:24:41Z,33,10,0,2.082,0
2025-11-25T11:24:49Z,34,10,0,2.510,0
2025-11-25T11:24:56Z,35,10,0,1.854,0
2025-11-25T11:25:03Z,36,10,0,2.345,0
2025-11-25T11:25:10Z,37,10,0,2.083,0
2025-11-25T11:25:17Z,38,10,0,2.034,0
2025-11-25T11:26:04Z,1,5,4,1.952,0
2025-11-25T11:26:09Z,2,5,0,1.348,0
2025-11-25T11:26:13Z,3,5,0,1.307,0
2025-11-25T11:26:18Z,4,5,0,1.619,0
2025-11-25T11:26:22Z,5,5,0,1.532,0

1330
data/features_1m.csv Normal file

File diff suppressed because it is too large Load Diff

6211
data/premarket_bars.csv Normal file

File diff suppressed because it is too large Load Diff

2409
data/premarket_signals.csv Normal file

File diff suppressed because it is too large Load Diff

6
data/signals.csv Normal file

@@ -0,0 +1,6 @@
4144122546532766634-2025-11-25T11:20:50Z,4144122546532766634,GOOGL,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.31%),模型建议买入""}",rule_threshold,v1,
5053198607245987051-2025-11-25T11:20:50Z,5053198607245987051,GOOG,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.28%),模型建议买入""}",rule_threshold,v1,
2194510714435639870-2025-11-25T11:20:50Z,2194510714435639870,MSFT,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (40.00%),模型建议买入""}",rule_threshold,v1,
6313120924332843851-2025-11-25T11:20:50Z,6313120924332843851,AVGO,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (11.10%),模型建议买入""}",rule_threshold,v1,
1236673530676310677-2025-11-25T11:20:50Z,1236673530676310677,TSLA,2025-11-25T11:20:50Z,momentum,BUY,0.85,intraday,"{""reason"": ""涨幅显著 (6.82%),模型建议买入""}",rule_threshold,v1,
352413926823531646-2025-11-25T11:20:50Z,352413926823531646,NVDA,2025-11-25T11:20:50Z,momentum,SELL,,intraday,"{""reason"": ""盘前跌幅 -3.73% 预警""}",rule_threshold,v1,

51
data/symbols.csv Normal file

@@ -0,0 +1,51 @@
id,symbol,name,exchange,currency,tick_size,lot_size,sector,industry,is_active,first_seen_utc,last_seen_utc
7513257165860044271,AAPL,苹果,US,USD,,,,,1,2025-11-25T10:23:21Z,2025-11-25T20:44:14Z
2194510714435639870,MSFT,微软,US,USD,,,,,1,2025-11-25T10:23:21Z,2025-11-25T20:44:14Z
352413926823531646,NVDA,英伟达,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
4144122546532766634,GOOGL,谷歌-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
5053198607245987051,GOOG,谷歌-C,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
4623497169759077023,AMZN,亚马逊,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
6313120924332843851,AVGO,博通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
5102328569974216853,META,Meta Platforms Inc-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
876017357040540663,TSM,台积电,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
1236673530676310677,TSLA,特斯拉,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T20:44:14Z
1331973792832307407,BRK_A,伯克希尔哈撒韦-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
650274069060505642,BRK_B,伯克希尔哈撒韦-B,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8803988925120688787,LLY,礼来,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8552427102055794475,WMT,沃尔玛,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2514921600363169427,JPM,摩根大通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2430622602946461838,V,维萨,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
8672445585537641080,ORCL,甲骨文,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
5071644475567648584,JNJ,强生,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
7149994435636318535,XOM,埃克森美孚,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
7970304264787595103,MA,万事达,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T11:03:31Z
2534395427583276124,NFLX,奈飞,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5575034379848759650,ABBV,艾伯维,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5412047369712123198,COST,开市客,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8785367685666264461,PLTR,Palantir Technologies Inc-A,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
5647711723656244203,BABA,阿里巴巴,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6990590429937873043,ASML,阿斯麦,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8292219544464776917,BAC,美国银行,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
1633824147157402598,AMD,超威半导体,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2555012754890357878,PG,宝洁,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
4150716678491375742,HD,家得宝,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2367064624252044795,KO,可口可乐,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6646738033285807079,GE,GE航空航天,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8407679579858137856,CVX,雪佛龙,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7292100549554102522,CSCO,思科,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8960439643380076728,UNH,联合健康,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7065807657318692902,DGP,二倍做多黄金ETN-DB,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
3257268770337445296,IBM,IBM国际商业机器,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
579602174984910496,AZN,阿斯利康(ADR),US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
4391221166925450354,SAP,思爱普,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
7856183197500173397,WFC,富国银行,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6761987357449376195,CAT,卡特彼勒,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6181095751687308617,TM,丰田汽车(ADR),US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2535412531591654533,MS,摩根士丹利,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6823324911471553900,MU,美光科技,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
2560492629011940105,MRK,默沙东,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
6574988392422038406,AXP,美国运通,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
3063309002746328407,NVS,诺华制药,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
8036752892131638733,GS,高盛,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
1961403181409823259,HSBC,汇丰控股,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z
387219116134337577,PM,菲利普莫里斯国际,US,USD,,,,,1,2025-11-25T10:47:08Z,2025-11-25T10:47:39Z

492
data_writer.py Normal file

@@ -0,0 +1,492 @@
# -*- coding: utf-8 -*-
"""
CSV persistence module (a simplified implementation of DATA_SCHEMA.md).
Files written under data/:
- symbols.csv
- bars_1m.csv
- signals.csv
- features_1m.csv, etl_runs.csv, premarket_bars.csv, premarket_signals.csv
Notes:
- No true upsert (CSV is poorly suited to it). Symbols and signals are deduplicated
  by reading existing rows into an in-memory index; bars and features are appended as-is.
- Ratio fields (such as price change) are stored as decimals, e.g. 4.02% is stored as 0.0402.
"""
import csv
import json
import os
from datetime import datetime, timezone
from typing import Iterable, Dict, Any, List, Tuple

from utils_id import stable_symbol_id

DATA_DIR = os.path.join(os.path.dirname(__file__), "data")
SYMBOLS_CSV = os.path.join(DATA_DIR, "symbols.csv")
BARS_1M_CSV = os.path.join(DATA_DIR, "bars_1m.csv")
SIGNALS_CSV = os.path.join(DATA_DIR, "signals.csv")
FEATURES_1M_CSV = os.path.join(DATA_DIR, "features_1m.csv")
ETL_RUNS_CSV = os.path.join(DATA_DIR, "etl_runs.csv")
PREMARKET_BARS_CSV = os.path.join(DATA_DIR, "premarket_bars.csv")
PREMARKET_SIGNALS_CSV = os.path.join(DATA_DIR, "premarket_signals.csv")

# Make sure the data directory exists.
os.makedirs(DATA_DIR, exist_ok=True)


def _utc_now_iso() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _floor_minute(dt: datetime) -> datetime:
    # Convert to UTC first, then truncate; replacing tzinfo alone would relabel
    # a non-UTC time as UTC without shifting it. Expects an aware datetime.
    return dt.astimezone(timezone.utc).replace(second=0, microsecond=0)
# ---------- symbols.csv ----------
_SYMBOLS_HEADER = [
"id","symbol","name","exchange","currency",
"tick_size","lot_size","sector","industry",
"is_active","first_seen_utc","last_seen_utc"
]
def write_symbols(stocks: Iterable[Dict[str, Any]]) -> Dict[str, int]:
"""将股票基础信息写入 symbols.csv并返回 symbol->symbol_id 映射。
stocks: 需包含 keys: symbol, name, exchange, currency
"""
existing: Dict[Tuple[str,str], Dict[str, str]] = {}
if os.path.exists(SYMBOLS_CSV):
with open(SYMBOLS_CSV, "r", encoding="utf-8-sig") as f:
reader = csv.DictReader(f)
for row in reader:
existing[(row["symbol"], row["exchange"])] = row
now = _utc_now_iso()
# 生成/更新
for s in stocks:
symbol = s.get("symbol")
name = s.get("name")
exchange = (s.get("exchange") or "US").upper()
currency = (s.get("currency") or "USD").upper()
key = (symbol, exchange)
if key not in existing:
sid = stable_symbol_id(symbol, exchange)
existing[key] = {
"id": str(sid),
"symbol": symbol,
"name": name or "",
"exchange": exchange,
"currency": currency,
"tick_size": "",
"lot_size": "",
"sector": "",
"industry": "",
"is_active": "1",
"first_seen_utc": now,
"last_seen_utc": now,
}
else:
existing[key]["last_seen_utc"] = now
# 写回
with open(SYMBOLS_CSV, "w", newline="", encoding="utf-8-sig") as f:
writer = csv.DictWriter(f, fieldnames=_SYMBOLS_HEADER)
writer.writeheader()
for row in existing.values():
writer.writerow(row)
# 返回映射
return {k[0]: int(v["id"]) for k, v in existing.items() if k[0] == v["symbol"]}
# ---------- bars_1m.csv ----------
_BARS_1M_HEADER = [
"symbol_id","symbol","ts_utc","open","high","low","close",
"volume","vwap","trades_count","source","session"
]
def _upgrade_bars_file_if_needed():
    """If a legacy bars_1m.csv lacks the session column, rewrite it once with session='regular'."""
    if not os.path.exists(BARS_1M_CSV):
        return
    try:
        with open(BARS_1M_CSV, 'r', newline='', encoding='utf-8-sig') as f:
            rows = list(csv.reader(f))
        if not rows:
            return
        header = rows[0]
        if 'session' in header:
            return  # already upgraded
        # Map each legacy column name to its position.
        idx_map = {col: i for i, col in enumerate(header)}

        def _cell(row, col):
            # Missing columns and short rows yield '' instead of raising.
            i = idx_map.get(col)
            return row[i] if i is not None and i < len(row) else ''

        legacy_cols = _BARS_1M_HEADER[:-1]  # every column except the new 'session'
        new_rows = [_BARS_1M_HEADER]
        for r in rows[1:]:
            if not r:
                continue
            # Build the new row from the legacy columns, then add the default session.
            new_rows.append([_cell(r, col) for col in legacy_cols] + ['regular'])
        # Write the upgraded file back.
        with open(BARS_1M_CSV, 'w', newline='', encoding='utf-8-sig') as f:
            csv.writer(f).writerows(new_rows)
    except Exception as e:
        print(f"⚠️ bars_1m.csv upgrade failed: {e}")
def append_bars_1m(stocks: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = "eastmoney") -> List[Dict[str, Any]]:
"""将当前快照近似为 1 分钟线写入 bars_1m.csv。
由于只有快照open/high/low/close 统一使用 current_pricevolume/vwap/trades_count 为空。
"""
now = _floor_minute(datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
rows: List[Dict[str, Any]] = []
_upgrade_bars_file_if_needed()
for s in stocks:
symbol = s.get("symbol")
price = s.get("eastmoney_price") or s.get("current_price")
if price is None:
continue
sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
rows.append({
"symbol_id": sid,
"symbol": symbol,
"ts_utc": now,
"open": price,
"high": price,
"low": price,
"close": price,
"volume": "",
"vwap": "",
"trades_count": "",
"source": source,
"session": "regular",
})
# 追加写
file_exists = os.path.exists(BARS_1M_CSV)
with open(BARS_1M_CSV, "a", newline="", encoding="utf-8-sig") as f:
writer = csv.DictWriter(f, fieldnames=_BARS_1M_HEADER)
if not file_exists:
writer.writeheader()
for r in rows:
writer.writerow(r)
return rows
def append_bars_session(stocks: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = "futu", session: str = "pre") -> List[Dict[str, Any]]:
"""写入特定交易时段的快照(如盘前/盘后),与常规 bars 共存,通过 session 区分。"""
_upgrade_bars_file_if_needed()
now = _floor_minute(datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
rows: List[Dict[str, Any]] = []
for s in stocks:
symbol = s.get("symbol")
price = s.get("premarket_price") or s.get("after_hours_price") or s.get("futu_before_open_price")
if price in (None, ""):
continue
try:
price_f = float(price)
except Exception:
continue
sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
rows.append({
"symbol_id": sid,
"symbol": symbol,
"ts_utc": now,
"open": price_f,
"high": price_f,
"low": price_f,
"close": price_f,
"volume": "",
"vwap": "",
"trades_count": "",
"source": source,
"session": session,
})
file_exists = os.path.exists(BARS_1M_CSV)
with open(BARS_1M_CSV, "a", newline="", encoding="utf-8-sig") as f:
writer = csv.DictWriter(f, fieldnames=_BARS_1M_HEADER)
if not file_exists:
writer.writeheader()
for r in rows:
writer.writerow(r)
return rows
# ---------- premarket 专用快照与信号 ----------
_PREMARKET_BARS_HEADER = [
'symbol_id','symbol','name','ts_utc','ts_et','price','change','change_ratio','volume','source','session','raw_file'
]
_PREMARKET_SIGNALS_HEADER = [
'id','symbol_id','symbol','generated_at_utc','generated_at_et','signal_type','direction','score','reason','params_json','model_name','version','expires_at_utc'
]
def append_premarket_bars(rows: List[Dict[str, Any]], symbol_id_map: Dict[str, int], source: str = 'futu') -> None:
    """Append pre-market snapshot rows to premarket_bars.csv.
    rows: each needs symbol, name, premarket_price, premarket_change,
    premarket_change_ratio (raw percent or decimal string) and ts (ET string, HH:MM).
    """
    if not rows:
        return
    file_exists = os.path.exists(PREMARKET_BARS_CSV)
    now_utc = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    # ET timestamp string (for human inspection)
    try:
        from zoneinfo import ZoneInfo
        ts_et_full = datetime.now(ZoneInfo('America/New_York')).strftime('%Y-%m-%dT%H:%M:%S')
    except Exception:
        ts_et_full = ''
    with open(PREMARKET_BARS_CSV, 'a', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=_PREMARKET_BARS_HEADER)
        if not file_exists:
            writer.writeheader()
        for r in rows:
            symbol = r.get('symbol')
            if not symbol:
                continue
            price = r.get('premarket_price')
            if price in (None, '', '-'):
                continue
            try:
                price_f = float(price)
            except Exception:
                continue
            # The raw ratio may be "3.21%", "-3.21%", "0.0321" or "".
            ratio_raw = r.get('premarket_change_ratio')
            ratio_val = 0.0
            if ratio_raw not in (None, ''):
                txt = str(ratio_raw).strip()
                try:
                    if txt.endswith('%'):
                        ratio_val = float(txt[:-1]) / 100.0
                    else:
                        # Heuristic, matching MarketAnalyzer: without a '%' sign, |x| > 1 is
                        # read as a percent value (3.21 -> 0.0321), anything else as a decimal.
                        # This stays ambiguous for moves above 100% and for bare values below 1%.
                        num = float(txt)
                        ratio_val = num / 100.0 if abs(num) > 1 else num
                except Exception:
                    ratio_val = 0.0
            sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
            writer.writerow({
                'symbol_id': sid,
                'symbol': symbol,
                'name': r.get('name', ''),
                'ts_utc': now_utc,
                'ts_et': ts_et_full,
                'price': price_f,
                'change': r.get('premarket_change', ''),
                'change_ratio': ratio_val,
                'volume': '',
                'source': source,
                'session': 'pre',
                'raw_file': '',
            })
def append_premarket_signals(signals: List[Dict[str, Any]], symbol_id_map: Dict[str, int]) -> None:
"""写入盘前信号到 premarket_signals.csv。
signals: 需包含 symbol,direction(BUY/SELL),reason,params_json(可选)
"""
if not signals:
return
file_exists = os.path.exists(PREMARKET_SIGNALS_CSV)
model_name, version = _def_model
now_utc = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
try:
from zoneinfo import ZoneInfo
now_et = datetime.now(ZoneInfo('America/New_York')).strftime('%Y-%m-%dT%H:%M:%S')
except Exception:
now_et = ''
# 简单去重: 同 symbol+direction+当前UTC秒 不重复
seen = set()
if file_exists:
with open(PREMARKET_SIGNALS_CSV,'r',encoding='utf-8-sig') as f:
reader = csv.DictReader(f)
for row in reader:
seen.add((row['symbol'],row['direction'],row['generated_at_utc']))
with open(PREMARKET_SIGNALS_CSV,'a',newline='',encoding='utf-8-sig') as f:
writer = csv.DictWriter(f, fieldnames=_PREMARKET_SIGNALS_HEADER)
if not file_exists:
writer.writeheader()
for sig in signals:
symbol = sig.get('symbol')
direction = sig.get('direction')
if not symbol or not direction:
continue
key = (symbol,direction,now_utc)
if key in seen:
continue
sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
params_obj = sig.get('params') or {}
writer.writerow({
'id': f'{sid}-{now_utc}',
'symbol_id': sid,
'symbol': symbol,
'generated_at_utc': now_utc,
'generated_at_et': now_et,
'signal_type': sig.get('signal_type','premarket_alert'),
'direction': direction,
'score': sig.get('score',''),
'reason': sig.get('reason',''),
'params_json': json.dumps(params_obj, ensure_ascii=False),
'model_name': model_name,
'version': version,
'expires_at_utc': '',
})
# ---------- signals.csv ----------
_SIGNALS_HEADER = [
"id","symbol_id","symbol","generated_at_utc",
"signal_type","direction","score","horizon",
"params_json","model_name","version","expires_at_utc"
]
_def_model = ("rule_threshold", "v1")
import json
def append_signals(signals: Iterable[Dict[str, Any]], symbol_id_map: Dict[str, int]) -> None:
"""将策略信号写入 signals.csv使用时间+symbol 做近似去重。
输入信号应包含symbol, type(BUY/SELL), reason/score 可选。
"""
file_exists = os.path.exists(SIGNALS_CSV)
seen_keys = set()
if file_exists:
with open(SIGNALS_CSV, "r", encoding="utf-8-sig") as f:
reader = csv.DictReader(f)
for row in reader:
seen_keys.add((row["symbol"], row["generated_at_utc"], row.get("direction")))
model_name, version = _def_model
with open(SIGNALS_CSV, "a", newline="", encoding="utf-8-sig") as f:
writer = csv.DictWriter(f, fieldnames=_SIGNALS_HEADER)
if not file_exists:
writer.writeheader()
for sig in signals:
symbol = sig.get("symbol")
direction = sig.get("type") or sig.get("direction")
gen_at = sig.get('generated_at_utc') or _utc_now_iso()
key = (symbol, gen_at, direction)
if key in seen_keys:
continue
sid = symbol_id_map.get(symbol) or stable_symbol_id(symbol)
writer.writerow({
"id": f"{sid}-{gen_at}",
"symbol_id": sid,
"symbol": symbol,
"generated_at_utc": gen_at,
"signal_type": "momentum",
"direction": direction,
"score": sig.get("confidence", ""),
"horizon": "intraday",
"params_json": json.dumps({"reason": sig.get("reason", "")}, ensure_ascii=False),
"model_name": model_name,
"version": version,
"expires_at_utc": "",
})
# ---------- features_1m.csv ----------
_FEATURES_1M_HEADER = [
'symbol_id','symbol','ts_utc','price','return_1m','ma_5','ma_15','vol_15'
]
def _load_existing_prices() -> Dict[str, List[Tuple[str, float]]]:
data: Dict[str, List[Tuple[str, float]]] = {}
if not os.path.exists(BARS_1M_CSV):
return data
with open(BARS_1M_CSV, 'r', encoding='utf-8-sig') as f:
reader = csv.DictReader(f)
for row in reader:
symbol = row['symbol']
ts = row['ts_utc']
try:
price = float(row['close'])
except Exception:
continue
data.setdefault(symbol, []).append((ts, price))
# 保证按时间排序CSV 追加已有序,但防御性处理)
for sym in data:
data[sym].sort(key=lambda x: x[0])
return data
def append_features_1m(new_bar_rows: List[Dict[str, Any]]) -> None:
if not new_bar_rows:
return
price_history = _load_existing_prices()
feature_rows: List[Dict[str, Any]] = []
# 按新增行计算特征
for r in new_bar_rows:
symbol = r['symbol']
sid = r['symbol_id']
ts = r['ts_utc']
try:
price = float(r['close'])
except Exception:
continue
series = price_history.get(symbol, [])
# 找到当前索引位置
# 防御series 已包含当前行,因为新行已追加;若未包含则添加再计算
if not series or series[-1][0] != ts:
series.append((ts, price))
idx = len(series) - 1
# return_1m
ret_1m = 0.0
if idx >= 1:
prev_price = series[idx-1][1]
if prev_price != 0:
ret_1m = (price / prev_price) - 1
# ma_5
window5 = [p for _, p in series[max(0, idx-4):idx+1]]
ma_5 = sum(window5)/len(window5) if window5 else price
# ma_15
window15 = [p for _, p in series[max(0, idx-14):idx+1]]
ma_15 = sum(window15)/len(window15) if window15 else price
# vol_15 = 标准差
vol_15 = 0.0
if len(window15) > 1:
avg15 = ma_15
var = sum((p-avg15)**2 for p in window15)/ (len(window15)-1)
vol_15 = var**0.5
feature_rows.append({
'symbol_id': sid,
'symbol': symbol,
'ts_utc': ts,
'price': price,
'return_1m': ret_1m,
'ma_5': ma_5,
'ma_15': ma_15,
'vol_15': vol_15,
})
file_exists = os.path.exists(FEATURES_1M_CSV)
with open(FEATURES_1M_CSV, 'a', newline='', encoding='utf-8-sig') as f:
writer = csv.DictWriter(f, fieldnames=_FEATURES_1M_HEADER)
if not file_exists:
writer.writeheader()
for fr in feature_rows:
writer.writerow(fr)
# ---------- etl_runs.csv ----------
_ETL_RUNS_HEADER = [
'run_ts_utc','loop','fetched_count','signal_count','duration_seconds','errors'
]
def append_etl_run(loop: int, fetched: int, signals: int, duration: float, errors: int = 0) -> None:
file_exists = os.path.exists(ETL_RUNS_CSV)
with open(ETL_RUNS_CSV, 'a', newline='', encoding='utf-8-sig') as f:
writer = csv.DictWriter(f, fieldnames=_ETL_RUNS_HEADER)
if not file_exists:
writer.writeheader()
writer.writerow({
'run_ts_utc': _utc_now_iso(),
'loop': loop,
'fetched_count': fetched,
'signal_count': signals,
'duration_seconds': f'{duration:.3f}',
'errors': errors,
})
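
The bar writers above append every row they are given, so a repeated run within the same minute produces duplicate keys. A minimal sketch of key-based deduplication on append follows, assuming the key columns are chosen by the caller (for bars, something like `symbol_id`, `ts_utc`, `session`). The name `append_unique_rows` and its parameters are hypothetical and not part of this module.

```python
import csv
import os
from typing import Any, Dict, Iterable, List, Sequence


def append_unique_rows(
    path: str,
    header: Sequence[str],
    rows: Iterable[Dict[str, Any]],
    key_fields: Sequence[str],
) -> List[Dict[str, Any]]:
    """Append only rows whose key is not already in the CSV; return the rows written."""
    seen = set()
    has_data = os.path.exists(path) and os.path.getsize(path) > 0
    if has_data:
        # Build the in-memory index of keys that are already on disk.
        with open(path, "r", newline="", encoding="utf-8-sig") as f:
            for row in csv.DictReader(f):
                seen.add(tuple(row.get(k, "") for k in key_fields))

    written: List[Dict[str, Any]] = []
    with open(path, "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=list(header))
        if not has_data:
            writer.writeheader()
        for row in rows:
            # Compare as strings because values read back from CSV are strings.
            key = tuple(str(row.get(k, "")) for k in key_fields)
            if key in seen:
                continue
            seen.add(key)  # also guards against duplicates within this batch
            writer.writerow(row)
            written.append(row)
    return written
```

Reading the whole file on every append is linear in file size, which is acceptable for a local research setup but is one reason the schema document points to a database with a real upsert for production.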

1035
futu.py Normal file

File diff suppressed because it is too large

21
logging_setup.py Normal file

@@ -0,0 +1,21 @@
# -*- coding: utf-8 -*-
"""
Minimal logging setup module.
Usage:
    from logging_setup import init_logging
    init_logging()
"""
import logging
import os
from typing import Optional


def init_logging(level: Optional[str] = None):
    lvl = (level or os.getenv("LOG_LEVEL", "INFO")).upper()
    lvl_value = getattr(logging, lvl, logging.INFO)
    if not isinstance(lvl_value, int):
        # e.g. LOG_LEVEL=BASIC_FORMAT resolves to a string attribute, not a level
        lvl_value = logging.INFO
    logging.basicConfig(
        level=lvl_value,
        format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    # Reduce default log noise from third-party libraries.
    logging.getLogger("urllib3").setLevel(max(logging.WARNING, lvl_value))
    logging.getLogger("requests").setLevel(max(logging.WARNING, lvl_value))

74
market_analyzer.py Normal file

@@ -0,0 +1,74 @@
# -*- coding: utf-8 -*-
"""
市场分析模块 (模拟云端分析/大模型分析)
"""
import random
class MarketAnalyzer:
def __init__(self):
print("🧠 初始化市场分析模型...")
# 这里可以加载模型或者连接云端API
def analyze(self, stock_data_list):
"""
分析股票数据并生成交易信号
Args:
stock_data_list: 股票数据列表
Returns:
list: 包含交易信号的字典列表
"""
signals = []
print(f"🧠 正在分析 {len(stock_data_list)} 只股票的数据...")
for stock in stock_data_list:
# 简单的策略示例:
# 如果涨幅超过 5%,产生买入信号 (模拟)
# 如果跌幅超过 5%,产生卖出信号 (模拟)
try:
# 兼容可接收数值小数0.0402 表示 4.02%)或字符串("4.02%")
raw_ratio = stock.get('eastmoney_change_ratio', 0.0)
ratio = 0.0
if isinstance(raw_ratio, (int, float)):
# 假定为小数形式
ratio = float(raw_ratio)
# 若误传为 4.02 这类百分数值,则做防御性归一化
if abs(ratio) > 1:
ratio = ratio / 100.0
elif isinstance(raw_ratio, str):
s = raw_ratio.strip().replace('%', '')
if s not in ('', '-'):
v = float(s)
# 从百分数值转小数
ratio = v / 100.0
symbol = stock.get('symbol')
name = stock.get('name')
price = stock.get('eastmoney_price')
# 阈值基于小数±5%
if ratio > 0.05:
signals.append({
'type': 'BUY',
'symbol': symbol,
'name': name,
'price': price,
'reason': f'涨幅显著 ({ratio:.2%}),模型建议买入',
'confidence': 0.85
})
elif ratio < -0.05:
signals.append({
'type': 'SELL',
'symbol': symbol,
'name': name,
'price': price,
'reason': f'跌幅显著 ({ratio:.2%}),模型建议抛售',
'confidence': 0.92
})
except Exception:
continue
return signals
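
The ratio handling inside `analyze` is repeated in slightly different forms in `data_writer.py`, `monitor.py` and `premarket_watch.py`. A minimal sketch of a single shared helper follows, assuming the same conventions as the analyzer (numbers with absolute value above 1 are percent values, strings are always percent values). The names `to_decimal_ratio` and `classify` are hypothetical.

```python
from typing import Optional, Union


def to_decimal_ratio(raw: Union[int, float, str, None]) -> Optional[float]:
    """Return a change ratio as a decimal (0.0402 for 4.02%), or None if unparseable."""
    if raw is None or isinstance(raw, bool):
        return None
    if isinstance(raw, (int, float)):
        value = float(raw)
        # Heuristic carried over from the analyzer: |x| > 1 is read as a percent value.
        return value / 100.0 if abs(value) > 1 else value
    text = str(raw).strip()
    if text in ("", "-"):
        return None
    try:
        # Strings are treated as percent values, with or without the '%' sign.
        return float(text.rstrip("%")) / 100.0
    except ValueError:
        return None


def classify(ratio: Optional[float], threshold: float = 0.05) -> Optional[str]:
    """Map a decimal ratio to 'BUY', 'SELL' or None using a symmetric threshold."""
    if ratio is None:
        return None
    if ratio > threshold:
        return "BUY"
    if ratio < -threshold:
        return "SELL"
    return None
```

Returning `None` for unparseable input, rather than `0.0`, lets a caller tell "no data" apart from "no change"; whether that distinction matters here is a design choice for the author.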

228
monitor.py Normal file

@@ -0,0 +1,228 @@
# -*- coding: utf-8 -*-
"""
量化交易监控主程序
功能:
1. 循环抓取美股数据 (支持全量/Top N)
2. 调用分析模块进行分析
3. 调用交易模块执行信号
"""
import time
import argparse
from logging_setup import init_logging
from futu import StockDataIntegrator, EastMoneyAPI
from market_analyzer import MarketAnalyzer
from trader import Trader
from datetime import datetime
from data_writer import write_symbols, append_bars_1m, append_bars_session, append_signals, append_features_1m, append_etl_run
from zoneinfo import ZoneInfo
from utils_time import now_et, fmt_et, fmt_et_hm
from signal_filter import SignalCooldownFilter
def main():
# 初始化简单日志(保持现有 print不强制替换
init_logging()
parser = argparse.ArgumentParser(description='AI量化交易监控系统')
parser.add_argument('--interval', type=int, default=60, help='监控间隔(秒)')
parser.add_argument('--limit', type=int, default=100, help='每次监控的股票数量')
parser.add_argument('--all', action='store_true', help='监控所有股票(速度较慢)')
parser.add_argument('--premarket', action='store_true', help='在盘前窗口抓取富途盘前价格并写入 session=pre')
parser.add_argument('--premarket-limit', type=int, default=30, help='盘前抓取的最大股票数(富途页面逐个抓取)')
parser.add_argument('--session-override', choices=['pre','regular','post'], help='测试用手动覆盖当前交易时段')
args = parser.parse_args()
print("🚀 启动 AI 量化交易监控系统...")
print(f"⏱️ 监控间隔: {args.interval}")
# 初始化模块
integrator = StockDataIntegrator()
eastmoney_api = EastMoneyAPI() # 用于快速获取列表
analyzer = MarketAnalyzer()
trader = Trader()
cooldown_filter = SignalCooldownFilter(cooldown_minutes=30)
loop_count = 0
try:
while True:
loop_start = now_et()
loop_count += 1
print(f"\n🔄 第 {loop_count} 次扫描开始 - {fmt_et_hm() } ET")
# 1. 抓取数据 (常规东方财富列表)
# 为了监控效率,我们主要使用东方财富的快速列表接口
# 如果是全量监控
stock_data = []
if args.all:
print("📡 正在获取全量市场数据...")
# 这里我们简化处理直接调用修改后的API获取所有数据
# 注意futu.py 中的 get_us_stocks 已经支持分页获取
# 为了演示,我们这里只获取前几页,或者使用 futu.py 中新增的逻辑
# 直接使用 integrator 的逻辑,但强制 eastmoney_only 以提高速度
# 我们手动调用 eastmoney_api 来获取数据,避免 integrator 的复杂逻辑
# 获取所有数据可能需要一点时间
_, total = eastmoney_api.get_us_stocks(page_size=1)
print(f"📊 市场总股票数: {total}")
# 分页获取所有数据
page_size = 100
limit = total
total_pages = (limit + page_size - 1) // page_size
all_raw_stocks = []
for page in range(1, total_pages + 1):
stocks, _ = eastmoney_api.get_us_stocks(page_size=page_size, page_index=page)
if stocks:
all_raw_stocks.extend(stocks)
# 稍微延时
# time.sleep(0.1)
# 解析数据
for item in all_raw_stocks:
parsed = eastmoney_api.parse_stock_data(item)
if parsed:
stock_data.append({
'symbol': parsed['symbol'],
'name': parsed['name'],
'eastmoney_price': parsed['current_price'],
'eastmoney_change_ratio': parsed['change_ratio']
})
else:
print(f"📡 正在获取 Top {args.limit} 热门股票数据...")
# 获取 Top N
raw_stocks, _ = eastmoney_api.get_us_stocks(page_size=args.limit)
for item in raw_stocks:
parsed = eastmoney_api.parse_stock_data(item)
if parsed:
stock_data.append({
'symbol': parsed['symbol'],
'name': parsed['name'],
'eastmoney_price': parsed['current_price'],
'eastmoney_change_ratio': parsed['change_ratio']
})
print(f"✅ 获取到 {len(stock_data)} 条有效常规行情数据 (ET {fmt_et_hm()})")
# 1.1 盘前数据补充 (仅在盘前窗口且开启参数时,对前 N 只股票抓取富途页面)
def _get_us_market_session(now_et: datetime) -> str:
"""根据美东时间判定交易时段: pre(4:00-9:30), regular(9:30-16:00), post(16:00-20:00), off 其它。
周末直接 off。夏令时由系统 tz 数据自动处理。"""
if now_et.weekday() >= 5: # Saturday=5 Sunday=6
return 'off'
minutes = now_et.hour * 60 + now_et.minute
if 4*60 <= minutes < 9*60 + 30:
return 'pre'
if 9*60 + 30 <= minutes < 16*60:
return 'regular'
if 16*60 <= minutes < 20*60:
return 'post'
return 'off'
def _current_session() -> str:
if args.session_override:
return args.session_override
now_et = datetime.now(ZoneInfo('America/New_York'))
return _get_us_market_session(now_et)
pre_rows = []
session = _current_session()
if args.premarket and session == 'pre':
pre_candidates = stock_data[: args.premarket_limit]
print(f"🌙 盘前窗口内,准备抓取富途盘前数据 {len(pre_candidates)} 条... (ET {fmt_et_hm()})")
for i, item in enumerate(pre_candidates, 1):
symbol = item['symbol']
futu_detail = integrator.get_futu_stock_details(symbol)
if futu_detail and futu_detail.get('before_open_price'):
# 正常化盘前涨跌幅 (可能含 %)
ratio_raw = futu_detail.get('before_open_change_ratio') or ''
ratio_val = 0.0
try:
ratio_clean = str(ratio_raw).replace('%','').strip()
if ratio_clean:
ratio_f = float(ratio_clean)
# 转为小数
ratio_val = ratio_f/100.0
except Exception:
ratio_val = 0.0
item.update({
'premarket_price': futu_detail.get('before_open_price'),
'premarket_change': futu_detail.get('before_open_change'),
'premarket_change_ratio': ratio_val,
'futu_before_open_price': futu_detail.get('before_open_price'), # 兼容 append_bars_session fallback
})
pre_rows.append(item)
if i % 10 == 0:
print(f"🌙 盘前抓取进度 {i}/{len(pre_candidates)} (ET {fmt_et_hm()})")
print(f"🌙 富途盘前成功获取 {len(pre_rows)} 条 (ET {fmt_et_hm()})")
else:
if args.premarket:
print(f"🌙 当前交易时段为 {session},未执行盘前抓取 (ET {fmt_et_hm()})")
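# 2.1 Write symbols and 1-minute bars to CSV (under data/)
symbol_id_map = {}  # fallback so step 3.1 still works if the write below fails
try: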
symbol_id_map = write_symbols([
{
'symbol': s['symbol'],
'name': s['name'],
'exchange': 'US',
'currency': 'USD',
}
for s in stock_data
])
new_rows = append_bars_1m(stock_data, symbol_id_map, source='eastmoney')
append_features_1m(new_rows)
# 盘前 bars 写入 (不计算特征,避免与常规混淆)
if pre_rows:
append_bars_session(pre_rows, symbol_id_map, source='futu', session='pre')
except Exception as e:
print(f"⚠️ 数据落地失败: {e}")
# 3. 分析数据
raw_signals = analyzer.analyze(stock_data)
# 盘前可选:根据盘前涨幅单独生成预警信号(示例阈值 +3% / -3%
premarket_signals = []
if pre_rows:
for r in pre_rows:
ratio = r.get('premarket_change_ratio') or 0.0
sym = r['symbol']
name = r.get('name', '')
price = r.get('premarket_price') or r.get('eastmoney_price')
if ratio >= 0.03:
premarket_signals.append({'symbol': sym, 'name': name, 'price': price, 'type': 'BUY', 'reason': f'盘前涨幅 {ratio:.2%} 预警'})
elif ratio <= -0.03:
premarket_signals.append({'symbol': sym, 'name': name, 'price': price, 'type': 'SELL', 'reason': f'盘前跌幅 {ratio:.2%} 预警'})
if premarket_signals:
print(f"🌙 盘前预警信号 {len(premarket_signals)} 条 (ET {fmt_et_hm()})")
raw_signals.extend(premarket_signals)
signals = cooldown_filter.filter(raw_signals)
# 3.1 写入 signals.csv
try:
if signals:
append_signals(signals, symbol_id_map)
except Exception as e:
print(f"⚠️ 写入信号失败: {e}")
# 4. 执行交易
if signals:
trader.execute_signals(signals)
else:
print("💤 当前无交易信号")
# 等待下一次扫描
# 记录ETL运行
try:
duration = (now_et() - loop_start).total_seconds()
append_etl_run(loop_count, len(stock_data), len(signals), duration, errors=0)
except Exception as e:
print(f"⚠️ 记录ETL统计失败: {e}")
print(f"⏳ 等待 {args.interval} 秒...")
time.sleep(args.interval)
except KeyboardInterrupt:
print("\n🛑 监控已停止")
if __name__ == "__main__":
main()
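
The session check in `main()` is defined inside the scan loop and duplicated in `premarket_watch.py`. A minimal sketch of a module-level version follows; it accepts any timezone-aware datetime and converts it to US Eastern time itself. The name `us_market_session` is hypothetical, and like the original it does not account for exchange holidays or early closes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")


def us_market_session(moment: datetime) -> str:
    """Classify an aware datetime as 'pre', 'regular', 'post' or 'off' in US Eastern time."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(ET)  # daylight saving is handled by the tz database
    if local.weekday() >= 5:  # Saturday=5, Sunday=6
        return "off"
    minutes = local.hour * 60 + local.minute
    if 4 * 60 <= minutes < 9 * 60 + 30:
        return "pre"
    if 9 * 60 + 30 <= minutes < 16 * 60:
        return "regular"
    if 16 * 60 <= minutes < 20 * 60:
        return "post"
    return "off"
```

A call such as `us_market_session(datetime.now(timezone.utc))` would replace both nested helpers, and passing a fixed datetime makes the boundaries easy to unit-test.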

261
premarket_watch.py Normal file

@@ -0,0 +1,261 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""盘前监控脚本
功能:
1. 周期性抓取指定数量美股的盘前价格 (富途 before_open_stock_info)
2. 在终端实时打印表格 (可 --interval 控制刷新秒数)
3. 支持 --limit 获取排行榜前N只 或 --symbols 手动指定列表
4. 自动判断美东时段是否为盘前 (4:00-9:30 ET), 非盘前可提示或使用 --force 继续
依赖已有模块: futu.py 中的 EastMoneyAPI / StockDataIntegrator
示例:
python premarket_watch.py --limit 15 --interval 30
python premarket_watch.py --symbols NVDA,AAPL,TSLA --interval 20
python premarket_watch.py --limit 10 --once # 单次输出
python premarket_watch.py --limit 10 --force # 忽略时段检查
注意: 逐个请求富途页面存在速率限制风险, 建议 limit 不要太大; 脚本仅演示用途。
"""
import argparse
import time
from datetime import datetime
from concurrent.futures import ThreadPoolExecutor, as_completed
from zoneinfo import ZoneInfo
from typing import List, Dict
from futu import EastMoneyAPI, StockDataIntegrator
from utils_time import now_et, fmt_et_hm, fmt_et
from data_writer import write_symbols, append_premarket_bars, append_premarket_signals
def parse_args():
parser = argparse.ArgumentParser(description='盘前实时监控脚本')
group = parser.add_mutually_exclusive_group(required=False)
group.add_argument('--limit', type=int, default=10, help='获取东方财富排行前N只 (默认10)')
group.add_argument('--symbols', type=str, help='逗号分隔的股票代码列表, 覆盖 limit')
parser.add_argument('--interval', type=int, default=10, help='Refresh interval in seconds (default 10)')
parser.add_argument('--once', action='store_true', help='只执行一次抓取并退出')
parser.add_argument('--force', action='store_true', help='忽略盘前时段判断强制抓取')
parser.add_argument('--sleep', type=float, default=0.0, help='顺序模式下的延时(已多线程可忽略)')
parser.add_argument('--max-workers', type=int, default=0, help='线程最大数量(0=自动=股票数,建议限制避免过度)')
parser.add_argument('--no-color', action='store_true', help='关闭ANSI颜色输出')
parser.add_argument('--save', action='store_true', help='保存盘前快照和信号到 data/premarket_*.csv')
return parser.parse_args()
def get_et_session(now_et: datetime) -> str:
if now_et.weekday() >= 5:
return 'off'
m = now_et.hour * 60 + now_et.minute
if 4*60 <= m < 9*60 + 30:
return 'pre'
if 9*60 + 30 <= m < 16*60:
return 'regular'
if 16*60 <= m < 20*60:
return 'post'
return 'off'
def fetch_symbol_list(limit: int, api: EastMoneyAPI) -> List[Dict]:
raw, _ = api.get_us_stocks(page_size=limit)
parsed: List[Dict] = []
for item in raw:
data = api.parse_stock_data(item)
if data:
parsed.append({'symbol': data['symbol'], 'name': data['name']})
return parsed
def parse_symbols_arg(symbols_str: str) -> List[Dict]:
result = []
for s in symbols_str.split(','):
sym = s.strip().upper()
if sym:
result.append({'symbol': sym, 'name': ''})
return result
def safe_ratio_to_float(ratio_raw) -> float:
if ratio_raw in (None, ''):
return 0.0
try:
txt = str(ratio_raw).replace('%', '').strip()
if not txt:
return 0.0
return float(txt) / 100.0
except Exception:
return 0.0
def colorize(s: str, positive: bool, no_color: bool) -> str:
if no_color:
return s
if positive:
return f"\x1b[32m{s}\x1b[0m" # 绿色
return f"\x1b[31m{s}\x1b[0m" # 红色
def format_table(rows: List[Dict], no_color: bool) -> str:
headers = ['Symbol', 'Name', 'Premarket Price', 'Change', 'Change %', 'Updated']
col_widths = [len(h) for h in headers]
for r in rows:
col_widths[0] = max(col_widths[0], len(r.get('symbol','')))
col_widths[1] = max(col_widths[1], len(r.get('name','')))
col_widths[2] = max(col_widths[2], len(r.get('premarket_price','')))
col_widths[3] = max(col_widths[3], len(r.get('premarket_change','')))
col_widths[4] = max(col_widths[4], len(r.get('premarket_change_ratio_fmt','')))
col_widths[5] = max(col_widths[5], len(r.get('ts','')))
def pad(text, width):
return str(text).ljust(width)
line_sep = '-' * (sum(col_widths) + len(col_widths)*3 - 1)
header_line = ' '.join(pad(h, col_widths[i]) for i, h in enumerate(headers))
body_lines = []
for r in rows:
pos = safe_ratio_to_float(r.get('premarket_change_ratio')) >= 0
ratio_fmt = r.get('premarket_change_ratio_fmt','')
ratio_fmt = colorize(ratio_fmt, pos, no_color)
change = r.get('premarket_change','')
change = colorize(change, pos, no_color)
body_lines.append(' '.join([
pad(r.get('symbol',''), col_widths[0]),
pad(r.get('name',''), col_widths[1]),
pad(r.get('premarket_price',''), col_widths[2]),
pad(change, col_widths[3]),
pad(ratio_fmt, col_widths[4]),
pad(r.get('ts',''), col_widths[5]),
]))
return f"{header_line}\n{line_sep}\n" + '\n'.join(body_lines)
def main():
args = parse_args()
api = EastMoneyAPI()
integrator = StockDataIntegrator()
if args.symbols:
symbols = parse_symbols_arg(args.symbols)
print(f"📋 使用自定义股票列表: {[s['symbol'] for s in symbols]}")
else:
symbols = fetch_symbol_list(args.limit, api)
print(f"📋 获取排行前 {args.limit} 只股票: {[s['symbol'] for s in symbols]}")
if not symbols:
print("❌ 无有效股票列表, 退出")
return
if not args.force:
now_et = datetime.now(ZoneInfo('America/New_York'))
session = get_et_session(now_et)
if session != 'pre':
print(f"⚠️ 当前美东时段为 {session} (ET {now_et.strftime('%H:%M')}), 非盘前, 使用 --force 可强制抓取")
return
def _fetch_one(info: Dict) -> Dict:
sym = info['symbol']
name = info['name']
futu = integrator.get_futu_stock_details(sym)
if futu and futu.get('before_open_price'):
ratio_raw = futu.get('before_open_change_ratio')
ratio_val = safe_ratio_to_float(ratio_raw)
return {
'symbol': sym,
'name': name,
'premarket_price': futu.get('before_open_price',''),
'premarket_change': futu.get('before_open_change',''),
'premarket_change_ratio': ratio_raw,
'premarket_change_ratio_fmt': f"{ratio_val*100:.2f}%" if ratio_raw else '',
'ts': fmt_et_hm(),
}
return {
'symbol': sym,
'name': name,
'premarket_price': '-',
'premarket_change': '-',
'premarket_change_ratio': '',
'premarket_change_ratio_fmt': '',
'ts': fmt_et_hm(),
}
def run_once():
# 动态线程数:若 max-workers=0 用股票数,做一个上限保护例如 128
worker_target = args.max_workers if args.max_workers > 0 else len(symbols)
max_cap = 128 # 安全软限制,避免过度线程导致资源问题
workers = min(worker_target, max_cap)
if workers < len(symbols):
print(f"⚠️ 线程数限制为 {workers} (股票 {len(symbols)}), 使用 --max-workers 调整或提高上限")
start = time.time()
rows: List[Dict] = []
# 多线程并发抓取
with ThreadPoolExecutor(max_workers=workers) as executor:
future_map = {executor.submit(_fetch_one, info): info['symbol'] for info in symbols}
for fut in as_completed(future_map):
try:
rows.append(fut.result())
except Exception as e:
sym = future_map[fut]
rows.append({
'symbol': sym,
'name': '',
'premarket_price': 'ERR',
'premarket_change': '-',
'premarket_change_ratio': '',
'premarket_change_ratio_fmt': '',
'ts': fmt_et_hm(),
})
print(f"⚠️ {sym} 抓取异常: {e}")
# 保持原列表顺序
rows.sort(key=lambda r: [s['symbol'] for s in symbols].index(r['symbol']))
elapsed = time.time() - start
print(f"🕒 ET {fmt_et()} | 刷新间隔 {args.interval}s | 总计 {len(rows)}")
print(f"⏱️ 本轮耗时 {elapsed:.2f}s, 线程 {workers}")
print(format_table(rows, args.no_color))
if args.save:
# 建立 symbol 基础信息用于写入 symbols.csv缺 name 也允许)
symbol_base = [{'symbol': r['symbol'], 'name': r.get('name',''), 'exchange': 'US', 'currency': 'USD'} for r in rows]
symbol_id_map = write_symbols(symbol_base)
append_premarket_bars(rows, symbol_id_map, source='futu')
# 生成盘前阈值信号±3%
signals = []
for r in rows:
raw_ratio = r.get('premarket_change_ratio')
val = safe_ratio_to_float(raw_ratio)
if val >= 0.03:
signals.append({
'symbol': r['symbol'],
'direction': 'BUY',
'reason': f"盘前涨幅 {val*100:.2f}% 触发阈值",
'params': {'premarket_price': r.get('premarket_price'), 'premarket_change_ratio': val}
})
elif val <= -0.03:
signals.append({
'symbol': r['symbol'],
'direction': 'SELL',
'reason': f"盘前跌幅 {val*100:.2f}% 触发阈值",
'params': {'premarket_price': r.get('premarket_price'), 'premarket_change_ratio': val}
})
append_premarket_signals(signals, symbol_id_map)
if signals:
print(f"💡 已保存盘前信号 {len(signals)} 条 -> data/premarket_signals.csv")
print("🗂️ 已保存盘前快照 -> data/premarket_bars.csv")
if args.once:
run_once()
return
while True:
try:
run_once()
if args.interval <= 0:
break
time.sleep(args.interval)
except KeyboardInterrupt:
print("\n🛑 已停止盘前监控")
break
except Exception as e:
print(f"⚠️ 本轮捕获异常: {e}")
time.sleep(args.interval)
if __name__ == '__main__':
main()
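
In `format_table`, the change columns are colorized before padding, so the ANSI escape codes count toward `ljust` and colored cells come out narrower than plain ones. A minimal sketch of padding by visible width follows; `visible_len` and `pad_visible` are hypothetical helpers, and the sketch does not handle double-width CJK characters in the name column.

```python
import re

# Matches SGR color sequences such as "\x1b[32m" and "\x1b[0m".
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def visible_len(text: str) -> int:
    """Length of the text as displayed, ignoring ANSI color codes."""
    return len(_ANSI_RE.sub("", text))


def pad_visible(text: str, width: int) -> str:
    """Left-justify text to the given visible width, leaving color codes intact."""
    return text + " " * max(0, width - visible_len(text))
```

Swapping `pad` for `pad_visible` on the two colorized columns would keep the table aligned whether or not `--no-color` is set.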

2
requirements.txt Normal file

@@ -0,0 +1,2 @@
requests>=2.28.0
beautifulsoup4>=4.12.0

30
signal_filter.py Normal file

@@ -0,0 +1,30 @@
# -*- coding: utf-8 -*-
"""Signal cooldown filter.
- Suppresses repeated same-direction signals for a symbol within the cooldown window.
- Adds a generated_at_utc field (UTC, ISO 8601) to each accepted signal.
"""
from datetime import datetime, timezone, timedelta
from typing import List, Dict, Any


class SignalCooldownFilter:
    def __init__(self, cooldown_minutes: int = 30):
        self.cooldown = timedelta(minutes=cooldown_minutes)
        # key: (symbol, direction) -> time the last signal was accepted
        self.last_time: Dict[tuple, datetime] = {}

    def filter(self, signals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        now = datetime.now(timezone.utc)
        accepted = []
        for s in signals:
            symbol = s.get('symbol')
            direction = s.get('type') or s.get('direction')
            key = (symbol, direction)
            lt = self.last_time.get(key)
            if lt is not None and now - lt < self.cooldown:
                continue
            # Accept: record the time and normalize the signal in place.
            self.last_time[key] = now
            s['direction'] = direction
            s['generated_at_utc'] = now.strftime('%Y-%m-%dT%H:%M:%SZ')
            accepted.append(s)
        return accepted

1846
trade_log.csv Normal file

File diff suppressed because it is too large

62
trader.py Normal file

@@ -0,0 +1,62 @@
# -*- coding: utf-8 -*-
"""
Trade execution module (simulated stock buys/sells)
"""
import time
import csv
from datetime import datetime
from utils_time import now_et, fmt_et


class Trader:
    def __init__(self):
        self.log_file = "trade_log.csv"
        self._init_log_file()

    def _init_log_file(self):
        try:
            with open(self.log_file, 'a', newline='', encoding='utf-8-sig') as f:
                pass  # Make sure the file exists
        except Exception as e:
            print(f"❌ Failed to initialize trade log: {e}")

    def execute_signals(self, signals):
        """
        Execute trade signals
        Args:
            signals: list of trade signals
        """
        if not signals:
            return
        print(f"⚡ Received {len(signals)} trade signal(s), preparing to execute...")
        for signal in signals:
            self._execute_single_trade(signal)

    def _execute_single_trade(self, signal):
        """Execute a single trade"""
        action = signal.get('type', '')
        symbol = signal.get('symbol', '')
        name = signal.get('name', '')
        price = signal.get('price', '')
        reason = signal.get('reason', '')
        # Simulate trade latency
        time.sleep(0.1)
        timestamp = fmt_et()
        log_entry = f"[{timestamp}] {action} {symbol} ({name}) @ ${price} | Reason: {reason}"
        print(f"💸 Trade executed: {log_entry}")
        # Record to file
        self._log_trade(timestamp, action, symbol, name, price, reason)

    def _log_trade(self, timestamp, action, symbol, name, price, reason):
        try:
            with open(self.log_file, 'a', newline='', encoding='utf-8-sig') as f:
                writer = csv.writer(f)
                writer.writerow([timestamp, action, symbol, name, price, reason])
        except Exception as e:
            print(f"❌ Failed to write trade log: {e}")

13
utils_id.py Normal file

@@ -0,0 +1,13 @@
# -*- coding: utf-8 -*-
import hashlib


def stable_symbol_id(symbol: str, exchange: str = "US") -> int:
    """Generate a stable positive 64-bit int ID from symbol+exchange.

    Collisions are extremely unlikely for our scale.
    """
    base = f"{exchange}:{symbol}".upper().encode("utf-8")
    h = hashlib.sha1(base).digest()
    # take first 8 bytes as unsigned 64-bit integer
    val = int.from_bytes(h[:8], byteorder="big", signed=False)
    # constrain to 63-bit to avoid CSV tools issues with signedness
    return val & ((1 << 63) - 1)

26
utils_time.py Normal file

@@ -0,0 +1,26 @@
from datetime import datetime
from zoneinfo import ZoneInfo

ET_TZ = ZoneInfo('America/New_York')
UTC_TZ = ZoneInfo('UTC')


def now_et():
    return datetime.now(ET_TZ)


def now_utc():
    return datetime.now(UTC_TZ)


def fmt_et(dt: datetime | None = None, with_date: bool = True) -> str:
    if dt is None:
        dt = now_et()
    return dt.strftime('%Y-%m-%d %H:%M:%S' if with_date else '%H:%M:%S')


def fmt_et_hm(dt: datetime | None = None) -> str:
    if dt is None:
        dt = now_et()
    return dt.strftime('%H:%M:%S')


def fmt_utc(dt: datetime | None = None) -> str:
    if dt is None:
        dt = now_utc()
    return dt.strftime('%Y-%m-%d %H:%M:%S')

335
盘前操作.md Normal file

@@ -0,0 +1,335 @@
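## Pre-market Data Quant Workflow
C:\Users\86188\miniconda3\Scripts\activate
1. **Data cleaning and feature engineering**
   - Read `premarket_bars.csv` and keep only rows with session=pre.
   - Compute the pre-market change (change_ratio), the return versus the previous close (pre_return_vs_prev_close), and a liquidity proxy (e.g., pre_volume).
   - Write `premarket_features.csv` as input for downstream quant models and LLM inference.
2. **Signal generation and strategy design**
   - Rule-based: for example, a pre-market gain above 3% generates a BUY signal and a drop below -3% generates a SELL signal.
   - Multi-factor: combine pre-market features, historical performance, the distribution of abnormal moves, and similar inputs into a quantitative scoring model.
   - LLM-based: feed pre-market features, historical data, and market news to an LLM to produce multi-dimensional signals and commentary.
   - Write signals to `premarket_signals.csv`, recording the source, confidence, and a reasoning summary.
3. **Backtesting and performance evaluation**
   - Backtest the pre-market signals against historical quotes to evaluate return, risk, and win rate.
   - Compare the rule-based, multi-factor, and LLM-based approaches, and refine the signal logic accordingly.
   - Archive results in a backtest report; an LLM can generate the strategy summary automatically.
4. **Automated trading and risk control**
   - Pre-market signals can be pushed automatically to the trading system, in both paper and live modes.
   - Use LLM-generated risk notes to adjust position sizes and risk parameters dynamically.
   - Archive failed samples and abnormal signals automatically for later diagnosis and model iteration.
5. **Collaborative analysis with LLMs**
   - Pre-market data, signals, and backtest results can serve as prompts to generate strategy documents, explanations of abnormal moves, and risk notes.
   - Support multi-turn Q&A and factor explanations to make collaboration between quant engineers and LLMs more efficient.
6. **Monitoring and continuous optimization**
   - Archive pre-market data and signals; periodically analyze success rate, error distribution, and strategy performance.
   - Use LLM-assisted diagnosis and fix suggestions to keep improving the workflow.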
---
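# Pre-market Operations Guide
**Suggested next steps: combining LLMs with quant engineering best practices**
1. **Data quality and multi-source fusion**
   - Fuse Futu / EastMoney / Yahoo data, with automatic fallback and anomaly detection.
   - Archive failed samples automatically so an LLM can later analyze the anomalies and augment the data.
2. **Pre-market feature engineering and LLM input**
   - Extend the pre-market features: pre_return_vs_prev_close, liquidity proxy, spread, distribution of abnormal moves, and so on.
   - Write `premarket_features.csv` directly as structured input for LLM training/inference.
3. **Signal generation and LLM-assisted decisions**
   - Generate signals in parallel from the traditional rule (±3%) and from an LLM (or LLM plus factor fusion), recording the model version and inference parameters.
   - Pre-market signals can be passed to an LLM via prompt/embedding to produce richer "commentary" and "risk notes".
4. **Cooldown and deduplication**
   - Reuse signal_filter.py to support cooldown windows for LLM signals and multi-factor deduplication.
   - When writing a signal, record the model source, confidence, and reasoning summary.
5. **Automated backtesting and monitoring**
   - Archive pre-market data and signals automatically, and trigger the backtest script periodically to compare the LLM with the traditional rules.
   - ETL_RUNS/health files record success rate, latency, and error distribution to support LLM diagnosis.
6. **LLM integration and inference pipeline**
   - Pre-market data can be used directly as LLM input ("Analyze today's pre-market moves and generate trading suggestions"), with support for prompt engineering and multi-turn inference.
   - Generate prompts automatically from historical data and compare multiple models (e.g., GPT-4 / Claude / in-house models).
7. **Alerts and intelligent explanations**
   - Push abnormal pre-market signals and moves to Slack/email, with LLM-generated "commentary" and "suggested actions".
   - Archive failed samples automatically and have an LLM periodically analyze the causes and suggest fixes.
8. **Database and high-performance storage**
   - Migrate gradually from CSV to SQLite/PostgreSQL to support high-frequency queries and batch LLM inference.
   - The pre-market table schema can map directly onto LLM training/inference datasets.
9. **Extensible prompt engineering**
   - Design prompt templates that are filled in automatically with pre-market features, signals, and historical performance to improve LLM inference quality.
   - Support "multi-turn Q&A" and "factor explanations" to ease strategy iteration.
10. **Workflow for quant engineers working with LLMs**
    - With pre-market data archived automatically, quant engineers can ask an LLM at any time to analyze pre-market moves and suggest strategies.
    - Combine this with LLM-generated "strategy documents" for human-machine collaborative decisions.

**Recommended LLM use cases**
- Explaining pre-market moves and generating trading suggestions
- Assessing pre-market signal confidence and flagging risks
- Diagnosing failed samples and suggesting fixes
- Designing prompts and multi-turn inference pipelines
- Generating and archiving quant strategy documents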
---
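- Keep time fields comparable across time zones: store UTC as the primary value and also record ET (US Eastern) for display.
- Generate controllable alert signals, and record each signal's source and cooldown policy.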
---
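**Overall architecture**
- Fetch layer: `premarket_watch.py` (real-time/interactive) and `monitor.py` (batch/production) trigger the fetches.
- Parsing layer: `futu.py` (`FutuStockParser.parse_javascript_data` / `parse_price_data`), with added robustness and fallbacks (see the next section).
- Persistence layer: `data_writer.py` writes snapshots to `bars_1m.csv`, adds a `session` field, and supports `append_bars_session` for writing `session=pre`.
- Signal layer: `market_analyzer.py` / `signal_filter.py` handle signal generation and cooldown rules.
- Monitoring/alerting: logs + ETL statistics (`etl_runs.csv`) + failed-HTML dumps.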
---
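**Fetch strategy (key points)**
- Preferred source: Futu (`futu`) `before_open_stock_info`; if Futu fails, fall back to EastMoney / Yahoo Finance.
- Concurrency: `premarket_watch.py` supports `--max-workers`; keep it at 4-8 initially to avoid triggering anti-scraping controls.
- Retry and degradation: at most 2 retries per symbol with exponential backoff (0.5s -> 1s); on failure, save the HTML to `data/failed_{symbol}_{ts}.html`.
- Validation: run basic checks on the fetched HTML/JSON (length, presence of `__INITIAL_STATE__`, presence of a price match by regex); otherwise treat the fetch as failed.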
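
The retry and validation rules above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `fetch_with_retry`, `looks_valid`, the `min_length` default, and the `PRICE_RE` pattern are all hypothetical names and values chosen for the example, and the caller supplies the actual fetch function.

```python
import re
import time
from typing import Callable, Optional

# Hypothetical price pattern; the real page may encode the price differently.
PRICE_RE = re.compile(r'"price"\s*:\s*"?\d+(\.\d+)?')


def looks_valid(body: str, min_length: int = 200) -> bool:
    """Basic sanity checks: length, state marker, and a price-like field."""
    if not body or len(body) < min_length:
        return False
    if "__INITIAL_STATE__" not in body:
        return False
    return PRICE_RE.search(body) is not None


def fetch_with_retry(fetch: Callable[[], str],
                     retries: int = 2,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call fetch(); on an exception or an invalid body, retry up to `retries`
    times with exponential backoff (0.5s, then 1s). Returns None on failure."""
    for attempt in range(retries + 1):
        try:
            body = fetch()
            if looks_valid(body):
                return body
        except Exception:
            pass  # treat any fetch error as a failed attempt
        if attempt < retries:
            sleep(base_delay * (2 ** attempt))
    return None
```

When `fetch_with_retry` returns `None`, the caller would save the last response under `data/failed_{symbol}_{ts}.html` and move on to the fallback source. The `sleep` parameter is injectable so the backoff can be tested without waiting.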
---
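**Time and time zone conventions**
- Storage (CSV / DB) uses UTC as the primary value (field names end in `_utc`), which keeps backtests consistent across time zones.
- External display and terminal output use ET (US Eastern, `America/New_York`); the code uses `fmt_et()` / `fmt_et_hm()` from `utils_time.py`.
- Either keep both `ts_utc` and `ts_et` in each record (the latter optional), or keep only `ts_utc` and format it as ET in the query/display layer.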
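
As an illustration of the "store UTC, display ET" convention, the sketch below converts in both directions using only the standard library. The helper names `to_utc_iso` and `utc_iso_to_et` are hypothetical and are not part of `utils_time.py`; the ISO format with a trailing `Z` matches what `signal_filter.py` writes for `generated_at_utc`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ET_TZ = ZoneInfo("America/New_York")
ISO_UTC = "%Y-%m-%dT%H:%M:%SZ"


def to_utc_iso(dt: datetime) -> str:
    """Convert an aware datetime to the UTC ISO string used for storage."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a time zone before storing")
    return dt.astimezone(timezone.utc).strftime(ISO_UTC)


def utc_iso_to_et(ts_utc: str) -> str:
    """Format a stored UTC ISO string as ET wall-clock time for display."""
    dt = datetime.strptime(ts_utc, ISO_UTC).replace(tzinfo=timezone.utc)
    return dt.astimezone(ET_TZ).strftime("%Y-%m-%d %H:%M:%S")
```

Because the conversion goes through `zoneinfo`, daylight saving time is handled by the time zone database rather than by a fixed offset.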
---
**File/table design (CSV first, with a later migration path to PostgreSQL)**
- File naming (under `data/`)
  - `premarket_bars.csv` (pre-market snapshots)
  - `premarket_signals.csv` (signals/alerts generated pre-market)
  - `premarket_features.csv` (pre-market features, if needed)
  - `failed_html/` holds the HTML of failed fetches for manual troubleshooting
- `premarket_bars.csv` (CSV)
  - symbol_id (int)
  - symbol (text)
  - ts_utc (ISO UTC)
  - ts_et (ISO ET) -- optional, for human inspection
  - price (float)
  - change (float)
  - change_ratio (float) -- stored as a decimal, e.g. -0.038 means -3.8%
  - volume (int/empty)
  - source (text) -- 'futu' / 'eastmoney' / 'yahoo'
  - session (text) -- 'pre' / 'regular' / 'post'
  - raw_file (text) -- file name of the raw HTML/JSON, if it was saved
- `premarket_signals.csv`
  - id (text) -- e.g. symbol_id plus generation time
  - symbol_id, symbol
  - generated_at_utc
  - signal_type ('premarket_alert')
  - direction ('BUY'/'SELL')
  - score (float)
  - reason (text)
  - params_json (text) -- holds the triggering fields (e.g. pre_price, pre_change_ratio)
  - model_name, version
  - expires_at_utc
- Example PostgreSQL DDL (simplified)
```sql
CREATE TABLE premarket_bars (
id BIGSERIAL PRIMARY KEY,
symbol TEXT NOT NULL,
symbol_id BIGINT,
ts_utc TIMESTAMPTZ NOT NULL,
price NUMERIC,
change NUMERIC,
change_ratio NUMERIC,
source TEXT,
session TEXT,
raw_file TEXT
);
CREATE INDEX idx_premarket_bars_symbol_ts ON premarket_bars(symbol, ts_utc DESC);
CREATE TABLE premarket_signals (
id TEXT PRIMARY KEY,
symbol TEXT,
symbol_id BIGINT,
generated_at_utc TIMESTAMPTZ,
direction TEXT,
score NUMERIC,
reason TEXT,
params JSONB
);
```
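
To make the `premarket_signals.csv` columns concrete, here is a rough sketch of assembling one row. The function name `build_signal_row`, the 30-minute default expiry, and the `rule_threshold` / `1` defaults for `model_name` / `version` are assumptions made for illustration; the id follows the "symbol_id plus generation time" pattern described above.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

ISO_UTC = "%Y-%m-%dT%H:%M:%SZ"


def build_signal_row(symbol_id: int, symbol: str, direction: str,
                     score: float, reason: str, params: Dict[str, Any],
                     now: Optional[datetime] = None,
                     ttl_minutes: int = 30,               # assumed expiry window
                     model_name: str = "rule_threshold",  # hypothetical label
                     version: str = "1") -> Dict[str, Any]:
    """Assemble one premarket_signals.csv row as a plain dict."""
    now = now or datetime.now(timezone.utc)
    return {
        "id": f"{symbol_id}-{now.strftime('%Y%m%dT%H%M%SZ')}",
        "symbol_id": symbol_id,
        "symbol": symbol,
        "generated_at_utc": now.strftime(ISO_UTC),
        "signal_type": "premarket_alert",
        "direction": direction,
        "score": score,
        "reason": reason,
        # Serialize the triggering fields so they fit a single CSV column.
        "params_json": json.dumps(params, sort_keys=True),
        "model_name": model_name,
        "version": version,
        "expires_at_utc": (now + timedelta(minutes=ttl_minutes)).strftime(ISO_UTC),
    }
```

A dict in this shape can be written with `csv.DictWriter`; in PostgreSQL the `params_json` string would map to the `params JSONB` column.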
---
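**Suggested ETL flow (per round)**
1. Fetch: fetch the `futu` page/JS data concurrently for the configured symbol list
2. Validate: check field completeness (price is non-empty, change_ratio is parseable)
3. Persist raw: write the raw HTML/JSON to `failed_html/` or `raw/` (only on failure, or when configured to save)
4. Normalize: convert the change percentage to a decimal and the price to a float
5. Persist bar: write `premarket_bars.csv` or insert into `premarket_bars`
6. Feature/Signal: generate alert signals from rules or a model, and write `premarket_signals.csv`
7. Stats/ETL: append one row to `etl_runs.csv` (fetched_count, signal_count, duration)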
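
The Normalize step can be sketched as below. It is a minimal version under stated assumptions: strings ending in `%` are percentages, bare numbers are already decimals, and unparseable input yields `None`. The names `normalize_ratio` and `normalize_price` are hypothetical and are not the project's `safe_ratio_to_float`.

```python
from typing import Optional, Union

Number = Union[int, float]


def normalize_ratio(raw: Union[str, Number, None]) -> Optional[float]:
    """Convert '4.02%' -> 0.0402; pass numeric decimals through unchanged."""
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    text = str(raw).strip().replace(",", "")
    is_percent = text.endswith("%")
    if is_percent:
        text = text[:-1].strip()
    try:
        value = float(text)
    except ValueError:
        return None
    return value / 100.0 if is_percent else value


def normalize_price(raw: Union[str, Number, None]) -> Optional[float]:
    """Convert a price such as '$1,234.50' or 12.5 to a float."""
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return float(raw)
    text = str(raw).strip().lstrip("$").replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None
```

Returning `None` rather than `0.0` for bad input lets the Validate step tell "no change" apart from "could not parse".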
---
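**Recommended pre-market features (can be stored in `premarket_features.csv`)**
- pre_return_vs_prev_close = (pre_price / prev_close) - 1
- pre_vs_open = (pre_price / open_price) - 1
- liquidity_proxy: pre_volume (if available) or an estimate of trading intensity
- spread_estimate: computed when bid and ask prices are available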
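
A small sketch of how these features might be computed for one symbol. The function name `premarket_features` is hypothetical, and the spread is expressed relative to the mid price, which is one common convention and an assumption here, since the text only says to compute it when bid/ask are available. Missing inputs yield `None` instead of raising.

```python
from typing import Dict, Optional


def ratio_minus_one(numerator: Optional[float],
                    denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator - 1, or None if an input is unusable."""
    if numerator is None or denominator is None or denominator <= 0:
        return None
    return numerator / denominator - 1.0


def premarket_features(pre_price: Optional[float],
                       prev_close: Optional[float],
                       open_price: Optional[float] = None,
                       bid: Optional[float] = None,
                       ask: Optional[float] = None,
                       pre_volume: Optional[float] = None) -> Dict[str, Optional[float]]:
    """Compute the recommended pre-market features for one symbol."""
    spread = None
    if bid is not None and ask is not None and ask >= bid > 0:
        mid = (ask + bid) / 2.0
        spread = (ask - bid) / mid  # relative spread (assumed convention)
    return {
        "pre_return_vs_prev_close": ratio_minus_one(pre_price, prev_close),
        "pre_vs_open": ratio_minus_one(pre_price, open_price),
        "liquidity_proxy": pre_volume,
        "spread_estimate": spread,
    }
```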
---
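**Signal governance and safety policies**
- Cooldown window: a minimum 30-minute cooldown for the same (symbol, direction) (already implemented in `signal_filter.py`)
- Concurrency protection: limit calls to Futu pages with `--max-workers`; the suggested production value is 4~8
- Failures and alerts: when a symbol fails to fetch N times in a row (e.g. 5), raise an alert and pause fetching for that symbol
- Optional thresholds: a pre-market gain above +3% raises a BUY alert and a drop below -3% raises a SELL alert (configurable)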
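
The "pause after N consecutive failures" rule could look like the sketch below. `FailureTracker` and its method names are hypothetical and do not exist in the codebase; the default of 5 comes from the example value in the list above.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class FailureTracker:
    """Pause a symbol after too many consecutive fetch failures."""

    def __init__(self, max_consecutive_failures: int = 5):
        self.max_failures = max_consecutive_failures
        self.failures: DefaultDict[str, int] = defaultdict(int)
        self.paused: Set[str] = set()

    def record(self, symbol: str, ok: bool) -> bool:
        """Record one fetch result.

        Returns True only when this call newly paused the symbol,
        which is the moment to send an alert.
        """
        if ok:
            self.failures[symbol] = 0  # any success resets the streak
            return False
        self.failures[symbol] += 1
        if self.failures[symbol] >= self.max_failures and symbol not in self.paused:
            self.paused.add(symbol)
            return True
        return False

    def is_paused(self, symbol: str) -> bool:
        return symbol in self.paused

    def resume(self, symbol: str) -> None:
        """Manually re-enable a symbol and clear its failure streak."""
        self.paused.discard(symbol)
        self.failures[symbol] = 0
```

The fetch loop would skip symbols where `is_paused()` is true and send one notification when `record()` returns `True`, so a persistently failing symbol does not trigger an alert every round.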
---
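**Monitoring and alerting**
- The ETL log (`etl_runs.csv`) is used to monitor collection stability (fetched_count and error rate)
- Treat the number of files in `failed_html/` as a health metric; a sharp increase over a short period suggests anti-scraping controls or a page structure change
- Email/Slack notifications can be integrated: notify ops/strategy staff when a large pre-market signal appears or fetches fail repeatedly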
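
One way to turn `etl_runs.csv` into an alert condition is sketched below. It assumes the file has a header row, that `fetched_count` counts successful fetches, and that there is an `error_count` column; the latter is hypothetical, since the documented columns are only fetched_count, signal_count, and duration. The 20% threshold and 10-run window are likewise illustrative.

```python
import csv
from typing import Dict, Iterable, List


def read_runs(fileobj: Iterable[str]) -> List[Dict[str, str]]:
    """Read etl_runs.csv rows (header row expected) into dicts."""
    return list(csv.DictReader(fileobj))


def error_rate(rows: List[Dict[str, str]], window: int = 10) -> float:
    """Share of failed fetches over the most recent `window` runs."""
    recent = rows[-window:]
    fetched = sum(int(r.get("fetched_count") or 0) for r in recent)
    errors = sum(int(r.get("error_count") or 0) for r in recent)  # assumed column
    total = fetched + errors
    return errors / total if total else 0.0


def should_alert(rows: List[Dict[str, str]],
                 threshold: float = 0.2, window: int = 10) -> bool:
    """True when the recent error rate exceeds the threshold."""
    return error_rate(rows, window) > threshold
```

A scheduled job could call `should_alert(read_runs(open("data/etl_runs.csv", newline="")))` and forward a positive result to whichever email or Slack hook is in use.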
---
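**Storage, archiving, and retention policy**
- Snapshot retention: rotate `premarket_bars.csv` daily or archive it periodically; keep 90 days of high-frequency data online and move long-term data to cold storage (S3)
- Raw HTML: save only failed samples, or one sample every N fetches, to avoid filling the disk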
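
If snapshots are rotated daily, choosing which files to move to cold storage reduces to a date comparison. The sketch below assumes a rotated file name of the form `premarket_bars_YYYY-MM-DD.csv`, which is a hypothetical convention the source does not specify, and it only selects candidates; the upload to S3 is left out.

```python
import re
from datetime import date, timedelta
from typing import Iterable, List

# Hypothetical naming convention for daily-rotated snapshot files.
NAME_RE = re.compile(r"^premarket_bars_(\d{4})-(\d{2})-(\d{2})\.csv$")


def files_to_archive(filenames: Iterable[str], today: date,
                     keep_days: int = 90) -> List[str]:
    """Return rotated snapshot files strictly older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    selected = []
    for name in filenames:
        match = NAME_RE.match(name)
        if not match:
            continue  # ignore files that do not follow the naming convention
        year, month, day = (int(part) for part in match.groups())
        if date(year, month, day) < cutoff:
            selected.append(name)
    return sorted(selected)
```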
---
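**Toolchain and code locations**
- Fetching/parsing: `futu.py` (`FutuStockParser.parse_javascript_data` / `parse_price_data`)
- Real-time monitoring: `premarket_watch.py` (already supports multithreading, ET time display, and saving failed responses)
- Persistence: `data_writer.py` (adds the `session` field and `append_bars_session`)
- Time utilities: `utils_time.py` (ET/UTC formatting)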
---
**Example commands**
- One-off concurrent fetch of 10 symbols with output (for checking):
```bash
python premarket_watch.py --limit 10 --once --force --max-workers 8
```
- Continuous run (refresh every 30s):
```bash
python premarket_watch.py --limit 20 --interval 30 --max-workers 6
```
- Save pre-market snapshots and signals (written to `data/premarket_bars.csv` / `data/premarket_signals.csv`):
```bash
python premarket_watch.py --limit 25 --interval 60 --save --max-workers 6 --force
```
After running, the `data/` directory contains:
- `premarket_bars.csv`: new rows (session=pre, change_ratio as a decimal)
- `premarket_signals.csv`: BUY/SELL threshold signals (±3%)
- `symbols.csv`: base info for missing symbols, filled in automatically
---
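### Detailed Steps for Pre-market Data Cleaning and Feature Engineering
1. **Read and filter the pre-market data**
   - Read `data/premarket_bars.csv` with pandas or the csv module.
   - Keep only rows where `session=pre`.
   - Example (pandas):
```python
import pandas as pd
df = pd.read_csv('data/premarket_bars.csv')
# .copy() so that adding columns later does not modify a view of df
pre_df = df[df['session'] == 'pre'].copy()
```
2. **Compute the pre-market features**
   - Pre-market change: use the `change_ratio` column directly.
   - Return versus the previous close (pre_return_vs_prev_close): requires joining the previous day's close (from historical bars or eastmoney/yahoo data). Formula:
```python
# Assumes pre_df has a prev_close column
pre_df['pre_return_vs_prev_close'] = pre_df['price'] / pre_df['prev_close'] - 1
```
   - Liquidity proxy (e.g. pre_volume): use the volume field if present; otherwise approximate it with turnover, market cap, or similar.
3. **Write the feature file**
   - Select the required feature columns, e.g. symbol, ts_utc, price, change_ratio, pre_return_vs_prev_close, pre_volume (the `volume` column in the example below).
   - Save them as `data/premarket_features.csv`.
   - Example:
```python
feature_cols = ['symbol', 'ts_utc', 'price', 'change_ratio', 'pre_return_vs_prev_close', 'volume']
pre_df[feature_cols].to_csv('data/premarket_features.csv', index=False)
```
4. **Notes on supplementary data sources**
   - If `prev_close` or `volume` is missing, fill it from the `eastmoney` or `yahoo` historical quote interfaces.
   - It is best to merge the historical closes with pandas first and then compute the features in bulk.
5. **Automation suggestions**
   - Wrap the steps above in `etl_premarket_features.py` and run it automatically before each session.
   - Include error handling and log output to support later LLM analysis.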
---
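### How to Obtain the Previous Close
1. **Data sources**
   - EastMoney (EastMoneyAPI): the `parse_stock_data` method in `futu.py` already parses the `prev_close` field (f18), which covers mainstream US stocks.
   - Futu: some pages expose the previous close, but less reliably; prefer EastMoney.
   - Yahoo Finance: as a supplement, fetch historical closes with yfinance or requests.
2. **Automatic backfill**
   - In the pre-market feature script, read `premarket_bars.csv` first; if it has no prev_close field, look the values up from the EastMoney API in bulk using the symbol column.
   - Example (pandas + requests):
```python
import pandas as pd
from futu import EastMoneyAPI

df = pd.read_csv('data/premarket_bars.csv')
api = EastMoneyAPI()

# Fetch the listing once and build a symbol -> prev_close map,
# rather than calling the API again for every row.
# NOTE: page_size=1 presumably returns a single record; raise it so
# the result covers every symbol in df.
stocks, _ = api.get_us_stocks(page_size=1)
prev_close_map = {}
for item in stocks:
    data = api.parse_stock_data(item)
    if data:
        prev_close_map[data['symbol']] = data['prev_close']

# Symbols missing from the map end up as NaN
df['prev_close'] = df['symbol'].map(prev_close_map)
```
   - For efficient bulk backfill, cache the symbol→prev_close mapping ahead of time.
3. **Additional notes**
   - If prev_close is already written when `premarket_bars.csv` is generated, no post-processing is needed.
   - To use Yahoo Finance, the yfinance library works:
```python
import yfinance as yf

def get_prev_close_yahoo(symbol):
    ticker = yf.Ticker(symbol)
    hist = ticker.history(period='2d')
    # Assumes the last row is the current session, so the
    # second-to-last row holds the previous close.
    if len(hist) >= 2:
        return hist['Close'].iloc[-2]
    return None
```
   - Backfilling automatically inside the ETL/feature script keeps later quantitative analysis consistent.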
---
Document author: AI quant engineer (adapted for the current codebase)
END