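Retrieval-Augmented Generation (RAG)
Core innovation
A three-stage pipeline of subgraph retrieval, template filling, and LLM inference reduces the Listing factual error rate from 21% to 4.8%.
Three-stage pipeline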
```text
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Subgraph   │───▶│  Template   │───▶│     LLM     │
│  Retrieval  │    │   Filling   │    │  Inference  │
│             │    │ (Cypher-like│    │ (GPT/Gemini │
│             │    │   Triples)  │    │  with Hard  │
│             │    │             │    │ Constraint) │
└─────────────┘    └─────────────┘    └─────────────┘
```

Stage 1: Task-aware subgraph retrieval
Different tasks retrieve different subgraphs
```text
Task                 Retrieved relation subset
─────────────────────────────────────────────────────────
Main image        →  MADE_OF + HAS_SPEC (material + main specs)
A+ infographic    →  HIGHLIGHTS + HAS_SPEC (selling points + specs)
Lifestyle image   →  SUITABLE_FOR + Audience + Material
4-panel grid      →  Top-4 SUITABLE_FOR
Bullet copy       →  HAS_SPEC + HIGHLIGHTS + COMPLIES_WITH
Banned-word check →  COMPLIES_WITH (compliance only)
Video script      →  SUITABLE_FOR + HIGHLIGHTS (scenes + actions)
```

Retrieval algorithm
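The retrieval function in this section joins two tables, `kg_entities` and `kg_relations`, whose schema the source does not show. The following is a minimal sketch of a schema that would satisfy that query, using Python's built-in `sqlite3` with an in-memory database. The column types, the index, the sample rows, and the `fetch_triples` helper are all assumptions made for illustration, not the author's actual storage layer.

```python
import sqlite3

# Hypothetical schema: only the columns referenced by the retrieval query.
SCHEMA = """
CREATE TABLE kg_entities (
    id         TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    attributes TEXT
);
CREATE TABLE kg_relations (
    source_id TEXT NOT NULL REFERENCES kg_entities(id),
    target_id TEXT NOT NULL REFERENCES kg_entities(id),
    rel_type  TEXT NOT NULL,
    weight    REAL NOT NULL,
    evidence  TEXT
);
CREATE INDEX idx_rel_source_type ON kg_relations (source_id, rel_type);
"""


def fetch_triples(conn, product_id, rel_types):
    """Run the retrieval query with one "?" placeholder per relation type."""
    placeholders = ",".join("?" * len(rel_types))
    sql = f"""
        SELECT s.name, r.rel_type, t.name, r.weight
        FROM kg_relations r
        JOIN kg_entities s ON s.id = r.source_id
        JOIN kg_entities t ON t.id = r.target_id
        WHERE r.source_id = ?
          AND r.rel_type IN ({placeholders})
        ORDER BY r.rel_type, r.weight DESC
    """
    return conn.execute(sql, (product_id, *rel_types)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO kg_entities (id, name) VALUES (?, ?)",
    [("p1", "Stainless Steel Kitchen Rack"),
     ("m1", "304 Stainless Steel"),
     ("s1", "Size 60x30x80cm"),
     ("s2", "Load 8kg per tier"),
     ("h1", "Foldable")],
)
conn.executemany(
    "INSERT INTO kg_relations VALUES (?, ?, ?, ?, ?)",
    [("p1", "m1", "MADE_OF", 1.00, None),
     ("p1", "s2", "HAS_SPEC", 0.90, None),
     ("p1", "s1", "HAS_SPEC", 1.00, None),
     ("p1", "h1", "HIGHLIGHTS", 1.15, None)],
)

# Main-image task: only MADE_OF and HAS_SPEC rows should come back.
rows = fetch_triples(conn, "p1", ["MADE_OF", "HAS_SPEC"])
for row in rows:
    print(row)
```

Only the placeholders are built with string formatting; the relation types themselves are passed as bound parameters, which keeps the `IN (...)` clause safe for a variable-length list.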
```python
from typing import List

# `db`, `Triple`, `top_k_per_relation` and `top_k` are assumed to be defined elsewhere.


def retrieve_subgraph(product_id: str, task: str) -> List[Triple]:
    """Task-aware subgraph retrieval."""
    # 1. Pick the relation types for this task
    REL_TYPES = {
        'main_image': ['MADE_OF', 'HAS_SPEC'],
        'aplus': ['HIGHLIGHTS', 'HAS_SPEC'],
        'lifestyle': ['SUITABLE_FOR', 'MADE_OF'],
        'bullet': ['HAS_SPEC', 'HIGHLIGHTS', 'COMPLIES_WITH'],
        'compliance': ['COMPLIES_WITH'],
        'video': ['SUITABLE_FOR', 'HIGHLIGHTS'],
    }
    if task not in REL_TYPES:
        raise ValueError(f"unknown task: {task!r}")
    rel_types = REL_TYPES[task]
    # 2. SQL query (one "?" placeholder per relation type)
    triples = db.query("""
        SELECT s.name AS subject,
               r.rel_type,
               t.name AS object,
               t.attributes,
               r.weight,
               r.evidence
        FROM kg_relations r
        JOIN kg_entities s ON s.id = r.source_id
        JOIN kg_entities t ON t.id = r.target_id
        WHERE r.source_id = ?
          AND r.rel_type IN ({})
        ORDER BY r.rel_type, r.weight DESC
    """.format(','.join('?' * len(rel_types))),
        (product_id, *rel_types))
    # 3. Task-specific Top-K truncation
    if task == 'bullet':
        triples = top_k_per_relation(triples, k=5)  # at most 5 per relation type
    elif task == 'lifestyle':
        triples = top_k(triples, k=4)  # 4 in total
    return triples
```

Retrieval example
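The Top-K truncation in `retrieve_subgraph` relies on two helpers, `top_k_per_relation` and `top_k`, that the source does not define. As a worked example on a few of the triples from this section, the following is a minimal sketch of what they might look like. The dict-based triple shape (`rel_type`, `object`, `weight`) is an assumption; the real `Triple` type may differ.

```python
from collections import defaultdict
from typing import Any, Dict, List

Triple = Dict[str, Any]  # hypothetical shape: {"rel_type", "object", "weight"}


def top_k(triples: List[Triple], k: int) -> List[Triple]:
    """Keep the k highest-weight triples overall."""
    return sorted(triples, key=lambda t: t["weight"], reverse=True)[:k]


def top_k_per_relation(triples: List[Triple], k: int) -> List[Triple]:
    """Keep the k highest-weight triples within each relation type."""
    groups: Dict[str, List[Triple]] = defaultdict(list)
    for t in triples:
        groups[t["rel_type"]].append(t)
    kept: List[Triple] = []
    for rel_type in sorted(groups):
        kept.extend(top_k(groups[rel_type], k))
    return kept


sample = [
    {"rel_type": "HAS_SPEC", "object": "Size 60x30x80cm", "weight": 1.00},
    {"rel_type": "HAS_SPEC", "object": "Load 8kg per tier", "weight": 0.90},
    {"rel_type": "HAS_SPEC", "object": "Folded 60x30x8cm", "weight": 0.85},
    {"rel_type": "HIGHLIGHTS", "object": "Foldable", "weight": 1.15},
    {"rel_type": "HIGHLIGHTS", "object": "Multi-tier", "weight": 1.05},
]

per_rel = top_k_per_relation(sample, k=2)
overall = top_k(sample, k=2)
print([t["object"] for t in per_rel])
print([t["object"] for t in overall])
```

The difference matters for the Bullet task: a global Top-K could drop an entire relation type (here, every `HAS_SPEC` row) when another type carries higher weights, whereas the per-relation variant keeps each type represented.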
```text
HAS_SPEC:
  Product → Size 60x30x80cm (weight=1.00)
  Product → Load 8kg per tier (weight=0.90)
  Product → Folded 60x30x8cm (weight=0.85)
  Product → 3 Tiers (weight=0.85)
  Product → Total Load 24kg (weight=0.80)
HIGHLIGHTS:
  Product → Foldable (weight=1.15)  ← boosted by user feedback
  Product → Multi-tier (weight=1.05)
  Product → 304-grade (weight=1.00)
  Product → Anti-rust (weight=0.92)
COMPLIES_WITH:
  Product → Lead-Free (weight=1.00)
  Product → Food Contact Safe (weight=1.00)
```

```text
SUITABLE_FOR (Top-4):
  Product → Small Kitchen Apartment (weight=0.95)
  Product → Bathroom Storage (weight=0.65)
  Product → Office Pantry (weight=0.60)
  Product → Outdoor Camping (weight=0.55)
MADE_OF:
  Product → 304 Stainless Steel (weight=1.00)
```

Stage 2: Template filling
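Template filling can be read as two small steps: serialize each retrieved triple into the `<REL, subject, object>` form, then substitute the serialized facts and the hard constraints into a prompt template. The sketch below illustrates that idea. The `Triple` tuple, `serialize`, `SYSTEM_TEMPLATE`, and `fill_system_prompt` are hypothetical names introduced here, and the template layout is only an approximation of the System Prompt shown in this stage.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    rel_type: str
    subject: str
    obj: str
    weight: Optional[float] = None  # rendered only when provided


def serialize(triple: Triple) -> str:
    """Render one triple as <REL, subject, object [weight=w]>."""
    suffix = "" if triple.weight is None else f" [weight={triple.weight:.2f}]"
    return f"<{triple.rel_type}, {triple.subject}, {triple.obj}{suffix}>"


# Hypothetical template; the section labels approximate the prompt in this stage.
SYSTEM_TEMPLATE = (
    "{role}\n"
    "[Knowledge graph facts (output must be entailed by these)]\n"
    "{facts}\n"
    "[Hard constraints]\n"
    "{constraints}\n"
    "{instruction}"
)


def fill_system_prompt(role: str, triples: List[Triple],
                       constraints: List[str], instruction: str) -> str:
    """Substitute serialized facts and numbered constraints into the template."""
    facts = "\n".join(serialize(t) for t in triples)
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(constraints, start=1))
    return SYSTEM_TEMPLATE.format(
        role=role, facts=facts, constraints=numbered, instruction=instruction
    )


prompt = fill_system_prompt(
    role="You are a senior Amazon operations specialist.",
    triples=[
        Triple("MADE_OF", "Stainless Steel Kitchen Rack", "304 Stainless Steel"),
        Triple("HIGHLIGHTS", "Stainless Steel Kitchen Rack", "Foldable", 1.15),
    ],
    constraints=[
        "Never generate descriptions that conflict with the triples",
        "Never use banned words",
    ],
    instruction="Generate Title + 5-Point Bullets + Description.",
)
print(prompt)
```

Because the step is plain string substitution, it involves no model call and its output is deterministic for a given subgraph.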
Serialization into Cypher-like triples
```text
<MADE_OF, Stainless Steel Kitchen Rack, 304 Stainless Steel>
<HAS_SPEC, Stainless Steel Kitchen Rack, Size 60x30x80cm>
<HAS_SPEC, Stainless Steel Kitchen Rack, Load 8kg per tier>
<HAS_SPEC, Stainless Steel Kitchen Rack, Folded Size 60x30x8cm>
<HIGHLIGHTS, Stainless Steel Kitchen Rack, Foldable [weight=1.15]>
<HIGHLIGHTS, Stainless Steel Kitchen Rack, Multi-tier [weight=1.05]>
<COMPLIES_WITH, Stainless Steel Kitchen Rack, Lead-Free>
<COMPLIES_WITH, Stainless Steel Kitchen Rack, Food Contact Safe>
```

Injection into the System Prompt
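```text
You are a senior Amazon operations specialist with 10 years of experience.
Follow the three algorithms A10 / COSMO / Rufus and write an English Listing.
[Knowledge graph facts (the output must be entailed by these)]
<MADE_OF, Stainless Steel Kitchen Rack, 304 Stainless Steel>
<HAS_SPEC, Stainless Steel Kitchen Rack, Size 60x30x80cm>
<HAS_SPEC, Stainless Steel Kitchen Rack, Load 8kg per tier>
<HAS_SPEC, Stainless Steel Kitchen Rack, Folded Size 60x30x8cm>
<HIGHLIGHTS, Stainless Steel Kitchen Rack, Foldable [weight=1.15]>
<HIGHLIGHTS, Stainless Steel Kitchen Rack, Multi-tier [weight=1.05]>
<HIGHLIGHTS, Stainless Steel Kitchen Rack, 304-grade [weight=1.00]>
<COMPLIES_WITH, Stainless Steel Kitchen Rack, Lead-Free>
<COMPLIES_WITH, Stainless Steel Kitchen Rack, Food Contact Safe>
[Hard constraints]
1. Never generate descriptions that conflict with the triples (e.g. 316 / auto-folding / 1 m length)
2. Never use banned words (FDA approved, Antibacterial, #1, Best, ...)
3. Bullet order: use scenario → key specs → selling points → after-sales commitment → brand extension
4. Start each Bullet with a [STANDOUT PHRASE] to attract clicks
5. Order selling points by weight, descending; the highest weight goes into Bullet 1
Generate Title + 5-Point Bullets + Description.
```

```text
You are an expert in Amazon product photography prompts. Based on the following knowledge graph, generate a prompt for a Lifestyle scene image.
[Scene graph]
<SUITABLE_FOR, Product, Small Kitchen Apartment [weight=0.95]>
<MADE_OF, Product, 304 Stainless Steel>
[Hard constraints]
1. The scene must be realistic and believable; avoid implausible layouts
2. Adjust the lighting to the scene (kitchen = morning natural light, outdoor = golden hour)
3. The material must be evident (304 stainless steel = brushed metallic sheen)
4. Place the product the way it is actually used (not floating, not tilted)
5. Do not add competitor logos or banned text
Generate an English prompt (for Gemini 3 Pro Image).
```

Stage 3: LLM inference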
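Invocation
```python
import openai

# Bullet generation
# `filled_system_prompt` is the Stage 2 output; `user_request` is the caller's request.
# With response_format={"type": "json_object"}, the OpenAI API expects the
# messages themselves to ask for JSON output.
response = openai.chat.completions.create(
    model="gpt-5.5",
    messages=[
        {"role": "system", "content": filled_system_prompt},
        {"role": "user", "content": user_request}
    ],
    temperature=0.7,
    max_tokens=2048,
    response_format={"type": "json_object"}
)
```

Output post-processing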
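The post-processing routine in this section detects banned words with plain substring containment, which would also flag a term such as "Best" inside an unrelated word like "bestows". One possible refinement, sketched below, matches whole terms only. The `find_banned` helper is hypothetical, and the `BANNED_WORDS` sample is limited to the four terms named in the hard constraints; a production list would be longer.

```python
import re
from typing import Iterable, List

# Sample taken from the hard constraints in the System Prompt; not a complete list.
BANNED_WORDS = ["FDA approved", "Antibacterial", "#1", "Best"]


def find_banned(text: str, banned: Iterable[str] = BANNED_WORDS) -> List[str]:
    """Return the banned terms that occur in text as whole terms, ignoring case."""
    hits = []
    for term in banned:
        # (?<!\w) and (?!\w) behave like \b, but also work for terms such as
        # "#1" that begin with a non-word character.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


bad_title = "Premium Rack with FDA Approved Antibacterial Coating - #1 Best Seller!"
print(find_banned(bad_title))
print(find_banned("Bestows a clean look on any of 10 shelves"))
```

Whole-term matching still misses spelling variants (for example a hyphenated "FDA-approved"), so a real list would probably need to enumerate those variants or normalize punctuation first.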
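```python
def post_process(llm_output, subgraph):
    """
    Post-processing:
    1. Verify that the output is entailed by the triples
    2. Check for banned-word hits
    3. Check the Bullet order
    """
    # The helper functions and BANNED_WORDS are assumed to be defined elsewhere.
    # 1. Fact check
    facts = extract_claims(llm_output)
    for claim in facts:
        if not entailed_by(claim, subgraph):
            llm_output = regenerate_with_warning(claim, subgraph)
    # 2. Banned-word check
    for word in BANNED_WORDS:
        if word.lower() in llm_output.lower():
            llm_output = replace_or_regenerate(word, llm_output)
    # 3. Bullet order
    bullets = parse_bullets(llm_output)
    if not is_golden_order(bullets):
        bullets = reorder_to_golden(bullets)
        # Write the reordered bullets back into the output
        # (`replace_bullets` is a hypothetical helper).
        llm_output = replace_bullets(llm_output, bullets)
    return llm_output
```

Empirical comparison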
We ran a comparison test on 100 real cross-border e-commerce products:
Scenario: Bullet generation
| Metric | Direct LLM | + GraphRAG | Relative improvement |
|---|---|---|---|
| Factual error rate | 21.0% | 4.8% | ↓ 77% |
| Banned-word hit rate | 11.0% | 0.6% | ↓ 95% |
| Key selling point omission rate | 18.5% | 3.2% | ↓ 83% |
| Bullet order compliance | 62% | 96% | ↑ 55% |
| Length compliance | 78% | 94% | ↑ 21% |
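The last column of the table is a relative change against the direct-LLM baseline, that is `(after - before) / before`, not a difference in percentage points. A short check of that arithmetic, using only the figures from the table (the `relative_change` function name is introduced here for illustration):

```python
def relative_change(before: float, after: float) -> float:
    """Signed relative change in percent, measured against the baseline."""
    return (after - before) / before * 100.0


# (direct LLM, + GraphRAG) pairs copied from the table, in percent.
rows = {
    "Factual error rate": (21.0, 4.8),
    "Banned-word hit rate": (11.0, 0.6),
    "Key selling point omission rate": (18.5, 3.2),
    "Bullet order compliance": (62.0, 96.0),
    "Length compliance": (78.0, 94.0),
}

for name, (before, after) in rows.items():
    print(f"{name}: {relative_change(before, after):+.0f}%")
```

For example, the factual error rate falls by 16.2 percentage points (21.0 to 4.8), which is a 77% reduction relative to the baseline.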
Case comparison
```text
Title: Premium Stainless Steel Kitchen Rack with FDA Approved
       Antibacterial Coating - #1 Best Seller!
Bullet 1: [HEALTH & SAFETY] Built with FDA-approved
          antibacterial coating that kills 99.9% of germs.
Bullet 2: [DURABLE] Made of 316 marine-grade stainless steel.
Bullet 3: [SPACE SAVING] Auto-folds in 3 seconds with motorized
          mechanism.
...
❌ Problems:
- "FDA approved" is a banned term
- "kills 99.9%" is a banned antibacterial claim
- "316" is not 304 (factual error)
- "motorized": the product has no motor (hallucination)
- "#1 Best Seller" is a platform-banned term
```

```text
Title: Foldable 3-Tier Stainless Steel Kitchen Rack -
       304 Stainless, 60x30x80cm, 8kg Load Per Tier
Bullet 1: [USE ANYWHERE] Perfect for small kitchens, apartments,
          bathrooms, and outdoor camping. Folds flat to just
          8cm thick for hassle-free storage.
Bullet 2: [304 STAINLESS STEEL] Genuine 304 SUS construction.
          Lead-free certified and food-contact safe - durable
          enough for daily kitchen use.
Bullet 3: [FOLDABLE 3-TIER DESIGN] 60×30×80cm fully expanded,
          collapses to 60×30×8cm in seconds. No tools required.
Bullet 4: [24KG TOTAL CAPACITY] Three sturdy tiers each support
          8kg - combined 24kg total. Anti-rust coating ensures
          long-lasting performance.
Bullet 5: [1-YEAR WARRANTY] Backed by 24/7 customer service and
          full replacement guarantee. Buy with confidence.
✅ Improvements:
- Strictly grounded in the GraphRAG triples (304 / 60×30×80cm / 8kg / foldable / lead-free)
- Automatically avoids all banned words (no FDA / antibacterial / #1)
- Golden Bullet order (scenario → specs → selling points → after-sales)
- Selling points ordered by weight (Foldable=1.15 goes into Bullet 1)
- Length compliant (each Bullet within 200 characters)
```

GraphRAG for multimodal generation
Prompt injection for image generation
```python
def graphrag_image_prompt(product_id, style):
    """Inject GraphRAG facts when building an image prompt."""
    # STYLE_TEMPLATES, render_triples_for_image, get_entity and
    # get_top_features are assumed to be defined elsewhere.
    task = 'lifestyle' if style == 'lifestyle' else 'main_image'
    triples = retrieve_subgraph(product_id, task=task)
    base_prompt = STYLE_TEMPLATES[style]
    facts = render_triples_for_image(triples)
    return f"""
{base_prompt}
Product context (must be visually consistent):
{facts}
Subject must visibly demonstrate:
- Material: {get_entity('Material', triples)}
- Key features: {get_top_features(triples, k=2)}
Negative: avoid showing competing brands, logos, watermarks,
or any items not consistent with above context.
"""
```

GraphRAG for video scripts
```python
def graphrag_video_script(product_id):
    # top_relation, filter_by_relation, features_to_actions and
    # generate_video_prompt are assumed to be defined elsewhere.
    triples = retrieve_subgraph(product_id, task='video')
    # Scene: the Top-1 SUITABLE_FOR relation
    scene = top_relation(triples, 'SUITABLE_FOR')
    # Actions: derived from the HIGHLIGHTS relations
    features = filter_by_relation(triples, 'HIGHLIGHTS')
    actions = features_to_actions(features)
    # e.g. Foldable → "fold and unfold demonstration"
    return generate_video_prompt(scene, actions)
```

Performance data
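| Stage | Latency | Notes |
|---|---|---|
| Subgraph retrieval | < 60 ms | SQLite indexed query |
| Template filling | < 10 ms | String concatenation |
| LLM inference (Bullet) | 8-10 s | GPT-5.5 |
| LLM inference (image) | 12-16 s | Gemini 3 Pro Image |
| Post-processing | < 100 ms | Banned-word check + order validation |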
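Summing the rows for the Bullet path shows that the GraphRAG stages add little to end-to-end latency compared with the model call. The sketch below takes the upper bound of each row as a worst case; the stage keys are labels chosen here, and the result is an estimate derived from the table, not a separate measurement.

```python
# Upper bounds from the table for the Bullet path, in seconds.
STAGE_UPPER_BOUND_S = {
    "subgraph_retrieval": 0.060,
    "template_filling": 0.010,
    "llm_inference_bullet": 10.0,
    "post_processing": 0.100,
}

total_s = sum(STAGE_UPPER_BOUND_S.values())
overhead_s = total_s - STAGE_UPPER_BOUND_S["llm_inference_bullet"]
llm_share = STAGE_UPPER_BOUND_S["llm_inference_bullet"] / total_s

print(f"worst-case total: {total_s:.2f} s")
print(f"non-LLM overhead: {overhead_s * 1000:.0f} ms")
print(f"LLM share of total: {llm_share:.1%}")
```

Under these bounds the retrieval, filling, and post-processing stages together stay below 170 ms, so any latency work would need to target the LLM call itself.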