LLM pretraining data-mixture sankey


Category: Charts & Infographics
Model: GPT Image 2
Source author: wuyoscar
Original language: en
Source ID: 086

Full prompt

Landscape 16:9 sankey diagram of a pretraining data mixture, three stages with translucent colored ribbons.

LEFT (8 source blocks, heights proportional to tokens): "Common Crawl (web) 540B" (muted navy, largest), "arXiv papers 180B" (dusty teal), "GitHub code 160B" (slate gray), "Wikipedia 40B" (soft terracotta), "StackExchange QA 30B" (warm copper), "Books (public domain) 25B" (pale olive), "Patents 18B" (pale navy), "Curated news & forums 15B" (dusty teal).

MIDDLE (3 processing blocks, stacked): "Deduplicated (MinHash + exact)", "Quality-filtered (classifier + heuristics)", "PII-scrubbed (regex + NER)".

RIGHT (3 final splits): "Pretraining set 1.4T tokens" (largest), "Instruction-tune pool 12B tokens", "RLHF preference pool 3B tokens".

Flow ribbons inherit source color with mid-labels showing token counts ("85B", "320B", "44B"). Legend strip at bottom.

Title: "LLM pretraining data mixture and downstream splits". Subtitle: "token counts after deduplication and quality filtering; ribbon thickness ∝ token flow."
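
The prompt asks for source block heights proportional to token counts and ribbon thickness proportional to token flow. As a rough sanity check before generating the image, the sketch below computes proportional block heights from the counts in the prompt and totals each side. The function name `block_heights` and the layout values `column_height` and `gap` are hypothetical, chosen only for illustration; they are not part of the prompt or of any image-model API. Note that the eight source counts sum to 1,008B while the three right-hand labels sum to 1,415B, so the numbers as written cannot satisfy strict flow conservation; the sketch only reports both totals and does not assume which figure is intended.

```python
# Token counts in billions, copied from the LEFT stage of the prompt.
SOURCES_B = {
    "Common Crawl (web)": 540,
    "arXiv papers": 180,
    "GitHub code": 160,
    "Wikipedia": 40,
    "StackExchange QA": 30,
    "Books (public domain)": 25,
    "Patents": 18,
    "Curated news & forums": 15,
}

# Final splits from the RIGHT stage, also in billions (1.4T = 1400B).
SPLITS_B = {
    "Pretraining set": 1400,
    "Instruction-tune pool": 12,
    "RLHF preference pool": 3,
}


def block_heights(tokens, column_height=1000.0, gap=10.0):
    """Return one block height per label, proportional to its token count.

    column_height and gap are hypothetical layout values (in pixels).
    """
    # Space left for blocks once the gaps between them are reserved.
    usable = column_height - gap * (len(tokens) - 1)
    total = sum(tokens.values())
    return {label: usable * count / total for label, count in tokens.items()}


source_total = sum(SOURCES_B.values())
split_total = sum(SPLITS_B.values())
heights = block_heights(SOURCES_B)

for label, height in heights.items():
    print(f"{label:<24} {SOURCES_B[label]:>4}B  {height:7.1f}px")
print(f"sources: {source_total}B, splits: {split_total}B")
```

If the two totals should match, adjusting either the source counts or the right-hand labels in the prompt before generation is likely easier than correcting the rendered figure afterwards.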
