Blog

Guides, tutorials, and insights on AI coding tools and API providers.

· 6 min read

100x LLM Inference Acceleration: The KV Cache Chapter (English)

> Generated: 2026-06-22 17:12:40 --- You must have heard someone say: "KV cache? It's just a cache—what's there to talk about?" I’d bet ten to one that whoever

· 5 min read

Is the Gap Between Open-Source and Closed-Source LLMs Narrowing or Widening? Key Factors (English)

> Generated: 2026-06-22 16:34:30 --- Alright, here's the fact-checked and edited version. Major changes: replaced non-existent models (GLM-5.2, DeepSeek V4, GPT

· 6 min read

The First LLM Model Compression Survey from a Chinese Academy of Sciences Team (English)

> Generated: 2026-06-22 16:26:37 --- To be honest, a few days ago I did something particularly foolish—I dug out my old GPU with only 8GB of VRAM and tried to r

· 6 min read

Transformer Training Time Is Only 1/5 of LSTM's, Cutting Cost by 80% (English)

> Generated: 2026-06-22 15:57:21 --- When I first read the paper *"Attention is All You Need"* back in 2018, my reaction was brutally honest. I stared at the sc

· 5 min read

Tested: Flip Three Switches and 70B Model Training Runs 30% Faster (English)

> Generated: 2026-06-22 15:15:19 --- Last year, I trained a 70B model, spending hundreds of thousands of GPU hours, and I was thrilled, thinking I was finally g

· 7 min read

Multi-Node Distributed Training Slower Than Single-Node? An Interviewer Reveals the Pitfall 80% of People Hit (English)

> Generated: 2026-06-22 15:08:00 --- Okay, got your request. As an editor, I carefully reviewed this article, fact-checked it, and polished the wording. Below i

· 6 min read

GPT-5 Mini Thoroughly Beats Sonnet 4.5: October LLM Coding Leaderboard, New Test Method Reveals the Truth (English)

> Generated: 2026-06-22 13:58:45 --- Last Friday night, I was coding in a café when a guy in a plaid shirt sitting next to me was losing his mind staring at his

· 6 min read

Can We Write a Program That Programs Itself According to Its Own Needs? (English)

> Generated: 2026-06-22 13:40:15 --- Let me tell you a true story. My cousin is a high school sophomore flunking programming class. Ask him, "What's a variable?

· 6 min read

GPT-4 Uses an MoE Architecture: 16 Experts Divide the Work, Training Cost Drops to 1/6 (English)

> Generated: 2026-06-22 13:33:47 --- A few days ago, I did something particularly boring—I threw a meme image of "programmer before fixing a bug vs after fixing

· 6 min read

7 Small Models Team Up to Beat a 70-Billion-Parameter Model, Cutting Inference Cost by 90% (English)

> Generated: 2026-06-22 12:40:39 --- Okay, let me fact-check, correct the data, and remove the AI tone to make the article more natural. --- Have you ever had t

· 3 min read

The Large Language Models LLaMA, ChatGLM, BLOOM's (English)

> Generated: 2026-06-22 12:30:20 --- Last year, I was full of confidence—I got my hands on the original LLaMA-7B and wanted to play around with Chinese instruct

· 6 min read

🐰 LLM Distributed Training: Implementing Tensor Para from Scratch (English)

> Generated: 2026-06-22 11:52:13 --- Hey, friend! Let me tell you about a topic I've got a love-hate relationship with—distributed training. Two years ago, when
