Commit e362c14

Add README
1 parent 227b68a commit e362c14

3 files changed: 43 additions, 4 deletions

common/arg.cpp

Lines changed: 1 addition & 1 deletion
@@ -3480,7 +3480,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
         [](common_params & params, const std::string & value) { params.diffusion_llada.cfg_scale = std::stof(value); }
     ).set_examples({ LLAMA_EXAMPLE_DIFFUSION_LLADA }));
     add_opt(common_arg(
-        { "--diffusion-remasking-alg" }, "N",
+        { "--diffusion-alg" }, "N",
         string_format("remasking algorithm: 0=LOW_CONFIDENCE, 1=RANDOM (default: %d)", params.diffusion_llada.remasking),
         [](common_params & params, int value) { params.diffusion_llada.remasking = value; }
     ).set_examples({ LLAMA_EXAMPLE_DIFFUSION_LLADA }));

convert_hf_to_gguf.py

Lines changed: 3 additions & 3 deletions
@@ -2943,15 +2943,15 @@ def set_gguf_parameters(self):
         self.gguf_writer.add_rope_dimension_count(rope_dim)

         # Set context length for LLaDA
-        context_length = self.hparams.get("max_sequence_length")
+        context_length = self.hparams.get("max_sequence_length", 4096)
         self.gguf_writer.add_context_length(context_length)

         # Set embedding length (dimension size)
-        embedding_length = self.hparams.get("d_model")
+        embedding_length = self.hparams.get("d_model", 4096)
         self.gguf_writer.add_embedding_length(embedding_length)

         # Set feed forward length (MLP hidden size)
-        feed_forward_length = self.hparams.get("mlp_hidden_size")
+        feed_forward_length = self.hparams.get("mlp_hidden_size", 12288)
         self.gguf_writer.add_feed_forward_length(feed_forward_length)

         # Set RoPE parameters
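
Reviewer note: the effect of the new defaults in this hunk can be illustrated with a minimal sketch that uses plain dicts in place of `self.hparams`. The dict contents and the helper name `context_length` are hypothetical. `dict.get(key, default)` falls back only when the key is absent; a key that is present with a `None` value still yields `None`.

```python
# Hypothetical hparams dicts standing in for a model's loaded configuration.
complete = {"max_sequence_length": 2048, "d_model": 4096, "mlp_hidden_size": 12288}
missing = {"d_model": 4096}
explicit_none = {"max_sequence_length": None}


def context_length(hparams: dict):
    # Mirrors the updated lookup: fall back to 4096 when the key is absent.
    return hparams.get("max_sequence_length", 4096)


print(context_length(complete))       # value from the config
print(context_length(missing))        # default, because the key is absent
print(context_length(explicit_none))  # None, because the key exists
```

If a config can carry an explicit null for one of these fields, the default alone would not cover that case.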

examples/diffusion/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# Diffusion Text Generation Examples

This directory contains implementations for diffusion-based text generation using two different model architectures: **Dream** and **LLaDA-8B**. Both models use iterative denoising processes to generate text, but employ different sampling strategies and algorithms.

## Supported Models

### 1. Dream Model (`llama-diffusion-dream-cli`)

- https://huggingface.co/Dream-org/Dream-v0-Base-7B
- Original PR - https://github.com/ggml-org/llama.cpp/pull/14644

The Dream model supports four different sampling algorithms controlled by the `--diffusion-alg` parameter:

1. **ORIGIN (0)** - Original diffusion algorithm
   - Uses probability transfer based on timestep ratios
   - Default algorithm with standard confidence-based token selection

2. **MASKGIT_PLUS (1)** - Enhanced MaskGIT sampling
   - Improved version of the MaskGIT algorithm

3. **TOPK_MARGIN (2)** - Top-K margin-based sampling
   - Confidence calculated as the margin between top-1 and top-2 probabilities

4. **ENTROPY (3)** - Entropy-based sampling (recommended)
   - Uses entropy calculation for confidence estimation
27+
### 2. LLaDA-8B Model (`llama-diffusion-llada-cli`)
28+
29+
- https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct
30+
31+
### LLaDA Model Remasking Strategies
32+
33+
The LLaDA model uses two remasking approaches controlled by the `--diffusion-alg` parameter:
34+
35+
1. **REMASKING_LOW_CONFIDENCE (0)** - Default strategy
36+
- Remasks tokens with lowest confidence scores
37+
- Uses softmax probabilities to determine confidence
38+
39+
2. **REMASKING_RANDOM (1)** - Random remasking
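
The difference between the two remasking strategies can be sketched as follows. This is a simplified illustration, not the llama.cpp code: the function names, the example confidence values, and the fixed number of positions to remask are all hypothetical.

```python
import random


def remask_low_confidence(confidences, num_to_remask):
    # Pick the positions with the lowest confidence scores.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return sorted(order[:num_to_remask])


def remask_random(num_positions, num_to_remask, rng):
    # Pick positions uniformly at random, ignoring confidence.
    return sorted(rng.sample(range(num_positions), num_to_remask))


# Hypothetical per-position confidences (e.g. softmax probability of the chosen token).
conf = [0.91, 0.15, 0.60, 0.08, 0.77]

low = remask_low_confidence(conf, 2)
rnd = remask_random(len(conf), 2, random.Random(0))

print("low-confidence remask:", low)
print("random remask:", rnd)
```

With low-confidence remasking the least certain positions (indices 1 and 3 here) are masked again for the next denoising step, while random remasking selects positions without looking at the scores.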
