This repository demonstrates loading, running, and fine-tuning Causal Language Models (LMs) using Hugging Face’s Transformers library and PyTorch. It covers both basic inference pipelines and low-level handling of tokens, logits, and generation loops.
- Load pre-trained Causal LMs from Hugging Face (e.g., `unsloth/Llama-3.2-1B`); see the loading sketch after this list
- Tokenize text and handle padding for batch inference
- Perform autoregressive text generation using both pipelines and manual logits-based loops
- Compute per-token Cross-Entropy Loss for training
- Fine-tune models using LoRA (Low-Rank Adaptation) for lightweight parameter-efficient training
- Build prompt-driven AI systems, e.g., paper classification or chat assistants
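A minimal loading sketch (the checkpoint name matches the one listed above; any Hugging Face Causal LM checkpoint can be substituted):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "unsloth/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Llama-style tokenizers ship without a pad token; reuse EOS so padded batch inference works
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model.eval()
```

The classification and fine-tuning snippets below assume `model` and `tokenizer` were created this way.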
# Prompt-driven paper classification: the system prompt constrains answers to these categories
VALID_CLASSES = ["Artificial Intelligence", "Computer Vision", "Systems", "Theory"]
system_prompt = "You are an AI system that classifies papers.\nValid categories:\n" + "\n".join(VALID_CLASSES)
user_prompt = "Title: Stitch: Training-Free Position Control in Multimodal Diffusion Transformers\nSummary: Text-to-Image (T2I) generation models have advanced rapidly.\nAnswer:"
prompt_text = system_prompt + "\n\n" + user_prompt
inputs = tokenizer(prompt_text, return_tensors="pt").to("cpu")
with torch.no_grad():
    outputs = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"], max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
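The same generation can also be done without `model.generate`, using the manual logits-based loop mentioned in the feature list. A minimal greedy-decoding sketch that reuses `model`, `tokenizer`, and `inputs` from the classification example (the new-token budget is illustrative):

```python
generated = inputs["input_ids"]
attention_mask = inputs["attention_mask"]

with torch.no_grad():
    for _ in range(50):  # illustrative cap on new tokens
        logits = model(input_ids=generated, attention_mask=attention_mask).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick at the last position
        generated = torch.cat([generated, next_token], dim=-1)
        attention_mask = torch.cat([attention_mask, torch.ones_like(next_token)], dim=-1)
        if next_token.item() == tokenizer.eos_token_id:
            break

print(tokenizer.decode(generated[0], skip_special_tokens=True))
```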
from peft import LoraConfig, get_peft_model, TaskType
lora_config = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16, lora_dropout=0.1)
model = get_peft_model(model, lora_config)
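# Only the injected low-rank adapter weights are trainable; the base model stays frozen
model.print_trainable_parameters()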
# Prepare batch: the classification prompt plus a target answer (the answer string is illustrative)
prompt = prompt_text
answer = " Computer Vision"
device = next(model.parameters()).device
batch_texts = [prompt + answer] * 8
inputs = tokenizer(batch_texts, return_tensors="pt", padding=True).to(device)
labels = inputs.input_ids.clone()
# Mask prompt tokens with -100 so the loss is computed only on the answer tokens
labels[:, :tokenizer(prompt, return_tensors="pt").input_ids.shape[1]] = -100
# Optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-6)
# Training loop
model.train()
for step in range(100):
    outputs = model(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask, labels=labels)
    loss = outputs.loss  # mean cross-entropy over the unmasked (answer) tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
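The per-token Cross-Entropy Loss listed in the features can also be computed by hand from the logits instead of relying on `outputs.loss`. A minimal sketch reusing `inputs` and `labels` from the batch above:

```python
import torch.nn.functional as F

with torch.no_grad():
    logits = model(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask).logits

# Shift so that position t predicts token t+1
shift_logits = logits[:, :-1, :]
shift_labels = labels[:, 1:]

# reduction="none" keeps one loss value per token; ignore_index skips the masked prompt positions
per_token_loss = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
    ignore_index=-100,
    reduction="none",
).view(shift_labels.shape)
```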
Requirements:

- python >= 3.10
- torch
- transformers
- peft