Please refer to `$EXECUTORCH_ROOT/examples/qualcomm/scripts/` and `$EXECUTORCH_ROOT/examples/qualcomm/oss_scripts/` for the list of supported models.
## How to Support a Custom Model in HTP Backend
### Step-by-Step Implementation Guide
For reference, see [the simple example](https://github.com/pytorch/executorch/blob/main/examples/qualcomm/scripts/export_example.py) and the [more elaborate examples](https://github.com/pytorch/executorch/tree/main/examples/qualcomm/scripts).
#### Step 1: Prepare Your Model
```python
import torch

# Initialize your custom model
model = YourModelClass().eval()  # Your custom PyTorch model

# Create example inputs (adjust shape as needed)
example_inputs = (torch.randn(1, 3, 224, 224),)  # Example input tensor
```
#### Step 2: [Optional] Quantize Your Model
Choose between two quantization approaches: post-training quantization (PTQ) or quantization-aware training (QAT):
```python
from executorch.backends.qualcomm.quantizer.quantizer import QnnQuantizer
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, prepare_qat_pt2e, convert_pt2e

quantizer = QnnQuantizer()
m = torch.export.export(model, example_inputs, strict=True).module()

# PTQ (Post-Training Quantization)
if quantization_type == "ptq":
    prepared_model = prepare_pt2e(m, quantizer)
    # Calibration loop would go here
    prepared_model(*example_inputs)

# QAT (Quantization-Aware Training)
elif quantization_type == "qat":
    prepared_model = prepare_qat_pt2e(m, quantizer)
    # Training loop would go here
    for _ in range(training_steps):
        prepared_model(*example_inputs)

# Convert to quantized model
quantized_model = convert_pt2e(prepared_model)
```
The `QnnQuantizer` is configurable, with the default setting being **8a8w**. For advanced users, refer to the [`QnnQuantizer`](https://github.com/pytorch/executorch/blob/main/backends/qualcomm/quantizer/quantizer.py) source for details; a configuration sketch follows the lists below.
##### Supported Quantization Schemes
- **8a8w** (default)
- **16a16w**
- **16a8w**
- **16a4w**
- **16a4w_block**
##### Customization Options
- **Per-node annotation**: Use `custom_quant_annotations`.
- **Per-module (`nn.Module`) annotation**: Use `submodule_qconfig_list`.
##### Additional Features
- **Node exclusion**: Discard specific nodes via `discard_nodes`.
- **Blockwise quantization**: Configure block sizes with `block_size_map`.
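
As a rough sketch of how these options fit together, the snippet below switches the default **8a8w** scheme to **16a8w** and registers per-node annotation callbacks. The `QuantDtype` enum and the `set_default_quant_config` / `add_custom_quant_annotations` helpers are assumed from `quantizer.py`; verify the exact names and signatures against the source. `custom_annotation_fns` is a hypothetical tuple of your own callbacks.

```python
from executorch.backends.qualcomm.quantizer.quantizer import (
    QnnQuantizer,
    QuantDtype,
)

quantizer = QnnQuantizer()

# Swap the default 8a8w scheme for 16a8w
quantizer.set_default_quant_config(QuantDtype.use_16a8w, is_qat=False)

# Optionally register per-node annotation callbacks
# (custom_annotation_fns is a hypothetical tuple of your own callbacks)
quantizer.add_custom_quant_annotations(custom_annotation_fns)
```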
For practical examples, see [`test_qnn_delegate.py`](https://github.com/pytorch/executorch/blob/main/backends/qualcomm/tests/test_qnn_delegate.py).
#### Step 3: Configure Compile Specs
In this step, specify the target SoC, data type, and other QNN compiler options.
```python
from executorch.backends.qualcomm.compiler import (
    generate_qnn_executorch_compiler_spec,
    generate_htp_compiler_spec,
)
from executorch.backends.qualcomm.utils.utils import QcomChipset

# HTP Compiler Configuration
# (`quantized` reflects whether Step 2 was applied)
backend_options = generate_htp_compiler_spec(
    use_fp16=not quantized,  # False for quantized models
)
```
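
The backend options above are then combined into the full compile spec and consumed by the QNN partitioner when lowering the exported program. A minimal sketch of that follow-on step, assuming the `QnnPartitioner` and `to_edge_transform_and_lower` APIs and an example `SM8650` chipset (pick the `QcomChipset` value matching your device):

```python
from executorch.backends.qualcomm.partition.qnn_partitioner import QnnPartitioner
from executorch.exir import to_edge_transform_and_lower

# Build the full compiler spec for the target SoC (SM8650 is an example)
compile_spec = generate_qnn_executorch_compiler_spec(
    soc_model=QcomChipset.SM8650,
    backend_options=backend_options,
)

# Export and lower the (optionally quantized) model to the QNN backend
exported = torch.export.export(quantized_model, example_inputs)
delegated = to_edge_transform_and_lower(
    exported,
    partitioner=[QnnPartitioner(compile_spec)],
)
executorch_program = delegated.to_executorch()

# Serialize for on-device execution with qnn_executorch_runner
with open("model.pte", "wb") as f:
    f.write(executorch_program.buffer)
```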