Add support for CogVLM model #15002
base: master
Conversation

Force-pushed 42113d1 to de22157.
I think I've fixed the typecheck and format check workflows that were failing before. Can someone approve the workflows to run again?

You can run
This is not a complete review as I don't know enough about mtmd, just commenting...
```python
def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.hparams['num_attention_heads'] = self.hparams['num_heads']

def set_gguf_parameters(self):
```
Suggested change (remove the `__init__` override, keeping only):

```python
def set_gguf_parameters(self):
```
Add `num_heads` to the list here instead:

convert_hf_to_gguf.py, line 1258 in de22157:

```python
self.gguf_writer.add_vision_head_count(self.find_vparam(["num_attention_heads"]))
```
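For illustration, the alias-list lookup the reviewer points at can be sketched as below. `find_param` and the sample hparams are hypothetical stand-ins; they mirror the spirit of `find_vparam` in convert_hf_to_gguf.py, not its exact code:

```python
# Hypothetical sketch: rather than renaming hparams in __init__, try each
# known key name in order and return the first one that is present.
def find_param(hparams: dict, keys: list):
    for key in keys:
        if key in hparams:
            return hparams[key]
    raise KeyError(f"none of {keys} found in hparams")

# Illustrative config values; CogVLM's vision config spells the head count
# "num_heads" rather than "num_attention_heads".
hparams = {"num_heads": 16}
n_head = find_param(hparams, ["num_attention_heads", "num_heads"])
print(n_head)  # 16
```

This keeps the model subclass free of hparam-renaming boilerplate: the alias list lives in one place instead of being patched per model.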
```python
if "query_key_value" in name:
    # Split tensor into three along first axis
    q, k, v = data_torch.split(data_torch.shape[0] // 3, dim=0)
    return [
        (self.map_tensor_name(name.replace("query_key_value", "query")), q),
        (self.map_tensor_name(name.replace("query_key_value", "key")), k),
        (self.map_tensor_name(name.replace("query_key_value", "value")), v),
    ]

return [(self.map_tensor_name(name), data_torch)]
```
Suggested change (remove the split, keeping only):

```python
return [(self.map_tensor_name(name), data_torch)]
```
Create Q/K/V views at build time instead (check other (non-mm) models for examples).
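To make the shape bookkeeping concrete, here is a torch-free sketch of what the convert-time split above does (the reviewer's point is that llama.cpp can instead take views of the fused tensor when building the graph). The list-of-rows representation is purely illustrative:

```python
# Illustrative, torch-free version of
#   q, k, v = data_torch.split(data_torch.shape[0] // 3, dim=0)
# A fused QKV weight of shape (3*d, n) is cut into three (d, n) blocks
# along the first axis: [all Q rows, all K rows, all V rows].
def split_qkv(fused_rows):
    d, rem = divmod(len(fused_rows), 3)
    assert rem == 0, "fused QKV first dim must be divisible by 3"
    return fused_rows[:d], fused_rows[d:2 * d], fused_rows[2 * d:]

fused = [[r] for r in range(6)]  # 3*d rows with d = 2
q, k, v = split_qkv(fused)
print(q, k, v)  # [[0], [1]] [[2], [3]] [[4], [5]]
```

Keeping the tensor fused in the GGUF and slicing views at build time avoids rewriting tensor names during conversion and keeps the on-disk layout closer to the original checkpoint.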
```python
def set_gguf_parameters(self):
    super().set_gguf_parameters()

def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
```
Suggested change (drop the redundant `set_gguf_parameters` override, keeping only):

```python
def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
```
```python
if "query_key_value.weight" in name:
    # Slice tensor into three along first axis
    q, k, v = data_torch.split(data_torch.shape[0] // 3, dim=0)
    return [
        (self.map_tensor_name(name.replace("query_key_value", "query")), q),
        (self.map_tensor_name(name.replace("query_key_value", "key")), k),
        (self.map_tensor_name(name.replace("query_key_value", "value")), v),
    ]

return [(self.map_tensor_name(name), data_torch)]
```
Suggested change (remove the split, keeping only):

```python
return [(self.map_tensor_name(name), data_torch)]
```
```cpp
Qcur = ggml_rope(ctx0, Qcur, inp_pos, n_embd_head, GGML_ROPE_TYPE_NEOX);
Kcur = ggml_rope(ctx0, Kcur, inp_pos, n_embd_head, GGML_ROPE_TYPE_NEOX);
```
Update `llama_model_rope_type` instead and use `rope_type`.
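As a hedged sketch of what that suggestion means: `llama_model_rope_type` in llama.cpp switches on the architecture to pick a RoPE style, so graph-building code can pass a generic `rope_type` rather than hard-coding `GGML_ROPE_TYPE_NEOX` per call site. The Python below only mirrors the dispatch idea; the enum values and the table entries are illustrative, not the real C++ code:

```python
# Illustrative stand-ins for the GGML_ROPE_TYPE_* constants in ggml.
ROPE_TYPE_NORM = 0
ROPE_TYPE_NEOX = 2

# Hypothetical per-architecture table, standing in for the switch statement
# inside llama_model_rope_type; only the dispatch pattern is the point.
ARCH_ROPE_TYPE = {
    "llama":  ROPE_TYPE_NORM,
    "cogvlm": ROPE_TYPE_NEOX,  # per the review, CogVLM uses NEOX-style RoPE
}

def rope_type(arch: str) -> int:
    # Fall back to the normal style for architectures not in the table.
    return ARCH_ROPE_TYPE.get(arch, ROPE_TYPE_NORM)

print(rope_type("cogvlm"))  # 2
```

Centralizing the choice this way means a new architecture only needs one table entry, and every `ggml_rope` call picks up the right style automatically.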
Thanks for the info! That's something I've been wondering about for a while.
This addresses the requests for CogVLM in #4387 and #4350.
CogVLM is a pretty popular model that now fits in cleanly after the recent additions to libmtmd.
I've converted a GGUF here: Link to GGUF files
Sample command and output: