-
Notifications
You must be signed in to change notification settings - Fork 89
change fixed-pedigree example fixes #2 #2447
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
jeromekelleher
merged 2 commits into
tskit-dev:main
from
hossam26644:fix/ancestry-pedigree-example-2440
Dec 16, 2025
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -94,7 +94,7 @@ this. | |
| : Coalescent with recombination ("hudson") | ||
|
|
||
| {class}`.SmcKApproxCoalescent` | ||
| : General Sequentially Markov Coalescent | ||
| : General Sequentially Markov Coalescent | ||
|
|
||
| {class}`.DiscreteTimeWrightFisher` | ||
| : Generation-by-generation Wright-Fisher | ||
|
|
@@ -2230,7 +2230,7 @@ in units of 4N generations. | |
|
|
||
| ### SMC approximations | ||
|
|
||
| The **SMC** and **SMC′** | ||
| The **SMC** and **SMC′** | ||
| are approximations of the continuous time | ||
| {ref}`Hudson coalescent<sec_ancestry_models_hudson>` model. These were originally | ||
| motivated largely by the need to simulate coalescent processes more efficiently | ||
|
|
@@ -2240,19 +2240,19 @@ mean that such approximations are now unnecessary for many simulations. | |
|
|
||
| The **SMC** and **SMC'** are, however, very important for inference, as the approximations | ||
| have made many analytical advances possible. Moreover, using these approximations, | ||
| we are able to simulate regimes which we couldn't simulate otherwise: for example, | ||
| we are able to simulate regimes which we couldn't simulate otherwise: for example, | ||
| **Drosophila** and **Drosophila-like** simulations with very high scaled recombination rates. | ||
|
|
||
|
|
||
| The {class}`SMC(k) <.SmcKApproxCoalescent>` model is a general simulations model that can simulate various **SMC** approximations | ||
| (e.g., **SMC** and **SMC′**). It accepts a ```hull_offset``` parameter, which defines the extent of | ||
| **SMC** approximations in the simulation. The ```hull_offset``` represents the maximum allowed | ||
| distance between two genomic segments that can share a common ancestor. Setting the | ||
| ```hull_offset``` to **0** means only overlapping genomic segments can share a common ancestor, | ||
| corresponding to the backward-in-time definition of the **SMC** model. Similarly, setting | ||
| the ```hull_offset``` to **1** allows adjacent genomic segments, as well as overlapping ones, to | ||
| share a common ancestor, which defines the **SMC′** model. Simulating under the Hudson | ||
| coalescent model is equivalent to setting the ```hull_offset``` to the sequence length. The | ||
| The {class}`SMC(k) <.SmcKApproxCoalescent>` model is a general simulations model that can simulate various **SMC** approximations | ||
| (e.g., **SMC** and **SMC′**). It accepts a ```hull_offset``` parameter, which defines the extent of | ||
| **SMC** approximations in the simulation. The ```hull_offset``` represents the maximum allowed | ||
| distance between two genomic segments that can share a common ancestor. Setting the | ||
| ```hull_offset``` to **0** means only overlapping genomic segments can share a common ancestor, | ||
| corresponding to the backward-in-time definition of the **SMC** model. Similarly, setting | ||
| the ```hull_offset``` to **1** allows adjacent genomic segments, as well as overlapping ones, to | ||
| share a common ancestor, which defines the **SMC′** model. Simulating under the Hudson | ||
| coalescent model is equivalent to setting the ```hull_offset``` to the sequence length. The | ||
| hull_offset can take any value between **0** and the sequence length. | ||
|
|
||
| In this example, we use the {class}`SMC(k) <.SmcKApproxCoalescent>` model to run **SMC'** | ||
|
|
@@ -2265,7 +2265,7 @@ SVG(ts.draw_svg(y_axis=True, time_scale="log_time")) | |
| ``` | ||
| :::{Note} | ||
| Since the **SMC** models are approximations of the {ref}`Hudson coalescent<sec_ancestry_models_hudson>`, | ||
| and since the {ref}`Hudson coalescent<sec_ancestry_models_hudson>` model is well optimised for | ||
| and since the {ref}`Hudson coalescent<sec_ancestry_models_hudson>` model is well optimised for | ||
| regimes with moderate scaled recombination rates (including full human chromosome simulations), | ||
| we recommend using the {ref}`Hudson coalescent<sec_ancestry_models_hudson>` whenever possible. | ||
| ::: | ||
|
|
@@ -2842,34 +2842,51 @@ combined with other {ref}`ancestry models<sec_ancestry_models>`. | |
|
|
||
| #### Tracing ancestry through a pedigree | ||
|
|
||
| The previous example generated trees that are embedded within the pedigree, | ||
| but doesn't tell us directly the exact path that each sample's genetic | ||
| material took through the ancestors. To trace these paths exactly, we can | ||
| use a combination of the ``additional_nodes``, | ||
| The previous simulation generated trees that are embedded within the pedigree, | ||
| but it doesn't tell us directly the path that each sample's genetic | ||
| material took through its ancestors. | ||
|
|
||
| To trace these paths exactly, we can | ||
| use a combination of ``additional_nodes``, | ||
| ``coalescing_segments_only`` | ||
| (see {ref}`sec_ancestry_additional_nodes`), | ||
| and ``stop_at_local_mrca`` (see {ref}`sec_ancestry_stop_at_local_mrca`). | ||
| and stop_at_local_mrca (see {ref}`sec_ancestry_stop_at_local_mrca`). | ||
| For this example, we create another contrived pedigree with a single | ||
| proband, and 3 generations of diploid ancestors: | ||
|
|
||
| ```{code-cell} | ||
| ped_txt = """\ | ||
| # id parent0 parent1 time is_sample | ||
| 0 1 2 0.0 1 | ||
| 1 3 4 1.0 0 | ||
| 2 3 4 1.0 0 | ||
| 3 5 6 2.0 0 | ||
| 4 5 6 2.0 0 | ||
| 5 . . 3.0 0 | ||
| 6 . . 3.0 0 | ||
| """ | ||
| pedigree = msprime.parse_pedigree(io.StringIO(ped_txt), sequence_length=100) | ||
| draw_pedigree(pedigree.tree_sequence()) | ||
| ``` | ||
| We can then run a simuation through this pedigree. For simplicity here, | ||
| we don't have any recombination: | ||
| ```{code-cell} | ||
| ped_ts = msprime.sim_ancestry( | ||
| initial_state=pedigree, | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest you move the simulations block to the next code block
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. good call - done. |
||
| model="fixed_pedigree", | ||
| random_seed=41, | ||
| recombination_rate=0.001, | ||
| random_seed=998, | ||
| coalescing_segments_only=False, | ||
| additional_nodes=( | ||
| msprime.NodeType.COMMON_ANCESTOR | | ||
| msprime.NodeType.PASS_THROUGH), | ||
| stop_at_local_mrca=True | ||
| stop_at_local_mrca=False | ||
| ) | ||
| node_labels = {node.id: f"{node.individual}({node.id})" for node in ped_ts.nodes()} | ||
| SVG(ped_ts.draw_svg(y_axis=True, node_labels=node_labels, size=(600,200))) | ||
| ``` | ||
|
|
||
| We can now see that the ancestor of node 15 is 20, which we would have to | ||
| infer in the previous example. | ||
|
|
||
|
|
||
| Now we can see that the two genomes of individual 0 (nodes 0 and 1) | ||
| reach their common ancestor node 9 of individual 4. Because we | ||
| have set ``stop_at_local_mrca=False``, we can now also see how | ||
| this ancestral material goes further back in the pedigree. | ||
|
|
||
| #### Censoring pedigree information | ||
|
|
||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are removing trailing white spaces from different lines. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I explained in another PR, this is because my editor is configured to automatically trim whitespace. The issue here is that your editor isn't set up to do this, and we're not enforcing lint checking on markdown files.