examples/case_studies/bayesian_sem_workflow.ipynb (10 additions, 6 deletions)
@@ -5209,7 +5209,7 @@
     "id": "5ff64f19",
     "metadata": {},
     "source": [
-    "Note how we have not passed through any data into this model. This is deliberate. We want now to simulate data from the model with forward pass through the system. We have initialised two versions of the model: (1) with wide parameters and (2) with tight parameters on the datagenerating condition. We are going to sample from the tight parameters model to draw out indicator data that conforms with the parameter setting we do now. "
+    "Notice that we haven’t yet provided any observed data to the model — this is intentional. Our next step is to simulate data by performing a forward pass through the system. We initialize two versions of the model: (1) with tight priors and (2) with wide priors on the data-generating parameters. We’ll sample from the tightly parameterized model to generate indicator data consistent with our chosen parameter settings."
     ]
    },
    {
@@ -5896,11 +5896,15 @@
     "id": "41e491be",
     "metadata": {},
     "source": [
-    "The posterior estimates can “recover” the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recover to fail, we would equally have learned something about our model. Parameter recovery exercises helps discover issues of mis-specification or unidentified parameters. Put another way, they tell us how informative our data is with respect to our data generating model. Verlyn Klinkenborg starts his justly famous book _Several short sentences about writing_ with the following advice: \n",
+    "The posterior estimates can “recover” the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recovery to fail, we would equally have learned something about our model. Parameter recovery exercises help discover issues of mis-specification or unidentified parameters. They tell us how informative our data is with respect to our data generating process, clarifying the degree to which the data constrains (or fails to constrain) the model’s parameters.\n",
+    "\n",
+    "Verlyn Klinkenborg begins his justly famous book _Several Short Sentences About Writing_ with a reminder that applies equally to modelling: \n",
     "\n",
     "> \"Here, in short, is what I want to tell you. Know what each sentence says, What it doesn't say, And what it implies. Of these, the hardest is to know what each sentence actually says\" - V. Klinkenborg\n",
     "\n",
-    "This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand is implications and limitations. The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which flavours of reality resonate in the telling. What shape the posterior? How plausible the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for their effects. "
+    "This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand its implications and limitations. Simulation studies and parameter recovery exercises speak our models aloud; their failures, like their successes, are transparent, and each iteration strengthens the quality of the work.\n",
+    "\n",
+    "The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which aspects resonate in the telling. What shape the posterior? How plausible the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for their effects. "
     ]
    },
    {
@@ -5952,7 +5956,7 @@
     "id": "59c4d17a",
     "metadata": {},
     "source": [
-    "In an applied setting it's these kinds of implications that are crucially important to surface and understand. From a workflow point of view we want to ensure that our modelling drives clarity on these precise points and avoids adding noise generally. If we're assessing a particular hypothesis or aiming to estimate a concrete quantity, the model specification should be robust enough to support those inferences. This is where parameter recovery exercises can lend assurances and bolster confidence in the findings of empirical work. Here we've shown that our model specification will support inferences about about a class of particular causal contrasts i.e. how treatment changes the direct effects of one latent construct on another.\n",
+    "In applied work, these are precisely the implications we want to surface and understand. From a workflow perspective, our models should clarify these relationships rather than add noise. If we're assessing a particular hypothesis or aiming to estimate a concrete quantity, the model specification should be robust enough to support those inferences. This is where parameter recovery exercises can lend assurances and bolster confidence in the findings of empirical work. Here we've shown that our model specification will support inferences about a class of particular causal contrasts, i.e. how treatment changes the direct effects of one latent construct on another.\n",
     "\n",
     "Another way we might interrogate the implications of a model is to see how well it can predict \"downstream\" outcomes of the implied model. How does job-satisfaction relate to attrition risk and approaches to work?"
     ]
@@ -7038,9 +7042,9 @@
     "source": [
     "## Conclusion: Workflow and Craft in Statistical Modelling\n",
     "\n",
-    "We have now seen how to articulate Structural Equation models and their variants in PyMC. The SEM workflow is, at heart, Bayesian in temperament. Hypothesise and construct. Construct then Estimate. Estimate and check. Check then refine. Refine then expand... Both disciplines reject the checklist mentality of “fit once, report, move on.” Instead, they cultivate a focused, deliberate practice. Each discipline forces an apprenticeship where skill is developed. Skill to handle how assumptions shape understanding and how the world resists impositions of false structure. Skill to find the right structures. Each iteration is a dialogue between theory and evidence. At each juncture we ask whether this model speaks true? Whether this structure reflects the facts to hand. \n",
+    "We have now seen how to articulate Structural Equation models and their variants in PyMC. The SEM workflow is, at heart, Bayesian in temperament. Hypothesise and construct. Construct then Estimate. Estimate and check. Check then refine. Refine then expand... Both disciplines reject the checklist mentality of “fit once, report, move on.” Instead, they cultivate a focused, deliberate practice. Each demands an apprenticeship in which skill is honed: skill to see how assumptions shape understanding, and how the world resists the imposition of false structures. Skill to find the right structures. Each iteration is a dialogue between theory and evidence. At each juncture we ask: does this model speak true? Does this structure reflect the facts to hand? \n",
     "\n",
-    "In the end, the value of craft in statistical modeling lies not in improving benchmark metrics, but in the depth of understanding we cultivate through careful communication and justification. The Bayesian workflow reminds us that modeling is not the automation of insight but its deliberate construction. Our workflow is a process of listening, revising, and re-articulating until the model speaks clearly. Like any craft, its worth is measured not by throughput but by fidelity: how honestly our structure reflects the world it seeks to describe. Each diagnostic, each posterior check, each refinement of a latent path is a form of attention — a small act of resistance against the flattening logic of target metrics and checklists. These are the constructive thought processes that drive job-satisfaction. __To practice modeling as craft is to reclaim pride in knowing what our models say, what they do not say, and what they imply.__ To find, in that discipline and skilled attention, the satisfaction of meaningful work and useful science.\n"
+    "In the end, the value of craft in statistical modeling lies not in improving benchmark metrics, but in the depth of understanding we cultivate through careful communication and justification. The Bayesian workflow reminds us that modeling is not the automation of insight, but its deliberate construction. Our workflow is a process of listening, revising, and re-articulating until the model speaks clearly. Like any craft, its worth is measured not by throughput but by fidelity: how honestly our structure reflects the world it seeks to describe. Each diagnostic, each posterior check, each refinement of a latent path is a form of attention — a small act of resistance against the flattening logic of target metrics and checklists. These constructive habits and reflective practices are the source of fulfillment in the work. __To practice modeling as craft is to reclaim pride in knowing what our models say, what they do not say, and what they imply__, and to find, in that discipline and skilled attention, the satisfaction of meaningful work and useful science.\n"
-Note how we have not passed through any data into this model. This is deliberate. We want now to simulate data from the model with forward pass through the system. We have initialised two versions of the model: (1) with wide parameters and (2) with tight parameters on the datagenerating condition. We are going to sample from the tight parameters model to draw out indicator data that conforms with the parameter setting we do now.
+Notice that we haven’t yet provided any observed data to the model — this is intentional. Our next step is to simulate data by performing a forward pass through the system. We initialize two versions of the model: (1) with tight priors and (2) with wide priors on the data-generating parameters. We’ll sample from the tightly parameterized model to generate indicator data consistent with our chosen parameter settings.
 
 ```{code-cell} ipython3
 # Generating data from model by fixing parameters
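To make the forward pass concrete, here is a minimal, self-contained sketch of the idea described above: place tight priors on the data-generating parameters and draw indicator data from the prior predictive distribution. The one-factor toy model and all names below are illustrative assumptions, not the notebook's actual SEM.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(42)

# Toy one-factor measurement model: three indicators loading on one latent
# construct. No observed data is attached, so sampling the prior predictive
# is a pure forward pass through the system.
with pm.Model() as tight_model:
    latent = pm.Normal("latent", mu=0.0, sigma=1.0, shape=(100,))  # 100 "respondents"
    loadings = pm.Normal("loadings", mu=0.8, sigma=0.05, shape=(3,))  # tight priors
    noise = pm.HalfNormal("noise", sigma=0.1)
    pm.Normal(
        "indicators",
        mu=latent[:, None] * loadings[None, :],
        sigma=noise,
        shape=(100, 3),
    )
    prior_idata = pm.sample_prior_predictive(draws=1, random_seed=rng)

# Indicator data consistent with the tight parameter settings
simulated = prior_idata.prior["indicators"].sel(chain=0, draw=0).to_numpy()
print(simulated.shape)  # (100, 3)
```

Sampling from the tight model and then fitting a wide-prior version to the simulated draws is exactly the recovery pattern the next section discusses.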
@@ -1255,11 +1255,15 @@ az.plot_posterior(
 );
 ```
 
-The posterior estimates can “recover” the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recover to fail, we would equally have learned something about our model. Parameter recovery exercises helps discover issues of mis-specification or unidentified parameters. Put another way, they tell us how informative our data is with respect to our data generating model. Verlyn Klinkenborg starts his justly famous book _Several short sentences about writing_ with the following advice:
+The posterior estimates can “recover” the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recovery to fail, we would equally have learned something about our model. Parameter recovery exercises help discover issues of mis-specification or unidentified parameters. They tell us how informative our data is with respect to our data generating process, clarifying the degree to which the data constrains (or fails to constrain) the model’s parameters.
+
+Verlyn Klinkenborg begins his justly famous book _Several Short Sentences About Writing_ with a reminder that applies equally to modelling:
 
 > "Here, in short, is what I want to tell you. Know what each sentence says, What it doesn't say, And what it implies. Of these, the hardest is to know what each sentence actually says" - V. Klinkenborg
 
-This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand is implications and limitations. The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which flavours of reality resonate in the telling. What shape the posterior? How plausible the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for their effects.
+This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand its implications and limitations. Simulation studies and parameter recovery exercises speak our models aloud; their failures, like their successes, are transparent, and each iteration strengthens the quality of the work.
+
+The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which aspects resonate in the telling. What shape the posterior? How plausible the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for their effects.
 
 +++
 
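Continuing the toy sketch from above (hypothetical model and names, not the notebook's code), the recovery check itself can be made explicit: refit a wide-prior version of the model on the simulated data and overlay the known data-generating values on the marginal posteriors via ArviZ's `ref_val` argument.

```python
import arviz as az
import pymc as pm

# True data-generating values drawn in the forward pass above
true_loadings = prior_idata.prior["loadings"].sel(chain=0, draw=0).to_numpy()

# Wide-prior version of the same toy model, conditioned on the simulated data.
# HalfNormal loadings impose the usual positivity constraint, so the factor's
# sign is identified and recovery is meaningful.
with pm.Model() as wide_model:
    latent = pm.Normal("latent", mu=0.0, sigma=1.0, shape=(100,))
    loadings = pm.HalfNormal("loadings", sigma=2.0, shape=(3,))  # wide priors
    noise = pm.HalfNormal("noise", sigma=1.0)
    pm.Normal(
        "indicators",
        mu=latent[:, None] * loadings[None, :],
        sigma=noise,
        observed=simulated,  # data generated by the tight-prior forward pass
    )
    idata = pm.sample(random_seed=1)

# Good recovery: each reference line falls inside the bulk of its posterior.
az.plot_posterior(idata, var_names=["loadings"], ref_val=[float(v) for v in true_loadings])
```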
@@ -1279,7 +1283,7 @@ plt.suptitle(
 );
 ```
 
-In an applied setting it's these kinds of implications that are crucially important to surface and understand. From a workflow point of view we want to ensure that our modelling drives clarity on these precise points and avoids adding noise generally. If we're assessing a particular hypothesis or aiming to estimate a concrete quantity, the model specification should be robust enough to support those inferences. This is where parameter recovery exercises can lend assurances and bolster confidence in the findings of empirical work. Here we've shown that our model specification will support inferences about about a class of particular causal contrasts i.e. how treatment changes the direct effects of one latent construct on another.
+In applied work, these are precisely the implications we want to surface and understand. From a workflow perspective, our models should clarify these relationships rather than add noise. If we're assessing a particular hypothesis or aiming to estimate a concrete quantity, the model specification should be robust enough to support those inferences. This is where parameter recovery exercises can lend assurances and bolster confidence in the findings of empirical work. Here we've shown that our model specification will support inferences about a class of particular causal contrasts, i.e. how treatment changes the direct effects of one latent construct on another.
 
 Another way we might interrogate the implications of a model is to see how well it can predict "downstream" outcomes of the implied model. How does job-satisfaction relate to attrition risk and approaches to work?
 
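As one illustration of such a contrast, "how treatment changes a direct effect" is just a difference of posterior draws. The sketch below assumes a hypothetical fitted posterior containing a structural coefficient `beta_path` indexed by a `group` dimension; none of these names come from the notebook.

```python
import arviz as az

# Hypothetical: `idata` holds a fitted multi-group SEM whose posterior
# contains a structural coefficient "beta_path" with a "group" dimension.
post = idata.posterior
contrast = post["beta_path"].sel(group="treated") - post["beta_path"].sel(group="control")

# Posterior probability that treatment increases the direct effect
print(f"P(contrast > 0) = {float((contrast > 0).mean()):.2f}")
az.plot_posterior(contrast.rename("treatment_contrast"), ref_val=0)
```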
@@ -1552,9 +1556,9 @@ This two-step of information compression and prediction serves to concisely quan
 
 ## Conclusion: Workflow and Craft in Statistical Modelling
 
-We have now seen how to articulate Structural Equation models and their variants in PyMC. The SEM workflow is, at heart, Bayesian in temperament. Hypothesise and construct. Construct then Estimate. Estimate and check. Check then refine. Refine then expand... Both disciplines reject the checklist mentality of “fit once, report, move on.” Instead, they cultivate a focused, deliberate practice. Each discipline forces an apprenticeship where skill is developed. Skill to handle how assumptions shape understanding and how the world resists impositions of false structure. Skill to find the right structures. Each iteration is a dialogue between theory and evidence. At each juncture we ask whether this model speaks true? Whether this structure reflects the facts to hand.
+We have now seen how to articulate Structural Equation models and their variants in PyMC. The SEM workflow is, at heart, Bayesian in temperament. Hypothesise and construct. Construct then Estimate. Estimate and check. Check then refine. Refine then expand... Both disciplines reject the checklist mentality of “fit once, report, move on.” Instead, they cultivate a focused, deliberate practice. Each demands an apprenticeship in which skill is honed: skill to see how assumptions shape understanding, and how the world resists the imposition of false structures. Skill to find the right structures. Each iteration is a dialogue between theory and evidence. At each juncture we ask: does this model speak true? Does this structure reflect the facts to hand?
 
-In the end, the value of craft in statistical modeling lies not in improving benchmark metrics, but in the depth of understanding we cultivate through careful communication and justification. The Bayesian workflow reminds us that modeling is not the automation of insight but its deliberate construction. Our workflow is a process of listening, revising, and re-articulating until the model speaks clearly. Like any craft, its worth is measured not by throughput but by fidelity: how honestly our structure reflects the world it seeks to describe. Each diagnostic, each posterior check, each refinement of a latent path is a form of attention — a small act of resistance against the flattening logic of target metrics and checklists. These are the constructive thought processes that drive job-satisfaction. __To practice modeling as craft is to reclaim pride in knowing what our models say, what they do not say, and what they imply.__ To find, in that discipline and skilled attention, the satisfaction of meaningful work and useful science.
+In the end, the value of craft in statistical modeling lies not in improving benchmark metrics, but in the depth of understanding we cultivate through careful communication and justification. The Bayesian workflow reminds us that modeling is not the automation of insight, but its deliberate construction. Our workflow is a process of listening, revising, and re-articulating until the model speaks clearly. Like any craft, its worth is measured not by throughput but by fidelity: how honestly our structure reflects the world it seeks to describe. Each diagnostic, each posterior check, each refinement of a latent path is a form of attention — a small act of resistance against the flattening logic of target metrics and checklists. These constructive habits and reflective practices are the source of fulfillment in the work. __To practice modeling as craft is to reclaim pride in knowing what our models say, what they do not say, and what they imply__, and to find, in that discipline and skilled attention, the satisfaction of meaningful work and useful science.