child_birth autorefersh configarations #1668

Rohit231998 · 2025-10-14T12:41:09Z

No description provided.

gemini-code-assist · 2025-10-14T12:41:30Z

Summary of Changes

Hello @Rohit231998, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new automated data import pipeline for US birth and mortality data sourced from the CDC. It establishes the necessary configuration files, including a manifest for scheduled weekly refreshes, metadata for data processing, and mappings for geographical entities and statistical variables. The primary goal is to integrate this new dataset into the system, enabling regular updates and structured representation of child birth and related mortality statistics.

Highlights

New Data Import Pipeline: A new data import pipeline has been established for US birth and mortality data, sourced from the CDC.
Automated Weekly Refresh: The import is configured to automatically refresh on a weekly schedule, ensuring the data remains up-to-date.
Statistical Variable Definitions: New statistical variables have been defined to capture annual live births, general mortality events, and infant mortality events.
Comprehensive Place Resolution: Detailed mappings for all US states and Puerto Rico to their respective Data Commons IDs are included to accurately resolve geographical entities.

gemini-code-assist

Code Review

This pull request adds a new data import for child birth statistics from the CDC. It includes the necessary configuration files, mappings, and test data.

There are a few issues that need to be addressed:

The pull request title 'child_birth autorefersh configarations' contains typos and should be corrected to 'child_birth auto-refresh configurations'.
The manifest.json contains critical errors in script paths and an incorrect import_name that will cause the import to fail or be miscategorized.
The README.md file has some typos and a broken command example that should be fixed for clarity.
Several configuration files have minor formatting issues like missing newlines or commented-out properties that should be reviewed.

Overall, the structure of the new import is good, but the identified issues, especially in the manifest, are important to fix before merging.

SandeepTuniki · 2025-11-12T04:34:03Z

statvar_imports/child_birth/testdata/input.csv

The sample data file is very large. We don't need to include the entire dataset; about 50-100 records are sufficient. Can we trim it to 50-100 records?

SandeepTuniki · 2025-11-12T04:35:22Z

statvar_imports/child_birth/testdata/Child_birth_output.csv

The sample output file is very large. We don't need to include the entire dataset; about 50-100 records are sufficient. Assuming this was generated by the statvar processor, can we shrink the input file as I mentioned in another comment and regenerate this output file?
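One way to produce such a reduced sample is sketched below. It assumes the input is a plain CSV with a single header row; the function name `trim_csv` and the file names in the usage note are illustrative and not part of this pull request.

```python
import csv


def trim_csv(src_path, dst_path, max_rows=100):
    """Copy the header plus at most max_rows data rows from src_path to dst_path.

    Returns the number of data rows written (the header is not counted).
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            # Empty source file: nothing to copy.
            return 0
        writer.writerow(header)
        written = 0
        for row in reader:
            if written >= max_rows:
                break
            writer.writerow(row)
            written += 1
    return written
```

For example, `trim_csv("input_full.csv", "testdata/input.csv", max_rows=100)` would keep the header and the first 100 records; the expected output file would then need to be regenerated from the trimmed input.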

SandeepTuniki · 2025-11-12T04:37:33Z

Triggering review from gemini-code-assist bot again...

SandeepTuniki · 2025-11-12T04:37:39Z

/gemini review

gemini-code-assist

Code Review

This pull request adds a new data import for child birth statistics from the CDC. It includes all the necessary configuration files and test data. The changes look good overall, but there are a few issues to address, mainly in the README.md and manifest.json files to ensure the import can be run correctly. I've also suggested adding trailing newlines to several files for better compatibility. Please also consider fixing the typos in the pull request title ('autorefersh' -> 'autorefresh', 'configarations' -> 'configurations').

gemini-code-assist · 2025-11-12T04:41:16Z

statvar_imports/child_birth/manifest.json

+            "provenance_url": "https://www.cdc.gov/nchs/nvss/vsrr/provisional-tables.html",
+            "provenance_description": "The data set contains USA birth data",
+            "scripts": [
+                "./../util/download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/",


The relative path to download_util_script.py appears to be incorrect. Based on the file structure, the script is in the root util directory. From the manifest's location (statvar_imports/child_birth/), the correct path should be ../../util/download_util_script.py.
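The path arithmetic behind this comment can be checked with a short sketch. It assumes, as the comment does, that script paths are resolved relative to the manifest's directory; the helper name `resolve_from_manifest` is illustrative, and the sketch does not verify that the script actually exists at either location.

```python
import posixpath

# Directory containing manifest.json, relative to the repository root.
MANIFEST_DIR = "statvar_imports/child_birth"


def resolve_from_manifest(script_path, manifest_dir=MANIFEST_DIR):
    """Normalize a script path written relative to the manifest directory."""
    return posixpath.normpath(posixpath.join(manifest_dir, script_path))


# Path currently in the manifest: climbs one level only.
print(resolve_from_manifest("./../util/download_util_script.py"))
# statvar_imports/util/download_util_script.py

# Path proposed in this comment: climbs two levels to the repository root.
print(resolve_from_manifest("../../util/download_util_script.py"))
# util/download_util_script.py
```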

Suggested change

"./../util/download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/",

"../../util/download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/",

gemini-code-assist · 2025-11-12T04:41:16Z

statvar_imports/child_birth/README.md

+###How to run:
+
+python3 stat_var_processor.py 
+--input_data=../../statvar_imports/child_birth/input_files/*.csv -
+-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap> 
+--places_resolved_csv=../../statvar_imports/child_birth/place_mapping.csv 
+--config_file=../../statvar_imports/child_birth/<filename_of_metadata> 
+--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf 
+--output_path=../../statvar_imports/child_birth/output_files/child_birth
+
+
+
+###Example
+
+To Process the files, Run:
+
+Execute the script inside the folder "/data/tools/statvar_importer/"
+
+```
+python3 stat_var_processor.py 
+--input_data=../../statvar_imports/child_birth/input_files/*.csv 
+--pv_map=../../statvar_imports/child_birth/pvmap.csv
+--places_resolved_csv=../../statvar_imports/child_birth/places_resolved.csv 
+--config_file=../../statvar_imports/child_birth/metadata.csv 
+--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf 
+--output_path=../../statvar_imports/child_birth/output_files/child_birth
+```


The 'How to run' section is confusing, contains placeholders and errors, and the 'Example' section is redundant. It's better to have a single, clear, and correct 'How to run' section. This suggestion replaces the current 'How to run' and 'Example' sections with a single corrected version.

Suggested change

###How to run:

python3 stat_var_processor.py
--input_data=../../statvar_imports/child_birth/input_files/*.csv -
-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap>
--places_resolved_csv=../../statvar_imports/child_birth/place_mapping.csv
--config_file=../../statvar_imports/child_birth/<filename_of_metadata>
--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf
--output_path=../../statvar_imports/child_birth/output_files/child_birth

###Example

To Process the files, Run:

Execute the script inside the folder "/data/tools/statvar_importer/"

```
python3 stat_var_processor.py
--input_data=../../statvar_imports/child_birth/input_files/*.csv
--pv_map=../../statvar_imports/child_birth/pvmap.csv
--places_resolved_csv=../../statvar_imports/child_birth/places_resolved.csv
--config_file=../../statvar_imports/child_birth/metadata.csv
--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf
--output_path=../../statvar_imports/child_birth/output_files/child_birth
```

###How to run:

Execute the script from within the `/data/tools/statvar_importer/` directory.

```bash
python3 stat_var_processor.py \
--input_data=../../statvar_imports/child_birth/input_files/*.csv \
--pv_map=../../statvar_imports/child_birth/pvmap.csv \
--places_resolved_csv=../../statvar_imports/child_birth/places_resolved.csv \
--config_file=../../statvar_imports/child_birth/metadata.csv \
--existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf \
--output_path=../../statvar_imports/child_birth/output_files/child_birth
```

gemini-code-assist · 2025-11-12T04:41:16Z

statvar_imports/child_birth/README.md

+
+To Download, run:
+
+`python3 download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/`


For better readability and to prevent copy-paste errors, shell commands should be wrapped in a bash code block.

Suggested change

`python3 download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/`

python3 download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/

gemini-code-assist · 2025-11-12T04:41:17Z

statvar_imports/child_birth/README.md

+
+`python3 download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/`
+
+Note : The downloaded file will be saved as "input_files/*.csv"


The note about the downloaded filename is slightly misleading. The download_util_script.py will save the file as input_files/rows.csv based on the URL, not a wildcard *.csv. It's better to be specific to avoid confusion.

Suggested change

Note : The downloaded file will be saved as "input_files/*.csv"

Note : The downloaded file will be saved as "input_files/rows.csv"
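The filename in this suggestion follows from the last path segment of the download URL. The sketch below shows that derivation with the standard library only; whether `download_util_script.py` names its output this way is the assumption made in this comment and is not verified here, and `filename_from_url` is an illustrative helper.

```python
import posixpath
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring the query string."""
    return posixpath.basename(urlsplit(url).path)


url = "https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD"
print(filename_from_url(url))  # rows.csv
```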

gemini-code-assist · 2025-11-12T04:41:17Z

statvar_imports/child_birth/manifest.json

+            "provenance_description": "The data set contains USA birth data",
+            "scripts": [
+                "./../util/download_util_script.py --download_url=https://data.cdc.gov/api/views/hmz2-vwda/rows.csv?accessType=DOWNLOAD --output_folder=input_files/",
+                "../../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/*.csv --pv_map=pvmap.csv --config_file=metadata.csv --places_resolved_csv=places_resolved.csv  --existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf --output_path=output_files/child_birth"


There are extra spaces between arguments in this command. While this may not break execution, it's good practice to maintain consistent spacing for better readability.

Suggested change

"../../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/*.csv --pv_map=pvmap.csv --config_file=metadata.csv --places_resolved_csv=places_resolved.csv  --existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf --output_path=output_files/child_birth"

"../../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/*.csv --pv_map=pvmap.csv --config_file=metadata.csv --places_resolved_csv=places_resolved.csv --existing_statvar_mcf=gs://unresolved_mcf/scripts/statvar/stat_vars.mcf --output_path=output_files/child_birth"

gemini-code-assist · 2025-11-12T04:41:17Z

statvar_imports/child_birth/metadata.csv

+#places_within,country/USA
+output_columns,"observationAbout,observationDate,value,variableMeasured,observationPeriod"
+header_rows,1
+mapped_columns,5


It's a best practice for text files, including CSVs, to end with a newline character. This ensures compatibility with various command-line tools and editors.

mapped_columns,5

gemini-code-assist · 2025-11-12T04:41:17Z

statvar_imports/child_birth/places_resolved.csv

+WEST VIRGINIA,geoId/54
+WISCONSIN,geoId/55
+WYOMING,geoId/56
+PUERTO RICO,geoId/72


This file should end with a newline character. It's a standard convention for text files and helps prevent issues with some processing tools.

PUERTO RICO,geoId/72

gemini-code-assist · 2025-11-12T04:41:17Z

statvar_imports/child_birth/pvmap.csv

+Data Value,value,{Number},observationDate,{Month} {Year},,,,
+Monthly,observationPeriod,P1M,,,,,,
+12 Month-ending,observationPeriod,P1Y,,,,,,
+State,observationAbout,{Data},,,,,,


Please add a newline at the end of the file. It's a standard convention that improves file handling with many tools.

State,observationAbout,{Data},,,,,,
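The trailing-newline comments on metadata.csv, places_resolved.csv, and pvmap.csv could be addressed in one pass with a small helper. This is a minimal sketch; the function names are illustrative, and it treats an empty file as already compliant.

```python
def ends_with_newline(path):
    """Return True if the file is empty or its last byte is a newline."""
    with open(path, "rb") as f:
        data = f.read()
    return data == b"" or data.endswith(b"\n")


def ensure_trailing_newline(path):
    """Append a newline if the file lacks one. Returns True if the file was changed."""
    if ends_with_newline(path):
        return False
    with open(path, "ab") as f:
        f.write(b"\n")
    return True
```

Running `ensure_trailing_newline` over the three CSV files would append a newline only where one is missing and leave the other files untouched.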

gemini-code-assist

Code Review

This pull request introduces the necessary configuration and documentation for importing child_birth data. The new files, including README.md, manifest.json, metadata.csv, places_resolved.csv, pvmap.csv, Child_birth.tmcf, and Child_birth_stat_vars.mcf, establish a clear process for data download and processing. The README.md is well-structured, providing essential execution steps and an example. However, there are a few areas for improvement, including a critical typo in a command in the README.md, inconsistent use of placeholders, and missing newlines at the end of several files.

gemini-code-assist · 2025-11-12T04:42:43Z

statvar_imports/child_birth/README.md

+--input_data=../../statvar_imports/child_birth/input_files/*.csv -
+-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap> 


There is a typo here: --pv_map is split across two lines as - and -pv_map. This will cause the command to fail. Please correct it to --pv_map on a single line.

Suggested change

--input_data=../../statvar_imports/child_birth/input_files/*.csv -

-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap>

--input_data=../../statvar_imports/child_birth/input_files/*.csv \

--pv_map=../../statvar_imports/child_birth/<filename_of_pvmap>
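To see why the split flag matters, the sketch below tokenizes both versions of the command with `shlex.split`, which treats newlines as ordinary whitespace. It only models tokenization: in a real shell, a newline without a trailing backslash would also end the command. The placeholder `<filename_of_pvmap>` is replaced by `pvmap.csv` here for readability, and `flag_names` is an illustrative helper.

```python
import shlex

# Command text as laid out in the README hunk (newlines, no backslashes).
AS_WRITTEN = """python3 stat_var_processor.py
--input_data=../../statvar_imports/child_birth/input_files/*.csv -
-pv_map=../../statvar_imports/child_birth/pvmap.csv"""

# Same command with the flag kept whole.
CORRECTED = (
    "python3 stat_var_processor.py "
    "--input_data=../../statvar_imports/child_birth/input_files/*.csv "
    "--pv_map=../../statvar_imports/child_birth/pvmap.csv"
)


def flag_names(command):
    """Return the name part of every token that starts with a dash."""
    return [tok.split("=", 1)[0] for tok in shlex.split(command) if tok.startswith("-")]


print(flag_names(AS_WRITTEN))  # ['--input_data', '-', '-pv_map']
print(flag_names(CORRECTED))   # ['--input_data', '--pv_map']
```

In the version as written, no token is named `--pv_map`; the flag arrives as a stray `-` followed by `-pv_map=...`.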

gemini-code-assist · 2025-11-12T04:42:43Z

statvar_imports/child_birth/README.md

+
+python3 stat_var_processor.py 
+--input_data=../../statvar_imports/child_birth/input_files/*.csv -
+-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap> 


The pv_map argument uses a generic placeholder <filename_of_pvmap>. For consistency and clarity, it would be better to use the actual filename, pvmap.csv, as demonstrated in the example section (line 39).

Suggested change

-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap>

--pv_map=../../statvar_imports/child_birth/pvmap.csv

gemini-code-assist · 2025-11-12T04:42:43Z

statvar_imports/child_birth/README.md

+--input_data=../../statvar_imports/child_birth/input_files/*.csv -
+-pv_map=../../statvar_imports/child_birth/<filename_of_pvmap> 
+--places_resolved_csv=../../statvar_imports/child_birth/place_mapping.csv 
+--config_file=../../statvar_imports/child_birth/<filename_of_metadata> 


Similar to the pv_map argument, the config_file argument uses a generic placeholder <filename_of_metadata>. Please update this to metadata.csv for consistency with the example provided (line 41).

Suggested change

--config_file=../../statvar_imports/child_birth/<filename_of_metadata>

--config_file=../../statvar_imports/child_birth/metadata.csv

gemini-code-assist · 2025-11-12T04:42:43Z

statvar_imports/child_birth/metadata.csv

@@ -0,0 +1,5 @@
+parameter,value
+#places_within,country/USA


The line #places_within,country/USA is commented out. If this configuration is not currently in use or intended for future use, please either remove it to avoid confusion or add a comment explaining its purpose and why it's commented out.
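The effect of the commented-out line can be illustrated with a small loader for the `parameter,value` layout. This is a sketch under the assumption that rows whose first field starts with `#` are skipped; it is not the statvar processor's own config loader, and `load_config` is an illustrative name.

```python
import csv
import io

# Contents of metadata.csv as shown in the diff hunks in this review.
METADATA = """parameter,value
#places_within,country/USA
output_columns,"observationAbout,observationDate,value,variableMeasured,observationPeriod"
header_rows,1
mapped_columns,5
"""


def load_config(text):
    """Parse parameter,value rows into a dict, skipping '#' comment rows (assumed convention)."""
    config = {}
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the "parameter,value" header row
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        config[row[0]] = row[1] if len(row) > 1 else ""
    return config


config = load_config(METADATA)
print(sorted(config))  # ['header_rows', 'mapped_columns', 'output_columns']
```

Under that assumption, `places_within` never reaches the configuration, which is why the comment asks for the line to be removed or explained.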

config for child_birth_import

7b3e90d

Rohit231998 requested a review from krishnaswamypradeep October 14, 2025 12:41

gemini-code-assist bot reviewed Oct 14, 2025

krishnaswamypradeep reviewed Oct 15, 2025

statvar_imports/child_birth/testdata/Child_birth_output.csv (resolved)

statvar_imports/child_birth/manifest.json (outdated, resolved)

resolved comments

70671fe

Rohit231998 requested a review from krishnaswamypradeep October 15, 2025 10:09

resolved comments

18ce010

krishnaswamypradeep approved these changes Oct 17, 2025

Rohit231998 requested a review from ajaits October 17, 2025 06:03

SandeepTuniki self-requested a review November 10, 2025 06:11

SandeepTuniki requested changes Nov 12, 2025

gemini-code-assist bot reviewed Nov 12, 2025

