Skip to content

PYTHON-5253 Automated Spec Test Sync #2409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
fe1b07d
WIP
sleepyStick May 6, 2025
f822f31
trying to get it to work locally?
sleepyStick May 8, 2025
6909518
wip
sleepyStick May 12, 2025
73139a4
still a wip
sleepyStick Jun 5, 2025
b3a2b29
WIP
sleepyStick May 6, 2025
49685ae
trying to get it to work locally?
sleepyStick May 8, 2025
076ed93
wip
sleepyStick May 12, 2025
07ea1f6
still a wip
sleepyStick Jun 5, 2025
f4b3ddc
add patch file, use comment bot, still a wip
sleepyStick Jun 23, 2025
547dd76
Merge branch 'PYTHON-5253' of https://github.com/sleepyStick/mongo-py…
sleepyStick Jun 23, 2025
a97d5f4
Merge branch 'mongodb:master' into PYTHON-5253
sleepyStick Jun 24, 2025
925aeb5
fun fact, using a python script is actually *way* easier
sleepyStick Jun 25, 2025
67b995a
remove comment
sleepyStick Jun 25, 2025
53e8945
remove comment pt2
sleepyStick Jun 25, 2025
9f6f727
9am pst is 4pm utc i think?
sleepyStick Jun 25, 2025
3144383
remove unused vars and cleanup debugging prints
sleepyStick Jun 25, 2025
2bea24b
oops accidentally deleted this line
sleepyStick Jun 26, 2025
4bef7ca
fix resync commit msg
sleepyStick Jun 26, 2025
8aaf791
add / fix comments
sleepyStick Jun 26, 2025
413b463
check=True, catch error instead
sleepyStick Jun 26, 2025
75ddd14
Merge branch 'master' into PYTHON-5253
sleepyStick Jun 26, 2025
1a3d87f
multiple patch files
sleepyStick Jun 27, 2025
4f3cb61
update contributing to explain patch files
sleepyStick Jun 27, 2025
ec50aff
Update CONTRIBUTING.md
sleepyStick Jun 27, 2025
579c194
explain what syncing is, add example, and modify spacing
sleepyStick Jun 27, 2025
8c943e3
Update CONTRIBUTING.md
sleepyStick Jul 8, 2025
d6347e0
each patch file is for a ticket now
sleepyStick Jul 8, 2025
1dc4e26
move patching logic into python script
sleepyStick Jul 9, 2025
1d2cfff
not merged yet, but adding this for future
sleepyStick Jul 9, 2025
c6030a2
fix lint
sleepyStick Jul 9, 2025
391a3f9
drivers tools is cloned elsewhere, no need to clone again
sleepyStick Jul 10, 2025
734b3fa
rename patch -> spec-patch
sleepyStick Jul 11, 2025
f45fd46
rename create-pr -> create-spec-pr
sleepyStick Jul 11, 2025
87d73cc
allow scripts to be run locally
sleepyStick Jul 11, 2025
1c901c7
Update CONTRIBUTING.md
sleepyStick Jul 11, 2025
380adbb
remove unnecessary logic
sleepyStick Jul 11, 2025
7897afb
Merge branch 'master' into PYTHON-5253
sleepyStick Jul 11, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions .evergreen/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -42,3 +42,23 @@ post:
- func: "upload mo artifacts"
- func: "upload test results"
- func: "cleanup"

tasks:
- name: resync_specs
commands:
- command: subprocess.exec
params:
binary: bash
include_expansions_in_env: [AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN]
args:
- .evergreen/scripts/resync-all-specs.sh
working_dir: src

buildvariants:
- name: resync_specs
display_name: "Resync Specs"
run_on: rhel80-small
cron: '0 16 * * MON'
patchable: false
tasks:
- name: resync_specs
46 changes: 46 additions & 0 deletions .evergreen/remove-unimplemented-tests.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
#!/bin/bash
PYMONGO=$(dirname "$(cd "$(dirname "$0")" || exit; pwd)")

rm -rf $PYMONGO/test/server_selection/logging
rm $PYMONGO/test/transactions/legacy/errors-client.json # PYTHON-1894
rm $PYMONGO/test/connection_monitoring/wait-queue-fairness.json # PYTHON-1873
rm $PYMONGO/test/client-side-encryption/spec/unified/fle2v2-BypassQueryAnalysis.json # PYTHON-5143
rm $PYMONGO/test/client-side-encryption/spec/unified/fle2v2-EncryptedFields-vs-EncryptedFieldsMap.json # PYTHON-5143
rm $PYMONGO/test/client-side-encryption/spec/unified/localSchema.json # PYTHON-5143
rm $PYMONGO/test/client-side-encryption/spec/unified/maxWireVersion.json # PYTHON-5143
rm $PYMONGO/test/unified-test-format/valid-pass/poc-queryable-encryption.json # PYTHON-5143
rm $PYMONGO/test/gridfs/rename.json # PYTHON-4931
rm $PYMONGO/test/discovery_and_monitoring/unified/pool-clear-application-error.json # PYTHON-4918
rm $PYMONGO/test/discovery_and_monitoring/unified/pool-clear-checkout-error.json # PYTHON-4918
rm $PYMONGO/test/discovery_and_monitoring/unified/pool-clear-min-pool-size-error.json # PYTHON-4918

# Python doesn't implement DRIVERS-3064
rm $PYMONGO/test/collection_management/listCollections-rawdata.json
rm $PYMONGO/test/crud/unified/aggregate-rawdata.json
rm $PYMONGO/test/crud/unified/bulkWrite-deleteMany-rawdata.json
rm $PYMONGO/test/crud/unified/bulkWrite-deleteOne-rawdata.json
rm $PYMONGO/test/crud/unified/bulkWrite-replaceOne-rawdata.json
rm $PYMONGO/test/crud/unified/bulkWrite-updateMany-rawdata.json
rm $PYMONGO/test/crud/unified/bulkWrite-updateOne-rawdata.json
rm $PYMONGO/test/crud/unified/client-bulkWrite-delete-rawdata.json
rm $PYMONGO/test/crud/unified/client-bulkWrite-replaceOne-rawdata.json
rm $PYMONGO/test/crud/unified/client-bulkWrite-update-rawdata.json
rm $PYMONGO/test/crud/unified/count-rawdata.json
rm $PYMONGO/test/crud/unified/countDocuments-rawdata.json
rm $PYMONGO/test/crud/unified/db-aggregate-rawdata.json
rm $PYMONGO/test/crud/unified/deleteMany-rawdata.json
rm $PYMONGO/test/crud/unified/deleteOne-rawdata.json
rm $PYMONGO/test/crud/unified/distinct-rawdata.json
rm $PYMONGO/test/crud/unified/estimatedDocumentCount-rawdata.json
rm $PYMONGO/test/crud/unified/find-rawdata.json
rm $PYMONGO/test/crud/unified/findOneAndDelete-rawdata.json
rm $PYMONGO/test/crud/unified/findOneAndReplace-rawdata.json
rm $PYMONGO/test/crud/unified/findOneAndUpdate-rawdata.json
rm $PYMONGO/test/crud/unified/insertMany-rawdata.json
rm $PYMONGO/test/crud/unified/insertOne-rawdata.json
rm $PYMONGO/test/crud/unified/replaceOne-rawdata.json
rm $PYMONGO/test/crud/unified/updateMany-rawdata.json
rm $PYMONGO/test/crud/unified/updateOne-rawdata.json
rm $PYMONGO/test/index_management/index-rawdata.json

echo "Done removing unimplemented tests\n"
12 changes: 6 additions & 6 deletions .evergreen/resync-specs.sh
Original file line number Diff line number Diff line change
Expand Up @@ -45,9 +45,12 @@ then
fi

# Ensure the JSON files are up to date.
cd $SPECS/source
make
cd -
if ! [ -n "${CI:-}" ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is always defined in an EVG run and we only want to run these things in a local checkout.

then
cd $SPECS/source
make
cd -
fi
# cpjson unified-test-format/tests/invalid unified-test-format/invalid
# * param1: Path to spec tests dir in specifications repo
# * param2: Path to where the corresponding tests live in Python.
Expand Down Expand Up @@ -110,7 +113,6 @@ do
cmap|CMAP|connection-monitoring-and-pooling)
cpjson connection-monitoring-and-pooling/tests/logging connection_logging
cpjson connection-monitoring-and-pooling/tests/cmap-format connection_monitoring
rm $PYMONGO/test/connection_monitoring/wait-queue-fairness.json # PYTHON-1873
;;
apm|APM|command-monitoring|command_monitoring)
cpjson command-logging-and-monitoring/tests/monitoring command_monitoring
Expand Down Expand Up @@ -174,7 +176,6 @@ do
;;
server-selection|server_selection)
cpjson server-selection/tests/ server_selection
rm -rf $PYMONGO/test/server_selection/logging
cpjson server-selection/tests/logging server_selection_logging
;;
server-selection-logging|server_selection_logging)
Expand All @@ -186,7 +187,6 @@ do
transactions|transactions-convenient-api)
cpjson transactions/tests/ transactions
cpjson transactions-convenient-api/tests/ transactions-convenient-api
rm $PYMONGO/test/transactions/legacy/errors-client.json # PYTHON-1894
;;
unified|unified-test-format)
cpjson unified-test-format/tests/ unified-test-format/
Expand Down
50 changes: 50 additions & 0 deletions .evergreen/scripts/create-spec-pr.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
#!/usr/bin/env bash

tools="$(realpath -s "../drivers-tools")"
pushd $tools/.evergreen/github_app || exit

owner="mongodb"
repo="mongo-python-driver"

# Bootstrap the app.
echo "bootstrapping"
source utils.sh
bootstrap drivers/comment-bot

# Run the app.
source ./secrets-export.sh

# Get a github access token for the git checkout.
echo "Getting github token..."

token=$(bash ./get-access-token.sh $repo $owner)
if [ -z "${token}" ]; then
echo "Failed to get github access token!"
popd || exit
exit 1
fi
echo "Getting github token... done."
popd || exit

# Make the git checkout and create a new branch.
echo "Creating the git checkout..."
branch="spec-resync-"$(date '+%m-%d-%Y')

git remote set-url origin https://x-access-token:${token}@github.com/$owner/$repo.git
git checkout -b $branch "origin/master"
git add ./test
git commit -am "resyncing specs $(date '+%m-%d-%Y')"
echo "Creating the git checkout... done."

git push origin $branch
resp=$(curl -L \
-X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $token" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-d "{\"title\":\"[Spec Resync] $(date '+%m-%d-%Y')\",\"body\":\"$(cat "$1")\",\"head\":\"${branch}\",\"base\":\"master\"}" \
--url https://api.github.com/repos/$owner/$repo/pulls)
echo $resp | jq '.html_url'
echo "Creating the PR... done."

rm -rf $tools
119 changes: 119 additions & 0 deletions .evergreen/scripts/resync-all-specs.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
from __future__ import annotations

import argparse
import os
import pathlib
import subprocess
from argparse import Namespace
from subprocess import CalledProcessError
from typing import Optional


def resync_specs(directory: pathlib.Path, errored: dict[str, str]) -> None:
"""Actually sync the specs"""
print("Beginning to sync specs") # noqa: T201
for spec in os.scandir(directory):
if not spec.is_dir():
continue

if spec.name in ["asynchronous"]:
continue
try:
subprocess.run(
["bash", "./.evergreen/resync-specs.sh", spec.name], # noqa: S603, S607
capture_output=True,
text=True,
check=True,
)
except CalledProcessError as exc:
errored[spec.name] = exc.stderr
print("Done syncing specs") # noqa: T201


def apply_patches():
print("Beginning to apply patches") # noqa: T201
subprocess.run(["bash", "./.evergreen/remove-unimplemented-tests.sh"], check=True) # noqa: S603, S607
subprocess.run(["git apply -R --allow-empty ./.evergreen/spec-patch/*"], shell=True, check=True) # noqa: S602, S607


def check_new_spec_directories(directory: pathlib.Path) -> list[str]:
"""Check to see if there are any directories in the spec repo that don't exist in pymongo/test"""
spec_dir = pathlib.Path(os.environ["MDB_SPECS"]) / "source"
spec_set = {
entry.name.replace("-", "_")
for entry in os.scandir(spec_dir)
if entry.is_dir()
and (pathlib.Path(entry.path) / "tests").is_dir()
and len(list(os.scandir(pathlib.Path(entry.path) / "tests"))) > 1
}
test_set = {entry.name.replace("-", "_") for entry in os.scandir(directory) if entry.is_dir()}
known_mappings = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a slight difference in how we name the directories vs how the spec names them sometimes? (Not 100% sure if these are correct though, so please let me know if these need to be changed!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we often have different names than the spec repo, that's expected.

"ocsp_support": "ocsp",
"client_side_operations_timeout": "csot",
"mongodb_handshake": "handshake",
"load_balancers": "load_balancer",
"atlas_data_lake_testing": "atlas",
"connection_monitoring_and_pooling": "connection_monitoring",
"command_logging_and_monitoring": "command_logging",
"initial_dns_seedlist_discovery": "srv_seedlist",
"server_discovery_and_monitoring": "sdam_monitoring",
}

for k, v in known_mappings.items():
if k in spec_set:
spec_set.remove(k)
spec_set.add(v)
return list(spec_set - test_set)


def write_summary(errored: dict[str, str], new: list[str], filename: Optional[str]) -> None:
"""Generate the PR description"""
pr_body = ""
process = subprocess.run(
["git diff --name-only | awk -F'/' '{print $2}' | sort | uniq"], # noqa: S607
shell=True, # noqa: S602
capture_output=True,
text=True,
check=True,
)
succeeded = process.stdout.strip().split()
if len(succeeded) > 0:
pr_body += "The following specs were changed:\n -"
pr_body += "\n -".join(succeeded)
pr_body += "\n"
if len(errored) > 0:
pr_body += "\n\nThe following spec syncs encountered errors:\n -"
for k, v in errored.items():
pr_body += f"\n -{k}\n```{v}\n```"
pr_body += "\n"
if len(new) > 0:
pr_body += "\n\nThe following directories are in the specification repository and not in our test directory:\n -"
pr_body += "\n -".join(new)
pr_body += "\n"
if pr_body != "":
if filename is None:
print(f"\n{pr_body}") # noqa: T201
else:
with open(filename, "w") as f:
# replacements made for proper json
f.write(pr_body.replace("\n", "\\n").replace("\t", "\\t"))


def main(args: Namespace):
directory = pathlib.Path("./test")
errored: dict[str, str] = {}
resync_specs(directory, errored)
apply_patches()
new = check_new_spec_directories(directory)
write_summary(errored, new, args.filename)


if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Python Script to resync all specs and generate summary for PR."
)
parser.add_argument(
"--filename", help="Name of file for the summary to be written into.", default=None
)
args = parser.parse_args()
main(args)
42 changes: 42 additions & 0 deletions .evergreen/scripts/resync-all-specs.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
#!/usr/bin/env bash
# Run spec syncing script and create PR

# SETUP
SRC_URL="https://github.com/mongodb/specifications.git"
# needs to be set for resync-specs.sh
SPEC_SRC="$(realpath "../specifications")"
SCRIPT="$(realpath "./.evergreen/resync-specs.sh")"

# Clone the spec repo if the directory does not exist
if [[ ! -d $SPEC_SRC ]]; then
git clone $SRC_URL $SPEC_SRC
if [[ $? -ne 0 ]]; then
echo "Error: Failed to clone repository."
exit 1
fi
fi

# Set environment variable to the cloned spec repo for resync-specs.sh
export MDB_SPECS="$SPEC_SRC"

# Check that resync-specs.sh exists and is executable
if [[ ! -x $SCRIPT ]]; then
echo "Error: $SCRIPT not found or is not executable."
exit 1
fi

PR_DESC="spec_sync.txt"

# run python script that actually does all the resyncing
if ! [ -n "${CI:-}" ]
then
# we're running locally
python3 ./.evergreen/scripts/resync-all-specs.py
else
/opt/devtools/bin/python3.11 ./.evergreen/scripts/resync-all-specs.py "$PR_DESC"
if [[ -f $PR_DESC ]]; then
# changes were made -> call scrypt to create PR for us
.evergreen/scripts/create-spec-pr.sh "$PR_DESC"
rm "$PR_DESC"
fi
fi
12 changes: 12 additions & 0 deletions .evergreen/spec-patch/PYTHON-4884.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
diff --git a/test/bson_corpus/datetime.json b/test/bson_corpus/datetime.json
index f857afdc..1554341d 100644
--- a/test/bson_corpus/datetime.json
+++ b/test/bson_corpus/datetime.json
@@ -24,6 +24,7 @@
{
"description" : "Y10K",
"canonical_bson" : "1000000009610000DC1FD277E6000000",
+ "relaxed_extjson" : "{\"a\":{\"$date\":{\"$numberLong\":\"253402300800000\"}}}",
"canonical_extjson" : "{\"a\":{\"$date\":{\"$numberLong\":\"253402300800000\"}}}"
},
{
24 changes: 24 additions & 0 deletions .evergreen/spec-patch/PYTHON-4918.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
diff --git a/test/connection_monitoring/pool-create-min-size-error.json b/test/connection_monitoring/pool-create-min-size-error.json
index 1c744b85..509b2a23 100644
--- a/test/connection_monitoring/pool-create-min-size-error.json
+++ b/test/connection_monitoring/pool-create-min-size-error.json
@@ -49,15 +49,15 @@
"type": "ConnectionCreated",
"address": 42
},
+ {
+ "type": "ConnectionPoolCleared",
+ "address": 42
+ },
{
"type": "ConnectionClosed",
"address": 42,
"connectionId": 42,
"reason": "error"
- },
- {
- "type": "ConnectionPoolCleared",
- "address": 42
}
],
"ignore": [
Loading
Loading