Benchmark It Yourself (BIY): Preparing a Dataset and Benchmarking AI Models for Scatterplot-Related Tasks
Supplementary materials for the BIY paper.
- Generate the parquet file in dataset/ (the dataset is already provided in the dataset/input folder).
- Run the benchmark in benchmark/.
See detailed steps in dataset/README.md and benchmark/README.md.
Define the following variables in the .env file:
# OpenAI (required for OpenAI runs)
OPENAI_API_KEY=...
# Google Vertex AI + GCS (required for Google runs)
GOOGLE_GENAI_USE_VERTEXAI=True
GOOGLE_CLOUD_PROJECT=... # Your project id
GOOGLE_CLOUD_LOCATION=... # Optional -> defaults to us-central1
GOOGLE_CLOUD_OUTPUT_BUCKET=... # Your bucket path
Google authentication: use Application Default Credentials (ADC).
gcloud auth application-default login
Where these are needed:
- OPENAI_API_KEY: used by benchmark/run_open_ai_batches.py, benchmark/check_open_ai_batches.py, benchmark/download_open_ai_results.py.
- GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, GOOGLE_CLOUD_OUTPUT_BUCKET: used by benchmark/prepare_google_batches.py, benchmark/upload_google_batches.py, benchmark/run_google_batches.py, benchmark/download_google_results.py.
- ANTHROPIC_API_KEY: used by benchmark/estimate_anthropic_costs.py to estimate token costs (optional).
Note: The dataset/ stage does not require any API keys.
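Before launching a run, it can help to confirm that the variables for the chosen provider are set. The snippet below is a minimal sketch and not part of the repository: the helper names (`missing_env_vars`, `google_location`) and the provider grouping are hypothetical, and it assumes the variables have already been loaded into the process environment. The variable names and the us-central1 default come from the list above.

```python
import os

# Variable names come from the .env description above. The grouping by
# provider and the helper names are illustrative assumptions.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "google": [
        "GOOGLE_GENAI_USE_VERTEXAI",
        "GOOGLE_CLOUD_PROJECT",
        "GOOGLE_CLOUD_OUTPUT_BUCKET",
    ],
}

# GOOGLE_CLOUD_LOCATION is optional and falls back to this value.
DEFAULT_GOOGLE_LOCATION = "us-central1"


def missing_env_vars(provider, env=None):
    """Return the required variables for `provider` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]


def google_location(env=None):
    """Return GOOGLE_CLOUD_LOCATION, or the default when it is unset or empty."""
    env = os.environ if env is None else env
    return env.get("GOOGLE_CLOUD_LOCATION") or DEFAULT_GOOGLE_LOCATION


if __name__ == "__main__":
    for provider in REQUIRED_VARS:
        missing = missing_env_vars(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Running the file directly prints one status line per provider. An empty list from `missing_env_vars` only means the variables are present; it does not validate the key or the Google credentials.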

















