Batch scoring for production-shaped workloads
Turn recurring customer or account CSV extracts into scored outputs on a schedule or on demand: validation gates bad rows early, the same preprocessing assumptions as training keep scores comparable run-to-run, and you get a clean CSV handoff for CRM, analytics, or ops queues. A pilot scopes input schema, hosting, and SLAs—this page points at the public reference deployment for evaluation.
The public reference scoring service is for evaluation only. Production use is contract-specific (VPC placement, data handling, and your release process) and is not served from this static page.
Limitations & scope
- Batch / scheduled scoring only; not a low-latency online inference product.
- Training lives outside this repo; preprocessing here must stay aligned with the trained pipeline.
- Demo fixture model proves the path in CI; production models are environment-specific.
Overview
Problem. After training, a lot of value is in recurring batch scores on customer or account tables—not every product needs a real-time endpoint.
System. One pipeline: load CSV → validate schema and value ranges → preprocess with the same feature assumptions as training → predict_proba via a serialized sklearn pipeline → postprocess to probabilities, thresholded labels, model version, and UTC timestamp → write scored CSV (optional run manifest).
Boundary. This repository is scoring-only: the trained joblib model is supplied by your training project; preprocessing here must stay aligned with that artifact.
Workflow
End-to-end path for one batch file, orchestrated by run_batch_scoring(...); the CLI entrypoint wraps the same code path and exits non-zero on failure:
1. Load batch: Read Telco-style churn columns; fail fast on missing columns or parse issues.
2. Validate: Input checks before any model call, which reduces silent garbage-in scores.
3. Preprocess: Training-aligned transforms for the feature matrix passed to the pipeline.
4. Score: joblib pipeline loaded once per run; batch predict_proba for all rows.
5. Postprocess & save: Output columns include customer_id, churn_score, predicted_label, model_version, scoring_timestamp; optional JSON manifest next to the CSV.
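As a rough illustration of the postprocess step, the sketch below turns positive-class probabilities into the five output columns listed above. The function name `postprocess`, its parameters, the default threshold of 0.5, the version string `demo-0.1`, and the "at or above the threshold" labelling rule are all assumptions made for this example, not the repository's actual code.

```python
from datetime import datetime, timezone

import pandas as pd


def postprocess(customer_ids, churn_probs, threshold=0.5, model_version="demo-0.1"):
    """Build the scored output frame from positive-class probabilities.

    Hypothetical helper: names and defaults are illustrative only.
    """
    scores = pd.Series(list(churn_probs), dtype="float64")
    # One UTC timestamp per run, so every row in a batch shares it.
    scored_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return pd.DataFrame(
        {
            "customer_id": list(customer_ids),
            "churn_score": scores,
            # Assumption: a score at or above the threshold is labelled 1.
            "predicted_label": (scores >= threshold).astype(int),
            "model_version": model_version,
            "scoring_timestamp": scored_at,
        }
    )


# With a fitted binary sklearn pipeline, churn_probs would typically be
# pipeline.predict_proba(X)[:, 1]; plain floats stand in for it here.
out = postprocess(["C001", "C002", "C003"], [0.12, 0.5, 0.91])
```

Stamping the version and timestamp once per run, rather than per row, keeps every row of a batch traceable to the same model artifact and scoring moment.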
Features
Input validation
Schema and sanity checks before scoring; failures surface as CLI/API errors instead of partial outputs.
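A minimal sketch of what such a gate could look like with pandas. The column names, the value ranges, and the `ValidationError` / `validate_batch` names are hypothetical and chosen only to resemble Telco-style data; the real schema is whatever the trained pipeline expects.

```python
import pandas as pd

# Hypothetical schema: column names and ranges are illustrative only.
REQUIRED_COLUMNS = ["customer_id", "tenure", "MonthlyCharges"]
VALUE_RANGES = {"tenure": (0, 120), "MonthlyCharges": (0.0, 1000.0)}


class ValidationError(ValueError):
    """Raised when a batch must be rejected before any model call."""


def validate_batch(df: pd.DataFrame) -> None:
    """Raise ValidationError if the batch fails schema or range checks."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        # Range checks need the columns to exist, so stop here.
        raise ValidationError(f"missing columns: {missing}")

    problems = []
    if df["customer_id"].isna().any():
        problems.append("customer_id has empty values")
    for col, (lo, hi) in VALUE_RANGES.items():
        values = pd.to_numeric(df[col], errors="coerce")
        bad = values.isna() | (values < lo) | (values > hi)
        if bad.any():
            problems.append(
                f"{col}: {int(bad.sum())} value(s) non-numeric or outside [{lo}, {hi}]"
            )
    if problems:
        # Reject the whole batch instead of scoring a partial one.
        raise ValidationError("; ".join(problems))


good = pd.DataFrame(
    {"customer_id": ["C1", "C2"], "tenure": [3, 40], "MonthlyCharges": [29.9, 80.0]}
)
validate_batch(good)  # passes silently

bad = good.assign(tenure=[-1, 40])
try:
    validate_batch(bad)
except ValidationError as exc:
    message = str(exc)
```

Collecting all range problems before raising gives the caller one complete error message per run rather than one failure at a time.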
Training alignment
Preprocessing module is the contract between training-time features and batch inference.
Versioned outputs
Model version string and timestamp on every row; optional run manifest records threshold and paths.
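One possible shape for the run manifest, using only the standard library. The field names, the `write_manifest` helper, the `.manifest.json` naming rule, and the extra `row_count` field are assumptions for this sketch; the source states only that the manifest records the threshold and paths.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_manifest(output_csv, input_csv, model_path, model_version, threshold, row_count):
    """Write a JSON manifest beside the scored CSV and return its path.

    Hypothetical helper: field names are illustrative only.
    """
    output_csv = Path(output_csv)
    manifest = {
        "input_path": str(input_csv),
        "output_path": str(output_csv),
        "model_path": str(model_path),
        "model_version": model_version,
        "threshold": threshold,
        "row_count": row_count,  # assumption: not mentioned in the source
        "scoring_timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    # scored.csv -> scored.manifest.json in the same directory
    manifest_path = output_csv.with_name(output_csv.stem + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_manifest(
        Path(tmp) / "scored.csv", "input.csv", "models/model.joblib", "demo-0.1", 0.5, 3
    )
    loaded = json.loads(path.read_text(encoding="utf-8"))
```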
CLI + exit codes
python -m src.pipeline.run_batch_scoring with flags for input, output, threshold, model path, and verbosity.
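The exit-code behaviour could be wired up roughly as follows. The flag names (`--input`, `--output`, `--threshold`, `--model-path`, `--verbose`), the default model path, and the specific codes 1 and 2 are guesses for illustration; check README.md for the real flags. The `run` callable is a stand-in for the scoring function.

```python
import argparse
import sys


def build_parser():
    # Hypothetical flag names; the real CLI may differ.
    parser = argparse.ArgumentParser(prog="run_batch_scoring")
    parser.add_argument("--input", required=True, help="path to the input CSV")
    parser.add_argument("--output", required=True, help="path for the scored CSV")
    parser.add_argument("--threshold", type=float, default=0.5)
    parser.add_argument("--model-path", default="models/model.joblib")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def main(argv=None, run=None):
    """Parse flags, call the scoring function, and map failures to exit codes."""
    args = build_parser().parse_args(argv)
    if not 0.0 <= args.threshold <= 1.0:
        print("threshold must be between 0 and 1", file=sys.stderr)
        return 2
    try:
        run(args)  # stand-in for the real scoring call
    except Exception as exc:
        # Any failure becomes a non-zero exit so schedulers can detect it.
        print(f"scoring failed: {exc}", file=sys.stderr)
        return 1
    return 0


def failing_run(args):
    raise ValueError("missing columns: ['tenure']")


base = ["--input", "in.csv", "--output", "out.csv"]
ok_code = main(base, run=lambda args: None)
fail_code = main(base, run=failing_run)
# A real entrypoint would finish with: raise SystemExit(main(run=...))
```

Returning the code from `main` instead of calling `sys.exit` inside it keeps the function testable, while cron, Airflow, or GitHub Actions still see a non-zero status on failure.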
Scoring service demo (optional)
Small HTTP surface on the scoring host: upload CSV, JSON summary + preview, download scored file—same run_batch_scoring path as the CLI, not a separate dashboard product.
Docker image
Single Dockerfile: dependencies, app code, fixtures for demo model; suitable for VPS-style deploy behind a reverse proxy.
Tech stack
- Core: Python 3.10+, pandas, scikit-learn, joblib, PyYAML
- Quality: pytest; committed E2E over fixture CSV + small pipeline artifact
- Optional HTTP: FastAPI + Uvicorn expose OpenAPI and batch scoring routes; Jinja serves a minimal upload/preview helper (not a Streamlit-style app).
- Ops: Dockerfile (slim Python base); compose/orchestration left to your environment
Deployment & live interface
The canonical live scoring service is https://score.vahdetkaratas.com/: batch HTTP routes (JSON + CSV), OpenAPI at /docs, and a minimal browser helper for upload → preview → download on the same host. This is a service / API demo, not a standalone analytics dashboard.
Batch scoring itself does not require a browser: cron, Airflow, GitHub Actions, or a simple VM running the CLI against mounted input/output directories is the primary shape.
Repository layout and run commands are in README.md; the architecture map is in docs/ARCHITECTURE.md.
Limitations
Not a model training repo; not a feature store; not a monitoring product. No built-in drift detection, retraining loop, or multi-tenant auth—those belong upstream or around this job.
Throughput and latency are batch-oriented: large files should be sized with worker memory and wall-clock expectations in mind.
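If a file is too large for worker memory, one option is to read and score it in fixed-size chunks. The sketch below shows only the chunked read, using the standard library; nothing in the source says the pipeline works this way, and `iter_chunks` and the chunk size are hypothetical.

```python
import csv
import io
from itertools import islice


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from a csv.DictReader."""
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# An in-memory CSV stands in for a large file on disk.
sample = io.StringIO("customer_id,tenure\nC1,3\nC2,40\nC3,12\nC4,7\nC5,60\n")
sizes = [len(chunk) for chunk in iter_chunks(csv.DictReader(sample), chunk_size=2)]
```

Each chunk could then pass through validation, preprocessing, and scoring on its own, at the cost of no longer rejecting the whole file when a late chunk fails validation.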
Why this project
Shows a complete file-in → validate → score → file-out contract with tests that exercise the real path, plus a container story for how the same code ships. Useful when interviewers or clients ask for evidence you can own the boring, valuable half of ML ops: repeatable batch outputs with clear boundaries.