Batch Scoring

Discuss use case Vahdetlabs Open scoring demo Review scoring API (OpenAPI)

Capability demo · repeatable batch scoring

Batch scoring for production-shaped workloads

Turn recurring customer or account CSV extracts into scored outputs, on a schedule or on demand. Validation gates bad rows early, reusing the training-time preprocessing assumptions keeps scores comparable from run to run, and the result is a clean CSV handoff for CRM, analytics, or ops queues. A pilot scopes the input schema, hosting, and SLAs; this page points to the public reference deployment for evaluation.

Public reference scoring service, for evaluation only. Production use is contract-specific (VPC placement, data handling, and your release process) and does not run from this static marketing page.

Python · pandas · scikit-learn · joblib

Limitations & scope

Batch / scheduled scoring only; not a low-latency online inference product.
Training lives outside this repo; preprocessing here must stay aligned with the trained pipeline.
Demo fixture model proves the path in CI; production models are environment-specific.

Overview

Problem. After training, a lot of value is in recurring batch scores on customer or account tables—not every product needs a real-time endpoint.

System. One pipeline: load CSV → validate schema and value ranges → preprocess with the same feature assumptions as training → predict_proba via a serialized sklearn pipeline → postprocess to probabilities, thresholded labels, model version, and UTC timestamp → write scored CSV (optional run manifest).
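
The postprocess step at the end of that pipeline can be sketched roughly as below. This is a minimal illustration assuming pandas: the helper name `postprocess_scores`, its signature, and the default threshold of 0.5 are hypothetical, while the output column names are the ones listed in the workflow on this page.

```python
from datetime import datetime, timezone

import pandas as pd


def postprocess_scores(customer_ids, probabilities, model_version, threshold=0.5):
    """Build the scored output frame from positive-class probabilities.

    Hypothetical helper: the name, signature, and default threshold are
    assumptions for illustration, not the repository's actual API.
    """
    # One timestamp per run so every row in the batch carries the same value.
    scored_at = datetime.now(timezone.utc).isoformat()
    out = pd.DataFrame(
        {
            "customer_id": list(customer_ids),
            "churn_score": [float(p) for p in probabilities],
        }
    )
    # A row is labeled positive when its score meets or exceeds the threshold.
    out["predicted_label"] = (out["churn_score"] >= threshold).astype(int)
    out["model_version"] = model_version
    out["scoring_timestamp"] = scored_at
    return out


scored = postprocess_scores(["C001", "C002", "C003"], [0.12, 0.50, 0.91], "demo-0.1")
print(scored[["customer_id", "churn_score", "predicted_label"]])
```

Stamping the timestamp once per run, rather than once per row, keeps a batch identifiable as a single unit downstream.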

Boundary. This repository is scoring-only: the trained joblib model is supplied by your training project; preprocessing here must stay aligned with that artifact.

Workflow

End-to-end path for one batch file:

CSV in → Validate → Preprocess → Score → Scored CSV out

Orchestrated by run_batch_scoring(...); CLI entrypoint wraps the same code path with non-zero exit on failure.
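
One way to get the non-zero-exit behavior is a thin wrapper that maps any exception to a failing return code. The sketch below uses only the standard library; `run_batch_scoring_stub` is a hypothetical stand-in for the real pipeline function, whose actual signature is not shown on this page.

```python
import sys


def run_batch_scoring_stub(input_path, output_path):
    """Hypothetical stand-in for the real pipeline; it only checks its input name."""
    if not input_path.endswith(".csv"):
        raise ValueError(f"expected a .csv input, got {input_path!r}")
    return output_path


def main(argv):
    """Return 0 on success and 1 on any failure, so schedulers can detect errors."""
    try:
        run_batch_scoring_stub(argv[0], argv[1])
    except Exception as exc:  # broad on purpose: any failure must fail the job
        print(f"batch scoring failed: {exc}", file=sys.stderr)
        return 1
    return 0


ok_code = main(["batch.csv", "scored.csv"])
bad_code = main(["batch.txt", "scored.csv"])
print(ok_code, bad_code)
```

In a real entry point the return value would typically be passed to `sys.exit(main(sys.argv[1:]))`, which is what lets cron or a CI runner mark the job as failed.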

1. Load batch. Read Telco-style churn columns; fail fast on missing columns or parse issues.

2. Validate. Input checks run before any model call, which reduces silent garbage-in scores.

3. Preprocess. Training-aligned transforms build the feature matrix passed to the pipeline.

4. Score. The joblib pipeline is loaded once per run; predict_proba is called in batch for all rows.

5. Postprocess & save. Output columns include customer_id, churn_score, predicted_label, model_version, scoring_timestamp; an optional JSON manifest is written next to the CSV.
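
The score step (load the artifact once, then call predict_proba on the whole batch) can be illustrated with a toy model. This is a sketch under stated assumptions: the two-feature logistic regression, the random data, and the file name `model.joblib` are invented for the example and stand in for whatever artifact the training project supplies.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the artifact a training project would supply.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
pipeline = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipeline.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.joblib")
    joblib.dump(pipeline, model_path)

    # Load the artifact once per run, then score the whole batch in one call.
    model = joblib.load(model_path)
    X_batch = rng.normal(size=(5, 2))
    proba = model.predict_proba(X_batch)

# Column 1 holds the probability of the positive class (classes_ is [0, 1]).
churn_scores = proba[:, 1]
print(churn_scores.shape)
```

Because the scaler travels inside the serialized pipeline, the batch job applies the same fitted transform as training without re-deriving it.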

Features

Input validation

Schema and sanity checks before scoring; failures surface as CLI/API errors instead of partial outputs.
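
A validation gate of this kind might look like the following sketch, assuming pandas. The required columns and the range rules are hypothetical placeholders; the real schema is whatever the pilot agrees on.

```python
import pandas as pd

# Hypothetical schema: column names and range rules are illustrative only.
REQUIRED_COLUMNS = ["customer_id", "tenure", "MonthlyCharges"]


def validate_batch(df):
    """Return a list of human-readable problems; an empty list means the batch passes."""
    problems = []
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        # Without the expected columns the remaining checks cannot run.
        return [f"missing columns: {missing}"]
    if df["customer_id"].duplicated().any():
        problems.append("duplicate customer_id values")
    tenure = pd.to_numeric(df["tenure"], errors="coerce")
    if tenure.isna().any() or (tenure < 0).any():
        problems.append("tenure must be a non-negative number")
    charges = pd.to_numeric(df["MonthlyCharges"], errors="coerce")
    if charges.isna().any() or (charges < 0).any():
        problems.append("MonthlyCharges must be a non-negative number")
    return problems


good = pd.DataFrame({"customer_id": ["a", "b"], "tenure": [1, 24], "MonthlyCharges": [29.9, 70.0]})
bad = pd.DataFrame({"customer_id": ["a", "a"], "tenure": [1, -3], "MonthlyCharges": [29.9, "n/a"]})
print(validate_batch(good))
print(validate_batch(bad))
```

Collecting all problems before failing, instead of raising on the first one, gives the operator a complete picture of a bad extract in a single run.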

Training alignment

Preprocessing module is the contract between training-time features and batch inference.

Versioned outputs

Model version string and timestamp on every row; optional run manifest records threshold and paths.
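
As an illustration of what such a run manifest could contain, here is a standard-library sketch. The helper name `write_manifest`, the field names, and the `.manifest.json` suffix are assumptions; the page only states that the manifest records the threshold and paths.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def write_manifest(output_csv, input_csv, model_path, model_version, threshold, row_count):
    """Write a JSON manifest next to the scored CSV and return its path.

    Hypothetical helper: field names and file naming are assumptions.
    """
    manifest = {
        "input_path": input_csv,
        "output_path": output_csv,
        "model_path": model_path,
        "model_version": model_version,
        "threshold": threshold,
        "row_count": row_count,
        "scored_at_utc": datetime.now(timezone.utc).isoformat(),
    }
    manifest_path = os.path.splitext(output_csv)[0] + ".manifest.json"
    with open(manifest_path, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2)
    return manifest_path


with tempfile.TemporaryDirectory() as tmp:
    out_csv = os.path.join(tmp, "scored.csv")
    path = write_manifest(out_csv, "batch.csv", "model.joblib", "demo-0.1", 0.5, 3)
    with open(path, encoding="utf-8") as fh:
        loaded = json.load(fh)
print(loaded["threshold"], loaded["row_count"])
```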

CLI + exit codes

python -m src.pipeline.run_batch_scoring with flags for input, output, threshold, model path, and verbosity.
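
A parser covering those five options could be sketched with argparse as follows. The flag spellings and defaults here are hypothetical; the repository README is the authority on the real ones.

```python
import argparse


def build_parser():
    """Hypothetical flag names and defaults; check the repository README for the real ones."""
    parser = argparse.ArgumentParser(prog="run_batch_scoring")
    parser.add_argument("--input", required=True, help="path to the batch CSV")
    parser.add_argument("--output", required=True, help="path for the scored CSV")
    parser.add_argument("--threshold", type=float, default=0.5, help="label cutoff")
    parser.add_argument("--model-path", default="model.joblib", help="joblib artifact")
    parser.add_argument("-v", "--verbose", action="store_true", help="more logging")
    return parser


args = build_parser().parse_args(
    ["--input", "batch.csv", "--output", "scored.csv", "--threshold", "0.4"]
)
print(args.input, args.output, args.threshold, args.model_path, args.verbose)
```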

Scoring service demo (optional)

Small HTTP surface on the scoring host: upload CSV, JSON summary + preview, download scored file—same run_batch_scoring path as the CLI, not a separate dashboard product.

Docker image

Single Dockerfile: dependencies, app code, fixtures for demo model; suitable for VPS-style deploy behind a reverse proxy.

Tech stack

Core: Python 3.10+, pandas, scikit-learn, joblib, PyYAML
Quality: pytest; committed E2E over fixture CSV + small pipeline artifact
Optional HTTP: FastAPI + Uvicorn expose OpenAPI and batch scoring routes; Jinja serves a minimal upload/preview helper (not a Streamlit-style app).
Ops: Dockerfile (slim Python base); compose/orchestration left to your environment

Deployment & live interface

The canonical live scoring service is https://score.vahdetkaratas.com/: batch HTTP routes (JSON + CSV), OpenAPI at /docs, and a minimal browser helper for upload → preview → download on the same host. This is a service / API demo, not a standalone analytics dashboard.

Batch scoring itself does not require a browser: cron, Airflow, GitHub Actions, or a simple VM running the CLI against mounted input/output directories is the primary shape.

Repository layout and run commands are in README.md; architecture map in docs/ARCHITECTURE.md.

Limitations

Not a model training repo; not a feature store; not a monitoring product. No built-in drift detection, retraining loop, or multi-tenant auth—those belong upstream or around this job.

Throughput and latency are batch-oriented: large files should be sized with worker memory and wall-clock expectations in mind.
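
One common way to bound peak memory on large files is to read and score the CSV in fixed-size chunks. The sketch below assumes pandas and uses a placeholder scoring rule instead of a model call; whether the repository itself chunks its input is not stated on this page.

```python
import io

import pandas as pd

# Small in-memory CSV standing in for a large extract.
csv_text = "customer_id,tenure\n" + "\n".join(f"C{i},{i}" for i in range(10))


def score_in_chunks(source, chunk_rows):
    """Score a CSV in fixed-size chunks so only one chunk is parsed at a time.

    The scoring rule below is a placeholder, not a real model call.
    """
    parts = []
    for chunk in pd.read_csv(source, chunksize=chunk_rows):
        chunk = chunk.copy()
        chunk["churn_score"] = 1.0 / (1.0 + chunk["tenure"])  # placeholder score
        parts.append(chunk[["customer_id", "churn_score"]])
    return pd.concat(parts, ignore_index=True)


scored = score_in_chunks(io.StringIO(csv_text), chunk_rows=4)
print(len(scored))
```

For files too large to hold even the scored output in memory, each chunk could instead be appended to the output CSV as it is produced.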

Why this project

Shows a complete file-in → validate → score → file-out contract with tests that exercise the real path, plus a container story for how the same code ships. Useful when interviewers or clients ask for evidence you can own the boring, valuable half of ML ops: repeatable batch outputs with clear boundaries.

Vahdettin Karatas
