JSON in Machine Learning: Model Config, Training Data, and Inference

Last updated: May 19, 2025

JSON is the dominant format for machine learning configuration: Hugging Face config.json stores model architecture, tokenizer_config.json stores tokenizer settings, and training hyperparameters are commonly exported to JSON for reproducibility. JSONL (JSON Lines) is the standard format for ML training datasets. Each line is a self-contained JSON object: {"messages": [...]} for OpenAI chat fine-tuning, or {"prompt": "...", "completion": "..."} for the legacy completions format and many open-source instruction-tuning datasets. TensorFlow SavedModel exports a saved_model.pb plus a variables/ directory; its serving signature lives inside the protobuf but is often written out as JSON: {"serving_default": {"inputs": {...}, "outputs": {...}}}. This guide covers Hugging Face JSON config files, JSONL training data format, ML experiment tracking with JSON, ONNX JSON operator schemas, inference API JSON request/response, ML pipeline JSON configuration, and feature store JSON schemas. Whether you are preparing fine-tuning datasets, debugging model loading errors, or building reproducible training pipelines, JSON is the connective tissue of modern ML systems. Use Jsonic's JSON formatter to inspect and validate any config or training data file during development.

Hugging Face JSON Config Files

Every Hugging Face Transformers model repository contains a config.json that describes the model architecture. The model_type field (e.g., "bert", "gpt2", "llama") determines which configuration and model classes the Auto* loaders select. hidden_size sets the embedding dimension (768 for BERT-base, 4096 for LLaMA-2-7B), num_attention_heads sets the number of attention heads per layer, and num_hidden_layers sets the transformer depth. The architectures array records the name of the model class the checkpoint was saved with (for example, "LlamaForCausalLM"); it is a bare class name, not a fully qualified import path.

// Hugging Face config.json — LLaMA-2-7B structure
{
  "model_type": "llama",
  "architectures": ["LlamaForCausalLM"],
  "hidden_size": 4096,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "vocab_size": 32000,
  "hidden_act": "silu",
  "rms_norm_eps": 1e-5,
  "rope_theta": 10000.0,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0"
}

// tokenizer_config.json — stores tokenizer class and special tokens
{
  "tokenizer_class": "LlamaTokenizer",
  "bos_token": "<s>",
  "eos_token": "</s>",
  "unk_token": "<unk>",
  "pad_token": null,
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_model": "llama",
  "add_bos_token": true,
  "add_eos_token": false
}

# Load model using config.json automatically (Python)
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
print(config.hidden_size)        # 4096
print(config.num_hidden_layers)  # 32

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype="auto",   # reads torch_dtype from config.json
    device_map="auto",
)

Hugging Face Trainer saves its arguments alongside each checkpoint as training_args.bin, a pickled TrainingArguments object rather than JSON. To keep a readable, diffable record of every hyperparameter (learning rate, batch size, warmup steps, evaluation strategy, and optimizer settings), export it yourself with training_args.to_json_string() and write the result to training_args.json. One way to reload such a file is HfArgumentParser(TrainingArguments).parse_json_file("training_args.json"), though a full export may need light cleanup before it round-trips. Keep these config files in version control alongside model checkpoints to ensure full reproducibility. See JSON config management for strategies to version and diff ML config files across experiments.
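The architecture fields in config.json constrain one another, so a quick consistency check can catch a hand-edited config before a model load fails. The sketch below uses only the standard library and a trimmed copy of the LLaMA-2-7B values shown above. The check_config helper is hypothetical, and it assumes the head dimension equals hidden_size / num_attention_heads, which holds for LLaMA-style configs but not for every model family.

```python
import json

# Trimmed copy of the LLaMA-2-7B config from this section, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
  "model_type": "llama",
  "hidden_size": 4096,
  "intermediate_size": 11008,
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "vocab_size": 32000
}
"""

REQUIRED_KEYS = ("model_type", "hidden_size", "num_attention_heads",
                 "num_hidden_layers", "vocab_size")


def check_config(config: dict) -> dict:
    """Return derived values, raising ValueError on missing or inconsistent fields."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing keys: {missing}")

    hidden, heads = config["hidden_size"], config["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")

    # Grouped-query attention: several query heads share one key/value head.
    kv_heads = config.get("num_key_value_heads", heads)
    if heads % kv_heads != 0:
        raise ValueError("num_attention_heads must be divisible by num_key_value_heads")

    return {
        "head_dim": hidden // heads,
        "queries_per_kv_head": heads // kv_heads,
        "embedding_params": config["vocab_size"] * hidden,
    }


summary = check_config(json.loads(CONFIG_TEXT))
print(summary)
```

For a real checkpoint, replace CONFIG_TEXT with the contents of the downloaded config.json; the derived values are useful when estimating memory or comparing two model variants.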

JSONL Training Data Formats

JSONL (JSON Lines) is the de facto standard for ML training datasets. Each line is a complete JSON object, enabling line-by-line streaming without loading the full dataset into memory, parallel processing by partitioning file byte ranges, and incremental appends without rewriting the file. For OpenAI fine-tuning, every line must follow the chat completion format with a messages array. For instruction tuning with open-source models (LLaMA, Mistral), the {"prompt": "...", "completion": "..."} format remains common. The validation split is typically a separate validation.jsonl file with 10-20% of training examples.
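Because each line stands alone, a train/validation split can be computed in a single streaming pass. The following is a minimal sketch under stated assumptions: split_jsonl is a hypothetical helper, the 10% default is just one point in the range mentioned above, and the bucket is derived from a SHA-256 hash of the canonical JSON so that an example always lands in the same split across runs.

```python
import hashlib
import io
import json
from typing import Iterable, Iterator, Tuple


def split_jsonl(lines: Iterable[str], val_percent: int = 10) -> Iterator[Tuple[str, dict]]:
    """Yield ("train" or "validation", example) for each non-blank JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)
        # Hash the canonical JSON so the assignment ignores key order and whitespace.
        canonical = json.dumps(example, sort_keys=True, ensure_ascii=False)
        bucket = int(hashlib.sha256(canonical.encode("utf-8")).hexdigest(), 16) % 100
        yield ("validation" if bucket < val_percent else "train"), example


# In-memory stand-in for open("training_data.jsonl", encoding="utf-8").
sample = io.StringIO(
    '{"prompt": "Translate to French: Hello", "completion": "Bonjour"}\n'
    '\n'
    '{"prompt": "Translate to French: Thanks", "completion": "Merci"}\n'
)
assignments = list(split_jsonl(sample))
counts = {"train": 0, "validation": 0}
for split_name, _ in assignments:
    counts[split_name] += 1
print(counts)
```

Only one line is held in memory at a time, so the same function works on files far larger than RAM. With very small datasets the realized validation share can differ noticeably from val_percent.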

# OpenAI chat fine-tuning format (one JSON object per line)
{"messages": [{"role": "system", "content": "You are a concise SQL expert."}, {"role": "user", "content": "List all users created after 2024."}, {"role": "assistant", "content": "SELECT * FROM users WHERE created_at > '2024-01-01';"}]}
{"messages": [{"role": "system", "content": "You are a concise SQL expert."}, {"role": "user", "content": "Count orders per customer."}, {"role": "assistant", "content": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;"}]}

# Instruction tuning format (common for open-source models)
{"prompt": "Translate to French: Hello, world!", "completion": "Bonjour, le monde!"}
{"prompt": "Summarize in one sentence: [long article text]", "completion": "The article discusses..."}

# Validate JSONL with Python
import json

import tiktoken  # third-party: pip install tiktoken

# Per-example limit; 4096 matches older gpt-3.5-turbo models, so adjust for your target model
MAX_TOKENS = 4096

def validate_openai_jsonl(filepath: str) -> dict:
    errors = []
    token_counts = []
    total_examples = 0
    enc = tiktoken.get_encoding("cl100k_base")

    with open(filepath, encoding="utf-8") as f:
        for i, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # blank lines are skipped and not counted
            total_examples += 1
            try:
                example = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"Line {i}: invalid JSON: {e}")
                continue

            # Check required structure
            messages = example.get("messages") if isinstance(example, dict) else None
            if not isinstance(messages, list) or not messages:
                errors.append(f"Line {i}: missing or empty 'messages' array")
                continue
            if not all(isinstance(m, dict) for m in messages):
                errors.append(f"Line {i}: every message must be a JSON object")
                continue
            roles = [m.get("role") for m in messages]
            if roles[-1] != "assistant":
                errors.append(f"Line {i}: last message must be 'assistant', got '{roles[-1]}'")

            # Approximate token count (content only; ignores per-message formatting overhead)
            tokens = sum(len(enc.encode(str(m.get("content") or ""))) for m in messages)
            token_counts.append(tokens)
            if tokens > MAX_TOKENS:
                errors.append(f"Line {i}: {tokens} tokens exceeds {MAX_TOKENS} limit")

    return {
        "errors": errors,
        "total_examples": total_examples,
        "avg_tokens": sum(token_counts) / len(token_counts) if token_counts else 0,
        "max_tokens": max(token_counts) if token_counts else 0,
    }

result = validate_openai_jsonl("training_data.jsonl")
print(result)

Data quality checks to run before submitting a JSONL file for fine-tuning: verify there are no duplicate examples (hash each line), check for empty assistant turns, confirm the character encoding is UTF-8, and flag examples where the assistant response is shorter than 10 tokens (likely truncated). OpenAI requires at least 10 training examples and recommends 50–100 for meaningful fine-tuning; holding out roughly 10% as a validation set is a common choice. See JSON data validation for schema validation patterns applicable to JSONL pipelines.
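The quality checks listed above can be sketched with the standard library alone. This is an illustrative helper, not an official tool: quality_report and its min_words threshold are hypothetical, and a whitespace word count stands in for a real token count, so it is only a rough proxy for the "shorter than 10 tokens" rule. Input lines are assumed to be already decoded as UTF-8.

```python
import hashlib
import json
from typing import Iterable


def quality_report(lines: Iterable[str], min_words: int = 3) -> dict:
    """Flag duplicate examples, empty assistant turns, and very short assistant replies."""
    seen = set()
    report = {"duplicates": [], "empty_assistant": [], "short_assistant": []}

    for line_no, line in enumerate(lines, 1):
        line = line.strip()
        if not line:
            continue
        example = json.loads(line)

        # Hash the canonical form so reordered keys still count as duplicates.
        digest = hashlib.sha256(
            json.dumps(example, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if digest in seen:
            report["duplicates"].append(line_no)
            continue
        seen.add(digest)

        for message in example.get("messages", []):
            if message.get("role") != "assistant":
                continue
            content = (message.get("content") or "").strip()
            if not content:
                report["empty_assistant"].append(line_no)
            elif len(content.split()) < min_words:
                # Word count is a rough stand-in for a tokenizer-based length check.
                report["short_assistant"].append(line_no)

    return report


sample_lines = [
    '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello there, how can I help?"}]}',
    '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello there, how can I help?"}]}',
    '{"messages": [{"role": "user", "content": "Ping"}, {"role": "assistant", "content": ""}]}',
    '{"messages": [{"role": "user", "content": "Ping"}, {"role": "assistant", "content": "Pong"}]}',
]
report = quality_report(sample_lines)
print(report)
```

The report holds line numbers rather than the examples themselves, which keeps memory use flat and makes it easy to open the offending lines in an editor.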

ML Experiment Tracking with JSON

Experiment tracking systems such as MLflow, Weights & Biases (W&B), and Neptune record every hyperparameter, metric, and artifact reference for a run and expose that record as JSON through their APIs and exports. This record is the single source of truth for reproducing any past experiment: given the run's JSON config, you can reconstruct the training environment, dataset version, and model checkpoint. The critical discipline is logging all configuration as JSON at run start (not just learning rate, but also dataset path, random seed, model architecture variant, and hardware configuration) so that any run can be replayed months later.

import mlflow
import json

# Log all hyperparameters as JSON at run start
config = {
    "lr": 0.001,
    "batch_size": 32,
    "epochs": 10,
    "optimizer": "adamw",
    "weight_decay": 0.01,
    "warmup_steps": 500,
    "model_type": "bert-base-uncased",
    "dataset": "imdb",
    "dataset_version": "1.0.0",
    "seed": 42,
}

with mlflow.start_run(run_name="bert-sentiment-v1") as run:
    # Log all params at once (MLflow stores each value as a string)
    mlflow.log_params(config)

    # Save full config as artifact for exact reproducibility
    with open("run_config.json", "w") as f:
        json.dump(config, f, indent=2)
    mlflow.log_artifact("run_config.json")

    for epoch in range(config["epochs"]):
        # train_epoch is a placeholder for your own training and evaluation step
        train_loss, val_accuracy = train_epoch(epoch)
        mlflow.log_metrics({
            "train_loss": train_loss,
            "val_accuracy": val_accuracy,
        }, step=epoch)

    print(f"Run ID: {run.info.run_id}")

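# Reproduce any past run from its logged params
run = mlflow.get_run("abc123def456")  # placeholder run ID
params = run.data.params  # MLflow returns every param value as a string

def parse_param(value: str):
    # Recover numbers; leave other strings (e.g. "1.0.0", "adamw") unchanged
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return value

reproduced_config = {k: parse_param(v) for k, v in params.items()}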

# Weights & Biases JSON config pattern
import wandb

wandb.init(
    project="bert-sentiment",
    config={
        "lr": 0.001,
        "batch_size": 32,
        "architecture": "BERT",
    }
)
# Access config anywhere in training
lr = wandb.config.lr

Best practices for JSON-based experiment tracking: use nested JSON for related parameters ({"optimizer": {"type": "adamw", "lr": 0.001, "weight_decay": 0.01}}) rather than flat keys with underscores; log the Git commit hash as a parameter for code reproducibility; save the full environment alongside model checkpoints, for example the output of pip list --format=json or conda env export --json. Compare experiment configs with a JSON diff tool to understand exactly what changed between two runs. Use Jsonic's JSON formatter to diff experiment config files visually.
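Nested configs and config diffs fit together: flattening a nested config into dotted keys gives a flat key-value dict of the kind parameter loggers accept, and comparing two flattened configs shows what changed between runs. The sketch below is a minimal illustration with hypothetical helper names (flatten, diff_configs); it treats an absent key and a key set to None as the same thing, which is a simplification.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted keys, e.g. {"optimizer.lr": 0.001}."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(old: dict, new: dict) -> dict:
    """Map each changed dotted key to an (old_value, new_value) pair."""
    a, b = flatten(old), flatten(new)
    changes = {}
    for key in sorted(a.keys() | b.keys()):
        # A key missing on one side shows up as None on that side.
        if a.get(key) != b.get(key):
            changes[key] = (a.get(key), b.get(key))
    return changes


run_a = {"optimizer": {"type": "adamw", "lr": 0.001, "weight_decay": 0.01}, "seed": 42}
run_b = {
    "optimizer": {"type": "adamw", "lr": 0.0005, "weight_decay": 0.01},
    "seed": 42,
    "warmup_steps": 500,
}

flat_a = flatten(run_a)
changes = diff_configs(run_a, run_b)
print(flat_a)
print(changes)
```

Keeping the nested form in the saved JSON artifact and the flattened form in the tracker gives both readable files and filterable parameters.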

ONNX JSON Operator Schemas

ONNX (Open Neural Network Exchange) provides a portable ML model format that enables running models trained in PyTorch or TensorFlow on any ONNX-compatible runtime (ONNX Runtime, TensorRT, CoreML). The model binary is a Protocol Buffer file, but every operator node and tensor specification has a JSON-readable representation accessible via the onnx Python library or the Netron visualization tool. Understanding the JSON structure of ONNX graphs is essential for debugging export errors, manually editing operator attributes, and building custom optimization passes.

import onnx
import json

# Load and inspect ONNX model as JSON-readable structure
model = onnx.load("model.onnx")

# Model metadata — JSON-serializable
metadata = {
    "ir_version": model.ir_version,
    "opset_version": model.opset_import[0].version,
    "graph_name": model.graph.name,
    "num_nodes": len(model.graph.node),
    "inputs": [{"name": i.name, "dtype": i.type.tensor_type.elem_type}
               for i in model.graph.input],
    "outputs": [{"name": o.name, "dtype": o.type.tensor_type.elem_type}
                for o in model.graph.output],
}

# Individual operator node — JSON-like representation
def node_to_dict(node) -> dict:
    return {
        "op_type": node.op_type,
        "name": node.name,
        "inputs": list(node.input),
        "outputs": list(node.output),
        "attributes": {
            attr.name: onnx.helper.get_attribute_value(attr)
            for attr in node.attribute
        },
    }

# Dump all nodes as JSON for analysis
graph_json = {
    "metadata": metadata,
    "nodes": [node_to_dict(n) for n in model.graph.node],
}
with open("model_graph.json", "w") as f:
    json.dump(graph_json, f, indent=2, default=str)

# Export PyTorch model to ONNX
import torch
import torch.onnx

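model_pt = MyModel()  # placeholder: your own torch.nn.Module
model_pt.eval()       # export in inference mode (disables dropout, uses stored batch-norm statistics)
dummy_input = torch.randn(1, 3, 224, 224)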

torch.onnx.export(
    model_pt,
    dummy_input,
    "model.onnx",
    export_params=True,
    opset_version=17,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
)

# TensorFlow SavedModel signature
# saved_model_cli show --dir ./saved_model --all prints the signature as plain text;
# the same information expressed as JSON:
# {
#   "serving_default": {
#     "inputs": {"input_ids": {"dtype": "DT_INT32", "shape": [-1, 512]}},
#     "outputs": {"logits": {"dtype": "DT_FLOAT", "shape": [-1, 2]}}
#   }
# }

The Netron viewer (available at netron.app) renders ONNX, PyTorch, TensorFlow, and Core ML models as an interactive graph in the browser. Each node panel shows the operator type, attribute values, and input/output tensor shapes, the same data accessible via the onnx Python API. To look up an operator's specification, call onnx.defs.get_schema("Conv"): it returns an OpSchema object (not a JSON document) whose inputs, outputs, attributes, and type_constraints properties describe the required and optional attributes and the allowed types, and these can be copied into a dict when a JSON view is needed.
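Once a graph has been dumped to JSON in the shape produced by node_to_dict above, ordinary JSON tooling is enough for a first pass at analysis. The sketch below is a hedged illustration: GRAPH_TEXT is a hand-written stand-in for model_graph.json (not output from a real model), and summarize_graph is a hypothetical helper that counts operator types and lists tensors that no node produces, which must therefore be graph inputs or stored weights.

```python
import json
from collections import Counter

# Hand-written stand-in for model_graph.json; tensor names are illustrative only.
GRAPH_TEXT = """
{
  "metadata": {"ir_version": 8, "opset_version": 17},
  "nodes": [
    {"op_type": "Conv", "name": "conv1", "inputs": ["input", "conv1.weight"],
     "outputs": ["conv1_out"], "attributes": {"kernel_shape": [3, 3]}},
    {"op_type": "Relu", "name": "relu1", "inputs": ["conv1_out"],
     "outputs": ["relu_out"], "attributes": {}},
    {"op_type": "Conv", "name": "conv2", "inputs": ["relu_out", "conv2.weight"],
     "outputs": ["output"], "attributes": {"kernel_shape": [1, 1]}}
  ]
}
"""


def summarize_graph(graph: dict) -> dict:
    """Count operators and find tensors consumed but never produced by a node."""
    nodes = graph.get("nodes", [])
    op_counts = Counter(node["op_type"] for node in nodes)
    produced = {name for node in nodes for name in node["outputs"]}
    consumed = {name for node in nodes for name in node["inputs"]}
    return {
        "op_counts": dict(op_counts),
        # Anything consumed but not produced is a graph input or an initializer.
        "external_inputs": sorted(consumed - produced),
    }


graph_summary = summarize_graph(json.loads(GRAPH_TEXT))
print(graph_summary)
```

An unexpected operator in op_counts, or a tensor in external_inputs that should have been produced by an upstream node, is often the first visible sign of an export problem.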

Inference API JSON Request/Response

Major hosted ML inference APIs, including the Hugging Face Inference API, OpenAI, Together AI, and Replicate, use JSON for both request and response bodies. Knowing the exact JSON schema for each API prevents silent failures, where a misspelled parameter key is ignored instead of raising an error and the request runs with default settings. The Hugging Face Inference API uses a task-specific JSON schema: text generation, token classification, and image classification each have different required fields. The OpenAI-compatible format (/v1/chat/completions) has become the de facto standard adopted by many inference providers.

// Hugging Face Inference API — text generation
const HF_TOKEN = process.env.HF_TOKEN

// Standard Inference API
const response = await fetch(
  "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${HF_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: "<s>[INST] What is the capital of France? [/INST]",
      parameters: {
        max_new_tokens: 100,
        temperature: 0.7,
        top_p: 0.95,
        return_full_text: false,
        do_sample: true,
      },
    }),
  }
)
// Response: [{"generated_text": "Paris is the capital of France..."}]

// Batch inference — multiple inputs in one request
const batchResponse = await fetch(
  "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english",
  {
    method: "POST",
    headers: { Authorization: `Bearer ${HF_TOKEN}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      inputs: ["I loved this movie!", "This was terrible.", "It was okay."],
    }),
  }
)
// Response: [[{"label": "POSITIVE", "score": 0.9998}], [{"label": "NEGATIVE", "score": 0.9997}], ...]

// OpenAI-compatible format (works with HF Inference Endpoints, Together AI, vLLM)
const openAICompatResponse = await fetch(
  "https://api-inference.huggingface.co/v1/chat/completions",
  {
    method: "POST",
    headers: { Authorization: `Bearer ${HF_TOKEN}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "meta-llama/Meta-Llama-3-8B-Instruct",
      messages: [
        { role: "user", content: "Explain JSON in one sentence." }
      ],
      max_tokens: 100,
      temperature: 0.5,
      stream: false,
    }),
  }
)
const data = await openAICompatResponse.json()
// data.choices[0].message.content — standard OpenAI response shape

Error responses from inference APIs are also JSON. Hugging Face returns {"error": "Model is currently loading", "estimated_time": 20} for cold-start delays; handle this with a retry loop rather than treating it as a fatal error. OpenAI returns {"error": {"message": "...", "type": "...", "code": "..."}}. Always check response.ok before treating the parsed body as a successful result, and handle 429 (rate limit), 503 (model loading), and 422 (invalid input) status codes explicitly. See JSON in AI prompts for OpenAI function calling and structured output patterns.
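The retry advice above can be sketched independently of any HTTP client. In this minimal Python example, call_with_retry is a hypothetical helper and send is an assumed callable that performs one request and returns a (status_code, parsed_json_body) pair. The 60-second cap and exponential fallback are illustrative choices, not values documented by any provider.

```python
import time
from typing import Callable, Tuple

# (status_code, parsed JSON body) as returned by whatever HTTP client is in use.
Response = Tuple[int, object]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call send() until it succeeds, retrying on 429 and 503 responses."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if 200 <= status < 300:
            return body
        if status in (429, 503) and attempt < max_attempts:
            # Prefer the server's cold-start hint; otherwise back off exponentially.
            wait = body.get("estimated_time") if isinstance(body, dict) else None
            if not isinstance(wait, (int, float)):
                wait = 2 ** attempt
            sleep(min(float(wait), 60.0))
            continue
        # 422 and other errors are not retried: the same request would fail again.
        raise RuntimeError(f"inference request failed with HTTP {status}: {body}")
    raise RuntimeError("max_attempts must be at least 1")


# Scripted responses stand in for a real endpoint: one cold start, then success.
scripted = iter([
    (503, {"error": "Model is currently loading", "estimated_time": 20}),
    (200, [{"generated_text": "Paris"}]),
])
waits = []
result = call_with_retry(lambda: next(scripted), sleep=waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the function testable without real delays; in production the default time.sleep applies.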

ML Pipeline JSON Configuration

Modern ML pipelines — data ingestion, preprocessing, training, evaluation, and deployment — are orchestrated using JSON or YAML configuration files. Apache Airflow uses JSON for DAG parameters and XCom (cross-task communication) payloads. Kubeflow Pipelines compiles Python pipeline definitions into a JSON spec file that describes each component, its Docker image, command arguments, and dependency edges. The advantage of JSON pipeline config is environment-agnostic portability: the same pipeline.json can be submitted to a local Kubeflow cluster or a cloud-hosted Vertex AI Pipelines service.

# Apache Airflow — parameterized training DAG with JSON config
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.models import Variable
import json
from datetime import datetime

# Store training config in Airflow Variable (JSON string)
# Variable.set("training_config", '{"lr": 0.001, "batch_size": 32, "epochs": 10}')

def train_model(**context):
    config_str = Variable.get("training_config", default_var='{}')
    config = json.loads(config_str)
    # Access run-specific params from DAG params
    dag_params = context["params"]
    effective_config = {**config, **dag_params}
    print(f"Training with config: {json.dumps(effective_config, indent=2)}")
    # ... training code

with DAG(
    dag_id="ml_training_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@weekly",  # named schedule_interval on Airflow versions before 2.4
    params={
        "dataset_version": "v2.1",
        "model_checkpoint": "bert-base-uncased",
        "experiment_name": "weekly-retrain",
    },
) as dag:
    train = PythonOperator(
        task_id="train_model",
        python_callable=train_model,
    )

# Kubeflow Pipeline spec (abridged pipeline.json excerpt; exact nesting varies by KFP SDK version)
# {
#   "pipelineInfo": {"name": "bert-training-pipeline"},
#   "deploymentSpec": {
#     "executors": {
#       "exec-preprocess": {
#         "container": {
#           "image": "gcr.io/myproject/preprocess:v1",
#           "command": ["python", "preprocess.py"],
#           "args": ["--input", "{{$.inputs.artifacts['dataset'].uri}}"]
#         }
#       },
#       "exec-train": {
#         "container": {
#           "image": "gcr.io/myproject/train:v1",
#           "args": ["--lr", "{{$.inputs.parameters['learning_rate']}}"]
#         }
#       }
#     }
#   },
#   "pipelineSpec": {
#     "root": {
#       "dag": {
#         "tasks": {
#           "preprocess": { "taskInfo": { "name": "preprocess" }, "inputs": {...} },
#           "train": {
#             "taskInfo": { "name": "train" },
#             "dependentTasks": ["preprocess"]
#           }
#         }
#       }
#     }
#   }
# }

Step dependency management in JSON pipeline configs: use an explicit dependentTasks or upstream_task_ids array rather than implicit ordering. This makes the DAG structure auditable from the JSON file alone without running the pipeline. For complex pipelines, validate the JSON pipeline spec against a JSON Schema before submission to catch missing required fields early. See JSON Schema patterns for writing validation schemas for custom pipeline config formats.
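Explicit dependency arrays make it possible to audit a pipeline spec offline. The sketch below assumes a tasks mapping shaped like the dag.tasks object in the excerpt above, where each task may carry a dependentTasks list. The execution_order helper is hypothetical and uses graphlib from the Python 3.9+ standard library to reject unknown references and cycles.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(tasks: dict) -> list:
    """Validate dependentTasks references and return one valid run order."""
    for name, task in tasks.items():
        for dep in task.get("dependentTasks", []):
            if dep not in tasks:
                raise ValueError(f"task '{name}' depends on unknown task '{dep}'")

    # Map each task to the set of tasks that must finish before it starts.
    graph = {name: set(task.get("dependentTasks", [])) for name, task in tasks.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle detected: {exc.args[1]}") from exc


# Illustrative tasks; "evaluate" is added here and is not part of the excerpt above.
tasks = {
    "preprocess": {"taskInfo": {"name": "preprocess"}},
    "train": {"taskInfo": {"name": "train"}, "dependentTasks": ["preprocess"]},
    "evaluate": {"taskInfo": {"name": "evaluate"}, "dependentTasks": ["train"]},
}
order = execution_order(tasks)
print(order)
```

Running a check like this in CI catches a misspelled task name or an accidental cycle before the spec reaches the orchestrator.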

Feature Store JSON Schemas

Feature stores such as Feast, Tecton, and Vertex AI Feature Store define feature schemas, entity keys, and retrieval configurations declaratively: Feast uses a Python DSL plus a YAML store configuration, and the definitions map directly onto JSON at the API level. A feature definition specifies the name, dtype, and description of each feature, along with the online and offline store configuration. These definitions are the contract between the data engineering team (who produces features) and the ML team (who consumes them): changing a feature's dtype or name is a breaking schema change that must be versioned. Feature retrieval requests are also JSON, specifying entity keys and the list of feature references to fetch.

# Feast feature definitions (Python DSL; feature_store.yaml holds the store configuration)
from feast import FeatureStore, Entity, FeatureView, Field, ValueType
from feast.types import Float32, Int64, String
from datetime import timedelta

# Entity definition: the join key used to look up features
user_entity = Entity(
    name="user_id",
    description="User identifier",
    value_type=ValueType.STRING,  # entities take a ValueType, not a feast.types dtype
)

# Feature view — defines the schema
user_features = FeatureView(
    name="user_activity_features",
    entities=[user_entity],
    ttl=timedelta(days=30),
    schema=[
        Field(name="purchase_count_7d", dtype=Int64),
        Field(name="avg_order_value", dtype=Float32),
        Field(name="last_category", dtype=String),
        Field(name="churn_risk_score", dtype=Float32),
    ],
    online=True,   # Enable online (low-latency) store
    source=bigquery_source,  # placeholder: a data source defined elsewhere
)

# Feature retrieval request — JSON body for REST API
feature_request = {
    "features": [
        "user_activity_features:purchase_count_7d",
        "user_activity_features:avg_order_value",
        "user_activity_features:churn_risk_score",
    ],
    "entities": {
        "user_id": ["user_001", "user_002", "user_003"]
    }
}

# Online retrieval via Python SDK (serializes to JSON internally)
store = FeatureStore(repo_path=".")
feature_vector = store.get_online_features(
    features=feature_request["features"],
    entity_rows=[{"user_id": uid} for uid in feature_request["entities"]["user_id"]],
    full_feature_names=True,  # prefix keys with the feature view name, as in the response below
).to_dict()

# Response shape:
# {
#   "user_id": ["user_001", "user_002", "user_003"],
#   "user_activity_features__purchase_count_7d": [5, 12, 1],
#   "user_activity_features__avg_order_value": [42.50, 89.99, 15.00],
#   "user_activity_features__churn_risk_score": [0.12, 0.05, 0.78]
# }

Feature store JSON schemas must be treated as versioned API contracts. Add an explicit version suffix to feature view names (user_activity_features_v2) when making breaking changes, such as renaming a feature or changing its dtype, to avoid breaking downstream model serving code. Store the JSON schema for each feature view in version control and use JSON data validation to confirm that feature retrieval responses match the expected schema before passing them to a model. Offline store exports are typically Parquet files, while metadata and statistics (mean, std, null count per feature) are often kept as JSON sidecar files alongside the Parquet partitions.
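Validating a retrieval response against the expected schema can be done with a few lines of standard-library Python. This sketch assumes the column-oriented response shape shown above (feature name mapped to a list of values). EXPECTED_SCHEMA and validate_feature_response are hypothetical names; treating None as an allowed missing value and accepting integers for float features are assumptions to adjust for your own contract.

```python
EXPECTED_SCHEMA = {
    "user_id": str,
    "user_activity_features__purchase_count_7d": int,
    "user_activity_features__avg_order_value": float,
    "user_activity_features__churn_risk_score": float,
}


def validate_feature_response(response: dict, schema: dict) -> list:
    """Return a list of human-readable schema violations (empty if valid)."""
    errors = [f"missing feature: {name}" for name in schema if name not in response]
    errors += [f"unexpected feature: {name}" for name in response if name not in schema]

    # Every column must describe the same number of entities.
    if len({len(values) for values in response.values()}) > 1:
        errors.append("feature arrays have different lengths")

    for name, expected in schema.items():
        # JSON has one number type, so an integer is accepted where a float is expected.
        allowed = (int, float) if expected is float else (expected,)
        for index, value in enumerate(response.get(name, [])):
            if value is None:
                continue  # assumed to mean "no value in the online store"
            # bool is rejected explicitly because it is a subclass of int.
            if isinstance(value, bool) or not isinstance(value, allowed):
                errors.append(
                    f"{name}[{index}]: expected {expected.__name__}, got {type(value).__name__}"
                )
    return errors


response = {
    "user_id": ["user_001", "user_002", "user_003"],
    "user_activity_features__purchase_count_7d": [5, 12, 1],
    "user_activity_features__avg_order_value": [42.50, 89.99, 15.00],
    "user_activity_features__churn_risk_score": [0.12, 0.05, 0.78],
}
problems = validate_feature_response(response, EXPECTED_SCHEMA)
print(problems)
```

Running this check at the serving boundary turns a silent dtype drift into an explicit error message that names the offending feature and row.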

Definitions

JSONL: JSON Lines — a text format where each line is a syntactically complete JSON object terminated by a newline character. Designed for streaming data processing, log aggregation, and ML training datasets. Unlike JSON arrays, JSONL files can be read line by line without loading the full file into memory, support concurrent writes (append-only), and handle datasets of arbitrary size. File extension is .jsonl or .ndjson (newline-delimited JSON).
Model config: A JSON file (typically config.json) that fully specifies a neural network's architecture — layer counts, hidden dimensions, activation functions, attention head counts, and vocabulary size. In Hugging Face Transformers, the model config is the sole input needed to instantiate the correct Python class and allocate model parameters before loading weights. Model configs are stored in model repositories and version-controlled alongside the model weights.
Hyperparameter: A configuration value set before training that controls the learning process rather than being learned from data. Examples include learning rate, batch size, number of epochs, optimizer type, dropout rate, and weight decay. Hyperparameters are logged as JSON at the start of each training run to enable reproducibility and comparison across experiments. Hyperparameter search (grid search, Bayesian optimization) generates and evaluates many JSON config objects systematically.
Experiment tracking: The practice of recording every training run's hyperparameters, metrics, artifact paths, and environment details as JSON for later comparison and reproduction. Tools include MLflow, Weights & Biases, Neptune, and Comet. Each run produces a JSON record that serves as the authoritative source for reproducing the exact training conditions. Experiment tracking is the foundation of reproducible ML research and production model governance.
ONNX: Open Neural Network Exchange — an open standard for representing ML models as a portable computation graph stored in Protocol Buffer binary format. The graph structure (operators, tensor shapes, attributes) has a JSON-readable representation accessible via the onnx Python library and the Netron visualization tool. ONNX enables exporting models from PyTorch or TensorFlow and deploying them in ONNX Runtime, TensorRT, or CoreML without framework dependencies.
Feature store: A centralized repository for ML features that provides consistent feature computation, storage (online for low-latency serving, offline for training), and retrieval via a JSON API. Feature definitions — entity keys, feature names, data types, and freshness requirements — are specified as JSON or JSON-like Python DSL. Feature retrieval requests are JSON objects specifying entity keys and feature references; responses are JSON objects mapping feature names to arrays of values.
Inference endpoint: An HTTP service that accepts JSON request bodies containing model inputs and returns JSON response bodies containing model predictions. The request schema varies by task type (text generation, classification, embeddings); for chat-style text generation, many providers have converged on the OpenAI /v1/chat/completions JSON format as a de facto standard. Inference endpoints abstract away hardware management, batching, and autoscaling, so the consumer interacts only with the JSON API.

Frequently asked questions

What is the Hugging Face config.json format?

The Hugging Face config.json is a JSON file stored in every model repository that defines the model architecture. Key fields include model_type (e.g., "bert", "gpt2", "llama"), hidden_size (embedding dimension), num_attention_heads, num_hidden_layers, and vocab_size. The architectures array lists the Python class name used to instantiate the model. This file is loaded by AutoConfig.from_pretrained() and AutoModel.from_pretrained() to reconstruct the exact model class and architecture without user intervention.

What is JSONL format for machine learning training data?

JSONL (JSON Lines) is a text format where each line is a complete, self-contained JSON object terminated by a newline. It is the standard format for ML training datasets because it supports streaming reads (one line at a time), parallel processing (each worker reads a slice of lines), and easy append operations without rewriting the entire file. For OpenAI fine-tuning, each line follows the chat format: {"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}.

How do I log ML experiments with JSON?

MLflow stores parameters as string key-value pairs and exposes runs as JSON through its REST API. Use mlflow.log_params({"lr": 0.001, "batch_size": 32, "epochs": 10}) to log hyperparameters and mlflow.log_metrics({"train_loss": 0.45, "val_accuracy": 0.92}, step=epoch) to log metrics per training step. Weights & Biases uses wandb.init(config={"lr": 0.001, "batch_size": 32}) and serializes the config for experiment reproducibility. Both platforms make runs queryable, which enables comparing hyperparameters across experiments and filtering runs by metric thresholds.

What is the ONNX JSON format?

ONNX models are stored in Protocol Buffer binary format, but the graph structure can be read as JSON-like objects using the onnx Python library or browsed visually in the Netron viewer. Each operator node exposes op_type (e.g., "Conv", "MatMul", "Relu"), input (list of tensor names), output (list of tensor names), and attribute (list of operator-specific parameters), which map naturally onto a JSON object. Use Netron to open any .onnx file and browse the graph.

How do I format JSON for OpenAI fine-tuning?

OpenAI fine-tuning requires JSONL format where every line is a JSON object with a messages array. Each message has role (system, user, or assistant) and content (string). A complete training example: {"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is the capital of France?"}, {"role": "assistant", "content": "Paris."}]}. The training file must have at least 10 examples; OpenAI recommends 50–100 for meaningful fine-tuning. The legacy CLI command openai tools fine_tunes.prepare_data was designed for the older prompt/completion format, so validate chat-format files with a script that checks the messages structure before uploading.

How do I configure an ML pipeline with JSON?

Apache Airflow DAGs can be configured with JSON through Variable.get() and the params system: DAG("training_pipeline", params={"learning_rate": 0.001, "dataset_version": "v2.1"}). Kubeflow Pipelines uses a JSON pipeline spec (pipeline.json) that defines component graph, input/output artifacts, and resource requirements. The spec includes pipelineInfo.name, deploymentSpec.executors (component Docker images and commands), and pipelineSpec.root (the DAG of tasks with dependency edges).

What JSON format does the Hugging Face Inference API use?

The Hugging Face Inference API accepts POST requests with a JSON body containing an inputs field. For text generation: {"inputs": "The capital of France is", "parameters": {"max_new_tokens": 50, "temperature": 0.7}}. The response is a JSON array — for text generation: [{"generated_text": "Paris, which is also..."}]. For the OpenAI-compatible endpoint, use the standard {"model": "...", "messages": [...], "max_tokens": 512} format and receive the standard OpenAI chat completion response shape.

How do I validate JSONL training data?

Validate JSONL training data by reading each line, parsing it with json.loads, and checking that required fields are present and correctly typed. For OpenAI fine-tuning, verify each line has a messages array, each message has role and content strings, and the last message role is "assistant". Check token counts using tiktoken: the original gpt-3.5-turbo fine-tuning limit was 4096 tokens per example and newer models allow more, so confirm the limit for your target model. The legacy CLI command openai tools fine_tunes.prepare_data -f data.jsonl targets the older prompt/completion format; for chat-format data, a custom script is the more reliable check.

Inspect and validate ML JSON config files visually

Paste any Hugging Face config.json, JSONL training data, or inference API response into Jsonic to pretty-print, navigate nested fields, and spot schema mismatches instantly — before they break your training pipeline.
