JSON vs YAML: Which Format to Choose for Configs, APIs, and Data

2 March, 2026 Backend

JSON and YAML are the two dominant serialisation formats in modern software development. You see JSON in REST APIs, configuration files, and data pipelines. You see YAML in Kubernetes manifests, CI/CD configs, and Ansible playbooks. Both serialise structured data, but they make radically different trade-offs. This article breaks down those trade-offs so you can choose the right tool for each job.

What Is JSON

JSON - JavaScript Object Notation - is defined by RFC 8259 (and ECMA-404). It is derived from JavaScript object literal syntax but is a language-independent format with its own specification. JSON is considerably stricter than JavaScript: keys must be double-quoted strings, and trailing commas, comments, and single-quoted strings are not allowed. Since ECMAScript 2019, every valid JSON text is also valid JavaScript; the reverse is not true.

JSON was designed by Douglas Crockford in the early 2000s with three explicit goals:

Simplicity - the entire grammar fits on a single page. There are exactly six value types: string, number, object, array, boolean (true/false), and null.
Interoperability - conforming parsers in different languages agree on the structure of any valid input.
Strict parsing - there is no optional syntax, no comments, no trailing commas. Either the input is valid JSON or it is not.

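These constraints make JSON predictable. For valid input, parsers agree on the result apart from a few edge cases that the RFC leaves to implementations, such as duplicate keys and very large numbers. That predictability is its primary strength.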
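As a small illustration of strict parsing, the sketch below feeds a few near-JSON inputs to Python's standard json module. The sample strings and the is_valid_json helper are made up for this example; only json.loads and json.JSONDecodeError come from the standard library.

```python
import json

# Hypothetical inputs: one valid document and four common "almost JSON" mistakes
candidates = {
    "valid": '{"name": "MyService", "port": 8080}',
    "trailing comma": '{"name": "MyService", "port": 8080,}',
    "comment": '{"port": 8080 /* default */}',
    "unquoted key": '{port: 8080}',
    "single quotes": "{'port': 8080}",
}


def is_valid_json(text: str) -> bool:
    """Return True if the standard library parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


results = {label: is_valid_json(text) for label, text in candidates.items()}

for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'}")
```

Only the first input is accepted; the other four raise a decode error instead of being silently repaired.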

What Is YAML

YAML stands for "YAML Ain't Markup Language" - a recursive acronym chosen to emphasise that YAML is about data, not document markup. YAML 1.2 (published in 2009) was designed as a superset of JSON: a valid JSON document is, with rare edge cases, also valid YAML 1.2.

YAML was designed with different goals:

Human readability - minimal punctuation, indentation-based structure, no mandatory quoting for simple strings.
Multi-document support - a single file can contain multiple documents separated by ---.
Comments - # starts a comment anywhere on a line.
Anchors and aliases - reuse values without repetition using &anchor and *alias.
Rich type system - dates, binary data, sets, ordered maps, and more are expressible natively.

The price for this richness is complexity. The YAML specification is far longer than JSON's, and the same unquoted value can resolve to different types depending on the YAML version and schema a parser implements.

Syntax Differences

Here is the same application configuration expressed in both formats.

YAML:

# Application configuration
app:
  name: MyService
  version: "2.1.0"
  debug: false

server:
  host: 0.0.0.0
  port: 8080
  timeout: 30

database:
  host: localhost
  port: 5432
  name: mydb
  pool:
    min: 2
    max: 20

features:
  - name: payments
    enabled: true
  - name: analytics
    enabled: false

logging:
  level: info
  format: json
  output: stdout

JSON:

{
  "app": {
    "name": "MyService",
    "version": "2.1.0",
    "debug": false
  },
  "server": {
    "host": "0.0.0.0",
    "port": 8080,
    "timeout": 30
  },
  "database": {
    "host": "localhost",
    "port": 5432,
    "name": "mydb",
    "pool": {
      "min": 2,
      "max": 20
    }
  },
  "features": [
    { "name": "payments", "enabled": true },
    { "name": "analytics", "enabled": false }
  ],
  "logging": {
    "level": "info",
    "format": "json",
    "output": "stdout"
  }
}

The YAML version is noticeably shorter. It has no curly braces, no commas, fewer quotation marks, and crucially it has a comment. The JSON version is more explicit about every delimiter, which is precisely what makes it machine-friendly.

Readability

YAML wins on readability for humans, but that advantage comes with nuances.

Comments

JSON has no comment syntax. The original specification excluded comments intentionally - Crockford argued that comments would be used for parser directives, breaking interoperability. In practice this causes real pain: you cannot annotate why a configuration value is set the way it is. Workarounds like "_comment" keys are a hack.

YAML supports # comments on any line:

database:
  pool:
    max: 20  # Tuned for c5.2xlarge; revisit if instance type changes

Multiline Strings

YAML provides two multiline string syntaxes:

# Literal block scalar - preserves newlines
message: |
  Dear user,
  Your account has been activated.
  Welcome aboard.

# Folded block scalar - folds newlines into spaces
description: >
  This is a long description that will be
  folded into a single line when parsed.

JSON requires explicit \n escaping:

{
  "message": "Dear user,\nYour account has been activated.\nWelcome aboard."
}
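One detail worth seeing concretely is the trailing newline. The Python sketch below decodes the JSON version with the standard json module and compares it with the values a YAML parser is expected to produce for the two block scalars above under default ("clip") chomping. The YAML values are written out by hand here rather than parsed, to keep the example free of third-party dependencies.

```python
import json

json_doc = '{"message": "Dear user,\\nYour account has been activated.\\nWelcome aboard."}'
json_message = json.loads(json_doc)["message"]

# Hand-written expected values for the YAML block scalars (not parsed here).
# Literal (|) keeps line breaks; folded (>) joins lines with spaces.
# Both keep exactly one trailing newline under the default chomping.
literal_message = "Dear user,\nYour account has been activated.\nWelcome aboard.\n"
folded_description = (
    "This is a long description that will be "
    "folded into a single line when parsed.\n"
)

# The YAML literal block differs from the JSON string only by the final newline
same_apart_from_newline = literal_message == json_message + "\n"
print(same_apart_from_newline)
```

If the trailing newline is unwanted, the |- and >- indicators strip it.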

Anchors and Aliases

YAML allows defining a value once and referencing it multiple times:

defaults: &defaults
  timeout: 30
  retries: 3
  log_level: info

production:
  <<: *defaults
  log_level: warn  # override just this key

staging:
  <<: *defaults

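This eliminates repetition in complex configurations. JSON has no equivalent mechanism. Note that the << merge key comes from the YAML 1.1 type repository rather than the YAML 1.2 core specification, so support varies between parsers.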
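Because JSON cannot express the reuse itself, the expansion usually happens in code before serialising. A minimal Python sketch, reusing the keys from the YAML above, builds the data that a parser with merge-key support would be expected to produce; the environments variable name is an assumption for this example.

```python
import json

defaults = {"timeout": 30, "retries": 3, "log_level": "info"}

# Spread the defaults first so that explicit keys override them,
# which is the effect the merge key has in the YAML example
environments = {
    "production": {**defaults, "log_level": "warn"},
    "staging": {**defaults},
}

print(json.dumps(environments, indent=2))
```

The resulting JSON repeats timeout and retries in both environments, which is exactly the duplication the anchors avoid in the YAML source.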

Strictness and Parsing

JSON: One Canonical Behaviour

RFC 8259 is short and leaves little room for interpretation. For valid JSON input, compliant parsers agree on the structure of the output. This is not an accident - it was a design goal. The main areas of permitted variation are duplicate keys (the spec says names within an object "SHOULD" be unique; in practice many parsers keep the last value, while others raise an error) and the range and precision of numbers, which the RFC leaves to implementations.
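The duplicate-key case is easy to observe with Python's standard json module. The sketch below shows the default last-value-wins behaviour and one way to opt into rejection through the object_pairs_hook parameter; the reject_duplicates helper is a name invented for this example.

```python
import json

doc = '{"port": 8080, "port": 9090}'

# Default behaviour in Python's json module: the last value wins
lenient = json.loads(doc)


def reject_duplicates(pairs):
    """object_pairs_hook that raises on repeated keys instead of overwriting."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

# Numbers: Python keeps integers exact, while a parser that stores every
# number as an IEEE 754 double would round this value
big = json.loads("9007199254740993")
as_double = int(float(big))

print(lenient, strict_error, big, as_double)
```

Both readings of the duplicate-key document are allowed by the RFC, which is why strictness here has to be an explicit choice in application code.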

YAML: Specification Complexity

YAML's specification history is more complex. YAML 1.1 (2004) and YAML 1.2 (2009) differ meaningfully. YAML 1.2 aligned YAML with JSON and removed several implicit type coercions, but as of 2024, the majority of widely used YAML parsers still implement YAML 1.1 behaviour by default:

PyYAML (Python) - YAML 1.1
go-yaml v2 (Go) - YAML 1.1
Ruby's Psych - built on libyaml, which implements YAML 1.1

Because implementations differ in which version and which parts of the specification they support, two widely used YAML parsers can produce different results for the same input.

Performance

JSON parsing is typically several times faster than YAML parsing across major runtimes; the rough figures below put the gap at about 4-8x. The reason is grammar complexity.

JSON's grammar is simple enough that a parser can process it in a single linear pass with minimal lookahead. V8 (Node.js, Chrome) has a hand-optimised JSON parser that processes hundreds of megabytes per second. CPython's json module is backed by a C extension for the same reason.

YAML's grammar requires tracking indentation levels, handling multiple string syntaxes, performing implicit type detection, and supporting anchors and aliases (which require building a node graph before resolving references). All of this adds overhead.

Rough benchmarks (parsing a 1 MB document):

Runtime	JSON parse time	YAML parse time
Node.js (V8)	~5 ms	~25-40 ms
Python (CPython)	~20 ms	~80-120 ms
PHP 8	~15 ms	~60-90 ms

These numbers vary with document complexity, but the ratio is consistent. For hot paths - API responses parsed thousands of times per second - JSON's performance advantage is significant.
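Figures like these are best re-measured on your own data. The following is a minimal timing sketch using only Python's standard library; the best_parse_time helper and the synthetic record shape are assumptions made for illustration, and only the JSON side is measured here.

```python
import json
import timeit


def best_parse_time(parse, text, repeat=5):
    """Return the fastest of `repeat` single-run timings, in seconds."""
    return min(timeit.repeat(lambda: parse(text), number=1, repeat=repeat))


# Synthetic document of roughly 1 MB; the record shape is arbitrary
records = [
    {"id": i, "name": f"item-{i}", "enabled": i % 2 == 0}
    for i in range(20000)
]
json_text = json.dumps(records)

json_seconds = best_parse_time(json.loads, json_text)
print(f"{len(json_text) / 1e6:.2f} MB of JSON parsed in {json_seconds * 1000:.1f} ms")
```

To compare formats, the same helper can be given a YAML loader (for example yaml.safe_load from PyYAML, if it is installed) together with a YAML rendering of the same records.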

Use Cases

Configuration Files - YAML Wins

When humans write and read config files, YAML's readability advantages are decisive. Comments, multiline strings, anchors, and less punctuation noise make YAML significantly better for files maintained by developers. This is why Kubernetes, Docker Compose, GitHub Actions, GitLab CI, Ansible, and most modern DevOps tooling chose YAML.

REST APIs - JSON Wins

REST APIs are machine-to-machine. Performance matters, parsing is deterministic, and no comments are needed. JSON is the universal choice. Every HTTP client library, every browser, every mobile SDK handles JSON natively.

Data Exchange Between Systems - JSON Wins

When systems exchange structured data, interoperability is paramount. JSON's strict specification and universal support make it the right choice. EDI replacement, webhooks, message queues - JSON dominates.

Kubernetes and Docker Compose - YAML

Kubernetes objects are almost universally written in YAML (JSON is technically valid but rarely written by hand). Docker Compose files are likewise written in YAML by convention. The repetitive, deeply nested nature of these configs can benefit from YAML's anchors and aliases.

OpenAPI and Swagger - Both

The OpenAPI specification supports both JSON and YAML. YAML is preferred for hand-authored specs (comments, readability). JSON is preferred when the spec is generated by tooling or consumed programmatically.

YAML Pitfalls

YAML's expressiveness creates categories of bugs that JSON's stricter grammar mostly rules out.

The Norway Problem

This is the most famous YAML pitfall. In YAML 1.1, the following unquoted values (along with their lowercase and capitalised forms) are parsed as booleans:

# Each value below is resolved to a boolean in YAML 1.1
a: YES    # true
b: NO     # false - also Norway's ISO country code
c: ON     # true
d: OFF    # false
e: TRUE   # true
f: FALSE  # false
g: Y      # true
h: N      # false

The Norway problem: a developer writes a list of country codes as YAML keys or values. The code NO (Norway's ISO 3166-1 alpha-2 code) is silently parsed as boolean false. The same issue affects YES (not a country code, but a common config value).

Real-world impact: configuration files that used country codes as YAML keys had silent data corruption. The fix is to quote the value:

countries:
  - "NO"
  - "YES"
  - "ON"
  - SE
  - DE

YAML 1.2 (2009) removed this behaviour - in the core schema, only true and false (plus their True/TRUE and False/FALSE spellings) are booleans. But most parsers still default to YAML 1.1.
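A quick way to find values at risk is to test them against the boolean pattern from the YAML 1.1 type repository. The Python sketch below re-implements that pattern with the standard re module for illustration; it is not a YAML parser, and the needs_quoting helper and the sample country list are invented for this example.

```python
import re

# Boolean pattern from the YAML 1.1 type repository (tag:yaml.org,2002:bool)
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def needs_quoting(scalar: str) -> bool:
    """True if a YAML 1.1 parser could resolve this unquoted value to a boolean."""
    return YAML11_BOOL.fullmatch(scalar) is not None


country_codes = ["DE", "NO", "SE", "FI"]
risky = [code for code in country_codes if needs_quoting(code)]
print(risky)
```

Some parsers implement only part of this list, so a check like this errs on the side of quoting too much rather than too little.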

Octal Interpretation

In YAML 1.1, an unquoted number starting with 0 followed by digits is interpreted as octal:

file_permissions: 0755  # Parsed as 493 (decimal), not 755
port: 0600              # Parsed as 384, not 600

Again, quoting fixes this, but the silent coercion is a trap:

file_permissions: "0755"  # String "0755" - correct
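The arithmetic behind those comments can be checked directly. This short Python sketch reproduces the base-8 reading with the built-in int function; it only mirrors the conversion and does not parse YAML.

```python
# YAML 1.1 reads an unquoted value such as 0755 as base 8
permissions = int("0755", 8)
mode_600 = int("0600", 8)

# YAML 1.2's core schema requires an explicit 0o prefix for octal,
# the same spelling Python uses for octal literals
explicit = 0o755

print(permissions, mode_600, explicit)
```

Under the YAML 1.2 core schema, an unquoted 0755 is read as the decimal integer 755 instead, so the same file can yield different numbers depending on the parser.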

Tabs vs Spaces

YAML explicitly forbids tabs for indentation. A file that looks perfectly aligned in an editor using tab characters will fail to parse. This is a deliberate spec decision, but editors that mix tabs and spaces silently produce invalid YAML.
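Because tabs and spaces look the same in many editors, a rough pre-commit check can catch the problem before a parser does. The sketch below is a simple line scan in Python, not a YAML parser; the lines_with_tab_indent helper and the sample text are assumptions for this example, and it may also flag tabs that legitimately start a line inside a block scalar.

```python
def lines_with_tab_indent(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything after the leading run of spaces and tabs
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


sample = "server:\n  host: localhost\n\tport: 8080\n"
bad_lines = lines_with_tab_indent(sample)
print(bad_lines)
```

Configuring the editor to insert spaces for YAML files avoids the issue at the source.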

Duplicate Keys - Silent Overwrite

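Most JSON parsers silently keep the last value for a duplicate key, and some reject the document. The YAML specification requires mapping keys to be unique, yet many YAML parsers also silently take the last value: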

server:
  port: 8080
  host: localhost
  port: 9090  # Silently overwrites 8080; no error

CVE-2013-0156 - YAML Arbitrary Code Execution

Ruby's YAML parsers of the time (Syck, and Psych's default load) deserialised arbitrary Ruby objects from YAML input. Attackers could craft YAML payloads that executed arbitrary code when parsed. In Rails, the XML parameter parser could be made to decode embedded YAML, so ordinary applications were exposed even if they never accepted YAML directly; this was assigned CVE-2013-0156.

The root cause: YAML's type system supports "tags" that instruct the parser to construct specific language objects. Tags such as !!python/object (PyYAML) and !ruby/object (Psych) are honoured by unrestricted loaders. When parsing untrusted input, this is a serious vulnerability. JSON has no equivalent mechanism - it only produces objects, arrays, strings, numbers, booleans, and null.

Never parse untrusted YAML with unrestricted object construction enabled.

Code Examples

PHP

<?php

declare(strict_types=1);

// symfony/yaml is loaded through Composer's autoloader in a real project
use Symfony\Component\Yaml\Yaml;

// JSON encoding and decoding (built into PHP)
$data = [
    'name' => 'MyService',
    'version' => '2.1.0',
    'debug' => false,
    'port' => 8080,
];

$json = json_encode($data, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR);
$fromJson = json_decode($json, associative: true, flags: JSON_THROW_ON_ERROR);

// YAML option 1: symfony/yaml (composer require symfony/yaml)
$yaml = Yaml::dump($data, indent: 2);
$fromYaml = Yaml::parse($yaml);

// YAML option 2: ext-yaml (PECL); yaml_parse() returns false on failure
$yamlExt = yaml_emit($data);
$fromYamlExt = yaml_parse($yamlExt);

Python

import json
import yaml  # pip install pyyaml

data = {
    'name': 'MyService',
    'version': '2.1.0',
    'debug': False,
    'port': 8080,
}

# JSON
json_str = json.dumps(data, indent=2)
decoded_json = json.loads(json_str)

# YAML - note: PyYAML uses YAML 1.1 by default
yaml_str = yaml.dump(data, default_flow_style=False)
decoded_yaml = yaml.safe_load(yaml_str)  # Always use safe_load for untrusted input

# For YAML 1.2 compliance, use ruamel.yaml:
# from ruamel.yaml import YAML
# yml = YAML()
# yml.version = (1, 2)

JavaScript

// JSON - native browser and Node.js support
const data = {
  name: 'MyService',
  version: '2.1.0',
  debug: false,
  port: 8080,
};

const jsonStr = JSON.stringify(data, null, 2);
const decodedJson = JSON.parse(jsonStr);

// YAML - requires js-yaml (npm install js-yaml)
import yaml from 'js-yaml';

const yamlStr = yaml.dump(data);
const decodedYaml = yaml.load(yamlStr);

// For production use, prefer the schema option to avoid implicit type coercions:
const safeDecoded = yaml.load(yamlStr, { schema: yaml.JSON_SCHEMA });

When to Use Each

Criterion	JSON	YAML
Human-authored config files	Poor - no comments, verbose	Excellent
Machine-generated / consumed data	Excellent	Poor - slower, riskier
REST API responses	Excellent - universal support	Not used
Configuration with comments	Not possible	Native support
Kubernetes / Docker Compose	Technically valid, not idiomatic	Standard
OpenAPI specs (hand-authored)	Acceptable	Preferred
Webhooks and event payloads	Excellent	Rare
High-throughput parsing	Excellent	Avoid
Untrusted input	Safe	Use safe mode only
Multi-document files	Not supported	Native
Reusable config blocks (DRY)	Not supported	Anchors and aliases
Interoperability across languages	Excellent	Good (with caveats)

Decision Criteria

Choose JSON when:

The data is consumed by machines more than humans
Performance is important (high-frequency parsing)
You need maximum interoperability across languages and tools
The data is transmitted over HTTP (APIs, webhooks)
You need strict, predictable parsing behaviour

Choose YAML when:

Humans write and maintain the file
You need comments to document intent
The format is a well-established YAML context (Kubernetes, Docker Compose, GitHub Actions)
You have complex repeated structures that benefit from anchors
Multiline strings would improve readability

Avoid YAML when:

Parsing untrusted input without careful safe-mode configuration
Performance is a constraint
You need guaranteed identical behaviour across different parsers

The Practical Rule

JSON and YAML solve the same core problem - serialising structured data - but optimise for different consumers. JSON optimises for machines: strict, fast, universally supported. YAML optimises for humans: readable, flexible, expressive.

The practical rule is simple: use JSON for data that flows between systems, use YAML for files that developers maintain. When you need to work with both or convert between them, knowing the pitfalls (the Norway problem, octal interpretation, implicit type coercions) prevents subtle bugs.
