JSON vs YAML: Which Format to Choose for Configs, APIs, and Data
2 March, 2026 Backend
JSON and YAML are the two dominant serialisation formats in modern software development. You see JSON in REST APIs, configuration files, and data pipelines. You see YAML in Kubernetes manifests, CI/CD configs, and Ansible playbooks. Both serialise structured data, but they make radically different trade-offs. This article breaks down those trade-offs so you can choose the right tool for each job.
What Is JSON
JSON - JavaScript Object Notation - is defined by RFC 8259. It is derived from JavaScript object literal syntax but has its own, much stricter specification: keys must be double-quoted strings, and trailing commas and comments are not allowed. Despite its name, it was for years not quite a subset of JavaScript, because JSON strings may contain the unescaped characters U+2028 and U+2029, which JavaScript string literals rejected until ECMAScript 2019.
JSON was designed by Douglas Crockford in the early 2000s with three explicit goals:
- Simplicity - the entire grammar fits on a single page. There are exactly six value types: string, number, object, array, boolean (true/false), and null.
- Interoperability - any conforming parser in any language produces the same result for the same input.
- Strict parsing - there is no optional syntax, no comments, no trailing commas. Either the input is valid JSON or it is not.
These constraints make JSON predictable. Conforming parsers agree on valid input in all but a few edge cases, such as duplicate keys and numbers beyond double precision. That predictability is its primary strength.
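To make the strictness concrete, here is a minimal sketch using Python's standard json module. The sample inputs are invented for illustration; each relaxed variant that a JavaScript object literal would tolerate is rejected outright.

```python
import json

# Hypothetical inputs: one valid document and three "almost JSON" variants.
samples = {
    "valid": '{"port": 8080, "debug": false}',
    "trailing comma": '{"port": 8080,}',
    "comment": '{"port": 8080} // default port',
    "unquoted key": '{port: 8080}',
}


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {label: is_valid_json(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first sample is accepted. There is no lenient mode to fall back on, which is exactly the point.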
What Is YAML
YAML stands for "YAML Ain't Markup Language" - a recursive acronym chosen to emphasise that YAML is about data, not document markup. YAML 1.2 (published in 2009) is a strict superset of JSON: every valid JSON document is also valid YAML 1.2.
YAML was designed with different goals:
- Human readability - minimal punctuation, indentation-based structure, no mandatory quoting for simple strings.
- Multi-document support - a single file can contain multiple documents separated by ---.
- Comments - # starts a comment anywhere on a line.
- Anchors and aliases - reuse values without repetition using &anchor and *alias.
- Rich type system - dates, binary data, sets, ordered maps, and more are expressible natively.
The price for this richness is complexity. The YAML specification is many times longer than RFC 8259, and real-world parsers disagree on a number of edge cases.
Syntax Differences
Here is the same application configuration expressed in both formats.
YAML:
# Application configuration
app:
  name: MyService
  version: "2.1.0"
  debug: false
server:
  host: 0.0.0.0
  port: 8080
  timeout: 30
database:
  host: localhost
  port: 5432
  name: mydb
  pool:
    min: 2
    max: 20
features:
  - name: payments
    enabled: true
  - name: analytics
    enabled: false
logging:
  level: info
  format: json
  output: stdout
JSON:
{
  "app": {
    "name": "MyService",
    "version": "2.1.0",
    "debug": false
  },
  "server": {
    "host": "0.0.0.0",
    "port": 8080,
    "timeout": 30
  },
  "database": {
    "host": "localhost",
    "port": 5432,
    "name": "mydb",
    "pool": {
      "min": 2,
      "max": 20
    }
  },
  "features": [
    { "name": "payments", "enabled": true },
    { "name": "analytics", "enabled": false }
  ],
  "logging": {
    "level": "info",
    "format": "json",
    "output": "stdout"
  }
}
The YAML version is noticeably shorter. It has no curly braces, no commas, fewer quotation marks, and crucially it has a comment. The JSON version is more explicit about every delimiter, which is precisely what makes it machine-friendly.
Readability
YAML wins on readability for humans, but that advantage comes with nuances.
Comments
JSON has no comment syntax. The original specification excluded comments intentionally - Crockford argued that comments would be used for parser directives, breaking interoperability. In practice this causes real pain: you cannot annotate why a configuration value is set the way it is. Workarounds like "_comment" keys are a hack.
YAML supports # comments on any line:
database:
  pool:
    max: 20 # Tuned for c5.2xlarge; revisit if instance type changes
Multiline Strings
YAML provides two multiline string syntaxes:
# Literal block scalar - preserves newlines
message: |
  Dear user,
  Your account has been activated.
  Welcome aboard.
# Folded block scalar - folds newlines into spaces
description: >
  This is a long description that will be
  folded into a single line when parsed.
JSON requires explicit \n escaping:
{
  "message": "Dear user,\nYour account has been activated.\nWelcome aboard."
}
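As a rough illustration of what the two block scalars above turn into, the sketch below writes out the resulting strings by hand and encodes them with Python's standard json module. One detail worth noting: by default a block scalar keeps a single trailing newline, so the exact JSON equivalent of the literal block ends in \n (YAML's |- indicator strips it). No YAML parser is involved here; the string values are spelled out manually.

```python
import json

# What the literal block scalar (|) yields: newlines kept, plus one
# trailing newline (YAML's default "clip" chomping).
message = "Dear user,\nYour account has been activated.\nWelcome aboard.\n"

# What the folded block scalar (>) yields: interior line breaks become spaces.
description = (
    "This is a long description that will be "
    "folded into a single line when parsed.\n"
)

encoded = json.dumps({"message": message, "description": description}, indent=2)
print(encoded)  # every newline appears as an escaped \n inside the JSON text

# Round-tripping through JSON restores exactly the same strings.
decoded = json.loads(encoded)
```

The printed JSON is valid but much harder to read and edit by hand than either block scalar, which is the trade-off this section describes.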
Anchors and Aliases
YAML allows defining a value once and referencing it multiple times:
defaults: &defaults
  timeout: 30
  retries: 3
  log_level: info
production:
  <<: *defaults
  log_level: warn # override just this key
staging:
  <<: *defaults
This eliminates repetition in complex configurations. JSON has no equivalent mechanism.
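With JSON, the same reuse has to happen in application code after parsing. The following is a minimal sketch of what the merge key resolves to, written with plain Python dictionaries; the variable names mirror the YAML above and are otherwise arbitrary. Note that the << merge key comes from the YAML 1.1 type repository rather than the YAML 1.2 core schema, so support varies between parsers.

```python
import json

# Shared settings, mirroring the &defaults anchor.
defaults = {"timeout": 30, "retries": 3, "log_level": "info"}

# Explicit keys override merged ones. In a dict literal the later entry wins,
# which gives the same result as "<<: *defaults" plus an override.
production = {**defaults, "log_level": "warn"}
staging = {**defaults}

config = {"defaults": defaults, "production": production, "staging": staging}
print(json.dumps(config, indent=2))
```

The serialised JSON repeats every value in full, which is what a YAML parser hands to the application as well: anchors are a convenience for the author, not part of the parsed data.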
Strictness and Parsing
JSON: One Canonical Behaviour
RFC 8259 is short and unambiguous about syntax. For any valid JSON input, compliant parsers produce the same structure. This is not an accident - it was a design goal. The main areas of permitted variation are duplicate keys (the RFC says names within an object SHOULD be unique and warns that behaviour is unpredictable when they are not; many parsers simply keep the last value) and numbers beyond the range or precision of IEEE 754 doubles.
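Duplicate-key handling is easy to observe. The sketch below uses Python's standard json module, which keeps the last value by default; reject_duplicates is a hypothetical helper, passed through the module's object_pairs_hook parameter, that turns duplicates into an error instead.

```python
import json

doc = '{"port": 8080, "host": "localhost", "port": 9090}'

# Default behaviour in Python's json module: the last value wins, silently.
print(json.loads(doc))  # {'port': 9090, 'host': 'localhost'}


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key: 'port'
```

Other parsers may make a different choice, so a strict hook like this is worth considering wherever configuration correctness matters.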
YAML: Specification Complexity
YAML's specification history is more complex. YAML 1.1 (2004) and YAML 1.2 (2009) differ meaningfully. YAML 1.2 aligned YAML with JSON and removed several implicit type coercions, but as of 2024, the majority of widely used YAML parsers still implement YAML 1.1 behaviour by default:
- PyYAML (Python) - YAML 1.1
- go-yaml v2 (Go) - YAML 1.1
- Ruby's Psych - YAML 1.1 (it is built on libyaml, a YAML 1.1 parser)
In practice, this means two widely used YAML parsers can produce different results for the same input, depending on which version of the specification and which schema each one applies.
Performance
JSON parsing is typically several times faster than YAML parsing across major runtimes; the rough figures below show a gap of about 4-8x. The reason is grammar complexity.
JSON's grammar is simple enough that a parser can process it in a single linear pass with minimal lookahead. V8 (Node.js, Chrome) has a hand-optimised JSON parser that processes hundreds of megabytes per second. CPython's json module is backed by a C extension for the same reason.
YAML's grammar requires tracking indentation levels, handling multiple string syntaxes, performing implicit type detection, and supporting anchors and aliases (which require building a node graph before resolving references). All of this adds overhead.
Rough benchmarks (parsing a 1 MB document):
| Runtime | JSON | YAML |
|---|---|---|
| Node.js (V8) | ~5 ms | ~25-40 ms |
| Python (CPython) | ~20 ms | ~80-120 ms |
| PHP 8 | ~15 ms | ~60-90 ms |
These numbers vary with document complexity and library choice, but in every row JSON is roughly 4-8x faster. For hot paths - API responses parsed thousands of times per second - JSON's performance advantage is significant.
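Figures like these are best checked on your own data and runtime. The following is a minimal benchmark sketch in Python, assuming PyYAML as the YAML library (the YAML half is skipped if it is not installed). The payload and the time_parse helper are invented for illustration, and the payload is far smaller than 1 MB so the script finishes quickly.

```python
import json
import timeit

try:
    import yaml  # PyYAML, optional: pip install pyyaml
except ImportError:
    yaml = None

# Hypothetical payload: a list of small records, serialised once up front.
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(200)]
json_text = json.dumps(records)


def time_parse(parse, text, repeat=5):
    """Return the best wall-clock time in seconds for a single parse call."""
    return min(timeit.repeat(lambda: parse(text), number=1, repeat=repeat))


json_seconds = time_parse(json.loads, json_text)
print(f"json.loads:     {json_seconds * 1000:.3f} ms")

if yaml is not None:
    # Same data, emitted as YAML so each parser reads its native format.
    yaml_text = yaml.safe_dump(records)
    yaml_seconds = time_parse(yaml.safe_load, yaml_text)
    print(f"yaml.safe_load: {yaml_seconds * 1000:.3f} ms")
    print(f"ratio:          {yaml_seconds / json_seconds:.1f}x")
```

The ratio you see depends heavily on the setup: PyYAML's pure-Python loader is much slower than its optional libyaml-backed loaders, so treat any single number as indicative only.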
Use Cases
Configuration Files - YAML Wins
When humans write and read config files, YAML's readability advantages are decisive. Comments, multiline strings, anchors, and less punctuation noise make YAML significantly better for files maintained by developers. This is why Kubernetes, Docker Compose, GitHub Actions, GitLab CI, Ansible, and most modern DevOps tooling chose YAML.
REST APIs - JSON Wins
REST APIs are machine-to-machine. Performance matters, parsing is deterministic, and no comments are needed. JSON is the universal choice. Every HTTP client library, every browser, every mobile SDK handles JSON natively.
Data Exchange Between Systems - JSON Wins
When systems exchange structured data, interoperability is paramount. JSON's strict specification and universal support make it the right choice. EDI replacement, webhooks, message queues - JSON dominates.
Kubernetes and Docker Compose - YAML
Kubernetes objects are almost universally written in YAML (JSON is technically valid but almost never used). Docker Compose uses YAML exclusively. The repetitive, deeply nested nature of these configs benefits from YAML's anchors and aliases.
OpenAPI and Swagger - Both
The OpenAPI specification supports both JSON and YAML. YAML is preferred for hand-authored specs (comments, readability). JSON is preferred when the spec is generated by tooling or consumed programmatically.
YAML Pitfalls
YAML's expressiveness creates a category of bugs that JSON simply cannot have.
The Norway Problem
This is the most famous YAML pitfall. In YAML 1.1, the following values are parsed as booleans:
# These are all boolean true or false in YAML 1.1
YES: true
NO: false # Norway's ISO country code
ON: true
OFF: false
TRUE: true
FALSE: false
Y: true
N: false
The Norway problem: a developer writes a list of country codes as YAML keys or values. The code NO (Norway's ISO 3166-1 alpha-2 code) is silently parsed as boolean false. The same issue affects YES (not a country code, but a common config value).
Real-world impact: configuration files that used country codes as YAML keys had silent data corruption. The fix is to quote the value:
countries:
- "NO"
- "YES"
- "ON"
- SE
- DE
YAML 1.2 (2009) removed this behaviour: in its core schema only true and false (in lower, title, or upper case) are booleans, so yes, no, on, and off stay strings. But most parsers still default to YAML 1.1.
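When YAML is generated or linted by a script, the quoting rule can be automated. The sketch below is one possible approach, not a feature of any YAML library: it checks a scalar against the boolean spellings listed in the YAML 1.1 bool type and quotes only the risky ones. The function names are hypothetical, and the check is deliberately conservative, since some parsers do not treat the single-letter forms y and n as booleans.

```python
import re

# Boolean spellings from the YAML 1.1 bool type definition.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def needs_quoting(scalar: str) -> bool:
    """True if a YAML 1.1 parser may read this plain scalar as a boolean."""
    return YAML11_BOOL.fullmatch(scalar) is not None


def emit_country_list(codes):
    """Render codes as a YAML sequence, quoting only the risky ones."""
    lines = ["countries:"]
    for code in codes:
        lines.append(f'  - "{code}"' if needs_quoting(code) else f"  - {code}")
    return "\n".join(lines)


print(emit_country_list(["NO", "SE", "DE"]))
```

Quoting every string unconditionally is a simpler policy with the same effect; the selective version just keeps the output closer to hand-written YAML.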
Octal Interpretation
In YAML 1.1, an unquoted integer that starts with 0 and contains only the digits 0-7 is interpreted as octal (YAML 1.2 requires an explicit 0o prefix instead):
file_permissions: 0755 # Parsed as 493 (decimal), not 755
port: 0600 # Parsed as 384, not 600
Again, quoting fixes this, but the silent coercion is a trap:
file_permissions: "0755" # String "0755" - correct
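The arithmetic behind those surprising values is plain base-8 conversion: 0755 is 7×64 + 5×8 + 5 = 493. Here is a simplified sketch of the resolution rule in Python; resolve_int_yaml11 is a hypothetical function covering only the octal and decimal cases, not a full YAML 1.1 integer resolver.

```python
import re

# A leading zero followed only by octal digits, as in YAML 1.1.
OCTAL_YAML11 = re.compile(r"0[0-7]+")


def resolve_int_yaml11(scalar: str) -> int:
    """Simplified sketch of YAML 1.1 integer resolution (octal and decimal only)."""
    if OCTAL_YAML11.fullmatch(scalar):
        return int(scalar, 8)  # leading zero: read as base 8
    return int(scalar, 10)


for text in ("0755", "0600", "755"):
    print(text, "->", resolve_int_yaml11(text))
# 0755 -> 493
# 0600 -> 384
# 755 -> 755
```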
Tabs vs Spaces
YAML explicitly forbids tabs for indentation. A file that looks perfectly aligned in an editor using tab characters will fail to parse. This is a deliberate spec decision, but editors that mix tabs and spaces silently produce invalid YAML.
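Because tabs and spaces look identical in many editors, a small pre-commit check can catch the problem before a parser does. The following is a minimal sketch; tab_indented_lines is a hypothetical helper, not part of any YAML tool, and it only inspects leading whitespace.

```python
def tab_indented_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


# Line 3 is indented with a tab instead of spaces.
sample = "server:\n  host: localhost\n\tport: 8080\n"
print(tab_indented_lines(sample))  # [3]
```

Tabs inside values are left alone, since only indentation is restricted.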
Duplicate Keys - Silent Overwrite
The YAML specification requires mapping keys to be unique, but many parsers, PyYAML among them, do not enforce this and silently keep the last value:
server:
  port: 8080
  host: localhost
  port: 9090 # Silently overwrites 8080; no error
CVE-2013-0156 - YAML Arbitrary Code Execution
Ruby's YAML.load (backed by Syck, and later by Psych before version 4) deserialised arbitrary Ruby objects from YAML input. In January 2013 this became remotely exploitable in Rails: the framework's parameter parser accepted XML request bodies containing embedded YAML, so attackers could send payloads that executed arbitrary code when parsed. The vulnerability was assigned CVE-2013-0156.
The root cause: YAML's type system supports "tags" that instruct the parser to construct specific language objects. Tags such as !!python/object (PyYAML) and !ruby/object (Psych) are honoured by those parsers' unsafe loading modes. When parsing untrusted input, this is a serious vulnerability. JSON has no equivalent mechanism - it only produces strings, numbers, booleans, null, arrays, and objects.
Never parse untrusted YAML with unrestricted object construction enabled.
Code Examples
PHP
<?php
declare(strict_types=1);

use Symfony\Component\Yaml\Yaml;

// Assumes a Composer install; needed only for symfony/yaml
require __DIR__ . '/vendor/autoload.php';

// JSON encoding and decoding
$data = [
    'name' => 'MyService',
    'version' => '2.1.0',
    'debug' => false,
    'port' => 8080,
];
$json = json_encode($data, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR);
$fromJson = json_decode($json, associative: true, flags: JSON_THROW_ON_ERROR);

// YAML (requires ext-yaml or symfony/yaml)
// Using symfony/yaml:
$yaml = Yaml::dump($data, indent: 2);
$fromYaml = Yaml::parse($yaml);

// Using ext-yaml (PECL):
$yamlPecl = yaml_emit($data);
$fromPecl = yaml_parse($yamlPecl);
Python
import json
import yaml  # pip install pyyaml

data = {
    'name': 'MyService',
    'version': '2.1.0',
    'debug': False,
    'port': 8080,
}

# JSON
json_str = json.dumps(data, indent=2)
decoded_json = json.loads(json_str)

# YAML - note: PyYAML uses YAML 1.1 by default
yaml_str = yaml.dump(data, default_flow_style=False)
decoded_yaml = yaml.safe_load(yaml_str)  # Always use safe_load for untrusted input

# For YAML 1.2 compliance, use ruamel.yaml:
# from ruamel.yaml import YAML
# yml = YAML()
# yml.version = (1, 2)
JavaScript
// YAML support requires js-yaml (npm install js-yaml)
import yaml from 'js-yaml';

// JSON - native browser and Node.js support
const data = {
  name: 'MyService',
  version: '2.1.0',
  debug: false,
  port: 8080,
};
const jsonStr = JSON.stringify(data, null, 2);
const decodedJson = JSON.parse(jsonStr);

// YAML
const yamlStr = yaml.dump(data);
const decodedYaml = yaml.load(yamlStr);

// For production use, prefer the schema option to avoid implicit type coercions:
const safeDecoded = yaml.load(yamlStr, { schema: yaml.JSON_SCHEMA });
When to Use Each
| Criterion | JSON | YAML |
|---|---|---|
| Human-authored config files | Poor - no comments, verbose | Excellent |
| Machine-generated / consumed data | Excellent | Poor - slower, riskier |
| REST API responses | Excellent - universal support | Not used |
| Configuration with comments | Not possible | Native support |
| Kubernetes / Docker Compose | Technically valid, not idiomatic | Standard |
| OpenAPI specs (hand-authored) | Acceptable | Preferred |
| Webhooks and event payloads | Excellent | Rare |
| High-throughput parsing | Excellent | Avoid |
| Untrusted input | Safe | Use safe mode only |
| Multi-document files | Not supported | Native |
| Reusable config blocks (DRY) | Not supported | Anchors and aliases |
| Interoperability across languages | Excellent | Good (with caveats) |
Decision Criteria
Choose JSON when:
- The data is consumed by machines more than humans
- Performance is important (high-frequency parsing)
- You need maximum interoperability across languages and tools
- The data is transmitted over HTTP (APIs, webhooks)
- You need strict, predictable parsing behaviour
Choose YAML when:
- Humans write and maintain the file
- You need comments to document intent
- The format is a well-established YAML context (Kubernetes, Docker Compose, GitHub Actions)
- You have complex repeated structures that benefit from anchors
- Multiline strings would improve readability
Avoid YAML when:
- Parsing untrusted input without careful safe-mode configuration
- Performance is a constraint
- You need guaranteed identical behaviour across different parsers
The Practical Rule
JSON and YAML solve the same core problem - serialising structured data - but optimise for different consumers. JSON optimises for machines: strict, fast, universally supported. YAML optimises for humans: readable, flexible, expressive.
The practical rule is simple: use JSON for data that flows between systems, use YAML for files that developers maintain. When you need to work with both or convert between them, knowing the pitfalls (the Norway problem, octal interpretation, implicit type coercions) prevents subtle bugs.