Data Tools Guide

A comprehensive guide to transforming, converting, and manipulating data formats.

CSV, JSON, and XML each solve different problems. Knowing when to use each format and how to convert between them is a core developer skill that applies daily in data pipelines, APIs, and configuration management.

CSV — Key Edge Cases

  • Values with commas must be quoted: "Smith, John"
  • Double quotes in values are escaped by doubling: "He said ""hi"""
  • Line endings vary: CRLF on Windows (and in RFC 4180), LF on Unix
  • CSV files carry no encoding declaration: confirm whether a file is UTF-8 or Latin-1 before parsing
  • All CSV values are strings — type-cast numbers explicitly
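
The quoting and typing rules above can be checked with a short round trip. This is a minimal sketch using Python's standard csv module; the sample rows and variable names are illustrative assumptions, not part of any particular dataset.

```python
import csv
import io

# Illustrative rows: one value contains a comma, one contains double quotes.
rows = [
    ["name", "quote", "age"],
    ["Smith, John", 'He said "hi"', "42"],
]

# Write to an in-memory buffer with explicit CRLF line endings.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\r\n")
writer.writerows(rows)
encoded = buf.getvalue()

# Read it back; newline="" leaves the line endings for the csv module to handle.
parsed = list(csv.reader(io.StringIO(encoded, newline="")))

# Every parsed value is a string, so numbers are cast explicitly.
ages = [int(row[2]) for row in parsed[1:]]

print(encoded)
print(parsed[1], ages)
```

The writer quotes only the fields that need it, and the reader restores the original values, so hand-rolled splitting on commas is rarely worth the risk.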

CSV to JSON

// CSV input:
// name,age,city
// Alice,30,New York
// JSON output:
[{"name":"Alice","age":"30","city":"New York"}]
// Note: age is a string; cast with parseInt(value, 10) or Number(value) if needed
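
The same conversion can be sketched in Python, where int() plays the role of parseInt(). The function name csv_to_json and its int_fields parameter are hypothetical helpers introduced for this example; it assumes the first CSV row is a header.

```python
import csv
import io
import json


def csv_to_json(text, int_fields=()):
    """Convert CSV text with a header row into a compact JSON array string.

    int_fields names the columns to cast from string to int (hypothetical helper).
    """
    reader = csv.DictReader(io.StringIO(text, newline=""))
    records = []
    for row in reader:
        for field in int_fields:
            row[field] = int(row[field])  # explicit cast; CSV values start as strings
        records.append(row)
    return json.dumps(records, separators=(",", ":"))


csv_text = "name,age,city\nAlice,30,New York\n"

as_strings = csv_to_json(csv_text)                   # age stays "30"
typed = csv_to_json(csv_text, int_fields=("age",))   # age becomes 30

print(as_strings)
print(typed)
```

Listing the numeric columns explicitly avoids guessing types from content, which tends to mangle values such as ZIP codes or IDs with leading zeros.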

Format Comparison

  • CSV — tabular data, spreadsheet imports, large flat datasets, data science
  • JSON — REST APIs, nested/hierarchical data, JavaScript applications
  • XML: SOAP services, enterprise integrations, document formats (SVG, and DOCX, which is a ZIP package of XML files)
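
To make the comparison concrete, here is a rough sketch that serializes one flat record into all three formats with Python's standard library. The record, the helper names, and the "person" root element are assumptions chosen for illustration; the XML helper assumes every key is a valid XML tag name.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Illustrative record, not taken from any real dataset.
record = {"name": "Alice", "age": 30, "city": "New York"}


def to_csv(rec):
    """Header row plus one data row; types are lost (30 becomes the text 30)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def to_json(rec):
    """JSON keeps the number as a number."""
    return json.dumps(rec)


def to_xml(rec, root_tag="person"):
    """One child element per key; all values become text."""
    root = ET.Element(root_tag)
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_csv(record))
print(to_json(record))
print(to_xml(record))
```

Only the JSON output preserves the integer type; the CSV and XML versions need a cast or a schema on the reading side.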

Frequently Asked Questions

Why does my CSV have blank lines in Excel?

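Blank rows usually come from mixed or doubled line endings, not from CRLF itself. Excel on Windows expects CRLF. When rows end in CR CR LF, which commonly happens when a script on Windows writes lines that already end in CRLF through a text-mode file that adds a second CR, Excel shows an empty row after each record. Save the file with consistent CRLF line endings to fix it.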
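
If the file is generated from Python, one way to get consistent CRLF endings is to open the file with newline="" so that the csv module alone controls the line terminator. Without it, Python on Windows can emit CR CR LF, which Excel shows as blank rows. A minimal sketch; the file name people.csv and the sample rows are assumptions.

```python
import csv
import os
import tempfile

rows = [["name", "city"], ["Alice", "New York"]]

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "people.csv")

# newline="" disables Python's own newline translation, so each row ends
# in exactly the terminator given to csv.writer.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f, lineterminator="\r\n").writerows(rows)

# Inspect the raw bytes to confirm the endings.
with open(path, "rb") as f:
    raw = f.read()

print(raw)
```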

How do I parse large XML files?

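Use a streaming, event-driven parser such as SAX for very large files (roughly 100MB and up); DOM parsing loads the whole document into memory. Python: xml.sax, or xml.etree.ElementTree.iterparse() with elem.clear() called on each processed element so memory use stays bounded. Node.js: the sax or saxes library.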
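
A minimal sketch of the streaming approach with iterparse() follows. The element names person and name, and the helper iter_names, are assumptions for illustration; a real file would be passed by path instead of the in-memory sample.

```python
import io
import xml.etree.ElementTree as ET


def iter_names(source, record_tag="person"):
    """Yield the <name> text of each record without building the full tree."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield elem.findtext("name")
            # Drop the record's children and text once it has been handled.
            elem.clear()


# Small in-memory stand-in for a large file.
sample = io.BytesIO(
    b"<people>"
    b"<person><name>Alice</name></person>"
    b"<person><name>Bob</name></person>"
    b"</people>"
)

names = list(iter_names(sample))
print(names)
```

Note that clear() empties each record but the root element still keeps a reference to the emptied child, so extremely long files may also need those references removed from the root.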

What is NDJSON?

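Newline Delimited JSON: each line is one complete JSON value (usually an object), with no enclosing array and no commas between records. Because every line parses on its own, it suits streaming, log files, and bulk imports into MongoDB or Elasticsearch.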
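
Reading and writing NDJSON needs nothing beyond a JSON library and a line loop. The sketch below uses Python's json module; write_ndjson, read_ndjson, and the sample log events are hypothetical names and data for illustration.

```python
import io
import json


def write_ndjson(records, stream):
    """Write one compact JSON document per line."""
    for record in records:
        # json.dumps without indent never emits a raw newline inside a record.
        stream.write(json.dumps(record) + "\n")


def read_ndjson(stream):
    """Yield one parsed record per non-empty line, without loading the whole file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Illustrative log events.
events = [
    {"level": "info", "msg": "started"},
    {"level": "error", "msg": "disk full"},
]

buf = io.StringIO()
write_ndjson(events, buf)
ndjson_text = buf.getvalue()

buf.seek(0)
round_trip = list(read_ndjson(buf))

print(ndjson_text)
print(round_trip == events)
```

Because the reader is a generator, records can be processed one at a time, and a file can be appended to without rewriting a closing bracket.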