Jia2005/Csv-Converter-Api

CSV to JSON Converter API

A high-performance Node.js API that converts CSV files to JSON, transforms the data, loads it into PostgreSQL, and generates age distribution reports.


Features

  • Custom CSV Parser - Built from scratch without external CSV libraries, handles complex scenarios (quoted fields, escaped quotes, empty fields)
  • Streaming Processing - Memory-efficient handling of large files (50,000+ records) using Node.js streams
  • Batch Insertion - Optimized PostgreSQL batch inserts (1000 records per batch) with transaction safety
  • Nested JSON Transformation - Converts dot-separated headers (e.g., address.city) into nested JSON objects
  • Data Validation - Robust validation for mandatory fields and data types
  • Age Distribution Report - Automatic generation of age-group statistics
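
As an illustration of the quoting rules the parser has to handle, here is a minimal sketch of a single-line field splitter. `parseCsvLine` is a hypothetical name, not the function used in this repository, and the sketch assumes RFC 4180-style quoting (a literal quote inside a quoted field is written as `""`). It does not trim whitespace and does not handle newlines inside quoted fields.

```javascript
// Hypothetical helper, not the repository's parser: splits one CSV line into fields.
// Assumption: fields may be wrapped in double quotes, and "" inside a quoted
// field stands for one literal quote character.
function parseCsvLine(line) {
  const fields = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote: consume both characters
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch; // commas inside quotes are ordinary characters
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current); // field boundary; an empty field yields ""
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current); // last field has no trailing comma
  return fields;
}
```

For example, `parseCsvLine('a,"b,c",,"say ""hi"""')` yields four fields: `a`, `b,c`, an empty string, and `say "hi"`.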

Setup

1. Clone the Repository

git clone https://github.com/Jia2005/Csv-Converter-Api.git
cd Csv-Converter-Api

2. Install Dependencies

npm install

3. Database Setup

Create a PostgreSQL database:

CREATE DATABASE kelp_db;

The application will automatically create the required users table on startup.

4. Configure Environment

Create a .env file in the root directory:

DB_USER=your_postgres_username
DB_HOST=localhost
DB_DATABASE=kelp_db
DB_PASSWORD=your_password
DB_PORT=5432
APP_PORT=3000
CSV_FILE_PATH=data/user_samples.csv
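
A sketch of how these variables might be read once dotenv has loaded them into `process.env`. `loadConfig` and the fallback values are assumptions made for illustration, not the repository's actual code. The keys under `db` follow the connection options accepted by the pg driver (`user`, `host`, `database`, `password`, `port`).

```javascript
// Hypothetical helper: maps the variables from .env onto a plain config object.
// Assumption: the database credentials are required, while the ports and the
// CSV path fall back to the values shown in the example .env file.
function loadConfig(env = process.env) {
  const required = ["DB_USER", "DB_HOST", "DB_DATABASE", "DB_PASSWORD"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    db: {
      user: env.DB_USER,
      host: env.DB_HOST,
      database: env.DB_DATABASE,
      password: env.DB_PASSWORD,
      port: Number(env.DB_PORT || 5432),
    },
    appPort: Number(env.APP_PORT || 3000),
    csvFilePath: env.CSV_FILE_PATH || "data/user_samples.csv",
  };
}
```

Failing fast on missing credentials gives a clearer error at startup than a connection failure later on.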

Running the Application

Start the Server

npm start

The server starts on http://localhost:3000 (or on the port set in APP_PORT).

Trigger CSV Conversion

Visit or send a GET request to:

http://localhost:3000/api/convert

This will:

  1. Parse the CSV file from the configured path
  2. Transform data to JSON with nested structures
  3. Insert records into PostgreSQL
  4. Print the age distribution report to the console

Check Age Distribution

http://localhost:3000/api/distribution

Health Check

http://localhost:3000/api/health

Testing

The project includes 22 tests across three suites:

Run All Tests

npm run test:all

Individual Test Suites

CSV Parser Tests (6 tests)

npm run test:parser

Tests for: simple parsing, quoted fields, empty fields, escaped quotes, whitespace handling

Integration Tests (7 tests)

npm run test:integration

Tests the complete CSV → JSON → Database pipeline with real data

Edge Case Tests (9 tests)

npm run test:edge

Tests validation logic: missing fields, invalid ages, float values, negative numbers

Test Data

  • data/user_samples.csv - Sample data for normal operation (7 records)
  • data/edge_cases.csv - Edge case scenarios for validation testing (9 records)

Database Schema

CREATE TABLE public.users (
    id serial4 NOT NULL,
    name varchar NOT NULL,
    age int4 NOT NULL,
    address jsonb NULL,
    additional_info jsonb NULL,
    CONSTRAINT users_pkey PRIMARY KEY (id)
);

Field Mapping:

  • name = name.firstName and name.lastName, joined by a single space
  • age = Must be a non-negative integer
  • address = All address.* fields as JSONB
  • additional_info = All remaining fields as JSONB
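
The mapping above can be sketched in two steps: first nest the dot-separated headers, then split the nested record into table columns. `nestRecord` and `toUserRow` are hypothetical names, and the sketch assumes the record has already passed validation (so `name.firstName` and `name.lastName` exist).

```javascript
// Hypothetical helper: turns dot-separated headers into nested objects,
// e.g. "address.city" becomes record.address.city.
function nestRecord(headers, values) {
  const record = {};
  headers.forEach((header, i) => {
    const path = header.split(".");
    let node = record;
    // Walk (and create) every intermediate object on the path.
    for (const key of path.slice(0, -1)) {
      if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[path[path.length - 1]] = values[i];
  });
  return record;
}

// Hypothetical helper: maps a nested record onto the users table columns.
// Assumption: the record was validated beforehand.
function toUserRow(record) {
  const { name, age, address, ...rest } = record;
  return {
    name: `${name.firstName} ${name.lastName}`,
    age: Number(age),
    address: address ?? null,
    // Everything that is not name, age, or address goes to additional_info.
    additional_info: Object.keys(rest).length > 0 ? rest : null,
  };
}
```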

Example

Input CSV:

name.firstName,name.lastName,age,address.line1,address.city,address.state,gender
Rohit,Prasad,35,A-563 Rakshak Society,Pune,Maharashtra,male

Output Database Record:

{
  "id": 1,
  "name": "Rohit Prasad",
  "age": 35,
  "address": {
    "line1": "A-563 Rakshak Society",
    "city": "Pune",
    "state": "Maharashtra"
  },
  "additional_info": {
    "gender": "male"
  }
}

Console Report:

==================================================
AGE DISTRIBUTION REPORT
==================================================
┌─────────┬────────────────┬─────────────────┐
│ (index) │   Age-Group    │ % Distribution  │
├─────────┼────────────────┼─────────────────┤
│    0    │     '< 20'     │    '14.29%'     │
│    1    │   '20 to 40'   │    '42.86%'     │
│    2    │   '40 to 60'   │    '28.57%'     │
│    3    │     '> 60'     │    '14.29%'     │
└─────────┴────────────────┴─────────────────┘

Total Users: 7
==================================================
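
A minimal sketch of how such a report could be computed. `ageDistribution` is a hypothetical name, and the group boundaries are an assumption: the labels do not say which group an age of exactly 20, 40, or 60 falls into, so this sketch places 20 and 40 in '20 to 40' and 60 in '40 to 60'.

```javascript
// Hypothetical helper: computes the percentage of users per age group.
// Assumed boundaries: age < 20, 20..40, 41..60, > 60.
function ageDistribution(ages) {
  const counts = { "< 20": 0, "20 to 40": 0, "40 to 60": 0, "> 60": 0 };
  for (const age of ages) {
    if (age < 20) counts["< 20"]++;
    else if (age <= 40) counts["20 to 40"]++;
    else if (age <= 60) counts["40 to 60"]++;
    else counts["> 60"]++;
  }
  const total = ages.length;
  return Object.entries(counts).map(([group, count]) => ({
    "Age-Group": group,
    // Guard against division by zero when the table is empty.
    "% Distribution": total === 0 ? "0.00%" : `${((count / total) * 100).toFixed(2)}%`,
  }));
}
```

Passing the result to `console.table` prints a table with the same two columns as the report above. Because each percentage is rounded to two decimals independently, the column can sum to slightly more or less than 100%.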

Validation Rules

  • Name: Both name.firstName and name.lastName are mandatory
  • Age: Must be a non-negative integer (floats and negative values rejected)
  • CSV Format: First line must be headers, comma-delimited
  • Records: Invalid records are logged and skipped (processing continues)
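
The name and age rules can be sketched as a function that returns a list of problems for one record, so the caller can log them and move on to the next record. `validateRecord` is a hypothetical name, and the sketch assumes field values arrive as strings, as they would from a CSV file.

```javascript
// Hypothetical helper: returns an array of validation errors (empty if valid).
// Assumption: all field values are strings taken from the CSV file.
function validateRecord(record) {
  const errors = [];
  const first = record.name?.firstName?.trim();
  const last = record.name?.lastName?.trim();
  if (!first) errors.push("name.firstName is required");
  if (!last) errors.push("name.lastName is required");

  // Digits only: rejects negatives ("-3"), floats ("35.5"), and blanks.
  const age = String(record.age ?? "").trim();
  if (!/^\d+$/.test(age)) errors.push("age must be a non-negative integer");
  return errors;
}
```

Returning errors instead of throwing keeps the skip-and-continue behaviour simple: a non-empty result means the record is logged and skipped.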

npm Scripts

npm start              # Start the server
npm run dev            # Start with nodemon (auto-reload)
npm test               # Run parser and integration tests
npm run test:all       # Run all tests (parser + integration + edge cases)
npm run test:parser    # Run CSV parser unit tests
npm run test:integration   # Run integration tests
npm run test:edge      # Run edge case validation tests

Assumptions

Key design decisions are documented in Assumptions.md.


Technology Stack

  • Runtime: Node.js with Express.js
  • Database: PostgreSQL with pg driver
  • Configuration: dotenv for environment management
  • Performance: Streaming API + Batch inserts (1000 records/batch)
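
To illustrate the batching idea, here is a sketch that splits rows into groups of 1000 and builds one parameterized multi-row INSERT per group. `chunk` and `buildBatchInsert` are hypothetical names, and the statement shape is an assumption about how batching could be done against the users table, not a copy of the repository's code.

```javascript
// Hypothetical helper: yields consecutive slices of at most `size` rows.
function* chunk(rows, size = 1000) {
  for (let i = 0; i < rows.length; i += size) {
    yield rows.slice(i, i + size);
  }
}

// Hypothetical helper: builds one multi-row INSERT with numbered placeholders.
// JSONB columns are serialized explicitly; null stays null.
function buildBatchInsert(rows) {
  const values = [];
  const tuples = rows.map((row, i) => {
    const base = i * 4; // four columns per row
    values.push(
      row.name,
      row.age,
      row.address == null ? null : JSON.stringify(row.address),
      row.additional_info == null ? null : JSON.stringify(row.additional_info)
    );
    return `($${base + 1}, $${base + 2}, $${base + 3}, $${base + 4})`;
  });
  return {
    text: `INSERT INTO public.users (name, age, address, additional_info) VALUES ${tuples.join(", ")}`,
    values,
  };
}
```

With the pg driver, each batch could then be sent as `client.query(text, values)` between `BEGIN` and `COMMIT`, so a failed batch can be rolled back as a unit.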

Author

Jia Harisinghani - GitHub
