A fast, stack-based, memory-efficient streaming JSON parser with zero dependencies.
StreamingJsonParser provides incremental parsing of JSON data, allowing you to process JSON as it arrives rather than waiting for the complete document. Perfect for handling large JSON responses from APIs, LLM outputs, and other streaming contexts.
Remove environment:

```shell
rm -r .venv/ build/ dist/ json_stream_parser.egg-info/ tmp/
```

Re-install environment (debug):

```shell
uv run --no-cache --verbose python
```

- 🔄 True streaming - Parse JSON as it arrives, character by character
- 📊 Partial results - Access the current state of parsing at any point
- 🪶 Lightweight - Zero dependencies, minimal memory footprint
- ⚡ Blazing fast - O(n) time complexity
- 🔒 Optional strict mode - Validate JSON syntax as you parse
- 🔍 Well-defined subset - Focuses on objects and string values
```shell
git clone https://github.com/fiskrt/json_stream_parser.git
```

```python
from json_parse import StreamingJsonParser

# Create a parser
parser = StreamingJsonParser()

# Feed it JSON data in chunks
parser.consume('{"user": "john_doe", "profile": {')
parser.consume('"age": "28", "location": "San Francisco"')

# Get the current state of the parsed JSON
result = parser.get()
print(result)
# {'user': 'john_doe', 'profile': {'age': '28', 'location': 'San Francisco'}}
```

```python
import os
import asyncio

from openai import AsyncOpenAI

from json_parse import StreamingJsonParser


async def stream_json_from_openai():
    client = AsyncOpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    parser = StreamingJsonParser()

    # Request that explicitly asks for JSON output
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Return JSON responses only."},
            {"role": "user", "content": "Give me a user profile with name, age, and interests."},
        ],
        response_format={"type": "json_object"},
        stream=True,
    )

    # Process the streaming response
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            text_chunk = chunk.choices[0].delta.content
            parser.consume(text_chunk)

            # Inspect the partial result as it grows
            current_json = parser.get()
            if "name" in current_json:
                print(f"Name received: {current_json['name']}")
            if "interests" in current_json:
                print(f"Interests so far: {len(current_json['interests'])}")

    return parser.get()


if __name__ == "__main__":
    result = asyncio.run(stream_json_from_openai())
    print(f"Complete profile: {result}")
```

```python
# Enable strict mode to validate JSON syntax
parser = StreamingJsonParser(strict_mode=True)

try:
    parser.consume('{"invalid"a: "value"}')
except ValueError as e:
    print(f"Invalid JSON: {e}")
```

StreamingJsonParser implements a subset of JSON that handles objects and strings, according to this grammar:
```
<json>        ::= <object>
<object>      ::= '{' <members> '}'
<members>     ::= ε | <member-list>
<member-list> ::= <member> | <member> ',' <member-list>
<member>      ::= <string> ':' <value>
<value>       ::= <string> | <object>
<string>      ::= '"' <characters> '"'
<characters>  ::= ε | <character> <characters>
<character>   ::= any Unicode character except "
```

Whitespace is allowed between tokens:

```
<whitespace> ::= ε | <ws-char> <whitespace>
<ws-char>    ::= ' ' | '\n' | '\t' | '\r'
```
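To make the grammar concrete, here is a minimal, self-contained sketch of how a stack-based parser for this object/string subset can work. This is an illustration of the technique only, not the library's actual implementation; it exposes partial string values through `get()`, matching the partial-results behavior shown in the quick start.

```python
# Illustrative stack-based parser for the object/string subset above.
# NOT the library's implementation -- a simplified sketch of the idea.

class SketchParser:
    def __init__(self):
        self.root = {}
        self.stack = [self.root]   # innermost open object sits on top
        self.key = None            # completed key awaiting its value
        self.buf = None            # chars of the string being read, or None
        self.reading_key = False   # is the current string a key or a value?

    def consume(self, chunk: str) -> None:
        for ch in chunk:
            if self.buf is not None:          # currently inside a string
                if ch == '"':                 # string closed
                    text = ''.join(self.buf)
                    self.buf = None
                    if self.reading_key:
                        self.key = text
                    else:
                        self.stack[-1][self.key] = text
                        self.key = None
                else:
                    self.buf.append(ch)
                    if not self.reading_key:  # expose the partial value
                        self.stack[-1][self.key] = ''.join(self.buf)
            elif ch == '"':                   # string opened
                self.buf = []
                self.reading_key = self.key is None
            elif ch == '{':
                if self.key is not None:      # nested object as a value
                    child = {}
                    self.stack[-1][self.key] = child
                    self.stack.append(child)
                    self.key = None
            elif ch == '}':
                if len(self.stack) > 1:       # close the innermost object
                    self.stack.pop()
            # ':' ',' and whitespace need no state change in this sketch

    def get(self) -> dict:
        return self.root
```

For example, after `consume('{"user": "jo')`, `get()` already returns `{'user': 'jo'}`; feeding the rest of the document completes the value in place. A real implementation also needs escape sequences and (in strict mode) syntax validation, both omitted here for brevity.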
- Time Complexity: O(n) where n is the length of the input
- Space Complexity: O(n)
This project is licensed under the MIT License - see the LICENSE file for details.
Claude 3.7 Sonnet (extended thinking)