chatgpt_to_markdown.models

Pydantic v2 models for all structures in a ChatGPT data export.

```python
from chatgpt_to_markdown.models import (
    Conversation,
    Message,
    Node,
    TextContent,
    # ... see full listing below
)
```

Conversation models

Conversation

Top-level conversation object parsed from conversations-*.json.

| Field | Type | Description |
| --- | --- | --- |
| id | str | Unique conversation UUID |
| title | str \| None | Conversation title |
| create_time | float \| None | Creation timestamp (Unix epoch) |
| update_time | float \| None | Last update timestamp (Unix epoch) |
| mapping | dict[str, Node] | All message nodes keyed by node ID |
| current_node | str \| None | ID of the active leaf node |
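
The timestamp fields are plain floats and may be None, so callers typically convert them before display. A minimal sketch, assuming the values are seconds since the Unix epoch; `epoch_to_utc` is a hypothetical helper, not part of the package:

```python
from datetime import datetime, timezone
from typing import Optional


def epoch_to_utc(ts: Optional[float]) -> Optional[datetime]:
    """Convert a Unix epoch timestamp in seconds to a timezone-aware UTC datetime."""
    if ts is None:
        # create_time and update_time are optional in the model
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


print(epoch_to_utc(0.0))   # 1970-01-01 00:00:00+00:00
print(epoch_to_utc(None))  # None
```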

Node

A single node in the message DAG.

| Field | Type | Description |
| --- | --- | --- |
| id | str | Node UUID |
| message | Message \| None | Message payload (None for the root node) |
| parent | str \| None | Parent node ID (None for the root) |
| children | list[str] | Child node IDs |
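
Because each node records its parent, the active thread can be recovered by walking from `Conversation.current_node` back to the root. The sketch below uses a hypothetical `NodeStub` dataclass that mirrors the fields above (with `message` simplified to a string) so that it runs without the package installed; `active_branch` is likewise an illustrative name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NodeStub:
    """Hypothetical stand-in with the same field names as Node."""
    id: str
    message: Optional[str] = None
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


def active_branch(mapping: Dict[str, NodeStub], current_node: Optional[str]) -> List[NodeStub]:
    """Return the nodes from the root down to current_node, following parent links."""
    path: List[NodeStub] = []
    seen = set()
    node_id = current_node
    while node_id is not None and node_id in mapping and node_id not in seen:
        seen.add(node_id)  # guard against a malformed export with a parent cycle
        node = mapping[node_id]
        path.append(node)
        node_id = node.parent
    path.reverse()  # collected leaf-to-root, returned root-to-leaf
    return path


mapping = {
    "root": NodeStub("root", None, None, ["a"]),
    "a": NodeStub("a", "Hello", "root", ["b", "c"]),
    "b": NodeStub("b", "First reply", "a"),
    "c": NodeStub("c", "Regenerated reply", "a"),
}
print([n.id for n in active_branch(mapping, "c")])  # ['root', 'a', 'c']
```

Sibling branches such as `b` stay in `mapping` but are not part of the active thread.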

Message

A single message within a node.

| Field | Type | Description |
| --- | --- | --- |
| id | str | Message UUID |
| author | Author | Author role and metadata |
| content | Content | Message content (discriminated union) |
| create_time | float \| None | Message creation timestamp (Unix epoch) |
| weight | float | Visibility weight (1.0 = visible, 0.0 = hidden) |
| channel | str \| None | None, "commentary", or "final" |
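
One way to use `weight` and `channel` when rendering is to drop hidden messages and, optionally, commentary. This is only a sketch: `MessageStub` and `visible_messages` are hypothetical names, and skipping the "commentary" channel by default is an assumption rather than documented converter behavior.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class MessageStub:
    """Hypothetical stand-in carrying only the fields used here."""
    id: str
    weight: float = 1.0
    channel: Optional[str] = None


def visible_messages(
    messages: Iterable[MessageStub], include_commentary: bool = False
) -> List[MessageStub]:
    """Keep messages with a positive weight; optionally keep commentary too."""
    kept = []
    for msg in messages:
        if msg.weight <= 0.0:
            continue  # weight 0.0 marks a hidden message
        if msg.channel == "commentary" and not include_commentary:
            continue  # assumption: commentary is skipped unless requested
        kept.append(msg)
    return kept


messages = [
    MessageStub("m1"),
    MessageStub("m2", weight=0.0),
    MessageStub("m3", channel="commentary"),
    MessageStub("m4", channel="final"),
]
print([m.id for m in visible_messages(messages)])  # ['m1', 'm4']
```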

Author

| Field | Type | Description |
| --- | --- | --- |
| role | str | "system", "user", "assistant", or "tool" |
| name | str \| None | Optional author name |

Content types

Content is a discriminated union routed by content_type. Unknown types fall through to FallbackContent rather than raising a validation error.

| Class | content_type | Description |
| --- | --- | --- |
| TextContent | "text" | Plain text parts |
| MultimodalTextContent | "multimodal_text" | Text with embedded images |
| CodeContent | "code" | Code with language and execution output |
| ThoughtsContent | "thoughts" | Thinking/reasoning summary blocks |
| SonicWebpageContent | "sonic_webpage" | Web browsing result |
| TetherQuoteContent | "tether_quote" | Cited quote from a browsed page |
| TetherBrowsingDisplayContent | "tether_browsing_display" | Browsing session metadata |
| ExecutionOutputContent | "execution_output" | Code interpreter output |
| SystemErrorContent | "system_error" | System-level error message |
| ComputerOutputContent | "computer_output" | Computer use tool output |
| ReasoningRecapContent | "reasoning_recap" | Reasoning recap block |
| UserEditableContextContent | "user_editable_context" | Editable context block |
| FallbackContent | any other | Unknown content type passthrough |
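
The routing idea can be illustrated without Pydantic as a lookup on `content_type` that falls back instead of raising. This is a standard-library sketch of the pattern, not the package's implementation; `TextStub`, `FallbackStub`, `BUILDERS`, and `parse_content` are hypothetical names, and the `parts` field name is assumed.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class TextStub:
    """Hypothetical stand-in for TextContent; the 'parts' field name is assumed."""
    parts: List[str]


@dataclass
class FallbackStub:
    """Hypothetical stand-in for FallbackContent; keeps the raw payload."""
    content_type: str
    raw: Dict[str, Any] = field(default_factory=dict)


def _build_text(data: Dict[str, Any]) -> TextStub:
    return TextStub(parts=list(data.get("parts", [])))


# One builder per known content_type; only "text" is registered in this sketch.
BUILDERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {"text": _build_text}


def parse_content(data: Dict[str, Any]) -> Any:
    """Route on content_type; unknown types become FallbackStub instead of raising."""
    content_type = data.get("content_type", "")
    builder = BUILDERS.get(content_type)
    if builder is None:
        return FallbackStub(content_type=content_type, raw=dict(data))
    return builder(data)


print(parse_content({"content_type": "text", "parts": ["hi"]}))
print(type(parse_content({"content_type": "brand_new_type"})).__name__)  # FallbackStub
```

The fallback keeps the unrecognized payload intact, so a content type introduced in a later export format degrades to a passthrough rather than failing the whole conversation.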

Asset models

AssetPointer

Embedded file reference within message content.

| Field | Type | Description |
| --- | --- | --- |
| asset_pointer | str | URI with file-service:// or sediment:// scheme |
| size_bytes | int \| None | File size in bytes |
| width | int \| None | Image width (pixels) |
| height | int \| None | Image height (pixels) |
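
To locate the exported file for a pointer, the scheme and the identifier usually need to be separated. A minimal sketch, assuming the part after `://` is an opaque file identifier; `split_asset_pointer` is a hypothetical helper and the example identifier is made up:

```python
from typing import Tuple

# The two schemes listed in the table above.
KNOWN_SCHEMES = ("file-service", "sediment")


def split_asset_pointer(asset_pointer: str) -> Tuple[str, str]:
    """Split 'scheme://identifier' into (scheme, identifier)."""
    scheme, sep, rest = asset_pointer.partition("://")
    if not sep or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported asset pointer: {asset_pointer!r}")
    return scheme, rest


print(split_asset_pointer("file-service://file-abc123"))  # ('file-service', 'file-abc123')
```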

Manifest models

ExportManifest

Parsed from export_manifest.json. Contains the full list of exported files and their metadata.

ExportManifestFile

A single file entry in the manifest, with ID, filename, and size.


Metadata models

User

Parsed from user.json. Contains user profile data. PII fields are replaced with [REDACTED] when ConverterConfig.redact_pii is True.
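
The redaction behavior can be pictured as a conditional field replacement. The sketch below is illustrative only: `redact` is a hypothetical helper, and which fields count as PII (here `email`) is an assumption, since the field list is not given above.

```python
from typing import Any, Dict, Iterable

REDACTED = "[REDACTED]"


def redact(data: Dict[str, Any], pii_fields: Iterable[str], redact_pii: bool) -> Dict[str, Any]:
    """Return a copy of data with PII fields replaced when redact_pii is True."""
    if not redact_pii:
        return dict(data)
    pii = set(pii_fields)
    return {key: (REDACTED if key in pii else value) for key, value in data.items()}


# Hypothetical profile payload; the real user.json fields may differ.
profile = {"id": "user-123", "email": "someone@example.com"}
print(redact(profile, ["email"], redact_pii=True))
# {'id': 'user-123', 'email': '[REDACTED]'}
```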

UserSettings

Parsed from user_settings.json. Contains account-level settings.

MessageFeedback

Parsed from message_feedback.json. Contains thumbs-up/down feedback records keyed by message ID.