parse-extract-input

For text processing: extract numbers, words, and structured data from messy text using regex patterns and small parsing utilities.

By jimmc414, updated 12/27/2025

name: parse-extract-input
description: "For text processing: extract numbers, words, structured data from messy text using regex patterns, parsing utilities."

When to Use

  • Extracting numbers from text
  • Finding words or identifiers
  • Parsing structured input files
  • Data cleaning
  • Advent of Code input parsing

When NOT to Use

  • The input is already in a structured format such as JSON or CSV (use a dedicated parser)
  • The input follows a complex grammar (use a proper parser)
  • A simple str.split is enough
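
As a rough illustration of the cases above, the sketch below uses only the Python standard library. The sample strings are made up for this example; the point is that dedicated parsers already handle quoting and nesting, and regular whitespace-separated input needs nothing more than `str.split`.

```python
import csv
import io
import json

# Structured formats: let the dedicated parser handle quoting and nesting.
record = json.loads('{"name": "foo", "pos": [3, -5]}')
rows = list(csv.reader(io.StringIO('id,label\n1,"a, b"\n')))

# Simple, regular input: str.split is enough.
numbers = [int(field) for field in "10 20 30".split()]

print(record["pos"])  # [3, -5]
print(rows)           # [['id', 'label'], ['1', 'a, b']]
print(numbers)        # [10, 20, 30]
```

Note that the quoted CSV field contains a comma, which a naive regex or split would break apart.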

The Pattern

Wrap a few regular expressions in small helper functions that pull structured data out of free-form text.

import re

def ints(text):
    """Extract all integers from text."""
    return tuple(map(int, re.findall(r'-?[0-9]+', text)))

def words(text):
    """Extract all words from text."""
    return tuple(re.findall(r'[a-zA-Z]+', text))

def atoms(text):
    """Extract all atoms (numbers or identifiers)."""
    return tuple(atom(s) for s in re.findall(r'[+-]?\d+\.?\d*|\w+', text))

def atom(s):
    """Parse string as number or keep as string."""
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            return s
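
One way these helpers might be applied to a multi-line input file is sketched below. The `parse_lines` helper and the sample text are hypothetical and exist only for this illustration; `ints` is repeated so the block runs on its own.

```python
import re

def ints(text):
    """Extract all integers from text (same regex as the helper above)."""
    return tuple(map(int, re.findall(r'-?[0-9]+', text)))

def parse_lines(text, parser):
    """Apply parser to each non-blank line (hypothetical helper)."""
    return [parser(line) for line in text.splitlines() if line.strip()]

# Made-up puzzle-style input: one record per line.
raw = "Robot 1 at (3, -5) moving (2, 0)\nRobot 2 at (-4, 7) moving (0, -1)\n"

records = parse_lines(raw, ints)
print(records)  # [(1, 3, -5, 2, 0), (2, -4, 7, 0, -1)]
```

Each line becomes one tuple, so the surrounding words and punctuation never need to be described explicitly.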

Example (from pytudes AdventUtils.ipynb)

import re
from typing import Tuple  # needed for the Tuple[...] annotations below

def ints(text: str) -> Tuple[int, ...]:
    """A tuple of all the integers in text."""
    return tuple(map(int, re.findall(r'-?[0-9]+', text)))

def positive_ints(text: str) -> Tuple[int, ...]:
    """A tuple of all unsigned integers in text (any '-' sign is ignored, so '-5' yields 5)."""
    return tuple(map(int, re.findall(r'[0-9]+', text)))

def digits(text: str) -> Tuple[int, ...]:
    """A tuple of all single digits in text."""
    return tuple(map(int, re.findall(r'[0-9]', text)))

def words(text: str) -> Tuple[str, ...]:
    """A tuple of all alphabetic words in text."""
    return tuple(re.findall(r'[a-zA-Z]+', text))

def atoms(text: str) -> Tuple:
    """A tuple of all atoms (numbers or identifiers)."""
    return tuple(map(atom, re.findall(r'[+-]?\d+\.?\d*|\w+', text)))

def atom(text: str):
    """Parse text into a single float or int or str."""
    try:
        x = float(text)
        return round(x) if x.is_integer() else x
    except ValueError:
        return text.strip()

# Usage examples
ints("Robot at (3, -5) with speed 10")  # (3, -5, 10)
words("Hello, World! 123")  # ('Hello', 'World')
atoms("x=42, y=3.14, name=foo")  # ('x', 42, 'y', 3.14, 'name', 'foo')
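
The two `atom` variants on this page do not behave identically, and the sketch below compares them side by side. The names `atom_cascade` and `atom_float_first` are hypothetical labels introduced here so both versions can coexist in one block; the bodies follow the two definitions shown above.

```python
def atom_cascade(s):
    """Try int, then float, then keep the string unchanged."""
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            return s

def atom_float_first(text):
    """Parse via float, round whole values to int, else strip the string."""
    try:
        x = float(text)
        return round(x) if x.is_integer() else x
    except ValueError:
        return text.strip()

# Whole-valued decimals: float vs int.
print(atom_cascade("2.0"), atom_float_first("2.0"))  # 2.0 2

# Non-numeric text: only the float-first version strips whitespace.
print(repr(atom_cascade(" foo ")), repr(atom_float_first(" foo ")))  # ' foo ' 'foo'

# Very large integers: going through float can lose precision.
big = "12345678901234567890"
print(atom_cascade(big) == int(big))      # True
print(atom_float_first(big) == int(big))  # False
```

Which variant is preferable probably depends on the input: the float-first version is more compact, while the int-first cascade keeps large integers exact.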

Key Principles

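  1. Regex for patterns: r'-?[0-9]+' matches signed integers; r'[a-zA-Z]+' matches alphabetic words
  2. Return tuples: results are immutable and hashable, so they can serve as dict keys or set members
  3. Cascading conversion: try int, then float, then fall back to the original string
  4. Negative-aware: include -? to capture negative numbers; drop it when hyphens act as separators
  5. Reusable utilities: the same functions work on any text input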
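
Two of the principles above can be checked with a short sketch: tuples being hashable, and the `-?` prefix treating a separator hyphen as a minus sign. The helper definitions are repeated so the block is self-contained, and the sample strings are made up for illustration.

```python
import re

def ints(text):
    """Signed integers: a '-' directly before digits counts as a sign."""
    return tuple(map(int, re.findall(r'-?[0-9]+', text)))

def positive_ints(text):
    """Unsigned digit runs: any '-' is ignored."""
    return tuple(map(int, re.findall(r'[0-9]+', text)))

# Tuples are hashable, so parsed results can be set members or dict keys.
seen = {ints("(3, -5)"), ints("3,-5"), ints("(0, 0)")}
print(len(seen))  # 2

# Caveat: '-?' reads the hyphen in a range as a minus sign.
print(ints("3-5"))           # (3, -5)
print(positive_ints("3-5"))  # (3, 5)
```

For inputs such as ranges or dates, where the hyphen is a separator, the unsigned variant is likely the safer choice.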

Install via CLI

npx skills add https://github.com/jimmc414/claude-code-plugin-marketplace --skill parse-extract-input