name: haskell-design
description: Architecture, module layout, effect handling, and testing patterns for Haskell projects in this author's style (devbot / apocrypha / coreutils). TRIGGER when designing a new Haskell project or module, structuring a port to Haskell, deciding how to thread IO/effects, or writing hspec test suites. Complements haskell-dev (which covers tooling such as sghci, stack unpack, and HLS).
globs: "*.hs"
alwaysApply: false
Haskell Design Patterns
The conventions below are distilled from three reference projects — devbot (a task
scheduler daemon), apocrypha (a JSON server/client), and coreutils (~40 unix utils in
one dispatch binary). They are the house style. When designing a new project or module,
or porting code into Haskell, match these. For tooling (type inspection with sghci,
reading dep source via stack unpack, HLS), see the haskell-dev skill — this skill is
about design.
The governing philosophy across all three: pure core, thin IO edge, effects as plain values, lean dependencies, illegal states unrepresentable, everything tested.
1. Project skeleton
<project>.cabal (or package.yaml for hpack)
stack.yaml -- pinned resolver, e.g. lts-24.24 (GHC 9.8/9.10)
stack.yaml.lock
Makefile -- all / release / test / format / lint
hie.yaml -- HLS cradle (one line: cradle: {stack: ...})
README.md
LICENSE
<Project>/ -- the LIBRARY lives at the repo root, not under src/
├── <Feature>.hs -- user-facing / top-level modules
├── <Feature>/
│ ├── Config.hs -- ADTs + FromJSON + a Valid instance
│ └── Runtime.hs -- the lifecycle / state machine for that feature
└── Internal/
├── Common.hs -- shared effect synonyms, logger, getTime
├── Persist.hs -- storage wrapper
└── ... -- one focused module per concern
src/
└── main.hs -- TINY entry point; dispatches into <Project>.Cli/Run
test/
├── Spec.hs -- {-# OPTIONS_GHC -F -pgmF hspec-discover #-}
├── Helpers.hs -- shared stubs/recorders/fixtures (register in cabal!)
└── <Module>Spec.hs -- one spec file per module
Key structural choices:
- Library source sits at the repo root under <Project>/ (hs-source-dirs: .). The executable is a separate tiny target under src/ that depends on the library.
- main.hs is intentionally trivial: it parses/dispatches and calls into a Run or Cli module. All logic is in the library so it's testable.
- Internal/ holds implementation modules not meant as the public API. Modules outside Internal/ are the surface.
2. Cabal / package conventions
Use a common shared stanza (cabal) or top-level defaults (hpack) so that
default-extensions / dependencies / ghc-options reach every target: they do NOT
propagate from library to executable to test (a frequent build failure; see haskell-dev).
In cabal, each target pulls the stanza in with import: shared.
Standard settings:
common shared
    default-language: GHC2024
    -- coreutils' style also adds StrictData here
    default-extensions: RecordWildCards
    -- -Wunused-packages keeps the dep list honest
    ghc-options: -Wall
                 -Wcompat
                 -Wincomplete-record-updates
                 -Wincomplete-uni-patterns
                 -Wredundant-constraints
                 -Wpartial-fields
                 -Wunused-packages
    build-depends: base >= 4.7 && < 5
- GHC2024 language edition: LambdaCase, GADTs, etc. are on without pragmas. RecordWildCards is the one extra usually added explicitly. coreutils also defaults StrictData.
- Lean dependencies are a value, not an accident. Before adding a dep, check whether an existing one or base covers it. -Wunused-packages enforces removing dead deps. Typical surface: aeson / yaml, text, bytestring, containers / unordered-containers, directory / filepath, time, http-client(-tls), process, network, async / stm. Reach for hand-rolled before heavy (hmatrix, big frameworks).
- Adding a module = two edits: create the file AND register it (exposed-modules / other-modules). Forgetting the cabal entry is the #1 build failure. Same for new test specs (test-suite … other-modules).
- Optimization via a release flag (-O2 -threaded on, -O0 off) so dev builds are fast (coreutils' pattern).
3. Effect handling — the central pattern
Inject IO effects as plain function values. No typeclasses, no monad transformers, no
ReaderT. This is the single most important and most distinctive convention.
Each effect is a type synonym:
type Clock = IO Integer -- POSIX seconds time source
type ContextF = IO Context -- persistence handle factory
type AliveCheck = Pid -> IO Bool
type Spawner = String -> [String] -> IO ProcessHandle
type Pinger = String -> IO ()
The effects that travel together through one pipeline are bundled into a single per-domain env record, so functions take one argument instead of a growing list:
data EventEnv = EventEnv
    { eePinger  :: !Pinger
    , eeOutput  :: !OutputStreamF
    , eeContext :: !ContextF
    , eeClock   :: !Clock
    }
handle :: EventEnv -> Task -> IO Task -- pipeline entry takes the whole env
Rules of thumb:
- Bundle effects that flow together; keep leaf helpers narrow. A one-effect helper takes just that effect (flush :: ContextF -> ..., monitorShift :: Clock -> ...), not the whole env. Handing a full env to a one-effect function hides its real dependency.
- The top-level runtime (Bot.runner) builds the record once with real implementations (httpPing, getTime, spawnProcess); tests build it with stubs.
- When you add an effect: add a type synonym, thread it into the relevant record(s). Reach for the record, not a typeclass: "a record is exactly what ReaderT would wrap if that ever became necessary."
- Time comes from a single getTime :: Clock (POSIX seconds, Integer), never getCurrentTime scattered through the code. Logging goes through a logger, never raw putStrLn from inside a runtime.
This keeps the business logic pure and the effectful shell a thin, swappable layer — and makes testing a matter of passing different functions, not building mocks.
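As a rough illustration of the rules above, here is a minimal base-only sketch. The Task fields, the cut-down Env record (a stand-in for EventEnv), the due / isDue helpers, and stubEnv are hypothetical names invented for this example, not devbot's actual definitions.

```haskell
-- Hypothetical sketch: effects as plain values, bundled in one record.
type Clock  = IO Integer        -- POSIX seconds time source
type Pinger = String -> IO ()   -- notify some endpoint

data Task = Task
    { taskName     :: !String
    , taskInterval :: !Integer  -- seconds between runs
    , taskLastRun  :: !Integer  -- POSIX seconds of the previous run
    } deriving (Show, Eq)

-- Assumed, simplified env; a real env would carry more effects.
data Env = Env
    { envPinger :: !Pinger
    , envClock  :: !Clock
    }

-- Pure core: the scheduling decision needs no IO.
due :: Integer -> Task -> Bool
due now t = now - taskLastRun t >= taskInterval t

-- Leaf helper: it needs only the clock, so it takes only the clock.
isDue :: Clock -> Task -> IO Bool
isDue clock t = (`due` t) <$> clock

-- Pipeline entry: takes the whole env.
handle :: Env -> Task -> IO Task
handle env t = do
    now <- envClock env
    if due now t
        then do
            envPinger env (taskName t)
            pure t { taskLastRun = now }
        else pure t

-- Test-side env: frozen clock, pinger that does nothing.
stubEnv :: Integer -> Env
stubEnv now = Env { envPinger = \_ -> pure (), envClock = pure now }
```

In this sketch, swapping stubEnv for an env built from real implementations changes nothing in handle, which is the property the section is after.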
4. Type-driven design — make illegal states unrepresentable
The flip side of "pure core": once logic is pure, push its preconditions into the types so the compiler enforces them and whole classes of test disappear. This is a strong house preference — reach for it before adding a runtime guard.
- Closed sum type over a string set. A fixed set of identifiers (rooms, sensors, modes) is an ADT, not a [Text] of "valid values". Keep one small module owning the type plus a tag :: T -> Text (and parse :: Text -> Maybe T when something inbound needs it), and convert only at the wire edge (DNS, CLI args, a DB tag, a frontend key). Payoff: a partial lookup with a silent fallback (f _ = default) becomes a total case the compiler checks, and the several duplicated "valid values" lists collapse to [minBound .. maxBound].
- Encode preconditions in the input type. NonEmpty a (from base) instead of a null guard in front of a partial head / maximumBy; a smart constructor or Maybe-returning parser instead of re-validating a raw value at every use. NE.nonEmpty :: [a] -> Maybe (NonEmpty a) is the boundary check; inside, the head is total. Only encode what the type actually captures: NonEmpty means "≥1", so a function needing "≥2 points spanning a day" still honestly returns Maybe.
- Defer an effect out of a pure builder by returning a function. A pure build :: ... -> Maybe (X -> Result) lets the caller supply the one effectful input (Just . mk <$> fetchX), keeping the decision pure and the fetch conditional. This is cleaner than threading the value in just so the builder is "complete".
- The tell: when a type change deletes a test, you've turned a bug into a non-representable state. "rejects an unknown host" / "errors on empty" tests vanish because the input can no longer be constructed. That's the goal, not lost coverage.
- Know when not to newtype. Per-quantity wrappers (Fahrenheit, Mph) earn their keep at API boundaries with mixable units, but in arithmetic-dense pure code they add constant wrap/unwrap noise for little gain. Sum types for identity almost always pay off; newtypes for quantities are a judgment call, so don't reflexively wrap every Double.
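The first two bullets can be sketched together. This is an illustrative example only: Room, setpoint, and warmest are hypothetical, and String stands in for Text so the block needs nothing beyond base.

```haskell
import Data.Foldable (maximumBy)
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE
import Data.Ord (comparing)

-- Hypothetical closed set of identifiers.
data Room = Kitchen | Office | Garage
    deriving (Show, Eq, Ord, Enum, Bounded)

-- The single source of "valid values".
allRooms :: [Room]
allRooms = [minBound .. maxBound]

-- Wire-edge conversion out (String here; the text above uses Text).
tag :: Room -> String
tag Kitchen = "kitchen"
tag Office  = "office"
tag Garage  = "garage"

-- Wire-edge conversion in: unknown names are rejected once, at the boundary.
parse :: String -> Maybe Room
parse s = lookup s [(tag r, r) | r <- allRooms]

-- Total case: adding a constructor makes -Wall flag this function.
setpoint :: Room -> Double
setpoint Kitchen = 68
setpoint Office  = 70
setpoint Garage  = 50

-- The input type carries the "at least one" precondition,
-- so maximumBy cannot hit the empty case here.
warmest :: NonEmpty (Room, Double) -> Room
warmest = fst . maximumBy (comparing snd)

-- Boundary check: the only place that deals with a possibly-empty list.
warmestOf :: [(Room, Double)] -> Maybe Room
warmestOf = fmap warmest . NE.nonEmpty
```

With this shape there is no "rejects an unknown room" test to write for setpoint or warmest; only parse and warmestOf can see bad input.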
5. Data & config style
- ADTs with strict fields (! on every record field, or StrictData). Records use RecordWildCards for construction/destructuring.
- Config = ADT + FromJSON + Valid. JSON/YAML via aeson (+ yaml). Each config type has a hand-written instance: parseJSON = withObject "Name" $ \o -> ... with .: for required and .:? (often .!= def) for optional fields. Default a Maybe field in the FromJSON instance when absence means a default, so consumers see a plain value.
- A Valid typeclass centralizes validation (valid :: a -> Either String a or similar); e.g. optional string fields reject "".
- Wire formats get explicit, commented (de)serializers, not generic deriving with a field-mangler. When JSON field names must match an external spec exactly (an RPC protocol, a frontend contract), spell out the mapping and comment why fields are fixed.
- Adding a field to a record used in positional patterns is a cross-codebase edit. Grep for TypeName ( and TypeName [ before trusting the compiler: Valid and display code often use positional matches the type-checker won't fully catch.
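One possible shape for the Valid class, as a base-only sketch. The TaskConfig record, its field names, and the error messages are hypothetical; the FromJSON instance is left out so the block does not depend on aeson.

```haskell
-- Assumed signature, following the "valid :: a -> Either String a or similar" hint.
class Valid a where
    valid :: a -> Either String a

-- Hypothetical config type with strict fields.
data TaskConfig = TaskConfig
    { tcName     :: !String
    , tcInterval :: !Integer          -- seconds; must be positive
    , tcParallel :: !(Maybe String)   -- optional; Just "" is rejected
    } deriving (Show, Eq)

instance Valid TaskConfig where
    valid c
        | null (tcName c)         = Left "name must not be empty"
        | tcInterval c <= 0       = Left "interval must be positive"
        | tcParallel c == Just "" = Left "parallel, when present, must not be empty"
        | otherwise               = Right c

-- Validation composes with Either: the first failing element stops the traversal.
validAll :: Valid a => [a] -> Either String [a]
validAll = traverse valid
```

A loader would typically decode first and then run valid once, so everything downstream can assume a checked config.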
6. CLI / dispatch
- main.hs stays tiny: parse args, dispatch, exit. coreutils dispatches ~40 utilities via a HashMap keyed on program name / first arg into an existential Utility wrapper around a Util typeclass (the one sanctioned typeclass; it's a plugin registry, not effect injection).
- Arg parsing: System.Console.GetOpt folding (foldM) over a default options record. Every tool supports -h / --help.
- Errors: Either String for recoverable; System.Exit.die for fatal. Don't throw for control flow.
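The GetOpt-plus-foldM idiom might look roughly like this. The Options record, the flag set, and the usage string are hypothetical; returning the help text through Left is a simplification made for this sketch, not necessarily how coreutils reports it.

```haskell
import Control.Monad (foldM)
import System.Console.GetOpt
    (ArgDescr (..), ArgOrder (..), OptDescr (..), getOpt, usageInfo)

-- Hypothetical options record for a head-like tool.
data Options = Options
    { optLines :: !Int
    , optQuiet :: !Bool
    } deriving (Show, Eq)

defaults :: Options
defaults = Options { optLines = 10, optQuiet = False }

-- Each flag is a function that updates the record or fails with a message.
options :: [OptDescr (Options -> Either String Options)]
options =
    [ Option "n" ["lines"] (ReqArg setLines "N") "number of lines to print"
    , Option "q" ["quiet"] (NoArg (\o -> Right o { optQuiet = True })) "suppress headers"
    , Option "h" ["help"] (NoArg (\_ -> Left usage)) "show this help"
    ]
  where
    setLines s o = case reads s of
        [(n, "")] | n >= 0 -> Right o { optLines = n }
        _                  -> Left ("invalid line count: " <> s)

usage :: String
usage = usageInfo "usage: head [options] [file ...]" options

-- Fold the flag functions over the defaults; the first Left short-circuits.
parseArgs :: [String] -> Either String (Options, [String])
parseArgs args = case getOpt Permute options args of
    (fs, rest, []) -> do
        opts <- foldM (flip ($)) defaults fs
        pure (opts, rest)
    (_, _, errs) -> Left (concat errs)
```

A tiny main would then call parseArgs on getArgs, pass a Left to System.Exit.die, and hand a Right to the library's run function.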
7. Concurrency (when needed)
Most code is single-threaded. When a daemon needs concurrency (apocrypha's server):
-threaded, async for spawning, stm/TVar or MVar for shared state. Prefer
thread-per-source (one loop per socket/connection) over reimplementing select. Guard
shared mutable state (throttles, caches) with an IORef (single-threaded) or MVar/STM
(concurrent) — not a global.
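A base-only sketch of thread-per-source with MVar-guarded shared state. It uses forkIO and a completion MVar as a stand-in for async (which the text recommends) purely to stay dependency-free; countAll and its list-of-lists "sources" are hypothetical.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
    (MVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)

-- One worker thread per source; every worker adds into one shared total.
-- Sketch only: a worker that throws would never signal completion,
-- which is one of the problems async solves.
countAll :: [[Int]] -> IO Int
countAll sources = do
    total <- newMVar 0
    dones <- mapM (worker total) sources
    mapM_ takeMVar dones              -- wait for every worker to finish
    readMVar total

-- Fork a loop over one source; returns an MVar filled when the loop ends.
worker :: MVar Int -> [Int] -> IO (MVar ())
worker total src = do
    done <- newEmptyMVar
    _ <- forkIO $ do
        -- modifyMVar_ takes and puts back the MVar, so updates never interleave.
        mapM_ (\x -> modifyMVar_ total (\t -> pure $! t + x)) src
        putMVar done ()
    pure done
```

The shared total is created inside countAll and passed to each worker explicitly, so there is no global to reset between tests.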
8. Testing
hspec with hspec-discover (test/Spec.hs is just the discover pragma). One
<Module>Spec.hs per module; register each in the cabal other-modules.
- Prefer pure-function unit tests. The pure core is the bulk of the code and tests with direct input/output, no mocks.
- QuickCheck for properties (boundary conditions, invariants like "sign of rate matches reported direction"); apocrypha and coreutils both use it.
- Test IO with the effect records, using stubs, not network/HTTP mocks. Shared helpers live in test/Helpers.hs (register it!). The canonical kit (devbot):
  - fixedClock n: a frozen Clock so time-dependent arithmetic is exact, not wall-clock-dependent.
  - mkRecorderBy f / mkRecorder: an IORef-backed recorder for any injected effect; returns a (record-fn, read-back) pair. Assert what would have been written/sent without a real server. "Don't add a test HTTP server when a recorder will do."
  - in-memory backends (getContext ServerMemory), noPinger, noContext ("must-not-be-read" stub), temp-dir fixtures.
- Use real IO where it's cheap and deterministic: spawnCommand "echo a", real temp dirs via the temporary package, in-memory stores. Reserve integration tests (coreutils' test/integration/*.sh) for genuinely IO-heavy features; compare output against a reference implementation.
- Golden tests for wire-format contracts: pin exact bytes/JSON for fixed inputs when output must match an external consumer.
- Capturing stdout: System.IO.Silently.capture_ (the silently package); see haskell-dev for details. Timezone-safe filesystem-time tests: build UTCTime through the local timezone anchored at noon (also in haskell-dev).
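A guess at what the stub kit could look like, using only base. The signatures of fixedClock, mkRecorderBy, and mkRecorder are assumptions inferred from the descriptions above, and notifyIfLate plus the example URLs are hypothetical code under test.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef', newIORef, readIORef)

type Clock  = IO Integer
type Pinger = String -> IO ()

-- A frozen clock: always reports the same POSIX second.
fixedClock :: Integer -> Clock
fixedClock = pure

-- Assumed shape: returns (record-fn, read-back); f picks what to keep per call.
mkRecorderBy :: (a -> b) -> IO (a -> IO (), IO [b])
mkRecorderBy f = do
    ref <- newIORef []
    let record x = modifyIORef' ref (f x :)
        readBack = reverse <$> readIORef ref   -- oldest call first
    pure (record, readBack)

mkRecorder :: IO (a -> IO (), IO [a])
mkRecorder = mkRecorderBy id

-- Hypothetical code under test: ping only once the deadline has passed.
notifyIfLate :: Clock -> Pinger -> Integer -> String -> IO ()
notifyIfLate clock ping deadline url = do
    now <- clock
    when (now > deadline) (ping url)

-- What a spec body would do: run with stubs, then read back the calls.
demo :: IO [String]
demo = do
    (ping, sent) <- mkRecorder
    notifyIfLate (fixedClock 100) ping 50 "https://example.invalid/late"
    notifyIfLate (fixedClock 100) ping 500 "https://example.invalid/early"
    sent
```

In an hspec suite the last line of demo would become an assertion such as sent `shouldReturn` [...], with no server or network involved.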
TDD loop (coreutils house workflow)
Strict red-green-refactor, one behavioral change per cycle:
- Red: write a high-level spec test that runs and produces wrong output. A compile error does not count as a failing test (it blocks all tests and gives no behavioral signal). Stub missing functions with f = undefined / add record fields with defaults so everything compiles and only the new test is red. Add focused low-level unit tests for the new pure logic.
- Green: minimal code to pass every test, following existing patterns.
- Refactor: with tests green, simplify and extract helpers.
- make ready (format, lint, test) must pass before committing.
9. Style & hygiene
- 4-space indentation.
- where clauses preferred over top-level helpers when the helper isn't reused.
- Qualified imports with short aliases; group imports: stdlib, then external packages, then project modules.
- Module headers use Haddock block comments (Module, Description, Copyright, License, Maintainer, Stability, Portability) for top-level modules.
- Single-sentence Haddock on functions unless something subtle needs explaining.
- make format then make lint before committing: stylish-haskell --inplace and hlint -j. coreutils wraps both plus tests as make ready. make ready fails on a lint, so hlint is a gate here, not advisory; take its suggestions (fromRight, maximumBy, listToMaybe, …) rather than working around them. stylish-haskell also rewrites import groups/alignment in place, so don't hand-format imports.
10. Designing a Python→Haskell port
When porting (the reason this skill often fires):
- Inventory by purity. Split modules into pure (port mechanically: dataclasses → ADTs, pure functions → pure functions, Enum → sum type) vs IO-edge (needs design). Typically the majority is pure and ports almost verbatim, and tests better, since purity removes guard hacks like "don't write to prod DB during tests."
- Find the integration seam (a database, a file the frontend reads, a socket) where Python and Haskell can run side-by-side against the same external state. That seam is your incremental migration boundary and your strongest correctness check: same input, diff the output.
- Port pure core + its tests first: fully green suite before any socket exists, zero risk to the running system.
- Then read path, then write path, then the daemon/server, each shippable and verifiable against the seam. Cut over the service last; it's the only hard-to-reverse step.
- Replace callbacks/globals with effect records (§3), not a faithful translation of Python's callback-passing or module-level singletons.
- Numerics: hand-roll small routines (least-squares fit, percentile) in an Internal/Stats.hs rather than adding hmatrix / statistics, but pin them with golden fixtures captured from the original (numpy's polyfit coefficient order and percentile interpolation have specific conventions you must reproduce exactly, not approximately). Keep a heavier library in mind only as an escape hatch if exact parity proves impractical.
- Reproduce external wire formats byte-for-byte with explicit serializers + golden tests (RPC JSON, any file a frontend parses, line protocols).
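As an example of the numerics bullet, here is a hand-rolled percentile that follows numpy's default linear interpolation between closest ranks. The function name and the NonEmpty input are choices made for this sketch; golden fixtures captured from the original code remain the authority on parity.

```haskell
import Data.List (sort)
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- Percentile with linear interpolation: for n sorted values the virtual
-- index is h = (n - 1) * q / 100, and the result interpolates between the
-- elements at floor h and floor h + 1. q is clamped to [0, 100] in this sketch.
percentile :: Double -> NonEmpty Double -> Double
percentile q xs = lo + frac * (hi - lo)
  where
    sorted = sort (NE.toList xs)
    n      = length sorted
    h      = fromIntegral (n - 1) * clamp q / 100
    i      = floor h :: Int
    frac   = h - fromIntegral i
    lo     = sorted !! i
    hi     = sorted !! min (i + 1) (n - 1)   -- stay in bounds at q = 100
    clamp  = max 0 . min 100
```

For the input [1, 2, 3, 4] this gives 2.5 at q = 50 and 1.75 at q = 25, matching numpy.percentile with its default method; a golden test would pin values like these for the project's real data.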