install-pymllm

star 1.5k

Install the pymllm Python package. Asks the user whether to do a full build (with CMake C++ compilation) or a fast install (Python-only, skip CMake). Use when the user asks to install, set up, or reinstall pymllm.

UbiquitousLearning By UbiquitousLearning schedule Updated 3/17/2026

name: install-pymllm description: Install the pymllm Python package. Asks the user whether to do a full build (with CMake C++ compilation) or a fast install (Python-only, skip CMake). Use when the user asks to install, set up, or reinstall pymllm.

Install pymllm

Goal

Help the user install the pymllm package with the right configuration for their use case.

Workflow

Step 1: Ask the user which install mode they want

Use AskUserQuestion to present two options:

Full Install (with C++ build)

  • Compiles the C++ mllm runtime and FFI extension via CMake
  • Required if the user needs mobile inference, model conversion with FFI, or CPU/QNN backends
  • Slower (several minutes depending on the machine)
  • Command: pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall

Fast Install (Python-only, skip CMake)

  • Skips the entire CMake build step
  • Only installs the pure Python package
  • Recommended for users who only use CUDA backends (FlashInfer, TileLang) and do not need the C++ mllm runtime
  • Much faster (seconds)
  • Command: SKBUILD_WHEEL_CMAKE=false pip install -e .

Step 2: Ask editable or non-editable

Use AskUserQuestion to ask:

  • Editable (pip install -e .): For active development. Python imports point to the source tree. Changes to .py files take effect immediately without reinstalling.
  • Non-editable (wheel): For stable usage. Installs a wheel into site-packages.

Step 3: Ask whether the user needs CUDA optional dependencies

Use AskUserQuestion to ask whether the user needs CUDA support (FlashInfer, TileLang, pyzmq, etc.).

This determines whether to append [cuda] to the install specifier (e.g. pip install -e ".[cuda]" instead of pip install -e .).

This applies to ALL install modes. For fast-install users this is especially important since the CUDA packages are the primary compute backend.

Step 4: Execute the install

Based on user choices, compose and run the appropriate command. The install specifier is either . or ".[cuda]" depending on Step 3.

Mode Editable CUDA Command
Full Yes No pip install -e -v .
Full Yes Yes pip install -e -v ".[cuda]"
Full No No pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall
Full No Yes pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall && pip install "pymllm[cuda]"
Fast Yes No SKBUILD_WHEEL_CMAKE=false pip install -e .
Fast Yes Yes SKBUILD_WHEEL_CMAKE=false pip install -e ".[cuda]"
Fast No No SKBUILD_WHEEL_CMAKE=false pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall
Fast No Yes SKBUILD_WHEEL_CMAKE=false pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall && pip install "pymllm[cuda]"

Step 5: Post-install for editable + full build

If the user chose editable + full build, the compiled .so files live in a build directory (e.g. build/bin/), not in the source tree. The Python code at pymllm/__init__.py looks for libraries at pymllm/lib/MllmFFIExtension.so. A symlink is needed to bridge this gap.

Invoke the /link-pymllm-lib skill to help the user set up the symlink.

Important Notes

  • The project root must contain pyproject.toml with scikit-build-core as the build backend.
  • The wheel.cmake = true flag in pyproject.toml controls whether CMake runs. The env var SKBUILD_WHEEL_CMAKE=false overrides it at install time without modifying the file.
  • For non-editable full builds, the .so files are bundled inside the wheel automatically — no symlink needed.
  • For fast installs, pymllm.is_mobile_available() will return False since no C++ libraries are present. This is expected.
  • The [cuda] optional dependencies are defined in pyproject.toml under [project.optional-dependencies].
Install via CLI
npx skills add https://github.com/UbiquitousLearning/mllm --skill install-pymllm
Repository Details
star Stars 1,538
call_split Forks 204
navigation Branch main
article Path SKILL.md
More from Creator
UbiquitousLearning
UbiquitousLearning Explore all skills →