attribute

Attribute leaked environment data to victim companies by analyzing ownership signals. Use when analyzing breach data, supply chain attack artifacts, or exfiltrated environment snapshots to identify which organization was compromised. Trigger for questions about victim identification, leaked credential attribution, or breach victimology.

By ramimac. Updated 3/2/2026

name: attribute
version: 1.0
description: Attribute leaked environment data to victim companies by analyzing ownership signals. Use when analyzing breach data, supply chain attack artifacts, or exfiltrated environment snapshots to identify which organization was compromised. Trigger for questions about victim identification, leaked credential attribution, or breach victimology.
author: ramimac
argument-hint: [target-directory]

Leaked Data Attribution

Identify the victim organization from leaked environment data by analyzing ownership signals - platform metadata, private infrastructure, corporate domains, credentials, and unique identifiers.


Core Principle: Organizational vs Individual Signals

Attribution requires distinguishing between:

Organizational signals - Directly identify an organization:

  • Platform organization names (GitHub org, Azure DevOps collection, GitLab namespace)
  • Corporate infrastructure (private registries, self-hosted tools)
  • Enterprise accounts (SSO redirects, tenant IDs)

Individual signals - Identify a person who may work for an organization:

  • Personal tokens and credentials
  • Email addresses
  • Usernames and profile fields

The key question: Does this individual represent the organization, or are they incidental to the data?


Analysis Workflow

Step 1: Extract and Catalog All Signals

Before attributing, systematically extract everything that could identify ownership.

Domains to extract:

  • Email domains (from any email address found)
  • URL domains (from any URL - registries, APIs, webhooks, configs)
  • Hostname patterns (machine names, DNS suffixes)
  • Proxy/network configuration domains

Identifiers to extract:

  • Platform org/user names (GitHub, GitLab, Azure DevOps, etc.)
  • Tenant/account IDs (Azure tenant, AWS account, GCP project)
  • Workspace/team IDs (Slack team ID, Atlassian org)
  • Repository paths and namespaces

High-entropy strings to note:

  • API keys and tokens (can be validated/enriched via APIs)
  • Webhook URLs (contain embedded IDs)
  • JWT tokens (contain issuer, claims)
  • Connection strings (contain hostnames, accounts)

Unique patterns to flag:

  • Custom environment variable prefixes (e.g., ACME_*)
  • Asset naming conventions in hostnames
  • Internal tool names or project codes
  • Scoped package names (@company/package)
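
A minimal sketch of this cataloging pass, assuming the leaked data is available as plain text (for example a dumped `.env` file). The function name `catalog_signals`, the regular expressions, and the `GENERIC_PREFIXES` set are illustrative assumptions, not part of the skill; real data will need broader patterns.

```python
import re
from collections import defaultdict

# Illustrative patterns only; they favor readability over full RFC coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")
URL_RE = re.compile(r"https?://(?:[^/\s@]+@)?([A-Za-z0-9.-]+)")
SCOPE_RE = re.compile(r"(?<![A-Za-z0-9._%+-])@([a-z0-9][a-z0-9._-]*)/[a-z0-9][a-z0-9._-]*")
ENV_PREFIX_RE = re.compile(r"^([A-Z][A-Z0-9]*)_[A-Z0-9_]+=", re.MULTILINE)

# Hypothetical list of prefixes that are too common to identify an owner.
GENERIC_PREFIXES = {"GIT", "NPM", "CI", "GITHUB", "GITLAB", "AWS", "AZURE",
                    "GOOGLE", "NODE", "DOCKER"}


def catalog_signals(text: str) -> dict:
    """Collect candidate ownership signals from raw environment text."""
    signals = defaultdict(set)
    for match in EMAIL_RE.finditer(text):
        signals["email_domains"].add(match.group(1).lower())
    for match in URL_RE.finditer(text):
        signals["url_domains"].add(match.group(1).lower())
    for match in SCOPE_RE.finditer(text):
        signals["package_scopes"].add(match.group(1))
    for match in ENV_PREFIX_RE.finditer(text):
        prefix = match.group(1)
        if prefix not in GENERIC_PREFIXES:
            signals["env_prefixes"].add(prefix)
    return dict(signals)
```

The output is only a catalog; nothing here is an attribution until the later steps assign confidence.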

Step 2: Analyze Organizational Platform Signals

Examine signals that directly identify an organization through platform metadata.

Platform organizations:

  • GitHub repository owner → Verify it's an Organization (not User) via API
  • Azure DevOps collection URI → Extract org from path
  • GitLab project namespace → Root namespace = org
  • CircleCI, Bitbucket, Buildkite, Drone, Travis → Org/workspace fields

Enterprise indicators:

  • Self-hosted platform domains (e.g., gitlab.acme.com = Acme owns it)
  • Enterprise SSO/SAML redirects (GitHub Enterprise Cloud detection)
  • Tenant IDs that resolve to organization names

What to look for:

  • Org names in platform-specific environment variables
  • Repository URLs with org in path
  • OIDC token claims containing owner/org fields

Confidence: HIGH when verified as organization
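
As a rough sketch, org names can be pulled from CI environment variables and then checked against an account lookup. The variable names below are the ones I understand each platform to set; treat them as assumptions and confirm against the platform docs. `org_from_env` and `is_verified_org` are hypothetical helper names. `is_verified_org` expects the already-fetched JSON body of a GitHub account lookup, whose `type` field distinguishes `Organization` from `User`; no network call is made here.

```python
from urllib.parse import urlparse

# Variables whose value starts with the owner/namespace ("owner/repo" style).
SLUG_VARS = [
    ("github", "GITHUB_REPOSITORY"),
    ("gitlab", "CI_PROJECT_NAMESPACE"),
    ("travis", "TRAVIS_REPO_SLUG"),
]
# Variables whose whole value is the org or workspace name.
SIMPLE_VARS = [
    ("circleci", "CIRCLE_PROJECT_USERNAME"),
    ("bitbucket", "BITBUCKET_WORKSPACE"),
    ("buildkite", "BUILDKITE_ORGANIZATION_SLUG"),
    ("drone", "DRONE_REPO_OWNER"),
]


def org_from_env(env: dict) -> list:
    """Return (platform, org) pairs found in a leaked environment mapping."""
    found = []
    for platform, var in SLUG_VARS:
        value = env.get(var, "")
        if value:
            # Root namespace is the first path segment.
            found.append((platform, value.split("/", 1)[0]))
    for platform, var in SIMPLE_VARS:
        value = env.get(var, "")
        if value:
            found.append((platform, value))
    # Azure DevOps collection URI: org is in the path or the subdomain.
    uri = env.get("SYSTEM_COLLECTIONURI", "")
    if uri:
        parsed = urlparse(uri)
        host = (parsed.hostname or "").lower()
        segments = [s for s in parsed.path.split("/") if s]
        if host == "dev.azure.com" and segments:
            found.append(("azure_devops", segments[0]))
        elif host.endswith(".visualstudio.com"):
            found.append(("azure_devops", host.split(".")[0]))
    return found


def is_verified_org(account_json: dict) -> bool:
    """True when a fetched GitHub account record is an organization."""
    return account_json.get("type") == "Organization"
```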


Step 3: Analyze Infrastructure Signals

Private or self-hosted infrastructure indicates organizational ownership.

Domains indicating ownership:

  • Private registry domains (npm.acme.com, artifactory.acme.io)
  • Self-hosted tool domains (vault.acme.com, sentry.acme.internal)
  • Internal Git server domains

Identifiers in infrastructure URLs:

  • Org names embedded in paths (pkgs.dev.azure.com/{org}/...)
  • Tenant subdomains ({company}.jfrog.io)
  • Account-specific endpoints

Package scopes:

  • Scoped npm packages (@acme/package) pointing to private registries
  • Private PyPI indexes
  • Go private module patterns

Confidence: HIGH when on identifiable corporate domain
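
The URL patterns above can be sketched as a small parser. This assumes only the two hosted patterns named in the list (`pkgs.dev.azure.com/{org}/...` and `{company}.jfrog.io`) plus `.npmrc`-style scoped registry lines; `org_from_infra_url` and `scoped_registries` are hypothetical helper names.

```python
from typing import Optional
from urllib.parse import urlparse


def org_from_infra_url(url: str) -> Optional[str]:
    """Extract an org name embedded in a hosted-infrastructure URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    segments = [s for s in parsed.path.split("/") if s]
    if host == "pkgs.dev.azure.com" and segments:
        return segments[0]
    if host.endswith(".jfrog.io"):
        # Tenant is the label directly left of jfrog.io.
        return host[: -len(".jfrog.io")].split(".")[-1]
    return None


def scoped_registries(npmrc_text: str) -> dict:
    """Map npm scopes to registry hostnames from .npmrc-style lines."""
    mapping = {}
    for line in npmrc_text.splitlines():
        line = line.strip()
        if line.startswith("@") and ":registry=" in line:
            scope, registry_url = line.split(":registry=", 1)
            mapping[scope] = urlparse(registry_url).hostname
    return mapping
```

A scope that maps to a host on a private domain is the stronger signal; a scope alone only names the package publisher.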


Step 4: Analyze Domain Signals

Extract and analyze all domains found in the data.

High-value domain sources:

  • Email addresses → Corporate email = strong signal
  • Git commit metadata → Author/committer email domains
  • Proxy bypass lists (no_proxy) → Often contain internal domains
  • URL configurations → API endpoints, webhooks, service URLs

Domain extraction from:

  • Full URLs → Parse out hostname
  • Email addresses → Extract domain portion
  • Hostnames → Extract DNS suffix patterns
  • Configuration values → Look for embedded URLs/domains

Filtering - skip these:

Category           Examples
Localhost          localhost, 127.0.0.1, .local
Internal suffixes  .internal, .corp, .lan, .intranet
Placeholders       localdomain.com, example.com
Cloud providers    amazonaws.com, azure.com, googleapis.com
Public platforms   github.com, gitlab.com, npmjs.org
Personal email     gmail.com, yahoo.com, outlook.com

Confidence: MEDIUM for corporate domains; needs corroboration
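
A minimal sketch of the extraction and filtering rules, using only the example entries from the filter table. The skip lists here are deliberately short and would need extending in practice; `extract_domain` and `is_attributable` are hypothetical names.

```python
from urllib.parse import urlparse

SKIP_EXACT = {"localhost", "127.0.0.1"}
SKIP_SUFFIXES = (".local", ".internal", ".corp", ".lan", ".intranet")
# Matched as the domain itself or as any parent of it.
GENERIC_DOMAINS = {
    "localdomain.com", "example.com",                # placeholders
    "amazonaws.com", "azure.com", "googleapis.com",  # cloud providers
    "github.com", "gitlab.com", "npmjs.org",         # public platforms
    "gmail.com", "yahoo.com", "outlook.com",         # personal email
}


def extract_domain(value: str) -> str:
    """Pull a hostname out of a URL, an email address, or a bare hostname."""
    value = value.strip()
    if "://" in value:
        return (urlparse(value).hostname or "").lower()
    if "@" in value:
        return value.rsplit("@", 1)[1].lower()
    return value.lower()


def is_attributable(domain: str) -> bool:
    """False for domains that cannot point at a specific organization."""
    d = domain.strip().lower().rstrip(".")
    if not d or d in SKIP_EXACT or d.endswith(SKIP_SUFFIXES):
        return False
    return not any(d == g or d.endswith("." + g) for g in GENERIC_DOMAINS)
```

For example, a `no_proxy` value can be split on commas and each entry passed through `extract_domain` and then `is_attributable`.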


Step 5: Analyze High-Entropy Strings and Credentials

Tokens and secrets can be validated or enriched to reveal organizational context.

API tokens - validate and enrich:

  • GitHub PATs → Call /user to get profile with company field
  • Slack tokens → Call auth.test to get workspace name
  • npm tokens → Profile lookup reveals org memberships
  • Cloud credentials → Often return account metadata on validation

Webhook URLs - extract embedded IDs:

  • Slack webhooks → Team ID in path (/services/{TEAM_ID}/...)
  • Other webhooks → May contain account/workspace identifiers

JWT tokens - decode and examine:

  • Issuer domain (iss claim) → May indicate organization
  • Subject/audience claims → May contain org identifiers
  • Custom claims → Platform-specific org information

Connection strings - parse components:

  • Database hostnames → May be corporate infrastructure
  • Account names → May embed organization
  • Server URLs → Domain analysis

Confidence: Varies - validated tokens = HIGH; unvalidated = LOW
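
The offline parts of this step (no API calls) can be sketched as follows. `decode_jwt_claims` reads the payload without verifying the signature, so its output is unvalidated and belongs at LOW confidence. The webhook pattern assumes the `/services/{TEAM_ID}/...` path layout described above and that team IDs begin with `T`; all function names are hypothetical.

```python
import base64
import json
import re
from typing import Optional
from urllib.parse import urlparse


def decode_jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


SLACK_WEBHOOK_RE = re.compile(r"https://hooks\.slack\.com/services/(T[A-Z0-9]+)/")


def slack_team_id(webhook_url: str) -> Optional[str]:
    """Return the team ID embedded in a Slack webhook URL, if present."""
    match = SLACK_WEBHOOK_RE.search(webhook_url)
    return match.group(1) if match else None


def connection_host(connection_string: str) -> Optional[str]:
    """Return the hostname from a URL-style connection string."""
    return urlparse(connection_string).hostname
```

The `iss` claim and any hostname returned here can then go through the same domain filtering as other domains.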


Step 6: Evaluate Individual Signals

When signals are tied to individuals (not organizations), extra validation is needed.

Individual signal types:

  • Personal tokens and API keys
  • Email addresses (could be personal or corporate)
  • Usernames and profile fields
  • Personal tool configurations

The problem:

  • Credentials may not belong to the victim
  • Profile fields are self-reported and may be stale
  • Individual context may not reflect organizational affiliation

Validation approaches:

  • Does individual identity match other context in the data?
  • Do multiple individual signals point to the same organization?
  • Is there corroboration from organizational signals?

Confidence: LOW unless corroborated by organizational signals
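
One way to apply the validation questions is to tally which organization each individual signal suggests and mark whether any organizational signal agrees. This is a sketch under the assumption that each signal has already been reduced to a candidate org name; `review_individual_signals` is a hypothetical helper.

```python
from collections import Counter


def review_individual_signals(individual_orgs, organizational_orgs) -> dict:
    """Summarize individual signals per candidate org.

    individual_orgs: org names suggested by individual signals.
    organizational_orgs: org names suggested by organizational signals.
    """
    org_level = {o.strip().lower() for o in organizational_orgs}
    counts = Counter(o.strip().lower() for o in individual_orgs)
    return {
        org: {"individual_signals": n, "corroborated": org in org_level}
        for org, n in counts.items()
    }
```

An entry with `corroborated` set to `False` stays at LOW confidence no matter how many individual signals agree.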


Step 7: Cross-Reference and Corroborate

Combine signals to build confidence.

Cross-referencing:

  • Does the email domain match the platform org?
  • Does the private registry domain match other infrastructure?
  • Do multiple independent signals point to the same company?

Alias resolution:

  • Map cryptic org names to company names (e.g., acme-dev → Acme Corp)
  • Look for company prefixes/suffixes in org names
  • Cross-reference org names with email domains

Confidence boosting:

Condition                         Confidence impact
Multiple corroborating signals    HIGH
Enterprise/Fortune 500 match      HIGH
API-verified organization         HIGH
Single organizational signal      MEDIUM
Single weak/individual signal     LOW
Contradictory signals             Manual review
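
The table can be sketched as a small scoring function. Several choices here are my assumptions rather than rules from the skill: "contradictory" is taken to mean more than one distinct candidate org, "multiple corroborating signals" requires at least one organizational signal among them, and the Enterprise/Fortune 500 row is omitted because it needs an external company list. `Signal` and `assess` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    org: str                   # candidate organization this signal points to
    organizational: bool       # False for individual signals
    api_verified: bool = False


def assess(signals: list) -> tuple:
    """Return (org, confidence) following the confidence table."""
    if not signals:
        return None, "NONE"
    by_org = defaultdict(list)
    for s in signals:
        by_org[s.org.strip().lower()].append(s)
    if len(by_org) > 1:
        return None, "MANUAL_REVIEW"  # contradictory signals
    org, group = next(iter(by_org.items()))
    org_level = [s for s in group if s.organizational]
    if any(s.api_verified for s in org_level):
        return org, "HIGH"            # API-verified organization
    if org_level and len(group) >= 2:
        return org, "HIGH"            # multiple corroborating signals
    if org_level:
        return org, "MEDIUM"          # single organizational signal
    return org, "LOW"                 # only individual signals
```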

Step 8: Resolve Ambiguity

Contradictory signals:

  • Prioritize organizational signals over individual signals
  • Prioritize infrastructure domains over email domains
  • Consider client/vendor relationships (one org using another's tools)
  • Flag for manual review if unresolved

Personal accounts mistaken for organizations:

  • If API lookup reveals personal account → Do not attribute as company
  • Check user's company profile field (LOW confidence)
  • Look for other organizational signals

No clear signals:

  • Document what signals exist
  • Note confidence as LOW or NONE
  • Identify enrichment opportunities
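
The prioritization rules for contradictory signals can be sketched as a tiered comparison. The tier numbers are an assumption that encodes the two stated rules (organizational over individual, infrastructure domains over email domains); the source labels and the `resolve` helper are hypothetical. Client/vendor relationships are not modeled and still need a human look.

```python
# Lower tier wins. Platform orgs and infrastructure domains are both
# organizational signals, so they share the top tier.
TIER = {
    "platform_org": 0,
    "infrastructure_domain": 0,
    "email_domain": 1,
    "individual": 2,
}


def resolve(candidates: list) -> tuple:
    """Pick an org from (org, source) pairs, or flag for manual review."""
    if not candidates:
        return None, "NONE"
    best = min(TIER[source] for _, source in candidates)
    top = {org.strip().lower() for org, source in candidates if TIER[source] == best}
    if len(top) == 1:
        return top.pop(), "RESOLVED"
    # Two different orgs at the same priority: do not guess.
    return None, "MANUAL_REVIEW"
```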

Signal Reliability Reference

HIGH Confidence

  • Verified platform organization (API confirmed)
  • Self-hosted infrastructure on corporate domain
  • Enterprise SSO/SAML configurations
  • Multiple corroborating organizational signals
  • Validated credentials returning org metadata

MEDIUM Confidence

  • Corporate email domains
  • Private registry domains (without full verification)
  • Unverified organization names
  • Workspace/team names from collaboration tools

LOW Confidence

  • Hostname patterns without domain context
  • Cloud resource naming conventions
  • JWT token issuer domains (unvalidated)
  • Individual profile fields without corroboration
  • Single uncorroborated signal

Signals to AVOID

  • Collection/exfiltration paths - Attacker's infrastructure, not victim
  • Package author metadata - Package creator, not consumer
  • Uncorroborated individual signals - May not represent the victim

Enrichment Tactics

Credential Validation

Validate tokens to extract organizational metadata:

  • GitHub → /user endpoint for company field
  • Slack → auth.test for workspace info
  • npm → Profile lookup for org memberships
  • Cloud platforms → Metadata APIs for account info

ID Resolution

Resolve opaque identifiers to names:

  • Slack team IDs → API or browser lookup
  • Azure tenant IDs → Organization name resolution
  • AWS account IDs → (limited without access)
  • Platform org IDs → API lookups

Org Name Mapping

Map cryptic names to companies:

  • Look for company prefixes/suffixes
  • Cross-reference with domain signals
  • Build alias dictionary from confirmed mappings
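
A minimal sketch of this mapping, assuming a hand-maintained alias dictionary keyed by normalized name. The `GENERIC_TOKENS` list is illustrative, and the domain comparison uses the second-to-last DNS label, which is wrong for multi-part suffixes such as `.co.uk`. `normalize_org` and `resolve_alias` are hypothetical names.

```python
import re
from typing import Optional

# Tokens that commonly decorate an org name without identifying the company.
GENERIC_TOKENS = {"dev", "eng", "engineering", "inc", "corp", "labs",
                  "team", "hq", "oss"}


def normalize_org(name: str) -> str:
    """Lowercase, split on punctuation, and drop generic prefix/suffix tokens."""
    tokens = [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]
    core = [t for t in tokens if t not in GENERIC_TOKENS] or tokens
    return "".join(core)


def resolve_alias(name: str, aliases: dict, domains=()) -> Optional[tuple]:
    """Return (match, source) from confirmed aliases or observed domains."""
    key = normalize_org(name)
    if key in aliases:
        return aliases[key], "alias"
    for domain in domains:
        labels = domain.lower().split(".")
        if len(labels) >= 2 and labels[-2] == key:
            return domain, "domain"
    return None
```

A domain match is only a lead; add the pair to the alias dictionary once it is confirmed by other signals.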

Domain Intelligence

Enrich domains with context:

  • WHOIS/DNS lookups for domain ownership
  • Certificate transparency for related domains
  • Known corporate domain databases

Quick Checklist

When analyzing leaked environment data:

  • Extracted all domains (URLs, emails, hostnames)
  • Extracted all platform org/user identifiers
  • Extracted all tenant/account/workspace IDs
  • Noted high-entropy strings (tokens, keys, webhooks)
  • Identified unique patterns (custom prefixes, naming conventions)
  • Filtered out non-attributable domains (localhost, internal suffixes, placeholders, cloud providers, public platforms, personal email)
  • Validated credentials where possible
  • Cross-referenced signals for corroboration
  • Resolved ambiguous org names
  • Assigned confidence level to attribution
Install via CLI
npx skills add https://github.com/ramimac/unprompted --skill attribute