competitive-price-analysis-pipeline

star 9

Performs competitive price analysis by comparing internal pricing data with competitor information from different sources (PDFs, CSVs, spreadsheets), calculates price differences, generates insights, and stores results in BigQuery for decision-making.

zjunlp By zjunlp schedule Updated 2/23/2026

name: competitive-price-analysis-pipeline description: Performs competitive price analysis by comparing internal pricing data with competitor information from different sources (PDFs, CSVs, spreadsheets), calculates price differences, generates insights, and stores results in BigQuery for decision-making.

Instructions

1. Initial Setup & Discovery

  • List the workspace directory to identify available files.
  • Read the requirements document (e.g., requirement.md) to understand the target BigQuery dataset, table, and required schema.
  • Read the internal pricing file (CSV or Excel) to understand the structure and product list.

2. Extract Competitor Pricing Data

  • Inspect the competitor PDF to understand its structure (page count, layout).
  • Read all pages of the competitor PDF to extract text content containing product names and prices.

3. Data Processing & Matching

  • Parse the internal pricing data. The key columns are Product Name and Our Price.
  • Parse the competitor PDF text. Identify product names and their corresponding prices. Use fuzzy matching or keyword mapping (e.g., "SmartWidget Pro" → "SmartWidget Professional Edition") based on the trajectory example.
  • Create a matched product list. For each internal product, find the corresponding competitor product and price. If a direct match isn't found, log it for manual review.
  • Calculate the price difference for each matched product: Price Difference = Our Price - Competitor Price.

4. Prepare Output & Load to BigQuery

  • Create a CSV file (price_comparison.csv) with the exact columns specified in the requirements:
    • Product Name (String)
    • Our Price (Float)
    • Competitor Price (Float)
    • Price Difference (Float)
  • Verify the BigQuery dataset exists. If it doesn't, you may need to create it (though the trajectory shows it pre-existing).
  • Load the CSV data into the specified BigQuery table (dataset_id.table_id). Use WRITE_TRUNCATE mode to replace any existing data, as per the trajectory.
  • Run a validation query to confirm the data was loaded correctly and matches the expected row count and schema.

5. Generate Summary & Finalize

  • Provide a concise summary of the analysis, including:
    • Number of products processed.
    • Count of products priced higher/lower than the competitor.
    • The product with the largest price advantage (most negative difference).
    • The product with the largest price disadvantage (most positive difference).
  • Claim the task as done.

Key Considerations

  • Product Matching Logic: This is a high-freedom, interpretive step. Use the provided scripts/match_products.py as a starting point, but be prepared to adjust matching rules (keywords, categories) based on the specific data.
  • BigQuery Permissions: Ensure the agent has the necessary permissions to list datasets and load data to the specified table.
  • Data Types: Ensure numeric prices are parsed as floats. Handle currency symbols and commas appropriately.
  • Error Handling: If the PDF extraction yields poor results, consider using the pdf-tools-search_pdf function to find specific product names or prices.
Install via CLI
npx skills add https://github.com/zjunlp/Skills --skill competitive-price-analysis-pipeline
Repository Details
star Stars 9
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator