name: data-analysis
description: Load, analyze, and visualize datasets using pandas with AG Grid display
Data Analysis Skill
Description
Load data files (CSV, XLSX, JSON, Parquet) into the AG Grid viewer, run pandas queries, save results, and generate visualizations.
Tools Used
Primary (Data Grid workflow)
- data_list: List available data files in /workspace/data/
- data_load: Load a data file into AG Grid (returns a markdown preview for context)
- data_query: Execute pandas operations on loaded data (filter, aggregate, transform)
- data_save: Save the current DataFrame to a file
Secondary (Jupyter workflow for visualization)
- jupyter_execute: Execute Python code in the Jupyter kernel (for plots and complex analysis)
- update_notebook: Add cells to the Jupyter notebook
- update_gallery: Display generated plots in the gallery
Workflow
Recommended: Data Grid Workflow
For tabular data exploration, use the data tools which provide a spreadsheet-like experience:
- List files: data_list to see what's in /workspace/data/
- Load data: data_load to read a file and display it in AG Grid
  - You'll receive a markdown preview to understand columns and types
- Query/Filter: data_query to run pandas operations
  - The df variable contains the loaded data
  - Set result = ... to define the output
- Save results: data_save to export to CSV/XLSX
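The df/result convention in the Query/Filter step can be pictured with a small sketch. It assumes, without confirmation from the tool itself, that data_query runs the snippet with the loaded DataFrame bound to df and then reads back whatever was assigned to result. The helper run_query and the sample columns are hypothetical and exist only for illustration.

```python
import pandas as pd


def run_query(df: pd.DataFrame, code: str) -> pd.DataFrame:
    """Hypothetical stand-in for data_query: run `code` with `df` in scope, return `result`."""
    # Work on a copy so the snippet cannot mutate the caller's DataFrame.
    namespace = {"df": df.copy(), "pd": pd}
    exec(code, namespace)
    if "result" not in namespace:
        raise ValueError("the snippet must assign to `result`")
    return namespace["result"]


# Illustrative data; the column names are made up for this example.
sample = pd.DataFrame({"name": ["a", "b", "c"], "score": [95, 72, 91]})
filtered = run_query(sample, "result = df[df['score'] > 90]")
print(len(filtered))
```

A snippet that never assigns to result has nothing to hand back, which is why the sketch raises an error in that case.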
Alternative: Jupyter Workflow
For visualization, statistical analysis, or ML, use Jupyter tools:
- Load data with jupyter_execute running pandas code
- Create visualizations with matplotlib/seaborn
- Display plots with update_gallery
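The kind of code passed to jupyter_execute for these steps might look like the sketch below. The data is built in memory to keep the example self-contained, and the output path is a hypothetical temporary location; how update_gallery expects to receive the file is not specified here.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, assumed here so the figure saves without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative data; in practice this would be read from a file in /workspace/data/.
df = pd.DataFrame({"score": [55, 61, 70, 72, 88, 90, 91, 95]})

fig, ax = plt.subplots(figsize=(6, 4))
ax.hist(df["score"], bins=5)
ax.set_xlabel("score")
ax.set_ylabel("count")
ax.set_title("Score distribution")

# Hypothetical output location; substitute whatever path the gallery expects.
plot_path = os.path.join(tempfile.mkdtemp(), "score_distribution.png")
fig.savefig(plot_path, dpi=150, bbox_inches="tight")
plt.close(fig)  # release the figure so repeated runs do not accumulate open figures
```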
Usage Patterns
Load and Explore Data
When user says: "Analyze this dataset" or "Show me the data"
- data_list to find available files
- data_load with the target file
- Review the markdown preview to understand structure
- data_query with result = df.describe() for statistics
- Offer filtering, sorting, or visualization
Filter and Transform
When user says: "Show only rows where X > Y" or "Group by category"
- data_query with pandas filter/groupby code
- Grid updates automatically with filtered results
- Inform user of result count and preview
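Reporting the result count and a preview can be done with plain pandas. The sketch below uses made-up columns and shows one way to compute both; the exact wording of the report is up to the caller.

```python
import pandas as pd

# Illustrative data; the column names are made up for this example.
df = pd.DataFrame({"category": ["x", "y", "x", "z"], "score": [95, 72, 91, 60]})
result = df[df["score"] > 90]

row_count = len(result)                           # number of rows that passed the filter
preview = result.head(5).to_string(index=False)   # first rows as plain text
print(f"{row_count} of {len(df)} rows match")
print(preview)
```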
Save Processed Data
When user says: "Export this" or "Save as Excel"
- data_save with desired filename and format
- Report file location and size
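The parameters of data_save are not documented here, so the sketch below only shows the plain-pandas equivalent of writing a CSV and gathering the location and size to report. The temporary output path is hypothetical. For Excel output the pandas counterpart is DataFrame.to_excel, which needs an engine such as openpyxl installed.

```python
import os
import tempfile

import pandas as pd

# Illustrative result; the column names are made up for this example.
result = pd.DataFrame({"category": ["x", "y"], "total": [186, 72]})

# Hypothetical output location, used only to keep the example self-contained.
out_path = os.path.join(tempfile.mkdtemp(), "filtered.csv")
result.to_csv(out_path, index=False)

size_bytes = os.path.getsize(out_path)
print(f"Saved {len(result)} rows to {out_path} ({size_bytes} bytes)")
```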
Visualize Data
When user says: "Create a chart" or "Plot the distribution"
- Use jupyter_execute with matplotlib/seaborn code
- Save plot and display via update_gallery
Code Snippets for data_query
Filter rows
result = df[df['score'] > 90]
Group and aggregate
result = df.groupby('category').agg({'value': ['mean', 'sum', 'count']}).reset_index()
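The dict-of-lists form above yields two-level column labels such as ('value', 'mean'). Whether the grid renders those cleanly is not stated here; if flat column names are preferred, pandas' named aggregation is one alternative. The sample data and output names below are made up for illustration.

```python
import pandas as pd

# Illustrative data; the values are made up for this example.
df = pd.DataFrame({"category": ["x", "y", "x"], "value": [10, 4, 6]})

# Named aggregation: each keyword becomes a flat output column name.
result = (
    df.groupby("category")
    .agg(
        value_mean=("value", "mean"),
        value_sum=("value", "sum"),
        value_count=("value", "count"),
    )
    .reset_index()
)
print(list(result.columns))
```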
Sort by column
result = df.sort_values('date', ascending=False)
Add computed column
df['ratio'] = df['value_a'] / df['value_b']
result = df
Summary statistics
result = df.describe()
Handle missing values
result = df.dropna(subset=['important_column'])
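Before dropping rows it can help to see how many values are missing and how many rows the drop removes. A minimal sketch with made-up data:

```python
import pandas as pd

# Illustrative data; the second column name is made up for this example.
df = pd.DataFrame({"important_column": [1.0, None, 3.0], "other": ["a", "b", None]})

missing_per_column = df.isna().sum()              # count of missing values in each column
result = df.dropna(subset=["important_column"])   # drop only rows missing the key column
dropped = len(df) - len(result)
print(missing_per_column.to_dict(), dropped)
```

Rows that are missing values only in other columns are kept, since subset restricts the check to the listed column.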
Best Practices
- Start with data_list: Always check what files are available first
- Use data_load first: Load data to get markdown preview before querying
- Keep queries simple: One operation per data_query call for clarity
- Save intermediate results: Use data_save for important filtered datasets
- Switch to Jupyter for plots: AG Grid is for tabular data; use Jupyter for visualizations