name: numpy-numerics description: "Numerical array operations for velocity fields, pump curves, and matrix calculations" category: packages domain: general complexity: basic dependencies: - numpy
NumPy Numerics Skill
Overview
NumPy is the fundamental package for numerical computing in Python. It provides:
- Powerful N-dimensional array objects
- Broadcasting functions for element-wise operations
- Linear algebra, Fourier transform, and random number capabilities
- Tools for integrating C/C++ and Fortran code
- High-performance operations on large arrays
For engineering applications, NumPy is essential for:
- Processing velocity field data from CFD simulations
- Handling pump curve data and interpolation
- Solving linear systems (pipe networks, structural analysis)
- Statistical analysis of experimental data
- Numerical integration and differentiation
- Matrix operations for finite element methods
Installation
pip install numpy
For scientific computing with additional tools:
pip install numpy scipy matplotlib
Core Concepts
Arrays vs Lists
NumPy arrays are more efficient than Python lists for numerical operations:
import numpy as np
# Python list - slow for numerical operations
python_list = [1, 2, 3, 4, 5]
result = [x * 2 for x in python_list]
# NumPy array - fast vectorized operations
numpy_array = np.array([1, 2, 3, 4, 5])
result = numpy_array * 2 # 10-100x faster
Key Differences:
- Arrays have fixed size and homogeneous type
- Arrays support vectorized operations
- Arrays use contiguous memory (cache-friendly)
- Arrays integrate with C/Fortran libraries
Array Creation
import numpy as np
# From Python list
a = np.array([1, 2, 3, 4, 5])
# Create ranges
x = np.arange(0, 10, 0.5) # Start, stop, step
x = np.linspace(0, 10, 100) # Start, stop, num_points
# Initialize with zeros/ones
data = np.zeros((10, 10)) # 10x10 array of zeros
grid = np.ones((5, 5)) # 5x5 array of ones
# Identity matrix
I = np.eye(3) # 3x3 identity matrix
# Create meshgrid for 2D functions
x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 50)
X, Y = np.meshgrid(x, y)
Array Indexing and Slicing
import numpy as np
# 1D array indexing
a = np.array([10, 20, 30, 40, 50])
print(a[0]) # 10 (first element)
print(a[-1]) # 50 (last element)
print(a[1:4]) # [20, 30, 40] (slice)
# 2D array indexing
data = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
print(data[0, 0]) # 1 (row 0, col 0)
print(data[:, 0]) # [1, 4, 7] (first column)
print(data[0, :]) # [1, 2, 3] (first row)
print(data[1:, 1:]) # [[5, 6], [8, 9]] (subarray)
# Boolean indexing
a = np.array([1, 2, 3, 4, 5])
mask = a > 2
print(a[mask]) # [3, 4, 5] (elements > 2)
Engineering Applications
1. Velocity Field Calculations
Computing velocity fields from flow data:
import numpy as np
# Create a 2D velocity field mesh (e.g., from CFD output)
x = np.linspace(0, 1, 50) # x-coordinates (m)
y = np.linspace(0, 1, 50) # y-coordinates (m)
X, Y = np.meshgrid(x, y)
# Velocity components (example: potential flow around cylinder)
r = np.sqrt((X - 0.5)**2 + (Y - 0.5)**2)
theta = np.arctan2(Y - 0.5, X - 0.5)
R = 0.1 # Cylinder radius
# Velocity field
u = 1.0 * (1 - R**2 / r**2) * np.cos(theta)
v = -1.0 * (1 - R**2 / r**2) * np.sin(theta)
# Compute velocity magnitude
velocity_magnitude = np.sqrt(u**2 + v**2)
# Find maximum velocity
max_velocity = np.max(velocity_magnitude)
max_location = np.unravel_index(
np.argmax(velocity_magnitude),
velocity_magnitude.shape
)
print(f"Maximum velocity: {max_velocity:.2f} m/s")
print(f"At location: ({X[max_location]:.2f}, {Y[max_location]:.2f})")
# Calculate flow statistics
mean_velocity = np.mean(velocity_magnitude)
std_velocity = np.std(velocity_magnitude)
print(f"Mean velocity: {mean_velocity:.2f} ± {std_velocity:.2f} m/s")
2. Pump Curve Data Handling
Working with pump performance data:
import numpy as np
# Experimental pump data (flow rate, head, efficiency)
Q_data = np.array([0, 20, 40, 60, 80, 100, 120]) # m³/h
H_data = np.array([85, 84, 82, 78, 72, 64, 52]) # m
eta_data = np.array([0, 45, 68, 78, 80, 75, 65]) # %
# Fit pump curve to polynomial: H = a + b*Q + c*Q²
coeffs = np.polyfit(Q_data, H_data, 2) # 2nd degree polynomial
print(f"Pump curve: H = {coeffs[0]:.6f}*Q² + {coeffs[1]:.4f}*Q + {coeffs[2]:.2f}")
# Create smooth curve for plotting
Q_smooth = np.linspace(0, 120, 100)
H_smooth = np.polyval(coeffs, Q_smooth)
# Find best efficiency point (BEP)
bep_index = np.argmax(eta_data)
Q_bep = Q_data[bep_index]
H_bep = H_data[bep_index]
eta_bep = eta_data[bep_index]
print(f"\nBest Efficiency Point:")
print(f" Flow rate: {Q_bep:.1f} m³/h")
print(f" Head: {H_bep:.1f} m")
print(f" Efficiency: {eta_bep:.1f}%")
# Interpolate head at specific flow rate
Q_target = 65 # m³/h
H_target = np.interp(Q_target, Q_data, H_data)
print(f"\nAt Q = {Q_target} m³/h: H = {H_target:.1f} m")
3. Matrix Operations for Linear Systems
Solving pipe network equations:
import numpy as np
# Pipe network with 4 nodes, 5 pipes
# Conservation of mass at each node: A * Q = b
# A: connectivity matrix
# Q: flow rates (unknown)
# b: external flows (boundary conditions)
# Example: Simple pipe network
# Node 1: Q1 + Q2 = 10 (inlet 10 L/s)
# Node 2: Q3 - Q1 = 0
# Node 3: Q4 - Q2 - Q3 = 0
# Node 4: -Q4 = -10 (outlet 10 L/s)
A = np.array([
[1, 1, 0, 0], # Node 1
[-1, 0, 1, 0], # Node 2
[0, -1, -1, 1], # Node 3
[0, 0, 0, -1] # Node 4
])
b = np.array([10, 0, 0, -10]) # L/s
# Solve linear system
Q = np.linalg.solve(A, b)
print("Flow rates in pipes:")
for i, q in enumerate(Q, 1):
print(f" Pipe {i}: {q:.2f} L/s")
# Verify solution
residual = np.dot(A, Q) - b
print(f"\nResidual (should be ~0): {np.max(np.abs(residual)):.2e}")
# Condition number (stability indicator)
cond = np.linalg.cond(A)
print(f"Condition number: {cond:.2f}")
if cond > 1000:
print("Warning: Matrix is ill-conditioned")
4. Interpolation of Experimental Data
Interpolating pressure drop data:
import numpy as np
# Experimental data: flow rate vs pressure drop
Q_exp = np.array([0, 10, 20, 30, 40, 50, 60]) # m³/h
dP_exp = np.array([0, 5, 18, 38, 65, 98, 138]) # kPa
# Linear interpolation (fast, simple)
Q_query = 25 # m³/h
dP_linear = np.interp(Q_query, Q_exp, dP_exp)
print(f"Linear interpolation at Q={Q_query}: dP={dP_linear:.1f} kPa")
# Polynomial interpolation for smooth curve
# Fit to quadratic: dP = k*Q²
coeffs = np.polyfit(Q_exp, dP_exp, 2)
dP_poly = np.polyval(coeffs, Q_query)
print(f"Polynomial fit at Q={Q_query}: dP={dP_poly:.1f} kPa")
# For more complex interpolation, use scipy.interpolate
# This is shown as reference (requires scipy)
from scipy import interpolate
f_spline = interpolate.interp1d(Q_exp, dP_exp, kind='cubic')
Q_dense = np.linspace(0, 60, 100)
dP_spline = f_spline(Q_dense)
5. Numerical Integration
Calculating flow rate from velocity profile:
import numpy as np
# Velocity profile in circular pipe (laminar flow)
# u(r) = u_max * (1 - (r/R)²)
R = 0.05 # m, pipe radius
u_max = 2.0 # m/s, centerline velocity
# Create radial points
r = np.linspace(0, R, 100)
# Velocity distribution
u = u_max * (1 - (r/R)**2)
# Calculate flow rate using trapezoidal rule
# Q = ∫∫ u dA = ∫₀ᴿ u(r) * 2πr dr
integrand = u * 2 * np.pi * r
Q_trapz = np.trapz(integrand, r)
# Analytical solution: Q = (π*R²*u_max)/2
Q_analytical = np.pi * R**2 * u_max / 2
print(f"Flow rate (numerical): {Q_trapz:.6f} m³/s")
print(f"Flow rate (analytical): {Q_analytical:.6f} m³/s")
print(f"Error: {abs(Q_trapz - Q_analytical)/Q_analytical * 100:.2f}%")
# Simpson's rule (more accurate for smooth functions)
# Requires scipy
from scipy import integrate
Q_simps = integrate.simpson(integrand, r)
print(f"Flow rate (Simpson): {Q_simps:.6f} m³/s")
6. Numerical Differentiation
Computing pressure gradient from pressure field:
import numpy as np
# Pressure distribution along pipe
x = np.linspace(0, 10, 100) # m, position
P = 500000 - 2000 * x # Pa, linear pressure drop
# Numerical gradient (central difference)
dP_dx = np.gradient(P, x)
print(f"Pressure gradient: {dP_dx[0]:.1f} Pa/m")
print(f"Expected: -2000 Pa/m")
# For 2D pressure field
x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 50)
X, Y = np.meshgrid(x, y)
# Example pressure field
P_field = 100000 - 5000 * X - 3000 * Y
# Compute gradients
dP_dx, dP_dy = np.gradient(P_field, x, y)
print(f"\nPressure gradient field:")
print(f" ∂P/∂x at (0.5, 0.5): {dP_dx[25, 25]:.1f} Pa/m")
print(f" ∂P/∂y at (0.5, 0.5): {dP_dy[25, 25]:.1f} Pa/m")
7. Statistical Analysis of Test Data
Analyzing pump test measurements:
import numpy as np
# Multiple test runs for pump efficiency
test_runs = np.array([
[78.5, 79.2, 78.8, 79.0, 78.6], # Run 1-5
[77.8, 78.5, 78.0, 78.3, 77.9], # Repeat tests
[79.1, 78.9, 79.3, 78.7, 79.0]
])
# Statistical measures
mean_efficiency = np.mean(test_runs)
std_efficiency = np.std(test_runs, ddof=1) # Sample std dev
sem = std_efficiency / np.sqrt(test_runs.size) # Standard error
print(f"Mean efficiency: {mean_efficiency:.2f}%")
print(f"Standard deviation: {std_efficiency:.2f}%")
print(f"Standard error: {sem:.2f}%")
print(f"95% confidence interval: {mean_efficiency:.2f} ± {1.96*sem:.2f}%")
# Detect outliers (beyond 3 standard deviations)
z_scores = np.abs((test_runs - mean_efficiency) / std_efficiency)
outliers = test_runs[z_scores > 3]
if len(outliers) > 0:
print(f"\nOutliers detected: {outliers}")
else:
print("\nNo outliers detected")
# Range and percentiles
print(f"\nData range: {np.min(test_runs):.2f}% to {np.max(test_runs):.2f}%")
print(f"Median: {np.median(test_runs):.2f}%")
print(f"25th percentile: {np.percentile(test_runs, 25):.2f}%")
print(f"75th percentile: {np.percentile(test_runs, 75):.2f}%")
Performance Tips for Large Arrays
1. Use Vectorization Instead of Loops
import numpy as np
import time
n = 1000000
# Slow: Python loop
start = time.time()
result = []
for i in range(n):
result.append(i ** 2)
loop_time = time.time() - start
# Fast: NumPy vectorization
start = time.time()
result = np.arange(n) ** 2
vectorized_time = time.time() - start
print(f"Loop: {loop_time:.3f} s")
print(f"Vectorized: {vectorized_time:.3f} s")
print(f"Speedup: {loop_time/vectorized_time:.0f}x")
2. Preallocate Arrays
import numpy as np
# Bad: Growing array in loop
result = np.array([])
for i in range(1000):
result = np.append(result, i) # Creates new array each time
# Good: Preallocate
result = np.zeros(1000)
for i in range(1000):
result[i] = i # In-place assignment
# Best: Use vectorization
result = np.arange(1000)
3. Use In-Place Operations
import numpy as np
a = np.random.rand(1000000)
# Creates new array
b = a * 2
b = a + 5
# In-place operations (faster, less memory)
a *= 2
a += 5
4. Choose Appropriate Data Types
import numpy as np
# Default: float64 (8 bytes per element)
a = np.array([1.0, 2.0, 3.0])
print(f"Size: {a.nbytes} bytes")
# Use float32 if precision allows (4 bytes per element)
a = np.array([1.0, 2.0, 3.0], dtype=np.float32)
print(f"Size: {a.nbytes} bytes (50% reduction)")
# For integers, use smallest appropriate type
counts = np.array([1, 2, 3], dtype=np.uint8) # 0-255
5. Use Memory Views for Large Arrays
import numpy as np
# Original array
data = np.arange(1000000)
# View (no copy, fast)
view = data[::2] # Every other element
view[0] = 999 # Modifies original array
# Copy (slower, independent)
copy = data[::2].copy()
copy[0] = 888 # Does not modify original
6. Leverage Broadcasting
import numpy as np
# Broadcasting: automatic array shape matching
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
b = np.array([10, 20, 30])
# Adds b to each row of a (no loop needed)
result = a + b
# [[11, 22, 33],
# [14, 25, 36],
# [17, 28, 39]]
Common Pitfalls
1. Array vs Scalar Division
import numpy as np
# Integer division
a = np.array([1, 2, 3])
result = a / 2 # [0.5, 1.0, 1.5] (float result)
# Floor division
result = a // 2 # [0, 1, 1] (integer result)
2. Copying vs Viewing
import numpy as np
a = np.array([1, 2, 3])
# View (references same data)
b = a
b[0] = 999 # Modifies a!
# Copy (independent data)
c = a.copy()
c[0] = 888 # Does not modify a
3. Integer Overflow
import numpy as np
# int8 can only hold -128 to 127
a = np.array([100, 50], dtype=np.int8)
result = a + a # [-56, 100] (overflow!)
# Use appropriate dtype
a = np.array([100, 50], dtype=np.int32)
result = a + a # [200, 100] (correct)
Best Practices
- Vectorize operations - Avoid Python loops when possible
- Use appropriate dtypes - Balance precision and memory
- Preallocate arrays - Don't grow arrays in loops
- Leverage broadcasting - Implicit dimension matching
- Check array shapes - Use
.shapeto debug dimension errors - Use built-in functions - NumPy functions are optimized
- Profile code - Identify bottlenecks before optimizing
- Document units - Add comments for physical quantities
Integration with Engineering Workflows
NumPy integrates seamlessly with:
- Pandas: DataFrames built on NumPy arrays
- Matplotlib: Plotting arrays directly
- SciPy: Advanced scientific computing (interpolation, optimization)
- Fluids: Engineering calculations on array data
- CoolProp: Thermodynamic property tables
Example workflow:
import numpy as np
import matplotlib.pyplot as plt
from fluids import Reynolds, friction_factor
# Generate velocity range
V = np.linspace(0.1, 5.0, 50) # m/s
# Calculate Reynolds numbers
D = 0.1 # m
rho = 1000 # kg/m³
mu = 0.001 # Pa·s
Re = np.array([Reynolds(v, D, rho, mu) for v in V])
# Calculate friction factors
eD = 0.0001
f = np.array([friction_factor(re, eD) for re in Re])
# Plot results
plt.plot(Re, f, 'b-', linewidth=2)
plt.xlabel('Reynolds Number')
plt.ylabel('Friction Factor')
plt.grid(True)
plt.xscale('log')
plt.yscale('log')
plt.title('Friction Factor vs Reynolds Number')
plt.show()
References
- NumPy Documentation: https://numpy.org/doc/
- NumPy User Guide: https://numpy.org/doc/stable/user/
- SciPy Lecture Notes: https://scipy-lectures.org/
- "Python for Data Analysis" by Wes McKinney
- "Numerical Python" by Robert Johansson
Further Learning
- NumPy Quickstart: https://numpy.org/doc/stable/user/quickstart.html
- NumPy for MATLAB Users: https://numpy.org/doc/stable/user/numpy-for-matlab-users.html
- Array Programming: https://numpy.org/doc/stable/user/basics.html
- Linear Algebra: https://numpy.org/doc/stable/reference/routines.linalg.html