Troubleshooting Guide
This guide helps you diagnose and resolve common issues with ApiLinker and explains how to use the robust error handling and recovery system.
Error Handling & Recovery System
APILinker includes a sophisticated error handling and recovery system to make your API integrations more resilient against common failures. The system includes:
- Circuit Breakers - Prevent cascading failures during service outages
- Dead Letter Queues (DLQ) - Store failed operations for later retry
- Configurable Recovery Strategies - Apply different strategies for different error types
- Error Analytics - Track error patterns and trends
Using the Error Handling System
The error handling system is configured in your configuration file under the error_handling section:
error_handling:
  # Configure circuit breakers
  circuit_breakers:
    source_customer_api: # Name of the circuit breaker
      failure_threshold: 5 # Number of failures before opening circuit
      reset_timeout_seconds: 60 # Seconds to wait before trying again
      half_open_max_calls: 1 # Max calls allowed in half-open state
  # Configure recovery strategies by error category
  recovery_strategies:
    network: # Error category
      - exponential_backoff
      - circuit_breaker
    rate_limit:
      - exponential_backoff
    server:
      - circuit_breaker
      - exponential_backoff
  # Configure Dead Letter Queue
  dlq:
    directory: "./dlq" # Directory to store failed operations
Diagnostic Decision Tree
Start here and follow the branches to diagnose your issue:
- Installation Issues
  - Package Not Found
  - Version Conflicts
- Configuration Issues
  - Invalid Configuration
  - Environment Variables Not Working
- API Connection Issues
  - Connection Failed
  - Authentication Failed
  - SSL/Certificate Errors
  - Timeout Errors
- Mapping Issues
  - Missing Fields
  - Transformation Errors
- Runtime Issues
  - Scheduling Problems
  - Memory Usage
  - Performance Problems
  - DLQ Processing Errors
Installation Issues
Package Not Found
Symptoms:
Solutions:
1. Verify your Python version (3.8+ required):
2. Update pip:
3. Check your internet connection and try again.
4. If using a corporate network, check proxy settings:
Version Conflicts
Symptoms:
Solutions:
1. Create a clean virtual environment:
python -m venv apilinker_env
source apilinker_env/bin/activate # On Windows: apilinker_env\Scripts\activate
pip install apilinker
2. Install with the --no-dependencies flag and handle dependencies manually:
ImportError
Symptoms:
Solutions:
1. Verify installation:
2. Check your Python environment:
3. Try reinstalling:
Configuration Issues
Invalid Configuration
Symptoms:
Solutions:
1. Validate your YAML syntax using an online validator.
2. Check the specific error message for details about what's wrong.
3. Compare with the examples in the documentation.
4. Common issues:
   - Indentation errors
   - Missing required fields
   - Incorrect value types
Environment Variables
Symptoms: Environment variables are not being replaced in your configuration.
Solutions:
1. Verify the environment variable is set:
2. Check the syntax in your configuration file:
3. Set the environment variable in your script:
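A minimal sketch of checking and setting a variable from a script before the configuration is loaded. The variable name `API_TOKEN`, its value, and the `${VAR}` placeholder syntax are assumptions made for illustration; use the names and syntax your configuration actually relies on.

```python
import os
import re

# Assumed placeholder syntax: ${VAR_NAME}
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_unset_variables(config_text: str) -> list:
    """Return placeholder names in config_text that are missing from the environment."""
    names = _PLACEHOLDER.findall(config_text)
    return sorted({name for name in names if name not in os.environ})


# Set the variable before the configuration file is read.
os.environ["API_TOKEN"] = "example-token"  # hypothetical name and value

sample_config = "token: ${API_TOKEN}\nbase_url: ${APILINKER_DOC_EXAMPLE_UNSET_VAR}"
print("Unset variables:", find_unset_variables(sample_config))
```

Running the check before a sync starts turns a silent "placeholder not replaced" problem into an explicit list of missing names.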
File Not Found
Symptoms:
Solutions:
1. Check the file path:
2. Use absolute paths:
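One way to rule out working-directory confusion is to resolve the path yourself and fail with a message that shows where Python actually looked. This is a sketch using only the standard library; `resolve_config_path` is a hypothetical helper, not part of ApiLinker.

```python
from pathlib import Path


def resolve_config_path(path_str: str) -> Path:
    """Return an absolute path, raising a descriptive error if the file is missing."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"Config file not found: {path} (current directory: {Path.cwd()})"
        )
    return path
```

To make a path independent of where the script is launched from, build it relative to the script itself, for example `Path(__file__).parent / "config.yaml"`.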
API Connection Issues
Connection Failed
Symptoms:
Solutions:
1. Verify your internet connection.
2. Check if the API domain is correct and accessible:
3. Try with a different network or disable firewall temporarily.
4. Add timeout and retry settings:
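The domain and reachability checks above can also be scripted, which helps when the failure only happens on a server without interactive tools. The sketch below uses only the standard `socket` module; the function names are hypothetical and the host you pass in is your own API host.

```python
import socket


def domain_resolves(hostname: str) -> bool:
    """Return True if DNS (or the local hosts file) can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def port_reachable(hostname: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port succeeds within timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

If `domain_resolves` succeeds but `port_reachable` fails, the problem is more likely a firewall, proxy, or service outage than a wrong domain name.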
Authentication Failed
Symptoms:
Solutions:
1. Verify your credentials are correct.
2. Check if your token or API key has expired.
3. Ensure you're using the correct authentication method.
4. Examine the API documentation for specific auth requirements.
5. Enable debug logging to see the actual request:
SSL/Certificate Errors
Symptoms:
Solutions:
1. Update your CA certificates.
2. If necessary (and safe), disable SSL verification:
3. Specify a custom CA bundle:
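To confirm that a custom CA bundle is itself valid, independent of any ApiLinker setting, you can load it with Python's standard `ssl` module. This is a sketch: `build_ssl_context` is a hypothetical helper, and any bundle path you pass is your own.

```python
import ssl
from typing import Optional


def build_ssl_context(ca_bundle: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context.

    With ca_bundle=None the system trust store is used; otherwise the
    certificates in the given PEM file are trusted.
    """
    # create_default_context enables certificate and hostname verification.
    return ssl.create_default_context(cafile=ca_bundle)


context = build_ssl_context()
print("Verification required:", context.verify_mode == ssl.CERT_REQUIRED)
print("Hostname checked:", context.check_hostname)
```

The resulting context can be passed to `urllib.request.urlopen(url, context=context)` to test a request against the API. If loading the bundle raises `ssl.SSLError`, the file is not a valid PEM bundle.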
Timeout Errors
Symptoms:
Solutions:
1. Increase timeout duration:
2. Check if the API is experiencing high latency.
3. Consider adding pagination for large data sets.
Mapping Issues
Missing Fields
Symptoms:
Solutions:
1. Print the actual response data to inspect the structure:
2. Use dot notation for nested fields:
3. Add conditional mapping:
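When it is unclear why a dotted path finds nothing, it can help to walk the response by hand. The sketch below is a hypothetical helper, not ApiLinker's own resolver: it follows a dot-separated path through dicts and lists and returns a default when any step is missing.

```python
from typing import Any

_MISSING = object()


def get_nested(data: Any, path: str, default: Any = None) -> Any:
    """Look up a dot-separated path such as 'user.emails.0.value'.

    Numeric segments index into lists. Returns default if any step is missing.
    """
    current = data
    for part in path.split("."):
        if isinstance(current, dict):
            current = current.get(part, _MISSING)
        elif isinstance(current, list) and part.isdecimal():
            index = int(part)
            current = current[index] if index < len(current) else _MISSING
        else:
            current = _MISSING
        if current is _MISSING:
            return default
    return current


response = {"user": {"name": "Ada", "emails": [{"value": "ada@example.com"}]}}
print(get_nested(response, "user.emails.0.value"))        # prints ada@example.com
print(get_nested(response, "user.phone", default="n/a"))  # prints n/a
```

Supplying a default is a simple form of conditional mapping: the target field is always populated, and missing source data is easy to spot.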
Transformation Errors
Symptoms:
Solutions:
1. Check the input data format.
2. Add validation in your transformer:
3. Test the transformer directly:
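As an illustration of validating input and testing a transformer in isolation, here is a sketch of a date transformer. The function name, the `input_format` keyword, and the date format are assumptions; only the `(value, **kwargs)` signature follows the transformer shape used elsewhere in this guide.

```python
from datetime import datetime


def to_iso_date(value, **kwargs):
    """Convert a date string (default 'DD/MM/YYYY') to ISO 'YYYY-MM-DD'."""
    if value is None:
        return None  # let missing values pass through unchanged
    if not isinstance(value, str):
        raise TypeError(f"expected str, got {type(value).__name__}")
    input_format = kwargs.get("input_format", "%d/%m/%Y")
    try:
        return datetime.strptime(value.strip(), input_format).date().isoformat()
    except ValueError as exc:
        raise ValueError(
            f"cannot parse date {value!r} with format {input_format!r}"
        ) from exc


# Call the transformer directly, outside of any sync, with known inputs.
assert to_iso_date("31/12/2023") == "2023-12-31"
assert to_iso_date(" 01/02/2024 ") == "2024-02-01"
assert to_iso_date(None) is None
```

Raising errors that include the offending value makes the failure traceable to a specific record rather than to the sync as a whole.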
Type Errors
Symptoms:
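Solutions:
1. Add type checking:
import json

def my_transformer(value, **kwargs):
    # Serialize dicts to a JSON string
    if isinstance(value, dict):
        return json.dumps(value)
    # Convert numbers to their string form
    elif isinstance(value, (int, float)):
        return str(value)
    # Leave all other types unchanged
    return value
2. Use a pre-processor: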
API Connection Issues
Connection Failed
Symptoms:
Solutions:
1. Check your internet connection
2. Verify the API is online using a tool like cURL or Postman
3. Check if the API domain resolves correctly:
4. Check for firewall or proxy issues in your environment
Authentication Failed
Symptoms:
Solutions:
1. Verify your credentials are correct
2. Check if the token has expired
3. Ensure you're using the correct authentication method for the API
4. Check if your API key has the necessary permissions
SSL/Certificate Errors
Symptoms:
Solutions:
1. Update your CA certificates
2. If working in a development environment, you can disable verification (not recommended for production):
3. Provide the path to your custom certificate:
Timeout Errors
Symptoms:
Solutions:
1. Increase the timeout in your configuration:
2. Check if the API endpoint is slow or under heavy load
3. Consider adding an exponential backoff retry strategy:
Circuit Breaker Open
Symptoms:
Solutions:
1. Wait for the circuit breaker to reset (typically 60 seconds by default)
2. Check the health of the API service that's failing
3. Adjust your circuit breaker configuration if needed:
error_handling:
  circuit_breakers:
    source_customer_api:
      failure_threshold: 10 # More permissive
      reset_timeout_seconds: 30 # Quicker reset
Runtime Issues
Scheduling Problems
Symptoms: Scheduled syncs not running at expected times.
Solutions:
1. Check your system time and timezone.
2. Verify the cron expression format.
3. Ensure your script is kept running:
4. Use a dedicated task scheduler like systemd, cron, or Windows Task Scheduler.
Memory Usage
Symptoms: High memory usage or MemoryError exceptions.
Solutions:
1. Process data in batches:
2. Use pagination with limits:
3. Implement a custom stream processor for very large datasets.
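Batch processing keeps memory bounded because only one batch is held at a time. The generic sketch below uses only the standard library; `batched` is a hypothetical helper, not an ApiLinker function, and it accepts any iterable, including a generator that pages through an API.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items from items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


for batch in batched(range(7), 3):
    print(batch)
# prints [0, 1, 2], then [3, 4, 5], then [6]
```

Because the source is consumed lazily, the full dataset is never materialized as long as each batch is released before the next one is read.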
Performance Problems
Symptoms:
- Syncs take longer than expected
- High memory usage
- Slow response times
Solutions:
1. Use batch processing for large datasets
2. Add appropriate indexes to your database
3. Use pagination for large API responses
4. Profile your transformers to identify bottlenecks
5. Consider adding caching for frequently accessed data
DLQ Processing Errors
Symptoms:
Solutions:
1. Check the DLQ item's payload and error details:
2. Process specific types of failed operations:
3. Manually fix issues and retry:
4. Check the error category distribution to identify patterns:
Error Handling System Details
Circuit Breaker Pattern
The circuit breaker pattern prevents cascading failures by temporarily stopping calls to failing services. It has three states:
- CLOSED - Normal operation, requests pass through
- OPEN - Service is failing, requests fail fast without calling the service
- HALF-OPEN - Testing if service has recovered with limited requests
Configuration options:
circuit_breakers:
  name_of_breaker:
    failure_threshold: 5 # Failures before opening
    reset_timeout_seconds: 60 # Time before half-open
    half_open_max_calls: 1 # Test calls allowed
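To make the three states and the configuration options concrete, here is a simplified, single-threaded sketch of the pattern. It is not ApiLinker's implementation: the class, the `CircuitOpenError` exception, and the injectable `clock` parameter are illustrative assumptions, and only the three option names come from the configuration above.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without reaching the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout_seconds=60,
                 half_open_max_calls=1, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_seconds = reset_timeout_seconds
        self.half_open_max_calls = half_open_max_calls
        self._clock = clock  # injectable so tests need not wait
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = 0.0
        self._half_open_calls = 0

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self.reset_timeout_seconds:
                raise CircuitOpenError("circuit is open; failing fast")
            # Reset timeout elapsed: allow a limited number of trial calls.
            self.state = "HALF_OPEN"
            self._half_open_calls = 0
        if self.state == "HALF_OPEN":
            if self._half_open_calls >= self.half_open_max_calls:
                raise CircuitOpenError("half-open trial limit reached")
            self._half_open_calls += 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record_failure()
            raise
        # Any success closes the circuit and clears the failure count.
        self.state = "CLOSED"
        self._failures = 0
        return result

    def _record_failure(self):
        self._failures += 1
        # A failed trial call, or too many failures, opens the circuit.
        if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()
```

A lower `failure_threshold` opens the circuit sooner, and a shorter `reset_timeout_seconds` probes the failing service more often. A production implementation would also need locking so that concurrent callers respect the half-open limit.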
Recovery Strategies
APILinker supports these recovery strategies:
- RETRY - Simple retry without delay
- EXPONENTIAL_BACKOFF - Retry with increasing delays
- CIRCUIT_BREAKER - Use circuit breaker pattern
- FALLBACK - Use default data instead
- SKIP - Skip the operation
- FAIL_FAST - Fail immediately
Configure by error category:
recovery_strategies:
  network: # Error category
    - exponential_backoff # First strategy
    - circuit_breaker # Second strategy
Available error categories:
- NETWORK - Network connectivity issues
- AUTHENTICATION - Auth failures
- VALIDATION - Invalid data
- TIMEOUT - Request timeouts
- RATE_LIMIT - API returned 429 (rate limited): apply exponential backoff and retry
- SERVER - Server errors (5xx)
- CLIENT - Client errors (4xx)
- MAPPING - Data mapping errors
- PLUGIN - Plugin errors
- UNKNOWN - Uncategorized errors
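For intuition about the EXPONENTIAL_BACKOFF strategy, the sketch below shows the general technique: each retry waits twice as long as the previous one, up to a cap. The function names, parameters, and default values are assumptions for illustration and do not describe ApiLinker's internal settings.

```python
import random
import time


def backoff_delays(max_retries=5, base_delay=1.0, max_delay=30.0, jitter=0.0):
    """Yield the wait before each retry: base_delay * 2**n, capped at max_delay.

    jitter adds a random extra of up to that fraction of each delay.
    """
    for attempt in range(max_retries):
        delay = min(base_delay * (2 ** attempt), max_delay)
        yield delay + random.uniform(0, jitter * delay)


def retry_with_backoff(func, retryable=(ConnectionError, TimeoutError),
                       sleep=time.sleep, **backoff_options):
    """Call func; on a retryable error, wait and try again.

    After the delays are exhausted, one final attempt is made and any
    error from it propagates to the caller.
    """
    for delay in backoff_delays(**backoff_options):
        try:
            return func()
        except retryable:
            sleep(delay)
    return func()


print(list(backoff_delays(max_retries=5)))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Adding jitter spreads out retries from many clients so they do not all hit a recovering service at the same moment, which matters most for rate-limit and server errors.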
Dead Letter Queue (DLQ)
The DLQ stores failed operations for later analysis and retry. Each entry contains:
- Error details (category, message, status code)
- Original payload that caused the failure
- Timestamp and operation context
- Correlation ID for tracing
Access DLQ data:
# Get failed operations
items = linker.dlq.get_items(error_category=ErrorCategory.RATE_LIMIT)
# Retry operations
linker.process_dlq(operation_type="source_customers", limit=10)
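To illustrate the idea of a directory-backed queue, here is a minimal sketch that stores one JSON file per failed operation with the fields listed above. The `FileDLQ` class, its methods, and the on-disk layout are hypothetical; ApiLinker's actual storage format may differ, so do not use this to read or write its DLQ directory.

```python
import json
import time
import uuid
from pathlib import Path


class FileDLQ:
    """Hypothetical file-backed queue: one JSON file per failed operation."""

    def __init__(self, directory="./dlq"):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def add(self, operation_type, payload, category, message, status_code=None):
        """Store a failed operation and return its correlation ID."""
        entry = {
            "correlation_id": str(uuid.uuid4()),
            "timestamp": time.time(),
            "operation_type": operation_type,
            "payload": payload,
            "error": {"category": category, "message": message,
                      "status_code": status_code},
        }
        path = self.directory / f"{entry['correlation_id']}.json"
        path.write_text(json.dumps(entry), encoding="utf-8")
        return entry["correlation_id"]

    def get_items(self, error_category=None):
        """Return stored entries, optionally filtered by error category."""
        items = []
        for path in sorted(self.directory.glob("*.json")):
            entry = json.loads(path.read_text(encoding="utf-8"))
            if error_category is None or entry["error"]["category"] == error_category:
                items.append(entry)
        return items

    def retry(self, handler, limit=10):
        """Run handler(payload) on up to limit entries; delete those that succeed."""
        succeeded = 0
        for entry in self.get_items()[:limit]:
            try:
                handler(entry["payload"])
            except Exception:
                continue  # leave the entry in the queue for a later attempt
            (self.directory / f"{entry['correlation_id']}.json").unlink()
            succeeded += 1
        return succeeded
```

Keeping the original payload alongside the error details is what makes a later retry possible without re-fetching data from the source API.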
Error Analytics
The error analytics system tracks:
- Error counts by category
- Error rates over time
- Top error types
Access analytics:
analytics = linker.get_error_analytics()
print(f"Error rate: {analytics['recent_error_rate']} errors/minute")
print(f"Top errors: {analytics['top_errors']}")
Symptoms: Syncs taking too long to complete.
Solutions:
1. Enable performance logging:
2. Optimize transformers:
   - Avoid unnecessary operations
   - Cache repeated calculations
   - Use built-in functions where possible
3. Use concurrent requests when appropriate:
4. Implement partial syncs with filters:
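For the "cache repeated calculations" suggestion, the standard library's `functools.lru_cache` is often enough when a transformer is a pure function of hashable inputs. The sketch below uses a hypothetical `normalize_country` step and a made-up lookup table to show the idea.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def normalize_country(value: str) -> str:
    """Hypothetical transformer step: map free-text country names to codes."""
    lookup = {"united states": "US", "usa": "US", "germany": "DE"}
    cleaned = value.strip()
    return lookup.get(cleaned.lower(), cleaned.upper())


values = ["USA", "usa", "Germany", "USA"]
print([normalize_country(v) for v in values])
print(normalize_country.cache_info())  # shows hits and misses
```

Caching only pays off when the same inputs recur and the function has no side effects. `cache_info()` reports hits and misses, so you can check whether the cache is actually being used.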
Debugging Techniques
Enable Debug Logging
import logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
Use Dry Run Mode
# Test the sync without making changes
result = linker.sync(dry_run=True)
print(f"Would sync {result.count} records")
print(f"Preview: {result.preview[:3]}") # First 3 records
Inspect API Requests
# Install HTTP debugging tool
# pip install httpx-debug
import httpx_debug
httpx_debug.install() # Shows all HTTP requests and responses
Interactive Debugging
If you're still experiencing issues after trying these solutions, please open an issue on GitHub with detailed information about your problem.