ISSUES FIXED: - ❌ Invalid JSON response errors during streaming - ❌ Incorrect token limits causing API rejections - ❌ Outdated hardcoded model configurations - ❌ Poor error messages for API failures SOLUTIONS IMPLEMENTED: 🎯 ACCURATE TOKEN LIMITS & CONTEXT SIZES - OpenAI GPT-4o: 128k context (was 8k) - OpenAI GPT-3.5-turbo: 16k context (was 8k) - Anthropic Claude 3.5 Sonnet: 200k context (was 8k) - Anthropic Claude 3 Haiku: 200k context (was 8k) - Google Gemini 1.5 Pro: 2M context (was 8k) - Google Gemini 1.5 Flash: 1M context (was 8k) - Groq Llama models: 128k context (was 8k) - Together models: Updated with accurate limits �� DYNAMIC MODEL FETCHING ENHANCED - Smart context detection from provider APIs - Automatic fallback to known limits when API unavailable - Safety caps to prevent token overflow (100k max) - Intelligent model filtering and deduplication 🛡️ IMPROVED ERROR HANDLING - Specific error messages for Invalid JSON responses - Token limit exceeded warnings with solutions - API key validation with clear guidance - Rate limiting detection and user guidance - Network timeout handling ⚡ PERFORMANCE OPTIMIZATIONS - Reduced static models from 40+ to 12 essential - Enhanced streaming error detection - Better API response validation - Improved context window display (shows M/k units) 🔧 TECHNICAL IMPROVEMENTS - Dynamic model context detection from APIs - Enhanced streaming reliability - Better token limit enforcement - Comprehensive error categorization - Smart model validation before API calls IMPACT: ✅ Eliminates Invalid JSON response errors ✅ Prevents token limit API rejections ✅ Provides accurate model capabilities ✅ Improves user experience with clear errors ✅ Enables full utilization of modern LLM context windows
46 lines
958 B
TypeScript
46 lines
958 B
TypeScript
/*
|
|
* Maximum tokens for response generation (conservative default for older models)
|
|
* Modern models can handle much higher limits - specific limits are set per model
|
|
*/
|
|
export const MAX_TOKENS = 32000;
|
|
|
|
// limits the number of model responses that can be returned in a single request
|
|
export const MAX_RESPONSE_SEGMENTS = 2;
|
|
|
|
export interface File {
|
|
type: 'file';
|
|
content: string;
|
|
isBinary: boolean;
|
|
isLocked?: boolean;
|
|
lockedByFolder?: string;
|
|
}
|
|
|
|
export interface Folder {
|
|
type: 'folder';
|
|
isLocked?: boolean;
|
|
lockedByFolder?: string;
|
|
}
|
|
|
|
type Dirent = File | Folder;
|
|
|
|
export type FileMap = Record<string, Dirent | undefined>;
|
|
|
|
export const IGNORE_PATTERNS = [
|
|
'node_modules/**',
|
|
'.git/**',
|
|
'dist/**',
|
|
'build/**',
|
|
'.next/**',
|
|
'coverage/**',
|
|
'.cache/**',
|
|
'.vscode/**',
|
|
'.idea/**',
|
|
'**/*.log',
|
|
'**/.DS_Store',
|
|
'**/npm-debug.log*',
|
|
'**/yarn-debug.log*',
|
|
'**/yarn-error.log*',
|
|
'**/*lock.json',
|
|
'**/*lock.yml',
|
|
];
|