Commit Graph

5 Commits

Author SHA1 Message Date
Stijnus
38c13494c2 Update LLM providers and constants (#1937)
- Updated constants in app/lib/.server/llm/constants.ts
- Modified stream-text functionality in app/lib/.server/llm/stream-text.ts
- Updated Anthropic provider in app/lib/modules/llm/providers/anthropic.ts
- Modified GitHub provider in app/lib/modules/llm/providers/github.ts
- Updated Google provider in app/lib/modules/llm/providers/google.ts
- Modified OpenAI provider in app/lib/modules/llm/providers/openai.ts
- Updated LLM types in app/lib/modules/llm/types.ts
- Modified API route in app/routes/api.llmcall.ts
2025-08-29 22:55:02 +02:00
Stijnus
b5d9055851 🔧 Fix Token Limits & Invalid JSON Response Errors (#1934)
ISSUES FIXED:
- Invalid JSON response errors during streaming
- Incorrect token limits causing API rejections
- Outdated hardcoded model configurations
- Poor error messages for API failures

SOLUTIONS IMPLEMENTED:

🎯 ACCURATE TOKEN LIMITS & CONTEXT SIZES
- OpenAI GPT-4o: 128k context (was 8k)
- OpenAI GPT-3.5-turbo: 16k context (was 8k)
- Anthropic Claude 3.5 Sonnet: 200k context (was 8k)
- Anthropic Claude 3 Haiku: 200k context (was 8k)
- Google Gemini 1.5 Pro: 2M context (was 8k)
- Google Gemini 1.5 Flash: 1M context (was 8k)
- Groq Llama models: 128k context (was 8k)
- Together models: Updated with accurate limits
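
A minimal sketch of how the corrected limits above could be kept in a lookup table. The table name, the `contextLimitFor` helper, the exact model ID prefixes, and the choice to treat "k" as 1,000 tokens are all assumptions for illustration, not code from this commit. Groq and Together entries are omitted because the message gives no model IDs for them.

```typescript
// Hypothetical lookup table: prefixes are illustrative, values come from the list above.
const KNOWN_CONTEXT_LIMITS: ReadonlyArray<readonly [string, number]> = [
  ["gpt-4o", 128_000],
  ["gpt-3.5-turbo", 16_000],
  ["claude-3-5-sonnet", 200_000],
  ["claude-3-haiku", 200_000],
  ["gemini-1.5-pro", 2_000_000],
  ["gemini-1.5-flash", 1_000_000],
];

// The old hardcoded value that the commit message says every model used to get.
const LEGACY_DEFAULT_LIMIT = 8_000;

// Returns the known context size for a model ID, matching on prefix so that
// dated variants (for example "gpt-4o-2024-08-06") resolve to the same entry.
function contextLimitFor(modelId: string): number {
  for (const [prefix, limit] of KNOWN_CONTEXT_LIMITS) {
    if (modelId.startsWith(prefix)) {
      return limit;
    }
  }
  return LEGACY_DEFAULT_LIMIT;
}
```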

DYNAMIC MODEL FETCHING ENHANCED
- Smart context detection from provider APIs
- Automatic fallback to known limits when API unavailable
- Safety caps to prevent token overflow (100k max)
- Intelligent model filtering and deduplication
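
The fallback, safety cap, and deduplication described above might look roughly like the sketch below. The names `resolveTokenLimit`, `dedupeByName`, and `SAFETY_CAP`, and the shape of the model record, are hypothetical. The only detail taken from the commit message is the 100k ceiling.

```typescript
// Hypothetical minimal model record; the real type in the repository may differ.
interface FetchedModel {
  name: string;
  maxTokenAllowed: number;
}

// Ceiling taken from the "100k max" note in the commit message.
const SAFETY_CAP = 100_000;

// Prefer the limit reported by the provider API; fall back to a known value
// when the API gives nothing usable; never exceed the safety cap.
function resolveTokenLimit(reported: number | undefined, fallback: number): number {
  const usable = reported !== undefined && Number.isFinite(reported) && reported > 0;
  const base = usable ? (reported as number) : fallback;
  return Math.min(base, SAFETY_CAP);
}

// Keep the first occurrence of each model name and drop later duplicates.
function dedupeByName(models: FetchedModel[]): FetchedModel[] {
  const seen = new Set<string>();
  return models.filter((model) => {
    if (seen.has(model.name)) {
      return false;
    }
    seen.add(model.name);
    return true;
  });
}
```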

🛡️ IMPROVED ERROR HANDLING
- Specific error messages for Invalid JSON responses
- Token limit exceeded warnings with solutions
- API key validation with clear guidance
- Rate limiting detection and user guidance
- Network timeout handling
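
As an illustration of the error categories listed above, a classifier could map an HTTP status and message to one of a few kinds. This is a sketch under assumptions: the `LlmErrorKind` union, the `categorizeLlmError` function, and the matching rules are invented here and are not taken from the commit.

```typescript
// Hypothetical error categories mirroring the bullet list above.
type LlmErrorKind =
  | "invalid_json"
  | "token_limit"
  | "auth"
  | "rate_limit"
  | "timeout"
  | "unknown";

// Classify a failure from its HTTP status (if any) and message text.
// Status codes are checked first because they are less ambiguous than text.
function categorizeLlmError(status: number | undefined, message: string): LlmErrorKind {
  if (status === 401 || status === 403) {
    return "auth";
  }
  if (status === 429) {
    return "rate_limit";
  }
  if (/timeout|timed out/i.test(message)) {
    return "timeout";
  }
  if (/invalid json/i.test(message)) {
    return "invalid_json";
  }
  if (/token/i.test(message) && /limit|exceed/i.test(message)) {
    return "token_limit";
  }
  return "unknown";
}
```

A caller could then switch on the returned kind to pick a user-facing message, for example suggesting a shorter prompt for `token_limit` or a key check for `auth`.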

PERFORMANCE OPTIMIZATIONS
- Reduced static models from 40+ to 12 essential
- Enhanced streaming error detection
- Better API response validation
- Improved context window display (shows M/k units)
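
The M/k display mentioned in the last bullet can be sketched as a small formatter. The function name `formatContextSize` and the rounding to one decimal place are assumptions; the commit message only says that M and k units are shown.

```typescript
// Render a number with at most one decimal place, dropping a trailing ".0".
function oneDecimal(value: number): string {
  return Number(value.toFixed(1)).toString();
}

// Format a token count with "M" for millions and "k" for thousands.
function formatContextSize(tokens: number): string {
  if (tokens >= 1_000_000) {
    return `${oneDecimal(tokens / 1_000_000)}M`;
  }
  if (tokens >= 1_000) {
    return `${oneDecimal(tokens / 1_000)}k`;
  }
  return String(tokens);
}
```

With this sketch, `formatContextSize(2_000_000)` yields `"2M"` and `formatContextSize(128_000)` yields `"128k"`.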

🔧 TECHNICAL IMPROVEMENTS
- Dynamic model context detection from APIs
- Enhanced streaming reliability
- Better token limit enforcement
- Comprehensive error categorization
- Smart model validation before API calls

IMPACT:
- Eliminates Invalid JSON response errors
- Prevents token limit API rejections
- Provides accurate model capabilities
- Improves user experience with clear errors
- Enables full utilization of modern LLM context windows
2025-08-29 20:53:57 +02:00
Anirban Kar
32bfdd9c24 feat: added more dynamic models, sorted and remove duplicate models (#1206)
2025-01-29 02:33:23 +05:30
Mohammad Saif Khan
39a0724ef3 feat: add Gemini 2.0 Flash-thinking-exp-01-21 model with 65k token support (#1202)
Added the new gemini-2.0-flash-thinking-exp-01-21 model to the GoogleProvider's static model configuration. This model supports a significantly increased maxTokenAllowed limit of 65,536 tokens, enabling it to handle larger context windows compared to existing Gemini models (previously capped at 8k tokens). The model is labeled as "Gemini 2.0 Flash-thinking-exp-01-21" for clear identification in the UI/dropdowns.
2025-01-28 23:30:50 +05:30
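
For reference, the static model entry described in commit 39a0724ef3 could look something like the sketch below. The model name, label, and `maxTokenAllowed` value come from the commit message; the `StaticModelEntry` interface and the `provider` field are assumptions about the record shape.

```typescript
// Assumed shape of a static model entry; only maxTokenAllowed is named in the message.
interface StaticModelEntry {
  name: string;
  label: string;
  provider: string;
  maxTokenAllowed: number;
}

// Entry for the model added in this commit: 65,536 tokens instead of the earlier 8k cap.
const geminiFlashThinking: StaticModelEntry = {
  name: "gemini-2.0-flash-thinking-exp-01-21",
  label: "Gemini 2.0 Flash-thinking-exp-01-21",
  provider: "Google",
  maxTokenAllowed: 65_536,
};
```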
Anirban Kar
7295352a98 refactor: refactored LLM Providers: Adapting Modular Approach (#832)
* refactor: Refactoring Providers to have providers as modules

* updated package and lock file

* added grok model back

* updated registry system
2024-12-21 11:45:17 +05:30
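
The "providers as modules" and "registry system" items in commit 7295352a98 suggest a pattern along the lines of the sketch below. It is illustrative only: the `ProviderModule` interface, the `ProviderRegistry` class, and its methods are hypothetical and do not reproduce the repository's actual registry.

```typescript
// Hypothetical provider module: each provider is a self-contained unit.
interface ProviderModule {
  name: string;
  staticModels: string[];
}

// Hypothetical registry that provider modules register themselves with.
class ProviderRegistry {
  private readonly providers = new Map<string, ProviderModule>();

  // Register a provider once; a second registration under the same name is an error.
  register(provider: ProviderModule): void {
    if (this.providers.has(provider.name)) {
      throw new Error(`Provider already registered: ${provider.name}`);
    }
    this.providers.set(provider.name, provider);
  }

  // Look up a provider by name; returns undefined when it is not registered.
  get(name: string): ProviderModule | undefined {
    return this.providers.get(name);
  }

  // List all providers sorted by name for stable display order.
  list(): ProviderModule[] {
    return Array.from(this.providers.values()).sort((a, b) =>
      a.name.localeCompare(b.name),
    );
  }
}
```

Keeping each provider behind a common interface means adding or removing one touches only its own module and a single `register` call.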