Pull requests: deepseek-ai/DeepSeek-V3
Optimize Multi-head Latent Attention (MLA) with Fast Path for Short Sequences
#684
opened Feb 19, 2025 by
XxAlonexX
7 tasks done
Fix incorrect comment in linear function regarding weight.element_size()
#662
opened Feb 14, 2025 by
iamvalenciia
Refactor checkpoint conversion script for improved readability and efficiency
#633
opened Feb 10, 2025 by
tdas3001
Improve convert.py with error handling and code optimization
#618
opened Feb 8, 2025 by
wowrakibul
Improve Weight File Documentation for Clarity and Readability
#481
opened Jan 30, 2025 by
Muhammad-Noraeii
feat:feat: Added logging, parallel processing, and CPU processing option for FP8 to BF16 conversion
#461
opened Jan 29, 2025 by
anand-144
Refactored/codebase By defining different classes for different operations and much more
#444
opened Jan 29, 2025 by
Pratiyankkumar
Update generate.py: Add parallel processing for token generation
#426
opened Jan 28, 2025 by
utsav-pal
Fixes #374: Suppress Torch Error in MoE Module by Configuring torch._dynamo
#375
opened Jan 27, 2025 by
minimalProviderAgentMarket