3273 Commits

Author SHA1 Message Date
jmorganca
b9db5ab5d0 revert llm changes 2024-07-29 15:38:51 -07:00
jmorganca
a796b7aeaf num predict 2024-07-29 15:38:51 -07:00
jmorganca
89cb4b8d6b basic progress 2024-07-29 15:38:51 -07:00
jmorganca
0d365e8d34 add more runner params 2024-07-29 15:38:51 -07:00
jmorganca
72ff94efe0 truncate stop properly 2024-07-29 15:38:51 -07:00
jmorganca
240d4cf0aa wip stop tokens 2024-07-29 15:38:51 -07:00
jmorganca
424627c347 embeddings 2024-07-29 15:38:51 -07:00
jmorganca
1a801fba2a remove dependency on llm 2024-07-29 15:38:51 -07:00
jmorganca
727494ea54 grammar 2024-07-29 15:38:51 -07:00
jmorganca
b39fca7088 sampling 2024-07-29 15:38:51 -07:00
jmorganca
db55b1b89d better example module, add port 2024-07-29 15:38:51 -07:00
jmorganca
1124e24aff wip 2024-07-29 15:38:51 -07:00
jmorganca
df44d119a3 add llava to runner 2024-07-29 15:38:51 -07:00
jmorganca
86955c3014 fix output in build_hipblas.sh 2024-07-29 15:38:51 -07:00
jmorganca
c05ba504ef mods to build_hipblas.sh for linux 2024-07-29 15:38:51 -07:00
jmorganca
aaca2ce093 wip 2024-07-29 15:38:51 -07:00
jmorganca
921708003e improve cuda and hipblas build scripts 2024-07-29 15:38:51 -07:00
jmorganca
323a3f1f3a cuda linux 2024-07-29 15:38:51 -07:00
Jeffrey Morgan
07d6e589ca Update README.md 2024-07-29 15:38:51 -07:00
Jeffrey Morgan
aa52dfcaaf Update README.md 2024-07-29 15:38:51 -07:00
jmorganca
31e0de825e disable log file 2024-07-29 15:38:51 -07:00
jmorganca
d65b4ea480 fix readme for llava 2024-07-29 15:38:51 -07:00
jmorganca
878eb9a19f add llava 2024-07-29 15:38:51 -07:00
jmorganca
5818e3b210 llama: add clip dependencies 2024-07-29 15:38:51 -07:00
jmorganca
2a41ad5b1f add clip and parallel requests to the todo list 2024-07-29 15:38:51 -07:00
jmorganca
cf1ec78071 fix cuda build 2024-07-29 15:38:51 -07:00
jmorganca
57d03929cd fix build on windows 2024-07-29 15:38:51 -07:00
jmorganca
0a6b1adbd7 fix ggml-metal.m build constraints 2024-07-29 15:38:51 -07:00
jmorganca
ec60d79a67 fix ggml-metal.m 2024-07-29 15:38:51 -07:00
jmorganca
3d656588a7 avx2 should only add avx2 2024-07-29 15:38:51 -07:00
jmorganca
460d9857e2 fix sync script 2024-07-29 15:38:51 -07:00
jmorganca
a5548a81fc fix ggml-metal.m 2024-07-29 15:38:51 -07:00
jmorganca
634f6a75d0 fix ggml-metal.m 2024-07-29 15:38:51 -07:00
jmorganca
3b5e5a6280 add license headers 2024-07-29 15:38:51 -07:00
jmorganca
853d96b1b1 pre-patch 2024-07-29 15:38:51 -07:00
jmorganca
4dd63c1fef move runner package down 2024-07-29 15:38:51 -07:00
jmorganca
82214396b5 replace static build in llm 2024-07-29 15:38:51 -07:00
jmorganca
8ca4a9a70a fix build 2024-07-29 15:35:09 -07:00
jmorganca
25fd8fd045 wip... 2024-07-29 15:35:09 -07:00
jmorganca
be2f37b5d4 rename server to runner 2024-07-29 15:35:09 -07:00
Jeffrey Morgan
9e28405c54 Update README.md 2024-07-29 15:35:09 -07:00
Jeffrey Morgan
9f3e950120 Update README.md 2024-07-29 15:35:09 -07:00
Jeffrey Morgan
951104045f Update README.md 2024-07-29 15:35:09 -07:00
Jeffrey Morgan
597712006c Update README.md 2024-07-29 15:35:09 -07:00
jmorganca
64e712b12b Add missing hipcc flags 2024-07-29 15:35:09 -07:00
jmorganca
85aea62997 fix .gitattributes 2024-07-29 15:35:09 -07:00
jmorganca
491ff41675 Initial llama Go module 2024-07-29 15:35:09 -07:00
jmorganca
075f2e88d9 add sync of llama.cpp 2024-07-29 15:35:09 -07:00
Daniel Hiltgen
1a83581a8e Merge pull request #5895 from dhiltgen/sched_faq: Better explain multi-gpu behavior 2024-07-29 14:25:41 -07:00
Daniel Hiltgen
37926eb991 Merge pull request #5927 from dhiltgen/high_cpu_count: Ensure amd gpu nodes are numerically sorted 2024-07-29 14:24:57 -07:00