TechEcho
Submissions by agcat
1. Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC
   1 point by agcat, 9 months ago, 1 comment
2. LLM Wrapper Make Deployment with Nvidia Triton Inference Server Easier
   1 point by agcat, 10 months ago, 1 comment
3. Show HN: Open-source tool that writes Nvidia Triton Inference Glue code for you
   8 points by agcat, 11 months ago, 2 comments
4. Open Source CLI Tool to Generate Code for Nvidia Triton Deployment
   3 points by agcat, 11 months ago, 1 comment
5. Real-Time Streaming Apps with Nvidia Open Source Triton Inference
   3 points by agcat, 12 months ago, no comments
6. Fast Cold-starts for Serverless GPU Inference is becoming a reality
   1 point by agcat, 12 months ago, 1 comment
7. LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research
   2 points by agcat, about 1 year ago, no comments
8. Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo
   7 points by agcat, about 1 year ago, 2 comments
9. Finetune Phi-2 with DPO
   1 point by agcat, over 1 year ago, 1 comment