Submissions by agcat

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

1

Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC

1 point by agcat 9 months ago

2

LLM Wrapper Make Deployment with Nvidia Triton Inference Server Easier

1 point by agcat 10 months ago

3

Show HN: Open-source tool that writes Nvidia Triton Inference Glue code for you

8 points by agcat 11 months ago

4

Open Source CLI Tool to Generate Code for Nvidia Triton Deployment

3 points by agcat 11 months ago

5

Real-Time Streaming Apps with Nvidia Open Source Triton Inference

3 points by agcat 12 months ago

6

Fast Cold-starts for Serverless GPU Inference is becoming a reality

1 point by agcat 12 months ago

7

LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research

2 points by agcat about 1 year ago

8

Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo

7 points by agcat about 1 year ago

9

Finetune Phi-2 with DPO

1 point by agcat over 1 year ago
