@@ -6,11 +6,11 @@
 <description>Last 10 notes on 🧠 Second Brain</description>
 <generator>Quartz -- quartz.jzhao.xyz</generator>
 <item>
-<title>voice agent deployment</title>
-<link>https://programmerraja.github.io/notes/2025/Generative-AI/voice-agent-deployment</link>
-<guid>https://programmerraja.github.io/notes/2025/Generative-AI/voice-agent-deployment</guid>
-<description>This document provides an organized comparison of GPU architectures, deployment platforms, LLMs, and speech models (TTS/STT) relevant for deploying a voice agent ...</description>
-<pubDate>Mon, 10 Nov 2025 00:57:37 GMT</pubDate>
+<title>How to pick the models</title>
+<link>https://programmerraja.github.io/notes/2025/Generative-AI/How-to-pick-the-models</link>
+<guid>https://programmerraja.github.io/notes/2025/Generative-AI/How-to-pick-the-models</guid>
+<description> Thesis / motivation Picking the newest/biggest LLM is not always optimal. Different models have distinct tradeoffs (code, math, multimodal, deployability, cost, licensing).</description>
+<pubDate>Tue, 02 Dec 2025 10:35:47 GMT</pubDate>
 </item><item>
 <title>question</title>
 <link>https://programmerraja.github.io/notes/Microservice/question</link>
@@ -54,11 +54,11 @@
 <description>As regular readers of my blog may know, our primary technology stack is the MERN stack MongoDB, Express, React, and Node.js. On the frontend, we use React with TypeScript; on the backend, Node.js with TypeScript, and MongoDB serves as our database.</description>
 <pubDate>Tue, 05 Aug 2025 04:53:12 GMT</pubDate>
 </item><item>
-<title>RAG</title>
-<link>https://programmerraja.github.io/notes/2025/Generative-AI/RAG</link>
-<guid>https://programmerraja.github.io/notes/2025/Generative-AI/RAG</guid>
-<description>RAG RAG stands for Retrieval Augmented Generation, which is a technique to enhance Large Language Models (LLMs) by connecting them to external knowledge bases or datasets ...</description>
-<pubDate>Sat, 02 Aug 2025 00:23:48 GMT</pubDate>
+<title>Model Quantization</title>
+<link>https://programmerraja.github.io/notes/2025/Deep-learning/Model-Quantization</link>
+<guid>https://programmerraja.github.io/notes/2025/Deep-learning/Model-Quantization</guid>
+<description>Model Compression The process of making a model smaller is called model compression, and the process to make it do inference faster is called inference optimization ...</description>
+<pubDate>Wed, 16 Jul 2025 02:56:34 GMT</pubDate>
 </item><item>
 <title>Context Engineering</title>
 <link>https://programmerraja.github.io/notes/2025/Generative-AI/Context-Engineering-and-Memory-in-LLM</link>