GPT-5 Fast vs Thinking vs Pro: How They Actually Work

OpenAI recently released the GPT-5 family, introducing three distinct options for Pro users: GPT-5 Fast, GPT-5 Thinking, and GPT-5 Pro. While it’s common knowledge that GPT-5 Fast handles simple tasks and GPT-5 Pro tackles complex ones, the underlying mechanisms remain unclear to many users. This post gives a concise explanation of how each variant operates and when to use it. GPT-5 Fast: For this article, we will use GPT-5 Fast as our base model. Think of it as a black box optimized for speed: a good old LLM. When you submit a query, it processes the request and delivers an answer quickly, without extensive deliberation. ...

August 25, 2025 · 3 min · 454 words · Necati Demir

Building an End-to-End Chat Bot with ONNX Runtime and Rust

Table of Contents: Introduction · Prerequisites · Project Setup · Architecture Overview · Exporting Models to ONNX · Loading an ONNX Model · Text Generation Pipeline · Building the CLI Chat Interface · Going Further · Conversation Memory · Temperature & Top-p Sampling · Streaming Tokens · Performance Optimizations · Testing · Deployment Considerations · Conclusion. TLDR ...

July 6, 2025 · 8 min · 1684 words · Necati Demir