Everyone's tutorial shows you how to send a prompt and get a response. Here's how to do it properly in production.
Setup
    composer require openai-php/client

    # .env
    OPENAI_API_KEY=sk-...
    OPENAI_ORGANIZATION=org-...

The Service Class Pattern
Never call the OpenAI client directly from controllers. Wrap it in a service:
    namespace App\Services;

    use OpenAI;
    use OpenAI\Client;

    class AiService
    {
        private Client $client;

        public function __construct()
        {
            // Assumes config/services.php maps 'openai.key' to env('OPENAI_API_KEY').
            $this->client = OpenAI::client(config('services.openai.key'));
        }

        public function complete(string $prompt, array $options = []): string
        {
            // Caller-supplied options override the defaults below.
            $response = $this->client->chat()->create(array_merge([
                'model' => 'gpt-4o',
                'messages' => [['role' => 'user', 'content' => $prompt]],
                'max_tokens' => 1000,
            ], $options));

            // content can be null, so fall back to '' to honor the string return type.
            return $response->choices[0]->message->content ?? '';
        }
    }

Streaming Responses
For long responses, streaming dramatically improves UX. Users see output immediately instead of waiting 10+ seconds:
    public function stream(string $prompt): \Generator
    {
        $stream = $this->client->chat()->createStreamed([
            'model' => 'gpt-4o',
            'messages' => [['role' => 'user', 'content' => $prompt]],
        ]);

        foreach ($stream as $chunk) {
            // Some chunks (role-only or final) carry no text.
            $content = $chunk->choices[0]->delta->content ?? null;

            // Compare explicitly: a bare truthiness check would drop the chunk "0".
            if ($content !== null && $content !== '') {
                yield $content;
            }
        }
    }

Cost Control (Critical in Production)
OpenAI costs can spiral fast. Here's how I keep them in check:
- Cache responses: if users repeat the same questions, cache the AI response in Redis for 24h
- Rate limit per user: use Laravel's built-in rate limiting on your AI endpoints
- Log all API calls: store prompt tokens, completion tokens, and cost per call in the database
- Set model-appropriate defaults: use gpt-4o-mini for simple tasks, gpt-4o only when quality matters
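
The caching and logging points above can be sketched in plain PHP. This is a minimal sketch under stated assumptions: promptCacheKey and estimateCostUsd are hypothetical helpers, not part of openai-php/client, and the per-million-token prices are illustrative placeholders that should be replaced with the figures from OpenAI's current pricing page.

```php
<?php

declare(strict_types=1);

// ASSUMPTION: illustrative prices in USD per 1M tokens. Placeholders only;
// check the current pricing page before relying on them.
const PRICES_PER_MILLION = [
    'gpt-4o'      => ['prompt' => 2.50, 'completion' => 10.00],
    'gpt-4o-mini' => ['prompt' => 0.15, 'completion' => 0.60],
];

/**
 * Build a deterministic cache key (hypothetical helper) so that prompts
 * differing only in whitespace share one cached response.
 */
function promptCacheKey(string $model, string $prompt): string
{
    // Collapse runs of whitespace and trim before hashing.
    $normalized = trim(preg_replace('/\s+/', ' ', $prompt) ?? $prompt);

    // The model is part of the key: different models give different answers.
    return 'ai:' . $model . ':' . hash('sha256', $normalized);
}

/**
 * Estimate the cost of one call (hypothetical helper) from the prompt and
 * completion token counts reported for that call.
 */
function estimateCostUsd(string $model, int $promptTokens, int $completionTokens): float
{
    if (!isset(PRICES_PER_MILLION[$model])) {
        throw new InvalidArgumentException("No price configured for model {$model}");
    }

    $price = PRICES_PER_MILLION[$model];

    return ($promptTokens * $price['prompt']
        + $completionTokens * $price['completion']) / 1_000_000;
}
```

In a Laravel app, the key would typically be handed to Cache::remember() with a 24-hour TTL, and the token counts would come from the usage data on the chat response, stored alongside the estimated cost in your log table. Whitespace is the only thing normalized here on purpose: lowercasing or stemming the prompt would raise the hit rate but risks serving a cached answer to a question that actually differs.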
Prompt Engineering Tips
Your system prompt is your most powerful lever. Be explicit about output format, length, and tone. Use JSON mode for structured outputs.
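
As a rough illustration of those tips, the sketch below builds a request with an explicit system prompt and JSON mode, then decodes the reply defensively. buildJsonChatParams and decodeJsonReply are hypothetical helper names, and the example prompt text is made up. The response_format value is the Chat Completions JSON mode setting. JSON mode expects the messages to mention JSON, and this sketch enforces that on the system prompt only.

```php
<?php

declare(strict_types=1);

/**
 * Build chat parameters with an explicit system prompt and JSON mode.
 * Hypothetical helper for this sketch.
 */
function buildJsonChatParams(
    string $systemPrompt,
    string $userPrompt,
    string $model = 'gpt-4o-mini'
): array {
    // JSON mode needs the word "JSON" in the conversation, so fail early
    // if the system prompt forgot to ask for it.
    if (stripos($systemPrompt, 'json') === false) {
        throw new InvalidArgumentException('System prompt must ask for JSON output.');
    }

    return [
        'model' => $model,
        'messages' => [
            ['role' => 'system', 'content' => $systemPrompt],
            ['role' => 'user', 'content' => $userPrompt],
        ],
        'response_format' => ['type' => 'json_object'],
    ];
}

/**
 * Decode the model's reply, throwing if it is not valid JSON or does not
 * decode to an array. Hypothetical helper for this sketch.
 */
function decodeJsonReply(string $content): array
{
    $data = json_decode($content, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object.');
    }

    return $data;
}

// Explicit about format (JSON keys), length (two sentences), and tone (neutral).
$params = buildJsonChatParams(
    'You are a support assistant. Reply in JSON with keys "summary" '
    . '(at most two sentences) and "sentiment" (positive, neutral, or negative). '
    . 'Keep the tone neutral.',
    'The checkout page keeps timing out and I am losing orders.'
);
```

The resulting $params array has the same shape as the array passed to chat()->create() in the service class, so it can be sent as-is. Decoding with JSON_THROW_ON_ERROR means a truncated reply (for example, one cut off by max_tokens) surfaces as an exception instead of a silent null.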