Development
Full-Time
Member of Technical Staff - Batched Inference Server
London, in person 2 to 3 days per week at our Farringdon office
Senior
75th to 90th percentile salary based on Ravio's levelling framework + equity
Doubleword · Dawn Capital (Getro)
Posted Jun 28, 2026

High Throughput Inference
Making tokens too cheap to meter

Doubleword is the inference provider for long running agents, evals, and batched jobs. Up to 80% cheaper for the same models, for the workloads where no one is waiting.

Same intelligence, a fraction of the cost
Cost per 1B in + 1B out. Comparable model intelligence.
Doubleword, DeepSeek-V4-Pro: $3,250
OpenAI, GPT-5.2: $15,750 (4.8x)
Anthropic, Claude Opus 4.6: $30,000 (9.2x)

Async Agents, Synthetic Data Generation, Data Processing Pipelines, Embeddings, Async Evals, LangSmith Evals at Scale, Bug Detection Ensemble, Dataset Labelling, Structured Extraction, Image Summarization, AI Personal Assistants, Asynchronous Event Listeners, ETL & Pipeline Sanitization, Deep Research, Long Running Agents, OpenClaw

Built by inference systems engineers
Recent Research
Speculative KV Coding: 4× KV cache compression
Cloudburst: 70× faster cold-starts
Queue Speculation: Drafting while requests wait

Our Bet
The largest volume of tokens comes from asynchronous AI workloads.
Interactive chat is only a small fraction of AI inference.
Inference built for this workload comes with high inference bills and endless rate limits.

The highest-volume AI systems run in the background: agents executing tasks, pipelines processing documents, evaluations running continuously, and models enriching massive datasets.

These workloads are throughput-constrained, not latency-constrained.

Doubleword is built for this future: we've built an inference stack that maximises GPU utilization, throughput, and cost-efficiency for large-scale asynchronous inference.

Background Agents: Long-running agents executing tasks autonomously at scale
Batch Processing: Large-scale document extraction, summarization, and classification
Evaluations: Run evals and benchmarks continuously without rate limits
Data Enrichment: Power tagging, routing, moderation, and ETL pipelines
Synthetic Data: Generate training and fine-tuning datasets at scale
Offline Jobs: Queue and process millions of requests reliably

High throughput inference APIs
Doubleword's APIs are the most efficient for every SLA
OpenAI compatible for easy migration. Full tool calling and structured generation support. Trade latency for cost. Pick the window that fits your workflow.

Realtime (Fastest)
Latency optimized API
For when you're still iterating on your prompts and models.
Chat Completions

Async (Most Popular), 25-50% off
High throughput API
Best for background agents and chained workloads
Chat Completions, Responses, OpenAI Batch

Batch (Best Value), 50-80% off, 24H SLA
Best for big batch jobs
Chat Completions, OpenAI Batch

async_request.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.doubleword.ai/v1",
    api_key="{{apiKey}}"
)

resp = client.responses.create(
    model="Qwen/Qwen3-VL-235B-A22B-Instruct-FP8",
    input="Summarize the history of artificial intelligence.",
    service_tier="flex",
)
print(resp.output_text)

Per-token pricing
Same Intelligence. Fraction of the price.
Cost to process 1 billion tokens in + 1 billion tokens out at comparable intelligence.
Model: DeepSeek-V4-Pro (New) · High throughput API · Batch
Anthropic: $30K
OpenAI: $15.8K
Industry Average: $5.2K
Doubleword: $4.1K
Intelligence via Artificial Analysis Index v4.0 · Want access to a model you don't see here? Just ask us!
No credit card required · No minimum spend · Pay only for tokens used

Workbooks
Built for your highest volume use cases
Production-ready templates you can fork and run today.

Async Agents
Autonomous AI workflows that run without human intervention.

Classification
Categorize, label, and detect patterns in...