AI-powered OCR for .NET using GPT-4 Vision, Claude 3, and Gemini. Extract text from images, documents, screenshots, receipts, and scanned PDFs. Supports handwriting recognition, table extraction, and structured output through a simple async API with multi-model support.
dotnet add package ForeverTools.OCR
Get your API key at aimlapi.com.
using ForeverTools.OCR;
var client = new OcrClient("your-api-key");
// Extract text from an image file
var result = await client.ExtractTextFromFileAsync("document.png");
if (result.Success)
{
    Console.WriteLine(result.Text);
}
// Extract from URL
var urlResult = await client.ExtractTextFromUrlAsync("https://example.com/image.jpg");
// Extract from bytes
byte[] imageData = File.ReadAllBytes("photo.jpg");
var bytesResult = await client.ExtractTextAsync(imageData);
Choose the best model for your use case:
// General purpose (default) - best balance
var result = await client.ExtractTextFromFileAsync("doc.png", OcrModels.Gpt4o);
// Fast and cheap - for clear printed text
var fast = await client.ExtractTextFromFileAsync("screenshot.png", OcrModels.Gpt4oMini);
// Handwriting recognition - highest accuracy
var handwriting = await client.ExtractTextFromFileAsync("notes.jpg", OcrModels.Claude3Opus);
// Non-English text - best multilingual support
var foreign = await client.ExtractTextFromFileAsync("document.png", OcrModels.Gemini15Pro);
// Use recommendations helper
var receipt = await client.ExtractTextFromFileAsync("receipt.jpg", OcrModels.Recommendations.Receipts);
// Structured extraction: paragraphs and typed text blocks
// (imageBytes is a byte[] holding the image data)
var result = await client.ExtractStructuredAsync(imageBytes);
Console.WriteLine("Paragraphs:");
foreach (var paragraph in result.Paragraphs)
{
    Console.WriteLine($" - {paragraph}");
}
Console.WriteLine("\nText Blocks:");
foreach (var block in result.Blocks)
{
    Console.WriteLine($" [{block.BlockType}] {block.Text}");
}
// Table extraction: each table can be exported as CSV
var result = await client.ExtractTablesAsync(imageBytes);
foreach (var table in result.Tables)
{
    Console.WriteLine($"Table: {table.ColumnCount} columns, {table.RowCount} rows");
    Console.WriteLine(table.ToCsv());
}
// Form field extraction: key/value pairs
var result = await client.ExtractFormFieldsAsync(imageBytes);
foreach (var field in result.Fields)
{
    Console.WriteLine($"{field.Key}: {field.Value}");
}
// Get a specific field by name
var name = result.GetField("Name");
var date = result.GetField("Date");
// Receipt extraction: merchant, totals, and line items
var result = await client.ExtractReceiptAsync(imageBytes);
Console.WriteLine($"Merchant: {result.MerchantName}");
Console.WriteLine($"Date: {result.Date}");
Console.WriteLine($"Total: {result.Total}");
Console.WriteLine($"Tax: {result.Tax}");
Console.WriteLine("\nItems:");
foreach (var item in result.Items)
{
    Console.WriteLine($" {item.Description} x{item.Quantity} = {item.TotalPrice}");
}
Use custom prompts for specialized extraction:
var result = await client.ExtractWithPromptAsync(
    imageBytes,
    "Extract only the email addresses and phone numbers from this business card. Return as JSON with 'emails' and 'phones' arrays."
);
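Because the prompt above asks for JSON, the reply text can be parsed with System.Text.Json. The following is a minimal sketch, assuming the reply is the JSON object itself, possibly wrapped in a Markdown code fence. `ContactInfo` and `ContactParser` are hypothetical names and not part of ForeverTools.OCR, and this README does not show which property of the result carries the reply, so the sketch takes a plain string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical container for the two arrays requested in the prompt.
public sealed record ContactInfo(IReadOnlyList<string> Emails, IReadOnlyList<string> Phones);

public static class ContactParser
{
    // Three backticks, built at runtime so the marker never appears literally here.
    private static readonly string Fence = new string('`', 3);

    // Parses a reply of the form {"emails": [...], "phones": [...]}.
    // Throws JsonException if the text is not valid JSON.
    public static ContactInfo Parse(string reply)
    {
        var json = StripCodeFence(reply);
        using var doc = JsonDocument.Parse(json);
        return new ContactInfo(
            ReadStrings(doc.RootElement, "emails"),
            ReadStrings(doc.RootElement, "phones"));
    }

    // Removes an opening fence line and a closing fence, if both are present.
    private static string StripCodeFence(string text)
    {
        var trimmed = text.Trim();
        if (!trimmed.StartsWith(Fence, StringComparison.Ordinal)) return trimmed;
        var firstNewline = trimmed.IndexOf('\n');
        var lastFence = trimmed.LastIndexOf(Fence, StringComparison.Ordinal);
        if (firstNewline < 0 || lastFence <= firstNewline) return trimmed;
        return trimmed.Substring(firstNewline + 1, lastFence - firstNewline - 1).Trim();
    }

    // Returns the string items of an array property, or an empty list if it is missing.
    private static IReadOnlyList<string> ReadStrings(JsonElement root, string name)
    {
        var items = new List<string>();
        if (root.ValueKind == JsonValueKind.Object &&
            root.TryGetProperty(name, out var array) &&
            array.ValueKind == JsonValueKind.Array)
        {
            foreach (var item in array.EnumerateArray())
            {
                if (item.ValueKind == JsonValueKind.String)
                    items.Add(item.GetString()!);
            }
        }
        return items;
    }
}
```

Models do not always follow formatting instructions exactly, so callers may want to catch `JsonException` and fall back to the raw text.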
// Program.cs
builder.Services.AddForeverToolsOcr("your-api-key");
// Or with options
builder.Services.AddForeverToolsOcr(options =>
{
    options.ApiKey = "your-api-key";
    options.DefaultModel = OcrModels.Gpt4o;
    options.TimeoutSeconds = 90;
    options.MaxTokens = 4096;
    options.ImageDetail = "high"; // "low", "high", or "auto"
});
// Or from configuration
builder.Services.AddForeverToolsOcr(builder.Configuration);
appsettings.json:
{
  "OCR": {
    "ApiKey": "your-api-key",
    "DefaultModel": "gpt-4o",
    "TimeoutSeconds": 60,
    "MaxTokens": 4096,
    "ImageDetail": "auto"
  }
}
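A misspelled key or an out-of-range value in this section is easier to catch at startup than on the first OCR call. The sketch below is one possible pre-flight check, not something the package is known to provide: `OcrSettings` and `OcrSettingsValidator` are hypothetical names, the property defaults simply repeat the sample file, and the allowed `ImageDetail` values come from the options comment above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical mirror of the "OCR" section; not a type from ForeverTools.OCR.
public sealed class OcrSettings
{
    public string ApiKey { get; set; } = "";
    public string DefaultModel { get; set; } = "gpt-4o";
    public int TimeoutSeconds { get; set; } = 60;
    public int MaxTokens { get; set; } = 4096;
    public string ImageDetail { get; set; } = "auto";
}

public static class OcrSettingsValidator
{
    private static readonly HashSet<string> AllowedDetail =
        new(StringComparer.OrdinalIgnoreCase) { "low", "high", "auto" };

    // Reads the "OCR" object from the text of an appsettings.json file.
    public static OcrSettings Load(string appSettingsJson)
    {
        using var doc = JsonDocument.Parse(appSettingsJson);
        if (!doc.RootElement.TryGetProperty("OCR", out var section))
            throw new InvalidOperationException("Missing \"OCR\" section.");
        return JsonSerializer.Deserialize<OcrSettings>(section.GetRawText())
               ?? throw new InvalidOperationException("\"OCR\" section is null.");
    }

    // Returns a list of problems; an empty list means the settings look usable.
    public static IReadOnlyList<string> Validate(OcrSettings settings)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(settings.ApiKey))
            errors.Add("ApiKey is empty.");
        if (settings.TimeoutSeconds <= 0)
            errors.Add("TimeoutSeconds must be positive.");
        if (settings.MaxTokens <= 0)
            errors.Add("MaxTokens must be positive.");
        if (!AllowedDetail.Contains(settings.ImageDetail))
            errors.Add("ImageDetail must be \"low\", \"high\", or \"auto\".");
        return errors;
    }
}
```

Calling `OcrSettingsValidator.Validate(OcrSettingsValidator.Load(File.ReadAllText("appsettings.json")))` before registering the client would surface configuration mistakes as a readable list.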
Use in your services:
public class DocumentService
{
    private readonly OcrClient _ocr;

    public DocumentService(OcrClient ocr)
    {
        _ocr = ocr;
    }

    public async Task<string> ProcessDocument(byte[] imageData)
    {
        var result = await _ocr.ExtractTextAsync(imageData);
        // Surface OCR failures as an exception carrying the reported error
        return result.Success ? result.Text : throw new InvalidOperationException(result.Error);
    }
}
// Uses AIML_API_KEY by default
var client = OcrClient.FromEnvironment();
// Or specify a custom variable
var customClient = OcrClient.FromEnvironment("MY_OCR_KEY");
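How `FromEnvironment` behaves when the variable is missing is not documented here, so an explicit check can give a clearer failure message. A minimal sketch, assuming only the `AIML_API_KEY` default named in the comment above; `ApiKeyEnvironment` is a hypothetical helper, not part of the package.

```csharp
using System;

// Hypothetical helper; not part of ForeverTools.OCR.
public static class ApiKeyEnvironment
{
    // Default variable name, taken from the comment in this README.
    public const string DefaultVariable = "AIML_API_KEY";

    // Returns the trimmed key stored in the variable, or throws a descriptive error
    // if the variable is unset or contains only whitespace.
    public static string Require(string variableName = DefaultVariable)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value.Trim();
    }
}
```

The returned key can then be passed to the constructor shown earlier, for example `new OcrClient(ApiKeyEnvironment.Require())`.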
var result = await client.ExtractTextFromFileAsync("document.png");
if (result.Success)
{
    Console.WriteLine($"Text: {result.Text}");
    Console.WriteLine($"Model: {result.Model}");
    Console.WriteLine($"Tokens: {result.TokensUsed}");
    Console.WriteLine($"Time: {result.ProcessingTimeMs}ms");
}
else
{
    Console.WriteLine($"Error: {result.Error}");
}
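Since failures are reported through `Success` rather than only through exceptions, a caller can retry an unsuccessful call. The helper below is a generic sketch under the assumption that some failures are transient (rate limits, timeouts), which this README does not confirm. `Retry.UntilSuccessAsync` is a hypothetical name and works with any result type.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of ForeverTools.OCR.
public static class Retry
{
    // Runs the operation until isSuccess returns true or maxAttempts calls were made.
    // Waits baseDelay after the first failure and doubles the wait after each further one.
    // Returns the last result; exceptions thrown by the operation are not caught.
    public static async Task<T> UntilSuccessAsync<T>(
        Func<Task<T>> operation,
        Func<T, bool> isSuccess,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = baseDelay ?? TimeSpan.FromSeconds(1);
        var result = await operation();

        for (var attempt = 1; attempt < maxAttempts && !isSuccess(result); attempt++)
        {
            await Task.Delay(delay);
            delay += delay; // double the wait before the next attempt
            result = await operation();
        }

        return result;
    }
}
```

With the client from the examples above, a call might look like `await Retry.UntilSuccessAsync(() => client.ExtractTextFromFileAsync("document.png"), r => r.Success)`. Each retry is a new billable request, so a small `maxAttempts` is the safer default.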
| Model | Best For | Speed | Accuracy |
|---|---|---|---|
| gpt-4o | General purpose | Fast | High |
| gpt-4o-mini | Screenshots, clear text | Fastest | Good |
| gpt-4-turbo | Scanned documents | Medium | High |
| claude-3-5-sonnet | Forms, structured docs | Fast | High |
| claude-3-opus | Handwriting, critical docs | Slow | Highest |
| gemini-1.5-pro | Non-English, multilingual | Medium | High |
| gemini-1.5-flash | Quick tasks | Fast | Good |
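The table above can be encoded as a small lookup so that application code chooses a model by document type instead of hard-coding IDs. This is a sketch only: `DocumentKind` and `ModelPicker` are hypothetical names, the model ID strings are copied from the table, and passing such a string where the `OcrModels` constants are used is an assumption (the appsettings.json sample does use `"gpt-4o"` as `DefaultModel`).

```csharp
// Hypothetical document categories, one per row of the table.
public enum DocumentKind
{
    General,
    Screenshot,
    ScannedDocument,
    Form,
    Handwriting,
    Multilingual,
    QuickTask
}

// Hypothetical helper; not part of ForeverTools.OCR.
public static class ModelPicker
{
    // Maps a document category to the model ID listed for it in the table.
    // Unknown values fall back to the general-purpose model.
    public static string ModelIdFor(DocumentKind kind) => kind switch
    {
        DocumentKind.Screenshot => "gpt-4o-mini",
        DocumentKind.ScannedDocument => "gpt-4-turbo",
        DocumentKind.Form => "claude-3-5-sonnet",
        DocumentKind.Handwriting => "claude-3-opus",
        DocumentKind.Multilingual => "gemini-1.5-pro",
        DocumentKind.QuickTask => "gemini-1.5-flash",
        _ => "gpt-4o",
    };
}
```

Keeping the mapping in one place makes it easy to adjust when model availability or pricing changes.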
MIT License - see LICENSE file for details.