Unofficial C# .NET Google GenerativeAI (Gemini Pro) SDK based on REST APIs.
This new version is a complete rewrite of the previous SDK, designed to improve performance, flexibility, and ease of
use. It seamlessly integrates with LangChain.net, providing easy methods for
JSON-based interactions and function calling with Google Gemini models.
This release merges the best of the old version with these new capabilities, giving the SDK a smoother developer experience and a wide range of features for working with Google Gemini.
Use this library to access Google Gemini (Generative AI) models easily. You can start by installing the NuGet package and obtaining the necessary API key from your Google account.
Below are two common ways to initialize and use the SDK. For a full list of supported approaches, please refer to our Wiki Page.
Obtain an API Key
Visit Google AI Studio and generate your API key.
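Rather than hard-coding the key in source (as the snippets below do for brevity), you can read it from an environment variable. This is a sketch of that pattern; the variable name GOOGLE_API_KEY is our own convention, not something the SDK mandates:

```csharp
using System;

// Read the API key from an environment variable instead of embedding it in source.
// The variable name "GOOGLE_API_KEY" is a convention chosen here, not required by the SDK.
var apiKey = Environment.GetEnvironmentVariable("GOOGLE_API_KEY")
             ?? throw new InvalidOperationException(
                 "Set the GOOGLE_API_KEY environment variable before running.");
Console.WriteLine($"Key loaded ({apiKey.Length} characters).");
```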
Install the NuGet Package
You can install the package via NuGet Package Manager:
Install-Package Google_GenerativeAI
Or using the .NET CLI:
dotnet add package Google_GenerativeAI
Initialize GoogleAi
Provide the API key when creating an instance of the GoogleAi class:
var googleAI = new GoogleAi("Your_API_Key");
Obtain a GenerativeModel
Create a generative model using a model name (for example, "models/gemini-1.5-flash"):
var model = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
Generate Content
Call the GenerateContentAsync method to get a response:
var response = await model.GenerateContentAsync("How is the weather today?");
Console.WriteLine(response.Text());
Full Code at a Glance
var apiKey = "YOUR_GOOGLE_API_KEY";
var googleAI = new GoogleAi(apiKey);
var googleModel = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
var googleResponse = await googleModel.GenerateContentAsync("How is the weather today?");
Console.WriteLine("Google AI Response:");
Console.WriteLine(googleResponse.Text());
Console.WriteLine();
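Since GenerateContentAsync goes over the network, transient failures are possible. A small retry helper like the sketch below can wrap any of the async calls in this README; the helper is our own illustration and not part of the SDK:

```csharp
using System;
using System.Threading.Tasks;

public static class RetryHelper
{
    // Retry an async operation with simple exponential backoff.
    // Not part of the SDK -- a generic pattern to wrap around calls like GenerateContentAsync.
    public static async Task<T> WithRetriesAsync<T>(
        Func<Task<T>> operation, int maxAttempts = 3, int baseDelayMs = 500)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Back off: 500 ms, 1000 ms, 2000 ms, ...
                await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}
```

Usage would look like `var response = await RetryHelper.WithRetriesAsync(() => googleModel.GenerateContentAsync("How is the weather today?"));`.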
Install the Google Cloud SDK (CLI)
By default, Vertex AI uses Application Default Credentials (ADC).
Follow Google’s official instructions to install and set up the Google
Cloud CLI.
Initialize VertexAI
Once the SDK is set up locally, create an instance of the VertexAI class:
var vertexAI = new VertexAI();
Obtain a GenerativeModel
Just like with GoogleAI, choose a model name and create the generative model:
var vertexModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
Generate Content
Use the GenerateContentAsync method to produce text:
var response = await vertexModel.GenerateContentAsync("Hello from Vertex AI!");
Console.WriteLine(response.Text());
Full Code at a Glance
var vertexAI = new VertexAI(); // uses the Google Cloud CLI's ADC to obtain the access token
var vertexModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
var vertexResponse = await vertexModel.GenerateContentAsync("Hello from Vertex AI!");
Console.WriteLine("Vertex AI Response:");
Console.WriteLine(vertexResponse.Text());
For multi-turn, conversational use cases, you can start a chat session by calling the StartChat method on an instance
of GenerativeModel. You can use any of the previously mentioned initialization methods (environment variables, direct
constructor, configuration files, ADC, service accounts, etc.) to set up credentials for your AI service first. Then you
would:
1. Create a GenerativeModel instance (e.g., via googleAI.CreateGenerativeModel(...) or vertexAI.CreateGenerativeModel(...)).
2. Call StartChat() on the generated model to initialize a conversation.
3. Use GenerateContentAsync(...) to exchange messages in the conversation.
Below is an example using the model name "gemini-1.5-flash":
// Example: Starting a chat session with a Google AI GenerativeModel
// 1) Initialize your AI instance (GoogleAi) with credentials or environment variables
var googleAI = new GoogleAi("YOUR_GOOGLE_API_KEY");
// 2) Create a GenerativeModel using the model name "gemini-1.5-flash"
var generativeModel = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
// 3) Start a chat session from the GenerativeModel
var chatSession = generativeModel.StartChat();
// 4) Send and receive messages
var firstResponse = await chatSession.GenerateContentAsync("Welcome to the Gemini 1.5 Flash chat!");
Console.WriteLine("First response: " + firstResponse.Text());
// Continue the conversation
var secondResponse = await chatSession.GenerateContentAsync("How can you help me with my AI development?");
Console.WriteLine("Second response: " + secondResponse.Text());
The same approach applies if you’re using Vertex AI:
// Example: Starting a chat session with a Vertex AI GenerativeModel
// 1) Initialize your AI instance (VertexAI) using one of the available authentication methods
var vertexAI = new VertexAI();
// 2) Create a GenerativeModel using "gemini-1.5-flash"
var generativeModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
// 3) Start a chat
var chatSession = generativeModel.StartChat();
// 4) Send a chat message and read the response
var response = await chatSession.GenerateContentAsync("Hello from Vertex AI Chat using Gemini 1.5 Flash!");
Console.WriteLine(response.Text());
The GenerativeAI SDK supports streaming responses, allowing you to receive and process parts of the model's output as they become available, rather than waiting for the entire response to be generated. This is particularly useful for long-running generation tasks or for creating more responsive user interfaces.
StreamContentAsync(): Use this method for streaming text responses. It returns an IAsyncEnumerable<GenerateContentResponse>, which you can iterate over using await foreach.
Example (StreamContentAsync()):
using GenerativeAI;
// ... (Assume model is already initialized) ...
var prompt = "Write a long story about a cat.";
await foreach (var chunk in model.StreamContentAsync(prompt))
{
Console.Write(chunk.Text); // Print each chunk as it arrives
}
Console.WriteLine(); // Newline after the complete response
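If you also need the complete text once streaming finishes, accumulate the chunks as they arrive. The sketch below simulates the chunk stream with a local IAsyncEnumerable<string> so the aggregation pattern is visible without a network call; with the real SDK you would append chunk.Text inside the same await foreach shown above:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Stand-in for model.StreamContentAsync(prompt): yields text chunks one at a time.
    public static async IAsyncEnumerable<string> FakeChunksAsync()
    {
        foreach (var chunk in new[] { "Once upon a time, ", "a cat ", "ruled the house." })
        {
            await Task.Yield(); // simulate asynchronous arrival
            yield return chunk;
        }
    }

    public static async Task<string> CollectAsync()
    {
        var full = new StringBuilder();
        await foreach (var chunk in FakeChunksAsync())
        {
            Console.Write(chunk);   // show each chunk as it arrives
            full.Append(chunk);     // and keep it for the final text
        }
        Console.WriteLine();
        return full.ToString();
    }
}
```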
GenerateContentAsync Methods
Google Gemini models can work with more than just text – they can handle images, audio, and videos too! This opens up a lot of possibilities for developers. The GenerativeAI SDK makes it super easy to use these features.
Below are several examples showcasing how to incorporate files into your AI prompts:
If you have a file available locally, simply pass in the file path:
// Generate content from a local file (e.g., an image)
var response = await geminiModel.GenerateContentAsync(
"Describe the details in this uploaded image",
@"C:\path\to\local\image.jpg"
);
Console.WriteLine(response.Text());
When your file is hosted remotely, provide the file URI and its corresponding MIME type:
// Generate content from a remote file (e.g., a PDF)
var response = await geminiModel.GenerateContentAsync(
"Summarize the information in this PDF document",
"https://example.com/path/to/sample.pdf",
"application/pdf"
);
Console.WriteLine(response.Text());
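Remote files require an explicit MIME type. If you only have the file name or URL, a small lookup table covers the common cases; this is our own helper for illustration, not an SDK API:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MimeHelper
{
    // Minimal extension-to-MIME map for the file types shown in this README.
    // Our own helper -- the SDK does not require you to use it.
    private static readonly Dictionary<string, string> Map =
        new(StringComparer.OrdinalIgnoreCase)
        {
            [".jpg"] = "image/jpeg",
            [".jpeg"] = "image/jpeg",
            [".png"] = "image/png",
            [".pdf"] = "application/pdf",
            [".mp3"] = "audio/mpeg",
            [".mp4"] = "video/mp4",
        };

    public static string MimeTypeFor(string fileNameOrUrl) =>
        Map.TryGetValue(Path.GetExtension(fileNameOrUrl), out var mime)
            ? mime
            : "application/octet-stream"; // safe fallback for unknown extensions
}
```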
For granular control, you can create a GenerateContentRequest, set a prompt, and attach one or more files (local or
remote) before calling GenerateContentAsync:
// Create a request with a text prompt
var request = new GenerateContentRequest();
request.AddText("Describe what's in this document");
// Attach a local file
request.AddInlineFile(@"C:\files\example.png");
// Attach a remote file with its MIME type
request.AddRemoteFile("https://example.com/path/to/sample.pdf", "application/pdf");
// Generate the content with attached files
var response = await geminiModel.GenerateContentAsync(request);
Console.WriteLine(response.Text());
With these overloads and request-based approaches, you can seamlessly integrate additional file-based context into your prompts, enabling richer answers and unlocking more advanced AI-driven workflows.
The GenerativeAI SDK makes it simple to work with JSON data from Gemini. There are several ways to do this; some of them are:
1. Automatic JSON Handling:
Use GenerateObjectAsync<T> to directly get the deserialized object:
var myObject = await model.GenerateObjectAsync<SampleJsonClass>(request);
Use GenerateContentAsync and then ToObject<T> to deserialize the response:
var response = await model.GenerateContentAsync<SampleJsonClass>(request);
var myObject = response.ToObject<SampleJsonClass>();
Request: Use the UseJsonMode<T> extension method when creating your GenerateContentRequest. This tells the SDK
to expect a JSON response of the specified type.
var request = new GenerateContentRequest();
request.UseJsonMode<SampleJsonClass>();
request.AddText("Give me a really good response.");
2. Manual JSON Parsing:
Request: Create a standard GenerateContentRequest.
var request = new GenerateContentRequest();
request.AddText("Give me some JSON.");
or
var request = new GenerateContentRequest();
request.GenerationConfig = new GenerationConfig()
{
ResponseMimeType = "application/json",
ResponseSchema = new SampleJsonClass()
};
request.AddText("Give me a really good response.");
Response: Use ExtractJsonBlocks() to get the raw JSON blocks from the response, and then use ToObject<T> to
deserialize them.
var response = await model.GenerateContentAsync(request);
var jsonBlocks = response.ExtractJsonBlocks();
var myObjects = jsonBlocks.Select(block => block.ToObject<SampleJsonClass>());
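Under the hood, deserializing an extracted JSON block is ordinary .NET JSON handling. The sketch below shows the equivalent step with System.Text.Json, using a hypothetical shape for SampleJsonClass (the SDK's ToObject<T> may use different serializer settings):

```csharp
using System;
using System.Text.Json;

// Hypothetical shape for SampleJsonClass -- adjust to your actual schema.
public class SampleJsonClass
{
    public string Name { get; set; } = "";
    public int Score { get; set; }
}

public static class JsonDemo
{
    // Roughly what deserializing one extracted JSON block amounts to.
    public static SampleJsonClass Parse(string jsonBlock)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<SampleJsonClass>(jsonBlock, options)
               ?? throw new InvalidOperationException("Model returned empty JSON.");
    }
}
```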
These options give you flexibility in how you handle JSON data with the GenerativeAI SDK.
Read the wiki for more options.
The GenerativeAI SDK provides built-in tools to enhance Gemini's capabilities, including Google Search, Google Search Retrieval, and Code Execution. These tools allow Gemini to interact with the outside world and perform actions beyond generating text.
1. Inbuilt Tools (GoogleSearch, GoogleSearchRetrieval, and Code Execution):
You can easily enable or disable these tools by setting the corresponding properties on the GenerativeModel:
UseGoogleSearch: Enables or disables the Google Search tool.
UseGrounding: Enables or disables the Google Search Retrieval tool (often used for grounding responses in factual information).
UseCodeExecutionTool: Enables or disables the Code Execution tool.
// Example: Enabling Google Search and Code Execution
var model = new GenerativeModel(apiKey: "YOUR_API_KEY");
model.UseGoogleSearch = true;
model.UseCodeExecutionTool = true;
// Example: Disabling all inbuilt tools.
var model = new GenerativeModel(apiKey: "YOUR_API_KEY");
model.UseGoogleSearch = false;
model.UseGrounding = false;
model.UseCodeExecutionTool = false;
2. Function Calling
Function calling lets you integrate custom functionality with Gemini by defining functions it can call. This requires
the GenerativeAI.Tools package.
FunctionCallingBehaviour: Customize behavior (e.g., auto-calling, error handling) using the GenerativeModel's
FunctionCallingBehaviour property:
FunctionEnabled (default: true): Enables/disables function calling.
AutoCallFunction (default: true): Gemini automatically calls functions.
AutoReplyFunction (default: true): Gemini automatically generates responses after function calls.
AutoHandleBadFunctionCalls (default: false): Attempts to handle errors from incorrect calls.
Quickly wrap an inline function using reflection. This approach is ideal for rapid prototyping.
// Define a QuickTool using an inline async function
var quickTool = new QuickTool(
async ([Description("Query a student record")] QueryStudentRecordRequest query) =>
{
return new StudentRecord
{
StudentId = "12345",
FullName = query.FullName,
EnrollmentDate = DateTime.UtcNow
};
},
"GetStudentRecord",
"Retrieve a student record"
);
// Add the function tool to your generative model
var model = new GenerativeModel("YOUR_API_KEY", GoogleAIModels.Gemini2Flash);
model.AddFunctionTool(quickTool);
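The QuickTool example above references request/record types that are not shown. Hypothetical definitions might look like this; the property names are assumptions for illustration only:

```csharp
using System;
using System.ComponentModel;

// Hypothetical input type for the QuickTool delegate above.
public class QueryStudentRecordRequest
{
    [Description("Student's full name")]
    public string FullName { get; set; } = "";
}

// Hypothetical return type produced by the tool.
public class StudentRecord
{
    public string StudentId { get; set; } = "";
    public string FullName { get; set; } = "";
    public DateTime EnrollmentDate { get; set; }
}
```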
Annotate a method with FunctionToolAttribute for automatic tool generation. This method is best for a small set of functions defined as static methods.
[FunctionTool(GoogleFunctionTool = true)]
[Description("Get book page content")]
public static Task<string> GetBookPageContentAsync(string bookName, int pageNumber)
{
return Task.FromResult($"Content for {bookName} on page {pageNumber}");
}
// Create the model and add the function as a tool
var model = new GenerativeModel("YOUR_API_KEY", GoogleAIModels.Gemini2Flash);
model.AddFunctionTool(new Tools(new[] { GetBookPageContentAsync }));
Define an interface for a reusable set of functions. This approach is great for structured and maintainable code.
[GenerateJsonSchema(GoogleFunctionTool = true)]
public interface IWeatherFunctions
{
[Description("Get current weather")]
Weather GetCurrentWeather(string location);
}
public class WeatherService : IWeatherFunctions
{
public Weather GetCurrentWeather(string location) =>
new Weather { Location = location, Temperature = 25.0, Description = "Sunny" };
}
// Use the generated extension method to add the tool to your model
var service = new WeatherService();
var model = new GenerativeModel("YOUR_API_KEY", GoogleAIModels.Gemini2Flash);
model.AddFunctionTool(service.AsGoogleFunctionTool());
For more details and options, see the wiki.
Integrate MCP servers to expose tools from any MCP-compatible server to Gemini. Supports all transport protocols: stdio, HTTP/SSE, and custom transports.
Stdio Transport (Launch MCP server as subprocess):
// Create stdio transport
var transport = McpTransportFactory.CreateStdioTransport(
"my-server",
"npx",
new[] { "-y", "@modelcontextprotocol/server-everything" }
);
using var mcpTool = await McpTool.CreateAsync(transport);
var model = new GenerativeModel("YOUR_API_KEY", GoogleAIModels.Gemini2Flash);
model.AddFunctionTool(mcpTool);
model.FunctionCallingBehaviour.AutoCallFunction = true;
HTTP/SSE Transport (Connect to remote MCP server):
// Create HTTP transport
var transport = McpTransportFactory.CreateHttpTransport("http://localhost:8080");
// Or with authentication
var authTransport = McpTransportFactory.CreateHttpTransportWithAuth(
"https://api.example.com",
"your-auth-token"
);
using var mcpTool = await McpTool.CreateAsync(transport);
model.AddFunctionTool(mcpTool);
Multiple MCP Servers:
var transports = new List<IClientTransport>
{
McpTransportFactory.CreateStdioTransport("server1", "npx", new[] { "..." }),
McpTransportFactory.CreateHttpTransport("http://localhost:8080")
};
var mcpTools = await McpTool.CreateMultipleAsync(transports);
foreach (var tool in mcpTools)
{
model.AddFunctionTool(tool);
}
Key Features:
For detailed documentation and examples, see:
The Google_GenerativeAI SDK enables seamless integration with the Google Imagen image generator and the Image Text Model for tasks such as image captioning and visual question answering. It provides two model classes:
Below is a snippet demonstrating how to initialize an image generation model and generate an image:
// 1. Create a Google AI client
var googleAi = new GoogleAi(apiKey);
// 2. Create the Imagen model instance with your chosen model name.
var imageModel = googleAi.CreateImageModel("imagen-3.0-generate-002");
// 3. Generate images by providing a text prompt.
var response = await imageModel.GenerateImagesAsync("A peaceful forest clearing at sunrise");
// The response contains the generated image(s).
For captioning or visual QA tasks:
// 1. Create a Vertex AI client (example shown here).
var vertexAi = new VertexAI(projectId, region);
// 2. Instantiate the ImageTextModel.
var imageTextModel = vertexAi.CreateImageTextModel();
// 3. Generate captions or perform visual QA.
var captionResult = await imageTextModel.GenerateImageCaptionFromLocalFileAsync("path/to/local/image.jpg");
var vqaResult = await imageTextModel.VisualQuestionAnsweringFromLocalFileAsync("What is in the picture?", "path/to/local/image.jpg");
// Results now contain the model's captions or answers.
The Google_GenerativeAI SDK now conveniently supports the Google Multimodal Live API through the
Google_GenerativeAI.Live package. This module enables real-time, interactive conversations with Gemini models by
leveraging WebSockets for text and audio data exchange. It’s ideally suited for building live, multimodal
experiences, such as chat or voice-enabled applications.
The Google_GenerativeAI.Live package provides a comprehensive implementation of the Multimodal Live API, offering:
To leverage the Multimodal Live API in your project, you’ll need to install the Google_GenerativeAI.Live NuGet package
and create a MultiModalLiveClient. Here’s a quick overview:
Install the Google_GenerativeAI.Live package via NuGet:
Install-Package Google_GenerativeAI.Live
With the MultiModalLiveClient, interacting with the Multimodal Live API is simple:
using GenerativeAI.Live;
public async Task RunLiveConversationAsync()
{
var client = new MultiModalLiveClient(
platformAdapter: new GoogleAIPlatformAdapter(),
modelName: "gemini-1.5-flash-exp",
generationConfig: new GenerationConfig { ResponseModalities = { Modality.TEXT, Modality.AUDIO } },
safetySettings: null,
systemInstruction: "You are a helpful assistant."
);
client.Connected += (s, e) => Console.WriteLine("Connected!");
client.TextChunkReceived += (s, e) => Console.WriteLine($"Text chunk: {e.TextChunk}");
client.AudioChunkReceived += (s, e) => Console.WriteLine($"Audio received: {e.Buffer.Length} bytes");
await client.ConnectAsync();
await client.SentTextAsync("Hello, Gemini! What's the weather like?");
await client.SendAudioAsync(audioData: new byte[] { /* audio bytes */ }, audioContentType: "audio/pcm; rate=16000");
Console.ReadKey();
await client.DisconnectAsync();
}
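The audioContentType above ("audio/pcm; rate=16000") implies raw 16-bit PCM. When batching microphone data for SendAudioAsync it helps to know how much audio a buffer holds; for 16-bit PCM that is bytes / (rate × 2 × channels). A small sketch of the arithmetic (our own helper, not an SDK API):

```csharp
using System;

public static class PcmMath
{
    // Duration of a raw PCM buffer: bytes / (sampleRate * bytesPerSample * channels).
    // Assumes 16-bit (2-byte) samples, matching "audio/pcm; rate=16000".
    public static TimeSpan Duration(int byteCount, int sampleRate, int channels = 1) =>
        TimeSpan.FromSeconds(byteCount / (double)(sampleRate * 2 * channels));
}
```

For example, a 32,000-byte mono buffer at 16 kHz holds one second of audio.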
The MultiModalLiveClient provides various events to plug into for real-time updates during interaction:
The Google_GenerativeAI library makes implementing Retrieval-Augmented Generation (RAG) incredibly easy. RAG
combines the strengths of Large Language Models (LLMs) with the precision of information retrieval. Instead of relying
solely on the LLM's pre-trained knowledge, a RAG system first retrieves relevant information from a knowledge base (
a "corpus" of documents) and then uses that information to augment the LLM's response. This allows the LLM to generate
more accurate, factual, and context-aware answers.
Enhance your Gemini applications with the power of the Vertex RAG Engine. This integration enables your applications to provide more accurate and contextually relevant responses by leveraging your existing knowledge bases.
Benefits:
Code Example:
// Initialize VertexAI with your platform configuration.
var vertexAi = new VertexAI(GetTestVertexAIPlatform());
// Create an instance of the RAG manager for corpus operations.
var ragManager = vertexAi.CreateRagManager();
// Create a new corpus for your knowledge base.
// Optional: Use overload methods to specify a vector database (Pinecone, Weaviate, etc.).
// If no specific vector database is provided, a default one will be used.
var corpus = await ragManager.CreateCorpusAsync("My New Corpus", "My description");
// Import data into the corpus from a specified source.
// Replace GcsSource with the appropriate source (Jira, Slack, SharePoint, etc.) and configure it.
var fileSource = new GcsSource() { /* Configure your GcsSource here */ };
await ragManager.ImportFilesAsync(corpus.Name, fileSource);
// Create a Gemini generative model configured to use the created corpus for RAG.
// The corpusIdForRag parameter links the model to your knowledge base.
var model = vertexAi.CreateGenerativeModel(VertexAIModels.Gemini.Gemini2Flash, corpusIdForRag: corpus.Name);
// Generate content by querying the model.
// The model will retrieve relevant information from the corpus to provide a grounded response.
var result = await model.GenerateContentAsync("query related to the corpus");
Learn More:
For a deeper dive into using the Vertex RAG Engine with the Google_GenerativeAI SDK, please visit the wiki page.
This library integrates Google's Attributed Question Answering (AQA) model to enhance Retrieval-Augmented Generation (RAG) through powerful semantic search and question answering. AQA excels at understanding the intent behind a question and retrieving the most relevant passages from your corpus.
Key Features:
The Google_GenerativeAI library offers a straightforward API for corpus creation, document ingestion, and semantic search execution.
Get Started with Google AQA for RAG:
For a comprehensive guide on implementing semantic search retrieval with Google AQA, refer to the wiki page.
The following features are planned for future releases of the GenerativeAI SDK:
Thanks to HavenDV for the LangChain.net SDK.
Dive deeper into the GenerativeAI SDK! The wiki is your comprehensive resource for:
We encourage you to explore the wiki to unlock the full potential of the GenerativeAI SDK!
Feel free to open an issue or submit a pull request if you encounter any problems or want to propose improvements! Your feedback helps us continue to refine and expand this SDK.