Gemini Nano

Gemma 4: The new standard for local agentic intelligence on Android

Seharakram60 — Sat, 04 Apr 2026 12:09:57 +0000

Posted by Matthew McCullough, VP of Product Management Android Development

Today, we are enhancing Android development with Gemma 4, our latest state-of-the-art open model designed with complex reasoning and autonomous tool-calling capabilities.

Our vision is to enable local agentic AI on Android across the entire software lifecycle, from development to production. Android supports a range of Gemma 4 models, from the most efficient ones running directly on-device in your apps to more powerful ones running on your development machine to help you build apps. We are bringing Gemma 4 to Android developers through two pillars:

Local-first Agentic coding: Experience powerful, local AI code assistance with Gemma 4 in Android Studio in your development computer.
On-device intelligence: Build intelligent experiences using the ML Kit GenAI Prompt API to run Gemma 4 directly on Android device hardware.

Coding with Gemma 4 in Android Studio

When building Android apps, Android Studio can use Gemma 4 to leverage its state-of-the-art reasoning power and native support for tool use, while keeping the model and inference contained entirely on your local machine.

Gemma 4 was trained on Android development and designed with Agent Mode in mind. This means that when you select Gemma 4 as your local model, you can leverage the full suite of Agent Mode capabilities for a variety of Android development use cases, including refactoring legacy code, building an entire app or new features, and applying fixes iteratively.

Learn more about the possibilities Gemma 4 brings to your app development flow and how to get started.

Prototyping with Gemma 4 on-device

Since the introduction of Gemini Nano as the foundation model on Android, it has become available on over 140 million devices. Gemma 4 is the base model for the next generation of Gemini Nano (Gemini Nano 4) that is optimized for performance and quality on Android devices. This model is up to 4x faster than the previous version and uses up to 60% less battery.

To make it as easy as possible to preview and prototype with Gemma 4 E2B and E4B models directly on AICore-supported devices, we’re launching the AICore Developer Preview. While we continue to expand the ML Kit GenAI Prompt API surface to unlock additional advanced capabilities of the model, you can already start exploring new use cases with Gemma 4 using the Prompt API.

Prepare your apps for the launch of the Gemini Nano 4 on the new flagship Android devices later this year by prototyping with Gemma 4 today. Read about the upcoming features and deep dive into AICore Developer Preview and its Gemma 4 support here.

Local agentic intelligence with Gemma 4

Running Gemma 4 locally, you can leverage its advanced reasoning and tool-calling capabilities in your entire workflow, from developing with the AI coding assistant in Android Studio to shipping intelligent features in your app with ML Kit GenAI Prompt API. This local-first approach, available under Gemma’s open Apache license, provides an alternative for developers to innovate in a privacy-centric and cost effective manner. In a future release, we will update Android Bench to include Gemma 4 and other open models, providing the quantified data you need to navigate performance trade-offs and select the best model for your use case.

We can’t wait to see what you build!

The post Gemma 4: The new standard for local agentic intelligence on Android appeared first on InShot Pro.

The latest Gemini Nano with on-device ML Kit GenAI APIs

Seharakram60 — Fri, 22 Aug 2025 16:00:00 +0000

Posted by Caren Chang – Developer Relations Engineer, Joanna (Qiong) Huang – Software Engineer, and Chengji Yan – Software Engineer

The latest version of Gemini Nano, our most powerful multi-modal on-device model, just launched on the Pixel 10 device series and is now accessible through the ML Kit GenAI APIs. Integrate capabilities such as summarization, proofreading, rewriting, and image description directly into your apps.

With GenAI APIs we’re focused on giving you access to the latest version of Gemini Nano while providing consistent quality across devices and model upgrades. Here’s a sneak peak behind the scenes of some of the things we’ve done to achieve this.

Adapting GenAI APIs for the latest Gemini Nano

We want to make it as easy as possible for you to build AI powered features, using the most powerful models. To ensure GenAI APIs provide consistent quality across different model versions, we make many behind the scenes improvements including rigorous evals and adapter training.

Evaluation pipeline: For each supported language, we prepare an evaluation dataset. We then benchmark the evals through a combination of: LLM-based raters, statistical metrics and human raters.
Adapter training: With results from the evaluation pipeline, we then determine if we need to train feature-specific LoRA adapters to be deployed on top of the Gemini Nano base model. By shipping GenAI APIs with LoRA adapters, we ensure each API meets our quality bar regardless of the version of Gemini Nano running on a device.

The latest Gemini Nano performance

One area we’re excited about is how this updated version of Gemini Nano pushes performance even higher, especially the prefix speed – that is how fast the model processes input.

For example, here are results when running text-to-text and image-to-text benchmarks on a Pixel 10 Pro.

	Prefix Speed – Gemini nano-v2 on Pixel 9 Pro	*Prefix Speed – Gemini nano-v2^ on Pixel 10 Pro**	Prefix Speed – Gemini nano-v3 on Pixel 10 Pro
Text-to-text	510 tokens/second	610 tokens/second	940 tokens/second
Image-to-text	510 tokens/second + 0.8 seconds for image encoding	610 tokens/second + 0.7 seconds for image encoding	940 tokens/second + 0.6 seconds for image encoding

^*Experimentation with Gemini nano-v2 on Pixel 10 Pro for benchmarking purposes. All Pixel 10 Pros launched with Gemini nano-v3.

The future of Gemini Nano with GenAI APIs

As we continue to improve the Gemini Nano model, the team is committed to using the same process to ensure consistent and high quality results from GenAI APIs.

We hope this will significantly reduce the effort to integrate Gemini Nano in your Android apps while still allowing you to take full advantage of new versions and their improved capabilites.

Learn more about GenAI APIs

Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples: GenAI API Catalog and ML Kit GenAI APIs quickstart samples.

The post The latest Gemini Nano with on-device ML Kit GenAI APIs appeared first on InShot Pro.

Top 3 things to know for AI on Android at Google I/O ‘25

Seharakram60 — Mon, 16 Jun 2025 16:00:00 +0000

Posted by Kateryna Semenova – Sr. Developer Relations Engineer

AI is reshaping how users interact with their favorite apps, opening new avenues for developers to create intelligent experiences. At Google I/O, we showcased how Android is making it easier than ever for you to build smart, personalized and creative apps. And we’re committed to providing you with the tools needed to innovate across the full development stack in this evolving landscape.

This year, we focused on making AI accessible across the spectrum, from on-device processing to cloud-powered capabilities. Here are the top 3 announcements you need to know for building with AI on Android from Google I/O ‘25:

#1 Leverage the efficiency of Gemini Nano for on-device AI experiences

For on-device AI, we announced a new set of ML Kit GenAI APIs powered by Gemini Nano, our most efficient and compact model designed and optimized for running directly on mobile devices. These APIs provide high-level, easy integration for common tasks including text summarization, proofreading, rewriting content in different styles, and generating image description. Building on-device offers significant benefits such as local data processing and offline availability at no additional cost for inference. To start integrating these solutions explore the ML Kit GenAI documentation, the sample on GitHub and watch the “Gemini Nano on Android: Building with on-device GenAI” talk.

#2 Seamlessly integrate on-device ML/AI with your own custom models

The Google AI Edge platform enables building and deploying a wide range of pretrained and custom models on edge devices and supports various frameworks like TensorFlow, PyTorch, Keras, and Jax, allowing for more customization in apps. The platform now also offers improved support of on-device hardware accelerators and a new AI Edge Portal service for broad coverage of on-device benchmarking and evals. If you are looking for GenAI language models on devices where Gemini Nano is not available, you can use other open models via the MediaPipe LLM Inference API.

Serving your own custom models on-device can pose challenges related to handling large model downloads and updates, impacting the user experience. To improve this, we’ve launched Play for On-Device AI in beta. This service is designed to help developers manage custom model downloads efficiently, ensuring the right model size and speed are delivered to each Android device precisely when needed.

For more information watch “Small language models with Google AI Edge” talk.

#3 Power your Android apps with Gemini Flash, Pro and Imagen using Firebase AI Logic

For more advanced generative AI use cases, such as complex reasoning tasks, analyzing large amounts of data, processing audio or video, or generating images, you can use larger models from the Gemini Flash and Gemini Pro families, and Imagen running in the cloud. These models are well suited for scenarios requiring advanced capabilities or multimodal inputs and outputs. And since the AI inference runs in the cloud any Android device with an internet connection is supported. They are easy to integrate into your Android app by using Firebase AI Logic, which provides a simplified, secure way to access these capabilities without managing your own backend. Its SDK also includes support for conversational AI experiences using the Gemini Live API or generating custom contextual visual assets with Imagen. To learn more, check out our sample on GitHub and watch “Enhance your Android app with Gemini Pro and Flash, and Imagen” session.

These powerful AI capabilities can also be brought to life in immersive Android XR experiences. You can find corresponding documentation, samples and the technical session: “The future is now, with Compose and AI on Android XR“.

Figure 1: Firebase AI Logic integration architecture

Get inspired and start building with AI on Android today

We released a new open source app, Androidify, to help developers build AI-driven Android experiences using Gemini APIs, ML Kit, Jetpack Compose, CameraX, Navigation 3, and adaptive design. Users can create personalized Android bot with Gemini and Imagen via the Firebase AI Logic SDK. Additionally, it incorporates ML Kit pose detection to detect a person in the camera viewfinder. The full code sample is available on GitHub for exploration and inspiration. Discover additional AI examples in our Android AI Sample Catalog.

The original image and Androidifi-ed image

Choosing the right Gemini model depends on understanding your specific needs and the model’s capabilities, including modality, complexity, context window, offline capability, cost, and device reach. To explore these considerations further and see all our announcements in action, check out the AI on Android at I/O ‘25 playlist on YouTube and check out our documentation.

We are excited to see what you will build with the power of Gemini!

The post Top 3 things to know for AI on Android at Google I/O ‘25 appeared first on InShot Pro.

On-device GenAI APIs as part of ML Kit help you easily build with Gemini Nano

Seharakram60 — Fri, 30 May 2025 12:00:23 +0000

Posted by Caren Chang – Developer Relations Engineer, Chengji Yan – Software Engineer, Taj Darra – Product Manager

We are excited to announce a set of on-device GenAI APIs, as part of ML Kit, to help you integrate Gemini Nano in your Android apps.

To start, we are releasing 4 new APIs:

Summarization: to summarize articles and conversations
Proofreading: to polish short text
Rewriting: to reword text in different styles
Image Description: to provide short description for images

Key benefits of GenAI APIs

GenAI APIs are high level APIs that allow for easy integration, similar to existing ML Kit APIs. This means you can expect quality results out of the box without extra effort for prompt engineering or fine tuning for specific use cases.

GenAI APIs run on-device and thus provide the following benefits:

Input, inference, and output data is processed locally
Functionality remains the same without reliable internet connection
No additional cost incurred for each API call

To prevent misuse, we also added safety protection in various layers, including base model training, safety-aware LoRA fine-tuning, input and output classifiers and safety evaluations.

How GenAI APIs are built

There are 4 main components that make up each of the GenAI APIs.

Gemini Nano is the base model, as the foundation shared by all APIs.
Small API-specific LoRA adapter models are trained and deployed on top of the base model to further improve the quality for each API.
Optimized inference parameters (e.g. prompt, temperature, topK, batch size) are tuned for each API to guide the model in returning the best results.
An evaluation pipeline ensures quality in various datasets and attributes. This pipeline consists of: LLM raters, statistical metrics and human raters.

Together, these components make up the high-level GenAI APIs that simplify the effort needed to integrate Gemini Nano in your Android app.

Evaluating quality of GenAI APIs

For each API, we formulate a benchmark score based on the evaluation pipeline mentioned above. This score is based on attributes specific to a task. For example, when evaluating the summarization task, one of the attributes we look at is “grounding” (ie: factual consistency of generated summary with source content).

To provide out-of-box quality for GenAI APIs, we applied feature specific fine-tuning on top of the Gemini Nano base model. This resulted in an increase for the benchmark score of each API as shown below:

Use case in English	Gemini Nano Base Model	ML Kit GenAI API
Summarization	77.2	92.1
Proofreading	84.3	90.2
Rewriting	79.5	84.1
Image Description	86.9	92.3

In addition, this is a quick reference of how the APIs perform on a Pixel 9 Pro:

	Prefix Speed (input processing rate)	Decode Speed (output generation rate)
Text-to-text	510 tokens/second	11 tokens/second
Image-to-text	510 tokens/second + 0.8 seconds for image encoding	11 tokens/second

Sample usage

This is an example of implementing the GenAI Summarization API to get a one-bullet summary of an article:

val articleToSummarize = "We are excited to announce a set of on-device generative AI APIs..."

// Define task with desired input and output format
val summarizerOptions = SummarizerOptions.builder(context)
    .setInputType(InputType.ARTICLE)
    .setOutputType(OutputType.ONE_BULLET)
    .setLanguage(Language.ENGLISH)
    .build()
val summarizer = Summarization.getClient(summarizerOptions)

suspend fun prepareAndStartSummarization(context: Context) {
    // Check feature availability. Status will be one of the following: 
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    val featureStatus = summarizer.checkFeatureStatus().await()

    if (featureStatus == FeatureStatus.DOWNLOADABLE) {
        // Download feature if necessary.
        // If downloadFeature is not called, the first inference request will 
        // also trigger the feature to be downloaded if it's not already
        // downloaded.
        summarizer.downloadFeature(object : DownloadCallback {
            override fun onDownloadStarted(bytesToDownload: Long) { }

            override fun onDownloadFailed(e: GenAiException) { }

            override fun onDownloadProgress(totalBytesDownloaded: Long) {}

            override fun onDownloadCompleted() {
                startSummarizationRequest(articleToSummarize, summarizer)
            }
        })    
    } else if (featureStatus == FeatureStatus.DOWNLOADING) {
        // Inference request will automatically run once feature is      
        // downloaded.
        // If Gemini Nano is already downloaded on the device, the   
        // feature-specific LoRA adapter model will be downloaded very  
        // quickly. However, if Gemini Nano is not already downloaded, 
        // the download process may take longer.
        startSummarizationRequest(articleToSummarize, summarizer)
    } else if (featureStatus == FeatureStatus.AVAILABLE) {
        startSummarizationRequest(articleToSummarize, summarizer)
    } 
}

fun startSummarizationRequest(text: String, summarizer: Summarizer) {
    // Create task request  
    val summarizationRequest = SummarizationRequest.builder(text).build()

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest) { newText -> 
        // Show new text in UI
    }

    // You can also get a non-streaming response from the request
    // val summarizationResult = summarizer.runInference(summarizationRequest)
    // val summary = summarizationResult.get().summary
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close()

For more examples of implementing the GenAI APIs, check out the official documentation and samples on GitHub:

Use cases

Here is some guidance on how to best use the current GenAI APIs:

For Summarization, consider:

Conversation messages or transcripts that involve 2 or more users

Articles or documents less than 4000 tokens (or about 3000 English words). Using the first few paragraphs for summarization is usually good enough to capture the most important information.

For Proofreading and Rewriting APIs, consider utilizing them during the content creation process for short content below 256 tokens to help with tasks such as:

Refining messages in a particular tone, such as more formal or more casual

Polishing personal notes for easier consumption later

For the Image Description API, consider it for:

Generating titles of images

Generating metadata for image search

Utilizing descriptions of images in use cases where the images themselves cannot be displayed, such as within a list of chat messages

Generating alternative text to help visually impaired users better understand content as a whole

GenAI API in production

Envision is an app that verbalizes the visual world to help people who are blind or have low vision lead more independent lives. A common use case in the app is for users to take a picture to have a document read out loud. Utilizing the GenAI Summarization API, Envision is now able to get a concise summary of a captured document. This significantly enhances the user experience by allowing them to quickly grasp the main points of documents and determine if a more detailed reading is desired, saving them time and effort.

Supported devices

GenAI APIs are available on Android devices using optimized MediaTek Dimensity, Qualcomm Snapdragon, and Google Tensor platforms through AICore. For a comprehensive list of devices that support GenAI APIs, refer to our official documentation.

Learn more

Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples on GitHub: AI Catalog GenAI API Samples with Compose, ML Kit GenAI APIs Quickstart.

The post On-device GenAI APIs as part of ML Kit help you easily build with Gemini Nano appeared first on InShot Pro.