Integrating Gemini Nano On-Device with Android's Google AI Edge SDK
On-device generative AI has moved from a futuristic concept to a practical option. By running models locally, developers can avoid network round trips, eliminate cloud inference costs, and keep user data on the device.
On Android, Google’s Gemini Nano is the foundation for on-device AI. Using the Google AI Edge SDK, which talks to the system's AICore service, developers can run inference locally and take advantage of hardware accelerators such as the NPU where the device provides them.
In this guide, we will implement an on-device text summarization engine using Kotlin and Jetpack Compose.
Why Gemini Nano?
Gemini Nano is the smallest model in Google’s Gemini family, built to run within the memory and power budget of a phone.
- Privacy First: Sensitive text data never leaves the user’s phone.
- Offline Capability: Works without internet connectivity (e.g., in subways or airplane mode).
- Cost Efficient: No per-request server costs, no matter how many users you have.
Implementing On-Device Inference
First, add the required dependencies to your app’s build.gradle.kts:
dependencies {
    implementation("com.google.android.aicore:google-ai-edge-sdk:1.2.0")
}
Now, initialize the model session and run a text summarization task asynchronously:
import android.content.Context
import com.google.android.edge.ai.GenerativeModel
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map

class OnDeviceSummarizer(context: Context) {

    // Access Gemini Nano through the AI Edge SDK.
    // Hold the application context so the model never keeps an Activity alive.
    private val model = GenerativeModel(
        modelName = "gemini-nano-text",
        context = context.applicationContext
    )

    // One-shot summary: suspends until the full response is available.
    suspend fun summarize(inputText: String): String {
        // Plain concatenation is used because trimIndent() runs after $inputText
        // is interpolated, so multi-line input would defeat it.
        val prompt = "Summarize the following text professionally. " +
            "Use at most 3 bullet points:\n\n$inputText"
        val response = model.generateContent(prompt)
        return response.text ?: "Summarization failed."
    }

    // Streaming responses for long texts: emits partial text as it is generated.
    fun summarizeStream(inputText: String): Flow<String> {
        val prompt = "Summarize: $inputText"
        return model.generateContentStream(prompt).map { it.text ?: "" }
    }
}
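A screen that consumes summarizeStream usually wants the text so far, not the individual pieces. The sketch below shows one way to accumulate them, using a plain Sequence so it runs without Android or coroutines. It assumes each emission is a delta (only the newly generated text) rather than the full text so far, which you should verify against the SDK. The function name accumulateChunks is hypothetical.

```kotlin
// Assumption: every streamed chunk is a delta, not the cumulative text.
// accumulateChunks is an illustrative helper, not part of any SDK.

/** Turns a sequence of deltas into the successive texts a UI would display. */
fun accumulateChunks(chunks: Sequence<String>): Sequence<String> =
    chunks
        .filter { it.isNotEmpty() } // skip empty chunks, e.g. a null text mapped to ""
        .runningReduce { soFar, next -> soFar + next } // append each delta to the text so far
```

For example, the chunks "On-device ", "" and "models" yield "On-device " followed by "On-device models". kotlinx.coroutines provides a runningReduce operator on Flow with the same shape, so the idea carries over to the streaming function above.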
Integrating into Jetpack Compose
To provide a smooth user experience, handle the inference states inside a Compose screen:
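// Assumes Material 3; swap the material3 import if your app uses Material 2.
import androidx.compose.foundation.layout.*
import androidx.compose.material3.*
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.unit.dp
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.launch

@Composable
fun SummaryScreen(summarizer: OnDeviceSummarizer) {
    var textInput by remember { mutableStateOf("") }
    var summaryResult by remember { mutableStateOf("") }
    var isProcessing by remember { mutableStateOf(false) }
    val scope = rememberCoroutineScope()

    Column(modifier = Modifier.padding(16.dp)) {
        OutlinedTextField(
            value = textInput,
            onValueChange = { textInput = it },
            label = { Text("Enter text to summarize") }
        )
        Spacer(modifier = Modifier.height(16.dp))
        Button(
            onClick = {
                isProcessing = true
                scope.launch {
                    try {
                        summaryResult = summarizer.summarize(textInput)
                    } catch (e: CancellationException) {
                        throw e // keep coroutine cancellation working
                    } catch (e: Exception) {
                        summaryResult = "Summarization failed: ${e.message}"
                    } finally {
                        // Always re-enable the button, even if inference failed.
                        isProcessing = false
                    }
                }
            },
            enabled = !isProcessing && textInput.isNotBlank()
        ) {
            Text(if (isProcessing) "Summarizing on-device..." else "Summarize")
        }
        Spacer(modifier = Modifier.height(24.dp))
        Text(text = "Summary:", fontWeight = FontWeight.Bold)
        Text(text = summaryResult)
    }
}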
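As the screen grows, separate flags for the result text and the processing state can drift out of sync. One possible alternative is to model the inference states as a single sealed type and derive everything the UI shows from it. The sketch below is plain Kotlin with no Compose dependency. SummaryUiState and the helper functions are hypothetical names chosen for illustration.

```kotlin
// Hypothetical names: nothing here comes from Compose or the AI Edge SDK.
sealed interface SummaryUiState {
    object Idle : SummaryUiState
    object Processing : SummaryUiState
    data class Success(val summary: String) : SummaryUiState
    data class Failure(val reason: String) : SummaryUiState
}

/** Label for the button, matching the strings used in the screen. */
fun buttonLabel(state: SummaryUiState): String = when (state) {
    SummaryUiState.Processing -> "Summarizing on-device..."
    else -> "Summarize"
}

/** The button is usable only when idle or finished, and when there is input. */
fun isButtonEnabled(state: SummaryUiState, input: String): Boolean =
    state !is SummaryUiState.Processing && input.isNotBlank()

/** Text shown under the "Summary:" heading. */
fun resultText(state: SummaryUiState): String = when (state) {
    is SummaryUiState.Success -> state.summary
    is SummaryUiState.Failure -> "Summarization failed: ${state.reason}"
    SummaryUiState.Idle, SummaryUiState.Processing -> ""
}
```

In a composable, a single remembered SummaryUiState would stand in for the separate result and processing variables, and the button handler would move it from Processing to Success or Failure.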
Important Considerations for Production
- Model Download Handling: Gemini Nano is not bundled with your APK; the system downloads and updates it separately. Before running queries, your app should check whether the model is fully downloaded and request the download if it is not.
- Thermal Throttling: Sustained on-device inference heats up the mobile processor, and the OS responds by throttling performance. Space out or cap heavy requests so the device stays responsive.
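Both considerations can be expressed as one small check that runs before each request. The following is a minimal sketch in plain Kotlin, not the SDK's own mechanism: ModelStatus, GateDecision and InferenceGate are hypothetical names, the model status is assumed to come from whatever availability check your SDK version exposes, and a fixed minimum interval between runs stands in for real thermal monitoring.

```kotlin
// All names below are hypothetical and exist only for illustration.
enum class ModelStatus { NOT_DOWNLOADED, DOWNLOADING, READY }

enum class GateDecision { RUN, REQUEST_DOWNLOAD, WAIT_FOR_DOWNLOAD, COOL_DOWN }

class InferenceGate(
    private val minIntervalMs: Long,
    // Injectable clock so the gate can be tested without sleeping.
    private val nowMs: () -> Long = System::currentTimeMillis
) {
    private var lastRunMs: Long? = null

    /** Decides what the app should do before sending a prompt to the model. */
    fun decide(status: ModelStatus): GateDecision {
        // The model must be present before anything else matters.
        when (status) {
            ModelStatus.NOT_DOWNLOADED -> return GateDecision.REQUEST_DOWNLOAD
            ModelStatus.DOWNLOADING -> return GateDecision.WAIT_FOR_DOWNLOAD
            ModelStatus.READY -> Unit
        }
        // Enforce a pause between runs to avoid sustained load.
        val now = nowMs()
        val last = lastRunMs
        if (last != null && now - last < minIntervalMs) return GateDecision.COOL_DOWN
        lastRunMs = now
        return GateDecision.RUN
    }
}
```

With a two-second interval, a READY request at time 0 returns RUN, a second one at 500 ms returns COOL_DOWN, and one at 2,500 ms returns RUN again. The interval is a tuning knob, not a recommended value.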