You know that feeling when you see a beautiful plant at the store but have no idea what it is or how to care for it? That’s exactly what happened to me last spring, and it got me thinking – what if I could just point my phone at any plant and instantly know everything about it?
So I built PlantPal, an iOS app that identifies plants from photos and provides AI-powered care advice. The cool part? Everything runs on-device using Couchbase vector search. No internet required, no photos sent to servers, just pure local plant identification magic.
What I built
PlantPal turned out to be quite the technical adventure. It’s basically a RAG application that can identify plants from your camera and then chat with you about care instructions. But here’s what makes it different from other plant apps:
The whole thing runs locally on your phone. Point your camera at a snake plant, and it’ll instantly know it’s a snake plant. Ask it “how often should I water this?”, and it’ll give you specific advice for that exact plant – all without sending a single byte to the cloud.
Oh, and I managed to squeeze the entire app from a ridiculous 800MB down to just 14MB. More on that disaster later.
Demo video
The tech stack (or: how I learned to love vector databases)
Building this thing required quite a few pieces:
- iOS + Swift – Obviously, since I wanted it on my iPhone
- Couchbase Lite Vector Search – This was the game-changer for local plant identification
- MobileCLIP – Apple’s computer vision model for turning images into numbers
- Core ML – For running everything locally
- Foundation Models – Apple’s on-device AI framework for chat functionality, providing privacy-first LLM capabilities without cloud dependencies
The whole philosophy was “keep everything local.” No cloud APIs, no sending plant photos to random servers, no privacy nightmares. Just you, your phone, and some very clever math.
Part 1: Vector search – it’s not as scary as it sounds
Why most plant apps suck
Let’s be honest – most plant identification apps are pretty frustrating. You take a photo, wait 3-5 seconds while it uploads to some server, then get back a result that may or may not be accurate. Plus, you’re basically sending photos of your houseplants to who-knows-where.
I wanted something that just worked instantly, like pointing your camera at a plant and immediately knowing what it is. No waiting, no internet required, no privacy concerns.
The answer turned out to be vector search. I know, I know – it sounds super technical and intimidating. But it’s actually a pretty elegant solution once you wrap your head around it.
The magic of vector embeddings
Here’s the crazy part: you can turn any image into a list of numbers. Not like pixel values, but actual meaningful numbers that represent what’s IN the image.
```
Rose photo  → [0.1, 0.8, 0.3, ...]  (512 numbers)
Tulip photo → [0.9, 0.2, 0.1, ...]  (512 numbers)
```
The beautiful thing is that similar plants end up with similar numbers. Two rose photos will have very similar vectors, while a rose and a cactus will be completely different.
Once I had this insight, the solution became obvious – just compare the numbers! Couchbase Vector Search handles all the heavy lifting of finding similar vectors efficiently.
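If you want to see what “compare the numbers” actually means without any library involved, it boils down to cosine distance. Here’s a toy version purely for intuition – in the real app this math happens (approximately, and much faster) inside Couchbase’s vector index, not in my code:

```swift
// Toy illustration: cosine distance between two embeddings.
// 0 means "basically the same image", values near 1 mean "nothing alike".
func cosineDistance(_ a: [Float], _ b: [Float]) -> Float {
    let dot = zip(a, b).map(*).reduce(0, +)
    let magA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let magB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    return 1 - dot / (magA * magB)
}
```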
The actual implementation turned out to be surprisingly clean:
```swift
func search(image: UIImage) -> [Record] {
    // Turn the camera photo into numbers
    let embeddings = AI.shared.embeddings(for: image, attention: .zoom(factors: [1, 2]))

    for embedding in embeddings {
        let plantSearchResults = self.searchPlants(vector: embedding)
        if !plantSearchResults.isEmpty {
            return plantSearchResults
        }
    }

    return []
}
```
Part 2: the great MobileCLIP migration
When Vision Framework wasn’t enough
So here’s where things got interesting. I started with Apple’s Vision framework because, well, it’s built into iOS and seemed like the obvious choice. For my initial test with 47 plant photos, it worked perfectly fine. I was feeling pretty good about myself.
Then I got ambitious. What if I could build a database with 15,000+ plant species? That’s when everything fell apart.
The problem was accuracy. When I tested Vision embeddings on larger datasets, plants that looked similar kept getting confused with each other. Like, the app would think a Snake Plant was a ZZ Plant, which is… not great when you’re trying to give watering advice.
Turns out Vision framework embeddings just weren’t distinct enough for my use case. The vectors for similar-looking plants were too close together, so the app couldn’t tell them apart reliably.
MobileCLIP
After some digging around (and a lot of frustration), I discovered MobileCLIP. It’s specifically designed for image similarity tasks, and the difference was night and day. Suddenly, similar plants had much more distinct embeddings, and my accuracy went way up.
The migration was a bit of work – I had to rewrite the embedding generation pipeline – but totally worth it. Now I could confidently scale to thousands of plant species without accuracy falling off a cliff.
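If you’re curious what that pipeline roughly looks like, here’s a stripped-down sketch rather than my exact code. It assumes a MobileCLIP-S1 image encoder exported to Core ML with an input feature named "image", an output named "embedding", and a little `resizedPixelBuffer(side:)` helper (not shown) that turns a UIImage into a CVPixelBuffer:

```swift
import CoreML
import UIKit

// Sketch of a MobileCLIP-S1 image encoder wrapped for embedding generation.
// Model file, feature names, and input size are assumptions about the Core ML export.
final class MobileCLIPEncoder {
    private let model: MLModel

    enum EncoderError: Error { case badInput, missingOutput }

    init(modelURL: URL) throws {
        let config = MLModelConfiguration()
        config.computeUnits = .all  // let Core ML pick CPU / GPU / Neural Engine
        model = try MLModel(contentsOf: modelURL, configuration: config)
    }

    func embedding(for image: UIImage) throws -> [Float] {
        // resizedPixelBuffer(side:) is a hypothetical helper that scales the photo
        // to the encoder's square input size and converts it to a CVPixelBuffer.
        guard let pixelBuffer = image.resizedPixelBuffer(side: 256) else {
            throw EncoderError.badInput
        }

        let input = try MLDictionaryFeatureProvider(
            dictionary: ["image": MLFeatureValue(pixelBuffer: pixelBuffer)]
        )
        let output = try model.prediction(from: input)

        guard let array = output.featureValue(for: "embedding")?.multiArrayValue else {
            throw EncoderError.missingOutput
        }

        // Copy the 512-dimensional MLMultiArray into a plain [Float] for Couchbase.
        return (0..<array.count).map { Float(truncating: array[$0]) }
    }
}
```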
Part 3: The 800MB Disaster (and how I fixed it)
When your app becomes larger than most games
So remember when I mentioned the 800MB disaster? Yeah, about that…
My first version was absolutely ridiculous. I was shipping the full plant database with high-resolution images, plus multiple MobileCLIP models (S0, S1, S2, BLT – because why choose, right?), plus all the text models I thought I might need someday. The app was basically the size of an AAA mobile game.
Nobody’s going to download an 800MB plant identification app. I mean, would you?
The pre-computation breakthrough
That’s when I had what I like to call my “duh” moment. Why was I generating embeddings for the same 47 plants every single time the app launched? That’s just… wasteful.
What I was doing (badly):
- App Bundle: Plant images (10MB) + All the models (800MB) = 810MB
- Every app launch: Generate embeddings for all reference plants
- Result: Slow startup, ridiculous file size
What I should have been doing:
- Build time: Generate embeddings once from plant images
- App Bundle: Pre-computed embeddings (500KB) + 1 Model (14MB) = 14MB
- Runtime: Only generate embeddings for new camera photos
- Result: Instant startup, normal-sized app
Here’s the pre-computed embedding loader:
```swift
class BuildTimeEmbeddingLoader {
    private var preComputedEmbeddings: [String: PreComputedPlantEmbedding] = [:]

    func loadPreComputedEmbeddings() {
        guard let embeddingsURL = Bundle.main.url(forResource: "plant_embeddings", withExtension: "json"),
              let data = try? Data(contentsOf: embeddingsURL),
              let embeddings = try? JSONDecoder().decode([PreComputedPlantEmbedding].self, from: data) else {
            return
        }

        // Load 47 plants × 512 dimensions = ~500KB total
        for embedding in embeddings {
            preComputedEmbeddings[embedding.plantId] = embedding
        }

        print("Loaded \(embeddings.count) plants!")
    }
}
```
Benefits:
- Instant app startup: No embedding generation needed
- 98% smaller storage: Embeddings vs images
- Better battery life: Less computation at runtime
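The other half of the trick is the build step itself. Conceptually it’s just a tiny script that runs the encoder over every reference photo once and dumps the results to JSON. Here’s a rough sketch – reusing the illustrative `MobileCLIPEncoder` and `PreComputedPlantEmbedding` types from above, with made-up paths, so treat it as a shape rather than my exact pipeline:

```swift
import Foundation
import UIKit

// Build-time sketch: encode every reference plant image once and write
// plant_embeddings.json into the app's bundled resources.
func generateEmbeddings(imagesDirectory: URL, modelURL: URL, outputURL: URL) throws {
    let encoder = try MobileCLIPEncoder(modelURL: modelURL)
    var results: [PreComputedPlantEmbedding] = []

    let imageURLs = try FileManager.default.contentsOfDirectory(
        at: imagesDirectory, includingPropertiesForKeys: nil
    )

    for url in imageURLs where url.pathExtension == "jpg" {
        guard let image = UIImage(contentsOfFile: url.path) else { continue }
        let vector = try encoder.embedding(for: image)
        results.append(PreComputedPlantEmbedding(
            plantId: url.deletingPathExtension().lastPathComponent,
            name: url.deletingPathExtension().lastPathComponent,
            embedding: vector
        ))
    }

    // ~47 plants × 512 floats comes out to roughly 500KB of JSON.
    let data = try JSONEncoder().encode(results)
    try data.write(to: outputURL)
}
```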
Part 4: making Couchbase do the heavy lifting
Database setup (easier than I expected)
I’ll be honest – when I first heard “vector database,” I thought it would be this massive, complicated thing. But Couchbase Lite made it surprisingly straightforward:
```swift
// Enable vector search extension
try! CouchbaseLiteSwift.Extension.enableVectorSearch()

// Create local database
let database = try! CouchbaseLiteSwift.Database(name: "plantpal")
let collection = try! database.defaultCollection()

// Vector index for 512-dimensional plant embeddings (MobileCLIP S1 model output size).
// VectorIndexConfiguration defines the indexed expression, vector dimensions,
// clustering centroids, and distance metric.
var imageVectorIndex = VectorIndexConfiguration(expression: "image", dimensions: 512, centroids: 8)
imageVectorIndex.metric = .cosine
imageVectorIndex.isLazy = true
try! collection.createIndex(withName: "ImageVectorIndex", config: imageVectorIndex)

// Full-text search for plant names
let ftsIndex = FullTextIndexConfiguration(["name", "scientificName"])
try! collection.createIndex(withName: "PlantNameIndex", config: ftsIndex)

print("Vector search database ready!")
```
The SQL++ that changed everything
Here’s where Couchbase really shines. Instead of writing complex similarity algorithms, I can just use SQL++ with a special vector function. It’s almost too easy:
```swift
private func searchPlants(vector: [Float]) -> [Record] {
    let sql = """
        SELECT type, name, scientificName, price, location,
               wateringSchedule, careInstructions, characteristics,
               APPROX_VECTOR_DISTANCE(image, $embedding) AS distance
        FROM _
        WHERE type = "plant" AND distance BETWEEN 0 AND 0.4
        ORDER BY distance, name
        LIMIT 1
        """

    var records = [Record]()
    do {
        let query = try collection.database.createQuery(sql)
        query.parameters = Parameters()
            .setArray(MutableArrayObject(data: vector), forName: "embedding")

        // Map each result row into a Plant object
        for result in try query.execute() {
            let plant = Plant(
                name: result["name"].string ?? "",
                scientificName: result["scientificName"].string,
                wateringSchedule: extractWateringSchedule(from: result),
                careInstructions: extractCareInstructions(from: result),
                characteristics: extractCharacteristics(from: result)
            )
            records.append(plant)
        }
    } catch {
        print("Plant vector search failed: \(error)")
    }
    return records
}
```
The APPROX_VECTOR_DISTANCE function calculates the approximate distance between a target vector and the indexed vectors in the database, so similarity search drops straight into ordinary SQL++ queries, supports multiple distance metrics, and can be combined with other predicates for hybrid search.
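As a rough illustration of the hybrid side, you can mix the full-text index and the vector distance in a single query. Something along these lines (an untested sketch, reusing the index names from the setup above):

```sql
-- Filter candidates with the full-text index, then rank them by vector distance.
SELECT name, scientificName,
       APPROX_VECTOR_DISTANCE(image, $embedding) AS distance
FROM _
WHERE type = "plant" AND MATCH(PlantNameIndex, "snake plant")
ORDER BY distance
LIMIT 3
```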
Part 5: teaching the AI about plants
Making chat actually useful
Okay, so I could identify plants. Cool. But then what? I wanted users to be able to ask questions like “how often should I water this?” and get actually helpful answers, not generic plant care advice.
This is where RAG (Retrieval-Augmented Generation) comes in. Basically, I needed to give the AI specific context about whichever plant was just identified. Here’s how I structured all that plant knowledge:
```swift
class Plant: Record {
    let name: String?
    let scientificName: String?
    let wateringSchedule: WateringSchedule?
    let careInstructions: CareInstructions?
    let characteristics: PlantCharacteristics?

    struct WateringSchedule {
        let frequency: String
        let amount: String
        let notes: String
    }

    struct CareInstructions {
        let light: String
        let temperature: String
        let humidity: String
        let fertilizer: String
        let pruning: String
    }

    struct PlantCharacteristics {
        let toxicToPets: Bool
        let airPurifying: Bool
        let flowering: Bool
        let difficulty: String
    }
}
```
When a plant is identified, I build rich context for the AI from local data:
```swift
private func buildPlantContext(for plant: Plant) -> String {
    let context = """
    You are PlantPal, an expert plant care assistant.
    You can ONLY answer questions about: \(plant.name ?? "Unknown Plant").

    PLANT INFORMATION:
    Name: \(plant.name ?? "Unknown")
    Scientific Name: \(plant.scientificName ?? "Not available")

    WATERING SCHEDULE:
    \(plant.wateringSchedule?.frequency ?? "Not specified")
    \(plant.wateringSchedule?.notes ?? "")

    CARE INSTRUCTIONS:
    Light: \(plant.careInstructions?.light ?? "Not specified")
    Temperature: \(plant.careInstructions?.temperature ?? "Not specified")
    Humidity: \(plant.careInstructions?.humidity ?? "Not specified")

    CHARACTERISTICS:
    Pet Safe: \(plant.characteristics?.toxicToPets == false ? "Yes" : "No")
    Air Purifying: \(plant.characteristics?.airPurifying == true ? "Yes" : "No")
    Difficulty: \(plant.characteristics?.difficulty ?? "Not specified")
    """
    return context
}
```
This provides rich, contextual AI responses using the identified plant’s specific care information.
Foundation models integration
The chat layer runs on Apple’s Foundation Models framework, so the language model executes entirely on-device: conversations stay private, and the plant context built above makes the responses specific to the plant you just identified.
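I won’t walk through the whole chat layer here, but the core wiring is small. As a hedged sketch (assuming the Foundation Models framework on a current OS – the exact API surface may differ from this), the plant context becomes the session’s instructions and each user question is answered against it:

```swift
import FoundationModels

// Sketch: answer a question about the identified plant entirely on-device.
// buildPlantContext(for:) is the context builder shown earlier.
func askPlantPal(_ question: String, about plant: Plant) async throws -> String {
    // The RAG context constrains the model to this specific plant's care data.
    let session = LanguageModelSession(instructions: buildPlantContext(for: plant))
    let response = try await session.respond(to: question)
    return response.content
}
```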
Part 6: performance optimizations
Size reduction journey
- Started with: 8 MobileCLIP models (800MB)
- Optimized to: 1 MobileCLIP-S1 model (120MB)
- Result: 85% size reduction with better accuracy!
Part 7: what I learned (the hard way)
Things that actually worked
- Pre-computed embeddings – This was probably my biggest win. Going from generating embeddings every startup to pre-computing them saved both startup time and 98% of the app size.
- MobileCLIP – Way better than Vision framework for distinguishing between similar plants. Totally worth the migration headache.
- Couchbase Vector Search – Having SQL for vector operations is a game-changer. No more writing custom similarity algorithms.
- Keeping everything local – Users love not having to worry about internet connectivity or privacy.
Things that didn’t go smoothly
- Model size optimization – I went through way too many iterations trying to find the right balance between accuracy and file size. Shipping 8 different models was… not smart.
- Similarity thresholds – Took forever to tune these properly. Too strict and nothing matches, too loose and everything matches.
- Build automation – Setting up the embedding generation pipeline to run during builds was trickier than expected.
If I were starting over
- Think about app size from day one – Don’t ship 800MB apps, people won’t download them
- Test your embedding model at scale – What works for 50 plants might not work for 5000
- Couchbase Vector Search is your friend – Don’t reinvent vector similarity matching
- Build time optimization > runtime optimization – Do the heavy lifting during builds, not when users are waiting
- Real data reveals everything – Your algorithm might work perfectly on curated test images and completely fail on actual user photos
What’s next?
- Expanding to more plant species
- Adding plant care reminders and progress tracking
- Community features for plant sharing
Wrapping up
Building PlantPal turned out to be way more educational than I expected. I went in thinking “how hard could plant identification be?” and came out with a deep appreciation for vector databases, embedding models, and the art of mobile optimization.
The coolest part? I can now point my phone at literally any plant and instantly know what it is and how to care for it. No internet required, no privacy concerns, just pure local magic.
What I’m most proud of:
- Going from an 800MB monstrosity to a sleek 14MB app
- Achieving instant responses with smart pre-computation
- Building something that actually works reliably in the real world
- Proving that complex AI doesn’t need cloud servers
Vector search turned out to be this perfect sweet spot between simplicity and power. Couchbase made the whole thing way easier than I thought it would be – being able to use SQL for vector operations feels almost like cheating.
If you’re thinking about building something similar, just start small and iterate. My first version was terrible, but each iteration taught me something new. And definitely don’t ship an 800MB app on your first try. Trust me on this one.
Have you built anything with vector search? I’d love to hear about it! And if you try building your own plant identification app, hit me up – I’m always curious to see what other people come up with.