Converting an Expo app to Flutter with Claude
January 22, 2026 at 07:50 AM
Maybe we've all had this experience. You're wrapping up a project, doing rounds of manual testing, handling the more administrative side of things to prepare for release, and polishing the code. Then it hits you: "What if I rewrote this in a totally different language and framework?"
Normally you might release your current product, test its viability, then stub the conversion in an epic for later consideration. That's probably the correct choice depending on the size of the project. I chose the rewrite.
Why? We now have agents that can code while we sleep and teach us things we didn't know before. If I think I can rewrite the project quickly and bring significant system-wide improvements, I may as well stick an agent on it and see how it goes.
So that's how it went for my recent Expo app. It's a lightweight app, but it's not trivial: there is a lot of relational data and even a native module. And precisely because it's so lightweight, one question nagged at the back of my mind and ultimately led to the rewrite decision: why are these animations so bad?
For background, this is my first Expo app. I'm absolutely sure I did things wrong. I wrote the initial iteration myself following all the relevant guides. It uses Tamagui, initially with its built-in animations (using CSS, I believe).
And that's where some problems started. There is an accordion component that uses a height animator to animate the open/close transitions. I wrote the component exactly as the docs suggest. Yet the animator never worked correctly; each accordion item's children were fully visible or invisible until the close/open finished. I found a GitHub issue where the suggested solution was to use Moti for animations. This solved my problem, but introduced another one: height animations caused a ton of frame drops.
From the research I did, height animations are just expensive and it's something you have to live with. I could have tried a different UI library. Maybe it's an issue with Tamagui, even though plenty of Reddit discussions praise Tamagui for its clean code and efficient components. If I found a different library that performed better, that meant rewriting my entire UI in that library.
"If there's a good chance of a rewrite either way," I thought, "why not rewrite the whole app?" That's when I started researching Flutter.
I know nothing about Flutter or Dart. From what I understand, Dart was meant to replace JavaScript, and it's sort of like a compiled TypeScript. But it's effectively only ever used in Flutter. Oh well, the LLM will figure it out.
Starting the Conversion
I use a pretty simple AI setup, though I've been testing a variety of tools. I wrote much of the original Expo app myself. Then I started playing with Kiro to get a lot of the work done when I burned out. It was actually very effective, though it over-complicates a lot of work and that compounds further down the line. I like the spec-driven development approach, but I didn't feel Kiro was sufficient for a full rewrite. For this conversion, I chose Claude Code with Spec Kit. The repo currently has 34 Kiro specs and 8 Spec Kit specs.
I've been trying out Spec Kit recently to have a Kiro-like experience in Claude Code. I find it's overkill for a lot of tasks. It generates several files before you can even get any work done, and those files might take 15-30 minutes to generate. In that time you could have completed a smaller feature, so it's not viable unless there's a ton of work to be done. This was a ton of work to be done.
I thought the conversion might take a day and I could write a post about converting an entire mobile app to Flutter in under a day with AI. We're on day four or five now, and it's slow going. Although AI is supposed to make our lives easier, and ideally reduce burnout, I'm facing a new type of burnout: constantly managing and correcting an LLM.
Anyway, I put together a spec for the conversion. I thought it would be very straightforward. The whole app is implemented and working in Expo. Surely you just take those conventions and existing logic, then convert them to their Flutter equivalents, right?
I think I made some mistakes here, and one of them may have been critical. Something to learn from for the future. Spec generation uses a ton of tokens, which is expected. However, I'm quite a perfectionist. While working with Claude on /speckit.tasks to generate the task list, I kept asking it to refine a sentence here and there before writing the file. This, of course, caused the entire tasks.md file to go into context several times over.
The first iteration of the tasks.md file was actually quite good. It covered the tasks in detail and had most of the features from the Expo app, with just a few minor details here and there to fix. I kept asking Claude to fix those "minor" details, and eventually Claude compacted its context, compressing all the very specific details we had discussed. The final tasks.md file was far less detailed than the original, and while I didn't realize it at the time, many critical features and constraints were missing.
A better approach moving forward if you have a decent file that you want to iterate on might be: allow Claude to write the file, then immediately halt execution and ask for your revisions. Then you can have that file and all of its revisions in easily-traceable history. In hindsight, I likely should have dug into the ~/.claude files or Opik to pull out the original file that I had liked.
I proceeded with this less-than-ideal but still usable tasks file.
Reality Hits
I always use Sonnet 4.5. I'm not trying to spend a mortgage's worth of money on tokens. Rather than a Claude subscription (my thoughts on that could be an entirely separate post), I pay for my tokens through Bedrock. I have since learned when to use Opus 4.5 for more difficult tasks or tasks where Sonnet just can't figure it out.
As you can imagine, a full rewrite did not go as smoothly as expected. I'm not sure if Sonnet just gets confused easily, or if Spec Kit actually produces too much data such that the important signals get drowned out in all of the data. The instructions and requirements, at a high level, were quite simple:
Implement Expo routes (screens) 1-to-1 in Flutter
Ensure all API calls in Flutter are identical to their API calls in Expo
As much as possible, ensure all layouts in Flutter are identical to their Expo counterparts
Use ShadCN components, not Material components
Sonnet very rarely adhered to these requirements. For the most part, it completely fabricated data models and API routes. In the Expo app, there is a user.ts file that describes the data model, and on the API side (.NET Core with Entity Framework), there is a User.cs entity. There really should be no confusion about what a user object looks like in my application. Yet Sonnet implemented what it assumed a user object might look like in most applications, not what the user object actually is in my application.
In the Expo app, I use TanStack Query for queries and mutations. Although Kiro over-engineered the organization of Query hooks, it's very easy to understand by reading the hook where data comes from and where data goes. Imagine you have an API that links pet owners to pets (only an example), and each pet has items (maybe its toy collection or something). You might have a route like PUT /items/:id to update an item. Or GET /items/?petId=:id to get items for a pet. In the Expo app as well as the API code, routes like these are clearly defined. Sonnet's idea of what the route to update an item would be? PUT /users/me/pets/items/:id. And it confidently tells me everything is working with 100% feature parity on every iteration.
Now, I get it. API structures vary widely, especially with things like REST vs JSON:API. Maybe the route would have been something like PUT users/:userId/relationships/pets/:petId/relationships/items/:itemId in a different app. But that's not the case here, and all the information one should need is in the code.
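One way to make the intended routes harder for a model to miss is to centralize them in a small API client whose methods spell out each path. A minimal Dart sketch, using the `http` package and the hypothetical pet/items routes from the example above (all names here are illustrative, not my actual code):

```dart
import 'package:http/http.dart' as http;

/// Hypothetical API client: each method pins the exact route the
/// Expo app already uses, so there is nothing left to invent.
class ItemsApi {
  ItemsApi(this._base);
  final Uri _base;

  /// PUT /items/:id -- mirrors the Expo mutation one-to-one.
  Future<http.Response> updateItem(String id, String jsonBody) =>
      http.put(
        _base.resolve('/items/$id'),
        headers: {'Content-Type': 'application/json'},
        body: jsonBody,
      );

  /// GET /items/?petId=:id -- fetch the items for a pet.
  Future<http.Response> itemsForPet(String petId) => http.get(
        _base.replace(path: '/items/', queryParameters: {'petId': petId}),
      );
}
```

With the routes written out once like this, a fabricated path such as `PUT /users/me/pets/items/:id` at least becomes a diff you can catch in review, rather than something buried in scattered call sites.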
Similarly, Sonnet fabricated its own UI from scratch. Nothing was where it should have been, and I had to coax it into actually reading the Expo files to see how the existing UI looks before getting hit with the "You're absolutely right! I didn't create a similar UI at all!"
There seems to also be little cohesion between tasks in Spec Kit. Maybe a full rewrite is simply more than Spec Kit is ideal for, and I would have to break each high-level step down into a spec. For example, there were core tasks like stubbing out placeholder screens to match Expo's routing and layout. Just getting those stubs in place took some fighting, but we did get them in place. Yet when it came time in a later task to build out the UI for a specific screen, Claude would confidently tell me the UI had been implemented with complete feature parity -- yet it forgot to replace the stub with the actual feature.
As for ShadCN, Claude will not use it. Even if I specifically say, "All components MUST be ShadCN components whenever possible" in a /speckit.implement prompt, if I'm really lucky it will implement a component or two with ShadCN, and then it will do the rest in Material. Any UI change always requires a follow-up prompt asking it to redo all of its work in ShadCN. Even then, it makes up a new design every time instead of checking what the UI looks like in other screens. I probably have five or six different variations of what a card looks like in my app that I will need to go over and unify during the AI slop cleanup phase.
One reason I am pushing the LLM to use ShadCN is I really want it to steer away from every single form item being in some sort of independent state. My favorite UI library Mantine has a great form system, and I found Flutter's ShadCN implementation to have a similar one. In the Expo app, Kiro implemented various const [field1, setField1] = useState(); and const [field1Errors, setField1Errors] = useState(); all over the place instead of coming up with a more robust solution -- maybe a Zod schema or even just state as an object. I only got Claude to implement two or three forms in Flutter as ShadCN forms -- I will have to go through and redo all of them.
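To sketch the object-style alternative: one form key plus per-field validators replaces a `useState` pair per field. I'm showing Flutter's built-in `Form` here for brevity (I'm not yet fluent in the ShadCN package's exact API, though the post above notes it exposes a similar validator-driven system); the field name is hypothetical:

```dart
import 'package:flutter/material.dart';

/// One Form with per-field validators, instead of a useState pair
/// (value + errors) for every single field.
class ProfileForm extends StatefulWidget {
  const ProfileForm({super.key});
  @override
  State<ProfileForm> createState() => _ProfileFormState();
}

class _ProfileFormState extends State<ProfileForm> {
  final _formKey = GlobalKey<FormState>();
  String _name = '';

  @override
  Widget build(BuildContext context) => Form(
        key: _formKey,
        child: Column(children: [
          TextFormField(
            decoration: const InputDecoration(labelText: 'Name'),
            // Validation lives with the field; errors render inline.
            validator: (v) =>
                (v == null || v.isEmpty) ? 'Name is required' : null,
            onSaved: (v) => _name = v!,
          ),
          ElevatedButton(
            onPressed: () {
              final form = _formKey.currentState!;
              if (form.validate()) form.save(); // one call checks every field
            },
            child: const Text('Save'),
          ),
        ]),
      );
}
```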
The Battle of Riverpod
Today was a grueling fight getting Claude to do something that should have been simple. It reminds me of an X post I saw recently where someone said they spent an hour wrangling an LLM to get something working, finally just read the docs themselves, and fixed the issue in 15 minutes.
A downside of converting the app to Flutter is that I have no existing knowledge of the Flutter ecosystem. This past week is my first time hearing of Riverpod. So I'm trusting Claude to implement things in a sound manner using Flutter best practices.
There is a part of my app that mutates user data -- something you would think is common in virtually any app. I found that while the mutation could be triggered via the UI and the request to the API was correct, the updated data was not being reflected in the UI. I naturally asked Claude to fix this, thinking it would take 2-3 minutes at best.
Next was a 1.5-hour slog as Sonnet tried desperately to understand the issue and resolve it. The prompt was pretty simple:
/speckit.implement 006 - When I change [field] on [screen], I can see that the request to the API is correct and the data is updated in the database, but I don't see the updated data in the UI.

Surely it's just a caching issue. Claude thought so too. It put cache invalidations all over the place. (Never mind that the mutations return the full user object, so one could just update the cache instead of invalidating it.) "I've got it! The issue is we're not refetching users after the data is updated. I've fixed the issue and the app now has 100% feature parity with Expo [lots of emoji]."
We did not have 100% feature parity with Expo and the data issue was still not resolved.
At one point, Claude spent about 20 minutes rewriting the user store and updating all of the various places that use it, then ran flutter analyze, realized it had tried to use library features that don't exist, and decided to git revert every single change. Twenty minutes and who knows how many tokens wasted, just to conclude that the best option was to undo everything it had done.
LLMs seem to have a very hard time with the APIs of the particular library version they are using. In this case, the solutions Claude tried might have worked in Riverpod 2, but we are using Riverpod 3, and every single approach it tried relied on features that were deprecated, though a better term here is "removed," not "deprecated."
That's when I just read the docs myself. And it was instantly obvious: the user store was a provider, not a notifier. All we really needed to do was change the various providers (and yes, there were multiple, because it had decided to implement data-slicing functionality via dependent providers) into a notifier. A provider represents immutable data, so if you mutate the data, the change is not broadcast to consumers. A notifier represents mutable data, and data changes are broadcast to consumers. I likely could have fixed this myself, but I git reset --hard Sonnet's multi-hour fight with itself, switched the model to Opus 4.5, and gave it my findings. It had a working solution in 4 minutes.
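In Riverpod terms, the change boils down to something like this (names are hypothetical and the real store is more involved, but the provider-vs-notifier distinction is the whole fix):

```dart
import 'package:riverpod/riverpod.dart';

class User {
  const User({required this.name});
  final String name;
}

// Before (broken): a Provider exposes a value computed once.
// Mutating that value in place never notifies watchers, so the UI
// keeps showing stale data.
final userProvider = Provider<User>((ref) => const User(name: 'Ada'));

// After (working): a Notifier owns mutable state. Assigning to
// `state` broadcasts the change to every consumer watching it.
class UserStore extends Notifier<User> {
  @override
  User build() => const User(name: 'Ada');

  void rename(String name) => state = User(name: name);
}

final userStoreProvider =
    NotifierProvider<UserStore, User>(UserStore.new);
```

A widget watching `userStoreProvider` rebuilds as soon as `rename` assigns new state, which is exactly the behavior the mutation screen needed.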
This also means I have to add a backlog item to check every other existing data store to see if the provider vs. notifier choice is correct for that particular store.
Admittedly, this is a bit unfair to Sonnet as it too may have been able to figure out a solution if I gave it my findings the same way I gave them to Opus.
Where We Stand
I'm now on the final two features to port over to Flutter. One of them was "finished" just now, and we're working through the slew of errors I've found. After that, there are only a handful of quality-of-life features to implement that don't exist in Expo either, so we'll be in uncharted territory.
Despite the negative sentiment of this post, the Flutter conversion was ultimately done orders of magnitude faster than I could have done it myself. When you factor in motivation, spare time, learning curves, and sheer typing speed, this rewrite could have taken weeks or months of my own time. Four days is an engineering miracle.
The app itself looks pretty horrible with a bunch of ad-hoc UIs that Claude stitched together, so once the functionality is in place, I'll be working with Claude on an overhaul and standardization of the UI. Then the Flutter app will finally be caught up to the Expo app with full feature parity, better architecture, a better UI (I like ShadCN more than Tamagui, and it's easier to work with), and none of the device-level issues I had with Expo.
Worth It?
This was a four-day headache, but it was only four days. The catalyst was bad animations in Expo. In Flutter? Beautiful animations. Everything is smooth. Transitions between routes are seamless. The mental model of Flutter's widget tree makes sense to me. I more or less understand how state works thanks to The Battle of Riverpod.
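As one example, the accordion-style open/close that started this whole saga can be expressed in Flutter with a stock `AnimatedSize` widget. This is a hedged sketch, not my exact implementation:

```dart
import 'package:flutter/widgets.dart';

/// Animates between collapsed (height 0) and the child's natural
/// height whenever `open` flips; Flutter tweens the size change.
class Collapsible extends StatelessWidget {
  const Collapsible({super.key, required this.open, required this.child});
  final bool open;
  final Widget child;

  @override
  Widget build(BuildContext context) => AnimatedSize(
        duration: const Duration(milliseconds: 200),
        curve: Curves.easeInOut,
        // null height lets the child size itself when open.
        child: SizedBox(height: open ? null : 0, child: child),
      );
}
```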
There are also so many other challenges I faced with Expo:
Expo 52 vs 53 vs 54: various plugins only support one version or the other, or often just up to and including 53.
React Compiler: there are plugins that just don't work with React Compiler. I installed `expo-iap` and some method calls crashed the app, but the crashes went away if I disabled React Compiler. Similarly, I couldn't figure out how to set up Lingui to directly reference `messages.po` files (instead of the compiled `.ts` files) while still using React Compiler. So many different Babel, Webpack, and Metro configs to manage; somehow this has not gotten easier over the years in React apps. On the other hand, not using React Compiler meant I had to be especially certain that I didn't have missing `useMemo`, `useCallback`, etc. hooks anywhere, and Sonnet seems almost allergic to using them.
Complex native modules: I don't know if this is a fair point, as there is still a small amount of native code in the Flutter app (I don't know if it would even be called a custom "module" in the Flutter ecosystem). But there is no native code paired with an equivalent TypeScript interface mirroring all of the functionality like there is in Expo. It's just a few lines of native code that interact easily with Flutter.
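For the curious, the Flutter side of that native interop is essentially a single `MethodChannel` call. The channel and method names below are hypothetical:

```dart
import 'package:flutter/services.dart';

// Hypothetical channel name; the native (Kotlin/Swift) side
// registers a handler for this same string.
const MethodChannel _channel = MethodChannel('app/native');

/// Invoke a native method and await its result. No TypeScript-style
/// mirror interface is required on the Dart side.
Future<String?> nativeGreeting() => _channel.invokeMethod<String>('greet');
```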
However, there are drawbacks, some of which I accepted going in:
No EAS builds. It looks like there is Expo Launch which can build and distribute a Flutter app to app stores using EAS, but that's not what I'm looking for. I want to cloud-build an iOS app so I can test without needing to configure a Mac. It looks like this might be possible with Shorebird, but it's not clear to me from its documentation that there's a way to just build whatever iOS' equivalent of an APK is through its cloud builders and then run it on my device, whether through some manual install or TestFlight.
No knowledge of Dart or Flutter. I do review the LLM's code and I think I'm starting to understand some of the concepts. A downside of letting an LLM write the full app is that I don't immediately know which part is "Flutter" and which part is an external library. For example, when seeing all these `ref` keywords everywhere, I thought that might just be how you reference some dynamic module in Dart or Flutter, but I've since learned it's specific to Riverpod. I have all these libraries mixed together that I've never heard of and will need to learn about.
I've discovered that I don't like Dart's import system. I guess it is similar to C# in that you `use` something and then have implicit access to its exports. For example, you might have an `import 'foo.dart';`, and everything within `foo` is usable in the scope of that file, e.g. you might see `bar` somewhere in the code and just have to figure out that it came from `foo`. It's very different from:
Python, where you might have `import foo` and then later `foo.bar`, or otherwise explicitly `from foo import bar`
ES6, where you might have `import { bar } from 'foo';` or `import * as Foo from 'foo';` and then later `Foo.bar`
Rust, where you might have `use foo::{bar};`
All of these make more sense to me than Dart's system.
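To be fair to Dart, it does offer `show`, `hide`, and `as` combinators that restore some of that explicitness; they're just not the default style you see in most code. An illustrative sketch (assuming a local `foo.dart` exists):

```dart
// Default: every public name in foo.dart lands in this file's scope.
import 'foo.dart';

// Opt-in explicitness:
import 'foo.dart' show bar;   // only `bar` is visible
import 'foo.dart' hide baz;   // everything except `baz`
import 'foo.dart' as foo;     // namespaced access: foo.bar
```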
But...that's really it. Sure, the ecosystem is much smaller and that theoretically means I will end up finding something I could have done in Expo that I can't do in Flutter (without a custom solution). That hasn't been the case so far though. All third-party services I use (Customer.io, Superwall, etc.) have Flutter SDKs, and every feature that I implemented in Expo had a first- or third-party equivalent in Flutter (sometimes by the same company that made the equivalent Expo library).
The actual structure of the project is also much cleaner now. A downside of React and consequently Expo is that it's not a framework. There are 200+ ways to implement one simple thing, and Claude mashes these together based on how it's feeling on a given day (e.g. one useState per form field is not something I ever would have done myself, but it is something you can do and is something the LLM chose to do). Maybe a benefit of Flutter's smaller ecosystem plus compiled Dart is that there are fewer documented ways to accomplish something which has resulted in a more consistent codebase in this rewrite.
Lessons Learned
Using LLMs to aid in development is something I'm still exploring and learning every day. In my professional career I lead projects that provide the infrastructure and ability to use LLMs in internal and client-facing features, but using LLMs for development is an entirely separate skill, and it's constantly evolving. New workflows, tools, and LLMs are created or released every day at a pace that's very difficult to keep up with. I've seen people on LinkedIn describe setups with 12+ different agents representing a small business (a project manager, a tech lead, a code reviewer, a copywriter, etc.). While I'd like to learn those workflows and explore those possibilities, I did pick up a few other skills and concepts through this exercise.
Large Projects Need Small Specs
This is only a theory because I don't plan on doing another rewrite for the sake of testing, but even with something as well-structured as Spec Kit, I think there is only so much work it can break down for an LLM. Even though chunks of work are broken down into phases and tasks, I think the sheer amount of data overwhelmed the LLM. This was a bit greedy on my part, as the concept of the rewrite is simple in my mind, as I mentioned above, so I wanted the LLM to wave a magic wand and execute on my idea. A better approach might have been:
Get the Flutter project configured with the basics (though I did this myself)
Create all the data models. No functionality, no UI, just the models.
Create a session store (current user, etc.).
Create the localization foundation. If planning to localize, even if English-only for now.
Create the API client. Relies on the session store for authentication.
Create the data providers. Data fetches and mutations.
Create the UI stubs. No functionality and nothing beyond the app skeleton.
Port over each feature one by one (no need for authentication at this point; let's make sure the LLM can get known features working).
Implement authentication.
Cleanup and polish.
This would be 9 + (n features) specs instead of the one massive spec I used. Each spec would take longer to configure, but hopefully the LLM would make fewer mistakes, reducing cleanup time. I don't actually know whether this results in a net decrease in time spent, since setting up specs is non-trivial, but I expect at least a net reduction in frustration.
Give Claude the Tools It Needs
I work with an almost vanilla setup aside from Spec Kit. I've happened to find various tooling over these four days that might have made Claude "smarter" and reduced my costs. For example, I found a tool via Reddit called GrepAI. It is effectively a glorified code vector store for Claude. Through that post, I also found Serena which is an MCP server that provides better code search thanks to LSPs. Without tools like this, Claude is naively grepping code until it stumbles upon what it thinks it needs. It's slow and costly. It's a shame that I've only found these tools at the end of my not-inexpensive rewrite, but I will at least set up Serena before implementing the final feature to see if there's any noticeable difference in efficiency.
Don't Chain Yourself to a Desk
Although AI is supposed to take the development burden off of us, I found myself still sitting at my desk twiddling my thumbs, waiting for the LLM to produce results, ask for permissions, and so on. I wasn't getting any time or freedom back. To fix that, I changed my workflow when I found a tool called Happy. It lets you control a Claude session from your phone (or another device). It's all open source and end-to-end encrypted, so you don't even need an account. This let me actually feel like I had a coding assistant working for me while I did other things: exercise, reading, watching television. Instead of thinking, "I'd prefer to be sticking to my exercise routine, but I need to be here in case Claude asks for permissions," I could just respond to Claude's permission prompts and other queries from my phone. I still have to run back to my PC to test its "100% feature parity" solutions, but in between those tests I actually had time to do whatever else I wanted.
Don't Forget That You Are the Engineer
Managing an LLM kind of feels like mentoring a junior. You feel like you need to guide them in the right direction without solving the problem yourself. Other times you feel like you are a non-technical manager at the mercy of your AI engineer to implement your requirements. But you shouldn't make that mistake with LLMs. During The Battle of Riverpod, I spent a lot of time trying to get the LLM to solve the problem without using any of my years of experience to solve it myself. You're not Claude's mentor, it's not going to learn from you, and it's not going to thank you for your guidance. If you can solve the problem better and faster than the LLM can, do it. Don't let your skills that you worked hard for go to waste, and more importantly don't forget that you have them.
At What Cost?
Like I mentioned, I pay for tokens via Bedrock rather than being at the mercy of Anthropic through one of their subscriptions. After completing the two outstanding features mentioned above, the total cost of the rewrite was $345.07 USD, with the following models used:
Sonnet 4.5: $316.80
Opus 4.5: $20.06
Haiku 4.5: $6.47
Haiku 3.5: $1.74
I'm quite surprised at the Opus price, because I only used it for a few minutes to correct the notifier issue. Even so, it cost about 6% of several days' worth of Sonnet usage.
I think it's an acceptable price to pay for rewriting an entire feature-complete app in a few days. However, I think I get far more value out of Kiro. While its credits system (instead of tokens) is a bit of a black box, most new features only consume around 50-100 credits, and I only pay $20/month for 1,000 credits.
That said, there is no way to remotely manage a Kiro session, and managing Claude from my phone is something I've really come to appreciate. For the most part, I will probably continue to get my $20 worth out of Kiro for most tasks, especially when I'm already at my computer. But when I'm on the go or have other things I want to do, I'll try getting Claude to continue a Kiro task -- there are already specs and steering (memory) documents similar to Spec Kit, so with a gentle nudge hopefully it can continue Kiro's work when I need it to.
Spec Kit seems to be overkill and I don't think there are many scenarios when I will need it. I do find the research phase of Spec Kit valuable. If you ask it to add a feature that inevitably will require a third-party library, it will research the options and write a report for you. That's very useful to me. I do actually read these reports and chat with Claude about the decisions it makes, especially if I think a different decision is better. But beyond that, I don't think I can confidently say that work produced from a Spec Kit spec is significantly better than work produced from a Kiro spec. If anything, I fought more with Claude through this experience than I ever have with Kiro (which still sometimes uses Claude models under the hood anyway).
Still, remote management of a coding agent has become very important to me and is a critical first step in finally getting this "freedom" that has been promised to us via AI. I will likely explore other tools like Antigravity or Codex to see if they fit this workflow, or learn more about the various "ambient" agent offerings to have agents running in the cloud.
For now, it's time to clean up this terrible UI that Sonnet produced.