Testing
Flutter testing — unit tests, widget tests, golden tests, integration tests, mock strategies
Flutter ships four test layers out of the box, and the discipline is knowing which layer to use for which guarantee. The wrong layer wastes either time or confidence. This page covers each layer in depth, with production-grade patterns and the anti-patterns that waste your team's time.
Testing Pyramid in Flutter
The classic testing pyramid (many unit, fewer integration, fewest E2E) still applies, but Flutter shifts the proportions. Widget tests in Flutter are cheap — they run in a headless environment with no emulator, complete in milliseconds, and exercise real rendering logic. This makes them far cheaper than browser-based component tests in web frameworks, so you can afford more of them.
| Layer | Speed | Confidence | Maintenance Cost | Recommended Volume |
|---|---|---|---|---|
| Unit | ~1 ms per test | Logic correctness | Very low | Many |
| Widget | ~10-50 ms per test | Rendering + interaction | Low | Many |
| Golden | ~100 ms per test | Visual regression | Medium-high | Selective |
| Integration | Seconds-minutes | Full-stack behavior | High | Critical paths only |
The practical distribution for most Flutter projects: 60% unit, 25% widget, 10% golden, 5% integration. Adjust based on your app's risk profile — a design-heavy app with strict brand guidelines deserves more goldens; a data-pipeline app deserves more unit tests.
The cost axis that matters most is maintenance, not writing time. A test that takes 10 minutes to write but never breaks is cheaper than a test that takes 2 minutes to write but breaks every sprint. Integration tests and golden tests are expensive not because they're hard to write, but because they break for reasons unrelated to bugs.
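The trade-off above can be made concrete with simple arithmetic. The sketch below is illustrative only: `lifetimeCostMinutes`, the 15-minute triage time, and the 20-sprint horizon are assumptions chosen for the example, not measurements.

```dart
/// Rough lifetime cost of one test, in minutes.
///
/// Hypothetical model: cost = time to write
///   + (spurious failures per sprint * minutes to triage each * sprints).
double lifetimeCostMinutes({
  required double writeMinutes,
  required double breaksPerSprint,
  required double fixMinutesPerBreak,
  required int sprints,
}) {
  return writeMinutes + breaksPerSprint * fixMinutesPerBreak * sprints;
}

// Assumed numbers: 20 sprints, 15 minutes to triage each spurious failure.
final stableTest = lifetimeCostMinutes(
  writeMinutes: 10,
  breaksPerSprint: 0,
  fixMinutesPerBreak: 15,
  sprints: 20,
);
final brittleTest = lifetimeCostMinutes(
  writeMinutes: 2,
  breaksPerSprint: 1,
  fixMinutesPerBreak: 15,
  sprints: 20,
);
```

Under these assumptions the stable test costs 10 minutes in total and the brittle one 302, which is why break frequency dominates writing time.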
Unit Testing
Unit tests verify pure Dart logic with no Flutter framework involvement. They are the fastest to write, fastest to run, and most stable.
Testing Pure Dart Logic
Validators, formatters, parsers, and business rules are the highest-ROI test targets. They are deterministic, have no dependencies, and break only when the logic is wrong.
// lib/core/utils/email_validator.dart
class EmailValidator {
static bool isValid(String email) {
return RegExp(r'^[\w\-.]+@([\w-]+\.)+[\w-]{2,}$').hasMatch(email);
}
}
// test/core/utils/email_validator_test.dart
import 'package:test/test.dart';
import 'package:myapp/core/utils/email_validator.dart';
void main() {
group('EmailValidator', () {
test('should accept valid email', () {
expect(EmailValidator.isValid('user@example.com'), isTrue);
});
test('should reject email without domain', () {
expect(EmailValidator.isValid('user@'), isFalse);
});
test('should reject empty string', () {
expect(EmailValidator.isValid(''), isFalse);
});
});
}
Testing Repository / Service Classes
Repositories and services have external dependencies (APIs, databases). Inject dependencies via constructor parameters so tests can substitute fakes or mocks.
// lib/features/order/data/order_repository.dart
class OrderRepository {
final OrderApi _api;
final OrderCache _cache;
OrderRepository({required OrderApi api, required OrderCache cache})
: _api = api,
_cache = cache;
Future<Order> getOrder(String id) async {
final cached = _cache.get(id);
if (cached != null) return cached;
final order = await _api.fetchOrder(id);
_cache.put(order);
return order;
}
}
// test/features/order/data/order_repository_test.dart
import 'package:mocktail/mocktail.dart';
import 'package:test/test.dart';
class MockOrderApi extends Mock implements OrderApi {}
class MockOrderCache extends Mock implements OrderCache {}
void main() {
late OrderRepository repository;
late MockOrderApi mockApi;
late MockOrderCache mockCache;
setUp(() {
mockApi = MockOrderApi();
mockCache = MockOrderCache();
repository = OrderRepository(api: mockApi, cache: mockCache);
});
test('should return cached order when available', () async {
final order = Order(id: '1', total: 99.99);
when(() => mockCache.get('1')).thenReturn(order);
final result = await repository.getOrder('1');
expect(result, equals(order));
verifyNever(() => mockApi.fetchOrder(any()));
});
test('should fetch from API and cache when not cached', () async {
final order = Order(id: '1', total: 99.99);
when(() => mockCache.get('1')).thenReturn(null);
when(() => mockApi.fetchOrder('1')).thenAnswer((_) async => order);
when(() => mockCache.put(order)).thenReturn(null);
final result = await repository.getOrder('1');
expect(result, equals(order));
verify(() => mockCache.put(order)).called(1);
});
}
Testing ChangeNotifier / ViewModel
ChangeNotifier-based ViewModels can be tested without widgets by listening for notifyListeners calls.
// lib/features/cart/presentation/cart_viewmodel.dart
class CartViewModel extends ChangeNotifier {
final CartRepository _repository;
List<CartItem> _items = [];
List<CartItem> get items => List.unmodifiable(_items);
double get total => _items.fold(0, (sum, item) => sum + item.price);
CartViewModel({required CartRepository repository})
: _repository = repository;
Future<void> loadItems() async {
_items = await _repository.getItems();
notifyListeners();
}
void removeItem(String id) {
_items.removeWhere((item) => item.id == id);
notifyListeners();
}
}
// test/features/cart/presentation/cart_viewmodel_test.dart
void main() {
late CartViewModel viewModel;
late MockCartRepository mockRepo;
setUp(() {
mockRepo = MockCartRepository();
viewModel = CartViewModel(repository: mockRepo);
});
test('should load items and notify listeners', () async {
final items = [CartItem(id: '1', name: 'Widget', price: 9.99)];
when(() => mockRepo.getItems()).thenAnswer((_) async => items);
var notified = false;
viewModel.addListener(() => notified = true);
await viewModel.loadItems();
expect(viewModel.items, equals(items));
expect(notified, isTrue);
});
test('should calculate total correctly', () async {
final items = [
CartItem(id: '1', name: 'A', price: 10.0),
CartItem(id: '2', name: 'B', price: 20.0),
];
when(() => mockRepo.getItems()).thenAnswer((_) async => items);
await viewModel.loadItems();
expect(viewModel.total, equals(30.0));
});
}
Riverpod Testing with ProviderContainer
Riverpod providers can be tested without any widget tree by using ProviderContainer directly. Override dependencies by passing them to the container.
// Provider under test
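// apiProvider and orderCacheProvider are assumed to be defined elsewhere.
final orderRepositoryProvider = Provider<OrderRepository>((ref) {
// ref.watch (not ref.read) so the repository is rebuilt if a dependency changes
return OrderRepository(
api: ref.watch(apiProvider),
cache: ref.watch(orderCacheProvider),
);
});
final orderProvider = FutureProvider.family<Order, String>((ref, id) {
return ref.watch(orderRepositoryProvider).getOrder(id);
});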
// Test
void main() {
test('orderProvider returns order from repository', () async {
final mockRepo = MockOrderRepository();
final order = Order(id: '1', total: 50.0);
when(() => mockRepo.getOrder('1')).thenAnswer((_) async => order);
final container = ProviderContainer(
overrides: [
orderRepositoryProvider.overrideWithValue(mockRepo),
],
);
addTearDown(container.dispose);
// Wait for the FutureProvider to resolve
final result = await container.read(orderProvider('1').future);
expect(result, equals(order));
});
}
Always call container.dispose() in teardown. Leaked containers mean leaked subscriptions. Use addTearDown(container.dispose) immediately after creation to ensure cleanup even when assertions fail.
Bloc Testing with bloc_test
The bloc_test package provides blocTest — a declarative way to test event-to-state transitions.
import 'package:bloc_test/bloc_test.dart';
import 'package:test/test.dart';
void main() {
group('AuthBloc', () {
late MockAuthRepository mockAuthRepo;
setUp(() {
mockAuthRepo = MockAuthRepository();
});
blocTest<AuthBloc, AuthState>(
'emits [loading, authenticated] when login succeeds',
build: () {
when(() => mockAuthRepo.login('user', 'pass'))
.thenAnswer((_) async => User(name: 'user'));
return AuthBloc(authRepository: mockAuthRepo);
},
act: (bloc) => bloc.add(LoginRequested('user', 'pass')),
expect: () => [
AuthState.loading(),
AuthState.authenticated(User(name: 'user')),
],
);
blocTest<AuthBloc, AuthState>(
'emits [loading, error] when login fails',
build: () {
when(() => mockAuthRepo.login(any(), any()))
.thenThrow(AuthException('Invalid credentials'));
return AuthBloc(authRepository: mockAuthRepo);
},
act: (bloc) => bloc.add(LoginRequested('user', 'wrong')),
expect: () => [
AuthState.loading(),
AuthState.error('Invalid credentials'),
],
);
blocTest<AuthBloc, AuthState>(
'starts from seeded state',
seed: () => AuthState.authenticated(User(name: 'existing')),
build: () => AuthBloc(authRepository: mockAuthRepo),
act: (bloc) => bloc.add(LogoutRequested()),
expect: () => [AuthState.unauthenticated()],
);
});
}
Async Testing Patterns
Dart's test framework has first-class support for Futures and Streams.
void main() {
test('stream emits values in order', () {
final controller = StreamController<int>();
expectLater(
controller.stream,
emitsInOrder([1, 2, 3, emitsDone]),
);
controller
..add(1)
..add(2)
..add(3)
..close();
});
test('future completes with value', () {
expect(
Future.delayed(Duration(milliseconds: 10), () => 42),
completion(equals(42)),
);
});
test('future throws expected error', () {
expect(
Future.error(ArgumentError('bad')),
throwsA(isA<ArgumentError>()),
);
});
test('stream emits error then recovers', () {
final stream = Stream.fromIterable([1, 2])
.map((i) => i == 2 ? throw FormatException() : i)
.handleError((_) {});
expectLater(
stream,
emitsInOrder([1, emitsDone]),
);
});
}
Mocking with mockito and mocktail
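Both packages provide a similar core API (when, verify, argument matchers). The main differences: mocktail wraps the stubbed call in a closure, as in when(() => mock.method()), and needs no code generation, whereas mockito generates its null-safe mocks with build_runner. mocktail is preferred in many modern Flutter projects for that reason.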
import 'package:mocktail/mocktail.dart';
class MockUserApi extends Mock implements UserApi {}
void main() {
late MockUserApi mockApi;
setUpAll(() {
// Register fallback values for argument matchers with custom types
registerFallbackValue(CreateUserRequest(name: '', email: ''));
});
setUp(() {
mockApi = MockUserApi();
});
test('verify interaction with argument matchers', () async {
when(() => mockApi.createUser(any()))
.thenAnswer((_) async => User(id: '1', name: 'Test'));
final service = UserService(api: mockApi);
await service.register('Test', 'test@example.com');
verify(() => mockApi.createUser(
any(that: isA<CreateUserRequest>()
.having((r) => r.name, 'name', 'Test')
.having((r) => r.email, 'email', 'test@example.com')),
)).called(1);
});
test('stub sequential responses', () async {
var callCount = 0;
when(() => mockApi.fetchUser('1')).thenAnswer((_) async {
callCount++;
if (callCount == 1) throw NetworkException();
return User(id: '1', name: 'Test');
});
final service = UserService(api: mockApi);
// Await the failing call so the second call runs only after the failure is observed
await expectLater(service.getUser('1'), throwsA(isA<NetworkException>()));
final user = await service.getUser('1');
expect(user.name, equals('Test'));
});
}
Faking vs Mocking
| Aspect | Mock | Fake |
|---|---|---|
| Definition | Auto-generated stub with when/verify | Hand-written implementation of the interface |
| Best for | Verifying interactions, quick setup | Complex stateful behavior, realistic simulation |
| Maintenance | Low per test, but when chains get noisy | Higher upfront, lower long-term for complex deps |
| Danger | Over-specifying interactions | Drift from real implementation |
Decision tree: if the dependency is simple and you mainly care about "was it called correctly?" — mock it. If the dependency has complex state transitions (an in-memory database, a navigation stack) — write a fake. If in doubt, start with a mock; refactor to a fake when the mock setup becomes painful.
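For the fake branch of that decision, the drift risk listed in the table can be limited by running the same behavioral checks against both the fake and the real implementation. The following is a minimal sketch under stated assumptions: `KeyValueCache`, `InMemoryCache`, and `contractViolations` are hypothetical names invented for this illustration, not part of any package.

```dart
/// Minimal cache interface for the sketch (hypothetical).
abstract class KeyValueCache {
  String? get(String key);
  void put(String key, String value);
  void clear();
}

/// Hand-written fake backed by a map.
class InMemoryCache implements KeyValueCache {
  final _store = <String, String>{};

  @override
  String? get(String key) => _store[key];

  @override
  void put(String key, String value) => _store[key] = value;

  @override
  void clear() => _store.clear();
}

/// Runs the same checks against any implementation and returns a
/// description of each violated expectation. An empty list means the
/// implementation conforms to the contract exercised here.
List<String> contractViolations(KeyValueCache Function() create) {
  final violations = <String>[];
  final cache = create();
  if (cache.get('missing') != null) {
    violations.add('get on an empty cache must return null');
  }
  cache.put('k', 'v1');
  if (cache.get('k') != 'v1') {
    violations.add('get must return the value just put');
  }
  cache.put('k', 'v2');
  if (cache.get('k') != 'v2') {
    violations.add('put must overwrite an existing key');
  }
  cache.clear();
  if (cache.get('k') != null) {
    violations.add('clear must remove all entries');
  }
  return violations;
}
```

Calling `contractViolations(() => InMemoryCache())` in a unit test and the same function with a factory for the real cache in a slower suite keeps the two from silently diverging.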
// Fake example — simulates real cache behavior
class FakeOrderCache implements OrderCache {
final _store = <String, Order>{};
@override
Order? get(String id) => _store[id];
@override
void put(Order order) => _store[order.id] = order;
@override
void clear() => _store.clear();
}
Widget Testing
Widget tests render real widgets in a headless test environment. They exercise layout, rendering, interaction, and state management without an emulator. This is Flutter's biggest testing advantage over web frameworks.
testWidgets and WidgetTester
Every widget test uses testWidgets, which provides a WidgetTester for controlling the widget lifecycle.
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
void main() {
testWidgets('CounterPage increments on tap', (tester) async {
await tester.pumpWidget(
const MaterialApp(home: CounterPage()),
);
// Initial state
expect(find.text('0'), findsOneWidget);
// Tap the button
await tester.tap(find.byIcon(Icons.add));
await tester.pump(); // Rebuild after setState
// Verify new state
expect(find.text('1'), findsOneWidget);
});
}
Key WidgetTester methods:
| Method | When to use |
|---|---|
| pumpWidget(widget) | First render of the widget tree |
| pump() | Trigger a single frame rebuild |
| pump(Duration) | Advance clock by duration and rebuild |
| pumpAndSettle() | Pump until no more frames are scheduled (animations complete) |
| pumpAndSettle(interval, phase, timeout) | Same, with an explicit timeout to avoid long hangs; the first positional argument is the pump interval, not the timeout |
Avoid pumpAndSettle when the widget has infinite animations (progress indicators, shimmer effects). It will time out. Use pump(Duration) instead to advance a known amount of time.
Finding Widgets
The find object provides finders for locating widgets in the rendered tree.
// By type
find.byType(ElevatedButton)
// By Key — the most reliable finder for testing
find.byKey(const Key('submit-button'))
// By text content
find.text('Submit')
// By icon
find.byIcon(Icons.add)
// By widget predicate — for complex conditions
find.byWidgetPredicate(
// Text.data is null for Text.rich, so avoid the null-assertion operator
(widget) => widget is Text && (widget.data?.startsWith('Error') ?? false),
)
// Descendant — find a Text inside a specific Card
find.descendant(
of: find.byKey(const Key('order-card')),
matching: find.text('\$99.99'),
)
Interaction Testing
testWidgets('login form validates and submits', (tester) async {
await tester.pumpWidget(MaterialApp(home: LoginPage()));
// Enter text
await tester.enterText(find.byKey(const Key('email-field')), 'user@test.com');
await tester.enterText(find.byKey(const Key('password-field')), 'secret');
// Tap submit
await tester.tap(find.byKey(const Key('login-button')));
await tester.pumpAndSettle();
// Verify navigation occurred
expect(find.byType(HomePage), findsOneWidget);
});
testWidgets('swipe to dismiss removes item', (tester) async {
await tester.pumpWidget(MaterialApp(home: ItemListPage()));
// Swipe the first item left
await tester.drag(find.text('Item 1'), const Offset(-500, 0));
await tester.pumpAndSettle();
expect(find.text('Item 1'), findsNothing);
});
testWidgets('long press shows context menu', (tester) async {
await tester.pumpWidget(MaterialApp(home: NotesPage()));
await tester.longPress(find.text('My Note'));
await tester.pumpAndSettle();
expect(find.text('Delete'), findsOneWidget);
expect(find.text('Share'), findsOneWidget);
});
Testing with Provider / Riverpod Overrides
Widget tests need to supply dependencies. Wrap the widget under test with appropriate providers.
// Riverpod widget test with overrides
testWidgets('OrderPage shows order details', (tester) async {
final mockRepo = MockOrderRepository();
final order = Order(id: '1', title: 'Test Order', total: 42.0);
when(() => mockRepo.getOrder('1')).thenAnswer((_) async => order);
await tester.pumpWidget(
ProviderScope(
overrides: [
orderRepositoryProvider.overrideWithValue(mockRepo),
],
child: const MaterialApp(home: OrderPage(orderId: '1')),
),
);
await tester.pumpAndSettle();
expect(find.text('Test Order'), findsOneWidget);
expect(find.text('\$42.00'), findsOneWidget);
});
Testing Navigation
Mock NavigatorObserver to verify route transitions. A stub route stands in for the real destination page, so the test does not depend on that page's own dependencies.
class MockNavigatorObserver extends Mock implements NavigatorObserver {}
// mocktail needs a fallback value before any() can match a Route argument
class FakeRoute extends Fake implements Route<dynamic> {}
testWidgets('tapping order navigates to detail page', (tester) async {
registerFallbackValue(FakeRoute());
final observer = MockNavigatorObserver();
await tester.pumpWidget(
MaterialApp(
home: OrderListPage(),
navigatorObservers: [observer],
routes: {
'/order': (_) => const Scaffold(body: Text('Detail')),
},
),
);
// Discard the didPush recorded for the initial route
clearInteractions(observer);
await tester.tap(find.text('Order #1'));
await tester.pumpAndSettle();
verify(() => observer.didPush(any(), any())).called(1);
expect(find.text('Detail'), findsOneWidget);
});
Testing Async Widgets
Widgets backed by FutureBuilder or StreamBuilder need careful pumping to move through loading, data, and error states.
testWidgets('UserProfile shows loading then data', (tester) async {
final completer = Completer<User>();
await tester.pumpWidget(
MaterialApp(
home: UserProfile(userFuture: completer.future),
),
);
// Loading state
expect(find.byType(CircularProgressIndicator), findsOneWidget);
// Resolve the future
completer.complete(User(name: 'Alice'));
await tester.pumpAndSettle();
// Data state
expect(find.text('Alice'), findsOneWidget);
expect(find.byType(CircularProgressIndicator), findsNothing);
});
MediaQuery / Theme / Locale Wrappers
Production widgets depend on MediaQuery, Theme, and Localizations. Create a shared test wrapper to avoid boilerplate.
// test/helpers/test_app.dart
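Widget buildTestApp(
Widget child, {
Size screenSize = const Size(390, 844), // iPhone 14
Brightness brightness = Brightness.light,
Locale locale = const Locale('en'),
}) {
return MaterialApp(
theme: brightness == Brightness.light ? lightTheme : darkTheme,
locale: locale,
localizationsDelegates: AppLocalizations.localizationsDelegates,
supportedLocales: AppLocalizations.supportedLocales,
// MaterialApp installs its own MediaQuery, so the size override has to sit
// below it; a MediaQuery wrapped around MaterialApp would have its size replaced.
// This changes what MediaQuery.of(context).size reports. The layout surface
// itself is sized separately (tester.view.physicalSize on recent Flutter versions).
builder: (context, appChild) => MediaQuery(
data: MediaQuery.of(context).copyWith(size: screenSize),
child: appChild!,
),
home: child,
);
}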
// Usage in tests
testWidgets('renders correctly on small screen', (tester) async {
await tester.pumpWidget(
buildTestApp(
const ProductCard(product: testProduct),
screenSize: const Size(320, 568), // iPhone SE
),
);
// assertions...
});
Custom Matchers
Flutter's test framework provides several widget-specific matchers. Combine them with isA<T> for type-safe assertions.
// Built-in matchers
expect(find.byType(AppBar), findsOneWidget);
expect(find.text('Deleted'), findsNothing);
expect(find.byType(ListTile), findsNWidgets(3));
expect(find.byType(ListTile), findsAtLeast(1));
// Type-checking with property matchers
expect(
tester.widget<Text>(find.byKey(const Key('price'))),
isA<Text>()
.having((t) => t.data, 'data', '\$42.00')
.having((t) => t.style?.color, 'color', Colors.green),
);
Golden Testing
Golden tests capture a screenshot of a widget and compare it against a saved reference image. They catch visual regressions that unit and widget tests miss — wrong colors, misaligned layouts, missing icons.
When Golden Tests Are Worth It
Golden tests have the highest maintenance cost of any Flutter test type. Use them selectively:
| Good candidates | Poor candidates |
|---|---|
| Design system components (buttons, cards, inputs) | Screens with dynamic data |
| Brand-critical UI (logo, onboarding) | Lists with variable-length content |
| Complex custom painting (charts, graphs) | Widgets that change frequently during development |
Setting Up Golden Files
testWidgets('PrimaryButton matches golden', (tester) async {
await tester.pumpWidget(
MaterialApp(
theme: appTheme,
home: Scaffold(
body: Center(
child: PrimaryButton(
label: 'Continue',
onPressed: () {},
),
),
),
),
);
await expectLater(
find.byType(PrimaryButton),
matchesGoldenFile('goldens/primary_button.png'),
);
});
Updating Goldens
When you intentionally change a widget's appearance, update the reference images:
flutter test --update-goldens
# Or for a specific file:
flutter test --update-goldens test/widgets/primary_button_test.dart
Review the diff in your image diff tool or PR review before committing. Blind updates defeat the purpose of golden tests.
Multi-Platform Golden Divergence
Golden files are pixel-sensitive, and font rendering differs between macOS, Linux, and Windows. This causes goldens generated on one OS to fail on another.
Strategies:
- Generate goldens in CI only. Run --update-goldens in a dedicated CI step on a fixed OS (typically Linux). Developers run the comparison tests locally, but only CI is authoritative.
- Use a font that renders identically cross-platform. Load a bundled test font (e.g., Roboto from the google_fonts package) to reduce divergence.
- Tolerance threshold. Some golden packages allow pixel-diff tolerances.
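To illustrate the tolerance idea, here is a minimal pure-Dart sketch of a pixel-diff check. It is not the API of any golden package: `diffRatio`, `withinTolerance`, and the 0.5% default are hypothetical names and values, and the inputs are assumed to be decoded RGBA byte buffers with identical dimensions. In a Flutter project the same decision would typically live in a custom GoldenFileComparator assigned to goldenFileComparator.

```dart
/// Fraction of pixels that differ between two RGBA buffers
/// (4 bytes per pixel). Hypothetical helper; assumes both images were
/// already decoded to raw RGBA bytes.
double diffRatio(List<int> actual, List<int> golden) {
  if (actual.length != golden.length || actual.length % 4 != 0) {
    throw ArgumentError('Buffers must be RGBA with identical dimensions');
  }
  final pixelCount = actual.length ~/ 4;
  if (pixelCount == 0) return 0.0;
  var differing = 0;
  for (var p = 0; p < pixelCount; p++) {
    final offset = p * 4;
    // A pixel counts as different if any of its four channels differs.
    for (var c = 0; c < 4; c++) {
      if (actual[offset + c] != golden[offset + c]) {
        differing++;
        break;
      }
    }
  }
  return differing / pixelCount;
}

/// True when the share of differing pixels is at or below [tolerance]
/// (0.005 means up to 0.5% of pixels may differ).
bool withinTolerance(
  List<int> actual,
  List<int> golden, {
  double tolerance = 0.005,
}) {
  return diffRatio(actual, golden) <= tolerance;
}
```

Keep the tolerance small. A loose threshold hides exactly the regressions that goldens exist to catch.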
Font Loading in Tests
By default, Flutter tests render text with a placeholder test font (Ahem, or FlutterTest on newer SDKs) that draws glyphs as boxes. To render real text in goldens, load your app fonts:
// test/helpers/golden_helpers.dart
Future<void> loadAppFonts() async {
final fontData = rootBundle.load('assets/fonts/Roboto-Regular.ttf');
final fontLoader = FontLoader('Roboto')..addFont(fontData);
await fontLoader.load();
}
// In the test
void main() {
setUpAll(() async {
await loadAppFonts();
});
testWidgets('renders with real fonts', (tester) async {
// ...
});
}
CI Considerations
Font rendering differences are the most common cause of golden test failures in CI. Mitigations:
- Pin the CI runner OS and version (e.g., ubuntu-22.04, not ubuntu-latest).
- Include fonts in the repo rather than downloading at test time.
- Run goldens as a separate CI job that can be retried independently.
- Consider the golden_toolkit package for multi-device screenshot generation.
golden_toolkit
The golden_toolkit package simplifies multi-scenario golden testing:
import 'package:golden_toolkit/golden_toolkit.dart';
void main() {
testGoldens('ProductCard renders across devices', (tester) async {
final builder = DeviceBuilder()
..overrideDevicesForAllScenarios(devices: [
Device.phone,
Device.iphone11,
Device.tabletPortrait,
])
..addScenario(
widget: const ProductCard(product: sampleProduct),
name: 'default',
)
..addScenario(
widget: const ProductCard(product: sampleProduct, isOnSale: true),
name: 'on_sale',
);
await tester.pumpDeviceBuilder(builder);
await screenMatchesGolden(tester, 'product_card_multi_device');
});
}
Integration Testing
Integration tests run on a real device or emulator and exercise the full app stack — real rendering, real navigation, real animations, and optionally real network calls.
Setup
Add the integration_test package. It ships with the Flutter SDK, so it is declared as an SDK dependency rather than fetched from pub.dev:
# pubspec.yaml
dev_dependencies:
  integration_test:
    sdk: flutter
  flutter_test:
    sdk: flutter
Create the test entry point:
// integration_test/app_test.dart
import 'package:flutter/material.dart'; // for Key
import 'package:flutter_test/flutter_test.dart';
import 'package:integration_test/integration_test.dart';
import 'package:myapp/main.dart' as app;
void main() {
IntegrationTestWidgetsFlutterBinding.ensureInitialized();
testWidgets('full login flow', (tester) async {
app.main();
await tester.pumpAndSettle();
// Enter credentials
await tester.enterText(find.byKey(const Key('email')), 'test@test.com');
await tester.enterText(find.byKey(const Key('password')), 'password123');
await tester.tap(find.byKey(const Key('login-button')));
// Default settle; a single positional Duration here would set the pump interval, not a timeout
await tester.pumpAndSettle();
// Verify we landed on the home page
expect(find.text('Welcome back'), findsOneWidget);
});
}
Run with:
flutter test integration_test/app_test.dart
# On a specific device:
flutter test integration_test/app_test.dart -d <device-id>
Real vs Mock Backends
| Approach | Pros | Cons |
|---|---|---|
| Real backend | Tests the full stack, catches API contract bugs | Slow, flaky, needs test data management |
| Mock backend | Fast, deterministic, runs offline | Misses real integration bugs |
| Recorded responses (VCR-style) | Deterministic + realistic data | Recordings go stale |
For CI, use a mock backend or recorded responses. Run against the real staging backend in a nightly job, not on every PR.
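One way to wire this up is to select the backend at build time. The sketch below assumes a hypothetical `BACKEND` compile-time variable passed with --dart-define, and the base URLs are placeholders; none of these names come from a real project.

```dart
/// Which backend the integration tests talk to.
enum BackendMode { mock, recorded, staging }

/// Parses a mode name. Unknown or empty values fall back to the
/// deterministic mock backend, so PR builds never reach a real server
/// by accident.
BackendMode parseBackendMode(String raw) {
  final normalized = raw.trim().toLowerCase();
  for (final mode in BackendMode.values) {
    if (mode.name == normalized) return mode;
  }
  return BackendMode.mock;
}

/// Placeholder base URLs (assumptions for illustration only).
const _baseUrls = <BackendMode, String>{
  BackendMode.mock: 'http://localhost:8080',
  BackendMode.recorded: 'http://localhost:8081',
  BackendMode.staging: 'https://staging.example.com',
};

/// Base URL for the selected backend.
String baseUrlFor(BackendMode mode) => _baseUrls[mode]!;

/// Read once at startup. `BACKEND` is a hypothetical variable name, set
/// with e.g. `--dart-define=BACKEND=staging` in the nightly job.
const backendName = String.fromEnvironment('BACKEND', defaultValue: 'mock');
final backendMode = parseBackendMode(backendName);
```

With this shape, the PR pipeline needs no extra flags, and only the nightly job opts in to the staging backend.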
Running on Real Devices vs Emulators
- Emulators/simulators: faster to spin up, adequate for most integration tests. Use them in CI.
- Real devices: needed for performance testing, camera/GPS/Bluetooth tests, and final pre-release validation.
- Firebase Test Lab: runs integration tests on a matrix of real devices in Google's lab. Configure via gcloud firebase test android run.
patrol Package
The patrol package extends Flutter integration tests with native interaction capabilities — handling permission dialogs, system alerts, notifications, and WebView interactions that integration_test cannot reach.
// integration_test/permissions_test.dart
import 'package:patrol/patrol.dart';
void main() {
patrolTest('grants camera permission and takes photo', ($) async {
await $.pumpWidgetAndSettle(const MyApp());
await $.tap(find.text('Take Photo'));
// Handle the native permission dialog
await $.native.grantPermissionWhenInUse();
await $.pumpAndSettle();
expect(find.byType(PhotoPreview), findsOneWidget);
});
}
CI Setup for Integration Tests
Integration tests in CI require a running emulator or a device farm.
GitHub Actions with Android emulator:
jobs:
  integration_test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: subosito/flutter-action@v2
      - name: Start emulator
        uses: reactivecircus/android-emulator-runner@v2
        with:
          api-level: 33
          script: flutter test integration_test/
Firebase Test Lab:
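# Build the app APK and the instrumentation test APK
# (assumes the Android instrumentation runner from the integration_test setup guide is in place)
flutter build apk --debug
pushd android
./gradlew app:assembleAndroidTest
./gradlew app:assembleDebug -Ptarget=integration_test/app_test.dart
popd
# Run on Test Lab (list valid device IDs with: gcloud firebase test android models list)
gcloud firebase test android run \
--type instrumentation \
--app build/app/outputs/apk/debug/app-debug.apk \
--test build/app/outputs/apk/androidTest/debug/app-debug-androidTest.apk \
--device model=Pixel6,version=33
Mock Strategies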
Platform Channel Mocking
Platform channels (camera, battery, file picker) need explicit mocking in tests. Use TestDefaultBinaryMessengerBinding to intercept method channel calls.
testWidgets('shows battery level', (tester) async {
// Mock the platform channel
TestDefaultBinaryMessengerBinding.instance.defaultBinaryMessenger
.setMockMethodCallHandler(
const MethodChannel('plugins.flutter.io/battery'),
(call) async {
if (call.method == 'getBatteryLevel') return 72;
return null;
},
);
await tester.pumpWidget(const MaterialApp(home: BatteryPage()));
await tester.pumpAndSettle();
expect(find.text('72%'), findsOneWidget);
});
HTTP Mocking
For Dio-based HTTP clients, use http_mock_adapter:
import 'package:dio/dio.dart';
import 'package:http_mock_adapter/http_mock_adapter.dart';
void main() {
late Dio dio;
late DioAdapter adapter;
setUp(() {
dio = Dio();
adapter = DioAdapter(dio: dio);
});
test('fetches user from API', () async {
adapter.onGet(
'/users/1',
(server) => server.reply(200, {'id': '1', 'name': 'Alice'}),
);
final service = UserService(dio: dio);
final user = await service.fetchUser('1');
expect(user.name, equals('Alice'));
});
test('handles 404', () async {
adapter.onGet(
'/users/999',
(server) => server.reply(404, {'error': 'Not found'}),
);
final service = UserService(dio: dio);
expect(
() => service.fetchUser('999'),
throwsA(isA<UserNotFoundException>()),
);
});
}
Mocking SharedPreferences
import 'package:shared_preferences/shared_preferences.dart';
void main() {
test('reads saved theme preference', () async {
// Set initial values before any code runs
SharedPreferences.setMockInitialValues({'theme': 'dark'});
final prefs = await SharedPreferences.getInstance();
final service = ThemeService(prefs: prefs);
expect(service.currentTheme, equals('dark'));
});
}
Mocking Time
Time-dependent code (token expiry, cache staleness, rate limiting) should use an injectable clock rather than DateTime.now().
import 'package:clock/clock.dart';
class TokenStore {
final Clock _clock;
TokenStore({Clock? clock}) : _clock = clock ?? const Clock();
bool isExpired(Token token) {
return _clock.now().isAfter(token.expiresAt);
}
}
// Test
void main() {
test('detects expired token', () {
final fixedTime = DateTime(2025, 6, 15, 12, 0);
final clock = Clock.fixed(fixedTime);
final store = TokenStore(clock: clock);
final expired = Token(expiresAt: DateTime(2025, 6, 15, 11, 0));
final valid = Token(expiresAt: DateTime(2025, 6, 15, 13, 0));
expect(store.isExpired(expired), isTrue);
expect(store.isExpired(valid), isFalse);
});
}
When NOT to Mock
Mocking is not always the right answer. Do not mock when:
- The real thing is fast and deterministic. An in-memory SQLite database or a pure Dart utility does not need mocking.
- The mock would replicate the implementation. If your mock's when chain mirrors the source code line-for-line, the test proves nothing.
- You are testing the integration itself. If the point of the test is "does my code work with SharedPreferences?", mocking SharedPreferences defeats the purpose.
Testing Architecture
What to Test vs What Not to Test
Not all code deserves a test. Prioritize by ROI:
| High ROI (test these) | Low ROI (skip or test lightly) |
|---|---|
| Business logic, validators, formatters | Trivial getters/setters |
| State transitions (Bloc events, Notifier methods) | Framework boilerplate (MaterialApp setup) |
| Error handling paths | Auto-generated code (freezed, json_serializable) |
| Complex widget interaction flows | Simple pass-through widgets |
| Edge cases from bug reports | Styling (unless golden-tested) |
Test Organization
Two viable strategies:
Mirror lib/ structure (recommended for large projects):
test/
├── features/
│   └── order/
│       ├── data/
│       │   └── order_repository_test.dart
│       └── presentation/
│           ├── order_page_test.dart
│           └── order_viewmodel_test.dart
├── core/
│   └── utils/
│       └── email_validator_test.dart
└── helpers/
    ├── test_app.dart
    └── mocks.dart
Flat test/ (acceptable for small projects): all test files at the top level. Stops scaling past about 30 tests.
Code Coverage Strategy
Coverage is a useful signal but a terrible target. The goal is not 100% coverage — the goal is confidence that important paths work.
# Generate coverage
flutter test --coverage
# Generate HTML report (requires lcov)
genhtml coverage/lcov.info -o coverage/html
Set a coverage floor (e.g., 70%) in CI to catch large drops, but do not set it to 100%. Chasing 100% leads to tests that verify trivial code and make refactoring expensive.
Coverage tells you what is NOT tested, not what IS tested well. A line covered by a test that never asserts anything useful shows as "covered" but provides zero confidence. Read coverage reports to find blind spots, not to celebrate green numbers.
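A coverage floor can be enforced with a small script over coverage/lcov.info. The sketch below is one possible approach, not a standard tool: `lineCoveragePercent` and `meetsFloor` are hypothetical helpers, and they rely only on the `DA:<line>,<hits>` records of the lcov tracefile format.

```dart
/// Line coverage as a percentage, computed from the `DA:<line>,<hits>`
/// records of an lcov tracefile. Returns 100 when there are no records.
double lineCoveragePercent(String lcov) {
  var found = 0;
  var hit = 0;
  for (final rawLine in lcov.split('\n')) {
    final line = rawLine.trim();
    if (!line.startsWith('DA:')) continue;
    final fields = line.substring(3).split(',');
    if (fields.length < 2) continue; // malformed record, skip it
    final hits = int.tryParse(fields[1]);
    if (hits == null) continue;
    found++;
    if (hits > 0) hit++;
  }
  return found == 0 ? 100.0 : hit * 100 / found;
}

/// True when coverage is at or above [floorPercent]. The default of 70
/// mirrors the example floor mentioned in the text.
bool meetsFloor(String lcov, {double floorPercent = 70}) =>
    lineCoveragePercent(lcov) >= floorPercent;
```

A CI step could read the file with `File('coverage/lcov.info').readAsStringSync()` from dart:io and exit with a non-zero code when `meetsFloor` returns false.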
Test Naming Conventions
Consistent names make test output scannable. The should_X_when_Y pattern reads well in failure reports:
group('OrderRepository', () {
test('should return cached order when cache hit', () { ... });
test('should fetch from API when cache miss', () { ... });
test('should throw OrderNotFoundException when API returns 404', () { ... });
});
Alternative: sentence-style names that describe the behavior:
test('returns cached order on cache hit', () { ... });
test('fetches from API on cache miss', () { ... });
Pick one convention and enforce it in code review. The format matters less than consistency.
Shared Test Utilities
Centralize repeated test setup in helper files.
// test/helpers/mocks.dart
class MockOrderRepository extends Mock implements OrderRepository {}
class MockAuthService extends Mock implements AuthService {}
class MockNavigatorObserver extends Mock implements NavigatorObserver {}
// test/helpers/builders.dart — object mothers for test data
class OrderBuilder {
String _id = '1';
double _total = 99.99;
OrderStatus _status = OrderStatus.pending;
OrderBuilder withId(String id) { _id = id; return this; }
OrderBuilder withTotal(double total) { _total = total; return this; }
OrderBuilder withStatus(OrderStatus status) { _status = status; return this; }
Order build() => Order(id: _id, total: _total, status: _status);
}
// Usage
final order = OrderBuilder().withStatus(OrderStatus.shipped).build();
Anti-Patterns
Testing Implementation Details
Tests that assert on the exact widget tree structure break every time you refactor the UI, even when behavior is unchanged.
// Anti-pattern: testing widget tree structure
expect(find.byType(Padding), findsNWidgets(3)); // breaks if you add padding
expect(find.byType(Column), findsOneWidget); // breaks if you switch to Row
// Better: test observable behavior
expect(find.text('Order #1'), findsOneWidget);
expect(find.byKey(const Key('submit')), findsOneWidget);
Over-Mocking
Mocking every dependency creates tests that verify your mock wiring, not your actual code.
// Anti-pattern: mocking what you own for no reason
when(() => mockFormatter.format(42.0)).thenReturn('\$42.00');
expect(service.getDisplayPrice(42.0), equals('\$42.00'));
// This test proves nothing — it tests that the mock returns what you told it to.
// Better: use the real formatter
final service = PriceService(formatter: CurrencyFormatter());
expect(service.getDisplayPrice(42.0), equals('\$42.00'));
Flaky Tests from Timing
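pumpAndSettle keeps pumping frames until none are scheduled. If a widget runs an infinite animation (loading spinner, shimmer), frames never stop, so pumpAndSettle throws once its timeout elapses. The failure looks intermittent when the spinner is only sometimes on screen at that point.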
// Flaky: pumpAndSettle with infinite animation
await tester.tap(find.text('Load'));
await tester.pumpAndSettle(); // times out if loading spinner is showing
// Stable: pump a known duration
await tester.tap(find.text('Load'));
await tester.pump(const Duration(milliseconds: 500));
expect(find.byType(CircularProgressIndicator), findsOneWidget);
Golden Test Churn
Golden tests that cover entire screens break on every minor UI change — new spacing, updated copy, different icon. This trains the team to blindly update goldens, which defeats the purpose.
Limit golden tests to isolated, stable components (design system atoms, brand-critical visuals). Do not golden-test full pages unless they are truly static.
Integration Tests That Test Everything
A single integration test that walks through the entire app is a "mega-test." It is slow, brittle, and when it fails you have no idea what broke.
// Anti-pattern: mega-test
testWidgets('the entire app works', (tester) async {
app.main();
// 200 lines of taps, scrolls, and assertions...
});
// Better: focused integration tests
testWidgets('login flow succeeds with valid credentials', ...);
testWidgets('checkout flow charges correct total', ...);
testWidgets('search returns and displays results', ...);
Each integration test should cover one user journey. If it takes more than 30 seconds to run, it is probably testing too much.
The test suite is a product. It needs maintenance, refactoring, and pruning just like production code. A flaky test that the team ignores is worse than no test — it trains everyone to dismiss test failures. Delete or fix flaky tests immediately; never leave them red.