bevy

name: bevy
description: Tips for working with a Bevy application

Tips for Bevy Applications

Bevy is an Entity Component System game engine built in Rust. https://github.com/bevyengine/bevy | Apache-2.0 and MIT licensed

Bevy Performance Tips

Short patterns to help the Elodin Editor stay responsive.

(This doc was written with Elodin targeting Bevy 0.17.)

Profiling Bevy systems with Tracy

Before trying to optimize anything, measure its performance first and find the hot spots. You might rewrite a section of code to be zero-allocation and clone-free, only to find the impact is negligible because the code is called once. See the Elodin Tracy skill document for how to prepare and run the profiler with Elodin.

Use `Local<T>` for heap-backed, per-system state

Local<T> is system-local storage: it is not a global Resource, but it persists across invocations of the same system instance. Use it for HashMap, HashSet, Vec, and other heap allocations that are private to one system and should not clutter the world.

Before

Fresh allocation every frame.

fn collect_targets(mut commands: Commands, candidates: Query<Entity, With<Target>>) {
    let mut buf = Vec::new(); // This allocates on every run.
    for e in &candidates {
        buf.push(e); // Query<Entity> yields Entity by value, so no deref is needed.
    }
    // ...
}

After

Reuse the same heap storage, system-private. If it cannot be system-private, create a Resource.

fn collect_targets(
    mut commands: Commands,
    candidates: Query<Entity, With<Target>>,
    mut buf: Local<Vec<Entity>>,
) {
    buf.clear();
    buf.extend(candidates.iter());
    // ...
}

Note: Vec::new() does not allocate, but the first insertion does. Thus a Vec on a seldom-executed branch can be left as-is.
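The note above can be checked with the standard library alone. The following is a minimal sketch, not Bevy code, and the function names are made up for illustration. It shows that a fresh Vec has no heap capacity and that clear() keeps whatever capacity was already acquired, which is what makes a reused Local<Vec<T>> cheap.

```rust
/// Capacity of a freshly created Vec: nothing has been allocated yet.
pub fn fresh_vec_capacity() -> usize {
    let v: Vec<u32> = Vec::new();
    v.capacity()
}

/// Fills a Vec with `n` items, clears it, and reports whether the
/// heap capacity acquired while filling is still there afterwards.
pub fn capacity_survives_clear(n: usize) -> bool {
    let mut v: Vec<usize> = Vec::new();
    v.extend(0..n);
    let before = v.capacity();
    v.clear(); // Drops the items but keeps the allocation.
    v.is_empty() && v.capacity() == before
}
```

Because clear() keeps the allocation, a buffer that is cleared and refilled every frame settles at its high-water mark and stops allocating.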

Beware `format!` in systems

You should also be wary of strings. Any call to format! allocates a string.

fn system_c(windows: Query<Entity, With<Window>>) {
    for id in &windows {
        let s = format!("ID is {}", id);
    }
}

You can use Local<String> in the same way Local<Vec<Entity>> was used above, and write into the reused string:

use std::fmt::Write; // Required for write! to target a String.

fn system_d(windows: Query<Entity, With<Window>>,
            mut s: Local<String>) {
    for id in &windows {
        s.clear();
        let _ = write!(s, "ID is {}", id);
    }
}

But maybe you're not even in a system. In other Rust code you can still minimize your allocations by doing something like this:

fn deep_dark_code(...) {
    let mut s = String::new();
    for id in &windows {
        s.clear();
        let _ = write!(s, "ID is {}", id);
    }
}
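Here is a self-contained variant of the pattern above that can be run without Bevy. It is only a sketch: the function name, the u32 IDs, and the visit callback are illustrative assumptions. It also counts how often the buffer's capacity changes, to make the allocation behavior visible.

```rust
use std::fmt::Write;

/// Formats a label for each id into one reused String and hands it to `visit`.
/// Returns how many iterations changed the buffer's capacity, i.e. how many
/// iterations had to (re)allocate.
pub fn label_all(ids: &[u32], mut visit: impl FnMut(&str)) -> usize {
    let mut s = String::new();
    let mut growths = 0;
    for id in ids {
        s.clear(); // Keeps the allocation from earlier iterations.
        let before = s.capacity();
        let _ = write!(s, "ID is {}", id);
        if s.capacity() != before {
            growths += 1;
        }
        visit(&s);
    }
    growths
}
```

With labels of equal length, only the first iteration grows the buffer; a format! call in the loop would allocate a new String every time.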

Query filters: Limit what is evaluated

QueryFilter types (With, Without, Added, Changed, Or, tuples, etc.) narrow which entities match.

Before

No filter: every entity with Transform is visited every frame, even when nothing moved.

fn sync_world_labels(transforms: Query<(Entity, &Transform)>) {
    for (entity, transform) in &transforms {
        // Update label positions for every entity on every frame.
        update_labels(entity, transform.translation);
    }
}

After

Only entities whose Transform changed this frame.

fn sync_world_labels(transforms: Query<(Entity, &Transform), Changed<Transform>>) {
    for (entity, transform) in &transforms {
        // Update label positions when transform changes.
        update_labels(entity, transform.translation);
    }
}

Note: When you check for Changed<Transform>, be aware that the display position could change due to it being in a scene hierarchy, e.g., its parent's Transform could have changed. If you want to ensure you capture any change of position, no matter where it comes from, use Changed<GlobalTransform> and check the GlobalTransform which will have the display position of the object.

Derived query filters: Bundle up your filter

This is not a performance tip per se, but an ergonomic tip when using query filters.

Before

Long filter tuples repeated at every Query site.

fn system_a(q: Query<Entity, (With<Alive>, With<Player>)>) { /* ... */ }
fn system_b(q: Query<&Name, (With<Alive>, With<Player>)>) { /* ... */ }

After

One derived QueryFilter.

#[derive(QueryFilter)]
struct ActivePlayer {
    alive: With<Alive>,
    player: With<Player>,
}

fn system_a(q: Query<Entity, ActivePlayer>) { /* ... */ }
fn system_b(q: Query<&Name, ActivePlayer>) { /* ... */ }

One-off work: `Commands::run_system_cached` or `run_if`

Use Commands::run_system_cached (or World::run_system_cached) when heavy work should run only on demand (save, import, palette action), not every frame. Bevy reuses cached system state for the same system type, so repeated invocations avoid paying full setup each time.

A common mistake is to write run_cached_system; the API is run_system_cached.

Before

A system on Update that runs every frame; most of the time it immediately returns, but you still pay scheduling and system-param fetch for work that is only needed occasionally.

use bevy::prelude::*;

// This example omits other SystemParams such as queries and resources.
fn save_if_requested(keyboard: Res<ButtonInput<KeyCode>>,
                     // SystemParams required to save.
                     query: Query<&Saveables>,
                     file: Res<SaveFile>,
                     // ...
                     ) {
    if !keyboard.just_pressed(KeyCode::KeyS) {
        return;
    }
    // Rebuild buffers, write files, and so on.
    // This path should run rarely, but the system still runs every frame.
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Update, save_if_requested)
        .run();
}

After with `run_system_cached`

A cheap per-frame system only checks input; heavy systems run only when needed, via run_system_cached.

use bevy::prelude::*;

fn detect_save_shortcut(keyboard: Res<ButtonInput<KeyCode>>, mut commands: Commands) {
    if keyboard.just_pressed(KeyCode::KeyS) {
        commands.run_system_cached(save_to_disk);
    }
}

fn save_to_disk(query: Query<&Saveables>,
                file: Res<SaveFile>,
                // ...
                ) {
    // Heavy work: Flush serialized tiles to disk here.
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Update, detect_save_shortcut)
        .run();
}

After with `run_if`

An even better way to achieve the above is run_if: when its condition returns false, the SystemParams for save_to_disk are not evaluated.

use bevy::input::common_conditions::input_just_pressed;
use bevy::prelude::*;

fn save_to_disk(
    query: Query<&Saveables>,
    file: Res<SaveFile>,
    // ...
) {
    // Heavy work: flush serialized tiles to disk here.
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(
            Update,
            save_to_disk.run_if(input_just_pressed(KeyCode::KeyS)),
        )
        .run();
}

Note: run_if accepts any system that returns a boolean.

fn save_pressed(keys: Res<ButtonInput<KeyCode>>) -> bool {
    keys.just_pressed(KeyCode::KeyS)
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Update, save_to_disk.run_if(save_pressed))
        .run();
}

Slower-than-display-rate work: custom schedules vs frame pacing

Not everything needs to run every frame at display refresh (often ~60 Hz).

Before

Heavy work on every Update tick.

use bevy::prelude::*;

fn expensive_remote_poll() {
    // Network, database, or aggregation work runs more than sixty times per second.
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Update, expensive_remote_poll)
        .run();
}

After

Same system, throttled with on_timer.

use bevy::prelude::*;
use bevy::time::common_conditions::on_timer;
use std::time::Duration;

fn expensive_remote_poll() {
    // This runs at most once per second.
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(
            Update,
            expensive_remote_poll.run_if(on_timer(Duration::from_secs(1))),
        )
        .run();
}

Other options (no full code here):

Dedicated schedule + throttling: Define a ScheduleLabel, register systems on that schedule, and drive it from a lightweight system using Time / Timer with run_if or by calling World::run_schedule when your guard says it is time.
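The guard mentioned above can be pictured as a small accumulator. The following is a std-only sketch of that idea, not Bevy's Timer or on_timer implementation; Throttle, tick, and count_runs are hypothetical names. Each frame feeds in the frame's delta time, and the guard reports true at most once per period.

```rust
use std::time::Duration;

/// Accumulates frame time and fires at most once per `period`.
pub struct Throttle {
    period: Duration,
    elapsed: Duration,
}

impl Throttle {
    pub fn new(period: Duration) -> Self {
        Self { period, elapsed: Duration::ZERO }
    }

    /// Call once per frame with that frame's delta time.
    /// Returns true when the guarded work should run this frame.
    pub fn tick(&mut self, delta: Duration) -> bool {
        self.elapsed += delta;
        if self.elapsed >= self.period {
            // Keep the remainder so the average rate does not drift.
            self.elapsed -= self.period;
            true
        } else {
            false
        }
    }
}

/// Simulates `frames` frames of fixed length and counts how often the guard fires.
pub fn count_runs(frames: u32, frame_time: Duration, period: Duration) -> u32 {
    let mut throttle = Throttle::new(period);
    (0..frames).filter(|_| throttle.tick(frame_time)).count() as u32
}
```

In a Bevy app the boolean would be the result of a run_if condition or the test a driver system performs before calling World::run_schedule.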

More ECS and tooling tips

Parallel queries: only when work per entity is large enough

Query::par_iter (and related parallel iterators) can use multiple threads, but scheduling and splitting have a fixed cost. Prefer them when each matching entity does enough CPU work to amortize that overhead; for tiny per-entity updates, a serial iter() is often faster. Profile with Tracy or frame diagnostics before leaning on parallelism.

Before

Parallel overhead dominates cheap work.

#[derive(Component)]
struct Tint(f32);

fn apply_tints(mut query: Query<&mut Tint>) {
    query.par_iter_mut().for_each(|mut t| {
        t.0 *= 1.001; // There is too little work per entity to amortize parallelism.
    });
}

After

Serial is often faster for tiny updates.

#[derive(Component)]
struct Tint(f32);

fn apply_tints(mut query: Query<&mut Tint>) {
    for mut t in &mut query {
        // OR: query.iter_mut().for_each(|mut t| { ... });
        t.0 *= 1.001;
    }
}

Use par_iter when inner work is large (physics, mesh rebuild chunks, etc.), not for a few float ops.

Of course depending on one's workload, this tip might actually go the other way: from iter() to par_iter().
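To see where the fixed cost comes from, here is a std-only sketch that mimics the trade-off with plain slices and scoped threads. It is not how Bevy's par_iter is implemented (Bevy uses its own task pool and batching); apply_serial and apply_parallel are hypothetical names. Both functions produce the same result, but the parallel one must split the data and spawn and join threads before any multiplication happens.

```rust
use std::thread;

/// Serial version: one tight loop, no coordination cost.
pub fn apply_serial(tints: &mut [f32]) {
    for t in tints.iter_mut() {
        *t *= 1.001;
    }
}

/// Parallel version: splits the slice into chunks and processes each
/// chunk on its own scoped thread. Splitting, spawning, and joining
/// are the fixed overhead that tiny per-item work cannot amortize.
pub fn apply_parallel(tints: &mut [f32], threads: usize) {
    let threads = threads.max(1);
    // Round up so every item lands in a chunk; chunks_mut rejects a length of 0.
    let chunk_len = ((tints.len() + threads - 1) / threads).max(1);
    thread::scope(|scope| {
        for chunk in tints.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for t in chunk.iter_mut() {
                    *t *= 1.001;
                }
            });
        }
    });
}
```

Timing both on your own data is the only reliable way to pick between them; the crossover point depends on the work per item and the item count.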

Events vs Messages

Bevy has Events and Messages. Both decouple "what happened" (an event or message) from "what response should result" (an observer or a polling system, respectively). Their performance and ergonomics differ in subtle ways, which this table highlights.

	Events	Messages
Optimal event frequency	Infrequent	Frequent
Handler	Only handles a single event	Can handle many messages together
Latency	Immediate	Up to 1 frame
Event propagation	Bubbling	None
Scope	World or Entity	World
Ordering	No explicit order	Ordered
Coupling	High	Low

Components have life-cycle events: Add, Insert, Replace, Remove, Despawn. Components also have hooks: on_add, on_insert, on_replace, and on_remove. The component hooks are a tighter binding than the Event observer.

Handler Runs When for Events?

The table above can be a guide for performance considerations. One non-obvious complication of observer Event handling is that, because it runs "immediately", its handlers can run at many different system boundaries: they run after every system that calls commands.trigger(event). With Messages, only systems that poll MessageReader<M> handle them, and they handle them in a consistent system order.

Let me give one example of an app that has two systems: A, B, event E, and two observers of E: X, Y. A is called before B. So the system call graph looks like this in general (assuming single threaded):

A -> B

But when A calls commands.trigger(E), the call graph looks like this:

A triggers E -> X -> Y -> B

Note: commands.trigger(E) like commands.spawn(...) does not run immediately; it batches its operations.

Or it could look like this because observers are not ordered.

A triggers E -> Y -> X -> B

So any trigger of the event E effectively adds its handlers, in some non-explicit order, to the system call graph. For instance, if B triggers E instead:

A -> B -> triggers E -> X -> Y
OR
A -> B -> triggers E -> Y -> X

If instead of calling commands.trigger(E) one calls world.trigger(E) then the handlers run immediately in a non-explicit order.

Handler Runs When for messages?

Let me give one example of an app that has two systems: A, B, message M, and a system that polls for M called X. A is called before B. So the system call graph will be one of these in general (assuming single threaded):

A -> B -> X
A -> X -> B
X -> A -> B

Let's focus on the second case A -> X -> B since it will illustrate the handling between frames better and say that A and B emit a message M.

frame 0: A emits M1 -> X handles M1 -> B emits M2
frame 1: A emits M3 -> X handles M2, M3 -> B emits M4

It is easier to reason about where messages are handled than where events are handled because it's apparent in the system ordering for messages, while event handling has a more ephemeral quality because it can happen after any system that triggers the event.

Message Reading Gotcha

There is a caveat to message reading. Message reading buffers for two frames, which means if you only read every other frame, you will still get all the messages. However, if you have a system like the one below that early exits on a condition, then you may get messages you did not expect.

fn maybe_read(In(run): In<bool>, mut messages: MessageReader<M>) {
    if !run {
        return; // Early exit: unread messages stay buffered.
    }
    for message in messages.read() {
        // Process message.
    }
}

If a system A emits message M0 that is important for frame 0 and only frame 0 but maybe_read does not read the message, the message will persist to the next frame where it was not emitted, which can have frustrating effects.

frame 0: A emits M0 -> maybe_read(false)
frame 1: A -> maybe_read(true) reads M0

How bad can it be? The author had a spurious input bug that persisted for over a year due to a case like this: The colon ':' key would pull up a text field, and sometimes that text field would be polluted with a ':' as its first character. The same pattern can do more damage:

frame 0: A emits "enter key pressed" -> maybe_read(false) -> B shows dialog for delete file?
frame 1: A -> maybe_read(true) reads "enter key pressed" -> B deletes file
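The stale-message effect can be reproduced with a toy double buffer. This is a simplified std-only model written for illustration, not Bevy's Messages implementation; the type, its methods, and the cursor-as-usize reader are all assumptions. It keeps two frames of messages and lets a reader either read or explicitly skip what is pending.

```rust
/// Toy double-buffered message queue: messages live for two frames.
pub struct Messages<T> {
    older: Vec<T>,   // Written during the previous frame.
    newer: Vec<T>,   // Written during the current frame.
    first_id: usize, // Sequence number of the oldest buffered message.
}

impl<T: Clone> Messages<T> {
    pub fn new() -> Self {
        Self { older: Vec::new(), newer: Vec::new(), first_id: 0 }
    }

    pub fn write(&mut self, message: T) {
        self.newer.push(message);
    }

    /// Frame boundary: drop the older buffer and age the newer one.
    pub fn update(&mut self) {
        self.first_id += self.older.len();
        self.older = std::mem::take(&mut self.newer);
    }

    /// Returns every buffered message the reader has not seen yet.
    pub fn read(&self, cursor: &mut usize) -> Vec<T> {
        let skip = cursor.saturating_sub(self.first_id);
        let unread = self.older.iter().chain(&self.newer).skip(skip).cloned().collect();
        self.mark_read(cursor);
        unread
    }

    /// Marks everything as seen without processing it.
    pub fn mark_read(&self, cursor: &mut usize) {
        *cursor = self.first_id + self.older.len() + self.newer.len();
    }
}
```

In this model, a reader that returns early without calling mark_read sees the frame 0 message on frame 1, while one that calls mark_read before returning does not. In Bevy the analogous remedy is to drain or clear the reader before the early return; check the MessageReader documentation for the exact method.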

Before

Consider an example where we do work when a PowerUp is added. Initially we visit every power-up on every frame even when the only reason to refresh them is an occasional input (here, a key press).

#[derive(Component)]
struct PowerUp;

fn poll_power_ups(query: Query<Entity, With<PowerUp>>, keyboard: Res<ButtonInput<KeyCode>>) {
    for entity in &query {
        if keyboard.just_pressed(KeyCode::KeyR) {
            // Check entity....
        }
    }
}

After with Event

Pressing R triggers a custom Event. An observer runs immediately and performs the same refresh work for all power-ups.

#[derive(Component)]
struct PowerUp;

#[derive(Event)]
struct RefreshPowerUps;

fn detect_refresh_key(mut commands: Commands, keyboard: Res<ButtonInput<KeyCode>>) {
    if keyboard.just_pressed(KeyCode::KeyR) {
        commands.trigger(RefreshPowerUps);
    }
}

fn refresh_all_on_event(_: On<RefreshPowerUps>, query: Query<Entity, With<PowerUp>>) {
    for entity in &query {
        // Check entity....
    }
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_observer(refresh_all_on_event)
        .add_systems(Update, detect_refresh_key)
        .run();
}

After with Message

Pressing R enqueues the same intent as a custom Message. A normal system reads it during the schedule, so handling order matches system ordering instead of running inline at the trigger site.

#[derive(Component)]
struct PowerUp;

#[derive(Message)]
struct RefreshPowerUps;

fn detect_refresh_key(mut writer: MessageWriter<RefreshPowerUps>, keyboard: Res<ButtonInput<KeyCode>>) {
    if keyboard.just_pressed(KeyCode::KeyR) {
        writer.write(RefreshPowerUps);
    }
}

fn refresh_on_message(
    mut reader: MessageReader<RefreshPowerUps>,
    query: Query<Entity, With<PowerUp>>,
) {
    for _message in reader.read() {
        for entity in &query {
            // Check entity....
        }
    }
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_message::<RefreshPowerUps>()
        .add_systems(Update, (detect_refresh_key,
                              refresh_on_message).chain())
        .run();
}

In general, messages are recommended over events because their handling is better controlled and easier to reason about, unless there is an overriding performance or ergonomics concern.

Bevy Async

Use bevy_defer for async.

A lot of state was kept in structs to essentially handle asynchronous, multi-frame operations. Many a bespoke state machine was made. If you are bitten by a case like this in the future and don't have a suitable async solution like bevy_defer, I'd recommend doing something like this:

#[derive(PartialEq)] // Needed for the `state != StateMachine::E` check below.
enum StateMachine {
     A,
     B { a: u32 },
     C { a: u32, b: String },
     D { a: String },
     E,
}

fn poor_mans_async(state: &mut StateMachine) {
    match state {
        StateMachine::A => {
            *state = StateMachine::B { a: 0 };
        }
        StateMachine::B { a } => {
            *a += 1;
            if *a >= 100 {
                // Copy the counter out before replacing the whole state.
                let a = *a;
                *state = StateMachine::C { a, b: String::from("hi") };
            }
        }
        // ...
    }
}

fn run_poor_mans_async() {
    let mut state = StateMachine::A;
    while state != StateMachine::E {
        poor_mans_async(&mut state);
    }
}
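For reference, here is a complete, compilable miniature of the same idea that runs without Bevy. It is a sketch under assumptions: the Countdown enum, its variants, and the limit of 3 are invented for illustration. It uses std::mem::replace to take the old state by value, so owned fields such as a String can move between states without cloning.

```rust
#[derive(Debug, PartialEq)]
pub enum Countdown {
    Start,
    Counting { a: u32 },
    Greeting { a: u32, b: String },
    Done,
}

/// Advances the machine by exactly one step, as one frame would.
pub fn step(state: &mut Countdown) {
    // Take ownership of the current state, leaving a placeholder behind.
    let current = std::mem::replace(state, Countdown::Done);
    *state = match current {
        Countdown::Start => Countdown::Counting { a: 0 },
        Countdown::Counting { a } if a + 1 >= 3 => {
            Countdown::Greeting { a: a + 1, b: String::from("hi") }
        }
        Countdown::Counting { a } => Countdown::Counting { a: a + 1 },
        Countdown::Greeting { .. } | Countdown::Done => Countdown::Done,
    };
}

/// Drives the machine to completion and counts the steps taken.
pub fn frames_until_done() -> usize {
    let mut state = Countdown::Start;
    let mut frames = 0;
    while state != Countdown::Done {
        step(&mut state);
        frames += 1;
    }
    frames
}
```

Every local variable that must survive a step has to be threaded through a variant by hand, which is the bookkeeping an async fn does automatically.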

The above is essentially what async writes for us when we write code that looks like the following:

async fn rich_mans_async() -> Result<(), AccessWorld> {
     let mut a = 0;
     AsyncWorld.yield_now().await?;
     while a < 100 {
         a += 1;
         AsyncWorld.yield_now().await?;
     }
     let b = String::from("hi");
     // ...
}

Luckily, bevy_defer is an excellent library that allows us to access Bevy resources within an async context. It will not let you keep resources or references across an .await, so it often passes a Bevy resource to a closure you supply, which ensures that no references to it are kept between .awaits.

Use bevy_defer if you have an operation that runs asynchronously over multiple frames that requires timing or coordination.

Bevy Ergonomics

Avoid type names that contain "Secondary" or "Primary".

We had two distinct code paths: one for the primary window, and another for secondary windows. This was codified in type names that sometimes impeded code reuse. I have tried to unify these where appropriate.

Consider using an event or message if your enum has a no-operation variant.

We had a RelayoutWindowPhase which is used to move windows to screens and change their dimensions. It had an Idle variant; Idle did nothing. In such cases you may want to fire an event or send a message to kick off the work instead of keeping an idle state around.

Prefer field access to bare accessor methods.

It was considered good practice in OO to always shield access to fields via a method or property accessor.

struct A {
    a: usize,
    b: u32,
}

impl A {
    fn a(&self) -> usize {
        self.a
    }

    fn a_mut(&mut self) -> &mut usize {
        &mut self.a
    }
}

If you have bare accessors like the above, it is preferred to increase the visibility of your fields to pub(crate) or pub and manipulate the fields directly. It's clearer in the code what's happening. It's more performant. Many things that OO accessors aimed to guard against can't happen in Rust:

a. No one can stick NULL where some other value ought to be.
b. No one can write into your value unless they have a &mut or owned value.

One exception to this preference is when implementing traits, which cannot express field constraints.
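The trait exception looks like this in practice. The following is a small sketch with made-up names (Panel, Dialog, HasSize, total_area): a trait cannot require that implementors have a width field, so it has to ask for a method instead, even though the structs themselves expose their fields directly.

```rust
pub struct Panel {
    pub width: f32,
    pub height: f32,
}

pub struct Dialog {
    pub width: f32,
    pub height: f32,
    pub title: String,
}

/// A trait can only demand methods, not fields, so accessors are justified here.
pub trait HasSize {
    fn width(&self) -> f32;
    fn height(&self) -> f32;

    /// Default method built on top of the required accessors.
    fn area(&self) -> f32 {
        self.width() * self.height()
    }
}

impl HasSize for Panel {
    fn width(&self) -> f32 { self.width }
    fn height(&self) -> f32 { self.height }
}

impl HasSize for Dialog {
    fn width(&self) -> f32 { self.width }
    fn height(&self) -> f32 { self.height }
}

/// Generic code goes through the trait; concrete code can still use the fields.
pub fn total_area(items: &[&dyn HasSize]) -> f32 {
    items.iter().map(|item| item.area()).sum()
}
```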

Prefer .chain() to many .before() and .after() constraints in scheduling.

If you have systems that look like this:

app
    .add_systems(Update, a.before(b))
    .add_systems(Update, b.before(c))
    .add_systems(Update, c.before(d))
    .add_systems(Update, d.before(e));

Consider using a chain instead.

app
    .add_systems(Update, (a,
                          b,
                          c,
                          d,
                          e).chain());

Avoid unnecessary allocations with `Cow`

Suppose you have a return type of Result<T, String>. In many cases the error string is static and you would prefer to use Result<T, &'static str>, but there is a case where it's important to provide specific information in the error that seems to require allocating a string. You can have the best of both worlds by using Cow<'static, str>: a Cow can hold a reference &'static str or an owned String and it's transparent to the user; they both deref to &str.

Note: a String error type is not recommended; use the thiserror crate and an enumeration instead. See its usage below.

Before

Here is a contrived example of a function that converts an unsigned byte to a boolean. It has two static error messages and one dynamic one.

fn convert_to_bool(a: u8) -> Result<bool, String> {
    match a {
        0 => Ok(false),
        1 => Ok(true),
        2 => Err(String::from("not trinary")), // Allocates.
        42 => Err(String::from("thanks for all the fish")), // Allocates.
        x => Err(format!("got unexpected value {x}")) // Allocates.
    }
}

After using Cow

Using Cow we can avoid allocating the static error messages and use their static strings directly.

use std::borrow::Cow;

fn convert_to_bool(a: u8) -> Result<bool, Cow<'static, str>> {
    match a {
        0 => Ok(false),
        1 => Ok(true),
        2 => Err(Cow::from("not trinary")), // No allocation.
        42 => Err(Cow::from("thanks for all the fish")), // No allocation.
        x => Err(Cow::from(format!("got unexpected value {x}"))) // Allocates.
    }
}

After using `thiserror`

Using thiserror we can avoid doing any allocations for the error.

#[derive(thiserror::Error, Debug)]
pub enum Error {
    #[error("not trinary")]
    NoTrinarySupport,
    #[error("thanks for all the fish")]
    TheAnswerToTheUniverseAndEverything,
    #[error("got unexpected value {0}")]
    UnexpectedValue(u8)
}

fn convert_to_bool(a: u8) -> Result<bool, Error> {
    match a {
        0 => Ok(false),
        1 => Ok(true),
        2 => Err(Error::NoTrinarySupport), // No allocation.
        42 => Err(Error::TheAnswerToTheUniverseAndEverything), // No allocation.
        x => Err(Error::UnexpectedValue(x)) // No allocation.
    }
}
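For readers without thiserror at hand, the derive can be approximated by hand with the standard library. This is a rough sketch of the idea, not the macro's actual expansion; ConvertError is a hypothetical name and only two of the variants are shown. The point is that the error value stores just the u8, and the message is formatted only if someone displays it.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum ConvertError {
    NoTrinarySupport,
    UnexpectedValue(u8),
}

impl fmt::Display for ConvertError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConvertError::NoTrinarySupport => f.write_str("not trinary"),
            // Formats straight into the caller's formatter; the error itself holds no String.
            ConvertError::UnexpectedValue(x) => write!(f, "got unexpected value {x}"),
        }
    }
}

impl std::error::Error for ConvertError {}

pub fn convert_to_bool(a: u8) -> Result<bool, ConvertError> {
    match a {
        0 => Ok(false),
        1 => Ok(true),
        2 => Err(ConvertError::NoTrinarySupport),
        x => Err(ConvertError::UnexpectedValue(x)),
    }
}
```

Callers that only match on the variant never pay for the message; calling to_string() on the error does allocate, but only at the point where the text is actually wanted.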

My Kingdom for a `Cow`

Truth be told, Cow is one of those humble data structures that made me see Rust as something special. In most languages, you have to commit in your API to a reference or an owned value and often you have to commit to the most general type, which is the owned value. But take a look at Rust's regex replace_all function:

pub fn replace_all<'h, R: Replacer>(&self, haystack: &'h str, rep: R) -> Cow<'h, str>

In a less careful implementation you'd probably get this:

pub fn replace_all<'h, R: Replacer>(&self, haystack: &'h str, rep: R) -> String

That's the general case. When you substitute a string, you have to create a new string. But Rust's replace_all handles the specific case where no substitutions happen and it can simply return back to you the string you gave it: Cow::Borrowed(haystack). No allocation necessary and the API remains ergonomic.
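The same trick is easy to apply in your own APIs. Below is a minimal sketch in the spirit of replace_all; the function name tabs_to_spaces is made up for illustration. It borrows the input when nothing needs to change and allocates only when a replacement actually happens.

```rust
use std::borrow::Cow;

/// Replaces each tab with a single space, allocating only if a tab is present.
pub fn tabs_to_spaces(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // Something must change, so build a new String.
        Cow::Owned(input.replace('\t', " "))
    } else {
        // Nothing to do: hand the caller's string back without copying.
        Cow::Borrowed(input)
    }
}
```

Callers can treat the result as a &str in both cases, and call into_owned() only if they need a String.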
