Skip to content

Conversation

@abr-egn
Copy link
Contributor

@abr-egn abr-egn commented Dec 17, 2025

RUST-2295

This new crate provides a data backend (in bevy parlance, an AssetSource) for loading assets from either BSON documents or GridFS.

Potential areas for future improvement:

  • Provide a Bevy Asset data format plugin for BSON documents themselves. This could hypothetically replace the current hardcoded expected document structure for asset loading with one based on the Bevy AssetTransformer setup that would provide substantially more flexibility.
  • Bridge the GridFS streaming and Bevy asset streaming. In the current implementation, the asset is loaded entirely by the plugin worker thread and then passed as a buffer back to Bevy; this means that things like loading bars would only see "0%" and "100%" states.
  • More ergonomic construction; right now there's a little bit of a dance required to construct the plugin in the same Tokio runtime as the mongodb Client. It seems possible that this could be handled automatically in the common case and leave the manual construction for when something unusual is required.

repository = "https://github.com/mongodb/mongo-rust-driver"
edition = "2024"

[workspace]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This declares that the crate is not part of the shared workspace defined in the parent directory, mostly because bevy pulls in a ton of dependencies and it didn't seem worth it to dump them all in to that Cargo.lock.


Assets can be loaded either from BSON documents or from GridFS.

BSON document assets use the path structure `mongodb://document/<database>/<collection>/<name>`, and fetch data from a BSON object with the shape:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally intended for this to be mongodb://document/<database>/<collection>/<objectid>, but it turns out Bevy uses the extensions on asset paths to autodetect the asset format. It's possible to sidestep that but it's not the typical path and requires a lot of ceremony. This scheme makes the name arbitrary so it can have whatever extension is appropriate to the data, at the cost of allowing duplicates and the resulting odd behavior.


|Bevy|bevy_mongodb_asset|
|---|---|
|0.17|0.1|
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also planning to keep the version in sync with the compatible Bevy version, but that turns out to be less common than I thought, so I'm following convention here with providing a table.

};
use tokio::sync::{mpsc, oneshot};

pub(crate) struct Worker(mpsc::Sender<AssetMessage>);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying implementation spins up a worker task and communicates with the Bevy side via message-passing. While the Bevy AssetReader trait is async, it uses smol as the runtime so we can't just plunk the code in directly. smol does provide compatibility wrappers but those just spin up a new Tokio runtime under the hood, so that wouldn't work either (the driver needs all operations to be kept in the same runtime).

let mut rx = rx;
while let Some(message) = rx.recv().await {
let client = client.clone();
tokio::spawn(async move {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having all asset load requests be serialized via the channel is a potential bottleneck, but since the worker on the other end is just taking the requests and tossing the handling into its own task for each request I suspect it'll be completely irrelevant next to actual asset load time.

}

async fn is_directory<'a>(&'a self, _path: &'a Path) -> Result<bool, AssetReaderError> {
Ok(false)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could hypothetically support "directories" by doing a find, but the utility of that is unclear to me.

AssetPath::parse(&self.path).map_err(|e| AssetReaderError::Io(Arc::new(e)))?;
match asset_path {
AssetPath::Document { namespace, name } => {
self.process_document(client, namespace, name).await
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My kingdom for enum subtypes. It looks like the current active proposal for this kind of thing is pattern types, which has (IMO) fallen into the trap of trying to be a very general solution where enum subtypes would have been a much narrower and easier sell.

@abr-egn abr-egn marked this pull request as ready for review December 18, 2025 10:18
@abr-egn abr-egn requested a review from a team as a code owner December 18, 2025 10:18
Copy link
Contributor

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM. I left questions about conventions and a possible additional metadata test.

```
Bevy metadata is loaded from an object (if found) with the same shape and an additional `meta: true` entry.

GridFS assets use the path structure `mongodb://gridfs/<database>/<bucket>/<name>`. Bevy metadata is loaded from a GridFS file (if found) of the same name with an additional `.meta` extension.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: would returning the GridFS metadata field be appropriate? As I understand it, we can choose the convention.

data: <Binary>
}
```
Bevy metadata is loaded from an object (if found) with the same shape and an additional `meta: true` entry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect querying with the same name and meta: true means the name field could not have a unique index.

Could this instead query a different collection <collection>.meta for the same name?

.collection::<mongodb::bson::RawDocumentBuf>("doc_images");
coll.drop().await.unwrap();

coll.insert_one(doc).await.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this test, and the gridfs test, also test inserting and loading metadata?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants