mgoldenberg
Contributor

Background

This pull request is part of a series of pull requests to add a full IndexedDB implementation of the EventCacheStore and MediaStore (see #4617, #4996, #5090, #5138, #5226, #5274, #5343, #5384, #5406, #5414, #5497, #5506, #5540, #5574, #5603, #5676, #5682, #5749). This particular pull request changes the schema for how media is stored by splitting content and metadata into separate object stores.

The reason for this change is that IndexedDB does not allow partial updates to existing objects. When content and metadata are stored in the same object store, updating a single metadata field therefore requires deserializing and re-serializing the entire object, including content that can be very large (e.g., an image or a document).

Changes

Adding separate stores for content and metadata

The overarching change is to add two new object stores, one for content and one for metadata.

MediaContent

The content store simply maps an identifier (u64) to the content, as illustrated in the type below.

pub struct MediaContent {
    // Identifier under which the content is stored
    pub id: u64,
    // Raw bytes of the media
    pub data: Vec<u8>,
}

When storing a MediaContent object in IndexedDB, one must find an unused u64. Currently, this is accomplished by querying the object store for the largest identifier, incrementing it, and then using it as the identifier for the desired object.

Note that IndexedDB does offer auto-incrementing keys; however, it's not clear if it's possible to retrieve the generated key upon insertion into the database via indexed_db_futures. So, one must ultimately query the database to get access to the key in a similar fashion to that described above.
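
The allocation strategy described above can be sketched as follows. This is a minimal illustration, not the code in this pull request: `ContentStore`, `next_id`, and `insert` are hypothetical names, a `BTreeMap` stands in for the IndexedDB object store, and starting at 0 for an empty store is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the content object store.
/// A `BTreeMap` keeps its keys ordered, so the largest key is cheap to find.
#[derive(Default)]
pub struct ContentStore {
    items: BTreeMap<u64, Vec<u8>>,
}

impl ContentStore {
    /// Returns the next unused identifier: one past the largest stored key,
    /// or 0 for an empty store (an assumption of this sketch).
    /// Yields `None` if the largest key is already `u64::MAX`.
    pub fn next_id(&self) -> Option<u64> {
        match self.items.keys().next_back() {
            Some(max) => max.checked_add(1),
            None => Some(0),
        }
    }

    /// Stores `data` under a fresh identifier and returns that identifier.
    pub fn insert(&mut self, data: Vec<u8>) -> Option<u64> {
        let id = self.next_id()?;
        self.items.insert(id, data);
        Some(id)
    }
}
```

In this sketch, finding the identifier and inserting the content are separate steps on the store, which mirrors the extra query described above.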

MediaMetadata

The metadata store is almost identical to the original media store, but contains some additional information for tracking the identifier and the size of the MediaContent, as illustrated in the type below.

pub struct MediaMetadata {
    pub request_parameters: MediaRequestParameters,
    pub last_access: UnixTime,
    pub ignore_policy: IgnoreMediaRetentionPolicy,
    // Identifier of the associated MediaContent
    pub content_id: u64,
    // Encoded size of the associated MediaContent
    pub content_size: usize,
}

When storing a MediaMetadata object in IndexedDB, one must first store the MediaContent. Only then are its identifier and its encoded size known, and these are used to populate MediaMetadata::content_id and MediaMetadata::content_size.

Note that this means that retrieving MediaContent via MediaRequestParameters requires two steps.

  1. Using MediaRequestParameters to retrieve MediaMetadata
  2. Using MediaMetadata::content_id to retrieve MediaContent
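
The ordering on write and the two-step lookup on read can be sketched with two in-memory maps. Everything here is illustrative: `SplitStore`, `Metadata`, `add`, and `get` are hypothetical names, a `String` key stands in for `MediaRequestParameters`, and `data.len()` stands in for the encoded size. The sketch also ignores the case where the key already exists.

```rust
use std::collections::{BTreeMap, HashMap};

/// Simplified metadata record; the real `MediaMetadata` has more fields.
#[derive(Clone, Debug, PartialEq)]
pub struct Metadata {
    pub content_id: u64,
    pub content_size: usize,
}

/// Hypothetical stand-in for the two object stores.
#[derive(Default)]
pub struct SplitStore {
    /// Metadata keyed by a string standing in for `MediaRequestParameters`.
    metadata: HashMap<String, Metadata>,
    /// Content keyed by its numeric identifier.
    content: BTreeMap<u64, Vec<u8>>,
}

impl SplitStore {
    /// Stores the content first, then the metadata that points at it.
    pub fn add(&mut self, key: &str, data: Vec<u8>) {
        // The identifier and size are only known once the content is placed.
        let content_id = self.content.keys().next_back().map_or(0, |max| max + 1);
        let content_size = data.len();
        self.content.insert(content_id, data);
        self.metadata
            .insert(key.to_owned(), Metadata { content_id, content_size });
    }

    /// Step 1: look up the metadata by key.
    /// Step 2: follow `content_id` into the content store.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        let metadata = self.metadata.get(key)?;
        self.content.get(&metadata.content_id).map(Vec::as_slice)
    }
}
```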

Removing original media store

After the two object stores above were created and the implementations of various functions were updated to use those object stores, the original media store and its associated types were removed. There is one exception, which is that the top-level Media type was kept in place, as it proved to be a useful top-level abstraction.

Tradeoffs

Improvements

These changes offer significant improvements on the following operations.

  • MediaStore::replace_media_key - changes request parameters - i.e., primary key - of media
    • Before: read and write media metadata and media content
    • After: read and write only media metadata
  • MediaStore::get_media_content - retrieves media by request parameters and sets last access time
    • Before: read and write media metadata and media content
    • After: read media metadata and media content, write media metadata
  • MediaStore::get_media_content_for_uri - retrieves media by URI and sets the last access time
    • Before: read and write media metadata and media content
    • After: read media metadata and media content, write media metadata
  • MediaStore::set_ignore_media_retention_policy - sets whether to ignore media retention policy
    • Before: read and write media metadata and media content
    • After: read and write only media metadata
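
To illustrate why these operations get cheaper, here is a rough sketch of a metadata-only store. The names `MetadataStore`, `Metadata`, `replace_key`, and `set_ignore_policy` are hypothetical, a `bool` stands in for `IgnoreMediaRetentionPolicy`, and a `String` key stands in for `MediaRequestParameters`. Each update reads a whole record and writes a whole record back, as IndexedDB requires, but the record no longer carries the content bytes.

```rust
use std::collections::HashMap;

/// Simplified metadata record; it points at content but does not contain it.
#[derive(Clone, Debug, PartialEq)]
pub struct Metadata {
    pub ignore_policy: bool,
    pub content_id: u64,
}

/// Hypothetical stand-in for the metadata object store.
#[derive(Default)]
pub struct MetadataStore {
    items: HashMap<String, Metadata>,
}

impl MetadataStore {
    pub fn put(&mut self, key: &str, metadata: Metadata) {
        self.items.insert(key.to_owned(), metadata);
    }

    pub fn get(&self, key: &str) -> Option<&Metadata> {
        self.items.get(key)
    }

    /// Moves a record to a new key; `content_id` is carried over unchanged,
    /// so the content store is never touched.
    pub fn replace_key(&mut self, from: &str, to: &str) -> bool {
        match self.items.remove(from) {
            Some(metadata) => {
                self.items.insert(to.to_owned(), metadata);
                true
            }
            None => false,
        }
    }

    /// Read-modify-write of the whole (small) metadata record.
    pub fn set_ignore_policy(&mut self, key: &str, ignore: bool) -> bool {
        match self.items.get(key).cloned() {
            Some(mut metadata) => {
                metadata.ignore_policy = ignore;
                self.items.insert(key.to_owned(), metadata);
                true
            }
            None => false,
        }
    }
}
```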

Penalties

On the other hand, there are also some penalties due to the updated schema.

  • MediaStore::add_media_content - adds media
    • Before: write media in one operation
    • After: write media in three operations (read metadata, write content, write metadata)
  • MediaStore::get_media_content - retrieves media by request parameters and sets last access time
    • Before: read and write media in two operations (read media, write media)
    • After: read and write media in three operations (read metadata, read content, write metadata)
  • MediaStore::remove_media_content - removes media by request parameters
    • Before: remove media in a single operation
    • After: remove media in three operations (read metadata, remove content, remove metadata)
  • MediaStore::get_media_content_for_uri - retrieves media by URI and sets the last access time
    • Before: read and write media in two operations (read media, write media)
    • After: read and write media in three operations (read metadata, read content, write metadata)
  • MediaStore::remove_media_content_for_uri - removes media by URI
    • Before: remove media in a single operation
    • After: remove media in three operations (read metadata, remove content, remove metadata)
  • MediaStore::clean - clean store by removing oversized and old media
    • Before: remove media ranges in a single operation
    • After: remove media ranges in many operations (read metadata, remove content, remove metadata)
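
The extra steps on removal can be sketched as below. This is only an illustration under stated assumptions: `SplitStore`, `Metadata`, `remove`, and `remove_older_than` are hypothetical names, `last_access` is a plain number of seconds rather than a `UnixTime`, a `bool` stands in for `IgnoreMediaRetentionPolicy`, and the cleanup rule (drop entries older than a cutoff unless they ignore the policy) is a simplification of what MediaStore::clean does.

```rust
use std::collections::HashMap;

/// Simplified metadata record for this sketch.
#[derive(Clone, Debug)]
pub struct Metadata {
    pub last_access: u64,
    pub ignore_policy: bool,
    pub content_id: u64,
}

/// Hypothetical stand-in for the two object stores.
#[derive(Default)]
pub struct SplitStore {
    pub metadata: HashMap<String, Metadata>,
    pub content: HashMap<u64, Vec<u8>>,
}

impl SplitStore {
    /// Removes one media entry in three steps.
    pub fn remove(&mut self, key: &str) -> bool {
        // 1. Read the metadata to learn which content it points at.
        let content_id = match self.metadata.get(key) {
            Some(metadata) => metadata.content_id,
            None => return false,
        };
        // 2. Remove the content.
        self.content.remove(&content_id);
        // 3. Remove the metadata.
        self.metadata.remove(key);
        true
    }

    /// Removes every entry last accessed before `cutoff`, skipping entries
    /// that ignore the retention policy. Returns the number removed.
    pub fn remove_older_than(&mut self, cutoff: u64) -> usize {
        let expired: Vec<String> = self
            .metadata
            .iter()
            .filter(|(_, m)| !m.ignore_policy && m.last_access < cutoff)
            .map(|(key, _)| key.clone())
            .collect();
        for key in &expired {
            self.remove(key);
        }
        expired.len()
    }
}
```

With a single store, a range of keys could be deleted directly; here each expired entry needs its metadata read before its content can be located, which is the cost that benchmarking would quantify.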

Conclusions

My feeling is that this implementation is an improvement overall. That being said, some benchmarking would offer a greater degree of confidence that MediaStore::clean has not deteriorated significantly. If this is desired, we can pursue this to get a better sense of the penalty.

In any case, I don't think it would be wise to return to a single object store, but perhaps there is some way to improve upon the split object stores.

Future Work

  • Refactor feature flags
    • The current feature flags are a bit convoluted and could be simplified and made more modular
  • Expose EventCacheStore and MediaStore outside of matrix-sdk-indexeddb

  • Public API changes documented in changelogs (optional)

Signed-off-by: Michael Goldenberg [email protected]

@codspeed-hq

codspeed-hq bot commented Oct 21, 2025

CodSpeed Performance Report

Merging #5795 will not alter performance

Comparing mgoldenberg:indexeddb-media-store-separate-metadata-and-content (175bdc7) with main (430304f)

Summary

✅ 50 untouched

@codecov

codecov bot commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.45%. Comparing base (430304f) to head (175bdc7).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5795      +/-   ##
==========================================
- Coverage   88.45%   88.45%   -0.01%     
==========================================
  Files         360      360              
  Lines      100328   100328              
  Branches   100328   100328              
==========================================
- Hits        88749    88745       -4     
- Misses       7413     7418       +5     
+ Partials     4166     4165       -1     


@mgoldenberg mgoldenberg marked this pull request as ready for review October 21, 2025 03:25
@mgoldenberg mgoldenberg requested a review from a team as a code owner October 21, 2025 03:25
@mgoldenberg mgoldenberg requested review from poljar and removed request for a team October 21, 2025 03:25