Add deserializer for TrajectoryData and Trajectory data class #63

ElliottKasoar · 2025-12-05T15:00:50Z

As discussed in aiidateam/aiida-workgraph#735, adds deserializer for TrajectoryData, allowing this to be passed to a task.

Also adds a Trajectory data class, which allows a list of Atoms to be returned. For example:

from aiida import load_profile
from aiida.orm import StructureData, TrajectoryData
from aiida_pythonjob.data.atoms import Trajectory
from aiida_workgraph import WorkGraph, task
from ase.io import read

load_profile()

struct = read("../structures/NaCl-traj.xyz")
traj = [struct, struct]
struct_data = StructureData(ase=struct)
traj_data = TrajectoryData([struct_data, struct_data])
trajectory = Trajectory(traj)

@task
def test_func(x):
    print(x)
    if isinstance(x, list):
        x = Trajectory(x)

    return x

wg = WorkGraph("test_wg")

# wg.inputs.x = struct
# wg.inputs.x = struct_data
# wg.inputs.x = traj
# wg.inputs.x = traj_data
wg.inputs.x = trajectory

wg.add_task(test_func, "test", x=wg.inputs.x)
wg.outputs.result = wg.tasks.test.outputs.result

# Run the WorkGraph
wg.run()

Of the wg.inputs.x options above, everything except the raw list of ASE Atoms (traj) can be passed into the task and returned from it successfully.

I've tried to add tests for these, but couldn't find a way to test the serialisation of StructureData and TrajectoryData, as the serializer seems to shortcut and return the same data unchanged: https://github.com/aiidateam/aiida-pythonjob/blob/main/src/aiida_pythonjob/data/serializer.py#L115
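
To make the shortcut concrete, here is a minimal, self-contained sketch. `FakeData`, `FakeTrajectory`, `RawFrames`, and `sketch_serializer` are hypothetical stand-ins and not the aiida-pythonjob implementation; the only assumption is that a value which is already a data node is returned unchanged, so a test on such an input can assert identity and little else.

```python
class FakeData:
    """Hypothetical stand-in for an AiiDA Data node."""


class FakeTrajectory(FakeData):
    """Hypothetical stand-in for a node that stores a list of frames."""

    def __init__(self, frames):
        self.frames = list(frames)


class RawFrames(list):
    """Hypothetical raw Python type that has a registered serializer."""


def sketch_serializer(value, serializers):
    """Return data nodes unchanged; otherwise dispatch on the exact type."""
    if isinstance(value, FakeData):
        # Assumed shortcut: the value is already a node, nothing to convert.
        return value
    serializer = serializers.get(type(value))
    if serializer is None:
        raise TypeError(f"no serializer registered for {type(value).__name__}")
    return serializer(value)


serializers = {RawFrames: FakeTrajectory}

node = FakeTrajectory(["frame0", "frame1"])
same_node = sketch_serializer(node, serializers)  # shortcut path
new_node = sketch_serializer(RawFrames(["frame0", "frame1"]), serializers)  # conversion path
```

In this sketch only the `RawFrames` input exercises a real conversion; for the node input, the returned object is the input itself.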

codecov-commenter · 2025-12-05T15:06:43Z

Codecov Report

❌ Patch coverage is 94.44444% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.26%. Comparing base (99d5cd6) to head (c8893e7).

Files with missing lines	Patch %	Lines
src/aiida_pythonjob/data/atoms.py	97.05%	1 Missing ⚠️
src/aiida_pythonjob/data/deserializer.py	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   90.14%   90.26%   +0.11%     
==========================================
  Files          22       22              
  Lines        1228     1263      +35     
==========================================
+ Hits         1107     1140      +33     
- Misses        121      123       +2

superstar54

Hi @ElliottKasoar, thanks for the work. This is indeed important for the community.

I have one suggestion. In pythonjob, we keep a one-to-one mapping: one raw Python class <-> one Data node. In this PR, I only see a Trajectory class that inherits from the AiiDA Data class, so I think you still need to add a raw Python class to represent a list of Atoms, e.g.,

from typing import Iterable
from ase import Atoms

class AtomsTrajectory(list):
    """List of ASE Atoms representing a trajectory."""

    def __init__(self, frames: Iterable[Atoms] = ()):
        super().__init__(frames)

    def append(self, item: Atoms) -> None:
        if not isinstance(item, Atoms):
            raise TypeError(f'AtomsTrajectory only accepts ase.Atoms, got {type(item)}')
        super().append(item)

    def extend(self, items: Iterable[Atoms]) -> None:
        for item in items:
            self.append(item)
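
One caveat with this pattern, sketched below with a hypothetical `Frame` class standing in for `ase.Atoms` so that the snippet runs without ASE: overriding only `append` and `extend` leaves the constructor, `insert`, and `+=` unchecked, because the built-in `list` does not route those operations through the overridden `append`. This illustrates Python's `list` behaviour and is not a required change.

```python
from typing import Iterable


class Frame:
    """Hypothetical stand-in for ase.Atoms."""


class CheckedFrames(list):
    """List that validates items in append/extend only."""

    def __init__(self, frames: Iterable[Frame] = ()):
        super().__init__(frames)  # items are not validated here

    def append(self, item: Frame) -> None:
        if not isinstance(item, Frame):
            raise TypeError(f"expected Frame, got {type(item)}")
        super().append(item)

    def extend(self, items: Iterable[Frame]) -> None:
        for item in items:
            self.append(item)


frames = CheckedFrames([Frame()])

# The overridden append rejects a wrong type.
try:
    frames.append("not a frame")
    append_rejected = False
except TypeError:
    append_rejected = True

# These paths bypass the check: list never calls the overridden append() for them.
unchecked = CheckedFrames(["not a frame"])
frames.insert(0, "not a frame")
frames += ["also not a frame"]
```

If strict validation matters, one option is to have `__init__` call `self.extend(frames)` and to override `insert`, `__setitem__`, and `__iadd__` in the same way.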

The name Trajectory is too broad and may be confused with TrajectoryData from aiida-core, so I suggest naming the Data class AtomsTrajectoryData. The entry point would then be:

[project.entry-points."aiida.data"]
"pythonjob.ase.trajectory.AtomsTrajectory": "aiida_pythonjob.data.trajectory:AtomsTrajectoryData"

Here is an example of using it.

@task
def make_supercell(trajectory: AtomsTrajectory):
    return AtomsTrajectory([atoms*[2, 2, 2] for atoms in trajectory])
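
The one raw class <-> one Data node rule can be pictured as a lookup keyed by the raw type's import path. The sketch below is only an assumption about how such a key could be derived from the entry-point name shown above (a prefix, then module, then class name); `REGISTRY`, `entry_point_key`, and `find_data_class` are hypothetical names, and the `__module__` override exists purely so the example runs without ASE or AiiDA installed.

```python
# Hypothetical registry mirroring the entry-point line above.
REGISTRY = {
    "pythonjob.ase.trajectory.AtomsTrajectory": "aiida_pythonjob.data.trajectory:AtomsTrajectoryData",
}


class AtomsTrajectory(list):
    """Stand-in raw Python class; the module name is set by hand for the demo."""

    __module__ = "ase.trajectory"


def entry_point_key(obj, prefix="pythonjob"):
    """Build a lookup key from the exact type of obj (assumed naming scheme)."""
    cls = type(obj)
    return f"{prefix}.{cls.__module__}.{cls.__qualname__}"


def find_data_class(obj):
    """Return the registered target string for obj's type, or None if absent."""
    return REGISTRY.get(entry_point_key(obj))


target = find_data_class(AtomsTrajectory())
plain_list_target = find_data_class([])  # key is 'pythonjob.builtins.list', not registered
```

Because the key is built from the exact type, a dedicated raw class gives the lookup something unambiguous to match, whereas a plain `list` has no entry in this sketch.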

superstar54 · 2025-12-08T09:33:02Z

tests/test_serializer.py

+    data = Trajectory([Atoms("C"), Atoms("C")])
+    serialized_data = general_serializer(data, serializers=all_serializers)
+    assert isinstance(serialized_data, Trajectory)


I am confused by this part. The input data is a Trajectory, and it is then serialized to a Trajectory again?

ElliottKasoar added 4 commits December 5, 2025 15:12

Add deserializer for TrajectoryData

1e7a2f6

Add serializable trajectory data class

ec75307

Test ASE serialisation

e37e49b

Test deserialization of ASE data

c8893e7

superstar54 requested changes Dec 8, 2025

3 participants
