Conversation
src/MAT_subsys.jl
Outdated
| end | ||
|
|
||
| function Base.isempty(subsys::Subsystem) | ||
| return subsys.class_id_counter == 0 |
There was a problem hiding this comment.
got some failing tests, because I dont know when I can consider the subsystem as fully empty/missing?
There was a problem hiding this comment.
no wait, the problem is that m_read is called in the subsystem initalization, which then checks for isempty(subsys) : https://github.com/JuliaIO/MAT.jl/blob/master/src/MAT_HDF5.jl#L143-L147
fid.subsystem.table_type = table
fid.subsystem.convert_opaque = convert_opaque
subsys_data = m_read(fid.plain[subsys_refs], fid.subsystem)
MAT_subsys.load_subsys!(fid.subsystem, subsys_data, endian_indicator)I think this is also the only place we have to check for #subsystem# names? If so, we can avoid HDF5.name checking anywhere else in the .mat file.
|
I think I solved the problem entirely now. By only checking for the HDF5.name once. I don't know if this is correct? Is there always one, and only one, |
|
Sorry I haven't been able to take a look at his yet, but yes there is only one Does the proposed fix resolve the performance issue? |
|
Yes the current proposal fixes the performance issue by only calling HDF5.name once. If there's truly only one subsystem group and we know which one (the one used in the subsystem loading), then I could even avoid it entirely, though it might be good to check it once just in case. |
|
Yeah it's a neat solution, as there's only one |
|
I actually think we don't need this check as well: The previous line would error out if something is corrupted anyways: |
|
Too late, it's merged and registered :) |
Addresses: #237
@foreverallama this may interest you.
I want a more performant HDF5.name variant, but in the meantime I can at least speed up .mat files without subsystem information.
As you see,
HDF5.namecreates quadratic time scaling with file sizes. With this PR we go back to roughly linear scaling by avoidingHDF5.namewhen possible.