View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update
---|---|---|---|---|---
0003977 | Slicer4 | Core: Base Code | public | 2015-03-18 09:06 | 2018-03-02 11:07

Field | Value
---|---
Reporter | lauren
Assigned To | lauren
Priority | normal
Severity | major
Reproducibility | always
Status | closed
Resolution | fixed
Product Version | Slicer 4.4.0
Target Version |
Fixed in Version | Slicer 4.5.0-1
Summary | 0003977: Slicer diffusion tensor computation is not reproducible
Description | The Slicer DTI computation is not reproducible: each time it is run on the same dataset, different values are obtained. In some cases this causes white stripes of higher values in the baseline image, as we have seen with some new data. In other data, the differences can be seen by subtracting the images from two successive runs. The bug happens with most datasets we have tried, on both a Linux build from 9/2013 and today's MacOS nightly. The attached screenshots show the results of this test using today's nightly and two datasets, the tutorial data and another dataset. (Later I hope to add a screenshot of the bright-stripe baseline problem that occurs in some of our newer data with this bug.)
Tags | No tags attached. | ||||
lauren (2015-03-18 09:14):
All of the attached images are difference images; they would be uniformly zero if the computation were working correctly.
pieper (2015-03-20 06:00):
Interesting - good thing you spotted this! We should include Raul and possibly also Gordon in the investigation, depending on what we find. I haven't looked at the details of the mathematics of the tensor estimation implementation, but I'm pretty sure it calls the teem code (the equivalent of 'tend estim', if I recall correctly) wrapped in a multithreaded VTK class. The banding you see is almost certainly related to the 'slabs' allocated to different threads (compute cores) when VTK parallelizes the calculation. You might see this kind of thing when a random number is involved and the different threads get different seeds, but I believe DWI-to-DTI estimation is meant to be deterministic. Instead, I suspect some kind of thread-safety issue in either teem or vtkTeem is causing cross-talk between threads. The easiest workaround to test would be turning threading off for the estimator. Beyond that, for debugging I suggest making a simple test script that demonstrates the issue and can be used to confirm when it is fixed.
lauren (2015-03-20 06:13):
Thanks Steve. I thought it was a multithreading bug at first as well, but as we re-run it on the same dataset, the bands move, apparently randomly. So multithreading may still be involved, but if so I am surprised that the locations of the "edges" between thread compute regions are not obviously preserved across runs (this may be hard to tell visually, though, because of the random element). Yogesh, Ofer, and I have tested further and found that the magnitude of the variability is very small with "traditional" b=1000 diffusion-weighted data, whereas with newer multishell data the magnitude is much larger and these bands can be seen. Another thing I've noticed is that the bands appear in the "baseline" computed by DTI estimation but not in the "baseline" computed by tensor mask estimation, so something different is happening in these two pipelines. Do you know if the mask module is multithreaded? To summarize: yes, I agree there is potentially a multithreading component, but it is behaving unusually, and the magnitude of the variability seems highly related to the type of input data, which may help track things down.
pieper (2015-03-20 09:36):
I took a quick look at the code and I don't see anything obvious in the vtkTeem or teem (ten) code. There is a lot of floating-point math done per pixel, so it could be something odd with rounding.
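One way per-pixel floating-point math can become run-dependent under threading is that floating-point addition is not associative: any reduction whose summation order depends on thread scheduling can differ in the last bits from run to run. This would explain tiny differences (like the 0.000001 FA changes reported below), though not large banding. A minimal illustration in plain Python (not Slicer code):

```python
# Floating-point addition is not associative: grouping changes the
# rounding, so a parallel reduction whose accumulation order varies
# between runs can produce slightly different results each time.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one summation order
right_to_left = a + (b + c)   # another summation order

# The two orders disagree by one unit in the last place.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

The same effect appears in any language using IEEE 754 doubles, which is why bit-exact reproducibility across threaded runs requires a fixed reduction order.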
lauren (2015-03-31 13:19):
Further testing: in b=1000 data, the difference from one calculation to the next is probably insignificant. For example, in some voxels the FA changed by 0.000001. So this bug seems to primarily affect multishell data. The DTI estimation issue is present in the Slicer 4.4.0 release from 11-02-2014, so it may not be related to the teem upgrade that was incorporated into Slicer in approximately 1/2015.
lauren (2015-04-01 15:27):
I've found that the same behavior (tensor estimation is not reproducible) is present in Slicer 3.6.3 and 4.0.0, so it is not caused by VTK 6.0. I've learned from Julie that running tend outside of Slicer gives different results from computing tensors inside Slicer (which also uses the teem library), so the VTK interface to teem/tend is the likely culprit at this point. Since the issues are greatly magnified with multiple b-values, including ones higher than 1000, the next place to look is the handling of the b-values inside the VTK classes that interface with teem. There are comments in this code region about thread safety and about computing a maximum b-value in a thread-safe region, so I suspect something along these lines, especially given the striping of the differences in (for example) baseline images computed repeatedly.
pieper (2015-04-02 03:47):
Did you look at this block of code? It looks to me like this code is being called from the threads but is manipulating data that is common across threads. That is, it's called from SetTenContext, which is called from vtkTeemEstimateDiffusionTensorExecute, which is called from ThreadedExecute, but it manipulates this->DiffusionGradients, which is a shared instance variable.
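A schematic of the race described above, with a per-thread-copy fix in the same spirit. This is an illustrative Python sketch only: the real code is C++ in vtkTeemEstimateDiffusionTensor, the function and variable names here are hypothetical, and the sqrt(b/b_max) scaling is an assumed stand-in for whatever the real b-value handling does.

```python
import copy
import threading

# Illustrative stand-ins for the shared state: a gradient table and
# per-gradient b-values (hypothetical names, not the vtkTeem API).
shared_gradients = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b_values = [1000.0, 2000.0, 3000.0]
max_b = max(b_values)

def set_ten_context_threadsafe(gradients):
    """Thread-safe variant: scale a private copy of the gradient table.

    The bug pattern was the opposite: each thread rescaled the shared
    instance variable in place, so concurrent threads could scale the
    same gradient twice (or read it mid-update), making the results
    vary from run to run.
    """
    local = copy.deepcopy(gradients)  # private copy; shared table untouched
    for g, b in zip(local, b_values):
        scale = (b / max_b) ** 0.5    # assumed b-value folding, for illustration
        for i in range(3):
            g[i] *= scale
    return local

results = []
lock = threading.Lock()

def worker():
    scaled = set_ten_context_threadsafe(shared_gradients)
    with lock:
        results.append(scaled)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread computes identical scaled gradients and the shared
# table is unchanged, so repeated runs are deterministic.
print(all(r == results[0] for r in results))
print(shared_gradients[0])
```

The design point is simply that per-thread inputs must not be derived by mutating shared members from inside ThreadedExecute; either copy per thread or do the mutation once before the threads start.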
jcfr (2015-04-21 07:02):
What is the status on this? I was thinking of updating the teem version used in Slicer to include the changes that Hans and Raul recently committed to teem SVN. See https://github.com/Slicer/Slicer/pull/262. Let me know if that would interfere with your investigation.
lauren (2015-04-21 07:28):
This is independent of the teem updates, so it's great if you update Slicer to include this fix as well as the teem version updates.
jcfr (2015-04-21 07:41):
It would be great if you could spend some time adding a test. To make things easier on your side, you could simply attach a batch file (or Python file) implementing the test: it would invoke the CLI two times in a row and then use cmake -E compare_files /path/to/image1 /path/to/image2 to compare the results.
[1] https://github.com/ljod/Slicer/commit/ca5080bd1c90cf7e941a2ec7703ed3a3800ce015?w=1
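A minimal sketch of such a run-twice-and-compare test in Python. The commented usage is hypothetical (CLI name and file paths are placeholders); the self test that actually landed is DTINotReproducibleIssue3977.py, which runs inside Slicer. Here filecmp.cmp with shallow=False plays the role of cmake -E compare_files.

```python
import filecmp
import subprocess

def runs_are_reproducible(cmd, out1, out2):
    """Run the same command twice, writing to two output paths,
    then byte-compare the results. Returns True when the two
    outputs are identical, i.e. the command is reproducible."""
    subprocess.run(cmd + [out1], check=True)
    subprocess.run(cmd + [out2], check=True)
    return filecmp.cmp(out1, out2, shallow=False)

# Hypothetical usage against a Slicer CLI (names and paths are
# placeholders, not a documented invocation):
# ok = runs_are_reproducible(
#     ["DWIToDTIEstimation", "input-DWI.nhdr"],
#     "tensors_run1.nrrd", "tensors_run2.nrrd")
# assert ok, "DTI estimation is not reproducible"
```

Invoking the module as a standalone command keeps the test independent of the GUI, which is why a batch or Python file driving the CLI twice is enough to pin the bug down and to confirm the fix.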
jcfr (2015-05-11 20:04):
Fix [1] integrated in r24240 [2].
[1] https://github.com/Slicer/Slicer/pull/277
[2] http://viewvc.slicer.org/viewvc.cgi/Slicer4?view=revision&revision=24240
jcfr (2015-05-11 20:14):
As a side note: using the input data "helix-DWI.nhdr" [1] and following the instructions, subtracting the two FAs shows some difference. See screenshot "Slicer-24240_Issue_3977_FA_are_different_for_helix_dwi". @Lauren: is that expected?
[1] https://github.com/Slicer/Slicer/blob/master/Libs/MRML/Core/Testing/TestData/helix-DWI.nhdr
pieper (2015-05-12 03:46):
The different FA bothers me - see my comment on pull request 277.
jcfr (2015-05-13 00:40):
A self test using @pieper's approach has been added; see r24248. As a side note: using the Subtract module with the interpolation order set to 0 (instead of the default of 1), the difference between the two FAs is zero. See https://github.com/Slicer/Slicer/pull/277#issuecomment-101553810
Changeset master d6a6309d (2015-05-11 23:37:03), imported 2017-06-07 23:51:09:
BUG: Resolve multithread issue that caused errors with multiple b values. Fixes 0003977. See http://www.na-mic.org/Bug/view.php?id=3977
From: Lauren J. O'Donnell <odonnell@bwh.harvard.edu>
git-svn-id: http://svn.slicer.org/Slicer4/trunk@24240 3bd1e089-480b-0410-8dfb-8563597acbee
mod - Libs/vtkTeem/vtkTeemEstimateDiffusionTensor.cxx
mod - Libs/vtkTeem/vtkTeemEstimateDiffusionTensor.h
Changeset master efa75968 (2015-05-13 03:36:40):
ENH: Add self test DTINotReproducibleIssue3977. See issue 0003977. This test checks that successive runs of the "DWI to DTI estimation" CLI module return the same output. The problem is documented in issue 0003977; commit r24240 fixes it by updating the vtkTeemEstimateDiffusionTensor class.
git-svn-id: http://svn.slicer.org/Slicer4/trunk@24248 3bd1e089-480b-0410-8dfb-8563597acbee
mod - Applications/SlicerApp/Testing/Python/CMakeLists.txt
add - Applications/SlicerApp/Testing/Python/DTINotReproducibleIssue3977.py
Date Modified | Username | Field | Change
---|---|---|---
2015-03-18 09:06 | lauren | New Issue | |
2015-03-18 09:06 | lauren | Status | new => assigned |
2015-03-18 09:06 | lauren | Assigned To | => jcfr |
2015-03-18 09:06 | lauren | File Added: FA_difference_other_data.png | |
2015-03-18 09:13 | lauren | File Added: baseline_difference_other_data.png | |
2015-03-18 09:13 | lauren | File Added: FA_difference_tutorial_data.png | |
2015-03-18 09:14 | lauren | File Added: BAseline_difference_tutorialdata.png | |
2015-03-18 09:14 | lauren | Note Added: 0012977 | |
2015-03-20 06:00 | pieper | Note Added: 0012978 | |
2015-03-20 06:13 | lauren | Note Added: 0012979 | |
2015-03-20 09:36 | pieper | Note Added: 0012980 | |
2015-03-31 13:19 | lauren | Note Added: 0013003 | |
2015-04-01 15:27 | lauren | Note Added: 0013006 | |
2015-04-02 03:47 | pieper | Note Added: 0013007 | |
2015-04-02 03:48 | pieper | Assigned To | jcfr => lauren |
2015-04-02 03:48 | pieper | Status | assigned => confirmed |
2015-04-21 07:02 | jcfr | Note Added: 0013012 | |
2015-04-21 07:28 | lauren | Note Added: 0013013 | |
2015-04-21 07:41 | jcfr | Note Added: 0013014 | |
2015-05-11 20:04 | jcfr | Note Added: 0013038 | |
2015-05-11 20:04 | jcfr | Status | confirmed => resolved |
2015-05-11 20:04 | jcfr | Fixed in Version | => Slicer 4.5.0-1 |
2015-05-11 20:04 | jcfr | Resolution | open => fixed |
2015-05-11 20:14 | jcfr | Note Added: 0013039 | |
2015-05-11 20:14 | jcfr | File Added: Slicer-24240_Issue_3977_FA_are_different_for_helix_dwi.png | |
2015-05-12 03:46 | pieper | Note Added: 0013044 | |
2015-05-13 00:40 | jcfr | Note Added: 0013045 | |
2017-06-10 08:51 | jcfr | Changeset attached | => Slicer master efa75968 |
2017-06-10 08:51 | jcfr | Changeset attached | => Slicer master d6a6309d |
2018-03-02 11:07 | jcfr | Status | resolved => closed |